Good morning, I tried a decision tree analysis on my data set. It works well, but when I try to present the results they become almost unreadable. The tree is very complex, and when I plot it I cannot see what percentage of each cluster is described.
library(RWeka)  # provides J48
DecisionTree <- J48(cluster ~ ., data = data)
# to plot the tree I use this
if (require("party", quietly = TRUE)) plot(DecisionTree)
# but the plot is too complicated
summary(DecisionTree)
> DecisionTree
J48 pruned tree
------------------
Akkermansiamuciniphila <= 0.036534
| g__.Ruminococcus. <= 0.042792
| | Coprococcus <= 0.029093: cluster2 (8.0)
| | Coprococcus > 0.029093
| | | Bacteroideseggerthii <= 0.019078
| | | | Acidaminococcus <= 0.000643: cluster1 (9.0)
| | | | Acidaminococcus > 0.000643
| | | | | Faecalibacteriumprausnitzii <= 12.003904: cluster1 (2.0)
| | | | | Faecalibacteriumprausnitzii > 12.003904: cluster2 (2.0)
| | | Bacteroideseggerthii > 0.019078: cluster2 (4.0)
| g__.Ruminococcus. > 0.042792
| | Bacteroides <= 55.685153: cluster2 (53.0/2.0)
| | Bacteroides > 55.685153: cluster1 (3.0/1.0)
Akkermansiamuciniphila > 0.036534
| Lachnospira <= 1.741031
| | Collinsellaaerofaciens <= 0.876882
| | | Dorea <= 7.094334
| | | | Bacteroides <= 0.961964: cluster1 (17.0)
| | | | Bacteroides > 0.961964
| | | | | Haemophilusparainfluenzae <= 0.094729
| | | | | | Bacteroidesfragilis <= 0.008524: cluster1 (23.0)
| | | | | | Bacteroidesfragilis > 0.008524
| | | | | | | g__.Ruminococcus.gnavus <= 0.043714: cluster1 (10.0)
| | | | | | | g__.Ruminococcus.gnavus > 0.043714
| | | | | | | | Sutterella <= 0.046028
| | | | | | | | | Catenibacterium <= 0.003085
| | | | | | | | | | Bacteroidesovatus <= 0.083545: cluster1 (5.0/1.0)
| | | | | | | | | | Bacteroidesovatus > 0.083545
| | | | | | | | | | | Methanobrevibacter <= 0.057396: cluster2 (27.0)
| | | | | | | | | | | Methanobrevibacter > 0.057396
| | | | | | | | | | | | Anaerostipes <= 0.084714: cluster2 (2.0)
| | | | | | | | | | | | Anaerostipes > 0.084714: cluster1 (2.0)
| | | | | | | | | Catenibacterium > 0.003085: cluster1 (3.0)
| | | | | | | | Sutterella > 0.046028
| | | | | | | | | Prevotella <= 0.210431
| | | | | | | | | | Roseburia <= 0.404605
| | | | | | | | | | | Methanobrevibacter <= 0.003117
| | | | | | | | | | | | Megasphaera <= 0.00507: cluster1 (8.0)
| | | | | | | | | | | | Megasphaera > 0.00507: cluster2 (2.0)
| | | | | | | | | | | Methanobrevibacter > 0.003117: cluster2 (4.0)
| | | | | | | | | | Roseburia > 0.404605: cluster1 (18.0)
| | | | | | | | | Prevotella > 0.210431
| | | | | | | | | | Bifidobacteriumlongum <= 0.056646: cluster1 (2.0)
| | | | | | | | | | Bifidobacteriumlongum > 0.056646: cluster2 (8.0)
| | | | | Haemophilusparainfluenzae > 0.094729
| | | | | | g__.Prevotella. <= 1.262307
| | | | | | | Bifidobacterium <= 3.136373
| | | | | | | | Klebsiella <= 0.128726: cluster2 (23.0)
| | | | | | | | Klebsiella > 0.128726
| | | | | | | | | Eggerthellalenta <= 0.069993: cluster2 (5.0)
| | | | | | | | | Eggerthellalenta > 0.069993: cluster1 (3.0)
| | | | | | | Bifidobacterium > 3.136373: cluster1 (2.0)
| | | | | | g__.Prevotella. > 1.262307: cluster1 (2.0)
| | | Dorea > 7.094334: cluster2 (10.0)
| | Collinsellaaerofaciens > 0.876882
| | | Paraprevotella <= 0.127762
| | | | Pseudomonas <= 0.01816: cluster1 (41.0)
| | | | Pseudomonas > 0.01816
| | | | | Roseburia <= 0.981812: cluster1 (2.0)
| | | | | Roseburia > 0.981812: cluster2 (2.0)
| | | Paraprevotella > 0.127762
| | | | Prevotella <= 1.432273: cluster1 (3.0)
| | | | Prevotella > 1.432273: cluster2 (3.0)
| Lachnospira > 1.741031: cluster2 (14.0/1.0)
Number of Leaves : 33
Size of the tree : 65
Is there a way to reduce the complexity, or to skip the parts of the tree that describe only small fractions of my samples? Thank you very much.
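For reference, in the printed tree the numbers in parentheses on each leaf are the number of training samples that reach it and, after the slash, how many of those are misclassified. So many of the leaves above cover only 2 or 3 samples. One thing that might be worth trying is J48's own pruning options, passed through `Weka_control`: `M` sets the minimum number of samples per leaf and `C` the pruning confidence. Below is a minimal sketch, not a verified solution for this data set. It uses the built-in `iris` data as a stand-in, and the values `M = 10` and `C = 0.1` are arbitrary assumptions chosen only for illustration.

```r
library(RWeka)

# Stand-in data: replace `Species ~ .` and `iris` with `cluster ~ .`
# and your own data frame.
full_tree <- J48(Species ~ ., data = iris)

# M = minimum number of instances per leaf (Weka default is 2).
# C = pruning confidence (Weka default is 0.25); smaller values prune more.
# The values 10 and 0.1 are assumptions for illustration, not recommendations.
small_tree <- J48(Species ~ ., data = iris,
                  control = Weka_control(M = 10, C = 0.1))

# The printed tree should have no leaf smaller than M samples.
print(small_tree)

# Compare training accuracy of the two trees to see what the
# simplification costs.
print(summary(full_tree))
print(summary(small_tree))
```

`WOW("J48")` lists all options the classifier accepts. A larger `M` trades some training accuracy for a smaller tree, so it is probably worth checking the result with cross-validation, for example `evaluate_Weka_classifier(small_tree, numFolds = 10)`.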