I am trying to interpret the string representation of a Weka RandomTree. The training set contains 1000 records (instances). Looking at the string, the number of instances in the leaves appears to add up to 1030. How is that possible? Am I misinterpreting the string?
See the full run output below.
Note the following:
Total Number of Instances 1000
whereas adding up all the values from the leaves: (10/0),(1/0),(354/0),(18/1),(37/0),(11/0),(9/4),(5/0),(7/3),(5/0),(20/0),(1/0),(2/0),(168/0),(1/0),(145/0),(61/3),(3/1),(5/0),(44/13),(8/0),(10/2),(63/0),(8/3),(4/0)
gives 1030.
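To make the arithmetic behind the 1030 explicit, here is a minimal Python sketch that splits each leaf annotation into its two numbers and sums them separately. It assumes the 1030 was obtained by adding both numbers of every pair; the variable names are illustrative only.

```python
import re

# Leaf annotations copied verbatim from the list above.
leaves = (
    "(10/0),(1/0),(354/0),(18/1),(37/0),(11/0),(9/4),(5/0),(7/3),(5/0),"
    "(20/0),(1/0),(2/0),(168/0),(1/0),(145/0),(61/3),(3/1),(5/0),(44/13),"
    "(8/0),(10/2),(63/0),(8/3),(4/0)"
)

# Split every "(a/b)" annotation into an integer pair (a, b).
pairs = [(int(a), int(b)) for a, b in re.findall(r"\((\d+)/(\d+)\)", leaves)]

first_total = sum(a for a, _ in pairs)   # sum of the first number of each pair
second_total = sum(b for _, b in pairs)  # sum of the second number of each pair

print(len(pairs), first_total, second_total, first_total + second_total)
# prints: 25 1000 30 1030
```

The first numbers alone add up to 1000 and the second numbers to 30, so 1030 is the sum of both. That is consistent with reading (n/m) as n instances reaching the leaf, of which m are misclassified, rather than as n + m instances, and the 30 matches the "Incorrectly Classified Instances" figure in the summary. This reading is an assumption here, not something the output itself states.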
Thanks!
Here is the full run output:
=== Run information ===
Scheme: weka.classifiers.trees.RandomTree -K 0 -M 1.0 -V 0.001 -S 1 -depth 5
Relation: test-data
Instances: 1000
Attributes: 5
feature1
feature2
feature3
feature4
class
Test mode: evaluate on training data
=== Classifier model (full training set) ===
RandomTree
==========
feature2 < -0.27
| feature2 < -0.61
| | feature3 < 1.09
| | | feature2 < -2.41
| | | | feature2 < -2.45 : 0 (10/0)
| | | | feature2 >= -2.45 : 1 (1/0)
| | | feature2 >= -2.41
| | | | feature2 < -0.7 : 0 (354/0)
| | | | feature2 >= -0.7 : 0 (18/1)
| | feature3 >= 1.09
| | | feature2 < -0.94 : 0 (37/0)
| | | feature2 >= -0.94
| | | | feature1 < -0.02 : 0 (11/0)
| | | | feature1 >= -0.02 : 0 (9/4)
| feature2 >= -0.61
| | feature3 < -0.34
| | | feature1 < 1.19 : 1 (5/0)
| | | feature1 >= 1.19
| | | | feature2 < -0.39 : 0 (7/3)
| | | | feature2 >= -0.39 : 0 (5/0)
| | feature3 >= -0.34
| | | feature2 < -0.32 : 0 (20/0)
| | | feature2 >= -0.32
| | | | feature2 < -0.3 : 1 (1/0)
| | | | feature2 >= -0.3 : 0 (2/0)
feature2 >= -0.27
| feature1 < 1.19
| | feature3 < -0.11 : 1 (168/0)
| | feature3 >= -0.11
| | | feature3 < -0.1 : 0 (1/0)
| | | feature3 >= -0.1
| | | | feature4 < 0.59 : 1 (145/0)
| | | | feature4 >= 0.59 : 1 (61/3)
| feature1 >= 1.19
| | feature2 < 0.82
| | | feature2 < -0.18
| | | | feature2 < -0.21 : 0 (3/1)
| | | | feature2 >= -0.21 : 0 (5/0)
| | | feature2 >= -0.18
| | | | feature1 < 2.28 : 1 (44/13)
| | | | feature1 >= 2.28 : 0 (8/0)
| | feature2 >= 0.82
| | | feature1 < 2.67
| | | | feature1 < 1.33 : 1 (10/2)
| | | | feature1 >= 1.33 : 1 (63/0)
| | | feature1 >= 2.67
| | | | feature1 < 2.97 : 0 (8/3)
| | | | feature1 >= 2.97 : 1 (4/0)
Size of the tree : 49
Max depth of tree: 5
Time taken to build model: 0.05 seconds
=== Evaluation on training set ===
Time taken to test model on training data: 0.03 seconds
=== Summary ===
Correctly Classified Instances         970               97      %
Incorrectly Classified Instances        30                3      %
Kappa statistic                          0.94
Mean absolute error                      0.0421
Root mean squared error                  0.145
Relative absolute error                  8.4142 %
Root relative squared error             29.0073 %
Total Number of Instances             1000
=== Detailed Accuracy By Class ===
                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 0.964    0.024    0.976      0.964    0.970      0.940    0.997     0.996     0
                 0.976    0.036    0.964      0.976    0.970      0.940    0.997     0.995     1
Weighted Avg.    0.970    0.030    0.970      0.970    0.970      0.940    0.997     0.996
=== Confusion Matrix ===
   a   b   <-- classified as
 486  18 |   a = 0
  12 484 |   b = 1
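
The confusion matrix can be cross-checked against the summary figures with a short Python sketch. It assumes the usual Weka layout (rows are actual classes, columns are predicted classes); the variable names are illustrative only.

```python
# Confusion matrix copied from the output above.
# Assumed layout: rows are actual classes, columns are predicted classes.
confusion = [
    [486, 18],   # actual class 0
    [12, 484],   # actual class 1
]

total = sum(sum(row) for row in confusion)                     # all instances
correct = sum(confusion[i][i] for i in range(len(confusion)))  # diagonal entries
incorrect = total - correct                                    # off-diagonal entries

print(total, correct, incorrect)
# prints: 1000 970 30
```

The totals agree with the summary: 1000 instances, 970 classified correctly and 30 incorrectly.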