I want to compute and print precision, recall, fscore and support using sklearn.metrics in Python. I am doing NLP, so my y_test and y_pred are basically words before the vectorization step.
Below is some information that may help you:
y_test: [0 0 0 1 1 0 1 1 1 0]
y_pred [0.86 0.14 1. 0. 1. 0. 0.04 0.96 0.01 0.99 1. 0. 0.01 0.99
0.41 0.59 0.02 0.98 1. 0. ]
x_train 50
y_train 50
x_test 10
y_test 10
x_valid 6
y_valid 6
y_pred dimension: (20,)
y_test dimension: (10,)
Full error traceback:
Traceback (most recent call last):
  File "C:\Users\iduboc\Documents\asd-dev\train.py", line 324, in <module>
    precision, recall, fscore, support = score(y_test, y_pred)
  File "C:\Users\iduboc\Python1\envs\asd-v3-1\lib\site-packages\sklearn\metrics\classification.py", line 1415, in precision_recall_fscore_support
    pos_label)
  File "C:\Users\iduboc\Python1\envs\asd-v3-1\lib\site-packages\sklearn\metrics\classification.py", line 1239, in _check_set_wise_labels
    y_type, y_true, y_pred = _check_targets(y_true, y_pred)
  File "C:\Users\iduboc\Python1\envs\asd-v3-1\lib\site-packages\sklearn\metrics\classification.py", line 71, in _check_targets
    check_consistent_length(y_true, y_pred)
  File "C:\Users\iduboc\Python1\envs\asd-v3-1\lib\site-packages\sklearn\utils\validation.py", line 205, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [10, 20]
My code:
from sklearn.metrics import precision_recall_fscore_support as score
precision, recall, fscore, support = score(y_test, y_pred)
print('precision: {}'.format(precision))
print('recall: {}'.format(recall))
print('fscore: {}'.format(fscore))
print('support: {}'.format(support))
My code for predicting the values:
elif clf == 'rndforest':
    # No validation data in rnd forest
    x_train = np.concatenate((x_train, x_valid))
    y_train = np.concatenate((y_train, y_valid))
    model = RandomForestClassifier(n_estimators=int(clf_params['n_estimators']),
                                   max_features=clf_params['max_features'])
    model.fit(pipe_vect.transform(x_train), y_train)
    datetoday = datetime.today().strftime('%d-%b-%Y-%H_%M')
    model_name_save = abspath(os.path.join("models", dataset, name_file + '-' +
                                           vect + reduction + '-rndforest'\
                                           + datetoday + '.pickle'))
    print("Model d'enregistrement : ", model_name_save)
    x_test_vect = pipe_vect.transform(x_test)
    y_pred = model.predict_proba(x_test_vect)
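A possible reading of the mismatch, offered as a sketch rather than a confirmed diagnosis: the 20 printed y_pred values look like the output of predict_proba, which returns one probability per class for each sample (10 samples x 2 classes), laid out flat. The snippet below copies the printed numbers, regroups them into pairs and takes the most probable class per row so that the result has the same length as y_test. The names y_proba and y_pred_labels are introduced here for illustration only, and the column-to-class mapping is an assumption.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support as score

# Values copied from the y_test / y_pred printouts in the question.
y_test = np.array([0, 0, 0, 1, 1, 0, 1, 1, 1, 0])
y_flat = np.array([0.86, 0.14, 1.0, 0.0, 1.0, 0.0, 0.04, 0.96, 0.01, 0.99,
                   1.0, 0.0, 0.01, 0.99, 0.41, 0.59, 0.02, 0.98, 1.0, 0.0])

# Assumption: consecutive pairs are [P(class 0), P(class 1)] for one sample,
# i.e. the usual (n_samples, n_classes) layout of predict_proba.
y_proba = y_flat.reshape(-1, 2)

# Pick the most probable class per sample; this yields 10 labels, matching y_test.
# (In the original pipeline, model.predict(x_test_vect) should give labels directly.)
y_pred_labels = np.argmax(y_proba, axis=1)

precision, recall, fscore, support = score(y_test, y_pred_labels)
print('precision: {}'.format(precision))
print('recall: {}'.format(recall))
print('fscore: {}'.format(fscore))
print('support: {}'.format(support))
```

If the class labels were not 0 and 1, the argmax indices would need to be mapped back through model.classes_ before scoring.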