I'm trying to use GridSearchCV to get the best estimator.
If I set the parameters like this:
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(random_state=42)
parameters = {'max_depth': [2, 3, 4, 5, 6, 7, 8, 9, 10],
              'min_samples_leaf': [2, 3, 4, 5, 6, 7, 8, 9, 10],
              'min_samples_split': [2, 3, 4, 5, 6, 7, 8, 9, 10]}
scorer = make_scorer(f1_score)
grid_obj = GridSearchCV(clf, parameters, scoring=scorer)
grid_fit = grid_obj.fit(X_train, y_train)
best_clf = grid_fit.best_estimator_
# Redundant: with the default refit=True, best_estimator_ is already fitted on X_train
best_clf.fit(X_train, y_train)
best_train_predictions = best_clf.predict(X_train)
best_test_predictions = best_clf.predict(X_test)
# f1_score expects (y_true, y_pred)
print('The training F1 Score is', f1_score(y_train, best_train_predictions))
print('The testing F1 Score is', f1_score(y_test, best_test_predictions))
The output is:
The training F1 Score is 0.784810126582
The testing F1 Score is 0.72
That result differs from the one I get below on the same data.
The only change is replacing [2,3,4,5,6,7,8,9,10] with [2,4,6,8,10]:
clf = DecisionTreeClassifier(random_state=42)
parameters = {'max_depth': [2, 4, 6, 8, 10],
              'min_samples_leaf': [2, 4, 6, 8, 10],
              'min_samples_split': [2, 4, 6, 8, 10]}
scorer = make_scorer(f1_score)
grid_obj = GridSearchCV(clf, parameters, scoring=scorer)
grid_fit = grid_obj.fit(X_train, y_train)
best_clf = grid_fit.best_estimator_
# Redundant: with the default refit=True, best_estimator_ is already fitted on X_train
best_clf.fit(X_train, y_train)
best_train_predictions = best_clf.predict(X_train)
best_test_predictions = best_clf.predict(X_test)
# f1_score expects (y_true, y_pred)
print('The training F1 Score is', f1_score(y_train, best_train_predictions))
print('The testing F1 Score is', f1_score(y_test, best_test_predictions))
The output is:
The training F1 Score is 0.814814814815
The testing F1 Score is 0.8
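I'm confused about how exactly GridSearchCV works. Since the second grid is a subset of the first, why doesn't the larger grid give scores that are at least as good?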
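To narrow this down, here is a minimal sketch of how the two searches could be compared side by side. As far as I understand, GridSearchCV selects the combination with the highest mean cross-validated score on the training data, not the highest score on X_test, so printing best_params_ and best_score_ for each grid may show where they diverge. The synthetic data from make_classification, the run_search helper, and cv=5 are assumptions made for illustration; they are not my actual data or setup.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the data from the question).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)


def run_search(values):
    """Hypothetical helper: grid search with the same candidates for all three parameters."""
    parameters = {
        'max_depth': values,
        'min_samples_leaf': values,
        'min_samples_split': values,
    }
    grid = GridSearchCV(DecisionTreeClassifier(random_state=42), parameters,
                        scoring=make_scorer(f1_score), cv=5)
    grid.fit(X_train, y_train)
    return grid


full = run_search([2, 3, 4, 5, 6, 7, 8, 9, 10])
coarse = run_search([2, 4, 6, 8, 10])

for name, grid in [('full', full), ('coarse', coarse)]:
    # best_score_ is the mean cross-validated F1 of the winning combination;
    # the test F1 is computed separately and plays no part in the selection.
    test_f1 = f1_score(y_test, grid.predict(X_test))
    print(name, grid.best_params_, 'CV F1:', grid.best_score_, 'test F1:', test_f1)
```

If the full grid shows a CV F1 at least as high as the coarse grid but a lower test F1, that would suggest the search is doing what it is designed to do, and the difference on X_test is just the gap between a cross-validation estimate and one particular held-out split.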