I am trying to optimize the hyperparameters of my algorithm with GridSearchCV, but the F1 score I get when I apply the best parameters to the held-out test set is noticeably lower than the best cross-validation score reported by the grid search. I know this could be because of the cross-validation inside the grid search. Is there any way to avoid this difference and get approximately the same results on my predictions as in the grid search?
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score

X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.33,
                                                    random_state=42)

gbc = GradientBoostingClassifier()
parameters = {'learning_rate': [0.01, 0.05, 0.1, 0.5, 1],
              'min_samples_split': [2, 5, 10, 20],
              'max_depth': [2, 3, 5, 10]}

clf = GridSearchCV(gbc, parameters, cv=3, scoring='f1')
clf.fit(X_train, y_train)
print("Best parameter (CV score=%0.3f):" % clf.best_score_)
# Best parameter (CV score=0.737)

# Refit on the full training set with the best parameters found above
gbc_tuned = gbc.set_params(**clf.best_params_)
gbc_tuned.fit(X_train, y_train.values.ravel())
ypred_test = gbc_tuned.predict(X_test)
print(f1_score(y_test, ypred_test))
# 0.7008433734939759
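For context, here is a self-contained sketch of the same comparison on synthetic data from `make_classification` (a stand-in for my real dataset, which I can't share): it puts the 3-fold cross-validation F1 on the training set (what GridSearchCV reports) next to the single held-out F1 (what my final prediction step reports), since these two estimates generally won't coincide:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the real data
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42)

gbc = GradientBoostingClassifier(random_state=42)

# Mean 3-fold CV F1 on the training set (the kind of number GridSearchCV reports)
cv_f1 = cross_val_score(gbc, X_train, y_train, cv=3, scoring='f1').mean()

# Single held-out F1 after fitting on the full training set
gbc.fit(X_train, y_train)
test_f1 = f1_score(y_test, gbc.predict(X_test))

print(cv_f1, test_f1)  # the two estimates typically differ
```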