Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - can't reproduce results from gridsearch

I am trying to optimize the parameters of my algorithm with grid search. However, the score I get when I apply the optimized parameters is much lower than the one reported by the grid search. I know this could be caused by the cross-validation inside the grid search. Is there any way to avoid this difference and get approximately the same results from my predictions as from the grid search?

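from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# X (features) and y (binary labels) are assumed to be loaded already
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.33,
                                                    random_state=42)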

gbc = GradientBoostingClassifier()
parameters = {'learning_rate':[0.01, 0.05, 0.1, 0.5, 1], 
              'min_samples_split':[2,5,10,20], 
              'max_depth':[2,3,5,10]}

clf = GridSearchCV(gbc, parameters, cv=3, scoring='f1')
clf.fit(X_train, y_train)
print("Best parameter (CV score=%0.3f):" % clf.best_score_)

# Best parameter (CV score=0.737)


# Refit on the full training set with the best parameters found by the grid search
gbc_tuned = gbc.set_params(**clf.best_params_)
gbc_tuned.fit(X_train, y_train.values.ravel())
ypred_test = gbc_tuned.predict(X_test)
print(f1_score(y_test, ypred_test))

# 0.7008433734939759


1 Reply


To make the results reproducible, set a fixed random_state on the GradientBoostingClassifier:

gbc = GradientBoostingClassifier(random_state=42)
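
As a quick check of what fixing random_state buys you, here is a minimal sketch that fits the same classifier twice with the same seed and compares the predicted probabilities. The data is synthetic (make_classification stands in for the X and y from the question, which are not shown), and the helper name fit_and_predict, n_estimators=20 and subsample=0.8 are choices made only for this illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in data (an assumption; the question's X and y are not shown).
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=0)


def fit_and_predict(seed):
    """Fit a small gradient boosting model with the given seed; return class probabilities."""
    # subsample < 1.0 makes each boosting stage draw a random subset of rows,
    # so the seed actually influences the fitted model.
    model = GradientBoostingClassifier(n_estimators=20, subsample=0.8, random_state=seed)
    model.fit(X_demo, y_demo)
    return model.predict_proba(X_demo)


# Two independent fits with the same seed should give the same probabilities.
same_seed_matches = np.allclose(fit_and_predict(42), fit_and_predict(42))
print(same_seed_matches)
```

If the comparison prints True, repeated runs of the model are stable, and any remaining gap between scores comes from the data the model is evaluated on rather than from random variation between fits.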

It is also good practice to fix NumPy's global random seed at the beginning of the script.

import numpy
numpy.random.seed(42)
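
Putting both suggestions together, the sketch below runs the whole search twice and checks that the outcome repeats. It is an illustration under assumptions, not the asker's setup: the data is synthetic, the grid is shrunk to keep the run short, and the names run_search and small_grid are made up for this example. It also reads the refitted model from best_estimator_ instead of calling set_params by hand. Note that best_score_ is the mean F1 over the 3 validation folds, while the second number is measured on the held-out test split, so the two are not expected to be identical even when everything is seeded.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (an assumption; replace with your own X and y).
X_syn, y_syn = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, test_size=0.33, random_state=42)

# Deliberately small grid so the example runs quickly.
small_grid = {'learning_rate': [0.05, 0.1], 'max_depth': [2, 3]}


def run_search():
    """Run the grid search once and return (best params, CV F1, test F1)."""
    base = GradientBoostingClassifier(n_estimators=30, random_state=42)
    search = GridSearchCV(base, small_grid, cv=3, scoring='f1')
    search.fit(X_tr, y_tr)
    # With the default refit=True, best_estimator_ is already refitted
    # on the whole training set using the best parameters.
    test_f1 = f1_score(y_te, search.best_estimator_.predict(X_te))
    return search.best_params_, search.best_score_, test_f1


first = run_search()
second = run_search()
print("params: %s, CV F1: %0.3f, test F1: %0.3f" % first)
print("repeatable:", first == second)
```

Seeding makes each of the two numbers stable from run to run; it does not force the cross-validated score and the test score to agree with each other.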
