I'm currently working on recursive feature elimination (RFECV) within a grid search (GridSearchCV) for tree-based methods using scikit-learn. To do this, I'm using the current dev version on GitHub (0.17), which allows RFECV to use the feature importances from the tree methods to select features to discard.
For clarity, this means:
- loop over hyperparameters for the current tree method
- for each set of parameters perform recursive feature elimination to obtain the optimal number of features
- report the 'score' (e.g. accuracy)
- determine which set of parameters produced the best score
The code is working fine at the moment, but I'm getting a deprecation warning about using estimator_params. Here is the current code:
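# imports (in 0.17, GridSearchCV lives in sklearn.grid_search)
from sklearn.feature_selection import RFECV
from sklearn.grid_search import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# set up list of parameter dictionaries (better way to do this?)
depth = [1, 5, None]
weight = ['balanced', None]
params = []
for d in depth:
    for w in weight:
        params.append(dict(max_depth=d,
                           class_weight=w))

# specify the classifier
estimator = DecisionTreeClassifier(random_state=0,
                                   max_depth=None,
                                   class_weight='balanced')

# specify the feature selection method
selector = RFECV(estimator,
                 step=1,
                 cv=3,
                 scoring='accuracy')

# set up the parameter search over the list of parameter dictionaries
clf = GridSearchCV(selector,
                   {'estimator_params': params},
                   cv=3)
clf.fit(X_train, y_train)

# the fitted tree using the best parameters and selected features
clf.best_estimator_.estimator_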
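On the "better way to do this?" comment in the code: one possibility might be itertools.product from the standard library, which yields every (max_depth, class_weight) combination without the nested loops. This is only a sketch and reuses the depth and weight lists from the code above.

```python
from itertools import product

depth = [1, 5, None]
weight = ['balanced', None]

# One dictionary per (max_depth, class_weight) combination.
# product() varies the last list fastest, so the order matches
# the nested for-loops (outer loop over depth, inner over weight).
params = [dict(max_depth=d, class_weight=w)
          for d, w in product(depth, weight)]
```

The resulting list can be passed wherever the loop-built list was used.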
Here is the deprecation warning in full:
home/csw34/git/scikit-learn/sklearn/feature_selection/rfe.py:154: DeprecationWarning:
The parameter 'estimator_params' is deprecated as of version 0.16 and will be removed in 0.18. The parameter is no longer necessary because the value is set via the estimator initialisation or set_params method.
How can I achieve the same result without using estimator_params in GridSearchCV to pass the parameters through RFECV to the estimator?
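Going by the wording of the warning (the value is set "via the estimator initialisation or set_params method"), one approach that may work is scikit-learn's nested parameter syntax: a grid key of the form estimator__<name> is routed by set_params to the estimator wrapped inside RFECV. The following is a minimal sketch, not a confirmed fix. It assumes a scikit-learn release where GridSearchCV is importable from sklearn.model_selection (on 0.17 the import would be sklearn.grid_search instead), and it uses make_classification to generate synthetic stand-in data in place of the real X_train and y_train.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for X_train / y_train (assumption, not the real data).
X_train, y_train = make_classification(n_samples=120, n_features=8,
                                       n_informative=4, random_state=0)

# The tree parameters are left at their defaults here;
# the grid search sets them on the wrapped estimator.
selector = RFECV(DecisionTreeClassifier(random_state=0),
                 step=1,
                 cv=3,
                 scoring='accuracy')

# "estimator__<name>" addresses parameter <name> of selector.estimator,
# so no estimator_params dictionary is needed.
param_grid = {
    'estimator__max_depth': [1, 5, None],
    'estimator__class_weight': ['balanced', None],
}

clf = GridSearchCV(selector, param_grid, cv=3)
clf.fit(X_train, y_train)

# Fitted tree trained on the selected features with the best parameters.
best_tree = clf.best_estimator_.estimator_
print(clf.best_params_)
```

The grid is written as a dictionary of lists, so GridSearchCV expands the six combinations itself and the hand-built list of parameter dictionaries is no longer required.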