I have an ordinal regression problem, i.e. a multiclass classification problem with an inherent ranking between the classes (1 > 2 > 3 > ... > 10). From Multi-class, multi-label, ordinal classification with sklearn I found a good solution for handling the ordinality. Essentially, the algorithm creates k-1 binary classifiers, where k is the number of unique classes. Here is the code I use:
import numpy as np
from sklearn.base import clone

class OrdinalClassifier():

    def __init__(self, clf):
        self.clf = clf
        self.clfs = {}

    def fit(self, X, y):
        self.unique_class = np.sort(np.unique(y))
        if self.unique_class.shape[0] > 2:
            for i in range(self.unique_class.shape[0] - 1):
                # for each of the k-1 ordinal thresholds we fit a binary classification problem
                binary_y = (y > self.unique_class[i]).astype(np.uint8)
                clf = clone(self.clf)
                clf.fit(X, binary_y)
                self.clfs[i] = clf
        return self

    def predict_proba(self, X):
        # probability that y exceeds each threshold, keyed by threshold index
        clfs_predict = {i: self.clfs[i].predict_proba(X) for i in self.clfs}
        predicted = []
        for i, y in enumerate(self.unique_class):
            if i == 0:
                # P(V1) = 1 - Pr(y > V1)
                predicted.append(1 - clfs_predict[i][:, 1])
            elif i in clfs_predict:
                # P(Vi) = Pr(y > Vi-1) - Pr(y > Vi)
                predicted.append(clfs_predict[i - 1][:, 1] - clfs_predict[i][:, 1])
            else:
                # P(Vk) = Pr(y > Vk-1)
                predicted.append(clfs_predict[i - 1][:, 1])
        return np.vstack(predicted).T

    def predict(self, X):
        # return the class label with the highest estimated probability
        return self.unique_class[np.argmax(self.predict_proba(X), axis=1)]
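For context, I use it roughly like this (the DecisionTreeClassifier base model and the train/test variables are just placeholders):

from sklearn.tree import DecisionTreeClassifier

ord_clf = OrdinalClassifier(DecisionTreeClassifier(max_depth=5))  # hyperparameters fixed up front
ord_clf.fit(X_train, y_train)             # y_train holds the ordinal labels 1..10
proba = ord_clf.predict_proba(X_test)     # one probability column per ordinal class
y_pred = ord_clf.predict(X_test)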
This class takes as input a classifier for which all hyperparameters must already be defined. I would like to do some form of hyperparameter optimization to improve the model's precision. Since my dataset is quite large and I already need to train k-1 models, I want to do this as time-efficiently as possible. The two methods I think will work best are: 1) successive halving and 2) RandomizedSearchCV. However, I am not sure how to implement such methods when I need to train k-1 models.
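For reference, this is roughly how I would run such a search on a single, plain classifier (a sketch only; DecisionTreeClassifier and the parameter ranges are just placeholders). What I am unsure about is how to plug the k-1-model OrdinalClassifier into this:

from scipy.stats import randint
from sklearn.tree import DecisionTreeClassifier
# successive halving is experimental in sklearn >= 0.24 and needs this import first
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import RandomizedSearchCV, HalvingRandomSearchCV

base = DecisionTreeClassifier(random_state=0)       # placeholder base model
param_dist = {"max_depth": randint(2, 12),          # placeholder search space
              "min_samples_leaf": randint(1, 50)}

# 1) successive halving over randomly drawn candidates
halving_search = HalvingRandomSearchCV(base, param_dist, factor=3, cv=3, n_jobs=-1)
# 2) plain randomized search
random_search = RandomizedSearchCV(base, param_dist, n_iter=20, cv=3, n_jobs=-1)

# halving_search.fit(X_train, y_train)  # works for `base`, but how for OrdinalClassifier(base)?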
One of the comments under the aforementioned post mentioned adding inheritance to make grid search possible. I do not fully understand the comment, but if the proposed method makes it possible to use GridSearchCV, then it should also make it possible to use my proposed methods. This was the comment:

"You might want to add some inheritance for OrdinalClassifier.

from sklearn.base import clone, BaseEstimator, ClassifierMixin

class OrdinalClassifier(BaseEstimator, ClassifierMixin):
    ...

Then, if you want to use something like GridSearchCV, you can create a subclass for a specific algorithm:

class KNeighborsOrdinalClassifier(OrdinalClassifier):
    def __init__(self, n_neighbors=5, ...):
        self.n_neighbors = n_neighbors
        ...
        self.clf = KNeighborsClassifier(n_neighbors=self.n_neighbors, ...)
        self.clfs = {}"
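If I read the comment correctly, the idea is something like the sketch below (my reconstruction, not tested; KNeighborsClassifier and its parameters are only an example):

from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.neighbors import KNeighborsClassifier

class OrdinalClassifier(BaseEstimator, ClassifierMixin):
    # same __init__ / fit / predict_proba / predict as the class above
    ...

class KNeighborsOrdinalClassifier(OrdinalClassifier):
    def __init__(self, n_neighbors=5, weights="uniform"):
        # expose the hyperparameters as plain attributes so that get_params()
        # and set_params(), and therefore any sklearn search, can see them
        self.n_neighbors = n_neighbors
        self.weights = weights
        self.clf = KNeighborsClassifier(n_neighbors=n_neighbors, weights=weights)
        self.clfs = {}

# which, if that is right, should allow e.g.
# RandomizedSearchCV(KNeighborsOrdinalClassifier(), {"n_neighbors": list(range(3, 30))}, n_iter=10)

Is that the intended approach, and would it also work with successive halving?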
Thanks in advance!
question from:
https://stackoverflow.com/questions/65905287/hyperparameter-optimization-for-mutliple-models