Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

in Technique by (71.8m points)

python - Hyperparameter optimization for multiple models

I have an ordinal regression problem, i.e. a multiclass classification problem with an inherent ranking between the classes (1 > 2 > 3 > ... > 10). From Multi-class, multi-label, ordinal classification with sklearn I found a good way to handle the ordinality. Essentially, the approach trains k-1 binary classifiers, where k is the number of unique classes. Here is the code I use:

import numpy as np
from sklearn.base import clone

class OrdinalClassifier():

    def __init__(self, clf):
        self.clf = clf
        self.clfs = {}

    def fit(self, X, y):
        self.unique_class = np.sort(np.unique(y))
        if self.unique_class.shape[0] > 2:
            for i in range(self.unique_class.shape[0] - 1):
                # For each of the k-1 ordinal thresholds, fit a binary
                # classifier estimating Pr(y > unique_class[i]).
                binary_y = (y > self.unique_class[i]).astype(np.uint8)
                clf = clone(self.clf)
                clf.fit(X, binary_y)
                self.clfs[i] = clf
        return self

    def predict_proba(self, X):
        # Threshold probabilities Pr(y > V_i), keyed by threshold index i.
        # (Indexing by i rather than by class label works for any labels,
        # not just labels that happen to be 0..k-1.)
        clfs_predict = {i: clf.predict_proba(X) for i, clf in self.clfs.items()}
        predicted = []
        k = self.unique_class.shape[0]
        for i in range(k):
            if i == 0:
                # Pr(y = V_1) = 1 - Pr(y > V_1)
                predicted.append(1 - clfs_predict[0][:, 1])
            elif i < k - 1:
                # Pr(y = V_i) = Pr(y > V_{i-1}) - Pr(y > V_i)
                predicted.append(clfs_predict[i - 1][:, 1] - clfs_predict[i][:, 1])
            else:
                # Pr(y = V_k) = Pr(y > V_{k-1})
                predicted.append(clfs_predict[i - 1][:, 1])
        return np.vstack(predicted).T

    def predict(self, X):
        # Return the class label (not just the column index) with the
        # highest estimated probability.
        return self.unique_class[np.argmax(self.predict_proba(X), axis=1)]
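As a sanity check on the decomposition in predict_proba, here is a tiny numeric example with hypothetical threshold probabilities for k = 3 classes (values chosen by hand, not produced by the model):

```python
import numpy as np

# Suppose the two binary threshold classifiers output, for one sample:
p_gt_1 = 0.75  # Pr(y > class 1)
p_gt_2 = 0.25  # Pr(y > class 2)

probs = np.array([1 - p_gt_1,       # Pr(y = 1) = 0.25
                  p_gt_1 - p_gt_2,  # Pr(y = 2) = 0.50
                  p_gt_2])          # Pr(y = 3) = 0.25

# The per-class probabilities sum to 1 by construction.
```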

This class takes as input a classifier whose hyperparameters are already fixed. I would like to do some form of hyperparameter optimization to improve the model's precision. Since my dataset is quite large and I already need to train k-1 models, I want to do this as time-efficiently as possible. The two methods I think will work best are 1) successive halving and 2) RandomizedSearchCV. However, I am not sure how to implement either of them when I need to train k-1 models.
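For background, both of those searches exist in scikit-learn (successive halving as HalvingRandomSearchCV, still behind an experimental import flag in recent versions). A minimal sketch of each on a plain, unwrapped classifier — the DecisionTreeClassifier and synthetic data here are just stand-ins, not part of my actual setup:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import RandomizedSearchCV
# Successive halving is exported behind an experimental flag:
from sklearn.experimental import enable_halving_search_cv  # noqa: F401
from sklearn.model_selection import HalvingRandomSearchCV

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
param_dist = {"max_depth": [2, 4, 8, None],
              "min_samples_leaf": [1, 5, 20]}

# Plain randomized search: a fixed budget of sampled candidates.
rs = RandomizedSearchCV(DecisionTreeClassifier(random_state=0),
                        param_dist, n_iter=5, cv=3, random_state=0)
rs.fit(X, y)

# Successive halving: many candidates on little data, survivors get more.
hs = HalvingRandomSearchCV(DecisionTreeClassifier(random_state=0),
                           param_dist, cv=3, random_state=0)
hs.fit(X, y)
```

Either search only needs an estimator that follows the scikit-learn interface, which is exactly what the wrapper class currently lacks.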

One of the comments under the aforementioned post suggested adding some inheritance to make grid search possible. I do not fully understand the comment, but if the proposed approach makes GridSearchCV work, it should also make my proposed methods work. This was the comment:

"You might want to add some inheritance for OrdinalClassifier.

    from sklearn.base import clone, BaseEstimator, ClassifierMixin

    class OrdinalClassifier(BaseEstimator, ClassifierMixin):
        ...

Then, if you want to use something like GridSearchCV, you can create a subclass for a specific algorithm:

    class KNeighborsOrdinalClassifier(OrdinalClassifier):
        def __init__(self, n_neighbors=5, ...):
            self.n_neighbors = n_neighbors
            ...
            self.clf = KNeighborsClassifier(n_neighbors=self.n_neighbors, ...)
            self.clfs = {}
"
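Here is a runnable sketch of my reading of that comment (the subclass name follows the comment; the synthetic data is illustrative). One deviation: the base estimator is built inside fit rather than in __init__, since the searches tune parameters via set_params()/clone(), and an estimator constructed in __init__ would go stale after set_params:

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import RandomizedSearchCV

class OrdinalClassifier(BaseEstimator, ClassifierMixin):
    def _make_clf(self):
        raise NotImplementedError  # subclasses build the base estimator

    def fit(self, X, y):
        self.unique_class_ = np.sort(np.unique(y))
        self.clfs_ = {}
        for i in range(len(self.unique_class_) - 1):
            # One binary classifier per ordinal threshold: Pr(y > V_i).
            binary_y = (y > self.unique_class_[i]).astype(np.uint8)
            clf = self._make_clf()
            clf.fit(X, binary_y)
            self.clfs_[i] = clf
        return self

    def predict_proba(self, X):
        p_gt = {i: clf.predict_proba(X)[:, 1] for i, clf in self.clfs_.items()}
        k = len(self.unique_class_)
        cols = [1 - p_gt[0]]
        cols += [p_gt[i - 1] - p_gt[i] for i in range(1, k - 1)]
        cols.append(p_gt[k - 2])
        return np.vstack(cols).T

    def predict(self, X):
        return self.unique_class_[np.argmax(self.predict_proba(X), axis=1)]

class KNeighborsOrdinalClassifier(OrdinalClassifier):
    def __init__(self, n_neighbors=5):
        self.n_neighbors = n_neighbors  # __init__ only stores params

    def _make_clf(self):
        return KNeighborsClassifier(n_neighbors=self.n_neighbors)

# Synthetic ordinal data: labels 1 < 2 < 3 driven by the first feature.
rng = np.random.RandomState(0)
X = rng.rand(300, 2)
y = np.digitize(X[:, 0], bins=[0.33, 0.66]) + 1

search = RandomizedSearchCV(KNeighborsOrdinalClassifier(),
                            {"n_neighbors": [3, 5, 9, 15]},
                            n_iter=3, cv=3, random_state=0)
search.fit(X, y)
```

If this interpretation is right, the same estimator should drop into HalvingRandomSearchCV or GridSearchCV unchanged, since those searches only require the standard estimator interface that BaseEstimator/ClassifierMixin provide.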

Thanks in advance!

question from:https://stackoverflow.com/questions/65905287/hyperparameter-optimization-for-mutliple-models


1 Reply

Waiting for answers
