What you want is not batch gradient descent, but stochastic gradient descent; batch learning means learning on the entire training set in one go, while what you describe is properly called minibatch learning. That's implemented in sklearn.linear_model.SGDClassifier, which fits a logistic regression model if you give it the option loss="log".
With SGDClassifier, like with LogisticRegression, there's no need to wrap the estimator in a OneVsRestClassifier -- both do one-vs-all training out of the box.
from sklearn.linear_model import SGDClassifier

# you'll have to set a few other options to get good estimates,
# in particular the number of iterations (n_iter, or max_iter in
# newer scikit-learn versions), but this should get you going
lr = SGDClassifier(loss="log")
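If you want to convince yourself of the built-in one-vs-all behaviour, a quick check (on made-up toy data, just for illustration) is that the fitted model ends up with one weight vector per class:

import numpy as np
from sklearn.linear_model import SGDClassifier

X = np.random.rand(300, 5)                         # toy features
y = np.random.choice(["ham", "spam", "eggs"], 300) # toy labels

clf = SGDClassifier(loss="log").fit(X, y)
print(clf.classes_)      # ['eggs' 'ham' 'spam']
print(clf.coef_.shape)   # (3, 5): one binary classifier per class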
Then, to train on minibatches, use the partial_fit method instead of fit. The first time around, you have to feed it a list of classes because not all classes may be present in each minibatch:
import numpy as np

classes = np.unique(["ham", "spam", "eggs"])
for xs, ys in minibatches:
    lr.partial_fit(xs, ys, classes=classes)
(Here, I'm passing classes for each minibatch, which isn't necessary but doesn't hurt either and makes the code shorter.)
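The minibatches iterable is whatever your data source yields. As a rough, self-contained sketch of the whole loop -- where the gen_minibatches helper and the toy data are made up for illustration, not part of scikit-learn -- it might look like this:

import numpy as np
from sklearn.linear_model import SGDClassifier

def gen_minibatches(X, y, batch_size):
    # yield successive (xs, ys) chunks; for real out-of-core work
    # you'd read each chunk from disk or a database instead
    for i in range(0, X.shape[0], batch_size):
        yield X[i:i + batch_size], y[i:i + batch_size]

X = np.random.rand(1000, 20)                          # toy feature matrix
y = np.random.choice(["ham", "spam", "eggs"], 1000)   # toy labels
classes = np.unique(y)

lr = SGDClassifier(loss="log")   # spelled loss="log_loss" in recent scikit-learn
for xs, ys in gen_minibatches(X, y, batch_size=100):
    lr.partial_fit(xs, ys, classes=classes)

print(lr.predict_proba(X[:5]))   # per-class probabilities, courtesy of the log loss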