python - ConvergenceWarning: Stochastic Optimizer: Maximum iterations (10) reached and the optimization hasn't converged yet

I am using sklearn to train a neural network on the MNIST dataset. Why did the optimizer not converge? What else can be done to increase the accuracy I get?

import matplotlib.pyplot as plt
from sklearn.datasets import fetch_openml
from sklearn.neural_network import MLPClassifier

print(__doc__)

# load data from http://www.openml.org/d/554
X, y = fetch_openml('mnist_784', version=1, return_X_y=True)

# rescale pixel values from [0, 255] to [0, 1]
X = X / 255.

# use the traditional MNIST train/test split (60,000 train / 10,000 test)
X_train, X_test = X[:60000], X[60000:]
y_train, y_test = y[:60000], y[60000:]

# mlp = MLPClassifier(hidden_layer_sizes=(100, 100), max_iter=400, alpha=1e-4,
#                     solver='sgd', verbose=10, tol=1e-4, random_state=1)
mlp = MLPClassifier(hidden_layer_sizes=(50,), max_iter=10, alpha=1e-4,
                    solver='sgd', verbose=10, tol=1e-4, random_state=1,
                    learning_rate_init=.1)

mlp.fit(X_train, y_train)
print("Training set score: %f" % mlp.score(X_train, y_train))
print("Test set score: %f" % mlp.score(X_test, y_test))

fig, axes = plt.subplots(4, 4)
# use global min/max to ensure all weights are shown on the same scale
vmin, vmax = mlp.coefs_[0].min(), mlp.coefs_[0].max()
for coef, ax in zip(mlp.coefs_[0].T, axes.ravel()):
    ax.matshow(coef.reshape(28, 28), cmap=plt.cm.gray, vmin=.5 * vmin,
               vmax=.5 * vmax)
    ax.set_xticks(())
    ax.set_yticks(())

plt.show()

1 Reply


"ConvergenceWarning: Stochastic Optimizer: Maximum iterations (10) reached and the optimization hasn't converged yet. ConvergenceWarning)"

A convergence point is a machine learning model's localized optimal state. It means that the variables within the model have reached the best possible values (within a certain vicinity) for predicting a target feature from a set of input features. In a multi-layer perceptron (MLP), these variables are the weights of each neuron. Generally, when a data set doesn't contain an organized, discernible pattern, a machine learning algorithm may be unable to find a convergence point at all; when one exists, though, the model will do its best to find it.

To train an MLP you need to iterate over the data set many times so that its weights can settle at a convergence point. You can also cap the number of iterations, either to limit processing time or as a regularization tool.
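As a rough illustration (my own sketch, not part of the original answer): after fitting, scikit-learn's MLPClassifier exposes a loss_curve_ attribute, and a loss curve that is still falling at the final iteration is the visual counterpart of the warning above:

import matplotlib.pyplot as plt

# Assumption: `mlp` is the fitted classifier from the question above.
# loss_curve_ records the training loss after each iteration (epoch);
# a curve still sloping downward at max_iter means the optimizer stopped early.
plt.plot(mlp.loss_curve_)
plt.xlabel('iteration')
plt.ylabel('training loss')
plt.show()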

In your code example you have two MLP models; however, I'll focus on the snippet that isn't commented out:

mlp = MLPClassifier(hidden_layer_sizes=(50,), max_iter=10, alpha=1e-4,
                    solver='sgd', verbose=10, tol=1e-4, random_state=1,
                    learning_rate_init=.1)

Several parameters influence how many iterations the model needs in order to converge, but the simplest change you can make is to raise the maximum number of iterations, for example setting max_iter=100 or whatever larger value turns out to be needed.
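For instance, a minimal sketch (the max_iter=100 value and the early_stopping option are my own suggestions, not from the original post; early_stopping holds out 10% of the training data by default and stops once the validation score stops improving):

mlp = MLPClassifier(hidden_layer_sizes=(50,), max_iter=100, alpha=1e-4,
                    solver='sgd', verbose=10, tol=1e-4, random_state=1,
                    learning_rate_init=.1, early_stopping=True)
mlp.fit(X_train, y_train)

# n_iter_ reports how many iterations actually ran before stopping
print("Stopped after %d iterations" % mlp.n_iter_)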

However, there might be a deeper issue with this machine learning model. The MNIST data set is a collection of handwritten-character images. The MLP is a highly flexible model; nevertheless, it is not an ideal fit for computer vision and image classification. You might get some positive results with an MLP, but you will likely get even better results with convolutional neural networks (CNNs), which are, in essence, really fancy MLPs.
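If you want to try the convolutional route, here is a minimal sketch (assuming TensorFlow/Keras is installed; none of this comes from the original answer, and the layer sizes are illustrative rather than tuned):

import numpy as np
import tensorflow as tf

# Assumption: X_train, X_test, y_train, y_test are the MNIST splits
# from the question above (flattened 784-pixel rows, string labels).
x_train = np.asarray(X_train, dtype='float32').reshape(-1, 28, 28, 1)
x_test = np.asarray(X_test, dtype='float32').reshape(-1, 28, 28, 1)
y_train_i = np.asarray(y_train, dtype=int)
y_test_i = np.asarray(y_test, dtype=int)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu',
                           input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train_i, epochs=5,
          validation_data=(x_test, y_test_i))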

