Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to know if underfitting or overfitting is occurring?

I'm trying to do image classification with two classes. I have 1000 images with balanced classes. When I train the model, I get a low constant validation accuracy but a decreasing validation loss. Is this a sign of overfitting or underfitting? I should also note that I'm attempting to retrain the Inception V3 model with new classes and a different dataset.

Epoch 1/10
2/2 [==============================] - 126s 63s/step - loss: 0.7212 - acc: 0.5312 - val_loss: 0.7981 - val_acc: 0.3889
Epoch 2/10
2/2 [==============================] - 70s 35s/step - loss: 0.6681 - acc: 0.5959 - val_loss: 0.7751 - val_acc: 0.3889
Epoch 3/10
2/2 [==============================] - 71s 35s/step - loss: 0.7313 - acc: 0.4165 - val_loss: 0.7535 - val_acc: 0.3889
Epoch 4/10
2/2 [==============================] - 67s 34s/step - loss: 0.6254 - acc: 0.6603 - val_loss: 0.7459 - val_acc: 0.3889
Epoch 5/10
2/2 [==============================] - 68s 34s/step - loss: 0.6717 - acc: 0.5959 - val_loss: 0.7359 - val_acc: 0.3889
Epoch 6/10
2/2 [==============================] - 107s 53s/step - loss: 0.6633 - acc: 0.5938 - val_loss: 0.7259 - val_acc: 0.3889
Epoch 7/10
2/2 [==============================] - 67s 33s/step - loss: 0.6674 - acc: 0.6411 - val_loss: 0.7160 - val_acc: 0.3889
Epoch 8/10
2/2 [==============================] - 105s 53s/step - loss: 0.6296 - acc: 0.6562 - val_loss: 0.7099 - val_acc: 0.3889
Epoch 9/10
2/2 [==============================] - 67s 34s/step - loss: 0.5717 - acc: 0.8273 - val_loss: 0.7064 - val_acc: 0.4444
Epoch 10/10
2/2 [==============================] - 103s 52s/step - loss: 0.6276 - acc: 0.6875 - val_loss: 0.7035 - val_acc: 0.4444

1 Reply


What is overfitting

Overfitting (or underfitting) occurs when a model is too specific (or not specific enough) to the training data, and doesn't extrapolate well to the true domain. I'll just say overfitting from now on to save my poor typing fingers [*].

I think the Wikipedia image is good:

[Image: Wikipedia overfitting curve]

Clearly, the green line, a decision boundary trying to separate the red class from the blue, is "overfit", because although it will do well on the training data, it lacks the "regularized" form we like to see when generalizing [**].

These CMU slides on overfitting/cross validation also make the problem clear:

[Image: slide from the CMU lecture on overfitting and cross validation]

And here's some more intuition for good measure


When does overfitting occur, generally?

Overfitting shows up numerically when the testing error stops tracking the training error.

The testing error will (in expectation) always be worse than the training error, but after a certain number of iterations the testing loss starts to increase, even as the training loss continues to decline.
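As a rough sketch of that numerical check: the function below reports the epoch at which validation loss bottomed out, provided it then failed to improve for a few epochs while training loss kept falling. The name `overfit_onset`, the `patience` parameter, and the toy curve are assumptions made up for illustration; the second pair of lists is copied from the log in the question.

```python
def overfit_onset(train_loss, val_loss, patience=2):
    """Return the 1-based epoch with the lowest validation loss if at least
    `patience` later epochs failed to beat it while training loss kept
    falling. Return None when that pattern is not present."""
    # Index of the best (lowest) validation loss; ties resolve to the earliest.
    best = min(range(len(val_loss)), key=val_loss.__getitem__)
    epochs_after_best = len(val_loss) - 1 - best
    if epochs_after_best < patience:
        return None  # validation loss was still improving near the end
    if train_loss[-1] >= train_loss[best]:
        return None  # training loss did not keep falling either
    return best + 1


# Made-up curve with the classic shape: validation loss turns upward.
toy_train = [0.9, 0.7, 0.5, 0.4, 0.3, 0.2]
toy_val = [1.0, 0.8, 0.7, 0.75, 0.85, 0.95]

# Per-epoch values copied from the training log in the question.
question_train = [0.7212, 0.6681, 0.7313, 0.6254, 0.6717,
                  0.6633, 0.6674, 0.6296, 0.5717, 0.6276]
question_val = [0.7981, 0.7751, 0.7535, 0.7459, 0.7359,
                0.7259, 0.7160, 0.7099, 0.7064, 0.7035]

print(overfit_onset(toy_train, toy_val))
print(overfit_onset(question_train, question_val))
```

On the question's ten epochs the validation loss is lowest at the final epoch, so this particular check does not fire. With only ten epochs that is weak evidence either way.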


How to tell when a model has overfit visually?

Overfitting can be observed by plotting the decision boundary (as in the Wikipedia image above) when dimensionality allows, or by looking at testing loss in addition to training loss during the fit procedure.

You don't give us enough points to make these graphs, but here's an example (from someone asking a similar question) showing what those loss graphs would look like:

[Image: overfit loss curves]

While loss curves are sometimes prettier and more logarithmic, note the trend here: training error is still decreasing but testing error is on the rise. That's a big red flag for overfitting. Stack Overflow has a separate discussion of loss curves.

A slightly cleaner, more real-life example comes from this CMU lecture on overfitting ANNs:

[Image: second overfitting example]

The top graph is overfitting, as before. The bottom graph is not.
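The ten epochs in the question's log are few, but they can be drawn the same way. This is a minimal sketch assuming matplotlib is installed; `plot_loss_curves` and the output file name are hypothetical, and the numbers are copied from the log above.

```python
# Per-epoch losses copied from the training log in the question.
train_loss = [0.7212, 0.6681, 0.7313, 0.6254, 0.6717,
              0.6633, 0.6674, 0.6296, 0.5717, 0.6276]
val_loss = [0.7981, 0.7751, 0.7535, 0.7459, 0.7359,
            0.7259, 0.7160, 0.7099, 0.7064, 0.7035]


def plot_loss_curves(train_loss, val_loss, path="loss_curves.png"):
    """Save a training-vs-validation loss plot to `path` and return the path."""
    # Imported here so the data above can be inspected without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display required
    import matplotlib.pyplot as plt

    epochs = range(1, len(train_loss) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, train_loss, marker="o", label="training loss")
    ax.plot(epochs, val_loss, marker="o", label="validation loss")
    ax.set_xlabel("epoch")
    ax.set_ylabel("loss")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
    return path


# Usage: plot_loss_curves(train_loss, val_loss)
```

In these ten epochs the validation loss falls at every step, so the curves do not show the rising-validation-loss shape described above. A longer run would be needed to see where they go.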


When does this occur?

When a model has too many parameters, it is susceptible to overfitting (like fitting an n-degree polynomial to n-1 points). Likewise, a model with too few parameters can underfit.
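The polynomial case is easy to try. The sketch below assumes NumPy and a made-up target (a noisy sine curve). It fits polynomials of degree 1, 3, and 9 to ten training points and compares the error on those points with the error on fresh samples. Degree 9 has as many coefficients as there are training points, so it can pass through all of them.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)


def noisy_samples(n):
    """Draw n points from a sine curve with Gaussian noise (illustrative data)."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=n)
    return x, y


def mse(model, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((model(x) - y) ** 2))


x_train, y_train = noisy_samples(10)    # small training set
x_test, y_test = noisy_samples(200)     # fresh points from the same process

results = {}
for degree in (1, 3, 9):
    model = Polynomial.fit(x_train, y_train, degree)  # least-squares fit
    results[degree] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(f"degree {degree}: train MSE {results[degree][0]:.4f}, "
          f"test MSE {results[degree][1]:.4f}")
```

Training error can only go down as the degree grows, which is why it is a poor guide on its own. The held-out error is the number to watch.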

Regularization techniques such as dropout or batch normalization, or traditionally L1 regularization, combat this. I believe this is beyond the scope of your question.

Further reading:

  1. A good statistics-SO question and answers
  2. Dense reading: bounds on overfitting with some models
  3. Lighter reading: general overview
  4. The related bias-variance tradeoff

Footnotes

[*] There's no reason to keep writing "overfitting/underfitting", since the reasoning is the same for both, with the indicators flipped (a decision boundary that hasn't latched onto the true border enough, as opposed to one wrapped too tightly around individual points). In general, overfitting is the more common problem to avoid, since "more iterations/more parameters" is the current theme. If you have lots of data and not a lot of parameters, maybe you really are worried about underfitting, but I doubt it.

[**] One way to formalize the idea that the black line is preferable to the green one in the first image from Wikipedia is to penalize the number of parameters required by your model during model selection.
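One common criterion of this kind is the Akaike information criterion. It is given here only as an example; the answer above does not name a specific criterion. Here k is the number of fitted parameters and \hat{L} is the maximized likelihood of the model, and among candidate models the one with the lowest value is preferred:

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
```

The 2k term is the penalty: an extra parameter has to improve the log-likelihood by more than 1 to lower the score.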

