Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
224 views
in Technique[技术] by (71.8m points)

python - Neural Network training with PyBrain won't converge

I have the following code, from the PyBrain tutorial:

from pybrain.datasets import SupervisedDataSet
from pybrain.supervised.trainers import BackpropTrainer
from pybrain.tools.shortcuts import buildNetwork
from pybrain.structure.modules import TanhLayer

ds = SupervisedDataSet(2, 1)
ds.addSample((0,0), (0,))
ds.addSample((0,1), (1,))
ds.addSample((1,0), (1,))
ds.addSample((1,1), (0,))

net     = buildNetwork(2, 3, 1, bias=True, hiddenclass=TanhLayer)
trainer = BackpropTrainer(net, ds)

for inp, tar in ds:
     print [net.activate(inp), tar]

errors  = trainer.trainUntilConvergence()

for inp, tar in ds:
     print [net.activate(inp), tar]

However the result is a neural network that is not trained well. When looking at the error output the network gets trained properly however it uses the 'continueEpochs' argument to train some more and the network is performing worse again. So the network is converging, but there is no way to get the best trained network. The documentation of PyBrain implies that the network is returned which is trained best, however it returns a Tuple of errors.

Whens etting continueEpochs to 0 I get an error (ValueError: max() arg is an empty sequence) so continueEpochs must be larger than 0.

Is PyBrain actually maintained because it seems there is a big difference in documentation and code.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

After some more digging I found that the example on the PyBrain's tutorial is completely out of place.

When we look at the method signature in the source code we find:

def trainUntilConvergence(self, dataset=None, maxEpochs=None, verbose=None, continueEpochs=10, validationProportion=0.25):

This means that 25% of the training set is used for validation. Although that is a very valid method when training a network on data you are not going to do this when you have the complete range of possiblities at your disposal, namely a 4-row XOR 2-in-1-out solution set. When one wants to train an XOR set and you remove one of the rows for validation that has as an immediate consequence that you get a very sparse training set where one of the possible combinations is omitted resulting automatically into those weights not being trained.

Normally when you omit 25% of the data for validation you do this by assuming that those 25% cover 'most' of the solution space the network already has encountered more or less. In this case this is not true and it covers 25% of the solution space completely unknown to the network since you removed it for validation.

So, the trainer was training the network correctly, but by omitting 25% of the XOR problem this results in a badly trained network.

A different example on the PyBrain website as a quickstart would be very handy, because this example is just plain wrong in this specific XOR case. You might wonder if they tried the example themselves, because it just outputs random badly trained networks.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...