machine learning - Why should weights of Neural Networks be initialized to random numbers?

Question


1 Reply

深蓝 · Answer 1 · 2021-10-16T22:34:22+0000

Breaking symmetry is essential here, and not for performance reasons. Imagine the first 2 layers of a multilayer perceptron (the input and hidden layers):

[Figure: input layer fully connected to the hidden layer]

During forward propagation, each unit in the hidden layer receives the signal:

$z_j = \sum_i w_{ij} x_i$

That is, hidden unit $j$ receives the sum of the inputs $x_i$, each multiplied by the corresponding weight $w_{ij}$.
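As a rough sketch of that weighted sum in plain Python (the function name `hidden_signals`, the `weights[j][i]` layout and the numbers are my own assumptions for illustration, not something from the original figure):

```python
def hidden_signals(inputs, weights):
    """Return the pre-activation signal of every hidden unit.

    Assumed layout: weights[j][i] is the weight from input i to hidden unit j.
    """
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


inputs = [1.0, 2.0, 3.0]
weights = [
    [0.5, -1.0, 0.25],  # weights into hidden unit 0
    [0.1, 0.2, 0.3],    # weights into hidden unit 1
]

signals = hidden_signals(inputs, weights)
print(signals)  # one signal per hidden unit
```

Because the two rows of weights differ, the two hidden units receive different signals from the same input.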

Now imagine that you initialize all weights to the same value (e.g. zero or one). In this case, every hidden unit receives exactly the same signal. For example, if all weights are initialized to 1, each unit receives the sum of the inputs and outputs sigmoid(sum(inputs)). If all weights are zero, which is even worse, every hidden unit receives a zero signal. Whatever the input, if all weights are the same, all units in the hidden layer compute the same value. Identical units then also receive identical gradient updates during backpropagation, so they remain identical throughout training and the layer behaves like a single unit.
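To make this concrete, here is a small sketch with sigmoid hidden units. The input values, the number of hidden units and the helper names are arbitrary choices of mine, so treat it as an illustration only:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_outputs(inputs, weights):
    """Sigmoid activation of each hidden unit; weights[j][i] feeds unit j."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


inputs = [0.2, -0.4, 0.9]  # arbitrary example input
n_hidden = 4               # arbitrary layer size

all_ones = [[1.0] * len(inputs) for _ in range(n_hidden)]
all_zeros = [[0.0] * len(inputs) for _ in range(n_hidden)]

out_ones = hidden_outputs(inputs, all_ones)    # every unit: sigmoid(sum(inputs))
out_zeros = hidden_outputs(inputs, all_zeros)  # every unit: sigmoid(0) = 0.5

print(len(set(out_ones)), len(set(out_zeros)))  # number of distinct outputs per case
```

In both cases the layer produces a single distinct value, no matter how many hidden units it has.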

This is the main problem with symmetry and the reason why you should initialize weights randomly (or at least with different values). Note that this issue affects all architectures that use each-to-each (fully connected) connections.
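The same point can be checked through training. The sketch below is a toy setup I made up for illustration (one sigmoid hidden layer, a linear output unit, squared error, a single training example, hand-written gradients), not the asker's network. It compares constant initialization with random initialization after a few gradient steps:

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(x, target, W, v, lr=0.1):
    """One gradient-descent step on 0.5 * (y - target)**2.

    W[j][i]: weight from input i to hidden unit j (assumed layout).
    v[j]:    weight from hidden unit j to the single linear output.
    """
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W]
    y = sum(vj * hj for vj, hj in zip(v, h))
    err = y - target  # derivative of the loss with respect to y
    new_v = [vj - lr * err * hj for vj, hj in zip(v, h)]
    new_W = [
        [w - lr * err * vj * hj * (1.0 - hj) * xi for w, xi in zip(row, x)]
        for row, vj, hj in zip(W, v, h)
    ]
    return new_W, new_v


x, target = [0.5, -1.0, 2.0], 1.0  # arbitrary training example
n_hidden = 3

# Constant initialization: every weight starts at 0.3.
W_const = [[0.3] * len(x) for _ in range(n_hidden)]
v_const = [0.3] * n_hidden
for _ in range(50):
    W_const, v_const = train_step(x, target, W_const, v_const)

# Random initialization with a fixed seed for reproducibility.
rng = random.Random(0)
W_rand = [[rng.uniform(-0.5, 0.5) for _ in x] for _ in range(n_hidden)]
v_rand = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
for _ in range(50):
    W_rand, v_rand = train_step(x, target, W_rand, v_rand)

const_rows_identical = all(row == W_const[0] for row in W_const)
rand_rows_identical = all(row == W_rand[0] for row in W_rand)
print(const_rows_identical, rand_rows_identical)
```

With constant initialization every hidden unit gets the same update at every step, so the rows of `W_const` never diverge from one another. With random initialization the units start different and stay different.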
