python - DropoutWrapper being non-deterministic across runs?

Question

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

At the beginning of my code (outside the scope of a Session), I set my random seeds:

import numpy as np
import tensorflow as tf
np.random.seed(1)
tf.set_random_seed(1)

This is my dropout definition:

cell = tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=args.keep_prob, seed=1)

In my first experiment, I kept keep_prob=1. All results obtained were deterministic. I'm running this on a multicore CPU.

In my second experiment, I set keep_prob=0.8 and ran the same code twice. Each run executed these statements:

sess.run(model.cost, feed)
sess.run(model.cost, feed)

Results for the first run:

(Pdb) sess.run(model.cost, feed)
4.9555049
(Pdb) sess.run(model.cost, feed)
4.9548969

The two values differ, which is expected behaviour, since DropoutWrapper uses random_uniform.

Results for the second run:

(Pdb) sess.run(model.cost, feed)
4.9551616
(Pdb) sess.run(model.cost, feed)
4.9552417

Why is this sequence not identical to the first run's output, even though I set both an operation-level seed and a graph-level seed?

1 Reply

深蓝 · Answer 1 · 2021-10-23T20:07:14+0000

The answer was already given in the comments, but no one has written it out explicitly yet, so here it is:

dynamic_rnn internally uses tf.while_loop, which can evaluate multiple iterations in parallel (see the documentation on parallel_iterations). If everything inside the loop body or loop condition depends on the previous iteration's values, nothing can run in parallel. Computations that don't depend on the previous values, however, can be evaluated in parallel. In your case, somewhere inside the DropoutWrapper there is something like this:

random_ops.random_uniform(noise_shape, ...)

This operation is independent of the previous loop values, so it can be computed in parallel for all time steps. With such parallel execution, which time step gets which dropout mask is non-deterministic.
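The effect can be illustrated without TensorFlow. The sketch below is a simplified model, not TensorFlow's actual implementation: it assumes a single seeded generator whose draws are handed to time steps in whatever order the steps happen to execute. The function noise_per_step and its parameters are hypothetical names chosen for this illustration.

```python
import random


def noise_per_step(execution_order, seed=1, width=4):
    """Draw `width` uniform values per time step from ONE seeded generator.

    `execution_order` is the order in which the time steps reach the
    random op (hypothetical stand-in for the scheduler's choice).
    Returns the noise vectors indexed by time step.
    """
    rng = random.Random(seed)  # same seed on every call
    noise = {}
    for t in execution_order:
        # Each draw advances the generator state, so the values a step
        # receives depend on how many steps drew before it.
        noise[t] = [rng.random() for _ in range(width)]
    return [noise[t] for t in sorted(noise)]


# Same seed, but time steps 0 and 1 reach the random op in a different order.
sequential = noise_per_step([0, 1, 2])
reordered = noise_per_step([1, 0, 2])

print(sequential == reordered)        # the per-step noise differs
print(sequential[0] == reordered[1])  # the same values went to another step
```

The seed fixes the sequence of random values, but not which time step consumes which part of that sequence. In TensorFlow 1.x, dynamic_rnn accepts a parallel_iterations argument; passing parallel_iterations=1 forces the loop to run one iteration at a time and is a commonly suggested workaround, although the answer above does not state this explicitly, and it may make the loop slower.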
