According to the TensorFlow documentation, the prefetch and map methods of the tf.contrib.data.Dataset class both take a buffer-size parameter.
For the prefetch method, the parameter is named buffer_size, and according to the documentation:
buffer_size: A tf.int64 scalar tf.Tensor, representing the maximum
number elements that will be buffered when prefetching.
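To make the quoted description concrete, here is a minimal pure-Python sketch of a bounded prefetch buffer. It is not TensorFlow's implementation; the function name `prefetch`, the sentinel `_DONE`, and the use of a background thread are assumptions made for illustration only. It shows a producer running ahead of the consumer by at most buffer_size elements.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel marking the end of the input


def prefetch(iterable, buffer_size):
    """Yield items from `iterable`, letting a background thread run ahead.

    At most `buffer_size` items are held in the queue at any time, which
    mirrors the "maximum number of elements buffered" wording in the docs.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    q = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in iterable:
            q.put(item)  # blocks while the buffer is full
        q.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = q.get()  # blocks while the buffer is empty
        if item is _DONE:
            return
        yield item


if __name__ == "__main__":
    for x in prefetch(range(5), buffer_size=2):
        print(x)
```

In this sketch the buffer changes only how far ahead the producer can get, not the order or the content of the elements.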
For the map method, the parameter is named output_buffer_size, and according to the documentation:
output_buffer_size: (Optional.) A tf.int64 scalar tf.Tensor,
representing the maximum number of processed elements that will be
buffered.
Similarly, the shuffle method has a buffer_size parameter, and according to the documentation:
buffer_size: A tf.int64 scalar tf.Tensor, representing the number of
elements from this dataset from which the new dataset will sample.
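The shuffle description can be illustrated with a small pure-Python sketch of sampling from a fixed-size buffer. This is an assumption about the general technique, not TensorFlow's actual code; the name `shuffle_with_buffer` and the `seed` argument are hypothetical.

```python
import random


def shuffle_with_buffer(elements, buffer_size, seed=None):
    """Yield `elements` in an order sampled from a buffer of `buffer_size`.

    The buffer is filled first; each further input element replaces a
    randomly chosen buffered element, which is emitted. At the end the
    remaining buffered elements are emitted in random order.
    """
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in elements:
        if len(buffer) < buffer_size:
            buffer.append(item)  # still filling the buffer
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]        # emit a random buffered element
        buffer[idx] = item       # and put the new element in its place
    rng.shuffle(buffer)          # drain what is left
    yield from buffer


if __name__ == "__main__":
    print(list(shuffle_with_buffer(range(10), buffer_size=3, seed=0)))
```

Under this sketch, a buffer of size 1 leaves the order unchanged, while a buffer at least as large as the dataset gives a full shuffle; anything in between only mixes elements that are close together in the input.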
What is the relation between these parameters?
Suppose I create a Dataset object as follows:
tr_data = TFRecordDataset(trainfilenames)
tr_data = tr_data.map(providefortraining, output_buffer_size=10 * trainbatchsize, num_parallel_calls=5)
tr_data = tr_data.shuffle(buffer_size=100 * trainbatchsize)
tr_data = tr_data.prefetch(buffer_size=10 * trainbatchsize)
tr_data = tr_data.batch(trainbatchsize)
What role do the buffer parameters play in the above snippet?