Because of limited RAM, I followed these instructions and built a generator that draws small batches and passes them to Keras's fit_generator.
But Keras can't prepare the queue with multiprocessing, even though my class inherits from Sequence.
Here is my generator for multiprocessing:
class My_Generator(Sequence):

    def __init__(self, image_filenames, labels, batch_size):
        self.image_filenames, self.labels = image_filenames, labels
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch; len() requires an int, and np.ceil returns a float.
        return int(np.ceil(len(self.image_filenames) / float(self.batch_size)))

    def __getitem__(self, idx):
        # Slice out the file names and labels that belong to batch number idx.
        batch_x = self.image_filenames[idx * self.batch_size:(idx + 1) * self.batch_size]
        batch_y = self.labels[idx * self.batch_size:(idx + 1) * self.batch_size]
        # Load and resize each image of the batch only when the batch is requested.
        return np.array([
            resize(imread(file_name), (200, 200))
            for file_name in batch_x]), np.array(batch_y)
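The batching arithmetic in `__len__` and `__getitem__` can be checked without Keras. The following is only a minimal sketch: `BatchIndex` and `FloatLen` are hypothetical stand-ins, not part of my code or of Keras. It shows that the last batch is simply shorter, and that `len()` rejects a float, which is why the result of `np.ceil` needs an `int(...)` cast (or `math.ceil`, which already returns an int in Python 3).

```python
import math


class BatchIndex:
    """Hypothetical stand-in for the batching arithmetic of a Sequence (no Keras needed)."""

    def __init__(self, items, batch_size):
        self.items = items
        self.batch_size = batch_size

    def __len__(self):
        # math.ceil returns an int in Python 3, so len() accepts it.
        return math.ceil(len(self.items) / self.batch_size)

    def __getitem__(self, idx):
        # Slicing past the end is safe; the last batch is simply shorter.
        start = idx * self.batch_size
        return self.items[start:start + self.batch_size]


class FloatLen:
    """Hypothetical class whose __len__ returns a float, as a bare np.ceil(...) would."""

    def __len__(self):
        return 3.0


batches = BatchIndex(["a.csv", "b.csv", "c.csv", "d.csv", "e.csv"], batch_size=2)
print(len(batches))  # 3
print(batches[2])    # ['e.csv']

try:
    len(FloatLen())
except TypeError as exc:
    float_len_error = str(exc)
    print(float_len_error)
```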
The main function:
batch_size = 100
num_epochs = 10
training_filenames = []
mask_training = []
validation_filenames = []
mask_validation = []
I would like the generator to read batches from the folders separately in different threads, by ID (the IDs look like {number}.csv for raw images and {number}_label.csv for mask images). I initially built a more elegant class that stores all the data in a single .h5 file instead of a directory, but it got blocked by the same problem. So if you have code that does this, I would gladly take it too.
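The ID-based pairing described above could be sketched as follows. This is only an illustration under the stated naming pattern: `pair_by_id` is a hypothetical helper, and the example paths are made up. It keeps only IDs that have both a raw file and a mask file, so that position i in one list always matches position i in the other.

```python
import os


def pair_by_id(paths):
    """Pair '{id}.csv' with '{id}_label.csv' (hypothetical helper, naming as described above)."""
    images, masks = {}, {}
    for path in paths:
        stem, ext = os.path.splitext(os.path.basename(path))
        if ext != ".csv":
            continue  # ignore anything that is not a .csv file
        if stem.endswith("_label"):
            masks[stem[:-len("_label")]] = path
        else:
            images[stem] = path
    # Keep only IDs that have both files, in a deterministic (string-sorted) order.
    common = sorted(images.keys() & masks.keys())
    return [images[i] for i in common], [masks[i] for i in common]


# Made-up example paths; ID 3 has no mask and is therefore dropped.
image_files, mask_files = pair_by_id(
    ["train/2_label.csv", "train/1.csv", "train/2.csv", "train/1_label.csv", "train/3.csv"]
)
print(image_files)  # ['train/1.csv', 'train/2.csv']
print(mask_files)   # ['train/1_label.csv', 'train/2_label.csv']
```

Note that the IDs are sorted as strings here, so "10" comes before "2"; that does not affect the pairing, only the order of the pairs.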
for dirpath, _, fnames in os.walk('./train/'):
    for fname in fnames:
        if 'label' not in fname:
            training_filenames.append(os.path.abspath(os.path.join(dirpath, fname)))
        else:
            mask_training.append(os.path.abspath(os.path.join(dirpath, fname)))
for dirpath, _, fnames in os.walk('./validation/'):
    for fname in fnames:
        if 'label' not in fname:
            validation_filenames.append(os.path.abspath(os.path.join(dirpath, fname)))
        else:
            mask_validation.append(os.path.abspath(os.path.join(dirpath, fname)))
my_training_batch_generator = My_Generator(training_filenames, mask_training, batch_size)
my_validation_batch_generator = My_Generator(validation_filenames, mask_validation, batch_size)
num_training_samples = len(training_filenames)
num_validation_samples = len(validation_filenames)
The model itself is out of scope here. I don't believe the model is the problem, so I won't paste it.
model.compile(...)  # compile() returns None, so its result is not assigned
model.fit_generator(generator=my_training_batch_generator,
                    steps_per_epoch=(num_training_samples // batch_size),
                    epochs=num_epochs,
                    verbose=1,
                    validation_data=None,  # my_validation_batch_generator,
                    # validation_steps=(num_validation_samples // batch_size),
                    use_multiprocessing=True,
                    workers=4,
                    max_queue_size=2)
The error says that the class I created is not an iterator:
Traceback (most recent call last):
  File "test.py", line 141, in <module>
    max_queue_size=2)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/engine/training.py", line 2177, in fit_generator
    initial_epoch=initial_epoch)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/engine/training_generator.py", line 147, in fit_generator
    generator_output = next(output_generator)
  File "/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/utils/data_utils.py", line 831, in get
    six.reraise(value.__class__, value, value.__traceback__)
  File "/anaconda3/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
TypeError: 'My_Generator' object is not an iterator
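The same TypeError message can be reproduced without Keras, which may help narrow things down. In this minimal sketch, `OnlyIndexable` is a hypothetical class that, like a Sequence subclass, defines `__len__` and `__getitem__` but no `__next__`. Python can iterate over it, yet calling `next()` on it directly fails with the message from the traceback. That suggests `next()` was called on the object itself, as would happen if it were handled as a plain generator rather than as a Sequence (for instance, if Sequence was imported from a different Keras package than the one running the model; that is an assumption, not something the traceback confirms).

```python
class OnlyIndexable:
    """Hypothetical class with __len__/__getitem__ but no __next__, like a Sequence subclass."""

    def __len__(self):
        return 2

    def __getitem__(self, idx):
        if idx >= len(self):
            raise IndexError(idx)
        return idx * 10


obj = OnlyIndexable()
print(list(obj))  # [0, 10]: iteration works through __getitem__

try:
    next(obj)  # next() needs __next__, which the class lacks
except TypeError as exc:
    message = str(exc)
    print(message)  # 'OnlyIndexable' object is not an iterator

print(next(iter(obj)))  # 0: iter() builds a real iterator first
```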