I'm training a MobileNet architecture on dummy data with Keras, on macOS. I set both numpy.random.seed and tensorflow.set_random_seed, but for some reason I can't get reproducible results: each time I rerun the code, I get different results. Why? This is not due to the GPU, because I'm running on a MacBook Pro 2017 which has a Radeon graphics card, so TensorFlow doesn't take advantage of it. The code is run with
python Keras_test.py
so it's not a problem of state (I'm not using Jupyter or IPython: the environment should be reset each time I run the code).
EDIT: I changed my code so that all seeds are set before importing Keras. The results are still not deterministic; however, the variance across runs is much smaller than it was before. This is very bizarre.
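For the record, here is the fuller determinism recipe I've seen suggested for TF 1.x / Keras 2.x (fixing PYTHONHASHSEED and forcing a single-threaded TensorFlow session). I'm including it only as a sketch of what I understand the recommended setup to be, not as something I've verified fixes the problem on my machine:
# Sketch: seeds plus a single-threaded session, as often suggested for determinism.
# (PYTHONHASHSEED should ideally be fixed in the shell before Python even starts.)
import os
os.environ['PYTHONHASHSEED'] = '0'
import numpy as np
import random
np.random.seed(512)
random.seed(512)
import tensorflow as tf
tf.set_random_seed(512)
# Single-threaded session: multi-threaded ops can reduce results in a
# non-deterministic order even when all seeds are fixed.
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)
from keras import backend as K
K.set_session(tf.Session(graph=tf.get_default_graph(), config=session_conf))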
The current model is very small (as far as deep neural networks go) without being trivial; it doesn't need a GPU and it trains in a few minutes on a modern laptop, so repeating my experiments is within anyone's reach. I invite you to do it: I'd be very interested in learning about the level of variation from one system to another.
import numpy as np
# random seeds must be set before importing keras & tensorflow
my_seed = 512
np.random.seed(my_seed)
import random
random.seed(my_seed)
import tensorflow as tf
tf.set_random_seed(my_seed)
# now we can import keras
import keras.utils
from keras.applications import MobileNet
from keras.callbacks import ModelCheckpoint
from keras.optimizers import Adam
import os
height = 224
width = 224
channels = 3
epochs = 10
num_classes = 10
# Generate dummy data
batch_size = 32
n_train = 256
n_test = 64
x_train = np.random.random((n_train, height, width, channels))
y_train = keras.utils.to_categorical(np.random.randint(num_classes, size=(n_train, 1)), num_classes=num_classes)
x_test = np.random.random((n_test, height, width, channels))
y_test = keras.utils.to_categorical(np.random.randint(num_classes, size=(n_test, 1)), num_classes=num_classes)
# Get input shape
input_shape = x_train.shape[1:]
# Instantiate model
model = MobileNet(weights=None,
                  input_shape=input_shape,
                  classes=num_classes)
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
# Viewing Model Configuration
model.summary()
# Model file name
filepath = 'model_epoch_{epoch:02d}_loss_{loss:0.2f}_val_{val_loss:.2f}.hdf5'
# Define save_best_only checkpointer
checkpointer = ModelCheckpoint(filepath=filepath,
                               monitor='val_acc',
                               verbose=1,
                               save_best_only=True)
# Let's fit!
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test),
          callbacks=[checkpointer])
As always, here are my Python, Keras & TensorFlow versions:
python -c "import keras; import tensorflow; import sys; print(sys.version, keras.__version__, tensorflow.__version__)"
/anaconda2/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
('2.7.15 |Anaconda, Inc.| (default, May 1 2018, 18:37:05)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)]', '2.1.6', '1.8.0')
Here are some results obtained by running this code multiple times: as you can see, the code saves the best model (best validation accuracy) out of 10 epochs with a descriptive filename, so comparing filenames across different runs gives an idea of the variability in results.
model_epoch_01_loss_2.39_val_3.28.hdf5
model_epoch_01_loss_2.39_val_3.54.hdf5
model_epoch_01_loss_2.40_val_3.47.hdf5
model_epoch_01_loss_2.41_val_3.08.hdf5