python - How to ensure neural net performance comparability?

For my thesis I am trying to evaluate the impact of different parameters on my active-learning object detector built with TensorFlow (v1.14).

Therefore I am using the faster_rcnn_inception_v2_coco standard config from the model zoo and a fixed random.seed(1).

To make sure I have a working baseline experiment, I tried to run the object detector twice with the same dataset, learning time, pooling size, and so forth.

Anyhow, the two plotted graphs after 20 active-learning cycles are quite different, as you can see here:

[plot: results of the two identical runs after 20 active-learning cycles]

Is it possible to ensure comparable neural net performance?

If yes, how should I set up a scientific experiment to compare the outcomes of parameter changes such as learning rate, learning time (a constraint in our active-learning cycle!), pooling size, and so on?

Asked by Sh0rtey, translated from Stack Overflow


1 Reply


To achieve determinism when training on the CPU, the following should be sufficient:

1. SET ALL SEEDS

import os
import random
import numpy as np
import tensorflow as tf

SEED = 123
os.environ['PYTHONHASHSEED'] = str(SEED)  # Python hash randomization (ideally set before the interpreter starts)
random.seed(SEED)         # Python built-in RNG
np.random.seed(SEED)      # NumPy RNG
tf.set_random_seed(SEED)  # TensorFlow graph-level seed (TF 1.x)

2. LIMIT CPU THREADS TO ONE

session_config = tf.ConfigProto()
session_config.intra_op_parallelism_threads = 1
session_config.inter_op_parallelism_threads = 1
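
As a usage sketch (the surrounding plumbing here is my assumption, not part of the original answer), that config would then be passed to the session or, for Estimator-based training such as the Object Detection API, to a RunConfig:

# Hypothetical wiring for Estimator-based training:
run_config = tf.estimator.RunConfig(session_config=session_config,
                                    tf_random_seed=SEED)

# Or, for plain-session training:
sess = tf.Session(config=session_config)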

3. DATASET WORKERS

If you are using tf.data.Dataset, then make sure the number of workers is limited to one, as sketched below.
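
A minimal illustration of a single-threaded input pipeline (parse_fn, filenames, and batch_size are placeholders I am assuming, not taken from the original answer):

dataset = tf.data.TFRecordDataset(filenames)
# Single worker: avoid num_parallel_calls > 1 (or AUTOTUNE), which can make element order non-deterministic.
dataset = dataset.map(parse_fn, num_parallel_calls=1)
dataset = dataset.shuffle(buffer_size=1000, seed=SEED)  # seeded shuffle
dataset = dataset.batch(batch_size)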

4. HOROVOD

If you are doing multi-GPU training with Horovod, you may also need to disable Tensor Fusion, like so:

os.environ['HOROVOD_FUSION_THRESHOLD']='0'
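
For illustration only (the Horovod initialization shown is my assumption, not spelled out above), the fusion threshold is typically set before Horovod is initialized:

os.environ['HOROVOD_FUSION_THRESHOLD'] = '0'  # disable Tensor Fusion

import horovod.tensorflow as hvd
hvd.init()

# Pin each process to one GPU, as in standard Horovod setups.
session_config.gpu_options.visible_device_list = str(hvd.local_rank())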

To more clearly check for determinism between runs, I recommend the method I have documented here.

I also recommend using this approach to confirm that the initial weights (before step one of training) are exactly the same between runs; one possible check is sketched below.
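
As one possible check (a sketch of my own, not the method linked above), you can hash all trainable variables right after initialization and compare the digest across runs:

import hashlib

def weights_digest(sess):
    # Hypothetical helper: one hash over all trainable variables.
    digest = hashlib.sha256()
    for var in tf.trainable_variables():
        digest.update(sess.run(var).tobytes())
    return digest.hexdigest()

with tf.Session(config=session_config) as sess:
    sess.run(tf.global_variables_initializer())
    # Identical runs should print identical digests before training starts.
    print('initial weights digest:', weights_digest(sess))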


For the latest information on determinism in TensorFlow (with a focus on determinism when using GPUs), please take a look at the tensorflow-determinism project, which NVIDIA is kindly paying me to drive.

