TensorFlow automatically runs computations on as many cores as are available on a single machine.
If you have a distributed cluster, be sure to follow the instructions at https://www.tensorflow.org/how_tos/distributed/ to configure the cluster (e.g. creating the tf.ClusterSpec correctly).
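As a minimal sketch of what that setup looks like (the host names, ports, and job layout below are placeholder assumptions, not values from the docs), a cluster definition might be:

import tensorflow as tf

# Hypothetical addresses; substitute your own cluster's hosts and ports.
cluster = tf.train.ClusterSpec({
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
    "ps": ["ps0.example.com:2222"],
})
# Each process in the cluster starts a server for its own task;
# this one plays the role of the first worker.
server = tf.train.Server(cluster, job_name="worker", task_index=0)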
To help debug, you can set the log_device_placement
configuration option on the session to have TensorFlow print out where the computations are actually placed. (Note: this works both for GPUs and for distributed TensorFlow.)
import tensorflow as tf

# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
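For example, running a trivial op in this session will cause its device assignment to be logged (the op itself is arbitrary and just for illustration):

# Any op run in the session above has its placement printed to the log.
a = tf.constant([1.0, 2.0], name='a')
b = tf.constant([3.0, 4.0], name='b')
print(sess.run(a + b))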
Note that while TensorFlow's computation placement algorithm works fine for small computational graphs, you might get better performance on large computational graphs by manually placing the computations on specific devices (e.g. using with tf.device(...): blocks).
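As a rough sketch of manual placement (the device strings are assumptions about your hardware; '/gpu:0' requires a visible GPU):

# Pin the input to the CPU and the matmul to the first GPU.
with tf.device('/cpu:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]], name='a')
with tf.device('/gpu:0'):
    b = tf.matmul(a, a, name='b')
# allow_soft_placement lets TensorFlow fall back to another device
# if the requested one is unavailable.
sess = tf.Session(config=tf.ConfigProto(
    log_device_placement=True, allow_soft_placement=True))
print(sess.run(b))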