I have a loss value/function and I would like to compute all of its second derivatives with respect to a tensor f (of size n). I managed to use tf.gradients twice, but when applying it the second time, it sums the derivatives across the entries of the first input (see second_derivatives in my code), so I do not get the individual second derivatives.
I also managed to retrieve the full Hessian matrix, but I would like to compute only its diagonal to avoid the extra computation.
import tensorflow as tf
import numpy as np
f = tf.Variable(np.array([[1., 2., 0]]).T)   # column vector, shape (3, 1)
loss = tf.reduce_prod(f ** 2 - 3 * f + 1)
# First derivatives: d loss / d f, shape (3, 1).
first_derivatives = tf.gradients(loss, f)[0]
# The second call sums over the entries of first_derivatives,
# so this is NOT the Hessian diagonal.
second_derivatives = tf.gradients(first_derivatives, f)[0]
# Full Hessian, built one row at a time (the extra computation I would like to avoid).
hessian = [tf.gradients(first_derivatives[i, 0], f)[0][:, 0] for i in range(3)]
model = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(model)
    print "\nloss\n", sess.run(loss)
    print "\nloss'\n", sess.run(first_derivatives)
    print "\nloss''\n", sess.run(second_derivatives)
    hessian_value = np.array(map(list, sess.run(hessian)))
    print "\nHessian\n", hessian_value
My thinking was that tf.gradients(first_derivatives, f[0, 0])[0] would work to retrieve, for instance, the second derivative with respect to f_0, but it seems that TensorFlow doesn't allow differentiating with respect to a slice of a tensor.
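For completeness, a minimal sketch of that attempt (the variable name second_wrt_f0 is just for illustration):

# Sketch of the attempt that did not work.
second_wrt_f0 = tf.gradients(first_derivatives, f[0, 0])[0]
# As far as I can tell this comes back as None, because loss and
# first_derivatives were built from f itself, not from this new slice op.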