python - Second derivative in Keras

Question

Welcome To Ask or Share your Answers For Others

python - Second derivative in Keras

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

python - Second derivative in Keras

For a custom loss for a NN I use the function . u, given a pair (t,x), both points in an interval, is the the output of my NN. Problem is I'm stuck at how to compute the second derivative using K.gradient (K being the TensorFlow backend):

def custom_loss(input_tensor, output_tensor):
    def loss(y_true, y_pred):

        # so far, I can only get this right, naturally:            
        gradient = K.gradients(output_tensor, input_tensor)

        # here I'm falling badly:

        # d_t = K.gradients(output_tensor, input_tensor)[0]
        # dd_x = K.gradient(K.gradients(output_tensor, input_tensor),
        #                   input_tensor[1])

        return gradient # obviously not useful, just for it to work
    return loss

All my attemps, based on Input(shape=(2,)), were variations of the commented lines in the snippet above, mainly trying to find the right indexation of the resulting tensor.

Sure enough I lack knowledge of how exactly tensors work. By the way, I know in TensorFlow itself I could simply use tf.hessian, but I noticed it's just not present when using TF as a backend.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-23T17:52:34+0000

In order for a K.gradients() layer to work like that, you have to enclose it in a Lambda() layer, because otherwise a full Keras layer is not created, and you can't chain it or train through it. So this code will work (tested):

import keras
from keras.models import *
from keras.layers import *
from keras import backend as K
import tensorflow as tf

def grad( y, x ):
    return Lambda( lambda z: K.gradients( z[ 0 ], z[ 1 ] ), output_shape = [1] )( [ y, x ] )

def network( i, d ):
    m = Add()( [ i, d ] )
    a = Lambda(lambda x: K.log( x ) )( m )
    return a

fixed_input = Input(tensor=tf.constant( [ 1.0 ] ) )
double = Input(tensor=tf.constant( [ 2.0 ] ) )

a = network( fixed_input, double )

b = grad( a, fixed_input )
c = grad( b, fixed_input )
d = grad( c, fixed_input )
e = grad( d, fixed_input )

model = Model( inputs = [ fixed_input, double ], outputs = [ a, b, c, d, e ] )

print( model.predict( x=None, steps = 1 ) )

def network models f( x ) = log( x + 2 ) at x = 1. def grad is where the gradient calculation is done. This code outputs:

[array([1.0986123], dtype=float32), array([0.33333334], dtype=float32), array([-0.11111112], dtype=float32), array([0.07407408], dtype=float32), array([-0.07407409], dtype=float32)]

which are the correct values for log( 3 ), ⅓, -1 / 3², 2 / 3³, -6 / 3⁴.

Reference TensorFlow code

For reference, the same code in plain TensorFlow (used for testing):

import tensorflow as tf

a = tf.constant( 1.0 )
a2 = tf.constant( 2.0 )

b = tf.log( a + a2 )
c = tf.gradients( b, a )
d = tf.gradients( c, a )
e = tf.gradients( d, a )
f = tf.gradients( e, a )

with tf.Session() as sess:
    print( sess.run( [ b, c, d, e, f ] ) )

outputs the same values:

[1.0986123, [0.33333334], [-0.11111112], [0.07407408], [-0.07407409]]

Hessians

tf.hessians() does return the second derivative, that's a shorthand for chaining two tf.gradients(). The Keras backend doesn't have hessians though, so you do have to chain the two K.gradients().

Numerical approximation

If for some reason none of the above works, then you might want to consider numerically approximating the second derivative with taking the difference over a small ε distance. This basically triples the network for each input, so this solution introduces serious efficiency considerations, besides lacking in accuracy. Anyway, the code (tested):

import keras
from keras.models import *
from keras.layers import *
from keras import backend as K
import tensorflow as tf

def network( i, d ):
    m = Add()( [ i, d ] )
    a = Lambda(lambda x: K.log( x ) )( m )
    return a

fixed_input = Input(tensor=tf.constant( [ 1.0 ], dtype = tf.float64 ) )
double = Input(tensor=tf.constant( [ 2.0 ], dtype = tf.float64 ) )

epsilon = Input( tensor = tf.constant( [ 1e-7 ], dtype = tf.float64 ) )
eps_reciproc = Input( tensor = tf.constant( [ 1e+7 ], dtype = tf.float64 ) )

a0 = network( Subtract()( [ fixed_input, epsilon ] ), double )
a1 = network(               fixed_input,              double )
a2 = network(      Add()( [ fixed_input, epsilon ] ), double )

d0 = Subtract()( [ a1, a0 ] )
d1 = Subtract()( [ a2, a1 ] )

dv0 = Multiply()( [ d0, eps_reciproc ] )
dv1 = Multiply()( [ d1, eps_reciproc ] )

dd0 = Multiply()( [ Subtract()( [ dv1, dv0 ] ), eps_reciproc ] )

model = Model( inputs = [ fixed_input, double, epsilon, eps_reciproc ], outputs = [ a0, dv0, dd0 ] )

print( model.predict( x=None, steps = 1 ) )

Outputs:

[array([1.09861226]), array([0.33333334]), array([-0.1110223])]

(This only gets to the second derivative.)

Categories

python - Second derivative in Keras

python - Second derivative in Keras

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Reference TensorFlow code

Hessians

Numerical approximation

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags