python - Getting around tf.argmax which is not differentiable

Question

1 Reply

深蓝 · Answer 1 · 2021-10-23T17:46:42+0000

If you are fine with an approximation, you can compute a soft argmax instead:

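import tensorflow as tf
import numpy as np

# Written against the TensorFlow 1.x graph API (tf.Session, tf.placeholder).
# Under TensorFlow 2.x the same calls live in tf.compat.v1 and need
# tf.compat.v1.disable_eager_execution() to be called first.
sess = tf.Session()
x = tf.placeholder(dtype=tf.float32, shape=(None,))
beta = tf.placeholder(dtype=tf.float32)

# Pseudo-math for the below (i is the 0-based index):
# y = sum( i * exp(beta * x[i]) ) / sum( exp(beta * x[i]) )
# tf.cumsum(tf.ones_like(x)) yields 1, 2, ..., n, hence the trailing "- 1".
y = tf.reduce_sum(tf.cumsum(tf.ones_like(x)) * tf.exp(beta * x) / tf.reduce_sum(tf.exp(beta * x))) - 1

print("I can compute the gradient", tf.gradients(y, x))

for run in range(10):
    data = np.random.randn(10)
    # Print the exact argmax next to the soft approximation at beta = 100.
    print(data.argmax(), sess.run(y, feed_dict={x: data / np.linalg.norm(data), beta: 1e2}))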

This uses the trick that, at low temperature, the mean index under the softmax distribution approximates the position of the maximum. Low temperature in this case corresponds to beta being very large.
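To make the temperature effect concrete, here is a minimal NumPy sketch that evaluates the same formula for increasing beta on a fixed vector. It is not part of the original answer: the name `soft_argmax` and the example values are chosen purely for illustration, and it only shows the forward computation, not the gradient.

```python
import numpy as np

def soft_argmax(x, beta):
    """Expected 0-based index under softmax(beta * x) (illustrative helper)."""
    x = np.asarray(x, dtype=np.float64)
    weights = np.exp(beta * x)           # unnormalised softmax weights
    probs = weights / weights.sum()      # normalise so the weights sum to 1
    return float(np.sum(np.arange(len(x)) * probs))

x = np.array([0.1, 0.7, 0.3, 0.5])       # the maximum sits at index 1
for beta in (1.0, 10.0, 100.0):
    # The printed value moves toward 1 as beta grows.
    print(beta, soft_argmax(x, beta))
```

For small beta the result is pulled toward the other indices, which is why the approximation can miss when two entries are close in value.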

In fact, as beta approaches infinity, y converges to the index of the maximum (assuming the maximum is unique). Unfortunately, beta can't get too large before exp(beta * x) overflows and you get NaN, but there are tricks to solve that, which I can go into if you care.
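One common way to avoid the overflow, which may or may not be the trick the answer has in mind, is to subtract the largest value of beta * x before exponentiating. The shift cancels in the ratio, so the result is unchanged mathematically, but the largest exponent becomes 0. The sketch below is a NumPy illustration under that assumption; `stable_soft_argmax` is a hypothetical name, not something from the answer.

```python
import numpy as np

def stable_soft_argmax(x, beta):
    """Soft argmax with the max-subtraction trick (illustrative helper)."""
    x = np.asarray(x, dtype=np.float64)
    z = beta * x
    z = z - z.max()                      # largest exponent is now 0, so exp() cannot overflow
    weights = np.exp(z)
    probs = weights / weights.sum()
    return float(np.sum(np.arange(len(x)) * probs))

x = np.array([0.2, -0.5, 0.9, 0.1])      # the maximum sits at index 2
beta = 1e4

# Naive version: exp(9000) overflows to inf, and inf / inf gives NaN.
with np.errstate(over="ignore", invalid="ignore"):
    naive_weights = np.exp(beta * x)
    naive_probs = naive_weights / naive_weights.sum()

print(np.isnan(naive_probs).any())       # True: the naive weights contain NaN
print(stable_soft_argmax(x, beta))       # finite, equal to the index of the maximum
```

In TensorFlow, `tf.nn.softmax(beta * x)` should give the same normalised weights in a differentiable form, so the explicit exp and division in the graph above could likely be replaced by it.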

The output looks something like,

So you can see that it messes up in some spots, but often gets the right answer. Depending on your algorithm, this might be fine.
