python - Tensorflow negative sampling

Question


posted Oct 24, 2021 in Technique by 深蓝 (71.8m points)

I am following the Udacity tutorial on TensorFlow, where I came across the following two statements for word embedding models:

  # Look up embeddings for inputs.
  embed = tf.nn.embedding_lookup(embeddings, train_dataset)
  # Compute the softmax loss, using a sample of the negative labels each time.
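  # Keyword arguments avoid relying on positional order, which changed between TensorFlow versions.
  loss = tf.reduce_mean(tf.nn.sampled_softmax_loss(
      weights=softmax_weights, biases=softmax_biases, inputs=embed,
      labels=train_labels, num_sampled=num_sampled, num_classes=vocabulary_size))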

I understand that the second statement samples negative labels. But how does it know what the negative labels are? All I pass to the second function is the current input, its corresponding labels, the number of labels I want to (negatively) sample, and the vocabulary size. Isn't there a risk of sampling the input's own label as a negative?

This is the full example: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/udacity/5_word2vec.ipynb

1 Reply

深蓝 · Answer 1 · 2021-10-23T19:27:38+0000

The TensorFlow documentation covers tf.nn.sampled_softmax_loss(), and TensorFlow also provides a good explanation of candidate sampling (pdf).

How does it know what the negative labels are?

TensorFlow randomly selects the negative classes from among all possible classes (in your case, every word in the vocabulary).
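
To make "randomly selects" a bit more concrete: when no sampled_values are passed, tf.nn.sampled_softmax_loss() draws candidates with a log-uniform (Zipfian) sampler, which favours low class ids and so assumes the vocabulary is sorted by decreasing frequency. The following is only a pure-Python sketch of that distribution, not TensorFlow's code; the function names and the rng parameter are hypothetical.

```python
import math
import random


def log_uniform_probability(class_id, range_max):
    """P(class_id) under a log-uniform distribution over [0, range_max)."""
    return (math.log(class_id + 2) - math.log(class_id + 1)) / math.log(range_max + 1)


def sample_log_uniform(num_sampled, range_max, rng):
    """Draw class ids by inverting the log-uniform CDF (illustrative helper)."""
    samples = []
    for _ in range(num_sampled):
        u = rng.random()  # uniform in [0, 1)
        value = int(math.exp(u * math.log(range_max + 1))) - 1
        # Guard against floating-point rounding at the upper edge.
        samples.append(min(value, range_max - 1))
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    # Low ids (frequent words) should appear much more often than high ids.
    print(sample_log_uniform(10, 50000, rng))
```

Note that the sampler never looks at the labels of the current batch: every class in the vocabulary is a possible candidate.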

Isn't there the risk of sampling from the input set in itself?

When you compute the softmax probability for your true label, you compute: exp(logits[true_label]) / (exp(logits[true_label]) + sum(exp(logits[negative_sampled_labels]))). As the number of classes is huge (the vocabulary size), the probability of sampling the true_label as a negative label is usually small.
In any case, TensorFlow removes this possibility by default: tf.nn.sampled_softmax_loss() takes a remove_accidental_hits argument that defaults to True, so sampled classes that coincide with the true label are discarded. (EDIT: @Alex confirms TensorFlow does this by default)
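
As a rough sketch of the idea, the loss for a single example can be written in plain Python as a cross-entropy over the true class plus a handful of sampled classes. This is a simplification and makes several assumptions: negatives are drawn uniformly rather than log-uniformly, the correction TensorFlow applies for sampling probabilities is omitted, and the function name and rng parameter are hypothetical.

```python
import math
import random


def sampled_softmax_loss_sketch(logits, true_label, num_sampled, rng,
                                remove_accidental_hits=True):
    """Cross-entropy over the true class plus randomly sampled classes."""
    num_classes = len(logits)
    # Sample candidate negatives from ALL classes (uniform here for simplicity).
    sampled = [rng.randrange(num_classes) for _ in range(num_sampled)]
    if remove_accidental_hits:
        # Drop samples that happen to equal the true label.
        sampled = [c for c in sampled if c != true_label]
    candidate_logits = [logits[true_label]] + [logits[c] for c in sampled]
    # Numerically stable log-sum-exp over the candidates.
    m = max(candidate_logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in candidate_logits))
    # Negative log of the softmax probability of the true label.
    return log_norm - logits[true_label]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [2.0, 0.5, -1.0, 0.0]
    print(sampled_softmax_loss_sketch(logits, true_label=0, num_sampled=3, rng=rng))
```

With remove_accidental_hits=False, a sample equal to the true label would count against it in the denominator, which is exactly the situation the question worries about.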
