Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Initializing tensorflow Variable with an array larger than 2GB

I am trying to initialize a tensorflow Variable with pre-trained word2vec embeddings.

I have the following code:

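import tensorflow as tf
from gensim import models

# Load the pre-trained vectors; model.syn0 is a NumPy array of shape (3000000, 300).
model = models.Word2Vec.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
X = model.syn0

embeddings = tf.Variable(tf.random_uniform(X.shape, minval=-0.1, maxval=0.1), trainable=False)

sess = tf.Session()
sess.run(tf.initialize_all_variables())

# This line raises the error below: assign(X) embeds X in the graph as a constant.
sess.run(embeddings.assign(X))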

And I am receiving the following error:

ValueError: Cannot create an Operation with a NodeDef larger than 2GB.

The array (X) I am trying to assign is of shape (3000000, 300) and its size is 3.6GB.

I get the same error if I try tf.convert_to_tensor(X) instead.

I understand that this fails because the array is larger than 2GB. However, I do not know how to assign an array of that size to a tensorflow Variable.
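The size quoted above can be checked with quick arithmetic. This is only a sketch: it assumes the vectors are stored as float32 (4 bytes each), which is consistent with the 3.6GB figure, and it compares the result against the 2GB cap that protocol buffers place on a single serialized message.

```python
# Back-of-the-envelope size check for the embedding matrix.
ROWS, COLS = 3000000, 300      # shape quoted in the question
BYTES_PER_FLOAT32 = 4          # assumption: the vectors are stored as float32
PROTOBUF_LIMIT = 2**31 - 1     # protocol buffers cap a serialized message at 2GB

array_bytes = ROWS * COLS * BYTES_PER_FLOAT32
print(array_bytes)                    # 3600000000
print(array_bytes > PROTOBUF_LIMIT)   # True
```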


1 Reply

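It seems like the only option is to use a placeholder. Both assign(X) and tf.convert_to_tensor(X) store the array as a constant inside the graph definition, which is serialized as a protocol buffer and cannot exceed 2GB. A value fed through a placeholder never becomes part of the graph, so the limit does not apply. The cleanest way I can find is to initialize the Variable from a placeholder directly: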

X_init = tf.placeholder(tf.float32, shape=(3000000, 300))
X = tf.Variable(X_init)
# The rest of the setup...
sess.run(tf.initialize_all_variables(), feed_dict={X_init: model.syn0})
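For newer installations, the same idea can be sketched with the 1.x compatibility API. This assumes TensorFlow 2.x with tf.compat.v1 available; the helper name load_embeddings and the small random array are made up for illustration and stand in for model.syn0, so treat it as a sketch rather than a drop-in replacement.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # assumption: TensorFlow 2.x exposing the 1.x compatibility API


def load_embeddings(values):
    """Create a non-trainable variable initialized from `values` via a placeholder.

    Hypothetical helper: the array is passed through feed_dict, so it is never
    stored as a constant in the graph definition.
    """
    graph = tf.Graph()
    with graph.as_default():
        init = tf1.placeholder(tf.float32, shape=values.shape)
        embeddings = tf1.Variable(init, trainable=False)
        init_op = tf1.global_variables_initializer()
    sess = tf1.Session(graph=graph)
    # The initializer reads the placeholder, so the data is supplied here.
    sess.run(init_op, feed_dict={init: values})
    return sess, embeddings


# Small stand-in for model.syn0 so the sketch runs quickly.
small = np.random.uniform(-0.1, 0.1, size=(5, 3)).astype(np.float32)
sess, embeddings = load_embeddings(small)
loaded = sess.run(embeddings)
sess.close()
```

With the real data, model.syn0 would be passed in place of the small array. The placeholder is only needed while the initializer runs; later sess.run calls do not have to feed it.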
