Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
424 views
in Technique[技术] by (71.8m points)

python - overcome Graphdef cannot be larger than 2GB in tensorflow

I am using tensorflow's imageNet trained model to extract the last pooling layer's features as representation vectors for a new dataset of images.

The model as is predicts on a new image as follows:

python classify_image.py --image_file new_image.jpeg 

I edited the main function so that I can take a folder of images and return the prediction on all images at once and write the feature vectors in a csv file. Here is how I did that:

def main(_):
  maybe_download_and_extract()
  #image = (FLAGS.image_file if FLAGS.image_file else
  #         os.path.join(FLAGS.model_dir, 'cropped_panda.jpg'))
  #edit to take a directory of image files instead of a one file
  if FLAGS.data_folder:
    images_folder=FLAGS.data_folder
    list_of_images = os.listdir(images_folder)
  else: 
    raise ValueError("Please specify image folder")

  with open("feature_data.csv", "wb") as f:
    feature_writer = csv.writer(f, delimiter='|')

    for image in list_of_images:
      print(image) 
      current_features = run_inference_on_image(images_folder+"/"+image)
      feature_writer.writerow([image]+current_features)

It worked just fine for around 21 images but then crashed with the following error:

  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1912, in as_graph_def
    raise ValueError("GraphDef cannot be larger than 2GB.")
ValueError: GraphDef cannot be larger than 2GB.

I thought by calling the method run_inference_on_image(images_folder+"/"+image) the previous image data would be overwritten to only consider the new image data, which doesn't seem to be the case. How to resolve this issue?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The problem here is that each call to run_inference_on_image() adds nodes to the same graph, which eventually exceeds the maximum size. There are at least two ways to fix this:

  1. The easy but slow way is to use a different default graph for each call to run_inference_on_image():

    for image in list_of_images:
      # ...
      with tf.Graph().as_default():
        current_features = run_inference_on_image(images_folder+"/"+image)
      # ...
    
  2. The more involved but more efficient way is to modify run_inference_on_image() to run on multiple images. Relocate your for loop to surround this sess.run() call, and you will no longer have to reconstruct the entire model on each call, which should make processing each image much faster.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...