This repository contains code for detecting objects in images that are mentioned by referring expressions. The code implements the technique presented in our paper. We have also included links to pretrained models, our split of the Google RefExp dataset, and a processed version of the UNC RefExp dataset. If you use this code, please cite:
@inproceedings{nagaraja16refexp,
title={Modeling Context Between Objects for Referring Expression Understanding},
author={Varun K. Nagaraja and Vlad I. Morariu and Larry S. Davis},
booktitle={ECCV},
year={2016}
}
We have also implemented the baseline and max-margin techniques proposed by Mao et al. in their CVPR 2016 paper. If you use the Google RefExp dataset, please cite their paper:
@inproceedings{google_refexp,
title={Generation and Comprehension of Unambiguous Object Descriptions},
author={Mao, Junhua and Huang, Jonathan and Toshev, Alexander and Camburu, Oana and Yuille, Alan and Murphy, Kevin},
booktitle={CVPR},
year={2016}
}
If you use the UNC RefExp dataset, please cite the following paper:
@inproceedings{unc_refexp,
title={Modeling Context in Referring Expressions},
author={Licheng Yu and Patrick Poirson and Shan Yang and Alexander C. Berg and Tamara L. Berg},
booktitle={ECCV},
year={2016}
}
Download Google RefExp dataset with our split and MCG region candidates
cd $COCO_PATH
wget https://obj.umiacs.umd.edu/referring-expressions/google_refexp_umd_split.tar.gz
tar -xzf google_refexp_umd_split.tar.gz
rm google_refexp_umd_split.tar.gz
Note: If you want the original split of the Google RefExp dataset, follow the instructions at this link. Then move the dataset files to the appropriate folder as indicated above.
Download UNC RefExp dataset with MCG candidates
cd $COCO_PATH
wget https://obj.umiacs.umd.edu/referring-expressions/unc_refexp.tar.gz
tar -xzf unc_refexp.tar.gz
rm unc_refexp.tar.gz
Testing
Create cache directories where we will store the model and vocabulary files
mkdir -p $COCO_PATH/cache_dir
cd $COCO_PATH/cache_dir
mkdir -p h5_data models
cd h5_data
wget https://obj.umiacs.umd.edu/referring-expressions/Google_RefExp_vocabulary.txt
wget https://obj.umiacs.umd.edu/referring-expressions/UNC_RefExp_vocabulary.txt
cd ..
cd models
# baseline models trained on Google RefExp and UNC RefExp
wget https://obj.umiacs.umd.edu/referring-expressions/baseline_models.tar.gz
tar -xzf baseline_models.tar.gz
rm baseline_models.tar.gz
# max-margin models
wget https://obj.umiacs.umd.edu/referring-expressions/max_margin_models.tar.gz
tar -xzf max_margin_models.tar.gz
rm max_margin_models.tar.gz
# context models with negative bag margin
wget https://obj.umiacs.umd.edu/referring-expressions/mil_context_withNegMargin_models.tar.gz
tar -xzf mil_context_withNegMargin_models.tar.gz
rm mil_context_withNegMargin_models.tar.gz
# context models with positive bag margin
wget https://obj.umiacs.umd.edu/referring-expressions/mil_context_withPosNegMargin_models.tar.gz
tar -xzf mil_context_withPosNegMargin_models.tar.gz
rm mil_context_withPosNegMargin_models.tar.gz
Note: In the paper, for the Google RefExp experiments, we report numbers from models trained on a subset of the training set, since we use the remaining training data for validation. However, these pretrained models were trained on the entire training set and hence give slightly better numbers than those reported in the paper.
Testing will first extract region features for all images in the dataset and dump them in a format suitable for loading in Caffe. This requires a lot of disk space, depending on the experiment type you want to run.
If you are working with Google RefExp, we split the training data to create a validation partition of our own, since the test set of the Google RefExp dataset has not yet been released.
cd $COCO_PATH/cache_dir/h5_data/buffer_16/Google_RefExp_baseline_20
head -n 5038 hdf5_chunk_list.txt > hdf5_chunk_list_part1.txt
tail -n 300 hdf5_chunk_list.txt > hdf5_chunk_list_part2.txt
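The head/tail counts above should partition the chunk list exactly, with no chunk appearing in both parts and none missing. A quick arithmetic sanity check (a sketch; the helper name is ours, and 5338 is simply the total the two counts imply):

```python
def is_clean_partition(total_lines, head_n, tail_n):
    """True when `head -n head_n` and `tail -n tail_n` split a file of
    total_lines lines with no gap and no overlap between the parts."""
    return head_n + tail_n == total_lines

# With the counts used above, hdf5_chunk_list.txt should have 5338 lines.
print(is_clean_partition(5338, 5038, 300))  # True  -> clean train/val split
print(is_clean_partition(5340, 5038, 300))  # False -> two chunks fall in neither part
```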
Edit the training prototxt file (e.g., proto_files/google_refexp/google_refexp.baseline.prototxt) and set the correct source in hdf5_data_param. For example:
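A minimal sketch of the relevant layer (the layer and top names here are illustrative, not the repository's exact prototxt; only the source path matters, and $COCO_PATH must be expanded to a literal path):

```
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  hdf5_data_param {
    # literal path; Caffe does not expand environment variables
    source: "/path/to/coco/cache_dir/h5_data/buffer_16/Google_RefExp_baseline_20/hdf5_chunk_list_part1.txt"
    batch_size: 16
  }
}
```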
The script prints the log to the screen and also writes it to a file.
When training is complete, choose the iteration number of the model snapshot with the lowest cross-entropy loss on the validation set, and set this iteration number in lib/experiment_settings.py to test the trained model. The following command extracts the lines that contain the cross-entropy loss:
grep "Testing net (#1)" -A 4 google_refexp.baseline.log
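Instead of scanning the grep output by eye, a small script can pick the best snapshot automatically. This is a sketch, not part of the repository: it assumes the standard Caffe log format, where an "Iteration N, Testing net (#1)" line is followed by "Test net output" lines containing the loss.

```python
import re

def best_iteration(log_text):
    """Return the (iteration, loss) pair with the lowest validation loss,
    or None if no test-net lines are found. Assumes Caffe-style logs."""
    best = None
    current_iter = None
    for line in log_text.splitlines():
        m = re.search(r"Iteration (\d+), Testing net \(#1\)", line)
        if m:
            current_iter = int(m.group(1))
            continue
        m = re.search(r"Test net output #\d+: .*loss = ([0-9.eE+-]+)", line)
        if m and current_iter is not None:
            loss = float(m.group(1))
            if best is None or loss < best[1]:
                best = (current_iter, loss)
    return best

# Example on a made-up log fragment:
sample = """
I0601 solver.cpp:330] Iteration 2000, Testing net (#1)
I0601 solver.cpp:397]     Test net output #0: cross_entropy_loss = 0.52 (* 1 = 0.52 loss)
I0601 solver.cpp:330] Iteration 4000, Testing net (#1)
I0601 solver.cpp:397]     Test net output #0: cross_entropy_loss = 0.41 (* 1 = 0.41 loss)
"""
print(best_iteration(sample))  # (4000, 0.41)
```

To run it on your own log, pass `open("google_refexp.baseline.log").read()` instead of the sample string.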