ColumbiaDVMM/CDC: CDC: Convolutional-De-Convolutional Networks for Precise Tempo ...

Open-source project name:

ColumbiaDVMM/CDC

Open-source project URL:

https://github.com/ColumbiaDVMM/CDC

Open-source project introduction:

Please find the whole code repo and models at CDC-bitbucket

Project website: http://www.ee.columbia.edu/ln/dvmm/researchProjects/cdc/

CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos

By Zheng Shou, Jonathan Chan, Alireza Zareian, Kazuyuki Miyazawa, and Shih-Fu Chang.

Introduction:

CDC is a deep learning framework for per-frame labeling and precise temporal action localization in untrimmed long videos.

This code has been tested on Ubuntu 14.04 with a single NVIDIA GeForce GTX TITAN X card.

Please use "Issues" on GitHub, not Bitbucket, to ask questions or report bugs. Thanks.

The current code is a rough version, and we are still improving its implementation details. It is nonetheless sufficient to run the demo, reproduce our experimental results, and train your own models.

License:

CDC is released under the MIT License (refer to the LICENSE file for details).

Citing:

If you find CDC useful, please consider citing:

@inproceedings{cdc_shou_cvpr17,
  author = {Zheng Shou and Jonathan Chan and Alireza Zareian and Kazuyuki Miyazawa and Shih-Fu Chang},
  title = {CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos},
  year = {2017},
  booktitle = {CVPR} 
  }

This repo builds on C3D and THUMOS Challenge 2014. Please cite the following papers as well:

D. Tran, L. Bourdev, R. Fergus, L. Torresani, and M. Paluri, Learning Spatiotemporal Features with 3D Convolutional Networks, ICCV 2015.

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, Caffe: Convolutional Architecture for Fast Feature Embedding, arXiv 2014.

A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei, Large-scale Video Classification with Convolutional Neural Networks, CVPR 2014.

@misc{THUMOS14,
  author = "Jiang, Y.-G. and Liu, J. and Roshan Zamir, A. and Toderici, G. and Laptev, I. and Shah, M. and Sukthankar, R.",
  title = "{THUMOS} Challenge: Action Recognition with a Large Number of Classes",
  howpublished = "\url{http://crcv.ucf.edu/THUMOS14/}",
  Year = {2014}
  }
  
@inproceedings{scnn_shou_wang_chang_cvpr16,
  author = {Zheng Shou and Dongang Wang and Shih-Fu Chang},
  title = {Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs},
  year = {2016},
  booktitle = {CVPR} 
  }

Installation:

Hint: please refer to C3D-v1.0 and Caffe for more details about compilation, such as how to write your own Makefile.config.
Compile CDC: cd ./CDC/; make all
Note that you do not need to build the unit test cases.
To use our code smoothly, please first get familiar with C3D.

Run demo:

This demo lets users quickly try CDC feature extraction.
More details of this demo:

We provide input data in demo/data/window, along with the input data list file demo/data/test.lst.
Each input data sample is a window 32 frames long. To directly reuse the VIDEO_SEGMENTATION_DATA data format developed in C3D-v1.0, each input sample is stored in bin format and consists of pixel values stacked over time. In the channel dimension, besides the RGB values, the pixel-level ground truth label is attached as the 4th value; all pixels in the same frame have the same label. During testing, the label can be set to a random value, since it is not used. We provide example code for generating such bin files on the THUMOS test set in the next section.
Run the demo: cd demo; ./xfeat.sh
Output results will be stored in demo/feat.
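
To make the layout described above concrete, here is a minimal sketch of how one window could be assembled in memory: three color planes plus a fourth plane that repeats each frame's label at every pixel. The function name build_window, the nested-list representation, and the channel-first axis order are assumptions made for illustration only. The actual on-disk bin layout (header, data type, axis order) is defined by C3D-v1.0 and by the generation script for the THUMOS test set, not by this sketch.

```python
WINDOW_LENGTH = 32  # frames per input window, as stated above


def build_window(frames, labels):
    """Stack RGB frames and per-frame labels into a 4-channel window.

    frames: list of WINDOW_LENGTH frames; each frame is a list of rows,
            and each row is a list of (r, g, b) tuples.
    labels: list of WINDOW_LENGTH integer labels, one per frame
            (any value works at test time, since the label is ignored).
    Returns window[channel][time][row][col] with channel in 0..3.
    Hypothetical in-memory layout, NOT the C3D bin format itself.
    """
    if len(frames) != WINDOW_LENGTH or len(labels) != WINDOW_LENGTH:
        raise ValueError("expected exactly %d frames and labels" % WINDOW_LENGTH)
    window = [[], [], [], []]
    for frame, label in zip(frames, labels):
        for c in range(3):  # R, G, B planes
            window[c].append([[pixel[c] for pixel in row] for row in frame])
        # 4th plane: every pixel in the frame carries the same label
        window[3].append([[label for _ in row] for row in frame])
    return window


# Tiny 2x2 frames where every pixel is (10, 20, 30); frame t gets label t % 2.
frames = [[[(10, 20, 30)] * 2 for _ in range(2)] for _ in range(WINDOW_LENGTH)]
labels = [t % 2 for t in range(WINDOW_LENGTH)]
window = build_window(frames, labels)
```

In this toy example, window[3][t] is a 2x2 plane filled with the label of frame t, which mirrors the statement that all pixels in the same frame share one label.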

Reproduce results on THUMOS 2014 dataset:

Pre-process

First, extract all frames into the following folder, which is used by the Python script in the next step: inputdir = '/DATA_ROOT/THUMOS14/test/all_frames_pervideo/'
cd THUMOS14/predata/test and run python gen_test_bin_and_list.py to generate the bin files and the list file for the test set.
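
How a video's frames might be cut into 32-frame windows can be sketched as follows. This illustrates the idea under stated assumptions and is not the logic of gen_test_bin_and_list.py: it assumes non-overlapping windows, with the final window shifted back so that it ends on the last frame (which would explain why the last window may overlap its predecessor, as the post-processing section notes). The function name window_starts is hypothetical, and videos shorter than one window are simply rejected.

```python
def window_starts(num_frames, window_length=32):
    """Return 0-based start indices of the windows covering a video.

    Assumption: windows do not overlap, except that the last window is
    shifted back to end exactly on the final frame.
    """
    if num_frames < window_length:
        raise ValueError("video is shorter than one window")
    starts = list(range(0, num_frames - window_length + 1, window_length))
    last_start = num_frames - window_length
    if starts[-1] != last_start:
        # Leftover frames remain: add one more window ending on the last frame.
        starts.append(last_start)
    return starts


print(window_starts(70))  # [0, 32, 38]: the last window overlaps frames 38..63
print(window_starts(64))  # [0, 32]: the length is a multiple of 32, no overlap
```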

CDC network prediction

cd THUMOS14/test to find the files needed to run prediction with the CDC network (i.e., feature extraction from the last layer); outputs will be stored in feat.
The trained model used for feature extraction is /CDC_root/model/thumos_CDC/convdeconv-TH14_iter_24390.
The last layer of our trained model has 22 nodes corresponding to 22 possible frame-level classes (from first to last: background, action1-20, ambiguous).

Post-process

cd THUMOS14/test/postprocess; there are three post-processing steps in MATLAB.
Run matlab step1_gen_test_metadata.m to generate metadata.mat, which consists of three vectors (each vector covers the frames of all test videos, ordered in time within each video):
- frmid: frame id in each video, starting at 1
- videoid: the video each frame belongs to
- kept_frm_index: when we generate the bin window files, the last window may overlap with the previous window and thus contain duplicate frames; kept_frm_index can be used to select all unique frames from the frame list of the bin window files.
Run matlab step2_read_feat.m to read all Caffe outputs into two MATLAB matrices:
- read_res.mat: contains prob, the prediction results read directly from the CDC outputs (#frames by #classes).
- proball.mat: contains proball, which uses kept_frm_index to remove redundant frames from the above prob and drops the confidence scores for the last class (ambiguous).
Run matlab step3_gen_CDC_det.m to produce the predicted action segment instances for temporal localization.
- SCNN-proposal.mat contains the proposal results of Segment-CNN; following Segment-CNN, we keep segments with confidence score > 0.7.
- res_seg_swin.mat contains the results of refining the temporal boundaries of the segment candidates from Segment-CNN. This mat file contains seg_swin, in which each row is one candidate segment and the columns are:
  - 1: video name in THUMOS14 test set
  - 2: sliding window length measured by number of frames
  - 3: start frame index
  - 4: end frame index
  - 5: start time
  - 6: end time
  - 9: confidence score of being the class indicated in column 11
  - 10: confidence score of being action/non-background
  - 11: the predicted action class (from the 20 action classes [index 1-20] and the background [index 0])
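
The second step above (deriving proball from prob) can be sketched in Python as follows. This is not a port of step2_read_feat.m: it assumes one (videoid, frmid) pair is available per row of prob, uses 0-based indices where MATLAB would use 1-based ones, and the function names kept_indices and build_proball are hypothetical. How step1_gen_test_metadata.m actually derives kept_frm_index is not shown in this document.

```python
NUM_CLASSES = 22  # background, action1-20, ambiguous (as described above)


def kept_indices(video_ids, frame_ids):
    """Return 0-based row indices of the first occurrence of each frame.

    Assumption: row i of the CDC output belongs to frame frame_ids[i]
    of video video_ids[i]; repeated pairs come from overlapping windows.
    """
    seen = set()
    kept = []
    for row, key in enumerate(zip(video_ids, frame_ids)):
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


def build_proball(prob, kept):
    """Keep only unique-frame rows and drop the last (ambiguous) column."""
    return [prob[row][:NUM_CLASSES - 1] for row in kept]


# Toy example: frame 2 of video "v1" appears twice (overlapping windows).
video_ids = ["v1", "v1", "v1", "v1"]
frame_ids = [1, 2, 2, 3]
prob = [[0.0] * NUM_CLASSES for _ in frame_ids]
for row, cls in enumerate([0, 5, 5, 21]):
    prob[row][cls] = 1.0  # one-hot scores, purely for illustration

kept = kept_indices(video_ids, frame_ids)
proball = build_proball(prob, kept)
```

In this toy example, proball has one row per unique frame and 21 columns, since the ambiguous class has been removed.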

Evaluation

Per-frame labeling:
- Note that multi-label-test.mat contains the ground truth label for each frame (#frames by #classes).
- cd THUMOS14/eval/PreFrameLabeling/ and run matlab compute_framelevel_mAP.m; map is the per-frame labeling mAP.
Temporal localization:
- cd THUMOS14/eval/TemporalActionLocalization/ and run matlab eval_thumos14.m
- Results are stored in res_CDC_thumos14.mat. We vary the IoU overlap threshold used in evaluation from 0.3 to 0.7.
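
For readers unfamiliar with the overlap criterion, the following is a minimal sketch of temporal IoU between a predicted segment and a ground truth segment, checked against several thresholds. It is not taken from eval_thumos14.m: the function names, the 0.1 step between thresholds (the text above only gives the range 0.3 to 0.7), and the use of ">=" for the comparison are all assumptions.

```python
def temporal_iou(seg_a, seg_b):
    """Intersection over union of two (start, end) time segments."""
    start_a, end_a = seg_a
    start_b, end_b = seg_b
    intersection = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    union = (end_a - start_a) + (end_b - start_b) - intersection
    return intersection / union if union > 0 else 0.0


# Assumed step of 0.1; the text only states the range 0.3 to 0.7.
THRESHOLDS = [0.3, 0.4, 0.5, 0.6, 0.7]


def matches_at_thresholds(pred, gt, thresholds=THRESHOLDS):
    """For each threshold, report whether pred overlaps gt enough."""
    iou = temporal_iou(pred, gt)
    return [iou >= t for t in thresholds]


# Prediction 10s-20s vs ground truth 12s-22s: overlap 8s, union 12s.
iou = temporal_iou((10.0, 20.0), (12.0, 22.0))
hits = matches_at_thresholds((10.0, 20.0), (12.0, 22.0))
```

Here the IoU is 8/12, so the prediction counts as a match at thresholds 0.3 through 0.6 but not at 0.7, which shows why results get stricter as the threshold rises.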

Train your own model:

Prepare the pre-trained model as initialization: as explained in the paper, we use the weights of the sports1m model (model/sports1m_C3D/conv3d_deepnetA_sport1m_iter_1900000) to initialize our CDC network. We provide the following script for generating the init model of CDC: cd THUMOS14/training/init/; ./run_net_surgey_sports1m_convdeconv.sh. It generates conv3d_deepnetA_sport1m_iter_1900000.convdeconv, which is used as the initialization when fine-tuning your own model.
Training example: example files for setting up Caffe to run CDC fine-tuning are in THUMOS14/training/. To prepare your own training data, please refer to the code above for generating window bin files on the THUMOS test set. Please refer to C3D and Caffe for more general instructions on how to train a 3D CNN model.
