Preparing MSR-VTT Retrieval/Video Question-Answering Dataset
Introduction
@inproceedings{xu2016msr,
title={Msr-vtt: A large video description dataset for bridging video and language},
author={Xu, Jun and Mei, Tao and Yao, Ting and Rui, Yong},
booktitle={CVPR},
pages={5288--5296},
year={2016}
}
Before preparing the dataset, make sure that your working directory is $MMACTION2/tools/data/msrvtt/.
Step 1. Download Annotation Files
You can download the MSR-VTT annotation files directly from the Google Drive link provided by VindLU and place them in the $MMACTION2/tools/data/msrvtt/annotations directory. The expected file names appear under annotations in the directory structure at the end of this page.
Step 2. Prepare Video Data
For basic information about MSR-VTT, refer to the official website of the dataset. Run the following commands to prepare the MSR-VTT video files:
# Download original videos
bash download_msrvtt.sh
# Preprocess videos to lower FPS and dimensions
bash compress_msrvtt.sh
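The contents of compress_msrvtt.sh are not reproduced on this page. As a rough illustration of what lowering the FPS and dimensions of one video can involve, the sketch below builds an ffmpeg command. The targets of 2 FPS and a height of 224 pixels are assumptions inferred from the videos_2fps_224 directory name, and build_compress_cmd is a hypothetical helper, not part of MMAction2.

```python
from pathlib import Path


def build_compress_cmd(src, dst, fps=2, height=224):
    """Build an ffmpeg argument list that re-encodes one video.

    Hypothetical helper: fps and height are assumed from the
    `videos_2fps_224` directory name, not read from compress_msrvtt.sh.
    """
    # fps=N resamples the frame rate; scale=-2:H sets the height to H and
    # picks an even width that preserves the aspect ratio.
    video_filter = f"fps={fps},scale=-2:{height}"
    return [
        "ffmpeg",
        "-y",  # overwrite the output file if it already exists
        "-i", str(Path(src)),
        "-vf", video_filter,
        str(Path(dst)),
    ]


if __name__ == "__main__":
    cmd = build_compress_cmd("videos/video0.mp4", "videos_2fps_224/video0.mp4")
    print(" ".join(cmd))
    # To actually run it (requires ffmpeg on PATH):
    # import subprocess; subprocess.run(cmd, check=True)
```

The function only builds the argument list, so the command can be inspected before it is executed. The provided shell scripts remain the supported way to prepare the data.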
After completing the above preparation steps, the directory structure will be as follows:
mmaction2
├── mmaction
├── tools
├── configs
├── data
│   └── msrvtt
│   │   ├── annotations
│   │   │   ├── msrvtt_qa_train.json
│   │   │   ├── msrvtt_qa_val.json
│   │   │   ├── msrvtt_qa_test.json
│   │   │   ├── msrvtt_qa_answer_list.json
│   │   │   ├── msrvtt_mc_test.json
│   │   │   ├── msrvtt_ret_train9k.json
│   │   │   ├── msrvtt_ret_train7k.json
│   │   │   ├── msrvtt_ret_test1k.json
│   │   │   └── msrvtt_test1k.json
│   │   └── videos_2fps_224
│   │       ├── video0.mp4
│   │       ├── video1.mp4
│   │       ├── ...
│   │       └── video9999.mp4
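
To confirm that the preparation steps produced this layout, a small check along the following lines may help. It is a minimal sketch: check_msrvtt_layout is a hypothetical helper, not part of MMAction2, the file names are copied from the tree above, and the data/msrvtt path assumes the script is run from the MMAction2 repository root.

```python
from pathlib import Path

# Annotation file names copied from the directory structure above.
EXPECTED_ANNOTATIONS = [
    "msrvtt_qa_train.json",
    "msrvtt_qa_val.json",
    "msrvtt_qa_test.json",
    "msrvtt_qa_answer_list.json",
    "msrvtt_mc_test.json",
    "msrvtt_ret_train9k.json",
    "msrvtt_ret_train7k.json",
    "msrvtt_ret_test1k.json",
    "msrvtt_test1k.json",
]


def check_msrvtt_layout(root):
    """Return (missing annotation file names, number of .mp4 files found)."""
    root = Path(root)
    ann_dir = root / "annotations"
    missing = [n for n in EXPECTED_ANNOTATIONS if not (ann_dir / n).is_file()]
    video_dir = root / "videos_2fps_224"
    # Count only the compressed videos; a missing directory counts as zero.
    num_videos = len(list(video_dir.glob("*.mp4"))) if video_dir.is_dir() else 0
    return missing, num_videos


if __name__ == "__main__":
    # Assumed location relative to the MMAction2 repository root.
    missing, num_videos = check_msrvtt_layout("data/msrvtt")
    print("missing annotation files:", missing or "none")
    print("videos found:", num_videos)
```

An empty list of missing files together with a non-zero video count suggests that both steps completed. The check does not validate the contents of the files.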