
Preparing MSR-VTT Retrieval/Video Question-Answering Dataset

Introduction

@inproceedings{xu2016msr,
      title={Msr-vtt: A large video description dataset for bridging video and language},
      author={Xu, Jun and Mei, Tao and Yao, Ting and Rui, Yong},
      booktitle={CVPR},
      pages={5288--5296},
      year={2016}
}

Before preparing the dataset, make sure your current working directory is $MMACTION2/tools/data/msrvtt/.

Step 1. Download Annotation Files

Download the MSR-VTT annotation files from the Google Drive link provided by VindLU and place them in the $MMACTION2/tools/data/msrvtt/annotations directory. The expected file names are listed in the directory structure at the end of this document.
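
Once the files are in place, a quick check can confirm that none are missing. The sketch below is illustrative only: the file names are copied from the directory structure at the end of this document, the `annotations` path assumes the script runs from $MMACTION2/tools/data/msrvtt/, and `missing_annotations` is a hypothetical helper, not part of MMAction2.

```python
from pathlib import Path

# File names copied from the directory structure in this document.
EXPECTED_ANNOTATIONS = [
    "msrvtt_qa_train.json",
    "msrvtt_qa_val.json",
    "msrvtt_qa_test.json",
    "msrvtt_qa_answer_list.json",
    "msrvtt_mc_test.json",
    "msrvtt_ret_train9k.json",
    "msrvtt_ret_train7k.json",
    "msrvtt_ret_test1k.json",
    "msrvtt_test1k.json",
]


def missing_annotations(ann_dir, expected=EXPECTED_ANNOTATIONS):
    """Return the expected annotation files that are not present in ann_dir."""
    ann_dir = Path(ann_dir)
    return [name for name in expected if not (ann_dir / name).is_file()]


if __name__ == "__main__":
    # Assumes the current directory is $MMACTION2/tools/data/msrvtt/.
    missing = missing_annotations("annotations")
    print("missing:", missing if missing else "none")
```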

Step 2. Prepare Video Data

Refer to the official website of this dataset for basic information. Run the following commands to prepare the MSR-VTT video files:

# Download original videos
bash download_msrvtt.sh
# Preprocess videos to lower FPS and dimensions
bash compress_msrvtt.sh
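
The contents of compress_msrvtt.sh are not shown here. As a rough illustration of what lowering FPS and dimensions could look like, the sketch below builds an `ffmpeg` command for a single video. The values 2 fps and 224 pixels are inferred from the output directory name videos_2fps_224; treating 224 as the shorter side, dropping the audio track, and the helper name `build_compress_cmd` are all assumptions, and the actual script may work differently.

```python
def build_compress_cmd(src, dst, fps=2, short_side=224):
    """Build an ffmpeg command that resamples to `fps` and resizes the shorter side.

    Hypothetical helper; it only constructs the argument list and runs nothing.
    """
    # Scale the shorter side to `short_side`. A value of -2 tells ffmpeg to keep
    # the aspect ratio and round the other side to an even number.
    scale = (
        f"scale='if(gt(iw,ih),-2,{short_side})':"
        f"'if(gt(iw,ih),{short_side},-2)'"
    )
    return [
        "ffmpeg", "-y",                  # overwrite the output if it exists
        "-i", str(src),                  # input video
        "-vf", f"fps={fps},{scale}",     # resample frame rate, then resize
        "-an",                           # drop audio (an assumption)
        str(dst),
    ]
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute it, provided `ffmpeg` is installed and on the PATH.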

After completing the above preparation steps, the directory structure will be as follows:

mmaction2
├── mmaction
├── tools
├── configs
├── data
│   └── msrvtt
│       ├── annotations
│       │   ├── msrvtt_qa_train.json
│       │   ├── msrvtt_qa_val.json
│       │   ├── msrvtt_qa_test.json
│       │   ├── msrvtt_qa_answer_list.json
│       │   ├── msrvtt_mc_test.json
│       │   ├── msrvtt_ret_train9k.json
│       │   ├── msrvtt_ret_train7k.json
│       │   ├── msrvtt_ret_test1k.json
│       │   └── msrvtt_test1k.json
│       └── videos_2fps_224
│           ├── video0.mp4
│           ├── video1.mp4
│           ├── ...
│           └── video9999.mp4
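
To confirm that the video directory is complete, one could list which of the expected files are absent. This is a minimal sketch, assuming the files are named video0.mp4 through video9999.mp4 as in the tree above (10,000 files in total); `missing_videos` is a hypothetical helper, not part of MMAction2.

```python
from pathlib import Path


def missing_videos(video_dir, total=10000):
    """Return the names video{i}.mp4 (0 <= i < total) not found in video_dir."""
    present = {p.name for p in Path(video_dir).glob("video*.mp4")}
    return [f"video{i}.mp4" for i in range(total) if f"video{i}.mp4" not in present]


if __name__ == "__main__":
    # Assumed location relative to the MMAction2 root, following the tree above.
    missing = missing_videos("data/msrvtt/videos_2fps_224")
    print(f"{len(missing)} videos missing")
```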