Portrait-Mode Video Recognition

We are releasing the code and dataset for our research on portrait-mode video recognition. The videos are sourced from the Douyin platform. We distribute the video content as links only; users are responsible for downloading the videos themselves.

Taxonomy

Please check our released taxonomy here. There is also an interactive demo of the taxonomy here.

Usage

We assume two directories for this project: {CODE_DIR} for the code repository, and {PROJ_DIR} for the model logs, checkpoints and dataset.

To start, clone our code from GitHub:

git clone https://github.com/bytedance/Portrait-Mode-Video.git {CODE_DIR}

Python environment

We train our model with Python 3.7.3 and PyTorch 1.10.0. First install PyTorch following the official instructions. Then install the remaining packages with

pip3 install -r requirements.txt
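After installation, it can help to confirm that the interpreter and PyTorch match the versions stated above. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and a mismatch is only reported as a warning, since other versions may still work.

```python
import sys

# Versions stated in this README; other versions may work but are not covered here.
EXPECTED_PYTHON = (3, 7)
EXPECTED_TORCH = "1.10.0"


def installed_torch_version():
    """Return the installed PyTorch version string, or None if it is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.__version__


def version_warnings(python_version, torch_version):
    """Return a list of human-readable warnings for version mismatches.

    python_version is a (major, minor) tuple; torch_version is a string or None.
    """
    warnings = []
    if tuple(python_version) != EXPECTED_PYTHON:
        warnings.append(
            "Python %d.%d found, README states 3.7.3" % tuple(python_version)
        )
    if torch_version is None:
        warnings.append("PyTorch is not installed")
    elif not torch_version.startswith(EXPECTED_TORCH):
        # Build suffixes such as "+cu113" are accepted by the prefix check.
        warnings.append(
            "PyTorch %s found, README states %s" % (torch_version, EXPECTED_TORCH)
        )
    return warnings


if __name__ == "__main__":
    for message in version_warnings(sys.version_info[:2], installed_torch_version()):
        print("warning:", message)
```

Running the file prints one line per mismatch and nothing when the environment matches.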

Data downloading

Please refer to DATA.md for data downloading. We assume the videos are stored under {PROJ_DIR}/PMV_dataset. Category IDs for the released videos are under {CODE_DIR}/MViT/data_list/PMV and {CODE_DIR}/Uniformer/data_list/PMV.
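The layout described above can be checked before training starts. This is a minimal sketch under the assumption that the three locations named above are directories; the helper names `expected_paths` and `missing_paths` are hypothetical and do not exist in the repository. It does not inspect the contents of the category ID lists.

```python
from pathlib import Path


def expected_paths(code_dir, proj_dir):
    """Map a short label to each location named in this README."""
    code_dir, proj_dir = Path(code_dir), Path(proj_dir)
    return {
        "videos": proj_dir / "PMV_dataset",
        "MViT category IDs": code_dir / "MViT" / "data_list" / "PMV",
        "Uniformer category IDs": code_dir / "Uniformer" / "data_list" / "PMV",
    }


def missing_paths(code_dir, proj_dir):
    """Return the labels of expected locations that do not exist on disk."""
    return [
        label
        for label, path in expected_paths(code_dir, proj_dir).items()
        if not path.exists()
    ]


if __name__ == "__main__":
    # Replace the placeholders with your own {CODE_DIR} and {PROJ_DIR}.
    for label in missing_paths("path/to/code_dir", "path/to/proj_dir"):
        print("missing:", label)
```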

Training

We provide bash scripts for training models on our PMV-400 data under exps/PMV/. A demo run script is

bash exps/PMV/run_MViT_PMV.sh

For each model, e.g., MViT, we provide the different training recipes in a single bash script, e.g., exps/PMV/run_MViT_PMV.sh. Please choose the recipe that suits your purpose.

Note that you need to set some environment variables in the bash scripts, such as WORKER_0_HOST, WORKER_NUM and WORKER_ID in run_SlowFast_MViTv2_S_16x4_PMV_release.sh, and PROJ_DIR in run_{model}_PMV.sh.
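A quick pre-flight check can catch variables that were left unset. The sketch below assumes the variables are exported into the process environment before the training script is launched, which may not match how the scripts assign them; the function name `unset_variables` is hypothetical. Which variables a given script actually needs depends on the script, so adjust the list accordingly.

```python
import os

# Variable names taken from this README; not every script needs all of them.
REQUIRED_VARS = ("WORKER_0_HOST", "WORKER_NUM", "WORKER_ID", "PROJ_DIR")


def unset_variables(environ=None, required=REQUIRED_VARS):
    """Return the names in `required` that are absent or empty in `environ`.

    `environ` defaults to the current process environment.
    """
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    for name in unset_variables():
        print("not set:", name)
```

For a single-machine run, for example, one might pass only `required=("PROJ_DIR",)`; that choice is an assumption about your setup, not a documented requirement.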

Inference

We provide inference scripts for reproducing the results reported in our paper. We also provide the trained model checkpoints.

License

Our code is licensed under an Apache 2.0 License. Our data is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License. The data is released for non-commercial research purposes only.

By downloading the data, users are considered to have agreed to comply with the terms and conditions of our distribution license.


We would like to extend our thanks to the teams behind the SlowFast code repository, 3Massiv, Kinetics and Uniformer. Our work builds upon their valuable contributions. Please acknowledge these resources in your work.