MetaVD

MetaVD is a Meta Video Dataset for enhancing human action recognition datasets. It provides human-annotated relation labels between action classes across human action recognition datasets. MetaVD was proposed in the following paper:

Yuya Yoshikawa, Yutaro Shigeto, and Akikazu Takeuchi. "MetaVD: A Meta Video Dataset for enhancing human action recognition datasets." Computer Vision and Image Understanding 212 (2021): 103276.

MetaVD integrates the following datasets:

  • UCF101
  • HMDB51
  • ActivityNet
  • STAIR Actions
  • Charades
  • Kinetics-700

This repository does NOT provide videos in the datasets. For information on how to download the videos, please refer to the website of each dataset.

Data Format

Relation label data

MetaVD relation labels are provided in metavd_v*.csv. Each row represents a single relation from one action class, called the source action class, to another action class, called the target action class. Each row contains the following fields:

  • from_dataset (str): Dataset name of source action class. Any of ucf101, hmdb51, activitynet, stair_actions, charades, kinetics700.
  • from_action_idx (int): Index of source action class.
  • from_action_name (str): Name of source action class.
  • to_dataset (str): Dataset name of target action class. Any of ucf101, hmdb51, activitynet, stair_actions, charades, kinetics700.
  • to_action_idx (int): Index of target action class.
  • to_action_name (str): Name of target action class.
  • relation (str): Relation type. Any of equal, similar and is-a.
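
The fields above can be read into typed records as in the following minimal sketch. It assumes the CSV has a header row with exactly these column names, which is not stated here and should be checked against the actual file. The `Relation` class, the `read_relations` helper, and the sample rows are hypothetical and for illustration only; the rows are not taken from MetaVD.

```python
import csv
import io
from dataclasses import dataclass

# Allowed values as listed in the field descriptions above.
DATASETS = {"ucf101", "hmdb51", "activitynet", "stair_actions", "charades", "kinetics700"}
RELATIONS = {"equal", "similar", "is-a"}


@dataclass(frozen=True)
class Relation:
    """One row of the relation label file (hypothetical helper type)."""
    from_dataset: str
    from_action_idx: int
    from_action_name: str
    to_dataset: str
    to_action_idx: int
    to_action_name: str
    relation: str


def read_relations(fileobj):
    """Parse relation rows, assuming a header row with the documented column names."""
    rows = []
    for rec in csv.DictReader(fileobj):
        rel = Relation(
            from_dataset=rec["from_dataset"],
            from_action_idx=int(rec["from_action_idx"]),
            from_action_name=rec["from_action_name"],
            to_dataset=rec["to_dataset"],
            to_action_idx=int(rec["to_action_idx"]),
            to_action_name=rec["to_action_name"],
            relation=rec["relation"],
        )
        if rel.from_dataset not in DATASETS or rel.to_dataset not in DATASETS:
            raise ValueError(f"unknown dataset in row: {rel}")
        if rel.relation not in RELATIONS:
            raise ValueError(f"unknown relation type in row: {rel}")
        rows.append(rel)
    return rows


# Made-up rows for illustration; indices and names are NOT real MetaVD entries.
SAMPLE = """\
from_dataset,from_action_idx,from_action_name,to_dataset,to_action_idx,to_action_name,relation
ucf101,0,example_action_a,hmdb51,3,example_action_b,equal
charades,5,example_action_c,kinetics700,7,example_action_d,is-a
"""

relations = read_relations(io.StringIO(SAMPLE))
```

With the real file, the same function could be called on `open("metavd_v1.csv", newline="")` in place of the in-memory sample.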

Relation types equal and similar are undirected; only is-a is directed. Note that each pair of action classes related by equal or similar appears in the file only once, so such a relation must also be read in the reverse direction.
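
Because an undirected pair is stored only once, a lookup keyed by the source class alone would miss relations where the class of interest happens to be listed as the target. The sketch below shows one way to handle this by mirroring equal and similar rows while keeping is-a one-way. The `build_lookup` function and the tuple layout are assumptions made for illustration, and the rows are made up rather than taken from MetaVD.

```python
from collections import defaultdict

# Relation types that apply in both directions.
UNDIRECTED = {"equal", "similar"}


def build_lookup(rows):
    """Map (dataset, action_name) to the set of related (dataset, action_name, relation).

    `rows` is an iterable of
    (from_dataset, from_action_name, to_dataset, to_action_name, relation) tuples.
    """
    lookup = defaultdict(set)
    for from_ds, from_name, to_ds, to_name, relation in rows:
        lookup[(from_ds, from_name)].add((to_ds, to_name, relation))
        if relation in UNDIRECTED:
            # Stored once in the file, so add the mirrored entry explicitly.
            lookup[(to_ds, to_name)].add((from_ds, from_name, relation))
    return lookup


# Made-up rows for illustration; NOT real MetaVD entries.
rows = [
    ("ucf101", "example_action_a", "hmdb51", "example_action_b", "equal"),
    ("charades", "example_action_c", "kinetics700", "example_action_d", "is-a"),
]
lookup = build_lookup(rows)
```

Here the equal pair can be found from either side, while the is-a row is reachable only from its source class.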

Action class list

We provide the list of action classes of each dataset in [dataset_name]_classes.csv. The 'idx' column corresponds to 'from_action_idx' and 'to_action_idx', and the 'name' column corresponds to 'from_action_name' and 'to_action_name' in relation label data.
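
The correspondence between the class lists and the relation labels can be checked with a short script. The following is a sketch only: it assumes each class list has a header row with 'idx' and 'name' columns as described above, and the helpers `load_classes` and `find_mismatches` are hypothetical names. The sample data is made up and does not come from MetaVD.

```python
import csv
import io


def load_classes(fileobj):
    """Read a class list with 'idx' and 'name' columns into an {idx: name} dict."""
    return {int(rec["idx"]): rec["name"] for rec in csv.DictReader(fileobj)}


def find_mismatches(relation_rows, classes_by_dataset):
    """Return (dataset, idx, name) triples whose idx/name pair is not in the class list."""
    mismatches = []
    for row in relation_rows:
        for side in ("from", "to"):
            dataset = row[f"{side}_dataset"]
            idx = int(row[f"{side}_action_idx"])
            name = row[f"{side}_action_name"]
            if classes_by_dataset.get(dataset, {}).get(idx) != name:
                mismatches.append((dataset, idx, name))
    return mismatches


# Made-up class lists and relation rows for illustration; NOT real MetaVD entries.
classes_by_dataset = {
    "ucf101": load_classes(io.StringIO("idx,name\n0,example_action_a\n1,example_action_b\n")),
    "hmdb51": load_classes(io.StringIO("idx,name\n0,example_action_c\n")),
}
relation_rows = [
    {"from_dataset": "ucf101", "from_action_idx": "0", "from_action_name": "example_action_a",
     "to_dataset": "hmdb51", "to_action_idx": "0", "to_action_name": "example_action_c",
     "relation": "equal"},
    {"from_dataset": "ucf101", "from_action_idx": "1", "from_action_name": "wrong_name",
     "to_dataset": "hmdb51", "to_action_idx": "0", "to_action_name": "example_action_c",
     "relation": "similar"},
]
mismatches = find_mismatches(relation_rows, classes_by_dataset)
```

An empty result would indicate that every index and name in the relation rows agrees with the class lists.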

Citation

@article{yoshikawa2021metavd,
  title = {MetaVD: A Meta Video Dataset for enhancing human action recognition datasets},
  journal = {Computer Vision and Image Understanding},
  volume = {212},
  pages = {103276},
  year = {2021},
  issn = {1077-3142},
  doi = {10.1016/j.cviu.2021.103276},
  url = {https://www.sciencedirect.com/science/article/pii/S107731422100120X},
  author = {Yuya Yoshikawa and Yutaro Shigeto and Akikazu Takeuchi},
  keywords = {Human action recognition, Video datasets}
}

License

MetaVD relation label data (metavd_v1.csv) and action class lists ([dataset_name]_classes.csv) are licensed under the Creative Commons Attribution 4.0 International license (CC BY 4.0).

Acknowledgement

This dataset is based on results obtained from a project, JPNP20006, commissioned by the New Energy and Industrial Technology Development Organization (NEDO).

Release Notes

  • 1/6/2022: MetaVD is available for download