Abstract

Research on depth-based human activity analysis has achieved outstanding performance and demonstrated the effectiveness of 3D representations for action recognition. The existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of large-scale training samples, a realistic number of distinct class categories, diversity in camera views, varied environmental conditions, and variety of human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, collected from 106 distinct subjects and containing more than 114 thousand video samples and 8 million frames. The dataset covers 120 different action classes, including daily, mutual, and health-related activities. We evaluate a series of existing 3D activity analysis methods on this dataset and show the advantage of applying deep learning methods to 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset and propose a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework for this task, which yields promising results for the recognition of novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding.
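
The one-shot recognition problem mentioned in the abstract can be illustrated with a minimal nearest-neighbour baseline: each novel class is represented by the embedding of its single exemplar sequence, and a query sequence is assigned to the class whose exemplar embedding is closest. The sketch below is only an illustration of that protocol, not the authors' APSR framework; the averaging `embed` function, the synthetic data, and the 25-joint layout (matching the Kinect v2 skeleton used in NTU RGB+D) are assumptions standing in for a trained skeleton encoder and real dataset samples.

```python
import numpy as np

def embed(sequence):
    # Placeholder embedding: flatten the joints of each frame and average
    # over time. In practice this would be a trained skeleton encoder.
    return sequence.reshape(sequence.shape[0], -1).mean(axis=0)

def one_shot_classify(query, exemplars):
    # Assign the query sequence to the novel class whose single exemplar
    # embedding has the highest cosine similarity.
    q = embed(query)
    best_class, best_sim = None, -np.inf
    for cls, ex in exemplars.items():
        e = embed(ex)
        sim = float(np.dot(q, e) / (np.linalg.norm(q) * np.linalg.norm(e) + 1e-8))
        if sim > best_sim:
            best_class, best_sim = cls, sim
    return best_class

# Synthetic usage: 20 hypothetical novel classes, one exemplar each;
# sequences of 64 frames x 25 joints x 3 coordinates.
rng = np.random.default_rng(0)
exemplars = {c: rng.normal(size=(64, 25, 3)) for c in range(20)}
query = rng.normal(size=(64, 25, 3))
print(one_shot_classify(query, exemplars))
```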

Keywords

Benchmark (surveying); Artificial intelligence; Computer science; Scale (ratio); RGB color model; Computer vision; Machine learning; Pattern recognition (psychology); Cartography; Geography

MeSH Terms

Algorithms; Benchmarking; Deep Learning; Human Activities; Humans; Image Processing, Computer-Assisted; Pattern Recognition, Automated; Semantics; Video Recording

Related Publications

Deep Colorization

This paper investigates the colorization problem, which converts a grayscale image to a colorful version. This is a very difficult problem and normally requires manual adjus...

2015 · 540 citations

Publication Info

Year: 2019
Type: article
Volume: 42
Issue: 10
Pages: 2684-2701
Citations: 1561
Access: Closed

Citation Metrics

OpenAlex: 1561
Influential: 214
CrossRef: 1314

Cite This

Jun Liu, Amir Shahroudy, Mauricio Pérez et al. (2019). NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(10), 2684-2701. https://doi.org/10.1109/tpami.2019.2916873

Identifiers

DOI: 10.1109/tpami.2019.2916873
PMID: 31095476
arXiv: 1905.04757

Data Quality

Data completeness: 93%