Abstract

We present an approach to efficiently detect the 2D pose of multiple people in an image. The approach uses a nonparametric representation, which we refer to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image. The architecture encodes global context, allowing a greedy bottom-up parsing step that maintains high accuracy while achieving realtime performance, irrespective of the number of people in the image. The architecture is designed to jointly learn part locations and their association via two branches of the same sequential prediction process. Our method placed first in the inaugural COCO 2016 keypoints challenge, and significantly exceeds the previous state-of-the-art result on the MPII Multi-Person benchmark, both in performance and efficiency.
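The abstract does not state how candidate parts are scored or matched, so the following is only a minimal sketch of the idea of greedy association guided by a vector field, not the authors' implementation. It assumes a candidate limb is scored by how well the field aligns with the segment joining two detected parts, averaged over sampled points, and that pairs are then accepted greedily in descending score order. The names `paf_score`, `greedy_match`, `toy_paf`, and the `min_score` threshold are hypothetical and chosen for illustration.

```python
import math

def paf_score(paf, a, b, samples=10):
    """Average alignment between a 2D vector field and the segment a -> b.

    `paf(x, y)` is a hypothetical callable returning the field vector at (x, y).
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    ux, uy = dx / length, dy / length  # unit vector along the candidate limb
    total = 0.0
    for i in range(samples):
        t = i / (samples - 1)  # evenly spaced points from a to b
        vx, vy = paf(a[0] + t * dx, a[1] + t * dy)
        total += vx * ux + vy * uy  # dot product with the limb direction
    return total / samples

def greedy_match(paf, starts, ends, min_score=0.5):
    """Greedily pair start parts with end parts, best score first.

    Each part is used at most once; `min_score` is an assumed cutoff.
    """
    candidates = [
        (paf_score(paf, a, b), i, j)
        for i, a in enumerate(starts)
        for j, b in enumerate(ends)
    ]
    candidates.sort(reverse=True)
    used_starts, used_ends, pairs = set(), set(), []
    for score, i, j in candidates:
        if score < min_score:
            break  # remaining candidates score even lower
        if i in used_starts or j in used_ends:
            continue
        used_starts.add(i)
        used_ends.add(j)
        pairs.append((i, j))
    return pairs

def toy_paf(x, y):
    """Toy field: points straight down the two vertical limbs at x=0 and x=20."""
    on_limb = min(abs(x), abs(x - 20)) <= 1 and 0 <= y <= 10
    return (0.0, 1.0) if on_limb else (0.0, 0.0)

# Two people: shoulders at y=0, elbows at y=10 (elbows listed in shuffled order).
shoulders = [(0, 0), (20, 0)]
elbows = [(20, 10), (0, 10)]
matches = greedy_match(toy_paf, shoulders, elbows)
```

In this toy setup each shoulder is paired with the elbow directly below it, because only those segments lie along the field. The greedy pass does one sort over the candidate pairs and never revisits an accepted pair.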

Keywords

Benchmark (surveying), Computer science, Context (archaeology), Representation (politics), Image (mathematics), Artificial intelligence, Parsing, Pose, Process (computing), Architecture, Greedy algorithm, Machine learning, Aerial image, Computer vision, Pattern recognition (psychology), Algorithm, Programming language

Publication Info

Year: 2017
Type: preprint
Pages: 1302-1310
Citations: 7012
Access: Closed

Citation Metrics

7012 (OpenAlex)

Cite This

Zhe Cao, Tomas Simon, Shih-En Wei et al. (2017). Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1302-1310. https://doi.org/10.1109/cvpr.2017.143

Identifiers

DOI
10.1109/cvpr.2017.143