Abstract

Non-rigid object detection and articulated pose estimation are two related and challenging problems in computer vision. Numerous models have been proposed over the years and often address different special cases, such as pedestrian detection or upper body pose estimation in TV footage. This paper shows that such specialization may not be necessary, and proposes a generic approach based on the pictorial structures framework. We show that the right selection of components for both appearance and spatial modeling is crucial for general applicability and overall performance of the model. The appearance of body parts is modeled using densely sampled shape context descriptors and discriminatively trained AdaBoost classifiers. Furthermore, we interpret the normalized margin of each classifier as likelihood in a generative model. Non-Gaussian relationships between parts are represented as Gaussians in the coordinate system of the joint between parts. The marginal posterior of each part is inferred using belief propagation. We demonstrate that such a model is equally suitable for both detection and pose estimation tasks, outperforming the state of the art on three recently proposed datasets.
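The inference step named in the abstract, computing the marginal posterior of each part with belief propagation, can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a small discrete set of candidate states per part, a tree-structured model with the root at index 0, and made-up potentials. The function name `tree_marginals` and all part names and numbers are hypothetical. The unary terms stand in for the appearance likelihoods and the pairwise terms for the spatial relationships between connected parts.

```python
import numpy as np


def tree_marginals(parent, unary, pairwise):
    """Sum-product belief propagation on a tree of parts (toy sketch).

    parent[i]   : index of the parent of part i, -1 for the root.
                  Parents must precede children (parent[i] < i).
    unary[i]    : shape (K_i,), appearance likelihood of each state of part i.
    pairwise[i] : shape (K_parent, K_i), compatibility between a parent state
                  and a state of part i. Unused (may be None) for the root.
    Returns one normalized marginal posterior per part.
    """
    n = len(parent)
    unary = [np.asarray(u, dtype=float) for u in unary]
    children = [[] for _ in range(n)]
    for i in range(1, n):
        children[parent[i]].append(i)

    # Upward pass (leaves to root): up[i] is the message from part i to its
    # parent, obtained by summing out the states of part i.
    up = [None] * n
    below = [None] * n  # unary term times all messages from the children
    for i in reversed(range(n)):
        b = unary[i].copy()
        for c in children[i]:
            b = b * up[c]
        below[i] = b
        if parent[i] >= 0:
            up[i] = np.asarray(pairwise[i], dtype=float) @ b

    # Downward pass (root to leaves): combine with the message from above.
    down_in = [np.ones(len(u)) for u in unary]
    marginals = [None] * n
    for i in range(n):
        belief = below[i] * down_in[i]
        marginals[i] = belief / belief.sum()
        for c in children[i]:
            # Everything part i knows, except what child c itself reported.
            others = unary[i] * down_in[i]
            for s in children[i]:
                if s != c:
                    others = others * up[s]
            down_in[c] = np.asarray(pairwise[c], dtype=float).T @ others
    return marginals


if __name__ == "__main__":
    # Hypothetical 3-part model: torso (root), head, upper arm; 2 states each.
    parent = [-1, 0, 0]
    unary = [[0.6, 0.4], [0.2, 0.8], [0.5, 0.5]]
    pairwise = [None, [[0.9, 0.1], [0.3, 0.7]], [[0.8, 0.2], [0.2, 0.8]]]
    for name, m in zip(["torso", "head", "upper arm"],
                       tree_marginals(parent, unary, pairwise)):
        print(name, m)
```

On a tree, the two passes give exact marginals, which is why the sketch can be checked against brute-force enumeration of the joint distribution for small models. Messages are left unnormalized here for brevity; a larger model would normalize them or work in the log domain to avoid underflow.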

Keywords

Computer science, Artificial intelligence, Pose, Classifier (UML), Margin (machine learning), AdaBoost, Computer vision, Pattern recognition (psychology), Object detection, Gaussian, Machine learning, Mixture model, Context (archaeology)

Publication Info

Year: 2009
Type: article
Citations: 805
Access: Closed

Cite This

Mykhaylo Andriluka, Stefan Roth, Bernt Schiele (2009). Pictorial structures revisited: People detection and articulated pose estimation. 2009 IEEE Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/cvpr.2009.5206754

Identifiers

DOI: 10.1109/cvpr.2009.5206754