Abstract

Deep learning methods have achieved great success in pedestrian detection, owing to their ability to learn discriminative features from raw pixels. However, they treat pedestrian detection as a single binary classification task, which may confuse positive samples with hard negative samples (Fig. 1(a)). To address this ambiguity, this work jointly optimizes pedestrian detection with semantic tasks, including pedestrian attributes (e.g. 'carrying backpack') and scene attributes (e.g. 'vehicle', 'tree', and 'horizontal'). Rather than expensively annotating scene attributes, we transfer attribute information from existing scene segmentation datasets to the pedestrian dataset by proposing a novel deep model that learns high-level features from multiple tasks and multiple data sources. Since distinct tasks have distinct convergence rates and data from different datasets have different distributions, a multi-task deep model is carefully designed to coordinate tasks and reduce discrepancies among datasets. Extensive evaluations show that the proposed approach outperforms the state of the art on the challenging Caltech [9] and ETH [10] datasets, where it reduces the miss rates of previous deep models by 17 and 5.5 percent, respectively.
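
The idea of optimizing detection jointly with auxiliary semantic tasks can be illustrated with a weighted sum of per-task losses. The sketch below is only an illustration under stated assumptions, not the authors' model: the function names (`binary_cross_entropy`, `joint_loss`), the use of cross-entropy for every task, and the fixed per-task weights are all hypothetical choices made for this example.

```python
import math

def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy between a predicted probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def joint_loss(det_prob, det_label, attr_probs, attr_labels, attr_weights):
    """Main detection loss plus a weighted sum of auxiliary attribute losses.

    attr_weights are assumed, hand-set importance coefficients; a weight of
    zero switches the corresponding auxiliary task off.
    """
    main = binary_cross_entropy(det_prob, det_label)
    aux = sum(
        w * binary_cross_entropy(p, y)
        for w, p, y in zip(attr_weights, attr_probs, attr_labels)
    )
    return main + aux

# Hypothetical sample: a pedestrian window with two attribute predictions,
# e.g. 'carrying backpack' (present) and 'vehicle' (absent).
loss = joint_loss(0.8, 1, [0.6, 0.1], [1, 0], [0.5, 0.5])
```

In a formulation like this, the auxiliary terms penalize features that cannot explain the attributes, which is one way hard negatives such as vehicles or trees can be pushed apart from true pedestrians.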

Keywords

Computer science; Pedestrian detection; Artificial intelligence; Pedestrian; Deep learning; Discriminative model; Task (project management); Segmentation; Convolutional neural network; Ambiguity; Tree (set theory); Machine learning; Object detection; Transfer of learning; Pattern recognition (psychology); Computer vision

Publication Info

Year: 2015
Type: preprint
Pages: 5079-5087
Citations: 418
Access: Closed

Cite This

Yonglong Tian, Ping Luo, Xiaogang Wang et al. (2015). Pedestrian detection aided by deep learning semantic tasks, pp. 5079-5087. https://doi.org/10.1109/cvpr.2015.7299143

Identifiers

DOI
10.1109/cvpr.2015.7299143