GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild

Abstract

We introduce GOT-10k, a large tracking database that offers an unprecedentedly wide coverage of common moving objects in the wild. Specifically, GOT-10k is built upon the backbone of the WordNet structure [1], and it populates the majority of over 560 classes of moving objects and 87 motion patterns, orders of magnitude wider than the most recent similar-scale counterparts [19], [20], [23], [26]. By releasing this large, high-diversity database, we aim to provide a unified training and evaluation platform for the development of class-agnostic, generic-purpose short-term trackers. The features of GOT-10k and the contributions of this article are summarized as follows. (1) GOT-10k offers over 10,000 video segments with more than 1.5 million manually labeled bounding boxes, enabling unified training and stable evaluation of deep trackers. (2) GOT-10k is the first video trajectory dataset that uses the semantic hierarchy of WordNet to guide class population, which ensures a comprehensive and relatively unbiased coverage of diverse moving objects. (3) For the first time, GOT-10k introduces the one-shot protocol for tracker evaluation, in which the training and test classes have zero overlap. The protocol avoids evaluation results that are biased towards familiar objects, and it promotes generalization in tracker development. (4) GOT-10k offers additional labels such as motion classes and object visible ratios, facilitating the development of motion-aware and occlusion-aware trackers. (5) We conduct extensive tracking experiments with 39 typical tracking algorithms and their variants on GOT-10k and analyze their results in this article. (6) Finally, we develop a comprehensive platform for the tracking community that offers full-featured evaluation toolkits, an online evaluation server, and a responsive leaderboard. The annotations of GOT-10k's test data are kept private to prevent parameter tuning on them.
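The one-shot protocol in contribution (3) can be illustrated with a small sketch. The code below is not the authors' split procedure or the GOT-10k toolkit; it is a toy illustration, assuming a hypothetical mapping from video names to object class labels, of how a split with zero class overlap between training and test sets might be built and checked. The function names `one_shot_split` and `shared_classes`, the `test_fraction` parameter, and the random class assignment are all assumptions made for this example.

```python
import random
from collections import defaultdict


def one_shot_split(video_classes, test_fraction=0.2, seed=0):
    """Split videos so that no object class occurs in both subsets.

    video_classes: dict mapping a video name to its object class label
    (a hypothetical input format, not the GOT-10k annotation layout).
    Whole classes, not individual videos, are assigned to a subset.
    """
    by_class = defaultdict(list)
    for video, cls in video_classes.items():
        by_class[cls].append(video)

    # Shuffle the sorted class names for a reproducible assignment.
    classes = sorted(by_class)
    random.Random(seed).shuffle(classes)

    # Hold out at least one class for testing.
    n_test = max(1, round(len(classes) * test_fraction))
    test_classes, train_classes = classes[:n_test], classes[n_test:]

    train = [v for c in train_classes for v in by_class[c]]
    test = [v for c in test_classes for v in by_class[c]]
    return train, test


def shared_classes(train, test, video_classes):
    """Return the classes present in both subsets (empty set if one-shot)."""
    train_cls = {video_classes[v] for v in train}
    test_cls = {video_classes[v] for v in test}
    return train_cls & test_cls


if __name__ == "__main__":
    toy = {"clip_a": "dog", "clip_b": "dog", "clip_c": "car", "clip_d": "boat"}
    train_videos, test_videos = one_shot_split(toy, test_fraction=0.34)
    print("train:", train_videos)
    print("test:", test_videos)
    print("shared classes:", shared_classes(train_videos, test_videos, toy))
```

Assigning whole classes rather than individual videos is what guarantees the zero overlap; a per-video random split would typically leak familiar classes into the test set.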

Keywords

Computer science, Artificial intelligence, WordNet, Benchmark (surveying), Eye tracking, Video tracking, BitTorrent tracker, Minimum bounding box, Population, Computer vision, Information retrieval, Machine learning, Object (grammar), Image (mathematics)

Publication Info

Year: 2019
Type: article
Volume: 43
Issue: 5
Pages: 1562-1577
Citations: 1605
Access: Closed

Citation Metrics

OpenAlex: 1605
Influential: 377
CrossRef: 1247

Cite This

APA Style

Lianghua Huang, Xin Zhao, Kaiqi Huang (2019). GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(5), 1562-1577. https://doi.org/10.1109/tpami.2019.2957464

Identifiers

DOI: 10.1109/tpami.2019.2957464
PMID: 31804928
arXiv: 1810.11981
