Abstract

In this paper, we present LaSOT, a high-quality benchmark for Large-scale Single Object Tracking. LaSOT consists of 1,400 sequences with more than 3.5M frames in total. Each frame in these sequences is carefully and manually annotated with a bounding box, making LaSOT, to the best of our knowledge, the largest densely annotated tracking benchmark. The average video length of LaSOT is more than 2,500 frames, and each sequence comprises various challenges arising in the wild, where target objects may disappear from the view and re-appear. By releasing LaSOT, we expect to provide the community with a large-scale, dedicated, high-quality benchmark for both the training of deep trackers and the veritable evaluation of tracking algorithms. Moreover, considering the close connection between visual appearance and natural language, we enrich LaSOT with additional language specifications, aiming to encourage the exploration of natural linguistic features for tracking. A thorough experimental evaluation of 35 tracking algorithms on LaSOT is presented with detailed analysis, and the results demonstrate that there is still significant room for improvement.

Keywords

Benchmark (surveying), Computer science, Minimum bounding box, Tracking (education), Video tracking, BitTorrent tracker, Artificial intelligence, Frame (networking), Scale (ratio), Object (grammar), Computer vision, Feature (linguistics), Quality (philosophy), Eye tracking, Image (mathematics)

Publication Info

Year
2019
Type
article
Pages
5369-5378
Citations
1504
Access
Closed

Citation Metrics

OpenAlex
1504
Influential
407
CrossRef
1257

Cite This

Heng Fan, Liting Lin, Fan Yang et al. (2019). LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5369-5378. https://doi.org/10.1109/cvpr.2019.00552

Identifiers

DOI
10.1109/cvpr.2019.00552
arXiv
1809.07845
