A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS

Abstract

YOLO has become a central real-time object detection system for robotics, driverless cars, and video monitoring applications. We present a comprehensive analysis of YOLO’s evolution, examining the innovations and contributions in each iteration from the original YOLO up to YOLOv8, YOLO-NAS, and YOLO with transformers. We start by describing the standard metrics and postprocessing; then, we discuss the major changes in network architecture and training tricks for each model. Finally, we summarize the essential lessons from YOLO’s development and provide a perspective on its future, highlighting potential research directions to enhance real-time object detection systems.

Keywords

Computer science, Architecture, Artificial intelligence, Object detection, Robotics, Systems engineering, Human–computer interaction, Engineering, Robot, Geography

Publication Info

Year: 2023
Type: review
Volume: 5
Issue: 4
Pages: 1680-1716
Citations: 1932
Access: Closed

Citation Metrics

OpenAlex: 1932
CrossRef: 1950

Cite This

APA Style

Juan Terven, Diana-Margarita Córdova-Esparza, Julio-Alejandro Romero-González (2023). A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Machine Learning and Knowledge Extraction, 5(4), 1680-1716. https://doi.org/10.3390/make5040083

Identifiers

DOI: 10.3390/make5040083
arXiv: 2304.00501
