Abstract
Automated detection of ocular lesions from fundus images is of great significance for disease screening and early diagnosis. However, existing methods are often constrained by the trade-off between model accuracy and computational efficiency, particularly on resource-limited hardware, where missed detections and false positives are common.

In this paper, we propose FFD-YOLO11 (Fundus Disease Detection-YOLO11), a novel and efficient detection framework based on the YOLO11 architecture. The proposed model integrates the RepViT structure and Efficient Multi-scale Attention (EMA) into the backbone to enhance the representation of pathological features. Moreover, a Large Separable Kernel Attention (LSKA) mechanism is embedded in the Spatial Pyramid Pooling Fast (SPPF) module to expand the receptive field and strengthen contextual feature modeling.

Furthermore, we design a Lightweight Shared Convolutional Detection Head (LSCD) and a Feature Diffusion Pyramid Network (FDPN), which effectively fuse multi-scale features while significantly reducing the number of model parameters. Experimental results show that FFD-YOLO11 achieves 97.4% mAP with only 2.6M parameters, outperforming the baseline by 5.1% and achieving the best performance among comparable models. Visualization analysis further demonstrates that the model focuses on and localizes clinically critical lesion regions precisely.

Overall, FFD-YOLO11 provides a high-accuracy, lightweight, and robust detection solution suitable for clinical environments and embedded medical imaging systems, offering a new technological approach for intelligent ophthalmic diagnosis assistance.
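The abstract does not spell out the LSKA design, so the following is only a rough illustration of why separable large kernels help keep a model lightweight. It assumes the general idea of replacing a depthwise k×k kernel with a cascaded 1×k and k×1 pair; the function names and the channel count are hypothetical and are not taken from the paper.

```python
def depthwise_kernel_params(channels: int, k: int) -> int:
    """Weights in a depthwise k x k convolution (one k x k filter per channel, bias ignored)."""
    return channels * k * k


def separable_kernel_params(channels: int, k: int) -> int:
    """Weights in a depthwise 1 x k convolution followed by a depthwise k x 1 convolution."""
    return channels * (k + k)


if __name__ == "__main__":
    channels = 256  # hypothetical feature-map width, chosen only for illustration
    for k in (7, 11, 23):
        full = depthwise_kernel_params(channels, k)
        separable = separable_kernel_params(channels, k)
        print(f"k={k}: full={full}, separable={separable}, ratio={full / separable:.1f}x")
```

Under these assumptions the weight count grows linearly rather than quadratically with the kernel size, which is one reason a large receptive field can be added without a large parameter cost.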
Publication Info
- Year
- 2025
- Type
- article
- Citations
- 0
- Access
- Closed
Identifiers
- DOI
- 10.21203/rs.3.rs-8254879/v1