Abstract
Skin diseases are common conditions that pose a significant threat to human health, and automated classification plays an important role in assisting clinical diagnosis. However, existing image classification approaches based on convolutional neural networks (CNNs) and Transformers have inherent limitations: CNNs are constrained in capturing global features, whereas Transformers are less effective at modeling local details. For dermoscopic images, local and global features are both crucial for classification. To address this issue, we propose an improved Swin Transformer-based model, termed MaLafFormer, which incorporates a Modulated Fusion of Multi-scale Attention (MFMA) module and a Lesion-Area Focus (LAF) module to enhance global modeling, emphasize critical local regions, and improve lesion boundary perception. Experimental results on the ISIC2018 dataset show that MaLafFormer achieves 84.35% ± 0.56% accuracy (mean of three runs), outperforming the baseline (77.98% ± 0.34%) by 6.37 percentage points, and surpasses the other compared methods across multiple metrics, validating its effectiveness for skin lesion classification.
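The reported improvement is an absolute difference between two accuracies, so it is best read in percentage points rather than as a relative change. The short sketch below reproduces that arithmetic from the two mean accuracies stated in the abstract; the function name `accuracy_gain` and the constant names are illustrative and do not come from the paper.

```python
# Mean accuracies (in percent) as reported in the abstract for ISIC2018.
MALAFFORMER_ACC = 84.35
BASELINE_ACC = 77.98


def accuracy_gain(model_acc: float, baseline_acc: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent).

    Illustrative helper, not part of the paper's code.
    """
    absolute = model_acc - baseline_acc
    relative = 100.0 * absolute / baseline_acc
    return absolute, relative


absolute_pp, relative_pct = accuracy_gain(MALAFFORMER_ACC, BASELINE_ACC)
print(f"absolute gain: {absolute_pp:.2f} percentage points")
print(f"relative gain: {relative_pct:.2f}% over the baseline")
```

With the abstract's figures, the absolute gain is 6.37 percentage points, which corresponds to a relative gain of roughly 8.17% over the baseline accuracy.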
Related Publications
FFA-Net: Feature Fusion Attention Network for Single Image Dehazing
In this paper, we propose an end-to-end feature fusion attention network (FFA-Net) to directly restore the haze-free image. The FFA-Net architecture consists of three key compo...
Swin-Unet: Unet-Like Pure Transformer for Medical Image Segmentation
In the past few years, convolutional neural networks (CNNs) have achieved milestones in medical image analysis. Especially, the deep neural networks based on U-shaped architectu...
Dual Attention Network for Scene Segmentation
In this paper, we address the scene segmentation task by capturing rich contextual dependencies based on the self-attention mechanism. Unlike previous works that capture context...
UNETR: Transformers for 3D Medical Image Segmentation
Fully Convolutional Neural Networks (FCNNs) with contracting and expanding paths have shown prominence for the majority of medical image segmentation applications since the past...
Efficient Multi-Scale Attention Module with Cross-Spatial Learning
Remarkable effectiveness of the channel or spatial attention mechanisms for producing more discernible feature representation are illustrated in various computer vision tasks. H...
Publication Info
- Year: 2025
- Type: article
- Volume: 15
- Issue: 24
- Pages: 12952-12952
- Citations: 0
- Access: Closed
Identifiers
- DOI: 10.3390/app152412952