Abstract
Ensuring navigational safety is one of the most critical challenges in autonomous maritime navigation research: collision risk must be assessed accurately in real time, and navigational decisions must follow promptly from that assessment. Traditional rule-based systems that rely on radar and the Automatic Identification System (AIS) are fundamentally limited in analyzing discrete objects such as vessels and buoys together with continuous environmental boundaries such as coastlines and bridges. Recent research has turned to artificial intelligence to address these limitations, but most studies have focused primarily on object detection. This study proposes a structured tag-based multimodal navigation safety framework that performs inference on maritime scenes by integrating YOLO-based object detection with the LLaVA vision–language model. Its outputs comprise a risk level assessment, a navigation action recommendation, a reasoning explanation, and object information. The proposed method achieved 86.1% accuracy in risk level assessment and 76.3% accuracy in navigation action recommendation. A hierarchical early stopping mechanism based on delimiter tags reduced output token generation by 95.36% for essential inference results and by 43.98% for detailed inference results, compared with natural language outputs.
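The idea of tag-delimited output with hierarchical early stopping can be illustrated with a minimal sketch. It is not the authors' implementation: the tag names (`risk`, `action`, `reason`, `objects`), the split into an "essential" level (risk and action) and a "detailed" level (adding the reasoning), and the helper names `generate_until` and `parse_tags` are all assumptions made for illustration, and a plain list of strings stands in for a model's token stream.

```python
import re
from typing import Dict, Iterable, Tuple

# Hypothetical stop tags; the abstract does not name the actual delimiters.
ESSENTIAL_STOP = "</action>"   # assumed end of the essential results
DETAILED_STOP = "</reason>"    # assumed end of the detailed results


def generate_until(tokens: Iterable[str], stop_tag: str) -> str:
    """Accumulate streamed tokens and stop as soon as stop_tag has appeared."""
    out = ""
    for tok in tokens:
        out += tok
        if stop_tag in out:
            # Discard anything after the stop tag and end generation early.
            return out[: out.index(stop_tag) + len(stop_tag)]
    return out  # stop tag never appeared: return everything generated


def parse_tags(text: str, names: Tuple[str, ...]) -> Dict[str, str]:
    """Extract the content of each <name>...</name> pair present in text."""
    fields = {}
    for name in names:
        match = re.search(rf"<{name}>(.*?)</{name}>", text, re.DOTALL)
        if match:
            fields[name] = match.group(1).strip()
    return fields


# Stand-in for a model's token stream (illustrative values only).
stream = [
    "<risk>", "high", "</risk>",
    "<action>", "reduce speed", "</action>",
    "<reason>", "vessel crossing ahead", "</reason>",
    "<objects>", "vessel, buoy", "</objects>",
]

essential = generate_until(stream, ESSENTIAL_STOP)
detailed = generate_until(stream, DETAILED_STOP)
all_tags = ("risk", "action", "reason", "objects")
print(parse_tags(essential, all_tags))
print(parse_tags(detailed, all_tags))
```

Because each level ends at a known closing tag, a consumer that only needs the risk level and the action can stop decoding before any explanation is produced, which is the kind of saving the reported token reductions refer to.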
Publication Info
- Year: 2025
- Type: article
- Volume: 13
- Issue: 12
- Pages: 2339-2339
- Citations: 0
- Access: Closed
Identifiers
- DOI: 10.3390/jmse13122339