Abstract

This paper addresses the problem of anomaly detection in multi-source heterogeneous data within the ETL (Extract-Transform-Load) process and proposes an intelligent detection framework that integrates temporal modeling and attention mechanisms. The method achieves effective dynamic aggregation of multidimensional features and temporal dependency modeling in ETL logs through the coordinated design of feature encoding, gated recurrent modeling, and multi-head attention allocation. At the feature level, the model uses a unified encoding structure to map raw logs, monitoring metrics, and task status information into a high-dimensional latent space, ensuring consistency of feature scales and completeness of information. At the temporal level, a GRU-based time modeling structure is introduced to capture long-term dependencies, enhancing the model's ability to perceive the evolution of anomaly patterns. At the attention level, a multi-head mechanism is applied to weight different time segments and feature dimensions, enabling adaptive focus on key moments and important features. Finally, the model combines anomaly scoring with distribution consistency constraints to achieve accurate identification and discrimination of potential anomalies. Experimental results show that the proposed framework significantly outperforms traditional rule-based detection, statistical methods, and basic deep models across various ETL task scenarios, demonstrating higher detection accuracy, stability, and generalization capability. The findings verify the effectiveness of integrating temporal modeling and attention mechanisms for anomaly detection in complex data streams and provide a feasible solution for building reliable and scalable intelligent ETL monitoring systems.

Related Publications

Publication Info

Year
2025
Type
article
Citations
0
Access
Closed

External Links

Social Impact

Social media, news, blog, policy document mentions

Citation Metrics

0
OpenAlex

Cite This

Haige Wang, Cong Nie, C. Chiang (2025). Attention-Driven Deep Learning Framework for Intelligent Anomaly Detection in ETL Processes. Preprints.org . https://doi.org/10.20944/preprints202512.0884.v1

Identifiers

DOI
10.20944/preprints202512.0884.v1