Abstract

The data stream model has recently attracted attention for its applicability to numerous types of data, including telephone records, Web documents, and clickstreams. For analysis of such data, the ability to process the data in a single pass, or a small number of passes, while using little memory, is crucial. We describe such a streaming algorithm that effectively clusters large data streams. We also provide empirical evidence of the algorithm's performance on synthetic and real data streams.
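
To make the single-pass, small-memory idea concrete, here is a minimal sketch of chunked stream clustering. It is an illustration under stated assumptions, not the algorithm from the paper: it uses a plain weighted k-means (Lloyd's iterations) as the per-chunk clustering step, and the names `weighted_kmeans`, `stream_cluster`, and `chunk_size` are hypothetical. Each chunk is reduced to at most `k` weighted centers, and the retained centers are re-clustered whenever they accumulate, so memory stays bounded by roughly two chunks regardless of stream length.

```python
import random
from typing import Iterable, List, Tuple

Point = Tuple[float, ...]
Weighted = Tuple[Point, float]  # (center, weight)


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(items: List[Weighted], k: int, iters: int = 10,
                    seed: int = 0) -> List[Weighted]:
    """Reduce weighted points to at most k weighted centers (Lloyd's iterations)."""
    if len(items) <= k:
        return list(items)
    rng = random.Random(seed)
    centers = [p for p, _ in rng.sample(items, k)]
    dim = len(centers[0])
    weights = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * dim for _ in range(k)]
        weights = [0.0] * k
        for p, w in items:
            # Assign each weighted point to its nearest current center.
            j = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            weights[j] += w
            for d, x in enumerate(p):
                sums[j][d] += w * x
        # Move each non-empty center to the weighted mean of its points.
        centers = [
            tuple(s / weights[j] for s in sums[j]) if weights[j] > 0 else centers[j]
            for j in range(k)
        ]
    # Drop empty clusters; total weight is preserved.
    return [(c, w) for c, w in zip(centers, weights) if w > 0]


def stream_cluster(stream: Iterable[Point], k: int,
                   chunk_size: int = 100) -> List[Weighted]:
    """Single pass over the stream, holding only one chunk plus a summary."""
    summary: List[Weighted] = []
    chunk: List[Weighted] = []
    for p in stream:
        chunk.append((p, 1.0))
        if len(chunk) == chunk_size:
            summary.extend(weighted_kmeans(chunk, k))
            chunk = []
            if len(summary) >= chunk_size:
                # Compress the retained centers so memory stays bounded.
                summary = weighted_kmeans(summary, k)
    summary.extend(chunk)  # leftover points from a partial final chunk
    return weighted_kmeans(summary, k)
```

The stream can be any iterable, including a generator that is consumed exactly once; the weights in the result record how many original points each final center represents.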

Keywords

Computer science, Data stream mining, Cluster analysis, Data mining, Data stream, Data stream clustering, STREAMS, Process (computing), CURE data clustering algorithm, Machine learning, Fuzzy clustering, Computer network

Publication Info

Year: 2003
Type: article
Volume: 15
Issue: 3
Pages: 515-528
Citations: 896
Access: Closed

Citation Metrics

896 (OpenAlex)

Cite This

Sudipto Guha, Adam Meyerson, Nina Mishra et al. (2003). Clustering data streams: theory and practice. IEEE Transactions on Knowledge and Data Engineering, 15(3), 515-528. https://doi.org/10.1109/tkde.2003.1198387

Identifiers

DOI
10.1109/tkde.2003.1198387