Abstract

The discovery and analysis of community structure in networks is a topic of considerable recent interest within the physics community, but most methods proposed so far are unsuitable for very large networks because of their computational cost. Here we present a hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with n vertices and m edges is O (md log n) where d is the depth of the dendrogram describing the community structure. Many real-world networks are sparse and hierarchical, with m approximately n and d approximately log n, in which case our algorithm runs in essentially linear time, O (n log(2) n). As an example of the application of this algorithm we use it to analyze a network of items for sale on the web site of a large on-line retailer, items in the network being linked if they are frequently purchased by the same buyer. The network has more than 400 000 vertices and 2 x 10(6) edges. We show that our algorithm can extract meaningful communities from this network, revealing large-scale patterns present in the purchasing habits of customers.

Keywords

Community structurePurchasingComputer scienceNetwork structureBinary logarithmDendrogramComplex networkData miningTheoretical computer scienceCombinatoricsMathematicsWorld Wide WebBusinessMarketingSociology

Affiliated Institutions

Related Publications

Publication Info

Year
2004
Type
article
Volume
70
Issue
6
Pages
066111-066111
Citations
7254
Access
Closed

External Links

Social Impact

Social media, news, blog, policy document mentions

Citation Metrics

7254
OpenAlex

Cite This

Aaron Clauset, M. E. J. Newman, Cristopher Moore (2004). Finding community structure in very large networks. Physical Review E , 70 (6) , 066111-066111. https://doi.org/10.1103/physreve.70.066111

Identifiers

DOI
10.1103/physreve.70.066111