Abstract
A variety of clustering algorithms have recently been proposed to handle data that is not linearly separable; spectral clustering and kernel k-means are two of the main methods. In this paper, we discuss an equivalence between the objective functions used in these seemingly different methods--in particular, a general weighted kernel k-means objective is mathematically equivalent to a weighted graph clustering objective. We exploit this equivalence to develop a fast, high-quality multilevel algorithm that directly optimizes various weighted graph clustering objectives, such as the popular ratio cut, normalized cut, and ratio association criteria. This eliminates the need for any eigenvector computation for graph clustering problems, which can be prohibitive for very large graphs. Previous multilevel graph partitioning methods, such as Metis, have suffered from the restriction of equal-sized clusters; our multilevel algorithm removes this restriction by using kernel k-means to optimize weighted graph cuts. Experimental results show that our multilevel algorithm outperforms a state-of-the-art spectral clustering algorithm in terms of speed, memory usage, and quality. We demonstrate that our algorithm is applicable to large-scale clustering tasks such as image segmentation, social network analysis and gene network analysis.
Keywords
Affiliated Institutions
Related Publications
Kernel k-means
Kernel k-means and spectral clustering have both been used to identify clusters that are non-linearly separable in input space. Despite significant research, these methods have ...
New spectral methods for ratio cut partitioning and clustering
Partitioning of circuit netlists in VLSI design is considered. It is shown that the second smallest eigenvalue of a matrix derived from the netlist gives a provably good approxi...
On the Quality of Spectral Separators
Computing graph separators is an important step in many graph algorithms. A popular technique for finding separators involves spectral methods. However, there has not been much ...
Learning Eigenfunctions Links Spectral Embedding and Kernel PCA
In this letter, we show a direct relation between spectral embedding methods and kernel principal components analysis and how both are special cases of a more general learning p...
On clusterings
We motivate and develop a natural bicriteria measure for assessing the quality of a clustering that avoids the drawbacks of existing measures. A simple recursive heuristic is sh...
Publication Info
- Year
- 2007
- Type
- article
- Volume
- 29
- Issue
- 11
- Pages
- 1944-1957
- Citations
- 1016
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1109/tpami.2007.1115