Abstract
We present the first linear time (1+ε)-approximation algorithm for the k-means problem for fixed k and ε. Our algorithm runs in O(nd) time, which is linear in the size of the input. Another feature of our algorithm is its simplicity – the only technique involved is random sampling. 1.
Keywords
Affiliated Institutions
Related Publications
Near-Optimal Signal Recovery From Random Projections: Universal Encoding Strategies?
Suppose we are given a vector f in a class F ⊂ ℝN, e.g., a class of digital signals or digital images. How many linear measurements do we need to make about f to be able to reco...
Kernel k-means
Kernel k-means and spectral clustering have both been used to identify clusters that are non-linearly separable in input space. Despite significant research, these methods have ...
Selection of Variables for Fitting Equations to Data
Selecting a suitable equation to represent a set of multifactor data that was collected for other purposes in a plant, pilot-plant, or laboratory can be troublesome. If there ar...
Similarity Search in High Dimensions via Hashing
The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasin...
Fast Kernels for String and Tree Matching
In this paper we present a new algorithm suitable for matching discrete objects such as strings and trees in linear time, thus obviating dynamic programming with quadratic time ...
Publication Info
- Year
- 2004
- Type
- article
- Pages
- 454-462
- Citations
- 241
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1109/focs.2004.7