Abstract
We present a new approach for the approximate K-nearest neighbor search based on navigable small world graphs with controllable hierarchy (Hierarchical NSW, HNSW). The proposed solution is fully graph-based, without any need for additional search structures (typically used at the coarse search stage of the most proximity graph techniques). Hierarchical NSW incrementally builds a multi-layer structure consisting of a hierarchical set of proximity graphs (layers) for nested subsets of the stored elements. The maximum layer in which an element is present is selected randomly with an exponentially decaying probability distribution. This allows producing graphs similar to the previously studied Navigable Small World (NSW) structures while additionally having the links separated by their characteristic distance scales. Starting the search from the upper layer together with utilizing the scale separation boosts the performance compared to NSW and allows a logarithmic complexity scaling. Additional employment of a heuristic for selecting proximity graph neighbors significantly increases performance at high recall and in case of highly clustered data. Performance evaluation has demonstrated that the proposed general metric space search index is able to strongly outperform previous opensource state-of-the-art vector-only approaches. Similarity of the algorithm to the skip list structure allows straightforward balanced distributed implementation.
Keywords
Affiliated Institutions
Related Publications
Image Super-Resolution With Sparse Neighbor Embedding
Until now, neighbor-embedding-based (NE) algorithms for super-resolution (SR) have carried out two independent processes to synthesize high-resolution (HR) image patches. In the...
Similarity Search in High Dimensions via Hashing
The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasin...
Approximate Evaluation Techniques for the Single-Link and Complete-Link Hierarchical Clustering Procedures
Abstract A technique is presented for testing the hypothesis that a hierarchical sequence of partitions constructed by the single-link or complete-link clustering method could h...
An Algroithm for Finding Best Matches in Logarithmic Expected Time
An algorithm and data structure are presented for searching a file containing N records, each described by k real valued keys, for the m closest matches or nearest neighbors to ...
Low-Complexity Single-Image Super-Resolution based on Nonnegative Neighbor Embedding
This paper describes a single-image super-resolution (SR) algorithm based on nonnegative neighbor embedding. It belongs to the family of single-image example-based SR algorithms...
Publication Info
- Year
- 2018
- Type
- article
- Volume
- 42
- Issue
- 4
- Pages
- 824-836
- Citations
- 1341
- Access
- Closed
External Links
Social Impact
Social media, news, blog, policy document mentions
Citation Metrics
Cite This
Identifiers
- DOI
- 10.1109/tpami.2018.2889473