autonlab.org

An Investigation of Practical Approximate Nearest Neighbor Algorithms (2004)

Ting Liu, Andrew Moore, Alexander Gray, Ke Yang

Tags

Kd-trees and Ball-trees

Abstract

This paper concerns approximate nearest neighbor searching algorithms, which have become increasingly important, especially in high-dimensional perception areas such as computer vision, with dozens of publications in recent years. Much of this enthusiasm is due to a successful new approximate nearest neighbor approach called Locality Sensitive Hashing (LSH). In this paper we ask the question: can earlier spatial data structure approaches to exact nearest neighbor, such as metric trees, be altered to provide approximate answers to proximity queries and if so, how? We introduce a new kind of metric tree that allows overlap: certain datapoints may appear in both children of a parent. We also introduce new approximate k-NN search algorithms on this structure. We show why these structures should be able to exploit the same random-projection-based approximations that LSH enjoys, but with a simpler algorithm and perhaps with greater efficiency. We then provide a detailed empirical evaluation on five large, high-dimensional datasets, which shows up to 31-fold accelerations over LSH. This result holds true throughout the spectrum of approximation levels.
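
The overlap idea in the abstract can be illustrated with a small Python sketch. This is not the paper's implementation; it is a minimal illustration under stated assumptions. The names `build`, `defeatist_search`, `tau` (the half-width of the overlap region) and `leaf_size` are hypothetical, as are the pivot heuristic and the choice to descend to a single leaf without backtracking.

```python
import math
import random


def sqdist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def dist(p, q):
    return math.sqrt(sqdist(p, q))


def build(points, tau, leaf_size=8):
    """Build a metric tree whose children may share points (assumed design)."""
    if len(points) <= leaf_size:
        return {"leaf": points}
    # Heuristic pivots: a is far from an arbitrary point, b is far from a.
    a = max(points, key=lambda p: sqdist(p, points[0]))
    b = max(points, key=lambda p: sqdist(p, a))
    ab = dist(a, b)
    if ab == 0.0:  # all points coincide, nothing to split
        return {"leaf": points}

    def offset(p):
        # Signed distance from the hyperplane midway between a and b,
        # measured along the a -> b direction (negative means nearer a).
        return (sqdist(p, a) - sqdist(p, b)) / (2.0 * ab)

    # Points within tau of the boundary are copied into both children.
    left = [p for p in points if offset(p) <= tau]
    right = [p for p in points if offset(p) >= -tau]
    if len(left) == len(points) or len(right) == len(points):
        return {"leaf": points}  # overlap too wide to make progress
    return {"a": a, "b": b,
            "left": build(left, tau, leaf_size),
            "right": build(right, tau, leaf_size)}


def defeatist_search(node, q):
    """Descend to one leaf without backtracking; return its closest point."""
    while "leaf" not in node:
        nearer_a = sqdist(q, node["a"]) <= sqdist(q, node["b"])
        node = node["left"] if nearer_a else node["right"]
    return min(node["leaf"], key=lambda p: sqdist(p, q))


if __name__ == "__main__":
    random.seed(0)
    data = [tuple(random.random() for _ in range(4)) for _ in range(200)]
    tree = build(data, tau=0.05)
    query = (0.5, 0.5, 0.5, 0.5)
    print(defeatist_search(tree, query))
```

Because the search never backtracks, the answer is only approximate: a query near a split boundary may have its true nearest neighbor on the other side. Widening `tau` duplicates more points across children, which makes such misses less likely at the cost of a larger tree.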

Datasets

Datasets used in this paper can be found here.

Approximate BibTeX Entry

@inproceedings{liu-nips04,
    Month = {12},
    Year = {2004},
    Author = {Ting Liu and Andrew Moore and Alexander Gray and Ke Yang},
    Title = {An Investigation of Practical Approximate Nearest Neighbor Algorithms}
}

Copyright 2006, Carnegie Mellon University, Auton Lab