autonlab.org

Efficient Exact k-NN and Nonparametric Classification in High Dimensions (2003)

Ting Liu, Andrew Moore, Alexander Gray

Tags

Auton Fast Classifiers, Efficient Statistical Algorithms, Kd-trees and Ball-trees, Statistical Data Mining for Astrophysics

Abstract

This paper is about non-approximate acceleration of high dimensional nonparametric operations such as $k$ nearest neighbor classifiers and the prediction phase of Support Vector Machine classifiers. We attempt to exploit the fact that even if we want exact answers to nonparametric queries, we usually do not need to explicitly find the datapoints close to the query, but merely need to ask questions about the properties about that set of datapoints. This offers a small amount of computational leeway, and we investigate how much that leeway can be exploited. For clarity, this paper concentrates on pure $k$-NN classification and the prediction phase of SVMs. We introduce new ball tree algorithms that on real-world datasets give accelerations of 2-fold up to 100-fold compared against highly optimized traditional ball-tree-based $k$-NN. These results include datasets with up to $10^6$ dimensions and $10^5$ records, and show non-trivial speedups while giving exact answers.

Full text

Download (application/pdf, 84.4 kB)

Approximate BibTeX Entry

@proceedings{Liu-knn,
    Year = {2003},
    Booktitle = {Proceedings of Neural Information Processing Systems},
    Author = { Ting Liu, Andrew Moore, Alexander Gray },
    Title = {Efficient Exact k-NN and Nonparametric Classification in High Dimensions}
}

Copyright 2008, Carnegie Mellon University, Auton Lab. All Rights Reserved.