autonlab.org

Rapid Evaluation of Multiple Density Models (2003)

Alexander Gray, Andrew Moore

Tags

Astrostatistics, Cached Sufficient Statistics, Efficient Statistical Algorithms, Kd-trees and Ball-trees, Kernel Density Estimation, Memory-based Learning, Statistical Data Mining for Astrophysics

Abstract

When highly accurate and/or assumption-free density estimation is needed, nonparametric methods are often called upon, most notably the popular kernel density estimation (KDE) method. However, the practitioner is immediately faced with the formidable computational cost of KDE for appreciable dataset sizes, which becomes even more prohibitive when many models with different kernel scales (bandwidths) must be evaluated; this is necessary for finding the optimal model, among other reasons. In previous work we presented an algorithm for fast KDE which addresses large dataset sizes and large dimensionalities, but assumes only a single bandwidth. In this paper we present a generalization of that algorithm allowing multiple models with different bandwidths to be computed simultaneously, in substantially less time than either running the single-bandwidth algorithm for each model independently or running the standard exhaustive method. We show examples of computing the likelihood curve for 100,000 data points and 100 models ranging across 3 orders of magnitude in scale, in minutes or seconds.
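
To make the "likelihood curve" concrete, the following is a minimal sketch of the standard exhaustive baseline the abstract compares against, not the tree-based algorithm of the paper. It assumes a Gaussian kernel and a leave-one-out log-likelihood score, neither of which is stated in the abstract; the function name `loo_log_likelihood_curve` and the synthetic data are hypothetical.

```python
import numpy as np

def loo_log_likelihood_curve(data, bandwidths):
    """Leave-one-out log-likelihood of a Gaussian KDE, one value per bandwidth.

    Exhaustive O(N^2) work per bandwidth (assumed baseline, not the paper's
    method). Pairwise squared distances are computed once and reused.
    """
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data[:, None]          # treat 1-D input as N points in 1 dimension
    n, d = data.shape

    # All pairwise squared distances; the diagonal is set to +inf so each
    # point contributes zero kernel mass to its own density estimate.
    diff = data[:, None, :] - data[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    np.fill_diagonal(sq_dist, np.inf)

    scores = []
    for h in bandwidths:
        log_norm = -0.5 * d * np.log(2.0 * np.pi * h ** 2)
        log_k = -0.5 * sq_dist / h ** 2
        # Numerically stable log-sum-exp over the other N-1 points.
        row_max = np.max(log_k, axis=1, keepdims=True)
        lse = row_max[:, 0] + np.log(np.sum(np.exp(log_k - row_max), axis=1))
        log_density = lse + log_norm - np.log(n - 1)
        scores.append(np.sum(log_density))
    return np.array(scores)

# Hypothetical usage: 20 bandwidths spanning 3 orders of magnitude.
rng = np.random.default_rng(0)
sample = rng.normal(size=(500, 2))
hs = np.logspace(-2, 1, 20)
curve = loo_log_likelihood_curve(sample, hs)
best_h = hs[np.argmax(curve)]
```

Even with the distance matrix shared across bandwidths, this baseline costs on the order of N² kernel evaluations per model, which is the cost the multi-bandwidth algorithm is designed to avoid.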

Full text

Download (application/pdf, 206.5 kB)

Approximate BibTeX Entry

@inproceedings{gray-rapid,
    Year = {2003},
    Booktitle = {Artificial Intelligence and Statistics},
    Author = {Alexander Gray and Andrew Moore},
    Title = {Rapid Evaluation of Multiple Density Models}
}

Copyright 2010, Carnegie Mellon University, Auton Lab. All Rights Reserved.