Rapid Evaluation of Multiple Density Models (2003)
Tags
Astrostatistics, Cached Sufficient Statistics, Efficient Statistical Algorithms, Kd-trees and Ball-trees, Kernel Density Estimation, Memory-based Learning, Statistical Data Mining for Astrophysics
Abstract
When highly accurate and/or assumption-free density estimation is needed, nonparametric methods are often called upon, most notably the popular kernel density estimation (KDE) method. However, the practitioner is immediately faced with the formidable computational cost of KDE for appreciable dataset sizes, which becomes even more prohibitive when many models with different kernel scales (bandwidths) must be evaluated, as is necessary for finding the optimal model, among other reasons. In previous work we presented an algorithm for fast KDE which addresses large dataset sizes and large dimensionalities, but assumes only a single bandwidth. In this paper we present a generalization of that algorithm allowing multiple models with different bandwidths to be computed simultaneously, in substantially less time than either running the single-bandwidth algorithm for each model independently or running the standard exhaustive method. We show examples of computing the likelihood curve for 100,000 data points and 100 models ranging across 3 orders of magnitude in scale, in minutes or seconds.
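For orientation, the "standard exhaustive method" that the abstract uses as a baseline can be sketched roughly as below. This is only an illustration of the quadratic-cost baseline, not the tree-based algorithm of the paper. It assumes a 1-D Gaussian kernel and a leave-one-out log-likelihood as the score for each bandwidth; the function name `loo_log_likelihood_curve` and the underflow floor are hypothetical choices made for this sketch.

```python
import math


def loo_log_likelihood_curve(data, bandwidths):
    """Exhaustive leave-one-out log-likelihood of a 1-D Gaussian KDE.

    Returns one score per bandwidth. Cost is O(n^2) per bandwidth,
    which is the expense the paper sets out to avoid.
    """
    n = len(data)
    # Squared pairwise distances are computed once and shared by all bandwidths.
    sq = [[(data[i] - data[j]) ** 2 for j in range(n)] for i in range(n)]
    curve = []
    for h in bandwidths:
        # Normalisation of a Gaussian kernel averaged over the other n - 1 points.
        norm = 1.0 / ((n - 1) * h * math.sqrt(2.0 * math.pi))
        total = 0.0
        for i in range(n):
            s = sum(math.exp(-sq[i][j] / (2.0 * h * h))
                    for j in range(n) if j != i)
            # Assumption of this sketch: floor the density to avoid log(0)
            # when every kernel value underflows at very small bandwidths.
            total += math.log(max(norm * s, 1e-300))
        curve.append(total)
    return curve
```

Calling `loo_log_likelihood_curve(points, hs)` with a list of bandwidths spanning several orders of magnitude yields a likelihood curve whose maximum indicates the preferred bandwidth under this score. Even with the distances shared across bandwidths, the work grows as the number of bandwidths times the square of the dataset size.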
Approximate BibTeX Entry
@inproceedings{gray-rapid,
Year = {2003},
Booktitle = {Artificial Intelligence and Statistics},
Author = {Alexander Gray and Andrew Moore},
Title = {Rapid Evaluation of Multiple Density Models}
}