Fast Nonlinear Regression via Eigenimages Applied to Galactic Morphology (2004)
Brigham Anderson, Andrew Moore, Andrew Connolly, Robert Nichol
Tags
Efficient Statistical Algorithms, Optimization, Statistical Data Mining for Astrophysics
Abstract
Astronomy increasingly faces the issue of massive datasets. For instance, the Sloan Digital Sky Survey (SDSS) has so far generated tens of millions of images of distant galaxies, of which only a tiny fraction have been morphologically classified. Our aim is to reduce each image in the dataset to a small set of informative features, in this case by using a known parameterized model of the image contents and replacing each image with its best-fit parameters. This is a standard nonlinear regression problem, whose challenges are fourfold: 1) the atmospheric and mirror-based distortion suffered by each image, 2) large numbers of local minima, 3) large amounts of noise, and 4) the speed required to cope with the sheer size of the datasets. Our strategy is to use the known model's eigenimages to form a new basis, then to map both the target images and the model parameters into this eigenspace, and finally to find the best image-to-parameter matches within the space. To do this, we create a database of many random sets of parameters and their locations in eigenspace, thereby turning the fitting process into a nearest-neighbor search. Complications arise in the form of missing data and heteroskedasticity, both of which are addressed with weighted linear regression. Compared to existing techniques, the method achieves speedups of 2 to 3 orders of magnitude. This makes it feasible to analyze the entire SDSS dataset, itself a wealth of scientific information.
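The strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy model `render` (a centered elliptical Gaussian blob), the parameter ranges, the image size, the number of eigenimages, and all function names are assumptions made for illustration. The nearest-neighbor step is a brute-force search, and the atmospheric and mirror-based distortion is ignored. The sketch samples random parameter sets, derives eigenimages from the rendered models with an SVD, stores each model's eigenspace coordinates, and fits a target image by weighted linear regression onto the eigenimages followed by a nearest-neighbor lookup.

```python
import numpy as np

def render(params, size=16):
    """Hypothetical toy model: a centered elliptical Gaussian, flattened to 1-D."""
    amp, sx, sy = params
    y, x = np.mgrid[:size, :size] - (size - 1) / 2.0
    return (amp * np.exp(-0.5 * ((x / sx) ** 2 + (y / sy) ** 2))).ravel()

def build_database(n_models, n_components, rng, size=16):
    """Sample random parameters and store their eigenspace coordinates."""
    params = np.column_stack([
        rng.uniform(0.5, 2.0, n_models),   # amplitude (assumed range)
        rng.uniform(1.0, 4.0, n_models),   # width along x (assumed range)
        rng.uniform(1.0, 4.0, n_models),   # width along y (assumed range)
    ])
    images = np.array([render(p, size) for p in params])
    mean = images.mean(axis=0)
    # Rows of vt are orthonormal eigenimages of the model image set.
    _, _, vt = np.linalg.svd(images - mean, full_matrices=False)
    basis = vt[:n_components]              # shape (k, n_pixels)
    coords = (images - mean) @ basis.T     # shape (n_models, k)
    return params, mean, basis, coords

def project_weighted(image, weights, mean, basis):
    """Weighted least-squares coordinates of an image in eigenspace.

    A weight of 0 marks a missing pixel; 1/variance handles unequal noise.
    """
    sw = np.sqrt(weights)
    a = basis.T * sw[:, None]
    b = (image - mean) * sw
    c, *_ = np.linalg.lstsq(a, b, rcond=None)
    return c

def fit(image, weights, db):
    """Return the stored parameters whose eigenspace location is nearest."""
    params, mean, basis, coords = db
    c = project_weighted(image, weights, mean, basis)
    idx = np.argmin(np.sum((coords - c) ** 2, axis=1))  # brute-force search
    return params[idx]

rng = np.random.default_rng(0)
db = build_database(n_models=2000, n_components=8, rng=rng)

# A noisy target with two rows of missing pixels (weight 0).
sigma = 0.02
target = render(np.array([1.2, 2.5, 1.5])) + rng.normal(0.0, sigma, 256)
weights = np.full(256, 1.0 / sigma ** 2)
weights[:32] = 0.0
target[:32] = 0.0   # placeholder values; ignored because their weight is 0
best = fit(target, weights, db)
print("best-fit parameters:", best)
```

Because the returned answer is always one of the stored parameter sets, the accuracy of this sketch is bounded by how densely the database samples parameter space. A spatial index could replace the brute-force search when the database is large.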
Full text
Download (application/pdf, 140.5 kB)
Approximate BibTeX Entry
@inproceedings{anderson-gmorph,
Year = {2004},
Booktitle = {Knowledge Discovery from Databases Conference},
Author = {Brigham Anderson and Andrew Moore and Andrew Connolly and Robert Nichol},
Title = {Fast Nonlinear Regression via Eigenimages Applied to Galactic Morphology}
}