Alex's fascinations in early grade school were Legos, breaking ciphers, and drawing human anatomy. After studying Applied Math and Computer Science at Berkeley, he resisted a job offer to do Hollywood special effects and ended up working at NASA's Jet Propulsion Laboratory for six years developing machine learning algorithms for interesting and hard scientific problems (as well as trading options on the side). He finally realized that having non-trivial ideas is effectively not allowed without having a PhD, so he went to CMU to get one. His current fascinations are still building (systems that really solve hard problems that people really want solved), deciphering (things that seem complicated), and creating (new and inspiring ways of looking at things).
Large-scale learning algorithms. Unsupervised learning. Time series and control. Automatic derivation of parametric learning algorithms. Nonparametric methods. Recursive statistical models. Data structures. Fundamental extensions of divide-and-conquer. Computational geometry. Challenge problems of numerical analysis and operations research.
Astrostatistics, Auton Fast Classifiers, Bayesian Networks, Cached Sufficient Statistics, Clustering, Efficient Statistical Algorithms, Kd-trees and Ball-trees, Kernel Density Estimation, K Nearest Neighbor, Life Science Data Mining, Locally Weighted Learning, Memory-based Learning, Mixture Models, Optimization, Statistical Data Mining for Astrophysics
An Investigation of Practical Approximate Nearest Neighbor Algorithms
How can variations on classic exact nearest-neighbor data structures be used when you want faster answers and are prepared to accept approximation?
High-Dimensional Probabilistic Classification for Drug Discovery
Discriminative probabilistic classifiers have been used successfully on large life-sciences datasets, but high dimensionality has prohibited the use of nonparametric class probability estimation. This paper explores a method (SLAMDUNK) which addresses
Rapid Evaluation of Multiple Density Models
A way to quickly evaluate and compare multiple nonparametric density estimates.
Efficient Exact k-NN and Nonparametric Classification in High Dimensions
Can we do non-approximate k-NN classification without actually finding the k-NN?
N-Body Problems in Statistical Learning
A way to use multiple trees simultaneously to solve a large class of statistical problems efficiently.