Faceted Browser
53 documents found.
Showing 1-10 of 53
Accelerating Exact k-means Algorithms with Geometric Reasoning
Document Type: Paper
Tags: Statistical Data Mining for Astrophysics, Cached Sufficient Statistics, Astrostatistics, Clustering, Efficient Statistical Algorithms, Kd-trees and Ball-trees, Mixture Models
A K-means tutorial. We present new algorithms for the k-means clustering problem. They use the kd-tree data structure to reduce the large number of nearest-neighbor queries issued by the traditional algorithm. Sufficient statistics are stored in the nodes of the kd-tree. Then, an analysis of th...
Accelerating Exact k-means Algorithms with Geometric Reasoning (Extended version)
Document Type: Paper
Tags: Statistical Data Mining for Astrophysics, Cached Sufficient Statistics, Clustering, Kd-trees and Ball-trees, Efficient Statistical Algorithms, Mixture Models
This is an extended version of the KDD99 paper (available here). We present new algorithms for the k-means clustering problem. They use the kd-tree data structure to reduce the large number of nearest-neighbor queries issued by the traditional algorithm. Sufficient statistics are stored in the no...
A Comparison of Statistical and Machine Learning Algorithms on the Task of Link Completion
Document Type: Paper
Tags: GDA, Testing, Link Analysis, Efficient Statistical Algorithms, Applications
Link data, consisting of a collection of subsets of entities, can be an important source of information for a variety of fields including the social sciences, biology, criminology, and business intelligence. However, these links may be incomplete, containing one or more unknown members. We consi...
Active Learning in Discrete Input Spaces
Document Type: Paper
Tags: AD-trees, Cached Sufficient Statistics, Efficient Statistical Algorithms, Optimization, Association Rules, Active Learning
Traditional design of experiments (DOE) from the statistics literature focuses on optimizing an output parameter over a space of continuous input parameters. Here we consider DOE, or active learning, for discrete input spaces. A trivial example of this is the k-armed bandit problem, which is the...
AD-trees for Fast Counting and for Fast Learning of Association Rules
Document Type: Paper
Tags: AD-trees, Efficient Statistical Algorithms, Association Rules
The problem of discovering association rules in large databases has received considerable research attention. Much research has examined the exhaustive discovery of all association rules involving positive binary literals (e.g. Agrawal et al. 1996). Other research has concerned finding complex a...
A Dynamic Adaptation of AD-trees for Efficient Machine Learning on Large Data Sets
Document Type: Paper
Tags: AD-trees, Statistical Data Mining for Astrophysics, Cached Sufficient Statistics, Bayesian Networks, Efficient Statistical Algorithms, Association Rules
This paper has no novel learning or statistics: it is concerned with making a wide class of pre-existing statistics and learning algorithms computationally tractable when faced with data sets with massive numbers of records or attributes. It briefly reviews the static AD-tree structure of Moore ...
A Fast Multi-Resolution Method for Detection of Significant Spatial Disease Clusters
Document Type: Paper
Tags: Biosurveillance, Clustering, Kd-trees and Ball-trees, Efficient Statistical Algorithms, Applications
Given an N x N grid of squares, where each square has a count and an underlying population, our goal is to find the square region with the highest density, and to calculate its significance by randomization. Any density measure D, dependent on the total count and total population of a region, ca...
A Fast Multi-Resolution Method for Detection of Significant Spatial Overdensities
Document Type: Paper
Tags: Biosurveillance, Statistical Data Mining for Astrophysics, Clustering, Efficient Statistical Algorithms, Kd-trees and Ball-trees, Applications, Spatial Statistics
Given an NxN grid of squares, where each square s_ij has count c_ij and an underlying population p_ij, our goal is to find the square region S with the highest density, and to calculate the significance of this region by Monte Carlo testing. Any density measure D, which depends on the total coun...
Alexander Gray
Document Type: Person
Tags: Auton Fast Classifiers, Statistical Data Mining for Astrophysics, K Nearest Neighbor, Astrostatistics, Cached Sufficient Statistics, Clustering, Memory-based Learning, Efficient Statistical Algorithms, Life Science Data Mining, Locally Weighted Learning, Kernel Density Estimation, Bayesian Networks, Kd-trees and Ball-trees, Mixture Models, Optimization
Alex's fascinations in early grade school were Legos, breaking ciphers, and drawing human anatomy. After studying Applied Math and Computer Science at Berkeley, he resisted a job offer to do Hollywood special effects and ended up working at NASA's Jet Propulsion Laboratory for six years developi...
Andrew Moore
Document Type: Person
Tags: Link Analysis, Auton Fast Classifiers, Statistical Data Mining for Astrophysics, Cached Sufficient Statistics, Efficient Statistical Algorithms, Spatial Statistics, Life Science Data Mining, Logistic Regression, Locally Weighted Learning, GDA, AD-trees, Bayesian Networks, Kernel Density Estimation, Kd-trees and Ball-trees, Mixture Models, WSARE, Reinforcement Learning, Active Learning, Markov Decision Processes, K Nearest Neighbor, Astrostatistics, Clustering, Memory-based Learning, Biosurveillance, Applications, Optimization, Association Rules
Andrew began his career writing video-games for an obscure British personal computer. He rapidly became a thousandaire and retired to academia, where he received a PhD from the University of Cambridge in 1991. He researched robot learning as a Post-doc working with Chris Atkeson, and then moved ...