The Auton Lab encourages researchers to examine and replicate our findings. To that end, we provide datasets identical to those used in our published work.
If you would like to request a dataset not found here, please contact Andrew Moore.
Some datasets used in recent papers:
|Alias Detection Datasets||The following datasets were used in Hsiung et al. (2005), Alias Detection in Link Data Sets, by Paul Hsiung, Andrew Moore, Daniel Neill and Jeff Schneider, Proceedings of the International Conference on Intelligence Analysis, 2005. The datasets can be used as example inputs to the Many Names On...|
|Link Datasets||Datasets for Link Detection, GDA, k-groups, cGraph, and Sparse Bayes Net search are found offsite: http://www.cs.cmu.edu/~jkubica/code/linkds.html|
|Logistic Regression Datasets||Datasets for logistic regression are found offsite: http://komarix.org/ac/ds|
|Nearest Neighbor (NIPS 2004) Datasets||The following datasets were used in Liu, Moore, Gray and Yang (2004), An Investigation of Practical Approximate Nearest Neighbor Algorithms, NIPS 2004. They are stored in this form on this page in order to allow other researchers to run experiments on the same datasets with identical preprocess...|
|Optimal Reinsertion Datasets||The following datasets were used in Moore and Wong (2003), Optimal Reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning, ICML 2003. They are stored in this form on this page in order to allow other researchers to run experiments on the same da...|
|SBNS Datasets||Dataset descriptions. Institute Data: a set of records of collaborations between professors and students, collected from publicly available web pages listed on the Carnegie Mellon University Robotics Institute's web site. NIPS Data: a set containing co-authorship information of the Neural Information Pr...|
|WSARE Datasets||The following datasets were used in Wong and Moore (2003), Bayesian Network Anomaly Pattern Detection for Disease Outbreaks, ICML 2003. They are stored in this form on this page in order to allow other researchers to run experiments on the same datasets with identical preprocessing, including di...|