Alias Detection Datasets
The following datasets were used in (Hsiung et al., 2005): "Alias Detection in Link Data Sets" by Paul Hsiung, Andrew Moore, Daniel Neill, and Jeff Schneider, Proceedings of the International Conference on Intelligence Analysis, 2005.
The datasets can be used as example inputs to the Many Names One Person software by Paul Hsiung.
They are stored on this page in this form so that other researchers can run experiments on the same datasets with identical preprocessing, including the discretization levels of real-valued attributes and the compensation for missing values.
- Readme file
- Spam Archive Data
- Paul Hsiung's Spam Data
- News Article Entities Data