autonlab.org

Making Logistic Regression A Core Data Mining Tool With TR-IRLS (2005)

Paul Komarek, Andrew Moore

Tags

Applications, Efficient Statistical Algorithms, Optimization

Abstract

Binary classification is a core data mining task. For large datasets or real-time applications, desirable classifiers are accurate, fast, and need no parameter tuning. We present a simple implementation of logistic regression that meets these requirements. A combination of regularization, truncated Newton methods, and iteratively re-weighted least squares makes it faster and more accurate than modern SVM implementations, and relatively insensitive to parameters. It is robust to linear dependencies and some scaling problems, making most data preprocessing unnecessary.
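The three ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`tr_irls_fit`, `conjugate_gradient`), the parameter names, and the default values are all assumptions chosen for illustration. Each outer IRLS step forms the ridge-regularized weighted least-squares system (X^T W X + ridge I) beta = X^T W z, and a capped number of linear conjugate gradient iterations solves it only approximately, which is the "truncated Newton" part. No intercept is fitted; append a column of ones to X if one is wanted.

```python
import numpy as np


def sigmoid(z):
    # Clip the argument so np.exp cannot overflow for extreme scores.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def conjugate_gradient(matvec, b, x0, max_iter, tol):
    """Approximately solve A x = b for symmetric positive definite A,
    where A is available only through matvec(v) = A @ v."""
    x = x0.copy()
    r = b - matvec(x)          # residual
    p = r.copy()               # search direction
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:  # stop early once the residual is small
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def tr_irls_fit(X, y, ridge=10.0, max_irls=30, max_cg=200,
                cg_tol=1e-6, dev_tol=1e-4):
    """Ridge-regularized logistic regression via IRLS with an inexact
    (conjugate gradient) inner solve. Defaults are illustrative only."""
    n, d = X.shape
    beta = np.zeros(d)
    prev_dev = np.inf
    for _ in range(max_irls):
        eta = X @ beta
        mu = sigmoid(eta)
        w = mu * (1.0 - mu)                 # IRLS weights (diagonal of W)
        # Right-hand side X^T W z, with z = eta + (y - mu) / w,
        # written as w * eta + (y - mu) to avoid dividing by tiny weights.
        rhs = X.T @ (w * eta + (y - mu))
        # Matrix-free product with (X^T W X + ridge * I).
        matvec = lambda v: X.T @ (w * (X @ v)) + ridge * v
        beta = conjugate_gradient(matvec, rhs, beta, max_cg, cg_tol)
        # Deviance of the updated model, used as the stopping criterion.
        mu_new = np.clip(sigmoid(X @ beta), 1e-10, 1.0 - 1e-10)
        dev = -2.0 * np.sum(y * np.log(mu_new) + (1.0 - y) * np.log(1.0 - mu_new))
        if abs(prev_dev - dev) < dev_tol * (abs(dev) + 1e-12):
            break
        prev_dev = dev
    return beta


def predict_proba(X, beta):
    return sigmoid(X @ beta)
```

Because the ridge term keeps the linear system positive definite, this sketch also runs when X contains duplicated or linearly dependent columns, which is one way to read the abstract's robustness claim. The matrix-free product means X^T W X is never formed explicitly, so the same code works unchanged if X is replaced by a SciPy sparse matrix.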

Approximate BibTeX Entry

@inproceedings{komarek:icdm2005,
    Howpublished = {InProceedings},
    Year = {2005},
    Pages = {4},
    Booktitle = {Proceedings of the 5th International Conference on Data Mining},
    Institution = {Carnegie Mellon University},
    Author = {Paul Komarek and Andrew Moore},
    Title = {Making Logistic Regression A Core Data Mining Tool With TR-IRLS}
}

Copyright 2010, Carnegie Mellon University, Auton Lab. All Rights Reserved.