Optimal Reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning (2003)
Tags
AD-trees, Astrostatistics, Bayesian Networks, Biosurveillance, Cached Sufficient Statistics, Efficient Statistical Algorithms, Optimization, Statistical Data Mining for Astrophysics, WSARE
Abstract
We show how a conceptually simple search operator called Optimal Reinsertion can be applied to learning Bayesian Network structure from data. On each step we pick a node called the target. We delete all arcs entering or exiting the target. We then find, subject to some constraints, the globally optimal combination of in-arcs and out-arcs with which to reinsert it. The heart of the paper is a new algorithm called ORSearch which allows each optimal reinsertion step to be computed efficiently on large datasets. Our empirical results compare Optimal Reinsertion against a highly tuned implementation of multi-restart hill climbing. The results typically show one to two orders of magnitude speed-up on a variety of datasets. They usually show better final results, both in terms of BDEU score and in modeling of future data drawn from the same distribution.
Full text
Download (application/pdf, 129.9 kB)
Datasets
Datasets used in this paper are available here.
Approximate BibTeX Entry
@inproceedings{moore-optreinsert,
Howpublished = {Proceedings of the 20th International Conference on Machine Learning},
Month = {August},
Year = {2003},
Pages = {552-559},
Publisher = {AAAI Press},
Address = {Menlo Park, California},
Booktitle = {Proceedings of the 20th International Conference on Machine Learning (ICML '03)},
Editor = {T. Fawcett and N. Mishra},
Author = {Andrew Moore Weng-Keen Wong},
Title = {Optimal Reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning}
}