Optimal Reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning (2003)
Tags
AD-trees, Astrostatistics, Bayesian Networks, Biosurveillance, Cached Sufficient Statistics, Efficient Statistical Algorithms, Optimization, Statistical Data Mining for Astrophysics, WSARE
Abstract
We show how a conceptually simple search operator called Optimal Reinsertion can be applied to learning Bayesian Network structure from data. On each step we pick a node called the target. We delete all arcs entering or exiting the target. We then find, subject to some constraints, the globally optimal combination of in-arcs and out-arcs with which to reinsert it. The heart of the paper is a new algorithm called ORSearch which allows each optimal reinsertion step to be computed efficiently on large datasets. Our empirical results compare Optimal Reinsertion against a highly tuned implementation of multi-restart hill climbing. The results typically show one to two orders of magnitude speed-up on a variety of datasets. They usually show better final results, both in terms of BDEU score and in modeling of future data drawn from the same distribution.
Datasets used in this paper are available here.
Full text
Download (application/pdf, 129.9 kB)
Approximate BibTeX Entry
@inproceedings{moore-optreinsert,
Howpublished = {Proceedings of the 20th International Conference on Machine Learning},
Month = {August},
Year = {2003},
Pages = {552-559},
Publisher = {AAAI Press},
Address = {Menlo Park, California},
Booktitle = {Proceedings of the 20th International Conference on Machine Learning (ICML '03)},
Editor = {T. Fawcett and N. Mishra},
Author = {Andrew Moore Weng-Keen Wong},
Title = {Optimal Reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning}
}