autonlab.org

Optimal Reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning (2003)

Andrew Moore, Weng-Keen Wong

Tags

AD-trees, Astrostatistics, Bayesian Networks, Biosurveillance, Cached Sufficient Statistics, Efficient Statistical Algorithms, Optimization, Statistical Data Mining for Astrophysics, WSARE

Abstract

We show how a conceptually simple search operator called Optimal Reinsertion can be applied to learning Bayesian Network structure from data. On each step we pick a node called the target. We delete all arcs entering or exiting the target. We then find, subject to some constraints, the globally optimal combination of in-arcs and out-arcs with which to reinsert it. The heart of the paper is a new algorithm called ORSearch which allows each optimal reinsertion step to be computed efficiently on large datasets. Our empirical results compare Optimal Reinsertion against a highly tuned implementation of multi-restart hill climbing. The results typically show one to two orders of magnitude speed-up on a variety of datasets. They usually show better final results, both in terms of BDEU score and in modeling of future data drawn from the same distribution.

Full text

Download (application/pdf, 129.9 kB)

Datasets

Datasets used in this paper are available here.

Approximate BibTeX Entry

@inproceedings{moore-optreinsert,
    Howpublished = {Proceedings of the 20th International Conference on Machine Learning},
    Month = {August},
    Year = {2003},
    Pages = {552-559},
    Publisher = {AAAI Press},
    Address = {Menlo Park, California},
    Booktitle = {Proceedings of the 20th International Conference on Machine Learning (ICML '03)},
    Editor = {T. Fawcett and N. Mishra},
    Author = { Andrew Moore, Weng-Keen Wong },
    Title = {Optimal Reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning}
}

Copyright 2010, Carnegie Mellon University, Auton Lab. All Rights Reserved.