Bayesian Networks for Lossless Dataset Compression (1999)
Abstract
The recent explosion in research on probabilistic data mining algorithms such as Bayesian networks has been focussed primarily on their use in diagnostics, prediction and efficient inference. In this paper, we examine the use of Bayesian networks for a different purpose: lossless compression of large datasets. We present algorithms for automatically learning Bayesian networks and new structures called "Huffman networks" that model statistical relationships in the datasets, and algorithms for using these models to then compress the datasets. These algorithms often achieve significantly better compression ratios than those achieved by common dictionary-based algorithms, such as those used by programs like ZIP.
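The intuition behind model-based compression can be illustrated with a minimal sketch. It is not the paper's algorithm: it only compares the ideal code length (the sum of -log2 p over all values) of a small table when every column is coded independently against the length when each column is coded given the previous one, as in a chain-shaped network. The function names `independent_bits` and `chain_bits` and the toy data are assumptions made for illustration, and the cost of storing the model itself is ignored.

```python
import math
from collections import Counter


def independent_bits(rows):
    """Ideal code length in bits when each column uses its own marginal distribution."""
    n = len(rows)
    total = 0.0
    for col in zip(*rows):
        counts = Counter(col)
        # Each occurrence of a value with empirical probability c/n costs -log2(c/n) bits.
        total += -sum(c * math.log2(c / n) for c in counts.values())
    return total


def chain_bits(rows):
    """Ideal code length in bits when column i is coded conditioned on column i-1."""
    n = len(rows)
    cols = list(zip(*rows))
    # The first column has no parent, so it is coded with its marginal distribution.
    first = Counter(cols[0])
    total = -sum(c * math.log2(c / n) for c in first.values())
    for parent, child in zip(cols, cols[1:]):
        pair_counts = Counter(zip(parent, child))
        parent_counts = Counter(parent)
        # Conditional probability p(child | parent) = count(parent, child) / count(parent).
        total += -sum(
            c * math.log2(c / parent_counts[p]) for (p, _), c in pair_counts.items()
        )
    return total


# Toy dataset (hypothetical): the second column always equals the first.
rows = [(0, 0), (0, 0), (1, 1), (1, 1)]
print(independent_bits(rows), chain_bits(rows))
```

On this toy table the second column is fully determined by the first, so conditioning on it removes that column's cost entirely. Real datasets have weaker dependencies, and a practical coder must also pay for transmitting the model.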
Approximate BibTeX Entry
@inproceedings{davies-bayesian,
Year = {1999},
Pages = {387--391},
Publisher = {AAAI Press},
Booktitle = {Proceedings of the Fifth International Conference on Knowledge Discovery in Databases},
Author = {Scott Davies and Andrew Moore},
Title = {Bayesian Networks for Lossless Dataset Compression}
}