Research Thrust
Rapid Detection of Emerging Patterns
Data mining algorithms developed at the Auton Lab have successfully detected newly emerging patterns in several domains, including health services, agriculture, and manufacturing and oil companies. Our algorithms are 10-1000 times faster than traditional techniques, and the results demonstrate significantly higher detection power with much lower false positive rates. We have applied these algorithms in semi-automated and fully automated modes, in supervised and unsupervised settings, and for both retrospective and prospective surveillance. Algorithms for rapid detection of emerging patterns include WSARE, Ultra Fast SSS, and TipMon.
Papers
| Name | Summary | Actions |
|---|---|---|
| A Bayesian scan statistic for spatial cluster detection | This paper develops a new Bayesian method for cluster detection, the "Bayesian spatial scan statistic," and compares this method to the standard (frequentist) scan statistic approach on the task of prospective disease surveillance. | show |
| A Bayesian spatial scan statistic | We propose a new Bayesian method for spatial cluster detection, the "Bayesian spatial scan statistic," and compare this method to the standard (frequentist) scan statistic approach. We demonstrate that the Bayesian statistic has several advantages over the frequentist approach, including increa... | show |
| A Fast Multi-Resolution Method for Detection of Significant Spatial Disease Clusters | Given an N x N grid of squares, where each square has a count and an underlying population, our goal is to find the square region with the highest density, and to calculate its significance by randomization. Any density measure D, dependent on the total count and total population of a region, ca... | show |
| A Fast Multi-Resolution Method for Detection of Significant Spatial Overdensities | Given an NxN grid of squares, where each square s_ij has count c_ij and an underlying population p_ij, our goal is to find the square region S with the highest density, and to calculate the significance of this region by Monte Carlo testing. Any density measure D, which depends on the total coun... | show |
| A Study into Detection of Bio-Events in Multiple Streams of Surveillance Data | This paper reviews the results of a study into combining evidence from multiple streams of surveillance data in order to improve timeliness and specificity of detection of bio-events. In the experiments we used three streams of real food- and agriculture-safety related data that is being routinel... | show |
| Bayesian Network Anomaly Pattern Detection for Disease Outbreaks | Early disease outbreak detection systems typically monitor health care data for irregularities by comparing the distribution of recent data against a baseline distribution. Determining the baseline is difficult due to the presence of different trends in health care data, such as trends caused by... | show |
| Detecting Anomalous Patterns in Pharmacy Retail Data | In this workshop, we present our biosurveillance system that is used to collect feedback data from public health officials monitoring spatial scan clusters in nationwide over-the-counter pharmacy sales. | show |
| Detecting Significant Multidimensional Spatial Clusters | Assume a uniform, multidimensional grid of bivariate data, where each cell of the grid has a count c_i and a baseline b_i. Our goal is to find spatial regions (d-dimensional rectangles) where the c_i are significantly higher than expected given b_i. We focus on two applications: detection of clu... | show |
| Efficient Algorithms for Non-Parametric Clustering with Clutter | Detecting and counting overdensities in data is a common problem in the physical and geographic sciences. One of the most successful of recent algorithms for the counting version of the problem was introduced by Cuevas, Febrero and Fraiman [Cuevas et al., 2000], which will be referred to as the ... | show |
| Efficient Analytics for Effective Monitoring of Biomedical Security | Robin Sabhnani, Daniel B. Neill, Andrew W. Moore, Artur W. Dubrawski, Weng-Keen Wong | show |
| Monitoring Food Safety by Detecting Patterns in Consumer Complaints | EPFC (Emerging Patterns in Food Complaints) is the analytical component of the Consumer Complaint Monitoring System, designed to help the food safety officials to efficiently and effectively monitor incoming reports of adverse effects of food on its consumers. These reports, collected in a passi... | show |
| Optimal Reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning | We show how a conceptually simple search operator called Optimal Reinsertion can be applied to learning Bayesian Network structure from data. On each step we pick a node called the target. We delete all arcs entering or exiting the target. We then find, subject to some constraints, the globally ... | show |
| Rapid Detection of Significant Spatial Clusters | Given an N x N grid of squares, where each square has a count c_{ij} and an underlying population p_{ij}, our goal is to find the rectangular region with the highest density, and to calculate its significance by randomization. An arbitrary density function D, dependent on a region's total co... | show |
| Rule-based Anomaly Pattern Detection for Detecting Disease Outbreaks | This paper presents an algorithm for performing early detection of disease outbreaks by searching a database of emergency department cases for anomalous patterns. Traditional techniques for anomaly detection are unsatisfactory for this problem because they identify individual data points that ar... | show |
| Summary of Biosurveillance-relevant statistical and data mining technologies | This short report very briefly surveys a spectrum of technologies from statistics, computer science and data mining that can help with Biosurveillance. We indicate which we have chosen, so far, to use in our development of analysis methods and our informal reasoning. | show |
| What's Strange About Recent Events | This paper, which is a shortened version of (Wong et al. 2002), presents an algorithm for performing early detection of disease outbreaks by searching a database of emergency department cases for anomalous patterns. Traditional techniques for anomaly detection are unsatisfactory for this problem... | show |
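Several of the summaries above describe the same core task: given an N x N grid where each cell has a count and an underlying population, find the square region with the highest density and assess its significance by randomization. The sketch below illustrates that task with a naive exhaustive search. It is not the fast multi-resolution method from the listed papers. The function names, the choice of count divided by population as the density measure D, and the null model (cases scattered in proportion to population) are all assumptions made for illustration.

```python
import random


def prefix_sums(grid):
    """2-D prefix sums so that any square's total can be read in O(1)."""
    n = len(grid)
    ps = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            ps[i + 1][j + 1] = (grid[i][j] + ps[i][j + 1]
                                + ps[i + 1][j] - ps[i][j])
    return ps


def square_sum(ps, i, j, k):
    """Total over the k x k square whose top-left cell is (i, j)."""
    return ps[i + k][j + k] - ps[i][j + k] - ps[i + k][j] + ps[i][j]


def densest_square(counts, pops):
    """Exhaustively search all squares; return (density, (row, col, size)).

    Density here is total count / total population (an assumed choice of D).
    """
    n = len(counts)
    cps, pps = prefix_sums(counts), prefix_sums(pops)
    best_density, best_region = -1.0, None
    for k in range(1, n + 1):
        for i in range(n - k + 1):
            for j in range(n - k + 1):
                pop = square_sum(pps, i, j, k)
                if pop > 0:
                    density = square_sum(cps, i, j, k) / pop
                    if density > best_density:
                        best_density, best_region = density, (i, j, k)
    return best_density, best_region


def randomization_test(counts, pops, replicas=99, seed=0):
    """Estimate a p-value for the densest square by Monte Carlo replication.

    Assumed null model: the observed number of cases is redistributed over
    cells with probability proportional to each cell's population.
    """
    rng = random.Random(seed)
    n = len(counts)
    observed, region = densest_square(counts, pops)
    total = sum(sum(row) for row in counts)
    cells = [(i, j) for i in range(n) for j in range(n)]
    weights = [pops[i][j] for i, j in cells]
    at_least_as_dense = 0
    for _ in range(replicas):
        sim = [[0] * n for _ in range(n)]
        for i, j in rng.choices(cells, weights=weights, k=total):
            sim[i][j] += 1
        if densest_square(sim, pops)[0] >= observed:
            at_least_as_dense += 1
    p_value = (at_least_as_dense + 1) / (replicas + 1)
    return observed, region, p_value


if __name__ == "__main__":
    # Toy data: uniform population, one cell with an elevated count.
    pops = [[10] * 4 for _ in range(4)]
    counts = [[1] * 4 for _ in range(4)]
    counts[1][2] = 10
    print(randomization_test(counts, pops))
```

The exhaustive search examines on the order of N^3 squares per replica, which is the cost that faster search methods aim to avoid on large grids.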