<html>
<body>
<h2>What is Fast Spatial Scan?</h2>
<p><tt>Fast Spatial Scan automatically detects anomalous spatial clusters,
efficiently and effectively. Given a massive set of spatial or
space-time count data (e.g. the number of reported disease cases in each zip
code on each day), it searches through the dataset to find spatial regions with
higher than expected counts. This process has two steps: first, it infers the
expected count for each spatial location using time series analysis; then it uses
an expectation-based spatial scan statistic approach to find spatial regions
where the counts are significantly higher than expected. Randomization testing
is performed to compute the statistical significance of each discovered cluster,
enabling us to distinguish true clusters from those due to chance.</tt></p>
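<p><tt>The two steps above can be illustrated with a minimal sketch. It is not
the Fast Spatial Scan implementation: it assumes a plain historical mean as the
time series estimate, a Poisson model for the counts (giving the
expectation-based score C log(C/B) + B - C when the count C exceeds the baseline
B, and zero otherwise), and an exhaustive search over axis-aligned rectangles on
a tiny grid. The function names and the example data are hypothetical.</tt></p>
<pre>
```python
import math

def expected_counts(history):
    """Baseline grid: each cell's expected count is the mean of its past counts."""
    return [[sum(past) / len(past) for past in row] for row in history]

def ebp_score(count, baseline):
    """Expectation-based Poisson score; zero unless the count exceeds the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    if count <= baseline:
        return 0.0
    return count * math.log(count / baseline) + baseline - count

def best_rectangle(counts, baselines):
    """Score every axis-aligned rectangle (r1, c1, r2, c2) and return the best one."""
    rows, cols = len(counts), len(counts[0])
    best_score, best_region = 0.0, None
    for r1 in range(rows):
        for r2 in range(r1, rows):
            for c1 in range(cols):
                for c2 in range(c1, cols):
                    cells = [(r, c) for r in range(r1, r2 + 1)
                             for c in range(c1, c2 + 1)]
                    count = sum(counts[r][c] for r, c in cells)
                    baseline = sum(baselines[r][c] for r, c in cells)
                    score = ebp_score(count, baseline)
                    if score > best_score:
                        best_score, best_region = score, (r1, c1, r2, c2)
    return best_score, best_region

# Hypothetical 3x3 grid: four past days per cell, then today's counts.
history = [[[9, 10, 11, 10] for _ in range(3)] for _ in range(3)]
today = [[10, 10, 10], [10, 25, 22], [10, 10, 10]]
score, region = best_rectangle(today, expected_counts(history))
print(region, round(score, 2))
```
</pre>
<p><tt>On this toy data the highest-scoring region is the pair of adjacent
elevated cells, (1, 1, 1, 2). The exhaustive loop is only practical for very
small grids, which is the cost the fast search is designed to avoid.</tt></p>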
<h2>What's special about Fast Spatial Scan?</h2>
<p><tt>Spatial scan statistics are a powerful method for detecting statistically
significant spatial clusters. Unlike many other cluster detection methods, they
can be used both to determine whether any statistically significant clusters
exist and to precisely pinpoint the size and location of clusters. Because the
statistic scans over a huge number of regions of variable shape and size (and
each region can contain between one and many locations), it has high power to
detect clusters regardless of whether they affect a small or large spatial
area. Our statistical test correctly adjusts for the multiplicity of tests
performed, enabling us to ensure a low false positive rate while maintaining
high power to detect any significant clusters that do occur.</tt></p>
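<p><tt>One common way to adjust for the multiplicity of regions is to compare
the observed maximum score with the maximum score found in replica datasets
generated under the null hypothesis. The sketch below is an illustration under
stated assumptions rather than the software's actual procedure: it assumes
Poisson counts with mean equal to the baseline, uses contiguous runs of
locations on a line as a stand-in for spatial regions, and draws Poisson samples
with Knuth's multiplication method. All names are hypothetical.</tt></p>
<pre>
```python
import math
import random

def ebp_score(count, baseline):
    """Expectation-based Poisson score; zero unless the count exceeds the baseline."""
    if count <= baseline:
        return 0.0
    return count * math.log(count / baseline) + baseline - count

def max_interval_score(counts, baselines):
    """Highest score over all contiguous runs of locations (1-D stand-in for regions)."""
    best = 0.0
    for start in range(len(counts)):
        count = baseline = 0.0
        for end in range(start, len(counts)):
            count += counts[end]
            baseline += baselines[end]
            best = max(best, ebp_score(count, baseline))
    return best

def poisson_draw(rng, mean):
    """Knuth's method: multiply uniforms until the product drops to exp(-mean)."""
    limit, k, product = math.exp(-mean), 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return k
        k += 1

def randomization_p_value(counts, baselines, replicas=199, seed=0):
    """Share of null replicas whose best region scores at least the observed best."""
    rng = random.Random(seed)
    observed = max_interval_score(counts, baselines)
    beat = 0
    for _ in range(replicas):
        simulated = [poisson_draw(rng, b) for b in baselines]
        if max_interval_score(simulated, baselines) >= observed:
            beat += 1
    return (beat + 1) / (replicas + 1)

baselines = [10.0] * 8
counts = [10, 9, 11, 30, 28, 10, 12, 9]  # hypothetical counts with a two-location spike
print(randomization_p_value(counts, baselines))
```
</pre>
<p><tt>Because each replica contributes only its single best score, the
resulting p-value already accounts for the number of regions searched, so no
separate per-region correction is applied.</tt></p>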
<p><tt>Our new implementation of spatial scan statistics has several advantages
over standard spatial scan approaches (e.g. SaTScan). First, we use novel
spatial statistical methods to adjust for spatial and temporal variation in the
baseline counts, allowing us to account correctly for day of week, seasonality,
and other trends. This improves detection power, allowing more timely detection
of emerging clusters with fewer false positives. Second, we have developed a new
computational method, the “fast spatial scan.” This fast multi-resolution search
approach allows us to compute the spatial scan hundreds to thousands of times
faster than the standard approach. Thus we can obtain results in minutes rather
than hours or days, even for massive datasets containing millions of
records.</tt></p>
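<p><tt>To see why adjusting the baseline for day of week matters, consider a
deliberately simple estimate that averages only past counts from the same
weekday. This is an illustrative assumption and not the statistical method used
by the software; the function name and the sample series are hypothetical.</tt></p>
<pre>
```python
def day_of_week_baseline(series, period=7):
    """Expected count for the next day: mean of past counts on the same weekday."""
    next_index = len(series)
    same_day = [series[i] for i in range(next_index - period, -1, -period)]
    if not same_day:
        raise ValueError("need at least one full period of history")
    return sum(same_day) / len(same_day)

# Hypothetical location: busy on five days of the week, quiet on the other two.
week = [20, 20, 20, 20, 20, 5, 5]
history = (week * 3)[:19]          # the next day falls on a quiet day
adjusted = day_of_week_baseline(history)
unadjusted = sum(history) / len(history)
print(adjusted, round(unadjusted, 1))
```
</pre>
<p><tt>With the unadjusted mean, an ordinary busy day would look elevated and a
genuinely unusual quiet day could be missed; the same-weekday estimate avoids
both problems on this toy series.</tt></p>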
<p><tt>Speed and performance results (click a thumbnail below to view the
full-size image):</tt></p>
<table width="284" class="borderless">
<tbody>
<tr>
<td>
<p>
<a href="daisy:16639"><img width="141" height="104" src="daisy:16635"/></a>
</p>
</td>
<td>
<p><a href="daisy:16638"><img width="151" height="105" src="daisy:16637"/></a>
</p>
</td>
</tr>
</tbody>
</table>
<h2>What type of problem can be solved by Fast Spatial Scan?</h2>
<p><tt>Fast Spatial Scan can find anomalous spatial clusters in spatial or space-time
data sets. In particular, given a large set of spatial locations (e.g. zip
codes), where each location has an associated time series of counts, it can
detect any spatial regions where the most recent counts are significantly higher
than expected, given the historical baseline data. For example, if we are given
the number of emergency department visits in each zip code on each day, it can
find areas where the recent number of cases is abnormally high, which may be
indicative of an emerging outbreak of disease.</tt></p>
<h2>Fast Spatial Scan in action</h2>
<p><strong><em>Retrospective analysis of Walkerton outbreak</em></strong></p>
<p><tt>In May 2000, an outbreak of gastroenteritis in Walkerton, Ontario
resulted from contamination of the water supply with <em>E. coli</em>
bacteria. Over 2000 individuals were affected by severe gastrointestinal
symptoms, including 65 hospitalizations and 6 deaths. We used the fast spatial
scan software to perform a retrospective analysis of emergency department visits
in Walkerton and the surrounding Grey-Bruce region of Ontario between 1999 and
2001. At a rate of only two false positives per year, the software was able to
detect the outbreak on May 19, 2000, two days before the first public health
response and one day before the other surveillance methods tested. </tt></p>
<p><strong><em>Nationwide monitoring of over-the-counter drug
sales</em></strong></p>
<p><tt>We are currently using the fast spatial scan tool to perform daily monitoring
of over-the-counter medication sales from the National Retail Data Monitor
(NRDM). Our system receives daily counts of the number of units sold in 18
different product categories (cough remedies, nasal decongestants, etc.) from
over 20,000 retail stores and pharmacies nationwide. It then uses our new
spatial cluster detection methods to find areas where the sales are
significantly higher than expected, and makes these results available to state
and local public health officials via a web-based graphical interface.</tt></p>
<p><tt>The National Retail Data Monitor interface (see the KDD 2005 paper for
details of the deployment):</tt></p>
<table width="193">
<tbody>
<tr>
<td><img width="183" height="114" src="daisy:16640"/></td>
</tr>
</tbody>
</table>
<h2>Representative Publications</h2>
<p><tt>Methods for detecting spatial and spatio-temporal clusters. In M. Wagner,
A. Moore, and R. Aryel, eds., Handbook of Biosurveillance, 2006<br/>
Efficient scan statistic computations. In A. Lawson and K. Kleinman, eds.,
Spatial and Syndromic Surveillance for Public Health, 2005<br/>
A Bayesian spatial scan statistic. In Advances in Neural Information Processing
Systems 18, 2006, in press<br/>
A Bayesian scan statistic for spatial cluster detection. Proceedings of the
National Syndromic Surveillance Conference, 2005. Received “Best Research
Presentation” award<br/>
<a href="http://www.cs.cmu.edu/~neill/papers/sss-kdd05.pdf">Detection of
emerging space-time clusters</a> (KDD 2005)<br/>
<a href="http://www.autonlab.org/autonweb/15867.html">Anomalous spatial cluster
detection</a> (KDD 2005)<br/>
<a href="http://www.autonlab.org/autonweb/15868.html">Detecting anomalous
patterns in pharmacy retail data</a> (KDD 2005)<br/>
<a href="http://www.autonlab.org/autonweb/14675.html">Detecting significant
multidimensional spatial clusters</a> (NIPS 2004)<br/>
<a href="http://www.autonlab.org/autonweb/14667.html">Rapid detection of
significant spatial clusters</a> (KDD 2004)</tt></p>
</body>
</html>