
Algebraic Topological Methods in Computer Science
July 7-11, 2008
Paris 7 Chevaleret
Paris, France

Organizers
Eric Goubault, Emmanuel Haucourt, Michel Hirschowitz, Sanjeevi Krishnan, Martin Raussen

Persistent homology and detection of disease clusters with arbitrary shapes
by
Andrew Blumberg
Stanford University
Coauthors: Michael Mandell and Gunnar Carlsson

Andrew J. Blumberg

Department of Mathematics, Stanford University, Stanford, CA 94305

Gunnar Carlsson

Department of Mathematics, Stanford University, Stanford, CA 94305

Michael A. Mandell

Department of Mathematics, Indiana University, Bloomington, IN  47405

May 22, 2008

Extended abstract

Detection of disease clusters is a problem of central importance in public health. The problem is to find clusters of disease cases which are "unusually dense", given the background population and some model of the baseline disease incidence. Such clusters can indicate developing disease outbreaks and hence locations for public health intervention. The standard method is the circular scan statistic, due to Kulldorff and collaborators, which applies a likelihood ratio test to circular subregions of the region under investigation. Variants of the circular scan statistic methodology are currently in use by public health departments in the United States.
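To make the likelihood ratio test over circular subregions concrete, here is a minimal sketch. It assumes the commonly used Poisson form of the scan statistic, in which the expected count inside a zone is proportional to its share of the population; the abstract does not say which model is used. The function names, the `(x, y, cases, population)` tuple layout, and the 50% population cap are illustrative assumptions.

```python
import math

def poisson_llr(cases_in, expected_in, cases_total):
    """Poisson log-likelihood ratio for one candidate zone.

    Assumes expected counts are scaled to sum to cases_total over the region.
    Zones without an excess of cases score 0.
    """
    c, e, C = cases_in, expected_in, cases_total
    if e <= 0 or e >= C or c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # The outside term vanishes when every case lies inside the zone.
    outside = 0.0 if c == C else (C - c) * math.log((C - c) / (C - e))
    return inside + outside

def circular_scan(sites):
    """Scan circles centered on each site; sites are (x, y, cases, population).

    Returns (best_llr, center_index, radius). Sites at equal distance from a
    center are added one at a time, which is a simplification.
    """
    C = sum(s[2] for s in sites)
    P = sum(s[3] for s in sites)
    best = (0.0, None, 0.0)
    for i, (xi, yi, _, _) in enumerate(sites):
        by_dist = sorted(sites, key=lambda s: math.hypot(s[0] - xi, s[1] - yi))
        c = pop = 0
        for x, y, cases, population in by_dist:
            c += cases
            pop += population
            if pop > P / 2:  # assumed cap: a zone holds at most half the population
                break
            llr = poisson_llr(c, C * pop / P, C)
            if llr > best[0]:
                best = (llr, i, math.hypot(x - xi, y - yi))
    return best
```

In practice the significance of the best-scoring circle is usually assessed by Monte Carlo replication under the null model, which this sketch omits.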

Recently, Wieland et al. introduced a new method for disease cluster detection based on Euclidean minimum spanning trees. This method corrects a central defect of the circular scan statistic, namely that it is optimized for detecting circular clusters. The minimum spanning tree method dramatically improves the detection of noncircular disease clusters. However, relating the natural test statistic in this method to the underlying statistical model of the disease distribution is difficult, and in particular certain standard statistical variants (e.g. covariates) are hard to handle. Also, a key step in the algorithm is extremely computationally demanding.
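For readers unfamiliar with the underlying structure, the following sketch computes a Euclidean minimum spanning tree with Prim's algorithm on the complete graph of pairwise distances. It illustrates only the tree construction, not the cluster detection procedure of Wieland et al.; the function name `euclidean_mst` is an illustrative assumption.

```python
import math

def euclidean_mst(points):
    """Prim's algorithm on the complete Euclidean graph, O(n^2) time.

    Returns a list of (length, i, j) tree edges, one per non-root point.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known link from each point to the tree
    parent = [-1] * n
    best[0] = 0.0          # start the tree at point 0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Update the cheapest links of the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges
```

Because the tree follows nearest-neighbor links rather than a fixed window shape, groups of points joined by short edges can be elongated or irregular.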

We introduce a new method for disease cluster detection based on persistent homology. Our algorithm is based on the observation that the "potential clusters" Wieland et al. derive from minimum spanning trees are in fact precisely persistent components. Thus, we proceed by computing persistent components and then computing a likelihood ratio for the associated simplicial complex. This geometric interpretation of the clustering algorithm permits extremely useful refinements, notably the incorporation of "density" parameters in terms of the presence of higher-dimensional simplices. This allows the algorithm to be tuned via a geometrically meaningful integer parameter (a dimension constraint) to bias towards dense, circular clusters.
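The link between spanning trees and persistent components can be sketched as follows. Adding pairwise edges in order of increasing length and tracking connected components with a union-find structure yields the 0-dimensional persistence of the distance filtration; the edges that merge two components are the ones Kruskal's algorithm would select for a minimum spanning tree. This is one reading of "persistent components", not the authors' implementation; the function name and the output format are illustrative assumptions, and the higher-dimensional density refinement is not shown.

```python
import math
from itertools import combinations

def persistent_components(points):
    """Track connected components as the distance threshold grows.

    Returns a list of (birth, death, members) for every component that
    exists at some scale; death is math.inf for the final component.
    """
    n = len(points)
    parent = list(range(n))
    members = {i: frozenset([i]) for i in range(n)}
    birth = {i: 0.0 for i in range(n)}  # every point is born at scale 0

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((math.dist(points[i], points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    out = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # edge lies inside one component; nothing merges
        # Both components end at scale d and their union is born there.
        out.append((birth[ri], d, members[ri]))
        out.append((birth[rj], d, members[rj]))
        parent[rj] = ri
        members[ri] = members[ri] | members.pop(rj)
        birth[ri] = d
    for r in members:
        out.append((birth[r], math.inf, members[r]))
    return out
```

Each returned member set is a candidate cluster of arbitrary shape to which a likelihood ratio could then be applied; components with a long interval between birth and death are the persistent ones. Tied distances can produce components whose birth and death coincide.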

Experimental results show that our new method detects disease clusters better than both the circular scan statistic and the Euclidean minimum spanning tree algorithm.

Date received: May 23, 2008


Copyright © 2008 by the author(s). The author(s) of this document and the organizers of the conference have granted their consent to include this abstract in Atlas Conferences Inc. Document # caxd-30.