Spatial clustering of the failure to geocode and its implications for the detection of disease clustering

Dale L. Zimmerman, Xiangming Fang, Soumya Mazumdar

Research output: Contribution to journal › Article › peer-review

21 Citations (Scopus)


Geocoding a study population as completely as possible is an important data assimilation component of many spatial epidemiologic studies. Unfortunately, complete geocoding is rare in practice. The failure of a substantial proportion of study subjects' addresses to geocode has consequences for spatial analyses, some of which are not yet fully understood. This article explicitly demonstrates that the failure to geocode can be spatially clustered, and it investigates the implications of this for the detection of disease clustering. A data set of more than 9000 ground-truthed addresses from Carroll County, Iowa, which was geocoded via a standard address matching and street interpolation algorithm, is used for this purpose. Through simulation of disease processes at these addresses, the authors show that spatial clustering of geocoding failure has no effect on the marginal power to detect spatial disease clustering if the likelihood of disease is independent of the failure to geocode, but that power is substantially reduced if disease likelihood and geocoding failure are positively associated.

Original language: English
Pages (from-to): 4254-4266
Number of pages: 13
Journal: Statistics in Medicine
Issue number: 21
Publication status: Published - 20 Sept 2008
Externally published: Yes

