This is to certify that the
thesis entitled

CHILDHOOD LEAD POISONING IN MICHIGAN:
SPATIAL ANALYSES OF THE DISTRIBUTION OF AND
FACTORS RELATING TO COMMUNITY ELEVATED BLOOD
LEAD LEVELS

presented by

ERIC SANDBERG

has been accepted towards fulfillment
of the requirements for the

M.S. degree in Geography

 

 


CHILDHOOD LEAD POISONING IN MICHIGAN:
SPATIAL ANALYSES OF THE DISTRIBUTION OF AND FACTORS RELATING
TO COMMUNITY ELEVATED BLOOD LEAD LEVELS
By

Eric Allen Sandberg

A THESIS

Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of

MASTER OF SCIENCE
Geography

2008

ABSTRACT
CHILDHOOD LEAD POISONING IN MICHIGAN:
SPATIAL ANALYSES OF THE DISTRIBUTION OF AND FACTORS RELATING
TO COMMUNITY ELEVATED BLOOD LEAD LEVELS
By
Eric Allen Sandberg
Lead poisoning, defined by the Centers for Disease Control as a blood lead level equal to
or greater than ten micrograms per deciliter of blood, afflicts children in Michigan at a
higher rate than the national average. The primary, though not exclusive, source of
exposure is lead-based paint in housing that predates the 1978 ban on this product. Since
lead exposure causes permanent neural damage and lead is difficult to extract from the body,
primary prevention by removing the hazards is the only solution to this problem. This
thesis uses point-based clustering and regression techniques to examine the spatial
patterns and characteristics of childhood blood lead levels in Michigan. The Michigan
Lead Database results of blood lead tests from 1998 to 2005 are employed for this
objective. Only children insured by Medicaid, who make up a majority of the database and
are typically at higher risk of lead poisoning, are included in this thesis. Results
indicate that inner-city children in Michigan suffer the most from lead exposure.
Regression analysis reveals that older housing within an area is the best predictor of
mean blood lead levels. Spatial techniques used in this thesis have the potential to
greatly enhance primary prevention efforts.

Copyright by
ERIC ALLEN SANDBERG
2008

This thesis is dedicated to my family


ACKNOWLEDGEMENTS

I wish to express my thanks to Dr. Joseph Messina, who has mentored and
advised me throughout my time in graduate school and welcomed me warmly to
Michigan and MSU.

I also wish to express my gratitude towards Dr. Sue Grady, who has warmly
advised and encouraged me during the past two years, and Dr. Stan Kaplowitz, who has
provided invaluable help, advice, and encouragement in pursuing research into lead
poisoning.

I wish to recognize and thank Dr. Ashton Shortridge, who provided much of the
instruction and technical assistance in running the methods of this thesis.

I wish to acknowledge Mark Finn, Ivan Ramirez, Annalie Campos, and Lindsey
Campbell for their assistance.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

1 Introduction
1.1 Introduction
1.1.1 Purpose of Study
1.2 Literature Review
1.2.1 Lead Uses and Consequent Problems
1.2.2 Research, Industry, and Public Policy
1.2.3 Geographic Studies of Lead
1.2.4 Theoretical Basis and Hypothesis

2 Data and Methods
2.1 Data
2.1.1 Michigan Lead Database
2.1.2 United States Census
2.2 Methods
2.2.1 Clustering
2.2.2 Geographically Weighted Regression

3 Results
3.1 Clustering Results
3.1.1 South Detroit
3.1.2 North Detroit
3.1.3 Southeast Michigan
3.1.4 Flint
3.1.5 Genesee
3.1.6 Lansing
3.1.7 Mid-South
3.1.8 Battle Creek
3.1.9 Kalamazoo
3.1.10 Southwest
3.1.11 Grand Rapids
3.1.12 Lower Coast
3.1.13 Mid Coast
3.1.14 Saginaw/Bay City
3.1.15 West Bay
3.1.16 East Bay
3.1.17 North Central
3.1.18 Eastern Upper Peninsula
3.1.19 Western Upper Peninsula
3.2 Geographically Weighted Regression Results
3.2.1 Minor Civil Division
3.2.2 Zip Code
3.2.3 Tract

4 Conclusions
4.1 Overview
4.2 Discussion of Results
4.2.1 Clustering
4.2.2 Geographically Weighted Regression
4.2.3 Research Questions
4.3 Future Research

Appendix 1: Michigan Statewide Lead Testing/Lead Screening Plan
Appendix 2: Difference of K code in R
Appendix 3: Geographic Analysis Machine code in R
Literature Cited

LIST OF TABLES

Table 1: Summary of previous geographic studies of lead poisoning
Table 2: Regression results from earlier studies. Columns are author, independent variable, whether the coefficient is positive or negative, and the p-value
Table 3: Continuation of Table 2 showing regression results from earlier studies
Table 4: Example highlighting the changes between the original BLL database and the database used in this thesis
Table 5: Cuzick-Edwards results for South Detroit
Table 6: Cuzick-Edwards results for North Detroit
Table 7: Cuzick-Edwards results for Southeast Michigan
Table 8: Cuzick-Edwards results for Flint
Table 9: Cuzick-Edwards results for Genesee
Table 10: Cuzick-Edwards results for Lansing
Table 11: Cuzick-Edwards results for Mid-South
Table 12: Cuzick-Edwards results for Battle Creek
Table 13: Cuzick-Edwards results for Kalamazoo
Table 14: Cuzick-Edwards results for Southwest Michigan
Table 15: Cuzick-Edwards results for Grand Rapids
Table 16: Cuzick-Edwards results for the Lower Coast
Table 17: Cuzick-Edwards results for the Mid-Coast
Table 18: Cuzick-Edwards results for Saginaw/Bay City
Table 19: Cuzick-Edwards results for West Bay
Table 20: Cuzick-Edwards results for East Bay
Table 21: Cuzick-Edwards results for North Central
Table 22: Cuzick-Edwards results for Eastern Upper Peninsula
Table 23: Cuzick-Edwards results for Western Upper Peninsula
Table 24: Independent variables tested by regression analysis
Table 25: Yearly global regression results for minor civil divisions. Light blue represents a significant variable (α = 0.05)
Table 26: GWR regression results for minor civil division all years mean BLL
Table 27: Yearly global regression results for zip codes. Light blue represents a significant variable (α = 0.05)
Table 28: GWR regression results for zip code all years mean BLL
Table 29: Yearly global regression results for census tracts. Light blue represents a significant variable (α = 0.05)
Table 30: GWR regression results for census tract all years mean BLL

LIST OF FIGURES

Figure 1: Reference map of Michigan
Figure 2: Timeline of events relating to lead poisoning. Legislation is marked in blue, business and industry in orange, and research in green.
Figure 3: Map of zip codes deemed “high risk” by CDC standards
Figure 4: The human ecology triangle
Figure 5: Percentage of children under six years of age tested for lead. All test results for Michigan counties and Detroit included.
Figure 6: Descriptive statistics of the thesis lead database. Note that elevated means above 10 ug/dL and numbers are for Medicaid insured children.
Figure 7: Migration of MSU database to GIS-utilizable .dbf format
Figure 8: The geographic coordinates were geocoded to a point vector data set through use of the MCGI state boundary vector data set
Figure 9: Schemata of the transfer of census variables to vector data sets
Figure 10: Study areas identified for the clustering techniques. Areas based on HSA boundaries are outlined with black and labeled in bold, while areas based on urban boundaries are outlined in blue and labeled in italics
Figure 11: Example of Cuzick-Edwards statistic based on one nearest neighbor
Figure 12: Ripley’s K function with circles of distance h around event i. Clustering of events is present within four circles around event i
Figure 13: Method for obtaining difference of K values for each year at case/control thresholds of 5, 10, and 25 ug/dL.
Figure 14: Method in R for creating GAM maps
Figure 15: Map of the South Detroit study region
Figure 16: The 2005 South Detroit difference of K graph for the 10 ug/dL threshold
Figure 17: The 2004 GAM map of South Detroit for the 5 ug/dL threshold
Figure 18: Map of the North Detroit study region
Figure 19: The 2003 North Detroit difference of K graph for the 5 ug/dL threshold
Figure 20: The 1999 GAM map of North Detroit for the 10 ug/dL threshold
Figure 21: Map of the Southeast Michigan study region
Figure 22: The 2004 Southeast Michigan difference of K graph for the 5 ug/dL threshold
Figure 23: The 1998 GAM map of Southeast Michigan for the 10 ug/dL threshold
Figure 24: Map of the Flint study region
Figure 25: The 2003 Flint difference of K graph for the 10 ug/dL threshold
Figure 26: The 1998 GAM map of Flint for the 10 ug/dL threshold
Figure 27: Map of the Genesee study region
Figure 28: The 2002 Genesee difference of K graph for the 5 ug/dL threshold
Figure 29: The 2001 GAM map of Genesee for the 5 ug/dL threshold
Figure 30: Map of the Lansing study region
Figure 31: The 2000 Lansing difference of K graph for the 10 ug/dL threshold
Figure 32: The 1998 GAM map of Lansing for the 5 ug/dL threshold
Figure 33: Map of the Mid-South study region
Figure 34: The 1999 Mid-South difference of K graph for the 5 ug/dL threshold
Figure 35: The 2005 GAM map of the Mid-South for the 5 ug/dL threshold
Figure 36: Map of the Battle Creek study area
Figure 37: The 2001 Battle Creek difference of K graph for the 5 ug/dL threshold
Figure 38: The 1999 GAM map of Battle Creek for the 10 ug/dL threshold
Figure 39: Map of the Kalamazoo study area
Figure 40: The 2000 Kalamazoo difference of K graph for the 10 ug/dL threshold
Figure 41: The 2001 GAM map of Kalamazoo for the 5 ug/dL threshold
Figure 42: Map of the Southwest study area
Figure 43: The 1998 Southwest Michigan difference of K graph for the 25 ug/dL threshold
Figure 44: The 1999 GAM map of Southwest Michigan for the 25 ug/dL threshold. Other study regions outlined in white
Figure 45: Map of the Grand Rapids study region
Figure 46: The 2003 Grand Rapids difference of K graph for the 10 ug/dL threshold
Figure 47: The 2001 GAM map of Grand Rapids for the 5 ug/dL threshold
Figure 48: Map of the Lower Coast study region
Figure 49: The 2000 Lower Coast difference of K graph for the 10 ug/dL threshold
Figure 50: The 2002 GAM map of Lower Coast for the 10 ug/dL threshold
Figure 51: Map of the Mid Coast study region
Figure 52: The 1998 Mid-Coast difference of K graph for the 5 ug/dL threshold
Figure 53: The 2000 GAM map of Mid-Coast for the 5 ug/dL threshold
Figure 54: Map of the Saginaw/Bay City study region
Figure 55: The 2004 Saginaw/Bay City difference of K graph for the 10 ug/dL threshold
Figure 56: The 2001 GAM map of Saginaw/Bay City for the 5 ug/dL threshold
Figure 57: Map of the West Bay study region
Figure 58: The 1998 West Bay difference of K graph for the 5 ug/dL threshold
Figure 59: The 2003 GAM map of West Bay for the 5 ug/dL threshold
Figure 60: Map of the East Bay study region
Figure 61: The 1998 East Bay difference of K graph for the 5 ug/dL threshold
Figure 62: The 1999 GAM map of East Bay for the 5 ug/dL threshold
Figure 63: Map of the North Central study region
Figure 64: The 1998 North Central difference of K graph for the 5 ug/dL threshold
Figure 65: The 2004 GAM map of North Central for the 5 ug/dL threshold
Figure 66: Map of the Eastern Upper Peninsula study region
Figure 67: The 1998 Eastern Upper Peninsula difference of K graph for the 5 ug/dL threshold
Figure 68: The 2000 GAM map of Eastern Upper Peninsula for the 5 ug/dL threshold
Figure 69: Map of the Western Upper Peninsula study region
Figure 70: The 2000 Western Upper Peninsula difference of K graph for the 5 ug/dL threshold
Figure 71: The 1999 GAM map of Western Upper Peninsula for the 5 ug/dL threshold
Figure 72: Map of the minor civil division standard deviations of yearly mean BLL
Figure 73: Map of mean BLL by minor civil division and all years global regression results
Figure 74: Map of the R-squared for the minor civil division GWR model
Figure 75: Map of the coefficients from the minor civil division GWR model for pre-1940 housing
Figure 76: Map of zip code standard deviations of the yearly mean BLL
Figure 77: Map of mean BLL by zip code and all years global regression results
Figure 78: Map of the R-squared for the zip code GWR model
Figure 79: Map of the coefficients from the zip code GWR model for percentage under 6 years of age
Figure 80: Map of census tract standard deviations of yearly mean BLL
Figure 81: Map of mean BLL by census tract and all years global regression results
Figure 82: Map of the R-squared from the census tract GWR model
Figure 83: Map of the coefficients from the census tract GWR model for pre-1940 housing
Figure 84: Map of the coefficients from the census tract GWR model for percentage African-American
Figure 85: Map of the coefficients from the census tract GWR model for percentage Vacant Houses

Images in this thesis are presented in color.


LIST OF ABBREVIATIONS

ug/dL: Micrograms per Deciliter
BLL: Blood Lead Level
CDC: Centers for Disease Control and Prevention
CLPPP: Childhood Lead Poisoning Prevention Program
EPA: Environmental Protection Agency
FDA: Food and Drug Administration
GAM: Geographic Analysis Machine
GIS: Geographic Information Systems
GWR: Geographically Weighted Regression
HSA: Health Systems Agencies
LBPPPA: Lead-Based Paint Poisoning Prevention Act
LIA: Lead Industries Association
MCD: Minor Civil Division
MCGI: Michigan Center for Geographic Information
MDCH: Michigan Department of Community Health
OLS: Ordinary Least Squares Regression
PCA: Principal Components Analysis
TEL: Tetraethyl Lead


1 Introduction

1.1 Introduction

Lead has adversely affected humans for thousands of years (Bellinger and
Schwartz 1997). Though the harmful effects of lead were recognized in antiquity, it has
continued to be used in many manufactured items. Recent events such as the lead paint
found in Chinese-manufactured toys emphasize the risk which still exists from products
found on store shelves (Barboza 2007). But the greatest hazards from lead are from the
vestiges of an earlier time period when lead was commonly used in house paint and
gasoline. Many people still suffer needlessly from the effects of lead particle inhalation
or ingestion within their homes and neighborhoods. Children suffer the most because of
the small size of their bodies, and their behaviors put them at greater risk (Centers for
Disease Control and Prevention 2005a). The children who are insured by Medicaid, a
government-funded health care coverage program for low-income individuals and
families, are known to typically have higher blood lead levels than the general population
(Kemper et al. 2005a). Thus all children on Medicaid are required by law to be tested by
two years of age, and others are encouraged to be tested during a health visit (Michigan
Department of Community Health 2001). Michigan is sixth in the nation for percentage
of children with elevated blood lead levels (Task Force to Eliminate Childhood Lead
Poisoning 2004). Indications are that the distribution of children with high blood lead
levels (BLL) in Michigan is not random, but is associated with historical patterns of
development and current place-based socio-demographic and economic characteristics
(Frost 2004). This research focuses on exploring the spatial distribution of BLL in
children in the State of Michigan (Figure 1), emphasizing the patterns observed and the
common socio-demographic and economic characteristics associated with them.

Figure 1: Reference map of Michigan

Michigan children have historically had higher BLL than the national average,
stemming from a variety of risk factors. Heavy industrialization throughout the late 19th
and early 20th centuries caused atmospheric lead deposition in the state from the
combustion of coal and leaded gasoline from cars (Yohn et al. 2004). In many urban
areas in Michigan and throughout the United States, soil depositions from leaded gasoline
(1929-1986) created a large persistent reservoir of lead (Mielke 1999). This input is
frequently coupled with lead house paint, both interior and exterior. Though lead paint
was banned from use in residential homes in 1978, an estimated 64 million homes in the
United States still contain layers of lead-based paint (Jacobs et al. 2002). Children living
in states with older housing are at greater risk of lead poisoning because lead paint chips
are often in or around the outside of the house. The chips and dust of lead can amass in
areas of the house, accessible for children to inhale. According to the U.S. Census
Bureau, nearly three-fourths of Michigan houses were built during or before the 1970s
(US Census Bureau 2001). While many substantial sources of lead such as leaded paint
and gasoline are no longer in production, used lead is environmentally stable and
continues to be a hazard to which Michigan children could be exposed.

With the threat of lead to children firmly established, Governor Jennifer
Granholm (2002 - present) recently created a task force to lead “a statewide effort to
successfully address the goal of the elimination of childhood lead poisoning in Michigan
by 2010” (Task Force to Eliminate Childhood Lead Poisoning 2004). In 1997,
regulations were put into place that required Michigan laboratories to report the results of
all blood lead tests to the Michigan Department of Community Health (MDCH),
replacing the voluntary reporting set up in 1992 (Michigan Department of Community
Health 2005a). Within MDCH, the Childhood Lead Poisoning Prevention Program
(CLPPP) coordinates lead-related activities. The results are received by CLPPP,
reviewed for data entry errors, and put into the statewide child lead database. CLPPP
then relays results of children with elevated BLL to the local health departments, so they
can target homes and neighborhoods for environmental remediation.

Since Michigan’s push for the elimination of lead poisoning began, there have
been positive developments. The percentage of children in Michigan with elevated BLL
(>= 10 ug/dL) decreased from 9.7% (n = 7,100 out of 73,643 tests) of those tested in
1998 to 2.3% (n = 3,137 out of 132,913 tests) of children tested in 2005, possibly
indicating CLPPP methods have been successful (Michigan Department of Community
Health 2005a). New legislation passed by the Michigan Legislature in 2004 sanctions
testing of more children within the state, including ensuring follow-up tests for children
with elevated BLL results and faster reporting by labs to CLPPP.

Unfortunately, progress has begun to stall on some fronts. Recent budget
challenges within Michigan have put state funds for lead poisoning prevention in
jeopardy (Lam 2007). The result is that less money will be available to local health
departments for environmental testing and removal (remediation) of environmental lead
sources. A recent survey of health officers from local health departments throughout
Michigan found that 74% of the respondents reported that lead poisoning was not
adequately addressed in their health district (Kemper, Uren, and Hudson 2007). At the
same time that funding for lead programs is being cut, new medical and epidemiological
research has found that children with BLL lower than the 10 ug/dL cutoff point
considered elevated by the Centers for Disease Control and Prevention (CDC) suffer
damaging effects (Lanphear et al. 2005b; Finkelstein, Markowitz, and Rosen 1998;
Canfield et al. 2003). These studies have shown that effects of lead exposure, such as IQ
loss, can actually occur at a faster rate below the current CDC threshold (Canfield et al.
2003).

The geographic aspects of lead poisoning have received more attention in recent
years in community health because of advances in computing technologies such as
Geographic Information Systems (GIS), geocomputation, and spatial statistics (Cromley
and McLafferty 2002). Analyses of the geographic distribution of lead poisoning are
useful for ﬁnding “hot spots” where clusters of children with elevated blood lead levels
reside and for creating models for where lead exposure is likely higher based on socio-
demographic and housing variables (Grifﬁth et al. 1998). The overall population hazard
from lead has dropped due to the metal being largely taken out of industrial use and
exposure has become more concentrated in older areas. As this drop has occurred,
disparities between areas of high and low incidence of lead poisoning have developed
(Lanphear 2005a). This divergence can be observed in geographic variations in
neighborhood characteristics as well as public health intervention (Bailey, Sargent, and
Blake 1998).

1.1.1 Purpose of Study

The purpose of this study is to use the Michigan statewide database of yearly lead
test results in children to explore spatial patterns and processes over
time and to measure the extent to which geographic variation in BLL can be explained by
US Census socio-demographic variables. This will be accomplished using spatial
statistics, spatial clustering techniques, and geographic regression modeling. Building on
previous research on the geographic dimensions of lead exposure, this research explores
spatio-temporal variations in lead test results in Michigan. The main questions that this
study aims to address are:

Are there spatial clusters of elevated BLL in Michigan? At what spatial scales do
these patterns manifest?

Are socio-demographic and economic variables in the US Census able to predict
and explain the geographic variation in elevated blood lead levels in Michigan children?

Can a model based on US Census socio-demographic and economic variables
accurately predict the spatial distribution of elevated BLL in Michigan over time?

This thesis is organized into four chapters. The remainder of Chapter 1 provides
a review of relevant literature and the research hypothesis. Chapter 2 describes data and
methods used in investigating these research questions. The results from these analyses
and a discussion of their implications are presented in Chapter 3. Finally, Chapter 4
concludes with recommendations for policy and programmatic changes and suggestions
for future research.

1.2 Literature Review

1.2.1 Lead Uses and Consequent Problems

Lead is a bluish-gray metal that occurs naturally within the Earth’s crust (Centers
for Disease Control and Prevention 2005a). There are several elemental properties that
make it of use to humans. Lead is very dense, able to be shaped easily, and resistant to
corrosion (United States Geological Survey 2007). It is soft enough that it can be rolled
into a sheet and shaped into rods and pipes (Hunter 1969). Lead has a very low melting
point, allowing it to be softened at temperatures as low as those of a campfire (Angier 2007).
Because of these qualities, lead has been distributed widely throughout the environment
through extensive human use. Lead does not break down naturally, a fact which
separates it from many other environmental contaminants (Kitman 2000).

Archaeological evidence of human use of lead dates back thousands of years. A
lead figurine in the British Museum has been dated to 5,800 years ago in the Neolithic
Period (Clarkson 1995). Lead was also found in Bronze Age pottery and was extensively
mined by the Ancient Greeks and Romans (Brill and Wampler 1967; Weiss, Shotyk, and
Kempf 1999). Roman use included making lead pipe for plumbing and as a preservative
in wine, inducing high lead levels among the Roman aristocracy and suspicion among
modern researchers that lead might have played a role in the decline of the empire
(Nriagu 1983; Waldron 1973). Evidence of lead’s durability is found in excavated,
perfectly preserved 2,000-year-old Roman water pipes (Hunter 1969).

Though lead was continuously used in pre-industrial societies, studies conducted
in various environmental archives such as peat bogs and glaciers conﬁrm that lead
production and use in the environment exponentially increased after the industrial
revolution (Weiss, Shotyk, and Kempf 1999). Lead has been used in many products such
as batteries, water pipes, ammunition, ceramic glazes, rooﬁng, and lead sheet for lining
buildings. But the two applications that caused the most damage to American children
were lead paint and leaded gasoline (Centers for Disease Control and Prevention
2005a).

Leaded gasoline was developed to reduce engine knock. The solution settled on
in the 1920s by the automotive industry was tetraethyl lead (TEL), selected over several
safer alternatives such as ethanol (Kitman 2000). TEL improved engine performance and
was an effective anti-knocking agent, which led to it being called “a gift from God” by an
industry executive (Nriagu 1990). Despite early warning signs such as reﬁnery worker
deaths, the industries involved in the production and use of leaded gasoline continued to
resist any efforts by the public health community for a ban and worked to fund their own
research (Kovarik 2005). Leaded gasoline is documented as the source of nearly all the
lead found in the environment (Hernberg 2000).

Lead historically has been used in paints because of its anti-corrosive properties.
Two lead compounds, white and red lead, were commonly used in paints through the
20th century. While red lead was used primarily in painting of ships, white lead paint
was used in households because it was resistant to water and prevented mildew (Hunter
1969). Lead was considered a valuable addition to paint, making the cost of house paint
rise with the amount of lead added into the mixture (Beam 2007). The paint industry as
well as the Lead Industries Association (LIA), a lead industry trade group, heavily
marketed lead paint (Markowitz and Rosner 2000). Advertisements appeared in popular
periodicals touting the durability of leaded paint. The industry also created a mascot of
the Dutch Boy, a young boy who appeared in many advertisements encouraging children
to use lead paint (Markowitz and Rosner 2002).

There are several ways lead can enter a child’s body once it is in the local
environment. Lead has a sweet taste, which makes young children (under two years of
age) especially vulnerable to lead around the home because children have a tendency to
put objects in their mouth, a condition known as pica (Gaston 1972). Also in the home,
lead paint can chip, and the dust can accumulate in areas of the house such as
windowsills, carpet, and other accessible places (Lanphear et al. 1998d). Inhalation of
lead paint dust by children also can occur when the old paint layers are sanded during
home renovation (Lanphear 2005a). Another pathway by which children may be exposed
to lead is through the soil around the child’s residence. Leftover lead from the leaded
gasoline era has been found to have accumulated in areas of high trafﬁc congestion (Tong
1990). Children who play in such environments often get lead particles on their hands
which can easily be transferred to the mouth and ingested (Mielke 1999). Thus oral
ingestion and inhalation are the two main routes by which children are exposed.

Lead is able to disrupt many essential nervous system functions at a cellular level,
particularly affecting the developing bodies of children (Garza et al. 2005). Lead is a
potent neurotoxin that has been established as a poison for centuries (Lidsky and
Schneider 2003). It has been suggested that the root of the neurotoxicity goes far back in
the evolution of living cells and lead’s role as a non-essential metal. Lead levels in
modern humans, following the industrial revolution, are estimated to be 50-200 times
higher than estimated blood lead levels before human lead usage (Flegal and Smith
1992). Tests on animals have shown negative effects of exposure similar to those that show up
in humans (Finkelstein, Markowitz, and Rosen 1998). Once lead is inside the human
system, it is able to mimic the role of other essential metals for cell function like calcium
(Clarkson 1995). No known life forms rely on lead for survival (Angier 2007).

Once inside the body, the effects of lead on children are serious and long-term even at
very low levels. Lead exposure is typically measured in micrograms of lead per deciliter
(ug/dL) of blood. The current threshold for what is considered lead poisoning by the
CDC is 10 ug/dL. This is equivalent to a teaspoon of lead in a swimming pool 100 feet
by 40 feet and five feet deep (Richardson 2005). At clinical levels of lead exposure,
generally above 60-70 ug/dL, a child will begin to show outward signs that poisoning has
occurred. These include loss of the ability to coordinate muscular movement,
convulsions, anemia, stupor, colic, coma, and possibly death (Agency for Toxic
Substances & Disease Registry 2007). Such high levels of lead were once quite common
in the United States, but since the gradual phasing out of leaded paint and gasoline, lead
exposure usually occurs at a sub-clinical level where testing is needed to conﬁrm
poisoning. Sub-clinical effects of lead exposure include decreased impulse transmission
through the nervous system, reduced cell and nerve function, loss of IQ points, and
decreased hearing and growth (Bellinger and Bellinger 2006). Follow-up studies of
children with high blood lead levels as toddlers have found links with loss of IQ points
once the child enters school (Chen et al. 2005). There has been recent interest in studying
the effects of lead exposure below the CDC threshold of 10 ug/dL for lead poisoning
(Canfield et al. 2003; Finkelstein, Markowitz, and Rosen 1998; Lanphear et al. 2005b;
Needleman and Bellinger 1991a). Research has shown children with blood lead levels
within this lower range (<10 ug/dL) experience adverse effects. Needleman and Bellinger
(1991a) summarized the research and found a strong link for loss of IQ points at lower
levels. Finkelstein, Markowitz, and Rosen (1998) studied the effects of lead on the central
nervous system and found that any amount of lead within the body was hazardous.
Canfield et al. (2003) found that IQ loss occurred more rapidly at BLL concentrations
below the CDC threshold than at higher concentrations. Lanphear et al. (2005b)
confirmed this finding by surveying IQ test scores and BLL. Their research found
an inverse relationship between IQ and BLL with the steepest drop under 10 ug/dL.
This development has led to greater concern among public health ofﬁcials for the safety
of children who have been exposed but have a blood lead level under the CDC threshold,
as well as initiated calls for the threshold to be lowered (Gilbert and Weiss 2005).
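
The swimming-pool comparison given earlier in this section for the 10 ug/dL threshold can be checked with a short unit conversion. The sketch below is illustrative only: it assumes a US customary teaspoon and solid metallic lead at its room-temperature density, neither of which is stated in the source, and the function names are hypothetical.

```python
# Rough unit-conversion check of the "teaspoon of lead in a swimming pool"
# comparison for a concentration of 10 ug/dL.
# All constants are standard conversion values, not figures from the thesis.

LITERS_PER_CUBIC_FOOT = 28.316846592  # follows from 1 ft = 0.3048 m
DECILITERS_PER_LITER = 10
ML_PER_US_TEASPOON = 4.92892          # assumption: US customary teaspoon
LEAD_DENSITY_G_PER_ML = 11.34         # assumption: solid metallic lead
UG_PER_GRAM = 1_000_000


def pool_volume_dl(length_ft: float, width_ft: float, depth_ft: float) -> float:
    """Volume of a rectangular pool in deciliters."""
    cubic_feet = length_ft * width_ft * depth_ft
    return cubic_feet * LITERS_PER_CUBIC_FOOT * DECILITERS_PER_LITER


def teaspoon_of_lead_ug() -> float:
    """Mass of one teaspoon of lead in micrograms."""
    return ML_PER_US_TEASPOON * LEAD_DENSITY_G_PER_ML * UG_PER_GRAM


def pool_concentration_ug_per_dl(
    length_ft: float = 100, width_ft: float = 40, depth_ft: float = 5
) -> float:
    """Concentration if one teaspoon of lead were spread evenly through the pool."""
    return teaspoon_of_lead_ug() / pool_volume_dl(length_ft, width_ft, depth_ft)


if __name__ == "__main__":
    print(f"{pool_concentration_ug_per_dl():.2f} ug/dL")
```

Under these assumptions the result comes out just under 10 ug/dL, which is consistent with the comparison attributed to Richardson (2005).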

Treatment for lead exposure is time consuming and often cannot undo the damage
already caused (Silbergeld 1997). Because lead is absorbed into the body at a cellular
level, it is very difficult to extract. Chelation therapy is a process where a chelating agent
is added to the body which binds with lead, making it inert and speeding up bodily
excretion (Ettinger 1999). It has been licensed by the Food and Drug Administration
(FDA) to be used when the child’s blood lead level is above 45 ug/dL (Dietrich et al.
2004). The process can take many treatments as BLL often rebounds following initial
dosage. Chelation therapy has come under scrutiny because of its ineffectiveness in
preventing neurological damage (Rosen and Mushak 2001). Medical professionals
increasingly stress that the only effective way of treating lead exposure is primary
prevention of lead hazards within the children’s environment.

1.2.2 Research, Industry, and Public Policy

Through the lens of hindsight, many early warnings of the danger of lead were
missed or ignored (Figure 2). A few observers in Roman times made the connection
between ship builders and lead poisoning, but modern discovery of the etiologic
connections between lead and various symptoms of poisoning dates to the 19th century
(Hernberg 2000). Early studies of the effects of lead examined factory workers who were
exposed to massive amounts of lead dust (Tong, Schirnding, and Prapamontol 2000).
The first study of the source of lead in children was conducted by an Australian doctor, J.
Lockhart Gibson, who identified lead paint as the source of exposure (Gibson 1904).
News of the Australian results reached American researchers through a mention in a
medical textbook in 1907 and through Gibson’s 1911 call for lead paint to be banned from
places near children (Markowitz and Rosner 2002). Very soon, articles about lead began to
appear in American academic journals. Early research came from Johns Hopkins
Hospital in Baltimore, where in 1917 physician Kenneth Blackfan described the horrible
condition of children suffering from clinical lead poisoning and called for measures to
keep children from lead paint (Fee 1990). Pressure began to mount around the
world for lead to be banned from house paint.

During the ﬁrst few decades of the 20th century, an assortment of countries
banned lead from household interior paint. France, Belgium, and Austria were the ﬁrst to
ban indoor lead paint in 1909, followed by bans in Tunisia and Greece as well as a
resolution supporting outlawing lead paint by the League of Nations in 1922 (Chisolm
2001). By 1927, Great Britain, Australia, Czechoslovakia, Sweden, Belgium and Poland
had followed suit (Richardson 2005). But the United States would not take this step for
another 50 years.

The creation of the Lead Industries Association (LIA) trade group in 1928 had a
profound effect on US policy relating to lead products. The group was able to
successfully lobby for the industry and stiﬂe any attempt at regulation of lead paint. At
the same time, the health community was debating TEL gasoline. The lead gasoline
industry turned to Robert Kehoe, a researcher out of the University of Cincinnati, for
scientific aid to support their case. Kehoe is widely recognized as the originator of a
paradigm still used by industry today, that the burden of proving a product
hazardous enough for removal lies with health experts and not industry (Nriagu 1998). In
Kehoe, the industry found their spokesman scientist who would point to lead being a
natural element within the human body (Needleman 1998). For most of the middle part
of the 20th century, the only research funding for studying lead came from industry, and
most of those funds went to Kehoe. His research on behalf of the makers of TEL and his
primacy in lead research helped keep regulation at bay (Kitman 2000). At a 1925
conference commissioned by the surgeon general to debate regulations on TEL, Kehoe
successfully defended its use against other health advocates who called for a ban. With
no formidable opposition, the lead industry began to advertise heavily. LIA began to
intensely promote white lead paint in residential homes, producing pamphlets for
children, buying ad space in popular magazines, and having representatives travel around
the country promoting its use to a variety of state and local governments. This promotion
of lead by LIA included advocating its use in some Michigan public school districts
(Markowitz and Rosner 2002).

The tide began to turn against the lead industry in the 1940s. A rash of
lead-related sickness and deaths during the Great Depression made the issue harder for the
medical community to ignore. As blood lead testing became more widely available,
medical consensus grew on the harm of lead, and the chorus of criticism put the lead
industry increasingly on the defensive. Randolph Byers and Elizabeth Lord published a
study in 1943 where they followed children who had been poisoned by lead in early
childhood, ﬁnding nearly all experienced behavioral problems and struggled in school
(Chisolm 2001). Time magazine picked up the story and brought it to a national audience
(Markowitz and Rosner 2000). Many other stories about lead poisoning began to appear
in magazines and on television news over the next decade (Markowitz and Rosner 2002).
However, while the paint industry voluntarily reduced lead content in its paints in the
mid-1940s, it did not remove lead completely from house paint. As environmental
awareness grew during the 1960s, public tolerance of industrial contamination waned.

In 1970 there were no federal regulations regarding lead paint, and only four
states and ten cities in the United States had bans on the indoor use of lead paint (Hernberg
2000). Early legislation in the United States was meant to respond to lead poisoning
rather than prevent it. Congress passed the ﬁrst federal legislation against lead paint in
1971, a half-century after many other developed nations. Known as the Lead-Based
Paint Poisoning Prevention Act (LBPPPA), the measure prohibited lead-based paint
(deﬁned as more than 1% lead by weight) in residential structures built by the federal
government, set the lead poisoning threshold at 60 ug/dL, and set abatement standards
(Department of Housing and Urban Development 2004). The newly created
Environmental Protection Agency (EPA) followed in 1973 with the first regulations of
leaded gasoline, beginning a gradual phase-out that lasted until 1986. In the 1975 model
year, automobile manufacturers began building vehicles which had a new emission
control system including a catalytic converter, which required unleaded gasoline
(Environmental Protection Agency 1996). The final major policy regulations came in
1977, when the US Consumer Product Safety Commission ruled that residential house
paint could not contain more than 0.06% lead by dry weight (Bellinger and Bellinger
2006). With the regulations of the 1970s, major sources of childhood lead poisoning
were no longer being manufactured, though the vestiges of earlier usage remained a
threat.

Effects of the new legislation were immediate and striking. In the National
Health and Nutrition Examination Survey (NHANES II) conducted by the CDC, average
BLL of people surveyed dropped from 16 ug/dL to 9 ug/dL between 1976 and 1980
(Needleman 2004). But the same survey estimated that 700,000 children likely had
elevated blood lead levels (30 ug/dL at this time), leading to a continued push by the
public health community for more funds (Rabin 1989). In the research community, the
priority began to shift from demonstrating the harm of lead to targeting the source of
elevated blood lead levels in communities. The new population-based studies began to
look at what locales were at risk in order to aid the removal of hazards and the prevention
of exposure before it occurs.

[Figure 2 image not reproduced. Timeline axis: 1900 to 2008. Event labels, in order:
Gibson identifies lead paint exposure; France, Belgium, Austria ban indoor lead paint;
Blackfan describes clinical lead poisoning; Tetraethyl lead gasoline additive introduced;
Creation of the Lead Industries Association; Byers and Lord publish influential study
linking lead poisoning to behavioral issues; Industry voluntarily reduces amount of lead
in paint; First geographic studies of lead poisoning distribution; Lead-Based Paint
Poisoning Prevention Act; Catalytic converter introduced for cars; Lead paint banned in
US homes; Leaded gasoline phase-out complete; Title X provides funds for lead
remediation; Bailey uses regression analysis to improve remediation efforts; Michigan
passed Lead Abatement Act; Lead Abatement Act amended to increase testing.]

Figure 2: Timeline of events relating to lead poisoning. Legislation is marked in blue,
business and industry marked in orange, and research is marked in green.

In the early 1990s, legislation was passed at the federal level to provide funding
for primary prevention of lead poisoning. Coupled with the lowering of the elevated
BLL threshold to 10 ug/dL in 1991, the passage of Title X of the Housing and
Community Development Act of 1992 made federal funding available for remediation
programs and broadened the official deﬁnition of a lead-based hazard. Remediation of
lead involves removal of all lead paint dust, removal of lead-based paint, removal of lead-
contaminated topsoil, and replacing painted ﬁxtures (Environmental Protection Agency
2001). It has to be carried out by a state-certiﬁed contractor. The bill made grants
available for state and local governments to reduce lead paint in private sector housing. It
required that housing sold by the federal government be lead-free, extended the LBPPPA
to all housing, and ensured disclosure of the danger to residents (Richardson 2005). Title
X marked a change in policy from treating speciﬁc cases to prevention of lead poisoning
before it occurs. Lead-based hazards were extended from just paint chips to dust within
the house and bare soil on the property (Department of Housing and Urban Development
1993). Individual states were now expected to draft abatement plans or risk loss of
federal funding.

The threat of funding shortfall prompted the Michigan Legislature to pass the
Lead Abatement Act in 1998. This provided local health departments throughout
Michigan with funds to conduct blood tests on children and remediate the child’s
environment if necessary. A screening plan (Appendix 1), based on the CDC
recommendations, was developed to cover children thought to be at risk (Michigan
Department of Community Health 2007). Universal screening is now recommended for
zip codes in Michigan where at least 27% of housing was built before 1950 (the national
average), where the incidence of lead poisoning among children 12 to 36 months of age in
2000 was at least 12%, or where there are high
percentages of pre-1950 housing and children living in poverty. Zip codes that are
deemed high-risk by those standards are shown in Figure 3. If a child is not in one of
these zip codes but is insured by Medicaid, a blood lead test is required and paid for by
the federal government (Kemper and Clark 2005c). Though follow-up screening is
required for children who have BLL above the 10 ug/dL limit, this mandate is not
followed nearly half the time (Kemper et al. 2005b). Finally, if the child is not insured by
Medicaid and does not live in a high risk zip code, MDCH recommends that the parents
or guardians be given a questionnaire to determine if a blood lead test should be given. The
questions ask if the child lives in or visits a building built before 1950, has a sibling or
playmate with lead poisoning, lives around an adult who works with lead, is subject to
cultural practices or remedies containing lead, or is included in a special population group
that may have suffered previous exposure such as a foreign adoptee. A yes answer to any
of these questions prompts a blood lead test (Michigan Department of Community Health
2007).
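
The screening plan described above amounts to a short decision rule, which can be sketched in Python as follows. This is a minimal illustration of the logic as summarized in this section, not the MDCH plan itself: the class and function names are hypothetical, the thresholds are read as "at least" 27% and 12%, and the third zip code criterion (high percentages of pre-1950 housing and child poverty) is passed in as a flag because the source gives no cutoff for it.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ZipCodeProfile:
    """Hypothetical summary of a zip code for the screening rule."""
    pct_housing_pre_1950: float            # percent of housing built before 1950
    pct_lead_poisoned_12_36_months: float  # percent incidence among children 12-36 months
    high_old_housing_and_poverty: bool     # third criterion; no cutoff given in the source


def is_high_risk_zip(zip_profile: ZipCodeProfile) -> bool:
    """A zip code is high risk if any one of the three criteria holds."""
    return (
        zip_profile.pct_housing_pre_1950 >= 27.0             # assumed to mean "at least 27%"
        or zip_profile.pct_lead_poisoned_12_36_months >= 12.0  # assumed to mean "at least 12%"
        or zip_profile.high_old_housing_and_poverty
    )


def screening_decision(
    zip_profile: ZipCodeProfile,
    on_medicaid: bool,
    questionnaire_answers: Sequence[bool],
) -> str:
    """Return the reason a blood lead test is indicated, or 'no test indicated'."""
    if is_high_risk_zip(zip_profile):
        return "test: high-risk zip code"
    if on_medicaid:
        return "test: Medicaid"
    if any(questionnaire_answers):  # a single yes answer prompts a test
        return "test: questionnaire"
    return "no test indicated"


if __name__ == "__main__":
    low_risk_zip = ZipCodeProfile(10.0, 2.0, False)
    # One yes among the five questionnaire items is enough to prompt a test.
    print(screening_decision(low_risk_zip, False, [False, True, False, False, False]))
```

The order of the checks mirrors the order in which the plan is described: zip code first, then Medicaid coverage, then the questionnaire.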

[Figure 3 map not reproduced. Title: High Risk Zip Codes. Legend: High Risk; Not High Risk.]

Figure 3: Map of zip codes deemed “high risk” by CDC standards

Following press reports on lead poisoning in 2003, the Michigan Legislature
amended the Lead Abatement Act in 2004 to increase testing of vulnerable children
(Centers for Disease Control and Prevention 2005b). The Lead Task Force appointed by
the governor crafted a plan to rid Michigan of lead poisoning by eliminating lead hazards
in housing, expanding testing, assuring capacity to serve kids who need medical help, and
securing funding (Task Force to Eliminate Childhood Lead Poisoning 2004).

1.2.3 Geographic Studies of Lead

Research into how lead exposure varies by geographic location began in the 1960s.
The geography of lead poisoning was a component of the wider research into clinical lead
poisoning (Gaston 1972). Many studies were based in large cities where the residence of
children who were treated in a hospital was plotted on a city map. For example,
Jacobziner and Raybin (1962) investigated cases of lead poisoning reported by New York
City hospitals. Analysis was restricted to disease mapping, where locations of the
residences of lead poisoned children were plotted on a map. The authors found a spatial
pattern of children with elevated BLL, uncovering a “lead belt” through the low income,
largely minority neighborhoods which was attributed to substandard housing with
lead-based paint (Jacobziner and Raybin 1962). Other studies based their spatial analysis on
blood lead samples collected throughout study areas, such as the cities of Chicago and
Philadelphia (Gaston 1972). Disease maps of the samples conﬁrmed that lead poisoning
(above 60 ug/dL at the time) generally afflicted lower income neighborhoods that often
contained older housing and politically dispossessed citizens. The spatial patterns found
by these community samples were later conﬁrmed through larger statewide population
surveys and screening programs (Griffith et al. 1998).

Larger population-based studies at county, state, and national levels that looked at
using population variables to focus primary prevention strategies were completed in the
1980s and 1990s. The NHANES II survey from 1976-1980 was the first
population-wide study of children with lead poisoning (Bailey et al. 1994). Results
showed that the problem was the worst in urban areas, and African-American children
suffered more exposure to lead than others (Mahaffey et al. 1982). Children under the
age of six were found to have the highest mean BLL. Unlike among adults, where men had
higher average BLL, the child’s sex was found not to be a predictor of lead exposure
(Mahaffey et al. 1982). While statewide screening programs generally came after Title
X, several studies looked at lead poisoning in cities that had programs. Daniel et al. (1990)
found that while BLL in New York City was declining overall, the older urban areas were
more likely to have housing with layers of lead paint than housing outside the city.
African-Americans accounted for nearly two-thirds of lead poisoning cases, and children
between six months and two years old were found to be at the highest risk (Daniel et al.
1990). Guthe et al. (1992) used GIS to examine the spatial pattern of blood lead test
results compared to major roadways and industrial sites in Newark, New Jersey. The
lack of conclusive links between these sites and the occurrences of elevated BLL caused
the authors to call for additional research (Guthe et al. 1992). Since these studies
revealed the same patterns with the same population markers, research into the spatial
distribution of lead poisoning turned to using regression analyses to discover areas where
exposure was more likely.

To better target screening programs that proliferated after the passage of Title X,
researchers studying the geography of lead poisoning turned to regression models based
on enumerative unit variables (Table 1). An early example was Bailey et al. (1994), who
looked at lead poisoning in children in Massachusetts at the minor civil division scale.
Though the research was criticized because the state screening program at the time used a
surrogate marker rather than the actual blood lead level, the paper did indicate that many
population risk factors that had been identiﬁed earlier indeed helped explain the
distribution of lead poisoning throughout Massachusetts. Several common indicators of
community lead risk were found to explain the geographic variation of lead poisoning in
the state including percentage of African-Americans, percentage of housing units built
before 1940, and percentage of households headed by a female (Bailey et al. 1994).
Bailey also looked at the role of an area’s industrial heritage in lead poisoning by creating
a dummy variable for minor civil divisions that bordered the industry-heavy Merrimack
River and found that adjacency to this waterway was statistically signiﬁcant in predicting
elevated BLL.

The next regression model for lead poisoning that appeared in the literature was
Sargent et al. (1995), who also looked at lead poisoning in Massachusetts. Many of the
same variables were observed to affect geographic variation of lead poisoning as in Bailey
et al. (1994), this time at a community level (Sargent et al. 1995). In each case, impoverished
communities had greater difﬁculty with childhood lead poisoning. Similar to the Bailey
model, this regression did suffer from the fact that Massachusetts used a surrogate marker
for BLL. Two years later, both authors were involved in creating a model for lead
exposure, this time at the census tract level in Providence, Rhode Island (Sargent et al.
1997). While many of the same poverty and racial characteristics were found to predict
geographic variations as the earlier models, additional variables were used which were
found to have a significant effect. One such factor was the percentage of recent
immigrants to the United States (< 5 years). The authors speculate that the lack of
understanding of the dangers of lead paint and the language barrier might have placed
immigrants at greater risk for lead exposure (Sargent et al. 1997).

The ﬁrst regression model for lead poisoning that considered the spatial
component was Grifﬁth et al. (1998). The study looked at Syracuse, New York with three
US Census scales: blocks, block groups, and tracts. New variables found to explain
geographic variation of BLL were average household value and average rent. Griffith
also used buffering analysis around major roadways and found the BLL of children living
next to roadways to be similar to the rest of the study population, which indicated that
leaded gasoline did not contribute to elevated BLL. But the main contribution of the
study was the combination of regression analysis with spatial analysis. Grifﬁth found
that incorporating space into the regression analysis through the use of a spatial
autoregressive model helped further explain the geographic variance. Elevated BLL in
Syracuse was found to cluster at every scale (block group, tract, and zip code) tested,
which led the authors to conclude that community childhood lead exposure cannot be
understood completely without accounting for the geographic dimension (Griffith et al.
1998).

Author           Study Site        Spatial Scale                   Method                 Dep. Variable
---------------  ----------------  ------------------------------  ---------------------  -----------------------
Bailey (1994)    Massachusetts     Minor Civil Division            Poisson Regression     Count > 25 ug/dL
Sargent (1995)   Massachusetts     Minor Civil Division            Logistic Regression    Cases / Tests
Sargent (1997)   Providence, RI    Census Tract                    Linear Regression      % > 10 ug/dL
Griffith (1998)  Syracuse, NY      Census Block, Blk Group, Tract  Spatial Regression     Number of Cases
Lanphear (1998)  Rochester, NY     Block Group                     Logistic Regression    % > 10 ug/dL
Talbot (1998)    New York State    Zip Code                        Linear Regression      Ln(% > 10 ug/dL)
Litaker (2000)   19 Ohio Counties  Census Tract                    Logistic Regression    12% or more > 10 ug/dL
Miranda (2002)   6 NC Counties     Tax Parcel                      Linear Regression      Ln(BLL)
Haley (2004)     New York State    Zip Code                        Linear, Spatial Error  Ln(% > 10 ug/dL)
Kaplowitz (n/a)  Michigan          Individual, Blk Group           Linear                 Ln(BLL)

Table 1: Summary of previous geographic studies of lead poisoning

Several other local scale studies in the literature have produced interesting results.
Lanphear et al. (1998b) studied childhood BLL at the census block group level in
Rochester, New York. While their regression model did not use any new variables, they
tested the model against individual data collected by a testing clinic in a local area.
Results showed the block group level data in the community predicted elevated BLL as
well as the individual level data (Lanphear et al. 1998b). Litaker et al. (2000) used a risk
score based on housing, ethnicity, education, and housing rental for their regression
model of 19 Ohio counties. They found that their model predicted the spatial distribution
of elevated BLL better than the CDC guidelines, which are the same as the screening plan
by MDCH (Litaker et al. 2000). The study by Miranda (2002) is the only lead regression

model organized at the parcel level. Though not practical for a statewide study, the
authors used tax parcel data for six counties in North Carolina to estimate the areas most
in need of primary prevention. The ﬁner scale of the analysis allowed a residence-by-
residence analysis based on the year each structure was built (Miranda, Dolinoy, and
Overstreet 2002). While the study worked at a microscale for the counties surveyed, the
difﬁculty of gathering household data on other variables did not allow the authors to look
at many other socio-economic factors.

The largest population-based geographic elevated BLL study was done in New
York State (Haley and Talbot 2004; Talbot, Forand, and Haley 1998). Authors of the
study used zip code level variables to predict areas in the state where the percentage of
children with elevated BLL would be higher. A linear regression model and a spatial
error regression model were used throughout the entire state. Perhaps the most
interesting result in the research was that the same variables of percentage housing built
before 1940, percentage high school graduates, and percentage African-American births
were the best predictors of childhood BLL in both New York City and the rest of
the state (Talbot, Forand, and Haley 1998). The generally lower levels of BLL found in
New York City are attributed to the fact that lead paint was banned by the local
government in residential areas within the city two decades earlier than the federal ban,
though the result still surprised the authors. The study concluded that, when
working with a large study area, variables that explain BLL variance at finer scales might
not persist. For example, population density was noted to have no effect at the
statewide level, unlike in earlier localized studies (Haley and Talbot 2004).


The faculty of the Sociology Department at Michigan State University has studied
common factors of BLL in Michigan. A detailed survey was used to sample around
4,200 children throughout Michigan to determine significant indicators of elevated
BLL (Frost 2004). Children who lived in urban, low-income areas were sampled. The
variables found to significantly predict BLL in a child were water delivered through lead
pipes, siblings with elevated BLL, adults in the house with elevated BLL, African-American
race of the child, and household income below $20,000. The data were later used to create a
predictive model based on census variables (Kaplowitz, Perlstadt, and Post 2007). As the
first study to use a continuous dependent variable for BLL, the authors found that
Medicaid status, race of the child, and ethnic character of the neighborhood were strong
predictors of BLL. Other interesting findings included that exposure risk was higher with
pre-1940 housing than with housing built between 1940 and 1950 (Kaplowitz, Perlstadt,
and Post 2007).


 

Author             Scale         Independent Variable                        +/-   P-Value
                                 Log (Number of children screened)           +     <0.001
Bailey (1994)                    Percentage African-American                 +     0.004
                                 Percentage Female-Headed Households         +     0.003
                                 Percentage Houses built before 1940         +     <0.001
                                 Median Per Capita Income                    -     <0.001
                                 Percentage African-American                 +     <0.001
Sargent (1995)                   Percentage Houses built before 1950         +     <0.001
                                 Screening Rate                              +     <0.001
                                 Poverty Scale                               +     0.007
                                 Percentage Screened                         +     0.01
                                 Percentage Houses built before 1950         +     <0.001
Sargent (1997)                   Natural Log (Number of Vacant Houses)       +     <0.001
                                 Percentage Recent Immigrants (< 5 years)    +     0.003
                                 Population Density                          +     undisclosed
                   Tract         Average House Value                         -     undisclosed
                                 Percentage Under 18 years old                     undisclosed
                                 Population Density                                undisclosed
Griffith (1998)    Block Group   Average House Value                         -     undisclosed
                                 Percentage African-American                 +     undisclosed
                                 Percentage African-American                       undisclosed
                                 Average House Value                         -     undisclosed
                   Block         Percentage Under 18 years old               +     undisclosed
                                 Percentage Hispanic                         +     undisclosed
                                 Percentage Renter Occupied Housing          +     undisclosed
                                 City Residence                              +     <0.001
                                 Percentage Screened                         +     <0.001
                                 African-American Population                 +     <0.001
                                 Percentage Houses built before 1950         +     <0.001
Lanphear (1998)                  Population Density                          +     <0.001
                                 Low House Value                             +     <0.001
                                 High Poverty                                +     <0.001
                                 Low High School Graduation Rates            +     0.004
                                 Low Owner Occupied Housing                  +     0.012

 

Table 2: Regression results from earlier studies. Columns are author, independent
variable, whether the coefﬁcient is positive or negative, and the p-value

Author             Area             Independent Variable                        +/-   P-Value
                                    Percentage African-American births          +     <0.001
Talbot (1998)                       Percentage High School Graduates            -     <0.001
                                    Percentage Houses built before 1940         +     <0.001
                                    Percentage living in rural areas            -     0.005
                                    Percentage African-American                 +     <0.001
                                    Percentage Houses built before 1950         +     <0.001
Litaker (2000)                      Percentage Under 6 years old                +     <0.001
                                    Percentage Male Under 6 years old           +     0.001
                                    Percentage without High School Diploma      +     <0.001
                                    Percentage below 150% poverty line          +     <0.001
                                    Percentage Housing Renters                  +     <0.001
                                    Percentage Female Headed Households         +     <0.001
                                    Residence Year of Construction              -     <0.001
Miranda (2002)                      Median Income                               -     <0.001
                                    Percentage African-American                 +     0.001
                                    Percentage Houses built before 1940         +     <0.001
                   New York City    Percentage without High School Diploma      +     0.02
Haley (2004)                        Percentage African-American                 +     <0.001
                                    Percentage Houses built before 1940         +     <0.001
                   New York State   Percentage without High School Diploma      +     <0.001
                                    Percentage African-American                 +     <0.001
                                    Percentage below 185% poverty line          +     <0.001
                                    Percentage African-American                 +     <0.001
Kaplowitz                           Percentage Latino                           +     <0.001
(unpublished)                       Percentage without High School Diploma      +     <0.001
                                    Percentage Houses built before 1950         +     <0.001

 

Table 3: Continuation of Table 2 showing regression results from earlier studies

Previous geographic studies of lead exposure have shown the usefulness of using
regression models (Tables 2 and 3). While many similar variables have been shown to be
predictive of childhood BLL, the geographic element of lead poisoning has proved to be
important. Factors such as population density have influence at certain spatial scales, but
not at others.


1.2.4 Theoretical Basis and Hypothesis

Medical geography is a research ﬁeld which draws upon concepts from a range of
disciplines (Meade and Earickson 2000). While interest in how disease varies through
space goes back centuries, the organization of medical geography as an academic ﬁeld
dates to the middle of the 20th century (Akhtar 1982). The work of Jacques May in the
1950s introduced the ecology of disease, in which human behavior-based factors determined
the limitations of disease incidence (Meade 1977). The disease ecology approach resulted
in a shift from studying disease itself, a process rooted in germ theory, to studying the
environment where the disease grows and occurs (Akhtar 1982). Disease came to be
viewed as an interrelationship of factors occurring at a certain time and space (Jones and
Moon 1987). Disease agents are constrained by the typical environments where they can
survive, creating a characteristic spatial distribution, also called landscape epidemiology
(Mayer 1986). Disease mapping became a valuable tool for the study of the pattern of
disease, although without an underlying process theory (Mayer 1982).

The human ecology model came to medical geography from the biological
sciences by way of sociology (Honari 1999). According to Meade and Earickson (2000),
human ecology refers to the “patterns of human interaction with the physical
environment, including not only behavior but genetic adaptation and physiological
reaction to environmental stimuli.” Human ecology is a holistic model, concerned with
interactions at all scales (Honari 1999). The human-ecology triangle (Figure 4) was
created to show that human health is based on the interactions between individual or
population characteristics, behavior, and habitat (Meade and Earickson 2000). Population
is concerned with the individual or groups of individuals with common characteristics,
looking at how factors such as age, gender, and genetics affect human health. Behavior
refers to the observable aspect of culture, which manifests itself in conditions humans
create through alteration of the landscape, customs and social norms, and utilization of
resources (Meade 1977). Habitat is the environment, both natural and human
constructed, in which a person lives as well as the social environment that controls the
structure of the person’s surroundings (Meade and Earickson 2000). The study of
elevated BLL in children that utilizes the human ecology perspective is important
because of the clear relationship between children and their behavior in their local
environment. The concern among many researchers is not so much with lead itself, but
with the environment where it is prevalent and the children who are at risk of exposure.
The state of a child’s health as related to lead exposure depends on factors related to all

three vertices of the triangle, meaning each should be considered.

[Figure graphic: a triangle with the vertices Population, Behavior, and Habitat surrounding Human Health]

Figure 4: The human ecology triangle

The behavioral aspect of the human ecology triangle for lead has been the most
influential due to the preventable nature of lead exposure. Lead poisoning is a disease
that is entirely produced by human use of resources. The decision to use lead as an
additive to paint and gasoline for most of the 20th century is the driving reason behind
the problem today. Political indifference to the seriousness of lead poisoning also
contributed greatly to the prevalence of lead in the American environment. In terms of a
spatial lead study, human behavior comes into play in several ways. The ﬁrst is through
the marginalization of impoverished areas, which are known to be the areas of highest
lead exposure risk (Pirkle et al. 1998). The expense of remediation and the historically
lukewarm response from the public sector has left lower income areas without a
correcting mechanism for eradicating the lead in their environment (Rabin 2008).
Studies of lead exposure have shown that the effect of human behavior does not always
come from industrial or political decision-making (Bailey, Sargent, and Blake 1998).
Local efforts to screen children for lead in the bloodstream have an effect on BLL, as
well as the educational attainment levels in the community. Individual behavior of both
the parent and child influences lead exposure as well. Parents who are employed where
lead is present can unknowingly bring it home on their clothes (Frost 2004). Other
parental behaviors that affect childhood lead exposure are remodeling an older house
with lead paint, using foreign-made products such as cosmetics that might contain lead,
and not complying with lead paint removal regulations. The main behavior of children
that puts them at risk is pica, the compulsive need to ingest non-food substances (Gaston
1972).

The child’s environment, or habitat, affects lead exposure. It ﬁgures prominently
in the human ecology model for a variety of diseases, but is not a large factor in
childhood lead exposure. Pre-industrial levels of lead were much lower than today,
indicating lead posed virtually no risk before humans began altering the environment
(Kovarik 2005). Current background concentrations in the soil have been found to be
highest near industrialized areas (Murray, Rogers, and Kaufman 2004). Still, it is the
child’s human-constructed environment that poses the highest risk
of lead exposure. A young child’s world is much more constrained than an adult’s,
meaning that more often than not the trigger for lead exposure lies within the house.
Lead products lie in older housing stock, dating from years of leaded paint and lead water
pipes, and they generally make housing age among the best predictors of child BLL
(Pirkle et al. 1998). Other habitat features include the settlement patterns of towns and
cities. Michigan cities tend to be decentralized, leading to greater use of cars (Vojnovic
et al. 2006). This long-term trend could create lead reservoirs near major roadways that
were heavily trafficked during the leaded gasoline era (Hunter 1976).

The human ecology model also considers the social environment in which the
child is living. Social environment in the human ecology triangle refers to the “groups,
relations, and societies which people live” (Meade and Earickson 2000). Recent
immigrants to the United States demonstrate how the social environment
around a child could affect BLL. Often, these communities live in substandard housing, do
not speak English, are unaware of the dangers of lead, or have residents in the country
illegally who cannot come forward for testing (Centers for Disease Control and
Prevention 2005b).

Individual-level factors are an important part of the human ecology model, but
generally are not that important in lead exposure studies. Because lead toxicity is
harmful to everyone, typical population factors such as genetics do not make a difference.
The ethnic makeup of a neighborhood does predict elevated BLL, but this is not due
to any physical factor that falls under the population vertex of the human ecology
triangle. Researchers have also looked at disparity in BLL between the two genders and
uncovered no significant difference between male and female children (Mahaffey
et al. 1982). Age and race are normally the only individual factors that have an effect
(Goyer 1993). Typically, the peak age for childhood BLL has been found to be about two
years of age (Lanphear et al. 2005b).

With knowledge of previous research and the background of the human ecology
triangle, this thesis will attempt to answer the questions posed earlier by developing a
geographically based regression model. The goal is to create a useful model that
illuminates the spatial character of elevated BLL in Michigan and provides a tool for use
in primary prevention. From past research, I hypothesize that:

1. Clusters of elevated BLL exist in Michigan. These clusters are within older
urban neighborhoods. Similar to Griffith et al. (1998), these patterns will
manifest at several spatial scales.

2. Variables associated with older housing, lower income, lack of education, and
recent immigration to the US will best predict the spatial distribution of BLL.
The predictive power of each variable will also vary by place throughout the
state and at different geographic scales.

3. The model will work across time ranges due to the underlying socio-economic
factors causing the same distribution of BLL every year.


2 Data and Methods

2.1 Data

Lead in the environment remains a hazard for Michigan children. The only viable
solution is to prevent exposure at the source (Rosen and Mushak 2001). Primary
prevention remains a key strategy for eliminating lead in the human environment
(Centers for Disease Control and Prevention 2005b). This thesis divides the geographic
study of blood lead levels (BLL) into two phases: the identification of the patterns of
affected children and an examination of the socio-economic correlates. Two datasets
were used for the geographic study of BLL within the state of Michigan. The primary
dataset used is the Michigan Lead Database, created and maintained by Michigan
Department of Community Health (MDCH), which contains information and BLL results
of each child under the age of six who took a blood lead test. To make sense of the
spatial patterns of BLL observed in the lead database, data tables containing possible
independent variables were downloaded from the United States Census Summary Files
for the 2000 Census. These two sources were used to create both the geocoded BLL test

results point dataset and the statewide areal units.

2.1.1 Michigan Lead Database

Since 1997, all laboratories that conduct lead tests within Michigan have been
required to report all results to MDCH (Michigan Department of Community Health
1998). These results were originally sent by the labs as paper copies of the Blood Lead
Analysis Report, but 2004 legislation now requires electronic reporting (Kemper et al.
2005a). Blood lead analysis reports filed by the testing labs are reviewed for
completeness, entered into the database, and run through quality control checks to find
any data entry errors (Michigan Department of Community Health 1998). A 2002
internal study that tested the registry’s ability to link to other state-maintained datasets
such as the Medicaid enrollment ﬁles found it to be over 99% accurate (Kemper et al.
2005a). Once the test information is entered into the database, MDCH notiﬁes the child’s
health care provider and local public health organization of the results (Michigan
Department of Community Health 2006). In the case of children with elevated BLL, a
local environmental investigation may follow to determine the source of exposure.

MDCH Database
Child ID   Address    Birth Date   Race    Insurance   Testing Date   Test Type   BLL
000001     431 I St   3/8/2003     White   Self-Pay    6/3/2004       Capillary   7
000002     682 I St   4/24/2003    White   Medicaid    6/6/2004       Venous      10
000002     682 I St   4/24/2003    White   Medicaid    9/17/2004      Venous      4

Duplicate tests removed (highest BLL kept)
Addresses Geocoded

MSU Database
Child ID   Address    Birth Date   Race    Insurance   Testing Date   Test Type   BLL
000001     431 I St   3/8/2003     White   Self-Pay    6/3/2004       Capillary   7
000002     682 I St   4/24/2003    White   Medicaid    6/6/2004       Venous      10

Non-Medicaid Children Removed

Thesis Database
Child ID   Address    Birth Date   Race    Insurance   Testing Date   Test Type   BLL
000002     682 I St   4/24/2003    White   Medicaid    6/6/2004       Venous      10

Table 4: Example highlighting the changes between the original BLL database and the
database used in this thesis


The MDCH database contains information about each lead test from 1998 to 2005
and personal information for the examined child. The microgram per deciliter result of
the child’s blood lead test is recorded as an integer value, with 1 being the lowest
number. Also included is whether the test was a capillary or venous test. Capillary tests,
also known as ﬁnger stick, draw only a small amount of blood (under 100 uL) and are
cheaper to administer than the venous test (Parsons, Reilly, and Esernio-Jenssen 1997).
General consensus holds that the venous test is more accurate and less susceptible to
contamination, so any child who has a high blood lead result on a capillary test is given a
venous test to conﬁrm elevated BLL (Michigan Department of Community Health 2007).
For this reason, venous tests are the preferred method for investigators (Dignam et al.
2004).

In addition to the information on the actual test, the registry contains some
personal information about the child. Age of the child and date of the blood test are
included, which allow the data to be separated by year and age. The race of the child is
recorded as well as whether or not the child is covered by Medicaid. The test is required
for all children covered by Medicaid, so such children constitute a majority of the

registry. Finally, the testing labs record the address of the child’s residence.


[Figure graphic: county maps by year (panels labeled 2003, 2004, and 2005 survive); legend classes 1%-5%, 6%-11%, 12%-16%, 17%-23%, 24%-49%]

Figure 5: Percentage of children under six years of age tested for lead. All test results
for Michigan counties and Detroit included.


Certain assumptions must be made when relying on data acquired from another
source rather than collected first hand. Besides the question of data entry and locational
accuracy, the proportion of Michigan children who were tested remains a
concern. In every year since the release of the 2000 US census, MDCH has listed the
percentage of children within each county and the city of Detroit who were tested during
that year (Figure 5). A general increase in the number of children tested can be seen
across the state. This is reﬂective of the increased state government pressure to eliminate
elevated BLL. But overall, there is no county where over 50% of the children were
tested.

Michigan State University researchers were able to examine the children’s test
results in this database. A grant was secured from the Centers for Disease Control for the
MSU team to work with the MDCH blood lead test results (Kaplowitz, Perlstadt, and
Post 2007). The researchers used the test data to create a regression model with a mix
of individual variables from the database and group variables from the US census. Some test
results were discarded in order to avoid complications from multiple samples of the same
child. For children who had been tested more than once, the highest test result was kept
and the others removed (Kaplowitz, Perlstadt, and Post 2007).
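
The duplicate-removal rule described above (keep the highest result for each child) can be sketched as follows. This is a minimal illustration, not the MSU team's actual procedure: the `Test` record layout and its field names are hypothetical, and the sample rows mirror the example values shown in Table 4.

```python
from collections import namedtuple

# Hypothetical record layout; field names are illustrative, not the registry schema.
Test = namedtuple("Test", ["child_id", "test_date", "test_type", "bll"])


def keep_highest_test(tests):
    """Return one test per child: the test with the highest BLL.

    When two tests for the same child tie, the one listed first is kept.
    """
    best = {}
    for test in tests:
        current = best.get(test.child_id)
        if current is None or test.bll > current.bll:
            best[test.child_id] = test
    return list(best.values())


# Sample rows taken from the Table 4 example.
tests = [
    Test("000001", "6/3/2004", "Capillary", 7),
    Test("000002", "6/6/2004", "Venous", 10),
    Test("000002", "9/17/2004", "Venous", 4),
]
deduplicated = keep_highest_test(tests)
```

With these rows, child 000002 keeps the 6/6/2004 result of 10 ug/dL and the later result of 4 ug/dL is dropped.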

The MSU research team found the geographic location of each child’s residence
through geocoding. The geocoding process uses a GIS vector data set of the streets
within Michigan to estimate the location of each child’s residence. The location of the
address point is determined by two factors. One is the location along the road segment,
estimated by using the address range of the segment as a guide to find the address point
location. The other is a perpendicular offset of the address point from the road
segment, which gives a more accurate estimate of the actual residence site. The process is subject to
error but is a commonly used method for GIS-based spatial analysis in health geography
(Zandbergen and Green 2007).
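
The two geocoding factors described above, interpolation along the address range and a perpendicular offset, can be sketched as below. This is a simplified illustration under stated assumptions, not the algorithm used by any particular GIS package: the function name, the straight-line segment, the projected coordinates, and the sample address range are all hypothetical, and the odd/even side-of-street logic of real geocoders is not modeled.

```python
import math


def geocode_on_segment(start, end, range_low, range_high, house_number, offset=0.0):
    """Estimate a point for house_number on a straight street segment.

    start, end: (x, y) segment endpoints in projected units (e.g. meters).
    range_low, range_high: house numbers assigned to start and end.
    offset: perpendicular distance from the centerline; a positive value
            moves the point to the left of the start -> end direction.
    """
    if range_high == range_low:
        fraction = 0.5  # degenerate range: fall back to the midpoint
    else:
        fraction = (house_number - range_low) / (range_high - range_low)
    fraction = min(1.0, max(0.0, fraction))  # clamp to the segment

    dx, dy = end[0] - start[0], end[1] - start[1]
    x = start[0] + fraction * dx
    y = start[1] + fraction * dy

    length = math.hypot(dx, dy)
    if length > 0 and offset:
        # The unit normal to the left of the direction of travel is (-dy, dx) / length.
        x += -dy / length * offset
        y += dx / length * offset
    return (x, y)


# Hypothetical segment running east for 100 m with address range 400-500.
point = geocode_on_segment((0.0, 0.0), (100.0, 0.0), 400, 500, 431, offset=10.0)
```

In this example house number 431 lands 31% of the way along the segment and 10 m to its left, which shows why the estimate depends on how evenly addresses are actually spaced along the street.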

Roughly two-thirds of the children in the MSU database were on Medicaid
(Kaplowitz, Perlstadt, and Post 2007). This number is much higher than the proportion
of children statewide on Medicaid. Because of the concerns over the sampling protocol, it
was decided that this thesis would focus exclusively on children covered by Medicaid.
Children who are on Medicaid are three times as likely to have elevated BLL as children
who are not enrolled (Kemper and Clark 2005c). Since two-thirds of the MSU database
is children on Medicaid, these children are more likely to represent the population on
Medicaid than the entire MSU database represents the general population. The
percentage of Michigan children who are enrolled in Medicaid is around 33% (American
Academy of Pediatrics 2003).

With approval from the MSU Human Research Protection Program (IRB # 07-
362), the MDCH blood lead database was made available for this thesis. The database
was imported into Microsoft Access in order to view descriptive statistics on the children
who have been tested. Summary statistics of this database are in Figure 6. The number of
children tested steadily increased through the years in the registry. There is an especially
large rise in the number of tests between 2003 and 2004 after the state government made
remediation of lead poisoning a higher priority (Task Force to Eliminate Childhood Lead
Poisoning 2004). Another trend is the steady decline in both the mean BLL in the
registry and the percentage of children whose BLL was elevated (above 10 ug/dL).
This decline would likely signal the effectiveness of the primary prevention programs and
remediation, but could also be a product of the increased number of tests. According to
Kemper (2005a), the number of children tested likely increased due to requirements by
daycare enrollment or early education programs. This might explain why the age of
children tested is older than what the CDC recommends.
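
The yearly summary statistics discussed above (count, mean BLL, standard deviation, and percentage elevated) could be computed along these lines. This is a sketch only: the thesis used Microsoft Access, the sample values are invented for illustration, and treating a result of exactly 10 ug/dL as elevated is an assumption based on the CDC definition quoted in the abstract.

```python
from statistics import mean, stdev

ELEVATED_THRESHOLD = 10  # ug/dL; results at or above this count as elevated (assumption)


def summarize(bll_values, threshold=ELEVATED_THRESHOLD):
    """Return count, mean, sample standard deviation, and percent elevated."""
    count = len(bll_values)
    elevated = sum(1 for value in bll_values if value >= threshold)
    return {
        "count": count,
        "mean_bll": mean(bll_values),
        "std_dev": stdev(bll_values) if count > 1 else 0.0,
        "pct_elevated": 100.0 * elevated / count,
    }


# Invented integer test results keyed by year, for illustration only.
results_by_year = {
    2004: [1, 2, 3, 10],
    2005: [1, 1, 2, 4],
}
summary_by_year = {year: summarize(values) for year, values in results_by_year.items()}
```

Because the percentage elevated depends on who gets tested, a broader screening effort can lower it even if exposure is unchanged, which is the caveat raised in the paragraph above.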

The donut graphs show that there has been little change in characteristics of the
children tested between 1998 and 2005. Children on Medicaid are required to get tested
for lead before the age of two, or between three and five years of age if not previously
tested (Kemper et al. 2005a). Testing under the age of two is generally preferred because
children around the age of two tend to show the highest BLL (Ozden et al. 2004). In this
dataset, there does not seem to be a preference of testing for children under the age of
two. This could be further conﬁrmation that many tests occur later when the child enters
educational programs.

The second donut graph shows the proportion of children in the dataset who
received a venous test as opposed to a capillary (stick) test. The majority of tests in this
dataset, between 60 and 70 percent depending on the year, are venous blood tests. This is
encouraging for this research because the venous test is less affected by contamination of
the sample (Kemper, Bordley, and Downs 1998).


 

[Figure graphic: for each year, donut charts of "Years Old" and "Test Type" alongside the summary statistics below]

Year   Count    Mean BLL    Std Dev   % Elevated   Venous      Stick
1998   39,183   6.25        5.45      17.6         66%         34%
1999   36,961   5.38        4.89      12.7         68%         32%
2000   36,389   4.68        4.36      9.2          65%         35%
2001   48,002   4.62        4.38      8.7          64%         36%
2002   49,496   4.54        4.25      8.1          64%         36%
2003   45,965   3.81        3.76      5.1          67%         33%
2004   65,874   3.44        3.33      3.7          63%         37%
2005   76,118   illegible   3.36      illegible    illegible   illegible

Figure 6: Descriptive statistics of the thesis lead database. Note that elevated means
above 10 ug/dL and numbers are for Medicaid insured children.


The process of moving the database to a GIS data format began with importing
the MSU database into Microsoft Access (Figure 7). After non-Medicaid children were
removed, the new thesis database was divided into eight dBASE (.dbf) ﬁles containing
the test results for each year. The .dbf format was chosen because of the ease of moving
the tables into the GIS program ArcMap. The .dbf files were brought into ArcGIS in
order to geocode them.

[Figure graphic: MSU Database -> Remove Non-Medicaid -> Thesis Database -> Database by Year]

Figure 7: Migration of MSU database to GIS-utilizable .dbf format

A vector data set of Michigan based on the Michigan GeoRef projection was
downloaded from the Michigan Center for Geographic Information (MCGI;
www.michigan.gov/cgi). The GeoRef projection is preferred
when working with Michigan data because it accurately projects the entire state rather
than dividing it into sections (Michigan Department of Natural Resources 2001).
Latitude and longitude coordinates were used to locate the child’s address (Figure 8).
The result was eight point-based vector data sets representing each year with all of the
database information included.

[Figure graphic: Geocoded Points overlaid on Michigan GeoRef Vector State Boundaries]

Figure 8: The geographic coordinates were geocoded to a point vector data set through
use of the MCGI state boundary vector data set

2.1.2 United States Census

To supply the socio-demographic and economic variables for the regression
portion of this thesis, ASCII text data ﬁles from the 2000 US census were obtained. Each
summary file is available for download from the US census web site (www.census.gov).
The various tables can be linked to a variety of geographic divisions through the logical
record number. For this thesis, the regression analysis is limited to the geographic levels
used in previous spatial BLL studies. This includes census tract, ﬁve digit zip code, and
minor civil divisions.

The finest-scale geographic unit at which the Census Bureau aggregates data for
public use is the census block. A block is an areal unit contained within the surrounding
streets or a water body, similar to a city block (US Census Bureau 2000). Census blocks
are generally not used in medical geography because they include only raw population
counts, not socio-economic variables. Summary File 3 is not aggregated by the US
Census Bureau because of the small number of census long-form sample respondents
within a block. But census blocks provide the basis for every larger geographic unit.

The block group is a cluster of contiguous census blocks. The ﬁrst digit in the
three-digit census block number indicates the block group. Participation by a local
statistical committee is taken into account when forming block groups. Each block group
is contained entirely within a census tract. A census tract is a statistical subdivision
containing between 600 and 3,000 housing units that is delineated by a local committee of
data users (US Census Bureau 2000). Census tract boundaries follow permanent
geographic features such as streets, railroads, rivers, and canals. Tract boundaries are
geographically contained within individual counties and are designed to be as
homogenous as possible with respect to the characteristics of the population within them
(US Census Bureau 2000). The tract is a common unit of analysis in medical geography
and was used in this thesis.
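
The nesting of blocks within block groups within tracts described above can be illustrated with a short sketch. It assumes only what the text states, namely that the first digit of a block number identifies its block group; the tract identifier and block numbers below are hypothetical, and the function names are illustrative rather than part of any census software.

```python
from collections import defaultdict


def block_group_of(block_number):
    """Return the block group digit (the first digit) of a block number string."""
    if not block_number or not block_number.isdigit():
        raise ValueError(f"not a numeric block number: {block_number!r}")
    return block_number[0]


def blocks_by_group(tract_id, block_numbers):
    """Group the blocks of one tract under (tract, block group) keys."""
    groups = defaultdict(list)
    for number in block_numbers:
        groups[(tract_id, block_group_of(number))].append(number)
    return dict(groups)


# Hypothetical tract identifier and block numbers.
nested = blocks_by_group("0001.00", ["101", "102", "201"])
```

Because every block carries its block group in its own number and every block group sits inside one tract, data can be rolled up from fine to coarse units without splitting any unit.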

The ﬁnal two geographic units of analysis, ﬁve digit zip codes and minor civil
divisions, are based on federal and local government divisions. Zip codes are service
areas created by the United States Postal Service. The Census Bureau aggregated to this
unit of analysis for the ﬁrst time in 2000. This is an important unit of analysis in BLL
research because it is often used in testing standards of the CDC and subsequently
MDCH. Unlike any other spatial unit, the definition of minor civil divisions (MCD)
varies from state to state. In Michigan, MCD refers to townships and incorporated cities
(US Census Bureau 2000). MCD are often preferred as a unit of analysis because the size of
each enumerative unit remains fairly constant across the entire state. This is the case in
Michigan, where most townships are 36 square mile units created by the Public Land
Survey System.

Previous research has identiﬁed important variables for the prediction of elevated
BLL in children (Bailey, Sargent, and Blake 1998; Talbot, Forand, and Haley 1998;
Kaplowitz, Perlstadt, and Post 2007; Griffith et al. 1998; Haley and Talbot 2004;
Lanphear et al. 1998b; Litaker et al. 2000; Miranda, Dolinoy, and Overstreet 2002;
Sargent et al. 1997; Sargent et al. 1995). The matrices containing significant independent
variables noted in Tables 2 and 3 were downloaded from the census website into Microsoft
Access. From there, an identifier called the logical record number was used to link the
census data with the desired geographic unit. The output table was exported into a .dbf
file and joined in ArcMap to census-based vector data sets that were downloaded from
MCGI (Figure 9).
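
The linking step described above, matching census data rows to geographic units through the logical record number, can be sketched in plain Python. The thesis performed this step in Microsoft Access, so this is an illustration only: the dictionary keys `LOGRECNO`, `SUMLEV`, and `GEOID`, the summary-level code, the variable name `pct_pre1950_housing`, and all sample values are assumptions made for the example.

```python
TRACT_SUMMARY_LEVEL = "140"  # assumed summary-level code for census tracts


def join_by_logrecno(geo_rows, data_rows, summary_level):
    """Attach data variables to geographic records of one summary level.

    geo_rows: dicts with 'LOGRECNO', 'SUMLEV', and a geographic identifier.
    data_rows: dicts with 'LOGRECNO' plus one or more census variables.
    """
    data_by_logrecno = {row["LOGRECNO"]: row for row in data_rows}
    joined = []
    for geo in geo_rows:
        if geo["SUMLEV"] != summary_level:
            continue  # skip records for other geographic levels
        data = data_by_logrecno.get(geo["LOGRECNO"])
        if data is None:
            continue  # no data row for this geographic record
        merged = dict(geo)
        merged.update({key: value for key, value in data.items() if key != "LOGRECNO"})
        joined.append(merged)
    return joined


# Hypothetical geographic header and data rows.
geo_rows = [
    {"LOGRECNO": "0000101", "SUMLEV": "140", "GEOID": "tract-A"},
    {"LOGRECNO": "0000102", "SUMLEV": "140", "GEOID": "tract-B"},
    {"LOGRECNO": "0000200", "SUMLEV": "060", "GEOID": "mcd-A"},
]
data_rows = [
    {"LOGRECNO": "0000101", "pct_pre1950_housing": 42.0},
    {"LOGRECNO": "0000102", "pct_pre1950_housing": 17.5},
    {"LOGRECNO": "0000200", "pct_pre1950_housing": 30.1},
]
tract_table = join_by_logrecno(geo_rows, data_rows, TRACT_SUMMARY_LEVEL)
```

The resulting table has one row per tract with its identifier and variables, which is the shape needed before joining to a vector data set on the geographic identifier.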

[Figure graphic: Downloaded Census Database -> SQL Extraction of Regression Variables -> .dbf -> Join to Vector Data Sets (including MCD and Zip Codes) -> Regression]

Figure 9: Schemata of the transfer of census variables to vector data sets


2.2 Methods

2.2.1 Clustering

Each child’s geocoded address was used to ﬁnd areas where higher BLL values
cluster. Clustering techniques typically involve the division of the point dataset into
cases of disease and control cases representing the population at large. With elevated
BLL, the thresholds of lead representing a case of disease are vague and the current level
of 10 ug/dL has been the designation only since 1991 (Sargent et al. 1995). Disease-
clustering techniques seek to study point patterns in order to ﬁnd areas where the
likelihood of disease occurrence is greater than would be expected by chance. A variety
of methods are available to study point patterns of disease. This thesis employed three
methods, each of which revealed different characteristics of clusters. The Cuzick-Edwards
statistic reveals the occurrence and size of the clusters, the difference of K-functions
finds the distances at which elevated lead cases cluster relative to the background
population, and the Geographic Analysis Machine creates a visualization of the point
pattern (Waller and Gotway 2004; Wheeler 2007; Dockerty, Sharples, and Borman 1999;
Dolk et al. 1998; Openshaw et al. 1988).

This thesis sought to test the clustering of “cases” of lead poisoning at several
levels of ug/dL. The control points were children with a BLL test result of 1 ug/dL, the
lowest value in the database. These children represent a majority of the results and
provide a background population representing the spatial distribution of children on
Medicaid within the state. Several aspects of lead clustering were investigated, such as
the number of cases near each other, the distances at which cases cluster, and where these
clusters tend to occur. The link between these methods is that each displays the same
underlying pattern in a different way. Underlying both the neighbor method and the
distance method is one question: when controlling for how the population is spread, are
the cases of elevated BLL more likely to be near each other? The two methods express this
nearness in different ways. The neighbor method asks whether cases are more likely to be
neighbors of other cases than the background population would suggest, while the distance
method asks whether cases are closer to each other in distance than the background
population. The link between the two clustering significance tests and the mapping of the
clusters is not perfect. Questions can arise as to whether clusters that appear in the
neighbor and distance methods are displayed in the map. But mapping is necessary to give
clustering analysis any practical purpose. Without knowing the location of clusters of
elevated BLL, the exercise of testing for clustering is academic. The distance-based
clustering tests sketch a rough outline of how large the diameter of a cluster is. More
often than not, clear clusters present in the test methods show up at roughly the same
size on the maps.

The decision was made to look at possible clustering by individual year rather
than aggregating the results of all or several years together. There were two main reasons
for this decision. The first was to see whether the pattern of clustering or the size of
the clusters changed over time. Differences between years could reflect possible
effects of on-the-ground efforts for testing programs and remediation. The second reason
was a matter of computing time. The software required to perform the clustering analysis
cannot support a distance matrix of test results for all eight years in many parts of the
state.

The tens of thousands of data points for each year in the blood lead database
required that the Michigan study area be subdivided into sections for the clustering
analysis. This was carried out for two reasons. The first was computer processing
time. The number of data points created distance matrices too large to process in a timely
manner or at all. The second was the difference in scale between a cluster in an urban
area and a cluster in a rural area. In more urban areas, data points are close together,
often within a few yards of each other. In the rural areas of the state, data points
could be several miles apart.

The state was divided up initially by Health Systems Agencies (HSA). These
were areas defined in the 1970s for health care planning in Michigan (Firm 2007). The
boundaries followed county lines and divided the state into eight zones. Two of these
zones were too large to run the GAM analysis with the hardware available, so each was
divided into two. The Upper Peninsula HSA was divided into two pieces, an East and
West, based on a gap in the location of test results. The Bay HSA was divided into two
pieces based on the Shiawassee/Saginaw Rivers. Because the HSAs in southern
Michigan contained too many data points to analyze as single units, the large urban areas
were selected out by the Federal Aid Urban Boundary and analyzed separately. The
federal aid urban areas selected were Detroit, Flint, Saginaw/Bay City, Lansing, Battle
Creek, Grand Rapids, and Kalamazoo. The Detroit study region still had too many data
points for analysis, and was divided into North and South Detroit based on the Wayne
County border with Oakland and Macomb counties. In all, the state was divided into 19
sections (Figure 10), each of which, with the exception of South Detroit, had between
1,000 and 4,000 data points. The South Detroit study area typically had 18,000 to 24,000
data points per year.

[Figure 10 map of study areas labeled Western Upper Peninsula, Eastern Upper Peninsula,
West Bay, East Bay, Mid Coast, Saginaw & Bay City, Grand Rapids, Flint, Genesee, Lansing,
N. Detroit, S. Detroit, Kalamazoo, Battle Creek, Southwest, Mid South, Southeast]
Figure 10: Study areas identiﬁed for the clustering techniques. Areas based on HSA
boundaries are outlined with black and labeled in bold, while areas based on urban
boundaries are outlined in blue and labeled in italics

Nearest neighbor statistics look at where disease cases are located in relation to
other nearby cases as well as the general population. In terms of this thesis, the nearest
neighbor for each child is the nearest other child in the database. This is determined by
the straight-line distance between the two residences. A popular statistic called the
Cuzick-Edwards k-nearest neighbor statistic uses nearest neighbor relationships to
estimate the proximity of disease cases to each other (Waller and Gotway 2004). The basic
premise of the statistic is to count every instance where the nearest neighbor to a case
is another case. The case-case count can be expanded to several nearest residences. The
k-nearest neighbors equation is written as:

$$T_k = \sum_{i} \sum_{j} m_i \, m_j \, a_{ij}$$

Equation 1: Cuzick-Edwards test statistic

where k is the number of nearest neighbors allowed for each case, $m_i$ is the case
indicator for the child in question, $m_j$ is the case indicator for every other child,
and $a_{ij}$ is an indicator variable equal to one when j is among the k nearest neighbors
of i (Waller and Gotway 2004). If i and j are cases, then $m_i$ and $m_j$ equal one. All
three variables have to equal one to add to the final result. An example is shown below in
figure 11, where there are four instances where the nearest neighbor to a case was another
case.

[Figure 11 diagram. Legend: Cases, Controls, Nearest Neighbor]

Figure 11: Example of Cuzick-Edwards statistic based on one nearest neighbor

A random labeling hypothesis can be used to test the significance of the k-nearest
neighbor result (Wheeler 2007; Waller and Gotway 2004). Each child’s residence is
randomly labeled as a case or control in the same proportion as the actual data. The
results of the random simulations form a reference distribution of test statistics, and
the rank of the actual test result within that distribution permits the calculation of a
p-value. Many k values of nearest neighbors are used to find whether clusters occur in
small (one or two neighbors) or large groups (ten or above). The Bonferroni-adjusted
p-value, obtained by multiplying the minimum p-value by the number of tests, is used to
test clustering across all k values (Wheeler 2007).
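
The counting and random labeling steps described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the ClusterSeer
implementation: the function names, the toy coordinates, and the tie-breaking rule
(lower index first) are all hypothetical, and the p-value uses the common
(rank + 1) / (simulations + 1) convention.

```python
import math
import random

def k_nearest(points, k):
    """For each point i, return the indices of its k nearest other points."""
    neighbors = []
    for i, (xi, yi) in enumerate(points):
        dists = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(points) if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

def cuzick_edwards(labels, neighbors):
    """T_k: number of (i, j) pairs where i and j are cases and j is a k-nearest neighbor of i."""
    return sum(labels[i] * labels[j] for i, nbrs in enumerate(neighbors) for j in nbrs)

def random_labeling_p(labels, neighbors, n_sim=999, seed=0):
    """Monte Carlo p-value: shuffle the labels, keeping the number of cases fixed."""
    rng = random.Random(seed)
    observed = cuzick_edwards(labels, neighbors)
    shuffled = list(labels)
    at_least = 0
    for _ in range(n_sim):
        rng.shuffle(shuffled)
        if cuzick_edwards(shuffled, neighbors) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_sim + 1)

def bonferroni(p_values):
    """Minimum p-value multiplied by the number of tests, capped at 1."""
    return min(1.0, len(p_values) * min(p_values))

# Hypothetical toy data: four cases packed together, six scattered controls.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (9, 0), (0, 9), (9, 9), (5, 0), (0, 5)]
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = case, 0 = control

p_by_k = []
for k in (1, 2, 3):
    t_k, p_k = random_labeling_p(labels, k_nearest(points, k))
    p_by_k.append(p_k)
adjusted_p = bonferroni(p_by_k)
```

With one nearest neighbor, every one of the four toy cases has another case as its
nearest neighbor, mirroring the count of four in the figure 11 example.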

The Cuzick-Edwards statistic has been used for both environmental and animal-borne
diseases. Dockerty et al. (1999) used the statistic to study clustering of childhood
leukemia and lymphoma in New Zealand. The results showed no significant clustering in
any age group or nearest neighbor value (Dockerty, Sharples, and Borman 1999).
Wheeler (2007), who studied childhood leukemia in Ohio, looked at possible clustering
of leukemia cases versus the background child population in the state. He found no
significant clustering at any level of k, meaning that there is no evidence that childhood
leukemia cases are geographically dependent (Wheeler 2007).

The software program ClusterSeer™ was used to conduct the Cuzick-Edwards
statistic tests. ClusterSeer is a computer package designed to study spatial and temporal
clusters of disease (Wheeler 2007). Case/control boundaries of 5, 10, and 25 ug/dL were
tested. The statistic was calculated for k values of 1 through 20. To determine whether
the Cuzick-Edwards statistics were significant, 999 Monte Carlo simulations were run.

The main drawback of nearest-neighbor statistics is that they do not take distance
into account. The nearest neighbor to an event may be far away and therefore less likely
to be related. The difference of K-functions seeks to ﬁnd at what distances cases of
disease cluster (Waller and Gotway 2004). The statistic is based on Ripley’s K, a
common point pattern analysis tool. The Ripley’s K function is often used in health
studies to ﬁnd spatial dependence between individual points at different spatial scales.

The basic formula for Ripley’s K is:

$$\hat{K}(h) = \frac{R}{n^2} \sum_{i=1}^{n} \sum_{j=1,\, i \neq j}^{n} \frac{I_h(d_{ij})}{w_{ij}}$$

 

Equation 2: Equation for Ripley’s K

where R is the area of the region of interest, which contains n cases. On the right side
of the equation, $d_{ij}$ is the distance between point i and the surrounding point j, and
$I_h$ is an indicator variable equal to 1 if j is within distance h of i; otherwise it
equals zero (McKnight 2006). $w_{ij}$ refers to the proportion of the circle around point
i which falls within the study area (Waller and Gotway 2004). Ripley’s K works by placing
a series of concentric circles of increasing radii around each disease event and counting
events within each circle. If the number of disease events within the circle is greater
than what would be expected based on the number of total events and the size of the study
area, that spatial scale is considered clustered. An example of Ripley’s K can be seen in
figure 12.

 

Figure 12: Ripley’s K function with circles of distance h around event i. Clustering of
events is present within four circles around event i.

The Ripley’s K results are typically compared on a graph with complete spatial
randomness patterns in order to find significant clustering or inhibition at different
spatial scales. With the childhood BLL data, it is not assumed that the underlying
distribution of children is spatially random because a majority of the population of
Michigan lives in metropolitan areas. The clusters of urban settlements within the state
make the Ripley’s K comparison against spatial randomness useless. Therefore, the
distribution of elevated BLL cases must be compared against the background pattern of
settlement within Michigan in order to tell if the results are noteworthy. The difference
of K-functions takes care of this by taking the difference between the K results of the
primary pattern of cases and the secondary pattern of controls.

$$K_D(h) = K_{cases}(h) - K_{controls}(h)$$

Equation 3: Difference of K

The control pattern is assumed to represent the underlying population from which
the cases of disease are picked. The difference of K-functions can reveal spatial scales
where disease cases tend to cluster more than the population from which they are drawn.
If the difference between the two K-functions is zero, the cases of disease are random
within the background population. With a positive difference between the K-functions,
the cases are clustered together at that spatial scale, while a negative difference
indicates dispersion of the cases. A random labeling simulation can be used to test for
significance (Waller and Gotway 2004). Each point within the dataset is randomly assigned
as a case or control based on the proportion of each label in the original dataset. The
simulation results form a reference distribution at each distance, which can be used to
create an envelope of results. The true difference of K results can be compared to this
envelope to determine significance.
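
A minimal Python sketch of the difference of K calculation and its random labeling
envelope follows. It is illustrative only and makes several simplifying assumptions:
the edge-correction weight is set to one for every pair, the study region is a
hypothetical 4 by 4 square, and the function names and toy points are invented for the
example.

```python
import math
import random

def ripley_k(points, area, distances):
    """Naive Ripley's K with no edge correction (every pair weight assumed to be 1)."""
    n = len(points)
    k_values = []
    for h in distances:
        pairs = sum(
            1
            for i, (xi, yi) in enumerate(points)
            for j, (xj, yj) in enumerate(points)
            if i != j and math.hypot(xi - xj, yi - yj) <= h
        )
        k_values.append(area * pairs / (n * n))
    return k_values

def difference_of_k(cases, controls, area, distances):
    """K_cases(h) - K_controls(h) at each distance h."""
    k_cases = ripley_k(cases, area, distances)
    k_controls = ripley_k(controls, area, distances)
    return [kc - kn for kc, kn in zip(k_cases, k_controls)]

def random_labeling_envelope(cases, controls, area, distances, n_sim=99, seed=0):
    """Minimum and maximum simulated difference of K under random relabeling."""
    rng = random.Random(seed)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    lower = [math.inf] * len(distances)
    upper = [-math.inf] * len(distances)
    for _ in range(n_sim):
        rng.shuffle(pooled)
        sim = difference_of_k(pooled[:n_cases], pooled[n_cases:], area, distances)
        lower = [min(lo, s) for lo, s in zip(lower, sim)]
        upper = [max(hi, s) for hi, s in zip(upper, sim)]
    return lower, upper

# Hypothetical toy data: controls on a regular grid, cases packed near (2.5, 2.5).
controls = [(float(x), float(y)) for x in range(5) for y in range(5)]
cases = [(2.5, 2.5), (2.6, 2.5), (2.4, 2.5), (2.5, 2.6), (2.5, 2.4)]
area = 16.0
distances = [0.5, 1.0, 1.5]

observed = difference_of_k(cases, controls, area, distances)
lower, upper = random_labeling_envelope(cases, controls, area, distances)
# Distances at which the observed difference lies above the simulation envelope.
clustered_at = [h for h, d, hi in zip(distances, observed, upper) if d > hi]
```

Distances listed in `clustered_at` are those where the observed difference lies above
every simulated value, which is how the envelope comparison is read.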

Difference of K analysis has been used in geographical studies in both the human
health and veterinary fields. Dolk et al. (1998) used the difference of K function to look
at congenital diseases related to pesticide use. Difference of K functions showed a lack
of localized clustering in cases, leading the authors to conclude that there is little
geographic variation (Dolk et al. 1998). Another study that looked at biologically similar
cancers in dogs and humans in Michigan showed a strong dependence between dog and
human cancer, indicating that for certain types of cancer one may be used as a proxy for
the other (O'Brien et al. 2000). Foley et al. (2001) also looked at dogs and the spatial
distribution of a certain tick-borne disease. Results showed that the dogs with the disease
were significantly more spatially clustered than the dog population at large (Foley,
Foley, and Madigan 2001). Finally, Prince et al. (2001) studied a liver disease with
unknown environmental risks using the difference of K method. A high amount of
clustering was found at nearly all distances, leading the researchers to conclude that
there was a strong link between the disease and local environmental conditions (Prince et
al. 2001).

The difference of K functions analysis was performed in R, which is “an
integrated suite of software facilities for data manipulation, calculation, and graphical
display” (Venables and Smith 2008). This software is open source, command line-based,
and utilizes the S computer language. Individual library packages can be loaded into
the program in order to provide statistical functions within the R framework. Three
packages were used: splancs, spatstat, and maptools. Splancs and spatstat are packages
designed for spatial point pattern analysis, and maptools is a package for working with
geographical data that can handle the importation of vector data sets.

Using the maptools package, each yearly lead test results point data set was
imported into R. A vector data set representing the state boundary was also imported.
The point data were then converted into a data frame to create separate point features for
the cases and controls. As with the Cuzick-Edwards test, the case/control thresholds of
5, 10, and 25 ug/dL were used. Once the case and control point features were created,
the Ripley’s K values were computed on each feature using the khat function in the
package splancs. The distances specified for the concentric circles ranged from 0.5
kilometers to 10 kilometers, in increments of half a kilometer. These distances were
selected to strike a balance between urban and rural study areas. The output
of this function is a graph showing how the Ripley’s K value changes with distance. For
each year and case/control threshold, the control K values were subtracted from the case
K values. Finally, to test for the significance of the difference of K values, the splancs
function Kenv.label was used to generate difference of K values from random labeling
simulations. The final result was a simulation envelope of the maximum and minimum
simulation-produced K values for comparison with the actual difference of K (Figure 13).


Figure 13: Method for obtaining difference of K values for each year at case/control
thresholds of 5, 10, and 25 ug/dL.

[Figure 13 flowchart: Import into R (Test Results Point Vector Data Set, State Boundary
Vector Data Set) -> Michigan Test Results Data Frame -> Create Separate Case and Control
Data Frames -> Case Data Frame, Control Data Frame -> Run Ripley's K Function -> Subtract
Controls from Cases -> Run Random Labeling Simulations]

Geographic Analysis Machine (GAM) is a technique created by Stan Openshaw
at the University of Leeds in 1987 to study childhood leukemia clusters (Openshaw et al.
1988). It is a computationally expensive, but well used, exploratory analysis technique.
The method begins by overlaying a fine mesh grid over an entire study area.
Each mesh point of the grid is the center point of a series of concentric circles that
overlap each other (Openshaw et al. 1988). The GAM algorithm counts the number of
cases and controls within each circle and determines significance either through a random
labeling simulation or a Poisson distribution (Waller and Gotway 2004). In a random
labeling simulation, if the observed value of disease counts within the circle is higher
than the results from random labeling, the circle is drawn on a map. The Poisson test
involves using the percentage of cases to total points as the mean of the distribution. The
probability of observing the number of observed cases in each circle is calculated, and
circles that pass a significance threshold are retained for the map. The final map usually
features many overlapping circles of varying sizes. To make the pattern easier to
interpret, a kernel-smoothing technique can be used. The final result of this process is a
map showing hotspots within the study region. These hotspots look like large, brightly
colored blotches that define the area where cases of lead poisoning occur at a
significantly higher rate than the background population. The usefulness of this method
is that by converting the point pattern into an area-based hotspot map, the pattern of
elevated BLL can be cataloged and interpreted with easier comparison to the geographic
unit-based maps in regression analysis.

As with the difference of K function, GAM was run in R (Figure 14). The
analysis was accomplished with the R library “splancs,” which contains tools for spatial
point pattern analysis. First, the geocoded locations and Michigan boundary files were
imported into R. For each case-control threshold, the background rate used is the local
ratio of cases to controls across all years. To find clusters of cases, a grid of
points 1 kilometer apart within the Michigan border was created. The distances between
the grid points and the geocoded address of each child were calculated with a Euclidean
distance function and placed in a distance matrix. If the number of cases relative to
controls within 1.8 kilometers of a grid point had less than a 5% probability of arising
by chance under the Poisson distribution, the grid point was marked as having a
significantly high number of cases. For better visibility of the resulting pattern, a
kernel-smoothing process was used to create the final maps.
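
The grid-and-circle test can be sketched as follows. This is a simplified reading of the
procedure rather than the R code used in the thesis: it assumes the expected case count
in a circle is the background share of cases multiplied by the number of tested children
in that circle, uses a single 1.8 unit radius, and omits the kernel smoothing step. All
names and the toy data are hypothetical.

```python
import math

def poisson_upper_tail(observed, mean):
    """P(X >= observed) for X ~ Poisson(mean)."""
    if observed <= 0:
        return 1.0
    cdf = 0.0
    term = math.exp(-mean)  # P(X = 0)
    for x in range(observed):
        cdf += term
        term *= mean / (x + 1)  # P(X = x + 1) from P(X = x)
    return max(0.0, 1.0 - cdf)

def gam_significant_points(cases, controls, grid, radius, alpha=0.05):
    """Return grid points whose circle holds significantly more cases than expected."""
    rate = len(cases) / (len(cases) + len(controls))  # assumed background rate
    flagged = []
    for gx, gy in grid:
        n_cases = sum(math.hypot(gx - x, gy - y) <= radius for x, y in cases)
        n_controls = sum(math.hypot(gx - x, gy - y) <= radius for x, y in controls)
        n_all = n_cases + n_controls
        if n_all == 0:
            continue  # no tested children near this grid point
        expected = rate * n_all
        if poisson_upper_tail(n_cases, expected) < alpha:
            flagged.append((gx, gy))
    return flagged

# Hypothetical toy data: controls on a 10 x 10 lattice, ten cases packed near (2, 2).
controls = [(float(x), float(y)) for x in range(10) for y in range(10)]
cases = [(2.0 + 0.05 * i, 2.0 + 0.05 * (i % 3)) for i in range(10)]
grid = [(x, y) for x in range(10) for y in range(10)]

flagged = gam_significant_points(cases, controls, grid, radius=1.8)
```

In this toy example only grid points close to the packed cases are flagged. A circle
with no cases at all is never flagged, because the upper-tail probability of observing
zero or more cases is one.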


Figure 14: Method in R for creating GAM maps.

[Figure 14 flowchart. Boxes include: Import into R (Test Results Point Vector Data Set,
Area Boundary Vector Data Set); Area Test Results Data Frame; Create Case Data Frame;
Create Control Data Frame; Map to Area; 1 kilometer grid points; Find grid points with a
significant case/control ratio within 1.8 km; Results Table; Case/Control Points; Run
Kernel Smoothing; Final Map]

2.2.2 Geographically Weighted Regression

Regression models are commonly used in medical geography in order to ﬁnd
explanations for the spatial patterns of disease (Nakaya et al. 2005). Global linear
regression models such as Ordinary Least Squares (OLS) are popular for their ability to

offer insight into the variations in the data. The basic model is:

$$Y = \beta_0 + \sum_{k=1}^{p} \beta_k X_k + \varepsilon$$

Equation 4: OLS regression model

where Y is the dependent variable, $X_k$ are the independent variables, $\beta_k$ are the
regression coefficients, and $\varepsilon$ is the error term (Huang and Leung 2002). The
regression coefficients are calculated in matrix form:

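$$\hat{\beta} = (X^T X)^{-1} X^T Y$$

where

$$X = \begin{bmatrix} 1 & X_{11} & \cdots & X_{1p} \\ 1 & X_{21} & \cdots & X_{2p} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n1} & \cdots & X_{np} \end{bmatrix}, \quad Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \quad \hat{\beta} = \begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \\ \vdots \\ \hat{\beta}_p \end{bmatrix}$$

Equation 5: Matrix calculation of the OLS coefficients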

 

 

The X matrix is composed of the independent variable values as well as a column
of 1 values to stand in for the intercept (O'Sullivan and Unwin 2003). The $X^T$ matrix is
the transpose of the X matrix. The Y matrix is made up of the values of the dependent
variable.
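
As a small worked illustration of the matrix calculation, consider three hypothetical
observations with one independent variable, X values of 0, 1, and 2 and Y values of 1, 3,
and 5. These numbers are invented for the example and are not taken from the thesis data.

```latex
\[
X = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}, \quad
Y = \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix}
\]
\[
X^T X = \begin{bmatrix} 3 & 3 \\ 3 & 5 \end{bmatrix}, \quad
X^T Y = \begin{bmatrix} 9 \\ 13 \end{bmatrix}, \quad
(X^T X)^{-1} = \frac{1}{6} \begin{bmatrix} 5 & -3 \\ -3 & 3 \end{bmatrix}
\]
\[
\hat{\beta} = (X^T X)^{-1} X^T Y
= \frac{1}{6} \begin{bmatrix} 5 \cdot 9 - 3 \cdot 13 \\ -3 \cdot 9 + 3 \cdot 13 \end{bmatrix}
= \begin{bmatrix} 1 \\ 2 \end{bmatrix}
\]
```

The result is an intercept of 1 and a slope of 2, which reproduces the three points
exactly because they lie on a straight line.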

While the OLS method is extremely popular, researchers interested in the
geographic dimension of regression analysis have been looking into other options. The
main problem with OLS regression is that spatial homogeneity (i.e., variable coefficients
are constant across space) is assumed to be valid. This runs counter to much research
within the social sciences, which observes that most social processes are not stationary
(Fotheringham, Brunsdon, and Charlton 2002). In global regression models, space can
only be explored through the residuals of each observation, but the variable within the
model responsible for the error remains unclear. The spatial pattern of the residuals can
reveal spatial autocorrelation, meaning the errors are not independent and the model
systematically fails across space.

With the static nature of global regression illustrated, new methods have been
devised to bring geographic location into regression modeling. Some methods, such as
spatial lag or spatial error models, keep the global framework and bring geography into
the equation as another independent variable. A new method that is becoming
increasingly popular is Geographically Weighted Regression (GWR). The roots of GWR
lie in the growing ﬁeld of local spatial statistics (Fotheringham, Brunsdon, and Charlton
2002). It is based on the idea that each location is unique, and different processes occur
in different areas (Shearmur et al. 2007). GWR breaks down global regression so the
changes in model coefﬁcients and predictive power can be analyzed for each geographic
unit. Coefﬁcients for each location are estimated by a weighted least squares regression

equation (Leung, Mei, and Zhang 2000). The basic equation is:

$$Y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i) X_{ik} + \varepsilon_i$$

Equation 6: Geographically Weighted Regression model

where i is the geographic unit and $u_i$ and $v_i$ are its coordinates. The matrix
calculation of GWR is similar to OLS except that a diagonal weight matrix is included.

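$$\hat{\beta}(i) = (X^T W(i) X)^{-1} X^T W(i) Y$$

where

$$W(i) = \begin{bmatrix} w_{i1} & 0 & \cdots & 0 \\ 0 & w_{i2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & w_{iN} \end{bmatrix}$$

Equation 7: Matrix calculation of GWR coefficients for location i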

 

 

The diagonal matrix gives weights to each other location as they relate to location
i. GWR has several different weighting functions, all of which are based on the
geographic axiom that nearby locations exert more inﬂuence than distant locations
(Fotheringham, Brunsdon, and Charlton 2002). The most commonly used weighting
function is ﬁxed distance and based on a Gaussian curve:

$$w_{ij} = e^{-0.5 \, (d_{ij} / \beta)^2}$$

Equation 8: Fixed weighting scheme based on Gaussian curve

where $d_{ij}$ is the distance from location i to location j and $\beta$ is the bandwidth
of the Gaussian curve (Huang and Leung 2002). For polygon features, the distance is
measured between the centroids of the area features. This weighting scheme has the same
fixed bandwidth for each observation point i. The bandwidth controls how quickly the
weight given to a location decays with its distance from i. The choice of bandwidth can be
arbitrary, but a common method of selecting the bandwidth is to minimize the residual sum
of squares for all data points:

$$\sum_{i=1}^{N} \left[ Y_i - Y^*_{\neq i}(\beta) \right]^2$$

Equation 9: Sum of squares method to determine the bandwidth

where $Y^*_{\neq i}(\beta)$ is the fitted value of $Y_i$ when the bandwidth $\beta$ is
used and location i is left out of the fit. The bandwidth that produces the lowest sum of
squares is used in the GWR weighting function. Location i is excluded because, if the
bandwidth is small, it would overpower all other observations, and the estimates would
fluctuate wildly and be of little value (Fotheringham, Brunsdon, and Charlton 2002). GWR
can also use an adaptive bandwidth, where the size of the bandwidth of the Gaussian
weighting curve at point i depends in part on the density of data points within the nearby
area. This method is useful in study regions where the density of data points varies
across space (Fotheringham, Brunsdon, and Charlton 2002). This thesis chose to use the
fixed bandwidth exclusively after the final results showed no difference between the two.
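
The local fitting and bandwidth selection steps can be illustrated with a short Python
sketch. It is not the spgwr implementation: it assumes a Gaussian weight of the form
exp(-0.5 (d / bandwidth)^2), fits each location by weighted least squares, and scores a
bandwidth by leaving each location out of its own fit. The function names, the toy
locations, and the candidate bandwidths are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def gaussian_weights(coords, i, bandwidth):
    """Assumed kernel: w_ij = exp(-0.5 * (d_ij / bandwidth)^2)."""
    ui, vi = coords[i]
    return [math.exp(-0.5 * (math.hypot(ui - u, vi - v) / bandwidth) ** 2)
            for u, v in coords]

def local_fit(x_rows, y, weights):
    """Weighted least squares (X^T W X)^-1 X^T W Y; a column of ones is added to X."""
    design = [[1.0] + list(row) for row in x_rows]
    p = len(design[0])
    xtwx = [[sum(w * r[a] * r[b] for w, r in zip(weights, design)) for b in range(p)]
            for a in range(p)]
    xtwy = [sum(w * r[a] * yi for w, r, yi in zip(weights, design, y)) for a in range(p)]
    return solve(xtwx, xtwy)

def gwr(coords, x_rows, y, bandwidth):
    """One coefficient vector [intercept, slope, ...] per location."""
    return [local_fit(x_rows, y, gaussian_weights(coords, i, bandwidth))
            for i in range(len(y))]

def cv_score(coords, x_rows, y, bandwidth):
    """Leave-one-out sum of squares: location i gets zero weight in its own fit."""
    total = 0.0
    for i in range(len(y)):
        w = gaussian_weights(coords, i, bandwidth)
        w[i] = 0.0
        coef = local_fit(x_rows, y, w)
        pred = coef[0] + sum(c * x for c, x in zip(coef[1:], x_rows[i]))
        total += (y[i] - pred) ** 2
    return total

# Hypothetical toy data: ten locations on a line; the slope is 1 in the west, 3 in the east.
coords = [(float(i), 0.0) for i in range(10)]
x_rows = [[float(i % 3 + 1)] for i in range(10)]
y = [(1.0 if i < 5 else 3.0) * x_rows[i][0] for i in range(10)]

local_coefs = gwr(coords, x_rows, y, bandwidth=0.5)
best_bandwidth = min([0.5, 1.0, 2.0, 4.0], key=lambda b: cv_score(coords, x_rows, y, b))
```

With a narrow bandwidth the local slope is close to 1 at the western end and close to 3
at the eastern end, while a very wide bandwidth collapses every local fit toward the
single global OLS slope.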

The biggest advantage of GWR is that it can model spatial non-stationarity, which
is important when using a large and diverse study area such as the entire state of
Michigan (Shearmur et al. 2007). Localized parameters allow visualization of how well
each variable and the whole model work across space. Another advantage of GWR is
that the results can be visualized through the use of GIS. Unlike the parameters of OLS
regression, which focus on similarity throughout the study area, the results of GWR can
only be easily understood through the use of maps (Fotheringham, Brunsdon, and Charlton
2002). GWR is less prone, though not immune, to spatial autocorrelation in the residuals.

Leung et al. (2000) developed a test statistic, similar to the F-test, which reveals
whether the GWR model works better than the global model. It uses the F-distribution to
compare the residual sum of squares from the local GWR model to that of the global OLS
model. The formula is:

$$F_1 = \frac{RSS_g / \delta_1}{RSS_o / (n - p - 1)}$$

Equation 10: Leung test statistic

where $RSS_g$ is the residual sum of squares for the geographically weighted
regression model, $\delta_1$ is the degrees of freedom in the GWR model, $RSS_o$ is the
residual sum of squares in the OLS model, and $(n - p - 1)$ is the degrees of freedom in
the OLS model.

Ten US Census variables selected from tables 2 and 3 were used to create a GWR
model to explain the variation in elevated BLL. Each variable used had been identified
as a predictor of lead poisoning in a previous study:

1. Percentage pre-1940 housing - This variable is a measure of housing units within
a geographic area that were built before 1940. It has been used before because
housing built in that time period would certainly have originally had lead
paint (Haley and Talbot 2004).

2. Percentage African-American - The number of African-American residents
within a geographic unit has often been used as a predictor because minority
communities have historically suffered from lead poisoning to the greatest extent
(Griffith et al. 1998).

3. Percentage Latino - Similar to African-Americans, Latino residents have been
found to suffer from excess lead poisoning (Lanphear et al. 1998b).

4. Percentage recent immigrants - Immigrants to the United States may suffer from
lead poisoning due to exposure in their country of origin or from imported
products or cultural practices (Sargent et al. 1997).

5. Percentage under six years of age - If there is a greater pool of children available,
the chance of childhood lead exposure increases.

6. Percentage of rental housing - Children who live in rental housing are often at
higher risk of lead poisoning due to lack of disclosure and neglect from the
landlord.

7. Percentage of households headed by a female - Single parent households are often an
indicator of lower socio-economic status, thought to be a leading indicator of lead
poisoning (Sargent et al. 1995).

8. Percentage vacant housing - Areas with many housing units lying vacant are
thought to show signs of age and neglect (Bailey et al. 1994).

9. Percentage of residents without a high school diploma - Educational attainment is
thought to be significant because it is an indicator of socio-economic status
(Talbot, Forand, and Haley 1998).

10. Percentage below 185% of the poverty line - Lower income is believed to
correlate with lead poisoning, and 185% of the poverty line covers residents in
poverty as well as those in danger of falling into poverty (Kaplowitz, Perlstadt,
and Post 2007).

The first step involved taking the point datasets of the children’s addresses and
aggregating them to the same enumeration units as the census variables. This process
began by using the intersect tool in ArcGIS to code each child’s location with the
appropriate census tract, MCD, and zip code of their residence. Once all of the children’s
test results were coded, dbf files were exported into Microsoft Access. An SQL query
was used to compile the dependent variable, mean BLL, for each census unit. The query
result for each year was exported as a dbf file back into ArcGIS and joined to the census
vector data sets to create the final enumeration units on which to run the analysis.
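
The aggregation performed by the query amounts to a grouped mean. A minimal Python
equivalent is sketched below. The record layout, unit identifiers, and the SQL shown in
the comment are assumptions made for illustration, since the thesis does not reproduce
the actual query.

```python
from collections import defaultdict

def mean_bll_by_unit(records):
    """Average BLL per enumeration unit.

    `records` is an iterable of (unit_id, bll) pairs. This corresponds roughly to a
    query of the assumed form:
        SELECT unit_id, AVG(bll) FROM tests GROUP BY unit_id
    """
    totals = defaultdict(lambda: [0.0, 0])  # unit_id -> [sum of BLL, number of tests]
    for unit_id, bll in records:
        totals[unit_id][0] += bll
        totals[unit_id][1] += 1
    return {unit_id: total / count for unit_id, (total, count) in totals.items()}

# Hypothetical test results coded with a census unit identifier.
records = [("tract_A", 1.0), ("tract_A", 5.0), ("tract_B", 12.0)]
mean_bll = mean_bll_by_unit(records)
```

The same function can be applied once per year and per enumeration unit type (census
tract, MCD, or zip code) by changing which identifier is passed as `unit_id`.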

The three vector data sets containing the census data and aggregated lead data
were imported into R. The function “lm”, or linear model, was used to create global
regression models and eliminate variables in each areal unit that were not significant.
Once the significant (α = 0.05) variables for each US census level were established, the
resulting model was run on each individual year to study possible changes over time. For
the GWR portion of the thesis, the R library “spgwr” was used. A Gaussian weighting
scheme was used for weighting all other location values with relation to each location i,
with the bandwidth calculated for each census unit by minimizing the sum of squares. The
results were exported from R as a text file and joined with ArcMap vector data sets for
visualization.

3 Results

3.1 Clustering Results

The purpose of testing for clustering of disease is to determine if pockets of cases
are spatially arranged in a manner that would not have occurred from random chance.
Clustering analysis in this thesis used three different techniques. The ﬁrst was the
Cuzick-Edwards statistic. This approach looked at the size of clusters through the
relationship of cases to other nearby blood test addresses. The second technique was the
difference of K method. It functioned by ﬁnding the Ripley’s K value for cases of
elevated BLL in a study area as well as the Ripley’s K value for the background or
control child population. The difference of K value is the result of subtracting the K
value from the control population from the K value of the cases of elevated BLL. The
ﬁnal method is the Geographic Analysis Machine (GAM). This is a visualization tool
used to ﬁnd “hotspots” where cases of disease cluster.

Due to the size of Michigan and the enormous amount of test data, the state was
divided into 19 study areas for the cluster analysis. Rural areas were represented by the
Health Systems Agencies (HSA). Two of these districts had to be divided into two pieces
because the land area was too large for the GAM analysis. The Bay HSA was divided
into East and West along the Saginaw/Shiawassee Rivers, while the Upper Peninsula
HSA was divided along the border between Luce/Mackinac and Alger/Schoolcraft Counties.
One urban area was broken into two study areas in order to cut down on processing time.
The Detroit Federal Aid Urban Boundary was divided into two study regions
along the Wayne County border with Oakland and Macomb Counties.

The results of the clustering analyses followed a similar pattern across different
study areas. With the Cuzick-Edwards tests, the 5 ug/dL level often exhibited clustering.
This was particularly true in the urban areas, but often extended to less populated parts of
the state. The 10 ug/dL cutoff exhibited more variability across the state. In the larger
urban areas, a high degree of clustering among cases was present. This persisted
through all years in the lead database. In smaller cities, clusters of cases of 10 ug/dL
and above were smaller and more common in the earlier years covered by the study. In
more rural areas of the state, the low number of cases resulted in clustering being much
less common. At the 25 ug/dL case level, only the large urban areas showed any signs of
clustering. Other study areas typically did not have enough cases at the 25 ug/dL level.

The difference of K results generally agreed with the Cuzick-Edwards findings.
In interpreting difference of K graphs, clustering is noted when the K values at any
distance are above the simulation envelope of random labeling test results. At the 5
ug/dL level, in urban areas the K value rises above the simulation envelopes immediately
and remains above them for the entire 10 kilometer distance tested. In the smaller, mid-sized
city study areas, the K values sometimes drop back down to zero at greater distances due to
the edge effects caused by the small study area size. In the larger HSA study areas,
results are mixed depending on whether there is a central city within the study area.
Clustering is only present at the 25 ug/dL level in the largest cities.
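The reading of a difference of K graph described above can be sketched in Python as follows. This is a simplified illustration rather than the code in the appendices: the K estimate ignores edge correction, the envelope is taken as the minimum and maximum over the relabelings, and the function names, the study area argument, and the number of simulations are all assumptions.

```python
import math
import random


def ripley_k(points, distance, area):
    """Naive Ripley's K estimate at one distance, without edge correction."""
    n = len(points)
    if n < 2:
        return 0.0
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= distance
    )
    return area * pairs / (n * (n - 1))


def difference_of_k(cases, controls, distance, area):
    """K of the cases minus K of the controls at one distance."""
    return ripley_k(cases, distance, area) - ripley_k(controls, distance, area)


def random_labeling_envelope(cases, controls, distance, area, n_sim=99, seed=0):
    """Lowest and highest difference of K seen when case labels are shuffled."""
    rng = random.Random(seed)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(pooled)  # locations stay fixed, labels are reassigned
        sims.append(
            difference_of_k(pooled[:n_cases], pooled[n_cases:], distance, area)
        )
    return min(sims), max(sims)
```

An observed difference of K that lies above the upper value returned by `random_labeling_envelope` at a given distance corresponds to the graphs in which the K values rise above the simulation envelope.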

The GAM maps were used in this thesis to determine the spatial location of
clusters of elevated BLL cases. Rather than being a significance test of clustering, GAM
is a visualization technique that finds hotspots of likely clustering. In urban areas with
many test cases, GAM provided good results of where the hot spots of elevated BLL
cases were located. GAM worked fairly well in areas where there were strong clusters
consistently through time. This method did not work as well in the rural areas. Since
significance values were locally based, one elevated BLL case could be considered a
cluster in a rural area because of the lack of cases overall.
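The rural behavior described above can be illustrated with a small GAM-style scan in Python. This is only a sketch under stated assumptions: it tests each circle with a Poisson upper-tail probability against the study-wide case rate, whereas the GAM code used for the maps (Appendix 3) may compute significance differently. The function names, the circle centers, the radius, and the 0.01 cutoff are hypothetical.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for a Poisson variable with the given mean."""
    if observed <= 0:
        return 1.0
    below = sum(
        math.exp(-expected) * expected ** i / math.factorial(i)
        for i in range(observed)
    )
    return max(0.0, 1.0 - below)


def gam_scan(cases, controls, centers, radius, alpha=0.01):
    """Flag circles whose case count is unusually high for the children tested."""
    tested = list(cases) + list(controls)
    overall_rate = len(cases) / len(tested)
    flagged = []
    for center in centers:
        n_tested = sum(1 for p in tested if math.dist(p, center) <= radius)
        n_cases = sum(1 for p in cases if math.dist(p, center) <= radius)
        if n_cases == 0:
            continue  # a circle without cases cannot be a hotspot
        expected = overall_rate * n_tested
        p_value = poisson_upper_tail(n_cases, expected)
        if p_value <= alpha:
            flagged.append((center, n_cases, n_tested, p_value))
    return flagged
```

Under these assumptions, a circle that contains a single case and very few tested children has a very small expected count, so `poisson_upper_tail(1, expected)` can fall below the cutoff. That is one way a lone case can appear as a hotspot where few children are tested.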

This section of results covering clustering techniques is presented by individual
study area. Key points and diagrams are shown. Tables are used to display the Cuzick-
Edwards results. Years that have a significant (α = 0.05) Bonferroni p-value for all k
levels are highlighted in orange. The number under each k value is the Cuzick-Edwards
value, or the number of neighbor connections at that level. Cuzick-Edwards test statistics
that are significantly higher than the previous k level values are highlighted in orange.
For the difference of K and GAM analysis, figures of individual years were chosen which
best represented the overall pattern in the study area. The code used to create the graphs
and maps is available in Appendices 2 and 3. In this section, the 5 ug/dL threshold refers
to the tests where 5 ug/dL was the cutoff between cases of elevated BLL and the control
population of unaffected children. This phrasing is repeated for 10 and 25 ug/dL.
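The two conventions in the paragraph above, the threshold cutoff and the Bonferroni p-value across k levels, can be sketched in Python. The Bonferroni form shown here (the smallest per-k p-value multiplied by the number of k levels, capped at 1) is one common form and is assumed rather than taken from the thesis software. The function names and sample values are hypothetical.

```python
def label_cases(blood_lead_levels, threshold):
    """Mark a test as a case when its BLL (ug/dL) is at or above the threshold."""
    return [level >= threshold for level in blood_lead_levels]


def bonferroni_p(p_values_by_k):
    """Combine the per-k p-values into one p-value for clustering across all k."""
    return min(1.0, min(p_values_by_k) * len(p_values_by_k))


def significant_across_k(p_values_by_k, alpha=0.05):
    """True when the combined p-value is at or below the chosen alpha."""
    return bonferroni_p(p_values_by_k) <= alpha
```

With this convention a year is flagged when even the corrected smallest p-value stays at or below 0.05, so a single strong k level can carry the overall result.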

3.1.1 South Detroit

The region of South Detroit in this thesis represents the Detroit Federal Urban Aid
Boundary area south of the northern boundary of Wayne County (Figure 15). This area
includes the cities of Detroit, Dearborn, Grosse Pointe, and others in Wayne County. It is
the most heavily populated area of the state and seems to have the most robust testing for
lead in children. The number of blood tests performed in this region, 15 to 20 thousand
each year, was at least three times higher than in any other part of the state.

 

Figure 15: Map of the South Detroit study region

The Cuzick-Edwards results reveal high levels of clustering across all years and
threshold levels (Table 5). At the 5 and 10 ug/dL threshold levels, Monte Carlo tests
reveal the total number of case-case nearest neighbors to be highly significant for every
k value. South Detroit was also the only area of the state that had a large number of
children with BLL at or above 25 ug/dL. The South Detroit study area is the only region
of the state where the Bonferroni p-value, an indication of clustering across all k values,
is significant in all years in the database for the 25 ug/dL threshold.

[Table 5 panels are labeled by threshold ("5 Threshold" and "25 Threshold" are legible); cell values are not recoverable.]

Table 5: Cuzick-Edwards results for South Detroit

The difference of K graphs for the South Detroit region show a very high degree
of spatial clustering of elevated BLL cases. The K values for each threshold level
continue to rise even as the distance increases. This is unlike any other region of the
state, and would seem to confirm that the spatial clusters of elevated BLL are quite large.
Because the K values fall well above the simulation envelopes created from random
labeling tests, the degree of clustering is significant. This can be seen in Figure 16. The
second graph in Figure 16 shows the difference of K values rising to as much as 18 times
the upper bound of the simulation envelope. There is no other study region where
the difference of K values rise immediately and continue to rise all the way to ten
kilometers. Since this occurs at all threshold levels, it is safe to say that this study region
has the largest cluster of lead poisoning victims in the state.
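The second graph in each difference of K figure plots the difference of K divided by the upper bound of the simulation envelope. A minimal Python sketch of that ratio, and of how a peak distance can be read from it, is given here. The function name, the returned keys, and the sample numbers in the usage note are hypothetical and are not values from this study.

```python
def summarize_envelope(distances, diff_k, upper_bound):
    """Summarize difference of K relative to the envelope's upper bound.

    Returns the distance and size of the largest ratio, plus every distance
    at which the difference of K lies above the envelope (ratio > 1).
    """
    if any(u <= 0 for u in upper_bound):
        raise ValueError("upper bound must be positive at every distance")
    ratios = [d / u for d, u in zip(diff_k, upper_bound)]
    peak = max(range(len(ratios)), key=lambda i: ratios[i])
    return {
        "peak_distance": distances[peak],
        "peak_ratio": ratios[peak],
        "distances_above": [
            dist for dist, r in zip(distances, ratios) if r > 1
        ],
    }
```

For example, with made-up values `summarize_envelope([2000, 4000, 6000], [3.0, 5.0, 2.0], [1.0, 2.0, 2.0])`, the largest ratio occurs at the first distance and the last distance is not counted as above the envelope, because its ratio is exactly 1.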

[Figure 16. Panel title: 2005 10 micrograms per deciliter. Horizontal axis: Distance (2000 to 10000). The second panel's vertical axis runs from 0 to 18.]

Figure 16: The 2005 South Detroit difference of K graph for the 10 ug/dL threshold

The GAM analysis reveals the spatial location of the clusters of elevated BLL to
be squarely within the city of Detroit. The level of intensity of the hotspots fades in later
years of the database, but the hotspots generally fall within the same areas of the city.
Figure 17 reveals the two main hotspots that showed up at all threshold levels. These two
regions are located to the east and the west of the downtown Detroit area. The western
hotspot extends towards the boundary with Dearborn and the eastern hotspot occupies the
eastern part of the city of Detroit.

  
 

Figure 17: The 2004 GAM map of South Detroit for the 5 ug/dL threshold

3.1.2 North Detroit

North Detroit covers the part of the Detroit Federal Urban Aid Boundary area that
falls within Oakland or Macomb Counties (Figure 18). The region contains many
suburbs of Detroit and covers a mostly developed landscape. This includes cities such as
Pontiac, Warren, St. Clair Shores, Novi, and others. The Detroit Federal Urban Aid
Boundary was divided along the county line due to the large differences in the number of
test results between North Detroit and South Detroit. North Detroit has far fewer test
results, 2 to 7 thousand per year, than South Detroit.

 

 

Figure 18: Map of the North Detroit study region

The Cuzick-Edwards results for North Detroit reveal a strong clustering pattern at
lower threshold levels and very little clustering at higher threshold levels. At the 5 ug/dL
threshold level, the total case-case neighbors run far ahead of the number expected at
every level of neighborhood. This pattern is consistent across all years (Table 6). There
is overall clustering at the 10 ug/dL threshold, but the clusters grow very slowly after the
k = 3 level. This suggests that the clusters of cases within North Detroit are smaller than
what was seen in South Detroit. At the very high 25 ug/dL threshold, the low number of
cases makes it difficult to find any consistency between the years. These very high cases
do seem to be near each other, but this does not always constitute a cluster.

 

Table 6: Cuzick-Edwards results for North Detroit

The difference of K graphs confirm the clustering within the North Detroit
region. The 5 ug/dL threshold shows the difference of K rising well above the
simulation envelope. At around five kilometers, the K values begin to drop off, a signal
that cases are no longer being added as quickly as controls. This drop occurs in every
yearly difference of K graph, and can be seen in Figure 19. While the difference of K
values peak at five kilometers, the second graph indicates that the fastest growth occurs
at less than two kilometers. At two kilometers in Figure 19, the difference of K values are 9
times as high as the upper bound of the simulation envelope. The 10 ug/dL threshold
patterns rise immediately and then fall below the envelope, revealing fairly small clusters.
The 25 ug/dL threshold shows no degree of clustering.

[Figure 19. Panel title: 2003 5 micrograms per deciliter. Vertical axes: Diff in K; Difference of K / Upper Bound. Horizontal axis: Distance (2000 to 10000).]

Figure 19: The 2003 North Detroit difference of K graph for the 5 ug/dL threshold

The GAM analysis of North Detroit suggests that Pontiac has the largest cluster of
high BLL test results in the region. The city has visible clustering in every year for both
the 5 and 10 ug/dL thresholds. A secondary area of high BLL clustering is the area which
borders the city of Detroit. This includes Warren, Royal Oak, and Southfield. Both of
these hotspots are visible in Figure 20. Unlike Pontiac, the secondary cluster near the city
of Detroit disappears over time, possibly due to increased testing rates. At the very high
25 ug/dL threshold, Pontiac is the only area which consistently shows any hotspots, but
the other tests suggest that these hotspots are not very significant.

Figure 20: The 1999 GAM map of North Detroit for the 10 ug/dL threshold

3.1.3 Southeast Michigan

The Southeast Michigan region includes all of the Southeast HSA which does not
fall within the Detroit urban boundary (Figure 21). While this region is mostly rural, it
does have several cities mixed in with surrounding rural areas. The two Detroit study
areas do take a large bite out of the original HSA, but the vast gulf in the number of tests
between the study areas makes it reasonable to keep them separate. The three main cities
of the Southeast region are Ann Arbor, Monroe, and Port Huron. For every year between
1998 and 2003, the number of blood tests is under 2,000. The number of tests doubles to
around 3,500 in 2004 and increases again to nearly 4,000 in 2005.

Figure 21: Map of the Southeast Michigan study region

The Cuzick-Edwards results for this region show clustering through all years at
the 5 ug/dL threshold (Table 7). The Bonferroni p-value confirms there is clustering
across all k values, but Monte Carlo analysis reveals that the clustering is strongest at k
values of 5 or less. Still, many years have fairly large clusters at the 5 ug/dL threshold.
At the 10 ug/dL threshold, the clusters are smaller. The number of case-case neighbors is
high at the k = 1 level, indicating small pockets of elevated BLL within the region. The
clustering is stronger in the earlier years, but is less prominent in the later years of the
database with the exception of 2005, where there are 10 neighbors at the k = 2 level among
the 31 cases. At the 25 ug/dL threshold, there are not enough cases in this region for a
cluster analysis in nearly every year, though in 2005 two out of three cases are nearest
neighbors.

 

Table 7: Cuzick-Edwards results for Southeast Michigan

The difference of K graphs for Southeast Michigan show that where clustering
exists, it is small. Depending on year, the difference of K result may be above the upper
bound of the simulation envelope at shorter distances, but the results fall back down as
the distance grows. Often the K values hug the upper bounds of the simulation envelopes,
as in Figure 22. There is a quick rise in difference of K values, to as much as 2.5 to 3 times
the upper bound of the simulation envelope, followed by a fall back into the envelope by
four kilometers. The initial jump is visible in the 5 ug/dL threshold graphs, but less so in
the 10 ug/dL threshold graphs. Since the simulation envelopes change with every
simulation, such a low degree of separation means that clustering cannot be confirmed.
The fact that clustering is obvious in the Cuzick-Edwards tests but not the difference of
K could be a sign that it is confined to a small area that is picked up more easily by
neighborhood measures than distance measures.

[Figure 22. Panel title: 2004 5 micrograms per deciliter. Horizontal axis: Distance (2000 to 10000).]

Figure 22: The 2004 Southeast Michigan difference of K graph for the 5 ug/dL threshold

The results of the GAM analysis show the small pockets of clusters. At the 5
ug/dL threshold, there are a large number of very small hotspots whose placement varies
year to year. While it is difficult to pin down the location, Monroe County in the south
has a very high number of tiny clusters. Both Port Huron and Monroe are visible hot spots
across all years. Ann Arbor is a hotspot only in 1998 (Figure 23). This distinction is
apparent at the 10 ug/dL threshold as well, where Ann Arbor quickly disappears as the
years progress. Monroe also disappears in later years, while Port Huron remains a hot
spot.

  
 

Figure 23: The 1998 GAM map of Southeast Michigan for the 10 ug/dL threshold

3.1.4 Flint
The Flint region covers the Flint Federal Urban Aid Boundary (Figure 24). It
includes the city of Flint as well as surrounding cities such as Burton, Grand Blanc, and
Fenton. The region is mostly urban and developed. The number of blood tests within the
Flint study area rises from under 1,000 in 1998 to over 4,000 in the year 2005.

 

Figure 24: Map of the Flint study region

The Cuzick-Edwards results for Flint show strong clustering at both the 5 and 10
ug/dL thresholds (Table 8). For the 5 ug/dL threshold, this significance remains high
even as the number of neighbors grows, indicating larger clusters of cases. The 10
ug/dL threshold displays significant test statistic values at smaller k values, indicating
tight clusters of cases. The 10 ug/dL threshold clustering is higher than in similar-sized
cities within Michigan, which could indicate the severity of elevated BLL in Flint. A
couple of years even have a significant Bonferroni p-value for the 25 ug/dL threshold due
to two cases being nearest neighbors at the k = 1 level.

 

Table 8: Cuzick-Edwards results for Flint

Results from the difference of K test confirmed the presence of significant spatial
clusters at the 5 and 10 ug/dL thresholds. Each level has K values above the upper bound
of the simulation envelope. At the 5 ug/dL threshold, the K values rise immediately and
stay above the upper bound for the entire ten kilometer distance. They do fall at large
distances, but this could be due to edge effects of the study area. With the 10 ug/dL
threshold, the K values rise quickly before falling below the upper bound of the
simulation envelope around a distance of six or seven kilometers, as illustrated by Figure
25. The K values reach a height of about 2.5 times the upper bound around four
kilometers, indicating significant clustering. The 25 ug/dL threshold numbers do not
indicate any significant clustering in any year.

[Figure 25. Panel title: 2003 10 micrograms per deciliter. Horizontal axis: Distance (2000 to 10000).]

Figure 25: The 2003 Flint difference of K graph for the 10 ug/dL threshold

GAM results for the Flint study area show the clustering of elevated BLL is
contained almost exclusively within the city of Flint. The worst areas at all threshold
levels tend to be the neighborhoods to the northwest of downtown and north of the Flint
River (Figure 26). While the shape of the hotspot varies year to year, at each threshold
level it is centered in these Northwest Flint neighborhoods. This area is likely the source
of the elevated BLL clustering seen in the other tests.

Figure 26: The 1998 GAM map of Flint for the 10 ug/dL threshold

3.1.5 Genesee

The Genesee study area includes the counties of Shiawassee, Lapeer, and all of
Genesee County that is not in the Flint Urban Aid Boundary (Figure 27). It is a mostly
rural study area that does not have any large cities. The main towns are Lapeer, Owosso,
and Perry. The Flint study region divides the Genesee HSA in half, and the number of
blood tests in the Genesee study region is about one-third of the number of tests in the
Flint study region. The total number of blood tests is below 500 for each of the years
1998-2003, followed by a sharp increase in 2004 to around 900 and more than 1,300 in 2005.

Figure 27: Map of the Genesee study region

The Cuzick-Edwards statistic tests revealed no consistent significant clustering of
lead poisoning cases at any level (Table 9). At each threshold level, the number of case-
case nearest neighbors does not fall far from what would be expected by chance. This is
a stark contrast to the more urban areas of the state, but in line with other regions that
lack a major city. The years 2004 in the 5 ug/dL threshold and 2001 in the 10 ug/dL
threshold are the only individual years that indicate clustering is present. In a nearest
neighbor test such as Cuzick-Edwards, distance is not a factor. However, there is
seemingly little clustering at any level.

Table 9: Cuzick-Edwards results for Genesee

Difference of K results confirm the lack of clustering of elevated BLL. At every
threshold level, the difference of K values at every distance are within the simulation
envelopes. There is not a year where the K values of any of the three threshold levels rise
above the upper bound of the simulation envelopes. Figure 28 shows the difference of K
for 2002 at the 5 ug/dL threshold, and the K values stay around zero and fall well within
the simulation envelopes. The second graph shows the difference of K values never
exceeded 60% of the upper bound of the simulation envelope, a sign that the pattern of
cases is not significantly different from the results of the random simulations.

[Figure 28. Panel title as printed: 2001 5 micrograms per deciliter. Vertical axes: Diff in K; Difference of K / Upper Bound. Horizontal axis: Distance (2000 to 10000).]

Figure 28: The 2002 Genesee difference of K graph for the 5 ug/dL threshold

Despite the lack of any small or large clusters in the study area, the GAM maps
for Genesee can be useful to show a general pattern of cases. At the 5 ug/dL
threshold level, this pattern seems to be that many cases are located in Shiawassee
County around the city of Owosso. But the problem with rural areas is that without a
large number of cases, individual cases show up as hotspots. Shiawassee County seems
to have the most cases in the region, as in Figure 29, but the hotspots change year to
year without any consistency. At the 10 and 25 ug/dL thresholds, the dearth of cases
makes it difficult to find any discernible pattern.

Figure 29: The 2001 GAM map of Genesee for the 5 ug/dL threshold

3.1.6 Lansing

The Lansing study area consists of the Lansing Federal Urban Aid Boundary.
The study region is situated around the city of Lansing (Figure 30). Surrounding cities
within this area are East Lansing, Grand Ledge, Okemos, and Mason. The area is a
developed urban area. The number of yearly blood tests in the Lansing study area ranges
from 1,300 to 1,800 in the years 1998-2004, followed by an increase to over 2,100 in
2005.

Figure 30: Map of the Lansing study region

The 5 ug/dL threshold Cuzick-Edwards statistics reveal clustering within the
Lansing area (Table 10). As the k value is increased, the number of case neighbors
continues to grow nearly every year. This would indicate that the clusters of elevated
BLL are fairly large within the Lansing area. With the 10 ug/dL threshold, the results
changed slightly. At lower k values, the significance was high, but little growth in the
test statistic occurred at k values higher than 3 or 4. Still, nearly every year had
significant clustering at the 10 ug/dL threshold according to the Bonferroni p-value.
Since this continues through all years within the database, it likely indicates a sustained
exposure risk. The 25 ug/dL threshold indicated no clustering except in the year 2000.

 

Table 10: Cuzick-Edwards results for Lansing

The difference of K values in the Lansing study area are surprisingly inconsistent.
At the 5 ug/dL threshold, the K value each year rises quickly at short distances and falls
beyond six kilometers. However, a couple of years exhibit significant clustering while
other years do not. The trend seems to be that the amount of clustering dissipates over
time, suggesting that the cluster might be weakening. Another interesting fact is that the
10 ug/dL threshold graphs show clustering across all years. The graphs all show an early
rise in the K values at short distances, then fall below the upper bound of the simulation
envelopes, as in Figure 31. The peak around four kilometers in the difference of K graph
coincides with the K values being 3 times as large as the upper bound of the simulation
envelope, making four kilometers the likely diameter of the cluster. At the 25 ug/dL
threshold, the K values never fall outside the simulation envelopes.

[Figure 31. Panel title: 2000 10 micrograms per deciliter. Vertical axes: Diff in K; Difference of K / Upper Bound. Horizontal axis: Distance (2000 to 10000).]

Figure 31: The 2000 Lansing difference of K graph for the 10 ug/dL threshold

The GAM maps show a clear cluster of elevated BLL cases within the Lansing study
region. The main cluster in nearly all of the maps is the area around downtown Lansing.
The neighborhoods between downtown and the eastern edge of the city of Lansing are a
hotspot for elevated BLL every year. This pattern manifests itself at both the 5 and 10
ug/dL threshold levels and can be seen in Figure 32.

   

Figure 32: The 1998 GAM map of Lansing for the 5 ug/dL threshold

3.1.7 Mid-South
The Mid-South study area covers all of the Mid-South HSA not within the
boundaries of the Lansing study region (Figure 33). This is a mostly rural study area, and
includes the counties of Clinton, Eaton, Ingham, Jackson, Hillsdale, and Lenawee. There
are several cities within the Mid-South area such as Jackson, Adrian, Hillsdale, and
Charlotte. The number of blood tests in the region shows a decrease from over 1,600 in
1998 to under 700 in 2000. This initial decrease is offset in 2004, where the yearly
number of tests more than doubled from less than 1,300 the previous year to over 2,800.
The larger number of tests in 2004 and 2005 has an effect on the results of each test.

Figure 33: Map of the Mid-South study region

The Cuzick-Edwards tests reveal clustering of 5 ug/dL threshold cases across
nearly all years (Table 11). An interesting pattern is the huge increase in the number of
tests in 2004 and 2005. This greatly increases the Cuzick-Edwards statistic at all k values
for those two years. At the 10 ug/dL threshold, most years have a significant Bonferroni
p-value due to initial clustering at the k = 1 or k = 2 levels. The low number of cases at
the 25 ug/dL threshold makes the Cuzick-Edwards test ineffective. The years 2000, 2002,
and 2005 have two neighbors who both are 25 ug/dL threshold cases, but these could be
siblings in the same household.

 

Table 11: Cuzick-Edwards results for Mid-South

Results from the Cuzick-Edwards test were confirmed by the difference of K
graphs. The K value remains well above the simulation envelopes every year for the 5
ug/dL threshold, as in Figure 34, indicating strong clustering. The K values remain
between 3 and 4 times as large as the upper bounds of the simulation envelope as the
result of strong initial clustering and no edge effects. After about three kilometers, the K
values stay at around the same value, an indication that they are no longer adding
cases. This is unusual for a mostly rural region, indicating a strong cluster likely exists
somewhere in the study area. The 10 ug/dL threshold graphs have K values which
remain above the upper bounds of the simulation envelope as well. For the 25 ug/dL
threshold, there seems to be little clustering due to lack of cases.

[Figure 34. Panel title: 1999 5 micrograms per deciliter. Horizontal axis: Distance (2000 to 10000).]

Figure 34: The 1999 Mid-South difference of K graph for the 5 ug/dL threshold

The GAM maps reveal interesting patterns. At the 5 ug/dL threshold, two major
features stand out. First is the recurring cluster in the city of Jackson. This result is
similar to other urban areas across the state. It is likely that the city of Jackson is the
source of the consistent cluster seen in the difference of K graphs. The second is the high
number of cases in Lenawee County in 2004 and 2005. This was seen in the Cuzick-
Edwards table, and it seems that many of the cases were found in this county, particularly
in the city of Adrian. Each of these two clusters can be seen in Figure 35, as well as many
constellations of individual cases. This pattern dissipates at the 10 ug/dL threshold level,
and the city of Jackson becomes more apparent. No pattern can be found at the 25 ug/dL
threshold level.

   
 

Figure 35: The 2005 GAM map of the Mid-South for the 5 ug/dL threshold

3.1.8 Battle Creek

The Battle Creek study region includes the entire area within the Battle Creek Urban Aid
Boundary (Figure 36). This is a fairly small study area that includes the cities of Battle
Creek and Springfield, as well as some areas to the north and east of the cities. It is the
smallest of the 19 study regions in this thesis in terms of area. The number of blood
tests in a year does not exceed 1,000 except for the year 2005.

Figure 36: Map of the Battle Creek study area

Battle Creek shows a pattern of Cuzick-Edwards results which is similar to other
mid-sized cities (Table 12). At the 5 ug/dL threshold, the results show consistent
clustering across all years in the database. The values increase fairly slowly at the higher
k values, indicating that any clusters within the study area are smaller than in other cities.
The 10 ug/dL threshold results show that in earlier years, there is strong clustering fed by
several k = 1 neighbors, but this pattern seems to fade over time. The 25 ug/dL results
show a couple of years where two k = 1 neighbors both were 25 ug/dL threshold cases.
This is interesting considering the low number of total cases at the 25 ug/dL threshold
level.

 

Table 12: Cuzick-Edwards results for Battle Creek

Results from the Cuzick-Edwards test are confirmed by the difference of K
graphs. The 5 ug/dL threshold K values show an immediate sharp jump above the
simulation envelopes, as in Figure 37. The K values rise to around 3.5 times the upper
bound of the simulation envelope by two kilometers and continue to add cases until
around four kilometers. In each graph around four kilometers, the K values begin a rapid
decline. The consistency of this drop indicates the edge of the cluster, but could also be
related to edge effects of the small study area. A similar pattern is repeated at the 10
ug/dL threshold level, but only in the early years of the database. There is
no real change in the 25 ug/dL threshold results.

[Figure 37. Panel title: 2001 5 micrograms per deciliter. Vertical axes: Diff in K; Difference of K / Upper Bound. Horizontal axis: Distance (2000 to 10000).]

Figure 37: The 2001 Battle Creek difference of K graph for the 5 ug/dL threshold

The GAM results show that the 5 ug/dL threshold cases are concentrated in
downtown Battle Creek. A closer analysis shows that the strongest hotspots across all
years appear to be on the eastern side of downtown. The 10 ug/dL threshold results show
a similar pattern to the 5 ug/dL threshold. Though the hotspot is not the same every year,
the downtown area seen in Figure 38 is central to the hotspot. At the 25 ug/dL threshold,
the low number of cases makes GAM analysis less reliable.

 
   

Figure 38: The 1999 GAM map of Battle Creek for the 10 ug/dL threshold

3.1.9 Kalamazoo

The Kalamazoo study area covers the Federal Urban Aid Boundary around the
Kalamazoo metro area (Figure 39). This is a mostly developed district that surrounds
the city of Kalamazoo, as well as the cities of Portage and Galesburg. The study area
also includes some rural area around the cities. Similar to several other study areas, there
is a large increase in blood lead tests in 2004 and 2005 compared to previous years.
There were over 1,200 blood tests in 2004 and 2005, while none of the other years
exceeded 850.

 

Figure 39: Map of the Kalamazoo study area

The pattern seen in the Cuzick-Edwards results is similar to other mid-sized cities
(Table 13). The 5 ug/dL threshold has significant clustering of cases across all years
according to the Bonferroni p-values. It appears that the clusters of cases are fairly large
as well, as the total case-case count continues to steadily rise as the number of nearest
neighbors is increased. At the 10 ug/dL threshold, strong initial clustering exists, but it
does not continue to grow at a significant rate as k increases. The clustering at the 10
ug/dL threshold seems to fade over time, possibly due to remediation efforts. There is no
apparent clustering at the 25 ug/dL threshold for Kalamazoo.

 

Table 13: Cuzick-Edwards results for Kalamazoo

Similar to Cuzick-Edwards, the difference of K results in Kalamazoo show
patterns of clustering similar to other mid-sized cities within Michigan. At the 5 ug/dL
threshold level, K values immediately jump up at short distances. There is no doubt that
signiﬁcant clustering of 5 ug/dL threshold cases exists within Kalamazoo. At the 10
ug/dL threshold, results show strong clustering at short distances as well. The K values
rise well above the upper bound of the simulation envelopes and then fall back at around
six kilometers, as in Figure 40. The peak of the K values occurs around four
kilometers, where the difference of K is 2.5 times as high as the upper bound of the
simulation envelope. The rapid decline of K values afterwards indicates that four kilometers
is the likely diameter of the cluster. This pattern persists across all years without fading,
possibly indicating a consistent underlying threat. The 25 ug/dL threshold K values
were not significant.
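
To make the comparison between the difference of K and its simulation envelope concrete, the sketch below computes K for cases and for controls, takes their difference at several distances, and builds an envelope by randomly relabeling the points. It is a simplified illustration only: the K estimator has no edge correction, the envelope is taken as the 2.5% and 97.5% quantiles of 199 relabelings, and the function names and synthetic data are hypothetical rather than the settings used in this thesis.

```python
import numpy as np

def k_function(dist, mask, radii, area):
    """Ripley's K (no edge correction) for the points selected by mask."""
    sub = dist[np.ix_(mask, mask)]
    n = sub.shape[0]
    pair_d = sub[np.triu_indices(n, k=1)]  # each unordered pair once
    counts = np.array([(pair_d <= r).sum() for r in radii])
    return area * 2.0 * counts / (n * (n - 1))

def difference_of_k(coords, is_case, radii, area, n_sim=199, seed=0):
    coords = np.asarray(coords, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)

    def diff(labels):
        # K of cases minus K of controls at every distance
        return (k_function(dist, labels, radii, area)
                - k_function(dist, ~labels, radii, area))

    observed = diff(is_case)
    rng = np.random.default_rng(seed)
    # Random labelling: shuffle which points are called cases
    sims = np.array([diff(rng.permutation(is_case)) for _ in range(n_sim)])
    lower = np.quantile(sims, 0.025, axis=0)
    upper = np.quantile(sims, 0.975, axis=0)
    return observed, lower, upper

# Hypothetical data in metres: a tight group of cases inside a 10 km square
rng = np.random.default_rng(1)
cases = rng.normal(5_000, 300, size=(30, 2))
controls = rng.uniform(0, 10_000, size=(150, 2))
coords = np.vstack([cases, controls])
is_case = np.arange(len(coords)) < len(cases)
radii = np.array([500.0, 1_000.0, 2_000.0])
observed, lower, upper = difference_of_k(coords, is_case, radii, area=10_000.0 ** 2)
print(observed, upper)
```

Dividing the observed difference by the upper bound at each distance gives a ratio of the kind reported in the text; values above one mean the observed difference lies outside the envelope at that distance.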

Figure 40: The 2000 Kalamazoo difference of K graph for the 10 ug/dL threshold
[Panels: Diff in K vs. Distance; Difference of K / Upper Bound vs. Distance]


The GAM results for Kalamazoo show a consistent pattern of hotspots. At each
of the threshold levels, the corresponding hotspot is located around the central business
district of the city of Kalamazoo. This hotspot stretches from there down to the southeast
through the nearby neighborhoods, shown in ﬁgure 41. The neighborhoods directly to
the north of downtown Kalamazoo are affected as well. These areas are the most likely

source of the clustering seen in earlier tests.

 
   

Figure 41: The 2001 GAM map of Kalamazoo for the 5 ug/dL threshold

3.1.10 Southwest
The region of Southwest Michigan covers the similarly named HSA with the
exception of the Kalamazoo and Battle Creek study areas (Figure 42). With these cities
removed, the study region is more rural in composition. It covers the counties of Berrien,
Van Buren, Cass, St. Joseph, Branch, Calhoun, Barry, and the part of Kalamazoo County that
does not fall within the Kalamazoo study area. Although the region is more rural with
these cities removed, it still contains several smaller cities and towns, including
Benton Harbor, Niles, Sturgis, and Coldwater. The number of yearly blood lead tests is
typically between 2,000 and 2,500, but it increases to over 4,000 in 2004 and 2005.

Figure 42: Map of the Southwest study area


Despite the more rural nature of the study region, the Southwest area Cuzick-Edwards
results display strong clustering across all years at the 5 and 10 ug/dL
thresholds and several instances at the 25 ug/dL threshold (Table 14). With the 5 and 10
ug/dL thresholds, the Bonferroni p-values indicate clustering across all k sizes. This is
the highest amount of clustering found for an HSA-based study area, indicating that there
is a real hotspot in the region. The Monte Carlo simulations reveal that the number
of case-case neighbors continues to increase steadily as k gets larger. The 25 ug/dL
threshold has significant clustering in several years as well, but it is less consistent.

 

Table 14: Cuzick-Edwards results for Southwest Michigan

The difference of K graphs for Southwest Michigan confirm the earlier results
that there is strong clustering of elevated BLL at every threshold level. For the 5 ug/dL
threshold level, the K values rise far above the upper bound of the simulation envelope.
This is also true for the 10 ug/dL threshold. At the 25 ug/dL threshold, the K values stay
above the upper bounds of the simulation envelopes for most years in the database, as in
Figure 43. The K values increase very quickly to over three times the value of the upper
bound of the simulation envelope and then level off at two kilometers. This is rare for a
region this large and likely indicates areas of unusually high BLL rates. Both Cuzick-Edwards
and difference of K seem to point to a very strong cluster in the region.


Figure 43: The 1998 Southwest Michigan difference of K graph for the 25 ug/dL threshold
[Panels: Diff in K vs. Distance; Difference of K / Upper Bound vs. Distance]

The GAM results reveal that the Benton Harbor area is the likely source of the

high clustering. The city is present on every threshold level map through all years of the

database. At the 5 ug/dL threshold level, this city is present, but there is also a

constellation of smaller hotspots. It is difﬁcult to determine whether or not these


represent signiﬁcant clusters. At the 10 ug/dL threshold, the primacy of the Benton
Harbor area becomes more apparent. The 25 ug/dL threshold GAM maps show only

Benton Harbor, which can be seen in ﬁgure 44.

 
   

Figure 44: The 1999 GAM map of Southwest Michigan for the 25 ug/dL threshold.
Other study regions are outlined in white.

3.1.11 Grand Rapids

The Grand Rapids study region covers the city's Federal Urban Aid Boundary
(Figure 45). This is the second most populous area of the state after Detroit. Several
cities are included within the study area: Grand Rapids, Wyoming, Kentwood, and
Walker. The number of yearly blood lead tests ranges from 3,500 to 6,000.


Figure 45: Map of the Grand Rapids study region

The Cuzick-Edwards results reveal that the Grand Rapids region has large clusters at
all threshold levels (Table 15). Given the large population and results in other Michigan
urban areas, this is not a surprise. At the 5 ug/dL threshold level, there is strong
clustering across all years in the database. The number of case-case neighbors continues
to grow rapidly as k increases, leading to the conclusion that the
cluster or clusters are large. The 10 ug/dL threshold shows very large spatial clustering
as well. This is different from many other cities within Michigan and is evidence of the
extent of the problem in Grand Rapids. Strong initial clustering with the 25 ug/dL
threshold can also be seen in the study area. Much of it is linked to a small number of
cases at the k = 1 level, but the Bonferroni p-value indicates it is significant in several
years.

Table 15: Cuzick-Edwards results for Grand Rapids

The difference of K graphs confirm the strong clustering of elevated BLL cases
at all threshold levels within the study area of Grand Rapids. At both the 5 and 10 ug/dL
thresholds, the K values rise far above the upper bounds of the simulation envelope. The
elevated BLL cases at both thresholds appear to be in large clusters. There is a consistent
drop-off after about seven kilometers at the 5 ug/dL threshold level and six kilometers at
the 10 ug/dL threshold level (Figure 46). These are fairly sizable cluster diameters.
Despite the drop after six kilometers, the K values remain twice as high as the upper
bound even at ten kilometers. The 25 ug/dL threshold also shows clustering. The drop-off
in K values occurs at a shorter distance, around four kilometers. Overall, the region
shows strong, large clusters at each threshold level.

Figure 46: The 2003 Grand Rapids difference of K graph for the 10 ug/dL threshold
[Panels: Diff in K vs. Distance; Difference of K / Upper Bound vs. Distance]


GAM analysis reveals a strong concentration of elevated BLL cases in central
Grand Rapids. Figure 47, representative of the pattern across all thresholds, shows the
hotspot of BLL in downtown Grand Rapids. The prime area of clustering of elevated
BLL seems to be on the eastern side of the city. Similar to other urban study areas, the

central downtown area overwhelms other cities within the region.

 
   

Figure 47: The 2001 GAM map of Grand Rapids for the 5 ug/dL threshold

3.1.12 Lower Coast

The study region titled “Lower Coast” represents the lower half of the West HSA
excluding the Grand Rapids urban aid boundary (Figure 48). This includes the counties
of Ionia, Kent, Allegan, Ottawa, and Muskegon. The study region is mostly rural,
but several cities are located within it. Examples are Muskegon,
Holland, Ionia, Grand Haven, and Zeeland. The number of blood lead tests in a year
within the study area falls between 1,800 and 2,200 for the years 1998-2003, followed by
an increase to nearly 4,000 in 2004 and over 5,000 in 2005.

Figure 48: Map of the Lower Coast study region

The Lower Coast study area exhibits clustering tendencies of elevated BLL cases
at the 5 and 10 ug/dL threshold levels according to Cuzick-Edwards (Table 16). Across
all years in the database, the 5 ug/dL threshold has both significant overall clustering
according to the Bonferroni p-value and clustering at many levels of k. The 10 ug/dL
threshold contains clustering across k values for every year as well. The size of these
clusters, though, seems to be small. The Monte Carlo tests reveal strong initial clustering
but slower growth in the total case neighbors as k grows. At the 25 ug/dL threshold,
there seems to be little to no clustering except for two k = 1 neighbors in 2004.

 

Table 16: Cuzick-Edwards results for the Lower Coast

Difference of K results reveal clustering in the cases at both the 5 and 10 ug/dL
thresholds. At both of these levels, there is a quick rise in K values until about four
kilometers, where the values level out and begin a slow decline. Still, the K values
remain above the upper bounds of the simulation envelope in every year. This pattern
can be seen in Figure 49. The K values are four times as high as the upper bound of the
simulation envelope, indicating the concentration of cases within the region in a cluster.
Similar to the Cuzick-Edwards results, the difference of K graphs indicate at least one
very strong cluster of cases at both the 5 and 10 ug/dL thresholds.

Figure 49: The 2000 Lower Coast difference of K graph for the 10 ug/dL threshold
[Panels: Diff in K vs. Distance; Difference of K / Upper Bound vs. Distance]


The GAM maps point to several locations as the source of the clustering. The
most obvious source is the coastal city of Muskegon. This area shows up in every yearly
map at every threshold level. In Figure 50, the Muskegon area is the obvious source of
the cluster seen in the Cuzick-Edwards and difference of K tests. Another hotspot that
factors into the clustering seen earlier is the city of Holland. It is not as consistently a
hotspot, but the city could be a source of clustering in addition to Muskegon. At the 5
ug/dL threshold level, there are a large number of hotspots that do not appear regularly.
These are likely single cases. In all likelihood, Muskegon is the source of the strong
clustering seen in earlier tests.

  
 

Figure 50: The 2002 GAM map of Lower Coast for the 10 ug/dL threshold

3.1.13 Mid Coast


The region labeled “Mid Coast” represents the upper half of the West HSA
(Figure 51). The mostly rural region includes the counties of Mason, Oceana, Lake,
Newaygo, Osceola, Mecosta, and Montcalm. There are few built-up areas
within the region. Its cities include Big Rapids, Ludington, Reed City, and
Newaygo. Blood lead test numbers range from 800 to 1,000 in most years, but
quickly rise towards 1,500 and 2,000 in 2004 and 2005.

Figure 51: Map of the Mid Coast study region


The Cuzick-Edwards results for the Mid Coast region tend to show clustering
only at the 5 ug/dL threshold level (Table 17). In all years in the database, it seems that
initial clustering is present and provides a significant Bonferroni p-value for the overall
test. The Monte Carlo results for the 5 ug/dL threshold reveal that these clusters are
small and involve mostly low k values. With the 10 ug/dL threshold level, some years
provide two neighbors next to each other, but none of the years in the database show a
significant Bonferroni p-value. Several of the years in the database do not even show any
of the cases at this level being within 10 neighbors of each other. As for the 25 ug/dL
threshold, most years do not have more than one case.

 

Table 17: Cuzick-Edwards results for the Mid-Coast


The difference of K results for the Mid-Coast region do not reveal strong
clustering. Nearly every year, even at the 5 ug/dL threshold, has K values that fall within
the simulation envelopes (Figure 52). The difference of K values never rise above 60% of
the upper bound of the simulation envelope. At the 10 ug/dL threshold level, the number
of cases is so low that the K values do not show much vertical movement.

Figure 52: The 1998 Mid-Coast difference of K graph for the 5 ug/dL threshold
[Panels: Diff in K vs. Distance; Difference of K / Upper Bound vs. Distance]


With the lack of clustering in the region, the GAM maps mostly reveal the
locations of single cases. As with other rural areas, it is difficult to discern any pattern in
the results. The spots appear as constellations whose patterns seem to differ every year,
as in Figure 53. While the Cuzick-Edwards test indicated clustering at the 5 ug/dL
threshold, it is possible that the neighbors are spread out far enough that they appear only
as single cases in GAM and not as a large hotspot. It is therefore nearly impossible to find
an underlying pattern in the GAM maps for the Mid-Coast.

  

Figure 53: The 2000 GAM map of Mid-Coast for the 5 ug/dL threshold

3.1.14 Saginaw/Bay City

The Saginaw/Bay City study region represents the Federal Urban Aid Boundary
around the two cities (Figure 54). It runs from the city of Saginaw and its surrounding
environs down a thin connecting strip of land to Bay City and the Saginaw Bay coastline.
The region is urban and developed. There is a steady increase in the number of blood
lead tests in the Saginaw/Bay City study region over the years of the database, from under
650 in 1998 to over 2,500 in 2005.

 

Figure 54: Map of the Saginaw/Bay City study region

The Cuzick-Edwards results for the Saginaw/Bay City region tend to follow a
typical pattern for mid-to-large sized cities within Michigan (Table 18). The 5 ug/dL
threshold level shows large clusters, a strong Bonferroni p-value, and continued growth
of the total case neighbors as k rises. The 10 ug/dL threshold also shows a pattern seen
in other urban study areas. There is strong initial clustering that gives the region a strong
Bonferroni p-value, but the growth slows at larger k values and indicates the small size of
the clusters. There are not enough cases at the 25 ug/dL threshold level to distinguish
real clusters, though some years have two neighbors at the k = 1 level.

 

Table 18: Cuzick-Edwards results for Saginaw/Bay City

The difference of K results in the Saginaw/Bay City region show signs of
clustering. At the 5 ug/dL threshold, the K values rise above the simulation envelopes
immediately and then fall back below them after about five kilometers. The yearly
consistency in this pattern leads to the possibility that the same underlying area is
showing up each year. The 10 ug/dL threshold results show the same early rise in K
values, though the drop below the upper bound occurs quickly, as in Figure 55. The
difference of K values stay around two times as high as the upper bound of the simulation
envelope, though K values drop precipitously after four kilometers. Given the
consistency of the pattern, this region seems to exhibit clustering at the lower thresholds.
There is no vertical movement in the K values at the 25 ug/dL threshold.

Figure 55: The 2004 Saginaw/Bay City difference of K graph for the 10 ug/dL threshold
[Panels: Diff in K vs. Distance; Difference of K / Upper Bound vs. Distance]


GAM results for the region reveal that the clusters of elevated BLL cases occur
almost exclusively within the city limits of Saginaw and Bay City. While this is not
surprising given similar results around the state, it is still significant. The city of Saginaw
exhibits the strongest hotspots, as in Figure 56. In Saginaw, most of the hotspots appear
to occur either near the Saginaw River or on the eastern side of the city. For Bay City,
the main yearly hotspots seem to occur on the eastern side of the river.

   
 

Figure 56: The 2001 GAM map of Saginaw/Bay City for the 5 ug/dL threshold

3.1.15 West Bay
The “West Bay” region represents the western half of the Bay HSA, not including

the Saginaw/Bay City study area (Figure 57). The mostly rural region includes the


counties of Iosco, Ogemaw, Roscommon, Clare, Gladwin, Arenac, Isabella, Midland,

Gratiot, and the portions of Saginaw and Bay counties that lie to the west of the

Shiawassee/Saginaw Rivers. Midland is the main city within the region, but there are

other built-up areas such as Mount Pleasant, Alma, and Gladwin. The number of yearly

blood lead tests ranges from a low of 571 tests in 1998 to 1,898 tests in 2005.

Figure 57: Map of the West Bay study region


The Cuzick-Edwards results for the West Bay region are inconsistent (Table 19).
The years of 2003 and 2004 show significant results at the 5 ug/dL threshold level
according to the Bonferroni p-values. The clustering seen in these years is a result of
case-case neighbors at lower k values. Three different years (1998, 2000, and 2002) have
10 ug/dL threshold Bonferroni p-values which are significant, but this is often entirely
due to only two cases next to each other. Overall, the clusters in this region are not very
big and are not consistent year to year. There were not enough cases at the 25 ug/dL
threshold for analysis.

Table 19: Cuzick-Edwards results for West Bay

The difference of K graphs reveal no clustering at any distance for any threshold
level. This is somewhat surprising given the fact that a city the size of Midland, with a
population around 50,000, is located within the study region (US Census Bureau 2001).
At both the 5 and 10 ug/dL threshold levels, the K values fail to clear the upper bounds of
the simulation envelopes. In Figure 58, this is demonstrated by the lack of vertical
movement of the K values. The difference of K values do not even rise above zero until
nearly eight kilometers, indicating large distances between the individual cases in the
study area. This result leads to the conclusion that the spatial organization of cases relative
to controls is not significantly different from what would be expected under the random
labeling hypothesis.


Figure 58: The 1998 West Bay difference of K graph for the 5 ug/dL threshold
[Panels: Diff in K vs. Distance; Difference of K / Upper Bound vs. Distance]

GAM results for the West Bay region conﬁrm the earlier analysis showing lack of

any clustering. The maps reveal that cases do exist within the region, but no real

discernible pattern can be found. Midland does not show up prominently on many of the

maps. This is surprising given results seen in other portions of the state where large cities

As with other rural areas of the state, the GAM suffers from the low case/control rate
exposing nearly every case as a hotspot. Figure 59 shows individual cases, not
necessarily hotspots.

  
 

Figure 59: The 2003 GAM map of West Bay for the 5 ug/dL threshold

3.1.16 East Bay

The “East Bay” region represents the eastern half of the Bay HSA with the
exception of the Saginaw/Bay City study area (Figure 60). Most of this rural region
covers the area of Michigan known as “the thumb” of the state. This includes the
counties of Sanilac, Huron, Tuscola, and the parts of Saginaw and Bay counties east of
the Shiawassee/Saginaw Rivers. The region has very few towns and developed areas. A
few towns within the study area are Bad Axe, Sandusky, Croswell, and Frankenmuth.
The number of blood lead tests in the East Bay region ranges from a low of 279 in 1999

to 1,161 in 2005.


Figure 60: Map of the East Bay study region

Cuzick-Edwards results for the East Bay region reveal an on-and-off level of
clustering across all years (Table 20). At both the 5 and 10 ug/dL thresholds, the years of
1998-2000 have significant levels of clustering according to the Bonferroni p-value while
later years, with the exception of 2004, do not. The difference is usually in whether or
not there is a large number of case-case neighbors at the k = 1 level. Overall, the pattern
of clustering seems fairly weak. The 25 ug/dL threshold does not have any cases to
analyze in most years.


 

Table 20: Cuzick-Edwards results for East Bay

The difference of K results for the East Bay region exhibit few if any signs of
clustering. At the 5 ug/dL threshold, the K values briefly creep above the upper bound of
the simulation envelope in the years 1998-2000, but most years exhibit no clustering, as in
Figure 61. In this figure, the K values barely rise to 50% of the upper bound of the
simulation envelope. Since the simulation envelopes can change slightly with each run, it
cannot be confirmed that clustering is visible in any of the graphs. The 10 ug/dL
threshold graphs show very little linear movement in the K values. This is the result of a
low number of cases at the threshold level in addition to a lack of clustering.


Figure 61: The 1998 East Bay difference of K graph for the 5 ug/dL threshold
[Panels: Diff in K vs. Distance; Difference of K / Upper Bound vs. Distance]

Similar to other more rural areas, the GAM maps are hard to read for the East Bay

region. The study area’s low rates of cases mean that any area with cases at all can show

up as a hotspot. On the western side of the study area, there are many single cases in the

Vassar area and surrounding environs (see ﬁgure 62). Unfortunately, it is difficult to pick

up a consistent pattern in the cases year to year.


 
 
 
 

Figure 62: The 1999 GAM map of East Bay for the 5 ug/dL threshold

3.1.17 North Central

The study region of North Central covers the HSA that holds the same name
(Figure 63). The mostly rural and natural area covers the northern parts of the Lower
Peninsula. The counties included in the North Central study region are Emmet,
Cheboygan, Presque Isle, Alpena, Montmorency, Otsego, Charlevoix, Antrim, Leelanau,
Benzie, Grand Traverse, Kalkaska, Crawford, Oscoda, Alcona, Missaukee, Wexford, and
Manistee. This region has several cities, including Traverse City, Alpena, Cadillac,
Cheboygan, and Rogers City. The region has a large increase in the number of blood
lead tests over the years covered by the database, from 414 tests in 1998 to 2,408 tests in

2005.


Figure 63: Map of the North Central study region

Cuzick-Edwards results for the North Central region are inconsistent
(Table 21). At the 5 ug/dL threshold level, there are as many years where the
Bonferroni p-values are not significant as there are significant years. It seems that the
number of case neighbors at most k values does not differ from what would be expected by
chance given the case/control ratios within the region. There are a couple of years where
initial clustering at the low k values pushes the Bonferroni p-values into significance.
But the temporal pattern is inconsistent and does not suggest a strengthening or
weakening pattern. At both the 10 and 25 ug/dL thresholds, the number of cases is too
small to detect any conclusive clustering.

 

Table 21: Cuzick-Edwards results for North Central

The North Central study region shows no clustering in the difference of K graphs.
Figure 64 is a good example. The K values do not jump at all, a good indication of just
how scarce cases of elevated BLL are, even at the 5 ug/dL threshold. In ﬁgure 64, the K
values do not even exceed 50% of the upper bound of the simulation envelope anywhere
within the ten kilometers tested. While cases certainly exist within this region, their

spatial conﬁguration does not seem particularly clustered.

Figure 64: The 1998 North Central difference of K graph for the 5 ug/dL threshold
[Panels: Diff in K vs. Distance; Difference of K / Upper Bound vs. Distance]

Similar to other rural regions in the state, the GAM maps for the North Central
region do not reveal any specific hotspots year to year. Instead, a collection of individual
cases dots the landscape, as in Figure 65. It is tough to even find a pattern within the
individual cases, complicating any attempt to find hotspots. Since GAM is based on grid
points, it will not locate individual cases.


   

Figure 65: The 2004 GAM map of North Central for the 5 ug/dL threshold

3.1.18 Eastern Upper Peninsula

The study area of Eastern Upper Peninsula includes the three easternmost
counties (Figure 66). These counties are Chippewa, Mackinac, and Luce. It is a mostly
rural region, but with a fair concentration of people on the route from Sault Ste. Marie to
the Mackinac Bridge. Sault Ste. Marie is the major city within the region, but there are a
few other towns as well, such as St. Ignace. The number of blood lead tests in the study
area is under 400 every year in the database.


    

Figure 66: Map of the Eastern Upper Peninsula study region

The Eastern Upper Peninsula region results for the Cuzick-Edwards tests reveal
little clustering (Table 22). The 5 ug/dL threshold level does not have significant
clustering except for the final two years of 2004 and 2005. The 10 ug/dL threshold
numbers reveal significant clustering only in 1999, and there are not enough cases at
the 25 ug/dL threshold. What these numbers could reveal is a lack of testing in this study
region. Both 2004 and 2005 were years with a substantial statewide increase in BLL
testing. It is possible that these clusters at the 5 ug/dL threshold were not discovered
until more tests were done.


 

Table 22: Cuzick-Edwards results for Eastern Upper Peninsula

In the Eastern Upper Peninsula, the difference of K values show little to no
vertical movement at any threshold level, as displayed in Figure 67. In the years that did
show vertical movement, the values stayed almost entirely within the simulation envelope. The
K values do not even exceed 40% of the upper bound of the simulation envelope. Also,
the movement did not occur initially, but after one or two kilometers. This casts doubt on
any tight urban clusters within the region. This is a somewhat surprising result given that
a city as large as Sault Ste. Marie is located in the study area.


Figure 67: The 1998 Eastern Upper Peninsula difference of K graph for the 5 ug/dL threshold
[Panels: Diff in K vs. Distance; Difference of K / Upper Bound vs. Distance]

GAM results for this region, similar to other more rural study areas, are more
useful for looking for patterns of cases than for identifying the location of clusters.
One surprising pattern that reemerged across many years was a group of cases along the rural
roads directly south of Sault Ste. Marie. Figure 68 is a good example of this, where there
are several single cases near each other in this rural area. Surprisingly, the pattern is
stronger in this area than in Sault Ste. Marie. This is different from elsewhere in the state,
where urban areas consistently exhibited more hotspots than nearby rural areas. Cases at
both the 5 and 10 ug/dL thresholds also seem to show up in the western part of the study
region.

Figure 68: The 2000 GAM map of Eastern Upper Peninsula for the 5 ug/dL threshold

3.1.19 Western Upper Peninsula

The final region covers all of the Upper Peninsula of Michigan except the three
easternmost counties (Figure 69). The region of the Western Upper Peninsula covers the
counties of Schoolcraft, Alger, Delta, Menominee, Marquette, Dickinson, Iron, Baraga,
Gogebic, Ontonagon, Houghton, and Keweenaw. It is mostly rural or natural area, but
there are several cities and towns of importance. These include Marquette, Houghton,
Escanaba, Ishpeming, Iron Mountain, and Ironwood. The number of yearly blood lead
tests in the study region grows from under 500 in 1998 to over 1,300 in 2005.


Figure 69: Map of the Western Upper Peninsula study region

The Cuzick-Edwards test results for the Western Upper Peninsula show a similar
pattern to the eastern half of the peninsula (Table 23). The results are inconsistent until
2004 and 2005, when clustering appears alongside the large increase in the number of
blood tests. Unlike the eastern part, the Western Upper Peninsula study region does have
clustering in 1998. Given that both Upper Peninsula study areas show increased clustering
in the last two years of the database, it is possible that this part of the state is
conducting more rigorous lead screening.


 

Table 23: Cuzick-Edwards results for Western Upper Peninsula

The difference of K results for the Western Upper Peninsula study area follow
the Cuzick-Edwards findings. There are a few years in the 5 ug/dL threshold results
where the K values hug the upper bound of the simulation envelopes, as in Figure 70.
The K values nearly reach the upper bounds of the simulation envelope. Since the
random simulations would be different each time the difference of K is run, even if the K
values had slightly exceeded the upper bound the results would still not prove clustering.
At the 10 ug/dL threshold, there is no year where the difference of K values differs
greatly from zero. Everything points to little if any confirmed clustering of elevated BLL
cases within the region according to difference of K.

Graph: difference of K (DiffinK) plotted against distance, with the upper bound of the
simulation envelope; panel title 2000, 5 micrograms per deciliter

Figure 70: The 2000 Western Upper Peninsula difference of K graph for the 5 ug/dL
threshold
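
The reasoning about simulation envelopes above can be illustrated with a minimal sketch. It assumes a naive K estimate without edge correction and an envelope built by randomly relabelling cases and controls; the function names, the number of simulations, and the toy data are hypothetical and are not taken from the software used in this thesis.

```python
import math
import random

def k_function(points, distances, area):
    """Naive Ripley's K estimate (no edge correction) at each distance."""
    n = len(points)
    if n < 2:
        return [0.0 for _ in distances]
    pair_dists = [math.dist(p, q) for i, p in enumerate(points)
                  for j, q in enumerate(points) if i != j]
    return [area * sum(d <= r for d in pair_dists) / (n * (n - 1))
            for r in distances]

def difference_of_k(points, is_case, distances, area, n_sim=99, seed=0):
    """Observed K(cases) - K(controls) plus a random-labelling envelope."""
    rng = random.Random(seed)

    def diff_k(labels):
        cases = [p for p, c in zip(points, labels) if c]
        controls = [p for p, c in zip(points, labels) if not c]
        k_cases = k_function(cases, distances, area)
        k_controls = k_function(controls, distances, area)
        return [a - b for a, b in zip(k_cases, k_controls)]

    observed = diff_k(is_case)
    labels = list(is_case)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(labels)          # reassign case status at random
        sims.append(diff_k(labels))
    lower = [min(col) for col in zip(*sims)]
    upper = [max(col) for col in zip(*sims)]
    return observed, lower, upper

# Hypothetical toy data: 60 test locations in a 10 km square, about 30% cases.
toy_rng = random.Random(1)
points = [(toy_rng.uniform(0, 10_000), toy_rng.uniform(0, 10_000)) for _ in range(60)]
is_case = [toy_rng.random() < 0.3 for _ in range(60)]
distances = [2_000, 4_000, 6_000, 8_000, 10_000]
observed, lower, upper = difference_of_k(points, is_case, distances, area=10_000 ** 2)
exceeds = [obs > up for obs, up in zip(observed, upper)]
```

Changing `seed` changes the envelope, which is why a K value that only marginally exceeds the upper bound in one run is weak evidence of clustering.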

Despite the lack of provable clustering, the GAM results do reveal areas that
consistently look troublesome. Cases crop up in the Ishpeming area in nearly all of the
years examined. The Houghton area is also visible on most of the maps. Finally, Escanaba
and its surroundings look like they could be home to some cases of elevated
BLL (Figure 71). The city of Marquette, the most populous city in the study region, is
surprisingly not much of a factor. This goes against the pattern seen for large cities in
most of the rest of the state.

   


Figure 71: The 1999 GAM map of Western Upper Peninsula for the 5 ug/dL threshold

3.2 Geographically Weighted Regression Results

Regression analysis was employed in this thesis in order to understand and
explain the spatial patterns of childhood BLL in Michigan. Linear regression was run on
three different areal units: US census tract, zip code, and minor civil division. US census
block groups were also considered for this analysis, but the small size of the individual
units made the analysis unworkable for two main reasons. The size often left many units with
few if any test results located within, and the huge number of block groups statewide
made computing the GWR models impossible for the R software. For the three
geographic units utilized, this analysis used linear regression to create a
statewide model, hereafter referred to as a global model, of childhood BLL. The linear
regression models were used to evaluate the performance of independent variables at a
statewide level, but additional regression methods were needed to analyze the
performance of the models geographically. While linear regression allows for geographic
analysis of error with residual mapping, it does not reveal how each variable and the
model as a whole vary over space.

The second part of the regression analysis used Geographically Weighted
Regression (GWR) to examine the effectiveness of the model and its variables across
space. GWR models work by conducting a separate regression analysis for each geographical
unit (i.e. each census tract) rather than a single statewide regression like the global
linear model. All other observations are weighted in GWR based on their distance to the
focal geographical unit. This thesis used a common GWR weighting scheme based on a
Gaussian curve, where nearby observations are given more weight than observations further
away. To define the shape of the curve, a bandwidth is selected by finding the minimum
residual sum of squares for all data points.
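
The weighting and bandwidth selection just described can be sketched as follows. This is an illustrative sketch only, not the R code used in this thesis: it assumes the common Gaussian kernel w = exp(-0.5 (d/b)^2), a fixed bandwidth b, and a leave-one-out cross-validation score, and all function names and the toy data are hypothetical.

```python
import numpy as np

def gaussian_weights(coords, focal, bandwidth):
    """Gaussian kernel: weight decays smoothly with distance from the focal unit."""
    dist = np.linalg.norm(coords - focal, axis=1)
    return np.exp(-0.5 * (dist / bandwidth) ** 2)

def local_fit(X, y, w):
    """Weighted least squares for one focal unit; X already holds the intercept."""
    XtW = X.T * w                       # scale each observation by its weight
    return np.linalg.solve(XtW @ X, XtW @ y)

def gwr_coefficients(coords, X, y, bandwidth):
    """One row of local coefficients per geographic unit."""
    return np.array([local_fit(X, y, gaussian_weights(coords, c, bandwidth))
                     for c in coords])

def cv_score(coords, X, y, bandwidth):
    """Leave-one-out sum of squared prediction errors for a candidate bandwidth."""
    total = 0.0
    for i, c in enumerate(coords):
        w = gaussian_weights(coords, c, bandwidth)
        w[i] = 0.0                      # the focal unit does not predict itself
        beta = local_fit(X, y, w)
        total += (y[i] - X[i] @ beta) ** 2
    return total

# Hypothetical toy data: unit centroids in meters and a single predictor.
rng = np.random.default_rng(0)
n = 50
coords = rng.uniform(0, 100_000, size=(n, 2))
pct_pre1940 = rng.uniform(0, 1, n)
X = np.column_stack([np.ones(n), pct_pre1940])
y = 2.0 + 3.0 * pct_pre1940 + rng.normal(0, 0.1, n)

betas = gwr_coefficients(coords, X, y, bandwidth=25_000)
candidates = [25_000, 66_000, 117_000]  # roughly the bandwidths reported in this chapter
best = min(candidates, key=lambda b: cv_score(coords, X, y, b))
```

As the bandwidth grows very large, every weight approaches 1 and each local fit converges to the global linear regression, which is why a large bandwidth produces smooth, uninformative coefficient maps.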

The dependent variable in all of the regression models was the mean BLL based
on all blood test results within the geographical unit. In the linear regression analysis, the
mean BLL of test results for each individual year of the database was also tested as a
dependent variable in order to evaluate the models over time. None of the mean BLL values
calculated for this thesis were weighted by population or the number of test results.
For all three geographic units, the mean BLL values were normally
distributed and did not require any data transformation.

The ten independent variables shown in table 24 were chosen based on
earlier studies (see tables 2 and 3) as well as availability from the US Census Bureau.
Three of the ten variables had skewed distributions in all areal units and
were log-transformed to achieve a normal distribution. For each of the three
variables, any zero values were changed to 0.00001 to permit logarithmic transformation.
To decide which variables to use in each model, linear regression was used to eliminate
variables which were not significant (α = 0.05) for mean BLL based on all years of blood
tests. The remaining significant variables were then used for the yearly and GWR
regression models.

 

Percentage Pre-1940 Housing
Percentage of African-Americans (logged)
Percentage of Latinos (logged)
Percentage of Recent Immigrants (logged)
Percentage under 6 years of age
Percentage of Housing Rented
Percentage of Housing Headed by Females
Percentage of Housing Vacant
Percentage without a high school diploma
Percentage below 185% of the Poverty Line

Table 24: Independent variables tested by regression analysis
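
The zero handling used for the three logged variables can be illustrated with a short sketch. It assumes a natural logarithm, as the variable names lnBlack and lnLatino suggest; the function name and the sample values are hypothetical.

```python
import math

def log_transform(percentages, zero_replacement=0.00001):
    """Natural log of each value, substituting a small constant for zeros."""
    return [math.log(p if p != 0 else zero_replacement) for p in percentages]

# Hypothetical values for three geographic units, one of them with no residents
# in the group of interest.
logged = log_transform([0.0, 0.02, 0.35])
```

Because ln(0.00001) is about -11.5, units with a zero value sit well below units with small nonzero values after the transformation.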

Presented in the results section for regression are several different maps and
models. The first map shows the standard deviation of yearly mean blood lead
levels. The mean BLL for each year of the database (1998-2005) was calculated based
on the ug/dL blood lead test results within each individual unit. The standard deviation
of the eight yearly mean BLL results was then calculated for each geographic unit. This map
gives a sense of the yearly volatility in the mean BLL. The second part of the regression
results section shows a map of the mean BLL in each unit for all eight years combined, as
well as the results of the linear regression model with the eight year mean BLL as the
dependent variable. The third section shows the results of linear regression models where
the independent variables were used to predict the mean BLL in an individual year. The
variables that are significant predictors (α = 0.05) are marked in blue in the table, while
variables that are not significant are marked in red. The bar graph shows the R2 values
for each yearly model with a line for comparison to the all years model.
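
The volatility measure described above, the standard deviation of the yearly mean BLL within each unit, can be sketched as follows. The sketch assumes the sample standard deviation, since the text does not state whether the sample or population form was used; the function name and the toy records are hypothetical.

```python
from statistics import mean, stdev

def yearly_mean_volatility(tests):
    """tests: iterable of (unit_id, year, bll) records.

    Returns the standard deviation of the yearly mean BLL for each unit.
    """
    by_unit_year = {}
    for unit, year, bll in tests:
        by_unit_year.setdefault(unit, {}).setdefault(year, []).append(bll)
    result = {}
    for unit, years in by_unit_year.items():
        yearly_means = [mean(values) for values in years.values()]
        # A unit tested in only one year has no year-to-year spread.
        result[unit] = stdev(yearly_means) if len(yearly_means) > 1 else 0.0
    return result

# Hypothetical records: unit A has yearly means of 3.0 and 5.0 ug/dL.
records = [("A", 1998, 2.0), ("A", 1998, 4.0), ("A", 1999, 5.0), ("B", 2001, 3.0)]
volatility = yearly_mean_volatility(records)
```

A unit with few tests in a given year can swing sharply from a single high result, which is one way small test populations inflate this measure.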

The final section contains the GWR results, which are presented in tables. The tables
show a summary of the coefficients produced for each individual geographic unit, divided
into quartiles. Also reported are the regression diagnostics, including the size of the fixed
bandwidth in meters, the number of individual geographic units, the effective number of
parameters and degrees of freedom, sigma squared (standard error of the estimate), and
the Akaike Information Criterion (AIC), which is a measure of the goodness of fit
(Fotheringham, Brunsdon, and Charlton 2002). Also listed is the Leung statistic, which
was explained in equation 10, a measure of how well the GWR model reduces the
residual sum of squares compared to the linear OLS regression. Finally, maps are
provided which show how the coefficients of key independent variables change across
Michigan.

3.2.1 Minor Civil Division

The first areal unit used in the regression analysis was the Minor Civil Division (MCD), a
term covering all local political boundaries such as city limits and townships. The map of the
mean BLL for all years in figure 73 shows a different pattern from the other areal units.
Cities such as Detroit and Grand Rapids have the highest mean values, but they have
far less influence as single entities. Select rural areas dominate the map, including the
southwest portion of the state, the “thumb” of Michigan, and portions of the northern half
of the Lower Peninsula. The standard deviations map in figure 72 follows the mean BLL
map fairly consistently.

Minor Civil Division Standard Deviation (map legend classes): 0.000 - 0.590; 0.591 - 0.985;
0.986 - 1.433; 1.434 - 2.148; 2.149 - 3.480; 3.481 - 6.889

Figure 72: Map of the minor civil division standard deviations of yearly mean BLL

The global regression model in figure 73 shows MCD level analysis to be poor for
studying elevated BLL based on the independent variables commonly associated with the
ailment. The R2 for the overall global model is 0.17, very poor when compared with
census tracts and zip codes. In the MCD global model, the main independent variable is
again percentage pre-1940 housing. Perhaps the most interesting facet of the MCD all
years model is that percentage African-American has a lower t-value than percentage
without a high school diploma. This is certainly due to the fact that cities such as
Detroit are entire units rather than broken up into sections. The large number of
townships increases the influence of rural areas on the model, so cities have far less
influence than they do in the census tract and zip code models.

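Mean Blood Lead Level (map legend classes): 1.000 - 2.279; 2.280 - 2.885; 2.886 - 3.649;
3.650 - 5.250; 5.251 - 10.000

Global regression results (coefficient estimates not legible; legible rows: Pre1940,
FemaleHeaded, No High School, Under 6)
R2 = 0.1774
Adjusted R2 = 0.1742
F-statistic = 54.67 on 6 and 1521 Degrees of Freedom (p-value exponent not legible)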

Figure 73: Map of mean BLL by minor civil division and all years global regression
results

The yearly global regression model results (Table 25) reinforce the notion that
MCD level analysis is not suitable for mean BLL. The significance of each variable
oscillates from year to year. Even the variables most strongly associated with mean BLL in
the all years model drop to lower levels of significance in some years. For example,
pre-1940 housing is by far a better predictor of mean BLL than female-headed households in
the all years model, but not in the 1998 or 1999 models. Much like the all years model, the
individual MCD yearly models do not explain much of the variance in mean BLL. The R2 is
typically between 0.10 and 0.16.

Yearly Significance Table
Coefficients (rows): lnBlack, lnLatino, Pct Pre1940, Pct FemaleHeaded, Pct No HighSchool,
Pct Under6
Years (columns): 1998 1999 2000 2001 2002 2003 2004 2005
Bar graph: Yearly R-Squared for 1998 to 2005, with a reference line for the All-Years
R-squared

Table 25: Yearly global regression results for minor civil divisions. Light blue
represents a significant variable (α = 0.05)

A combination of low predictive value and a fairly large bandwidth of around 66
kilometers causes the GWR model for minor civil divisions to be little improvement
over the global model. Even so, the Leung test in table 26 reveals that the GWR
model did significantly reduce the sum of squares of the residuals, from 798.17 in the
original global model to 611.33. The variable percent under 6 years of age has a very
large difference between the median GWR model coefficient value and the coefficient
from the global linear model. The likely cause is that some outlier areas of the state may
show a very strong link between this variable and mean BLL, while it is less predictive for
the state as a whole.

Summary of Regression Coefficients
                     Minimum    1st Quartile  Median    3rd Quartile  Maximum   Global
Intercept            0.03069    2.045         2.651     2.963         3.643     2.0754
lnBlack              -0.01174   0.01537       0.02971   0.04694       0.1001    0.0378
lnLatino             -0.09798   0.02673       0.1022    0.1442        0.1981    0.0411
Pct Pre1940          -0.3829    0.9842        1.235     1.875         3.178     1.4526
Pct FemaleHeaded     -0.7762    0.37          0.9645    1.353         2.894     0.7856
Pct No High School   -6.46      1.292         2.637     3.558         4.633     3.0166
Pct Under 6          -8.573     -4.298        0.8652    3.637         29.72     3.7443

Fixed Bandwidth (meters)         66745.8
Number of Data Points            1528
Effective number of parameters   86.05015
Effective degrees of freedom     1441.95
Sigma Squared                    0.4000854
AIC                              2998.605
Residual sum of squares          611.3304

Leung Statistic
OLS Residuals Sum of Squares     798.1792
GWR Residuals Sum of Squares     611.3304
F - Statistic                    0.8079
p - value                        1.92E-05

Table 26: GWR regression results for minor civil division all years mean BLL
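
As a worked check of the Leung statistic, the sketch below assumes the F statistic is the ratio of the GWR residual mean square to the OLS residual mean square, each residual sum of squares being divided by its degrees of freedom. Equation 10 is not reproduced here, so this form is an assumption, although it agrees with the tabulated F value for minor civil divisions to rounding. The function name is hypothetical; the inputs are the residual sums of squares quoted in the text, the effective degrees of freedom from table 26, and the 1521 degrees of freedom of the global model in figure 73.

```python
def leung_f(rss_gwr, df_gwr, rss_ols, df_ols):
    """Ratio of GWR to OLS residual mean squares; values below 1 favour GWR."""
    return (rss_gwr / df_gwr) / (rss_ols / df_ols)

# Minor civil division model (assumed form of the statistic, reported inputs).
f_mcd = leung_f(rss_gwr=611.3304, df_gwr=1441.95, rss_ols=798.1792, df_ols=1521)
print(round(f_mcd, 4))
```

The further the ratio falls below 1, the more the GWR model has reduced the residual variance relative to the global model.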

For the minor civil division level model, the GWR maps are of little value. In
general, the large bandwidth size resulted in stripe-like patterns across the state. The
pattern across the state for the R2 is very smooth and does not reflect the pockets of high
and low mean BLL that exist (Figure 74). The highest R2 values appear to be in the
southwest corner of Michigan. A likely reason is that the southwestern portion of the
state seems to have higher mean BLL values in many of the rural townships. Since cities
are single units at the minor civil division level, the rural areas have more influence on
the model result.

Minor Civil Division R-Squared (map legend classes): 0.053 - 0.123; 0.124 - 0.207;
0.208 - 0.302; 0.303 - 0.396; 0.397 - 0.514; 0.515 - 0.671

Figure 74: Map of the R-Squared for the minor civil division GWR model

The map of coefficients for the variable percent pre-1940 housing shows the
influence of Detroit. The high coefficient values reveal that older housing has a
large influence on the model. The map in figure 75 does not, however, reveal
the variability that likely exists throughout the state. The larger bandwidth size, caused
by the low predictive ability of the variables at the minor civil division level, causes
many likely pockets in the state, such as Grand Rapids, to be missed.

 

 

Minor Civil Division Percent Pre-1940 Housing Coefficient (map legend classes):
-0.383 - 0.466; 0.467 - 1.006; 1.007 - 1.368; 1.369 - 1.840; 1.841 - 2.404; 2.405 - 3.178

Figure 75: Map of the coefficients from the minor civil division GWR model for pre-
1940 housing

3.2.2 Zip Code

The second areal unit regression analysis involved US postal zip codes for
Michigan. Similar to census tracts, the highest mean BLL values were found in the
urban zip codes. Other prominent areas include the southwest corner of the state as well
as parts of the southern border of the Lower Peninsula. The standard deviations map
(Figure 76) shows that the rural areas of the state are more volatile year-to-year in mean
BLL than the urban areas.

Zip Code Standard Deviation (map legend classes): 0.000 - 0.350; 0.351 - 0.723;
0.724 - 1.184; 1.185 - 1.984; 1.985 - 4.204; 4.205 - 7.054

Figure 76: Map of zip code standard deviations of the yearly mean BLL

Nine variables from the original choices were used in the global model for zip
codes. Though more variables proved to be significant (α = 0.05) than in census tracts,
the t-values are not as high. The most significant variable proves to be percentage pre-
1940 housing. This is not surprising given similar results seen in other areal units. What
is interesting in the t-values is that both Percentage African-American and Percentage
Latino are well above the other remaining variables (Figure 77). This suggests that
ethnicity may be a strong predictor at the zip code level. Overall, the model for all
years had an R2 value of 0.41.

Mean Blood Lead Level (map legend classes): 1.000 - 2.259; 2.260 - 2.880; 2.881 - 3.782;
3.783 - 5.775; 5.776 - 9.000

Coefficients           Estimate   Std. Error  t-value  Pr(>|t|)
(Intercept)            1.99385    0.173767    11.474   2E-16
lnBlack                0.081036   0.010365    7.818    1.18E-14
lnLatino               0.101676   0.013875    7.328    4.30E-13
lnRecent Immigrants    0.030678   0.008648    3.547    0.000404
Pct_Rental             0.973719   0.272797    3.569    0.000372
Pct_Vacant             0.70847    0.177352    3.995    6.88E-05
Pct_Pre1940            2.04307    0.208258    9.81     2E-16
Pct_FemaleHeaded       1.609344   0.282665    5.693    1.57E-08
Pct_No High School     2.815125   0.551952    5.1      3.94E-07
Pct_Under 6            7.864077   1.567739    5.016    6.07E-07
R2 = 0.4164
Adjusted R2 = 0.4119
F-statistic = 94.01 on 9 and 1186 Degrees of Freedom, p-value 2E-16

Figure 77: Map of mean BLL by zip code and all years global regression results

 

The yearly models for zip codes showed that variables that are significant in the all
years model may not be significant on a yearly basis (Table 27). The clearest examples
are percentage houses rented and percentage houses vacant, which are both significant in
the all years model but are rarely significant in an individual year. Often these two
variables have coefficients of opposite sign, indicating likely collinearity in the
individual year’s model. Several other variables, such as percentage recent immigrants
and percentage without a high school diploma, show varying levels of significance. The
yearly models reinforce the strength of three variables: percentage pre-1940 housing,
percentage African-American, and percentage Latino. Similar to the other areal units, the
zip code yearly R2 falls below the all years model, with a range of around 0.30-0.38.

Yearly Significance Table
Coefficients (rows): lnBlack, lnLatino, lnRecent Immigrants, Pct Rental, Pct Vacant,
Pct Pre1940, Pct FemaleHeaded, Pct No HighSchool, Pct Under6
Years (columns): 1998 1999 2000 2001 2002 2003 2004 2005
Bar graph: Yearly R-Squared for 1998 to 2005, with a reference line for the All-Years
R-squared

Table 27: Yearly global regression results for zip codes. Light blue represents a
significant variable (α = 0.05)

The GWR model for zip codes turned out to be a case where a better model does not
necessarily improve the analysis. The Leung test for the GWR model versus
the global model showed that using the GWR model significantly reduced the sum of
squares of the residuals (Table 28). This would indicate that the GWR model is better at
predicting the mean BLL than the global model. What is interesting is that the reduction
of the residuals for zip codes was the lowest of any of the three geographic units. The
summary of coefficients also shows a large difference between the median GWR coefficient
for the variable percentage under 6 years of age and the global linear coefficient.

Summary of Regression Coefficients
                      Minimum       1st Quartile  Median    3rd Quartile  Maximum   Global
Intercept             0.2863        2.448         2.674     2.917         3.516     1.9939
lnBlack               0.04087       0.072         0.08873   0.0924        0.09548   0.081
lnLatino              -0.01687      0.1358        0.1508    0.1713        0.1943    0.1017
Pct Recent Immigrant  0.009961      0.0255        0.0291    0.031         0.061     0.0307
Pct Rental            [illegible]   0.1662        0.4538    [illegible]   3.382     0.9737
Pct Vacant            -0.6116       0.517         0.8137    1.04          2.065     0.7085
Pct Pre1940           0.04869       1.537         2.437     2.922         3.325     2.0431
Pct FemaleHeaded      -0.5557       0.7076        1.919     2.757         3.551     1.6093
Pct No High School    0.03435       0.844         1.56      2.941         4.759     2.8151
Pct Under 6           -1.957        -0.5089       1.89      7.457         21.13     7.8641

Fixed Bandwidth (meters)         117372.7
Number of Data Points            1196
Effective number of parameters   51.7817
Effective degrees of freedom     1144.218
Sigma Squared                    0.6535902
AIC                              2924.042
Residual sum of squares          781.6938

Leung Statistic
OLS Residuals Sum of Squares     945.0133
GWR Residuals Sum of Squares     781.6938
F - Statistic                    0.8574
p - value                        0.004238

Table 28: GWR regression results for zip code all years mean BLL

 

 

Similar to the minor civil division model, the zip code GWR model suffers from a
weaker weighting scheme. The bandwidth for the all years model for zip codes was
around 117 kilometers, which is nearly twice as large as for minor civil divisions and
nearly 5 times as large as for census tracts. While the cross-validation algorithm chose
this bandwidth because it reduced the sum of squares to the greatest degree, it produces
maps of little practical use. In the R2 map in figure 78, the values trend downward as
distance from Detroit increases. Similar patterns can be seen in the individual variable maps.
What this indicates is that there is a spatial component to mean BLL at the zip code level
and that including a spatial component does improve the predictive power.
Unfortunately, the linear nature of this spatial component indicates that the model is not
picking up the pockets of spatial variation seen in the census tracts GWR model. In all
likelihood, an independent variable based on latitude would work as well.

   

Zip Code R-Squared (map legend classes): 0.320 - 0.377; 0.378 - 0.414; 0.415 - 0.458;
0.459 - 0.509; 0.510 - 0.565; 0.566 - 0.624

Figure 78: Map of the R-squared for the zip code GWR model

In the zip code GWR model, as in the earlier minor civil division model,
the variable percent under 6 years of age produces the widest variability in coefficient
values. Figure 79 shows the map of coefficients for the percentage under 6 years of age.
The highest coefficients are in the far western areas of the Upper Peninsula. What could
be behind the high coefficients is that many other predictive variables, such as percentage
African-American, are not a big factor there.

     

Zip Code Percent Under 6 years Coefficient (map legend classes): -1.957 - 0.278;
0.279 - 2.600; 2.601 - 5.625; 5.626 - 9.046; 9.047 - 13.500; 13.501 - 21.131

Figure 79: Map of the coefficients from the zip code GWR model for percentage under
6 years of age

3.2.3 Tract

Census tracts were the third areal unit examined by regression analysis. The
preference of the US Census Bureau for relatively homogeneous populations when drawing
tract boundaries is a great advantage for regression. There is often a sharp
divide between the means in neighboring tracts. Each yearly map of BLL means yields
similar results. To test the yearly variability in the mean BLL, the standard deviation was
computed for each tract. The resulting map shows the strongest deviations scattered
among more rural or suburban tracts (Figure 80). A closer examination showed that high
standard deviations were usually due to a combination of factors: the presence of a high
BLL outlier case, a low test population, and generally low BLL test results in the tract.

Census Tract Standard Deviation (map legend classes): 0.000 - 0.763; 0.764 - 1.207;
1.207 - 1.813; 1.814 - 2.949; 2.950 - 6.000; 6.001 - 11.843

Figure 80: Map of census tract standard deviations of yearly mean BLL

The regression analysis on census tracts yielded the best and most
conclusive results (Figure 81). In the global regression, the eight independent variables
yielded an R2 value of 0.67 for elevated BLL data covering all years. All of the
independent variables yielded p-values that were highly significant. Not surprisingly, the
percentage of pre-1940 homes within the tract is the most significant variable, with a
t-value of 35.6. The percentage of African-American residents and the percentage of
female-headed households were also highly significant. Note that at the
census tract level, the percentage of Latino residents had a negative effect on the mean
BLL in a tract. This is different from what was found in the MCD or zip code
regressions.

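Mean Blood Lead Level (map legend classes): 1.000 - 2.495; 2.496 - 3.582; 3.583 - 5.130;
5.131 - 7.188; 7.189 - 12.000

Global regression results (coefficient estimates not legible; legible rows: Rental, Vacant,
Pre1940, FemaleHeaded, No High School, Under 6)
R2 = 0.6724
Adjusted R2 = 0.6714
F-statistic = 693.2 on 8 and 2702 Degrees of Freedom (p-value exponent not legible)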

 

Figure 81: Map of mean BLL by census tract and all years global regression results

In addition to testing the independent variables against the mean BLL results for
all years, the predictors were tested against the mean BLL in the tracts for each year
(Table 29). A glimpse at the R2 across the eight years shows a range of about 0.44 to
0.53. This is below the R2 for the all years model and likely reveals some volatility in the
yearly mean BLL numbers. The global regression analysis by year confirms that both
pre-1940 housing and percentage African-American are the strongest predictors. In every
year, their p-values are highly significant. The percentage of houses within a tract that are
vacant proves to be a worse predictor when looking at individual years.

Yearly Significance Table
Coefficients (legible rows): Rental, Vacant, Pre1940, FemaleHeaded, No High School, Under 6
Years (columns): 1998 1999 2000 2001 2002 2003 2004 2005
Bar graph: Yearly R-Squared for 1998 to 2005, with a reference line for the All-Years
R-squared

Table 29: Yearly global regression results for census tracts. Light blue represents a
significant variable (α = 0.05)

The GWR model, where individual regression analyses were run on each tract
based on a weighting scheme, performed better at reducing the sum of squares of the
residuals than the global model according to the Leung test statistic (Table 30). This
statistic showed vast improvement in the predictive capability of the GWR model. This
might be linked to the lower bandwidth value, around 25 kilometers. The median
coefficient values for all the individual GWR models are similar to the coefficient values
from the global linear model. The largest exception seems to be the percentage of vacant
houses within the study region. In addition to being the least consistent variable in the
yearly global linear models, the percentage of vacant houses has an effect on mean BLL
that seems to vary widely across the state.

Summary of Regression Coefficients
                     Minimum    1st Quartile  Median    3rd Quartile  Maximum   Global
Intercept            -3.497     1.106         1.396     2.513         9.952     1.1833
lnBlack              -0.1357    0.09274       0.2029    0.2434        0.4129    0.1928
lnLatino             -0.3762    -0.219        -0.1533   0.03366       0.6571    -0.1762
Pct Rental           -25.34     -1.136        -0.7168   -0.2082       5.152     -0.5497
Pct Vacant           -9.62      1.049         3.329     4.834         6.543     0.8772
Pct Pre1940          -1.074     2.647         4.084     4.549         5.684     3.882
Pct FemaleHeaded     -6.749     0.858         1.797     2.015         11.52     1.9413
Pct No High School   -13.36     2.365         2.733     3.054         9.65      3.4503
Pct Under 6          -28.28     1.533         4.739     5.973         40.57     6.0175

Fixed Bandwidth (meters)         25539.43
Number of Data Points            2711
Effective number of parameters   339.184
Effective degrees of freedom     2371.816
Sigma Squared                    0.4891147
AIC                              6020.737
Residual sum of squares          1325.99

Leung Statistic
OLS Residuals Sum of Squares     2115.556
GWR Residuals Sum of Squares     1325.99
F - Statistic                    0.714
p - value                        2.2E-16

Table 30: GWR regression results for census tract all years mean BLL

The real value of GWR, and where the census tract model really shines, is the maps
of coefficients. A map of the R2, shown in figure 82, reveals that the model works very
well in urban areas, but also in some of the rural areas. Grand Rapids stands out
as the urban area of Michigan where the model is most effective, with
Detroit and Flint visible to a lesser degree. The model is also effective in much of the
Upper Peninsula, particularly in the far western end as well as the Sault Ste. Marie area.
Finally, the center of the Lower Peninsula shows rural areas where the model works
effectively as well.

  

Census Tract R-Squared (map legend classes): 0.312 - 0.580; 0.581 - 0.681; 0.682 - 0.739;
0.740 - 0.785; 0.786 - 0.839; 0.840 - 0.995

Figure 82: Map of the R-Squared from the census tract GWR model

The maps of the coefficients for each of the variables give an important clue as to
the parts of the state where each variable contributes most. For the percentage African-
American variable, the Grand Rapids and Detroit areas show the highest positive
coefficients (Figure 84). According to this model, in the two largest cities in Michigan,
the areas that have the higher percentages of African-Americans have the higher mean
BLL. This pattern is largely repeated in the map of coefficients for percentage houses
built before 1940 (Figure 83). Detroit and Grand Rapids continue to stand out well
beyond the rest of the state. The two main variables, percentage African-Americans and
pre-1940 housing, exert the greatest influence in Michigan’s urban areas.

Census Tract Percent Pre-1940 Housing Coefficient (map legend classes): -1.074 - 1.594;
1.595 - 2.614; 2.615 - 3.449; 3.450 - 4.189; 4.190 - 4.796; 4.797 - 5.684

Figure 83: Map of the coefficients from the census tract GWR model for pre-1940
housing

Census Tract Percentage African-American Coefficient (map legend classes): -0.136 - 0.035;
0.036 - 0.115; 0.116 - 0.181; 0.182 - 0.228; 0.229 - 0.306; 0.307 - 0.413

Figure 84: Map of the coefficients from the census tract GWR model for percentage
African-American

The final map is the map of coefficients for the variable percentage vacant houses.
This was the most inconsistent variable in terms of significance from year to year and the
variable that had a large difference between the median of the GWR coefficients and the
global coefficient. The map in figure 85 reveals the likely cause of this disparity.
Percentage vacant houses seems to have a large effect in the southern areas of Detroit,
extending down to the Ohio border. But in the Grand Rapids area, the variable has no
effect. This disparity could be the underlying cause behind the inconsistent performance
of vacant houses as a predictor of mean BLL.

   
 

Census Tract Percent Vacant Houses Coefficient (map legend classes): -9.620 - -0.870;
-0.869 - 1.178; 1.179 - 2.902; 2.903 - 4.186; 4.187 - 5.095; 5.096 - 6.543

Figure 85: Map of the coefficients from the census tract GWR model for percentage
Vacant Houses

The overall results of the regression analysis demonstrate the importance of the unit of
analysis as well as the independent variables used. In all three areal units, three of the
variables (percentage African-American, percentage Latino, and percentage recent
immigrants) were logged in order to give the data values a normal distribution. Each of
the three areal units tested produced very different outcomes in terms of which census
variables were significant and how much of the variance in mean BLL could be
explained. One constant throughout the different units of analysis was the two main
variables that proved most significant, the percentage of houses built before 1940 and the
percentage of African-Americans. Other independent variables proved to be significant
as well, but these two were consistently the best predictors.

The GWR analysis provided an opportunity to map the coefficients of each
variable in every regression run as well as the chance to view the R2 spatially. The
mapped results showed the great difference between the areal units used.
Census tract analysis proved best for GWR. This was due to the fact that the independent
variables were better predictors at this level, which in turn revealed more spatial
variation. The low predictive ability of both the zip code and minor civil division models
made GWR analysis of little value for those units.

4 Conclusions

4.1 Overview

The legacy of commercial lead usage continues to affect Michigan children to this
day. The large amount of lead used in early 20th century products made the element
accessible to children. Industry pressure and dismissal of medical evidence allowed lead
usage in paint and gasoline to continue in the United States much longer than in other
developed nations. For many years, the warning signs of lead poisoning in children were
dismissed and many suffered grievous injury and even death. As lead was phased out of
paint and gasoline in the 1970s, the number of serious clinical cases of lead poisoning
dropped.

New research has shown that sub-clinical levels of lead in a child’s body cause
irreparable harm. Though chelation therapy can be used to slowly cleanse the body, the
only sound solution to the problem of lead in the human environment is primary
prevention. This tactic has been emphasized within the United States since passage of
Title X in 1992. The state government of Michigan responded in 1998 with the Lead
Abatement Act, which provided funds for reducing elevated BLL in Michigan through
the creation of a database of all blood test results of children and the eradication of lead
from dangerous home environments. Supplemental legislation in 2004 worked to
streamline the testing process and set a firm goal of eliminating elevated BLL within
Michigan by 2010.

This thesis utilized the Michigan Department of Community Health (MDCH)
database of child blood lead test results from 1998 to 2005 in order to study the spatial
patterns of distribution. The research was limited to children on Medicaid, two-thirds of
the original database, to deal with sampling issues. By law, MDCH compiled this database
from the results of all the testing labs in Michigan. Information available included the
child’s address, age, test result (in ug/dL), test type, and the date the blood test occurred.

For all children tested more than once, the highest test result was used. The
research examined both the point patterns based on the children’s addresses and,
through areal analysis, the characteristics of the neighborhoods based on US Census data.
Several different clustering techniques were used in order to examine the number of
neighbors, size of the cluster in terms of distance, and the likely locations of clusters.
Each test was done on the data from every individual year of lead testing in order to look
at possible changes over time. Because of computing limitations, the state was divided
into nineteen different study areas. In the census-based analysis, variables that had been
found to be significant in previous studies of spatial variation in lead poisoning were
tested in Michigan. Regression analysis in this thesis was run on three different areal
units, all of which were used in previous spatial-based childhood BLL studies.
Geographically Weighted Regression was employed to visually understand how well the
model works in various portions of the state and how the effects of the independent
variables changed over space.
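
The data preparation described above can be illustrated with a minimal Python sketch. It
is not the code used in this thesis. The record layout, the child identifiers, and the
choice to keep the year of the highest test are assumptions made only for illustration;
the thresholds of 5, 10, and 25 ug/dL and the rule of keeping the highest result per
child come from the text.

```python
# Hypothetical test records: (child_id, test_year, result in ug/dL).
tests = [
    ("A", 1999, 4.0),
    ("A", 2001, 12.0),
    ("B", 2000, 3.0),
    ("C", 2002, 26.0),
    ("C", 2003, 9.0),
]


def highest_result_per_child(records):
    """Keep only the highest blood lead result for each child.

    Assumption: the child is assigned to the year of that highest test.
    """
    best = {}
    for child_id, year, result in records:
        if child_id not in best or result > best[child_id][1]:
            best[child_id] = (year, result)
    return best


def split_cases_controls(best, threshold):
    """Children at or above the threshold are cases; the rest are controls."""
    cases = sorted(c for c, (_, r) in best.items() if r >= threshold)
    controls = sorted(c for c, (_, r) in best.items() if r < threshold)
    return cases, controls


best = highest_result_per_child(tests)
for threshold in (5, 10, 25):
    cases, controls = split_cases_controls(best, threshold)
    print(threshold, "cases:", cases, "controls:", controls)
```

Raising the partition from 5 to 25 ug/dL moves children from the case group to the
control group, which is why the number of cases available for clustering shrinks at the
higher thresholds.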

A number of conclusions can be drawn from the results of the clustering and
regression methods about childhood BLL in Michigan. Listed below is a summary of the
major points that emerged:

1. Elevated BLL in children insured by Medicaid is clustered in Michigan.

2. Clusters of elevated BLL are most pronounced in the urban areas of the state.

3. The size of clusters is greatest when 5 ug/dL is used as the partition between
   cases and controls. When 10 ug/dL is used as the divide, the size of the
   clusters is smaller. Clusters of elevated BLL cases at the 25 ug/dL partition
   are only common in the more populated study regions such as South Detroit.

4. In Federal Urban Aid Boundary-based study areas, the central city and
   surrounding neighborhoods display elevated BLL hotspots.

5. Rural study regions that lack a central city do not typically display clustering
   of elevated BLL regardless of what partition of ug/dL is used.

6. In HSA-based study areas, presence of clustering is dependent on a moderate
   to large city within the region. The only consistent hotspots in the study
   region are centered on these cities.

7. The choice of areal unit in regression analysis is critical to the predictive
   capability of the regression model. With the independent variables used in
   this thesis, US Census tracts explain the variance in mean BLL to the greatest
   degree. The same variables at zip code level explain the mean BLL variance
   to a lesser degree, and have a low predictive ability when aggregated to minor
   civil divisions.

8. The percentage of an area’s housing that was built before 1940 was the best
   predictor of mean BLL. The next best predictor of mean BLL was the percentage
   of an area’s population that is African-American.

9. The Geographically Weighted Regression (GWR) model for census tracts
   confirmed that Detroit and Grand Rapids had the highest positive
   coefficients in the state for both the percentage pre-1940 housing and
   percentage African-American variables, indicating that these two cities exert
   the greatest influence over the statewide model.

4.2 Discussion of Results

4.2.1 Clustering

A thorough search of the academic literature found no studies where clustering
methods were used to identify areas of lead poisoning. Typically, such techniques are
more suited for study of infectious diseases to identify hotspots and clusters where a
disease epidemic is occurring. For a chronic disease such as lead poisoning, the hazard is
mostly stationary because the lead threat is ﬁxed in the local environment. The clustering
methods presented in this thesis as well as others available in the literature have value for
evaluating lead poisoning cases.

Three different methods for analyzing point patterns were utilized for this thesis.
Each method uncovered a different aspect of the point patterns. Cuzick-Edwards tests
were used to reveal the size and signiﬁcance of clusters of elevated BLL cases based on
neighbor analysis. The difference of K graphs was used to understand the size and
signiﬁcance of clusters based on distance. Finally, Geographic Analysis Machine
(GAM) maps were created to highlight hotspots where clustering was likely occurring.
The results from all three tests reveal distinct patterns of elevated BLL throughout the
state of Michigan.
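
As an illustration of the neighbor-based approach, the following Python sketch computes
the Cuzick-Edwards statistic T_k (the number of case-case pairs among each case's k
nearest neighbors) and a Monte Carlo p-value obtained by randomly relabelling cases and
controls. It is a simplified sketch, not the software used in this thesis: the
coordinates are invented, k = 3 is chosen only to suit the tiny example, and the number of
simulations and the random seed are arbitrary assumptions.

```python
import math
import random


def neighbor_lists(points, k):
    """For each point, the indices of its k nearest neighbors (self excluded)."""
    lists = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        lists.append(others[:k])
    return lists


def t_k(neighbors, is_case):
    """Cuzick-Edwards T_k: for every case, count its k nearest neighbors that are cases."""
    return sum(
        sum(is_case[j] for j in nbrs)
        for i, nbrs in enumerate(neighbors)
        if is_case[i]
    )


def cuzick_edwards(points, is_case, k, n_sim=999, seed=0):
    """Observed T_k and a one-sided Monte Carlo p-value under random labelling."""
    rng = random.Random(seed)
    neighbors = neighbor_lists(points, k)
    observed = t_k(neighbors, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(labels)  # same number of cases, random locations
        if t_k(neighbors, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_sim + 1)


# Invented example: four cases packed together, eight controls elsewhere.
cluster = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
background = [
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0), (10.1, 0.0), (10.0, 0.1), (10.1, 0.1),
]
points = cluster + background
is_case = [1] * len(cluster) + [0] * len(background)

observed, p_value = cuzick_edwards(points, is_case, k=3)
print("T_k =", observed, "p =", p_value)
```

Because the statistic only looks at neighbor rank and never at distance, the same value
of k spans a much larger area where points are sparse than where they are dense.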

All evidence in the clustering methods points to the severity of lead exposure in
urban areas. The Cuzick-Edwards statistic and the difference of K graphs both provided
a sort of informal ranking of the study regions as to the severity of elevated BLL. At the
top of this ranking are the metropolitan areas of Detroit (represented by two study areas)
and Grand Rapids. Each showed extraordinary amounts of clustering of cases at all three
thresholds, evidenced by the highly significant test statistic values in the Cuzick-Edwards
tests as well as the difference of K values, which rose quickly above the upper bounds
of the simulation envelopes. The GAM maps showed that the hotspots of elevated BLL
occurred primarily in the urban core of each city.

A second level of the informal ranking was middle to small-sized cities. These
were study areas such as Lansing, Flint, Kalamazoo, Battle Creek, and Saginaw/Bay
City. The three clustering techniques revealed a high amount of clustering at the
lower thresholds of 5 and 10 ug/dL, but clustering diminished at the 25 ug/dL threshold
due to the lack of cases. Often the 5 ug/dL threshold had clustering levels nearly as high
as the major cities, but the 10 ug/dL threshold showed a noticeable drop-off in the size of
the clusters. This is evident in both the Cuzick-Edwards and the difference of K graphs,
leading to the conclusion that there are small pockets of lead poisoning cases in urban
study regions. The GAM maps demonstrated that the hotspots were in the central
sections of the mid-sized cities, similar to Detroit and Grand Rapids but on a smaller
scale.

The third level in the ranking was HSA-based areas that had cities or several large
towns within them. These included the Southwest, Southeast, Mid-South, and Lower
Coast regions. Similar to the smaller cities, these regions displayed clustering at the 5
ug/dL threshold level. At the 10 ug/dL threshold, clustering results are typically much
weaker and vary in significance year to year. The GAM maps for these regions were also
more difficult to interpret due to the large number of single case hotspots. These hotspots
arise because the case/control ratio is lower than in the urban study areas. The resulting
maps show a constellation of hotspots that shift from year to year. But in each of the four
study regions in this level, one constant is a hotspot centered on an urban area. This
primary city is certainly the source of clustering seen throughout the region.

The fourth and final tier of the informal ranking from the clustering analysis was
the more rural areas. These were the Upper Peninsula study areas, North Central, West
Bay, East Bay, and the Mid Coast. They were characterized by some clustering at the 5
ug/dL threshold, occasionally picked up by the Cuzick-Edwards test. But overall, the
regions displayed little if any clustering. GAM maps were less useful in these regions
because a hotspot could be just one case. In such instances, investigators would not need
to consult clustering maps and would likely not rely on clustering methods.

While these results seem fairly conclusive, there are lingering questions with
regard to the point-based clustering analysis. The most important uncertainty is the
validity of the sample. This thesis used statewide testing data, numbering in the hundreds
of thousands, for analysis. The study was limited to Medicaid-only children, a majority
of the MDCH database, so that the sample constituted a better representation of the
underlying population at risk. Since Medicaid requires recipients to undergo a blood test
for lead, this population is better represented in the test results than the Michigan
population as a whole. Still, limiting the study to Medicaid-insured children carries
biases as well. The size and spatial distribution of the overall population of children in
Michigan may differ from those of Medicaid-insured children. This difference could
complicate clustering and hotspot analysis and lead to false conclusions.

A question that inevitably arises is whether the clustering
methods are only showing clusters in cities due to the high number of test results. This
idea has some credence given the impressive stratification of clustering
within the state almost entirely based on population. However, there are some factors to
consider. First, the task of looking at lead poisoning across an entire state means that
much of the local variation can be missed. The individual clusters picked up in the
Cuzick-Edwards and difference of K measures may not perfectly translate to GAM
analysis. In GAM, what looks like a hotspot containing an entire city may be a coarser
picture of the local spatial variation. But the fact that GAM worked much better in urban
areas at pinpointing locations of elevated BLL makes it a useful tool.

The relationship between size of the city and cluster magnitude demonstrates that
the highest BLL cases are still in major cities with a few exceptions visible. The 25
ug/dL threshold probably best illustrates the signiﬁcance of elevated BLL in the major
cities. Cases of BLL 25 ug/dL and above are the most indicative of a major problem, and
the fact that they are almost exclusively found in the major urban areas negates the
assumption that all the clustering was only due to a larger number of samples. The
second point is that a few major cities of Michigan did not ﬁt the ranking rule that
developed. The most obvious case was Midland, which is in a study region where it is
the only major town, but still did not show up as a cluster or hotspot on the GAM maps.

Each individual clustering method that was used has both an upside and a downside
to implementation. The main upside to the Cuzick-Edwards statistic is that, in not
considering distance, the results can pick up clusters in both cramped urban areas and
spread-out rural study regions. While this is useful, it did not seem to factor into the
results from this thesis. The mostly rural study areas of the state did not seem to display
clustering at any level without the presence of a moderate-sized town or city.
Meanwhile, even with the larger number of control test results, nearly every urban aid
boundary-based study region showed significant clusters at the 5 and 10 ug/dL threshold
levels. The downside to the Cuzick-Edwards statistic is related to the upside. The
nearest 20 neighbors are much closer together in urban areas than in rural areas.
Twenty neighbors in an urban area likely constitute a neighborhood, while twenty
neighbors in a more rural area are likely much more dispersed. Since clustering analysis
seeks to link cases within a cluster, this can complicate matters in rural areas. For this
thesis, the downside of Cuzick-Edwards seems to be mostly mitigated by the
differences in clustering results between urban and rural study regions. The urban areas
of the state showed much stronger clustering than the rural areas, leading to the
conclusion that certain areas of Michigan cities exhibit high lead exposure risk.

The main drawback to the difference of K method is the problem of edge effects.
The study area boundaries can have an effect on the results. There are examples in this
thesis. The smallest study area, Battle Creek, has a quick drop in K values right after four
or five kilometers. This is due not so much to a sudden loss of cases as to the concentric
circles extending beyond the boundaries of the region. Another drawback to the
difference of K method is difficulty of interpretation. The K values can be inside or
outside the simulation envelope depending on the simulation results, a situation that can
lead to confusion about significance. In this thesis, clustering was assumed to be
occurring only when the difference of K values far exceeded the upper bound of the
simulation envelope. Most study areas with clustering of elevated BLL have difference of
K values well above the envelope, leaving the ambiguity problem mostly moot.
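
The logic of the difference of K method and its simulation envelope can be sketched in
Python as follows. This is an illustrative sketch rather than the procedure run for this
thesis: it uses a naive K estimate with no edge correction (so it is subject to exactly
the boundary problem described above), and the coordinates, study area size, distance h,
number of simulations, and seed are all invented for the example.

```python
import math
import random


def k_function(points, h, area):
    """Naive Ripley's K at distance h (no edge correction)."""
    n = len(points)
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= h
    )
    return area * pairs / (n * (n - 1))


def difference_of_k(points, is_case, h, area):
    """D(h) = K_cases(h) - K_controls(h); positive values suggest case clustering."""
    cases = [p for p, c in zip(points, is_case) if c]
    controls = [p for p, c in zip(points, is_case) if not c]
    return k_function(cases, h, area) - k_function(controls, h, area)


def simulation_envelope(points, is_case, h, area, n_sim=99, seed=0):
    """Lowest and highest D(h) over random relabellings of cases and controls."""
    rng = random.Random(seed)
    labels = list(is_case)
    sims = []
    for _ in range(n_sim):
        rng.shuffle(labels)
        sims.append(difference_of_k(points, labels, h, area))
    return min(sims), max(sims)


# Invented example in a 10 x 10 study area: clustered cases, dispersed controls.
cases = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
controls = [
    (2.0, 2.0), (4.0, 4.0), (6.0, 6.0), (8.0, 8.0),
    (2.0, 8.0), (8.0, 2.0), (4.0, 6.0), (6.0, 4.0),
]
points = cases + controls
is_case = [True] * len(cases) + [False] * len(controls)

observed_d = difference_of_k(points, is_case, h=0.5, area=100.0)
lower, upper = simulation_envelope(points, is_case, h=0.5, area=100.0)
print("D(h) =", observed_d, "envelope =", (lower, upper))
```

Repeating the calculation over a range of h values and plotting D(h) against the envelope
gives the kind of graph interpreted in this section.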

The greatest weakness of the GAM analysis turned out to be the case/control ratio
used as the baseline rate for each study area. The ratio of cases to controls in many rural
regions of the state was much smaller than in the more urban regions of the state. This
meant that the hotspots in rural study areas often only had one case in them. A single
case is significant for remediation, but it does not count as a cluster. This leads to a
varying pattern of hotspots year to year, so identifying places with higher threats from
lead exposure becomes more difficult. More urban areas that had a larger ratio of cases to
controls were more successful at identifying consistent hotspots, but individual cases
outside of the main clusters could be missed. This becomes a problem when the
individual case’s area is under-sampled but contains environmental lead hazards.
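
The role of the baseline rate can be made concrete with a heavily simplified,
GAM-style Python sketch. It is not the Geographic Analysis Machine implementation used in
this thesis. It assumes a single circle radius, a Poisson test against the study-area
case rate, and an arbitrary significance level, and the coordinates are invented; only
the general idea of testing circles centered on a regular grid is taken from the method.

```python
import math


def poisson_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected ** x / math.factorial(x)
        for x in range(observed)
    )
    return max(0.0, 1.0 - below)


def gam_hotspots(cases, controls, grid_step, radius, alpha=0.01):
    """Flag grid circles whose case count is unusually high for the baseline rate."""
    everyone = cases + controls
    baseline = len(cases) / len(everyone)  # study-area share of cases
    min_x = min(p[0] for p in everyone)
    max_x = max(p[0] for p in everyone)
    min_y = min(p[1] for p in everyone)
    max_y = max(p[1] for p in everyone)
    nx = int((max_x - min_x) / grid_step) + 1
    ny = int((max_y - min_y) / grid_step) + 1
    hotspots = []
    for ix in range(nx):
        for iy in range(ny):
            centre = (min_x + ix * grid_step, min_y + iy * grid_step)
            n_cases = sum(math.dist(centre, p) <= radius for p in cases)
            if n_cases == 0:
                continue
            n_total = sum(math.dist(centre, p) <= radius for p in everyone)
            p_value = poisson_tail(n_cases, baseline * n_total)
            if p_value < alpha:
                hotspots.append((centre, n_cases, p_value))
    return hotspots


# Invented example: controls on a regular lattice, five cases packed near (1, 1).
controls = [(float(i), float(j)) for i in range(9) for j in range(5)]
cases = [(1.1, 1.1), (1.2, 1.1), (1.1, 1.2), (1.2, 1.2), (1.15, 1.15)]

hotspots = gam_hotspots(cases, controls, grid_step=1.0, radius=1.0)
for centre, n_cases, p_value in hotspots:
    print(centre, n_cases, round(p_value, 4))
```

With a very low baseline rate, the expected count inside a circle becomes so small that
even one case can pass the test, which is the single-case hotspot behavior noted above
for the rural study areas.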

4.2.2 Geographically Weighted Regression

The clustering portion of this thesis answers many of the questions as to where the
hotspots of elevated BLL were located, but regression analysis can provide insight into
why these clusters occur and who is most affected. The results of the regression analysis
confirmed that the spatial patterns in Michigan were similar to what was seen in earlier
studies of other locations. The main predictor of children’s BLL was older housing. This
is to be expected. Pre-1940 housing showed up as the main predictor in all three
different areal units as well as in almost every individual year. Another variable that
was significant was percentage of African-Americans. The positive coefficients
associated with the percentage African-American variable around the high mean BLL
cities of Detroit and Grand Rapids suggest that children of this ethnicity are likely the
primary victims of lead exposure.

Beyond older housing and percentage of African-Americans, the three different
areal unit global regression models diverged in predictive value. The census tract model
was by far the best. This is because the US Census Bureau attempts to divide areas into
tracts with relatively homogeneous populations. Therefore, the ability of independent
variables to explain mean BLL in census tracts is superior due to stark differences in
socio-economic conditions between units. This was a great contrast to the minor
civil divisions model. In that model, all spatial and socio-economic variation within the
urban areas was lost. Zip codes worked slightly better, but not as well as tracts. The
conclusion is that the modifiable areal unit problem is significant in the study of BLL.
None of the earlier statewide regression studies (Bailey 1994; Sargent 1995; Talbot 1998;
Haley 2004) used census tracts, so they all could have missed much of the spatial
variation.

The GWR results were only useful at the census tract level. Both the zip code
level analysis and the minor civil division level analysis yielded coarse results
because the independent variables explained less of the variance in zip codes, and far
less in minor civil divisions (MCD), than in census tracts. As a result, the GWR models
for these two areal units used larger bandwidth values for the weighting schemes. The
reason was that the geographic variation in mean BLL is not explained well in zip codes
and minor civil divisions by the independent variables used. Therefore, larger bandwidths
giving greater weight to distant observations are needed to explain the spatial pattern.
The resulting maps of the coefficients for zip codes and minor civil divisions had a linear
striped pattern. In this case, adding x and y coordinates as independent variables would
have worked just as well.
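
The effect of bandwidth on the local coefficients can be sketched in Python with a
one-variable geographically weighted regression. This is an illustration only. The
Gaussian kernel, the bandwidth values, and the data are assumptions made for the example,
and the kernel actually used by the GWR software in this thesis is not specified here.

```python
import math


def gaussian_weights(target, locations, bandwidth):
    """Kernel weights: observations nearer the target location count for more."""
    return [
        math.exp(-0.5 * (math.dist(target, loc) / bandwidth) ** 2)
        for loc in locations
    ]


def weighted_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def local_coefficients(locations, x, y, bandwidth):
    """One (slope, intercept) pair per location, as mapped in a GWR coefficient map."""
    return [
        weighted_fit(x, y, gaussian_weights(loc, locations, bandwidth))
        for loc in locations
    ]


# Invented data: ten areas along a line; the true slope is 1 in the west, 3 in the east.
locations = [(float(i), 0.0) for i in range(10)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
y = [xi * (1.0 if i < 5 else 3.0) for i, xi in enumerate(x)]

narrow = local_coefficients(locations, x, y, bandwidth=1.0)
wide = local_coefficients(locations, x, y, bandwidth=100.0)
print("narrow bandwidth slopes:", [round(s, 2) for s, _ in narrow])
print("wide bandwidth slopes:  ", [round(s, 2) for s, _ in wide])
```

With the narrow bandwidth the local slopes recover the west-to-east difference, while
with the wide bandwidth every location receives nearly the same weights, so the local
slopes collapse toward the single global estimate and any remaining variation is a smooth
trend across the map.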

Census tract results for GWR yielded the most insights. The model, according to
the R² values, explained variance the best in the urban areas, particularly the two main
cities of Detroit and Grand Rapids. It is not surprising that the two most significant
variables from the global model, percentage pre-1940 housing and percentage
African-American, both had coefficient maps that mimicked the R² values fairly well. This
would lead to the conclusion that these two variables are linked to urban BLL.
Since urban mean BLL is more stable year to year than in suburban or rural areas, older
housing and percentage African-American are the best predictors because they are higher
in the cities. Coefficient maps for other variables revealed that they were a greater factor
in more rural areas. It is more difficult to discern meaning there because the rural areas
of the state have more unpredictable mean BLL numbers.

A drawback to running regression analysis across eight years is that the US
census data is fixed in the year 2000. Any changes that occurred across the eight years,
such as migration of people or the building of new homes, are not available for modeling.
Unfortunately, many of the yearly census estimates are completed at large geographic
levels such as counties or states. Gathering data at the census tract, zip code, and minor
civil division level requires waiting for the decennial census.

4.2.3 Research Questions
At the outset of Chapter 1, this thesis presented three research questions relating
to the spatial distribution of elevated BLL in Michigan. Each of these three questions
will be discussed in terms of the stated hypothesis and results from the clustering and
regression tests.

(1) Are there spatial clusters of elevated BLL in Michigan? At what spatial
scales do these patterns manifest?

The hypothesis of this thesis was that spatial clusters of elevated BLL existed in
Michigan’s older, urban areas. By all measures, this has been conﬁrmed. The Cuzick-
Edwards tests and the Difference of K graphs both conﬁrmed a clustering hierarchy in
Michigan. Each found the greatest amount of clustering occurred in urban areas, such as
Detroit and Grand Rapids. Smaller urban areas, such as Flint, Lansing, and Kalamazoo,
all showed strong signs of clustering as well. In the larger study areas based on HSA
boundaries, the occurrence of spatial clusters usually depended on the presence of a city
or town within the region. GAM analysis conﬁrmed that hotspots occurred most often in
urban areas.

The global regression analysis confirmed the significance of older housing for
mean BLL. The regression models for all three areal units revealed that the percentage of
housing units within an area that date to before 1940 was the best predictor of BLL. The
geographically weighted regression model for census tracts confirmed that the
coefficients of the pre-1940 housing variable were greatest in the urban core of Michigan,
particularly Grand Rapids. These findings, combined with the clustering results, show
that clusters of BLL in Michigan are greatest in the older, urban areas.

The spatial scale of the clustering explored in this thesis was slightly different
from Griffith et al. (1998). In that paper, changes in the spatial scale of elevated BLL
were evaluated using hierarchical census units. This thesis used three different
areal units that are not hierarchical, but were created by three different supervising
bodies. The clustering analysis based on point data in this thesis did provide interesting
results for the spatial scale of lead poisoning in terms of both distance and severity.

(2) Are socio-demographic and economic variables in the US Census able to
predict and explain the geographic variation in elevated blood lead levels in Michigan
children?

Socio-economic and demographic data proved to be effective at predicting BLL
in Michigan. The hypothesis put forth in this thesis was that lack of education, recent
immigration to the US, lower income, and older housing were predictors of the
geographic variation of elevated BLL. The results conﬁrmed two out of the four
variables. Virtually every regression model run showed that older housing was the best
predictor of BLL. The percentage of residents without a high school diploma was also a
good predictor in most regression analyses. The other two variables listed in the
hypothesis as likely predictors were disappointing. The US census variable percentage
under 185% of the poverty line was not a signiﬁcant predictor of BLL in Michigan in any
of the three areal units. Recent immigration was only signiﬁcant at the zip code level,
and not significant for several individual years of that areal unit. Demographic variables
that proved to be effective predictors were Percentage African-American and Percentage
Latino.

Overall, the results from this study seemed to fit into a pattern found by other
researchers who studied BLL through regression analysis. Four of the geographic studies
listed in section 1.2.3 of this thesis were conducted at a statewide level. Bailey (1994)
found in Massachusetts that the percentage pre-1940 housing was the best predictor of
the number of children above 25 ug/dL, the dependent variable in the study. Similar
results were found in Sargent (1995), who found that both percentage pre-1950 housing
and percentage African-American were significant predictors. These two variables
were also the most significant in two regression studies of New York State: Talbot (1998)
and Haley (2004). The similarity of the patterns seen in this thesis in Michigan to those
in previous studies in Massachusetts and New York reveals the same factors at work.
Older urban housing within the cities seems to be the primary source of lead exposure,
with African-Americans suffering the most.

(3) Can a model based on US Census socio-demographic and economic variables
accurately predict the spatial distribution of elevated BLL in Michigan over time?

The answer to this question is a bit more complicated than the previous two. The
hypothesis of this thesis was that a model based on socio-demographic and economic
variables would work over time because the same underlying factors were predictive for
lead exposure. In the regression portion of this thesis, this assertion turned out to be true
for some variables, but not others. For each of the three areal units, several independent
variables that were significant when the mean BLL from all years in the database was
used turned out not to be significant in several of the individual years. On the other hand,
the strongest predictors such as pre-1940 housing turned out to predict mean BLL on a
yearly basis as well.

The GWR model for the census tract level also sheds light on this question. The
three variables that best predicted mean BLL were percentage pre-1940 housing,
percentage African-American, and percentage female-headed households. GWR maps of
the coefficients for these variables revealed that they had the highest positive effect in the
urban areas of Michigan where mean BLL is higher. The implication is that the
variables that predict best in the cities are going to work best on a yearly basis. Variables
that characterize suburban or rural areas, where mean BLL is more volatile on a yearly
basis according to the standard deviation maps, are less likely to significantly predict
mean BLL over a shorter time span. It follows that the temporal length of
the research is very important to the outcome. A study that only covers a couple of years
within the database may classify independent variables as significant or insignificant
predictors of mean BLL differently from a study that covers all years of the database. For
example, at the census tract level, the variable percentage of housing units vacant is a
significant predictor of mean BLL for all eight years of the database combined. But when
tested as a predictor of the mean BLL for each individual year, percentage of vacant
houses is only significant in two years, 2000 and 2001.

4.3 Future Research

Spatial epidemiology is a useful tool in understanding and combating the threats
posed by health hazards such as lead. With the firm goal of eliminating elevated BLL in
Michigan children, future work must take both a research and a policy route. These two
routes are not mutually exclusive, instead relying heavily upon each other in order to
accomplish meaningful results. Future research involving lead poisoning should follow
two different tracks. First, studies from a spatial epidemiological perspective such as
this thesis could delve deeper into the issue at a finer spatial scale. A second line of
future research could examine the problem through on-site medical investigation of
children who have been exposed to lead. This line of inquiry could take on a geographic
perspective by determining if different lead-based hazards (paint, water pipes, and
atmospheric lead deposition) are responsible for exposure in different areas of Michigan.
As for public policy, greater coordination between academia and public health could
improve statewide remediation efforts. Spatial epidemiologic approaches to elevated BLL
that highlight hotspots and areas of concern could be a more efficient remediation measure
in the long run than targeting houses case by case.

This thesis sought to follow both previous geographic analyses of elevated BLL
and commonly used techniques for testing for clusters. In seeking to cover the entire
state of Michigan, the analysis in this thesis remained rather coarse. Study areas in this
thesis covered either health districts comprising multiple counties or large urban areas.
This might not be ideal for micro-targeting problem areas on a limited budget. Future
research could focus instead on applying methods such as the Geographic Analysis
Machine in smaller study areas such as sections of a city to find pockets of consistently
high blood lead test results. The statewide analysis in this thesis used a one-kilometer
grid, but a study in a smaller study region could use a much finer grid such as 100 meters
since computer processing time would not be an issue. This might reveal neighborhood
variation and strongly localized clusters that a statewide or citywide study might miss. In
a more localized cluster analysis, it might be possible to obtain a better control dataset as
well. A focus on smaller geographic units for regression analysis might also yield better
predictive models. The regression analysis in this thesis was limited to
enumerative units for which census data were available. More locally focused analysis
could use a unit of analysis such as tax parcels that would illuminate variation within the
neighborhood. Housing information such as the year an individual home was built would
greatly aid primary prevention efforts. Such data would likely be difficult to obtain, but
the information would be invaluable in building a strong regression model at a parcel
level. If these results were combined with survey data collected in the field, a more
accurate picture of the local risks could be obtained.

The second line of future research could take a medical investigation approach to
ground-level studies of elevated BLL in children. While the majority of cases of elevated
BLL occurred within urban areas of Michigan, the GAM maps showed that elevated BLL
was present as well in more rural areas. An interesting research question would be
whether the mechanism of exposure differs between different parts of the state.
While many cases in both urban and rural areas might still be related to exposure to old
paint, it would be compelling if other mechanisms such as old drinking water pipes,
nearby smelters, or other paths to exposure were present. Areas where these extra factors
were present could then be examined for possible increased incidence of elevated BLL.
This could go a long way in explaining areas with anomalously high incidence compared
to what might be expected based on housing age. Case investigation could yield the
greatest results in rural areas of the state, where individual cases are more likely to go
against what the area models predicted. While cluster analysis and spatial regression are
powerful tools, the exact cause of exposure can only be inferred from these methods.

The map in figure 3 showed the zip codes deemed high risk based on the CDC
recommendations. The majority of zip codes within Michigan were deemed high risk.
This project has shown that while many of the zip codes that have the largest clusters of
elevated BLL identified in this thesis are deemed high risk, several areas of the state
considered not high risk still show cases. A good example is the North Central study
region in this thesis. The GAM map in figure 65 shows a constellation of cases in areas
that are not considered high risk. Other non-high risk areas in other parts of the state show
examples of these isolated cases. A comparison of the figure 3 high risk zip code map
with the mean BLL zip code map in figure 77 reveals that non-high risk areas such as the
suburbs around Grand Rapids have mean BLL values as high as, if not higher than, the
high risk zip codes. Since this thesis focused on children covered by Medicaid, in theory
these children in non-high risk zip codes would be tested anyway. Still, it is a reminder
that even outside of the high risk zip code areas, the threat of lead poisoning is present.
Children who are not covered by Medicaid could very easily slip through the testing plan
in Appendix 1. To reach the final goal of complete elimination of lead poisoning in
Michigan, the best solution might be the most difficult: full screening of children under
two years of age and prompt remediation.

In 2004, the Task Force to Eliminate Childhood Lead Poisoning published seven
public policy priority recommendations for government action. These were to build
effective coalitions to secure funding for community prevention programs, provide case
management for children with elevated BLL, establish a trust to secure stability for lead
prevention funding, create a housing registry for pre-1978 homes, develop a public
awareness program, coordinate activity statewide, and expand lead remediation in
residential environments (Task Force to Eliminate Childhood Lead Poisoning 2004). The
main recommendation that could be added to the list is a closer relationship between the
state and the academic community regarding research. A coordinated effort between the
state and academia could harness spatial epidemiology studies in order to analyze test
results in real time. Such analysis would provide insight into how incoming results fit the
overall patterns of BLL within Michigan. Real time spatial epidemiology could find
areas that have been overlooked. Perhaps more importantly, such coordination between
the state and academia could evaluate the progress of remediation efforts. Only so much
can be gleaned from looking at maps and test results without the context of what is being
done on the ground. With such a partnership of real-time test results and statistical
mapping, remediation of lead-based hazards could take a leap forward and lead poisoning
in Michigan children could finally become a relic of an earlier era.

Appendix 1

Michigan Statewide Lead Testing/Lead Screening Plan

Three Criteria for Testing a Child for Lead Poisoning

Criterion 1
GEOGRAPHY

Option One: All children living within a
high-risk zip code should be tested
Option Two: Children can receive a risk
evaluation regarding testing using website
midata.msu.edu/bll

Criterion 2
MEDICAID

Medicaid: All Medicaid-enrolled children
must be tested - No exceptions or waivers

Criterion 3
QUESTIONNAIRE
for
Children NOT enrolled in Medicaid
Children NOT living within a high risk zip code

Specifics for Each Criterion

High Risk Zip Code:

1. 27% pre-1950 built housing

2. 12% incidence of lead poisoning among
children 12 to 36 months of age in 2000

3. High percentages of pre-1950 housing and
children under six years old in poverty

Medicaid:

A blood test is required for any Medicaid-enrolled
child at 12 and 24 months of age,
or between 36 and 72 months of age if not
previously tested

Questionnaire:

1. Does the child live in or often visit a house,
day care, or preschool built before 1950?

2. Does the child live in or often visit a house
built before 1978 that has been remodeled within
the last year?

3. Does the child have a brother or sister or
playmate with lead poisoning?

4. Does the child live with an adult whose job
or hobby involves lead?

5. Does the child's family use any home
remedies or cultural practices that may contain
or use lead?

6. Is the child included in a special population
group, i.e. foreign adoptee, refugee, immigrant,
foster care child?
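
The three criteria above amount to a short decision rule. The following is a minimal sketch in R, not part of the state plan. It assumes each criterion is recorded as a logical value and that any "yes" answer on the questionnaire leads to a blood test, a rule the plan as reproduced here does not spell out. The function name needs_lead_test and its arguments are hypothetical.

```r
# Hypothetical helper: TRUE when the plan in Appendix 1 would call for a blood lead test.
# high_risk_zip : logical, child lives in a high risk zip code (Criterion 1, Option One)
# medicaid      : logical, child is enrolled in Medicaid (Criterion 2)
# questionnaire : logical vector of the six questionnaire answers (Criterion 3)
needs_lead_test <- function(high_risk_zip, medicaid, questionnaire = rep(FALSE, 6)) {
  # Criterion 1: all children in a high risk zip code should be tested
  if (isTRUE(high_risk_zip)) return(TRUE)
  # Criterion 2: all Medicaid-enrolled children must be tested
  if (isTRUE(medicaid)) return(TRUE)
  # Criterion 3 (assumption): any "yes" answer triggers a test;
  # unanswered questions (NA) are ignored in this sketch
  any(questionnaire, na.rm = TRUE)
}

# A child outside the high risk zip codes, not on Medicaid, whose home was
# remodeled within the last year (question 2 answered "yes")
answers <- c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
result_q  <- needs_lead_test(FALSE, FALSE, answers)
# The same child with every question answered "no"
result_no <- needs_lead_test(FALSE, FALSE, rep(FALSE, 6))
```

Written this way, the rule makes the gap discussed in the conclusion visible: a child who is not on Medicaid, lives outside a high risk zip code, and answers "no" to every question is never tested.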

Appendix 2
Difference of K code in R
# Difference of K function #
library(maptools)
library(spatstat)
library(splancs)

lan <- read.shape("Lansing") #Load study area shapefile
med <- read.shape("Med98L") #Load 1998 Lansing test results shapefile

x <- vector(length=length(med$Shape)) #Create empty vector for x coordinates
y <- vector(length=length(med$Shape)) #Create empty vector for y coordinates
for (i in 1:length(med$Shape)) {
  x[i] <- med$Shape[[i]]$verts[,1] #Fill x and y vectors with the Michigan
  y[i] <- med$Shape[[i]]$verts[,2] #Georef coordinates
}

wp <- cbind(x, y, med$attdata) #Create data frame with locations and attributes
wp <- subset(wp, select = c(x, y, CC10)) #Select out the case/control threshold of 10

cx <- lan$Shape[[1]]$verts[,1] #Create data frame of study area x coordinates
cy <- lan$Shape[[1]]$verts[,2] #Create data frame of study area y coordinates

lan.bdy <- cbind(cx, cy) #Create study area boundary

cases <- wp[wp$CC10==1,] #Select out all cases at the 10 ug/dL threshold
controls <- wp[wp$CC10==0,] #Select out all controls at the 10 ug/dL threshold

p.cases <- as.points(cases) #Convert cases to points
p.controls <- as.points(controls) #Convert controls to points

#Define distances
dist <- seq(500, 10000, 500) #Define distances of concentric circles

k.case <- khat(p.cases, lan.bdy, s=dist) #Calculate Ripley's K for cases
k.control <- khat(p.controls, lan.bdy, s=dist) #Calculate Ripley's K for controls

K.diff <- k.case - k.control #Calculate the difference of K

# Random Labeling Simulation #
env.lab <- Kenv.label(p.cases, p.controls, bboxx(bbox(lan.bdy)), nsim=19, s=dist)

#Plot the Results#
plot(dist, K.diff, xlab="Distance", ylab="Diff in K", ylim=range(K.diff-dist,
     env.lab$lower-dist, env.lab$upper-dist))
lines(dist, env.lab$upper, lty=2)
lines(dist, env.lab$lower, lty=2)
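
The logic of the listing above can be illustrated without shapefiles or the splancs package. The sketch below is an assumption-laden illustration, not the thesis code. It uses simulated points in a 10 km square and a naive K estimate with no edge correction, so its values will differ from those of khat. The envelope comes from 19 random relabelings of cases and controls, the same idea that Kenv.label implements. The names naive_k, sim_cases and sim_controls are hypothetical.

```r
# Naive Ripley's K without edge correction (illustration only):
# K(s) = area / (n * (n - 1)) * number of ordered pairs of events within distance s
naive_k <- function(pts, area, s) {
  n <- nrow(pts)
  d <- as.vector(dist(pts))  # each unordered pair of events appears once
  sapply(s, function(h) area * 2 * sum(d <= h) / (n * (n - 1)))
}

set.seed(1)
area <- 10000^2  # simulated 10 km x 10 km study area, coordinates in meters
sim_controls <- cbind(runif(200, 0, 10000), runif(200, 0, 10000))  # spread evenly
sim_cases    <- cbind(rnorm(40, 5000, 800), rnorm(40, 5000, 800))  # clustered near center
s <- seq(500, 5000, 500)

# Difference of K: positive values mean cases cluster more than controls
k_diff <- naive_k(sim_cases, area, s) - naive_k(sim_controls, area, s)

# Random labeling: shuffle which points count as cases, recompute the difference
all_pts <- rbind(sim_cases, sim_controls)
n_case  <- nrow(sim_cases)
sims <- replicate(19, {
  idx <- sample(nrow(all_pts), n_case)
  naive_k(all_pts[idx, , drop = FALSE], area, s) -
    naive_k(all_pts[-idx, , drop = FALSE], area, s)
})
lower <- apply(sims, 1, min)  # envelope of the simulated differences
upper <- apply(sims, 1, max)

# Distances at which the observed difference leaves the envelope
exceeds <- k_diff > upper
```

With 19 simulations, an observed difference above the upper envelope at a given distance corresponds roughly to a one-sided 5% significance level at that distance.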

Appendix 3

Geographic Analysis Machine code in R
#Geographic Analysis Machine#

library(splancs)
library(spatstat)
library(maptools)

lan <- read.shape("Lansing") #Load study area shapefile
med98 <- read.shape("Med98L") #Load 1998 Lansing test results shapefile

lx <- lan$Shape[[1]]$verts[,1] #Create data frame of study area x coordinates
ly <- lan$Shape[[1]]$verts[,2] #Create data frame of study area y coordinates

lan.bdy <- cbind(lx, ly) #Create study area boundary

x <- vector(length=length(med98$Shape)) #Create empty vector for x coordinates
y <- vector(length=length(med98$Shape)) #Create empty vector for y coordinates
for (i in 1:length(med98$Shape)) {
  x[i] <- med98$Shape[[i]]$verts[,1] #Fill x and y vectors with the Michigan
  y[i] <- med98$Shape[[i]]$verts[,2] #Georef coordinates
}

medp <- cbind(x, y, med98$attdata) #Create data frame with locations, attributes

medp <- subset(medp, select = c(x, y, CC10)) #Select out the case/control threshold
                                             #of 10

distance <- function (x1, y1, x2, y2) { #Create function to calculate distance
  euc <- sqrt((x2-x1)^2 + (y2-y1)^2)
  return(euc)
}

backgd.rate <- 0.014147 #ENTER BACKGROUND RATE HERE
lan.grid <- gridpts(lan.bdy, xs=1000, ys=1000) #Create 1 kilometer grid

#Create empty distance matrix
dist.mat <- matrix(nrow=length(lan.grid[,1]), ncol=length(medp$x))

#Create empty matrix for calculation results
close <- matrix(data=0, nrow=length(lan.grid[,1]), ncol=4)

#Calculate distance between grid points and test results
for (i in 1:length(lan.grid[,1]))
  dist.mat[i,] <- distance(lan.grid[i,1], lan.grid[i,2], medp$x, medp$y)

#Loop to fill calculation matrix with number of points within 1.8 kilometers of the
#grid points, the number of these points that are controls, number that are elevated
#BLL cases, and the expected number of cases
for (i in 1:length(lan.grid[,1])) {
  close[i,1] <- sum(dist.mat[i,] < 1800) # all pts within 1.8km
  close[i,2] <- sum(dist.mat[i,medp$CC10==0] < 1800) # just control
  close[i,3] <- sum(dist.mat[i,medp$CC10==1] < 1800) # just lead
  close[i,4] <- close[i,1]*backgd.rate # Expected # cases
}

#Highlight grid points where there is less than a 5% chance of the number of
#elevated BLL cases occurring according to a Poisson distribution with the
#background rate as the mean
v1800.98 <- ((ppois(close[,3], (close[,4])) > 0.95) & (close[,3] > 0))

#Run kernel smoother over the resulting grid
k1800.98 <- kernel2d(lan.grid[v1800.98,], lan.bdy, h0=1800, nx=500, ny=500)

#Plot final map
polymap(lan.bdy, border="grey")
image(k1800.98, add=TRUE, col=heat.colors(20))
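
The counting and flagging steps of the listing above can be tried on simulated data with base R alone. The sketch below is an illustration under stated assumptions, not the thesis code. The test locations, the planted cluster and the square grid are invented, and the names tests, grid, inside and flag are hypothetical. Only the background rate, the 1.8 kilometer radius and the Poisson rule are taken from the listing.

```r
backgd_rate <- 0.014147  # background rate used in Appendix 3
radius <- 1800           # circle radius in meters

set.seed(42)
# Simulated background test results in a 10 km x 10 km square (meters);
# cc10 = 1 marks an elevated BLL case
background <- data.frame(x = runif(500, 0, 10000), y = runif(500, 0, 10000))
background$cc10 <- rbinom(500, 1, backgd_rate)
# Planted cluster of ten cases around (3000, 3000), purely for illustration
cluster <- data.frame(x = 3000 + seq(-400, 400, length.out = 10), y = 3000, cc10 = 1)
tests <- rbind(background, cluster)

# 1 km grid of circle centers
grid <- expand.grid(x = seq(0, 10000, 1000), y = seq(0, 10000, 1000))

# Distance from every grid point (rows) to every test result (columns)
d <- sqrt(outer(grid$x, tests$x, "-")^2 + outer(grid$y, tests$y, "-")^2)
inside <- d < radius

n_all    <- rowSums(inside)                                   # all tests in each circle
n_cases  <- rowSums(inside[, tests$cc10 == 1, drop = FALSE])  # elevated cases in each circle
expected <- n_all * backgd_rate                               # expected cases at background rate

# Same rule as the listing: flag circles whose case count is improbably high
flag <- ppois(n_cases, expected) > 0.95 & n_cases > 0
```

Note that ppois(k, mu) returns P(X <= k), so the rule flags a circle when the probability of seeing k or fewer cases exceeds 0.95. A strict upper-tail version, P(X >= k) < 0.05, would be written 1 - ppois(k - 1, mu) < 0.05 and is slightly more lenient. The sketch keeps the rule exactly as it appears in the listing.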

Literature Cited

Agency for Toxic Substances & Disease Registry. 2007. Lead Toxicity - What Are the
Physiologic Effects of Lead Exposure 2007 [cited October 18 2007]. Available
from http://www.atsdr.cdc.gov/csem/lead/pbphvsiologic effectthtml.

Akhtar, R. 1982. The Geography of Health: An Essay and a Bibliography. New Delhi:
Marwah Publications.

American Academy of Pediatrics. 2003. Michigan Medicaid Facts.

Angier, N. 2007. The Pernicious Allure of Lead. New York Times, August 21, 2007.

Bailey, A., J. Sargent, and M. Blake. 1998. A Tale of Two Counties: Childhood Lead
Poisoning, Industrialization, and Abatement in New England. Economic
Geography 74:96-111.

Bailey, A., J. Sargent, D. Goodman, J. Freeman, and M. J. Brown. 1994. Poisoned
Landscapes: The Epidemiology of Environmental Lead Exposure in
Massachusetts Children 1990-1991. Social Science and Medicine 39 (6):757-766.

Barboza, D. 2007. Why Lead in Toy Paint? It's Cheaper. New York Times, September 11,
2007.

Beam, C. 2007. Why Do They Put Lead Paint in Toys: It's Bright, Cheap, and Lasts
Forever 2007 [cited October 13 2007]. Available from
http://slate.com/id/2172289.

Bellinger, D., and A. Bellinger. 2006. Childhood lead poisoning: the torturous path from
science to policy. The Journal of Clinical Investigation 116 (4):853-857.

Bellinger, D., and J. Schwartz. 1997. Effects of Lead in Children and Adults. In Topics in
Environmental Epidemiology, eds. K. Steenland and D. Savitz. New York:
Oxford University Press.

Brill, R., and J. Wampler. 1967. Isotope Studies of Ancient Lead. American Journal of
Archaeology 71 (1):63-77.

Canfield, R., C. Henderson, D. Cory-Slechta, C. Cox, T. Jusko, and B. P. Lanphear.
2003. Intellectual Impairment in Children with Blood Lead Concentrations Below
10 Micrograms per Deciliter. The New England Journal of Medicine 348
(16):1517-1526.

Centers for Disease Control and Prevention. 2005a. ToxFAQs for Lead, ed. ATSDR.

Centers for Disease Control and Prevention. 2005b. Building Blocks for Primary
Prevention: Protecting Children from Lead-Based Paint Hazards, ed. Health and
Human Services, 264 p.

Chen, A., K. Dietrich, J. Ware, J. Radcliffe, and W. Rogan. 2005. IQ and Blood Lead
from 2 to 7 Years of Age: Are the Effects in Older Children the Residual of High
Blood Lead Concentrations in 2-Year Olds. Environmental Health Perspectives
113 (5):597-601.

Chisolm, J. 2001. Evolution of the Management and Prevention of Childhood Lead
Poisoning: Dependence of Advances in Public Health on Technological Advances
in the Determination of Lead and Related Biochemical Indicators of Its Toxicity.
Environmental Research Section A 86 (2):111-121.

Clarkson, T. 1995. Health Effects of Metals: A Role for Evolution? Environmental
Health Perspectives 103 (Supplement 1):9-12.

Cromley, E. K., and S. L. McLafferty. 2002. GIS and Public Health. New York: The
Guilford Press.

Daniel, K., M. Sedlis, L. Polk, S. Dowuona-Hammond, B. McCants, and T. Matte. 1990.
Childhood Lead Poisoning, New York City, 1988. The Morbidity and Mortality
Weekly Report 39:1-7.

Department of Housing and Urban Development. 1993. Understanding Title X: A
Practical Guide to the Residential Lead-Based Paint Hazard Reduction Act of
1992.

Department of Housing and Urban Development. 2004. History of Lead-Based Paint
Legislation.

Dietrich, K., J. Ware, M. Salganik, J. Radcliffe, W. Rogan, G. Rhoads, M. Fay, C.
Davoli, M. Denckla, R. Bornschein, D. Schwartz, D. Dockery, S. Adubato, and R.
Jones. 2004. Effect of Chelation Therapy on the Neuropsychological and
Behavioral Development of Lead-Exposed Children After School Entry.
Pediatrics 114 (1):19-26.

Dignam, T., A. Evens, E. Eduardo, S. Ramirez, K. Caldwell, N. Kilpatrick, G. Noonan,
D. Flanders, P. Meyer, and M. McGeehin. 2004. High-Intensity Targeted
Screening for Elevated Blood Lead Levels among Children in 2 Inner-City
Chicago Communities. American Journal of Public Health 94 (11):1945-1951.

Dockerty, J., K. Sharples, and B. Borman. 1999. An Assessment of Spatial Clustering of
Leukaemias and Lymphomas among Young People in New Zealand. Journal of
Epidemiology and Community Health 53:154-158.

Dolk, H., A. Busby, B. Armstrong, and P. Walls. 1998. Geographical Variation in
Anophthalmia and Microphthalmia in England, 1988-94. British Medical Journal
317 (7163):905-910.

Environmental Protection Agency. 1996. EPA Takes Final Step in Phaseout of Leaded
Gasoline.

Environmental Protection Agency. 2001. Lead Based Paint Prevention in Certain
Residential Structures.

Ettinger, A. S. 2007. Chelation Therapy for Childhood Lead Poisoning: Does Excretion
Equal Efficacy? Harvard School of Public Medicine 1999 [cited October 21
2007]. Available from
http://www.hsph.harvard.edu/Organizations/ddil/chelation.htm.

Fee, E. 1990. Public Health in Practice: An Early Confrontation with the 'Silent
Epidemic' of Childhood Lead Paint Poisoning. Journal of the History of Medicine
and Allied Sciences 45 (4):570-606.

Finkelstein, Y., M. Markowitz, and J. Rosen. 1998. Low-Level Lead-Induced
Neurotoxicity in Children: An Update on Central Nervous System Effects. Brain
Research Reviews 27 (2):168-176.

Finn, M. 2007. Health Care Demand in Michigan: An Examination of the Michigan
Certificate of Need Acute Care Bed Need Methodology, Geography, Michigan
State University, East Lansing.

Flegal, A., and D. Smith. 1992. Lead Levels in Preindustrial Humans. New England
Journal of Medicine 326 (19):1293-1294.

Foley, J., P. Foley, and J. Madigan. 2001. Spatial Distribution of Seropositivity to the
Causative Agent of Granulocytic Ehrlichiosis in Dogs in California. American
Journal of Veterinary Research 62 (10):1599-1605.

Fotheringham, A., C. Brunsdon, and M. Charlton. 2002. Geographically Weighted
Regression: The Analysis of Spatially Varying Relationships. Hoboken, NJ: John
Wiley & Sons.

Frost, S. W. 2004. Lead Poisoning in Young Children: Determining Risk Factors and
Exposure Sources - An Environmental Justice Approach, Sociology, Michigan
State University, East Lansing.

Garza, A., H. Chavez, R. Vega, and E. Soto. 2005. Cellular and Molecular Mechanism of
Lead Neurotoxicity. Salud Mental 28 (2):48-58.

Gaston, J. 1972. Geography of Lead Poisoning: Development of a Model, Geography,
Michigan State University, East Lansing.

Gibson, J. 1904. A Plea for Painted Railings and Painted Walls of Rooms as the Source
of Lead Poisoning Amongst Queensland Children. Australasian Medical Gazette
(reprinted in Public Health Reports May-June 2005).

Gilbert, S., and B. Weiss. 2005. Preventing Neurodevelopment Disorders: The CDC
Should Lower the Blood Lead Action Level From 10 to 2 micrograms per
deciliter. Paper read at 22nd International Neurotoxicology Conference,
September 11-14, at Research Triangle Park, NC.

Goyer, R. 1993. Lead Toxicity: Current Concerns. Environmental Health Perspectives
100:177-187.

Griffith, D. A., P. G. Doyle, D. C. Wheeler, and D. L. Johnson. 1998. A Tale of Two
Swaths: Urban Childhood Blood-Lead Levels across Syracuse, New York. Annals
of the Association of American Geographers 88 (4):640-655.

Guthe, W., R. Tucker, E. Murphy, R. England, E. Stevenson, and J. Luckhardt. 1992.
Reassessment of Lead Exposure in New Jersey Using GIS Technology.
Environmental Research 59 (2):318-325.

Haley, V., and T. Talbot. 2004. Geographic Analysis of Blood Lead Levels in New York
State Children Born 1994-1997. Environmental Health Perspectives 112
(15):1577-1582.

Hernberg, S. 2000. Lead Poisoning in a Historical Perspective. American Journal of
Industrial Medicine 38 (3):244-254.

Honari, M. 1999. Health Ecology: An Introduction. In Health Ecology: Health, Culture
and Human-Environment Interaction, eds. M. Honari and T. Boleyn. London:
Routledge.

Huang, Y., and Y. Leung. 2002. Analysing Regional Industrialisation in Jiangsu Province
using Geographically Weighted Regression. Journal of Geographical Systems 4
(2):233-249.

Hunter, D. 1969. The Diseases of Occupations. 4th ed. Boston: Little, Brown.

Hunter, J. 1976. Aerosol and Roadside Lead as Environmental Hazard. Economic
Geography 52 (2):147-160.

Jacobs, D., R. Clickner, J. Zhou, S. Viet, D. Marker, J. Rogers, D. Zeldin, P. Broene, and
W. Friedman. 2002. The Prevalence of Lead-Based Paint Hazards in U.S.
Housing. Environmental Health Perspectives 110 (10):A599-A606.

Jacobziner, H., and H. Raybin. 1962. Epidemiology of Lead Poisoning. Archives of
Pediatrics 79 (2):72-76.

Jones, K., and G. Moon. 1987. Health, Disease, and Society. London: Routledge &
Kegan Paul.

Kaplowitz, S., H. Perlstadt, and L. Post. 2007. Predicting Blood Lead Level from
Medicaid Eligibility, Race, and Neighborhood Census Data: An Analysis of
Michigan Data. East Lansing, MI: Michigan State University.

Kemper, A. R., C. Bordley, and S. Downs. 1998. Cost-Effectiveness Analysis of Lead
Poisoning Screening Strategies Following the 1997 Guidelines of the Centers for
Disease Control and Prevention. Archives of Pediatrics & Adolescent Medicine
152 (12):1202-1208.

Kemper, A. R., and S. Clark. 2005c. Physician Barriers to Lead Testing of Medicaid-
Enrolled Children. Ambulatory Pediatrics 5 (5):290-293.

Kemper, A. R., L. M. Cohn, K. E. Fant, and K. J. Dombkowski. 2005a. Blood Lead
Testing Among Medicaid-Enrolled Children in Michigan. Archives of Pediatrics
& Adolescent Medicine 159 (7):646-650.

Kemper, A. R., L. M. Cohn, K. E. Fant, K. J. Dombkowski, and S. Hudson. 2005b.
Follow-up Testing Among Children With Elevated Screening Blood Lead Levels.
Journal of the American Medical Association 293 (18):2232-2237.

Kemper, A. R., R. Uren, and S. Hudson. 2007. Childhood Lead Poisoning Prevention
Activities within Michigan Local Public Health Departments. Public Health
Reports 122 (1):88-92.

Kitman, J. L. 2000. The Secret History of Lead: Special Report. The Nation.

Kovarik, W. 2005. Ethyl-Leaded Gasoline: How a Classic Occupational Disease Became
an International Public Health Disaster. International Journal of Occupational
and Environmental Health 11 (4):384-397.

Lam, T. 2007. Money on the Way to Fight Lead Poisoning in Homes. Detroit Free-Press,
October 2, 2007.

Lanphear, B. P. 2005a. Childhood Lead Poisoning: Too Little, Too Late. Journal of the
American Medical Association 293 (18):2274-2276.

Lanphear, B. P., R. Byrd, P. Auinger, and S. Schaffer. 1998b. Community Characteristics
Associated with Elevated Blood Lead Levels in Children. Pediatrics 101 (2):264-
271.

Lanphear, B. P., R. Hornung, J. Khoury, K. Yolton, P. Baghurst, D. Bellinger, R. L.
Canfield, K. N. Dietrich, R. Bornschein, T. Greene, S. J. Rothenberg, H. L.
Needleman, L. Schnaas, G. Wasserman, J. Graziano, and R. Roberts. 2005b.
Low-Level Environmental Lead Exposure and Children's Intellectual Function:
An International Pooled Analysis. Environmental Health Perspectives 113
(7):894-899.

Lanphear, B. P., T. Matte, J. Rogers, R. Clickner, B. Dietz, R. Bornschein, P. Succop, K.
Mahaffey, S. Dixon, W. Galke, M. Rabinowitz, M. Farfel, C. Rohde, J. Schwartz,
P. Ashley, and D. Jacobs. 1998d. The Contribution of Lead-Contaminated House
Dust and Residential Soil to Children's Blood Lead Levels: A Pooled Analysis of
12 Epidemiologic Studies. Environmental Research 79 (1):51-68.

Leung, Y., C.-L. Mei, and W.-X. Zhang. 2000. Statistical Tests for Spatial
Nonstationarity Based on the Geographically Weighted Regression Model.
Environment and Planning A 32 (1):9-32.

Lidsky, T., and J. Schneider. 2003. Lead Neurotoxicity in Children: Basic Mechanisms
and Clinical Correlates. Brain 126:5-19.

Litaker, D., C. M. Kippes, T. E. Gallagher, and M. E. O'Connor. 2000. Targeting Lead
Screening: The Ohio Lead Risk Score. Pediatrics 106 (5):Art. No. e69.

Mahaffey, K., J. Annest, J. Roberts, and R. Murphy. 1982. National Estimates of Blood
Lead Levels: United States 1976-1980. New England Journal of Medicine 307
(10):573-579.

Markowitz, G., and D. Rosner. 2000. "Cater to the Children": The Role of The Lead
Industry in a Public Health Tragedy, 1900-1955. American Journal of Public
Health 90 (1):36-46.

Markowitz, G., and D. Rosner. 2002. Deceit and Denial: The Deadly Politics of
Industrial Pollution. Berkeley: University of California Press.

Mayer, J. 1982. Medical Geography: Some Unsolved Problems. The Professional
Geographer 34 (3):261-269.

Mayer, J. 1986. Ecological Associative Analysis. In Medical Geography: Progress and
Prospect, ed. M. Pacione. London: Croom Helm.

McKnight, K. 2006. Spatial Trends of West Nile Virus in Detroit, Michigan 2002,
Geography, Michigan State University, East Lansing.

Meade, M. 1977. Medical Geography as Human Ecology: The Dimension of Population
Movement. Geographical Review 67 (4):379-393.

Meade, M., and R. Earickson. 2000. Medical Geography. New York: The Guilford Press.

Michigan Department of Community Health. 1998. Annual Report on Blood Lead Levels
in Michigan.

Michigan Department of Community Health. 2001. Annual Report on Blood Lead Levels
in Michigan.

Michigan Department of Community Health. 2005a. Annual Report on Blood Lead Levels
on Adults and Children in Michigan.

Michigan Department of Community Health. 2006. Annual Report on Blood Lead Levels
on Adults and Children in Michigan.

Michigan Department of Community Health. 2007. Statewide Lead Testing/Lead
Screening Plan.

Michigan Department of Natural Resources. 2001. GIS/GPS Education.

Mielke, H. 1999. Lead in the Inner Cities. American Scientist 87 (1):62-73.

Miranda, M. L., D. Dolinoy, and M. A. Overstreet. 2002. Mapping for Prevention: GIS
Models for Directing Childhood Lead Poisoning Prevention Programs.
Environmental Health Perspectives 110 (9):947-953.

Murray, K., D. Rogers, and M. Kaufman. 2004. Heavy Metals in an Urban Watershed in
Southeastern Michigan. Journal of Environmental Quality 33 (1):163-172.

Nakaya, T., A. Fotheringham, C. Brunsdon, and M. Charlton. 2005. Geographically
Weighted Poisson Regression for Disease Association Mapping. Statistics in
Medicine 24 (17):2695-2717.

Needleman, H. 1998. Clair Patterson and Robert Kehoe: Two Views of Lead Toxicity.
Environmental Research Section A 78 (2):79-85.

Needleman, H. 2004. Lead Poisoning. Annual Review of Medicine 55 (1):209-222.

Needleman, H., and D. Bellinger. 1991a. The Health Effects of Low Level Exposure to
Lead. Annual Review of Public Health 12:111-140.

Nriagu, J. 1983. Saturnine Gout among Roman Aristocrats. Did Lead Poisoning
Contribute to the Fall of the Empire. New England Journal of Medicine 308
(11):660-663.

Nriagu, J. 1990. The Rise and Fall of Leaded Gasoline. The Science of the Total
Environment 92:13-28.

Nriagu, J. 1998. Clair Patterson and Robert Kehoe's Paradigm of "Show Me the Data" on
Environmental Lead Poisoning. Environmental Research Section A 78 (2):71-78.

O'Brien, D., J. Kaneene, A. Getis, J. Lloyd, G. Swanson, and R. Leader. 2000. Spatial
and Temporal Comparison of Selected Cancers in Dogs and Humans, Michigan,
USA, 1964-1994. Preventive Veterinary Medicine 47 (3):187-204.

O'Sullivan, D., and D. Unwin. 2003. Geographic Information Analysis. Hoboken, NJ:
Wiley and Sons, Inc.

Openshaw, S., A. Craft, M. Charlton, and J. Birch. 1988. Investigation of Leukaemia
Clusters by use of a Geographic Analysis Machine. The Lancet 331 (8580):272-
273.

Ozden, T., H. Issever, G. Gokcay, and G. Saner. 2004. Longitudinal Analyses of Blood-
Lead Levels and Risk Factors for Lead Poisoning in Healthy Children under Two
Years of Age. Indoor Built Environment 13:303-308.

Parsons, P., A. Reilly, and D. Esernio-Jenssen. 1997. Screening Children Exposed to
Lead: An Assessment of the Capillary Blood Lead Fingerstick Test. Clinical
Chemistry 43 (2):302-311.

Pirkle, J., R. Kaufmann, D. Brody, T. Hickman, E. Gunter, and D. Paschal. 1998.
Exposure of the US Population to Lead, 1991-1994. Environmental Health
Perspectives 106 (11):745-750.

Prince, M., A. Chetwynd, P. Diggle, M. Jarner, J. Metcalf, and O. James. 2001. The
Geographical Distribution of Primary Biliary Cirrhosis in a Well-Defined Cohort.
Hepatology 34 (6):1083-1088.

Rabin, R. 1989. Warnings Unheeded: A History of Child Lead Poisoning. American
Journal of Public Health 79 (12):1668-1674.

Rabin, R. 2008. The Lead Industry and Lead Water Pipes: "A Modest Campaign".
American Journal of Public Health 98 (9):1584-1592.

Richardson, J. 2005. The Cost of Being Poor: Poverty, Lead Poisoning, and Policy
Implementation. Westport, CT: Praeger Publishers.

Rosen, J., and P. Mushak. 2001. Primary Prevention of Childhood Lead Poisoning: The
Only Solution. New England Journal of Medicine 344 (19):1470-1471.

Sargent, J., A. Bailey, P. Simon, M. Blake, and M. Dalton. 1997. Census Tract Analysis
of Lead Exposure in Rhode Island Children. Environmental Research 74 (2):159-
168.

Sargent, J., M. J. Brown, J. Freeman, A. Bailey, D. Goodman, and D. Freeman. 1995.
Childhood Lead Poisoning in Massachusetts Communities: Its Association with
Sociodemographic and Housing Characteristics. American Journal of Public
Health 85 (4):528-534.

Shearmur, R., P. Apparicio, P. Lizion, and M. Polese. 2007. Space, Time, and Local
Employment Growth: An Application of Spatial Regression Analysis. Growth
and Change 38 (4):696-722.

Silbergeld, E. 1997. Preventing Lead Poisoning in Children. Annual Review of Public
Health 18:187-210.

Talbot, T., S. Forand, and V. Haley. 1998. Geographic Analysis of Childhood Lead
Exposure in New York State. Paper read at Proceedings of the 3rd National
Conference on GIS in Public Health, August 17-20, at San Diego.

Task Force to Eliminate Childhood Lead Poisoning. 2004. Final Report of the Task Force
to Eliminate Lead Poisoning.

Tong, S. 1990. Roadside Dusts and Soils Contamination in Cincinnati, Ohio, USA.
Environmental Management 14 (1):107-113.

Tong, S., Y. Schirnding, and T. Prapamontol. 2000. Environmental Lead Exposure: A
Public Health Problem of Global Dimensions. Bulletin of the World Health
Organization 78 (9):1068-1077.

United States Geological Survey. 2007. Lead: Statistics and Information 2007 [cited
October 11 2007]. Available from
http://minerals.usgs.gov/minerals/pubs/commodity/lead/index.html#myb.

US Census Bureau. 2000. Geographic Areas Reference Manual.

US Census Bureau. 2001. DP-4, Profile of Selected Housing Characteristics: 2000
(Geographic Area: Michigan).

Venables, W., and D. Smith. 2008. An Introduction to R.

Vojnovic, I., C. Jackson-Elmoore, J. Holtrop, and S. Bruch. 2006. The Renewed Interest
in Urban Form and Public Health: Promoting Increased Physical Activity in
Michigan. Cities 23 (1):1-17.

Waldron, H. 1973. Lead Poisoning in the Ancient World. Medical History 17 (4):391-
399.

Waller, L., and C. Gotway. 2004. Applied Spatial Statistics for Public Health Data.
Hoboken, NJ: Wiley and Sons, Inc.

Weiss, D., W. Shotyk, and O. Kempf. 1999. Archives of Atmospheric Lead Pollution.
Naturwissenschaften 86 (6):262-275.

Wheeler, D. C. 2007. A comparison of spatial clustering and cluster detection techniques
for childhood leukemia incidence in Ohio, 1996-2003. International Journal of
Health Geographies 6 (13):2-38.

Yohn, S., D. Long, J. Fett, and L. Patino. 2004. Regional Versus Local Influences on
Lead and Cadmium Loading to the Great Lakes Region. Applied Geochemistry 19
(7):1157-1175.

Zandbergen, P., and J. Green. 2007. Error and Bias in Determining Exposure Potential of
Children at School Locations Using Proximity-Based GIS Techniques.
Environmental Health Perspectives 115 (9):1363-1369.
