
Page 1: Thomas Talbot Chief, Environmental Health Surveillance Section NYS Department of Health April 18, 2013

Geographic Aggregation
GIS & Public Health Class

Thomas Talbot
Chief, Environmental Health Surveillance Section

NYS Department of Health

April 18, 2013


• State health departments and federal agencies such as NCI and CDC provide county level health indicators.

• Stakeholders want the data at a finer geographic scale.


Health data can be shown at different geographic scales

• Residential address

• Census blocks and tracts

• ZIP codes

• Towns


Concerns about release of small area data

• Risk of disclosure of confidential information.

• Rates of disease are unreliable due to small numbers.


Rate maps with small numbers provide very little information.

Rates are suppressed due to confidentiality or are unstable.


Disclosure of confidential information

Census Blocks


Geographic Smoothed or Aggregated Count & Rate Maps

• Protect Confidentiality so data can be shared.

• Reduce random fluctuations in rates due to small numbers.


Smoothed Rate Maps

• Borrow data from neighboring areas to provide more stable rates of disease.

– Shareware tools available
– Empirical or Hierarchical Bayesian approaches
– Adaptive Spatial Filters
– Headbanging
– etc.


from Talbot et al., Statistics in Medicine, 2000


Problems with smoothing

• Does not provide counts & rates for defined geographic areas.

• Not clear how to link risk factor data with smoothed health data.

• Methods are sometimes difficult to understand - “black boxes”

• May not meet requirements of some policies or legislation.


Environmental Facilities & Cancer Incidence Map Law, 2008

§ 3-0317

• Plot cancer cases by census block, except in cases where such plotting could make it possible to identify any cancer patient.

• Census blocks shall be aggregated to protect confidentiality.


Environmental Justice & Permitting NYSDEC Commissioner Policy 29

• Incorporate existing human health data into the environmental review process.

• Data will be made available at a fine geographic scale.


Public Health Support for Brownfield/Land Reuse in the Areas of Concern for the Great Lakes

CDC-RFA-TS10-1003

• Identification of community health status indicators for areas of concern
– Environmental data
– Community health concerns
– Public health data
– Pre and post development


Aggregation

• Consider geographic scale

• Consider zone


• In the following example I randomly placed points on a map with on average 10 points for each grid cell.

• The observed number of points vs. the expected number of points changes as we move the grid or if we change the scale by combining grids.
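The effect described above can be reproduced with a short simulation. This is an illustrative sketch only, not the example shown in the slides: the 6 x 6 study area, the random seed, and the helper names `cell_counts` and `ratio_range` are assumptions. Points are scattered uniformly so that each unit cell expects 10 points, and the observed/expected ratio is then computed for the original grid, for a grid shifted by half a cell, and for 2 x 2 combined cells.

```python
import random
from collections import Counter

def cell_counts(points, cell_size, offset=0.0):
    """Count points per square grid cell; `offset` shifts the grid origin."""
    counts = Counter()
    for x, y in points:
        col = int((x - offset) // cell_size)
        row = int((y - offset) // cell_size)
        counts[(col, row)] += 1
    return counts

def ratio_range(counts, n_cells, expected):
    """Smallest and largest observed/expected ratio over an n_cells x n_cells block of full cells."""
    ratios = [counts.get((c, r), 0) / expected
              for c in range(n_cells) for r in range(n_cells)]
    return round(min(ratios), 2), round(max(ratios), 2)

random.seed(1)   # arbitrary seed so the run is repeatable
SIDE = 6         # hypothetical study area: 6 x 6 unit cells
EXPECTED = 10    # average number of points per unit cell, as on the slide
points = [(random.uniform(0, SIDE), random.uniform(0, SIDE))
          for _ in range(SIDE * SIDE * EXPECTED)]

original = cell_counts(points, cell_size=1.0)
shifted = cell_counts(points, cell_size=1.0, offset=0.5)   # move the grid (zone effect)
combined = cell_counts(points, cell_size=2.0)              # combine 2 x 2 cells (scale effect)

print("original grid :", ratio_range(original, SIDE, EXPECTED))
# Only the shifted cells that lie fully inside the study area are compared.
print("shifted grid  :", ratio_range(shifted, SIDE - 1, EXPECTED))
print("2 x 2 combined:", ratio_range(combined, SIDE // 2, 4 * EXPECTED))
```

The points never change between the three runs; only the zoning does, which is why differences in the printed ratios are attributable to grid placement and scale alone.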


Aggregated Count or Rate Maps

• Merge small areas with neighboring areas to provide more stable rates of disease and/or protect confidentiality.

– Aggregation can be done manually.
– Existing automated tools were difficult to use.


Original ZIP Codes

3 Years Low Birth Weight Incidence Ratios


Aggregated to 250 Births per ZIP Code Group


Goal

• Aggregate small areas into larger ones.

• User decides how much aggregation is needed.
– Based on cases and/or underlying population
– Example: 250 births and at least 3 low birth weight births

• Works with various levels of geography.
– Census blocks, tracts, towns, ZIP codes, etc.
– Can nest one level of geography in another

• Uses free, open source software.

• Can output results for use in mapping programs.

NYSDOH Geographic Aggregation Tool


Aggregation Tool

Original Block Data †

Block          Cases
122300/2004    2
122300/2005    11
014500/3005    2
014500/3007    3
014500/3008    8
014500/3009    3
014500/3010    4
103202/2001    9
103202/2002    6

SAS or R Tool

Block          Region   Cases
122300/2004    A        2
122300/2005    A        11
014500/3005    B        2
014500/3007    B        3
014500/3008    B        8
014500/3009    B        3
014500/3010    B        4
103202/2001    C        9
103202/2002    C        6

Regions

Region   Cases
A        13
B        20
C        14


The Aggregation Process: goals

• Should form a large number of areas

• The areas should be reasonably compact

• The areas have minimum values as defined by the user.


The Aggregation Process: method

• Pair-wise merges

• Merge until the areas have minimum values.
– Cases and/or population
– Expected numbers.


The Pairwise Merge: 1st area

• Select those areas which require merging to meet minimum values. Example: 3 low birth weight babies, 250 births

• Of those, select the area whose value is the highest percentage of the minimum value to merge first.
– Low birth weight counts: 20 > 3 and 8 > 3, so these numbers are not used
– Total births: 244/250 > 85/250, so the area with 244 births is merged first
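The selection rule for the first area can be sketched as follows. This is a minimal illustration, not the GAT source code: the function name `pick_first_area`, the dictionary layout, and the area names are hypothetical. The slide does not say how to rank an area that falls short on more than one variable, so taking the largest share among the unmet variables is an assumption.

```python
def pick_first_area(areas, minimums):
    """Return the id of the area to merge first, or None if every area meets the minimums.

    areas:    {area_id: {variable: value}}
    minimums: {variable: minimum value}
    """
    best_id, best_share = None, -1.0
    for area_id, values in areas.items():
        # Variables that already meet their minimum are not used for ranking.
        shares = [values[v] / minimums[v] for v in minimums if values[v] < minimums[v]]
        if not shares:
            continue  # this area needs no merging
        share = max(shares)  # assumption: rank by the largest unmet share
        if share > best_share:
            best_id, best_share = area_id, share
    return best_id

minimums = {"births": 250, "lbw": 3}
areas = {
    "zip_a": {"births": 244, "lbw": 20},  # 244/250; LBW count 20 > 3 is not used
    "zip_b": {"births": 85, "lbw": 8},    # 85/250;  LBW count 8 > 3 is not used
    "zip_c": {"births": 900, "lbw": 60},  # already meets both minimums
}
print(pick_first_area(areas, minimums))  # zip_a
```

Starting with the area closest to its minimum means each merge is likely to finish a region quickly, which is consistent with the stated goal of forming a large number of areas.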


The Pairwise merge: 2nd area

• Find the adjacent neighbors of the selected area

• If a boundary variable is used, select those neighbors that are within the boundary variable


The Pairwise Merge

• If there are no adjacent neighbors, choose the closest area (according to distance between centroids)


Water


The Pairwise merge: two methods to choose 2nd area

• Choose the area whose centroid is closest to the first area

• Choose the area which has the smallest ratio of the aggregation variable to the minimum value.– e.g. 85/250
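The choice of the second area, including the fallback for areas with no adjacent neighbors, might look like the sketch below. It is illustrative rather than the tool's actual code: `pick_second_area`, the `method` labels "closest" and "least", and the sample data are hypothetical, and the optional boundary variable is left out for brevity.

```python
import math

def pick_second_area(first, areas, neighbors, minimum, method="closest"):
    """Choose the merge partner for `first`.

    areas:     {area_id: {"centroid": (x, y), "value": number}}
    neighbors: {area_id: set of adjacent area ids}
    method:    "closest" (nearest centroid) or "least" (smallest value / minimum)
    """
    def distance(other):
        return math.dist(areas[first]["centroid"], areas[other]["centroid"])

    candidates = sorted(neighbors.get(first, set()))
    if not candidates:
        # No adjacent neighbors (for example an island): use the nearest centroid overall.
        return min((a for a in areas if a != first), key=distance)
    if method == "closest":
        return min(candidates, key=distance)
    return min(candidates, key=lambda a: areas[a]["value"] / minimum)

areas = {
    "a": {"centroid": (0, 0), "value": 244},
    "b": {"centroid": (1, 0), "value": 300},
    "c": {"centroid": (0, 3), "value": 85},
    "island": {"centroid": (10, 10), "value": 40},
}
neighbors = {"a": {"b", "c"}, "b": {"a"}, "c": {"a"}, "island": set()}

print(pick_second_area("a", areas, neighbors, 250, "closest"))  # b
print(pick_second_area("a", areas, neighbors, 250, "least"))    # c
print(pick_second_area("island", areas, neighbors, 250))        # c
```

The two methods pull in different directions: nearest centroid favors compact regions, while smallest ratio uses up the neediest neighbor first.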


9 Cases

98 Population

† Simulated data


New York State Descriptive Statistics
Year 2000 populated census blocks

                                               New Regions: Level of Aggregation
Statistic (calculated using    Original
populated regions only)        Census Blocks   6 cases   12 cases   24 cases
Number of regions              225,167         39,748    21,525     11,381
Median population              39              385       770        1,467
Median number of cases         1               10        20         38
Median number of blocks        1               4         7          14

NYS number of cases (5 yrs): 470,000
NYS population 2000: 18,976,457

Note: The range in the census block populations is 0 - 23,373 persons.


Performance Measures

• Compactness

• Homogeneity with respect to demographic factors (measured as index of dissimilarity)

• Similar population sizes.

• Number of aggregated areas.

• Aggregated zones are completely contained within larger areas.
– e.g. block aggregation areas contained within tracts


Index of dissimilarity

The percentage of one group that would have to move to a different area in order to have an even distribution.

bi = the minority population of the ith area, e.g. census tract

B = the total minority population of the large geographic entity for which the index of dissimilarity is being calculated.

wi = the non-minority population of the ith area

W = the total non-minority population of the large geographic entity for which the index of dissimilarity is being calculated.

Wikipedia
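With the symbols defined above, the index of dissimilarity in its standard form is D = ½ Σ_i |b_i/B − w_i/W|, which ranges from 0 (even distribution) to 1 (complete separation); multiplying by 100 gives the percentage. A minimal sketch of the calculation follows, with a hypothetical function name and made-up tract counts.

```python
def index_of_dissimilarity(minority, non_minority):
    """D = 0.5 * sum(|b_i/B - w_i/W|) over the small areas of one larger entity.

    minority[i]     = b_i, minority population of area i
    non_minority[i] = w_i, non-minority population of area i
    """
    B = sum(minority)       # total minority population
    W = sum(non_minority)   # total non-minority population
    return 0.5 * sum(abs(b / B - w / W) for b, w in zip(minority, non_minority))

# Two tracts with the same minority share in each: evenly distributed.
print(index_of_dissimilarity([10, 20], [100, 200]))  # 0.0
# Two tracts with no overlap between the groups: completely separated.
print(index_of_dissimilarity([50, 0], [0, 80]))      # 1.0
```

When used as a performance measure for aggregation, a lower value within each aggregated area indicates that the merged units are more homogeneous with respect to that demographic factor.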


Follow-up Issues

Scale and method of aggregation will impact the map & correlation coefficients.

Modifiable areal unit problem

Counties

Aggregation Areas

ZIP Codes


Compactness


GAT Outputs KML Files


What is R?

• A programming language

• A software environment

• Similar to S or S-plus

• Can do statistical computing

• Has graphics capabilities


Why R?

• It’s free

• Widely used and accepted

• Works on Windows, macOS, and Unix platforms

• Many user-developed packages that add functionality

• Can run script files


Viewing the Results

• ArcGIS

• MapInfo

• Google Maps

• Google Earth


Lab Exercise

We will be trying out a beta version of GAT in the lab today.

5 years of simulated low birth weight data, NY State.

2003 ZIP Code Scale.

Socio-economic variables for race, poverty and education.

Detailed instructions for running the GAT program are provided in the “GAT v12 Manual”, which is in the GAT R directory.

You will run the “GAT vR12” batch program to aggregate ZIP codes into larger regions (see next slide).


The Geographic Aggregation Tool is available on the Talbot web site.


Spatial Aggregation Homework Assignment

Use the Geographic Aggregation Tool to aggregate ZIP Codes from the testdata ZIP Code Shape File. Each aggregated area should have a minimum of 250 births (simbir0105) and 3 low birth weight births (Simlbw0105).

1. Make a thematic map of percent of low birth weight births for the original unaggregated ZIP codes. Make a second thematic map of percent of low birth weight births for your new aggregated boundaries. Each map should have at least 5 classes (categories). Use the same class breaks and colors for both maps. Make sure you include a legend and title on each map. Use ArcGIS or Indie Mapper to make the thematic maps. Attach a copy of your aggregation log file to your lab write-up along with the two thematic maps.

2. Open the aggregated boundary in Google Earth. Use the print screen feature in Windows to show that the file successfully opened in Google Earth. Add the screen shot to your write-up.

3. Provide at least one suggestion on how the GAT-R program could be made more user friendly.

4. Provide at least one suggestion of how the User Guide could be made more useful or easier to understand.

5. Provide at least one suggestion of additional features that could be added to the program.

Lab due May 2, 2013


GeoMasking Tool

Randomly moves points within a user-defined area.
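One simple way to move a point randomly is sketched below. The slide does not describe how the GeoMasking Tool defines its area, so this example assumes a circle of radius `max_distance` around the original location; the function name `mask_point` and the coordinates are hypothetical, and this is not the tool's implementation.

```python
import math
import random

def mask_point(x, y, max_distance, rng=random):
    """Move a point to a uniformly random location within `max_distance` of (x, y)."""
    angle = rng.uniform(0.0, 2.0 * math.pi)
    # The square root keeps displaced points uniform over the disc
    # instead of clustering them near the original location.
    radius = max_distance * math.sqrt(rng.random())
    return x + radius * math.cos(angle), y + radius * math.sin(angle)

rng = random.Random(0)                   # seeded generator so the run is repeatable
original = (500000.0, 4700000.0)         # made-up projected coordinates in metres
masked = mask_point(*original, max_distance=250.0, rng=rng)
print(math.dist(original, masked) <= 250.0)  # True
```

Coordinates should be in a projected system with linear units so that `max_distance` has a consistent meaning in every direction.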


The End