Harvard ACE Project 2:
Air Pollutant Mixtures in Eastern Massachusetts: Spatial Multi-resolution Analysis of Trends, Effects of Modifiable
Factors, Climate, and Particle-induced Mortality
Brent Coull, Petros Koutrakis, Joel Schwartz, Itai Kloog, Antonella Zanobetti, Joseph Antonelli, Ander Wilson, Jeremiah Zhu,
Weeberb Requia, Choong-Min Kang, Marianthi-Anna Kioumourtzoglou, Ce Li, Glen McGee
May 30, 2018
Harvard ACE Project 2 May 30, 2018 1 / 34
Objectives
1 Decompose high-resolution PM2.5 mass and ground air temperature data into regional, sub-regional, and local spatial scales.
2 Conduct a spatiotemporal analysis of sub-regional and local variation in PM2.5 mass and ground air temperature, and local PM2.5 emissions.
3 Conduct spatial multi-resolution analysis of PM2.5 mixtures.
4 Conduct air pollution mortality and birth outcome studies in Massachusetts using multi-resolution PM2.5 mass and species data.
Spatio-temporal Models for Black Carbon
Prediction Models for Black Carbon: Support Vector Machines
Abu Awad et al., Environmental International 2017.
Spatio-temporal Models for Black Carbon
Spatio-temporal Modeling of PM2.5 Elements from XRF
Spatio-temporal Modeling of PM2.5 Elements
Name  Type         Sites  Time               Samples  Duration
HSPH  Outdoor      1      03/2006 - 12/2010  1755     Daily
MAD   EPA Outdoor  6      03/2006 - 12/2010  796      Daily
MPO   PPG Outdoor  53     03/2006 - 10/2008  444      Daily
MPI   PPG Indoor   68     10/2006 - 07/2010  341      Weekly
MNI   NAS Indoor   270    07/2010 - 12/2010  321      Weekly
Elements Modeled:
K, Ca, Fe, Zn, Cu, Ti, Al, Pb, V, Ni
Spatio-temporal Modeling of PM2.5 Elements
Outcomes considered
1 Element Concentration
2 Element Concentration / PM2.5
3 PCA Component Scores
Modeling frameworks
1 Spatio-temporal Generalized Additive Models (GAMs)
2 Random forests
3 Support Vector Machines
4 GAMs + Gradient Boosting
Spatio-temporal GAMs + Gradient Boosting
Validation R2, by element

Data  I-95 loop  K     Fe    Cu
MAD   out        0.42  0.58  0.47
      in         0.48  0.60  0.50
      overall    0.42  0.58  0.46
MNI   out        0.46  0.52  0.44
      in         0.47  0.54  0.49
      overall    0.45  0.53  0.44
MPI   out        0.46  0.61  0.49
      in         0.49  0.63  0.51
      overall    0.46  0.61  0.49
MPO   out        0.44  0.62  0.42
      in         0.47  0.64  0.47
      overall    0.44  0.62  0.42
Temporal Distribution of XRF Data, 2006-2012
(a) MAD (b) MPI
(c) MPO (d) MNI
Spatio-temporal Modeling of PM2.5 Elements from XRF
Metal concentrations (Red: < 3∗Uncertainty)
Spatio-temporal Modeling of PM2.5 Elements: Next Steps
MADEP data to be added:
1 7 new locations
2 2002-2009
3 6500 samples (will triple existing data)
Ensemble (averaging) of different models
Model Ensembling: Accounting for Model Uncertainty
Spatial distribution of PM2.5 model CV errors from two exposure models
(a) IK (b) QD
Model Ensembling: Accounting for Model Uncertainty
Estimated Effect Heterogeneity Using Three Exposure Models
[Figure: hazard ratio per 5 µg/m3 (y-axis, 1.00 to 1.30) for ME and Q1 to Q4 (x-axis), shown for each of the three exposure models: AV, IK, QD]
Bayesian Spatially-Varying Model Ensembles
$y(x)$: Pollution Spatial Process
$\{y_k\}_{k=1}^{K}$: Predictions from $K$ base models
We seek model ensemble estimates of the form
$$y_{ens}(x) = \sum_{k=1}^{K} w_k(x)\, y_k(x)$$
where spatially varying weight functions are modeled as
$$w_k(x) = \frac{\exp\{w'_k(x)\}}{\sum_{j=1}^{K} \exp\{w'_j(x)\}}, \qquad w'_k(x) \sim GP\left[0,\; k_{w,k}(x, x' \mid l)\right]$$
We obtain a posterior predictive distribution of $y$ that incorporates both
model-specific prediction error
model-to-model uncertainty
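The ensemble formula above can be illustrated with a minimal sketch. It only shows the softmax step that turns latent values into weights and the weighted sum of base-model predictions. The latent values are fixed, hypothetical stand-ins for Gaussian process draws; no GP sampling or posterior inference is attempted, and the function names are illustrative rather than taken from the project's software.

```python
import math

def softmax_weights(latent):
    """Turn latent values w'_k(x) at one location into weights w_k(x) that sum to 1."""
    shift = max(latent)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in latent]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_prediction(base_preds, latent):
    """Compute y_ens(x) = sum_k w_k(x) * y_k(x) at one location."""
    weights = softmax_weights(latent)
    return sum(w * y for w, y in zip(weights, base_preds))

# Hypothetical inputs: K = 3 base-model predictions at two locations.
base_preds_by_site = [[10.0, 12.0, 11.0], [8.0, 9.5, 7.5]]
# Fixed stand-ins for GP draws of w'_k(x); not sampled from any posterior.
latent_by_site = [[0.0, 0.0, 0.0], [2.0, -1.0, 0.5]]

ensemble = [ensemble_prediction(p, l)
            for p, l in zip(base_preds_by_site, latent_by_site)]
print(ensemble)
```

At the first location the latent values are equal, so the ensemble reduces to a simple average of the base models. At the second location the weights differ, which is the spatially varying behavior the slide describes.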
Bayesian Spatially-Varying Model Ensembles: Simulation
Simulation: Residual process from individual base models
Bayesian Spatially-Varying Model Ensembles: Simulation
Ensemble weight flexibility vs. Cross-validated RMSE
Critical Windows of Exposure for Children’s Health
Evidence supports associations between maternal exposure to air pollution during pregnancy and children’s health outcomes.
Recent interest focuses on critical windows of vulnerability.
We have estimated daily exposures from multiple pollutants:
NO2, OC, EC, Sulfate, O3, NH4
Recent work has shown distributed lag modeling (DLM) can outperform models based on trimester-averaged exposures (TAE).
No corresponding methods for air pollution mixtures.
Goal: Develop DLM methods for air pollution mixtures.
Distributed Lag Model for Single Exposure
Exposure $z_i(t)$ is time-specific (week of pregnancy)
$$Y_i = \alpha + \gamma \int z_i(t)\, w(t)\, dt + x_i^T \beta + \varepsilon_i$$
$w(t)$ identifies critical windows of vulnerability
$\gamma$ is the within-window effect
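A minimal sketch of how the weight function isolates a critical window, assuming the integral is approximated by a sum over weeks of pregnancy. The window location, exposure values, and coefficients are hypothetical, and the weights are treated as known here, whereas the method on the slide estimates them.

```python
def weighted_exposure(z, w):
    """Approximate the integral of z_i(t) * w(t) dt by a sum over weeks."""
    if len(z) != len(w):
        raise ValueError("exposure and weight series must have the same length")
    return sum(zt * wt for zt, wt in zip(z, w))

def dlm_mean(z, w, alpha, gamma, x, beta):
    """Mean of Y_i: alpha + gamma * weighted exposure + x_i^T beta."""
    covariate_part = sum(xj * bj for xj, bj in zip(x, beta))
    return alpha + gamma * weighted_exposure(z, w) + covariate_part

weeks = 37
# Hypothetical critical window: equal weight on weeks 20 to 24, zero elsewhere.
w = [0.2 if 19 <= t <= 23 else 0.0 for t in range(weeks)]

z_flat = [10.0] * weeks                                               # constant exposure
z_in = [10.0 + (5.0 if 19 <= t <= 23 else 0.0) for t in range(weeks)]  # spike inside window
z_out = [10.0 + (5.0 if t < 5 else 0.0) for t in range(weeks)]         # spike outside window

for z in (z_flat, z_in, z_out):
    print(dlm_mean(z, w, alpha=0.0, gamma=-2.0, x=[1.0], beta=[0.5]))
```

Only the spike inside the window changes the mean outcome. The spike in the first five weeks has the same size but gets zero weight, so it leaves the mean unchanged.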
Bayesian Kernel Machine Regression for Mixtures
1 $Y_i$ is a health endpoint, $x_i$ contains potential confounders.
2 $z_i = (z_{i1}, \ldots, z_{iM})^T$ are $M$ (univariate) pollutant concentrations.
$$Y_i = h(z_{i1}, \ldots, z_{iM}) + x_i^T \beta + \varepsilon_i, \quad i = 1, \ldots, n$$
$h(\cdot)$ is an unknown function
$\beta$: effects of the confounders; $\varepsilon_i \overset{iid}{\sim} N(0, \sigma^2)$
Bayesian Kernel Machine Regression - DLM
Ultimately interest focuses on $z_{mi}(t)$, $m = 1, \ldots, M$
$$Y_i = h(E_{1i}, \ldots, E_{Mi}) + x_i^T \beta + \varepsilon_i$$
$$E_{mi} = \int z_{mi}(t)\, w_m(t)\, dt$$
Model fitting estimates the critical windows $w_m(t)$ and the mixture effect $h(\cdot)$ simultaneously.
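The two ingredients of this model can be sketched under stated assumptions: each pollutant's weekly series is collapsed to a weighted exposure $E_{mi}$, and subjects are then compared through a kernel on those exposures. The Gaussian kernel is an assumption (a common choice in kernel machine regression; the slides do not state the kernel), the windows are fixed rather than estimated, and all data values are hypothetical.

```python
import math

def weighted_exposures(z_i, w):
    """E_mi = sum_t z_mi(t) * w_m(t) for each pollutant m of one subject."""
    return [sum(zt * wt for zt, wt in zip(z_m, w_m))
            for z_m, w_m in zip(z_i, w)]

def gaussian_kernel(e_a, e_b, rho=1.0):
    """Assumed kernel: exp(-||e_a - e_b||^2 / rho)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(e_a, e_b))
    return math.exp(-sq_dist / rho)

# Hypothetical data: 3 subjects, M = 2 pollutants, 4 weeks each.
z = [
    [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]],
    [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]],
    [[4.0, 3.0, 2.0, 1.0], [0.5, 0.5, 2.5, 0.5]],
]
# Hypothetical windows: pollutant 1 matters in weeks 1-2, pollutant 2 in week 3.
w = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

E = [weighted_exposures(z_i, w) for z_i in z]            # one row per subject
K = [[gaussian_kernel(e_a, e_b) for e_b in E] for e_a in E]
print(E)
print(K)
```

Subjects with identical weighted exposures get a kernel value of 1, so the unknown function $h(\cdot)$ is constrained to treat them alike. The third subject differs in both pollutants inside the windows and is nearly uncorrelated with the other two.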
ACCESS Prospective Birth Cohort
traditional LUR predictors to yield residence-specific estimates of daily PM2.5 as detailed previously (22). The model was run using day-specific calibrations of AOD data using ground PM2.5 measurements from 78 monitoring stations covering New England and LUR and meteorologic variables (temperature, wind speed, visibility, elevation, distance to major roads, percent open space, point emissions, and area emissions). This approach incorporates highly resolved spatial information from the LUR data and important spatiotemporal data from the remote sensing satellite data.
The AOD-PM2.5 relationship was calibrated for each day using data from grid cells with both monitor and AOD values using mixed models with random slopes for day, nested within region. For days without AOD data (because of cloud coverage, snow, and so forth), the model was fit with a smooth function of latitude and longitude and a random intercept for each cell (similar to universal Kriging). The “out of sample” 10-fold cross validation R2 for daily values was 0.83 and 0.81 for days with and without available AOD data, respectively. For use in the health effect models, to reduce potential noise caused by day-to-day PM2.5 variation, daily levels were averaged into weekly exposure profiles. Predicted overall prenatal PM2.5 levels at participant’s residence in relation to the 10 × 10 km grids for which AOD data were available are shown in Figure 1. Although levels were higher around major roadways as anticipated, there was reasonable heterogeneity.
Asthma
Maternal-reported clinician-diagnosed asthma was ascertained from birth up to age 6 years through telephone and face-to-face interviews at approximately 3-month intervals for the first 24 months then annually thereafter. Mothers were asked, “Has a doctor or nurse ever said that your child had asthma?” Most of these children were given a diagnosis of asthma after the age of 3 years (78.6%) (see Figure E1 in the online supplement).
Covariates
Maternal age, race, education, and prepregnancy height and weight, and child’s sex were ascertained by questionnaire; date of birth, gestational age, and birth weight were obtained by medical record review. A validation analysis on a subset of 121 ACCESS women showed no difference in the level of agreement/disagreement for height and weight when comparing values measured early in pregnancy (<10 wk) with self-report (34). Women were asked about smoking at enrollment and in the third trimester and classified as prenatal smokers if smoking at either visit. Mothers reported postnatal smoking and whether others smoked in the home at each postpartum interview. Household crowding was calculated by dividing the number of persons living in the home by the number of rooms based on maternal report in pregnancy. Maternal atopy was defined by self-reported doctor-diagnosed asthma, eczema, and/or hay fever. Body mass index was calculated by dividing weight by height squared (kg/m2); obesity was defined as body mass index greater than or equal to 30 kg/m2 (35).
Because prenatal stress may covary with pollution and has been associated with asthma (36), this was also considered as a confounder. We measured stress using the Crisis in Family Systems-Revised survey administered prenatally within 2 weeks of enrollment (37, 38). This survey assesses life events experienced across 11 domains (e.g., financial, relationships, violence, housing, discrimination/prejudice). Mothers endorsed events experienced in the past 6 months and rated each as positive, negative, or neutral. The number of domains with one or more negative event was summed to create a continuous negative life events (NLEs) domain score, with higher scores indicating greater stress. Because birth weight and gestational age may be on the pathway between prenatal PM and asthma risk, birth weight for gestational age z score (39) was considered in sensitivity analyses.
Statistical Analysis
Analyses included 736 singleton full-term (gestational age >37 wk) children with two or more postnatal interviews followed up to age 6 years and air pollution exposure
[Map legend: PM2.5 estimates over pregnancy (µg/m3) in four bins (8.46–9.98, 9.99–10.96, 10.97–11.84, 11.85–13.66); major roadways; 10 × 10 km prediction grid; scale bar 0 to 20 kilometers; north arrow]
Figure 1. Predicted daily particulate matter with a diameter less than or equal to 2.5 µm (PM2.5) levels for Asthma Coalition on Community, Environment and Social Stress participants averaged over pregnancy. This figure demonstrates predicted daily PM2.5 levels for study participants based on residence and averaged throughout the gestation period. The 10 × 10 km aerosol optical depth grid used to predict daily PM2.5 levels is also depicted.
ORIGINAL ARTICLE
1054 American Journal of Respiratory and Critical Care Medicine Volume 192 Number 9 | November 1 2015
Study participants (i): 191 Boston-area births between 8/2002 and 1/2007
Exposure (Zmit): NO2, OC, EC, S, O3, NH4 at maternal residence for each week (t) of pregnancy
Outcomes (Yi): FEV1 at age 8
Baseline covariates (Xi): child sex, maternal pre-pregnancy BMI, age, education, race/ethnicity, atopy, self-reported smoking during pregnancy, stress index, neighborhood disadvantage index
[figure source: Hsu et al. Am. J. Respir. Crit. Care Med. 2015]
BKMR using pregnancy-average exposure
BKMR-DLM: Exposure-response of each exposure at low/high levels of another
BDLM analyses of NH4, by low/high NO2, EC, Sulfate, O3
Delay of Onset and Exposure Error in Case-Crossover Designs
Time-stratified case-crossover studies are often used to estimate the effect of air pollution on acute events
For each event, create a set of reference times based on the same year, month, DOW, and hour
Delayed recording of event onset yields exposure error.
Can yield severe attenuation of effect estimates (Lokken et al. 2009).
Goal: Develop a method that corrects for this bias based on a validation sample of delay times.
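The reference-time rule above (same year, month, day of week, and hour as the event) can be sketched as follows. The event time and the 20-hour delay are hypothetical, and the function name is illustrative. The sketch only shows how referents are selected and how a recording delay shifts them; it does not implement any of the bias corrections.

```python
from datetime import datetime, timedelta

def reference_times(event_time):
    """Times in the same year and month sharing the event's weekday and hour,
    excluding the event time itself."""
    week = timedelta(days=7)
    t = event_time
    # Step back in whole weeks to the earliest matching day in the month.
    while (t - week).month == event_time.month:
        t -= week
    refs = []
    # Walk forward in whole weeks until the month ends.
    while t.month == event_time.month:
        if t != event_time:
            refs.append(t)
        t += week
    return refs

true_onset = datetime(2018, 5, 30, 14)        # hypothetical true event time
recorded = true_onset + timedelta(hours=20)   # hypothetical 20-hour recording delay

true_refs = reference_times(true_onset)
recorded_refs = reference_times(recorded)
print([r.day for r in true_refs])      # [2, 9, 16, 23]
print([r.day for r in recorded_refs])  # [3, 10, 17, 24]
```

With the delayed time, both the case exposure and every reference exposure are read from different days and a different hour, so the whole matched set moves rather than a single measurement.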
(A) Distribution of delay times and (B) Resulting Exposure Error in 24-hour PM2.5 in Boston Stroke Study
Subtle problem: Matched sets change with error.
None of the typical measurement error assumptions hold.
We developed the following error corrections:
Marginal likelihood estimator
Regression calibration estimator
Conditional score estimator
Each of these + second-stage parametric bootstrap
Comparative analysis of the Boston Stroke Study using true and delayed event times and measurement error corrections.
[Figure: effect estimates (x-axis, 0.00 to 0.03) by method (y-axis): True, Error, CS, RC, B.Error, B.CS, B.RC; ndelays = 300]
Spatial multi-resolution analysis: Conceptual Framework
Goal: Decompose daily pollution surfaces into different scales, which are representative of different sources of pollution
Examine health effects, predictors, and trends at different scales
Figure taken from HSPH class EH521 Notes (Annette Peters)
Applied to Satellite PM2.5 Predictions
Two-dimensional wavelet decomposition:
All panels are averaged over days in 2006
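A minimal sketch of one level of a two-dimensional wavelet decomposition on a gridded surface. The slides do not say which wavelet family was used; an unnormalized Haar transform on 2 × 2 blocks is assumed here because it is the simplest to write out, and the grid values are hypothetical. The coarse output plays the role of the larger-scale surface and the three detail outputs capture local variation.

```python
def haar_level(grid):
    """One level of an unnormalized 2D Haar transform on 2x2 blocks.

    Returns (approx, row_diff, col_diff, diag), each half the size of grid.
    Assumes an even number of rows and columns.
    """
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, len(grid), 2):
        out = ([], [], [], [])
        for c in range(0, len(grid[0]), 2):
            a, b = grid[r][c], grid[r][c + 1]
            e, f = grid[r + 1][c], grid[r + 1][c + 1]
            out[0].append((a + b + e + f) / 4)  # block mean (coarse scale)
            out[1].append((a + b - e - f) / 4)  # top row minus bottom row
            out[2].append((a - b + e - f) / 4)  # left column minus right column
            out[3].append((a - b - e + f) / 4)  # diagonal contrast
        approx.append(out[0])
        row_diff.append(out[1])
        col_diff.append(out[2])
        diag.append(out[3])
    return approx, row_diff, col_diff, diag

def inverse_haar_level(approx, row_diff, col_diff, diag):
    """Rebuild the original grid from the four components."""
    grid = []
    for i in range(len(approx)):
        top, bottom = [], []
        for j in range(len(approx[0])):
            m, h, v, d = approx[i][j], row_diff[i][j], col_diff[i][j], diag[i][j]
            top += [m + h + v + d, m + h - v - d]
            bottom += [m - h + v - d, m - h - v + d]
        grid += [top, bottom]
    return grid

# Hypothetical surface: smooth regional gradient plus one local hot spot (8.0).
surface = [
    [1.0, 1.0, 2.0, 2.0],
    [1.0, 1.0, 2.0, 2.0],
    [3.0, 3.0, 4.0, 8.0],
    [3.0, 3.0, 4.0, 4.0],
]
approx, row_diff, col_diff, diag = haar_level(surface)
print(approx)
print(row_diff, col_diff, diag)
```

The detail components are zero everywhere except in the block containing the hot spot, while the coarse component keeps the regional gradient. Applying the same step again to the coarse component would give the next, larger scale.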
Multi-resolution Work - Integration within Larger Center
Currently using wavelet decompositions for:
mortality and scale-specific PM2.5 in New England, 2000-2015
source-receptor mapping of power-plant emissions (Project 4)
emissions modeling (Project 1)
Reproducible Software
Project 2 Posters
1 Abu Awad et al. A spatio-temporal prediction model for Black Carbon based on ensemble machine learning.
2 Liu et al. Spatio-temporal modeling of ambient PM2.5 elemental concentrations in eastern Massachusetts.
3 Liu et al. Adaptive Bayesian spatio-temporal ensemble of air pollution predictive models.
4 Wilson et al. Distributed lag models for assessing critical windows of exposure to air pollution mixtures.
5 Coull et al. Corrections for measurement error due to delayed onset of illness for case-crossover designs.
Harvard ACE SAC Review 2017
“...we encourage the Center to also consider the potential for confounding and/or effect modification by the broader pollutant mixture...”
BC, XRF modeling
BKMR-DLM
“Each project should try to quantify the uncertainties, how they propagate, and identify which are the major uncertainties of concern.”
Model ensembling
Consider birth cohorts in addition to administrative records
ACCESS, VIVA
Multi-resolution analysis of PM2.5-mortality relationship
Data and computational infrastructure complete
Results being generated now