
Application of statistical downscaling for ozone air quality in the US

Abstract

Tropospheric ozone is an air pollutant that can impact both human health and agriculture. Health impact studies typically require ozone concentration fields at high spatial resolution (km scale), but global numerical models of air quality are generally run at much coarser resolution (hundreds of km scale) due to computational cost. Statistical downscaling is a method of overcoming this computational constraint. It involves developing statistical relationships between coarse resolution predictor variables and high resolution predictand variables. Here we apply a regularized multitask regression method for surface level ozone in the United States using model output from a coarse resolution chemical transport model and fine-scale observed ozone at EPA monitoring sites.

1. Introduction

Tropospheric ozone is an air pollutant that can impact both human health and agriculture. Its production is controlled by chemical reactions between hydrogen oxide radicals (HOx), nitrogen oxide radicals (NOx), and volatile organic compounds (VOCs) (Seinfeld & Pandis, 2006). NOx and VOCs have both natural and anthropogenic sources in the environment. Sources of NOx include automobile exhaust, lightning, and soils. Sources of VOCs include forests and industrial emissions. Ozone is regulated under the national ambient air quality standards (NAAQS) as a criteria pollutant. The current regulation sets a standard of 75 parts per billion (ppb) maximum daily 8-hour average (MDA8) ozone (EPA), proposed to decrease to 70 ppb.

Although large suites of data are available for ozone, models are necessary because they are able to provide global spatial fields that are continuous in time. Global numerical models of air quality, called chemical transport models (CTMs), typically have spatial resolutions of hundreds of kilometers due to computational resource constraints as well as the limited availability of high-resolution inputs. However, many applications of CTMs, such as health impact studies, typically require ozone concentration fields at high spatial resolution (km scale). In the air quality community, the common approach is dynamical downscaling, which involves using output from coarse-resolution CTMs as boundary conditions for running high-resolution CTMs over a limited region.

Preliminary work. Under review by the International Conference on Machine Learning (ICML). Do not distribute.

Statistical downscaling is a method of modeling fine-scale ozone without the computational constraints of numerical models. It involves developing statistical relationships between coarse resolution predictor variables and high resolution predictand variables. In the atmospheric sciences community, it was first applied to output from general circulation models (GCMs) for the purposes of studying climate-related variables (Wilby et al., 1998; Maurer & Hidalgo, 2008). Although many methods have been attempted, it is unclear whether any of them yield reliable results for operational use. Burger et al. (2011) compared five different statistical methods for climate models and found that all methods resulted in "moderate" reliability for predicting precipitation events. There was no evidence that more complex neural network based methods produced better results than simple linear regression methods.

Statistical downscaling is less commonly used for air quality purposes. Alkuwari et al. (2013) developed a statistical method for ozone air quality using fitted empirical orthogonal functions as the predictor variables. Berrocal et al. (2014) use extreme value theory to derive a model for the distribution of fourth-highest MDA8 ozone based on coarse-resolution CTM output and observed ozone at AQS monitoring sites. None of these methods have been incorporated into operational modeling tools used by the EPA for air quality assessments.

Here we attempt a novel method of statistical downscaling of ozone using multi-task regression. Rather than treating the regression at each monitoring site as an independent task, multi-task regression seeks to improve performance by treating the tasks as related. Multi-task learning methods based on group Lasso are described in Roth & Fischer (2008); Kim & Xing (2010); Xu et al. (2010); Gong et al. (2012), while a regularized method similar to ridge regression is described in Evgeniou & Pontil (2004).

2. Methods

2.1. Data

The model is trained and validated with data from the EPA Air Quality System (AQS). AQS contains ozone data collected from 1173 monitors across the United States, along with meteorological information at each of the sites. Data is collected hourly and aggregated into hourly, daily, and monthly data sets. These data sets are publicly available for download at https://aqs.epa.gov/aqsweb/documents/data_mart_welcome.html. We remove data points flagged with possible measurement issues or associated with exceptional events, as well as any days with incomplete measurements.

Figure 1 shows the MDA8 ozone observed at each monitoring station on August 31, 2013. AQS monitors represent a variety of conditions, ranging from rural to urban, offering a fairly representative sample of surface air quality in the US. We see that coastal sites typically have lower ozone, while parts of the south and California have higher ozone.

[Figure 1: map of AQS monitoring sites colored by observed MDA8 ozone. Axes: lon (−120 to −70), lat (20 to 50); color scale: MDA8 (20 to 80).]

Figure 1. Observed MDA8 ozone at AQS monitors on August 31, 2013.

We use 9 months of data (Jan 2013 - Sep 2013) and randomly assign 90% of the data at each site to the training set and 10% to the validation set, which we also refer to as the test set. This results in approximately 200 data points per monitoring site in the training set and approximately 20 data points per monitoring site in the test set. Mean MDA8 ozone over this time period is 43 ppb, with a standard deviation of 12 ppb and values ranging from 2 ppb - 12 ppb.
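The per-site random split described above can be sketched as follows. This is a minimal illustration rather than the code used for the paper; the function name `split_by_site`, the dictionary layout, and the fixed seed are assumptions made for the example.

```python
import random


def split_by_site(records, train_frac=0.9, seed=0):
    """Randomly split each site's daily records into training and validation sets.

    records: dict mapping a site id to a list of daily data points
    (hypothetical layout chosen for this sketch).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, valid = {}, {}
    for site, rows in records.items():
        shuffled = rows[:]          # copy so the input is left untouched
        rng.shuffle(shuffled)
        n_train = round(train_frac * len(shuffled))
        train[site] = shuffled[:n_train]
        valid[site] = shuffled[n_train:]
    return train, valid


# Toy example: two sites with 220 and 200 valid days respectively.
records = {"site_a": list(range(220)), "site_b": list(range(200))}
train, valid = split_by_site(records)
```

Splitting within each site, rather than over the pooled data, guarantees that every site contributes to both sets, which is needed to report an RMSE per site.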

2.2. Numerical model

The numerical model we use for providing coarse-scale ozone is the GEOS-Chem chemical transport model (CTM) (Bey et al., 2001), version 9-02 (http://wiki.seas.harvard.edu/geos-chem/index.php/GEOS-Chem_v9-02). GEOS-Chem has previously been used for a wide range of purposes ranging from studying regional air quality (Chen et al., 2009) to transcontinental pollution (Zhang et al., 2009). It is typically run at horizontal resolutions of 4° x 5° (≈ 400 km x 400 km) or 2° x 2.5° (≈ 200 km x 200 km) with 47 vertical levels. GEOS-Chem is driven by assimilated meteorology from the NASA GEOS-5 system (http://gmao.gsfc.nasa.gov/GEOS/), which is produced at a native resolution of 0.25° x 0.3125° (≈ 28 km x 28 km) and 72 vertical levels. GEOS-Chem models the chemical reactions between 196 chemical species (Parrella et al., 2012), using operator splitting to separately solve the equations of transport and chemistry. The timestep for the 4° x 5° simulation is 30 min for transport and 60 min for chemistry.

Figure 2 shows a snapshot of the modeled ozone from GEOS-Chem on August 31, 2013. As in the observations, there is regionally high ozone over the Southeast US, while coastal regions have lower ozone. However, unlike the observational sites, the coarse resolution model is unable to resolve the sometimes sharp gradients from one location to the next.

[Figure 2: image embedded from the poster "Application of Statistical Downscaling to Ozone Air Quality" (Karen Yu, CS281 Final Project). The panel shown is the map of modeled ozone from the GEOS-Chem CTM on a 4° x 5° grid.]

Figure 2. GEOS-Chem modeled ozone over North America.

In order to use GEOS-Chem output as predictor variables for regression, we sample the model hourly at all locations where there is an AQS monitoring site. Because the grid cells are so large, each grid cell may be sampled multiple times for different monitoring sites. The hourly samples allow us to compute MDA8 ozone from the model, which is directly comparable to the observed MDA8 ozone. The mean MDA8 ozone sampled at all AQS sites from GEOS-Chem is 43 ppb, with a standard deviation of 9 ppb and values ranging from 10 ppb - 80 ppb. From these aggregate statistics, we can see that GEOS-Chem produces less variability than the observations, but captures the region-wide mean fairly well.
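For illustration, MDA8 can be computed from one day of hourly values as the largest mean over any 8 consecutive hours. The sketch below is a simplification under stated assumptions: it only uses 8-hour windows that lie entirely within the 24 hourly values of one day, which may differ from the regulatory procedure, and the function name `mda8` is hypothetical.

```python
def mda8(hourly):
    """Maximum daily 8-hour average (ppb) from 24 hourly ozone values.

    Returns None for incomplete days, mirroring the removal of days
    with incomplete measurements. Only windows that start and end
    within the same day are considered (a simplifying assumption).
    """
    if len(hourly) != 24 or any(v is None for v in hourly):
        return None
    # 17 running windows: start hours 0 through 16.
    window_means = [sum(hourly[i:i + 8]) / 8.0 for i in range(24 - 8 + 1)]
    return max(window_means)


# A day with an 8-hour afternoon plateau at 60 ppb on a 20 ppb background.
plateau_day = [20] * 10 + [60] * 8 + [20] * 6
# A day with ozone rising by 1 ppb every hour.
ramp_day = list(range(24))

plateau_mda8 = mda8(plateau_day)
ramp_mda8 = mda8(ramp_day)
```

Applying the same function to the hourly model samples and to the hourly observations keeps the two quantities directly comparable.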

Figure 3. Scatter plot showing observed MDA8 ozone (x-axis) and coarse-scale modeled MDA8 ozone (y-axis). Only two months of data are plotted to reduce clutter.

Figure 3 shows a scatterplot of the observed and modeled MDA8 ozone for August and September 2013. As noted before, GEOS-Chem has much less variability compared to the observations. There is a large amount of scatter, indicating large errors at each individual site, despite the good agreement in the mean. We see that the model tends to underestimate the high end of the observations and overestimate the low end. The average RMSE of the GEOS-Chem model compared against the AQS measurements is 9.3 ppb averaged across all monitoring sites over the 9 month period. The largest RMSE is 24 ppb, while the lowest is 0.2 ppb.
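The per-site RMSE and its summary statistics can be computed as in the sketch below. The data layout (dictionaries keyed by site id) and the toy numbers are assumptions for the example and are not taken from the AQS or GEOS-Chem data.

```python
import math


def rmse(pred, obs):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def rmse_by_site(model, observed):
    """RMSE of modeled vs. observed MDA8 ozone for every site."""
    return {site: rmse(model[site], observed[site]) for site in observed}


# Toy values in ppb for two hypothetical sites.
model = {"a": [40.0, 50.0], "b": [30.0, 30.0]}
observed = {"a": [43.0, 46.0], "b": [30.0, 30.0]}

per_site = rmse_by_site(model, observed)
average_rmse = sum(per_site.values()) / len(per_site)
largest, lowest = max(per_site.values()), min(per_site.values())
```

Averaging the per-site values, as done here, weights every site equally regardless of how many days of data it has.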

2.3. Regularized multi-task regression

We apply the regularized multi-task learning method described in Evgeniou & Pontil (2004) as it is the most analogous to the ridge regression we use in the baseline. Here we describe this method in the context of our problem (Evgeniou & Pontil (2004) give the example of a classification problem while we are interested in regression), though we attempt to stick to the notation of Evgeniou & Pontil (2004) as closely as possible.

For a particular measurement station t, the regression problem we attempt to solve can be described as the following set of linear equations

$$\mathbf{y}_t = \mathbf{w}_t \cdot \mathbf{X}_t + \epsilon$$

where $\mathbf{y}_t$ is the observed MDA8 ozone at the site and $\mathbf{X}_t$ is the matrix of predictor variables. In its simplest form, this corresponds to the coarse-scale ozone and a bias variable. $\mathbf{w}_t$ is the vector of regression weights for that station. $\epsilon$ is a Gaussian random variable with mean zero and standard deviation $\sigma$.

Typically, for a set of T sites, each site is treated as independent of the others, with a separate $\mathbf{w}_t$ for each site. In multi-task regression, we can think of the regression weight vector for each site as

$$\mathbf{w}_t = \mathbf{w}_0 + \mathbf{v}_t$$

where the weight for each site, $\mathbf{w}_t$, is the sum of a common weight for all sites, $\mathbf{w}_0$, and $\mathbf{v}_t$, which measures how different each task is from the common weight. This amounts to solving the following minimization problem

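\begin{align}
\min_{\mathbf{w}_0, \mathbf{v}_t} \Big\{ J(\mathbf{w}_0, \mathbf{v}_t) = {} & \tag{1} \\
& \sum_{t=1}^{T} (\mathbf{y}_t - (\mathbf{w}_0 + \mathbf{v}_t)\mathbf{X}_t)^T \cdot (\mathbf{y}_t - (\mathbf{w}_0 + \mathbf{v}_t)\mathbf{X}_t) \tag{2} \\
& + \frac{\lambda_1}{T} \sum_{t=1}^{T} \|\mathbf{v}_t\|^2 + \lambda_2 \|\mathbf{w}_0\|^2 \Big\} \tag{3}
\end{align}

The last two terms are regularization terms for the individual and common regression weights, respectively, with regularization parameters $\lambda_1$ and $\lambda_2$. Because $\lambda_1$ penalizes the site-specific deviations $\mathbf{v}_t$, a large $\lambda_1 / \lambda_2$ ratio drives $\mathbf{v}_t$ toward zero and gives nearly the same regression weights to all the sites, while a small ratio shrinks $\mathbf{w}_0$ and makes the different sites effectively unrelated.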

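The multi-task problem can be written in a more computationally expedient form as

\begin{align}
\min_{\mathbf{w}_t} \Big\{ & \sum_{t=1}^{T} (\mathbf{y}_t - \mathbf{w}_t \mathbf{X}_t)^T \cdot (\mathbf{y}_t - \mathbf{w}_t \mathbf{X}_t) + \tag{4} \\
& \rho_1 \sum_{t=1}^{T} \|\mathbf{w}_t\|^2 + \rho_2 \sum_{t=1}^{T} \Big\| \mathbf{w}_t - \frac{1}{T} \sum_{s=1}^{T} \mathbf{w}_s \Big\|^2 \Big\} \tag{5}
\end{align}

where

$$\rho_1 = \frac{1}{T} \frac{\lambda_1 \lambda_2}{\lambda_1 + \lambda_2}$$

and

$$\rho_2 = \frac{1}{T} \frac{\lambda_1^2}{\lambda_1 + \lambda_2}$$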

This method can be extended to non-linear regression using Reproducing Kernel Hilbert Spaces.
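As a sanity check on the two formulations, the regularization terms of (1)-(3) and of (4)-(5) can be compared numerically for arbitrary site weights. The sketch below assumes that, for fixed site weights, the optimal common weight is the mean site weight scaled by λ1 / (λ1 + λ2), which follows from setting the derivative of the penalty with respect to the common weight to zero. All function names and the random test weights are illustrative, not part of the original method description.

```python
import random


def sq_norm(v):
    """Squared Euclidean norm of a vector given as a list."""
    return sum(x * x for x in v)


def column_mean(ws):
    """Mean weight vector across sites."""
    T = len(ws)
    return [sum(w[j] for w in ws) / T for j in range(len(ws[0]))]


def penalty_lambda_form(ws, lam1, lam2):
    """(lam1 / T) * sum ||v_t||^2 + lam2 * ||w_0||^2 at the optimal w_0."""
    T = len(ws)
    shrink = lam1 / (lam1 + lam2)            # assumed closed form for w_0
    w0 = [shrink * m for m in column_mean(ws)]
    vs = [[wj - w0j for wj, w0j in zip(w, w0)] for w in ws]
    return lam1 / T * sum(sq_norm(v) for v in vs) + lam2 * sq_norm(w0)


def penalty_rho_form(ws, lam1, lam2):
    """rho1 * sum ||w_t||^2 + rho2 * sum ||w_t - mean(w)||^2."""
    T = len(ws)
    rho1 = lam1 * lam2 / (T * (lam1 + lam2))
    rho2 = lam1 ** 2 / (T * (lam1 + lam2))
    mean = column_mean(ws)
    devs = [[wj - mj for wj, mj in zip(w, mean)] for w in ws]
    return rho1 * sum(sq_norm(w) for w in ws) + rho2 * sum(sq_norm(d) for d in devs)


rng = random.Random(0)
ws = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(5)]  # 5 sites, 2 weights each
gap = abs(penalty_lambda_form(ws, 0.3, 2.0) - penalty_rho_form(ws, 0.3, 2.0))
```

The two penalties agree to floating-point precision, so minimizing over the site weights alone is enough and the common weight never has to be stored explicitly.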


3. Results

3.1. Baseline

For the baseline, we performed linear regression using the coarse-scale GEOS-Chem ozone as the predictor variable along with a bias variable. We use ridge regression with regularization parameter $\lambda = 0.5$. This was chosen by testing a few different $\lambda$ values and choosing the one that produced regression weights with the lowest RMSEs. Each site was treated individually. Figure 4 shows the RMSE between the downscaled ozone and the observed ozone at each site for the test set.
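A minimal sketch of the per-site ridge fit is given below, solving the 2 x 2 normal equations for a bias and a single weight in closed form. It assumes that the bias is penalized together with the weight, which the text does not specify, and the function name `fit_site_ridge` and the toy data are hypothetical.

```python
def fit_site_ridge(model_o3, obs_o3, lam=0.5):
    """Ridge regression of observed on modeled ozone for one site.

    Solves (X^T X + lam * I) w = X^T y for features [1, modeled ozone].
    Penalizing the bias term is an assumption of this sketch.
    Returns (bias, weight).
    """
    n = len(model_o3)
    sx = sum(model_o3)
    sxx = sum(x * x for x in model_o3)
    sy = sum(obs_o3)
    sxy = sum(x * y for x, y in zip(model_o3, obs_o3))
    a11, a12, a22 = n + lam, sx, sxx + lam
    det = a11 * a22 - a12 * a12
    bias = (a22 * sy - a12 * sxy) / det
    weight = (a11 * sxy - a12 * sy) / det
    return bias, weight


def downscale(model_o3, bias, weight):
    """Apply the fitted site relationship to modeled ozone."""
    return [bias + weight * x for x in model_o3]


# Toy site where observed = 5 + 2 * modeled exactly (ppb).
x = [30.0, 40.0, 50.0, 60.0]
y = [65.0, 85.0, 105.0, 125.0]
bias0, weight0 = fit_site_ridge(x, y, lam=0.0)   # ordinary least squares limit
bias, weight = fit_site_ridge(x, y, lam=0.5)     # regularized fit
```

With `lam=0.0` the fit recovers the exact line, while `lam=0.5` shrinks the coefficients slightly; repeating the fit for every site gives the independent baseline.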

[Figure 4: map of RMSE at each test-set site for the baseline. Axes: lon (−120 to −70), lat (20 to 50); color scale: rmse (4 to 12).]

Figure 4. RMSE of downscaled ozone at each site for the test set.

Average RMSE on the test set is 8.8 ppb, which is only a minor improvement from not performing regression at all. The RMSEs across sites range from 3.4 ppb to 18 ppb. In terms of spatial patterns, there is no region that stands out, indicating that small-scale variabilities are more important than large-scale ones in this case.

3.2. Multi-task regression

We perform multi-task regression as described in Section 2.3. We use parameters $\rho_1 = \rho_2 = 0.0002$ to stay consistent with the ridge regression regularization parameter of the baseline. We minimize the objective with the L-BFGS-B method of scipy.optimize.minimize.
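Quasi-Newton methods such as L-BFGS-B generally benefit from an analytic gradient. As a worked step, and assuming $\mathbf{w}_t$ is treated as a row vector so that $\mathbf{y}_t - \mathbf{w}_t \mathbf{X}_t$ is the row vector of residuals at site t, the gradient of the objective in (4)-(5), written here as $J$, with respect to the weights of a single site is

$$\frac{\partial J}{\partial \mathbf{w}_t} = -2 (\mathbf{y}_t - \mathbf{w}_t \mathbf{X}_t) \mathbf{X}_t^T + 2 \rho_1 \mathbf{w}_t + 2 \rho_2 (\mathbf{w}_t - \bar{\mathbf{w}}), \qquad \bar{\mathbf{w}} = \frac{1}{T} \sum_{s=1}^{T} \mathbf{w}_s$$

The last term has this simple form because the deviations $\mathbf{w}_s - \bar{\mathbf{w}}$ sum to zero over the sites, so the dependence of $\bar{\mathbf{w}}$ on $\mathbf{w}_t$ contributes nothing. The symbol $\bar{\mathbf{w}}$ is introduced here for convenience and does not appear in the original notation.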

Figure 5 shows the RMSE at each monitoring site in the test set.

[Figure 5: map of RMSE at each test-set site for multi-task regression. Axes: lon (−120 to −70), lat (20 to 50); color scale: rmse (4 to 12).]

Figure 5. RMSE of downscaled ozone at each site for the test set using multitask regression.

The average RMSE on the test set is 8.7 ppb, which is only 0.1 ppb better than the baseline. Compared to the baseline, there is more improvement in the inland regions of the US, while the coastal regions do not visibly improve.

4. Discussion

Multi-task regression failed to significantly improve upon the results of the baseline, which itself is not much better than directly using the results of the numerical model. There are several possible reasons for this:

• We do not include a temporal component in the regression, so we essentially assume that the model gets the seasonal trend of ozone correct. If, for example, the model does a good job of simulating ozone in January but has high biases in August, then a regression weight that does not take this temporal component into account will apply a correction in January that worsens the model output.

• The output of the numerical model is not a good predictor of observed ozone at monitoring sites. Perhaps the premise of using coarse-resolution output from a numerical model to predict fine-scale ozone is incorrect. Given the amount of variability between monitoring sites that are in close proximity to each other, the average ozone over a 400 km x 400 km grid box actually offers very little information about what the ozone value should be at a particular site. This is especially true if a coarse grid cell includes both urban and rural settings. In fact, simply predicting ozone using observed meteorological variables yields lower RMSEs than using the modeled coarse-scale ozone.

• AQS monitoring stations are too heavily biased towards urban and coastal sites, which are known to be difficult for the numerical model to capture.

• It was incorrect to treat each monitoring site as a task. Rather, we could first cluster the sites and then treat each cluster as a task.
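The last idea in the list above, clustering sites and treating each cluster as a task, could look roughly like the sketch below. It is illustrative only: it assumes clustering on site coordinates with plain k-means and Euclidean distance in degrees, which is just one possible choice (clustering on land-use type or on ozone time series would be alternatives). The helper names and the synthetic coordinates are hypothetical.

```python
import numpy as np

def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    points = np.asarray(points, dtype=float)
    centers = [points[0]]
    for _ in range(1, k):
        # next center: the point farthest from all centers chosen so far
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[d2.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # final assignment against the final centers
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), centers

def group_sites_into_tasks(site_X, site_y, labels):
    """Stack the data of all sites in a cluster into one regression task."""
    tasks = {}
    for j in np.unique(labels):
        idx = np.flatnonzero(labels == j)
        X = np.vstack([site_X[i] for i in idx])
        y = np.concatenate([site_y[i] for i in idx])
        tasks[int(j)] = (X, y)
    return tasks

# Synthetic site coordinates (lon, lat): one western and one eastern group.
rng = np.random.default_rng(1)
west = rng.normal([-120.0, 37.0], 1.0, size=(10, 2))
east = rng.normal([-75.0, 40.0], 1.0, size=(10, 2))
coords = np.vstack([west, east])
labels, centers = kmeans(coords, k=2)

# Synthetic per-site data: 5 samples and 2 predictors per site.
site_X = [rng.normal(size=(5, 2)) for _ in range(len(coords))]
site_y = [rng.normal(size=5) for _ in range(len(coords))]
tasks = group_sites_into_tasks(site_X, site_y, labels)
print({j: X.shape for j, (X, y) in tasks.items()})
```

Each entry of `tasks` could then be passed to the multi-task regression in place of a single site, which reduces the number of tasks and gives each task more training samples.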

Additionally, using optimization to find the regression weights was much slower than ridge regression. The algorithm I chose to code up was relatively simple, and other algorithms could very well be more computationally efficient. Given these disadvantages, multi-task regression as implemented here does not seem to be a good method for this application.

Even though this project did not succeed in developing a robust method for downscaling ozone, it was useful to learn about statistical methods used in the field, and I now have a better understanding of why they aren't applied more frequently.

5. Extensions

We test the hypothesis that temporal variations are not correctly captured in GEOS-Chem by adding an additional predictor variable for the month in which the data point was taken. I modified the code to do this and attempted to run the optimization for the regression again, but did not have enough time to reach convergence. However, even by the 30th iteration, we reached about the same RMSE as the regular multi-task regression had by the time it converged, so it is reasonable to believe that this method would at least slightly improve the results of the regression.
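A minimal sketch of adding the month as a predictor is given below. The function name `add_month_feature` is hypothetical and the project code may encode the month differently. By default the month enters as a single numeric column, matching the description above. The `one_hot` option is an assumption added for illustration: a linear model on the raw month number can only fit a trend from January to December, whereas twelve indicator columns allow a separate offset per month.

```python
import numpy as np

def add_month_feature(X, months, one_hot=False):
    """Append month information to a design matrix.

    X: (n, d) array of predictors.
    months: length-n integers in 1..12.
    """
    X = np.asarray(X, dtype=float)
    months = np.asarray(months, dtype=int)
    if one_hot:
        M = np.zeros((len(months), 12))
        M[np.arange(len(months)), months - 1] = 1.0  # one indicator per month
        return np.hstack([X, M])
    return np.hstack([X, months[:, None].astype(float)])

# Three samples with two predictors each, taken in January, June, and December.
X_demo = np.ones((3, 2))
months_demo = [1, 6, 12]
X_numeric = add_month_feature(X_demo, months_demo)
X_onehot = add_month_feature(X_demo, months_demo, one_hot=True)
print(X_numeric.shape, X_onehot.shape)
```

If the one-hot form is combined with an intercept column, the indicator columns sum to the intercept, so the regularization term is what keeps the weights well defined.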

6. Future work

Given the disappointing results of the previous sections, there is clearly much room for improvement. Here we outline several ideas that may produce improved regression results.

• Extend the multi-task linear regression method to multi-task non-linear regression through the use of Reproducing Kernel Hilbert Spaces (RKHS). This idea was briefly discussed in Evgeniou & Pontil (2004).

• Use PCA to identify patterns that may be useful in feature selection. Alkuwari et al. (2013) used PCA to generate the predictor variables that went into their regression.

• Include additional features, such as meteorological variables like wind and temperature. Ozone is known to be correlated with temperature, so including temperature will likely improve the accuracy of the prediction. Including additional chemical species that are ozone precursors, such as NOx or VOCs, may also improve the regression, but these variables would come from the coarse-resolution model and may not necessarily help.

• Attempt a more efficient multi-task regression algorithm. Kim & Xing (2010) describe a tree-guided group lasso method for multi-task regression that may speed up the computation.
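As an illustration of the PCA idea in the list above, the sketch below projects a feature matrix onto its leading principal components using a singular value decomposition. It is a generic PCA on synthetic data, not the fitted empirical orthogonal function procedure of Alkuwari et al. (2013), and the name `pca_scores` is hypothetical.

```python
import numpy as np

def pca_scores(F, n_components):
    """Project centered features onto the leading principal components.

    Returns (scores, components, explained_variance_ratio).
    """
    F = np.asarray(F, dtype=float)
    centered = F - F.mean(axis=0)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    ratio = s ** 2 / np.sum(s ** 2)               # share of variance per component
    scores = U[:, :n_components] * s[:n_components]
    return scores, Vt[:n_components], ratio[:n_components]

# Synthetic features: 5 correlated columns driven by 2 latent patterns.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
F = latent @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(100, 5))

scores, components, ratio = pca_scores(F, n_components=2)
print("variance explained by 2 components:", ratio.sum())
```

The columns of `scores` are uncorrelated by construction, so they could replace a set of strongly correlated predictors (for example, ozone from neighboring grid cells) in the regression.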

If I had more time to work on this project, I would spend more time on exploratory data analysis to determine what the appropriate predictor variables should be. Due to time constraints, I implemented a machine learning algorithm that I had read about without fully considering what the best method for this particular problem would be.

7. Code

The code is available at: https://github.com/kyu0110/CS281

References

Alkuwari, F. A., Guillas, S., and Wang, Y. Statistical downscaling of an air quality model using fitted empirical orthogonal functions. Atmos. Env., 81:1–10, 2013.

Berrocal, V. J., Gelfand, A. E., and Holland, D. M. Assessing exceedance of ozone standards: a space-time downscaler for fourth highest ozone concentrations. Environmetrics, 25:279–291, 2014.

Bey, I., Jacob, D. J., Yantosca, R. M., Logan, J. A., Field, B., Fiore, A. M., Li, Q., Liu, H., Mickley, L. J., and Schultz, M. Global modeling of tropospheric chemistry with assimilated meteorology: Model description and evaluation. J. Geophys. Res., 106:23073–23096, 2001.

Burger, G., Murdock, T. Q., Werner, A. T., and Sobie, S. R. Downscaling extremes – an intercomparison of multiple statistical methods for present climate. Journal of Climate, 25:4367–4385, 2011.

Chen, D., Wang, Y. X., McElroy, M. B., He, K., Yantosca, R. M., and Sager, P. Le. Regional CO pollution in China simulated by the high-resolution nested-grid GEOS-Chem model. Atmos. Chem. Phys., 11:3825–3839, 2009.


EPA. National ambient air quality standards (NAAQS). URL http://www3.epa.gov/ttn/naaqs/criteria.html#3.

Evgeniou, T. and Pontil, M. Regularized multi-task learning. In Proceedings of the 10th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2004), pp. 109–117, Seattle, WA, 2004.

Gong, P., Ye, J., and Zhang, C. Robust multi-task feature learning. In Proceedings of the 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2012), pp. 895–903, Beijing, China, 2012.

Kim, S. and Xing, E. P. Tree-guided group lasso for multi-task regression with structured sparsity. In Proceedings of the 27th International Conference on Machine Learning (ICML 2010), Haifa, Israel, 2010.

Maurer, E. P. and Hidalgo, H. G. Utility of daily vs. monthly large-scale climate data: an intercomparison of two statistical downscaling methods. Hydrol. Earth Syst. Sci., 12:551–563, 2008.

Parrella, J. P., Jacob, D. J., Liang, Q., Zhang, Y., Mickley, L. J., Miller, B., Evans, M. J., Yang, X., Pyle, J. A., Theys, N., and Roozendael, M. Van. Tropospheric bromine chemistry: implications for present and pre-industrial ozone and mercury. Atmos. Chem. Phys., 12:1823–1832, 2012.

Roth, V. and Fischer, B. The group-lasso for generalized linear models: Uniqueness of solutions and efficient algorithms. In Proceedings of the 25th International Conference on Machine Learning (ICML 2008), pp. 848–855, Helsinki, Finland, 2008.

Seinfeld, J. H. and Pandis, S. N. Atmospheric Chemistry and Physics: From Air Pollution to Climate Change. John Wiley and Sons, Inc., Hoboken, NJ, 2006.

Wilby, R. L., Wigley, T. M. L., Conway, D., Jones, P. D., Hewitson, B. C., Main, J., and Wilks, D. S. Statistical downscaling of general circulation model output: A comparison of methods. Water Resources Research, 34(11):2995–3008, 1998.

Xu, Z., Jin, R., Yang, H., King, I., and Lyu, M. R. Simple and efficient multiple kernel learning by group lasso. In Proceedings of the 27th International Conference on Machine Learning (ICML 2010), Haifa, Israel, 2010.

Zhang, L., Jacob, D. J., Kopacz, M., Henze, D. K., and Jaffe, D. A. Intercontinental source attribution of ozone pollution at western US sites using an adjoint method. Geophys. Res. Lett., 36:L11810, 2009.