statistical approaches to combining models and observations · 2012. 4. 16. · (berliner & kim...
TRANSCRIPT
-
Statistical Approaches to Combining Models andObservations
Mark Berliner
Department of Statistics
The Ohio State University
SIAM UQ12, April 5, 2012
Outline
I. Bayesian Hierarchical Modeling
II. Examples: Glacial dynamics; Paleoclimate
III. Using Large-scale Models
IV. Example: Ocean Forecasting
1
-
I. Bayesian Hierarchical Modeling (BHM)
• Bayesian Analysis:(1) UQ via prob. modeling;
(2) update via Bayes’ Theorem;
(3) infer, predict & make decisions via prob. (“risk”) analysis
• Hier. Model: Sequence of conditional probability distributions(corresponding to a joint distribution)
• Framework for modeling:Observations Y Processes (State Variables) X Parameters θ
1. Data Model [Y | X, θ]
2. Prior Process Model [X | θ]
3. Prior Parameter Model [ θ ]
• Bayes’ Theorem: [X, θ | Y]
2
-
Strategies
1. Incorporating physical models
Physical-statistical modeling (Berliner 2003 JGR)
From “F=ma” to process model [ X |θ ]
2. Qualitative use of theory
(e.g., Pacific SST model (Berliner et al. 2000 J. Climate)
******************************************************
3. Incorporating large-scale computer model output
4. Combinations
5. Alternate Approaches
3
-
Ex) Glacial Dynamics (Berliner et al. 2008 J. Glaciol)
• Steady Flow of Glaciers and Ice Sheets
– Flow: gravity moderated by drag (base & sides) & stuff
– Simple models: flow from geometry
• Data
– Program for Arctic Climate Regional Assessments
– Radarsat Antarctic Mapping Project
∗ surface topography (laser altimetry)∗ basal topography (radar altimetry)∗ velocity data (interferometry)
4
-
North East Ice Stream, Greenland
5
-
Physical Modeling: Surface: s, Thickness: H, Velocity: u
• Basal Stress: τ = −ρ g H ds/dx (+ ”stuff”)
• Velocities: u = ub + b0 H τ n where ub = k τ p + (ρgH)−q
Our Model
• Random Basal Stress: τ = −ρ g H ds/dx + ηwhere η is a “corrector process” (“model error”)
• Random H: wavelet smoothing prior
• Random s: parameterized model from literature
• Random Velocities: u = ub + b H τ n + ewhere ub = kτ
p + (ρgH)−q or a constant (change-point)
b is unknown parameter, e is a noise process
• Process Model :[u | s, H, η] [η] [s, H]
6
-
7
-
8
-
9
-
Ex) Paleoclimate: Temperature Reconstruction(Brynjarsdóttir & Berliner 2010 Ann. Applied Stat)
• Use of proxies:
– Inverse problem:
proxy ≈ f(climate) → climate ≈ g(proxy)– Inverse probability problem:
[proxy data | climate] → [climate | proxy data]
• Boreholes: earth stores info about surface temp’s
– Inverse: borehole data ≈ f(surface temp’s)– Model:
∗ Heat equation∗ Infer boundary condition
10
-
De
pth
[m
]
Relative temperature [ºC]
60
05
00
40
03
00
20
01
00
0
●
●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
SRD−1
14.51●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
SRD−2
15.73●
●
●
●
●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
SRD−3
16.00●
●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
SRD−4
16.17●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
SRD−7
13.86●
●
●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
SRS−3
11.30●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
SRS−4
12.54●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
SRS−5
12.08●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
WSR−1
13.54
San Rafael Desert San Rafael Swell
1ºC
11
-
Page 1 of 1
5/19/2010http://farm4.static.flickr.com/3580/3330922136_4b2c40edec_b.jpg
Page 1 of 1
5/19/2010http://hellforleathermagazine.com/images/San_Rafael_Swell.jpg12
-
Data Model : ~Y | ~Tr, q ∼ N(~Tr + T0 ~1 + q ~R(k), σ2y I)
• ~Tr: reduced temperatures (literature: ~Tr = ~T−T0 ~1− q ~R(k))
• T0: reference surface temperature
• q: surface heat flow
• ~R: thermal resistances (∝ k−1, thermal conductivities, adjustedfor rock types, etc.)
Process Model :
• Heat equation applied to ~Tr• B.C.: Surface temp history ~Th (assumed piecewise constant)
• Easy to solve the heat equation~Tr | ~Th, q ∼ N(A ~Th, σ2 I)
~Th | q ∼ N(~0, σ2h I)Parameter Model : next
13
-
Spatial Hierarchy
• Nine boreholes: 5 in desert region, 4 in swell region
• Extend the hierarchy: for j = 1, . . . , 9
Data Model : ~Yj | ~Trj, qj ∼ N(~Trj + T0j ~1 + qj ~Rj(kj), σ2yj I)Process Model :
~Trj|~Thj, qj ∼ N(Aj ~Thj, σ2j I)
~Thj|qj ∼ N(~0, σ2hj I)Parameter Model
• ~Th1, . . . , ~Th5 conditionally independent N(~µd, γ2d I)
• ~Th6, . . . , ~Th9 conditionally independent N(~µs, γ2s I)
• ~µd & ~µs ∼ N(~µ0, γ20 I)
• q1, . . . , q5 conditionally independent N(νd, η2d)q6, . . . , q9 conditionally independent N(νs, η
2s )
• etc.
14
-
● ● ●●
● ●●
●
●
●
●
1600 1700 1800 1900 2000
−2−1
01
2
Year
Anom
aly [º
C]SRD−1, SR Desert
● ● ● ●●
●●
●
● ●
●
1600 1700 1800 1900 2000
−2−1
01
2
Year
Anom
aly [º
C]
SRD−2, SR Desert
● ● ●●
●●
●●
●●
●
1600 1700 1800 1900 2000
−2−1
01
2
Year
Anom
aly [º
C]
SRD−3, SR Desert
● ● ●●
●●
● ●
●●
●
1600 1700 1800 1900 2000
−2−1
01
2
Year
Anom
aly [º
C]
SRD−4, SR Desert
● ●●
●
● ●
●
●
●
● ●
1600 1700 1800 1900 2000
−2−1
01
2
Year
Anom
aly [º
C]
SRD−7, SR Desert
● ●● ●
●
●●
●
● ●
●
1600 1700 1800 1900 2000
−2−1
01
2
Year
Anom
aly [º
C]
SRS−3, SR Swell
●●
●
●●
●●
●
●
●
●
1600 1700 1800 1900 2000
−2−1
01
2
Year
Anom
aly [º
C]
SRS−4, SR Swell
●
●
●
●
●
●
●
●●
●
●
1600 1700 1800 1900 2000
−2−1
01
2
Year
Anom
aly [º
C]
SRS−5, SR Swell
●
●
●
●
●
●● ● ●
●
●
1600 1700 1800 1900 2000
−2−1
01
2
Year
Anom
aly [º
C]
WSR−1, SR Swell
15
-
● ● ●●
●●
●
●
●
●
●
1600 1700 1800 1900 2000
−1.5
−0.5
0.5
1.5
Year
Anom
aly
[ºC]
µD
● ●●
●●
● ●
●
●
●
●
1600 1700 1800 1900 2000
−1.5
−0.5
0.5
1.5
Year
Anom
aly
[ºC]
µS
16
-
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
SRD−1, SR Desert
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
SRD−2, SR Desert
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
SRD−3, SR Desert
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
SRD−4, SR Desert
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
SRD−7, SR Desert
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
SRS−3, SR Swell
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
SRS−4, SR Swell
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
SRS−5, SR Swell
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
WSR−1, SR Swell
Multi−site Single−site
17
-
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
SRD−2, SR Desert
Multi−site Single−site
1600 1700 1800 1900 2000
−2
−1
01
2
Year
An
om
aly
[ºC
]
SRS−5, SR Swell
Multi−site Single−site
18
-
III. Incorporating Large-Scale Computer Models
Ensembles ~O = O1 = M(T1), . . . , On = M(Tn)
(T’s include “controls, model names, etc.)
• Data Model Treat ~O as ”observations”
– [Y, ~O | X, θ] (include ”bias, offset, model error ..”)– Convenient for design of collection of Y, ~O
• Process Model Use ~O to develop [X | θ]
– Kernel density estimate
Σi αik(x | Oi)
– Gaussian process models; emulators; UQ
Model Output: ~O = (O1 = M(T1), . . . , On = M(Tn))
[O | θ, ~O]→ [X | θ]
• Parameter Model from model output(eg: Berliner, Levine, & Shea 2003 J. Climate)
• Combinations
19
-
A Bayesian Multi-model Climate Projection(Berliner & Kim 2008 J. Climate)
• Themes
– “climate” = parameters of prob. dist. of “weather”
– build or “parameterize” scales into dynamic model for X
• Future climate depends on future, but unknown, inputs.
• IPCC: construct plausible future inputs, ”SRES Scenarios”(CO2 etc.)
• Assume a scenario and find corresponding projection
20
-
Hemispheric Monthly Surface Temperatures (X)
• Observations (Y) for 1882-2001.
• Two models (~O): PCM (n=4), CCSM (n=1) for 2002-2197
• 3 SRES scenarios (B1,A1B,A2).
21
-
Hierarchical Data Model for Model Output
• Scalar climate variable X; m = 1, . . . , M models (time fixed)
• ~Om: ensemble of size nm of estimates of X from model m.
1. Given means µm and variances σ2Ym
, m = 1, . . . , M;~Om are independent and
~Om | µm ∼ Gau(µm~1nm, σ2Ym Inm)
2. Given β, model biases bm and variances σ2µm;
µm are independent and
µm | β, bm ∼ Gau(β + bm, σ2µm)
3. Given X,
β | X ∼ Gau(X, σ2β) and bm | X ∼ Gau(b0m, σ2bm
)
22
-
Remarks
• Implied marginal dist.: “~O given X”:Integrating out β induces dependence within & across ensembles
• Modify intuition about value of increasing ensemble size:“Infinite” ensembles do not give perfect forecasts:
If all biases = 0, infinite ensembles give the value of β, not X
• Extensions to different model classes (more β’s)and richer models are feasible.
23
-
Model Overview
1. [Y | X, θ]: measurement error modelGaussian with mean = true temp.
& unknown variance (with a change-point)
2. [X | θ]: Time series models with time varying parameters
Xt = µi(t) +
ηnj(t) 00 ηsj(t)
(Xt−1 − µi(t−1)) + et3. [θ] Climate = parameters of distribution of weather
• Climate-weather: multiscale phenomena• Time evolution: µi(t) slow; ηj(t) moderate;
et fast, but variances of et are slow
• µi = A + B CO2i + noise• Obs period: ηj = C + D |SOI|j + noise
Fore. period: AR model (i.e., SOI not observed)
• Variances of et: AR-like prior
24
-
1900 1950 2000 2050 210014
14.5
15
15.5
16
16.5
17
17.5
18
year
mea
n
1900 1950 2000 2050 210014
14.5
15
15.5
16
16.5
17
17.5
18
18.5
year
annu
al te
mpe
ratu
re
25
-
1900 1950 2000 2050 210012.5
13
13.5
14
14.5
15
15.5
year
mea
n
1900 1950 2000 2050 210012.5
13
13.5
14
14.5
15
15.5
16
year
annu
al te
mpe
ratu
re
26
-
1900 1950 2000 2050 21000
0.02
0.04
0.06
0.08
0.1
0.12
0.14
year
syst
em
variance
27
-
Model Adapted to Decadal Forecasting
(Kim & Berliner 2012)
1994 1996 1998 2000 2002 200413
13.5
14
14.5
15
15.5
16
Year
Tem
pera
ture
(Cel
cius
)
1994 1996 1998 2000 2002 200413
13.5
14
14.5
15
15.5
16
Year
Tem
pera
ture
(Cel
cius
)
28
-
IV. Combining Approaches:Mediterranean Ocean Forecasting
1. Winds as a boundary condition for the ocean surface
(Milliff et al. & Bonazzi et al. 2011 Quart. J. Roy. Met. Soc.)
2. Bayesian Multi-model Ensembling for Ocean Forecasting
(Berliner et al. 2012)
Observations
Initial – Boundary conditions
BHMWinds
Model 1
Model 2
BHM Oceans
BHM Ocean Post. Dist.
29
-
Bayesian Hierarchical Models to Augment The Mediterranean Forecast System (MFS)
Ralph Milliff CoRA Chris Wikle Univ. Missouri Mark Berliner Ohio State Univ.
Nadia Pinardi INGV (I'Istituto Nazionale di Geofisica e Vulcanologia) Univ. Bologna (MFS Director) Alessandro Bonazzi, Srdjan Dobricic INGV, Univ. Bologna
30
-
Overview:
• MFS: deterministic operational forecasting
system
• Boundary condition/forcing: Surface Winds
• Bayesian Hierarchical Model (BHM) to
quantify Surface Vector Wind (SVW)
distributions
• Ocean Ensemble Forecasting (OEF) using 10
member BHM-SVW ensembles
Key: Exploit abundant, “good” satellite wind
data combined with physical modeling
31
-
Building
wind dist. (BHM-SVW)
1. Data Stage:
Satellite
(QSCAT)
and
Numerical
Weather
Pred.
Analyses
(ECMWF)
QSCAT
ECMWF
32
-
2. Process Model: Rayleigh Friction Model
(Linear Planetary Boundary Layer Equations)
Theory (neglect second
order time
derivative)
discretize:
Our
model
33
-
BHM Ensemble Winds
10 m/s
10 members selected from the Posterior Distribution (blue)
Ensemble mean wind (green); ECMWF Analysis wind (red)
34
-
BHM-SVW-OEF initial condition spread:
Uncertainty is
concentrated at
mesoscales
Sea Surface Temperature
Sea Surface Height
Initial condition spread
Initial condition spread
35
-
Sea Surface Height
BHM-SVW-OEF 10 day forecast spread
Initial condition ensemble
spread has
amplified at the 10 day fcst
in mesoscales
Initial condition spread
10-th fcst day spread
36
-
Th
e f
ore
cast
sp
read
at
10 d
ays
EEPS forced ensemble BHM-SVW ensemble
ECMWF EPS forcing
Ineffective at
producing flow
field changes
at mesoscales
37
-
Discussion:
• BHM methods produce realistic distributions of surface winds (SVW)
(Milliff et al. 2011 J Roy Met Soc)
• BHM-SVW results used to in a new ocean ensemble forecasting method:
BHM-SVW-OEF
(Bonazzi et al 2011 J Roy Met Soc)
• The BHM-SVW-OEF produces 10-day-forecast spreads at mesoscales and in the upper thermocline
38
-
2. Bayesian Multi-model Ensembling forOcean Forecasting (Berliner et al. 2012)
• profiles of wintertime (60 days) of temperature T(z,t) (& salinity)(Rhodes Gyre region in Eastern Mediterranean Sea)
• 16 vertical levels z = 0 m to z = 300 m
• BHM Winds produce ensembles from two models:1) Ocean PArallelise (OPA)
2) Nucleus for European Modeling of the Ocean (NEMO)
• Model as in climate example: model-specific biases
• Prior: Analyzed fields
39
-
40
-
41
-
42