statistics, data, and deterministic models nrcse

28
Statistics, data, and deterministic models NRCSE

Post on 21-Dec-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Statistics, data, and deterministic models NRCSE

Statistics, data, and deterministic models

NRCSE

Page 2: Statistics, data, and deterministic models NRCSE

Some issues in model assessment

Spatiotemporal misalignmentGrid boxes vs observations

Types of errorMeasurement error and bias

Model error

Approximation error

Manipulate data or model output?

Two case studies:SARMAP – kriging

MODELS-3 – Bayesian melding

Other uses of Bayesian hierarchical models

Page 3: Statistics, data, and deterministic models NRCSE

Assessing the SARMAP model

60 days of hourly observations at 32 sites in Sacramento region

Hourly model runs for three “episodes”

Page 4: Statistics, data, and deterministic models NRCSE

Task

Estimate from data the ozone level at x’s in a grid square. Use sum to estimate integral over grid square.

Issues:Transformation

Diurnal cycle

Temporal dependence

Spatial dependence

Space-time interaction

Page 5: Statistics, data, and deterministic models NRCSE

Transformation

Heterogeneous variability–mean and variance positively related

Square root transformation

All modeling now on square root scale–approximately normal

Page 6: Statistics, data, and deterministic models NRCSE

Diurnal cycle

Page 7: Statistics, data, and deterministic models NRCSE

Temporal dependence

Page 8: Statistics, data, and deterministic models NRCSE

Spatial dependence

Page 9: Statistics, data, and deterministic models NRCSE

Estimating a grid square average

Estimate using

(not averages of squares of kriging estimates on the square root scale)

Vt (s) = Zt (s)

Vt (s) =μ t (s) +Wt (s)Wt (s) =α1(s)Wt−1(s) + α2 (s)Wt−2 (s) + Yt (s)

1

AVt

2 (s)dsA∫

1

ME Vt (s j )

2 data from 1,..., t{ }∑

Page 10: Statistics, data, and deterministic models NRCSE

Looking at an episode

Page 11: Statistics, data, and deterministic models NRCSE

Afternoon comparison

Page 12: Statistics, data, and deterministic models NRCSE

Nighttime comparison

Page 13: Statistics, data, and deterministic models NRCSE

A Bayesian approach

SARMAP study spatially data richIf spatially sparse data, how estimate grid squares?

P = Z + M + A + O = ()Z + B + E

P = process model outputO = observationsZ = truth

Calculate (Z |P,O) for prediction

Calculate (O |P, = 1, M = S = A = 0) for model assessment

Page 14: Statistics, data, and deterministic models NRCSE

CASTNet and Models-3

CASTNet is a dry deposition network

Models-3 sophisticated air quality model

Average fluxes on 36x36 km2 grid

Weekly data and hourly output

Page 15: Statistics, data, and deterministic models NRCSE

Estimated model bias

The multiplicatice bias is taken spatially constant (= 0.5). The additive bias E(M+A+) is spatially distributed.

Page 16: Statistics, data, and deterministic models NRCSE

Assessing model fit

Predict CASTNet observation Oi from posterior mean of prediction using Models-3 output Pi and remaining observations O-I.

Average length of 90% credible intervals is 7 ppb

Average length using only Models-3 is 3.5 ppb

Page 17: Statistics, data, and deterministic models NRCSE

Crossvalidation

Page 18: Statistics, data, and deterministic models NRCSE

The Bayesian hierarchical approach

Three levels of modelling:

Data model:

f(data | process, parameters)

Process model:

f(process | parameters)

Parameter model:

f(parameters)

Use Bayes’ theorem to compute posterior

f(process, parameters | data)

Page 19: Statistics, data, and deterministic models NRCSE

Some applications

Data assimilation

Satellite tracking

Precipitation measurement

Combination of data on different scales

Image analysis

Agricultural field trials

Page 20: Statistics, data, and deterministic models NRCSE

Application to Models-3

where (I) are samples from the posterior distribution of

(Z(s0 ) P,O) ∝ f(Z(s0 ) P,O,)f( P,O)d∫≈

1M

f(Z(s0 P,O,(i) )i=1

M

ME IL NC IN FL MI

CASTNet 0.15 3.29 0.90 3.14 0.57 1.02

Models-3 0.33 3.33 5.32 9.59 0.52 1.04

Adjusted

Models-3

0.12 2.88 1.09 3.12 0.44 1.01

Page 21: Statistics, data, and deterministic models NRCSE

Predictions

Page 22: Statistics, data, and deterministic models NRCSE

NCAR-GSP (IMAGe)

Theme for 2005: Data Assimilation in the Geosciences

Page 23: Statistics, data, and deterministic models NRCSE

ENSO project

El Niño/Southern Oscillation is driven by surface temperature in tropical Pacific

Data 2ox2o monthly SST anomalies at 2261 locations; zonal 10m wind

Previous work indicates EOFs of SST may develop in a Markovian fashion

Forecast 7 months ahead uses data from Jan 70 through latest available.

Cressie-Wikle-Berliner (http://www.stat.ohio-state.edu/ ~sses/collab_enso.php)

Page 24: Statistics, data, and deterministic models NRCSE

Model

Data model:

Process model:

Parameter model:

The current state is a mixture over three regimes (determined by SOI), with mixing probabilities that depend on the wind statistic

Standardize by subtracting climatology (monthly average 1971-2000)

Zt =Φat + νt

EOFs

a t+τ =μ t +Htat + ηt+τ

Ht =H(It, J t )regimes

winds

Page 25: Statistics, data, and deterministic models NRCSE

Latest ENSO forecast

Page 26: Statistics, data, and deterministic models NRCSE

Latest forecast with data

fore

cast

dat

a

Page 27: Statistics, data, and deterministic models NRCSE

Relative performance

Performance measure for anomalies:

ave((forecast - data)2} over all pixels in Niño3.4-region

Relative Performance of Forecast A relative to Forecast B is

RP(A,B)=log(Perf B / Perf A)

RP(A,B)>0 indicates A better than B

Persistence: Predict using data 7 months ago

Climatology: Predict using 0

Page 28: Statistics, data, and deterministic models NRCSE

Comparison to climatology and persistence