Uncertainty Quantification in Geosciences with Computationally Expensive Simulation Models (with last-minute modification to include sustainability/energy analysis)

Christine A. Shoemaker, School of Civil and Environmental Engineering and School of Operations Research and Information Engineering, Cornell University

Upload: thomas-shepherd

Post on 27-Mar-2015


Page 1: Uncertainty Quantification In Geosciences with Computationally Expensive Simulation Models (with last minute modification to Include Sustainability/Energy

Uncertainty Quantification in Geosciences with Computationally Expensive Simulation Models

(with last-minute modification to include sustainability/energy analysis)

Christine A. Shoemaker

School of Civil and Environmental Engineering and School of Operations Research and Information Engineering

Cornell University

Page 2

Uncertainty Quantification with Computationally Expensive Simulation Models

• Many problems in geoscience and environmental engineering are described by computationally expensive simulation models.

• Probably the main obstacle to rigorous statistical analysis of uncertainty for a given data set and model is computational time, because:
– Simulation models typically need to be run many (e.g. thousands of) times for uncertainty quantification.
– Multiple types of uncertainty need to be incorporated, including data error, model error, parameter error, and randomness in model input (static and dynamic).

• I’m interested in developing new algorithms for this problem.

Page 3

Examples of Static Geoscience Problems Requiring Uncertainty Quantification

• Examples of goals:
– Determine the spatial distribution of different types of geologic materials in the subsurface.
– Determine the location of oil reservoirs or underground water.

• Data sources:
– Measurements based on sound waves or radar (many spatial points, but not highly accurate).
– Measurements based on drilling into the subsurface (more accurate, but very few spatial points because of the high cost of drilling each well).

Page 4

Examples of Uncertainty Quantification Needs for Dynamic Geoscience Problems

• Examples of problems include:
– Forecasting man-made changes in fluid flow in geologic formations, with effects on floods and contaminant transport.
– Predicting the impact of greenhouse gases on climate change over time.
– Forecasting the dynamic response of multiple fluids in the subsurface to waste disposal (e.g. carbon sequestration).

Page 5

Quantifying Uncertainty from Dynamic Model Predictions with Model Parameters Based on Data

[Figure: timeline (axis: TIME) divided into two periods.
– Calibration time periods: measured static input, measured (weather) dynamic input, and measured dynamic output (flow, contamination, etc.).
– Forecasting time period: estimates of static input and predicted (weather) input, perhaps many scenarios to represent randomness.]

Data are taken during the calibration time period and can be used to establish parameter values (either as deterministic values or as random variables with pdf’s estimated from the data).

We then want to forecast model output in the future using a probabilistic representation of model parameters and input. This forecast has uncertainty from static and dynamic inputs, model error, and parameter error.
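The calibrate-then-forecast idea on this slide can be illustrated with a minimal Monte Carlo sketch. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical cheap stand-in for the expensive simulator, and the parameter samples and weather scenarios are made-up numbers, not results from the talk. Model error is left out to keep the sketch short.

```python
import random
import statistics


def toy_model(runoff_coeff, rainfall_mm):
    """Hypothetical cheap stand-in for an expensive simulator (annual runoff, mm)."""
    return runoff_coeff * rainfall_mm


def forecast_ensemble(param_samples, weather_scenarios, n_draws, seed=0):
    """Propagate parameter and weather uncertainty through the model."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_draws):
        theta = rng.choice(param_samples)      # parameter uncertainty (from calibration)
        rain = rng.choice(weather_scenarios)   # randomness in the dynamic input
        outputs.append(toy_model(theta, rain))
    return outputs


# Made-up inputs: samples standing in for a calibrated parameter pdf and
# for a set of predicted weather scenarios.
_rng = random.Random(1)
param_samples = [_rng.gauss(0.30, 0.02) for _ in range(500)]
weather_scenarios = [_rng.gauss(1000.0, 120.0) for _ in range(50)]

ensemble = forecast_ensemble(param_samples, weather_scenarios, n_draws=2000)
cuts = statistics.quantiles(ensemble, n=20)    # 19 cut points: 5%, 10%, ..., 95%
low, high = cuts[0], cuts[-1]
print(f"forecast mean {statistics.mean(ensemble):.1f} mm, "
      f"90% interval [{low:.1f}, {high:.1f}] mm")
```

With a real simulator each of the 2000 draws would be a full model run, which is the computational obstacle the talk is concerned with.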

Page 6

How Do We Protect This Water From Pollution?

This is the New York City (NYC) water supply.

Excess phosphorus from the watershed would result in the cost of a US$8 billion water treatment plant for NYC!

Page 7

Study Area

• Cannonsville Reservoir Basin – agricultural basin
• New York City water supply
• Phosphorus (P) ‘restriction’ impedes economic growth of county

Model incorporates over 20,000 data values available for this watershed.

Page 8

Cow Waste (Manure) Is the Primary Source of Phosphorus Transported by Water to the Lake

Page 9

SWAT2000: A Spatially Distributed Simulation Model with Subbasins, Land Use and Soil Type in “HRUs”

[Figure: watershed map with numbered subbasins. 43 subbasins; 758 HRUs; average HRU = 1.6 km²]

Using a spatially distributed model helps us evaluate management options.

The SWAT model was developed by the USDA and is used worldwide.

Page 10

Sept 12, 2003

Simulation Model Predictions of Annual Phosphorus Contamination: Repeating 1988-1999 Climates for 72 Years

• By repeating 12-year blocks of climate, the trends in climate inputs are eliminated.

[Figure: bar chart, "Total P to reservoir over 72 year simulation repeating 12 year climate blocks (1988-1999)". x-axis: block # (1 to 6; first block 2004-2015, last block 2064-2075). y-axis: average total P to reservoir in kg/yr (20,000 to 40,000). Bar labels in the order extracted: 35561, 38224, 36974, 36680, 34641, 34297.]

P load to reservoir is simulated to increase over time!

The increase is measured relative to the first climate block: (38224 - 34297) / 34297 × 100% = 11.4% increase in P load to reservoir over 72 years.
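The percentage on this slide can be checked directly. The short sketch below only reproduces the slide's own arithmetic; the variable names are chosen here for illustration.

```python
# Average total P load to the reservoir (kg/yr), taken from the slide's formula.
first_block_load = 34297   # first 12-year climate block
last_block_load = 38224    # last 12-year climate block

increase_pct = (last_block_load - first_block_load) / first_block_load * 100
print(f"{increase_pct:.1f}% increase over 72 years")  # prints "11.4% increase over 72 years"
```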

Page 11

Significance of Results on Phosphorus in Watershed

• The forecast of an increase in phosphorus loading to the water (with no increase in human or animal activity) is very serious because:
– NY City’s drinking water quality will decline (and there might not be a replacement).
– Future cost could be billions of dollars.
– Steps should be taken now to stop the increase in phosphorus pollution.

• Hence the rate of increase (11% over 72 years) is important, and we would like to know the uncertainty associated with it.

Page 12

Sustainability/Energy Issues

• Development of renewable energy devices: fuel cells, photovoltaics, biofuels, etc.

• Wind energy has some of the same issues as the watershed analysis: you have a calibration period, but the real uncertainty is during the forecast period and is driven by variability in weather inputs (and fear of disasters).

• My group has started to work on carbon sequestration, which is important in waste disposal and has some similarity to the problems that arise in nuclear waste disposal.

Page 13

Carbon Sequestration: Another Application Where Uncertainty Quantification Is Important

• Carbon sequestration: storage of supercritical carbon dioxide in geological formations about 1 kilometer below the surface.

Page 14

Uncertainty Quantification Importance for Carbon Sequestration

• The issue is that there are serious public health and environmental risks associated with the unexpected movement of CO2 upward to groundwater or to the surface.

• We would like to use a model plus monitoring data to estimate the CO2 plume in the ground, to determine whether the system is functioning properly and where the plume is going to move in the future.

Page 15

Related Issues

• How do we efficiently quantify tail probabilities of high-impact rare events?

• Can we augment the predictive ability by using experiments? How?

• Can we derive predictive reduced-order models and/or response surfaces?

• Likelihoods can be multimodal and/or rough.

Page 16

Most Nonlinear Simulation Models Will Lead to Likelihood Models with Multiple Local Minima

Multi-modal problems have multiple local minima.

[Figure: objective F(x) plotted against x (parameter value), with a local minimum and the global minimum marked.]
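Why multiple local minima matter can be seen in a small sketch. It assumes a hypothetical one-parameter objective with two wells and a simple derivative-free local search. Neither comes from the talk, and multistart is shown only as one generic remedy, not as the author's algorithm.

```python
def objective(x):
    """Hypothetical double-well objective with one local and one global minimum."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def local_search(f, x0, step=0.5, tol=1e-6):
    """Derivative-free descent: move while a neighbour improves, else halve the step."""
    x, fx = x0, f(x0)
    while step > tol:
        for candidate in (x - step, x + step):
            fc = f(candidate)
            if fc < fx:
                x, fx = candidate, fc
                break
        else:
            step /= 2.0
    return x, fx


def multistart(f, starts):
    """Run the local search from every start and keep the best result."""
    return min((local_search(f, s) for s in starts), key=lambda result: result[1])


x_single, f_single = local_search(objective, 1.5)   # trapped in the local minimum near x = 0.96
starts = [-2.0 + 0.5 * i for i in range(9)]          # -2.0, -1.5, ..., 2.0
x_best, f_best = multistart(objective, starts)       # reaches the global minimum near x = -1.04
print(f"single start: x={x_single:.2f}, F={f_single:.3f}")
print(f"multistart:   x={x_best:.2f}, F={f_best:.3f}")
```

Each extra start costs more function evaluations, which is exactly what is scarce when every evaluation is an expensive simulation.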

Page 17

Likelihood Can Be a Rough Surface Because of Numerical Simulation

[Figure: objective (likelihood) function versus parameter value]

Page 18

Our Approach to Uncertainty Quantification

• I have been working with statistician David Ruppert and others on quantification of uncertainty for computationally expensive simulation models.

• My approach has been to embed both optimization search and response surfaces into the algorithms for uncertainty quantification to significantly reduce computational effort (by orders of magnitude).

• Our methods are designed to work with multimodal functions and rough surfaces.
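The response-surface idea can be sketched as follows: run the expensive model at a few design points, fit a cheap interpolant, and query the interpolant instead. This is a minimal illustration under stated assumptions, not the method from the talk: `expensive_model` is a hypothetical stand-in, the Gaussian kernel and its width are arbitrary choices, and the optimization step is omitted.

```python
import math


def expensive_model(x, counter):
    """Hypothetical stand-in for a costly simulation; counts how often it runs."""
    counter["runs"] += 1
    return (x * x - 1.0) ** 2 + 0.3 * x


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_rbf(xs, ys, width=0.5):
    """Gaussian radial basis function interpolant through the points (xs, ys)."""
    def kernel(r):
        return math.exp(-(r / width) ** 2)

    gram = [[kernel(abs(xi - xj)) for xj in xs] for xi in xs]
    weights = solve_linear(gram, ys)

    def surrogate(x):
        return sum(w * kernel(abs(x - xi)) for w, xi in zip(weights, xs))

    return surrogate


counter = {"runs": 0}
design_points = [-2.0 + 0.5 * i for i in range(9)]
design_values = [expensive_model(x, counter) for x in design_points]
surrogate = fit_rbf(design_points, design_values)

# The cheap surrogate can now be queried many times without new simulations.
grid = [-2.0 + 0.004 * i for i in range(1001)]
approx = [surrogate(x) for x in grid]
print(f"expensive runs: {counter['runs']}, surrogate evaluations: {len(approx)}")
```

The surrogate reproduces the model exactly only at the design points. Its accuracy elsewhere depends on where those points are placed.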

Page 19

Paper 1 on SOARS

• Blizniouk, N., D. Ruppert, C.A. Shoemaker, R. G. Regis, S. Wild, P. Mugunthan, “Bayesian Calibration of Computationally Expensive Models Using Optimization and Radial Basis Function Approximation.” Journal of Computational and Graphical Statistics, July 2008.

Page 20

Notation

• Y0 - vector of observed data

• η - parameters in the (joint) statistical model

• [list1 | list2] - conditional density of the random variables in list1 given list2; e.g. [η | Y0] is the conditional density of η given the data Y0

• η = [equation not recovered] in this example
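In this bracket notation, the posterior density of the parameters follows from Bayes' theorem. The slide does not write the formula out; the form below is the standard statement, with [η] denoting the prior density (a symbol assumed here by extension of the slide's notation).

```latex
[\eta \mid Y_0]
  \;=\; \frac{[Y_0 \mid \eta]\,[\eta]}{\int [Y_0 \mid \eta']\,[\eta']\,d\eta'}
  \;\propto\; [Y_0 \mid \eta]\,[\eta]
```

Evaluating the likelihood [Y0 | η] at a new η requires a run of the simulation model, which is why the number of function evaluations dominates the cost.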

Page 21

[Figure: one graph for each parameter.] Kernel estimates of the marginal posterior densities from (a) (solid lines) the exact joint posterior obtained from a conventional MCMC analysis with 10,000 function evaluations and (b) (dashed lines) our function approximation method with 150 function evaluations.
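A kernel estimate of a marginal density, of the kind plotted on this slide, can be sketched in a few lines. The draws below are made-up numbers standing in for posterior samples of one parameter. The Gaussian kernel and the rule-of-thumb bandwidth are assumptions, since the slide does not say which kernel or bandwidth was used.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function estimating the density of `samples` with Gaussian kernels."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb for the bandwidth (an assumed default).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


# Made-up draws standing in for posterior samples of one model parameter.
rng = random.Random(0)
posterior_draws = [rng.gauss(0.30, 0.02) for _ in range(300)]
density = gaussian_kde(posterior_draws)

# Riemann-sum check that the estimate integrates to about 1.
dx = 0.001
area = sum(density(0.1 + i * dx) for i in range(401)) * dx
print(f"density at 0.30: {density(0.30):.1f}, area under curve: {area:.3f}")
```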

Page 22

Output Comparison

OUTPUT UNCERTAINTY: Again we got excellent agreement between our approach with 150 evaluations and the conventional approach with 10,000 evaluations.

Page 23

SWAT2000: A Spatially Distributed Simulation Model with Subbasins, Land Use and Soil Type in “HRUs”

[Figure: watershed map with numbered subbasins. 43 subbasins; 758 HRUs; average HRU = 1.6 km²]

Using a spatially distributed model helps us evaluate management options.

The SWAT model was developed by the USDA and is used worldwide.

Page 24

Marginal Posterior Densities of the 5 Parameters (βi) for the Watershed Model

[η | Y0]

This contains all the statistical information. Everything else is computed on the basis of this.

Page 25

Joint Posterior Density of Six Different Model Outputs

Page 26

Quantiles, Means, Standard Dev.

                    Quantiles
Output (mm)         5%      25%     75%     95%     Mean    S. Dev.
Surface Runoff      54.6    55.7    56.8    59.9    56.9    1.6
Lateral Flow        341.9   349.4   354.8   364.8   354.2   7.2
Groundwater Flow    275.0   280.3   284.8   295.5   284.8   6.3
ET                  434.9   436.4   437.5   441.3   437.8   2.0
Revap               11.8    11.9    11.9    12.0    11.9    0.1
Yield               690.5   693.0   694.4   697.4   694.2   2.1

These results are based on a statistically rigorous analysis, with transformations to account for non-normal data, and were obtained with a small fraction of the number of simulations required by other methods, including MCMC and GLUE.
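Given samples of a model output, summary columns like those in the table can be computed as in the sketch below. The samples are made up for illustration. Their location and spread only loosely echo the "Surface Runoff" row and are not the talk's data.

```python
import random
import statistics


def summarize(samples):
    """Return the 5%, 25%, 75%, 95% quantiles, mean and standard deviation."""
    cuts = statistics.quantiles(samples, n=20)  # cut points at 5%, 10%, ..., 95%
    return {
        "q05": cuts[0],
        "q25": cuts[4],
        "q75": cuts[14],
        "q95": cuts[18],
        "mean": statistics.mean(samples),
        "sd": statistics.stdev(samples),
    }


# Made-up output samples (mm); not results from the talk.
rng = random.Random(42)
runoff_samples = [rng.gauss(56.9, 1.6) for _ in range(2000)]
summary = summarize(runoff_samples)
print("  ".join(f"{name}={value:.1f}" for name, value in summary.items()))
```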

Page 27

Questions?

Page 28

Groundwater Simulation Model for Beijing

Figure 1. Miyun-Huai-Shun groundwater aquifer with hydraulic head observation wells of the study area.

The main aquifer covers 456 km² of the basin area. This is the primary water supply for Beijing.

Upper layers of the aquifer are contaminated, thereby significantly reducing Beijing’s water supply.

The goal of parameter estimation is to understand the effect of extracting water and the potential for contamination.

The model involves solving PDEs.

Page 29

The groundwater aquifer is 3-dimensional. Below is a vertical cross section.

Cross-hatched areas indicate different “conductivities”, which need to be estimated as parameters.

The simulation model solves a system of partial differential equations by the finite difference method.
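To show how conductivities enter a finite difference model, here is a deliberately reduced sketch: steady-state flow in one dimension, d/dx(K dh/dx) = 0, with fixed heads at both ends. The real model is three-dimensional and its equations are not given in the slides, so the equation, the grid, the harmonic-mean interface conductivities and all names below are assumptions for illustration.

```python
def solve_steady_head(conductivity, head_left, head_right):
    """Hydraulic heads at equally spaced nodes; the first and last node are fixed."""
    n = len(conductivity)
    # Conductivity at the interface between neighbouring nodes (harmonic mean).
    k_face = [2.0 * conductivity[i] * conductivity[i + 1]
              / (conductivity[i] + conductivity[i + 1]) for i in range(n - 1)]

    # Tridiagonal system for interior nodes: -kw*h[i-1] + (kw+ke)*h[i] - ke*h[i+1] = 0
    lower, diag, upper, rhs = [], [], [], []
    for i in range(1, n - 1):
        kw, ke = k_face[i - 1], k_face[i]
        lower.append(-kw)
        diag.append(kw + ke)
        upper.append(-ke)
        rhs.append(0.0)
    rhs[0] += k_face[0] * head_left      # known head moved to the right-hand side
    rhs[-1] += k_face[-1] * head_right

    # Thomas algorithm: forward elimination, then back substitution.
    m = len(diag)
    for i in range(1, m):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    interior = [0.0] * m
    interior[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        interior[i] = (rhs[i] - upper[i] * interior[i + 1]) / diag[i]
    return [head_left] + interior + [head_right]


# Uniform conductivity: heads fall linearly from 10 to 6 (up to rounding).
uniform = solve_steady_head([1.0] * 5, head_left=10.0, head_right=6.0)
# Two zones: most of the head drop occurs in the low-conductivity zone.
zoned = solve_steady_head([1.0, 1.0, 1.0, 10.0, 10.0, 10.0], 10.0, 6.0)
print([round(h, 3) for h in uniform])
print([round(h, 3) for h in zoned])
```

In a calibration setting, the entries of `conductivity` would be the unknown parameters, adjusted until simulated heads match the observation wells.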