
Uncertainty Analysis Using GEM-SA

Tony O’Hagan


Outline

Setting up the project

Running a simple analysis

Exercise

More complex analyses


Setting up the project


Number of inputs

Select Project -> New, or click the toolbar icon

The Project dialog appears

Select the number of inputs in this dialog


Our example

We’ll use the example “model1” in the GEM-SA DEMO DATA directory

This example is based on a vegetation model with 7 inputs

– RESAEREO, DEFLECT, FACTOR, MO, COVER, TREEHT, LAI

The model has 16 outputs, but for the present we will consider output 4

– June monthly GPP


Define input names

Click on “Names …”

The “Input parameter names” dialog opens

Enter parameter names

Click “OK”


Files

Click on Files tab

The “Inputs” file contains one column for each parameter and one row for each model training run (the design)

The “Outputs” file contains the outputs of those runs (one column)

Using “Browse” buttons, select input and output files
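
Before browsing to the files it can help to confirm that they have the shape described above. The following is a minimal sketch, assuming the files are plain text with whitespace-separated numbers (the slides do not state the delimiter). The helper names `read_table` and `check_design` and the three-run design are made up for illustration.

```python
import io


def read_table(handle):
    """Read whitespace-separated numbers, one row per line, skipping blank lines."""
    return [[float(tok) for tok in line.split()] for line in handle if line.strip()]


def check_design(inputs, outputs, n_inputs):
    """Return a list of problems; an empty list means the two tables are consistent."""
    problems = []
    for i, row in enumerate(inputs, start=1):
        if len(row) != n_inputs:
            problems.append(f"inputs row {i}: expected {n_inputs} columns, found {len(row)}")
    for i, row in enumerate(outputs, start=1):
        if len(row) != 1:
            problems.append(f"outputs row {i}: expected 1 column, found {len(row)}")
    if len(inputs) != len(outputs):
        problems.append(f"{len(inputs)} input rows but {len(outputs)} output rows")
    return problems


# Made-up design: 3 training runs, 2 inputs (a real project would open the files on disk).
inputs = read_table(io.StringIO("80 0.6\n140 0.8\n200 1.0\n"))
outputs = read_table(io.StringIO("21.5\n24.3\n27.9\n"))
print(check_design(inputs, outputs, n_inputs=2))
```

For the model1 example, `n_inputs` would be 7 and the outputs file would hold the single output column being analysed.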


Close project and save

We will leave all other settings at their default values for now

Click “OK”

Select Project -> Save

– Or click toolbar icon

Choose a name and click “Save”


Running a simple analysis


Build the emulator

Click to build the emulator

A lot of things now start to happen!

– The log window at the bottom starts to record various bits of information

– A little window appears showing progress of minimisation of the roughness parameter estimation criterion

– A new window “Main Effects Realisations” appears and several graphs appear

– A progress bar appears at the bottom


Focus on the log window

Close the “Main Effects Realisations” window when it’s finished

– We don’t need it in this session!

– In the main window we now have a table

– Which we will also ignore for now

Focus on the log window

This reports two key things

– Diagnostics of the emulator build

– The basic uncertainty analysis results


Emulation diagnostics

Note where the log window reports

Estimating emulator parameters by maximising probability distribution...

maximised posterior for emulator parameters: precision = 12.1881, roughness = 0.227332 0.0256299 0.00388643 74.0941 0.963724 1.22783 2.42148

The first line says roughness parameters have been estimated by the simplest method

The values of these indicate how non-linear the effect of each input parameter is

– Note the high value for input 4 (MO)
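
The roughness values are easier to read once they are paired with the input names. The sketch below parses the log line quoted on this slide. It assumes the roughness values are listed in the same order as the inputs, which is consistent with the slide calling MO "input 4". The function `parse_roughness` is a hypothetical helper, not part of GEM-SA.

```python
import re

# Log line copied from the slide.
LOG_LINE = (
    "maximised posterior for emulator parameters: precision = 12.1881, "
    "roughness = 0.227332 0.0256299 0.00388643 74.0941 0.963724 1.22783 2.42148"
)
# Input names in the order given for the model1 example.
NAMES = ["RESAEREO", "DEFLECT", "FACTOR", "MO", "COVER", "TREEHT", "LAI"]


def parse_roughness(line):
    """Extract the list of roughness values that follows 'roughness ='."""
    match = re.search(r"roughness\s*=\s*(.+)$", line)
    if match is None:
        raise ValueError("no roughness values found in log line")
    return [float(tok) for tok in match.group(1).split()]


roughness = dict(zip(NAMES, parse_roughness(LOG_LINE)))

# List inputs from most to least non-linear according to the roughness values.
for name, value in sorted(roughness.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s}{value:12.6g}")

roughest = max(roughness, key=roughness.get)
print("largest roughness:", roughest)
```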


Uncertainty analysis – mean

Below this, the log reports

Estimate of mean output is 24.3088, with variance 0.00422996

So the best estimate of the output (June GPP) is 24.3 (mol C/m²)

– This is averaged over the uncertainty in the 7 inputs

– Better than just fixing inputs at best estimates

– There is an emulation standard error of 0.065 in this figure


Uncertainty analysis – variance

The final line of the log is

Estimate of total output variance = 72.9002

This shows the uncertainty in the model output that is induced by input uncertainties

– The variance is 72.9

– Equal to a standard deviation of 8.5

– So although the best estimate of the output is 24.3, the uncertainty in inputs means it could easily be as low as 16 or as high as 33
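
The figures quoted on the last two slides follow from the log values by taking square roots. The short sketch below reproduces the arithmetic. It reads "as low as 16 or as high as 33" as the mean plus or minus one standard deviation, which is an interpretation of the slide, and the variable names are illustrative.

```python
import math

# Values copied from the log lines on the slides (Model 1, output 4, uniform inputs).
mean_output = 24.3088
variance_of_mean = 0.00422996   # emulation variance of the estimated mean
total_variance = 72.9002        # output variance induced by input uncertainty

emulation_se = math.sqrt(variance_of_mean)   # uncertainty due to using an emulator
output_sd = math.sqrt(total_variance)        # uncertainty due to the inputs

# One standard deviation either side of the mean.
low, high = mean_output - output_sd, mean_output + output_sd

print(f"emulation standard error: {emulation_se:.3f}")
print(f"output standard deviation: {output_sd:.1f}")
print(f"mean -/+ one sd: {low:.0f} to {high:.0f}")
```

The two square roots describe different things: the first is how well the emulator pins down the mean, the second is how much the output itself varies as the inputs vary.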


Exercise


A small change

Run the same model with Output 11 instead of Output 4

Calculate the coefficient of variation (CV) for this output

– NB: the CV is defined as the standard deviation divided by the mean
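
As a worked illustration of the definition, the sketch below computes the CV from a mean and a total output variance as they appear in the log. The numbers used are those already reported for output 4, so the exercise for output 11 is left open. The function name is illustrative.

```python
import math


def coefficient_of_variation(mean, variance):
    """CV = standard deviation / mean, taking the variance as reported in the log."""
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return math.sqrt(variance) / mean


# Output 4 figures from the earlier slides.
cv_output4 = coefficient_of_variation(mean=24.3088, variance=72.9002)
print(f"CV for output 4: {cv_output4:.1%}")
```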


More complex analyses


Input distributions

Default is to assume the uncertainty in each input is represented by a uniform distribution

– Range determined by the range of values found in the input file

A normal (Gaussian) distribution is generally a more realistic representation of uncertainty

– Range unbounded

– More probability in the middle
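
To see why the choice of input distribution matters, the toy sketch below propagates uniform and normal inputs through a made-up function by plain Monte Carlo. This is not how GEM-SA computes its results (it works through the emulator). `toy_model`, the 0 to 1 range, and the choice of a normal standard deviation equal to a quarter of the range are all assumptions for illustration.

```python
import random
import statistics


def toy_model(x1, x2):
    """Stand-in for an expensive simulator (purely illustrative)."""
    return 3.0 * x1 + x2 ** 2


def monte_carlo_ua(sample_inputs, n=20000, seed=0):
    """Estimate the mean and variance of the output by sampling the inputs."""
    rng = random.Random(seed)
    outputs = [toy_model(*sample_inputs(rng)) for _ in range(n)]
    return statistics.fmean(outputs), statistics.variance(outputs)


LOWER, UPPER = 0.0, 1.0
MID = (LOWER + UPPER) / 2.0
SD = (UPPER - LOWER) / 4.0   # assumed: the range spans 4 standard deviations


def uniform_inputs(rng):
    return rng.uniform(LOWER, UPPER), rng.uniform(LOWER, UPPER)


def normal_inputs(rng):
    return rng.gauss(MID, SD), rng.gauss(MID, SD)


mean_u, var_u = monte_carlo_ua(uniform_inputs)
mean_n, var_n = monte_carlo_ua(normal_inputs)
print(f"uniform inputs: mean {mean_u:.3f}, variance {var_u:.3f}")
print(f"normal inputs:  mean {mean_n:.3f}, variance {var_n:.3f}")
```

With more probability in the middle of the range, the normal inputs give a smaller output variance for this toy function.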


Changing input distributions

In Project dialog, Options tab, click the button for “All unknown, product normal”

Then OK

A new dialog opens to specify means and variances


Model 1 example

Uniform distributions from input ranges

Normal distributions to match

– Range is 4 std devs

– Except for MO: narrower distribution

             Uniform           Normal
Parameter    Lower    Upper    Mean    Variance
RESAEREO     80       200      140     900
DEFLECT      0.6      1        0.8     0.01
FACTOR       0.1      0.5      0.3     0.01
MO           30       100      60      100
COVER        0.6      0.99     0.8     0.01
TREEHT       10       40       25      100
LAI          3.75     9        6.5     1
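
The "range is 4 std devs" rule can be written as a one-line calculation. The sketch below applies it to three rows of the table, using the mid-point of the range as the mean, which is an assumption. `matched_normal` is a hypothetical helper.

```python
def matched_normal(lower, upper, sds_in_range=4.0):
    """Mean and variance of a normal whose sds_in_range standard deviations span the range."""
    mean = (lower + upper) / 2.0
    sd = (upper - lower) / sds_in_range
    return mean, sd ** 2


# Rows taken from the table above.
for name, lower, upper in [
    ("RESAEREO", 80, 200),
    ("DEFLECT", 0.6, 1.0),
    ("FACTOR", 0.1, 0.5),
]:
    mean, variance = matched_normal(lower, upper)
    print(f"{name:10s} mean {mean:g}, variance {variance:.4g}")
```

These three rows reproduce the tabulated means and variances. The remaining rows of the table are not all exact applications of the rule: MO is deliberately narrower, and the other values appear to be rounded or chosen by judgement.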


Effect on UA

After running the revised model, we see:

– It runs faster, with no need to rebuild the emulator

– The emulator fit is unchanged

– The mean is changed a little and variance is halved

Estimate of mean output is 26.4649, with variance 0.0108452

Estimate of total output variance = 36.8522


Reducing the MO uncertainty further

If we reduce the variance of MO even more, to 49:

– UA mean changes a little more and variance reduces again

– Notice also how the emulation uncertainty has increased (the variance of the mean was 0.004 for uniform)

– This is because the design points cover the new ranges less thoroughly

Estimate of mean output is 26.6068, with variance 0.014514

Estimate of total output variance = 26.4372


Another exercise

What happens if we reduce the uncertainty in MO to zero?

Two ways to do this

– Literally set variance to zero

– Select “Some known, rest product normal” on Project dialog, check the tick box for MO in the mean and variance dialog

What changes do you see in the UA?


Cross-validation

In the Project dialog, look at the bottom menu box, labelled “Cross-validation”

There are 3 options

– None

– Leave-one-out

– Leave final 20% out

Cross-validation (CV) is a way of checking the emulator fit

– Default is None because CV takes time


Leave-one-out CV

After estimating roughness and other parameters, GEM predicts each training run point using only the remaining n-1 points

Results appear in the log window (Model 1, output 4, uniform inputs)

Cross Validation Root Mean-Squared Error = 0.844452

Cross Validation Root Mean-Squared Relative Error = 4.00836 percent

Cross Validation Root Mean-Squared Standardised Error = 1.01297

Cross Validation variances range from 0.173433 to 2.89026

Written cross-validation means to file cvpredmeans.txt

Written cross-validation variances to file cvpredvars.txt

– The root mean-squared standardised error is close to 1
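
The three summary figures can be illustrated with a few lines of arithmetic on held-out predictions. The sketch below uses the usual textbook definitions, which are assumed here and may differ in detail from what GEM-SA computes. The data are made up so that the standardised error comes out at exactly 1.

```python
import math


def cv_summary(observed, predicted_means, predicted_variances):
    """Return (RMSE, RMS relative error in percent, RMS standardised error)."""
    n = len(observed)
    errors = [y - m for y, m in zip(observed, predicted_means)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # Relative error: each error is scaled by the observed value.
    rms_relative = 100.0 * math.sqrt(
        sum((e / y) ** 2 for e, y in zip(errors, observed)) / n
    )
    # Standardised error: each error is scaled by its predictive standard deviation.
    rms_standardised = math.sqrt(
        sum(e * e / v for e, v in zip(errors, predicted_variances)) / n
    )
    return rmse, rms_relative, rms_standardised


# Made-up held-out runs, with emulator predictive means and variances.
observed = [20.0, 25.0, 30.0, 40.0]
predicted_means = [21.0, 24.0, 31.0, 38.0]
predicted_variances = [1.0, 1.0, 1.0, 4.0]

rmse, rms_relative, rms_standardised = cv_summary(
    observed, predicted_means, predicted_variances
)
print(f"RMSE = {rmse:.4f}")
print(f"RMS relative error = {rms_relative:.2f} percent")
print(f"RMS standardised error = {rms_standardised:.4f}")
```

A standardised error near 1 means the prediction errors are about as large as the emulator's own variances say they should be. Values well above 1 suggest the emulator is overconfident.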


Leave out final 20% CV

This is an even better check, because it tests the emulator on data that have not been used in any way to build it

Emulator is built on first 80% of data and used to predict last 20%

Cross Validation Root Mean-Squared Error = 0.959898

Cross Validation Root Mean-Squared Relative Error = 4.65714 percent

Cross Validation Root Mean-Squared Standardised Error = 0

Cross Validation variances range from 0.182214 to 2.17168

[Query for Marc: why is the standardised error reported as zero?]


Other options

There are various other options associated with the emulator building that we have not dealt with

But we’ve done the main things that should be considered in practice

And it’s enough to be going on with!


When it all goes wrong

How do we know when the emulator is not working?

– Large roughness parameters, especially ones hitting the limit of 99

– Large emulation variance on UA mean

– Poor CV standardised prediction error, especially when some are extremely large

In such cases, see if a larger training set helps

– Other ideas like transforming output scale
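
These warning signs can be collected into a simple checklist. The sketch below is one possible way to do that. Only the roughness limit of 99 comes from the slides. The other two thresholds (`max_variance_ratio` and `max_standardised_error`) are arbitrary placeholders, not GEM-SA settings, and would need to be chosen by judgement.

```python
ROUGHNESS_LIMIT = 99.0  # limit quoted on the slide


def emulator_warnings(roughness, variance_of_mean, total_variance,
                      cv_standardised_error,
                      max_variance_ratio=0.01,       # placeholder threshold
                      max_standardised_error=2.0):   # placeholder threshold
    """Return a list of warning strings; an empty list means no sign was triggered."""
    warnings = []
    at_limit = [i + 1 for i, r in enumerate(roughness) if r >= ROUGHNESS_LIMIT]
    if at_limit:
        warnings.append(f"roughness at the limit for input(s) {at_limit}")
    if variance_of_mean > max_variance_ratio * total_variance:
        warnings.append("large emulation variance on the UA mean")
    if cv_standardised_error > max_standardised_error:
        warnings.append("poor CV standardised prediction error")
    return warnings


# Model 1, output 4, uniform inputs: values taken from the log lines on the slides.
model1 = emulator_warnings(
    roughness=[0.227332, 0.0256299, 0.00388643, 74.0941, 0.963724, 1.22783, 2.42148],
    variance_of_mean=0.00422996,
    total_variance=72.9002,
    cv_standardised_error=1.01297,
)
print(model1)
```

For the Model 1 figures nothing is flagged, although the roughness for MO (74.1) is large without reaching the limit.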