uncertainty quantification methods enabling credible ...personal background • ph.d., computational...

Uncertainty Quantification Methods Enabling Credible Simulationg

Brian M. AdamsOptimization and Uncertainty Quantification

(with Michael S. Eldred and Laura P. Swiler)

November 5, 2009Nuclear Engineering / Mathematics Joint Seminar

North Carolina State UniversityRaleigh, NC

Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,for the United States Department of Energy’s National Nuclear Security Administration

under contract DE-AC04-94AL85000.

Personal Background

• Ph.D., Computational and Applied Mathematics, NC State– mathematics, statistics, computer science, immunologyp gy– nondeterministic model calibration (HIV)

• SNL, Albuquerque since 2005:– went for optimization focus; diversified into UQ– mix of algorithm development, production software– agent-based disease simulations, power grid optimization, g p g p

research in mixed integer surrogates

• Algorithms discussed today publicly available in DAKOTA– Many science/engineering application customers; their unmet

needs drive research and software

O i k ith H INL ORNL i DOE N l• Ongoing work with Hany, INL, ORNL in DOE Nuclear Energy Advanced Modeling and Simulation (NEAMS)

Outline: UQ Algorithms for Credible Simulation

GOAL: Demonstrate how a mix of statistics, nonlinear optimization, numerical integration, and surrogate (meta-) modeling enables robust and efficient UQ methods.

• Motivation for credible simulation (computational models)( p )• Characterizations and propagation of uncertainty• Uncertainty quantification algorithms

Sampling based– Sampling-based– Reliability– Stochastic expansion

Interval estimation– Interval estimation• UQ research challenges and directions

Insight from Computational Simulation

Systems of systems analysis: multi-scale, multi-phenomenon

Mi l t h i lMicro-electro-mechanical systems (MEMS): quasi-

static nonlinear elasticity, process modeling

Joint mechanics: system-level FEA for component

assessment

Electrical circuits: networks, PDEs, differential algebraic

equations (DAEs), E&M

dHurricane Katrina: weather,

logistics, economics, human behaviorEarth penetrator: nonlinear

PDEs with contact, transient analysis, material modeling

Credible Simulation

• Historically: primary focus on modeling fidelity (with enormous investments)C t i i l ( h ld) d d f f ll• Customers increasingly (should) demand proof of overall analysis pedigree in the data and/or extrapolation domain.

Bill Oberkampf

Inclusion of Nondeterministic Element (early 1970’s)

Bill Oberkampf

Formal V&V, UQ, and Model FidelitySupport Credible Simulation

Ultimate purpose (arguably): insight, prediction, and risk-informed decision-making need credibility for intended application

Bill Oberkampf

Verification & Validation

• Verification: “Are we solving the equations correctly?”– mathematics/computer science issue: Is our mathematical formulation

and software implementation of the physics model correct?and software implementation of the physics model correct?– code verification (software correctness)– solution verification (e.g., exhibits proper order of convergence)

V lid ti “A l i th i ht ti ?”• Validation – “Are we solving the right equations?”– a disciplinary science issue: is the science (physics, biology, etc.)

model sufficient for the intended application? Involves data and metrics.

• Ideally: code verification solution verification validation then follow-on optimization/calibration/uncertainty quantification

Supporting concepts:pp g p• Sensitivity Analysis (SA): both local and global

– How do code outputs vary with respect to changes in code inputs?• Uncertainty Quantification (UQ):

– What are the probability distributions on code outputs, given the probability di t ib ti d i t ? U k i t di t ib ti ?distributions on my code inputs? Unknown input distributions?

• Quantification of margins and uncertainties (QMU):– How “close” are my code output predictions (incl. UQ) to the system’s

required performance level?

Uncertainties in Simulation and Validation

• physics/science parametersA few uncertainties affecting computational model output/results:

physics/science parameters• statistical variation, inherent randomness• model form / accuracy

ti i t i t f• operating environment, interference• initial, boundary conditions; forcing• geometry / structure / connectivity• material properties• manufacturing quality• experimental error (measurement error, measurement bias)experimental error (measurement error, measurement bias)• numerical accuracy (mesh, solvers); approximation error• human reliability, subjective judgment, linguistic imprecision

The effect of these on model outputs should be integral to an analyst’s deliverable: best estimate PLUS uncertainty!

Uncertainty Quantification

• A single optimal design or nominal performance prediction is often insufficient for – decision making / trade-off assessment– quantification of margins and uncertainties (QMU):

How close are my uncertainty-aware code predictions to required performance expectations or limits?required performance expectations or limits?

• GOAL: risk-informed decisions, based on effects F in a l T e m p e ra tu re V a lu e s

of uncertainty on model output (forward propagation)

2

3

4

5

% in

Bin

Test Data

Model Data

• Validation: Given the relevant uncertainties, is the science (physics, biology,

0

1

2

Te m e pra ture [de g C]

%

etc.) model sufficient for the intended application?

Quantitative Comparison with Experiment: Validation Metrics

Verification to estimate numerical error; uncertainty quantification to assess effect of unknowns or natural variability

Categories of Uncertainty

• Aleatory (think probability density function; sufficient data)

Often useful algorithmic distinctions, but not always a clear division

• Aleatory (think probability density function; sufficient data)– Inherent variability (e.g., in a population), type-A, stochastic– Irreducible uncertainty – can’t reduce it by further knowledge

simulationcode

InputRandomVariables

OutputMetricStatisticsVariables Statistics

Categories of Uncertainty

• Aleatory (think probability density function; sufficient data)

Often useful algorithmic distinctions, but not always a clear division

• Aleatory (think probability density function; sufficient data)– Inherent variability (e.g., in a population), type-A, stochastic– Irreducible uncertainty – can’t reduce it by further knowledge

• Epistemic (think bounded intervals)– Subjective, type-B, state of knowledge uncertainty

R l t d t h t d ’t k– Related to what we don’t know– Reducible: If you had more data or more information, you

could make your uncertainty estimation more precise

simulationcode

[ ][ ]

[ ]InputIntervals

[ ][ ]

[ ]OutputIntervals[ ]

[ ] [ ]

Uncertainty QuantificationForward propagation: quantify the effect that uncertain (nondeterministic) input variables have on model output

I t V i blInput Variables u(physics parameters, geometry, initial and boundary conditions)

ComputationalModel

Variable PerformanceMeasures f(u)

• based on uncertain inputs, determine variance of outputs and probabilities

Potential Goals:(possibly given distributions)

Output Distributions

N samplesvariance of outputs and probabilities of failure (reliability metrics)

• identify parameter correlations/local sensitivities, robust optima

Distributions

measure 1Modelu1

, p• identify inputs whose variances

contribute most to output variance (global sensitivity analysis)

measure 2

u2

u3

• quantify uncertainty when using calibrated model to predict

Typical method: Monte Carlo Sampling

Thermal Uncertainty Quantification

• Device subject to heating (experiment or computational simulation)

• Uncertainty in composition/• Uncertainty in composition/ environment (thermal conductivity, density, boundary), parameterized by u1, …, uN

• Response temperature f(u)=T(u1, …, uN)calculated by heat transfer code

Given distributions of u1,…,uN, UQ methods calculateUQ methods calculate statistical info on outputs:• Mean(T), StdDev(T), Probability(T ≥ T iti l)

Final Temperature Values

4

5

Probability(T ≥ Tcritical)• Probability distribution of temperatures• Correlations (trends) and 1

2

3

4

% in

Bin

( )sensitivity of temperature0

30 36 42 48 54 60 66 72 78 84

Temeprature [deg C]

Challenges for Simulation-based UQ

• Engineering application: propagate variability through a computer model.

• Need statistics of response function f, e.g., µf, f, Prob[ f > fcritical]• Typical characteristics: • input parameters specified by

probability density functions

f(x1, x2)

• no explicit function for f(x1,x2)• expensive to evaluate f(x1,x2) and

may fail to calculate

0.60.81.0

• limited number of samples• noisy / non-smooth

Is it better to:

0.00.20.4

0 4

0.20.4

Is it better to:• compute statistics from the limited

f(x1,x2) sample values, or• construct an approximation model

0.40.6

0.81.0 x 1

0.60.8

1.01.2

x2

ppbased on the f(x1,x2) values and then compute statistics

UQ: Sampling Methods

Given distributions of u1,…,uN, sampling-based methods calculate sample statistics, e.g., on temperature T(u1,…,uN):

• sample meanOutput

DistributionsN samples

N

i

iuTN

T1

)(1

• sample variancemeasure 1Model

u1

u2

Ni TuT

NT 2)(1

2

• full PDF(probabilities)Final Temperature Values

measure 22

u3

iN 1

2

3

4

5

% in

Bin

• Monte Carlo sampling• Quasi-Monte Carlo• Centroidal Voroni Tessalation (CVT)

0

1

30 36 42 48 54 60 66 72 78 84

Temeprature [deg C]

%

• Latin Hypercube samplingRobust, but slow convergence: O(N-

1/2)

Latin Hypercube Sampling (LHS)

• Specialized Monte Carlo (MC) sampling technique: workhorse method in DAKOTA / at Sandia

• Stratified random sampling among equal probability bins for

• Stratified random sampling among equal probability bins for all 1-D projections of an n-dimensional set of samples.

• McKay and Conover (early), restricted pairing by ImanG

H

I0.2 0.2 0.2 0.2 0.2

A B C D

0.8

1

J

K

0.2

0.4

0.6

0.8

A B C DL

A Two-Dimensional Representation of One Possible LHS of size 5 Utilizing X1 (normal)

and X2 (uniform)0

A B C D

Intervals Used with a LHS of Size n = 5 in Terms of the pdf and CDF for a Normal

Random Variable

( )

Calculating Probability of Failure

• Given uncertainty in materials, geometry, and environment, determine likelihood of failure Probability(T ≥ T )Probability(T ≥ Tcritical)

Final Temperature Values • Could perform 10,000 LHS samples and count how

3

4

5

Bin

samples and count how many exceed threshold…

• …or MV: make a linearity (and possibly normality)

•0

1

2

3

% in

B assumption and project… • or directly determine input

variables which give rise to failure behaviors by solving30 36 42 48 54 60 66 72 78 84

Temeprature [deg C]

failure behaviors by solving an optimization problem.

By combining optimization, uncertainty analysis methods, and surrogateBy combining optimization, uncertainty analysis methods, and surrogate (meta-) modeling in a single framework, DAKOTA enables more efficient UQ.

Analytic Reliability: MPP Search

Perform optimization in uncertain variable space to determine Most Probable Point (of response or failure occurring) for G(u) = T(u).

Reliability Index Approach (RIA)

Region of u values where T ≥ T map Tcritical to a T ≥ Tcritical

p criticalprobability

G(u)

Reliability: Algorithmic Variations

• Limit state linearizations: use a local surrogate for the limit state G(u) during optimization in u-space (or x-space):

Many variations possible to improve efficiency, including in DAKOTA…

optimization in u space (or x space):

• Integrations (in u-space to determine probabilities): may need higher order

(could use analytic, finite difference, or quasi-Newton (BFGS, SR1) Hessians in approximation/optimization – results here mostly use SR1 quasi-Hessians.)

• Integrations (in u-space to determine probabilities): may need higher order for nonlinear limit states

1st-order: 2nd-order:

• MPP search algorithm: Sequential Quadratic Prog. (SQP) vs. Nonlinear Interior Point (NIP)

• Warm starting (for linearizations initial iterate for MPP searches): speeds

curvature correction

• Warm starting (for linearizations, initial iterate for MPP searches): speeds convergence when increments made in: approximation, statistics requested, design variables

Efficient Global Reliability Analysis

• EGRA (B.J. Bichon) performs reliability analysis with – EGO Gaussian Process surrogate with NCSU DIRECT optimizer– multimodal adaptive importance sampling for probability calculation

• Created to address nonlinear and/or multi-modal limit states in MPP searches.

• In EGO: expected improvement is large near promising minima, or in

GP surrogate

g gregions of high uncertainty True fn

ExpectedpImprovement

From Jones, Schonlau, Welch, 1998

Efficient Global Reliability Analysis

• Apply an EGO-like method to the equality-constrained optimization problem• In EGRA, an expected feasibility function balances exploration with local

search near the failure boundary to refine the GPsearch near the failure boundary to refine the GP• Cost competitive with best MPP search methods, yet better probability of

failure estimatesGaussian process model of reliability limit state withGaussian process model of reliability limit state with

10 samples 28 samples

exploit

lexplore

Stochastic Expansions

• Create polynomial approximation to response function• Polynomial chaos expansions (PCE): known basis, compute

coefficients

• (Lagrange) Stochastic collocation (SC): known coefficients, form interpolant

• Form basis, then sample, calculate moments, probabilities, etc.T il i fi i d l ith i t l• Tailoring fine-grained algorithmic control:– Synchronize PCE form with numerical integration– Optimal basis & Gauss pts/wts for arbitrary input PDFs– Anisotropic approaches: emphasize key dimensions

• h/p-adaptive collocation (FY10-12)

Generalized Polynomial Chaos Expansions (PCE)

Approximate response stochasticity with Galerkin projection usingmultivariate orthogonal polynomial basis functions defined over standardrandom variablesrandom variables

e.g. using

R(ξ) ≈ f(u)• Intrusive• Nonintrusive: estimate response coefficients using sampling (expectation), quadrature/cubature (num integration), point collocation (regression)Wi A k G li d PCE ith d ti it

R(ξ) f(u)

Wiener-Askey Generalized PCE with adaptivity• Tailor basis: optimal basis selection leads to exponential convergence

Efficient Integration

Stochastic Collocation(based on Lagrange interpolation)

Instead of estimating coeffs for known basis fns, form interpolants for known coefficients

F i t l t i S f t d t ( f i id)

R

Form sparse interpolant using S of tensor products (same as forming sparse grid)

Key is use of same Gauss points/weights from the orthogonal polynomials for specified input PDFs same exponential convergence rates

Simpler than PCE, and:• Adapts to integration approach / collocation pt set:

doesn’t over-/under-integrate a (nonsynchronized) expansion

• Estimating moments of any order is easy: E[Rk] = S rkj wj

Disadvantages relative to PCE:• Requires structured data sets: quadrature/sparse grid

(cubature?), no random sets• Expansion variance not guaranteed positive, no analytic VBD

Fast Convergence

Hermite basis, lognormal distributions

CDF

Dempster-Shafer Theory

Intervals on the inputs are propagated to calculate

• Belief: a lower bound on a probability value that is consistent with the evidence

• Plausibility: an upper bound on a probability value that is consistent with the evidence.

orP

l(x)

0.7

0.8

0.9

1.0Frame 1a

CPF

orP

l(>x)

0.7

0.8

0.9

1.0Frame 1b

CCPF

orP

l(x)

0.7

0.8

0.9

1.0Frame 1a

CPF

orP

l(>x)

0.7

0.8

0.9

1.0Frame 1b

CCPF

x)

orP

rob(x

)

0.3

0.4

0.5

0.6 CDF

x)or

Pro

b(>x

)

0.3

0.4

0.5

0.6

CCDF

CCBFx)

orP

rob(x

)

0.3

0.4

0.5

0.6 CDF

x)or

Pro

b(>x

)

0.3

0.4

0.5

0.6

CCDF

CCBF

x

Bel

(

0.0 0.5 1.00.0

0.1

0.2 CBF

x

Bel

(>

0.0 0.5 1.00.0

0.1

0.2CCBF

x

Bel

(

0.0 0.5 1.00.0

0.1

0.2 CBF

x

Bel

(>

0.0 0.5 1.00.0

0.1

0.2CCBF

“Second-Order” Probability

• Nested sampling technique which combines epistemic and aleatoryuncertainty, e.g., UQ with bounds on the mean of a normal distribution

• Frequently used in UQ studies and regulatory analyses (e.g. WIPP)Frequently used in UQ studies and regulatory analyses (e.g. WIPP)• For each outer loop sample of epistemic (interval) variables, run an inner

loop UQ study over aleatory (probability) variables; potentially costly

0.75

1.00

epistemic

50 outer loop samples→ 50 CDF traces

0.50

0 5

Cum

Pro

b

sampling

aleatoryeach discrete

0 00

0.25sampling

simulation

each discrete (empirical) CDF: 100 inner loop samples 0.00

response metric

“Envelope” of CDF traces represents response epistemic uncertainty

samples

New Interval Estimation Approach

• Outer loop: determine interval on a moment (e.g., mean or variance)

l b l ti i ti bl fi d / i f t ti ti f– global optimization problem: find max/min of statistic of interest, given bound constrained interval variables

– use EGO to solve 2 optimization problems with ti ll G i tessentially one Gaussian process surrogate

• Inner loop: Use stochastic expansion methods (PCE) to d t i th CDF t ith t t thdetermine the CDFs or moments with respect to the aleatory variables

)|(min uuf )|(max uuf)|(min

UBELB

EASTATu

uuu

uufE

)|(max

UBELB

EASTATu

uuu

uufE

)(~ AA uFu )(~ AA uFu

Sandia Research Interests

GOAL: Advanced efficient, robust, accurate UQ methods for validation, extrapolation, risk-informed decisions

• Efficient, adaptive polynomial chaos techniques• UQ and surrogate approaches for mixed-integer higher-UQ and surrogate approaches for mixed-integer, higher-

order moments, tail statistics• Multi-level (system and hierachical) UQ• How to allocate margin across a system• How to allocate margin across a system• Stochastic processes and random fields• Epistemic UQ approaches and alternative frames• Adaptive UQ in multi-fidelity/hierarchical contexts

Thank you for your attention!a you o you atte t o

[email protected]

EXTRA SLIDESEXTRA SLIDES

Code Verification: ASC Software Quality Plan Practices

Project Management• Integrated Teaming

Software Engineering• Technical Solution

Verification: “Are we solving the equations correctly?”

Integrated Teaming• Graded Level of Formality• Measurement & Analysis• Requirements Development

and Management

• Technical Solution• Software Development• Configuration Management• Product Integration

Deployment and Lifecycleand Management• Risk Management• Project Planning & Oversight

• Deployment and LifecycleSupport

Software Verification• Verification Plan• Technical Reviews• Testing (incl regression)

Training• Team Professional

development• Track Training: prove

• Implementation of software quality practices is tailored to the stated

• Testing (incl. regression)• Acceptance Criteria

• Track Training: prove team qualification

p q y plevel of risk (formality) in using the particular code

• Software projects are routinely assessed to these practices (internal and external review)

Solution Verification: Is the solution behaving as expected?

• Do we correctly solve problems with analytic and/or known solutions?• Do we realize the expected order of accuracy in space and/or time?

Verification: “Are we solving the equations correctly?”

Do we realize the expected order of accuracy in space and/or time? • Relevant methods:

– Method of manufactured solutions (MMS)– Grid convergence studies / Richardson extrapolation– SIERRA Encore: Online verification through MMS, adjoint-based error estimation,

and adaptivity:

Application of the new error estimator in Encore t l t ti 3D t t l t i blto a large rotation 3D structural torsion problem

solved using the SIERRA Mechanics code Adagio. The mesh adaptivity is driven by the

errors in the displacement gradients.

A verification problem for coupled conduction and thermal radiation using the Encore/MMS capability in SIERRA Mechanics Temperature contours are shown on the left andMechanics. Temperature contours are shown on the left, and convergence rates are shown on the right. Encore has quantified that the simulation has the expected accuracy, providing evidence that the numerical algorithms have been implemented correctly.

Algorithms for ComputationalModeling & Simulation

System Design Physics

Are you sure you don’t need verification?!

Geometric Modeling

Meshing

Model Equations

Discretization

Partitioning and Mapping

Time integrationAdOptimization

Nonlinear solve

Linear solve

Time integrationAdaptpand UQ

Linear solve

Information Analysis & ValidationInformation Analysis & Validation

Improved design and understanding

uncertainty quantification methods enabling credible ...personal background • ph.d., computational...

Documents