uncertainty quantification methods enabling credible ...personal background • ph.d., computational...
TRANSCRIPT
Uncertainty Quantification Methods Enabling Credible Simulationg
Brian M. AdamsOptimization and Uncertainty Quantification
(with Michael S. Eldred and Laura P. Swiler)
November 5, 2009Nuclear Engineering / Mathematics Joint Seminar
North Carolina State UniversityRaleigh, NC
Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company,for the United States Department of Energy’s National Nuclear Security Administration
under contract DE-AC04-94AL85000.
Personal Background
• Ph.D., Computational and Applied Mathematics, NC State– mathematics, statistics, computer science, immunologyp gy– nondeterministic model calibration (HIV)
• SNL, Albuquerque since 2005:– went for optimization focus; diversified into UQ– mix of algorithm development, production software– agent-based disease simulations, power grid optimization, g p g p
research in mixed integer surrogates
• Algorithms discussed today publicly available in DAKOTA– Many science/engineering application customers; their unmet
needs drive research and software
O i k ith H INL ORNL i DOE N l• Ongoing work with Hany, INL, ORNL in DOE Nuclear Energy Advanced Modeling and Simulation (NEAMS)
Outline: UQ Algorithms for Credible Simulation
GOAL: Demonstrate how a mix of statistics, nonlinear optimization, numerical integration, and surrogate (meta-) modeling enables robust and efficient UQ methods.
• Motivation for credible simulation (computational models)( p )• Characterizations and propagation of uncertainty• Uncertainty quantification algorithms
Sampling based– Sampling-based– Reliability– Stochastic expansion
Interval estimation– Interval estimation• UQ research challenges and directions
Insight from Computational Simulation
Systems of systems analysis: multi-scale, multi-phenomenon
Mi l t h i lMicro-electro-mechanical systems (MEMS): quasi-
static nonlinear elasticity, process modeling
Joint mechanics: system-level FEA for component
assessment
Electrical circuits: networks, PDEs, differential algebraic
equations (DAEs), E&M
dHurricane Katrina: weather,
logistics, economics, human behaviorEarth penetrator: nonlinear
PDEs with contact, transient analysis, material modeling
Credible Simulation
• Historically: primary focus on modeling fidelity (with enormous investments)C t i i l ( h ld) d d f f ll• Customers increasingly (should) demand proof of overall analysis pedigree in the data and/or extrapolation domain.
Bill Oberkampf
Inclusion of Nondeterministic Element (early 1970’s)
Bill Oberkampf
Formal V&V, UQ, and Model FidelitySupport Credible Simulation
Ultimate purpose (arguably): insight, prediction, and risk-informed decision-making need credibility for intended application
Bill Oberkampf
Verification & Validation
• Verification: “Are we solving the equations correctly?”– mathematics/computer science issue: Is our mathematical formulation
and software implementation of the physics model correct?and software implementation of the physics model correct?– code verification (software correctness)– solution verification (e.g., exhibits proper order of convergence)
V lid ti “A l i th i ht ti ?”• Validation – “Are we solving the right equations?”– a disciplinary science issue: is the science (physics, biology, etc.)
model sufficient for the intended application? Involves data and metrics.
• Ideally: code verification solution verification validation then follow-on optimization/calibration/uncertainty quantification
Supporting concepts:pp g p• Sensitivity Analysis (SA): both local and global
– How do code outputs vary with respect to changes in code inputs?• Uncertainty Quantification (UQ):
– What are the probability distributions on code outputs, given the probability di t ib ti d i t ? U k i t di t ib ti ?distributions on my code inputs? Unknown input distributions?
• Quantification of margins and uncertainties (QMU):– How “close” are my code output predictions (incl. UQ) to the system’s
required performance level?
Uncertainties in Simulation and Validation
• physics/science parametersA few uncertainties affecting computational model output/results:
physics/science parameters• statistical variation, inherent randomness• model form / accuracy
ti i t i t f• operating environment, interference• initial, boundary conditions; forcing• geometry / structure / connectivity• material properties• manufacturing quality• experimental error (measurement error, measurement bias)experimental error (measurement error, measurement bias)• numerical accuracy (mesh, solvers); approximation error• human reliability, subjective judgment, linguistic imprecision
The effect of these on model outputs should be integral to an analyst’s deliverable: best estimate PLUS uncertainty!
Uncertainty Quantification
• A single optimal design or nominal performance prediction is often insufficient for – decision making / trade-off assessment– quantification of margins and uncertainties (QMU):
How close are my uncertainty-aware code predictions to required performance expectations or limits?required performance expectations or limits?
• GOAL: risk-informed decisions, based on effects F in a l T e m p e ra tu re V a lu e s
of uncertainty on model output (forward propagation)
2
3
4
5
% in
Bin
Test Data
Model Data
• Validation: Given the relevant uncertainties, is the science (physics, biology,
0
1
2
Te m e pra ture [de g C]
%
etc.) model sufficient for the intended application?
Quantitative Comparison with Experiment: Validation Metrics
Verification to estimate numerical error; uncertainty quantification to assess effect of unknowns or natural variability
Outline: UQ Algorithms for Credible Simulation
GOAL: Demonstrate how a mix of statistics, nonlinear optimization, numerical integration, and surrogate (meta-) modeling enables robust and efficient UQ methods.
• Motivation for credible simulation (computational models)( p )• Characterizations and propagation of uncertainty• Uncertainty quantification algorithms
Sampling based– Sampling-based– Reliability– Stochastic expansion
Interval estimation– Interval estimation• UQ research challenges and directions
Categories of Uncertainty
• Aleatory (think probability density function; sufficient data)
Often useful algorithmic distinctions, but not always a clear division
• Aleatory (think probability density function; sufficient data)– Inherent variability (e.g., in a population), type-A, stochastic– Irreducible uncertainty – can’t reduce it by further knowledge
simulationcode
InputRandomVariables
OutputMetricStatisticsVariables Statistics
Categories of Uncertainty
• Aleatory (think probability density function; sufficient data)
Often useful algorithmic distinctions, but not always a clear division
• Aleatory (think probability density function; sufficient data)– Inherent variability (e.g., in a population), type-A, stochastic– Irreducible uncertainty – can’t reduce it by further knowledge
• Epistemic (think bounded intervals)– Subjective, type-B, state of knowledge uncertainty
R l t d t h t d ’t k– Related to what we don’t know– Reducible: If you had more data or more information, you
could make your uncertainty estimation more precise
simulationcode
[ ][ ]
[ ]InputIntervals
[ ][ ]
[ ]OutputIntervals[ ]
[ ] [ ]
Uncertainty QuantificationForward propagation: quantify the effect that uncertain (nondeterministic) input variables have on model output
I t V i blInput Variables u(physics parameters, geometry, initial and boundary conditions)
ComputationalModel
Variable PerformanceMeasures f(u)
• based on uncertain inputs, determine variance of outputs and probabilities
Potential Goals:(possibly given distributions)
Output Distributions
N samplesvariance of outputs and probabilities of failure (reliability metrics)
• identify parameter correlations/local sensitivities, robust optima
Distributions
measure 1Modelu1
, p• identify inputs whose variances
contribute most to output variance (global sensitivity analysis)
measure 2
u2
u3
• quantify uncertainty when using calibrated model to predict
Typical method: Monte Carlo Sampling
Thermal Uncertainty Quantification
• Device subject to heating (experiment or computational simulation)
• Uncertainty in composition/• Uncertainty in composition/ environment (thermal conductivity, density, boundary), parameterized by u1, …, uN
• Response temperature f(u)=T(u1, …, uN)calculated by heat transfer code
Given distributions of u1,…,uN, UQ methods calculateUQ methods calculate statistical info on outputs:• Mean(T), StdDev(T), Probability(T ≥ T iti l)
Final Temperature Values
4
5
Probability(T ≥ Tcritical)• Probability distribution of temperatures• Correlations (trends) and 1
2
3
4
% in
Bin
( )sensitivity of temperature0
30 36 42 48 54 60 66 72 78 84
Temeprature [deg C]
Challenges for Simulation-based UQ
• Engineering application: propagate variability through a computer model.
• Need statistics of response function f, e.g., µf, f, Prob[ f > fcritical]• Typical characteristics: • input parameters specified by
probability density functions
f(x1, x2)
• no explicit function for f(x1,x2)• expensive to evaluate f(x1,x2) and
may fail to calculate
0.60.81.0
• limited number of samples• noisy / non-smooth
Is it better to:
0.00.20.4
0 4
0.20.4
Is it better to:• compute statistics from the limited
f(x1,x2) sample values, or• construct an approximation model
0.40.6
0.81.0 x 1
0.60.8
1.01.2
x2
ppbased on the f(x1,x2) values and then compute statistics
Outline: UQ Algorithms for Credible Simulation
GOAL: Demonstrate how a mix of statistics, nonlinear optimization, numerical integration, and surrogate (meta-) modeling enables robust and efficient UQ methods.
• Motivation for credible simulation (computational models)( p )• Characterizations and propagation of uncertainty• Uncertainty quantification algorithms
Sampling based– Sampling-based– Reliability– Stochastic expansion
Interval estimation– Interval estimation• UQ research challenges and directions
UQ: Sampling Methods
Given distributions of u1,…,uN, sampling-based methods calculate sample statistics, e.g., on temperature T(u1,…,uN):
• sample meanOutput
DistributionsN samples
N
i
iuTN
T1
)(1
• sample variancemeasure 1Model
u1
u2
Ni TuT
NT 2)(1
2
• full PDF(probabilities)Final Temperature Values
measure 22
u3
iN 1
2
3
4
5
% in
Bin
• Monte Carlo sampling• Quasi-Monte Carlo• Centroidal Voroni Tessalation (CVT)
0
1
30 36 42 48 54 60 66 72 78 84
Temeprature [deg C]
%
• Latin Hypercube samplingRobust, but slow convergence: O(N-
1/2)
Latin Hypercube Sampling (LHS)
• Specialized Monte Carlo (MC) sampling technique: workhorse method in DAKOTA / at Sandia
• Stratified random sampling among equal probability bins for
• Stratified random sampling among equal probability bins for all 1-D projections of an n-dimensional set of samples.
• McKay and Conover (early), restricted pairing by ImanG
H
I0.2 0.2 0.2 0.2 0.2
A B C D
0.8
1
J
K
0.2
0.4
0.6
0.8
A B C DL
A Two-Dimensional Representation of One Possible LHS of size 5 Utilizing X1 (normal)
and X2 (uniform)0
A B C D
Intervals Used with a LHS of Size n = 5 in Terms of the pdf and CDF for a Normal
Random Variable
( )
Calculating Probability of Failure
• Given uncertainty in materials, geometry, and environment, determine likelihood of failure Probability(T ≥ T )Probability(T ≥ Tcritical)
Final Temperature Values • Could perform 10,000 LHS samples and count how
3
4
5
Bin
samples and count how many exceed threshold…
• …or MV: make a linearity (and possibly normality)
•0
1
2
3
% in
B assumption and project… • or directly determine input
variables which give rise to failure behaviors by solving30 36 42 48 54 60 66 72 78 84
Temeprature [deg C]
failure behaviors by solving an optimization problem.
By combining optimization, uncertainty analysis methods, and surrogateBy combining optimization, uncertainty analysis methods, and surrogate (meta-) modeling in a single framework, DAKOTA enables more efficient UQ.
Analytic Reliability: MPP Search
Perform optimization in uncertain variable space to determine Most Probable Point (of response or failure occurring) for G(u) = T(u).
Reliability Index Approach (RIA)
Region of u values where T ≥ T map Tcritical to a T ≥ Tcritical
p criticalprobability
G(u)
Reliability: Algorithmic Variations
• Limit state linearizations: use a local surrogate for the limit state G(u) during optimization in u-space (or x-space):
Many variations possible to improve efficiency, including in DAKOTA…
optimization in u space (or x space):
• Integrations (in u-space to determine probabilities): may need higher order
(could use analytic, finite difference, or quasi-Newton (BFGS, SR1) Hessians in approximation/optimization – results here mostly use SR1 quasi-Hessians.)
• Integrations (in u-space to determine probabilities): may need higher order for nonlinear limit states
1st-order: 2nd-order:
• MPP search algorithm: Sequential Quadratic Prog. (SQP) vs. Nonlinear Interior Point (NIP)
• Warm starting (for linearizations initial iterate for MPP searches): speeds
curvature correction
• Warm starting (for linearizations, initial iterate for MPP searches): speeds convergence when increments made in: approximation, statistics requested, design variables
Efficient Global Reliability Analysis
• EGRA (B.J. Bichon) performs reliability analysis with – EGO Gaussian Process surrogate with NCSU DIRECT optimizer– multimodal adaptive importance sampling for probability calculation
• Created to address nonlinear and/or multi-modal limit states in MPP searches.
• In EGO: expected improvement is large near promising minima, or in
GP surrogate
g gregions of high uncertainty True fn
ExpectedpImprovement
From Jones, Schonlau, Welch, 1998
Efficient Global Reliability Analysis
• Apply an EGO-like method to the equality-constrained optimization problem• In EGRA, an expected feasibility function balances exploration with local
search near the failure boundary to refine the GPsearch near the failure boundary to refine the GP• Cost competitive with best MPP search methods, yet better probability of
failure estimatesGaussian process model of reliability limit state withGaussian process model of reliability limit state with
10 samples 28 samples
exploit
lexplore
Stochastic Expansions
• Create polynomial approximation to response function• Polynomial chaos expansions (PCE): known basis, compute
coefficients
• (Lagrange) Stochastic collocation (SC): known coefficients, form interpolant
• Form basis, then sample, calculate moments, probabilities, etc.T il i fi i d l ith i t l• Tailoring fine-grained algorithmic control:– Synchronize PCE form with numerical integration– Optimal basis & Gauss pts/wts for arbitrary input PDFs– Anisotropic approaches: emphasize key dimensions
• h/p-adaptive collocation (FY10-12)
Generalized Polynomial Chaos Expansions (PCE)
Approximate response stochasticity with Galerkin projection usingmultivariate orthogonal polynomial basis functions defined over standardrandom variablesrandom variables
e.g. using
R(ξ) ≈ f(u)• Intrusive• Nonintrusive: estimate response coefficients using sampling (expectation), quadrature/cubature (num integration), point collocation (regression)Wi A k G li d PCE ith d ti it
R(ξ) f(u)
Wiener-Askey Generalized PCE with adaptivity• Tailor basis: optimal basis selection leads to exponential convergence
Efficient Integration
Stochastic Collocation(based on Lagrange interpolation)
Instead of estimating coeffs for known basis fns, form interpolants for known coefficients
F i t l t i S f t d t ( f i id)
R
Form sparse interpolant using S of tensor products (same as forming sparse grid)
Key is use of same Gauss points/weights from the orthogonal polynomials for specified input PDFs same exponential convergence rates
Simpler than PCE, and:• Adapts to integration approach / collocation pt set:
doesn’t over-/under-integrate a (nonsynchronized) expansion
• Estimating moments of any order is easy: E[Rk] = S rkj wj
Disadvantages relative to PCE:• Requires structured data sets: quadrature/sparse grid
(cubature?), no random sets• Expansion variance not guaranteed positive, no analytic VBD
Fast Convergence
Hermite basis, lognormal distributions
CDF
Fast Convergence
Hermite basis, lognormal distributions
CDF
Dempster-Shafer Theory
Intervals on the inputs are propagated to calculate
• Belief: a lower bound on a probability value that is consistent with the evidence
• Plausibility: an upper bound on a probability value that is consistent with the evidence.
orP
l(x)
0.7
0.8
0.9
1.0Frame 1a
CPF
orP
l(>x)
0.7
0.8
0.9
1.0Frame 1b
CCPF
orP
l(x)
0.7
0.8
0.9
1.0Frame 1a
CPF
orP
l(>x)
0.7
0.8
0.9
1.0Frame 1b
CCPF
x)
orP
rob(x
)
0.3
0.4
0.5
0.6 CDF
x)or
Pro
b(>x
)
0.3
0.4
0.5
0.6
CCDF
CCBFx)
orP
rob(x
)
0.3
0.4
0.5
0.6 CDF
x)or
Pro
b(>x
)
0.3
0.4
0.5
0.6
CCDF
CCBF
x
Bel
(
0.0 0.5 1.00.0
0.1
0.2 CBF
x
Bel
(>
0.0 0.5 1.00.0
0.1
0.2CCBF
x
Bel
(
0.0 0.5 1.00.0
0.1
0.2 CBF
x
Bel
(>
0.0 0.5 1.00.0
0.1
0.2CCBF
“Second-Order” Probability
• Nested sampling technique which combines epistemic and aleatoryuncertainty, e.g., UQ with bounds on the mean of a normal distribution
• Frequently used in UQ studies and regulatory analyses (e.g. WIPP)Frequently used in UQ studies and regulatory analyses (e.g. WIPP)• For each outer loop sample of epistemic (interval) variables, run an inner
loop UQ study over aleatory (probability) variables; potentially costly
0.75
1.00
epistemic
50 outer loop samples→ 50 CDF traces
0.50
0 5
Cum
Pro
b
sampling
aleatoryeach discrete
0 00
0.25sampling
simulation
each discrete (empirical) CDF: 100 inner loop samples 0.00
response metric
“Envelope” of CDF traces represents response epistemic uncertainty
samples
New Interval Estimation Approach
• Outer loop: determine interval on a moment (e.g., mean or variance)
l b l ti i ti bl fi d / i f t ti ti f– global optimization problem: find max/min of statistic of interest, given bound constrained interval variables
– use EGO to solve 2 optimization problems with ti ll G i tessentially one Gaussian process surrogate
• Inner loop: Use stochastic expansion methods (PCE) to d t i th CDF t ith t t thdetermine the CDFs or moments with respect to the aleatory variables
)|(min uuf )|(max uuf)|(min
UBELB
EASTATu
uuu
uufE
)|(max
UBELB
EASTATu
uuu
uufE
)(~ AA uFu )(~ AA uFu
Sandia Research Interests
GOAL: Advanced efficient, robust, accurate UQ methods for validation, extrapolation, risk-informed decisions
• Efficient, adaptive polynomial chaos techniques• UQ and surrogate approaches for mixed-integer higher-UQ and surrogate approaches for mixed-integer, higher-
order moments, tail statistics• Multi-level (system and hierachical) UQ• How to allocate margin across a system• How to allocate margin across a system• Stochastic processes and random fields• Epistemic UQ approaches and alternative frames• Adaptive UQ in multi-fidelity/hierarchical contexts
Thank you for your attention!a you o you atte t o
EXTRA SLIDESEXTRA SLIDES
Code Verification: ASC Software Quality Plan Practices
Project Management• Integrated Teaming
Software Engineering• Technical Solution
Verification: “Are we solving the equations correctly?”
Integrated Teaming• Graded Level of Formality• Measurement & Analysis• Requirements Development
and Management
• Technical Solution• Software Development• Configuration Management• Product Integration
Deployment and Lifecycleand Management• Risk Management• Project Planning & Oversight
• Deployment and LifecycleSupport
Software Verification• Verification Plan• Technical Reviews• Testing (incl regression)
Training• Team Professional
development• Track Training: prove
• Implementation of software quality practices is tailored to the stated
• Testing (incl. regression)• Acceptance Criteria
• Track Training: prove team qualification
p q y plevel of risk (formality) in using the particular code
• Software projects are routinely assessed to these practices (internal and external review)
Solution Verification: Is the solution behaving as expected?
• Do we correctly solve problems with analytic and/or known solutions?• Do we realize the expected order of accuracy in space and/or time?
Verification: “Are we solving the equations correctly?”
Do we realize the expected order of accuracy in space and/or time? • Relevant methods:
– Method of manufactured solutions (MMS)– Grid convergence studies / Richardson extrapolation– SIERRA Encore: Online verification through MMS, adjoint-based error estimation,
and adaptivity:
Application of the new error estimator in Encore t l t ti 3D t t l t i blto a large rotation 3D structural torsion problem
solved using the SIERRA Mechanics code Adagio. The mesh adaptivity is driven by the
errors in the displacement gradients.
A verification problem for coupled conduction and thermal radiation using the Encore/MMS capability in SIERRA Mechanics Temperature contours are shown on the left andMechanics. Temperature contours are shown on the left, and convergence rates are shown on the right. Encore has quantified that the simulation has the expected accuracy, providing evidence that the numerical algorithms have been implemented correctly.
Algorithms for ComputationalModeling & Simulation
System Design Physics
Are you sure you don’t need verification?!
Geometric Modeling
Meshing
Model Equations
Discretization
Partitioning and Mapping
Time integrationAdOptimization
Nonlinear solve
Linear solve
Time integrationAdaptpand UQ
Linear solve
Information Analysis & ValidationInformation Analysis & Validation
Improved design and understanding