statistical meta-modeling for complex system simulations

44
Statistical meta modeling of computer experiments Kriging as the main tool: two examples of computer experiments - two levels of accuracy - with qualitative/quantitative factors Alternatives to kriging (to avoid numerical instability) Some novel design issues C. F. Jeff Wu School of Industrial and Systems Engineering Georgia Institute of Technology Statistical Meta-Modeling for Complex System Simulations: Kriging, Alternatives and Design

Upload: others

Post on 14-May-2022

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Statistical Meta-Modeling for Complex System Simulations

• Statistical meta modeling of computer experiments• Kriging as the main tool: two examples of computer experiments

- two levels of accuracy- with qualitative/quantitative factors

• Alternatives to kriging (to avoid numerical instability)• Some novel design issues

C. F. Jeff WuSchool of Industrial and Systems Engineering

Georgia Institute of Technology

Statistical Meta-Modeling for Complex System Simulations:

Kriging, Alternatives and Design

Page 2: Statistical Meta-Modeling for Complex System Simulations

Why computer experiments?

Study dangerous or infeasible physical experiments, such as ammunition detonation.

1

Page 3: Statistical Meta-Modeling for Complex System Simulations

Some examples

2

Page 4: Statistical Meta-Modeling for Complex System Simulations

Statistical Meta-Modeling of Computer Experiments

Robustness, optimization

Surrogate model(Kriging)

Noise simulations, error propagation

More FEA runs

Computer modeling (finite-element simulation)

Physical experimentor observations

3

Page 5: Statistical Meta-Modeling for Complex System Simulations

Kriging as an interpolator

• Kriging is an interpolation method:• Originated in geostatistics; proposed for computer experiments by Sacks, Welch et al.

.,)(ˆ pii Rxyxy ∈=

4

Page 6: Statistical Meta-Modeling for Complex System Simulations

Kriging Modeling

• Model assumption: Gaussian process.

– mean, linear model where

– correlation function, e.g., Gaussian Correlation Function,

– variance.

5

Page 7: Statistical Meta-Modeling for Complex System Simulations

Best Linear Unbiased Predictor

• Best Linear Unbiased Predictor (BLUP):

– is the generalized least squares estimation, =

– .– = vector of correlation between

prediction points and observed points.

– interpolating property.,)(ˆ ii yxy =

6

Page 8: Statistical Meta-Modeling for Complex System Simulations

Kriging Modeling: EstimationKriging Modeling: Estimation

M i Lik lih d E i i• Maximum Likelihood Estimation:

))}.log(det()ˆlog({min 2

0Rn

./)ˆ()ˆ(ˆ

))}g( ()g({1'2

0

nFyRFy

where

• Numerical Stability: R is ill-conditioned, especially for large nd/ k ( i t di i ) (P d W 2010b)and/or k (=input dimension) (Peng and Wu, 2010b).

• Computational Complexity: matrix inversion O(n3); manyoptimization iterations are needed; in each iterationoptimization iterations are needed; in each iterationmatrix inversion is computed.

7

Page 9: Statistical Meta-Modeling for Complex System Simulations

Examples of Kriging Models

• Computer experiments with two levels of accuracy

• Computer experiments with quantitative and qualitative factors

• Simulation codes with calibration parameters

8

Page 10: Statistical Meta-Modeling for Complex System Simulations

Example: Designing Cellular Heat Exchangers for an Electronic Cooling Application

9

Page 11: Statistical Meta-Modeling for Complex System Simulations

Heat Transfer AnalysisHE: Detailed Computer Simulation – Finite Element

Analysis (FEA) Method

• Using the computational fluid dynamic solver FLUENT

• Problem domain is divided into thousands or millions of elements.

• Each run requires hours to daysto complete.

FLUENT

10

Page 12: Statistical Meta-Modeling for Complex System Simulations

Heat Transfer AnalysisLE: Approximate Computer Simulation —Finite

Difference Method

• The finite difference techniqueis a numerical technique forsolving 2- or 3-D steady state heat transfer problems.

• Temperature distribution approximated via numerical solution of 3D heat transfer equations using forward or central difference methods.

• Each run takes minutes to complete.

• Less accurate than FEA.

Finite Difference

11

Page 13: Statistical Meta-Modeling for Complex System Simulations

High-accuracy and Low-accuracy Experiments

• Paradigm shift: single experiment→ multiple experiments withdifferent

levels of accuracy.

• A generic pair:high-accuracy experiment(HE) andlow-accuracyexperiment (LE).

• HE is moreaccuratebut moreexpensivethan LE.

• Examples:

– physical experiment vs. computer simulation

– detailed computer simulation vs. approximate computer simulation

• 1. Modeling:

How to model and analyze data from HE and LE?

2. Experimental design:

How to plan HE and LE?

12

Page 14: Statistical Meta-Modeling for Complex System Simulations

Frequentist Approach in Qian et al. (2006, ASME)Related to the autoregressive model in Kennedy and O’Hagan (2000)

• x = (x1, . . . ,xk): design variables.Dl andDh: sets of design points (xi) for LE and HE withDh ⊂ Dl .yh andyl : outputs from HE and LE.

• Base Surrogate Model:yl = βl0 +∑ j βl j xi + εl (xi), xi ∈ Dl ,εl ∼ GP(0,σ2

l ,φl ).

• Adjustment Model: yh(xi) = ρ(xi)yl (xi)+δ(xi), xi ∈ Dh,

– scale adjustment:ρ(xi) = ρ0 +∑ j ρ jxi j ;

– location adjustment: δ ∼ GP(δ0,σ2δ,φδ).

• Fitting:

Base surrogate model:yl

Adjustment model:ρ andδ

=⇒ Final surrogate model:yh = ρyl + δ.

• Desirable property:yh interpolatesyh(xi),xi ∈ Dh if HE is deterministic.

13

Page 15: Statistical Meta-Modeling for Complex System Simulations

Bayesian Approach in Qian and Wu (2008,

Technometrics)

• UseflexibleBayesian hierarchicalGaussianprocess model (BHGP).

• Provide moreflexible adjustment. Can accommodateparameteruncertainty and measurement error of HE.

• BHGP:

– Base Surrogate Model: yl = βl0 +∑ j βl j xi + εl (xi), xi ∈ Dl ,

εl ∼ GP(0,σ2l ,φl ).

– Flexible Adjustment Model:

yh(xi) = ρ(xi)yl (xi)+δ(xi)+ ε(xi), xi ∈ Dh,

scale adjustmentρ ∼ GP(ρ0,σ2ρ,φρ),

location adjustmentδ ∼ GP(δ0,σ2δ,φδ),

ε(x) ∼ N(0,σ2ε). ε = 0⇔ no measurement error.

14

Page 16: Statistical Meta-Modeling for Complex System Simulations

Model Parameters and Priors for the BHGP Model

• Model parameters

– Mean parametersθ1 = (βl ,ρ0,δ0).

– Variance parametersθ2 = (σ2l ,σ

2ρ,σ2

δ,σ2ε).

– Correlationparametersθ3 = (φl ,φρ,φδ);

complexityincreaseswith the no. of input factors (∵dim = 3k).

• Priors

– Normal forθ1.

– Inverse-gamma forθ2.

– Gamma forθ3.

15

Page 17: Statistical Meta-Modeling for Complex System Simulations

Computational Algorithm

1: Fitting Correlation Parameters θ3

Solve astochasticprogram

maxφρ,φδ

L2 = Ep(τ1,τ2) f (τ1,τ2)

by the Sample Average Approximation (SAA) algorithm.

2: Markov Chain Monte Carlo (MCMC) Sampling from p(θ1,θ2|yl ,yh,θ3)

• Rationale: Conditional distributions forθ1,θ2 are in closed form, but not for

θ3 givenθ1, θ2.

• It is not afully Bayesian procedure (to save computation forθ3).

16

Page 18: Statistical Meta-Modeling for Complex System Simulations

Posterior density ofρ0, δ0, σ2ρ, σ2

δ

0.6 0.8 1.0 1.2 1.4

01

23

4

(a)

ρ0

De

nsi

ty

0.0 0.5 1.0 1.5 2.0 2.5 3.0

0.0

0.5

1.0

1.5

2.0

2.5

(b)

σρ2

De

nsi

ty

−1 0 1 2 3 4

0.0

0.1

0.2

0.3

0.4

0.5

0.6

(c)

δ0

De

nsi

ty

0 1 2 3 4 5

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

(d)

σδ2

De

nsi

ty

17

Page 19: Statistical Meta-Modeling for Complex System Simulations

Location and Scale Change

• Symmetric location changeδ0 with center at 0.85.

• Scaleadjustmentρ0 has multi modes, which may be explained bymultiplelaws.

18

Page 20: Statistical Meta-Modeling for Complex System Simulations

Calibration of Computer ModelsCalibration of Computer Models

C lib ti f fitti t d l t b d• Calibration = process of fitting a computer model to observed data by adjusting input parameters of the model (Kennedy and O’Hagan, 2001).

• Computer model: ,• Physical model:

= calibration parameters = model inadequacy = adjustment

),( xyc

,)(),( xxyp )(x – = calibration parameters, = model inadequacy, = adjustment term.

• Example 1-Nuclear Accident: Use data on reactor fire to calibrate a plume model of radioactive deposition Source term

)(x

calibrate a plume model of radioactive deposition. Source term and deposition velocity = calibration parameters.

• Example 2-Methane Combustion: a computer model of ignition d l f h b i C lib idelay of methane combustion. Calibration parameters = seven reaction rates.

19

adminNUS
Typewritten Text
adminNUS
Typewritten Text
Page 21: Statistical Meta-Modeling for Complex System Simulations

GP with quanti/quali factors: Data Center Thermal Distribution

20

Page 22: Statistical Meta-Modeling for Complex System Simulations

Configuration Variables for the Data Center Example

• Five quanti factors: rack temperature rise, rack power, diffuser angle, diffuser

flow rate, ceilingheight.

• Threequali factors: diffuser location, hot-air return-vent location, power

allocation.

21

Page 23: Statistical Meta-Modeling for Complex System Simulations

GP Models with Quantitative and Qualitative Factors

(Qian, H. Wu, C. F. J. Wu 2008,Technometrics)

• Input factors:w = (x,z); I quantitative factors:x = (x1, . . . ,xI ); J qualitative

factors:z = (z1, . . . ,zJ); response value:y(w).

• Build asingleGP model for bothx andz. Borrow strengths from all the

observations:

y(w) = ∑m

βm fm(w)+ ε(w).

• How to specify fm? Use regression modeling involvingx andz.

• How to specifyε (especially its correlation structure)?

• Related work: Bayesian hierarchical GP models (Han et al., 2009).

• Corresponding designs:Sliced space-filling designs(Qian and Wu, 2009,

Biometrika).

22

Page 24: Statistical Meta-Modeling for Complex System Simulations

Construction of Correlation Functions for ε(w)

• Consider one quali factorz1 with m1 levels 1, . . . ,m1.

Foru = 1, . . . ,m1, εu(x) = ε(x,u).

• Idea: envision a mean-zero multivariate process(ε1(x), . . . ,εm1(x)′ = Aη(x).

• A: anm1×m1 matrix withunit row vectors.

• Elements ofη(x): m1 independent processes with a common varianceσ2.

• For two input valuesw1 = (x1,z11) andw2 = (x2,z12),

cor(ε(w1),ε(w2)) = τz11,z12Kφ(x1,x2)

– Kφ(x1,x2): correlation betweenx1 andx2.

– T1 = AA t : anm1×m1 correlationmatrix for z1 (i.e.,positive definitematrix with unit diagonal elements).

23

Page 25: Statistical Meta-Modeling for Complex System Simulations

Alternatives to Kriging

1. Tweaking of kriging:– covariance matrix tapering (Kaufman et al., 2008),– rank reduction (Cressie-Johannesson, 2008),– regularized krigng (Peng-Wu, 2010a)

2. Adding penalty to likelihood (Li-Sudjianto, 2005): another approach to regularization.

24

Page 26: Statistical Meta-Modeling for Complex System Simulations

Alternatives to Kriging (cont )Alternatives to Kriging (cont.)

3. Completely different ideas:

- Radial basis interpolation functions (Buhmann 2004)- Radial basis interpolation functions (Buhmann, 2004)

- Regression-based inverse distance weighting (Joseph and Kang, 2009): a fast interpolator

O l b i h d (OBSM) (Ch W d W- Overcomplete basis surrogate method (OBSM) (Chen, Wang and Wu, 2010): an approximator, not interpolator

25

Page 27: Statistical Meta-Modeling for Complex System Simulations

Inverse Distance Weighting (IDW)

• Inverse Distance Weighting (Shepard, 1968):

– Simple computation but poor prediction.

26

Page 28: Statistical Meta-Modeling for Complex System Simulations

Regression-Based Inverse Distance Weighting (RIDW)

• Add regression part to IDW (Joseph and Kang, 2009):

– = mean part; can be linear, nonlinear, nonparametric.

– (faster convergence than IDW)

27

Page 29: Statistical Meta-Modeling for Complex System Simulations

Comparisons Between RIDW and Kriging

28

Page 30: Statistical Meta-Modeling for Complex System Simulations

Design of Experiments with HE and LE

• Key to efficiently allocate resources and acquire information from HE and LE.

• x = (x1, . . . ,xk): design variables in[0,1]k.

Dl : set of design points (xi) for LE. Dh: set of design points (xi) for HE.

• Three principles for constructingDl andDh:

Economy: The size ofDh is less than the size ofDl .

Nested relationship: Dh is a subset ofDl .

Space-filling: Points inDh andDl achieve uniformity in low dimensions.

• How to constructmultiple experimentswith respect tomultiplerequirements?

• New issue in design of experiments: traditional methods deal almost

exclusively with experiments with one level of accuracy.

29

Page 31: Statistical Meta-Modeling for Complex System Simulations

Nested Space-Filling Designs

Proposed in Qian, Tang and Wu (2009),StatisticaSinica, and furtherdeveloped in

Qian, Ai and Wu (2009),Annals of Statistics, Qian (2009),Biometrika.

Ideas:

1. Use aspecialorthogonal array to construct an OA-based Latin hypercube

design forDl .

2. ChooseDh to be acarefullyselected subset ofDl with two-dimensional

balance.

Need to use simple Galois field theory.

30

Page 32: Statistical Meta-Modeling for Complex System Simulations

Nested Orthogonal Array

Let A be anOA(N1,k,s1, t). SupposeA has asubarray withN2 rows, denoted by

A1, and there is a projectionφ that collapse thes1 levels ofA into s2 levels. Further

supposeA1 is anOA(N2,k,s2, t) if the s1 levels of its entries are collapsed tos2

levels according toφ. Then an array with this structure is called anested

orthogonal array.

31

Page 33: Statistical Meta-Modeling for Complex System Simulations

Construction of Nested Space-Filling Design

• The nested orthogonal arrayA2 ⊂ A1 does not automatically generatenested

space-fillingdesigns if thes1 levels are arbitrarily labeled.

• In constructing OA-based Latin hypercubeD1 usingA1, thes1 levels ofA1,

represented by the polynomials of degreeu1−1 or lower, have to be first

labeled as 1, . . . ,s1.

• The subsetD2 of D1 corresponding toA2 may not have good space-filling

properties.

• Care should be taken inlabelingthe levels to ensure thatD2 achieves

stratificationin two dimensions.

• Thes1 levels ofA1 must be labeled in such a way that the group of levels that

are mapped to the same level should form a consecutive subset of 1, . . . ,s1.

32

Page 34: Statistical Meta-Modeling for Complex System Simulations

An Example of Nested Space-Filling Design

• Five factorsx = (x1,x2,x3,x4,x5) taking values inthe unit hypercube[0,1]5.

• Use the orthogonal array in Table 1 to construct an OA-based Latin hypercube

D1 for x1-x5.

• Choose a subarrayD2 of D1 consisting of the 16 points inD1 corresponding to

runs 1-4, 9-12, 17-20, 25-28 of the array in Table 1.

33

Page 35: Statistical Meta-Modeling for Complex System Simulations

The Underlying Nested Orthogonal Array

Run # x1 x2 x3 x4 x5 Run # x1 x2 x3 x4 x5

1 1 1 1 1 1 33 8 1 8 8 8

2 1 3 3 5 7 34 8 3 6 4 2

3 1 5 5 8 4 35 8 5 4 1 5

4 1 7 7 4 6 36 8 7 2 5 3

5 1 8 8 7 2 37 8 8 1 2 7

6 1 6 6 3 8 38 8 6 3 6 1

7 1 4 4 2 3 39 8 4 5 7 6

8 1 2 2 6 5 40 8 2 7 3 4

9 3 1 3 3 3 41 6 1 6 6 6

10 3 3 1 7 5 42 6 3 8 2 4

11 3 5 7 6 2 43 6 5 2 3 7

12 3 7 5 2 8 44 6 7 4 7 1

13 3 8 6 5 4 45 6 8 3 4 5

14 3 6 8 1 6 46 6 6 1 8 3

15 3 4 2 4 1 47 6 4 7 5 8

16 3 2 4 8 7 48 6 2 5 1 2

34

Page 36: Statistical Meta-Modeling for Complex System Simulations

Run # x1 x2 x3 x4 x5 Run # x1 x2 x3 x4 x5

17 5 1 5 5 5 49 4 1 4 4 4

18 5 3 7 1 3 50 4 3 2 8 6

19 5 5 1 4 8 51 4 5 8 5 1

20 5 7 3 8 2 52 4 7 6 1 7

21 5 8 4 3 6 53 4 8 5 6 3

22 5 6 2 7 4 54 4 6 7 2 5

23 5 4 8 6 7 55 4 4 1 3 2

24 5 2 6 2 1 56 4 2 3 7 8

25 7 1 7 7 7 57 2 1 2 2 2

26 7 3 5 3 1 58 2 3 4 6 8

27 7 5 3 2 6 59 2 5 6 7 3

28 7 7 1 6 4 60 2 7 8 3 5

29 7 8 2 1 8 61 2 8 7 8 1

30 7 6 4 5 2 62 2 6 5 4 7

31 7 4 6 8 5 63 2 4 3 1 4

32 7 2 8 4 3 64 2 2 1 5 6

Table 1. An OA(64,5,8,2) that contains an OA(16,5,4,2), i.e., the submatrix consisting of runs1-4, 9-12, 17-20, 25-28 becomes an OA(16,5,4,2) after some suitable level collapsing

35

Page 37: Statistical Meta-Modeling for Complex System Simulations

2-D Projection of D1

x1

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

0.0

0.2

0.4

0.6

0.8

1.0

x2

0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

x3

36

Page 38: Statistical Meta-Modeling for Complex System Simulations

2-D Projection of D2

x1

0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

0.0

0.2

0.4

0.6

0.8

1.0

x2

0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.2

0.4

0.6

0.8

1.0

x3

37

Page 39: Statistical Meta-Modeling for Complex System Simulations

iConclusions

• Computer simulations have become popular in studying complex systems.

• Statistics-based meta (surrogate, approximate) models are useful companions to simulations.

• A rich class of problems: modeling, estimation, prediction, calibration, computations, design.

• Kriging is popular for building meta models but can have problems with numerical instability.

• Effective alternatives to kriging have emerged in recent work. Cross-disciplinary in nature.

38

Page 40: Statistical Meta-Modeling for Complex System Simulations

Step 1: Fitting Correlation Parametersθ3

• Obtainposterior modeθ3 = (φl , φρ, φδ) by solving

(P) : maxφl ,φρ,φδ

p(θ3|yh,yl ).

• (P) is equivalent to two separableproblems:

– (P1) : a non-linear program forφl ,

– (P2) : a problem forφρ andφδ. Its objective function involves an integral.

(P2) ⇔ astochastic program (P′2) : max

φρ,φδL2 = Ep(τ1,τ2) f (τ1,τ2).

• SolveP′2 by theSample Average Approximation (SAA) algorithm :

1. Generate random samples(τs1,τ

s2) from p(τ1,τ2),s= 1,· · · ,S.

2. Solve the approximate problem:

(φρ, φδ) = argmaxφρ,φδ

[L2 =1S

S

∑s=1

f (τs1,τ

s2)].

39

Page 41: Statistical Meta-Modeling for Complex System Simulations

Step 2: Markov Chain Monte Carlo (MCMC)

Sampling from p(θ1,θ2|yl ,yh,θ3)

• Somefull conditional distributions for θ1,θ2 are not regular:

– p(βl |yl ,yh,θ3,βl ) ∼ Normal,

– p(ρ0|yl ,yh,θ3,ρ0) ∼ Normal,

– p(δ0|yl ,yh,θ3,δ0) ∼ Normal,

– p(σ2l |yl ,yh,θ3,σ2

l ) ∼ IG,

– p(σ2ρ|yl ,yh,θ3,σ2

ρ) ∼ IG,

– p(τ1,τ2|yl ,yh,τ1,τ2) ∝ 1

ταδ+ 3

21

αε+12

exp{−1τ1

(γδσ2

ρ+

(δ0−uδ)2

2vδσ2ρ

)− γετ2σ2

ρ} 1

|M|12

·exp{−(yh−ρ0yl1 −δ01n1)

tM−1(yh−ρ0yl1 −δ01n1)

2σ2ρ

}

– τ1 = σ2δ/σ2

ρ,τ2 = σ2δ/σ2

ε

• Use theMetropolis-within-Gibbs algorithm.

40

Page 42: Statistical Meta-Modeling for Complex System Simulations

The General Case

• ConsiderJ qualitative factorsz1, . . . ,zJ with zj havingmj levels 1, . . . ,mj .

• For two input valuesw1 = (x1,z11) andw2 = (x2,z12),

cor(ε(w1),ε(w2)) =

[J

∏j=1

τ j,zj1,zj2

]exp

{−

I

∑i=1

φi(xi1 −xi2)2

}.

• T j : anmj ×mj correlationmatrix for zj (i.e., apositivedefinite matrix with

unit diagonal elements).

• This correlation function has a product form.

• Assume the elements ofT j ≥ 0 for deterministic computer experiments.

41

Page 43: Statistical Meta-Modeling for Complex System Simulations

Some Restrictive Forms of Tj

Consider two input valuesw1 = (x1,z1) andw2 = (x2,z2) with responsesy(w1)

andy(w2). Recallthatτ j,zj1,zj2 is the correlation betweenzj1 andzj2.

1. Isotropy correlation function:τ j,zj1,zj2 = exp{−θ j I [zj1 6= zj2]}.

Forw1 andw2,

cor(ε(w1),ε(w2)) = exp

{−

I

∑i=1

φi(xi1 −xi2)2−

J

∑j=1

θ j I [zj1 6= zj2]

}

Euclideandistance forxi ; 0-1 distance forzj .

2. Multiplicative correlation function:

τr,s = exp{−θr,s} = exp{−(θr +θs)I [r 6= s]}. (1)

3. Group correlation functions.

4. Correlation functions for ordinal qualitative factors.

42

Page 44: Statistical Meta-Modeling for Complex System Simulations

Estimation

• Modelparameters:

– mean parametersβ = (β1, . . . ,βp).

– variance parameterσ2.

– correlation parametersphi = (φ1, . . . ,φI )t , andT = {T1, . . . ,TJ}.

• The estimation iterates between

Regression fitting: Given ˆphi andT, estimateβ andσ2.

Simple!

Correlation fitting: Givenβ andσ2, let U = (u1, . . . ,un) with

ui = [yi − βtf(wi)]/σ and then fit a GP with mean zero and variance one to

the transformed dataU.

43