
INDUSTRIAL APPLICATIONS

Inverse modeling technology for parameter estimation

Srikanth Akkaram, Don Beeson, Harish Agarwal, Gene Wiggs

Received: 22 May 2006 / Revised: 19 September 2006 / Published online: 9 January 2007
© Springer-Verlag Berlin Heidelberg 2007

Abstract Computational simulation models are extensively used in the development, design, and analysis of an aircraft engine and its components to represent the physics of an underlying phenomenon. The use of such a model-based simulation in engineering often requires estimating model parameters based on physical experiments or field data. This class of problems, referred to as inverse problems (Woodbury KA 2003 Inverse engineering handbook. CRC, Boca Raton) in the literature, can be classified as well-posed or ill-posed depending on the quality (uncertainty) and quantity (amount) of data that are available to the engineer. The development of a generic inverse modeling solver in a probabilistic design system (PEZ version 2.6 user-manual. Probabilistic design system at General Electric Aviation, Cincinnati) requires the ability to handle diverse characteristics in various models. These characteristics include (a) varying fidelity in model accuracy with simulation times from a couple of seconds to many hours; (b) models being black-box, with the engineer having access to only the input and output; (c) nonlinearity in the model; and (d) time-dependent model input and output. This paper demonstrates methods that have been implemented to handle these features, with emphasis on applications in heat transfer and applied mechanics. A practical issue faced in the application of inverse modeling for parameter estimation is ill-posedness, which is characterized by instability and nonuniqueness in the solution. Generic methods to deal with ill-posedness include (a) model development, (b) optimal experimental design, and (c) regularization methods. The purpose of this paper is to communicate the development and implementation of an inverse method that provides a solution for both well-posed and ill-posed problems using regularization based on the prior values of the parameters. In the case of an ill-posed problem, the method provides two solution schemes: a most probable solution closest to the prior, based on the singular value decomposition (SVD), and a maximum a posteriori probability (MAP) solution. The inverse problem is solved as a finite-dimensional nonlinear optimization problem using the SVD and/or MAP techniques tailored to the specifics of the application. The objective of the paper is to demonstrate the development and validation of these inverse modeling techniques in several industrial applications, e.g., heat transfer coefficient estimation for disk quenching in process modeling, material model parameter estimation, sparse clearance data modeling, and steady state and transient engine high-pressure compressor heat transfer estimation.

Struct Multidisc Optim (2007) 34:151–164
DOI 10.1007/s00158-006-0067-1

S. Akkaram (*)
Energy and Propulsion Technologies, K1 Building, Room 4B18A, General Electric Global Research Center, Niskayuna, NY 12309, USA
e-mail: [email protected]

D. Beeson
General Electric Aviation, 1 Neumann Way, Mail Stop T207, Cincinnati, OH 45215, USA
e-mail: [email protected]

H. Agarwal
Physical Sciences Technologies, K1 Building, Room 2A 62, General Electric Global Research Center, Niskayuna, NY 12309, USA
e-mail: [email protected]

G. Wiggs
General Electric Aviation, 1 Neumann Way, Mail Stop W26, Cincinnati, OH 45215, USA
e-mail: [email protected]

Keywords Inverse modeling · Parameter estimation · Singular value decomposition · Bayesian methods · Engineering simulation

1 Introduction

An introduction to the state of technology in inverse modeling can be found in Woodbury (2003) and Zabaras et al. (1993), with emphasis on applications in heat transfer and applied mechanics. The term inverse modeling in this paper is used in the context of parameter estimation and refers to a finite-dimensional estimation (as opposed to the definition in the mathematics community, which refers to infinite-dimensional function estimation). This restriction yields a practical solution method that handles model features (a)–(d) listed in the abstract.

The engineering inverse problem can be defined as follows for time-independent simulation models (Fig. 1): Given M experiments performed corresponding to P constant system parameters $C_{ij}$ (i=1 to M, j=1 to P) that sample Q observed performance or output parameters $Y_{ij}$ (i=1 to M, j=1 to Q), infer the model parameters (or their most probable values) $X_i$ (i=1 to N), assuming reasonable knowledge of first guesses or priors for the distribution of X, namely its mean ($\mu_i^p$) and standard deviation ($\sigma_i^p$). Model parameters are estimated from the M experiments, which sample observations at possibly distinct values of the system parameters C. The use of “∼” instead of “=” in equating the simulation model output to the observations emphasizes that the simulation model F is deterministic, whereas the Q observations in Y have an uncertainty defined by a normal distribution with a mean of 0 and standard deviation $\sigma_j^Y$ (j=1 to Q), i.e., Nor(0, $\sigma_j^Y$). In this paper, the uncertainties in the N model parameters X and the Q observations in Y are characterized by normal distributions. The subscripts in the simulation model should read $F_j(X, C_{ij})$ and have been dropped for notational simplicity.

Inverse problems can be classified fundamentally under three broad categories based on concepts from linear algebra (Press et al. 1996; Golub and Van Loan 1996):

(1) Underdetermined: This type can be loosely characterized as having more Xs than Ys, leading to nonuniqueness in the parameter estimate. One can, however, find the most probable solution using Bayesian probability concepts (Meeker and Escobar 1998), with the solution depending on the assumed prior information in X. This solution is developed in Section 2.

(2) Exactly determined: This type can be loosely characterized as having the same number of Xs and Ys, leading to a unique deterministic solution obtained from a system of nonlinear algebraic equations.

(3) Overdetermined: This type can be loosely characterized as having a larger number of Ys than Xs, leading to a “best-fit” solution obtained from a nonlinear “least-squares” system.

In this classification, the term “loosely characterized” is used because counting the number of Xs and Ys does not correctly determine the type of inverse problem. What really matters is the rank deficiency of the matrix [A], which represents the sensitivity of the model outputs Y to the parameters X for the given experiments. The matrix [A] can be defined as ∂F/∂X, with the number of rows representing the number of observations (Y), i.e., QM, and the number of columns representing the number of parameters (X) being estimated, i.e., N. This orientation matches the (MQ×N) Jacobian used in Section 2.

A practical issue faced in the application of inverse modeling for parameter estimation is ill-posedness (Zabaras et al. 1993), which is characterized by the following two properties in the solution:

(a) Instability, defined as a small change in observed Y leading to a large change in estimated X, i.e., the solution does not depend continuously on the data. Instability can be caused by a badly conditioned sensitivity matrix [A].

(b) Nonuniqueness, defined as multiple values of X that lead to the same values of Y. Nonuniqueness can be caused by a rank-deficient sensitivity matrix [A].

Fig. 1 Schematic that depicts various parameters in a generic time-independent inverse problem where X and Y are time-independent parameters

Apart from the model sensitivity for the experiments, as represented in [A], the quality ($\sigma_j^Y$) and quantity (number of experiments M corresponding to different values of the constant parameters, number of observations Q) of data also affect the inverse solution. Nonuniqueness can also be viewed as an extreme case of instability that results as the condition number of [A] goes to infinity, resulting in rank deficiency. In the context of nonlinear models, the concepts of instability and nonuniqueness are defined locally for the linearized model. Nonuniqueness is typically caused by a lack of sufficient data, and instability is caused by a lack of good-quality (nonnoisy) data for the model. Some generic causes of ill-posedness are:

(a) Zero or near-zero sensitivities, which occur when the parameter being estimated does not significantly affect any of the outputs. As an example, consider Fig. 2, where the heat transfer coefficient HS is estimated purely based on data from thermocouple (TC) 1, which is physically far from the boundary and therefore exhibits a low temperature sensitivity to that parameter.

(b) Confounded or nearly confounded sensitivities, which occur when parameters in the model affect the output in the same way. As an example, one cannot uniquely estimate distinct heat transfer coefficients HT and HB based on TCs that are placed on the symmetry plane. These TCs would exhibit the same (i.e., confounded) sensitivity to HT and HB.
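Both causes show up directly in the singular values of the sensitivity matrix. The following is a minimal sketch, assuming a hypothetical linearized sensitivity matrix whose numerical entries are invented purely for illustration (they are not taken from the disk model of Fig. 2); the helper name `diagnose_sensitivity` is likewise an assumption. The HT and HB columns are identical (confounded) and the HS column is zero (no sensitivity), so the matrix has rank 1 even though there are more observations than parameters.

```python
import numpy as np

def diagnose_sensitivity(A, tol=1e-6):
    """Return (rank, number of near-zero singular values) of a sensitivity matrix."""
    w = np.linalg.svd(A, compute_uv=False)
    rank = int(np.sum(w > tol))
    return rank, len(w) - rank

# Rows: four TC readings; columns: HT, HB, HS.
# Values are invented for illustration only.
A = np.array([
    [0.8, 0.8, 0.0],
    [0.6, 0.6, 0.0],
    [0.4, 0.4, 0.0],
    [0.2, 0.2, 0.0],
])

rank, n_near_zero = diagnose_sensitivity(A)
print("rank:", rank, "near-zero singular values:", n_near_zero)
```

Counting rows and columns would label this problem overdetermined (four observations, three parameters), whereas the singular values reveal that only one combination of the parameters is observable.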

The ill-posedness issue in inverse modeling can be addressed through:

(a) Model development: This assumes that the engineer has access to the model and has the freedom to alter the form of the model to find the best match to the available data. The concept of computing and understanding model response sensitivities (the matrix [A]) prior to parameter estimation, to ensure a well-conditioned sensitivity matrix, can be found in the context of material model parameter development in Fossum (1998).

(b) Experimental design: This assumes that the engineer has a given model but has the freedom to perform various experiments that maximize the amount of information that can be obtained on the model parameters. This concept of optimal experimental design has been explored in Emery et al. (2000) in the context of heat transfer applications to optimize sensor locations.

(c) Regularization: This assumes that the engineer has a fixed model and is limited to the existing available data. This is the commonly encountered situation in engineering. Regularization (smoothing) methods, first proposed by Tikhonov (Zabaras et al. 1993), use prior information on the Xs to provide stable and unique solutions and are the solution approach developed in the next section.

2 Inverse solution method

2.1 Development of the SVD inverse solution

The solution methods developed in this section enable the handling of all three inverse problem types (underdetermined, exactly determined, and overdetermined). The ill-posedness issue is handled by regularization and is also motivated from a probabilistic perspective. The prior distribution for the parameters X plays an important role in the solution, and the N parameters ($X_i$) to be estimated are first normalized in (1) using their prior means ($\mu_i^p$) and standard deviations ($\sigma_i^p$), which represent the initial guess and the corresponding uncertainty in X. In many applications, individual parameters within X have different initial uncertainty levels because some modeled phenomena are better understood than others; consequently, a distinct standard deviation can be specified for each parameter in X. The standard deviation or uncertainty in the Q observed parameters, defined by $\sigma_j^Y$, is used to normalize the observations Y in (2). The (MQ×N) nonlinear system $Y_{ij} \sim F(X, C)$ can be reparameterized using the normalized variables x and y in (3).

$$x_i = \frac{X_i - \mu_i^p}{\sigma_i^p}, \quad i = 1 \text{ to } N \tag{1}$$

$$y_{ij} = \frac{Y_{ij}}{\sigma_j^Y}, \quad i = 1 \text{ to } M \text{ and } j = 1 \text{ to } Q \tag{2}$$

$$y_{ij} \sim f(x, C), \quad i = 1 \text{ to } M \text{ and } j = 1 \text{ to } Q \tag{3}$$

Fig. 2 Inverse heat transfer for a disk instrumented with four TCs on the mid symmetry plane, where the heat transfer coefficients at the top (HT), bottom (HB), and side (HS) need to be determined

Representing (3) as a vector with MQ components, one can define the Jacobian J in (4) as an (MQ×N) rectangular matrix, computed using forward finite differences for black-box models. The inverse solution can be obtained iteratively using the Newton scheme in (5) to solve a nonlinear system, starting with the initial guess $x^0 = 0$, which represents the prior mean, and with $J^+$ representing the Moore–Penrose (pseudo) inverse of the Jacobian. The Jacobian [J] is the generalization of the matrix [A] introduced in Section 1 for the case of nonlinear models.

$$J = \frac{\partial f(x, C)}{\partial x} \tag{4}$$

$$x^{k+1} = x^{k} + J^{+}\left[\, y_{ij} - f\left(x^{k}, C\right) \right] \tag{5}$$

The pseudoinverse of the Jacobian is computed using the singular value decomposition (SVD) as outlined in Golub and Van Loan (1996):

$$[J]_{MQ \times N} = [U]_{MQ \times MQ}\, [w]_{MQ \times N}\, [V]^{T}_{N \times N} \tag{6}$$

$[w]_{MQ \times N}$ is a diagonal matrix, diag($w_1, w_2, \ldots, w_p$), where p = min(MQ, N); U and V are orthogonal matrices. The pseudoinverse $J^+$ is computed as:

$$[J]^{+}_{N \times MQ} = [V]_{N \times N}\, [w]^{+}_{N \times MQ}\, [U]^{T}_{MQ \times MQ} \tag{7}$$

$[w]^{+}_{N \times MQ}$ is a diagonal matrix, diag($1/w_1, 1/w_2, \ldots, 1/w_r, 0, \ldots, 0$), where r ≤ p represents the rank of the Jacobian; the entries of $[w]^+$ that correspond to zero values of $w_i$ are set to zero. An underdetermined inverse problem is characterized by a rank deficiency in J. The condition number of J is defined as the ratio of the maximum singular value to the minimum nonzero singular value of J. An ill-conditioned Jacobian is characterized by a large condition number (caused by round-off and low model parameter sensitivity) and results in an unstable inverse solution. The use of truncated SVD in the computation of the pseudoinverse, which eliminates near-zero singular values below a threshold tolerance (that represents the level of noise in the system, e.g., $10^{-6}$), ensures a stable solution. A zero or near-zero singular value indicates information that cannot be obtained from the data.
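The iteration in (5) with the truncated pseudoinverse in (7) can be sketched as follows. This is a minimal illustration rather than the implementation used in the paper: the function names, the relative finite-difference step, and the test model are assumptions. The test model is deliberately linear and rank-deficient so that the result can be checked by hand.

```python
import numpy as np

def forward_difference_jacobian(f, x, rel_step=1e-4):
    """Approximate J = df/dx (MQ x N) column by column, as for a black-box model."""
    f0 = np.asarray(f(x), dtype=float)
    J = np.empty((f0.size, x.size))
    for k in range(x.size):
        h = rel_step * max(abs(x[k]), 1.0)   # assumed step-size rule
        xp = x.copy()
        xp[k] += h
        J[:, k] = (np.asarray(f(xp), dtype=float) - f0) / h
    return J

def truncated_pinv(J, tol=1e-6):
    """Pseudoinverse via SVD; singular values at or below tol are dropped."""
    U, w, Vt = np.linalg.svd(J, full_matrices=False)
    keep = w > tol
    w_inv = np.zeros_like(w)
    w_inv[keep] = 1.0 / w[keep]
    return (Vt.T * w_inv) @ U.T, int(np.sum(~keep))

def newton_svd(f, y, n_params, n_iter=20, tol=1e-6):
    """Iterate x <- x + J^+ (y - f(x)) from the prior mean x0 = 0."""
    x = np.zeros(n_params)
    n_dropped = 0
    for _ in range(n_iter):
        J = forward_difference_jacobian(f, x)
        J_pinv, n_dropped = truncated_pinv(J, tol)
        x = x + J_pinv @ (y - f(x))
    return x, n_dropped

# Hypothetical normalized model: both outputs depend only on x[0] + x[1],
# so the two parameters are confounded and J has rank 1.
def model(x):
    s = x[0] + x[1]
    return np.array([s, 2.0 * s])

y_obs = np.array([2.0, 4.0])
x_hat, n_dropped = newton_svd(model, y_obs, n_params=2)
print(x_hat, n_dropped)
```

Any pair with x[0] + x[1] = 2 reproduces the data; the truncated pseudoinverse selects the pair with the smallest norm in the normalized variables, i.e., the one closest to the prior mean, and the count of dropped singular values flags the nonuniqueness.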

2.2 SVD inverse solution characteristics

This inverse SVD solution provides a stable and unique estimate for all types of inverse problems:

(a) For overdetermined systems, the Jacobian has full rank and the solution reduces to that of a nonlinear least-squares approach starting from the prior.

(b) For underdetermined systems, the Jacobian is rank-deficient and the SVD provides the solution (out of the infinitely many possible solutions) that is closest to the prior.

(c) For exactly determined systems, the Jacobian is square and has full rank. The SVD provides a solution that corresponds to that of a square nonlinear system solved by Newton's method. The pseudoinverse for this system reduces to the classical inverse of a square matrix.

Issues of instability and nonuniqueness are addressed as follows:

(a) Instability: The use of truncation in the computation of the inverse prevents instabilities in the inverse solution by eliminating noise arising either from round-off or from low sensitivities of the model output to parameters. The use of truncated SVD thus provides the necessary regularization (smoothness) for the inverse solution.

(b) Nonuniqueness: Uniqueness is provided by the use of the prior solution in the normalization of the model parameters and the use of SVD in the approach. The normalization provides the interpretation that the inverse solution is the one closest to the prior for ill-posed problems.

For overdetermined and exactly determined systems (where the Jacobian has full rank and has no near-zero singular values), the prior solution has no impact on the final inverse solution, except that it provides good normalization and an initial guess for the solution scheme.

2.3 Numerical implementation

The numerical implementation as a solution of a nonlinear system needs to address the following issue: using the full Newton step has poor global convergence properties because higher-order derivative terms are neglected in the linearization used to provide the Newton update (Press et al. 1996). Thus, a poor prior solution can cause the Newton method to fail even in cases where sufficient, good-quality data are available for the inverse problem. This limitation was overcome using a maximum likelihood estimation (MLE) solved with an optimization scheme that uses line-search-based backtracking to avoid taking the full Newton step and that is globally convergent (Bard 1974). The MLE implementation described below provides an alternate, equivalent interpretation (by analogy between the pseudoinverse definition and linear least squares) in the case where the model parameters and observations have normal distributions, and it can be easily extended to handle nonnormal distributions by appropriate changes in the probability density functions for the likelihood (Meeker and Escobar 1998).

The likelihood function represents the probability with which a particular model (with certain values for its parameters) matches the M experiments for the Q observations. The inverse solution is defined by the parameters that maximize this likelihood function. Details on the computation of the likelihood function can be found in Meeker and Escobar (1998) and Bard (1974), and are shown below for a problem with a single normally distributed observation (Q=1) as an example. In the case of normal distributions, which are characterized by an exponential function in the probability density function, working with the logarithm of the likelihood is preferred. The log-likelihood (log L) is computed (within an additive constant) as:

$$\log\left[L(x)\right] = -\frac{1}{2} \sum_{i=1}^{M} \left[ \frac{f(x, C) - y_{i1}}{\sigma_1^Y} \right]^2 \tag{8}$$

The maximization of the likelihood function in the N-dimensional model parameter space is performed using the Gauss–Newton method (Fossum 1998; Bard 1974). The log-likelihood is differentiated with respect to the model parameters to compute the gradient, and the Newton method is used to linearize the gradient and equate it to zero. The use of the Newton method requires differentiation of the gradient of the log-likelihood with respect to the model parameters and provides the Hessian [H] of the log-likelihood, which is an N×N matrix that is referred to as the Fisher information matrix in the statistical literature (Meeker and Escobar 1998). In the Gauss–Newton method, the second derivatives of the model (which are computationally expensive and unstable to compute using finite differencing) are neglected and an approximate Hessian [H] is used. This approximation has the interpretation of solving a sequence of linear regression problems (Bard 1974) and is equivalent to the nonlinear system solution method in (5). The Hessian [H] has the role that the Jacobian [J] played in (5). The use of the pseudoinverse (for a square matrix) in the computation of the inverse Hessian $[H]^+$, with truncation of near-zero singular values, leads to an alternate optimization-based inverse SVD implementation. The Hessian [H] being singular is equivalent to the Jacobian matrix [J] being rank-deficient, and vice versa. The optimization-based estimate has the interpretation of representing the maximum likelihood solution closest to the prior. This Gauss–Newton truncated SVD method was selected and implemented in this paper with a line-search technique used for global convergence. The line-search technique scales the full Gauss–Newton step and uses a fraction of it. The partial step that is obtained guarantees a decrease in the negative log-likelihood (which is minimized) and ensures that a subsequent estimate will have increased the likelihood function as compared to the current one (Bard 1974). This provides a degree of robustness in ensuring global convergence; however, the method is gradient-based and could get trapped in a local optimum that does not globally maximize the likelihood.

Such a Gauss–Newton–SVD method using the pseudoinverse was suggested for parameter estimation problems in Bard (1974), but was less preferred and not exercised when compared to the Marquardt or directional discrimination methods that add positive diagonal values to the Hessian to make it nonsingular. We, however, believe that this method, when used along with the truncated SVD, is a powerful method to solve practical engineering inverse problems because it has the ability to warn the user about ill-posedness in the inverse model. One can track whether singular values are close to zero and provide the user with the number of near-zero singular values, which represents the nonobservability of the system. This information lets the engineer know that a parameter estimate can be obtained only in a most probable sense that depends on the prior. Most practical engineering inverse problems tend to be ill-posed, and the standard use of unconstrained optimization techniques that typically make the Hessian positive-definite to handle round-off and conditioning without any tracking [e.g., variable metric methods (Press et al. 1996), Marquardt, etc.] results in a loss of information to the engineer. The current implementation provides the ability to handle specified and independent normal uncertainty in each of the Q observations ($\sigma_j^Y$), i.e., a Q×Q diagonal covariance matrix. In the case where this information is not available, the (Q×Q) covariance matrix is estimated by (a) first maximizing the log-likelihood with respect to the covariance matrix (which results in the covariance matrix being related to the moment matrix of the residuals) and (b) using that estimate for maximization of the log-likelihood with respect to the model parameters, as suggested in Bard (1974).
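The Gauss–Newton truncated SVD scheme with backtracking can be sketched as below. This is an illustration under stated assumptions, not the paper's code: the function name, the singular value threshold applied to the Hessian, the simple-decrease acceptance rule in the line search, and the two-parameter test model (whose parameters are confounded on purpose) are all hypothetical. Residuals are assumed to be already normalized by the observation standard deviation, so the negative log-likelihood is half the sum of squared residuals.

```python
import numpy as np

def gauss_newton_svd(residual, jacobian, x0, sv_tol=1e-6, max_iter=50):
    """Minimize 0.5*||r(x)||^2, the negative log-likelihood up to a constant."""
    x = np.asarray(x0, dtype=float)
    n_near_zero = 0
    for _ in range(max_iter):
        r, J = residual(x), jacobian(x)
        g = J.T @ r                          # gradient of the negative log-likelihood
        H = J.T @ J                          # Gauss-Newton Hessian (model 2nd derivatives dropped)
        U, w, Vt = np.linalg.svd(H)
        keep = w > sv_tol
        n_near_zero = int(np.sum(~keep))     # tracked so the user can be warned
        w_inv = np.zeros_like(w)
        w_inv[keep] = 1.0 / w[keep]
        step = -(Vt.T * w_inv) @ (U.T @ g)   # truncated pseudoinverse of H applied to -g
        f0, t = 0.5 * (r @ r), 1.0
        while t > 1e-8:                      # backtracking: halve until no increase
            r_new = residual(x + t * step)
            if 0.5 * (r_new @ r_new) <= f0:
                break
            t *= 0.5
        x = x + t * step
        if np.linalg.norm(t * step) < 1e-10:
            break
    return x, n_near_zero

# Hypothetical test model: the output depends only on x[0] + x[1].
times = np.linspace(0.1, 3.0, 10)
data = np.exp(-0.7 * times)                  # synthetic, noise-free observations

def residual(x):
    return np.exp(-(x[0] + x[1]) * times) - data

def jacobian(x):
    col = -times * np.exp(-(x[0] + x[1]) * times)
    return np.column_stack([col, col])

x_hat, n_near_zero = gauss_newton_svd(residual, jacobian, x0=[0.0, 0.0])
if n_near_zero > 0:
    print(f"warning: {n_near_zero} near-zero singular value(s); "
          "the estimate depends on the prior")
print(x_hat)
```

Because every step lies along the only observable direction, the iterates started at the prior mean stay on the line x[0] = x[1], and the estimate splits the identifiable sum equally between the two parameters. A method that simply adds diagonal terms to H would also return an answer here, but without the count of near-zero singular values that signals the nonuniqueness.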

2.4 Alternate maximum a posteriori probability inverse solution

The optimization-based approach for inverse problems allows for another inverse solution to be developed that is referred to as the maximum a posteriori probability (MAP) solution. Ill-posedness arising due to bad-quality data (i.e., with large measurement uncertainty $\sigma_j^Y$) was not addressed by the SVD method. Such a case is better handled by the MAP method, which provides a compromise between a good prior solution and the best model match to the (bad-quality, limited) data. Here, one computes the posterior probability density, which is the product of the likelihood function and the prior solution probability from Bayes' theorem (Meeker and Escobar 1998; Bard 1974). In this context, one starts with a prior probability distribution for the model parameters and updates this distribution to a conditional probability given the data. Maximizing this posterior conditional probability provides the MAP solution, which is an easy Bayesian estimate to compute. Assuming a normal distribution, the logarithm of the prior probability can be written as:

$$\log\left[p_o(x)\right] = -\frac{1}{2} \sum_{i=1}^{N} x_i^2 \tag{9}$$

In the MAP solution, the Gauss–Newton solution scheme is used as in the SVD, except that the posterior probability $\varphi = \log L + \alpha \log p_o$ is now maximized, where α=1. The use of $\log p_o$ ensures that the singular values of the Hessian of φ will be nonzero and will ensure a well-conditioned inverse problem. Note, however, that the prior solution will influence the MAP estimate even in the case where the inverse problem is exactly determined or overdetermined. The MAP solution essentially translates to the solution of a biobjective optimization problem with a specially selected weight α=1. Different values of α correspond to different strengths of zeroth-order (Tikhonov) regularization and provide the optimal Pareto front that describes the compromise between closeness of the solution to the prior and matching the data. Inverse solutions corresponding to α=1 (MAP) and α=0 (SVD) are the most appealing because they have a probabilistic interpretation, as described above. The SVD estimate is often referred to in the literature as the limit (α→0) of zeroth-order Tikhonov regularization (Zabaras et al. 1993). The MAP provides a stronger regularization than SVD because:

(a) The prior affects the SVD solution only to handle stability and uniqueness issues arising through an ill-conditioned or rank-deficient Hessian matrix [H], whereas the prior affects the MAP solution even in the case where the Hessian matrix [H] is well conditioned.

(b) The MAP estimate handles instability issues that arise due to noisy data better because deviations from the prior are restricted even in the case where the Hessian matrix is well conditioned, whereas the SVD estimate will find the best match to the noisy data in this case, with the prior having no influence on the SVD estimate.

To provide a better understanding of the relationship between the two inverse solutions, it can be noted that:

(a) The MAP solution is the same as the SVD solution in the case where the prior distributions are noninformational, i.e., if the prior is assumed to have a uniform distribution.

(b) The MAP solution will tend to the SVD solution as the standard deviation of a normal prior distribution becomes large.
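The relationship between the two estimates can be checked on a small linear problem. The sketch below assumes a hypothetical linear normalized model f(x) = Jx, for which maximizing φ = log L + α log p_o with (8) and (9) reduces to solving (JᵀJ + αI)x = Jᵀy; the function name `map_estimate` and the numbers are illustrative assumptions only.

```python
import numpy as np

def map_estimate(J, y, alpha=1.0):
    """Maximizer of log L + alpha * log p_o for a linear model f(x) = J x."""
    n = J.shape[1]
    return np.linalg.solve(J.T @ J + alpha * np.eye(n), J.T @ y)

# One observation and two parameters: an underdetermined problem.
J = np.array([[1.0, 1.0]])
y = np.array([2.0])

x_svd = np.linalg.pinv(J) @ y          # solution closest to the prior mean (x = 0)
x_map = map_estimate(J, y, alpha=1.0)  # MAP estimate, pulled toward the prior

for alpha in (1.0, 1e-2, 1e-4, 1e-8):
    print(alpha, map_estimate(J, y, alpha))
print("SVD:", x_svd)
```

With α=1 the estimate is pulled away from an exact data match toward the prior mean, and as α is reduced toward zero the estimate approaches the pseudoinverse solution, consistent with the description of the SVD estimate as the α→0 limit of zeroth-order Tikhonov regularization.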

The current implementation provides the flexibility of specifying lower and upper bounds (box constraints) for the model parameters, using projection-based methods to modify the unconstrained optimization problem and ensure that the intermediate iterations do not enter an infeasible model parameter space (Bard 1974).

2.5 Recommended approach (flow chart for implementation)

Based on the above discussions, the following is the recommended approach to solve a parameter estimation problem:

(a) Compute the inverse solution using the SVD method. If the method does not result in warnings, then the parameter estimation problem is well posed and the unique solution reported was obtained from the data alone. Use the MAP estimate as an alternative if a good prior estimate exists and the data are noisy ($\sigma_j^Y$ is large).

(b) If warnings are encountered about nonuniqueness, the user can:

1. Retain the SVD solution with the understanding that it is a most probable solution closest to the prior; no estimates of uncertainty in the model parameter estimates are provided.

2. Obtain more data or investigate changes to the form of the model (if that flexibility exists) that eliminate any warning from the SVD method.

3. Use the MAP solution if a good prior solution is available and the data are noisy ($\sigma_j^Y$ is large).

There are several possible solution methods and approaches to solve the parameter estimation problem, discussed in Table 1; the SVD and MAP techniques were selected to handle the widest class of engineering models.

2.6 Time-dependent models

The inverse solution is extended here to time-dependent models, where the model input and output are time-dependent. We restrict attention to the case where the parameters estimated are finite-dimensional. Any time dependence in the model input parameters is parameterized using basis functions that vary with time, and the constant coefficients of these basis functions are estimated. The observations Y are now time-dependent, and an additional subscript k is added to the definition in Fig. 1 to indicate the time dependence, i.e., $Y_{ijk}$ (i=1 to M, j=1 to Q, k=1 to R) refers to the jth observation for the ith experiment at the time referred to by the index k. Thus, we have:

$$Y_{ijk} \sim F(X, C) = F(X, C) + \mathrm{Nor}\left(0, \sigma_j^Y\right), \quad i = 1 \text{ to } M,\ j = 1 \text{ to } Q \text{ and } k = 1 \text{ to } R \tag{10}$$

The standard deviations for the Q time-dependent observations are assumed to be $\sigma_j^Y$, independent of time and specified by the user. With this definition, the inverse solution method can be extended to solve several practical engineering inverse problems. A feature unique to time-dependent models is the ability to obtain parameter estimates as a function of time, which provides an understanding of the drift in parameter estimates with time. Because the parameter estimates computed were assumed to be independent of time, this computation provides the ability to cross-verify that assumption, i.e., any significant drift in model parameter estimates with time indicates an incorrect modeling assumption that needs correction. The parameter estimate at time t=t1 is computed using all transient data until time t1. This estimate is used as an initial guess for the parameter estimate that is requested for a time t=t2>t1. The nonlinearity and black-box nature of the models prevent the use of a batch least-squares-type estimation that updates an estimate at time t1 using only data available in the interval [t1, t2] to provide an estimate at time t2. Here, the data until time t1, used to provide a parameter estimate at time t1, are reused when computing the parameter estimate at time t=t2>t1. The use of a recursive initial guess mitigates some of the increase in computational cost that results from having parameter estimates as a function of time.
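The recursive use of the initial guess can be illustrated with a small sketch. Everything here is an assumption made for illustration: a hypothetical one-parameter decay model, synthetic noise-free data, plain Gauss–Newton without a line search, and the chosen reporting times. Each estimate reuses all data up to the requested time and starts from the previous estimate.

```python
import numpy as np

def fit_decay_rate(t, y, a0, n_iter=30):
    """Gauss-Newton fit of y ~ exp(-a t) for a single constant parameter a."""
    a = float(a0)
    for _ in range(n_iter):
        model = np.exp(-a * t)
        r = model - y            # residuals
        J = -t * model           # sensitivity of the model output to a
        a -= (J @ r) / (J @ J)   # full Gauss-Newton step (no line search here)
    return a

t = np.linspace(0.0, 5.0, 51)
y = np.exp(-0.7 * t)             # synthetic data from a constant decay rate

estimates = []
a_guess = 1.0                    # hypothetical prior guess
for t_end in (1.0, 2.0, 3.0, 4.0, 5.0):
    window = t <= t_end          # all transient data until t_end are reused
    a_guess = fit_decay_rate(t[window], y[window], a_guess)  # warm start
    estimates.append(a_guess)

drift = max(estimates) - min(estimates)
print(estimates, drift)
```

In this synthetic case the data are generated with a constant parameter, so the estimates should show essentially no drift; a systematic trend in `estimates` would instead point to a modeling assumption that needs correction.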

3 Numerical examples

Benchmarking of the inverse method was performed on several numerical examples and engineering applications, and a few are highlighted here to demonstrate the proposed approach. Examples 3.1 and 3.2 are the benchmarks, and the rest of the examples are engineering applications.

3.1 Chemical kinetics model

This chemical kinetics parameter estimation benchmark was taken from Bard (1974) and is described by N=2 model parameters X1 and X2, which represent the frequency and activation energy to be estimated, and one output parameter (Q=1) that represents the fraction remaining of a compound in a reaction. The parameters $C_{i1}$ and $C_{i2}$ are two constant system parameters (P=2) representing time and temperature, and there are M=15 experiments that were performed at different combinations of these parameters.

Table 1 Advantages and limitations of inverse modeling solution methods

| Solution scheme | Advantages | Limitations |
|---|---|---|
| Filtered Monte Carlo and MCMC methods (Statinkov and Matusov 1995) | Treats nonnormal priors naturally; provides global inverse solutions | Unsuitable for models that have a large simulation time |
| MPP math from fast probability integration theory (Khalessi and Lin 1993) | Equivalent to SVD; SVD/MPP are shortest distance computations | Extensions to the case of multiple observations and experiments are not intuitive |
| Optimization methods (SQP, BFGS, etc.) (Vanderplaats 1984) | User needs to regularize the solution | Cannot automatically detect ill-posedness; convergence may not be the best for parameter estimation |
| SVD and MAP inverse solutions | Fast due to being gradient-based; handles ill-posedness | Cannot provide global inverse; cannot handle discrete variables |
| Multiobjective (data and prior) genetic algorithms (Furukawa 2001) | Provides global inverse; handles discrete parameters | Unsuitable for large simulation time models |

MCMC Markov chain Monte Carlo, MPP most probable point, SQP sequential quadratic programming, BFGS Broyden–Fletcher–Goldfarb–Shanno algorithm

The chemical kinetics model is given as:

$$Y_{i1} = \exp\left(-X_1\, C_{i1} \exp\left(-X_2 / C_{i2}\right)\right), \quad i = 1 \text{ to } 15 \tag{11}$$
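Equation (11) can be written as a vectorized function for use inside an estimation loop. In this sketch the function name and the array layout are assumptions, and the experimental conditions in `C_demo` are invented placeholders, not the benchmark data from Bard (1974).

```python
import numpy as np

def kinetics_model(X, C):
    """Eq. (11): fraction remaining for each experiment.

    X = [X1, X2] are the model parameters; C has shape (M, 2) with
    columns C_i1 (time) and C_i2 (temperature).
    """
    X1, X2 = X
    return np.exp(-X1 * C[:, 0] * np.exp(-X2 / C[:, 1]))

# Placeholder conditions (time, temperature) chosen only for illustration.
C_demo = np.array([
    [0.0, 100.0],
    [0.1, 100.0],
    [0.1, 200.0],
])
Y_demo = kinetics_model([100.0, 2000.0], C_demo)
print(Y_demo)
```

The model output is a fraction, so it equals 1 at zero time and stays in the interval (0, 1] for positive X1, time, and temperature.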

In this example, the uncertainty or noise in the observations was assumed to be unknown (i.e., $\sigma_1^Y$ is estimated from the data). A poor prior mean (to test the robustness of the inverse method) of [X1, X2]=[100.0, 2,000.0] and a prior standard deviation of 50.0 were used for both parameters. The SVD threshold was specified as $10^{-5}$ and a 0.01% finite difference step size was used. The convergence of the solution scheme is reported in Fig. 3, with the solution obtained after 28 evaluations of the likelihood function.

The Hessian [H] from the SVD method was well conditioned at the solution, and a unique estimate of [813.89, 961.01] could be inferred from the data. The estimate of the standard deviation $\sigma_1^Y$ (uncertainty) in this example with one observation reduces to distributing the residuals (Fig. 4a) uniformly among the degrees of freedom (number of experiments − number of parameters) in the system. Patterns in such residual plots are useful to identify discrepancies in the form of the nonlinear model in matching the data and to assess the validity of the constant variance (homoscedasticity) assumption.

The variation of the log-likelihood as a function of the model parameters can be approximated using a Taylor series expansion in terms of the log-likelihood and the Hessian of the log-likelihood [H] at the inverse solution (the gradient of the log-likelihood being zero at the inverse solution), and this approximation is used to develop uncertainty estimates for the parameters. The inverse Hessian matrix provides the covariance matrix of the parameter estimates (Meeker and Escobar 1998; Bard 1974). The estimated uncertainty or standard deviation of the parameter estimates was [229.39, 63.81], derived from the diagonal terms in the covariance matrix (inverse Hessian) for the parameters, and the correlation coefficient was estimated as 0.9812. One can finally compute 95% confidence bounds on the mean estimates of the model by combining the estimated uncertainty $\sigma_1^Y$ with the propagated uncertainty associated with the model parameter estimates, as shown in Fig. 4b.

Fig. 3 Convergence of parameter estimates, as well as the negative log-likelihood, with the Gauss–Newton–SVD (GNS) iterations. Every GNS iteration involves multiple evaluations of the likelihood function for the finite difference gradient computation and in the line search

Fig. 4 a Difference between the model and data at the 15 experiments for the inverse solution. b Mean and 95% confidence bounds on the model output shown along with the experimental data. It can be noted that the 95% confidence bounds on the model output contain the data

The MAP solution was also validated against Bard (1974) for two cases with different prior standard deviations:

(a) A prior mean of [1,000.0, 1,000.0] and a prior standard deviation of [200.0, 200.0] yielded a posterior mean estimate of [928.95, 990.65] for [X1, X2].

(b) A prior mean of [1,000.0, 1,000.0] and a prior standard deviation of [100.0, 100.0] yielded a posterior mean estimate of [976.19, 1,001.68] for [X1, X2].

It can be seen that as the prior standard deviation is increased toward ∞ (the prior becoming noninformative), the MAP estimate will tend to the SVD solution.
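The limiting behavior of the MAP estimate, and the role of the inverse Hessian as a covariance matrix, can be illustrated on a linear problem with Gaussian noise and a Gaussian prior. This is a sketch under those simplifying assumptions (the kinetics model above is nonlinear), with a hypothetical two-parameter model and noise-free data. For such a problem the MAP estimate has a closed form, and as the prior standard deviation grows it approaches the ordinary least-squares solution.

```python
def map_estimate(A, y, sigma_y, prior_mean, prior_std):
    """MAP estimate and covariance for y = A x + noise, two parameters."""
    w = 1.0 / sigma_y**2
    p = [1.0 / s**2 for s in prior_std]
    # Hessian of the negative log-posterior: A^T A / sigma_y^2 + diag(1 / prior_std^2)
    h11 = w * sum(a[0] * a[0] for a in A) + p[0]
    h12 = w * sum(a[0] * a[1] for a in A)
    h22 = w * sum(a[1] * a[1] for a in A) + p[1]
    # Right-hand side: A^T y / sigma_y^2 + prior_mean / prior_std^2
    g1 = w * sum(a[0] * yi for a, yi in zip(A, y)) + p[0] * prior_mean[0]
    g2 = w * sum(a[1] * yi for a, yi in zip(A, y)) + p[1] * prior_mean[1]
    det = h11 * h22 - h12 * h12
    cov = [[h22 / det, -h12 / det], [-h12 / det, h11 / det]]  # inverse Hessian
    x = [cov[0][0] * g1 + cov[0][1] * g2, cov[1][0] * g1 + cov[1][1] * g2]
    return x, cov

# Hypothetical linear model; data generated without noise from x_true = [2.0, -1.0].
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [2.0, 1.0, 0.0, -1.0]
prior_mean = [0.0, 0.0]

for prior_std in (0.1, 1.0, 10.0, 1e4):
    x_map, cov = map_estimate(A, y, 0.5, prior_mean, [prior_std, prior_std])
    print(prior_std, x_map)
```

With a tight prior the estimate stays near the prior mean; with a very wide prior it recovers the least-squares values. The square roots of the diagonal entries of `cov` play the role of the parameter standard deviations quoted above.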

3.2 One-dimensional inverse heat transfer

This inverse heat transfer estimation benchmark was taken from Woodbury (2003) and is described by N=2 model parameters representing the heat flux X1 at z=0 and a uniform initial temperature X2 of the one-dimensional rod governed by transient heat conduction, as represented in Fig. 5.

Two (Q=2) TCs located at z=0 and z=L provide the data from one experiment (M=1), and temperature readings are simulated every 1 s until t=100 s (R=101). There are no system parameters in this example (P=0). The total time of interest is 100 s. Two cases are considered for the generation of data for the inverse model:

(a) Modeling error: Data are generated using values of X1=50 kW/m² and X2=20°C that correspond to an incorrect modeling of heat flux at z=L. The model assumes that z=L is insulated, whereas the data assume a heat loss at this surface with a Biot number Bi=hL/K=0.1.

(b) Measurement uncertainty: Data are generated by adding measurement uncertainty corresponding to $\sigma_{Y_i} = 5.0$ for both TCs with values of X1=50 kW/m² and X2=20°C.

Solutions to these two cases were obtained using the SVD and MAP methods with a prior initial guess of (X1, X2)=(40.0, 10.0) and a prior standard deviation of 5.0 for both variables. A solution was also obtained by considering only one TC at z=L. The one-TC case at z=L is interesting because the TC is situated far from the heat flux being estimated, which results in an ill-posed problem at small times. Parameter estimates were obtained as a function of time at 10 equal time intervals between 0.0 and 100.0 s. The initial temperature estimate was good and close to 20°C with all three data sets, using SVD or MAP solutions, with one or two TCs, and hence is not reported in the charts that follow.

Inverse solution in the presence of modeling error. Figure 6a depicts the variation of the residual with time for the SVD or MAP solutions based on parameter estimates at the final time of 100.0 s for the data set generated with modeling error. The residual pattern is nonrandom and is indicative of a bias or modeling error in the problem. As mentioned in Woodbury (2003), the presence of such a modeling error will typically be reflected in a trend or drift in model parameter estimates with time. This trend is shown in Fig. 6b and is more severe in the single-TC case, where the lack of sufficient information amplifies the drift. Such a case is quite common in practical time-dependent engineering inverse applications, as shown in Sections 3.3 and 3.5. The ability to monitor parameter estimates with time can assist in the reduction of modeling errors by focusing efforts on parameters that show a significant drift. For the case with one TC, at time t=10.0 s, it is noted that the SVD and MAP solutions are unaffected by the data and are equal to the prior (i.e., the information provided by the TC at z=L at t=10.0 s is less than the threshold singular value for the SVD method). At 20.0 s, however, the SVD detects significant information coming from the data and updates the estimate to be closer to the data. The MAP solution, in contrast, provides a compromise between the prior solution and the best match to the data. With time, information from the data gains more weight than the prior, and so the MAP solution tends to the SVD solution at larger times. For the case of both TCs, the TC at z=0 provides significant information about the heat flux at z=0, and therefore, even at time t=10.0 s, one notices that the SVD solution is close to the true solution, whereas the MAP solution is in between the prior and the SVD solution.
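The way a thresholded SVD leaves poorly informed parameters at their prior values can be sketched for a linear problem. The example below is an illustration only, not the solver described in this paper: the sensitivity matrix is hypothetical, and applying the threshold relative to the largest singular value is an assumption. Directions in parameter space whose singular value falls below the threshold receive no update, so the solution is the one closest to the prior among those that match the informative part of the data.

```python
import numpy as np

def svd_solution_near_prior(A, y, x_prior, threshold):
    """Least-squares update from the prior, ignoring directions whose
    singular value (relative to the largest) is below `threshold`."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > threshold * s[0]
    r = y - A @ x_prior                     # residual evaluated at the prior
    coeffs = (U.T @ r)[keep] / s[keep]      # update along informative directions only
    return x_prior + Vt[keep].T @ coeffs, keep

# Hypothetical sensitivities: the data barely respond to the second parameter.
A = np.array([[1.0, 1e-6], [2.0, 2e-6], [3.0, 2.9e-6]])
x_true = np.array([50.0, 20.0])
x_prior = np.array([40.0, 10.0])
y = A @ x_true

x_svd, kept = svd_solution_near_prior(A, y, x_prior, threshold=1e-2)
```

Here the first parameter is recovered from the data while the second stays essentially at its prior value, which mirrors the behavior described for the single TC at early times.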

Fig. 5 1D heat transfer benchmark (Woodbury 2003) where the heat flux and initial temperature are computed based on “simulated” transient temperature–time data available at two TCs positioned at z=0 and z=L, respectively

Inverse solution in the presence of measurement uncertainty. The residual pattern (not shown here) is random, and one no longer observes the drift in model parameter estimates with time. The heat flux estimate as a function of time is shown in Fig. 7 and converges to the true value in both the SVD and MAP solutions with increasing data in time. This convergence is slower for the case with one TC than for the case of two TCs as a result of the smaller amount of information in the former case. The stronger nature of the MAP regularization is noticed in the one-TC solution, where the SVD demonstrates a more oscillatory and less smooth trend compared to MAP, especially at earlier times. The SVD threshold knob was set at $10^{-2}$ for this example; using a smaller value resulted in a larger oscillation at smaller times (i.e., t<20.0 s). This example demonstrates the distinction between the regularized solutions provided by SVD and MAP and their dependence on the threshold singular value and prior solutions, respectively (especially in the case where insufficient information is coming from the data).

3.3 Process modeling and materials modeling applications

A common problem encountered in process modeling of disk forgings is the computation of heat transfer coefficients during heat treatment or quenching. An accurate characterization of the heat treatment process is essential to capture the right thermal gradients that affect residual stress and material properties like yield strength. The disk forging was instrumented with five TCs and air-cooled. The convection coefficients were estimated based on transient temperature–time data. Radiation view factors were assumed known in this problem. Various boundary segments where the heat transfer coefficients were desired were lumped into nine segments where the convection coefficients were estimated using the inverse solution. Figure 8 depicts the instrumented disk, as well as the inverse SVD solution for the nine convective coefficients on the boundary. The SVD solution was close to an alternate solution provided by DEFORM’s inverse code developed as part of the DARPA Accelerated Insertion of Materials Program (http://www.deform.com/). The difference between the two inverse solutions was attributed to the fact that the nine-parameter solution was nonunique. The SVD solution provided the user with a warning that one of the singular values was below the threshold and that the solution provided was the most probable solution closest to the prior. The engineer then has the choice to combine the heat transfer segments to arrive at a smaller number (possibly eight) of heat transfer coefficients that can be best (uniquely) estimated based on the given instrumentation and data. Figure 9a depicts the temperature vs time profiles for two selected TC locations (three and five) for the initial guess (constant convection on all the segments) and the SVD solution, clearly demonstrating the ability of the inverse solution to improve the process model. Residual plots for all five TCs are shown in Fig. 9b, where one observes (a) a residual signature that is indicative of modeling error, as described in Section 3.2, and (b) a higher mismatch at smaller times and for the TC locations that are closer to the surface. These trends indicate that part of the residual signature can be improved by incorporating radiation view factors as additional parameters to be tuned in the model.

Fig. 7 Difference between the heat flux estimate and true solution (50 kW/m²) as a function of time for the case of one and two TCs with the SVD and MAP solutions for the data set with measurement uncertainty

Fig. 6 a Residual plot as a function of time based on parameter estimates at final time for the case of SVD or MAP solutions with two TCs, for the data set with modeling error. b Difference between the heat flux estimate and true solution (50 kW/m²) as a function of time for the case of one and two TCs with the SVD and MAP solutions for the data set with modeling error

A materials life assessment application of the time-dependent inverse approach was the high-temperature constitutive modeling of blade superalloys that exhibit highly nonlinear behavior. In this case, the inverse solution is used to estimate 10 parameters (in an elasto-viscoplastic hardening model) that define the evolution of the state variable and plastic strains, based on multiple tests as highlighted in Fig. 10.

3.4 Sparse clearance data modeling

A problem encountered in the design of turbo-generators and jet engines is to reduce the clearance between rotating and stationary parts. For higher efficiencies, it is desirable to reduce clearance as much as possible. At the same time, engine cold clearances are designed above a small threshold to minimize the risk of rub/interference during operation. The circumferential variation of the clearance y at a given compressor or turbine stage can be described by the dominant terms of an Nth-order Fourier series. The different coefficients in the Fourier expansion provide information about circumferential distortion levels at a given axial location in the engine.

Fig. 9 a Initial and optimal thermal transients at TC locations three and five. b Residual plot at the inverse solution demonstrating a maximum mismatch of about +35°F

Fig. 8 Instrumented disk and optimal inverse convective coefficients for nine boundary segments



$$y = a_0 + \sum_{n=1}^{N} \left(a_n \cos(n\theta) + b_n \sin(n\theta)\right)$$

The amount of experimental data available is sparse, with 2–4 clearance probe data points available at a given stage in the engine. Thus, the Fourier series cannot be used in its entirety, and the true spatial variation of the clearance cannot be captured purely from the data. The challenge is to develop a systematic approach that allows for using sparse experimental data for model improvement and validation. Figure 11 depicts an axial location in the engine where data at three circumferential locations are available. Only three terms in the Fourier series can be estimated from the data.

To estimate the higher-order effects, virtual (prior) information from a finite element (3D analysis system) physics-based simulation model is used. Thus, 10 virtual probes were placed spatially and the corresponding information was obtained from the simulation model. The first step is to get an understanding of the maximum number of terms to be included in the Fourier series. This is done by fitting Fourier series of different orders to the model data (SVD approach) and looking at the magnitude of the Fourier coefficients (N=3 chosen here). The next step is to bring in the experimental data and estimate the posterior information for the Fourier terms. The experimental data are assigned a low standard deviation (high belief), whereas the virtual (model) data are given a high standard deviation (low belief). The difference in the Fourier terms of the prior and the posterior (e.g., growth term a0, horizontal shift a1) gives the engineer an insight into the effects missing in the simulation model. For example, differences in the growth term a0 are indicative of a thermal mismatch that could be resolved by thermally tuning the engine using the approach illustrated in Section 3.5.
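The two-step procedure can be sketched as a weighted least-squares fit in which every data point is scaled by its assigned standard deviation. This is a minimal illustration under that assumption, not the implementation used here; all clearance values, probe angles, and standard deviations below are hypothetical, and the function names are introduced only for this example.

```python
import numpy as np

def fourier_design(theta, order):
    """Columns: 1, then cos(n*theta), sin(n*theta) for n = 1..order."""
    cols = [np.ones_like(theta)]
    for n in range(1, order + 1):
        cols += [np.cos(n * theta), np.sin(n * theta)]
    return np.column_stack(cols)

def fit_fourier(theta, y, sigma, order):
    """Weighted least squares: each row is scaled by 1/sigma (its belief)."""
    A = fourier_design(theta, order) / sigma[:, None]
    coeffs, *_ = np.linalg.lstsq(A, y / sigma, rcond=None)
    return coeffs

order = 3
# Hypothetical "virtual probe" clearances from a simulation model (10 locations).
theta_v = np.linspace(0.0, 2.0 * np.pi, 10, endpoint=False)
y_v = 1.0 + 0.10 * np.cos(theta_v) + 0.05 * np.sin(2 * theta_v) + 0.02 * np.cos(3 * theta_v)
# Hypothetical measured clearances at three probes.
theta_e = np.array([0.5, 2.5, 4.5])
y_e = np.array([1.20, 1.05, 1.12])

# Step 1: prior coefficients from the virtual data alone.
prior = fit_fourier(theta_v, y_v, np.full(10, 1.0), order)

# Step 2: posterior from virtual data (low belief) plus measurements (high belief).
theta_all = np.concatenate([theta_v, theta_e])
y_all = np.concatenate([y_v, y_e])
sigma_all = np.concatenate([np.full(10, 0.1), np.full(3, 0.001)])
posterior = fit_fourier(theta_all, y_all, sigma_all, order)

print("growth term a0: prior %.3f, posterior %.3f" % (prior[0], posterior[0]))
```

Because the measurements carry a much smaller standard deviation, the posterior curve passes close to them while the virtual data set the higher-order shape; comparing `prior` and `posterior` term by term corresponds to the comparison of Fourier terms described above.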

3.5 Aircraft engine inverse heat transfer

The modeling of a complex multidisciplinary jet engine system (Fig. 12) with physically motivated boundary conditions often needs tuning to match field data. One of the challenges in developing a thermomechanical finite element model of the engine is to apply the right set of boundary conditions that gives accurate prediction of casing temperatures and engine clearance. Two demonstrations of the inverse modeling approach are illustrated here: (a) tuning heat transfer coefficients in the undercowl region for steady state data and (b) transient temperature data matching for tuning the high-pressure compressor (HPC) cavity heat transfer coefficients.

Fig. 10 Nonlinear constitutive modeling with a unified plasticity, state-variable-based constitutive model for prediction of stable hysteresis loops in low-cycle fatigue tests with compressive and tensile dwells

Fig. 11 The prior and posterior clearance predictions are shown by blue and orange curves, respectively. It should be noted that the posterior prediction passes through the experimental data points and retains the overall shape of the clearance profile because of the high belief given to the data

Fig. 12 Schematic of a jet engine with the undercowl and HPC cavity instrumented with TCs during engine development

Results of the steady state temperature data matching are shown in Fig. 13, where around 23 heat transfer coefficients that represent the axial and circumferential variation in the undercowl were tuned to match around 26 TCs placed in the metal skins, towers, and manifolds of the HPC section. The data matching was carried out by performing a sensitivity analysis to group TCs and heat transfer coefficients (to improve the efficiency of the data matching) and solving six smaller-dimension, well-posed inverse problems using the SVD approach to get the optimal solution shown in Fig. 13.
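The grouping rule itself is not detailed here. One plausible way to form such groups, shown purely as an assumption, is to link a TC to a heat transfer coefficient whenever the normalized sensitivity exceeds a tolerance and to treat each connected set as a separate inverse subproblem. The sensitivity values and the tolerance below are hypothetical.

```python
def group_by_sensitivity(S, tol):
    """Group sensors (rows) and parameters (columns) of a sensitivity matrix.

    A sensor and a parameter are linked when |S[i][j]| > tol; each connected
    set of links becomes one smaller inverse subproblem.
    """
    n_rows, n_cols = len(S), len(S[0])
    parent = list(range(n_rows + n_cols))   # union-find over rows, then columns

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]   # path halving
            k = parent[k]
        return k

    for i in range(n_rows):
        for j in range(n_cols):
            if abs(S[i][j]) > tol:
                parent[find(i)] = find(n_rows + j)

    groups = {}
    for k in range(n_rows + n_cols):
        sensors, params = groups.setdefault(find(k), ([], []))
        if k < n_rows:
            sensors.append(k)
        else:
            params.append(k - n_rows)
    return sorted(groups.values())

# Hypothetical normalized sensitivities of 4 TCs (rows) to 3 coefficients (columns).
S = [
    [0.90, 0.02, 0.00],
    [0.70, 0.01, 0.00],
    [0.03, 0.80, 0.00],
    [0.00, 0.01, 0.60],
]
subproblems = group_by_sensitivity(S, tol=0.1)
```

Each entry of `subproblems` pairs a list of TC indices with the coefficient indices they inform, so each pair can be passed to the inverse solver independently.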

Results of the transient temperature data matching for the HPC cavity in a low-cycle fatigue burst chop cycle for one of the five TCs close to stage 5 are shown in Fig. 14, along with a residual plot indicative of transient modeling errors. In this 2D finite-element thermal data matching application, around 25 cavity heat transfer coefficients were tuned to match 30 TCs, using the grouping approach based on transient sensitivity information and subsequently solving four smaller-dimension inverse problems. Matching the stage 5 inverse subproblems required around 93 simulations of the thermal model and around 47 h of simulation time to converge. Plans are in place to parallelize the independent gradient computations in the inverse solution to improve the turnaround time.
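Because each finite-difference perturbation is an independent model run, the gradient evaluations can be dispatched concurrently. The sketch below illustrates that idea with Python's standard thread pool; it is an assumption about how such a parallelization might look, not the planned implementation, and `misfit` is a hypothetical stand-in for an expensive thermal simulation.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_fd_gradient(objective, x, rel_step=1e-4, max_workers=4):
    """Forward-difference gradient; the perturbed evaluations run concurrently."""
    def perturbed(j):
        h = rel_step * abs(x[j]) if x[j] != 0.0 else rel_step
        xp = list(x)
        xp[j] += h
        return objective(xp), h

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        base_future = pool.submit(objective, list(x))        # unperturbed run
        results = list(pool.map(perturbed, range(len(x))))   # one run per parameter
        f0 = base_future.result()
    return [(fj - f0) / h for fj, h in results]

# Hypothetical stand-in for an expensive simulation-based misfit.
def misfit(x):
    return (x[0] - 3.0) ** 2 + 2.0 * (x[1] + 1.0) ** 2

grad = parallel_fd_gradient(misfit, [1.0, 2.0])
```

Threads are a reasonable choice when each evaluation launches an external solver; for a CPU-bound objective written in Python, `concurrent.futures.ProcessPoolExecutor` would be the closer equivalent.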

Fig. 13 Optimal heat transfer coefficients and temperature mismatches for undercowl steady state data matching demonstrating the inverse modeling approach in tuning 3D analysis system finite-element engine thermal models

Fig. 14 Initial and optimal transient temperature profiles for a low-cycle fatigue burst chop cycle at a particular TC location close to stage 5 followed by residual plot at five TCs indicative of transient modeling errors

4 Conclusions

This paper presents an inverse modeling-based solution to parameter estimation problems starting from basic concepts in linear algebra and statistics. Most of the emphasis of optimization-based technology in the industry has been on design optimization rather than parameter estimation, which is reflected in commercially available products (e.g., http://www.engineous.com/). There is, however, a need for specific solution methods to handle the class of parameter estimation problems (including the issues addressed in this paper) in a probabilistic design system for engineering applications without user intervention. This paper has taken the first step towards addressing that need in General Electric’s probabilistic design system. Extensions to (a) provide global inverse solutions, (b) handle infinite-dimensional (function) estimation and stochastic measurements, and (c) handle nonnormal distributions will be made after identifying engineering needs.

Acknowledgments This effort was funded through the Advanced Design Technology program at General Electric (GE) Global Research through GE Aircraft Engines in 2004–2005. The authors would like to thank Srinivas Mullahalli, Kenneth Seitzer, Dragos Licu, Mothilal Rengappa, and James Griffiths for their inputs. This paper is based on a talk (GT2006-90058) given at the American Society of Mechanical Engineers (ASME) International Gas Turbine Institute Turbo Expo Conference held in Barcelona, 8–11 May 2006. Thanks to ASME for providing the permissions to reproduce some of that content for this paper.

References

Bard Y (1974) Nonlinear estimation. Academic, New York

Emery AF, Nenarokomov AV, Fadale TD (2000) Uncertainties in parameter estimation: the optimal experiment design. Int J Heat Mass Transfer 43:3331–3339

Fossum AF (1998) Rate data and material model parameter estimation. J Eng Mater Technol 120:7–12

Furukawa T (2001) Parameter identification with weightless regularization. Int J Numer Methods Eng 52:218–238

Golub G, Van Loan CF (1996) Matrix computations, 3rd edn. Johns Hopkins University Press, Baltimore

Khalessi MR, Lin HZ (1993) Most probable point locus structural reliability method. AIAA Los Angeles Sect Monogr AIAA-93:1154–1162

Meeker WQ, Escobar LA (1998) Statistical methods for reliability data. Wiley, New York

PEZ version 2.6 user-manual (2005). Probabilistic design system at General Electric Aviation, Cincinnati

Press WH, Teukolsky SA, Vetterling WT, Flannery BP (1996) Numerical recipes in C: the art of scientific computing, 2nd edn. Cambridge University Press, Cambridge

Statnikov RB, Matusov JB (1995) Multi-criterion optimization & engineering. Chapman & Hall, New York

Vanderplaats GN (1984) Numerical optimization techniques for engineering design with applications. McGraw-Hill, New York

Woodbury KA (2003) Inverse engineering handbook. CRC, Boca Raton

Zabaras N, Woodbury KA, Raynaud M (1993) Inverse problems in engineering: theory and practice. American Society of Mechanical Engineers, New York

