

Journal of Computational Physics 259 (2014) 114–134


Selection of polynomial chaos bases via Bayesian model uncertainty methods with applications to sparse approximation of PDEs with stochastic inputs

Georgios Karagiannis, Guang Lin ∗

Computational Sciences and Mathematics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, P.O. Box 999, MSIN K7-90, Richland, WA 99352, USA


Article history: Received 27 April 2013; Received in revised form 24 September 2013; Accepted 13 November 2013; Available online 1 December 2013

Keywords: Uncertainty quantification; Generalized polynomial chaos; Bayesian model uncertainty; LASSO; Median probability model; Bayesian model average; MCMC; Splines

Generalized polynomial chaos (gPC) expansions allow us to represent the solution of a stochastic system using a series of polynomial chaos basis functions. The number of gPC terms increases dramatically as the dimension of the random input variables increases. When the number of the gPC terms is larger than that of the available samples, a scenario that often occurs when the corresponding deterministic solver is computationally expensive, evaluation of the gPC expansion can be inaccurate due to over-fitting. We propose a fully Bayesian approach that allows for global recovery of the stochastic solutions, in both spatial and random domains, by coupling Bayesian model uncertainty and regularization regression methods. It allows the evaluation of the PC coefficients on a grid of spatial points, via (1) the Bayesian model average (BMA) or (2) the median probability model, and their construction as spatial functions on the spatial domain via spline interpolation. The former accounts for the model uncertainty and provides Bayes-optimal predictions, while the latter provides a sparse representation of the stochastic solutions by evaluating the expansion on a subset of dominating gPC bases. Moreover, the proposed methods quantify the importance of the gPC bases in the probabilistic sense through inclusion probabilities. We design a Markov chain Monte Carlo (MCMC) sampler that evaluates all the unknown quantities without the need of ad-hoc techniques. The proposed methods are suitable for, but not restricted to, problems whose stochastic solutions are sparse in the stochastic space with respect to the gPC bases while the deterministic solver involved is expensive. We demonstrate the accuracy and performance of the proposed methods and make comparisons with other approaches on solving elliptic SPDEs with 1-, 14- and 40-random dimensions.

Published by Elsevier Inc.

1. Introduction

Uncertainty Quantification (UQ) aims at a meaningful characterization of uncertainties in stochastic systems and efficient propagation of these uncertainties for quantitative validation of model predictions from available measurements. The mathematical models of a physical system may be stochastic functions or governed by stochastic partial differential equations (SPDEs) that include a set of random and spatial input variables. In many cases, this mapping does not have an explicit analytical form due to the complexity of the underlying phenomena. A common practice for the evaluation of the stochastic solution is the construction of a numerical model that approximates the solutions of the stochastic system and incorporates a probabilistic description of the random output variables using the random input variables.

* Corresponding author. E-mail addresses: [email protected] (G. Karagiannis), [email protected] (G. Lin).

0021-9991/$ – see front matter. Published by Elsevier Inc. http://dx.doi.org/10.1016/j.jcp.2013.11.016

Monte Carlo (MC) methods [1] and their extensions have been extensively used for uncertainty propagation over the past years. However, it has been shown that these methods are generally inefficient for large-scale stochastic systems because of their low convergence rate, and thus alternative methods have been developed. Among these methods, this paper focuses on the generalized polynomial chaos (gPC).

The gPC methods [2–4], and their extensions to multi-element gPC (ME-gPC) [5,6], have been successfully applied to a variety of UQ problems. In the gPC context, the stochastic solution u(x; ξ), where ξ ∈ Γ and x ∈ D, is modeled as a convergent series of polynomial bases {ψα(·)}_{α∈Λ}, orthogonal to each other with respect to a distribution fξ(d·) of the random input variables ξ, and the associated PC coefficients {cα(·)}_{α∈Λ}, as u(x; ξ) ≈ ∑_{α∈Λ} cα(x)ψα(ξ), where Λ is a set of indices denoting the gPC bases. Stochastic Galerkin projection based polynomial chaos methods [4,7,8] are intrusive methods which require the modification of deterministic solvers. Popular non-intrusive alternatives are the stochastic collocation methods [9,10,3,11], which are based on sparse grid integration/interpolation in the stochastic space. Nonetheless, these numerical approaches suffer from the 'curse of dimensionality' issue.

The number of the gPC bases increases with the gPC degree and the dimension of the random input variables; this issue is known as the 'curse of dimensionality'. In theory, a large gPC degree is preferable because the accuracy of the gPC approximation improves as the gPC degree increases [2]; however, it is shown in [12,13] that a gPC approximation of degree 2 or 3 may be satisfactory in certain cases. When the dimension of the input random variables increases, the number of PC coefficients to be evaluated increases dramatically due to the tensor product involved in the design of the multivariate gPC bases. Therefore, more samples are usually needed for the evaluation of all the required PC coefficients. Evaluations of the system can be quite costly or limited. Hence, often only a small number of samples is available, perhaps even smaller than the number of the PC coefficients to be estimated. This can cause unstable and inaccurate estimates due to over-fitting when traditional evaluation methods are applied. To address the 'curse of dimensionality' issue, a careful model reduction can be performed through the evaluation of a gPC expansion that contains a smaller subset of significant (or important, or dominant) gPC bases. By significant gPC bases, we mean those that capture different characteristics of the stochastic solutions and explain an acceptable fraction of their variation without increasing the bias in the gPC expansion significantly. Numerical methods that use similar ideas are l1-minimization [14], reweighted l1-minimization [15] and Bayesian compressive sensing [16,17].
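The combinatorial growth described above is easy to check directly: the number of terms of a total-degree gPC expansion is m_{p,d} = (p+d)!/(p!d!). A minimal sketch (the helper name is ours):

```python
from math import comb

def num_gpc_terms(p: int, d: int) -> int:
    """Cardinality of the total-degree gPC index set: m_{p,d} = (p+d)!/(p!*d!)."""
    return comb(p + d, d)

# Growth with the random dimension d for a fixed gPC degree p = 3,
# matching the 1-, 14- and 40-dimensional examples considered in the paper.
for d in (1, 14, 40):
    print(d, num_gpc_terms(3, d))   # 4, 680 and 12341 terms, respectively
```

Even a modest degree p = 3 in 40 random dimensions already yields over 12,000 coefficients, typically far more than the number of affordable solver runs.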

In the present study, we suggest a fully Bayesian, non-intrusive, non-adaptive, stochastic method for the evaluation of the gPC expansion as a function of the spatial and random inputs. The method treats the problem in the statistical variable selection framework as established by [18–21]. The proposed method considers a collection of models equipped with LASSO shrinkage priors, such that each model corresponds to a gPC expansion with a different set of gPC bases. Given a discretization of the spatial domain as a grid of spatial points, the likelihood is defined upon a truncated gPC expansion with an unknown combination of gPC bases, while the truncation error is treated as a Gaussian variable with unknown variance. Each gPC basis is associated with an unknown inclusion variable that indicates whether the gPC basis is significant enough to be included in the gPC expansion. A priori information about the number of significant gPC bases can be taken into account before the evaluation of the gPC expansion through the assignment of prior distributions on the inclusion variables. The importance of each gPC basis is quantified through the associated marginal inclusion posterior probability. Shrinkage of the PC coefficients is achieved by assigning independent double exponential prior distributions on the discretized PC coefficients. Because of the shrinkage property, smaller coefficients shrink faster towards zero, while less shrinkage is applied to larger coefficients. This may further encourage the recovery of any sparse gPC representation. The method allows the global recovery of the stochastic solution through the evaluation of a sparse gPC expansion by using spline interpolation. Thus, unlike other approaches [14–17] that focus only on one single point of the spatial domain, predictions of the solution at arbitrary new spatial points can be computed directly from the same gPC expansion, without the need to sample measurements from the stochastic system at the new spatial points and re-evaluate the expansion there.

The present method allows the evaluation of the gPC expansion either by Bayesian model average (BMA) or by model selection. BMA [22], a gold standard for inference or predictions, can be implemented for the evaluation of the gPC expansion. BMA takes into account model uncertainty by combining individual model-dependent inferences and weighting them according to the associated marginal model posterior probabilities. Consequently, the BMA based gPC expansion can be considered as a combination of single gPC expansions defined on different sets of bases. While it provides better predictive ability than any single model (according to [23,22]), it lacks sparsity since all the gPC bases are included in the expansion. On the other hand, when a sparse representation of the solution is needed, the gPC expansion may be evaluated based only on a single subset of gPC bases through model selection. Here, the selection of a single subset of significant gPC bases is performed by examining the marginal inclusion posterior probabilities. [24] discuss conditions under which the median probability model – here, the gPC expansion that consists of gPC bases with marginal inclusion posterior probability greater than or equal to 1/2 – provides Bayes optimal predictions, namely better than any other single gPC expansion based on a different set of bases and close to that of the BMA based gPC expansion. They also support that their result holds even when those conditions are violated. It is worth mentioning that, compared to standard compressive sensing methods [14–17], the proposed method provides a probabilistic mechanism that trades off between the bias caused by the omission of gPC bases and the over-fitting.


For the computations required, we design a suitable Markov chain Monte Carlo (MCMC) sampler that does not require pilot runs or ad-hoc tuning techniques. In the present framework, due to the large number (e.g. m) of gPC bases, the number of possible subsets of gPC bases is huge (e.g. 2^m). Therefore, exhaustive computation over all possible gPC expansions is practically infeasible. MCMC based stochastic search samplers can be successfully implemented in the gPC framework to select significant bases and make inference about the unknown quantities of the gPC expansion.

The proposed method is suitable for, but not restricted to, gPC applications where the random input is high-dimensional, the number of available measurements is limited and the solution is sparse with respect to the gPC bases. Here, the solution is defined as sparse if only a very small fraction of gPC bases is dominant and able to recover the stochastic solution. Then, reduction of the number of gPC bases could possibly allow the evaluation of the gPC expansion without sacrificing the accuracy of the approximation significantly, as discussed in [14]. The elliptic SPDE with high-dimensional random coefficients [14,15] provides a suitable scenario of a stochastic system where the computational cost of running the required deterministic solver to collect MC samples is high, and thus the number of MC samples is small, while the stochastic solution is sparse under weak conditions [25,26,14].

This paper is organized as follows. In Section 2, we review the basic concepts of the gPC expansion, with a particular focus on the Legendre polynomial bases. In Section 3, we briefly describe the setup of the elliptic SPDEs on which our numerical examples are based. In Section 4, we describe the proposed method for the evaluation of a sparse gPC representation. In Section 5, we demonstrate the performance of the proposed method on linear elliptic SPDEs with 1-, 14- and 40-random dimensions.

2. Generalized polynomial chaos (gPC)

We consider a stochastic system with solution u(x; ξ) that depends on a vector of spatial input variables x ∈ D and a d-dimensional vector of random input variables ξ ∈ Γ that admits distribution fξ(d·).

The solution u(x; ξ) of the above stochastic system can be represented by an infinite series of polynomial bases ψα(·) and coefficients cα(·) in the tensor form

u(x; ξ) = ∑_{α∈N^d_0} ψα(ξ) cα(x),  (1)

for ξ ∼ fξ(d·) and x ∈ D [2]. We denote by α := α_{1:d} multi-indices of size d defined on the set of non-negative integers N^d_0 := {(α1, . . . , αd): αj ∈ N ∪ {0}}. The family of polynomial bases {ψα(·)}_{α∈N^d_0} contains multi-dimensional orthogonal polynomial bases with respect to the probability measure fξ(d·) of ξ. Each multi-dimensional gPC basis ψα(·) results from tensoring univariate orthogonal polynomial bases ψ_{αj}(·) of degree αj ∈ N^1_0, such that

ψα(ξ_{1:d}) = ∏_{j=1}^{d} ψ_{αj}(ξj), αj ∈ N^1_0,  (2)

where

E_f(ψ_{αj}(ξ) ψ_{αj′}(ξ)) = Zj δ_{0}(j − j′), j, j′ = 1, . . . , d,  (3)

and Zj = E_f(ψ²_{αj}(ξ)). It is common practice that the family of gPC bases ψα(·) is pre-defined with respect to the distribution fξ(d·) that the random input variable ξ admits. Most common distributions can be associated with a specific family of polynomials, e.g. the Askey family [3]; otherwise, one can generate suitable polynomial bases numerically according to [6]. The PC coefficients {cα(x)}_{α∈N^d_0} may be computed as a Galerkin projection of the solution u(x; ξ) onto the space spanned by the polynomial bases {ψα(ξ)}_{α∈N^d_0}, as cα(x) = E_f(u(x; ξ)ψα(ξ))/Zα [2]; however, in practice the integral is intractable and numerical methods are required.

For computational purposes, a truncated version of (1) is used by considering a finite set of polynomials. Traditionally, it is used as

u_p(x; ξ) = ∑_{α∈Λ_{p,d}} ψα(ξ) cα(x),  (4)

which takes into account only a finite set of multi-indices Λ_{p,d} = {α ∈ N^d_0: ∑_{i=1}^{d} αi ⩽ p}, with cardinality m_{p,d} := (p+d)!/(p! d!). Thus we have u(x; ξ) = u_p(x; ξ) + ε_{p,d}(x; ξ), where ε_{p,d} is the systematic error (bias) caused by the truncation. Other forms of truncation can be adopted; see [27].

Theoretical results suggest that for a large enough degree p the truncated u_p(x; ξ) converges to u(x; ξ) in the mean-square sense under mild conditions [2,8]. Following standard approximation theory, and provided that u(x; ξ) is square integrable with respect to fξ(d·), the gPC expansion (4) converges to u(x; ξ) as

Page 4: Selection of polynomial chaos bases via Bayesian model

G. Karagiannis, G. Lin / Journal of Computational Physics 259 (2014) 114–134 117

lim_{p→∞} Eξ(u(x; ξ) − u_p(x; ξ))² = 0.

The rate of the convergence depends on the regularity of u(x; ξ) and the family of bases considered. For example, in the Legendre case with d = 1, for fixed order p, the smoother u(x; ξ) is, the faster the convergence is [2]. Here, smoothness is measured by the differentiability of u.
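For concreteness, the truncated index set Λ_{p,d} of (4) can be enumerated directly when d is small; `total_degree_indices` below is our own illustrative helper, and the check against (p+d)!/(p!d!) confirms the cardinality formula:

```python
from itertools import product
from math import comb

def total_degree_indices(p: int, d: int):
    """Enumerate Lambda_{p,d} = {alpha in N_0^d : alpha_1 + ... + alpha_d <= p}.
    Brute-force enumeration; feasible only for small d."""
    return [alpha for alpha in product(range(p + 1), repeat=d) if sum(alpha) <= p]

indices = total_degree_indices(3, 2)
print(len(indices), comb(3 + 2, 2))   # both equal 10
```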

Given that the gPC expansion u_p(x; ξ) is accurately evaluated, it is possible to describe the uncertainty of u(x; ξ) with respect to ξ by computing the statistics of the expansion. For example, the expectation of the solution is Eξ(u(x; ξ)) ≈ Eξ(u_p(x; ξ)) = c_0(x) and the variance is Varξ(u(x; ξ)) ≈ Varξ(u_p(x; ξ)) = ∑_{α∈Λ_{p,d}−{0}} c²α(x) Zα, where Zα = Eξ(ψ²α(ξ)) for α ∈ Λ_{p,d} [2].

The evaluation of the gPC expansion (4) can be quite a challenge in high-dimensional scenarios where the dimension of the random input variable is large and a high degree of accuracy is required. The number of the PC coefficients to be evaluated, which is equal to the cardinality of Λ_{p,d}, m_{p,d} = (p+d)!/(p! d!), increases dramatically with the dimension d and degree p. As a result, a larger number of measurements (system evaluations) is required for the evaluation of the PC coefficients cα(·). The evaluation of the gPC expansion then becomes prohibitively demanding when the computational cost for evaluations of the system is high. In many cases, the number of available measurements can be smaller than the number of the PC coefficients to be estimated. In such scenarios, traditional estimation methods may give inaccurate or unstable estimates due to over-fitting. Reduction of the gPC degree, or careless omission of gPC bases in order to reduce the dimension of the unknown PC coefficients, might lead to a significant increase of the bias and therefore a poor approximation of the stochastic solution u(x; ξ). Therefore, advanced methods are required for the selection of a subset of significant gPC bases that are able to trade off efficiently between the bias caused by omission of gPC bases and over-fitting.
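The post-processing step above is cheap once the coefficients are known. A sketch for the Legendre/uniform case, where Zα = ∏_j 1/(2αj + 1) for U(−1,1) inputs (the helper name and the toy coefficients are ours):

```python
import numpy as np

def gpc_mean_var(coeffs, indices):
    """Mean and variance of a Legendre-gPC expansion with U(-1,1) inputs.
    indices[0] must be the all-zero multi-index; Z_alpha = prod_j 1/(2*alpha_j+1)."""
    mean = coeffs[0]                        # E u ~ c_0
    var = 0.0
    for c, alpha in zip(coeffs[1:], indices[1:]):
        Z = np.prod([1.0 / (2 * a + 1) for a in alpha])
        var += c**2 * Z                     # Var u ~ sum_{alpha != 0} c_alpha^2 Z_alpha
    return mean, var

# Toy expansion u = 1 + 2*psi_(1,0) + 0.5*psi_(0,2) at one spatial point.
mean, var = gpc_mean_var([1.0, 2.0, 0.5], [(0, 0), (1, 0), (0, 2)])
```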

3. Elliptic SPDEs involving sparse solutions

The elliptic SPDE as described in [14,15] provides a suitable scenario of a stochastic system whose stochastic solution is sparse while the computational cost of running the corresponding deterministic solver required to collect samples is high.

We define a complete probability space (Ω, F, P), where P is a probability measure on the σ-field F. We consider the elliptic SPDE

−∇ · (a(x;ω)∇u(x;ω)) = b(x), x ∈ D,  (5)
u(x;ω) = 0, x ∈ ∂D,

P-a.s., ω ∈ Ω, defined on a bounded Lipschitz continuous domain D ⊂ R^D, D = 1, 2, 3, with boundary ∂D.

The diffusion coefficient a(x;ω) is an unknown stochastic function defined on (Ω, F, P) and therefore it is the source of uncertainty. We allow a(x;ω) to be modeled as a truncated Karhunen–Loève (K–L) expansion such as

a(x;ω) = ā(x) + σa ∑_{j=1}^{d} √λj φj(x) ξj(ω),  (6)

where ξ := (ξ1, . . . , ξd) is a d-dimensional random variable defined as ξ: Ω → Γ that admits distribution fξ(d·), {(λj, φj)}_{j=1}^{d} are pairs of eigenvalues and eigenfunctions of the covariance function C_aa(x1, x2) ∈ L²(D × D) of a(x;ω), ā(x) is the mean of a(x;ω), and σa is the standard deviation that controls the variability of a(x;ω).
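A discrete analogue of (6) can be sketched by eigendecomposing a covariance kernel on a grid; the exponential kernel, grid size and parameter values below are our own illustrative choices, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)           # spatial grid on D = [0, 1]
corr_len, sigma_a, a_bar, d = 0.5, 0.1, 1.0, 14

# Covariance matrix of an exponential kernel C(x1, x2) = exp(-|x1 - x2| / corr_len).
C = np.exp(-np.abs(x[:, None] - x[None, :]) / corr_len)
eigval, eigvec = np.linalg.eigh(C)       # eigh returns ascending eigenvalues
eigval, eigvec = eigval[::-1][:d], eigvec[:, ::-1][:, :d]   # keep the d largest

h = x[1] - x[0]
lam = eigval * h                         # continuum eigenvalue approximation
phi = eigvec / np.sqrt(h)                # L2(D)-orthonormal eigenfunctions on the grid

# One realization of the truncated K-L field, with uniform inputs as in the paper.
xi = rng.uniform(-1.0, 1.0, size=d)
a = a_bar + sigma_a * phi @ (np.sqrt(lam) * xi)
```

The h-scalings convert the matrix eigenpairs into approximations of the continuum eigenvalues and L²-orthonormal eigenfunctions of the kernel.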

Along the same lines as in [14,15], we assume that a(x; ξ) satisfies the following conditions (C-1, C-2):

C-1: For all x ∈ D, there exist constants a_min and a_max such that 0 < a_min ⩽ a(x;ω) ⩽ a_max < ∞, P-a.s. ω ∈ Ω.
C-2: The covariance function C_aa(x1, x2) is piecewise analytic on D × D [26,28], implying that there exist constants c1, c2 ∈ R such that 0 ⩽ λj ⩽ c1 exp(−c2 j^{1/D}) and, for every multi-index a ∈ N^d, √λj ‖∂^a φj‖_{L∞(D)} ⩽ c1 exp(−c2 j^{1/D}), for j = 1, . . . , d.

Similar to [14], we consider that {ξj}_{j=1}^{d} are independently and identically distributed with respect to U(−1, 1), and therefore fξ(ξ) := ∏_{j=1}^{d} U(ξj | −1, 1) and Γ := [−1, 1]^d. However, other joint distributions can be considered, such as those in [15]. Considering that the diffusion coefficient is modeled as in (6), and therefore a(x; ξ) := a(x;ω) where a: D × Γ → R, the stochastic solution u(x;ω) of (5) can be represented as u(x; ξ), where u: D × Γ → R. Condition C-1 guarantees that the solution u(x; ξ) is analytic with respect to the random inputs ξ. Also, condition C-2 ensures the existence of a sparse representation for the SPDE (5), as discussed in [26].

The solution u(x; ξ) of the SPDE (5) can be represented as a gPC expansion of Legendre polynomials. Given conditions C-1 and C-2, the solution is analytic with respect to ξ and the gPC approximation converges exponentially fast in the mean-square sense as the gPC degree increases [2,8]. In cases where the deterministic solver of the SPDE (5) is computationally expensive, only a small number n of evaluations is available. Also, when the K–L truncation degree d is high, the number m_{p,d} of the terms of a gPC expansion can be prohibitively large, so that n ≪ m_{p,d}. Because the stochastic solution u(x; ξ) is nearly sparse, it can be accurately represented by a gPC expansion that includes a smaller subset of significant gPC bases than Λ_{p,d}.


4. Evaluation of a sparse gPC expansion

We consider a stochastic solution u(x; ξ) with d-dimensional random input variables ξ := ξ_{1:d}, ξ ∈ Γ, and spatial input variable x ∈ D of a system such as the elliptic SPDE described in (5). We assume that the spatial input variable x is discretized on a grid of q spatial points {xk}_{k=1}^{q}. We consider that a sample {u_{i,1:q}, ξ_{i,1:d}}_{i=1}^{n} of n evaluations of the stochastic system is available, where i is the sample index and u_{i,1:q} := u(x_{1:q}; ξ_{i,1:d}) is the discretized solution evaluated by a deterministic solver, for i = 1, . . . , n.

The gPC expansion (4) can be re-written in a vectorized form as u_{i,k} = Ψ_{i,γ} c_{γ,k} + ε_{i,k}, where γ := (γ1, . . . , γm) is a binary vector such that γj = 1 if αj ∈ Λ_{p,d} and γj = 0 otherwise, c_{j,k} = c_{αj}(xk), Ψ_{i,j} = ψ_{αj}(ξi), u_{i,k} = u(xk; ξ_{i,1:d}), and ε_{i,k} ∈ R is a residual error term, for i = 1, . . . , n and k = 1, . . . , q. Here, we adopt a MATLAB-like notation for the representation of the matrices and their sub-matrices; e.g. c_{γ,k} includes only the j-th elements of the k-th column of matrix c that correspond to γj = 1, for j = 1, . . . , m. Hereafter, we use m := m_{p,d} to simplify the notation, since the values p and d are considered as pre-defined.
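In this regression form, each row of Ψ evaluates the tensorized bases at one sample of ξ. A minimal construction for Legendre bases (orthogonal with respect to U(−1,1)), with `design_matrix` as our own helper:

```python
import numpy as np
from numpy.polynomial.legendre import legval
from itertools import product

def design_matrix(xis, indices):
    """Psi[i, j] = psi_{alpha_j}(xi_i), with psi_alpha a tensor product of
    1-D Legendre polynomials evaluated at the i-th sample xi_i in [-1, 1]^d."""
    Psi = np.ones((xis.shape[0], len(indices)))
    for j, alpha in enumerate(indices):
        for dim, deg in enumerate(alpha):
            if deg > 0:
                coef = np.zeros(deg + 1)
                coef[deg] = 1.0                     # selects the Legendre P_deg
                Psi[:, j] *= legval(xis[:, dim], coef)
    return Psi

rng = np.random.default_rng(1)
indices = [a for a in product(range(4), repeat=2) if sum(a) <= 3]   # Lambda_{3,2}
xis = rng.uniform(-1.0, 1.0, size=(50, 2))                          # n = 50 samples
Psi = design_matrix(xis, indices)                                   # 50 x 10 matrix
```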

The proposed strategy proceeds in three steps: (1) define the Bayesian model, (2) perform an MCMC stochastic search, and (3) evaluate the gPC expansion either by BMA or based on a single subset of gPC bases using the median probability model. These steps are explained in detail in the following sections.

4.1. The Bayesian statistical model in the augmented space

We define a Bayesian hierarchical model on the augmented model space with constant dimension, similarly to [18–20]. We consider a collection of statistical models C := {Mγ; γ ∈ {0, 1}^m} such that

Mγ: u_{i,k} = Ψ_{i,γ} c_{γ,k} + ε_{i,k}, ε_{i,k} ∼ N(0, h²), i = 1, . . . , n, k = 1, . . . , q,

where cγ ∈ R^{mγ}, mγ = ∑_{j=1}^{m} γj. Each model Mγ is associated with a likelihood

L(u_{1:n,1:q} | Ψ, γ, cγ, h) = ∏_{i=1}^{n} ∏_{k=1}^{q} N(u_{i,k} | Ψ_{i,γ} c_{γ,k}, h²),

which depends on the model index γ and the model parameters c_{γ,1:q} and h. The inclusion variable γ indicates which of the gPC bases are significant enough to be included in the expansion. Different models in the collection C are characterized by different 'inclusion variables' γ, and therefore they represent gPC expansions defined on different subsets of gPC bases.

We assign independent Bernoulli priors on each individual inclusion variable {γj}_{j=1}^{m}, such that π(γ|ρ) = ∏_{j=1}^{m} Bernoulli(γj|ρ). Therefore, prior information about the sparsity of the gPC expansion can be included by properly adjusting ρ; e.g. smaller values of ρ lead to sparser representations. In fact, a priori, the expected number of significant gPC bases is E(mγ) = m · ρ. Alternatively, one could have considered the sparsity hyper-parameter ρ dependent on j, if prior information about each individual gPC basis was available.

Given γ , we assign individual Double Exponential prior distributions on the discretized significant PC coefficients cγ ,1:qsuch that π(cγ ,1:q|γ ) = ∏

j:γ j=1∏q

k=1 DE(c j,k|0, λ/h). The use of Double Exponential priors encourages shrinkage of the dis-cretized PC coefficients towards zero, similar to standard LASSO, within each statistical model [29,30,20]. This may facilitatethe recovery of sparse representations.

The maximum a posteriori (MAP) estimate, which is equal to the mode of π(c_{γ,1:q}|u, Ψ, γ, ρ, h, λ), coincides with the ordinary frequentist LASSO estimate suggested by [31]. Given the inclusion variable γ, the full conditional posterior distribution density of the discretized gPC coefficients π(c_{γ,1:q}|u, Ψ, γ, ρ, h, λ), in log scale and up to a normalizing constant, is

log(π(c_{γ,1:q}|u, Ψ, γ, ρ, h, λ)) = −(1/2)(1/h²) ∑_{k=1}^{q} (‖u_{1:n,k} − Ψ_{1:n,γ} c_{γ,k}‖²₂ + 2λh ‖c_{γ,k}‖₁) + const.
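Since the conditional MAP coincides with the frequentist LASSO, the shrinkage behaviour can be illustrated with a minimal coordinate-descent solver for ‖u − Ψc‖²₂ + s‖c‖₁; this is our own sketch for a single spatial point, with `shrink` playing the role of 2λh:

```python
import numpy as np

def lasso_cd(Psi, u, shrink, n_iter=200):
    """Coordinate-descent LASSO: minimizes ||u - Psi @ c||_2^2 + shrink * ||c||_1,
    i.e. the (rescaled, negated) log conditional posterior under the DE priors."""
    c = np.zeros(Psi.shape[1])
    col_sq = (Psi**2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(Psi.shape[1]):
            r = u - Psi @ c + Psi[:, j] * c[j]          # residual without column j
            rho_j = Psi[:, j] @ r
            # Soft-thresholding update: small coefficients are set exactly to zero.
            c[j] = np.sign(rho_j) * max(abs(rho_j) - shrink / 2.0, 0.0) / col_sq[j]
    return c

# Sparse recovery on synthetic data: only coefficients 1 and 4 are active.
rng = np.random.default_rng(2)
Psi = rng.normal(size=(40, 10))
c_true = np.zeros(10); c_true[1], c_true[4] = 2.0, -1.5
u = Psi @ c_true + 0.01 * rng.normal(size=40)
c_hat = lasso_cd(Psi, u, shrink=0.5)
```

The exact zeros produced by the soft-thresholding step are the optimization-side analogue of the sparsity that the double exponential priors encourage.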

Here, the shrinkage parameter is equal to 2λh.

We supplement the hierarchical Bayesian model with artificial pseudo-priors Q(c_{−γ}|γ) on c_{−γ,1:q}, where c_{−γ,1:q} denotes the discretized PC coefficients associated with the non-significant gPC bases. We consider that the unused parameters c_{−γ,1:q} are a priori independent of each other and also of c_{γ,1:q}, given γ, namely Q(c_{−γ}|γ) = ∏_{j:γj=0} ∏_{k=1}^{q} Q(c_{j,k}). Any proper distribution can be considered as pseudo-prior without affecting the marginal distributions of the rest of the parameters. Here, we consider Q(·) = Dirac(·|0), a Dirac distribution at point zero.

On the residual standard deviation h, we assign a half normal prior distribution, such that π(h|ah, bh) = N+(h|ah, bh). The limiting case ah = 0 and bh → ∞ leads to a non-informative but improper prior, π(h|ah, bh) ∝ 1, that can be applied when no prior information about h is available. We assigned a half normal prior on h mainly for practical reasons and due to the lack of an obvious conjugate prior, because we believe that it is easy to include information or ignorance by suitably adjusting the location ah and scale bh parameters. Considering the residual variance as fixed might give misleading results, because inference can be sensitive to this parameter.


To account for uncertainty about the shrinkage parameter λ, we assign a Gamma hyper-prior distribution with parameters aλ and bλ, namely π(λ|aλ, bλ) = G(λ|aλ, bλ). Non-informative priors may be considered by choosing small values for aλ and bλ, or by following [32]. However, prior information can be incorporated, if available, by matching the moments of the priors. By assigning a prior on λ, we let the data determine which value is suitable, without the need of ad-hoc methods such as the cross-validation type algorithms that some other methods require.

Finally, we extend the hierarchical model by assigning a beta hyper-prior on the sparsity hyper-parameter ρ, with density π(ρ) = Be(ρ|aρ, bρ). When there is no prior information about ρ, one can consider a uniform distribution by setting aρ = bρ = 1. By assigning a prior on ρ, we let the data determine which value is appropriate, as a careless choice of a fixed ρ might give misleading results.

The prior model π(γ, ρ, c, h, λ | aρ, bρ, ah, bh, aλ, bλ) is summarized as

• π(γ|ρ) = ∏_{j=1}^{m} Bernoulli(γj|ρ);
• π(c_{γ,1:q}|γ, h, λ) = ∏_{j:γj=1} ∏_{k=1}^{q} DE(c_{j,k}|0, λ/h);
• π(h|ah, bh) = N+(h|ah, bh);
• π(λ|aλ, bλ) = G(λ|aλ, bλ);
• π(ρ|aρ, bρ) = Be(ρ|aρ, bρ),

with pseudo-prior

• Q(c_{−γ}|γ) = ∏_{j:γj=0} ∏_{k=1}^{q} δ0(c_{j,k}).

According to Bayes' theorem, the full posterior on the augmented model space can be expressed as

π(γ, c, h, λ, ρ | u, Ψ) = [L(u|Ψ, γ, c_{γ,1:q}, h) π(c_{γ,1:q}|γ, h, λ) π(h) π(λ) π(γ|ρ) π(ρ) / f(u, Ψ)] · Q(c_{−γ,1:q}|γ)
= π(γ, c_{γ,1:q}, h, λ, ρ | u, Ψ) · Q(c_{−γ,1:q}|γ),  (7)

defined on Θ = {0, 1}^m × R^{q·m} × (0,∞) × (0,∞) × (0, 1). The posterior distribution π(γ, cγ, h, λ, ρ | u, Ψ), which is of main interest here, is

π(γ, c_{γ,1:q}, h, ρ, λ | u, Ψ) = L(u|Ψ, γ, c_{γ,1:q}, h) π(γ|ρ) π(c_{γ,1:q}|γ, h, λ) π(h) π(ρ) π(λ) / f(u, Ψ)
= π(cγ, h, λ | u, Ψ, γ, ρ) π(γ | u, Ψ, ρ) π(ρ | u, Ψ),  (8)

where f(u, Ψ) = ∑_{γ∈{0,1}^m} ∫_{Θγ} L(u|Ψ, γ, c_{γ,1:q}, h) π(γ|ρ) π(c_{γ,1:q}|γ, h, λ) π(h) π(ρ) π(λ) and Θγ = R^{mγ} × (0, 1) × (0,∞) × (0,∞), and it has density proportional to

π̃(γ, c_{γ,1:q}, h, ρ, λ | u, Ψ) = ∏_{k=1}^{q} ∏_{i=1}^{n} (1/h²)^{1/2} exp(−(1/(2h²)) (u_{i,k} − Ψ_{i,γ} c_{γ,k})²)
× (1/2^{q·mγ}) (λ/h)^{q·mγ} exp(−(λ/h) ∑_{k=1}^{q} ∑_{j:γj=1} |c_{j,k}|)
× ∏_{j=1}^{m} ρ^{γj} (1 − ρ)^{1−γj} × ρ^{aρ−1} (1 − ρ)^{bρ−1}
× exp(−(1/2) (h − ah)²/b²h)
× λ^{aλ−1} exp(−bλλ).  (9)
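For reference, the unnormalized density (9) translates factor by factor into a log-density routine; the function name and the array layout (u ∈ R^{n×q}, c ∈ R^{m×q}) are our own:

```python
import numpy as np

def log_density_9(gamma, c, h, rho, lam, u, Psi, a_h, b_h, a_lam, b_lam, a_rho, b_rho):
    """Log of the unnormalized joint density (9). gamma: boolean mask over the
    m gPC bases; c: (m, q) discretized PC coefficients; u: (n, q); Psi: (n, m)."""
    n, q = u.shape
    m_g = int(gamma.sum())
    resid = u - Psi[:, gamma] @ c[gamma]                        # (n, q) residuals
    log_lik = -n * q * np.log(h) - 0.5 * (resid**2).sum() / h**2
    log_de = q * m_g * np.log(lam / (2.0 * h)) - (lam / h) * np.abs(c[gamma]).sum()
    log_gam = m_g * np.log(rho) + (len(gamma) - m_g) * np.log(1.0 - rho)
    log_rho = (a_rho - 1) * np.log(rho) + (b_rho - 1) * np.log(1.0 - rho)
    log_h = -0.5 * (h - a_h)**2 / b_h**2
    log_lam = (a_lam - 1) * np.log(lam) - b_lam * lam
    return log_lik + log_de + log_gam + log_rho + log_h + log_lam

# Sanity check: perturbing the data away from the fitted values lowers the density.
rng = np.random.default_rng(3)
Psi = rng.normal(size=(8, 5)); c = rng.normal(size=(5, 3))
gamma = np.array([True, False, True, False, False])
u = Psi[:, gamma] @ c[gamma]
lp0 = log_density_9(gamma, c, 1.0, 0.3, 1.0, u, Psi, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0)
lp1 = log_density_9(gamma, c, 1.0, 0.3, 1.0, u + 1.0, Psi, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0)
```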

The posterior distribution density (8) and its marginals are not available in explicit form, and therefore the evaluation of the associated expectations cannot be performed analytically. Hence, algorithms for the numerical evaluation of the posterior expectations are required.

4.2. MCMC stochastic search

Exhaustive enumeration of all 2^m possible statistical models (gPC expansions) for the estimation of posterior quantities of interest, and therefore of the gPC expansion, is practically impossible when m is large. We design a suitable MCMC stochastic search sampler that targets (7) and visits each model proportionally to its marginal posterior probability.


We design a blockwise MCMC algorithm that targets the distribution π(γ, c, h, ρ, λ | u, Ψ) in (8) with a symmetric sweep of 3 main blocks that update the array of random parameters (γ, c_{γ,1:q}, c_{−γ,1:q}, h, λ, ρ) iteratively. The first block updates (γ, c) by jointly drawing pairs (γ_j, c_{j,1:q}) from the associated full conditional distributions for j = 1, …, m successively. Updating γ_j and c_{j,1:q} jointly enjoys better convergence properties than updating γ_j and c_{j,1:q} individually from their full conditionals [18]. The second block updates the standard deviation parameter h with an adaptive Metropolis–Hastings algorithm, while the third block updates the shrinkage parameter and the sparsity hyper-parameter (λ, ρ) in a Gibbs step. In what follows, we present the blocks of an MCMC sweep in detail. The algorithm is presented as pseudo-code in Algorithm 2.

Direct sampling from the full conditional posterior of (γ_j, c_{j,1:q}) is possible for j = 1, …, m. This can be performed by decomposing the associated conditional posterior distribution as

π(γ_j, c_{j,1:q} | u, Ψ, γ_{−j}, c_{−j,1:q}, ρ, h, λ) = Pr(γ_j | u, Ψ, c_{γ_{−j},1:q}, ρ, h, λ) π(c_{j,1:q} | u, Ψ, γ, c_{γ_{−j},1:q}, ρ, h, λ),   (10)

for j = 1, …, m. The conditional probability of γ_j on the right-hand side of (10) can be computed by integrating out c_{j,1:q} from (9) and computing

P_j = [1 + ∫_{c_{j,1:q}} π(1−γ_j, c_{γ_j,1:q}, h, ρ, λ | γ_{−j}, c_{γ_{−j},1:q}, u, Ψ) / ∫_{c_{j,1:q}} π(γ_j, c_{γ_j,1:q}, h, ρ, λ | γ_{−j}, c_{γ_{−j},1:q}, u, Ψ)]^{−1}.

Then Pr(γ_j | u, Ψ, c_{γ_{−j},1:q}, ρ, h, λ) = Bernoulli(γ_j | P_j), where

P_j = 1 − [1 + (1/2^q) (ρ^q/(1−ρ)^q) (λ/h)^q ∏_{k=1}^q ( φ(−μ⁻_{j,k}/s_{j,k}) / N(0|μ⁻_{j,k}, s²_{j,k}) + φ(μ⁺_{j,k}/s_{j,k}) / N(0|μ⁺_{j,k}, s²_{j,k}) )]^{−1},   (11)

μ⁻_{j,k} = (Ψᵀ_{1:n,j} Ψ_{1:n,j})^{−1} [Ψᵀ_{1:n,j} (u_{1:n,k} − Ψ_{1:n,γ_{−j}} c_{γ_{−j},k}) + hλ],   (12)

μ⁺_{j,k} = (Ψᵀ_{1:n,j} Ψ_{1:n,j})^{−1} [Ψᵀ_{1:n,j} (u_{1:n,k} − Ψ_{1:n,γ_{−j}} c_{γ_{−j},k}) − hλ],   (13)

s²_{j,k} = h² (Ψᵀ_{1:n,j} Ψ_{1:n,j})^{−1},   (14)

for j = 1, …, m and k = 1, …, q. The full conditional posterior distribution of c_{j,1:q} decomposes into

π(c_{j,1:q} | u, Ψ, γ, c_{γ_{−j},1:q}, ρ, h, λ) = ∏_{k=1}^q π(c_{j,k} | u, Ψ, γ, c_{γ_{−j},k}, ρ, h, λ),

and therefore the c_{j,k} are a posteriori independent across k, given γ. The distribution π(c_{j,k} | u, Ψ, γ, c_{γ_{−j},k}, ρ, h, λ) admits a density such that

π(c_{j,k} | u, Ψ, γ, h, ρ, λ) ∝ { δ_0(c_{j,k}),  if γ_j = 0,
                                 ∏_{i=1}^n (1/√(h²)) exp(−(1/(2h²)) (u_{i,k} − Ψ_{i,γ} c_{γ,k})²) exp(−(λ/h) ∑_{j: γ_j=1} |c_{j,k}|),  if γ_j = 1,   (15)

for j = 1, …, m and k = 1, …, q. After separating the term in (15) corresponding to the double exponential prior into its positive and negative parts, we compute

π(c_{j,k} | u, Ψ, γ, c_{γ_{−j},k}, ρ, h, λ) = { δ_0(c_{j,k}),  if γ_j = 0,
                                               w_{j,k} N_−(c_{j,k} | μ⁻_{j,k}, s²_{j,k}) + (1 − w_{j,k}) N_+(c_{j,k} | μ⁺_{j,k}, s²_{j,k}),  if γ_j = 1,   (16)

where

w_{j,k} = [1 + (φ(μ⁺_{j,k}/s_{j,k}) / N(0|μ⁺_{j,k}, s²_{j,k})) (N(0|μ⁻_{j,k}, s²_{j,k}) / φ(−μ⁻_{j,k}/s_{j,k}))]^{−1},   (17)

is the weight in favor of the negative part, for j = 1, …, m and k = 1, …, q. Here, φ(·) denotes the cumulative distribution function of the standard normal distribution.

Sampling from (11) is straightforward; see [1]. Direct sampling from (16) is possible through composition sampling [1]. When γ_j = 1, the full conditional distribution of c_{j,k} in (16) can be viewed as a mixture of two components, the truncated normal distributions N_+(·,·) and N_−(·,·), with weights 1 − w_{j,k} and w_{j,k} respectively. Samples can then be obtained by selecting a component with probability w_{j,k} or 1 − w_{j,k} and drawing from the corresponding distribution. Sampling from truncated normal distributions is performed by rejection sampling [33]. Alternatively, an inverse-CDF method is proposed in [34]; however, it requires evaluations of inverse error functions.
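The composition step can be sketched as follows; here we draw the truncated normals with `scipy.stats.truncnorm` rather than the rejection sampler of [33], and the numerical values of μ⁻, μ⁺, s and w are placeholders of ours:

```python
import numpy as np
from scipy.stats import truncnorm

def draw_c_jk(mu_minus, mu_plus, s, w, rng):
    """Composition sampling from the mixture (16), gamma_j = 1 case:
    with probability w draw from N_-(mu_minus, s^2) on (-inf, 0),
    otherwise from N_+(mu_plus, s^2) on (0, inf)."""
    if rng.random() < w:
        # truncnorm takes standardized bounds (a, b) = ((lo-loc)/s, (hi-loc)/s)
        return truncnorm.rvs(-np.inf, -mu_minus / s, loc=mu_minus, scale=s,
                             random_state=rng)
    return truncnorm.rvs(-mu_plus / s, np.inf, loc=mu_plus, scale=s,
                         random_state=rng)

rng = np.random.default_rng(1)
samples = np.array([draw_c_jk(-0.3, 0.5, 0.2, 0.4, rng) for _ in range(1000)])
```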

The full conditional posterior distribution of h is π(h | u, Ψ, γ, c_{γ,1:q}, ρ, λ), with density such that

π(h | u, Ψ, γ, c_{γ,1:q}, ρ, λ) ∝ h^{−(n+m_γ)q} exp(−(1/(2h²)) ∑_{k=1}^q ‖u_{1:n,k} − Ψ_{1:n,γ} c_{γ,k}‖²_2 − (λ/h) ∑_{k=1}^q ‖c_{γ,k}‖_1 − (1/2) (h − a_h)²/b_h²).   (18)


This is not a standard distribution that can be sampled from directly. Nevertheless, the density (18) is available up to an unknown normalizing constant, and the parameter h can be updated by a random walk Metropolis (RWM) step on the log scale [35]. The RWM step is calibrated to achieve optimal performance by tuning the expected acceptance probability of the update to be around 0.5, as suggested by [36]. The calibration of the RWM proposals is performed using a stochastic adaptation scheme [37], preferably during the burn-in, i.e. the first T_0 ∈ N iterations, which are discarded from the sample at the end. More precisely, given initial values h^{(0)}, r^{(0)}, m^{(0)}, S^{(0)} at t = 0, the adaptive RWM that targets π(h | u, Ψ, γ, c_{γ,1:q}, ρ, λ) proceeds as follows. At iteration t, given that the Markov chain is at state h^{(t)}, propose a value h′ = h^{(t)} exp(z′) with z′ ∼ N(0, r^{(t)} S^{2,(t)}), and accept h^{(t+1)} = h′ as the next state of the chain with probability a^{(t)}_{h,h′} = min{1, [π(h′|u, Ψ, …)/π(h^{(t)}|u, Ψ, …)] · h′/h^{(t)}}. Next, the adaptation parameters r, m and S² are updated according to

log(r^{(t+1)}) = log(r^{(t)}) + θ_t (a^{(t)}_{h,h′} − ā),   (19)

m^{(t+1)} = m^{(t)} + θ_t (h^{(t+1)} − m^{(t)}),   (20)

S^{2,(t+1)} = S^{2,(t)} + θ_t ((h^{(t+1)} − m^{(t)})² − S^{2,(t)}),   (21)

where the target acceptance rate is ā = 0.5 and θ_t = (t + 1)^{−α} with α ∈ (0.5, 1). The pseudo-code of the adaptive RWM update is summarized in Algorithm 1.

Algorithm 1 Adaptive RWM update for h, at iteration t.
1. Draw h′ = h^{(t)} exp(z′) with z′ ∼ N(0, r^{(t)} S^{2,(t)}),
2. Set h^{(t+1)} = h′ with probability a^{(t)}_{h,h′}, or h^{(t+1)} = h^{(t)} otherwise,
3. Compute r^{(t+1)}, m^{(t+1)}, S^{2,(t+1)} according to (19), (20), (21), if t < T_0.

Direct sampling from the full conditional posterior distributions of the shrinkage parameter λ and the hyper-parameter ρ is possible. It holds that π(λ, ρ | u, Ψ, γ, c_{γ,1:q}, h) = π(λ | u, Ψ, γ, c_{γ,1:q}, h) π(ρ | u, Ψ, γ), where the full conditional posterior distribution of λ is G(q·m_γ + a_λ, (1/h) ∑_{k=1}^q ‖c_{γ,k}‖_1 + b_λ) and that of ρ is Beta(m_γ + a_ρ, m − m_γ + b_ρ).

Algorithm 2 can be implemented in parallel. Given γ, the parameters ρ and λ, as well as the coefficients c_{j,k} and c_{j,k′} for j = 1, …, m and k ≠ k′ = 1, …, q, are a posteriori conditionally independent. Thus, updates within Block I.3 and Block III can be performed in parallel. A parallel implementation of Algorithm 2 can dramatically reduce the computational cost in CPU time. Moreover, in Block I, only significant PC coefficients, namely c_{j,1:q} such that γ_j = 1, need to be updated or considered in the computation of the quantities (12), (13) and (14).

Algorithm 2 Blocks of the MCMC sweep.
Block I Update (γ_j, c_{j,1:q}), for j = 1, …, m:
  1. Compute μ⁻_{j,k}, μ⁺_{j,k} and s²_{j,k} according to (12), (13) and (14), for k = 1, …, q,
  2. Update γ_j: draw γ_j from Bernoulli(P_j), according to (11),
  3. Update c_{j,k}: draw c_{j,k} from π(c_{j,k} | u, Ψ, γ, c_{γ_{−j},k}, ρ, h, λ) according to (16), for k = 1, …, q,
Block II Update h: sample h according to Algorithm 1,
Block III Update ρ: sample from Beta(m_γ + a_ρ, m − m_γ + b_ρ),
  Update λ: sample from G(q·m_γ + a_λ, (1/h) ∑_{k=1}^q ‖c_{γ,k}‖_1 + b_λ).
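The joint (γ_j, c_{j,1:q}) draw of Block I can be sketched as below, following (11)–(17). This is our own illustration: the function name and the synthetic setup are assumptions, and, unlike a production implementation, it evaluates (11) and (17) directly rather than on the log scale, so it is only safe for moderate signal-to-noise settings:

```python
import numpy as np
from scipy.stats import norm, truncnorm

def update_gamma_c_j(j, gamma, c, u, Psi, h, lam, rho, rng):
    """Jointly redraw (gamma_j, c_{j,1:q}) from the full conditional (10),
    using the quantities (11)-(14) and composition sampling from (16)-(17)."""
    n, q = u.shape
    psi_j = Psi[:, j]
    gtg = psi_j @ psi_j                      # Psi_{1:n,j}^T Psi_{1:n,j}
    s = np.sqrt(h**2 / gtg)                  # (14)
    mask = gamma.copy(); mask[j] = False     # gamma_{-j}
    resid = u - Psi[:, mask] @ c[mask]       # residuals excluding basis j
    proj = psi_j @ resid / gtg
    mu_m = proj + h * lam / gtg              # (12)
    mu_p = proj - h * lam / gtg              # (13)
    # inclusion probability (11)
    terms = (norm.cdf(-mu_m / s) / norm.pdf(0.0, mu_m, s)
             + norm.cdf(mu_p / s) / norm.pdf(0.0, mu_p, s))
    odds = (rho / (1 - rho))**q * (lam / (2 * h))**q * np.prod(terms)
    gamma[j] = rng.random() < 1 - 1 / (1 + odds)
    if not gamma[j]:
        c[j] = 0.0
        return
    # mixture weight (17), then composition sampling from (16)
    w = 1 / (1 + norm.cdf(mu_p / s) * norm.pdf(0.0, mu_m, s)
             / (norm.pdf(0.0, mu_p, s) * norm.cdf(-mu_m / s)))
    for k in range(q):
        if rng.random() < w[k]:
            c[j, k] = truncnorm.rvs(-np.inf, -mu_m[k] / s, loc=mu_m[k],
                                    scale=s, random_state=rng)
        else:
            c[j, k] = truncnorm.rvs(-mu_p[k] / s, np.inf, loc=mu_p[k],
                                    scale=s, random_state=rng)

# synthetic check: only basis 0 carries signal
rng = np.random.default_rng(6)
n, m, q = 30, 5, 2
Psi = rng.standard_normal((n, m))
c_true = np.zeros((m, q)); c_true[0] = 0.5
u = Psi[:, :1] @ c_true[:1] + 0.3 * rng.standard_normal((n, q))
gamma, c = np.zeros(m, dtype=bool), np.zeros((m, q))
for _ in range(20):                          # a few sweeps over all bases
    for j in range(m):
        update_gamma_c_j(j, gamma, c, u, Psi, h=0.3, lam=1.0, rho=0.5, rng=rng)
```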

4.3. Evaluation

According to standard MCMC theory, Algorithm 2 generates a reversible, irreducible and aperiodic Markov chain {(γ^{(t)}, c_γ^{(t)}, h^{(t)}, ρ^{(t)}, λ^{(t)})}_{t=1}^T of length T that is distributed according to the posterior distribution π(γ, c_γ, h, ρ, λ | u, Ψ) in the limit. The generated sample can be used to evaluate the gPC expansion, compute point estimates, and make inference about the unknown coefficients of the expansion. Below, we describe two approaches for the evaluation of the gPC expansion: the BMA based gPC and the median probability model based gPC. The former is preferable because it has optimal predictive ability, while the latter is appropriate when a sparse gPC expansion is needed, because it evaluates the expansion on a subset of bases.

4.3.1. Median probability model based evaluation
If a sparse representation of the stochastic solution is the main interest, then we need to evaluate the gPC expansion based on only a small subset of significant gPC bases, without loss of precision. [24] suggest that under certain conditions the median probability model – here, it corresponds to the gPC expansion on the subset of gPC bases associated with


inclusion probabilities greater than 0.5 – presents optimal predictive ability. Moreover, they discuss that their results hold even when these conditions are violated.

According to the median probability model, the subset of significant gPC bases Λ_{p,d}(γ̂), corresponding to γ̂, is such that γ̂_j = δ_{(0.5,1)}(Pr(γ_j | u, Ψ)), where Pr(γ_j | u, Ψ) = ∫_{Θ_γ} π(γ, c_{γ,1:q}, h, ρ, λ | u, Ψ) d(c_{γ,1:q}, h, ρ, λ). The marginal inclusion posterior probabilities {Pr(γ_j | u, Ψ)}_{j=1}^m are estimated as the relative number of times that each gPC basis occurs in the sample, namely P̂r(γ_j | u, Ψ) = (1/T) ∑_{t=1}^T δ_1(γ_j^{(t)}). Alternatively, Rao–Blackwellized estimates, which have smaller relative error [38], can be obtained as P̂r^{(RB)}(γ_j | u, Ψ) = (1/T) ∑_{t=1}^T P_j^{(t)}, for j = 1, …, m. In our numerical examples (Section 5), we report the former estimator, as we did not notice any significant difference between the two in the results. Therefore, the subset of significant gPC bases to be included in the gPC expansion is estimated as Λ_{p,d}(γ̂) = {α_j ∈ Λ_{p,d}: γ̂_j = 1; j = 1, …, m}, which contains only the indices of the full set Λ_{p,d} that correspond to inclusion variable γ̂_j = 1, for j = 1, …, m.
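In code, both steps reduce to column averages of the sampled inclusion matrix; the sketch below uses a hypothetical sample of inclusion vectors in place of real MCMC output:

```python
import numpy as np

rng = np.random.default_rng(3)
T, m = 2000, 6
# hypothetical MCMC draws gamma^(t): bases 0-2 dominate, bases 3-5 are rare
true_probs = np.array([0.95, 0.90, 0.80, 0.10, 0.05, 0.02])
gamma_draws = rng.random((T, m)) < true_probs     # (T, m) boolean matrix

incl_prob = gamma_draws.mean(axis=0)              # Pr-hat(gamma_j | u, Psi)
median_model = incl_prob > 0.5                    # median probability model
```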

Given a selected subset Λ_{p,d}(γ̂) of significant gPC bases, the next step is to estimate the significant PC coefficients c_{γ̂}(x) that correspond to the significant inclusion variables γ̂. First, we compute estimates of the discretized PC coefficients {c_{γ̂}(x_k)}_{k=1}^q on the spatial grid {x_k}_{k=1}^q, namely {c_{γ̂,k} = c_{γ̂}(x_k); k = 1, …, q}. Following standard MCMC practice, they can be estimated by the ergodic average ĉ_{γ̂,k} = ∑_{t=1}^T c_{γ̂,k}^{(t)} δ_{γ̂}(γ^{(t)}) / ∑_{t=1}^T δ_{γ̂}(γ^{(t)}), for k = 1, …, q. The PC coefficients {c_j(x)}_{j: γ̂_j=1} are approximated as functions of x ∈ D by an interpolation function S(·|·) through the points {(ĉ_{j,k}, x_k)}_{k=1}^q, for j: γ̂_j = 1. We denote this interpolation function [39,40] here as ĉ_j(x) = S(x | {(ĉ_{j,k}, x_k)}_{k=1}^q), for j: γ̂_j = 1. The interpolation scheme can be, for example, splines, polynomials, or kernels; in our examples we use quadratic splines, although one can use other options. Note that interpolation between ĉ_{γ̂,k} and ĉ_{γ̂,k′} is feasible here because the selected set of significant gPC bases is common throughout the grid of spatial points. The z-quantiles of c_{γ̂}(x), denoted here as ĉ_{γ̂}^{(z)}(x) and required for the evaluation of the confidence intervals, can be estimated likewise, as ĉ_{γ̂}^{(z)}(x) = S(x | {(ĉ_{γ̂,k}^{(z)}, x_k)}_{k=1}^q), where ĉ_{γ̂,k}^{(z)} is the z-th quantile of the empirical distribution of the MCMC sample at x_k.

The stochastic solution can be evaluated as û^{(median)}(x^{(∗)}; ξ^{(∗)}) = ∑_{j: γ̂_j=1} ĉ_j(x^{(∗)}) Ψ_j(ξ^{(∗)}). We will refer to it as the 'median probability model based gPC expansion'. Estimates and inference about λ, ρ and σ² can be drawn likewise.
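The interpolation step S(·|·) can be sketched with `scipy`; here, in place of real ergodic averages, we feed it the exact mean-coefficient profile c_0(x) = x(1−x)log(3)/2 of the [1D] example of Section 5:

```python
import numpy as np
from scipy.interpolate import make_interp_spline

q = 7
xk = np.arange(1, q + 1) / (q + 1)           # grid {x_k = k/(q+1); k = 1,...,q}
# stand-in for the ergodic-average estimates c_hat_{0,k}: the exact
# mean-coefficient profile c_0(x) = x(1-x) log(3)/2 of the [1D] example
c_hat = xk * (1 - xk) * np.log(3) / 2

spline = make_interp_spline(xk, c_hat, k=2)  # quadratic spline S(x | {(c_hat_k, x_k)})
c_fun = lambda x: float(spline(x))           # c_hat_0(x) at arbitrary x in D
```

Since the profile being interpolated is itself quadratic in x, the quadratic spline reproduces it exactly here.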

4.3.2. Bayesian model averaging based evaluation
If the predictive ability of the gPC expansion is of main interest and we are not concerned with sparsity, the evaluation of the expansion can be made by Bayesian model averaging (BMA) [22]. BMA takes model uncertainty into account by combining individual model-dependent inferences, weighting them according to the associated marginal model posterior probabilities. BMA presents optimal predictive performance compared to any other method that considers a single statistical model (e.g., here, a single gPC expansion based on a single subset of significant gPC bases), as discussed in [22]. Here, BMA can be used for the evaluation of the PC coefficients and therefore of the gPC expansion.

The BMA estimate of the j-th PC coefficient is approximated by ĉ_j^{(BMA)}(x) = S(x | {(ĉ_{j,k}^{(BMA)}, x_k)}_{k=1}^q), where ĉ_{j,k}^{(BMA)} = (1/T) ∑_{t=1}^T c_{j,k}^{(t)} is the BMA point estimate of the discretized PC coefficient c_j(x_k) at the spatial point x_k, for j = 1, …, m, and S(·|·) is the spline interpolation function. Then, the stochastic solution u(ξ^{(∗)}; x^{(∗)}) at (ξ^{(∗)}, x^{(∗)}) can be estimated by û^{(BMA)}(x^{(∗)}; ξ^{(∗)}) = ∑_{j=1}^m ĉ_j^{(BMA)}(x^{(∗)}) Ψ_j(ξ^{(∗)}). We will refer to it as the 'Bayesian model averaging based gPC expansion'.
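The difference between the two point estimates is easy to see on a single coefficient: the BMA average runs over all draws, including the exact zeros produced when the basis is excluded, whereas the within-model average conditions on inclusion. The draws below are synthetic stand-ins for MCMC output:

```python
import numpy as np

rng = np.random.default_rng(4)
T = 1000
# hypothetical draws of one coefficient c_{j,k}: the basis is included
# about 90% of the time; excluded draws contribute exact zeros
included = rng.random(T) < 0.9
c_draws = np.where(included, rng.normal(0.5, 0.05, T), 0.0)

c_bma = c_draws.mean()                  # BMA estimate: average over ALL draws
c_within = c_draws[included].mean()     # within-model (median rule) average
```

The BMA estimate is shrunk toward zero by the inclusion probability, which is exactly the model-averaging effect discussed above.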

5. Numerical examples

To illustrate the proposed methods, we apply them to the representation of the solution u(x; ξ) of the 1D elliptic SPDE

d/dx (a(x; ξ) d/dx u(x; ξ)) = −1,  x ∈ D, ξ ∈ Γ,

u(0; ξ) = 0,  u(1; ξ) = 0,  ξ ∈ Γ,   (22)

where D = (0,1), Γ = [−1,1]d and d ∈ N+ , as a function of the input random variables and spatial variables. In particular,

the solution u(x; ξ) is represented as a finite, and possibly sparse, gPC expansion with Legendre polynomial bases. We demonstrate the main idea of the proposed method on a tractable 1D scenario in which the stochastic solution and the associated PC coefficients are available in closed form. We then illustrate the performance of the proposed method on two intractable high-dimensional scenarios where the numbers of random variables are 14 and 40, respectively. We provide comparisons with other approaches, such as Bayesian compressive sensing with fixed error variance (BCS) and with random error variance (mtCS), and l1-minimization using the L1-MAGIC collection of MATLAB routines. We also show empirically that the median probability model based gPC expansion presents predictive performance similar to that of BMA. The results reported here are encouraging and suggest that the proposed method outperforms other methods used in the gPC context, in terms of sparsity and accuracy, at a small additional computational cost.

We measure the performance of the methods with respect to the relative error of the expectation, ε(μ(u; x)) = |μ̂(u; x) − μ(u; x)| / |μ(u; x)|, and of the standard deviation of the solution, ε(σ(u; x)) = |σ̂(u; x) − σ(u; x)| / |σ(u; x)|, where μ̂(u; x) = ĉ_0(x), σ̂(u; x) = √(∑_{j=1}^m ĉ_j²(x) Z_j) and Z_j := E_ξ(Ψ_j²(ξ)); the reference values are estimated by standard Monte Carlo integration after 10^5 evaluations of the stochastic system. The predictive performance is measured with respect to the relative predictive error ε(u; x) = (1/N^∗) ∑_{i=1}^{N^∗} |u^∗(x; ξ_i^∗) − û(x; ξ_i^∗)| / |u^∗(x; ξ_i^∗)|, given an independent sample of new measurements {u^∗(x; ξ_i^∗), ξ_i^∗}_{i=1}^{N^∗}, N^∗ = 101, at x ∈ (0,1).

Fig. 1. [1D] Exact solution u(x; ξ) of (22) as a function of x and ξ, for a(x; ξ) = 1 + 0.5ξ.

We examine the performance of the proposed approach against the resolution of the spatial grid, when multiple spatial points are considered, with respect to the total relative errors ε(μ) = (1/q^∗) ∑_{k=1}^{q^∗} ε(μ; x_k^∗), ε(σ) = (1/q^∗) ∑_{k=1}^{q^∗} ε(σ; x_k^∗) and ε(u) = (1/q^∗) ∑_{k=1}^{q^∗} ε(u; x_k^∗), given a common reference grid of randomly chosen spatial points {x_k^∗}_{k=1}^{q^∗}, where q^∗ = 101.

5.1. One-dimensional example: scenario [1D]

The diffusion coefficient in the SPDE (22) is modeled as a(x; ξ) = a(ξ) = 1 + 0.5ξ, where ξ is distributed according to U(−1,1) and d = 1. This form of the diffusion coefficient leads to the tractable exact solution u(x; ξ) = x(1−x)/(2+ξ), displayed in Fig. 1, with density f_u(u; x) = x(1−x)/(2u²), u ∈ (x(1−x)/3, x(1−x)), mean μ(u; x) = x(1−x) log(3)/2 and standard deviation σ(u; x) = x(1−x) √(1/3 − (1/4) log²(3)), x ∈ (0,1) (see Appendix B).

We assume that a gPC expansion of degree at most p = 80, with m = 81 terms, can represent the solution adequately. A sample of n = 20 evaluations of the system is available, namely {(u(x_{1:q}; ξ^{(i)}), ξ^{(i)})}_{i=1}^n, where q ⩾ 1. Given that n ≪ m, we wish to identify a subset of significant gPC bases that dominate the rest and adequately approximate the stochastic solution u(x; ξ) of (22). We define the collection of Bayesian models with weakly informative priors and fixed hyper-parameters a_λ = 0.001, b_λ = 0.001, a_h = 0, b_h = 100, a_ρ = 1 and b_ρ = 1. The MCMC sampler runs for 2·10^5 iterations, of which the first 10^5 are discarded as burn-in. Significant gPC bases were selected according to the median probability model, while the within-model parameters are estimated by the ergodic average of the MCMC sample.
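The closed-form mean and standard deviation above are easy to sanity-check by Monte Carlo, since u(x; ξ) = x(1−x)/(2+ξ) can be sampled directly:

```python
import numpy as np

rng = np.random.default_rng(5)
x = 0.5
xi = rng.uniform(-1.0, 1.0, 10**6)           # xi ~ U(-1, 1)
u = x * (1 - x) / (2 + xi)                   # exact solution u(x; xi)

mu_exact = x * (1 - x) * np.log(3) / 2
sd_exact = x * (1 - x) * np.sqrt(1/3 - np.log(3)**2 / 4)
mu_mc, sd_mc = u.mean(), u.std()             # Monte Carlo estimates
```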

5.1.1. Single spatial point case, at x^∗ = 0.5
We examine the stochastic system at the spatial point x^∗ = 0.5, where the stochastic solution is u(0.5; ξ) = (4(2+ξ))^{−1} and the associated expectation and standard deviation are μ(u; 0.5) = log(3)/8 and σ(u; 0.5) = √(1/48 − (1/64) log²(3)), respectively.

The estimates of the marginal inclusion posterior probabilities of the gPC bases are presented in Fig. 2(a). Only the first m_γ = 10 gPC bases are selected as significant according to the median probability model. Thus, the proposed single gPC expansion includes only 10 out of 81 gPC basis functions and therefore provides a sparse gPC representation of the exact solution, of the form û(0.5; ξ) = ∑_{j=0}^{9} ĉ_j(0.5) Ψ_j(ξ). The BCS and mtCS, using the default algorithmic parameters and priors, indicate the first 8 and 7 gPC bases as significant, respectively. The l1-minimization approach, using the L1-MAGIC routines and the default algorithmic parameters, detects 14 gPC bases with non-zero PC coefficients. The bases selected by BCS and mtCS are also selected by the median probability model, and therefore the proposed method is consistent with the other methods.

In Table 1 (1st row) we report the relative errors for the methods considered. We observe that the BMA and median probability model based gPC expansions give similarly accurate results, as their relative errors are close in value. The posterior densities of μ(u; 0.5) and σ(u; 0.5) are estimated from the MCMC sample and represented by the associated histograms (Figs. 2(b), 2(c)). The proposed method provides posterior distributions for the unknown quantities of interest, rather than just point estimates, which is a main advantage over the l1-minimization approach. We observe that the sampler mixes well and that the associated ergodic averages, which correspond to the point estimates, converge quickly (Figs. 2(b), 2(c)). Therefore, fewer iterations could have been considered in this case. Fig. 2(d) presents the point estimates of the PC coefficients at x^∗ = 0.5. It is worth noticing that the largest differences between the estimates of


Fig. 2. [1D] In 1st panel: bar plot of the marginal inclusion posterior probabilities {Pr(γ_j | u, Ψ)}. In 2nd & 3rd panels: histograms and trace plots of the expectation and standard deviation of the solution at spatial point x^∗ = 0.5, where the red lines correspond to the ergodic averages. In 4th panel: estimates of the absolute values of the PC coefficients at x^∗ = 0.5, in log scale. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

the PC coefficients computed by the median probability model approach, BCS, mtCS and l1-minimization occur at the higher orders (e.g., 7, 8) rather than at the lower orders.

The estimated relative errors for the expectation of the solution, ε(μ(u; 0.5)), the standard deviation of the solution, ε(σ(u; 0.5)), and the prediction, ε(u; 0.5), are reported in Table 1 (1st row). We observe that the values of the relative errors that correspond to the BMA and median probability model based gPC expansions are quite close. Therefore, the median probability model based gPC expansion presents performance similar to that of BMA, while providing a sparse representation of the solution that considers only a single subset of gPC bases. The BMA and median probability model based gPC expansions present significantly better predictive ability than those evaluated by the BCS, mtCS and l1-minimization methods with respect to the relative prediction error. The relative errors ε(μ; 0.5) are significantly smaller for the BMA and median probability model based gPC expansions than for BCS and mtCS, while the relative errors of the standard deviation, ε(σ; 0.5), appear to be close in value.

5.1.2. Multiple spatial points
We are interested in a sparse gPC expansion of the solution of the SPDE in (22) that is able to approximate the stochastic solution throughout the spatial domain D = (0,1). The spatial domain D = (0,1) is discretized at q = 7 equally spaced spatial points by considering the grid {x_k = k/(q+1); k = 1, …, q}. The MCMC stochastic search algorithm is applied for the selection of a subset of significant gPC bases, the estimation of the associated PC coefficients, and the evaluation of the gPC expansion for prediction and inference at arbitrary new spatial points. We run the algorithm for 2·10^5 iterations, where the first 10^5 are discarded as burn-in.

The algorithm selects only m_γ = 14 gPC bases as significant according to the median probability model. The estimated marginal inclusion probabilities are presented in Fig. 3(a). We observe that the number of significant gPC bases is larger when multiple spatial points are considered than when only one spatial point is considered. This is reasonable, because more bases are required by the gPC expansion to approximate the solution at arbitrary spatial points than at a single spatial point. Thus, the gPC expansion is able to capture more features of the solution at different spatial points.

In Fig. 4, we plot the point estimates and the 95%-confidence intervals of the significant PC coefficients as functions of the spatial input variable x. The significant PC coefficients {ĉ_j(x)}_{j: γ̂_j=1} were estimated by the ergodic average of the MCMC sample at the spatial points x ∈ {x_k}_{k=1}^q of the grid and then interpolated by quadratic splines for x ∉ {x_k}_{k=1}^q. In Table 2

Table 1
[1D, 14D, 40D; row-wise] Comparative results on the number of significant gPC bases and the relative errors of the mean, standard deviation and prediction, for the BMA, median rule, BCS, mtCS and l1-minimization approaches, using the default parameters. The threshold for the selection of significant bases in the l1-minimization approach was 10^{−5}; however, for the evaluation of the relative errors all the coefficients were used.

                       BMA          Median rule  BCS          mtCS         l1-minimization
[1D]  m_γ              –            10           8            7            14
      ε(μ(u;0.5))      1.41257e−06  1.41956e−06  5.05265e−06  2.85771e−05  2.7547e−04
      ε(σ(u;0.5))      1.54245e−03  1.54245e−03  1.41417e−03  1.45818e−03  4.55625e−04
      ε(u;0.5)         2.00562e−06  2.37662e−06  1.06267e−04  2.31637e−04  2.89939e−04
[14D] m_γ              –            17           27           19           156
      ε(μ(u;0.5))      1.58344e−03  1.59309e−03  1.66220e−03  1.82236e−03  2.2594e−03
      ε(σ(u;0.5))      8.35037e−03  8.0977e−03   7.83549e−03  9.23084e−03  1.54193e−02
      ε(u;0.5)         2.67914e−03  2.99446e−03  3.20046e−03  2.98875e−03  3.54889e−03
[40D] m_γ              –            27           28           20           230
      ε(μ(u;0.5))      8.86512e−04  8.56166e−04  9.90241e−04  1.19533e−03  1.35119e−03
      ε(σ(u;0.5))      4.84905e−03  5.0053e−03   2.85535e−03  6.63752e−04  1.95584e−03
      ε(u;0.5)         1.25817e−03  1.31187e−03  1.33933e−03  1.86245e−03  1.37392e−03

Fig. 3. [1D, 14D, 40D] The estimated marginal inclusion posterior probabilities {Pr(γ_j | u, Ψ)}_{j=1}^m, multiple spatial point case.

Table 2
[1D, 14D, 40D; row-wise] Comparative results for the integrated relative errors ε(μ), ε(σ) and ε(u) resulting from evaluating the gPC expansion based on the model combination approach (BMA) and the single model approach (median rule).

                      ε(μ)         ε(σ)       ε(u)
[1D]  BMA             1.45963e−05  0.0015921  1.14488e−05
      median rule     1.45991e−05  0.0015991  1.15003e−05
[14D] BMA             2.63615e−03  0.0230381  0.0121257
      median rule     2.63684e−03  0.0230419  0.0121292
[40D] BMA             7.82100e−04  0.0362461  0.0123519
      median rule     7.82343e−04  0.0362493  0.0123528

(1st row) we report the total relative errors ε(μ), ε(σ) and ε(u), which correspond to the BMA and the median probability model based gPC expansions. We observe that BMA outperforms the median probability model based gPC expansion with respect to the relative errors reported; however, the relative errors are very close in value. Because the median probability model based gPC expansion presents predictive performance similar to that of BMA, and it is able to recover a sparse representation of the solution, we believe it is a preferable choice when the SPDE has a sparse stochastic solution and the researcher prefers a sparse representation of the solution.

The accuracy of the median probability model based gPC expansion improves with the resolution of the spatial grid considered; however, for high enough resolutions the improvement in accuracy is not significant when finer grids are considered. In Figs. 5(a), 5(d) and 5(g) we plot the total relative errors of the mean, standard deviation and prediction of the solution as functions of the size q of the grid of spatial points. In order to deal with the variance of the sample of measurements, for each grid of spatial points under comparison we run the experiment 5 times, collecting equal-sized sets of measurements from the SPDE and running the MCMC stochastic search sampler to evaluate the gPC expansion. The resulting estimated relative errors were averaged and are reported in Fig. 5. We observe that the relative errors decrease as the number of spatial points increases; however, they stop improving after a certain number of spatial points is reached, e.g. q ⩾ 6. This indicates that the accuracy of the gPC expansion does not improve significantly when the spatial grid becomes finer beyond a certain number of spatial points, e.g. q ⩾ 6 for this particular example.


Fig. 4. [1D] Estimates of the first 11 significant PC coefficients as functions of the spatial parameter x ∈ (0,1). The PC coefficients are estimated via MCMC at the points of a grid of 7 equally spaced spatial points and interpolated by quadratic splines. 95%-confidence intervals are shown as black lines.


Fig. 5. [1D, 14D, 40D; column-wise] The total relative errors of the mean, standard deviation and prediction as functions of the number of spatial points,after repeating each of the experiments 5 times.

5.2. High-dimensional example: scenarios [14D] & [40D]

We now consider that the diffusion coefficient a(x; ξ) is not given in closed form but is approximated by a Karhunen–Loève expansion that uses the d largest eigenvalues and corresponding eigenfunctions of the covariance kernel C_aa(x_1, x_2) = exp(−(x_1 − x_2)²/l_c²), where l_c ∈ R^+ is the correlation length of a(x; ξ), which dictates the decay of the spectrum of a(x; ξ).

We collect a sample of measurements {u(x; ξ^{(i)}), ξ^{(i)}}_{i=1}^n by evaluating the SPDE n times. For the evaluation of the SPDE we use a deterministic solver similar to that in [15]. Briefly, for each ξ_{i,1:d}, integrating (22) once we obtain

u′(x) = (a(0) u′(0) − x)/a(x).   (23)

Integrating (23) again, we obtain

u(x) = ∫_0^x (a(0) u′(0) − s)/a(s) ds,   (24)

where the constant a(0) u′(0) is fixed by the boundary conditions u(0) = u(1) = 0. The integrals in (24) are evaluated by splitting the domain D = (0,1) into 2000 sub-intervals and using 3 Gaussian quadrature points in each sub-interval. In what follows, we consider two high-dimensional scenarios:


Fig. 6. [14D] In 1st panel: bar plots of the marginal inclusion posterior probabilities {Pr(γ_j | u, Ψ)}. In 2nd & 3rd panels: histograms and trace plots of the expectation and standard deviation of the solution at spatial point x^∗ = 0.5, where the red lines correspond to the ergodic averages. In 4th panel: estimates of the absolute values of the PC coefficients at x^∗ = 0.5, in log scale. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

[14D] with d = 14, l_c = 1/5, a = 0.1, σ = 0.03, p = 3 and n = 120, and [40D] with d = 40, l_c = 1/14, a = 0.1, σ = 0.021, p = 2 and n = 240. Thus, conditions C-I and C-II, which refer to the diffusion coefficient a(x; ξ) and the sparsity of the solution, are satisfied. For the evaluation of the expansion we consider the Bayesian model of Section 4.1 with weakly informative priors and fixed hyper-parameters a_λ = 0.001, b_λ = 0.001, a_h = 0, b_h = 100, a_ρ = 1 and b_ρ = 1.

The expectation μ(u; x), the standard deviation σ(u; x) and the solution u(x; ξ) itself, required for the evaluation of the associated relative errors, are not available in closed form. Therefore, for the evaluation of μ(u; x), σ(u; x) and u(x; ξ) at a given spatial point, we apply Monte Carlo integration with sample size 10^6.
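The quadrature solver of (23)–(24) used above can be sketched as follows; the function name is ours, and the constant C = a(0)u′(0) is obtained from u(1) = 0 as C = (∫₀¹ s/a(s) ds)/(∫₀¹ 1/a(s) ds). For a constant coefficient a(ξ) = (2+ξ)/2 this reproduces the exact [1D] solution x(1−x)/(2+ξ):

```python
import numpy as np

def solve_u(a, x_eval, n_sub=2000, n_gauss=3):
    """Evaluate u(x) from (24) with composite Gauss-Legendre quadrature:
    n_sub sub-intervals of D = (0,1), n_gauss Gauss points each."""
    nodes, wts = np.polynomial.legendre.leggauss(n_gauss)   # on [-1, 1]
    edges = np.linspace(0.0, 1.0, n_sub + 1)
    mid = 0.5 * (edges[:-1] + edges[1:])
    half = 0.5 * np.diff(edges)
    s = (mid[:, None] + half[:, None] * nodes).ravel()      # quadrature points
    w = (half[:, None] * wts).ravel()                       # quadrature weights
    inv_a = 1.0 / a(s)
    C = np.sum(w * s * inv_a) / np.sum(w * inv_a)           # a(0) u'(0) from u(1) = 0
    integrand = (C - s) * inv_a
    # u(x) = integral of the integrand from 0 to x; evaluation points that sit
    # on sub-interval edges keep the composite rule clean
    return np.array([np.sum(w[s <= x] * integrand[s <= x]) for x in x_eval])

# example: a(x) = 1.25, i.e. xi = 0.5; exact u(0.5) = 0.25/2.5 = 0.1
u_mid = solve_u(lambda s: np.full_like(s, 1.25), np.array([0.5]))
```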

5.2.1. Single spatial point case, at x^∗ = 0.5
In Figs. 6(a) and 7(a), we show the estimated marginal inclusion posterior probabilities of the gPC bases for the scenarios [14D] and [40D]. Only 17 out of 680 gPC bases in [14D] and 27 out of 861 gPC bases in [40D] are selected as significant by the proposed algorithm according to the median probability model. Therefore, the median probability model based gPC expansion is able to recover a sparse representation of the solution. We observe that the marginal inclusion probabilities of the gPC bases that are not included in the median probability model based gPC expansion have very small values. Therefore, these bases are expected to make only a small contribution to the BMA gPC expansion, and we expect the median probability model based gPC expansion to have predictive performance similar to that of the BMA expansion. The BCS and mtCS based gPC expansions, using the default algorithmic parameters and priors, indicate 27 and 19 gPC bases, respectively, as significant. The l1-minimization algorithm detects 156 and 230 gPC bases with non-zero PC coefficients in the [14D] and [40D] scenarios, respectively; thus the proposed method finds a sparser gPC representation of the solution.

Figs. 6(b), 6(c), 7(b) and 7(c) show histograms that represent the estimated posterior distributions of the expectation and standard deviation of the solution at x^∗ = 0.5, given the median probability model. In the same figures, the trace plots of the expectation and standard deviation of the solution at x^∗ = 0.5 show that the sampler mixes well and the ergodic averages (the red lines) converge quickly, similarly to the [1D] scenario. In Figs. 6(d) and 7(d), we plot the absolute values of the PC coefficients in log scale for the l1-minimization, BCS, mtCS and median probability model based gPC. We observe that for the low-order PC coefficients the methods give similar estimates, but for the high-order ones they give slightly different estimates. This occurs because, as we see in


Fig. 7. [40D] In 1st panel: bar plots of the marginal inclusion posterior probabilities {Pr(γ_j | u, Ψ)}. In 2nd & 3rd panels: histograms and trace plots of the expectation and standard deviation of the solution at spatial point x^∗ = 0.5, where the red lines correspond to the ergodic averages. In 4th panel: estimates of the absolute values of the PC coefficients at x^∗ = 0.5, in log scale. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Figs. 6(d) and 7(d), the three approaches have selected different significant bases in high dimensions, and therefore the common high-order PC coefficients, that usually capture interactions among input variables, may differ in value. However, the observed differences do not seem to be quite large.

The BMA and the median probability model based gPC expansions present better predictive ability than that of the BCS, mtCS and l1-minimization based gPC expansions in terms of the relative predictive error, as shown in Table 1 (2nd & 3rd rows). Although the median probability model based gPC expansions present sparse forms in [14D] and [40D], they present similar predictive performance to that of BMA according to the relative errors in Table 1 (2nd & 3rd rows). Moreover, both the BMA and median probability model based gPC expansions perform better than those evaluated by BCS, mtCS and l1-minimization in terms of μ(u; 0.5). On the other hand, we observe that the relative errors of σ(u; 0.5) that correspond to the BCS based gPC and mtCS based gPC are smaller than those of the proposed methods using BMA or median probability model based gPC in [40D]; however, we argue that the proposed approaches are preferable to BCS, mtCS and l1-minimization because they present better predictive performance according to the relative predictive errors.
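The comparisons above rely on relative errors of the estimated statistics against reference values; a generic sketch of such a measure (the paper's exact definitions appear in an earlier section, so this is only illustrative):

```python
import numpy as np

def relative_error(estimate, truth):
    """Relative L2 error of an estimate against a reference value;
    for scalars this reduces to |estimate - truth| / |truth|."""
    estimate, truth = np.atleast_1d(estimate), np.atleast_1d(truth)
    return np.linalg.norm(estimate - truth) / np.linalg.norm(truth)

# e.g. comparing an estimated solution vector against the reference one
err = relative_error(np.array([1.1, 2.0, 2.9]), np.array([1.0, 2.0, 3.0]))
```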

Figs. 8(a), 8(b) and 8(c) show how the performance measures evolve with the number of MC samples for the computational approaches considered. We observe that the relative errors reduce in value with the number of MC samples, and this decay appears to be faster for small MC samples. The proposed approaches, based on BMA and the median probability model, perform better with respect to the relative errors across the range of MC sample sizes. In fact, the proposed approaches perform significantly better than the others under comparison especially for small MC sample sizes; however, this difference in performance seems to decay for large MC sample sizes. This suggests that, among the evaluation approaches under comparison, the proposed ones, based on BMA and the median probability model, are preferable especially in cases where the MC sample size is small.

5.2.2. Multiple spatial points

We compute a sparse gPC representation of the stochastic solution of the SPDE in (22) that can be used for predictions on arbitrary new spatial points of the domain D = (0,1). For the evaluation of the gPC expansion, we consider a grid of q = 7 equally-spaced spatial points {x_k = k/(q + 1); k = 1, . . . , q} and apply the MCMC stochastic search algorithm for 6 · 10^5 iterations, where the first 10^5 are the burn-in.



Fig. 8. [14D] Relative errors of the solution expectation, standard deviation and prediction at spatial point x∗ = 0.5 as functions of the size of the MC sample considered for the evaluation of the gPC expansion. (Single spatial point case.)

According to the median probability model, the algorithm selects only 67 out of 680 gPC bases in [14D] and 71 out of 861 gPC bases in [40D] as significant. Thus the median probability model based gPC expansion, which contains only a single subset of significant bases, provides a sparse representation of the stochastic solution. The estimated marginal inclusion probabilities are presented in Figs. 3(b) and 3(c). As in the [1D] scenario, we observe that the number of significant gPC bases has increased relative to the single spatial point case once multiple spatial points are considered. This is reasonable because, when multiple spatial points are considered, the gPC expansion has to approximate the stochastic solution at different spatial points, each of which can be associated with a different set of significant gPC bases.

In Figs. 9(a), 9(b), 9(f) and 9(g), we plot the 95% confidence intervals and the point estimates of μ(u; x) and σ(u; x) associated with the median probability model based gPC expansion in the [14D] and [40D] scenarios. Figs. 9(c), 9(d) and 9(e) present estimates of the PC coefficients associated with bases ψ1(ξ1), ψ3(ξ3) and ψ19(ξ1, ξ3) = ψ1(ξ1) × ψ3(ξ3) in the [14D] scenario. We observe that, although the PC coefficient c19(x) is very small in magnitude compared to c1(x) and c3(x), the gPC basis ψ19(ξ1, ξ3) = ψ1(ξ1) × ψ3(ξ3) is selected by the algorithm as significant. This is evidence that ξ1 and ξ3 interact, in the sense that the dependence of the stochastic solution u(x; ξ) on the one input variable, e.g. ξ1, is affected by the values of the other, e.g. ξ3. In the [40D] scenario, we observe similar phenomena; see Figs. 9(i), 9(j), 9(h) and 9(k).

The total relative errors ε(μ), ε(σ), and ε(u) of the BMA and the median probability model based gPC expansions are reported in Table 2. We observe that the total relative errors ε(μ) and ε(σ) for the BMA and the median probability model based gPC expansions are close in value, and therefore the two approaches provide estimates for μ(u; x) and σ(u; x) with similar accuracy. Moreover, the predictive ability of the median probability model based gPC expansion is close to that of the BMA expansion with respect to the total relative error of the prediction ε(u), as shown in Table 2. Thus the median probability model based gPC expansion presents similar predictive performance to that of BMA, which is considered optimal. This suggests that the median probability model based gPC expansion may be preferable to the BMA one if the SPDEs have sparse stochastic solutions or the researcher prefers a sparse representation of the solution.
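The two evaluation approaches can be contrasted in code. Given posterior draws of the inclusion indicators and PC coefficients at a spatial point (the arrays below are hypothetical toy data), BMA averages the model-specific predictions over all draws, while the median probability model prediction uses only the single subset of bases with inclusion probability above 1/2; this is a simplified sketch, not the paper's exact estimator:

```python
import numpy as np

def bma_prediction(gamma_draws, coef_draws, psi):
    """BMA: average the per-draw gPC predictions, so every visited
    subset of bases contributes with its posterior weight."""
    preds = (gamma_draws * coef_draws) @ psi   # one prediction per MCMC draw
    return preds.mean()

def mpm_prediction(gamma_draws, coef_draws, psi):
    """Median probability model: keep only bases whose marginal
    inclusion probability exceeds 1/2, zeroing out the rest."""
    keep = gamma_draws.mean(axis=0) > 0.5
    c_hat = np.where(keep, coef_draws.mean(axis=0), 0.0)
    return c_hat @ psi

# Toy posterior: basis 0 always in, basis 1 always out; psi holds the
# gPC basis functions evaluated at one random input.
gamma = np.tile([1, 0], (100, 1))
coefs = np.tile([2.0, 5.0], (100, 1))
psi = np.ones(2)
p_bma = bma_prediction(gamma, coefs, psi)
p_mpm = mpm_prediction(gamma, coefs, psi)
```

When the inclusion probabilities are close to 0 or 1, as for a sparse solution, the two predictions nearly coincide, which is the behavior reported in Table 2.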

The total relative errors ε(μ), ε(σ), and ε(u) of the median probability model based gPC expansion are presented in Fig. 5 (columns 2 & 3) as functions of the resolution of the spatial grid. The size of the sample of measurements {u(x_{1:q}, ξ_i), ξ_i}_{i=1}^{n} in each case was 10% of the total number of the PC coefficients, {c_{j,k}; j = 1 : d, k = 1 : q}. We observe that, as the grid of spatial points becomes finer, the total relative errors ε(μ), ε(σ), and ε(u) decrease until the number of spatial points reaches q∗ = 6. For q ≥ 6, the relative errors do not evolve, and therefore there is no need to consider a finer spatial grid in order to increase the accuracy of the expansion. We believe this is observed because the stochastic solution u(x; ξ) is smooth with respect to x and can be satisfactorily approximated by the quadratic spline.
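The spatial construction used throughout this section, namely PC coefficients evaluated on the grid {x_k = k/(q + 1)} and interpolated by quadratic splines, can be sketched with SciPy; the grid values below are a hypothetical stand-in for MCMC-estimated coefficients:

```python
import numpy as np
from scipy.interpolate import interp1d

q = 7
x_grid = np.arange(1, q + 1) / (q + 1)   # {x_k = k/(q+1); k = 1, ..., q}
c_grid = x_grid * (1 - x_grid)           # stand-in PC-coefficient values
# Quadratic spline through the grid values, usable at new spatial points.
c_spline = interp1d(x_grid, c_grid, kind="quadratic", fill_value="extrapolate")
c_at_new_point = float(c_spline(0.3))
```

Once fitted, the spline gives the coefficient, and hence the gPC expansion, at arbitrary spatial points without re-running the sampler.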

Figs. 10(a), 10(b) and 10(c) present the total relative errors ε(μ), ε(σ), and ε(u) for the BMA and median probability model based gPC expansions as functions of the number of MC samples, for a fixed number of spatial points q = 7. For both cases, we observe that the relative errors reduce as the number of MC samples increases. The steepest decay is observed in the range of small MC sample sizes, n < 200. However, for moderate or large MC sample sizes, here n > 200, the total relative errors do not evolve significantly. This behavior is reasonable if we consider that, in the multiple spatial point case, the error associated with the spatial interpolation contributes to the total relative errors as well. Here, it seems that, for MC samples larger than 200, the interpolation error in the spatial domain governs the measures of performance; therefore, for a fixed number of spatial points q = 7, increasing the MC sample size further improves the performance of the proposed methods only at a smaller rate. Thus, to achieve further improvement in terms of total relative errors, one possibly needs to refine the spatial grid as well.

6. Summary and conclusions

The present paper addresses the challenging problem of approximating the stochastic solutions of SPDEs using a gPC expansion in high-dimensional scenarios where the number of the gPC bases may be larger than the size of the sample



Fig. 9. [14D, 40D; row-wise] Estimates of the significant PC coefficients as functions of the spatial parameter x ∈ (0,1). The PC coefficients are estimated via MCMC at a grid of 7 equally spaced points of the spatial domain and interpolated by quadratic splines. 95%-confidence intervals are presented as black lines.



Fig. 10. [14D] Total relative errors of the expectation, standard deviation and prediction of the solution as functions of the size of the MC sample for the proposed approaches based on BMA and the median probability model. We considered a grid of q = 7 equally spaced points. (Multiple spatial point case.)

of measurements. To address this problem, we proposed a fully Bayesian, non-intrusive, non-adaptive stochastic method based on the ideas of Bayesian model selection, Bayesian model averaging and MCMC. The proposed methodology is suitable for SPDEs whose stochastic solutions are sparse in the stochastic space with respect to the gPC bases while the deterministic solver involved is computationally expensive. The proposed method proceeds in three main steps: (1) define the Bayesian model; (2) run the MCMC sampler; (3) evaluate the gPC expansion either by BMA or by the selection of a subset of significant gPC bases via the median probability model. The choice between the two evaluation approaches, median probability model or BMA, can be made according to whether the SPDEs have sparse stochastic solutions and whether the researcher needs a sparse gPC expansion.
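Step (2) can be illustrated with a minimal Metropolis stochastic search over subsets of regressors in a toy linear model with a Zellner g-prior, which admits a closed-form marginal likelihood. Everything here (the g-prior, the toy data, the flip-one-indicator proposal) is a simplified stand-in for the paper's sampler, not a reproduction of it:

```python
import numpy as np

def log_marglik(y, X, gamma, g=100.0):
    """Log marginal likelihood (up to a constant) of the model that
    includes the columns flagged by gamma, under a Zellner g-prior."""
    n, yy = len(y), y @ y
    if gamma.sum() == 0:
        return -0.5 * n * np.log(yy)
    Xg = X[:, gamma.astype(bool)]
    fit = Xg @ np.linalg.lstsq(Xg, y, rcond=None)[0]   # projection of y
    return (-0.5 * gamma.sum() * np.log(1 + g)
            - 0.5 * n * np.log(yy - g / (1 + g) * (y @ fit)))

def stochastic_search(y, X, n_iter=2000, seed=0):
    """Metropolis sampler flipping one inclusion indicator per iteration;
    with a uniform prior over models the prior cancels in the ratio."""
    rng = np.random.default_rng(seed)
    gamma = np.zeros(X.shape[1], dtype=int)
    lml = log_marglik(y, X, gamma)
    draws = np.empty((n_iter, X.shape[1]), dtype=int)
    for t in range(n_iter):
        prop = gamma.copy()
        j = rng.integers(X.shape[1])
        prop[j] = 1 - prop[j]
        lml_prop = log_marglik(y, X, prop)
        if np.log(rng.random()) < lml_prop - lml:      # accept/reject
            gamma, lml = prop, lml_prop
        draws[t] = gamma
    return draws

# Toy data: only the first 2 of 6 candidate regressors truly matter.
rng = np.random.default_rng(42)
X = rng.normal(size=(80, 6))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.1 * rng.normal(size=80)
draws = stochastic_search(y, X)
incl = draws[500:].mean(axis=0)    # marginal inclusion probabilities
```

Averaging the retained draws gives the marginal inclusion probabilities, from which either the BMA or the median probability model evaluation of step (3) follows.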

The proposed methods provide global recovery of the stochastic solutions with respect to both spatial and random domains. As a result, predictions of the stochastic solutions at arbitrary new spatial points can be computed directly from the gPC expansion, without the need, shared by other methods discussed here, to re-collect measurements and re-evaluate the gPC expansion at those new spatial points. The proposed methods allow the evaluation of the gPC expansion via BMA (which presents better predictive ability than any single gPC expansion based on a specific subset of gPC bases) or via the median probability model (which provides a sparse representation of the stochastic solutions, if needed, without significant loss in accuracy). Compared to non-Bayesian methods, our approaches provide interval estimates of the PC coefficients and do not need ad-hoc methods, such as cross-validation, or pilot runs for the estimation of unknown parameters. Moreover, a priori knowledge about the number of the significant gPC bases, the shrinkage parameter or the error of the gPC expansion can be incorporated through the prior distributions. Compared to traditional Bayesian methods, the proposed methods quantify the importance of each gPC basis in the probabilistic sense through the marginal inclusion posterior probabilities and account for model uncertainty when BMA is used. Compared to standard compressive sensing methods [14–17], the proposed method provides a probabilistic mechanism that trades off between the bias caused by the omission of gPC bases and over-fitting.

The empirical results show that the BMA based gPC expansion presents smaller relative errors of the mean, standard deviation and prediction of the solution than the median probability model based gPC expansion; however, these relative errors were close in value for the two approaches. Given that the median probability model based gPC expansion can accommodate a sparse, and therefore simpler, representation of the stochastic solution, we believe that it might be preferable to the BMA one if the SPDEs have sparse stochastic solutions. We observed that the performance of the proposed method improves with the size of the MC sample in both the single and the multiple spatial point cases. When the stochastic solutions are smooth with respect to the spatial grids, as in our examples, the resolution of the spatial grid, considered for the discretization of the PC coefficients, does not need to be too high for the median probability model based gPC expansion to achieve an acceptable accuracy in terms of the relative error of the mean, standard deviation and prediction of the solution. Compared to other numerical methods that can be used for the evaluation of the gPC expansion at a single point, such as BCS, we observed that the proposed method, via both the BMA and median probability model approaches, provides more accurate results in terms of the relative error of the mean and the prediction of the solution when the single spatial point case was considered. In the three examples, the proposed method had a similar or slightly larger relative error of the standard deviation of the solution compared to BCS and mtCS; nevertheless, we argue that the proposed method is preferable since it outperforms the competing ones with respect to the relative predictive error. Finally, we observed that the median probability model based gPC expansion provides a sparser representation (in terms of gPC bases) and a more accurate approximation (in terms of the relative errors considered) than the l1-minimization approach.

As commented by a reviewer, this method can be very effective when the sparsity of the solution, with respect to the gPC bases, does not change dramatically at different spatial points; otherwise its effectiveness might be questionable due to the global nature of γ. To overcome this potential issue, one can partition the spatial domain appropriately and implement the proposed method individually for each partition. Although more computationally expensive than some compressive sensing methods, such as l1-minimization methods, the proposed algorithms can be coded in parallel to speed up the computation. Extensions of the current methodology include the evaluation of a sparse gPC expansion when there is discontinuity in the stochastic space. This is ongoing work and results will be presented in the future.

Acknowledgements

This work was supported by the Applied Mathematics Program within the Department of Energy (DOE) Office of Advanced Scientific Computing Research (ASCR) as part of the Collaboratory on Mathematics for Mesoscopic Modeling of Materials (CM4). PNNL is operated by Battelle for the U.S. Department of Energy under Contract DE-AC05-76RL01830. We would like to thank Drs. Bledar (Alex) Konomi and Bin Zheng for their constructive comments on UQ and SPDEs and Dr. Kenneth Jarman for carefully proofreading the manuscript. The research was performed using PNNL Institutional Computing, as well as the National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory.

Appendix A. Notation

Table 3
Notation of distributions.

Distribution             Notation         Density function
Dirac                    Dirac(a)         Dirac(x|a) = δ_{a}(x)
Uniform                  U(a,b)           U(x|a,b) = 1/(b − a), x ∈ (a,b)
Beta                     Be(a,b)          Be(x|a,b) = [Γ(a+b)/(Γ(a)Γ(b))] x^{a−1}(1 − x)^{b−1}, x ∈ (0,1)
Gamma                    G(a,b)           G(x|a,b) = [b^a/Γ(a)] x^{a−1} exp(−bx), x ∈ (0,+∞)
Inverse Gamma            IG(a,b)          IG(x|a,b) = [b^a/Γ(a)] x^{−a−1} exp(−b/x), x ∈ (0,+∞)
Normal                   N_n(μ,Σ)         N_n(x|μ,Σ) = (2π)^{−n/2} |Σ|^{−1/2} exp(−(1/2)(x − μ)^T Σ^{−1}(x − μ)), x ∈ R^n
Left truncated Normal    N^−(μ,σ²)        N^−(x|μ,σ²) = √(2/(πσ²)) exp(−(x − μ)²/(2σ²)), x ∈ (0,+∞)
Right truncated Normal   N^+(μ,σ²)        N^+(x|μ,σ²) = √(2/(πσ²)) exp(−(x − μ)²/(2σ²)), x ∈ (−∞,0)
Bernoulli                Bernoulli(ρ)     Bernoulli(x|ρ) = ρ^x (1 − ρ)^{1−x}, x ∈ {0,1}
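Sampling from the truncated normal distributions in Table 3 is conveniently done with SciPy's `truncnorm`, which takes the truncation bounds in standard-deviation units; a small sketch for N^−(μ, σ²), the normal restricted to (0, +∞):

```python
import numpy as np
from scipy.stats import truncnorm

mu, sigma = 0.5, 1.0

# N^-(mu, sigma^2): normal restricted to (0, +inf).
# truncnorm expects standardized bounds: (bound - loc) / scale.
a, b = (0.0 - mu) / sigma, np.inf
left_trunc = truncnorm(a, b, loc=mu, scale=sigma)

samples = left_trunc.rvs(size=10000, random_state=0)
```

For N^+(μ, σ²), the normal restricted to (−∞, 0), one would instead use `a = -np.inf` and `b = (0.0 - mu) / sigma`.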

Appendix B. Calculation of the solution statistics in [1D] case

The random variable ξ can be written as a function of u, ξ(u; x) = x(1 − x)/u − 2, for u ∈ (x(1 − x)/3, x(1 − x)). Thus, the distribution of u has density
\[
f_u(u; x) = f_\xi\big(\xi(u; x)\big)\,\Big|\frac{d}{du}\,\xi(u; x)\Big|
= \frac{1}{2}\,\Big|{-\frac{x(1-x)}{u^2}}\Big|
= \frac{x(1-x)}{2u^2}, \quad u \in \big(x(1-x)/3,\ x(1-x)\big).
\]
Moreover,
\[
E_\xi\big(u(\xi; x)\big) = \int_{-1}^{1} u(\xi; x)\, f_\xi(\xi)\, d\xi = x(1-x)\,\frac{\log(3)}{2},
\qquad
E_\xi\big(u^2(\xi; x)\big) = \int_{-1}^{1} u^2(\xi; x)\, f_\xi(\xi)\, d\xi = \frac{x^2(1-x)^2}{3}.
\]
Thus,
\[
\mu(u; x) = x(1-x)\,\frac{\log(3)}{2},
\qquad
\sigma(u; x) = \sqrt{\operatorname{Var}_\xi\big(u(\xi; x)\big)}
= \sqrt{E_\xi\big(u^2(\xi; x)\big) - \big(E_\xi\big(u(\xi; x)\big)\big)^2}
= x(1-x)\sqrt{\frac{1}{3} - \frac{\log^2(3)}{4}}.
\]
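As a sanity check, the closed-form statistics above can be verified by Monte Carlo, using the inverse relation u(ξ; x) = x(1 − x)/(ξ + 2) with ξ ~ U(−1, 1):

```python
import numpy as np

x = 0.3
rng = np.random.default_rng(0)
xi = rng.uniform(-1.0, 1.0, size=1_000_000)
u = x * (1 - x) / (xi + 2.0)     # u(xi; x) from the change of variables

# Closed-form mean and standard deviation derived above.
mu_exact = x * (1 - x) * np.log(3.0) / 2.0
sd_exact = x * (1 - x) * np.sqrt(1.0 / 3.0 - np.log(3.0) ** 2 / 4.0)
mu_mc, sd_mc = u.mean(), u.std()
```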

References

[1] B.D. Ripley, Stochastic Simulation, John Wiley & Sons, Inc., New York, NY, USA, 1987.
[2] D. Xiu, Numerical Methods for Stochastic Computations: A Spectral Method Approach, Princeton University Press, 2010.
[3] D. Xiu, G. Karniadakis, The Wiener–Askey polynomial chaos for stochastic differential equations, SIAM J. Sci. Comput. 24 (2) (2002) 619–644.
[4] R. Ghanem, P. Spanos, Stochastic Finite Elements: A Spectral Approach, Dover Publications, 2003.
[5] X. Wan, G. Karniadakis, An adaptive multi-element generalized polynomial chaos method for stochastic differential equations, J. Comput. Phys. 209 (2) (2005) 617–642.
[6] X. Wan, G. Karniadakis, Multi-element generalized polynomial chaos for arbitrary probability measures, SIAM J. Sci. Comput. 28 (3) (2006) 901–928.
[7] M. Deb, I. Babuška, J. Oden, Solution of stochastic partial differential equations using Galerkin finite element techniques, Comput. Methods Appl. Mech. Eng. 190 (48) (2001) 6359–6372.
[8] I. Babuška, R. Tempone, G. Zouraris, Galerkin finite element approximations of stochastic elliptic partial differential equations, SIAM J. Numer. Anal. 42 (2) (2004) 800–825.
[9] L. Mathelin, M. Hussaini, A stochastic collocation algorithm for uncertainty analysis, Citeseer, 2003.
[10] I. Babuška, F. Nobile, R. Tempone, A stochastic collocation method for elliptic partial differential equations with random input data, SIAM J. Numer. Anal. 45 (3) (2007) 1005–1034.
[11] D. Xiu, J.S. Hesthaven, High-order collocation methods for differential equations with random inputs, SIAM J. Sci. Comput. 27 (3) (2005) 1118–1139.
[12] M. Berveiller, B. Sudret, M. Lemaire, Stochastic finite element: a non intrusive approach by regression, Eur. J. Comput. Mech. (Rev. Eur. Méc. Numér.) 15 (1–3) (2006) 81–92.
[13] B. Sudret, Global sensitivity analysis using polynomial chaos expansions, Reliab. Eng. Syst. Saf. 93 (7) (2008) 964–979.
[14] A. Doostan, H. Owhadi, A non-adapted sparse approximation of PDEs with stochastic inputs, J. Comput. Phys. 230 (8) (2011) 3015–3034.
[15] X. Yang, G. Karniadakis, Reweighted l1 minimization method for stochastic partial differential equations, J. Comput. Phys. 248 (2013) 87–108.
[16] S. Ji, Y. Xue, L. Carin, Bayesian compressive sensing, IEEE Trans. Signal Process. 56 (6) (2008) 2346–2356.
[17] S. Ji, D. Dunson, L. Carin, Multitask compressive sensing, IEEE Trans. Signal Process. 57 (1) (2009) 92–106.
[18] S.J. Godsill, On the relationship between Markov chain Monte Carlo methods for model uncertainty, J. Comput. Graph. Stat. 10 (2) (2001) 230–248.
[19] P. Dellaportas, J.J. Forster, I. Ntzoufras, On Bayesian model and variable selection using MCMC, Stat. Comput. 12 (1) (2002) 27–36, http://dx.doi.org/10.1023/A:1013164120801.
[20] C. Hans, Model uncertainty and variable selection in Bayesian lasso regression, Stat. Comput. 20 (2) (2010) 221–229.
[21] L. Kuo, B. Mallick, Variable selection for regression models, Sankhya, Ser. B 60 (1) (1998) 65–81.
[22] J. Hoeting, D. Madigan, A. Raftery, C. Volinsky, Bayesian model averaging: a tutorial, Stat. Sci. (1999) 382–401.
[23] D. Madigan, A. Raftery, Model selection and accounting for model uncertainty in graphical models using Occam's window, J. Am. Stat. Assoc. 89 (428) (1994) 1535–1546.
[24] M.M. Barbieri, J.O. Berger, Optimal predictive model selection, Ann. Stat. 32 (3) (2004) 870–897.
[25] R. Todor, C. Schwab, Convergence rates for sparse chaos approximations of elliptic problems with stochastic coefficients, IMA J. Numer. Anal. 27 (2) (2007) 232–261.
[26] M. Bieri, C. Schwab, Sparse high order FEM for elliptic sPDEs, Comput. Methods Appl. Mech. Eng. 198 (2009) 1149–1170.
[27] G. Blatman, B. Sudret, Adaptive sparse polynomial chaos expansion based on least angle regression, J. Comput. Phys. 230 (6) (2011) 2345–2367.
[28] C. Schwab, R.A. Todor, Karhunen–Loève approximation of random fields by generalized fast multipole methods, in: Uncertainty Quantification in Simulation Science, J. Comput. Phys. 217 (1) (2006) 100–122.
[29] T. Park, G. Casella, The Bayesian lasso, J. Am. Stat. Assoc. 103 (482) (2008) 681–686.
[30] M. Kyung, J. Gill, M. Ghosh, G. Casella, Penalized regression, standard errors, and Bayesian lassos, Bayesian Anal. 5 (2) (2010) 369–412.
[31] R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Stat. Soc., Ser. B 58 (1996) 267–288.
[32] A. Lykou, I. Ntzoufras, On Bayesian lasso variable selection and the specification of the shrinkage parameter, Stat. Comput. (2012) 1–30, http://dx.doi.org/10.1007/s11222-012-9316-x.
[33] C.P. Robert, Simulation of truncated normal variables, Stat. Comput. 5 (1995) 121–125, http://dx.doi.org/10.1007/BF00143942.
[34] F. Lucka, Fast MCMC sampling for sparse Bayesian inference in high-dimensional inverse problems using l1-type priors, arXiv:1206.0262.
[35] C.P. Robert, G. Casella, Monte Carlo Statistical Methods, 2nd edition, Springer, 2004.
[36] G.O. Roberts, A. Gelman, W.R. Gilks, Weak convergence and optimal scaling of random walk Metropolis algorithms, Ann. Appl. Probab. 7 (1997) 110–120.
[37] C. Andrieu, J. Thoms, A tutorial on adaptive MCMC, Stat. Comput. 18 (4) (2008) 343–373.
[38] D. Blackwell, Conditional expectation and unbiased sequential estimation, Ann. Math. Stat. (1947) 105–110.
[39] J. Ferguson, Multivariable curve interpolation, J. ACM 11 (2) (1964) 221–228.
[40] G.M. Phillips, Interpolation and Approximation by Polynomials, CMS Books Math., vol. 14, Springer, 2003.