
Alternating direction optimization algorithms for covariance completion problems

Armin Zare, Mihailo R. Jovanović, and Tryphon T. Georgiou

Abstract— Second-order statistics of nonlinear dynamical systems can be obtained from experiments or numerical simulations. These statistics are relevant in understanding the fundamental physics, e.g., of fluid flows, and are useful for developing low-complexity models. Such models can be used for the purpose of control design and analysis. In many applications, only certain second-order statistics of a limited number of states are available. Thus, it is of interest to complete partially specified covariance matrices in a way that is consistent with the linearized dynamics. The dynamics impose structural constraints on admissible forcing correlations and state statistics. Solutions to such completion problems can be used to obtain stochastically driven linearized models. Herein, we address the covariance completion problem. We introduce an optimization criterion that combines the nuclear norm together with an entropy functional. The two, together, provide a numerically stable and scalable computational approach which is aimed at low-complexity structures for stochastic forcing that can account for the observed statistics. We develop customized algorithms based on alternating direction methods that are well-suited for large-scale problems.

Index Terms— Alternating direction method of multipliers, alternating minimization algorithm, convex optimization, low-rank approximation, nuclear norm regularization, state covariances, structured matrix completion problems.

I. INTRODUCTION

Motivation for this work stems from control-oriented modeling of systems with a very large number of degrees of freedom. For example, the Navier-Stokes (NS) equations, which govern the dynamics of fluid flows, are prohibitively complex for purposes of analysis and control design. Thus, it is common practice to investigate low-dimensional models that preserve the essential dynamics. In wall-bounded flows, stochastically driven linearized models of the NS equations have been shown to be capable of qualitatively replicating the structural features of fluid motion [1]–[4]. However, it has also been recognized that white-in-time stochastic forcing is too restrictive to reproduce all statistical features of the nonlinear dynamics [5], [6]. Building on [7], [8], we depart from the white-in-time restriction and consider low-complexity dynamical models with colored-in-time excitations that successfully account for the available statistics.

The complexity is quantified by the rank of the correlation structure of the excitation sources. This provides a bound on the number of input channels and explains the directionality of input disturbances [9], [10]. The covariance completion problem is formulated by utilizing nuclear norm minimization as a surrogate for rank minimization [11]–[16]. The resulting convex optimization problem can be recast as a semidefinite program (SDP) but cannot be handled by general-purpose solvers for large problems. In recent work [17], [18], we provided optimization algorithms for solving the covariance completion problem. We also showed the utility of this framework in explaining turbulent flow statistics [18], [19].

Financial support from the National Science Foundation under Award CMMI 1363266 and the University of Minnesota Informatics Institute Transdisciplinary Faculty Fellowship is gratefully acknowledged. The University of Minnesota Supercomputing Institute is acknowledged for providing computing resources.

Armin Zare, Mihailo R. Jovanović, and Tryphon T. Georgiou are with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455. E-mails: [email protected], [email protected], [email protected].

Herein, we consider an optimization criterion that combines the nuclear norm with the logarithmic barrier function as an entropy functional. In order to solve the completion problem for large-scale systems, e.g., fluid flows, we develop customized algorithms based on the Alternating Direction Method of Multipliers (ADMM) and the Alternating Minimization Algorithm (AMA). We demonstrate that AMA, which effectively works as a proximal gradient algorithm on the dual problem, is more efficient in handling large problems.

Our presentation is organized as follows. In Section II, we explain the structural constraints imposed on second-order statistics of stochastically forced linear systems. We formulate the covariance completion problem and express the constraint set in a form amenable to alternating optimization methods. In Section III, we present the optimality conditions and derive the dual problem. In Section IV, we present two customized alternating direction algorithms for solving the structured covariance completion problem, and we present results of numerical experiments in Section V. Finally, we provide concluding thoughts in Section VI.

II. PROBLEM FORMULATION

Consider a linear time-invariant (LTI) system

ẋ = Ax + Bu, (1)

where x(t) ∈ C^n is the state vector, A ∈ C^{n×n} and B ∈ C^{n×m} are the dynamic and input matrices, and u(t) ∈ C^m is a stationary zero-mean stochastic process. For a Hurwitz matrix A and controllable pair (A, B), the positive semidefinite matrix X qualifies as the steady-state covariance matrix of the state in (1),

X := lim_{t→∞} E(x(t) x∗(t)),

if and only if the linear equation,

2015 American Control Conference
Palmer House Hilton
July 1-3, 2015. Chicago, IL, USA

978-1-4799-8686-6/$31.00 ©2015 AACC

AX + XA∗ = −(BH∗ + HB∗), (2)

is solvable in terms of H ∈ C^{n×m} [7], [8]. Here, E is the expectation operator and ∗ denotes the complex conjugate transpose. In the case of u being white noise with covariance W, X satisfies the Lyapunov equation

AX + XA∗ = −BWB∗.

This results from (2) with H = BW/2 and yields a negative semidefinite right-hand side, −BWB∗.
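The white-in-time case above can be checked numerically. The following is a minimal sketch, assuming NumPy and SciPy are available; the system matrices are randomly generated for illustration and are not from the paper:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)
n, m = 5, 2

# A random Hurwitz matrix: shift a random matrix so its spectrum lies in the left half-plane.
M = rng.standard_normal((n, n))
A = M - (np.max(np.linalg.eigvals(M).real) + 1.0) * np.eye(n)

B = rng.standard_normal((n, m))
W = np.eye(m)  # covariance of the white-in-time input

# Solve A X + X A* = -B W B* for the steady-state covariance X.
X = solve_continuous_lyapunov(A, -(B @ W @ B.T))

# The Lyapunov residual should vanish and X should be positive definite
# (generically, since (A, B) is controllable for random data).
residual = A @ X + X @ A.T + B @ W @ B.T
print(np.linalg.norm(residual))
```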

The problem of completing partially known sampled second-order statistics using stochastically driven LTI models was introduced in [9], and is formulated by introducing

Z := −(AX + XA∗).

In contrast to the case of white-in-time excitation, Z may have both positive and negative eigenvalues.

The covariance completion problem considered in this paper combines the nuclear norm together with an entropy functional. This provides an approach which is aimed at low-complexity structures for stochastic forcing and facilitates the construction of a particular class of low-pass filters which generate colored-in-time forcing correlations [7], [8]. Figure 1 shows the interconnection of the spatio-temporal filter and the linearized dynamics which can be used to account for partially observed second-order statistics.

The covariance completion problem can be formulated as

minimize_{X, Z}   −log det(X) + γ ‖Z‖∗
subject to        AX + XA∗ + Z = 0
                  (CXC∗) ∘ E − G = 0.        (CP)

Here, the matrices A, C, E, and G denote problem data, and the Hermitian matrices X, Z ∈ C^{n×n} are optimization variables. Entries of G represent partially available second-order statistics, and C is a matrix that establishes the relationship between entries of the state covariance matrix X and the partially available statistics resulting from experiments or simulations. The symbol ∘ denotes elementwise matrix multiplication and E is the structural identity matrix,

E_ij = { 1, if G_ij is available
         0, if G_ij is unavailable.

The constraint set in (CP) represents the intersection of two linear subspaces: the Lyapunov-like constraint and the linear constraint which incorporates the available statistics.
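The role of the structural identity E can be illustrated with a small NumPy sketch. The data below are hypothetical (a random positive definite covariance, C taken as the identity, and only the diagonal assumed known); they are not the paper's example:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# A hypothetical "true" state covariance and output matrix, for illustration only.
L = rng.standard_normal((n, n))
Sigma = L @ L.T + n * np.eye(n)   # positive definite by construction
C = np.eye(n)                     # observe the state directly

# Suppose only one-point correlations (the diagonal) are available.
E = np.eye(n)                     # structural identity: E_ij = 1 iff G_ij is known
G = (C @ Sigma @ C.T) * E         # partially available statistics

# The linear constraint of (CP), (C X C*) o E - G = 0, evaluated at X = Sigma.
residual = (C @ Sigma @ C.T) * E - G
print(np.linalg.norm(residual))
```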

Fig. 1: The cascade connection of an LTI system with a linear filter is used to account for the sampled covariance matrix X.

In (CP), the nuclear norm, i.e., the sum of singular values of a matrix, ‖Z‖∗ := Σ_i σ_i(Z), is used as a proxy for rank minimization [11], [14]. The parameter γ indicates the relative weight on the nuclear norm objective. On the other hand, the logarithmic barrier function in the objective is introduced to guarantee the positive definiteness of X. The convexity of (CP) follows from the convexity of its objective function, denoted by Jp(X, Z), and the convexity of the constraint set.
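The nuclear norm is straightforward to compute from a singular value decomposition; a quick sketch (assuming NumPy, with a random matrix for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
Z = rng.standard_normal((6, 4))

# Nuclear norm: the sum of the singular values of Z.
sigma = np.linalg.svd(Z, compute_uv=False)
nuc = sigma.sum()

# Agrees with NumPy's built-in 'nuc' matrix norm.
print(abs(nuc - np.linalg.norm(Z, 'nuc')))
```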

In order to bring (CP) into a convenient form for the alternating direction methods considered in this paper, we reformulate (CP) as

minimize_{X, Z}   −log det(X) + γ ‖Z‖∗
subject to        𝒜X + ℬZ − 𝒞 = 0,        (CP1)

where

𝒜 := [𝒜1; 𝒜2],   ℬ := [I; 0],   𝒞 := [0; G].

Here, 𝒜1, 𝒜2 : C^{n×n} → C^{n×n} are the linear operators

𝒜1(X) := AX + XA∗,
𝒜2(X) := (CXC∗) ∘ E.
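The two operators are easy to realize as plain functions of X; here is a minimal sketch (assuming NumPy; A, C, and the mask E are random illustrative data, restricted to real matrices):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))
E = (rng.random((n, n)) < 0.5).astype(float)
E = np.triu(E) + np.triu(E, 1).T   # symmetric mask, as befits covariance data

def A1(X):
    """Lyapunov-type operator A1(X) = A X + X A*."""
    return A @ X + X @ A.T

def A2(X):
    """Masked output-covariance operator A2(X) = (C X C*) o E."""
    return (C @ X @ C.T) * E

# Both operators are linear in X.
X1, X2 = rng.standard_normal((n, n)), rng.standard_normal((n, n))
print(np.linalg.norm(A1(2*X1 + 3*X2) - (2*A1(X1) + 3*A1(X2))))
print(np.linalg.norm(A2(2*X1 + 3*X2) - (2*A2(X1) + 3*A2(X2))))
```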

III. OPTIMALITY CONDITIONS AND THE DUAL PROBLEM

By splitting Z into positive and negative semidefinite parts,

Z = Z₊ − Z₋,   Z₊ ⪰ 0,   Z₋ ⪰ 0,

it can be shown [11, Section 5.1.2] that (CP1) can be cast as an SDP,

minimize_{X, Z₊, Z₋}   −log det(X) + γ (trace(Z₊) + trace(Z₋))
subject to             𝒜1(X) + Z₊ − Z₋ = 0
                       𝒜2(X) − G = 0
                       Z₊ ⪰ 0,  Z₋ ⪰ 0.        (P)

To derive the dual of the primal problem (P), we introduce the Lagrangian

L(X, Z±; Y1, Y2, Λ±) = −log det(X) + γ trace(Z₊ + Z₋) − ⟨Λ₊, Z₊⟩ − ⟨Λ₋, Z₋⟩ + ⟨Y1, 𝒜1(X) + Z₊ − Z₋⟩ + ⟨Y2, 𝒜2(X) − G⟩,

where the Hermitian matrices Y1, Y2, and Λ± ⪰ 0 are the dual variables, and ⟨·, ·⟩ represents the standard inner product ⟨M1, M2⟩ := trace(M1∗ M2).

Minimization of L with respect to the primal variables X and Z± yields the Lagrange dual of (P),

maximize_{Y1, Y2}   log det(𝒜1†(Y1) + 𝒜2†(Y2)) − ⟨G, Y2⟩ + n
subject to          ‖Y1‖₂ ≤ γ,        (D)

where the adjoints of the operators 𝒜1 and 𝒜2 are given by

𝒜1†(Y) = A∗Y + YA,
𝒜2†(Y) = C∗(E ∘ Y)C.
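The adjoint formulas can be verified directly against the defining identity ⟨𝒜(X), Y⟩ = ⟨X, 𝒜†(Y)⟩. A minimal sketch for real matrices with the trace inner product (assuming NumPy; the data are random and illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))
E = (rng.random((n, n)) < 0.5).astype(float)

inner = lambda M1, M2: np.trace(M1.T @ M2)   # <M1, M2> for real matrices

A1  = lambda X: A @ X + X @ A.T              # A1(X)  = A X + X A*
A1t = lambda Y: A.T @ Y + Y @ A              # A1†(Y) = A* Y + Y A
A2  = lambda X: (C @ X @ C.T) * E            # A2(X)  = (C X C*) o E
A2t = lambda Y: C.T @ (E * Y) @ C            # A2†(Y) = C* (E o Y) C

# The adjoint identity <A(X), Y> = <X, A†(Y)> holds for both operators.
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))
print(abs(inner(A1(X), Y) - inner(X, A1t(Y))))
print(abs(inner(A2(X), Y) - inner(X, A2t(Y))))
```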

The dual problem (D) is a convex optimization problem in the variables Y1, Y2 ∈ C^{n×n} with objective function Jd(Y1, Y2). These variables are dual feasible if the constraint in (D) is satisfied. This constraint is obtained by minimizing the Lagrangian with respect to Z₊ and Z₋, which leads to

Y1 + γI ⪰ Λ₊ ⪰ 0,   Z₊ ⪰ 0,
−Y1 + γI ⪰ Λ₋ ⪰ 0,   Z₋ ⪰ 0.

Therefore, we have that

−γI ⪯ Y1 ⪯ γI  ⟺  ‖Y1‖₂ ≤ γ.

In addition, minimizing L with respect to X yields

X⁻¹ = 𝒜1†(Y1) + 𝒜2†(Y2) ≻ 0.        (5)

In the case of primal and dual feasibility, any dual feasible pair (Y1, Y2) gives a lower bound on the optimal value J⋆p of the primal problem. The alternating minimization algorithm of Section IV-B effectively works as a proximal gradient algorithm on the dual problem and is developed to achieve sufficient dual ascent and satisfy (5).

IV. CUSTOMIZED ALGORITHMS

We next develop two customized algorithms based on the Alternating Direction Method of Multipliers (ADMM) and the Alternating Minimization Algorithm (AMA). These methods have been effectively employed in low-rank matrix recovery [20], sparse covariance selection [21], image denoising and magnetic resonance imaging [22], sparse feedback synthesis [23], sparse Gaussian graphical model estimation [24], and many other applications [25]–[28]. Our approach exploits the respective structures of the logarithmic barrier function and the nuclear norm, and is well-suited for solving large-scale and distributed optimization problems.

A. Alternating direction method of multipliers

The augmented Lagrangian associated with (CP1) is given by

Lρ(X, Z; Y1, Y2) = −log det(X) + γ ‖Z‖∗ + ⟨Y, 𝒜X + ℬZ − 𝒞⟩ + (ρ/2) ‖𝒜X + ℬZ − 𝒞‖²_F,

where Y := [Y1; Y2] with Hermitian Y1, Y2 ∈ C^{n×n} is the Lagrange multiplier, ρ is a positive scalar, and ‖·‖_F is the Frobenius norm.

The ADMM algorithm uses a sequence of iterations to find the minimizer of the constrained optimization problem (CP1),

X^{k+1} := argmin_X  Lρ(X, Z^k, Y^k)        (6a)
Z^{k+1} := argmin_Z  Lρ(X^{k+1}, Z, Y^k)        (6b)
Y^{k+1} := Y^k + ρ (𝒜X^{k+1} + ℬZ^{k+1} − 𝒞).        (6c)

ADMM iterations terminate when conditions on the primal and dual residuals are satisfied [26, Section 3.3],

‖𝒜X^{k+1} + ℬZ^{k+1} − 𝒞‖_F ≤ ε,
‖ρ 𝒜1†(Z^{k+1} − Z^k)‖_F ≤ ε.

1) Solution to the X-minimization problem (6a): For fixed {Z^k, Y^k}, minimizing the augmented Lagrangian with respect to X amounts to

minimize_X  −log det(X) + (ρ/2) ‖𝒜X − U^k‖²_F,

where U^k := −(ℬZ^k − 𝒞 + (1/ρ)Y^k). We use a proximal gradient approach [29] to solve this sub-problem. By linearizing the quadratic term around the current inner iterate X^i and adding a quadratic penalty on the difference between X and X^i, X^{i+1} is obtained as the minimizer of

−log det(X) + ρ ⟨𝒜†(𝒜X^i − U^k), X⟩ + (μ/2) ‖X − X^i‖²_F.        (7)

Here, the parameters ρ and μ satisfy μ ≥ ρ λmax(𝒜†𝒜), where the power iteration is used to compute λmax(𝒜†𝒜). By taking the variation of (7) with respect to X, we obtain the first-order optimality condition

μX − X⁻¹ = (μI − ρ𝒜†𝒜)X^i + ρ𝒜†(U^k).        (8)

The solution to (8) is given by

X^{i+1} = V diag(g) V∗,

where g is a vector with jth entry

g_j = λ_j/(2μ) + sqrt( (λ_j/(2μ))² + 1/μ ).

Here, the λ_j's are the eigenvalues of the matrix on the right-hand side of (8) and V is the matrix of the corresponding eigenvectors. As is typically done in proximal gradient algorithms [29], starting with X⁰ := X^k, we obtain X^{k+1} by repeating the inner iterations until the desired accuracy is reached.
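The closed-form update can be checked numerically: each eigenvalue g_j of X solves the scalar equation μg − 1/g = λ_j, which is exactly (8) in the eigenbasis. A minimal NumPy sketch, with a random symmetric right-hand side standing in for the matrix on the right of (8):

```python
import numpy as np

rng = np.random.default_rng(5)
n, mu = 6, 4.0

# A symmetric right-hand side standing in for (mu I - rho A†A) X^i + rho A†(U^k).
R = rng.standard_normal((n, n))
R = R + R.T

# Solve mu X - X^{-1} = R via the eigenvalue decomposition R = V diag(lam) V*.
lam, V = np.linalg.eigh(R)
g = lam / (2 * mu) + np.sqrt((lam / (2 * mu))**2 + 1 / mu)
X = V @ np.diag(g) @ V.T

# The optimality condition (8) is satisfied and X is positive definite,
# since every g_j is strictly positive regardless of the sign of lam_j.
residual = mu * X - np.linalg.inv(X) - R
print(np.linalg.norm(residual))
```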

2) Solution to the Z-minimization problem (6b): For fixed {X^{k+1}, Y^k}, the augmented Lagrangian is minimized with respect to Z,

minimize_Z  γ ‖Z‖∗ + (ρ/2) ‖Z − V^k‖²_F,        (9)

where V^k := −(𝒜1(X^{k+1}) + (1/ρ)Y1^k). The solution to (9) is obtained by singular value thresholding [30],

Z^{k+1} = S_{γ/ρ}(V^k).

For this purpose, we first compute the singular value decomposition of the symmetric matrix V^k = U Σ U∗, where Σ is the diagonal matrix of the singular values σ_i. The soft-thresholding operator S_τ is defined as

S_τ(V^k) := U S_τ(Σ) U∗,   S_τ(Σ) = diag((σ_i − τ)₊),

where a₊ = max{a, 0}. Thus, the optimality condition in (6b) is satisfied by applying the soft-thresholding operator S_{γ/ρ} to the singular values of V^k.
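Singular value thresholding is a few lines of NumPy; the sketch below uses a generic SVD (rather than the symmetric factorization above) and a random symmetric matrix in place of V^k:

```python
import numpy as np

rng = np.random.default_rng(6)
n, tau = 5, 1.0

# A random symmetric matrix playing the role of V^k.
Vk = rng.standard_normal((n, n))
Vk = (Vk + Vk.T) / 2

def svt(M, tau):
    """Singular value soft-thresholding: S_tau(M) = U diag((sigma_i - tau)_+) V*."""
    U, s, Vt = np.linalg.svd(M)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

Z = svt(Vk, tau)

# Each singular value of the result equals (sigma_i - tau)_+ for sigma_i of V^k.
s_V = np.linalg.svd(Vk, compute_uv=False)
s_Z = np.linalg.svd(Z, compute_uv=False)
print(np.linalg.norm(s_Z - np.maximum(s_V - tau, 0.0)))
```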

We also consider an accelerated variant of the ADMM algorithm. The accelerated algorithm is simply ADMM with a Nesterov-type (predictor-corrector) acceleration step. Due to weak convexity of the objective function, a restart rule is required [27]. Although a global convergence rate cannot be guaranteed, the restart rule reduces the oscillating behavior which is often encountered in first-order iterative methods [27], [31].

The accelerated ADMM algorithm is given by

X^k := argmin_X  Lρ(X, Ẑ^k, Ŷ^k)
Z^k := argmin_Z  Lρ(X^k, Z, Ŷ^k)
Y^k := Ŷ^k + ρ (𝒜X^k + ℬZ^k − 𝒞)
c_k := (1/ρ) ‖Y^k − Ŷ^k‖²_F + ρ ‖Z^k − Ẑ^k‖²_F
if c_k < η c_{k−1}:
    α_{k+1} := (1 + sqrt(1 + 4α_k²))/2
    Ẑ^{k+1} := Z^k + ((α_k − 1)/α_{k+1}) (Z^k − Z^{k−1})
    Ŷ^{k+1} := Y^k + ((α_k − 1)/α_{k+1}) (Y^k − Y^{k−1})
else:
    α_{k+1} := 1,   Ẑ^{k+1} := Z^{k−1},   Ŷ^{k+1} := Y^{k−1}
    c_k ← η⁻¹ c_{k−1},

where the hatted variables Ẑ, Ŷ denote the accelerated (predictor) iterates. Following [27], the algorithm is initialized with Ẑ⁰ = Z^{−1}, Ŷ⁰ = Y^{−1}, ρ > 0, α₁ = 1, and is terminated using criteria similar to those in the ADMM algorithm.
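The α-update and the restart rule are the essential bookkeeping of this scheme. The toy sketch below is not the paper's ADMM solver; it applies the same Nesterov α-sequence and a function-value restart to plain gradient descent on a random quadratic, assuming NumPy:

```python
import numpy as np

# Adaptive-restart accelerated gradient descent on a toy quadratic f(y) = 0.5 y' Q y,
# illustrating only the momentum and restart bookkeeping, not the ADMM steps.
rng = np.random.default_rng(7)
n = 10
M = rng.standard_normal((n, n))
Q = M @ M.T + np.eye(n)
f = lambda y: 0.5 * y @ Q @ y
grad = lambda y: Q @ y

step = 1.0 / np.linalg.eigvalsh(Q).max()
y = y_hat = rng.standard_normal(n)
alpha, f_prev = 1.0, f(y)

for _ in range(2000):
    y_new = y_hat - step * grad(y_hat)
    if f(y_new) < f_prev:                      # sufficient progress: accelerate
        alpha_new = (1 + np.sqrt(1 + 4 * alpha**2)) / 2
        y_hat = y_new + (alpha - 1) / alpha_new * (y_new - y)
        alpha = alpha_new
    else:                                      # restart: drop the momentum
        alpha, y_hat = 1.0, y
        y_new = y
    y, f_prev = y_new, f(y_new)

print(f(y))  # approaches the minimum f(0) = 0
```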

B. Alternating minimization algorithm

The alternating minimization algorithm, originally developed by Tseng [32], involves simpler steps than ADMM, but requires strong convexity of the smooth part of the objective function. Since the logarithmic barrier function in (CP) is strongly convex over any compact subset of the positive definite cone [33], we can use AMA to solve (CP).

AMA follows the sequence of iterations

X^{k+1} := argmin_X  L₀(X, Z^k, Y^k)        (10a)
Z^{k+1} := argmin_Z  Lρ(X^{k+1}, Z, Y^k)        (10b)
Y^{k+1} := Y^k + ρ (𝒜X^{k+1} + ℬZ^{k+1} − 𝒞),        (10c)

which terminate when the duality gap

∆gap := −log det(X^{k+1}) + γ ‖Z^{k+1}‖∗ − Jd(Y^{k+1}),

and the primal residual

∆p := ‖𝒜X^{k+1} + ℬZ^{k+1} − 𝒞‖_F,

are sufficiently small, i.e., |∆gap| ≤ ε and ∆p ≤ ε. In the X-minimization step, AMA minimizes the Lagrangian L₀ to obtain a closed-form expression for X^{k+1}. This is in contrast to ADMM, which minimizes the augmented Lagrangian in both the X- and Z-minimization steps. It can be shown that AMA works as a proximal gradient on the dual function, which allows us to select the step-size ρ so as to achieve sufficient dual ascent.

1) Solution to the X-minimization problem (10a): At the kth iteration of AMA, minimizing the Lagrangian L₀ with respect to X for fixed {Z^k, Y^k} yields

X^{k+1} = (𝒜†(Y^k))⁻¹.        (11)

2) Solution to the Z-minimization problem (10b): This step is identical to (6b) in the ADMM algorithm.

3) Lagrange multiplier update: The expressions for X^{k+1} and Z^{k+1} can be used to bring (10c) into the following form:

Y1^{k+1} = T_γ(Y1^k + ρ 𝒜1(X^{k+1}))
Y2^{k+1} = Y2^k + ρ (𝒜2(X^{k+1}) − G).

For a Hermitian matrix M with singular value decomposition M = U Σ U∗, we have

T_τ(M) := U T_τ(Σ) U∗,   T_τ(Σ) = diag(min(max(σ_i, −τ), τ)),

where the saturation operator T_τ restricts the singular values of M between −τ and τ. This guarantees dual feasibility of the update, i.e., ‖Y1^{k+1}‖₂ ≤ γ at each iteration, and justifies the choice of stopping criteria in ensuring primal feasibility of the solution.
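For a Hermitian matrix, the decomposition M = U Σ U∗ above is its eigendecomposition, so the saturation operator amounts to clipping eigenvalues to [−τ, τ]. A minimal NumPy sketch with a random real symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(8)
n, tau = 6, 0.5

# A random Hermitian (here real symmetric) matrix to be saturated.
M = rng.standard_normal((n, n))
M = (M + M.T) / 2

def saturate(M, tau):
    """T_tau(M): clip the eigenvalues of a Hermitian M to the interval [-tau, tau]."""
    lam, U = np.linalg.eigh(M)
    return U @ np.diag(np.clip(lam, -tau, tau)) @ U.T

Y = saturate(M, tau)

# The spectral norm of the result is bounded by tau, which is exactly the
# dual feasibility condition ||Y1||_2 <= gamma enforced by T_gamma.
print(np.linalg.norm(Y, 2))
```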

4) Choice of step-size for the dual update (10c): We follow an enhanced variant of AMA [24] which utilizes an adaptive Barzilai-Borwein step-size selection [34] in (10b) and (10c) to guarantee sufficient dual ascent and positive definiteness of X. Our numerical experiments indicate that this provides substantial acceleration relative to the use of a fixed step-size. Since the standard Barzilai-Borwein step-size may not always satisfy the feasibility or sufficient ascent conditions, we determine an appropriate step-size through backtracking.

At the kth iteration of AMA, an initial step-size

ρ_{k,0} = ⟨Y^{k+1} − Y^k, Y^{k+1} − Y^k⟩ / ⟨Y^{k+1} − Y^k, ∇Jd(Y^k) − ∇Jd(Y^{k+1})⟩

is adjusted through a backtracking procedure to guarantee positive definiteness of the subsequent iterate of (10a) and sufficient ascent of the dual function:

𝒜†(Y^{k+1}) ≻ 0        (12a)
Jd(Y^{k+1}) ≥ Jd(Y^k) + ⟨∇Jd(Y^k), Y^{k+1} − Y^k⟩ − (1/(2ρ_k)) ‖Y^{k+1} − Y^k‖²_F.        (12b)

Here, ∇Jd is the gradient of the dual function. Condition (12a) guarantees the positive definiteness of X^{k+1}, cf. (11), and the right-hand side of (12b) is a local quadratic approximation of the dual objective around Y^k.
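The Barzilai-Borwein ratio itself is a one-line computation from successive iterates and gradients. The sketch below uses a hypothetical concave quadratic dual with a linear gradient map (not the Jd of the paper) so that the ratio is guaranteed positive, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 6

# A hypothetical concave quadratic dual with gradient grad(Y) = S - (P Y + Y P),
# where P is symmetric positive definite; illustrative only.
P = rng.standard_normal((n, n))
P = P @ P.T + np.eye(n)
S = rng.standard_normal((n, n))
grad = lambda Y: S - (P @ Y + Y @ P)

inner = lambda M1, M2: np.trace(M1.T @ M2)

def bb_step(Y_old, Y_new):
    """Barzilai-Borwein step: <dY, dY> / <dY, grad(Y_old) - grad(Y_new)>."""
    dY = Y_new - Y_old
    dG = grad(Y_old) - grad(Y_new)
    return inner(dY, dY) / inner(dY, dG)

Y0 = rng.standard_normal((n, n))
Y1 = Y0 + 0.1 * grad(Y0)   # one ascent step
rho = bb_step(Y0, Y1)
print(rho)                 # positive for this concave quadratic
```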

C. Computational complexity

The X-minimization step involves an eigenvalue decomposition in ADMM and a matrix inversion in AMA, which costs O(n³) operations in both cases (n is the number of states). Both methods have Z-minimization steps that amount to a singular value decomposition and require O(n³) operations. Thus, the total computational cost of a single iteration of our customized algorithms is O(n³). In contrast, standard SDP solvers have a worst-case complexity of O(n⁶).

TABLE I: Comparison of different algorithms (run times in seconds) for different numbers of masses and γ = 10.

N      CVX       ADMM      Fast ADMM + Restart   AMA
10     67.04     6.80      3.10                  7.00
20     942.23    105.55    65.52                 36.97
50     –         3930.4    3492.5                625.92
100    –         40754     34420                 5429.8

V. NUMERICAL EXPERIMENTS

We present an illustrative example to demonstrate the effectiveness of our customized algorithms. Consider a mass-spring-damper (MSD) system subject to stochastic disturbances that are generated by a low-pass filter:

low-pass filter:  ζ̇ = −ζ + d        (13a)
MSD system:      ẋ = Ax + Bζ.        (13b)

The state vector x = [p∗ v∗]∗ contains the positions and velocities of the masses, and d represents a zero-mean, unit-variance white process. The state and input matrices are

A = [ O   I ;  −T  −I ],   B = [ 0 ; I ],

where O and I are the zero and identity matrices, and T is a symmetric tridiagonal Toeplitz matrix with 2 on the main diagonal and −1 on the first upper and lower sub-diagonals.
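The MSD matrices above are easy to assemble; a minimal NumPy sketch, which also confirms that the resulting A is Hurwitz:

```python
import numpy as np

N = 10  # number of masses

# Symmetric tridiagonal Toeplitz T: 2 on the diagonal, -1 on the sub/super-diagonals.
T = 2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)

O, I = np.zeros((N, N)), np.eye(N)
A = np.block([[O, I], [-T, -I]])   # MSD dynamics in [position; velocity] coordinates
B = np.vstack([O, I])              # forcing enters through the velocity equations

# A is Hurwitz: all eigenvalues have negative real part.
print(np.max(np.linalg.eigvals(A).real))
```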

The steady-state covariance of system (13) can be found as the solution to the Lyapunov equation

Ā Σ + Σ Ā∗ + B̄ B̄∗ = 0,

where the matrices of the cascade connection are

Ā = [ A   B ;  O  −I ],   B̄ = [ 0 ; I ],   Σ = [ Σxx  Σxζ ;  Σζx  Σζζ ].

The sub-covariance Σxx denotes the state covariance of the MSD system, and is partitioned as

Σxx = [ Σpp  Σpv ;  Σvp  Σvv ].

We assume knowledge of the one-point correlations of the positions and velocities of the masses, i.e., the diagonal elements of Σpp, Σvv, and Σpv. In order to account for these available statistics, we solve (CP) for a state covariance X of the MSD system which agrees with the available statistics.

Numerical experiments were conducted for N = 10, 20, 50, and 100 masses and for various values of γ. For each method, iterations were run until an iterate achieved a certain distance from optimality, i.e., ‖X^k − X⋆‖/‖X⋆‖ < ε1 and ‖Z^k − Z⋆‖/‖Z⋆‖ < ε2. For γ = 10, Table I compares the various methods based on run times (in seconds). For N = 50 and N = 100, CVX [35] ran out of memory. The choice ε1 = ε2 = 10⁻³ with γ = 10 guarantees sufficient optimality in matching the primal constraints and achieving low-rank solutions that bring the objective to within 0.1% of Jp(X⋆, Z⋆). Clearly, AMA outperforms ADMM for large problems.

We now focus on the performance of the customized alternating minimization algorithm. For 50 masses and γ = 2.2, we use the algorithm discussed in Section IV-B to solve (CP) with ε = 10⁻³. Figure 2a illustrates the monotonic increase of the dual objective function. The absolute value of the duality gap, |∆gap|, and the primal residual, ∆p, are displayed in Fig. 2, demonstrating convergence.

Fig. 2: Performance of the customized AMA for the MSD system with 50 masses, γ = 2.2, and ε = 10⁻³. (a) The dual objective function Jd(Y1, Y2) of (CP); (b) the duality gap, |∆gap|; and (c) the primal residual, ∆p.

In (CP), the parameter γ determines the importance of the nuclear norm relative to the logarithmic barrier function. While larger values of γ result in lower-rank solutions for Z, they may fail to achieve a reasonable completion of the "ideal" state covariance Σxx. When the true values of all entries of the covariance matrix are not known, γ is typically chosen on an empirical basis or by cross-validation. In our example, however, the true state covariance Σxx is known. The minimum error is obtained at γ = 1.4, but this value of γ does not yield a low-rank input correlation Z. For γ = 2.2, reasonable matching is obtained (82.7% matching) and the resulting Z displays a clear cut in its singular values, with 62 of them being nonzero. Figure 3b shows the recovered covariance matrix of mass positions, Xpp. We observe reasonable correspondence with the true covariance Σpp in Fig. 3a.

Fig. 3: The matrix Σpp of the MSD system and the covariance Xpp obtained as a solution to (CP).

VI. CONCLUDING REMARKS

We are interested in explaining partially known second-order statistics that originate from experimental measurements or simulations using stochastic linear models. We assume that the linearized approximation of the dynamical generator is known, whereas the nature and directionality of the disturbances that can explain the partially observed statistics are unknown. This inverse problem can be formulated as a convex optimization problem, which utilizes nuclear norm minimization to identify noise parameters of low rank and to complete the unavailable covariance data. We also constrain the solution set to positive definite state covariances by including a logarithmic barrier function in the objective. To efficiently solve covariance completion problems of large size, we develop customized algorithms based on alternating direction methods. Numerical experiments show that the enhanced variant of AMA performs much better than strategies based on ADMM, especially for large-scale problems.

REFERENCES

[1] B. F. Farrell and P. J. Ioannou, "Stochastic forcing of the linearized Navier-Stokes equations," Phys. Fluids A, vol. 5, no. 11, pp. 2600–2609, 1993.
[2] B. Bamieh and M. Dahleh, "Energy amplification in channel flows with stochastic excitation," Phys. Fluids, vol. 13, no. 11, pp. 3258–3269, 2001.
[3] M. R. Jovanović and B. Bamieh, "Componentwise energy amplification in channel flows," J. Fluid Mech., vol. 534, pp. 145–183, 2005.
[4] R. Moarref and M. R. Jovanović, "Model-based design of transverse wall oscillations for turbulent drag reduction," J. Fluid Mech., vol. 707, pp. 205–240, September 2012.
[5] M. R. Jovanović and B. Bamieh, "Modelling flow statistics using the linearized Navier-Stokes equations," in Proceedings of the 40th IEEE Conference on Decision and Control, 2001, pp. 4944–4949.
[6] M. R. Jovanović and T. T. Georgiou, "Reproducing second order statistics of turbulent flows using linearized Navier-Stokes equations with forcing," in Bulletin of the American Physical Society, Long Beach, CA, November 2010.
[7] T. T. Georgiou, "The structure of state covariances and its relation to the power spectrum of the input," IEEE Trans. Autom. Control, vol. 47, no. 7, pp. 1056–1066, 2002.
[8] T. T. Georgiou, "Spectral analysis based on the state covariance: the maximum entropy spectrum and linear fractional parametrization," IEEE Trans. Autom. Control, vol. 47, no. 11, pp. 1811–1823, 2002.
[9] Y. Chen, M. R. Jovanović, and T. T. Georgiou, "State covariances and the matrix completion problem," in Proceedings of the 52nd IEEE Conference on Decision and Control, 2013, pp. 1702–1707.
[10] A. Zare, Y. Chen, M. R. Jovanović, and T. T. Georgiou, "Low-complexity modeling of partially available second-order statistics via matrix completion," IEEE Trans. Autom. Control, 2014, submitted; also arXiv:1412.3399v1.
[11] M. Fazel, "Matrix rank minimization with applications," Ph.D. dissertation, Stanford University, 2002.
[12] E. J. Candès and B. Recht, "Exact matrix completion via convex optimization," Found. Comput. Math., vol. 9, no. 6, pp. 717–772, 2009.
[13] R. Keshavan, A. Montanari, and S. Oh, "Matrix completion from a few entries," IEEE Trans. Inform. Theory, vol. 56, no. 6, pp. 2980–2998, 2010.
[14] B. Recht, M. Fazel, and P. A. Parrilo, "Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization," SIAM Rev., vol. 52, no. 3, pp. 471–501, 2010.
[15] E. J. Candès and Y. Plan, "Matrix completion with noise," Proc. IEEE, vol. 98, no. 6, pp. 925–936, 2010.
[16] R. Salakhutdinov and N. Srebro, "Collaborative filtering in a non-uniform world: Learning with the weighted trace norm," Advances in Neural Information Processing Systems 24, 2011.
[17] F. Lin, M. R. Jovanović, and T. T. Georgiou, "An ADMM algorithm for matrix completion of partially known state covariances," in Proceedings of the 52nd IEEE Conference on Decision and Control, 2013, pp. 1684–1689.
[18] A. Zare, M. R. Jovanović, and T. T. Georgiou, "Completion of partially known turbulent flow statistics," in Proceedings of the 2014 American Control Conference, Portland, OR, 2014, pp. 1680–1685.
[19] A. Zare, M. R. Jovanović, and T. T. Georgiou, "Completion of partially known turbulent flow statistics via convex optimization," in Proceedings of the 2014 Summer Program, Center for Turbulence Research, Stanford University/NASA, 2014, pp. 345–354.
[20] M. Tao and X. Yuan, "Recovering low-rank and sparse components of matrices from incomplete and noisy observations," SIAM J. Optim., vol. 21, no. 1, pp. 57–81, 2011.
[21] X. Yuan, "Alternating direction method for covariance selection models," J. Sci. Comput., vol. 51, no. 2, pp. 261–273, 2012.
[22] T. Goldstein and S. Osher, "The split Bregman method for ℓ1-regularized problems," SIAM J. Imaging Sci., vol. 2, no. 2, pp. 323–343, 2009.
[23] F. Lin, M. Fardad, and M. R. Jovanović, "Design of optimal sparse feedback gains via the alternating direction method of multipliers," IEEE Trans. Autom. Control, vol. 58, no. 9, pp. 2426–2431, 2013.
[24] O. A. Dalal and B. Rajaratnam, "G-AMA: Sparse Gaussian graphical model estimation via alternating minimization," arXiv preprint arXiv:1405.3034, 2014.
[25] Z. Wen, D. Goldfarb, and W. Yin, "Alternating direction augmented Lagrangian methods for semidefinite programming," Math. Program. Comp., vol. 2, no. 3-4, pp. 203–230, 2010.
[26] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Found. Trends Mach. Learn., vol. 3, no. 1, pp. 1–122, 2011.
[27] T. Goldstein, B. O'Donoghue, S. Setzer, and R. Baraniuk, "Fast alternating direction optimization methods," SIAM J. Imaging Sci., vol. 7, no. 3, pp. 1588–1623, 2014.
[28] B. O'Donoghue, E. Chu, N. Parikh, and S. Boyd, "Operator splitting for conic optimization via homogeneous self-dual embedding," arXiv preprint arXiv:1312.3039, 2013.
[29] N. Parikh and S. Boyd, "Proximal algorithms," Found. Trends Optim., vol. 1, no. 3, pp. 123–231, 2013.
[30] J.-F. Cai, E. J. Candès, and Z. Shen, "A singular value thresholding algorithm for matrix completion," SIAM J. Optim., vol. 20, no. 4, pp. 1956–1982, 2010.
[31] B. O'Donoghue and E. Candès, "Adaptive restart for accelerated gradient schemes," Found. Comput. Math., pp. 1–18, 2013.
[32] P. Tseng, "Applications of a splitting algorithm to decomposition in convex programming and variational inequalities," SIAM J. Control Optim., vol. 29, no. 1, pp. 119–138, 1991.
[33] O. Banerjee, L. El Ghaoui, and A. d'Aspremont, "Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data," J. Mach. Learn. Res., vol. 9, pp. 485–516, 2008.
[34] J. Barzilai and J. M. Borwein, "Two-point step size gradient methods," IMA J. Numer. Anal., vol. 8, no. 1, pp. 141–148, 1988.
[35] M. Grant and S. Boyd, "CVX: Matlab software for disciplined convex programming, version 2.1," http://cvxr.com/cvx, Mar. 2014.
