
STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED
SECOND-ORDER CONE PROGRAMMING PROBLEMS

by

TAKAYUKI OKUNO

Submitted in partial fulfillment of

the requirement for the degree of

DOCTOR OF INFORMATICS

(Applied Mathematics and Physics)


KYOTO UNIVERSITY

KYOTO 606–8501, JAPAN

JANUARY 2013


Preface

The second-order cone (SOC), also called the Lorentz cone, is a special class of convex cones. So far, many studies have analyzed the structure of the SOC from geometric and algebraic standpoints. The second-order cone programming problem (SOCP) is an optimization problem that contains SOCs in its constraints. It is well known that the SOCP represents a wide class of optimization problems. In fact, standard nonlinear programming problems can naturally be regarded as a subclass of SOCPs, and a class of robust optimization problems and convex quadratic programming problems can be reformulated as linear SOCPs (LSOCPs). In the last decade, research on the SOCP has advanced significantly. For example, many efficient algorithms have been proposed for solving SOCPs, such as the primal-dual interior-point method, the smoothing Newton method, and SQP-type methods. In particular, for solving the LSOCP, several software packages implementing the primal-dual interior-point method have been produced, and their superior performance has been reported by a large number of researchers. Moreover, the SOCP finds a wide variety of real-world applications, such as the antenna array weight design problem, the finite impulse response (FIR) filter design problem, portfolio optimization with loss risk constraints, the magnetic shield design problem for maglev trains, and so on. Thanks to such useful algorithms and applications, the SOCP attracts much attention from researchers in various fields and is expected to play an important role in the future. However, we often face practical problems that can be formulated as optimization problems with SOC constraints whose structures are quite different from that of the standard SOCP. To deal with such real-world problems, we need to consider new, enhanced SOCP models and develop efficient methods for solving them.

In this thesis, we focus on two classes of problems which generalize the SOCP. The first is the semi-infinite second-order cone programming problem (SISOCP), an optimization problem with an infinite number of SOC constraints. The second is the mathematical programming problem with SOC complementarity constraints (MPSOCC). Although the MPSOCC superficially looks like an ordinary SOCP, it cannot be handled within the conventional framework established for solving the SOCP because of the complementarity constraints. Many important real-world problems can be represented directly as an SISOCP or an MPSOCC. For example, FIR filter-design and Chebyshev-like approximation problems involving vector-valued functions can be formulated as SISOCPs, and a bilevel programming problem that possesses a parametric SOCP as its lower-level problem can be expressed as an MPSOCC.

The main purpose of the thesis is to propose numerical methods for the SISOCP and the MPSOCC. To solve the SISOCP, we propose two different algorithms. The first is based on an exchange-type method, in which we generate iteration points that are optima of relaxed SISOCPs with finitely many SOC constraints. The second is based on a local reduction method, in which the SISOCP is locally reduced to an SOCP by means of finitely many implicit functions. Both methods must compute an optimal solution of an SOCP in each iteration, which can be obtained quite efficiently by existing algorithms. To solve the MPSOCC, we propose a smoothing SQP method, in which we apply the SQP method combined with a smoothing technique to a reformulation of the MPSOCC without SOC complementarity constraints. By using approximate solutions of the smoothed problems, the proposed method attains practical efficiency.

The author hopes that the results of this thesis will contribute to the further development of research on optimization problems with SOC constraints.

Takayuki Okuno
January 2013


Acknowledgment

First of all, I am deeply grateful to Professor Masao Fukushima of Kyoto University. Throughout my master's and doctoral courses, he carefully read my manuscripts and gave me a great deal of advice. What I have learned from him is not only basic methodologies for research but also the fundamental attitude of a researcher. With such a spirit in mind, I will try to improve myself so that I can become an excellent researcher like him in the future. Secondly, I would like to express my appreciation to Assistant Professor Shunsuke Hayashi of Kyoto University. He has encouraged and supervised me ever since I came to Professor Fukushima's laboratory. Many of the results attained in this thesis are due to his help. Without his presence, I would not have entered the doctoral course. Thirdly, I would like to thank Associate Professor Nobuo Yamashita of Kyoto University. He always answered my trivial questions sincerely. Also, his deep insights into the field of optimization were an incentive for me. His advice often became a hint for promoting my research from different viewpoints. In addition, I would like to tender my acknowledgments to Mrs. Fumie Yagura, a secretary of the laboratory. She kindly took care of me and handled my faculty documents.

I am grateful to the alumni and members of the laboratory. It was very pleasant for me to discuss and study optimization together with them. I am especially thankful to Kenji Ueda, Munetaka Kanayama, Noritoshi Kurokawa, Ryoich Nishimura, Katsunori Kubota, Ryota Otsubo, Daisuke Yamamoto, Kenji Hamatani, and Takahiro Nishi, who helped me whenever I faced difficulties. Further, my heartfelt appreciation goes to Assistant Professor Hiroshige Dan of Kansai University. He not only gave me encouragement but also prepared a letter of recommendation for my job interview while he was busy with his own work.

I also warmly thank my colleagues in the Karatedo club of Kyoto University, who have given me both physical and mental strength over these ten years.

Finally, I would like to express my highest gratitude to my family for supporting and encouraging me generously during my student life.


List of Tables

3.1 Choice of grid points in each experiment
3.2 Convergence behavior for Experiment 1
3.3 Comparison of regularized and non-regularized exchange methods
3.4 Results for Experiment 3-2

4.1 Results for Experiment 1 (n = 6)
4.2 Results for Experiment 1 (n = 8)
4.3 Results for Experiment 2 (n = 6)
4.4 Results for Experiment 2 (n = 8)
4.5 Comparison of Algorithm 4.1 and the QP-based method (Experiment 3)

5.1 Results for problems with a single SOC complementarity constraint (Experiment 1)
5.2 Results for problems with multiple SOC complementarity constraints (Experiment 1)
5.3 Results for Algorithm 5.1 (Experiment 2)
5.4 Comparison of Algorithm 5.1 and the smoothing method


List of Figures

3.1 The graph of ‖H(t) − h(u*, t)‖ (t ∈ [−1, 1]) in Experiment 3-1


Contents

Acknowledgment

1 Introduction
  1.1 Overview of problems
    1.1.1 Second-order cone programming problem
    1.1.2 Semi-infinite second-order cone programming problem
    1.1.3 Mathematical program with SOC complementarity constraints
  1.2 Outline of the thesis

2 Preliminaries
  2.1 Notations
  2.2 Basic properties of convex functions
  2.3 Basic properties of cones
  2.4 Natural residual associated with SOC complementarity condition

3 A regularized explicit exchange method for convex semi-infinite second-order cone programming problems
  3.1 Introduction
  3.2 KKT conditions for SICP
  3.3 Explicit exchange method for SICP
    3.3.1 Algorithm
    3.3.2 Global convergence under strict convexity assumption
  3.4 Regularized explicit exchange method for SICP
    3.4.1 Algorithm
    3.4.2 Global convergence without strict convexity assumption
  3.5 Numerical experiments
  3.6 Concluding remarks

4 A local reduction based SQP-type method for semi-infinite second-order cone programming problems
  4.1 Introduction
  4.2 Local reduction of SISOCP to SOCP
  4.3 Local reduction based SQP-type algorithm for the SISOCP
  4.4 Convergence analysis
    4.4.1 Global convergence
    4.4.2 Local convergence
  4.5 Numerical experiments
  4.6 Concluding remarks

5 A smoothing SQP method for mathematical programs with SOC complementarity constraints
  5.1 Introduction
  5.2 Cartesian P_0 and P matrices
  5.3 Reformulation of MPSOCC and B-stationary points of MPSOCC
  5.4 Smoothing function of natural residual
  5.5 Algorithm
  5.6 Convergence analysis
  5.7 Numerical experiments
  5.8 Concluding remarks

6 Conclusion


Chapter 1

Introduction

The second-order cone programming problem (SOCP) is an optimization problem that contains a finite number of so-called second-order cone (SOC) constraints. The SOCP has attracted many researchers in various fields, since many practical problems can be represented as SOCPs, and the resulting SOCPs can be solved quite efficiently by existing interior-point solvers [65, 68]. However, we often face practical problems that can be formulated as optimization problems with SOC constraints whose structures are quite different from that of the conventional SOCP model. To deal with such real-world problems, we need to consider new, enhanced SOCP models and examine their properties precisely. In this thesis, we focus on two classes of optimization problems which extend the SOCP. The first is the semi-infinite second-order cone programming problem (SISOCP), which contains an infinite number of SOC constraints. The second is the mathematical program with SOC complementarity constraints (MPSOCC). The main purpose of the thesis is to propose and develop algorithms for solving them.

In this chapter, we provide overviews of the SOCP, SISOCP, and MPSOCC, and then outline the contents of the thesis.

1.1 Overview of problems

1.1.1 Second-order cone programming problem

The second-order cone programming problem (SOCP) is an optimization problem of the following form:

\[
\begin{array}{ll}
\displaystyle\mathop{\text{Minimize}}_{x \in \mathbb{R}^n} & f(x) \\
\text{subject to} & g^0(x) = 0, \quad g^s(x) \in \mathcal{K}^{m_s} \ (s = 1, 2, \dots, S),
\end{array}
\tag{1.1.1}
\]

where f : R^n → R, g^0 : R^n → R^{m_0}, and g^s : R^n → R^{m_s} are continuously differentiable functions, and K^{m_s} ⊆ R^{m_s} for each s = 1, 2, . . . , S denotes the m_s-dimensional second-order cone defined by

\[
\mathcal{K}^{m_s} :=
\begin{cases}
\bigl\{(z_1, \bar z^\top)^\top \in \mathbb{R} \times \mathbb{R}^{m_s - 1} \;\big|\; z_1 \ge \|\bar z\|\bigr\} & (m_s \ge 2), \\
\mathbb{R}_+ := \{z \in \mathbb{R} \mid z \ge 0\} & (m_s = 1).
\end{cases}
\tag{1.1.2}
\]
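To make (1.1.2) concrete, here is a minimal numerical membership test (a Python/NumPy sketch; the function name in_soc and the tolerance argument are our own illustrative choices, not from the thesis):

```python
import numpy as np

def in_soc(z, tol=0.0):
    """Test z in K^m as in (1.1.2): for m >= 2, z = (z1, zbar) lies in
    K^m iff z1 >= ||zbar||_2; for m = 1, K^1 = R_+."""
    z = np.asarray(z, dtype=float)
    if z.size == 1:
        return z[0] >= -tol
    return z[0] >= np.linalg.norm(z[1:]) - tol

print(in_soc([5.0, 3.0, 4.0]))   # True:  5 >= sqrt(3**2 + 4**2) = 5
print(in_soc([4.9, 3.0, 4.0]))   # False: 4.9 < 5
```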

When m_1 = m_2 = · · · = m_S = 1, SOCP (1.1.1) naturally reduces to the standard nonlinear programming problem. Hence, the SOCP includes nonlinear programming problems as a subclass. Studies on the SOCP have advanced significantly in the last decade, in both theoretical and practical respects. In particular, the development of research on the linear SOCP (LSOCP), which has a linear objective function and affine constraint functions, is notable. The LSOCP has wide applications in economics and engineering, such as robust portfolio selection [22, 20], filter design [70, 72, 8, 40], magnetic shield design [58], and so on [37, 1]. The primal-dual interior-point method [37, 1, 69, 42] is well known as an effective algorithm for solving the LSOCP, and several software packages implementing it [65, 68] have been produced. Compared with the LSOCP, the general case (1.1.1) is more complicated and has not been studied as much. So far, the augmented Lagrangian method [36], the primal-dual interior-point method [76], an SQP-type method [33], and others have been proposed for solving it.

1.1.2 Semi-infinite second-order cone programming problem

The classical semi-infinite programming problem (SIP) is characterized as an optimization problem with a finite number of variables and an infinite number of inequality constraints. The SIP has been studied extensively. It also has wide applications in economics and engineering, e.g., air pollution control, robot trajectory planning, stress analysis of materials, etc. [29, 38]. So far, many efficient algorithms have been proposed for solving SIPs, such as discretization-type methods [21, 28, 54, 64], local reduction-type methods [24, 30, 46, 50, 51, 66], exchange-type methods [21, 27, 34, 73, 47, 79], Newton-type methods [44, 62, 63, 35, 53, 61, 7, 52], and others [16, 73, 31, 60].

In the thesis, we consider the following SIP with infinitely many SOC constraints:

\[
\begin{array}{ll}
\displaystyle\mathop{\text{Minimize}}_{x \in \mathbb{R}^n} & f(x) \\
\text{subject to} & g^0(x) = 0, \quad g^s(x, t) \in \mathcal{K}^{m_s} \ \text{for all } t \in T^s \ (s = 1, 2, \dots, S),
\end{array}
\tag{1.1.3}
\]

where f : R^n → R, g^0 : R^n → R^{m_0}, and g^s : R^n × R^{ℓ_s} → R^{m_s} (s = 1, 2, . . . , S) are continuous functions, T^s ⊆ R^{ℓ_s} (s = 1, 2, . . . , S) are nonempty compact index sets, and K^{m_s} is the m_s-dimensional SOC defined by (1.1.2) for each s = 1, 2, . . . , S. We call this problem the semi-infinite second-order cone programming problem, abbreviated as SISOCP. SISOCP (1.1.3) includes the classical SIP and the SOCP as subclasses. In fact, if each index set T^s (s = 1, 2, . . . , S) consists of finitely many elements, then SISOCP (1.1.3) is nothing but SOCP (1.1.1). On the other hand, when m_1 = m_2 = · · · = m_S = 1, i.e., K^{m_s} = R_+ for s = 1, 2, . . . , S, SISOCP (1.1.3) reduces to the standard SIP.

There are some important applications of SISOCP (1.1.3). For example, SISOCP (1.1.3) can be used to formulate a Chebyshev-like approximation problem involving vector-valued functions. Specifically, let X ⊆ R^ℓ be a nonempty set, Y ⊆ R^n be a given compact set, and Φ : Y → R^m and F : R^ℓ × Y → R^m be given functions. Then, we seek a parameter u ∈ X such that Φ(y) ≈ F(u, y) for all y ∈ Y. One relevant approach is to solve the following problem:

\[
\mathop{\text{Minimize}}_{u \in X} \ \max_{y \in Y} \, \|\Phi(y) - F(u, y)\|.
\tag{1.1.4}
\]

By introducing an auxiliary variable r ∈ R, we can transform the above problem to

\[
\begin{array}{ll}
\displaystyle\mathop{\text{Minimize}}_{(u, r) \in X \times \mathbb{R}} & r \\
\text{subject to} & \begin{pmatrix} r \\ \Phi(y) - F(u, y) \end{pmatrix} \in \mathcal{K}^{m+1} \ \text{for all } y \in Y,
\end{array}
\tag{1.1.5}
\]
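To illustrate how a problem of the form (1.1.5) can be handled numerically, the sketch below solves a crude finite surrogate in which Y is replaced by a fixed grid. The target Φ, the affine model F(u, y) = B(y)u, and the use of the cvxpy package are all our own illustrative assumptions; moreover, this naive uniform discretization is precisely the approach that the exchange method of Chapter 3 is designed to improve upon.

```python
import numpy as np
import cvxpy as cp

# Target Phi : [-1,1] -> R^2 and an affine model F(u, y) = B(y) @ u
# (a degree-2 polynomial in each component; purely illustrative).
grid = np.linspace(-1.0, 1.0, 200)          # finite stand-in for Y
Phi = np.stack([np.sin(np.pi * grid), np.cos(np.pi * grid)], axis=1)

def B(y):
    # 2 x 6 design matrix: rows correspond to the two output components
    return np.array([[1.0, y, y**2, 0.0, 0.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0, y, y**2]])

u = cp.Variable(6)
r = cp.Variable()
# One SOC constraint (r, Phi(y) - F(u, y)) in K^3 per grid point
constraints = [cp.SOC(r, Phi[i] - B(y) @ u) for i, y in enumerate(grid)]
prob = cp.Problem(cp.Minimize(r), constraints)
prob.solve()
print("max deviation on the grid:", r.value)
```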

which is of the form (1.1.3).

Another important application of SISOCP (1.1.3) is finite impulse response (FIR) filter design [55, 72]. Generally, FIR filter design is to determine a vector h := (h_0, h_1, . . . , h_{n−1})^⊤ ∈ R^n such that the frequency response function H : R^n × R → C defined by

\[
H(h, \omega) := \sum_{k=0}^{n-1} h_k e^{-k\omega\sqrt{-1}}
\]

satisfies some given conditions for all ω ∈ [ω_1, ω_2] ⊆ [0, 2π]. The following problem is called the Log-Chebyshev approximation FIR filter problem [72]:

\[
\mathop{\text{Minimize}}_{h \in \mathbb{R}^n} \ \sup_{\omega \in [0, \pi]} \bigl|\, \log |H(h, \omega)| - \log D(\omega) \,\bigr|,
\tag{1.1.6}
\]

where D : [0, π] → R is a given desired frequency magnitude such that D(ω) > 0 for all ω ∈ [0, π]. By letting R(h, ω) := |H(h, ω)|² and using an auxiliary variable r ∈ R, problem (1.1.6) is expressed as

\[
\begin{array}{ll}
\displaystyle\mathop{\text{Minimize}}_{(r, h) \in \mathbb{R} \times \mathbb{R}^n} & r \\
\text{subject to} & 1/r \le R(h, \omega)/D(\omega)^2 \le r, \quad R(h, \omega) \ge 0 \ \text{for all } \omega \in [0, \pi],
\end{array}
\]

which can further be rewritten as

\[
\begin{array}{ll}
\displaystyle\mathop{\text{Minimize}}_{(r, h) \in \mathbb{R} \times \mathbb{R}^n} & r \\
\text{subject to} & r D(\omega)^2 - R(h, \omega) \ge 0, \quad R(h, \omega) \ge 0, \\[1mm]
& \begin{pmatrix} R(h, \omega) + r \\ R(h, \omega) - r \\ 2 D(\omega) \end{pmatrix} \in \mathcal{K}^3 \ \text{for all } \omega \in [0, \pi],
\end{array}
\tag{1.1.7}
\]

which is of the form of SISOCP (1.1.3) with variables (r, h) ∈ R × R^n. In [39, 70], other kinds of filter design are considered, and those design problems are formulated as SISOCPs with infinitely many SOC constraints. However, such problems are actually solved via a uniform discretization.
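To see concretely why the K³ constraint in (1.1.7) encodes the lower bound 1/r ≤ R(h, ω)/D(ω)², one can expand the SOC inequality (a one-step check, suppressing the arguments (h, ω) and using R ≥ 0 and r > 0):

\[
\begin{pmatrix} R + r \\ R - r \\ 2D \end{pmatrix} \in \mathcal{K}^3
\iff (R + r)^2 \ge (R - r)^2 + 4D^2
\iff 4Rr \ge 4D^2
\iff \frac{1}{r} \le \frac{R}{D^2},
\]

while the linear constraint rD(ω)² − R(h, ω) ≥ 0 supplies the matching upper bound R(h, ω)/D(ω)² ≤ r.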

Although such important applications exist, there have been very few studies on the SISOCP. One possible reason is that any closed convex cone, including the SOC, can be represented as an intersection of finitely or infinitely many halfspaces. One may therefore reformulate (1.1.3) as a classical SIP with infinitely many inequality constraints and solve it by existing SIP algorithms. However, such a reformulation approach brings serious difficulties, since the dimension of the index set may become much larger than that of the original SISOCP (1.1.3). To see this, consider SISOCP (1.1.3) with a single index set T ⊆ R^ℓ and a semi-infinite SOC constraint g(x, t) ∈ K^m (t ∈ T). Since K^m = {z ∈ R^m | z^⊤ v ≥ 0, ∀v ∈ V}, where V := {(1, \bar v^⊤)^⊤ ∈ R^m | ‖\bar v‖ = 1}, SISOCP (1.1.3) can be reformulated as the SIP: min f(x) s.t. v^⊤ g(x, t) ≥ 0 for all (v, t) ∈ V × T. The dimension of V × T is then equal to m + ℓ − 1.
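The halfspace representation of K^m can be sanity-checked numerically by sampling directions on the unit sphere (an illustrative Python/NumPy sketch only; a random sample of directions is of course not a proof of membership):

```python
import numpy as np
rng = np.random.default_rng(0)

def soc_via_halfspaces(z, n_dirs=20000):
    """Approximate test of z in K^m using K^m = {z | v^T z >= 0 for all
    v = (1, vbar) with ||vbar|| = 1}, with vbar sampled at random."""
    z = np.asarray(z, dtype=float)
    vbar = rng.normal(size=(n_dirs, z.size - 1))
    vbar /= np.linalg.norm(vbar, axis=1, keepdims=True)
    return bool(np.all(z[0] + vbar @ z[1:] >= 0))

print(soc_via_halfspaces([5.0, 3.0, 4.0]))   # True  (5 >= ||(3,4)|| = 5)
print(soc_via_halfspaces([4.0, 3.0, 4.0]))   # False with high probability
```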

In the thesis, we propose two algorithms for solving SISOCP (1.1.3) while maintaining the SOC structure. The first is a regularized explicit exchange method, developed for SISOCPs with a convex objective function and infinitely many affine SOC constraint functions. It solves a sequence of finitely relaxed SISOCPs to generate iteration points. The second is a local reduction based SQP-type method, in which SISOCP (1.1.3) is locally represented as an SOCP by means of implicit functions and the resulting SOCP is solved by an SQP-type method to generate a search direction.

1.1.3 Mathematical program with SOC complementarity constraints

A mathematical program with equilibrium constraints (MPEC) [41] is an optimization problem whose constraints contain parametric variational inequalities. The MPEC has been studied extensively, since it finds wide applications such as design problems in engineering, equilibrium problems in economics, and game-theoretic multi-level optimization problems. In particular, the variational inequality constraints in MPECs are often written as linear or nonlinear complementarity constraints. Such an MPEC is also called a mathematical program with complementarity constraints (MPCC), for which many algorithms have been proposed. For example, Fukushima and Tseng [19] proposed an active set algorithm and proved that any accumulation point of the sequence generated by the algorithm is a B-stationary point under the uniform linear independence constraint qualification (LICQ) on the ε-feasible set. Luo, Pang, and Ralph [41] proposed a piecewise sequential quadratic programming (SQP) algorithm and showed that the generated sequence converges to a B-stationary point locally superlinearly or quadratically under the LICQ and second-order sufficient conditions. Fukushima, Luo, and Pang [17] proposed an SQP-type algorithm and showed that the sequence generated by the algorithm converges globally to a B-stationary point under a nondegeneracy condition at the limit point.

In the thesis, we consider the following mathematical program with linear SOC complementarity constraints (MPSOCC):

\[
\begin{array}{ll}
\displaystyle\mathop{\text{Minimize}}_{x, y, z} & f(x, y) \\
\text{subject to} & Ax \le b, \\
& z = Nx + My + q, \\
& \mathcal{K} \ni y \perp z \in \mathcal{K},
\end{array}
\tag{1.1.8}
\]

where f : R^{n+m} → R is a continuously differentiable function, A ∈ R^{p×n}, b ∈ R^p, N ∈ R^{m×n}, M ∈ R^{m×m}, and q ∈ R^m are given matrices and vectors, ⊥ denotes perpendicularity, and K is a Cartesian product of second-order cones, that is, K := K^{m_1} × K^{m_2} × · · · × K^{m_S} ⊆ R^{m_1} × R^{m_2} × · · · × R^{m_S} = R^m. A typical application of MPSOCC (1.1.8) is a bilevel programming problem [2, 14] that possesses a parametric SOCP as its lower-level problem. By replacing the lower-level SOCP with its Karush-Kuhn-Tucker (KKT) conditions, it is formulated as MPSOCC (1.1.8).

When m_s = 1 for all s, i.e., K = R^m_+, MPSOCC (1.1.8) reduces to the MPCC with linear nonnegative complementarity constraints. In addition, since K ∋ y ⊥ z ∈ K and z = Nx + My + q can be rewritten as a certain parametric variational inequality constraint, MPSOCC (1.1.8) can be classified as a special case of the MPEC. One may think that MPSOCC (1.1.8) can be regarded as an SOCP by viewing K ∋ y ⊥ z ∈ K as y ∈ K, z ∈ K, and y^⊤ z = 0, so that existing algorithms for the SOCP are applicable to the MPSOCC. However, since the Mangasarian-Fromovitz constraint qualification and the LICQ are violated at every feasible point of the MPSOCC [41], we cannot apply the convergence theories for the SOCP to MPSOCC (1.1.8) straightforwardly. Thus, we need to establish specialized theories and algorithms for solving MPSOCC (1.1.8).

A second-order cone complementarity problem (SOCCP) is one of the most important problems related to MPSOCC (1.1.8). The SOCCP is the problem of finding a triple (x, y, z) ∈ R^ν × R^n × R^n satisfying the SOC complementarity condition K ∋ y ⊥ z ∈ K and F(x, y, z) = 0, where F : R^ν × R^n × R^n → R^n × R^ν is a continuously differentiable vector-valued function. The existence and uniqueness properties of solutions to the SOCCP have been analyzed in terms of Jordan algebra [23, 67]. For solving the SOCCP, a number of algorithms have been proposed, such as smoothing methods [18, 26, 12, 43], matrix splitting methods [25, 74], and others [10, 9, 49]. Most of them are designed based on the reformulation approach, in which the SOC complementarity condition is reformulated as an equivalent system of equations by means of merit functions involving the Fischer-Burmeister function or the natural residual [18]. As applications of the SOCCP, the robust Nash equilibrium problem in an N-person non-cooperative game [45] and a certain equilibrium problem with frictional contact [32, 78] can be formulated as SOCCPs. In particular, since the KKT conditions for SOCP (1.1.1) can be represented as an SOCCP, SOCPs can be solved through the SOCCP.

Compared with the MPCC and the SOCCP, there are only a few studies on the MPSOCC. Recently, Yan and Fukushima [77] proposed a smoothing method for solving such a problem. To show convergence of the algorithm, they assume that the smoothed subproblems are solved exactly. However, such an approach incurs a large cost for computing solutions of those subproblems. In Chapter 5 of the thesis, we propose a more practical algorithm combining the smoothing method with the SQP method, which at each iteration solves a convex quadratic program that approximates a smoothed subproblem, instead of solving the latter exactly.

1.2 Outline of the thesis

The thesis is organized as follows. In Chapter 2, as preliminaries, we give some notations, fundamental properties, and mathematical techniques that are necessary in the later arguments. Chapters 3 and 4 are devoted to the study of SISOCP (1.1.3). More precisely, in Chapter 3, we consider SISOCP (1.1.3) with a convex objective function and infinitely many affine SOC constraints. For solving such SISOCPs, we combine a regularization scheme with an exchange method. With the help of the regularization technique, we can verify that, under mild assumptions, the generated iteration points converge globally to the optimal solution that attains the least 2-norm among the optimal solutions of the SISOCP. In Chapter 4, we construct an SQP-type method for solving the SISOCP of the form (1.1.3). This method locally represents infinitely many SOC constraints by finitely many SOC constraints using implicit functions. We carry out global and local convergence analyses. In Chapter 5, we propose an algorithm for solving MPSOCC (1.1.8). This method first replaces the SOC complementarity constraints with equality constraints using the smoothed natural residual function, and then applies the SQP method to the smoothed problems while decreasing the smoothing parameter. We show that the iteration points produced by the proposed method converge to a B-stationary point of MPSOCC (1.1.8) under the strict complementarity condition. In Chapter 6, we close the thesis with some concluding remarks and future work.


Chapter 2

Preliminaries

In this chapter, we give some fundamental notations and properties that will be useful in the subsequent chapters.

2.1 Notations

Throughout the thesis, we use the following notations. We denote the m-dimensional nonnegative and positive cones by

\[
\mathbb{R}^m_+ := \{z \in \mathbb{R}^m \mid z_i \ge 0, \ i = 1, 2, \dots, m\}
\quad \text{and} \quad
\mathbb{R}^m_{++} := \{z \in \mathbb{R}^m \mid z_i > 0, \ i = 1, 2, \dots, m\},
\]

respectively. The m-dimensional second-order cone is denoted by

\[
\mathcal{K}^m :=
\begin{cases}
\Bigl\{(z_1, z_2, \dots, z_m)^\top \in \mathbb{R}^m \;\Big|\; z_1 \ge \sqrt{\textstyle\sum_{i=2}^m (z_i)^2}\Bigr\} & (m \ge 2), \\
\mathbb{R}^1_+ & (m = 1).
\end{cases}
\]

We also denote the set of m-dimensional symmetric positive semidefinite matrices and the set of symmetric positive definite matrices by S^m_+ and S^m_{++}, respectively.

For a vector z := (z_1, z_2, . . . , z_n)^⊤ ∈ R^n, the 1-, 2-, and ∞-norms of z are defined by

\[
\|z\|_1 := |z_1| + |z_2| + \cdots + |z_n|, \qquad
\|z\|_2 := \sqrt{(z_1)^2 + (z_2)^2 + \cdots + (z_n)^2}, \qquad
\|z\|_\infty := \max(|z_1|, |z_2|, \dots, |z_n|),
\]

respectively. In particular, we often use ‖·‖ to denote the 2-norm instead of ‖·‖_2. For a matrix M ∈ R^{n×n}, ‖M‖ denotes the operator norm defined by

\[
\|M\| := \max_{\|x\|_2 = 1} \|Mx\|_2.
\]


For z_1 + √−1 z_2 ∈ C with z_1, z_2 ∈ R, its absolute value is defined by

\[
|z_1 + \sqrt{-1}\, z_2| := \sqrt{(z_1)^2 + (z_2)^2}.
\]

For any vector z ∈ R^m, we let

\[
(z)_+ := \max(z, 0),
\]

where the maximum is taken componentwise. For any scalar function ψ : R^n → R and z ∈ R^n, we define the function ψ_+ : R^n → R by

\[
\psi_+(z) := (\psi(z))_+.
\]

Also, for any vectors z^i ∈ R^{n_i} (i = 1, 2, . . . , p), we often write

\[
(z^1, z^2, \dots, z^p) := \bigl((z^1)^\top, (z^2)^\top, \dots, (z^p)^\top\bigr)^\top \in \mathbb{R}^{n_1 + n_2 + \cdots + n_p}.
\]

Moreover, for any vector z ∈ R^n and any vector function F : R^n → R^m, we let

\[
z = (z_1, \bar z) \in \mathbb{R} \times \mathbb{R}^{n-1}, \qquad
F(z) = (F_1(z), \bar F(z)) \in \mathbb{R} \times \mathbb{R}^{m-1},
\]

where \bar z and \bar F(z) collect all but the first components.

For a differentiable function f : R^n → R, we define the gradient of f at x by

\[
\nabla_x f(x) := \begin{pmatrix} \dfrac{\partial f(x)}{\partial x_1} \\ \vdots \\ \dfrac{\partial f(x)}{\partial x_n} \end{pmatrix}.
\]

We often write ∇f(x) to denote ∇_x f(x). In addition, if f is twice differentiable, the Hessian of f at x is defined by

\[
\nabla^2_{xx} f(x) :=
\begin{pmatrix}
\dfrac{\partial^2 f(x)}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f(x)}{\partial x_1 \partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial^2 f(x)}{\partial x_n \partial x_1} & \cdots & \dfrac{\partial^2 f(x)}{\partial x_n^2}
\end{pmatrix}.
\]

We often write ∇²f(x) to denote ∇²_{xx} f(x). Moreover, for a differentiable vector function F : R^n → R^m, we define the Jacobian of F at x by

\[
\nabla_x F(x) := \bigl(\nabla F_1(x) \ \cdots \ \nabla F_m(x)\bigr) =
\begin{pmatrix}
\dfrac{\partial F_1(x)}{\partial x_1} & \cdots & \dfrac{\partial F_m(x)}{\partial x_1} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial F_1(x)}{\partial x_n} & \cdots & \dfrac{\partial F_m(x)}{\partial x_n}
\end{pmatrix},
\]

where F(x) = (F_1(x), F_2(x), . . . , F_m(x))^⊤. We often write ∇F(x) to denote ∇_x F(x). (Note that, in this convention, ∇F(x) ∈ R^{n×m} is the transpose of the usual Jacobian.) If F is twice differentiable, for any vector y := (y_1, y_2, . . . , y_m)^⊤ ∈ R^m, we define

\[
\nabla^2 F(x) y := \sum_{i=1}^{m} y_i \nabla^2 F_i(x).
\]
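Since the Jacobian convention above (columns of ∇F(x) are the gradients ∇F_i(x), so the matrix is n × m) is easy to misread, the following finite-difference sanity check may help; the concrete F is invented purely for illustration:

```python
import numpy as np

def F(x):
    # F : R^2 -> R^3, illustrative only
    return np.array([x[0]**2, x[0]*x[1], np.sin(x[1])])

def grad_F(x):
    # nabla F(x) in the thesis's convention: row i holds dF/dx_i,
    # so the matrix is n x m (transpose of the usual Jacobian).
    return np.array([[2*x[0], x[1], 0.0],
                     [0.0,    x[0], np.cos(x[1])]])

x = np.array([1.0, 2.0])
h = 1e-6
fd = np.stack([(F(x + h*e) - F(x - h*e)) / (2*h)
               for e in np.eye(2)])          # central differences, n x m
print(np.allclose(fd, grad_F(x), atol=1e-6))  # True
```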


Furthermore, for a twice differentiable function g : R^n × R^m → R, we define

\[
\nabla^2_{xy} g(x, y) := \nabla_y \bigl(\nabla_x g(x, y)\bigr),
\]

where ∇_x g(x, y) is the gradient of g(·, y) at x and ∇_y(∇_x g(x, y)) is the Jacobian of ∇_x g(x, ·) at y.

Finally, in what follows, we list other notations that appear in the thesis.

• ⊥: perpendicularity

• I_n: the n-dimensional identity matrix

• argmin_{z∈D} h(z): the set of minimizers of h over D, for a nonempty set D ⊆ R^n and a function h : R^n → R

• int C: the interior of a set C ⊆ R^n

• bd C: the boundary of a set C ⊆ R^n

• P_C(z): the Euclidean projection of z onto a closed convex set C ⊆ R^n, i.e., P_C(z) := argmin_{w∈C} ‖z − w‖

• N_C(z): the normal cone [56] of a set C ⊆ R^n at z ∈ C

• M ⪰ (≻) N: M − N ∈ S^n_+ (S^n_{++}) for matrices M, N ∈ R^{n×n}

• diag(z): the n × n diagonal matrix with diagonal elements z_i (i = 1, 2, . . . , n) for z ∈ R^n

• B(z, δ): the closed ball with center z and radius δ, i.e., {w ∈ R^n | ‖w − z‖ ≤ δ}

2.2 Basic properties of convex functions

To begin with, we define the closedness, properness and convexity of functions.

Definition 2.2.1. For a given function f : R^n → (−∞, ∞], we denote the effective domain of f by

\[
\operatorname{dom} f := \{x \in \mathbb{R}^n \mid f(x) < \infty\}.
\]

Then, we say that

1. the function f is proper if dom f ≠ ∅,

2. the function f is closed if f is lower semicontinuous, and

3. the function f is convex if

\[
f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y), \quad \forall \theta \in [0, 1], \ \forall x, y \in \mathbb{R}^n.
\]

Secondly, the strict/strong convexity of proper functions is defined as follows:

Definition 2.2.2. For a given proper function f : R^n → (−∞, ∞], we say that

1. the function f is strictly convex if

\[
f(\theta x + (1 - \theta) y) < \theta f(x) + (1 - \theta) f(y), \quad \forall \theta \in (0, 1), \ \forall x, y \in \operatorname{dom} f \ \text{such that } x \ne y,
\]

2. the function f is strongly convex if there exists a constant σ > 0 such that

\[
f(\theta x + (1 - \theta) y) + \tfrac{1}{2}\sigma\theta(1 - \theta)\|x - y\|^2 \le \theta f(x) + (1 - \theta) f(y), \quad \forall \theta \in [0, 1], \ \forall x, y \in \operatorname{dom} f.
\]

Obviously, any strongly or strictly convex function is convex. However, the converse is not true in general; for example, the convex function f(x) = x is neither strictly nor strongly convex. The next proposition gives a necessary and sufficient condition for a differentiable function to be convex.

Proposition 2.2.3. For a given function f : R^n → (−∞, ∞], suppose that dom f is an open convex set and f is differentiable on dom f. Then, the function f is (strictly) convex if and only if

\[
f(y) - f(x) - \nabla f(x)^\top (y - x) \ \ge \ (>) \ 0
\]

holds for any x, y ∈ dom f such that x ≠ y.

For a given function f : R^n → (−∞, ∞] and α ∈ R, we denote a level set of f by L_f(α) := {x | f(x) ≤ α}. We then give a proposition about level sets, which will play a key role in the subsequent chapters.

Proposition 2.2.4. For a given convex function f : R^n → (−∞, ∞] and α ∈ R, the following properties hold:

1. L_f(α) is convex;

2. if f is closed, then L_f(α) is closed;

3. if f is strongly convex, then L_f(α) is bounded.

Furthermore, if the function f is closed, proper, and has at least one nonempty compact level set, then every nonempty level set of f is compact.

We next consider the convex optimization problem

\[
\min_{x \in D} f(x),
\tag{2.2.1}
\]

where f : R^n → (−∞, ∞] is a convex function and D ⊆ R^n is a nonempty closed convex set. Then, we have the following important properties concerning the existence and uniqueness of an optimal solution of (2.2.1):

Proposition 2.2.5.

1. Suppose that the function f is strictly convex on D and problem (2.2.1) has a nonempty optimal solution set. Then, the optimum of (2.2.1) is unique.

2. Suppose that the function f is strongly convex on D. Then, the optimum of (2.2.1) exists and is unique.


2.3 Basic properties of cones

In this section, we give some definitions and fundamental properties of cones.

Definition 2.3.1. A set C ⊆ R^n is called a cone if

\[
\alpha \ge 0, \ x \in C \ \Rightarrow \ \alpha x \in C.
\]

Definition 2.3.2. For a given cone C ⊆ R^n, we define the dual cone C^d by

\[
C^d := \{x \in \mathbb{R}^n \mid x^\top y \ge 0, \ \forall y \in C\}.
\]

In particular, C is called a self-dual cone if C = C^d holds.

Obviously, a Cartesian product of self-dual cones is also a self-dual cone. It is well known that R^m_+, K^m, and S^m_+ are typical examples of self-dual cones [15].

Convexity of a cone can be characterized by the following equivalence:

Proposition 2.3.3. Let C ⊆ R^n be a given cone. Then, C is convex if and only if

\[
x, y \in C \ \Rightarrow \ x + y \in C.
\]

Furthermore, any closed convex cone can be characterized as the intersection of finitely or infinitely many halfspaces.

Proposition 2.3.4 ([57, Corollary 11.7.1]). Let C ⊆ R^n be a given closed convex cone. For any s ∈ R^n with s ≠ 0, define H(s) := {y ∈ R^n | s^⊤ y ≥ 0} and S := {s ∈ R^n | ‖s‖ = 1, H(s) ⊇ C}. Then, we have

\[
C = \bigcap_{s \in S} H(s).
\]

The above proposition will be important when deriving the KKT conditions for SISOCP (1.1.3). We next consider the following conic optimization problem:

\[
\begin{array}{ll}
\displaystyle\mathop{\text{Minimize}}_{x \in \mathbb{R}^n} & f(x) \\
\text{subject to} & g_j(x) \in C_j \ (j = 1, 2, \dots, J),
\end{array}
\tag{2.3.1}
\]

where f : R^n → R and g_j : R^n → R^{m_j} (j = 1, 2, . . . , J) are continuously differentiable, and C_j ⊆ R^{m_j} (j = 1, 2, . . . , J) are closed convex cones with nonempty interiors. Let x* ∈ R^n be a local optimum of (2.3.1). Then, the KKT conditions hold at x* under the following Robinson's constraint qualification.

Definition 2.3.5 (Robinson's constraint qualification). Let x ∈ R^n be feasible for (2.3.1). Then, we say that Robinson's constraint qualification holds at x if there exists a vector d ∈ R^n such that

\[
g_j(x) + \nabla g_j(x)^\top d \in \operatorname{int} C_j \quad (j = 1, 2, \dots, J).
\]

For details about Robinson's constraint qualification, see [6].


Theorem 2.3.6. Let x* ∈ R^n be a local optimum of (2.3.1) at which Robinson's constraint qualification holds. Then, the KKT conditions hold at x*, i.e., there exist Lagrange multiplier vectors λ^j ∈ R^{m_j} (j = 1, 2, . . . , J) such that

\[
\nabla f(x^*) - \sum_{j=1}^{J} \nabla g_j(x^*) \lambda^j = 0, \qquad
C_j^d \ni \lambda^j \perp g_j(x^*) \in C_j \ (j = 1, 2, \dots, J),
\tag{2.3.2}
\]

where C_j^d denotes the dual cone of C_j for j = 1, 2, . . . , J.

In particular, if each C_j is the m_j-dimensional SOC, i.e., C_j := K^{m_j} in (2.3.1), the KKT conditions (2.3.2) can be rewritten as

\[
\nabla f(x^*) - \sum_{j=1}^{J} \nabla g_j(x^*) \lambda^j = 0, \qquad
\mathcal{K}^{m_j} \ni \lambda^j \perp g_j(x^*) \in \mathcal{K}^{m_j} \ (j = 1, 2, \dots, J),
\]

since (K^{m_j})^d = K^{m_j} for j = 1, 2, . . . , J by the self-duality of second-order cones.

2.4 Natural residual associated with SOC complementarity condition

Let K be a Cartesian product of second-order cones, i.e., K := K^{m_1} × K^{m_2} × · · · × K^{m_ℓ}. In designing algorithms for solving problems involving SOC constraints, we often reformulate the SOC complementarity condition K ∋ y ⊥ z ∈ K as a system of equations by means of the natural residual, defined precisely below. To this end, we first introduce the spectral factorization of a vector with respect to the SOC K^m.

Definition 2.4.1. For any vector z := (z_1, z_2) ∈ R × R^{m−1}, we define the spectral factorization of z with respect to K^m as

\[
z = \lambda_1 c^1 + \lambda_2 c^2,
\]

where λ_1 and λ_2 are the spectral values given by

\[
\lambda_j = z_1 + (-1)^j \|z_2\|, \quad j = 1, 2,
\]

and c^1 and c^2 are the spectral vectors given by

\[
c^j =
\begin{cases}
\dfrac{1}{2}\Bigl(1, \ (-1)^j \dfrac{z_2}{\|z_2\|}\Bigr) & \text{if } z_2 \ne 0, \\[2mm]
\dfrac{1}{2}\bigl(1, \ (-1)^j v\bigr) & \text{if } z_2 = 0,
\end{cases}
\qquad j = 1, 2,
\]

where v ∈ R^{m−1} is an arbitrary vector such that ‖v‖ = 1.

By using the spectral factorization, we can write the Euclidean projection onto K^m explicitly as follows [18]:

\[
P_{\mathcal{K}^m}(z) = (\lambda_1)_+ c^1 + (\lambda_2)_+ c^2,
\]

where λ_j and c^j (j = 1, 2) are the spectral values and the spectral vectors of z, respectively. Now, let us define the natural residual for the SOC complementarity condition by using the Euclidean projection.
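The following sketch implements this projection formula directly through the spectral factorization of Definition 2.4.1 (Python/NumPy, m ≥ 2 assumed; the function name proj_soc is our own):

```python
import numpy as np

def proj_soc(z):
    """Euclidean projection onto K^m (m >= 2) via the spectral
    factorization z = lam1*c1 + lam2*c2: P(z) = (lam1)_+ c1 + (lam2)_+ c2."""
    z = np.asarray(z, dtype=float)
    z1, z2 = z[0], z[1:]
    nrm = np.linalg.norm(z2)
    # any unit vector may serve as v when z2 = 0
    w = z2 / nrm if nrm > 0 else np.r_[1.0, np.zeros(z2.size - 1)]
    lam = np.array([z1 - nrm, z1 + nrm])                   # spectral values
    c = np.array([np.r_[1.0, -w], np.r_[1.0, w]]) / 2.0    # spectral vectors
    return np.maximum(lam, 0.0) @ c

print(proj_soc([ 1.0, 3.0, 4.0]))  # [3.  1.8 2.4]: projects onto the boundary
print(proj_soc([-6.0, 3.0, 4.0]))  # [0. 0. 0.]: both spectral values negative
```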


Definition 2.4.2. Let y := (y^1, y^2, . . . , y^ℓ) ∈ R^{m_1} × R^{m_2} × · · · × R^{m_ℓ} = R^m and z := (z^1, z^2, . . . , z^ℓ) ∈ R^m be arbitrary vectors. Then, the natural residual function Φ : R^m × R^m → R^m with respect to K = K^{m_1} × K^{m_2} × · · · × K^{m_ℓ} is defined as

\[
\Phi(y, z) := y - P_{\mathcal{K}}(y - z)
= \begin{pmatrix} \varphi_1(y^1, z^1) \\ \vdots \\ \varphi_\ell(y^\ell, z^\ell) \end{pmatrix}
\in \mathbb{R}^{m_1} \times \cdots \times \mathbb{R}^{m_\ell},
\tag{2.4.1}
\]

where

\[
\varphi_i(y^i, z^i) = y^i - P_{\mathcal{K}^{m_i}}(y^i - z^i), \quad i = 1, 2, \dots, \ell.
\]

Using the natural residual Φ, we can reformulate the SOC complementarity condition as an equivalent system of equations.

Proposition 2.4.3 ([18]). We have the following equivalences:

\[
\varphi_i(y^i, z^i) = 0 \iff \mathcal{K}^{m_i} \ni y^i \perp z^i \in \mathcal{K}^{m_i} \quad (i = 1, 2, \dots, \ell),
\]
\[
\Phi(y, z) = 0 \iff \mathcal{K} \ni y \perp z \in \mathcal{K}.
\]

By the above proposition, to find (y, z) ∈ R^m × R^m satisfying the SOC complementarity condition K ∋ y ⊥ z ∈ K, we have only to solve the equation Φ(y, z) = 0. In [26], the authors proved that the natural residual Φ possesses the Jacobian consistency property and semismoothness. By exploiting these properties, several efficient methods have been proposed, such as semismooth and smoothing Newton methods [49, 26].
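As a small check of Proposition 2.4.3, the sketch below evaluates the natural residual blockwise on a complementary pair (an illustrative Python/NumPy sketch; the closed-form projection used here is equivalent to the spectral formula above):

```python
import numpy as np

def proj_soc(z):
    # Closed form of P_{K^m}(z), equivalent to the spectral formula above.
    z = np.asarray(z, dtype=float)
    z1, n = z[0], np.linalg.norm(z[1:])
    if z1 >= n:
        return z.copy()
    if z1 <= -n:
        return np.zeros_like(z)
    return 0.5 * (1.0 + z1 / n) * np.r_[n, z[1:]]

def natural_residual(y, z, dims):
    """Phi(y, z) of (2.4.1) for K = K^{m_1} x ... x K^{m_l}, computed
    blockwise as phi_i(y^i, z^i) = y^i - P_{K^{m_i}}(y^i - z^i)."""
    out, k = [], 0
    for m in dims:
        out.append(y[k:k+m] - proj_soc(y[k:k+m] - z[k:k+m]))
        k += m
    return np.concatenate(out)

# A complementary pair on K^3: both on the boundary with y^T z = 0.
y = np.array([5.0, 3.0, 4.0])
z = np.array([5.0, -3.0, -4.0])
print(natural_residual(y, z, [3]))   # ~[0. 0. 0.], as Proposition 2.4.3 predicts
```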


Chapter 3

A regularized explicit exchange method for convex semi-infinite second-order cone programming problems

3.1 Introduction

In this chapter, we consider SISOCP (1.1.3) with a convex objective function and infinitely many affine SOC constraints, i.e.,

\[
\begin{array}{ll}
\displaystyle\mathop{\text{Minimize}}_{x \in \mathbb{R}^n} & f(x) \\
\text{subject to} & A_j(t)^\top x - b_j(t) \in \mathcal{K}^{m_j} \ \text{for all } t \in T_j \ (j = 1, 2, \dots, J),
\end{array}
\tag{3.1.1}
\]

where f : R^n → R is a continuously differentiable convex function, A_j : T_j → R^{n×m_j} and b_j : T_j → R^{m_j} are continuous functions for j = 1, 2, . . . , J, and T_j ⊆ R^{ℓ_j} (j = 1, 2, . . . , J) are nonempty compact index sets. Note that the feasible set of SISOCP (3.1.1) can be expressed as

\[
\bigcap_{1 \le j \le J} \ \bigcap_{t \in T_j} \{x \in \mathbb{R}^n \mid A_j(t)^\top x - b_j(t) \in \mathcal{K}^{m_j}\}.
\tag{3.1.2}
\]

Since the sets {x ∈ R^n | A_j(t)^⊤ x − b_j(t) ∈ K^{m_j}} (t ∈ T_j, j = 1, 2, . . . , J) are closed and convex by the closed convexity of K^{m_j}, the set (3.1.2) is also closed and convex. Therefore, SISOCP (3.1.1) is a convex program.

A lot of practical problems can be formulated as (3.1.1). For example, the Chebyshev approximation problem (1.1.4) is transformed into an SISOCP of the form (3.1.1) when the function F(·, y) is affine; moreover, SISOCP (1.1.7), obtained from the FIR filter problem (1.1.6), obviously takes the form (3.1.1). Hence, it is important to develop efficient algorithms for solving SISOCP (3.1.1).

In this chapter, we propose an algorithm for solving SISOCP (3.1.1) based on the exchange method. Many researchers [21, 27, 34, 73, 47, 79] have studied exchange-type algorithms for solving convex semi-infinite programs. The exchange method solves a relaxed semi-infinite program (SIP) with each T_j (j = 1, 2, . . . , J) replaced by a finite subset T_j^k ⊆ T_j, where T_j^k is updated so that T_j^{k+1} ⊆ T_j^k ∪ {t_j^1, t_j^2, . . . , t_j^r} with {t_j^1, t_j^2, . . . , t_j^r} ⊆ T_j \ T_j^k. As another scheme analogous to the exchange method, the discretization method [21, 28, 54, 64] should be mentioned here. It also solves a sequence of relaxed SIPs with T_j (j = 1, 2, . . . , J) replaced by finite subsets T_j^k ⊆ T_j, but T_j^k is updated so that T_j^{k+1} = T_j^k ∪ {t_j^1, t_j^2, . . . , t_j^r} with {t_j^1, t_j^2, . . . , t_j^r} ⊆ T_j \ T_j^k and the distance¹ from T_j^k to T_j converges to 0 as k goes to infinity. Although this method is comprehensible and easy to implement, its computational cost tends to be high, since the cardinality of T_j^k grows exponentially in the dimension of T_j. The exchange method can avoid this serious drawback by removing unnecessary indices that correspond to inactive constraints. We thus adopt not the discretization method but the exchange method for solving SISOCP (3.1.1); a schematic sketch of the generic exchange loop is given below.
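In schematic form, the exchange scheme described above can be organized as follows (a hedged Python skeleton of the generic loop only, not the concrete Algorithm 3.1/3.2 of this chapter; solve_relaxed and most_violated are placeholder callables the user must supply):

```python
def exchange_method(solve_relaxed, most_violated, T0, tol=1e-8, max_iter=100):
    """Schematic exchange loop for a semi-infinite program.

    solve_relaxed(Tk) -> (x, active): optimum of the relaxed problem with
                         index set Tk, plus the indices in Tk active at x
    most_violated(x)  -> (t, viol)  : an index maximizing the constraint
                         violation at x, and the violation amount
    """
    Tk = list(T0)
    x = None
    for _ in range(max_iter):
        x, active = solve_relaxed(Tk)
        t_new, viol = most_violated(x)
        if viol <= tol:                 # x is (tol-)feasible on all of T
            return x
        # drop indices of inactive constraints, add the violated index
        Tk = list(active) + [t_new]
    return x
```

Each call to solve_relaxed is a finitely constrained conic program (an SOCP in the setting of this chapter), for which standard solvers can be used.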

We note that the exchange-type algorithm proposed in this chapter needs to solve a sequence of SOCPs. To each such subproblem, we can apply an existing algorithm such as the interior-point method or the smoothing Newton method [1, 18, 26, 37].

In the subsequent sections, we propose exchange-type methods for solving SISOCP (3.1.1) and analyze the convergence properties of the generated iteration points. To carry out the convergence analysis in a more general framework, we consider the following problem:

\[
\begin{array}{ll}
\displaystyle\mathop{\text{Minimize}}_{x \in \mathbb{R}^n} & f(x) \\
\text{subject to} & g(x, t) \in C \ \text{for all } t \in T,
\end{array}
\tag{3.1.3}
\]

where f : R^n → R is a continuously differentiable function, g : R^n × T → R^m is a continuous function such that g(·, t) is differentiable for each fixed t, T ⊆ R^ℓ is a compact set, and C ⊆ R^m is a closed convex cone with nonempty interior. We call this problem the semi-infinite conic program, SICP for short. SICP (3.1.3) contains SISOCP (3.1.1) as a subclass. Indeed, by letting

\[
C := \mathcal{K}^{m_1} \times \mathcal{K}^{m_2} \times \cdots \times \mathcal{K}^{m_J}, \qquad
g(x, t) := \bigl(A_1(t_1)^\top x - b_1(t_1), \ A_2(t_2)^\top x - b_2(t_2), \ \dots, \ A_J(t_J)^\top x - b_J(t_J)\bigr),
\]
\[
t := (t_1, t_2, \dots, t_J), \quad t_j \in T_j \ (j = 1, 2, \dots, J), \qquad
T := T_1 \times T_2 \times \cdots \times T_J,
\]

SICP (3.1.3) reduces to SISOCP (3.1.1). Notice that another possible choice of C is the symmetric positive semidefinite cone S^m_+. Hence, SICP (3.1.3) also contains semi-infinite programs with infinitely many symmetric positive semidefinite cone constraints.

The main purpose of this chapter is two-fold. First, we study the Karush-Kuhn-Tucker (KKT) conditions for SICP (3.1.3). Although the original KKT conditions for the SICP could be described by means of integration and Borel measures, we show that they can be represented by a finite number of elements of T under the generalized Robinson constraint qualification. Second, we propose two algorithms for solving the convex SICP of the following form:

\[
\begin{array}{ll}
\displaystyle\mathop{\text{Minimize}}_{x \in \mathbb{R}^n} & f(x) \\
\text{subject to} & A(t)^\top x - b(t) \in C \ \text{for all } t \in T,
\end{array}
\tag{3.1.4}
\]

where f : R^n → R is a continuously differentiable convex function, A : T → R^{n×m} and b : T → R^m are continuous functions, and T and C are as in SICP (3.1.3). The proposed algorithms are based on the exchange method, which solves a sequence of subproblems with finitely many conic constraints. The first algorithm is an explicit exchange method, for which we show global convergence under the strict convexity of the objective function. The second algorithm is a regularized explicit exchange method. With the help of regularization, global convergence of the algorithm is established without the strict convexity assumption.

¹For two sets X ⊆ Y, the distance from X to Y is defined as dist(X, Y) := sup_{y∈Y} inf_{x∈X} ‖x − y‖.


where f : Rn → R is a continuously differentiable convex function, A : T → Rn×m and b : T →Rm are continuous functions, and T and C are as in SICP (3.1.3). The proposed algorithmsare based on the exchange method, which solves a sequence of subproblems with finitely manyconic constraints. The first algorithm is an explicit exchange method, for which we show globalconvergence under the strict convexity of the objective function. The second algorithm is aregularized explicit exchange method. With the help of regularization, global convergence of thealgorithm is established without the strict convexity assumption.

This chapter is organized as follows. In Section 3.2, we discuss the KKT conditions for SICP (3.1.3). In Section 3.3, we propose the explicit exchange method for solving SICP (3.1.4). In Section 3.4, we combine the explicit exchange method with the regularization method and show that the hybrid algorithm is globally convergent for SICP (3.1.4). In Section 3.5, we give some numerical results to examine the efficiency of the proposed algorithms. In Section 3.6, we conclude the chapter with some remarks.

3.2 KKT conditions for SICP

In this section, we derive the Karush-Kuhn-Tucker (KKT) conditions for SICP (3.1.4). The result is not only interesting in itself, but also provides an important key to analyzing the global convergence of the algorithm proposed in Section 3.3.

When m = 1 and C = R_+, SICP (3.1.3) reduces to the classical semi-infinite program, and the KKT conditions are given as follows.

Lemma 3.2.1 ([29, Theorem 3.3]). Let x* ∈ R^n be a local optimum of SICP (3.1.3) with C := R_+. Let T_act(x) be the set of active indices at x ∈ R^n, i.e., T_act(x) := {t ∈ T | g(x, t) = 0}. Suppose that the Mangasarian-Fromovitz constraint qualification (MFCQ) holds at x*, i.e., there exists a vector d ∈ R^n such that ∇_x g(x*, t)^⊤ d > 0 for any t ∈ T_act(x*). Then, there exist p indices t_1, t_2, . . . , t_p ∈ T_act(x*) and Lagrange multipliers μ_1, μ_2, . . . , μ_p ≥ 0 such that p ≤ n and

\[
\nabla f(x^*) - \sum_{i=1}^{p} \mu_i \nabla_x g(x^*, t_i) = 0,
\]
\[
\mathbb{R}_+ \ni \mu_i \perp g(x^*, t_i) \in \mathbb{R}_+ \quad (i = 1, 2, \dots, p).
\]

In the above lemma, the MFCQ plays a key role. However, for SICP (3.1.3), it is difficult to apply the MFCQ in a straightforward manner. We therefore introduce the generalized Robinson constraint qualification (GRCQ), defined as follows.

Definition 3.2.2 (Generalized Robinson constraint qualification (GRCQ)). Let x ∈ R^n be a feasible point of SICP (3.1.3). Then, we say that the GRCQ holds at x if there exists a vector d ∈ R^n such that

\[
g(x, t) + \nabla_x g(x, t)^\top d \in \operatorname{int} C \quad \text{for all } t \in T.
\tag{3.2.1}
\]

When m = 1 and C = R_+, the GRCQ reduces to the MFCQ. When g is affine, i.e., g(x, t) := A(t)^⊤ x − b(t), the GRCQ holds at every feasible point if and only if the Slater constraint qualification holds, i.e., there exists x⁰ ∈ R^n such that A(t)^⊤ x⁰ − b(t) ∈ int C for all t ∈ T. Furthermore, if T consists of finitely many elements, the GRCQ reduces to Robinson's constraint qualification given in Definition 2.3.5. The next proposition states that any closed convex cone is represented as the intersection of finitely or infinitely many halfspaces generated by a certain compact set.

Proposition 3.2.3. Let C ⊊ R^m be a nonempty closed convex cone. Then, (i) there exists a nonempty compact set S ⊆ {s ∈ R^m | ‖s‖ = 1} such that

\[
C = \{y \in \mathbb{R}^m \mid s^\top y \ge 0, \ \forall s \in S\}.
\tag{3.2.2}
\]

Moreover, we have (ii) S ⊆ C^d, and (iii) int C ⊆ {y ∈ R^m | s^⊤ y > 0, ∀s ∈ S}.

Proof. We first show (i). For any s ∈ R^m with s ≠ 0, define the halfspace H(s) := {y ∈ R^m | s^⊤ y ≥ 0}, and let S := {s ∈ R^m | ‖s‖ = 1, H(s) ⊇ C}. By Proposition 2.3.4, we have C = ∩_{s∈S} H(s). Therefore, it suffices to show the compactness of S. Since the boundedness is evident, we only show the closedness. Choose an arbitrary convergent sequence {s^k} ⊆ S with lim_{k→∞} s^k = s*, and let z ∈ C be an arbitrary vector. Obviously, we have ‖s^k‖ = 1. Moreover, from C = ∩_{s∈S} H(s) ⊆ H(s^k), we have (s^k)^⊤ z ≥ 0 for all k. Therefore, letting k → ∞, we obtain ‖s*‖ = 1 and (s*)^⊤ z ≥ 0, which implies z ∈ H(s*). Since z ∈ C was arbitrary, we have C ⊆ H(s*), and hence s* ∈ S.

Second, we show (ii). Choose s ∈ S arbitrarily. From (3.2.2), we have s^⊤ y ≥ 0 for all y ∈ C, which implies s ∈ C^d.

Finally, we show (iii). Choose z ∈ int C arbitrarily. From the compactness of S, there exists s̄ ∈ argmin_{s∈S} z^⊤ s. To show (iii), we only have to prove z^⊤ s̄ > 0. For contradiction, suppose that z^⊤ s̄ ≤ 0, which together with s̄ ∈ S ⊆ C^d implies z^⊤ s̄ = 0. Since z ∈ int C, we have z − δs̄ ∈ C for sufficiently small δ > 0. Then, by using z^⊤ s̄ = 0, z − δs̄ ∈ C, and s̄ ∈ C^d, we have 0 ≤ (z − δs̄)^⊤ s̄ = −δ‖s̄‖², which yields s̄ = 0. This contradicts the fact that s̄ ∈ S.

By using this proposition, we reformulate SICP (3.1.3) as a standard semi-infinite program, whereby we can derive the KKT conditions.

Theorem 3.2.4. Let x* ∈ R^n be a local optimum of SICP (3.1.3). Suppose that the GRCQ holds at x*. Then, there exist p indices t_1, t_2, . . . , t_p ∈ T and Lagrange multipliers y^1, y^2, . . . , y^p ∈ R^m such that p ≤ n and

\[
\nabla f(x^*) - \sum_{i=1}^{p} \nabla_x g(x^*, t_i) y^i = 0,
\tag{3.2.3}
\]
\[
C^d \ni y^i \perp g(x^*, t_i) \in C \quad (i = 1, 2, \dots, p).
\tag{3.2.4}
\]

Proof. By Proposition 3.2.3, there exists a nonempty compact set S ⊆ {s ∈ R^m | ‖s‖ = 1} such that SICP (3.1.3) is equivalent to the following semi-infinite program:

\[
\begin{array}{ll}
\text{Minimize} & f(x) \\
\text{subject to} & s^\top g(x, t) \ge 0 \ \text{for all } (s, t) \in S \times T.
\end{array}
\tag{3.2.5}
\]

Let (S × T)_act(x*) := {(s, t) ∈ S × T | s^⊤ g(x*, t) = 0}. If (S × T)_act(x*) = ∅, then (3.2.3) and (3.2.4) hold with y^i = 0 for all i. Next, suppose (S × T)_act(x*) ≠ ∅. We first show that the MFCQ holds for problem (3.2.5), i.e., there exists a vector d ∈ R^n such that

\[
\bigl(\nabla_x g(x^*, t) s\bigr)^\top d > 0 \quad \text{for all } (s, t) \in (S \times T)_{\mathrm{act}}(x^*).
\tag{3.2.6}
\]

By assumption, there exists a vector d ∈ R^n satisfying GRCQ (3.2.1), i.e., g(x*, t) + ∇_x g(x*, t)^⊤ d ∈ int C for all t ∈ T. By Proposition 3.2.3, we also have 0 ∉ S ⊆ C^d. Hence, from Proposition 3.2.3 (iii), we have s^⊤ (g(x*, t) + ∇_x g(x*, t)^⊤ d) > 0 for all (s, t) ∈ S × T, which implies that d satisfies (3.2.6). Now, applying Lemma 3.2.1 to problem (3.2.5), we obtain p indices (s^1, t_1), (s^2, t_2), . . . , (s^p, t_p) ∈ (S × T)_act(x*) and Lagrange multipliers μ_1, μ_2, . . . , μ_p ≥ 0 such that p ≤ n and

\[
\nabla f(x^*) - \sum_{i=1}^{p} \mu_i \nabla_x g(x^*, t_i) s^i = 0,
\tag{3.2.7}
\]
\[
\mathbb{R}_+ \ni \mu_i \perp (s^i)^\top g(x^*, t_i) \in \mathbb{R}_+ \quad (i = 1, 2, \dots, p).
\tag{3.2.8}
\]

By letting y^i := μ_i s^i for each i, we have from (3.2.8) that 0 = μ_i (s^i)^⊤ g(x*, t_i) = (y^i)^⊤ g(x*, t_i). We also have y^i ∈ C^d, since s^i ∈ S ⊆ C^d from Proposition 3.2.3 and μ_i ≥ 0. In addition, we have g(x*, t_i) ∈ C since x* is feasible for SICP (3.1.3). Thus, (3.2.7) and (3.2.8) yield (3.2.3) and (3.2.4), respectively. This completes the proof.

Before closing this section, we give two theorems. The first states that the KKT conditions are also sufficient for global optimality when the problem is the convex SICP (3.1.4).

Theorem 3.2.5. Let x* ∈ R^n be feasible for the convex SICP (3.1.4). If there exist p indices t_1, t_2, . . . , t_p ∈ T and p Lagrange multiplier vectors y^1, y^2, . . . , y^p ∈ R^m such that

\[
\nabla f(x^*) - \sum_{i=1}^{p} A(t_i) y^i = 0,
\tag{3.2.9}
\]
\[
C^d \ni y^i \perp A(t_i)^\top x^* - b(t_i) \in C \quad (i = 1, 2, \dots, p),
\tag{3.2.10}
\]

then x* is a global optimum of SICP (3.1.4).

Proof. Let ℓ : R^n → R be defined by ℓ(x) := f(x) − Σ_{i=1}^{p} (y^i)^⊤ (A(t_i)^⊤ x − b(t_i)), and let x ∈ R^n be an arbitrary feasible point of SICP (3.1.4). Since ℓ is convex and ∇ℓ(x*) = ∇f(x*) − Σ_{i=1}^{p} A(t_i) y^i = 0 by (3.2.9), x* is a global minimum of ℓ, i.e., ℓ(x) − ℓ(x*) ≥ 0. Hence, we have f(x) − f(x*) = ℓ(x) − ℓ(x*) + Σ_{i=1}^{p} (y^i)^⊤ (A(t_i)^⊤ x − b(t_i)) ≥ 0, where the equality follows from the definition of ℓ and (3.2.10), and the inequality follows from ℓ(x) − ℓ(x*) ≥ 0, y^i ∈ C^d, and A(t_i)^⊤ x − b(t_i) ∈ C (i = 1, 2, . . . , p). We thus conclude that x* is a global optimum of SICP (3.1.4).

Next, we enhance Theorem 3.2.4 so that it can handle the case where C has a Cartesian structure, i.e., C = C_1 × · · · × C_h ⊆ R^m = R^{m_1} × · · · × R^{m_h}. Consider the following problem:

\[
\begin{array}{ll}
\text{Minimize} & f(x) \\
\text{subject to} & g_j(x, t_j) \in C_j \ \text{for all } t_j \in T_j, \ j = 1, 2, \dots, h,
\end{array}
\tag{3.2.11}
\]

where g_j : R^n × T_j → R^{m_j} is continuous, g_j(·, t_j) is differentiable for each fixed t_j, T_j ⊆ R^{ℓ_j} is a nonempty compact set, and C_j ⊆ R^{m_j} is a closed convex cone with nonempty interior for each j. Then, the following theorem holds.

Theorem 3.2.6. Let x* ∈ R^n be a local optimum of SICP (3.2.11). Assume that the GRCQ holds at x*, i.e., there exists a vector d ∈ R^n such that

\[
g_j(x^*, t_j) + \nabla_x g_j(x^*, t_j)^\top d \in \operatorname{int} C_j \quad \text{for all } t_j \in T_j, \ j = 1, 2, \dots, h.
\tag{3.2.12}
\]

Then, there exist p indices² j_1, j_2, . . . , j_p ∈ {1, 2, . . . , h} and pairs (t_i^{j_i}, y_i^{j_i}) ∈ T_{j_i} × R^{m_{j_i}} for i = 1, 2, . . . , p such that p ≤ n and

\[
\nabla f(x^*) - \sum_{i=1}^{p} \nabla_x g_{j_i}(x^*, t_i^{j_i}) y_i^{j_i} = 0,
\tag{3.2.13}
\]
\[
(C_{j_i})^d \ni y_i^{j_i} \perp g_{j_i}(x^*, t_i^{j_i}) \in C_{j_i} \quad (i = 1, 2, \dots, p).
\tag{3.2.14}
\]

Proof. For each j = 1, 2, . . . , h, let tj ∈ R`j \ Tj be an arbitrary point and Tj be defined asTj := {t1} × · · · × {tj−1} × Tj × {tj+1} × · · · × {th} ⊆ R`1+`2+···+`h . Then we can easily see thatTj ∩ Tj′ = ∅ for any j 6= j′. Let

t := (t1, t2, . . . , th) ∈ R`1+`2+···+`h , T :=h∪

j=1

Tj ⊆ R`1+`2+···+`h , (3.2.15)

and g : Rn × T → Rm1+m2+···+mh be defined by

g(x, t) := (g1(x, t), . . . , gh(x, t)), (3.2.16)

where

gj(x, t) :=

gj(x, tj) (t ∈ T j)

ζj (t /∈ T j)(3.2.17)

and ζj ∈ intCj is an arbitrary vector. Then, the function g is continuous on Rn × T and g(·, t)is differentiable for each t ∈ T . In particular, we have

∇xgj(x, t) :=

∇xgj(x, tj) (t ∈ Tj)

0 (t /∈ Tj).(3.2.18)

Then, T is nonempty and compact, and SICP (3.2.11) is equivalent to SICP (3.1.3) with C =C1 × · · · × Ch and g defined by (3.2.16). By letting d ∈ Rn satisfy (3.2.12), we have

gj(x∗, t) + ∇xgj(x∗, t)>d =

gj(x∗, tj) + ∇xgj(x∗, tj)>d ∈ intCj (t ∈ Tj)

ζj ∈ intCj (t /∈ Tj)

for each j = 1, 2, . . . , h, where the first case follows from (3.2.12) and the second one followsfrom (3.2.17), (3.2.18) and ζj ∈ intCj . Therefore, we have g(x∗, t) + ∇g(x∗, t)>d ∈ intC for all

2Repeated choice of the same index is allowed in the set {j1, j2, . . . , jp}.

Page 35: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

3.3 Explicit exchange method for SICP 35

t ∈ T , which implies that the GRCQ holds at x∗ for SICP (3.1.3). Hence, by Theorem 3.2.4,there exist p ≤ n, t1, t2, . . . , tp ∈ T and y1, y2, . . . , yp ∈ Rm such that

∇f(x∗) −p∑

i=1

∇xg(x∗, ti)yi = 0, (3.2.19)

Cd 3 yi ⊥ g(x∗, ti) ∈ C (i = 1, 2, . . . , p). (3.2.20)

Let ti := (t1i , t2i , . . . , t

hi ) ∈ R`1+`2+···+`h and yi := (y1

i , y2i , . . . , y

hi ) ∈ Rm1+m2+···+mh for i =

1, 2, . . . , p. From (3.2.15), for each i, there exists ji ∈ {1, 2, . . . , h} such that ti ∈ Tji , i.e.,tjii ∈ Tji . Then, we have

p∑i=1

∇xg(x∗, ti)yi =p∑

i=1

(∇xg1(x∗, ti),∇xg2(x∗, ti), . . . ,∇xgh(x∗, ti)

)y1

i...yh

i

=

p∑i=1

∇xgji(x∗, tji

i )yjii ,

where the second equality follows from (3.2.17) and (3.2.18), which together with (3.2.19) implies(3.2.13). In the last, we show (3.2.14). From (3.2.20) and Cd = (C1)d × (C2)d × · · · × (Ch)d, wehave (Cj)d 3 yj

i ⊥ gj(x∗, ti) ∈ Cj for j = 1, 2, . . . , h, which together with gji(x∗, ti) = gji(x

∗, tjii )

from (3.2.17) implies (3.2.14) for i = 1, 2, . . . , p. The proof is complete.

3.3 Explicit exchange method for SICP

In this section, we propose an explicit exchange method for solving the convex SICP (3.1.4) andshow its global convergence under the assumption that f is strictly convex.

3.3.1 Algorithm

The algorithm proposed in this section requires solving conic programs with finitely many con-straints as subproblems. Let CP(T ′) be the relaxed problem of SICP (3.1.4) with T replaced bya finite subset T ′ := {t1, t2, . . . , tp} ⊆ T . Then, CP(T ′) can be formulated as follows:

CP(T ′)Minimize f(x)

subject to A(ti)>x− b(ti) ∈ C (i = 1, 2, . . . , p).

Note that an optimum x∗ of CP(T ′) satisfies the following KKT conditions:

∇f(x∗) −p∑

i=1

A(ti)yti = 0,

Cd 3 yti ⊥ A(ti)>x∗ − b(ti) ∈ C (i = 1, 2, . . . , p),

where yti is the Lagrange multiplier vector corresponding to the constraint A(ti)>x∗− b(ti) ∈ C

for each i.Now, we propose the following algorithm.

Page 36: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

36 Chapter 3 A regularized explicit exchange method for convex SISOCPs

Algorithm 3.1 (Explicit exchange method)

Step 0. Let {γk} ⊆ R++ be a positive sequence such that limk→∞ γk = 0. Choose a finitesubset T 0 := {t01, . . . , t0`} ⊆ T for some integer3 ` ≥ 0, and a vector e ∈ intC. Set k := 0.

Step 1. Obtain xk+1 and T k+1 by the following steps.

Step 1-0 Set r := 0, E0 := T k, and solve CP(E0) to obtain an optimum v0.

Step 1-1 Find a trnew ∈ T such that

A(trnew)>vr − b(trnew) /∈ −γke+ C. (3.3.1)

If such a trnew does not exist, i.e.,

A(t)>vr − b(t) ∈ −γke+ C (3.3.2)

for any t ∈ T , then set xk+1 := vr, T k+1 := Er, and go to Step 2. Otherwise, let

Er+1 := Er ∪ {trnew},

and go to Step 1-2.

Step 1-2 Solve CP(Er+1) to obtain an optimum vr+1 and the Lagrange multipliers yr+1t

for t ∈ Er+1.

Step 1-3 Let Er+1 := {t ∈ Er+1 | yr+1

t 6= 0}. Set r := r + 1 and return to Step 1-1.

Step 2. If γk is sufficiently small, then terminate. Otherwise, set k := k + 1 and return toStep 1.

Here, γk > 0 plays the role of a relaxation parameter for the feasible set of SICP (3.1.4). LetX(γ) := {x ∈ Rn | A(t)>x−b(t) ∈ −γe+C, ∀t ∈ T}. Then, X(0) coincides with the feasible setof SICP (3.1.4), and X(γ) expands as γ increases. Note that, by the termination criterion (3.3.2)for the inner loop, we have xk+1 ∈ X(γk) for each k. Hence, we can expect that the distancebetween xk and the feasible set of SICP (3.1.4) tends to 0 as k goes to infinity. Moreover, aswill be shown in the next subsection, the positivity of γk guarantees the inner loop of Step 1 toterminate in a finite number of iterations for each k.

When C is a symmetric cone such as an SOC or a semi-definite cone, a natural choice forthe vector e ∈ intC is the identity element with respect to Euclidean Jordan algebra [15].4

Moreover, in Step 1-2, we can employ an existing method such as the primal-dual interior pointmethod, the regularized smoothing method, and so on [1, 18, 26, 33, 37].

Let us denote the optimal values of CP(T ′) and SICP (3.1.4) by V (T ′) and V (T ), respectively.Since Er+1 is obtained by removing the constraints with zero Lagrange multipliers from E

r+1,and the feasible region of CP(Er) is larger than that of CP(Er+1), we have

V (E0) ≤ V (E1) = V (E1) ≤ · · · ≤ V (Er) ≤ V (Er+1) = V (Er+1) ≤ · · · ≤ V (T ) <∞. (3.3.3)3We allow ` = 0, which means T 0 = ∅.4For example, if C is R+, Km and Sm

+ , then the identity element is 1, (1, 0, . . . , 0)> ∈ Rm, and the m × m

identity matrix, respectively.

Page 37: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

3.3 Explicit exchange method for SICP 37

In the subsequent convergence analysis, we omit the termination condition in Step 2, so thatthe algorithm may generate an infinite sequence {xk}.

Remark 3.3.1. Note that the optimal solution set of CP(Er) contains that of CP(Er) by theconstruction of Er in Step 1-3. Therefore, for each k ≥ 1, we may simply set v0 := xk inStep 1-0 without solving CP(E0) since CP(E0) is identical to CP(Er

∗) and xk solves CP(Er∗),

where Er∗ and E

r∗ are the finite index sets obtained at the end of Step 1 in the previous outer

iteration.

3.3.2 Global convergence under strict convexity assumption

In the previous subsection, we have proposed the explicit exchange method for solving SICP(3.1.4). In this subsection, we show that the algorithm generates a sequence converging to theoptimal solution under the following assumption.

Assumption 3.3.2. i) Function f is strictly convex over the feasible region of SICP (3.1.4).ii) In Step 1-2 of Algorithm 3.1, CP(Er+1) is solvable for each r. iii) A generated sequence {vr}in every Step 1 of Algorithm 3.1 is bounded.

Notice that all statements i)–iii) automatically hold when f is strongly convex. Under As-sumption A, we first show that the inner iterations within Step 1 do not repeat infinitely, whichensures that Algorithm 3.1 is well-defined. To prove this, we provide the following propositionstating that the distance between vr+1 and vr does not tend to zero during the inner iterationsin Step 1.

Proposition 3.3.3. Suppose that Assumption 3.3.2 holds. Then, there exists a positive numberN > 0 such that

‖vr+1 − vr‖ ≥ Nγk

for any r ≥ 0 and k ≥ 0.

Proof. Denote z(v, t) := A(t)>v − b(t) for simplicity. Due to the continuity of the matrix norm‖A(t)‖ := max‖w‖=1 ‖A(t)>w‖ and the compactness of T , there exists a sufficiently large M > 0such that ‖A(t)‖ ≤M for any t ∈ T . Hence, we have

‖z(vr+1, t) − z(vr, t)‖ = ‖A(t)>(vr+1 − vr)‖ ≤M‖vr+1 − vr‖ (3.3.4)

for any t ∈ T .We next show that ‖z(vr+1, trnew) − z(vr, trnew)‖ is bounded below by some positive number

for any r ≥ 0. Since e ∈ intC, there exists a δ > 0 such that e+B(0, δ) ⊆ C. We therefore have

z(vr+1, trnew) +B(0, δγk) = −γke+ z(vr+1, trnew) + γk (e+B(0, δ))

⊆ −γke+ C, (3.3.5)

where the inclusion holds since e + B(0, δ) ⊆ C, γk > 0, z(vr+1, trnew) ∈ C, and C is a convexcone5. From (3.3.1), we have z(vr, trnew) /∈ −γke+ C, which together with (3.3.5) implies that

‖z(vr+1, trnew) − z(vr, trnew)‖ ≥ δγk. (3.3.6)5When C is a convex cone, αx + βy ∈ C holds for any x, y ∈ C and α, β ≥ 0.

Page 38: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

38 Chapter 3 A regularized explicit exchange method for convex SISOCPs

Combining (3.3.4) and (3.3.6) with N := δ/M , we obtain

‖vr+1 − vr‖ ≥ δγk/M = Nγk.

Theorem 3.3.4. Suppose that Assumption 3.3.2 holds. Then, the inner iterations in Step 1 ofAlgorithm 3.1 terminate finitely for each k.

Proof. Suppose, for contradiction, that the inner iterations in Step 1 do not terminate finitelyat some outer iteration k. (In what follows, k is fixed.) Then, by Assumption 3.3.2 iii), thereexist accumulation points v∗ and v∗∗ of {vr} such that vrj → v∗ and vrj+1 → v∗∗ as j → ∞.Moreover, we must have v∗ 6= v∗∗ from Proposition 3.3.3. Denote zr

t := A(t)>vr − b(t) forsimplicity. Since vr solves CP(Er), it satisfies the following KKT conditions:

∇f(vr) −∑t∈E

r

A(t)yrt = 0, (3.3.7)

Cd 3 yrt ⊥ zr

t ∈ C (t ∈ Er), (3.3.8)

where yrt are the Lagrange multipliers. From (3.3.3), we have f(v1) ≤ f(v2) ≤ · · · ≤ V (T ) <

+∞, which implies

limr→∞

(f(vr+1) − f(vr)

)= 0. (3.3.9)

Let Fr := f(vr+1) − f(vr) −∇f(vr)>(vr+1 − vr). Then, we have

f(vr+1) − f(vr) = Fr + ∇f(vr)>(vr+1 − vr)

= Fr +( ∑

t∈Er

A(t)yrt

)>(vr+1 − vr) (3.3.10)

= Fr +∑t∈E

r

(yrt )

>zr+1t −

∑t∈E

r

(yrt )

>zrt (3.3.11)

= Fr +∑t∈E

r

(yrt )

>zr+1t , (3.3.12)

where (3.3.10) and (3.3.12) follow from (3.3.7) and (3.3.8), respectively, and (3.3.11) followsfrom zr

t = A(t)>vr − b(t) and zr+1t = A(t)>vr+1 − b(t). Since f is convex, we have Fr ≥ 0. In

addition, since yrt ∈ Cd and zr+1

t ∈ C, we have∑

t∈Er(yr

t )>zr+1

t ≥ 0 . Therefore, from (3.3.9)and (3.3.12), we have

0 = limr→∞

Fr = limj→∞

Frj = f(v∗∗) − f(v∗) −∇f(v∗)>(v∗∗ − v∗). (3.3.13)

However, this contradicts the strict convexity of f since v∗ 6= v∗∗. Hence, the inner iterations inStep 1 must terminate finitely.

The next theorem shows global convergence of Algorithm 3.1 under the strict convexityassumption.

Page 39: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

3.4 Regularized explicit exchange method for SICP 39

Theorem 3.3.5. Suppose that SICP (3.1.4) has a solution and Assumption 3.3.2 holds. Let x∗

be the optimum, and {xk} be the sequence generated by Algorithm 3.1. Then, we have

limk→∞

xk = x∗.

Proof. We first show that {xk} is bounded. Let X(γ) := {x ∈ Rn | A(t)>x−b(t)+γe ∈ C, ∀t ∈T} and L := {x ∈ Rn | f(x) ≤ f(x∗)}. Since xk ∈ L ∩X(γk) ⊆ L ∩X(γ) with γ := maxk≥0 γk,it suffices to show that L ∩X(γ) is bounded for any γ > 0. By Proposition 3.2.3, there exists acompact set S ⊆ Rm such that 0 /∈ S ⊆ Cd and

X(γ) = {x ∈ Rn | s>(A(t)>x− b(t) + γe

)≥ 0 , ∀(s, t) ∈ S × T}

= {x ∈ Rn | (e>s)−1(s>b(t) − (A(t)s)>x

)≤ γ , ∀(s, t) ∈ S × T}

={x ∈ Rn | h(x) := max

(s,t)∈S×T(e>s)−1

(s>b(t) − (A(t)s)>x

)≤ γ

},

where the second equality is valid since e ∈ intC and 0 6= s ∈ S ⊆ Cd entail mins∈S e>s > 0 from

Proposition 3.2.3 ( iii ). Notice that h(x) < ∞ from the compactness of S × T and continuity ofA(·) and b(·). Therefore, function h is closed, proper and convex. Now, let h : Rn → (−∞,+∞]be defined as

h(x) :=

h(x) (x ∈ L)

∞ (x /∈ L).

Then h is also closed, proper and convex since L is convex. Notice that

L ∩X(γ) = {x ∈ Rn | h(x) ≤ γ},

i.e., L ∩X(γ) is a level set of h. From Proposition 2.2.4, if a closed proper convex function hasat least one compact level set, then any nonempty level set is also compact. Moreover, we haveL ∩X(0) = {x∗} since f is strictly convex. Therefore, L ∩X(γ) is compact for any γ ≥ 0.

We next show that limk→∞ xk = x∗. Let x be an arbitrary accumulation point of {xk}.Then, there exists a subsequence {xkj} ⊆ {xk} such that limj→∞ xkj = x. For all j, we haveA(t)>xkj − b(t)+γkj

e ∈ C (∀t ∈ T ) and f(xkj ) ≤ f(x∗). Hence, by letting j tend to ∞, we have

A(t)>x− b(t) ∈ C (∀t ∈ T ), (3.3.14)

f(x) ≤ f(x∗) (3.3.15)

from the continuity of f and the closedness of C. From (3.3.14), we have f(x) ≥ f(x∗), whichtogether with (3.3.15) implies f(x) = f(x∗). Therefore, x solves SICP (3.1.4). Since f is strictlyconvex, we must have x = x∗. We thus have limk→∞ xk = x∗.

3.4 Regularized explicit exchange method for SICP

In the previous section, we have proposed the explicit exchange method for SICP (3.1.4) andanalyzed its convergence property. However, to ensure global convergence, we had to assumethe strict convexity of the objective function (Assumption 3.3.2). In this section, we proposea new method combining the regularization technique with the explicit exchange method, andestablish global convergence without assuming the strict convexity.

Page 40: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

40 Chapter 3 A regularized explicit exchange method for convex SISOCPs

3.4.1 Algorithm

Let f : Rn → R be a convex function. Then, the function fε : Rn → R defined by fε(x) :=12ε‖x‖

2 + f(x) is strongly convex for any ε > 0. So, if we apply Algorithm 3.1 to the followingregularized SICP:

RSICP(ε)Minimize fε(x)

subject to A(t)>x− b(t) ∈ C for all t ∈ T,

then Step 1 always terminates in a finite number of (inner) iterations and the sequence generatedby Algorithm 3.1 converges to the unique solution x∗ε of RSICP(ε).

By introducing a positive sequence {εk} converging to 0, we can expect that x∗εkconverges to

the solution of the original SICP (3.1.4) as k goes infinity. However, since it is computationallyprohibitive to solve RSICP(εk) exactly for every k, we solve RSICP(εk) only approximately byusing the explicit exchange method. In the inner iteration of the latter method, we repeatedlysolve problems of the form:

CP(εk, T ′)Minimize fεk

(x)

subject to A(ti)>x− b(ti) ∈ C (i = 1, 2, . . . , p),

where T ′ := {t1, t2, . . . , tp} ⊆ T . The detailed steps of the regularized explicit exchange methodare described as follows.

Algorithm 3.2 (Regularized Explicit Exchange Method)

Step 0. Choose positive sequences {γk} ⊆ R++ and {εk} ⊆ R++ such that limk→∞ γk =limk→∞ εk = 0. Choose a finite subset T 0 := {t01, . . . , t0l } ⊆ T for some integer ` ≥ 0 anda vector e ∈ intC. Set k := 0.

Step 1. Obtain xk+1 and T k+1 by the following procedure.

Step 1-0 Set r := 0 and E0 := T k. Solve CP(εk, E0) and let v0 be an optimum.

Step 1-1 Find trnew ∈ T such that

A(trnew)>vr − b(trnew) /∈ −γke+ C. (3.4.1)

If such a trnew does not exist, i.e.,

A(t)>vr − b(t) ∈ −γke+ C (3.4.2)

for any t ∈ T , then set xk+1 := vr and T k+1 := Er, and go to Step 2. Otherwise, let

Er+1 := Er ∪ {trnew},

and go to Step 1-2.

Step 1-2 Solve CP(εk,Er+1) to obtain an optimum vr+1 and the Lagrange multipliers

yr+1t for t ∈ E

r+1.

Page 41: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

3.4 Regularized explicit exchange method for SICP 41

Step 1-3 Let Er+1 := {t ∈ Er+1 | yr+1

t 6= 0}. Set r := r + 1 and return to Step 1-1.

Step 2. If γk and εk are sufficiently small, then terminate. Otherwise, set k := k+1 and returnto Step 1.

Algorithm 3.2 differs from Algorithm 3.1 only in the choice of {εk} in Step 0 and the sub-problems CP(εk, E

r+1) and CP(εk, E0) solved in Step 1. But, we give a full description ofAlgorithm 3.2 for completeness.

In the subsequent convergence analysis, we omit the termination condition in Step 2, so thatthe algorithm may generate an infinite sequence {xk}. Moreover, to ensure convergence, thesequences of {εk} and {γk} are required to satisfy the condition γk = O(εk).

3.4.2 Global convergence without strict convexity assumption

In this section, we show global convergence of Algorithm 3.2 for SICP (3.1.4) without the strictconvexity assumption. Indeed, we only need the following assumption for the proof of conver-gence.

Assumption 3.4.1. Function f is convex. Moreover, the Slater constraint qualification (SCQ)holds for SICP (3.1.4), i.e., there exists an x0 ∈ Rn such that A(t)>x0 − b(t) ∈ intC for allt ∈ T .

Notice that, for SICP (3.1.4), the SCQ holds if and only if any feasible point satisfies theGRCQ as studied in Section 3.2.2. We first show that Step 1 terminates finitely.

Proposition 3.4.2. Suppose that Assumption 3.2 holds. Then, the inner iterations in Step 1terminate finitely.

Proof. By Theorem 3.3.4, it suffices to show that conditions i)-iii) in Assumption 3.3.2 hold whenStep 1 of Algorithm 3.1 is applied to RSICP(ε) for any ε > 0. Since conditions i) and ii) holdfrom the strong convexity of fε, we only show condition iii). Let x∗ε be an optimum of RSICP(ε)and L∗

ε := {x ∈ Rn | fε(x) ≤ fε(x∗ε)}. Then, L∗ε is compact since fε is strongly convex. Moreover,

we have vr ∈ L∗ε, i.e., fε(vr) ≤ fε(x∗ε) for all r since Er ⊆ T . Hence, {vr} is bounded.

Now, we show that, under Assumption 3.4.1, the generated sequence {xk} is bounded, andAlgorithm 3.2 is globally convergent in the sense that the distance from xk to the solution setof SICP (3.1.4) tends to 0. In the proof, the KKT conditions established in Theorem 3.2.4 playsa critical role.

Theorem 3.4.3. Suppose that Assumption 3.4.1 holds. Let {xk} be the sequence generated byAlgorithm 3.2. Then, the following statements hold.

i) If {εk} and {γk} are chosen to satisfy γk = O(εk), then {xk} is bounded.

ii) Any accumulation point of {xk} solves SICP (3.1.4).

Page 42: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

42 Chapter 3 A regularized explicit exchange method for convex SISOCPs

Proof. i) Let x∗ ∈ Rn be an optimum of SICP (3.1.4). Since Assumption 3.4.1 holds, The-orem 3.2.4 can be applied to SICP (3.1.4) to ensure that there exist t1, t2, . . . , tp ∈ T andy1, y2, . . . , yp ∈ Rm such that p ≤ n and

∇f(x∗) −p∑

i=1

A(ti)yi = 0, (3.4.3)

Cd 3 yi ⊥ A(ti)>x∗ − b(ti) ∈ C (i = 1, 2, . . . , p). (3.4.4)

Let {xk} be the sequence generated by Algorithm 3.2. Since xk solves CP(εk−1, Tk) and x∗ is

feasible to CP(εk−1, Tk), we have

12εk−1‖xk‖2 + f(xk) ≤ 1

2εk−1‖x∗‖2 + f(x∗). (3.4.5)

Multiplying both sides of (3.4.5) by 2/εk−1, we have

‖xk‖2 ≤ ‖x∗‖2 − 2εk−1

(f(xk) − f(x∗))

≤ ‖x∗‖2 − 2εk−1

∇f(x∗)>(xk − x∗)

= ‖x∗‖2 − 2εk−1

(p∑

i=1

A(ti)yi

)>

(xk − x∗), (3.4.6)

where the second inequality holds since f is convex, and the equality follows from (3.4.3).Moreover, the last term of (3.4.6) satisfies the following inequalities:

−( p∑

i=1

A(ti)yi

)>(xk − x∗)

= −p∑

i=1

(yi)>(A(ti)>xk − b(ti) + γk−1e) +p∑

i=1

(yi)>(γk−1e) +p∑

i=1

(yi)>(A(ti)>x∗ − b(ti))

≤p∑

i=1

(yi)>(γk−1e)

≤ pµ‖e‖γk−1, (3.4.7)

where µ := max{‖y1‖, ‖y2‖, . . . , ‖yp‖}, and the first inequality since (3.4.2) and (3.4.4) implyyi ∈ Cd, A(ti)>xk − b(ti) + γk−1e ∈ C and (yi)>(A(ti)>x∗ − b(ti)) = 0. Then, by substituting(3.4.7) into (3.4.6), we have

‖xk‖2 ≤ ‖x∗‖2 + 2pµ‖e‖γk−1/εk−1. (3.4.8)

Since γk−1 = O(εk−1), {γk−1/εk−1} is bounded, and hence {xk} is also bounded.ii) Let x be an accumulation point of {xk}. Then, taking a subsequence if necessary, we have

xk → x, εk−1 → 0, γk−1 → 0 (k → ∞).

First, we show that x is feasible to SICP (3.1.4). Since xk is determined as vr satisfying(3.4.2) with γk replaced by γk−1, A(t)>xk − b(t) + γk−1e ∈ C holds for any t ∈ T . Noticing that

Page 43: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

3.5 Numerical experiments 43

C is closed, we have limk→∞A(t)>xk − b(t) + γk−1e = A(t)>x− b(t) ∈ C for any t ∈ T . Hence,x is feasible to SICP (3.1.4).

We next show that x is optimal to SICP (3.1.4). Let x∗ be an arbitrary optimum of SICP(3.1.4). Since x is feasible to SICP (3.1.4), we have f(x) ≥ f(x∗). On the other hand, x∗ is feasi-ble to CP(εk−1, Ek) since the feasible region of SICP (3.1.4) is contained in that of CP(εk−1, Ek).Hence, we have

12εk−1‖xk‖2 + f(xk) ≤ 1

2εk−1‖x∗‖2 + f(x∗). (3.4.9)

Due to the continuity of f , by letting k → ∞ in (3.4.9), we have f(x) ≤ f(x∗). Therefore, weobtain f(x∗) = f(x), which implies that x solves SICP (3.1.4).

From the above theorem, we can see that if we choose {εk} and {γk} such that γk = O(εk),then the generated sequence {xk} has an accumulation point and it solves SICP (3.1.4). More-over, we can show that, if {εk} and {γk} satisfy γk = o(εk), then {xk} is actually convergentand its limit point is the least 2-norm solution.

Theorem 3.4.4. Suppose that Assumption 3.4.1 holds. Let {εk} and {γk} be chosen such thatγk = o(εk), and {xk} be a sequence generated by Algorithm 3.2. Let S∗ ⊆ Rn denote thenonempty solution set of SICP (3.1.4) and x∗ ∈ Rn be the least 2-norm solution, i.e., x∗min :=argminx∈S∗‖x‖. Then, we have limk→∞ xk = x∗min.

Proof. By Theorem 3.4.3, {xk} is bounded and every accumulation point belongs to S∗. More-over, x∗min can be identified uniquely since S∗ is closed and convex. Therefore, it suffices to showthat ‖x‖ = ‖x∗min‖ for any accumulation point x of {xk}. Note that the inequality (3.4.8) in theproof of Theorem 3.4.3. ii) holds for any x∗ ∈ S, in particular, for x∗min. Since γk = o(εk), byletting k → ∞ in (3.4.8), we obtain ‖x‖ ≤ ‖x∗min‖. On the other hand, we have ‖x‖ ≥ ‖x∗min‖,since x ∈ S∗ and x∗min = argminx∈S∗‖x‖. We thus have ‖x‖ = ‖x∗min‖.

3.5 Numerical experiments

In this section, we report some numerical results. The program is coded in Matlab 2008a andrun on a machine with an IntelrCore2 Duo E6850 3.00GHz CPU and 4GB RAM. In thisexperiment, we consider SISOCP (3.1.1) with a linear objective function and infinitely manysecond-order cone constraints with respect to a single second-order cone. Actual implementationof Algorithm 3.2 is carried out as follows. In Step 0, we set e := (1, 0, . . . , 0)> ∈ intKm. InStep 1-1, to find trnew satisfying (3.4.1), we first choose N grid points t1, t2, . . . , tN from the indexset T and compute λ

(A(t)>vr − b(t) + γke

)for t = t1, t2, . . . , tN ∈ T , where λ(·) denotes the

spectral value of z ∈ Rm [18, 26] defined by

λ(z) := z1 −√z22 + z2

3 + · · · + z2m.

If we find a t ∈ {t1, t2, . . . , tN} such that λ(A(t)>vr − b(t) + γke

)< 0, then we set trnew := t.

Otherwise, we solve

Minimize λ(A(t)>vr − b(t) + γke)

subject to t ∈ T,(3.5.1)

Page 44: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

44 Chapter 3 A regularized explicit exchange method for convex SISOCPs

and check the nonnegativity of its optimal value.6 To solve (3.5.1), we apply Newton’s methodcombined with the bisection method when T is one-dimensional, and fmincon solver in MatlabOptimization Toolbox when T is multi-dimensional. For both methods, we set the initial pointt0 ∈ T as t0 := argmin{λ

(A(t)>vr − b(t) + γke

)| t = t1, t2, . . . , tN}. Although there is no

theoretical guarantee, in practice, we can expect to find a global optimum of (3.5.1) by taking asufficiently large N . In Step 1-2, we solve CP(ε, T ′) by the smoothing method [18, 26]. In Step 1-3, we regard yr

t as 0 if ‖yrt ‖ ≤ 10−12. In Step 2, we terminate the algorithm if max(εk, γk) ≤ 10−5.

In each experiment reported below, we choose the grid points t1, t2, . . . , tN ∈ T as in Table 3.1.

Experiment T N {t1, t2, . . . , tN}1, 2, 3-1 [−1, 1] 101 {−1 + 1

50p}p=0,1,...,100

3-2 [0, 1] × [0, 1] 2601(= 512){

150(p1, p2)

}p1,p2=0,1,2,...,50

Table 3.1: Choice of grid points in each experiment

Experiment 1

In the first experiment, we solve the following SICP:

Minimize c>x

subject to A(t)>x− b(t) ∈ Km for all t ∈ [−1, 1],(3.5.2)

where Km := {(x1, x2, . . . , xm)> ∈ Rm | x1 ≥ ‖(x2, x3, . . . , xm)>‖}, c ∈ Rn, A(t) := (Aij(t)) ∈Rn×m with Aij(t) := αij0 + αij1t + αij2t

2 + αij3t3 (i = 1, 2, . . . , n, j = 1, 2, . . . ,m) and b(t) :=

(bj(t)) ∈ Rm with b1(t) := −∑m

j=2

∑3`=0 |β`

j | and bj(t) := βj0+βj1t+βj2t2+βj3t

3 (j = 2, . . . ,m).We choose αijk, βj` (i = 1, 2, . . . , n, j = 2, . . . ,m, k = 0, 1, 2, 3, ` = 0, 1, 2, 3) and all componentsof c randomly from [−1, 1]. Note that by the choice of b1(t), feasibility of (3.5.2) is ensured.7 Inthis way, we generate two sets of data A(t), b(t) and c for each of the three pairs (m,n) = (25, 15),(15, 15) and (10, 15), thereby obtaining six problems referred to as Problems 1, 2, . . . , 6.

In this experiment, using parameters {εk} and {γk} such that εk = 0.5k, γk = 0.3k, and theinitial index set T 0 := {−1, 0, 1} in Step 0, we observe the convergence behavior of the algorithm.The results are shown in Table 3.2, where

6Notice that λ`

A(t)>x − b(t) + γe´

≥ 0 if and only if A(t)>x − b(t) ∈ −γe + Km.7Note that the origin always lies in the interior of the feasible region, since we have −b(t) ∈ intKm from

−b1(t) − ‖(−b2(t), . . . ,−bm(t))>‖ > 0 for all t ∈ [−1, 1].

Page 45: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

3.5 Numerical experiments 45

iteout : the number of outer iterations,

{rk} : the values of r when the inner termination criterion (3.4.2) is satisfied at the k-th

outer iteration for k = 0, 1, . . . , iteout − 1,

rsum : the sum of rk’s for all k = 0, 1, 2, . . . , iteout − 1,

tsocp : the number of times the sub-SOCPs (CP(εk, E0) and CP(εk, Er+1)) are solved,

Tfin : the index set T k upon termination of the algorithm,

time(sec) : the CPU time in seconds.

In the column of rk, pq means that we have rk = p in q consecutive iterations. For example,010, 2, 14 means that rk = 0 (k = 0, 1, . . . , 9), r10 = 2 and rk = 4 (k = 11, 12, 13, 14). Notice thatwe always have tsocp = iteout + rsum, since we solve sub-SOCPs once at Step 1-0 and rk timesat Step 1-3, for each k. Although Tfin usually represents an approximate active index set at theoptimum, the real active index set is {−1, 1} for Problems 2 and 3. This is because the innertermination criterion (3.4.2) is always satisfied with r = 0 and therefore the inactive index t = 0is never removed at Step 1-3. From the columns of rk, we can see that rk is sometimes largewhen k ≤ 4, but it is always 0 or 1 for k = 7, 8, . . . , 17. This fact suggests that Tfin is usuallyobtained in the early stage of iterations.

Problem (m,n) iteout {rk} rsum tsocp Tfin time(sec)1 (25, 15) 18 06, 4, 08, 1, 02 5 23 {−1,−0.296, 1} 5.572 (25, 15) 18 018 0 18 {−1, 0, 1} 2.413 (15, 15) 18 018 0 18 {−1, 0, 1} 1.844 (15, 15) 18 03, 11, 02, 1, 011 14 32 {−1,−0.2,−0.18, 1} 12.495 (10, 15) 18 02, 13, 0, 3, 0, 3, 0, 1, 09 20 38 {−1,−0.48,−0.46, 1} 3.836 (10, 15) 18 02, 7, 4, 6, 22, 05, 1, 03, 1, 0 23 41 {−1,−0.387, 0.25, 1} 12.91

Table 3.2: Convergence behavior for Experiment 1

Experiment 2

In the second experiment, we implement the non-regularized exchange method (Algorithm 3.1)as well as the regularized exchange method (Algorithm 3.2), and compare their performance.In Step 1 of Algorithm 3.1, for k ≥ 1 we set v0 := xk instead of solving CP(E0), as suggestedin Remark 3.3.1 in Section 3.3. For both methods, the initial index set T 0 is set to be T 0

a :={−1,−0.5, 0, 0.5, 1}, T 0

b := {−1, 0, 1}, or T 0c = {−0.5, 0, 0.5}. The parameters are chosen as

γk = 0.5k for Algorithm 3.1, and εk = γk = 0.5k for Algorithm 3.2. Both methods are appliedto the same problems as those used in Experiment 1.

Table 3.3 shows the obtained results, where tasocp, tbsocp and tcsocp denote the values of tsocp for

the initial index sets T 0a , T

0b and T 0

c , respectively, and “F” means that we fail to solve a problem.From the table, we can observe that tsocp for the non-regularized method is much less than tsocp

for the regularized method. This is due to the fact that the regularized exchange method has

Page 46: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

46 Chapter 3 A regularized explicit exchange method for convex SISOCPs

to solve the sub-SOCP (CP(εk, E0)) at least once in every outer iteration, whereas the non-regularized exchange method does not need to solve it when the inner termination criterion(3.3.2) is satisfied for r = 0. However, convergence of the non-regularized exchange method isnot guaranteed theoretically since the objective function is linear. Indeed, the non-regularizedexchange method fails to solve Problems 4, 5 and 6 with T 0 = T 0

c since their objective functionsare most probably unbounded on the feasible sets.8 On the other hand, the regularized exchangemethod succeeds in solving all problems for any choice of T 0. This is the main advantage of theregularized exchange method.

regularized non-regularizedProblem (m,n) tasocp tbsocp tcsocp tasocp tbsocp tcsocp

1 (25, 15) 23 23 34 5 5 322 (25, 15) 18 18 20 1 1 113 (15, 15) 18 18 25 1 1 154 (15, 15) 27 28 44 4 4 F5 (10, 15) 19 24 29 4 5 F6 (10, 15) 28 30 46 8 8 F

Table 3.3: Comparison of regularized and non-regularized exchange methods

Experiment 3

In the third experiment, we apply Algorithm 3.2 to Chebyshev-like approximation problems forvector-valued functions.

Experiment 3-1. We first consider the vector-valued approximation problem with respect toH : R → R3 and h : R8 × R → R3 defined by

H(t) :=

et2

2tet2

(4t2 + 2)et2

, h(u, t) :=

∑8

ν=1uνtν−1∑8

ν=2(ν − 1)uνtν−2∑8

ν=3(ν − 1)(ν − 2)uνtν−3

.

In order to find a u ∈ R8 such that h(u, t) ≈ H(t) over t ∈ [−1, 1], we solve the followingproblem:

Minimizeu∈R8

maxt∈[−1,1]

‖H(t) − h(u, t)‖ . (3.5.3)

8In fact, for each CP(T 0c ) of Problems 4, 5 and 6, we found a feasible point whose objective function value is

less than −107.

Page 47: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

3.5 Numerical experiments 47

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 10

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0.16

t

‖H

(t)-

-h(u

∗,t)‖

Figure 3.1: The graph of ‖H(t) − h(u∗, t)‖ (t ∈ [−1, 1]) in Experiment 3-1

Introducing an auxiliary variable v ∈ R, we can reformulate (3.5.3) as the following SICP withinfinitely many four-dimensional second-order cone constraints:

Minimize(v,u>)>∈R×R8

v

subject to

1 0 0 0 · · · 00 1 t t2 · · · t7

0 0 1 2t · · · 7t6

0 0 0 2 · · · 42t5

(v

u

)−

0et

2

2tet2

(4t2 + 2)et2

∈ K4

for all t ∈ [−1, 1].

(3.5.4)

In applying Algorithm 3.2, we set T0 := {−1, 1} and εk = γk := 0.5k. Then, the algorithm out-puts the solution v∗ = 0.1415, u∗ = (0.9948, 0.0000, 1.0707, 0.0000, 0.3083, 0.0000, 0.3442, 0.0000)>

together with Tfin = {−1.00,−0.88,−0.52, 0, 0.52, 0.88, 1.00}. Notice that we have u∗2 = u∗4 =u∗6 = u∗8 = 0. This is reasonable since H1(t) and H3(t) are even functions, whereas H3(t) isan odd function. Figure 3.1 shows the graph of ‖H(t) − h(u∗, t)‖ over t ∈ [−1, 1]. From thegraph, we can observe that the values of ‖H(t) − h(u∗, t)‖ is bounded above by v∗ = 0.1415,and the bound is attained at multiple points in [−1, 1]. Actually, those points coincide withTfin = {−1.00,−0.88,−0.52, 0, 0.52, 0.88, 1.00}, which correspond to the active constraints atthe optimum.

Experiment 3-2. We next consider a vector-valued approximation problem where T is two-dimensional. Let H : R2 → R3 and h : R8 × R2 → R3 be defined by

H(t1, t2) :=

log(t1 + t2 + 1) sin t1sin t1/(t1 + t2 + 1) + log(t1 + t2 + 1) cos t1

sin t1/(t1 + t2 + 1)

Page 48: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

48 Chapter 3 A regularized explicit exchange method for convex SISOCPs

v∗, (u∗)> 0.9730, (−0.1189, 0.2040, 0.2867,−1.0159, 0.9723, 0.1877,−0.3704, 0.1687)

Tfin {(1, 0), (0, 1), (1, 1), (0.64, 0.60), (0.46, 1), (0.60, 0.68), (0.58, 0.74), (0.62, 0.64)}{rk} 02, 2, 02, 3, 1, 4, 0, 1, 5, 3, 12, 6, 1, 2, 02

rsum 40

time(sec) 15.40

Table 3.4: Results for Experiment 3-2

and

h(u, t1, t2) :=

∑8

ν=1 uνtν−11 t8−ν

2∑8ν=2 uν(ν − 1)tν−2

1 t8−ν2∑7

ν=2 uν(8 − ν)tν−11 t7−ν

2

,

respectively. In order to find a vector u := (u1, u2, . . . , u8)> ∈ R8 such that h(u, t1, t2) ≈ H(t1, t2)for (t1, t2) ∈ [0, 1] × [0, 1], we solve the following problem:

Minimizeu∈R8

max(t1,t2)∈[0,1]×[0,1]

∥∥∥H(t1, t2) − h(u, t1, t2)∥∥∥ . (3.5.5)

Introducing an auxiliary variable v ∈ R, we can reformulate (3.5.5) as the following SICP withinfinitely many four-dimensional second-order cone constraints and two-dimensional index set:

Minimize(v,u>)>∈R×R8

v

subject to

v∑8

ν=1 uνtν−11 t8−ν

2∑8ν=2 uν(ν − 1)tν−2

1 t8−ν2∑7

ν=2 uν(8 − ν)tν−11 t7−ν

2

0

log(t1 + t2 + 1) sin t1sin t1/(t1 + t2 + 1) + log(t1 + t2 + 1) cos t1

sin t1/(t1 + t2 + 1)

∈ K4

for all (t1, t2) ∈ [0, 1] × [0, 1].

(3.5.6)

In applying Algorithm 3.2, we set T0 := {(0, 0), (0, 1), (1, 0), (1, 1)} and εk = γk := 0.5k. Theresults are shown in Table 3.4, where (v∗, (u∗)>)> is the computed optimal solution. From thetable, we can observe that Algorithm 3.2 obtained the solution within acceptable time (15.40seconds). Moreover, the values of |T0|, |Tfin| and rsum indicate that 36(= |T0| + rsum − |Tfin|)indices are discarded in Step 1-3 in total. Thus, the exchange-scheme in Step 1 worked efficientlyto prevent the size of problems CP(εk, E

r+1) from growing excessively.

3.6 Concluding remarks

For the semi-infinite program with an infinitely many conic constraints (SICP), we have shownthat the KKT conditions can be represented with finitely many conic constraints, as long as thegeneralized Robinson constraint qualification holds. Furthermore, for solving the SICP with aconvex objective function and affine conic constrains, we have proposed the explicit exchangemethod and the regularized explicit exchange method, and established their global convergence.

Page 49: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

3.6 Concluding remarks 49

Finally, we have conducted numerical experiments with the proposed algorithms to examinetheir effectiveness.

Page 50: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS
Page 51: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

Chapter 4

A local reduction based SQP-type

method for semi-infinite

second-order cone programming

problems

4.1 Introduction

In this chapter, we focus on the following semi-infinite program with an infinite number ofsecond-order cone constraints (SISOCP):

Minimizex∈Rn

f(x)

subject to g(x, t) ∈ Km for all t ∈ T,(4.1.1)

where f : Rn → R and g : Rn × R` → Rm are twice continuously differentiable functions, and Tis a nonempty compact index set given by

T := {t ∈ R` | hi(t) ≥ 0, i = 1, 2, . . . , p},

where hi : R` → R are twice continuously differentiable functions for i = 1, 2, . . . , p.We consider the problem (4.1.1) that contains a single SOC with m ≥ 2 for simplicity of

expression, although we can deal with the more general SISOCP that contains multiple SOCsas well as equality constraints, i.e.,

Minimizex∈Rn

f(x)

subject to g0(x) = 0,gs(x, t) ∈ Kms for all t ∈ T s (s = 1, 2, . . . , S),

(4.1.2)

where g0 : Rn → Rm0 and gs : Rn × R`s → Rms (s = 1, 2, . . . , S) are twice continuouslydifferentiable functions, and T s ⊆ R`s (s = 1, 2, . . . , S) are nonempty compact index sets givenby T s := {t ∈ R`s | hs

i (t) ≥ 0, i = 1, 2, . . . , ps} with twice continuously differentiable functions

Page 52: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

52 Chapter 4 A local reduction based SQP-type method for SISOCPs

hsi : R`s → R (i = 1, 2, . . . , ps). It is possible to extend the subsequent analysis for (4.1.1) to

the general SISOCP (4.1.2) in a direct manner. In fact, we will show some numerical results forSISOCPs that contain multiple SOCs; see Experiment 3 in Section 4.5.

In the previous chapter, we proposed exchange-type methods to solve semi-infinite programswith infinitely many affine conic constraints. Although an exchange-type algorithm is effective tofind an approximate solution, it is not very suitable to obtain an accurate solution. On the otherhand, a local reduction-type method is known to have an advantage in computing an accuratesolution with fast convergence speed [24, 66, 50, 51]. In this chapter, for solving SISOCP (4.1.1),we propose a local reduction based method combined with a sequential quadratic programming(SQP) method, where, in each iteration, we replace the SISOCP with an SOCP by means of thelocal reduction method and then generate a search direction by solving a quadratic SOCP thatapproximates the obtained SOCP.

This chapter is organized as follows. In Section 4.2, we study the local reduction methodfor SISOCP (4.1.1). We define some concepts and give important propositions to represent theSISOCP locally as an SOCP by using implicit functions. In Section 4.3, we propose an SQP-typemethod combined with the local reduction method for solving SISOCP (4.1.1). In Section 4.4,we analyze the global and local convergence properties of the proposed algorithm. In Section 4.5,we observe the effectiveness of the algorithm by some numerical experiments. In Section 4.6, weconclude the chapter with some remarks.

4.2 Local reduction of SISOCP to SOCP

In this section, we study the local reduction method for SISOCP (4.1.1). In relation to theconstraints in SISOCP (4.1.1), we first consider the problem

P (x) :Minimize

t∈R`λ(x, t) := g1(x, t) − ‖g(x, t)‖

subject to t ∈ T = {t ∈ R` | hi(t) ≥ 0, i = 1, 2, . . . , p},(4.2.1)

where g1(x, t) is the first component of g(x, t) ∈ Rm and g(x, t) is the vector consisting ofthe remaining m − 1 components of g(x, t). We call problem (4.2.1) the lower-level problem ofSISOCP (4.1.1) and let

ϕ(x) := maxt∈T

(−λ(x, t)) . (4.2.2)

Obviously, the infinitely many SOC constraints g(x, t) ∈ Km (t ∈ T ) are equivalent to thecondition ϕ(x) ≤ 0. Hence, SISOCP (4.1.1) can be rewritten equivalently as

Minimizex∈Rn

f(x) subject to ϕ(x) ≤ 0. (4.2.3)

Though problem (4.2.3) has only one constraint, treating ϕ(x) ≤ 0 directly is difficult since it isnot differentiable everywhere. As a remedy, we take the local reduction method. In this method,at any x ∈ Rn, we find an open neighborhood U(x) ⊆ Rn of x and continuously differentiablefunctions tj : U(x) → T (j = 1, 2, . . . , r(x)) such that

ϕ(x) = max1≤j≤r(x)

(−λ(x, tj(x)))

Page 53: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

4.2 Local reduction of SISOCP to SOCP 53

holds for all x ∈ U(x), where each tj(x) represents a local maximum of −λ(x, t) on T andr(x) is a positive integer. This means that ϕ(x) ≥ 0 may be reduced to the finitely manySOC constraints g(x, tj(x)) ∈ Km (j = 1, 2, . . . , r) in the set U(x), i.e., problem (4.2.3) can betransformed locally to

Minimizex∈Uε(x)

f(x)

subject to g(x, tj(x)) ∈ Km (j = 1, 2, . . . , r(x)).(4.2.4)

Then, we can expect that existing methods such as an SQP-type method [33] work efficientlyfor solving the reduced SOCP (4.2.4).

To give more formal treatment of the local reduction method, let l : Rn × R` × Rp → Rdenote the Lagrangian of the lower-level problem P (x), i.e.,

l(x, t, α) := λ(x, t) − h(t)>α,

where h(t) := (h1(t), h2(t), . . . , hp(t))>, and α := (α1, α2, . . . , αp)> ∈ Rp is a Lagrange multipliervector corresponding to the constraints hi(t) ≥ 0 (i = 1, 2, . . . , p). Let Ia(t) denote the activeindex set at t ∈ R`, i.e.,

Ia(t) := {i ∈ {1, 2, . . . , p} | hi(t) = 0}. (4.2.5)

Recall that if t ∈ T is a local optimum of P (x) such that the linear independence constraintqualification holds, i.e., {∇hi(t)}i∈Ia(t) are linearly independent, then there exists a uniqueLagrange multiplier vector α := (α1, α2, . . . , αp)> ∈ Rp such that

∇tl(x, t, α) = 0, 0 ≤ α ⊥ h(t) ≥ 0. (4.2.6)

Below, we define the nondegeneracy of local optima of P (x).

Definition 4.2.1 (Nondegeneracy). Let x ∈ Rn be an arbitrary vector. Let t ∈ T be a localoptimum of P (x) such that the linear independence constraint qualification holds, and α :=(α1, α2, . . . , αp)> ∈ Rp be a Lagrange multiplier vector satisfying (4.2.6) with x := x, t := t, andα := α. Let the function λ(·, ·) be twice continuously differentiable at (x, t). We say that t ∈ T

is nondegenerate if

(a) the second-order sufficient condition

v>∇2ttl(x, t, α)v > 0 for all v ∈ C(t) \ {0}

with

C(t) :=

{v ∈ R` | v>∇hi(t) = 0, i ∈ Ia(t)} (Ia(t) 6= ∅),

R` (Ia(t) = ∅)

holds, and

(b) the strict complementarityαi > 0 for all i ∈ Ia(t)

holds.

Page 54: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

54 Chapter 4 A local reduction based SQP-type method for SISOCPs

Under the nondegeneracy assumption, we have the following proposition.

Proposition 4.2.2. Let x ∈ Rn and x ∈ Rn. Assume that t ∈ T is a nondegenerate local opti-mum of P (x) and α := (α1, α2, . . . , αp)> ∈ Rp

+ is a Lagrange multiplier vector corresponding tothe constraints hi(t) ≥ 0 (i = 1, 2, . . . , p). Furthermore, suppose that λ(·, ·) is twice continuouslydifferentiable at (x, t). Then, there exist an open neighborhood U(x) of x and twice continuouslydifferentiable functions t(·) : U(x) → T and αi(·) : U(x) → R+ (i = 1, 2, . . . , p) such that

(a) t(x) = t, αi(x) = αi (i ∈ Ia(t)) and αi(x) = 0 (i /∈ Ia(t)),

(b) t(x) is a nondegenerate local optimum of P (x) for each x ∈ U(x) with a unique Lagrangemultiplier vector (α1(x), α2(x), . . . , αp(x))> ∈ Rp

+,

(c) ∇t(x) ∈ Rn×` and ∇αi(x) ∈ Rn (i ∈ Ia(t)) comprise a unique solution of the linear system(∇2

ttl(x, t, α) ∇ha(t)∇ha(t)> 0

)(∇t(x)>

∇αa(x)>

)= −

(∇2

txλ(x, t)0

), (4.2.7)

where

∇αa(x) := (∇αi(x))i∈Ia(t) ∈ Rn×|Ia(t)|, ∇ha(t) := (∇hi(t))i∈Ia(t) ∈ R`×|Ia(t)|;

in particular, if Ia(t) = ∅, then ∇t(x) ∈ Rn×` is a unique solution of the linear system

∇2ttl(x, t, α)∇t(x)> = −∇2

txλ(x, t), (4.2.8)

(d) for any x ∈ U(x), letting v(x) := λ(x, t(x)), we have

∇v(x) = ∇xλ(x, t),

∇2v(x) = ∇2xxλ(x, t) −∇t(x)∇2

ttl(x, t, α)∇t(x)>.

Proof. Apply the implicit function theorem to the following equations:

∇tl(x, t, α) = 0, hi(t) = 0 (i ∈ Ia(t)),

which come from the Karush-Kuhn-Tucker (KKT) conditions of P (x). See also [29, 24].

Next, letTloc(x) := {t ∈ T | t is a local optimum of P (x)}

andTε(x) := Tloc(x) ∩ {t ∈ T | λ(x, t) ≤ min

t∈Tλ(x, t) + ε}

for a given constant ε > 0. Now, we show that the infinitely many SOC constraints g(x, t) ∈Km (t ∈ T ) can locally be represented as finitely many SOC constraints under some assumptionsincluding the following ε-regularity condition.

Definition 4.2.3 (ε-regularity). Let x ∈ Rn and ε > 0 be given. We say that x is ε-regular ifany t ∈ Tε(x) is nondegenerate and |Tε(x)| <∞.

Page 55: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

4.3 Local reduction based SQP-type algorithm for the SISOCP 55

Proposition 4.2.4. Let x ∈ Rn and ε > 0 be given. Suppose that x is ε-regular and letTε(x) := {t1, t2, . . . , trε(x)}. Furthermore, suppose that the function λ(·, ·) is twice continuouslydifferentiable at (x, tj) for all j = 1, 2, . . . , rε(x) and that x ∈ Rn is regular. Then, there existan open neighborhood Uε(x) ⊆ Rn of x and functions t1(·), t2(·), · · · , trε(x)(·) : Uε(x) → T suchthat, for each j = 1, 2, . . . , rε(x),

(a) tj(·) is twice continuously differentiable,

(b) tj(x) = tj, and

(c) ϕ(x) = maxj=1,2,··· ,rε(x)(−λ(x, tj(x))) for all x ∈ Uε(x), where ϕ(·) is defined by (4.2.2).

Moreover, SISOCP (4.1.1) can locally be reduced to the following SOCP:

Minimizex∈Uε(x)

f(x)

subject to g(x, tj(x)) ∈ Km, (j = 1, 2, . . . , rε(x)).

Proof. We omit the proof since it can easily be derived from Proposition 4.2.2 and the implicitfunction theorem.

4.3 Local reduction based SQP-type algorithm for the SISOCP

From Theorem 3.2.4, the Karush-Kuhn-Tucker (KKT) conditions for SISOCP (4.1.1) are rep-resented as follows: Let x∗ be a local optimum of SISOCP (4.1.1). Then, under suitableconstraint qualification, there exist q indices t∗1, t

∗2, . . . , t

∗q ∈ Tε(x∗) and Lagrange multipliers

η∗1, η∗2, . . . , η

∗q ∈ Rm such that q ≤ n and

∇f(x∗) −q∑

j=1

∇xg(x∗, t∗j )η∗j = 0, (4.3.1)

Km 3 η∗j ⊥ g(x∗, t∗j ) ∈ Km (j = 1, 2, . . . , q). (4.3.2)

In this section, we propose an algorithm for finding a vector x∗ ∈ Rn that satisfies the aboveKKT conditions. In the algorithm, we combine the local reduction method with the sequentialquadratic programming (SQP) method. Let ε > 0 be given and let xk ∈ Rn be a current iterate.Assume that xk satisfies the ε-regularity defined in Definition 4.2.3. Then, from Proposition 4.2.4,there exist some open neighborhood Uε(xk) ⊆ Rn of xk and twice continuously differentiablefunctions tkj : Uε(xk) → T (j = 1, 2, . . . , rε(xk)) such that SISOCP (4.1.1) can locally be reducedto the following SOCP:

SOCP(xk, ε) :Minimizex∈Uε(xk)

f(x)

subject to Gkj (x) := g(x, tkj (x)) ∈ Km (j = 1, 2, . . . , rε(xk)).

We then generate a search direction dk ∈ Rn by solving the following Quadratic SOCP (QSOCP),which consists of quadratic and linear approximations of the objective function and constraint

Page 56: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

56 Chapter 4 A local reduction based SQP-type method for SISOCPs

functions of SOCP(xk, ε), respectively:

QSOCP(xk, ε) :Minimize

d∈Rn∇f(xk)>d+ 1

2d>Bkd

subject to Gkj (x

k) + ∇Gkj (x

k)>d ∈ Km (j = 1, 2, . . . , rε(xk)),

where Bk ∈ Rn×n is a symmetric positive definite matrix. Note that Gkj (x

k) and ∇Gkj (x

k) aregiven by

Gkj (x

k) = g(xk, tkj (xk)), (4.3.3)

∇Gkj (x

k) = ∇xg(xk, tkj (xk)) + ∇tkj (xk)∇tg(x, tkj (x

k)), (4.3.4)

where tkj (xk) ∈ Tε(xk) and ∇tkj (xk) can be obtained by solving the lower-level problem P (xk)

and by solving (4.2.7) or (4.2.8). Under some constraint qualification, the optimum dk ofQSOCP(xk, ε) satisfies the following KKT conditions:

∇f(xk) +Bkdk −

rε(xk)∑j=1

∇Gkj (x

k)ηk+1j = 0, (4.3.5)

Km 3 ηk+1j ⊥ Gk

j (xk) + ∇Gk

j (xk)>dk ∈ Km (j = 1, 2, . . . , rε(xk)), (4.3.6)

where ηk+1j ∈ Rm (j = 1, 2, . . . , rε(xk)) are Lagrange multiplier vectors corresponding to the

SOC constraints Gkj (x

k) + ∇Gkj (x

k)>d ∈ Km (j = 1, 2, . . . , rε(xk)). If dk = 0, then it followsimmediately from (4.3.5) and (4.3.6) that the KKT conditions for solving SOCP(xk, ε) aresatisfied at xk. If, in addition, g(xk, t) 6= 0 holds for all t ∈ Tε(xk), then it holds that

∇f(xk) −rε(xk)∑j=1

∇xg(xk, tkj )ηk+1j = 0,

Km 3 ηk+1j ⊥ g(xk, tkj ) ∈ Km (j = 1, 2, . . . , rε(xk)),

where tkj := tkj (xk) (j = 1, 2, . . . , rε(xk)), as will be shown by Proposition 4.4.2. These are

actually regarded as the KKT conditions (4.3.1) and (4.3.2) of SISOCP (4.1.1). In particular,we can see that xk is feasible for SISOCP (4.1.1), since g(xk, tkj ) ∈ Km for all j = 1, 2, . . . , rε(xk)and tkj (j = 1, 2, . . . , rε(xk)) contain all minimizers of the lower-level problem P (xk).

To generate the next iterate xk+1 along the direction dk, we need to choose a step size. Tothis end, we perform a line search with the following `∞-type penalty function:

Φρ(x) := f(x) + ρϕ+(x), (4.3.7)

where ϕ(·) is defined by (4.2.2) and ρ > 0 is a penalty parameter. Notice that the function Φρ(·) iscontinuous everywhere. Another plausible choice of a merit function used in the line search is an`1-type penalty function, i.e., the function (4.3.7) with ϕ+(x)

(= maxt∈T (−λ(x, t))+

)replaced

by∑

t∈Tε(x) (−λ(x, t))+. However, in the semi-infinite case, the `1-type penalty function hassuch a serious drawback that it may fail to be continuous at a point where the cardinality ofTε(x) changes. Properties of penalty functions for semi-infinite programs are explained in detailtogether with a specific example in [66]. Now, we formally state the SQP-type algorithm forsolving SISOCP (4.1.1).

Page 57: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

4.3 Local reduction based SQP-type algorithm for the SISOCP 57

Algorithm 4.1

Step 0 (Initialization): Choose x0 ∈ Rn and a matrix B0 ∈ Sn++. Select parameters α ∈

(0, 1), β ∈ (0, 1), δ > 0, ε > 0 and ρ−1 > 0. Set k := 0.

Step 1 (Generate a search direction): Solve QSOCP(xk, ε) to obtain dk ∈ Rn and corre-sponding Lagrange multipliers ηk+1

j ∈ Km (j = 1, 2, . . . , rε(xk)).

Step 2 (Check convergence): If dk = 0, then stop. Otherwise, go to Step 3.

Step 3 (Update penalty parameter): If ρk−1 ≥∑rε(xk)

j=1 (ηk+1j )1, then set ρk := ρk−1. Oth-

erwise, set ρk :=∑rε(xk)

j=1 (ηk+1j )1 + δ.

Step 4 (Armijo line search): Find the smallest nonnegative integer rk ≥ 0 satisfying

Φρk(xk + αrkdk) − Φρk

(xk) ≤ −αrkβ(dk)>Bkdk.

Set sk := αrk and xk+1 := xk + skdk.

Step 5: Update the matrix Bk to obtain Bk+1 ∈ Sn++. Set k := k + 1 and return to Step 1.

To construct QSOCP(xk, ε) at each iteration k, we need to obtain the set Tε(xk) by computingall local minimizers of the lower-level problem P (xk). Moreover, we have to compute ∇tkj (xk)(j = 1, 2, . . . , rε(xk)) by solving the linear system (4.2.7) or (4.2.8). In Step 4, we must computemaxt∈T (−λ(xk +αrdk, t))+ to evaluate Φρk

(xk +αrdk) for each r. In Step 5, we may choose Bk

as

Bk := ∇2f(xk) −rε(xk)∑j=1

(ζkj )1Wkj , (4.3.8)

where

Wkj := ∇2Gkj1(x

k) −∇2Gk

j (xk)Gk

j (xk)

‖Gkj (xk)‖

(j = 1, 2, . . . , rε(xk))

and ζkj (j = 1, 2, . . . , rε(xk)) are some estimates of Lagrange multiplier vectors corresponding to

the constraints Gkj (·) ∈ Km (j = 1, 2, . . . , rε(xk)). A specific choice of ζk

j (j = 1, 2, . . . , rε(xk))will be provided later in the section of numerical experiments. The matrix Wkj can be calculatedas follows: Let vk

j : Uε(xk) → R be defined by vkj (x) := Gk

j1(x)−‖Gkj (x)‖ for j = 1, 2, . . . , rε(xk).

Then, we have

∇2vkj (xk) = Wkj −

∇Gkj (x

k)∇Gkj (x

k)>

‖Gkj (xk)‖

+∇Gk

j (xk)Gk

j (xk)Gk

j (xk)>∇Gk

j (xk)>

‖Gkj (xk)‖3

,

which implies that, for j = 1, 2, . . . , rε(xk),

Wkj = ∇2vkj (xk) +

∇Gkj (x

k)∇Gkj (x

k)>

‖Gkj (xk)‖

−∇Gk

j (xk)Gk

j (xk)Gk

j (xk)>∇Gk

j (xk)>

‖Gkj (xk)‖3

. (4.3.9)

Notice that the right-hand side of the above formula can be evaluated since we have Gkj (x

k),∇Gk

j (xk) and ∇2vk

j (xk) by using (4.3.3), (4.3.4) and Proposition 4.2.2(d) with x replaced by xk,

Page 58: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

58 Chapter 4 A local reduction based SQP-type method for SISOCPs

respectively. Thus, we can calculate Wkj from (4.3.9). In the subsequent section, we will showquadratic convergence of Algorithm 4.1 in which Bk are chosen as (4.3.8).

Another plausible choice of Bk is to let Bk = ∇2xxLk

ε(xk, ηk) for each k, where Lk

ε(·, ·) de-notes the Lagrangian of SOCP(xk, ε). However, to evaluate ∇2

xxLkε(x

k, ηk), we have to compute∇2tkj (x

k) (j = 1, 2, . . . , rε(xk)), and it often brings about some numerical difficulties. On theother hand, computing (4.3.8) does not require any calculation of ∇2tkj (x

k).

4.4 Convergence analysis

In this section, we study global and local convergence properties of the proposed algorithm.

4.4.1 Global convergence

To begin with, we make the following assumption:

Assumption 4.4.1. For each k,

(a) xk is regular,

(b) g(xk, t) 6= 0 for all t ∈ Tε(xk),

(c) QSOCP(xk, ε) is feasible, and the KKT conditions (4.3.5) and (4.3.6) hold at the uniqueoptimum of QSOCP(xk, ε).

By Assumption 4.4.1 (a), SISOCP (4.1.1) can locally be reduced to SOCP(xk, ε) around xk

for each k. By Assumption 4.4.1 (b), we can ensure the continuous differentiability of λ(xk, ·) ateach t ∈ Tε(xk), which is required by the ε-regularity of xk. Although Assumption 4.4.1 (b) mayseem restrictive, g(xk, t) = 0 is unlikely to occur in practice at any local minimizer of P (xk),since −‖g(xk, ·)‖ attains its “sharp” maximum at any t ∈ T such that g(xk, t) = 0. UnderAssumption 4.4.1 (c), QSOCP(xk, ε) has a unique optimum dk since Bk ∈ Sn

++.By the following proposition, we can ensure that our algorithm finds a KKT point of

SISOCP (4.1.1), when the termination criterion dk = 0 is satisfied.

Proposition 4.4.2. Suppose that Assumption 4.4.1 holds. If dk = 0, then the KKT conditions(4.3.1) and (4.3.2) for SISOCP (4.1.1) are satisfied at xk with some Lagrange multiplier vectorsηk+11 , ηk+1

2 , . . . , ηk+1rε(xk)

∈ Rm. In particular, xk is feasible for SISOCP (4.1.1).

Proof. From the ε-regularity of xk, tkj := tkj (xk) ∈ Tε(xk) (j = 1, 2, . . . , rε(xk)) are nondegenerate

local optima of P (xk) and then satisfy the KKT conditions of the lower-level problem P (xk).Thus, we have, for each j = 1, 2, . . . , rε(xk),

∇tg1(xk, tkj ) −∇tg(xk, tkj )g(x

k, tkj )

‖g(xk, tkj )‖−

∑i∈Ia(tkj )

αji∇hi(tkj ) = 0, (4.4.1)

where Ia(tkj ) is defined by (4.2.5) and αji (i ∈ Ia(tkj )) are Lagrange multipliers. Using the fact

that ∇tkj (xk)∇hi(tkj ) = 0 holds for each i ∈ Ia(tkj ) by Proposition 4.2.2 (c), (4.4.1) yields

∇tkj (xk)∇tg1(xk, tkj ) −∇tkj (xk)∇tg(xk, tkj )g(x

k, tkj )

‖g(xk, tkj )‖= 0, j = 1, 2, . . . , rε(xk). (4.4.2)

Page 59: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

4.4 Convergence analysis 59

From the KKT conditions (4.3.6) of QSOCP(xk, ε) with dk = 0 and (4.3.4), we obtain

∇f(xk) −rε(xk)∑j=1

(∇xg(xk, tkj ) + ∇tkj (xk)∇tg(xk, tkj )

)ηk+1

j = 0, (4.4.3)

Km 3 ηk+1j ⊥ g(xk, tkj ) ∈ Km (j = 1, 2, . . . , rε(xk)). (4.4.4)

Notice that (4.4.4) implies ηk+1j = (ηk+1

j )1(1,−g(xk, tkj )>/‖g(xk, tkj )‖)> since ‖g(xk, tkj )‖ 6= 0 by

Assumption 4.4.1 (b), which together with (4.4.2) yields

∇tkj (xk)∇tg(xk, tkj )ηk+1j = 0, j = 1, 2, . . . , rε(xk).

Hence, from (4.4.3), we have ∇f(xk)−∑rε(xk)

j=1 ∇xg(xk, tkj )ηk+1j = 0. Combining this and (4.4.4),

we obtain the desired result.The feasibility of xk readily follows, since we have g(xk, tkj ) ∈ Km (j = 1, 2, . . . , rε(xk)) from

(4.4.4) and tkj (j = 1, 2, . . . , rε(xk)) contain global optima of P (xk).

We next show that the search direction dk ∈ Rn obtained from QSOCP(xk, ε) is a descentdirection for Φρ(·) at xk as long as the penalty parameter ρ is sufficiently large, which ensuresthat the line search in Step 4 terminates finitely at each iteration. To this end, we begin withproving the following lemma.

Lemma 4.4.3. Suppose that Assumption 4.4.1 holds. Then, we have

ϕ+(xk) + ϕ′+(xk; dk) ≤ 0, (4.4.5)

where ϕ(·) is defined by (4.2.2).

Proof. By the ε-regularity of xk and Proposition 4.2.4, there exist an open neighborhood Uε(xk)of xk and C2 functions tkj (·) : Uε(xk) → T (j = 1, 2, . . . , rε(xk)) such that

ϕ(x) = maxt∈T

(−λ(x, t)) = maxj=1,2,··· ,rε(xk)

(−λ(x, tkj (x)) = maxj=1,2,··· ,rε(xk)

(‖Gk

j (x)‖ −Gkj1(x)

)for all x ∈ Uε(xk), whereGk

j (x) = (Gkj1(x), G

kj (x)) := (g1(x, tkj (x)), g(x, t

kj (x))) for j = 1, 2, . . . , rε(xk).

Then, by letting J(xk) :={j ∈ {1, 2, . . . , rε(xk)} | −λ(xk, tkj (x

k)) = ϕ(xk)}

, we have

ϕ(xk) = −λ(xk, tkj (xk)) = ‖Gk

j (xk)‖ −Gk

j1(xk) (j ∈ J(xk)). (4.4.6)

In addition, since Gkj (x

k) 6= 0 from Assumption 4.4.1 (b), it is not difficult to show

ϕ′+(xk; dk) =

0 if ϕ(xk) < 0,

maxj∈J(xk)

(Gk

j (xk)>∇Gk

j (xk)>dk

‖Gkj (x)‖

− ∇Gkj1(x

k)>dk

)+

if ϕ(xk) = 0,

maxj∈J(xk)

(Gk

j (xk)>∇Gk

j (xk)>dk

‖Gkj (x)‖

− ∇Gkj1(x

k)>dk

)if ϕ(xk) > 0.

(4.4.7)

Page 60: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

60 Chapter 4 A local reduction based SQP-type method for SISOCPs

Moreover, since dk is feasible for QSOCP(xk, ε), we have Gkj (x

k) + ∇Gkj (x

k)>dk ∈ Km (j =1, 2, . . . , rε(xk)), which implies that

Gkj1(x

k) + ∇Gkj1(x

k)>dk −∥∥∥Gk

j (xk) + ∇Gk

j (xk)>dk

∥∥∥ ≥ 0 (j = 1, 2, . . . , rε(xk)). (4.4.8)

Notice that, for any u ∈ Rn and v ∈ Rn,

‖u‖ +u>v

‖u‖≤ ‖u+ v‖ (4.4.9)

holds since

‖u+ v‖2 −(‖u‖ +

u>v

‖u‖

)2

= ‖v‖2 − (u>v)2

‖u‖2≥ ‖v‖2 − ‖u‖2‖v‖2

‖u‖2= 0.

Hence, by setting u := Gkj (x), v := ∇Gk

j (xk)>dk in (4.4.9), we have

Gkj (x

k)>∇Gkj (x

k)>dk

‖Gkj (x)‖

+ ‖Gkj (x

k)‖ ≤∥∥∥Gk

j (xk) + ∇Gk

j (xk)>dk

∥∥∥ (j ∈ J(xk)). (4.4.10)

To show (4.4.5), we consider three cases (i) ϕ(xk) < 0, (ii) ϕ(xk) = 0, (iii) ϕ(xk) > 0. In case(i), (4.4.7) implies ϕ+(xk) + ϕ′

+(xk; dk) = ϕ+(xk) = 0. In case (ii), since ‖Gkj (x

k)‖ −Gkj1(x

k) =−ϕ(xk) = 0, it holds that, for j ∈ J(xk),

Gkj (x

k)>∇Gkj (x

k)>dk

‖Gkj (x)‖

− ∇Gkj1(x

k)>dk

= ‖Gkj (x

k)‖ −Gkj1(x

k) +Gk

j (xk)>∇Gk

j (xk)>dk

‖Gkj (x)‖

− ∇Gkj1(x

k)>dk

≤∥∥∥Gk

j (xk) + ∇Gk

j (xk)>dk

∥∥∥−Gkj1(x

k) −∇Gkj1(x

k)>dk

≤ 0, (4.4.11)

where the first inequality follows from (4.4.10) and the second inequality does from (4.4.8).Then, we obtain from (4.4.7) and (4.4.11)

ϕ+(xk) + ϕ′+(xk; dk) = max

j∈J(xk)

(Gk

j (xk)>∇Gk

j (xk)>dk

‖Gkj (x)‖

− ∇Gkj1(x

k)>dk

)+

= 0.

In case (iii), since ϕ+(xk) = ϕ(xk), we have

ϕ+(xk) + ϕ′+(xk; dk) = ϕ(xk) + max

j∈J(xk)

(Gk

j (xk)>∇Gk

j (xk)>dk

‖Gkj (x)‖

− ∇Gkj1(x

k)>dk

)

= maxj∈J(xk)

(Gk

j (xk)>∇Gk

j (xk)>dk

‖Gkj (x)‖

+ ϕ(xk) −∇Gkj1(x

k)>dk

)

= maxj∈J(xk)

(Gk

j (xk)>∇Gk

j (xk)>dk

‖Gkj (x)‖

+ ‖Gkj (x

k)‖ −Gkj1(x

k) −∇Gkj1(x

k)>dk

)≤ max

j∈J(xk)

(∥∥∥Gkj (x

k) + ∇Gkj (x

k)>dk∥∥∥−Gk

j1(xk) −∇Gk

j1(xk)>dk

)≤ 0,

Page 61: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

4.4 Convergence analysis 61

where the first and third equalities are obtained from (4.4.7) and (4.4.6), respectively, the firstinequality follows from (4.4.10), and the last inequality is derived from (4.4.8). Consequently,we have the desired result.

Proposition 4.4.4. Suppose that Assumption 4.4.1 holds. If ρ ≥∑rε(xk)

j=1 (ηk+1j )1, then we have

Φ′ρ(x

k; dk) ≤ −(dk)>Bkdk. (4.4.12)

Furthermore, Φ′ρ(x

k; dk) < 0 holds when dk 6= 0.

Proof. First note that, for each j = 1, 2, . . . , rε(xk), we have

Gkj (x

k)>ηk+1j = (ηk+1

j )1(Gkj (x

k))1 + (ηk+1j )>Gk

j (xk)

≥ (ηk+1j )1(Gk

j (xk))1 − ‖ηk+1

j ‖‖Gkj (x

k)‖

≥ (ηk+1j )1(Gk

j (xk))1 − (ηk+1

j )1‖Gkj (x

k)‖

= (ηk+1j )1λ(xk, tkj (x

k))

≥ (ηk+1j )1 min

t∈Tλ(xk, t),

≥ −(ηk+1j )1ϕ+(xk), (4.4.13)

where the second and third inequalities hold since (ηk+1j )1 ≥ ‖ηk

j ‖ by ηk+1j ∈ Km, and the

last inequality follows from ϕ+(xk) ≥ ϕ(xk) = −mint∈T λ(xk, t). Then, by noting the KKTconditions (4.3.5) and (4.3.6) of QSOCP(xk, ε), we obtain

∇f(xk)>dk = −(dk)>Bkdk +

rε(xk)∑j=1

(∇Gk

j (xk)ηk+1

j

)>dk

= −(dk)>Bkdk −

rε(xk)∑j=1

Gkj (x

k)>ηk+1j ,

≤ −(dk)>Bkdk +

rε(xk)∑j=1

(ηk+1j )1ϕ+(xk)

≤ −(dk)>Bkdk + ρϕ+(xk), (4.4.14)

where the second equality holds since (ηk+1j )>(Gk

j (xk)+∇Gk

j (xk)>dk) = 0 for each j = 1, 2, . . . , rε(xk)

by the SOC complementarity conditions in (4.3.6), the first inequality is derived from (4.4.13),and the last inequality is implied by ρ ≥

∑rε(xk)j=1 (ηk+1

j )1 ≥ 0 and ϕ+(xk) ≥ 0. By using thesefacts, we have

Φ′ρ(x

k; dk) = ∇f(xk)>dk + ρϕ′+(xk; dk)

≤ −(dk)>Bkdk + ρ(ϕ+(xk) + ϕ′

+(xk; dk))

≤ −(dk)>Bkdk,

where the first inequality follows from (4.4.14) and the last inequality does from Lemma 4.4.3.Therefore, (4.4.12) holds.

The latter claim is obvious from (4.4.12), dk 6= 0 and Bk ∈ Sn++.

Page 62: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

62 Chapter 4 A local reduction based SQP-type method for SISOCPs

Assumption 4.4.5. (a) There exist 0 < γ1 ≤ γ2 such that γ1‖d‖2 ≤ d>Bkd ≤ γ2‖d‖2 for alld ∈ Rn and k = 0, 1, 2, . . .,

(b) {xk} is bounded, and

(c) {dk} is bounded.

For an arbitrary accumulation point x∗ ∈ Rn of {xk}, it holds that

(d) x∗ is regular, and

(e) g(x∗, t) 6= 0 for any t ∈ Tε(x∗).

Furthermore, let Uε(x∗) ⊆ Rn and tj(·) : Uε(x∗) → T (j = 1, 2, . . . , rε(x∗)) be an open neighbor-hood of x∗ and functions, respectively, such that the conditions (a)-(c) in Proposition 4.2.4 holdwith x replaced by x∗. Then,

(f) there exists an open neighborhood Vε(x∗)(⊆ Uε(x∗)) of x∗ such that {tj(x)}rε(x∗)j=1 = Tε(x)

holds for any x ∈ Vε(x∗), and

(g) Slater’s constraint qualification holds for QSOCP(x∗, ε), i.e., there exists d0 ∈ Rn such thatGj(x∗)+∇Gj(x∗)>d0 ∈ intKm for all j = 1, 2, . . . , rε(x∗), where Gj(x) := g(x, tj(x)) (j =1, 2, . . . , rε(x∗)).

Assumption 4.4.5 (f) implies that, when xk ∈ Vε(x∗), we haveGkj (x) ≡ Gj(x) := g(x, tj(x)) (j =

1, 2, . . . , rε(x∗)), and hence SISOCP (4.1.1) can locally be reduced to the following SOCP aroundxk:

minx∈Vε(x∗)

f(x) s.t. Gj(x) ∈ Km, j = 1, 2, . . . , rε(x∗). (4.4.15)

In other words, the constraint functions of SOCP(xk, ε) coincide with those of SOCP(x∗, ε),whenever xk ∈ Vε(x∗).

Under the above assumptions, we have the following proposition:

Proposition 4.4.6. Suppose that Assumptions 4.4.1 and 4.4.5 hold. Let ηk+11 , ηk+1

2 , . . . , ηk+1rε(xk)

∈Km be Lagrange multiplier vectors satisfying the KKT conditions (4.3.5) and (4.3.6), and denoteηk := (ηk

1 , ηk2 , . . . , η

krε(xk)

). Then, it holds that

(a) {ηk} is bounded, and

(b) there exist some k0 ≥ 0 and ρ > 0 such that ρk = ρ for all k ≥ k0.

Proof. We first show (a). For contradiction, suppose that {ηk} is not bounded. Then, thereexists some subsequence {ηk+1}k∈K such that limk∈K, k→∞ ‖ηk+1‖ = ∞. We may assume,without loss of generality, that ηk+1 6= 0 for all k ∈ K. By Assumption 4.4.5(b), {xk}k∈K isbounded and has at least one accumulation point, say, x∗ ∈ Rn. Again, without loss of generality,we can assume that limk∈K, k→∞ xk = x∗. From Assumption 4.4.5(d), x∗ is regular. Then, byProposition 4.2.4, there exist some open neighborhood Uε(x∗) ⊆ Rn of x∗ and C2 functions

Page 63: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

4.4 Convergence analysis 63

tj(·) : Uε(x∗) → T (j = 1, 2, . . . , rε(x∗)) such that SISOCP (4.1.1) can locally be reduced toSOCP(x∗, ε) around x∗:

minx∈Uε(x∗)

f(x) s.t. Gj(x) ∈ Km (j = 1, 2, . . . , rε(x∗))

where Gj(x) := g(x, tj(x)) (j = 1, 2, . . . , rε(x∗)). From Assumption 4.4.5 (f), the constraintfunctions of SOCP(xk, ε) for k(∈ K) ≥ k are identical to those of SOCP(x∗, ε) for some k ∈ K

large enough. Therefore, QSOCP(xk, ε) can be represented as

mind∈Rn

12d>Bkd+ ∇f(xk)>d s.t. Gj(xk) + ∇Gj(xk)>d ∈ Km, j = 1, 2, . . . , rε(x∗),

and its optimum dk satisfies the following KKT conditions:

∇f(xk) +Bkdk −

rε(x∗)∑j=1

∇Gj(xk)ηk+1j = 0,

Km 3 ηk+1j ⊥ Gj(xk) + ∇Gj(xk)>dk ∈ Km (j = 1, 2, . . . , rε(x∗)),

from which it follows that

1‖ηk+1‖

∇f(xk) +Bkd

k

‖ηk+1‖−

rε(x∗)∑j=1

∇Gj(xk)ηk+1j

‖ηk+1‖= 0, (4.4.16)

Km 3ηk+1

j

‖ηk+1‖⊥ Gj(xk) + ∇Gj(xk)>dk ∈ Km (j = 1, 2, . . . , rε(x∗)) (4.4.17)

for all k(∈ K) ≥ k.Note that {dk}k∈K ⊆ Rn is bounded from Assumption 4.4.5(c) and {ηk+1/‖ηk+1‖}k(∈K)≥k ⊆

Rmrε(x∗) is also bounded. Let (d∗, η∗) := (d∗, η∗1, η∗2, . . . , η

∗rε(x∗)) ∈ Rn ×Km ×Km × · · · ×Km be

an arbitrary accumulation point of {(dk, ηk+1/‖ηk+1‖

)}k(∈K)≥k. Without loss of generality, we

can assume that

limk∈K,k→∞

(ηk+1

‖ηk+1‖, dk, xk

)= (η∗, d∗, x∗) . (4.4.18)

Then, letting k ∈ K, k → ∞ in (4.4.16) and (4.4.17) yields

rε(x∗)∑j=1

∇Gj(x∗)η∗j = 0, (4.4.19)

Km 3 η∗j ⊥ Gj(x∗) + ∇Gj(x∗)>d∗ ∈ Km (j = 1, 2, . . . , rε(x∗)), (4.4.20)

since {Bk} is bounded from Assumption 4.4.5(a). Furthermore, from Assumption 4.4.5(g),QSOCP(x∗, ε) satisfies Slater’s constraint qualification, i.e., there exists some d0 ∈ Rn suchthat

Gj(x∗) + ∇Gj(x∗)>d0 ∈ intKm (j = 1, 2, . . . , rε(x∗)). (4.4.21)

Now observe that

rε(x∗)∑j=1

(η∗j )>(Gj(x∗) + ∇Gj(x∗)>d0

)=

rε(x∗)∑j=1

(∇Gj(x∗)η∗j

)> (d0 − d∗) = 0, (4.4.22)

Page 64: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

64 Chapter 4 A local reduction based SQP-type method for SISOCPs

where the first equality holds since (η∗j )> (Gj(x∗) + ∇Gj(x∗)>d∗

)= 0 (j = 1, 2, . . . , rε(x∗)) by

(4.4.20), and the second equality follows from (4.4.19). Combining (4.4.22) with (4.4.21) andη∗j ∈ Km (j = 1, 2, . . . , rε(x∗)), we obtain η∗j = 0 for j = 1, 2, . . . , rε(x∗). This is a contradictionsince ‖η∗‖ = 1 from (4.4.18). Therefore, {ηk} is bounded.

We next show (b). For contradiction, we suppose that such ρ > 0 and k0 ≥ 0 do notexist. Then, by the update rule in Step 3 of Algorithm 4.1, there exists an infinite subsequence{ρk}k∈K′ of penalty parameters such that, for all k ∈ K ′,

ρk−1 <

rε(x∗)∑j=1

(ηk+1j )1 and ρk =

rε(x∗)∑j=1

(ηk+1j )1 + δ,

from which we have ρk ≥ ρk−1 + δ for all k ∈ K ′. This implies limk→∞ ρk = ∞, since{ρk} is nondecreasing by the update rule. We thus obtain limk∈K′,k→∞ ‖ηk+1‖ = ∞ since∑rε(xk)

j=1 (ηkj )1 > ρk−1 → ∞ as k(∈ K ′) → ∞. This contradicts the boundedness of {ηk}.

Now, we establish the global convergence of Algorithm 4.1.

Theorem 4.4.7. Suppose that Assumptions 4.4.1 and 4.4.5 hold. Let x∗ ∈ Rn be an arbitrary ac-cumulation point of {xk} ⊆ Rn. Then, the KKT conditions (4.3.1) and (4.3.2) of SISOCP (4.1.1)hold at x∗.

Proof. First, from Proposition 4.4.6, there exists some ρ > 0 such that ρk = ρ for all k sufficientlylarge. For simplicity of expression, we assume that ρk = ρ for all k.

Choose a subsequence {xk}k∈K such that limk∈K,k→∞ xk = x∗. Since {dk}k∈K is boundedfrom Assumption 4.4.5 (c), it has at least one accumulation point, say, d∗ ∈ Rn. To prove thedesired result, from Proposition 4.4.2, it suffices to show d∗ = 0. Due to Assumption 4.4.5 (d),(e) and (f), SISOCP (4.1.1) can locally be reduced to SOCP (4.4.15) around xk for all k ∈ K

sufficiently large. Then, using the facts that {Bk} ⊆ Sn++ is uniformly bounded by Assump-

tion 4.4.5 (a), dk is a descent direction of Φρ(·) at xk by Proposition 4.4.4 and Φρ(·) is continuouseverywhere, we can deduce that d∗ = 0 in a way similar to the convergence analysis for the SQP-type method for solving the nonlinear SOCP [33].

4.4.2 Local convergence

Now, we analyze the convergence rate of Algorithm 4.1. In the remainder of this section, weassume that a sequence {(xk, ηk)} generated by Algorithm 4.1 converges to (x∗, η∗) ∈ Rn ×Rmrε(x∗). Moreover, we let x∗ ∈ Rn be a ε-regular point such that SISOCP (4.1.1) is locallyreduced to the following SOCP(x∗, ε) around x∗:

minx∈Uε(x∗)

f(x) s.t. Gj(x) ∈ Km (j = 1, 2, . . . , rε(x∗)),

where, for j = 1, 2, . . . , rε(x∗), Gj(x) := g(x, tj(x)) with C2 functions tj(·) : Uε(x∗) → T andan open neighborhood Uε(x∗) ⊆ Rn of x∗ satisfying conditions (a)-(c) in Proposition 4.2.4. Wesuppose that Gj(x∗) 6= 0 (j = 1, 2, . . . , rε(x∗)) and (x∗, η∗) satisfies the KKT conditions for

Page 65: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

4.4 Convergence analysis 65

SOCP(x∗, ε):

∇f(x∗) −rε(x∗)∑j=1

∇Gj(x∗)η∗j = 0,

Km 3 Gj(x∗) ⊥ η∗j ∈ Km (j = 1, 2, . . . , rε(x∗)),

where η∗ := (η∗1, η∗2, . . . , η

∗rε(x∗)) ∈ Rm × Rm × · · · × Rm. Furthermore, we define the Lagrangian

of SOCP(x∗, ε) by

Lε(x, η) := f(x) −rε(x∗)∑j=1

Gj(x)>ηj ,

where η := (η1, η2, . . . , ηrε(x∗)) ∈ Rm × Rm × · · · × Rm.Before discussing the convergence rate of the algorithm, we recall the constraint nondegen-

eracy and second order sufficient condition for SOCP(x∗, ε). We say that x∗ ∈ Rn is constraintnondegenerate [5, Definition 16] if

∇Gj(x∗)>Rn + linTKm(Gj(x∗)) = Rm

holds for each j = 1, 2, . . . , rε(x∗), where TKm(z) denotes the tangent cone of Km at z ∈ Km andlinTKm(z) stands for the largest linear subspace contained by TKm(z). The second order sufficientcondition (SOSC) for general SOCP is studied in [5, 33, 71]. Under the assumption Gj(x∗) 6=0 (j = 1, 2, . . . , rε(x∗)), the SOSC can be simplified as follows: For all d ∈ CKm(x∗, η∗) \ {0},

d>∇2xxLε(x∗, η∗)d+ d>

rε(x∗)∑j=1

Hjε(x

∗, η∗)

d > 0,

where

Hjε(x

∗, η∗) :=

(η∗j )1Gj1(x∗)

∇Gj(x∗)

1 0

0 −Im−1

∇Gj(x∗)> if Gj(x∗) ∈ bdKm \ {0},

0 otherwise,

CKm(x∗, η∗) :={d ∈ Rn

∣∣∣d>∇Gj(x∗)η∗j = 0 for all j such that Gj(x∗) ∈ bd Km \ {0}}.

Under the above conditions, we can show that the sequence {(xk, ηk)} converges to (x∗, η∗)quadratically, by using an argument in [71, Theorem 4.2].

Proposition 4.4.8. Let B : Rn×Rrε(x∗) → Rn×n be a function such that B(x∗, η∗) = ∇xxLε(x∗, η∗)and B(·, ·) is continuously differentiable at (x∗, η∗). Suppose that Assumption 4.4.1 and As-sumption 4.4.5 (d)–(g) hold. Moreover, let the constraint nondegeneracy condition and SOSChold at (x∗, η∗). If (xk0 , ηk0) is sufficiently close to (x∗, η∗) for some k0 ≥ 0, and if sk = 1,Bk = B(xk, ηk) and Bk ∈ Sn

++ for all k ≥ k0, then {(xk, ηk)} converges to (x∗, η∗) quadratically.

Proof. From Assumption 4.4.5 (f)–(g), if xk is sufficiently close to x∗, then we can locally reduceSISOCP (4.1.1) to SOCP(x∗, ε) around xk. Then, by [71, Theorem 4.2], we obtain the desiredresult.

Page 66: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

66 Chapter 4 A local reduction based SQP-type method for SISOCPs

Using the above theorem, we can establish quadratic convergence of Algorithm 4.1 in whichBk are chosen as (4.3.8).

Theorem 4.4.9. Suppose that the assumptions in Proposition 4.4.8 hold. If (xk0 , ηk0) is suf-ficiently close to (x∗, η∗) for some k0 ≥ 0, and if sk = 1, Bk is chosen as (4.3.8) with ζk

j :=ηk

j (j = 1, 2, . . . , rε(x∗)) and Bk ∈ Sn++ for all k ≥ k0, then {(xk, ηk)} converges to (x∗, η∗)

quadratically.

Proof. From Assumption 4.4.5 (f)–(g), if xk is sufficiently close to x∗, then we can locally reduceSISOCP (4.1.1) to SOCP(x∗, ε) around xk. Then, by letting

B(x, η) := ∇2f(x) −rε(x∗)∑j=1

(ηj)1

(∇2Gj1(x) −

∇2Gj(x)Gj(x)‖Gj(x)‖

),

we have B(xk, ηk) = Bk for all k sufficiently large. Hence, from Proposition 4.4.8, we have onlyto show that B(·, ·) is continuously differentiable at (x∗, η∗) and B(x∗, η∗) = ∇2

xxLε(x∗, η∗).The first claim is obvious since Gj(x∗) 6= 0 (j = 1, 2, . . . , rε(x∗)). We prove the second claim.Notice that η∗j = (η∗j )1(1,−Gj(x∗)>/‖Gj(x∗)‖)> for j = 1, 2, . . . , rε(x∗) since Gj(x∗) 6= 0 andKm 3 η∗j ⊥ Gj(x∗) ∈ Km for j = 1, 2, . . . , rε(x∗). Thus, we have

∇2xxLε(x∗, η∗) = ∇2f(x∗) −

rε(x∗)∑j=1

∇2Gj(x∗)η∗j

= ∇2f(x∗) −rε(x∗)∑j=1

(η∗j )1∇2Gj(x∗)

1

− Gj(x∗)‖Gj(x∗)‖

= B(x∗, η∗).

This completes the proof.

4.5 Numerical experiments

In this section, we report some numerical results. The program was coded in Matlab 2008a andrun on a machine with an IntelrCore2 Duo E6850 3.00GHz CPU and 4GB RAM. Throughoutthe experiments, we let the index set be given by T := {t ∈ R | h(t) ≥ 0}, where h(t) :=(t + 1, 1 − t)>, i.e., T = [−1, 1]. The actual implementation of Algorithm 4.1 was carried outas follows. To obtain Tε(xk), we compute local minimizers of the lower-level problem P (xk).For this purpose, we first compute λ(xk, t) for t = −1,−0.98,−0.96, . . . , 0.98, 1, where λ(·, ·) isdefined as in (4.2.1). We then find local minimizers among λ(xk,−1), λ(xk,−0.98), . . . , λ(xk, 1)and apply Newton’s method with them as starting points. In Step 0, we set the parameters asα = 0.5, β = 10−5, δ = 5, ε = 0.1 and ρ−1 = 10. The initial point x0 ∈ Rn and the initial matrixB0 ∈ Rn×n are chosen as x0 := (10, 10, 10, . . . , 10)> and the identity matrix In, respectively. InStep 1, we make use of the smoothing method [18, 26] to solve QSOCP(xk, ε). In Step 2, westop the algorithm when ‖dk‖ ≤ 10−7 is satisfied. In Step 5, we update the matrix Bk ∈ Rn×n

Page 67: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

4.5 Numerical experiments 67

by (4.3.8), where the vectors ζkj (j = 1, 2, . . . , rε(xk)) are set as

ζkj :=

ηki if we find i ∈ {1, 2, . . . , rε(xk)} such that tkj (·) = tk−1

i (·)

0 otherwise.(4.5.1)

In (4.5.1), we regard tkj (·) = tk−1i (·) when

j ∈ argminl=1,2,...,rε(xk−1)‖tkl (xk) − tk−1i (xk−1) −∇tk−1

i (xk−1)>(xk − xk−1)‖

and

‖tkj (xk) − tk−1i (xk−1) −∇tk−1

i (xk−1)>(xk − xk−1)‖ ≤ 10−4.

Moreover, to ensure positive definiteness of Bk, we modify Bk as follows. Let αki ∈ R (i =1, 2, . . . , n) and Vk ∈ Rn×n be scalars and a matrix such that Bk = Vkdiag(αki)n

i=1V>k , re-

spectively. Then, for each i, we replace αki by 10−4 if αki ≤ 10−5, and redefine Bk asBk = Vkdiag(αki)n

i=1V>k .

Experiment 1

In the first experiment, we examine the convergence behavior of Algorithm 4.1 by solving thevector-valued Chebyshev approximation problem (1.1.4). Let Q : R → R3 and q : Rn × R → R3

be defined by

Q(t) :=

et2+ cos t2

2tet2 − 2t sin t2

(4t2 + 2)et2 − 2 sin t2 − 4t2 cos t2

and

q(u, t) :=

∑n

ν=1uνtν−1∑n

ν=2(ν − 1)uνtν−2∑n

ν=3(ν − 1)(ν − 2)uνtν−3

.

To find a u ∈ Rn such that q(u, t) ≈ Q(t) over t ∈ T , we solve the following problem:

Minimizeu∈Rn

maxt∈T

‖Q(t) − q(u, t)‖ . (4.5.2)

As in (1.1.5), by using an auxiliary variable v ∈ R, we can reformulate (4.5.2) as the followingSISOCP with the four-dimensional SOC:

Minimize(v,u)∈R×Rn

v

subject to

1 0 0 0 · · · 00 1 t t2 · · · tn

0 0 1 2t · · · ntn−1

0 0 0 2 · · · n(n− 1)tn−2

(v

u

)−

0

et2+ cos t2

2tet2 − 2t sin t2

(4t2 + 2)et2 − 2 sin t2 − 4t2 cos t2

∈ K4

for all t ∈ T.

(4.5.3)

Page 68: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

68 Chapter 4 A local reduction based SQP-type method for SISOCPs

We then apply Algorithm 4.1 to SISOCP (4.5.3) with n = 6 and n = 8. The obtained results areshown in Tables 4.1 and 4.2, where cpu(s) denotes the running time of Algorithm 4.1 in seconds,and KKT(xk, ηk) is given by

KKT(xk, ηk) :=

∇f(xk) −

∑rε(xk)j=1 ∇xg(xk, tkj )η

ki

ηk1 − PKm

(ηk1 − g(xk, tk1)

)...

ηkrε(xk)

− PKm

(ηk

rε(xk)− g(x, tk

rε(xk)))

and ηk = (ηk

1 , . . . , ηkrε(xk)

) ∈ Rm × · · · × Rm is a Lagrange multiplier vector satisfying the KKTconditions (4.3.5) and (4.3.6) for QSOCP(xk, ε). Note that, by Proposition 2.4.3, KKT(xk, ηk) =0 if and only if (xk, ηk) satisfies the KKT conditions (4.3.1) and (4.3.2) for SISOCP (4.1.1).From the tables, we can observe that Algorithm 4.1 succeeds in getting an optimal solution forSISOCP (4.5.3). Indeed, xk and ηk satisfy the KKT conditions for SISOCP (4.5.3) accurately,since ‖KKT(xk, ηk)‖ ≤ 10−10 at the last iteration. Also, we can observe that the step size sk

equals 1 in the final stage and {xk} converges to a solution rapidly. In addition, we confirm that|Tε(xk)| becomes constant and the implicit functions {tkj (·)}

rε(xk)j=1 remain unchanged eventually,

and hence Assumption 4.4.5 (f) holds.

k sk ‖dk‖ ‖KKT(xk, ηk)‖ |Tε(xk)|1 1.0 1.73e+01 1.47e+02 12 0.5 1.47e+01 7.06e-01 2...

......

......

6 1.0 9.18e-04 5.99e-04 57 1.0 4.92e-07 2.63e-07 58 1.0 7.83e-11 4.20e-11 5

cpu(s): 5.8 seconds

Table 4.1: Results for Experiment 1 (n = 6)

k sk ‖dk‖ ‖KKT(xk, ηk)‖ |Tε(xk)|1 1.0 2.23e+01 5.03e+02 12 0.5 1.46e+01 7.06e-01 1...

......

......

10 0.5 7.94e-07 2.51e-06 711 1.0 3.97e-07 1.26e-06 712 1.0 1.78e-12 5.64e-12 7

cpu(s): 22.4 seconds

Table 4.2: Results for Experiment 1 (n = 8)

Page 69: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

4.5 Numerical experiments 69

Experiment 2

In Experiment 1, we have observed that Algorithm 4.1 obtains accurate solutions with a rapidconvergence rate. Thus, if a starting point is chosen near an optimal solution, Algorithm 4.1 isexpected to find a solution more efficiently. In this experiment, to produce such a starting point,we use Algorithm 3.2 proposed in Chapter 3, and then use Algorithm 4.1 with an approximatesolution computed by Algorithm 3.2. The Algorithm 3.2 was implemented as described inExperiment 3-1 of Chapter 3. The computational results for SISOCP (4.5.3) with n = 6 andn = 8 are shown in Table 4.3 and Table 4.4, respectively, where

• cpu(s) (Algorithm 3.2): the running time of Algorithm 3.2

• cpu(s) (Algorithm 4.1): the running time of Algorithm 4.1

• cpu(s) (Algorithm 3.2+Algorithm 4.1): the total running time of Algorithm 3.2 and Al-gorithm 4.1.

From the tables, we observe that the total computational times are much less than those inExperiment 1. In particular, when n = 8, Algorithm 4.1 combined with Algorithm 3.2 took only15.3 seconds in total, while Algorithm 4.1 alone spent 22.4 seconds in Experiment 1.

cpu(s) (Algorithm 3.2) 2.0 secondscpu(s) (Algorithm 4.1) 0.49 secondscpu(s) (Algorithm 3.2+Algorithm 4.1) 2.49 seconds

Table 4.3: Results for Experiment 2 (n = 6)

cpu(s) (Algorithm 3.2) 4.0 secondscpu(s) (Algorithm 4.1) 11.3 secondscpu(s) (Algorithm 3.2+Algorithm 4.1) 15.3 seconds

Table 4.4: Results for Experiment 2 (n = 8)

Experiment 3

In the third experiment, we implemented another SQP-type algorithm, which is also expectedto find accurate solutions rapidly, and compared it with Algorithm 4.1 by solving the followingSISOCP that contains multiple SOCs:

Minimizex∈R10

12x

>Mx+ c>x

subject to As(t)x− bs(t) ∈ Kms for all t ∈ T,

s = 1, 2, . . . , S,

(4.5.4)

where c ∈ R10, As(t) := (Asij(t)) ∈ Rms×10 with As

ij(t) :=∑5

`=0 αsij`t

` (i = 1, 2, . . . ,ms, j =1, 2, . . . , 10) and bs(t) := (bsi (t)) ∈ Rms with bs1(t) := −

∑msi=2

∑5`=0 |βs

i`| and bsi (t) :=∑5

`=0 βsi`t

`

Page 70: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

70 Chapter 4 A local reduction based SQP-type method for SISOCPs

(i = 2, . . . ,ms). The SOCs K := Km1×Km2×· · · KmS are chosen as in Table 4.5. For each type ofSOC K, we generate 50 problems as follows: The problem data αs

1j`, αsij`, β

si` (i = 2, . . . ,ms, j =

1, 2, . . . , 10, ` = 0, 1, 2, . . . , 5, s = 1, 2, . . . , S) are chosen randomly from the interval [2,−2].All components of c are randomly chosen from the interval [5,−5]. The matrix M is set to beM := M>

1 M1 + 0.1In, where M1 ∈ Rn×n is a matrix whose entries are randomly chosen fromthe interval [1,−1]. Notice that, by the choice of bs1(t), we can ensure that (4.5.4) is feasible.1

In Step 3, we use the following penalty function for SISOCP (4.5.4) with multiple SOCs, whichis a natural extension of the function defined by (4.3.7):

Φρ(x) := f(x) + ρ

S∑s=1

ϕs+(x), (4.5.5)

where ϕs(x) := maxt∈T

(−As

1(t)x+ bs1(t) + ‖As(t)(x) − bs(t)‖)

for s = 1, 2, . . . , S. Accordingly,we extend the update rule of the penalty parameters {ρk} in Step 5 as follows:

ρk :=

ρk−1 if ρk−1 ≥ max

s=1,2,...,S

rsε(xk)∑j=1

(ηk+1sj )1

δ + maxs=1,2,...,S

rsε(xk)∑j=1

(ηk+1sj )1 otherwise,

where δ > 0 is a given constant and ηk+1sj (j = 1, 2, . . . , rs

ε(xk), s = 1, 2, . . . , S) are Lagrange

multiplier vectors obtained by solving QSOCP(xk, ε) for SISOCP (4.5.4).We next explain another SQP-type algorithm, which we call the QP-based method. For

simplicity of expression, we consider the case of SISOCP (4.1.1) with a single SOC. In the QP-based method, we reformulate SOCP(xk, ε) as the following nonlinear program that does notcontain SOC constraints explicitly:

minx∈U(xk)

f(x) s.t. vkj (x) ≥ 0 (j = 1, 2, . . . , rε(xk)), (4.5.6)

where vkj (x) := λ(x, tkj (x)) for j = 1, 2, . . . , rε(xk), and then generate a search direction dk by

solving the following Quadratic Program2 (QP):

QP(xk, ε) :Minimize ∇f(xk)>d+ 1

2d>Bkd

subject to vkj (xk) + ∇vk

j (xk)>d ≥ 0 (j = 1, 2, . . . , rε(xk)),

where Bk ∈ Sn++. We make use of the Hessian of the Lagrangian of (4.5.6). Specifically, we first

compute

Dk := ∇2f(xk) −rε(xk)∑j=1

ξkj ∇2vk

j (xk),

with

ξkj :=

ξki if we find i ∈ {1, 2, . . . , rε(xk)} such that tkj (·) = tk−1

i (·)

0 otherwise,1The origin x = 0 always lies in the interior of the feasible region, since we have −bs(t) ∈ intKms from

−bs1(t) − ‖(−bs

2(t), . . . ,−bsms

(t))>‖ > 0 for all t ∈ T .2We also suppose that Assumption 4.4.1 (b) holds.

Page 71: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

4.6 Concluding remarks 71

where ξki ∈ R (i = 1, 2, . . . , rε(xk−1)) are Lagrange multipliers satisfying the KKT conditions of

QP(xk−1, ε). Note that ∇vkj (xk) and ∇2vk

j (xk) for j = 1, 2, . . . , rε(xk) can be calculated fromProposition 4.2.2 (d). Similarly to the matrix Bk for Algorithm 4.1, we also ensure the positivedefiniteness of Bk as follows: Let αki ∈ R (i = 1, 2, . . . , n) and Vk ∈ Rn×n be scalars and amatrix such that Bk = Vkdiag(αki)n

i=1V>k , respectively. Then, for each i, we replace αki by

10−4 if αki ≤ 10−5, and redefine Bk as Bk = Vkdiag(αki)ni=1V

>k . We use the penalty function

defined by (4.3.7) (by (4.5.5) for problem (4.5.4)) and determine a step size by the Armijo linesearch. We update the penalty parameters {ρk} as follows: If ρk−1 ≥

∑rε(xk)j=1 ξk+1

j , then we set

ρk := ρk−1. Otherwise, we set ρk :=∑rε(xk)

j=1 ξk+1j + δ, where δ > 0 is a given constant.

We extend the above QP-based method to (4.5.4) and implement it. The choice of parametersin the QP-based method is the same as in Algorithm 4.1. Moreover, we solve QP(xk, ε) withthe solver quadprog in MATLAB Optimization Toolbox.

The obtained results are shown in Table 4.5, where each column represents the following:

• itemax: the maximum number of iterations among 50 problems for each K

• itemin: the minimum number of iterations among 50 problems for each K

• iteave: the average number of iterations over 50 problems for each K

• cpu(s): the average time in seconds over 50 problems for each K

For all the generated problems, both algorithms successfully obtain optimal solutions. From thetable, we can observe that Algorithm 4.1 tends to perform better than the QP-based method.In particular, when K = 10, both the number of iterations and the computational time forAlgorithm 4.1 are less than half of those for the QP-based method. This fact suggests thatAlgorithm 4.1 may exploit the structure of SOC more effectively than the QP-based method.

Algorithm 4.1 QP-based methodK itemax itemin iteave cpu(s) itemax itemin iteave cpu(s)

K10 19 3 6.22 1.04 49 6 13.28 1.91K30 12 3 5.34 2.06 41 6 12.52 4.09K50 11 3 5.54 2.66 31 7 13.36 4.04

K20 ×K30 17 3 5.56 2.95 24 7 11.48 4.55K20 ×K15 ×K15 13 3 5.95 5.91 23 7 11.46 6.36

Table 4.5: Comparison of Algorithm 4.1 and the QP-based method (Experiment 3)

4.6 Concluding remarks

For solving the semi-infinite program with an infinite number of SOC constraints, we proposedthe local reduction based SQP-type method. We studied the global and local convergence prop-erties of the proposed algorithm. Finally, in the numerical experiments, we actually implementedand examined its effectiveness. For the sake of comparison, we also implemented a regularized

Page 72: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

72 Chapter 4 A local reduction based SQP-type method for SISOCPs

explicit exchange method and another SQP-type method, and observed good performance ofthe proposed algorithm.

Page 73: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

Chapter 5

A smoothing SQP method for

mathematical programs with SOC

complementarity constraints

5.1 Introduction

In the previous chapters, we have studied algorithms for solving semi-infinite programs withinfinitely many SOC constraints. In this chapter, we turn to a class of mathematical programswith equilibrium constraints. Especially, we focus on the following mathematical program withSOC complementarity constraints, abbreviated as MPSOCC:

Minimizex,y,z

f(x, y)

subject to Ax ≤ b,

z = Nx+My + q,

K 3 y ⊥ z ∈ K,

(5.1.1)

where f : Rn+m → R is a continuously differentiable function, A ∈ Rp×n, b ∈ Rp, N ∈Rm×n, M ∈ Rm×m and q ∈ Rm are given matrices and vectors, and K is the Cartesian productof second-order cones, that is, K := Km1×Km2×· · ·×Km` . Throughout the chapter, we supposethat mi ≥ 2 for each i.

The main purpose of this chapter is to develop an algorithm for solving MPSOCC (5.1.1).As mentioned in Chapter 1, Yan and Fukushima [77] proposed a smoothing method for solvingsuch problems. In their convergence theory, it is supposed that smoothed subproblems are solvedexactly. However, it can hardly be expected in practice. To overcome such a difficulty, wepropose to combine an SQP-type method with the smoothing method. The proposed methodreplaces the SOC complementarity condition of MPSOCC (5.1.1) with a certain vector equationby using a smoothed natural residual function, thereby yielding convex quadratic programmingsubproblems which can be solved efficiently by any state-of-the-art method such as the active-setmethod and interior point method. While Yan and Fukushima’s method solves the smoothedsubproblem exactly at each iteration, our method only solves convex quadratic programmingsubproblems that approximates the smoothed subproblem.

Page 74: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

74 Chapter 5 A smoothing SQP method for MPSOCC

Although our method may be viewed as an extension of the SQP method in [17], the conver-gence analysis is quite different since it exploits the particular properties of the natural residualassociated with the SOC complementarity condition.

This chapter is organized as follows. In Section 5.2, we introduce the Cartesian P0 and theCartesian P matrices that play a key role to establish the well-definedness of the proposed algo-rithm. In Section 5.3, we reformulate MPSOCC (5.1.1) as a nonlinear programming problem byreplacing the second-order cone complementarity constraints by equivalent nonsmooth equalityconstraints. In Section 5.4, we recall a smoothing technique to deal with the nonsmooth con-straints. In Section 5.5, we propose an SQP-type algorithm for solving problem (5.1.1) and showthat the proposed method is well-defined. In Section 5.6, we prove that the proposed algorithmpossesses the global convergence property under the strict complementarity assumptions. InSection 5.7, we give some numerical examples. In Section 5.8, we end this chapter with someconcluding remarks.

5.2 Cartesian P0 and P matrices

In this section, we introduce the Cartesian P0 and the Cartesian P matrices. The concept of theCartesian P0 (P ) matrix is a natural extension of the well-known P0 (P ) matrix [13]. Althoughthe Cartesian P0 (P ) matrix can be defined not only for the SOC but also for the semidefinitecone [11] and the symmetric cone [23], we restrict ourselves to the case of the SOCs.

Definition 5.2.1. Suppose that the Cartesian structure of K ⊆ Rm is given as K := Km1 ×Km2 × · · · × Km`. Then, M ∈ Rm×m is called

(a) a Cartesian P0 matrix if, for every nonzero z = (z1, . . . , z`) ∈ Rm = Rm1 × · · · ×Rm`, thereexists an index i ∈ {1, . . . , `} such that (zi)>(Mz)i ≥ 0;

(b) a Cartesian P matrix if, for every nonzero z = (z1, . . . , z`) ∈ Rm = Rm1 × · · · × Rm`, thereexists an index i ∈ {1, . . . , `} such that (zi)>(Mz)i > 0.

Here, (Mz)i ∈ Rmi denotes the i-th subvector of Mz ∈ Rm conforming to the Cartesian structureof K.

Notice that the definition of the Cartesian P0 (P ) property depends on the Cartesian struc-ture of K. In what follows, we assume that the Cartesian structure of K is always given asK = Km1 ×Km2 × · · · ×Km` . The definition of the “classical” P0 (P ) matrix corresponds to thecase where K = Rm

+ . It is easily seen that every Cartesian P0 (P ) matrix is a P0 (P ) matrix [48].The following proposition implies that the Cartesian P0 (P ) property is preserved under a

nonsingular block-diagonal transformation.

Proposition 5.2.2. Let M ∈ Rm×m be any matrix, and Hi ∈ Rmi×mi(i = 1, 2, . . . , `) bearbitrary nonsingular matrices. Let the matrix M ′ ∈ Rm×m be defined by

M ′ :=H>MH, H :=

H1 0. . .

0 H`

.

Page 75: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

5.3 Reformulation of MPSOCC and B-stationary points of MPSOCC 75

Then, the following statements hold.

(a) If M is a Cartesian P0 matrix, then M ′ is a Cartesian P0 matrix.

(b) If M is a Cartesian P matrix, then M ′ is a Cartesian P matrix.

Proof. We first show (b). Let z = (z1, . . . , z`) ∈ Rm = Rm1 × · · · ×Rm` be an arbitrary nonzerovector. We show that there exists an i ∈ {1, 2, . . . , `} such that (zi)>(M ′z)i > 0. Note that

(M ′z)i =

H1 0. . .

0 H`

>

M

H1 0. . .

0 H`

z

i

=

H1 0. . .

0 H`

>

M

H1z

1

...H`z

`

i

=

H>

1

∑k=1

M1kHkzk

...

H>`

∑k=1

M`kHkzk

i

=

(H>

i

∑k=1

MikHkzk

).

where (·)i and (·)ik denote the i-th subvector and the (i, k)-th block entry, respectively, con-forming to the Cartesian structure of K. Hence, we have

(zi)>(M ′z)i = (zi)>H>i

∑k=1

MikHkzk =

(Hiz

i)>∑

k=1

MikHkzk = ((Hz)i)>(MHz)i.

SinceM is a Cartesian P matrix andHz 6= 0 from the nonsingularity ofH, we have (zi)>(M ′z)i =((Hz)i)>(MHz)i > 0 for some i. Hence, M ′ is a Cartesian P matrix.

We omit the proof of (a) since it can be shown in a similar manner to (b).

5.3 Reformulation of MPSOCC and B-stationary points of MP-

SOCC

In Chapter 2, we introduced the natural residual function Φ and observed that SOC comple-mentarity condition K 3 y ⊥ z ∈ K can be represented as Φ(y, z) = 0 equivalently. In thissection, we rewrite MPSOCC (5.1.1) as the following problem where the SOC complementar-ity constraint is replaced by the equivalent equality constraint involving the natural residualfunction Φ:

Minimizex,y,z

f(x, y)

subject to Ax ≤ b,

z = Nx+My + q,

Φ(y, z) = 0.

(5.3.1)

Page 76: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

76 Chapter 5 A smoothing SQP method for MPSOCC

We also call this problem MPSOCC. MPSOCC (5.3.1) is a nonsmooth optimization problemsince Φ is not differentiable everywhere. However, as is shown below, Φ is continuously differ-entiable at any (y, z) satisfying the following strict complementarity condition:

Definition 5.3.1 (Strict complementarity). Suppose that (y, z) ∈ Rm × Rm satisfies the SOCcomplementarity condition K 3 y ⊥ z ∈ K. Moreover, decompose y and z as y = (y1, y2, . . . , y`)and z = (z1, z2, . . . , z`) ∈ Rm1 × Rm2 × · · · × Rm` = Rm conforming to the Cartesian structureof K. Then, we say that strict complementarity holds at (y, z) if, for every i = 1, 2, . . . , `, oneof the following three conditions holds:

(i) yi ∈ intKmi , zi = 0;

(ii) yi = 0, zi ∈ intKmi ;

(iii) yi ∈ bdKmi \ {0}, zi ∈ bdKmi \ {0}, (yi)>zi = 0.

Lemma 5.3.2. Let y, z ∈ Rm be chosen so that y − z /∈ bd (K ∪ −K). Then, the functionΦ : Rm × Rm → Rm defined by (2.4.1) is continuously differentiable at (y, z), and the followingequality holds:

∇yΦ(y, z) + ∇zΦ(y, z) = Im,

where Im ∈ Rm×m denotes the identity matrix.

Proof. It suffices to consider the case where K = Km. Let λ1, λ2 ∈ R be the spectral values ofy− z defined as in Definition 2.4.1. Note that, from y− z /∈ bd (Km ∪−Km), we have λ1, λ2 6= 0.Then, from [26, Proposition 4.8], the Clarke subdifferential ∂PKm(y − z) is explicitly given as

∂PKm(y − z) =

Im (λ1 > 0, λ2 > 0),λ2

λ2 − λ1Im +W (λ1 < 0, λ2 > 0),

O (λ1 < 0, λ2 < 0),

where

W :=12

(−r1 r>2r2 −r1r2r>2

), (r1, r2) :=

(y1 − z1, y2 − z2)‖y2 − z2‖

.

Thus PKm is differentiable at y − z. This fact readily implies the continuous differentiability ofΦ at (y, z) since Φ(y, z) = y−PKm(y− z). We next show the second half of the proposition. Byan easy calculation, we have

∇yΦ(y, z) = Im −∇PKm(y − z).

Similarly, we have ∇zΦ(y, z) = ∇PKm(y − z). Hence we obtain the desired equality.

Proposition 5.3.3. Let (y, z) ∈ Rm × Rm satisfy the strict complementarity condition. Then,Φ is continuously differentiable at (y, z).

Proof. The strict complementarity condition readily yields y − z /∈ bd (K ∪ −K). Hence, Φ iscontinuously differentiable at (y, z) by Lemma 5.3.2.

Page 77: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

5.3 Reformulation of MPSOCC and B-stationary points of MPSOCC 77

Now, let X := {x ∈ Rn | Ax ≤ b}, and let w := (x, y, z) ∈ Rn ×Rm ×Rm be a feasible pointof MPSOCC (5.1.1) that satisfies the strict complementarity condition. By Proposition 5.3.3,MPSOCC (5.3.1) can be viewed as a smooth optimization problem in a small neighborhood ofw. Then, the Karush-Kuhn-Tucker (KKT) conditions on w are represented as ∇xf(x, y)

∇yf(x, y)0

+

NT

MT

−I

u+

0∇yΦ(y, z)∇zΦ(y, z)

v ∈ −NX(x) × {0}2m,

Φ(y, z) = 0, Ax ≤ b, z = Nx+My + q, (5.3.2)

where {0}2m := {0} × {0} × · · · × {0} ⊆ R2m, and u ∈ R`, v ∈ Rm and η ∈ Rm are Lagrangemultipliers.

We next consider the stationarity of MPSOCC (5.1.1) or (5.3.1). So far, several kinds ofstationary points have been studied in the literature of MPECs, e.g., see [59]. Among them, aBouligand- or B-stationary point is the most desirable, since it is directly related to the firstorder optimality condition. Specifically, a B-stationary point for MPSOCC (5.1.1) is defined asfollows:

Definition 5.3.4 (B-stationarity). Let F ⊆ Rn+2m denote the feasible set of MPSOCC (5.1.1).We say that w := (x, y, z) ∈ F is a B-stationary point of MPSOCC (5.1.1) if (−∇f(x, y), 0) ∈NF (w) holds.

In what follows, we show that a point satisfying KKT conditions (5.3.2) is a B-stationarypoint. For this purpose, we give three useful lemmas.

Lemma 5.3.5. [56, Proposition 6.41] Let C := C1 × C2 × · · · × Cs for nonempty closed setsCi ⊆ Rni. Choose ζ := (ζ1, ζ2, . . . , ζs) ∈ C1 ×C2 × · · · ×Cs. Then, we have NC(ζ) = NC1(ζ1)×NC2(ζ2) × · · · × NCs(ζs).

Lemma 5.3.6. [56, Chapter 6-C] For a continuously differentiable function F : Rp → Rq, letD := {ζ ∈ Rp | F (ζ) = 0}. Choose ζ ∈ D arbitrarily. If ∇F (ζ) has full column rank, then wehave

ND(ζ) = ∇F (ζ)Rq:= {ζ ∈ Rp | ζ = ∇F (ζ)v, v ∈ Rq}.

Lemma 5.3.7. [56, Theorem 6.14] For a continuously differentiable function F : Rp → Rq anda closed set C ⊆ Rp, let D := {ζ ∈ C | F (ζ) = 0}. Choose ζ ∈ D arbitrarily. Then we have

ND(ζ) ⊇ ∇F (ζ)Rq + NC(ζ).

Now, we show that the KKT conditions (5.3.2) are sufficient conditions for w = (x, y, z) tobe B-stationary.

Proposition 5.3.8. Let w := (x, y, z) be a feasible point of MPSOCC (5.3.1). Suppose that thestrict complementarity condition holds at (y, z). If w satisfies the KKT conditions (5.3.2), thenw is a B-stationary point of MPSOCC (5.1.1).

Page 78: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

78 Chapter 5 A smoothing SQP method for MPSOCC

Proof. Let Y := {(y, z) ∈ Rm×Rm | Φ(y, z) = 0}. We first note that the strict complementarityat (y, z) implies the continuous differentiability of Φ at (y, z) from Lemma 5.3.2. Choose v ∈ Rm

such that ∇Φ(y, z)v = 0. From Lemma 5.3.2, we then have ∇yΦ(y, z)v = 0 and ∇zΦ(y, z)v =(I − ∇yΦ(y, z))v = 0, which readily imply v = 0, and thus ∇Φ(y, z) has full column rank.Therefore, by Lemma5.3.6 with p = q := m, D := Y and F := Φ, we have

NY (y, z) = ∇Φ(y, z)Rm. (5.3.3)

Then, it holds that

NX×Y (w) = NX(x) ×NY (y, z) = NX(x) ×∇Φ(y, z)Rm, (5.3.4)

where the first equality follows from Lemma 5.3.5 and the second equality follows from (5.3.3).Now, let F ⊆ Rn+2m denote the feasible set of MPSOCC (5.3.1), i.e., F = {(x, y, z) ∈ X × Y |Nx+My − z + q = 0}. Then, from Lemma 5.3.7, we have

NF (w) ⊇(N>,M>,−I

)>Rm + NX×Y (w). (5.3.5)

Combining (5.3.4) with (5.3.5), we obtain

NF (w) ⊇(N>,M>,−I

)>Rm + NX(x) ×∇Φ(y, z)Rm,

which together with the KKT conditions (5.3.2) implies (−∇f(x, y), 0) ∈ NF (w). Thus, w is aB-stationary point.

5.4 Smoothing function of natural residual

The natural residual function Φ given in Definition 2.4.2 is not differentiable everywhere, andtherefore, we cannot employ a derivative-based algorithm such as Newton’s method to solveMPSOCC (5.3.1). To overcome such a difficulty, we will utilize a smoothing technique.

Definition 5.4.1. Let Ψ : Rm → Rm be a nondifferentiable function. Then, the functionΨµ : Rm → Rm parametrized by µ > 0 is called a smoothing function of Ψ if it satisfies thefollowing properties: For any µ > 0, Ψµ is differentiable on Rm; for any z ∈ Rm, it holds thatlimµ→0+ Ψµ(z) = Ψ(z).

A smoothing function of the natural residual function can be constructed by means of theChen-Mangasarian (CM) function g : R → R [18].

Definition 5.4.2. A differentiable convex function g : R → R+ is called a CM function if

limα→−∞

g(α) = 0, limα→∞

(g(α) − α) = 0, 0 < g′(α) < 1 (α ∈ R). (5.4.1)

Notice that, if function pµ : R → R is defined by pµ(α) := µg(α/µ) with a CM function g anda positive parameter µ, then it becomes a smoothing function for p(α) := max{0, α}. Thanksto this fact, we can next provide a smoothing function Pµ for the projection operator PK.

Page 79: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

5.4 Smoothing function of natural residual 79

Definition 5.4.3. Let z ∈ Rm be an arbitrary vector decomposed as z = (z1, z2, · · · , z`) ∈Rm1 ×Rm2 ×· · ·×Rm` = Rm conforming to the given Cartesian structure of K. For an arbitraryCM function g : R → R, let g : Rm → Rm be defined as

g(z) :=

g1(z1)

...g`(z`)

, (5.4.2)

gi(z) := g(λi1)ci1 + g(λi2)ci2,

where λij ∈ R and cij ∈ Rmi ((i, j) ∈ {1, 2, · · · , `} × {1, 2}) are the spectral values and thespectral vectors of subvectors zi with respect to Kmi, respectively. Then, the smoothing functionPµ: Rm → Rm of PK is given as

Pµ(z) := µg(z/µ).

Now, by using the above smoothing function Pµ, we can define the smoothing function forthe natural residual Φ.

Definition 5.4.4. Let µ > 0 be arbitrary. Let g : Rm → Rm, gi : Rmi → Rmi (i = 1, 2, . . . , `),and Pµ : Rm → Rm be defined as in Definition 5.4.3. Then, the smoothing function Φµ : Rm →Rm for the natural residual Φ is given as

Φµ(y, z) := y − Pµ(y − z)

= y − µg

(y − z

µ

)

=

y1 − µg1

(y1 − z1

µ

)...

y` − µg`

(y` − z`

µ

) . (5.4.3)

Before closing this subsection, we provide the following propositions which will be used inthe subsequent analyses.

Proposition 5.4.5. Let Pµ : Rm → Rm be defined as in Definition 5.4.3, and choose z /∈bd (K ∪ −K) arbitrarily. Let {zk} ⊆ Rm and {µk} ⊆ R++ be arbitrary sequences such thatzk → z and µk → 0 as k → ∞. Then, we have

∇PK(z) = limk→∞

∇Pµk(zk). (5.4.4)

Proof. For simplicity, we consider the case where K = Km. Let z and zk be decomposedas z = λ1c

1 + λ2c2 and zk = λk

1c1k + λk

2c2k, where λi, λk

i ∈ R are spectral values, and ci,cik ∈ Rm (i = 1, 2) are spectral vectors of z and zk, respectively. Since z /∈ bd (Km ∪−Km), PKm

is differentiable at z and ∇PKm(z) is given as

∇PKm(z) =

Im (λ1 > 0, λ2 > 0),λ2

λ2 − λ1

Im +W (λ1 < 0, λ2 > 0),

O (λ1 < 0, λ2 < 0),

Page 80: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

80 Chapter 5 A smoothing SQP method for MPSOCC

where

W :=12

(−r1 r>2r2 −r1r2r>2

), (r1, r2) :=

(z1, z2)‖z2‖

, z := (z1, z2) ∈ R × Rm−1.

On the other hand, by [18, Proposition 5.2], ∇Pµk(zk) is written as

∇Pµk(zk) =

g′(zk1/µk)Im (zk

2 = 0),bµk

cµk(zk

2 )>

‖zk2‖

cµkzk2

‖zk2‖

aµkIm−1 + (bµk

− aµk)zk2z

k2>

‖zk2‖2

(zk2 6= 0),

where

aµk=g(λk

2/µk) − g(λk1/µk)

λk2/µk − λk

1/µk, bµk

=12

(g′(λk

2

µk

)+ g′

(λk

1

µk

)),

cµk=

12

(g′(λk

2

µk

)− g′

(λk

1

µk

)), zk := (zk

1 , zk2 ) ∈ R × Rm−1,

and g is defined as in Definition 5.4.3. Note that, from the definition of g, we have g(α)−α→ 0,g(−α) → 0, g′(α) → 1 and g′(−α) → 0 as α→ ∞. Then, it follows that

limk→∞

aµk=

1 (0 < λ1 ≤ λ2)

λ2/(λ2 − λ1) (λ1 < 0 < λ2)

0 (λ1 ≤ λ2 < 0),

limk→∞

bµk=

1 (0 < λ1 ≤ λ2)

1/2 (λ1 < 0 < λ2)

0 (λ1 ≤ λ2 < 0),

and

limk→∞

cµk=

0 (0 < λ1 ≤ λ2)

1/2 (λ1 < 0 < λ2)

0 (λ1 ≤ λ2 < 0).

From this fact, it is not difficult to observe that ∇PKm(z) = limk→∞∇Pµk(zk).

Proposition 5.4.6. [18, Proposition 5.1] Let Φ : Rm×Rm → Rm and Φµ : Rm×Rm → Rm bedefined by (2.4.1) and (5.4.3), respectively. Let ρ := g(0). Then, for any y, z ∈ Rn and µ > ν > 0,we have

ρ(µ− ν)e �K Φν(y, z) − Φµ(y, z) �K 0,

ρµe �K Φ(y, z) − Φµ(y, z) �K 0,

where e := (e1, e2, . . . , e`) ∈ Rm1 × Rm2 × · · · × Rm` with ei := (1, 0, 0, . . . , 0)> ∈ Rmi fori = 1, 2, . . . , `.

Page 81: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

5.4 Smoothing function of natural residual 81

Proposition 5.4.7. [18, Corollary 5.3 and Proposition 6.1] Let Φ : Rm ×Rm → Rm and Φµ :Rm × Rm → Rm be defined by (2.4.1) and (5.4.3), respectively. Let g : Rm → Rm be defined by(5.4.2). Then, the following statements hold.

(a) Function g is continuously differentiable and ∇g(z) = diag(∇g1(z1), . . . ,∇g`(z`))) ∈ Rm×m

is symmetric for any z ∈ Rm, where the latter matrix denotes the block-diagonal matrix withblock-diagonal elements ∇gi(zi), i = 1, 2, . . . , `.

(b) For any y, z ∈ Rm, we have

∇yΦµ(y, z) = Im −∇g(y − z

µ

), ∇zΦµ(y, z) = ∇g

(y − z

µ

),

where Im ∈ Rm×m denotes the identity matrix.

(c) For any y, z ∈ Rm, we have

O ≺ ∇yΦµ(y, z) ≺ Im, O ≺ ∇zΦµ(y, z) ≺ Im, O ≺ ∇g(y − z

µ

)≺ Im.

Proposition 5.4.8. Let Φ : Rm × Rm → Rm and Φµ : Rm × Rm → Rm be defined by(2.4.1) and (5.4.3), respectively. Let ρ := g(0). Then, ‖Φµ(y, z) − Φ(y, z)‖ ≤

√2ρµ for any

(µ, y, z) ∈ R++ × Rm × Rm.

Proof. For simplicity, we only consider the case where K = Km. Let Φ(y, z) − Φµ(y, z) =λ1c

1 + λ2c2, where λi ∈ R and ci ∈ Rm (i = 1, 2) are the spectral values and spectral vectors of

Φ(y, z)−Φµ(y, z). Since e = c1+c2 and ρµe �Km Φ(y, z)−Φµ(y, z) �Km 0 from Proposition 5.4.6,we have ρµ(c1 + c2) �Km λ1c

1 +λ2c2 �Km 0, which implies 0 < λ1 ≤ λ2 ≤ ρµ. Hence, we obtain

‖Φ(y, z) − Φµ(y, z)‖ = ‖λ1c1 + λ2c

2‖ ≤ λ1‖c1‖ + λ2‖c2‖ ≤√

2ρµ,

where the first inequality is due to the triangle inequality and 0 < λ1 ≤ λ2, and the lastinequality follows from ‖c1‖ = ‖c2‖ = 1/

√2 and λ1 ≤ λ2 ≤ ρµ. This completes the proof.

Proposition 5.4.9. Let Φµ : Rm × Rm → Rm be defined by (5.4.3). Then, for any µ > ν > 0and (y, z) ∈ Rm × Rm, it holds that

‖Φν(y, z)‖1 − ‖Φµ(y, z)‖1 ≤ mρ(µ− ν),

where ρ = g(0).

Proof. We first assume K = Km. From Proposition 5.4.6, we have

ρ(µ− ν)e− (Φν(y, z) − Φµ(y, z))∈ Km, (5.4.5)

Φν(y, z) − Φµ(y, z)∈ Km, (5.4.6)

where e = (1, 0, . . . , 0)> ∈ Rm. Moreover, for any w = (w1, w2, . . . , wm)> ∈ Km, we have

w1 ≥ |wi| (i = 1, . . . ,m), (5.4.7)

Page 82: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

82 Chapter 5 A smoothing SQP method for MPSOCC

since w1 ≥√w2

2 + · · · + w2m. Therefore, for each i = 1, 2, . . . ,m, we have

ρ(µ− ν) ≥(Φν(y, z) − Φµ(y, z)

)1

≥∣∣(Φν(y, z) − Φµ(y, z)

)i

∣∣≥ |Φν(y, z)i| − |Φµ(y, z)i|, (5.4.8)

where the first inequality holds from (5.4.3) and (5.4.5), the second equality holds from (5.4.6)and (5.4.7), and the last equality holds from the triangle inequality. Summing up (5.4.8) for alli, we obtain the desired conclusion. When K = Km1 × · · · × Km` , we can prove it in a similarway.

5.5 Algorithm

In this section, we propose an SQP type algorithm for MPSOCC (5.1.1). The SQP method solvesa quadratic programming (QP) problem in each iteration to determine the search direction.This method is known as one of the most efficient methods for solving nonlinear programmingproblems. In the remainder of the chapter, to apply the SQP method, we mainly consider MP-SOCC (5.3.1) equivalent to MPSOCC (5.1.1). We should notice, however, that the SQP methodcannot be applied directly to MPSOCC (5.3.1), since Φ(y, z) is not differentiable everywhere. Wethus consider the following problem where the smooth equality constraint Φµ(y, z) = 0 replacesΦ(y, z) = 0 in each iteration

Minimizex,y,z

f(x, y)

subject to Ax ≤ b,

z = Nx+My + q,

Φµ(y, z) = 0.

(5.5.1)

Given a current iterate (xk, yk, zk) satisfying Axk ≤ bk and zk = Nxk+Myk+q, we then generatethe search direction (dxk, dyk, dzk) by solving the following QP subproblem, which consists ofquadratic and linear approximations of the objective and constraint functions of problem (5.5.1)with µ = µk, respectively, at (xk, yk, zk):

Minimizedx,dy,dz

∇f(xk, yk)>(dx

dy

)+

12

dx

dy

dz

>

Bk

dx

dy

dz

subject to Adx ≤ b−Axk,(

N M −Im0 ∇yΦµk

(yk, zk)> ∇zΦµk(yk, zk)>

) dx

dy

dz

= −

(0

Φµk(yk, zk)

),

(5.5.2)

where Bk∈ R(n+2m)×(n+2m) is a positive definite symmetric matrix. In the numerical experimentsin Section 5.7, Bk will be updated by using the modified Broyden-Fletcher-Goldfarb-Shanno(BFGS) formula. Note that the Karush-Kuhn-Tucker (KKT) conditions of QP (5.5.2) can be

Page 83: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

5.5 Algorithm 83

written as ∇xf(xk, yk)∇yf(xk, yk)

0

+Bk

dx

dy

dz

+

N>

M>

−Im

u+

0∇yΦµk

(yk, zk)∇zΦµk

(yk, zk)

v +

A>

00

η = 0,

(N M −Im0 ∇yΦµk

(yk, zk)> ∇zΦµk(yk, zk)>

) dx

dy

dz

= −

(0

Φµk(yk, zk)

),

0 ≤ (b−Axk −Adx) ⊥ η ≥ 0,

(5.5.3)

where (η, u, v) ∈ Rp × Rm × Rm denotes the Lagrange multipliers.For simplicity of notation, we denote

w := (x, y, z) ∈ Rn × Rm × Rm, dw := (dx, dy, dz) ∈ Rn × Rm × Rm.

Also, we define the `1 penalty function by

θµ,α(w) := f(x, y) + α‖Φµ(y, z)‖1, (5.5.4)

where α > 0 is the penalty parameter. Note that this function has the directional derivativeθ′µ,α(w; dw) for any w and dw.

Algorithm 5.1

Step 0: Choose parameters δ ∈ (0,∞), β ∈ (0, 1), ρ ∈ (0, 1), σ ∈ (0, 1), µ0 ∈ (0,∞), α−1 ∈(0,∞) and a symmetric positive definite matrix B0 ∈ R(n+2m)×(n+2m). Choose w0 =(x0, y0, z0) ∈ Rn × Rm × Rm such that Nx0 +My0 + q = z0 and Ax0 ≤ b. Set k := 0.

Step 1: Solve QP subproblem (5.5.2) to obtain the optimum dwk = (dxk, dyk, dzk) and theLagrange multipliers (ηk, uk, vk).

Step 2: If dwk = 0, then let wk+1 := wk, αk := αk−1 and go to Step 3. Otherwise, update thepenalty parameter by

αk :=

{αk−1 if αk−1 ≥ ‖vk‖∞ + δ,

max{‖vk‖∞ + δ, αk−1 + 2δ} otherwise.(5.5.5)

Then, set the step size τk := ρL, where L is the smallest nonnegative integer satisfying theArmijo condition

θµk,αk(wk + ρLdwk) ≤ θµk,αk

(wk) + σρLθ′µk,αk(wk; dwk). (5.5.6)

Let wk+1 := wk + τkdwk, and go to Step 3.

Step 3: Terminate if a certain criterion is satisfied. Otherwise, let µk+1 := βµk and update Bk

to determine a symmetric positive definite matrix Bk+1. Return to Step 1 with k replacedby k + 1.

Page 84: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

84 Chapter 5 A smoothing SQP method for MPSOCC

In the remainder of this section, we establish the well-definedness of Algorithm 5.1. We firstshow the feasibility of QP subproblem (5.5.2). In general, a QP subproblem generated by theSQP method may not be feasible, even if the original nonlinear programming problem is feasible.However, in the present case, we can show that QP subproblem (5.5.2) is always feasible underthe Cartesian P0 property of the matrix M . To this end, the following lemma will be useful.

Lemma 5.5.1. Let M ∈ Rm×m be a Cartesian P0 matrix. Let Hi ∈ Rmi×mi (i = 1, 2, . . . , `) bepositive definite matrices with m =

∑`i=1mi, and H ∈ Rm×m be a block diagonal matrix with

block diagonal elements Hi (i = 1, . . . , `). Then, H +M is nonsingular.

Proof. The matrix H+M can easily be shown to be a Cartesian P matrix, which is nonsingular.

The next proposition shows the feasibility and solvability of QP subproblem (5.5.2). In theproof, the matrix

Dk :=

(M −Im

∇yΦµk(yk, zk)> ∇zΦµk

(yk, zk)>

)(5.5.7)

plays an important role.

Proposition 5.5.2 (Feasibility of QP subproblem). Let M be a Cartesian P0 matrix, and {wk}be a sequence generated by Algorithm 5.1. Then, (i) Axk ≤ b and zk = Nxk +Myk + q hold forall k, and (ii) QP subproblem (5.5.2) is feasible and hence has a unique solution for all k.

Proof. Since (i) can be shown easily, we only show (ii). Since the objective function of QP (5.5.2)is strongly convex, it suffices to show the feasibility. We first show that the matrix Dk definedby (5.5.7) is nonsingular. Note that, by Proposition 5.4.7(c), ∇zΦµk

(yk, zk) is nonsigular. LetDk be the Schur complement of the matrix ∇zΦµk

(yk, zk)> with respect to Dk, that is,

Dk := M +(∇zΦµk

(yk, zk)>)−1

∇yΦµ(yk, zk)>

= M +

(∇g(yk − zk

µk

)>)−1(Im −∇g

(yk − zk

µk

)>)

= M + diag

(∇gi

(yi,k − zi,k

µk

)>)−1

− Imi

`

i=1

,

where yi,k and zi,k denote the i-th subvectors of yk and zk, respectively, conforming to theCartesian structure of K, and each equality follows from Proposition 5.4.7. SinceM is a CartesianP0 matrix and (∇gi((yi,k − zi,k)/µk)>)−1 − Imi ∈ Rmi×mi is positive definite from Proposition5.4.7 (c), Lemma 5.5.1 ensures that Dk is nonsingular, and hence Dk is nonsingular. It thenfollows from (i) that

dx = 0,

(dy

dz

)= −D−1

k

(0

Φµk(yk, zk)

)

comprise a feasible solution to (5.5.2). This completes the proof.

Page 85: STUDIES ON ALGORITHMS FOR SOLVING GENERALIZED SECOND … › papers › doctor › 2013_doctor_okun… · ALGORITHMS FOR SOLVING GENERALIZED SECOND-ORDER CONE PROGRAMMING PROBLEMS

5.5 Algorithm 85

The following proposition shows that the search direction dwk produced in Step 1 of Algo-rithm 5.1 is a descent direction of the penalty function θµk,αk

defined by (5.5.4). It guaranteesthe well-definedness of the line search in Step 2 in the sense that there exists a finite L satisfyingthe Armijo condition (5.5.6).

Proposition 5.5.3 (Descent direction). Let {wk} and {dwk} be sequences generated by Algo-rithm 5.1. Then, we have

(a) θ′µk,αk(wk; dwk) = ∇xf(xk, yk)>dxk + ∇yf(xk, yk)>dyk − αk‖Φµk

(yk, zk)‖1,

(b) θ′µk,αk(wk; dwk) ≤ −(dwk)>Bkdw

k

for each k. Moreover, if Φµk(yk, zk) 6= 0, then the inequality in (b) holds strictly.

Proof. We first show (a). Let J^k_+, J^k_0, and J^k_− ⊆ {1, 2, . . . , m} be the index sets defined by

    J^k_+ := { j | Φ_{μ_k}(y^k, z^k)_j > 0 },
    J^k_0 := { j | Φ_{μ_k}(y^k, z^k)_j = 0 },
    J^k_− := { j | Φ_{μ_k}(y^k, z^k)_j < 0 },

where Φ_{μ_k}(y^k, z^k)_j ∈ R denotes the j-th component of Φ_{μ_k}(y^k, z^k) ∈ R^m. Then, we have

    θ'_{μ_k,α_k}(w^k; dw^k) = ∇f(x^k, y^k)^⊤ (dx^k; dy^k)
        + α_k Σ_{j∈J^k_+} [∇Φ_{μ_k}(y^k, z^k)]_j^⊤ (dy^k; dz^k)
        + α_k Σ_{j∈J^k_0} | [∇Φ_{μ_k}(y^k, z^k)]_j^⊤ (dy^k; dz^k) |
        − α_k Σ_{j∈J^k_−} [∇Φ_{μ_k}(y^k, z^k)]_j^⊤ (dy^k; dz^k),    (5.5.8)

where [∇Φ_{μ_k}(y^k, z^k)]_j denotes the j-th column vector of ∇Φ_{μ_k}(y^k, z^k). Since

    [∇Φ_{μ_k}(y^k, z^k)]_j^⊤ (dy^k; dz^k) = −Φ_{μ_k}(y^k, z^k)_j

from the constraints of QP subproblem (5.5.2), we have

    θ'_{μ_k,α_k}(w^k; dw^k) = ∇_x f(x^k, y^k)^⊤ dx^k + ∇_y f(x^k, y^k)^⊤ dy^k − α_k ‖Φ_{μ_k}(y^k, z^k)‖_1.    (5.5.9)

We next show (b). Taking the inner product of dw^k = (dx^k, dy^k, dz^k) with both sides of the first equality in the KKT conditions (5.5.3) with dw = dw^k, η = η^k, u = u^k, v = v^k for the subproblem (5.5.2), we obtain

    ∇f(x^k, y^k)^⊤ (dx^k; dy^k) + (dw^k)^⊤ B_k dw^k + (u^k)^⊤ (N dx^k + M dy^k − dz^k)
        + (v^k)^⊤ ∇Φ_{μ_k}(y^k, z^k)^⊤ (dy^k; dz^k) + (η^k)^⊤ A dx^k = 0.    (5.5.10)


Moreover, from the constraints of the subproblem (5.5.2) and the KKT conditions (5.5.3), we have

    N dx^k + M dy^k − dz^k = 0,    (5.5.11)
    ∇Φ_{μ_k}(y^k, z^k)^⊤ (dy^k; dz^k) = −Φ_{μ_k}(y^k, z^k),    (5.5.12)

and

    0 = (η^k)^⊤ (b − Ax^k − A dx^k) = −(η^k)^⊤ A dx^k + (η^k)^⊤ (b − Ax^k) ≥ −(η^k)^⊤ A dx^k,    (5.5.13)

where the inequality is due to η^k ≥ 0 and b − Ax^k ≥ 0 from (5.5.3). Substituting (5.5.11)–(5.5.13) into (5.5.10), we have

    ∇f(x^k, y^k)^⊤ (dx^k; dy^k) + (dw^k)^⊤ B_k dw^k − (v^k)^⊤ Φ_{μ_k}(y^k, z^k) ≤ 0.

Furthermore, from (5.5.9), we obtain

    θ'_{μ_k,α_k}(w^k; dw^k) ≤ −(dw^k)^⊤ B_k dw^k + (v^k)^⊤ Φ_{μ_k}(y^k, z^k) − α_k ‖Φ_{μ_k}(y^k, z^k)‖_1
        = −(dw^k)^⊤ B_k dw^k + Σ_{j∈J^k_+} (v^k_j − α_k) [Φ_{μ_k}(y^k, z^k)]_j + Σ_{j∈J^k_−} (v^k_j + α_k) [Φ_{μ_k}(y^k, z^k)]_j
        ≤ −(dw^k)^⊤ B_k dw^k,

where the last inequality follows from α_k > ‖v^k‖_∞ and the definitions of J^k_+ and J^k_−. Moreover, if Φ_{μ_k}(y^k, z^k) ≠ 0, then the last inequality holds strictly since J^k_+ ∪ J^k_− ≠ ∅. This completes the proof of (b).

5.6 Convergence analysis

In this section, we study the convergence properties of the proposed algorithm. To begin with, we make the following assumption.

Assumption 5.6.1. Let the sequences {w^k} and {B_k} be produced by Algorithm 5.1.

(a) {w^k} is bounded.

(b) There exist constants γ_1, γ_2 > 0 such that γ_1‖d‖² ≤ d^⊤ B_k d ≤ γ_2‖d‖² for all d ∈ R^{n+2m} and all k.

(c) There exists a constant c > 0 such that ‖D_k^{-1}‖ ≤ c for all k, where D_k is the matrix defined by (5.5.7).

Assumption 5.6.1 (b) means that {B_k} is bounded and uniformly positive definite. Assumption 5.6.1 (c) holds if and only if any accumulation point of {D_k} is nonsingular. The next proposition provides a sufficient condition under which Assumption 5.6.1 (c) holds.
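In computation, Assumption 5.6.1 (c) can be monitored directly along the iteration; a minimal sketch, assuming Dk has been assembled as in (5.5.7) at the current iterate:

    % ||Dk^{-1}|| in the spectral norm equals 1/sigma_min(Dk); a value of ck
    % that stays bounded over k supports Assumption 5.6.1 (c).
    ck = 1 / min(svd(Dk));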


Proposition 5.6.2. Suppose that M is a Cartesian P matrix and Assumption 5.6.1 (a) holds. Then, Assumption 5.6.1 (c) holds.

Proof. Let (E_y, E_z) be an arbitrary accumulation point of {(∇_yΦ_{μ_k}(y^k, z^k), ∇_zΦ_{μ_k}(y^k, z^k))}. Then, it suffices to show that the matrix

    D_∞ := [ M     −I_m
             E_y    E_z ]

is nonsingular. By Proposition 5.4.7, E_y and E_z satisfy the following three properties:

(a) E_y + E_z = I_m;

(b) 0 ⪯ E_y ⪯ I_m and 0 ⪯ E_z ⪯ I_m;

(c) E_y and E_z are symmetric and have the block-diagonal structure conforming to the Cartesian structure of K = K^{m_1} × · · · × K^{m_ℓ}.

From (a) and (b), there exists an orthogonal matrix H ∈ R^{m×m} such that

    H E_y H^⊤ = diag(α_i)_{i=1}^m,   H E_z H^⊤ = diag(1 − α_i)_{i=1}^m,   0 ≤ α_i ≤ 1 (i = 1, 2, . . . , m),    (5.6.1)

where α_i (i = 1, 2, . . . , m) are the eigenvalues of E_y. Moreover, from (c), H has the same block-diagonal structure as E_y and E_z. Hence, M̄ := H M H^⊤ is a Cartesian P matrix by Proposition 5.2.2.

Now, let D̄_∞ ∈ R^{2m×2m} be defined as

    D̄_∞ := diag(H, H) D_∞ diag(H^⊤, H^⊤) = [ M̄                    −I_m
                                              diag(α_i)_{i=1}^m     diag(1 − α_i)_{i=1}^m ],    (5.6.2)

and let (ζ, η) ∈ R^m × R^m be an arbitrary vector such that D̄_∞ (ζ; η) = 0. Then, we have

    M̄ ζ = η,    (5.6.3)
    α_i ζ_i + (1 − α_i) η_i = 0   (i = 1, 2, . . . , m).    (5.6.4)

If α_i = 0, then we have (M̄ζ)_i = η_i = 0 from (5.6.3) and (5.6.4). If α_i = 1, then we have ζ_i = 0 from (5.6.4). If 0 < α_i < 1, then we have ζ_i (M̄ζ)_i = ζ_i η_i = −α_i (1 − α_i)^{-1} ζ_i² ≤ 0. Thus, for all i, we have ζ_i (M̄ζ)_i ≤ 0. Since M̄ is a Cartesian P matrix and every Cartesian P matrix is a P matrix, we must have ζ = η = 0. Hence, D̄_∞ is nonsingular. From (5.6.2) and the nonsingularity of H, the matrix D_∞ is also nonsingular.
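The structure of this proof can be checked numerically on small instances; a sketch, using a positive definite M (which is a Cartesian P matrix for any Cartesian structure) and random eigenvalues α_i ∈ [0, 1] as in (5.6.1):

    % Random check of the nonsingularity argument for D_infty (sketch).
    m = 6;
    A1 = randn(m);  M = A1*A1' + 0.1*eye(m);      % positive definite => Cartesian P
    alpha = rand(m,1);                            % eigenvalues in [0,1]
    Dinf = [M, -eye(m); diag(alpha), diag(1-alpha)];
    fprintf('smallest singular value of D_infty: %g\n', min(svd(Dinf)));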

The following three lemmas play crucial roles in establishing the convergence theorem for the algorithm.

Lemma 5.6.3. Let {w^k} be a sequence generated by Algorithm 5.1, and let dw^k := (dx^k, dy^k, dz^k) ∈ R^n × R^m × R^m be the unique optimum of QP subproblem (5.5.2) for each k. Let {(μ_k, τ_k)} ⊆ R_{++} × R_{++} be a sequence converging to (0, 0), and let ᾱ > 0 be a fixed scalar. Suppose that {w^k} satisfies Assumption 5.6.1 (a). In addition, assume that {dw^k} has an accumulation point, and let w̄ := (x̄, ȳ, z̄) and dw̄ := (dx̄, dȳ, dz̄) be arbitrary accumulation points of {w^k} and {dw^k}, respectively. Then we have

    limsup_{k→∞} ( [θ_{μ_k,ᾱ}(w^k + τ_k dw^k) − θ_{μ_k,ᾱ}(w^k)] / τ_k − θ'_{μ_k,ᾱ}(w^k; dw^k) ) ≤ 0,

provided ȳ − z̄ ∉ bd(K ∪ −K) and dw̄ ≠ 0.

Proof. Taking subsequences if necessary, we may suppose w^k → w̄ and dw^k → dw̄. Moreover, we have θ'_{μ_k,ᾱ}(w^k; dw^k) = ∇_x f(x^k, y^k)^⊤ dx^k + ∇_y f(x^k, y^k)^⊤ dy^k − ᾱ‖Φ_{μ_k}(y^k, z^k)‖_1 as shown in (5.5.9). Thus, we only have to show

    limsup_{k→∞} [θ_{μ_k,ᾱ}(w^k + τ_k dw^k) − θ_{μ_k,ᾱ}(w^k)] / τ_k ≤ ∇f(x̄, ȳ)^⊤ (dx̄; dȳ) − ᾱ‖Φ(ȳ, z̄)‖_1.

From the mean-value theorem and the continuity of ∇f, we have

    lim_{k→∞} [f(x^k + τ_k dx^k, y^k + τ_k dy^k) − f(x^k, y^k)] / τ_k = ∇f(x̄, ȳ)^⊤ (dx̄; dȳ).

Therefore, it suffices to show that

    limsup_{k→∞} [ |Φ_{μ_k}(y^k + τ_k dy^k, z^k + τ_k dz^k)_j| − |Φ_{μ_k}(y^k, z^k)_j| ] / τ_k ≤ −|Φ(ȳ, z̄)_j|    (5.6.5)

for each j. By Definition 5.4.4 and Proposition 5.4.7, we have ∇_yΦ_{μ_k}(y^k, z^k) = I_m − ∇P_{μ_k}(y^k − z^k) and ∇_zΦ_{μ_k}(y^k, z^k) = ∇P_{μ_k}(y^k − z^k), which together with the constraints of QP subproblem (5.5.2) yield

    −Φ_{μ_k}(y^k, z^k) = ∇_yΦ_{μ_k}(y^k, z^k)^⊤ dy^k + ∇_zΦ_{μ_k}(y^k, z^k)^⊤ dz^k
        = (I_m − ∇P_{μ_k}(y^k − z^k)^⊤) dy^k + ∇P_{μ_k}(y^k − z^k)^⊤ dz^k
        = dy^k − ∇P_{μ_k}(y^k − z^k)^⊤ (dy^k − dz^k).    (5.6.6)

Hence, we have

    Φ_{μ_k}(y^k + τ_k dy^k, z^k + τ_k dz^k) − Φ_{μ_k}(y^k, z^k)
        = τ_k dy^k − ( P_{μ_k}(y^k + τ_k dy^k − (z^k + τ_k dz^k)) − P_{μ_k}(y^k − z^k) )
        = τ_k ( −Φ_{μ_k}(y^k, z^k) + ∇P_{μ_k}(y^k − z^k)^⊤ (dy^k − dz^k) )
          − P_{μ_k}(y^k − z^k + τ_k (dy^k − dz^k)) + P_{μ_k}(y^k − z^k)
        = −τ_k Φ_{μ_k}(y^k, z^k) + τ_k δ^k,    (5.6.7)

where the first equality is due to (5.4.3), the second equality follows from (5.6.6), and δ^k := (δ^k_1, δ^k_2, . . . , δ^k_m) ∈ R^m is given by

    δ^k_j := ∇P_{μ_k}(y^k − z^k)_j^⊤ (dy^k − dz^k) − τ_k^{-1} ( P_{μ_k}(y^k − z^k + τ_k (dy^k − dz^k))_j − P_{μ_k}(y^k − z^k)_j ).    (5.6.8)


To show (5.6.5), we consider three cases: (i) Φ(ȳ, z̄)_j = 0, (ii) Φ(ȳ, z̄)_j > 0, and (iii) Φ(ȳ, z̄)_j < 0. In case (i), we first notice that

    [ |Φ_{μ_k}(y^k + τ_k dy^k, z^k + τ_k dz^k)_j| − |Φ_{μ_k}(y^k, z^k)_j| ] / τ_k
        ≤ |Φ_{μ_k}(y^k + τ_k dy^k, z^k + τ_k dz^k)_j − Φ_{μ_k}(y^k, z^k)_j| / τ_k
        = |Φ_{μ_k}(y^k, z^k)_j − δ^k_j|,    (5.6.9)

where the equality follows from (5.6.7). By applying the mean-value theorem in (5.6.8), we can find ζ^k_j ∈ (0, 1) for each k such that

    δ^k_j = ( ∇P_{μ_k}(y^k − z^k) − ∇P_{μ_k}(y^k − z^k + ζ^k_j τ_k (dy^k − dz^k)) )_j^⊤ (dy^k − dz^k).    (5.6.10)

Since Proposition 5.4.5 and the boundedness of {dw^k} imply

    lim_{k→∞} ( ∇P_{μ_k}(y^k − z^k) − ∇P_{μ_k}(y^k − z^k + ζ^k_j τ_k (dy^k − dz^k)) ) = 0,    (5.6.11)

we obtain lim_{k→∞} δ^k_j = 0. Moreover, by Proposition 5.4.8, we have lim_{k→∞} Φ_{μ_k}(y^k, z^k)_j = Φ(ȳ, z̄)_j = 0. Then, by letting k → ∞ in (5.6.9), we obtain (5.6.5). In cases (ii) and (iii), for all sufficiently large k the sign of Φ_{μ_k}(·)_j is fixed, so (5.6.7) gives

    [ |Φ_{μ_k}(y^k + τ_k dy^k, z^k + τ_k dz^k)_j| − |Φ_{μ_k}(y^k, z^k)_j| ] / τ_k = −Φ_{μ_k}(y^k, z^k)_j + δ^k_j

and

    [ |Φ_{μ_k}(y^k + τ_k dy^k, z^k + τ_k dz^k)_j| − |Φ_{μ_k}(y^k, z^k)_j| ] / τ_k = Φ_{μ_k}(y^k, z^k)_j − δ^k_j,

respectively. Then, a similar argument to that in case (i) leads to the desired inequality (5.6.5).

Lemma 5.6.4. Let {w^k} be a sequence generated by Algorithm 5.1. Suppose that Assumption 5.6.1 holds. Then, we have the following statements.

(i) {dw^k} and {(u^k, v^k)} are bounded.

(ii) There exists k_0 such that α_k = α_{k_0} for all k ≥ k_0.

(iii) The sequences {θ_{μ_k,α_k}(w^k)} and {θ_{μ_k,α_k}(w^{k+1})} converge to the same limit.

Proof. We first prove (i). Let

    (d̃y^k; d̃z^k) := −D_k^{-1} (0; Φ_{μ_k}(y^k, z^k))    (5.6.12)

and d̃w^k := (0, d̃y^k, d̃z^k), where D_k is defined by (5.5.7). Then, d̃w^k is a feasible point of QP subproblem (5.5.2). Note that the objective function of QP subproblem (5.5.2) can be rewritten as

    (1/2) (dw − B_k^{-1} g^k)^⊤ B_k (dw − B_k^{-1} g^k) + constant,

where g^k := (∇f(x^k, y^k), 0). Since dw^k is the optimum of QP subproblem (5.5.2), we have

    (d̃w^k − B_k^{-1} g^k)^⊤ B_k (d̃w^k − B_k^{-1} g^k) ≥ (dw^k − B_k^{-1} g^k)^⊤ B_k (dw^k − B_k^{-1} g^k).


This together with Assumption 5.6.1 (b) implies

    γ_2 ‖d̃w^k − B_k^{-1} g^k‖² ≥ γ_1 ‖dw^k − B_k^{-1} g^k‖².    (5.6.13)

Now, notice that {d̃w^k} is bounded from (5.6.12), Assumption 5.6.1 (a), (c), and Proposition 5.4.8. In addition, {B_k^{-1}} and {g^k} are also bounded from Assumption 5.6.1 (a), (b). Thus, the boundedness of {dw^k} follows from (5.6.13). On the other hand, from the first equality of the KKT conditions (5.5.3), we have

    (u^k; v^k) = −(D_k^⊤)^{-1} ( (∇_y f(x^k, y^k); 0) + B̂_k dw^k ),

where B̂_k is the 2m × (n + 2m) matrix consisting of the last 2m rows of B_k. This equation together with Assumption 5.6.1 and the boundedness of {dw^k} yields the boundedness of {(u^k, v^k)}.

We next prove (ii). From the update rule (5.5.5), we can easily see that {α_k} is nondecreasing. Moreover, if

    ‖v^k‖_∞ > α_{k−1} − δ,    (5.6.14)

then we have α_k = max{‖v^k‖_∞ + δ, α_{k−1} + 2δ} ≥ α_{k−1} + 2δ; that is, α_k increases by at least 2δ at a time. Let K := {k | ‖v^k‖_∞ > α_{k−1} − δ}. If |K| = ∞, then α_k → ∞ as k → ∞, and hence {‖v^k‖_∞} is unbounded from (5.6.14). However, this contradicts (i). Thus we have (ii).

We finally show (iii). Since we have (ii), there exist ᾱ and k_0 such that ᾱ = α_k for all k ≥ k_0. In what follows, we suppose k ≥ k_0. Since μ_{k+1} ≤ μ_k, Proposition 5.4.9 together with (5.5.4) implies

    θ_{μ_{k+1},ᾱ}(w^{k+1}) + ᾱmρμ_{k+1} ≤ θ_{μ_k,ᾱ}(w^{k+1}) + ᾱmρμ_k    (5.6.15)
        ≤ θ_{μ_k,ᾱ}(w^k) + ᾱmρμ_k,    (5.6.16)

where the last inequality follows from the Armijo condition (5.5.6) and Proposition 5.5.3 (b). From (5.6.16), {θ_{μ_k,ᾱ}(w^k) + ᾱmρμ_k} is a monotonically nonincreasing sequence. In addition, {θ_{μ_k,ᾱ}(w^k) + ᾱmρμ_k} is bounded, since {θ_{μ_k,ᾱ}(w^k)} is bounded from Assumption 5.6.1 (a). Therefore, {θ_{μ_k,ᾱ}(w^k) + ᾱmρμ_k} is convergent. This fact together with lim_{k→∞} μ_k = 0 and (5.6.16) yields that {θ_{μ_k,ᾱ}(w^{k+1})} and {θ_{μ_k,ᾱ}(w^k)} converge to the same limit.
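The finite-increase argument in the proof of (ii) reflects the penalty update rule (5.5.5); in code it reads as follows (a sketch of our reading of (5.5.5), with vk the current multiplier estimate and alpha_prev = α_{k−1}):

    % Penalty parameter update as used in the proof of Lemma 5.6.4 (ii).
    if norm(vk, inf) > alpha_prev - delta
        alpha_k = max(norm(vk, inf) + delta, alpha_prev + 2*delta);
    else
        alpha_k = alpha_prev;    % unchanged; by (ii) this happens eventually always
    end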

Finally, we show that a sequence generated by the algorithm globally converges to a B-stationary point of MPSOCC (5.3.1) under the assumption that any accumulation point w̄ = (x̄, ȳ, z̄) satisfies ȳ − z̄ ∉ bd(K ∪ −K), which is equivalent to the strict complementarity condition given in Definition 5.3.1 if ȳ and z̄ satisfy the SOC complementarity condition.

Theorem 5.6.5. Let {w^k} be a sequence generated by Algorithm 5.1. Suppose that {w^k} satisfies Assumption 5.6.1. Let w̄ = (x̄, ȳ, z̄) be an arbitrary accumulation point of {w^k}. If (ȳ, z̄) satisfies ȳ − z̄ ∉ bd(K ∪ −K), then w̄ is a B-stationary point of MPSOCC (5.1.1).

Proof. By Lemma 5.6.4 (ii), there exist a constant ᾱ > 0 and k_0 such that α_k = ᾱ for all k ≥ k_0. In this proof, we suppose without loss of generality that α_k = ᾱ holds for all k.


We first show that

    lim_{k→∞} ‖dw^k‖ = 0.    (5.6.17)

From Proposition 5.5.3 (b) and Assumption 5.6.1 (b), we have

    θ'_{μ_k,ᾱ}(w^k; dw^k) ≤ −(dw^k)^⊤ B_k dw^k ≤ −γ_1 ‖dw^k‖²,    (5.6.18)

which together with the Armijo condition (5.5.6) yields

    θ_{μ_k,ᾱ}(w^{k+1}) ≤ θ_{μ_k,ᾱ}(w^k) + στ_k θ'_{μ_k,ᾱ}(w^k; dw^k)
        ≤ θ_{μ_k,ᾱ}(w^k) − γ_1 στ_k ‖dw^k‖².

Hence, from Lemma 5.6.4 (iii), we obtain

    lim_{k→∞} τ_k ‖dw^k‖² = 0.

Now, assume for contradiction that (5.6.17) does not hold. Then, there exists an infinite index set K ⊆ {0, 1, . . .} such that

    lim_{k→∞, k∈K} ‖dw^k‖ > 0,    (5.6.19)

and hence

    lim_{k→∞, k∈K} τ_k = 0.

Let ℓ_k be the smallest nonnegative integer L satisfying (5.5.6), i.e., ρ^{ℓ_k} = τ_k. Then, from the definition of ℓ_k, we have

    θ_{μ_k,ᾱ}(w^k + ρ^{ℓ_k−1} dw^k) > θ_{μ_k,ᾱ}(w^k) + σρ^{ℓ_k−1} θ'_{μ_k,ᾱ}(w^k; dw^k),

which implies

    ξ_k := [θ_{μ_k,ᾱ}(w^k + ρ^{ℓ_k−1} dw^k) − θ_{μ_k,ᾱ}(w^k)] / ρ^{ℓ_k−1} − θ'_{μ_k,ᾱ}(w^k; dw^k)
         > −(1 − σ) θ'_{μ_k,ᾱ}(w^k; dw^k).    (5.6.20)

By Lemma 5.6.3 together with lim_{k→∞, k∈K} ρ^{ℓ_k−1} = 0 and ȳ − z̄ ∉ bd(K ∪ −K), we have limsup_{k→∞, k∈K} ξ_k ≤ 0. Moreover, we have from (5.6.18)

    −(1 − σ) θ'_{μ_k,ᾱ}(w^k; dw^k) ≥ (1 − σ) γ_1 ‖dw^k‖².    (5.6.21)

From (5.6.20) and (5.6.21), we must have lim_{k→∞, k∈K} ‖dw^k‖ = 0. However, this contradicts (5.6.19), and hence we have (5.6.17).

Next, we show that w̄ satisfies the KKT conditions (5.3.2) of MPSOCC (5.3.1). Let {(η^k, u^k, v^k)} be the sequence of multipliers corresponding to {dw^k}. Then, {(u^k, v^k)} is bounded from Lemma 5.6.4 (i). Hence, there exist vectors ū, v̄ and an index set K' such that lim_{k→∞, k∈K'} (w^k, u^k, v^k) = (w̄, ū, v̄). By (5.5.3) and letting X := {x ∈ R^n | Ax ≤ b}, we have

    ζ^k ∈ −N_X(x^k) × {0}^{2m},    (5.6.22)

    [ N   M                          −I_m
      0   ∇_yΦ_{μ_k}(y^k, z^k)^⊤     ∇_zΦ_{μ_k}(y^k, z^k)^⊤ ] (dx^k; dy^k; dz^k) = −(0; Φ_{μ_k}(y^k, z^k)),    (5.6.23)

    0 ≤ b − Ax^k − A dx^k,    (5.6.24)


where

    ζ^k := (∇_x f(x^k, y^k); ∇_y f(x^k, y^k); 0) + B_k (dx^k; dy^k; dz^k)
           + (N^⊤; M^⊤; −I_m) u^k + (0; ∇_yΦ_{μ_k}(y^k, z^k); ∇_zΦ_{μ_k}(y^k, z^k)) v^k,    (5.6.25)

and (5.6.22) follows from the first equation of (5.5.3) with N_X(x^k) = {A^⊤η | η ≥ 0, η^⊤(Ax^k − b) = 0}. By letting k ∈ K' tend to ∞ in (5.6.23) and (5.6.24), we have

    0 ≤ b − Ax̄,   Φ(ȳ, z̄) = 0.    (5.6.26)

Note that (5.6.26) implies x̄ ∈ X. In addition, note that {ζ^k}_{k∈K'} is a convergent sequence satisfying (5.6.22), since {B_k} is bounded from Assumption 5.6.1 and lim_{k→∞} ∇Φ_{μ_k}(y^k, z^k) = ∇Φ(ȳ, z̄) from the strict complementarity condition. Then, these facts together with (5.6.17) and the closedness of the point-to-set map N_X(·) yield

    (∇_x f(x̄, ȳ); ∇_y f(x̄, ȳ); 0) + (N^⊤; M^⊤; −I_m) ū + (0; ∇_yΦ(ȳ, z̄); ∇_zΦ(ȳ, z̄)) v̄
        = lim_{k→∞, k∈K'} ζ^k ∈ −N_X(x̄) × {0}^{2m}.

This together with (5.6.26) means that w̄ satisfies the KKT conditions (5.3.2) of MPSOCC (5.3.1). By Proposition 5.3.8, w̄ is a B-stationary point of MPSOCC (5.1.1).

5.7 Numerical experiments

In this section, we implement Algorithm 5.1 for solving problem (5.1.1) and report some numerical results. The program is coded in Matlab 2008a and run on a machine with an Intel Core 2 Duo E6850 3.00 GHz CPU and 4 GB RAM. In Step 0 of the algorithm, we set the parameters as

    δ := 1,   α_{−1} := 10,   σ := 10^{−3},   ρ := 0.9.

The choice of the smoothing parameters {μ_k}, i.e., μ_0 and β, and the starting point w^0 vary with the experiment. We let B_0 be the identity matrix, and update B_k by the modified BFGS formula

    B_{k+1} := B_k − [B_k s^k (B_k s^k)^⊤] / [(s^k)^⊤ B_k s^k] + [ζ̂^k (ζ̂^k)^⊤] / [(s^k)^⊤ ζ̂^k]

with s^k = w^{k+1} − w^k and ζ̂^k = θ_k ζ^k + (1 − θ_k) B_k s^k, where ζ^k := ∇_w L_{μ_{k+1}}(w^{k+1}, u^k, v^k, η^k) − ∇_w L_{μ_k}(w^k, u^k, v^k, η^k), L_μ denotes the Lagrangian function defined by L_μ(w, u, v, η) := f(x, y) + Φ_μ(y, z)^⊤ v + (Nx + My + q − z)^⊤ u + (Ax − b)^⊤ η, and θ_k is determined by

    θ_k := 1                                                     if (s^k)^⊤ ζ^k ≥ 0.2 (s^k)^⊤ B_k s^k,
    θ_k := [0.8 (s^k)^⊤ B_k s^k] / [(s^k)^⊤ (B_k s^k − ζ^k)]     otherwise.
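In code, this update is a direct transcription of the formulas above (a sketch; variable names are ours, with zk playing the role of ζ^k and Bk updated in place):

    % Modified (damped) BFGS update of Bk.
    sk  = wk1 - wk;                           % s^k = w^{k+1} - w^k
    sBs = sk' * Bk * sk;
    if sk' * zk >= 0.2 * sBs                  % zk: difference of Lagrangian gradients
        th = 1;
    else
        th = 0.8 * sBs / (sk' * (Bk*sk - zk));
    end
    zhat = th * zk + (1 - th) * (Bk * sk);    % damped gradient difference
    Bk = Bk - (Bk*sk)*(Bk*sk)' / sBs + (zhat*zhat') / (sk' * zhat);

The damping keeps (s^k)^⊤ ζ̂^k positive, so B_{k+1} stays positive definite even when the plain BFGS curvature condition fails.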

In Step 1, we use the quadprog solver in the Matlab Optimization Toolbox for solving the QP subproblems. In Step 3, we terminate the algorithm if the following condition is satisfied:

    ‖Φ(y^k, z^k)‖_∞ + ‖dw^k‖_∞ ≤ 10^{−7}.    (5.7.1)


The rationale for using (5.7.1) is as follows. If Φ(y^k, z^k) = 0, i.e., K ∋ y^k ⊥ z^k ∈ K holds, then w^k = (x^k, y^k, z^k) is feasible to MPSOCC (5.1.1), since the remaining constraints Ax^k ≤ b and z^k = Nx^k + My^k + q always hold from Proposition 5.5.2. Moreover, if ‖dw^k‖_∞ = 0, then w^k satisfies the KKT conditions (5.3.2) of MPSOCC (5.3.1). Thus, by Theorem 5.6.5, ‖Φ(y^k, z^k)‖_∞ + ‖dw^k‖_∞ = 0 indicates that w^k is a B-stationary point under the assumption y^k − z^k ∉ bd(K ∪ −K), which is equivalent to the strict complementarity condition in Definition 5.3.1 if K ∋ y^k ⊥ z^k ∈ K. Hence, (5.7.1) is an appropriate stopping criterion for the algorithm. As the CM function, we choose g(α) := ((α² + 4)^{1/2} + α)/2.

Experiment 1

In the first experiment, we solve the following test problem of the form (5.1.1):

    Minimize_{x,y,z}   ‖x‖² + ‖y‖²
    subject to         Ax ≤ b,
                       z = Nx + My + q,
                       K ∋ y ⊥ z ∈ K,    (5.7.2)

where (x, y, z) ∈ R^{10} × R^m × R^m, and each element of A ∈ R^{10×10} and N ∈ R^{m×10} is randomly chosen from [−1, 1]. Moreover, each element of b ∈ R^{10} is randomly chosen from [0, 1]. In addition, M ∈ R^{m×m} is a positive semi-definite symmetric matrix generated by M = M_1 M_1^⊤ + 0.01I, where M_1 ∈ R^{m×m} is a matrix whose entries are randomly chosen from [−1, 1]. The vector q ∈ R^m is set to q := ξ_z − Mξ_y with ξ_y ∈ R^m and ξ_z ∈ R^m, whose components are randomly chosen from [−1, 1]. We choose different Cartesian structures for K, and generate 50 problems for each K. In applying Algorithm 5.1, we set the initial point w^0 = (x^0, y^0, z^0) := (0, ξ_y, ξ_z) ∈ R^{10} × R^m × R^m, so that Ax^0 ≤ b and z^0 = Nx^0 + My^0 + q are satisfied. We choose the smoothing parameters {μ_k} as μ_k := 100 × 0.8^k.

The obtained results are shown in Tables 5.1 and 5.2, where (K^ν)^κ := K^ν × K^ν × · · · × K^ν ⊆ R^{νκ} and each column represents the following:

• #ite: the average number of iterations over the 50 test problems for each K;

• cpu(s): the average CPU time in seconds over the 50 test problems for each K;

• non(%): the percentage of test problems whose solutions obtained by Algorithm 5.1 satisfy the strict complementarity condition in Definition 5.3.1.

Recall that convergence to a B-stationary point is proved under the strict complementarity condition. Hence, the value of "non" represents the percentage of problems for which the algorithm successfully finds B-stationary points. From Table 5.1, we can observe that #ite does not change much, although the value of cpu(s) tends to become larger as m increases. From Table 5.2, we can see that non(%) tends to be less than 100 if K includes K^1 or K^2. Indeed, when K = (K^1)^{100} and (K^2)^{50}, the values of non(%) are 74 and 86, respectively, whereas non(%) is 100 when the dimensions of all SOCs in K are at least 10.


    m     K        #ite     cpu(s)   non(%)
    10    K^10     57.42    0.400    100
    20    K^20     56.30    0.471    100
    30    K^30     55.78    0.568    100
    40    K^40     55.44    0.726    100
    50    K^50     55.06    0.987    100
    60    K^60     54.96    1.388    100
    70    K^70     54.74    1.797    100
    80    K^80     54.66    2.130    100
    90    K^90     54.40    2.437    100
    100   K^100    54.20    2.930    100

Table 5.1: Results for problems with a single SOC complementarity constraint (Experiment 1)

    m     K                                              #ite     cpu(s)   non(%)
    100   K^100                                          54.20    2.930    100
    100   (K^50)^2                                       55.18    3.037    100
    100   K^50 × K^20 × K^30                             56.28    3.016    100
    100   (K^10)^10                                      64.10    3.687    100
    100   K^50 × K^20 × (K^10)^2 × K^5 × (K^1)^5         78.22    4.558    98
    100   (K^2)^50                                       78.68    6.012    86
    100   (K^1)^100                                      87.84    6.685    74

Table 5.2: Results for problems with multiple SOC complementarity constraints (Experiment 1)

Experiment 2

In the second experiment, we apply Algorithm 5.1 to a bilevel programming problem with a robust optimization problem in the lower level. Bilevel programming has wide applications such as network design and production planning [2, 14]. On the other hand, robust optimization is known to be a powerful methodology for treating optimization problems with uncertain data [3, 4]. In this experiment, we solve the following problem:

    Minimize_{(x,y)∈R⁴×R⁴}   ‖x − Cy‖² + Σ_{i=1}^{4} x_i
    subject to               0 ≤ x_i ≤ 5 (i = 1, 2, 3, 4),
                             1 ≤ −x_1 + 2x_2 + x_4 ≤ 3,
                             1 ≤ x_2 + x_3 − x_4 ≤ 2,
                             y solves P(x),    (5.7.3)

with

    P(x):  Minimize_{y∈R⁴}   max_{x̄∈U_r(x)}  x̄^⊤ y + (1/2) y^⊤ M y,

where r ≥ 0 is an uncertainty parameter, U_r(x) ⊆ R⁴ is the uncertainty set defined by U_r(x) := {x̄ ∈ R⁴ | ‖x̄ − x‖ ≤ r}, and

    M := [  2   2   0  −1        C := [ −1   1   0   1
            2   4  −2   0                0   2   2   3
            0  −2   2   0                0   0   3   2
           −1   0   0   6 ],             0   0   0  −1 ].

For solving problem (5.7.3), we introduce an auxiliary variable γ ∈ R to reformulate the lower-level minimax problem P(x) as the following SOCP:

    Minimize_{(γ,y)∈R×R⁴}   (1/2) y^⊤ M y + x^⊤ y + rγ
    subject to              (γ; y) ∈ K⁵.
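This SOCP reformulation rests on a one-line computation, which we record here for clarity (an added worked step):

    max_{x̄∈U_r(x)} x̄^⊤ y = x^⊤ y + max_{‖u‖≤r} u^⊤ y = x^⊤ y + r‖y‖,

so P(x) amounts to minimizing (1/2) y^⊤ M y + x^⊤ y + r‖y‖ over y ∈ R⁴, and the epigraph variable γ ≥ ‖y‖ is expressed exactly by the SOC constraint (γ; y) ∈ K⁵.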

Furthermore, the above SOCP can be rewritten as the following SOC complementarity problem:

    K⁵ ∋ (γ; y) ⊥ (r; My + x) ∈ K⁵.

Thus, we can convert problem (5.7.3) into the following problem:

    Minimize_{(x,y,z,γ)∈R⁴×R⁴×R⁵×R}   ‖x − Cy‖² + Σ_{i=1}^{4} x_i
    subject to   0 ≤ x_i ≤ 5 (i = 1, 2, 3, 4),
                 1 ≤ −x_1 + 2x_2 + x_4 ≤ 3,
                 1 ≤ x_2 + x_3 − x_4 ≤ 2,
                 z = (r; My + x),   K⁵ ∋ (γ; y) ⊥ z ∈ K⁵,    (5.7.4)

which is of the form (5.1.1). For the sake of comparison, problem (5.7.4) is solved not only by Algorithm 5.1 but also by the smoothing method [77], which is described as follows.

Smoothing method

Step 0. Choose a positive sequence {τ_ℓ} such that τ_ℓ → 0. Set ℓ := 0.

Step 1. Find a stationary point w^ℓ = (x^ℓ, y^ℓ, z^ℓ) of the smoothed problem (5.5.1) with μ = τ_ℓ.

Step 2. If w^ℓ is feasible for MPSOCC (5.1.1), then stop. Otherwise, set ℓ := ℓ + 1 and go to Step 1.
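The outer loop of this method is straightforward to code; a sketch, where solve_smoothed_problem is a hypothetical wrapper returning a stationary point of (5.5.1) for fixed μ, Phi_nat is a hypothetical helper for the natural residual, and τ_ℓ := 0.8^ℓ is chosen as specified below:

    % Outer loop of the smoothing method [77] (sketch).
    l = 0;
    while true
        tau = 0.8^l;
        [xl, yl, zl] = solve_smoothed_problem(tau);   % hypothetical inner solver
        if norm(Phi_nat(yl, zl), inf) <= 1e-7         % Step 2 feasibility test
            break;
        end
        l = l + 1;
    end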

Each algorithm is implemented in the following way. In Step 0 of Algorithm 5.1, we set the smoothing parameters μ_k := 0.8^k (k ≥ 0) and the starting point x^0 := (1, 1, 1, 1), (γ^0, y^0, z^0) := (0, 0, 0) ∈ R × R⁴ × R⁵. The choice of the other parameters is the same as in Experiment 1. In Step 0 of the smoothing method, we set τ_ℓ := 0.8^ℓ for ℓ ≥ 0. In Step 1, for solving problem (5.5.1) with μ = τ_ℓ, we apply Algorithm 5.1 with a slight modification, where the smoothing parameter μ_k is fixed to τ_ℓ for all k and the termination criterion is replaced by ‖dw^k‖_∞ ≤ 10^{−7}. In Step 2, we stop the smoothing method when ‖Φ(y^ℓ, z^ℓ)‖_∞ ≤ 10^{−7} is satisfied. (The remaining constraints Ax^ℓ ≤ b and z^ℓ = Nx^ℓ + My^ℓ + q are automatically satisfied, since w^ℓ is feasible to the smoothed problem (5.5.1).)
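Before reporting the results, we note how the data of (5.7.4) are assembled in the MPSOCC format z = Ñx + M̃(γ; y) + q̃ expected by Algorithm 5.1; the block structure below is our reading of the reformulation (a sketch, with r = 0.02 as one choice):

    % Data of problem (5.7.4) in MPSOCC form (sketch).
    M4 = [ 2  2  0 -1;
           2  4 -2  0;
           0 -2  2  0;
          -1  0  0  6];
    r  = 0.02;
    Nt = [zeros(1,4); eye(4)];        % x enters the last four rows of z
    Mt = blkdiag(0, M4);              % (gamma; y) |-> (0; M*y)
    qt = [r; zeros(4,1)];             % constant top entry r
    % z = Nt*x + Mt*[gamma; y] + qt reproduces z = (r; M*y + x).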

We then test both methods on problem (5.7.4) with r = 0.02, 0.04, 0.06, 0.08, and 0.10. The obtained results are shown in Tables 5.3 and 5.4, whose columns represent the following:

• (x*, y*, γ*): the values of x, y, γ obtained by Algorithm 5.1;

• (λ*_1, λ*_2): the spectral values of (γ*; y*) + z* with respect to K⁵, defined as in Definition 2.4.1, where z* := (r; My* + x*);

• ite_out: the number of outer iterations;

• #QP: the number of QP subproblems (5.5.2) solved in each trial.

From Table 5.3, for all r, we can observe (λ*_1, λ*_2) > 0, which means (γ*; y*) + z* ∈ int K⁵, i.e., the strict complementarity condition holds at the obtained solution. Hence, Algorithm 5.1 successfully finds a B-stationary point of problem (5.7.4). From Table 5.4, we cannot find a significant difference between the values of ite_out for the two methods. However, it is observed that the value of #QP in the smoothing method tends to be much larger than that in Algorithm 5.1. Indeed, when r = 0.02, the smoothing method has #QP = 218, which is almost five times larger than #QP = 44 in Algorithm 5.1. This fact suggests that the smoothing method needs to solve a number of QP subproblems in Step 1 for each smoothed problem (5.5.1) with fixed μ, while Algorithm 5.1 solves only one QP subproblem (5.5.2) per smoothed problem (5.5.1). As a result, the computational cost of the smoothing method tends to be larger than that of Algorithm 5.1.
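The spectral values reported in Table 5.3 are computed directly from Definition 2.4.1; a sketch, with (gs, ys, zs) holding the computed (γ*, y*, z*):

    % Spectral values of s = (gamma*; y*) + z* with respect to K^5 (sketch).
    s = [gs; ys] + zs;
    lam1 = s(1) - norm(s(2:end));
    lam2 = s(1) + norm(s(2:end));    % strict complementarity: lam1, lam2 > 0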

    r      x*                                 y*                                     γ*       (λ*_1, λ*_2)
    0.02   (0.9236, 0.9618, 0.0382, 0.0000)   (−0.6152, 0.1120, 0.0915, 0.1020)      0.6402   (0.040, 1.280)
    0.04   (0.9495, 0.9747, 0.0253, 0.0000)   (−0.6170, 0.1106, 0.0950, 0.1018)      0.6421   (0.080, 1.284)
    0.06   (0.9754, 0.9877, 0.0123, 0.0000)   (−0.6189, 0.1092, 0.0985, 0.1016)      0.6442   (0.120, 1.288)
    0.08   (1.0021, 1.0007, 0.0000, 0.0007)   (−0.6218, 0.1084, 0.1021, 0.1014)      0.6474   (0.160, 1.295)
    0.10   (1.0416, 1.0139, 0.0000, 0.0139)   (−0.6356, 0.1123, 0.1044, 0.1011)      0.6616   (0.200, 1.323)

Table 5.3: Results for Algorithm 5.1 (Experiment 2)

           Algorithm 5.1               smoothing method
    r      cpu(s)   ite_out   #QP      cpu(s)   ite_out   #QP
    0.02   0.207    45        44       1.005    43        218
    0.04   0.213    43        42       0.941    42        205
    0.06   0.213    43        41       0.908    41        199
    0.08   0.214    42        40       0.873    40        188
    0.10   0.209    41        40       0.875    40        188

Table 5.4: Comparison of Algorithm 5.1 and the smoothing method

5.8 Concluding remarks

In this chapter, we have considered the mathematical program with SOC complementarity constraints. We have proposed an algorithm based on the smoothing and the sequential quadratic programming (SQP) methods, in which we replace the SOC complementarity constraints with smooth equality constraints by means of the natural residual and its smoothing function, and apply the SQP method while decreasing the smoothing parameter gradually. We have shown that the proposed algorithm possesses the global convergence property under the Cartesian P_0 and the strict complementarity assumptions. We have further confirmed the efficiency of the algorithm through numerical experiments.


Chapter 6

Conclusion

In this thesis, we have considered two different types of optimization problems that are generalizations of the second-order cone programming problem (SOCP). One is the semi-infinite second-order cone programming problem (SISOCP) and the other is the mathematical programming problem with SOC complementarity constraints (MPSOCC).

The main contributions of the thesis can be summarized as follows:

• In Chapter 3, we have considered a special case of SISOCP that consists of a convex objective function and infinitely many affine SOC constraints. For such a convex SISOCP, we have proposed two exchange-type methods that produce iteration points by solving a sequence of relaxed SISOCPs with finitely many SOC constraints. The first one is an explicit exchange method. Assuming strict convexity of the objective function, we have established its global convergence. The second one is a regularized explicit exchange method, which is a hybrid of the explicit exchange method and a regularization method. With the help of the regularization scheme, we have succeeded in ensuring its global convergence without strict convexity. In particular, the choice of two parameters given in the algorithm plays a crucial role in the convergence analysis. In the numerical experiments, we have examined the behavior of the latter proposed method and observed its efficiency.

• In Chapter 4, for solving SISOCP of the more general form, we have proposed an SQP-type method based on the local reduction technique. In this approach, we represent the SISOCP locally as an SOCP by means of finitely many implicit functions, and apply the SQP-type method to the latter SOCP. We have established its global convergence by using the max-type penalty function as a merit function. Furthermore, we have analyzed the local convergence behavior of the generated iteration points. The convergence rate has been proved to be quadratic. In the numerical experiments, for the sake of comparison, we have also implemented another SQP-type method, and then observed good performance of the proposed method.

• In Chapter 5, we have focused on MPSOCC and proposed a smoothing SQP algorithm for solving it. This method first replaces the SOC complementarity constraints with nondifferentiable equality constraints by using the natural residual. Then, an SQP-type method together with a smoothing technique is applied to the reformulated MPSOCC. We have proved that the QP subproblems are always consistent if the Cartesian P_0 property holds. Moreover, we have also shown that the generated iterative sequence globally converges to a B-stationary point under strict complementarity. Finally, we have conducted some numerical experiments and observed the efficiency of the proposed method.

We finally discuss further research concerning SISOCP and MPSOCC.

• There is some room to improve the proposed methods for solving SISOCP. Actually, all the algorithms require exact optima of the SOCP subproblems to generate iteration points. Although such optima can be found efficiently by existing algorithms, this entails considerable computational cost. Therefore, to solve SISOCP more practically, it is desirable to refine the proposed algorithms so that they allow inexact solutions of the SOCP subproblems without loss of the good convergence properties.

• In the proposed algorithm for solving MPSOCC, we gradually decrease the smoothing parameter to zero to ensure global convergence. However, as in smoothing-type Newton methods [26, 12], the speed of decreasing the parameter should play a key role in establishing rapid local convergence. For fast convergence, an effective way of decreasing the parameter should be explored.


Bibliography

[1] F. Alizadeh and D. Goldfarb, Second-order cone programming, Mathematical Programming, 95 (2003), pp. 3–51.

[2] J. Bard et al., Practical Bilevel Optimization: Algorithms and Applications, Springer, 1998.

[3] A. Ben-Tal and A. Nemirovski, Robust convex optimization, Mathematics of Operations Research, 23 (1998), pp. 769–805.

[4] ——, Robust solutions of uncertain linear programs, Operations Research Letters, 25 (1999), pp. 1–13.

[5] J. F. Bonnans and H. Ramírez C., Perturbation analysis of second-order cone programming problems, Mathematical Programming, 104 (2005), pp. 205–227.

[6] J. F. Bonnans and A. Shapiro, Perturbation Analysis of Optimization Problems, Springer-Verlag, New York, 2000.

[7] S. Butikofer and D. Klatte, A nonsmooth Newton method with path search and its use in solving C^{1,1} programs and semi-infinite problems, SIAM Journal on Optimization, 20 (2010), pp. 2381–2412.

[8] S. Chan, H. Chen, and C. Pun, The design of digital all-pass filters using second-order cone programming (SOCP), IEEE Transactions on Circuits and Systems II: Express Briefs, 52 (2005), pp. 66–70.

[9] J.-S. Chen and S.-H. Pan, A descent method for a reformulation of the second-order cone complementarity problem, Journal of Computational and Applied Mathematics, 213 (2008), pp. 547–558.

[10] J.-S. Chen and P. Tseng, An unconstrained smooth minimization reformulation of the second-order cone complementarity problem, Mathematical Programming, 104 (2005), pp. 293–327.

[11] X. Chen and H. Qi, Cartesian P-property and its applications to the semidefinite linear complementarity problem, Mathematical Programming, 106 (2006), pp. 177–201.


[12] X. Chen, D. Sun, and J. Sun, Complementarity functions and numerical experiments on some smoothing Newton methods for second-order-cone complementarity problems, Computational Optimization and Applications, 25 (2003), pp. 39–56.

[13] R. Cottle, J.-S. Pang, and R. Stone, The Linear Complementarity Problem, Society for Industrial and Applied Mathematics, 2009.

[14] S. Dempe, Foundations of Bilevel Programming, Springer, 2002.

[15] J. Faraut and A. Koranyi, Analysis on Symmetric Cones, Oxford University Press, New York, 1994.

[16] C. A. Floudas and O. Stein, The adaptive convexification algorithm: A feasible point method for semi-infinite programming, SIAM Journal on Optimization, 18 (2007), pp. 1187–1208.

[17] M. Fukushima, Z.-Q. Luo, and J.-S. Pang, A globally convergent sequential quadratic programming algorithm for mathematical programs with linear complementarity constraints, Computational Optimization and Applications, 10 (1998), pp. 5–34.

[18] M. Fukushima, Z.-Q. Luo, and P. Tseng, Smoothing functions for second-order cone complementarity problems, SIAM Journal on Optimization, 12 (2001), pp. 436–460.

[19] M. Fukushima and P. Tseng, An implementable active-set algorithm for computing a B-stationary point of a mathematical program with linear complementarity constraints, SIAM Journal on Optimization, 12 (2002), pp. 724–739.

[20] L. E. Ghaoui, M. Oks, and F. Oustry, Worst-case value-at-risk and robust portfolio optimization: A conic programming approach, Operations Research, 51 (2003), pp. 543–556.

[21] M. A. Goberna and M. A. Lopez, Semi-Infinite Programming: Recent Advances, Kluwer Academic Publishers, Dordrecht, 2001.

[22] D. Goldfarb and G. Iyengar, Robust portfolio selection problems, Mathematics of Operations Research, 28 (2003), pp. 1–38.

[23] M. S. Gowda, R. Sznajder, and J. Tao, Some P-properties for linear transformations on Euclidean Jordan algebras, Linear Algebra and Its Applications, 393 (2004), pp. 203–232.

[24] G. Gramlich, R. Hettich, and E. W. Sachs, Local convergence of SQP methods in semi-infinite programming, SIAM Journal on Optimization, 5 (1995), pp. 641–658.

[25] S. Hayashi, T. Yamaguchi, N. Yamashita, and M. Fukushima, A matrix-splitting method for symmetric affine second-order cone complementarity problems, Journal of Computational and Applied Mathematics, 175 (2005), pp. 335–353.

[26] S. Hayashi, N. Yamashita, and M. Fukushima, A combined smoothing and regularization method for monotone second-order cone complementarity problems, SIAM Journal on Optimization, 15 (2005), pp. 593–615.


[27] S. Hayashi and S.-Y. Wu, An explicit exchange algorithm for linear semi-infinite programming problems with second-order cone constraints, SIAM Journal on Optimization, 20 (2009), pp. 1527–1546.

[28] R. Hettich, An implementation of a discretization method for semi-infinite programming, Mathematical Programming, 34 (1986), pp. 354–361.

[29] R. Hettich and K. O. Kortanek, Semi-infinite programming: Theory, methods, and applications, SIAM Review, 35 (1993), pp. 380–429.

[30] R. Hettich and W. V. Honstede, On quadratically convergent methods for semi-infinite programming, in Semi-Infinite Programming, R. Hettich, ed., Springer, Berlin, 1979, pp. 97–111.

[31] S. Ito, Y. Liu, and K. Teo, A dual parametrization method for convex semi-infinite programming, Annals of Operations Research, 98 (2000), pp. 189–213.

[32] Y. Kanno, J. Martins, and A. Pinto da Costa, Three-dimensional quasi-static frictional contact by using second-order cone linear complementarity problem, International Journal for Numerical Methods in Engineering, 65 (2006), pp. 62–83.

[33] H. Kato and M. Fukushima, An SQP-type algorithm for nonlinear second-order cone programs, Optimization Letters, 1 (2007), pp. 129–144.

[34] H. C. Lai and S.-Y. Wu, On linear semi-infinite programming problems, Numerical Functional Analysis and Optimization, 13 (1992), pp. 287–304.

[35] D. Li, L. Qi, J. Tam, and S.-Y. Wu, A smoothing Newton method for semi-infinite programming, Journal of Global Optimization, 30 (2004), pp. 169–194.

[36] Y. Liu and L. Zhang, Convergence analysis of the augmented Lagrangian method for nonlinear second-order cone optimization problems, Nonlinear Analysis: Theory, Methods & Applications, 67 (2007), pp. 1359–1373.

[37] M. S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret, Applications of second-order cone programming, Linear Algebra and its Applications, 284 (1998), pp. 193–228.

[38] M. Lopez and G. Still, Semi-infinite programming, European Journal of Operational Research, 180 (2007), pp. 491–518.

[39] W. S. Lu and T. Hinamoto, A second-order cone programming approach for minimax design of 2-D FIR filters with low group delay, in Proc. IEEE Int. Symp. Circuits Systems, 2006, pp. 2521–2524.

[40] W. Lu, T. Saramaki, and R. Bregovic, Design of practically perfect-reconstruction cosine-modulated filter banks: A second-order cone programming approach, IEEE Transactions on Circuits and Systems I: Regular Papers, 51 (2004), pp. 552–563.


[41] Z.-Q. Luo, J.-S. Pang, and D. Ralph, Mathematical Programs with Equilibrium Constraints, Cambridge University Press, Cambridge, 1996.

[42] R. Monteiro and T. Tsuchiya, Polynomial convergence of primal-dual algorithms for the second-order cone program based on the MZ-family of directions, Mathematical Programming, 88 (2000), pp. 61–83.

[43] Y. Narushima, N. Sagara, and H. Ogasawara, A smoothing Newton method with Fischer-Burmeister function for second-order cone complementarity problems, Journal of Optimization Theory and Applications, 149 (2011), pp. 79–101.

[44] Q. Ni, C. Ling, L. Qi, and K. Teo, A truncated projected Newton-type algorithm for large-scale semi-infinite programming, SIAM Journal on Optimization, 16 (2006), pp. 1137–1154.

[45] R. Nishimura, S. Hayashi, and M. Fukushima, Robust Nash equilibria in n-person non-cooperative games: Uniqueness and reformulation, Pacific Journal of Optimization, 5 (2009), pp. 237–259.

[46] T. Okuno and M. Fukushima, Local reduction based SQP-type method for semi-infinite programs with an infinite number of second-order cone constraints, submitted to Journal of Global Optimization.

[47] T. Okuno, S. Hayashi, and M. Fukushima, A regularized explicit exchange method for semi-infinite programs with an infinite number of conic constraints, SIAM Journal on Optimization, 22 (2012), pp. 1009–1028.

[48] S.-H. Pan and J.-S. Chen, A regularization method for the second-order cone complementarity problem with the Cartesian P_0-property, Nonlinear Analysis: Theory, Methods & Applications, 70 (2009), pp. 1475–1491.

[49] ——, A semismooth Newton method for SOCCPs based on a one-parametric class of SOC complementarity functions, Computational Optimization and Applications, 45 (2010), pp. 59–88.

[50] A. Pereira, M. Costa, and E. Fernandes, Interior point filter method for semi-infinite programming problems, Optimization, 60 (2011), pp. 1309–1338.

[51] A. Pereira and E. Fernandes, A reduction method for semi-infinite programming by means of a global stochastic approach, Optimization, 58 (2009), pp. 713–726.

[52] L. Qi, C. Ling, X. Tong, and G. Zhou, A smoothing projected Newton-type algorithm for semi-infinite programming, Computational Optimization and Applications, 42 (2009), pp. 1–30.

[53] L. Qi, S.-Y. Wu, and G. Zhou, Semismooth Newton methods for solving semi-infinite programming problems, Journal of Global Optimization, 27 (2003), pp. 215–232.


[54] R. Reemtsen, Discretization methods for the solution of semi-infinite programming problems, Journal of Optimization Theory and Applications, 71 (1991), pp. 85–103.

[55] R. Reemtsen and J. Ruckmann, Semi-Infinite Programming, Kluwer Academic Publishers, Boston, 1998.

[56] R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, Springer, New York, 1998.

[57] R. T. Rockafellar, Convex Analysis, Princeton University Press, Princeton, 1970.

[58] T. Sasakawa and T. Tsuchiya, Optimal magnetic shield design with second-order cone programming, SIAM Journal on Scientific Computing, 24 (2003), pp. 1930–1950.

[59] H. Scheel and S. Scholtes, Mathematical programs with complementarity constraints: Stationarity, optimality, and sensitivity, Mathematics of Operations Research, 25 (2000), pp. 1–22.

[60] T. Shiu and S.-Y. Wu, Relaxed cutting plane method with convexification for solving nonlinear semi-infinite programming problems, Computational Optimization and Applications, 53 (2012), pp. 1–23.

[61] O. Stein and P. Steuermann, The adaptive convexification algorithm for semi-infinite programming with arbitrary index sets, Mathematical Programming, 136 (2012), pp. 183–207.

[62] O. Stein and A. Tezel, The semismooth approach for semi-infinite programming under the reduction ansatz, Journal of Global Optimization, 41 (2008), pp. 245–266.

[63] ——, The semismooth approach for semi-infinite programming without strict complementarity, SIAM Journal on Optimization, 20 (2009), pp. 1052–1072.

[64] G. Still, Discretization in semi-infinite programming: The rate of convergence, Mathematical Programming, 91 (2001), pp. 53–69.

[65] J. F. Sturm, Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones, Optimization Methods and Software, 11 (1999), pp. 625–653.

[66] Y. Tanaka, M. Fukushima, and T. Ibaraki, A globally convergent SQP method for semi-infinite nonlinear optimization, Journal of Computational and Applied Mathematics, 23 (1988), pp. 141–153.

[67] J. Tao and M. S. Gowda, Some P-properties for nonlinear transformations on Euclidean Jordan algebras, Mathematics of Operations Research, 30 (2005), pp. 985–1004.

[68] K. C. Toh, M. J. Todd, and R. H. Tutuncu, SDPT3—a MATLAB software package for semidefinite programming, version 2.1, Optimization Methods and Software, 11 (1999), pp. 545–581.


[69] T. Tsuchiya, A convergence analysis of the scaling-invariant primal-dual path-following algorithms for second-order cone programming, Optimization Methods and Software, 11 (1999), pp. 141–182.

[70] K. M. Tsui, S. C. Chan, and K. S. Yeung, Design of FIR digital filters with prescribed flatness and peak error constraints using second-order cone programming, IEEE Trans. Circuits Syst. II, 52 (2005), pp. 601–605.

[71] Y. Wang and L. Zhang, Properties of equation reformulation of the Karush–Kuhn–Tucker condition for nonlinear second order cone optimization problems, Mathematical Methods of Operations Research, 70 (2009), pp. 195–218.

[72] S. P. Wu, S. Boyd, and L. Vandenberghe, FIR filter design via spectral factorization and convex optimization, in Applied and Computational Control, Signals, and Circuits, 1 (1999), pp. 215–245.

[73] S.-Y. Wu, D. H. Li, L. Qi, and G. Zhou, An iterative method for solving KKT system of the semi-infinite programming, Optimization Methods and Software, 20 (2005), pp. 629–643.

[74] H. Xu and J. Zeng, A multisplitting method for symmetrical affine second-order cone complementarity problem, Computers & Mathematics with Applications, 55 (2008), pp. 459–469.

[75] H. Yamamura, T. Okuno, S. Hayashi, and M. Fukushima, A smoothing SQP method for mathematical programs with linear second-order cone complementarity constraints, Pacific Journal of Optimization, to appear.

[76] H. Yamashita and H. Yabe, A primal-dual interior point method for nonlinear optimization over second-order cones, Optimization Methods and Software, 24 (2009), pp. 407–426.

[77] T. Yan and M. Fukushima, Smoothing method for mathematical programs with symmetric cone complementarity constraints, Optimization, 60 (2011), pp. 113–128.

[78] H. Zhang, J. Li, and S. Pan, New second-order cone linear complementarity formulation and semi-smooth Newton algorithm for finite element analysis of 3D frictional contact problem, Computer Methods in Applied Mechanics and Engineering, 200 (2011), pp. 77–88.

[79] L. Zhang, S.-Y. Wu, and M. Lopez, A new exchange method for convex semi-infinite programming, SIAM Journal on Optimization, 20 (2010), pp. 2959–2977.