Chapter 16

Quadratic Optimization Problems

16.1 Quadratic Optimization: The Positive Definite Case

In this chapter, we consider two classes of quadratic optimization problems that appear frequently in engineering and in computer science (especially in computer vision):

1. Minimizing
\[
f(x) = \frac{1}{2} x^\top A x + x^\top b
\]
over all $x \in \mathbb{R}^n$, or subject to linear or affine constraints.

2. Minimizing
\[
f(x) = \frac{1}{2} x^\top A x + x^\top b
\]
over the unit sphere.


In both cases, $A$ is a symmetric matrix. We also seek necessary and sufficient conditions for $f$ to have a global minimum.

Many problems in physics and engineering can be stated as the minimization of some energy function, with or without constraints.

Indeed, it is a fundamental principle of mechanics that nature acts so as to minimize energy.

Furthermore, if a physical system is in a stable state of equilibrium, then the energy in that state should be minimal.

The simplest kind of energy function is a quadratic function.


Such functions can be conveniently defined in the form
\[
P(x) = x^\top A x - x^\top b,
\]
where $A$ is a symmetric $n \times n$ matrix, and $x, b$ are vectors in $\mathbb{R}^n$, viewed as column vectors.

Actually, for reasons that will be clear shortly, it is preferable to put a factor $\frac{1}{2}$ in front of the quadratic term, so that
\[
P(x) = \frac{1}{2} x^\top A x - x^\top b.
\]

The question is, under what conditions (on $A$) does $P(x)$ have a global minimum, preferably unique?


We give a complete answer to the above question in two stages:

1. In this section, we show that if $A$ is symmetric positive definite, then $P(x)$ has a unique global minimum precisely when
\[
Ax = b.
\]

2. In Section 16.2, we give necessary and sufficient conditions in the general case, in terms of the pseudo-inverse of $A$.

We begin with the matrix version of Definition 14.2.

Definition 16.1. A symmetric positive definite matrix is a matrix whose eigenvalues are strictly positive, and a symmetric positive semidefinite matrix is a matrix whose eigenvalues are nonnegative.


Equivalent criteria are given in the following proposition.

Proposition 16.1. Given any Euclidean space $E$ of dimension $n$, the following properties hold:

(1) Every self-adjoint linear map $f \colon E \to E$ is positive definite iff
\[
\langle x, f(x) \rangle > 0
\]
for all $x \in E$ with $x \neq 0$.

(2) Every self-adjoint linear map $f \colon E \to E$ is positive semidefinite iff
\[
\langle x, f(x) \rangle \geq 0
\]
for all $x \in E$.

Some special notation is customary (especially in the field of convex optimization) to express that a symmetric matrix is positive definite or positive semidefinite.


Definition 16.2. Given any $n \times n$ symmetric matrix $A$, we write $A \succeq 0$ if $A$ is positive semidefinite, and we write $A \succ 0$ if $A$ is positive definite.

It should be noted that we can define the relation
\[
A \succeq B
\]
between any two $n \times n$ matrices (symmetric or not) iff $A - B$ is symmetric positive semidefinite.

It is easy to check that this relation is actually a partial order on matrices, called the positive semidefinite cone ordering; for details, see Boyd and Vandenberghe [8], Section 2.4.

If $A$ is symmetric positive definite, it is easily checked that $A^{-1}$ is also symmetric positive definite.

Also, if $C$ is a symmetric positive definite $m \times m$ matrix and $A$ is an $m \times n$ matrix of rank $n$ (and so $m \geq n$), then $A^\top C A$ is symmetric positive definite.


We can now prove that
\[
P(x) = \frac{1}{2} x^\top A x - x^\top b
\]
has a global minimum when $A$ is symmetric positive definite.

Proposition 16.2. Given a quadratic function
\[
P(x) = \frac{1}{2} x^\top A x - x^\top b,
\]
if $A$ is symmetric positive definite, then $P(x)$ has a unique global minimum for the solution of the linear system $Ax = b$. The minimum value of $P(x)$ is
\[
P(A^{-1}b) = -\frac{1}{2} b^\top A^{-1} b.
\]
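Proposition 16.2 is easy to check numerically. The following NumPy sketch (the random test instance and variable names are ours, not part of the text) builds a symmetric positive definite $A$, solves $Ax = b$, and verifies both the minimum-value formula and the fact that every perturbation away from the minimizer increases $P$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random symmetric positive definite matrix: M M^T + I has eigenvalues >= 1.
M = rng.standard_normal((4, 4))
A = M @ M.T + np.eye(4)
b = rng.standard_normal(4)

def P(x):
    return 0.5 * x @ A @ x - x @ b

# Unique global minimizer: the solution of A x = b.
x_star = np.linalg.solve(A, b)
p_min = P(x_star)

# Minimum value agrees with -1/2 b^T A^{-1} b.
assert np.isclose(p_min, -0.5 * b @ np.linalg.solve(A, b))

# Every perturbation increases P, since P(x* + d) - P(x*) = 1/2 d^T A d > 0.
for _ in range(100):
    d = rng.standard_normal(4)
    assert P(x_star + d) > p_min
```

The last check is exactly the positive definiteness of $A$ at work: $P(x^* + d) - P(x^*) = \frac{1}{2} d^\top A d > 0$ for $d \neq 0$.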


Remarks:

(1) The quadratic function $P(x)$ is also given by
\[
P(x) = \frac{1}{2} x^\top A x - b^\top x,
\]
but the definition using $x^\top b$ is more convenient for the proof of Proposition 16.2.

(2) If $P(x)$ contains a constant term $c \in \mathbb{R}$, so that
\[
P(x) = \frac{1}{2} x^\top A x - x^\top b + c,
\]
the proof of Proposition 16.2 still shows that $P(x)$ has a unique global minimum for $x = A^{-1}b$, but the minimal value is
\[
P(A^{-1}b) = -\frac{1}{2} b^\top A^{-1} b + c.
\]


Thus, when the energy function $P(x)$ of a system is given by a quadratic function
\[
P(x) = \frac{1}{2} x^\top A x - x^\top b,
\]
where $A$ is symmetric positive definite, finding the global minimum of $P(x)$ is equivalent to solving the linear system $Ax = b$.

Sometimes, it is useful to recast a linear problem $Ax = b$ as a variational problem (finding the minimum of some energy function).

However, very often, a minimization problem comes with extra constraints that must be satisfied for all admissible solutions.


For instance, we may want to minimize the quadratic function
\[
Q(y_1, y_2) = \frac{1}{2}\left(y_1^2 + y_2^2\right)
\]
subject to the constraint
\[
2y_1 - y_2 = 5.
\]

The solution for which $Q(y_1, y_2)$ is minimum is no longer $(y_1, y_2) = (0, 0)$, but instead, $(y_1, y_2) = (2, -1)$, as will be shown later.

Geometrically, the graph of the function defined by $z = Q(y_1, y_2)$ in $\mathbb{R}^3$ is a paraboloid of revolution $P$ with axis of revolution $Oz$.

The constraint
\[
2y_1 - y_2 = 5
\]
corresponds to the vertical plane $H$ parallel to the $z$-axis and containing the line of equation $2y_1 - y_2 = 5$ in the $xy$-plane.


Thus, the constrained minimum of $Q$ is located on the parabola that is the intersection of the paraboloid $P$ with the plane $H$.

A nice way to solve constrained minimization problems of the above kind is to use the method of Lagrange multipliers.

Definition 16.3. The quadratic constrained minimization problem consists in minimizing a quadratic function
\[
Q(y) = \frac{1}{2} y^\top C^{-1} y - b^\top y
\]
subject to the linear constraints
\[
A^\top y = f,
\]
where $C^{-1}$ is an $m \times m$ symmetric positive definite matrix, $A$ is an $m \times n$ matrix of rank $n$ (so that $m \geq n$), and where $b, y \in \mathbb{R}^m$ (viewed as column vectors), and $f \in \mathbb{R}^n$ (viewed as a column vector).


The reason for using $C^{-1}$ instead of $C$ is that the constrained minimization problem has an interpretation as a set of equilibrium equations in which the matrix that arises naturally is $C$ (see Strang [31]).

Since $C$ and $C^{-1}$ are both symmetric positive definite, this doesn't make any difference, but it seems preferable to stick to Strang's notation.

The method of Lagrange consists in incorporating the $n$ constraints $A^\top y = f$ into the quadratic function $Q(y)$, by introducing extra variables $\lambda = (\lambda_1, \ldots, \lambda_n)$ called Lagrange multipliers, one for each constraint. We form the Lagrangian
\[
L(y, \lambda) = Q(y) + \lambda^\top (A^\top y - f)
= \frac{1}{2} y^\top C^{-1} y - (b - A\lambda)^\top y - \lambda^\top f.
\]


We shall prove that our constrained minimization problem has a unique solution given by the system of linear equations
\begin{align*}
C^{-1} y + A\lambda &= b,\\
A^\top y &= f,
\end{align*}
which can be written in matrix form as
\[
\begin{pmatrix} C^{-1} & A \\ A^\top & 0 \end{pmatrix}
\begin{pmatrix} y \\ \lambda \end{pmatrix}
=
\begin{pmatrix} b \\ f \end{pmatrix}.
\]

Note that the matrix of this system is symmetric. Eliminating $y$ from the first equation
\[
C^{-1} y + A\lambda = b,
\]
we get
\[
y = C(b - A\lambda),
\]
and substituting into the second equation, we get
\[
A^\top C(b - A\lambda) = f,
\]
that is,
\[
A^\top C A \lambda = A^\top C b - f.
\]


However, by a previous remark, since $C$ is symmetric positive definite and the columns of $A$ are linearly independent, $A^\top C A$ is symmetric positive definite, and thus invertible.

Note that this way of solving the system requires solving for the Lagrange multipliers first.

Letting $e = b - A\lambda$, we also note that the system
\[
\begin{pmatrix} C^{-1} & A \\ A^\top & 0 \end{pmatrix}
\begin{pmatrix} y \\ \lambda \end{pmatrix}
=
\begin{pmatrix} b \\ f \end{pmatrix}
\]
is equivalent to the system
\begin{align*}
e &= b - A\lambda,\\
y &= Ce,\\
A^\top y &= f.
\end{align*}

The latter system is called the equilibrium equations by Strang [31].
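As a numerical sanity check (a NumPy sketch with randomly generated data of our own), one can solve for the multipliers first via $A^\top C A \lambda = A^\top C b - f$ and confirm that the resulting pair $(y, \lambda)$ also satisfies the symmetric block system:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 2

# C symmetric positive definite, A of full column rank n.
M = rng.standard_normal((m, m))
C = M @ M.T + np.eye(m)
Cinv = np.linalg.inv(C)
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
f = rng.standard_normal(n)

# Solve for the Lagrange multipliers first: A^T C A lam = A^T C b - f.
lam = np.linalg.solve(A.T @ C @ A, A.T @ C @ b - f)
y = C @ (b - A @ lam)

# The same (y, lam) solves the symmetric block (KKT) system.
K = np.block([[Cinv, A], [A.T, np.zeros((n, n))]])
sol = np.linalg.solve(K, np.concatenate([b, f]))
assert np.allclose(sol, np.concatenate([y, lam]))
assert np.allclose(A.T @ y, f)   # the constraint is satisfied
```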


Indeed, Strang shows that the equilibrium equations of many physical systems can be put in the above form.

In order to prove that our constrained minimization problem has a unique solution, we proceed to prove that the constrained minimization of $Q(y)$ subject to $A^\top y = f$ is equivalent to the unconstrained maximization of another function $-P(\lambda)$.

We get $P(\lambda)$ by minimizing the Lagrangian $L(y, \lambda)$ treated as a function of $y$ alone.

Since $C^{-1}$ is symmetric positive definite and
\[
L(y, \lambda) = \frac{1}{2} y^\top C^{-1} y - (b - A\lambda)^\top y - \lambda^\top f,
\]
by Proposition 16.2 the global minimum (with respect to $y$) of $L(y, \lambda)$ is obtained for the solution $y$ of
\[
C^{-1} y = b - A\lambda,
\]
and the minimum of $L(y, \lambda)$ is
\[
\min_y L(y, \lambda) = -\frac{1}{2}(A\lambda - b)^\top C (A\lambda - b) - \lambda^\top f.
\]


Letting
\[
P(\lambda) = \frac{1}{2}(A\lambda - b)^\top C (A\lambda - b) + \lambda^\top f,
\]
we claim that the solution of the constrained minimization of $Q(y)$ subject to $A^\top y = f$ is equivalent to the unconstrained maximization of $-P(\lambda)$.

In order to prove that the unique minimum of the constrained problem $Q(y)$ subject to $A^\top y = f$ is the unique maximum of $-P(\lambda)$, we compute $Q(y) + P(\lambda)$.

Proposition 16.3. The quadratic constrained minimization problem of Definition 16.3 has a unique solution $(y, \lambda)$ given by the system
\[
\begin{pmatrix} C^{-1} & A \\ A^\top & 0 \end{pmatrix}
\begin{pmatrix} y \\ \lambda \end{pmatrix}
=
\begin{pmatrix} b \\ f \end{pmatrix}.
\]
Furthermore, the component $\lambda$ of the above solution is the unique value for which $-P(\lambda)$ is maximum.


Remarks:

(1) There is a form of duality going on in this situation. The constrained minimization of $Q(y)$ subject to $A^\top y = f$ is called the primal problem, and the unconstrained maximization of $-P(\lambda)$ is called the dual problem. Duality is the fact, stated slightly loosely, that
\[
\min_y Q(y) = \max_\lambda -P(\lambda).
\]

Recalling that $e = b - A\lambda$, since
\[
P(\lambda) = \frac{1}{2}(A\lambda - b)^\top C (A\lambda - b) + \lambda^\top f,
\]
we can also write
\[
P(\lambda) = \frac{1}{2} e^\top C e + \lambda^\top f.
\]

This expression often represents the total potential energy of a system. Again, the optimal solution is the one that minimizes the potential energy (and thus maximizes $-P(\lambda)$).
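The duality equation $\min_y Q(y) = \max_\lambda -P(\lambda)$ can be observed numerically. In the following sketch (random data of our own invention), the primal-dual pair is read off the block system, and the two optimal values coincide:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 2
M = rng.standard_normal((m, m))
C = M @ M.T + np.eye(m)          # symmetric positive definite
A = rng.standard_normal((m, n))  # full column rank n
b = rng.standard_normal(m)
f = rng.standard_normal(n)

Cinv = np.linalg.inv(C)
Q = lambda y: 0.5 * y @ Cinv @ y - b @ y
P = lambda lam: 0.5 * (A @ lam - b) @ C @ (A @ lam - b) + lam @ f

# Primal-dual pair (y, lambda) from the symmetric block system.
K = np.block([[Cinv, A], [A.T, np.zeros((n, n))]])
sol = np.linalg.solve(K, np.concatenate([b, f]))
y, lam = sol[:m], sol[m:]

# Strong duality: min Q(y) subject to A^T y = f equals max -P(lambda).
assert np.allclose(A.T @ y, f)
assert np.isclose(Q(y), -P(lam))
```

At the optimum one in fact has $Q(y) + P(\lambda) = 0$, which is exactly the computation used in the proof of Proposition 16.3.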


(2) It is immediately verified that the equations of Proposition 16.3 are equivalent to the equations stating that the partial derivatives of the Lagrangian $L(y, \lambda)$ are null:
\begin{align*}
\frac{\partial L}{\partial y_i} &= 0, \quad i = 1, \ldots, m,\\
\frac{\partial L}{\partial \lambda_j} &= 0, \quad j = 1, \ldots, n.
\end{align*}

Thus, the constrained minimum of $Q(y)$ subject to $A^\top y = f$ is an extremum of the Lagrangian $L(y, \lambda)$. As we showed in Proposition 16.3, this extremum corresponds to simultaneously minimizing $L(y, \lambda)$ with respect to $y$ and maximizing $L(y, \lambda)$ with respect to $\lambda$. Geometrically, such a point is a saddle point for $L(y, \lambda)$.

(3) The Lagrange multipliers sometimes have a natural physical meaning.


Going back to the constrained minimization of $Q(y_1, y_2) = \frac{1}{2}\left(y_1^2 + y_2^2\right)$ subject to
\[
2y_1 - y_2 = 5,
\]
the Lagrangian is
\[
L(y_1, y_2, \lambda) = \frac{1}{2}\left(y_1^2 + y_2^2\right) + \lambda(2y_1 - y_2 - 5),
\]
and the equations stating that the Lagrangian has a saddle point are
\begin{align*}
y_1 + 2\lambda &= 0,\\
y_2 - \lambda &= 0,\\
2y_1 - y_2 - 5 &= 0.
\end{align*}

We obtain the solution $(y_1, y_2, \lambda) = (2, -1, -1)$.
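The saddle-point equations above form a small linear system, which can be solved directly (a NumPy one-liner, included here only as a check of the stated solution):

```python
import numpy as np

# Saddle-point equations for L(y1, y2, lambda):
#   y1        + 2*lam = 0
#        y2   -   lam = 0
#   2*y1 - y2         = 5
K = np.array([[1.0,  0.0,  2.0],
              [0.0,  1.0, -1.0],
              [2.0, -1.0,  0.0]])
rhs = np.array([0.0, 0.0, 5.0])

y1, y2, lam = np.linalg.solve(K, rhs)
assert np.allclose([y1, y2, lam], [2.0, -1.0, -1.0])
```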


16.2 Quadratic Optimization: The General Case

In this section, we complete the study initiated in Section 16.1 and give necessary and sufficient conditions for the quadratic function $\frac{1}{2} x^\top A x + x^\top b$ to have a global minimum.

We begin with the following simple fact:

Proposition 16.4. If $A$ is an invertible symmetric matrix, then the function
\[
f(x) = \frac{1}{2} x^\top A x + x^\top b
\]
has a minimum value iff $A \succeq 0$, in which case this optimal value is obtained for a unique value of $x$, namely $x^* = -A^{-1}b$, and with
\[
f(-A^{-1}b) = -\frac{1}{2} b^\top A^{-1} b.
\]


Let us now consider the case of an arbitrary symmetric matrix $A$.

Proposition 16.5. If $A$ is a symmetric matrix, then the function
\[
f(x) = \frac{1}{2} x^\top A x + x^\top b
\]
has a minimum value iff $A \succeq 0$ and $(I - AA^+)b = 0$, in which case this minimum value is
\[
p^* = -\frac{1}{2} b^\top A^+ b.
\]
Furthermore, if $A = U^\top \Sigma U$ is an SVD of $A$, then the optimal value is achieved by all $x \in \mathbb{R}^n$ of the form
\[
x = -A^+ b + U^\top \begin{pmatrix} 0 \\ z \end{pmatrix},
\]
for any $z \in \mathbb{R}^{n-r}$, where $r$ is the rank of $A$.
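Proposition 16.5 can be explored numerically with the Moore-Penrose pseudo-inverse. The sketch below (our own random instance) builds a singular positive semidefinite $A$, forces $b$ into the range of $A$ so the condition $(I - AA^+)b = 0$ holds, and checks that the minimum is attained on a whole affine set:

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 5, 3

# Singular symmetric positive semidefinite A of rank r.
M = rng.standard_normal((n, r))
A = M @ M.T
b = A @ rng.standard_normal(n)     # force b into the range of A

Aplus = np.linalg.pinv(A)
assert np.allclose((np.eye(n) - A @ Aplus) @ b, 0)  # solvability condition

f = lambda x: 0.5 * x @ A @ x + x @ b
x_star = -Aplus @ b
p_star = -0.5 * b @ Aplus @ b
assert np.isclose(f(x_star), p_star)

# The minimum is attained on a whole affine set: adding any vector from
# the kernel of A leaves f unchanged.
w, V = np.linalg.eigh(A)
null = V[:, np.abs(w) < 1e-10]     # kernel basis (n - r columns)
z = rng.standard_normal(null.shape[1])
assert np.isclose(f(x_star + null @ z), p_star)
```

The kernel directions here play the role of the $U^\top (0, z)^\top$ term in the proposition.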


The case in which we add either linear constraints of the form $C^\top x = 0$ or affine constraints of the form $C^\top x = t$ (where $t \neq 0$) can be reduced to the unconstrained case using a QR-decomposition of $C$ or $N$.

Let us show how to do this for linear constraints of the form $C^\top x = 0$.

If we use a QR decomposition of $C$, by permuting the columns, we may assume that
\[
C = Q^\top \begin{pmatrix} R & S \\ 0 & 0 \end{pmatrix} \Pi,
\]
where $R$ is an $r \times r$ invertible upper triangular matrix and $S$ is an $r \times (m - r)$ matrix ($C$ has rank $r$).


Then, if we let
\[
x = Q^\top \begin{pmatrix} y \\ z \end{pmatrix},
\]
where $y \in \mathbb{R}^r$ and $z \in \mathbb{R}^{n-r}$, then, after some calculations, our original problem becomes

minimize $\displaystyle \frac{1}{2}(y^\top, z^\top)\, Q A Q^\top \begin{pmatrix} y \\ z \end{pmatrix} + (y^\top, z^\top)\, Q b$
subject to $y = 0$, $y \in \mathbb{R}^r$, $z \in \mathbb{R}^{n-r}$.

Thus, the constraint $C^\top x = 0$ has been eliminated, and if we write
\[
Q A Q^\top = \begin{pmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{pmatrix}
\]
and
\[
Q b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}, \quad b_1 \in \mathbb{R}^r, \; b_2 \in \mathbb{R}^{n-r},
\]
our problem becomes

minimize $\displaystyle \frac{1}{2} z^\top G_{22} z + z^\top b_2$, $z \in \mathbb{R}^{n-r}$,

the problem solved in Proposition 16.5.
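The elimination above can be carried out directly in NumPy. The sketch below (our own random instance) is restricted to the full-rank case $r = m$, where no column permutation $\Pi$ is needed and a complete QR factorization of $C$ suffices; the reduced solution is cross-checked against the KKT system of the original constrained problem.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 6, 2

M = rng.standard_normal((n, n))
A = M @ M.T + np.eye(n)            # symmetric positive definite, for simplicity
b = rng.standard_normal(n)
C = rng.standard_normal((n, m))    # full rank m, so r = m

# Complete QR: C = Qt [[R], [0]] with Qt orthogonal; take Q = Qt^T.
Qt, R = np.linalg.qr(C, mode='complete')
Q = Qt.T

G = Q @ A @ Q.T
G22 = G[m:, m:]
b2 = (Q @ b)[m:]

# Reduced unconstrained problem: minimize 1/2 z^T G22 z + z^T b2.
z = np.linalg.solve(G22, -b2)
x = Q.T @ np.concatenate([np.zeros(m), z])

assert np.allclose(C.T @ x, 0)     # the constraint has been eliminated

# Cross-check against the KKT system of the original constrained problem.
K = np.block([[A, C], [C.T, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([-b, np.zeros(m)]))
assert np.allclose(x, sol[:n])
```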


Constraints of the form $C^\top x = t$ (where $t \neq 0$) can be handled in a similar fashion.

In this case, we may assume that $C$ is an $n \times m$ matrix with full rank (so that $m \leq n$) and $t \in \mathbb{R}^m$.


16.3 Maximizing a Quadratic Function on the Unit Sphere

In this section we discuss various quadratic optimization problems mostly arising from computer vision (image segmentation and contour grouping).

These problems can be reduced to the following basic optimization problem: given an $n \times n$ real symmetric matrix $A$,

maximize $x^\top A x$
subject to $x^\top x = 1$, $x \in \mathbb{R}^n$.

In view of Proposition 15.6, the maximum value of $x^\top A x$ on the unit sphere is equal to the largest eigenvalue $\lambda_1$ of the matrix $A$, and it is achieved for any unit eigenvector $u_1$ associated with $\lambda_1$.
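This characterization is one line of NumPy (random symmetric test matrix of our own; `eigh` returns eigenvalues in ascending order, so the last pair is the relevant one):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                     # real symmetric

# Largest eigenvalue and a unit eigenvector.
w, V = np.linalg.eigh(A)
lam1, u1 = w[-1], V[:, -1]
assert np.isclose(u1 @ A @ u1, lam1)  # the maximum is attained at u1

# No unit vector beats lambda_1.
for _ in range(1000):
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    assert x @ A @ x <= lam1 + 1e-12
```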


A variant of the above problem often encountered in computer vision consists in minimizing $x^\top A x$ on the ellipsoid given by an equation of the form
\[
x^\top B x = 1,
\]
where $B$ is a symmetric positive definite matrix.

Since $B$ is positive definite, it can be diagonalized as
\[
B = Q D Q^\top,
\]
where $Q$ is an orthogonal matrix and $D$ is a diagonal matrix,
\[
D = \mathrm{diag}(d_1, \ldots, d_n),
\]
with $d_i > 0$, for $i = 1, \ldots, n$.

If we define the matrices $B^{1/2}$ and $B^{-1/2}$ by
\[
B^{1/2} = Q\, \mathrm{diag}\!\left(\sqrt{d_1}, \ldots, \sqrt{d_n}\right) Q^\top
\]
and
\[
B^{-1/2} = Q\, \mathrm{diag}\!\left(1/\sqrt{d_1}, \ldots, 1/\sqrt{d_n}\right) Q^\top,
\]
it is clear that these matrices are symmetric, that $B^{-1/2} B B^{-1/2} = I$, and that $B^{1/2}$ and $B^{-1/2}$ are mutual inverses.


Then, if we make the change of variable
\[
x = B^{-1/2} y,
\]
the equation $x^\top B x = 1$ becomes $y^\top y = 1$, and the optimization problem

maximize $x^\top A x$
subject to $x^\top B x = 1$, $x \in \mathbb{R}^n$,

is equivalent to the problem

maximize $y^\top B^{-1/2} A B^{-1/2} y$
subject to $y^\top y = 1$, $y \in \mathbb{R}^n$,

where $y = B^{1/2} x$ and where $B^{-1/2} A B^{-1/2}$ is symmetric.
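The whole reduction, from ellipsoid to sphere and back, can be sketched in a few lines (random instance of our own; $B^{-1/2}$ is built from the eigendecomposition exactly as above):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                    # symmetric
N = rng.standard_normal((n, n))
B = N @ N.T + np.eye(n)              # symmetric positive definite

# B^{-1/2} via the eigendecomposition B = Q D Q^T.
d, Q = np.linalg.eigh(B)
B_inv_half = Q @ np.diag(1 / np.sqrt(d)) @ Q.T

# Reduce the ellipsoid problem to a sphere problem.
Ahat = B_inv_half @ A @ B_inv_half
w, V = np.linalg.eigh(Ahat)
y = V[:, -1]                         # maximizer on the unit sphere
x = B_inv_half @ y                   # pull back: x = B^{-1/2} y

assert np.isclose(x @ B @ x, 1.0)    # x lies on the ellipsoid
assert np.isclose(x @ A @ x, w[-1])  # optimal value = largest eigenvalue of Ahat
```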


The complex version of our basic optimization problem in which $A$ is a Hermitian matrix also arises in computer vision. Namely, given an $n \times n$ complex Hermitian matrix $A$,

maximize $x^* A x$
subject to $x^* x = 1$, $x \in \mathbb{C}^n$.

Again by Proposition 15.6, the maximum value of $x^* A x$ on the unit sphere is equal to the largest eigenvalue $\lambda_1$ of the matrix $A$ and it is achieved for any unit eigenvector $u_1$ associated with $\lambda_1$.

It is worth pointing out that if $A$ is a skew-Hermitian matrix, that is, if $A^* = -A$, then $x^* A x$ is pure imaginary or zero.


In particular, if $A$ is a real matrix and if $A$ is skew-symmetric, then
\[
x^\top A x = 0.
\]

Thus, for any real matrix (symmetric or not),
\[
x^\top A x = x^\top H(A)\, x,
\]
where $H(A) = (A + A^\top)/2$, the symmetric part of $A$.
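This identity amounts to splitting $A$ into its symmetric and skew-symmetric parts, with the skew part contributing nothing to the quadratic form (a two-line check on random data):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
A = rng.standard_normal((n, n))      # arbitrary, not symmetric
H = (A + A.T) / 2                    # symmetric part H(A)
S = (A - A.T) / 2                    # skew-symmetric part

x = rng.standard_normal(n)
assert np.isclose(x @ S @ x, 0.0)        # the skew part contributes nothing
assert np.isclose(x @ A @ x, x @ H @ x)  # the quadratic form only sees H(A)
```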

There are situations in which it is necessary to add linear constraints to the problem of maximizing a quadratic function on the sphere.

This problem was completely solved by Golub [16] (1973).

The problem is the following: given an $n \times n$ real symmetric matrix $A$ and an $n \times p$ matrix $C$,

minimize $x^\top A x$
subject to $x^\top x = 1$, $C^\top x = 0$, $x \in \mathbb{R}^n$.


Golub shows that the linear constraint $C^\top x = 0$ can be eliminated as follows: if we use a QR decomposition of $C$, by permuting the columns, we may assume that
\[
C = Q^\top \begin{pmatrix} R & S \\ 0 & 0 \end{pmatrix} \Pi,
\]
where $R$ is an $r \times r$ invertible upper triangular matrix and $S$ is an $r \times (p - r)$ matrix (assuming $C$ has rank $r$).

Then if we let
\[
x = Q^\top \begin{pmatrix} y \\ z \end{pmatrix},
\]
where $y \in \mathbb{R}^r$ and $z \in \mathbb{R}^{n-r}$, then, after some calculations, our original problem becomes

minimize $\displaystyle (y^\top, z^\top)\, Q A Q^\top \begin{pmatrix} y \\ z \end{pmatrix}$
subject to $z^\top z = 1$, $z \in \mathbb{R}^{n-r}$, $y = 0$, $y \in \mathbb{R}^r$.


Thus, the constraint $C^\top x = 0$ has been eliminated, and if we write
\[
Q A Q^\top = \begin{pmatrix} G_{11} & G_{12} \\ G_{12}^\top & G_{22} \end{pmatrix},
\]
our problem becomes

minimize $z^\top G_{22} z$
subject to $z^\top z = 1$, $z \in \mathbb{R}^{n-r}$,

a standard eigenvalue problem.
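The resulting pipeline (reduce, take the smallest eigenvector of $G_{22}$, pull back) can be sketched as follows, again for the full-rank case $r = p$ where no column permutation is needed (random instance of our own):

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 6, 2
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                       # real symmetric
C = rng.standard_normal((n, p))         # full rank p, so r = p

# Complete QR of C: C = Qt [[R], [0]]; take Q = Qt^T so that Q C = [[R], [0]].
Qt, R = np.linalg.qr(C, mode='complete')
Q = Qt.T

G22 = (Q @ A @ Q.T)[p:, p:]
w, V = np.linalg.eigh(G22)
z = V[:, 0]                             # smallest eigenvalue minimizes z^T G22 z
x = Q.T @ np.concatenate([np.zeros(p), z])

assert np.isclose(np.linalg.norm(x), 1.0)   # x is on the unit sphere
assert np.allclose(C.T @ x, 0)              # the linear constraints hold
assert np.isclose(x @ A @ x, w[0])          # optimal value = smallest eig of G22
```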

Observe that if we let
\[
J = \begin{pmatrix} 0 & 0 \\ 0 & I_{n-r} \end{pmatrix},
\]
then
\[
J Q A Q^\top J = \begin{pmatrix} 0 & 0 \\ 0 & G_{22} \end{pmatrix},
\]
and if we set
\[
P = Q^\top J Q,
\]
then
\[
P A P = Q^\top J Q A Q^\top J Q.
\]


Now, $Q^\top J Q A Q^\top J Q$ and $J Q A Q^\top J$ have the same eigenvalues, so $PAP$ and $J Q A Q^\top J$ also have the same eigenvalues.

It follows that the solutions of our optimization problem are among the eigenvalues of $K = PAP$, and at least $r$ of those are $0$.

Using the fact that $CC^+$ is the projection onto the range of $C$, where $C^+$ is the pseudo-inverse of $C$, it can also be shown that
\[
P = I - CC^+,
\]
the projection onto the kernel of $C^\top$.

In particular, when $n \geq p$ and $C$ has full rank (the columns of $C$ are linearly independent), then we know that $C^+ = (C^\top C)^{-1} C^\top$ and
\[
P = I - C(C^\top C)^{-1} C^\top.
\]

This fact is used by Cour and Shi [10] and implicitly by Yu and Shi [37].
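The three descriptions of this projection, $I - CC^+$, $I - C(C^\top C)^{-1}C^\top$, and $Q^\top J Q$ from the QR construction, can be checked to agree numerically (random full-rank $C$ of our own):

```python
import numpy as np

rng = np.random.default_rng(9)
n, p = 6, 2
C = rng.standard_normal((n, p))         # full column rank

# Two closed-form expressions for the projection onto the kernel of C^T.
P1 = np.eye(n) - C @ np.linalg.pinv(C)
P2 = np.eye(n) - C @ np.linalg.inv(C.T @ C) @ C.T
assert np.allclose(P1, P2)

# P is an orthogonal projection: symmetric, idempotent, annihilates range(C).
assert np.allclose(P1, P1.T)
assert np.allclose(P1 @ P1, P1)
assert np.allclose(P1 @ C, 0)

# It agrees with Q^T J Q from the QR construction.
Qt, _ = np.linalg.qr(C, mode='complete')
Q = Qt.T
J = np.diag(np.concatenate([np.zeros(p), np.ones(n - p)]))
assert np.allclose(Q.T @ J @ Q, P1)
```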


The problem of adding affine constraints of the form $N^\top x = t$, where $t \neq 0$, also comes up in practice.

At first glance, this problem may not seem harder than the linear problem in which $t = 0$, but it is.

This problem was extensively studied in a paper by Gander, Golub, and von Matt [15] (1989).

Gander, Golub, and von Matt consider the following problem:

Given an $(n + m) \times (n + m)$ real symmetric matrix $A$ (with $n > 0$), an $(n + m) \times m$ matrix $N$ with full rank, and a nonzero vector $t \in \mathbb{R}^m$ with $\left\|(N^\top)^\dagger t\right\| < 1$ (where $(N^\top)^\dagger$ denotes the pseudo-inverse of $N^\top$),

minimize $x^\top A x$
subject to $x^\top x = 1$, $N^\top x = t$, $x \in \mathbb{R}^{n+m}$.

The condition $\left\|(N^\top)^\dagger t\right\| < 1$ ensures that the problem has a solution and is not trivial.


The authors begin by proving that the affine constraint $N^\top x = t$ can be eliminated.

One way to do so is to use a QR decomposition of $N$.

It turns out that we get a simplified problem of the form

minimize $z^\top C z + 2 z^\top b$
subject to $z^\top z = s^2$, $z \in \mathbb{R}^m$.

Unfortunately, if $b \neq 0$, Proposition 15.6 is no longer applicable.

It is still possible to find the minimum of the function $z^\top C z + 2 z^\top b$ using Lagrange multipliers, but such a solution is too involved to be presented here.

Interested readers will find a thorough discussion in Gander, Golub, and von Matt [15].


Bibliography

[1] Michael Artin. Algebra. Prentice Hall, first edition, 1991.

[2] Marcel Berger. Géométrie 1. Nathan, 1990. English edition: Geometry 1, Universitext, Springer Verlag.

[3] Marcel Berger. Géométrie 2. Nathan, 1990. English edition: Geometry 2, Universitext, Springer Verlag.

[4] J.E. Bertin. Algèbre linéaire et géométrie classique. Masson, first edition, 1981.

[5] Nicolas Bourbaki. Algèbre, Chapitres 1-3. Éléments de Mathématiques. Hermann, 1970.

[6] Nicolas Bourbaki. Algèbre, Chapitres 4-7. Éléments de Mathématiques. Masson, 1981.

[7] Nicolas Bourbaki. Espaces Vectoriels Topologiques. Éléments de Mathématiques. Masson, 1981.

[8] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, first edition, 2004.


[9] P.G. Ciarlet. Introduction to Numerical Matrix Analysis and Optimization. Cambridge University Press, first edition, 1989. French edition: Masson, 1994.

[10] Timothee Cour and Jianbo Shi. Solving Markov random fields with spectral relaxation. In Marina Meila and Xiaotong Shen, editors, Artificial Intelligence and Statistics. Society for Artificial Intelligence and Statistics, 2007.

[11] James W. Demmel. Applied Numerical Linear Algebra. SIAM Publications, first edition, 1997.

[12] Jean Dieudonné. Algèbre Linéaire et Géométrie Élémentaire. Hermann, second edition, 1965.

[13] David S. Dummit and Richard M. Foote. Abstract Algebra. Wiley, second edition, 1999.

[14] Jean H. Gallier. Geometric Methods and Applications, For Computer Science and Engineering. TAM, Vol. 38. Springer, second edition, 2011.

[15] Walter Gander, Gene H. Golub, and Urs von Matt. A constrained eigenvalue problem. Linear Algebra and its Applications, 114/115:815–839, 1989.


[16] Gene H. Golub. Some modified eigenvalue problems. SIAM Review, 15(2):318–334, 1973.

[17] Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University Press, third edition, 1996.

[18] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, second edition, 2009.

[19] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, first edition, 1990.

[20] Roger A. Horn and Charles R. Johnson. Topics in Matrix Analysis. Cambridge University Press, first edition, 1994.

[21] Kenneth Hoffman and Ray Kunze. Linear Algebra. Prentice Hall, second edition, 1971.

[22] D. Kincaid and W. Cheney. Numerical Analysis. Brooks/Cole Publishing, second edition, 1996.

[23] Serge Lang. Algebra. Addison Wesley, third edition, 1993.


[24] Serge Lang. Real and Functional Analysis. GTM 142. Springer Verlag, third edition, 1996.

[25] Peter Lax. Linear Algebra and Its Applications. Wiley, second edition, 2007.

[26] Saunders Mac Lane and Garrett Birkhoff. Algebra. Macmillan, first edition, 1967.

[27] Carl D. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, first edition, 2000.

[28] Laurent Schwartz. Analyse I. Théorie des Ensembles et Topologie. Collection Enseignement des Sciences. Hermann, 1991.

[29] Denis Serre. Matrices, Theory and Applications. GTM No. 216. Springer Verlag, second edition, 2010.

[30] G.W. Stewart. On the early history of the singular value decomposition. SIAM Review, 35(4):551–566, 1993.

[31] Gilbert Strang. Introduction to Applied Mathematics. Wellesley-Cambridge Press, first edition, 1986.

[32] Gilbert Strang. Linear Algebra and its Applications. Saunders HBJ, third edition, 1988.

[33] Claude Tisseron. Géométries affines, projectives, et euclidiennes. Hermann, first edition, 1994.


[34] L.N. Trefethen and D. Bau III. Numerical Linear Algebra. SIAM Publications, first edition, 1997.

[35] B.L. Van Der Waerden. Algebra, Vol. 1. Ungar, seventh edition, 1973.

[36] J.H. van Lint and R.M. Wilson. A Course in Combinatorics. Cambridge University Press, second edition, 2001.

[37] Stella X. Yu and Jianbo Shi. Grouping with bias. In Thomas G. Dietterich, Sue Becker, and Zoubin Ghahramani, editors, Neural Information Processing Systems, Vancouver, Canada, 3-8 Dec. 2001. MIT Press, 2001.