Source: cis515/cis515-12-sl14.pdf (CIS 515 lecture slides)


Chapter 14

Quadratic Optimization Problems

14.1 Quadratic Optimization: The Positive Definite Case

In this chapter, we consider two classes of quadratic optimization problems that appear frequently in engineering and in computer science (especially in computer vision):

1. Minimizing
   $$f(x) = \frac{1}{2} x^\top A x + x^\top b$$
   over all $x \in \mathbb{R}^n$, or subject to linear or affine constraints.

2. Minimizing
   $$f(x) = \frac{1}{2} x^\top A x + x^\top b$$
   over the unit sphere.


In both cases, $A$ is a symmetric matrix. We also seek necessary and sufficient conditions for $f$ to have a global minimum.

Many problems in physics and engineering can be stated as the minimization of some energy function, with or without constraints.

Indeed, it is a fundamental principle of mechanics that nature acts so as to minimize energy.

Furthermore, if a physical system is in a stable state of equilibrium, then the energy in that state should be minimal.

The simplest kind of energy function is a quadratic function.


Such functions can be conveniently defined in the form
$$P(x) = x^\top A x - x^\top b,$$
where $A$ is a symmetric $n \times n$ matrix, and $x, b$ are vectors in $\mathbb{R}^n$, viewed as column vectors.

Actually, for reasons that will be clear shortly, it is preferable to put a factor $\frac{1}{2}$ in front of the quadratic term, so that
$$P(x) = \frac{1}{2} x^\top A x - x^\top b.$$

The question is, under what conditions (on $A$) does $P(x)$ have a global minimum, preferably unique?


We give a complete answer to the above question in two stages:

1. In this section, we show that if $A$ is symmetric positive definite, then $P(x)$ has a unique global minimum precisely when
$$Ax = b.$$

2. In Section 14.2, we give necessary and sufficient conditions in the general case, in terms of the pseudo-inverse of $A$.

We begin with the matrix version of Definition 12.2.

Definition 14.1. A symmetric positive definite matrix is a matrix whose eigenvalues are strictly positive, and a symmetric positive semidefinite matrix is a matrix whose eigenvalues are nonnegative.


Equivalent criteria are given in the following proposition.

Proposition 14.1. Given any Euclidean space $E$ of dimension $n$, the following properties hold:

(1) Every self-adjoint linear map $f \colon E \to E$ is positive definite iff
$$\langle x, f(x) \rangle > 0$$
for all $x \in E$ with $x \neq 0$.

(2) Every self-adjoint linear map $f \colon E \to E$ is positive semidefinite iff
$$\langle x, f(x) \rangle \geq 0$$
for all $x \in E$.

Some special notation is customary (especially in the field of convex optimization) to express that a symmetric matrix is positive definite or positive semidefinite.


Definition 14.2. Given any $n \times n$ symmetric matrix $A$, we write $A \succeq 0$ if $A$ is positive semidefinite, and we write $A \succ 0$ if $A$ is positive definite.

It should be noted that we can define the relation
$$A \succeq B$$
between any two $n \times n$ matrices (symmetric or not) iff $A - B$ is symmetric positive semidefinite.

It is easy to check that this relation is actually a partial order on matrices, called the positive semidefinite cone ordering; for details, see Boyd and Vandenberghe [8], Section 2.4.

If $A$ is symmetric positive definite, it is easily checked that $A^{-1}$ is also symmetric positive definite.

Also, if $C$ is a symmetric positive definite $m \times m$ matrix and $A$ is an $m \times n$ matrix of rank $n$ (and so $m \geq n$), then $A^\top C A$ is symmetric positive definite.


We can now prove that
$$P(x) = \frac{1}{2} x^\top A x - x^\top b$$
has a global minimum when $A$ is symmetric positive definite.

Proposition 14.2. Given a quadratic function
$$P(x) = \frac{1}{2} x^\top A x - x^\top b,$$
if $A$ is symmetric positive definite, then $P(x)$ has a unique global minimum for the solution of the linear system $Ax = b$. The minimum value of $P(x)$ is
$$P(A^{-1}b) = -\frac{1}{2} b^\top A^{-1} b.$$
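Proposition 14.2 is easy to check numerically. The sketch below (NumPy; the matrix $A$ and vector $b$ are arbitrary illustrative choices, not from the text) verifies that the minimizer solves $Ax = b$ and that the minimum value is $-\frac{1}{2} b^\top A^{-1} b$:

```python
import numpy as np

# An illustrative symmetric positive definite A and arbitrary b.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def P(x):
    # P(x) = (1/2) x^T A x - x^T b
    return 0.5 * x @ A @ x - x @ b

x_star = np.linalg.solve(A, b)   # the unique minimizer solves Ax = b
p_min = -0.5 * b @ x_star        # -(1/2) b^T A^{-1} b, since x* = A^{-1} b

assert np.isclose(P(x_star), p_min)

# Any perturbation of x* increases P, since P(x* + d) = p_min + (1/2) d^T A d.
rng = np.random.default_rng(0)
for _ in range(100):
    d = rng.standard_normal(2)
    assert P(x_star + 0.1 * d) >= p_min
```

Since $A \succ 0$, the term $\frac{1}{2} d^\top A d$ is strictly positive for $d \neq 0$, which is exactly why the minimum is unique.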


Remarks:

(1) The quadratic function $P(x)$ is also given by
$$P(x) = \frac{1}{2} x^\top A x - b^\top x,$$
but the definition using $x^\top b$ is more convenient for the proof of Proposition 14.2.

(2) If $P(x)$ contains a constant term $c \in \mathbb{R}$, so that
$$P(x) = \frac{1}{2} x^\top A x - x^\top b + c,$$
the proof of Proposition 14.2 still shows that $P(x)$ has a unique global minimum for $x = A^{-1}b$, but the minimal value is
$$P(A^{-1}b) = -\frac{1}{2} b^\top A^{-1} b + c.$$


Thus, when the energy function $P(x)$ of a system is given by a quadratic function
$$P(x) = \frac{1}{2} x^\top A x - x^\top b,$$
where $A$ is symmetric positive definite, finding the global minimum of $P(x)$ is equivalent to solving the linear system $Ax = b$.

Sometimes, it is useful to recast a linear problem $Ax = b$ as a variational problem (finding the minimum of some energy function).

However, very often, a minimization problem comes with extra constraints that must be satisfied for all admissible solutions.


For instance, we may want to minimize the quadratic function
$$Q(y_1, y_2) = \frac{1}{2}\left(y_1^2 + y_2^2\right)$$
subject to the constraint
$$2y_1 - y_2 = 5.$$

The solution for which $Q(y_1, y_2)$ is minimum is no longer $(y_1, y_2) = (0, 0)$, but instead $(y_1, y_2) = (2, -1)$, as will be shown later.

Geometrically, the graph of the function defined by $z = Q(y_1, y_2)$ in $\mathbb{R}^3$ is a paraboloid of revolution $P$ with axis of revolution $Oz$.

The constraint
$$2y_1 - y_2 = 5$$
corresponds to the vertical plane $H$ parallel to the $z$-axis and containing the line of equation $2y_1 - y_2 = 5$ in the $xy$-plane.


Thus, the constrained minimum of $Q$ is located on the parabola that is the intersection of the paraboloid $P$ with the plane $H$.

A nice way to solve constrained minimization problems of the above kind is to use the method of Lagrange multipliers.

Definition 14.3. The quadratic constrained minimization problem consists in minimizing a quadratic function
$$Q(y) = \frac{1}{2} y^\top C^{-1} y - b^\top y$$
subject to the linear constraints
$$A^\top y = f,$$
where $C^{-1}$ is an $m \times m$ symmetric positive definite matrix, $A$ is an $m \times n$ matrix of rank $n$ (so that $m \geq n$), and where $b, y \in \mathbb{R}^m$ (viewed as column vectors) and $f \in \mathbb{R}^n$ (viewed as a column vector).


The reason for using $C^{-1}$ instead of $C$ is that the constrained minimization problem has an interpretation as a set of equilibrium equations in which the matrix that arises naturally is $C$ (see Strang [29]).

Since $C$ and $C^{-1}$ are both symmetric positive definite, this doesn't make any difference, but it seems preferable to stick to Strang's notation.

The method of Lagrange consists in incorporating the $n$ constraints $A^\top y = f$ into the quadratic function $Q(y)$ by introducing extra variables $\lambda = (\lambda_1, \ldots, \lambda_n)$ called Lagrange multipliers, one for each constraint. We form the Lagrangian
$$L(y, \lambda) = Q(y) + \lambda^\top (A^\top y - f) = \frac{1}{2} y^\top C^{-1} y - (b - A\lambda)^\top y - \lambda^\top f.$$


We shall prove that our constrained minimization problem has a unique solution given by the system of linear equations
$$C^{-1} y + A\lambda = b,$$
$$A^\top y = f,$$
which can be written in matrix form as
$$\begin{pmatrix} C^{-1} & A \\ A^\top & 0 \end{pmatrix} \begin{pmatrix} y \\ \lambda \end{pmatrix} = \begin{pmatrix} b \\ f \end{pmatrix}.$$

Note that the matrix of this system is symmetric. Eliminating $y$ from the first equation
$$C^{-1} y + A\lambda = b,$$
we get
$$y = C(b - A\lambda),$$
and substituting into the second equation, we get
$$A^\top C (b - A\lambda) = f,$$
that is,
$$A^\top C A \lambda = A^\top C b - f.$$
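The elimination above can be sketched numerically (NumPy; the data $C$, $A$, $b$, $f$ below are arbitrary illustrative choices with $C$ positive definite and $A$ of full column rank):

```python
import numpy as np

# Illustrative data (m = 3, n = 1): C is symmetric positive definite,
# A has full column rank.
C = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 3.0]])
A = np.array([[1.0], [2.0], [-1.0]])
b = np.array([1.0, 0.0, 2.0])
f = np.array([4.0])
Cinv = np.linalg.inv(C)

# Solve for the Lagrange multipliers first: (A^T C A) lam = A^T C b - f,
lam = np.linalg.solve(A.T @ C @ A, A.T @ C @ b - f)
# then back-substitute: y = C (b - A lam).
y = C @ (b - A @ lam)

# Check against the full symmetric KKT system.
K = np.block([[Cinv, A],
              [A.T, np.zeros((1, 1))]])
sol = np.linalg.solve(K, np.concatenate([b, f]))
assert np.allclose(sol, np.concatenate([y, lam]))
assert np.allclose(A.T @ y, f)   # the constraint A^T y = f holds
```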


However, by a previous remark, since $C$ is symmetric positive definite and the columns of $A$ are linearly independent, $A^\top C A$ is symmetric positive definite, and thus invertible.

Note that this way of solving the system requires solving for the Lagrange multipliers first.

Letting $e = b - A\lambda$, we also note that the system
$$\begin{pmatrix} C^{-1} & A \\ A^\top & 0 \end{pmatrix} \begin{pmatrix} y \\ \lambda \end{pmatrix} = \begin{pmatrix} b \\ f \end{pmatrix}$$
is equivalent to the system
$$e = b - A\lambda,$$
$$y = Ce,$$
$$A^\top y = f.$$

The latter system is called the equilibrium equations by Strang [29].


Indeed, Strang shows that the equilibrium equations of many physical systems can be put in the above form.

In order to prove that our constrained minimization problem has a unique solution, we proceed to prove that the constrained minimization of $Q(y)$ subject to $A^\top y = f$ is equivalent to the unconstrained maximization of another function $-P(\lambda)$.

We get $P(\lambda)$ by minimizing the Lagrangian $L(y, \lambda)$ treated as a function of $y$ alone.

Since $C^{-1}$ is symmetric positive definite and
$$L(y, \lambda) = \frac{1}{2} y^\top C^{-1} y - (b - A\lambda)^\top y - \lambda^\top f,$$
by Proposition 14.2 the global minimum (with respect to $y$) of $L(y, \lambda)$ is obtained for the solution $y$ of
$$C^{-1} y = b - A\lambda,$$
and the minimum of $L(y, \lambda)$ is
$$\min_y L(y, \lambda) = -\frac{1}{2} (A\lambda - b)^\top C (A\lambda - b) - \lambda^\top f.$$


Letting
$$P(\lambda) = \frac{1}{2} (A\lambda - b)^\top C (A\lambda - b) + \lambda^\top f,$$
we claim that the solution of the constrained minimization of $Q(y)$ subject to $A^\top y = f$ is equivalent to the unconstrained maximization of $-P(\lambda)$.

In order to prove that the unique minimum of the constrained problem $Q(y)$ subject to $A^\top y = f$ is the unique maximum of $-P(\lambda)$, we compute $Q(y) + P(\lambda)$.

Proposition 14.3. The quadratic constrained minimization problem of Definition 14.3 has a unique solution $(y, \lambda)$ given by the system
$$\begin{pmatrix} C^{-1} & A \\ A^\top & 0 \end{pmatrix} \begin{pmatrix} y \\ \lambda \end{pmatrix} = \begin{pmatrix} b \\ f \end{pmatrix}.$$

Furthermore, the component $\lambda$ of the above solution is the unique value for which $-P(\lambda)$ is maximum.


Remarks:

(1) There is a form of duality going on in this situation. The constrained minimization of $Q(y)$ subject to $A^\top y = f$ is called the primal problem, and the unconstrained maximization of $-P(\lambda)$ is called the dual problem. Duality is the fact, stated slightly loosely, that
$$\min_y Q(y) = \max_\lambda -P(\lambda).$$

Recalling that $e = b - A\lambda$, since
$$P(\lambda) = \frac{1}{2} (A\lambda - b)^\top C (A\lambda - b) + \lambda^\top f,$$
we can also write
$$P(\lambda) = \frac{1}{2} e^\top C e + \lambda^\top f.$$

This expression often represents the total potential energy of a system. Again, the optimal solution is the one that minimizes the potential energy (and thus maximizes $-P(\lambda)$).
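On small illustrative data (an assumed example, not from the text), the zero duality gap $\min_y Q(y) = \max_\lambda -P(\lambda)$ can be checked numerically:

```python
import numpy as np

# Illustrative primal/dual pair (same shapes as Definition 14.3).
C = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 3.0]])
A = np.array([[1.0], [2.0], [-1.0]])
b = np.array([1.0, 0.0, 2.0])
f = np.array([4.0])
Cinv = np.linalg.inv(C)

# Optimal multipliers and primal solution (from the elimination above).
lam = np.linalg.solve(A.T @ C @ A, A.T @ C @ b - f)
y = C @ (b - A @ lam)

Q_val = 0.5 * y @ Cinv @ y - b @ y   # primal objective Q(y)
e = b - A @ lam
P_val = 0.5 * e @ C @ e + lam @ f    # P(lam)

assert np.allclose(A.T @ y, f)       # y is feasible
assert np.isclose(Q_val, -P_val)     # zero duality gap: min Q = max (-P)
```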


(2) It is immediately verified that the equations of Proposition 14.3 are equivalent to the equations stating that the partial derivatives of the Lagrangian $L(y, \lambda)$ are null:
$$\frac{\partial L}{\partial y_i} = 0, \quad i = 1, \ldots, m,$$
$$\frac{\partial L}{\partial \lambda_j} = 0, \quad j = 1, \ldots, n.$$

Thus, the constrained minimum of $Q(y)$ subject to $A^\top y = f$ is an extremum of the Lagrangian $L(y, \lambda)$. As we showed in Proposition 14.3, this extremum corresponds to simultaneously minimizing $L(y, \lambda)$ with respect to $y$ and maximizing $L(y, \lambda)$ with respect to $\lambda$. Geometrically, such a point is a saddle point for $L(y, \lambda)$.

(3) The Lagrange multipliers sometimes have a natural physical meaning.


Going back to the constrained minimization of $Q(y_1, y_2) = \frac{1}{2}(y_1^2 + y_2^2)$ subject to
$$2y_1 - y_2 = 5,$$
the Lagrangian is
$$L(y_1, y_2, \lambda) = \frac{1}{2}\left(y_1^2 + y_2^2\right) + \lambda(2y_1 - y_2 - 5),$$
and the equations stating that the Lagrangian has a saddle point are
$$y_1 + 2\lambda = 0,$$
$$y_2 - \lambda = 0,$$
$$2y_1 - y_2 - 5 = 0.$$

We obtain the solution $(y_1, y_2, \lambda) = (2, -1, -1)$.
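This solution is easy to verify: the saddle-point equations form a small linear system, sketched here with NumPy:

```python
import numpy as np

# Saddle-point equations for L(y1, y2, lam) = (1/2)(y1^2 + y2^2) + lam(2 y1 - y2 - 5):
#   y1 + 2 lam = 0,   y2 - lam = 0,   2 y1 - y2 - 5 = 0.
M = np.array([[1.0, 0.0,  2.0],
              [0.0, 1.0, -1.0],
              [2.0, -1.0, 0.0]])
rhs = np.array([0.0, 0.0, 5.0])
y1, y2, lam = np.linalg.solve(M, rhs)
assert np.allclose((y1, y2, lam), (2.0, -1.0, -1.0))
```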


14.2 Quadratic Optimization: The General Case

In this section, we complete the study initiated in Section 14.1 and give necessary and sufficient conditions for the quadratic function $\frac{1}{2} x^\top A x + x^\top b$ to have a global minimum.

We begin with the following simple fact:

Proposition 14.4. If $A$ is an invertible symmetric matrix, then the function
$$f(x) = \frac{1}{2} x^\top A x + x^\top b$$
has a minimum value iff $A \succeq 0$ (equivalently, since $A$ is invertible, iff $A \succ 0$), in which case this optimal value is obtained for a unique value of $x$, namely $x^* = -A^{-1}b$, and with
$$f(-A^{-1}b) = -\frac{1}{2} b^\top A^{-1} b.$$


Let us now consider the case of an arbitrary symmetric matrix $A$.

Proposition 14.5. If $A$ is a symmetric matrix, then the function
$$f(x) = \frac{1}{2} x^\top A x + x^\top b$$
has a minimum value iff $A \succeq 0$ and $(I - AA^+)b = 0$, in which case this minimum value is
$$p^* = -\frac{1}{2} b^\top A^+ b.$$

Furthermore, if $A = U^\top \Sigma U$ is an SVD of $A$, then the optimal value is achieved by all $x \in \mathbb{R}^n$ of the form
$$x = -A^+ b + U^\top \begin{pmatrix} 0 \\ z \end{pmatrix},$$
for any $z \in \mathbb{R}^{n-r}$, where $r$ is the rank of $A$.
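A small numerical illustration of Proposition 14.5 (NumPy; the singular positive semidefinite matrix $A$ and the vector $b$ below are illustrative choices, with $b$ deliberately placed in the range of $A$):

```python
import numpy as np

# An illustrative singular symmetric positive semidefinite A,
# with b in the range of A (so the minimum exists).
A = np.array([[2.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])
b = np.array([2.0, -1.0, 0.0])
Ap = np.linalg.pinv(A)

# Solvability condition (I - A A^+) b = 0.
assert np.allclose((np.eye(3) - A @ Ap) @ b, 0)

def f(x):
    # f(x) = (1/2) x^T A x + x^T b
    return 0.5 * x @ A @ x + x @ b

x_star = -Ap @ b
p_star = -0.5 * b @ Ap @ b
assert np.isclose(f(x_star), p_star)

# The minimizer is not unique: adding any kernel vector of A leaves f unchanged.
z = np.array([0.0, 0.0, 7.0])
assert np.isclose(f(x_star + z), p_star)
```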


The case in which we add either linear constraints of the form $C^\top x = 0$ or affine constraints of the form $C^\top x = t$ (where $t \neq 0$) can be reduced to the unconstrained case using a QR-decomposition of $C$ or $N$.

Let us show how to do this for linear constraints of the form $C^\top x = 0$.

If we use a QR decomposition of $C$, by permuting the columns, we may assume that
$$C = Q^\top \begin{pmatrix} R & S \\ 0 & 0 \end{pmatrix} \Pi,$$
where $R$ is an $r \times r$ invertible upper triangular matrix and $S$ is an $r \times (m - r)$ matrix ($C$ has rank $r$).


Then, if we let
$$x = Q^\top \begin{pmatrix} y \\ z \end{pmatrix},$$
where $y \in \mathbb{R}^r$ and $z \in \mathbb{R}^{n-r}$, then, after some calculations, our original problem becomes
$$\text{minimize} \quad \frac{1}{2} (y^\top, z^\top)\, Q A Q^\top \begin{pmatrix} y \\ z \end{pmatrix} + (y^\top, z^\top)\, Qb$$
$$\text{subject to} \quad y = 0, \ y \in \mathbb{R}^r, \ z \in \mathbb{R}^{n-r}.$$

Thus, the constraint $C^\top x = 0$ has been eliminated, and if we write
$$Q A Q^\top = \begin{pmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{pmatrix}$$
and
$$Qb = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}, \quad b_1 \in \mathbb{R}^r, \ b_2 \in \mathbb{R}^{n-r},$$
our problem becomes
$$\text{minimize} \quad \frac{1}{2} z^\top G_{22} z + z^\top b_2, \quad z \in \mathbb{R}^{n-r},$$
the problem solved in Proposition 14.5.


Constraints of the form $C^\top x = t$ (where $t \neq 0$) can be handled in a similar fashion.

In this case, we may assume that $C$ is an $n \times m$ matrix with full rank (so that $m \leq n$) and $t \in \mathbb{R}^m$.


14.3 Maximizing a Quadratic Function on the Unit Sphere

In this section we discuss various quadratic optimization problems mostly arising from computer vision (image segmentation and contour grouping).

These problems can be reduced to the following basic optimization problem: Given an $n \times n$ real symmetric matrix $A$,
$$\text{maximize} \quad x^\top A x$$
$$\text{subject to} \quad x^\top x = 1, \ x \in \mathbb{R}^n.$$

In view of Proposition 13.6, the maximum value of $x^\top A x$ on the unit sphere is equal to the largest eigenvalue $\lambda_1$ of the matrix $A$, and it is achieved for any unit eigenvector $u_1$ associated with $\lambda_1$.
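This fact can be sketched numerically (NumPy; a random symmetric matrix is used purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                 # a random symmetric matrix

w, V = np.linalg.eigh(A)          # eigenvalues in ascending order
lam1 = w[-1]                      # largest eigenvalue
u1 = V[:, -1]                     # associated unit eigenvector

assert np.isclose(u1 @ A @ u1, lam1)   # the maximum is attained at u1

# No random unit vector exceeds lam1.
for _ in range(200):
    x = rng.standard_normal(5)
    x /= np.linalg.norm(x)
    assert x @ A @ x <= lam1 + 1e-12
```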


A variant of the above problem often encountered in computer vision consists in minimizing $x^\top A x$ on the ellipsoid given by an equation of the form
$$x^\top B x = 1,$$
where $B$ is a symmetric positive definite matrix.

Since $B$ is positive definite, it can be diagonalized as
$$B = Q D Q^\top,$$
where $Q$ is an orthogonal matrix and $D$ is a diagonal matrix,
$$D = \mathrm{diag}(d_1, \ldots, d_n),$$
with $d_i > 0$ for $i = 1, \ldots, n$.

If we define the matrices $B^{1/2}$ and $B^{-1/2}$ by
$$B^{1/2} = Q\, \mathrm{diag}\left(\sqrt{d_1}, \ldots, \sqrt{d_n}\right) Q^\top$$
and
$$B^{-1/2} = Q\, \mathrm{diag}\left(1/\sqrt{d_1}, \ldots, 1/\sqrt{d_n}\right) Q^\top,$$
it is clear that these matrices are symmetric, that $B^{-1/2} B B^{-1/2} = I$, and that $B^{1/2}$ and $B^{-1/2}$ are mutual inverses.


Then, if we make the change of variable
$$x = B^{-1/2} y,$$
the equation $x^\top B x = 1$ becomes $y^\top y = 1$, and the optimization problem
$$\text{maximize} \quad x^\top A x$$
$$\text{subject to} \quad x^\top B x = 1, \ x \in \mathbb{R}^n,$$
is equivalent to the problem
$$\text{maximize} \quad y^\top B^{-1/2} A B^{-1/2} y$$
$$\text{subject to} \quad y^\top y = 1, \ y \in \mathbb{R}^n,$$
where $y = B^{1/2} x$ and where $B^{-1/2} A B^{-1/2}$ is symmetric.
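The change of variable can be sketched as follows (NumPy; $A$ and $B$ below are random illustrative matrices, with $B$ made positive definite by construction):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                   # symmetric
B = M @ M.T + 4.0 * np.eye(4)       # symmetric positive definite

# B^{-1/2} from the eigendecomposition B = Q D Q^T.
d, Q = np.linalg.eigh(B)
B_inv_half = Q @ np.diag(1.0 / np.sqrt(d)) @ Q.T

# Reduce to the unit sphere: maximize y^T (B^{-1/2} A B^{-1/2}) y with y^T y = 1.
T = B_inv_half @ A @ B_inv_half
w, V = np.linalg.eigh(T)
y1 = V[:, -1]
x1 = B_inv_half @ y1                # optimal x = B^{-1/2} y

assert np.isclose(x1 @ B @ x1, 1.0)      # x1 lies on the ellipsoid
assert np.isclose(x1 @ A @ x1, w[-1])    # attains the largest eigenvalue of T
```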


The complex version of our basic optimization problem, in which $A$ is a Hermitian matrix, also arises in computer vision. Namely, given an $n \times n$ complex Hermitian matrix $A$,
$$\text{maximize} \quad x^* A x$$
$$\text{subject to} \quad x^* x = 1, \ x \in \mathbb{C}^n.$$

Again by Proposition 13.6, the maximum value of $x^* A x$ on the unit sphere is equal to the largest eigenvalue $\lambda_1$ of the matrix $A$, and it is achieved for any unit eigenvector $u_1$ associated with $\lambda_1$.

It is worth pointing out that if $A$ is a skew-Hermitian matrix, that is, if $A^* = -A$, then $x^* A x$ is pure imaginary or zero.


In particular, if $A$ is a real matrix and if $A$ is skew-symmetric, then
$$x^\top A x = 0.$$

Thus, for any real matrix (symmetric or not),
$$x^\top A x = x^\top H(A)\, x,$$
where $H(A) = (A + A^\top)/2$ is the symmetric part of $A$.

There are situations in which it is necessary to add linear constraints to the problem of maximizing a quadratic function on the sphere.

This problem was completely solved by Golub [15] (1973).

The problem is the following: Given an $n \times n$ real symmetric matrix $A$ and an $n \times p$ matrix $C$,
$$\text{minimize} \quad x^\top A x$$
$$\text{subject to} \quad x^\top x = 1, \ C^\top x = 0, \ x \in \mathbb{R}^n.$$


Golub shows that the linear constraint $C^\top x = 0$ can be eliminated as follows: If we use a QR decomposition of $C$, by permuting the columns, we may assume that
$$C = Q^\top \begin{pmatrix} R & S \\ 0 & 0 \end{pmatrix} \Pi,$$
where $R$ is an $r \times r$ invertible upper triangular matrix and $S$ is an $r \times (p - r)$ matrix (assuming $C$ has rank $r$).

Then if we let
$$x = Q^\top \begin{pmatrix} y \\ z \end{pmatrix},$$
where $y \in \mathbb{R}^r$ and $z \in \mathbb{R}^{n-r}$, then, after some calculations, our original problem becomes
$$\text{minimize} \quad (y^\top, z^\top)\, Q A Q^\top \begin{pmatrix} y \\ z \end{pmatrix}$$
$$\text{subject to} \quad z^\top z = 1, \ z \in \mathbb{R}^{n-r},$$
$$\qquad\qquad\quad y = 0, \ y \in \mathbb{R}^r.$$


Thus, the constraint $C^\top x = 0$ has been eliminated, and if we write
$$Q A Q^\top = \begin{pmatrix} G_{11} & G_{12} \\ G_{12}^\top & G_{22} \end{pmatrix},$$
our problem becomes
$$\text{minimize} \quad z^\top G_{22} z$$
$$\text{subject to} \quad z^\top z = 1, \ z \in \mathbb{R}^{n-r},$$
a standard eigenvalue problem.
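A sketch of this elimination (NumPy; note that `numpy.linalg.qr` returns a factorization $C = \tilde{Q}\tilde{R}$, so we take $Q = \tilde{Q}^\top$ to match the $C = Q^\top(\cdots)$ convention above; the data are random illustrative choices, generically of full rank, so no column permutation is needed):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 5, 2
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                    # random symmetric A
C = rng.standard_normal((n, p))      # generically of rank p

# Full QR: C = Qf @ Rf with Qf orthogonal and rows p+1..n of Rf zero.
Qf, Rf = np.linalg.qr(C, mode='complete')
Q = Qf.T                             # so that C = Q^T [[R], [0]]

G = Q @ A @ Q.T
G22 = G[p:, p:]

# Minimize z^T G22 z on the unit sphere: take the smallest eigenvalue of G22.
w, V = np.linalg.eigh(G22)
z = V[:, 0]
x = Q.T @ np.concatenate([np.zeros(p), z])

assert np.allclose(C.T @ x, 0, atol=1e-10)   # constraint satisfied
assert np.isclose(x @ x, 1.0)                # on the unit sphere
assert np.isclose(x @ A @ x, w[0])           # attains the optimal value
```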

Observe that if we let
$$J = \begin{pmatrix} 0 & 0 \\ 0 & I_{n-r} \end{pmatrix},$$
then
$$J Q A Q^\top J = \begin{pmatrix} 0 & 0 \\ 0 & G_{22} \end{pmatrix},$$
and if we set
$$P = Q^\top J Q,$$
then
$$P A P = Q^\top J Q A Q^\top J Q.$$


Now, $Q^\top J Q A Q^\top J Q$ and $J Q A Q^\top J$ have the same eigenvalues, so $PAP$ and $J Q A Q^\top J$ also have the same eigenvalues.

It follows that the solutions of our optimization problem are among the eigenvalues of $K = PAP$, and at least $r$ of those are $0$.

Using the fact that $CC^+$ is the projection onto the range of $C$, where $C^+$ is the pseudo-inverse of $C$, it can also be shown that
$$P = I - CC^+,$$
the projection onto the kernel of $C^\top$.

In particular, when $n \geq p$ and $C$ has full rank (the columns of $C$ are linearly independent), then we know that $C^+ = (C^\top C)^{-1} C^\top$ and
$$P = I - C(C^\top C)^{-1} C^\top.$$

This fact is used by Cour and Shi [10] and implicitly by Yu and Shi [34].
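The claimed properties of the projection $P$ can be verified numerically (NumPy; $C$ is a random, generically full-rank, illustrative matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 6, 2
C = rng.standard_normal((n, p))     # generically of full column rank

P = np.eye(n) - C @ np.linalg.inv(C.T @ C) @ C.T

assert np.allclose(P @ P, P)                 # idempotent
assert np.allclose(P, P.T)                   # symmetric
assert np.allclose(C.T @ P, 0, atol=1e-10)   # projects onto ker(C^T)

# Agrees with I - C C^+ computed via the pseudo-inverse.
assert np.allclose(P, np.eye(n) - C @ np.linalg.pinv(C))
```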


The problem of adding affine constraints of the form $N^\top x = t$, where $t \neq 0$, also comes up in practice.

At first glance, this problem may not seem harder than the linear problem in which $t = 0$, but it is.

This problem was extensively studied in a paper by Gander, Golub, and von Matt [14] (1989).

Gander, Golub, and von Matt consider the following problem:

Given an $(n + m) \times (n + m)$ real symmetric matrix $A$ (with $n > 0$), an $(n + m) \times m$ matrix $N$ with full rank, and a nonzero vector $t \in \mathbb{R}^m$ with $\left\|(N^\top)^\dagger t\right\| < 1$ (where $(N^\top)^\dagger$ denotes the pseudo-inverse of $N^\top$),
$$\text{minimize} \quad x^\top A x$$
$$\text{subject to} \quad x^\top x = 1, \ N^\top x = t, \ x \in \mathbb{R}^{n+m}.$$

The condition $\left\|(N^\top)^\dagger t\right\| < 1$ ensures that the problem has a solution and is not trivial.


The authors begin by proving that the affine constraint $N^\top x = t$ can be eliminated.

One way to do so is to use a QR decomposition of $N$.

It turns out that we get a simplified problem of the form
$$\text{minimize} \quad z^\top C z + 2 z^\top b$$
$$\text{subject to} \quad z^\top z = s^2, \ z \in \mathbb{R}^m.$$

Unfortunately, if $b \neq 0$, Proposition 13.6 is no longer applicable.

It is still possible to find the minimum of the function $z^\top C z + 2 z^\top b$ using Lagrange multipliers, but such a solution is too involved to be presented here.

Interested readers will find a thorough discussion in Gander, Golub, and von Matt [14].


Bibliography

[1] Michael Artin. Algebra. Prentice Hall, first edition, 1991.

[2] Marcel Berger. Géométrie 1. Nathan, 1990. English edition: Geometry 1, Universitext, Springer Verlag.

[3] Marcel Berger. Géométrie 2. Nathan, 1990. English edition: Geometry 2, Universitext, Springer Verlag.

[4] J.E. Bertin. Algèbre linéaire et géométrie classique. Masson, first edition, 1981.

[5] Nicolas Bourbaki. Algèbre, Chapitres 1-3. Eléments de Mathématiques. Hermann, 1970.

[6] Nicolas Bourbaki. Algèbre, Chapitres 4-7. Eléments de Mathématiques. Masson, 1981.

[7] Nicolas Bourbaki. Espaces Vectoriels Topologiques. Eléments de Mathématiques. Masson, 1981.

[8] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, first edition, 2004.


[9] P.G. Ciarlet. Introduction to Numerical Matrix Analysis and Optimization. Cambridge University Press, first edition, 1989. French edition: Masson, 1994.

[10] Timothee Cour and Jianbo Shi. Solving Markov random fields with spectral relaxation. In Marina Meila and Xiaotong Shen, editors, Artificial Intelligence and Statistics. Society for Artificial Intelligence and Statistics, 2007.

[11] James W. Demmel. Applied Numerical Linear Algebra. SIAM Publications, first edition, 1997.

[12] Jean Dieudonné. Algèbre Linéaire et Géométrie Elémentaire. Hermann, second edition, 1965.

[13] David S. Dummit and Richard M. Foote. Abstract Algebra. Wiley, second edition, 1999.

[14] Walter Gander, Gene H. Golub, and Urs von Matt. A constrained eigenvalue problem. Linear Algebra and its Applications, 114/115:815–839, 1989.

[15] Gene H. Golub. Some modified eigenvalue problems. SIAM Review, 15(2):318–334, 1973.


[16] Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University Press, third edition, 1996.

[17] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, second edition, 2009.

[18] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, first edition, 1990.

[19] Kenneth Hoffman and Ray Kunze. Linear Algebra. Prentice Hall, second edition, 1971.

[20] D. Kincaid and W. Cheney. Numerical Analysis. Brooks/Cole Publishing, second edition, 1996.

[21] Serge Lang. Algebra. Addison Wesley, third edition, 1993.

[22] Serge Lang. Real and Functional Analysis. GTM 142. Springer Verlag, third edition, 1996.

[23] Peter Lax. Linear Algebra and Its Applications. Wiley, second edition, 2007.

[24] Saunders Mac Lane and Garrett Birkhoff. Algebra. Macmillan, first edition, 1967.


[25] Carl D. Meyer. Matrix Analysis and Applied Linear Algebra. SIAM, first edition, 2000.

[26] Laurent Schwartz. Analyse I. Théorie des Ensembles et Topologie. Collection Enseignement des Sciences. Hermann, 1991.

[27] Denis Serre. Matrices, Theory and Applications. GTM No. 216. Springer Verlag, second edition, 2010.

[28] G.W. Stewart. On the early history of the singular value decomposition. SIAM Review, 35(4):551–566, 1993.

[29] Gilbert Strang. Introduction to Applied Mathematics. Wellesley-Cambridge Press, first edition, 1986.

[30] Gilbert Strang. Linear Algebra and its Applications. Saunders HBJ, third edition, 1988.

[31] Claude Tisseron. Géométries affines, projectives, et euclidiennes. Hermann, first edition, 1994.

[32] L.N. Trefethen and D. Bau III. Numerical Linear Algebra. SIAM Publications, first edition, 1997.

[33] B.L. Van Der Waerden. Algebra, Vol. 1. Ungar, seventh edition, 1973.

[34] Stella X. Yu and Jianbo Shi. Grouping with bias. In Thomas G. Dietterich, Sue Becker, and Zoubin Ghahramani, editors, Neural Information Processing Systems, Vancouver, Canada, 3-8 Dec. 2001. MIT Press, 2001.