Chapter One

Cones of Hermitian matrices and trigonometric polynomials

In this chapter we study cones in the real Hilbert spaces of Hermitian matrices and real-valued trigonometric polynomials. Based on an approach using such cones and their duals, we establish various extension results for positive semidefinite matrices and nonnegative trigonometric polynomials. In addition, we show the connection with semidefinite programming and include some numerical experiments.

1.1 CONES AND THEIR BASIC PROPERTIES

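Let $\mathcal{H}$ be a Hilbert space over $\mathbb{R}$ with inner product $\langle \cdot, \cdot \rangle$. A nonempty subset $\mathcal{C}$ of $\mathcal{H}$ is called a cone if

(i) $\mathcal{C} + \mathcal{C} \subseteq \mathcal{C}$,

(ii) $\alpha \mathcal{C} \subseteq \mathcal{C}$ for all $\alpha > 0$.

Note that a cone is convex (i.e., if $C_1, C_2 \in \mathcal{C}$ then $sC_1 + (1-s)C_2 \in \mathcal{C}$ for $s \in [0,1]$). The cone $\mathcal{C}$ is closed if $\mathcal{C} = \overline{\mathcal{C}}$, where $\overline{\mathcal{C}}$ denotes the closure of $\mathcal{C}$ (in the topology induced by the norm $\|\cdot\|$ on $\mathcal{H}$, where, as usual, $\|\cdot\| = \langle \cdot, \cdot \rangle^{1/2}$). For example, the set $\{(x,y) : 0 \le x \le y\}$ is a closed cone in $\mathbb{R}^2$. This cone is the intersection of the cones $\{(x,y) : |x| \le y\}$ and $\{(x,y) : x, y \ge 0\}$. In general, intersections and sums of cones are again cones: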

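Proposition 1.1.1 Let $\mathcal{H}$ be a Hilbert space and let $\mathcal{C}_1$ and $\mathcal{C}_2$ be cones in $\mathcal{H}$. Then the following are also cones:

(i) $\mathcal{C}_1 \cap \mathcal{C}_2$, and

(ii) $\mathcal{C}_1 + \mathcal{C}_2 = \{C_1 + C_2 : C_1 \in \mathcal{C}_1,\ C_2 \in \mathcal{C}_2\}$.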

Proof. The proof is left to the reader (see Exercise 1.6.1).

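Given a cone $\mathcal{C}$, the dual $\mathcal{C}^*$ of $\mathcal{C}$ is defined via

\[ \mathcal{C}^* = \{ L \in \mathcal{H} : \langle L, K \rangle \ge 0 \text{ for all } K \in \mathcal{C} \}. \]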

The notion of a dual cone is of special importance in optimization problems, as optimality conditions are naturally formulated using the dual cone. As a simple example, it is not hard to see that the dual of $\{(x,y) : 0 \le x \le y\}$ is the cone $\{(x,y) : y \ge 0 \text{ and } y \ge -x\}$. We call a cone $\mathcal{C}$ selfdual if $\mathcal{C}^* = \mathcal{C}$. For example, $\{(x,y) : x, y \ge 0\}$ is selfdual.
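The dual of a finitely generated cone can be checked numerically, since a vector lies in the dual exactly when it has nonnegative inner product with each generator. The sketch below is illustrative only (the names `in_dual`, `GENERATORS` and `in_claimed_dual` are not from the text): it takes the cone $\{(x,y) : 0 \le x \le y\}$, generated by $(0,1)$ and $(1,1)$, and compares its dual on a small integer grid with the description $y \ge 0$, $x + y \ge 0$.

```python
def in_dual(point, generators):
    """True if `point` has nonnegative inner product with every generator."""
    return all(point[0] * g[0] + point[1] * g[1] >= 0 for g in generators)

# The cone {(x, y) : 0 <= x <= y} consists of the nonnegative combinations
# of (0, 1) and (1, 1), so testing against these two vectors suffices.
GENERATORS = [(0, 1), (1, 1)]

def in_claimed_dual(point):
    """Membership in {(x, y) : y >= 0 and y >= -x}."""
    x, y = point
    return y >= 0 and y >= -x

grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
agree = all(in_dual(p, GENERATORS) == in_claimed_dual(p) for p in grid)
print(agree)
```

A grid check is of course no proof; it only guards against sign slips in the description of the dual.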

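Lemma 1.1.2 The dual of a cone is again a cone, which happens to be closed. Any subspace $\mathcal{W}$ of $\mathcal{H}$ is a cone, and its dual is equal to its orthogonal complement (i.e., if $\mathcal{W}$ is a subspace then $\mathcal{W}^* = \mathcal{W}^\perp := \{h \in \mathcal{H} : \langle h, w \rangle = 0 \text{ for all } w \in \mathcal{W}\}$). Moreover, for cones $\mathcal{C}$, $\mathcal{C}_1$, and $\mathcal{C}_2$ we have that

(i) if $\mathcal{C}_1 \subseteq \mathcal{C}_2$ then $\mathcal{C}_2^* \subseteq \mathcal{C}_1^*$;

(ii) $(\mathcal{C}^*)^* = \overline{\mathcal{C}}$;

(iii) $\mathcal{C}_1^* + \mathcal{C}_2^* \subseteq (\mathcal{C}_1 \cap \mathcal{C}_2)^*$;

(iv) $(\mathcal{C}_1 + \mathcal{C}_2)^* = \mathcal{C}_1^* \cap \mathcal{C}_2^*$.

Proof. The proof of this lemma is left as an exercise (see Exercise 1.6.3).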

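An extreme ray of a cone $\mathcal{C}$ is a subset of $\mathcal{C} \cup \{0\}$ of the form $\{\alpha K : \alpha \ge 0\}$, where $0 \ne K \in \mathcal{C}$ is such that

\[ K = A + B,\ A, B \in \mathcal{C} \ \Rightarrow\ A = \alpha K \text{ for some } \alpha \in [0, \infty). \]

The cone $\{(x,y) : 0 \le x \le y\}$ has the extreme rays $\{(0,y) : y \ge 0\}$ and $\{(x,x) : x \ge 0\}$.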

There are two cones that we are particularly interested in: the cone $\mathrm{PSD}$ of positive semidefinite matrices and the cone $\mathrm{TPol}^+$ of trigonometric polynomials that take on nonnegative values on the unit circle. We will study some of the basics of these cones in this section, and in the remainder of this chapter we will study related cones. The cone $\mathrm{RPol}^+$ of polynomials that are nonnegative on the real line is of interest as well. Some of its properties will be developed in the exercises.

1.1.1 The cone $\mathrm{PSD}_n$

Let $\mathbb{C}^{n \times n}$ be the Hilbert space over $\mathbb{C}$ of $n \times n$ matrices endowed with the inner product

\[ \langle A, B \rangle = \operatorname{tr}(AB^*), \]

where $\operatorname{tr}(M)$ denotes the trace of the square matrix $M$ and $B^*$ denotes the complex conjugate transpose of the matrix $B$. Consider the subset of Hermitian matrices $\mathcal{H}_n = \{A \in \mathbb{C}^{n \times n} : A = A^*\}$, which itself is a Hilbert space over $\mathbb{R}$ under the same inner product $\langle A, B \rangle = \operatorname{tr}(AB^*) = \operatorname{tr}(AB)$. When we restrict ourselves to real matrices we will write $\mathcal{H}_{n,\mathbb{R}} = \mathcal{H}_n \cap \mathbb{R}^{n \times n}$. The following results on the cone $\mathrm{PSD}_n$ of positive semidefinite matrices are well known.

Lemma 1.1.3 The set $\mathrm{PSD}_n$ of all $n \times n$ positive semidefinite matrices is a selfdual cone in the Hilbert space $\mathcal{H}_n$.

Proof. Let $A$ and $B$ be $n \times n$ positive semidefinite matrices. We denote by $A^{1/2}$ the unique positive semidefinite matrix the square of which is $A$. Then, by a well-known property of the trace, $\operatorname{tr}(AB) = \operatorname{tr}(A^{1/2} B A^{1/2}) \ge 0$. Thus $\mathrm{PSD}_n \subseteq (\mathrm{PSD}_n)^*$.

For the converse, let $A \in \mathcal{H}_n$ be such that $\operatorname{tr}(AB) \ge 0$ for every positive semidefinite matrix $B$. For all $v \in \mathbb{C}^n$, the rank 1 matrix $B = vv^*$ is positive semidefinite, thus $0 \le \operatorname{tr}(Avv^*) = \langle Av, v \rangle$, so $A$ is positive semidefinite.
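The inequality $\operatorname{tr}(AB) \ge 0$ used in the first half of the proof is easy to probe numerically. The following sketch is an illustration under stated assumptions, not part of the original argument: it builds positive semidefinite matrices as sums of rank 1 matrices $vv^*$ with random complex $v$, using only the Python standard library, and all helper names are made up for this example.

```python
import random

def outer(v):
    """Rank 1 matrix v v^* for a complex vector v (a list of complex numbers)."""
    return [[a * b.conjugate() for b in v] for a in v]

def add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def trace_of_product(A, B):
    """tr(AB), computed as the sum over i, j of A[i][j] * B[j][i]."""
    n = len(A)
    return sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))

def random_psd(n, rank, rng):
    """Sum of `rank` random rank 1 matrices v v^*, hence positive semidefinite."""
    M = [[0j] * n for _ in range(n)]
    for _ in range(rank):
        v = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
        M = add(M, outer(v))
    return M

rng = random.Random(0)
values = [trace_of_product(random_psd(3, 2, rng), random_psd(3, 2, rng))
          for _ in range(100)]
# Every value should be real and nonnegative, up to rounding error.
print(all(v.real >= -1e-12 and abs(v.imag) < 1e-12 for v in values))
```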

Proposition 1.1.4 The extreme rays of $\mathrm{PSD}_n$ are given by $\{\alpha vv^* : \alpha \ge 0\}$, where $v$ is a nonzero vector in $\mathbb{C}^n$. In other words, all extreme rays of $\mathrm{PSD}_n$ are generated by rank 1 matrices.

Proof. Let $v \in \mathbb{C}^n$ and suppose that $vv^* = A + B$ with $A$ and $B$ positive semidefinite. If $w \in \mathbb{C}^n$ is orthogonal to $v$ then we get that $0 = w^*vv^*w = w^*Aw + w^*Bw$, and as both $A$ and $B$ are positive semidefinite we get that $w^*Aw = 0 = w^*Bw$. Thus $\|A^{1/2}w\| = 0 = \|B^{1/2}w\|$. This yields that $A^{1/2}w = 0 = B^{1/2}w$, and thus $Aw = 0 = Bw$. As this holds for all $w$ orthogonal to $v$, we get that $A$ and $B$ have at most rank 1. If they have rank 1, the eigenvector corresponding to the single nonzero eigenvalue must be a multiple of $v$, and this implies that both $A$ and $B$ are a multiple of $vv^*$.

Conversely, if $K \in \mathrm{PSD}_n$ write $K = v_1v_1^* + \cdots + v_kv_k^*$, where $v_1, \ldots, v_k$ are nonzero vectors in $\mathbb{C}^n$ and $k$ is the rank of $K$ (use, for instance, a Cholesky factorization $K = LL^*$, and let $v_1, \ldots, v_k$ be the nonzero columns of the Cholesky factor $L$). But then it follows immediately that if $k \ge 2$, the matrix $K$ does not generate an extreme ray of $\mathrm{PSD}_n$.

1.1.2 The cone $\mathrm{TPol}_n^+$

We consider trigonometric polynomials $p(z) = \sum_{k=-n}^{n} p_k z^k$ with the inner product

\[ \langle p, g \rangle = \frac{1}{2\pi} \int_0^{2\pi} p(e^{i\theta}) \overline{g(e^{i\theta})}\, d\theta = \sum_{k=-n}^{n} p_k \overline{g_k}, \]

where $g(z) = \sum_{k=-n}^{n} g_k z^k$. We are particularly interested in the Hilbert space $\mathrm{TPol}_n$ over $\mathbb{R}$ consisting of those trigonometric polynomials $p(z)$ that are real valued on the unit circle, that is, $p \in \mathrm{TPol}_n$ if and only if $p$ is of degree $\le n$ and $p(z) \in \mathbb{R}$ for $z \in \mathbb{T}$. It is not hard to see that the latter condition is equivalent to $p_k = \overline{p_{-k}}$, $k = 0, \ldots, n$. For such real-valued trigonometric polynomials we have that $\langle p, g \rangle = \sum_{k=-n}^{n} p_k g_{-k}$. The cone that we are interested in is formed by those trigonometric polynomials that are nonnegative on the unit circle:

\[ \mathrm{TPol}_n^+ = \{ p \in \mathrm{TPol}_n : p(z) \ge 0 \text{ for all } z \in \mathbb{T} \}. \]

A crucial property of these nonnegative trigonometric polynomials is that they factorize as the modulus squared of a single polynomial. This property, stated in the next theorem, is known as the Fejér-Riesz factorization property; it is in fact the key to determining the dual of $\mathrm{TPol}_n^+$ and its extreme rays. We call a polynomial $q$ outer if all its roots are outside the open unit disk; that is, $q$ is outer when $q(z) \ne 0$ for $z \in \mathbb{D}$. The degree $n$ polynomial $q(z) = \sum_{k=0}^{n} q_k z^k$ is called co-outer if all its roots are inside the closed unit disk and $q_n \ne 0$ (in some contexts it is convenient to interpret $q_n \ne 0$ to mean that $\infty$ is not a root of $q(z)$). One easily sees that $q(z)$ is outer if and only if $z^n \overline{q(1/\bar{z})}$ is co-outer.

Theorem 1.1.5 The trigonometric polynomial $p(z) = \sum_{k=-n}^{n} p_k z^k$, $p_n \ne 0$, is nonnegative on the unit circle $\mathbb{T}$ if and only if there exists a polynomial $q(z) = \sum_{k=0}^{n} q_k z^k$ such that

\[ p(z) = q(z)\overline{q(1/\bar{z})}, \quad z \in \mathbb{C},\ z \ne 0. \]

In particular, $p(z) = |q(z)|^2$ for all $z \in \mathbb{T}$. The polynomial $q$ may be chosen to be outer (co-outer) and in that case $q$ is unique up to a scalar factor of modulus 1.

When the factor $q$ is outer (co-outer), we refer to the equality $p = |q|^2$ as an outer (co-outer) factorization. To make the outer factorization unique one can insist that $q_0 \ge 0$, in which case one can refer to $q$ as the outer factor of $p$. Similarly, to make the co-outer factorization unique one can insist that $q_n \ge 0$, in which case one can refer to $q$ as the co-outer factor of $p$.

Proof. The “if” statement is trivial, so we focus on the “only if” statement. So let $p \in \mathrm{TPol}_n^+$. Using that $p_{-k} = \overline{p_k}$ for $k = 0, \ldots, n$, we see that the polynomial

\[ g(z) = z^n p(z) = \overline{p_n} + \cdots + p_0 z^n + \cdots + p_n z^{2n} \]

satisfies the relation

\[ g(z) = z^{2n}\, \overline{g(1/\bar{z})}, \quad z \ne 0. \tag{1.1.1} \]

Let $z_1, \ldots, z_{2n}$ be the set of all zeros of $g$ counted according to their multiplicities. Since all $z_j$ are different from zero, it follows from (1.1.1) that $1/\overline{z_j}$ is also a zero of $g$ of the same multiplicity as $z_j$. We prove that if $z_j \in \mathbb{T}$, in which case $z_j = 1/\overline{z_j}$, then $z_j = e^{it_j}$ has even multiplicity. The function $\varphi : \mathbb{R} \to \mathbb{C}$ defined by $\varphi(t) = p(e^{it})$ is differentiable and nonnegative on $\mathbb{R}$. This implies that the multiplicity of $t_j$ as a zero of $\varphi$ is even since $\varphi$ has a local minimum at $t_j$, and consequently $z_j$ is an even multiplicity zero of $g$. So we can renumber the zeros of $g$ so that $z_1, \ldots, z_n, 1/\overline{z_1}, \ldots, 1/\overline{z_n}$ represent the $2n$ zeros of $g$. Note that we can choose $z_1, \ldots, z_n$ so that $|z_k| \ge 1$, $k = 1, \ldots, n$ (or, if we prefer, so that $|z_k| \le 1$, $k = 1, \ldots, n$). Then we have

\[ g(z) = p_n \prod_{j=1}^{n} (z - z_j) \prod_{j=1}^{n} \left( z - \frac{1}{\overline{z_j}} \right), \]

so

\[ p(z) = z^{-n} g(z) = p_n \prod_{j=1}^{n} (z - z_j) \prod_{j=1}^{n} \left( 1 - \frac{1}{z\,\overline{z_j}} \right) = c \prod_{j=1}^{n} (z - z_j) \prod_{j=1}^{n} \left( \frac{1}{z} - \overline{z_j} \right), \tag{1.1.2} \]

where $c = p_n \left( \prod_{j=1}^{n} (-\overline{z_j}) \right)^{-1}$. Since $p$ is nonnegative on $\mathbb{T}$ and positive somewhere, at $\alpha \in \mathbb{T}$, say, we get that $c > 0$ (as $c = p(\alpha) / \prod_j |\alpha - z_j|^2$). With $q(z) = \sqrt{c} \prod_{j=1}^{n} (z - z_j)$, (1.1.2) implies that $p(z) = q(z)\overline{q(1/\bar{z})}$. Note that choosing $|z_k| \ge 1$, $k = 1, \ldots, n$, leads to an outer factorization, and that choosing $|z_k| \le 1$, $k = 1, \ldots, n$, leads to a co-outer factorization. The uniqueness of the (co-)outer factor up to a scalar factor of modulus 1 is left as an exercise (see Exercise 1.6.5).
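The construction in the proof can be traced by hand in the smallest case $n = 1$, where $g$ is a quadratic. The sketch below is a minimal illustration restricted to degree one (the function name `outer_factor_degree_one` is hypothetical, and a general implementation would need a polynomial root finder): it picks the root of $g$ of modulus at least 1, forms $c$ as in the proof, and returns the coefficients of $q(z) = \sqrt{c}\,(z - z_1)$.

```python
import cmath
import math

def outer_factor_degree_one(p0, p1):
    """Coefficients (q0, q1) of an outer factor q(z) = q0 + q1*z of
    p(z) = conj(p1)/z + p0 + p1*z.

    Assumes p1 != 0 and p >= 0 on the unit circle, i.e. p0 >= 2*|p1|.
    """
    # g(z) = z*p(z) = conj(p1) + p0*z + p1*z**2 has roots z1 and 1/conj(z1).
    disc = cmath.sqrt(p0 * p0 - 4 * abs(p1) ** 2)
    roots = [(-p0 + disc) / (2 * p1), (-p0 - disc) / (2 * p1)]
    z1 = max(roots, key=abs)               # |z1| >= 1 gives the outer choice
    c = (p1 / (-z1.conjugate())).real      # c = p_n / prod(-conj(z_j)) > 0
    root_c = math.sqrt(c)
    return -root_c * z1, root_c            # q(z) = sqrt(c) * (z - z1)

# p(z) = 2/z + 5 + 2z = |2 + z|^2 on the unit circle.
q0, q1 = outer_factor_degree_one(5, 2)
print(q0, q1)

# A complex example: p(z) = -2i/z + 5 + 2i*z.
r0, r1 = outer_factor_degree_one(5, 2j)
print(abs(r0) ** 2 + abs(r1) ** 2, r1 * r0.conjugate())   # recovers p0 and p1
```

The returned factor is determined only up to a scalar of modulus 1; no normalization such as $q_0 \ge 0$ is enforced here.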

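To obtain a description of the dual cone of $\mathrm{TPol}_n^+$, it will be of use to rewrite the inner product $\langle p, g \rangle$ in terms of the factorization $p = |q|^2$. For this purpose, introduce the Toeplitz matrix

\[ T_g := (g_{i-j})_{i,j=0}^{n} = \begin{pmatrix} g_0 & g_{-1} & \cdots & g_{-n} \\ g_1 & g_0 & \cdots & g_{-n+1} \\ \vdots & \ddots & \ddots & \vdots \\ g_n & g_{n-1} & \cdots & g_0 \end{pmatrix}, \]

where $g(z) = \sum_{k=-n}^{n} g_k z^k$. It is now a straightforward calculation to see that $p(z) = |q_0 + \cdots + q_n z^n|^2$, $z \in \mathbb{T}$, yields

\[ \langle p, g \rangle = \begin{pmatrix} \overline{q_0} & \cdots & \overline{q_n} \end{pmatrix} \begin{pmatrix} g_0 & g_{-1} & \cdots & g_{-n} \\ g_1 & g_0 & \cdots & g_{-n+1} \\ \vdots & \ddots & \ddots & \vdots \\ g_n & g_{n-1} & \cdots & g_0 \end{pmatrix} \begin{pmatrix} q_0 \\ \vdots \\ q_n \end{pmatrix} = x^* T_g x, \tag{1.1.3} \]

where $x = \begin{pmatrix} q_0 & \cdots & q_n \end{pmatrix}^T$ and the superscript $T$ denotes taking the transpose. From this it is straightforward to see that the dual cone of $\mathrm{TPol}_n^+$ consists exactly of those trigonometric polynomials $g$ for which the associated Toeplitz matrix $T_g$ is positive semidefinite (notation: $T_g \ge 0$).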
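The identity $\langle p, g \rangle = x^* T_g x$ for $p = |q|^2$ can be confirmed on a small instance. The following sketch uses only the standard library; the coefficient values and helper names are arbitrary choices made for illustration, with $g$ stored as a dictionary indexed by $k = -n, \ldots, n$ and satisfying $g_{-k} = \overline{g_k}$.

```python
def toeplitz(g, n):
    """T_g = (g_{i-j}) for i, j = 0..n, with g a dict indexed by -n..n."""
    return [[g[i - j] for j in range(n + 1)] for i in range(n + 1)]

def modulus_squared_coefficients(q):
    """Coefficients p_k of p = |q|^2 on the circle: p_k = sum_j q_{j+k} conj(q_j)."""
    n = len(q) - 1
    return {k: sum(q[j + k] * q[j].conjugate()
                   for j in range(n + 1) if 0 <= j + k <= n)
            for k in range(-n, n + 1)}

def inner_product(p, g):
    """<p, g> = sum_k p_k conj(g_k)."""
    return sum(p[k] * g[k].conjugate() for k in p)

def quadratic_form(T, x):
    """x^* T x."""
    return sum(x[i].conjugate() * T[i][j] * x[j]
               for i in range(len(x)) for j in range(len(x)))

q = [1 + 2j, -1j, 0.5 + 0j]                        # arbitrary q_0, q_1, q_2
g = {0: 2 + 0j, 1: 1 - 1j, 2: 0.5j}                # g_k for k >= 0
g.update({-k: g[k].conjugate() for k in (1, 2)})   # real valued on the circle

lhs = inner_product(modulus_squared_coefficients(q), g)
rhs = quadratic_form(toeplitz(g, 2), q)
print(abs(lhs - rhs) < 1e-12)
```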

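Lemma 1.1.6 The dual cone of $\mathrm{TPol}_n^+$ is given by

\[ (\mathrm{TPol}_n^+)^* = \{ g \in \mathrm{TPol}_n : T_g \ge 0 \}. \tag{1.1.4} \]

Proof. Clearly, if $p = |q|^2 \in \mathrm{TPol}_n^+$ and $g \in \mathrm{TPol}_n$ is such that $T_g \ge 0$, we obtain by (1.1.3) that $\langle p, g \rangle \ge 0$. This gives $\supseteq$ in (1.1.4). For the converse, suppose that $g \in \mathrm{TPol}_n$ is such that $T_g$ is not positive semidefinite. Then choose $x = \begin{pmatrix} q_0 & \cdots & q_n \end{pmatrix}^T$ such that $x^* T_g x < 0$, and let $p \in \mathrm{TPol}_n$ be given via $p(z) = |q_0 + \cdots + q_n z^n|^2$, $z \in \mathbb{T}$. Then (1.1.3) yields that $\langle p, g \rangle < 0$, and thus $g \notin (\mathrm{TPol}_n^+)^*$. This yields $\subseteq$ in (1.1.4).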

Next, we show that the extreme rays of $\mathrm{TPol}_n^+$ are generated exactly by those nonnegative trigonometric polynomials that have $2n$ roots (counting multiplicity) on the unit circle.

Proposition 1.1.7 The trigonometric polynomial $q(z) = \sum_{k=-n}^{n} q_k z^k$ in $\mathrm{TPol}_n^+$ generates an extreme ray of $\mathrm{TPol}_n^+$ if and only if $q_n \ne 0$ and all the roots of $q$ are on the unit circle; equivalently, $q(z)$ is of the form

\[ q(z) = c \prod_{k=1}^{n} (z - e^{it_k}) \left( \frac{1}{z} - e^{-it_k} \right), \tag{1.1.5} \]

for some $c > 0$ and $t_k \in \mathbb{R}$, $k = 1, \ldots, n$.

Proof. First suppose that $q(z)$ is of the form (1.1.5) and let $q = g + h$ with $g, h \in \mathrm{TPol}_n^+$. As $q(e^{it_k}) = 0$, $k = 1, \ldots, n$, and $g$ and $h$ are nonnegative on the unit circle, we get that $g(e^{it_k}) = h(e^{it_k}) = 0$, $k = 1, \ldots, n$. Notice that $q \ge g$ on $\mathbb{T}$ implies that the multiplicity of a root on $\mathbb{T}$ of $g$ must be at least as large as the multiplicity of the same root of $q$ (use the construction of the function $\varphi$ in the proof of Theorem 1.1.5). Using this one now sees that $g$ and $h$ must be multiples of $q$.

Next, suppose that $q$ has fewer than $2n$ roots on the unit circle (counting multiplicity). Using the construction in the proof of Theorem 1.1.5 we can factor $q$ as $q = gh$, where $g \in \mathrm{TPol}_m^+$, $m < n$, has all its roots on the unit circle, $h \in \mathrm{TPol}_k^+$ has none of its roots on the unit circle, and $k + m \le n$. If $h$ is not a constant, we can subtract a positive constant $\varepsilon$, say, from $h$ while remaining in $\mathrm{TPol}_k^+$ (e.g., take $\varepsilon = \min_{z \in \mathbb{T}} h(z) > 0$). Then $q = \varepsilon g + (h - \varepsilon) g$ gives a way of writing $q$ as a sum of elements in $\mathrm{TPol}_n^+$ that are not scalar multiples of $q$ (as $h$ was not a constant function). In case $h \equiv c$ is a constant function, we can write

\[ q(z) = g(z)h(z) = g(z) \left( \frac{1}{2}c - \delta z - \frac{\delta}{z} \right) + g(z) \left( \frac{1}{2}c + \delta z + \frac{\delta}{z} \right), \]

which for $0 < \delta < \frac{c}{4}$ gives a way of writing $q$ as a sum of elements in $\mathrm{TPol}_{m+1}^+ \subseteq \mathrm{TPol}_n^+$ that are not scalar multiples of $q$. This proves that $q$ does not generate an extreme ray of $\mathrm{TPol}_n^+$.

Notice that if we consider the cone of nonnegative trigonometric polynomials without any restriction on the degree, the argument in the proof of Proposition 1.1.7 shows that none of its elements generate an extreme ray. In other words, the cone $\mathrm{TPol}^+ = \bigcup_{n=0}^{\infty} \mathrm{TPol}_n^+$ does not have any extreme rays.

1.2 CONES OF HERMITIAN MATRICES

A natural cone to study is the cone of positive semidefinite matrices that have zeros in certain fixed locations, that is, positive semidefinite matrices with a given sparsity pattern. As we shall see, the dual of this cone consists of those Hermitian matrices which can be made into a positive semidefinite matrix by changing the entries in the given locations. As such, this dual cone is related to the positive semidefinite completion problem. While in this section we analyze this problem from the viewpoint of cones and their properties, we shall directly investigate the completion problem in detail in the next chapter, then in the context of operator matrices.

For presenting the results of this section we first need some graph theoretical preliminaries. An undirected graph is a pair $G = (V, E)$ in which $V$, the vertex set, is a finite set and the edge set $E$ is a symmetric binary relation on $V$ such that $(v, v) \notin E$ for all $v \in V$. The adjacency set of a vertex $v$ is denoted by $\operatorname{Adj}(v)$, that is, $w \in \operatorname{Adj}(v)$ if $(v, w) \in E$. Given a subset $A \subseteq V$, define the subgraph induced by $A$ by $G|A = (A, E|A)$, in which $E|A = \{(x, y) \in E : x \in A \text{ and } y \in A\}$. The complete graph is a graph with the property that every pair of distinct vertices is adjacent. A subset $K \subseteq V$ is a clique if the induced graph $G|K$ on $K$ is complete. A clique is called a maximal clique if it is not a proper subset of another clique.

A subset $P \subseteq \{1, \ldots, n\} \times \{1, \ldots, n\}$ with the properties

(i) $(i, i) \in P$ for $i = 1, \ldots, n$,

(ii) $(i, j) \in P$ if and only if $(j, i) \in P$

is called an $n \times n$ symmetric pattern. Such a pattern is said to be a sparsity pattern (location of required zeros) for a matrix $A \in \mathcal{H}_n$ if, for all $1 \le i, j \le n$ with $(i, j) \notin P$, the $(i, j)$ and $(j, i)$ entries of $A$ are 0.

With an $n \times n$ symmetric pattern $P$ we associate the undirected graph $G = (V, E)$, with $V = \{1, \ldots, n\}$ and $(i, j) \in E$ if and only if $(i, j) \in P$ and $i \ne j$. For a pattern with associated graph $G = (V, E)$, we introduce the following subspace of $\mathcal{H}_n$:

\[ \mathcal{H}_G = \{ A \in \mathcal{H}_n : A_{ij} = 0 \text{ for all } (i, j) \notin P \}, \]

that is, the set of all $n \times n$ Hermitian matrices with sparsity pattern $P$. It is easy to see that

\[ \mathcal{H}_G^\perp = \{ A \in \mathcal{H}_n : A_{ij} = 0 \text{ for all } (i, j) \in P \}. \]

When $A$ is an $n \times n$ matrix $A = (A_{ij})_{i,j=1}^{n}$ and $K \subseteq \{1, \ldots, n\}$, then $A|K$ denotes the $\operatorname{card} K \times \operatorname{card} K$ principal submatrix

\[ A|K = (A_{ij})_{i,j \in K} = (A_{ij})_{(i,j) \in K \times K}. \]

Here $\operatorname{card} K$ denotes the cardinality of the set $K$.

We now define the following cones in $\mathcal{H}_G$:

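\[ \begin{aligned} \mathrm{PPSD}_G &= \{ A \in \mathcal{H}_G : A|K \ge 0 \text{ for all cliques } K \text{ of } G \}, \\ \mathrm{PSD}_G &= \{ X \in \mathcal{H}_G : X \ge 0 \}, \\ \mathcal{A}_G &= \{ Y \in \mathcal{H}_G : \text{there is a } W \in \mathcal{H}_G^\perp \text{ such that } Y + W \ge 0 \}, \\ \mathcal{B}_G &= \Big\{ B \in \mathcal{H}_G : B = \sum_{i=1}^{n_B} B_i \text{ where } B_1, \ldots, B_{n_B} \in \mathrm{PSD}_G \text{ are of rank } 1 \Big\}. \end{aligned} \]

The abbreviation PPSD stands for partially positive semidefinite.

We have the following descriptions of the duals of $\mathrm{PPSD}_G$ and $\mathrm{PSD}_G$. For Hermitian matrices we introduce the Loewner order $\le$, defined by $A \le B$ if and only if $B - A \ge 0$ (i.e., $B - A$ is positive semidefinite).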

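Proposition 1.2.1 Let $G = (V, E)$ be a graph. The cones $\mathrm{PPSD}_G$, $\mathrm{PSD}_G$, $\mathcal{A}_G$, and $\mathcal{B}_G$ are closed, and their duals (in $\mathcal{H}_G$) are identified as

\[ (\mathrm{PSD}_G)^* = \mathcal{A}_G, \quad \text{and} \quad (\mathrm{PPSD}_G)^* = \mathcal{B}_G. \]

Moreover,

\[ (\mathrm{PPSD}_G)^* \subseteq \mathrm{PSD}_G \subseteq (\mathrm{PSD}_G)^* \subseteq \mathrm{PPSD}_G. \tag{1.2.1} \]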

Proof. The closedness of $\mathrm{PPSD}_G$, $\mathcal{A}_G$, and $\mathrm{PSD}_G$ is trivial. For the closedness of $\mathcal{B}_G$, first observe that if $J_1, \ldots, J_p$ are all the maximal cliques of $G$, then any $B \in \mathcal{B}_G$ can be written as $B = B_1 + \cdots + B_p$, where $B_k \ge 0$ and the nonzero entries of $B_k$ lie in $J_k \times J_k$, $k = 1, \ldots, p$. Note that $B_k \le B$, $k = 1, \ldots, p$. Let now $A^{(n)} = \sum_{k=1}^{p} A_k^{(n)} \in \mathcal{B}_G$ be so that $A_k^{(n)} \ge 0$ and the nonzero entries of $A_k^{(n)}$ lie in $J_k \times J_k$, $k = 1, \ldots, p$, and assume that $A^{(n)}$ converges to $A$ as $n \to \infty$. We need to show that $A \in \mathcal{B}_G$. Since $A_1^{(n)} \le A^{(n)}$, the sequence $\{A_1^{(n)}\}_{n \in \mathbb{N}}$ is a bounded sequence of matrices, and thus there is a subsequence $\{A_1^{(n_k)}\}_{k \in \mathbb{N}}$ that converges to $A_1$, say. Note that $A_1$ is positive semidefinite and has only nonzero entries in $J_1 \times J_1$. Next take a subsequence $\{A_2^{(n_{k_l})}\}_{l \in \mathbb{N}}$ of $\{A_2^{(n_k)}\}_{k \in \mathbb{N}}$ that converges to $A_2$, say, which is automatically positive semidefinite and has only nonzero entries in $J_2 \times J_2$. Repeating this argument, we ultimately obtain $m_1 < m_2 < \cdots$ so that $\lim_{j \to \infty} A_k^{(m_j)} = A_k \ge 0$ with $A_k$ having only nonzero entries in $J_k \times J_k$. But then

\[ A = \lim_{n \to \infty} A^{(n)} = \sum_{k=1}^{p} \lim_{j \to \infty} A_k^{(m_j)} = A_1 + \cdots + A_p \in \mathcal{B}_G, \]

proving the closedness of $\mathcal{B}_G$.

By Lemma 1.1.2 it is true that if $\mathcal{C}$ is a cone and $\mathcal{W}$ is a subspace, then $(\mathcal{C} \cap \mathcal{W})^* = \mathcal{C}^* + \mathcal{W}^\perp$. This implies the duality between $\mathrm{PSD}_G$ and $\mathcal{A}_G$ since $\mathrm{PSD}_G = \mathrm{PSD}_n \cap \mathcal{H}_G$ and $\mathrm{PSD}_n$ is selfdual.

For the duality of $\mathrm{PPSD}_G$ and $\mathcal{B}_G$ note that if $X \in \mathrm{PSD}_G$ and $\operatorname{rank} X = 1$, then $X = ww^*$, where $w$ is a vector with support in a clique of $G$. This shows that any element in $\mathrm{PPSD}_G$ lies in the dual of $\mathcal{B}_G$. Next if $A \notin \mathrm{PPSD}_G$, then there is a clique $K$ such that $A|K$ is not positive semidefinite. Thus there is a vector $v$ so that $v^*(A|K)v < 0$. But then one can pad the matrix $vv^*$ with zeros, and obtain a positive semidefinite rank 1 matrix $B$ with all its nonzero entries in $K \times K$, so that $\langle A, B \rangle < 0$. As $B \in \mathcal{B}_G$, this shows that $A$ is not in the dual of $\mathcal{B}_G$.

It is not hard to see that equality between $\mathrm{PSD}_G$ and $(\mathrm{PSD}_G)^*$ holds only in case the maximal cliques of $G$ are disjoint, and in that case all four cones are equal.

In order to explore when the cones $(\mathrm{PSD}_G)^*$ and $\mathrm{PPSD}_G$ are equal, it is convenient to introduce the following graph theoretic notions. A path $[v_1, \ldots, v_k]$ in a graph $G = (V, E)$ is a sequence of vertices such that $(v_j, v_{j+1}) \in E$ for $j = 1, \ldots, k-1$. The path $[v_1, \ldots, v_k]$ is referred to as a path between $v_1$ and $v_k$. A graph is called connected if there exists a path between any two different vertices in the graph. A cycle of length $k > 2$ is a path $[v_1, \ldots, v_k, v_1]$ in which $v_1, \ldots, v_k$ are distinct. A graph $G$ is called chordal if every cycle of length greater than 3 possesses a chord, that is, an edge joining two nonconsecutive vertices of the cycle.

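Below are two graphs and the corresponding symmetric patterns (pictured as partial matrices). The graph on the left is chordal, while the one on the right (the four cycle) is not.

[Figure: two graphs, each on the vertices 1, 2, 3, 4.]

\[ \begin{pmatrix} * & * & ? & * \\ * & * & * & * \\ ? & * & * & * \\ * & * & * & * \end{pmatrix}, \qquad \begin{pmatrix} * & * & ? & * \\ * & * & * & ? \\ ? & * & * & * \\ * & ? & * & * \end{pmatrix}. \]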

An ordering $\sigma = [v_1, \ldots, v_n]$ of the vertices of a graph is called a perfect vertex elimination scheme (or perfect scheme) if each set

\[ S_i = \{ v_j \in \operatorname{Adj}(v_i) : j > i \} \tag{1.2.2} \]

is a clique. A vertex $v$ of $G$ is said to be simplicial when $\operatorname{Adj}(v)$ is a clique. Thus $\sigma = [v_1, \ldots, v_n]$ is a perfect scheme if each $v_i$ is simplicial in the induced graph $G|\{v_i, \ldots, v_n\}$.

A subset $S \subset V$ is an $a$-$b$ vertex separator for nonadjacent vertices $a$ and $b$ if the removal of $S$ from the graph separates $a$ and $b$ into distinct connected components. If no proper subset of $S$ is an $a$-$b$ separator, then $S$ is called a minimal $a$-$b$ vertex separator.

Proposition 1.2.2 Every minimal vertex separator of a chordal graph is complete.

Proof. Let $S$ be a minimal $a$-$b$ separator in a chordal graph $G = (V, E)$, and let $A$ and $B$ be the connected components in $G|(V - S)$ containing $a$ and $b$, respectively. Let $x, y \in S$. Each vertex in $S$ must be connected to at least one vertex in $A$ and at least one in $B$. We can choose minimal length paths $[x, a_1, \ldots, a_r, y]$ and $[y, b_1, \ldots, b_s, x]$, such that $a_i \in A$ for $i = 1, \ldots, r$ and $b_j \in B$ for $j = 1, \ldots, s$. Then $[x, a_1, \ldots, a_r, y, b_1, \ldots, b_s, x]$ is a cycle in $G$ of length at least 4. Chordality of $G$ implies it must have a chord. Since $(a_i, b_j) \notin E$ by the definition of a vertex separator, and $(a_i, a_k) \notin E$ and $(b_k, b_l) \notin E$ by the minimality of $r$ and $s$, the only possible chord is $(x, y)$. Thus $S$ is a clique.

The next result is known as Dirac’s lemma.

Lemma 1.2.3 Every chordal graph $G$ has a simplicial vertex, and if $G$ is not a clique, then it has two nonadjacent simplicial vertices.

Proof. We proceed by induction on $n$, the number of vertices of $G$. When $n = 1$ or $n = 2$, the result is trivial. Assume $n \ge 3$ and the result is true for graphs with fewer than $n$ vertices. Assume $G = (V, E)$ has $n$ vertices. If $G$ is complete, the result holds. Assume $G$ is not complete, and let $S$ be a minimal $a$-$b$ separator for two nonadjacent vertices $a$ and $b$. Let $A$ and $B$ be the connected components of $G|(V - S)$ containing $a$ and $b$, respectively.

By the induction hypothesis, either the subgraph $G|(A \cup S)$ has two nonadjacent simplicial vertices, one of which must be in $A$ (since by Proposition 1.2.2, $G|S$ is complete), or $G|(A \cup S)$ is complete and any vertex in $A$ is simplicial in $G|(A \cup S)$. Since $\operatorname{Adj}(A) \subseteq A \cup S$, a simplicial vertex of $G|(A \cup S)$ in $A$ is simplicial in $G$. Similarly, $B$ contains a simplicial vertex of $G$, and this proves the lemma.

The following result is an algorithmic characterization of chordal graphs.

Theorem 1.2.4 An undirected graph is chordal if and only if it has a perfect scheme. Moreover, any simplicial vertex can start a perfect scheme.

Proof. Let $G = (V, E)$ be a chordal graph with $n$ vertices. Assume every chordal graph with fewer than $n$ vertices has a perfect scheme. (For $n = 1$ the result is trivial.) By Lemma 1.2.3, $G$ has a simplicial vertex $x$. Let $[v_1, \ldots, v_{n-1}]$ be a perfect scheme for $G|(V - \{x\})$. Then $[x, v_1, \ldots, v_{n-1}]$ is a perfect scheme for $G$.

Conversely, let $G = (V, E)$ be a graph with a perfect scheme $\sigma = [v_1, \ldots, v_n]$ and assume $C$ is a cycle of length at least 4 in $G$. Let $x$ be the vertex of $C$ with the smallest index in $\sigma$. By the definition of a perfect scheme, the two vertices in $C$ adjacent to $x$ must be connected by an edge, so $C$ has a chord.
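Theorem 1.2.4 suggests a simple, if not the most efficient, chordality test: keep removing a simplicial vertex until none is left or the graph is exhausted. The sketch below is an illustrative implementation with made-up function names; it is tried on two graphs on the vertices 1, 2, 3, 4, namely the graph with every edge except (1, 3) and the four cycle.

```python
def is_clique(adj, vertices):
    """True if every two distinct vertices in `vertices` are adjacent."""
    vs = list(vertices)
    return all(vs[j] in adj[vs[i]]
               for i in range(len(vs)) for j in range(i + 1, len(vs)))

def is_perfect_scheme(adj, order):
    """Check that each S_i = {v_j in Adj(v_i) : j > i} is a clique."""
    position = {v: i for i, v in enumerate(order)}
    return all(is_clique(adj, [w for w in adj[v] if position[w] > position[v]])
               for v in order)

def find_perfect_scheme(adj):
    """Repeatedly remove a simplicial vertex; return the order, or None if stuck."""
    remaining = set(adj)
    order = []
    while remaining:
        simplicial = next(
            (v for v in sorted(remaining)
             if is_clique(adj, [w for w in adj[v] if w in remaining])),
            None,
        )
        if simplicial is None:
            return None      # no simplicial vertex left: the graph is not chordal
        order.append(simplicial)
        remaining.remove(simplicial)
    return order

def graph(n, edges):
    """Adjacency sets of the undirected graph on 1..n with the given edges."""
    adj = {v: set() for v in range(1, n + 1)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

chordal = graph(4, [(1, 2), (1, 4), (2, 3), (2, 4), (3, 4)])   # all edges but (1, 3)
four_cycle = graph(4, [(1, 2), (2, 3), (3, 4), (1, 4)])

print(find_perfect_scheme(chordal))
print(find_perfect_scheme(four_cycle))
```

The greedy choice is justified by the theorem itself: any simplicial vertex can start a perfect scheme, and induced subgraphs of chordal graphs are again chordal.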

The following result is an important consequence of Theorem 1.2.4. We will use the following lemma.

Lemma 1.2.5 Let $A > 0$. Then $\begin{pmatrix} A & B \\ B^* & C \end{pmatrix}$ is positive (semi)definite if and only if $C - B^*A^{-1}B$ is.

Proof. This follows immediately from the following factorization:

\[ \begin{pmatrix} A & B \\ B^* & C \end{pmatrix} = \begin{pmatrix} I & 0 \\ B^*A^{-1} & I \end{pmatrix} \begin{pmatrix} A & 0 \\ 0 & C - B^*A^{-1}B \end{pmatrix} \begin{pmatrix} I & A^{-1}B \\ 0 & I \end{pmatrix}. \]

In Section 2.4 we will see that $C - B^*A^{-1}B$ is a so-called “Schur complement,” and that more general versions of this lemma hold.

Proposition 1.2.6 Let $A \in \mathrm{PSD}_G$, where $G$ is a chordal graph with $n$ vertices. Then $A$ can be written as $A = \sum_{i=1}^{r} w_i w_i^*$, where $r = \operatorname{rank} A$ and $w_i \in \mathbb{C}^n$ are nonzero vectors such that $w_i w_i^* \in \mathrm{PSD}_G$ for $i = 1, \ldots, r$. In particular, we have $\mathrm{PSD}_G = (\mathrm{PPSD}_G)^*$.

Proof. We use induction on $n$. For $n = 1$ the result is trivial and we assume it holds for $n - 1$. Let $A \in \mathrm{PSD}_G$, where $G = (V, E)$ is a chordal graph with $n$ vertices, and let $r = \operatorname{rank} A$. We can assume without loss of generality that the vertex 1 is simplicial in $G$ (otherwise we reorder the rows and columns of $A$ in an order that starts with a simplicial vertex). If $a_{11} = 0$, then the first row and column of $A$ are zero, so the result follows from the assumption for $n - 1$. Otherwise, let $w_1 \in \mathbb{C}^n$ be the first column of $A$ multiplied by $\frac{1}{\sqrt{a_{11}}}$. An entry $(k, l)$, $2 \le k, l \le n$, of $w_1 w_1^*$ can be nonzero only if $(1, k)$ and $(1, l)$ are in $E$. Since the vertex 1 is simplicial, we then have $(k, l) \in E$ whenever $k \ne l$. Then, by Lemma 1.2.5, $A - w_1 w_1^*$ is a positive semidefinite matrix of rank $r - 1$, and has its first row and column equal to zero, and $(A - w_1 w_1^*)|\{2, \ldots, n\} \in \mathrm{PSD}_{G|\{2, \ldots, n\}}$. Since $G|\{2, \ldots, n\}$ is also chordal, by our assumption for $n - 1$, we have that $A - w_1 w_1^* = \sum_{i=2}^{r} w_i w_i^*$, where each $w_i \in \mathbb{C}^n$ is a nonzero vector with its first component equal to 0. This completes our proof.

Proposition 1.2.6 has two immediate corollaries.

Corollary 1.2.7 Let $P$ be a symmetric pattern with associated graph $G = (V, E)$. Then Gaussian elimination can be carried out on every $A \in \mathrm{PSD}_G$ such that in the process no entry corresponding to $(i, j) \notin P$ is changed even temporarily to a nonzero, if and only if $\sigma = [1, 2, \ldots, n]$ is a perfect scheme for $G$.

Corollary 1.2.8 Let $G = (V, E)$ be a graph. Then the lower-upper Cholesky factorization $A = LL^*$ of every $A \in \mathrm{PSD}_G$ satisfies $L_{ij} = 0$ for $1 \le j < i \le n$ such that $(i, j) \notin E$, if and only if $\sigma = [1, 2, \ldots, n]$ is a perfect scheme for $G$.

Proof. Follows immediately from the fact that $L = \begin{pmatrix} w_1 & w_2 & \cdots & w_n \end{pmatrix}$, where $w_1, \ldots, w_n$ are obtained recursively as in the proof of Proposition 1.2.6.
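The role of the vertex ordering can be seen on a single small matrix. The sketch below is an illustration, not a proof: it uses a textbook Cholesky routine for real symmetric positive definite matrices, and the star graph with centre 4 and the helper names are choices made for this example. With the centre ordered last, $[1, 2, 3, 4]$ is a perfect scheme and the factor has no fill-in; with the centre ordered first it is not, and zeros of $A$ become nonzeros of $L$.

```python
import math

def cholesky(A):
    """Lower triangular L with A = L L^T, for a real symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        L[j][j] = math.sqrt(A[j][j] - sum(L[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def fill_in(A, L, tol=1e-12):
    """Positions (i, j), i > j, counted from 1, where A is zero but L is not."""
    n = len(A)
    return [(i + 1, j + 1) for i in range(n) for j in range(i)
            if A[i][j] == 0 and abs(L[i][j]) > tol]

# Star graph with edges (1,4), (2,4), (3,4): the centre is eliminated last.
centre_last = [[2, 0, 0, 1],
               [0, 2, 0, 1],
               [0, 0, 2, 1],
               [1, 1, 1, 2]]
# Same matrix with the centre relabelled as vertex 1.
centre_first = [[2, 1, 1, 1],
                [1, 2, 0, 0],
                [1, 0, 2, 0],
                [1, 0, 0, 2]]

print(fill_in(centre_last, cholesky(centre_last)))
print(fill_in(centre_first, cholesky(centre_first)))
```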

Proposition 1.2.9 Let $G = (V, E)$ be a nonchordal graph. Then $(\mathrm{PSD}_G)^*$ is a proper subset of $\mathrm{PPSD}_G$.

Proof. For $m \ge 4$ define the $m \times m$ Toeplitz matrix

\[ A_m = \begin{pmatrix} 1 & 1 & 0 & 0 & \cdots & -1 \\ 1 & 1 & 1 & 0 & \cdots & 0 \\ 0 & 1 & 1 & 1 & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \ddots & \vdots \\ -1 & 0 & 0 & 0 & \cdots & 1 \end{pmatrix}, \tag{1.2.3} \]

the graph of which is the chordless cycle $[1, \ldots, m, 1]$. Each $A_m$ is partially positive semidefinite since both matrices $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$ are positive semidefinite. We cannot modify the zeros of $A_m$ in any way to obtain a positive semidefinite matrix, since the only positive semidefinite matrix with the three middle diagonals equal to 1 is the matrix of all 1's.
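For $m = 4$ the last step can be made fully explicit. The following worked computation is a sketch added for illustration; the symbols $x$ and $y$ for the two free entries are introduced here and do not appear in the original argument.

```latex
% A Hermitian completion of A_4 replaces the zeros in positions (1,3) and (2,4):
\[
  A_4(x, y) =
  \begin{pmatrix}
    1 & 1 & x & -1 \\
    1 & 1 & 1 & y \\
    \bar{x} & 1 & 1 & 1 \\
    -1 & \bar{y} & 1 & 1
  \end{pmatrix}.
\]
% The principal submatrix on rows and columns {1,2,3} has determinant
\[
  \det \begin{pmatrix} 1 & 1 & x \\ 1 & 1 & 1 \\ \bar{x} & 1 & 1 \end{pmatrix}
  = -(1 - \bar{x}) + x(1 - \bar{x}) = -|1 - x|^2 ,
\]
% so positive semidefiniteness forces x = 1.  With x = 1, the principal
% submatrix on rows and columns {1,3,4} has determinant
\[
  \det \begin{pmatrix} 1 & 1 & -1 \\ 1 & 1 & 1 \\ -1 & 1 & 1 \end{pmatrix}
  = 0 - 1 \cdot (1 + 1) + (-1)(1 + 1) = -4 < 0 ,
\]
% so no choice of x and y makes A_4(x, y) positive semidefinite.
```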

Let $G = (V, E)$, $V = \{1, \ldots, n\}$, be a nonchordal graph and assume without loss of generality that it contains the chordless cycle $[1, \ldots, m, 1]$, $4 \le m \le n$. Let $A$ be the matrix with graph $G$ having 1 on its main diagonal, the matrix $A_m$ in (1.2.3) as its $m \times m$ upper-left corner, and 0 on any other position. Then $A \in \mathrm{PPSD}_G$, but $A \notin (\mathrm{PSD}_G)^*$, since the zeros in its upper left $m \times m$ corner cannot be modified such that this corner becomes a positive semidefinite matrix.

The following result summarizes the above results and shows that equality between $(\mathrm{PPSD}_G)^*$ and $\mathrm{PSD}_G$ (or equivalently, between $(\mathrm{PSD}_G)^*$ and $\mathrm{PPSD}_G$) occurs exactly when $G$ is chordal.

Theorem 1.2.10 Let $G = (V, E)$ be a graph. Then the following are equivalent.

(i) $G$ is chordal.

(ii) $\mathrm{PPSD}_G = (\mathrm{PSD}_G)^*$.

(iii) $(\mathrm{PPSD}_G)^* = \mathrm{PSD}_G$.

(iv) There exists a permutation $\sigma$ of $[1, \ldots, n]$ such that after reordering the rows and columns of every $A \in \mathrm{PSD}_G$ by the order $\sigma$, $A$ has the lower-upper Cholesky factorization $A = LL^*$ with $L_{ij} = 0$ for every $1 \le j < i \le n$ such that $(i, j) \notin E$.

(v) The only matrices that generate extreme rays of $\mathrm{PSD}_G$ are rank 1 matrices.

Proof. (i) $\Rightarrow$ (iv) follows from Corollary 1.2.8, (iv) $\Rightarrow$ (iii) from Corollary 1.2.8 and Proposition 1.2.6, (iii) $\Rightarrow$ (ii) from Lemma 1.1.2 and the closedness of the cones, while (ii) $\Rightarrow$ (i) follows from Proposition 1.2.9. The implications (i) $\Rightarrow$ (iii) $\Rightarrow$ (v) follow from Proposition 1.2.6.

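Let us give an example of a higher rank matrix that generates an extreme ray in the nonchordal case.

Example 1.2.11 Let $G$ be given by the graph representing the chordless cycle $[1, 2, 3, 4, 1]$, and let

\[ K = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 2 & 1 & 0 \\ 0 & 1 & 1 & -1 \\ 1 & 0 & -1 & 2 \end{pmatrix}. \]

Then $K$ generates an extreme ray for $\mathrm{PSD}_G$. Indeed, suppose that $K = A + B$ with $A, B \in \mathrm{PSD}_G$. Notice that if we let

\[ V = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & -1 \end{pmatrix} = \begin{pmatrix} v_1 & v_2 & v_3 & v_4 \end{pmatrix}, \]

then $K = V^*V$. As $K = A + B$ and $A, B \ge 0$, it follows that the nullspace of $K$ lies in the nullspaces of both $A$ and $B$. Using this, it is not hard to see that $A$ and $B$ must be of the form

\[ A = V^* \widetilde{A} V, \quad B = V^* \widetilde{B} V, \]

where $\widetilde{A}$ and $\widetilde{B}$ are $2 \times 2$ positive semidefinite matrices. As $A, B \in \mathrm{PSD}_G$, we have that

\[ v_1^* \widetilde{A} v_3 = 0 = v_1^* \widetilde{B} v_3 \quad \text{and} \quad v_2^* \widetilde{A} v_4 = 0 = v_2^* \widetilde{B} v_4. \]

But then both $\widetilde{A}$ and $\widetilde{B}$ are orthogonal to the matrices

\[ v_3 v_1^* = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad v_1 v_3^* = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad v_2 v_4^* = \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix}, \quad v_4 v_2^* = \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}. \]

But then $\widetilde{A}$ and $\widetilde{B}$ must be multiples of the identity, and thus $A$ and $B$ are multiples of $K$.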
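The arithmetic in this example is easy to double-check. The short sketch below (helper name `gram` chosen for illustration) recomputes $K$ as the Gram matrix of the columns of $V$ and confirms that the entries in positions $(1,3)$ and $(2,4)$, the two non-edges of the four cycle, vanish.

```python
V = [[1, 1, 0, 1],
     [0, 1, 1, -1]]          # columns are v1, v2, v3, v4

def gram(V):
    """K = V^T V for a real matrix V, i.e. K[i][j] is the inner product of columns i and j."""
    cols = list(zip(*V))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

K = gram(V)
for row in K:
    print(row)

# Entries (1,3) and (2,4) correspond to the missing edges of the cycle [1, 2, 3, 4, 1].
print(K[0][2] == 0 and K[1][3] == 0)
```

Since $V$ has two linearly independent rows, $K$ has rank 2, so this is indeed a higher rank generator of an extreme ray.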

1.3 CONES OF TRIGONOMETRIC POLYNOMIALS

Similarly to Section 1.2, we introduce in this section four cones, which now contain real-valued trigonometric polynomials. We establish the relationship between these cones, identify their duals, and discuss the situation when they are pairwise equal.

Let $d \ge 1$ and let $S$ be a finite subset of $\mathbb{Z}^d$. We consider the vector space $\mathrm{Pol}_S$ of trigonometric polynomials $p$ with Fourier support in $S$, that is, $p(z) = \sum_{\lambda \in S} p_\lambda z^\lambda$, where $z = (e^{it_1}, \ldots, e^{it_d})$, $\lambda = (\lambda_1, \ldots, \lambda_d)$, and $z^\lambda = e^{i\lambda_1 t_1} \cdots e^{i\lambda_d t_d}$. Note that in the notation $\mathrm{Pol}_S$ we do not highlight the number of variables; it is implicit in the set $S$. In subsequent notation we have also chosen not to attach the number of variables, as it would make the notation more cumbersome. We hope that this does not lead to confusion. The complex numbers $\{p_\lambda\}_{\lambda \in S}$ are referred to as the (Fourier) coefficients of $p$. The Fourier support of $p$ is given by $\{\lambda : p_\lambda \ne 0\}$. We endow $\mathrm{Pol}_S$ with the inner product

\[ \langle p, q \rangle = \frac{1}{(2\pi)^d} \int_{[0,2\pi]^d} p(e^{it_1}, \ldots, e^{it_d}) \overline{q(e^{it_1}, \ldots, e^{it_d})}\, dt_1 \ldots dt_d = \sum_{\lambda \in S} p_\lambda \overline{q_\lambda}, \]

where $\{p_\lambda\}_{\lambda \in S}$ and $\{q_\lambda\}_{\lambda \in S}$ are the coefficients of $p$ and $q$, respectively. For a set $S \subset \mathbb{Z}^d$ satisfying $S = -S$ we let $\mathrm{TPol}_S$ be the vector space over $\mathbb{R}$ consisting of all trigonometric polynomials $p$ with Fourier support in $S$ such that $p(z) \in \mathbb{R}$ for $z \in \mathbb{T}^d$. Equivalently, $p \in \mathrm{TPol}_S$ if $p(z) = \sum_{s \in S} p_s z^s$ satisfies $p_s = \overline{p_{-s}}$ for $s \in S$. With the inner product inherited from $\mathrm{Pol}_S$ the space $\mathrm{TPol}_S$ is a Hilbert space over $\mathbb{R}$. We will be especially interested in the case when $S$ is of the form $S = \Lambda - \Lambda$, where $\Lambda \subset \mathbb{N}_0^d$.

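Let us introduce the following cones:

\[ \mathrm{TPol}_S^+ = \{ p \in \mathrm{TPol}_S : p(z) \ge 0 \text{ for all } z \in \mathbb{T}^d \}, \]

the cone consisting of all nonnegative valued trigonometric polynomials with support in $S$, and

\[ \mathrm{TPol}_\Lambda^2 = \Big\{ p \in \mathrm{TPol}_S : p = \sum_{j=1}^{r} |q_j|^2, \text{ with } q_j \in \mathrm{Pol}_\Lambda \Big\}, \]

consisting of all trigonometric polynomials that are sums of squares of absolute values of polynomials with support in $\Lambda$. Obviously, $\mathrm{TPol}_\Lambda^2 \subseteq \mathrm{TPol}_{\Lambda - \Lambda}^+$.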

Next we introduce two cones that are defined via Toeplitz matrices built from Fourier coefficients. In general, when we have a finite subset $\Lambda$ of $\mathbb{Z}^d$ and a trigonometric polynomial $p(z) = \sum_{s \in S} p_s z^s$, where $S = \Lambda - \Lambda$, we can define the (multivariable) Toeplitz matrix

\[ T_{p,\Lambda} = (p_{\lambda - \nu})_{\lambda, \nu \in \Lambda}. \]

It should be noticed that in this definition of the multivariable Toeplitz matrix there is some ambiguity in the order of rows and columns. We will only be interested in properties (primarily positive semidefiniteness) of the matrix that are not dependent on the order of the rows and columns, as long as the same order is used for both the rows and the columns. For example, consider $\Lambda = \{(0,0), (0,1), (1,0)\} \subseteq \mathbb{Z}^2$. If we order the elements of $\Lambda$ as indicated, we get that

\[ (p_{\lambda - \nu})_{\lambda, \nu \in \Lambda} = \begin{pmatrix} p_{0,0} & p_{0,-1} & p_{-1,0} \\ p_{0,1} & p_{0,0} & p_{-1,1} \\ p_{1,0} & p_{1,-1} & p_{0,0} \end{pmatrix}. \]

If we order $\Lambda$ as $\{(0,1), (0,0), (1,0)\}$ we get the matrix

\[ (p_{\lambda - \nu})_{\lambda, \nu \in \Lambda} = \begin{pmatrix} p_{0,0} & p_{0,1} & p_{-1,1} \\ p_{0,-1} & p_{0,0} & p_{-1,0} \\ p_{1,-1} & p_{1,0} & p_{0,0} \end{pmatrix}, \]

which is of course the previous matrix with rows and columns 1 and 2 permuted. Later on, we may refer, for instance, to the $(0,0)$th row and the $(0,1)$th column, and even the $((0,0),(0,1))$th element of the matrix. In that terminology, we have that the $(\lambda, \nu)$th element of the matrix $T_{p,\Lambda}$ is the number $p_{\lambda - \nu}$, explaining the convenience of this terminology. Of course, when $\Lambda = \{1, \ldots, n\}$ and the numbers 1 through $n$ are ordered as usual, we get a classical Toeplitz matrix and the associated terminology is the usual one. In general, though, these “multivariable” Toeplitz matrices and the associated terminology take some getting used to. Even in the case when $\Lambda \subset \mathbb{Z}$ some care is needed in these definitions. For instance, for both subsets $\Lambda = \{0, 1, 3\}$ and $\Lambda' = \{0, 1, 2, 3\}$ of $\mathbb{Z}$, we have that $S = \Lambda - \Lambda = \Lambda' - \Lambda' = \{-3, -2, -1, 0, 1, 2, 3\}$. For these sets we have that

\[ T_{p,\Lambda} = \begin{pmatrix} p_0 & p_{-1} & p_{-3} \\ p_1 & p_0 & p_{-2} \\ p_3 & p_2 & p_0 \end{pmatrix}, \tag{1.3.1} \]

while

\[ T_{p,\Lambda'} = \begin{pmatrix} p_0 & p_{-1} & p_{-2} & p_{-3} \\ p_1 & p_0 & p_{-1} & p_{-2} \\ p_2 & p_1 & p_0 & p_{-1} \\ p_3 & p_2 & p_1 & p_0 \end{pmatrix}. \tag{1.3.2} \]

The positive semidefiniteness of (1.3.1) does not guarantee positive semidefiniteness of (1.3.2). For example, when we take $p(z) = \frac{7}{10z^3} + \frac{7}{10z} + 1 + \frac{7z}{10} + \frac{7z^3}{10}$ it is easy to check that $T_{p,\Lambda}$ is positive semidefinite but $T_{p,\Lambda'}$ is not.
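The claim about this particular $p$ can be checked in exact arithmetic. The sketch below is an illustration with made-up helper names: it builds both Toeplitz matrices from the coefficients $p_0 = 1$, $p_{\pm 1} = p_{\pm 3} = \frac{7}{10}$, $p_{\pm 2} = 0$, verifies that all leading principal minors of $T_{p,\Lambda}$ are positive (so it is even positive definite by Sylvester's criterion), and exhibits a vector on which the quadratic form of $T_{p,\Lambda'}$ is negative.

```python
from fractions import Fraction

p = {0: Fraction(1), 1: Fraction(7, 10), 2: Fraction(0), 3: Fraction(7, 10)}
p.update({-k: p[k] for k in (1, 2, 3)})     # real coefficients, so p_{-k} = p_k

def toeplitz(p, index_set):
    """T_{p,Lambda} = (p_{lambda - nu}) with rows and columns ordered as index_set."""
    return [[p[lam - nu] for nu in index_set] for lam in index_set]

def det(M):
    """Determinant by cofactor expansion along the first row (fine for tiny matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def leading_minors(M):
    """Determinants of the upper-left k x k blocks, k = 1..n."""
    return [det([row[:k] for row in M[:k]]) for k in range(1, len(M) + 1)]

def quadratic_form(M, x):
    """x^T M x for a real vector x."""
    return sum(x[i] * M[i][j] * x[j] for i in range(len(x)) for j in range(len(x)))

T_small = toeplitz(p, [0, 1, 3])
T_large = toeplitz(p, [0, 1, 2, 3])

print(leading_minors(T_small))                    # all positive
print(quadratic_form(T_large, [1, -1, 1, -1]))    # negative
```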

It should be noted that the condition T_{p,Λ} ≥ 0 is equivalent to the statement that

\[
\sum_{\lambda,\mu\in\Lambda} p_{\lambda-\mu}\, c_\mu \overline{c_\lambda} \ge 0 \tag{1.3.3}
\]

for every complex sequence {c_λ}_{λ∈Λ}. Relation (1.3.3) transforms T_{p,Λ} ≥ 0 into a condition that is independent of the ordering of Λ, and in the literature one may see the material addressed in this section presented in this way. We chose to use the presentation with the positive semidefinite Toeplitz matrices because it parallels the setting of positive semidefinite matrices as discussed in the previous section.

Next, we define the notion of extendability. Given p(z) = Σ_{s∈S} p_s z^s, we say that p is extendable if we can find p_m, m ∈ Z^d \ S, such that for every finite set J ⊂ Z^d the Toeplitz matrix (p_{λ−ν})_{λ,ν∈J} is positive semidefinite. Obviously, if S = Λ − Λ, then p being extendable implies that T_{p,Λ} ≥ 0.

We are now ready to introduce the next two cones:

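\[
\mathcal{A}_\Lambda = \{ p \in \mathrm{TPol}_{\Lambda-\Lambda} : T_{p,\Lambda} \ge 0 \}
\]

and

\[
\mathcal{B}_S = \{ p \in \mathrm{TPol}_S : p \text{ is extendable} \}.
\]

Clearly, B_{Λ−Λ} ⊆ A_Λ. In general, though, we do not have equality. For instance,

\[
p(z) = \frac{7}{10z^3} + \frac{7}{10z} + 1 + \frac{7z}{10} + \frac{7z^3}{10} \in \mathcal{A}_{\{0,1,3\}} \setminus \mathcal{B}_{\{-3,\ldots,3\}}. \tag{1.3.4}
\]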

The relationship between the four convex cones introduced above is the subject of the following result.

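Proposition 1.3.1 Let Λ ⊂ Z^d be finite and put S = Λ − Λ. Then the cones TPol^2_Λ, TPol^+_S, B_S, and A_Λ are closed in TPol_S, and we have

\[
\mathrm{TPol}^2_\Lambda \subseteq \mathrm{TPol}^+_S \subseteq \mathcal{B}_S \subseteq \mathcal{A}_\Lambda. \tag{1.3.5}
\]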

Proof. The closedness of TPol^+_S, B_S, and A_Λ is trivial. For the closedness of TPol^2_Λ, observe first that dim(TPol^+_S) = card S =: m. First we show that every element of TPol^2_Λ is a sum of at most m squares. Let p ∈ TPol^2_Λ, and write

\[
p = \sum_{j=1}^{r} |q_j|^2, \tag{1.3.6}
\]


with each q_j ∈ Pol_Λ. If r ≤ m we are done, so assume that r > m. Since each |q_j|^2 is in TPol^+_S and r exceeds the dimension m, there exist real numbers r_j, not all 0, such that

\[
\sum_{j=1}^{r} r_j |q_j|^2 = 0. \tag{1.3.7}
\]

Without loss of generality we may assume that |r_1| ≥ |r_2| ≥ · · · ≥ |r_r| and |r_1| > 0. Solving (1.3.7) for |q_1|^2 and substituting that into (1.3.6) we obtain

\[
p = \sum_{j=2}^{r} \left( 1 - \frac{r_j}{r_1} \right) |q_j|^2,
\]

which is the sum of r − 1 squares. Repeating these arguments we see that every p ∈ TPol^2_Λ is the sum of at most m squares.

If p_n ∈ TPol^2_Λ for n ≥ 1 and p_n converges uniformly on T^d to p, there exist q_{j,n} ∈ Pol_Λ such that

\[
p_n = \sum_{j=1}^{m} |q_{j,n}|^2, \qquad n \ge 1. \tag{1.3.8}
\]

The sequence p_n is uniformly bounded on T^d; hence so are q_{j,n} by (1.3.8). Fix k ∈ Λ. It follows that there is a sequence n_i → ∞ such that the kth Fourier coefficients of q_{j,n_i} converge to g_{j,k}, say, for j = 1, . . . , m. Defining

\[
q_j(z) = \sum_{k\in\Lambda} g_{j,k} z^k
\]

for j = 1, . . . , m, we obviously have p = Σ_{j=1}^m |q_j|^2. Thus p ∈ TPol^2_Λ, and this proves TPol^2_Λ is closed.

The first and third inclusions in (1.3.5) are trivial. For the second inclusion, let p ∈ TPol^+_S and put p_ν = 0 for ν ∉ S. With this choice we claim that T_{p,J} ≥ 0 for all finite J ⊂ Z^d. Indeed, for a complex sequence v = (v_j)_{j∈J} define the polynomial g(z) = Σ_{j∈J} v_j z^j. Then

\[
v^* T_{p,J} v = \frac{1}{(2\pi)^d} \int_{[0,2\pi]^d} p(e^{ij_1t_1}, \ldots, e^{ij_dt_d})\, |g(e^{ij_1t_1}, \ldots, e^{ij_dt_d})|^2 \, dt \ge 0,
\]

as p(z) ≥ 0 for all z ∈ T^d. This yields that p is extendable; thus p ∈ B_S.

The next result identifies duality relations between the four cones.

Theorem 1.3.2 With the earlier notation, the following duality relations hold in TPol_S:

(i) (TPol^2_Λ)^* = A_Λ;

(ii) (TPol^+_S)^* = B_S.

In order to prove this theorem we need a particular case of a result that is known as Bochner’s theorem, which will be proved in its full generality as Theorem 3.9.2. The following result is its particular version for Z^d. For a finite Borel measure µ on T^d, its moments are defined as

\[
\hat{\mu}(n) = \frac{1}{(2\pi)^d} \int_{\mathbb{T}^d} z^{-n} \, d\mu(z), \qquad n \in \mathbb{Z}^d.
\]

All measures appearing in this book are assumed to be regular.

Theorem 1.3.3 Let µ be a finite positive Borel measure on T^d. Then the Toeplitz matrices

\[
(\hat{\mu}(j-k))_{j,k\in J}
\]

are positive semidefinite for all finite sets J ⊂ Z^d. Conversely, if (c_j)_{j∈Z^d} is a sequence of complex numbers such that for all finite J ⊂ Z^d, the Toeplitz matrices

\[
(c_{j-k})_{j,k\in J}
\]

are positive semidefinite, then there exists a finite positive Borel measure µ on T^d such that $\hat{\mu}(j) = c_j$, j ∈ Z^d.

Proof. This result is a particular case of Theorem 3.9.2. A proof for this version may be found in [314].

We also need measures with finite support. For α ∈ T^d, let δ_α denote the Dirac mass at α (or evaluation measure at α). Thus for a Borel set E ⊆ T^d we have

\[
\delta_\alpha(E) =
\begin{cases}
1 & \text{if } \alpha \in E, \\
0 & \text{otherwise.}
\end{cases}
\]

We will frequently use the fact that for a Borel measurable function f : T^d → C, ∫_{T^d} f dδ_α = f(α). Applying this to the monomials f(z) = z^{−n} we get that $\hat{\delta}_\alpha(n) = \alpha^{-n}$ for every α ∈ T^d. A positive Borel measure µ on T^d is said to be supported on the finite subset {α_1, . . . , α_n} ⊂ T^d if there exist positive constants ρ_1, . . . , ρ_n such that µ = Σ_{k=1}^n ρ_k δ_{α_k}.

Proof of Theorem 1.3.2. It is clear that for p ∈ TPol_S, we have that p ∈ (TPol^2_Λ)^* if and only if 〈p, |q|^2〉 ≥ 0 for all q ∈ Pol_Λ. Let q(z) = Σ_{λ∈Λ} c_λ z^λ; then $\langle p, |q|^2 \rangle = \sum_{\lambda,\mu\in\Lambda} p_{\lambda-\mu} c_\mu \overline{c_\lambda} = v^* T_{p,\Lambda} v$, where v = (c_λ)_{λ∈Λ}. Thus 〈p, |q|^2〉 ≥ 0 for all q ∈ Pol_Λ is equivalent to T_{p,Λ} ≥ 0. But then it follows that p ∈ (TPol^2_Λ)^* if and only if p ∈ A_Λ.

For proving (ii), let p ∈ B_S, where p(z) = Σ_{s∈S} p_s z^s. As p is extendable, we can find p_ν, ν ∈ Z^d \ S, so that (p_{j−k})_{j,k∈J} ≥ 0 for all finite J ⊂ Z^d. Apply Theorem 1.3.3 to obtain a finite positive Borel measure µ on T^d so that $\hat{\mu}(j) = p_j$, j ∈ Z^d. For every q ∈ TPol^+_S, we have that

\[
\langle q, p \rangle = \sum_{s\in S} q_s \hat{\mu}(-s) = \int_{\mathbb{T}^d} q(z) \, d\mu(z) \ge 0.
\]

This yields TPol^+_S ⊆ (B_S)^*. For the reverse inclusion, let q ∈ (B_S)^* and let α ∈ T^d. Put p_{α,S}(z) = Σ_{s∈S} α^{−s} z^s. Since p_{α,S} ∈ B_S, we have that 0 ≤ 〈q, p_{α,S}〉 = q(α). This yields that q(α) ≥ 0 for all α ∈ T^d and thus q ∈ TPol^+_S. Hence we obtain (B_S)^* ⊆ TPol^+_S, which proves relation (ii).

For one of the four cones it is easy to identify its extreme rays.

Theorem 1.3.4 Let Λ ⊂ Z^d be a finite set and put S = Λ − Λ. The extreme rays of B_S are generated by the trigonometric polynomials p_{α,S}(z) = Σ_{s∈S} α^{−s} z^s, where α ∈ T^d.

Proof. Let C_S be the closed cone generated by the trigonometric polynomials p_{α,S}, α ∈ T^d. Clearly, C_S ⊆ B_S. Next, note that q ∈ (C_S)^* if and only if 〈q, p_{α,S}〉 = q(α) ≥ 0 for all α ∈ T^d. But the latter is equivalent to the statement that q ∈ TPol^+_S, and thus (C_S)^* = TPol^+_S. Taking duals we obtain from Theorem 1.3.2 that C_S = B_S. It remains to observe that each p_{α,S} generates an extreme ray of B_S. This follows from the observation that the rank 1 positive semidefinite Toeplitz matrix T_{p_{α,S},Λ} cannot be written as a sum of positive semidefinite rank 1 matrices other than by using multiples of T_{p_{α,S},Λ} (see the proof of Proposition 1.1.4).

As a consequence we obtain the following useful decomposition result.

Theorem 1.3.5 Let Λ ⊂ Z^d be a finite set and put S = Λ − Λ. The trigonometric polynomial p belongs to B_S if and only if T_{p,Λ} can be written as

\[
T_{p,\Lambda} = \sum_{j=1}^{r} \rho_j v_j v_j^*,
\]

where r ≤ card S, ρ_j > 0 and v_j = (α_j^λ)_{λ∈Λ} with α_j ∈ T^d for j = 1, . . . , r.

Proof. Assume that m > card S and let v = Σ_{j=1}^m ρ_j v_j v_j^*, where ρ_j > 0 and v_j = (α_j^λ)_{λ∈Λ} with α_j ∈ T^d for j = 1, . . . , m. Since m > card S, the matrices v_j v_j^* are linearly dependent over R. Without loss of generality, we may assume there exist 1 ≤ l ≤ m − 1 and nonnegative numbers a_1, . . . , a_l, b_{l+1}, . . . , b_m such that

\[
\sum_{i=1}^{l} a_i v_i v_i^* - \sum_{j=l+1}^{m} b_j v_j v_j^* = 0. \tag{1.3.9}
\]

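We order decreasingly the quotients ρ_j/b_j for b_j ≠ 0, and without loss of generality we assume that among them ρ_m/b_m is the smallest. From (1.3.9) we obtain that

\[
v_m v_m^* = \left( \sum_{i=1}^{l} a_i v_i v_i^* - \sum_{j=l+1}^{m-1} b_j v_j v_j^* \right) \frac{1}{b_m},
\]

which implies that

\[
\sum_{j=1}^{m} \rho_j v_j v_j^* = \sum_{i=1}^{l} \left( \rho_i + \frac{\rho_m a_i}{b_m} \right) v_i v_i^* + \sum_{j=l+1}^{m-1} \left( \rho_j - \frac{\rho_m b_j}{b_m} \right) v_j v_j^*.
\]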


Clearly all coefficients for 1 ≤ i ≤ l are positive. The minimality of ρ_m/b_m implies that the coefficients for l + 1 ≤ j ≤ m − 1 are nonnegative, proving that every positive linear combination of matrices of the form v_j v_j^* is a positive linear combination of at most card S such matrices. Let D_S denote the cone of all matrices v of the form Σ_{j=1}^r ρ_j v_j v_j^* with ρ_j > 0 and r ≤ card S. Using a similar argument as for proving the closedness of B_G in Proposition 1.2.1, one can show that D_S is closed. The details of this are left as Exercise 1.6.17. Using Theorem 1.3.4, we have that p ∈ B_S if and only if T_{p,Λ} ∈ D_S, which is equivalent to the statement of the theorem.

1.3.1 The extension property

An important question is when equality occurs between the cones we have introduced. More specifically, for what Λ ⊆ Z^d do we have that TPol^2_Λ = TPol^+_{Λ−Λ} (or equivalently, B_{Λ−Λ} = A_Λ)? When these equalities occur, we say Λ has the extension property. It is immediate that this property is translation invariant, namely, that if Λ has the extension property then any set of the form a + Λ, a ∈ Z^d, has it. Also, if h : Z^d → Z^d is a group isomorphism, then Λ has the extension property if and only if h(Λ) does; we leave this as an exercise (see Exercise 1.6.30).

In case Λ = {0, 1, . . . , n} ⊂ Z, by the Fejér-Riesz factorization theorem (Theorem 1.1.5) we have that TPol^2_Λ = TPol^+_{Λ−Λ}. This implies the following classical Carathéodory-Fejér theorem, which solves the so-called truncated trigonometric moment problem.

Theorem 1.3.6 Let c_j ∈ C, j = −n, . . . , n, with $c_j = \overline{c_{-j}}$, j = 0, . . . , n. Let T_n be the Toeplitz matrix T_n = (c_{i−j})_{i,j=0}^n. Then the following are equivalent.

(i) There exists a finite positive Borel measure µ on T such that $c_k = \hat{\mu}(k)$, k = −n, . . . , n.

(ii) The Toeplitz matrix T_n is positive semidefinite.

(iii) The Toeplitz matrix T_n can be factored as T_n = RDR^*, where R is a Vandermonde matrix,

\[
R =
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
\alpha_1 & \alpha_2 & \cdots & \alpha_r \\
\vdots & \vdots & & \vdots \\
\alpha_1^n & \alpha_2^n & \cdots & \alpha_r^n
\end{pmatrix},
\]

with α_j ∈ T for j = 1, . . . , r, and α_j ≠ α_p for j ≠ p, and D is an r × r positive definite diagonal matrix.

(iv) There exist α_j ∈ T, j = 1, . . . , r, with α_j ≠ α_p for j ≠ p, and ρ_j > 0, j = 1, . . . , r, such that

\[
c_l = \sum_{j=1}^{r} \rho_j \alpha_j^l, \qquad |l| \le n. \tag{1.3.10}
\]


In case T_n is singular, r = rank T_n, and the α_j are uniquely determined as the roots of the polynomial

\[
p(z) = \det
\begin{pmatrix}
c_0 & c_1 & \cdots & c_r \\
\overline{c_1} & c_0 & \cdots & c_{r-1} \\
\vdots & \vdots & & \vdots \\
\overline{c_{r-1}} & \overline{c_{r-2}} & \cdots & c_1 \\
1 & z & \cdots & z^r
\end{pmatrix}.
\]

In case T_n is nonsingular, one may choose $c_{n+1} = \overline{c_{-n-1}}$ such that T_{n+1} = (c_{i−j})_{i,j=0}^{n+1} is positive semidefinite and singular, and apply the above to c_j, |j| ≤ n + 1. The measure µ can be chosen to equal µ = Σ_{j=1}^r ρ_j δ_{α_j}, where ρ_j and α_j are as above.

Proof. Let Λ = {0, . . . , n}. By Theorem 1.1.5 we have that TPol^2_Λ = TPol^+_{Λ−Λ}, and thus by duality A_Λ = B_{Λ−Λ}. Now (i) ⇒ (ii) follows from Theorem 1.3.3. (ii) ⇒ (iii) follows as p ∈ A_Λ implies p ∈ B_{Λ−Λ}, which in turn implies (iii) by Theorem 1.3.5. The equivalence of (iii) and (iv) is immediate, which leaves the implication (iv) ⇒ (i). The latter follows from Theorem 1.3.3.
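As an illustration of the singular case of Theorem 1.3.6, the following Python sketch starts from a measure with two atoms, forms the moments c_l as in (1.3.10), and checks that T_2 is singular and that both atoms are roots of the determinant polynomial p(z). The particular atoms (i and 1), the unit weights, and the function names are assumptions made for this example only. The rows of the determinant are entered as (c_{−j}, c_{1−j}, . . . , c_{r−j}) for j = 0, . . . , r − 1, so the code writes c_{−1} where the conjugate of c_1 is meant.

```python
def moments(atoms, weights, n):
    """c_l = sum_j rho_j * alpha_j**l for |l| <= n, with c_{-l} the conjugate of c_l."""
    c = {}
    for l in range(n + 1):
        c[l] = sum(rho * alpha ** l for alpha, rho in zip(atoms, weights))
        c[-l] = c[l].conjugate()
    return c

def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Hypothetical example data: two atoms on the unit circle, unit weights, n = 2.
atoms = [1j, 1 + 0j]
weights = [1.0, 1.0]
n = 2
c = moments(atoms, weights, n)

# Toeplitz matrix T_n = (c_{i-j}), i, j = 0, ..., n; here it has rank r = 2.
T = [[c[i - j] for j in range(n + 1)] for i in range(n + 1)]

def p(z):
    """Determinant polynomial of Theorem 1.3.6 for r = 2."""
    return det3([[c[0], c[1], c[2]],
                 [c[-1], c[0], c[1]],
                 [1, z, z * z]])

print(abs(det3(T)), [abs(p(a)) for a in atoms], abs(p(-1)))
```

For this data p(z) works out to 2(z − 1)(z − i), so it vanishes at the two atoms and not at, say, z = −1.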

In Exercise 1.6.20 we give an algorithm on how to find a solution to the truncated trigonometric moment problem that is a finite positive combination of Dirac masses.

We can also draw the following corollary, which will be useful in Section 5.7.

Corollary 1.3.7 Let z_1, . . . , z_n ∈ C \ {0}. Define σ_{−n}, . . . , σ_n via

\[
\sigma_k = \sum_{j=1}^{n} z_j^k, \qquad k = -n, \ldots, n.
\]

Then z_1, . . . , z_n are all distinct and lie on the unit circle T if and only if σ = (σ_{i−j})_{i,j=0}^n ≥ 0 and rank σ = n.

Proof. First suppose that z_1, . . . , z_n are all distinct and lie on the unit circle. Put

\[
V =
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
z_1 & z_2 & \cdots & z_n \\
\vdots & \vdots & & \vdots \\
z_1^n & z_2^n & \cdots & z_n^n
\end{pmatrix}.
\]

Then σ = VV^* ≥ 0, and rank σ = rank V = n.

Next suppose that σ ≥ 0 and rank σ = n. By Theorem 1.3.6 we can write σ = RDR^*, with R and D as in Theorem 1.3.6 (where r = n). Put V as above, and

\[
W =
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
z_1^{-1} & z_2^{-1} & \cdots & z_n^{-1} \\
\vdots & \vdots & & \vdots \\
z_1^{-n} & z_2^{-n} & \cdots & z_n^{-n}
\end{pmatrix}.
\]

Then σ = VW^T. As rank σ = n, we get that V and W must be of full rank. This yields that z_1, . . . , z_n are different. Next, as σ = VW^T = RDR^* has a one-dimensional kernel, there is a nonzero row vector y = (p_0 · · · p_n) such that yσ = 0, and y is unique up to multiplication by a nonzero scalar. But then we get that yV = 0 = yR, yielding that the n different numbers z_1, . . . , z_n and the n different numbers α_1, . . . , α_n are all roots of the nonzero polynomial p(z) = Σ_{i=0}^n p_i z^i. But then we must have that {z_1, . . . , z_n} = {α_1, . . . , α_n} ⊂ T, finishing the proof.

In the one variable case (d = 1), the finite subsets of Z which have the extension property are characterized by the following result.

Theorem 1.3.8 Let Λ be a nonempty finite subset of Z. Then Λ has the extension property if and only if Λ = {a, a + b, a + 2b, . . . , a + kb} for some a, b, k ∈ Z (namely, Λ is an arithmetic progression).

Before we prove this result we will develop a useful technique to identify subsets of Z^d that do not have the extension property. For Λ ⊆ Z^d and ∅ ≠ A ⊂ T^d, define

\[
\Omega_\Lambda(A) = \{ b \in \mathbb{T}^d : p(b) = 0 \text{ for all } p \in \mathrm{Pol}_\Lambda \text{ with } p|_A \equiv 0 \}, \tag{1.3.11}
\]

where p|_A ≡ 0 is short for p(a) = 0 for all a ∈ A. For a ∈ T^d and Λ ⊆ Z^d, denote L_a = row(a^λ)_{λ∈Λ}. Note that if p(z) = Σ_{λ∈Λ} p_λ z^λ, then p(a) = L_a P, where P = col(p_λ)_{λ∈Λ}. Thus one may think of L_a as the linear functional on Pol_Λ that evaluates a polynomial at the point a ∈ T^d. With this notation, we have

\[
\Omega_\Lambda(A) = \{ b \in \mathbb{T}^d : L_b P = 0 \text{ for all } P \in \mathrm{Pol}_\Lambda \text{ with } L_a P = 0 \text{ for all } a \in A \}.
\]

The following is easy to verify.

Lemma 1.3.9 For Λ ⊆ Z^d and ∅ ≠ A, B ⊂ T^d, we have

(i) A ⊆ ΩΛ(A);

(ii) A ⊆ B implies ΩΛ(A) ⊆ ΩΛ(B);

(iii) ΩΛ(A) = ΩΛ+a(A) for all a ∈ Zd.

For a nonempty set S ⊆ Z^d we let G(S) denote the smallest (by inclusion) subgroup of Z^d containing S.

In case A is a singleton and G(Λ − Λ) = Z^d, it is not hard to determine Ω_Λ(A).

Proposition 1.3.10 Let Λ ⊆ Z^d be such that G(Λ − Λ) = Z^d. Then for all a ∈ T^d we have that Ω_Λ({a}) = {a}.

Proof. Without loss of generality 0 ∈ Λ (use Lemma 1.3.9 (iii)). Let b ∈ Ω_Λ({a}), and let 0 ≠ λ ∈ Λ (which must exist as G(Λ − Λ) = Z^d). Introduce the polynomial p(z) = z^λ − a^λ. As b ∈ Ω_Λ({a}), we must have that p(b) = 0. Thus we obtain that a^λ = b^λ for all λ ∈ Λ. By taking products and inverses, and using that G(Λ − Λ) = Z^d, we obtain that a^λ = b^λ for all λ ∈ Z^d. In particular, this holds for λ = e_i, i = 1, . . . , d, where e_i ∈ Z^d has all zeros except for a 1 in the ith position. It now follows that a_i = b_i, i = 1, . . . , d, and thus a = b.

Exercise 1.6.31 shows that when G(Λ − Λ) ≠ Z^d then Ω_Λ({a}) contains more than one element.

Theorem 1.3.11 Let Λ ⊆ Z^d be a finite set such that G(Λ − Λ) = Z^d. If Λ has the extension property, then for all finite subsets A of T^d either Ω_Λ(A) = A or card Ω_Λ(A) = ∞.

Before starting the proof of Theorem 1.3.11, we need several auxiliary results. The range and kernel of a matrix T are denoted by Ran T and Ker T, respectively.

Lemma 1.3.12 Let T ≥ 0, 0 ≠ v ∈ Ran T. Let x be so that Tx = v. Then µ(v) := x^*Tx > 0 is independent of the choice of x. Moreover, $T - \frac{1}{\mu(v)} vv^* \ge 0$ and $x \in \operatorname{Ker}\left(T - \frac{1}{\mu(v)} vv^*\right)$.

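Proof. Clearly, x^*Tx ≥ 0. If x^*Tx = 0, we get that ‖T^{1/2}x‖^2 = x^*Tx = 0 and thus Tx = T^{1/2}T^{1/2}x = 0, which is a contradiction. Thus x^*Tx > 0. Next, if $Tx = v = T\tilde{x}$, then $x - \tilde{x} \in \operatorname{Ker} T$, and thus

\[
x^*Tx - \tilde{x}^*T\tilde{x} = (x - \tilde{x})^*Tx + \tilde{x}^*T(x - \tilde{x}) = 0.
\]

Next,

\[
\begin{pmatrix} T & v \\ v^* & \mu(v) \end{pmatrix}
=
\begin{pmatrix} I \\ x^* \end{pmatrix} T \begin{pmatrix} I & x \end{pmatrix} \ge 0,
\]

and thus, by Lemma 1.2.5, $T - \frac{vv^*}{\mu(v)} \ge 0$.

Finally,

\[
x^* \left( T - \frac{vv^*}{\mu(v)} \right) x = x^*Tx - \frac{(x^*Tx)^2}{\mu(v)} = 0,
\]

and since $T - \frac{vv^*}{\mu(v)} \ge 0$ it follows that $x \in \operatorname{Ker}\left(T - \frac{vv^*}{\mu(v)}\right)$.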

Lemma 1.3.13 Let

\[
T = (c_{\lambda-\mu})_{\lambda,\mu\in\Lambda} \ge 0,
\]

and

\[
\Sigma_T = \{ a \in \mathbb{T}^d : L_a P = 0 \text{ for all } P \in \operatorname{Ker} T \}.
\]

Then a ∈ Σ_T if and only if L_a^* ∈ Ran T.

Proof. As T = T^*, we have Ran T = (Ker T)^⊥, and the lemma follows.

The extension property implies that Σ_T ≠ ∅ and has a strong consequence regarding the possible forms of the decompositions as in Theorem 1.3.5.


Proposition 1.3.14 Let Λ ⊆ Z^d be a finite set that has the extension property, and let

\[
0 \ne T = (c_{\lambda-\mu})_{\lambda,\mu\in\Lambda} \ge 0.
\]

Put m = rank T, and let Σ_T be as in Lemma 1.3.13. Then Σ_T ≠ ∅, and for any b_1 ∈ Σ_T there exist b_2, . . . , b_m ∈ Σ_T and µ_1, . . . , µ_m > 0 such that

\[
T = \sum_{k=1}^{m} \frac{1}{\mu_k} L_{b_k}^* L_{b_k}.
\]

Note that necessarily L_{b_1}, . . . , L_{b_m} are linearly independent.

Proof. We first show that Σ_T ≠ ∅. As Λ has the extension property, we have that B_{Λ−Λ} = A_Λ. As T ∈ A_Λ = B_{Λ−Λ}, we obtain from Theorem 1.3.5 that we may represent T as

\[
T = \sum_{k=1}^{r} \rho_k L_{a_k}^* L_{a_k},
\]

where a_1, . . . , a_r ∈ T^d and ρ_1, . . . , ρ_r > 0. Now, if P ∈ Ker T, we get that P^*TP = Σ_{k=1}^r ρ_k |L_{a_k}P|^2 = 0 and thus L_{a_k}P = 0, k = 1, . . . , r. This yields that a_1, . . . , a_r ∈ Σ_T.

Let now b_1 ∈ Σ_T. By Lemma 1.3.13, 0 ≠ L_{b_1}^* = Tx for some x ≠ 0. If we put µ_1 = x^*Tx, we have by Lemma 1.3.12 that µ_1 > 0, $\widetilde{T} := T - \frac{1}{\mu_1} L_{b_1}^* L_{b_1} \ge 0$, and $x \in \operatorname{Ker} \widetilde{T}$. In particular, since Tx = L_{b_1}^* ≠ 0 and $\operatorname{Ker} T \subseteq \operatorname{Ker} \widetilde{T}$ (as $0 \le \widetilde{T} \le T$), we get that $\dim \operatorname{Ker} \widetilde{T} = \dim \operatorname{Ker} T + 1$, and thus $\operatorname{rank} \widetilde{T} = \operatorname{rank} T - 1$.

If $\widetilde{T} = 0$, we are done. If not, note that $\Sigma_{\widetilde{T}} \subseteq \Sigma_T$, and repeat the above with $\widetilde{T}$ instead of T. As in each step the rank decreases by 1, we are done in at most m steps.

Lemma 1.3.15 Let x = (x_1, . . . , x_m)^T ∈ C^m, and suppose x has all nonzero components. Then there is no nonempty finite subset F ⊆ C^m with the properties that 0 ∉ F and that for every d_1, . . . , d_m > 0 there exists y = (y_1, . . . , y_m)^T ∈ F satisfying Σ_{i=1}^m d_i x_i y_i = 0.

Proof. We prove the claim by induction. When m = 1 the statement is obviously true. Suppose now that the result holds for m, and we prove it for m + 1. Let x = (x_1, . . . , x_{m+1})^T ∈ C^{m+1} be given with x_i ≠ 0, i = 1, . . . , m + 1, and suppose F ⊂ C^{m+1} is finite. We can assume (y_1, . . . , y_m) ≠ (0, . . . , 0) for any (y_1, . . . , y_m, y_{m+1})^T ∈ F. Indeed, otherwise we would have that d_{m+1}x_{m+1}y_{m+1} ≠ 0 for all d_{m+1} > 0, which is impossible since x_{m+1} ≠ 0 and y_{m+1} ≠ 0 (as 0 ∉ F). But then we can apply our induction hypothesis, and conclude that there exist d_1^0, . . . , d_m^0 > 0 such that

\[
\sum_{i=1}^{m} d_i^0 x_i y_i \ne 0
\]


for all (y_1, . . . , y_m, y_{m+1})^T ∈ F. If the equation

\[
\sum_{i=1}^{m} d_i^0 x_i y_i + d_{m+1} x_{m+1} y_{m+1} = 0
\]

holds, then necessarily y_{m+1} ≠ 0 and

\[
d_{m+1} = - \frac{\sum_{i=1}^{m} d_i^0 x_i y_i}{x_{m+1} y_{m+1}}.
\]

Since the right-hand side can take on only a finite number of values, we get that there must exist a d_{m+1}^0 > 0 so that

\[
\sum_{i=1}^{m+1} d_i^0 x_i y_i \ne 0
\]

for all (y_1, . . . , y_m, y_{m+1})^T ∈ F, proving our claim.

We are now ready to prove Theorem 1.3.11.

Proof of Theorem 1.3.11. We do this by induction on m = card A. When m = 1 it follows from Proposition 1.3.10. Next suppose that the result has been proven for sets up to cardinality m − 1.

Let A = {a_1, . . . , a_m}. If L_{a_1}, . . . , L_{a_m} are linearly dependent, then, without loss of generality, L_{a_m} belongs to the span of L_{a_1}, . . . , L_{a_{m−1}}. But then, for p(z) = Σ_{λ∈Λ} p_λ z^λ ∈ Pol_Λ and P = col(p_λ)_{λ∈Λ}, we have that p(a_k) = L_{a_k}P = 0, k = 1, . . . , m − 1, implies that p(a_m) = L_{a_m}P = 0. This gives that Ω_Λ({a_1, . . . , a_m}) = Ω_Λ({a_1, . . . , a_{m−1}}), and one can finish this case by using the induction assumption.

Next assume that L_{a_1}, . . . , L_{a_m} are linearly independent, and that the set Ω_Λ({a_1, . . . , a_m}) is finite (otherwise we are done). By Lemma 1.3.9 (ii) this implies that Ω_Λ({a_1, . . . , a_{m−1}}) is a finite set as well. As L_{a_1}, . . . , L_{a_m} are linearly independent, we can find U_1, . . . , U_m so that

\[
\begin{pmatrix} L_{a_1} \\ \vdots \\ L_{a_m} \end{pmatrix}
\begin{pmatrix} U_1 & \cdots & U_m \end{pmatrix} = I_m,
\]

or, equivalently, L_{a_i}U_j = δ_{ij} with δ_{ij} being the Kronecker delta. Let b_1 ∈ Ω_Λ({a_1, . . . , a_m}). We claim that L_{b_1} ∈ Span{L_{a_1}, . . . , L_{a_m}}. It suffices to show that L_{b_1}P = 0 for all P with L_{a_1}P = · · · = L_{a_m}P = 0. But this follows directly from the definition of Ω_Λ({a_1, . . . , a_m}).

Next, suppose that L_{b_1}U_i = 0 for some i = 1, . . . , m. Without loss of generality, L_{b_1}U_m = 0. As L_{b_1} is in the span of L_{a_1}, . . . , L_{a_m}, we may write L_{b_1} = Σ_{i=1}^m z_i L_{a_i} for some complex numbers z_i. But then 0 = L_{b_1}U_m = Σ_{i=1}^m z_i L_{a_i}U_m = z_m. Thus it follows that L_{b_1} lies in the span of L_{a_1}, . . . , L_{a_{m−1}}. This, as before, gives that b_1 ∈ Ω_Λ({a_1, . . . , a_{m−1}}). As card Ω_Λ({a_1, . . . , a_{m−1}}) < ∞ we get by the induction assumption that b_1 ∈ Ω_Λ({a_1, . . . , a_{m−1}}) = {a_1, . . . , a_{m−1}}, and we are done.


Finally, we are in the case when L_{b_1}U_i ≠ 0 for all i = 1, . . . , m. We will show that this case cannot occur as we will reach a contradiction. Let D = diag(d_i)_{i=1}^m > 0, and consider

\[
T(D) = \sum_{i=1}^{m} d_i L_{a_i}^* L_{a_i}.
\]

Note that Σ_{T(D)} = Ω_Λ({a_1, . . . , a_m}), where Σ_T is defined in Proposition 1.3.14. By Proposition 1.3.14 there exist µ_k = µ_k(D) > 0, k = 1, . . . , m, and b_2(D), . . . , b_m(D) ∈ Ω_Λ({a_1, . . . , a_m}) such that

\[
T(D) = \frac{1}{\mu_1(D)} L_{b_1}^* L_{b_1} + \sum_{k=2}^{m} \frac{1}{\mu_k(D)} L_{b_k(D)}^* L_{b_k(D)}.
\]

Thus

\[
\sum_{i=1}^{m} d_i L_{a_i}^* L_{a_i} = \frac{1}{\mu_1(D)} L_{b_1}^* L_{b_1} + \sum_{k=2}^{m} \frac{1}{\mu_k(D)} L_{b_k(D)}^* L_{b_k(D)}.
\]

Multiplying this equation with U = (U_1 · · · U_m) on the right and with U^* on the left, we get that

\[
D = Q^* \operatorname{diag}(\mu_i(D))_{i=1}^{m} Q,
\]

where

\[
Q =
\begin{pmatrix} L_{b_1} \\ L_{b_2(D)} \\ \vdots \\ L_{b_m(D)} \end{pmatrix} U.
\]

Thus

\[
I_m = \left( Q^* \operatorname{diag}(\mu_i(D))_{i=1}^{m} \right) (Q D^{-1}),
\]

and as all matrices in this equation are m × m we also get

\[
I_m = (Q D^{-1}) \left( Q^* \operatorname{diag}(\mu_i(D))_{i=1}^{m} \right).
\]

Looking at the (1, 2) entry of this equality (remember m ≥ 2) and dividing by µ_2(D) > 0, we get that for all D = diag(d_i)_{i=1}^m > 0 there exists b_2(D) ∈ Ω_Λ({a_1, . . . , a_m}) such that

\[
\begin{pmatrix} \frac{1}{d_1} L_{b_1}U_1 & \cdots & \frac{1}{d_m} L_{b_1}U_m \end{pmatrix} U^* L_{b_2(D)}^* = 0.
\]

As L_{b_1}U_i ≠ 0, i = 1, . . . , m, we get by Lemma 1.3.15 that there must be infinitely many different vectors U^*L_{b_2(D)}^*. But as each b_2(D) lies in the finite set Ω_Λ({a_1, . . . , a_m}), we have a contradiction.

Corollary 1.3.16 Assume Λ ⊂ Z^d is a finite subset such that G(Λ − Λ) = Z^d. If there exist k linearly independent polynomials p_1, . . . , p_k ∈ Pol_Λ such that the set A = ∩_{i=1}^k {a ∈ T^d : p_i(a) = 0} is a finite set with cardinality greater than card Λ − k, then Λ fails to have the extension property.


Proof. The card A × card Λ matrix col(L_a)_{a∈A} has a kernel of dimension at least k, and thus rank col(L_a)_{a∈A} ≤ card Λ − k < card A. Thus the set of linear functionals {L_a}_{a∈A} is linearly dependent. Let {a_1, . . . , a_m} be a maximal subset of A for which the set {L_{a_i}}_{i=1}^m is linearly independent. Then A = Ω_Λ({a_1, . . . , a_m}) is a finite set of cardinality greater than m. Thus by Theorem 1.3.11, Λ does not have the extension property.

We now have the techniques to easily prove Theorem 1.3.8.

Proof of Theorem 1.3.8. Since having the extension property is a translation invariant property, we may assume that a = 0. When b = 1, Λ = {0, 1, . . . , k} has the extension property by Theorem 1.3.6. Next, replacing z by z^b, it easily follows that Λ = {0, b, 2b, . . . , kb} has the extension property.

Assume now that Λ ⊂ Z has the extension property, G(Λ − Λ) = Z, 0 is the smallest element of Λ, and N is the largest. Then z^N − 1 ∈ Pol_Λ has N roots on T, and consequently Corollary 1.3.16 implies that Λ has at least N + 1 elements. Thus Λ = {0, 1, . . . , N}. If G(Λ − Λ) ≠ Z, then, letting b be the greatest common divisor of the nonzero elements of Λ, it is easy to see that G(Λ − Λ) = bZ. By considering polynomials in z^b as opposed to polynomials in z, one easily reduces it to the above case and sees that Λ must equal {0, b, 2b, . . . , kb}, where k = N/b.

In the multivariable case we consider finite sets Λ of the form

\[
R(N_1, \ldots, N_d) := \{0, \ldots, N_1\} \times \cdots \times \{0, \ldots, N_d\}, \tag{1.3.12}
\]

where R stands for rectangular.

Now that the one-variable case has been settled, let us first consider a three-variable case.

Theorem 1.3.17 The extension property fails for Λ = R(N_1, N_2, N_3), for every N_1, N_2, N_3 ≥ 1.

Proof. The set of the common zeros of the polynomials (z_1 − 1)(z_2z_3 + 1), (z_2 − 1)(z_1z_3 + 1), and (z_3 − 1)(z_1z_2 + 1) consists of the points in T^3:

\[
\alpha_1 = (1, 1, 1), \quad \alpha_2 = (1, -1, -1), \quad \alpha_3 = (-1, 1, -1),
\]

\[
\alpha_4 = (-1, -1, 1), \quad \alpha_5 = (i, i, i), \quad \alpha_6 = (-i, -i, -i).
\]

This implies that the set of common zeros in T^3 of the polynomials $(z_1^{N_1} - 1)(z_2^{N_2} z_3^{N_3} + 1)$, $(z_2^{N_2} - 1)(z_1^{N_1} z_3^{N_3} + 1)$, and $(z_3^{N_3} - 1)(z_1^{N_1} z_2^{N_2} + 1)$ belonging to Pol_Λ has cardinality 6N_1N_2N_3. Since card Λ = (N_1 + 1)(N_2 + 1)(N_3 + 1), Corollary 1.3.16 with k = 3 implies that the extension property fails for R(N_1, N_2, N_3) whenever

\[
6N_1N_2N_3 > (N_1 + 1)(N_2 + 1)(N_3 + 1) - 3,
\]

an inequality that is true when N_i ≥ 1, i = 1, 2, 3.
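The two elementary facts used in this proof can be confirmed by direct computation. The sketch below is an illustration, not part of the argument: it evaluates the symmetric triple (z_1 − 1)(z_2z_3 + 1), (z_2 − 1)(z_1z_3 + 1), (z_3 − 1)(z_1z_2 + 1) at the six listed points, counts the common zeros among all triples of fourth roots of unity (a finite search that contains the six points), and tests the counting inequality on a finite range of N_1, N_2, N_3. The search range and the names `polys`, `points`, and `inequality_holds` are choices made for this example.

```python
from itertools import product

def polys(z1, z2, z3):
    """The three polynomials from the proof, evaluated at (z1, z2, z3)."""
    return ((z1 - 1) * (z2 * z3 + 1),
            (z2 - 1) * (z1 * z3 + 1),
            (z3 - 1) * (z1 * z2 + 1))

points = [(1, 1, 1), (1, -1, -1), (-1, 1, -1),
          (-1, -1, 1), (1j, 1j, 1j), (-1j, -1j, -1j)]

# Every listed point is a common zero (arithmetic is exact for these values).
all_vanish = all(value == 0 for pt in points for value in polys(*pt))

# Among the 64 triples of fourth roots of unity, count the common zeros.
fourth_roots = [1, -1, 1j, -1j]
zero_count = sum(1 for pt in product(fourth_roots, repeat=3)
                 if all(value == 0 for value in polys(*pt)))

# The counting inequality of the proof, tested for 1 <= N_i <= 7.
inequality_holds = all(6 * a * b * c > (a + 1) * (b + 1) * (c + 1) - 3
                       for a, b, c in product(range(1, 8), repeat=3))

print(all_vanish, zero_count, inequality_holds)
```

The finite search finds exactly six common zeros, in agreement with the list α_1, . . . , α_6, and the inequality holds on the whole tested range.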

The remainder of this section is devoted to proving the following characterization of all the finite subsets of Z^2 that have the extension property. We say that two finite subsets Λ_1 and Λ_2 of Z^2 are isomorphic if there exists a group isomorphism Φ on Z^2 so that Φ(Λ_1) = Λ_2.


Theorem 1.3.18 A finite set Λ ⊆ Z^2 with G(Λ − Λ) = Z^2 has the extension property if and only if for some a ∈ Z^2 the set Λ − a is isomorphic to R(0, n), R(1, n), or R(1, n) \ {(1, n)}, for some n ∈ N.

We prove this result in several steps. We use the following terminology. Let R ⊂ Z^d. We say that (c_k)_{k∈R−R} is a positive semidefinite sequence with respect to R if (c_{k−l})_{k,l∈L} ≥ 0 for all finite subsets L of R. Clearly, if R is finite it suffices to check whether (c_{k−l})_{k,l∈R} ≥ 0. Often the choice of the set R is clear, in which case we will just say that (c_k)_{k∈R−R} is a positive semidefinite sequence.

Theorem 1.3.19 For every n ≥ 1, R(1, n) has the extension property.

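Proof. Let S = R(1, n) − R(1, n) and let (c_{kl})_{(k,l)∈S} be a positive semidefinite sequence. Thus

\[
\widetilde{C}_n =
\begin{pmatrix}
C_0 & C_1^* & \cdots & C_{n-1}^* & C_n^* \\
C_1 & C_0 & \ddots & & C_{n-1}^* \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
C_{n-1} & & \ddots & \ddots & C_1^* \\
C_n & C_{n-1} & \cdots & C_1 & C_0
\end{pmatrix} \ge 0, \tag{1.3.13}
\]

where

\[
C_j = \begin{pmatrix} c_{0j} & c_{1j} \\ c_{-1,j} & c_{0j} \end{pmatrix}, \qquad 0 \le j \le n.
\]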

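Consider now the partial matrix

\[
\begin{pmatrix}
c_{00} & P^*C_1^* & \cdots & P^*C_n^* & X^* \\
C_1P & C_0 & \cdots & C_{n-1}^* & C_n^* \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
C_nP & C_{n-1} & \cdots & C_0 & C_1^* \\
X & C_n & \cdots & C_1 & C_0
\end{pmatrix}, \tag{1.3.14}
\]

where $P = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and X is the unknown. Since $\widetilde{C}_n$ is positive semidefinite, the partial matrix (1.3.14) is partially positive semidefinite. By Theorem 1.2.10 (i) ⇒ (ii) we can find an $X = \begin{pmatrix} \beta \\ \alpha \end{pmatrix}$ so that (1.3.14) is positive semidefinite. Using the Toeplitz structure of C_j, 0 ≤ j ≤ n, it is now not hard to see that the following matrix is positive semidefinite as well:

\[
\begin{pmatrix}
C_0 & C_1^* & \cdots & C_n^* & Y^* \\
C_1 & C_0 & \cdots & C_{n-1}^* & C_n^*Q \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
C_n & C_{n-1} & \cdots & C_0 & C_1^*Q \\
Y & Q^*C_n & \cdots & Q^*C_1 & c_{00}
\end{pmatrix}, \tag{1.3.15}
\]

where $Q = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $Y = \begin{pmatrix} \alpha & \beta \end{pmatrix}$. Indeed, (1.3.15) may be obtained from (1.3.14) by reversing the rows and columns and taking complex conjugates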


entrywise. Next consider

\[
\widetilde{C}_{n+1} =
\begin{pmatrix}
C_0 & C_1^* & \cdots & C_n^* & C_{n+1}^* \\
C_1 & C_0 & \cdots & C_{n-1}^* & C_n^* \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
C_n & C_{n-1} & \cdots & C_0 & C_1^* \\
C_{n+1} & C_n & \cdots & C_1 & C_0
\end{pmatrix},
\]

where

\[
C_{n+1} = \begin{pmatrix} \alpha & \beta \\ \gamma & \alpha \end{pmatrix}
\]

with γ as an unknown. As (1.3.14) and (1.3.15) are positive semidefinite, we have that $\widetilde{C}_{n+1}$ is partially positive semidefinite. Applying again Theorem 1.2.10 (i) ⇒ (ii) we can find a γ so that $\widetilde{C}_{n+1}$ is positive semidefinite. Define now c_{0,n+1} = α, c_{1,n+1} = β, and c_{−1,n+1} = γ. This way we have extended the given positive sequence to a positive sequence supported in the set R(1, n + 1) − R(1, n + 1). We repeat successively the previous construction for n + 1, n + 2, . . .. This process produces a positive semidefinite extension of the given sequence supported in the infinite band I_1 = R_1 − R_1, where R_1 = {0, 1} × N. Thus, by Exercise 1.6.34, R(1, n) has the extension property. This completes the proof.

In Exercise 2.9.33 we will point out an alternative proof for Theorem 1.3.19 that will make use of the Fejér-Riesz factorization theorem (Theorem 2.4.16).

Our next result introduces a new class of sets with the extension property.

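Theorem 1.3.20 For every n ≥ 1, R(1, n) \ {(1, n)} has the extension property.

Proof. Let (c_{kl}), with (k, l) ranging over R(1, n) \ {(1, n)} − R(1, n) \ {(1, n)}, be an arbitrary positive semidefinite sequence. By Theorem 1.3.19, it is sufficient to prove that the sequence can be extended to a positive semidefinite sequence supported in S = R(1, n) − R(1, n). The Toeplitz form associated with this sequence is

\[
\widetilde{D}_n =
\begin{pmatrix}
C_0 & C_1^* & \cdots & C_{n-1}^* & D_n^* \\
C_1 & C_0 & \cdots & C_{n-2}^* & D_{n-1}^* \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
C_{n-1} & C_{n-2} & \cdots & C_0 & D_1^* \\
D_n & D_{n-1} & \cdots & D_1 & c_{00}
\end{pmatrix},
\]

where

\[
C_j = \begin{pmatrix} c_{0j} & c_{1j} \\ c_{-1,j} & c_{0j} \end{pmatrix}
\quad \text{and} \quad
D_j = \begin{pmatrix} c_{0j} & c_{1j} \end{pmatrix}.
\]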

It is easy to observe that deleting the first row and first column (resp., the last row and last column) of the matrix (1.3.13) of a sequence supported in R(1, n) − R(1, n), one obtains (a permuted version of) the matrix $\widetilde{D}_n$. Thus every positive semidefinite matrix $\widetilde{D}_n$ can be extended to a positive semidefinite matrix of the form (1.3.13), and this completes the proof.

We would like to make a remark at this point.

Remark 1.3.21 Theorems 1.3.19 and 1.3.20 are true for scalar matrices only. They are not true when the entries are 2 × 2 or larger. Take for instance in Theorem 1.3.20, Λ = {(0, 0), (0, 1), (1, 0)}, $C_{00} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, and C_{01} and C_{10} to be two noncommuting unitary matrices. Then, in order to be positive semidefinite with respect to Λ, by Lemma 1.2.5, a sequence must satisfy C_{−1,1} = C_{01}^*C_{10}. We will show that no C_{11} exists for which the matrix of the sequence restricted to Λ′ = {(0, 0), (0, 1), (1, 0), (1, 1)} is positive semidefinite. Indeed, by a Schur complement type argument, we must have that C_{11} is unitary and C_{11} = C_{01}C_{10} = C_{10}C_{01}. The last equality does not hold, because C_{01} and C_{10} do not commute by assumption.

Theorem 1.3.22 The set R(N1, N2) has the extension property if and onlyif min(N1, N2) = 1.

We first need an auxiliary result.

Lemma 1.3.23 The real polynomial

F (s, t) = s2t2(s2 + t2 − 1) + 1

is strictly positive but it is not the sum of squares of polynomials with realcoefficients.

Proof. It is an easy exercise (see Exercise 1.6.25) to show that F takes on the minimum value 26/27 at s = ±1/√3 and t = ±1/√3.

Assume that

\[
F = F_1^2 + \cdots + F_n^2 \tag{1.3.16}
\]

for some real polynomials F_1, . . . , F_n. From F(s, 0) = F(0, t) = 1 it follows that F_i(s, 0) and F_i(0, t) are constant for i = 1, . . . , n, so

\[
F_i(s, t) = a_i + stH_i(s, t) \tag{1.3.17}
\]

for some constant a_i and some polynomial H_i of degree at most one. Substituting (1.3.17) into (1.3.16) and comparing the coefficients, we obtain

\[
s^2t^2(s^2 + t^2 - 1) = s^2t^2 \sum_{i=1}^{n} H_i^2(s, t).
\]

This is a contradiction as the right side is always nonnegative while the left is negative if st ≠ 0 and s^2 + t^2 < 1.
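The minimum value quoted in this proof can be sanity-checked with exact arithmetic. The following sketch is only an illustration: it rewrites F in the variables u = s^2 and v = t^2 (an assumption of convenience, so that the minimizer becomes the rational point u = v = 1/3) and scans a rational grid on [0, 2] × [0, 2]; the grid and the name `grid_min` are choices made for this example, and a grid scan is of course not a proof.

```python
from fractions import Fraction

def F(u, v):
    """F(s, t) = s^2 t^2 (s^2 + t^2 - 1) + 1, written with u = s^2 and v = t^2."""
    return u * v * (u + v - 1) + 1

# Rational grid with step 1/30 on [0, 2]; it contains the point u = v = 1/3.
grid = [Fraction(k, 30) for k in range(61)]
grid_min = min(F(u, v) for u in grid for v in grid)

value_at_third = F(Fraction(1, 3), Fraction(1, 3))
print(grid_min, value_at_third)
```

Both printed values equal 26/27, consistent with F being strictly positive.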

Proof of Theorem 1.3.22. We prove the theorem by showing that for Λ = R(N_1, N_2) and S = Λ − Λ, TPol^+_S ≠ TPol^2_Λ.


Let X be the linear space of all real polynomials of the form F(s, t) = Σ_{m=0}^{2N_1} Σ_{n=0}^{2N_2} a_{mn} s^m t^n. Every p ∈ TPol_S is of the form

\[
p(z, w) = \sum_{m=-N_1}^{N_1} \sum_{n=-N_2}^{N_2} c_{mn} z^m w^n
\]

with $c_{-m,-n} = \overline{c_{mn}}$. Define the linear map Ψ : TPol_S → X by

\[
(\Psi(f))(s, t) = (1 + s^2)^{N_1} (1 + t^2)^{N_2} f\left( \frac{s + i}{s - i}, \frac{t + i}{t - i} \right).
\]

It is clear Ψ is one-to-one, and since dim_R(TPol_S) = dim_R X = (2N_1 + 1)(2N_2 + 1), it follows that Ψ is a linear isomorphism of TPol_S onto X.

Assume that f ∈ TPol^2_Λ. Then f = Σ_{j=1}^r |g_j|^2, where each g_j ∈ Pol_Λ. Define for j = 1, . . . , r,

\[
G_j(s, t) = (s - i)^{N_1} (t - i)^{N_2} g_j\left( \frac{s + i}{s - i}, \frac{t + i}{t - i} \right).
\]

Each G_j is a complex polynomial of degree at most N_1 in s and at most N_2 in t, and

\[
(\Psi(f))(s, t) = \sum_{j=1}^{r} |G_j(s, t)|^2.
\]

Define F = Ψ(f) and let G_j = u_j + iv_j, where u_j and v_j are polynomials with real coefficients. We have then F ∈ X, and F = Σ_{j=1}^r (u_j^2 + v_j^2), so F is a sum of squares. This cannot happen when F is the polynomial in Lemma 1.3.23, and this implies TPol^+_S ≠ TPol^2_Λ.

Theorem 1.3.24 Let Λ ⊂ Z^2 be a finite subset containing the points (0, 0), (m, 0), (0, n), and (m, n), where m, n ≥ 2. If card Λ ≤ (m + 1)(n + 1) and G(S) = Z^2, then the extension property fails for Λ. In particular, the extension property fails for R(m, n) for m, n ≥ 2.

Proof. The polynomial 1 + z_1 + z_2 has two roots on T^2, $(\alpha, \overline{\alpha})$ and $(\overline{\alpha}, \alpha)$, where α = e^{2πi/3}. Therefore, the polynomial 1 + z_1^m + z_2^n ∈ Pol_Λ has 2mn roots on T^2. If m, n ≥ 2 and either m ≥ 3 or n ≥ 3, we have that card(Λ) ≤ (m + 1)(n + 1) ≤ 2mn, and so Λ fails the extension property by Corollary 1.3.16. When m = n = 2, the condition in Corollary 1.3.16 with k = 2 is verified by the polynomials 1 − z_1^2z_2^2 and z_1^2 − z_2^2, which have 8 common zeros on T^2, proving R(2, 2) also fails the extension property.
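The counts used in this proof can be reproduced numerically. The sketch below is illustrative only: it checks that α = e^{2πi/3} paired with its complex conjugate is a root of 1 + z_1 + z_2, while the pair (α, α) is not, counts the common zeros of 1 − z_1^2z_2^2 and z_1^2 − z_2^2 among pairs of fourth roots of unity (any common zero on T^2 satisfies z_1^4 = 1 and z_2^4 = 1, so this finite search is exhaustive), and tests the inequality (m + 1)(n + 1) ≤ 2mn on a finite range chosen for the example.

```python
import cmath
from itertools import product

alpha = cmath.exp(2j * cmath.pi / 3)

# (alpha, conj(alpha)) is a root of 1 + z1 + z2; (alpha, alpha) is not.
residual_conj = abs(1 + alpha + alpha.conjugate())
residual_same = abs(1 + alpha + alpha)

# Common zeros of 1 - z1^2 z2^2 and z1^2 - z2^2 among fourth roots of unity.
fourth_roots = [1, -1, 1j, -1j]
common_zeros = [(z1, z2) for z1, z2 in product(fourth_roots, repeat=2)
                if 1 - z1**2 * z2**2 == 0 and z1**2 - z2**2 == 0]

# (m+1)(n+1) <= 2mn for m, n >= 2 with max(m, n) >= 3, tested up to 9.
count_bound = all((m + 1) * (n + 1) <= 2 * m * n
                  for m in range(2, 10) for n in range(2, 10) if max(m, n) >= 3)

print(residual_conj, residual_same, len(common_zeros), count_bound)
```

For m = n = 2 the search returns 8 common zeros, which exceeds card R(2, 2) − 2 = 7, as Corollary 1.3.16 with k = 2 requires.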

A partial Toeplitz matrix is a partial matrix that is Toeplitz to the extent to which it is specified; that is, all specified entries lie along certain specified diagonals, and all the entries along a specified diagonal have the same value. Positive semidefinite Toeplitz completions are of special interest. The trigonometric moment problem is equivalent to the fact that every partially positive Toeplitz matrix with entries specified in a band |i − j| ≤ k, for some specified k, admits a positive Toeplitz completion.

By the pattern of a partial Toeplitz matrix we mean the set of its specified diagonals. The main diagonal is always assumed to be specified. Thus the pattern of a partially positive semidefinite Toeplitz matrix can be considered to be a subset of {1, 2, . . . , n}. A pattern P is said to be (positive semidefinite) completable if every partially positive Toeplitz matrix with pattern P admits a positive semidefinite completion. We will use the following result.

Theorem 1.3.25 A pattern P ⊆ {1, 2, . . . , n} is completable if and only if it is an arithmetic progression.

Proof. The proof may be found in [336].

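Although the statement above and Theorem 1.3.8 seem to be quite close, they are rather different. For instance, if we let $\Lambda = P = \{1, 2, 4\}$, then Theorem 1.3.8 yields that there exist $c_k$, $k \in \Lambda - \Lambda = \{-3, \ldots, 3\}$, such that

$$\begin{pmatrix} c_0 & c_{-1} & c_{-3} \\ c_1 & c_0 & c_{-2} \\ c_3 & c_2 & c_0 \end{pmatrix} \ge 0
\quad\text{and}\quad
\begin{pmatrix} c_0 & c_{-1} & c_{-2} & c_{-3} \\ c_1 & c_0 & c_{-1} & c_{-2} \\ c_2 & c_1 & c_0 & c_{-1} \\ c_3 & c_2 & c_1 & c_0 \end{pmatrix} \not\ge 0.$$

On the other hand, Theorem 1.3.25 yields in this case that there exist $c_k = \overline{c_{-k}}$, $k = 0, 1, 2, 4$, so that the partial matrix

$$\begin{pmatrix} c_0 & c_{-1} & c_{-2} & ? & c_{-4} \\ c_1 & c_0 & c_{-1} & c_{-2} & ? \\ c_2 & c_1 & c_0 & c_{-1} & c_{-2} \\ ? & c_2 & c_1 & c_0 & c_{-1} \\ c_4 & ? & c_2 & c_1 & c_0 \end{pmatrix}$$

is partially positive semidefinite but no positive semidefinite Toeplitz completion exists.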

Next, we will introduce a canonical form for a finite set $\Lambda \subset \mathbb{Z}^2$. We will be interested in those sets satisfying $G(\Lambda - \Lambda) = \mathbb{Z}^2$. Exercise 1.6.35 gives an easy method to determine whether a finite set $\Lambda \subset \mathbb{Z}^2$ satisfies $G(\Lambda - \Lambda) = \mathbb{Z}^2$ or not. By Exercise 1.6.30, a finite subset $\Lambda \subset \mathbb{Z}^2$ has the extension property if and only if every translation of $\Lambda$ by a vector $a \in \mathbb{Z}^2$ and every set isomorphic to $\Lambda$ has the same property.

Let $d_1 = \max\{p_1 - p_2 : (p_1, q), (p_2, q) \in \Lambda\}$, $d_2 = \max\{q_1 - q_2 : (p, q_1), (p, q_2) \in \Lambda\}$, and $d = \max\{d_1, d_2\}$. By using a translation and possibly also interchanging the order of coordinates in $\mathbb{Z}^2$, we may assume that $(0,0), (d,0) \in \Lambda$, and that for each $k < 0$, $\max\{p_1 - p_2 : (p_1, k), (p_2, k) \in \Lambda\} < d$. If $\Lambda$ has the above properties, then we say it is in canonical form. This notion plays a crucial role in our considerations. For $\Lambda$ in canonical form, let $m = -\min\{l : (k, l) \in \Lambda\}$ and $M = \max\{l : (k, l) \in \Lambda\}$. Without loss of generality, we assume that $M \ge m$ for every $\Lambda$ in canonical form.


If $\Lambda$ is in canonical form and $M = m = 0$, then $\Lambda$ is isomorphic to a subset of $\mathbb{Z}$ for which we may apply Theorem 1.3.8. Therefore we may assume that $M \ge 1$.

Lemma 1.3.26 Suppose that $\Lambda$ is in canonical form and has the extension property. Let $\Sigma = \{k : (k, 0) \in \Lambda - \Lambda\}$ and $\Lambda(l) = \{(k, l) \in \Lambda\}$, $l \in \mathbb{Z}$. Then $\Sigma$ forms an arithmetic progression. Moreover, there exists an $l$ such that $\Sigma = \{k : (k, 0) \in \Lambda(l) - \Lambda(l)\}$.

Proof. Let $N = \operatorname{card}\Sigma$. Define the sequence $\{c_{k,l}\}_{(k,l)\in\Lambda-\Lambda}$ by $c_{k,l} = c_k$ whenever $l = 0$ and $c_{k,l} = 0$ whenever $l \neq 0$. The sequence $\{c_k\}$ will be specified later in the proof. The matrix of the sequence $\{c_{k,l}\}_{(k,l)\in\Lambda-\Lambda}$ can be written in block diagonal form with $m + M + 1$ diagonal blocks (one for each row in $\Lambda$). Let $\Theta_j$, $j = -m, \ldots, M$, denote these diagonal blocks. All the entries of the matrices $\Theta_j$ are terms of the sequence $\{c_k\}$. The sequence $\{c_{k,l}\}_{(k,l)\in\Lambda-\Lambda}$ is positive semidefinite if and only if all matrices $\Theta_j$ are positive semidefinite. The matrices $\Theta_j$ can also be viewed as fully specified principal submatrices of the partial Toeplitz matrix $(c_{s-t})_{s,t=0}^{d}$. If $\{k_j\}_{j=1}^{N} = \Sigma$ is not an arithmetic progression, then by Theorem 1.3.25 we can choose $\{c_k\}$ such that all the matrices $\Theta_j$ are positive semidefinite but the sequence $\{c_k\}$ does not have a positive semidefinite Toeplitz extension. This implies that the sequence $\{c_{k,l}\}$ constructed earlier in the proof is positive semidefinite and does not have a positive semidefinite extension. This proves that $\Sigma$ must be an arithmetic progression.

Each $\Theta_l$ is an $n(l) \times n(l)$ matrix, where $n(l) = \operatorname{card}\Lambda(l)$, and $c_{k,0}$ is an entry of $\Theta_l$ if and only if $(k, 0) \in \Lambda(l) - \Lambda(l)$. Let $n = \max_l n(l)$. Then we can redefine the sequence $\{c_k\}$ used above with $c_0 = 1$ and $c_k = -1/(n-1)$ if $k \neq 0$. Then each matrix $\Theta_j$ is positive semidefinite by diagonal dominance. It can easily be verified that if $N > n$ the sequence $\{c_k\}$ is not positive semidefinite. Therefore we must have $n = N$. But then the lemma follows since $\Lambda(l) - \Lambda(l) \subseteq \Sigma$ for all $l$.
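Both the diagonal dominance claim and the failure for $N > n$ can be read off from the spectrum of a matrix with ones on the diagonal and $-1/(n-1)$ elsewhere. The computation below is an added sketch; the symbols $C_r$, $I_r$, and $J_r$ (the $r \times r$ all-ones matrix) are our own notation, and we assume that all entries of the $r \times r$ matrix in question are specified.

```latex
C_r = \Bigl(1 + \tfrac{1}{n-1}\Bigr) I_r - \tfrac{1}{n-1}\, J_r ,
\qquad
\sigma(C_r) = \Bigl\{ \tfrac{n}{n-1} \ (\text{multiplicity } r-1),\ \
1 - \tfrac{r-1}{n-1} = \tfrac{n-r}{n-1} \Bigr\}.
% The eigenvalue (n-r)/(n-1) belongs to the all-ones vector.
% Hence C_r \ge 0 if and only if r \le n, and C_r has a negative
% eigenvalue as soon as r > n.
```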

Lemma 1.3.27 Suppose that $\Lambda$ is in canonical form, $G(\Lambda - \Lambda) = \mathbb{Z}^2$, and $\Lambda$ has the extension property. Then there exist $p$ and $q$ such that

$$(p, q), (p+1, q), \ldots, (p+d, q) \in \Lambda.$$

In the remainder of the section $\{(p, q), (p+1, q), \ldots, (p+d, q)\}$ will be referred to as the full row.

Proof. The lemma is immediate for $d \le 1$, so we assume that $d > 1$. Let $k$ be such that $(k, M) \in \Lambda$. Since the polynomial

$$1 + w^d + z^M w^k = 0 \tag{1.3.18}$$

has $2Md$ zeros on $\mathbb{T}^2$, by Theorem 1.3.16, $\Lambda$ does not have the extension property unless it contains more than $2Md$ points.

Assume that no row of $\Lambda$ contains $d + 1$ elements. Then the maximum number of elements in $\Lambda$ is $(M + m + 1)d$. The inequality $(M + m + 1)d \le 2Md$ is true if $m \le M - 1$, therefore the lemma is certainly true unless $M = m$.


Let $q$ be a row of $\Lambda$ containing the largest number of elements among all rows. If $\{(l_0, q), (l_1, q), \ldots, (l_m, q)\}$ is the set of elements on this row, then by Lemma 1.3.26, $l_0, l_1, \ldots, l_m$ must be an arithmetic progression. Therefore, if $\Lambda$ does not have a full row, each row of $\Lambda$ contains at most $\frac{d}{2} + 1$ elements. All rows $q < 0$ then contain at most $\frac{d}{2}$ elements. Assuming $M = m$, we get that the maximal number of elements in $\Lambda$ is $(M+1)\bigl(\frac{d}{2} + 1\bigr) + M\frac{d}{2}$. Using again the polynomial in (1.3.18), we have that $\Lambda$ does not have the extension property if $(M+1)\bigl(\frac{d}{2} + 1\bigr) + M\frac{d}{2} \le 2Md$, which holds if and only if

$$\frac{M+1}{M - \frac{1}{2}} \le d.$$

The function $f(M) = \frac{M+1}{M - \frac{1}{2}}$ is strictly decreasing. Since $f(2) = 2$, the inequality holds for any $d > 1$ if $M \ge 2$. It remains to consider the case when $M = 1$. Since $f(1) = 4$, we can restrict the consideration to the case when $d = 2$ or $d = 3$. If $d = 3$, there will be at most 2 points in each row by Lemma 1.3.26 unless there is a row with four elements. There are at most 5 elements in $\Lambda$ in this case. By counting the number of zeros on $\mathbb{T}^2$ of the polynomial (1.3.18), this possibility is ruled out. If $d = 2$, the same reasoning holds if there are fewer than 5 elements in $\Lambda$. On the other hand, if there are 5 elements in $\Lambda$, three of them must be $(k, 1)$, $(k+2, 1)$, and $(l, -1)$. Since the polynomial $z^k w + z^{k+2} w + z^l w^{-1} = 0$ has 8 zeros on $\mathbb{T}^2$, the proof is complete.
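For the reader who wants to see the counting inequality of this proof spelled out, here is the elementary algebra behind it. This is an added worked step, using only the bound $(M+1)(\frac{d}{2}+1) + M\frac{d}{2}$ on $\operatorname{card}\Lambda$ and the $2Md$ zeros of (1.3.18) from the text.

```latex
(M+1)\Bigl(\tfrac{d}{2} + 1\Bigr) + M\,\tfrac{d}{2}
  = Md + \tfrac{d}{2} + M + 1 \le 2Md
\iff M + 1 \le d\Bigl(M - \tfrac{1}{2}\Bigr)
\iff \frac{M+1}{M - \frac{1}{2}} \le d ,
% with the two values used in the proof:
\qquad f(1) = \frac{2}{1/2} = 4, \qquad f(2) = \frac{3}{3/2} = 2 .
```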

Lemma 1.3.28 Suppose $\{(0,0), (1,0)\} \subset \Lambda$ and that $\Lambda$ is in canonical form. Then $\Lambda$ does not have the extension property unless it contains elements of the form $(p_j, j)$ for every $j$ such that $-m \le j \le M$.

Proof. Suppose there exists $-m \le q \le M$ such that there are no elements of the form $(p, q)$ in $\Lambda$.

Since $G(\Lambda - \Lambda) = \mathbb{Z}^2$ it is clear that the set $Q = \{q : (p, q) \in \Lambda \text{ for some } p\}$ is not an arithmetic progression. Then it follows from Theorem 1.3.8 that there exists a data set $\{c_j\}_{j \in Q - Q}$, where $c_0 = 1$ without loss of generality, such that

$$A = (c_{q_1 - q_2})_{q_1, q_2 \in Q}$$

is positive semidefinite, but its elements are not extendable to a positive semidefinite sequence on the set $[-m, M]$.

Let $J_{k \times k}$ be the $k \times k$ matrix with all entries equal to one, where

$$k = \max_{j, j_1, j_2} \{ |p_{j_1} - p_{j_2}| : (p_{j_1}, j), (p_{j_2}, j) \in \Lambda \}.$$

The Kronecker product $B = A \otimes J_{k \times k}$ is a positive semidefinite matrix. Moreover, there exists a principal submatrix of $B$ which gives a positive semidefinite sequence $(c_k)_{k \in \Lambda - \Lambda}$.

Let $c_{1,0} = c_{0,0} = 1$. Since

$$\begin{pmatrix} c_{0,0} & c_{1,0} & c_{p_j, j} \\ c_{1,0} & c_{0,0} & c_{p_j - 1, j} \\ \overline{c_{p_j, j}} & \overline{c_{p_j - 1, j}} & c_{0,0} \end{pmatrix}$$

must be positive semidefinite for any positive semidefinite sequence on $\mathbb{Z}^2$, it is clear that for all extensions of $(c_{j,l})_{(j,l) \in \Lambda - \Lambda}$ the $c_{j,l}$ must be constant along each row. Since the elements of $A$ are not extendable to a positive semidefinite sequence on $\mathbb{Z}$, it follows that we have constructed a positive sequence on $\Lambda - \Lambda$ which is not extendable to a positive sequence on $\mathbb{Z}^2$.
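The claim that the extended values are constant along each row rests on a standard kernel argument for positive semidefinite matrices. The following is an added sketch with our own shorthand $a = c_{p_j, j}$ and $b = c_{p_j - 1, j}$; it assumes $c_{0,0} = c_{1,0} = 1$ and that the $3 \times 3$ matrix in the proof is Hermitian.

```latex
H = \begin{pmatrix} 1 & 1 & a \\ 1 & 1 & b \\ \bar{a} & \bar{b} & 1 \end{pmatrix} \ge 0,
\qquad v = (1, -1, 0)^T
\;\Longrightarrow\; v^* H v = 0
\;\Longrightarrow\; H v = 0
\;\Longrightarrow\; \bar{a} - \bar{b} = 0 .
% The middle implication uses that x^* H x = 0 forces H x = 0 when H \ge 0.
% Thus c_{p_j, j} = c_{p_j - 1, j}, and repeating the argument along the row
% shows that the values do not depend on the first index.
```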

Lemma 1.3.29 Suppose $\Lambda$ is in canonical form with $d = 1$. Then $\Lambda$ is isomorphic to $R(1,1)$, to $R(1,1) \setminus \{(1,1)\}$, to a set $\Lambda'$ in canonical form with $d > 1$, or to a set $\Lambda = \{(0,0), (1,0), (0,1), (p,q)\}$ which does not have the extension property.

Proof. By Lemma 1.3.28 there exists an element $(r, 1) \in \Lambda$; it is possible that $(r+1, 1) \in \Lambda$. The isomorphism $\Phi$ defined by the matrix

$$\begin{pmatrix} 1 & -r \\ 0 & 1 \end{pmatrix} \tag{1.3.19}$$

maps $\Lambda$ onto a set $\Delta$ which contains the elements $(0,0)$, $(1,0)$, $(0,1)$, and possibly also $(1,1)$. If $\Lambda$ does not contain any additional element the proof is complete. Therefore, we may assume that $\Lambda$ contains another element which is mapped to $(p, q) \in \Delta$ by $\Phi$. If $(1,1) \in \Delta$ and $\operatorname{card}\Delta > 4$, then $\Delta$, or a translation of it, is isomorphic to a set containing an element $(d, 0)$, $d \ge 2$. The latter follows by Exercise 1.6.36 since $\gcd((p,q) - (i,j)) \ge 2$ for at least one choice of $(i,j)$ among the elements $(0,0)$, $(1,0)$, $(0,1)$, $(1,1)$. This proves the lemma if $(1,1) \in \Delta$.

If $S = \{(0,0), (1,0), (0,1), (p_1, q_1), (p_2, q_2)\} \subseteq \Delta$ it is clear that there exist $s_1, s_2 \in S$ such that $\gcd\{p, q\} \ge 2$, where $(p, q) = s_1 - s_2$. Then by Exercise 1.6.36, $\Delta$ is isomorphic to a set $\Lambda'$ with $d > 1$.

It remains to discuss the case when $\Delta = \{(0,0), (1,0), (0,1), (p,q)\}$. If $(p,q) \in \{(1,1), (1,-1), (-1,1), (-1,-1)\}$, then $\Delta$ is isomorphic to $R(1,1)$. If $(p,q) \notin \{(1,1), (1,-1), (-1,1), (-1,-1)\}$, then $\Delta$ does not have the extension property. This follows from the fact that $c_{0,0} = c_{1,0} = c_{0,1} = c_{-1,1} = 1$ and $c_{p,q} = c_{p-1,q} = c_{p,q-1} = 0$ is a positive semidefinite sequence without a positive semidefinite extension.

Remark 1.3.30 If the set $\Lambda$ is in canonical form, there is a maximal $j$ such that $(p_0, j)$ and $(p_d, j)$ are elements in $\Lambda$ with $p_d - p_0 = d$. We may assume that $j \le M - m$. Otherwise translate $(p_0, j)$ into $(0,0)$, and change the sign of the first coordinate of the indices. Then we obtain a new set in canonical form. For this new set we have $M_1 = m + j$, and the maximal $j$ has the same value as in the original set. We want to maximize $M$, and since we are free to work with either of the two sets introduced, we may assume $M \ge M_1$, i.e., $j \le M - m$. Therefore, for $\Lambda$ in canonical form we may assume that $\operatorname{card}\Lambda \le (j+1)(d+1) + (M - j + m)d$.

The above remark leads to the following lemma.

Lemma 1.3.31 Suppose $\Lambda$ is in canonical form with $d > 1$. Then $\Lambda$ does not have the extension property if $M - m > 1$ and $\max\{M - m, d\} \ge 3$.


Proof. The polynomial (1.3.18) has $2Md$ zeros on $\mathbb{T}^2$. By Theorem 1.3.16, $\Lambda$ does not have the extension property if $\operatorname{card}\Lambda \le 2Md$. By Remark 1.3.30, $\Lambda$ has at most $(j+1)(d+1) + (M - j + m)d$ elements. The latter number is maximized by $j = M - m$, and we get the inequality $(M - m + 1)(d + 1) + (M - (M - m) + m)d \le 2Md$, which can be rewritten as

$$\frac{M - m + 1}{M - m - 1} \le d.$$

Let $k = M - m$, and consider the function $f(k) = (k+1)/(k-1)$. Since this function is strictly decreasing and $f(2) = 3$ and $f(3) = 2$, the result follows.
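The rewriting of the counting inequality in the proof of Lemma 1.3.31 is a one-line computation. It is reproduced here as an added worked step, assuming $M - m > 1$ so that the division is legitimate.

```latex
(M-m+1)(d+1) + \bigl(M-(M-m)+m\bigr)d = (M+m+1)\,d + (M-m+1),
\\[4pt]
(M+m+1)\,d + (M-m+1) \le 2Md
\iff M-m+1 \le (M-m-1)\,d
\iff \frac{M-m+1}{M-m-1} \le d .
% With k = M - m: k = 2 requires d \ge 3, while k \ge 3 only requires d \ge 2,
% which is the condition \max\{M-m, d\} \ge 3 of the lemma.
```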

Lemma 1.3.32 Suppose $\Lambda$ is a set in canonical form with $m = 0$, $M = 1$, and $d > 1$. Then $\Lambda$ has the extension property if and only if it is isomorphic to one of the sets $R(1,d)$ or $R(1,d) \setminus \{(1,d)\}$.

Proof. By Lemma 1.3.27 at least one of the rows must be full. Let us assume, without loss of generality, that it is the zeroth row. Furthermore, using the isomorphism (1.3.19) we may assume that $(0,1) \in \Lambda$. Using the polynomial (1.3.18) with $M = 1$ and $p_M = 0$ and Theorem 1.3.16 shows that $\Lambda$ does not have the extension property unless its cardinality is at least $2d + 1$. Then $\Lambda = R(1,d) \setminus \{(1,d)\}$ if $(d,1) \notin \Lambda$. If $(d,1) \in \Lambda$, the two polynomials $1 - z^d w = 0$ and $z^d - w = 0$ have $2d$ common zeros on $\mathbb{T}^2$. Then Theorem 1.3.16 and the definition of the canonical form imply that $\Lambda$ does not have the extension property unless $\Lambda = R(1,d)$. The proof of the lemma is concluded by the fact that $R(1,d)$ and $R(1,d) \setminus \{(1,d)\}$ have the extension property.

Lemma 1.3.33 Let $\Lambda$ be in canonical form with $M + m \ge 2$ and $d > 1$. If there exist $p$ and $q$, $p \neq q$, such that $(p, M), (q, M) \in \Lambda$ or $(p, -m), (q, -m) \in \Lambda$, then $\Lambda$ does not have the extension property.

Proof. If $m = 0$, then Lemma 1.3.31 implies that $\Lambda$ does not have the extension property except possibly for $M = d = 2$. If $M = d = 2$ the polynomial (1.3.18) has 8 zeros on $\mathbb{T}^2$. This implies that if $\Lambda$ has the extension property, then it contains at least 9 elements. If $\Lambda$ contains 9 or more elements, there exists $p_2$ such that $\{(0,0), (2,0), (p_2, 2), (p_2 + 2, 2)\} \subset \Lambda$. The polynomials $1 - z^{p_2 + 2} w^2 = 0$ and $z^2 - w^2 z^{p_2} = 0$ have 8 common zeros on $\mathbb{T}^2$, and then Theorem 1.3.16 implies that $\Lambda$ does not have the extension property. Therefore we may assume $m \ge 1$ for the rest of the proof.

We claim that if there are at most two elements in the $M$th and $-m$th row of $\Lambda$, and for either $i = -m$ or $i = M$ the elements in the $i$th row are $(p_i, i)$ and $(p_i + k, i)$ with $k > \lfloor d/2 \rfloor$, then $\Lambda$ does not have the extension property. Using these two elements together with an element from the other extreme row we can construct a polynomial of a similar form as (1.3.18) with $2k(M + m)$ zeros on $\mathbb{T}^2$. Let $j$ be as in Remark 1.3.30. Then there are at most $4 + (j+1)(d+1) + (M - j + m - 2)d$ elements in $\Lambda$. By Theorem 1.3.16, $\Lambda$ does not have the extension property if

$$4 + (j+1)(d+1) + (M - j + m - 2)d \le 2k(M + m).$$

The left-hand side of the above inequality is maximized by $j = M - m$, which implies

$$(M + m - 1)d + M - m + 5 \le 2k(M + m). \tag{1.3.20}$$

If $d$ is even then $k \ge \frac{d}{2} + 1$. Using this lower bound for $k$, inequality (1.3.20) simplifies to

$$5 \le d + M + 3m,$$

which holds since $d \ge 2$ and $m \ge 1$. If $d$ is odd, $k \ge \frac{d}{2} + \frac{1}{2}$. Using this lower bound for $k$, inequality (1.3.20) simplifies to

$$5 \le d + 2m,$$

which holds since $d \ge 3$ and $m \ge 1$. This proves the claim.

To complete the proof of the lemma we will produce a nonextendable positive semidefinite data set in the case when $\Lambda$ contains a pair of elements of the form $(p_i, i)$, $(p_i + k, i)$ with $i = -m$ or $i = M$ and $0 < k \le \lfloor d/2 \rfloor$. If there are more than two elements in row $-m$ or $M$, such a pair must exist. If there are at most two elements in these rows we may assume that such a pair exists by the previous claim. To produce a nonextendable positive data set we define $c_{0,0} = 1$, and $c_{p,q} = 0$ for any $(p,q) \neq (0,0)$ such that $q \neq M - (-m)$. Let us represent the matrix

$$(c_{p_1 - p_2,\, q_1 - q_2})_{(p_1, q_1), (p_2, q_2) \in \Lambda}$$

in the block form

$$\begin{pmatrix}
T_{M,M} & T_{M,M-1} & \cdots & \cdots & T_{M,-m} \\
 & T_{M-1,M-1} & \ddots & & \vdots \\
 & & \ddots & \ddots & \vdots \\
 & & & T_{-m+1,-m+1} & T_{-m+1,-m} \\
 & & & & T_{-m,-m}
\end{pmatrix}, \tag{1.3.21}$$

where the block $T_{i,n}$ contains all the elements of the form $c_{p-q,\, i-n}$ for a fixed pair of second coordinates $i$ and $n$. By our choice of values for $c_{p,q}$, all block diagonal elements are identity matrices, whose sizes depend on the number of elements $(p,q) \in \Lambda$ for a fixed $q$. By Lemma 1.3.27 at least one of these block diagonal matrices is a $(d+1) \times (d+1)$ identity matrix. The off-diagonal blocks are, except for the $T_{M,-m}$ block, zero matrices of appropriate dimension. The dimension of both square block matrices $T_{M,M}$ and $T_{-m,-m}$ is less than $d + 1$. The lemma now follows by choosing entries in $T_{M,-m}$ such that the matrix (1.3.21) is positive semidefinite, but when extending the matrices $T_{M,M}$, $T_{-m,-m}$, and $T_{M,-m}$ to $(d+1) \times (d+1)$ matrices we get a $2 \times 2$ block matrix which is not positive semidefinite. The matrices $T_{M,M}$ and $T_{-m,-m}$ are both submatrices of $(c_{i-n,0})_{i,n \in \{0, 1, \ldots, d+1\}}$, which equals the $(d+1) \times (d+1)$ identity matrix. The matrix $T_{M,-m}$ will be a submatrix of a Toeplitz matrix of the form

$$A = \begin{pmatrix}
t_0 & t_1 & \cdots & \cdots & t_{\lfloor d/2 \rfloor} \\
t_{-1} & t_0 & t_1 & \cdots & t_{\lfloor d/2 \rfloor - 1} \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
t_{-\lfloor d/2 \rfloor + 1} & \cdots & t_{-1} & t_0 & t_1 \\
t_{-\lfloor d/2 \rfloor} & \cdots & \cdots & t_{-1} & t_0
\end{pmatrix}.$$

If $\Lambda$ has the extension property, it must be possible to embed $T_{M,-m}$ into the bigger Toeplitz contraction of the form

$$T = \begin{pmatrix}
t_0 & t_1 & \cdots & \cdots & t_d \\
t_{-1} & t_0 & t_1 & \cdots & t_{d-1} \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
t_{-d+1} & \cdots & t_{-1} & t_0 & t_1 \\
t_{-d} & \cdots & \cdots & t_{-1} & t_0
\end{pmatrix}.$$

The block matrix (1.3.21) is positive semidefinite if and only if $T_{M,-m}$ is a contraction. If $T_{M,-m}$ has only one column or one row, define $c_{p_i - n_1,\, i - n_2} = c_{p_i + k - n_1,\, i - n_2} = \frac{1}{\sqrt{2}}$ for $i = M$ and $n_2 = -m$, and all other entries equal to zero. If $i = -m$, make a similar construction. Then the information we are given about $T$ prevents it from being a contraction. If $T_{M,-m}$ has more than one row and column, let $t_i = t_n = 1$, where $n$ is the index of the lower left corner of $T_{M,-m}$ and $i$ is the smallest index such that $t_i$ is not an element of the first column or last row of $T_{M,-m}$. Let the other $t_l$ of $T_{M,-m}$ be zero. Since $|i - n| \le d$ it is clear that this choice prevents $T$ from being a contraction.
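The two simplifications of inequality (1.3.20) used in the claim of this proof are routine but easy to get wrong, so they are written out below. This is an added worked step; it uses only the element count $4 + (j+1)(d+1) + (M-j+m-2)d$ and the lower bounds on $k$ stated in the proof.

```latex
% Substituting j = M - m into the element count:
4 + (M-m+1)(d+1) + (2m-2)\,d = (M+m-1)\,d + M - m + 5 .
\\[4pt]
% d even, k \ge d/2 + 1, hence 2k(M+m) \ge (d+2)(M+m):
(M+m-1)\,d + M - m + 5 \le (d+2)(M+m) \iff 5 \le d + M + 3m .
\\[4pt]
% d odd, k \ge (d+1)/2, hence 2k(M+m) \ge (d+1)(M+m):
(M+m-1)\,d + M - m + 5 \le (d+1)(M+m) \iff 5 \le d + 2m .
```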

Using the definition of canonical form, the following is an immediate consequence of Lemma 1.3.33.

Corollary 1.3.34 Let $\Lambda$ be such that $G(\Lambda - \Lambda) = \mathbb{Z}^2$, and suppose that $\Lambda$ is in canonical form with $m = 0$, $M \ge 2$, and $d \ge 2$. Then $\Lambda$ does not have the extension property.

Lemma 1.3.35 There is no set $\Lambda$ in canonical form with $d = 2$ and $M = m = 1$ which has the extension property.

Proof. By Lemma 1.3.27 and Lemma 1.3.33 it is sufficient to consider the case when $\Lambda$ is of the form $\Lambda = \{(p_{-1}, -1), (0,0), (1,0), (2,0), (p_1, 1)\}$. Let $\{c_k\}_{k \in \Lambda - \Lambda}$ be a positive semidefinite sequence. The matrix of the sequence is of the form (only the upper triangular part of the Hermitian matrix is displayed)

$$\begin{pmatrix}
c_{0,0} & c_{-p_1,-1} & c_{1-p_1,-1} & c_{2-p_1,-1} & c_{p_{-1}-p_1,-2} \\
 & c_{0,0} & c_{1,0} & c_{2,0} & c_{p_{-1},-1} \\
 & & c_{0,0} & c_{1,0} & c_{p_{-1}-1,-1} \\
 & & & c_{0,0} & c_{p_{-1}-2,-1} \\
 & & & & c_{0,0}
\end{pmatrix}.$$

Let $c_{0,0} = 1$, $c_{1,0} = c_{2,0} = 0$, $c_{1-p_1,-1} = c_{2-p_1,-1} = \frac{1}{\sqrt{2}}$, $c_{k,-1} = 0$ for $k \neq 1 - p_1, 2 - p_1$, and

$$c_{p_{-1}-p_1,-2} = c_{-p_1,-1}\, c_{p_{-1},-1} + c_{1-p_1,-1}\, c_{p_{-1}-1,-1} + c_{2-p_1,-1}\, c_{p_{-1}-2,-1}.$$

Then the matrix above is positive semidefinite. Still this data set does not have a positive extension as the Hermitian matrix

$$\begin{pmatrix}
c_{0,0} & c_{1,0} & c_{1-p_1,-1} & c_{2-p_1,-1} \\
 & c_{0,0} & c_{-p_1,-1} & c_{1-p_1,-1} \\
 & & c_{0,0} & c_{1,0} \\
 & & & c_{0,0}
\end{pmatrix}$$

is not positive semidefinite.
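The failure of positive semidefiniteness of the last $4 \times 4$ matrix in the proof of Lemma 1.3.35 can be confirmed numerically. The script below is an added sketch in the style of the Matlab listings of Section 1.4; the variable names are our own, and it simply inserts the values $c_{0,0} = 1$, $c_{1,0} = 0$, $c_{-p_1,-1} = 0$, and $c_{1-p_1,-1} = c_{2-p_1,-1} = 1/\sqrt{2}$ chosen in the proof.

```matlab
% Numeric check of the 4x4 Hermitian matrix from the proof of Lemma 1.3.35.
s = 1/sqrt(2);
B = [s s; 0 s];                 % off-diagonal block: rows (c_{1-p1,-1}, c_{2-p1,-1}) and (c_{-p1,-1}, c_{1-p1,-1})
H = [eye(2) B; B' eye(2)];      % diagonal blocks are identities since c_{0,0} = 1, c_{1,0} = 0
lambda_min = min(eig(H));       % smallest eigenvalue of H
norm_B = norm(B);               % spectral norm of the off-diagonal block
fprintf('min eig = %.6f, 1 - ||B|| = %.6f\n', lambda_min, 1 - norm_B);
```

For a block matrix with identity diagonal blocks the smallest eigenvalue equals $1 - \|B\|$, so the matrix is positive semidefinite exactly when $B$ is a contraction; here $\|B\| > 1$, which matches the contraction criterion used in the neighboring proofs.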

Lemma 1.3.36 Suppose $\Lambda$ is in canonical form with $d > 1$. In addition we assume that there is only one $p$ such that $(p, M) \in \Lambda$ and only one $q$ such that $(q, -m) \in \Lambda$. Then $\Lambda$ does not have the extension property if $d + M - m \ge 3$.

Proof. The proof follows the same lines as the proof of Lemma 1.3.31, but now there are at most $(j+1)(d+1) + (M - j + m - 2)d + 2$ elements in $\Lambda$. This number is maximized if $j = M - m$, and $\Lambda$ does not have the extension property if it is at most $2Md$. As a consequence we have the inequality

$$\frac{M - m + 3}{M - m + 1} \le d.$$

The function $f(k) = (k+3)/(k+1)$ is strictly decreasing, and the lemma follows since $f(0) = 3$, $f(1) = 2$, and $f(2) < 2$.

Corollary 1.3.37 Every $\Lambda$ in canonical form with $d \ge 2$, $m \ge 1$, and $d + M - m \ge 3$ does not have the extension property.

Proof. The result follows from the definition of the canonical form, Lemma 1.3.33, and Lemma 1.3.36.

Lemma 1.3.38 Suppose $\Lambda$ is in canonical form with $m \ge 2$ and $\Lambda$ has a unique element on each of the rows $-m$, $-m+1$, $M-1$, and $M$. Then $\Lambda$ does not have the extension property if $d > 1$.

Proof. The proof is similar to the proof of Lemma 1.3.36. We have to consider in this case the inequality

$$(j+1)(d+1) + (M - j + m - 4)d + 4 \le 2Md.$$

The left-hand side of the inequality is maximized when $j = M - m$. This leads to the inequality

$$\frac{M - m + 5}{M - m + 3} \le d.$$

The function $f(k) = (k+5)/(k+3)$ is strictly decreasing. The lemma follows since $f(0) = 5/3 < 2$.


Lemma 1.3.39 Let $\Lambda$ be in canonical form with $M + m \ge 4$ and $d > 1$. If $\Lambda$ does not have a unique element on each of the rows $-m$, $-m+1$, $M-1$, and $M$, then $\Lambda$ does not have the extension property.

Proof. By a similar ordering of the elements as in the proof of Lemma 1.3.33 the matrix

$$(c_{p_1 - p_2,\, q_1 - q_2})_{(p_1, q_1), (p_2, q_2) \in \Lambda}$$

can be written in the block form

$$\begin{pmatrix}
T_{M,M} & T_{M,M-1} & \cdots & \cdots & T_{M,-m} \\
 & T_{M-1,M-1} & \ddots & & \vdots \\
 & & \ddots & \ddots & \vdots \\
 & & & T_{-m+1,-m+1} & T_{-m+1,-m} \\
 & & & & T_{-m,-m}
\end{pmatrix}, \tag{1.3.22}$$

where the block $T_{i,j}$ contains all the elements of the form $c_{p-q,\, i-j}$ for fixed second coordinates $i$ and $j$.

Define $c_{0,0} = 1$ and all other entries of the matrix which are not in the $T_{M,-m+1}$ or $T_{M-1,-m}$ blocks to be zero. By our choice of values for $c_{i,j}$, all block diagonal elements are identity matrices, whose sizes depend on the number of elements $(p,q) \in \Lambda$ for a fixed $q$. By Lemma 1.3.27, at least one of these block diagonal matrices is a $(d+1) \times (d+1)$ identity matrix. By Lemma 1.3.33, $\Lambda$ does not have the extension property unless it contains exactly one element of each of the forms $(p, k)$, for $k = -m, M$. We can therefore assume this. This implies that both block matrices $T_{M,M}$ and $T_{-m,-m}$ are $1 \times 1$ matrices, and the $T_{M,-m+1}$ and $T_{M-1,-m}$ blocks are a single row and a single column, respectively. Both blocks $T_{M,-m+1}$ and $T_{M-1,-m}$ are submatrices of Toeplitz matrices (not necessarily the same one) of the form

$$T = \begin{pmatrix}
t_0 & t_1 & \cdots & \cdots & t_d \\
t_{-1} & t_0 & t_1 & \cdots & t_{d-1} \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
t_{-d+1} & \cdots & t_{-1} & t_0 & t_1 \\
t_{-d} & \cdots & \cdots & t_{-1} & t_0
\end{pmatrix}.$$

This means that some entries in a row or column of $T$ are specified. A positive semidefinite extension of the matrix (1.3.22) exists if and only if there exists a contractive extension of $T$.

Now, if $t_j = t_i = 1/\sqrt{2}$ with $|j - i| \le \frac{d}{2}$ and $t_k = 0$ for $k \neq i, j$, the matrix (1.3.22) is a positive semidefinite matrix, but $T$ cannot be extended to a contraction. If there are more than three elements of the form $(i_{M-1}, M-1) \in \Lambda$ (or of the form $(i_{-m+1}, -m+1) \in \Lambda$) or two elements with distance less than $d/2$ in row $M-1$ or $-m+1$, this construction is possible.

It remains to prove the lemma in the case when there are at most two elements in both rows $M-1$ and $-m+1$, and the distance between them is more than $d/2$. By this assumption there are at most

$$2 + 4 + (M-1)(d+1) + (m-2)d = 5 + M + (M + m - 3)d$$

elements in $\Lambda$. Moreover, there exists a polynomial of similar type as (1.3.18) with $2k(M + m - 1)$ zeros on $\mathbb{T}^2$, where $k$ is the distance between the elements in one of the rows $M-1$ or $-m+1$. We may assume that $k \ge \frac{d}{2} + \frac{1}{2}$ if $d$ is odd, and $k \ge \frac{d}{2} + 1$ if $d$ is even. The proof is concluded by Theorem 1.3.16 since

$$M + 5 + (M + m - 3)d \le 2\left(\frac{d}{2} + \frac{1}{2}\right)(M + m - 1)$$

holds if $d$ is odd, $d \ge 3$, and $M \ge m \ge 2$, and

$$M + 5 + (M + m - 3)d \le 2\left(\frac{d}{2} + 1\right)(M + m - 1)$$

holds if $d$ is even and $M \ge m \ge 2$.
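The two closing inequalities in the proof of Lemma 1.3.39 reduce to very mild conditions on $d$, $m$, and $M$. The reduction below is an added worked step based only on the expressions displayed in the proof.

```latex
% d odd: the right-hand side is (d+1)(M+m-1)
M + 5 + (M+m-3)\,d \le (d+1)(M+m-1) \iff 6 \le 2d + m ,
\quad\text{true for } d \ge 3,\ m \ge 2 .
\\[4pt]
% d even: the right-hand side is (d+2)(M+m-1)
M + 5 + (M+m-3)\,d \le (d+2)(M+m-1) \iff 7 \le 2d + M + 2m ,
\quad\text{true for } d \ge 2,\ M \ge m \ge 2 .
```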

Corollary 1.3.40 Suppose $\Lambda$ is in canonical form with $d > 1$ and $m \ge 2$. Then $\Lambda$ does not have the extension property.

Proof. The result follows from Lemma 1.3.38 and Lemma 1.3.39.

d      m      M − m    Covered by
1      ≥ 0    ≥ 0      Lemma 1.3.29
2      0      0, 1     Theorem 1.3.8 and Lemma 1.3.32
2      0      ≥ 2      Corollary 1.3.34
2      1      0        Lemma 1.3.35
2      1      ≥ 1      Corollary 1.3.37
2      ≥ 2    ≥ 0      Corollary 1.3.40
3      0      0, 1     Theorem 1.3.8 and Lemma 1.3.32
3      0      ≥ 2      Corollary 1.3.34
3      ≥ 1    ≥ 0      Corollary 1.3.37
≥ 4    0      0, 1     Theorem 1.3.8 and Lemma 1.3.32
≥ 4    0      ≥ 2      Corollary 1.3.34
≥ 4    ≥ 1    ≥ 0      Corollary 1.3.37

Table 1.1 Cases considered in the proof of Theorem 1.3.18.

Proof of Theorem 1.3.18. The sets $\Lambda$ written in canonical form can be divided into the cases in Table 1.1, depending on the sizes of $d$, $m$, and $M - m$. The table indicates which of the results cover the particular case under consideration.


1.4 DETERMINANT AND ENTROPY MAXIMIZATION

In this section we study the following functions, defined on the interior of the cones $\mathrm{PSD}_n$ and $\mathrm{TPol}_n^+$, respectively:

$$\log\det : \operatorname{int}(\mathrm{PSD}_n) \to \mathbb{R}, \tag{1.4.1}$$

$$E : \operatorname{int}(\mathrm{TPol}_n^+) \to \mathbb{R}, \qquad E(p) = \frac{1}{2\pi} \int_0^{2\pi} \log p(e^{it})\, dt. \tag{1.4.2}$$

These functions will be useful in the study of these cones. We will show that both functions are strictly concave on their domain and we will obtain optimality conditions for their unique maximizers. These functions are examples of barrier functions as they tend to minus infinity when the argument approaches the boundary of the domain.

1.4.1 The log det function

Let $\mathrm{PD}_n$ denote the open cone of $n \times n$ positive definite matrices, or equivalently, $\mathrm{PD}_n$ is the interior of $\mathrm{PSD}_n$. In other words,

$$\mathrm{PD}_n = \{A \in \mathbb{C}^{n \times n} : A > 0\}.$$

Let $F_0, \ldots, F_m$ be $n \times n$ Hermitian matrices, and put $F(x) = F_0 + \sum_{i=1}^{m} x_i F_i$, where $x = (x_1, \ldots, x_m)^T \in \mathbb{R}^m$. We assume that $F_1, \ldots, F_m$ are linearly independent. Let $\Delta = \{x \in \mathbb{R}^m : F(x) \in \mathrm{PD}_n\}$, which we will assume to be nonempty. By a translation of the vector $(x_1, \ldots, x_m)$, we may assume without loss of generality that $0 \in \Delta$. Note that $\Delta$ is convex, as $F(x) > 0$ and $F(y) > 0$ imply that $F(sx + (1-s)y) = sF(x) + (1-s)F(y) > 0$ for $0 \le s \le 1$. Define now the function

$$\phi(x) = \log\det F(x), \qquad x \in \Delta.$$

We start by computing its gradient and Hessian.

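Lemma 1.4.1 We have that

$$\frac{\partial\phi}{\partial x_i}(x) = \operatorname{tr}(F(x)^{-1} F_i), \qquad
\frac{\partial^2\phi}{\partial x_i \partial x_j}(x) = -\operatorname{tr}(F(x)^{-1} F_i F(x)^{-1} F_j),$$

and consequently,

$$\nabla\phi(x) = \begin{pmatrix} \operatorname{tr}(F(x)^{-1} F_1) & \cdots & \operatorname{tr}(F(x)^{-1} F_m) \end{pmatrix}, \tag{1.4.3}$$

$$\nabla^2\phi(x) = \begin{pmatrix}
-\operatorname{tr} F(x)^{-1} F_1 F(x)^{-1} F_1 & \cdots & -\operatorname{tr} F(x)^{-1} F_1 F(x)^{-1} F_m \\
\vdots & & \vdots \\
-\operatorname{tr} F(x)^{-1} F_m F(x)^{-1} F_1 & \cdots & -\operatorname{tr} F(x)^{-1} F_m F(x)^{-1} F_m
\end{pmatrix}. \tag{1.4.4}$$

Proof. Without loss of generality we take $x = 0$ (one can always do a translation so that 0 is the point of interest). Let us take $m = 1$ and let us compute $\phi'(0)$. Note that

\begin{align}
\phi(x) &= \log\det\bigl(F_0 (I + x F_0^{-1} F_1)\bigr) \tag{1.4.5} \\
&= \log\det F_0 + \log\det(I + x F_0^{-1} F_1) \tag{1.4.6} \\
&= \log\det F_0 + \log\bigl(1 + x \operatorname{tr}(F_0^{-1} F_1) + O(x^2)\bigr) \tag{1.4.7} \\
&= \log\det F_0 + x \operatorname{tr}(F_0^{-1} F_1) + O(x^2) \quad (x \to 0). \tag{1.4.8}
\end{align}

But then $\phi'(0) = \lim_{x \to 0} \frac{\phi(x) - \phi(0)}{x} = \operatorname{tr}(F_0^{-1} F_1)$. The general expression for $\frac{\partial\phi}{\partial x_i}(x)$ now follows easily. The second derivatives are obtained similarly.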
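The derivative formulas of Lemma 1.4.1 are easy to sanity-check by finite differences. The script below is an added sketch, not part of the original text: it takes $m = 1$, borrows the matrices `F0` and `F1` from Example 1.4.6, and uses a step size `h` chosen by us; all variable names are our own.

```matlab
% Finite-difference check of the first and second derivative of log det.
F0 = [3 2 0 -1; 2 3 1 0; 0 1 3 1; -1 0 1 3];   % positive definite (see Example 1.4.6)
F1 = zeros(4); F1(1,3) = 1; F1(3,1) = 1;       % direction matrix
phi = @(x) log(det(F0 + x*F1));                % phi(x) = log det F(x) with m = 1
G = inv(F0);
h = 1e-4;                                      % step size (our choice)
d1_formula = trace(G*F1);                      % tr(F^{-1} F_1)
d1_numeric = (phi(h) - phi(-h))/(2*h);         % central difference
d2_formula = -trace(G*F1*G*F1);                % -tr(F^{-1} F_1 F^{-1} F_1)
d2_numeric = (phi(h) - 2*phi(0) + phi(-h))/h^2;
disp([d1_formula d1_numeric; d2_formula d2_numeric]);
```

Each row of the displayed matrix should contain two nearly equal numbers, and the second row should be negative, in line with the concavity statement that follows.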

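Corollary 1.4.2 The function $\log\det$ is strictly concave on $\mathrm{PD}_n$.

Proof. Note that the entries in the Hessian of $\phi$ can be rewritten as

$$-\operatorname{tr}\bigl(F(x)^{-1/2} F_i F(x)^{-1/2} F(x)^{-1/2} F_j F(x)^{-1/2}\bigr).$$

Thus for $y \in \mathbb{R}^m$ we have that

\begin{align}
y^* \nabla^2\phi(x)\, y &= -\sum_{i,j=1}^{m} y_i \operatorname{tr}\bigl(F(x)^{-1/2} F_i F(x)^{-1/2} F(x)^{-1/2} F_j F(x)^{-1/2}\bigr) y_j \tag{1.4.9} \\
&= -\operatorname{tr}\Bigl(\Bigl(F(x)^{-1/2} \Bigl(\sum_{i=1}^{m} y_i F_i\Bigr) F(x)^{-1/2}\Bigr)^2\Bigr) \tag{1.4.10} \\
&= -\Bigl\| F(x)^{-1/2} \Bigl(\sum_{i=1}^{m} y_i F_i\Bigr) F(x)^{-1/2} \Bigr\|_2^2 \le 0, \tag{1.4.11}
\end{align}

and equality holds if and only if $\sum_{i=1}^{m} y_i F_i = 0$. Here $\|\cdot\|_2$ stands for the Frobenius norm of a matrix, i.e., $\|A\|_2 = \sqrt{\operatorname{tr}(A^* A)}$. As $F_1, \ldots, F_m$ are linearly independent, this happens only if $y_1 = \cdots = y_m = 0$.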

The following is the main result of this subsection.

Theorem 1.4.3 Let $\mathcal{W}$ be a linear subspace of $\mathcal{H}_n$, with 0 the only positive semidefinite matrix contained in it. For every $A > 0$ and $B \in \mathcal{H}_n$, there is a unique $F \in (A + \mathcal{W}) \cap \mathrm{PD}_n$ such that $F^{-1} - B \perp \mathcal{W}$. Moreover, $F$ is the unique maximizer of the function

$$f(X) = \log\det X - \operatorname{tr}(BX), \qquad X \in (A + \mathcal{W}) \cap \mathrm{PD}_n. \tag{1.4.12}$$

If $A$ and $B$ are real matrices then so is $F$.

Using the Hahn-Banach separation theorem one can easily show that when 0 is the only positive semidefinite matrix in $\mathcal{W}$, for every Hermitian $B$ we have that $(B + \mathcal{W}^\perp) \cap \mathrm{PSD}_n \neq \emptyset$. Indeed, if the set is empty then the separation theorem yields the existence of a Hermitian $\Phi$ and a real number $\alpha$ such that $\operatorname{tr}(P\Phi) > \alpha$ for all $P > 0$ and $\operatorname{tr}(X\Phi) \le \alpha$ for all $X \in B + \mathcal{W}^\perp$. Since the positive definite matrices form a cone, it is easy to see that we must have $\alpha \le 0$, and since $\mathcal{W}^\perp$ is a subspace it is easy to see that we must have $\Phi \in \mathcal{W}$. But then it follows that $\Phi \ge 0$ and $\Phi \neq 0$, which contradicts the assumption on $\mathcal{W}$. Thus the content of the theorem is not weakened when one restricts oneself to $B > 0$.

Proof. First note that $(A + \mathcal{W}) \cap \mathrm{PD}_n$ is convex. Next, since 0 is the only positive semidefinite matrix in $\mathcal{W}$, $(A + \mathcal{W}) \cap \mathrm{PD}_n$ is a bounded set. Indeed, as each nonzero $W \in \mathcal{W}$ has a negative eigenvalue, we have that for each $0 \neq W \in \mathcal{W}$ the set $(A + \mathbb{R}W) \cap \mathrm{PSD}_n$ is bounded. Let $g(W) = \max\{a : A + aW \in \mathrm{PSD}_n\}$. As $(A + \mathcal{W}) \cap \mathrm{PSD}_n$ is convex, we obtain that $g$ is a continuous function on the unit ball $\mathcal{B}$ in $\mathcal{W}$. Using now the finite dimensionality and thus the compactness of $\mathcal{B}$, we get that $g$ attains a maximum on $\mathcal{B}$, $M$ say. But then we obtain that $(A + \mathcal{W}) \cap \mathrm{PD}_n \subset (A + \mathcal{W}) \cap \mathrm{PSD}_n$ lies in $A + M\mathcal{B}$, which proves the boundedness. Next, by Corollary 1.4.2, $\log\det$ is strictly concave on $\mathrm{PD}_n$, and since $\operatorname{tr}(BX)$ is linear in $X$, $f(X)$ is strictly concave. Moreover, $f(X)$ tends to $-\infty$ when $X$ approaches the boundary of $(A + \mathcal{W}) \cap \mathrm{PD}_n$ (as $\det X$ tends to 0 as $X$ approaches the boundary). Thus $f$ takes on a unique maximum on $(A + \mathcal{W}) \cap \mathrm{PD}_n$ for a matrix denoted by $F$, say.

Fix an arbitrary $W \in \mathcal{W}$. Consider the function $f_{F,W}(x) = \log\det(F + xW) - \operatorname{tr}(B(F + xW))$ defined in a neighborhood of 0 in $\mathbb{R}$. Then $f'_{F,W}(0) = 0$ (since $f$ has its maximum at $F$). Similarly as in the proof of Lemma 1.4.1, we obtain that

\begin{align*}
f'_{F,W}(0) &= \left.\frac{(\det(I + xF^{-1}W))'}{\det(I + xF^{-1}W)}\right|_{x=0} - \bigl(\operatorname{tr}(B(F + xW))\bigr)'\Big|_{x=0} \\
&= \operatorname{tr}(F^{-1}W) - \operatorname{tr}(BW) = \operatorname{tr}((F^{-1} - B)W) = 0.
\end{align*}

Since $W$ is an arbitrary element of $\mathcal{W}$ we have that $F^{-1} - B \perp \mathcal{W}$. Assume that $G \in (A + \mathcal{W}) \cap \mathrm{PD}_n$ and $G^{-1} - B \perp \mathcal{W}$. Then $f'_{G,W}(0) = 0$ for any $W \in \mathcal{W}$, and since $f$ is strictly concave it follows that $G = F$. This proves the uniqueness of $F$.

In case $A$ and $B$ are real matrices, one can restrict attention to real matrices, and repeat the above argument. The resulting matrix $F$ will also be real.
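A small instance may help to see what Theorem 1.4.3 asserts. The $2 \times 2$ example below is an added illustration with our own choice of data: $B = 0$, $\mathcal{W}$ the Hermitian matrices with zero diagonal, and orthogonality taken with respect to the trace inner product $\langle X, Y \rangle = \operatorname{tr}(XY)$ used in the proof.

```latex
\mathcal{W} = \Bigl\{ \begin{pmatrix} 0 & w \\ \bar{w} & 0 \end{pmatrix} : w \in \mathbb{C} \Bigr\},
\qquad
A = \begin{pmatrix} a & c \\ \bar{c} & b \end{pmatrix} > 0 ,
\qquad
(A + \mathcal{W}) \cap \mathrm{PD}_2
  = \Bigl\{ \begin{pmatrix} a & x \\ \bar{x} & b \end{pmatrix} : |x|^2 < ab \Bigr\} .
\\[4pt]
% The only positive semidefinite element of W is 0, since its diagonal vanishes.
\det \begin{pmatrix} a & x \\ \bar{x} & b \end{pmatrix} = ab - |x|^2
\ \text{ is maximal at } x = 0 ,
\qquad
F = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix},
\quad
F^{-1} = \begin{pmatrix} 1/a & 0 \\ 0 & 1/b \end{pmatrix} \perp \mathcal{W} .
```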

Corollary 1.4.4 Let $A = (A_{ij})_{i,j=1}^{n}$ be a positive definite matrix, $B = (B_{ij})_{i,j=1}^{n}$ a Hermitian matrix, and $P$ an $n \times n$ symmetric pattern with associated graph $G$. Then there exists a unique positive definite matrix $F \in A + \mathcal{H}_G^\perp$ such that $(F^{-1})_{ij} = B_{ij}$ for every $(i,j) \notin P$. Moreover, $F$ maximizes the function $f(X) = \log\det X - \operatorname{tr}(BX)$ over the set of all positive definite matrices $X \in A + \mathcal{H}_G^\perp$. In case $A$ and $B$ are real, $F$ is also real.

Proof. Consider in Theorem 1.4.3, $\mathcal{W} = \mathcal{H}_G^\perp$. Then evidently the only positive semidefinite matrix in $\mathcal{W}$ is 0. By Theorem 1.4.3 there exists a unique $F \in (A + \mathcal{W}) \cap \mathrm{PD}_n$ such that $F^{-1} - B \perp \mathcal{W}$. For any $(k,j) \notin P$, consider the matrix $W_R^{(k,j)} \in \mathcal{W}$ having all its entries 0 except those in positions $(k,j)$ and $(j,k)$, which equal 1, and the matrix $W_I^{(k,j)}$ having $i$ in position $(k,j)$, $-i$ in position $(j,k)$, and 0 elsewhere. The conditions $\operatorname{tr}((F^{-1} - B) W_R^{(k,j)}) = \operatorname{tr}((F^{-1} - B) W_I^{(k,j)}) = 0$ imply that $(F^{-1})_{kj} = B_{kj}$ for any $(k,j) \notin P$.

Corollary 1.4.5 Let $P$ be a symmetric pattern with associated graph $G$ and let $A$ be a Hermitian matrix such that there exists a positive definite matrix $B \in A + \mathcal{H}_G^\perp$. Then among all such positive definite matrices $B$ there exists a unique one, denoted $\widehat{A}$, which maximizes the determinant. Also, $\widehat{A}$ is the only positive definite matrix in $A + \mathcal{H}_G^\perp$ the inverse of which has 0 in all positions corresponding to entries $(i,j) \notin P$.

Proof. Follows immediately from Corollary 1.4.4 when $B = 0$.

Let us end this subsection with an example for Corollary 1.4.5.

Example 1.4.6 Find

$$\max_{x_1, x_2} \det \begin{pmatrix} 3 & 2 & x_1 & -1 \\ 2 & 3 & 1 & x_2 \\ x_1 & 1 & 3 & 1 \\ -1 & x_2 & 1 & 3 \end{pmatrix},$$

the maximum being taken over those $x_1$ and $x_2$ for which the matrix is positive definite. Notice that for $x_1 = x_2 = 0$ the matrix above is positive semidefinite by diagonal dominance (or by using the Geršgorin disk theorem). As one can easily check, for $x_1 = x_2 = 0$ the matrix is invertible (and thus positive definite). Consequently, the above problem has a solution. By Corollary 1.4.4 the maximum is attained at real $x_1$ and $x_2$, and thus it suffices to optimize over the reals. The following Matlab script solves the problem. The script is a damped Newton's method with $\alpha$ as damping factor. Notice that y and H below correspond, up to the constant factors 1/2 and −1/2, to the gradient and the Hessian, respectively.

F0=[3 2 0 -1; 2 3 1 0; 0 1 3 1; -1 0 1 3];
F1=[0 0 1 0; 0 0 0 0; 1 0 0 0; 0 0 0 0];
F2=[0 0 0 0; 0 0 0 1; 0 0 0 0; 0 1 0 0];
x=[0; 0];
G=inv(F0);
y=[G(1,3); G(2,4)];                 % half of the gradient (1.4.3)
while norm(y) > 1e-7,
  % H is minus one half of the Hessian (1.4.4)
  H=[G(1,3)*G(3,1)+G(1,1)*G(3,3) G(1,4)*G(2,3)+G(2,1)*G(3,4); ...
     G(1,4)*G(2,3)+G(2,1)*G(3,4) G(2,4)*G(4,2)+G(2,2)*G(4,4)];
  v=H\y;                            % Newton direction
  delta=sqrt(v'*y);
  if delta < 1/4, alpha=1; else, alpha=1/(1+delta); end;
  x=x+alpha*v;
  Fx=F0+x(1)*F1+x(2)*F2;
  G=inv(Fx);
  y=[G(1,3); G(2,4)];
end;

The solution we find is

$$\begin{pmatrix} 3 & 2 & 0.3820 & -1 \\ 2 & 3 & 1 & -0.3820 \\ 0.3820 & 1 & 3 & 1 \\ -1 & -0.3820 & 1 & 3 \end{pmatrix}$$

with a determinant equal to 29.2705.
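The reported solution can be cross-checked against the optimality condition of Corollary 1.4.5: the inverse of the maximizer must vanish in the two unspecified positions $(1,3)$ and $(2,4)$. The script below is an added sketch. The closed form $a = (3 - \sqrt{5})/2 \approx 0.3820$ is our own observation (it is a root of $a^3 - 8a + 3 = 0$, the stationarity condition along $x_2 = -x_1$), not a statement from the text, and the variable names are ours.

```matlab
% Check of the maximizer in Example 1.4.6 via the inverse-zero condition.
a = (3 - sqrt(5))/2;                       % assumed closed form of x1 = -x2
F = [3, 2, a, -1; 2, 3, 1, -a; a, 1, 3, 1; -1, -a, 1, 3];
G = inv(F);
detF = det(F);                             % compare with 29.2705
inv_entries = [G(1,3) G(2,4)];             % should vanish at the maximizer
lam = min(eig(F));                         % positive if F is positive definite
fprintf('det = %.4f, inverse entries = %.2e %.2e, min eig = %.4f\n', ...
        detF, inv_entries(1), inv_entries(2), lam);
```

The printed determinant should agree with 29.2705 to the displayed precision, and the two inverse entries should be zero up to rounding.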

1.4.2 The entropy function

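Let

$$\mathrm{TPol}_n^{++} = \{p \in \mathrm{TPol}_n : p(z) > 0,\ z \in \mathbb{T}\}.$$

Equivalently, $\mathrm{TPol}_n^{++}$ is the interior of $\mathrm{TPol}_n^+$. Let $p^{(0)}, \ldots, p^{(m)}$ be elements of $\mathrm{TPol}_n$, and put $p_x = p^{(0)} + \sum_{i=1}^{m} x_i p^{(i)}$, where $x = (x_1, \ldots, x_m)^T \in \mathbb{R}^m$. We will assume that $p^{(1)}, \ldots, p^{(m)}$ are linearly independent. Let $\Delta = \{x \in \mathbb{R}^m : p_x \in \mathrm{TPol}_n^{++}\}$, which we assume to be nonempty. It is easy to see that $\Delta$ is convex. Define now the function

$$\psi(x) = E(p_x) := \frac{1}{2\pi} \int_0^{2\pi} \log p_x(e^{it})\, dt. \tag{1.4.13}$$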

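Lemma 1.4.7 We have that

$$\frac{\partial\psi}{\partial x_j}(x) = \frac{1}{2\pi} \int_0^{2\pi} \frac{p^{(j)}(e^{it})}{p_x(e^{it})}\, dt, \qquad
\frac{\partial^2\psi}{\partial x_j \partial x_k}(x) = -\frac{1}{2\pi} \int_0^{2\pi} \frac{p^{(j)}(e^{it})\, p^{(k)}(e^{it})}{p_x(e^{it})^2}\, dt.$$

Proof. Let us take $m = 1$ and let us compute $\psi'(0)$ (assuming that $0 \in \Delta$). Note that

$$\psi(x) = \frac{1}{2\pi} \int_0^{2\pi} \left[ \log p^{(0)}(e^{it}) + \log\left(1 + x\, \frac{p^{(1)}(e^{it})}{p^{(0)}(e^{it})}\right) \right] dt.$$

Writing $\log(1 + z) = z + O(z^2)$ $(z \to 0)$, one readily finds that

$$\psi'(0) = \lim_{x \to 0} \frac{\psi(x) - \psi(0)}{x} = \frac{1}{2\pi} \int_0^{2\pi} \frac{p^{(1)}(e^{it})}{p^{(0)}(e^{it})}\, dt.$$

The general expression for $\frac{\partial\psi}{\partial x_i}(x)$ now follows easily. The second derivatives are obtained similarly.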

Corollary 1.4.8 The function $E$ is strictly concave on $\mathrm{TPol}_n^{++}$.

Proof. For $y \in \mathbb{R}^m$ we have that

$$y^T \nabla^2\psi(x)\, y = -\frac{1}{2\pi} \int_0^{2\pi} \frac{\bigl(\sum_{j=1}^{m} y_j p^{(j)}(e^{it})\bigr)^2}{p_x(e^{it})^2}\, dt \le 0,$$

and equality holds if and only if $\sum_{j=1}^{m} y_j p^{(j)} \equiv 0$. As $p^{(1)}, \ldots, p^{(m)}$ are linearly independent, this happens only if $y_1 = \cdots = y_m = 0$.

The following is the main result of this subsection.


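Theorem 1.4.9 Let $\mathcal{W}$ be a linear subspace of $\mathrm{TPol}_n$, with $\mathcal{W} \cap \mathrm{TPol}_n^+ = \{0\}$. For every $p \in \mathrm{TPol}_n^{++}$ and $h \in \mathrm{TPol}_n$, there is a unique $g \in (p + \mathcal{W}) \cap \mathrm{TPol}_n^{++}$ such that $\frac{1}{g} - h \perp \mathcal{W}$. Moreover, $g$ is the unique maximizer of the function

$$\alpha(q) = E(q) - \langle q, h \rangle, \qquad q \in (p + \mathcal{W}) \cap \mathrm{TPol}_n^{++}. \tag{1.4.14}$$

If $p$ and $h$ have real coefficients then so does $g$.

The proof is analogous to that of Theorem 1.4.3. We leave this as an exercise for the reader (see Exercise 1.6.48).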

Example 1.4.10 Let $p_x(z) = \frac{1}{2z^3} + \frac{2}{z^2} + \frac{x}{z} + 6 + xz + 2z^2 + \frac{z^3}{2}$. Determine $\max_x E(p_x)$.

As the given data are real it suffices to consider real $x$. In order to perform the maximization, we need a way to determine the Fourier coefficients of $\frac{1}{p_x}$. We do this by using Matlab's command fft. Indeed, by using the fast Fourier transform we obtain a vector containing the values of $p_x$ on a grid of the unit circle. Taking this vector and computing the reciprocal entrywise yields a vector containing the values of $\frac{1}{p_x}$ on a grid on $\mathbb{T}$. Performing an inverse fast Fourier transform now yields an approximation of the Fourier coefficients of $\frac{1}{p_x}$. We can now execute the maximization:

p=[.5;2;0;6;0;2;.5];
p1=[0;0;1;0;1;0;0];
x=0;
hh=fft([zeros(61,1);p;zeros(60,1)]);
hh=ones(128,1)./hh;
hh2=ifft(hh);
y=hh2(64);
while abs(y)>1e-5,
ll=fft([zeros(62,1);[1;0;2;0;1];zeros(61,1)]);
ll=ll.*abs(hh).*abs(hh);
ll=ifft(ll);
H=ll(65);
v=H\y;
delta=sqrt(v'*y);
if delta < 1/4 alpha=1; else alpha = 1/(1+delta); end;
x=x+alpha*v;
p=p+alpha*v*p1; % add only the new increment, so that p remains equal to p_x
hh=fft([zeros(61,1);p;zeros(60,1)]);
hh=ones(128,1)./hh;
hh2=ifft(hh);
y=hh2(64);
end;

The solution we find is x = 0.4546.
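As a cross-check of this example, the following is a minimal Python sketch (not the Matlab script above) that maximizes $E(p_x)$ by a plain Newton iteration, using the derivative formulas of Lemma 1.4.7 with the integrals replaced by averages over an equispaced grid on the circle. The grid size, the tolerance, and all function names are our own assumptions, and the step halving is a simple safeguard rather than the damping rule used in the text.

```python
import math

N = 512  # number of grid points on the unit circle (an assumed choice)
THETAS = [2 * math.pi * k / N for k in range(N)]


def p_x(x, t):
    """Value of p_x(e^{it}) = 6 + 2x cos t + 4 cos 2t + cos 3t."""
    return 6 + 2 * x * math.cos(t) + 4 * math.cos(2 * t) + math.cos(3 * t)


def grad_and_hess(x):
    """Grid approximations of psi'(x) and psi''(x) from Lemma 1.4.7."""
    g = 0.0
    h = 0.0
    for t in THETAS:
        q = 2 * math.cos(t)      # p^(1)(e^{it}) = z + 1/z on the circle
        ratio = q / p_x(x, t)
        g += ratio
        h -= ratio * ratio
    return g / N, h / N


def is_feasible(x):
    """True if p_x is positive at every grid point."""
    return all(p_x(x, t) > 0 for t in THETAS)


def maximize_entropy(x=0.0, tol=1e-12, max_iter=50):
    """Newton iteration for psi'(x) = 0, halving steps that leave the domain."""
    for _ in range(max_iter):
        g, h = grad_and_hess(x)
        if abs(g) < tol:
            break
        step = -g / h            # h < 0 because psi is strictly concave
        while not is_feasible(x + step):
            step /= 2
        x += step
    return x


x_opt = maximize_entropy()
print(x_opt)
```

If the sketch is faithful to the example, the printed value should land close to the $x$ reported above; the grid average is only an approximation of the integral, so agreement is expected to a few digits rather than exactly.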


1.5 SEMIDEFINITE PROGRAMMING

Many of the problems we have discussed in this chapter can be solved numerically very effectively using semidefinite programming (SDP). We will briefly outline some of the main ideas behind semidefinite programming and present some examples. There is a general theory on “conic programming” that goes well beyond the cone of positive semidefinite matrices, but to discuss this in that generality is beyond the scope of this book.

Let $F_0, \ldots, F_m$ be given Hermitian matrices, and let $F(x) = F_0 + \sum_{i=1}^m x_i F_i$. Let also $c \in \mathbb{R}^m$ be given, and write $x = (x_1, \ldots, x_m)^T \in \mathbb{R}^m$. We consider the problem

\[ \text{minimize } c^T x \quad \text{subject to } F(x) \ge 0. \tag{1.5.1} \]

We will call this the primal problem. We will denote

\[ p^* = \inf \{ c^T x : F(x) \ge 0 \}. \]

The primal problem has a so-called dual problem, which is the following:

\[ \text{maximize } -\operatorname{tr}(F_0 Z) \quad \text{subject to } Z \ge 0 \text{ and } \operatorname{tr}(F_i Z) = c_i,\ i = 1, \ldots, m. \tag{1.5.2} \]

We will denote

\[ d^* = \sup \{ -\operatorname{tr}(F_0 Z) : Z \ge 0 \text{ and } \operatorname{tr}(F_i Z) = c_i,\ i = 1, \ldots, m \}. \]

We will call the set $\{ x : F(x) \ge 0 \}$ the feasible set for the primal problem, and the set $\{ Z \ge 0 : \operatorname{tr}(F_i Z) = c_i,\ i = 1, \ldots, m \}$ the feasible set for the dual problem. We say that a (primal) feasible $x$ is strictly feasible if $F(x) > 0$. Similarly, a (dual) feasible $Z$ is strictly feasible if $Z > 0$. Observe that when $x$ is primal feasible and $Z$ is dual feasible, we have that

\[ c^T x + \operatorname{tr}(F_0 Z) = \sum_{i=1}^m \operatorname{tr}(F_i Z)\, x_i + \operatorname{tr}(F_0 Z) = \operatorname{tr}(F(x) Z) \ge 0 \]

as $Z, F(x) \ge 0$. Thus

\[ -\operatorname{tr}(F_0 Z) \le c^T x. \]

Consequently, when one can find feasible $x$ and $Z$, one immediately sees that

\[ -\operatorname{tr}(F_0 Z) \le d^* \le p^* \le c^T x. \]

Hence, when one can find feasible $x$ and $Z$ such that

\[ \text{duality gap} := c^T x + \operatorname{tr}(F_0 Z) \]

is small, then one knows that one is close to the optimum. We state here without proof that if a strictly feasible $x$ or a strictly feasible $Z$ exists, one has in fact that $d^* = p^*$, so in that case one will be able to make the duality gap arbitrarily small.
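To make the duality gap concrete, here is a small Python sketch on a toy problem of our own choosing (it does not come from the text): $F(x) = I_2 + x F_1$ with $F_1$ the $2 \times 2$ matrix having zero diagonal and off-diagonal entries 1, and $c = 1$. The helper names are hypothetical. The sketch checks the identity $c^T x + \operatorname{tr}(F_0 Z) = \operatorname{tr}(F(x) Z)$ for one feasible pair and evaluates the gap at a pair that attains zero.

```python
# Toy data (assumed for illustration): F(x) = F0 + x*F1, c = 1.
F0 = [[1.0, 0.0], [0.0, 1.0]]
F1 = [[0.0, 1.0], [1.0, 0.0]]
c = 1.0


def trace_of_product(a, b):
    """tr(a b) for square matrices given as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))


def F(x):
    """The affine matrix function F(x) = F0 + x*F1."""
    return [[F0[i][j] + x * F1[i][j] for j in range(2)] for i in range(2)]


def is_psd_2x2(m, eps=1e-12):
    """A symmetric 2x2 matrix is PSD iff its diagonal and determinant are >= 0."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return m[0][0] >= -eps and m[1][1] >= -eps and det >= -eps


def duality_gap(x, Z):
    """c*x + tr(F0 Z) for a primal feasible x and a dual feasible Z."""
    assert is_psd_2x2(F(x)) and is_psd_2x2(Z)
    assert abs(trace_of_product(F1, Z) - c) < 1e-12  # dual constraint tr(F1 Z) = c
    return c * x + trace_of_product(F0, Z)


# A feasible but non-optimal pair.
x, Z = -0.9, [[0.6, 0.5], [0.5, 0.6]]
gap = duality_gap(x, Z)
identity_rhs = trace_of_product(F(x), Z)

# A pair with zero gap: x = -1 and Z = (1/2) * all-ones matrix.
gap_opt = duality_gap(-1.0, [[0.5, 0.5], [0.5, 0.5]])
print(gap, identity_rhs, gap_opt)
```

For this toy problem the feasible $x$ are exactly those with $|x| \le 1$, so a zero gap at $x = -1$ certifies $p^* = d^* = -1$.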


The log det function, which we discussed in the previous section, has three very useful properties: (i) it is strictly concave, and thus any maximizer will be unique; (ii) it tends to minus infinity as the argument approaches the boundary, and thus it will be easy to stay within the set of positive definite matrices; and (iii) its gradient and Hessian are easily computable. These three properties do not change when we add a linear function to log det. This now leads to modified primal and dual problems, in which the log det term enters with the sign that makes the primal objective convex and the dual objective concave. Let $\mu > 0$, and introduce

\[ \text{minimize } c^T x - \mu \log \det F(x) \quad \text{subject to } F(x) > 0, \tag{1.5.3} \]

\[ \text{maximize } -\operatorname{tr}(F_0 Z) + \mu \log \det Z \quad \text{subject to } Z > 0 \text{ and } \operatorname{tr}(F_i Z) = c_i,\ i = 1, \ldots, m. \tag{1.5.4} \]

When strictly feasible x and Z exist, the above problems will have unique optimal solutions which may for instance be found via Newton’s algorithm. If the duality gap for these solutions is small, we will have found an approximate solution to our original problem. Of course, finding the right parameter µ (or rather the right updates for µ → 0) and finding efficient algorithms to implement these ideas are still an art, especially as closer to the optimal value the matrix becomes close to a singular one. It is beyond the scope of this book to go into further detail on semidefinite programming. In the notes we will refer the reader to references on the subject.
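As an illustration of how the parameter $\mu$ steers the solution, consider a toy problem of our own choosing (not from the text): $F(x) = I_2 + x F_1$, where $F_1$ has zero diagonal and off-diagonal entries 1, and $c = 1$. We take the barrier term with the sign that makes the primal objective convex, that is, we minimize $c^T x - \mu \log \det F(x)$; this is an assumption of the sketch. Since $\det F(x) = 1 - x^2$, the computation is elementary:

\[ f_\mu(x) = x - \mu \log(1 - x^2), \qquad f_\mu'(x) = 1 + \frac{2\mu x}{1 - x^2} = 0 \iff x^2 - 2\mu x - 1 = 0, \]

\[ x(\mu) = \mu - \sqrt{\mu^2 + 1} \in (-1, 0), \qquad Z(\mu) = \mu F(x(\mu))^{-1} = \frac{\mu}{1 - x(\mu)^2} \begin{pmatrix} 1 & -x(\mu) \\ -x(\mu) & 1 \end{pmatrix}. \]

The stationarity condition gives $\operatorname{tr}(F_1 Z(\mu)) = -\frac{2\mu x(\mu)}{1 - x(\mu)^2} = 1 = c$, so $Z(\mu)$ is dual feasible, and the duality gap equals $\operatorname{tr}(F(x(\mu)) Z(\mu)) = \mu \operatorname{tr}(I_2) = 2\mu$. As $\mu \to 0$ we get $x(\mu) \to -1$, which is the optimal value of the toy problem, and the gap tends to zero.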

We end this section with two examples. We will make use of Matlab’s LMI (Linear Matrix Inequalities) Toolbox to solve these problems numerically. The LMI Toolbox uses semidefinite programming.

Example 1.5.1 Find

\[ \max_{x_1, x_2} \lambda_{\min} \begin{pmatrix} 1 & 2 & x_1 & -1 \\ 2 & 1 & 1 & x_2 \\ x_1 & 1 & 1 & 1 \\ -1 & x_2 & 1 & 1 \end{pmatrix}. \tag{1.5.5} \]

We first observe that as the given data are all real, it suffices to consider real $x_1$ and $x_2$. Using Matlab’s command gevp (generalized eigenvalue problem), we get the following.

F0=[1 2 0 -1; 2 1 1 0; 0 1 1 1; -1 0 1 1]
F1=[0 0 1 0 ; 0 0 0 0; 1 0 0 0 ; 0 0 0 0]
F2=[0 0 0 0 ; 0 0 0 1; 0 0 0 0; 0 1 0 0]
setlmis([])
X1 = lmivar(1,[1 1]);
X2 = lmivar(1,[1 1]);
lmiterm([1 1 1 0],-F0)
lmiterm([1 1 1 X1],-F1,1)
lmiterm([1 1 1 X2],-F2,1)
lmiterm([-1 1 1 0],eye(4))
lmis = getlmis
[alpha,popt]=gevp(lmis,1,[1e-10 0 0 0 0 ])
x1 = dec2mat(lmis,popt,X1)
x2 = dec2mat(lmis,popt,X2)

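Running this script leads to $\alpha = 1.0000$, $x_1 = 1.0000$, $x_2 = -1.0000$. This leads to

\[
\alpha I + \begin{pmatrix} 1 & 2 & x_1 & -1 \\ 2 & 1 & 1 & x_2 \\ x_1 & 1 & 1 & 1 \\ -1 & x_2 & 1 & 1 \end{pmatrix}
= \begin{pmatrix} 2 & 2 & 1 & -1 \\ 2 & 2 & 1 & -1 \\ 1 & 1 & 2 & 1 \\ -1 & -1 & 1 & 2 \end{pmatrix} \ge 0,
\]

with $\alpha$ optimal. So the answer to (1.5.5) is $-1$.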
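The value $-1$ can be sanity-checked without any LMI software. The sketch below, in plain Python, computes the eigenvalues of the matrix in (1.5.5) with a cyclic Jacobi iteration; the routine and its names are our own illustration and are unrelated to the gevp solver used above. Because the leading $2 \times 2$ block has eigenvalues $3$ and $-1$, eigenvalue interlacing gives $\lambda_{\min} \le -1$ for every choice of $x_1, x_2$, so it is enough to see that $(x_1, x_2) = (1, -1)$ attains $-1$.

```python
import math


def jacobi_eigenvalues(matrix, sweeps=60, tol=1e-26):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that annihilates the (p, q) entry.
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):   # apply the rotation to columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):   # apply the transposed rotation to rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def lambda_min(x1, x2):
    """Smallest eigenvalue of the matrix in (1.5.5)."""
    m = [[1.0, 2.0, x1, -1.0],
         [2.0, 1.0, 1.0, x2],
         [x1, 1.0, 1.0, 1.0],
         [-1.0, x2, 1.0, 1.0]]
    return jacobi_eigenvalues(m)[0]


lam_at_solution = lambda_min(1.0, -1.0)
lam_elsewhere = [lambda_min(0.0, 0.0), lambda_min(0.3, 2.0)]
print(lam_at_solution, lam_elsewhere)
```

The second list only illustrates the interlacing bound at two arbitrarily chosen points; it is not a search over $x_1, x_2$.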

Example 1.5.2 Find the outer factorization of $p(z) = 2z^{-2} - z^{-1} + 6 - z + 2z^2$. In principle one can determine the roots of $p(z)$ and use the approach presented in the proof of Theorem 1.1.5 to compute the outer factor. Here we will present a different approach based on semidefinite programming that we will be able to use in the matrix-valued case as well (see Section 2.4). We are looking for $g(z) = g_0 + g_1 z + g_2 z^2$ so that

\[ p(z) = |g(z)|^2, \quad z \in \mathbb{T}. \tag{1.5.6} \]

If we introduce the matrix

\[ F := (F_{ij})_{i,j=0}^2 = \begin{pmatrix} g_0 \\ g_1 \\ g_2 \end{pmatrix} \begin{pmatrix} \overline{g_0} & \overline{g_1} & \overline{g_2} \end{pmatrix}, \]

we have that $F \ge 0$, $\operatorname{tr}(F) = |g_0|^2 + |g_1|^2 + |g_2|^2 = 6$, $F_{10} + F_{21} = g_1 \overline{g_0} + g_2 \overline{g_1} = -1$, and $F_{20} = g_2 \overline{g_0} = 2$. Here we used (1.5.6). Thus we need to find a positive semidefinite matrix $F$ with trace equal to 6, the sum of the entries in diagonal 1 equal to $-1$, and the (2,0) entry equal to 2. In addition, we must see to it that $g$ does not have any roots in the open unit disk. One way to do this is to maximize the value $|g(0)| = |g_0|$, which, as $|g_0 g_2| = |F_{20}| = 2$ is fixed, amounts to maximizing the product of the absolute values of the roots of $g$. As we have seen in the proof of Theorem 1.1.5, the roots of any factor $g$ of $p = |g|^2$ are the results of choosing one out of each pair $(\alpha, 1/\overline{\alpha})$ of roots of $p$. When we choose the roots that do not lie in the open unit disk, we obtain an outer $g$. This is exactly achieved by maximizing $|g_0|^2$. In conclusion, we want to determine

\[ \max F_{00} \quad \text{subject to} \quad \begin{cases} F = (F_{ij})_{i,j=0}^2 \ge 0, \\ \operatorname{tr} F = 6,\ F_{10} + F_{21} = -1,\ F_{20} = 2. \end{cases} \]

When we write a Matlab script to find the appropriate $F \ge 0$ we get the following:

F0=[6 -1 2; -1 0 0 ; 2 0 0] % Notice that any
F1=[1 0 0; 0 -1 0 ; 0 0 0] % affine combination
F2=[0 0 0; 0 1 0; 0 0 -1] % F =F0 + x1*F1 + x2*F2 + x3*F3
F3=[0 1 0; 1 0 -1 ; 0 -1 0] % satisfies the linear constraints.
setlmis([])
x1 = lmivar(1,[1 1]); % x1, x2, x3 are real scalars
x2 = lmivar(1,[1 1]);
x3 = lmivar(1,[1 1]);
lmiterm([-1 1 1 0],F0)
lmiterm([-1 1 1 x1],F1,1) % these lmiterm commands
lmiterm([-1 1 1 x2],F2,1) % create the LMI
lmiterm([-1 1 1 x3],F3,1) % 0 ≤ F0 + x1*F1 + x2*F2 + x3*F3
lmis = getlmis
[copt,xopt] = mincx(lmis,[-1;0;0],[1e-6 0 0 0 0 ]) % minimize -x1
F=F0+xopt(1)*F1+xopt(2)*F2 + xopt(3)*F3
chol(F)

Executing this script leads to

\[ F = \begin{pmatrix} 5.1173 & -0.7190 & 2.0000 \\ -0.7190 & 0.1010 & -0.2810 \\ 2.0000 & -0.2810 & 0.7817 \end{pmatrix}, \]

which factors as $vv^*$ with $v^* = \begin{pmatrix} 2.2621 & -0.3178 & 0.8841 \end{pmatrix}$. Thus $p(z) = |2.2621 - 0.3178z + 0.8841z^2|^2$, $z \in \mathbb{T}$. It is no coincidence that the optimal matrix $F(x)$ is of rank 1 (see Exercise 1.6.49).
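The reported factor can be checked by hand or with a few lines of code. The Python sketch below (our own illustration, with hypothetical names) recomputes the coefficients of $|g(z)|^2$ on the circle from the rounded values of $g$ and confirms that both roots of $g$ lie outside the closed unit disk. Since the coefficients are rounded to four decimals, agreement with 6, $-1$ and 2 is only expected up to about $10^{-3}$.

```python
import cmath

# Rounded coefficients g0, g1, g2 of the factor reported above.
g = [2.2621, -0.3178, 0.8841]


def autocorrelation(coeffs, k):
    """Coefficient of z^k in |g(z)|^2 on the unit circle (real coefficients)."""
    return sum(coeffs[j + k] * coeffs[j] for j in range(len(coeffs) - k))


r0 = autocorrelation(g, 0)   # should be close to the constant term 6
r1 = autocorrelation(g, 1)   # should be close to the coefficient of z, -1
r2 = autocorrelation(g, 2)   # should be close to the coefficient of z^2, 2

# Roots of g2*z^2 + g1*z + g0 via the quadratic formula.
disc = cmath.sqrt(g[1] ** 2 - 4 * g[2] * g[0])
roots = [(-g[1] + disc) / (2 * g[2]), (-g[1] - disc) / (2 * g[2])]
root_moduli = [abs(z) for z in roots]
print(r0, r1, r2, root_moduli)
```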


1.6 EXERCISES

1 Prove Proposition 1.1.1.

2 For the following subsets of $\mathbb{R}^2$ (endowed with the usual inner product), check whether they are (1) cones; (2) closed. In the case of a cone determine their dual and their extreme rays.

(a) $\{ (x, y) \in \mathbb{R}^2 : x, y \ge 0 \}$.

(b) $\{ (x, y) \in \mathbb{R}^2 : xy > 0 \}$.

(c) $\{ (x, y) \in \mathbb{R}^2 : 0 \le x \le y \}$.

(d) $\{ (x, y) \in \mathbb{R}^2 : x + y \ge 0 \}$.

3 Prove Lemma 1.1.2. (For the proof of part (ii) one needs the following corollary of the Hahn-Banach separation theorem: given a closed convex set $A \subset \mathcal{H}$ and a point $h_0 \notin A$, there exist a $y \in \mathcal{H}$ and an $\alpha \in \mathbb{R}$ such that $\langle h_0, y \rangle < \alpha$ and $\langle a, y \rangle \ge \alpha$ for all $a \in A$.)

4 Show that $p(z) = \sum_{k=-n}^n p_k z^k$ is real for all $z \in \mathbb{T}$ if and only if $p_k = \overline{p_{-k}}$, $k = 0, \ldots, n$.

5 Prove the uniqueness up to a constant factor of modulus 1 of the outer(co-outer) factor of a polynomial p ∈ TPol+n .

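6 Show that $\mathrm{PSD}_G = (\mathrm{PSD}_G)^*$ if and only if all the maximal cliques in the graph $G$ are disjoint.

7 For the following matrices $A$ and patterns $P$ determine whether $A$ belongs to each of $\mathrm{PPSD}_G$, $\mathrm{PSD}_G$, $(\mathrm{PSD}_G)^*$, and $(\mathrm{PPSD}_G)^*$. Here $G$ is the graph associated with $P$.

(a) $P = \{ (i, j) ;\ 1 \le i, j \le 3 \text{ and } |i - j| \le 1 \}$, and $A = \begin{pmatrix} 5 & 1 & 0 \\ 1 & 2 & \frac{1}{2} \\ 0 & \frac{1}{2} & 1 \end{pmatrix}$.

(b) $P = \{ (i, j) ;\ 1 \le i, j \le 3 \text{ and } |i - j| \le 1 \}$, and $A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & \frac{1}{2} \\ 0 & \frac{1}{2} & \frac{1}{4} \end{pmatrix}$.

(c) $P = (\{1, \ldots, 4\} \times \{1, \ldots, 4\}) \setminus \{ (1, 3), (2, 4), (3, 1), (4, 2) \}$ and

\[ A = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 2 & 1 & 0 \\ 0 & 1 & 1 & -1 \\ 1 & 0 & -1 & 2 \end{pmatrix}. \]

(d) $P = \{ (i, j) ;\ 1 \le i, j \le 5 \text{ and } |i - j| \ne 2 \}$ and

\[ A = \begin{pmatrix} 1 & 2 & 0 & 4 & 5 \\ 2 & 4 & 6 & 0 & 10 \\ 0 & 6 & 9 & 12 & 0 \\ 4 & 0 & 12 & 16 & 20 \\ 5 & 10 & 0 & 20 & 25 \end{pmatrix}. \]

(e) $P = \{ (i, j) ;\ 1 \le i, j \le 3 \text{ and } |i - j| \le 1 \}$, and $A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 1 \end{pmatrix}$.

(f) $P = \{ (i, j) ;\ 1 \le i, j \le 3 \text{ and } |i - j| \le 1 \}$, and $A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}$.

(g) $P = (\{1, \ldots, 4\} \times \{1, \ldots, 4\}) \setminus \{ (1, 3), (2, 4), (3, 1), (4, 2) \}$ and

\[ A = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \end{pmatrix}. \]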

8 Can a chordal graph have 0, 1, 2, 3 simplicial vertices? If yes, provide an example of such a graph. If no, explain why not.

9 Let

\[ \mathcal{C}_n = \{ A = (A_{ij})_{i,j=1}^n \in \mathrm{PSD}_n : A_{ii} = 1,\ i = 1, \ldots, n \} \]

be the set of correlation matrices.

(a) Show that $\mathcal{C}_n$ is convex.

(b) Recall that $E \in \mathcal{C}_n$ is an extreme point of $\mathcal{C}_n$ if $E \pm \Delta \in \mathcal{C}_n$ implies that $\Delta = 0$. Show that if $E \in \mathcal{C}_3$ is an extreme point, then $E$ has rank 1.

(c) Show that the convex set of real correlation matrices $\mathcal{C}_3 \cap \mathbb{R}^{3 \times 3}$ has extreme points of rank 2. (In fact, they are exactly the rank 2 real correlation matrices whose off-diagonal entries have absolute value less than one.)

(d) Show that if $E \in \mathcal{C}_n$ is an extreme point of rank $k$ then $k^2 \le n$.

(e) Show that if $E \in \mathcal{C}_n \cap \mathbb{R}^{n \times n}$ is an extreme point of rank $k$ then $k(k+1) \le 2n$.

10 Let $G$ be an undirected graph on $n$ vertices. Let $\beta(G)$ denote the minimal number of edges to be added to $G$ to obtain a chordal graph. For instance, if $G$ is the four cycle then $\beta(G) = 1$. Prove the following.

(a) If $A$ generates an extreme ray of $\mathrm{PSD}_G \cap \mathbb{R}^{n \times n}$, then $\operatorname{rank} A \le \beta(G) + 1$.

(Hint: Let $\widehat{G}$ be a chordal graph obtained from $G$ by adding $\beta(G)$ edges. Let $A \in \mathrm{PSD}_G \cap \mathbb{R}^{n \times n}$ be of rank $r > \beta(G) + 1$. Write $A = \sum_{k=1}^r A_k$ with $\operatorname{rank} A_k = 1$ and $A_k \in \mathrm{PSD}_{\widehat{G}} \cap \mathbb{R}^{n \times n}$. Show now that there exist positive $c_1, \ldots, c_r$, not all equal, such that $\sum_{k=1}^r c_k A_k \in \mathrm{PSD}_G \cap \mathbb{R}^{n \times n}$. Conclude that $A$ does not generate an extreme ray.)

(b) If $A$ generates an extreme ray of $\mathrm{PSD}_G$, then $\operatorname{rank} A \le 2\beta(G) + 1$.

(c) Show that the bound in (a) is not sharp. (Hint: Consider the ladder graph $G$ on $2m$ vertices $\{1, \ldots, 2m\}$ where the edges are $(2i - 1, 2i)$, $i = 1, \ldots, m$, and $(2i - 1, 2i + 1)$, $(2i, 2i + 2)$, $i = 1, \ldots, m - 1$, and show that $\beta(G) = m - 1$ while all extreme rays in $\mathrm{PSD}_G \cap \mathbb{R}^{n \times n}$ are generated by matrices of rank $\le 2$.)

11 Let $G$ be the loop of $n$ vertices. Show that $\mathrm{PSD}_G \cap \mathbb{R}^{n \times n}$ has extreme rays generated by elements of rank $n - 2$, and that elements of higher rank do not generate an extreme ray.

12 Let $G$ be the (3, 2) full bipartite graph. Show that $\mathrm{PSD}_G$ has extreme rays generated by rank 3 matrices, while all extreme rays in $\mathrm{PSD}_G \cap \mathbb{R}^{5 \times 5}$ are generated by matrices of rank 2 or less.

13 This exercise concerns the dependence on $\Lambda$ of $\mathrm{TPol}_\Lambda^2$.

(a) Find a sequence $\{p_n\}_{n=-3}^3$ for which the matrix (1.3.1) is positive semidefinite but the matrix (1.3.2) is not.

(b) Find a trigonometric polynomial $q(z) = q_{-3} z^{-3} + \cdots + q_3 z^3 \ge 0$ such that $q(z) = |h(z)|^2$, where $h(z)$ is some polynomial $h(z) = h_0 + h_1 z + h_2 z^2 + h_3 z^3$, but $q(z)$ cannot be written as $|h(z)|^2$ with $h(z) = h_0 + h_1 z + h_3 z^3$.

(c) Explain why the existence of the sequence $\{p_n\}_{n=-3}^3$ in (a) implies the existence of the trigonometric polynomial $q(z)$ in (b), and vice versa.

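14 Let $T_2 = (c_{i-j})_{i,j=0}^2$ be positive semidefinite with $\dim \operatorname{Ker} T_2 = 1$. Show that $c_{-2} = \frac{1}{c_0} \bigl( c_{-1}^2 + (c_0^2 - |c_1|^2) e^{it} \bigr)$ for some $t \in \mathbb{R}$.

15 Let $T_n = (c_{i-j})_{i,j=0}^n$ be positive definite, and define $c_{n+1} = \overline{c_{-n-1}}$ via

\[ c_{n+1} := \begin{pmatrix} c_n & \cdots & c_1 \end{pmatrix} \begin{pmatrix} c_0 & \cdots & c_{-n+1} \\ \vdots & \ddots & \vdots \\ c_{n-1} & \cdots & c_0 \end{pmatrix}^{-1} \begin{pmatrix} c_1 \\ \vdots \\ c_n \end{pmatrix} + \frac{\det T_n}{\det T_{n-1}}\, e^{i t_0}, \tag{1.6.1} \]

where $t_0 \in \mathbb{R}$ may be chosen arbitrarily. Show that $T_{n+1} = (c_{i-j})_{i,j=0}^{n+1}$ is positive semidefinite and singular.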


16 Let $\mu$ be the measure given by $\mu(\theta) = \frac{1}{|e^{2i\theta} - 2|^2}\, d\theta$. Find its moments $c_k$ for $k = 0, 1, 2, 3, 4$, and show that $T_4 = (c_{i-j})_{i,j=0}^4$ is strictly positive definite.

17 Prove that the cone $DS$ in the proof of Theorem 1.3.5 is closed.

18 Consider the cone of positive maps $\Phi : \mathbb{C}^{n \times n} \to \mathbb{C}^{m \times m}$ (i.e., $\Phi$ is linear and $\Phi(\mathrm{PSD}_n) \subseteq \mathrm{PSD}_m$). Show that the maps $X \mapsto AXA^*$ and $X \mapsto AX^TA^*$ generate extreme rays of this cone.

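19 Write the following positive semidefinite Toeplitz matrices as the sum of rank 1 positive semidefinite Toeplitz matrices. (Hint: any $n \times n$ rank 1 positive semidefinite Toeplitz matrix has the form

\[ c \begin{pmatrix} 1 \\ \alpha \\ \vdots \\ \alpha^{n-1} \end{pmatrix} \begin{pmatrix} 1 & \overline{\alpha} & \cdots & \overline{\alpha}^{\,n-1} \end{pmatrix}, \]

where $c > 0$ and $\alpha \in \mathbb{T}$.)

(a) $T = \begin{pmatrix} 2 & 1 - i & 0 \\ 1 + i & 2 & 1 - i \\ 0 & 1 + i & 2 \end{pmatrix}$.

(b) $T = \begin{pmatrix} 3 & -1 + \sqrt{2}(1 - i) & 1 - 2i \\ -1 + \sqrt{2}(1 + i) & 3 & -1 + \sqrt{2}(1 - i) \\ 1 + 2i & -1 + \sqrt{2}(1 + i) & 3 \end{pmatrix}$.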

20 Let $c_j = \overline{c_{-j}}$ be complex numbers such that

\[ T_n = (c_{j-k})_{j,k=0}^n \]

is positive semidefinite.

(a) Assume that $T_n$ has a one-dimensional kernel, and let $(p_j)_{j=0}^n$ be a nonzero vector in this kernel. Show that $p(z) := \sum_{j=0}^n p_j z^j$ has all its roots on the unit circle $\mathbb{T}$.

(b) Show that

\[ \det \begin{pmatrix} c_0 & \cdots & c_{-n+1} & 1 \\ \vdots & & \vdots & \vdots \\ c_n & \cdots & c_1 & z^n \end{pmatrix} \]

has the same roots as $p$.

(c) Show that the roots of $p$ are all different (start with the cases $n = 1$ and $n = 2$).

(d) Denote the roots of $p$ by $\alpha_1, \ldots, \alpha_n$. Let $g^{(j)}(z) = g_0^{(j)} + \cdots + g_n^{(j)} z^n$ be the unique polynomial of degree $n$ such that

\[ g^{(j)}(z) = \begin{cases} 1 & \text{when } z = \alpha_j, \\ 0 & \text{when } z = \alpha_k,\ k \ne j. \end{cases} \]

Put $\rho_j = v_j^* T_n v_j$, where $v_j = (g_k^{(j)})_{k=0}^n$. Show that $c_j = \sum_{k=1}^n \rho_k \alpha_k^j$ for $j = -n, \ldots, n$.

(e) Show how the above procedure gives a way to find a measure of finite support for the truncated trigonometric moment problem (see Theorem 1.3.6) in the case that $T_n$ has a one-dimensional kernel.

(f) Adjust the above procedure to the case when $T_n$ has a kernel of dimension $k > 1$. (Hint: apply the above procedure to $T_{n-k+1}$.)

(g) Adjust the above procedure to the case when $T_n$ is positive definite. (Hint: choose a $c_{n+1} = \overline{c_{-n-1}}$ so that $T_{n+1}$ is positive semidefinite with a one-dimensional kernel; see Exercise 1.6.15.)

21 Given a finite positive Borel measure $\mu$ on $\mathbb{T}^d$ we consider integrals of the form

\[ I(f) = \int_{\mathbb{T}^d} f\, d\mu. \]

A cubature (or quadrature) formula for $I(f)$ is an expression of the form

\[ C(f) = \sum_{j=1}^n \rho_j f(\alpha_j), \]

with fixed nodes $\alpha_1, \ldots, \alpha_n \in \mathbb{T}^d$ and fixed weights $\rho_1, \ldots, \rho_n > 0$, such that $C(f)$ is a “good” approximation for $I(f)$ on a class of functions of interest. Show that if we express the positive semidefinite Toeplitz matrix

\[ (\widehat{\mu}(\lambda - \nu))_{\lambda, \nu \in \Lambda} \]

as

\[ \Bigl( \sum_{j=1}^n \rho_j \alpha_j^{\lambda - \nu} \Bigr)_{\lambda, \nu \in \Lambda} \]

(which can be done due to Theorem 1.3.5), then we have that

\[ I(p) = C(p) \quad \text{for all } p \in \mathrm{Pol}_{\Lambda - \Lambda}. \]

22 Use the previous two exercises to find a cubature formula for the measure $\mu(z) = \frac{1}{2\pi i} \frac{dz}{z |2 - z|^2}$ on $\mathbb{T}$ that is exact for trigonometric polynomials of degree 2.


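23 Consider

\[ A = \begin{pmatrix} 1 & t_1 & t_2 \\ \overline{t_1} & 1 & t_1 \\ \overline{t_2} & \overline{t_1} & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & t_1 & t_3 \\ \overline{t_1} & 1 & t_2 \\ \overline{t_3} & \overline{t_2} & 1 \end{pmatrix}. \]

Assume that $|t_1| = 1$. Show that $A \ge 0$ if and only if $t_2 = t_1^2$ and that $B \ge 0$ if and only if $|t_2| \le 1$ and $t_3 = t_1 t_2$.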

24 For $\gamma \in [0, \pi]$, consider the cone $C_\gamma$ of real valued trigonometric polynomials $q(z)$ on $\mathbb{T}$ of degree $n$ so that $q(e^{i\theta}) \ge 0$ for $-\gamma \le \theta \le \gamma$.

(a) Show that $q \in C_\gamma$ if and only if there exist polynomials $p_1$ and $p_2$ of degree at most $n$ and $n - 1$, respectively, so that

\[ q(z) = |p_1(z)|^2 + \left( z + \frac{1}{z} - 2\cos\gamma \right) |p_2(z)|^2, \quad z \in \mathbb{T}. \]

(Hint: Look at all the roots of $q(z)$ and conclude that the roots on the unit circle where $q(z)$ switches sign lie in the set $\{ e^{i\theta} : \theta \in [\gamma, 2\pi - \gamma] \}$ and that there exist an even number of them. Notice that the other roots appear in pairs $\alpha, \frac{1}{\overline{\alpha}}$, so that $q(e^{i\theta}) = q_1(e^{i\theta}) \prod_{k=1}^{2r} \sin \frac{\theta - t_k}{2}$, where $t_1, \ldots, t_{2r} \in [\gamma, 2\pi - \gamma]$ and $q_1$ is nonnegative on $\mathbb{T}$.)

(b) Determine the extreme rays of the cone $C_\gamma$.

(c) Show that the dual cone of $C_\gamma$ is given by those real valued trigonometric polynomials $g(z) = \sum_{k=-n}^n g_k z^k$ so that the following Toeplitz matrices are positive semidefinite:

\[ (g_{i-j})_{i,j=0}^n \ge 0 \quad \text{and} \quad (g_{i-j+1} + g_{i-j-1} - 2 g_{i-j} \cos\gamma)_{i,j=0}^{n-1} \ge 0. \]

25 Prove the first statement of the proof of Lemma 1.3.23.

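26 Using positive semidefinite matrices, provide an alternative proof for the fact that

\[ F(s, t) = s^2 t^2 (s^2 + t^2 - 1) + 1 \]

is not a sum of squares. (Hint: suppose that $F(s, t) = \sum_{i=1}^k g_i(s, t)^2$. Show that $g_i(s, t) = a_i + b_i st + c_i s^2 t + d_i s t^2$ for some real numbers $a_i, b_i, c_i$ and $d_i$. Now rewrite $F(s, t)$ as

\[ F(s, t) = \begin{pmatrix} 1 & st & s^2 t & s t^2 \end{pmatrix} \left[ \sum_{i=1}^k \begin{pmatrix} a_i \\ b_i \\ c_i \\ d_i \end{pmatrix} \begin{pmatrix} a_i & b_i & c_i & d_i \end{pmatrix} \right] \begin{pmatrix} 1 \\ st \\ s^2 t \\ s t^2 \end{pmatrix}. \]

Thus we have written

\[ F(s, t) = \begin{pmatrix} 1 & st & s^2 t & s t^2 \end{pmatrix} A \begin{pmatrix} 1 \\ st \\ s^2 t \\ s t^2 \end{pmatrix}, \]

where $A$ is a positive semidefinite real matrix. Now argue that this is impossible.)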

27 For the following polynomials $f$ determine polynomials $g_i$ with real coefficients so that $f = \sum_{i=1}^k g_i^2$, or argue that such a representation is impossible. For those for which the representation is impossible, try to find a representation as above with $g_i$ rational. (Hint: try $1 + x^2$, $1 + x^2 + y^2$, etc., as the numerator of $g_i$.)

(a) $f(x) = x^4 - 2x^3 + x^2 + 16$.

(b) $f(x, y) = 1 + x^2 y^4 + x^4 y^2 - 3x^2 y^2$.

(c) $f(x, y) = 2x^4 + 2xy^3 - x^2 y^2 + 5x^4$.

(d) $f(x, y) = x^2 y^4 + x^4 y^2 - 2x^3 y^3 - 2xy^2 + 2x^2 y + 1$.

(e) $f(x, y, z) = x^4 - (2yz + 1)x^2 + y^2 z^2 + 2yz + 2$.

(f) f(x, y) = 12x

2y4 − (2x3 − βx2)y3 + [(β2 + 2)x4 + βx3 + (β2 + 9)x2 +2βx]y2− (8βx2−4x)y+4(β2 +1). (Hint: consider the cases 0 < |β| < 2,β = 0, and |β| ≥ 2, separately.)

28 Recall that $p(x, y, z)$ is a homogeneous polynomial of degree $n$ if $p(ax, ay, az) = a^n p(x, y, z)$ for all $x, y, z$, and $a$.

(a) Show that if $p(x, y, z)$ is a homogeneous polynomial in three variables of degree 3, then

\[ 4p(0, 0, 1) + p(1, 1, 1) + p(-1, 1, 1) + p(1, -1, 1) + p(-1, -1, 1) - 2\bigl( p(1, 0, 1) + p(-1, 0, 1) + p(0, 1, 1) + p(0, -1, 1) \bigr) = 0. \]

(b) Show that if $f(x, y, z)$ is a sum of squares of homogeneous polynomials of degree 3, then

\[ f(1, 1, 1) + f(-1, 1, 1) + f(1, -1, 1) + f(-1, -1, 1) + f(1, 0, 1) + f(-1, 0, 1) + f(0, 1, 1) + f(0, -1, 1) - \tfrac{4}{5} f(0, 0, 1) \ge 0. \]

(c) Use (b) to show that the so-called Robinson polynomial

\[ x^6 + y^6 + z^6 - \bigl( x^4 (y^2 + z^2) + y^4 (x^2 + z^2) + z^4 (x^2 + y^2) \bigr) + 3x^2 y^2 z^2 \]

is not a sum of squares of polynomials.

(d) Show that the Robinson polynomial only takes on nonnegative values on $\mathbb{R}^3$.


29 Theorem 1.3.8 may be interpreted in the following way. Given a nonempty finite $S \subseteq \mathbb{Z}$ of the form $S = \Lambda - \Lambda$, then $B_S = \{ p : T_{p,\Lambda} \ge 0 \}$ if and only if $S = \{ jb : j = -k, \ldots, k \}$ for some $k, b \in \mathbb{N}$. For general finite $S$ satisfying $S = -S$ one may ask whether the equality

\[ B_S = \{ p : T_{p,\Lambda} \ge 0 \text{ for all } \Lambda \text{ with } \Lambda - \Lambda \subseteq S \} \]

holds. Show that for the set $S = \{ -4, -3, -1, 0, 1, 3, 4 \}$ equality does not hold.

30 Let $h : \mathbb{Z}^d \to \mathbb{Z}^d$ be a group isomorphism, and let $\Lambda$ be a finite subset of $\mathbb{Z}^d$. Show that $\Lambda$ has the extension property if and only if $h(\Lambda)$ has the extension property.

31 Prove that when $a \in \mathbb{T}^d$ and $G(\Lambda - \Lambda) \ne \mathbb{Z}^d$, then $\Omega_\Lambda(a)$ contains more than one element.

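32 The entropy of a continuous function $f \in C(\mathbb{T}^d)$, $f > 0$ on $\mathbb{T}^d$ is defined as

\[ E(f) = \frac{1}{(2\pi)^d} \int_{[0, 2\pi]^d} \log f(e^{i\langle k, t \rangle})\, dt. \]

Let $n \ge 1$, and let $\Lambda = R(0, n)$, $R(1, n)$, or $R(1, n) \setminus \{(1, n)\}$, and $\{c_k\}_{k \in \Lambda - \Lambda}$ be a positive sequence of complex numbers. We call a function

\[ f(e^{i\langle k, t \rangle}) = \sum_{k \in \mathbb{Z}^d} c_k(f)\, e^{i\langle k, t \rangle}, \quad f \in C(\mathbb{T}^d), \]

a positive definite extension of the sequence $\{c_k\}_{k \in \Lambda - \Lambda}$ if $f > 0$ on $\mathbb{T}^d$ and $c_k(f) = c_k$ for $k \in \Lambda - \Lambda$. Assume that $\{c_k\}_{k \in \Lambda - \Lambda}$ admits an extension $f \in C(\mathbb{T}^d)$, $f > 0$ on $\mathbb{T}^d$. Prove that among all positive continuous extensions of $\{c_k\}_{k \in \Lambda - \Lambda}$, there is a unique one denoted $f_0$ which maximizes the entropy. Moreover, $f_0$ is the unique positive extension of the form $\frac{1}{P}$, with $P \in \Pi_{\Lambda - \Lambda}$.

33 Let $\{c_{kl}\}_{(k,l) \in R(1,n) - R(1,n)}$ be a positive semidefinite sequence. Show that there exists a unique positive definite extension $d_{kl}$, $k \in \{-1, 0, 1\}$, $l \in \mathbb{Z}$, of $\{c_{kl}\}_{(k,l) \in R(1,n) - R(1,n)}$ which maximizes the entropy $E(\{d_{kl}\})$, and that this unique extension is given via the formula

\[ F_0(e^{it}, e^{iw}) = \frac{f_0(e^{iw}) \bigl[ f_0^2(e^{iw}) - |f_1(e^{iw})|^2 \bigr]}{|f_0(e^{iw}) - e^{it} f_1(e^{iw})|^2}, \]

where $f_0(e^{iw}) = \sum_{j=-\infty}^{\infty} d_{0j} e^{ijw}$ and $f_1(e^{iw}) = \sum_{j=-\infty}^{\infty} d_{1j} e^{ijw}$.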

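34 Prove that every positive semidefinite sequence on $S = \{-1, 0, 1\} \times \mathbb{Z}$ with respect to $R_1 = \{0, 1\} \times \mathbb{N}$ admits a positive semidefinite extension to $\mathbb{Z}^2$.

35 Suppose $\Lambda = \{(p_i, q_i)\}_{i=1}^n \subset \mathbb{Z}^2$ is such that

\[ A = \begin{pmatrix} p_1 & p_2 & \cdots & p_n \\ q_1 & q_2 & \cdots & q_n \end{pmatrix} \]

has rank 2. Prove the following.

(a) There exist integer matrices $P$ and $Q$ with $|\det P| = |\det Q| = 1$ such that

\[ PAQ = \begin{pmatrix} f_1 & 0 & 0 & \cdots & 0 \\ 0 & f_2 & 0 & \cdots & 0 \end{pmatrix} = F, \]

where $f_1 = \gcd\{ p_1, \ldots, p_n, q_1, \ldots, q_n \}$ and

\[ f_2 = \frac{1}{f_1} \gcd \left\{ \det \begin{pmatrix} p_i & p_j \\ q_i & q_j \end{pmatrix},\ 1 \le i < j \le n \right\}. \]

(b) $G(\Lambda - \Lambda) = \mathbb{Z}^2$ if and only if $f_1 = f_2 = 1$.

36 Let $\Lambda = \{ (p_i, q_i) : i = 1, \ldots, n \} \subset \mathbb{Z}^2$ and define

\[ d = \max\{ \gcd(p_i, q_i) : i = 1, \ldots, n \}. \tag{1.6.2} \]

Show that there exists an isomorphism $\Phi : \mathbb{Z}^2 \to \mathbb{Z}^2$ such that $(d, 0) \in \Phi(\Lambda)$. In addition, show that $d$ is the largest number with this property.

(Hint: if the maximum is obtained at $i = 1$, there exist integers $k, l$ such that $p_1 k - q_1 l = d$. Define $\Phi$ via the matrix

\[ \begin{pmatrix} k & -l \\ \frac{q_1}{d} & -\frac{p_1}{d} \end{pmatrix}.) \]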

37 Let $f(x) = \sum_{k=0}^m f_k x^k$ be a polynomial such that $f(x) \ge 0$, $x \in \mathbb{R}$, and $f_m \ne 0$.

(a) Show that $m = 2n$ for some $n \in \mathbb{N}_0$.

(b) Show that $f_k \in \mathbb{R}$, $k = 0, \ldots, 2n$.

(c) Show that $\alpha$ is a root of $f$ if and only if $\overline{\alpha}$ is a root of $f$, and show that if $\alpha$ is a real root, then it has even multiplicity.

(d) Show that $f(x) = |g(x)|^2$ for some polynomial $g$ of degree $n$ with possibly complex coefficients.

(e) Show that $g$ above can be chosen so that all its roots have nonnegative (nonpositive) imaginary part, and that with that choice $g$ is unique up to a constant of modulus 1.

(f) Show that $f = h_1^2 + h_2^2$ for some polynomials $h_1$ and $h_2$ of degree $\le n$ with real coefficients. (Hint: take $g$ as above and let $h_1$ and $h_2$ be real valued on the real line so that $g = h_1 + i h_2$ on the real line.)

(g) Use ideas similar to those above to show that a polynomial $\psi(x)$ with $\psi(x) \ge 0$, $x \ge 0$, can be written as

\[ \psi(x) = h_1(x)^2 + h_2(x)^2 + x \bigl( h_3(x)^2 + h_4(x)^2 \bigr), \]

where $h_1, h_2, h_3, h_4$ are polynomials with real coefficients.


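38 Define the $(n + 1) \times (n + 1)$ matrices $\Phi_n$ and $\Psi_n$ such that

\[ \Phi_n \begin{pmatrix} 1 \\ u \\ \vdots \\ u^n \end{pmatrix} = \begin{pmatrix} (1 + iu)^0 (1 - iu)^n \\ (1 + iu)^1 (1 - iu)^{n-1} \\ \vdots \\ (1 + iu)^n (1 - iu)^0 \end{pmatrix}, \qquad \Psi_n \begin{pmatrix} 1 \\ z \\ \vdots \\ z^n \end{pmatrix} = \begin{pmatrix} (i - iz)^0 (1 + z)^n \\ (i - iz)^1 (1 + z)^{n-1} \\ \vdots \\ (i - iz)^n (1 + z)^0 \end{pmatrix}. \]

For instance, when $n = 2$ we obtain the matrix

\[ \Phi_2 = \begin{pmatrix} 1 & -2i & -1 \\ 1 & 0 & 1 \\ 1 & 2i & -1 \end{pmatrix}. \]

(i) Show that $\Psi_n \Phi_n = 2^n I_{n+1} = \Phi_n \Psi_n$.

(ii) Prove that if $H = (h_{i+j})_{i,j=0}^n$ is a Hankel matrix then $T = \Phi_n H \Phi_n^*$ is a Toeplitz matrix, and vice versa. (A Hankel matrix $H = (h_{ij})$ is a matrix that satisfies $h_{ij} = h_{i+1,j-1}$ for all applicable $i$ and $j$.)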

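39 Let $\mathrm{RPol}_n = \{ g : g(x) = \sum_{k=0}^n g_k x^k,\ g_k \in \mathbb{R} \}$ be the space of real valued degree $\le n$ polynomials, and let $\mathrm{RPol}_n^+ = \{ g \in \mathrm{RPol}_n : g(x) \ge 0 \text{ for all } x \in \mathbb{R} \}$ be the cone of nonnegative valued degree $\le n$ polynomials. On $\mathrm{RPol}_n$ define the inner product

\[ \langle g, h \rangle = \sum_{i=0}^n g_i h_i, \quad \text{for } g(x) = \sum_{k=0}^n g_k x^k \text{ and } h(x) = \sum_{k=0}^n h_k x^k. \]

For $h(x) = \sum_{i=0}^{2n} h_i x^i$ define the Hankel matrix

\[ H_h = (h_{i+j})_{i,j=0}^n. \]

(a) Show that $\mathrm{RPol}_{2n+1}^+ = \mathrm{RPol}_{2n}^+$ (see also part (a) of Exercise 1.6.37).

(b) Show that for $g(x) = \sum_{k=0}^n g_k x^k$ and $h(x) = \sum_{k=0}^{2n} h_k x^k$ we have that

\[ \langle g^2, h \rangle = \begin{pmatrix} g_0 & \cdots & g_n \end{pmatrix} H_h \begin{pmatrix} g_0 \\ \vdots \\ g_n \end{pmatrix}. \]

(c) Show that the dual cone of $\mathrm{RPol}_{2n}^+$ is the cone

\[ C = \{ h \in \mathrm{RPol}_{2n} : H_h \ge 0 \}. \]

(d) Show that the extreme rays of $\mathrm{RPol}_{2n}^+$ are generated by the polynomials in $\mathrm{RPol}_{2n}^+$ that have only real roots.

(e) Show that the extreme rays of $C$ are generated by those polynomials $h$ for which $H_h$ has rank 1. (This part may be easier to do after reading the proof of Theorem 2.7.9.)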

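40 Let $p(x) = p_0 + \cdots + p_{2n} x^{2n}$ with $p_j \in \mathbb{R}$.

(a) Show that there exist polynomials $g_j(x) = g_0^{(j)} + \cdots + g_n^{(j)} x^n$, $j = 1, \ldots, m$, with real coefficients such that $p = \sum_{j=1}^m g_j^2$, if and only if there exists a positive semidefinite matrix $F = (F_{jk})_{j,k=0}^n$ satisfying

\[ \sum_{l=0}^k F_{k-l,l} = p_k, \quad k = 0, \ldots, 2n. \tag{1.6.3} \]

(b) Let

\[ F_0 = \begin{pmatrix} p_0 & \frac{1}{2} p_1 & & \\ \frac{1}{2} p_1 & p_2 & \ddots & \\ & \ddots & \ddots & \frac{1}{2} p_{2n-1} \\ & & \frac{1}{2} p_{2n-1} & p_{2n} \end{pmatrix} \in \mathbb{C}^{(n+1) \times (n+1)}. \]

Show that $F = F^*$ satisfies (1.6.3) if and only if there is an $n \times n$ Hermitian matrix $Y = (Y_{jk})_{j,k=1}^n$ so that

\[ F = F_0 + \begin{pmatrix} 0_{n \times 1} & iY \\ 0_{1 \times 1} & 0_{1 \times n} \end{pmatrix} + \begin{pmatrix} 0_{1 \times n} & 0_{1 \times 1} \\ -iY & 0_{n \times 1} \end{pmatrix}. \]

Here $0_{j \times k}$ is the $j \times k$ zero matrix.

(c) Let

\[ A = -i \begin{pmatrix} 0 & & & \\ 1 & 0 & & \\ & \ddots & \ddots & \\ & & 1 & 0 \end{pmatrix} \in \mathbb{C}^{n \times n}, \qquad B = -i \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \in \mathbb{C}^{n \times 1}. \]

Show that

\[ \begin{pmatrix} 0_{1 \times n} & 0_{1 \times 1} \\ -iY & 0_{n \times 1} \end{pmatrix} = \begin{pmatrix} 0_{1 \times 1} & 0_{1 \times n} \\ YB & YA \end{pmatrix}. \]

(d) Assume that $p_0 > 0$ and write $F_0$ as

\[ F_0 = \begin{pmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{pmatrix}, \]

where $G_{11} = p_0$ and $G_{22} \in \mathbb{C}^{n \times n}$. Show that $F$ in (b) is positive semidefinite if and only if

\[ A^* Y + Y A - (Y B + G_{21}) G_{11}^{-1} (B^* Y + G_{12}) + G_{22} \ge 0. \tag{1.6.4} \]

In addition, show that $F$ is of rank 1 if and only if equality holds in (1.6.4).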


(e) The equation

\[ P^* Y + Y P - Y Q Y + R = 0 \tag{1.6.5} \]

for known matrices $P$, $Q = Q^*$, $R = R^*$, and unknown $Y = Y^*$ is referred to as the Algebraic Riccati Equation. Show that this Riccati equation has a solution $Y$ if and only if the range of the matrix $\begin{pmatrix} I_n \\ Y \end{pmatrix}$ is an invariant subspace of the so-called Hamiltonian matrix

\[ H = \begin{pmatrix} -P^* & Q \\ R & P \end{pmatrix}. \]

(Hint: rewrite the Riccati equation as $\begin{pmatrix} -Y & I_n \end{pmatrix} H \begin{pmatrix} I_n \\ Y \end{pmatrix} = 0$.)

(f) For $p(x) = 1 + 4x + 10x^2 + 12x^3 + 9x^4$, determine the appropriate Riccati equation and its Hamiltonian. Use the eigenvectors of the Hamiltonian to determine an invariant subspace of $H$ of the appropriate form, and obtain a solution to the Riccati equation. Use this solution to determine a rank 1 positive semidefinite $F$ satisfying (1.6.3), and obtain a factorization $p(x) = |g(x)|^2$. Notice that if one chooses the invariant subspace of the Hamiltonian by using eigenvectors whose eigenvalues lie in the right half-plane, the corresponding $g$ has all its roots in the closed left half-plane.

Remark: The Riccati equation is an important tool in control theory to determine factorizations of matrix-valued functions with desired properties (see also Section 2.7).

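41 Consider

\[ \mathcal{C} = \{ A = (a_{ij})_{i,j=1}^3 \in \mathbb{R}^{3 \times 3} : A = A^T \ge 0 \text{ and } a_{12} = a_{33} \}. \]

(a) Show that $\mathcal{C}$ is a closed cone.

(b) Show that for $x, y \in \mathbb{R}$, not both 0, the matrix

\[ \begin{pmatrix} x^2 \\ y^2 \\ xy \end{pmatrix} \begin{pmatrix} x^2 & y^2 & xy \end{pmatrix} \tag{1.6.6} \]

generates an extreme ray of $\mathcal{C}$.

(c) Show that all extreme rays are generated by a matrix of the type (1.6.6). (Hint: the most involved part is to show that any rank 2 element in $\mathcal{C}$ does not generate an extreme ray. To prove this, first show that if

\[ \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}, \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}, \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \in \mathbb{R}^2 \]

are such that $a_1 b_1 + a_2 b_2 = c_1^2 + c_2^2$, then there exist $\alpha$ and $x_1, x_2, y_1, y_2$ in $\mathbb{R}$ so that

\[ \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \end{pmatrix} = \begin{pmatrix} x_1^2 & y_1^2 & x_1 y_1 \\ x_2^2 & y_2^2 & x_2 y_2 \end{pmatrix}. \]

The correct $\alpha$ will satisfy $\tan(2\alpha) = \frac{a_1 b_1 - a_2 b_2 - c_1^2 + c_2^2}{a_1 b_2 + a_2 b_1 - 2 c_1 c_2}$.)

(d) Determine the dual cone of $\mathcal{C}$. What are its extreme rays?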

42 Let $A = (A_{ij})_{i,j=1}^n$ be positive definite. Prove that there exists a unique positive definite matrix $F = (F_{ij})_{i,j=1}^n$ such that $F_{ij} = A_{ij}$, $i \ne j$, $\operatorname{tr} F = \operatorname{tr} A$ and the diagonal entries of the inverse of $F$ are all the same. In case $A$ is real, $F$ is also real. (Hint: consider in Theorem 1.4.3 $B = 0$ and let $\mathcal{W}$ be the set of all real diagonal $n \times n$ matrices with their trace equal to 0.)

43 Consider the partial matrix

\[ A = \begin{pmatrix} 2 & ? & 1 & 1 \\ ? & 2 & ? & 1 \\ 1 & ? & 2 & ? \\ 1 & 1 & ? & 2 \end{pmatrix}. \]

Find the positive definite completion for which the determinant is maximal. Notice that this completion is not Toeplitz. Next, find the positive definite Toeplitz completion for which the determinant is maximal.

44 Find the representation (1.3.10) for the matrix

\[ \begin{pmatrix} 2 & -1 & & & -1 \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ -1 & & & -1 & 2 \end{pmatrix}. \tag{1.6.7} \]

45 Let $A$ be a positive semidefinite Toeplitz matrix. Find a closest (in the Frobenius norm) rank 1 positive semidefinite Toeplitz matrix $A_0$. When is the solution unique? (Hint: use (1.3.10).) Is the solution unique in the case of the matrix (1.6.7)?

46 Let $A = (A_{j-i})_{i,j=1}^n$ be a positive definite Toeplitz matrix, $0 < p_1 < p_2 < \cdots < p_r < n$ and $\alpha_1, \ldots, \alpha_r \in \mathbb{C}$. Then there exists a unique Toeplitz matrix $F = (F_{j-i})_{i,j=1}^n$ with $F_q = A_q$, $|q| \ne p_1, \ldots, p_r$, and

\[ \sum_{j-i=p_k} (F^{-1})_{ij} = \alpha_k, \quad k = 1, \ldots, r. \]

In case $A$ and $\alpha_1, \ldots, \alpha_r$ are real, the matrix $F$ is also real. (Hint: Consider in Theorem 1.4.3 $B = (B_{ij})_{i,j=1}^n$ with $B_{1 p_k} = B_{p_k 1}^* = \alpha_k$, $k = 1, \ldots, r$ and $B_{ij} = 0$ otherwise, and let $\mathcal{W}$ be the set of all Hermitian Toeplitz matrices which have entries equal to 0 on all diagonals except $\pm p_1, \ldots, \pm p_r$.)

47 Recall that $s_1(X) \ge s_2(X) \ge \cdots$ denote the singular values of the matrix $X$.

(a) Show that for positive definite $n \times n$ matrices $A$ and $B$ it holds that

\[ \operatorname{tr}(A - B)(B^{-1} - A^{-1}) \ge \frac{\sum_{j=1}^n s_j(A - B)^2}{s_1(A) s_1(B)}. \]

(b) Use (a) to give an alternative proof of Corollary 1.4.4 that a positive definite matrix is uniquely determined by some of its entries and the complementary entries of its inverse (in fact, the statement is slightly stronger than Corollary 1.4.4, as we do not require the given entries in the matrix to include the main diagonal).

48 Prove Theorem 1.4.9.

49 Let $p_j = \overline{p_{-j}}$ be complex numbers and introduce the convex set

\[ \mathcal{F} = \Bigl\{ F = (F_{j,k})_{j,k=0}^n \ge 0 : \sum_{l=0}^{n-j} F_{l+j,l} = p_j,\ j = 0, \ldots, n \Bigr\}, \]

and assume that $\mathcal{F} \ne \emptyset$. Let $F_{\rm opt} \in \mathcal{F}$ be so that $(F_{\rm opt})_{00} \ge F_{00}$ for all $F \in \mathcal{F}$. Show that $F_{\rm opt}$ is of rank 1. (Hint: Let

\[ F_{\rm opt} = \begin{pmatrix} a & 0_{1 \times n} \\ b & F_1 \end{pmatrix} \begin{pmatrix} \overline{a} & b^* \\ 0_{n \times 1} & F_1^* \end{pmatrix} \]

be a Cholesky factorization of $F_{\rm opt}$. If $F_1$ has a nonzero upper left entry, argue that

\[ F_{\rm opt} - \begin{pmatrix} 0_{1 \times 1} & 0_{1 \times n} \\ 0_{n \times 1} & F_1 F_1^* \end{pmatrix} + \begin{pmatrix} F_1 F_1^* & 0_{n \times 1} \\ 0_{1 \times n} & 0_{1 \times 1} \end{pmatrix} \]

lies in $\mathcal{F}$ but has a larger upper left entry than $F_{\rm opt}$. Adjust the argument to also include the case when $F_1 \ne 0$ but happens to have a zero upper left entry.)

Remark. Notice that the above argument provides a simple iterative method to find $F \in \mathcal{F}$ whose upper left entry is maximal. In principle one could employ this naive procedure to find a solution to the problem in Example 1.5.2.


1.7 NOTES

Section 1.1

The general results about cones and their duals, as well as those about positive (semi)definite matrices in Section 1.1 can be found in works like [91], [485], [108], [21], [319], and [538]. The proof of the Fejér-Riesz theorem dates back to [213] and can also be found in many contemporary texts (see, e.g., [53] and [510]). In Section 2.4 we present its generalization for operator-valued polynomials.

Section 1.2

Matrices which are inverses of tridiagonal ones were first characterized in [80] and matrices which are the inverses of banded ones in [81]. The theory of positive definite completions started in [114] with the basic 3 × 3 block matrix case. Next, banded partially positive definite block matrices were considered in [192]. Theorems 2.1.1 and 2.1.4 were first proven there in the finite dimensional case.

In [285] undirected graphs were associated with the patterns of partially positive semidefinite matrices and the importance of chordal graphs in this context was discovered. Proposition 1.2.9 is also taken from [285]. The approach based on cones and duality to study positive semidefinite extensions originates in [452] (see also [315] for the block tridiagonal case). Theorem 1.2.10 was first obtained there. The equality (PSDG)∗ = PPSDG for a chordal graph G was also established there, in a way similar to that in Section 1.2. The extreme rays of PSDG for nonchordal graphs G were studied in [9], [309], [311], and [518]. Example 1.2.11 is taken from [9]. All notation, definitions, and results in Section 1.2 concerning graph theory follow the book [276].

Section 1.3

The approach in Section 1.3 for characterizing finite subsets of Zd which have the extension property was developed in [115] and [503] (see also [510]). The characterization in [503] reduces the problem to the equivalence between TPol2Λ = TPol+S and BS = AΛ. Theorem 1.3.3 was taken from [314], which contains the simplest version of Bochner’s theorem, namely the one for G = Z.

Theorem 1.3.6 was originally proved by Carathéodory and Fejér in [120] by using a result in [119] by which the extremal rays of the cone of n × n positive semidefinite Toeplitz matrices are generated by rank 1 such matrices; see also [281, Section 4.1] for an excellent account. Corollary 1.3.7 may be found in [133].

Theorem 1.3.5 was first obtained in [240] by a different method. The theorem replicates a result by Naimark [17] characterizing the extremal solutions of the Hamburger moment problem. Theorem 1.3.8 through Corollary 1.3.16 were also first proven in [240]. However, we found an inaccuracy in the original statement and proof of Lemma 1.3.15. Still, Theorem 1.3.11 holds, which is the key for the proof of the main result, Corollary 1.3.16. We thank the author of that article for his kind help.

Theorem 1.3.20 comes from [58]. The first proofs of Corollary 1.3.34, Corollary 1.3.40, and Theorem 1.3.18 were obtained in [59]. A reference on the Smith normal forms for matrices with integer entries is [415]. Theorem 1.3.25 is due to [336].

Theorem 1.3.22 was proved for R(N, N), N ≥ 3, in [115] and [503]. That R(2, 2) and R(1, 1, 1) (see Theorem 1.3.17) do not have the extension property appears in [506] (see also [507]). A theorem by Hilbert [317] states that for every N ≥ 3, there exist real polynomials F(s, t) of degree 2N which are positive for every (s, t), but cannot be represented as sums of squares of polynomials. The first explicit example of such a polynomial was found in [426] (see Exercise 1.6.27(b)). Other parts of Exercise 1.6.27 came from [447], [398], and [103]. Many other examples appear in [483]. The polynomial in Lemma 1.3.23 and its proof can be found in [90] (see also [510]). Its use makes Theorem 1.3.22 true for R(2, 2) as well ([510]).

For applications of positive trigonometric polynomials to signal processing, see, for instance, [188].

Section 1.4

Theorem 1.4.3 and Corollary 1.4.4 are taken from [69]. Corollary 1.4.4 appears earlier in [525, Theorem 1] as a statement on Gaussian measures with prescribed margins. The modified Newton algorithm used in Example 1.5.1 can be found in [433] (see also [106]). For a semidefinite programming approach, see [26]. Corollaries 1.4.4 and 1.4.5 were first proved in [219] (see also [285]). Exercise 1.6.47, which is based on [219], shows the reasoning there. In [70] an operator-valued version of Corollary 1.4.5 appears. The results in Subsection 1.4.2 are presented for polynomials of a single variable, but they similarly hold for polynomials of several variables as well. For a matrix-valued counterpart of Theorem 1.4.9 see Theorem 2.3.9. The fact that strictly diagonally dominant Hermitian matrices with positive diagonal entries are positive definite follows directly from the Gershgorin disk theorem; see for example [323].
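The Gershgorin argument mentioned above can be illustrated numerically. The following is only a minimal sketch: the function names and the sample matrix are hypothetical choices made for illustration and do not come from the cited references. For a Hermitian matrix A, every eigenvalue lies in some disc centered at a diagonal entry a_ii with radius equal to the sum of the moduli of the off-diagonal entries in row i, so the smallest left endpoint of these discs bounds the eigenvalues from below. When that bound is positive, the quadratic form x*Ax is at least the bound times the squared norm of x.

```python
def gershgorin_lower_bound(A):
    """Smallest left endpoint a_ii - sum_{j != i} |a_ij| over all Gershgorin discs."""
    n = len(A)
    return min(
        A[i][i].real - sum(abs(A[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )


def quadratic_form(A, x):
    """Value of x^* A x; it is real whenever A is Hermitian."""
    n = len(A)
    total = sum(
        complex(x[i]).conjugate() * A[i][j] * x[j]
        for i in range(n)
        for j in range(n)
    )
    return total.real


# Hypothetical strictly diagonally dominant Hermitian matrix (illustration only).
A = [
    [4, 1 + 1j, 0],
    [1 - 1j, 3, 1],
    [0, 1, 2],
]

bound = gershgorin_lower_bound(A)  # positive, so all eigenvalues are positive
x = [1, -1, 1j]
norm_squared = sum(abs(v) ** 2 for v in x)

print(bound > 0)
print(quadratic_form(A, x) >= bound * norm_squared)
```

The check on a single vector x is of course not a proof of positive definiteness; it only shows the inequality that the disk theorem guarantees for every x.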

Section 1.5

Semidefinite programming (and the more general area of conic optimization) is an area that has developed during the past two decades. It is not our aim to give here a detailed account on this subject. The articles and books [106], [433], [107], [554], [108], [258], and [170] are good sources for information on this topic.

Exercises

Exercise 1.6.9 is based on [286] (parts (c), (e)) and [131] (parts (b), (d)); see also [409] and [404] for more results on extreme points of the set of correlation matrices.

Exercise 1.6.10 (a) and (b) are based on [518, Theorem B], which solved a conjecture from [309]. Part (c) of this exercise is based on [309]. As a separate note, let us remark that the computation of β(G) is an NP-complete problem; see [582]. In addition, all graphs for which the extreme rays are generated by matrices of rank at most 2 are characterized in [395].

The answer to Exercise 1.6.11 can be found in [9].

Exercise 1.6.12 is based on a result in [309].

Exercise 1.6.18 is based on [583]. In [316] the problem of finding a closest correlation matrix is considered, motivated by a problem in finance.

Exercise 1.6.23 is based on the proof of Lemma 2.2 in [336].

Exercise 1.6.24 is a result proved by N. Akhiezer and M.G. Krein (see [376]; in [25] the matrix-valued case appears), while Exercises 1.6.42 and 1.6.46 are results from [69].

Exercise 1.6.28 is based on [95]. The Robinson polynomial appears first in the paper [484]; see also [483] for a further discussion.

Exercises 1.6.32 and 1.6.33 come from [58].

The operations Φ and Ψ from Exercise 1.6.38 are discussed in [330].

Exercise 1.6.43 is based on an example from [412], and the answer for Exercise 1.6.45 may be found in [533]. The matrix (1.6.7), as well as minor variations, is discussed in [531].

Additional notes

The standard approach to the frequency-domain design of one-dimensional recursive digital filters begins by finding the squared magnitude response of a filter that best approximates the desired response. Spectral factorization is then performed to find the (unique) minimum-phase filter that has that magnitude response, which is guaranteed to be stable. The design problem, embodied by the determination of the squared magnitude function, and the implementation problem, embodied by the spectral factorization, are decoupled.

Consider a class of systems whose input u and output y satisfy a linear constant coefficient difference equation of the form

\[ \sum_{k=0}^{n} a_k y_{t-k} = \sum_{k=0}^{n} b_k u_{t-k}. \tag{1.7.1} \]

The frequency response, that is, the system transfer function evaluated on the unit circle, has the form

\[ f(z) = \frac{\sum_{k=0}^{n} a_k z^k}{\sum_{k=0}^{n} b_k z^k}. \]

In order to design a low-pass filter, the constraints we have to satisfy are best written using the squared magnitude of the filter frequency response |f(z)|^2. In designing a low-pass filter one would like |f(z)|^2 ≈ 1 when z ∈ {e^{it} : t ∈ [0, s]} for some 0 < s < π, and |f(z)|^2 ≈ 0 when z ∈ {e^{it} : t ∈ [s, π]}. A typical squared magnitude response of a low-pass filter looks like the graph above (see for example [581] or [244]).
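To make the squared magnitude response concrete, the following sketch evaluates |f(e^{it})|^2 for a quotient of two polynomials in z, written in the same form as the frequency response displayed above. The function names and the sample coefficients are hypothetical and chosen only for illustration; the sample f(z) = (1 + z)/2 is a crude low-pass response whose squared magnitude equals cos^2(t/2), so it is 1 at t = 0 and decreases to 0 at t = π.

```python
import cmath
import math


def polynomial_on_circle(coeffs, t):
    """Evaluate sum_k coeffs[k] * z**k at the point z = e^{it} of the unit circle."""
    z = cmath.exp(1j * t)
    return sum(c * z ** k for k, c in enumerate(coeffs))


def squared_magnitude(a, b, t):
    """|f(e^{it})|^2 for f(z) = (sum_k a[k] z^k) / (sum_k b[k] z^k)."""
    return abs(polynomial_on_circle(a, t) / polynomial_on_circle(b, t)) ** 2


# Hypothetical example: f(z) = (1 + z) / 2, with |f(e^{it})|^2 = cos(t/2)^2.
a = [0.5, 0.5]
b = [1.0]

for t in (0.0, math.pi / 4, math.pi / 2, 3 * math.pi / 4, math.pi):
    print(round(t, 3), squared_magnitude(a, b, t))
```

A practical design would of course use a higher degree so that the transition near t = s is sharper; the sketch only shows how the quantity being constrained is computed.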

Spectral factorization is now performed to find the coefficients a_k, b_k of the minimum-phase filter that has the desired magnitude response, which is guaranteed to be stable. The coefficients may then be used to implement the filter. General references on the subject include [347], [584], [27], [561], [469], [446], [589], [333], [107], and [432].
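As an illustration of the spectral factorization step, here is a minimal sketch for the polynomial case with real coefficients. It is not the procedure prescribed by the references above; it uses one classical approach, usually attributed to Bauer, in which a large banded Toeplitz matrix built from the coefficients c_0, ..., c_n of the squared magnitude is Cholesky factored and the last row of the factor approximates the coefficients p_0, ..., p_n of the factor with sum_k p_k p_{k+m} = c_m. The function names and the truncation size are assumptions made for this sketch, and the squared magnitude is assumed to be strictly positive on the unit circle.

```python
import math


def toeplitz_from_autocorrelation(c, size):
    """Symmetric banded Toeplitz matrix with entries T[i][j] = c[|i - j|]."""
    n = len(c) - 1
    return [
        [c[abs(i - j)] if abs(i - j) <= n else 0.0 for j in range(size)]
        for i in range(size)
    ]


def cholesky_lower(T):
    """Plain Cholesky factorization T = L L^T of a real symmetric positive definite T."""
    size = len(T)
    L = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            s = T[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def spectral_factor(c, size=60):
    """Approximate p with sum_k p[k] * p[k + m] = c[m], read off the last row of L."""
    n = len(c) - 1
    L = cholesky_lower(toeplitz_from_autocorrelation(c, size))
    last_row = L[size - 1]
    return [last_row[size - 1 - k] for k in range(n + 1)]


def autocorrelation(p):
    """Coefficients c[m] = sum_k p[k] * p[k + m] of |p(e^{it})|^2 for real p."""
    n = len(p) - 1
    return [sum(p[k] * p[k + m] for k in range(n + 1 - m)) for m in range(n + 1)]


# |2 + e^{it}|^2 = 5 + 4 cos t, so c = [5, 2].
c = autocorrelation([2.0, 1.0])
p = spectral_factor(c)
print(p)  # approximately [2.0, 1.0]
```

Both 2 + z and 1 + 2z have the same squared magnitude on the unit circle, and the sketch returns the first one, whose zero lies outside the closed unit disc. Written in powers of z^{-1}, as is common in signal processing, this is the factor with its zero inside the unit disc, that is, the minimum-phase one. For a rational response the same step would be applied separately to the numerator and the denominator.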
