
The Mathematics of Entanglement

Summer School at Universidad de los Andes

Fernando G. S. L. Brandao, Matthias Christandl, Aram W. Harrow and Michael Walter

27-31 May, 2013

Contents

Lecture 1 - Quantum States
  1.1 Probability theory and Tensor products
  1.2 Quantum Mechanics
    1.2.1 Measurements
  1.3 Mixed states
  1.4 Composite systems and Entanglement
    1.4.1 Partial trace

Lecture 2 - Quantum Operations
  2.1 Measurements, POVMs
  2.2 Unitary Dynamics
  2.3 General Time Evolutions

Lecture 3 - Quantum Entropy
  3.1 Shannon Entropy
  3.2 Typical
  3.3 Quantum compression

Lecture 4 - Teleportation and entanglement transformations
  4.1 Teleportation
  4.2 LOCC entanglement manipulation
  4.3 Distinguishing quantum states
  4.4 Entanglement dilution

Lecture 5 - Introduction to the Quantum Marginal Problem
  5.1 The Quantum Marginal Problem or Quantum Representability Problem
    5.1.1 Physical Motivation
  5.2 Quantum Marginals for 3 Parties
    5.2.1 Warm-Up: 2 Parties
    5.2.2 3 Parties of Qubits

Lecture 6 - Monogamy of Entanglement
  6.1 Symmetric Subspace
  6.2 Application to Estimation

Lecture 7 - Separable states, PPT and Bell inequalities
  7.1 Mixed-state entanglement
  7.2 The PPT test
    7.2.1 Bound entanglement
  7.3 Entanglement witnesses
  7.4 CHSH game

Lecture 8 - Exact Entanglement Transformations
  8.1 Three qubit subsystems, part 2
  8.2 Exact Entanglement Transformation
    8.2.1 Quantum Instrument
    8.2.2 LOCC as a Quantum Operation
    8.2.3 SLOCC: Stochastic LOCC

Lecture 9 - Quantum de Finetti theorem
  9.1 de Finetti
  9.2 Quantum Key Distribution

Lecture 10 - Computational complexity of entanglement
  10.1 More on CHSH games
  10.2 Computational complexity

Lecture 11 - Three Qubit Entanglement Polytopes
  11.1 Quantum Marginal Problem for Three Parties

Lecture 12 - High dimensional entanglement

Lecture 13 - LOCC distinguishability
  13.1 Data Hiding
  13.2 Better de Finetti theorems for 1-LOCC measurements

Lecture 14 - Representation Theory and Spectrum Estimation
  14.1 Representation Theory
    14.1.1 Schur-Weyl Duality
    14.1.2 Computing m_j
  14.2 Spectrum Estimation
    14.2.1 Application Keyl-Werner Relation

Lecture 15 - Proof of the 1-LOCC quantum de Finetti theorem
  15.1 Introduction
  15.2 Entropy

The Mathematics of Entanglement - Summer 2013 27 May, 2013

Quantum States

Lecturer: Fernando G.S.L. Brandao Lecture 1

Entanglement is a quantum mechanical form of correlation that appears in many areas of physics, such as condensed matter physics and quantum chemistry. This week we will discuss a perspective from quantum information, which means we will abstract away the underlying physics and make statements about entanglement that apply independently of the underlying physical system. This will also allow us to discuss information-processing applications, such as quantum cryptography.

1.1 Probability theory and Tensor products

Before discussing quantum states, we explain some aspects of probability theory, which turns out to have many similar features.

Suppose we have a system with $d$ possible states, for some integer $d$, which we label by $1, \dots, d$. Thus a deterministic state is simply an element of the set $\{1, \dots, d\}$. The probabilistic states are probability distributions over this set, i.e. vectors in $\mathbb{R}^d_+$ whose entries sum to 1. The notation $\mathbb{R}^d_+$ means that the entries are nonnegative. Thus, a probability distribution $p = (p(1), \dots, p(d))$ satisfies $\sum_{x=1}^d p(x) = 1$ and $p(x) \ge 0$ for each $x$. Note that we can think of a deterministic state $x \in \{1, \dots, d\}$ as the probability distribution where $p(x) = 1$ and all other probabilities are zero.

Composition. Suppose we are given $p \in \mathbb{R}^m_+$ and $q \in \mathbb{R}^n_+$ which correspond to independent probability distributions. Their joint distribution is given by the vector

\[
p \otimes q := \begin{pmatrix} p(1)q(1) \\ p(1)q(2) \\ \vdots \\ p(1)q(n) \\ \vdots \\ p(m)q(n) \end{pmatrix}.
\]

We have introduced the notation $\otimes$ to denote the tensor product, which in general maps a pair of vectors with dimensions $m, n$ to a single vector with dimension $mn$. Later we will also consider the tensor product of matrices. If $M_n$ denotes $n \times n$ matrices, and we have $A \in M_n$, $B \in M_m$, then $A \otimes B \in M_{nm}$ is the matrix whose entries are all possible products of an entry of $A$ and an entry of $B$. For example, if $n = 2$ and

\[
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix},
\]

then $A \otimes B$ is the block matrix

\[
A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B \\ a_{21}B & a_{22}B \end{pmatrix}.
\]

One useful fact about tensor products, which simplifies many calculations, is that

(A⊗B)(C ⊗D) = AC ⊗BD.

We also define the tensor product $V \otimes W$ of two vector spaces $V$ and $W$ to be the span of all $v \otimes w$ for $v \in V$ and $w \in W$. In particular, observe that $\mathbb{C}^m \otimes \mathbb{C}^n \cong \mathbb{C}^{mn}$.
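The tensor product constructions above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the distributions `p`, `q` and the random matrices are hypothetical example values, and `np.kron` is used as the tensor (Kronecker) product for both vectors and matrices.

```python
import numpy as np

# Two independent distributions (hypothetical example values).
p = np.array([0.5, 0.5])
q = np.array([0.2, 0.3, 0.5])

# Joint distribution p ⊗ q: entries p(x) q(y), ordered as in the text.
joint = np.kron(p, q)

# Check (A ⊗ B)(C ⊗ D) = AC ⊗ BD on random matrices.
rng = np.random.default_rng(0)
A, C = rng.normal(size=(2, 2, 2))   # two 2x2 matrices
B, D = rng.normal(size=(2, 3, 3))   # two 3x3 matrices
lhs = np.kron(A, B) @ np.kron(C, D)
rhs = np.kron(A @ C, B @ D)
mixed_product_holds = np.allclose(lhs, rhs)
```

Here `joint` has $mn = 6$ entries that again sum to one, and `mixed_product_holds` records whether the identity holds up to floating-point error.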


1.2 Quantum Mechanics

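We will use Dirac notation, in which a “ket” $|\psi\rangle$ denotes a column vector in a complex vector space, i.e.

\[
|\psi\rangle = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \vdots \\ \psi_d \end{pmatrix} \in \mathbb{C}^d.
\]

The “bra” $\langle\psi|$ denotes the conjugate transpose, i.e.

\[
\langle\psi| = \begin{pmatrix} \psi_1^* & \psi_2^* & \cdots & \psi_d^* \end{pmatrix}.
\]

Combining a bra and a ket gives a “bra[c]ket”, meaning an inner product

\[
\langle\varphi|\psi\rangle = \sum_{i=1}^d \varphi_i^* \psi_i.
\]

In this notation the norm is

\[
\|\psi\|_2 = \sqrt{\langle\psi|\psi\rangle} = \sqrt{\sum_{i=1}^d |\psi_i|^2}.
\]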

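Now we can define a quantum state. The quantum analogue of a system with $d$ states is the $d$-dimensional Hilbert space $\mathbb{C}^d$. For example, a quantum system with $d = 2$ is called a qubit. Unit vectors $|\psi\rangle \in \mathbb{C}^d$, where $\langle\psi|\psi\rangle = 1$, are called pure states. They are the analogue of deterministic states in classical probability theory. For example, we might define the following pure states of a qubit:

\[
|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
|1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad
|+\rangle = \frac{1}{\sqrt 2}(|0\rangle + |1\rangle), \quad
|-\rangle = \frac{1}{\sqrt 2}(|0\rangle - |1\rangle).
\]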

Note that both pairs |0〉, |1〉 and |+〉, |−〉 form orthonormal bases of a qubit.

1.2.1 Measurements

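A projective measurement is a collection of projectors $\{P_k\}$ such that $P_k \in M_d$ for each $k$, $P_k^\dagger = P_k$, $P_k P_{k'} = \delta_{k,k'} P_k$ and $\sum_k P_k = I$. For example, we might measure in the computational basis, which consists of the unit vectors $|k\rangle$ with a one in the $k$th position and zeros elsewhere. Thus define

\[
P_k = |k\rangle\langle k| = \begin{pmatrix} 0 & & & & \\ & \ddots & & & \\ & & 1 & & \\ & & & \ddots & \\ & & & & 0 \end{pmatrix},
\]

with the single 1 in the $k$th diagonal position, which is the projector onto the one-dimensional subspace spanned by $|k\rangle$.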


Born’s rule states that Pr[k], the probability of measurement outcome k, is given by

Pr[k] = 〈ψ|Pk|ψ〉.

As an exercise, verify that this is equal to $\operatorname{tr}(P_k|\psi\rangle\langle\psi|)$. In our example, this is simply $|\psi_k|^2$.

Example. If we perform the measurement $\{|0\rangle\langle 0|, |1\rangle\langle 1|\}$ on $|+\rangle$, then $\Pr[0] = \Pr[1] = 1/2$. If we perform the measurement $\{|+\rangle\langle +|, |-\rangle\langle -|\}$, then $\Pr[+] = 1$ and $\Pr[-] = 0$.
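The example can be reproduced with a few lines of linear algebra. This is a small sketch, assuming NumPy; the helper names `projector` and `born_probabilities` are illustrative and not part of the notes.

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

def projector(v):
    """Rank-one projector |v><v| onto a unit vector v."""
    return np.outer(v, v.conj())

def born_probabilities(psi, projectors):
    """Pr[k] = <psi| P_k |psi> for each projector P_k."""
    return [float(np.real(psi.conj() @ P @ psi)) for P in projectors]

# Measure |+> in the computational basis and in the |±> basis.
probs_computational = born_probabilities(plus, [projector(ket0), projector(ket1)])
probs_plus_minus = born_probabilities(plus, [projector(plus), projector(minus)])
```

The first list reproduces the uniform statistics of the computational-basis measurement, the second the deterministic outcome of the $|\pm\rangle$ measurement.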

1.3 Mixed states

Mixed states are a common generalization of probability theory and pure quantum mechanics. In general, if we have an ensemble of pure quantum states $|\psi_x\rangle$ with probabilities $p(x)$, then we define the density matrix to be

\[
\rho = \sum_x p(x) |\psi_x\rangle\langle\psi_x|.
\]

The vectors $|\psi_x\rangle$ do not have to be orthogonal.

Note that $\rho$ is always Hermitian, meaning $\rho = \rho^\dagger$. Here $\dagger$ denotes the conjugate transpose, so that $(A^\dagger)_{i,j} = \overline{A_{j,i}}$. In fact, $\rho$ is positive semi-definite (“PSD”). This is also denoted $\rho \ge 0$. Two equivalent definitions (assuming that $\rho = \rho^\dagger$) are:

1. For all |ψ〉, 〈ψ|ρ|ψ〉 ≥ 0.

2. All the eigenvalues of $\rho$ are nonnegative. That is,

\[
\rho = \sum_i \lambda_i |\varphi_i\rangle\langle\varphi_i| \tag{1.1}
\]

for an orthonormal basis $\{|\varphi_1\rangle, \dots, |\varphi_d\rangle\}$ with each $\lambda_i \ge 0$.

Exercise: Prove that these definitions are equivalent.

A density matrix should also have trace one, since $\operatorname{tr}\rho = \sum_x p(x)\langle\psi_x|\psi_x\rangle = \sum_x p(x) = 1$.

Conversely, any PSD matrix with trace one can be written in the form $\sum_x p(x)|\psi_x\rangle\langle\psi_x|$ for some probability distribution $p$ and some unit vectors $\{|\psi_x\rangle\}$, and hence is a valid density matrix. This is just based on the eigenvalue decomposition: we can always take $p(x) = \lambda_x$ and $|\psi_x\rangle = |\varphi_x\rangle$ in (1.1).

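Note that this decomposition is not unique in general. For example, consider the maximally mixed state

\[
\rho = \frac{I}{2} = \begin{pmatrix} 1/2 & 0 \\ 0 & 1/2 \end{pmatrix}.
\]

This can be decomposed either as $\frac12|0\rangle\langle 0| + \frac12|1\rangle\langle 1|$ or as $\frac12|+\rangle\langle +| + \frac12|-\rangle\langle -|$, or indeed as $\frac12|u\rangle\langle u| + \frac12|v\rangle\langle v|$ for any orthonormal basis $\{|u\rangle, |v\rangle\}$.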
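As a quick numerical illustration of this non-uniqueness, the sketch below (assuming NumPy; the helper `mix` is a hypothetical name) builds the density matrix of two different ensembles and confirms that both give $I/2$.

```python
import numpy as np

ket0, ket1 = np.eye(2)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

def mix(states, probs):
    """Density matrix sum_x p(x) |psi_x><psi_x| of an ensemble."""
    return sum(p * np.outer(v, v.conj()) for p, v in zip(probs, states))

rho_computational = mix([ket0, ket1], [0.5, 0.5])
rho_plus_minus = mix([plus, minus], [0.5, 0.5])

# Both ensembles describe the same state; also check trace one and PSD.
same_state = np.allclose(rho_computational, rho_plus_minus)
trace = np.trace(rho_plus_minus)
eigenvalues = np.linalg.eigvalsh(rho_plus_minus)
```

Since all measurement statistics depend only on $\rho$, no measurement can tell the two ensembles apart.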

Born’s rule. For pure states, if we measure $\{P_k\}$ then the probability of outcome $k$ is $\operatorname{tr}(P_k|\psi\rangle\langle\psi|)$. For mixed states $\rho$ this becomes $\Pr[k] = \operatorname{tr}(\rho P_k)$ by linearity.

1.4 Composite systems and Entanglement

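We will always work with distinguishable particles. A pure state of two quantum systems is given by a unit vector $|\eta\rangle$ in the tensor product Hilbert space $\mathbb{C}^n \otimes \mathbb{C}^m \cong \mathbb{C}^{nm}$.

For example, if particle $A$ is in the pure state $|\psi\rangle_A$ and particle $B$ is in the pure state $|\varphi\rangle_B$, then their joint state is $|\eta\rangle_{AB} = |\psi\rangle_A \otimes |\varphi\rangle_B$. If $|\psi\rangle_A \in \mathbb{C}^n$ and $|\varphi\rangle_B \in \mathbb{C}^m$, then we will have $|\eta\rangle_{AB} \in \mathbb{C}^{nm}$.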


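This should have the property that if we measure one system, say $A$, then we should obtain the same result in this new formalism that we would have had if we treated the states separately. If we perform the projective measurement $\{P_k\}$ on system $A$ then this is equivalent to performing the measurement $\{P_k \otimes I\}$ on the joint system. We can then calculate

\[
\begin{aligned}
\Pr[k] &= \langle\eta|(P_k \otimes I)|\eta\rangle \\
&= {}_A\langle\psi|\,{}_B\langle\varphi|\,(P_k \otimes I)\,|\psi\rangle_A|\varphi\rangle_B \\
&= \langle\psi|P_k|\psi\rangle\,\langle\varphi|\varphi\rangle \\
&= \langle\psi|P_k|\psi\rangle.
\end{aligned}
\]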

Just as there are joint distributions for which two random variables are not independent (e.g., ...), there are quantum states which cannot be written as a tensor product $|\psi\rangle \otimes |\varphi\rangle$ for any choice of $|\psi\rangle, |\varphi\rangle$. In quantum mechanics, this situation can even occur for pure states of the system. For example, consider the EPR pair (we will see the reason for the name later):

\[
|\Phi^+\rangle = \frac{|0\rangle \otimes |0\rangle + |1\rangle \otimes |1\rangle}{\sqrt 2}.
\]

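We say that a pure state $|\eta\rangle$ is entangled if, for any $|\psi\rangle, |\varphi\rangle$, we have $|\eta\rangle \neq |\psi\rangle \otimes |\varphi\rangle$.

Entangled states have many counterintuitive properties. For example, suppose we measure the state $|\Phi^+\rangle$ using the projectors $\{P_{j,k} = |j\rangle\langle j| \otimes |k\rangle\langle k|\}$. Then we can calculate

\[
\begin{aligned}
\Pr[(0,0)] &= \langle\Phi^+|P_{0,0}|\Phi^+\rangle = \langle\Phi^+|\,\big(|0\rangle\langle 0| \otimes |0\rangle\langle 0|\big)\,|\Phi^+\rangle = \frac12, \\
\Pr[(1,1)] &= \frac12, \\
\Pr[(0,1)] &= 0, \\
\Pr[(1,0)] &= 0.
\end{aligned}
\]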

The outcomes are perfectly correlated.

However, observe that if we measure in a different basis, we will also get perfect correlation. Consider the measurement with outcomes

\[
\{|{+}{+}\rangle\langle{+}{+}|,\ |{+}{-}\rangle\langle{+}{-}|,\ |{-}{+}\rangle\langle{-}{+}|,\ |{-}{-}\rangle\langle{-}{-}|\},
\]

where we have used the shorthand $|{+}{+}\rangle := |+\rangle \otimes |+\rangle$, and similarly for the other three. Then one can calculate (and doing so is a good exercise) that, given the state $|\Phi^+\rangle$, we have

\[
\Pr[(+,+)] = \Pr[(-,-)] = \frac12,
\]

meaning again there is perfect correlation.
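The exercise can also be checked numerically. The sketch below (assuming NumPy; `outcome_probabilities` is an illustrative helper name) computes the outcome probabilities of the EPR pair for product measurements in the computational basis and in the $|\pm\rangle$ basis.

```python
import numpy as np

ket0, ket1 = np.eye(2)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)
phi_plus = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)

def outcome_probabilities(state, basis):
    """Pr[(j, k)] = |<b_j ⊗ b_k | state>|^2 for a product basis."""
    return {
        (j, k): float(abs(np.vdot(np.kron(bj, bk), state)) ** 2)
        for j, bj in basis.items()
        for k, bk in basis.items()
    }

probs_z = outcome_probabilities(phi_plus, {"0": ket0, "1": ket1})
probs_x = outcome_probabilities(phi_plus, {"+": plus, "-": minus})
```

In both dictionaries only the outcomes with equal labels carry probability, each with weight $1/2$.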

1.4.1 Partial trace

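Suppose we have $\rho_{AB} \in D(\mathbb{C}^n \otimes \mathbb{C}^m)$, where $D(\cdot)$ denotes the set of density matrices. We would like a quantum analogue of the notion of a marginal distribution in probability theory.

Define the reduced state of $A$ to be

\[
\rho_A := \operatorname{tr}_B \rho_{AB}, \qquad
\operatorname{tr}_B(\rho_{AB}) = \sum_k (I \otimes \langle k|)\,\rho_{AB}\,(I \otimes |k\rangle),
\]

where $\{|k\rangle\}$ is any orthonormal basis on $B$. The operation $\operatorname{tr}_B$ is called a partial trace.

We observe that if we perform a measurement $\{P_j\}$ on $A$, then we have

\[
\Pr[j] = \operatorname{tr}\big((P_j \otimes I)\rho_{AB}\big) = \operatorname{tr}\big(P_j \operatorname{tr}_B(\rho_{AB})\big) = \operatorname{tr}(P_j \rho_A).
\]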

Thus the reduced state ρA perfectly reproduces the statistics of any measurement on the system A.
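A partial trace is easy to implement by viewing the density matrix as a four-index array. The following is a sketch under that assumption, using NumPy; the function name `partial_trace_B` is illustrative. It computes the reduced state of the EPR pair and compares the statistics of a measurement on $A$ computed from $\rho_{AB}$ and from $\rho_A$.

```python
import numpy as np

def partial_trace_B(rho_AB, dim_A, dim_B):
    """Trace out the second factor of a density matrix on C^dA ⊗ C^dB."""
    # Indices (a, b, a', b'); sum over b = b'.
    reshaped = rho_AB.reshape(dim_A, dim_B, dim_A, dim_B)
    return np.einsum("ikjk->ij", reshaped)

phi_plus = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_AB = np.outer(phi_plus, phi_plus.conj())
rho_A = partial_trace_B(rho_AB, 2, 2)

# Probability of outcome 0 on A, computed globally and from rho_A.
P0 = np.diag([1.0, 0.0])
pr_joint = float(np.trace(np.kron(P0, np.eye(2)) @ rho_AB).real)
pr_reduced = float(np.trace(P0 @ rho_A).real)
```

For the EPR pair the reduced state is maximally mixed, even though the global state is pure.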



Quantum Operations

Lecturer: Matthias Christandl Lecture 2

In this lecture we will talk about dynamics in quantum mechanics. We start again from measurements, and then move on to unitary evolutions and general quantum dynamical processes.

2.1 Measurements, POVMs

FIXME: Insert three figures from Matthias slides.

Consider a quantum measurement as a box, applied to a mixed quantum state $\rho$, with possible outcomes labelled by $i$. In the previous lecture, we considered projective measurements given by orthogonal projectors $\{P_i\}$, with Born’s rule $\Pr(i) = \operatorname{tr}(P_i\rho)$.

Another common way to think about these is the following: we can associate to any projective measurement an observable $A$ with eigendecomposition $A = \sum_i a_i P_i$, where we think of the $a_i$ as the values that the observable attains for each outcome (e.g., the value the measurement device displays, the position of a pointer, ...). Then the expectation value of $A$ in the state $\rho$ is $\operatorname{tr}(A\rho) = \sum_i a_i \operatorname{tr}(P_i\rho)$.

But is this the most general measurement allowed in quantum mechanics? It turns out that this is not the case. Suppose we have a quantum state $\rho_A$ on $\mathbb{C}^d$ and we consider the joint state $\rho_A \otimes |0\rangle\langle 0|_B$, with $|0\rangle_B \in \mathbb{C}^{d'}$ the state of an ancillary particle. Let us perform a projective measurement $\{P_i\}$ on the joint system (i.e., the $P_i$ are orthogonal projectors on $\mathbb{C}^d \otimes \mathbb{C}^{d'} = \mathbb{C}^{dd'}$). Then the probability of measuring $i$ is

Pr(i) = tr (Pi (ρA ⊗ |0〉〈0|B)) .

Using the partial trace, we can rewrite this as follows

Pr(i) = trA (trB (Pi (ρA ⊗ |0〉〈0|B)))

= trA (〈0|BPi|0〉B ρA)

= tr(QiρA),

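where $Q_i := \langle 0|_B P_i |0\rangle_B$. Thus the operators $(Q_i)$ allow us to describe the measurement statistics without having to consider the state of the ancillary system. What are the properties of $Q_i$? First, it is PSD:

\[
\langle\phi|Q_i|\phi\rangle = \langle\phi|_A\langle 0|_B\, P_i\, |0\rangle_B|\phi\rangle_A \ge 0,
\]

since $P_i \ge 0$. Second, the $Q_i$ sum up to the identity:

\[
\sum_i Q_i = \sum_i \langle 0|_B P_i |0\rangle_B = \langle 0|_B \Big(\sum_i P_i\Big) |0\rangle_B = \langle 0|_B\, I\, |0\rangle_B = I_A.
\]

The converse of the above is also true: whenever we are given a set of PSD matrices $Q_i \ge 0$ with $\sum_i Q_i = I$, we can always find a projective measurement $\{P_i\}$ on a larger system $A \otimes B$ such that

\[
\operatorname{tr}(Q_i\rho_A) = \operatorname{tr}\big(P_i(\rho_A \otimes |0\rangle\langle 0|_B)\big).
\]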


The generalized quantum measurements we obtain in this way are called positive operator-valued measure(ment)s (POVMs). Note that since the $Q_i$ are not necessarily orthogonal projections, there is no upper bound on the number of elements in a POVM.

Example 1. Consider two projective measurements, e.g. $\{|0\rangle\langle 0|, |1\rangle\langle 1|\}$ and $\{|+\rangle\langle +|, |-\rangle\langle -|\}$. Then we can define a POVM as a mixture of these two:

\[
Q_0 = \frac12|0\rangle\langle 0|, \quad
Q_1 = \frac12|1\rangle\langle 1|, \quad
Q_2 = \frac12|+\rangle\langle +|, \quad
Q_3 = \frac12|-\rangle\langle -|.
\]

It is clear that $\sum_k Q_k = I$. One way of thinking about this POVM is that with probability 1/2 we measure in the computational basis, and with probability 1/2 in the $|\pm\rangle$ basis.

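Example 2. The quantum state $\rho$ of a qubit can always be written in the form

\[
\rho = \rho(\vec r) = \frac12\left(I + r_x\sigma_x + r_y\sigma_y + r_z\sigma_z\right),
\]

with the Pauli matrices

\[
\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}.
\]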

Note that the Pauli matrices are traceless, so that the state indeed has trace one (it is normalized). We can then describe the state by a 3-dimensional vector $\vec r = (r_x, r_y, r_z) \in \mathbb{R}^3$. It turns out that $\rho$ is PSD if, and only if, $\|\vec r\|_2 \le 1$. Therefore every quantum state of a qubit corresponds to a point in the 3-dimensional unit ball, called the Bloch sphere (or Bloch ball). A state $\rho$ is pure if, and only if, $\|\vec r\|_2 = 1$.

Let us consider a collection of four pure states $\{|a_i\rangle\langle a_i|\}_{i=1}^4$ that form a tetrahedron on the Bloch sphere. Then, by symmetry of the tetrahedron, the Bloch vectors sum to zero and $\sum_i |a_i\rangle\langle a_i| = 2I$, so the operators $Q_i = \frac12|a_i\rangle\langle a_i|$ indeed form a POVM.

FIXME: Insert tetrahedron figure.
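Both POVM examples can be verified numerically. The sketch below assumes NumPy; the helper names `bloch_state` and `is_povm` are illustrative, and the tetrahedron vertices are one standard choice, not necessarily the one intended in the lecture. Note that four rank-one projectors have total trace 4, so they sum to $2I$; the POVM elements therefore carry a factor $1/2$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def bloch_state(r):
    """rho(r) = (I + r . sigma) / 2 for a Bloch vector r."""
    return (I2 + r[0] * sx + r[1] * sy + r[2] * sz) / 2

def is_povm(elements, tol=1e-12):
    """Check that all elements are PSD and that they sum to the identity."""
    psd = all(np.linalg.eigvalsh(Q).min() >= -tol for Q in elements)
    complete = np.allclose(sum(elements), I2)
    return psd and complete

# Example 1: equal mixture of the computational and |±> measurements.
ket0, ket1 = np.eye(2)
plus, minus = (ket0 + ket1) / np.sqrt(2), (ket0 - ket1) / np.sqrt(2)
mixture = [0.5 * np.outer(v, v.conj()) for v in (ket0, ket1, plus, minus)]

# Tetrahedron: four unit Bloch vectors summing to zero (assumed choice).
vertices = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]) / np.sqrt(3)
tetra_projectors = [bloch_state(r) for r in vertices]
tetra_povm = [P / 2 for P in tetra_projectors]
```

Calling `is_povm(mixture)` and `is_povm(tetra_povm)` checks the two constructions, while the unnormalized projectors `tetra_projectors` fail the completeness condition.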

2.2 Unitary Dynamics

Let $|\psi\rangle$ be a quantum state and consider its time evolution according to the Schrödinger equation for a time-independent Hamiltonian $H$. Then the state after some time $t$ is given by

\[
|\psi_t\rangle = e^{-iHt}|\psi\rangle,
\]

where we have set $\hbar = 1$. The matrix $U = e^{-iHt}$ describing the evolution of the system is a unitary matrix, i.e. $UU^\dagger = U^\dagger U = I$.

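Example 1. $U_t = e^{it\,\vec e\cdot\vec\sigma/2}$ with $\vec e \in \mathbb{R}^3$ a unit vector and $\vec\sigma = (\sigma_x, \sigma_y, \sigma_z)$ the vector of Pauli matrices. We have

\[
U_t\,\rho(\vec r)\,U_t^\dagger = \rho(R_t\vec r),
\]

where $R_t$ rotates by an angle $t$ around the axis $\vec e$.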


Example 2. The Hadamard unitary is given by

\[
H = \frac{1}{\sqrt 2}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.
\]

Its action on the computational basis vectors is

\[
H|0\rangle = |+\rangle, \qquad H|1\rangle = |-\rangle.
\]

2.3 General Time Evolutions

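There are more general possible dynamics in quantum mechanics than unitary evolution. One possibility is that we add an ancilla state $|0\rangle\langle 0|_B$ to $\rho_A$ and consider a unitary dynamics $U_{AB} : A \otimes B \to A' \otimes B'$ on the joint state. The state of the $A'B'$ system is

\[
U\big(\rho_A \otimes |0\rangle\langle 0|_B\big)U^\dagger. \tag{2.1}
\]

Suppose now we are only interested in the final state of the subsystem $A'$. Then

\[
\rho_{A'} = \operatorname{tr}_{B'}\Big(U\big(\rho_A \otimes |0\rangle\langle 0|_B\big)U^\dagger\Big), \tag{2.2}
\]

where we traced out subsystem $B'$. We can associate a map $\Lambda$ to this evolution as

\[
\Lambda(\rho_A) = \rho_{A'} = \operatorname{tr}_{B'}\Big(U\big(\rho_A \otimes |0\rangle\langle 0|_B\big)U^\dagger\Big). \tag{2.3}
\]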

What are the properties of $\Lambda$? First, it maps PSD matrices to PSD matrices. We call this property positivity. Second, it preserves the trace. We say the map is trace preserving.

Another, more interesting property is that even the map $\Lambda \otimes \mathrm{Id}$, where $\mathrm{Id}$ is the identity map on an auxiliary space of arbitrary dimension, is positive. We call this property complete positivity.

An important theorem (sometimes called Stinespring dilation) is that the converse also holds: any $\Lambda$ which is completely positive and trace preserving can be written as

\[
\Lambda(X) = \operatorname{tr}_{B'}\Big(U\big(X \otimes |0\rangle\langle 0|_B\big)U^\dagger\Big), \tag{2.4}
\]

with a unitary $U : A \otimes B \to A' \otimes B'$. Therefore quantum dynamics are in one-to-one correspondence with completely positive trace-preserving maps, also called quantum operations.

Example: An example of a quantum operation is the so-called depolarization map: $\Lambda(\rho) = (1 - p)\rho + pI/2$. With probability $1 - p$ the identity map is implemented, and with probability $p$ the state is destroyed and replaced by the maximally mixed one.
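One way to see that the depolarization map is completely positive and trace preserving is to write it with Kraus operators. The sketch below, assuming NumPy, uses the standard identity $\frac14(\rho + \sigma_x\rho\sigma_x + \sigma_y\rho\sigma_y + \sigma_z\rho\sigma_z) = I/2$ for trace-one $\rho$; the Kraus form is not stated in the notes and is brought in here only as an illustration, and the test state `rho` is a hypothetical example.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def depolarize(rho, p):
    """Depolarization map as defined in the text (rho has trace one)."""
    return (1 - p) * rho + p * I2 / 2

def depolarize_kraus(rho, p):
    """Same map written as sum_k K_k rho K_k^dagger."""
    kraus = [np.sqrt(1 - 3 * p / 4) * I2] + [np.sqrt(p / 4) * s for s in (X, Y, Z)]
    # Trace preservation corresponds to sum_k K_k^dagger K_k = I.
    assert np.allclose(sum(K.conj().T @ K for K in kraus), I2)
    return sum(K @ rho @ K.conj().T for K in kraus)

rho = np.array([[0.75, 0.25j], [-0.25j, 0.25]])  # a valid qubit state
p = 0.3
out_direct = depolarize(rho, p)
out_kraus = depolarize_kraus(rho, p)
```

The two outputs agree, and any map of the Kraus form is completely positive, which is consistent with the theorem above.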



Quantum Entropy

Lecturer: Aram Harrow Lecture 3

3.1 Shannon Entropy

In this part, we want to understand quantum information in a quantitative way. One of the important concepts is entropy. But let us first look at classical entropy.

Suppose we are given a probability distribution $p \in \mathbb{R}^d_+$, $\sum_i p_i = 1$. The Shannon entropy of $p$ is

\[
H(p) = -\sum_i p_i \log p_i
\]

(log is always to base two, as we are talking about bits, and we use the convention $\lim_{x\to 0} x\log x = 0$). Entropy quantifies uncertainty. We have maximal certainty for a deterministic distribution, e.g. $p = (1, 0, \dots, 0)$, for which $H(p) = 0$. The distribution with maximal uncertainty is $p = (\frac1d, \dots, \frac1d)$, for which $H(p) = \log d$.

In the following we want to give Shannon entropy an operational meaning with the help of the problem of data compression. For this, imagine you have a binary alphabet ($d = 2$) and you sample $n$ times, independently and identically distributed (i.i.d.), from the distribution $p = (\pi, 1 - \pi)$; we write $X_1, \dots, X_n \in_{\text{i.i.d.}} \{0, 1\}$, with $\operatorname{Prob}[X_i = 0] = \pi$ and $\operatorname{Prob}[X_i = 1] = 1 - \pi$.

Typically, the number of 0’s in the string is $n\pi \pm O(\sqrt n)$ and the number of 1’s is $n(1-\pi) \pm O(\sqrt n)$. In order to see why this is the case, consider the sum $S = X_1 + \dots + X_n$ (this equals the number of 1’s in the string). The expectation of this random variable is $E[S] = E[X_1] + \dots + E[X_n] = n(1 - \pi)$, where we used the linearity of the expectation value. Furthermore the variance of $S$ is

\[
\operatorname{Var}[S] = \operatorname{Var}[X_1] + \dots + \operatorname{Var}[X_n] = n\operatorname{Var}[X_1] = n\pi(1-\pi) \le \frac n4.
\]

Here we used the independence of the random variables in the first equality and $\operatorname{Var}[X_1] = E[X_1^2] - E[X_1]^2 = (1-\pi) - (1-\pi)^2 = \pi(1-\pi)$ in the third. This implies that the standard deviation of $S$ is at most $\frac{\sqrt n}{2}$.

What does this have to do with compression? The total number of possible $n$-bit strings is $|\{0,1\}^n| = 2^n$. The number of strings with $n\pi$ 0’s is

\[
\binom{n}{\pi n} = \frac{n!}{(\pi n)!\,((1-\pi)n)!} \approx \frac{(n/e)^n}{(\pi n/e)^{\pi n}\,((1-\pi)n/e)^{(1-\pi)n}} = (1/\pi)^{n\pi}\,(1/(1-\pi))^{n(1-\pi)},
\]

where we used Stirling’s approximation. We can rewrite this as $\exp\big(n(\pi\log\frac1\pi + (1-\pi)\log\frac{1}{1-\pi})\big) = \exp(nH(p))$, where $\exp(x)$ stands for $2^x$ since our logarithms are to base two. Hence, we only need to store around $\exp(nH(p))$ possible strings, which we can do in a memory having $nH(p)$ bits. (Note we ignored the fluctuations. If we took them into account, we would only need an additional $O(\sqrt n)$ bits.) This analysis easily generalises to arbitrary alphabets (not only binary).
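The counting argument can be checked against exact binomial coefficients. The following sketch uses only the Python standard library; the function names and the choice $\pi = 0.25$ are illustrative. It compares $\log_2\binom{n}{\pi n}$ with the estimate $nH(p)$.

```python
from math import comb, log2

def shannon_entropy(p):
    """H(p) = -sum_i p_i log2 p_i, with the convention 0 log 0 = 0."""
    return -sum(x * log2(x) for x in p if x > 0)

def compare_counts(n, pi):
    """Return (log2 of #strings with round(pi*n) zeros, n*H(p))."""
    zeros = round(pi * n)
    return log2(comb(n, zeros)), n * shannon_entropy([pi, 1 - pi])

for n in (100, 1000, 10000):
    exact, estimate = compare_counts(n, 0.25)
    print(n, round(exact, 1), round(estimate, 1))
```

The exact count stays below $2^{nH(p)}$, and the gap between the two exponents grows only logarithmically in $n$, which is the correction dropped by the leading-order Stirling approximation.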

3.2 Typical

I now want to give you a different way of looking at this problem, a way that is both more rigorous and will more easily generalise to the quantum case. This we will do with the help of typical sets.

Again let $X_1, \dots, X_n$ be i.i.d. with distribution $p$. The probability of a string is then given by

\[
\operatorname{Prob}[X_1 \dots X_n = x_1 \dots x_n] = p(x_1)p(x_2)\cdots p(x_n) = p^{\otimes n}(x^n),
\]

where we used the notation (random variables are in capital letters and values in small letters)

\[
p^{\otimes n} = p \otimes p \otimes \cdots \otimes p \ (n \text{ times}), \qquad x^n = (x_1, \dots, x_n) \in \Sigma^n,
\]

where $\Sigma = \{1, \dots, d\}$ is the alphabet and $\Sigma^n$ denotes strings of length $n$ over that alphabet.

Note that

\[
\log p^{\otimes n}(X^n) = \sum_{i=1}^n \log p(X_i) \approx nE[\log p(X_i)] \pm \sqrt n\sqrt{\operatorname{Var}[\log p(X_i)]} = -nH(p) \pm O(\sqrt n),
\]

where we used

\[
E[\log p(X_i)] = \sum_x p(x)\log p(x) = -H(p).
\]

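Let us now define the typical set as the set of strings whose log-probability is within $n\delta$ of $-nH(p)$:

\[
T_{p,n,\delta} = \{x^n \in \Sigma^n : |\log p^{\otimes n}(x^n) + nH(p)| \le n\delta\}.
\]

Then

\[
\forall \delta > 0: \quad \lim_{n\to\infty} p^{\otimes n}(T_{p,n,\delta}) = 1.
\]

Our compression algorithm simply keeps all the strings in the typical set and throws away all others. Hence, all we need to know is the size of the typical set. This is easy. Note that

\[
x^n \in T_{p,n,\delta} \implies \exp(-nH(p) - n\delta) \le p^{\otimes n}(x^n) \le \exp(-nH(p) + n\delta).
\]

Note

\[
1 \ge p^{\otimes n}(T_{p,n,\delta}) \ge |T_{p,n,\delta}| \min p^{\otimes n}(x^n),
\]

where the minimum is over all strings in the typical set. This implies

\[
1 \ge |T_{p,n,\delta}| \exp(-nH(p) - n\delta),
\]

which is equivalent to

\[
\log |T_{p,n,\delta}| \le nH(p) + n\delta.
\]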
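For a binary alphabet the typical set can be enumerated exactly, because the probability of a string depends only on its number of zeros. The sketch below uses only the standard library; the function name and the parameter values in the usage note are illustrative.

```python
from math import comb, log2

def typical_set_stats(pi, n, delta):
    """Return (probability, log2 size, n*H) of T_{p,n,delta} for p = (pi, 1 - pi)."""
    H = -(pi * log2(pi) + (1 - pi) * log2(1 - pi))
    prob, size = 0.0, 0
    for zeros in range(n + 1):
        # log2-probability of any single string with this many zeros.
        log_p = zeros * log2(pi) + (n - zeros) * log2(1 - pi)
        if abs(log_p + n * H) <= n * delta:
            count = comb(n, zeros)
            size += count
            prob += 2.0 ** (log2(count) + log_p)
    return prob, log2(size), n * H

prob, log_size, nH = typical_set_stats(pi=0.25, n=1000, delta=0.05)
```

With these values the typical set already carries most of the probability, while `log_size` respects the bound $nH(p) + n\delta$ derived above.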

Exercise: Show this is optimal. More precisely, show that if we compress to $nR$ bits with $R < H(p)$, then the error does not go to zero. Hint: Use Chebyshev’s inequality: for a random variable $Z$, $\operatorname{Prob}[|Z - E[Z]| \ge k\,\mathrm{SD}[Z]] \le 1/k^2$. Possible simplifications: 1) pretend all strings are typical; 2) use exactly $nR$ bits.


3.3 Quantum compression

Probability distributions are replaced by density matrices $\rho^{\otimes n} = \rho \otimes \cdots \otimes \rho$ ($n$ times). If $\rho$ is a state of a qubit then this state lives on a $2^n$-dimensional space. The goal of quantum data compression is to represent this state on a smaller dimensional subspace. Just as before with bits, we now measure the size in terms of the number of qubits needed to represent that subspace, i.e. the log of the dimension.

It turns out to be possible (and optimal) to do this in $nS(\rho) \pm n\delta$ qubits, where $S$ is the von Neumann entropy

\[
S(\rho) = -\sum_i \lambda_i \log\lambda_i = H(\lambda) = -\operatorname{tr}\rho\log\rho,
\]

where the $\lambda_i$ are the eigenvalues of the density operator.
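The von Neumann entropy is straightforward to compute from the eigenvalues. This is a minimal sketch assuming NumPy; the cutoff `1e-12` for discarding numerically zero eigenvalues is an arbitrary choice.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -sum_i lambda_i log2 lambda_i over the nonzero eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(rho)
    eigenvalues = eigenvalues[eigenvalues > 1e-12]  # convention 0 log 0 = 0
    return float(-np.sum(eigenvalues * np.log2(eigenvalues)))

pure = np.array([[0.5, 0.5], [0.5, 0.5]])   # |+><+|
maximally_mixed = np.eye(2) / 2
biased = np.diag([0.25, 0.75])

entropies = [von_neumann_entropy(r) for r in (pure, maximally_mixed, biased)]
```

A pure state has zero entropy and cannot be compressed further in this sense, while the maximally mixed qubit has one full bit of entropy.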



Teleportation and entanglement transformations


Prologue: Post-measurement states

One loose thread from the previous lecture is to explain what happens to a quantum state after we measure it. Consider a projective measurement $\{P_k\}$. (We saw yesterday in Lecture 2 that in fact these can simulate even generalized measurements.) Recall that outcome $k$ occurs with probability $\Pr[k] = \operatorname{tr}(P_k\rho)$. Then if this measurement outcome occurs, we are left with the state

\[
\frac{P_k\rho P_k}{\operatorname{tr}(P_k\rho)}. \tag{4.1}
\]

Observe that this has the property that repeated measurements always produce the same answer (although the same is not necessarily true of generalized measurements).

For a pure state $|\psi\rangle$, the post-measurement state is

\[
\frac{P_k|\psi\rangle}{\|P_k|\psi\rangle\|}. \tag{4.2}
\]

Equivalently, we can write $P_k|\psi\rangle = \sqrt p\,|\varphi\rangle$, where $|\varphi\rangle$ is a unit vector representing the post-measurement state, and $p$ is the probability of that outcome.

4.1 Teleportation

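Suppose that Alice has a qubit $|\psi\rangle_{A'} = c_0|0\rangle + c_1|1\rangle$ that she would like to transmit to Bob. If they have access to a quantum channel, such as an optical fiber, she can of course simply give Bob the physical system $A'$ whose state is $|\psi\rangle$. This approach is referred to as quantum communication. However, if they have access to shared entanglement, then this communication can be replaced with classical communication (while using up the entanglement). This is called teleportation.

The procedure is as follows. Suppose Alice and Bob share the state

\[
|\Phi^+\rangle_{AB} = \frac{|0,0\rangle + |1,1\rangle}{\sqrt 2},
\]

and Alice wants to transmit $|\psi\rangle_{A'}$ to Bob. Then Alice first measures systems $A'A$ in the basis $\{|\Phi^+\rangle, |\Phi^-\rangle, |\psi^+\rangle, |\psi^-\rangle\}$, defined as

\[
|\Phi^\pm\rangle = \frac{|0,0\rangle \pm |1,1\rangle}{\sqrt 2}, \qquad
|\psi^\pm\rangle = \frac{|0,1\rangle \pm |1,0\rangle}{\sqrt 2}.
\]

For ease of notation, define $\{|\eta_0\rangle, |\eta_1\rangle, |\eta_2\rangle, |\eta_3\rangle\} := \{|\Phi^+\rangle, |\Phi^-\rangle, |\psi^+\rangle, |\psi^-\rangle\}$.

Outcome 0 corresponds to the unnormalized state

\[
\big(|\Phi^+\rangle\langle\Phi^+|_{A'A} \otimes I_B\big)\big(|\psi\rangle_{A'} \otimes |\Phi^+\rangle_{AB}\big) = \frac12|\Phi^+\rangle_{A'A} \otimes |\psi\rangle_B,
\]

meaning the outcome occurs with probability 1/4 and when it does, Bob gets $|\psi\rangle$.

One can show (calculation omitted) that outcome $i$ (for $i \in \{0, 1, 2, 3\}$) corresponds to

\[
\big(|\eta_i\rangle\langle\eta_i| \otimes I_B\big)\big(|\psi\rangle_{A'} \otimes |\Phi^+\rangle_{AB}\big) = \frac12|\eta_i\rangle_{A'A} \otimes \sigma_i|\psi\rangle_B,
\]

where $\{\sigma_0, \sigma_1, \sigma_2, \sigma_3\}$ denote the four Pauli matrices $\{I, \sigma_z, \sigma_x, \sigma_y\}$, listed in the order matching the outcomes (the identity holds up to an irrelevant global phase). The factor 1/2 means that each outcome occurs with probability 1/4. Thus, transmitting the outcome $i$ to Bob allows him to apply the correction $\sigma_i$ and recover the state $|\psi\rangle$.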
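The omitted calculation can be checked by simulating the protocol on state vectors. The sketch below assumes NumPy; the function `teleport` and the input state are illustrative. In the code the Bell outcomes are ordered $\Phi^+, \Phi^-, \psi^+, \psi^-$ and paired with the corrections $I, \sigma_z, \sigma_x, \sigma_y$, which is what a direct calculation gives up to a global phase.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)
ket = np.eye(2, dtype=complex)

def two(a, b):
    """Computational basis state |a, b> of two qubits."""
    return np.kron(ket[a], ket[b])

bell = [
    (two(0, 0) + two(1, 1)) / np.sqrt(2),  # Phi+
    (two(0, 0) - two(1, 1)) / np.sqrt(2),  # Phi-
    (two(0, 1) + two(1, 0)) / np.sqrt(2),  # psi+
    (two(0, 1) - two(1, 0)) / np.sqrt(2),  # psi-
]
corrections = [I2, Z, X, Y]

def teleport(psi):
    """Return [(probability, Bob's corrected state)] for the four outcomes."""
    total = np.kron(psi, bell[0])       # systems ordered A', A, B
    amplitudes = total.reshape(4, 2)    # rows: A'A, columns: B
    results = []
    for eta, sigma in zip(bell, corrections):
        bob = eta.conj() @ amplitudes   # unnormalized state after outcome eta
        prob = float(np.vdot(bob, bob).real)
        results.append((prob, sigma @ (bob / np.sqrt(prob))))
    return results

psi = np.array([0.6, 0.8j])
results = teleport(psi)
```

Each outcome has probability $1/4$, and after the correction Bob's state has unit overlap with the input, so it equals $|\psi\rangle$ up to a global phase.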

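This protocol has achieved the following transformation of resources:

1 “bit” of entanglement + 2 bits of classical communication ≥ 1 qubit of quantum communication.

As a sanity check, we should verify that entanglement alone cannot be used to communicate. To check this, note that the joint state after the measurement (averaged over the outcomes) is

\[
\rho_{A'AB} = \frac14\sum_{i=0}^3 |\eta_i\rangle\langle\eta_i|_{A'A} \otimes \sigma_i|\psi\rangle\langle\psi|\sigma_i.
\]

Bob’s state specifically is

\[
\rho_B = \frac14\sum_{i=0}^3 \sigma_i|\psi\rangle\langle\psi|\sigma_i = \frac I2,
\]

which is independent of $|\psi\rangle$.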

Teleporting entanglement. This protocol also works if applied to qubits that are entangled with other states. For example, Alice might locally prepare an entangled state $|\psi\rangle_{RA'}$ and then teleport qubit $A'$ to Bob. Then the state $|\psi\rangle$ will be shared between Alice’s system $R$ and Bob’s system $B$. Thus, teleportation can be used to create shared entanglement. Of course, it consumes entanglement at the same rate, so we are not getting anything for free here.

4.2 LOCC entanglement manipulation

Suppose that Alice and Bob can freely communicate classically and can manipulate quantum systems under their control, but are limited in their ability to communicate quantumly. This class of operations is called LOCC, meaning “local operations and classical communication”. It often makes sense to study entanglement in this setting, since LOCC can modify entanglement from one type to another, but cannot create it where it didn’t exist before. What types of entanglement manipulations are possible with LOCC?

One example is to map a pure state $|\psi\rangle_{AB}$ to $(U_A \otimes V_B)|\psi\rangle_{AB}$, for some choice of unitaries $U_A, V_B$.

A more complicated example is that Alice might measure her state with a projective measurement $\{P_k\}$ and transmit the outcome to Bob, who performs a unitary $U_k$ depending on the outcome. This is essentially the structure of teleportation. The resulting map is

\[
\rho_{AB} \mapsto \sum_k (P_k \otimes U_k)\,\rho_{AB}\,(P_k \otimes U_k^\dagger).
\]

One task for which we might like to use LOCC is to extract pure entangled states from a noisy state. For example, we might want to map $\rho_{AB}$ to $|\Phi^+\rangle\langle\Phi^+|$. This problem is in general called entanglement distillation since we are distilling pure entanglement out of noisy entanglement. However, we typically consider it with a few variations. First, as with many information-theoretic problems, we will consider asymptotic transformations in which we map $\rho_{AB}^{\otimes n}$ to $|\Phi^+\rangle\langle\Phi^+|^{\otimes m}$, and seek to maximize the ratio $m/n$ as $n \to \infty$. Additionally, we will allow a small error (to be formalized later) that goes to zero as $n \to \infty$. Semi-formally, the distillable entanglement of $\rho$ is

\[
E_D(\rho_{AB}) = \lim_{n\to\infty}\max\left\{\frac mn : \rho^{\otimes n} \xrightarrow{\text{LOCC}} \sigma_m \approx |\Phi^+\rangle\langle\Phi^+|^{\otimes m}\right\}.
\]

4.3 Distinguishing quantum states

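The maximum distinguishing bias that any measurement can achieve between a pair of states $\rho, \sigma$ is

\[
D(\rho, \sigma) = \max_{0 \le M \le I} |\operatorname{tr}(M(\rho - \sigma))|,
\]

where the maximum is over two-outcome measurements $\{M, I - M\}$.

It turns out that $D(\rho, \sigma) = \frac12\|\rho - \sigma\|_1$, where $\|X\|_1$ is the trace norm, defined as $\|X\|_1 = \operatorname{tr}(\sqrt{X^\dagger X})$. For this reason, the distance $D(\rho, \sigma)$ is also called the trace distance.

Using this language, we can define $E_D$ properly as

\[
E_D(\rho_{AB}) = \lim_{\epsilon\to 0}\lim_{n\to\infty}\max\left\{\frac mn : \rho^{\otimes n} \xrightarrow{\text{LOCC}} \sigma_m,\ \big\|\sigma_m - |\Phi^+\rangle\langle\Phi^+|^{\otimes m}\big\|_1 \le \epsilon\right\}.
\]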
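The relation between the optimal measurement and the trace norm can be illustrated numerically. The sketch below assumes NumPy; the helper names are illustrative. It uses the fact that for a Hermitian matrix the trace norm is the sum of the absolute values of the eigenvalues, and that the projector onto the positive eigenspace of $\rho - \sigma$ is one optimal choice of $M$.

```python
import numpy as np

def trace_distance(rho, sigma):
    """D(rho, sigma) = (1/2) ||rho - sigma||_1 via eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * float(np.sum(np.abs(eigenvalues)))

def optimal_bias(rho, sigma):
    """tr(M (rho - sigma)) with M the projector onto the positive part."""
    w, v = np.linalg.eigh(rho - sigma)
    positive = v[:, w > 0]
    M = positive @ positive.conj().T
    return float(np.trace(M @ (rho - sigma)).real)

zero = np.diag([1.0, 0.0])                      # |0><0|
one = np.diag([0.0, 1.0])                       # |1><1|
plus = np.array([[0.5, 0.5], [0.5, 0.5]])       # |+><+|

d_zero_plus = trace_distance(zero, plus)
bias_zero_plus = optimal_bias(zero, plus)
```

Orthogonal states such as $|0\rangle$ and $|1\rangle$ have trace distance 1 and can be distinguished perfectly, while for $|0\rangle$ versus $|+\rangle$ the bias achieved by the projector equals the trace distance.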

4.4 Entanglement dilution

Suppose we wish to create a general entangled state $\rho_{AB}$ out of pure EPR pairs. As with distillation, we will aim to optimize the asymptotic ratio achievable while the error goes to zero. Define the entanglement cost

\[
E_c(\rho_{AB}) = \lim_{\epsilon\to 0}\lim_{n\to\infty}\min\left\{\frac mn : \exists \Lambda \in \text{LOCC},\ \big\|\Lambda(|\Phi^+\rangle\langle\Phi^+|^{\otimes m}) - \rho_{AB}^{\otimes n}\big\|_1 \le \epsilon\right\}.
\]

In general, $E_c$ and $E_D$ are both hard to compute. However, if $\rho_{AB}$ is pure then there is a simple, beautiful formula.

Theorem 4.1. For any pure state |ψ〉AB,

Ec(|ψ〉〈ψ|AB) = ED(|ψ〉〈ψ|AB) = S(ρA) = S(ρB),

where S(ρ) = − tr ρ log ρ.
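For a pure bipartite state, $S(\rho_A)$ can be computed directly from the state vector. The sketch below assumes NumPy and uses the fact that the singular values of the coefficient matrix of $|\psi\rangle_{AB}$ are its Schmidt coefficients, whose squares are the eigenvalues of both $\rho_A$ and $\rho_B$; the function name and example states are illustrative.

```python
import numpy as np

def entanglement_entropy(psi, dim_A, dim_B):
    """S(rho_A) = S(rho_B) for a pure state |psi> on C^dA ⊗ C^dB."""
    schmidt = np.linalg.svd(psi.reshape(dim_A, dim_B), compute_uv=False)
    probs = schmidt ** 2
    probs = probs[probs > 1e-12]  # convention 0 log 0 = 0
    return float(-np.sum(probs * np.log2(probs)))

epr = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
product = np.kron([1.0, 0.0], [1.0, 1.0]) / np.sqrt(2)
partial = np.array([0.5, 0.0, 0.0, np.sqrt(0.75)])

values = [entanglement_entropy(s, 2, 2) for s in (epr, product, partial)]
```

By the theorem, these numbers are the asymptotic rates at which EPR pairs can be distilled from, or are needed to create, many copies of the respective state: one per copy for the EPR pair and none for a product state.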



Introduction to the Quantum Marginal Problem


The picture shows the cover of the book Gödel, Escher, Bach by Douglas Hofstadter. The object on it casts the shadow of a B, a G, or an E, depending on the direction from which the light is projected. Is it possible to project any triple of letters in this way? It turns out that the answer is no. For example, by geometric considerations there is no way of projecting “A” in every direction.

The goal of this lecture is to introduce a quantum version of this problem!

5.1 The Quantum Marginal Problem or Quantum Representability Problem

Consider a set of $n$ particles, each of dimension $d$. The state lives in $(\mathbb{C}^d)^{\otimes n}$. We consider different subsets of the particles $S_i \subseteq \{1, \dots, n\}$ and suppose we are given a quantum state on each of these sets: $\rho_{S_i}$. The question we want to address is whether they are compatible, i.e. does there exist a quantum state $\rho_{\{1,\dots,n\}}$ such that for all $S_i$,

\[
\operatorname{tr}_{S_i^c}(\rho) = \rho_{S_i}, \tag{5.1}
\]

with $S_i^c$ the complement of $S_i$ in $\{1, \dots, n\}$.

5.1.1 Physical Motivation

This is an interesting problem from a mathematical point of view, but it is also a prominent problem in the context of condensed matter physics and quantum chemistry. Consider a nearest-neighbour Hamiltonian on a line, $H = \sum_i h_{i,i+1}$, where $h_{i,i+1} := h_{i,i+1} \otimes I_{\{1,\dots,n\}\setminus\{i,i+1\}}$ only acts on qubits $i$ and $i+1$. A quantity of interest is the ground-state energy of the model, given by the minimum eigenvalue of $H$. We can write it variationally as

\[
E_g = \min_{|\psi\rangle}\langle\psi|H|\psi\rangle = \min_{\rho_{\{1,\dots,n\}}}\operatorname{tr}\big(\rho_{\{1,\dots,n\}}H\big), \tag{5.2}
\]

since the set of quantum states is convex, the objective is linear, and the extremal points are the pure states. Continuing,

\[
\begin{aligned}
E_g &= \min_{\rho_{\{1,\dots,n\}}}\operatorname{tr}\big(\rho_{\{1,\dots,n\}}H\big) \\
&= \min_\rho \sum_i \operatorname{tr}\big(\rho\,(h_{i,i+1} \otimes I_{\{1,\dots,n\}\setminus\{i,i+1\}})\big) \\
&= \min_\rho \sum_i \operatorname{tr}(h_{i,i+1}\rho_{i,i+1}).
\end{aligned} \tag{5.3}
\]

Therefore

\[
E_g = \min_{\{\rho_{i,i+1}\}\ \text{compatible}} \sum_i \operatorname{tr}(h_{i,i+1}\rho_{i,i+1}), \tag{5.4}
\]

where the minimization is over sets $\{\rho_{i,i+1}\}$ which are compatible.

Observe that the initial minimization is over $|\psi\rangle \in (\mathbb{C}^d)^{\otimes n}$, which has dimension $d^n$. In contrast, the minimization in Eq. (5.4) is over a number of variables that is only polynomial in $n$ and $d$ (each $\rho_{i,i+1}$ is a $d^2 \times d^2$ matrix). Therefore if we could solve the compatibility problem we could solve the original one in a much more efficient way. Unfortunately this is not a good strategy, and in fact one can show that the compatibility problem is computationally hard (NP-hard and even QMA-hard).

There is an interesting connection between the representability problem and quantum entropies. An important inequality for the von Neumann entropy of three subsystems ABC is strong subadditivity:

S(AB) + S(BC) ≥ S(B) + S(ABC). (5.5)

Clearly this inequality puts restrictions on compatible states. More interestingly, one can also use results from the quantum marginal problem to give a proof of the inequality.

5.2 Quantum Marginals for 3 Parties

A particular case of the marginal problem is the following: given three quantum states ρA, ρB and ρC, are they compatible? In this case the answer is clearly yes: just consider ρABC = ρA ⊗ ρB ⊗ ρC.

But what if we require that the global state ρABC is pure? I.e. we would like to have a pure state |ψ〉ABC such that

trAB (|ψ〉〈ψ|ABC) = ρC , trAC (|ψ〉〈ψ|ABC) = ρB, trBC (|ψ〉〈ψ|ABC) = ρA. (5.6)

Then just taking the tensor product of the reduced states is not an option any more.

Example 1: ρA = ρB = ρC = I/2 are compatible, with the GHZ state (|0, 0, 0〉 + |1, 1, 1〉)/√2 being a possible extension.

Example 2: Suppose ρA, ρB and ρC are compatible. Are ρ′A = UA ρA UA†, ρ′B = UB ρB UB† and ρ′C = UC ρC UC† compatible too, for unitaries UA, UB and UC? The answer is yes. Indeed, if |ψ〉ABC is an extension of ρA, ρB and ρC, then (UA ⊗ UB ⊗ UC)|ψ〉ABC is an extension of ρ′A, ρ′B and ρ′C.

Therefore we see that the property of ρA, ρB, ρC being compatible only depends on the spectra λA, λB and λC of ρA, ρB and ρC. Here λA := (λA,1, . . . , λA,d) with λA,i the eigenvalues of ρA.

5.2.1 Warm-Up: 2 Parties

Given ρA and ρB, are they compatible?

A useful way of writing a bipartite pure state |ψ〉AB is in its Schmidt form:

$$|\psi\rangle_{AB} = \sum_i s_i\, |e_i\rangle \otimes |f_i\rangle \tag{5.7}$$

for orthonormal bases {|ei〉} and {|fi〉} of A and B, respectively. The numbers {si} are called the Schmidt values of |ψ〉AB. The reductions of |ψ〉AB are

$$\rho_A = \sum_i s_i^2\, |e_i\rangle\langle e_i| \tag{5.8}$$

and

$$\rho_B = \sum_i s_i^2\, |f_i\rangle\langle f_i|. \tag{5.9}$$

Therefore we see that the (nonzero) eigenvalues of ρA and ρB are equal and given by {s_i^2}.

Going back to the compatibility question, we then see from the discussion above that ρA and ρB are compatible if, and only if, they have the same spectrum.

5.2.2 3 Parties of Qubits

Consider ρA, ρB and ρC each acting on C^2. Since λA = (λA,1, 1 − λA,1), and likewise for B and C, the compatible region is a subset of R^3. This region has a simple algebraic characterization (shown in []):

$$\lambda^A_{\max} + \lambda^B_{\max} \le 1 + \lambda^C_{\max} \tag{5.10}$$

plus all possible cyclic permutations of the labels.


Monogamy of Entanglement


Today, I will discuss a property of entanglement known as monogamy.

Consider a Hamiltonian that has two-body interactions

$$H = \sum_{\langle i,j\rangle} H_{ij},$$

where the pairs ⟨i, j⟩ are the edges in the interaction graph.

We will consider the rather crude approximation that every particle interacts with every other particle in the same way; this approximation is known as the mean field approximation:

$$H \approx \frac{1}{n} \sum_{1\le i<j\le n} H_{ij}.$$

It is then folklore that the ground state has the form ≈ ρ^{⊗n}.

Example: Hij = Fij, where F is the swap operator defined by

F |α〉|β〉 = |β〉|α〉

The +1 eigenstates are |↑↑〉, |↓↓〉 and |↑↓〉 + |↓↑〉. The −1 eigenstate is |↑↓〉 − |↓↑〉, the singlet.

The Hamiltonian would like every two-particle reduced density matrix to be in the singlet state. Now if the global state |ψ〉ABC has a reduced state ρAB = trC |ψ〉〈ψ|ABC that is indeed a pure state in singlet form, then the global state has to be of the form

|ψ〉ABC = (|↑↓〉 − |↓↑〉)AB ⊗ |φ〉C .

Then we immediately see that the other pairs of particles are not entangled. This turns out to be a general feature of such systems.

Theorem (quantum de Finetti): If |ψ〉 is a symmetric state on (C^D)^{⊗(k+n)}, then

$$\mathrm{tr}_n |\psi\rangle\langle\psi| \approx \int d\mu(\sigma)\, \sigma^{\otimes k},$$

where µ is a distribution over density matrices on C^D.

This is a quantum version of de Finetti’s theorem from statistics. The important consequence of this theorem is that the remaining k particles are not entangled. Since the ground states of mean field systems are permutation invariant, this means that these ground states are not entangled, and hence in some sense classical.

We will now introduce some mathematical tools needed to prove this theorem. The first is the symmetric subspace.

6.1 Symmetric Subspace

Let Sn be the group of permutations of n objects. Note that it contains n! elements. Now fix D and a permutation π ∈ Sn. Let Pπ act on (C^D)^{⊗n} by

$$P_\pi\, |i_1\rangle \otimes \cdots \otimes |i_n\rangle = |i_{\pi^{-1}(1)}\rangle \otimes \cdots \otimes |i_{\pi^{-1}(n)}\rangle.$$

The symmetric subspace is defined as the set of vectors that are invariant under the action of the symmetric group:

$$\mathrm{Sym}^n(\mathbb{C}^D) = \{ |\Psi\rangle \in (\mathbb{C}^D)^{\otimes n} : P_\pi|\Psi\rangle = |\Psi\rangle\ \ \forall \pi \in S_n \}.$$

Example: D = 2, n = 2:

Sym^2(C^2) = span{|00〉, |11〉, |01〉 + |10〉}

D = 2, n = 3:

Sym^3(C^2) = span{|000〉, |111〉, |001〉 + |010〉 + |100〉, |101〉 + |011〉 + |110〉}

The general construction is as follows. Define the type of a string x^n = (x_1, . . . , x_n) as

$$\mathrm{type}(x^n) = \sum_{i=1}^{n} e_{x_i},$$

where e_j is the basis vector with a 1 in the j’th position. A vector t = (t_1, . . . , t_D) is a type if t_1 + t_2 + · · · + t_D = n and the t_i are non-negative integers. For every type t define the vector

$$|\gamma_t\rangle = \binom{n}{t}^{-1/2} \sum_{x^n :\ \mathrm{type}(x^n) = t} |x^n\rangle,$$

where \binom{n}{t} is the multinomial coefficient. Then Sym^n(C^D) = span{|γ_t〉}.

We can now compute the dimension of the symmetric subspace. It equals the number of types, which we can interpret as the number of ways in which you can arrange n balls into D buckets. There are \binom{n+D-1}{n} ways of doing this, which is therefore the dimension.

Useful for calculations involving the symmetric subspace are the following two characterisations of the projector Π_sym onto the symmetric subspace:

1) $\Pi_{\mathrm{sym}} = \frac{1}{n!} \sum_{\pi} P_\pi$, where the sum is over all permutations.

2) $\frac{\Pi_{\mathrm{sym}}}{\mathrm{tr}\,\Pi_{\mathrm{sym}}} = \int d\phi\, |\phi\rangle\langle\phi|^{\otimes n}$, where we integrate over the unit vectors in C^D with the uniform measure dφ normalised to ∫dφ = 1. Note that tr Π_sym = dim Sym^n(C^D) = \binom{n+D-1}{n}.

Example: n = 1: ∫dφ |φ〉〈φ| = I/D.

n = 2: ∫dφ |φ〉〈φ|^{⊗2} = Π_sym/(D(D + 1)/2) = (I + SWAP)/(D(D + 1)).

We can prove 2) either by representation theory (using Schur’s lemma) or by rewriting the integral over unit vectors as an integral over Gaussian vectors and then using Wick’s theorem to solve the integral.

Note that Π_sym is the projector onto the symmetric subspace, hence

a) ∀|ψ〉 ∈ Sym^n(C^D), Π_sym|ψ〉 = |ψ〉
b) ∀|ψ〉 ∈ (C^D)^{⊗n}, Π_sym|ψ〉 ∈ Sym^n(C^D)
c) Π_sym = Π_sym†
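The dimension count for the symmetric subspace can be checked by brute force for small cases. The following is a minimal sketch, not part of the original notes: it enumerates all strings in {0, . . . , D−1}^n, collects their types, and compares the number of types with the binomial coefficient. The function names `types` and `sym_dim` are illustrative choices.

```python
from itertools import product
from math import comb

def types(n, D):
    """Return the set of types (occupation-number vectors) of all strings of length n over D symbols."""
    seen = set()
    for x in product(range(D), repeat=n):
        # t[j] counts how often symbol j appears in the string x
        seen.add(tuple(x.count(j) for j in range(D)))
    return seen

def sym_dim(n, D):
    """Dimension of Sym^n(C^D), computed as the number of types."""
    return len(types(n, D))

for n, D in [(2, 2), (3, 2), (3, 3), (4, 3)]:
    print(n, D, sym_dim(n, D), comb(n + D - 1, n))
```

For D = 2 the two counts reproduce the examples above: three basis vectors for n = 2 and four for n = 3.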

6.2 Application to Estimation

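Given |ψ〉^{⊗n}, we want to output an estimate |ψ̂〉 that approximates |ψ〉. We could now use different notions of approximation. Here, we want to maximise the average overlap

$$\mathbb{E}\big(|\langle\psi|\hat\psi\rangle|^{2k}\big).$$

In order to do this, we will use a continuous POVM {Q_ψ̂} with ∫dψ̂ Q_ψ̂ = Π_sym. In fact, we will choose Q_ψ̂ = c |ψ̂〉〈ψ̂|^{⊗n}. The normalisation constant can be worked out as follows:

$$\int d\hat\psi\ c\,|\hat\psi\rangle\langle\hat\psi|^{\otimes n} = c\,\Pi_{\mathrm{sym}} \Big/ \binom{n+D-1}{n},$$

hence $Q_{\hat\psi} = \binom{n+D-1}{n} |\hat\psi\rangle\langle\hat\psi|^{\otimes n}$.

We now use this to solve our estimation problem:

$$\mathbb{E}\big(|\langle\psi|\hat\psi\rangle|^{2k}\big) = \int d\hat\psi\ |\langle\psi|\hat\psi\rangle|^{2k}\ \Pr[\hat\psi|\psi],$$

where $\Pr[\hat\psi|\psi] = \mathrm{tr}\big(Q_{\hat\psi}\, |\psi\rangle\langle\psi|^{\otimes n}\big)$. This equals

$$\binom{n+D-1}{n} \int d\hat\psi\ \mathrm{tr}\big(|\psi\rangle\langle\psi|^{\otimes (n+k)}\, |\hat\psi\rangle\langle\hat\psi|^{\otimes (n+k)}\big) = \frac{\binom{n+D-1}{n}}{\binom{n+k+D-1}{n+k}} \ge 1 - Dk/n,$$

where we used

$$\frac{\Pi^{n+k}_{\mathrm{sym}}}{\binom{n+k+D-1}{n+k}} = \int d\phi\, |\phi\rangle\langle\phi|^{\otimes (n+k)}.$$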


Separable states, PPT and Bell inequalities


Recall from yesterday this theorem.

Theorem 7.1. For any pure state |ψ〉AB,

Ec(|ψ〉〈ψ|AB) = ED(|ψ〉〈ψ|AB) = S(ρA) = S(ρB),

where S(ρ) = − tr ρ log ρ.

As a result, many copies of a pure entangled state can be (approximately) reversibly transformed into EPR pairs and back again. Up to a small approximation error and inefficiency, we have

$$|\psi\rangle_{AB}^{\otimes n}\ \overset{\mathrm{LOCC}}{\longleftrightarrow}\ |\Phi^+\rangle^{\otimes n S(\rho_A)}.$$

7.1 Mixed-state entanglement

For pure states, a state is entangled if it’s not a product state. This is easy to check, and we can even quantify the amount of entanglement (using Theorem 7.1) by looking at the entropy of one of the reduced density matrices.

But what about mixed states? Here the situation is more complicated.

Definition 1. Define the set of separable states Sep to be the set of all ρAB that can be written as

$$\sum_i p_i\, |\psi_i\rangle\langle\psi_i|_A \otimes |\varphi_i\rangle\langle\varphi_i|_B, \tag{7.1}$$

where p is a probability distribution.

Definition 2. A state is entangled if it is not separable.

We should check that this notion of entanglement makes sense in terms of LOCC. And indeed, separable states can be created using LOCC: Alice samples i according to p, creates |ψi〉 and sends i to Bob, who uses it to create |ϕi〉. On the other hand, entangled states cannot be created from a separable state by using LOCC. In other words, the set Sep is closed under LOCC.

7.2 The PPT test

It is in general hard to test, given a state ρAB, whether ρ ∈ Sep. Naively we would have to check all possible decompositions of the form in (7.1). So it is desirable to find efficient tests that work at least some of the time.

One such test is the Positive Partial Transpose, or PPT, test. The partial transpose can be thought of as (T ⊗ id), where T is the transpose map. More concretely, if

$$X_{AB} = \sum_{i,j,k,l} c_{i,j,k,l}\, |i\rangle\langle j|_A \otimes |k\rangle\langle l|_B$$

then the partial transpose is

$$X_{AB}^{T_A} = \sum_{i,j,k,l} c_{i,j,k,l}\, (|i\rangle\langle j|_A)^T \otimes |k\rangle\langle l|_B = \sum_{i,j,k,l} c_{i,j,k,l}\, |j\rangle\langle i|_A \otimes |k\rangle\langle l|_B.$$

The PPT test asks whether ρ^{T_A} is positive semidefinite. If so, we say that ρ is PPT.

Observe that all separable states are PPT. This is because if ρ = ∑_i p_i |ψi〉〈ψi|A ⊗ |ϕi〉〈ϕi|B, then

$$\rho^{T_A} = \sum_i p_i\, |\psi_i^*\rangle\langle\psi_i^*|_A \otimes |\varphi_i\rangle\langle\varphi_i|_B,$$

where |ψ_i^*〉 denotes the entrywise complex conjugate of |ψ_i〉. This is still a valid density matrix and in particular is positive semidefinite (indeed, it is also in Sep).

Thus, ρ ∈ Sep implies ρ ∈ PPT. The contrapositive is that ρ ∉ PPT implies ρ ∉ Sep. This gives us an efficient test that will detect entanglement in some cases.

Are there in fact any states that are not in PPT? Otherwise this would not be a very interesting test.

Examples

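1. |Φ+〉AB = (|0, 0〉 + |1, 1〉)/√2. Then

$$\big(|\Phi^+\rangle\langle\Phi^+|_{AB}\big)^{T_A} = \frac{1}{2}\big(|0,0\rangle\langle 0,0| + |1,0\rangle\langle 0,1| + |0,1\rangle\langle 1,0| + |1,1\rangle\langle 1,1|\big) = \begin{pmatrix} 1/2 & 0 & 0 & 0 \\ 0 & 0 & 1/2 & 0 \\ 0 & 1/2 & 0 & 0 \\ 0 & 0 & 0 & 1/2 \end{pmatrix} = \frac{1}{2}\,\mathrm{SWAP}.$$

This has eigenvalues (1/2, 1/2, 1/2, −1/2), meaning that |Φ+〉〈Φ+|AB ∉ PPT. Of course, we already knew that |Φ+〉 was entangled.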

2. Let’s try an example where we don’t already know the answer, like a noisy version of |Φ+〉. Let

$$\rho = p\,|\Phi^+\rangle\langle\Phi^+| + (1-p)\,\frac{I}{4}.$$

Then one can calculate λ_min(ρ^{T_A}) = −p/2 + (1 − p)/4, which is < 0 if and only if p > 1/3.
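The eigenvalue in Example 2 can be checked numerically. The following is a minimal sketch using plain nested lists; the helper names are illustrative and not from the notes. It builds ρ, applies the partial transpose on Alice's index, and verifies that the singlet vector (|01〉 − |10〉)/√2 is an eigenvector of ρ^{T_A} with eigenvalue −p/2 + (1 − p)/4. It does not itself prove that this is the smallest eigenvalue; that part is taken from the text.

```python
def noisy_bell(p):
    """rho = p |Phi+><Phi+| + (1-p) I/4 in the basis |00>, |01>, |10>, |11>."""
    rho = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        rho[i][i] += (1 - p) / 4
    for i in (0, 3):
        for j in (0, 3):
            rho[i][j] += p / 2
    return rho

def partial_transpose_A(rho):
    """Exchange Alice's row and column index: <a b|X|a' b'> becomes <a' b|X|a b'>."""
    out = [[0.0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            for a2 in range(2):
                for b2 in range(2):
                    out[2 * a2 + b][2 * a + b2] = rho[2 * a + b][2 * a2 + b2]
    return out

def singlet_eigenvalue(p):
    """Return (eigenvalue, residual) of rho^{T_A} on the singlet vector."""
    s = [0.0, 2 ** -0.5, -(2 ** -0.5), 0.0]
    m = partial_transpose_A(noisy_bell(p))
    ms = [sum(m[i][j] * s[j] for j in range(4)) for i in range(4)]
    lam = sum(si * mi for si, mi in zip(s, ms))
    residual = max(abs(mi - lam * si) for si, mi in zip(s, ms))
    return lam, residual

for p in (0.2, 1 / 3, 0.5, 1.0):
    print(p, singlet_eigenvalue(p))
```

The sign of the returned eigenvalue changes at p = 1/3, in line with the threshold stated above.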

Maybe PPT = Sep? Unfortunately not. In D(C^2 ⊗ C^3) (i.e. density matrices in which one system has 2 dimensions and the other has 3), all PPT states are separable. But for larger systems, e.g. 3 × 3 or 2 × 4, there exist PPT states that are not separable.

7.2.1 Bound entanglement

Theorem 7.2. If ρAB ∈ PPT then ED(ρ) = 0.

To prove this we will establish two properties of the set PPT:

1. PPT is closed under LOCC.
Consider a general LOCC protocol. This can be thought of as Alice and Bob alternating general measurements and sending each other the outcomes. When Alice makes a measurement, this transformation is

$$\rho_{AB} \mapsto \frac{(M \otimes I)\,\rho_{AB}\,(M^\dagger \otimes I)}{\mathrm{tr}\big((M^\dagger M \otimes I)\,\rho_{AB}\big)}.$$

After Bob makes a measurement as well, depending on the outcome, the state is proportional to

$$(M \otimes N)\,\rho_{AB}\,(M^\dagger \otimes N^\dagger),$$

and so on. The class SLOCC (stochastic LOCC) consists of outcomes that can be obtained with some positive probability, and we will see later that this can be characterized in terms of (M ⊗ N)ρAB(M† ⊗ N†).

We claim that if ρAB ∈ PPT then (M ⊗ N)ρAB(M† ⊗ N†) ∈ PPT. Indeed

$$\big((M \otimes N)\,\rho_{AB}\,(M^\dagger \otimes N^\dagger)\big)^{T_A} = (M^* \otimes N)\,\rho_{AB}^{T_A}\,(M^* \otimes N)^\dagger,$$

where M^* denotes the entrywise complex conjugate of M. Now ρ_AB^{T_A} ≥ 0 and XYX† ≥ 0 whenever Y ≥ 0, implying that ((M ⊗ N)ρAB(M† ⊗ N†))^{T_A} ≥ 0.

2. PPT is closed under tensor product. If ρAB, σA′B′ ∈ PPT, then (ρAB ⊗ σA′B′) ∈ PPT. Why? Because

$$(\rho_{AB} \otimes \sigma_{A'B'})^{T_{AA'}} = \rho_{AB}^{T_A} \otimes \sigma_{A'B'}^{T_{A'}} \ge 0.$$

Proof of Theorem 7.2. Assume towards a contradiction that ρ ∈ PPT and ED(ρ) > 0. Then for any ε > 0 there exists n such that ρ_AB^{⊗n} can be transformed to |Φ+〉 using LOCC up to error ε. Since ρ ∈ PPT, ρ^{⊗n} is also PPT and so is the output of the protocol, which we call σ. Then σ^{T_A} ≥ 0 and ‖σ − |Φ+〉〈Φ+|‖_1 ≤ ε. If we had ε = 0, then this would be a contradiction, because σ is in PPT and |Φ+〉〈Φ+| is not. We can use an argument based on continuity (details omitted) to show that a contradiction must appear even for some sufficiently small ε > 0.

If ρ is entangled but ED(ρ) = 0, then we say that ρ has “bound entanglement”, meaning that it is entangled, but no pure entanglement can be extracted from it. By Theorem 7.2, we know that any state in PPT but not Sep must be bound entangled.

Open question: A major open question (the “NPT bound entanglement” question) is whether there exist bound entangled states that have a non-positive partial transpose.

7.3 Entanglement witnesses

Sep is a convex set, meaning that if ρ, σ ∈ Sep and 0 ≤ λ ≤ 1 then λρ + (1 − λ)σ ∈ Sep. Thus the separating hyperplane theorem implies that for any ρ ∉ Sep, there exists a Hermitian matrix W such that

1. For all σ ∈ Sep, tr(Wσ) ≥ 0

2. tr(Wρ) < 0.

Example: consider the state ρ = |Φ+〉〈Φ+|. Let W = I − 2|Φ+〉〈Φ+|. As an exercise, show that tr(Wσ) ≥ 0 for all σ ∈ Sep. We can also check that tr(Wρ) = −1.

Observe that an entanglement witness W needs to be chosen with a specific ρ in mind. As an exercise, show that no W can be a witness for all entangled states of a particular dimension.

7.4 CHSH game

One very famous type of entanglement witness is called a Bell inequality. In fact, these bounds rule out not only separable states but even classically correlated distributions over states that could be from a theory more general than quantum mechanics. Historically, Bell inequalities have been important in showing that entanglement is an inescapable, and experimentally testable, part of quantum mechanics.

The game is played by two players, Alice and Bob, together with a Referee. The Referee chooses bits r, s at random and sends r to Alice and s to Bob. Alice then sends a bit a back to the Referee and Bob sends the bit b to the Referee.

[Figure: the Referee R sends r to Alice (A) and s to Bob (B); Alice returns a and Bob returns b.]

Alice and Bob win if a ⊕ b = r · s, i.e. they want a ⊕ b to be chosen according to this table:

r   s   desired a ⊕ b
0   0   0
0   1   0
1   0   0
1   1   1

One can show that if Alice and Bob use a deterministic strategy, their success probability will be ≤ 3/4. However, using entanglement they can achieve a success probability of cos^2(π/8) ≈ 0.854 . . . > 3/4. This strategy, together with the “payoff” function (+1 if they win, −1 if they lose), yields an entanglement witness, and one that can be implemented only with local measurements.


Exact Entanglement Transformations


8.1 Three qubit subsystems, part 2

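Last lecture we looked at quantum states of three parties |ψ〉ABC ∈ C^2 ⊗ C^2 ⊗ C^2. We saw that if ρA, ρB, and ρC are the reductions of |ψ〉ABC then

$$\lambda^A_{\max} + \lambda^B_{\max} \le 1 + \lambda^C_{\max}. \tag{8.1}$$

Let us prove it. We have

$$\begin{align}
\lambda^A_{\max} + \lambda^B_{\max} &= \max_{|\phi\rangle,\ \||\phi\rangle\|_2 = 1} \langle\phi|\rho_A|\phi\rangle + \max_{|\psi\rangle,\ \||\psi\rangle\|_2 = 1} \langle\psi|\rho_B|\psi\rangle \notag\\
&= \max_{\phi_A} \mathrm{tr}(\rho_A |\phi\rangle\langle\phi|_A) + \max_{\phi_B} \mathrm{tr}(\rho_B |\phi\rangle\langle\phi|_B) \notag\\
&= \max_{\phi_A,\phi_B} \mathrm{tr}\big(\rho_{AB}(|\phi\rangle\langle\phi|_A \otimes I_B + I_A \otimes |\phi\rangle\langle\phi|_B)\big) \notag\\
&= \max_{\phi_A,\phi_B} \mathrm{tr}\big(\rho_{AB}(|\phi\rangle\langle\phi|_A \otimes (|\phi\rangle\langle\phi|_B + |\phi'\rangle\langle\phi'|_B) + (|\phi\rangle\langle\phi|_A + |\phi'\rangle\langle\phi'|_A) \otimes |\phi\rangle\langle\phi|_B)\big) \notag\\
&\le \max_{\phi_A,\phi_B} \mathrm{tr}\big(\rho_{AB}(I_A \otimes I_B + |\phi\rangle\langle\phi|_A \otimes |\phi\rangle\langle\phi|_B)\big) \notag\\
&= 1 + \max_{\phi_A,\phi_B} \mathrm{tr}\big(\rho_{AB}(|\phi\rangle\langle\phi|_A \otimes |\phi\rangle\langle\phi|_B)\big) \notag\\
&\le 1 + \max_{|\phi\rangle_{AB}} \mathrm{tr}\big(\rho_{AB} |\phi\rangle\langle\phi|_{AB}\big) \notag\\
&= 1 + \lambda^{AB}_{\max} = 1 + \lambda^C_{\max}, \tag{8.2}
\end{align}$$

where |φ′〉 denotes the unit vector orthogonal to |φ〉 in C^2, so that |φ〉〈φ| + |φ′〉〈φ′| = I. The first inequality adds the positive term |φ′〉〈φ′|A ⊗ |φ′〉〈φ′|B, and in the last equality we used that |ψ〉ABC is pure.

To show that Eq. (8.1) is sufficient for a triple of states ρA, ρB, and ρC to be compatible, let us consider the following Ansatz: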

|ψ〉ABC = a|0, 0, 0〉+ b|0, 1, 1〉+ c|1, 0, 1〉+ d|1, 1, 0〉. (8.3)

The local density matrix of A for this state is

$$\rho_A = (a^2 + b^2)|0\rangle\langle 0| + (c^2 + d^2)|1\rangle\langle 1|. \tag{8.4}$$

Likewise,

$$\rho_B = (a^2 + c^2)|0\rangle\langle 0| + (b^2 + d^2)|1\rangle\langle 1| \tag{8.5}$$

and

$$\rho_C = (a^2 + d^2)|0\rangle\langle 0| + (b^2 + c^2)|1\rangle\langle 1|. \tag{8.6}$$

need picture... (add explanation later)

Next lecture we will see how the mathematics of representation theory is useful for generalizing this result to higher dimensions.
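The necessity direction can be illustrated on the Ansatz (8.3). The following is a minimal sketch, assuming real coefficients a, b, c, d (normalised inside the function): it reads off the largest eigenvalues of ρA, ρB, ρC from (8.4) to (8.6) and checks inequality (8.1) together with its permutations on random samples. The function names are illustrative.

```python
import random

def ansatz_lmax(a, b, c, d):
    """Largest eigenvalues of rho_A, rho_B, rho_C for a|000> + b|011> + c|101> + d|110>."""
    norm = a * a + b * b + c * c + d * d
    a2, b2, c2, d2 = (x * x / norm for x in (a, b, c, d))
    lam_a = max(a2 + b2, c2 + d2)   # Eq. (8.4)
    lam_b = max(a2 + c2, b2 + d2)   # Eq. (8.5)
    lam_c = max(a2 + d2, b2 + c2)   # Eq. (8.6)
    return lam_a, lam_b, lam_c

def satisfies_inequalities(la, lb, lc, tol=1e-12):
    """Check Eq. (8.1) and the two inequalities obtained by permuting the labels."""
    return (la + lb <= 1 + lc + tol and
            lb + lc <= 1 + la + tol and
            lc + la <= 1 + lb + tol)

random.seed(0)
samples = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(1000)]
print(all(satisfies_inequalities(*ansatz_lmax(*s)) for s in samples))
```

Every sample passes, as it must, since each sample is the triple of marginals of a pure three-qubit state.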


8.2 Exact Entanglement Transformation

In Fernando’s lecture he considered asymptotic and approximate entanglement transformations. Here we will consider the different regime of single-copy and exact transformations. Given two multipartite states, |φ〉 and |ψ〉, which one is more entangled? One way to put an order on the set of quantum states is to say |φ〉 is more entangled than |ψ〉 if we can transform |φ〉 into |ψ〉 by LOCC.

A LOCC protocol is given by a sequence of measurements by one of the parties and classical communication of the outcome obtained to the other. (see fig ....).

8.2.1 Quantum Instrument

Consider a quantum operation

$$\Lambda(\rho_A) = \mathrm{tr}_B\big(U(\rho_A \otimes |0\rangle\langle 0|_B)U^\dagger\big) = \sum_i \langle i|_B U |0\rangle_B\ \rho_A\ \langle 0|_B U^\dagger |i\rangle_B = \sum_i E_i \rho_A E_i^\dagger, \tag{8.7}$$

with E_i := 〈i|_B U |0〉_B. The E_i are called Kraus operators of Λ.

Note that the partial trace is the same as performing a projective measurement on B and forgetting the outcome obtained. Suppose now that we would record the outcome instead. Then conditioned on outcome i, the (unnormalized) state is E_i ρ_A E_i†. We can associate the following operation to it:

$$\Gamma(\rho_A) = \sum_i E_i \rho_A E_i^\dagger \otimes |i\rangle\langle i|. \tag{8.8}$$

The operation Γ is also called a quantum instrument.

8.2.2 LOCC as a Quantum Operation

Going back to the LOCC protocol, Alice’s first measurement can be modelled by a set of Kraus operators {A_{i1}}. Then Bob’s measurement, which can depend on Alice’s outcome, will be given by {B_{i1,i2}}, and so on. In terms of a quantum operation, an n-round LOCC protocol can be written as

$$\Lambda(\rho) = \sum_{i_1,\dots,i_n} (A_{i_1,\dots,i_n} \cdots A_{i_1} \otimes B_{i_1,\dots,i_n} \cdots B_{i_1 i_2})\ \rho\ (A_{i_1,\dots,i_n} \cdots A_{i_1} \otimes B_{i_1,\dots,i_n} \cdots B_{i_1 i_2})^\dagger \otimes |i_1,\dots,i_n\rangle\langle i_1,\dots,i_n|_{A'} \otimes |i_1,\dots,i_n\rangle\langle i_1,\dots,i_n|_{B'} \tag{8.9}$$

8.2.3 SLOCC: Stochastic LOCC

The general form of a LOCC operation (Eq. (8.9)) is daunting. It turns out that the whole picture simplifies if we condition on measurement outcomes, which corresponds to considering only one of the terms in the sum of Eq. (8.9). We call this operation Stochastic LOCC; it is an operation which can only be implemented with some non-zero probability. A general form of SLOCC (for three parties) is then

Λ(ρ) = (A⊗B ⊗ C)ρ(A⊗B ⊗ C)†. (8.10)

Note that p := tr(Λ(ρ)) gives the probability that Λ is implemented (with probability 1 − p the protocol fails and the state is transformed into something else).

Definition 3. |ψ〉 →SLOCC |φ〉 if there exist matrices A, B, C such that |φ〉 = (A ⊗ B ⊗ C)|ψ〉. We say that |ψ〉 and |φ〉 have the same type of entanglement if |ψ〉 →SLOCC |φ〉 and |φ〉 →SLOCC |ψ〉.

Since we do not care about normalization, we can w.l.o.g. take A, B, C to be SL(d) matrices (with SL(d) the group of d × d matrices of unit determinant). Therefore we see that the problem of characterizing different entanglement classes is equivalent to the problem of classifying the orbits of SL(d) × SL(d) × SL(d).

In general the number of orbits is very big. Indeed, the dimension of the space scales as d^3, but the group only has 3d^2 parameters. But the case of three qubits turns out to be simple, and we only have 6 different classes.

One is the class of fully separable states, with representative state

|0, 0, 0〉. (8.11)

The three following ones are the classes where only two parties are entangled, with representativestates

|φ+〉AB ⊗ |0〉C (8.12)

and likewise for AC and BC.

The 5th class is the so-called GHZ class, with representative state

$$\frac{1}{\sqrt{2}}\big(|0,0,0\rangle + |1,1,1\rangle\big). \tag{8.13}$$

The final class is the so-called W class, with representative state

$$\frac{1}{\sqrt{3}}\big(|0,0,1\rangle + |0,1,0\rangle + |1,0,0\rangle\big). \tag{8.14}$$


Quantum de Finetti theorem


9.1 de Finetti

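Let us remind ourselves that the quantum de Finetti theorem states that for all |ψ〉 ∈ Sym^{n+k}(C^D),

$$\mathrm{tr}_n |\psi\rangle\langle\psi| \approx \int d\mu(\sigma)\, \sigma^{\otimes k}.$$

The intuition here is that measuring the last n systems and finding that they are each in state σ implies that the remaining k systems are also in state σ.

Let us now do the math. Recall that

$$\int d\phi\, |\phi\rangle\langle\phi|^{\otimes m} = \Pi_{D,m} \Big/ \binom{D+m-1}{D-1},$$

where Π_{D,m} is the projector onto Sym^m(C^D). Then

$$\begin{align}
\mathrm{tr}_n |\psi\rangle\langle\psi| &= \mathrm{tr}_n\big[(\mathrm{id}^{\otimes k} \otimes \Pi_{D,n})\, |\psi\rangle\langle\psi|\, (\mathrm{id}^{\otimes k} \otimes \Pi_{D,n})\big] \tag{9.1}\\
&= \binom{D+n-1}{D-1}^{2}\, \mathrm{tr}_n \int\!\!\int d\phi\, d\phi'\ (\mathrm{id}^{\otimes k} \otimes |\phi\rangle\langle\phi|^{\otimes n})\, |\psi\rangle\langle\psi|\, (\mathrm{id}^{\otimes k} \otimes |\phi'\rangle\langle\phi'|^{\otimes n}) \tag{9.2}\\
&= \binom{D+n-1}{D-1} \int\!\!\int d\phi\, d\phi'\ |v_\phi\rangle\langle v_{\phi'}|\ \langle\phi'|\phi\rangle^{n} \tag{9.3}\\
&\approx \int d\phi\ |v_\phi\rangle\langle v_\phi|, \tag{9.4}
\end{align}$$

where we defined $|v_\phi\rangle := \binom{D+n-1}{D-1}^{1/2} (\mathrm{id}^{\otimes k} \otimes \langle\phi|^{\otimes n})|\psi\rangle$ and where in the approximation we assume that n is large, so that the overlap 〈φ′|φ〉^n is sharply peaked around φ′ = φ.

If we had not inserted the projector on the left, we would in fact obtain the equality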

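$$\mathrm{tr}_n |\psi\rangle\langle\psi| = \int d\phi\ |v_\phi\rangle\langle v_\phi| = \int d\phi\ p_\phi\, |\hat v_\phi\rangle\langle \hat v_\phi|, \tag{9.6}$$

where p_φ := 〈v_φ|v_φ〉 and |v̂_φ〉 := |v_φ〉/√p_φ is the normalised vector, with $|v_\phi\rangle = \binom{D+n-1}{D-1}^{1/2} (\mathrm{id}^{\otimes k} \otimes \langle\phi|^{\otimes n})|\psi\rangle$.

We now claim that |v̂_φ〉 ≈ |φ〉^{⊗k} on average:

$$\begin{align}
\int d\phi\ p_\phi\, |\langle \hat v_\phi|\phi\rangle^{\otimes k}|^2 &= \int d\phi\ |\langle v_\phi|\phi\rangle^{\otimes k}|^2 \tag{9.7}\\
&= \binom{D+n-1}{D-1} \int d\phi\ \big|\langle\psi|\,(|\phi\rangle^{\otimes k} \otimes |\phi\rangle^{\otimes n})\big|^2 \tag{9.8}\\
&= \binom{D+n-1}{D-1}\ \mathrm{tr}\Big(|\psi\rangle\langle\psi| \int d\phi\ |\phi\rangle\langle\phi|^{\otimes (n+k)}\Big) \tag{9.9}\\
&= \binom{D+n-1}{D-1} \Big/ \binom{D+n+k-1}{D-1}\ \mathrm{tr}\big(|\psi\rangle\langle\psi|\, \Pi_{D,n+k}\big) = \binom{D+n-1}{D-1} \Big/ \binom{D+n+k-1}{D-1}. \tag{9.10}
\end{align}$$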

We can now get a good lower bound from this by expanding the binomial coefficients into factorials:

$$\binom{D+n-1}{D-1} \Big/ \binom{D+n+k-1}{D-1} = \frac{(n+1)\cdots(n+D-1)}{(n+k+1)\cdots(n+k+D-1)} \ge \big(1 - k/(n+1)\big)^{D-1} \ge 1 - kD/n.$$
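The chain of inequalities can be checked numerically for small parameters. The following is a minimal sketch (the function name `overlap` is illustrative): it evaluates the exact ratio of binomial coefficients and the two lower bounds. The sample values all satisfy k ≤ n, an assumption made here so that the intermediate factor 1 − k/(n+1) is non-negative.

```python
from math import comb

def overlap(n, k, D):
    """Exact average overlap: ratio of the two binomial coefficients."""
    return comb(D + n - 1, D - 1) / comb(D + n + k - 1, D - 1)

def bounds(n, k, D):
    """Return (exact value, middle bound, final bound) of the displayed chain."""
    exact = overlap(n, k, D)
    middle = (1 - k / (n + 1)) ** (D - 1)
    final = 1 - k * D / n
    return exact, middle, final

for n, k, D in [(10, 1, 2), (100, 2, 2), (100, 5, 3), (1000, 10, 4)]:
    print(n, k, D, bounds(n, k, D))
```

The gap between the exact value and 1 shrinks like kD/n, which is the polynomial dependence on n discussed next.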

Note that this bound is polynomial in n, and this is tight. There exists, however, an improvement to an exponential dependence in n at the cost of replacing product states by almost product states.

In order to conclude the proof of the quantum de Finetti theorem, we need to relate the trace distance to the average we computed.

For this, we consider the fidelity |〈α|β〉| between states |α〉 and |β〉. If |〈α|β〉|^2 = 1 − ε, we can expand |β〉 = √(1 − ε)|α〉 + √ε |α⊥〉 (up to a phase), where |α⊥〉 is a unit vector orthogonal to |α〉, and obtain

$$\big\|\, |\alpha\rangle\langle\alpha| - |\beta\rangle\langle\beta|\, \big\|_1 = 2\sqrt{1 - |\langle\alpha|\beta\rangle|^2} = 2\sqrt{\varepsilon}.$$

9.2 Quantum Key Distribution

A surprising application of entanglement is quantum key distribution. Suppose Alice and Bob share an EPR pair |φ〉 = (|00〉 + |11〉)/√2. If the joint state of Alice, Bob and a potential eavesdropper Eve is |ψ〉ABE such that trE |ψ〉〈ψ|ABE = |φ〉〈φ|AB, it follows that |ψ〉ABE = |φ〉AB ⊗ |γ〉E.

By measuring in their standard basis, Alice and Bob thus obtain a secret random bit r. They can use this bit to send a bit securely with the help of the Vernam one-time pad cipher: Let’s call Alice’s message m. Alice sends the cipher c = m ⊕ r to Bob. Bob then recovers the message by adding r: c ⊕ r = m ⊕ r ⊕ r = m.
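The one-time pad step is purely classical and can be sketched in a few lines. In this illustration the shared key bits come from a classical random number generator, which only stands in for the measurement outcomes on shared EPR pairs; the function name `one_time_pad` and the sample message are hypothetical.

```python
import secrets

def one_time_pad(bits, key):
    """XOR each message bit with the corresponding key bit (encryption and decryption coincide)."""
    if len(bits) != len(key):
        raise ValueError("message and key must have the same length")
    return [m ^ r for m, r in zip(bits, key)]

message = [1, 0, 1, 1, 0, 0, 1]
# Stand-in for the shared secret bits r obtained by measuring EPR pairs.
key = [secrets.randbits(1) for _ in message]

cipher = one_time_pad(message, key)      # Alice: c = m XOR r
recovered = one_time_pad(cipher, key)    # Bob:   c XOR r = m
print(recovered == message)
```

Each key bit is used only once, which is why one EPR pair is consumed per message bit.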

How can we establish shared entanglement between Alice and Bob? Alice could for instance create the state locally and send it to Bob using a quantum channel (e.g. a glass fibre).

But how can we now verify that the joint state that Alice and Bob have after the transmission is an EPR state?

Protocol
1) Alice sends halves of n EPR pairs to Bob.
2) They randomly choose half of the pairs and perform the CHSH tests on them.
3) They get the key from the remaining half.

There are many technical details that I am glossing over here. One is, how can you be confident that the other halves are in this state? The de Finetti theorem! (the choice was permutation invariant)

In order to make this applicable in actual implementations, one may use the exponential de Finetti theorem (Renner) or the post-selection technique (Christandl, König, Renner).

Other issues: there may be noise on the line. It is indeed possible to do quantum key distribution even in this case, but here one needs some other tools, mainly relating to classical information theory (information reconciliation or privacy amplification).


Computational complexity of entanglement


10.1 More on CHSH games

We continue our discussion of the CHSH game.

[Figure: the Referee R sends r to Alice (A) and s to Bob (B); Alice returns a and Bob returns b.]

Alice and Bob win if a ⊕ b = r · s, i.e. they want a ⊕ b to be chosen according to this table:

r   s   desired a ⊕ b
0   0   0
0   1   0
1   0   0
1   1   1

Deterministic strategies. Consider a deterministic strategy. This means that if Alice receives r = 0, she outputs the bit a0 and if she receives r = 1, she outputs the bit a1. Similarly, Bob outputs b0 if he receives s = 0 and b1 if he receives s = 1.

There are four possible inputs. If they set a0 = a1 = b0 = b1 = 0, then they will succeed with probability 3/4. Can they do better? For a deterministic strategy this can only mean winning with probability 1. But this implies that

a0 ⊕ b0 = 0

a0 ⊕ b1 = 0

a1 ⊕ b0 = 0

a1 ⊕ b1 = 1

Adding this up (and using x ⊕ x = 0) we find 0 = 1, a contradiction.

Randomized strategies. What if Alice and Bob share some correlated random variable and choose a deterministic strategy based on this? Then the payoff is the average of the payoffs of each of the deterministic strategies. Thus, there must always be at least one deterministic strategy that does at least as well as the average. So we can assume that an optimal strategy does not make use of randomness. (Exercise: what if they use uncorrelated randomness? Can this help?)
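The bound of 3/4 for deterministic strategies can be confirmed by exhaustive search, since there are only 16 of them. The following is a minimal sketch; the function name `win_probability` is illustrative.

```python
from itertools import product

def win_probability(a0, a1, b0, b1):
    """Fraction of the four equally likely inputs (r, s) on which a XOR b equals r AND s."""
    a, b = (a0, a1), (b0, b1)
    wins = sum((a[r] ^ b[s]) == (r & s) for r, s in product((0, 1), repeat=2))
    return wins / 4

# Enumerate all 16 deterministic strategies (a0, a1, b0, b1).
best = max(win_probability(*strategy) for strategy in product((0, 1), repeat=4))
print(best)
```

By the averaging argument above, the same value bounds every strategy that uses shared randomness.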

Quantum strategies. Now suppose they share the state |Φ+〉. Define

|φ0(θ)〉 = cos(θ)|0〉 + sin(θ)|1〉
|φ1(θ)〉 = − sin(θ)|0〉 + cos(θ)|1〉

Observe that {|φ0(θ)〉, |φ1(θ)〉} is an orthonormal basis for any choice of θ.

The strategy is as follows. Alice and Bob will each measure their half of the entangled state in the basis {|φ0(θ)〉, |φ1(θ)〉} for some choice of θ that depends on their inputs. They will output 0 or 1, depending on their measurement outcome. The choices of θ are

Alice                Bob
r = 0   θ = 0        s = 0   θ = π/8
r = 1   θ = π/4      s = 1   θ = −π/8

As an exercise, show that Pr[win] = cos^2(π/8) = 1/2 + 1/(2√2) > 3/4.
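The winning probability of this strategy can also be evaluated numerically. The sketch below is an illustration with hypothetical helper names: it uses the fact that, for the real basis vectors defined above, the amplitude for outcomes (a, b) on |Φ+〉 is the inner product of Alice's and Bob's basis vectors divided by √2, and then sums the probabilities of the winning outcomes over the four inputs.

```python
from math import cos, sin, pi, sqrt

def basis(theta):
    """The vectors |phi_0(theta)> and |phi_1(theta)> as pairs of real amplitudes."""
    return [(cos(theta), sin(theta)), (-sin(theta), cos(theta))]

ALICE = {0: 0.0, 1: pi / 4}      # Alice's angle for input r
BOB = {0: pi / 8, 1: -pi / 8}    # Bob's angle for input s

def outcome_probability(a, b, theta_a, theta_b):
    """Probability of outcomes (a, b) when |Phi+> is measured in the two rotated bases."""
    va, vb = basis(theta_a)[a], basis(theta_b)[b]
    amplitude = (va[0] * vb[0] + va[1] * vb[1]) / sqrt(2)
    return amplitude ** 2

def quantum_win_probability():
    total = 0.0
    for r in (0, 1):
        for s in (0, 1):
            for a in (0, 1):
                for b in (0, 1):
                    if (a ^ b) == (r & s):
                        total += outcome_probability(a, b, ALICE[r], BOB[s]) / 4
    return total

print(quantum_win_probability(), cos(pi / 8) ** 2)
```

Each of the four input pairs contributes the same winning probability, because the relevant angle difference is always ±π/8 or 3π/8.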

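Another way to look at the quantum strategy is in terms of local observables. Alice and Bob’s strategy can be described in terms of the matrices

$$A_0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad A_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad B_0 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad B_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ -1 & -1 \end{pmatrix}.$$

Given a state |ψ〉, the value of the game can be expressed in terms of

$$\frac{1}{4}\langle\psi|(A_0 \otimes B_0 + A_0 \otimes B_1 + A_1 \otimes B_0 - A_1 \otimes B_1)|\psi\rangle = \Pr[\text{win}] - \Pr[\text{lose}] = 2\Pr[\text{win}] - 1.$$

We can define a Hermitian matrix W′ by

$$W' = A_0 \otimes B_0 + A_0 \otimes B_1 + A_1 \otimes B_0 - A_1 \otimes B_1.$$

Then, for any σ ∈ Sep,

$$\frac{1}{4}\mathrm{tr}(W'\sigma) \le 2 \max \Pr[\text{win}] - 1 = \frac{3}{2} - 1 = \frac{1}{2}.$$

Define W = I/2 − (1/4)W′. Then for all σ ∈ Sep, tr(Wσ) ≥ 0, while tr(W|Φ+〉〈Φ+|) = 1/2 − 1/√2 < 0.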

Thus Bell inequalities define entanglement witnesses; moreover, ones that distinguish an entangled state even from separable states over unbounded dimension that are measured with possibly different measurement operators.

There has been some exciting recent work on the CHSH game. One recent line of work has been on the rigidity property, which states that any quantum strategy that comes within ε of the optimal value 1/2 + 1/(2√2) must be within ε′ of the ideal strategy (up to some trivial changes). This is relevant to the field of device-independent quantum information processing, which attempts to draw conclusions about an untrusted quantum device based only on local measurement outcomes. (For more references see arXiv:1203.2976 and arXiv:1303.3081.)

10.2 Computational complexity

Problem 1. Weak membership.
Given ρAB ∈ D(C^n ⊗ C^m), ε > 0, and the promise that either

1. ρAB ∈ Sep, or

2. D(ρ, Sep) = min_{σ∈Sep} D(ρ, σ) ≥ ε,

decide which is the case.

This is called “weak” membership because of the ε > 0 parameter, which means we don’t have to worry too much about numerical precision.

There are many choices of distance measure D(·, ·). We could take D(ρ, σ) = (1/2)‖ρ − σ‖_1, as we did earlier. Or we could use ‖ρ − σ‖_2, where ‖X‖_2 := √tr(X†X).

Another important problem related to Sep is called the support function. Like weak membership, it can be defined for any set, but we will focus on the case of Sep.

Problem 2. Support function of Sep.
Given M ∈ H(C^n ⊗ C^m) and ε > 0, compute h_Sep(M) ± ε, where

$$h_{\mathrm{Sep}}(M) := \max_{\sigma \in \mathrm{Sep}} \mathrm{tr}(M\sigma).$$

There is a sense in which problem 1 ≅ problem 2, meaning that an efficient solution for one can be turned into an efficient solution to the other. We omit the proof of this fact, which is a classic result in convex optimization [M. Grötschel, L. Lovász, A. Schrijver. Geometric Algorithms and Combinatorial Optimization. 1988].

Efficiency. What does it mean for a problem to be “efficiently” solvable? If we parametrize a problem by the size of the input, then we say a problem is efficiently solvable if n-bit inputs can be solved in time polynomial in n, i.e. in time ≤ c1 n^{c2} for some constants c1, c2. This class of problems is called P, which stands for Polynomial time. Examples include multiplication, finding eigenvalues, solving linear systems of equations, etc.

Another important class of problems are those where the solution can be efficiently checked. This is called NP, which stands for Nondeterministic Polynomial time. (The term “nondeterministic” is somewhat archaic, and refers to an imaginary computer that randomly checks a possible solution and needs only to succeed with some positive, possibly infinitesimal, probability.)

One example of a problem in NP is called 3-SAT. A 3-SAT instance is a formula over variables x1, . . . , xn ∈ {0, 1} consisting of an AND of m clauses, where each clause is an OR of three variables or their negations. Denoting OR with ∨, AND with ∧, and NOT xi with x̄i, an example of a formula would be

φ(x1, . . . , xn) = (x1 ∨ x4 ∨ x17) ∧ (x2 ∨ x7 ∨ x10) ∧ . . . .

Given a formula φ, it is not a priori obvious how we can figure out if it is satisfiable. One option is to check all possible values of x1, . . . , xn. But there are 2^n assignments to check, so this approach requires exponential time. Better algorithms are known, but none has been proven to run in time better than c^n for various constants c > 1. However, 3-SAT is in NP because if φ is satisfiable, then there exists a short “witness” proving this fact that we can quickly verify. This witness is simply a satisfying assignment x1, . . . , xn. Given φ and x1, . . . , xn together, it is easy to verify whether indeed φ(x1, . . . , xn) = 1.
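The asymmetry between checking a witness and finding one can be made concrete. The following is a minimal sketch under an assumed encoding that is not from the notes: a clause is a tuple of nonzero integers, where k stands for x_k and −k for its negation. Verification takes time linear in the formula size, while the naive search loops over all 2^n assignments. The sample formula is hypothetical.

```python
from itertools import product

def evaluate(formula, assignment):
    """Check a witness: every clause must contain at least one true literal."""
    return all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in formula
    )

def brute_force(formula, n):
    """Try all 2^n assignments; return a satisfying one, or None if none exists."""
    for bits in product((False, True), repeat=n):
        if evaluate(formula, bits):
            return bits
    return None

# Hypothetical instance: (x1 or not x2 or x3) and (not x1 or x2 or x3) and (not x1 or not x2 or not x3)
formula = [(1, -2, 3), (-1, 2, 3), (-1, -2, -3)]
witness = brute_force(formula, 3)
print(witness, evaluate(formula, witness))
```

The function `evaluate` plays the role of the efficient verifier in the definition of NP; `brute_force` is the exponential-time search.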

Figure 1: This figure is taken from the Wikipedia article http://en.wikipedia.org/wiki/Clique_(graph_theory). The 42 2-cliques are the edges, the 19 3-cliques are the triangles colored light blue and the 2 4-cliques are colored dark blue. There are no 5-cliques.

NP-hardness. It is generally very difficult to prove that a problem cannot be solved efficiently. For example, it is strongly believed that 3-SAT is not in P, but there is no proof of this conjecture. Instead, to establish hardness we need to settle for finding evidence that falls short of a proof.

Some of the strongest evidence we are able to obtain for this is to show that a problem is NP-hard, which means that any problem in NP can be efficiently reduced to it. For example, 3-SAT is NP-hard. This means that if we could solve 3-SAT instances of length n in time T(n), then any other problem in NP could be solved in time ≤ poly(T(poly(n))). In particular, if 3-SAT were in P then it would follow that P = NP.

It is conjectured that P ≠ NP, because it seems harder to find a solution in general than to recognize a solution. This is one of the biggest open problems in mathematics, and all partial results in this direction are much, much weaker. However, if we assume for now that P ≠ NP, then showing a problem is NP-hard implies that it is not in P. And since thousands of problems are known to be NP-hard,¹ it suffices to show a reduction from any NP-hard problem in order to show that a new problem is also NP-hard. Thus, this can be an effective method of showing that a problem is likely to be hard.

Theorem 10.1. Problems 1 and 2 are NP-hard for ε = 1/ poly(n,m).

We will give only a sketch of the proof.

1. Argue that MAX-CLIQUE is NP-hard. This is a classical result that we will not reproduce here. Given a graph G = (V, E) with vertices V and edges E, a clique is a subset S ⊆ V such that (i, j) ∈ E for each i, j ∈ S, i ≠ j. An example is given in Fig. 1. The MAX-CLIQUE problem asks for the size of the largest clique in a given graph.

¹ See this list: http://en.wikipedia.org/wiki/List_of_NP-complete_problems. The terminology NP-complete refers to problems that are both NP-hard and in NP.

2. Theorem 10.2 (Motzkin-Straus). Let G = (V, E) be a graph, with maximum clique of size W. Then

$$1 - \frac{1}{W} = 2 \max_{p} \sum_{(i,j) \in E} p_i p_j, \tag{10.1}$$

where the max is taken over all probability distributions p.

3. Given a graph, define

M =∑

(i,j)∈E

|i, j〉〈i, j|.

Thenmax|x〉〈x| ⊗ 〈x|M |x〉 ⊗ |x〉 = max

‖x‖2=1

∑(i,j)∈E

|xi|2|xj |2.

Defining pi = |xi|2, we have recovered the RHS of (10.1).

4. We argue thathSep(M) = max

|a〉,|b〉〈a, b|M |a, b〉.

This is because Sep is a convex set, its extreme points are of the form |a, b〉〈a, b|, and themaximum of any linear function over a convex set can be achieved by an extreme point.

5. Finally, we argue that maximizing over $|a,b\rangle$ is equivalent in difficulty to maximizing over $|a,a\rangle$.

What accuracy do we need here? If we want to distinguish a clique of size n (where there are n vertices) from one of size n − 1, then we need accuracy $(1 - \frac{1}{n}) - (1 - \frac{1}{n-1}) = \frac{1}{n(n-1)} \approx 1/n^2$. Thus, we have shown that problem 2 is NP-hard for ε = 1/n².
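As a quick sanity check of the Motzkin-Straus identity (10.1) and of the accuracy estimate, the following Python sketch brute-forces the maximum clique of a small graph and compares 1 − 1/W with 2 Σ p_i p_j. The example graph, the function names and the grid resolution are illustrative assumptions, not part of the lecture; the grid search only probes distributions whose entries are multiples of 1/6.

```python
from fractions import Fraction
from itertools import combinations


def max_clique(n, edges):
    """Return a largest clique of the graph on vertices 0..n-1 (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(frozenset(pair) in edge_set for pair in combinations(subset, 2)):
                return subset
    return ()


def motzkin_straus_value(p, edges):
    """Evaluate 2 * sum over edges (i, j) of p_i * p_j."""
    return 2 * sum(p[i] * p[j] for i, j in edges)


def grid_distributions(n, total, denom):
    """All distributions on n points with entries k/denom that sum to total/denom."""
    if n == 1:
        yield [Fraction(total, denom)]
        return
    for k in range(total + 1):
        for rest in grid_distributions(n - 1, total - k, denom):
            yield [Fraction(k, denom)] + rest


def accuracy_gap(n):
    """Difference of the left-hand side of (10.1) between W = n and W = n - 1."""
    return (1 - Fraction(1, n)) - (1 - Fraction(1, n - 1))


# Hypothetical example graph: a triangle {0, 1, 2} with a path 2-3-4 attached.
n = 5
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]

clique = max_clique(n, edges)
W = len(clique)

# The uniform distribution on a maximum clique attains the optimum in (10.1).
p = [Fraction(1, W) if v in clique else Fraction(0) for v in range(n)]
value = motzkin_straus_value(p, edges)

# A coarse grid search over distributions does not exceed this value.
best = max(motzkin_straus_value(q, edges) for q in grid_distributions(n, 6, 6))
```

Exact rational arithmetic is used so that the comparison with 1 − 1/W is an equality test rather than a floating-point approximation.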



Three Qubit Entanglement Polytopes

Lecturer: Michael Walter Lecture 11

Last time we talked about SLOCC (stochastic LOCC), where we can post-select on particular outcomes.

Given the class of states that can be interconverted with $|\psi\rangle$ by SLOCC, $C_\psi = \{|\phi\rangle : |\phi\rangle \leftrightarrow |\psi\rangle\}$, a result by Dür, Vidal and Cirac says that

\[ C_\psi = \{ (A\otimes B\otimes C)|\psi\rangle / \|\ldots\| : A,B,C \in SL(d) \}. \tag{11.1} \]

For three qubits there is a simple classification of all possible types of entanglement. Apart from product states, and states with only bipartite entanglement, the two remaining classes have the following representative states:

\[ |GHZ\rangle = \frac{1}{\sqrt{2}}\big(|000\rangle + |111\rangle\big) \tag{11.2} \]

and

\[ |W\rangle = \frac{1}{\sqrt{3}}\big(|001\rangle + |010\rangle + |100\rangle\big). \tag{11.3} \]

The class of SLOCC operations forms a group:

G = {A⊗B ⊗ C : A,B,C ∈ SL(d)}. (11.4)

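An easy-to-check fact is that $SL(d) = \{e^X : \mathrm{tr}(X) = 0\}$. Therefore

\[ G = \{ e^A \otimes e^B \otimes e^C = e^{A\otimes I\otimes I + I\otimes B\otimes I + I\otimes I\otimes C} : \mathrm{tr}(A) = \mathrm{tr}(B) = \mathrm{tr}(C) = 0 \}. \tag{11.5} \]

We denote $C_{\psi_{ABC}}$ by $G.|\psi\rangle_{ABC}$.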

11.1 Quantum Marginal Problem for Three Parties

What are the possible $\rho_A, \rho_B, \rho_C$ compatible with a pure state $|\phi\rangle_{ABC} \in C_{\psi_{ABC}}$? We saw before that this only depends on the spectra $\lambda^A$, $\lambda^B$ and $\lambda^C$, as one can always apply local unitaries and change the basis.

For example, for the W class the set of compatible spectra is given by the inequality $\lambda^A_{\max} + \lambda^B_{\max} + \lambda^C_{\max} \ge 2$.

Let us start with a simpler problem, namely given a state $|\psi\rangle_{ABC}$, does there exist a state in $G.|\psi\rangle_{ABC}$ with $\rho_A = \rho_B = \rho_C = I/d$? This is equivalent to

\[ \mathrm{tr}(\rho_A A) = \mathrm{tr}(\rho_B B) = \mathrm{tr}(\rho_C C) = 0 \tag{11.6} \]

for all Hermitian traceless matrices A, B, C, which in turn is equivalent to

\[ \mathrm{tr}(\rho_A A) + \mathrm{tr}(\rho_B B) + \mathrm{tr}(\rho_C C) = 0 \tag{11.7} \]

for all Hermitian traceless matrices A, B, C. We can write it as

\[ \langle\psi_{ABC}| A\otimes I\otimes I + I\otimes B\otimes I + I\otimes I\otimes C |\psi_{ABC}\rangle = 0. \tag{11.8} \]

Thus the norm of the state $|\psi_{ABC}\rangle$ should not change (to first order) when we apply an infinitesimal SLOCC operation.

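Let us look at the derivative of the squared norm at t = 0, for Hermitian A, B, C:

\[ \frac{\partial}{\partial t}\Big|_{t=0} \big\| e^{tA}\otimes e^{tB}\otimes e^{tC} |\psi_{ABC}\rangle \big\|^2 = \frac{\partial}{\partial t}\Big|_{t=0} \langle\psi_{ABC}| e^{2tA}\otimes e^{2tB}\otimes e^{2tC} |\psi_{ABC}\rangle = 2\langle\psi_{ABC}| A\otimes I\otimes I + I\otimes B\otimes I + I\otimes I\otimes C |\psi_{ABC}\rangle. \tag{11.9} \]

So from Eq. (11.8), if $|\psi_{ABC}\rangle$ is the closest point to the origin in $G.|\phi_{ABC}\rangle$, then $\rho_A = \rho_B = \rho_C = I/d$.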

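What happens when there is no point in the class with $\rho_A = \rho_B = \rho_C = I/d$? That seems strange, as it implies by the above that there is no closest point to the origin. But indeed this is the case for the $|W\rangle$ class, for example. Consider

\[ \begin{pmatrix} \varepsilon & 0 \\ 0 & 1/\varepsilon \end{pmatrix} \otimes \begin{pmatrix} \varepsilon & 0 \\ 0 & 1/\varepsilon \end{pmatrix} \otimes \begin{pmatrix} \varepsilon & 0 \\ 0 & 1/\varepsilon \end{pmatrix} |W\rangle = \varepsilon |W\rangle, \tag{11.10} \]

and when ε goes to zero, one approaches the origin. However the limit is not in $G.|W\rangle$.

In general, we have: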

Theorem 11.1. (Kempf-Ness) The following are equivalent:

• There exists a closest point to 0 in G.|φABC〉.

• There exists a quantum state in G.|φABC〉 with ρA = ρB = ρC = I/d.

• G.|φABC〉 is closed.

The theorem says that there is no point in $G.|W\rangle$ with maximally mixed reductions. How about if we look at the closure of $G.|W\rangle$? A fact is that the closure of $G.|\phi_{ABC}\rangle$, for every $|\phi_{ABC}\rangle$, contains a unique closed orbit.

Corollary 11.2. There exists a quantum state in the closure of $G.|\phi_{ABC}\rangle$ with maximally mixed reductions if, and only if, $0 \notin \overline{G.|\phi_{ABC}\rangle}$.

We saw before that 0 is in the closure of the $|W\rangle$ class. Therefore states with maximally mixed reductions cannot be approximated (to arbitrary accuracy) by states in the $|W\rangle$ class.
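The scaling in Eq. (11.10) can be checked numerically. The sketch below applies diag(ε, 1/ε), which has determinant 1, to each qubit and tracks the norm; for contrast it does the same for the GHZ state, whose norm does not shrink under this particular family. The dictionary representation of states and the function names are illustrative assumptions.

```python
from math import sqrt


def apply_local_diag(state, eps):
    """Apply diag(eps, 1/eps) to each of the three qubits.

    `state` maps bit strings such as (0, 0, 1) to amplitudes.
    """
    out = {}
    for bits, amp in state.items():
        factor = 1.0
        for b in bits:
            factor *= eps if b == 0 else 1.0 / eps
        out[bits] = amp * factor
    return out


def norm(state):
    """Euclidean norm of a state given as a dictionary of amplitudes."""
    return sqrt(sum(abs(a) ** 2 for a in state.values()))


w_state = {(0, 0, 1): 1 / sqrt(3), (0, 1, 0): 1 / sqrt(3), (1, 0, 0): 1 / sqrt(3)}
ghz_state = {(0, 0, 0): 1 / sqrt(2), (1, 1, 1): 1 / sqrt(2)}

for eps in (1.0, 0.5, 0.1, 0.01):
    w_norm = norm(apply_local_diag(w_state, eps))      # equals eps
    ghz_norm = norm(apply_local_diag(ghz_state, eps))  # sqrt((eps^6 + eps^-6) / 2)
    print(f"eps={eps}: |W| norm {w_norm:.4g}, |GHZ| norm {ghz_norm:.4g}")
```

Every term of |W⟩ has two 0's and one 1, so each picks up the factor ε·ε·(1/ε) = ε, which is why the whole vector is simply rescaled.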

If we are given a class $G.|\phi\rangle$ and we would like to show that its closure does not contain the origin, we can find a function which separates the sets. It turns out we can always choose a polynomial P for that, such that P(0) = 0 and $P(|\psi\rangle) \ne 0$ for all $|\psi\rangle \in G.|\phi\rangle$. We can choose the polynomial P to be G-invariant (i.e. $P(|\psi\rangle) = P(g.|\psi\rangle)$ for all g) and homogeneous. The converse is also true, so we find that:

Theorem 11.3. $0 \notin \overline{G.|\phi_{ABC}\rangle}$ if, and only if, there exists a G-invariant homogeneous polynomial P such that P(0) = 0 and $P(|\phi_{ABC}\rangle) \ne 0$.

This new characterization doesn't look particularly useful at first sight since we have to check all homogeneous G-invariant polynomials. However it turns out that we only have to check a finite number of polynomials since the set of G-invariant polynomials is finitely generated (ref). A particular case of this result for three-qubit states is that any G-invariant polynomial is a sum of powers of Cayley's hyperdeterminant.

Next lecture we will see how we can use representation theory to study the G-invariant polynomials.



High dimensional entanglement


Today, I will tell you about bizarre things that can happen with entanglement of high dimensional quantum states. Recall from Fernando's lecture that

\[ h_{SEP}(M) = \max_{\sigma\in SEP} \mathrm{tr}\, M\sigma, \]

where {M, id − M} are the yes/no outcomes of a POVM. He also showed that it is NP-hard to compute this quantity exactly in general. So, here we want to consider approximations to this quantity that we can compute more easily.

For this we introduce approximations to the set of separable states based on the concept of n-extendibility.

Definition 4. Let $\rho_{AB} \in D(\mathbb{C}^{d_A}\otimes\mathbb{C}^{d_B})$. We say that $\rho_{AB}$ is n-extendible if there exists a state $\rho_{AB_1\cdots B_n} \in D(\mathbb{C}^{d_A}\otimes \mathrm{Sym}^n(\mathbb{C}^{d_B}))$ such that $\rho_{AB} = \mathrm{tr}_{B_2\cdots B_n}\, \rho_{AB_1\cdots B_n}$.

It turns out that the set of n-extendible states is a good outer approximation to the set of separable states that gets better and better as n increases. But let us first check that the set of separable states is contained in it, that is, that every separable $\rho_{AB}$ is n-extendible. This can be seen by writing the separable state $\rho_{AB}$ in the form $\sum_i p_i |\alpha_i\rangle\langle\alpha_i| \otimes |\beta_i\rangle\langle\beta_i|$. A symmetric extension is then easily seen to be given by $\rho_{AB_1\cdots B_n} = \sum_i p_i |\alpha_i\rangle\langle\alpha_i| \otimes |\beta_i\rangle\langle\beta_i|^{\otimes n}$. The approximation statement is then contained in the following theorem.

Theorem 12.1. If $\rho_{AB}$ is n-extendible, then there is a separable state σ with $\frac{1}{2}\|\rho - \sigma\|_1 \le \frac{d}{n}$.

The proof of this theorem is very similar to the proof of the quantum de Finetti theorem which we did yesterday (in fact, you could adapt the proof as an exercise if you wish).

As a corollary it now follows that we can approximate $h_{SEP}(M)$ by

\[ h_{n\text{-ext}}(M) := \max_{\rho \in n\text{-ext}} \mathrm{tr}\, M\rho. \]

Corollary 12.2. For all 0 ≤ M ≤ id:

\[ h_{sep}(M) \le h_{n\text{-ext}}(M) \le h_{sep}(M) + \frac{d}{n}. \]

The lower bound follows directly from the fact that the set of separable states is contained in the set of n-extendible states (it even holds for all Hermitian M without the restriction 0 ≤ M ≤ id). For the upper bound, we use the theorem and the observation that

\[ \max_{0\le M\le \mathrm{id}} \mathrm{tr}\, M(\rho-\sigma) = \frac{1}{2}\|\rho-\sigma\|_1 \]

and obtain

\[ h_{n\text{-ext}}(M) = \max_{\rho\in n\text{-ext}} \mathrm{tr}\, M\rho \le \max_{\sigma\in sep} \mathrm{tr}\, M\sigma + \frac{d}{n}. \]

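We now want to see how difficult it is to compute $h_{n\text{-ext}}(M)$. We rewrite $h_{n\text{-ext}}(M)$ in the form

\begin{align}
h_{n\text{-ext}}(M) &= \max_{|\psi\rangle \in \mathbb{C}^{d_A}\otimes \mathrm{Sym}^n(\mathbb{C}^{d_B})} \langle\psi| M\otimes \mathrm{id}^{\otimes (n-1)} |\psi\rangle \tag{12.1} \\
&= \max_{|\psi\rangle \in \mathbb{C}^{d_A}\otimes \mathrm{Sym}^n(\mathbb{C}^{d_B})} \mathrm{tr}\big[ M\, \mathrm{tr}_{n-1} |\psi\rangle\langle\psi| \big] \tag{12.2} \\
&= \lambda_{\max}\big(\Pi_{sym}\, (M\otimes \mathrm{id}^{\otimes (n-1)})\, \Pi_{sym}\big). \tag{12.3}
\end{align}

Hence, the effort to compute $h_{n\text{-ext}}(M)$ is polynomial in $d^{n+1}$. In order to obtain an ε-approximation to $h_{sep}(M)$ we have to choose n = d/ε according to the corollary. Hence the effort to approximate $h_{sep}(M)$ up to accuracy ε is polynomial in $d^{d/\varepsilon + 1}$, i.e. exponential in d/ε.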
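To get a feeling for these sizes, the sketch below tabulates the number of copies n needed for accuracy ε and the dimension of $\mathbb{C}^{d_A}\otimes \mathrm{Sym}^n(\mathbb{C}^{d_B})$, using the standard fact that $\mathrm{Sym}^n(\mathbb{C}^d)$ has dimension $\binom{n+d-1}{n}$. The function names and sample parameters are illustrative assumptions, and this is only a dimension count, not a run-time analysis.

```python
from math import ceil, comb


def sym_dim(d, n):
    """Dimension of Sym^n(C^d), the symmetric subspace of n copies of C^d."""
    return comb(n + d - 1, n)


def extension_dim(d_a, d_b, n):
    """Dimension of C^{d_a} tensor Sym^n(C^{d_b})."""
    return d_a * sym_dim(d_b, n)


def copies_needed(d, eps):
    """Smallest integer n with d / n <= eps, as suggested by Corollary 12.2."""
    return ceil(d / eps)


for d in (2, 4, 8):
    for eps in (0.5, 0.25):
        n = copies_needed(d, eps)
        print(f"d={d}, eps={eps}: n={n}, "
              f"full dimension d^(n+1)={d ** (n + 1)}, "
              f"symmetric dimension={extension_dim(d, d, n)}")
```

Restricting to the symmetric subspace shrinks the matrix in (12.3) considerably, but with n = d/ε the dimension still grows quickly as d increases.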

Actually this is optimal for general M. In order to see why, we are going to employ a quantum state known as the antisymmetric state (it is also known as the universal counterexample to any conjecture in entanglement theory which you may have). The antisymmetric state comes in a pair with the symmetric state.

The symmetric state is

\[ \rho_{sym} = \frac{\Pi^{d,2}_{sym}}{d(d+1)/2} = \frac{\mathrm{id} + F}{d(d+1)}. \]

It is indeed separable, because $\frac{\Pi^{d,2}_{sym}}{d(d+1)/2} = \int d\phi\, |\phi\rangle\langle\phi|^{\otimes 2}$.

The antisymmetric state is

\[ \rho_{anti} = \frac{\mathrm{id} - \Pi^{d,2}_{sym}}{d(d-1)/2} = \frac{\mathrm{id} - F}{d(d-1)}. \]

This antisymmetric state is funny because

1. it is very far from separable: for all separable σ, $\frac{1}{2}\|\rho_{anti} - \sigma\|_1 \ge \frac{1}{2}$;

2. it is very extendible: more precisely, two copies $\rho_{anti}\otimes\rho_{anti}$ are (d − 1)-extendible.

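Let us first see why 1. holds. For this note that for $M = \Pi^{d,2}_{sym}$: $\mathrm{tr}\, M\rho_{anti} = 0$. On the other hand

\[ \mathrm{tr}\, M\sigma = \mathrm{tr}(\sigma/2 + F\sigma/2) = \frac{1}{2} + \frac{1}{2}\mathrm{tr}\, F\sigma. \]

In order to bound $\mathrm{tr}\, F\sigma$ note that

\[ \mathrm{tr}\, F(X\otimes Y) = \sum_{i,j} \langle i,j|F(X\otimes Y)|i,j\rangle = \sum_{i,j} \langle j,i|(X\otimes Y)|i,j\rangle = \sum_{i,j} X_{ji}Y_{ij} = \mathrm{tr}\, XY. \]

Hence, for a separable state $\sigma = \sum_i p_i |\alpha_i\rangle\langle\alpha_i| \otimes |\beta_i\rangle\langle\beta_i|$,

\begin{align}
\mathrm{tr}\, M\sigma &= \frac{1}{2} + \frac{1}{2}\sum_i p_i\, \mathrm{tr}\, F\, |\alpha_i\rangle\langle\alpha_i| \otimes |\beta_i\rangle\langle\beta_i| \tag{12.4} \\
&= \frac{1}{2} + \frac{1}{2}\sum_i p_i |\langle\alpha_i|\beta_i\rangle|^2 \ge \frac{1}{2}. \tag{12.5}
\end{align}

Since 0 ≤ M ≤ id, it follows that $\frac{1}{2}\|\rho_{anti} - \sigma\|_1 \ge \mathrm{tr}\, M(\sigma - \rho_{anti}) \ge \frac{1}{2}$.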
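The swap identity tr F(X ⊗ Y) = tr XY and the bound (12.5) can be checked on random product states. The sketch below evaluates tr F(|α⟩⟨α| ⊗ |β⟩⟨β|) entrywise from the definition F|i,j⟩ = |j,i⟩ and compares it with |⟨α|β⟩|². The helper names and the random sampling are illustrative assumptions.

```python
import random


def inner(u, v):
    """Inner product <u|v> of two complex vectors given as lists."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def random_unit_vector(d, rng):
    """A random normalized vector in C^d (Gaussian entries, then rescaled)."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d)]
    length = abs(inner(v, v)) ** 0.5
    return [x / length for x in v]


def swap_expectation(alpha, beta):
    """tr F (|alpha><alpha| tensor |beta><beta|), summed entry by entry."""
    d = len(alpha)
    total = 0
    for i in range(d):
        for j in range(d):
            # <i,j| F (X tensor Y) |i,j> = <j,i| X tensor Y |i,j> = X_ji * Y_ij
            x_ji = alpha[j] * alpha[i].conjugate()
            y_ij = beta[i] * beta[j].conjugate()
            total += x_ji * y_ij
    return total


def sym_projector_expectation(alpha, beta):
    """tr M sigma for M the symmetric projector and sigma a pure product state."""
    return 0.5 + 0.5 * swap_expectation(alpha, beta).real


rng = random.Random(0)
for _ in range(5):
    a, b = random_unit_vector(4, rng), random_unit_vector(4, rng)
    print(round(sym_projector_expectation(a, b), 4), ">= 0.5")
```

The minimum value 1/2 is reached exactly when ⟨α|β⟩ = 0, for example for α = |0⟩ and β = |1⟩.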


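In order to see that 2. holds, note that

\[ \rho_{anti} = \frac{2}{d(d-1)} \sum_{1\le i < j\le d} \frac{|ij\rangle - |ji\rangle}{\sqrt{2}}\, \frac{\langle ij| - \langle ji|}{\sqrt{2}}. \]

Consider now the following state on d systems (known as a Slater determinant):

\[ |\psi\rangle = \frac{1}{\sqrt{d!}} \sum_{\pi\in S_d} \mathrm{sgn}(\pi)\, |\pi(1)\rangle\otimes\cdots\otimes|\pi(d)\rangle, \]

where we introduced the sign of a permutation

\[ \mathrm{sgn}(\pi) = (-1)^{\#\text{transpositions}} = \det\Big(\sum_{i=1}^{d} |\pi(i)\rangle\langle i|\Big). \]

Let us now verify that the Slater determinant extends $\rho_{anti}$, i.e. that

\[ \rho_{anti} = \mathrm{tr}_{3\cdots d}\, |\psi\rangle\langle\psi|. \]

This can be done by a quick direct calculation of the partial trace using the formula

\[ \mathrm{tr}_{3\cdots d}\, |\psi\rangle\langle\psi| = \sum_{i_3\cdots i_d} \big(\mathrm{id}\otimes\langle i_3\cdots i_d|\big)\, |\psi\rangle\langle\psi|\, \big(\mathrm{id}\otimes|i_3\cdots i_d\rangle\big), \]

resulting in

\[ \frac{1}{d(d-1)} \sum_{j < k} |jk - kj\rangle\langle jk - kj|. \]

Note that the extension we constructed was actually antisymmetric and not symmetric. But if we take two copies of the antisymmetric state, the negative signs cancel out:

\[ |\psi\rangle|\psi\rangle \in \mathrm{Sym}^d(\mathbb{C}^d\otimes\mathbb{C}^d), \]

and we thus have the desired (d − 1)-extension of $\rho_{anti}\otimes\rho_{anti}$.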
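The claim that the Slater determinant extends the antisymmetric state can be verified by brute force for small d. The sketch below builds the amplitudes of |ψ⟩, traces out systems 3, ..., d, and compares the result entry by entry with (id − F)/(d(d − 1)). The dictionary-based representation and the function names are illustrative assumptions.

```python
from itertools import permutations, product
from math import factorial, sqrt


def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def slater_determinant(d):
    """Amplitudes of the Slater determinant on d systems of dimension d."""
    norm = 1 / sqrt(factorial(d))
    return {perm: sign(perm) * norm for perm in permutations(range(d))}


def two_body_marginal(psi):
    """Reduced state on the first two systems (all amplitudes are real)."""
    rho = {}
    for s, amp_s in psi.items():
        for t, amp_t in psi.items():
            if s[2:] == t[2:]:  # systems 3..d are traced out
                key = (s[:2], t[:2])
                rho[key] = rho.get(key, 0.0) + amp_s * amp_t
    return rho


def rho_anti(d):
    """Nonzero entries of (id - F) / (d (d - 1)) in the product basis."""
    rho = {}
    for a, b in product(range(d), repeat=2):
        for c, e in product(range(d), repeat=2):
            # <a,b| id |c,e> - <a,b| F |c,e>, using F|c,e> = |e,c>
            val = ((a == c and b == e) - (a == e and b == c)) / (d * (d - 1))
            if val != 0:
                rho[((a, b), (c, e))] = val
    return rho


def max_difference(r1, r2):
    """Largest entrywise difference between two sparse matrices."""
    keys = set(r1) | set(r2)
    return max(abs(r1.get(k, 0.0) - r2.get(k, 0.0)) for k in keys)


for d in (2, 3, 4):
    diff = max_difference(two_body_marginal(slater_determinant(d)), rho_anti(d))
    print(f"d={d}: max entrywise difference {diff:.2e}")
```

The loop over pairs of permutations costs (d!)², so this check is only meant for very small d.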



LOCC distinguishability


13.1 Data Hiding

Review of bad news from yesterday.

1. The weak membership problem (determining whether $\rho_{AB} \in$ Sep or $D(\rho_{AB}, \mathrm{Sep}) \ge \varepsilon$ given the promise that one of these holds) is NP-hard for ε = 1/poly(dim).

2. k-extendability does not give a good approximation in trace norm until k ≥ d, which corresponds to an algorithm that takes time exponential in d.

Let's look more closely at what went wrong with using k-extendable states to approximate Sep. The state is

\[ W^-_{AB} = \frac{I - F}{d(d-1)} \in D(\mathbb{C}^d\otimes\mathbb{C}^d). \]

It is k-extendable for k = d − 1, and satisfies $\min_{\sigma\in \mathrm{Sep}} \frac{1}{2}\|W^-_{AB} - \sigma_{AB}\|_1 = \frac{1}{2}$. Here the trace distance describes our ability to distinguish $W^-$ and σ using arbitrary two-outcome measurements {M, I − M} satisfying only 0 ≤ M ≤ I. However, since arbitrary measurements can be hard to implement, it is often reasonable to consider the smaller class of measurements that can be implemented with LOCC.

Locality-restricted measurements

• Define the LOCC norm to be

\[ \frac{1}{2}\|\rho_{AB} - \sigma_{AB}\|_{LOCC} := \max_{\substack{0\le M\le I \\ \{M, I-M\}\in LOCC}} |\mathrm{tr}(M(\rho-\sigma))|. \]

• Define the 1-LOCC norm to be analogous, but with LOCC replaced with 1-LOCC, which stands for "one-way LOCC." This means that one party (by convention, Bob) makes a measurement, sends the outcome to Alice and she makes a measurement based on this message. The resulting measurements always have the form

\[ M = \sum_k A_k\otimes B_k, \qquad 0\le A_k\le I \text{ for each } k, \qquad 0\le B_k \text{ for each } k, \qquad \sum_k B_k = I. \]

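Is $W^-$ still far from Sep in the LOCC norm? Observe that

\[ \min_{\sigma\in \mathrm{Sep}} \frac{1}{2}\|W^-_{AB} - \sigma_{AB}\|_{LOCC} \le \frac{1}{2}\|W^-_{AB} - W^+_{AB}\|_{LOCC}, \]

since $W^+ := \frac{I+F}{d(d+1)} = \int d\theta\, |\theta\rangle\langle\theta|^{\otimes 2}$ is separable.

Observe that if {M, I − M} ∈ LOCC then we have $M = \sum_k A_k\otimes B_k$ and $I - M = \sum_k A'_k\otimes B'_k$ with each $A_k, B_k, A'_k, B'_k \ge 0$. Thus

\[ 0 \le M^{T_A} \le I. \tag{13.1} \]

We can then further relax

\begin{align}
\frac{1}{2}\|W^-_{AB} - W^+_{AB}\|_{LOCC} &\le \max_{\substack{0\le M\le I \\ 0\le M^{T_A}\le I}} \mathrm{tr}\big(M(W^+ - W^-)\big) \\
&= \max_{\substack{0\le M\le I \\ 0\le M^{T_A}\le I}} \mathrm{tr}\big(M^{T_A}((W^+)^{T_A} - (W^-)^{T_A})\big) \\
&\le \frac{1}{2}\|(W^+)^{T_A} - (W^-)^{T_A}\|_1.
\end{align}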

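To evaluate this last quantity, observe that if $F = \sum_{i,j=1}^{d} |i,j\rangle\langle j,i|$, then $F^{T_A} = d\,\Phi_+$ where $\Phi_+ := |\Phi_+\rangle\langle\Phi_+|$ and $|\Phi_+\rangle = \frac{1}{\sqrt{d}}\sum_{i=1}^{d} |i,i\rangle$. Then

\[ (W^-)^{T_A} = \Big(\frac{I - F}{d(d-1)}\Big)^{T_A} = \frac{I - F^{T_A}}{d(d-1)} = \frac{I - d\,\Phi_+}{d(d-1)}, \]

\[ (W^+)^{T_A} = \Big(\frac{I + F}{d(d+1)}\Big)^{T_A} = \frac{I + F^{T_A}}{d(d+1)} = \frac{I + d\,\Phi_+}{d(d+1)}. \]

Now we can calculate

\begin{align}
\frac{1}{2}\big\|(W^-)^{T_A} - (W^+)^{T_A}\big\|_1 &= \frac{1}{2}\Big\|\frac{I - d\,\Phi_+}{d(d-1)} - \frac{I + d\,\Phi_+}{d(d+1)}\Big\|_1 \\
&= \Big\|\frac{I}{d(d-1)(d+1)} - \frac{d\,\Phi_+}{(d-1)(d+1)}\Big\|_1 \\
&= \frac{2}{d},
\end{align}

since the operator inside the last norm has eigenvalue −1/d on $|\Phi_+\rangle$ and eigenvalue $\frac{1}{d(d-1)(d+1)}$ on the (d² − 1)-dimensional orthogonal complement.

This is an example of data hiding. The states $W^+, W^-$ are perfectly distinguishable with global measurements, but can only be distinguished with bias ≤ 2/d using LOCC measurements.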
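The trace norm of the difference of partial transposes can be double-checked in exact arithmetic. The sketch below uses only the fact that an operator of the form a·I + b·Φ₊, with Φ₊ a rank-one projector on a d²-dimensional space, has eigenvalue a + b once and eigenvalue a with multiplicity d² − 1. The function name is an illustrative assumption.

```python
from fractions import Fraction


def half_trace_norm_partial_transposes(d):
    """Return (1/2) * || (W-)^{T_A} - (W+)^{T_A} ||_1 for local dimension d.

    Uses (W-)^{T_A} = (I - d Phi) / (d (d - 1)) and
    (W+)^{T_A} = (I + d Phi) / (d (d + 1)), so that the difference is
    a * I + b * Phi with the coefficients computed below.
    """
    a = Fraction(1, d * (d - 1)) - Fraction(1, d * (d + 1))
    b = -Fraction(d, d * (d - 1)) - Fraction(d, d * (d + 1))
    # Eigenvalues of a*I + b*Phi: (a + b) once, a with multiplicity d^2 - 1.
    trace_norm = abs(a + b) + (d * d - 1) * abs(a)
    return trace_norm / 2


for d in (2, 3, 10, 100):
    print(d, half_trace_norm_partial_transposes(d))
```

For d = 2 the resulting bound equals 1 and is therefore trivial; the hiding effect only appears as d grows.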

13.2 Better de Finetti theorems for 1-LOCC measurements

This data hiding example raises the hope that a more useful version of the de Finetti theorem might hold when we look at 1-LOCC measurements. Indeed, we will see that the following improved de Finetti theorem does hold:

Theorem 13.1. If $\rho_{AB} \in D(\mathbb{C}^{d_A}\otimes\mathbb{C}^{d_B})$ is k-extendable then

\[ \min_{\sigma\in \mathrm{Sep}} \|\rho_{AB} - \sigma_{AB}\|_{1\text{-LOCC}} \le \sqrt{\frac{2\ln(2)\log(d_A)}{k}}. \tag{13.2} \]

This was first proved in [Brandao, Christandl, Yard; arXiv:1010.1750] but in Aram's lecture you will see a simpler proof, from [Brandao, Harrow; arXiv:1210.6367].

It can be shown [Matthews, Wehner, Winter; arXiv:0810.2327] that

\[ \|\rho_{AB} - \sigma_{AB}\|_{1\text{-LOCC}} \ge \frac{1}{\sqrt{127}}\|\rho - \sigma\|_2 := \frac{1}{\sqrt{127}}\sqrt{\mathrm{tr}(\rho-\sigma)^2}. \]

Thus, Theorem 13.1 also gives a good approximation in the 2-norm.

Application to weak membership. Let's consider the weak membership problem for Sep, but now with the distance measure given by the 1-LOCC norm:

\[ D(\rho, \mathrm{Sep}) := \min_{\sigma\in \mathrm{Sep}} \|\rho - \sigma\|_{1\text{-LOCC}}. \]

We will solve this problem using semidefinite programming (SDP), which means optimizing a linear function over matrices subject to semidefinite constraints (i.e. constraints that a given matrix is positive semidefinite). Algorithms are known that can solve SDPs in time polynomial in the number of variables. The SDP for checking whether $\rho_{AB}$ is k-extendable is to search for a $\pi_{AB_1,\ldots,B_k}$ satisfying

\[ \pi_{AB_1,\ldots,B_k} \ge 0, \qquad \forall j:\ \pi_{AB_j} = \rho_{AB}. \]

The algorithm is to run the SDP for $k = \frac{4\ln(2)\log(d_A)}{\varepsilon^2}$. If ρ ∈ Sep then the SDP will be feasible because ρ is also k-extendable. The harder case is to show that the SDP is infeasible when $D(\rho_{AB}, \mathrm{Sep}) \ge \varepsilon$. But this follows from Theorem 13.1.

The run time is polynomial in $d_A d_B^{k+1}$, which is dominated by the $d_B^k$ term. This is

\[ \exp\big(c \log(d_A)\log(d_B)/\varepsilon^2\big), \]

which is slightly more than polynomial-time. It is called "quasi-polynomial," meaning that it is exp(poly(log(input size))).
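The choice of k can be made concrete with a few numbers. The sketch below computes the extension order k from ε and d_A, checks that the right-hand side of (13.2) then falls below ε, and reports the logarithm of d_A·d_B^k as a rough proxy for the size of the SDP variable. Taking log to be base 2 and rounding k up to an integer are assumptions, as are the function names.

```python
from math import ceil, log, log2, sqrt


def extension_order(d_a, eps):
    """k = 4 ln(2) log(d_A) / eps^2, rounded up; log is taken base 2 (assumption)."""
    return ceil(4 * log(2) * log2(d_a) / eps ** 2)


def de_finetti_bound(d_a, k):
    """Right-hand side of Eq. (13.2), with log base 2 (assumption)."""
    return sqrt(2 * log(2) * log2(d_a) / k)


def log_sdp_dimension(d_a, d_b, eps):
    """Natural log of d_A * d_B^k, the dimension on which the SDP variable acts."""
    k = extension_order(d_a, eps)
    return log(d_a) + k * log(d_b)


for d in (2, 4, 16):
    for eps in (0.5, 0.1):
        k = extension_order(d, eps)
        print(f"d_A=d_B={d}, eps={eps}: k={k}, "
              f"bound={de_finetti_bound(d, k):.3f}, "
              f"ln(dimension)={log_sdp_dimension(d, d, eps):.1f}")
```

With this k the bound evaluates to at most ε/√2, which is strictly below ε, so a state at 1-LOCC distance at least ε from Sep cannot be k-extendable.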

Idea behind proof of Theorem 13.1. Suppose we had a “magic” entanglement measure $E : D(\mathbb{C}^{d_A}\otimes\mathbb{C}^{d_B}) \to \mathbb{R}_+$ with the following properties.

1. normalization: $E(\rho_{AB}) \le \min(\log(d_A), \log(d_B))$

2. monogamy: $E(\rho_{A:B_1B_2}) \ge E(\rho_{A:B_1}) + E(\rho_{A:B_2})$

3. faithfulness: $E(\rho_{AB}) \le \varepsilon$ implies that $D(\rho, \mathrm{Sep}) \le f(\varepsilon)$ where f → 0 as ε → 0, e.g. $f(\varepsilon) = c\sqrt{\varepsilon}$.

If we had such a measure, the proof would be very easy.

\begin{align}
\log(d_A) &\ge E(\rho_{A:B_1\ldots B_k}) && \text{by normalization} \\
&\ge E(\rho_{A:B_1}) + E(\rho_{A:B_2\ldots B_k}) && \text{by monogamy} \\
&\ge \sum_{i=1}^{k} E(\rho_{A:B_i}) && \text{repeating the argument} \\
&= k\,E(\rho)
\end{align}

Rearranging, we have $E(\rho) \le \log(d_A)/k$, and finally we use faithfulness to argue that $D(\rho, \mathrm{Sep}) \le f\big(\frac{\log(d_A)}{k}\big)$.

Such a measure does exist! It is called squashed entanglement and was introduced in 2003 by our very own Matthias Christandl and Andreas Winter [quant-ph/0308088]. Normalization and monogamy are straightforward to prove for it (and were proved in the original paper), but faithfulness was not proved until 2010 [Brandao, Christandl, Yard; arXiv:1010.1750].



Representation Theory and Spectrum Estimation


14.1 Representation Theory

Given a group G, a representation of G in U(V), the set of unitaries over a complex vector space V, is a mapping g ↦ U(g) ∈ U(V) which is a homomorphism (...).

Definition 5. We say a representation is irreducible if for all subspaces W ⊆ V such that

\[ U(g)|w\rangle \in W \quad \forall g\in G,\ |w\rangle \in W \tag{14.1} \]

we have W = {0} or W = V.

Definition 6. Given two representations g ↦ U(g) and g ↦ U'(g), we say they are equivalent if there exists an invertible A such that $A\,U(g)\,A^{-1} = U'(g)$ for all g.

We have the following important theorem

Theorem 14.1. For G a finite or Lie group, $U(g) = U_1(g)\oplus\ldots\oplus U_l(g)$ with $U_1,\ldots,U_l$ irreducible representations. Analogously,

\[ V \cong \bigoplus_{i\in\hat{G}} V_i\otimes\mathbb{C}^{m_i}, \tag{14.2} \]

with $m_i \in \mathbb{N}$. Here $\hat{G}$ is the set of equivalence classes of irreps (irreducible representations), with G acting irreducibly on each of the $V_i$.

This shows that irreducible representations are the building blocks of general representations.

Example: Let us consider SU(2). In this case $V_j$ is the space of spin j, of dimension $\dim(V_j) = 2j+1$. A basis for $V_j$ is $\{|j,m\rangle\}_{m=-j}^{j}$. Let us consider the su(2) Lie algebra:

\[ \sigma_z|j,m\rangle = m|j,m\rangle, \tag{14.3} \]

and

\[ \sigma_\pm|j,m\rangle = c_\pm|j,m\pm 1\rangle. \tag{14.4} \]

Then the action of the group on $V_j$ is obtained by exponentiating the Lie algebra. We can write $V_j = \mathrm{Sym}^{2j}(\mathbb{C}^2) \subseteq (\mathbb{C}^2)^{\otimes 2j}$ and the action of the group on $V_j$ as $g \mapsto \Pi_{sym}\,(g\otimes\ldots\otimes g)\,\Pi_{sym}$.

14.1.1 Schur-Weyl Duality

An important theorem in the representation theory of SU(d) and the symmetric group is the so-called Schur-Weyl duality.

Consider the following representation of SU(2): g ↦ g ⊗ ... ⊗ g. The associated vector space is $V^{(n)} = (\mathbb{C}^2)^{\otimes n}$. We can decompose this representation into irreps as

\[ V^{(n)} = \bigoplus_j V_j\otimes\mathbb{C}^{m_j}. \tag{14.5} \]

For $S_n$, the symmetric group of order n, there is a natural representation in $V^{(n)}$, given by $\pi|i_1,\ldots,i_n\rangle = |i_{\pi^{-1}(1)},\ldots,i_{\pi^{-1}(n)}\rangle$, for $\pi \in S_n$. This representation clearly commutes with the one for SU(2). Schur-Weyl duality gives a convenient decomposition into irreps for both groups simultaneously. We will not need the details of this decomposition in this lecture though. Instead our goal is to compute the $m_j$ in (14.5).

14.1.2 Computing mj

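Let us compute the multiplicities $m^n_j$ of $V_j$ in $V^{(n)}$. We have:

\begin{align}
V^{(n+1)} &\cong V^{(n)}\otimes V^{(1)} \\
&= \Big(\bigoplus_j V_j\otimes\mathbb{C}^{m^n_j}\Big)\otimes V_{1/2} \\
&= \bigoplus_j (V_{j+1/2}\oplus V_{j-1/2})\otimes\mathbb{C}^{m^n_j} \\
&= \bigoplus_{j'} V_{j'}\otimes\mathbb{C}^{m^n_{j'+1/2} + m^n_{j'-1/2}}. \tag{14.6}
\end{align}

We can derive the following recursion relation: $m^{n+1}_j = m^n_{j+1/2} + m^n_{j-1/2}$. The solution of this recursion relation can be checked to be

\[ m^n_j = \binom{n}{n/2 - j} - \binom{n}{n/2 - j - 1} \le 2^{n\,h(1/2 - j/n)} \tag{14.7} \]

with h the binary entropy.

But actually, $V_j\otimes\mathbb{C}^{m_j} = \mathrm{span}\{\pi\,(|01\rangle - |10\rangle)^{\otimes(n/2-j)}\otimes|j,m\rangle : \pi\in S_n\} \subseteq (\mathbb{C}^2)^{\otimes n}$. The vectors $|j,m\rangle \in (\mathbb{C}^2)^{\otimes 2j}$ are equal to

\[ |j,m\rangle = c\,\big(|0,\ldots,0,1,\ldots,1\rangle + \text{permutations of 0's and 1's}\big), \tag{14.8} \]

with j + m 0's and j − m 1's.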
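The recursion for the multiplicities and the closed form (14.7) are easy to compare by machine. The sketch below iterates the rule $V_j\otimes V_{1/2} = V_{j+1/2}\oplus V_{j-1/2}$ (with no $V_{-1/2}$ term when j = 0), checks the result against the binomial formula, and confirms that the dimensions (2j + 1)·m_j add up to 2^n. Spins are stored as exact fractions; all function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, log2


def multiplicities_by_recursion(n):
    """Multiplicities m_j of spin j in n copies of C^2, via the recursion."""
    half = Fraction(1, 2)
    m = {half: 1}  # n = 1: a single spin-1/2
    for _ in range(n - 1):
        new = {}
        for j, mult in m.items():
            # V_j tensor V_{1/2} = V_{j+1/2} + V_{j-1/2}; no V_{-1/2} for j = 0.
            for j_new in (j + half, j - half):
                if j_new >= 0:
                    new[j_new] = new.get(j_new, 0) + mult
        m = new
    return m


def multiplicity_closed_form(n, j):
    """Closed form (14.7): C(n, n/2 - j) - C(n, n/2 - j - 1)."""
    k = int(Fraction(n, 2) - j)
    return comb(n, k) - (comb(n, k - 1) if k >= 1 else 0)


def binary_entropy(x):
    """Binary entropy h(x) in bits."""
    if x in (0, 1):
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)


n = 8
for j, mult in sorted(multiplicities_by_recursion(n).items()):
    bound = 2 ** (n * binary_entropy(float(Fraction(1, 2) - j / n)))
    print(f"j={j}: m_j={mult}, closed form={multiplicity_closed_form(n, j)}, "
          f"entropy bound={bound:.1f}")
```

The dimension count is a useful consistency check because it ties the multiplicities back to the total space $(\mathbb{C}^2)^{\otimes n}$.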


14.2 Spectrum Estimation

Let us for a moment forget about representation theory and consider the problem of spectrum estimation. In this problem we are given a source of quantum states which gives n copies of an unknown density matrix: $\rho^{\otimes n}$. The goal is to perform a measurement which gives an estimate of ρ. Suppose we are only interested in estimating the eigenvalues of ρ.

For a qubit state ρ, the eigenvalues of ρ can be written as (1/2 + r, 1/2 − r), with r ∈ [0, 1/2]. The problem of estimating r was considered by Keyl and Werner [], who provided an interesting connection of the problem to the representation theory of SU(2). The measurement consists of projecting the n copies onto the spaces $V_j\otimes\mathbb{C}^{m_j}$ introduced in the last section.

Informally, the claim is that with high probability j/n ≈ r. More precisely,

\[ \Pr(j) = \mathrm{tr}(P_j\rho^{\otimes n}) \le \mathrm{const}\cdot e^{-n D(1/2 + j/n\,\|\,1/2 + r)}, \tag{14.9} \]

where D(x‖y) is the relative entropy of x and y, given by D(x‖y) = x log x − x log y + (1 − x) log(1 − x) − (1 − x) log(1 − y).

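Let us now sketch the proof. For the normalized singlet we can compute

\[ \frac{1}{2}\big(\langle 0,1| - \langle 1,0|\big)\,\rho\otimes\rho\,\big(|0,1\rangle - |1,0\rangle\big) = (1/2 + r)(1/2 - r). \tag{14.10} \]

Then using Eq. (14.7),

\begin{align}
\mathrm{tr}(P_j\rho^{\otimes n}) &\le \binom{n}{n(1/2 - j/n)} \sum_{m=-j}^{j} \big[(1/2 - r)(1/2 + r)\big]^{n/2 - j} (1/2 + r)^{j+m}(1/2 - r)^{j-m} \\
&\le \binom{n}{n(1/2 - j/n)} (1/2 + r)^{n/2 + j}(1/2 - r)^{n/2 - j} \sum_{k=0}^{\infty} \frac{(1/2 - r)^k}{(1/2 + r)^k}. \tag{14.11}
\end{align}

Noting that $\sum_{k=0}^{\infty} \frac{(1/2 - r)^k}{(1/2 + r)^k}$ is a constant (for r > 0) we obtain the claim.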
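The concentration of the outcome j around nr can be observed directly. The sketch below evaluates Pr(j) as the multiplicity $m^n_j$ from Eq. (14.7) times the sum that appears in the first line of (14.11); this assumes the standard Schur-Weyl character formula, under which that product is the exact probability rather than only an upper bound. The function names and sample parameters are illustrative assumptions, and n is taken even so that j is an integer.

```python
from math import comb


def multiplicity(n, k):
    """m_j for j = n/2 - k, from Eq. (14.7)."""
    return comb(n, k) - (comb(n, k - 1) if k >= 1 else 0)


def outcome_distribution(n, r):
    """Pr(j) for n copies (n even) of a qubit state with spectrum (1/2 + r, 1/2 - r)."""
    lam0, lam1 = 0.5 + r, 0.5 - r
    probs = {}
    for k in range(n // 2 + 1):  # k = n/2 - j is the number of singlets
        j = n // 2 - k
        spin_part = sum(lam0 ** (j + m) * lam1 ** (j - m) for m in range(-j, j + 1))
        probs[j] = multiplicity(n, k) * (lam0 * lam1) ** k * spin_part
    return probs


n, r = 200, 0.3
probs = outcome_distribution(n, r)
most_likely_j = max(probs, key=probs.get)
print("total probability:", sum(probs.values()))
print("most likely j/n:", most_likely_j / n, "versus r =", r)
```

The total probability summing to 1 is a check that the subspaces $V_j\otimes\mathbb{C}^{m_j}$ exhaust $(\mathbb{C}^2)^{\otimes n}$.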

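14.2.1 Application of the Keyl-Werner Relation

Let us finish by mentioning one application. Suppose we have $|\psi\rangle^{\otimes n}_{ABC}$ and we measure $\{P_{j,A}\}$, $\{P_{j,B}\}$, and $\{P_{j,C}\}$. We just learned we will obtain with high probability outcomes $j_A, j_B, j_C$ such that $j_A/n \approx r_A$, $j_B/n \approx r_B$, and $j_C/n \approx r_C$. Therefore we must have

\[ \mathrm{tr}\big(P_{j,A}\otimes P_{j,B}\otimes P_{j,C}\, |\psi\rangle\langle\psi|^{\otimes n}\big) \ne 0. \tag{14.12} \]

Therefore $\{\langle\omega_{j_A}|\otimes\langle\omega_{j_B}|\otimes\langle\omega_{j_C}|\, g\, |\psi\rangle^{\otimes n}\} \ne 0$ for all g ∈ G and bases $|\omega_{j_A}\rangle\otimes|\omega_{j_B}\rangle\otimes|\omega_{j_C}\rangle$ for the subspaces associated to $P_{j,A}, P_{j,B}, P_{j,C}$.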


Proof of the 1-LOCC quantum de Finetti theorem


15.1 Introduction

In this lecture, I will give a proof of the following theorem, first mentioned by Fernando, on the way introducing useful properties of von Neumann and Shannon entropy.

Theorem 15.1. Let $\rho_{AB}$ be k-extendible and M′ a 1-LOCC measurement. Then there exists a separable state σ such that

\[ |\mathrm{tr}\, M'(\rho - \sigma)| \le \mathrm{const}\cdot\sqrt{\frac{\log d_A}{k}}. \]

It is possible to swap the quantifiers with help of von Neumann’s minimax theorem.

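Recall that M′ can be written in the form $M' = \sum_{i=1}^{m} A_i\otimes B_i$ for $0 \le A_i \le \mathrm{id}$ and $0 \le B_i \le \mathrm{id}$. Define the measurement $\mathcal{M}$: $\mathcal{M}(\rho) = \sum_i \mathrm{tr}(B_i\rho)\,|i\rangle\langle i|$. Note that $\rho_{AB}$ being k-extendible implies that there exists a state $\pi_{AB_1\cdots B_k}$ such that $\rho_{AB} = \pi_{AB_i}$ for all i. It is the goal to show that

\[ (\mathrm{id}\otimes\mathcal{M})(\rho_{AB}) \approx (\mathrm{id}\otimes\mathcal{M})(\sigma_{AB}) \]

for separable σ:

\[ \sigma_{AB} = \sum_i p_i\, |\alpha_i\rangle\langle\alpha_i|\otimes|\beta_i\rangle\langle\beta_i|. \]

Writing $\pi_{AB_1} = \sum_x p_x \pi^x_{AB_1}$, we want to show that

\[ (\mathrm{id}\otimes\mathcal{M})(\pi^x_{AB_1}) \approx \alpha_x\otimes\mathcal{M}(\beta_x), \]

where we use the shorthand $\alpha_x = |\alpha_x\rangle\langle\alpha_x|$. We now consider the state

\[ \omega_{AB_1\cdots B_k} := (\mathrm{id}\otimes\mathcal{M}^{\otimes k})(\pi_{AB_1\cdots B_k}). \]

It is our goal to show that

\[ \omega_{AB_1} = \sum_x p_x\,\omega^x_{AB_1}, \qquad \omega^x_{AB_1} \approx \alpha_x\otimes\mathcal{M}(\beta_x), \qquad \omega^x_{AB_1} \approx \omega^x_A\otimes\omega^x_{B_1}. \]

In order to show this, there are two cases to consider.

Case 1: $\omega_{AB_1} \approx \omega_A\otimes\omega_{B_1}$. Then we are done.

Case 2: $\omega_{AB_1}$ is far from $\omega_A\otimes\omega_{B_1}$. Then we condition on $B_1$ and look at system $B_2$, having reduced the uncertainty about that system. This way we get a little closer to case 1. Continuing with $B_3$ etc., this will prove the theorem.

This was the high level view. We will now make this precise by using a measure of correlation based on entropy. Since the maximum of an entropy is log d, this will give the bound of the theorem (as opposed to the linear scaling in d that we encountered in the trace norm quantum de Finetti theorem).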

15.2 Entropy

Recall that the Shannon entropy of a probability distribution p of a random variable X is given by

\[ H(p) = -\sum_x p_x\log p_x = H(X)_p. \]

When we have joint distributions p(xy), we can look at the marginal distributions $p(x) = \sum_y p(xy)$ and $p(y) = \sum_x p(xy)$ and their entropies. The conditional entropy of X given Y is defined as

\[ H(X|Y)_p = \sum_y p(y)\, H(X)_{p(x|y)}, \]

where p(x|y) = p(xy)/p(y). Writing the conditional entropy out explicitly we find the formula

\[ H(X|Y)_p = H(XY)_p - H(Y)_p. \]

This gives us a beautiful interpretation of the conditional entropy: it is just the entropy of the joint distribution minus the entropy of Y.

We can measure the correlation between two random variables X and Y by looking at the difference between the entropies H(X) and H(X|Y):

\[ I(X:Y) = H(X) - H(X|Y) = H(X) + H(Y) - H(XY). \]

Note that this quantity is symmetric with respect to interchange of X and Y. This is known as the mutual information and quantifies by how much our uncertainty about X changes when we are given Y (and vice versa, of course). It is also the number of bits that you save by compressing XY together as opposed to compressing X and Y separately.

The mutual information has a few nice properties:

• I(X : Y) ≥ 0

• I(X : Y) ≤ log |X| and I(X : Y) ≤ log |Y|, where |X| denotes the number of symbols in X.

• (Pinsker's inequality) $I(X:Y) \ge \frac{1}{2\ln 2}\Big(\sum_{x,y} |p(xy) - p(x)p(y)|\Big)^2$
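The three properties of the mutual information listed above can be checked on small examples. The sketch below computes I(X : Y) in bits from a joint distribution given as a dictionary and compares it with the Pinsker lower bound and the log |X| upper bound. The example distributions and function names are illustrative assumptions.

```python
from math import log, log2


def marginals(p):
    """Marginal distributions p(x) and p(y) of a joint distribution p[(x, y)]."""
    px, py = {}, {}
    for (x, y), v in p.items():
        px[x] = px.get(x, 0.0) + v
        py[y] = py.get(y, 0.0) + v
    return px, py


def entropy(dist):
    """Shannon entropy in bits of a distribution given as a dictionary."""
    return -sum(v * log2(v) for v in dist.values() if v > 0)


def mutual_information(p):
    """I(X:Y) = H(X) + H(Y) - H(XY), in bits."""
    px, py = marginals(p)
    return entropy(px) + entropy(py) - entropy(p)


def pinsker_lower_bound(p):
    """(1 / (2 ln 2)) * (sum over x, y of |p(xy) - p(x) p(y)|)^2."""
    px, py = marginals(p)
    l1 = sum(abs(p.get((x, y), 0.0) - px[x] * py[y]) for x in px for y in py)
    return l1 ** 2 / (2 * log(2))


examples = {
    "perfectly correlated bits": {(0, 0): 0.5, (1, 1): 0.5},
    "noisy correlation": {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4},
    "independent bits": {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25},
}
for name, p in examples.items():
    print(f"{name}: I = {mutual_information(p):.4f} bits, "
          f"Pinsker bound = {pinsker_lower_bound(p):.4f}")
```

Pairs (x, y) with zero joint probability still contribute |0 − p(x)p(y)| to the Pinsker sum, which is why the sum runs over all x and y rather than over the support of p.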

Let us now look at the quantum version of all this. I will use the notation $S(A)_\rho = S(\rho_A)$; if we have a joint state $\rho_{AB}$, then $S(A)_\rho = S(\mathrm{tr}_B\,\rho_{AB})$. Note that it is not immediately clear how to define the conditional entropy in the quantum case, since we cannot condition on a value of system B. Luckily we had a second way of writing the conditional entropy and we are just going to define the conditional quantum entropy as

\[ S(A|B) := S(AB) - S(B) \]

and the mutual information as

\[ I(A:B) = S(A) + S(B) - S(AB). \]

It has the following properties:

• I(A : B) ≥ 0

• $I(A:B) \le 2\log d_A$ and $I(A:B) \le 2\log d_B$ (note the factor of two!)

• (Pinsker's inequality) $I(A:B) \ge \frac{1}{2\ln 2}\|\rho_{AB} - \rho_A\otimes\rho_B\|_1^2$

This last property looks like it could be useful in proving the theorem; but note that it cannot be that easy, because we know that we should use the 1-LOCC norm and not the trace norm. In order to proceed, we need the conditional mutual information:

\[ I(A:B|C) = S(A|C) + S(B|C) - S(AB|C) = S(AC) + S(BC) - S(ABC) - S(C). \]

This formula is a little difficult to grasp and it is difficult to get an intuition for this quantity. The conditional mutual information has a nice property, though: it satisfies the chain rule

\[ I(A:BC) = I(A:C) + I(A:B|C), \]

which you can easily check. It is called the chain rule, in part, because we can iterate it. Let's assume we have a k-extendible state and we measure all of the B systems. How much does A know about the B's? The chain rule gives us

\begin{align}
I(A:B_1\cdots B_k) &= I(A:B_1) + I(A:B_2\cdots B_k|B_1) \tag{15.1} \\
&= I(A:B_1) + I(A:B_2|B_1) + I(A:B_3\cdots B_k|B_1B_2) \tag{15.2} \\
&= I(A:B_1) + I(A:B_2|B_1) + \cdots + I(A:B_k|B_1B_2\cdots B_{k-1}). \tag{15.3}
\end{align}

There are k terms and the sum is smaller than $\log d_A$, hence there is a j such that $I(A:B_j|B_1B_2\cdots B_{j-1}) \le \frac{\log d_A}{k}$. This then immediately gives a proof of our theorem (see arXiv:1210.6367 for more details).
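The chain rule and the resulting averaging argument can be illustrated with purely classical variables. The sketch below draws a random joint distribution of A, B_1, B_2, B_3, evaluates each term of the chain-rule expansion, and checks that the terms add up to I(A : B_1B_2B_3) and that the smallest term is at most log|A|/k. Using a fully classical A is a simplifying assumption (in the lecture only the B systems are measured); all function names are illustrative.

```python
import random
from itertools import product
from math import log2


def marginal_entropy(p, keep):
    """Shannon entropy (bits) of the marginal of p on the coordinates in `keep`."""
    marg = {}
    for outcome, v in p.items():
        key = tuple(outcome[i] for i in keep)
        marg[key] = marg.get(key, 0.0) + v
    return -sum(v * log2(v) for v in marg.values() if v > 0)


def mutual_information(p, a, b):
    """I(A:B) for coordinate tuples a and b."""
    return marginal_entropy(p, a) + marginal_entropy(p, b) - marginal_entropy(p, a + b)


def conditional_mutual_information(p, a, b, c):
    """I(A:B|C) = S(AC) + S(BC) - S(ABC) - S(C) for coordinate tuples a, b, c."""
    return (marginal_entropy(p, a + c) + marginal_entropy(p, b + c)
            - marginal_entropy(p, a + b + c) - marginal_entropy(p, c))


def random_distribution(sizes, rng):
    """A random joint distribution on the product of alphabets of the given sizes."""
    outcomes = list(product(*(range(s) for s in sizes)))
    weights = [rng.random() for _ in outcomes]
    total = sum(weights)
    return {o: w / total for o, w in zip(outcomes, weights)}


def chain_rule_terms(p, k):
    """Terms I(A : B_j | B_1 ... B_{j-1}) for j = 1..k; A is coordinate 0."""
    return [conditional_mutual_information(p, (0,), (j,), tuple(range(1, j)))
            for j in range(1, k + 1)]


rng = random.Random(1)
k = 3
p = random_distribution((2,) * (k + 1), rng)
terms = chain_rule_terms(p, k)
total = mutual_information(p, (0,), tuple(range(1, k + 1)))
print("terms:", [round(t, 4) for t in terms])
print("sum of terms:", round(sum(terms), 4), "total:", round(total, 4))
print("smallest term:", round(min(terms), 4), "<= log|A|/k =", round(log2(2) / k, 4))
```

Since every term is nonnegative and the total is at most log |A|, the smallest term is bounded by the average, which is exactly the pigeonhole step used in the proof.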
