Source: web.mit.edu/sea06/agenda/talks/speicher_survey.pdf

Free Probability Theory and Random Matrices

Roland Speicher

Queen’s University

Kingston, Canada

We are interested in the limiting eigenvalue distribution of N × N random matrices for N → ∞.

Usually, large N distributions are close to the N → ∞ limit, and asymptotic results give good predictions for finite N.

1

We can consider the convergence for N → ∞ of

• the eigenvalue distribution of one "typical" realization of the N × N random matrix

• the averaged eigenvalue distribution over many realizations of the N × N random matrices

2

Consider a (selfadjoint!) Gaussian N × N random matrix. We have almost sure convergence (convergence of a "typical" realization) of its eigenvalue distribution towards Wigner's semicircle.

[Figure: eigenvalue histograms of one realization for N = 300, N = 1000 and N = 3000 (x-axis: eigenvalue, from −3 to 3; y-axis: probability), approaching the semicircle.]

3
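A minimal numerical sketch (an addition to the transcript, assuming numpy and matplotlib) that reproduces a plot of this kind: sample one selfadjoint Gaussian matrix and compare its eigenvalue histogram with the semicircle density.

```python
# Sketch (assumes numpy/matplotlib): one realization of a selfadjoint
# Gaussian matrix, normalized so that the limiting spectrum is the
# semicircle on [-2, 2].
import numpy as np
import matplotlib.pyplot as plt

N = 3000
rng = np.random.default_rng(0)

H = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
A = (H + H.conj().T) / (2 * np.sqrt(N))   # selfadjoint, E|A_ij|^2 = 1/N

eigs = np.linalg.eigvalsh(A)
x = np.linspace(-2, 2, 400)
plt.hist(eigs, bins=60, density=True, alpha=0.5, label="one realization, N=3000")
plt.plot(x, np.sqrt(4 - x**2) / (2 * np.pi), label="semicircle density")
plt.legend()
plt.show()
```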

Convergence of the averaged eigenvalue distribution is usually much faster; there is very good agreement with the asymptotic limit already for moderate N.

[Figure: averaged eigenvalue histograms for N = 5, N = 20 and N = 50, with 5000 trials each; already for these moderate N the agreement with the semicircle is very good.]

4

Consider a Wishart random matrix A = XX∗, where X is an N × M random matrix with independent Gaussian entries.

Its eigenvalue distribution converges (averaged and almost surely) towards the Marchenko-Pastur distribution.

Example: M = 2N, 2000 trials

[Figure: averaged eigenvalue histograms of Wishart matrices for N = 10 and N = 50, approaching the Marchenko-Pastur distribution.]

5
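Again a minimal sketch (an addition, assuming numpy/matplotlib): sample a Wishart matrix with M = 2N and compare against the Marchenko-Pastur density.

```python
# Sketch (assumes numpy/matplotlib): Wishart matrix A = XX*/M with M = 2N,
# compared against the Marchenko-Pastur density with ratio c = N/M.
import numpy as np
import matplotlib.pyplot as plt

N, M = 1000, 2000
rng = np.random.default_rng(1)
X = rng.normal(size=(N, M))
A = X @ X.T / M                    # normalized so the limit has mean 1

eigs = np.linalg.eigvalsh(A)

c = N / M                          # c = 1/2 here
lo, hi = (1 - np.sqrt(c))**2, (1 + np.sqrt(c))**2
x = np.linspace(lo, hi, 400)
mp = np.sqrt((hi - x) * (x - lo)) / (2 * np.pi * c * x)

plt.hist(eigs, bins=60, density=True, alpha=0.5, label="N=1000, M=2N")
plt.plot(x, mp, label="Marchenko-Pastur")
plt.legend()
plt.show()
```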

We want to consider more complicated situations, built out of simple cases (like Gaussian or Wishart) by doing operations like

• taking the sum of two matrices

• taking the product of two matrices

• taking corners of matrices

6

Note: If different N × N random matrices A and B are involved, then the eigenvalue distribution of non-trivial functions f(A, B) (like A + B or AB) will of course depend on the relation between the eigenspaces of A and of B.

However: It turns out that there is a deterministic and treatable result if

• the eigenspaces are in "generic" position, and

• N → ∞.

This is the realm of free probability theory.

7

Consider N × N random matrices A and C such that

• A has an asymptotic eigenvalue distribution for N → ∞, and C has an asymptotic eigenvalue distribution for N → ∞

• A and C are independent (i.e., the entries of A are independent from the entries of C)

8

Then the eigenspaces of A and of C might still be in special relation (e.g., both A and C could be diagonal).

However, consider now

A and B := UCU∗,

where U is a Haar unitary N × N random matrix. Then the eigenspaces of A and of B are in "generic" position, and the asymptotic eigenvalue distribution of A + B depends only on the asymptotic eigenvalue distribution of A and the asymptotic eigenvalue distribution of B (which is the same as the one of C).

9
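A minimal sketch of this construction (an addition, assuming numpy). A Haar unitary can be obtained from the QR decomposition of a complex Gaussian matrix, with the phases of the diagonal of R fixed:

```python
# Sketch (assumes numpy): two diagonal matrices with spectrum {-1, +1},
# brought into generic position by a random Haar rotation.
import numpy as np

N = 1000
rng = np.random.default_rng(2)

d = np.concatenate([np.ones(N // 2), -np.ones(N // 2)])
A = np.diag(d)               # eigenspaces of A and C are identical...
C = np.diag(d)

# Haar unitary from the QR decomposition of a complex Gaussian matrix;
# rescaling by the phases of diag(R) gives exact Haar measure
Z = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
Q, R = np.linalg.qr(Z)
ph = np.diagonal(R) / np.abs(np.diagonal(R))
U = Q * ph                   # same as Q @ diag(ph)

B = U @ C @ U.conj().T       # ...but those of A and B are in generic position
eigs = np.linalg.eigvalsh(A + B)
# without the rotation, A + C = 2A has spectrum {-2, +2}; with it, the
# histogram of eigs approximates the arcsine law on [-2, 2]
```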

We can expect that the asymptotic eigenvalue distribution of f(A, B) depends only on the asymptotic eigenvalue distribution of A and the asymptotic eigenvalue distribution of B if

• A and B are independent

• one of them is unitarily invariant (i.e., the joint distribution of the entries does not change under unitary conjugation)

Note: Gaussian and Wishart random matrices are unitarily invariant.

10

Thus: the asymptotic eigenvalue distribution of

• the sum of random matrices in generic position, A + UCU∗

• the product of random matrices in generic position, A · UCU∗

• corners of unitarily invariant matrices UCU∗

should only depend on the asymptotic eigenvalue distributions of A and of C.

11

Example: sum of independent Gaussian and Wishart (M = 2N) random matrices, averaged over 10000 trials

[Figure: averaged eigenvalue histograms of the sum for N = 5 and N = 50.]

12

Example: product of two independent Wishart (M = 5N) random matrices, averaged over 10000 trials

[Figure: averaged eigenvalue histograms of the product for N = 5 and N = 50.]

13

Example: upper left corner of size N/2 × N/2 of a randomly rotated N × N projection matrix, with half of the eigenvalues 0 and half of the eigenvalues 1, averaged over 10000 trials

[Figure: averaged eigenvalue histograms of the corner for N = 8 and N = 32.]

14

Problems:

• Do we have a conceptual way of understanding these asymptotic eigenvalue distributions?

• Is there an algorithm for actually calculating these asymptotic eigenvalue distributions?

15

How do we analyze the eigenvalue distributions?

eigenvalue distribution of matrix A =̂ knowledge of traces of powers, tr(A^k):

(1/N) (λ_1^k + · · · + λ_N^k) = tr(A^k)

averaged eigenvalue distribution of random matrix A =̂ knowledge of expectations of traces of powers, E[tr(A^k)]

16

Stieltjes inversion formula. If one knows the asymptotic moments

α_k := lim_{N→∞} E[tr(A^k)]

of a random matrix A, then one can get its asymptotic eigenvalue distribution µ as follows. Form the Cauchy (or Stieltjes) transform

G(z) := Σ_{k=0}^∞ α_k / z^{k+1}.

Then:

dµ(t) = −(1/π) lim_{ε→0} ℑ G(t + iε) dt

17

Consider random matrices A and B in generic position.

We want to understand A + B, i.e., for all k ∈ N,

E[tr((A + B)^k)].

But

E[tr((A+B)^6)] = E[tr(A^6)] + · · · + E[tr(ABAABA)] + · · · + E[tr(B^6)],

thus we need to understand mixed moments in A and B.

18

Use the following notation:

ϕ(A) := lim_{N→∞} E[tr(A)].

Question: If A and B are in generic position, can we understand

ϕ(A^{n_1} B^{m_1} A^{n_2} B^{m_2} · · ·)

in terms of (ϕ(A^k))_{k∈N} and (ϕ(B^k))_{k∈N}?

19

Example: Consider two Gaussian random matrices A, B which are independent (and thus in generic position). Then the asymptotic mixed moments in A and B,

ϕ(A^{n_1} B^{m_1} A^{n_2} B^{m_2} · · ·),

are given by

#{ non-crossing/planar pairings of the pattern A···A (n_1 times) · B···B (m_1 times) · A···A (n_2 times) · B···B (m_2 times) · · · which do not pair A with B }

20

Example: ϕ(AABBABBA) = 2, since there are exactly two such non-crossing pairings of the word A A B B A B B A: one pairs the letters at positions (1,2), (3,4), (5,8), (6,7), the other at positions (1,8), (2,5), (3,4), (6,7).

Note: each of the pairings connects at least one of the groups A^{n_1}, B^{m_1}, A^{n_2}, . . . only among itself! And thus:

ϕ((A^2 − ϕ(A^2)1)(B^2 − ϕ(B^2)1)(A − ϕ(A)1)(B^2 − ϕ(B^2)1)(A − ϕ(A)1)) = 0

21
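This counting rule is easy to check by computer. A small sketch (an addition, plain Python); the recursion pairs the first letter with a matching letter at odd distance and splits the word into an inside and an outside part, which is the standard decomposition of non-crossing pairings:

```python
# Sketch: count non-crossing pairings of a word in A and B that never pair
# an A with a B, verifying phi(AABBABBA) = 2 from the slide.
def nc_pairings(word):
    if not word:
        return 1
    if len(word) % 2 == 1:
        return 0
    total = 0
    for j in range(1, len(word), 2):         # partner at odd distance
        if word[j] == word[0]:                # never pair A with B
            total += nc_pairings(word[1:j]) * nc_pairings(word[j + 1:])
    return total

print(nc_pairings("AABBABBA"))   # -> 2
print(nc_pairings("AABB"))       # -> 1, i.e. phi(A^2 B^2) = 1
```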

In general we have

ϕ((A^{n_1} − ϕ(A^{n_1})·1)·(B^{m_1} − ϕ(B^{m_1})·1)·(A^{n_2} − ϕ(A^{n_2})·1)· · ·)

= #{ non-crossing pairings which do not pair A with B, and for which each group is connected with some other group }

= 0

22

The actual equation for the calculation of the mixed moments

ϕ(A^{n_1} B^{m_1} A^{n_2} B^{m_2} · · ·)

is different for different random matrix ensembles. However, the relation between the mixed moments,

ϕ((A^{n_1} − ϕ(A^{n_1})·1)·(B^{m_1} − ϕ(B^{m_1})·1)· · ·) = 0,

remains the same for matrix ensembles in generic position and constitutes the definition of freeness.

23

Definition [Voiculescu 1985]: A and B are free (with respect to ϕ) if we have, for all n_1, m_1, n_2, · · · ≥ 1, that

ϕ((A^{n_1} − ϕ(A^{n_1})·1)·(B^{m_1} − ϕ(B^{m_1})·1)·(A^{n_2} − ϕ(A^{n_2})·1)· · ·) = 0

ϕ((B^{n_1} − ϕ(B^{n_1})·1)·(A^{m_1} − ϕ(A^{m_1})·1)·(B^{n_2} − ϕ(B^{n_2})·1)· · ·) = 0

i.e.,

ϕ(alternating product in centered words in A and in B) = 0

24

Note: freeness is a rule for calculating mixed moments in A and B from the moments of A and the moments of B.

Example:

ϕ((A^n − ϕ(A^n)1)(B^m − ϕ(B^m)1)) = 0,

thus

ϕ(A^n B^m) − ϕ(A^n · 1)ϕ(B^m) − ϕ(A^n)ϕ(1 · B^m) + ϕ(A^n)ϕ(B^m)ϕ(1 · 1) = 0,

and hence

ϕ(A^n B^m) = ϕ(A^n) · ϕ(B^m).

25

Freeness is a rule for calculating mixed moments, analogous to the concept of independence for random variables. Thus freeness is also called free independence.

Note: free independence is a different rule from classical independence; free independence occurs typically for non-commuting random variables.

Example:

ϕ((A − ϕ(A)1)·(B − ϕ(B)1)·(A − ϕ(A)1)·(B − ϕ(B)1)) = 0,

which results in

ϕ(ABAB) = ϕ(AA) · ϕ(B) · ϕ(B) + ϕ(A) · ϕ(A) · ϕ(BB) − ϕ(A) · ϕ(B) · ϕ(A) · ϕ(B)

26
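A numerical sanity check of this formula (an addition, assuming numpy): take A a normalized selfadjoint Gaussian matrix and B an independent Wishart matrix. Since the odd moments of A vanish, the formula predicts ϕ(ABAB) = ϕ(A^2) ϕ(B)^2, which equals 1 in the normalization below.

```python
# Sketch (assumes numpy): test the ABAB formula against a simulation with
# A a Gaussian (Wigner) matrix and B an independent Wishart matrix.
import numpy as np

N, M, trials = 200, 400, 200
rng = np.random.default_rng(4)
acc = 0.0
for _ in range(trials):
    H = rng.normal(size=(N, N))
    A = (H + H.T) / np.sqrt(2 * N)    # phi(A) = 0, phi(A^2) = 1
    X = rng.normal(size=(N, M))
    B = X @ X.T / M                   # phi(B) = 1
    acc += np.trace(A @ B @ A @ B) / N
print(acc / trials)   # close to phi(A^2) * phi(B)^2 = 1
```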

Consider A, B free. Then, by freeness, the moments of A + B are uniquely determined by the moments of A and the moments of B.

Notation: We say the distribution of A + B is the free convolution of the distribution of A and the distribution of B,

µ_{A+B} = µ_A ⊞ µ_B.

27

In principle, freeness determines this, but the concrete nature of this rule is not clear.

Examples: We have

ϕ((A + B)^1) = ϕ(A) + ϕ(B)

ϕ((A + B)^2) = ϕ(A^2) + 2ϕ(A)ϕ(B) + ϕ(B^2)

ϕ((A + B)^3) = ϕ(A^3) + 3ϕ(A^2)ϕ(B) + 3ϕ(A)ϕ(B^2) + ϕ(B^3)

ϕ((A + B)^4) = ϕ(A^4) + 4ϕ(A^3)ϕ(B) + 4ϕ(A^2)ϕ(B^2)
             + 2(ϕ(A^2)ϕ(B)ϕ(B) + ϕ(A)ϕ(A)ϕ(B^2) − ϕ(A)ϕ(B)ϕ(A)ϕ(B))
             + 4ϕ(A)ϕ(B^3) + ϕ(B^4)

28

To treat these formulas in general, linearize the free convolution by going over from the moments (ϕ(A^m))_{m∈N} to the free cumulants (κ_m)_{m∈N}. Those are defined by relations like:

ϕ(A^1) = κ_1

ϕ(A^2) = κ_2 + κ_1^2

ϕ(A^3) = κ_3 + 3κ_1κ_2 + κ_1^3

ϕ(A^4) = κ_4 + 4κ_1κ_3 + 2κ_2^2 + 6κ_1^2κ_2 + κ_1^4

...

29
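These relations can be generated by the standard moment-cumulant recursion m_n = Σ_{s=1}^n κ_s Σ_{i_1+···+i_s = n−s} m_{i_1} · · · m_{i_s} (with m_0 = 1), which encodes the summation over non-crossing partitions explained on the next slide. A small illustrative sketch (an addition, plain Python):

```python
# Sketch: moments from free cumulants via the non-crossing recursion
# m_n = sum_{s=1}^{n} kappa_s * sum_{i_1+...+i_s = n-s} m_{i_1} ... m_{i_s}.
from itertools import product as iproduct

def moments(kappa, n_max):
    # kappa[s] = free cumulant of order s (kappa[0] is unused)
    m = [1.0]                                   # m_0 = 1
    for n in range(1, n_max + 1):
        total = 0.0
        for s in range(1, n + 1):
            # all compositions i_1 + ... + i_s = n - s with i_j >= 0
            for comp in iproduct(range(n - s + 1), repeat=s):
                if sum(comp) == n - s:
                    prod = kappa[s]
                    for i in comp:
                        prod *= m[i]
                    total += prod
        m.append(total)
    return m

# semicircle: kappa_2 = 1, all other cumulants 0; the moments are the
# Catalan numbers at even orders: [1, 0, 1, 0, 2, 0, 5]
print(moments([0, 0, 1, 0, 0, 0, 0], 6))
```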

There is a combinatorial structure behind these formulas: the sums run over non-crossing partitions. [Diagrams of the non-crossing partitions are not reproduced here.]

ϕ(A^1) = κ_1

ϕ(A^2) = κ_2 + κ_1κ_1

ϕ(A^3) = κ_3 + κ_1κ_2 + κ_2κ_1 + κ_2κ_1 + κ_1κ_1κ_1

ϕ(A^4) = κ_4 + 4κ_1κ_3 + 2κ_2^2 + 6κ_1^2κ_2 + κ_1^4

30

This combinatorial relation between the moments (ϕ(A^m))_{m∈N} and the cumulants (κ_m)_{m∈N} can be translated into generating power series. Put

G(z) = 1/z + Σ_{m=1}^∞ ϕ(A^m) / z^{m+1}   (Cauchy transform)

and

R(z) = Σ_{m=1}^∞ κ_m z^{m−1}   (R-transform)

Then we have the relation

1/G(z) + R(G(z)) = z.

31

Theorem [Voiculescu 1986, Speicher 1994]: Let A and B be free. Then one has

R_{A+B}(z) = R_A(z) + R_B(z),

or equivalently

κ_m^{A+B} = κ_m^A + κ_m^B for all m.

32

This, together with the relation between the Cauchy transform and the R-transform and with the Stieltjes inversion formula, gives an effective algorithm for calculating free convolutions, i.e., sums of random matrices in generic position:

A → G_A → R_A

B → G_B → R_B

R_A + R_B = R_{A+B} → G_{A+B} → distribution of A + B

33
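A sketch of the whole pipeline (an addition, assuming numpy/matplotlib), for the Wigner + Wishart example below: normalize the Wigner factor to the semicircle on [−2, 2], so R_A(z) = z, and the Wishart factor to mean 1 with c = N/M = 1/2, so R_B(z) = 1/(1 − cz). The relation 1/G + R_{A+B}(G) = z then becomes the cubic −cG^3 + (1 + cz)G^2 + (1 − c − z)G + 1 = 0, which is solved pointwise and fed into the Stieltjes inversion formula.

```python
# Sketch (assumes numpy/matplotlib): density of the free convolution
# semicircle [+] Marchenko-Pastur via R-transform and Stieltjes inversion.
import numpy as np
import matplotlib.pyplot as plt

c, eps = 0.5, 1e-6                    # c = N/M; eps: distance above real axis
ts = np.linspace(-3.0, 6.0, 900)
dens = np.empty_like(ts)

G_prev = 1.0 / (ts[0] + 1j)           # rough initial guess, refined by tracking
for i, t in enumerate(ts):
    z = t + 1j * eps
    # cubic from 1/G + G + 1/(1 - c*G) = z, cleared of denominators
    roots = np.roots([-c, 1 + c * z, 1 - c - z, 1])
    phys = [g for g in roots if g.imag < 0] or list(roots)   # G maps C+ to C-
    G = min(phys, key=lambda g: abs(g - G_prev))             # follow branch continuously
    dens[i] = -G.imag / np.pi
    G_prev = G

plt.plot(ts, dens)
plt.title("Wigner + Wishart via R-transform and Stieltjes inversion")
plt.show()
```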

Example: Wigner + Wishart (M = 2N), trials = 4000

[Figure: averaged eigenvalue histogram of Wigner + Wishart for N = 100, matching the free convolution.]

34

One has a similar analytic description for the product.

Theorem [Voiculescu 1987, Haagerup 1997, Nica + Speicher 1997]: Put

M_A(z) := Σ_{m=1}^∞ ϕ(A^m) z^m

and define

S_A(z) := (1 + z)/z · M_A^{<−1>}(z)   (S-transform of A)

Then: if A and B are free, we have

S_{AB}(z) = S_A(z) · S_B(z).

35

Example: Wishart × Wishart (M = 5N), trials = 1000

[Figure: averaged eigenvalue histogram of the product for N = 100.]

36

Upper left corner of size N/2 × N/2 of a projection matrix, with N/2 eigenvalues 0 and N/2 eigenvalues 1; trials = 5000

[Figure: averaged eigenvalue histogram of the corner for N = 64.]

37

• Free Calculator by Raj Rao and Alan Edelman

• A. Nica and R. Speicher: Lectures on the Combinatorics of Free Probability. To appear soon in the London Mathematical Society Lecture Note Series, vol. 335, Cambridge University Press.

38

Outlook on other talks around free probability

• Anshelevich: "free" orthogonal and Meixner polynomials

• Burda: free random Lévy matrices

• Chatterjee: concentration of measure and free probability

• Demni: free stochastic processes

• Kargin: large deviations in free probability

• Mingo + Speicher: fluctuations of random matrices

• Rashidi Far: operator-valued free probability theory and block matrices

39
