Algebraic Constructions of Randomness Extractors

Chris Umans, Caltech

Based on joint work with Venkat Guruswami and Salil Vadhan, and joint work with Amnon Ta-Shma

July 24, 2013 2

Randomness extractors

• Computers are inherently deterministic machines, yet we want to use randomness
  – one solution: use pseudo-random generators

• Question: can we use “real” randomness?
  – physical source
  – imperfect: biased, correlated

• “Hardware” side
  – what physical source?
  – ask the physicists…

• “Software” side
  – what is the minimum we need from the physical source?

Randomness extractors

• imperfect sources:
  – “stuck bits”
  – “correlation”
  – “stranger correlation”

[Figure: example output strings for each source type; surviving labels: “111111”, “perfect squares”]

• there are specific ways to get independent unbiased random bits from specific imperfect physical sources


• want to assume we don’t know details of physical source

• general model capturing all of these? – yes: “min-entropy”

• universal procedure for all imperfect sources? – yes: “extractors”

Min-entropy

• General model of physical source with $k < n$ bits of hidden randomness

Definition: a random variable $X$ on $\{0,1\}^n$ has min-entropy $\min_x -\log(\Pr[X = x])$
  – min-entropy $k$ implies no string has weight more than $2^{-k}$

[Figure: a set of $2^k$ strings inside $\{0,1\}^n$; string sampled uniformly from this set]
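The definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the talk: the function name `min_entropy` and the representation of a distribution as a list of probabilities are assumptions made for the example.

```python
import math

def min_entropy(probs):
    """Min-entropy (in bits) of a distribution given as a list of probabilities.

    Equals min_x -log2(Pr[X = x]), i.e. -log2 of the heaviest string's weight.
    """
    return -math.log2(max(probs))

# A "flat" source: uniform on 4 of the 8 strings in {0,1}^3, so k = 2.
flat = [0.25] * 4 + [0.0] * 4
print(min_entropy(flat))  # 2.0
```

A flat source like this one matches the picture on the slide: a string sampled uniformly from a set of $2^k$ strings has min-entropy exactly $k$.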

• Dozens of constructions over 15+ years (e.g. NZ96, GW97, SZ99, Z97, TS96, NTS99, T99, RRV99, ISW00, RSW00, RVW00, TUZ01, TZS01, SU01, LRVW03, R05, Z06, GUV09, DKSS09, TU12)

Goals:        optimal:              “milestone”:
short seed    $d = \log n + O(1)$   $d = O(\log n)$
long output   $m = k + d - O(1)$    $m = (1 - \alpha)k$

[Figure: extractor $E$ takes a source string from a set of $2^k$ strings in $\{0,1\}^n$ and a $d$-bit seed, and outputs $m$ bits that are close to uniform]

15+ year quest for optimal…
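“Close to uniform” is usually measured by statistical (total variation) distance. The slide does not spell out the measure, so the sketch below assumes that standard choice; the function name `distance_from_uniform` is hypothetical.

```python
def distance_from_uniform(probs):
    """Statistical (total variation) distance between `probs` and the
    uniform distribution on the same number of outcomes."""
    u = 1 / len(probs)
    return sum(abs(p - u) for p in probs) / 2

# Uniform on half of a 4-element domain is at distance 1/2 from uniform.
print(distance_from_uniform([0.5, 0.5, 0.0, 0.0]))  # 0.5
```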


Applications of extractors

• randomness extractors are extremely versatile objects

• different settings of parameters turn into
  – families of hash functions
  – error-correcting codes
  – expander graphs with the “unique neighbor” property
  – …

• many uses beyond original motivation

Applications of extractors

• Derandomization [Sip88, NZ93, INW94, GZ97, RR99, MV99, STV99, GW02]

• Distributed & Network Algorithms [WZ95,Zuc97,RZ98,Ind02]

• Hardness of Approximation [Zuc93,Uma99,MU01,Zuc06]

• Data Structures [Ta02]

• Cryptography [CDHKS00,Lu02,DRS04,NV04]

• Metric Embeddings [Ind06]

Constructions over the years

• 1990 – 1999: largely combinatorial
  – hashing
  – composition, iteration

• 1999: new ingredient – error-correcting codes

• 2003 – present: “milestone” parameters achieved, and slightly surpassed
  – composition + polynomial method [LRVW03 + DKSS09]
  – “purely algebraic” [GUV09 + TU12]

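Condensers

• Intermediate object for obtaining extractor: a “$k \to_\varepsilon k'$ condenser”

[Figure: condenser $C$ takes a source string from a set of $2^k$ strings in $\{0,1\}^n$ and a $d$-bit seed, and outputs $m$ bits; the output set is drawn as $2^{k'}$ strings]

• “lossless” if $k' = k + d$

Goals: minimize $d$ and $m$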

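Graph viewpoint

$C : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$

[Figure: bipartite graph with left vertex set $[N] = \{0,1\}^n$, right vertex set $[M] = \{0,1\}^m$, and left degree $D = 2^d$; each $x$ is joined to $C(x, y)$ for every seed $y$; a subset $T$ of the right side and the set $\mathrm{BAD}(T)$ on the left are marked]

• $\mathrm{BAD}(T)$: left vertices with “too many neighbors in $T$”
• argue that $\mathrm{BAD}(T)$ is small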

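Graph viewpoint

[Figure: bipartite graph of $C : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ with left side $[N] = \{0,1\}^n$, right side $[M] = \{0,1\}^m$, degree $D = 2^d$, and the sets $T$ and $\mathrm{BAD}(T)$]

When $\mathrm{BAD}(T) = \{x : \Pr_y[C(x, y) \in T] = 1\}$:

$C$ is a lossless condenser if and only if $|\mathrm{BAD}(T)| \le (1+\varepsilon)|T|/D$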
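For tiny parameters, $\mathrm{BAD}(T)$ can be computed by brute force, which makes the definition concrete. The sketch below is illustrative only: `bad_set` and the toy map `toy_C` are hypothetical names, and `toy_C` is an arbitrary function chosen for the example, not a condenser from the talk. The `threshold` argument covers both variants of the definition (probability equal to 1, or at least some $\varepsilon$).

```python
def bad_set(C, sources, seeds, T, threshold=1.0):
    """Return every x whose fraction of seeds y with C(x, y) in T
    is at least `threshold` (threshold=1.0 means all seeds land in T)."""
    T = set(T)
    bad = []
    for x in sources:
        hits = sum(1 for y in seeds if C(x, y) in T)
        if hits >= threshold * len(seeds):
            bad.append(x)
    return bad

# Hypothetical toy map: 3-bit inputs, 1-bit seed, outputs in {0, 1, 2, 3}.
def toy_C(x, y):
    return (x + y) % 4

print(bad_set(toy_C, range(8), range(2), T={0, 1}))  # [0, 4]
```

Here $|T| = 2$ and $D = 2$, and exactly two inputs have all of their neighbors inside $T$.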

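Graph viewpoint

[Figure: bipartite graph of $C : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ with left side $[N] = \{0,1\}^n$, right side $[M] = \{0,1\}^m$, degree $D = 2^d$, and the sets $T$ and $\mathrm{BAD}(T)$]

When $\mathrm{BAD}(T) = \{x : \Pr_y[C(x, y) \in T] \ge \varepsilon\}$:

$C$ is a $\log(K/\varepsilon) \to_\varepsilon \log(K'/\varepsilon)$ condenser if for $|T| = K'$ we have $|\mathrm{BAD}(T)| \le K$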

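Graph viewpoint

[Figure: bipartite graph of $C : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ with left side $[N] = \{0,1\}^n$, right side $[M] = \{0,1\}^m$, degree $D = 2^d$, and the sets $T$ and $\mathrm{BAD}(T)$]

Goal #1: prove $|\mathrm{BAD}(T)| < (1+\varepsilon)|T|/D$
Goal #2: handle sets $T$ as large as $M/\mathrm{poly}(n)$

#1 + #2 would give optimal extractors!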

Outline for rest of talk

1. first construction and proof [Guruswami, Umans, Vadhan 2009]

2. second construction: using the idea twice [Ta-Shma, Umans 2012]

3. an open question

First construction and its analysis

First construction

• $\mathbb{F}_q$ finite field
• parameter $h \le q$
• degree-$n$ polynomial $E(Y)$ irreducible over $\mathbb{F}_q$
  – source: degree $n-1$ univariate polynomial $f$
  – define $f_i(Y) = f(Y)^{h^i} \bmod E(Y)$

$C(f, y) = (y, f_0(y), f_1(y), f_2(y), \ldots, f_{m-1}(y))$ for seed $y \in \mathbb{F}_q$

[Figure: condenser $C$ takes the source (a string from a set of $2^k$ strings in $\{0,1\}^n$) and a $d$-bit seed]
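The map $C$ can be sketched directly from the definition. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it takes $q = p$ prime (the talk allows any finite field), represents polynomials as coefficient lists with the constant term first, requires $E$ to be monic, and uses hypothetical helper names (`poly_mulmod`, `poly_powmod`, `poly_eval`, `condenser`).

```python
def poly_mulmod(a, b, E, p):
    """Product of polynomials a, b over F_p, reduced modulo the monic polynomial E.
    Polynomials are coefficient lists, constant term first."""
    n = len(E) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % p
    # Cancel each coefficient of degree >= n by subtracting a shifted multiple of E.
    for d in range(len(prod) - 1, n - 1, -1):
        c = prod[d]
        if c:
            for k in range(n + 1):
                prod[d - n + k] = (prod[d - n + k] - c * E[k]) % p
    return (prod + [0] * n)[:n]

def poly_powmod(f, e, E, p):
    """f^e mod E over F_p by square-and-multiply (e >= 1)."""
    result, base = [1], f
    while e:
        if e & 1:
            result = poly_mulmod(result, base, E, p)
        base = poly_mulmod(base, base, E, p)
        e >>= 1
    return result

def poly_eval(f, y, p):
    """Evaluate f at y over F_p using Horner's rule."""
    acc = 0
    for c in reversed(f):
        acc = (acc * y + c) % p
    return acc

def condenser(f, y, h, m, E, p):
    """C(f, y) = (y, f_0(y), ..., f_{m-1}(y)) with f_i = f^(h^i) mod E."""
    return (y,) + tuple(poly_eval(poly_powmod(f, h**i, E, p), y, p) for i in range(m))

# Toy parameters (assumed for illustration): p = 5, E(Y) = Y^2 + 2, h = 2, m = 2.
E = [2, 0, 1]   # irreducible over F_5 because -2 = 3 is not a square mod 5
f = [1, 1]      # f(Y) = 1 + Y
print(condenser(f, 1, h=2, m=2, E=E, p=5))  # (1, 2, 1)
```

Note that $f_1(y)$ is not simply $f(y)^h$: the reduction modulo $E(Y)$ happens before evaluation, which is what makes the $f_i$ behave like powers of $f$ in the extension field $\mathbb{F}_q[Y]/E(Y)$.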

First construction

define: $f_i(Y) = f(Y)^{h^i} \bmod E(Y)$, with $E(Y)$ irreducible

• Fix $T \subseteq \mathbb{F}_q^{m+1}$ of size at most $(1 - \varepsilon)\,q \cdot h^m$
  – note goal #2 was $q \cdot q^m/\mathrm{poly}(n)$

• Define $\mathrm{BAD}(T) = \{f : \Pr_y[C(f, y) \in T] = 1\}$
• We will prove: $|\mathrm{BAD}(T)| < h^m$
  – this meets goal #1

$T \subseteq \mathbb{F}_q^{m+1}$, $\mathrm{BAD}(T) = \{f : \Pr_y[C(f, y) \in T] = 1\}$

• $Q(W, W_0, \ldots, W_{m-1})$ vanishes on $T$
  – $\deg_W(Q) \le (1-\varepsilon)q$ and $\deg_{W_i}(Q) \le h-1$
  – a nonzero $Q$ of this shape exists because it has more than $(1-\varepsilon)\,q \cdot h^m \ge |T|$ coefficients

• $R_f(Y) = Q(Y, f_0(Y), \ldots, f_{m-1}(Y))$
  – $f \in \mathrm{BAD}(T) \Rightarrow R_f(y) = 0 \ \ \forall y \in \mathbb{F}_q$
  – $\deg(R_f) \le (1 - \varepsilon)q + hmn < q$ [require $q > hnm/\varepsilon$]

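First construction

define: $f_i(Y) = f(Y)^{h^i} \bmod E(Y)$, with $E(Y)$ irreducible

• $Q(W, W_0, \ldots, W_{m-1})$ vanishes on $T$
• $f \in \mathrm{BAD}(T) \Rightarrow R_f(Y) = Q(Y, f_0(Y), \ldots, f_{m-1}(Y)) \equiv 0$ (it has $q$ roots but degree below $q$)
  $\Rightarrow (Y, f_0(Y), \ldots, f_{m-1}(Y))$ is a root of $Q$
  $\Rightarrow f$ is a root of $Q^*(Z) = Q(Y, Z, Z^h, \ldots, Z^{h^{m-1}}) \bmod E(Y)$
• Conclude: $|\mathrm{BAD}(T)| \le \deg(Q^*) = h^m - 1$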
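The degree of $Q^*$ quoted on this slide follows from the individual degree bounds on $Q$. As a short worked step (using only the bound $\deg_{W_i}(Q) \le h - 1$ and the substitution $W_i \mapsto Z^{h^i}$ stated in the talk):

```latex
\deg_Z(Q^*) \;\le\; \sum_{i=0}^{m-1} (h-1)\,h^{i}
          \;=\; (h-1)\cdot\frac{h^{m}-1}{h-1}
          \;=\; h^{m}-1 \;<\; h^{m}.
```

Since every $f \in \mathrm{BAD}(T)$ is a root of the univariate polynomial $Q^*$, this gives $|\mathrm{BAD}(T)| < h^m$, matching goal #1 for $|T| \approx q \cdot h^m$ and $D = q$.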



First construction – recap

define: $f_i(Y) = f(Y)^{h^i} \bmod E(Y)$, with $E(Y)$ irreducible

• Fix $T \subseteq \mathbb{F}_q^{m+1}$ of size at most $(1 - \varepsilon)\,q \cdot h^m$

• We proved: $|\mathrm{BAD}(T)| < h^m$

• Two requirements force $h < q^{1-\alpha}$ ($\alpha$ constant)
  – $q > nmh/\varepsilon$
  – $q \le \mathrm{poly}(n)$

• So $|T| < q\,h^m \le q\,(q^m)^{1-\alpha} \approx M^{1-\alpha}$ [want close to $M$]

best possible

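Graph viewpoint

[Figure: bipartite graph of $C : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ with left side $[N] = \{0,1\}^n$, right side $[M] = \{0,1\}^m$, degree $D = 2^d$, and the sets $T$ and $\mathrm{BAD}(T)$]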

Many 0s below $\Rightarrow$ root above

info about:                                     polynomial:            type of poly:
$T \subseteq \mathbb{F}_q^{m+1}$                $Q$ interpolates $T$   multivariate over $\mathbb{F}_q$
$\mathrm{BAD}(T) \subseteq \mathbb{F}_{q^n}$    $Q^*$                  univariate over $\mathbb{F}_{q^n}$

many 0s on curve defined by $f$ $\Rightarrow$ (degree argument) $\Rightarrow$ $f$ is a root

Next: use this idea twice…

Second construction and its analysis

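First modification

• $\mathbb{F}_q$ finite field

• degree-$n$ polynomial $E(Y)$ irreducible over $\mathbb{F}_q$
  – source: degree $n-1$ univariate polynomial $f$
  – $f_i(Y) = G_i(f)$ for $G_i : \mathbb{F}_{q^n} \to \mathbb{F}_{q^n}$ (replacing $f_i(Y) = f(Y)^{h^i} \bmod E(Y)$)

$C(f, y) = (G_0(f)(y), G_1(f)(y), \ldots, G_{m-1}(f)(y))$ for seed $y \in \mathbb{F}_q$

($\deg(G_i)$ will be $h^{m-1}$, same as before)

[Figure: condenser $C$ takes the source (a string from a set of $2^k$ strings in $\{0,1\}^n$) and a $d$-bit seed]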

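Second modification

• $\mathbb{F}_q = \mathbb{F}_h[Z]/D(Z)$ (degree 2 extension)

• degree-$n$ polynomial $E(Y)$ irreducible over $\mathbb{F}_q$
  – source: degree $n-1$ univariate polynomial $f$
  – $f_i(Y) = G_i(f)$ for $G_i : \mathbb{F}_{q^n} \to \mathbb{F}_{q^n}$

$C(f; y, z) = (G_0(f)(y)(z), G_1(f)(y)(z), \ldots, G_{m-1}(f)(y)(z))$ for seed $y \in \mathbb{F}_q$, $z \in \mathbb{F}_h$

• now $C$ maps into $\mathbb{F}_h^m$

[Figure: condenser $C$ takes the source (a string from a set of $2^k$ strings in $\{0,1\}^n$) and a $d$-bit seed]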

Graph viewpoint – reminder

[Figure: bipartite graph of $C : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ with left side $[N] = \{0,1\}^n$, right side $[M] = \{0,1\}^m$, degree $D = 2^d$, and the sets $T$ and $\mathrm{BAD}(T)$]

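Analysis of 2nd construction

$C(f; y, z) = (G_0(f)(y)(z), G_1(f)(y)(z), \ldots, G_{m-1}(f)(y)(z))$ for $y \in \mathbb{F}_q$, $z \in \mathbb{F}_h$

• Fix $T \subseteq \mathbb{F}_h^m$ of size at most $(1 - \varepsilon)h^m$

• Define $\mathrm{BAD}(T) = \{f : \Pr_{y,z}[C(f; y, z) \in T] = 1\}$
• will (try to) prove: $|\mathrm{BAD}(T)| < h^m \cdot (\text{small})$
  – note goal #1 was $|\mathrm{BAD}(T)| \le h^m/(qh)$

[Figure: tower of fields $\mathbb{F}_{q^n}$, $\mathbb{F}_q$, $\mathbb{F}_h$]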

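Analysis of 2nd construction

$T \subseteq \mathbb{F}_h^m$, $\mathrm{BAD}(T) = \{f : \Pr_{y,z}[C(f; y, z) \in T] = 1\}$

• $Q(W_0, \ldots, W_{m-1})$ vanishes on $T$ with mult. $t$
  – $\deg(Q) \le ht-1$

[Figure: tower of fields $\mathbb{F}_{q^n}$, $\mathbb{F}_q$, $\mathbb{F}_h$]

Calculation…

• $T \subseteq \mathbb{F}_h^m$ of size $(1 - \varepsilon)h^m$

• $Q(W_0, \ldots, W_{m-1})$
  – vanishes on $T$ with multiplicity $t$
  – total degree $ht-1$

• such a $Q$ exists when (# of monomials) $>$ $|T| \cdot$ (# constraints for each point in $T$), which holds if $t > m^2/\varepsilon$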
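The counting condition can be checked numerically. The slide's displayed formula did not survive extraction, so the sketch below relies on the standard counts, which are an assumption about what the slide showed: there are $\binom{ht-1+m}{m}$ monomials of total degree at most $ht-1$ in $m$ variables, and vanishing with multiplicity $t$ at one point imposes $\binom{t-1+m}{m}$ linear constraints. The function name `interpolation_exists` is hypothetical.

```python
from fractions import Fraction
from math import comb

def interpolation_exists(h, m, t, eps):
    """True if the number of monomials of total degree <= h*t - 1 in m variables
    exceeds the number of linear constraints needed to vanish with
    multiplicity t on (1 - eps) * h^m points."""
    monomials = comb(h * t - 1 + m, m)
    constraints_per_point = comb(t - 1 + m, m)
    size_T = (1 - Fraction(eps)) * h**m
    return monomials > size_T * constraints_per_point

# Toy parameters (assumed): h = 10, m = 3, eps = 1/2, and t = 19 > m^2 / eps = 18.
print(interpolation_exists(h=10, m=3, t=19, eps=Fraction(1, 2)))  # True
```

With $t = 1$ and the same $h$, $m$, $\varepsilon$, the count fails (220 monomials against 500 constraints), which is why multiplicities are needed to handle sets $T$ of size close to $h^m$.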

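• $Q(W_0, \ldots, W_{m-1})$ vanishes on $T$ with mult. $t$
  – $\deg(Q) \le ht-1$

• $R_{f,y}(Z) = Q(G_0(f)(y)(Z), \ldots, G_{m-1}(f)(y)(Z))$
  – $f \in \mathrm{BAD}(T),\ y \in \mathbb{F}_q \Rightarrow R_{f,y}(z) = 0 \ \ \forall z \in \mathbb{F}_h$ (mult. $t$)
  – $\deg(R_{f,y}) \le ht-1 < ht$

[Figure: tower of fields $\mathbb{F}_{q^n}$, $\mathbb{F}_q$, $\mathbb{F}_h$]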


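• $Q(W_0, \ldots, W_{m-1})$ vanishes on $T$ with mult. $t$

• $R_{f,y}(Z) = Q(G_0(f)(y)(Z), \ldots, G_{m-1}(f)(y)(Z))$
  – $f \in \mathrm{BAD}(T) \Rightarrow R_{f,y} \equiv 0$ for all $y \in \mathbb{F}_q$

• $S_f(Y) = Q(G_0(f)(Y), \ldots, G_{m-1}(f)(Y))$
  – $\Rightarrow S_f(y) = 0 \ \ \forall y \in \mathbb{F}_q$; $\deg(S_f) \le htn < q$ [require $h > tn$, since $q = h^2$]

[Figure: tower of fields $\mathbb{F}_{q^n}$, $\mathbb{F}_q$, $\mathbb{F}_h$]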


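$T \subseteq \mathbb{F}_h^m$, $\mathrm{BAD}(T) = \{f : \Pr_{y,z}[C(f; y, z) \in T] = 1\}$

• $Q(W_0, \ldots, W_{m-1})$ vanishes on $T$ with mult. $t$
• $f \in \mathrm{BAD}(T) \Rightarrow S_f(Y) = Q(G_0(f)(Y), \ldots, G_{m-1}(f)(Y)) \equiv 0$
  $\Rightarrow (G_0(f)(Y), \ldots, G_{m-1}(f)(Y))$ is a root of $Q$
  $\Rightarrow f$ is a root of $Q^*(Z) = Q(G_0(Z), \ldots, G_{m-1}(Z))$
• Conclude: $|\mathrm{BAD}(T)| \le \deg(Q^*) < ht \cdot \deg(G_i) = ht \cdot h^{m-1}$

[Figure: tower of fields $\mathbb{F}_{q^n}$, $\mathbb{F}_q$, $\mathbb{F}_h$]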

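Second construction – recap

$C(f; y, z) = (G_0(f)(y)(z), G_1(f)(y)(z), \ldots, G_{m-1}(f)(y)(z))$ for $y \in \mathbb{F}_q$, $z \in \mathbb{F}_h$

• Fix $T \subseteq \mathbb{F}_h^m$ of size at most $(1 - \varepsilon)h^m$

• we proved*: $|\mathrm{BAD}(T)| < h^m \cdot t$
  – note goal #1 was $|\mathrm{BAD}(T)| \le h^m/(qh)$

* but $Q^*(Z) = Q(G_0(Z), \ldots, G_{m-1}(Z))$ may be zero!

[Figure: tower of fields $\mathbb{F}_{q^n}$, $\mathbb{F}_q$, $\mathbb{F}_h$]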

Choice of $G_i$ + problem solved

• Can choose $G_i$ of degree $h^{m-1}$ s.t.
  – each $G_i$ is a linearized polynomial (sparse)
  – $(\mathbb{F}_h)^m$ is contained in the image of the map $G = (G_0, \ldots, G_{m-1}) : \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$

[Figure: the map $G$ into $(\mathbb{F}_{q^n})^m$, with $\mathbb{F}_h^m$ inside its image]

Choice of $G_i$ + problem solved

• Can choose $G_i$ of degree $h^{m-1}$ s.t.
  – $(\mathbb{F}_h)^m$ is contained in the image of the map $G = (G_0, \ldots, G_{m-1}) : \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$

• $Q(W_0, \ldots, W_{m-1})$ vanishes on $T$ with mult. $2t$
  – price: $T$ of size only $\approx (h/2)^m$ instead of $\approx h^m$
  – payoff: some derivative $Q^{(v)}$ of order $\le t$ satisfies
    • $Q^{(v)}$ is not zero on all of $\mathbb{F}_h^m$
    • hence $Q^{(v)}(G_0(Z), \ldots, G_{m-1}(Z)) \not\equiv 0$
    • $Q^{(v)}$ still vanishes on $T$ with multiplicity at least $t$

Condensers

• Intermediate object for obtaining extractor:

[Figure: condenser $C$ takes a source string from a set of $2^k$ strings in $\{0,1\}^n$ and a $d$-bit seed, and outputs $m$ bits; the output set is drawn as $2^{k'}$ strings]

• “lossless” if $k' = k + d$

2nd construction achieves $d = O(\log n)$ and
  – $k' \approx (1 - 1/\log n)\,k$: “sublinear entropy loss”
  – $k' \approx (1 - 1/\log n)\,m$: “sublinear entropy deficiency”

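Getting an extractor

[Figure: condenser $C$ takes a source string from a set of $2^k$ strings in $\{0,1\}^n$ and a seed of $d_1$ bits; its output, a string in $\{0,1\}^{n'}$ from a set of $2^{k'}$ strings, is fed with a seed of $d_2$ bits into extractor $E$]

• $E$ only needs to work for “dense” sources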

Getting an extractor

Various works: from a source with min-entropy rate $(1 - \alpha)$ can extract $(1 - 3\alpha)k$ bits with seed $d = O(\text{optimal})$

[Figure: extractor $E$ takes a source string from a set of $2^k$ strings in $\{0,1\}^n$ and a $d$-bit seed]


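Goals:        optimal:              “milestone”:                            this work:
short seed    $d = \log n + O(1)$   $d = O(\log n)$                         $d = O(\log n)$
long output   $m = k + d - O(1)$    $m = (1 - \alpha)k$, $\alpha$ any constant   $m = (1 - \alpha)k$, $\alpha = 1/\log n$

[Figure: extractor $E$ takes a source string from a set of $2^k$ strings in $\{0,1\}^n$ and a $d$-bit seed]

currently the “world record”…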

An open question

A question

• find an explicit curve $G = (G_0, \ldots, G_{m-1}) : \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$ with $\deg(G_i) \le h^m \cdot \mathrm{poly}(h, m)$, so that

• for every $T \subseteq \mathbb{F}_h^m$ of size $h^m/\mathrm{poly}(h, m)$ there is an interpolating polynomial $Q_T(W_0, \ldots, W_{m-1})$ of degree $ht-1$ vanishing on $T$ with multiplicity $t$, but $Q_T(G_0(Z), \ldots, G_{m-1}(Z)) \not\equiv 0$

(parameters: $h = \mathrm{poly}(m)$, $t = \mathrm{poly}(m)$)

A question

• exists by simple probabilistic argument
  – for each $T$, find $Q_T \not\equiv 0$ as before
  – probability $Q_T$ is 0 on a random point $< 1/2$
  – probability $Q_T$ is 0 on $h^m$ random points $< 2^{-h^m}$
  – union bound over $< 2^{h^m}$ different sets $T$

• Related question: are sparse or linearized polynomials sufficient?

Conclusions

• algebraic constructions of randomness extractors with “world record” parameters

• main objects:
  – define: $f_i(Y) = f(Y)^{h^i} \bmod E(Y)$, with $E(Y)$ irreducible
  – curve $G = (G_0, \ldots, G_{m-1}) : \mathbb{F}_{q^n} \to (\mathbb{F}_{q^n})^m$
  – $C(f; y, z) = (G_0(f)(y)(z), \ldots, G_{m-1}(f)(y)(z))$ for $y \in \mathbb{F}_q$, $z \in \mathbb{F}_h$

• proof idea: “bad” strings are roots of poly $Q^*$

Open problems

• Obtain an optimal extractor construction!
  – construct optimal extractors for extremely dense sources (min-entropy $k = (1 - o(1))n$)
  – answering open question + overcoming a few technical hurdles would give condensers meeting goals #1 and #2

Thank you!
