Information and interactive computation
January 16, 2012
Mark Braverman, Computer Science, Princeton University
Prelude: one-way communication
• Basic goal: send a message from Alice to Bob over a channel.
[Diagram: Alice sends a message to Bob over a communication channel.]
One-way communication
1) Encode;
2) Send;
3) Decode.
Coding for one-way communication
• There are two main problems a good encoding needs to address:
– Efficiency: use the least amount of the channel/storage necessary.
– Error correction: recover from (reasonable) errors.
Interactive computation
Today’s theme: extending information and coding theory to interactive computation.
I will talk about interactive information theory and Anup Rao will talk about
interactive error correction.
Efficient encoding
• Can measure the cost of storing a random variable X very precisely.
• Entropy: H(X) = ∑ₓ Pr[X=x] · log(1/Pr[X=x]) (see the sketch below).
• H(X) measures the average amount of information a sample from X reveals.
• A uniformly random string of 1,000 bits has 1,000 bits of entropy.
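For illustration, a minimal Python sketch of this quantity (not from the talk; it computes the empirical entropy of a sequence of samples):

import math
from collections import Counter

def entropy(samples):
    """Empirical Shannon entropy in bits: sum of p(x) * log2(1/p(x))."""
    counts = Counter(samples)
    n = len(samples)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

print(entropy("HT" * 2000))  # a fair coin flip carries 1 bit: prints 1.0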
Efficient encoding
• H(X) = ∑ₓ Pr[X=x] · log(1/Pr[X=x]).
• The ZIP algorithm works because H(X = typical 1MB file) < 8 Mbits.
• Pr[“Hello, my name is Bob”] >> Pr[“h)2cjCv9]dsnC1=Ns{da3”].
• For one-way encoding, Shannon’s source coding theorem states that
Communication ≈ Information.
Efficient encoding
• The problem of sending many samples of X can be implemented with H(X) communication per sample on average.
• The problem of sending a single sample of X can be implemented with < H(X)+1 communication in expectation (see the sketch below).
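A small Python sketch (my own helper, not from the talk) that builds a Huffman code implicitly and checks the source coding bound H(X) ≤ E[length] < H(X)+1:

import heapq, math

def huffman_expected_length(probs):
    """Expected codeword length of an optimal prefix-free (Huffman) code.
    Each merge of the two lightest weights adds one bit to every leaf below it."""
    heap = sorted(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total

probs = [0.5, 0.25, 0.125, 0.125]
H = sum(p * math.log2(1 / p) for p in probs)
print(H, huffman_expected_length(probs))  # 1.75 1.75 (equality for dyadic probabilities)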
Communication complexity [Yao]
• Focus on the two-party setting.
• Players A and B hold inputs X and Y, respectively, and implement a functionality F(X,Y); e.g. F(X,Y) = “X=Y?”.
Communication complexity
Goal: implement a functionality F(X,Y).
A protocol π(X,Y) computing F(X,Y) exchanges messages:
m1(X)
m2(Y, m1)
m3(X, m1, m2)
…
Communication cost = # of bits exchanged.
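A toy Python rendering of this definition (the function names are mine): a protocol is a sequence of message functions, each depending on the sender's input and the transcript so far, and its cost is the total number of bits exchanged.

def run_protocol(x, y, alice_msgs, bob_msgs):
    """Run alternating message functions; each maps (own input, transcript) to a bitstring."""
    transcript = []
    for a_msg, b_msg in zip(alice_msgs, bob_msgs):
        transcript.append(a_msg(x, tuple(transcript)))  # m1(X), m3(X, m1, m2), ...
        transcript.append(b_msg(y, tuple(transcript)))  # m2(Y, m1), m4(Y, m1..m3), ...
    cost = sum(len(m) for m in transcript)              # communication cost in bits
    return transcript, cost

# Trivial protocol for "X=Y?": Alice sends X, Bob replies with the answer bit.
t, c = run_protocol("101", "101",
                    [lambda x, tr: x],
                    [lambda y, tr: "1" if tr[0] == y else "0"])
print(t, c)  # ['101', '1'] 4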
Distributional communication complexity
• The input pair (X,Y) is drawn according to some distribution μ.
• Goal: make a mistake on at most an ε fraction of inputs.
• The communication cost:
C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ).
Example
μ is a distribution over pairs of files. F is “X=Y?”:
Alice sends MD5(X) (128 bits); Bob replies with “X=Y?” (1 bit).
Communication cost = 129 bits; ε ≈ 2^{-128}.
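A sketch of this protocol in Python (MD5 appears because the slide uses it; it is of course not collision-resistant against adversarial inputs):

import hashlib

def alice_message(x: bytes) -> bytes:
    return hashlib.md5(x).digest()          # 128 bits = 16 bytes

def bob_answer(y: bytes, msg: bytes) -> bool:
    return msg == hashlib.md5(y).digest()   # the 1-bit reply "X=Y?"

x = y = b"contents of a 1MB file"
print(bob_answer(y, alice_message(x)))      # True; errs w.p. ~2^-128 on unequal random inputs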
Randomized communication complexity
• Goal: err with probability at most ε on every input.
• The communication cost: R(F,ε).
• Clearly: C(F,μ,ε) ≤ R(F,ε) for all μ.
• What about the converse?
• A minimax(!) argument [Yao]:
R(F,ε) = max_μ C(F,μ,ε).
A note about the model
• We assume a shared public source of randomness.
[Diagram: A and B, holding X and Y, share a public random string R.]
The communication complexity of EQ(X,Y)
• The communication complexity of equality: R(EQ,ε) ≈ log 1/ε.
• Send log 1/ε random hash functions applied to the inputs; accept if all of them agree (sketched below).
• What if ε = 0? R(EQ,0) ≈ n, where X,Y ∈ {0,1}^n.
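A Python sketch of the hashing protocol using shared randomness (inner products over GF(2) are one standard hash family; that specific choice is my assumption):

import math, random

def inner_bit(r, s):
    """1-bit hash: inner product of bit vectors over GF(2)."""
    return sum(a & int(b) for a, b in zip(r, s)) % 2

def eq_protocol(x, y, eps, rng):
    """Decide "x == y?" with error <= eps using ~log(1/eps) exchanged bits."""
    for _ in range(max(1, math.ceil(math.log2(1 / eps)))):
        r = [rng.randint(0, 1) for _ in x]      # public random hash function
        if inner_bit(r, x) != inner_bit(r, y):  # Alice sends her bit; Bob compares
            return False                        # each hash catches x != y w.p. 1/2
    return True

rng = random.Random(0)
print(eq_protocol("0110", "0110", 0.01, rng))   # True
print(eq_protocol("0110", "0111", 0.01, rng))   # False, except w.p. <= 0.01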
Information in a two-way channel
• H(X) is the “inherent information cost” of sending a message distributed according to X over the channel.
[Diagram: Alice sends X to Bob over the communication channel.]
What is the two-way analogue of H(X)?
Entropy of interactive computation
[Diagram: A holds X, B holds Y, with shared randomness R.]
• “Inherent information cost” of interactive two-party tasks.
One more definition: Mutual Information
• The mutual information of two random variables is the amount of information knowing one reveals about the other:
I(A;B) = H(A) + H(B) − H(AB).
• If A,B are independent, I(A;B) = 0.
• I(A;A) = H(A).
[Venn diagram: I(A;B) is the overlap between H(A) and H(B).]
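A small Python sketch of the definition (representation is my choice), computing I(A;B) = H(A) + H(B) − H(AB) from an explicit joint distribution:

import math

def H(dist):
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """joint maps (a, b) -> probability; returns I(A;B) in bits."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0) + p
        pb[b] = pb.get(b, 0) + p
    return H(pa) + H(pb) - H(joint)

print(mutual_information({(0, 0): 0.5, (1, 1): 0.5}))  # I(A;A) = H(A) = 1.0
print(mutual_information({(a, b): 0.25 for a in (0, 1) for b in (0, 1)}))  # independent: 0.0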
Information cost of a protocol
• [Chakrabarti-Shi-Wirth-Yao-01, Bar-Yossef-Jayram-Kumar-Sivakumar-04, Barak-B-Chen-Rao-10].
• Caution: different papers use “information cost” to denote different things!
• Today, we have a better understanding of the relationship between those different things.
Information cost of a protocol
• Prior distribution: (X,Y) ~ μ.
[Diagram: A holds X, B holds Y; running protocol π produces the transcript, also denoted π.]
I(π, μ) = I(π;Y|X) + I(π;X|Y)
= (what Alice learns about Y) + (what Bob learns about X).
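To make the definition concrete, a toy Python computation (my own example, not from the talk) of I(π, μ) for the one-message protocol “Alice sends X” under the uniform prior on {0,1}², expanding conditional mutual information as I(A;B|C) = H(AC) + H(BC) − H(ABC) − H(C):

import math

def H(dist):
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def I_cond(joint, a, b, c):
    """I(A;B|C) from a joint distribution {(outcome tuple): prob}; a, b, c are coordinate indices."""
    def marg(idxs):
        m = {}
        for o, p in joint.items():
            key = tuple(o[i] for i in idxs)
            m[key] = m.get(key, 0) + p
        return m
    return H(marg([a, c])) + H(marg([b, c])) - H(marg([a, b, c])) - H(marg([c]))

joint = {(x, y, x): 0.25 for x in (0, 1) for y in (0, 1)}  # coordinates: (X, Y, transcript)
print(I_cond(joint, 2, 1, 0) + I_cond(joint, 2, 0, 1))     # I(pi;Y|X) + I(pi;X|Y) = 0 + 1 = 1.0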
External information cost
• (X,Y) ~ μ.
[Diagram: A holds X, B holds Y; an external observer C (Charlie) sees the transcript π.]
Iext(π, μ) = I(π;XY)
= what Charlie learns about (X,Y).
Another view on I and Iext
• It is always the case that C(π, μ) ≥ Iext(π, μ) ≥ I(π, μ).
• Iext measures the ability of Alice and Bob to compute F(X,Y) in an information-theoretically secure way if they are afraid of an eavesdropper.
• I measures the ability of the parties to compute F(X,Y) if they are afraid of each other.
Example
• F is “X=Y?”.
• μ is a distribution where w.p. ½ X=Y, and w.p. ½ (X,Y) are random.
Protocol: Alice sends MD5(X) [128 bits]; Bob replies with “X=Y?” [1 bit].
Iext(π, μ) = I(π;XY) = 129 bits (what Charlie learns about (X,Y)).
Example
• F is “X=Y?”.
• μ is a distribution where w.p. ½ X=Y, and w.p. ½ (X,Y) are random.
Protocol: Alice sends MD5(X) [128 bits]; Bob replies with “X=Y?” [1 bit].
I(π, μ) = I(π;Y|X) + I(π;X|Y) ≈ 1 + 64.5 = 65.5 bits:
Alice learns only the 1-bit answer about Y; Bob learns ≈ 128 bits about X when X≠Y (w.p. ½) and ≈ 1 bit when X=Y, averaging to ≈ 64.5 bits.
The (distributional) information cost of a problem F
• Recall:
C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ).
• By analogy:
I(F,μ,ε) := inf_{π computes F with error ≤ ε} I(π, μ).
Iext(F,μ,ε) := inf_{π computes F with error ≤ ε} Iext(π, μ).
I(F,μ,ε) vs. C(F,μ,ε): compressing interactive computation
Source Coding Theorem: the problem of sending a sample of X can be implemented with expected communication < H(X)+1, essentially the information content of X.
Is the same compression true for interactive protocols?
Can F be solved in I(F,μ,ε) communication?
Or in Iext(F,μ,ε) communication?
The big question
• Can interactive communication be compressed?
• Can π be simulated by π’ such that C(π’, μ) ≈ I(π, μ)?
Does I(F,μ,ε) ≈ C(F,μ,ε)?
Compression results we know
• Let ε, ρ be constants; let π be a protocol that computes F with error ε.
• π’s costs: C, Iext, I.
• Then π can be simulated using:
– (I·C)^{1/2} · polylog(C) communication; [Barak-B-Chen-Rao’10]
– Iext · polylog(C) communication; [Barak-B-Chen-Rao’10]
– 2^{O(I)} communication; [B’11]
while introducing an extra error of ρ.
The amortized cost of interactive computation
Source Coding Theorem: the amortized cost of sending many independent samples of X is exactly H(X).
What is the amortized cost of computing many independent copies of F(X,Y)?
Information = amortized communication
• Theorem [B-Rao’11]: for ε > 0,
I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n.
• I(F,μ,ε) is the interactive analogue of H(X).
Information = amortized communication
• Theorem [B-Rao’11]: for ε > 0,
I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n.
• I(F,μ,ε) is the interactive analogue of H(X).
• Can we get rid of μ, i.e. make I(F,ε) a property of the task F?
[Diagram: C(F,μ,ε) ↔ I(F,μ,ε); R(F,ε) ↔ ?]
Prior-free information cost
• Define:
I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ).
• Want a protocol that reveals little information against all priors μ!
• Definitions are cheap!
• What is the connection between the “syntactic” I(F,ε) and the “meaningful” I(F,μ,ε)?
• I(F,μ,ε) ≤ I(F,ε)…
Prior-free information cost
• I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ).
• I(F,μ,ε) ≤ I(F,ε) for all μ.
• Recall: R(F,ε) = max_μ C(F,μ,ε).
• Theorem [B’11]:
I(F,ε) ≤ 2·max_μ I(F,μ,ε/2),
I(F,0) = max_μ I(F,μ,0).
Prior-free information cost
• Recall: I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n.
• Theorem: for ε > 0,
I(F,ε) = lim_{n→∞} R(F^n, ε)/n.
Example
• R(EQ,0) ≈ n.
• What is I(EQ,0)?
The information cost of Equality
• What is I(EQ,0)?
• Consider the following protocol (sketched in code below).
[Diagram: A holds X ∈ {0,1}^n, B holds Y ∈ {0,1}^n.]
• Using public randomness, pick a non-singular n×n matrix A over Z₂.
• Round 1: exchange A₁·X and A₁·Y.
• Round 2: exchange A₂·X and A₂·Y.
• Continue for n steps, or until a disagreement is discovered.
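A Python sketch of this protocol (generating the matrix by rejection sampling is my implementation choice): since A is non-singular over Z₂, all n rows agreeing forces A·X = A·Y and hence X = Y, so the error is 0.

import random

def gf2_rank(rows):
    """Rank over GF(2) by Gauss-Jordan elimination; mutates its argument."""
    rank = 0
    for col in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def random_nonsingular_matrix(n, rng):
    """Rejection-sample an invertible n x n matrix over GF(2) (success prob. ~0.29)."""
    while True:
        A = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
        if gf2_rank([row[:] for row in A]) == n:
            return A

def eq_zero_error(x, y, rng):
    A = random_nonsingular_matrix(len(x), rng)        # public randomness
    for row in A:                                     # round i: exchange A_i*X and A_i*Y
        ax = sum(a & b for a, b in zip(row, x)) % 2
        ay = sum(a & b for a, b in zip(row, y)) % 2
        if ax != ay:
            return False                              # for X != Y: O(1) rounds on average
    return True                                       # all n rounds agree => X = Y

rng = random.Random(1)
print(eq_zero_error([0, 1, 1, 0], [0, 1, 1, 0], rng))  # True
print(eq_zero_error([0, 1, 1, 0], [0, 1, 0, 0], rng))  # False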
Analysis (sketch)
• If X≠Y, the protocol will terminate in O(1) rounds on average, and thus reveal O(1) information.
• If X=Y… the players only learn the fact that X=Y (≤1 bit of information).
• Thus the protocol has O(1) information complexity.
Direct sum theorems
• I(F,ε) = lim_{n→∞} R(F^n,ε)/n.
• Questions:
– Does R(F^n,ε) = Ω(n·R(F,ε))?
– Does R(F^n,ε) = ω(R(F,ε))?
Direct sum strategy
• The strategy for proving direct sum results:
take a protocol for F^n that costs C_n = R(F^n,ε), and turn it into a protocol for F that costs ≈ C_n/n.
• This would mean that C ≲ C_n/n, i.e. C_n ≳ n·C (up to lower-order factors).
[Diagram: a protocol for n copies of F, of cost C_n, yields a protocol for 1 copy of F of cost ≈ C_n/n?]
Direct sum strategy
• If life were so simple…
[Diagram: if the F^n protocol's cost C_n split evenly across Copy 1, …, Copy n, extracting a protocol for 1 copy of F with cost C_n/n would be easy.]
Direct sum strategy
• Theorem: I(F,ε) = I(F^n,ε)/n ≤ R(F^n,ε)/n = C_n/n.
• Compression → direct sum!
The information cost angle
• There is a protocol of communication cost C_n, but information cost ≤ C_n/n.
[Diagram: restricting the F^n protocol (cost C_n, spanning Copy 1, …, Copy n) to a single copy of F gives a protocol that may still communicate C_n bits but reveals only ≈ C_n/n bits of information; compression would finish the job.]
Direct sum theorems
Best known general simulation [BBCR’10]:
• A protocol with C communication and I information cost can be simulated using (I·C)^{1/2} · polylog(C) communication.
• Implies: R(F^n,ε) = Ω̃(n^{1/2} · R(F,ε)).
Compression vs. direct sum
• We saw that compression → direct sum.
• A form of the converse is also true.
• Recall: I(F,ε) = lim_{n→∞} R(F^n,ε)/n.
• If there is a problem such that I(F,ε) = o(R(F,ε)), then R(F^n,ε) = o(n·R(F,ε)).
A complete problem
• Can define a problem called Correlated Pointer Jumping – CPJ(C,I).
• The problem has communication cost C and information cost I.
• CPJ(C,I) is the “least compressible problem”.
• If R(CPJ(C,I),1/3) = O(I), then R(F,1/3) = O(I(F,1/3)) for all F.
The big picture
[Diagram: a square relating R(F,ε), R(F^n,ε)/n, I(F,ε), and I(F^n,ε)/n. Known edges: direct sum for information (I(F,ε) = I(F^n,ε)/n) and information = amortized communication (I(F,ε) = lim R(F^n,ε)/n). Open edges: direct sum for communication? and interactive compression?]
Partial progress
• Can compress bounded-round interactive protocols.
• The main primitive is a one-shot version of the Slepian-Wolf theorem.
• Alice gets a distribution PX.
• Bob gets a prior distribution PY.
• Goal: both must sample from PX.
Correlated sampling
[Diagram: A holds PX, B holds PY; both must output the same sample M ~ PX.]
• The best we can hope for is D(PX||PY) communication, where
D(PX||PY) = ∑_{u∈U} PX(u) · log(PX(u)/PY(u)).
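In Python the divergence is straightforward to evaluate (a small illustration; assumes supp(PX) ⊆ supp(PY)):

import math

def kl_divergence(PX, PY):
    """D(PX||PY) = sum over u of PX(u) * log2(PX(u)/PY(u)), in bits."""
    return sum(p * math.log2(p / PY[u]) for u, p in PX.items() if p > 0)

print(kl_divergence({"a": 0.7, "b": 0.3}, {"a": 0.5, "b": 0.5}))  # ~0.119 bits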
Proof idea
• Sample using D(PX||PY) + O(log 1/ε + D(PX||PY)^{1/2}) communication, with statistical error ε.
[Diagram: public randomness supplies ~|U| pairs (u1,q1), (u2,q2), …, with each u_i uniform in U and each q_i uniform in [0,1]; Alice keeps the points with q_i < PX(u_i) (here u4), Bob keeps those with q_i < PY(u_i).]
Proof idea (continued)
[Diagram: Alice's surviving point is u4, while Bob's candidates include u2. Alice sends hash values h1(u4), h2(u4), letting Bob discard candidates such as u2 whose hashes do not match.]
Proof idea (continued)
[Diagram: if none of Bob's candidates matches, Bob doubles his threshold (PY → 2·PY → …), adding candidates such as u4; Alice keeps sending hashes h3(u4), h4(u4), …, h_{log 1/ε}(u4) until Bob is left with a unique matching candidate, u4.]
Analysis
• If PX(u4) ≈ 2^k · PY(u4), then the protocol will reach round k of doubling.
• There will be ≈ 2^k candidates.
• About k + log 1/ε hashes are needed.
• The contribution of u4 to the cost:
– PX(u4) · (log(PX(u4)/PY(u4)) + log 1/ε).
• Summing over all u recovers D(PX||PY) = ∑_{u∈U} PX(u) · log(PX(u)/PY(u)), plus lower-order terms.
Done!
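A toy end-to-end Python sketch of the sampling primitive (heavily simplified, and my own rendering: small finite universe, ideal shared hashes modeled as shared random 64-bit labels, loose bookkeeping of the bits sent):

import math, random

def correlated_sample(PX, PY, eps, rng):
    """Both parties output the same M ~ PX; returns (sample, hash bits revealed)."""
    U = list(PX)
    N = 5000
    darts = [(rng.choice(U), rng.random()) for _ in range(N)]  # public (u_i, q_i) pairs
    labels = [rng.getrandbits(64) for _ in range(N)]           # ideal hash of each dart
    a = next(i for i, (u, q) in enumerate(darts) if q < PX[u]) # Alice's dart: M ~ PX
    base = math.ceil(math.log2(1 / eps))
    for k in range(64):                                        # Bob doubles his threshold each round
        cand = [i for i, (u, q) in enumerate(darts) if q < min(1.0, 2 ** k * PY[u])]
        bits = base + max(1, math.ceil(math.log2(len(cand) + 1)))
        match = [i for i in cand if labels[i] % 2 ** bits == labels[a] % 2 ** bits]
        if len(match) == 1:
            return darts[match[0]][0], bits   # wrong sample only w.p. ~eps
    return darts[a][0], bits                  # fallback, not reached for sane inputs

rng = random.Random(2)
PX = {"a": 0.7, "b": 0.2, "c": 0.1}
PY = {"a": 0.25, "b": 0.25, "c": 0.5}
freq = {}
for _ in range(2000):
    m, _ = correlated_sample(PX, PY, 0.01, rng)
    freq[m] = freq.get(m, 0) + 1
print({u: round(freq.get(u, 0) / 2000, 2) for u in PX})  # close to PX: ~0.7 / ~0.2 / ~0.1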
Directions
• Can interactive communication be fully compressed? Is R(F,ε) = I(F,ε)?
• What is the relationship between I(F,ε), Iext(F,ε), and R(F,ε)?
• Many other questions in interactive coding theory!
Thank You