
An Introduction to Information Theory

Guangyue Han
The University of Hong Kong

July 2018 @ HKU

Outline of the Talk

- Fundamentals of Information Theory
- Research Directions: Memory Channels
- Research Directions: Continuous-Time Information Theory


Fundamentals of Information Theory

The Birth of Information Theory

C. E. Shannon. A mathematical theory of communication. Bell Syst. Tech. J., 27:379-423, 623-656, 1948.


Classical Information Theory


Information Theory Nowadays

Entropy: Definition

The entropy H(X) of a discrete random variable X is defined by

H(X) = −∑_{x∈X} p(x) log p(x).

Let

X = 1 with probability p,
    0 with probability 1 − p.

Then,

H(X) = −p log p − (1 − p) log(1 − p) ≜ H(p).

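As a quick numerical companion, here is a minimal Python/NumPy sketch (the function names are ours, and log is taken base 2 so entropy comes out in bits, consistent with the capacity formulas later):

    import numpy as np

    def entropy(p):
        # Entropy H(X) in bits of a distribution given as an array of probabilities.
        p = np.asarray(p, dtype=float)
        p = p[p > 0]                  # by convention, 0 log 0 = 0
        return float(-np.sum(p * np.log2(p)))

    def binary_entropy(p):
        # H(p) = -p log p - (1 - p) log(1 - p), the entropy of the two-valued X above.
        return entropy([p, 1.0 - p])

    print(binary_entropy(0.5))        # 1.0: a fair bit is maximally uncertain
    print(binary_entropy(0.11))       # ~0.5: a biased bit carries less uncertainty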

Entropy: Measure of Uncertainty


Joint Entropy and Conditional Entropy

The joint entropy H(X, Y) of a pair of discrete random variables (X, Y) with a joint distribution p(x, y) is defined as

H(X, Y) = −∑_{x∈X} ∑_{y∈Y} p(x, y) log p(x, y).

If (X, Y) ∼ p(x, y), the conditional entropy H(Y|X) is defined as

H(Y|X) = ∑_{x∈X} p(x) H(Y|X = x)
       = −∑_{x∈X} p(x) ∑_{y∈Y} p(y|x) log p(y|x)
       = −∑_{x∈X} ∑_{y∈Y} p(x, y) log p(y|x).


All Entropies Together

Chain Rule

H(X, Y) = H(X) + H(Y|X).

Proof.

H(X, Y) = −∑_{x∈X} ∑_{y∈Y} p(x, y) log p(x, y)
        = −∑_{x∈X} ∑_{y∈Y} p(x, y) log p(x)p(y|x)
        = −∑_{x∈X} ∑_{y∈Y} p(x, y) log p(x) − ∑_{x∈X} ∑_{y∈Y} p(x, y) log p(y|x)
        = −∑_{x∈X} p(x) log p(x) − ∑_{x∈X} ∑_{y∈Y} p(x, y) log p(y|x)
        = H(X) + H(Y|X).

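The definitions and the chain rule can be checked numerically; a Python/NumPy sketch, with an arbitrary illustrative joint distribution (not one from the talk):

    import numpy as np

    # An example joint distribution p(x, y): rows index x, columns index y.
    pxy = np.array([[1/8,  1/16, 1/32, 1/32],
                    [1/16, 1/8,  1/32, 1/32],
                    [1/16, 1/16, 1/16, 1/16],
                    [1/4,  0,    0,    0   ]])

    def H(p):
        # Entropy in bits of any array of probabilities (zeros are skipped).
        p = np.asarray(p, dtype=float).ravel()
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    px = pxy.sum(axis=1)                                              # marginal p(x)
    H_joint = H(pxy)                                                  # H(X, Y)
    H_cond = sum(q * H(row / q) for q, row in zip(px, pxy) if q > 0)  # H(Y|X)

    print(H_joint, H(px) + H_cond)    # the two agree: H(X, Y) = H(X) + H(Y|X)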

Mutual Information

Original Definition

The mutual information I(X; Y) between two discrete random variables X, Y with joint distribution p(x, y) is defined as

I(X; Y) = ∑_{x,y} p(x, y) log [p(x, y) / (p(x)p(y))].

Alternative Definitions

I(X; Y) = H(X) + H(Y) − H(X, Y)
        = H(X) − H(X|Y)
        = H(Y) − H(Y|X).

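A hedged sketch of the original definition (Python/NumPy; mutual_information is our own helper, and the example joint distribution corresponds to a uniform input through a binary symmetric channel, which reappears later):

    import numpy as np

    def mutual_information(pxy):
        # I(X;Y) in bits from a joint distribution matrix pxy[x, y].
        pxy = np.asarray(pxy, dtype=float)
        px = pxy.sum(axis=1, keepdims=True)   # p(x)
        py = pxy.sum(axis=0, keepdims=True)   # p(y)
        mask = pxy > 0
        return float(np.sum(pxy[mask] * np.log2((pxy / (px * py))[mask])))

    # Uniform input through a binary symmetric channel with crossover 0.11.
    p = 0.11
    pxy = 0.5 * np.array([[1 - p, p], [p, 1 - p]])
    print(mutual_information(pxy))    # ~0.5 = 1 - H(0.11), cf. the BSC capacity later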

Mutual Information and Entropy


Asymptotic Equipartition Property Theorem

AEP Theorem
If X_1, X_2, ... are i.i.d. ∼ p(x), then

−(1/n) log p(X_1, X_2, ..., X_n) → H(X) in probability.

Proof.

−(1/n) log p(X_1, X_2, ..., X_n) = −(1/n) ∑_{i=1}^{n} log p(X_i) → −E[log p(X)] = H(X),

where the convergence is in probability by the weak law of large numbers.

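The convergence can be watched numerically; a Python/NumPy sketch with an arbitrary four-letter source whose entropy is H(X) = 1.75 bits:

    import numpy as np

    rng = np.random.default_rng(0)
    p = np.array([1/2, 1/4, 1/8, 1/8])        # example source, H(X) = 1.75 bits
    H = float(-np.sum(p * np.log2(p)))

    for n in [10, 100, 10_000]:
        x = rng.choice(len(p), size=n, p=p)   # i.i.d. draws X_1, ..., X_n
        print(n, -np.mean(np.log2(p[x])))     # -(1/n) log p(X_1, ..., X_n) -> H = 1.75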

The Shannon-McMillan-Breiman Theorem

Let {X_n} be a finite-valued stationary ergodic process. Then, with probability 1,

−(1/n) log p(X_1, X_2, ..., X_n) → H(X),

where H(X) here denotes the entropy rate of the process {X_n}, namely,

H(X) = lim_{n→∞} H(X_1, X_2, ..., X_n)/n.

Proof.
There are many; the simplest is the sandwich argument of Algoet and Cover [1988].

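For a process with memory the same convergence can be observed; a Python/NumPy sketch for a two-state stationary Markov chain (the transition matrix is an arbitrary example, and we use the standard fact that the entropy rate of such a chain is ∑_i π_i H(P_i)):

    import numpy as np

    P = np.array([[0.9, 0.1],     # example transition matrix
                  [0.4, 0.6]])
    pi = np.array([0.8, 0.2])     # its stationary distribution: pi P = pi

    def H(row):
        row = row[row > 0]
        return float(-np.sum(row * np.log2(row)))

    rate = sum(pi[i] * H(P[i]) for i in range(2))   # entropy rate, ~0.569 bits

    rng = np.random.default_rng(1)
    n = 100_000
    x = rng.choice(2, p=pi)
    logp = np.log2(pi[x])
    for _ in range(n - 1):
        y = rng.choice(2, p=P[x])                   # one step of the chain
        logp += np.log2(P[x, y])
        x = y

    print(rate, -logp / n)    # -(1/n) log p(X_1, ..., X_n) is close to the entropy rate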

Typical Set: Definition and Properties

Definition
The typical set A_ε^{(n)} with respect to p(x) is the set of sequences (x_1, x_2, ..., x_n) ∈ X^n with the property

H(X) − ε ≤ −(1/n) log p(x_1, x_2, ..., x_n) ≤ H(X) + ε.

Properties

- (1 − ε) 2^{n(H(X)−ε)} ≤ |A_ε^{(n)}| ≤ 2^{n(H(X)+ε)} for n sufficiently large.
- Pr{A_ε^{(n)}} > 1 − ε for n sufficiently large.

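Both properties can be checked by brute force for a small binary source; a Python sketch where p, n, and ε are arbitrary illustrative choices and n is kept small enough to enumerate all of X^n:

    import numpy as np
    from itertools import product

    p, n, eps = 0.3, 12, 0.1
    H = float(-(p*np.log2(p) + (1 - p)*np.log2(1 - p)))   # H(X) ~ 0.881 bits

    size, prob = 0, 0.0
    for x in product([0, 1], repeat=n):                   # all of X^n
        k = sum(x)
        logp = k*np.log2(p) + (n - k)*np.log2(1 - p)      # log p(x_1, ..., x_n)
        if H - eps <= -logp / n <= H + eps:
            size += 1
            prob += 2.0 ** logp

    print(size, 2 ** (n*(H + eps)))   # |A| = 715, below the bound 2^{n(H+eps)} ~ 3500
    print(prob)                       # ~0.47 here; Pr{A} approaches 1 only as n grows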

Typical Set: A Pictorial Description

Consider all the instances (x_1, x_2, ..., x_n) ∈ X^n of i.i.d. (X_1, X_2, ..., X_n) with distribution p(x).


Source Coding (Data Compression)

- Represent each typical sequence with about nH(X) bits.
- Represent each non-typical sequence with about n log|X| bits.
- Then we have a one-to-one and easily decodable code.


Shannon's Source Coding Theorem

The average number of bits needed is

E[l(X_1, ..., X_n)] = ∑_{x_1,...,x_n} p(x_1, ..., x_n) l(x_1, ..., x_n)
  = ∑_{(x_1,...,x_n) ∈ A_ε^{(n)}} p(x_1, ..., x_n) l(x_1, ..., x_n) + ∑_{(x_1,...,x_n) ∉ A_ε^{(n)}} p(x_1, ..., x_n) l(x_1, ..., x_n)
  ≈ ∑_{(x_1,...,x_n) ∈ A_ε^{(n)}} p(x_1, ..., x_n) nH(X) + ∑_{(x_1,...,x_n) ∉ A_ε^{(n)}} p(x_1, ..., x_n) n log|X|
  ≈ nH(X).

Source Coding Theorem

For any information source distributed according to X_1, X_2, ... ∼ p(x), the compression rate can never fall below H(X), but it can be made arbitrarily close to H(X).

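The two slides above can be turned into a toy encoder: one flag bit says "typical or not"; typical sequences get an index into A_ε^{(n)} (about nH(X) bits), everything else is sent raw (n bits, since log|X| = 1 for a binary source). A Python sketch with arbitrary parameters:

    import numpy as np
    from math import ceil, comb, log2

    p, n, eps = 0.3, 400, 0.05
    H = -(p*log2(p) + (1 - p)*log2(1 - p))

    # For a Bernoulli(p) source, typicality depends only on k, the number of 1s.
    ks = {k for k in range(n + 1)
          if H - eps <= -(k*log2(p) + (n - k)*log2(1 - p))/n <= H + eps}
    A = sum(comb(n, k) for k in ks)       # |A_eps^(n)|
    len_typical = 1 + ceil(log2(A))       # flag bit + index into the typical set
    len_atypical = 1 + n                  # flag bit + the raw sequence

    rng = np.random.default_rng(2)
    draws = rng.random((2000, n)) < p     # 2000 source blocks of length n
    lengths = [len_typical if int(row.sum()) in ks else len_atypical
               for row in draws]

    print(np.mean(lengths) / n, H)        # average bits per symbol ~ H(X) + eps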

Communication Channel: Definition

- A message W results in channel inputs X_1(W), ..., X_n(W);
- they are received as a random sequence Y_1, ..., Y_n ∼ p(y_1, ..., y_n | x_1, ..., x_n);
- the receiver then guesses the index W by an appropriate decoding rule Ŵ = g(Y_1, ..., Y_n);
- the receiver makes an error if Ŵ is not the same as the W that was transmitted.


Communication Channel: An Example

Binary Symmetric Channel

p(Y = 0 | X = 0) = 1 − p,   p(Y = 1 | X = 0) = p,
p(Y = 0 | X = 1) = p,       p(Y = 1 | X = 1) = 1 − p.


Tradeoff between Speed and Reliability

Speed
To transmit 1, we transmit 1. It is possible that we receive 0. Note that the transmission rate is 1.

Reliability
To transmit 1, we transmit 11111. Though it is likely that we receive something else, such as 11011, more likely than not we can correct the possible error. Note that the transmission rate, however, is only 1/5.

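A simulation of the two strategies over the binary symmetric channel of the previous slide (Python/NumPy; the crossover probability 0.1 and block sizes are illustrative):

    import numpy as np

    rng = np.random.default_rng(3)

    def bsc(bits, p):
        # Binary symmetric channel: flip each bit independently with probability p.
        return bits ^ (rng.random(bits.shape) < p)

    p, m = 0.1, 100_000
    msg = rng.integers(0, 2, size=m)

    # Speed: send each bit once (rate 1).
    err_uncoded = np.mean(bsc(msg, p) != msg)

    # Reliability: send each bit five times, decode by majority vote (rate 1/5).
    received = bsc(np.repeat(msg, 5).reshape(m, 5), p)
    decoded = (received.sum(axis=1) >= 3).astype(int)
    err_coded = np.mean(decoded != msg)

    print(err_uncoded, err_coded)   # ~0.1 versus ~0.0086: reliability bought with rate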

Shannon's Channel Coding Theorem: Statement

Channel Coding Theorem

For any discrete memoryless channel, asymptotically perfect transmission at any rate below the capacity

C = max_{p(x)} I(X; Y)

is always possible, but is not possible at any rate above the capacity.


Shannon's Channel Coding Theorem: Proof

- For each typical input n-sequence, there are approximately 2^{nH(Y|X)} possible typical output sequences, all of them equally likely.
- We wish to ensure that no two X input sequences produce the same Y output sequence; otherwise, we will not be able to decide which X sequence was sent.
- The total number of possible typical Y sequences is approximately 2^{nH(Y)}. This set has to be divided into sets of size 2^{nH(Y|X)} corresponding to the different input X sequences.
- The total number of disjoint sets is less than or equal to 2^{n(H(Y)−H(Y|X))} = 2^{nI(X;Y)}. Hence, we can send at most approximately 2^{nI(X;Y)} distinguishable sequences of length n.


Capacity of Binary Symmetric Channels

The capacity of a binary symmetric channel with crossover probability p is C = 1 − H(p), where

H(p) = −p log p − (1 − p) log(1 − p).

Proof.

I(X; Y) = H(Y) − H(Y|X)
        = H(Y) − ∑_x p(x) H(Y|X = x)
        = H(Y) − ∑_x p(x) H(p)
        = H(Y) − H(p)
        ≤ 1 − H(p),

with equality when the input X is uniform, which makes the output Y uniform.

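A numerical check of the proof (Python/NumPy sketch): sweep the input distribution q = Pr{X = 1} and confirm that I(X;Y) = H(Y) − H(p) peaks at the uniform input, with peak value 1 − H(p).

    import numpy as np

    def Hb(t):
        # Binary entropy in bits, with H(0) = H(1) = 0.
        if t <= 0.0 or t >= 1.0:
            return 0.0
        return float(-t*np.log2(t) - (1 - t)*np.log2(1 - t))

    p = 0.11                                               # crossover probability
    qs = np.linspace(0.0, 1.0, 101)                        # q = Pr{X = 1}
    I = [Hb(q*(1 - p) + (1 - q)*p) - Hb(p) for q in qs]    # I(X;Y) = H(Y) - H(p)

    print(max(I), 1 - Hb(p))        # both ~0.5: the maximum equals the capacity
    print(qs[int(np.argmax(I))])    # 0.5: attained at the uniform input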

Capacity of Additive White Gaussian Channels

The capacity of an additive white Gaussian channel Y = X + Z, where E[X²] ≤ P and Z ∼ N(0, 1), is C = (1/2) log(1 + P).

Proof.

I(X; Y) = H(Y) − H(Y|X)
        = H(Y) − H(X + Z|X)
        = H(Y) − H(Z|X)
        = H(Y) − H(Z)
        ≤ (1/2) log 2πe(1 + P) − (1/2) log 2πe
        = (1/2) log(1 + P),

with equality when X ∼ N(0, P), since a Gaussian maximizes (differential) entropy under a second-moment constraint.
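And the corresponding one-liner for the Gaussian channel (Python/NumPy; the power values are illustrative, and log is base 2 so capacity is in bits per channel use):

    import numpy as np

    def awgn_capacity(P):
        # C = (1/2) log2(1 + P) bits per channel use, with unit noise variance.
        return 0.5 * np.log2(1 + P)

    for P in [0.1, 1.0, 10.0, 100.0]:
        print(P, awgn_capacity(P))
    # At high SNR, each fourfold increase in power buys about one extra bit,
    # since (1/2) log2(4P) = (1/2) log2(P) + 1.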


Capacity of Additive White Gaussian Channels

The capacity of an additive white Gaussian channel Y = X + Z ,where E[X 2] ≤ P and Z ∼ N(0, 1), is C = 1

2 log(1 + P).

Proof.

I (X ;Y ) = H(Y )− H(Y |X )

= H(Y )− H(X + Z |X )

= H(Y )− H(Z |X )

= H(Y )− H(Z )

≤ 1

2log 2πe(1 + P)− 1

2log 2πe

=1

2log(1 + P).

Guangyue Han The University of Hong Kong

Page 98: An Introduction to Information Theory · Fundamentals of Information Theory Memory Channels Continuous-Time Information Theory An Introduction to Information Theory Guangyue Han The

Fundamentals of Information Theory Memory Channels Continuous-Time Information Theory

Memory Channels

Memoryless Channels

I Channel transitions are characterized by time-invariant transition probabilities {p(y|x)}.

I Channel inputs are independent and identically distributed.

I Representative examples include (memoryless) binary symmetric channels and additive white Gaussian channels.

Capacity of Memoryless Channels

Shannon’s channel coding theorem

C = sup_{p(x)} I(X;Y)
  = sup_{p(x)} ∑_{x,y} p(x, y) log [ p(x, y) / (p(x) p(y)) ].

The supremum can be computed numerically by the Blahut-Arimoto algorithm (BAA).
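The slide does not spell the BAA out, so the following is a minimal Python sketch of its standard alternating update for a discrete memoryless channel, under the usual convention W[x, y] = p(y|x); variable names are ours.

    import numpy as np

    def blahut_arimoto(W, tol=1e-10, max_iter=10000):
        """Capacity (bits/use) of a DMC with transition matrix W[x, y] = p(y|x)."""
        p = np.full(W.shape[0], 1.0 / W.shape[0])    # start from the uniform input
        for _ in range(max_iter):
            q = p @ W                                # induced output distribution
            with np.errstate(divide="ignore", invalid="ignore"):
                ratio = np.where(W > 0, W / q, 1.0)  # safe: log(1) = 0 below
            D = (W * np.log(ratio)).sum(axis=1)      # D( W(.|x) || q ), in nats
            p_new = p * np.exp(D)                    # multiplicative BAA update
            p_new /= p_new.sum()
            done = np.max(np.abs(p_new - p)) < tol
            p = p_new
            if done:
                break
        q = p @ W
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(W > 0, W / q, 1.0)
        return (p[:, None] * W * np.log2(ratio)).sum(), p

    # Sanity check on a BSC with crossover 0.1: capacity = 1 - H(0.1) ≈ 0.531 bits
    C, p_star = blahut_arimoto(np.array([[0.9, 0.1], [0.1, 0.9]]))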

Memory Channels

I Channel transitions are characterized by probabilities {p(y_i | x_1, …, x_i, y_1, …, y_{i−1}, s_i)}, where the current output may depend on previous and current channel inputs, previous outputs, and the current channel state; examples include inter-symbol interference channels, flash memory channels, and Gilbert-Elliott channels.

I Channel inputs may have to satisfy certain constraints, which necessitate dependence among the channel inputs; for example, (d, k)-RLL constraints and, more generally, finite-type constraints.

I Such channels are widely used in a variety of real-life applications, including magnetic and optical recording, solid-state drives, and communications over band-limited channels with inter-symbol interference.

Capacity of Memory Channels

Despite a great deal of effort by Zehavi and Wolf [1988], Mushkin and Bar-David [1989], Shamai and Kofman [1990], Goldsmith and Varaiya [1996], Arnold, Loeliger, Vontobel, Kavcic and Zeng [2006], Holliday, Goldsmith and Glynn [2006], Vontobel, Kavcic, Arnold and Loeliger [2008], Pfister [2011], Permuter, Asnani and Weissman [2013], Han [2015], and others, computing the capacity of memory channels remains a largely open problem.

Shannon’s channel coding theorem

C = sup_{p(x)} I(X;Y)
  = sup_{p(x)} lim_{n→∞} (1/n) ∑_{x_1^n, y_1^n} p(x_1^n, y_1^n) log [ p(x_1^n, y_1^n) / (p(x_1^n) p(y_1^n)) ].

This n-letter supremum is targeted by the generalized Blahut-Arimoto algorithm (GBAA) of Vontobel, Kavcic, Arnold and Loeliger [2008].
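To make the n-letter expression concrete, here is a small brute-force Python illustration of (1/n) I(X_1^n; Y_1^n) for a binary Markov input over a BSC; the input and channel parameters are toy assumptions, and exhaustive enumeration is feasible only for small n.

    import itertools
    import numpy as np

    def mi_rate(n, flip=0.1, stay=0.8):
        """(1/n) I(X_1^n; Y_1^n) for a symmetric binary Markov input
        (P(stay) = stay, uniform start) over a BSC with crossover `flip`,
        computed by exhaustive enumeration of all 2^n x 2^n pairs."""
        pxy, px, py = {}, {}, {}
        for x in itertools.product((0, 1), repeat=n):
            p_x = 0.5
            for a, b in zip(x, x[1:]):
                p_x *= stay if a == b else 1.0 - stay
            for y in itertools.product((0, 1), repeat=n):
                p_y_given_x = 1.0
                for a, b in zip(x, y):
                    p_y_given_x *= (1.0 - flip) if a == b else flip
                v = p_x * p_y_given_x
                pxy[x, y] = v
                px[x] = px.get(x, 0.0) + v
                py[y] = py.get(y, 0.0) + v
        return sum(v * np.log2(v / (px[x] * py[y]))
                   for (x, y), v in pxy.items() if v > 0) / n

    print(mi_rate(6))   # these values approach the mutual information rate as n grows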

Convergence of the GBAA

The GBAA will converge if the following conjecture is true.

Concavity Conjecture [Vontobel et al. 2008]

I(X;Y) and H(X|Y) are both concave with respect to a chosen parameterization.

Unfortunately, the concavity conjecture is not true in general [Li and Han, 2013].

A Randomized Algorithm [Han 2015]

With appropriately chosen step sizes a_n = 1/n^a, a > 0,

θ_{n+1} = θ_n + a_n g_{n^b}(θ_n),

where

I θ_0 is randomly selected from the parameter space Θ;

I g_{n^b}(θ) is a simulator for I′(X(θ); Y(θ)), the derivative of the mutual information rate with respect to the input parameter θ;

I 0 < β < α < 1/3, b > 0, 2a + b − 3bβ > 1; here α, β are some “hidden” parameters involved in the definition of g_{n^b}(θ).
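The following toy Python sketch shows the shape of the iteration; the BSC example and the synthetic noise model are purely illustrative stand-ins for the actual simulator g_{n^b}, which is defined on the next slide.

    import numpy as np

    rng = np.random.default_rng(0)

    def noisy_mi_gradient(theta, n, p=0.1, b=0.5):
        """Toy stand-in for g_{n^b}(theta): a noisy estimate of dI/dtheta for a
        BSC with crossover p and input P(X = 1) = theta.  The noise shrinks
        like n**(-b/2), mimicking a Monte Carlo estimate from a sample path of
        length about n**b; the real simulator is the sample-entropy
        construction on the next slide."""
        q = theta * (1 - p) + (1 - theta) * p        # P(Y = 1)
        grad = (1 - 2 * p) * np.log2((1 - q) / q)    # exact dI/dtheta for a BSC
        return grad + rng.normal(scale=n ** (-b / 2))

    def randomized_capacity_search(theta0=0.9, a=0.8, n_iters=2000):
        """theta_{n+1} = theta_n + a_n * g_{n^b}(theta_n), with a_n = 1/n**a."""
        theta = theta0
        for n in range(1, n_iters + 1):
            theta += (1.0 / n ** a) * noisy_mi_gradient(theta, n)
            theta = min(max(theta, 1e-3), 1 - 1e-3)  # project back into Theta
        return theta

    print(randomized_capacity_search())              # approaches theta* = 0.5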

Our Simulator of I′(X;Y)

Define

q = q(n) ≜ n^β,  p = p(n) ≜ n^α,  k = k(n) ≜ n/(n^α + n^β).

For any j with iq + (i−1)p + 1 ≤ j ≤ iq + ip, define

W_j = −( p′(Y_{j−⌊q/2⌋})/p(Y_{j−⌊q/2⌋}) + ··· + p′(Y_j | Y_{j−⌊q/2⌋}^{j−1})/p(Y_j | Y_{j−⌊q/2⌋}^{j−1}) ) log p(Y_j | Y_{j−⌊q/2⌋}^{j−1}),

and furthermore

ζ_i ≜ W_{iq+(i−1)p+1} + ··· + W_{iq+ip},  S_n ≜ ∑_{i=1}^{k(n)} ζ_i.

Our simulator for I′(X;Y):

g_n(X_1^n, Y_1^n) = H′(X_2|X_1) + S_n(Y_1^n)/(kp) − S_n(X_1^n, Y_1^n)/(kp).
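Since the block indexing is easy to misread, here is a small illustrative Python helper that enumerates the p-blocks separated by q-gaps exactly as defined above; the exponent values are assumptions satisfying 0 < β < α < 1/3.

    def block_indices(n, alpha=0.3, beta=0.2):
        """With q = n**beta, p = n**alpha and k = n // (p + q), block i covers
        the indices iq + (i-1)p + 1, ..., iq + ip, so consecutive blocks of
        length p are separated by gaps of length q.  Purely bookkeeping."""
        q, p = int(n ** beta), int(n ** alpha)
        k = n // (p + q)
        return [list(range(i * q + (i - 1) * p + 1, i * q + i * p + 1))
                for i in range(1, k + 1)]

    blocks = block_indices(10000)   # e.g. the first block starts at index q + 1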

Convergence of Our Algorithm

Convergence and convergence rate with concavity

If I(X;Y) is concave with respect to θ, then θ_n converges to the unique capacity-achieving distribution θ* almost surely. Moreover, for any τ with 2a + b − 3bβ − 2τ > 1, we have

|θ_n − θ*| = O(n^{−τ}).

The Ideas for the Proofs

Analyticity result [Han, Marcus, 2006]

The entropy rate of hidden Markov chains is analytic as a function of the underlying parameters.

Refinements of the Shannon-McMillan-Breiman theorem [Han, 2012]

Limit theorems for the sample entropy of hidden Markov chains hold.

The analyticity result states that I(X;Y) = H(X) + H(Y) − H(X,Y) is a “nicely behaved” function.

The refinement results confirm that, using Monte Carlo simulations, I(X;Y) and its derivatives can be “well-approximated”.

Continuous-Time Information Theory

Continuous-Time Gaussian Non-Feedback Channels

Consider the following continuous-time Gaussian channel:

Y(t) = √snr ∫_0^t X(s) ds + B(t),  t ∈ [0, T],

where {B(t)} is the standard Brownian motion.

Theorem (Duncan 1970)

The following I-CMMSE relationship holds:

I(X_0^T; Y_0^T) = (1/2) E ∫_0^T (X(s) − E[X(s) | Y_0^s])² ds.

Theorem (Guo, Shamai and Verdu 2005)

The following I-MMSE relationship holds:

d/dsnr I(X_0^T; Y_0^T) = (1/2) E ∫_0^T (X(s) − E[X(s) | Y_0^T])² ds.
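In the scalar one-shot special case Y = √snr X + Z with X ∼ N(0, 1), the I-MMSE relation reduces to dI/dsnr = mmse(snr)/2, which the following short Python check verifies numerically.

    import numpy as np

    # Scalar sanity check: for Y = sqrt(snr)*X + Z with X ~ N(0, 1), Z ~ N(0, 1),
    # I(snr) = (1/2) ln(1 + snr) and mmse(snr) = E[(X - E[X|Y])^2] = 1/(1 + snr),
    # so dI/dsnr should equal mmse(snr)/2.
    snr, eps = 3.0, 1e-6
    I = lambda s: 0.5 * np.log(1.0 + s)
    dI = (I(snr + eps) - I(snr - eps)) / (2 * eps)   # numerical derivative
    mmse = 1.0 / (1.0 + snr)
    print(dI, mmse / 2)                              # both ≈ 0.125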

Continuous-Time Gaussian Feedback Channels

Consider the following continuous-time Gaussian feedback channel:

Y(t) = √snr ∫_0^t X(s, M, Y_0^s) ds + B(t),  t ∈ [0, T],

where {B(t)} is the standard Brownian motion and M is the message to be transmitted.

Theorem (Kadota, Zakai and Ziv 1971)

The following I-CMMSE relationship holds:

I(M; Y_0^T) = (1/2) E ∫_0^T (X(s, M, Y_0^s) − E[X(s, M, Y_0^s) | Y_0^s])² ds.

Theorem (Han and Song 2016)

The following I-MMSE relationship holds:

d/dsnr I(M; Y_0^T) = (1/2) ∫_0^T E[(X(s) − E[X(s) | Y_0^T])²] ds
                   + snr ∫_0^T E[(X(s) − E[X(s) | Y_0^T]) (d/dsnr X(s))] ds.

Capacity of Continuous-Time Gaussian Channels

For either the following continuous-time Gaussian channel:

Y(t) = √snr ∫_0^t X(s) ds + B(t),  t ∈ [0, T],

or the following continuous-time Gaussian feedback channel:

Y(t) = √snr ∫_0^t X(s, M, Y_0^s) ds + B(t),  t ∈ [0, T],

the capacity is P/2, where P denotes the average power budget imposed on the channel input.

Thank you!

Guangyue Han The University of Hong Kong