Information Theory
Linawati
Electrical Engineering Department
Udayana University
Outline
Information Source
Measuring Information
Entropy
Source Coding
Designing Codes
Information Source
An information source has four characteristics:
1. The number of symbols, n
2. The symbols S1, S2, …, Sn
3. The probability of occurrence of each symbol, P(S1), P(S2), …, P(Sn)
4. The correlation between successive symbols
A source is memoryless if each symbol is independent of the symbols before it. A message is a stream of symbols from the sender to the receiver.
Examples …
Ex. 1: A source that sends binary information (streams of 0s and 1s) with each symbol having equal probability and no correlation can be modeled as a memoryless source:
n = 2; symbols: 0 and 1; probabilities: P(0) = ½ and P(1) = ½
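Such a source is easy to simulate. A minimal sketch in Python (the function name memoryless_source is illustrative, not from the slides):

```python
import random

def memoryless_source(symbols, probs, length):
    """Emit a message: each symbol is drawn independently (no memory)."""
    return "".join(random.choices(symbols, weights=probs, k=length))

# Ex. 1: binary source with P(0) = P(1) = 1/2
print(memoryless_source(["0", "1"], [0.5, 0.5], 16))  # e.g. "0110100011010010"
```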
Measuring Information
How much information does a message carry from the sender to the receiver?
Examples
Ex. 2: Imagine a person sitting in a room. Looking out the window, she can clearly see that the sun is shining. If at this moment she receives a call from a neighbor saying "It is now daytime", does this message contain any information?
Ex. 3: A person has bought a lottery ticket. A friend calls to tell her that she has won first prize. Does this message contain any information?
Examples …
Ex. 2: It does not; the message contains no information. Why? Because she is already certain that it is daytime.
Ex. 3: It does. The message contains a lot of information, because the probability of winning first prize is very small.
Conclusion
The information content of a message is inversely proportional to the probability of occurrence of that message. If a message is very probable, it carries almost no information; if it is very improbable, it carries a lot of information.
Symbol Information
To measure the information contained in a message, we first need to measure the information contained in each symbol:
I(s) = log2(1/P(s)) bits
Here "bits" is the unit of information, not to be confused with "bit" as in binary digit, a 0 or 1.
Examples
Ex. 5: Find the information content of each symbol when the source is binary (sending only 0 or 1 with equal probability).
Ex. 6: Find the information content of each symbol when the source sends four symbols with probabilities P(S1) = 1/8, P(S2) = 1/8, P(S3) = ¼, and P(S4) = ½.
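The formula I(s) = log2(1/P(s)) translates directly into code. A minimal sketch in Python that reproduces Ex. 5 and Ex. 6 (the function name is illustrative):

```python
from math import log2

def symbol_information(p):
    """Information content of a symbol with probability p: I(s) = log2(1/p) bits."""
    return log2(1 / p)

# Ex. 5: binary source, P(0) = P(1) = 1/2 -> 1 bit each
print(symbol_information(1/2))  # 1.0

# Ex. 6: four symbols -> 3, 3, 2, 1 bits
for name, p in [("S1", 1/8), ("S2", 1/8), ("S3", 1/4), ("S4", 1/2)]:
    print(name, symbol_information(p))
```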
Examples …
Ex. 5: P(0) = P(1) = ½, so the information content of each symbol is
I(0) = log2(1/P(0)) = log2(2) = 1 bit
I(1) = log2(1/P(1)) = log2(2) = 1 bit
Ex. 6:
I(S1) = log2(1/P(S1)) = log2(8) = 3 bits
I(S2) = log2(1/P(S2)) = log2(8) = 3 bits
I(S3) = log2(1/P(S3)) = log2(4) = 2 bits
I(S4) = log2(1/P(S4)) = log2(2) = 1 bit
Examples …
Ex. 6 (continued): The symbols S1 and S2 are the least probable; at the receiver, each carries more information (3 bits) than S3 or S4. The symbol S3 is less probable than S4, so S3 carries more information than S4.
These relationships hold in general:
If P(Si) = P(Sj), then I(Si) = I(Sj)
If P(Si) < P(Sj), then I(Si) > I(Sj)
If P(Si) = 1, then I(Si) = 0
Message Information
If the message comes from a memoryless source, each symbol is independent, and the probability of receiving a message with symbols Si, Sj, Sk, … (where i, j, and k can be the same) is:
P(message) = P(Si) P(Sj) P(Sk) …
Then the information content carried by the message is
I(message) = log2(1/P(message))
= log2(1/(P(Si) P(Sj) P(Sk) …))
= log2(1/P(Si)) + log2(1/P(Sj)) + log2(1/P(Sk)) + …
= I(Si) + I(Sj) + I(Sk) + …
Example …
Ex. 7: An equal-probability binary source sends an 8-bit message. What is the amount of information received?
The information content of the message is
I(message) = I(first bit) + I(second bit) + … + I(eighth bit) = 8 × 1 bit = 8 bits
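Because the information of independent symbols adds up, the message information is just a sum. A minimal sketch in Python reproducing Ex. 7 (the function name is illustrative):

```python
from math import log2

def message_information(symbol_probs):
    """I(message) = I(Si) + I(Sj) + ... for the symbols in the message."""
    return sum(log2(1 / p) for p in symbol_probs)

# Ex. 7: an 8-bit message from an equal-probability binary source
print(message_information([0.5] * 8))  # 8.0 bits
```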
Entropy
The entropy H of a source is the average amount of information contained in its symbols:
H(Source) = P(S1)×I(S1) + P(S2)×I(S2) + … + P(Sn)×I(Sn)
Example: What is the entropy of an equal-probability binary source?
H(Source) = P(0)×I(0) + P(1)×I(1) = 0.5×1 + 0.5×1 = 1 bit, i.e., 1 bit per symbol
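The entropy formula is a probability-weighted sum of the symbol informations. A minimal sketch in Python (the function name is illustrative; zero-probability terms are skipped since they contribute nothing):

```python
from math import log2

def entropy(probs):
    """H(Source) = sum of P(Si) * I(Si) = sum of P(Si) * log2(1/P(Si))."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # 1.0 bit per symbol, as in the example
```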
Maximum Entropy
For a particular source with n symbols, maximum entropy is achieved only if all the probabilities are the same, P(Si) = 1/n. The value of this maximum is
Hmax(Source) = Σ (1/n)×log2(n) = log2(n)
In other words, the entropy of every source has an upper limit defined by
H(Source) ≤ log2(n)
Example …
What is the maximum entropy of a binary source?
Hmax = log2(2) = 1 bit
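The upper bound can be checked numerically. A small sketch, reusing the entropy function from the previous snippet (repeated here so the block is self-contained):

```python
from math import log2

def entropy(probs):
    return sum(p * log2(1 / p) for p in probs if p > 0)

n = 4
print(entropy([1/n] * n), log2(n))    # 2.0 2.0 -> maximum reached
print(entropy([1/2, 1/4, 1/8, 1/8]))  # 1.75 < log2(4) = 2
```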
Source Coding
To send a message from a source to a destination, a symbol is normally coded into a sequence of binary digits. The result is called a code word.
A code is a mapping from a set of symbols into a set of code words. For example, ASCII is a mapping of a set of 128 symbols into a set of 7-bit code words:
A -> 1000001
B -> 1000010
Set of symbols -> set of binary streams
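The ASCII mapping can be checked in one line per symbol; format(ord(c), "07b") prints the 7-bit code word of a character:

```python
for c in "AB":
    print(c, "->", format(ord(c), "07b"))
# A -> 1000001
# B -> 1000010
```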
Fixed- and Variable-Length Codes
A code can be designed with all the code words the same length (a fixed-length code) or with different lengths (a variable-length code).
Examples
A code with fixed-length code words: S1 -> 00; S2 -> 01; S3 -> 10; S4 -> 11
A code with variable-length code words: S1 -> 0; S2 -> 10; S3 -> 11; S4 -> 110
Distinct Codes
Each code word is different from every other code word.
Example: S1 -> 0; S2 -> 10; S3 -> 11; S4 -> 110
Uniquely Decodable Codes
A distinct code is uniquely decodable if each code word can be decoded when inserted between other code words.
Example of a code that is not uniquely decodable:
S1 -> 0; S2 -> 1; S3 -> 00; S4 -> 10
because 0010 can be decoded as S3S4, S3S2S1, or S1S1S4.
Instantaneous Codes
A uniquely decodable code is instantaneously decodable if no code word is the prefix of any other code word.
Example of a uniquely decodable code that is not instantaneous:
S1 -> 0; S2 -> 01; S3 -> 011; S4 -> 0111
A 0 uniquely defines the beginning of a code word, so the code is uniquely decodable; but since each code word is a prefix of the next, the receiver must wait for the next 0 before it knows the current code word has ended.
Examples …
A code word and its prefixes (note that each code word is also a prefix of itself):
S -> 01001; prefixes: 0, 01, 010, 0100, 01001
A uniquely decodable code that is instantaneously decodable:
S1 -> 0; S2 -> 10; S3 -> 110; S4 -> 111
When the receiver receives a 0, it immediately knows that it is S1; no other code word starts with 0. When the receiver receives 10, it immediately knows that it is S2; no other code word starts with 10, and so on.
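The prefix condition is easy to test mechanically. A minimal sketch in Python (the function name is illustrative):

```python
def is_instantaneous(code_words):
    """True iff no code word is a prefix of another (prefix-free code)."""
    return not any(a != b and b.startswith(a)
                   for a in code_words for b in code_words)

print(is_instantaneous(["0", "10", "110", "111"]))   # True
print(is_instantaneous(["0", "01", "011", "0111"]))  # False: 0 is a prefix of 01
```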
Relationship Between Different Types of Codes
Instantaneous codes ⊂ uniquely decodable codes ⊂ distinct codes ⊂ all codes
Code …
Average code length: L = L(S1)×P(S1) + L(S2)×P(S2) + …
Example: Find the average length of the following code:
S1 -> 0; S2 -> 10; S3 -> 110; S4 -> 111
P(S1) = ½; P(S2) = ¼; P(S3) = 1/8; P(S4) = 1/8
Solution
L = 1×½ + 2×¼ + 3×1/8 + 3×1/8 = 1¾ bits
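The average length is a probability-weighted sum of the code-word lengths. A minimal sketch in Python reproducing the example (the function name is illustrative):

```python
def average_length(code_words, probs):
    """L = L(S1)*P(S1) + L(S2)*P(S2) + ..."""
    return sum(len(w) * p for w, p in zip(code_words, probs))

print(average_length(["0", "10", "110", "111"],
                     [1/2, 1/4, 1/8, 1/8]))  # 1.75 bits
```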
Code …
Code efficiency η is defined as the entropy of the source divided by the average code length:
η = H(source)/L × 100%
Example: Find the efficiency of the following code:
S1 -> 0; S2 -> 10; S3 -> 110; S4 -> 111
P(S1) = ½; P(S2) = ¼; P(S3) = 1/8; P(S4) = 1/8
Solution
H(source) = ½×log2(2) + ¼×log2(4) + 1/8×log2(8) + 1/8×log2(8) = 1¾ bits
L = 1¾ bits
η = (1¾)/(1¾) × 100% = 100%
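Combining the entropy and average-length formulas gives the efficiency directly. A minimal sketch in Python (the function name is illustrative):

```python
from math import log2

def efficiency(probs, code_words):
    """Code efficiency: H(source) / L * 100%."""
    h = sum(p * log2(1 / p) for p in probs)
    avg = sum(len(w) * p for w, p in zip(code_words, probs))
    return h / avg * 100

print(efficiency([1/2, 1/4, 1/8, 1/8],
                 ["0", "10", "110", "111"]))  # 100.0
```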
Designing Codes
Two examples of instantaneous codes:
Shannon–Fano code
Huffman code
Shannon–Fano Code
An instantaneous variable-length encoding method in which the more probable symbols are given shorter code words and the less probable symbols are given longer code words.
The design builds a binary tree from the top down, following the steps below (a sketch in Python follows the steps):
1. List the symbols in descending order of probability.
2. Divide the list into two sublists of equal (or nearly equal) total probability. Assign 0 to the first sublist and 1 to the second.
3. Repeat step 2 for each sublist until no further division is possible.
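A minimal recursive sketch of the procedure in Python (names are illustrative; when two splits are equally balanced, this version picks the first one, so the exact code words can differ from the tree on the next slide while the average length stays the same):

```python
def shannon_fano(symbols):
    """symbols: list of (name, probability) in descending order of probability.
    Returns {name: code word}."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    # Find the split that makes the two sublists' probabilities most nearly equal.
    total = sum(p for _, p in symbols)
    running, best, split = 0.0, float("inf"), 1
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        if abs(total - 2 * running) < best:
            best, split = abs(total - 2 * running), i
    # Prefix 0 to the first sublist's codes and 1 to the second's.
    codes = {n: "0" + c for n, c in shannon_fano(symbols[:split]).items()}
    codes.update({n: "1" + c for n, c in shannon_fano(symbols[split:]).items()})
    return codes

src = [("S1", 0.30), ("S2", 0.20), ("S3", 0.15), ("S4", 0.10),
       ("S5", 0.10), ("S6", 0.05), ("S7", 0.05), ("S8", 0.05)]
print(shannon_fano(src))
```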
Example of Shannon–Fano Encoding
Find the Shannon–Fano code words for the following source:
P(S1) = 0.3; P(S2) = 0.2; P(S3) = 0.15; P(S4) = 0.1; P(S5) = 0.1; P(S6) = 0.05; P(S7) = 0.05; P(S8) = 0.05
Solution
Because each code word is assigned a leaf of the tree, no code word is the prefix of any other; the code is instantaneous. Calculating the average length and the efficiency of this code:
H(source) ≈ 2.7 bits; L = 2.75 bits; η = 2.7/2.75 ≈ 98%
The resulting tree assigns the following code words (the first split separates {S1, S2} from {S3, …, S8}; each sublist is then split again until every symbol is a leaf):

Symbol | Probability | Code word
S1     | 0.30        | 00
S2     | 0.20        | 01
S3     | 0.15        | 100
S4     | 0.10        | 101
S5     | 0.10        | 1100
S6     | 0.05        | 1101
S7     | 0.05        | 1110
S8     | 0.05        | 1111
Huffman Encoding
An instantaneous variable-length encoding method in which the more probable symbols are given shorter code words and the less probable symbols are given longer code words.
The design builds a binary tree from the bottom up (a sketch in Python follows the steps):
1. Combine the two least probable symbols (or combined nodes) into a single node whose probability is the sum of the two.
2. Repeat step 1 until no further combination is possible.
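A minimal sketch of the bottom-up construction in Python using a priority queue (names are illustrative; ties among equal probabilities can be broken differently from the tree on the next slide, so the exact code words may differ while the average length stays the same):

```python
import heapq
from itertools import count

def huffman(symbols):
    """symbols: list of (name, probability). Returns {name: code word}."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {name: ""}) for name, p in symbols]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # least probable node
        p1, _, c1 = heapq.heappop(heap)  # second least probable node
        merged = {n: "0" + c for n, c in c0.items()}
        merged.update({n: "1" + c for n, c in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

src = [("S1", 0.30), ("S2", 0.20), ("S3", 0.15), ("S4", 0.10),
       ("S5", 0.10), ("S6", 0.05), ("S7", 0.05), ("S8", 0.05)]
print(huffman(src))
```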
Example of Huffman Encoding
Find the Huffman code words for the following source:
P(S1) = 0.3; P(S2) = 0.2; P(S3) = 0.15; P(S4) = 0.1; P(S5) = 0.1; P(S6) = 0.05; P(S7) = 0.05; P(S8) = 0.05
Solution
Because each code word is assigned a leaf of the tree, no code word is the prefix of any other; the code is instantaneous. Calculating the average length and the efficiency of this code:
H(source) ≈ 2.70 bits; L = 2.75 bits; η ≈ 98%
The resulting tree assigns the following code words (the intermediate node probabilities formed during construction are 0.10, 0.15, 0.20, 0.30, 0.40, and 0.60, with 1.00 at the root):

Symbol | Probability | Code word
S1     | 0.30        | 00
S2     | 0.20        | 10
S3     | 0.15        | 010
S4     | 0.10        | 110
S5     | 0.10        | 111
S6     | 0.05        | 0110
S7     | 0.05        | 01110
S8     | 0.05        | 01111