TRANSCRIPT
Communications Theory and Engineering
Master's Degree in Electronic Engineering
Sapienza University of Rome
Academic Year 2018-2019
Information Theory
Practice Work 5
Source encoding
What are the necessary and desirable properties for source coding?
Required: a code must be uniquely decodable (no two symbols may be identified by the same code word).
Desirable: the average length of a code word should be minimized, in order to minimize the bit rate required to transfer the source symbols.
Code properties
The average length of a codeword is therefore given by:

L = ∑_{k=1}^{K} p(a_k)·l_k

and the code can be characterized by an efficiency defined as:

η = L_min / L

where L_min is the minimum average length of a codeword defined by the source coding theorem, i.e.:

L_min = H(p)
Which of the following codes are NOT Huffman codes? Motivate the answers.
a) {0, 10, 11}
b) {00, 01, 10, 110}
c) {01, 10}
Answers
a) It is a possible Huffman code, for example for a ternary source with p(x): {1/2, 1/4, 1/4}.
b) It is NOT a Huffman code, as it is not optimal: it can indeed be reduced to {00, 01, 10, 11}.
c) It is NOT a Huffman code, as it is not optimal: it can indeed be reduced to {0, 1}.
Example 1: Huffman coding
1. Generate a Huffman binary code for the source having the following pmf:
p(x): {1/3, 1/5, 1/5, 2/15, 2/15}
2. Show that the code found is optimal*, both for the pmf in point 1 and in the case of an equiprobable source:
q(x): {1/5, 1/5, 1/5, 1/5, 1/5}
• *optimal code: the code that minimizes the average length L [bit], subject to the constraint H(X) ≤ L.
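The construction in point 1 can be cross-checked with a short sketch in Python (this is not part of the original exercise, which suggests Matlab elsewhere; the helper name huffman_code_lengths is my own):

```python
# Minimal sketch of binary Huffman construction for the pmf of point 1.
# The helper name huffman_code_lengths is my own, not from the exercise.
import heapq
from fractions import Fraction
from math import log2

def huffman_code_lengths(pmf):
    """Return the Huffman codeword length for each probability in pmf."""
    # Heap entries: (node probability, tie-breaker, leaf indices below node)
    heap = [(p, i, [i]) for i, p in enumerate(pmf)]
    heapq.heapify(heap)
    lengths = [0] * len(pmf)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # two least probable nodes
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:                 # each merge adds one bit to the leaves below
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, min(s1 + s2), s1 + s2))
    return lengths

pmf = [Fraction(1, 3), Fraction(1, 5), Fraction(1, 5), Fraction(2, 15), Fraction(2, 15)]
lengths = huffman_code_lengths(pmf)            # [2, 2, 2, 3, 3]
L = sum(p * l for p, l in zip(pmf, lengths))   # 34/15, about 2.27 bits
H = -sum(float(p) * log2(p) for p in pmf)      # about 2.2323 bits
print(lengths, float(L), H)
```

Exact fractions are used for the probabilities so the average length comes out as 34/15 with no rounding error.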
Example 2: Solution
A possible Huffman code is obtained by repeatedly merging the two least probable nodes: a4 + a5 = 4/15; a2 + a3 = 2/5; a1 + 4/15 = 9/15; and finally 9/15 + 2/5 (P = 1). Labelling the two branches of each merge with 0 and 1 gives:

symbol  code
a1      00
a2      10
a3      11
a4      010
a5      011
H(p) = (1/3)·log3 + (2/5)·log5 + (4/15)·log(15/2) = 2.2323 bits
L_Huff = 2·(1/3) + 2·(1/5) + 2·(1/5) + 3·(2/15) + 3·(2/15) = 34/15 ≈ 2.27 bits ≈ H(p)
Example 2: Huffman coding
For the equiprobable pmf q(x):
H(q) = 5·(1/5)·log5 = 2.3219 bits
L_Huff = 2·(1/5) + 2·(1/5) + 2·(1/5) + 3·(1/5) + 3·(1/5) = 2.4 bits
A generic code assigning length l_i to the i-th symbol has average length
L_gen = (1/5)·∑_{i=1}^{5} l_i = (l_1 + l_2 + l_3 + l_4 + l_5)/5 = K/5
where K is the total number of bits over the five codewords. The Huffman code uses K = 12. The closest smaller value, K = 11, would give L_gen = 11/5 = 2.2 bits < H(q), violating the source coding theorem; such a code is not usable, so the Huffman code is optimal.
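The claim that K = 11 total bits is unattainable can also be checked with the Kraft-McMillan inequality; this is my own illustrative sketch (the name kraft_ok is not from the slides):

```python
# Check, via the Kraft-McMillan inequality, which total codeword lengths K
# are feasible for 5 symbols. The helper name kraft_ok is my own.
from itertools import combinations_with_replacement

def kraft_ok(lengths):
    """A uniquely decodable binary code with these codeword lengths exists
    iff the sum of 2^-l over the lengths is at most 1 (Kraft-McMillan)."""
    return sum(2.0 ** -l for l in lengths) <= 1.0

# The Huffman lengths found above use K = 2+2+2+3+3 = 12 bits in total:
print(kraft_ok([2, 2, 2, 3, 3]))   # True

# No assignment of 5 lengths totalling K = 11 satisfies Kraft, confirming
# that L = 11/5 = 2.2 bits < H(q) is unreachable:
feasible_11 = [ls for ls in combinations_with_replacement(range(1, 12), 5)
               if sum(ls) == 11 and kraft_ok(ls)]
print(feasible_11)                 # []
```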
Example 3
1. Generate a Huffman binary code for the symbols shown in the table.
2. Generate a Shannon-Fano binary code for the symbols shown in the table.
3. Find the average lengths of the codes and compare the generated codes.
symbol probability
a1 0.3
a2 0.08
a3 0.12
a4 0.20
a5 0.02
a6 0.25
a7 0.03
Example 3 solution: Huffman coding
L = ∑_{k=1}^{K} p(a_k)·l_k = 0.3·2 + 0.08·4 + 0.12·3 + 0.20·2 + 0.02·5 + 0.25·2 + 0.03·5 = 2.43 bits
One possible sequence of merges of the two least probable nodes is: a5 + a7 = 0.05; 0.05 + a2 = 0.13; 0.13 + a3 = 0.25; a4 + a6 = 0.45; a1 + 0.25 = 0.55; and finally 0.55 + 0.45 (P = 1). Labelling the branches with 0 and 1 gives:

Symbol  Code
a1      00
a2      0100
a3      011
a4      11
a5      01011
a6      10
a7      01010

H(p) = 2.409 bits, so L = 2.43 ≈ H(p).
Example 3 solution: Shannon-Fano coding
Shannon-Fano coding algorithm:
• Order the probabilities in decreasing order.
• Choose k such that |∑_{i=1}^{k} p_i − ∑_{i=k+1}^{m} p_i| is minimized.
• This point divides the source symbols into two sets of almost equal probability. Assign 0 as the first bit of the upper set and 1 as the first bit of the lower set.
• Repeat this process for each subset.
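The steps above can be sketched in Python as follows (a sketch of my own, not from the slides; the function name shannon_fano is an assumption, and the particular 0/1 labels depend on convention, so individual codewords may differ from the worked solution while the lengths agree):

```python
# Sketch of the Shannon-Fano procedure described above.
# Probabilities must already be sorted in decreasing order.
def shannon_fano(symbols, probs, prefix=""):
    """Recursively split the symbols into two sets of almost equal
    probability, appending 0 for the upper set and 1 for the lower set."""
    if len(symbols) == 1:
        return {symbols[0]: prefix}
    # choose k minimizing |sum of upper probs - sum of lower probs|
    k = min(range(1, len(probs)),
            key=lambda k: abs(sum(probs[:k]) - sum(probs[k:])))
    code = shannon_fano(symbols[:k], probs[:k], prefix + "0")
    code.update(shannon_fano(symbols[k:], probs[k:], prefix + "1"))
    return code

symbols = ["a1", "a6", "a4", "a3", "a2", "a7", "a5"]
probs = [0.3, 0.25, 0.2, 0.12, 0.08, 0.03, 0.02]
code = shannon_fano(symbols, probs)
L = sum(p * len(code[s]) for s, p in zip(symbols, probs))
print(code)
print(round(L, 2))   # 2.43, the same average length as the Huffman code
```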
L = ∑_{k=1}^{K} p(a_k)·l_k = 0.3·2 + 0.08·4 + 0.12·3 + 0.20·2 + 0.02·5 + 0.25·2 + 0.03·5 = 2.43 bits

Symbol  Code
a1      00
a2      1000
a3      101
a4      11
a5      10011
a6      01
a7      10010

symbol  probability  1st digit  2nd digit  3rd digit  4th digit  5th digit
a1      0.3          0          0
a6      0.25         0          1
a4      0.2          1          1
a3      0.12         1          0          1
a2      0.08         1          0          0          0
a7      0.03         1          0          0          1          0
a5      0.02         1          0          0          1          1

The successive splits divide the probabilities into 0.55 | 0.45, then 0.30 | 0.25 and 0.25 | 0.20, then 0.12 | 0.13, and so on.
Example 4: Encoding, Entropy, and Relative Entropy
• X is a source with 5 possible outcomes {1, 2, 3, 4, 5}.
• Consider two possible distributions for X, such that
– p(x): {1/2, 1/4, 1/8, 1/16, 1/16}
– q(x): {1/2, 1/8, 1/8, 1/8, 1/8}
• Consider two possible encodings, such that
– C1: {0, 10, 110, 1110, 1111}
– C2: {0, 100, 101, 110, 111}
a) Calculate H(p), H(q), D(p||q), and D(q||p).
b) Verify that the average length of C1, with respect to p(x), is equal to H(p) (C1 is optimal for p(x)). Verify that C2 is optimal for q(x).
c) Assuming C2 is used when the distribution is p(x), calculate the average length of the code and how much this length exceeds the entropy H(p).
d) Calculate the efficiency of using C1 when the distribution is q(x).
Note: the exercise can be done in Matlab, using the functions and scripts of previous exercises and adding new features where appropriate.
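A Python sketch of the quantities in points a)-d) (the slides suggest Matlab; the function names H, D, and avg_len are my own):

```python
# Entropy, relative entropy, and average codeword length for Example 4.
from math import log2

def H(p):
    """Entropy in bits."""
    return -sum(pi * log2(pi) for pi in p if pi > 0)

def D(p, q):
    """Relative entropy D(p||q) in bits."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def avg_len(p, code):
    """Average codeword length in bits."""
    return sum(pi * len(c) for pi, c in zip(p, code))

p = [1/2, 1/4, 1/8, 1/16, 1/16]
q = [1/2, 1/8, 1/8, 1/8, 1/8]
C1 = ["0", "10", "110", "1110", "1111"]
C2 = ["0", "100", "101", "110", "111"]

print(H(p), H(q))                      # 1.875 2.0
print(D(p, q), D(q, p))                # 0.125 0.125
print(avg_len(p, C1), avg_len(q, C2))  # 1.875 2.0  (both codes optimal)
print(avg_len(p, C2))                  # 2.0 = H(p) + D(p||q)
print(avg_len(q, C1))                  # 2.125 -> efficiency 2/2.125 = 16/17
```

All probabilities here are negative powers of two, so the printed values are exact.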
Example 4: Solution
a)
H(p) = (1/2)·log2 + (1/4)·log4 + (1/8)·log8 + (1/16)·log16 + (1/16)·log16 = 15/8 bits
H(q) = (1/2)·log2 + (1/8)·log8 + (1/8)·log8 + (1/8)·log8 + (1/8)·log8 = 2 bits
D(p||q) = (1/2)·log1 + (1/4)·log2 + (1/8)·log1 + (1/16)·log(1/2) + (1/16)·log(1/2) = 1/8 bits
D(q||p) = (1/2)·log1 + (1/8)·log(1/2) + (1/8)·log1 + (1/8)·log2 + (1/8)·log2 = 1/8 bits
b)
L(C1) = ∑_i p_i·l_i = (1/2)·1 + (1/4)·2 + (1/8)·3 + (1/16)·4 + (1/16)·4 = 15/8 bits = H(p)
L(C2) = ∑_i q_i·l_i = (1/2)·1 + 4·(1/8)·3 = 2 bits = H(q)
Example 4: Solution
c)
L(C2) = ∑_i p_i·l_i = (1/2)·1 + (1/4)·3 + (1/8)·3 + (1/16)·3 + (1/16)·3 = 2 bits
L(C2) − H(p) = 2 − 15/8 = 1/8 bit (= D(p||q))
d)
L(C1) = ∑_i q_i·l_i = (1/2)·1 + (1/8)·2 + (1/8)·3 + (1/8)·4 + (1/8)·4 = 17/8 bits
η = L_min / L = 2 / (17/8) = 16/17
Source encoding
Classes of Codes
Source symbol | Singular code | Non-singular, but not uniquely decodable | Non-singular, uniquely decodable, not instantaneous | Non-singular, uniquely decodable, instantaneous
1 | 0 | 0 | 10 | 0
2 | 0 | 010 | 00 | 10
3 | 0 | 01 | 11 | 110
4 | 0 | 10 | 110 | 111
*non-singular: if x ≠ x’ then c(x) ≠ c(x’)
*uniquely decodable: a code is called uniquely decodable if every extension of it is non-singular.
*instantaneous/prefix: no codeword is a prefix of any other codeword.
Optimal coding: minimizes the average length L [bit] of the code, subject to the constraint H(X) ≤ L.
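The prefix (instantaneous) property can be checked directly for the codes in the table; a small sketch of my own (the helper name is_prefix_code is not from the slides):

```python
# Direct check of the prefix (instantaneous) property for two codes from
# the classes-of-codes table. The helper name is_prefix_code is my own.
def is_prefix_code(codewords):
    """True iff no codeword is a prefix of another. After lexicographic
    sorting, any prefix pair must appear as adjacent entries."""
    words = sorted(codewords)
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

print(is_prefix_code(["0", "10", "110", "111"]))   # instantaneous code: True
print(is_prefix_code(["10", "00", "11", "110"]))   # uniquely decodable but
                                                   # not instantaneous: False
```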