Huffman Coding and Decoding
• TAIABUL HAQUE
• NAEEMUL HASSAN
Huffman Encoding
An encoding algorithm used for lossless data compression:
• Variable-length code
• Prefix code
Basic intuition: symbols that occur more frequently should get shorter codes.
A special kind of binary tree, called a Huffman tree, is built by exploiting this property.
Huffman Tree Creation
Character Frequency
A 29
E 23
I 25
O 14
U 7
Frequencies sorted in ascending order: 7 14 23 25 29
Merge the two smallest, U(7) and O(14), into a node of weight 21.
Queue: 21 23 25 29
Merge the two smallest, 21 [U(7), O(14)] and E(23), into a node of weight 44.
Queue: 25 29 44
Merge the two smallest, I(25) and A(29), into a node of weight 54.
Queue: 44 54
Merge the last two nodes, 44 and 54, into the root of weight 98.
Final tree: 98 [44 [21 [U(7), O(14)], E(23)], 54 [I(25), A(29)]]
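The tree creation above can be sketched in Python with a min-heap as the priority queue. This is a minimal sketch, not the authors' program: the names `build_huffman_tree` and `assign_codes` are hypothetical, and labelling left branches 0 and right branches 1 is an assumption (the slides do not show the bit labels).

```python
import heapq
from itertools import count

def build_huffman_tree(freqs):
    """Return (root, merge_weights). A leaf is a symbol string,
    an internal node is a (left, right) tuple."""
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    heap = [(weight, next(tie), sym) for sym, weight in freqs.items()]
    heapq.heapify(heap)
    merges = []
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # smallest weight
        w2, _, right = heapq.heappop(heap)   # second smallest weight
        merges.append(w1 + w2)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
    return heap[0][2], merges

def assign_codes(node, prefix="", table=None):
    """Walk the tree; assumed convention: left = 0, right = 1."""
    table = {} if table is None else table
    if isinstance(node, str):
        table[node] = prefix or "0"  # a single-symbol alphabet still needs one bit
    else:
        assign_codes(node[0], prefix + "0", table)
        assign_codes(node[1], prefix + "1", table)
    return table

freqs = {"A": 29, "E": 23, "I": 25, "O": 14, "U": 7}
root, merges = build_huffman_tree(freqs)
codes = assign_codes(root)
print(merges)  # [21, 44, 54, 98]
print(codes)   # {'U': '000', 'O': '001', 'E': '01', 'I': '10', 'A': '11'}
```

The merge weights match the slides (21, 44, 54, 98). Under the assumed labelling, the three most frequent vowels get 2-bit codes and the two rarest get 3-bit codes.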
Start
Accept training data
Scan data, keep tally
Make prioritized list
Create, Draw Tree
Traverse tree
Determine code words
Save code words
Accept test sentence
Encode with lookup
Display encoded string
Decode with traversal
Display decoded string
Calculate compression ratio
End
Training data: analysis of algorithm
Tally while scanning, for example: a(3)
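The "scan data, keep tally" and "make prioritized list" steps might look like the sketch below. It uses `collections.Counter` from the standard library. The variable names are hypothetical, and ordering ties alphabetically is an assumption, since the slides do not say how ties are broken.

```python
from collections import Counter

training_data = "analysis of algorithm"

# Scan the data and keep a tally of every character (spaces included).
tally = Counter(training_data)

# Prioritized list: lowest count first, ties ordered by character (assumed).
prioritized = sorted(tally.items(), key=lambda kv: (kv[1], kv[0]))

print(tally["a"], tally["l"], tally["g"], tally["o"])  # 3 2 1 2
```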
Test sentence "algo": 4 characters * 8 bits = 32 bits uncompressed
Encoded string: 10101100011110 = 14 bits
Compression ratio = 14 / 32 * 100 = 43.75%
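The same arithmetic as a small Python helper. The function name `compression_ratio` and the 8-bits-per-character default are assumptions taken from the calculation above, not from the authors' code.

```python
def compression_ratio(original_text, encoded_bits, bits_per_char=8):
    """Encoded size as a percentage of the uncompressed size."""
    original_bits = len(original_text) * bits_per_char
    return len(encoded_bits) / original_bits * 100

ratio = compression_ratio("algo", "10101100011110")
print(f"{ratio:.2f}%")  # 43.75%
```

With this definition a smaller percentage means better compression: the encoded string needs 43.75% of the original bits.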
Encoded string: 10101100011110
Starting from the root for each codeword, the traversal reaches these leaves in order:
a(3)
l(2)
g(1)
o(2)
Decoded string: algo
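Decoding by traversal can be sketched as follows. The slides do not list the codewords of the tree trained on "analysis of algorithm", so this sketch uses the vowel tree from the tree-creation slides instead, with the assumed labelling left = 0, right = 1. The names `TREE` and `decode` are hypothetical.

```python
# Vowel tree from the tree-creation slides as nested (left, right) tuples.
# Assumed labelling: index 0 = bit 0 (left), index 1 = bit 1 (right).
TREE = ((("U", "O"), "E"), ("I", "A"))

def decode(bits, tree):
    """Follow one branch per bit; on reaching a leaf, emit it and restart at the root."""
    out, node = [], tree
    for bit in bits:
        node = node[int(bit)]
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = tree
    if node is not tree:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)

print(decode("1101000", TREE))  # AEU
```

Because the code is a prefix code, no separators are needed between codewords: the first leaf reached is always the right symbol.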
Frequency Analysis
• E T A O I N S H R D L U is the approximate order of frequency of the twelve most commonly used letters in the English language.
• Our observations:
File size    Order of letters
338          E T I A O N S R C H L D
65           E T N O I A R S C L H D
70           E A O T R S I N L H D C
8            E O T A N R I S L H U D
677          E T A O N I R S H L D C
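An ordering like the ones in the table could be produced with a sketch such as the one below. How the authors actually counted is not stated, so the function name `letter_order`, the restriction to the letters A to Z, and the alphabetical tie-break are all assumptions.

```python
from collections import Counter

def letter_order(text, top=12):
    """Return the `top` most frequent letters, most frequent first."""
    counts = Counter(ch for ch in text.upper() if "A" <= ch <= "Z")
    # Highest count first; ties broken alphabetically (assumed).
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return " ".join(ranked[:top])

print(letter_order("analysis of algorithm"))  # A I L O S F G H M N R T
```

A sample as short as this one is far from the E T A O I N order. In the table, the orderings for the larger files agree with it more closely.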
THANK YOU