data structures' project
TRANSCRIPT
![Page 1: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/1.jpg)
1
![Page 2: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/2.jpg)
COMSATS Institute of Information & Technology, Islamabad
Subject : Data Structures
A presentation
by
Ayesha Arif
FA10-BSB-014
&
Ghazanfar Ali BaigFA10-BSB-024
Instructor Name:
Ms. Nusrat Shaheen
![Page 3: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/3.jpg)
Huffman’s Code
3
![Page 4: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/4.jpg)
Background
• When we encode characters in computers, we assign each an 8-bit code based on an ASCII chart.
• But in most files, some characters appear more often than others.
• So wouldn't it make more sense to assign shorter codes for characters that appear more often and longer codes for characters that appear less often?
4
![Page 5: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/5.jpg)
Data Compressions
• Lossless compressions:Less space is used, or energy to transmit it. –No information is lost –we can reconstruct
the original text–Often: be able to quickly reconstruct original
text • Lossy compressions.
5
![Page 6: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/6.jpg)
Data Compressions
• Text representation:• ASCII: 1 byte per character (fixed length
code)– “M A R C”
1st Byte 2nd Byte 3rd Byte 4th byte77 97 114 99
01001101 01100001 01110010 01100011
6
![Page 7: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/7.jpg)
• In the English language, the letter ‘e’ is, ‘quite common x’ is less common–Why should all characters be represented using the same number of bits?
7
![Page 8: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/8.jpg)
Huffman
• 1951, David Huffman found the “most efficient method of representing numbers, letters, and other symbols using binary code”
• Now standard method used for data compression
8
![Page 9: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/9.jpg)
The Concept
• Huffman coding has the following properties:
• Codes for more probable characters are shorter than ones for less probable characters.
• Each code can be uniquely decoded .
9
![Page 10: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/10.jpg)
• To read the codes from a Huffman tree, start from the root and add a '0' every time you go left to a child, and add a '1' every time you go right. So in this example, the code for the character 'b' is 01 and the code for 'd' is 110.
• As you can see, 'a' has a shorter code than 'd'. Notice that since all the characters are at the leafs of the tree (the ends), there is never a chance that one code will be the prefix of another one (eg. 'a' is 00 and 'b' is 011
• ).• Hence, this unique prefix property assures that each
code can be uniquely decoded.
The Concept10
![Page 11: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/11.jpg)
Huffman Codes
• How to construct Huffman codes?• Build a binary tree!
11
![Page 12: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/12.jpg)
Binary Tree
• A binary tree is a tree where each internal node has at most 2 children.
12
![Page 13: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/13.jpg)
A
B
D
C
FE G
The “Root Node”
The “Left Node” The “Right Node”
Sibling Nodes
Internal Vertex
Leaf Vertex
13
Binary Tree
![Page 14: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/14.jpg)
Algorithm
1.Take the characters and their frequencies, and this list by increasing frequency.
2 .All the characters are vertices of the tree.3.Take the first 2 vertices from the list and make them
children of a vertex having the sum of their frequencies.4.Insert the new vertex into the sorted list of vertices
waiting to be put into the tree.5.If there are at least 2 vertices in the list, go to step 3.6.Read the Huffman code from the tree.
14
![Page 15: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/15.jpg)
Huffman’s Code
1.Take the characters and their frequencies, and sort this list by increasing frequency
A: 3, O: 5, T: 7, E:10
15
![Page 16: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/16.jpg)
2. All the characters are the vertices of the tree:
A:3 O:5 E:10T:7
Huffman’s Code16
![Page 17: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/17.jpg)
3 .Take the first two vertices from the list and make them children of a vertex having the sum of their frequencies
A:3 O:5
* : 8
Huffman’s Code17
![Page 18: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/18.jpg)
4 .Insert the new vertex into the sorted list of vertices waiting to be put into tree
• List of remaining vertices
• New list, with the new vertex inserted
T:7
T:7 E:108
E:10
Huffman’s Code18
![Page 19: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/19.jpg)
5. Take the first 2 vertices from the list and make them
Children of the vertex having the sum of their frequencies.
A:3
*: 15
*: 8
O:5
New vertex withfrequency 7 + 8 = 15
Huffman’s Code19
T:7
![Page 20: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/20.jpg)
6. Insert the new vertex into the sorted list of vertices waiting to be put into tree.
• List of remaining vertices
• New list, with the new vertex inserted
E:10 * : 15
E:10
Huffman’s Code20
![Page 21: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/21.jpg)
7. Take the
first two
Vertices from
the list and make
them children of
the vertex having
the sum of their frequencies.
E:10
T:7
*:25
*:15
A:3
*:8
O:5
New vertex with frequency 10 +15 = 25Huffman’s Code
21
![Page 22: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/22.jpg)
• Left branch = 0 • Right branch = 1 • Huffman CodeE:0T:10A:110O: 111
0 1
0 1
0 1
Huffman’s Code
22
T:7
*:25
*:15
A:3
*:8
O:5
E:10
![Page 23: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/23.jpg)
Advantage of Huffman Codes
• Reduce size of data by 20%-90% in general• If no characters occur more frequently than others,
then no advantage over ASCII.• Encoding: –Given the characters and their frequencies, perform
the algorithm and generate a code. Write the characters using the code.
• Decoding: Given the Huffman tree, figure out what each
character is (possible because of prefix property).
23
![Page 24: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/24.jpg)
Applications
• Both the .mp3 and .jpg file formats use
Huffman coding at one stage of the compression.
• Alternative method that achieves higher compression but is slower is patented by IBM, making Huffman Codes attractively.
24
![Page 25: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/25.jpg)
Program25
![Page 26: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/26.jpg)
References
•http://www.siggraph.org/education/materials/HyperGraph/video/mpeg/mpegfaq/huffman_tutorial.html
• http://www.huffmancoding.com/david/algorithm.html
• http://www.cs.sfu.ca/CourseCentral/365/li/squeeze/Huffman.htm
26
![Page 27: Data structures' project](https://reader037.vdocuments.site/reader037/viewer/2022103000/556175f0d8b42a98268b4ab0/html5/thumbnails/27.jpg)
27
TashakuR