Brought to you by Max (ICQ:31252512 TEL:61337706)
February 5, 2005
Advanced Data Structures
Introduction
Outline
• Review of some data structures
  - Array
  - Linked List
  - Sorted Array
• New stuff
  - 3 of the most important data structures in OI (and your own programming)
  - Binary Search Tree
  - Heap (Priority Queue)
  - Hash Table
Review
• How do we measure the merits of a data structure?
• By the time complexity of its common operations
  - Function Find(T : DataType) : Element
  - Function Find_Min() : Element
  - Procedure Add(T : DataType)
  - Procedure Remove(E : Element)
  - Procedure Remove_Min()
Review - Array
• Here an Element is simply the integer index of the array cell
• Find(T)
  - Must scan the whole array, O(N)
• Find_Min()
  - Also needs to scan the whole array, O(N)
• Add(T)
  - Simply add it to the end of the array, O(1)
• Remove(E)
  - Deleting an element creates a hole
  - Copy the last element to fill the hole, O(1)
• Remove_Min()
  - Need to Find_Min() then Remove(), O(N)
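The array operations above can be sketched in Python as follows. This is only an illustration: the class name `UnsortedArray` and its method names are made up here, and a Python list stands in for the fixed-size array a contest program would normally use.

```python
class UnsortedArray:
    """Unsorted array; an Element is an integer index into self.cells."""

    def __init__(self):
        self.cells = []

    def add(self, value):
        # Append at the end: O(1) (amortised for a Python list).
        self.cells.append(value)
        return len(self.cells) - 1

    def find(self, value):
        # Linear scan: O(N). Returns an index, or None if absent.
        for i, v in enumerate(self.cells):
            if v == value:
                return i
        return None

    def find_min(self):
        # Linear scan for the index of the smallest value: O(N).
        if not self.cells:
            return None
        return min(range(len(self.cells)), key=self.cells.__getitem__)

    def remove(self, index):
        # Fill the hole with the last element, then shrink the array: O(1).
        self.cells[index] = self.cells[-1]
        self.cells.pop()

    def remove_min(self):
        # Find_Min() followed by Remove(): O(N). Assumes a non-empty array.
        i = self.find_min()
        value = self.cells[i]
        self.remove(i)
        return value
```

Note that filling the hole with the last element changes the order of the cells, which is acceptable only because the array is unsorted.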
Review - Linked List
• Element is a pointer to the object
• Find(T)
  - Scan the whole list, O(N)
• Find_Min()
  - Scan the whole list, O(N)
• Add(T)
  - Just add it to a convenient position (e.g. head), O(1)
• Remove(E)
  - With suitable implementation, O(1)
• Remove_Min()
  - Need to Find_Min() then Remove(), O(N)
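One "suitable implementation" for O(1) removal is a doubly linked list, where each node knows both neighbours. The sketch below assumes that choice; the slides do not say which implementation was meant, and the names `ListNode` and `LinkedList` are illustrative.

```python
class ListNode:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class LinkedList:
    def __init__(self):
        self.head = None

    def add(self, value):
        # Insert at the head: O(1). The returned node is the Element.
        node = ListNode(value)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        return node

    def remove(self, node):
        # Unlink the node using its own prev/next pointers: O(1).
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        node.prev = node.next = None

    def values(self):
        # Walk the list from the head: O(N).
        out = []
        node = self.head
        while node is not None:
            out.append(node.value)
            node = node.next
        return out
```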
Review - Sorted Array
• Like the array, Element is the integer index of the cell
• Find(T)
  - We can use binary search, O(logN)
• Find_Min()
  - The first element must be the minimum, O(1)
• Add(T)
  - First we need to find the correct place, O(logN)
  - Then we need to shift the rest of the array by 1 cell, O(N)
• Remove(E)
  - Deleting an element creates a hole
  - Need to shift the rest of the array by 1 cell, O(N)
• Remove_Min()
  - Can be O(1) or O(N) depending on choice of implementation (e.g. O(1) if we advance a start index instead of shifting)
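A minimal Python sketch of the sorted array, assuming ascending order and using the standard `bisect` module for the binary search. The class name `SortedArray` is illustrative.

```python
import bisect


class SortedArray:
    """Array kept in ascending order; an Element is an integer index."""

    def __init__(self):
        self.cells = []

    def find(self, value):
        # Binary search: O(log N). Returns an index, or None if absent.
        i = bisect.bisect_left(self.cells, value)
        if i < len(self.cells) and self.cells[i] == value:
            return i
        return None

    def find_min(self):
        # The first cell holds the minimum: O(1).
        return 0 if self.cells else None

    def add(self, value):
        i = bisect.bisect_left(self.cells, value)  # find the place: O(log N)
        self.cells.insert(i, value)                # shift the tail: O(N)

    def remove(self, index):
        del self.cells[index]                      # shift the tail: O(N)
```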
Review - Summary
• If we are going to perform a lot of these operations (e.g. N=100000), none of these is fast enough!
| Operation | Array | Linked List | Sorted Array |
|---|---|---|---|
| Find | O(N) | O(N) | O(logN) |
| Find_Min | O(N) | O(N) | O(1) |
| Add | O(1) | O(1) | O(N) |
| Remove | O(1) | O(1) | O(N) |
| Remove_Min | O(N) | O(N) | O(1) or O(N) |
Advanced Data Structures
Binary Search Tree
What is a Binary Search Tree?
• Use a binary tree to store the data
• Maintain this property
  - Left Subtree < Node < Right Subtree
Binary Search Tree - Implementation
• Definition of a Node:
    Node = Record
      Left, Right : ^Node;
      Value : Integer;
    End;
• To search for a value (pseudocode)
    Node Find(Node N, Value V) :-
      If (N.Value = V)
        Return N;
      Else If (V < N.Value) and (N.Left != NULL)
        Return Find(N.Left, V);
      Else If (V > N.Value) and (N.Right != NULL)
        Return Find(N.Right, V);
      Else
        Return NULL; // not found
Binary Search Tree - Find
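A search can be traced as a single walk down from the root, going left or right at each node. The Python sketch below records the values visited on the way; the example tree and the helper name `find_path` are made up for illustration.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def find_path(root, target):
    """Return (values visited during the search, node found or None)."""
    path = []
    node = root
    while node is not None:
        path.append(node.value)
        if target == node.value:
            return path, node
        # Smaller targets can only be in the left subtree, larger in the right.
        node = node.left if target < node.value else node.right
    return path, None


# Hypothetical example tree:
#         8
#       /   \
#      3     10
#     / \      \
#    1   6      14
root = Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14)))
```

Searching for 6 visits 8, 3 and 6; searching for 9 visits 8 and 10 and then stops at a missing left child.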
Binary Search Tree - Remove
• Case I : Removing a leaf node
  - Easy
• Case II : Removing a node with a single child
  - Replace the removed node with its child
• Case III : Removing a node with 2 children
  - Replace the removed node with the minimum element in the right subtree (or maximum element in the left subtree)
  - This may create a hole again
  - Apply Case I or II
• Sometimes you can avoid this by using “Lazy Deletion”
  - Mark a node as removed instead of actually removing it
  - Less coding, performance hit not big if you are not doing this frequently (may even save time)
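The three removal cases can be sketched in Python as below. This is one possible arrangement, not necessarily the one the slides had in mind: it is recursive, it returns the new root of the subtree, and in Case III it copies the successor's value into the node rather than relinking nodes.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def remove(node, value):
    """Remove value from the subtree rooted at node; return the new subtree root."""
    if node is None:
        return None                      # value not present
    if value < node.value:
        node.left = remove(node.left, value)
    elif value > node.value:
        node.right = remove(node.right, value)
    elif node.left is None:              # Case I (leaf) or Case II (right child only)
        return node.right
    elif node.right is None:             # Case II (left child only)
        return node.left
    else:                                # Case III: two children
        successor = node.right
        while successor.left is not None:
            successor = successor.left   # minimum of the right subtree
        node.value = successor.value
        # The successor has no left child, so removing it is Case I or II.
        node.right = remove(node.right, successor.value)
    return node


def inorder(node):
    """Values in ascending order, used here only to inspect the result."""
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)
```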
Binary Search Tree - Summary
• Add() is similar to Find()
• Find_Min()
  - Just walk to the left, easy
• Remove_Min()
  - Equivalent to Find_Min() then Remove()
• Summary
  - Find() : O(logN)
  - Find_Min() : O(logN)
  - Remove_Min() : O(logN)
  - Add() : O(logN)
  - Remove() : O(logN)
  - The BST is “supposed” to behave like that
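Since Add() follows the same path as Find() and Find_Min() only walks left, both fit in a few lines. The Python sketch below is an illustration under one assumption that the slides do not state: duplicate values are simply ignored.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def add(root, value):
    """Insert value and return the (possibly new) root. Duplicates are ignored."""
    if root is None:
        return Node(value)
    node = root
    while True:
        if value < node.value:
            if node.left is None:
                node.left = Node(value)   # the search fell off the tree here
                break
            node = node.left
        elif value > node.value:
            if node.right is None:
                node.right = Node(value)
                break
            node = node.right
        else:
            break                         # already present
    return root


def find_min(root):
    """Walk left until there is no left child. Assumes root is not None."""
    node = root
    while node.left is not None:
        node = node.left
    return node
```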
Binary Search Tree - Problems
• In reality…
  - All these operations are O(logN) only if the tree is balanced
  - Inserting a sorted sequence makes the tree degenerate into a linked list
• The real upper bounds
  - Find() : O(N)
  - Find_Min() : O(N)
  - Remove_Min() : O(N)
  - Add() : O(N)
  - Remove() : O(N)
• Solution
  - AVL Tree, Red Black Tree
  - Use “rotations” to maintain balance
  - Both are difficult to implement, rarely used
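The effect of insertion order can be checked with a small experiment. The Python sketch below inserts the same 1023 values twice into a plain (unbalanced) BST: once in sorted order, and once in a hand-picked "middle value first" order. The helper names and the choice of 1023 are illustrative assumptions.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def add(root, value):
    """Plain BST insertion without any rebalancing."""
    if root is None:
        return Node(value)
    node = root
    while True:
        if value < node.value:
            if node.left is None:
                node.left = Node(value)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(value)
                return root
            node = node.right


def height(root):
    """Number of nodes on the longest root-to-leaf path (iterative)."""
    best, stack = 0, [(root, 1)]
    while stack:
        node, depth = stack.pop()
        if node is None:
            continue
        best = max(best, depth)
        stack.append((node.left, depth + 1))
        stack.append((node.right, depth + 1))
    return best


def median_first(lo, hi):
    """Yield lo..hi-1 so that the middle of each range comes before the rest."""
    if lo < hi:
        mid = (lo + hi) // 2
        yield mid
        yield from median_first(lo, mid)
        yield from median_first(mid + 1, hi)


def build(values):
    root = None
    for v in values:
        root = add(root, v)
    return root


N = 1023
sorted_height = height(build(range(N)))               # a chain of 1023 nodes
balanced_height = height(build(median_first(0, N)))   # a perfectly balanced tree
```

With sorted input every node has only a right child, so the height equals N; with the middle-first order the height is 10, which is log2(N + 1).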
Advanced Data Structures
Heap (Priority Queue)
What is a Heap?
• A (usually) complete binary tree used to implement a Priority Queue
  - Enqueue = Add
  - Dequeue = Find_Min and Remove_Min
• Heap Property
  - Every node’s value is no greater than those of its descendants (a min-heap, so the minimum is always at the root)
Heap - Implementation
• Usually we use an array to simulate a heap
• Assume nodes are indexed 1, 2, 3, ...
  - Parent = [Node / 2] (rounded down)
  - Left Child = Node*2
  - Right Child = Node*2 + 1
Heap - Add
• Append the new element at the end
• Shift it up until the heap property is restored
• Why does this always work?
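A minimal Python sketch of Add, assuming a min-heap (smallest value at the root, as Find_Min requires) stored in a 1-indexed list whose cell 0 is an unused placeholder. The function name `heap_add` is illustrative.

```python
def heap_add(heap, value):
    """Add value to a 1-indexed min-heap stored in a list (heap[0] is unused)."""
    heap.append(value)                    # append at the end
    node = len(heap) - 1
    # Shift up: swap with the parent while the parent is larger.
    while node > 1 and heap[node // 2] > heap[node]:
        heap[node // 2], heap[node] = heap[node], heap[node // 2]
        node //= 2


heap = [None]
for v in [5, 3, 8, 1]:
    heap_add(heap, v)
```

Only the nodes on the path from the new leaf to the root are touched, so the cost is at most the height of the tree, O(logN).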
Heap - Remove_Min
• Replace the root with the last element
• Shift it down until the heap property is restored
• Again, why does it always work?
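Remove_Min can be sketched the same way, again assuming a min-heap in a 1-indexed list with an unused cell 0. The function name `heap_remove_min` is illustrative.

```python
def heap_remove_min(heap):
    """Remove and return the minimum of a non-empty 1-indexed min-heap."""
    smallest = heap[1]
    last = heap.pop()                 # take the last element off the array
    size = len(heap) - 1              # number of elements left
    if size >= 1:
        heap[1] = last                # it replaces the root
        node = 1
        while 2 * node <= size:
            child = 2 * node
            # Pick the smaller of the two children.
            if child + 1 <= size and heap[child + 1] < heap[child]:
                child += 1
            if heap[node] <= heap[child]:
                break                 # heap property restored
            heap[node], heap[child] = heap[child], heap[node]
            node = child
    return smallest
```

Swapping with the smaller child matters: the value moved up becomes the parent of the other child, so it must not be larger than it.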
Heap - Build_Heap
• There is a special operation called Build_Heap
  - Transform an ordinary array into a heap without using extra memory
• The Remove_Min operation has two steps
  - Replace the root with a leaf node
  - Restore the heap structure by shifting the node down
• The second step is called “Heapify”
• If we apply the Heapify step to ALL internal nodes, from the bottom up, we get a heap
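A Python sketch of Build_Heap under the same assumptions as before (min-heap, 1-indexed list, cell 0 unused). The example data is made up.

```python
def heapify(heap, node, size):
    """Shift heap[node] down within a 1-indexed min-heap of the given size."""
    while 2 * node <= size:
        child = 2 * node
        if child + 1 <= size and heap[child + 1] < heap[child]:
            child += 1                # use the smaller child
        if heap[node] <= heap[child]:
            break
        heap[node], heap[child] = heap[child], heap[node]
        node = child


def build_heap(heap):
    """Rearrange heap[1..] in place so that it satisfies the heap property."""
    size = len(heap) - 1
    # Nodes size//2 + 1 .. size are leaves, so start at the last internal node.
    for node in range(size // 2, 0, -1):
        heapify(heap, node, size)


data = [None, 9, 4, 7, 1, 8, 2]
build_heap(data)
```

Working from the bottom up guarantees that both subtrees of a node are already heaps when Heapify is applied to it.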
Heap - Summary
• Find() is usually not supported by a heap
  - You may scan the whole tree / array if you really want
• Remove() is equivalent to applying Remove_Min() on a subtree
  - Remember that any subtree of a heap is also a heap
  - In the array representation the hole is filled with the last element of the whole heap, which may need to be shifted up instead of down
• Summary
  - Find() : O(N) // We usually don’t use Heap for this
  - Find_Min() : O(1)
  - Remove_Min() : O(logN)
  - Add() : O(logN)
  - Remove() : O(logN)
Advanced Data Structures
Hash Table
What is a Hash Table?
• Question
  - We have a Mark Six result (6 integers in the range 1..49)
  - We want to check if our bet matches it
  - What is the most efficient way?
• Answer
  - Use a boolean array with 49 cells
  - Checking a number is O(1)
• Problem
  - What if the range of numbers is very large?
  - What if we need to store strings?
• Solution
  - Use a “Hash Function” to compress the range of values
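The boolean-array answer to the Mark Six question can be sketched in Python as follows. The drawn numbers and the bet are made-up values, and one extra cell is allocated so that the numbers 1..49 can be used directly as indices.

```python
def make_lookup(result):
    """Boolean array indexed 1..49; drawn[n] is True if n is in the result."""
    drawn = [False] * 50              # index 0 is unused
    for n in result:
        drawn[n] = True
    return drawn


def count_matches(drawn, bet):
    # Each check is a single array access: O(1) per number.
    return sum(1 for n in bet if drawn[n])


result = [3, 11, 17, 25, 38, 49]      # made-up draw
drawn = make_lookup(result)
matches = count_matches(drawn, [3, 12, 17, 25, 40, 49])
```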
Hash Table
• Suppose we need to store values between 0 and 99, but only have an array with 10 cells
• We can map the values [0,99] to [0,9] by taking modulo 10. The result is the “Hash Value”
• Adding, finding and removing an element are O(1)
• It is even possible to map the strings to integers, e.g. “ATE” to (1*26*26+20*26+5) mod 10
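The string example above treats a word as a base-26 number with A = 1, ..., Z = 26 and then takes the result modulo the table size. A minimal Python sketch, assuming upper-case letters only (the function name `hash_string` is illustrative):

```python
def hash_string(s, table_size):
    """Read an upper-case word as a base-26 number (A=1 ... Z=26), then take the modulus."""
    value = 0
    for ch in s:
        value = value * 26 + (ord(ch) - ord("A") + 1)
    return value % table_size


ate_hash = hash_string("ATE", 10)     # (1*26*26 + 20*26 + 5) mod 10
```

Here 1*26*26 + 20*26 + 5 = 1201, so "ATE" lands in cell 1 of a 10-cell table.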
Hash Table - Collision
• But this approach has an inherent problem
  - What happens if two pieces of data have the same hash value?
• Two major methods to deal with this
  - Chaining (Also called Open Hashing)
  - Open Addressing (Also called Closed Hashing)
Hash Table - Chaining
• Keep a linked list at each hash table cell
• On average, Add / Find / Remove is O(1 + α)
  - α = Load Factor = # of stored elements / # of cells
• If the hash function is “random” enough, we can usually get the average case
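A minimal Python sketch of chaining, under two simplifying assumptions: keys are non-negative integers hashed by taking the modulus, and a Python list stands in for the linked list at each cell. The class name `ChainedHashTable` is illustrative.

```python
class ChainedHashTable:
    """Integer keys only; each cell holds a chain of the keys that hash to it."""

    def __init__(self, num_cells=10):
        self.cells = [[] for _ in range(num_cells)]
        self.count = 0

    def _chain(self, key):
        return self.cells[key % len(self.cells)]

    def add(self, key):
        chain = self._chain(key)
        if key not in chain:          # scan one chain only
            chain.append(key)
            self.count += 1

    def find(self, key):
        return key in self._chain(key)

    def remove(self, key):
        chain = self._chain(key)
        if key in chain:
            chain.remove(key)
            self.count -= 1

    def load_factor(self):
        # Number of stored elements divided by number of cells.
        return self.count / len(self.cells)
```

Each operation scans a single chain, whose average length is the load factor, which is where the average cost of 1 plus the load factor comes from.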
Hash Table - Open Addressing
• If you don’t want to implement a linked list…
• An alternative is to skip a cell if it is occupied
• The following diagram illustrates “Linear Probing”
  [Diagram: linear probing]
Hash Table - Open Addressing
• Find() must continue until a blank cell is reached
• Remove() must use Lazy Deletion, otherwise further operations may fail
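Both rules show up in a small Python sketch of linear probing. The assumptions here are integer keys hashed by modulus, a table that never fills up completely, and two marker objects (`EMPTY` and `DELETED`, names chosen for this example) to tell a blank cell from a lazily deleted one.

```python
EMPTY = object()      # cell that has never been used
DELETED = object()    # marker left behind by lazy deletion


class LinearProbingTable:
    """Integer keys only; assumes the table never becomes completely full."""

    def __init__(self, num_cells=10):
        self.cells = [EMPTY] * num_cells

    def _probe(self, key):
        # Cell indices starting at the hash value, wrapping around once.
        n = len(self.cells)
        start = key % n
        for step in range(n):
            yield (start + step) % n

    def find(self, key):
        for i in self._probe(key):
            if self.cells[i] is EMPTY:
                return None           # a blank cell ends the search
            if self.cells[i] is not DELETED and self.cells[i] == key:
                return i
        return None

    def add(self, key):
        if self.find(key) is not None:
            return                    # already stored
        for i in self._probe(key):
            if self.cells[i] is EMPTY or self.cells[i] is DELETED:
                self.cells[i] = key   # reuse the first free or deleted cell
                return
        raise OverflowError("table is full")

    def remove(self, key):
        i = self.find(key)
        if i is not None:
            self.cells[i] = DELETED   # mark the cell, do not blank it
```

If Remove() blanked the cell instead, a later Find() for a key stored further along the same probe sequence would stop at the blank cell and wrongly report the key as missing.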
Hash Table - Summary
• Find_Min() and Remove_Min() are usually not supported in a Hash Table
  - You may scan the whole table if you really want
• For Chaining
  - Find() : O(1 + α)
  - Add() : O(1 + α)
  - Remove() : O(1 + α)
• For Open Addressing
  - Find() : O(1 / (1 - α))
  - Add() : O(1 / (1 - α))
  - Remove() : O(ln(1 / (1 - α)) / α + 1 / α)
• Both are close to O(1) if the load factor α is kept small (< 50%)
Miscellaneous Stuff
• Judge problems
  - 1020 – Left Join
  - 1021 – Inner Join
  - 1019 – Addition II
• Past contest problems
  - NOI2004 Day 1 – Cashier
  - Any more?
• Good place to find related information - Wikipedia
  - http://en.wikipedia.org/wiki/Binary_search_tree
  - http://en.wikipedia.org/wiki/Binary_heap
  - http://en.wikipedia.org/wiki/Hash_table