Binary Trees: Motivation
Searching a linked list.
Linear Search
/* To linear search a list for a particular Item */
1. Set Loc = 0.
2. Repeat the following:
   a. If Loc >= length of list, return –1 to indicate Item not found.
   b. If list element at location Loc is Item, return Loc as location of Item.
   c. Increment Loc by 1.
Linear search can be used for lists stored in arrays as well as for linked lists.
It's the method used in the find algorithm in STL.
For a list of length n, its average search time will be O(n).
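The linear search steps above can be sketched in C++ as follows. This is a minimal illustration, assuming the list is held in a std::vector; the name linearSearch is hypothetical and not part of the slides.

```cpp
#include <cstddef>
#include <vector>

// Linear search: returns the index of item in list, or -1 if it is absent.
// (linearSearch is a hypothetical name used only for this sketch.)
template <typename T>
int linearSearch(const std::vector<T>& list, const T& item)
{
    for (std::size_t loc = 0; loc < list.size(); ++loc) {
        if (list[loc] == item)
            return static_cast<int>(loc);   // found at location loc
    }
    return -1;                              // ran off the end: not found
}
```

Every element may have to be inspected before the answer is known, which is where the O(n) search time comes from.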
Binary Search
If a list is ordered, it can be searched more efficiently using binary search:
1. Set First = 0 and Last = Length of List – 1.
2. Repeat the following:
   a. If First > Last, return –1 to indicate Item not found.
   b. Loc = location of middle element in the sublist from locations First through Last.
   c. If Item < the list element at Loc, set Last = Loc – 1. // Search first half of list
      Else if Item > the list element at Loc, set First = Loc + 1. // Search last half of list
      Else return Loc as location of Item.
For an ordered list of length n, its average search time will be O(log n).
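A possible C++ rendering of the binary search steps, assuming the ordered list is stored in a std::vector. The name binarySearch is hypothetical. The middle location is computed as first + (last - first) / 2, which gives the same value as (First + Last) / 2 but cannot overflow for very large indices.

```cpp
#include <vector>

// Binary search on an ordered vector: returns the index of item, or -1.
// (binarySearch is a hypothetical name used only for this sketch.)
template <typename T>
int binarySearch(const std::vector<T>& list, const T& item)
{
    int first = 0;
    int last = static_cast<int>(list.size()) - 1;
    while (first <= last) {
        int loc = first + (last - first) / 2;   // middle of the sublist
        if (item < list[loc])
            last = loc - 1;                     // search first half
        else if (list[loc] < item)
            first = loc + 1;                    // search last half
        else
            return loc;                         // found
    }
    return -1;                                  // first > last: not found
}
```

Each pass halves the sublist still under consideration, so at most about log2 n passes are needed.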
Binary Search Efficiency
Binary search is efficient for lists stored in arrays. The middle element is found simply by calculating
Loc = (First + Last) / 2
For linked lists, however, binary search is not practical: we only have direct access to the first node, and locating any other node requires traversing the list.
Finding the middle element in a linked list becomes:
i. Mid = (First + Last) / 2
ii. LocPtr = pointer to the node at location First
iii. For Loc = First to Mid - 1: LocPtr = LocPtr->Next
iv. LocPtr now holds the address of the middle list element.
The traversal required in step iii results in O(n) computing time.
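To make the cost concrete, here is a small sketch of steps i through iv. The Node type and the function middleNode are assumptions made for illustration; the slides do not define them.

```cpp
// A singly linked list node (hypothetical layout for this sketch).
struct Node {
    int   data;
    Node* next;
};

// firstPtr points to the node at location first. Returns a pointer to the
// node at the middle location of the sublist first..last.
Node* middleNode(Node* firstPtr, int first, int last)
{
    int mid = (first + last) / 2;
    Node* locPtr = firstPtr;
    for (int loc = first; loc < mid; ++loc)   // step iii: walk link by link
        locPtr = locPtr->next;
    return locPtr;                            // step iv: the middle element
}
```

The loop runs mid - first times, so merely locating the middle is proportional to the sublist length, which cancels the benefit of halving.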
Make Binary Search efficient for LL
Modify the linked structure to make a binary search feasible?
Need direct access to the middle node (for first pass):
22 33 44 55 66 77 88   (middle node: 55)
and then direct access to the middle of the first (or second) half:
22 33 44 55 66 77 88   (middle of each half: 33 or 77)
and so on:
22 33 44 55 66 77 88   (remaining nodes: 22, 44, 66, 88)
Binary Search Tree
If we stretch out the links, we acquire a tree-like shape:
        55
    33      77
  22  44  66  88
Note that this tree is ordered!
Each element to the left of the root is less than the root.
Each element to the right of the root is greater than the root.
The tree is a binary search tree (BST).
Trees
A tree consists of a finite set of elements called nodes (or vertices) and a finite set of directed arcs that connect pairs of nodes.
A leaf is a node with no outgoing arcs.
Nodes directly accessible (using one arc) from node X are called the children of X.
[Figure: a tree with its root and leaves labeled; the children of one parent are marked as siblings of each other]
Array implementation of trees
A binary tree is a tree in which each node has at most 2 children.
An array can be used to store some binary trees. Number the nodes level by level, from left to right:
       0                 O
     1   2             M   T
    3 4 5 6           C E P U
and store node #0 in array location 0, node #1 in location 1, and so on:
i      0  1  2  3  4  5  6  . . .
T[i]   O  M  T  C  E  P  U  . . .
However, unless each level of the tree is full so there are no "dangling limbs," there can be much wasted space in the array.
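With this level-by-level numbering starting at 0, the children of node i sit at locations 2i + 1 and 2i + 2, and the parent of node i (for i > 0) sits at (i - 1) / 2. The slides do not state these formulas; they follow from the numbering scheme. The helper names below are hypothetical.

```cpp
#include <cstddef>

// Index arithmetic for a binary tree stored level by level in an array,
// with the root at location 0. (Helper names are hypothetical.)
inline std::size_t leftChild(std::size_t i)  { return 2 * i + 1; }
inline std::size_t rightChild(std::size_t i) { return 2 * i + 2; }
inline std::size_t parent(std::size_t i)     { return (i - 1) / 2; }  // requires i > 0
```

For the array O M T C E P U above, leftChild(1) is 3 and rightChild(1) is 4, so the children of M are C and E. A missing node still reserves its slot, which is the wasted space mentioned above.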
Linked list implementation of trees
Linked Implementation: Use nodes of the form
   [     data     ]
   [ left | right ]
where left points to the left child and right points to the right child,
and maintain a pointer to the root.
root -> 75
       /  \
     60    80
    /  \     \
  58    65    92
C++ implementation
template <typename BinTreeElement>
class BinaryTree
{
 public:
  // ... BinaryTree function members

 private:
  class BinNode                    // a binary tree node
  {
   public:
    BinTreeElement data;
    BinNode * left, * right;
  };
  typedef BinNode *BinNodePointer;

  // BinaryTree data members
  BinNodePointer root;             // pointer to the root node
};
Binary Search Tree
A Binary Search Tree (BST) is a binary tree in which the value in each node is greater than all values in its left subtree and less than all values in its right subtree.
To "binary search" a BST:
1. Set pointer locPtr = root.
2. Repeat the following:
   If locPtr is null, return False.
   If Value < locPtr->data, set locPtr = locPtr->left.
   Else if Value > locPtr->data, set locPtr = locPtr->right.
   Else return True.
Search time: O(log2 n) if the tree is balanced.
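The search steps translate almost line for line into C++. The sketch below uses a standalone BinNode struct holding int values instead of the nested template class shown earlier, purely to stay self-contained; bstSearch is a hypothetical name.

```cpp
// Standalone node type for this sketch (int data instead of a template).
struct BinNode {
    int      data;
    BinNode* left;
    BinNode* right;
};

// Returns true if value occurs in the BST rooted at root.
bool bstSearch(const BinNode* root, int value)
{
    const BinNode* locPtr = root;
    while (locPtr != nullptr) {
        if (value < locPtr->data)
            locPtr = locPtr->left;      // continue in left subtree
        else if (value > locPtr->data)
            locPtr = locPtr->right;     // continue in right subtree
        else
            return true;                // found
    }
    return false;                       // fell off the tree: not found
}
```

Each comparison descends one level, so the number of steps is bounded by the height of the tree. That height is about log2 n only when the tree is balanced.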
Traversing a Binary Search Tree
View a binary tree as a recursive data structure. A binary tree either:
i. is empty (anchor), or
ii. consists of a node called the root, which has pointers to two disjoint binary subtrees called the left subtree and the right subtree (inductive step).
For traversal, consider the three operations:
V: Visit a node.
L: (Recursively) traverse the left subtree of a node.
R: (Recursively) traverse the right subtree of a node.
Six different orders (permutations): LVR, VLR, LRV, VRL, RVL, and RLV.
However, by convention, always visit left before right.
Traversing a Binary Search Tree
By convention, visiting left before right leaves three standard traversals:
LVR (inorder), which yields an ordered sequence for a BST
VLR (preorder)
LRV (postorder)
Note the prefix (in, pre, post) refers to the order in which the root is visited relative to its left and right subtrees.
By keeping the perspective of the tree as a recursive structure, one can easily code functions for each of these tree traversal techniques. For example, for an inorder traversal:
L: Call Traverse to traverse the left subtree.
V: Visit the root.
R: Call Traverse to traverse the right subtree.
When the root is empty, an immediate return is executed.
Traversals
void Inorder(NodePointer r)    // yields ordered sequence
{
  if (r != 0) {
    Inorder(r->left);          // L
    Process(r->data);          // V
    Inorder(r->right);         // R
  }
}

void Preorder(NodePointer r)
{
  if (r != 0) {
    Process(r->data);          // V
    Preorder(r->left);         // L
    Preorder(r->right);        // R
  }
}

void Postorder(NodePointer r)
{
  if (r != 0) {
    Postorder(r->left);        // L
    Postorder(r->right);       // R
    Process(r->data);          // V
  }
}
Sample traversals
Example:
LVR (inorder): 58, 60, 65, 75, 80, 92 // note ordered
VLR (preorder): 75, 60, 58, 65, 80, 92
LRV (postorder): 58, 65, 60, 92, 80, 75
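These sequences can be reproduced with a small self-contained sketch. It assumes the sample tree with root 75, children 60 and 80, with 58 and 65 below 60 and 92 as the right child of 80. Instead of calling Process, each function appends the visited value to a std::vector so the order can be inspected; that substitution, the BinNode struct, and the lowercase function names are choices made for this illustration.

```cpp
#include <vector>

struct BinNode {
    int      data;
    BinNode* left;
    BinNode* right;
};

void inorder(const BinNode* r, std::vector<int>& out)
{
    if (r != nullptr) {
        inorder(r->left, out);      // L
        out.push_back(r->data);     // V
        inorder(r->right, out);     // R
    }
}

void preorder(const BinNode* r, std::vector<int>& out)
{
    if (r != nullptr) {
        out.push_back(r->data);     // V
        preorder(r->left, out);     // L
        preorder(r->right, out);    // R
    }
}

void postorder(const BinNode* r, std::vector<int>& out)
{
    if (r != nullptr) {
        postorder(r->left, out);    // L
        postorder(r->right, out);   // R
        out.push_back(r->data);     // V
    }
}
```

Calling each function on the root of the sample tree fills the vector with the inorder, preorder, and postorder sequences listed above.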
Expression trees
These names are appropriate: recall expression trees, binary trees used to represent arithmetic expressions like A – B * C + D:
Inorder traversal -> infix expression: A – B * C + D
Preorder traversal -> prefix expression: + – A * B C D
Postorder traversal -> postfix expression: A B C * – D +
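As a quick check, the expression tree for A – B * C + D has + at the root, with left subtree – (A, * (B, C)) and right subtree D. The sketch below builds strings from the three traversals; ExprNode and the function names are hypothetical. The infix version emits no parentheses, which is only adequate when, as here, operator precedence already matches the tree shape.

```cpp
#include <string>

struct ExprNode {
    char            symbol;   // an operator or an operand
    const ExprNode* left;
    const ExprNode* right;
};

// Inorder (LVR) gives the infix form.
std::string infix(const ExprNode* r)
{
    if (r == nullptr) return "";
    return infix(r->left) + r->symbol + infix(r->right);
}

// Preorder (VLR) gives the prefix form.
std::string prefix(const ExprNode* r)
{
    if (r == nullptr) return "";
    return std::string(1, r->symbol) + prefix(r->left) + prefix(r->right);
}

// Postorder (LRV) gives the postfix form.
std::string postfix(const ExprNode* r)
{
    if (r == nullptr) return "";
    return postfix(r->left) + postfix(r->right) + r->symbol;
}
```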
Insertion into BST
Modify the search algorithm so that a pointer parentPtr trails locPtr down the tree, keeping track of the parent of each node being checked:
1. Initialize pointers locPtr = root, parentPtr = NULL.
2. While locPtr != NULL:
   a. parentPtr = locPtr
   b. If value < locPtr->data
         locPtr = locPtr->left
      Else if value > locPtr->data
         locPtr = locPtr->right
      Else
         value is already in the tree; return a found indicator.
3. Get a new node pointed to by newPtr, put the value in its data part, and set left and right to null.
4. If parentPtr = NULL   // empty tree
      Set root = newPtr.
   Else if value < parentPtr->data
      Set parentPtr->left = newPtr.
   Else
      Set parentPtr->right = newPtr.
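A sketch of the insertion steps in C++, using a standalone BinNode struct with int data. The name bstInsert is hypothetical, and returning false when the value is already present is one possible way to realize the "found indicator". Freeing the nodes is omitted for brevity.

```cpp
struct BinNode {
    int      data;
    BinNode* left;
    BinNode* right;
};

// Inserts value into the BST rooted at root.
// Returns false (and changes nothing) if value is already in the tree.
bool bstInsert(BinNode*& root, int value)
{
    BinNode* locPtr = root;
    BinNode* parentPtr = nullptr;          // trails locPtr down the tree
    while (locPtr != nullptr) {
        parentPtr = locPtr;
        if (value < locPtr->data)
            locPtr = locPtr->left;
        else if (value > locPtr->data)
            locPtr = locPtr->right;
        else
            return false;                  // already in the tree
    }
    BinNode* newPtr = new BinNode{value, nullptr, nullptr};
    if (parentPtr == nullptr)
        root = newPtr;                     // tree was empty
    else if (value < parentPtr->data)
        parentPtr->left = newPtr;
    else
        parentPtr->right = newPtr;
    return true;
}
```

Inserting 75, 60, 80, 58, 65, 92 in that order into an empty tree gives a tree with root 75, children 60 and 80, and 58, 65, 92 below them. A different insertion order can produce a different shape, including a badly unbalanced one.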
Deletion from BST
Case 1: A leaf -- delete node, reset link from parent to null
Case 2: 1 child -- delete node, reset link from parent to point to child
Case 3: 2 children:
1. Replace the node's value with that of its inorder successor X.
2. Delete X (which has 0 or 1 child).
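The three cases can be combined into one routine: case 3 is reduced to case 1 or 2 by copying the inorder successor's value into the node and then unlinking the successor instead. The sketch below is one way to do this, not necessarily the one intended by the slides. It assumes nodes were allocated with new; BinNode and bstRemove are hypothetical names.

```cpp
struct BinNode {
    int      data;
    BinNode* left;
    BinNode* right;
};

// Removes value from the BST rooted at root; returns false if it is absent.
bool bstRemove(BinNode*& root, int value)
{
    BinNode* parent = nullptr;
    BinNode* x = root;
    while (x != nullptr && x->data != value) {     // locate node and parent
        parent = x;
        x = (value < x->data) ? x->left : x->right;
    }
    if (x == nullptr) return false;

    if (x->left != nullptr && x->right != nullptr) {
        // Case 3: find the inorder successor (leftmost node of the right
        // subtree), copy its value up, and delete the successor instead.
        BinNode* succParent = x;
        BinNode* succ = x->right;
        while (succ->left != nullptr) {
            succParent = succ;
            succ = succ->left;
        }
        x->data = succ->data;
        parent = succParent;
        x = succ;
    }

    // Cases 1 and 2: x now has at most one child (possibly none).
    BinNode* child = (x->left != nullptr) ? x->left : x->right;
    if (parent == nullptr)
        root = child;
    else if (parent->left == x)
        parent->left = child;
    else
        parent->right = child;
    delete x;
    return true;
}
```

The inorder successor never has a left child, because it is the leftmost node of the right subtree, so the final unlinking step always falls under case 1 or case 2.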
Using Binary Trees: Coding
Fixed-length codes are expensive if there is a wide range of frequency of use among characters coded and decoded.
To increase the efficiency of codes, use shorter codes for more frequently occurring characters.
For example, 'E' in Morse code is '.' while 'Z' is '– – ..'
The objective is to minimize the expected length of the code for a character. By so doing, the number of bits that must be sent when transmitting encoded messages is minimized.
Variable-length coding schemes are also useful when compressing data because they reduce the number of bits that must be stored.
Using Binary Trees: Coding
Given a character set C1, C2, ... , Cn with associated weights w1, w2, ... , wn, where wi is a measure of the character Ci's frequency of occurrence.
If l1, l2, ... , ln are the lengths of the codes for characters C1, C2, ... , Cn, respectively, then the expected length of the code for any one of these characters is
expected length = w1 l1 + w2 l2 + ... + wn ln
For example, given A, B, C, D, and E with weights 0.2, 0.1, 0.1, 0.15 and 0.45, respectively, the corresponding Morse code is:
A  .-
B  -...
C  -.-.
D  -..
E  .
yielding an expected length of 2.1 for any character code from these 5.
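The figure 2.1 comes from 0.2·2 + 0.1·4 + 0.1·4 + 0.15·3 + 0.45·1. The weighted sum can be computed with a few lines of C++; the function name expectedCodeLength is hypothetical, and the codes are passed as strings so that each length is simply the string size.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Expected code length: the sum of w_i * l_i over all characters,
// where l_i is the length of the i-th code string.
double expectedCodeLength(const std::vector<double>& weights,
                          const std::vector<std::string>& codes)
{
    double total = 0.0;
    for (std::size_t i = 0; i < weights.size() && i < codes.size(); ++i)
        total += weights[i] * static_cast<double>(codes[i].size());
    return total;
}
```

Called with the weights {0.2, 0.1, 0.1, 0.15, 0.45} and the codes {".-", "-...", "-.-.", "-..", "."}, it returns 2.1 up to floating-point rounding.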
Using Binary Trees: Coding
Another desirable property of coding schemes is immediate decodability, i.e., no sequence of bits that represents a character is a prefix of a longer sequence for some other character. When a sequence of bits is received that is the code for a character, it can be decoded immediately (unambiguously).
The Morse code scheme is not immediately decodable because, for example, the code for E (.) is a prefix of the code for A (.-), and the code for D (-..) is a prefix of the code for B (-...).
Morse code uses a 'pause' to separate encoded characters.
What is needed, for minimal expected code length, is an immediately decodable, variable-length coding scheme.
Huffman Codes & Binary Trees
Developed by graduate student D. A. Huffman in 1952, the following algorithm yields a coding scheme that is immediately decodable and for which the expected code length is minimal:
1. Initialize a list of one-node binary trees containing weights w1, w2, ... , wn, one for each of the characters C1, C2, ... , Cn.
2. Do the following n - 1 times:
   a. Find two trees T' and T'' in this list with roots of minimal weight w' and w''.
   b. Replace these two trees with a binary tree whose root has weight w' + w'', and whose subtrees are T' and T'', and label the pointers to these subtrees 0 and 1, respectively.
3. The code for character Ci is the bit string labeling the path from root to leaf Ci in the final binary tree.
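One way to sketch this construction in C++ is shown below. It makes several assumptions not in the slides: the "list" of trees is a std::priority_queue ordered by weight (so the two minimal roots are simply popped), nodes live in a std::vector and refer to their children by index, and the names HuffNode, collectCodes, and huffmanCodes are hypothetical.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

struct HuffNode {
    double weight;
    char   symbol;   // meaningful only for leaves
    int    left;     // index of the subtree labeled 0, or -1 for a leaf
    int    right;    // index of the subtree labeled 1, or -1 for a leaf
};

// Step 3: record the 0/1 path from the root to every leaf.
void collectCodes(const std::vector<HuffNode>& nodes, int i,
                  const std::string& path, std::map<char, std::string>& codes)
{
    if (nodes[i].left < 0) {                 // leaf: path is its code
        codes[nodes[i].symbol] = path;
        return;
    }
    collectCodes(nodes, nodes[i].left, path + '0', codes);
    collectCodes(nodes, nodes[i].right, path + '1', codes);
}

std::map<char, std::string>
huffmanCodes(const std::vector<std::pair<char, double>>& chars)
{
    std::map<char, std::string> codes;
    if (chars.empty()) return codes;

    std::vector<HuffNode> nodes;
    using Entry = std::pair<double, int>;    // (root weight, node index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;

    // Step 1: one single-node tree per character.
    for (const auto& c : chars) {
        nodes.push_back({c.second, c.first, -1, -1});
        pq.push({c.second, static_cast<int>(nodes.size()) - 1});
    }
    // Step 2: merge the two trees of minimal weight, n - 1 times.
    while (pq.size() > 1) {
        Entry a = pq.top(); pq.pop();
        Entry b = pq.top(); pq.pop();
        nodes.push_back({a.first + b.first, '\0', a.second, b.second});
        pq.push({a.first + b.first, static_cast<int>(nodes.size()) - 1});
    }
    collectCodes(nodes, pq.top().second, "", codes);
    return codes;
}
```

For A, B, C, D, E with weights 0.2, 0.1, 0.1, 0.15, 0.45, the exact bit strings depend on how ties between equal weights are broken, but E always receives a one-bit code. Note that a single-character input yields an empty code string in this sketch.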
Huffman Codes & Binary Trees
Huffman codes are immediately decodable because each character is associated with a leaf AND there is a unique path from root to leaf.
Huffman codes provide minimum expected length because the binary tree is built from the bottom up (in a greedy fashion).
The lowest weighted characters are placed in the bottom of the tree first.
The more frequently occurring characters are folded into the top later.
Less frequently occurring characters have longer paths from root to leaf.
More frequently occurring characters have shorter paths from root to leaf.
Note that the actual codes may vary from construction to construction based upon whether a subtree was merged to become a left or right subtree.
Placement as left or right subtree does not affect correctness, just the code.
Left branch associated with 0
Right branch associated with 1
Huffman Codes & Binary Trees
Decoding algorithm: straightforward traversal of binary tree
1. Initialize pointer p to root of Huffman tree.
2. While the end of the message string has not been reached, do:
   a. Let x be the next bit in the string.
   b. If x = 0 then
         Set p equal to its left child pointer.
      Else
         Set p equal to its right child pointer.
   c. If p points to a leaf then
      i. Display the character associated with that leaf.
      ii. Reset p to the root of the Huffman tree.
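The decoding loop can be sketched as follows. The sketch assumes the message is given as a string of '0' and '1' characters, that it is a valid encoding, and that the tree has at least two leaves, so the root itself is never a leaf. Instead of displaying each character it appends it to the returned string. HuffNode and huffmanDecode are hypothetical names.

```cpp
#include <string>

struct HuffNode {
    char            symbol;   // meaningful only for leaves
    const HuffNode* left;     // branch labeled 0
    const HuffNode* right;    // branch labeled 1
};

// Decodes a string of '0'/'1' characters using the Huffman tree at root.
std::string huffmanDecode(const HuffNode* root, const std::string& bits)
{
    std::string message;
    const HuffNode* p = root;
    for (char x : bits) {
        p = (x == '0') ? p->left : p->right;              // follow one branch
        if (p->left == nullptr && p->right == nullptr) {  // reached a leaf
            message += p->symbol;
            p = root;                                     // start next character
        }
    }
    return message;
}
```

For a tree in which E = 0, B = 100, C = 101, D = 110, and A = 111, the bit string 1000110 decodes to BED. No separator is needed between characters because no code is a prefix of another.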