What we learnt
We learnt what forests, trees, binary trees and binary search trees are.
And some operations on a tree, i.e. insertion, deletion and traversal.
Our new topic: Optimal Binary Search Trees
• In this chapter there is no insertion and no deletion
• Today we'll need only one tree and its possible arrangements, so that we can pick the best one among them
• For this we need to know what binary search trees are
• We'll get a quick review of binary search trees
• Binary search trees are simple binary trees, with the only difference that they are sorted
• The root may contain any value
• The left subtree contains only values less than the root value
• The right subtree contains only values greater than the root value
• The left and right subtrees are themselves binary search trees
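The definition above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Node` class and an `is_bst` helper that are not part of the slides; it passes the allowed range of values down the tree.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type used only for this illustration.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_bst(node, low=None, high=None):
    """Return True if every key in the subtree lies strictly between low and high."""
    if node is None:
        return True
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key >= high:
        return False
    # Left keys must stay below node.key, right keys above it.
    return is_bst(node.left, low, node.key) and is_bst(node.right, node.key, high)


good = Node(10, Node(5), Node(15))
bad = Node(10, Node(5, None, Node(12)), Node(15))  # 12 sits in the left subtree of 10
print(is_bst(good), is_bst(bad))
```

The second tree fails because 12 is greater than the root but was placed in its left subtree, even though each parent/child pair looks sorted on its own.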
• A sorted list (array) can be searched by using binary search
• We divide the list in half and search
• And we divide it again and repeat the process
1 2 3 4 5 6 7 8 9 10
[Figure labels for the two halves: "less than 6" and "greater than 5"]
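The halving process can be sketched as follows. This is an illustrative version only; the function name `binary_search` and the convention of returning -1 for a missing value are assumptions, not something taken from the slides.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2      # middle of the remaining range
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1            # keep only the upper half
        else:
            high = mid - 1           # keep only the lower half
    return -1


numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(binary_search(numbers, 7))
print(binary_search(numbers, 11))
```

Each pass through the loop discards half of the remaining list, which is the same effect as moving one level down a balanced binary search tree.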
Comparing a BST with a sorted array
• To search a tree we have two methods
• 1. Itersearch (which is the iterative method)
• 2. search (which is a recursive function)
• Itersearch is similar to binary search
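The two methods named above could look roughly like this. The slides give only the names Itersearch and search, so the bodies below, and the hypothetical `Node` class, are an assumed sketch rather than the original code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type used only for this illustration.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def iter_search(root, key):
    """Iterative search: walk down from the root until the key or a NULL link is hit."""
    node = root
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node


def search(node, key):
    """Recursive search: same walk, expressed as a call on the chosen subtree."""
    if node is None or node.key == key:
        return node
    if key < node.key:
        return search(node.left, key)
    return search(node.right, key)


tree = Node(10, Node(5), Node(15))
print(iter_search(tree, 15).key, search(tree, 5).key, search(tree, 7))
```

Both functions return the matching node, or `None` when the search falls off the tree.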
Suppose we build a binary search tree on the sorted list (5, 10, 15)
[Figure: tree with root 10, left child 5 and right child 15]
• Although this tree is full, it may not be an optimal BST
• What if I never search for 10 but only for 15? I have to do 2 comparisons every time
So it is not optimal for my requirement
If you are given a set of numbers {5, 10, 15, 20, 25},
there are many binary search trees that can be formed
For example:
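Fig a: root 10; left child 5; right child 25; 20 is the left child of 25; 15 is the left child of 20
Fig b: root 10; left child 5; right child 20; 15 and 25 are the children of 20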
Give your opinion!
• So which tree will be the most optimal (desirable) for any search?
• Whatever your answer may be,
it is wrong!
Because we cannot decide until we know the probability with which each number is searched
Difference between Fig a and Fig b
Fig a
• This tree requires at most 4 comparisons
Fig b
• This tree requires at most 3 comparisons
If we consider the worst case, i.e. searching for 15, Fig b is more desirable
* Assuming each element has equal probability
* Assuming every search is a successful search
[Figure: Fig.a and Fig.b drawn side by side, with the levels labelled Compr 1 to Compr 4]
Fig a
• 1st comparison with 10
• 2nd with 25
• 3rd with 20
• 4th with 15
• Total 4 comparisons (to reach 15)
• Avg no. of comparisons = (1+2+2+3+4)/5 = 2.4
Fig b
• 1st comparison with 10
• 2nd with 20
• 3rd with 15
• Total 3 comparisons (to reach 15)
• Avg no. of comparisons = (1+2+2+3+3)/5 = 2.2
Hence for equal probability Fig.b is more desirable
What if the probabilities of the elements are different?
P(5) = 0.3 (prob of searching 5)
P(10) = 0.3 (prob of searching 10)
P(15) = 0.05 (prob of searching 15)
P(20) = 0.05 (prob of searching 20)
P(25) = 0.3 (prob of searching 25)
Fig a
• Avg no. of comparisons = 1.85
• Fig a has the lower cost
Fig b
• Avg no. of comparisons = 2.05
• Fig b has the higher cost
So the probability of searching a particular element does affect the cost
Now Fig a seems to be desirable
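The averages quoted for both cases can be reproduced with a few lines. This is only a sketch: the dictionaries below hold the number of comparisons per key as read off Fig a and Fig b, and the names `average_comparisons`, `equal` and `skewed` are made up for the illustration.

```python
# Comparisons needed to find each key, read off the two figures.
fig_a = {10: 1, 5: 2, 25: 2, 20: 3, 15: 4}
fig_b = {10: 1, 5: 2, 20: 2, 15: 3, 25: 3}

# Case 1: every key is equally likely. Case 2: the probabilities from the slides.
equal = {key: 0.2 for key in fig_a}
skewed = {5: 0.3, 10: 0.3, 15: 0.05, 20: 0.05, 25: 0.3}


def average_comparisons(comparisons, prob):
    """Expected number of comparisons for a successful search."""
    return sum(prob[key] * comparisons[key] for key in comparisons)


for name, tree in (("Fig a", fig_a), ("Fig b", fig_b)):
    print(name,
          round(average_comparisons(tree, equal), 2),
          round(average_comparisons(tree, skewed), 2))
```

With equal probabilities Fig b comes out lower (2.2 against 2.4); with the skewed probabilities Fig a comes out lower (1.85 against 2.05), which is the reversal described above.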
When dealing with OBST
• An optimal binary search tree is a binary search tree for which the nodes are arranged on levels such that the tree cost is minimum
• In each binary tree there are NULL links at the leaf nodes (and at any node with a missing child), and they are denoted by square nodes
• A tree with n nodes will have (n+1) NULL links
• The square nodes are called external nodes, because they are not a part of the tree
• The inner round nodes are called internal nodes
• Each time we search for an element which is not in the tree, the search ends at an external node
• Hence external nodes are also called failure nodes
• A tree with external nodes is called an extended binary tree
Path length affects the cost
• Internal path length: sum of the path lengths of all internal nodes
• External path length: sum of the path lengths of all external nodes
[Figure: Fig a drawn as an extended binary tree, with the levels labelled L=0 to L=4]
Internal path length I = 0+1+1+2+3 = 7
External path length E = 2+2+2+3+4+4 = 17
E = I + 2n, where n is the number of internal nodes (here 17 = 7 + 2*5)
Hence a tree with maximum E will also have maximum I
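The path lengths above can be recomputed from the tree itself. The sketch below assumes Fig a is encoded as nested tuples `(key, left, right)` with `None` standing for a NULL link (external node); the encoding and the name `path_lengths` are illustrative choices, not from the slides.

```python
def path_lengths(node, depth=0):
    """Return (internal path length, external path length, number of internal nodes)."""
    if node is None:
        # A NULL link is an external (square) node at this depth.
        return 0, depth, 0
    _key, left, right = node
    left_i, left_e, left_n = path_lengths(left, depth + 1)
    right_i, right_e, right_n = path_lengths(right, depth + 1)
    return depth + left_i + right_i, left_e + right_e, 1 + left_n + right_n


# Fig a: root 10, left child 5, right child 25, then 20 below 25 and 15 below 20.
fig_a = (10, (5, None, None), (25, (20, (15, None, None), None), None))

internal, external, n = path_lengths(fig_a)
print(internal, external, n)
print(external == internal + 2 * n)
```

For this tree the function gives I = 7 and E = 17 with n = 5, so the relation E = I + 2n holds.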
If the user searches for a particular key in the tree, 2 cases can occur:
1 – the key is found, so the corresponding weight 'p' is incremented;
2 – the key is not found, so the corresponding 'q' value is incremented.
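Total cost
• As we know, both successful and unsuccessful searches are possible
• Cost = $\sum_{i=1}^{n} p_i \,(\mathrm{depth}(k_i) + 1) + \sum_{i=0}^{n} q_i \,(\mathrm{depth}(d_i) + 1)$
• The first sum covers successful searches (keys $k_i$), the second covers unsuccessful searches (external nodes $d_i$)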
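Figure (a): k2 is the root; its children are k1 and k4; k4 has the children k3 and k5. The dummy nodes d0, d1 hang below k1; d2, d3 below k3; d4, d5 below k5.
Figure (b): k2 is the root; its children are k1 and k5; k4 is the left child of k5 and k3 is the left child of k4. The dummy nodes d0, d1 hang below k1; d2, d3 below k3; d4 below k4; d5 below k5.

i              0     1     2     3     4     5
p_i (k node)   -     0.15  0.10  0.05  0.10  0.20
q_i (d node)   0.05  0.10  0.05  0.05  0.05  0.10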
We can calculate the expected search cost node by node for Figure (a), using
Cost = Probability * (Depth + 1)

Node  Depth  Probability  Cost
k1    1      0.15         0.30
k2    0      0.10         0.10
k3    2      0.05         0.15
k4    1      0.10         0.20
k5    2      0.20         0.60
d0    2      0.05         0.15
d1    2      0.10         0.30
d2    3      0.05         0.20
d3    3      0.05         0.20
d4    3      0.05         0.20
d5    3      0.10         0.40
• And the total cost = (0.30 + 0.10 + 0.15 + 0.20 + 0.60 + 0.15 + 0.30 + 0.20 + 0.20 + 0.20 + 0.40) = 2.80 (Fig a)
• So Figure (a), the more balanced tree, costs 2.80; on the other hand, Figure (b) costs 2.75, and that tree is really optimal
• We can see that the height of (b) is more than that of (a), and that the key k5 has the greatest search probability of any key, yet the root of the OBST shown is k2 (the lowest expected cost of any BST with k5 at the root is 2.85)
[Figure (b): the optimal tree, with k2 at the root]
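The two totals can be checked by applying Cost = Probability * (Depth + 1) to every node. The sketch below is an assumption-laden illustration: the depths are read off Figures (a) and (b) as described in the text (with d1 at depth 2 in Figure (a), which is what its cost of 0.30 implies), and the name `expected_cost` is made up.

```python
# Search probabilities from the table: p for keys k1..k5, q for dummy nodes d0..d5.
p = {1: 0.15, 2: 0.10, 3: 0.05, 4: 0.10, 5: 0.20}
q = {0: 0.05, 1: 0.10, 2: 0.05, 3: 0.05, 4: 0.05, 5: 0.10}

# Depth of every node, read off the two figures (assumed reading of the drawings).
fig_a = {"k": {1: 1, 2: 0, 3: 2, 4: 1, 5: 2},
         "d": {0: 2, 1: 2, 2: 3, 3: 3, 4: 3, 5: 3}}
fig_b = {"k": {1: 1, 2: 0, 3: 3, 4: 2, 5: 1},
         "d": {0: 2, 1: 2, 2: 4, 3: 4, 4: 3, 5: 2}}


def expected_cost(depths):
    """Sum of probability * (depth + 1) over all keys and dummy nodes."""
    hit = sum(p[i] * (depths["k"][i] + 1) for i in p)    # successful searches
    miss = sum(q[i] * (depths["d"][i] + 1) for i in q)   # unsuccessful searches
    return hit + miss


print(round(expected_cost(fig_a), 2), round(expected_cost(fig_b), 2))
```

Under these depths the function reproduces 2.80 for Figure (a) and 2.75 for Figure (b).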
To find the OBST, our idea is to decide its root, and also the root of each subtree
• To help our discussion, we define: $E_{i,j}$ = expected time for searching among the keys $(k_i, k_{i+1}, \ldots, k_j;\ d_{i-1}, d_i, \ldots, d_j)$
• In our example the real nodes are numbered 1 to 5 and the dummy nodes 0 to 5
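Deciding Root of OBST
• $E_{i,j} = \min_{i \le r \le j} \{ E_{i,r-1} + E_{r+1,j} + w_{i,j} \}$
• Here r lies between i and j, and $w_{i,j} = \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l$ is the total probability of the keys and dummy nodes in the subtree
• Corollary:
• Let r be the value that minimizes $\{ E_{i,r-1} + E_{r+1,j} + w_{i,j} \}$
• Then the root of the OBST for keys $(k_i, k_{i+1}, \ldots, k_j;\ d_{i-1}, d_i, \ldots, d_j)$ should be set to $k_r$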
Computing E_{i,j}
Define a function Compute_E(i, j) as follows:
Compute_E(i, j) /* Finding E_{i,j} */
1. if (i == j+1) return q_j; /* expected time with only the dummy node d_j */
2. min = ∞;
3. for (r = i, i+1, …, j)
   {
   g = Compute_E(i, r-1) + Compute_E(r+1, j) + w_{i,j};
   if (g < min) min = g;
   }
4. return min;
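A direct Python transcription of Compute_E, run on the probabilities from the example table, might look like the sketch below. The helper `w`, the list layout (index 0 of `p` unused) and the use of `math.inf` as the initial minimum are assumptions made for the illustration.

```python
import math

# Probabilities from the example: p[1..5] for keys, q[0..5] for dummy nodes.
p = [0.0, 0.15, 0.10, 0.05, 0.10, 0.20]   # p[0] is a placeholder, never used
q = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]


def w(i, j):
    """Total probability of keys k_i..k_j and dummy nodes d_(i-1)..d_j."""
    return sum(p[i:j + 1]) + sum(q[i - 1:j + 1])


def compute_e(i, j):
    """Expected search cost of the best BST over k_i..k_j (plain recursion)."""
    if i == j + 1:
        return q[j]                 # empty key range: only the dummy node d_j
    best = math.inf
    for r in range(i, j + 1):       # try every key k_r as the root
        g = compute_e(i, r - 1) + compute_e(r + 1, j) + w(i, j)
        if g < best:
            best = g
    return best


print(round(compute_e(1, 5), 2))
```

For the five keys of the example this gives 2.75, the cost of Figure (b). The plain recursion recomputes the same subproblems many times, so it is only practical for very small inputs.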
Remarks
• A slight change in the algorithm allows us to get the root of each subtree, and thus the structure of the OBST (how?)
• The powerful technique of storing computed values is called Dynamic Programming
• Knuth observed a further property so that we can compute the OBST in O(n^2) time (search wiki for more information)
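One possible answer to the "how?" above is to remember, for every (i, j), which r gave the minimum, and to store each computed E value so that it is never recomputed. The sketch below does this with `functools.lru_cache`; it is an assumed illustration, not the original authors' code, and it does not include Knuth's O(n^2) refinement. The probabilities are written in hundredths so that the arithmetic is exact, and ties are broken toward the smaller r, as the strict `<` in the pseudocode would do.

```python
from functools import lru_cache

# Probabilities from the example, in hundredths (15 means 0.15).
P = [0, 15, 10, 5, 10, 20]      # P[1..5]; P[0] is a placeholder
Q = [5, 10, 5, 5, 5, 10]        # Q[0..5]


def optimal_bst(p, q):
    """Return (minimum cost, table of roots) for keys 1..n."""
    n = len(p) - 1
    root = {}

    def w(i, j):
        return sum(p[i:j + 1]) + sum(q[i - 1:j + 1])

    @lru_cache(maxsize=None)     # stores every computed E(i, j)
    def e(i, j):
        if i == j + 1:
            return q[j]
        best = None
        for r in range(i, j + 1):
            g = e(i, r - 1) + e(r + 1, j) + w(i, j)
            if best is None or g < best:
                best, root[(i, j)] = g, r   # remember the minimizing root
        return best

    return e(1, n), root


def structure(root, i, j):
    """Rebuild the tree as nested tuples (left, 'k<r>', right); leaves are 'd<j>'."""
    if i == j + 1:
        return f"d{j}"
    r = root[(i, j)]
    return (structure(root, i, r - 1), f"k{r}", structure(root, r + 1, j))


cost, root = optimal_bst(P, Q)
print(cost / 100, root[(1, 5)])
print(structure(root, 1, 5))
```

On the example data this yields a cost of 2.75 with k2 at the root, and the rebuilt tree has the shape of Figure (b). Because each of the O(n^2) subproblems is solved once and tries up to n roots, this version runs in O(n^3) time.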