Bushy Binary Search Tree from Ordered List

Upload: justin-jenkins

Post on 18-Jan-2018




Bushy Binary Search Tree from Ordered List


Behavior of the Algorithm Binary Search Tree

• Recall that tree_search is based closely on binary search. If we apply binary search to an ordered list and draw its comparison tree, then we see that binary search does exactly the same comparisons as tree_search will do if it is applied to this same tree.

• We already know from Section 7.4 that binary search performs O(log n) comparisons for a list of length n.

• This performance is excellent in comparison to other methods, since log n grows very slowly as n increases.


An example

• Suppose, as an example, that we apply binary search to the list of seven letters a, b, c, d, e, f, and g.

• The resulting tree is shown in part (a) of Figure 10.8. If tree_search is applied to this tree, it will do the same number of comparisons as binary search.
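
The correspondence between binary search and tree_search can be checked with a small sketch. It assumes a version of binary search that tests for equality at every probe, and it represents tree nodes as plain tuples; the function names `binary_search_probes`, `comparison_tree`, and `tree_search_probes` are made up for this illustration and are not taken from the slides.

```python
def binary_search_probes(items, target):
    """Return the keys that binary search compares against target, in order."""
    probes = []
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        probes.append(items[mid])
        if target == items[mid]:
            break
        if target < items[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return probes


def comparison_tree(items, low, high):
    """Build the comparison tree of the search above as (key, left, right) tuples."""
    if low > high:
        return None
    mid = (low + high) // 2
    return (items[mid],
            comparison_tree(items, low, mid - 1),
            comparison_tree(items, mid + 1, high))


def tree_search_probes(root, target):
    """Return the keys visited by an ordinary search of a binary search tree."""
    probes = []
    node = root
    while node is not None:
        key, left, right = node
        probes.append(key)
        if target == key:
            break
        node = left if target < key else right
    return probes


letters = list("abcdefg")
root = comparison_tree(letters, 0, len(letters) - 1)
for letter in letters:
    # Both searches probe exactly the same keys in the same order.
    print(letter, binary_search_probes(letters, letter), tree_search_probes(root, letter))
```

For every one of the seven letters the two probe sequences coincide, with d at the root of the tree.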


The Binary Search Tree


The Binary Search Tree

• It is quite possible, however, that the same letters may be built into a binary search tree of a quite different shape, such as any of those shown in the remaining parts of Figure 10.8.


The Binary Search Tree Class

• The tree shown as part (a) of Figure 10.8 is the best possible for searching.

• It is as “bushy” as possible: It has the smallest possible height for a given number of vertices.


The Binary Search Tree Class

• In part (c) of Figure 10.8, however, the tree has degenerated quite badly, so that a search for target c requires six comparisons.

• In parts (d) and (e), the tree reduces to a single chain.

• tree_search, when applied to such a chain, degenerates to sequential search.
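
The degeneration into a chain can be reproduced with ordinary binary search tree insertion. The sketch below is an illustration only: the `Node` class, the helper names, and the two insertion orders are assumptions, since the slides do not say which orders produce the trees of Figure 10.8.

```python
class Node:
    """A binary search tree node; None plays the role of NULL."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Ordinary (unbalanced) insertion; returns the root of the tree."""
    new_node = Node(key)
    if root is None:
        return new_node
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = new_node
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = new_node
                return root
            node = node.right


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def nodes_visited(root, target):
    """Count the nodes examined while searching for target."""
    count = 0
    node = root
    while node is not None:
        count += 1
        if target == node.key:
            break
        node = node.left if target < node.key else node.right
    return count


chain = build("abcdefg")   # keys arrive already sorted: every node has only a right child
bushy = build("dbfaceg")   # an order that happens to give the balanced tree
print(nodes_visited(chain, "g"), nodes_visited(bushy, "g"))
```

Searching the chain for g visits all seven nodes, exactly as sequential search would, while the bushy tree needs only three.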


Goal:

• Start with an ordered list and build its entries into a binary search tree that is nearly balanced (“as bushy as possible”).


Building A Binary Search Tree

• When the number of entries, n, is 31, for example, we wish to build the tree of Figure 10.12.


Building A Binary Search Tree

• In Figure 10.12 the entries are numbered in their natural order, that is, in inorder sequence, which is the order in which they will be received and built into the tree, since they are received in sorted order.

• We will also use this numbering to label the nodes of the tree.


An important property of the labels

• If you examine the diagram for a moment, you may notice an important property of the labels.

• The labels of the leaves are all odd numbers; that is, they are not divisible by 2.

• The labels of the nodes one level above the leaves are 2, 6, 10, 14, 18, 22, 26, and 30. These numbers are all double an odd number; that is, they are all even (divisible by 2 = 2^1), but are not divisible by 4.

• On the next level up, the labels are 4, 12, 20, and 28, numbers that are divisible by 4 = 2^2, but not by 8.

• Finally, the nodes just below the root are labeled 8 and 24 (divisible by 8 = 2^3), and the root itself is 16 (divisible by 16 = 2^4).

• The crucial observation is: If the nodes of a complete binary tree are labeled in inorder sequence, starting with 1, then each node is exactly as many levels above the leaves as the highest power of 2 that divides its label.
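
The observation can be verified for the 31-node tree with a short sketch. Here "the highest power of 2 that divides its label" is read as the exponent k of the largest 2^k dividing the label; the function names are hypothetical and chosen only for this illustration.

```python
def levels_above_leaves(label):
    """Exponent of the highest power of 2 that divides label (label >= 1)."""
    level = 0
    while label % 2 == 0:
        label //= 2
        level += 1
    return level


def inorder_levels(height):
    """Levels above the leaves, listed in inorder, for a complete tree of the given height."""
    if height == 0:
        return []
    subtree = inorder_levels(height - 1)
    # Left subtree, then the root (height - 1 levels above the leaves), then right subtree.
    return subtree + [height - 1] + subtree


labels = range(1, 32)  # the 31 labels of the complete tree with five levels
print([levels_above_leaves(label) for label in labels] == inorder_levels(5))
```

The list computed from divisibility alone matches the list obtained by walking the shape of the complete tree, so a node's level can be found from its label without knowing how many entries will follow.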


One more constraint

• Let us now put one more constraint on our problem: Let us suppose that we do not know in advance how many entries will be built into the tree.

• If the entries are coming from a file or a linked list, then this assumption is quite reasonable, since we may not have any convenient way to count the entries before receiving them.


One more constraint

• This assumption also has the advantage that it will stop us from worrying about the fact that, when the number of entries is not exactly one less than a power of 2, the resulting tree will not be complete and cannot be as symmetrical as the one in Figure 10.12.

• Instead, we shall design our algorithm as though it were completely symmetrical, and after receiving all entries we shall determine how to tidy up the tree.


Getting Started

• There is no doubt what to do with entry number 1 when it arrives.

• It will be placed in a leaf node whose left and right pointers should both be set to NULL.

• Node number 2 goes above node 1, as shown in Figure 10.13.

• Since node 2 links to node 1, we obviously must keep some way to remember where node 1 is until entry 2 arrives.

• Node 3 is again a leaf, but it is in the right subtree of node 2, so we must remember a pointer to node 2.


Keep a list of pointers

• Does this mean that we must keep a list of pointers to all nodes previously processed, to determine how to link in the next one?

• The answer is no, since when node 2 is added, all connections for node 1 are complete.

• Node 2 must be remembered until node 4 is added, to establish the left link from node 4, but then a pointer to node 2 is no longer needed.

• Similarly, node 4 must be remembered until node 8 has been processed. In Figure 10.13, colored arrows point to each node that must be remembered as the tree grows.


Keep a list of pointers

• It should now be clear that to establish future links, we need only remember pointers to one node on each level, the last node processed on that level.

• We keep these pointers in a List called last_node that will be quite small.

• For example, a tree with 20 levels (hence 20 entries in last_node) can accommodate 2^20 - 1 > 1,000,000 nodes.
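
One way the insertion step might look is sketched below. It is not the author's code: the function `build_insert`, the `Node` class, and the convention that position 0 of last_node holds None (so that leaves pick up an empty left link) are assumptions made for this illustration. Levels are counted from 1 at the leaves.

```python
class Node:
    """A tree node; None plays the role of NULL."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def build_insert(count, key, last_node):
    """Link in the entry numbered count (1-based), updating last_node in place."""
    level = 1                      # level 1 is the leaf level
    while count % 2 == 0:          # one level higher for each factor of 2 in count
        count //= 2
        level += 1
    node = Node(key)
    node.left = last_node[level - 1]     # last node one level down (None for a leaf)
    if len(last_node) <= level:
        last_node.append(node)           # first node seen on this level
    else:
        last_node[level] = node          # now the last node on this level
    if level + 1 < len(last_node):
        parent = last_node[level + 1]
        if parent.right is None:         # parent is still waiting for a right child
            parent.right = node
    return node


def inorder(node):
    """Keys of the subtree rooted at node, in inorder."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


last_node = [None]                       # position 0 is a placeholder for "below the leaves"
for count in range(1, 32):               # 31 entries, as in Figure 10.12
    build_insert(count, count, last_node)
root = last_node[-1]
print(root.key, [n.key for n in last_node[1:]])
```

With 31 entries every link is completed during insertion, so the highest entry of last_node is the root of a complete tree. For other values of n some right links are still missing at this point and must be filled in afterwards.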


Finishing the Task

• Finally, we must determine how to tie in any subtrees that may not yet be connected properly after all the nodes have been received.


Finishing the Task

• For example, if n = 21, we must connect the three components shown in Figure 10.13 into a single tree.


Finishing the Task

• Some nodes in the upper part of the tree may still have their right links set to NULL, even though further nodes have been inserted that now belong in their right subtrees.

• Any one of these nodes (a node, not a leaf, for which the right child is still NULL) will be one of the nodes in the list last_node.

• For n = 21, these will be nodes 16 and 20 (in positions 5 and 3 of last_node, respectively), as shown in Figure 10.14.


Determine the highest node in last_node that is not already in the left subtree

• Let high_node be a node in last_node whose right link is still NULL, and let lower_node be the node that should become its right child.

• The pointer lower_node can be determined as the highest node in last_node that is not already in the left subtree of high_node.

• To determine whether a node is in the left subtree, we need only compare its key with that of high_node.
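
Putting the pieces together, the whole method might be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: `build_insert` repeats the insertion step described earlier, `connect_trees` and `build_tree` are hypothetical names, and position 0 of last_node is assumed to hold None.

```python
class Node:
    """A tree node; None plays the role of NULL."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def build_insert(count, key, last_node):
    """Insert the entry numbered count at the level given by the power of 2 dividing count."""
    level = 1
    while count % 2 == 0:
        count //= 2
        level += 1
    node = Node(key)
    node.left = last_node[level - 1]
    if len(last_node) <= level:
        last_node.append(node)
    else:
        last_node[level] = node
    if level + 1 < len(last_node) and last_node[level + 1].right is None:
        last_node[level + 1].right = node


def connect_trees(last_node):
    """Fill in right links that are still None after all entries have arrived."""
    high_level = len(last_node) - 1
    while high_level > 2:                # nodes on levels 1 and 2 need no repair
        high_node = last_node[high_level]
        if high_node.right is not None:
            high_level -= 1              # already linked; look one level down
        else:
            # Skip nodes with smaller keys: they already lie in high_node's left subtree.
            low_level = high_level - 1
            while (last_node[low_level] is not None
                   and last_node[low_level].key < high_node.key):
                low_level -= 1
            high_node.right = last_node[low_level]
            high_level = low_level


def build_tree(sorted_keys):
    """Build a nearly balanced tree from keys supplied in increasing order."""
    last_node = [None]
    for count, key in enumerate(sorted_keys, start=1):
        build_insert(count, key, last_node)
    connect_trees(last_node)
    return last_node[-1]                 # the highest node is the root (None if no keys)


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


root = build_tree(range(1, 22))          # n = 21
print(root.key, root.right.key, root.right.right.key)
```

For n = 21 the repair step attaches node 20 as the right child of node 16 and node 21 as the right child of node 20, joining the three components into one tree. The number of entries is never needed in advance; only the running count and the short list last_node are kept.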


Evaluation

The algorithm of this section produces a binary search tree that is not always completely balanced.

• If the tree has 31 nodes, then it will be completely balanced, but if 32 nodes come in, then node 32 will become the root of the tree, and all 31 remaining nodes will be in its left subtree.

• In this case, the leaves are five steps away from the root.

• If the root were chosen optimally, then most of the leaves would be four steps from it, and only one would be five steps.

• Hence one comparison more than necessary will usually be done in the tree with 32 nodes.


Evaluation

One extra comparison in a binary search is not really a very high price, and it is easy to see that a tree produced by our method is never more than one level away from optimality.

There are sophisticated methods for building a binary search tree that is as balanced as possible, but much remains to recommend a simpler method, one that does not need to know in advance how many nodes are in the tree.


Random Search Trees and Optimality

To conclude this section, let us ask whether it is worthwhile, on average, to keep a binary search tree balanced or to rebalance it.

• If we assume that the keys have arrived in random order, then, on average, how many more comparisons are needed in a search of the resulting tree than would be needed in a completely balanced tree?


The average number of nodes visited


Evaluation

In other words, the average cost of not balancing a binary search tree is approximately 39 percent more comparisons.

In applications where optimality is important, this cost must be weighed against the extra cost of balancing the tree, or of maintaining it in balance.

Note especially that these latter tasks involve not only the cost of computer time, but the cost of the extra programming effort that will be required.
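
The formula behind the 39 percent figure did not survive in this transcript. Assuming it is the standard result that a successful search of a tree built from n keys in random order visits about 2 ln n nodes on average for large n, while a completely balanced tree needs about lg n, the ratio works out as follows:

```latex
\[
  2\ln n \;=\; (2\ln 2)\,\lg n \;\approx\; 1.39\,\lg n ,
  \qquad\text{so}\qquad
  \frac{2\ln n}{\lg n} - 1 \;=\; 2\ln 2 - 1 \;\approx\; 0.39 .
\]
```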