introduction to software engineeringorb.essex.ac.uk/ce/ce204/part3.pdf · traversing binary trees 1...

23/01/2016 CE204 Part 3 1

CE204

Data Structures and Algorithms

Part 3


Trees

The ADTs encountered so far have been linear; list and array implementations have been adequate. We now consider a non-linear type, the tree.

A tree is either empty or contains a data item and may have children that are themselves trees. A tree with no children is called a leaf.

We use the term node to refer to positions in a tree. A non-empty tree has precisely one node with no parent – this node is called the root.


Binary Trees 1

In a binary tree no node has more than two children. The children are distinguished as left and right; if a node has only one child, that child can be either a left child or a right child.

The primitive operations for binary trees are:

• createtree to create a new empty tree

• maketree, which takes three arguments (a data item and two trees) and creates a new tree with the data item at the root and the two trees as children

• isempty, to check if a tree is empty

• leftchild and rightchild, to return references to the children

• value, which returns the data item at the root


Binary Trees 2

The primitive operations allow only bottom-up building of trees; most applications will require the ability to add nodes to and remove nodes from trees. Although these tasks could be implemented using the primitive operations it would be very inefficient to do so. Hence an implementation of a binary tree class should normally provide extra methods in addition to the primitive operations.


Binary Trees 3

The natural representation for binary trees is to represent each node as an object containing a data value and references to the left and right children:

class BTNode<T>
{ T data;
  BTNode<T> left, right;

  BTNode(T o) // constructor for a leaf
  { data = o; left = right = null;
  }

  BTNode(T o, BTNode<T> l, BTNode<T> r)
  { data = o; left = l; right = r;
  }
}
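As a minimal sketch of how these node objects link together, the following builds a four-node tree directly from BTNode objects. BTNode is repeated so that the example compiles on its own; the NodeDemo class and its size and sample methods are hypothetical helpers introduced only for this illustration.

```java
// BTNode is repeated from the text so that this sketch compiles on its own.
class BTNode<T> {
    T data;
    BTNode<T> left, right;

    BTNode(T o) { data = o; left = right = null; }   // constructor for a leaf

    BTNode(T o, BTNode<T> l, BTNode<T> r) { data = o; left = l; right = r; }
}

class NodeDemo {
    // Hypothetical helper: counts the nodes reachable from n.
    static <T> int size(BTNode<T> n) {
        return n == null ? 0 : 1 + size(n.left) + size(n.right);
    }

    // Builds   2
    //         / \
    //        1   3
    //             \
    //              4
    static BTNode<Integer> sample() {
        BTNode<Integer> four = new BTNode<Integer>(4);
        BTNode<Integer> three = new BTNode<Integer>(3, null, four); // right child only
        BTNode<Integer> one = new BTNode<Integer>(1);
        return new BTNode<Integer>(2, one, three);
    }

    public static void main(String[] args) {
        BTNode<Integer> root = sample();
        System.out.println(size(root));             // number of nodes
        System.out.println(root.right.right.data);  // value two steps to the right
    }
}
```

A null reference stands for an empty child, which is why node 3 can have a right child without a left one.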


Binary Trees 4

We present on the following slides a binary tree class that simply supports the primitive operations. We shall provide constructors with zero and three arguments instead of createtree and maketree methods.

You will observe that a third constructor has been written – it makes the coding of other methods easier. In order to ensure that BTree objects always represent valid trees users must not be allowed to use this constructor; hence it is declared as private.

We assume that a TreeException class similar to the StackException class has been written.


Binary Trees 5

public class BTree<T>
{ private BTNode<T> root;

  public BTree()
  { root = null;
  }

  public BTree(T o, BTree<T> l, BTree<T> r)
  { root = new BTNode<T>(o, l.root, r.root);
  }

  private BTree(BTNode<T> n)
  { root = n;
  }

  // continued on next slide


Binary Trees 6

  // class BTree<T> (continued)

  public boolean isEmpty()
  { return root==null;
  }

  public T value()
  { if (root==null)
      throw new TreeException("value");
    else
      return root.data;
  }

  // continued on next slide


Binary Trees 7

  // class BTree<T> (continued)

  public BTree<T> leftChild()
  { if (root==null)
      throw new TreeException("leftChild");
    else
      return new BTree<T>(root.left);
  }

  public BTree<T> rightChild()
  { if (root==null)
      throw new TreeException("rightChild");
    else
      return new BTree<T>(root.right);
  }
}
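The following sketch gathers the class into one compilable file and exercises it. The TreeException definition is an assumption (the text only says it resembles StackException); it is written as an unchecked exception because the BTree methods throw it without a throws clause. The public modifiers are dropped so that everything fits in a single file, and BTreeDemo with its sample and emptyValueThrows methods is a hypothetical test harness.

```java
// Assumed definition: unchecked, since the BTree methods declare no throws clause.
class TreeException extends RuntimeException {
    TreeException(String operation) { super("invalid tree operation: " + operation); }
}

class BTNode<T> {
    T data;
    BTNode<T> left, right;
    BTNode(T o, BTNode<T> l, BTNode<T> r) { data = o; left = l; right = r; }
}

class BTree<T> {
    private BTNode<T> root;

    BTree() { root = null; }
    BTree(T o, BTree<T> l, BTree<T> r) { root = new BTNode<T>(o, l.root, r.root); }
    private BTree(BTNode<T> n) { root = n; }

    boolean isEmpty() { return root == null; }

    T value() {
        if (root == null) throw new TreeException("value");
        return root.data;
    }

    BTree<T> leftChild() {
        if (root == null) throw new TreeException("leftChild");
        return new BTree<T>(root.left);
    }

    BTree<T> rightChild() {
        if (root == null) throw new TreeException("rightChild");
        return new BTree<T>(root.right);
    }
}

class BTreeDemo {
    // Builds the tree a(b, c) bottom-up, as the primitive operations require.
    static BTree<String> sample() {
        BTree<String> empty = new BTree<String>();
        BTree<String> b = new BTree<String>("b", empty, empty);
        BTree<String> c = new BTree<String>("c", empty, empty);
        return new BTree<String>("a", b, c);
    }

    // Returns true if value() on an empty tree throws TreeException.
    static boolean emptyValueThrows() {
        try {
            new BTree<String>().value();
            return false;
        } catch (TreeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        BTree<String> t = sample();
        System.out.println(t.value());                          // root item
        System.out.println(t.leftChild().value());              // item of left child
        System.out.println(t.leftChild().leftChild().isEmpty()); // leaves have empty children
        System.out.println(emptyValueThrows());
    }
}
```

Note that a leaf is built by passing two empty trees to the three-argument constructor, since users cannot reach the private constructor.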


Traversing Binary Trees 1

For many applications it is necessary to “visit” each node in a binary tree; for example, we may want to print all of the items in the tree. There are three standard orders for traversing binary trees:

• pre-order: visit the root, then visit its left child in pre-order and then visit the right child in pre-order

• post-order: visit the left child of the root in post-order, then visit the right child in post-order, then finally visit the root

• in-order: visit the left child of the root using in-order, then visit the root, then visit the right child using in-order
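To make the difference between the three orders concrete, here is a small sketch that applies each of them to the same five-node tree and collects the visited items into a string. The TraversalDemo class, its nested Node class and the sample tree are hypothetical and are used only for this illustration.

```java
class TraversalDemo {
    // Minimal node type for this sketch only.
    static class Node {
        String data;
        Node left, right;
        Node(String d, Node l, Node r) { data = d; left = l; right = r; }
    }

    // Root first, then left subtree, then right subtree.
    static String preOrder(Node n) {
        return n == null ? "" : n.data + preOrder(n.left) + preOrder(n.right);
    }

    // Left subtree, then root, then right subtree.
    static String inOrder(Node n) {
        return n == null ? "" : inOrder(n.left) + n.data + inOrder(n.right);
    }

    // Left subtree, then right subtree, then root last.
    static String postOrder(Node n) {
        return n == null ? "" : postOrder(n.left) + postOrder(n.right) + n.data;
    }

    // Builds     A
    //           / \
    //          B   C
    //         / \
    //        D   E
    static Node sample() {
        Node d = new Node("D", null, null);
        Node e = new Node("E", null, null);
        Node b = new Node("B", d, e);
        Node c = new Node("C", null, null);
        return new Node("A", b, c);
    }

    public static void main(String[] args) {
        Node t = sample();
        System.out.println(preOrder(t));   // ABDEC
        System.out.println(inOrder(t));    // DBEAC
        System.out.println(postOrder(t));  // DEBCA
    }
}
```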



Here is a method to print the values in a binary tree using an in-order traversal:

static <T> void printInOrder(BTree<T> t)
{ if (!t.isEmpty())
  { printInOrder(t.leftChild());
    System.out.println(t.value());
    printInOrder(t.rightChild());
  }
}



It is easy to modify the printInOrder code to print using pre-order or post-order. We simply need to change the order of the lines. For example:

static <T> void printPreOrder(BTree<T> t)
{ if (!t.isEmpty())
  { System.out.println(t.value());
    printPreOrder(t.leftChild());
    printPreOrder(t.rightChild());
  }
}



We could add a toString method to our BTree class presenting the contents using in-order:

public String toString()
{ if (root==null)
    return "";
  else
  { String s1 = leftChild().toString();
    String s2 = rightChild().toString();
    return s1+" "+root.data+" "+s2;
  }
}



The toString method above is not particularly efficient: since the recursive calls need to be applied to BTree objects rather than BTNode objects, we had to use the leftChild and rightChild methods (which create new BTree objects) instead of the left and right references.

We could have avoided this unwanted creation of new objects by providing a toString method for the BTNode class, but then any change to the format of the string would require changes to methods in two classes. A better approach is to write a private support method which takes a BTNode object as an argument, as shown below.



// toString method for class BTree<T>

public String toString()
{ return getString(root);
}

private static <T> String getString(BTNode<T> n)
{ if (n==null)
    return "";
  else
  { String s1 = getString(n.left);
    String s2 = getString(n.right);
    return s1+" "+n.data+" "+s2;
  }
}


Binary Search Trees 1

Suppose that we wish to store a large set of numbers and want to be able to quickly determine whether a number is in the set, and also want to be able to add new members to the set. In order to facilitate efficient searching we would need to store the numbers in order.

If an array is used, searching could be performed quickly using a “binary chop” approach (look at the middle item to determine in which half of the array the number should be, then go to the middle of that half, etc.). However, the insertion of a new number would require shifting every larger item one place to the right, which in the worst case means moving the whole array.
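The trade-off can be sketched as follows, assuming the numbers occupy the first count slots of an int array that has spare capacity. The class name BinaryChop and both method signatures are hypothetical; the search is the standard binary search and the insertion is the shifting step of insertion sort.

```java
class BinaryChop {
    // Searches the sorted prefix a[0..count-1]; each step halves the range.
    static boolean contains(int[] a, int count, int key) {
        int lo = 0, hi = count - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;       // middle of the remaining range
            if (a[mid] == key) return true;
            if (key < a[mid]) hi = mid - 1;     // continue in the lower half
            else lo = mid + 1;                  // continue in the upper half
        }
        return false;
    }

    // Inserts key into the sorted prefix a[0..count-1], shifting larger items
    // one place right. Assumes count < a.length. Returns the new count.
    static int insert(int[] a, int count, int key) {
        int i = count;
        while (i > 0 && a[i - 1] > key) {
            a[i] = a[i - 1];                    // this shifting is the costly part
            i--;
        }
        a[i] = key;
        return count + 1;
    }

    public static void main(String[] args) {
        int[] a = new int[5];
        a[0] = 1; a[1] = 3; a[2] = 5; a[3] = 7;
        int count = insert(a, 4, 4);            // array becomes 1 3 4 5 7
        System.out.println(contains(a, count, 4));
        System.out.println(contains(a, count, 6));
    }
}
```

Searching examines only about log₂ n items, but a single insertion near the front moves almost every item.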



A linked list implementation will not permit efficient searching since there is no way of locating the middle element without traversing half of the list.

An alternative approach that can give both efficient searching and efficient insertion is to use a binary search tree. This is a binary tree in which, for every node, all of the values in the left child are less than the value at the node and all of the values in the right child are greater than the value at the node.

Note that any sub-tree of a binary search tree will itself be a binary search tree.
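The ordering condition applies to every value in a sub-tree, not just to a node's immediate children. A minimal sketch of a checker, assuming Integer items and the BTNode class (repeated here so the example compiles alone), makes this explicit by passing down the range in which all values of a sub-tree must lie. The BstCheck class and its methods are hypothetical.

```java
class BTNode<T> {
    T data;
    BTNode<T> left, right;
    BTNode(T o, BTNode<T> l, BTNode<T> r) { data = o; left = l; right = r; }
}

class BstCheck {
    static boolean isBST(BTNode<Integer> root) {
        return isBST(root, null, null);
    }

    // Every value below n must be strictly between lo and hi;
    // a null bound means "no limit on that side".
    private static boolean isBST(BTNode<Integer> n, Integer lo, Integer hi) {
        if (n == null) return true;                               // empty tree is a BST
        if (lo != null && n.data.compareTo(lo) <= 0) return false;
        if (hi != null && n.data.compareTo(hi) >= 0) return false;
        return isBST(n.left, lo, n.data)                          // left values < n.data
            && isBST(n.right, n.data, hi);                        // right values > n.data
    }

    static BTNode<Integer> leaf(int v) {
        return new BTNode<Integer>(v, null, null);
    }

    static BTNode<Integer> node(int v, BTNode<Integer> l, BTNode<Integer> r) {
        return new BTNode<Integer>(v, l, r);
    }

    public static void main(String[] args) {
        // 5(3(1,4), 8) satisfies the property.
        System.out.println(isBST(node(5, node(3, leaf(1), leaf(4)), leaf(8))));
        // 5(3(1,6), 8) does not: 6 is greater than its parent 3, but it
        // lies in the left sub-tree of 5.
        System.out.println(isBST(node(5, node(3, leaf(1), leaf(6)), leaf(8))));
    }
}
```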



We cannot write a fully generic implementation of binary search trees, since there are restrictions on the type of items that can be stored – they must be comparable.

Hence we shall first present an implementation of binary search trees of integers.

It would be possible to write our class as a subclass of the BTree class, but since the methods we need to write do not correspond to the primitive operations for that class there is no advantage in doing so. We will, however, make use of the BTNode class seen earlier.



public class BST
{ private BTNode<Integer> root;

  public BST()
  { root = null;
  }

  public boolean find(Integer i) {} // body needed

  public boolean insert(Integer i) {} // body needed

  public boolean delete(Integer i) {} // body needed
}



The find method will start at the root and search downwards through the tree until the number to be found is located or an empty child is reached; at each step the BST ordering property will be used to decide whether to move to the left or right child.

Binary search trees cannot contain duplicate entries so if an attempt is made to add a number that is already present nothing will be added; hence the insert method returns a boolean result to indicate whether the insertion was successful.

The delete method also returns a boolean result, indicating whether the item was present and hence deleted.



// find method for BST class

public boolean find(Integer i)
{ BTNode<Integer> n = root;
  boolean found = false;
  while (n!=null && !found)
  { int comp = i.compareTo(n.data);
    if (comp==0)
      found = true;
    else if (comp<0) // i<n.data
      n = n.left;
    else
      n = n.right;
  }
  return found;
}



// insert method for BST class

public boolean insert(Integer i)
{ BTNode<Integer> parent = null, child = root;
  boolean goneLeft = false;
  while (child!=null && i.compareTo(child.data)!=0)
  { parent = child;
    if (i.compareTo(child.data)<0)
    { child = child.left;
      goneLeft = true;
    }
    else
    { child = child.right;
      goneLeft = false;
    }
  }
  // method continued on next slide



  // insert method for BST class continued

  if (child!=null)
    return false; // number already present
  else
  { BTNode<Integer> leaf = new BTNode<Integer>(i);
    if (parent==null) // tree was empty
      root = leaf;
    else if (goneLeft)
      parent.left = leaf;
    else
      parent.right = leaf;
    return true;
  }
}



Deleting an item from a binary search tree is simple if the item happens to be positioned at a leaf. Otherwise we have to consider what to do about the gap that would be left by removing the item, since it is not permissible to leave a node without a value.

If the node containing the item to be deleted has only one child we can replace the reference from the parent to the node with a reference to the node’s child, preserving the binary search property.



If the node containing the value to be removed has two children we must place in the node a value from one of these children. In order to preserve the binary search property this must be either the largest value in the left child or the smallest value in the right child. This value must of course be removed from its original position, so it might appear that in some cases it could be necessary to repeat the whole procedure several times. This is not in fact the case, since the node holding the largest or smallest value in a sub-tree can have at most one child.

The coding of the delete method is left as an exercise.
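For readers who want to check their own attempt, the following is one possible sketch of delete, not the course's official solution. It follows the cases described above and, for a node with two children, arbitrarily chooses the smallest value in the right child as the replacement. The find and insert methods are condensed versions of those given earlier; the inOrder method and the DeleteDemo class are hypothetical additions for testing.

```java
class BTNode<T> {
    T data;
    BTNode<T> left, right;
    BTNode(T o) { data = o; }
}

class BST {
    private BTNode<Integer> root;

    boolean find(Integer i) {
        BTNode<Integer> n = root;
        while (n != null) {
            int comp = i.compareTo(n.data);
            if (comp == 0) return true;
            n = comp < 0 ? n.left : n.right;
        }
        return false;
    }

    boolean insert(Integer i) {
        BTNode<Integer> parent = null, child = root;
        boolean goneLeft = false;
        while (child != null) {
            int comp = i.compareTo(child.data);
            if (comp == 0) return false;            // number already present
            parent = child;
            goneLeft = comp < 0;
            child = goneLeft ? child.left : child.right;
        }
        BTNode<Integer> leaf = new BTNode<Integer>(i);
        if (parent == null) root = leaf;            // tree was empty
        else if (goneLeft) parent.left = leaf;
        else parent.right = leaf;
        return true;
    }

    boolean delete(Integer i) {
        BTNode<Integer> parent = null, node = root;
        while (node != null && i.compareTo(node.data) != 0) {
            parent = node;
            node = i.compareTo(node.data) < 0 ? node.left : node.right;
        }
        if (node == null) return false;             // item not present

        if (node.left != null && node.right != null) {
            // Two children: find the smallest value in the right child,
            // copy it up, then remove the node that held it instead.
            BTNode<Integer> succParent = node, succ = node.right;
            while (succ.left != null) { succParent = succ; succ = succ.left; }
            node.data = succ.data;
            parent = succParent;
            node = succ;                            // has no left child
        }

        // node now has at most one child: splice it out of the tree.
        BTNode<Integer> child = (node.left != null) ? node.left : node.right;
        if (parent == null) root = child;
        else if (parent.left == node) parent.left = child;
        else parent.right = child;
        return true;
    }

    // Hypothetical helper: the items in increasing order, space-separated.
    String inOrder() { return inOrder(root).trim(); }

    private static String inOrder(BTNode<Integer> n) {
        return n == null ? "" : inOrder(n.left) + n.data + " " + inOrder(n.right);
    }
}

class DeleteDemo {
    public static void main(String[] args) {
        BST t = new BST();
        for (int k : new int[]{50, 30, 70, 20, 40, 60, 80}) t.insert(k);
        t.delete(20);                    // a leaf
        t.delete(30);                    // one child (40)
        t.delete(50);                    // two children; 60 moves up to the root
        System.out.println(t.inOrder()); // 40 60 70 80
    }
}
```

Using the largest value in the left child instead would be equally valid; the choice only affects the shape of the resulting tree.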



It is not difficult to see that the times taken for insertion, deletion and searching are approximately proportional to the depth of the tree. If a tree containing n items is well balanced its depth will be approximately log₂ n, but in the worst case the depth of such a tree can be as large as n. The average depth, if the items are inserted in a random order, can be shown to be about 2.5 log₂ n.

[ Hence the average times for insertion, deletion and searching are O(log n) but the worst case times are O(n). (The meaning of this will be explained in part 4.) ]
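The effect of insertion order on depth can be observed with a small experiment. The sketch below, which uses a hypothetical DepthDemo class with its own minimal int node type, inserts the keys 1 to 1000 first in sorted order and then in a shuffled order, and reports the depth of each resulting tree (counted in nodes, so an empty tree has depth 0).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class DepthDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int k) { key = k; }
    }

    // Iterative BST insertion (duplicates are ignored); returns the root.
    static Node insert(Node root, int key) {
        Node leaf = new Node(key);
        if (root == null) return leaf;
        Node n = root;
        while (true) {
            if (key < n.key) {
                if (n.left == null) { n.left = leaf; break; }
                n = n.left;
            } else if (key > n.key) {
                if (n.right == null) { n.right = leaf; break; }
                n = n.right;
            } else {
                break;                               // already present
            }
        }
        return root;
    }

    // Number of nodes on the longest path from n down to a leaf.
    static int depth(Node n) {
        return n == null ? 0 : 1 + Math.max(depth(n.left), depth(n.right));
    }

    static int depthAfterInserting(List<Integer> keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return depth(root);
    }

    static List<Integer> range(int n) {
        List<Integer> keys = new ArrayList<Integer>();
        for (int i = 1; i <= n; i++) keys.add(i);
        return keys;
    }

    public static void main(String[] args) {
        List<Integer> keys = range(1000);
        // Sorted input: every new key goes to the right, giving a chain of depth n.
        System.out.println("sorted order:   depth " + depthAfterInserting(keys));
        Collections.shuffle(keys, new Random(1));    // fixed seed for a repeatable run
        System.out.println("shuffled order: depth " + depthAfterInserting(keys));
    }
}
```

With sorted input the tree degenerates into a linked list, which is exactly the worst case described above.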


Generic Binary Search Trees 1

If we wish to write a generic version of our binary search tree class we need to use compareTo to compare elements. We therefore cannot provide a generic version of the form BST<T>: the type T could be instantiated to a type that does not have a compareTo method, so the compiler would not accept the use of compareTo.

To allow items other than integers to be stored in our class we could replace all occurrences of Integer by Comparable but users would then be able to add items of different types to the same tree and exceptions could be thrown at runtime.
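The runtime failure can be reproduced in a few lines. In this sketch (the class and method names are hypothetical) two values of different types are held in raw Comparable variables, which is what a tree storing Comparable items would do. The code compiles, but the comparison fails when it runs.

```java
class RawComparableDemo {
    // Returns true if comparing an Integer with a String through the raw
    // Comparable type throws ClassCastException at run time.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean mixedCompareFails() {
        Comparable a = Integer.valueOf(1);
        Comparable b = "one";
        try {
            a.compareTo(b);          // accepted by the compiler
            return false;
        } catch (ClassCastException e) {
            return true;             // Integer.compareTo cannot accept a String
        }
    }

    public static void main(String[] args) {
        System.out.println(mixedCompareFails());
    }
}
```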



To ensure a type-safe implementation of binary search trees permitting different types of objects to be stored, we need to be able to write a generic version which restricts the type parameter T so that it has to implement the Comparable interface. To do this we need to use a bounded type parameter.

A bounded type parameter of the form T extends X allows the type T to be instantiated only to subclasses of X (or classes that implement X, if X is an interface). Note that the extends X notation is used only when the identifier T first occurs, i.e. after the class name in the class declaration, or before the return type of a generic method.
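As a small illustration of the notation on a generic method, the sketch below declares the bound before the return type and then uses T without the bound everywhere else. The BoundedDemo class and the larger method are hypothetical names chosen for this example.

```java
class BoundedDemo {
    // The bound appears once, where T is introduced; the parameter and
    // return types then refer to plain T.
    static <T extends Comparable<T>> T larger(T a, T b) {
        return a.compareTo(b) >= 0 ? a : b;   // compareTo is guaranteed to exist
    }

    public static void main(String[] args) {
        System.out.println(larger(3, 7));              // T is Integer
        System.out.println(larger("pear", "apple"));   // T is String
        // larger(new Object(), new Object());         // rejected at compile time:
        //                                             // Object is not Comparable
    }
}
```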



The structure of the generic version of the binary search tree class should hence be as follows:

public class BST<T extends Comparable<T>>
{ private BTNode<T> root;

  public BST() {}

  public boolean find(T i) {………}

  public boolean insert(T i) {………}

  public boolean delete(T i) {………}
}

In a generic class it is preferable to use the generic version of Comparable which expects a compareTo method with an argument of type T. Although T extends Comparable would be acceptable, the compiler might generate warning messages.
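One way the find and insert bodies might be filled in is sketched below; it simply substitutes T for Integer in the earlier methods and should be read as an illustration rather than the definitive class. The delete method is omitted, the public modifiers are dropped so the sketch fits in one file, and GenericBstDemo is a hypothetical test class.

```java
class BTNode<T> {
    T data;
    BTNode<T> left, right;
    BTNode(T o) { data = o; }
}

class BST<T extends Comparable<T>> {
    private BTNode<T> root;

    boolean find(T i) {
        BTNode<T> n = root;
        while (n != null) {
            int comp = i.compareTo(n.data);   // allowed because of the bound on T
            if (comp == 0) return true;
            n = comp < 0 ? n.left : n.right;
        }
        return false;
    }

    boolean insert(T i) {
        BTNode<T> parent = null, child = root;
        boolean goneLeft = false;
        while (child != null) {
            int comp = i.compareTo(child.data);
            if (comp == 0) return false;      // item already present
            parent = child;
            goneLeft = comp < 0;
            child = goneLeft ? child.left : child.right;
        }
        BTNode<T> leaf = new BTNode<T>(i);
        if (parent == null) root = leaf;      // tree was empty
        else if (goneLeft) parent.left = leaf;
        else parent.right = leaf;
        return true;
    }
}

class GenericBstDemo {
    public static void main(String[] args) {
        BST<String> words = new BST<String>();
        words.insert("m");
        words.insert("c");
        words.insert("x");
        System.out.println(words.insert("c"));   // false: duplicate
        System.out.println(words.find("x"));     // true
        System.out.println(words.find("q"));     // false
        // words.insert(42);                     // rejected at compile time
    }
}
```

Because the element type is fixed when the tree is declared, the mixed-type insertion that caused a runtime exception with raw Comparable is now a compile-time error.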


Implementing Non-Binary Trees 1

Our binary tree nodes contained references to the left and right children. This is not appropriate for trees where nodes can have an arbitrary number of children. Several alternative approaches are possible.

If there is a known maximum number of children per node we could store the children of a node in an array; this makes inefficient use of memory if most nodes have fewer than the maximum number of children.

We could instead store a list of children using any of the classes that implement the List interface.
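A minimal sketch of the list-based approach, assuming java.util.ArrayList as the List implementation, might look like this. The class name ListTreeNode and its addChild and size methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ListTreeNode<T> {
    T data;
    List<ListTreeNode<T>> children = new ArrayList<ListTreeNode<T>>(); // any number of children

    ListTreeNode(T d) { data = d; }

    // Appends a new child holding d and returns it, so calls can be chained.
    ListTreeNode<T> addChild(T d) {
        ListTreeNode<T> child = new ListTreeNode<T>(d);
        children.add(child);
        return child;
    }

    // Number of nodes in the tree rooted at this node.
    int size() {
        int total = 1;
        for (ListTreeNode<T> c : children) total += c.size();
        return total;
    }
}

class ListTreeDemo {
    static ListTreeNode<String> sample() {
        ListTreeNode<String> root = new ListTreeNode<String>("root");
        ListTreeNode<String> first = root.addChild("first");
        root.addChild("second");
        root.addChild("third");
        first.addChild("grandchild");
        return root;
    }

    public static void main(String[] args) {
        ListTreeNode<String> root = sample();
        System.out.println(root.children.size());  // children of the root
        System.out.println(root.size());           // nodes in the whole tree
    }
}
```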



Using a linked list directly would require two types of object, tree nodes that contain a value and a reference to a list node, and list nodes that contain a reference to a tree node and a reference to the next list node. We can make more efficient use of memory by combining the two objects into one and produce a TreeNode class that contains a value, a reference to the leftmost child (the first item of the list of children), and a reference to the next item in the list of which this node is a member (i.e. the next child of this node’s parent). This is known as a left-child right-sibling representation.
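The combined node can be sketched as follows. The field names firstChild and nextSibling, the addChild and preOrder methods and the LcrsDemo class are all assumptions made for this illustration; only the class name TreeNode and the idea of the two references come from the text.

```java
class TreeNode<T> {
    T value;
    TreeNode<T> firstChild;    // leftmost child, or null if this node is a leaf
    TreeNode<T> nextSibling;   // next child of this node's parent, or null

    TreeNode(T v) { value = v; }

    // Adds a new rightmost child holding v and returns it.
    TreeNode<T> addChild(T v) {
        TreeNode<T> child = new TreeNode<T>(v);
        if (firstChild == null) {
            firstChild = child;
        } else {
            TreeNode<T> last = firstChild;
            while (last.nextSibling != null) last = last.nextSibling;
            last.nextSibling = child;            // append to the sibling chain
        }
        return child;
    }

    // Pre-order listing: this node, then each child's sub-tree from left to right.
    String preOrder() {
        StringBuilder sb = new StringBuilder();
        preOrder(sb);
        return sb.toString().trim();
    }

    private void preOrder(StringBuilder sb) {
        sb.append(value).append(' ');
        for (TreeNode<T> c = firstChild; c != null; c = c.nextSibling) c.preOrder(sb);
    }
}

class LcrsDemo {
    // Builds A with children B, C, D, where B has children E and F.
    static TreeNode<String> sample() {
        TreeNode<String> a = new TreeNode<String>("A");
        TreeNode<String> b = a.addChild("B");
        a.addChild("C");
        a.addChild("D");
        b.addChild("E");
        b.addChild("F");
        return a;
    }

    public static void main(String[] args) {
        TreeNode<String> a = sample();
        System.out.println(a.preOrder());                    // A B E F C D
        System.out.println(a.firstChild.nextSibling.value);  // C, the second child of A
    }
}
```

Each node carries exactly two references regardless of how many children it has, at the cost of a walk along the sibling chain to reach a particular child.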



Consider the following tree.

[Figure: a tree with root Homer. Second level: Bart, Lisa, Maggie. Third level: Bill, Ted.]



The left-child right-sibling representation for the tree is as shown (with the diagonal arrows showing left-child references and the horizontal arrows showing right-sibling references).

[Figure: the same tree (Homer; Bart, Lisa, Maggie; Bill, Ted) redrawn with left-child and right-sibling arrows.]
