TRANSCRIPT
CP2530: Algorithms and Data Structures
ANALYSIS OF ALGORITHMS
Analysis is more reliable than experimentation. Testing will reveal behaviour for some inputs, while analysis tells us about the algorithm's behaviour for all inputs.
There are many different solutions to the same problem. Analysis can help us choose among different solutions.
Performance of a program can be predicted before actual implementation. Analysis gives us a better understanding of where the "slow" and "fast" parts are.
MOTIVATION
Recall Fibonacci sequence. It can be recursively defined as:
f(x) = 1                      if x = 1 or x = 2
f(x) = f(x - 1) + f(x - 2)    otherwise
Here we have two base cases, when argument x is either 1 or 2, i.e. f( 1 ) = 1 and f( 2 ) = 1, reflecting the fact that the first two members of the sequence are 1, 1.
FIBONACCI
The original formula gives rise to natural recursive implementation :
0 int f(int n) {
1     if(n == 1 || n == 2)
2         return 1;
3     else
4         return f(n - 1) + f(n - 2);
5 }
EFFICIENCY
Basic Question : How much time would the recursive algorithm take to compute the nth member of the sequence?
But how to measure time? In seconds? But then the answer changes every time Intel comes out with a faster processor.
To get a rough approximation, we measure time in terms of lines of ( pseudo ) code.
EFFICIENCY
Line 1 is always executed in the code on slide 4. Depending on the evaluation of line 1 either line 2 or line 4 is executed.
Therefore, the time required to compute the nth Fibonacci number in terms of lines of code is :
time(n) = 2 + time(n - 1) + time(n - 2)
The equation above is called a recurrence relation, and we'll see how to solve such relations shortly.
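To see how quickly this recurrence grows, we can tabulate it directly. A minimal sketch ( the class name TimeRecurrence is illustrative, and the base cost time(1) = time(2) = 2 is an assumption counting lines 1 and 2 of the code ) :

```java
class TimeRecurrence {
    // Tabulates time(n) = 2 + time(n - 1) + time(n - 2),
    // with time(1) = time(2) = 2 (lines 1 and 2 executed in the base case)
    static long time(int n) {
        long[] t = new long[Math.max(n + 1, 3)];
        t[1] = t[2] = 2;
        for (int i = 3; i <= n; i++)
            t[i] = 2 + t[i - 1] + t[i - 2];
        return t[n];
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 10; n++)
            System.out.println("time(" + n + ") = " + time(n));
    }
}
```

The values grow like the Fibonacci numbers themselves - exponentially.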
EFFICIENCY
Recursive implementation of Fibonacci numbers is natural, but very inefficient. Here is why ( intuitively ) :
f(5)
├── f(4)
│   ├── f(3)
│   │   ├── f(2)
│   │   └── f(1)
│   └── f(2)
└── f(3)
    ├── f(2)
    └── f(1)

The same numbers are recomputed ! Try running with n = 45 and wait ...
FIBONACCI : SECOND APPROACH
Recursive algorithm is slow because it recomputes the same numbers over and over again.
Our second algorithm stores computed numbers in an array :
0 int f(int n) {
1     int[] a = new int[n];
2     a[0] = a[1] = 1;      // assumes n >= 2
3     for(int i = 2; i < n; i++)
4         a[i] = a[i - 1] + a[i - 2];
5     return a[n - 1];
6 }
Lines 1, 2 and 5 are executed unconditionally ( 3 times ). Line 3 is executed n - 1 times and line 4 is executed n - 2 times. So
time(n) = 3 + n - 1 + n - 2 = 2n
For n = 45, it takes 90 steps, roughly 25 million times faster than recursive implementation!
SPACE COMPLEXITY
Efficiency ( running time ) is not our only concern or the only thing that we can analyze mathematically.
If a program takes a lot of time ( reasonably ), we can still run it and just wait longer for a result.
However, if a program takes a lot of memory ( space ), we may not be able to run it at all.
Each call of the recursive algorithm takes a constant amount of space : some for local variables as well as the return address.
SPACE ANALYSIS
f(n)
├── f(n-1)
│   ├── f(n-2)
│   └── f(n-3)
└── f(n-2)
    ├── f(n-3)
    └── f(n-4)

( The leftmost path continues down through f(3), f(2) and f(1). )
The length of any such path is at most n, so the space complexity is again some constant factor times n.
The iterative algorithm uses roughly the same amount of space - the size of the array ( n ). Since each step through the loop uses only the two previous values, we can improve the space complexity :
SPACE ANALYSIS
0 int f(int n) {
1     int p = 1;
2     int t = 0, c = 0;
3     for(int i = 1; i <= n; i++) {
4         c = p + t;
5         p = t;
6         t = c;
7     }
8     return c;
9 }
SPACE ANALYSIS
Lines 1, 2 and 8 are executed unconditionally ( 3 times ). Line 3 is executed n + 1 times, lines 4, 5 and 6 are executed n times. So
time(n) = 3 + ( n + 1 ) + 3n = 4n + 4
Because of the swapping, this algorithm is slightly slower than the array-based algorithm, but it uses much less space.
GROWTH OF FUNCTIONS
Quite often we cannot predict the running time of an algorithm exactly. Consider the following one, computing the maximum value in an array :
0 int max(int[] a) {
1     int m = a[0];
2     for(int i = 1; i < a.length; i++) {
3         if(a[i] > m)
4             m = a[i];
5     }
6     return m;
7 }
GROWTH OF FUNCTIONS
Lines 1 and 6 are executed unconditionally ( 2 times ). Line 2 is executed n ( a.length ) times, line 3 is executed n - 1 times. But we can’t tell how many times line 4 will be executed. So
time(n) = 2 + n + n - 1 + A = 2n + 1 + A
In the expression above we know everything except the quantity A, which is the number of times we must change the value for the current maximum.
GROWTH OF FUNCTIONS
Minimum value of A ( best case ) : in function max, the best case is when the array is sorted in descending order, i.e. step 4 is never executed and the running time is 2n + 1.

Maximum value of A ( worst case ) : in function max, the worst case is when the array is sorted in ascending order, i.e. step 4 is executed n - 1 times and the running time is 3n.
The analysis usually consists of finding these two extreme values.
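The two extreme values of A can be checked by counting directly. A small sketch ( the name countA is illustrative ) :

```java
class CountA {
    // Runs the max algorithm and counts how many times line 4
    // (replacing the current maximum) executes - the quantity A
    static int countA(int[] a) {
        int m = a[0], count = 0;
        for (int i = 1; i < a.length; i++) {
            if (a[i] > m) {
                m = a[i];
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countA(new int[]{5, 4, 3, 2, 1})); // descending: A = 0
        System.out.println(countA(new int[]{1, 2, 3, 4, 5})); // ascending: A = n - 1 = 4
    }
}
```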
GROWTH OF FUNCTIONS
It is the rate of growth that really interests us : how the running time of an algorithm increases with the size of an input in the limit.
Since we are interested in the asymptotic efficiency of algorithms ( when the input size is large ), when we define running times we simply ignore constants and lower-degree terms.
For example, if T(n) = an² + bn + c, where a, b and c are nonnegative constants, we'll write Θ(n²) or O(n²).
Θ-NOTATION
What follows is fancy mathematical lingo to express the fact that if my code takes 3n⁴ + 2n² - 5 steps, I don't care about lower terms and constants - it takes roughly n⁴ steps. That's it!
For a given function g(n) we denote by Θ(g(n)) the set of functions :
Θ(g(n)) = { f(n) : ∃c₁, c₂, n₀ > 0 such that 0 ≤ c₁g(n) ≤ f(n) ≤ c₂g(n) ∀n ≥ n₀ }
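The definition can be checked empirically for a concrete pair of functions. A sketch, assuming f(n) = 2n² + 3n - 2 and g(n) = n² with witnesses c₁ = 1, c₂ = 5, n₀ = 1 ( all chosen for this illustration ) :

```java
class ThetaWitness {
    static long f(long n) { return 2 * n * n + 3 * n - 2; }
    static long g(long n) { return n * n; }

    // Checks 0 <= c1*g(n) <= f(n) <= c2*g(n) for every n in [n0, limit]
    static boolean holds(long c1, long c2, long n0, long limit) {
        for (long n = n0; n <= limit; n++) {
            long lower = c1 * g(n), upper = c2 * g(n);
            if (!(0 <= lower && lower <= f(n) && f(n) <= upper))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holds(1, 5, 1, 1_000_000)); // true: the witnesses work
        System.out.println(holds(3, 5, 1, 1_000_000)); // false: 3n^2 exceeds f(n) from n = 3
    }
}
```

A finite check is evidence, not a proof; the inequality 0 ≤ n² ≤ 2n² + 3n - 2 ≤ 5n² for n ≥ 1 can also be verified algebraically.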
Any function f(n) in Θ(g(n)) is bounded by c₁g(n) and c₂g(n) for all n ≥ n₀. So we write f(n) ∈ Θ(g(n)) or f(n) = Θ(g(n)).
Θ-NOTATION
Graphically, Θ-notation can be represented as : ( Figure: running time plotted against input size n; beyond n₀ the curve f(n) stays between c₁g(n) from below and c₂g(n) from above. )
O-NOTATION

For a given function g(n) we denote by O(g(n)) the set of functions :
O(g(n)) = { f(n) : ∃c, n₀ > 0 such that 0 ≤ f(n) ≤ cg(n) ∀n ≥ n₀ }
Θ-notation asymptotically bounds a function from above and below. O-notation stands for asymptotic upper bound ( worst case running time ).
Again we write, for example, 2n² + 3n - 2 = O(n²), but the real meaning is 2n² + 3n - 2 ∈ O(n²).
O-NOTATION

Graphically, O-notation can be represented as : ( Figure: running time plotted against input size n; beyond n₀ the curve f(n) stays below cg(n). )
O-NOTATION

The most common values in the analysis of algorithms are :

constant      O(1)
logarithmic   O(log n)
linear        O(n)
quadratic     O(n²)
polynomial    O(nᵏ)  ( k ≥ 1 )
exponential   O(aⁿ)  ( a > 1 )
O-notation should be used to characterize a function as closely as possible. While it is true that f(n) = 4n³ + 3n^(4/3) is in O(n⁵), it is more accurate to say that it is in O(n³).
Ω-NOTATION
For a given function g(n) we denote by Ω(g(n)) the set of functions :
Ω(g(n)) = { f(n) : ∃c, n₀ > 0 such that 0 ≤ cg(n) ≤ f(n) ∀n ≥ n₀ }
Ω-notation stands for asymptotic lower bound ( best case running time ).
Ω-NOTATION

Graphically, Ω-notation can be represented as : ( Figure: running time plotted against input size n; beyond n₀ the curve f(n) stays above cg(n). )
RECURRENCES
When an algorithm contains a recursive call, its running time can often be described by a recurrence - an equation that describes a function in terms of its values on smaller inputs.
For example, the recurrence that describes the recursive Fibonacci function is :

T(n) = 1                      if n = 1 or n = 2
T(n) = T(n - 1) + T(n - 2)    otherwise
The importance of solving recurrences lies in obtaining asymptotic Θ or O bounds.
HOMOGENEOUS RECURRENCES
Recurrences of the form : a₀T(n) + a₁T(n - 1) + ... + aₖT(n - k) = 0
are called homogeneous recurrences.
For example, the Fibonacci sequence, written as the recurrence T(n) - T(n - 1) - T(n - 2) = 0, is a homogeneous recurrence.
With each recurrence we associate a characteristic polynomial : p(x) = a₀xᵏ + a₁xᵏ⁻¹ + ... + aₖ
For example, x² - x - 1 is p(x) for the Fibonacci sequence.
Let rᵢ denote the ith root of the characteristic polynomial. Then the homogeneous recurrence has a solution of the form :

T(n) = ∑ cᵢrᵢⁿ = c₁r₁ⁿ + c₂r₂ⁿ + ... + cₖrₖⁿ

provided that all roots are distinct.
Coefficients c₁, c₂, ..., cₖ can be determined from k initial conditions ( trivial recursive cases ) by solving a system of k linear equations in k unknowns.
HOMOGENEOUS RECURRENCES
EXAMPLE
Fibonacci recurrence is defined as T(n) - T(n - 1) - T(n - 2) = 0 with characteristic polynomial x² - x - 1. The roots of this polynomial are :
r₁ = ( 1 + √5 ) / 2,   r₂ = ( 1 - √5 ) / 2

So the solution is of the form :

T(n) = c₁( ( 1 + √5 ) / 2 )ⁿ + c₂( ( 1 - √5 ) / 2 )ⁿ
EXAMPLE
We find coefficients c₁ and c₂ from the initial conditions T(1) = 1 and T(2) = 1. Their values are c₁ = 1 / √5 and c₂ = -1 / √5. Thus

T(n) = ( 1 / √5 ) ( ( ( 1 + √5 ) / 2 )ⁿ - ( ( 1 - √5 ) / 2 )ⁿ )

Since ( 1 + √5 ) / 2 > 1 and | ( 1 - √5 ) / 2 | < 1, we have

T(n) = O(1.618ⁿ)

That's bad. Very bad! It will take ~2,500,000,000 steps to compute the 45th member of the Fibonacci sequence!
THE MASTER METHOD
The master method is a method for solving recurrences of the form :

T(n) = aT(n / b) + f(n)

where a ≥ 1, b > 1 and f(n) is an asymptotically positive function. Then T(n) can be bounded asymptotically as follows :
1. If f(n) = O(n^(log_b a - ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) log n).
3. If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if af(n / b) ≤ cf(n) for some constant c < 1, then T(n) = Θ(f(n)).
EXAMPLES (1)
Consider T(n) = 9T(n / 3) + n.
For this recurrence we have a = 9, b = 3, f(n) = n.
Thus n^(log₃ 9) = n². Let ε = 1.

Since f(n) = n = O(n^(log₃ 9 - 1)) = O(n^(log₃ 3)) = O(n), we apply case (1) and conclude T(n) = Θ(n^(log₃ 9)) = Θ(n²).
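The bound can be sanity-checked numerically : if T(n) = Θ(n²), then T(n) / n² should settle toward a constant. A sketch, assuming the base case T(1) = 1 and integer division for n / 3 :

```java
class MasterCheck {
    // The recurrence T(n) = 9T(n / 3) + n, with assumed base case T(1) = 1
    static long T(long n) {
        if (n <= 1) return 1;
        return 9 * T(n / 3) + n;
    }

    public static void main(String[] args) {
        // For powers of 3 the ratio T(n) / n^2 approaches a constant,
        // consistent with the Theta(n^2) bound from case (1)
        for (long n = 3; n <= 59049; n *= 3)
            System.out.println("n = " + n + "  T(n)/n^2 = " + (double) T(n) / (n * n));
    }
}
```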
EXAMPLES (2)
Consider T(n) = T(2n / 3) + 1.
For this recurrence we have a = 1, b = 3 / 2, f(n) = 1.

Thus n^(log_{3/2} 1) = n⁰ = 1.

Since f(n) = 1 = Θ(n^(log_{3/2} 1)) = Θ(1), we apply case (2) and conclude T(n) = Θ(n^(log_{3/2} 1) log n) = Θ(log n).
EXAMPLES (3)
Consider T(n) = 3T(n / 4) + n log n.
For this recurrence we have a = 3, b = 4, f(n) = n log n and thus n^(log₄ 3).

Since log₄ 3 < 1, choose ε such that ε + log₄ 3 = 1.

Then n^(log₄ 3 + ε) = n¹ = n and obviously f(n) = n log n = Ω(n) ( case 3 is a candidate ).
EXAMPLES (3)
Next we have to show that regularity condition applies :
a f(n / b) ≤ cf(n) for some constant c < 1

3 (n / 4) log (n / 4) ≤ cn log n

Let c = 3 / 4. Then we have :

(3n / 4) log (n / 4) < (3n / 4) log n
Case 3 applies and we conclude T(n) = Θ(n log n).
DATA STRUCTURES
So far, we have studied static, fixed-size data structures such as arrays.

Quite often, it is not possible to predict, in advance, how much memory is needed to carry out a computation ( can the compiler predict how many variables you will declare? ).
For this reason, we’ll study dynamic data structures - structures that can grow, as well as shrink, at run time.
SELF-REFERENTIAL CLASSES
All dynamic data structures are based on the concept of self-referential nodes.
Self-referential nodes are implemented through self-referential classes which contain data field(s) as well as references ( links ) to objects of the same type.
Pictorially, a self-referential node can be represented as a box with a data field and a link ( arrow ) to the next node.
Node.java

class Node {
    private Node link;
    private int data;

    public Node() {
        data = 0;
        link = null;
    }

    public void setData(int data) { this.data = data; }

    public int getData() { return data; }

    public void setLink(Node link) { this.link = link; }

    public Node getLink() { return link; }
}
DYNAMIC MEMORY ALLOCATION
Creating and maintaining dynamic data structures requires dynamic memory allocation - the ability to request and allocate more memory at run time.

Recall that in Java, the new operator is essential in dynamic memory allocation. For example : Node n = new Node(); would request enough memory to store an object of type Node and assign its reference to variable n.

Unlike in C/C++, we don't have to worry about deleting nodes we no longer need. The Java garbage collector does the job!
TestNode.java

class TestNode {
    public static void main(String[] args) {
        Node n1, n2, n3, temp;
        n1 = new Node();
        n2 = new Node();
        n3 = new Node();
        n1.setData(1);
        n1.setLink(n2);
        n2.setData(2);
        n2.setLink(n3);
        n3.setData(3);

        temp = n1;
        while(temp != null) {
            System.out.print(temp.getData() + " ");
            temp = temp.getLink();
        }
        System.out.println();
    }
}
LINKED NODES
This is what we have created here, pictorially :

n1 → [ 1 ] → n2 → [ 2 ] → n3 → [ 3 ] → null

( temp starts at n1 and follows the links. )
The major disadvantage with this approach is that we have to declare nodes in advance. Instead of doing so we’ll develop methods to create and link nodes dynamically.
STACKS
The simplest dynamic data structure is the stack - new nodes can be added to and removed from the top only.

For this reason, a stack is referred to as a LIFO ( Last-In-First-Out ) data structure.

In order to implement a stack, we need two private variables : Node top and int size, initially set to null and 0, respectively.
Primary methods are void push(int data) which adds a new node on top of the stack, and void pop() which removes node from the top.
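Java's standard library already provides LIFO behaviour through java.util.Deque; the hand-rolled Stack below shows what happens underneath. A quick illustration using ArrayDeque :

```java
import java.util.ArrayDeque;
import java.util.Deque;

class LifoDemo {
    public static void main(String[] args) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        // LIFO: elements come off in reverse order of insertion
        System.out.println(stack.pop()); // 3
        System.out.println(stack.pop()); // 2
        System.out.println(stack.pop()); // 1
    }
}
```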
STACKS : push
This is the sequence of steps taken every time the push(data) method is called :

1. Dynamically create a new node : Node n = new Node();
2. Set data ( assume 1 ) : n.setData(data);
3. Link the newly created node to the previous one : n.setLink(top);
4. Set the top reference to the newly created node : top = n;
5. Increment size : size++;
STACKS : push
public void push(int data) {
    Node n = new Node();
    n.setData(data);
    n.setLink(top);
    top = n;
    size++;
}
Pop method removes the node on top. This is the sequence of steps taken every time the pop() method is called :
STACKS : pop
1. If the stack is empty ( size is zero or top is null ), report it and exit. Otherwise, set top to the node below : top = top.getLink();
2. Decrement size : size--;
STACKS : pop
public void pop() {
    if(isEmpty())
        System.out.println("Stack is empty.");
    else {
        top = top.getLink();
        size--;
    }
}
Stack.java

class Stack {
    Node top;
    int size;

    public Stack() {
        top = null;
        size = 0;
    }

    public void push(int data) {
        Node n = new Node();
        n.setData(data);
        n.setLink(top);
        top = n;
        size++;
    }

    public boolean isEmpty() {
        return size == 0;
    }

    public void pop() {
        if(isEmpty())
            System.out.println("Stack is empty.");
        else {
            top = top.getLink();
            size--;
        }
    }

    public void print() {
        if(isEmpty())
            System.out.print("Stack is empty");
        else {
            Node temp = top;
            while(temp != null) {
                System.out.print(temp.getData() + " ");
                temp = temp.getLink();
            }
        }
        System.out.println();
    }
}
TestStack.java

class TestStack {
    public static void main(String[] args) {
        Stack s = new Stack();
        s.push(1);
        s.push(2);
        s.push(3);
        s.print();
        s.pop();
        s.print();
        s.pop();
        s.pop();
        s.print();
        s.pop();
    }
}
LINKED LISTS
Linked lists are data structures similar to stacks. Insertions are done at the end of the list, called the tail.

Here we use another reference - head - which points to the first node in the list.

With two references, it is possible to search, sort, and remove entries anywhere in the list, not only at the tail.

Hence, in order to implement linked lists, we need two references, Node head and Node tail, and an int size. The references are initialized to null and size to 0.
LINKED LISTS : insert
This is the sequence of steps taken every time the insert(data) method is called :

1. Dynamically create a new node : Node n = new Node();
2. Set data ( assume 1 ) : n.setData(data);
3. If the list is empty, set the head reference to the newly created node ( head = n ). Otherwise, link this node to the last one : tail.setLink(n);
4. Set the tail reference to the newly created node : tail = n;
5. Increment size : size++;
LINKED LISTS : insert
public void insert(int data) {
    Node n = new Node();
    n.setData(data);
    if(isEmpty())
        head = n;
    else
        tail.setLink(n);
    tail = n;
    size++;
}
LINKED LISTS : remove

Remove method removes the node at the tail. This is the sequence of steps taken every time the remove() method is called :
1. If the list is empty ( size is zero or head/tail are null ), report it and exit.
2. If size is 1, reset head and tail to null and size to zero.
3. Otherwise, starting from the head ( Node temp = head; ), traverse the list until we find the node just before tail :

while(temp.getLink() != tail)
    temp = temp.getLink();
4. Set tail to temp : tail = temp;
5. Set tail’s link to null : tail.setLink(null);
6. Decrement size : size--;
LINKED LISTS : remove
public void remove() {
    if(isEmpty())
        System.out.println("List is empty.");
    else if(size == 1) {
        head = tail = null;
        size = 0;
    }
    else {
        Node temp = head;
        while(temp.getLink() != tail)
            temp = temp.getLink();
        tail = temp;
        tail.setLink(null);
        size--;
    }
}
LinkedList.java

class LinkedList {
    private Node head;
    private Node tail;
    int size;

    public LinkedList() {
        head = tail = null;
        size = 0;
    }

    public boolean isEmpty() {
        return size == 0;
    }

    public void insert(int data) {
        Node n = new Node();
        n.setData(data);
        if(isEmpty())
            head = n;
        else
            tail.setLink(n);
        tail = n;
        size++;
    }

    public void remove() {
        if(isEmpty())
            System.out.println("Can't remove. List is empty.");
        else if(size == 1) {
            head = tail = null;
            size = 0;
        }
        else {
            Node temp = head;
            while(temp.getLink() != tail)
                temp = temp.getLink();
            tail = temp;
            tail.setLink(null);
            size--;
        }
    }

    public void print() {
        if(isEmpty())
            System.out.println("List is empty.");
        else {
            Node temp = head;
            while(temp != null) {
                System.out.print(temp.getData() + " ");
                temp = temp.getLink();
            }
            System.out.println();
        }
    }
}
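The HashTable class later in these notes calls search(int) on a LinkedList, a method not defined above. A standalone sketch of such a search, returning the zero-based position of a value or -1 ( the Node fields here are simplified public fields; the deck's Node uses getData()/getLink() accessors ) :

```java
class SearchDemo {
    // Simplified self-referential node, mirroring the Node class in these notes
    static class Node {
        int data;
        Node link;
        Node(int data) { this.data = data; }
    }

    // Walks the chain from head; returns the position of data, or -1 if absent
    static int search(Node head, int data) {
        int position = 0;
        for (Node temp = head; temp != null; temp = temp.link, position++)
            if (temp.data == data)
                return position;
        return -1;
    }

    public static void main(String[] args) {
        Node head = new Node(1);
        head.link = new Node(2);
        head.link.link = new Node(3);
        System.out.println(search(head, 3)); // 2
        System.out.println(search(head, 7)); // -1
    }
}
```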
TestLinkedList.java

class TestLinkedList {
    public static void main(String[] args) {
        LinkedList l = new LinkedList();
        l.insert(1);
        l.insert(2);
        l.insert(3);
        l.print();
        l.remove();
        l.print();
        l.remove();
        l.print();
        l.remove();
        l.print();
        l.remove();
        l.print();
    }
}
QUEUES
A queue is a dynamic data structure similar to a linked list : it can be described by two references, head and tail, and a size. The fundamental methods are :

void enqueue(int data) : identical to insert(int data) in the linked list. A new node is linked at the tail of the queue.

void dequeue() : removes the node from the head. For this reason queues are called FIFO ( First-In-First-Out ) data structures.
QUEUES : dequeue

( Figure: dequeue on the two-node queue 1 → 2 simply advances head past node 1, leaving size = 1. )
Queue.java

class Queue {
    private Node head;
    private Node tail;
    private int size;

    public Queue() {
        head = tail = null;
        size = 0;
    }

    public boolean isEmpty() {
        return size == 0;
    }

    public void enqueue(int data) {
        Node n = new Node();
        n.setData(data);
        if(isEmpty())
            head = n;
        else
            tail.setLink(n);
        tail = n;
        size++;
    }

    public void dequeue() {
        if(isEmpty())
            System.out.println("Queue is empty");
        else if(size == 1) {
            head = tail = null;
            size = 0;
        }
        else {
            head = head.getLink();
            size--;
        }
    }

    public void print() {
        if(isEmpty())
            System.out.println("Queue is empty.");
        else {
            Node temp = head;
            while(temp != null) {
                System.out.print(temp.getData() + " ");
                temp = temp.getLink();
            }
            System.out.println();
        }
    }
}
TestQueue.java

class TestQueue {
    public static void main(String[] args) {
        Queue q = new Queue();
        q.enqueue(1);
        q.enqueue(2);
        q.print();
        q.dequeue();
        q.print();
        q.dequeue();
        q.print();
        q.dequeue();
        q.print();
    }
}
BINARY TREES
A graph G = ( V, E ) is a pair of sets : a set V of vertices and a set E of edges.

A cycle C is a path in a graph - a sequence of vertices and edges, distinct except for the endpoints - that begins and ends with the same vertex.
BINARY TREES
A tree T is a connected acyclic graph. It is rooted at a distinguished root vertex.

A Binary Tree is a tree in which any node can have at most two children.
BINARY SEARCH TREE
A Binary Search Tree is a binary tree which is either empty or in which each vertex contains a key that satisfies these conditions :

All keys ( if any ) in the left subtree with respect to the root vertex precede the key in the root.

The key in the root vertex precedes all keys ( if any ) in its right subtree.

The left and right subtrees of the root are again binary search trees.
BINARY SEARCH TREE
( Figure: a binary search tree with root 2, left child 1 and right child 3. As a data structure, every unused link is null. )
BNode.java

class BNode {
    private int data;
    private BNode leftLink;
    private BNode rightLink;

    public BNode() {
        data = 0;
        leftLink = rightLink = null;
    }

    public void setData(int data) { this.data = data; }
    public int getData() { return data; }

    public void setLeftLink(BNode leftLink) { this.leftLink = leftLink; }
    public BNode getLeftLink() { return leftLink; }

    public void setRightLink(BNode rightLink) { this.rightLink = rightLink; }
    public BNode getRightLink() { return rightLink; }
}
BINARY TREE: INSERTION
( Figures: the insertion sequences 1 - 2 - 3, 1 - 3 - 2, 3 - 2 - 1 and 3 - 1 - 2 each produce a differently shaped tree. )
Nodes cannot be inserted into a binary tree in a predefined way. Every insertion must preserve the binary search tree property. Different insertion orders may result in different trees.
BINARY TREE: INSERTION
Observe that both sequences, 2 - 1 - 3 and 2 - 3 - 1, will yield the same tree : 2 at the root, with 1 as its left child and 3 as its right child.
Hence, before we can insert a node into a binary tree, we have to find the proper position for that node. Initially, the binary tree consists of a single reference, root, which is null.
BINARY TREE: INSERTION
( Figure: the calls b.insert(2); b.insert(1); b.insert(4); b.insert(3); b.insert(5); build a tree with root 2, left child 1 and right child 4; node 4 has children 3 and 5. A temp reference walks down from root to locate each insertion point. )
BINARY TREE: INSERTION
public void insert(int data) {
    BNode b = new BNode();
    b.setData(data);
    if(root == null)
        root = b;
    else {
        BNode temp = root;
        while(temp != null) {
            if(data < temp.getData()) {
                if(temp.getLeftLink() != null)
                    temp = temp.getLeftLink();
                else {
                    temp.setLeftLink(b);
                    break;
                }
            }
            else if(data > temp.getData()) {
                if(temp.getRightLink() != null)
                    temp = temp.getRightLink();
                else {
                    temp.setRightLink(b);
                    break;
                }
            }
            else {
                System.out.println("Error. Duplicate.");
                break;
            }
        }
    }
}
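Searching follows the same walk as insertion, without modifying the tree. A self-contained sketch ( simplified BNode with public fields; the recursive insert here mirrors the deck's insertion logic ) :

```java
class BSearchDemo {
    static class BNode {
        int data;
        BNode left, right;
        BNode(int data) { this.data = data; }
    }

    // Insert in BST order, ignoring duplicates; returns the (possibly new) root
    static BNode insert(BNode root, int data) {
        if (root == null) return new BNode(data);
        if (data < root.data)      root.left  = insert(root.left, data);
        else if (data > root.data) root.right = insert(root.right, data);
        return root;
    }

    // Iterative search: go left on smaller keys, right on larger ones
    static boolean contains(BNode root, int data) {
        for (BNode temp = root; temp != null; ) {
            if (data < temp.data)      temp = temp.left;
            else if (data > temp.data) temp = temp.right;
            else                       return true;
        }
        return false;
    }

    public static void main(String[] args) {
        BNode root = null;
        for (int k : new int[]{2, 1, 4, 3, 5})
            root = insert(root, k);
        System.out.println(contains(root, 3)); // true
        System.out.println(contains(root, 7)); // false
    }
}
```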
TREE TRAVERSAL
In many applications it is necessary to visit nodes in a regular way. Typically we have to visit the node ( V ), traverse its left subtree ( L ) and traverse its right subtree ( R ), in some order :

VLR, VRL, LVR, RVL, LRV, RLV
By standard convention, these six traversals are reduced to three, by traversing the left subtree before the right.
Their names are : Preorder ( VLR ), Inorder ( LVR ) and Postorder ( LRV ).
PREORDER (VLR)

For the tree built from 2, 1, 4, 3, 5 the preorder traversal visits : 2 1 4 3 5
INORDER (LVR)

For the same tree the inorder traversal visits : 1 2 3 4 5
POSTORDER (LRV)

For the same tree the postorder traversal visits : 1 3 5 4 2
BTree.java

class BTree {
    private BNode root;

    public BTree() {
        root = null;
    }

    public void insert(int data) {
        BNode b = new BNode();
        b.setData(data);
        if(root == null)
            root = b;
        else {
            BNode temp = root;
            while(temp != null) {
                if(data < temp.getData()) {
                    if(temp.getLeftLink() != null)
                        temp = temp.getLeftLink();
                    else {
                        temp.setLeftLink(b);
                        break;
                    }
                }
                else if(data > temp.getData()) {
                    if(temp.getRightLink() != null)
                        temp = temp.getRightLink();
                    else {
                        temp.setRightLink(b);
                        break;
                    }
                }
                else {
                    System.out.println("Error. Duplicate.");
                    break;
                }
            }
        }
    }

    public void showPreorder() {
        System.out.print("Preorder traversal: ");
        preorder(root);
        System.out.println();
    }

    public void preorder(BNode b) {
        if(b != null) {
            System.out.print(b.getData() + " ");
            preorder(b.getLeftLink());
            preorder(b.getRightLink());
        }
    }

    public void showInorder() {
        System.out.print("Inorder traversal: ");
        inorder(root);
        System.out.println();
    }

    public void inorder(BNode b) {
        if(b != null) {
            inorder(b.getLeftLink());
            System.out.print(b.getData() + " ");
            inorder(b.getRightLink());
        }
    }

    public void showPostorder() {
        System.out.print("Postorder traversal: ");
        postorder(root);
        System.out.println();
    }

    public void postorder(BNode b) {
        if(b != null) {
            postorder(b.getLeftLink());
            postorder(b.getRightLink());
            System.out.print(b.getData() + " ");
        }
    }
}
TestBTree.java

class TestBTree {
    public static void main(String[] args) {
        BTree b = new BTree();
        b.insert(2);
        b.insert(1);
        b.insert(4);
        b.insert(4);
        b.insert(3);
        b.insert(5);
        b.showPreorder();
        b.showInorder();
        b.showPostorder();
    }
}
AVL TREES

If we inserted into a binary tree the values 1, 2, 3, the resulting tree would be a right-leaning chain : 1 at the root, 2 as its right child and 3 as 2's right child, with all left links null.
The worst-case performance ( search ) on a binary search tree occurs when values are inserted in ascending / descending order. It is then no better than linear search, O(n).
Height-Balance Property : for every internal ( non-leaf ) node v of a binary search tree T, the heights of the children of v can differ by at most 1.

Any tree that satisfies the Height-Balance Property is said to be an AVL tree ( Adelson-Velskii, Landis ). AVL trees maintain logarithmic height.
INSERTION
Nodes are inserted into AVL tree in the same way as they were inserted into Binary Search Tree.
This action might violate the height-balance property. If this is the case, the tree must be restructured. There are four cases :
( Figures: the four unbalanced shapes produced by inserting 1 - 2 - 3, 1 - 3 - 2, 3 - 2 - 1 and 3 - 1 - 2. )
INSERTION
Once the new node x is inserted into a tree ( and linked to its parent! ), we’ll call :
public AVLNode unbalanced(AVLNode x)
Method unbalanced(x) climbs up the tree from x until it either finds an unbalanced node by checking the height-balance property or reaches the null reference ( the root's parent ).

If the tree is unbalanced, the method will return a reference to the unbalanced node ( z ), null otherwise.
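A sketch of this climb, assuming nodes carry a parent link and that heights are recomputed on the fly ( real AVL implementations usually cache heights in each node; the names here are illustrative ) :

```java
class UnbalancedDemo {
    static class AVLNode {
        int data;
        AVLNode left, right, parent;
        AVLNode(int data) { this.data = data; }
    }

    // Height of a subtree: -1 for the empty tree, so a single leaf has height 0
    static int height(AVLNode n) {
        if (n == null) return -1;
        return 1 + Math.max(height(n.left), height(n.right));
    }

    // Climb from x toward the root; return the first node whose children's
    // heights differ by more than 1, or null if the tree is balanced
    static AVLNode unbalanced(AVLNode x) {
        for (AVLNode n = x; n != null; n = n.parent)
            if (Math.abs(height(n.left) - height(n.right)) > 1)
                return n;
        return null;
    }

    public static void main(String[] args) {
        // Right-leaning chain 1 -> 2 -> 3, as in the L-Rotation example
        AVLNode a = new AVLNode(1), b = new AVLNode(2), c = new AVLNode(3);
        a.right = b; b.parent = a;
        b.right = c; c.parent = b;
        AVLNode z = unbalanced(c);
        System.out.println(z == null ? "balanced" : "unbalanced at " + z.data);
    }
}
```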
INSERTION
Observe that it is not sufficient to check the height-balance property at the root alone. The tree built from the sequence 4 3 5 1 2 6 7 is balanced at the root, but unbalanced at node 5.
L-ROTATION

The calls b.insert(1); b.insert(2); b.insert(3); produce the right-leaning chain 1 → 2 → 3, which must be rotated into 2 with children 1 and 3.
L-ROTATION

This is L-Rotation in terms of pointers ( mess! ). The most recently inserted node is x, node z is unbalanced. After the rotation, y becomes the root of the subtree, with z as its left child ( keeping subtrees t0 and t1 ) and x as its right child ( keeping subtrees t2 and t3 ).

Criteria for L-Rotation :

Newly inserted node x violates the height-balance property ( H(L) - H(R) = -2 ) in the right subtree with respect to unbalanced node z.

Newly inserted node x is the right child of its parent y.
void l(AVLNode x, AVLNode y, AVLNode z) {
    AVLNode t0, t1, t2, t3;
    t0 = z.getLeftLink();
    t1 = y.getLeftLink();
    t2 = x.getLeftLink();
    t3 = x.getRightLink();
    y.setLeftLink(z);
    y.setRightLink(x);
    y.setParent(z.getParent());
    z.setLeftLink(t0);
    z.setRightLink(t1);
    z.setParent(y);
    x.setLeftLink(t2);
    x.setRightLink(t3);
    x.setParent(y);
    if(z == root)
        root = y;
}
LR-ROTATION

The calls b.insert(1); b.insert(3); b.insert(2); produce the zig-zag chain 1 → 3 → 2, which must be rotated into 2 with children 1 and 3.
LR-ROTATION

This is LR-Rotation in terms of pointers ( mess! ). The most recently inserted node is x, node z is unbalanced. After the rotation, x becomes the root of the subtree, with z as its left child ( keeping subtrees t0 and t1 ) and y as its right child ( keeping subtrees t2 and t3 ).

Criteria for LR-Rotation :

Newly inserted node x violates the height-balance property ( H(L) - H(R) = -2 ) in the right subtree with respect to unbalanced node z.

Newly inserted node x is the left child of its parent y.
void lr(AVLNode x, AVLNode y, AVLNode z) {
    AVLNode t0, t1, t2, t3;
    t0 = z.getLeftLink();
    t1 = x.getLeftLink();
    t2 = x.getRightLink();
    t3 = y.getRightLink();
    x.setLeftLink(z);
    x.setRightLink(y);
    x.setParent(z.getParent());
    z.setLeftLink(t0);
    z.setRightLink(t1);
    z.setParent(x);
    y.setLeftLink(t2);
    y.setRightLink(t3);
    y.setParent(x);
    if(z == root)
        root = x;
}
R-ROTATION

The calls b.insert(3); b.insert(2); b.insert(1); produce the left-leaning chain 3 → 2 → 1, which must be rotated into 2 with children 1 and 3.
R-ROTATION

This is R-Rotation in terms of pointers ( mess! ). The most recently inserted node is x, node z is unbalanced. After the rotation, y becomes the root of the subtree, with x as its left child ( keeping subtrees t0 and t1 ) and z as its right child ( keeping subtrees t2 and t3 ).
R-ROTATION
Newly inserted node x violates the height-balance property ( H(L) - H(R) = 2 ) in the left subtree with respect to unbalanced node z.
Newly inserted node x is the left child of its parent y.
Criteria for R-Rotation :
void r(AVLNode x, AVLNode y, AVLNode z) {
    AVLNode t0, t1, t2, t3;
    t0 = x.getLeftLink();
    t1 = x.getRightLink();
    t2 = y.getRightLink();
    t3 = z.getRightLink();
    y.setLeftLink(x);
    y.setRightLink(z);
    y.setParent(z.getParent());
    x.setLeftLink(t0);
    x.setRightLink(t1);
    x.setParent(y);
    z.setLeftLink(t2);
    z.setRightLink(t3);
    z.setParent(y);
    if(z == root)
        root = y;
}
RL-ROTATION

The calls b.insert(3); b.insert(1); b.insert(2); produce the zig-zag chain 3 → 1 → 2, which must be rotated into 2 with children 1 and 3.
RL-ROTATION

This is RL-Rotation in terms of pointers ( mess! ). The most recently inserted node is x, node z is unbalanced. After the rotation, x becomes the root of the subtree, with y as its left child ( keeping subtrees t0 and t1 ) and z as its right child ( keeping subtrees t2 and t3 ).
RL-ROTATION
Newly inserted node x violates the height-balance property ( H(L) - H(R) = 2 ) in the left subtree with respect to unbalanced node z.
Newly inserted node x is the right child of its parent y.
Criteria for RL-Rotation :
void rl(AVLNode x, AVLNode y, AVLNode z) {
    AVLNode t0, t1, t2, t3;
    t0 = y.getLeftLink();
    t1 = x.getLeftLink();
    t2 = x.getRightLink();
    t3 = z.getRightLink();
    x.setLeftLink(y);
    x.setRightLink(z);
    x.setParent(z.getParent());
    y.setLeftLink(t0);
    y.setRightLink(t1);
    y.setParent(x);
    z.setLeftLink(t2);
    z.setRightLink(t3);
    z.setParent(x);
    if(z == root)
        root = x;
}
REMOVAL OF NODES IN AVL TREES
CASE 1 : Removal of external nodes ( nodes that do not have children )

ALGORITHM:
1. find node ( x ) with specified key
2. find x’s parent ( y )
3. set y’s left ( right ) link to null
4. possibly handle root
( Figure: the example tree has root 4 with children 2 and 6; node 2 has children 1 and 3, node 6 has the left child 5. )
CASE 1

For example, a.remove(1); first calls find(1); to locate node x, then y = x.getParent(); and finally unlinks x :

if(x == y.getLeft())
    y.setLeft(null);
else
    y.setRight(null);
REMOVAL OF NODES IN AVL TREES
CASE 2 : Removal of internal nodes ( nodes that have at least one child ) Subcase ( a ) : 1 child
( Figure: the same example tree; node 6 has the single child 5. )
ALGORITHM:
1. find node ( w ) with specified key
2. if w has single child, set reference x to that child and reference y to w’s parent
3. properly link y to x and possibly handle root, i.e. w is root.
CASE 2A

( Figure: removing w = 6 : its parent y = 4 is linked directly to w's only child x = 5. )
CASE 2B

CASE 2 : Removal of internal nodes, subcase ( b ) : 2 children

ALGORITHM:
1. find node ( w ) with specified key
2. find node z that follows w in inorder traversal
3. copy value at z to w
4. if z has no children, unlink it; otherwise properly link z's right subtree to z's parent.
CASE 2B

( Figure: removing w = 4 : its inorder successor z = 5 is copied into w's position and node z is unlinked, leaving the tree with root 5, children 2 and 6, and leaves 1 and 3. )
REMOVAL OF NODES IN AVL TREES
Deleting a node from an AVL tree may result in an unbalanced tree :

( Figure: removing 6 from the tree with root 5, children 2 and 6, and leaves 1 and 3 leaves the root unbalanced : its left subtree has height 1 while its right subtree is empty. )
REMOVAL OF NODES IN AVL TREES
Resulting tree needs to be balanced using the same rotations as in the case of insertion : ( Figures: depending on the resulting shape, an R-Rotation or an RL-Rotation restores balance. )
REMOVAL OF NODES IN AVL TREES
The situation is, however, more complex than in the case of insertion.

More than one rotation might be necessary to balance the tree.

For this reason, it is necessary to climb back to the root, checking the balance property at each node.
Play with this at :http://www.qmatica.com/DataStructures/Trees/AVL/AVLTree.html
HASH TABLES

A dictionary is a collection of pairs ( k, e ) where k is the key and e is an element. Keys uniquely identify each pair. The most common methods are :

find(k) : returns an element with key k
insert(k, e) : inserts element e with key k into a dictionary
remove(k) : removes element with key k from dictionary
The primary purpose of dictionaries is to store elements so that they can be located efficiently using keys.
HASH TABLES
The simplest implementation of a dictionary is an array, where the key is the array's index and the element is the data stored at that index.

The problem with arrays is that array indices are dense while many key sets are not. Consider CNA students' numbers : they are 8 digits long and there are roughly 8,500 students.
The solution to this problem is to use a hash table which consists of two components : bucket array and hash function.
The hash function is used to map sparse keys to dense locations of the bucket array.
The task of a hash function is to map a key k into an integer in the range [ 0, N - 1 ], where N is the size of the bucket array.
One of the simplest hash functions is : h(k) = k mod N
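One Java-specific subtlety with h(k) = k mod N : the % operator can return a negative result for negative keys, which would index outside the bucket array. Math.floorMod avoids this. A small sketch ( the class name is illustrative ) :

```java
class HashDemo {
    // Maps any int key into [0, N - 1]; Java's % may be negative, floorMod is not
    static int h(int k, int N) {
        return Math.floorMod(k, N);
    }

    public static void main(String[] args) {
        System.out.println(h(23, 4));  // 3
        System.out.println(h(-23, 4)); // 1, whereas -23 % 4 == -3 in Java
    }
}
```

The notes' keys ( student numbers ) are nonnegative, so plain % works there; floorMod matters only when keys may be negative.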
HASH TABLES
Here is an example of a hash table with four buckets storing the prime numbers { 2, 5, 11, 23, 37 } :
2 % 4 = 2, 5 % 4 = 1, 11 % 4 = 3, 23 % 4 = 3, 37 % 4 = 1
Bucket 0 : null
Bucket 1 : 5, 37
Bucket 2 : 2
Bucket 3 : 11, 23
If two different keys are mapped to the same bucket, we say that a collision has occurred.
Obviously, if a bucket can store only a single element, we would not be able to handle collisions.
Therefore, instead of storing a single element in each bucket, we store a list of elements. This collision-resolution scheme is called chaining.
HASH TABLES
HashTable.java
class HashTable {
    private LinkedList[] table;
    private int size;

    public HashTable(int size) {
        this.size = size;
        table = new LinkedList[size];
        for(int i = 0; i < size; i++)
            table[i] = new LinkedList();
    }
HashTable.java
    public void insert(int i) {
        int hashCode = i % size;
        table[hashCode].insert(i);
    }

    public void remove(int i) {
        int hashCode = i % size;
        table[hashCode].remove(i);
    }
HashTable.java
    public void get(int i) {
        int hashCode = i % size;
        int n = table[hashCode].search(i);
        if(n == -1)
            System.out.println(i + " is not in the table.");
        else
            System.out.println(i + " is in the bucket " + hashCode + " at position " + n);
    }
HashTable.java
    public void print() {
        for(int i = 0; i < size; i++) {
            System.out.print("Bucket " + i + ": ");
            table[i].print();
        }
    }
}
GRAPHS
A graph G = ( V, E ) consists of a set V of vertices and a set E of edges.
If e = ( u, v ) is in E, then e is an edge between vertices u and v. Circles represent vertices and arcs ( lines ) represent edges.
GRAPHS
[figure: graph with vertices A, B, C, D, E, F]
If pairs of vertices are unordered, graph G is called undirected graph.
If pairs of vertices are ordered, graph G is called directed graph. Represented as line segments / arcs with arrowheads indicating direction.
GRAPH DEFINITIONS
A graph is called connected if there is a path from any vertex to any other vertex.
A path in a graph is a sequence of distinct vertices, each adjacent to the next.
A cycle in a graph is a path, where the starting and ending vertex are the same.
GRAPH REPRESENTATIONS
There are ( at least ) three ways to represent graphs :
1. Adjacency matrix representation
2. Mixed ( array - linked lists ) representation
3. Linked lists representation
1. ADJACENCY MATRIX REPRESENTATION
Let n denote the number of vertices in a directed graph. We define the adjacency matrix :
boolean[][] am = new boolean[n][n];
am[i][j] is true if and only if vertex i is adjacent to vertex j, i.e., there exists a directed edge from i to j. If the graph is undirected, adjacency matrix will be symmetric.
1. ADJACENCY MATRIX
[figure: directed graph with vertices 0, 1, 2, 3]
Adjacency matrix :
      0  1  2  3
0  [  0  1  1  0  ]
1  [  0  0  1  1  ]
2  [  0  0  0  0  ]
3  [  1  1  1  0  ]
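As a quick sketch, the matrix above can be built and queried directly in Java; the edge list below is read off the example matrix ( the class and method names are ours, for illustration only ):

```java
// Builds the adjacency matrix of the example directed graph with
// vertices 0-3, then queries it for the presence of edges.
class AdjacencyMatrixDemo {

    static boolean[][] buildMatrix() {
        int n = 4;
        boolean[][] am = new boolean[n][n];
        // Edges read off the adjacency matrix in the example above.
        int[][] edges = { {0, 1}, {0, 2}, {1, 2}, {1, 3},
                          {3, 0}, {3, 1}, {3, 2} };
        for (int[] e : edges)
            am[e[0]][e[1]] = true;   // row = source, column = destination
        return am;
    }

    public static void main(String[] args) {
        boolean[][] am = buildMatrix();
        System.out.println(am[0][1]);   // true : there is an edge 0 -> 1
        System.out.println(am[2][0]);   // false : vertex 2 has no outgoing edges
    }
}
```

Note that for an undirected graph the same loop would also set am[e[1]][e[0]], making the matrix symmetric.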
2. MIXED REPRESENTATION
In this representation, a one-dimensional array is used to represent vertices.
Each entry in that array is a reference to a linked list of vertices adjacent to the vertex represented by the array's index.
We assume that vertices are labeled with nonnegative integers starting from zero.
The same graph as in the previous example would be represented as :
[figure: mixed representation of the same directed graph]
0 → 1 → 2
1 → 2 → 3
2 → null
3 → 0 → 1 → 2
3. LINKED LIST REPRESENTATION
Similar to the previous one, except that instead of an array, a linked list is used to represent the vertices.
[figure: linked list representation of the same directed graph; a vertical list of vertex nodes 0-3, each pointing to its list of adjacent vertices]
IMPLEMENTATION
In order to model graphs we need two types of nodes, representing vertices and edges ( as adjacent vertices ).
For the sake of simplicity we will label graph nodes with nonnegative integers.
EdgeNode : [ label | next EdgeNode ]
VertexNode : [ label | next VertexNode | next EdgeNode ]
Edge.java
class Edge {
    private int label;
    private Edge next;

    public Edge() {
        label = 0;
        next = null;
    }

    public void setLabel(int label) {
        this.label = label;
    }
Edge.java
    public int getLabel() {
        return label;
    }

    public void setNextEdge(Edge next) {
        this.next = next;
    }

    public Edge getNextEdge() {
        return next;
    }
}
Vertex.java
class Vertex {
    private int label;
    private Vertex nextVertex;
    private Edge nextEdge;

    public Vertex() {
        label = 0;
        nextVertex = null;
        nextEdge = null;
    }

    public void setLabel(int label) {
        this.label = label;
    }

    public int getLabel() {
        return label;
    }
    public void setNextVertex(Vertex nextVertex) {
        this.nextVertex = nextVertex;
    }

    public Vertex getNextVertex() {
        return nextVertex;
    }

    public void setNextEdge(Edge nextEdge) {
        this.nextEdge = nextEdge;
    }

    public Edge getNextEdge() {
        return nextEdge;
    }
}
TEST
Our driver class, TestGraph.java will model directed graph:
[figure: directed graph with vertices 0-3 and edges e0-e6 : e0 = (0,1), e1 = (0,2), e2 = (1,2), e3 = (1,3), e4 = (3,0), e5 = (3,1), e6 = (3,2)]
TestGraph.java
class TestGraph {
    public static void main(String args[]) {
        Vertex v0 = new Vertex();
        Vertex v1 = new Vertex();
        Vertex v2 = new Vertex();
        Vertex v3 = new Vertex();
        Edge e0 = new Edge();
        Edge e1 = new Edge();
        Edge e2 = new Edge();
        Edge e3 = new Edge();
        Edge e4 = new Edge();
        Edge e5 = new Edge();
        Edge e6 = new Edge();
TestGraph.java
        v0.setLabel(0);
        v1.setLabel(1);
        v2.setLabel(2);
        v3.setLabel(3);

        e0.setLabel(1);
        e1.setLabel(2);
        e2.setLabel(2);
        e3.setLabel(3);
        e4.setLabel(0);
        e5.setLabel(1);
        e6.setLabel(2);
TestGraph.java
        v0.setNextVertex(v1);
        v1.setNextVertex(v2);
        v2.setNextVertex(v3);

        v0.setNextEdge(e0);
        e0.setNextEdge(e1);
        v1.setNextEdge(e2);
        e2.setNextEdge(e3);
        v3.setNextEdge(e4);
        e4.setNextEdge(e5);
        e5.setNextEdge(e6);
TestGraph.java
        int vertex, edge;
        Vertex baseVertex = v0;
        while(baseVertex != null) {
            vertex = baseVertex.getLabel();
            System.out.print("Vertex: " + vertex + " Edges: ");
            Edge baseEdge = baseVertex.getNextEdge();
            while(baseEdge != null) {
                edge = baseEdge.getLabel();
                System.out.print("(" + vertex + "," + edge + ") ");
                baseEdge = baseEdge.getNextEdge();
            }
            System.out.println();
            baseVertex = baseVertex.getNextVertex();
        }
    }
}
GENERAL IMPLEMENTATION
The general implementation requires dynamic creation of vertices and edges, i.e. methods insertVertex() and insertEdge(int, int).
We assume that vertices are labeled with integers starting with zero at the time of their creation.
insertVertex() method is almost identical to the regular insert method in linked lists.
insertVertex() METHOD
    public void insertVertex() {
        Vertex v = new Vertex();
        v.setLabel(vertexLabel);
        if(vertexLabel == 0)
            head = v;
        else
            tail.setNextVertex(v);
        tail = v;
        vertexLabel++;
    }
insertEdge(int,int) METHOD
The second method, insertEdge(int u, int v), takes two integers as arguments, representing the labels of the vertices we want to join.
First of all, we have to check that both vertices exist. If not, the edge cannot be inserted. Otherwise, a new edge node is created and labeled with v.
Finally, the newly created edge node is appended to the linked list of adjacent vertices of the vertex node with label u.
insertEdge(int,int) METHOD
    public void insertEdge(int u, int v) {
        Vertex base_u = find(u);
        Vertex base_v = find(v);
        if(base_u == null || base_v == null)
            System.out.println("Cannot insert edge between vertices " + u + " and " + v);
        else {
            Edge e = new Edge();
            e.setLabel(v);
            if(base_u.getNextEdge() == null)
                base_u.setNextEdge(e);
            else {
                Edge temp = base_u.getNextEdge();
                while(temp.getNextEdge() != null)
                    temp = temp.getNextEdge();
                temp.setNextEdge(e);
            }
        }
    }
Graph.java
class Graph {
    private int vertexLabel;
    private Vertex head;
    private Vertex tail;

    public Graph() {
        vertexLabel = 0;
        head = tail = null;
    }

    public int getSize() {
        return vertexLabel;
    }
Graph.java
    public void insertVertex() {
        Vertex v = new Vertex();
        v.setLabel(vertexLabel);
        if(vertexLabel == 0)
            head = v;
        else
            tail.setNextVertex(v);
        tail = v;
        vertexLabel++;
    }
    public void insertEdge(int u, int v) {
        Vertex base_u = find(u);
        Vertex base_v = find(v);
        if(base_u == null || base_v == null)
            System.out.println("Cannot insert edge between vertices " + u + " and " + v);
        else {
            Edge e = new Edge();
            e.setLabel(v);
            if(base_u.getNextEdge() == null)
                base_u.setNextEdge(e);
            else {
                Edge temp = base_u.getNextEdge();
                while(temp.getNextEdge() != null)
                    temp = temp.getNextEdge();
                temp.setNextEdge(e);
            }
        }
    }
Graph.java
    public Vertex find(int x) {
        Vertex base = head;
        while(base != null && base.getLabel() != x)
            base = base.getNextVertex();
        return base;
    }
    public void print() {
        int vertex, edge;
        Vertex baseVertex = head;
        while(baseVertex != null) {
            vertex = baseVertex.getLabel();
            System.out.print("Vertex: " + vertex + " Edges: ");
            Edge baseEdge = baseVertex.getNextEdge();
            while(baseEdge != null) {
                edge = baseEdge.getLabel();
                System.out.print("(" + vertex + "," + edge + ") ");
                baseEdge = baseEdge.getNextEdge();
            }
            System.out.println();
            baseVertex = baseVertex.getNextVertex();
        }
    }
}
TestGraph.java
class TestGraph {
    public static void main(String args[]) {
        Graph g = new Graph();
        for(int i = 0; i < 4; i++)
            g.insertVertex();
        g.insertEdge(0, 1);
        g.insertEdge(0, 2);
        g.insertEdge(1, 2);
        g.insertEdge(1, 3);
        g.insertEdge(3, 0);
        g.insertEdge(3, 1);
        g.insertEdge(3, 2);
        g.insertEdge(4, 4);    // vertex 4 does not exist : prints an error message
        g.print();
    }
}
GRAPH TRAVERSALS
To traverse a graph means to visit all the vertices in some systematic order. Here we will consider Depth First Search ( DFS ) and Breadth First Search ( BFS ).
Depth First Search ( DFS )
DFS is closely related to preorder traversal of a tree. Recall that preorder traversal visits each node before its children.
Preorder(vertex v)
1  visit(v);
2  for(each child w of v)
3      Preorder(w);
DFS
To turn this into a graph traversal algorithm, we basically replace "child" by "neighbor" (adjacent vertex).
To prevent infinite recursion, we want to visit each vertex once. Here is the algorithm in pseudocode:
DFS(G, source)
1  mark all vertices as not visited;
2  traverse(vertex v)
3      mark v as visited;
4      for each neighbor of v
5          if neighbor is not marked as visited
6              traverse(neighbor);
DFS DEMO
[figure: DFS on a sample graph with vertices 0-5; visit order : 0 1 3 4 5 2]
IMPLEMENTATION
    public void dfs() {
        System.out.print("dfs: ");
        boolean visited[] = new boolean[vertexLabel];
        for(int i = 0; i < vertexLabel; i++)
            if(!visited[i])
                traverse(i, visited);
        System.out.println();
    }
IMPLEMENTATION
    private void traverse(int i, boolean visited[]) {
        if(!visited[i]) {
            Vertex v = find(i);
            visited[i] = true;
            Edge adjacent = v.getNextEdge();
            while(adjacent != null) {
                int index = adjacent.getLabel();
                System.out.print("(" + i + " " + index + ") ");
                adjacent = adjacent.getNextEdge();
                traverse(index, visited);
            }
        }
    }
IMPLEMENTATION
        Graph t = new Graph();
        for(int i = 0; i < 5; i++)
            t.insertVertex();
        t.insertEdge(1, 0);
        t.insertEdge(1, 3);
        t.insertEdge(3, 2);
        t.insertEdge(3, 4);
        t.dfs();
BFS
BFS is a traversal through a graph that finds all the vertices that are reachable from the source vertex.
The order of traversal is such that the algorithm explores all of the neighbors of a vertex before proceeding on the neighbors of its neighbors.
A vertex is discovered the first time it is encountered by the algorithm. A vertex is finished after all of its neighbors are explored.
BFS
The algorithm in pseudocode is :
BFS(G, source)
1  create an empty queue Q;
2  label all vertices as not-visited;
3  mark the source vertex as visited and place it in Q;
4  while(Q is not empty)
5      remove the head from Q;
6      mark it as visited;
7      place its neighbors in the queue;
Class Queue requires this method :
    public int getDataAtHead() {
        return head.getData();
    }
BFS DEMO
[figure: BFS on the same sample graph with vertices 0-5; visit order : 0 1 2 3 4 5]
Q = {} → {0} → {1, 2} → {2, 3, 4, 5} → {3, 4, 5} → {4, 5} → {5} → {}
IMPLEMENTATION
    public void bfs() {
        System.out.print("bfs: ");
        boolean visited[] = new boolean[vertexLabel];
        Queue q = new Queue();
        for(int i = 0; i < vertexLabel; i++)
            if(!visited[i]) {
                q.enqueue(i);
                do {
                    int entry = q.getDataAtHead();
                    if(visited[entry])
                        q.dequeue();
                    else {
                        visited[entry] = true;
                        q.dequeue();
                        Vertex v = find(entry);
                        Edge adjacent = v.getNextEdge();
                        while(adjacent != null) {
                            int index = adjacent.getLabel();
                            System.out.print("(" + entry + " " + index + ") ");
                            if(visited[index] == false)
                                q.enqueue(index);
                            adjacent = adjacent.getNextEdge();
                        }
                    }
                } while(!q.isEmpty());
            }
        System.out.println();
    }
SHORTEST PATHS
A weighted graph is a graph that has a nonnegative integer ( or real number ) weight associated with each edge.
Let G = ( V, E ) be a weighted directed graph and P = { v0, v1, ..., vk } a directed path in G, where wi denotes the weight of edge ( vi, vi+1 ) :
v0 -w0-> v1 -w1-> v2 -> ... -w(k-1)-> vk
We define the weight of path P as the sum of the weights of its constituent edges : w(P) = w0 + w1 + ... + wk-1
SHORTEST PATHS
The shortest path between two vertices u and v is then defined as a path between vertices u and v with minimum weight.
The single-source shortest-path problem is the problem of finding shortest paths between the specified vertex (source) and any other vertex in a weighted directed graph.
Dijkstra's algorithm solves the single-source shortest-path problem for the case in which all edges have nonnegative weights.
DIJKSTRA’S ALGORITHM
Edsger Wybe Dijkstra ( 1930-2002 )
Dutch computer scientist, Turing Award ( 1972 ), ACM ( 2002 )
Graph algorithms, programming languages, operating systems
“Referring to computing as computer science is like calling surgery a knife science”
DIJKSTRA’S ALGORITHM
Solves the single-source shortest-paths problem for the case in which all edges have nonnegative weights. The algorithm uses the following procedures:
Initialization of a single source
The distance of the source vertex is set to zero. All other distances are set to infinity.
d[v] denotes the distance from the source vertex and
p[v] denotes the predecessor of vertex v along the shortest path from source.
INITIALIZATION
In pseudo code initialization looks like :
Init(G, s)
1  for each vertex v in G
2      set d[v] to INFINITY
3      set p[v] to NIL
4  set d[s] to 0
DIJKSTRA’S ALGORITHM
Relaxation
The process of relaxing an edge ( u, v ) consists of testing whether we can improve the shortest path to v found so far by going through u, and if so, updating d[v] and p[v].
[figure: vertices s, u, v with d[u] = 5, w(u, v) = 2 and d[v] = 9; since d[u] + w(u, v) = 7 < d[v], d[v] is updated to 7]
RELAXATION
In pseudo code relaxation looks like :
Relax(u, v, w)
1  if(d[v] > d[u] + w(u, v))
2      d[v] = d[u] + w(u, v)
3      p[v] = u
DIJKSTRA’S ALGORITHM
Algorithm maintains a set S of vertices whose final shortest path weights from the source have already been determined.
The algorithm repeatedly selects the vertex u from V - S with the minimum shortest-path estimate, inserts u into S and relaxes all edges leaving u.
DIJKSTRA’S ALGORITHM
In pseudo code algorithm looks like :
DIJKSTRA(G, w, s)
1  Init(G, s)
2  S is empty
3  Q = V
4  while(Q is not empty)
5      u = extractMIN(Q)
6      add u to S
7      for each vertex v adjacent to u
8          relax(u, v, w)
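As a sketch, the pseudocode above can be turned into runnable Java. To keep it self-contained, this version assumes an adjacency-matrix input ( w[u][v] = -1 meaning no edge, an assumption of this sketch ) and uses a linear scan for extractMIN instead of a priority queue:

```java
import java.util.Arrays;

class Dijkstra {
    static final int INF = Integer.MAX_VALUE;

    // Returns d[], the shortest-path weights from source s.
    // w[u][v] is the nonnegative weight of edge (u, v), or -1 if absent.
    static int[] shortestPaths(int[][] w, int s) {
        int n = w.length;
        int[] d = new int[n];
        boolean[] inS = new boolean[n];      // the set S of finished vertices
        Arrays.fill(d, INF);                 // Init(G, s)
        d[s] = 0;
        for (int k = 0; k < n; k++) {
            int u = -1;                      // extractMIN(Q) by linear scan
            for (int v = 0; v < n; v++)
                if (!inS[v] && (u == -1 || d[v] < d[u])) u = v;
            if (d[u] == INF) break;          // remaining vertices unreachable
            inS[u] = true;                   // add u to S
            for (int v = 0; v < n; v++)      // relax all edges leaving u
                if (w[u][v] >= 0 && d[u] + w[u][v] < d[v])
                    d[v] = d[u] + w[u][v];
        }
        return d;
    }
}
```

With a binary-heap priority queue the running time improves from O(V^2) to O(E log V); the linear scan is chosen here only for brevity.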
EXAMPLE
[figure: weighted directed graph with six vertices 0-5; source is vertex 0; the trace below shows the shortest-path estimates in parentheses]
S = {}                    Q = {0(0), 1(INF), 2(INF), 3(INF), 4(INF), 5(INF)}
S = {0}                   Q = {1(6), 2(2), 3(INF), 4(INF), 5(INF)}
S = {0, 2}                Q = {1(6), 3(INF), 4(3), 5(INF)}
S = {0, 2, 4}             Q = {1(4), 3(INF), 5(5)}
S = {0, 2, 4, 1}          Q = {3(7), 5(5)}
S = {0, 2, 4, 1, 5}       Q = {3(7)}
S = {0, 2, 4, 1, 5, 3}    Q = {}
JAVA DATA STRUCTURES
Java provides several interfaces ( collections ) that handle dynamic structures : List <E>, Map <K, V>, Set <E> and Queue <E>
Here we will take a look at List ( LinkedList and ArrayList ).
For more information visit :http://docs.oracle.com/javase/tutorial/collections/interfaces/index.html
interface List <E>
Defined in the java.util package, the List interface represents a dynamic collection of objects.
It also provides a ListIterator that allows element insertion and replacement.
Some of the known implementing classes are ArrayList, LinkedList, Stack and Vector.
JavaList.java
import java.util.List;
import java.util.LinkedList;
import java.util.Collections;

class JavaList {
    public static void main(String args[]) {
        LinkedList<Integer> l = new LinkedList<Integer>();
        l.add(3); l.add(1); l.add(5); l.add(4); l.add(2);
        System.out.println(l);

        l.remove();        // removes the head
        l.removeLast();
        System.out.println(l);

        l.remove(1);       // removes the element at index 1
        System.out.println(l);

        l.add(12); l.add(6); l.add(-2);
        System.out.println(l);

        Collections.sort(l);
        System.out.println(l);

        Collections.reverse(l);
        System.out.println(l);

        Integer[] a = l.toArray(new Integer[l.size()]);
        System.out.println(a);    // prints the array reference, not its contents

        int[] n = new int[a.length];
        for(int i = 0; i < a.length; i++)
            n[i] = (int)a[i];
        System.out.println(n[0]);
    }
}
ARRAYLIST
ArrayList is another implementation of the List interface.
LinkedList implements List with a doubly linked list.
ArrayList implements List with a dynamically resizing array.
When the original capacity is exceeded, a new array ( double in size ) is created and the original elements are copied to it.
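The resizing idea can be sketched with a hand-rolled dynamic array. This is a simplification for illustration ( the real ArrayList internals differ in details such as the exact growth factor and generics support ):

```java
import java.util.Arrays;

// Minimal sketch of a dynamically resizing array of ints : when capacity
// is exceeded, a new array double the size is created and elements copied.
class DynArray {
    private int[] data = new int[2];
    private int size = 0;

    public void add(int x) {
        if (size == data.length)                          // capacity exceeded
            data = Arrays.copyOf(data, 2 * data.length);  // double and copy
        data[size++] = x;
    }

    public int get(int i) { return data[i]; }

    public int size() { return size; }
}
```

Appending n elements triggers only O(log n) resizes and O(n) total copying, which is why add() runs in amortized O(1) time.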
Deal.java
import java.util.*;

class Deal {
    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("Usage: Deal hands cards");
            return;
        }
        int numHands = Integer.parseInt(args[0]);
        int cardsPerHand = Integer.parseInt(args[1]);

        // Make a normal 52-card deck.
        String[] suit = new String[] { "spades", "hearts", "diamonds", "clubs" };
        String[] rank = new String[] { "ace", "2", "3", "4", "5", "6", "7",
                                       "8", "9", "10", "jack", "queen", "king" };
        List<String> deck = new ArrayList<String>();
        for (int i = 0; i < suit.length; i++) {
            for (int j = 0; j < rank.length; j++) {
                deck.add(rank[j] + " of " + suit[i]);
            }
        }

        // Shuffle the deck.
        Collections.shuffle(deck);
        if (numHands * cardsPerHand > deck.size()) {
            System.out.println("Not enough cards.");
            return;
        }
        for (int i = 0; i < numHands; i++)
            System.out.println(dealHand(deck, cardsPerHand));
    }

    public static <E> List<E> dealHand(List<E> deck, int n) {
        int deckSize = deck.size();
        List<E> handView = deck.subList(deckSize - n, deckSize);
        List<E> hand = new ArrayList<E>(handView);
        handView.clear();
        return hand;
    }
}
ALGORITHM DESIGN
Fundamental algorithm design techniques are :
Divide and Conquer : Merge Sort, Towers of Hanoi
Dynamic Programming : The Longest Common Subsequence
Greedy : Activity Selection Problem, Minimum Spanning Trees
DIVIDE AND CONQUER
Divide and Conquer is a paradigm for designing algorithms which involves three steps at each level of recursion :
Divide the problem into a number of subproblems.
Conquer the subproblems by solving them recursively.
Combine the solutions to the subproblems into the solution of the original problem.
MERGE SORT
Merge Sort is a divide-and-conquer sorting algorithm which operates, intuitively, as follows :
Divide step : divide the n-element sequence to be sorted into two subsequences of n / 2 elements each.
Conquer step : sort the two subsequences recursively.
Combine step : merge the two sorted subsequences to produce the sorted answer.
MERGE SORT
[figure: the sequence 6 3 1 7 8 2 5 is repeatedly split into halves, the halves are sorted recursively, and merging produces 1 2 3 5 6 7 8]
MergeSort.java
class MergeSort {
    private int[] a;

    public MergeSort(int[] a) {
        this.a = a;
    }

    public void sort() {
        divide(0, a.length - 1);
    }

    public void divide(int p, int r) {
        int q = (p + r) / 2;
        if(p < r) {
            divide(p, q);
            divide(q + 1, r);
            merge(p, q, r);
        }
    }

    public void merge(int p, int q, int r) {
        int merged[] = new int[r - p + 1];
        int i = p;
        int j = q + 1;
        int k = 0;
        while(i <= q && j <= r) {
            if(a[i] <= a[j]) {
                merged[k] = a[i];
                i++;
            } else {
                merged[k] = a[j];
                j++;
            }
            k++;
        }
        while(i <= q) {
            merged[k] = a[i];
            i++;
            k++;
        }
        while(j <= r) {
            merged[k] = a[j];
            j++;
            k++;
        }
        for(i = p; i <= r; i++)
            a[i] = merged[i - p];
    }

    public void print() {
        for(int i = 0; i < a.length; i++)
            System.out.print(a[i] + " ");
        System.out.println();
    }
}
TestMergeSort.java
class TestMergeSort {
    public static void main(String[] args) {
        int[] a = { 6, 3, 1, 7, 8, 2, 5 };
        MergeSort ms = new MergeSort(a);
        ms.print();
        ms.sort();
        ms.print();
    }
}
ANALYSIS
The running time of the recursive function divide() is :
T(n) = 1                        if n = 1
T(n) = 2T(n / 2) + Θ(merge)     if n > 1
Since Θ(merge) is Θ(n), we have :
T(n) = 1                        if n = 1
T(n) = 2T(n / 2) + Θ(n)         if n > 1
By application of Master Theorem we find that the running time of Merge Sort is : T(n) = Θ(n log n)
TOWERS OF HANOI
In the puzzle Towers of Hanoi the task is to move n disks from peg source to peg dest using auxiliary peg aux.
Only one disk can be moved at a time, and a bigger disk is not allowed to be placed on top of a smaller disk.
TOWERS OF HANOI
Our task is to write method move() which solves the puzzle. The task can be summarized as :
move(DISKS, 1, 3, 2);
where 1 denotes the source peg, 3 destination peg, 2 auxiliary peg and DISKS is the number of disks to be moved.
Solution is recursive and focuses on the hard step ( how to move the bottom disk from source to destination? ), not an easy one ( where to move the disk on top? )
TOWERS OF HANOI
The only way to move the bottom disk from peg 1 to peg 3 is to have the remaining disks on peg 2 :
So the original task of moving 3 disks from 1 to 3 is divided into two smaller tasks : moving two disks from 1 to 2 using 3 and moving those two disks from 2 to 3 using 1.
TOWERS OF HANOI
This can be summarized in code as:
move(2, 1, 2, 3);
System.out.println("Move disk from 1 to 3");
move(2, 2, 3, 1);
Assuming that the original call was move(3, 1, 3, 2).
The rest of the problem is done essentially in the same way - the beauty of recursion!
Since the number of disks is smaller by one in each recursive call, recursion stops when number of disks becomes zero.
Towers.java
class Towers {
    private int DISKS;

    public Towers(int n) {
        DISKS = n;
    }

    public void solve() {
        move(DISKS, 1, 3, 2);
    }
Towers.java
    public void move(int n, int source, int dest, int aux) {
        if(n > 0) {
            move(n - 1, source, aux, dest);
            System.out.println("Move disk from " + source + " to " + dest);
            move(n - 1, aux, dest, source);
        }
    }
}
TestTowers.java
class TestTowers {
    public static void main(String[] args) {
        Towers t = new Towers(3);
        t.solve();
    }
}
ANALYSIS
It can be shown that recursive solution to Towers of Hanoi puzzle produces the best possible solution.
The running time satisfies the recurrence T(n) = 2T(n - 1) + 1. Solving this recurrence gives T(n) = 2^n - 1, i.e. O(2^n).
For 64 disks, assuming that a disk can be moved in one second, it would take about 5 × 10^11 years to solve the puzzle. The age of the universe is estimated to be 10 billion ( 10^10 ) years.
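The closed form 2^n - 1 behind these figures can be checked against the recurrence directly; this sketch uses BigInteger because 2^64 - 1 does not fit in a long:

```java
import java.math.BigInteger;

// Checks that the recurrence T(n) = 2T(n - 1) + 1, with T(0) = 0,
// agrees with the closed form T(n) = 2^n - 1.
class HanoiCount {
    static BigInteger closedForm(int n) {
        return BigInteger.valueOf(2).pow(n).subtract(BigInteger.ONE);
    }

    static BigInteger recurrence(int n) {
        if (n == 0) return BigInteger.ZERO;
        return recurrence(n - 1).multiply(BigInteger.valueOf(2))
                                .add(BigInteger.ONE);
    }

    public static void main(String[] args) {
        System.out.println(closedForm(3));   // 7 moves for three disks
        System.out.println(closedForm(64));  // 2^64 - 1 moves for 64 disks
    }
}
```

At one move per second, closedForm(64) seconds is indeed on the order of 5 × 10^11 years.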
DYNAMIC PROGRAMMING
Dynamic programming paradigm is similar to divide and conquer - it solves problems by combining solutions to subproblems.
The major difference is that subproblems in dynamic programming are dependent, while they are not in the case of divide and conquer.
This technique is typically applied to optimization problems - finding the optimal solution among many solutions.
DYNAMIC PROGRAMMING
Typical steps in dynamic programming involve :
Characterize the structure of an optimal solution.
Recursively define the value of an optimal solution.
Compute the value of an optimal solution.
Construct an optimal solution from computed information.
THE LONGEST COMMON SUBSEQUENCE
Before we formally define the LCS problem, we introduce substring and subsequence problems since they all belong to string matching problems.
In its simplest form, the string matching algorithm takes as input two strings, pattern and text, and reports if pattern is contained in text.
Although there are many solutions to string matching problem, here we consider the solution based on finite state automata.
FINITE STATE AUTOMATA
A finite state automaton M is a quintuple M = ( Q, q0, A, Σ, δ ) where :
Q is a finite set of states
q0 is the initial or start state
A ⊆ Q is a distinguished set of accepting states
Σ is a finite alphabet
δ is a function from Q × Σ → Q, called the transition function of M.
FINITE STATE AUTOMATA
Finite state automata can be depicted by their transition state diagrams. The one that follows accepts strings that contain the “abc” pattern :
[figure: states 0-3; transitions a, b, c advance 0 → 1 → 2 → 3; edges labeled *-a, *-b, *-c lead back, and state 3 loops on * and is accepting]
SUBSTRING PROBLEM
[figure: pattern “abc” slid along a text; a match requires the pattern letters to appear contiguously in the text]
Substring.java
class Substring {
    public static void main(String[] args) {
        if(isSubstring(args[0], args[1]))
            System.out.println(args[1] + " contains pattern " + args[0]);
        else
            System.out.println(args[1] + " does not contain pattern " + args[0]);
    }

    public static boolean isSubstring(String pattern, String text) {
        int i = 0;
        int j = 0;
        while(i < text.length() && j < pattern.length()) {
            if(text.charAt(i) == pattern.charAt(j)) {
                i++;
                j++;
            } else {
                i = i - j + 1;   // restart just after the previous starting position
                j = 0;
            }
        }
        return j == pattern.length();
    }
}
SUBSEQUENCE TEST
Subsequence testing is somewhat similar to substring testing. We say that a pattern is a subsequence of a text if the letters of the pattern appear in order, possibly separated.
Here, pattern “abc” is not a substring of “abxcy”, but it is a subsequence.
As in the case of substrings, we use the finite automata approach. The automaton that follows accepts strings that contain the “abc” subsequence :
[figure: states 0-3; transitions a, b, c advance 0 → 1 → 2 → 3, while edges labeled *-a, *-b, *-c loop on each state; state 3 is accepting]
Subsequence.java
class Subsequence {
    public static void main(String[] args) {
        if(isSubsequence(args[0], args[1]))
            System.out.println(args[1] + " contains sequence " + args[0]);
        else
            System.out.println(args[1] + " does not contain sequence " + args[0]);
    }

    public static boolean isSubsequence(String pattern, String text) {
        int i = 0;
        int j = 0;
        while(i < text.length() && j < pattern.length()) {
            if(text.charAt(i) == pattern.charAt(j))
                j++;
            i++;
        }
        return j == pattern.length();
    }
}
LONGEST COMMON SUBSEQUENCE
Consider pattern “abc” and text “bxycd”. Pattern “abc” is neither a substring nor a subsequence of “bxycd”.
The longest common subsequence of pattern and text in this case is “bc”.
[figure: pattern “abc” and text “bxycd” with the common subsequence “bc” highlighted]
In the LCS problem, pattern and text have symmetric roles, and will therefore be referred to simply as string A and string B.
LONGEST COMMON SUBSEQUENCE
The major importance of the LCS algorithm is in :
DNA sequencing : Genes are typically represented as sequences of the four letters ACGT, corresponding to the four submolecules forming DNA. Similarity between two genes is then determined by computing the length of their LCS.
File comparison : The UNIX diff utility compares two files by finding the LCS between the lines of the two files.
SOLUTION TO LCS
As a first step, we will try to develop a recursive solution. Here are some simple facts :
If the two strings start with the same letter, it is safe to choose that letter as the first character of the subsequence. For example, if A = “abc” and B = “acd”, the first letters match, and the remaining subproblem is the pair “bc” and “cd” :
[figure: the matching letter a is taken into the subsequence; the remaining subproblem is LCS(“bc”, “cd”)]
SOLUTION TO LCS
Suppose the first two characters differ. It is not possible for both of them to be part of a common subsequence : one or the other ( or both ) will have to be removed :
[figure: when the first characters differ, the subproblems are the LCS with the first character of A removed and the LCS with the first character of B removed]
SOLUTION TO LCS
Here is the code based on the recursive algorithm :
    public int lcs(String A, String B, int i, int j) {
        if(i == A.length() || j == B.length())
            return 0;
        else if(A.charAt(i) == B.charAt(j))
            return 1 + lcs(A, B, i + 1, j + 1);
        else
            return max(lcs(A, B, i + 1, j), lcs(A, B, i, j + 1));
    }
SOLUTION TO LCS
The worst-case scenario is when the strings are disjoint. The expensive line is then :
    return max(lcs(A, B, i + 1, j), lcs(A, B, i, j + 1));
yielding the recurrence T(n) = 2T(n - 1), whose solution is T(n) = O(2^n). Not good.
In order to improve efficiency, dynamic programming uses memoization, i.e. intermediate values are stored in an array.
We define L[i][j] to be the length of the LCS of Ai and Bj as :
L[i][j] = 0                               if i = 0 or j = 0
L[i][j] = L[i - 1][j - 1] + 1             if i, j > 0 and A[i] = B[j]
L[i][j] = max(L[i][j - 1], L[i - 1][j])   if i, j > 0 and A[i] ≠ B[j]
SOLUTION TO LCS
Once the algorithm terminates, L[m][n] stores the LCS value, where m and n are A’s and B’s lengths respectively.
SOLUTION TO LCS
Based on the dynamic programming algorithm, this is how the matrix would be filled for A = “abc” and B = “bebdc” :
        b  e  b  d  c
    0   0  0  0  0  0
a   0   0  0  0  0  0
b   0   1  1  1  1  1
c   0   1  1  1  1  2
LCS.java
class LCS {
    public static void main(String[] args) {
        int n = lcs(args[0], args[1]);
        System.out.println("LCS: " + n);
    }

    public static int lcs(String A, String B) {
        int m = A.length();
        int n = B.length();
        int[][] L = new int[m + 1][n + 1];
        for(int i = 0; i <= m; i++)
            L[i][0] = 0;
        for(int j = 0; j <= n; j++)
            L[0][j] = 0;
        for(int i = 0; i < m; i++) {
            for(int j = 0; j < n; j++) {
                if(A.charAt(i) == B.charAt(j))
                    L[i + 1][j + 1] = L[i][j] + 1;
                else
                    L[i + 1][j + 1] = max(L[i + 1][j], L[i][j + 1]);
            }
        }
        return L[m][n];
    }

    public static int max(int x, int y) {
        if(x > y)
            return x;
        else
            return y;
    }
}
GREEDY ALGORITHMS
An important class of techniques. Roughly, a globally optimal solution can be obtained by making locally optimal ( greedy ) choices.
Activity Selection Problem
Each activity has an associated start time si and finish time fi, such that si < fi.
Two activities i and j are compatible if si ≥ fj or sj ≥ fi.
Given a set of activities S = { 1, 2, ..., n }, find a maximum-size subset A of compatible activities.
ACTIVITY SELECTION PROBLEM
[figure: five activities drawn as intervals ( si, fi ) on a timeline from 0 to 11; the selected compatible activities are 1, 3 and 5]
ActivitySelector.java
class ActivitySelector {
    public static void main(String[] args) {
        int[] s = {1, 3, 5, 3, 8};
        int[] f = {4, 5, 7, 8, 11};    // sorted by finish time
        LinkedList l = selectActivity(s, f);
        l.print();
    }

    public static LinkedList selectActivity(int[] s, int[] f) {
        LinkedList l = new LinkedList();
        l.insert(1);
        int j = 0;                     // index of the last selected activity
        for(int i = 1; i < s.length; i++) {
            if(s[i] >= f[j]) {
                l.insert(i + 1);
                j = i;
            }
        }
        return l;
    }
}
ELEMENTS OF GREEDY STRATEGY
It can be proved that Activity Selector always produces optimal solution ( maximum size ).
In many instances, greedy algorithms do not generate optimal solution.
If we applied greedy strategy to the problem of computing the shortest path between A and C, it wouldn’t work. Why?
[figure: graph with vertices A, B, C, D and edge weights 10, 100, 150, 10; the greedy choice of the cheaper first edge out of A leads to a more expensive path to C]
ELEMENTS OF GREEDY STRATEGY
In order to design a greedy algorithm, the problem must exhibit :
The Greedy Choice Property : a proof is required that a greedy choice at each step yields an optimal solution.
Optimal Substructure : a problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solutions to subproblems.
MINIMUM SPANNING TREES
A spanning tree T of a graph G is a tree that connects all of its vertices.
[figure: a weighted graph G with edge weights 2, 3, 4, 5, 6, 8, 9, 10, 14, 15, and two of its spanning trees : T1 ( weight : 55 ) and T2 ( weight : 59 )]
A minimum spanning tree is a spanning tree whose total weight is minimal.
PRIM’S ALGORITHM
Input : G = 〈 V, E 〉, a weighted connected undirected graph with n vertices.
Output : T, a minimum spanning tree of G.
Choose an arbitrary vertex u. Find the vertex v adjacent to u such that w(u, v) is minimal. Place that edge into tree T.
Continue adding minimum-weight edges into tree T that are incident to vertices already in the tree and do not form a circuit.
Stop when the number of edges in T is n - 1.
PRIM’S ALGORITHM
[figure: Prim's algorithm run on the weighted graph from the previous example, with edge weights 2, 3, 4, 5, 6, 8, 9, 10, 14, 15]
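The steps above can be sketched in Java. This version assumes an adjacency-matrix input ( w[u][v] = -1 meaning no edge, a convention of this sketch ), requires the graph to be connected, and returns only the total tree weight:

```java
import java.util.Arrays;

class PrimSketch {
    static final int NONE = -1;

    // Returns the total weight of a minimum spanning tree.
    // w[u][v] is the weight of undirected edge (u, v), or NONE if absent.
    static int mstWeight(int[][] w) {
        int n = w.length;
        boolean[] inTree = new boolean[n];
        int[] best = new int[n];             // cheapest edge into the tree
        Arrays.fill(best, Integer.MAX_VALUE);
        best[0] = 0;                         // start from arbitrary vertex 0
        int total = 0;
        for (int k = 0; k < n; k++) {
            int u = NONE;                    // pick the cheapest fringe vertex
            for (int v = 0; v < n; v++)
                if (!inTree[v] && (u == NONE || best[v] < best[u])) u = v;
            inTree[u] = true;
            total += best[u];
            for (int v = 0; v < n; v++)      // update fringe edge costs
                if (!inTree[v] && w[u][v] != NONE && w[u][v] < best[v])
                    best[v] = w[u][v];
        }
        return total;
    }
}
```

Only edges incident to the growing tree are ever considered, which is exactly the invariant described in the steps above; a heap would reduce the O(n^2) scans.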
KRUSKAL’S ALGORITHM
Input : G = 〈 V, E 〉, a weighted connected undirected graph with n vertices.
Output : T, a minimum spanning tree of G.
Choose an edge with the minimum weight.
Continue adding minimum-weight edges into tree T that do not form a circuit.
Stop when the number of edges in T is n - 1.
KRUSKAL’S ALGORITHM
[figure: Kruskal's algorithm run on the same weighted graph, with edge weights 2, 3, 4, 5, 6, 8, 9, 10, 14, 15]
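Kruskal's steps can likewise be sketched in Java. Here edges are assumed to be given as { u, v, weight } triples ( a convention of this sketch ), the graph is assumed connected, and a simple union-find structure detects circuits:

```java
import java.util.Arrays;

class KruskalSketch {
    // Returns the total weight of a minimum spanning tree of a connected
    // graph with n vertices, given edges as {u, v, weight} triples.
    static int mstWeight(int n, int[][] edges) {
        int[] parent = new int[n];           // union-find to detect circuits
        for (int i = 0; i < n; i++) parent[i] = i;
        Arrays.sort(edges, (a, b) -> Integer.compare(a[2], b[2]));  // cheapest first
        int total = 0, used = 0;
        for (int[] e : edges) {
            int ru = find(parent, e[0]);
            int rv = find(parent, e[1]);
            if (ru != rv) {                  // edge does not form a circuit
                parent[ru] = rv;             // merge the two components
                total += e[2];
                if (++used == n - 1) break;  // tree is complete
            }
        }
        return total;
    }

    static int find(int[] parent, int x) {
        while (parent[x] != x) x = parent[x];
        return x;
    }
}
```

Unlike Prim's algorithm, the partial structure here may be a forest at intermediate steps; the union-find check is what guarantees no circuit is ever created.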