SORT AND SEARCH
Where did I leave my keys anyway?
2
Sorting
Searching and sorting are two of the most used and most studied types of algorithms in computing.
In the early 1970s, approximately 80% of computer usage was spent either sorting data or searching data.
Examples:
  Searching for a login name or PIN in a database of IDs
  Searching for a web site on the internet
  Sorting a list of potential stock purchases based on some performance criterion
  Producing a bank statement with transactions sorted by date
3
What is the “best” sorting algorithm?
The “best” algorithm will be defined as the algorithm that
  executes the quickest for a given problem, OR
  uses the least amount of memory for a given problem.
We are usually concerned with the “quickest” solution.
Given two algorithms A and B (which do the same thing but in different ways), how can you tell which one is quicker?

  long start = System.currentTimeMillis();
  CodeToTestGoesHere();
  long finish = System.currentTimeMillis();
  long totalTimeInMilliseconds = finish - start;
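The timing snippet above can be wrapped in a small helper so that two algorithms are measured on identical input. The following is only a sketch: the class name `TimingSketch`, the helper `timeMillis`, and the choice of a hand-written insertion sort (A) versus `Arrays.sort` (B) are assumptions made for illustration. It uses `System.nanoTime()`, which is generally better suited to measuring elapsed time than `currentTimeMillis()`.

```java
import java.util.Arrays;
import java.util.Random;

final class TimingSketch {
    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long finish = System.nanoTime();
        return (finish - start) / 1_000_000;
    }

    /** Algorithm A (chosen for illustration): a plain insertion sort. */
    static void insertionSort(int[] data) {
        for (int i = 1; i < data.length; i++) {
            int currentValue = data[i];
            int j = i - 1;
            while (j >= 0 && data[j] > currentValue) {
                data[j + 1] = data[j];
                j--;
            }
            data[j + 1] = currentValue;
        }
    }

    public static void main(String[] args) {
        // Both algorithms get their own copy of the same random input.
        int[] original = new Random(42).ints(20_000).toArray();
        int[] a = original.clone();
        int[] b = original.clone();

        long timeA = timeMillis(() -> insertionSort(a));
        long timeB = timeMillis(() -> Arrays.sort(b)); // Algorithm B: library sort

        System.out.println("A (insertion sort): " + timeA + " ms");
        System.out.println("B (Arrays.sort):    " + timeB + " ms");
    }
}
```

The printed numbers depend on the machine, the JVM, and the input, so a single run says little; repeating the measurement and taking the median gives a steadier figure.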
4
Experimental vs. Analytical
Experimental
  Must actually implement the algorithm
  Results are system dependent (language, OS, hardware)
  Results are for the tested input only. Testing does not “prove” anything about data that hasn’t been tested.
Analytical
  Do not need to implement the algorithm
  Results are not dependent on a particular system
  Conclusions can be determined for all cases without performing any tests
5
Gentle intro to analysis
Each CPU operation takes a certain amount of time:
  arithmetic operations (+, -, /, *, %)
  comparisons (<, ==, >, <=, >=, !=)
  assignment
  method calls and method returns
  array access
  member access
These operations may take different lengths of time, but assume (for simplicity) that each operation takes the same amount of time to complete.
The length of time a program will take to execute is then given by the number of operations multiplied by the length of time per operation:

$$\text{TotalTime}_{\text{program}} = N_{\text{op}} \times \text{Time}_{\text{op}}$$
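As a worked example of this formula, suppose a program executes two billion operations and each operation takes one nanosecond. Both figures are hypothetical, picked only to keep the arithmetic simple:

```latex
\text{TotalTime}_{\text{program}}
  = N_{\text{op}} \times \text{Time}_{\text{op}}
  = (2 \times 10^{9}) \times (10^{-9}\ \text{s})
  = 2\ \text{s}
```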
6
Analysis
Analysis involves counting each operation of a program.
It usually involves “worst-case” and “best-case” counts.
Some handy math notation and terms:

$$\sum_{k=1}^{10} P_k = P_1 + P_2 + \cdots + P_{10}$$

$$\sum_{k=1}^{10} k = 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10$$

$$\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$$
7
Analytic example
Consider the following method. How long does it take?

  // computes the sum of all numbers
  // between 1 and N
  public int sumNumbers(int n) {
      int sum = 0;
      for (int i = 1; i <= n; i++) {
          sum += i;
      }
      return sum;
  }
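One way to answer the question, under the earlier simplifying assumption that every operation costs the same, is to count operations by hand: 2 initial assignments, n + 1 loop comparisons, n additions, n increments, and 1 return, for a total of 3n + 4. The sketch below checks that count by instrumenting the loop. The class name `OperationCount` is hypothetical, and the way the loop is split into operations (treating `sum += i` and `i++` as one operation each) is an assumption made here, not something the slides fix.

```java
final class OperationCount {
    /** Returns how many operations sumNumbers(n) performs under the counting rules above. */
    static long countOps(int n) {
        long ops = 0;
        int sum = 0; ops++;          // assignment: sum = 0
        int i = 1;   ops++;          // assignment: i = 1
        while (true) {
            ops++;                   // comparison: i <= n
            if (!(i <= n)) break;
            sum += i; ops++;         // add-and-assign: sum += i
            i++;      ops++;         // increment: i++
        }
        ops++;                       // return sum
        return ops;
    }

    public static void main(String[] args) {
        for (int n : new int[]{0, 10, 1000}) {
            System.out.println("n=" + n + " ops=" + countOps(n) + " 3n+4=" + (3L * n + 4));
        }
    }
}
```

The count grows linearly with n, which is the conclusion the analytical approach reaches without running any timing tests.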
8
How to sort/search?
Sorting
  Bubble sort
  Insertion sort
  Selection sort
  Merge sort
  Quick sort
Searching
  Linear search
  Binary search
9
What to sort/search?
We have various types of ‘collections’ to sort (yes … I use the term collection loosely since arrays aren’t descendants of Collection):
  arrays
  array-based list implementations
  singly-linked lists
  doubly-linked lists
The searching/sorting may be done
  as a behavior of the object, or
  as a method that operates on the object
10
Example test data
Experimental Test Data on 400 MHz Pentium II RedHat Linux Machine

  Items sorted   Quick   Merge   Insertion                 Selection                 Bubble
  100            1       3       1                         1                         1
  1000           3       5       30                        35                        131
  2000           4       8       98                        179                       439
  4000           7       15      336                       588                       1954
  10,000         17      47      2425                      4280                      7612
  100,000        189     521     242500 (4.04 min)         428000 (7.13 min)         760000 (12.67 min)
  1,000,000      2290    5580    24250000 (6.74 hrs)       42800000 (11.89 hrs)      76000000 (21.11 hrs)
  10,000,000     27006   60000   2425000000 (28.07 days)   4280000000 (49.54 days)   7600000000 (87.96 days)

The table lists the time in milliseconds that each of the specified sorting methods took to sort the specified number of randomly generated integers. Three timings were taken for each entry and the median value actually recorded. Numbers in italics in the original slide are estimates.
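The large insertion and selection sort entries are consistent with scaling the measured 10,000-item time quadratically: for example, 2425 ms × (100,000 / 10,000)² = 242,500 ms. The bubble sort entries match the same rule if the 10,000-item time is rounded to 7600 ms. This is an inference from the numbers, not a method the slides state. A minimal sketch of that extrapolation, with the hypothetical names `QuadraticEstimate` and `estimateMillis`:

```java
final class QuadraticEstimate {
    /**
     * Scales a time measured for n0 items up to n items, assuming the running
     * time grows in proportion to n squared. Assumes n is a multiple of n0.
     */
    static long estimateMillis(long measuredMillis, long n0, long n) {
        long ratio = n / n0;
        return measuredMillis * ratio * ratio;
    }

    public static void main(String[] args) {
        long measured = 2425; // insertion sort, 10,000 items (from the table)
        for (long n : new long[]{100_000, 1_000_000, 10_000_000}) {
            System.out.println(n + " items: about " + estimateMillis(measured, 10_000, n) + " ms");
        }
    }
}
```

The same rule does not fit the quick sort and merge sort rows, whose times grow far more slowly than n².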
11
Bubble sort an array

  public void bubbleSort(int[] data) {
      for (int i = data.length - 1; i > 0; i--) {
          for (int j = 0; j < i; j++) {
              if (data[j+1] < data[j]) {
                  int tmp = data[j+1];
                  data[j+1] = data[j];
                  data[j] = tmp;
              }
          }
      }
  }
12
Selection sort an array

  public void selectionSort(int[] data) {
      for (int i = 0; i < data.length; i++) {
          int min = findIndexOfMinimum(data, i);
          int tmp = data[min];
          data[min] = data[i];
          data[i] = tmp;
      }
  }

  // returns the index of the smallest element
  // in the range of indices [i, data.length-1]
  private int findIndexOfMinimum(int[] data, int i) {
      int minIndex = i;
      for (int j = i + 1; j < data.length; j++) {
          if (data[j] < data[minIndex]) minIndex = j;
      }
      return minIndex;
  }
13
Insertion sort an array

  public void insertionSort(int[] data) {
      for (int i = 1; i < data.length; i++) {
          int currentValue = data[i];
          int j = i - 1;
          while (j >= 0 && data[j] > currentValue) {
              data[j+1] = data[j];
              j--;
          }
          data[j+1] = currentValue;
      }
  }
14
Quick sort an array
A recursive algorithm with O(N log N) expected performance (O(N^2) in the worst case)!

  algorithm quickSort(A)
    INPUT: List A of integers
    OUTPUT: None. A is sorted

    // Base Case
    If A contains 0 or 1 items then return

    Let PIVOT be a randomly selected element of A

    // Divide
    Let B be an array that contains all the elements of A <= PIVOT (not including the PIVOT)
    Let C be an array that contains all the elements of A > PIVOT

    quickSort(B)
    quickSort(C)

    // Conquer
    Let A = [B, PIVOT, C]
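The pseudocode above translates almost line for line into Java. The following is a sketch rather than a tuned implementation: the class name `QuickSortSketch` is hypothetical, it works on a mutable `List<Integer>` instead of an array, and it puts elements equal to the pivot into B so that no element lands in both sublists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class QuickSortSketch {
    private static final Random RANDOM = new Random();

    /** Sorts the given mutable list in place, following the pseudocode step by step. */
    static void quickSort(List<Integer> a) {
        // Base case: 0 or 1 items are already sorted.
        if (a.size() <= 1) return;

        // Pick a random pivot and take it out of the list.
        int pivot = a.remove(RANDOM.nextInt(a.size()));

        // Divide: B gets everything <= pivot, C gets everything > pivot.
        List<Integer> b = new ArrayList<>();
        List<Integer> c = new ArrayList<>();
        for (int value : a) {
            if (value <= pivot) b.add(value);
            else c.add(value);
        }

        quickSort(b);
        quickSort(c);

        // Conquer: A = [B, PIVOT, C]
        a.clear();
        a.addAll(b);
        a.add(pivot);
        a.addAll(c);
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>(List.of(5, 3, 8, 3, 1));
        quickSort(data);
        System.out.println(data); // prints [1, 3, 3, 5, 8]
    }
}
```

This version allocates new lists at every level of recursion, which is why practical implementations partition the array in place instead.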
15
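Quick sort an array

  public static int partition(int[] data, int left, int right) {
      // The pivot starts at data[left] and alternates between the two ends.
      while (true) {
          // Pivot is at 'left': scan down from the right for an element that is not larger.
          while (left < right && data[left] < data[right]) right--;
          if (left < right) swap(data, left++, right);
          else return left;

          // Pivot is now at 'right': scan up from the left for an element that is not smaller.
          while (left < right && data[left] < data[right]) left++;
          if (left < right) swap(data, left, right--);
          else return right;
      }
  }

  public static void quickSort(int[] data, int left, int right) {
      if (left >= right) return;
      int pivotLocation = partition(data, left, right);
      quickSort(data, left, pivotLocation - 1);
      quickSort(data, pivotLocation + 1, right);
  }

  // Swaps the values at the specified indices of the array
  private static void swap(int[] data, int i, int j) {
      int tmp = data[i];
      data[i] = data[j];
      data[j] = tmp;
  }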
16
Selection sort a list

  // Array version, repeated for comparison
  public void selectionSort(int[] data) {
      for (int i = 0; i < data.length; i++) {
          int min = findIndexOfMinimum(data, i);
          int tmp = data[min];
          data[min] = data[i];
          data[i] = tmp;
      }
  }

  private int findIndexOfMinimum(int[] data, int i) {
      int minIndex = i;
      for (int j = i + 1; j < data.length; j++) {
          if (data[j] < data[minIndex]) minIndex = j;
      }
      return minIndex;
  }

  // List version: array indexing becomes get/set, length becomes size()
  public void selectionSort(List<Integer> data) {
      for (int i = 0; i < data.size(); i++) {
          int min = findIndexOfMinimum(data, i);
          int tmp = data.get(min);
          data.set(min, data.get(i));
          data.set(i, tmp);
      }
  }

  private int findIndexOfMinimum(List<Integer> data, int i) {
      int minIndex = i;
      for (int j = i + 1; j < data.size(); j++) {
          if (data.get(j) < data.get(minIndex)) minIndex = j;
      }
      return minIndex;
  }
17
Merge sort a list

  import java.util.LinkedList;
  import java.util.List;

  void merge(List<Integer> left, List<Integer> right, List<Integer> result) {
      while (!left.isEmpty() && !right.isEmpty()) {
          // <= takes from the left list on ties, which keeps the sort stable
          if (left.get(0) <= right.get(0)) {
              result.add(left.remove(0));
          } else {
              result.add(right.remove(0));
          }
      }
      while (!left.isEmpty()) {
          result.add(left.remove(0));
      }
      while (!right.isEmpty()) {
          result.add(right.remove(0));
      }
  }

  void mergeSort(List<Integer> v) {
      if (v.size() < 2) return;

      // List is an interface, so a concrete class is needed;
      // LinkedList removes from the front in constant time
      List<Integer> left = new LinkedList<>();
      List<Integer> right = new LinkedList<>();
      int half = v.size() / 2;

      for (int i = 0; i < half; i++) {
          left.add(v.remove(0));
      }

      while (!v.isEmpty()) {
          right.add(v.remove(0));
      }

      mergeSort(left);
      mergeSort(right);

      merge(left, right, v);
  }
18
Searching
Given an array A and an element E that ‘may’ be in the array: find the smallest K such that A[K] == E. Return -1 if there is no such K.

  public int search(int[] data, int value) {
      for (int i = 0; i < data.length; i++) {
          if (data[i] == value) return i;
      }
      return -1;
  }
http://www.flickr.com/photos/gozalewis/3329507901/
19
Searching
Given a sorted array A and an element E: find a K such that A[K] == E. Return -1 if there is no such K.

  public int binarySearch(int[] data, int value) {
      int intervalMin = 0, intervalMax = data.length - 1;
      while (intervalMin <= intervalMax) {
          // written this way so the sum cannot overflow on very large arrays
          int middle = intervalMin + (intervalMax - intervalMin) / 2;
          if (data[middle] == value) return middle;
          else if (data[middle] < value) intervalMin = middle + 1;
          else intervalMax = middle - 1;
      }
      return -1;
  }
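To make the difference between the two search strategies concrete, the sketch below counts how many array elements each one inspects. The class name `ProbeCount` and both method names are hypothetical; the loops mirror the linear and binary searches from the slides with a counter added.

```java
final class ProbeCount {
    /** Number of elements a linear search inspects before stopping. */
    static int linearProbes(int[] data, int value) {
        int probes = 0;
        for (int i = 0; i < data.length; i++) {
            probes++;
            if (data[i] == value) break;
        }
        return probes;
    }

    /** Number of elements a binary search inspects before stopping (data must be sorted). */
    static int binaryProbes(int[] data, int value) {
        int probes = 0;
        int lo = 0, hi = data.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            probes++;
            if (data[mid] == value) break;
            else if (data[mid] < value) lo = mid + 1;
            else hi = mid - 1;
        }
        return probes;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i; // already sorted

        int target = data.length - 1; // worst case for the linear search
        System.out.println("linear probes: " + linearProbes(data, target));
        System.out.println("binary probes: " + binaryProbes(data, target));
    }
}
```

For one million sorted items the linear search inspects every element in this worst case, while the binary search needs at most 20 probes because each probe halves the remaining interval.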
20
Can we write a single method to sort any array of objects?
Consider the code below. What would have to change in order to sort an array of Strings or Stocks or BaseballPlayers or Transactions?

  public void insertionSort(int[] x) {
      int i, j, currentValue;
      for (i = 1; i < x.length; i++) {
          currentValue = x[i];
          j = i - 1;
          while (j >= 0 && x[j] > currentValue) {
              x[j+1] = x[j];
              j--;
          }
          x[j+1] = currentValue;
      }
  }

The only property of the array elements we need in order to sort them is that they can be compared to each other!
21
Generic Sorting (uses the Comparable interface)

  class Sorter {
      public static void insertionSort(Comparable[] data) {
          Comparable currentValue;
          int j;

          for (int i = 1; i < data.length; i++) {
              currentValue = data[i];
              j = i - 1;
              while (j >= 0 && data[j].compareTo(currentValue) > 0) {
                  data[j+1] = data[j];
                  j--;
              }
              data[j+1] = currentValue;
          }
      }
  }
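The same idea can be written with Java generics, which lets the compiler check that the elements really are comparable to each other. This is a sketch under the assumption of Java 5 or later; the class name `GenericSorter` and the sample data are made up for illustration.

```java
import java.util.Arrays;

final class GenericSorter {
    /** Insertion sort for any element type that can be compared to itself. */
    static <T extends Comparable<? super T>> void insertionSort(T[] data) {
        for (int i = 1; i < data.length; i++) {
            T currentValue = data[i];
            int j = i - 1;
            // Shift larger elements one slot to the right.
            while (j >= 0 && data[j].compareTo(currentValue) > 0) {
                data[j + 1] = data[j];
                j--;
            }
            data[j + 1] = currentValue;
        }
    }

    public static void main(String[] args) {
        String[] names = {"Ruth", "Aaron", "Mays"};
        insertionSort(names);
        System.out.println(Arrays.toString(names)); // prints [Aaron, Mays, Ruth]

        Integer[] numbers = {3, 1, 2};
        insertionSort(numbers);
        System.out.println(Arrays.toString(numbers)); // prints [1, 2, 3]
    }
}
```

One method now sorts Strings, Integers, or any user-defined class that implements Comparable.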
22
Arrays Class
The Arrays class is used (oddly enough) to manipulate arrays!
  Defined in the “java.util” package
  Contains a generic sorting method similar to ours (except it uses merge sort; since Java 7, a merge sort variant called TimSort)!
  Methods to sort, fill, and search an array

  public static void sort(Object[] a)

Sorts the specified array of objects into ascending order, according to the natural ordering of its elements. All elements in the array must implement the Comparable interface. Furthermore, all elements in the array must be mutually comparable (that is, e1.compareTo(e2) must not throw a ClassCastException for any elements e1 and e2 in the array). This sort is guaranteed to be stable: equal elements will not be reordered as a result of the sort.
The sorting algorithm is a modified mergesort (in which the merge is omitted if the highest element in the low sublist is less than the lowest element in the high sublist). This algorithm offers guaranteed n*log(n) performance, and can approach linear performance on nearly sorted lists.
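A short sketch of the three kinds of Arrays methods mentioned above (sort, fill, search). The class name `ArraysDemo`, the helper `sortedCopy`, and the sample words are assumptions for illustration; `Arrays.sort`, `Arrays.fill`, and `Arrays.binarySearch` are the standard java.util methods.

```java
import java.util.Arrays;

final class ArraysDemo {
    /** Returns a sorted copy, leaving the caller's array untouched. */
    static String[] sortedCopy(String[] input) {
        String[] copy = input.clone();
        Arrays.sort(copy); // natural ordering, via String.compareTo
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"quick", "merge", "bubble"};
        String[] sorted = sortedCopy(words);
        System.out.println(Arrays.toString(sorted)); // prints [bubble, merge, quick]

        // binarySearch requires the array to be sorted already.
        int index = Arrays.binarySearch(sorted, "merge");
        System.out.println(index); // prints 1

        int[] sevens = new int[3];
        Arrays.fill(sevens, 7);
        System.out.println(Arrays.toString(sevens)); // prints [7, 7, 7]
    }
}
```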
23
Design Problem With Sorting
Imagine a class for keeping records at little-league baseball games.
  Sometimes we would like to sort players by batting average
  Sometimes we would like to sort players by runs scored
  Sometimes we would like to sort players by number of home runs

  // Create an array of players
  Player[] team = new Player[18];

  // Initialize the array and then sort it…???
  insertionSort(team);
  // or Arrays.sort(team);

How do I specify a sort by “average” or by “runs scored” or by “home runs”?
24
Solution 1

  class Player implements Comparable {
      private int average;
      private int runsScored, numberHits, numberHomeRuns, atBats;
      private String name;

      Player(String n, int hits, int bats, int hrs, int runs) {
          name = n;
          numberHits = hits;
          numberHomeRuns = hrs;
          runsScored = runs;
          atBats = bats;
          average = (int)(hits * 100 / (double)atBats);
      }

      // obvious accessors and mutators omitted

      public int compareTo(Object rhs) {
          Player other = (Player)rhs;

          // ???? what do I do here????
      }
  }

How do I specify a sort by “average” or by “runs scored” or by “home runs”?
25
Solution 1

  class Player implements Comparable {
      // constants come first so that sortBy can refer to AVG
      public static final int HOMERS = 0, AVG = 1, HITS = 2;
      private static int sortBy = AVG;

      private int average;
      private int runsScored, numberHits, numberHomeRuns;
      private String name;

      // constructor omitted

      public static void setSortMethod(int method) {
          if (method != HOMERS && method != AVG && method != HITS)
              throw new IllegalArgumentException();
          sortBy = method;
      }

      public int compareTo(Object rhs) {
          Player other = (Player)rhs;

          switch (sortBy) {
              case HOMERS: return getHomers() - other.getHomers();
              case AVG:    return getAverage() - other.getAverage();
              case HITS:   return getHits() - other.getHits();
          }
          return 0;
      }
  }
26
Solution 1

  Player[] team1 = new Player[20];

  // initialize the array

  // Sort by hits
  Player.setSortMethod(Player.HITS);
  Arrays.sort(team1);

  // Sort by home runs
  Player.setSortMethod(Player.HOMERS);
  Arrays.sort(team1);

  // Sort by average
  Player.setSortMethod(Player.AVG);
  Arrays.sort(team1);

The ability to “order” two players is contained within the Player class itself, specifically in its “compareTo” method. Changing the meaning of “comparing” requires the Player class to be changed.
27
Solution 2: Arrays and Comparators
In the Arrays class:

  public static void sort(Object[] a, Comparator c)

Sorts the specified array of objects according to the order induced by the specified comparator. All elements in the array must be mutually comparable by the specified comparator (that is, c.compare(e1, e2) must not throw a ClassCastException for any elements e1 and e2 in the array). This sort is guaranteed to be stable: equal elements will not be reordered as a result of the sort.

  interface Comparator {
      public int compare(Object one, Object two);
      public boolean equals(Object other);
  }
28
Comparators!

  class ComparePlayersByHomers implements Comparator {
      public int compare(Object lhs, Object rhs) {
          return ((Player)lhs).getHomers() - ((Player)rhs).getHomers();
      }
  }

  class ComparePlayersByAverage implements Comparator {
      public int compare(Object lhs, Object rhs) {
          return ((Player)lhs).getAverage() - ((Player)rhs).getAverage();
      }
  }

  class ComparePlayersByHits implements Comparator {
      public int compare(Object lhs, Object rhs) {
          return ((Player)lhs).getHits() - ((Player)rhs).getHits();
      }
  }
29
Solution 2: Comparators

  // Create an array of players
  Player[] team = new Player[18];
  …

  // Initialize the array and then create a “sorter” object!
  Comparator howToSort;
  if (someUserInput == sort_by_avg) {
      howToSort = new ComparePlayersByAverage();
  } else if (someUserInput == sort_by_homers) {
      howToSort = new ComparePlayersByHomers();
  } else {
      howToSort = new ComparePlayersByHits();
  }

  // sort the array
  Arrays.sort(team, howToSort);

The ability to “order” two players is contained outside of the Player class, specifically in the “compare” method of a Comparator. Changing the meaning of “comparing” doesn’t require a change of the Player class!
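On Java 8 and later, the same design can be expressed without writing one class per ordering, using `Comparator.comparingInt` with a method reference. The sketch below assumes a cut-down, hypothetical `Player` with only a name, hits, and home runs; it is not the Player class from the slides, and the sample players are invented.

```java
import java.util.Arrays;
import java.util.Comparator;

final class ComparatorSketch {
    /** Minimal stand-in for the slides' Player class (fields chosen for illustration). */
    static final class Player {
        private final String name;
        private final int hits;
        private final int homers;

        Player(String name, int hits, int homers) {
            this.name = name;
            this.hits = hits;
            this.homers = homers;
        }

        String getName() { return name; }
        int getHits() { return hits; }
        int getHomers() { return homers; }
    }

    // Each ordering lives outside Player, as in Solution 2.
    static final Comparator<Player> BY_HOMERS = Comparator.comparingInt(Player::getHomers);
    static final Comparator<Player> BY_HITS = Comparator.comparingInt(Player::getHits);

    /** Sorts a copy of the team with the given ordering and returns the names in that order. */
    static String[] namesSortedBy(Player[] team, Comparator<Player> order) {
        Player[] copy = team.clone();
        Arrays.sort(copy, order);
        String[] names = new String[copy.length];
        for (int i = 0; i < copy.length; i++) {
            names[i] = copy[i].getName();
        }
        return names;
    }

    public static void main(String[] args) {
        Player[] team = {
            new Player("Ann", 30, 2),
            new Player("Bo", 10, 9),
            new Player("Cy", 20, 5)
        };
        System.out.println(Arrays.toString(namesSortedBy(team, BY_HOMERS))); // [Ann, Cy, Bo]
        System.out.println(Arrays.toString(namesSortedBy(team, BY_HITS)));   // [Bo, Cy, Ann]
    }
}
```

Because `comparingInt` compares the values rather than subtracting them, it also avoids the integer overflow that a subtraction-based comparator can hit with very large values.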
30
Summary
Bubble Sort
  Never useful! Don’t ever write another bubble sort routine!
Selection Sort
  Somewhat slower than insertion sort. Don’t use it!
Insertion Sort
  Very good when an array is “almost sorted” already
  Should be used in some applications and may beat quicksort!
Quick Sort
  The best general purpose sorting routine. Use it!
Merge Sort
  The best way to sort lists such as List<Integer>.
Generic Sorting
  Use the Comparable and/or Comparator interfaces