run time analysis of insertion sort and quick sort


Upload: philip-strong

Post on 04-Apr-2015


Run time analysis of insertion sort and quick sort

CM0212 – Algorithms and Data Structures

Philip Strong 0807259

An empirical study into the run time characteristics of two simple sorting algorithms. The insertion sort and quick sort will be analysed to show the usefulness of each depending on the application.


Introduction

Insertion sort and quick sort are both sorting algorithms: each takes an unsorted list and reorders its elements into numerical order. Each algorithm has its own characteristic relationship between run time and the number of inputs.

This relationship is commonly expressed in ‘Big Oh’ notation, where the O stands for ‘order of’. Big Oh notation describes how the run time of an algorithm grows when very large inputs are applied, so only the largest term in a polynomial is required [1]. For instance, when T(n) = An + B, the runtime complexity is O(n), as the constant B has little effect on the run time as n tends towards infinity and the constant factor A is dropped.

The average and worst case complexity of the insertion sort is O(n²), and the best case is O(n). When the input is already in ascending order, each position only needs to be checked once and no elements need to be moved. This gives a runtime complexity of O(n), one check per position. The worst case is when the input is in descending order, and each element must be compared with everything in the sorted part of the list. This gives a runtime complexity of O(n²).
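The worst case figure can be checked by counting. The sketch below assumes the counting scheme used by the implementation later in this report (one count per outer loop pass and one per element shifted); the symbol C is introduced here only for illustration. For descending input, pass i shifts i elements:

```latex
C_{\text{worst}}(n) = (n-1) + \sum_{i=1}^{n-1} i = (n-1) + \frac{n(n-1)}{2}
\qquad\Rightarrow\qquad
C_{\text{worst}}(50\,000) = 49\,999 + 1\,249\,975\,000 = 1\,250\,024\,999
```

For ascending input the inner loop never runs, so the count is n - 1, for example 49,999 for 50,000 inputs. Both values agree with the figures recorded in Appendix A.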

The best and average case complexity of the quick sort is O(n log n), and the worst case is O(n²). The best case of a quick sort is when the partitioning stage splits the list in half every time [2]. Random inputs usually give splits that are reasonably close to even, so the average case has the same O(n log n) order as the best case. The worst case occurs when every partition is as unbalanced as possible; for a quick sort that takes the first or last element as the pivot, this happens when the list is already in ascending or descending order [2].
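The two extremes can be written as recurrences. This is a sketch under the assumption that partitioning a list of n elements costs about cn steps, where c is a constant introduced here for illustration:

```latex
T_{\text{best}}(n) = 2\,T\!\left(\tfrac{n}{2}\right) + cn
\;\Rightarrow\; T_{\text{best}}(n) = O(n \log n),
\qquad
T_{\text{worst}}(n) = T(n-1) + cn
\;\Rightarrow\; T_{\text{worst}}(n) = c\sum_{k=1}^{n} k + O(1) = O(n^2)
```

An even split halves the problem at each of about log₂ n levels, while a split that removes only the pivot leaves n levels of work.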

Design

The sorting algorithms were based on the pseudocode shown below. As can be seen, the insertion sort algorithm is significantly shorter and so could be a better choice when program space is critical. To simplify the design, the quick sort implementation takes the middle element of the array as its pivot, whereas the pseudocode takes the first element.

Insertion sort pseudo code [3]

Algorithm insertionSort(input, n)
Input: An array storing n integers
Output: An array sorted in ascending order
for i = 1 to (n – 1) do
    item = input[i]
    j = i – 1
    while j ≥ 0 and input[j] > item do
        input[j + 1] = input[j]
        j = j – 1
    input[j + 1] = item

Quick sort pseudo code [3]

Algorithm quickSort(list, lower, upper)
Input: Partial array, list[lower::upper]
Output: Partial array, sorted
if upper > lower then
    j ← partition(list, lower, upper)
    quickSort(list, lower, j - 1)
    quickSort(list, j + 1, upper)

Algorithm partition(list, lower, upper)
Input: Partial array, list[lower::upper]
Output: Partial array, partitioned
i ← lower; j ← upper + 1
v ← list(lower)
do {
    do i ← i + 1 while (list(i) < v)
    do j ← j - 1 while (list(j) > v)
    if (i < j) exchange(list(i), list(j))
} while (i < j)
exchange(list(lower), list(j))
return j
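The quick sort pseudocode above can be sketched in Java as follows. This is an illustration rather than the code used in the experiment (which uses a middle-element pivot); the class name QuickSortSketch and the bounds check on i are assumptions added so that the scan cannot run past the end of the array when the pivot is the largest element.

```java
import java.util.Arrays;

final class QuickSortSketch {

    // Sorts list[lower..upper] (both bounds inclusive) in place.
    static void quickSort(int[] list, int lower, int upper) {
        if (upper > lower) {
            int j = partition(list, lower, upper);
            quickSort(list, lower, j - 1);
            quickSort(list, j + 1, upper);
        }
    }

    // Partitions list[lower..upper] around the pivot v = list[lower]
    // and returns the final index of the pivot.
    static int partition(int[] list, int lower, int upper) {
        int i = lower;
        int j = upper + 1;
        int v = list[lower];
        while (true) {
            // Scan right for an element that is not smaller than the pivot.
            // The i <= upper check is an addition to the pseudocode.
            do { i++; } while (i <= upper && list[i] < v);
            // Scan left for an element that is not larger than the pivot.
            // This scan stops at index lower at the latest, because list[lower] == v.
            do { j--; } while (list[j] > v);
            if (i >= j) {
                break;
            }
            exchange(list, i, j);
        }
        exchange(list, lower, j); // Put the pivot into its final position
        return j;
    }

    static void exchange(int[] list, int a, int b) {
        int tmp = list[a];
        list[a] = list[b];
        list[b] = tmp;
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 8, 1, 9, 2, 7, 3};
        quickSort(data, 0, data.length - 1);
        System.out.println(Arrays.toString(data));
    }
}
```

Note that upper is an inclusive index, so a full array is sorted with quickSort(data, 0, data.length - 1).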

Experimental Planning

The experiment is set up by implementing both algorithms in Java. A random number generator will create a text file for the algorithms to sort. The file will be read into an array, and the array will then be passed as an argument to each algorithm. The sorted output will be written to a file to reduce console output. Run time will be measured as an iteration count rather than as wall-clock time: a counter is incremented in every loop, and the total is written to a comma separated values (CSV) file along with the number of inputs and the sort type.

The tests will be set to run automatically. The order of the tests and functions will be as follows:

Generate random data set 1
Run insertion sort on data set 1
Run quick sort on data set 1
Generate random data set 2
Run insertion sort on data set 2
Run quick sort on data set 2
Generate random data set 3
Run insertion sort on data set 3
Run quick sort on data set 3
Generate ascending data
Run best insertion sort
Run best quick sort
Generate descending data
Run worst insertion sort
Run worst quick sort

Each sort will be run on between 1 and 600,000 inputs, in steps of 50,000. The average case tests will be run 3 times to check for anomalies.
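The plan above can be sketched as a few helper methods. This is a minimal in-memory illustration, not the file-based generator listed in Appendix B; the class name TestPlanSketch, the maxValue parameter and the seed are hypothetical additions (the seed only makes a run repeatable).

```java
import java.util.Random;

final class TestPlanSketch {

    // Input sizes 1, 50000, 100000, ..., 600000 (the sizes listed in Appendix A).
    static int[] inputSizes() {
        int[] sizes = new int[13];
        sizes[0] = 1;
        for (int k = 1; k < sizes.length; k++) {
            sizes[k] = k * 50_000;
        }
        return sizes;
    }

    // n random integers in the range 1..maxValue (average case data).
    static int[] randomData(int n, int maxValue, long seed) {
        Random rng = new Random(seed);
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = rng.nextInt(maxValue) + 1;
        }
        return data;
    }

    // 0, 1, ..., n - 1 (already sorted data).
    static int[] ascendingData(int n) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = i;
        }
        return data;
    }

    // n, n - 1, ..., 1 (reverse sorted data).
    static int[] descendingData(int n) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = n - i;
        }
        return data;
    }
}
```

Generating each data set once and passing a copy to each algorithm keeps the two sorts working on identical input, which is what the test order above is designed to achieve.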


Implementation

public insertionSort(int[] input, int n) {
    if (n > input.length) { // Check range is available in the data
        System.out.println("Out of range, only " + input.length + " integers available.");
    } else {
        // Insertion sort algorithm
        for (int i = 1; i <= (n - 1); i++) {
            counter++;
            item = input[i];
            j = i - 1;
            while ((j >= 0) && (input[j] > item)) {
                counter++;
                input[j + 1] = input[j];
                j = j - 1;
            }
            input[j + 1] = item;
        }
        output = input;
    }
}

// http://www.algolist.net/Algorithms/Sorting/Quicksort
// [4]
public void sort(int[] input, int lower, int upper) {
    int index = partition(input, lower, upper);
    if (lower < index - 1) {
        sort(input, lower, index - 1);
    }
    if (index < upper) {
        sort(input, index, upper);
    }
    output = input;
}

public int partition(int input[], int lower, int upper) { // [4]
    int i = lower;
    int j = upper;
    int tmp;
    pivot = input[(lower + upper) / 2]; // Set pivot as middle value
    while (i <= j) {
        while (input[i] < pivot) {
            counter++;
            i++;
        }
        while (input[j] > pivot) {
            counter++;
            j--;
        }
        if (i <= j) {
            counter++;
            tmp = input[i];
            input[i] = input[j];
            input[j] = tmp;
            i++;
            j--;
        }
    }
    return i;
}

Testing

To test that the algorithms work, I set the random number generator to produce only a very small set of data, consisting of 10 random numbers between 1 and 100. The screenshots below show the input data file (data.tst) with the output data file (OUTdata.tst) on top. If the output data is in numerical order then the algorithm is functioning correctly.

Figure 1 – Test data for insertion sort

Figure 2 – Test data for quick sort

As is evident from the images, both the insertion sort and the quick sort are sorting the data as expected.
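The visual check described above could also be automated. The following is a minimal sketch, not part of the original test program; the class and method names are hypothetical. It checks both that the output is in order and that it contains the same values as the input, using Arrays.sort from the Java library as a reference.

```java
import java.util.Arrays;

final class SortCheckSketch {

    // True if every element is less than or equal to the one after it.
    static boolean isSorted(int[] data) {
        for (int i = 1; i < data.length; i++) {
            if (data[i - 1] > data[i]) {
                return false;
            }
        }
        return true;
    }

    // True if 'sorted' holds exactly the values of 'original' in ascending order.
    static boolean matchesReference(int[] original, int[] sorted) {
        int[] reference = Arrays.copyOf(original, original.length);
        Arrays.sort(reference);
        return Arrays.equals(reference, sorted);
    }
}
```

Checking against a reference sort catches an output that is in order but has lost or duplicated a value, which an order check alone would miss.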

Results

The data collected by running the tests is shown in Appendix A.

To determine whether the runtime complexity of each algorithm is as expected, I plot the number of iterations against the expected growth term (n, n log n or n²) and look for a straight line.
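The same check can be made numerically: if the iteration count grows in proportion to the expected term, the ratio between the two stays roughly constant. The sketch below applies this to the first three non-trivial insertion sort results from Test 1 in Appendix A; the class and method names are hypothetical.

```java
final class LinearityCheckSketch {

    // Ratio of the measured iteration count to the model term n squared.
    static double ratioToNSquared(long n, long iterations) {
        return (double) iterations / ((double) n * (double) n);
    }

    public static void main(String[] args) {
        // Average case insertion sort, Test 1, taken from Appendix A.
        long[] n = {50_000L, 100_000L, 150_000L};
        long[] iterations = {627_542_123L, 2_516_824_290L, 5_641_976_345L};
        for (int k = 0; k < n.length; k++) {
            System.out.printf("n=%d ratio=%.4f%n", n[k], ratioToNSquared(n[k], iterations[k]));
        }
    }
}
```

For these three data points the ratio stays close to 0.25, which is the numerical counterpart of a straight line when the count is plotted against n².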

Figure 3 - Average runtime complexity of insertion sort
(x-axis: Number of inputs (n²); y-axis: Number of iterations)

Figure 3, above, shows the average runtime complexity of the insertion sort with random inputs. The graph is a straight line when plotted against n², so the runtime complexity with respect to the number of inputs is O(n²), as expected.

Figure 4 - Best case runtime complexity of insertion sort
(x-axis: Number of inputs (n); y-axis: Number of iterations)

Figure 4, above, shows the best case runtime complexity of the insertion sort. As the graph has been plotted with n against the number of iterations, the straight line suggests that the best case for insertion sort is O(n), which is what was predicted.

Figure 5 - Worst case runtime complexity of insertion sort
(x-axis: Number of inputs (n²); y-axis: Number of iterations)

Figure 5, above, shows the worst case runtime complexity of the insertion sort. The graph has been plotted against n², so the linear nature of the graph suggests that the runtime complexity is O(n²), as stated.

Figure 6 - Average runtime complexity of quick sort
(x-axis: Number of inputs (nlogn); y-axis: Number of iterations)

Figure 6, above, shows the average runtime complexity of the quick sort with random inputs. The graph is a straight line when plotted against n log n, so the runtime complexity with respect to the number of inputs is O(n log n), as expected.

Figure 7 - Best case runtime complexity of quick sort
(x-axis: Number of inputs (nlogn); y-axis: Number of iterations)

The graph above shows that the best case run time is linear when plotted against n log n. This shows that the best case runtime complexity of the quick sort algorithm is O(n log n).

Figure 8 - Worst case runtime complexity of quick sort
(x-axis: Number of inputs (n²); y-axis: Number of iterations)

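As can be seen from the graph above, the output is not linear when plotted against n². This is due to a coding error which was not traceable after testing. The iteration counts recorded for this test in Appendix A are close to those of the best case, which is consistent with the middle-element pivot splitting a descending list roughly in half rather than producing the worst case.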
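For comparison, the sketch below shows a quick sort variant for which descending input does produce quadratic behaviour. It is not the implementation used in this report: it swaps in a last-element pivot (the Lomuto partition scheme), counts only element comparisons, and uses hypothetical names throughout. It is run on small inputs because the recursion depth grows with n in this case.

```java
final class WorstCaseQuickSortSketch {

    static long comparisons; // Number of element-to-pivot comparisons made

    // Sorts a[lo..hi] (inclusive bounds) using the last element as the pivot.
    static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) {
            return;
        }
        int p = partition(a, lo, hi);
        sort(a, lo, p - 1);
        sort(a, p + 1, hi);
    }

    // Lomuto partition: moves elements smaller than the pivot to the left
    // and returns the final index of the pivot.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            comparisons++;
            if (a[j] < pivot) {
                swap(a, i, j);
                i++;
            }
        }
        swap(a, i, hi);
        return i;
    }

    static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    // Sorts the descending list n, n - 1, ..., 1 and returns the comparison count.
    static long countForDescending(int n) {
        int[] a = new int[n];
        for (int k = 0; k < n; k++) {
            a[k] = n - k;
        }
        comparisons = 0;
        sort(a, 0, n - 1);
        return comparisons;
    }
}
```

With this pivot choice every partition of a descending list removes only the pivot, so the count is n(n - 1)/2: doubling n from 1,000 to 2,000 raises it from 499,500 to 1,999,000, roughly a factor of four.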


Analysis

For random (average case) data, the insertion sort was slower than the quick sort at every tested size above a single input. This is evident when looking at the numbers of iterations in Appendix A. Even 50,000 inputs causes the insertion sort to run approximately 625,000,000 iterations, compared to approximately 850,000 in the quick sort.
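These two figures are in line with simple model estimates. The calculation below is a rough sketch: it assumes the usual average case model of about n²/4 element shifts for insertion sort on random input, and takes a constant factor of 1 for the quick sort term, which is an assumption made only for illustration.

```latex
\frac{n^2}{4} = \frac{(50\,000)^2}{4} = 625\,000\,000,
\qquad
n \log_2 n \approx 50\,000 \times 15.61 \approx 780\,000
```

The measured quick sort counts of roughly 830,000 to 870,000 are within about ten percent of the second estimate.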

An advantage of the insertion sort over the quick sort is its simplicity: it is easier to code and fits into very small program spaces, such as a microcontroller where program size can be limited to as little as 1,000 instructions.

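The quick sort needed fewer iterations in every random and descending test above a single input; only on already sorted data did the insertion sort need fewer (49,999 against 784,481 iterations for 50,000 inputs). The quick sort's run time also grew more slowly than the insertion sort's as more inputs were applied.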

Conclusion

The empirical study has confirmed all but one of the predictions made. The best case runtime complexity of an insertion sort is O(n); the average and worst case runtime complexity of an insertion sort is O(n²). The best case and average case runtime complexity of quick sort is O(n log n). No conclusion can be drawn about the worst case runtime complexity of the quick sort, because the algorithm was unable to run with the pivot overridden and the input list in descending order.

It is also concluded that insertion sort is a simpler sorting algorithm, which can be implemented in far less code and time. This is a characteristic which would be useful when processing time is not an important factor, but program space is.

If this experiment were to be repeated, I would run the algorithms on a more powerful computer and run them with far higher numbers of inputs to get a more accurate impression of the runtime characteristics.

I would also test the program that runs the algorithms more thoroughly, as I had trouble with the worst case scenario when using the quick sort.


References

[1] http://leepoint.net/notes-java/algorithms/big-oh/bigoh.html

[2] http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Sorting/quickSort.htm

[3] Mumford, C. CM0212 Lecture notes, Cardiff University

[4] http://www.algolist.net/Algorithms/Sorting/Quicksort


Appendix A – Results table

Test                   Number of inputs  Test 1        Test 2        Test 3

Insertion sort         1                 0             0             0
Insertion sort         50000             627542123     626147799     624675965
Insertion sort         100000            2516824290    2509868625    2495291701
Insertion sort         150000            5641976345    5634034367    5611317362
Insertion sort         200000            10034022051   10000410843   9996017518
Insertion sort         250000            15662852797   15619101593   15619988064
Insertion sort         300000            22480796819   22505840073   22491947988
Insertion sort         350000            30590139318   30633132481   30616998925
Insertion sort         400000            39974765255   40079147489   39983259067
Insertion sort         450000            50635908205   50720042238   50589772366
Insertion sort         500000            62515563283   62609607992   62468540235
Insertion sort         550000            75618761287   75716370481   75624789786
Insertion sort         600000            89988940660   90051873106   89994380407

Quick sort             1                 2             1             2
Quick sort             50000             830404        842897        866601
Quick sort             100000            1873389       1809351       1796251
Quick sort             150000            2859706       2849807       2698369
Quick sort             200000            4003389       3783696       3827994
Quick sort             250000            5338076       4932477       5129526
Quick sort             300000            5779951       5844113       5868753
Quick sort             350000            6644237       6812426       6763763
Quick sort             400000            7512232       7905009       7658100
Quick sort             450000            8727657       8758091       9097588
Quick sort             500000            10354666      9978611       9675388
Quick sort             550000            10974410      11900488      11018520
Quick sort             600000            11939090      12267535      12358614

Insertion sort Best    1                 0
Insertion sort Best    50000             49999
Insertion sort Best    100000            99999
Insertion sort Best    150000            149999
Insertion sort Best    200000            199999
Insertion sort Best    250000            249999
Insertion sort Best    300000            299999
Insertion sort Best    350000            349999
Insertion sort Best    400000            399999
Insertion sort Best    450000            449999
Insertion sort Best    500000            499999
Insertion sort Best    550000            549999
Insertion sort Best    600000            599999

Quick sort Best        1                 2
Quick sort Best        50000             784481
Quick sort Best        100000            1668946
Quick sort Best        150000            2587875
Quick sort Best        200000            3537875
Quick sort Best        250000            4487875
Quick sort Best        300000            5475732
Quick sort Best        350000            6475732
Quick sort Best        400000            7475732
Quick sort Best        450000            8475732
Quick sort Best        500000            9475732
Quick sort Best        550000            10501445
Quick sort Best        600000            11551445

Insertion sort Worst   1                 0
Insertion sort Worst   50000             1250024999
Insertion sort Worst   100000            5000049999
Insertion sort Worst   150000            11250074999
Insertion sort Worst   200000            20000099999
Insertion sort Worst   250000            31250124999
Insertion sort Worst   300000            45000149999
Insertion sort Worst   350000            61250174999
Insertion sort Worst   400000            80000199999
Insertion sort Worst   450000            1.0125E+11
Insertion sort Worst   500000            1.25E+11
Insertion sort Worst   550000            1.5125E+11
Insertion sort Worst   600000            1.8E+11

Quick sort Worst       1                 5
Quick sort Worst       50000             834436
Quick sort Worst       100000            1768898
Quick sort Worst       150000            2737824
Quick sort Worst       200000            3737824
Quick sort Worst       250000            4737824
Quick sort Worst       300000            5775678
Quick sort Worst       350000            6825678
Quick sort Worst       400000            7875678
Quick sort Worst       450000            8925678
Quick sort Worst       500000            9975678
Quick sort Worst       550000            11051388
Quick sort Worst       600000            12151388


Appendix B – Code listing

runTests.java

import java.io.*;

public class runTests {

    // Class to run all the tests for CM0212 Runtime analysis coursework
    // Philip Strong

    private String fileLoc;
    PrintWriter output = new PrintWriter(new FileWriter("counter.csv"));

    private final int generatorSize = 100; // Max size of generated data
    private final int numOfInts = 10; // Number of integers to be generated
    private int[] range = {1, 10};

    public static void main(String[] args) throws IOException {
        new runTests(args);
    }

    public runTests(String[] args) throws IOException {
        fileLoc = "data.tst"; // Name of the data file
        insertionSort insertionTest;
        quickSort quickTest;
        for (int j = 0; j < 3; j++) {
            generateRandomData(); // Call random number generator

            System.out.println("Running average insertion sort " + (j + 1) + "...");
            for (int i = 0; i < range.length; i++) {
                System.out.print((i + 1) + " / " + (range.length) + " ");
                insertionTest = new insertionSort(parseFileToArray(fileLoc), range[i]); // Do insertion sort
                dumpArrayToFile(insertionTest.getArray(), i); // Dump sorted data to file
                dumpInfoToFile("ins", range[i], insertionTest.getIterations());
            }

            System.out.println("Running average quicksort " + (j + 1) + "...");
            for (int i = 0; i < range.length; i++) {
                System.out.print((i + 1) + " / " + (range.length) + " ");
                // Note: quickSort uses upper as an inclusive index, so range[i]
                // must be smaller than the length of the array read from file.
                quickTest = new quickSort(parseFileToArray(fileLoc), 0, range[i], false); // Do quick sort
                dumpArrayToFile(quickTest.getArray(), i); // Dump sorted data to file
                dumpInfoToFile("qui", range[i], quickTest.getIterations());
            }
        }

        generateAscendingData(); // Call ascending data generator
        System.out.println("Running best insertion sort...");
        for (int i = 0; i < range.length; i++) {
            System.out.print((i + 1) + " / " + (range.length) + " ");
            insertionTest = new insertionSort(parseFileToArray(fileLoc), range[i]); // Do insertion sort
            dumpArrayToFile(insertionTest.getArray(), i); // Dump sorted data to file
            dumpInfoToFile("insBest", range[i], insertionTest.getIterations());
        }
        System.out.println("Running best quicksort...");
        for (int i = 0; i < range.length; i++) {
            System.out.print((i + 1) + " / " + (range.length) + " ");
            quickTest = new quickSort(parseFileToArray(fileLoc), 0, range[i], false); // Do quick sort
            dumpArrayToFile(quickTest.getArray(), i); // Dump sorted data to file
            dumpInfoToFile("quiBest", range[i], quickTest.getIterations());
        }

        generateDescendingData(); // Call descending data generator
        System.out.println("Running worst insertion sort...");
        for (int i = 0; i < range.length; i++) {
            System.out.print((i + 1) + " / " + (range.length) + " ");
            insertionTest = new insertionSort(parseFileToArray(fileLoc), range[i]); // Do insertion sort
            dumpArrayToFile(insertionTest.getArray(), i); // Dump sorted data to file
            dumpInfoToFile("insWorst", range[i], insertionTest.getIterations());
        }
        System.out.println("Running worst quicksort...");
        for (int i = 0; i < range.length; i++) {
            System.out.print((i + 1) + " / " + (range.length) + " ");
            quickTest = new quickSort(parseFileToArray(fileLoc), 0, range[i], false); // Do quick sort
            dumpArrayToFile(quickTest.getArray(), i); // Dump sorted data to file
            dumpInfoToFile("quiWorst", range[i], quickTest.getIterations());
        }
        output.close();

        System.out.println("Done");
    }

    public void dumpInfoToFile(String cmd, int range, long iterations) throws IOException {
        // Writes one line of the CSV file: sort type, number of inputs, iterations
        try {
            output.println(cmd + "," + range + "," + iterations);
        } finally {
            System.out.println("Added to timing file.");
        }
    }

    public int[] parseFileToArray(String fileLoc) throws IOException {
        // Take data file and place each item in an array position
        // Based on Mumford, C 2010 notes
        int[] num = new int[10];
        int count = 0;
        BufferedReader input;
        String line;

        try {
            input = new BufferedReader(new FileReader(fileLoc));
            line = input.readLine();

            while (line != null) {
                // dynamic array dimensions
                // makes array storage bigger, if needed
                if (count == (num.length - 1)) {
                    int n = num.length;
                    int[] original = num;
                    num = new int[n * 2];
                    for (int i = 0; i < n; i++) {
                        num[i] = original[i];
                    }
                    original = null;
                }
                num[count] = Integer.parseInt(line);
                line = input.readLine();
                count++;
            }

            input.close();

            // Trim the array to the number of values actually read
            int[] original = num;
            num = new int[count];
            for (int i = 0; i < count; i++) {
                num[i] = original[i];
            }
            original = null;
        } catch (IOException e) {
            System.out.println("Error: " + e.toString());
        }
        return num;
    }

    public void dumpArrayToFile(int[] input, int k) throws IOException {
        // Takes an array and dumps it to file
        try {
            PrintWriter output;

            output = new PrintWriter(new FileWriter("OUT" + fileLoc));
            for (int i = 0; i < range[k]; i++) {
                output.println(input[i]);
            }
            output.close();
        } catch (IOException e) {
            System.out.println("Error: " + e.toString());
        }
    }

    public void generateRandomData() throws IOException {
        // Generates a file filled with random integers
        System.out.println("Generating random data...");
        try {
            PrintWriter output;
            int number;

            output = new PrintWriter(new FileWriter(fileLoc));
            for (int i = 0; i < numOfInts; i++) {
                /* Debug output
                 * if (i == (numOfInts * 0.1)) System.out.print("10%...");
                 * if (i == (numOfInts * 0.2)) System.out.print("20%...");
                 * if (i == (numOfInts * 0.3)) System.out.print("30%...");
                 * if (i == (numOfInts * 0.4)) System.out.print("40%...");
                 * if (i == (numOfInts * 0.5)) System.out.print("50%...");
                 * if (i == (numOfInts * 0.6)) System.out.print("60%...");
                 * if (i == (numOfInts * 0.7)) System.out.print("70%...");
                 * if (i == (numOfInts * 0.8)) System.out.print("80%...");
                 * if (i == (numOfInts * 0.9)) System.out.print("90%...");
                 */
                number = (int) ((Math.random() * generatorSize) + 1);
                output.println(number);
            }
            output.close();
        } catch (IOException e) {
            System.out.println("Error: " + e.toString());
        } finally {
            /*System.out.println("100% done");*/
        }
    }

    public void generateAscendingData() throws IOException {
        // Generates a file filled with integers in ascending order
        System.out.println("Generating ascending data...");
        try {
            PrintWriter output;
            output = new PrintWriter(new FileWriter(fileLoc));
            for (int i = 0; i < numOfInts; i++) {
                output.println(i);
            }
            output.close();
        } catch (IOException e) {
            System.out.println("Error: " + e.toString());
        } finally {
            System.out.println("100% done");
        }
    }

    public void generateDescendingData() throws IOException {
        // Generates a file filled with integers in descending order
        System.out.println("Generating descending data...");
        try {
            PrintWriter output;

            output = new PrintWriter(new FileWriter(fileLoc));
            for (int i = numOfInts; i > 0; i--) {
                output.println(i);
            }
            output.close();
        } catch (IOException e) {
            System.out.println("Error: " + e.toString());
        } finally {
            System.out.println("100% done");
        }
    }

}


insertionSort.java

public class insertionSort {

    private int item;
    private int j;
    private int n;
    private int[] output;
    private long counter = 0;

    public insertionSort(int[] input, int n) {
        // input: array input storing minimum of n integers
        // output: array output stored in ascending order
        // n: number of data elements to sort

        if (n > input.length) { // Check range is available in the data
            System.out.println("Out of range, only " + input.length + " integers available.");
        } else {
            // Insertion sort algorithm
            for (int i = 1; i <= (n - 1); i++) {
                counter++;
                item = input[i];
                j = i - 1;

                while ((j >= 0) && (input[j] > item)) {
                    counter++;
                    input[j + 1] = input[j];
                    j = j - 1;
                }

                input[j + 1] = item;
            }
            output = input;
        }
    }

    public int[] getArray() {
        // Returns the completed array
        return output;
    }

    public long getIterations() {
        // Return timing information
        return counter;
    }

}


quickSort.java

public class quickSort {

    private int[] output;
    private long counter;
    private int pivot;
    private boolean overRidePivot = false;

    public quickSort(int[] input, int lower, int upper, boolean worstCase) {
        if (worstCase) setPivot(2); // Override for worst case pivot placement
        sort(input, lower, upper);
    }

    public void sort(int[] input, int lower, int upper) {
        int index = partition(input, lower, upper);
        if (lower < index - 1) {
            sort(input, lower, index - 1);
        }
        if (index < upper) {
            sort(input, index, upper);
        }
        output = input;
    }

    public int partition(int input[], int lower, int upper) {
        int i = lower;
        int j = upper;
        int tmp;
        if (!overRidePivot) {
            pivot = input[(lower + upper) / 2]; // Set pivot as middle value
        } else {
            overRidePivot = false; // Use the overridden pivot for this call only
        }
        while (i <= j) {
            counter++;
            while (input[i] < pivot) {
                counter++;
                i++;
            }
            while (input[j] > pivot) {
                counter++;
                j--;
            }
            if (i <= j) {
                tmp = input[i];
                input[i] = input[j];
                input[j] = tmp;
                i++;
                j--;
            }
        }
        return i;
    }

    public void setPivot(int pivotIn) {
        // Set pivot when overridden
        overRidePivot = true;
        pivot = pivotIn;
    }

    public int[] getArray() {
        // Return sorted array
        return output;
    }

    public long getIterations() {
        // Return timing information
        return counter;
    }

}