Copyright © 2003-2011 Curt Hill Sorting Ordering an array



Page 1: Sorting

Ordering an array

Page 2: Considered Topics

• Simple sort schemes
  – Algorithms with some code
• More complicated sort schemes
• Performance considerations
  – Time to sort in terms of the number of records N
  – How the number of compares and moves relates to the size of N

Page 3: Selection Sort

• Basic idea
• Scan the entire array
• Find the smallest element
• Move it to the top
• Remove the top from further consideration
• Repeat until the entire array is sorted

Page 4: How it works

[Figure: selection sort on the array 8, 3, 1, 9, 14, 2, 6. The top element (8) is exchanged with the least element (1), giving 1, 3, 8, 9, 14, 2, 6. The 1 now forms the sorted part of the array; the rest is the unsorted part.]

Page 5: Code

void sort(int ar[], int size){
  int temp;                      // index of the least element found so far
  for(int i=0;i<size-1;i++){
    temp = i;
    for (int j=i+1;j<size;j++)
      if(ar[temp]>ar[j])
        temp = j;
    if(temp!=i) {                // swap
      int val = ar[i];
      ar[i] = ar[temp];
      ar[temp] = val;
    }
  } // outer for
}

Page 6: Best and Worst Cases

• This is an unusual algorithm in that the best and worst cases are almost the same
• Best case
  – Already sorted
  – No moves are needed
  – All the compares are still done
• Worst case
  – Same number of compares
  – N-1 exchanges, the most the algorithm can ever make
  – An inversely sorted array needs only about N/2 exchanges, because each exchange puts two elements in their final places

Page 7: How it performs

• The first element is compared with all the other elements
  – N-1 compares
• The second element is compared with the remaining elements
  – N-2 compares
• Compares: (N-1) + (N-2) + … + 1 = N(N-1)/2 ≈ N^2/2
• Moves: at most N-1
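The counts above can be checked by instrumenting the selection sort. The following is a minimal sketch, not the deck's own code: the `SortCounts` struct and the function name `selection_sort_counted` are hypothetical names introduced only for this illustration.

```cpp
#include <utility>

// Hypothetical helper type: tallies of the work done by one sort.
struct SortCounts {
    long compares = 0;
    long exchanges = 0;
};

// Selection sort, instrumented to count compares and exchanges.
SortCounts selection_sort_counted(int ar[], int size) {
    SortCounts counts;
    for (int i = 0; i < size - 1; i++) {
        int least = i;                       // index of the least element so far
        for (int j = i + 1; j < size; j++) {
            counts.compares++;
            if (ar[least] > ar[j]) least = j;
        }
        if (least != i) {                    // exchange only when needed
            std::swap(ar[i], ar[least]);
            counts.exchanges++;
        }
    }
    return counts;
}
```

For N = 7 the compare count is 7·6/2 = 21 whatever the input order; only the number of exchanges changes with the data.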

Page 8: Comparing running times

• Mostly we are not concerned with many of the little issues of this analysis
  – It is N-1 instead of N
  – There is a factor of N^2/2 instead of N^2
• When we have two different factors we always take the most expensive
  – N^2 compares instead of N moves
• Thus selection sort is O(N^2)

Page 9: Common Os

• Constant time O(c) or O(1)
  – Hashing is constant time
• Logarithmic time O(log₂ N)
  – Binary and tree searches
• Linear time O(N)
  – File scans, bad searches
• N log N, O(N log₂ N) – no other name
  – Good sorts
• N squared O(N^2)
  – Bad sorts
• Polynomial O(N^x)
  – Expensive but doable
• Exponential O(e^N)
  – Intractable

Page 10: Bubble Sort

• Basic idea
• Start at top
• Compare adjacent elements
• Exchange if out of order
• Repeat until a pass has no exchanges

Page 11: First Pass

[Figure: the first bubble sort pass over the array 8, 3, 1, 9, 14, 2, 6, shown in four stages.]

• Small items bubble up slowly
  – One position per pass
• Large items sink quickly
  – Keep descending until they find a larger item or hit bottom

Page 12: Code

void sort (int ar[], int size){
  bool swapped;
  do {
    swapped = false;
    for(int j = 0;j<size-1;j++)
      if(ar[j] > ar[j+1]){       // adjacent pair is out of ascending order
        int temp = ar[j];
        ar[j] = ar[j+1];
        ar[j+1] = temp;
        swapped = true;
      } // if
  } while(swapped); // do
}

Page 13: How it performs

• Bubble sort makes many moves but always a short distance
• It also does many redundant compares
• O(N^2)
• Big O notation makes this comparable with selection
  – Usually much worse
  – Have to be creative to make a worse sort

Page 14: Best and Worst Cases

• Best case
  – Already sorted
  – One pass through does no exchanges and quits
• Worst case
  – Inversely sorted
  – The smallest only moves up one position per pass
  – N-1 passes
  – An array that is sorted except that the smallest element is in the last slot is almost as bad

Page 15: Bubble Again

• Consider two symmetric cases, sorted with one exception: the largest or the smallest is as far from its place as possible
  – One takes two passes, the other N-1
• The problem is the direction of the scan
  – Items going in that direction move fast
  – Items going in the other direction move slowly
• This suggests a fix
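The asymmetry between the two cases can be observed by counting passes. This is a rough sketch assuming an ascending bubble sort that always scans top to bottom; the function name `bubble_sort_passes` is made up for this illustration, and its count includes the final pass that finds nothing to exchange.

```cpp
#include <utility>

// Bubble sort with top-to-bottom scans. Returns the number of passes made,
// including the last pass, which makes no exchanges and ends the loop.
int bubble_sort_passes(int ar[], int size) {
    int passes = 0;
    bool swapped;
    do {
        swapped = false;
        passes++;
        for (int j = 0; j < size - 1; j++) {
            if (ar[j] > ar[j + 1]) {
                std::swap(ar[j], ar[j + 1]);
                swapped = true;
            }
        }
    } while (swapped);
    return passes;
}
```

With seven elements, an array that has only its largest item misplaced at the top finishes in 2 passes. An array that has only its smallest item misplaced at the bottom needs 6 exchanging passes (N-1) plus the confirming pass, so this sketch reports 7.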

Page 16: Shaker Sort

• Basic idea is the same as bubble sort
• Scan top to bottom in odd passes
• Scan bottom to top in even passes
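One possible reading of the alternating scans is sketched below. The function name `shaker_sort` and the shrinking `top`/`bottom` bounds are assumptions of this sketch; the slide only specifies the alternating scan directions.

```cpp
#include <utility>

// Shaker sort sketch: odd passes scan top to bottom, even passes bottom to top.
// Assumption: after each pass the element at the far end is in its final place,
// so the scanned range shrinks by one from that end.
void shaker_sort(int ar[], int size) {
    int top = 0;
    int bottom = size - 1;
    bool swapped = true;
    while (swapped && top < bottom) {
        swapped = false;
        for (int j = top; j < bottom; j++) {      // odd pass: large items sink
            if (ar[j] > ar[j + 1]) {
                std::swap(ar[j], ar[j + 1]);
                swapped = true;
            }
        }
        bottom--;
        if (!swapped) break;                      // a clean pass means sorted
        swapped = false;
        for (int j = bottom; j > top; j--) {      // even pass: small items rise
            if (ar[j - 1] > ar[j]) {
                std::swap(ar[j - 1], ar[j]);
                swapped = true;
            }
        }
        top++;
    }
}
```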

Page 17: First and Second Passes

[Figure: shaker sort on the array 8, 3, 1, 9, 14, 2, 6. The first pass goes top to bottom; the second pass goes bottom to top.]

Page 18: How it performs

• Insignificantly different from bubble sort
• The worst case occurs very infrequently
• The extra work to handle such cases complicates every run
• O(N^2)

Page 19: The Previous Problems

• The problem with both of these is the short distance things are moved
• They usually move in the right direction but seldom far enough
• One fix is to compare non-adjacent elements
• How?

Page 20: Shell Sort

• Start with a gap g, where 1 < g < N
• Do a sort pass comparing elements separated by the gap and exchanging if needed
• Decrease the gap in each pass
  – Do not divide the size by 2
• When the gap is one it is a bubble sort, but most of the large distance moving has been done
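A sketch of the gapped exchange passes follows. Two things here are assumptions, not taken from the slide: the gap sequence (1, 4, 13, 40, …, generated by 3g+1, one common choice that avoids halving) and the decision to repeat the passes at each gap until one makes no exchanges, which guarantees that the final gap of one finishes as a complete bubble sort. Many textbook versions use a gapped insertion sort instead.

```cpp
#include <utility>

// Shell sort sketch using exchange passes at a shrinking gap.
// Assumed gap sequence: 1, 4, 13, 40, ... (g = 3g + 1), walked downward.
void shell_sort(int ar[], int size) {
    int gap = 1;
    while (gap < size / 3) gap = 3 * gap + 1;     // largest starting gap

    for (; gap >= 1; gap /= 3) {                  // 13 -> 4 -> 1 -> stop
        bool swapped;
        do {                                      // bubble-style passes at this gap
            swapped = false;
            for (int j = 0; j + gap < size; j++) {
                if (ar[j] > ar[j + gap]) {
                    std::swap(ar[j], ar[j + gap]);
                    swapped = true;
                }
            }
        } while (swapped);
    }
}
```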

Page 21: First Pass

[Figure: the array before and after each exchange of the first Shell sort pass.]

Gap = 3

• First: 8 and 1 exchanged
• Third: 14 and 2 exchanged
• Fourth: 6 and 14 exchanged

Page 22: How it performs

• The analysis is extremely difficult
• Empirically it is about O(N^1.25)
• This makes it better than bubble or selection for all but insignificant table sizes
• The break-even point between O(N^1.25) and O(N log₂ N) is about size = 65,000 (at N = 65,536 both N^0.25 and log₂ N equal 16); however, the constant factor on Shell is large, so the break-even point is much smaller
• Still inferior to the N log N sorts for large tables

Page 23: Insertion Sort

• Partition the array into two pieces
• The first element and all the rest
• The first part of the array is already sorted
• Remove the first unsorted item
• Insert it into the correct location of the sorted part
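The steps above can be sketched as follows. The function name `insertion_sort` is chosen for this illustration; the signature mirrors the deck's other sorts.

```cpp
// Insertion sort sketch: ar[0..i-1] is the sorted part, ar[i..] is unsorted.
void insertion_sort(int ar[], int size) {
    for (int i = 1; i < size; i++) {
        int item = ar[i];                 // remove the first unsorted item
        int j = i;
        while (j > 0 && ar[j - 1] > item) {
            ar[j] = ar[j - 1];            // shift larger items down one slot
            j--;
        }
        ar[j] = item;                     // insert into the correct location
    }
}
```

On already sorted input the inner loop never shifts anything, which is why sorted data is the best case.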

Page 24: How it works

[Figure: insertion sort on the array 8, 3, 1, 9, 14, 2, 6. The sorted part of the array starts as the single element 8; the rest is the unsorted part. The first unsorted item, 3, is removed and then inserted ahead of 8.]

Page 25: How it performs

• Best case is sorted
• Worst case is inversely sorted
• Yet another N^2
• Moves: N-1 insertions, but each insertion can shift many elements

Page 26: Merge Sort

• Merge increasingly larger sorted runs into a single much larger run
• Start with runs of 1
• Merge two runs into a temporary area
• Copy it back
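A bottom-up version of these steps might look like the sketch below. The name `merge_sort` and the use of `std::vector` for the temporary area are choices made for this illustration. For simplicity the temporary area is allocated once at full array size.

```cpp
#include <algorithm>
#include <vector>

// Bottom-up merge sort sketch: runs of 1, then 2, then 4, ...
void merge_sort(int ar[], int size) {
    std::vector<int> temp(size > 0 ? size : 0);       // temporary area
    for (int run = 1; run < size; run *= 2) {
        for (int lo = 0; lo < size; lo += 2 * run) {
            int mid = std::min(lo + run, size);       // end of the left run
            int hi = std::min(lo + 2 * run, size);    // end of the right run
            int a = lo, b = mid, out = lo;
            while (a < mid && b < hi)                 // merge the two runs
                temp[out++] = (ar[b] < ar[a]) ? ar[b++] : ar[a++];
            while (a < mid) temp[out++] = ar[a++];    // leftovers, left run
            while (b < hi) temp[out++] = ar[b++];     // leftovers, right run
            for (int k = lo; k < hi; k++) ar[k] = temp[k];   // copy it back
        }
    }
}
```

Ties are taken from the left run, so equal keys keep their original order.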

Page 27: Pass 1

[Figure: merge sort pass 1 on the array 8, 3, 1, 9, 14, 2, 6. Start: each element is a run of 1 (Run 1 to Run 7). End of pass 1: runs of 2 (Run 1 to Run 4).]

Page 28: Pass 2

[Figure: merge sort pass 2. The runs of 2 (Run 1 to Run 4) are merged into runs of 4 (Run 1 and Run 2).]

Page 29: Important points on Merge

• An item in a run is never compared with any other element in the same run
• It generalizes to files nicely
• Requires extra copy space equal to the size of the longest run
• In the last pass that is the entire array
• First of the O(N log₂ N) sorts
• The insertion process generates many moves

Page 30: Quick Sort

• A complicated but very fast sort
  – Usually the best of the in-memory sorts
• Never compares two items twice
• Always moves things in the right direction
• Usually moves them a relatively long distance

Page 31: Algorithm I

• The first item is called the pivot
  – It will end up as the middle element, between the two halves
• From the top look for an item that is larger than the pivot
• From the bottom look for an item that is smaller than the pivot
• The two items are respectively in the wrong “half” of the table
  – Recall the pivot will be the middle item
• Exchange the two

Page 32: Algorithm II

• When the searches collide, move the pivot there
• Now have three partitions:
  – Lower – sort it by itself
  – Pivot – nothing more needs to be done
  – Higher – sort it by itself
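Algorithm I and II together can be sketched as below. The names `quick_sort_range` and `quick_sort`, and the exact handling of keys equal to the pivot, are assumptions of this sketch rather than the deck's code.

```cpp
#include <utility>

// Sorts ar[first..last] (inclusive) using the first item as the pivot.
void quick_sort_range(int ar[], int first, int last) {
    if (first >= last) return;            // zero or one item: nothing to do
    int pivot = ar[first];
    int up = first + 1;                   // scans from the top for a larger item
    int down = last;                      // scans from the bottom for a smaller item
    while (true) {
        while (up <= down && ar[up] <= pivot) up++;
        while (up <= down && ar[down] >= pivot) down--;
        if (up > down) break;             // the searches have collided
        std::swap(ar[up], ar[down]);      // both were in the wrong "half"
        up++;
        down--;
    }
    std::swap(ar[first], ar[down]);       // move the pivot to the collision point
    quick_sort_range(ar, first, down - 1);    // lower partition
    quick_sort_range(ar, down + 1, last);     // higher partition
}

// Convenience wrapper with the same signature as the deck's other sorts.
void quick_sort(int ar[], int size) {
    quick_sort_range(ar, 0, size - 1);
}
```

On the array 8, 3, 1, 9, 14, 2, 6 the first call exchanges 9 with 6, then 14 with 2, and finally swaps the pivot 8 into place, leaving 2, 3, 1, 6, 8, 14, 9 with 8 between the lower and higher partitions.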

Page 33: Quick Sort

[Figure: quick sort partitioning of the array 8, 3, 1, 9, 14, 2, 6. Start: the pivot is 8, start looking. Both searches find an item; 1st exchange; 2nd exchange; pivot exchange. Done: three partitions.]

Page 34: Performance

• A pair of distinct keys is never compared twice
• The trick is partitioning the array into two separate pieces that never interact again
• (N/2)^2 + (N/2)^2 < N^2
  – 20^2 = 400
  – 10^2 + 10^2 = 200
• O(N log₂ N)

Page 35: More on Performance

• A happy accident is that the pivot may be placed in a CPU register
  – It is the only value compared to the entire array
  – This makes it free and quick to access
• Notice the recursive nature of this algorithm
  – The array is partitioned into two pieces
  – These may be different sizes, and the sort is recursively invoked on them

Page 36: Best and Worst Case

• It does better on an unsorted file than on a sorted one
  – Counter-intuitive
• The worst case is the sorted or inversely sorted file
  – The chosen pivot divides the table into two, not three, partitions
  – O(N^2) in this case

Page 37: Improvements 1

• The worst case makes one think about choosing a different pivot
• Any search for a better pivot slows the average case
• The case of a sorted array to be sorted is extremely unlikely
  – For 10 elements – 2 chances in 3,628,800 (10!) for it to be sorted or inversely sorted, if every ordering is equally likely

Page 38: Improvements 2

• The partitioning scheme is complicated enough that it does worse than simple sorts in very small arrays: 6-12 entries
  – Recursion to sort a table of length 3 is wasteful in memory and CPU cycles
• The only real improvement is to use a simpler sort when the partition size gets small
  – If the partition is small just use a simple N^2 sort
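One way to apply this improvement is sketched below: a quick sort that hands small partitions to an insertion sort. The cutoff value of 10 is a guess taken from the middle of the 6-12 range on the slide, and all names (`kCutoff`, `insertion_sort_range`, `hybrid_quick_sort_range`, `hybrid_quick_sort`) are hypothetical.

```cpp
#include <utility>

const int kCutoff = 10;   // assumed threshold; the slide suggests 6-12 entries

// Simple N^2 sort for a small slice ar[first..last] (inclusive).
void insertion_sort_range(int ar[], int first, int last) {
    for (int i = first + 1; i <= last; i++) {
        int item = ar[i];
        int j = i;
        while (j > first && ar[j - 1] > item) {
            ar[j] = ar[j - 1];
            j--;
        }
        ar[j] = item;
    }
}

// Quick sort that switches to insertion sort once a partition is small.
void hybrid_quick_sort_range(int ar[], int first, int last) {
    if (last - first + 1 <= kCutoff) {
        insertion_sort_range(ar, first, last);
        return;
    }
    int pivot = ar[first];
    int up = first + 1;
    int down = last;
    while (true) {
        while (up <= down && ar[up] <= pivot) up++;
        while (up <= down && ar[down] >= pivot) down--;
        if (up > down) break;
        std::swap(ar[up], ar[down]);
        up++;
        down--;
    }
    std::swap(ar[first], ar[down]);           // pivot lands between the partitions
    hybrid_quick_sort_range(ar, first, down - 1);
    hybrid_quick_sort_range(ar, down + 1, last);
}

void hybrid_quick_sort(int ar[], int size) {
    hybrid_quick_sort_range(ar, 0, size - 1);
}
```

The best cutoff depends on the machine and the data, so it is worth measuring rather than trusting the constant used here.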

Page 39: Two more thoughts

• Virtual memory can disrupt sorting when pieces of the array are paged out
  – True for any sort
  – If possible fix the pages in memory
• Quick sort could use threads
  – Spawn a thread for one of the partitions if it were of sufficient size
  – Would need to be large to make thread overhead worthwhile

Page 40: Heap Sort

• Builds a binary tree in the array
• The positions of the left and right sub-trees are implicit rather than needing pointers
• Also an O(N log₂ N) sort
• Rather complicated
• Will not be shown
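Although the deck leaves the code out, the implicit tree is easy to illustrate. The sketch below assumes 0-based indexing, where the children of the node at index i sit at 2i+1 and 2i+2; the names `sift_down` and `heap_sort` are chosen for this example. It is one common formulation, not necessarily the one the author had in mind.

```cpp
#include <utility>

// Restore the max-heap property for the subtree rooted at `root`,
// considering only ar[0..size-1]. Children of i are at 2*i+1 and 2*i+2.
void sift_down(int ar[], int root, int size) {
    while (true) {
        int left = 2 * root + 1;
        int right = left + 1;
        int largest = root;
        if (left < size && ar[left] > ar[largest]) largest = left;
        if (right < size && ar[right] > ar[largest]) largest = right;
        if (largest == root) return;          // both children are smaller
        std::swap(ar[root], ar[largest]);
        root = largest;                       // continue down the tree
    }
}

void heap_sort(int ar[], int size) {
    // Build the heap bottom-up, starting at the last node with a child.
    for (int i = size / 2 - 1; i >= 0; i--) sift_down(ar, i, size);
    // Repeatedly move the largest item to the end and shrink the heap.
    for (int end = size - 1; end > 0; end--) {
        std::swap(ar[0], ar[end]);
        sift_down(ar, 0, end);
    }
}
```

No pointers, recursion, or extra array are needed; the index arithmetic alone defines the tree.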

Page 41: Heap Sort Performance

• Slowest of the O(N log₂ N) sorts
• Advantages:
  – Does not need the recursion of quicksort
  – Does not need the extra space of mergesort
  – Worst case is still O(N log₂ N), unlike quicksort

Page 42: Summary

• Several sorts with varying performance:
  – N^2: Selection, Bubble, Shaker, Insertion
  – N^1.25: Shell
  – N log₂ N: Merge, Quick, Heap