css342: sorting algorithms

CSS342: Sorting Algorithms. Professor: Munehiro Fukuda

Upload: isleen

Post on 14-Jan-2016


Page 1: CSS342: Sorting Algorithms


CSS342: Sorting Algorithms

Professor: Munehiro Fukuda

Page 2: CSS342: Sorting Algorithms

Why Do We Desperately Need Efficient Sorting Algorithms?

• Data must be sorted before we run the following programs:
  – Search algorithms such as binary search and interpolation search
  – Many computational geometry/graphics algorithms such as the convex hull
• We always or frequently need to sort the following data:
  – Dictionary
  – White/yellow pages
  – Student grades

Page 3: CSS342: Sorting Algorithms

Topics

Day 1: Lecture
• Selection Sort: worst/average O(n^2)
• Bubble Sort: worst/average O(n^2)
• Insertion Sort: worst/average O(n^2)
• Shell Sort: worst O(n^2), average O(n^(3/2))
• Merge Sort: worst/average O(n log n)
• Quick Sort: worst O(n^2), average O(n log n)
• Radix Sort: worst/average O(n)

Day 2: Lab Work
• Partial Quick Sort

Homework Assignment
• Non-recursive Semi-In-Place Merge Sort

Page 4: CSS342: Sorting Algorithms

Selection Sort

Initial array:    29 10 14 37 13
After 1st swap:   29 10 14 13 | 37
After 2nd swap:   13 10 14 | 29 37
After 3rd swap:   13 10 14 | 29 37
After 4th swap:   10 | 13 14 29 37

Items to the right of "|" are in their final positions. Indices run from 0 to size-1.

• Scan items 0 to size-1, locate the largest item, and swap it with the rightmost item.
• Scan items 0 to size-2, locate the 2nd largest item, and swap it with the 2nd rightmost item.
• Scan items 0 to size-3, locate the 3rd largest item, and swap it with the 3rd rightmost item.
• Scan items 0 to size-4, locate the 4th largest item, and swap it with the 4th rightmost item.
• Scan items 0 to size-5, locate the 5th largest item, and swap it with the 5th rightmost item.

O(n^2) sorting

Page 5: CSS342: Sorting Algorithms

Selection Sort

#include <utility>   // swap
#include <vector>
using namespace std;

template <class Object>
void selectionSort( vector<Object> & a ) {
  for ( int last = a.size( ) - 1; last >= 1; --last ) {
    int indexSoFar = 0; // Index of the largest item found so far.
                        // Assume the 0th item is the largest at first.
    for ( int i = 1; i <= last; ++i ) {
      if ( a[i] > a[indexSoFar] )
        indexSoFar = i;
    }
    // indexSoFar points to the largest item at this point
    swap( a[indexSoFar], a[last] );
  }
}

[Figure: in each pass, indexSoFar locates the largest item in a[0..last], which is swapped with a[last]; last starts at a.size( ) - 1 and moves left by one per pass.]

O(n^2) sorting

Page 6: CSS342: Sorting Algorithms

Efficiency of Selection Sort

                                      Comparisons    Swapping
Initial array:    29 10 14 37 13
After 1st swap:   29 10 14 13 | 37    N-1 (=4)       1
After 2nd swap:   13 10 14 | 29 37    N-2 (=3)       1
After 3rd swap:   13 10 14 | 29 37    N-3 (=2)       1
After 4th swap:   10 | 13 14 29 37    N-4 (=1)       1
Total                                 O(n(n-1)/2)    O(n-1)

Overall: O(n^2)

O(n^2) sorting
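The counts in the table can be checked with a small instrumented version of the selectionSort routine shown earlier. This is only a sketch: SortCounts and countedSelectionSort are hypothetical names introduced here, not part of the course code.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper type: tallies the two operations the slide counts.
struct SortCounts {
  long comparisons = 0;
  long swaps = 0;
};

// Same algorithm as selectionSort above, with counters added.
template <class Object>
SortCounts countedSelectionSort(std::vector<Object>& a) {
  SortCounts counts;
  for (int last = static_cast<int>(a.size()) - 1; last >= 1; --last) {
    int indexSoFar = 0;  // index of the largest item found so far
    for (int i = 1; i <= last; ++i) {
      ++counts.comparisons;  // one comparison per scanned item
      if (a[i] > a[indexSoFar]) indexSoFar = i;
    }
    std::swap(a[indexSoFar], a[last]);
    ++counts.swaps;  // exactly one swap per pass
  }
  return counts;
}
```

For the five-item example, the routine performs 4 + 3 + 2 + 1 = 10 comparisons and 4 swaps, which matches n(n-1)/2 and n-1 for n = 5.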

Page 7: CSS342: Sorting Algorithms

Bubble Sort

Initial array:   29 10 14 37 13
After pass 1:    10 14 29 13 | 37
After pass 2:    10 14 13 | 29 37
After pass 3:    10 13 | 14 29 37
After pass 4:    10 | 13 14 29 37

[Figure: each pass compares adjacent pairs from left to right and swaps any pair that is out of order, so the largest remaining item bubbles to the right end.]

O(n^2) sorting

Page 8: CSS342: Sorting Algorithms

Bubble Sort

[Figure: a[index] = 15 and a[nextIndex] = 10 are out of order, so they are swapped.]

#include <iostream>
#include <string>
#include <utility>   // swap
#include <vector>

using namespace std;

template <class Object>
void bubbleSort( vector<Object> & a ) {
  bool swapOccurred = true; // true when swaps occur

  for ( int pass = 1; ( pass < a.size( ) ) && swapOccurred; ++pass ) {
    swapOccurred = false; // swaps have not occurred at the beginning
    for ( int i = 0; i < a.size( ) - pass; ++i ) { // a bubble(i) goes from 0 to size - pass
      if ( a[i] > a[i + 1] ) {
        swap( a[i], a[i + 1] );
        swapOccurred = true; // a swap has occurred
      }
    }
  }
}

O(n^2) sorting

Page 9: CSS342: Sorting Algorithms

Efficiency of Bubble Sort

Worst-case counts per pass (N items):

Pass     Comparison   Swapping
1        N-1          N-1
2        N-2          N-2
…        …            …
N-1      1            1
Total    O(n^2)       O(n^2)

Overall: O(n^2)

[Figure: the pass 1 and pass 2 traces of the array 29 10 14 37 13.]

O(n^2) sorting

Page 10: CSS342: Sorting Algorithms

Insertion Sort

Sorted | Unsorted         Action
29 | 10 14 37 13          Copy 10
__ 29 | 14 37 13          Shift 29
10 29 | 14 37 13          Insert 10; copy 14
10 __ 29 | 37 13          Shift 29
10 14 29 | 37 13          Insert 14; copy 37
10 14 29 37 | 13          Shift nothing; copy 13
10 __ 14 29 37            Shift 37, 29 and 14
10 13 14 29 37            Insert 13

The copied item is held in unsortedTop while larger sorted items are shifted to the right.

O(n^2) sorting

Page 11: CSS342: Sorting Algorithms

Insertion Sort

template <class Object>
void SortedList<Object>::insertionSort( ) {
  for ( int unsorted = 1; unsorted < array.size( ); ++unsorted ) {
    // Assume the 0th item is sorted. Unsorted items start from the 1st item
    Object unsortedTop = array[unsorted]; // Copy the top of unsorted group
    int i;
    for ( i = unsorted - 1; ( i >= 0 ) && ( array[i] > unsortedTop ); --i )
      // Upon a successful comparison, shift array[i] to the right
      array[i + 1] = array[i];
    // insert the unsorted top into the sorted group;
    array[i + 1] = unsortedTop;
  }
}

#endif

[Figure: unsortedTop (13) is copied out of the unsorted region; 37, 29, and 14 are each compared with it and shifted from loc to loc+1; then 13 is inserted into the opened slot.]

O(n^2) sorting

Page 12: CSS342: Sorting Algorithms

Efficiency of Insertion Sort

Worst-case counts for each item moved from the unsorted region into the sorted region:

Insertion step   Comparison   Insertion   Shift
1st              1            1           1
2nd              2            1           2
3rd              3            1           3
(N-1)th          N-1 (=4)     1           N-1 (=4)
Total            O(n^2)       O(n)        O(n^2)

Overall: O(n^2) + O(n) + O(n^2) = O(n^2)

O(n^2) sorting

Page 13: CSS342: Sorting Algorithms

ShellSort

• The idea is to perform an insertion sort among items that are a gap apart.
• This reduces the large amount of data movement.

Array (indices 0 to 16): 81 94 11 96 12 35 17 95 28 58 41 75 15 85 87 38 20

• gap = 17/2 = 8 (the size is initially divided by 2): sort the items that are 8 apart.
• gap = 8/2.2 = 3 (the divisor 2.2 is practically chosen): sort the items that are 3 apart.
• gap = 3/2.2 = 1: sort adjacent items, which is an ordinary insertion sort.

[Figure: the array redrawn for each gap, with every group of gap-separated items sorted.]

O(n^(3/2)) sorting
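The gap sequence 8, 3, 1 for 17 items can be reproduced with a few lines of code. The sketch below reuses the gap update rule from the deck's shellsort routine (divide by 2 first, then by 2.2, with a gap of 2 forced to 1); the function name shellGaps is a hypothetical helper added only for illustration.

```cpp
#include <vector>

// Returns the gaps visited for n items: n/2 first, then repeated division
// by 2.2, truncated to an integer. A gap of 2 is followed by 1 so that the
// final pass is always a plain insertion sort.
std::vector<int> shellGaps(int n) {
  std::vector<int> gaps;
  for (int gap = n / 2; gap > 0;
       gap = (gap == 2) ? 1 : static_cast<int>(gap / 2.2)) {
    gaps.push_back(gap);
  }
  return gaps;
}
```

Calling shellGaps(17) yields 8, 3, 1: 17/2 truncates to 8, 8/2.2 is about 3.6 and truncates to 3, and 3/2.2 is about 1.4 and truncates to 1.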

Page 14: CSS342: Sorting Algorithms

ShellSort

#include <vector>
using namespace std;

template <class Comparable>
void shellsort( vector<Comparable> &a ) {
  for ( int gap = a.size( ) / 2; gap > 0;
        gap = ( gap == 2 ) ? 1 : int( gap / 2.2 ) ) {
    for ( int i = gap; i < a.size( ); i++ ) {
      Comparable tmp = a[i];   // item to insert into its gap-separated group
      int j = i;
      for ( ; j >= gap && tmp < a[j - gap]; j -= gap )
        a[j] = a[j - gap];     // shift the larger item gap positions to the right
      a[j] = tmp;
    }
  }
}

Example: the array 81 94 11 96 12 35 17 95 28 58 41 75 15 85 87 38 20 (indices 0 to 16) with gap = 17/2 = 8. Assume i = a.size( ) - 1 = 16:
• Copy a[16] (= 20) into tmp.
• Shift a[16 - 8] if it is larger than tmp.
• Shift a[16 - 8 * 2] if it is larger than tmp.
• Store tmp in the slot that was opened last.

O(n^(3/2)) sorting

Page 15: CSS342: Sorting Algorithms

Efficiency of ShellSort

• Performance
  – Worst case: O(N^2)
  – Average case:
    • O(N^(3/2)) when dividing by 2
    • O(N^(5/4)) or O(N^(7/6)) when dividing by 2.2
  – Proof:
    • A long-standing open problem

O(n^(3/2)) sorting

Page 16: CSS342: Sorting Algorithms

Sorting Algorithms

• Selection Sort, Bubble Sort, Insertion Sort, Shell Sort: O(n^2) (Shell's average case depends on the increment.)
• Merge Sort, Quick Sort: O(n log n). Use a recursive solution and take advantage of a tree's log(n) characteristics.

O(n log n) sorting

Page 17: CSS342: Sorting Algorithms

Mergesort (with an auxiliary temporary array)

Assuming that we already have two sorted arrays, how can we merge them into one sorted array?

Sorted array 1:   1 4 8 13 14 20 25
Sorted array 2:   2 3 5 7 11 23

O(n log n) sorting
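One way to answer the question is to walk both arrays with one index each and always take the smaller front item. The following is a minimal sketch under that assumption; mergeSorted is a hypothetical stand-alone helper, separate from the deck's in-array merge routine.

```cpp
#include <cstddef>
#include <vector>

// Merges two already sorted vectors into a new sorted vector.
template <class Comparable>
std::vector<Comparable> mergeSorted(const std::vector<Comparable>& left,
                                    const std::vector<Comparable>& right) {
  std::vector<Comparable> merged;
  merged.reserve(left.size() + right.size());
  std::size_t i = 0;  // next unread item in left
  std::size_t j = 0;  // next unread item in right
  while (i < left.size() && j < right.size()) {
    if (right[j] < left[i]) {
      merged.push_back(right[j++]);
    } else {
      merged.push_back(left[i++]);  // on ties, take from left first
    }
  }
  // At most one of the two inputs still has items; append them unchanged.
  while (i < left.size()) merged.push_back(left[i++]);
  while (j < right.size()) merged.push_back(right[j++]);
  return merged;
}
```

Merging the two example arrays this way gives 1 2 3 4 5 7 8 11 13 14 20 23 25. Each item is copied once, so the merge is linear in the total number of items.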

Page 18: CSS342: Sorting Algorithms

Mergesort (with an auxiliary temporary array)

template <class Comparable>
void merge( vector<Comparable> &a, int first, int mid, int last ) {
  vector<Comparable> tempArray( a.size( ) );
  int first1 = first;     // left sorted run: a[first1..last1]
  int last1 = mid;
  int first2 = mid + 1;   // right sorted run: a[first2..last2]
  int last2 = last;

  int index = first1;
  for ( ; ( first1 <= last1 ) && ( first2 <= last2 ); ++index ) {
    if ( a[first1] < a[first2] ) {
      tempArray[index] = a[first1];
      ++first1;
    } else {
      tempArray[index] = a[first2];
      ++first2;
    }
  }
  for ( ; first1 <= last1; ++first1, ++index )    // copy the rest of the left run
    tempArray[index] = a[first1];
  for ( ; first2 <= last2; ++first2, ++index )    // copy the rest of the right run
    tempArray[index] = a[first2];
  for ( index = first; index <= last; ++index )   // copy the merged run back
    a[index] = tempArray[index];
}

[Figure: theArray holds two sorted runs, first..mid and mid+1..last; first1/last1 and first2/last2 bound the unread part of each run; the smaller front item (< versus >=) is copied to tempArray[index].]

O(n log n) sorting

Page 19: CSS342: Sorting Algorithms

Mergesort (from bottom to top: conquer)

Single items:     38 | 16 | 27 | 39 | 12 | 17 | 24 | 5
Merged pairs:     16 38 | 27 39 | 12 17 | 5 24
Merged quads:     16 27 38 39 | 5 12 17 24
Fully merged:     5 12 16 17 24 27 38 39

Now, how can we separate the array into individual items?

O(n log n) sorting

Page 20: CSS342: Sorting Algorithms

Mergesort (from top to bottom: divide)

theArray (first to last):            38 16 27 39 12 17 24 5
Split at mid = (first + last)/2:     38 16 27 39 | 12 17 24 5
Split each half at its own mid:      38 16 | 27 39 | 12 17 | 24 5
Keep splitting while first < last:   38 | 16 | 27 | 39 | 12 | 17 | 24 | 5

O(n log n) sorting

Page 21: CSS342: Sorting Algorithms

Mergesort (final view)

Divide (split theArray at mid = (first + last)/2 while first < last):
38 16 27 39 12 17 24 5
38 16 27 39 | 12 17 24 5
38 16 | 27 39 | 12 17 | 24 5
38 | 16 | 27 | 39 | 12 | 17 | 24 | 5

Conquer (merge):
16 38 | 27 39 | 12 17 | 5 24
16 27 38 39 | 5 12 17 24
5 12 16 17 24 27 38 39

template <class Comparable>
void mergesort( vector<Comparable> &a, int first, int last ) {
  if ( first < last ) {
    int mid = ( first + last ) / 2;
    mergesort( a, first, mid );      // sort the left half
    mergesort( a, mid + 1, last );   // sort the right half
    merge( a, first, mid, last );    // merge the two sorted halves
  }
}

O(n log n) sorting

Page 22: CSS342: Sorting Algorithms

Mergesort (Efficiency Analysis)

Level   # pairs of arrays   # comparisons in a pair   # copies in a pair   Array after the level
0                                                                         38 | 16 | 27 | 39 | 12 | 17 | 24 | 5
1       4                   1                         2 * 2                16 38 | 27 39 | 12 17 | 5 24
2       2                   3                         4 * 2                16 27 38 39 | 5 12 17 24
3       1                   7                         8 * 2                5 12 16 17 24 27 38 39
x       n/2^x               2^x - 1                   2^x * 2

• At level x, # items in each pair = 2^x
• At level x, # major operations = n/2^x * (3 * 2^x - 1) = O(3n)
• # levels = log n, where n = # array elements (if n is a power of 2)
• # levels = log n + 1 if n is not a power of 2
• # operations = O(3n) * (log n + 1) = O(3 n log n) = O(n log n)

O(n log n) sorting

Page 23: CSS342: Sorting Algorithms

Quicksort (A partition about a pivot)

Items:            13 81 92 43 65 31 57 26 75 0
Select a pivot:   65
Partition:        smaller items 13 43 31 57 26 0 | pivot 65 | larger items 81 92 75

O(n log n) sorting

Page 24: CSS342: Sorting Algorithms

Quicksort (Code overview)

template <class Comparable>
void quicksort( vector<Comparable> &a, int first, int last ) {
  int pivotIndex; // after partition, pivotIndex points to a pivot

  if ( first < last ) {
    partition( a, first, last, pivotIndex );
    quicksort( a, first, pivotIndex - 1 );
    quicksort( a, pivotIndex + 1, last );
  }
}

O(n log n) sorting

Page 25: CSS342: Sorting Algorithms

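Quicksort (Partitioning Algorithm)

Layout of a[first..last] during partitioning:

Region    Contents       Index range
p         the pivot      first
S1        items < p      first+1 .. lastS1
S2        items >= p     lastS1+1 .. firstUnknown-1
unknown   ?              firstUnknown .. last

Repeatedly move the first element of the unknown region into S1 or S2 until the size of the unknown region reaches 0.

Initial state: lastS1 = first and firstUnknown = first + 1, so S1 and S2 are empty and every item after the pivot is unknown.

O(n log n) sorting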

Page 26: CSS342: Sorting Algorithms

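Quicksort (Moving a new unknown into S1)

Before:   p | S1 (< p) | S2 (>= p) | new (< p) | unknown (?)
After:    p | S1 (< p), now including new | S2 (>= p) | unknown (?)

• The new item a[firstUnknown] is smaller than p, so it belongs in S1.
• Increment lastS1, swap a[firstUnknown] with a[lastS1] (the first item of S2), and then increment firstUnknown.

O(n log n) sorting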

Page 27: CSS342: Sorting Algorithms

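Quicksort (Moving a new unknown into S2)

Before:   p | S1 (< p) | S2 (>= p) | new (>= p) | unknown (?)
After:    p | S1 (< p) | S2 (>= p), now including new | unknown (?)

• The new item a[firstUnknown] is not smaller than p, so it belongs in S2.
• S2 is adjacent to the unknown region, so only firstUnknown is incremented; no swap is needed.

O(n log n) sorting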

Page 28: CSS342: Sorting Algorithms

Quicksort (Partitioning Code)

template <class Comparable>
void partition( vector<Comparable> &a, int first, int last, int& pivotIndex ) {
  // choose a pivot and place it in a[first]
  choosePivot( a, first, last );
  Comparable pivot = a[first];
  int lastS1 = first;
  int firstUnknown = first + 1;

  for ( ; firstUnknown <= last; ++firstUnknown )
    if ( a[firstUnknown] < pivot ) {
      ++lastS1;
      swap( a[firstUnknown], a[lastS1] );
    } // else item from unknown belongs in S2
  swap( a[first], a[lastS1] );   // place the pivot between S1 and S2
  pivotIndex = lastS1;
}

[Figure: the initial state (only p and the unknown region), a new item smaller than p being swapped into S1, and the final swap that moves p from a[first] to a[lastS1] once the unknown region is empty.]

O(n log n) sorting
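The slides call choosePivot without showing its body. One common choice, given here only as an assumption about what such a helper might do, is median-of-three: take the median of the first, middle, and last items and move it to a[first], where partition expects the pivot.

```cpp
#include <utility>
#include <vector>

// Hypothetical choosePivot: moves the median of a[first], a[mid], a[last]
// into a[first]. The deck does not define this function; median-of-three
// is an illustrative choice, not the course's stated method.
template <class Comparable>
void choosePivot(std::vector<Comparable>& a, int first, int last) {
  int mid = first + (last - first) / 2;
  // Order the three samples so that a[first] <= a[mid] <= a[last].
  if (a[mid] < a[first]) std::swap(a[mid], a[first]);
  if (a[last] < a[first]) std::swap(a[last], a[first]);
  if (a[last] < a[mid]) std::swap(a[last], a[mid]);
  // The median now sits at mid; move it to the front for partition.
  std::swap(a[first], a[mid]);
}
```

With this choice, an already sorted segment no longer produces the smallest item as the pivot, which is the situation that leaves S1 empty at every level.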

Page 29: CSS342: Sorting Algorithms

Quicksort (Example)

27 28 12 39 26 16    Original array (the pivot is 27)
27 28 12 39 26 16    firstUnknown = 1 (points to 28); 28 belongs in S2
27 28 12 39 26 16    S1 is empty; 12 belongs in S1, so swap 28 and 12
27 12 28 39 26 16    39 belongs in S2
27 12 28 39 26 16    26 belongs in S1, so swap 28 and 26
27 12 26 39 28 16    16 belongs in S1, so swap 39 and 16
27 12 26 16 28 39    S1 and S2 are determined
16 12 26 27 28 39    Place pivot between S1 and S2

O(n log n) sorting

Page 30: CSS342: Sorting Algorithms

Quicksort (Efficiency Analysis)

• Worst case: If the pivot is the smallest item in the array segment, S1 will remain empty.
  – S2 decreases in size by only 1 at each recursive call.
  – Level 1 requires n-1 comparisons.
  – Level 2 requires n-2 comparisons.
  – Thus, (n-1) + (n-2) + … + 2 + 1 = n(n-1)/2 = O(n^2)
  – Then, how can we select the best pivot?
• Average case: assume S1 and S2 contain about the same number of items.
  – log n or log n + 1 levels of recursion occur.
  – Each level requires at most n-1 comparisons.
  – Thus, at most (n-1) * (log n + 1) = O(n log n)

O(n log n) sorting
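The worst-case count can be observed directly by running quicksort on an already sorted array with a[first] always taken as the pivot. The sketch below inlines the partition loop from the earlier slide and counts item comparisons; countedQuicksort is a hypothetical name, and the fixed first-item pivot is an assumption made to trigger the worst case.

```cpp
#include <utility>
#include <vector>

// Quicksort with a[first] as the pivot; returns the number of item
// comparisons performed while sorting a[first..last].
long countedQuicksort(std::vector<int>& a, int first, int last) {
  if (first >= last) return 0;
  long comparisons = 0;
  int pivot = a[first];
  int lastS1 = first;
  for (int firstUnknown = first + 1; firstUnknown <= last; ++firstUnknown) {
    ++comparisons;  // one comparison against the pivot per unknown item
    if (a[firstUnknown] < pivot) {
      ++lastS1;
      std::swap(a[firstUnknown], a[lastS1]);
    }
  }
  std::swap(a[first], a[lastS1]);  // place the pivot between S1 and S2
  comparisons += countedQuicksort(a, first, lastS1 - 1);
  comparisons += countedQuicksort(a, lastS1 + 1, last);
  return comparisons;
}
```

On the sorted input 1 2 3 4 5 6 7 8 the pivot is always the smallest item, so the levels cost 7 + 6 + … + 1 = 28 comparisons, which equals n(n-1)/2 for n = 8.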

Page 31: CSS342: Sorting Algorithms

Mergesort versus Quicksort

             Worst case   Average case
Mergesort    n log n      n log n
Quicksort    n^2          n log n

Then, why do we need Quicksort?
Reasons:
1. Mergesort requires item-copying operations from the array a to the temp array and vice versa.
2. A worst-case situation is not typical.

Then, why do we need Mergesort?
Reason:
If you sort a linked list, no item-copying operations are necessary.

O(n log n) sorting

Page 32: CSS342: Sorting Algorithms

Radix Sort (Algorithm Overview)

0123 2154 0222 0004 0283 1560 1061 2150                Original integers
(1560 2150) (1061) (0222) (0123 0283) (2154 0004)      Grouped by 4th digit
1560 2150 1061 0222 0123 0283 2154 0004                Combined
(0004) (0222 0123) (2150 2154) (1560 1061) (0283)      Grouped by 3rd digit
0004 0222 0123 2150 2154 1560 1061 0283                Combined
(0004 1061) (0123 2150 2154) (0222 0283) (1560)        Grouped by 2nd digit
0004 1061 0123 2150 2154 0222 0283 1560                Combined
(0004 0123 0222 0283) (1061 1560) (2150 2154)          Grouped by 1st digit
0004 0123 0222 0283 1061 1560 2150 2154                Combined (sorted)

O(n) sorting
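The grouping and combining steps above can be sketched in a few lines for non-negative decimal integers. The function name radixSort and the numDigits parameter are assumptions made for this illustration; the slides give no code for radix sort.

```cpp
#include <array>
#include <vector>

// Sorts non-negative integers that have at most numDigits decimal digits,
// processing the rightmost digit first, as in the trace above.
void radixSort(std::vector<int>& a, int numDigits) {
  long long divisor = 1;  // selects the current digit: 1, 10, 100, ...
  for (int d = 0; d < numDigits; ++d, divisor *= 10) {
    std::array<std::vector<int>, 10> groups;  // one group per digit 0..9
    // Grouping: append each item to the group of its current digit.
    // Appending keeps the previous order inside each group.
    for (int item : a) {
      groups[(item / divisor) % 10].push_back(item);
    }
    // Combining: concatenate the groups from digit 0 to digit 9.
    a.clear();
    for (const auto& group : groups) {
      a.insert(a.end(), group.begin(), group.end());
    }
  }
}
```

Calling radixSort with numDigits = 4 on the eight example integers performs the four group-and-combine rounds shown in the trace. Keeping the earlier order inside each group is what lets later rounds preserve the work of earlier ones.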

Page 33: CSS342: Sorting Algorithms

Radix Sort (Efficiency Analysis)

• Each grouping step requires n shuffles.
• The number of grouping and combining steps equals the number of digits.
  – In the previous example it is 4.
• Thus, for k-digit numbers, the performance is:
  – k * n = O(n), where k is independent of n
• Disadvantages:
  – Need to compare digits in the same position rather than whole items.
  – Need to accommodate 10 groups for numbers.
  – Need to accommodate 27 groups for strings (alphabet + blank).

O(n) sorting

Page 34: CSS342: Sorting Algorithms

A Comparison of Sorting Algorithms

                  Worst case   Average case
Selection sort    n^2          n^2
Bubble sort       n^2          n^2
Insertion sort    n^2          n^2
Shell sort        n^2          n^(3/2), n^(5/4) (depends on increment)
Mergesort         n log n      n log n
Quicksort         n^2          n log n
Radix sort        n            n
Treesort          n^2          n log n      (studied in CSS343)
Heapsort          n log n      n log n      (studied in CSS343)

Question: do we really need to always use mergesort or quicksort?

Page 35: CSS342: Sorting Algorithms

Lab Work

• Partial Quicksort
  – Find the top k items
  – Find the bottom k items
  – Find the median
• Key Idea:
  – After each partition, continue with only one side, either [first, pivot - 1] or [pivot + 1, last], whichever fits the requirement: top k, bottom k, or middle.
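The key idea can be sketched as a loop that partitions and then keeps only the side containing the wanted position. This is one possible shape for the lab solution, not the expected one: partitionAroundFirst and partialQuicksort are hypothetical names, and the pivot is simply a[first].

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Partitions a[first..last] around a[first], following the deck's
// partition loop, and returns the pivot's final index.
template <class Comparable>
int partitionAroundFirst(std::vector<Comparable>& a, int first, int last) {
  Comparable pivot = a[first];
  int lastS1 = first;
  for (int firstUnknown = first + 1; firstUnknown <= last; ++firstUnknown) {
    if (a[firstUnknown] < pivot) {
      ++lastS1;
      std::swap(a[firstUnknown], a[lastS1]);
    }
  }
  std::swap(a[first], a[lastS1]);
  return lastS1;
}

// Rearranges a so that a[k] holds the item a full sort would put at index k,
// with no larger item before it and no smaller item after it.
template <class Comparable>
void partialQuicksort(std::vector<Comparable>& a, int k) {
  assert(k >= 0 && k < static_cast<int>(a.size()));
  int first = 0;
  int last = static_cast<int>(a.size()) - 1;
  while (first < last) {
    int pivotIndex = partitionAroundFirst(a, first, last);
    if (k == pivotIndex) return;   // the pivot landed on position k
    if (k < pivotIndex) {
      last = pivotIndex - 1;       // keep only the left side
    } else {
      first = pivotIndex + 1;      // keep only the right side
    }
  }
}
```

After partialQuicksort(a, k), the bottom k items are a[0..k-1] in some order, and a[k] is the median when k is a.size( ) / 2. The top k items can be found the same way by asking for position a.size( ) - k.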

Page 36: CSS342: Sorting Algorithms

Programming Assignment

• In-Place Sorting
  – Sort data items only in the original array. Example: Quick Sort
  – Impractical for Merge Sort
• Non-Recursive, Semi-In-Place Merge Sort
  – Using a loop rather than recursion.
  – Using only one additional temporary array.
  – Moving data from the original to the temporary array or vice versa at each stage.

orig:   8 | 5 | 4 | 1 | 7 | 2 | 6 | 3
temp:   5 8 | 1 4 | 2 7 | 3 6
orig:   1 4 5 8 | 2 3 6 7
temp:   1 2 3 4 5 6 7 8