
Upload: doandat

Post on 03-Apr-2019


Data Structure

Chapter 7

Dr. Patrick Chan

School of Computer Science and Engineering

South China University of Technology

Internal Sorting

Lec 7: Internal Sorting

Outline

• Introduction (Ch 7.1)
• Θ(n²) Sorting (Ch 7.2)
• Θ(n log n) Sorting (Ch 7.3 – 7.7)
• Empirical Comparison (Ch 7.8)

Sorting

• Arrange the records in ascending or descending order according to their key values
• Definition
  Given a sequence of records $R_1, R_2, \ldots, R_n$ with key values $k_1, k_2, \ldots, k_n$, respectively, arrange the records into an order $s$ such that the records $R_{s_1}, R_{s_2}, \ldots, R_{s_n}$ have keys obeying the property $k_{s_1} \le k_{s_2} \le \ldots \le k_{s_n}$

Sorting

• Each record contains a field called the key
• The keys have a linear order, so any two keys can be compared
• Measures of cost:
  • Comparisons
  • Swaps

Sorting

• The following sorting methods will be introduced:
  1. Bubble Sort
  2. Selection Sort
  3. Insertion Sort
  4. Quick Sort
  5. Merge Sort
  6. Heap Sort
  7. Bin Sort

Bubble Sort

• A very well-known method
• Smaller values “bubble” up to the top of the list

Bubble Sort

[Figure: Bubble Sort on the list 42 20 17 13 28 14 23 15, one column per pass of the outer loop (i=0 to i=6)]

Bubble Sort

• Inner Loop
  • Bubble the smallest remaining value up to the top (position i)
• Outer Loop
  • Repeat the bubble-up process for the whole list

template <class Elem, class Comp>
void bubsort(Elem A[], int n) {
  for (int i=0; i<n-1; i++)          // Outer loop: one pass per position i
    for (int j=n-1; j>i; j--)        // Inner loop: bubble up the smallest value
      if (Comp::lt(A[j], A[j-1]))
        swap(A, j, j-1);
}

List after each pass of the outer loop:

42 20 17 13 28 14 23 15   (initial)
13 42 20 17 14 28 15 23   (i=0)
13 14 42 20 17 15 28 23   (i=1)
13 14 15 42 20 17 23 28   (i=2)
13 14 15 17 42 20 23 28   (i=3)
13 14 15 17 20 42 23 28   (i=4)
13 14 15 17 20 23 42 28   (i=5)
13 14 15 17 20 23 28 42   (i=6)
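The bubsort listing calls Comp::lt and a three-argument swap that these slides do not define. A minimal sketch of what such helpers might look like is given here so that the listing can be compiled and tried. The names IntLess and issorted and the helper bodies are illustrative assumptions, not taken from the slides.

```cpp
#include <cassert>

// Hypothetical comparator class: the slides only show that Comp offers a
// static lt(); gt() is included because the Quick Sort partition uses it.
struct IntLess {
  static bool lt(int a, int b) { return a < b; }
  static bool gt(int a, int b) { return a > b; }
};

// Assumed helper: exchange A[i] and A[j].
template <class Elem>
void swap(Elem A[], int i, int j) {
  Elem tmp = A[i];
  A[i] = A[j];
  A[j] = tmp;
}

// Bubble Sort with the same loops as the slide listing.
template <class Elem, class Comp>
void bubsort(Elem A[], int n) {
  assert(n >= 0);
  for (int i = 0; i < n - 1; i++)      // one pass per position i
    for (int j = n - 1; j > i; j--)    // bubble the smallest value up to i
      if (Comp::lt(A[j], A[j - 1]))
        swap(A, j, j - 1);
}

// Returns true when A[0..n-1] is in non-decreasing order under Comp.
template <class Elem, class Comp>
bool issorted(const Elem A[], int n) {
  for (int k = 1; k < n; k++)
    if (Comp::lt(A[k], A[k - 1])) return false;
  return true;
}
```

Calling bubsort<int, IntLess>(A, 8) on the array 42 20 17 13 28 14 23 15 leaves it as 13 14 15 17 20 23 28 42.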

Bubble Sort

• The number of comparisons made by the inner for loop in pass i is always n-1-i, whatever the input, so the total is

$$\sum_{i=0}^{n-2} (n-1-i) = \frac{n(n-1)}{2} = \Theta(n^2)$$

                Swap        Comparison
Best Case       0           n(n-1)/2
Worst Case      n(n-1)/2    n(n-1)/2
Average Case    n(n-1)/4    n(n-1)/2
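The counts in the table can be checked empirically. The following is a small sketch, assuming integer keys in a std::vector; the names SortCost and bubbleSortCounted are hypothetical and only serve this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Counters for the two cost measures used in the table.
struct SortCost {
  long comparisons = 0;
  long swaps = 0;
};

// Bubble Sort with the same loop structure as bubsort, instrumented to
// count key comparisons and swaps.
SortCost bubbleSortCounted(std::vector<int>& a) {
  SortCost cost;
  const int n = static_cast<int>(a.size());
  for (int i = 0; i < n - 1; i++) {
    for (int j = n - 1; j > i; j--) {
      cost.comparisons++;              // one comparison per inner step
      if (a[j] < a[j - 1]) {
        std::swap(a[j], a[j - 1]);
        cost.swaps++;
      }
    }
  }
  assert(std::is_sorted(a.begin(), a.end()));
  return cost;
}
```

For n = 8 the comparison count is 28 = 8·7/2 for every input. An already sorted list needs 0 swaps, a reversed list needs 28, and the example list 42 20 17 13 28 14 23 15 needs 18 (one swap per inversion).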

Selection Sort

• Recall Bubble Sort:
  • Objective: find the correct value for one position in each pass
  • Some of its swaps are not necessary
• Selection Sort is a revised version of Bubble Sort

Bubble Sort trace:

42 20 17 13 28 14 23 15
13 42 20 17 14 28 15 23
13 14 42 20 17 15 28 23
13 14 15 42 20 17 23 28
13 14 15 17 42 20 23 28
13 14 15 17 20 42 23 28
13 14 15 17 20 23 42 28
13 14 15 17 20 23 28 42

Selection Sort

[Figure: Selection Sort on the list 42 20 17 13 28 14 23 15, one column per pass of the outer loop (i=0 to i=6), shown beside the Bubble Sort trace of the same list for comparison]

Selection Sort

template <class Elem, class Comp>
void selsort(Elem A[], int n) {
  for (int i=0; i<n-1; i++) {          // Outer loop: select i'th record
    int lowindex = i;                  // Remember its index
    for (int j=n-1; j>i; j--)          // Inner loop: find least
      if (Comp::lt(A[j], A[lowindex]))
        lowindex = j;
    if (lowindex != i)                 // Index checking
      swap(A, i, lowindex);            // Put it in place
  }
}

List after each pass of the outer loop:

42 20 17 13 28 14 23 15   (initial)
13 20 17 42 28 14 23 15   (i=0)
13 14 17 42 28 20 23 15   (i=1)
13 14 15 42 28 20 23 17   (i=2)
13 14 15 17 28 20 23 42   (i=3)
13 14 15 17 20 28 23 42   (i=4)
13 14 15 17 20 23 28 42   (i=5)
13 14 15 17 20 23 28 42   (i=6)

• Inner Loop
  • Find the correct value for the i-th position
• Outer Loop
  • Repeat the process for the whole list

Selection Sort

                Swap                    Comparison
Best Case       0                       (n-1) + n(n-1)/2
Worst Case      n-1                     (n-1) + n(n-1)/2
Average Case    Θ(n), about n - ln n    (n-1) + n(n-1)/2

In the comparison count, the first term (n-1) is the index checking in the outer loop and n(n-1)/2 is the search for the smallest value.

• The n-1 term in the comparison count can be removed by deleting the index checking in the outer loop
• If the checking is removed, a swap is made in every pass, so the number of swaps increases to n-1 in every case
• Trade-off: cost of swapping vs. cost of comparison
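The swap counts can be observed with a small instrumented version. This is a sketch under the assumption of integer keys in a std::vector; the function name selectionSortSwaps is hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Selection Sort with the index check; returns the number of swaps performed.
int selectionSortSwaps(std::vector<int>& a) {
  const int n = static_cast<int>(a.size());
  int swaps = 0;
  for (int i = 0; i < n - 1; i++) {
    int lowindex = i;                       // index of the smallest value so far
    for (int j = n - 1; j > i; j--)
      if (a[j] < a[lowindex]) lowindex = j;
    if (lowindex != i) {                    // index checking
      std::swap(a[i], a[lowindex]);
      swaps++;
    }
  }
  assert(swaps <= (n > 0 ? n - 1 : 0));     // never more than n-1 swaps
  return swaps;
}
```

On the example list 42 20 17 13 28 14 23 15 this performs 6 swaps (the last pass finds 28 already in place), and on an already sorted list it performs none.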

Insertion Sort

• Insert each value into its correct position in an already sorted list

Insertion Sort

[Figure: Insertion Sort on the list 42 20 17 13 28 14 23 15, one column per pass of the outer loop (i=1 to i=7)]

Insertion Sort

template <class Elem, class Comp>
void inssort(Elem A[], int n) {
  for (int i=1; i<n; i++)              // Outer loop: insert i'th record
    for (int j=i;                      // Inner loop: move it into position
         (j>0) && (Comp::lt(A[j], A[j-1]));
         j--)
      swap(A, j, j-1);
}

List after each pass of the outer loop:

42 20 17 13 28 14 23 15   (initial)
20 42 17 13 28 14 23 15   (i=1)
17 20 42 13 28 14 23 15   (i=2)
13 17 20 42 28 14 23 15   (i=3)
13 17 20 28 42 14 23 15   (i=4)
13 14 17 20 28 42 23 15   (i=5)
13 14 17 20 23 28 42 15   (i=6)
13 14 15 17 20 23 28 42   (i=7)

• Inner Loop
  • Find the correct position for the new value in the sorted list
• Outer Loop
  • Repeat the process for the whole list by adding one value each time

Insertion Sort

• The number of comparisons made by the inner for loop in pass i is always smaller than or equal to i, so in the worst case the total is

$$\sum_{i=1}^{n-1} i = \frac{n(n-1)}{2} = \Theta(n^2)$$

                Swap        Comparison
Best Case       0           n-1
Worst Case      n(n-1)/2    n(n-1)/2
Average Case    n(n-1)/4    ≈ n(n-1)/4
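The best and worst comparison counts can be reproduced with an instrumented version. This is a sketch assuming integer keys in a std::vector; the function name insertionSortComparisons is hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Insertion Sort with the same loop structure as inssort; returns the
// number of key comparisons.
long insertionSortComparisons(std::vector<int>& a) {
  const int n = static_cast<int>(a.size());
  long comparisons = 0;
  for (int i = 1; i < n; i++) {
    for (int j = i; j > 0; j--) {
      comparisons++;                        // compare a[j] with a[j-1]
      if (!(a[j] < a[j - 1])) break;        // already in position: stop early
      std::swap(a[j], a[j - 1]);
    }
  }
  assert(comparisons <= static_cast<long>(n) * (n - 1) / 2);
  return comparisons;
}
```

For n = 8, a sorted list costs 7 = n-1 comparisons and a reversed list costs 28 = n(n-1)/2.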

☺ Small Exercise! ☺

• Use
  • Bubble Sort
  • Selection Sort
  • Insertion Sort
  to sort this list:

59 20 17 13 28 14 23 83

☺ Small Exercise! ☺

• Bubble Sort

59 20 17 13 28 14 23 83
13 59 20 17 14 28 23 83
13 14 59 20 17 23 28 83
13 14 17 59 20 23 28 83
13 14 17 20 59 23 28 83
13 14 17 20 23 59 28 83
13 14 17 20 23 28 59 83
13 14 17 20 23 28 59 83

☺ Small Exercise! ☺

• Selection Sort

59 20 17 13 28 14 23 83
13 20 17 59 28 14 23 83
13 14 17 59 28 20 23 83
13 14 17 59 28 20 23 83
13 14 17 20 28 59 23 83
13 14 17 20 23 59 28 83
13 14 17 20 23 28 59 83
13 14 17 20 23 28 59 83

☺ Small Exercise! ☺

• Insertion Sort

59 20 17 13 28 14 23 83
20 59 17 13 28 14 23 83
17 20 59 13 28 14 23 83
13 17 20 59 28 14 23 83
13 17 20 28 59 14 23 83
13 14 17 20 28 59 23 83
13 14 17 20 23 28 59 83
13 14 17 20 23 28 59 83

Comparison

                  Insertion   Bubble    Selection
Swaps
  Best Case       0           0         0
  Average Case    Θ(n²)       Θ(n²)     Θ(n)
  Worst Case      Θ(n²)       Θ(n²)     Θ(n)
Comparisons
  Best Case       Θ(n)        Θ(n²)     Θ(n²)
  Average Case    Θ(n²)       Θ(n²)     Θ(n²)
  Worst Case      Θ(n²)       Θ(n²)     Θ(n²)

Comparison

[Figure: visualizations of Insertion Sort and Selection Sort sorting a 512-colour Hilbert-order swatch]

Reference: http://corte.si/posts/code/sortvis-fruitsalad/

Comparison

• Bubble, Selection and Insertion Sorts are called exchange sorts
  • Exchange means swapping adjacent records
• The running time of an exchange sort must be Θ(n²) in the average and worst cases
  • Each of them uses a nested for loop

Quick Sort

• Fastest known general-purpose in-memory sorting algorithm in the average case
• A divide and conquer method
• Implements the idea of a Binary Search Tree in an efficient way

[Figure: a binary search tree containing the values 37, 24, 32, 7, 2, 42, 42, 120, 40]

Quick Sort

[Figure: Quick Sort on 72 6 57 88 60 42 83 73 48 85. The first partition around the pivot 60 gives 48 6 57 42 | 60 | 88 83 73 72 85; the sublists are partitioned recursively until the list is sorted: 6 42 48 57 60 72 73 83 85 88]

Quick Sort

[Figure: each step of the first partition, with the indices L and R moving toward each other from the two ends of the list]

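Quick Sort

Steps of the first partition (pivot 60; L moves right, R moves left):

72 6 57 88 60 42 83 73 48 85   (initial list)
72 6 57 88 85 42 83 73 48 60   (swapping: pivot 60 moved to the end)
72 6 57 88 85 42 83 73 48 60   (L stops at 72, R stops at 48)
48 6 57 88 85 42 83 73 72 60   (swapping 72 and 48)
48 6 57 88 85 42 83 73 72 60   (L stops at 88, R stops at 42)
48 6 57 42 85 88 83 73 72 60   (swapping 88 and 42)
48 6 57 42 85 88 83 73 72 60   (L and R cross)
48 6 57 42 60 88 83 73 72 85   (swapping: pivot 60 put in place)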

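Quick Sort

The same partition as performed by the code, which swaps once more after L and R cross and then reverses that swap:

72 6 57 88 60 42 83 73 48 85   (initial list)
72 6 57 88 85 42 83 73 48 60   (swapping: pivot 60 moved to the end)
72 6 57 88 85 42 83 73 48 60   (L stops at 72, R stops at 48)
48 6 57 88 85 42 83 73 72 60   (swapping 72 and 48)
48 6 57 88 85 42 83 73 72 60   (L stops at 88, R stops at 42)
48 6 57 42 85 88 83 73 72 60   (swapping 88 and 42)
48 6 57 42 85 88 83 73 72 60   (L stops at 85, R stops at 42: they have crossed)
48 6 57 85 42 88 83 73 72 60   (swapping 85 and 42, because the swap comes before the loop test)
48 6 57 42 85 88 83 73 72 60   (swapping back: the last swap is reversed)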

Quick Sort

template <class Elem, class Comp>
void qsort(Elem A[], int i, int j) {     // Quicksort
  if (j <= i) return;                    // Don't sort 0 or 1 Elem
  int pivotindex = findpivot(A, i, j);
  swap(A, pivotindex, j);                // Put pivot at end
  // k will be the first position in the right subarray
  int k = partition<Elem,Comp>(A, i-1, j, A[j]);
  swap(A, k, j);                         // Put pivot in place
  qsort<Elem,Comp>(A, i, k-1);
  qsort<Elem,Comp>(A, k+1, j);
}

template <class Elem>
int findpivot(Elem A[], int i, int j)
  { return (i+j)/2; }

Example: for 72 6 57 88 60 42 83 73 48 85 with i = 0 and j = 9, findpivot returns index 4 (value 60), and the pivot 60 is swapped with 85 at the end of the list.

Quick Sort

template <class Elem, class Comp>
int partition(Elem A[], int l, int r, Elem& pivot) {
  do {                                   // Move the bounds in until they meet
    while (Comp::lt(A[++l], pivot));
    while ((r != 0) && Comp::gt(A[--r], pivot));
    swap(A, l, r);                       // Swap out-of-place values
  } while (l < r);                       // Stop when they cross
  swap(A, l, r);                         // Reverse last swap
  return l;                              // Return first position in the right subarray
}

Example: the call int k = partition<Elem,Comp>(A, i-1, j, A[j]); with i = 0 and j = 9 on 72 6 57 88 85 42 83 73 48 60 starts with l = -1, r = 9 and pivot = 60. It swaps 72 with 48 and 88 with 42, and returns l = 4.

Quick Sort

Back in qsort with i = 0 and j = 9, partition has returned k = 4 and the list is 48 6 57 42 85 88 83 73 72 60. The call swap(A, k, j) then exchanges 85 and 60, which puts the pivot in place: 48 6 57 42 60 88 83 73 72 85. The sublists A[0..3] and A[5..9] are then sorted by the two recursive calls to qsort.

Quick Sort

• The running time is:
  • Best Case: Θ(n log n)
  • Worst Case: Θ(n²)
  • Average Case: Θ(n log n)
• In the worst case, Quick Sort is no better than Bubble Sort
• Optimizations:
  • Better pivot
  • Better algorithm for small sublists
  • Eliminate recursion
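Two of the optimizations listed for Quick Sort, a better pivot and a different algorithm for small sublists, can be sketched as follows. This is only an illustration under several assumptions: integer keys in a std::vector, a median-of-three pivot, an arbitrary cutoff of 8 elements (kSmallSublist), and a simpler single-index partition in place of the two-index partition from the slides. All names here are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Illustrative threshold; a good value depends on the machine and the data.
const int kSmallSublist = 8;

// Insertion Sort on a[lo..hi], used for small sublists.
void insertionSortRange(std::vector<int>& a, int lo, int hi) {
  for (int i = lo + 1; i <= hi; i++)
    for (int j = i; j > lo && a[j] < a[j - 1]; j--)
      std::swap(a[j], a[j - 1]);
}

// "Better pivot": index of the median of a[lo], a[mid] and a[hi].
int medianOfThree(const std::vector<int>& a, int lo, int hi) {
  const int mid = lo + (hi - lo) / 2;
  if (a[lo] < a[mid]) {
    if (a[mid] < a[hi]) return mid;
    return (a[lo] < a[hi]) ? hi : lo;
  }
  if (a[lo] < a[hi]) return lo;
  return (a[mid] < a[hi]) ? hi : mid;
}

void quickSort(std::vector<int>& a, int lo, int hi) {
  assert(hi < static_cast<int>(a.size()));
  if (hi - lo + 1 <= kSmallSublist) {        // small sublist: switch algorithm
    insertionSortRange(a, lo, hi);
    return;
  }
  std::swap(a[medianOfThree(a, lo, hi)], a[hi]);  // put pivot at end
  const int pivot = a[hi];
  int k = lo;                                // a[lo..k-1] holds values < pivot
  for (int m = lo; m < hi; m++)
    if (a[m] < pivot) std::swap(a[m], a[k++]);
  std::swap(a[k], a[hi]);                    // put pivot in place
  quickSort(a, lo, k - 1);
  quickSort(a, k + 1, hi);
}

void quickSort(std::vector<int>& a) {
  quickSort(a, 0, static_cast<int>(a.size()) - 1);
}
```

The median-of-three choice makes the Θ(n²) behaviour on already sorted input unlikely, although it does not remove the worst case. The third optimization, eliminating recursion, is not shown.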

☺ Small Exercise! ☺

• Use Quick Sort to sort this list

59 20 17 13 28 14 23 83

☺ Small Exercise! ☺

59 20 17 13 28 14 23 83             (initial list)
13 | 20 17 83 28 14 23 59           (pivot 13)
     20 17 23 14 | 28 | 83 59       (pivot 28)
     14 | 17 | 23 20      59 | 83   (pivots 17 and 83)
               20 | 23              (pivot 23)

Merge Sort

• A divide and conquer method
• Simple to understand but difficult to implement

[Figure: two sorted sublists, 15 24 31 60 and 7 8 22 50, being merged into one sorted list]

Merge Sort

36 20 17 13 28 14 23 15                 (initial list)
36 | 20 | 17 | 13 | 28 | 14 | 23 | 15   (split into single values)
20 36 | 13 17 | 14 28 | 15 23           (merge pairs)
13 17 20 36 | 14 15 23 28               (merge sublists of two)
13 14 15 17 20 23 28 36                 (merge sublists of four)

Merge Sort

• Running Time:
  The Best, Average and Worst Cases: Θ(n log n)
• Requires twice the space
  • The data are stored temporarily while two sublists are merged
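The role of the extra space can be seen in a minimal top-down sketch, assuming integer keys in a std::vector. The names mergeSort, mergeSortRange and temp are illustrative and do not come from the slides.

```cpp
#include <cassert>
#include <vector>

// Sort a[left..right], using temp (same size as a) as scratch space.
void mergeSortRange(std::vector<int>& a, std::vector<int>& temp,
                    int left, int right) {
  assert(temp.size() == a.size());
  if (left >= right) return;                 // 0 or 1 value: already sorted
  const int mid = left + (right - left) / 2;
  mergeSortRange(a, temp, left, mid);        // sort the left half
  mergeSortRange(a, temp, mid + 1, right);   // sort the right half
  for (int i = left; i <= right; i++) temp[i] = a[i];  // copy both sublists
  int i1 = left, i2 = mid + 1;
  for (int cur = left; cur <= right; cur++) {
    if (i1 > mid)                 a[cur] = temp[i2++];  // left exhausted
    else if (i2 > right)          a[cur] = temp[i1++];  // right exhausted
    else if (temp[i2] < temp[i1]) a[cur] = temp[i2++];  // take smaller value
    else                          a[cur] = temp[i1++];  // ties keep left first
  }
}

void mergeSort(std::vector<int>& a) {
  std::vector<int> temp(a.size());           // the second array of n values
  mergeSortRange(a, temp, 0, static_cast<int>(a.size()) - 1);
}
```

The single temp array allocated in mergeSort is the "twice the space" cost: every merge copies its two sublists there before writing the merged result back.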

☺ Small Exercise! ☺

• Use Merge Sort to sort this list

59 20 17 13 28 14 23 83

☺ Small Exercise! ☺

59 20 17 13 28 14 23 83
20 59 | 13 17 | 14 28 | 23 83
13 17 20 59 | 14 23 28 83
13 14 17 20 23 28 59 83

Heap Sort

• Quick Sort may run into its worst case
  • This corresponds to an unbalanced BST
• A heap avoids this problem
• You learnt about the Heap in Ch 6

Heap Sort

[Figure: buildheap() on the list 72 6 57 88 60 42 83 73 48 85, calling siftdown(4), siftdown(3), siftdown(2), siftdown(1) and siftdown(0) in turn]

• Remove (get) the MAX (root) until the heap is empty

Heap Sort

[Figure: the heap after each removal of the root; the values are removed in the order 88 85 83 73 72 60 57 48 42 6]

Heap Sort

• The time to build the heap is Θ(n)
• The time to remove the n elements is Θ(n log n)
• So, if we only remove k elements, the total cost is Θ(n + k log n)
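The Θ(n + k log n) cost for removing only k elements can be sketched with the standard library heap functions std::make_heap and std::pop_heap. The function name largestK is an assumption made for this example; the course's own heap class from Ch 6 is not used here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the k largest values in descending order.
// Building the heap costs Θ(n); each of the k removals costs O(log n).
std::vector<int> largestK(std::vector<int> values, std::size_t k) {
  assert(k <= values.size());
  std::make_heap(values.begin(), values.end());   // build a max-heap
  std::vector<int> result;
  auto heapEnd = values.end();
  for (std::size_t removed = 0; removed < k; removed++) {
    std::pop_heap(values.begin(), heapEnd);  // moves the max to heapEnd - 1
    --heapEnd;                               // shrink the heap by one
    result.push_back(*heapEnd);
  }
  return result;
}
```

For the list 72 6 57 88 60 42 83 73 48 85, largestK with k = 3 returns 88 85 83, and k = 10 returns the fully sorted descending sequence 88 85 83 73 72 60 57 48 42 6.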

☺ Small Exercise! ☺

• Use Heap Sort to sort this list

59 20 17 13 28 14 23 83

☺ Small Exercise! ☺

[Figure: the heap for the list 59 20 17 13 28 14 23 83 being built step by step, followed by the heap after each removal of the root until it is empty]

Bin Sort

• Perhaps the simplest sorting method

Bin Sort

Input:  17 13 20 7 2 17 12
Bins:   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
        (each value goes into the bin indexed by its key; bin 17 receives two values)
Output: 2 7 12 13 17 17 20

Bin Sort

template <class Elem>
void binsort(Elem A[], int n) {
  List<Elem> B[MaxKeyValue];             // One bin (list) per key value
  Elem item;
  for (int i=0; i<n; i++)                // Put each record into its bin
    B[A[i]].append(A[i]);
  for (int i=0; i<MaxKeyValue; i++)      // Output the bins in key order
    for (B[i].setStart(); B[i].getValue(item); B[i].next())
      output(item);
}

Bin Sort

• Running Time: Θ(n + MaxKeyValue)
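The two terms of the running time correspond to the two loops of the algorithm. The following self-contained sketch assumes integer keys in the range 0 to maxKeyValue-1 and uses std::vector for the bins in place of the List class from the course; the function name binSort is illustrative.

```cpp
#include <cassert>
#include <vector>

// Bin Sort for integer keys in the range [0, maxKeyValue).
std::vector<int> binSort(const std::vector<int>& a, int maxKeyValue) {
  std::vector<std::vector<int>> bins(maxKeyValue);   // one bin per key value
  for (int key : a) {                                // Θ(n): distribute
    assert(key >= 0 && key < maxKeyValue);           // key must index a bin
    bins[key].push_back(key);
  }
  std::vector<int> out;
  out.reserve(a.size());
  for (const auto& bin : bins)                       // Θ(maxKeyValue): collect
    for (int key : bin) out.push_back(key);
  return out;
}
```

Calling binSort on 17 13 20 7 2 17 12 with maxKeyValue = 21 returns 2 7 12 13 17 17 20. Even for these 7 values, all 21 bins are created and visited.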

☺ Small Exercise! ☺

• Use Bin Sort to sort this list

59 20 17 13 28 14 23 83

☺ Small Exercise! ☺

Input: 59 20 17 13 28 14 23 83
Bins:  1 … 13 14 15 16 17 18 19 20 … 23 … 28 … 59 … 83
       (bins 13, 14, 17, 20, 23, 28, 59 and 83 each hold one value; all other bins are empty)

Radix Sort

• The disadvantage of Bin Sort is that a lot of space is wasted if the maximum key value is very large
• Radix Sort keeps the number of bins and the related processing small
• It is a revised version of Bin Sort

Radix Sort

27 91 1 97 17 23 84 28 72 5 67 25    (initial list)
91 1 72 23 84 5 25 27 97 17 67 28    (after sorting on the right digit)
1 5 17 23 25 27 28 67 72 84 91 97    (after sorting on the left digit)

Radix Sort

Phase 1: Right Digit

Input: 27 91 1 97 17 23 84 28 72 5 67 25

Bin 0:
Bin 1: 91 1
Bin 2: 72
Bin 3: 23
Bin 4: 84
Bin 5: 5 25
Bin 6:
Bin 7: 27 97 17 67
Bin 8: 28
Bin 9:

Result: 91 1 72 23 84 5 25 27 97 17 67 28

Phase 2: Left Digit

Bin 0: 1 5
Bin 1: 17
Bin 2: 23 25 27 28
Bin 3:
Bin 4:
Bin 5:
Bin 6: 67
Bin 7: 72
Bin 8: 84
Bin 9: 91 97

Result: 1 5 17 23 25 27 28 67 72 84 91 97

Radix Sort

• Can be improved by increasing the base r
  • e.g. r = 2^i
  • Each time the number of bits is doubled, the number of passes is cut in half
• For example:
  • r1 = 2^8 = 256 can process 1 byte at a time
  • r2 = 2^16 = 64K can process 2 bytes at a time
  • For 32-bit (4-byte) data:
    • r1: 4 passes
    • r2: 2 passes

Radix Sort

• Cost: Θ(k (n + r))
  • n: number of values in the array
  • k: maximum number of digits
  • r: size of the base (e.g. 2, 10)
• If the number of digits k is small (and the base r is fixed), then this can be Θ(n)
  • With k = 1, one pass can complete the sorting
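The cost Θ(k (n + r)) can be read off a short array-based sketch: k passes, each doing Θ(n) work on the values and Θ(r) work on the bin counters. The sketch assumes non-negative integer keys with at most k digits in base r and uses counting in place of explicit bin lists; the function name radixSort is illustrative.

```cpp
#include <cassert>
#include <vector>

// Radix Sort for non-negative integers with at most k digits in base r.
// rtok holds r^pass, so (v / rtok) % r is the digit used in this pass.
void radixSort(std::vector<int>& a, int k, int r) {
  assert(r >= 2 && k >= 0);
  std::vector<int> temp(a.size());
  long long rtok = 1;
  for (int pass = 0; pass < k; pass++, rtok *= r) {
    std::vector<int> count(r + 1, 0);
    for (int v : a) count[(v / rtok) % r + 1]++;           // size of each bin
    for (int d = 0; d < r; d++) count[d + 1] += count[d];  // start of each bin
    for (int v : a) temp[count[(v / rtok) % r]++] = v;     // stable placement
    a.swap(temp);                                          // result is now in a
  }
}
```

With r = 10 and k = 2, the list 27 91 1 97 17 23 84 28 72 5 67 25 becomes 91 1 72 23 84 5 25 27 97 17 67 28 after the first pass and 1 5 17 23 25 27 28 67 72 84 91 97 after the second. Each pass must be stable, otherwise the order established by earlier digits would be lost.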

☺ Small Exercise! ☺

• Use Radix Sort to sort this list

59 20 17 13 28 14 23 83

☺ Small Exercise! ☺

A:                   59 20 17 13 28 14 23 83
rtok = 1,  A:        20 13 23 83 14 17 28 59
rtok = 10, A:        13 14 17 20 23 28 59 83   (Result)