CSC 2300 Data Structures & Algorithms March 20, 2007 Chapter 7. Sorting

Page 1

CSC 2300 Data Structures & Algorithms

March 20, 2007

Chapter 7. Sorting

Page 2

Today – Sorting

Heapsort – Algorithm, Analysis, Lower bound

Mergesort – Algorithm, Worst case

Quicksort – Algorithm

Page 3

Heapsort – Algorithm

What time bound can we get if we use a heap? We want to sort N integers. We build a binary heap of N elements. How much time does buildHeap take? We then perform N deleteMin operations. The elements leave the heap smallest first, in sorted order. By recording these elements in a second array and then copying them back, we sort N elements. How much time does each deleteMin take? What is the total running time? What is a problem with Heapsort?
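The steps on this slide can be sketched in Python with the standard `heapq` module, which implements a binary min heap. The function name `heapsort_two_arrays` is hypothetical, and this is only an illustration of the extra-array approach, not the course's code.

```python
import heapq


def heapsort_two_arrays(a):
    """Sort list a using a min heap plus a second array (illustrative sketch)."""
    heapq.heapify(a)  # buildHeap: rearranges a into a min heap in linear time
    # N deleteMin operations; each one takes O(log N) time.
    # range(len(a)) is evaluated once, before any element is popped.
    second = [heapq.heappop(a) for _ in range(len(a))]
    a[:] = second  # copy the second array back: O(N) extra work
    return a
```

The second list is the doubled memory requirement that the next slide asks about.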

Page 4

Heapsort – Space

The algorithm uses an extra array. Thus, the memory requirement is doubled. What about the work to copy the second array back to the first? How do we solve the problem of space?

Page 5

Heapsort – Saving Space

After each deleteMin, the heap shrinks by 1. Thus, the cell that was last in the heap can be used to store the element that was just deleted. If we use this strategy, in what sorted order will the array contain the elements? We usually want the elements in increasing sorted order. What should we do?

Page 6

Max Heap

Use a max heap. Build the heap in linear time. Then perform N – 1 deleteMax operations.

Page 7

Heapsort

Note: the array for heapsort contains data in position 0, unlike before, where the index starts at 1.
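The routine shown on this slide did not survive in the transcript. As a stand-in, here is a minimal Python sketch of in-place heapsort with a max heap stored from position 0, so the children of index i are at 2i + 1 and 2i + 2. The names `perc_down` and `heapsort` are assumptions for illustration, not necessarily those used in class.

```python
def perc_down(a, i, n):
    """Sift a[i] down within the max heap occupying a[0:n]."""
    tmp = a[i]
    while 2 * i + 1 < n:  # while position i has a left child
        child = 2 * i + 1
        if child + 1 < n and a[child + 1] > a[child]:
            child += 1  # pick the larger of the two children
        if a[child] > tmp:
            a[i] = a[child]  # move the larger child up
            i = child
        else:
            break
    a[i] = tmp


def heapsort(a):
    """Sort list a in increasing order, in place, with no second array."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # buildHeap in linear time
        perc_down(a, i, n)
    for end in range(n - 1, 0, -1):  # N - 1 deleteMax operations
        a[0], a[end] = a[end], a[0]  # store the max in the cell the heap frees
        perc_down(a, 0, end)
    return a
```

Each deleteMax places the largest remaining element in the cell just vacated at the end of the heap, so the array ends up in increasing order.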

Page 8

Heapsort – Performance

Experiments show that the performance of heapsort is extremely consistent.

On average, heapsort uses only slightly fewer comparisons than suggested by the worst-case bound.

Why? It appears that successive deleteMax operations destroy the randomness of the heap. Can you suggest another O(N log N) sorting scheme? What do you think about when you see O(N log N)?

Page 9

Mergesort

A recursive algorithm! O(N log N) worst-case running time. We can show that the number of comparisons is nearly optimal. The fundamental operation is to merge two sorted lists. Since the lists are sorted, the merging can be done in one pass through the input, if the output is put in a third list.

Page 10

Mergesort – Example

We will go through an example in class. Try this:

A: 1, 13, 24, 26, 35, 44, 52, 68

B: 2, 15, 27, 28, 47, 66, 72, 75

C:

Merging two sorted lists with a total of N elements takes at most N – 1 comparisons. Why only N – 1? Every comparison adds an element to C, except the last comparison, which adds at least two elements.
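To make the comparison count concrete, here is a small Python sketch that merges two sorted lists into a third list and counts the comparisons. The name `merge_count` is hypothetical and the code is only an illustration of the one-pass merge described above.

```python
def merge_count(a, b):
    """Merge sorted lists a and b into a new list c; also count comparisons."""
    c = []
    comparisons = 0
    i = j = 0
    while i < len(a) and j < len(b):
        comparisons += 1  # one comparison puts exactly one element into c
        if a[i] <= b[j]:
            c.append(a[i])
            i += 1
        else:
            c.append(b[j])
            j += 1
    # One list is exhausted; the rest of the other is copied without comparing.
    c.extend(a[i:])
    c.extend(b[j:])
    return c, comparisons


A = [1, 13, 24, 26, 35, 44, 52, 68]
B = [2, 15, 27, 28, 47, 66, 72, 75]
C, used = merge_count(A, B)
```

For the lists A and B above, this sketch uses 14 comparisons, within the bound N – 1 = 15.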

Page 11

Mergesort – Routines

No base case in recursive routine?

Page 12

Merge – Routine
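The code for the two routine slides is missing from the transcript. The following Python sketch is one plausible shape for them, assuming an array-based mergesort with a single temporary array; all names here are illustrative. It also suggests an answer to the base-case question: the test `left < right` acts as the base case, because a subarray of size 0 or 1 simply returns.

```python
def merge_sort(a):
    """Driver: allocate one temporary array and sort a in place."""
    tmp = [None] * len(a)
    _merge_sort(a, tmp, 0, len(a) - 1)
    return a


def _merge_sort(a, tmp, left, right):
    if left < right:  # implicit base case: nothing to do for size 0 or 1
        center = (left + right) // 2
        _merge_sort(a, tmp, left, center)
        _merge_sort(a, tmp, center + 1, right)
        _merge(a, tmp, left, center + 1, right)


def _merge(a, tmp, left_pos, right_pos, right_end):
    """Merge sorted runs a[left_pos:right_pos] and a[right_pos:right_end+1]."""
    left_end = right_pos - 1
    i, j, k = left_pos, right_pos, left_pos
    while i <= left_end and j <= right_end:
        if a[i] <= a[j]:
            tmp[k] = a[i]
            i += 1
        else:
            tmp[k] = a[j]
            j += 1
        k += 1
    while i <= left_end:  # copy the rest of the left run
        tmp[k] = a[i]
        i += 1
        k += 1
    while j <= right_end:  # copy the rest of the right run
        tmp[k] = a[j]
        j += 1
        k += 1
    a[left_pos:right_end + 1] = tmp[left_pos:right_end + 1]  # copy back
```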

Page 13

Mergesort – Analysis

Assume that N is a power of 2. For N = 1, the time to mergesort is constant, which we denote by 1. Otherwise, the time to mergesort N numbers is equal to the time to do two recursive mergesorts of size N/2, plus the time to merge, which is linear.

Hence

T(1) = 1

T(N) = 2T(N/2) + N

Does anyone remember how to solve this recurrence?

Page 14

Mergesort – Recurrence

Recurrence: T(N) = 2T(N/2) + N

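Divide by N:

T(N)/N = T(N/2)/(N/2) + 1

What do we assume about the value of N? Because N is a power of 2, the same equation holds for N/2, N/4, ..., 2. Adding these log N equations, the intermediate terms cancel and we get:

T(N)/N = T(1)/1 + log N

Thus,

T(N) = N log N + N = O(N log N)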
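As a quick sanity check of the closed form, the recurrence can be evaluated directly for small powers of 2. This Python sketch assumes the logarithm is base 2; the function name `T` simply mirrors the notation above.

```python
def T(n):
    """Evaluate T(1) = 1, T(N) = 2T(N/2) + N for N a power of 2."""
    return 1 if n == 1 else 2 * T(n // 2) + n


# Compare with the closed form N log N + N, where N = 2**k and log N = k.
checks = [(2 ** k, T(2 ** k), 2 ** k * k + 2 ** k) for k in range(11)]
```

Each tuple in `checks` holds N, the value from the recurrence, and the value from the closed form; the last two agree for every N tested.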

Page 15

Quicksort – Description

Fastest generic sorting algorithm in practice. Average running time is O(N log N). What is the worst-case running time bound? O(N^2). We can make the O(N^2) bound very unlikely with just a little effort in choosing the pivot.

Page 16

Quicksort – Algorithm

1. If the number of elements in S is 0 or 1, then return.

2. Pick any element v in S. This is called the pivot.

3. Partition S – {v} into two disjoint groups:

S1 = { x ∈ S – {v} | x ≤ v }

and

S2 = { x ∈ S – {v} | x ≥ v }.

4. Return { quicksort(S1) followed by v followed by quicksort(S2)}.
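The four steps above translate almost directly into Python. This is a sketch that builds new lists rather than sorting in place. Two choices are assumptions made here, not taken from the slide: the pivot is the middle element, and elements equal to the pivot are sent to S2 (the definition allows either group).

```python
def quicksort(s):
    """Return a new sorted list, following steps 1 to 4 literally."""
    if len(s) <= 1:  # step 1: 0 or 1 elements
        return list(s)
    mid = len(s) // 2
    v = s[mid]  # step 2: pick any element as the pivot (middle one here)
    rest = s[:mid] + s[mid + 1:]  # S - {v}
    s1 = [x for x in rest if x < v]  # step 3: elements smaller than v
    s2 = [x for x in rest if x >= v]  # equal elements go to S2 in this sketch
    return quicksort(s1) + [v] + quicksort(s2)  # step 4
```

The extra lists make this version easy to read but memory-hungry; an in-place partition avoids them.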

Page 17

Quicksort – Example

Page 18

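Quicksort – Partition Strategy

Example. Input: 8, 1, 4, 9, 6, 3, 5, 2, 7, 0. Say 6 is chosen as pivot.

Swap the pivot with the last element; i starts at the first element (8) and j at the next-to-last element (7):

8 1 4 9 0 3 5 2 7 6

Move i right past elements smaller than the pivot and j left past elements larger than the pivot; i stops at 8 and j stops at 2:

8 1 4 9 0 3 5 2 7 6

Swap the elements at i and j:

2 1 4 9 0 3 5 8 7 6

Move i and j again; i stops at 9 and j stops at 5:

2 1 4 9 0 3 5 8 7 6

Swap the elements at i and j:

2 1 4 5 0 3 9 8 7 6

Move i and j again; i stops at 9 and j stops at 3, so i and j have crossed:

2 1 4 5 0 3 9 8 7 6

Swap the element at i with the pivot; the pivot 6 is now in its final position:

2 1 4 5 0 3 6 8 7 9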
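The trace above can be reproduced with a short Python sketch of the two-pointer partition. It assumes a non-empty list whose pivot has already been swapped into the last position; the name `partition` and the exact loop structure are illustrative, not the routine from class.

```python
def partition(a):
    """Partition a in place around the pivot a[-1]; return the pivot's final index."""
    pivot = a[-1]
    i, j = 0, len(a) - 2
    while True:
        while a[i] < pivot:  # i skips small elements; the pivot itself stops it
            i += 1
        while j >= 0 and a[j] > pivot:  # j skips large elements
            j -= 1
        if i >= j:  # pointers have crossed
            break
        a[i], a[j] = a[j], a[i]
        i += 1
        j -= 1
    a[i], a[-1] = a[-1], a[i]  # put the pivot between the two groups
    return i


data = [8, 1, 4, 9, 6, 3, 5, 2, 7, 0]
data[4], data[-1] = data[-1], data[4]  # move the chosen pivot 6 to the end
p = partition(data)
```

After the call, `data` matches the last array of the example and `p` is 6, the index where the pivot ends up.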