csc 2300 data structures & algorithms march 27, 2007 chapter 7. sorting

CSC 2300 Data Structures & Algorithms

March 27, 2007

Chapter 7. Sorting

Today – Sorting

Quicksort Implementation

Quickselect Algorithm

Decision Trees

Bucket Sort

Summary

Quicksort – Efficient Implementation

Is recursion good for all values of N? No.

What should we do for small values of N?
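One common answer to the question above is to stop recursing on small subarrays and finish them with insertion sort. The sketch below is only an illustration of that idea: the CUTOFF value of 10, the middle-element pivot, and all function names are assumptions, not the textbook's code.

```python
CUTOFF = 10  # assumed threshold; a good value is found by experiment


def insertion_sort(a, left, right):
    """Sort a[left..right] in place; cheap when the range is small."""
    for p in range(left + 1, right + 1):
        tmp = a[p]
        j = p
        while j > left and a[j - 1] > tmp:
            a[j] = a[j - 1]
            j -= 1
        a[j] = tmp


def partition(a, left, right):
    """Partition a[left..right] around its middle element; return the pivot's final index."""
    mid = (left + right) // 2
    a[mid], a[right] = a[right], a[mid]  # park the pivot at the end
    pivot = a[right]
    store = left
    for i in range(left, right):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[right] = a[right], a[store]
    return store


def quicksort(a, left=0, right=None):
    """Quicksort that hands small subarrays to insertion sort."""
    if right is None:
        right = len(a) - 1
    if right - left + 1 <= CUTOFF:
        insertion_sort(a, left, right)
        return
    p = partition(a, left, right)
    quicksort(a, left, p - 1)
    quicksort(a, p + 1, right)
```

Calling quicksort(data) sorts the list data in place.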

Worst Case Bound

Quicksort: O(N^2). Heapsort: O(N log N).

Can we combine the two algorithms to achieve a worst-case O(N log N) bound?

Problem 7.27. Modify quicksort to call heapsort if the level of recursion has reached 2 log N.

Why would this work? Consider the worst-case analysis.
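A rough sketch of the modification in Problem 7.27 follows. Each level of quicksort recursion does O(N) partitioning work in total, so at most 2 log N levels cost O(N log N), and the heapsort calls run on disjoint subarrays, which also totals O(N log N). The function names, the deliberately naive last-element pivot, and the use of Python's heapq on a copied slice (so this heapsort is not in place) are all assumptions made for illustration.

```python
import heapq
import math


def heapsort_range(a, left, right):
    """Sort a[left..right] with a binary heap: O(n log n) even in the worst case."""
    heap = a[left:right + 1]  # copy of the range; not an in-place heapsort
    heapq.heapify(heap)
    for i in range(left, right + 1):
        a[i] = heapq.heappop(heap)


def _sort(a, left, right, levels_left):
    if left >= right:
        return
    if levels_left == 0:
        # Recursion is too deep: give up on quicksort for this subarray.
        heapsort_range(a, left, right)
        return
    pivot = a[right]  # naive pivot, so sorted input triggers the fallback
    store = left
    for i in range(left, right):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[right] = a[right], a[store]
    _sort(a, left, store - 1, levels_left - 1)
    _sort(a, store + 1, right, levels_left - 1)


def quicksort_bounded(a):
    """Quicksort that switches to heapsort at recursion level 2 log N."""
    if len(a) > 1:
        limit = 2 * math.floor(math.log2(len(a)))
        _sort(a, 0, len(a) - 1, limit)
```

On already sorted input the naive pivot would normally give quadratic behavior; here the depth limit cuts that short.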

Quicksort Code

Lines 19 and 20 show why Quicksort is so fast.

Selection Problem

Page 1 of text.

Given a set of N numbers, determine the kth largest number.

How to solve this problem? Sort the numbers in decreasing order, and return the number in the kth position.

Can you improve on this scheme?
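The sort-then-index scheme can be sketched in a few lines. The function name is hypothetical, and Python's built-in sorted stands in for whichever sorting algorithm is used, so the cost is that of the sort, O(N log N).

```python
def kth_largest_by_sorting(nums, k):
    """Return the kth largest number (k = 1 is the maximum)."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")
    # Sort in decreasing order, then read off position k (1-based).
    return sorted(nums, reverse=True)[k - 1]
```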

Selection Problem – Modified Sort

Problem: Given a set of N numbers, determine the kth largest number.

How to do better than just sorting the numbers?

Read the first k numbers into an array, and sort the numbers in decreasing order.

Next, each remaining number is read one by one.

Do you know how to complete the algorithm?
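One way to complete the algorithm: ignore a new number if it is no larger than the kth element of the array; otherwise insert it in its correct place, bumping the old kth element out. The answer is the last element of the array. The sketch below uses hypothetical names and takes roughly O(N·k) time.

```python
def kth_largest_k_array(nums, k):
    """Return the kth largest number using a sorted array of the k largest seen so far."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")
    top = sorted(nums[:k], reverse=True)  # first k numbers, decreasing order
    for x in nums[k:]:
        if x > top[-1]:
            # Shift smaller entries one slot right; the old kth entry falls off.
            j = k - 1
            while j > 0 and top[j - 1] < x:
                top[j] = top[j - 1]
                j -= 1
            top[j] = x
    return top[-1]
```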

Selection Problem – Heapsort

Problem: Given a set of N numbers, determine the kth largest number.

What is the time bound if we use a heap?

Furthermore, what is the bound if k = N/2 (i.e., if we want to find the median)?
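A possible heap-based approach is to build a max-heap of all N numbers in O(N) time and then remove the maximum k - 1 times, for O(N + k log N) in total; with k = N/2 this becomes O(N log N). The sketch assumes Python's heapq, which is a min-heap, so values are negated to simulate a max-heap. The function name is hypothetical.

```python
import heapq


def kth_largest_heap(nums, k):
    """Return the kth largest number via buildHeap plus k - 1 deleteMax operations."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")
    heap = [-x for x in nums]  # negate so the smallest stored value is the largest input
    heapq.heapify(heap)        # O(N)
    for _ in range(k - 1):     # each pop costs O(log N)
        heapq.heappop(heap)
    return -heap[0]
```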

Selection Problem – Quicksort

Problem: Given a set of N numbers, determine the kth largest number.

Quicksort is very fast in sorting N numbers.

Thus, quicksort should be very fast in selecting the kth largest number.

What is the work that we can save?

Quicksort – Algorithm

1. If the number of elements in S is 0 or 1, then return.

2. Pick any element v in S. This is called the pivot.

3. Partition S – {v} into two disjoint groups:

S1 = { x ∈ S – {v} | x ≤ v}

and

S2 = { x ∈ S – {v} | x ≥ v}.

4. Return { quicksort(S1) followed by v followed by quicksort(S2)}.

Where can we save work if we just want to find the kth largest number?

Quickselect makes only one recursive call (instead of two).
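The saving can be sketched as follows: after partitioning, only the side that contains the wanted position is processed further. This is a minimal illustration rather than the textbook's code; the function names and the middle-element pivot are assumptions, and the input is copied so the caller's list is left untouched.

```python
def partition(a, left, right):
    """Partition a[left..right] around its middle element; return the pivot's final index."""
    mid = (left + right) // 2
    a[mid], a[right] = a[right], a[mid]
    pivot = a[right]
    store = left
    for i in range(left, right):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[right] = a[right], a[store]
    return store


def _select(a, left, right, target):
    """Return the element that would sit at index target if a were sorted ascending."""
    if left == right:
        return a[left]
    p = partition(a, left, right)
    if target == p:
        return a[p]
    if target < p:
        return _select(a, left, p - 1, target)   # only the left part matters
    return _select(a, p + 1, right, target)      # only the right part matters


def quickselect(nums, k):
    """Return the kth largest number (k = 1 is the maximum)."""
    if not 1 <= k <= len(nums):
        raise ValueError("k must be between 1 and len(nums)")
    a = list(nums)
    # The kth largest sits at index N - k in ascending order.
    return _select(a, 0, len(a) - 1, len(a) - k)
```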

Quickselect – Code

Quicksort – Analysis

Time bounds: worst case, best case, and average case.

What would be the corresponding bounds for Quickselect?

In particular, the average case for Quickselect?

Average Case Analysis

Assume that each of the sizes for S1 is equally likely and thus has probability 1/N.

The average value of T(i) is thus (1/N) ∑_{j=0}^{N-1} T(j).

Quicksort recurrence becomes T(N) = (2/N) ∑_{j=0}^{N-1} T(j) + cN.

What is the recurrence for Quickselect?

Quickselect recurrence is T(N) = (1/N) ∑_{j=0}^{N-1} T(j) + cN. See Problem 7.30.

What is the answer?
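A quick numerical experiment can suggest the answer before proving it. The sketch below evaluates T(N) = (calls/N) ∑ T(j) + cN, with the sum over j from 0 to N-1, where calls = 2 models quicksort and calls = 1 models quickselect. The base case T(0) = 0, the value c = 1, and the function name are assumptions chosen for illustration. With calls = 1 the ratio T(N)/N stays below 2, which is consistent with an O(N) average case for quickselect.

```python
def average_cost(n_max, calls, c=1.0):
    """Tabulate T(0..n_max) for T(N) = (calls/N) * sum_{j<N} T(j) + c*N, with T(0) = 0."""
    T = [0.0]
    running_sum = 0.0  # holds T(0) + ... + T(N-1)
    for n in range(1, n_max + 1):
        running_sum += T[n - 1]
        T.append(calls * running_sum / n + c * n)
    return T


if __name__ == "__main__":
    select = average_cost(1000, calls=1)
    sort = average_cost(1000, calls=2)
    for n in (10, 100, 1000):
        print(n, round(select[n] / n, 3), round(sort[n] / n, 3))
```

The second printed column levels off as N grows, while the third keeps growing slowly, as expected for N log N.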

Decision Tree

A decision tree is an abstraction used to prove lower bounds. In our context, it is a binary tree.

Each node represents a set of possible orderings. The results of the comparisons are the tree edges.

Decision Tree – Example

Sorting three numbers.

Decision Tree

Every algorithm that sorts by using only comparisons can be represented by a decision tree.

The number of comparisons used by the sorting algorithm in the worst case is equal to the depth of the deepest leaf.

The average number of comparisons used is equal to the average depth of the leaves.
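To make the leaf depths concrete, here is one possible decision tree for sorting three numbers, written as nested comparisons. Each return statement is a leaf, and the count returned with it is that leaf's depth. The particular order of comparisons and the function name are assumptions; other trees are equally valid.

```python
from itertools import permutations


def sort3(a, b, c):
    """Sort three distinct values; return (sorted tuple, comparisons used)."""
    if a < b:
        if b < c:
            return (a, b, c), 2
        if a < c:
            return (a, c, b), 3
        return (c, a, b), 3
    if a < c:
        return (b, a, c), 2
    if b < c:
        return (b, c, a), 3
    return (c, b, a), 3


if __name__ == "__main__":
    depths = [sort3(*p)[1] for p in permutations((1, 2, 3))]
    print("worst case:", max(depths))
    print("average:", sum(depths) / len(depths))
```

For this tree the six leaves have depths 2, 3, 3, 2, 3, 3, so the worst case is 3 comparisons and the average is 16/6.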

Theory

Lemma 7.1. Let T be a binary tree of depth d. Then T has at most 2^d leaves.

Lemma 7.2. A binary tree with L leaves must have depth at least ⌈log L⌉.

Theorem 7.6. Any sorting algorithm that uses only comparisons between elements requires at least ⌈log (N!)⌉ comparisons in the worst case.

⌈ ⌉ represents ceiling above. Stirling's formula in Problem 7.34.

Theorem 7.7. Any sorting algorithm that uses only comparisons between elements requires Ω(N log N) comparisons.
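The bound of Theorem 7.6 is easy to tabulate for small N. The sketch below computes the ceiling of log base 2 of N! exactly with integer arithmetic, using the fact that for an integer x ≥ 1 the ceiling of log2(x) equals the bit length of x - 1. The function name is hypothetical.

```python
import math


def comparison_lower_bound(n):
    """Return ceil(log2(n!)), the worst-case lower bound on comparisons for n elements."""
    # (x - 1).bit_length() == ceil(log2(x)) for every integer x >= 1.
    return (math.factorial(n) - 1).bit_length()


if __name__ == "__main__":
    for n in (3, 4, 5, 10, 100):
        print(n, comparison_lower_bound(n), round(n * math.log2(n), 1))
```

The last column prints N log N for comparison; the two columns grow at the same rate, as Theorem 7.7 states.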

Linear Time Sorting

We have shown that any general sorting algorithm that uses only comparisons requires Ω(N log N) time in the worst case.

Now we describe bucket sort, which is a linear time algorithm.

Is this a contradiction?

Bucket Sort

Say the input A_1, A_2, …, A_N consists of only positive integers smaller than M.

Keep an array called count, of size M, which is initialized to all 0's.

Thus, count has M cells, or buckets, which are initially empty.

When A_i is read, increment count[A_i] by 1.

After all the input is read, scan the count array, printing out a representation of the sorted list.

How much time does the algorithm require?

What if M = O(N)?

Have we violated the Ω(N log N) lower bound?
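The steps above can be sketched directly. Reading the input takes O(N) and scanning count takes O(M), so the total is O(M + N), which is linear when M = O(N). The lower bound is not violated, because the algorithm uses the values as array indices rather than sorting by comparisons alone. The function name and the choice to return a list instead of printing are assumptions.

```python
def bucket_sort(nums, m):
    """Sort positive integers smaller than m in O(m + len(nums)) time."""
    count = [0] * m  # m buckets, all initially empty
    for x in nums:
        if not 0 < x < m:
            raise ValueError("inputs must be positive integers smaller than m")
        count[x] += 1
    # Scan the buckets in increasing order, emitting each value count[value] times.
    out = []
    for value in range(m):
        out.extend([value] * count[value])
    return out
```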

Summary
