Computer Science 101 Fast Searching and Sorting

Upload: whitney-jacobs

Post on 31-Dec-2015

Improving Efficiency

• We got a better best case by tweaking the selection sort and the bubble sort

• We would like to improve the worst cases, too

Example: Sequential Search

If the data items are in random order, then each one must be examined in the worst case

Requires N comparisons in the worst case

set Current to 1
set Found to false
while Current <= N and not Found do
    if A(Current) = Target then
        set Found to true
    else
        increment Current
if Found then
    output Current
else
    output 0
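The pseudocode above can be sketched in Python roughly as follows. The function name `sequential_search` is an assumption for illustration. The sketch keeps the slide's convention of reporting a 1-based position, with 0 meaning "not found".

```python
def sequential_search(a, target):
    """Return the 1-based position of target in a, or 0 if it is absent."""
    current = 1
    found = False
    while current <= len(a) and not found:
        if a[current - 1] == target:   # Python lists are 0-based
            found = True
        else:
            current += 1
    return current if found else 0
```

When the target is absent, the loop examines every one of the N items before giving up, which is the worst case described above.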

Searching a Sorted List

• When we search a phone book, we don’t begin with the first name and look at each successor

• We skip over large numbers of names until we find the target or give up

Binary Search

• Strategy
– Have pointers marking left and right ends of the list still to be processed
– Compute the position of the midpoint between the two pointers
– If the target equals the value at midpoint, quit with the position found
– Otherwise, if the target is less than that value, search just the positions to the left of midpoint
– Otherwise, search just the positions to the right of midpoint
– Give up when the pointers cross

List: 14 16 22 32 34 66 80 90    Target: 66

[Figure: pointers labeled begin, mid, and end mark positions in the list]

The Binary Search Space

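Position:  0   1   2   3   4   5   6
Value:    34  41  56  63  72  89  95

Left half (positions 0-2): 34 41 56        Right half (positions 4-6): 72 89 95

Next level: 34 (position 0), 56 (position 2), 72 (position 4), 95 (position 6)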

The Binary Search Algorithm

set Begin to 1
set End to N
set Found to false
while Begin <= End and not Found do
    compute the midpoint
    if Target = A(Mid) then
        set Found to true
    else if Target < A(Mid) then
        search to the left of the midpoint
    else
        search to the right of the midpoint
if Found then
    output Mid
else
    output 0

The Binary Search Algorithm

With the three informal steps filled in (the division discards any remainder):

set Begin to 1
set End to N
set Found to false
while Begin <= End and not Found do
    set Mid to (Begin + End) / 2
    if Target = A(Mid) then
        set Found to true
    else if Target < A(Mid) then
        set End to Mid - 1
    else
        set Begin to Mid + 1
if Found then
    output Mid
else
    output 0
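A minimal Python sketch of the binary search algorithm, assuming the list is already sorted in ascending order. The name `binary_search` is an assumption. The sketch returns as soon as the target is found instead of keeping a Found flag, and it keeps the 1-based positions used in the pseudocode.

```python
def binary_search(a, target):
    """Return the 1-based position of target in the sorted list a, or 0."""
    begin, end = 1, len(a)
    while begin <= end:                # the pointers have not crossed yet
        mid = (begin + end) // 2       # integer division
        value = a[mid - 1]             # Python lists are 0-based
        if target == value:
            return mid
        elif target < value:
            end = mid - 1              # search to the left of the midpoint
        else:
            begin = mid + 1            # search to the right of the midpoint
    return 0
```

For the list 14 16 22 32 34 66 80 90 and target 66, the first midpoint is position 4 (value 32), so the search moves right, and the second midpoint is position 6, where the target is found.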

Analysis of Binary Search

• On each pass through the loop, ½ of the positions in the list are discarded

• In the worst case, the number of comparisons equals the number of times the size of the list can be divided by 2

• How many comparisons for a list of size N, in the worst case?
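One way to explore the question above is to count loop passes directly. The sketch below is an illustration, not part of the original slides. The helper name `worst_case_passes` is hypothetical, and it assumes the target is larger than every item, so the search always moves right and never finds it. Each pass examines one midpoint.

```python
def worst_case_passes(n):
    """Count binary search loop passes on n items when the target is
    larger than every item, so each pass discards the left half."""
    begin, end, passes = 1, n, 0
    while begin <= end:
        mid = (begin + end) // 2
        passes += 1
        begin = mid + 1                # always search to the right
    return passes
```

Under this assumption the count comes out to floor(log2 N) + 1 passes: 3 for a list of 7, 4 for a list of 8, and 10 for a list of 1000.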

Improving on Sorting

• Several algorithms have been developed to break the (N² - N) / 2 barrier for sorting

• Most of them use a divide-and-conquer strategy

• Break the list into smaller pieces and apply another algorithm to them
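To see where the (N² - N) / 2 figure comes from, here is a sketch of a plain selection sort that counts its comparisons. It is an assumed, untweaked version written for illustration, not the exact variant from earlier lectures, and the name `selection_sort_comparisons` is hypothetical.

```python
def selection_sort_comparisons(items):
    """Sort a copy of items with selection sort.
    Return (sorted_list, number_of_comparisons)."""
    a = list(items)
    n = len(a)
    comparisons = 0
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):      # compare against every later item
            comparisons += 1
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a, comparisons
```

For 5 items this performs 4 + 3 + 2 + 1 = 10 comparisons, which matches (5² - 5) / 2.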

Quicksort

• Strategy - Divide and Conquer:
– Partition list into two parts, with small elements in the first part and large elements in the second part
– Sort the first part
– Sort the second part

• Question - How do we sort the sections?
• Answer - Apply Quicksort to them

• Recursive algorithm - one which makes use of itself to solve smaller problems of the same type

Quicksort

• Question - Will this recursive process ever stop?

• Answer - Yes, when the problem is small enough, we no longer use recursion. Such cases are called base cases

Partitioning a List

• To partition a list, we choose a pivot element

• The elements that are less than or equal to the pivot go into the first section

• The elements larger than the pivot go into the second section

Partitioning a List

Pivot is the element at the midpoint

19 8 15 5 30 20 10 1 28 25 12

[Figure: the Partition step rearranges the list into a first sublist to sort, the pivot, and a second sublist to sort]

Data are where they should be relative to the pivot

The Quicksort Algorithm

if the list to sort has more than 1 element then
    if the list has exactly two elements then
        if the elements are out of order then
            exchange them
    else
        perform the Partition Algorithm on the list
        apply QuickSort to the first section
        apply QuickSort to the second section
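The structure of the Quicksort algorithm can be sketched in Python as below. The `partition` helper here is a stand-in chosen for brevity: a single left-to-right scan around the midpoint element, not necessarily the partition algorithm used in the course. The function names and the `first`/`last` index parameters are assumptions.

```python
def partition(a, first, last):
    """Rearrange a[first..last] around the midpoint element (stand-in
    single-scan partition). Return the pivot's final index."""
    mid = (first + last) // 2
    a[first], a[mid] = a[mid], a[first]       # park the pivot at the front
    pivot = a[first]
    boundary = first                          # last index holding an item <= pivot
    for i in range(first + 1, last + 1):
        if a[i] <= pivot:
            boundary += 1
            a[boundary], a[i] = a[i], a[boundary]
    a[first], a[boundary] = a[boundary], a[first]
    return boundary


def quicksort(a, first=0, last=None):
    """Sort a[first..last] in place."""
    if last is None:
        last = len(a) - 1
    size = last - first + 1
    if size > 1:                              # base case: 0 or 1 element, do nothing
        if size == 2:                         # base case: exactly two elements
            if a[first] > a[last]:
                a[first], a[last] = a[last], a[first]
        else:
            p = partition(a, first, last)
            quicksort(a, first, p - 1)        # first section
            quicksort(a, p + 1, last)         # second section
```

Lists of size 0, 1, or 2 are the base cases, so the recursion stops once the sections get that small.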

Partitioning: Choosing the Pivot

• Ideal would be to choose the median element as the pivot, but this would take too long

• Some versions just choose the first element

• Our choice - the median of the first three elements

Partitioning a List

Pivot is median of first three items

Before: 19 8 15 5 30 20 10 1 28 25 12

After Partition: 10 8 12 5 1 15 20 30 28 25 19

For data in random order, the median of the first three elements is usually a better approximation to the actual median than the single element at the midpoint, and so tends to give more even splits

The Partition Algorithm

exchange the median of the first 3 elements with the first
set P to first position of list
set L to second position of list
set U to last position of list
while L <= U
    while L <= U and A(L) <= A(P) do
        set L to L + 1
    while A(U) > A(P) do
        set U to U - 1
    if L < U then
        exchange A(L) and A(U)
exchange A(P) and A(U)

A    The list
P    The position of the pivot element
L    Probes for elements > pivot
U    Probes for elements <= pivot
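A Python sketch of the partition algorithm, using 0-based indexes and assuming the sublist has at least three elements (Quicksort handles shorter lists itself). The `first`/`last` parameters and the returned pivot index are assumptions added so the function can work on a section of a larger list. The `l <= u` test in the first inner loop is a guard that keeps L from running past the end when every element is less than or equal to the pivot.

```python
def partition(a, first, last):
    """Partition a[first..last] (at least 3 items) in place around the
    median of its first three elements. Return the pivot's final index."""
    # Exchange the median of the first three elements with the first element.
    by_value = sorted([first, first + 1, first + 2], key=lambda i: a[i])
    m = by_value[1]
    a[first], a[m] = a[m], a[first]

    p, l, u = first, first + 1, last
    while l <= u:
        while l <= u and a[l] <= a[p]:   # L probes for an element > pivot
            l += 1
        while a[u] > a[p]:               # U probes for an element <= pivot;
            u -= 1                       # it stops at p at the latest
        if l < u:
            a[l], a[u] = a[u], a[l]
    a[p], a[u] = a[u], a[p]              # put the pivot between the sections
    return u
```

Applied to 19 8 15 5 30 20 10 1 28 25 12, the pivot is 15 (the median of 19, 8, and 15), and the list becomes 10 8 12 5 1 15 20 30 28 25 19.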

Quicksort: Rough Analysis

• For simplification, assume that we always get even splits when we partition

• When we partition the entire list, each element is compared with the pivot - approximately n comparisons

• Each of the halves is partitioned, each taking about n/2 comparisons, thus about n more comparisons

• Each of the fourths is partitioned, each taking about n/4 comparisons - about n more

Quicksort: Rough Analysis

• How many levels of about n comparisons do we get?

• Roughly, we keep splitting until the pieces are about size 1

• How many times must we divide n by 2 before we get 1?

• log₂(n) times, of course

• Thus about n log₂(n) comparisons in the ideal or best case
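The rough analysis can be checked with a small recurrence: about n comparisons to partition a list of size n, plus the cost of two halves. This is a sketch under the even-split assumption. The function name `even_split_comparisons` is hypothetical, and a list of size 1 is taken to cost nothing.

```python
def even_split_comparisons(n):
    """Rough comparison count for Quicksort when every partition splits
    the list exactly in half (n is assumed to be a power of 2)."""
    if n <= 1:
        return 0
    return n + 2 * even_split_comparisons(n // 2)
```

For n = 1024 there are 10 levels of about 1024 comparisons each, about 10,240 in total, compared with (1024² - 1024) / 2 = 523,776 for the quadratic sorts.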

Call Tree For a Best Case

34 41 56 63 72 89 95

34 41 56        72 89 95

34    56        72    95

We select the midpoint element as the pivot. The median element happens to be at the midpoint on each call. But the array was already sorted!

Worst Case

• What if the value at the midpoint is near the largest value on each call?

• Or near the smallest value on each call?

• Then there will be approximately n subdivisions, and the total number of comparisons will degenerate to n²

Call Tree For a Worst Case

34 41 56 63 72 89 95

41 56 63 72 89 95

56 63 72 89 95

63 72 89 95

72 89 95

89 95

95

We select the first element as the pivot. The smallest element happens to be the first one on each call. n subdivisions!
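The worst case can be illustrated by counting element-to-pivot comparisons when the first element is always the pivot. The sketch below builds new lists instead of partitioning in place, purely to keep the counting easy to follow. The function name `first_pivot_comparisons` is hypothetical.

```python
def first_pivot_comparisons(items):
    """Count element-to-pivot comparisons for a Quicksort that always
    picks the first element as the pivot."""
    if len(items) <= 1:
        return 0
    pivot, rest = items[0], items[1:]
    small = [x for x in rest if x <= pivot]    # first section
    large = [x for x in rest if x > pivot]     # second section
    return (len(rest)
            + first_pivot_comparisons(small)
            + first_pivot_comparisons(large))
```

On the sorted list 34 41 56 63 72 89 95 this gives 6 + 5 + 4 + 3 + 2 + 1 = 21 comparisons, while the order 63 41 34 56 89 72 95, which splits evenly on each call, needs only 10.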

Other Methods of Selecting the Pivot Element

• Pick a random element

• Pick the median of the first three elements

• Pick the median of the first, middle, and last elements

• Pick the true median element - not!! Finding the median is itself an O(n) algorithm
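The first three strategies in the list can be sketched as small helper functions that return the index of the chosen pivot. All names and the `(a, first, last)` signature are assumptions for illustration. The median-of-three helpers assume the section has at least three elements.

```python
import random


def random_pivot(a, first, last):
    """Index of a randomly chosen element of a[first..last]."""
    return random.randint(first, last)         # both ends inclusive


def median_index(a, i, j, k):
    """Index (one of i, j, k) holding the median of the three values."""
    return sorted([i, j, k], key=lambda idx: a[idx])[1]


def median_of_first_three(a, first, last):
    return median_index(a, first, first + 1, first + 2)


def median_of_first_middle_last(a, first, last):
    return median_index(a, first, (first + last) // 2, last)
```

For 19 8 15 5 30 20 10 1 28 25 12, the median of the first three is 15, and the median of the first, middle, and last elements (19, 20, 12) is 19.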

For Monday

Continue Reading in Chapter 3