lecture 26: bucket sort & radix sort

LECTURE 26: BUCKET SORT & RADIX SORT CSC 213 – Large Scale Programming

Upload: giacinto-garza

Post on 03-Jan-2016

TRANSCRIPT

Page 1: Lecture 26: BUCKET SORT & RADIX Sort

LECTURE 26: BUCKET SORT & RADIX SORT

CSC 213 – Large Scale Programming

Page 2: Lecture 26: BUCKET SORT & RADIX Sort

Today’s Goals

Review discussion of merge sort and quick sort
  How do they work & why use divide-and-conquer?
  Are they the fastest possible sorts?

Another way to sort data presented
  How can we sort data with a single simple value?
  What are the limits on using buckets to sort our data?
  If we want more buckets, can we expand these limits?
  How does radix sort work? How long does it need?

Page 3: Lecture 26: BUCKET SORT & RADIX Sort

Quick Sort v. Merge Sort

Quick Sort
  Divide data around pivot
    Want pivot to be near middle
    All comparisons occur here
  Conquer with recursion
    Does not need extra space
  Merge usually done already
    Data already sorted!

Merge Sort
  Divide data blindly in half
    Always gets even split
    No comparisons performed!
  Conquer with recursion
    Needs* to use other arrays
  Merge combines solutions
    Compares from (sorted) halves

Page 4: Lecture 26: BUCKET SORT & RADIX Sort

Complexity of Sorting

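With n! external nodes, binary tree's height is:
  minimum height (time) = log (n!)
  log (n!) grows as n log n, so comparison-based sorts cannot beat O(n log n)

[Figure: decision tree whose internal nodes are comparisons (xi < xj ?, xa < xb ?, xc < xd ?, ...) and which has n! external nodes]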
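One way to see why log (n!) grows as n log n is sketched here. Base-2 logarithms are an assumption that only changes a constant factor. The first bound keeps every term of the sum, and the second keeps only the larger half of the terms.

```latex
\log_2(n!) = \sum_{i=1}^{n} \log_2 i \;\le\; n \log_2 n
\qquad\text{and}\qquad
\log_2(n!) \;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \log_2 i \;\ge\; \frac{n}{2} \log_2 \frac{n}{2}
```

Both bounds are proportional to n log n, so the minimum height of the decision tree is too.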

Page 5: Lecture 26: BUCKET SORT & RADIX Sort

Bucket-Sort

Buckets, B, is an array of Sequences
Sorts Collection, C, in two phases:
  1. Remove each element v from C & add to B[v]
  2. Move elements from each bucket back to C

[Figure: sequences A, B, C]

Page 7: Lecture 26: BUCKET SORT & RADIX Sort

Bucket-Sort Algorithm

Algorithm bucketSort(Sequence<Integer> C)
  B = new Sequence[10]   // & instantiate each Sequence
  // Phase 1
  for each element v in C
    B[v].addLast(v)      // Assumes each number in C between 0 & 9
  endfor
  // Phase 2
  loc = 0
  for each Sequence b in B
    for each element v in b
      C.set(loc, v)
      loc += 1
    endfor
  endfor
  return C
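The pseudocode above could be written in Java roughly as follows. This is a minimal sketch, not the course's reference code. It assumes java.util.List stands in for the slide's Sequence type, and the class name BucketSortSketch is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class BucketSortSketch {
    /** Sorts c in place; assumes every value in c is between 0 and 9. */
    static List<Integer> bucketSort(List<Integer> c) {
        // B = new Sequence[10], with each bucket instantiated
        List<List<Integer>> b = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            b.add(new ArrayList<>());
        }
        // Phase 1: drop each value into the bucket it indexes
        for (int v : c) {
            b.get(v).add(v);
        }
        // Phase 2: copy the buckets back into c, lowest bucket first
        int loc = 0;
        for (List<Integer> bucket : b) {
            for (int v : bucket) {
                c.set(loc, v);
                loc += 1;
            }
        }
        return c;
    }
}
```

The list passed in must be mutable (for example an ArrayList), because phase 2 overwrites its positions.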

Page 11: Lecture 26: BUCKET SORT & RADIX Sort

Bucket Sort Properties

For this to work, values must be legal indices
  Non-negative integer indices needed to access arrays
Sorting occurs without comparing objects

Stable sort describes any sort of this type
  Preserves relative ordering of objects with same value
  (BUBBLE-SORT & MERGE-SORT are other stable sorts)
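Stability is easiest to see when elements carry more than their sort value. The sketch below assumes labels such as "3a" whose first character is a digit from 0 to 9 and sorts only on that digit. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class StableBucketSortSketch {
    /** Bucket-sorts labels by their leading digit (assumed '0'..'9'). */
    static List<String> sortByLeadingDigit(List<String> c) {
        List<List<String>> b = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            b.add(new ArrayList<>());
        }
        for (String s : c) {
            // add appends to the end, so arrival order is kept within a bucket
            b.get(s.charAt(0) - '0').add(s);
        }
        List<String> out = new ArrayList<>();
        for (List<String> bucket : b) {
            out.addAll(bucket);
        }
        return out;
    }
}
```

With the input "3a", "1b", "3c", "1d", the two labels starting with 1 come out as "1b" then "1d", and the two starting with 3 as "3a" then "3c", matching their input order.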

Page 12: Lecture 26: BUCKET SORT & RADIX Sort

Bucket Sort Extensions

Use Comparator for BUCKET-SORT
  Get index for v using compare(v, null)
Comparator for booleans could return
  0 when v is false
  1 when v is true
Comparator for US states could return
  Annual per capita consumption of Jello
  Consumption of Jello overall, in cubic feet
  State's ranking by population
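A sketch of the Comparator idea, following the slide's convention that compare(v, null) returns the bucket index for v. This is an unusual use of Comparator; a key function such as ToIntFunction would be the more conventional choice. The numBuckets parameter and all names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ComparatorBucketSortSketch {
    /** Index for a boolean: 0 when v is false, 1 when v is true. */
    static final Comparator<Boolean> BOOLEAN_INDEX = (v, ignored) -> v ? 1 : 0;

    /** Sorts c in place; comp.compare(v, null) must return an index in [0, numBuckets). */
    static <T> List<T> bucketSort(List<T> c, Comparator<T> comp, int numBuckets) {
        List<List<T>> b = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            b.add(new ArrayList<>());
        }
        for (T v : c) {
            b.get(comp.compare(v, null)).add(v);
        }
        int loc = 0;
        for (List<T> bucket : b) {
            for (T v : bucket) {
                c.set(loc++, v);
            }
        }
        return c;
    }
}
```

Sorting booleans then needs only two buckets: bucketSort(list, BOOLEAN_INDEX, 2).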

Page 13: Lecture 26: BUCKET SORT & RADIX Sort

Bucket Sort Extensions

State's ranking by population
  1 California
  2 Texas
  3 New York
  4 Florida
  5 Illinois
  6 Pennsylvania
  7 Ohio
  8 Michigan
  9 Georgia

Page 15: Lecture 26: BUCKET SORT & RADIX Sort

Bucket Sort Extensions

Extended BUCKET-SORT works with many types
  Limited set of data needed for this to work
  Need way to enumerate values of the set
  ("enumerate" is a subtle hint)
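The hint plausibly points at Java's enum types, whose ordinal() method numbers the values from 0. That reading is an assumption, and the sketch below is one way it could work. The Size enum and all other names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class EnumBucketSortSketch {
    /** Hypothetical enum used only for illustration. */
    enum Size { SMALL, MEDIUM, LARGE }

    /** Sorts c in place, using one bucket per enum constant, in declaration order. */
    static <E extends Enum<E>> List<E> bucketSort(List<E> c, Class<E> type) {
        int numBuckets = type.getEnumConstants().length;
        List<List<E>> b = new ArrayList<>();
        for (int i = 0; i < numBuckets; i++) {
            b.add(new ArrayList<>());
        }
        for (E v : c) {
            b.get(v.ordinal()).add(v);   // ordinal() is always a legal index
        }
        int loc = 0;
        for (List<E> bucket : b) {
            for (E v : bucket) {
                c.set(loc++, v);
            }
        }
        return c;
    }
}
```

Because an enum has a fixed, limited set of values, the number of buckets is known before any data arrives.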

Page 16: Lecture 26: BUCKET SORT & RADIX Sort

d-Tuples

Combination of d values such as (k1, k2, …, kd)
  ki is the ith dimension of the tuple
A point (x, y, z) is a 3-tuple
  x is 1st dimension's value
  Value of 2nd dimension is y
  z is 3rd dimension's value

Page 18: Lecture 26: BUCKET SORT & RADIX Sort

Lexicographic Order

Assume a & b are both d-tuples
  a = (a1, a2, …, ad)
  b = (b1, b2, …, bd)
Can say a < b if and only if
  a1 < b1 OR
  a1 = b1 && (a2, …, ad) < (b2, …, bd)
Order these 2-tuples using previous definition
  Unsorted: (3 4) (7 8) (3 2) (1 4) (4 8)
  Sorted:   (1 4) (3 2) (3 4) (4 8) (7 8)
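The definition can be checked with a small comparator. This sketch assumes tuples are stored as int arrays of equal length and uses Arrays.sort, a comparison sort, only to illustrate the ordering. It is not radix sort, and the names are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

final class LexicographicSketch {
    /** Compares two d-tuples of equal length, dimension by dimension. */
    static final Comparator<int[]> LEXICOGRAPHIC = (a, b) -> {
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);  // first differing dimension decides
            }
        }
        return 0;  // every dimension equal
    };

    /** Returns a sorted copy of the tuples, leaving the input array untouched. */
    static int[][] sorted(int[][] tuples) {
        int[][] copy = tuples.clone();
        Arrays.sort(copy, LEXICOGRAPHIC);
        return copy;
    }
}
```

Applied to (3 4) (7 8) (3 2) (1 4) (4 8), it yields (1 4) (3 2) (3 4) (4 8) (7 8).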

Page 19: Lecture 26: BUCKET SORT & RADIX Sort

Radix-Sort

Very fast sort for data expressed as d-tuple
  Cheats to win; faster than the lower bound for comparison-based sorts
Sort performed using d calls to bucket sort
  Sorts least to most important dimension of tuple
Luckily lots of data are d-tuples
  String is d-tuple of char
  "L E T T E R S"
  "L I N G E R S"
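Treating a String as a d-tuple of char, radix sort could look like the sketch below. It assumes every word has the same length d and contains only the letters 'A' to 'Z', so 26 buckets suffice. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class StringRadixSortSketch {
    /** Returns a sorted copy; assumes all words have length d and use only 'A'..'Z'. */
    static List<String> radixSort(List<String> words, int d) {
        List<String> c = new ArrayList<>(words);
        // One bucket sort per character, from least to most important position
        for (int pos = d - 1; pos >= 0; pos--) {
            List<List<String>> b = new ArrayList<>();
            for (int i = 0; i < 26; i++) {
                b.add(new ArrayList<>());
            }
            for (String w : c) {
                b.get(w.charAt(pos) - 'A').add(w);
            }
            c.clear();
            for (List<String> bucket : b) {
                c.addAll(bucket);
            }
        }
        return c;
    }
}
```

The approach relies on each pass being stable: ties on the current character keep the order set by the less important characters already processed.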

Page 20: Lecture 26: BUCKET SORT & RADIX Sort

Radix-Sort

Luckily lots of data are d-tuples
  Digits of an int can be used for sorting, also
  1 0 0 1 3 7 2 9
  1 0 0 9 2 2 1 0

Page 21: Lecture 26: BUCKET SORT & RADIX Sort

Radix-Sort For Integers

Represent int as a d-tuple of digits:
  62₁₀ = 111110₂    04₁₀ = 000100₂
Decimal digits need 10 buckets to use for sorting
Ordering using their bits needs 2 buckets
O(d∙n) time needed to run RADIX-SORT
  d is length of longest element in input
  In most cases value of d is constant (d = 31 for int)
  Radix sort takes O(n) time, ignoring constant
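With decimal digits, each pass uses 10 buckets. A minimal sketch follows, assuming non-negative ints with at most d decimal digits. The names are hypothetical, and d is passed in rather than computed.

```java
import java.util.ArrayList;
import java.util.List;

final class DecimalRadixSortSketch {
    /** Returns a sorted copy; assumes non-negative values with at most d decimal digits. */
    static int[] radixSort(int[] input, int d) {
        int[] c = input.clone();
        int divisor = 1;  // 1 selects the ones digit, 10 the tens digit, and so on
        for (int digit = 0; digit < d; digit++) {
            List<List<Integer>> b = new ArrayList<>();
            for (int i = 0; i < 10; i++) {
                b.add(new ArrayList<>());
            }
            for (int v : c) {
                b.get((v / divisor) % 10).add(v);
            }
            int loc = 0;
            for (List<Integer> bucket : b) {
                for (int v : bucket) {
                    c[loc++] = v;
                }
            }
            divisor *= 10;
        }
        return c;
    }
}
```

Each of the d passes touches every element a constant number of times, which is where the O(d∙n) bound comes from.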

Page 26: Lecture 26: BUCKET SORT & RADIX Sort

Radix-Sort In Action

List of 4-bit integers sorted using RADIX-SORT

  Input   After bit 0   After bit 1   After bit 2   After bit 3
  1001    0010          1001          1001          0001
  0010    1110          1101          0001          0010
  1101    1001          0001          0010          1001
  0001    1101          0010          1101          1101
  1110    0001          1110          1110          1110

(bit 0 is the least significant bit)
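The passes on this slide can be reproduced with a short trace. This sketch assumes non-negative ints and replaces the two buckets with two scans of the list (first the values whose current bit is 0, then those whose bit is 1), which gives the same stable result. The names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class RadixTraceSketch {
    /** Returns the list contents after each pass; pass i sorts by bit i. */
    static List<List<String>> trace(int[] input, int bits) {
        int[] c = input.clone();
        List<List<String>> passes = new ArrayList<>();
        for (int bit = 0; bit < bits; bit++) {
            int[] next = new int[c.length];
            int loc = 0;
            // Scan for bucket 0, then bucket 1; order within a bucket is preserved
            for (int bucket = 0; bucket <= 1; bucket++) {
                for (int v : c) {
                    if (((v >> bit) & 1) == bucket) {
                        next[loc++] = v;
                    }
                }
            }
            c = next;
            List<String> row = new ArrayList<>();
            for (int v : c) {
                row.add(toBinary(v, bits));
            }
            passes.add(row);
        }
        return passes;
    }

    /** Formats v as a fixed-width binary string, most significant bit first. */
    static String toBinary(int v, int bits) {
        StringBuilder sb = new StringBuilder();
        for (int i = bits - 1; i >= 0; i--) {
            sb.append((v >> i) & 1);
        }
        return sb.toString();
    }
}
```

Calling trace on 1001, 0010, 1101, 0001, 1110 with bits = 4 returns four rows, the last of which is the fully sorted list.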

Page 28: Lecture 26: BUCKET SORT & RADIX Sort

Radix-Sort

Algorithm radixSort(Sequence<Integer> C)
  // Works from least to most significant value
  for bit = 0 to 30
    C = bucketSort(C, bit)   // Sort C using the specified bit
  endfor
  return C

What is big-Oh complexity for Radix-Sort?
  Call in loop uses each element twice: O(n)
  Loop repeats once per digit to complete sort: * O(1)
  Total: O(n)

Page 29: Lecture 26: BUCKET SORT & RADIX Sort

Radix-Sort

What is big-Oh complexity for Radix-Sort?
  Call in loop uses each element twice: O(n)
  Loop repeats once per digit to complete sort: * O(1), or O(log n) times (?)
  Total: O(n log n) if the number of digits grows as log n
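The radixSort pseudocode on these slides could be realized in Java as sketched here. It assumes non-negative ints, so bits 0 to 30 cover every value, and it uses java.util.List in place of Sequence. The two-bucket bucketSort(c, bit) helper is an assumed shape, since the slides do not show that overload.

```java
import java.util.ArrayList;
import java.util.List;

final class BinaryRadixSortSketch {
    /** Stable two-bucket sort of c on the given bit; returns a new list. */
    static List<Integer> bucketSort(List<Integer> c, int bit) {
        List<Integer> zeros = new ArrayList<>();
        List<Integer> ones = new ArrayList<>();
        for (int v : c) {
            if (((v >> bit) & 1) == 0) {
                zeros.add(v);
            } else {
                ones.add(v);
            }
        }
        zeros.addAll(ones);  // bucket 0 first, then bucket 1
        return zeros;
    }

    /** Works from least to most significant bit; assumes non-negative values. */
    static List<Integer> radixSort(List<Integer> c) {
        for (int bit = 0; bit <= 30; bit++) {
            c = bucketSort(c, bit);
        }
        return c;
    }
}
```

The loop always runs 31 times regardless of n, which is the O(1) factor in the O(n) answer. If the number of bits had to grow with the number of distinct values, it would be about log n, which gives the O(n log n) answer.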

Page 30: Lecture 26: BUCKET SORT & RADIX Sort

For Next Lecture

Start thinking about test cases for program #2
  Wed. is next deadline when these must be submitted
  Spend time on this: tests & design save coding
Tuesday deadline for weekly assignment
For Wednesday, review index files, Set & sorts
  Quiz will be like others this term with mix of problems