1 sorting. for computer, sorting is the process of ordering data. [ 1 9 8 3 2 ] [ 1 2 3 8 9 ] [...


Upload: thomasine-hines

Post on 18-Dec-2015


Page 1: 1 Sorting. For computer, sorting is the process of ordering data. [ 1 9 8 3 2 ]  [ 1 2 3 8 9 ] [ “Tom”, “Michael”, “Betty” ]  [ “Betty”, “Michael”,


Sorting


Sorting

For a computer, sorting is the process of ordering data.

[ 1 9 8 3 2 ] → [ 1 2 3 8 9 ]

[ “Tom”, “Michael”, “Betty” ] → [ “Betty”, “Michael”, “Tom” ]

Sorting has several applications.

Efficient binary search

Finding the min, the max, or the median

Efficient neighborhood operations
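
As a small illustration of the first two applications, the sketch below sorts a list once and then reads the min, max, and median by position and runs a binary search. The helper name `contains` is hypothetical, and the standard-library `bisect` module is used for the search.

```python
import bisect

data = [1, 9, 8, 3, 2]
ordered = sorted(data)                 # [1, 2, 3, 8, 9]

# Once the data is sorted, min, max, and median sit at fixed positions.
smallest, largest = ordered[0], ordered[-1]
median = ordered[len(ordered) // 2]    # middle element (the length is odd here)

def contains(sorted_list, target):
    """Binary search: return True if target occurs in sorted_list."""
    i = bisect.bisect_left(sorted_list, target)
    return i < len(sorted_list) and sorted_list[i] == target

print(smallest, largest, median)       # 1 9 3
print(contains(ordered, 8))            # True
```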


Sorting

Many sorting algorithms are available.

A recommended web page: http://sorting.at

This class reviews Selection, Insertion, Merge (and Quick) Sort.

On average, Quick sort is usually the fastest of these algorithms.

However, each algorithm has its own advantages.

In practice, knowing the built-in sorted() with a comparison function is usually enough (cmp in Python 2; key or functools.cmp_to_key in Python 3).


Basic Operation

Sorting is built on two basic operations: comparison and swap.

A comparison function must be given.

The swap operation exchanges two elements:

( L[i], L[j] ) = ( L[j], L[i] )
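
A minimal sketch of the two basic operations follows. The function names `less_than` and `swap` are hypothetical and chosen only for illustration; the swap itself uses the tuple assignment shown above.

```python
def less_than(a, b):
    """Hypothetical comparison function: True if a should come before b."""
    return a < b

def swap(L, i, j):
    """Exchange the elements at positions i and j of list L in place."""
    L[i], L[j] = L[j], L[i]

L = [1, 9, 8, 3, 2]
if less_than(L[4], L[1]):   # 2 < 9, so this pair is out of order
    swap(L, 1, 4)
print(L)                    # [1, 2, 8, 3, 9]
```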


Selection Sort


Selection Sort

Finding the 1st minimum, the 2nd minimum, …

Successive application of find_min() function

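def find_min(L):
    # Return (index, value) of the smallest element, or None if L is empty.
    if not L:
        return None
    (min_idx, min_value) = (0, L[0])
    # start=1 keeps j aligned with positions in L, not in the slice L[1:]
    for j, x in enumerate(L[1:], start=1):
        if x < min_value:
            (min_idx, min_value) = (j, x)
    return (min_idx, min_value)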
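
One possible way to turn the successive application of find_min() into a complete sort is sketched below. It assumes a find_min() that returns an (index, value) pair, and it builds a new list rather than sorting in place; `selection_sort` is a name introduced here for illustration.

```python
def find_min(L):
    """Return (index, value) of the smallest element of a non-empty list."""
    min_idx, min_value = 0, L[0]
    for j, x in enumerate(L[1:], start=1):
        if x < min_value:
            min_idx, min_value = j, x
    return min_idx, min_value

def selection_sort(L):
    """Return a new sorted list; the input list is left unchanged."""
    remaining = list(L)          # work on a copy
    result = []
    while remaining:
        idx, value = find_min(remaining)
        result.append(value)     # the next minimum goes to the end of the output
        del remaining[idx]       # remove the element just selected
    return result

print(selection_sort([1, 9, 8, 3, 5, 7]))  # [1, 3, 5, 7, 8, 9]
```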


Selection Sort

[ 1 9 8 3 5 7 ]

[ 1 ]

[ 1 3 ]

[ 1 3 5 ]

[ 1 3 5 7 ]

[ 1 3 5 7 8 ]

[ 1 3 5 7 8 9 ]


Selection Sort

Running Time Analysis

The first search takes at most n comparisons.

The second search takes at most (n-1) comparisons.

…

The last search takes 1 comparison.

In total, n + (n-1) + … + 1 = n(n+1)/2 comparisons, which is O(n^2).


Selection Sort

The worst case

[ 5 4 3 2 1 ]

The best case

[ 1 2 3 4 5 ]

If the sort is interrupted partway through, the output built so far is a correctly ordered prefix of the final result.


Selection Sort

Stability

[ (1, 0), (5, 1), (3, 2), (1, 3), (7, 4) ]

[ (1, 0), (1, 3), (3, 2), (5, 1), (7, 4) ]

Selection sort is stable when it builds a new list, because elements with the same key keep their original relative order.

However, some implementations (for example, in-place versions that swap elements) may not be stable.
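
The difference can be made concrete with a small sketch. Both functions below are illustrative assumptions, not the class's reference code: records are (key, original position) pairs, and only the key (the first field) is compared.

```python
def selection_sort_new_list(records):
    """Stable variant: move the first minimum (by key) into a new list."""
    remaining = list(records)
    result = []
    while remaining:
        idx = 0
        for j in range(1, len(remaining)):
            # Strict < means that, among equal keys, the earlier record wins.
            if remaining[j][0] < remaining[idx][0]:
                idx = j
        result.append(remaining.pop(idx))
    return result

def selection_sort_swap(records):
    """Swap-based variant: a swap can move a record past an equal key."""
    L = list(records)
    for i in range(len(L)):
        idx = i
        for j in range(i + 1, len(L)):
            if L[j][0] < L[idx][0]:
                idx = j
        L[i], L[idx] = L[idx], L[i]
    return L

data = [(5, 0), (5, 1), (1, 2)]
print(selection_sort_new_list(data))  # [(1, 2), (5, 0), (5, 1)]
print(selection_sort_swap(data))      # [(1, 2), (5, 1), (5, 0)]
```

In the second result the two records with key 5 have traded places, which is exactly what a stable sort must not do.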


Insertion Sort


Insertion Sort

Most people sort by insertion intuitively.

1. Process each element in turn, starting from the leftmost.

2. Assume that the left sub-list L[0:k] is already ordered when the (k+1)-th element, L[k], is processed.

3. Find the last element x in L[0:k] that is not greater than L[k].

4. Insert L[k] right after x (or at the front if there is no such x).


Insertion Sort

Inserting an element into a list can be implemented in different ways.

A Python list has a method insert(pos, elem), which inserts the element elem before L[pos].

L = [1,2,3,4,5]

L.insert(0,0) → L = [0,1,2,3,4,5]

L.insert(3,10) → L = [0,1,2,10,3,4,5]
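
As a sketch of how insert() could drive a whole insertion sort, the function below builds a new sorted list one element at a time. The name `insertion_sort` and the right-to-left scan are choices made for this illustration.

```python
def insertion_sort(L):
    """Return a new sorted list built by repeated list.insert() calls."""
    result = []
    for x in L:
        pos = len(result)
        # Scan right to left past every element that is greater than x.
        while pos > 0 and result[pos - 1] > x:
            pos -= 1
        result.insert(pos, x)    # insert x before result[pos]
    return result

print(insertion_sort([6, 10, 24, 36, 12]))  # [6, 10, 12, 24, 36]
```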


Insertion Sort

To insert 12, we need to make room for it by moving first 36 and then 24.

[Figure, four slides: 12 is inserted into the sorted list 6 10 24 36. First 36 and then 24 move one position to the right, and 12 is placed in the gap, giving 6 10 12 24 36.]
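
The "make room by moving elements" idea can also be written without insert(), by shifting elements inside the list itself. This is a minimal in-place sketch; the name `insertion_sort_in_place` is hypothetical.

```python
def insertion_sort_in_place(L):
    """Sort list L in place by shifting larger elements one step right."""
    for k in range(1, len(L)):
        x = L[k]                 # element to insert; L[0:k] is already sorted
        j = k
        while j > 0 and L[j - 1] > x:
            L[j] = L[j - 1]      # shift the larger element right to make room
            j -= 1
        L[j] = x                 # drop x into the gap
    return L

print(insertion_sort_in_place([6, 10, 24, 36, 12]))  # [6, 10, 12, 24, 36]
```

For the input above, only 36 and 24 are shifted before 12 is placed.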


Insertion Sort

Analysis using Mathematical Induction

Assume that L[0:k-1] is already sorted.

For inserting the k-th element into L[0:k-1], at most k comparisons are required.

For inserting the (k+1)-th element into L[0:k], at most (k+1) comparisons are required.

…

For inserting the n-th element into L[0:n-1], at most n comparisons are required.


Insertion Sort

The worst case: O(n^2)

[ 5 4 3 2 1 ]

The average case: O(n^2) Why?

The best case: O(n) Why?

[ 1 2 3 4 5 ]
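
One way to answer the two "Why?" questions is to count comparisons. The derivation below is a sketch under a stated assumption: the element at index k is compared against the sorted prefix L[0:k] from right to left, stopping at the first element that is not greater.

```latex
% Assumption: inserting the element at index k scans L[0:k] from right to left
% and stops at the first element that is not greater than it.
\begin{align*}
\text{worst case (reversed input):} \quad & \sum_{k=1}^{n-1} k = \frac{n(n-1)}{2} = O(n^2) \\
\text{best case (sorted input):} \quad & \sum_{k=1}^{n-1} 1 = n-1 = O(n) \\
\text{average case (random order):} \quad & \approx \sum_{k=1}^{n-1} \frac{k}{2} = \frac{n(n-1)}{4} = O(n^2)
\end{align*}
```

In the best case each element needs a single comparison with its left neighbor; on average an element travels about halfway through the sorted prefix, which halves the constant but not the order of growth.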


Merge Sort


Linear vs. Binary Search

Essentially, insertion and selection sort are based on linear scans: each step walks through the list one element at a time.

By employing the idea behind binary search, a more efficient sorting algorithm is possible.

Binary search works by dividing a problem into smaller problems.

The search range is halved at every step.


Merge Sort

Merging two ordered lists:

L1 = [ 1 18 88 94 99 ]

L2 = [ 7 9 22 24 92 ]

O = [ ]

Merging two ordered lists with m and n elements, respectively, into a single ordered list requires at most m+n comparisons.

Result: O = [ 1 7 9 18 22 24 88 92 94 99 ]
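
The merging step can be sketched as follows. The function name `merge` and the two-index scan are assumptions made for illustration; each loop iteration performs one comparison and moves one element to the output.

```python
def merge(a, b):
    """Merge two ordered lists into a new ordered list."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:         # <= takes from a on ties, which keeps the merge stable
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])            # at most one of these two tails is non-empty
    out.extend(b[j:])
    return out

L1 = [1, 18, 88, 94, 99]
L2 = [7, 9, 22, 24, 92]
print(merge(L1, L2))  # [1, 7, 9, 18, 22, 24, 88, 92, 94, 99]
```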


Merge Sort

The idea is to split a given array into two halves repeatedly until one or two elements remain.

Then merge the ordered lists successively.

Splitting gives O(log n) levels, and the merging at each level takes O(n).

The total running time is at most O(n log n).
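
A compact top-down sketch of the whole algorithm is shown below. To keep it short, the merge step is delegated to the standard-library `heapq.merge`, which merges already-sorted inputs; this is a stand-in chosen for brevity, and the sketch splits down to single elements rather than stopping at two.

```python
import heapq

def merge_sort(L):
    """Return a new sorted list using top-down merge sort."""
    if len(L) <= 1:
        return list(L)           # zero or one element is already sorted
    mid = len(L) // 2
    left = merge_sort(L[:mid])
    right = merge_sort(L[mid:])
    # heapq.merge lazily merges the two sorted halves; list() collects the result.
    return list(heapq.merge(left, right))

print(merge_sort([1, 94, 88, 18, 99, 9, 7, 92, 22, 24]))
# [1, 7, 9, 18, 22, 24, 88, 92, 94, 99]
```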


Merge Sort

Sorting deals with the entire set of elements.

Splitting (top to bottom):

[ 1 94 88 18 99 9 7 92 22 24 ]

[ 1 94 88 18 99 ] [ 9 7 92 22 24 ]

[ 1 94 88 ] [ 18 99 ] [ 9 7 92 ] [ 22 24 ]

[ 1 94 ] [ 88 ] [ 18 99 ] [ 7 9 ] [ 92 ] [ 22 24 ]


Merge Sort

Sorting deals with the entire set of elements.

Merging (bottom to top):

[ 1 7 9 18 22 24 88 92 94 99 ]

[ 1 18 88 94 99 ] [ 7 9 22 24 92 ]

[ 1 88 94 ] [ 18 99 ] [ 7 9 92 ] [ 22 24 ]

[ 1 94 ] [ 88 ] [ 18 99 ] [ 7 9 ] [ 92 ] [ 22 24 ]


Merge Sort

Merge sort always runs in O(n log n)

Merge sort requires extra memory to merge two ordered lists.

Efficient in-place merging algorithms also exist.


Quick Sort


Quick Sort

One of the most widely used sorting algorithms

In-place and O(n log n) on average, but O(n^2) in the worst case

Like Merge sort, Quick sort splits the list into two parts, but it does so by a partitioning process, so the parts are not necessarily equal halves


Quick Sort

Partitioning

1. Choose a pivot element to split the list into two.

2. Move all elements smaller than the pivot to the left of the pivot.

3. Move all elements greater than the pivot to the right of the pivot.

4. No ordering is guaranteed within the left and the right sub-lists.

5. Repeat this partitioning process for the left and the right sub-lists.

[ x for x in L if x < pivot ] + [ x for x in L if x == pivot ] + [ x for x in L if x > pivot ]
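
The partitioning steps above can be sketched as a short recursive function. This version builds new lists with list comprehensions, so it is not in-place; choosing the last element as the pivot is an assumption made here for illustration.

```python
def quick_sort(L):
    """Return a new sorted list; the last element is used as the pivot."""
    if len(L) <= 1:
        return list(L)
    pivot = L[-1]
    smaller = [x for x in L if x < pivot]
    equal = [x for x in L if x == pivot]      # keeps duplicates of the pivot
    greater = [x for x in L if x > pivot]
    # The two sides are unordered after partitioning, so sort each recursively.
    return quick_sort(smaller) + equal + quick_sort(greater)

print(quick_sort([1, 94, 88, 18, 99, 9, 7, 92, 22, 24]))
# [1, 7, 9, 18, 22, 24, 88, 92, 94, 99]
```

Collecting the elements equal to the pivot in their own list ensures that repeated values are not lost.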


Quick Sort

[ 1 94 88 18 99 9 7 92 22 24 ] Choose 24 as pivot

[ 1 18 9 7 22 ] + [ 24 ] + [ 94 88 99 92 ] Repeat

[ 1 18 9 7 ] + [ 22 ] + [ ] + [ 24 ] + [ 88 ] + [ 92 ] + [ 94 99 ]

[ 1 ] + [ 7 ] + [ 9 18 ] + [ 22 24 88 92 ] + [ 94 99 ]

[ 1 7 9 18 22 24 88 92 94 99 ]


Quick Sort

Worst Case Analysis

Partitioning can be done in linear time:

n comparisons are required.

The worst case is when the pivot is always the largest (or smallest) element.

Why?

The next segment then has n-1 elements,

requiring n-1 comparisons for partitioning.

O(n^2) in the worst case. Why?
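
A short derivation for the worst case, using the slide's count of n comparisons per partitioning step. The symbol C(n), the number of comparisons for n elements, is introduced here for illustration.

```latex
% Assumption: each partitioning of k elements costs k comparisons and leaves
% one segment of k-1 elements (the other segment is empty).
\begin{align*}
C(n) &= C(n-1) + n, \qquad C(0) = 0 \\
     &= n + (n-1) + \dots + 1 \\
     &= \frac{n(n+1)}{2} = O(n^2)
\end{align*}
```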


Quick Sort

Best Case Analysis

Partitioning can be done in linear time:

n comparisons are required.

The best case is when the pivot is always the median.

Why?

The next step then has two segments of about n/2 elements each.

O(n log n) in the best case:

about log n levels of partitioning, times n comparisons per level.
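
The same bound can be written as a recurrence. The symbol C(n) is introduced here for illustration, and n is assumed to be a power of two so that every split is exact.

```latex
% Assumption: n is a power of two and every pivot is the median, so each
% partitioning of k elements costs k comparisons and yields two halves.
\begin{align*}
C(n) &= 2\,C(n/2) + n, \qquad C(1) = 0 \\
     &= 4\,C(n/4) + 2n \\
     &= \dots \\
     &= n\,C(1) + n \log_2 n \\
     &= n \log_2 n = O(n \log n)
\end{align*}
```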


Python Sort Function


Built-in Function

The built-in sorted() function returns a new list in ascending order.

By default, it compares elements with the '<' operator.

A custom ordering is specified through the key function parameter.

sorted([(3, "Tom"), (0, "Jack")], key=lambda x: x[0])

sorted([1, 2, 3], reverse=True)
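
The calls above can be tried out as follows. The extra record (3, "Betty") and the cmp-style function `by_name_length` are hypothetical additions for illustration; `functools.cmp_to_key` is the standard-library adapter for old-style comparison functions in Python 3.

```python
from functools import cmp_to_key

people = [(3, "Tom"), (0, "Jack"), (3, "Betty")]

# Sort by the number only; records with equal numbers keep their input order.
by_number = sorted(people, key=lambda x: x[0])
print(by_number)    # [(0, 'Jack'), (3, 'Tom'), (3, 'Betty')]

descending = sorted([1, 2, 3], reverse=True)
print(descending)   # [3, 2, 1]

def by_name_length(a, b):
    """Hypothetical cmp-style function: negative if a sorts before b."""
    return len(a[1]) - len(b[1])

by_length = sorted(people, key=cmp_to_key(by_name_length))
print(by_length)    # [(3, 'Tom'), (0, 'Jack'), (3, 'Betty')]
```

Because sorted() is stable, "Tom" stays ahead of "Betty" in the first result even though both have the same number.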