unit-1-daa

8/6/2019 UNIT-1-DAA


Lecture by Sansar Singh Chauhan, Department of Computer Science & Engg. (GCET)

Chapter 1. The Role of Algorithms in Computing


Algorithms

Informally, an algorithm is any well-defined computational procedure that takes input and produces output.

A sequence of computational steps that transforms the input into the output.

A tool for solving a well-specified computational problem.

The algorithm describes a specific computational procedure for achieving that input/output relationship.

E.g., sorting problem → sorting algorithm.

An instance of a problem (input) vs. a solution to the problem (output).


Correctness of Algorithm

An algorithm is said to be correct if it halts with the correct output for every input instance. A correct algorithm solves the given problem. An incorrect algorithm might not halt at all, or it might halt with an answer other than the desired one.

Applications: The Human Genome Project

Identifying all the genes in human DNA, determining the sequences of the 3 billion chemical base pairs, storing this information in databases, and developing tools for data analysis.

Management and manipulation of the large volume of Internet data:

Finding good routes uses graph algorithms for shortest paths.

Using a search engine to quickly find the pages that hold particular information uses hash tables and string-matching algorithms.


How to solve the problems?

Data Structures: A data structure is a way to store and organize data in order to facilitate access and modifications.

Efficient Algorithms:

P problems: problems that can be solved efficiently, i.e. in polynomial time.

NP-complete problems: it is unknown whether efficient algorithms exist for NP-complete problems. If an efficient algorithm exists for any one of them, then efficient algorithms exist for all of them.

Several NP-complete problems are similar to problems for which we do know of efficient algorithms.

Chapter 2. Getting Started


Insertion Sort


Correctness

We often use a loop invariant:

Loop invariant: At the start of each iteration of the outer for loop (the loop indexed by j), the subarray A[1 .. j-1] consists of the elements originally in A[1 .. j-1], but in sorted order.

To use a loop invariant to prove correctness, show three things about it:

Initialization: It is true prior to the first iteration of the loop.

Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.

Termination: When the loop terminates, the invariant (usually along with the reason that the loop terminated) gives us a useful property that helps show that the algorithm is correct.


Initialization

The loop invariant (LI) must be true prior to the first iteration of the loop, i.e. when j = 2. Then A[1 .. j-1] is A[1 .. 1], which is the single element A[1].

1. A[1] is the original element in A[1].

2. A[1] is trivially sorted.

INSERTION-SORT(A)
1. for j ← 2 to length[A]
2.     do key ← A[j]
3.        ▷ Insert A[j] into the sorted sequence A[1 .. j-1]
4.        i ← j-1
5.        while i > 0 and A[i] > key
6.            do A[i+1] ← A[i]
7.               i ← i-1
8.        A[i+1] ← key


Maintenance

If the LI is true before an iteration of the loop, it must remain true before the next iteration.

In the body of the outer loop, the inner while loop keeps moving elements one position to the right until it finds the correct place for A[j] (held in key). We then insert key there, so A[1 .. j] is sorted; once j is incremented, A[1 .. j-1] is again sorted before the next iteration. Thus, the LI is maintained from iteration to iteration.


Termination

When the loop terminates, the LI gives us a useful property that helps show that the algorithm is correct.

What happens at the end? The outer loop exits when the test at line 1 fails, i.e. when j = n+1, where n = length[A]. Substituting n+1 for j in the LI, the subarray A[1 .. n] consists of the elements originally in A[1 .. n], but in sorted order. A[1 .. n] is the entire array, so the whole array is sorted and the algorithm is correct.
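The insertion-sort pseudocode and its loop invariant can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: it uses 0-based indices (the slides use 1-based), and the name `insertion_sort` and the `assert` that checks the invariant on every outer iteration are assumptions added for demonstration.

```python
def insertion_sort(a):
    """Sort list a in place; 0-based version of the 1-based pseudocode."""
    original = list(a)  # kept only so the invariant can be checked
    for j in range(1, len(a)):
        # Loop invariant: a[0..j-1] holds the original first j elements, sorted.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        # Shift elements greater than key one position to the right.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```

Calling `insertion_sort([5, 2, 4, 6, 1, 3])` sorts the list in place; the invariant check costs extra time and would be removed in real use.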



Analysis of Insertion sort



Insertion-Sort Running Time

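$$T(n) = c_1[(n-1)+1] + c_2(n-1) + c_3(n-1) + c_4(n-1) + c_5\sum_{j=2}^{n} t_j + c_6\sum_{j=2}^{n}(t_j-1) + c_7\sum_{j=2}^{n}(t_j-1) + c_8(n-1)$$

Here $c_k$ is the cost of line $k$ of the pseudocode and $t_j$ is the number of times the while-loop test on line 5 is executed for that value of $j$.

$c_3 = 0$, of course, since line 3 is a comment.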



Worst Case T(n)

Occurs when the loop of lines 5-7 is executed as many times as possible, which is when A[] is in reverse sorted order.

key is A[j] from line 2.

i starts at j-1 from line 4.

i goes down to 0 due to line 7.

So, $t_j$ in lines 5-7 is $[(j-1) - 0] + 1 = j$. The 1 at the end is due to the test that fails, causing exit from the loop.

Worst Case T(n), ctd.

$$T(n) = c_1[(n-1)+1] + c_2(n-1) + c_4(n-1) + c_5\sum_{j=2}^{n} j + c_6\sum_{j=2}^{n}(j-1) + c_7\sum_{j=2}^{n}(j-1) + c_8(n-1)$$

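$$T(n) = c_1 n + c_2(n-1) + c_4(n-1) + c_8(n-1) + c_5\sum_{j=2}^{n} j + c_6\sum_{j=2}^{n}(j-1) + c_7\sum_{j=2}^{n}(j-1)$$

$$= c_9 n + c_{10} + c_5\sum_{j=2}^{n} j + c_{11}\sum_{j=2}^{n}(j-1)$$

where $c_9 = c_1 + c_2 + c_4 + c_8$, $c_{10} = -(c_2 + c_4 + c_8)$, and $c_{11} = c_6 + c_7$.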

Worst Case T(n), ctd.

$$T(n) = c_9 n + c_{10} + c_5\sum_{j=2}^{n} j + c_{11}\sum_{j=2}^{n}(j-1)$$

But $\sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1$, so that

$$\sum_{j=2}^{n}(j-1) = \sum_{j=2}^{n} j - \sum_{j=2}^{n} 1 = \left[\frac{n(n+1)}{2} - 1\right] - (n-2+1) = \frac{n(n+1)}{2} - 1 - n + 1 = \frac{n(n+1)}{2} - n = \frac{n(n+1) - 2n}{2} = \frac{n(n+1-2)}{2} = \frac{n(n-1)}{2}$$

In conclusion,

$$T(n) = c_9 n + c_{10} + c_5\left[\frac{n(n+1)}{2} - 1\right] + c_{11}\frac{n(n-1)}{2} = c_{12} n^2 + c_{13} n + c_{14} = f_1(n^2) + f_2(n^1) + f_3(n^0)$$

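Best Case

If the array is already sorted, the statements in the while loop body never execute, since A[i] ≤ key each time the test on line 5 is made. The test is executed exactly once per outer iteration, so $t_j = 1$.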

Best Case Result

$$T(n) = c_1 n + (c_2 + c_4)(n-1) + c_5(n-1) + c_8(n-1)$$

$$= n(c_1 + c_2 + c_4 + c_5 + c_8) + (-c_2 - c_4 - c_5 - c_8)$$

$$= c_9 n + c_{10} = f_1(n^1) + f_2(n^0)$$

(The constants $c_9$ and $c_{10}$ are redefined here for the best case.)



Analyzing algorithms

How do we analyze an algorithm's running time?

The time taken by an algorithm depends on the input.

Input size: depends on the problem being studied.

Running time: on a particular input, it is the number of primitive operations (steps) executed.

Analysis of insertion sort

The running time of the algorithm is:

$$\sum_{\text{all statements}} (\text{cost of statement}) \times (\text{number of times statement is executed})$$

$t_j$ = number of times that the while-loop test is executed for that value of j.

Best case: the array is already sorted (all $t_j = 1$).

Worst case: the array is in reverse order ($t_j = j$). The worst-case running time gives a guaranteed upper bound on the running time for any input.

Average case: on average, the key in A[j] is less than half the elements in A[1 .. j-1] and greater than the other half ($t_j \approx j/2$).
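The best-case and worst-case values of $t_j$ can be observed directly by instrumenting insertion sort. The sketch below is an illustration under assumptions: the function name `while_test_counts` is hypothetical, indices are 0-based, and the returned list holds $t_2, \ldots, t_n$ in order.

```python
def while_test_counts(a):
    """Return [t_2, ..., t_n]: how often the while-loop test runs per outer iteration."""
    a = list(a)  # work on a copy so the caller's list is untouched
    counts = []
    for j in range(1, len(a)):
        key, i, t = a[j], j - 1, 0
        while True:
            t += 1  # one evaluation of the loop test (including the failing one)
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
        counts.append(t)
    return counts
```

For a sorted input every count is 1, and for a reverse-sorted input the counts are 2, 3, ..., n, matching $t_j = 1$ and $t_j = j$.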


Order of Growth

An abstraction to ease analysis and focus on the important features.

Look only at the leading term of the formula for running time.

Drop lower-order terms.

Ignore the constant coefficient in the leading term.

Example: $an^2 + bn + c = \Theta(n^2)$. Drop the lower-order terms to get $an^2$; ignore the constant coefficient to get $n^2$.

The worst-case running time T(n) grows like $n^2$; it does not equal $n^2$. We say that the running time is $\Theta(n^2)$ to capture the notion that the order of growth is $n^2$.

We consider one algorithm to be more efficient than another if its worst-case running time has a smaller order of growth.



Designing algorithms

Divide and Conquer

Divide the problem into a number of subproblems.

Conquer the subproblems by solving them recursively.

Base case: If the subproblems are small enough, just solve them directly.

Combine the subproblem solutions to give a solution to the original problem.

Cf. the incremental method used by insertion sort.



Merge Sort

A sorting algorithm based on divide and conquer.

Its worst-case running time has a lower order of growth than that of insertion sort.

To sort A[p .. r]:

Divide by splitting into two subarrays A[p .. q] and A[q+1 .. r], where q is the halfway point of A[p .. r].

Conquer by recursively sorting the two subarrays A[p .. q] and A[q+1 .. r].

Combine by merging the two sorted subarrays A[p .. q] and A[q+1 .. r] to produce a single sorted subarray A[p .. r]. To accomplish this step, we'll define a procedure MERGE(A, p, q, r).



Initial call: MERGE-SORT(A, 1, n)
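The MERGE and MERGE-SORT procedures were shown as figures that did not survive extraction, so the following Python is only a sketch of the divide, conquer, and combine steps described above. It assumes 0-based inclusive indices and copies the two halves with slices instead of using sentinels; the names `merge` and `merge_sort` mirror the slide's procedure names but the bodies are illustrative.

```python
def merge(a, p, q, r):
    """Merge sorted a[p..q] and a[q+1..r] (inclusive, 0-based) back into a[p..r]."""
    left, right = a[p:q + 1], a[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from left when right is exhausted or left's head is not larger.
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p=0, r=None):
    """Sort a[p..r] in place; the initial call covers the whole list."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = (p + r) // 2          # divide: halfway point
        merge_sort(a, p, q)       # conquer left half
        merge_sort(a, q + 1, r)   # conquer right half
        merge(a, p, q, r)         # combine
    return a
```

The call `merge_sort(a)` plays the role of the initial call MERGE-SORT(A, 1, n).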



Analyzing Divide-and-Conquer Algorithms

Use a recurrence (equation) to describe the running time of a divide-and-conquer algorithm.

Let T(n) = running time on a problem of size n.

If the problem size is small enough (say, $n \le c$ for some constant c), we have a base case, which takes constant time: $\Theta(1)$.

Otherwise, suppose that we divide into a subproblems, each 1/b the size of the original. (In merge sort, a = b = 2.)

Let D(n) be the time to divide a size-n problem.

There are a subproblems to solve, each of size n/b; each subproblem takes T(n/b) time to solve, so we spend aT(n/b) time solving subproblems.

Let C(n) be the time to combine solutions.

We get the recurrence:

$$T(n) = \begin{cases} \Theta(1) & \text{if } n \le c, \\ aT(n/b) + D(n) + C(n) & \text{otherwise.} \end{cases}$$

Analyzing Merge Sort

Use a recurrence. For simplicity, assume that n is a power of 2.

The base case: when n = 1, $T(n) = \Theta(1)$.

When $n \ge 2$, time for merge sort steps:

Divide: Just compute q as the average of p and r, so $D(n) = \Theta(1)$.

Conquer: Recursively solve 2 subproblems, each of size n/2, contributing $2T(n/2)$.

Combine: MERGE on an n-element subarray takes $\Theta(n)$ time, so $C(n) = \Theta(n)$.

Since $D(n) + C(n) = \Theta(1) + \Theta(n) = \Theta(n)$, the recurrence for merge sort running time is:

$$T(n) = \begin{cases} \Theta(1) & \text{if } n = 1, \\ 2T(n/2) + \Theta(n) & \text{if } n > 1. \end{cases}$$

Solving the merge-sort recurrence: $T(n) = \Theta(n \log_2 n)$.

Let c be a constant that describes both the running time for the base case and the time per array element for the divide and combine steps. Rewrite the recurrence as

$$T(n) = \begin{cases} c & \text{if } n = 1, \\ 2T(n/2) + cn & \text{if } n > 1. \end{cases}$$

Draw a recursion tree, which shows successive expansions of the recurrence.
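Since the recursion-tree figure is missing here, a small numerical check may help. The sketch below evaluates the rewritten recurrence directly and compares it with the total the tree gives when n is a power of 2: $\lg n + 1$ levels, each costing $cn$, i.e. $cn \lg n + cn$. The function name `merge_sort_cost` is a hypothetical helper, and taking c = 1 by default is an assumption.

```python
def merge_sort_cost(n, c=1):
    """Evaluate T(n) = c if n == 1, else 2*T(n/2) + c*n, for n a power of 2."""
    return c if n == 1 else 2 * merge_sort_cost(n // 2, c) + c * n


# Compare with the recursion-tree total c*n*lg(n) + c*n for n = 2^k.
for k in range(11):
    n = 2 ** k
    closed_form = n * k + n  # c = 1, lg n = k
    assert merge_sort_cost(n) == closed_form
```

Dropping the lower-order term $cn$ and the constant c leaves $\Theta(n \lg n)$.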



BubbleSort:

Chapter 3. Growth of Functions



Growth of functions

A way to describe the behavior of functions in the limit: asymptotic efficiency.

Focus on what's important by abstracting away low-order terms and constant factors.

How do we indicate the running times of algorithms?

A way to compare "sizes" of functions:

O ≈ ≤, Ω ≈ ≥, Θ ≈ =, o ≈ <, ω ≈ >


Asymptotic Notation

O-notation

$O(g(n)) = \{ f(n) :$ there exist positive constants $c$ and $n_0$ such that $0 \le f(n) \le c\,g(n)$ for all $n \ge n_0 \}$.

g(n) is an asymptotic upper bound for f(n).

Example: $2n^2 = O(n^3)$, with c = 1 and $n_0 = 2$. Also, $2n^2 = O(n^2)$, with c = 2 and $n_0 = 1$.

Examples of functions in $O(n^2)$: $n^2$, $n^2 + n$, $n^2 + 1000n$, $1000n^2 + 1000n$. Also, $n$, $n/1000$, $n^{1.9999}$, $n^2/\lg\lg\lg n$.

Ω-notation

$\Omega(g(n)) = \{ f(n) :$ there exist positive constants $c$ and $n_0$ such that $0 \le c\,g(n) \le f(n)$ for all $n \ge n_0 \}$.

g(n) is an asymptotic lower bound for f(n).

Example: $\sqrt{n} = \Omega(\lg n)$, with c = 1 and $n_0 = 16$.

Examples of functions in $\Omega(n^2)$: $n^2$, $n^2 + n$, $n^2 - n$, $1000n^2 + 1000n$, $1000n^2 - 1000n$. Also, $n^3$, $n^{2.0001}$, $n^2 \lg\lg\lg n$.

Θ-notation

$\Theta(g(n)) = \{ f(n) :$ there exist positive constants $c_1$, $c_2$, and $n_0$ such that $0 \le c_1 g(n) \le f(n) \le c_2 g(n)$ for all $n \ge n_0 \}$.

g(n) is an asymptotically tight bound for f(n).

Example: $n^2/2 - 2n = \Theta(n^2)$, with $c_1 = 1/4$, $c_2 = 1/2$, and $n_0 = 8$. Also, $2n^2 = \Theta(n^2)$, with $c_1 = 1$, $c_2 = 3$ (or $c_1 = c_2 = 2$) and $n_0 = 1$.

Theorem: $f(n) = \Theta(g(n))$ if and only if $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$.
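The witness constants in the worked examples can be spot-checked numerically. The sketch below reads the examples as $2n^2 = O(n^3)$, $\sqrt{n} = \Omega(\lg n)$, and $n^2/2 - 2n = \Theta(n^2)$ (the exponents are reconstructed from the slides, which is an assumption). The helper `holds_from` is hypothetical, and checking a finite range only illustrates the definitions; it does not prove the bounds.

```python
from math import log2, sqrt


def holds_from(pred, n0, upto=2000):
    """True if pred(n) holds for every integer n in [n0, upto]."""
    return all(pred(n) for n in range(n0, upto + 1))


# O-notation example: 2n^2 <= 1 * n^3 for all n >= 2.
big_o_ok = holds_from(lambda n: 2 * n**2 <= n**3, 2)
# Omega-notation example: 1 * lg n <= sqrt(n) for all n >= 16.
omega_ok = holds_from(lambda n: log2(n) <= sqrt(n), 16)
# Theta-notation example: n^2/4 <= n^2/2 - 2n <= n^2/2 for all n >= 8.
theta_ok = holds_from(lambda n: n**2 / 4 <= n**2 / 2 - 2 * n <= n**2 / 2, 8)
# The lower bound fails at n = 7, so n0 = 8 is the smallest choice for c1 = 1/4.
fails_at_7 = not (7**2 / 4 <= 7**2 / 2 - 2 * 7)
```

All four flags come out true over the tested range.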

o-notation

$o(g(n)) = \{ f(n) :$ for all constants $c > 0$, there exists a constant $n_0 > 0$ such that $0 \le f(n) < c\,g(n)$ for all $n \ge n_0 \}$.

g(n) is an asymptotic strict upper bound for f(n).

Another view: $\lim_{n \to \infty} f(n)/g(n) = 0$.

Example: $n^{1.9999} = o(n^2)$, $n^2/\lg n = o(n^2)$, $n^2 \ne o(n^2)$ (just like $2 \not< 2$), $n^2/1000 \ne o(n^2)$.

ω-notation

$\omega(g(n)) = \{ f(n) :$ for all constants $c > 0$, there exists a constant $n_0 > 0$ such that $0 \le c\,g(n) < f(n)$ for all $n \ge n_0 \}$.

g(n) is an asymptotic strict lower bound for f(n).

Another view: $\lim_{n \to \infty} f(n)/g(n) = \infty$.

Example: $n^{2.0001} = \omega(n^2)$, $n^2 \lg n = \omega(n^2)$, $n^2 \ne \omega(n^2)$.


Comparisons of Functions

Related Properties:

Transitivity: $f(n) = \Theta(g(n))$ and $g(n) = \Theta(h(n))$ imply $f(n) = \Theta(h(n))$. Same for $O$, $\Omega$, $o$, and $\omega$.

Reflexivity: $f(n) = \Theta(f(n))$. Same for $O$ and $\Omega$.

Symmetry: $f(n) = \Theta(g(n))$ if and only if $g(n) = \Theta(f(n))$.

Transpose symmetry: $f(n) = O(g(n))$ if and only if $g(n) = \Omega(f(n))$; $f(n) = o(g(n))$ if and only if $g(n) = \omega(f(n))$.

Comparisons:

f(n) is asymptotically smaller than g(n) if $f(n) = o(g(n))$.

f(n) is asymptotically larger than g(n) if $f(n) = \omega(g(n))$.


Standard notations and common functions

Monotonicity:

f(n) is monotonically increasing if $m \le n \Rightarrow f(m) \le f(n)$.

f(n) is monotonically decreasing if $m \le n \Rightarrow f(m) \ge f(n)$.

f(n) is strictly increasing if $m < n \Rightarrow f(m) < f(n)$.

Chapter 4. Recurrences


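A recurrence is a function defined in terms of one or more base cases, and itself, with smaller arguments.

Examples:

$$T(n) = \begin{cases} 1 & \text{if } n = 1, \\ T(n-1) + 1 & \text{if } n > 1. \end{cases} \qquad \text{Solution: } T(n) = n.$$

$$T(n) = \begin{cases} 1 & \text{if } n = 1, \\ 2T(n/2) + n & \text{if } n > 1. \end{cases} \qquad \text{Solution: } T(n) = n \lg n + n.$$

$$T(n) = \begin{cases} 0 & \text{if } n = 2, \\ T(\sqrt{n}) + 1 & \text{if } n > 2. \end{cases} \qquad \text{Solution: } T(n) = \lg \lg n.$$

$$T(n) = \begin{cases} 1 & \text{if } n = 1, \\ T(n/3) + T(2n/3) + n & \text{if } n > 1. \end{cases} \qquad \text{Solution: } T(n) = \Theta(n \lg n).$$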


Methods for solving recurrences

Substitution method: Guess a bound and then prove the guess correct by mathematical induction.

Master method: It provides bounds for recurrences of the form $T(n) = aT(n/b) + f(n)$, where $a \ge 1$, $b > 1$, and f(n) is a given function.

Recursion-tree method: It converts the recurrence into a tree whose nodes represent the costs incurred at various levels of the recursion.


The Substitution method

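$T(n) = 2T(n/2) + n$

Guess: $T(n) = O(n \lg n)$

Proof: Prove that $T(n) \le c\,n \lg n$ for some constant $c > 0$. Assume as the inductive hypothesis that $T(n/2) \le c\,(n/2) \lg(n/2)$. Then

$$T(n) \le 2\left(c\,\frac{n}{2}\lg\frac{n}{2}\right) + n = cn\lg\frac{n}{2} + n = cn\lg n - cn + n = cn \lg n - (c-1)n \le cn\lg n \quad \text{if } c \ge 1.$$

Therefore, $T(n) = O(n \lg n)$.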
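The guessed bound can be sanity-checked numerically. This is a sketch under assumptions not stated on the slide: the base case is taken as T(1) = 1, n/2 is interpreted as the floor, and the names `T` and `bound_holds` are hypothetical. It also shows why the base cases matter: the inductive step only needs $c \ge 1$, but with this base case the bound at n = 2 requires $c \ge 2$.

```python
from math import log2


def T(n):
    """T(n) = 2*T(floor(n/2)) + n with T(1) = 1 (the base case is an assumption)."""
    return 1 if n == 1 else 2 * T(n // 2) + n


def bound_holds(c, upto=4096):
    """Check T(n) <= c*n*lg(n) for 2 <= n <= upto."""
    # n = 1 is skipped: lg 1 = 0, so the induction uses n = 2 and n = 3 as base cases.
    return all(T(n) <= c * n * log2(n) for n in range(2, upto + 1))
```

With c = 2 the bound holds on the whole tested range, while c = 1 already fails at n = 2, where T(2) = 4.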


The recursion-tree method

Each node represents the cost of a single subproblem somewhere in the set of recursive function invocations.

Sum the costs within each level of the tree to obtain a set of per-level costs.

Then sum all the per-level costs to determine the total cost of all levels of the recursion.

It is useful for solving recurrences that describe the running time of a divide-and-conquer algorithm.

It is used to generate a good guess, which is then verified by the substitution method.

A careful drawing of a recursion tree and summing of the costs can be used as a direct proof of a solution to a recurrence.

Example: $T(n) = 3T(n/4) + cn^2$. Assumption: $n = 4^k$.
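The per-level sums of the recursion tree for $T(n) = 3T(n/4) + cn^2$ can be computed explicitly, since the figure is not available here. In this sketch, level i has $3^i$ nodes, each a subproblem of size $n/4^i$; the leaf cost T(1) = c and the function names are assumptions made for illustration.

```python
def level_costs(n, c=1):
    """Per-level costs of the recursion tree for T(n) = 3T(n/4) + c*n^2, n a power of 4."""
    costs, nodes, size = [], 1, n
    while size > 1:
        costs.append(nodes * c * size * size)  # 3^i nodes, each costing c*(n/4^i)^2
        nodes, size = nodes * 3, size // 4
    costs.append(nodes * c)  # leaves: 3^(log_4 n) of them, each costing T(1) = c
    return costs


def tree_recurrence(n, c=1):
    """Direct evaluation of the same recurrence, for comparison."""
    return c if n == 1 else 3 * tree_recurrence(n // 4, c) + c * n * n
```

For n = 64 the levels cost 4096, 768, 144 and 27, which sum to the same value the recurrence gives. Each internal level costs 3/16 of the one above it, so the root level dominates and the total is $O(n^2)$.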


The Master Method

Master Theorem

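Let $a \ge 1$ and $b > 1$ be constants, let f(n) be a function, and let T(n) be defined on the nonnegative integers by the recurrence

$$T(n) = aT(n/b) + f(n),$$

where we interpret n/b to mean either $\lfloor n/b \rfloor$ or $\lceil n/b \rceil$. Then T(n) can be bounded asymptotically as follows.

1. If $f(n) = O(n^{\log_b a - \epsilon})$ for some constant $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$.

2. If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \lg n)$.

3. If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some constant $\epsilon > 0$, and if $a f(n/b) \le c f(n)$ for some constant $c < 1$ and all sufficiently large n, then $T(n) = \Theta(f(n))$.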

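The Master method (continued)

Used for many divide-and-conquer recurrences of the form $T(n) = aT(n/b) + f(n)$, where $a \ge 1$, $b > 1$, and $f(n) > 0$. Based on the master theorem (Theorem 4.1).

Compare $n^{\log_b a}$ vs. f(n):

Case 1: $f(n) = O(n^{\log_b a - \epsilon})$ for some constant $\epsilon > 0$. (f(n) is polynomially smaller than $n^{\log_b a}$.) Solution: $T(n) = \Theta(n^{\log_b a})$. (Intuitively: cost is dominated by leaves.)

Case 2: $f(n) = \Theta(n^{\log_b a} \lg^k n)$, where $k \ge 0$. [This formulation is more general than in Theorem 4.1; it is given in Exercise 4.4-2.] (f(n) is within a polylog factor of $n^{\log_b a}$, but not smaller.) Solution: $T(n) = \Theta(n^{\log_b a} \lg^{k+1} n)$. (Intuitively: cost is $n^{\log_b a} \lg^k n$ at each level, and there are $\Theta(\lg n)$ levels.) Simple case: $k = 0$, so $f(n) = \Theta(n^{\log_b a})$ and $T(n) = \Theta(n^{\log_b a} \lg n)$.

Case 3: $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some constant $\epsilon > 0$, and f(n) satisfies the regularity condition $a f(n/b) \le c f(n)$ for some constant $c < 1$ and all sufficiently large n. Solution: $T(n) = \Theta(f(n))$. (Intuitively: cost is dominated by the root.)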

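Examples:

1. $T(n) = 5T(n/2) + \Theta(n^2)$: compare $n^{\log_2 5}$ vs. $n^2$. Since $\log_2 5 - \epsilon = 2$ for some constant $\epsilon > 0$, use Case 1: $T(n) = \Theta(n^{\lg 5})$.

2. $T(n) = 27T(n/3) + \Theta(n^3 \lg n)$: compare $n^{\log_3 27} = n^3$ vs. $n^3 \lg n$. Use Case 2 with k = 1: $T(n) = \Theta(n^3 \lg^2 n)$.

3. $T(n) = 5T(n/2) + \Theta(n^3)$: compare $n^{\log_2 5}$ vs. $n^3$. Now $\lg 5 + \epsilon = 3$ for some constant $\epsilon > 0$. Check the regularity condition (we don't really need to, since f(n) is a polynomial): $a f(n/b) = 5(n/2)^3 = 5n^3/8 \le cn^3$ for $c = 5/8 < 1$. Use Case 3: $T(n) = \Theta(n^3)$.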
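The case analysis can be mechanized for driving functions of the restricted form $f(n) = \Theta(n^d \lg^k n)$. The sketch below is an illustration, not a general implementation of the theorem: the function name `master_case` and its parameters are assumptions, and it expects integer a, b, d so that comparing a with $b^d$ is exact. For this family of functions the regularity condition of Case 3 holds automatically, because $a/b^d < 1$.

```python
from math import log


def master_case(a, b, d, k=0):
    """Classify T(n) = a*T(n/b) + Theta(n^d * lg^k n), with a >= 1, b > 1, d >= 0, k >= 0.

    Returns (case number, asymptotic solution as a string).
    """
    e = log(a) / log(b)  # critical exponent log_b(a), used only for display
    if a > b ** d:       # d < log_b a: leaves dominate
        return 1, f"Theta(n^{e:.3f})"
    if a == b ** d:      # d == log_b a: every level costs about the same
        return 2, f"Theta(n^{d} lg^{k + 1} n)"
    return 3, f"Theta(n^{d} lg^{k} n)"  # d > log_b a: root dominates
```

For instance, `master_case(5, 2, 2)` lands in Case 1, `master_case(27, 3, 3, k=1)` in Case 2, and `master_case(5, 2, 3)` in Case 3; merge sort's recurrence is `master_case(2, 2, 1)`, which is Case 2.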

Chapter 6. HeapSort


HeapSort

O(n lg n) worst case, like merge sort.

Sorts in place, like insertion sort.

Combines the best of both algorithms.

To understand HeapSort, we study heaps and heap operations, and then priority queues.


Heap: Data Structure

Heap A is a nearly complete binary tree.

Height of a node = number of edges on a longest simple path from the node down to a leaf.

Height of heap = height of root = $\Theta(\lg n)$.

A heap can be stored as an array A[1 .. n].

Root of tree is A[1].

Parent of A[i] = A[⌊i/2⌋].

Left child of A[i] = A[2i].

Right child of A[i] = A[2i + 1].

These index computations are fast in a binary representation: 2i is a left shift by one bit, and ⌊i/2⌋ is a right shift by one bit.


Heap: Property

For max-heaps (largest element at root), max-heap property: for all nodes i, excluding the root, $A[\text{PARENT}(i)] \ge A[i]$.

For min-heaps (smallest element at root), min-heap property: for all nodes i, excluding the root, $A[\text{PARENT}(i)] \le A[i]$.

The maximum (or minimum) element of a max-heap (or min-heap) is at the root.

The heapsort algorithm uses max-heaps.

In general, heaps can be k-ary trees (instead of binary).


Maintaining the heap property

MAX-HEAPIFY is the procedure used to maintain the max-heap property.

Before MAX-HEAPIFY, A[i] may be smaller than its children.

Assume the left and right subtrees of i are max-heaps.

After MAX-HEAPIFY, the subtree rooted at i is a max-heap.

The way MAX-HEAPIFY works:

Compare A[i], A[LEFT(i)], and A[RIGHT(i)].

If necessary, swap A[i] with the larger of the two children to preserve the heap property.

Continue this process of comparing and swapping down the heap, until the subtree rooted at i is a max-heap. If we hit a leaf, then the subtree rooted at the leaf is trivially a max-heap.
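The compare-and-swap-down process can be sketched in Python as below. This is an illustrative version under assumptions: it uses 0-based indices, so the children of i are 2i+1 and 2i+2 rather than the 2i and 2i+1 of the 1-based slides, it is written iteratively, and the explicit `heap_size` parameter is a choice made for this sketch.

```python
def max_heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at i (0-based) is a max-heap.

    Assumes the subtrees rooted at the children of i are already max-heaps.
    """
    while True:
        left, right = 2 * i + 1, 2 * i + 2  # 0-based child indices
        largest = i
        if left < heap_size and a[left] > a[largest]:
            largest = left
        if right < heap_size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return  # heap property holds at i, or i is a leaf
        a[i], a[largest] = a[largest], a[i]
        i = largest  # continue down the subtree that received the smaller value
```

Each iteration moves one level down the tree, so the loop runs at most as many times as the height of node i.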

Time: O(lg n), since MAX-HEAPIFY follows a single path down a tree of height $\Theta(\lg n)$.


Building a Heap

Total time: O(n lg n), since BUILD-MAX-HEAP makes O(n) calls to MAX-HEAPIFY, each taking O(lg n) time. (A tighter analysis shows that the total is in fact O(n).)

The HeapSort algorithm

Given an input array, the heapsort algorithm acts as follows:

Builds a max-heap from the array.

Starting with the root (the maximum element), the algorithm places the maximum element into the correct place in the array by swapping it with the element in the last position in the array.

Discards this last node (knowing that it is in its correct place) by decreasing the heap size, and calls MAX-HEAPIFY on the new (possibly incorrectly placed) root.

Repeats this discarding process until only one node (the smallest element) remains, and therefore is in the correct place in the array.


Analysis:

HEAPSORT(A, n)
    BUILD-MAX-HEAP(A, n)              : O(n lg n)
    for i ← n downto 2                : n-1 times
        do exchange A[1] ↔ A[i]       : O(1)
           MAX-HEAPIFY(A, 1, i-1)     : O(lg n)

Total time: O(n lg n)
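Putting the pieces together, HEAPSORT can be sketched in Python as follows. The BUILD-MAX-HEAP pseudocode was a figure that is missing here, so `build_max_heap` below is the standard bottom-up construction and should be read as an assumption; indices are 0-based and all function names are illustrative.

```python
def max_heapify(a, i, heap_size):
    """Sift a[i] down within a[0..heap_size-1] (0-based indices)."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < heap_size and a[child] > a[largest]:
                largest = child
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a):
    """Heapify every internal node, from the last one up to the root."""
    for i in range(len(a) // 2 - 1, -1, -1):
        max_heapify(a, i, len(a))


def heapsort(a):
    """Sort a in place: repeatedly move the maximum to the end and shrink the heap."""
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]   # exchange root with last heap element
        max_heapify(a, 0, end)        # heap now occupies a[0..end-1]
    return a
```

The loop runs n-1 times and each `max_heapify` call is O(lg n), which gives the O(n lg n) total.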


Heap implementation of Priority Queue

Heaps efficiently implement priority queues.

Max-priority queues are implemented with max-heaps.

A heap is a good compromise between fast insertion with slow extraction and vice versa.

Both operations take O(lg n) time.

Cf. a max-priority queue implemented by a sorted linked list: insertion O(n), extraction O(1).


Priority Queue

A priority queue is a data structure for maintaining a dynamic set S of elements, where each element has an associated value called a key.

A priority queue is a data structure such that access or removal is of the highest-priority element in the collection, according to some method for comparing elements. (Collins, DS and the Java collection framework)

A max-priority queue supports dynamic-set operations:

INSERT(S, x): inserts element x into set S.

MAXIMUM(S): returns the element of S with the largest key.

EXTRACT-MAX(S): removes and returns the element of S with the largest key.

INCREASE-KEY(S, x, k): increases the value of element x's key to k. Assume $k \ge$ x's current key value.

Example of a max-priority queue application: scheduling jobs on a shared computer.


A min-priority queue supports similar operations:

INSERT(S, x): inserts element x into set S.

MINIMUM(S): returns the element of S with the smallest key.

EXTRACT-MIN(S): removes and returns the element of S with the smallest key.

DECREASE-KEY(S, x, k): decreases the value of element x's key to k. Assume $k \le$ x's current key value.

Example of a min-priority queue application: event-driven simulator.


Finding the maximum element: MAXIMUM(S)

Get the root, A[1].

Time: $\Theta(1)$.

Extracting Max element: EXTRACT-MAX(S)

Given the array A:

Make sure the heap is not empty.

Make a copy of the maximum element (the root).

Make the last node in the tree the new root.

Re-heapify the heap, with one fewer node.

Return the copy of the maximum element.

Analysis: constant-time assignments plus the time for MAX-HEAPIFY.

Time: O(lg n).


Increasing key value: INCREASE-KEY(S, x, k)

Given set S, element x, and new key value k:

Make sure $k \ge$ x's current key.

Update x's key value to k.

Traverse the tree upward, comparing x to its parent and swapping keys if necessary, until x's key is smaller than its parent's key.

Analysis:

The upward path from node i has length O(lg n) in an n-element heap.

Time: O(lg n)


Inserting into the Heap: INSERT(S, x)

Given a key k to insert into the heap:

Insert a new node in the very last position in the tree with key $-\infty$.

Increase the $-\infty$ key to k using the HEAP-INCREASE-KEY procedure defined above.

Analysis: constant-time assignments plus the time for HEAP-INCREASE-KEY.

Time: O(lg n)
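The four max-priority-queue operations described in this chapter can be sketched together as a small Python class. This is an illustration under assumptions: the class name `MaxPriorityQueue` and its method names are hypothetical, keys are stored directly (no satellite data), indices are 0-based, and `increase_key` takes an array index rather than an element handle.

```python
class MaxPriorityQueue:
    """Max-priority queue on a 0-based array heap (the slides use 1-based indices)."""

    def __init__(self):
        self.a = []

    def maximum(self):
        return self.a[0]  # the root; raises IndexError if the queue is empty

    def increase_key(self, i, k):
        if k < self.a[i]:
            raise ValueError("new key is smaller than current key")
        self.a[i] = k
        # Walk up while the parent is smaller than the updated key.
        while i > 0 and self.a[(i - 1) // 2] < self.a[i]:
            parent = (i - 1) // 2
            self.a[i], self.a[parent] = self.a[parent], self.a[i]
            i = parent

    def insert(self, k):
        self.a.append(float("-inf"))           # new last node with key -infinity
        self.increase_key(len(self.a) - 1, k)  # then raise it to k

    def extract_max(self):
        if not self.a:
            raise IndexError("heap underflow")
        top, last = self.a[0], self.a.pop()
        if self.a:
            self.a[0] = last      # last node becomes the new root
            self._max_heapify(0)  # re-heapify with one fewer node
        return top

    def _max_heapify(self, i):
        n = len(self.a)
        while True:
            largest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.a[child] > self.a[largest]:
                    largest = child
            if largest == i:
                return
            self.a[i], self.a[largest] = self.a[largest], self.a[i]
            i = largest
```

Inserting 3, 10, 5, 8 and then extracting repeatedly yields the keys in decreasing order. For production use, Python's standard `heapq` module provides a min-heap on lists.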

Chapter 7. Quicksort

Quick Sort

Worst-case running time: $\Theta(n^2)$.

Expected running time: $\Theta(n \lg n)$. The constants hidden in $\Theta(n \lg n)$ are small.

Based on the three-step process of divide-and-conquer.

To sort the subarray A[p .. r]:

Divide: Partition A[p .. r] into two (possibly empty) subarrays A[p .. q-1] and A[q+1 .. r], such that each element in the first subarray A[p .. q-1] is $\le A[q]$ and A[q] is $\le$ each element in the second subarray A[q+1 .. r].

Conquer: Sort the two subarrays by recursive calls to QUICKSORT.

Combine: No work is needed to combine the subarrays, because they are sorted in place.

Perform the divide step by a procedure PARTITION, which returns the index q that marks the position separating the subarrays.


Partitioning:

PARTITION always selects the last element A[r] in the subarray A[p .. r] as the pivot, the element around which to partition.

As the procedure executes, the array is partitioned into four regions, some of which may be empty:

Loop invariant:

1. All entries in A[p .. i] are $\le$ pivot.

2. All entries in A[i+1 .. j-1] are > pivot.

3. A[r] = pivot.

It's not needed as part of the loop invariant, but the fourth region is A[j .. r-1], whose entries have not yet been examined, and so we don't know how they compare to the pivot.


Correctness: Use the loop invariant to prove correctness of PARTITION.

Initialization: Before the loop starts, all the conditions of the loop invariant are satisfied, because A[r] is the pivot and the subarrays A[p .. i] and A[i+1 .. j-1] are empty.

Maintenance: While the loop is running, if $A[j] \le$ pivot, then A[j] and A[i+1] are swapped and i and j are incremented. If A[j] > pivot, then only j is incremented.

Termination: When the loop terminates, j = r, so all elements in A are partitioned into one of the three cases: A[p .. i] ≤ pivot, A[i+1 .. r-1] > pivot, and A[r] = pivot.

The last two lines of PARTITION move the pivot element from the end of the array to between the two subarrays, by swapping the pivot (A[r]) and the first element of the second subarray (A[i+1]).

Time for partitioning: $\Theta(n)$ to partition an n-element subarray.
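The PARTITION pseudocode was a figure that did not survive, so the Python below is a sketch reconstructed from the description and loop invariant above (last element as pivot, regions A[p .. i] ≤ pivot and A[i+1 .. j-1] > pivot). It assumes 0-based inclusive indices, and the function names are illustrative.

```python
def partition(a, p, r):
    """Partition a[p..r] around pivot a[r]; return the pivot's final index q."""
    pivot = a[r]
    i = p - 1  # a[p..i] holds the elements <= pivot found so far
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
        # otherwise only j advances, growing the "> pivot" region a[i+1..j]
    a[i + 1], a[r] = a[r], a[i + 1]  # place the pivot between the two regions
    return i + 1


def quicksort(a, p=0, r=None):
    """Sort a[p..r] in place; the initial call covers the whole list."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
    return a
```

On `[2, 8, 7, 1, 3, 5, 6, 4]` the first partition places the pivot 4 at index 3, with the smaller elements to its left and the larger ones to its right.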

Performance of QuickSort


The running time of Quicksort depends on the partitioning of the subarrays:

If the subarrays are balanced, then quicksort can run as fast as merge sort.

If they are unbalanced, then quicksort can run as slowly as insertion sort.

Worst case

Occurs when the subarrays are completely unbalanced.

Have 0 elements in one subarray and n-1 elements in the other subarray.

Get the recurrence: $T(n) = T(n-1) + T(0) + \Theta(n) = T(n-1) + \Theta(n) = \Theta(n^2)$.

Same running time as insertion sort. In fact, the worst-case running time occurs when quicksort takes a sorted array as input, but insertion sort runs in O(n) time in this case.

Best case

Occurs when the subarrays are completely balanced every time. The recurrence is then $T(n) = 2T(n/2) + \Theta(n) = \Theta(n \lg n)$.


Balanced Partitioning

QuickSort's average running time is much closer to the best case than to the worst case.

Imagine that PARTITION always produces a 9-to-1 split.

Get the recurrence $T(n) \le T(9n/10) + T(n/10) + \Theta(n) = O(n \lg n)$.

Intuition: look at the recursion tree. It's like the one for $T(n) = T(n/3) + T(2n/3) + O(n)$ in Section 4.2, except that here the constants are different; we get $\log_{10} n$ full levels and $\log_{10/9} n$ levels that are nonempty.

As long as it's a constant, the base of the log doesn't matter in asymptotic notation.

Any split of constant proportionality will yield a recursion tree of depth $\Theta(\lg n)$.


Intuition for the Average case

Splits in the recursion tree will not always be constant.

There will usually be a mix of good and bad splits throughout the recursion tree.

To see that this doesn't affect the asymptotic running time of Quicksort, assume that levels alternate between best-case and worst-case splits.

The extra level in the left-hand figure only adds to the constant hidden in the Θ-notation.

There are still the same number of subarrays to sort, and only twice as much work was done to get to that point.

Both figures (Fig. 7.5 a and b) result in O(n lg n) time, though the constant for the figure on the left is higher than that of the figure on the right.


Analysis of QuickSort


Worst-case Analysis

We will prove that a worst-case split at every level produces a worst-case running time of $O(n^2)$.

Recurrence for the worst-case running time of QUICKSORT:

$$T(n) = \max_{0 \le q \le n-1} \bigl(T(q) + T(n-q-1)\bigr) + \Theta(n).$$

Because PARTITION produces two subproblems, totaling size n-1, q ranges from 0 to n-1.

Guess: $T(n) \le cn^2$, for some c. Substituting our guess into the above recurrence:

$$T(n) \le \max_{0 \le q \le n-1} \bigl(cq^2 + c(n-q-1)^2\bigr) + \Theta(n) = c \cdot \max_{0 \le q \le n-1} \bigl(q^2 + (n-q-1)^2\bigr) + \Theta(n).$$

The maximum value of $q^2 + (n-q-1)^2$ occurs when q is either 0 or n-1. (The second derivative with respect to q is positive.) This means that

$$\max_{0 \le q \le n-1} \bigl(q^2 + (n-q-1)^2\bigr) \le (n-1)^2 = n^2 - 2n + 1.$$

Therefore, $T(n) \le cn^2 - c(2n-1) + \Theta(n) \le cn^2$ if $c(2n-1) \ge \Theta(n)$.

Pick c so that $c(2n-1)$ dominates $\Theta(n)$. Therefore, the worst-case running time of quicksort is $O(n^2)$.

We can also show that the recurrence's solution is $\Omega(n^2)$. Thus, the worst-case running time is $\Theta(n^2)$.

Randomized version of QuickSort


The assumption that all input permutations are equally likely is not always true.

To correct this, we add randomization to QuickSort.

We could randomly permute the input array.

Instead, we use random sampling, i.e. picking one element at random.

Don't always use A[r] as the pivot. Instead, randomly pick an element from the subarray that is being sorted.

Randomly selecting the pivot element will, on average, cause the split of the input array to be reasonably well balanced.


Randomization of Quicksort stops any specific type of array from causing worst-case behavior.

For example, an already-sorted array causes worst-case behavior in non-randomized QUICKSORT, but not in RANDOMIZED-QUICKSORT.
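A randomized variant can be sketched by swapping a randomly chosen element into the pivot position before partitioning as usual. The pseudocode figures for RANDOMIZED-PARTITION and RANDOMIZED-QUICKSORT are missing here, so the bodies below are assumptions based on the description; indices are 0-based and the Python function names are illustrative.

```python
import random


def randomized_partition(a, p, r):
    """Pick a random pivot from a[p..r], then partition around it."""
    s = random.randint(p, r)  # randint is inclusive on both ends
    a[s], a[r] = a[r], a[s]   # move the chosen pivot to the last position
    pivot, i = a[r], p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1


def randomized_quicksort(a, p=0, r=None):
    """Sort a[p..r] in place using random pivots."""
    if r is None:
        r = len(a) - 1
    if p < r:
        q = randomized_partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)
    return a
```

An already-sorted input no longer forces the 0 versus n-1 split at every level, because the pivot is no longer always the largest element of the subarray.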

Chapter 8. Sorting in Linear Time

Types of Sort Algorithms


Comparison sorts: the only operation that may be used to gain order information about a sequence is comparison of pairs of elements.

Exchange sorting (comparison-based): Bubble Sort, Quick Sort

Insertion Sort (comparison-based)

Selection sorting (comparison-based): Selection Sort, Heap Sort

Merge Sort (comparison-based)

Distribution sort: Bucket Sort, Radix Sort

Lower bounds for sorting

Lower bounds

$\Omega(n)$ to examine all the input.

All sorts seen so far are $\Omega(n \lg n)$.

We'll show that $\Omega(n \lg n)$ is a lower bound for comparison sorts.

Decision tree

Abstraction of any comparison sort.

Represents comparisons made by a specific sorting algorithm on inputs of a given size.

Abstracts away everything else: control and data movement.

We're counting only comparisons.

How many leaves on the decision tree? There are at least n! leaves, because every permutation appears at least once.

For any comparison sort, there is 1 tree for each n.

View the tree as if the algorithm splits in two at each node, based on the information it has determined up to that point. The tree models all possible execution traces.

What is the length of the longest path from root to leaf? It depends on the algorithm:

Insertion sort: $\Theta(n^2)$

Merge sort: $\Theta(n \lg n)$

Lemma: Any binary tree of height h has $\le 2^h$ leaves. In other words, if l = number of leaves and h = height, then $l \le 2^h$.

Theorem: Any decision tree that sorts n elements has height $\Omega(n \lg n)$.

Proof sketch: $n! \le l \le 2^h$, so $h \ge \lg(n!) = \Omega(n \lg n)$.
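The bound $h \ge \lg(n!)$ can be evaluated exactly for small n. The helper below is a hypothetical illustration: it returns $\lceil \lg(n!) \rceil$, the smallest height a binary decision tree with n! leaves can have, using integer arithmetic to avoid floating-point rounding.

```python
from math import factorial


def min_worst_case_comparisons(n):
    """ceil(lg(n!)): a decision tree with n! leaves has at least this height."""
    # For an integer x >= 1, (x - 1).bit_length() equals ceil(log2(x)).
    return (factorial(n) - 1).bit_length()
```

For example, sorting 3 elements needs at least 3 comparisons in the worst case, 4 elements need at least 5, and 5 elements need at least 7.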

Sorting in Linear Time


Non-comparison sorts.

Counting sort

Depends on a key assumption: numbers to be sorted are integers in {0, 1, ..., k}.

Input: A[1 .. n], where $A[j] \in \{0, 1, \ldots, k\}$ for j = 1, 2, ..., n. Array A and values n and k are given as parameters.

Output: B[1 .. n], sorted. B is assumed to be already allocated and is given as a parameter.

Auxiliary storage: C[0 .. k]

Analysis of Counting Sort

$\Theta(n + k)$, which is $\Theta(n)$ if k = O(n).

How big a k is practical?

Good for sorting 32-bit values? No.

16-bit? Probably not.

8-bit? Maybe, depending on n.

4-bit? Probably (unless n is really small).

Counting sort will be used in radix sort.

Stable algorithm: Numbers with the same value appear in the output array in the same order as they do in the input array; i.e., ties between two numbers are broken by the rule that whichever number appears first in the input array appears first in the output array.
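The counting-sort pseudocode was a figure that is missing here, so the following Python is a sketch built from the stated input, output, and auxiliary arrays. It departs from the slide in two labeled ways: it returns a new list instead of filling a preallocated B, and it accepts an optional `key` function (an assumption added so that stability can be observed on records with equal keys).

```python
def counting_sort(a, k, key=lambda x: x):
    """Stable sort of items whose key(x) is an integer in {0, ..., k}."""
    count = [0] * (k + 1)          # the auxiliary array C[0..k]
    for x in a:
        count[key(x)] += 1         # count[v] = number of items with key v
    for v in range(1, k + 1):
        count[v] += count[v - 1]   # count[v] = number of items with key <= v
    out = [None] * len(a)          # the output array B
    for x in reversed(a):          # right-to-left scan keeps equal keys in input order
        count[key(x)] -= 1
        out[count[key(x)]] = x
    return out
```

The three loops over the input take $\Theta(n)$ and the prefix-sum loop takes $\Theta(k)$, which matches the $\Theta(n + k)$ analysis.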

Radix Sort


Key idea: Sort least significant digits first.

To sort d digits: for i ← 1 to d, use a stable sort to sort the array on digit i.

Correctness: Induction on the number of passes (i in the pseudocode).

Assume digits 1, 2, ..., i-1 are sorted.

Show that a stable sort on digit i leaves digits 1, ..., i sorted:

If 2 digits in position i are different, ordering by position i is correct, and positions 1, ..., i-1 are irrelevant.

If 2 digits in position i are equal, the numbers are already in the right order (by the inductive hypothesis). The stable sort on digit i leaves them in the right order.
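The least-significant-digit-first idea can be sketched in a few lines of Python. Note the substitution: instead of counting sort, each pass here distributes numbers into per-digit lists, which is also stable because appending preserves input order. The function name `radix_sort`, the `base` parameter, and the restriction to non-negative integers are assumptions of this sketch.

```python
def radix_sort(a, d, base=10):
    """Sort non-negative integers with at most d base-`base` digits, LSD first."""
    for i in range(d):  # digit 0 is the least significant
        buckets = [[] for _ in range(base)]
        for x in a:
            # Appending keeps numbers with equal digits in their current order (stable).
            buckets[(x // base ** i) % base].append(x)
        a = [x for bucket in buckets for x in bucket]
    return a
```

Each pass costs $\Theta(n + \text{base})$ and there are d passes, in line with the per-pass cost of a stable counting sort.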

Analysis of Radix Sort


Lemma 8.3: Given n d-digit numbers in which each digit can take on up to k possible values, RADIX-SORT correctly sorts these numbers in $\Theta(d(n + k))$ time.

Assume that we use counting sort as the intermediate sort.

$\Theta(n + k)$ per pass (digits in range 0, ..., k)

d passes

$\Theta(d(n + k))$ total

If k = O(n), time = $\Theta(dn)$.

Lemma 8.4: Given n b-bit numbers and any positive integer $r \le b$, RADIX-SORT correctly sorts these numbers in $\Theta((b/r)(n + 2^r))$ time.

How to break each key into digits?

n words.

b bits/word.

Break into r-bit digits. Have $d = \lceil b/r \rceil$.

Use counting sort, $k = 2^r - 1$.

Example: 32-bit words, 8-bit digits. b = 32, r = 8, $d = \lceil 32/8 \rceil = 4$, $k = 2^8 - 1 = 255$.

Time = $\Theta((b/r)(n + 2^r))$.

How to choose r? Balance b/r and $n + 2^r$. Choosing $r \approx \lg n$ gives us $\Theta\left(\frac{b}{\lg n}(n + n)\right) = \Theta(bn/\lg n)$.

So, to sort $2^{16}$ 32-bit numbers, use $r = \lg 2^{16} = 16$ bits and $\lceil b/r \rceil = 2$ passes.

Bucket Sort


Assumes the input is generated by a random process that distributes elements uniformly over [0, 1).

Idea:

Divide [0, 1) into n equal-sized buckets.

Distribute the n input values into the buckets.

Sort each bucket.

Then go through the buckets in order, listing the elements in each one.

Input: A[1 .. n], where $0 \le A[i] < 1$ for all i.

Auxiliary array: B[0 .. n-1] of linked lists, each list initially empty.

Correctness:

Consider A[i], A[j].

Assume without loss of generality that $A[i] \le A[j]$.

Then $\lfloor n \cdot A[i] \rfloor \le \lfloor n \cdot A[j] \rfloor$.

So A[i] is placed into the same bucket as A[j] or into a bucket with a lower index.

If same bucket, insertion sort fixes up.

If earlier bucket, concatenation of lists fixes up.

Analysis:

Relies on no bucket getting too many values.

All lines of the algorithm except insertion sorting take $\Theta(n)$ altogether.

Intuitively, if each bucket gets a constant number of elements, it takes O(1) time to sort each bucket, giving O(n) sort time for all buckets.

We expect each bucket to have few elements, since the average is 1 element per bucket.
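The bucket-sort idea can be sketched in Python as below. This is an illustration under assumptions: Python lists stand in for the linked lists B[0 .. n-1], the built-in `list.sort` stands in for the per-bucket insertion sort, and the `min(n - 1, ...)` guard is added to protect against floating-point rounding when a value is very close to 1.

```python
def bucket_sort(a):
    """Sort floats in [0, 1) by distributing them into len(a) equal-width buckets."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        # floor(n * x) is the bucket index; the guard keeps it within range.
        buckets[min(n - 1, int(n * x))].append(x)
    out = []
    for bucket in buckets:
        bucket.sort()        # stand-in for insertion sort on a short list
        out.extend(bucket)   # concatenate the buckets in order
    return out
```

Values in an earlier bucket are all smaller than values in a later one, so concatenating the sorted buckets yields a sorted list.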


Important Features of Sort Algorithms


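Sort Algorithm    Worst-case T(n)        Average-case T(n)

Bubble sort       $\Theta(n^2)$          $\Theta(n^2)$
Insertion sort    $\Theta(n^2)$          $\Theta(n^2)$
Selection sort    $\Theta(n^2)$          $\Theta(n^2)$
Quick sort        $\Theta(n^2)$          $\Theta(n \lg n)$
Merge sort        $\Theta(n \lg n)$      $\Theta(n \lg n)$
Heap sort         $\Theta(n \lg n)$      $\Theta(n \lg n)$
Counting sort     $\Theta(n)$            $\Theta(n)$
Radix sort        $\Theta(n \lg n)$      $\Theta(n)$
Bucket sort       $\Theta(n^2)$          $\Theta(n)$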
