Expected Running Times and Randomized Algorithms

Instructor Neelima Gupta [email protected]

Upload: rodney-greer

Post on 06-Jan-2018



TRANSCRIPT

Page 1: Instructor Neelima Gupta Expected Running Times and Randomized Algorithms Instructor Neelima Gupta


Expected Running Time of Insertion Sort

$x_1, x_2, \ldots, x_{i-1}, x_i, \ldots, x_n$

For $i = 2$ to $n$: insert the $i$th element $x_i$ into the partially sorted list $x_1, x_2, \ldots, x_{i-1}$ (at the $r$th position).
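The loop above can be sketched in Python as follows. This is a minimal illustration, not the instructor's code: the function name `insertion_sort_with_count` and the choice to count one comparison per key test are assumptions made so that the count matches the comparison model used in the analysis.

```python
def insertion_sort_with_count(values):
    """Sort a copy of `values`; return (sorted_list, key_comparisons).

    Hypothetical helper: the name and the counting convention are
    assumptions for illustration.
    """
    items = list(values)
    comparisons = 0
    for i in range(1, len(items)):
        current = items[i]          # element x_i to be inserted
        j = i - 1
        # Scan the sorted prefix from right to left.
        while j >= 0:
            comparisons += 1        # one comparison of x_i with items[j]
            if items[j] <= current:
                break               # insertion position found
            items[j + 1] = items[j] # shift the larger element right
            j -= 1
        items[j + 1] = current
    return items, comparisons
```

For example, `insertion_sort_with_count([3, 1, 2])` makes one comparison to insert 1 and two comparisons to insert 2.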

Expected Running Time of Insertion Sort

Let $X_i$ be the random variable that represents the number of comparisons required to insert the $i$th element of the input array into the sorted subarray of the first $i-1$ elements.

$X_i$ takes values in $1, \ldots, i-1$. Write $x_{ij}$ for the number of comparisons made when $x_i$ is inserted at the $j$th position, giving the values $x_{i1}, x_{i2}, \ldots, x_{ii}$.

$E(X_i) = \sum_{j=1}^{i} x_{ij}\, p(x_{ij})$

where $E(X_i)$ is the expected value of $X_i$, and $p(x_{ij})$ is the probability of inserting $x_i$ at the $j$th position, $1 \le j \le i$.

Expected Running Time of Insertion Sort

$x_1, x_2, \ldots, x_{i-1}, x_i, \ldots, x_n$

How many comparisons does it take to insert the $i$th element at the $j$th position?

Position    # of Comparisons
i           1
i-1         2
i-2         3
...         ...
2           i-1
1           i-1

Note: positions 2 and 1 both need i-1 comparisons. Why? To insert the element at position 2 we must compare it with the element that was previously first, and after that comparison we already know which of the two comes first and which comes second. Position 1 therefore costs no additional comparison.

Thus,

$E(X_i) = \frac{1}{i}\left\{ \sum_{k=1}^{i-1} k + (i-1) \right\}$

where $1/i$ is the probability of inserting at the $j$th of the $i$ possible positions.

For $n$ elements (note that $X_1 = 0$),

$E(X_1 + X_2 + \cdots + X_n) = \sum_{i=2}^{n} E(X_i)$

$= \sum_{i=2}^{n} \frac{1}{i}\left\{ \sum_{k=1}^{i-1} k + (i-1) \right\} = \frac{(n-1)(n+4)}{4} - (H_n - 1)$, where $H_n = \sum_{i=1}^{n} 1/i$.

Therefore the average case of insertion sort takes $\Theta(n^2)$ comparisons.
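Evaluating the double sum on this slide gives $n(n-1)/4 + n - H_n$, with $H_n$ the $n$th harmonic number. The sketch below checks that value by brute force for small $n$, assuming every permutation of the input is equally likely. The helper names (`count_comparisons`, `average_comparisons`, `closed_form`) are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def count_comparisons(values):
    """Number of key comparisons insertion sort makes on `values`."""
    items = list(values)
    comparisons = 0
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if items[j] <= current:
                break
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return comparisons


def average_comparisons(n):
    """Exact average over all n! orderings of n distinct keys."""
    total = sum(count_comparisons(p) for p in permutations(range(n)))
    return Fraction(total, factorial(n))


def closed_form(n):
    """n(n-1)/4 + n - H_n, the value of the sum of E(X_i) for i = 2..n."""
    harmonic = sum(Fraction(1, i) for i in range(1, n + 1))
    return Fraction(n * (n - 1), 4) + n - harmonic
```

Exact rational arithmetic is used so the two values can be compared with `==` rather than with a floating-point tolerance.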

For $n$ elements, the expected time taken is

$T = \sum_{i=2}^{n} \frac{1}{i}\left\{ \sum_{k=1}^{i-1} k + (i-1) \right\}$

where $1/i$ is the probability of inserting at the $r$th of the $i$ possible positions.

$E(X_1 + X_2 + \cdots + X_n) = \sum_{i=1}^{n} E(X_i)$

where $E(X_i)$ is the expected number of comparisons to insert the element $x_i$ (and $X_1 = 0$).

$T = \frac{(n-1)(n+4)}{4} - (H_n - 1)$, where $H_n = \sum_{i=1}^{n} 1/i$.

Therefore the average case of insertion sort takes $\Theta(n^2)$.

Quick-Sort

Pick the first item from the array and call it the pivot.
Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot.
Use recursion to sort the two partitions.

[Figure: partition 1: items ≤ pivot | pivot | partition 2: items > pivot]

Quicksort: Expected number of comparisons

Partition may generate the splits (0 : n-1, 1 : n-2, 2 : n-3, …, n-2 : 1, n-1 : 0), each with probability 1/n.

Let T(n) denote the expected running time.
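The recurrence that this slide sets up is not present in the transcript. A standard form consistent with the splits listed above is sketched here, under the assumption (not stated on the slide) that partitioning $n$ items costs $n-1$ comparisons and that $T(0) = T(1) = 0$:

```latex
\begin{align*}
T(n) &= (n-1) + \frac{1}{n} \sum_{k=0}^{n-1} \bigl( T(k) + T(n-1-k) \bigr) \\
     &= (n-1) + \frac{2}{n} \sum_{k=0}^{n-1} T(k) \\
     &= 2(n+1)H_n - 4n = O(n \log n),
       \qquad H_n = \sum_{i=1}^{n} \frac{1}{i}.
\end{align*}
```

As a quick check, $n = 2$ gives $T(2) = 1$ and $n = 3$ gives $T(3) = 2 + \tfrac{2}{3} = \tfrac{8}{3}$, which agree with the closed form.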

Randomized Quick-Sort

Pick an element from the array at random and call it the pivot.
Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot.
Use recursion to sort the two partitions.

[Figure: partition 1: items ≤ pivot | pivot | partition 2: items > pivot]
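The three steps of Randomized Quick-Sort can be sketched as below. This is an illustrative, out-of-place version that builds new lists rather than partitioning in place; the function name `randomized_quicksort` and the optional `rng` parameter are assumptions introduced here, not part of the slides.

```python
import random


def randomized_quicksort(items, rng=random):
    """Return a new sorted list, choosing each pivot uniformly at random.

    `rng` is a hypothetical parameter: any object with a `randrange`
    method, e.g. random.Random(seed) for reproducible runs.
    """
    if len(items) <= 1:
        return list(items)
    pivot_index = rng.randrange(len(items))      # the "coin tosses"
    pivot = items[pivot_index]
    rest = items[:pivot_index] + items[pivot_index + 1:]
    left = [x for x in rest if x <= pivot]       # partition 1: items <= pivot
    right = [x for x in rest if x > pivot]       # partition 2: items > pivot
    return (randomized_quicksort(left, rng)
            + [pivot]
            + randomized_quicksort(right, rng))
```

The output is the same for every sequence of random choices; only the number of comparisons varies from run to run.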

Remarks

Not much different from Quick-Sort, except that earlier the algorithm was deterministic and the bounds were probabilistic.

Here the algorithm itself is also randomized: we pick the pivot at random. Notice that the algorithm behaves no differently from there onwards.

In the earlier case, we can identify a worst-case input. Here no input is the worst case.

Randomized Select

$T(n) \le \frac{1}{n} \sum_{k=0}^{n-1} T(\max\{k,\, n-k-1\}) + n$
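A minimal sketch of Randomized Select follows, assuming the usual formulation: pick a random pivot, partition, and continue only in the side that contains the requested rank. The name `randomized_select`, the 0-based `rank` convention, and the `rng` parameter are assumptions made for this example.

```python
import random


def randomized_select(items, rank, rng=random):
    """Return the element of 0-based `rank` in sorted order (rank 0 = minimum).

    Hypothetical helper for illustration; duplicates are handled by
    counting how many candidates equal the pivot.
    """
    if not 0 <= rank < len(items):
        raise IndexError("rank out of range")
    candidates = list(items)
    while True:
        pivot = candidates[rng.randrange(len(candidates))]
        smaller = [x for x in candidates if x < pivot]
        equal_count = sum(1 for x in candidates if x == pivot)
        if rank < len(smaller):
            candidates = smaller                 # answer lies left of the pivot
        elif rank < len(smaller) + equal_count:
            return pivot                         # the pivot has the wanted rank
        else:
            rank -= len(smaller) + equal_count   # re-index within the right side
            candidates = [x for x in candidates if x > pivot]
```

Unlike Quick-Sort, only one side of each partition is explored, which is why the recurrence has a single $T(\cdot)$ term inside the sum.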

Randomized Algorithms

A randomized algorithm performs coin tosses (i.e., uses random bits) to control its execution.

i ← random()
if i = 0
    do A …
else { i.e. i = 1 }
    do B …

Its running time depends on the outcomes of the coin tosses.

Assumptions

Coins are unbiased, and coin tosses are independent.

The worst-case running time of a randomized algorithm may be large but occurs with very low probability (e.g., it occurs when all the coin tosses give “heads”).

Monte Carlo Algorithms

Running times are guaranteed, but the output may not be completely correct.

Probability of error is low.

Las Vegas Algorithms

Output is guaranteed to be correct.

Bounds on running times hold with high probability.

What type of algorithm is Randomized Qsort?

Why expected running times?

Markov’s inequality: for a non-negative random variable $X$ with $E(X) > 0$ and any $k > 0$,

$P(X > k\,E(X)) < 1/k$

i.e. the probability that the algorithm will take more than $2E(X)$ time is less than 1/2, and the probability that it will take more than $10E(X)$ time is less than 1/10.

This is the reason why Qsort does well in practice.
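A short sketch of why the bound holds, in its usual non-strict form and for a non-negative random variable $X$ and threshold $a > 0$ (the symbol $a$ and the indicator notation $\mathbf{1}[\cdot]$ are introduced here for the derivation):

```latex
\begin{align*}
E(X) &\ge E\bigl(X \cdot \mathbf{1}[X \ge a]\bigr)
      \ge a \cdot P(X \ge a)
  && \text{since } X \ge 0, \\
P(X \ge a) &\le \frac{E(X)}{a}, \\
P\bigl(X \ge k\,E(X)\bigr) &\le \frac{1}{k}
  && \text{taking } a = k\,E(X).
\end{align*}
```

With $k = 2$ this says a running time of at least twice the expectation occurs with probability at most $1/2$.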

Markov’s Bound

$P(X > k\mu) < 1/k$, where $k$ is a constant and $\mu = E(X)$.

Chernoff’s Bound

$P(X > 2\mu) < 1/2$

A Stronger Result

$P(X > k\mu) < 1/n^k$, where $k$ is a constant.

Binary Search Tree

What is a binary search tree?

A BST is a possibly empty rooted tree whose root holds a key value and has a possibly empty left subtree and a possibly empty right subtree.

Each of the left subtree and the right subtree is itself a BST.

Binary Search Tree

Pick the first item from the array and call it the pivot; it becomes the root of the BST.

Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot.

Recursively build a BST on each partition. They become the left and the right subtrees of the root.

Binary Search Tree

Consider the following input:

1, 2, 3, …, 10,000.

What is the time for construction? What is the search time?

Randomly Built Binary Search Tree

Pick an item from the array at random and call it the pivot; it becomes the root of the BST.

Partition the items in the array around the pivot so that all elements to the left are ≤ the pivot and all elements to the right are greater than the pivot.

Recursively build a BST on each partition. They become the left and the right subtrees of the root.

Example

Consider the input

10, 20, 30, 40, 50, 60, 70, 80, 90, 100.
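The two constructions (first item as pivot versus random pivot) can be compared on this input with the sketch below. The tuple representation `(key, left, right)`, the function names, and the convention that the empty tree has height 0 are all assumptions made for illustration.

```python
import random


def build_bst(keys, rng=None):
    """Build a BST as nested tuples (key, left, right); None is the empty tree.

    With rng=None the first key is the pivot (deterministic construction);
    otherwise the pivot is chosen uniformly at random using `rng`.
    """
    if not keys:
        return None
    index = 0 if rng is None else rng.randrange(len(keys))
    pivot = keys[index]
    rest = keys[:index] + keys[index + 1:]
    left = [k for k in rest if k <= pivot]
    right = [k for k in rest if k > pivot]
    return (pivot, build_bst(left, rng), build_bst(right, rng))


def height(tree):
    """Number of nodes on the longest root-to-leaf path (0 for the empty tree)."""
    if tree is None:
        return 0
    _, left, right = tree
    return 1 + max(height(left), height(right))


def inorder(tree):
    """Keys in sorted order, used here to check the BST property."""
    if tree is None:
        return []
    key, left, right = tree
    return inorder(left) + [key] + inorder(right)
```

On the sorted input above, `height(build_bst(keys))` is 10 because every right subtree hangs off the previous key, whereas `build_bst(keys, random.Random())` typically produces a much shallower tree. The recursive sketch is meant for small inputs only; a sorted input of 10,000 keys would exceed Python's default recursion limit.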

Height of the RBST

WLOG, assume that the keys are distinct. (What if they are not?)

Rank(x) = number of elements < x
Let $X_i$: height of the tree rooted at a node with rank $i$.
Let $Y_i$: exponential height of the tree $= 2^{X_i}$.
Let $H$: height of the entire BST; then

$H = \max\{H_1, H_2\} + 1$

where $H_1$: height of the left subtree, $H_2$: height of the right subtree.

$Y = 2^H = 2 \cdot \max\{2^{H_1}, 2^{H_2}\}$

$E(EH(T(n)))$: expected value of the exponential height of the tree with $n$ nodes.

$E(EH(T(n))) = \frac{2}{n} \sum_{k=0}^{n-1} E(\max\{EH(T(k)),\, EH(T(n-1-k))\}) = O(n^3)$

$E(H(T(n))) = E(\log EH(T(n))) \le \log E(EH(T(n))) = O(\log n)$, where the inequality is Jensen's inequality applied to the concave function $\log$.

Construction time?
Search time?
What is the worst-case input?

Acknowledgements

Kunal Verma
Nidhi Aggarwal
And other students of MSc(CS) batch 2009.

Hashing

Motivation: symbol tables

A compiler uses a symbol table to relate symbols to associated data.
Symbols: variable names, procedure names, etc.
Associated data: memory location, call graph, etc.

For a symbol table (also called a dictionary), we care about search, insertion, and deletion.

We typically don’t care about sorted order.

Hash Tables

More formally:

Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
Insert(T, x)
Delete(T, x)
Search(T, x)

We want these to be fast, but don’t care about sorting the records.

The structure we will use is a hash table. It supports all the above in O(1) expected time!

Hash Functions

Next problem: collision

[Figure: keys k1, …, k5 from the set K of actual keys, inside the universe of keys U, are mapped by h to slots 0 … m - 1 of table T; h(k2) = h(k5) is a collision.]

Resolving Collisions

How can we solve the problem of collisions?
One solution: chaining
Other solutions: open addressing

Chaining

Chaining puts elements that hash to the same slot in a linked list:

[Figure: table T whose slots point to linked lists of the keys k1, …, k8, drawn from K (actual keys) within U (universe of keys); keys that hash to the same slot, e.g. k1 and k4, share a list.]

Chaining

How do we insert an element?

[Figure: chained hash table T with keys k1, …, k8.]

Chaining

How do we delete an element?

[Figure: chained hash table T with keys k1, …, k8.]

Chaining

How do we search for an element with a given key?

[Figure: chained hash table T with keys k1, …, k8.]
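One possible answer to the three questions above (insert, delete, search) is sketched below. It is a minimal illustration rather than a definitive implementation: the class name `ChainedHashTable`, its method names, the use of Python lists in place of linked lists, and the use of the built-in `hash` as the hash function h are all assumptions.

```python
class ChainedHashTable:
    """Hash table with chaining; each slot holds a list standing in for a linked list."""

    def __init__(self, num_slots=8):
        self.slots = [[] for _ in range(num_slots)]   # table T with m slots
        self.size = 0                                 # number of stored keys n

    def _chain(self, key):
        # h(key) = hash(key) mod m (an assumed hash function).
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        chain = self._chain(key)
        for index, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[index] = (key, value)           # overwrite existing record
                return
        chain.append((key, value))
        self.size += 1

    def search(self, key):
        # Walk the chain of the key's slot; cost is proportional to chain length.
        for existing_key, value in self._chain(key):
            if existing_key == key:
                return value
        return None

    def delete(self, key):
        chain = self._chain(key)
        for index, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                del chain[index]
                self.size -= 1
                return True
        return False

    def load_factor(self):
        return self.size / len(self.slots)            # n / m
```

All three operations touch only the chain of one slot, so their cost is governed by the chain length, which is what the analysis of chaining measures.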


Analysis of Chaining

Assume simple uniform hashing: each key in the table is equally likely to be hashed to any slot.

Given n keys and m slots in the table, the load factor $\alpha = n/m$ = average # keys per slot.

What will be the average cost of an unsuccessful search for a key? A: $O(1 + \alpha)$

What will be the average cost of a successful search? A: $O((1 + \alpha)/2) = O(1 + \alpha)$

Analysis of Chaining Continued

So the cost of searching = $O(1 + \alpha)$.

If the number of keys n is proportional to the number of slots in the table, what is $\alpha$? A: $\alpha = O(1)$

In other words, we can make the expected cost of searching constant if we make $\alpha$ constant.

A Final Word About Randomized Algorithms

If we could prove this,

$P(\text{failure}) < 1/k$ (we are sort of happy)

$P(\text{failure}) < 1/n^k$ (most of the time this is true and we’re happy)

$P(\text{failure}) < 1/2^n$ (this is difficult, but we still want it)


END