data structures and algorithm

Data Structures and Algorithm Xiaoqing Zheng


Post on 09-Dec-2021


Page 1: Data Structures and Algorithm

Data Structures and Algorithm

Xiaoqing Zheng

Page 2: Data Structures and Algorithm

MULTIPOP

[Figure: stack S holding six items, top[S] = 6; after MULTIPOP(S, 4), top[S] = 2.]

MULTIPOP(S, k)
1. while not STACK-EMPTY(S) and k ≠ 0
2.     do POP(S)
3.        k ← k − 1
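The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the class name `Stack`, the list-backed storage, the convention that each method returns its actual cost, and the item values 1 to 6 are all assumptions made for the example.

```python
class Stack:
    """List-backed stack; every push or pop counts as one unit of actual cost."""

    def __init__(self):
        self.items = []

    def push(self, x):
        self.items.append(x)
        return 1  # actual cost of PUSH

    def pop(self):
        self.items.pop()
        return 1  # actual cost of POP

    def multipop(self, k):
        """Pop up to k items; the returned actual cost is min(k, s)."""
        cost = 0
        while self.items and k != 0:
            cost += self.pop()
            k -= 1
        return cost


s = Stack()
for x in [1, 2, 3, 4, 5, 6]:   # hypothetical item values
    s.push(x)
popped_cost = s.multipop(4)    # leaves two items on the stack
```

Asking for more items than the stack holds simply empties it, which is why the actual cost is min(k, s) rather than k.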

Page 3: Data Structures and Algorithm

Analysis of MULTIPOP

The worst-case cost of a MULTIPOP operation is O(n). The worst-case time of any stack operation is therefore O(n).

Consider a sequence of n PUSH, POP, and MULTIPOP operations on an initially empty stack.

By this reasoning, a sequence of n operations costs O(n^2).

This bound is not tight!

Page 4: Data Structures and Algorithm

Amortized analysis

An amortized analysis is any strategy for analyzing a sequence of operations to show that the average cost per operation is small, even though a single operation within the sequence might be expensive.

Even though we're taking averages, however, probability is not involved!

An amortized analysis guarantees the average performance of each operation in the worst case.

Page 5: Data Structures and Algorithm

Types of amortized analyses

Three common amortization arguments:

Aggregate method
Accounting method
Potential method

The aggregate method, though simple, lacks the precision of the other two methods. In particular, the accounting and potential methods allow a specific amortized cost to be allocated to each operation.

Page 6: Data Structures and Algorithm

Aggregate analysis

We show that for all n, a sequence of n operations takes worst-case time T(n) in total. Hence, the average cost, or amortized cost, per operation is T(n)/n.

Page 7: Data Structures and Algorithm

MULTIPOP (aggregate method)

Each object can be popped at most once for each time it is pushed. Therefore, the number of times that POP can be called on a nonempty stack, including calls within MULTIPOP, is at most the number of PUSH operations, which is at most n.

Any sequence of n PUSH, POP, and MULTIPOP operations takes a total of O(n) time. The average cost of an operation is T(n)/n = O(1).
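The aggregate argument can be checked empirically. The sketch below is an illustration under assumptions: the function names, the random operation mix, and the choice to treat POP on an empty stack as a free no-op are not from the slides. It totals the actual cost of a random sequence, which should never exceed 2n because every pop is paid for by an earlier push.

```python
import random


def total_cost(ops):
    """Total actual cost of a sequence of stack operations.

    ops is a list of (name, arg) pairs with name in
    {"push", "pop", "multipop"}; arg is the pushed value or k.
    """
    stack, cost = [], 0
    for name, arg in ops:
        if name == "push":
            stack.append(arg)
            cost += 1
        elif name == "pop":
            if stack:  # assumption: POP on an empty stack costs nothing
                stack.pop()
                cost += 1
        else:
            k = min(arg, len(stack))      # MULTIPOP removes min(k, s) items
            del stack[len(stack) - k:]
            cost += k
    return cost


def random_ops(n, seed=0):
    """Build n random operations (hypothetical workload)."""
    rng = random.Random(seed)
    return [(rng.choice(["push", "pop", "multipop"]), rng.randint(1, 10))
            for _ in range(n)]
```

For any n, `total_cost(random_ops(n))` stays at or below 2n, so the average cost per operation is bounded by a constant.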

Page 8: Data Structures and Algorithm

8-bit binary counter

Counter value   A[7] A[6] A[5] A[4] A[3] A[2] A[1] A[0]   Total cost
0               0    0    0    0    0    0    0    0      0
1               0    0    0    0    0    0    0    1      1
2               0    0    0    0    0    0    1    0      3
3               0    0    0    0    0    0    1    1      4
4               0    0    0    0    0    1    0    0      7
5               0    0    0    0    0    1    0    1      8
6               0    0    0    0    0    1    1    0      10
7               0    0    0    0    0    1    1    1      11
8               0    0    0    0    1    0    0    0      15

Page 9: Data Structures and Algorithm

Incrementing a binary counter

An array A[0 … k − 1] of bits, where length[A] = k, serves as the counter. With A[0] as the lowest-order bit, the counter holds the value

\[ x = \sum_{i=0}^{k-1} A[i] \cdot 2^i . \]

INCREMENT(A)
1. i ← 0
2. while i < length[A] and A[i] = 1
3.     do A[i] ← 0
4.        i ← i + 1
5. if i < length[A]
6.     then A[i] ← 1
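A direct Python rendering of INCREMENT is sketched below. Returning the number of flipped bits as the actual cost, and the helper `value`, are additions made for illustration only.

```python
def increment(A):
    """Increment the counter in A (A[0] is the lowest-order bit).

    Returns the number of bits flipped, taken here as the actual cost.
    """
    i = 0
    flips = 0
    while i < len(A) and A[i] == 1:
        A[i] = 0          # reset a trailing 1
        flips += 1
        i += 1
    if i < len(A):
        A[i] = 1          # set the first 0 bit, unless the counter overflowed
        flips += 1
    return flips


def value(A):
    """Integer value of the bit array: sum of A[i] * 2**i."""
    return sum(bit << i for i, bit in enumerate(A))


A = [0] * 8
costs = [increment(A) for _ in range(8)]
```

Summing `costs` gives the running "total cost" of the counter after eight increments.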

Page 10: Data Structures and Algorithm

Binary counter (aggregate method)

For i = 0, 1, …, ⌊lg n⌋, bit A[i] flips ⌊n/2^i⌋ times in a sequence of n INCREMENT operations on an initially zero counter. Bits A[i] with i > ⌊lg n⌋ never flip.

The total number of flips in the sequence is thus

\[ \sum_{i=0}^{\lfloor \lg n \rfloor} \left\lfloor \frac{n}{2^i} \right\rfloor < n \sum_{i=0}^{\infty} \frac{1}{2^i} = 2n = O(n). \]

The amortized cost per operation is O(n)/n = O(1).
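The flip count can be compared against the sum directly. The following sketch is a check, not part of the lecture: the function names and the 32-bit counter width are assumptions. It counts flips over n increments of a zero counter and computes the floor sum for comparison.

```python
def total_flips(n, k=32):
    """Count bit flips over n increments of an initially zero k-bit counter."""
    A = [0] * k
    flips = 0
    for _ in range(n):
        i = 0
        while i < k and A[i] == 1:
            A[i] = 0
            flips += 1
            i += 1
        if i < k:
            A[i] = 1
            flips += 1
    return flips


def floor_sum(n):
    """Sum of floor(n / 2**i) for i = 0 .. floor(lg n)."""
    # n.bit_length() equals floor(lg n) + 1 for n >= 1
    return sum(n >> i for i in range(n.bit_length()))
```

For n well below 2^k the two functions agree, and both stay strictly below 2n.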

Page 11: Data Structures and Algorithm

MULTIPOP (accounting method)

Operation   Actual cost   Amortized cost
PUSH        1             2
POP         1             0
MULTIPOP    min(k, s)     0

[Figure: after one PUSH, top[S] = 1 and the pushed item carries $1 of credit.]

Page 12: Data Structures and Algorithm

MULTIPOP (accounting method)

[Figure: two further PUSH operations raise top[S] to 2 and then 3; each of the three items carries $1 of credit, for $1 + $1 + $1 in total.]

Page 13: Data Structures and Algorithm

MULTIPOP (accounting method)

[Figure: a POP lowers top[S] from 3 to 2; the $1 stored on the popped item pays for the operation.]

Page 14: Data Structures and Algorithm

MULTIPOP (accounting method)

[Figure: MULTIPOP(S, 2) lowers top[S] from 2 to 0; the $1 stored on each popped item pays for removing it.]

Page 15: Data Structures and Algorithm

Accounting method

We assign differing charges to different operations, with some operations charged more or less than they actually cost. The amount we charge an operation is called its amortized cost.

When an operation's amortized cost exceeds its actual cost, the difference is assigned to specific objects in the data structure as credit.
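As an illustration of the credit idea, the sketch below charges stack operations the amortized costs 2, 0, 0 for PUSH, POP, MULTIPOP and tracks the stored credit. The function name and operation encoding are assumptions for the example. The internal assertion checks that the credit always equals the number of items on the stack, so it never goes negative.

```python
AMORTIZED = {"push": 2, "pop": 0, "multipop": 0}


def run_with_credit(ops):
    """Return (total_actual, total_amortized) for a list of (name, arg) ops."""
    stack, credit = [], 0
    total_actual = total_amortized = 0
    for name, arg in ops:
        if name == "push":
            stack.append(arg)
            actual = 1
        else:
            k = 1 if name == "pop" else arg
            actual = min(k, len(stack))        # items actually removed
            del stack[len(stack) - actual:]
        credit += AMORTIZED[name] - actual     # overcharge is banked, undercharge is withdrawn
        assert credit == len(stack)            # each stored item carries $1
        total_actual += actual
        total_amortized += AMORTIZED[name]
    return total_actual, total_amortized


ops = [("push", 1), ("push", 2), ("push", 3),
       ("pop", None), ("multipop", 2), ("multipop", 5)]
actual, amortized = run_with_credit(ops)
```

Because the credit is never negative, the total amortized charge is always at least the total actual cost.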

Page 16: Data Structures and Algorithm

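Accounting method

We denote the actual cost of the ith operation by c_i and the amortized cost of the ith operation by ĉ_i. For every sequence of n operations we require

\[ \sum_{i=1}^{n} \hat{c}_i \ge \sum_{i=1}^{n} c_i . \]

Why? The difference between the two sums is the total credit stored in the data structure. If it can never be negative, the total amortized cost is an upper bound on the total actual cost.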

Page 17: Data Structures and Algorithm

Binary counter (accounting method)

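Operation                 Amortized cost
Set a bit to 1            2
Flip the bit back to 0    0

The $2 charged for setting a bit pays $1 for the set itself and leaves $1 of credit on the bit to pay for flipping it back to 0. Each INCREMENT sets at most one bit, so its amortized cost is at most 2.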

Page 18: Data Structures and Algorithm

Potential method

For each i = 1, 2, …, n, we let c_i be the actual cost of the ith operation and D_i be the data structure that results after applying the ith operation to data structure D_{i−1}.

A potential function Φ maps each data structure D_i to a real number Φ(D_i), which is the potential associated with data structure D_i.

The amortized cost ĉ_i of the ith operation with respect to potential function Φ is defined by

\[ \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}). \]

Page 19: Data Structures and Algorithm

Potential method

The total amortized cost of the n operations is

\[ \sum_{i=1}^{n} \hat{c}_i = \sum_{i=1}^{n} \bigl( c_i + \Phi(D_i) - \Phi(D_{i-1}) \bigr) = \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_0). \]

If we define a potential function Φ so that Φ(D_n) ≥ Φ(D_0), then the total amortized cost \(\sum_{i=1}^{n} \hat{c}_i\) is an upper bound on the total actual cost \(\sum_{i=1}^{n} c_i\).

Page 20: Data Structures and Algorithm

MULTIPOP (potential method)

We define the potential function Φ on a stack to be the number of objects in the stack. Let s be the number of objects on the stack before the ith operation.

PUSH operation:

\[ \Phi(D_i) - \Phi(D_{i-1}) = (s + 1) - s = 1 \]
\[ \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = 1 + 1 = 2 \]

MULTIPOP(S, k) operation, with k' = min(k, s):

\[ \Phi(D_i) - \Phi(D_{i-1}) = -k' \]
\[ \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = k' - k' = 0 \]

POP operation, with k' = min(1, s):

\[ \Phi(D_i) - \Phi(D_{i-1}) = -k' \]
\[ \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = k' - k' = 0 \]
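The same bookkeeping can be done mechanically. In the sketch below (function name and operation encoding are assumptions for illustration), Φ is the stack size and each operation's amortized cost is computed as actual cost plus the change in Φ.

```python
def amortized_costs(ops):
    """Return [(name, amortized_cost)] using Phi = number of items on the stack."""
    stack = []
    result = []
    for name, arg in ops:
        phi_before = len(stack)
        if name == "push":
            stack.append(arg)
            actual = 1
        else:
            k = 1 if name == "pop" else arg
            actual = min(k, len(stack))        # k' in the derivation above
            del stack[len(stack) - actual:]
        phi_after = len(stack)
        result.append((name, actual + phi_after - phi_before))
    return result


trace = amortized_costs([("push", "a"), ("push", "b"), ("pop", None),
                         ("push", "c"), ("multipop", 7), ("pop", None)])
```

Every PUSH comes out at 2 and every POP or MULTIPOP at 0, regardless of how many items are actually removed.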

Page 21: Data Structures and Algorithm

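Binary counter (potential method)

We define the potential of the counter after the ith INCREMENT operation to be b_i, the number of 1's in the counter after the ith operation. Suppose that the ith INCREMENT operation resets t_i bits, so that its actual cost is c_i = t_i + 1.

\[ \Phi(D_i) - \Phi(D_{i-1}) = (b_{i-1} - t_i + 1) - b_{i-1} = 1 - t_i \]

\[ \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = (t_i + 1) + (1 - t_i) = 2 \]

(If the increment overflows the counter, no bit is set to 1 and both equalities become ≤.)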

Page 22: Data Structures and Algorithm

8-bit binary counter

[Figure: the 8-bit counter table shown on Page 8.]

For the increment from 5 to 6 (b_{i−1} = 2, t_i = 1, c_i = 2):

\[ \Phi(D_i) - \Phi(D_{i-1}) = (b_{i-1} - t_i + 1) - b_{i-1} = (2 - 1 + 1) - 2 = 0 \]

\[ \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = 2 + 0 = 2 \]

Page 23: Data Structures and Algorithm

8-bit binary counter

[Figure: the 8-bit counter table shown on Page 8.]

For the increment from 7 to 8 (b_{i−1} = 3, t_i = 3, c_i = 4):

\[ \Phi(D_i) - \Phi(D_{i-1}) = (b_{i-1} - t_i + 1) - b_{i-1} = (3 - 3 + 1) - 3 = -2 \]

\[ \hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}) = 4 - 2 = 2 \]

Page 24: Data Structures and Algorithm

Binary counter (potential method)

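\[ \sum_{i=1}^{n} \hat{c}_i = \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_0) \]

\[ \sum_{i=1}^{n} c_i = \sum_{i=1}^{n} \hat{c}_i - \Phi(D_n) + \Phi(D_0) = \sum_{i=1}^{n} 2 - b_n + b_0 = 2n - b_n + b_0 = O(n) \]

The last step uses b_0 ≤ k, so the bound is O(n) whenever k = O(n). A counter that starts at zero has b_0 = 0.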
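The identity for the total actual cost holds even when the counter does not start at zero, as long as it never overflows. The sketch below checks this; the helper names, the 16-bit width, and the starting value 5 are assumptions chosen for illustration.

```python
def increment(A):
    """Increment bit array A in place (A[0] is the low bit); return bits flipped."""
    i = flips = 0
    while i < len(A) and A[i] == 1:
        A[i] = 0
        flips += 1
        i += 1
    if i < len(A):
        A[i] = 1
        flips += 1
    return flips


def flips_from(start_bits, n):
    """Run n increments from start_bits; return (total_flips, b_0, b_n)."""
    A = list(start_bits)
    b0 = sum(A)                                # number of 1 bits at the start
    flips = sum(increment(A) for _ in range(n))
    return flips, b0, sum(A)


# Start at value 5 (bits 1, 0, 1 from the low end) in a 16-bit counter.
flips, b0, bn = flips_from([1, 0, 1] + [0] * 13, 100)
```

Here `flips` equals 2n − b_n + b_0 with n = 100, matching the derivation term by term.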

Page 25: Data Structures and Algorithm

Hash tables by chaining

Expected time to search for a record with a given key:

\[ \Theta(1 + \alpha) \]

The 1 accounts for applying the hash function and accessing the slot; the α accounts for searching the list.

α = n/m = average number of keys per slot.

Expected search time = Θ(1) if α = O(1), or equivalently, if n = O(m).

Page 26: Data Structures and Algorithm

Hash tables by open addressing

Theorem. Given an open-addressed hash table with load factor α = n/m < 1, the expected number of probes in an unsuccessful search is at most 1/(1 − α).

Theorem. Given an open-addressed hash table with load factor α = n/m < 1, the expected number of probes in a successful search is at most

\[ \frac{1}{\alpha} \ln \frac{1}{1 - \alpha} . \]
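To get a feel for how quickly these bounds grow as the table fills, they can be evaluated numerically. The function names below are hypothetical; the formulas are exactly the two bounds stated in the theorems.

```python
import math


def unsuccessful_bound(alpha):
    """Upper bound on expected probes in an unsuccessful search: 1 / (1 - alpha)."""
    if not 0 <= alpha < 1:
        raise ValueError("load factor must satisfy 0 <= alpha < 1")
    return 1 / (1 - alpha)


def successful_bound(alpha):
    """Upper bound on expected probes in a successful search: (1/alpha) ln(1/(1-alpha))."""
    if not 0 < alpha < 1:
        raise ValueError("load factor must satisfy 0 < alpha < 1")
    return (1 / alpha) * math.log(1 / (1 - alpha))
```

At α = 0.5 the unsuccessful-search bound is 2 probes, and it rises to roughly 10 probes at α = 0.9, which is one reason to keep the table from getting too full.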

Page 27: Data Structures and Algorithm

How large should a hash table be?

Goal: Make the table as small as possible, but large enough so that it won't overflow (or otherwise become inefficient).

Problem: What if we don't know the proper size in advance?

Solution: Dynamic tables.

IDEA: Whenever the table overflows, "grow" it by allocating a new, larger table. Move all items from the old table into the new one, and free the storage for the old table.
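A minimal sketch of this idea follows, assuming the table doubles on overflow and starts with size 0. The class name `DynamicTable`, the doubling factor, and the cost convention (one unit per item written, including moved items) are assumptions made for illustration.

```python
class DynamicTable:
    """Array that doubles its capacity whenever an insert finds it full."""

    def __init__(self):
        self.size = 0       # allocated slots
        self.num = 0        # stored items
        self.slots = []

    def insert(self, x):
        """Insert x and return the actual cost (1 per item written)."""
        cost = 0
        if self.num == self.size:                  # overflow: grow the table
            new_size = max(1, 2 * self.size)
            new_slots = [None] * new_size
            for j in range(self.num):              # move old items across
                new_slots[j] = self.slots[j]
                cost += 1
            self.slots, self.size = new_slots, new_size
        self.slots[self.num] = x
        self.num += 1
        return cost + 1


t = DynamicTable()
costs = [t.insert(i) for i in range(1, 11)]
```

Most insertions cost 1; only the ones that trigger a move are expensive, and those become rarer as the table grows.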

Page 28: Data Structures and Algorithm

Example of a dynamic table

Step                   Table size   Contents
1. INSERT              1            1
2. INSERT (overflow)   2            1 2
3. INSERT (overflow)   4            1 2 3
4. INSERT              4            1 2 3 4
5. INSERT (overflow)   8            1 2 3 4 5
6. INSERT              8            1 2 3 4 5 6
7. INSERT              8            1 2 3 4 5 6 7

On each overflow, a new table of twice the size is allocated and the existing items are moved into it before the new item is inserted.

Page 40: Data Structures and Algorithm

Worst-case analysis

Consider a sequence of n insertions. The worst-case time to execute one insertion is Θ(n). Therefore, the worst-case time for n insertions is n · Θ(n) = Θ(n^2).

This bound is not tight! In fact, the worst-case cost for n insertions is only Θ(n).

Let's see why.

Page 41: Data Structures and Algorithm

Tighter analysis

Let c_i = the cost of the ith insertion:

\[ c_i = \begin{cases} i & \text{if } i - 1 \text{ is an exact power of 2}, \\ 1 & \text{otherwise}. \end{cases} \]

i        1   2   3   4   5   6   7   8   9   10
size_i   1   2   4   4   8   8   8   8   16  16
c_i      1   2   3   1   5   1   1   1   9   1

Page 42: Data Structures and Algorithm

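Tighter analysis (cont.)

Let c_i = the cost of the ith insertion:

\[ c_i = \begin{cases} i & \text{if } i - 1 \text{ is an exact power of 2}, \\ 1 & \text{otherwise}. \end{cases} \]

Splitting each c_i into the cost of the insertion itself and the cost of moving old items:

i            1   2   3   4   5   6   7   8   9   10
size_i       1   2   4   4   8   8   8   8   16  16
c_i insert   1   1   1   1   1   1   1   1   1   1
c_i move         1   2       4               8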

Page 43: Data Structures and Algorithm

Tighter analysis (aggregate method)

The total cost of n TABLE-INSERT operations is therefore

\[ \sum_{i=1}^{n} c_i \le n + \sum_{j=0}^{\lfloor \lg n \rfloor} 2^j \le n + 2n = 3n = \Theta(n). \]

Thus, the average cost of each dynamic-table operation is Θ(n)/n = Θ(1).
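The 3n bound is easy to confirm numerically from the cost formula alone. The helper names below are assumptions; `insertion_cost` encodes the rule that the ith insertion costs i when i − 1 is an exact power of 2 and 1 otherwise.

```python
def insertion_cost(i):
    """Cost of the ith insertion (i >= 1) into a doubling table."""
    m = i - 1
    is_power_of_two = m > 0 and (m & (m - 1)) == 0
    return i if is_power_of_two else 1


def total_insertion_cost(n):
    """Total cost of the first n insertions."""
    return sum(insertion_cost(i) for i in range(1, n + 1))
```

For every n tried, `total_insertion_cost(n)` is at least n and at most 3n, so the average per insertion is bounded by a constant.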

Page 44: Data Structures and Algorithm

Accounting analysis of dynamic tables

Charge an amortized cost of ĉ_i = $3 for the ith insertion.

$1 pays for the immediate insertion.
$2 is stored for later table doubling.

When the table doubles, $1 pays to move a recent item, and $1 pays to move an old item.
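A small simulation can confirm that a $3 charge is enough and that $2 is not. This is a sketch under assumptions: the function name is hypothetical, the table starts at size 0, and a single aggregate bank balance is tracked rather than per-item credits.

```python
def bank_history(n, charge=3):
    """Bank balance after each of n insertions into a doubling table."""
    bank, size, num = 0, 0, 0
    history = []
    for _ in range(n):
        actual = 1                      # cost of writing the new item
        if num == size:                 # overflow: pay to move every old item
            actual += num
            size = max(1, 2 * size)
        num += 1
        bank += charge - actual         # deposit the charge, withdraw the actual cost
        history.append(bank)
    return history
```

With `charge=3` the balance never drops below zero, so the amortized charges cover all actual costs; with `charge=2` the balance goes negative at the fifth insertion.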

Page 45: Data Structures and Algorithm

Accounting analysis of dynamic tables

State                                       Table size   Credit per stored item
After 4 INSERTs (the 5th will overflow)     4            $0 $0 $2 $2
After doubling and moving the 4 items       8            $0 $0 $0 $0
After the 5th INSERT                        8            $0 $0 $0 $0 $2
After the 6th and 7th INSERTs               8            $0 $0 $0 $0 $2 $2 $2

Page 49: Data Structures and Algorithm

Dynamic table (potential method)

We start by defining a potential function Φ that is 0 immediately after an expansion but builds to the table size by the time the table is full. One such function is

Φ(T) = 2 · num[T] − size[T]

num_i denotes the number of items stored in the table after the ith operation.
size_i denotes the total size of the table after the ith operation.
Φ_i denotes the potential after the ith operation.

Page 50: Data Structures and Algorithm

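Dynamic table (potential method)

If the ith TABLE-INSERT operation does not trigger an expansion, then size_i = size_{i−1} and num_{i−1} = num_i − 1, so the amortized cost is

\[
\begin{aligned}
\hat{c}_i &= c_i + \Phi_i - \Phi_{i-1} \\
&= 1 + (2 \cdot num_i - size_i) - (2 \cdot num_{i-1} - size_{i-1}) \\
&= 1 + (2 \cdot num_i - size_i) - (2 \cdot (num_i - 1) - size_i) \\
&= 3.
\end{aligned}
\]

If the ith TABLE-INSERT operation triggers an expansion, then size_i = 2 · (num_i − 1) and size_{i−1} = num_{i−1} = num_i − 1, so the amortized cost is

\[
\begin{aligned}
\hat{c}_i &= c_i + \Phi_i - \Phi_{i-1} \\
&= num_i + (2 \cdot num_i - size_i) - (2 \cdot num_{i-1} - size_{i-1}) \\
&= num_i + (2 \cdot num_i - 2 \cdot (num_i - 1)) - (2 \cdot (num_i - 1) - (num_i - 1)) \\
&= 3.
\end{aligned}
\]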
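The two cases can be verified together by computing the amortized cost of each insertion from the potential. The sketch below assumes a table that starts empty with size 0 and grows to size 1 on the first insertion; under that assumption the very first insertion comes out at 2 rather than 3, since the expansion formula size_i = 2 · (num_i − 1) does not apply to it. The function name is hypothetical.

```python
def amortized_insert_costs(n):
    """Amortized cost of each of n insertions, using Phi = 2*num - size."""
    num = size = 0
    costs = []
    for _ in range(n):
        phi_before = 2 * num - size
        actual = 1
        if num == size:                 # expansion: move all num old items
            actual += num
            size = max(1, 2 * size)
        num += 1
        phi_after = 2 * num - size
        costs.append(actual + phi_after - phi_before)
    return costs


costs = amortized_insert_costs(100)
```

Apart from the first entry, every value in `costs` is exactly 3, whether or not the insertion triggered an expansion.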

Page 51: Data Structures and Algorithm

Table expansion and contraction

Table contraction is analogous to table expansion: when the number of items in the table drops too low, we allocate a new, smaller table and then copy the items from the old table into the new one.

A natural strategy for expansion and contraction is to double the table size when an item is inserted into a full table and halve the size when a deletion would cause the table to become less than half full.

This strategy can cause the amortized cost of an operation to be quite large.

Page 52: Data Structures and Algorithm

Problem of natural strategy

Starting from a full table, we perform the following sequence: I, D, D, I, I, D, D, I, I, …

Operation   Contents     Event
(start)     1 2 3 4
INSERT      1 2 3 4 5    Expansion
DELETE      1 2 3 4
DELETE      1 2 3        Contraction
INSERT      1 2 3 4
INSERT      1 2 3 4 5    Expansion
DELETE      1 2 3 4
DELETE      1 2 3        Contraction

Page 53: Data Structures and Algorithm

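Improved strategy

Double the table size when an item is inserted into a full table.
Halve the table size when a deletion causes the table to become less than 1/4 full.

Potential function, where α(T) = num[T] / size[T] is the load factor:

\[
\Phi(T) =
\begin{cases}
2 \cdot num[T] - size[T] & \text{if } \alpha(T) \ge 1/2, \\
size[T]/2 - num[T] & \text{if } \alpha(T) < 1/2.
\end{cases}
\]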
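The difference between the two contraction thresholds shows up clearly in a simulation. The sketch below is illustrative only: the function name, the cost model (1 per insert or delete plus 1 per item moved), and the workload of 1024 insertions followed by repeated I, D, D, I cycles are assumptions. The operation string is assumed never to delete from an empty table.

```python
def run(ops, shrink_below):
    """Total actual cost of a string of 'I'/'D' operations.

    The table doubles when an insert finds it full and halves when,
    after a delete, num / size falls below shrink_below.
    """
    num = size = cost = 0
    for op in ops:
        if op == "I":
            if num == size:             # expansion: move all current items
                cost += num
                size = max(1, 2 * size)
            num += 1
            cost += 1
        else:
            num -= 1
            cost += 1
            if size > 1 and num < shrink_below * size:
                cost += num             # contraction: move remaining items
                size //= 2
    return cost


ops = "I" * 1024 + "IDDI" * 100
cost_half = run(ops, 0.5)       # natural strategy: contract below 1/2 full
cost_quarter = run(ops, 0.25)   # improved strategy: contract below 1/4 full
```

With the 1/2 threshold every cycle pays for a full expansion and a full contraction, while with the 1/4 threshold the table expands once and then stays put, keeping the total cost within 3 units per operation on this workload.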

Page 54: Data Structures and Algorithm

Any question?

Xiaoqing Zheng

Fudan University