1 cse 326: data structures program analysis lecture 3: friday, jan 8, 2003

1

CSE 326: Data Structures Program Analysis

Lecture 3: Friday, Jan 8, 2003

2

Outline

• Empirical analysis of algorithms

• Formal analysis of algorithms

• Reading assignment: sec. 2.4.3 (maximum subsequence)

3

Determining the Complexity of an Algorithm

• Empirical measurements:
  – pro: discover if constant factors are significant
  – con: may be running on “wrong” inputs

• Formal analysis (proofs):
  – pro: no interference from implementation/hardware details
  – con: hides constants; may be hard

In theory, theory is the same as practice, but in practice it is not

4

Measuring Empirical Complexity: Linear vs. Binary Search

                          Linear Search    Binary Search
Time to find one item:
Time to find N items:

• Find an item in a sorted array of length N

• Binary search algorithm:

5

/* Searches the open interval (left, right); initial call: bfind(x, a, -1, n) */
int bfind(int x, int a[], int left, int right) {
    if (left + 1 == right) return -1;      /* empty interval: not found */
    int m = (left + right) / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind(x, a, left, m);
    else          return bfind(x, a, m, right);
}

int lfind(int x, int a[], int n) {
    if (n == 0) return -1;                 /* nothing left to scan: not found */
    if (x == a[n-1]) return n - 1;
    return lfind(x, a, n - 1);
}

/* Experiment: fill the array, then search for every item */
for (i = 0; i < n; i++) a[i] = i;

for (i = 0; i < n; i++) lfind(i, a, n);    /* or bfind(i, a, -1, n) */
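The experiment above can also be run with a probe counter instead of a clock, which makes the result repeatable. The following is a minimal sketch under that assumption: the names `probes`, `lfind_counted`, `bfind_counted`, and `total_probes` are hypothetical, and both searches are written iteratively (same probe order as the recursive versions) so that large n does not exhaust the call stack.

```c
#include <stdlib.h>

/* Hypothetical global probe counter; not part of the lecture code. */
static long probes = 0;

/* Linear search, scanning from the last element down, like the recursive lfind. */
static int lfind_counted(int x, const int a[], int n) {
    for (int i = n - 1; i >= 0; i--) {
        probes++;
        if (a[i] == x) return i;
    }
    return -1;
}

/* Binary search over the open interval (left, right), like bfind. */
static int bfind_counted(int x, const int a[], int left, int right) {
    while (left + 1 != right) {
        int m = (left + right) / 2;
        probes++;
        if (x == a[m]) return m;
        if (x < a[m]) right = m;
        else          left = m;
    }
    return -1;
}

/* Search for every item 0..n-1 and return the total number of probes.
   use_binary selects binary search (nonzero) or linear search (zero).
   Returns -1 if the array cannot be allocated. */
long total_probes(int n, int use_binary) {
    int *a = malloc((size_t)n * sizeof *a);
    if (a == NULL) return -1;
    for (int i = 0; i < n; i++) a[i] = i;
    probes = 0;
    for (int i = 0; i < n; i++) {
        if (use_binary) bfind_counted(i, a, -1, n);
        else            lfind_counted(i, a, n);
    }
    free(a);
    return probes;
}
```

For n = 1000, the linear variant performs n(n+1)/2 = 500500 probes in total, while the binary variant needs at most 10 probes per search.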

6

Graphical Analysis

7

Graphical Analysis

9

[Figure: log/log plot of the two measured running times; one curve is annotated “slope 2”, the other “slope 1”]

Recall: we search n times
Linear = O(n^2)
Binary = O(n log n)

10

Property of Log/Log Plots

• On a linear plot, a linear function is a straight line

• On a log/log plot, any polynomial function is a straight line!
  – slope Δy/Δx = exponent (Δy along the vertical axis, Δx along the horizontal axis)

Proof: suppose y = c x^k

    log(y) = log(c x^k)
    log(y) = log(c) + log(x^k)
    log(y) = log(c) + k log(x)

11

Why does O(n log n) look like a straight line?

[Figure: log/log plot of the O(n log n) curve, annotated “slope 1”]
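One way to approach the question is to compute the local slope of y = n log n on a log/log plot. The derivation below is a sketch that assumes natural logarithms; a different base only multiplies y by a constant, which shifts the line without changing its slope.

```latex
\begin{align*}
y &= n \ln n \\
\ln y &= \ln n + \ln(\ln n) \\
\frac{d(\ln y)}{d(\ln n)} &= 1 + \frac{1}{\ln n}
  \;\longrightarrow\; 1 \quad \text{as } n \to \infty
\end{align*}
```

For n = 10^6, ln n is about 13.8, so the local slope is roughly 1.07, which is hard to distinguish from 1 by eye.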

12

Empirical Complexity

• Large data sets may be required to gain an accurate empirical picture

• When the running time is expected to be polynomial, use log/log plots: slope = exponent

• When the running time is expected to be exponential, use log on the y axis

• When the running time is expected to be logarithmic, use log on the x axis

• Best: try all three, and see which one is linear

13

Analyzing Code

• primitive operations

• consecutive statements

• function calls

• conditionals

• loops

• recursive functions

14

Conditionals

• Conditional: if C then S1 else S2

• Suppose you are doing an O( ) analysis?
  – Time(C) + Max(Time(S1), Time(S2)), or
  – Time(C) + Time(S1) + Time(S2)

• Suppose you are doing an Ω( ) analysis?
  – Time(C) + Min(Time(S1), Time(S2)), or
  – Time(C)

15

Nested Loops

for i = 1 to n do
    for j = 1 to n do
        sum = sum + 1

$\sum_{i=1}^{n} \sum_{j=1}^{n} 1 = \sum_{i=1}^{n} n = n^2$

16

Nested Dependent Loops

for i = 1 to n do
    for j = i to n do
        sum = sum + 1

$\sum_{i=1}^{n} \sum_{j=i}^{n} 1 = \; ?$

17

Nested Dependent Loops

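for i = 1 to n do
    for j = i to n do
        sum = sum + 1

$T(n) = \sum_{i=1}^{n} \sum_{j=i}^{n} 1 = \sum_{i=1}^{n} (n - i + 1)$

Compute it the hard way:

$\sum_{i=1}^{n} (n - i + 1) = \sum_{i=1}^{n} n - \sum_{i=1}^{n} i + \sum_{i=1}^{n} 1 = n^2 - \frac{n(n+1)}{2} + n = \frac{n(n+1)}{2} = O(n^2)$

Compute it the smart way: substitute n - i + 1 with j

$\sum_{i=1}^{n} (n - i + 1) = \sum_{j=1}^{n} j = \frac{n(n+1)}{2} = O(n^2)$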
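The two loop nests can be checked by simply counting how often the innermost statement runs. This is a small C sketch; the function names `count_independent` and `count_dependent` are made up for illustration.

```c
/* Independent bounds: the inner loop always runs n times. */
long count_independent(int n) {
    long sum = 0;
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= n; j++)
            sum = sum + 1;
    return sum;               /* equals n^2 */
}

/* Dependent bounds: the inner loop runs n - i + 1 times. */
long count_dependent(int n) {
    long sum = 0;
    for (int i = 1; i <= n; i++)
        for (int j = i; j <= n; j++)
            sum = sum + 1;
    return sum;               /* equals n(n+1)/2 */
}
```

For n = 100 the counts are 10000 and 5050: the dependent nest does about half the work, but both are O(n^2).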

18

Other Important Series

• Sum of squares:
  $\sum_{i=1}^{N} i^2 = \frac{N(N+1)(2N+1)}{6} \approx \frac{N^3}{3}$ for large N

• Sum of exponents:
  $\sum_{i=1}^{N} i^k \approx \frac{N^{k+1}}{|k+1|}$ for large N and k ≠ -1

• Geometric series:
  $\sum_{i=0}^{N} A^i = \frac{A^{N+1} - 1}{A - 1}$

• Novel series:
  – Reduce to known series, or prove inductively
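When in doubt about a closed form, it is cheap to compare it against a term-by-term sum for small inputs. The C sketch below does this for the sum of squares and the geometric series. The function names are hypothetical, and the geometric closed form assumes an integer A other than 1.

```c
/* Sum of squares, term by term: 1^2 + 2^2 + ... + n^2. */
long sum_squares(long n) {
    long s = 0;
    for (long i = 1; i <= n; i++) s += i * i;
    return s;
}

/* Closed form N(N+1)(2N+1)/6. */
long sum_squares_closed(long n) {
    return n * (n + 1) * (2 * n + 1) / 6;
}

/* Geometric series, term by term: A^0 + A^1 + ... + A^n. */
long geometric_sum(long a, int n) {
    long s = 0, term = 1;
    for (int i = 0; i <= n; i++) {
        s += term;
        term *= a;
    }
    return s;
}

/* Closed form (A^(n+1) - 1) / (A - 1); requires a != 1. */
long geometric_closed(long a, int n) {
    long p = 1;
    for (int i = 0; i <= n; i++) p *= a;   /* p = a^(n+1) */
    return (p - 1) / (a - 1);
}
```

For N = 1000 the exact sum of squares is 333,833,500, against N^3/3 ≈ 333,333,333, so the approximation is already within about 0.15%.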

19

Linear Search Analysis

• Best case, tight analysis:

• Worst case, tight analysis:

int lfind(int x, int a[], int n) {
    for (int i = 0; i < n; i++)
        if (a[i] == x) return i;
    return -1;
}

20

Iterated Linear Search Analysis

• Easy worst-case upper-bound:

• Worst-case tight analysis:

for (i=0; i<n; i++) a[i] = i;

for (i=0; i<n; i++) lfind(i,a,n);

21

Analyzing Recursive Programs

1. Express the running time T(n) as a recursive equation

2. Solve the recursive equation
   • For an upper-bound analysis, you can optionally simplify the equation to something larger
   • For a lower-bound analysis, you can optionally simplify the equation to something smaller

22

Binary Search

What is the worst-case upper bound?



23

Binary Search

Introduce some constants…

b = time needed for base case

c = time needed to get ready to do a recursive call

Size is n = right-left

Running time is thus:

    T(1) ≤ b
    T(n) ≤ T(n/2) + c



24

Binary Search Analysis

One sub-problem, half as large

Equation:
    T(1) ≤ b
    T(n) ≤ T(n/2) + c    for n > 1

Solution:
    T(n) ≤ T(n/2) + c                 write equation
         ≤ T(n/4) + c + c             expand
         ≤ T(n/8) + c + c + c
         ≤ T(n/2^k) + kc              inductive leap
         ≤ T(1) + c log n             select value for k, where k = log n
         ≤ b + c log n = O(log n)     simplify
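The solution can be sanity-checked by evaluating the recurrence literally and comparing it with b + c log n. The sketch below takes the recurrence with equality, uses integer halving (so log n is the floor of the base-2 logarithm), and uses hypothetical names `recurrence_T`, `ilog2`, and `closed_form`.

```c
/* T(1) = b, T(n) = T(n/2) + c, evaluated exactly as written (integer halving). */
long recurrence_T(long n, long b, long c) {
    if (n <= 1) return b;
    return recurrence_T(n / 2, b, c) + c;
}

/* Floor of log base 2, for n >= 1. */
int ilog2(long n) {
    int k = 0;
    while (n > 1) {
        n /= 2;
        k++;
    }
    return k;
}

/* Closed form from the slide: b + c log n. */
long closed_form(long n, long b, long c) {
    return b + c * ilog2(n);
}
```

With b = 5 and c = 3, `recurrence_T(1024, 5, 3)` gives 5 + 3·10 = 35, matching the closed form.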

25

Solving Recursive Equations by Telescoping

• Create a set of equations, take their sum

    T(n)   = T(n/2)  + c      initial equation
    T(n/2) = T(n/4)  + c      so this holds...
    T(n/4) = T(n/8)  + c      and this...
    T(n/8) = T(n/16) + c      and this...
    ...
    T(2)   = T(1)    + c      and eventually...

    T(n) = T(1) + c log n     sum equations, cancelling terms that appear on both sides
    T(n) = Θ(log n)           look familiar?

26

Inductive Proof

If you know the closed form solution, you can validate it by ordinary induction

    T(1)  = b + c log 1 = b               base case
    T(n)  = b + c log n                   assume hypothesis
    T(2n) = T(n) + c                      definition of T(n)
    T(2n) = (b + c log n) + c             by induction hypothesis
    T(2n) = b + c((log n) + 1)
    T(2n) = b + c((log n) + (log 2))
    T(2n) = b + c log(2n)                 Q.E.D.

Thus: T(n) = Θ(log n)

27

Amortized Analysis

Stack

• Stack operations
  – push
  – pop
  – is_empty

• Stack property: if x is on the stack before y is pushed, then x will be popped after y is popped

[Figure: stack diagram with the items A, B, C, D, E, F]

What is the biggest problem with an array implementation?

28

Stretchy Stack Implementation

int[] data;
int maxsize;
int top;        // number of items currently on the stack

void push(int e) {
    if (top == maxsize) {                     // array is full: double it
        int[] temp = new int[2*maxsize];
        for (int i = 0; i < maxsize; i++)
            temp[i] = data[i];
        data = temp;
        maxsize = 2*maxsize;
    }
    data[top++] = e;                          // store at the first free slot
}

int pop() { return data[--top]; }

Best case Push = O( )

Worst case Push = O( )

29

Stretchy Stack Amortized Analysis

• Consider sequence of n push/pop operations

• Amortized time = (T1 + T2 + . . . + Tn) / n

• We compute this next

    push(e1)      time = T1
    push(e2)
    pop()
    push(e3)
    push(e4)
    pop()
    . . .
    push(ek)      time = Tn

  (n operations in total)

30


• The length of the array increases like this: 1, 2, 4, 8, . . . , 2^k, . . . , n

• For each Ti we have one of the following
  – Ti = O(1) for pop( ), and for some push(ei)
  – Ti = O(2^k) for some push(ei)

• Hence

    (T1 + T2 + . . . + Tn) / n = (O(1) + . . . + O(1) + O(1 + 2 + 4 + 8 + . . . + n)) / n

31


Let’s compute this sum:

$1 + 2 + 4 + 8 + \dots + 2^{\log(n)} = \sum_{i=0}^{\log(n)} 2^i = 2^{\log(n)+1} - 1 = 2n - 1$

And therefore:

    (T1 + T2 + . . . + Tn) / n = O(1)

In an asymptotic sense, there is no overhead in using stretchy arrays rather than regular arrays!
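The O(1) amortized bound can be observed directly by counting element copies. Below is a minimal C sketch of the stretchy stack, assuming an initial capacity of 1. The `copies` field and the helper `copies_for_pushes` are additions for the experiment and are not part of the slide code.

```c
#include <stdlib.h>

struct stack {
    int *data;
    int maxsize;    /* current capacity */
    int top;        /* number of items on the stack */
    long copies;    /* hypothetical counter: elements copied by all resizes */
};

/* Returns 0 on success, -1 if allocation fails. */
int stack_init(struct stack *s) {
    s->data = malloc(sizeof *s->data);
    s->maxsize = 1;
    s->top = 0;
    s->copies = 0;
    return s->data ? 0 : -1;
}

/* Returns 0 on success, -1 if the array could not be grown. */
int stack_push(struct stack *s, int e) {
    if (s->top == s->maxsize) {                  /* full: double the array */
        int *temp = malloc(2 * (size_t)s->maxsize * sizeof *temp);
        if (temp == NULL) return -1;
        for (int i = 0; i < s->maxsize; i++) {   /* O(maxsize) copy */
            temp[i] = s->data[i];
            s->copies++;
        }
        free(s->data);
        s->data = temp;
        s->maxsize *= 2;
    }
    s->data[s->top++] = e;
    return 0;
}

/* Caller must ensure the stack is not empty. */
int stack_pop(struct stack *s) {
    return s->data[--s->top];
}

/* Push n items onto a fresh stack; return the total number of copies (-1 on failure). */
long copies_for_pushes(int n) {
    struct stack s;
    if (stack_init(&s) != 0) return -1;
    for (int i = 0; i < n; i++) {
        if (stack_push(&s, i) != 0) {
            free(s.data);
            return -1;
        }
    }
    long c = s.copies;
    free(s.data);
    return c;
}
```

For 1000 pushes the resizes copy 1 + 2 + 4 + . . . + 512 = 1023 elements in total, which is fewer than 2n, so the average cost per push stays constant.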

32

Geometric Series

$\sum_{i=0}^{N} A^i = \frac{A^{N+1} - 1}{A - 1}$

$\sum_{i=0}^{n} 2^i = \frac{2^{n+1} - 1}{2 - 1} = 2^{n+1} - 1$

$\sum_{i=0}^{\log n} 2^i = 2^{\log n + 1} - 1 = 2 (2^{\log n}) - 1 = 2n - 1$

33


• Careful! We must be clever to get good amortized performance!

• Consider “smart pop”:

int pop() {
    int e = data[--top];
    if (top <= maxsize/2) {                   // half empty: shrink the array
        maxsize = maxsize/2;
        int[] temp = new int[maxsize];
        for (int i = 0; i < maxsize; i++)
            temp[i] = data[i];
        data = temp;
    }
    return e;
}

34


• Take the sequence of 3n push/pop operations:

    push(e1)
    push(e2)
    ...
    push(en)        (n pushes)
    pop()
    push(en)
    pop()
    push(en)
    pop()
    ...
    push(en)
    pop()           (2n alternating operations)

Suppose n = 2^k + 1

Hence amortized time is:

T = (Θ(1) + . . . + Θ(1) + Θ(n) + . . . + Θ(n)) / 3n
  = (n Θ(1) + 2n Θ(n)) / 3n
  = 2/3 Θ(n)

Hence T = Θ(n) !!!
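The thrashing can be reproduced with a small simulation. The C sketch below tracks only the bookkeeping (capacity, item count, elements copied) and stores no elements. It assumes an initial capacity of 1 and alternates pop and push after the first n pushes. The names `sim`, `sim_push`, `sim_smart_pop`, and `thrash_copies` are hypothetical.

```c
/* Simulated stretchy stack: bookkeeping only, no element storage. */
struct sim {
    long maxsize;   /* current capacity */
    long top;       /* number of items */
    long copies;    /* elements copied by all resizes */
};

void sim_push(struct sim *s) {
    if (s->top == s->maxsize) {       /* full: double, copying maxsize elements */
        s->copies += s->maxsize;
        s->maxsize *= 2;
    }
    s->top++;
}

void sim_smart_pop(struct sim *s) {
    s->top--;
    if (s->top <= s->maxsize / 2) {   /* "smart pop": halve as soon as half empty */
        s->maxsize /= 2;
        s->copies += s->maxsize;      /* copies the new maxsize elements */
    }
}

/* n pushes followed by n pop/push pairs (3n operations); returns total copies. */
long thrash_copies(long n) {
    struct sim s = {1, 0, 0};
    for (long i = 0; i < n; i++) sim_push(&s);
    for (long i = 0; i < n; i++) {
        sim_smart_pop(&s);
        sim_push(&s);
    }
    return s.copies;
}
```

With n = 2^10 + 1 = 1025, every pop shrinks the array and every following push grows it again, so each pair copies 2048 elements and the total grows quadratically in n.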
