CSE 326: Data Structures Program Analysis
Lecture 3: Friday, Jan 8, 2003
2
Outline
• Empirical analysis of algorithms
• Formal analysis of algorithms
• Reading assignment: sec. 2.4.3 (maximum subsequence)
3
Determining the Complexity of an Algorithm
• Empirical measurements:
  – pro: discover if constant factors are significant
  – con: may be running on “wrong” inputs
• Formal analysis (proofs):
  – pro: no interference from implementation/hardware details
  – con: hides constants; may be hard
In theory, theory is the same as practice, but in practice it is not
4
Measuring Empirical Complexity: Linear vs. Binary Search
• Find an item in a sorted array of length N
• Compare linear search and binary search on:
  – Time to find one item
  – Time to find N items
• Binary search algorithm:
5
int bfind(int x, int a[], int left, int right) {
    /* searches the open interval (left, right) */
    if (left+1 == right) return -1;
    int m = (left + right) / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind(x, a, left, m);
    else return bfind(x, a, m, right);
}

int lfind(int x, int a[], int n) {
    if (n == 0) return -1;
    if (x == a[n-1]) return n-1;
    return lfind(x, a, n-1);
}

for (i=0; i<n; i++) a[i] = i;
for (i=0; i<n; i++) lfind(i,a,n);   /* or bfind(i,a,-1,n) */
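One way to run this experiment without depending on a clock is to count function calls instead of measuring time. The sketch below is only an illustration under that assumption: the names lfind_counted, bfind_counted, and total_calls and the steps counter are hypothetical additions, not part of the lecture code.

```c
#include <stdlib.h>

/* Same logic as lfind, plus a hypothetical call counter. */
int lfind_counted(int x, const int a[], int n, long *steps) {
    (*steps)++;
    if (n == 0) return -1;
    if (x == a[n - 1]) return n - 1;
    return lfind_counted(x, a, n - 1, steps);
}

/* Same logic as bfind, plus a hypothetical call counter. */
int bfind_counted(int x, const int a[], int left, int right, long *steps) {
    (*steps)++;
    if (left + 1 == right) return -1;
    int m = (left + right) / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind_counted(x, a, left, m, steps);
    return bfind_counted(x, a, m, right, steps);
}

/* Fills a[i] = i, searches for every i once, and returns the total
   number of calls made. Assumes n >= 1; returns -1 if allocation fails. */
long total_calls(int n, int use_binary) {
    int *a = malloc((size_t)n * sizeof *a);
    if (a == NULL) return -1;
    for (int i = 0; i < n; i++) a[i] = i;

    long steps = 0;
    for (int i = 0; i < n; i++) {
        if (use_binary) bfind_counted(i, a, -1, n, &steps);
        else            lfind_counted(i, a, n, &steps);
    }
    free(a);
    return steps;
}
```

Calling total_calls for doubling values of n and plotting the results gives the kind of data the following slides analyze graphically.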
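6-8
Graphical Analysis
[Figures: plots of the measured search times; the images are not part of this transcript]
9
[Figure: log/log plot of the measured search times, with one curve annotated "slope 2" and the other "slope 1"]
Recall: we search n times
Linear = O(n^2)
Binary = O(n log n)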
10
Property of Log/Log Plots
• On a linear plot, a linear function is a straight line
• On a log/log plot, any polynomial function is a straight line!
  – slope Δy/Δx = exponent
Proof: suppose $y = c x^k$
$\log(y) = \log(c x^k)$
$\log(y) = \log(c) + \log(x^k)$
$\log(y) = \log(c) + k \log(x)$
Here log(y) is the vertical axis, k is the slope, and log(x) is the horizontal axis.
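As a worked illustration of reading the exponent off a log/log plot, take two points on a curve. The function y = 3x^2 and the sample points x = 10 and x = 1000 are made-up values chosen for easy arithmetic, not measurements from the lecture.

```latex
\[
k = \frac{\log y_2 - \log y_1}{\log x_2 - \log x_1}
  = \frac{\log(3 \cdot 10^{6}) - \log(3 \cdot 10^{2})}{\log 10^{3} - \log 10^{1}}
  = \frac{\log 10^{4}}{\log 10^{2}}
  = \frac{4}{2} = 2
\]
```

The constant factor 3 cancels in the subtraction, which is why it shifts the line up or down without changing its slope.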
11
[Figure: log/log plot with the O(n log n) curve annotated "slope 1"]
Why does O(n log n) look like a straight line?
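One way to answer the question, sketched here with natural logarithms and a sample value n = 10^6 chosen only for illustration, is to compute the local slope of y = n ln n on a log/log plot:

```latex
\[
\ln y = \ln n + \ln \ln n
\qquad\Longrightarrow\qquad
\frac{d(\ln y)}{d(\ln n)} = 1 + \frac{1}{\ln n}
\]
\[
n = 10^{6}: \quad 1 + \frac{1}{\ln 10^{6}} \approx 1 + \frac{1}{13.8} \approx 1.07
\]
```

The correction term 1/ln n shrinks as n grows, so over the range of a typical experiment the curve is hard to tell apart from a line of slope 1.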
12
Empirical Complexity
• Large data sets may be required to gain an accurate empirical picture
• When running time is expected to be polynomial, use log/log plots
  – slope = exponent
• When running time is expected to be exponential, use log on the y axis
• When running time is expected to be logarithmic, use log on the x axis
• Best: try all three, and see which one is linear
13
Analyzing Code
• primitive operations
• consecutive statements
• function calls
• conditionals
• loops
• recursive functions
14
Conditionals
• Conditional:
    if C then S1 else S2
• Suppose you are doing an O( ) analysis?
    Time(C) + Max(Time(S1), Time(S2))
    or Time(C) + Time(S1) + Time(S2)
• Suppose you are doing an Ω( ) analysis?
    Time(C) + Min(Time(S1), Time(S2))
    or Time(C)
15
Nested Loops
for i = 1 to n do
  for j = 1 to n do
    sum = sum + 1
$\sum_{i=1}^{n} \sum_{j=1}^{n} 1 = \sum_{i=1}^{n} n = n^2$
16
Nested Dependent Loops
for i = 1 to n do
  for j = i to n do
    sum = sum + 1
$\sum_{i=1}^{n} \sum_{j=i}^{n} 1 = \; ?$
17
Nested Dependent Loops
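for i = 1 to n do
  for j = i to n do
    sum = sum + 1
Compute it the hard way:
$T(n) = \sum_{i=1}^{n} \sum_{j=i}^{n} 1 = \sum_{i=1}^{n} (n - i + 1) = \sum_{i=1}^{n} n - \sum_{i=1}^{n} i + \sum_{i=1}^{n} 1$
$= n^2 - \frac{n(n+1)}{2} + n = \frac{n(n+1)}{2} = O(n^2)$
Compute it the smart way: substitute n - i + 1 with j
$\sum_{i=1}^{n} (n - i + 1) = \sum_{j=1}^{n} j = \frac{n(n+1)}{2} = O(n^2)$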
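The closed form can be checked by running the dependent loops and counting. This is a minimal sketch; the function names count_dependent and closed_form are hypothetical.

```c
/* Counts how many times "sum = sum + 1" executes when j starts at i. */
long count_dependent(long n) {
    long sum = 0;
    for (long i = 1; i <= n; i++)
        for (long j = i; j <= n; j++)
            sum = sum + 1;
    return sum;
}

/* Closed form from the derivation: n(n+1)/2. */
long closed_form(long n) {
    return n * (n + 1) / 2;
}
```

For n = 10 both functions return 55, which is roughly half of the n^2 = 100 iterations of the independent nested loops.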
18
Other Important Series
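• Sum of squares: $\sum_{i=1}^{N} i^2 = \frac{N(N+1)(2N+1)}{6} \approx \frac{N^3}{3}$ for large N
• Sum of exponents: $\sum_{i=1}^{N} i^k \approx \frac{N^{k+1}}{|k+1|}$ for large N and $k \neq -1$
• Geometric series: $\sum_{i=0}^{N} A^i = \frac{A^{N+1} - 1}{A - 1}$
• Novel series:
  – Reduce to known series, or prove inductively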
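The exact identities above (sum of squares and geometric series) can be sanity-checked for small inputs by comparing a term-by-term sum against the closed form. The function names below are hypothetical, and the geometric check assumes an integer A of at least 2 and values small enough to avoid overflow.

```c
/* Left-hand side: 1^2 + 2^2 + ... + n^2, summed term by term. */
long sum_of_squares(long n) {
    long s = 0;
    for (long i = 1; i <= n; i++) s += i * i;
    return s;
}

/* Right-hand side: N(N+1)(2N+1)/6. */
long sum_of_squares_closed(long n) {
    return n * (n + 1) * (2 * n + 1) / 6;
}

/* Left-hand side: a^0 + a^1 + ... + a^n, summed term by term. */
long geometric_sum(long a, long n) {
    long s = 0, term = 1;              /* term holds a^i */
    for (long i = 0; i <= n; i++) {
        s += term;
        term *= a;
    }
    return s;
}

/* Right-hand side: (a^(n+1) - 1) / (a - 1), assuming a >= 2. */
long geometric_closed(long a, long n) {
    long power = 1;
    for (long i = 0; i <= n; i++) power *= a;   /* power = a^(n+1) */
    return (power - 1) / (a - 1);
}
```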
19
Linear Search Analysis
• Best case, tight analysis:
• Worst case, tight analysis:
int lfind(int x, int a[], int n) {
    int i;
    for (i=0; i<n; i++)
        if (a[i] == x) return i;
    return -1;
}
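To see the best and worst cases concretely, the loop can be instrumented to report how many comparisons it made. This is a sketch only; lfind_steps and its comparisons parameter are hypothetical additions to the lecture's lfind.

```c
/* Iterative linear search that also reports the number of
   a[i] == x comparisons through *comparisons. */
int lfind_steps(int x, const int a[], int n, int *comparisons) {
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == x) return i;
    }
    return -1;
}
```

Searching for the first element costs one comparison regardless of n, while searching for the last element or for a value that is absent costs n comparisons.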
20
Iterated Linear Search Analysis
• Easy worst-case upper-bound:
• Worst-case tight analysis:
for (i=0; i<n; i++) a[i] = i;
for (i=0; i<n; i++) lfind(i,a,n);
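A sketch of both bounds, assuming the iterative lfind that scans from index 0, so that finding the value i in a[i] = i takes i + 1 loop iterations:

```latex
\[
\text{Easy upper bound: } n \text{ calls} \times O(n) \text{ per call} = O(n^2)
\]
\[
\text{Tight analysis: } \sum_{i=0}^{n-1} (i + 1) = \frac{n(n+1)}{2} = \Theta(n^2)
\]
```

In this case the easy bound happens to be tight, because about half of the searches scan at least half of the array.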
21
Analyzing Recursive Programs
1. Express the running time T(n) as a recursive equation
2. Solve the recursive equation
• For an upper-bound analysis, you can optionally simplify the equation to something larger
• For a lower-bound analysis, you can optionally simplify the equation to something smaller
22
Binary Search
What is the worst-case upper bound?
int bfind(int x, int a[], int left, int right) {
    if (left+1 == right) return -1;
    int m = (left + right) / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind(x, a, left, m);
    else return bfind(x, a, m, right);
}
23
Binary Search
Introduce some constants…
b = time needed for base case
c = time needed to get ready to do a recursive call
Size is n = right-left
Running time is thus:
$T(1) \le b$
$T(n) \le T(n/2) + c$
int bfind(int x, int a[], int left, int right) {
    if (left+1 == right) return -1;
    int m = (left + right) / 2;
    if (x == a[m]) return m;
    if (x < a[m]) return bfind(x, a, left, m);
    else return bfind(x, a, m, right);
}
24
Binary Search Analysis
One sub-problem, half as large
Equation:
  T(1) ≤ b
  T(n) ≤ T(n/2) + c   for n > 1
Solution:
  T(n) ≤ T(n/2) + c             write equation
       ≤ T(n/4) + c + c         expand
       ≤ T(n/8) + c + c + c
       ≤ T(n/2^k) + kc          inductive leap
       ≤ T(1) + c log n         select value for k, where k = log n
       ≤ b + c log n = O(log n) simplify
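The solution can be checked numerically by evaluating the recurrence directly, treating the inequalities as equalities. In this sketch the function names T and floor_log2 are hypothetical, and n/2 is integer division, so the closed form b + c log n is exact only when n is a power of two.

```c
/* Evaluates T(1) = b, T(n) = T(n/2) + c by direct recursion. */
long T(long n, long b, long c) {
    if (n <= 1) return b;
    return T(n / 2, b, c) + c;
}

/* floor(log2(n)) for n >= 1, computed by repeated halving. */
long floor_log2(long n) {
    long k = 0;
    while (n > 1) {
        n /= 2;
        k++;
    }
    return k;
}
```

With b = 5 and c = 3, T(1024, 5, 3) is 5 + 3 * 10 = 35, matching b + c log n for n = 1024.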
25
Solving Recursive Equations by Telescoping
• Create a set of equations, take their sum
$T(n) = T(n/2) + c$       initial equation
$T(n/2) = T(n/4) + c$     so this holds...
$T(n/4) = T(n/8) + c$     and this...
$T(n/8) = T(n/16) + c$    and this...
...
$T(2) = T(1) + c$         and eventually...
Sum the equations, cancelling terms that appear on both sides:
$T(n) = T(1) + c \log n$
$T(n) = \Theta(\log n)$   look familiar?
26
Inductive Proof
If you know the closed form solution, you can validate it by ordinary induction
$T(1) = b + c \log 1 = b$              base case
$T(n) = b + c \log n$                  assume hypothesis
$T(2n) = T(n) + c$                     definition of T(n)
$T(2n) = (b + c \log n) + c$           by induction hypothesis
$T(2n) = b + c((\log n) + 1)$
$T(2n) = b + c((\log n) + (\log 2))$
$T(2n) = b + c \log(2n)$               Q.E.D.
Thus: $T(n) = \Theta(\log n)$
27
Amortized Analysis
Stack
• Stack operations
  – push
  – pop
  – is_empty
• Stack property: if x is on the stack before y is pushed, then x will be popped after y is popped
[Figure: stack diagram with elements A through F]
What is biggest problem with an array implementation?
28
Stretchy Stack Implementation
int[] data;
int maxsize;   // capacity of data; assumed to start at 1 or more
int top;       // number of elements currently on the stack

void push(int e) {
    if (top == maxsize) {
        // array is full: copy everything into an array twice as large
        int[] temp = new int[2*maxsize];
        for (int i = 0; i < maxsize; i++)
            temp[i] = data[i];
        data = temp;
        maxsize = 2*maxsize;
    }
    data[top++] = e;   // store at index top, then advance (data[++top] would skip a slot and overrun the array)
}

int pop() { return data[--top]; }
Best case Push = O( )
Worst case Push = O( )
29
Stretchy Stack Amortized Analysis
• Consider a sequence of n push/pop operations:
    push(e1)     time = T1
    push(e2)
    pop()
    push(e3)
    push(e4)
    pop()
    . . .
    push(ek)     time = Tn
• Amortized time = (T1 + T2 + . . . + Tn) / n
• We compute this next
30
Stretchy Stack Amortized Analysis
• The length of the array increases like this: 1, 2, 4, 8, . . . , 2^k, . . . , n
• For each Ti we have one of the following
  – Ti = O(1) for pop( ), and for some push(ei)
  – Ti = O(2^k) for some push(ei)
• Hence
$(T_1 + T_2 + \dots + T_n)/n = (O(1) + \dots + O(1) + O(1 + 2 + 4 + 8 + \dots))/n$
31
Stretchy Stack Amortized Analysis
Let’s compute this sum:
$1 + 2 + 4 + 8 + \dots + 2^{\log(n)} = \sum_{i=0}^{\log(n)} 2^i = 2^{\log(n)+1} - 1 = 2n - 1$
And therefore:
$(T_1 + T_2 + \dots + T_n)/n = O(1)$
In an asymptotic sense, there is no overhead in using stretchy arrays rather than regular arrays!
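The O(1) amortized bound can be observed by counting how many elements get copied while the array grows. The following is a minimal C sketch of the stretchy stack, not the lecture's own code: the Stack struct, the copies counter, and all function names are hypothetical, and the initial capacity of 1 is an assumption taken from the growth sequence 1, 2, 4, 8, ...

```c
#include <stdlib.h>

typedef struct {
    int *data;
    int maxsize;   /* current capacity */
    int top;       /* number of stored elements */
    long copies;   /* hypothetical counter: elements copied while growing */
} Stack;

/* Starts with capacity 1. Returns 0 if allocation fails. */
int stack_init(Stack *s) {
    s->data = malloc(sizeof *s->data);
    s->maxsize = 1;
    s->top = 0;
    s->copies = 0;
    return s->data != NULL;
}

/* Doubles the array when full, counting every copied element. */
int stack_push(Stack *s, int e) {
    if (s->top == s->maxsize) {
        int *temp = malloc(2 * (size_t)s->maxsize * sizeof *temp);
        if (temp == NULL) return 0;
        for (int i = 0; i < s->maxsize; i++) {
            temp[i] = s->data[i];
            s->copies++;
        }
        free(s->data);
        s->data = temp;
        s->maxsize *= 2;
    }
    s->data[s->top++] = e;
    return 1;
}

/* Caller must ensure the stack is not empty. */
int stack_pop(Stack *s) {
    return s->data[--s->top];
}

/* Total elements copied during n pushes, or -1 on allocation failure. */
long copies_after_pushes(int n) {
    Stack s;
    if (!stack_init(&s)) return -1;
    for (int i = 0; i < n; i++) {
        if (!stack_push(&s, i)) {
            free(s.data);
            return -1;
        }
    }
    long c = s.copies;
    free(s.data);
    return c;
}
```

For any n tried, the total number of copies stays below 2n, so the copying adds only a constant amount of work per push on average.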
32
Geometric Series
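$\sum_{i=0}^{N} A^i = \frac{A^{N+1} - 1}{A - 1}$
$\sum_{i=0}^{n} 2^i = \frac{2^{n+1} - 1}{2 - 1} = 2^{n+1} - 1$
$\sum_{i=0}^{\log n} 2^i = 2^{\log n + 1} - 1 = (2^{\log n}) \cdot 2 - 1 = 2n - 1$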
33
Stretchy Stack Amortized Analysis
• Careful ! We must be clever to get good amortized performance !
• Consider “smart pop”:
int pop() {
    int e = data[--top];
    if (top <= maxsize/2) {
        // shrink as soon as the array is at most half full
        maxsize = maxsize/2;
        int[] temp = new int[maxsize];
        for (int i = 0; i < maxsize; i++)
            temp[i] = data[i];
        data = temp;
    }
    return e;
}
34
Stretchy Stack Amortized Analysis
• Take the sequence of 3n push/pop operations:
    push(e1) push(e2) ... push(en)                           (n operations)
    pop() push(en) pop() push(en) pop() ... push(en) pop()   (2n operations)
Suppose n = 2^k + 1
Hence amortized time is:
T = (Θ(1) + . . . + Θ(1) + Θ(n) + . . . + Θ(n))/3n = (n Θ(1) + 2n Θ(n))/3n = 1/3 Θ(1) + 2/3 Θ(n)
Hence T = Θ(n) !!!
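The thrashing can be reproduced with a simulation that tracks only the array sizes and counts copied elements; no real array is allocated. The function name thrash_copies is hypothetical, the starting capacity of 1 is an assumption, and the sequence is modeled as n pushes followed by n (pop, push) pairs, which is 3n operations in total.

```c
/* Elements copied when n = 2^k + 1 pushes are followed by
   n (pop, push) pairs, using doubling push and the "smart pop". */
long thrash_copies(int k) {
    long n = (1L << k) + 1;
    long maxsize = 1, top = 0, copies = 0;

    for (long i = 0; i < n; i++) {            /* n pushes */
        if (top == maxsize) {                 /* grow: copy maxsize elements */
            copies += maxsize;
            maxsize *= 2;
        }
        top++;
    }

    for (long i = 0; i < n; i++) {            /* n (pop, push) pairs */
        top--;                                /* smart pop */
        if (top <= maxsize / 2) {             /* shrink: copy new maxsize elements */
            maxsize /= 2;
            copies += maxsize;
        }
        if (top == maxsize) {                 /* push finds the array full again */
            copies += maxsize;
            maxsize *= 2;
        }
        top++;
    }
    return copies;
}
```

With k = 3 (n = 9) the count is 159 copies for 27 operations, and with k = 10 (n = 1025) it exceeds n^2, so the cost per operation grows linearly with n rather than staying constant.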