CSC – 332 Data Structures
Generics
Analysis of Algorithms
Dr. Curry Guinn
For Next Class, Thursday
• Homework 1 due tonight
– Quiz 2 – Today, 01/21, before class!
• Up to 3 submissions
– Quiz 3 Thursday by class time
• Homework 2 due Monday, 01/27, 11:59pm
• For Thursday
– Chapter 2, Sections 2.1-2.4.2
Quiz Answers
• Let’s go to Blackboard Learn and see
Generics
• http://people.uncw.edu/guinnc/courses/spring14/332/notes/day2_unix/Generics.ppt
Is This Algorithm Fast?
• Question: given a problem, how fast does this code solve it?
• "My program finds all the primes between 2 and 1,000,000,000 in 1.37 seconds."
– How good is this solution?
• We could try to measure the time it takes, but that measurement is subject to many sources of error:
– multitasking operating system
– speed of the computer
– language the solution is written in
Math background: exponents
• Exponents
– $X^Y$, or "X to the Yth power"; X multiplied by itself Y times
• Some useful identities
– $X^A X^B = X^{A+B}$
– $X^A / X^B = X^{A-B}$
– $(X^A)^B = X^{AB}$
– $X^N + X^N = 2X^N$
– $2^N + 2^N = 2^{N+1}$
Math background: Logarithms
• Logarithms
– definition: $X^A = B$ if and only if $\log_X B = A$
– intuition: $\log_X B$ means "the power X must be raised to, to get B"
– In this course, a logarithm with no base implies base 2: $\log B$ means $\log_2 B$
• Examples
– $\log_2 16 = 4$ (because $2^4 = 16$)
– $\log_{10} 1000 = 3$ (because $10^3 = 1000$)
Logarithm identities
Identities for logs with addition, multiplication, powers:
• $\log(AB) = \log A + \log B$
• $\log(A/B) = \log A - \log B$
• $\log(A^B) = B \log A$
• $\log_A B = \log_C B / \log_C A$
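The change-of-base identity is also how a logarithm in an arbitrary base is usually computed in code. The following is a minimal sketch, not part of the original slides; the class name `LogIdentities` and the helper `log` are made up for illustration, and the results are floating-point approximations.

```java
// Illustrative sketch: checks the logarithm identities numerically.
// LogIdentities and log(base, x) are hypothetical names, not from the slides.
class LogIdentities {
    // Change of base: log_b(x) = ln(x) / ln(b).
    static double log(double base, double x) {
        return Math.log(x) / Math.log(base);
    }

    public static void main(String[] args) {
        // Should be close to 4, because 2^4 = 16.
        System.out.println("log2(16)    = " + log(2, 16));
        // Should be close to 3, because 10^3 = 1000.
        System.out.println("log10(1000) = " + log(10, 1000));
        // log(AB) = log A + log B, up to floating-point rounding.
        System.out.println(log(2, 8 * 32) + " vs " + (log(2, 8) + log(2, 32)));
        // log(A^B) = B log A, up to floating-point rounding.
        System.out.println(log(2, Math.pow(8, 3)) + " vs " + (3 * log(2, 8)));
    }
}
```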
Some helpful mathematics
• N + N + N + … + N (total of N times)
– $N \cdot N = N^2$, which is $O(N^2)$
• 1 + 2 + 3 + 4 + … + N
– $N(N+1)/2 = N^2/2 + N/2$, which is $O(N^2)$
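The closed form for 1 + 2 + … + N can be checked against a plain loop. This is a small sketch with hypothetical names (`SumCheck`, `sumByLoop`, `sumByFormula`), added only to make the identity concrete.

```java
// Illustrative sketch: compares the loop sum 1 + 2 + ... + n with n(n+1)/2.
// All names here are hypothetical and not from the slides.
class SumCheck {
    // Adds the terms one at a time: n additions.
    static long sumByLoop(long n) {
        long total = 0;
        for (long i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Closed form: a constant number of arithmetic operations.
    static long sumByFormula(long n) {
        return n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        for (long n : new long[] {10, 100, 1000}) {
            System.out.println(n + ": loop=" + sumByLoop(n) + " formula=" + sumByFormula(n));
        }
    }
}
```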
Analysis of Algorithms
• What do we mean by an “efficient” algorithm?
– We mean an algorithm that uses few resources.
– By far the most important resource is time.
– Thus, when we say an algorithm is efficient (assuming we do not qualify this further), we mean that it can be executed quickly.
• Is there some way to measure efficiency that does not depend on the state of current technology?
– Yes!
• The Idea
– Determine the number of “steps” an algorithm requires when given some input.
• We need to define “step” in some reasonable way.
– Write this as a formula, based on the input.
Generally, when we determine the efficiency of an algorithm, we are interested in:
– Time Used by the Algorithm
• Expressed in terms of number of steps.
• People also talk about “space efficiency”, etc.
– How the Size of the Input Affects Running Time
• Think about giving an algorithm a list of items to operate on. The size of the problem is the length of the list.
– Worst-Case Behavior
• What is the slowest the algorithm ever runs for a given input size?
• Occasionally we also analyze average-case behavior.
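To make "steps" and "worst case" concrete, here is a minimal sketch that counts comparisons in a linear search. Linear search is an example chosen for illustration and does not appear in the slides; the names `LinearSearchSteps` and `comparisons` are hypothetical, and one comparison is treated as one step.

```java
// Illustrative sketch: counts the comparisons a linear search performs.
// The algorithm choice and all names are assumptions for this example.
class LinearSearchSteps {
    // Returns how many element comparisons were made before stopping.
    static int comparisons(int[] items, int target) {
        int steps = 0;
        for (int item : items) {
            steps++;                 // one comparison = one "step"
            if (item == target) {
                break;               // found: stop early
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] items = {4, 8, 15, 16, 23, 42};
        System.out.println(comparisons(items, 4));   // best case: first element
        System.out.println(comparisons(items, 42));  // last element: N steps
        System.out.println(comparisons(items, 7));   // absent: worst case, N steps
    }
}
```

For a list of length N the worst case is N comparisons, which is the formula one would write down for this algorithm.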
RAM model
• Typically use a simple model for basic operation costs
• RAM (Random Access Machine) model
– RAM model has all the basic operations: +, -, *, /, =, comparisons
– fixed sized integers (e.g., 32-bit)
– infinite memory
– All basic operations take exactly one time unit (one CPU instruction) to execute
Critique of the model
• Strengths:
– simple
– easier to prove things about the model than the real machine
– can estimate algorithm behavior on any hardware/software
• Weaknesses:
– not all operations take the same amount of time in a real machine
– does not account for page faults, disk accesses, limited memory, floating point math, etc.
Relative rates of growth
• Most algorithms' runtime can be expressed as a function of the input size N
• Rate of growth: measure of how quickly the graph of a function rises
• Goal: distinguish between fast- and slow-growing functions
– We only care about very large input sizes (for small sizes, most any algorithm is fast enough)
Growth rate example
Consider these graphs of functions. Perhaps each one represents an algorithm:
$n^3 + 2n^2$
$100n^2 + 1000$
• Which grows faster?
Growth rate example
• How about now?
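Since the graphs themselves are not reproduced in this transcript, the comparison can be redone numerically. The sketch below (hypothetical names `GrowthCrossover`, `cubic`, `quadratic`, `firstCrossover`) searches for the first n at which $n^3 + 2n^2$ exceeds $100n^2 + 1000$.

```java
// Illustrative sketch: finds where n^3 + 2n^2 overtakes 100n^2 + 1000.
// All names are hypothetical and not from the slides.
class GrowthCrossover {
    static long cubic(long n) {
        return n * n * n + 2 * n * n;
    }

    static long quadratic(long n) {
        return 100 * n * n + 1000;
    }

    // Smallest n >= 1 with cubic(n) > quadratic(n).
    static long firstCrossover() {
        long n = 1;
        while (cubic(n) <= quadratic(n)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        long n = firstCrossover();
        System.out.println("cubic first exceeds quadratic at n = " + n); // n = 99
        System.out.println(cubic(n) + " > " + quadratic(n));
    }
}
```

For small n the quadratic function is larger because of its big constants; past the crossover the cubic term dominates, which is why only large inputs matter when comparing growth rates.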
• “The fundamental law of computer science: As machines become more powerful, the efficiency of algorithms grows more important, not less.” — Nick Trefethen
• An algorithm (or function or technique …) that works well when used with large problems & large systems is said to be scalable.
– Or “it scales well”.
Big O
• Definition: $T(N) = O(f(N))$ if there exist positive constants $c$, $n_0$ such that: $T(N) \le c \cdot f(N)$ for all $N \ge n_0$
• Idea: We are concerned with how the function grows when N is large. We are not picky about constant factors: coarse distinctions among functions
• Lingo: "T(N) grows no faster than f(N)."
Big O: $f(N) = O(g(N))$
• $\exists\, c, n_0 > 0$ such that $f(N) \le c\, g(N)$ when $N \ge n_0$
• f(N) grows no faster than g(N) for “large” N
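As a worked instance of the definition, take $T(N) = 100N^2 + 1000$ and $f(N) = N^2$. Choosing $c = 101$ and $n_0 = 32$ works because $1000 \le N^2$ once $N \ge 32$. The sketch below spot-checks that choice over a finite range; the constants and the names `BigOWitness` and `holds` are assumptions for illustration, and a finite check is evidence, not a proof.

```java
// Illustrative sketch: spot-checks witnesses (c, n0) for 100N^2 + 1000 = O(N^2).
// The function, constants, and names are chosen for this example only.
class BigOWitness {
    // True if 100n^2 + 1000 <= c * n^2 for every n in [n0, upTo].
    static boolean holds(long c, long n0, long upTo) {
        for (long n = n0; n <= upTo; n++) {
            long t = 100 * n * n + 1000;
            if (t > c * n * n) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holds(101, 32, 100_000)); // true: 1000 <= n^2 for n >= 32
        System.out.println(holds(101, 31, 100_000)); // false: fails at n = 31
        System.out.println(holds(100, 1, 100_000));  // false: c = 100 is too small
    }
}
```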
Preferred big-O usage
• pick tightest bound. If $f(N) = 5N$, then:
$f(N) = O(N^5)$
$f(N) = O(N^3)$
$f(N) = O(N \log N)$
$f(N) = O(N)$ preferred
• ignore constant factors and low order terms
$T(N) = O(N)$, not $T(N) = O(5N)$
$T(N) = O(N^3)$, not $T(N) = O(N^3 + N^2 + N \log N)$
• remove non-base-2 logarithms
$f(N) = O(N \log_6 N)$
$f(N) = O(N \log N)$ preferred
Big-O of selected functions
Big Omega, Theta
• Defn: $T(N) = \Omega(g(N))$ if there are positive constants $c$ and $n_0$ such that $T(N) \ge c\, g(N)$ for all $N \ge n_0$
– Lingo: "T(N) grows no slower than g(N)."
• Defn: $T(N) = \Theta(g(N))$ if and only if $T(N) = O(g(N))$ and $T(N) = \Omega(g(N))$.
– Big-O, Omega, and Theta establish a relative ordering among all functions of N
Intuition about the notations
notation | intuition
O (Big-O) | ≤
Ω (Big-Omega) | ≥
Θ (Theta) | =
o (little-O) | <
Big-Omega: $f(N) = \Omega(g(N))$
• $\exists\, c, n_0 > 0$ such that $f(N) \ge c\, g(N)$ when $N \ge n_0$
• f(N) grows no slower than g(N) for “large” N
Big Theta: $f(N) = \Theta(g(N))$
• the growth rate of f(N) is the same as the growth rate of g(N)
• An $O(1)$ algorithm is constant time.
– The running time of such an algorithm is essentially independent of the input.
– Such algorithms are rare, since they cannot even read all of their input.
• An $O(\log_b n)$ [for some b] algorithm is logarithmic time.
– We do not care what b is.
• An $O(n)$ algorithm is linear time.
– Such algorithms are not rare.
– This is as fast as an algorithm can be and still read all of its input.
• An $O(n \log_b n)$ [for some b] algorithm is log-linear time.
– This is about as slow as an algorithm can be and still be truly useful (scalable).
• An $O(n^2)$ algorithm is quadratic time.
– These are usually too slow.
• An $O(b^n)$ [for some b] algorithm is exponential time.
– These algorithms are much too slow to be useful.
Hammerin’ the terminology
• $T(N) = O(f(N))$
– f(N) is an upper bound on T(N)
– T(N) grows no faster than f(N)
• $T(N) = \Omega(g(N))$
– g(N) is a lower bound on T(N)
– T(N) grows at least as fast as g(N)
• $T(N) = o(h(N))$ (little-O)
– T(N) grows strictly slower than h(N)
Notations
Asymptotically less than or equal to | O (Big-O)
Asymptotically greater than or equal to | Ω (Big-Omega)
Asymptotically equal to | Θ (Big-Theta)
Asymptotically strictly less | o (Little-O)
Facts about big-O
• If T(N) is a polynomial of degree k, then: $T(N) = \Theta(N^k)$
– example: $17n^3 + 2n^2 + 4n + 1 = \Theta(n^3)$
Hierarchy of Big-O
• Functions, ranked in increasing order of growth:
– $1$
– $\log \log n$
– $\log n$
– $n$
– $n \log n$
– $n^2$
– $n^2 \log n$
– $n^3$
...
– $2^n$
– $n!$
– $n^n$
Various growth rates
$T(n) = \Theta(1)$: constant time
$T(n) = \Theta(\log \log n)$: just about as fast as constant time
$T(n) = \Theta(\log n)$: logarithmic time
$T(n) = \Theta(n)$: linear time
$T(n) = \Theta(n \log n)$: log-linear time; famous for sorting
$T(n) = \Theta(n^2)$: quadratic time
$T(n) = \Theta(n^k)$: polynomial time
$T(n) = \Theta(2^n)$, $T(n) = \Theta(n^n)$, $T(n) = \Theta(n!)$: exponential time; practical for small values of n (e.g., n = 10 or n = 20)
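To get a feel for how far apart these growth rates are, the following sketch prints a few of the functions for n = 10 and n = 20. The class `GrowthTable` and its helpers are hypothetical names, and `double` is used only so that large values such as 20! do not overflow.

```java
// Illustrative sketch: tabulates several growth functions for small n.
// All names are hypothetical; values are doubles to avoid integer overflow.
class GrowthTable {
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    static double factorial(int n) {
        double result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("n | log n | n log n | n^2 | 2^n | n!");
        for (int n : new int[] {10, 20}) {
            System.out.printf("%d | %.2f | %.1f | %.0f | %.0f | %.0f%n",
                    n, log2(n), n * log2(n), Math.pow(n, 2), Math.pow(2, n), factorial(n));
        }
    }
}
```

Going from n = 10 to n = 20 roughly quadruples $n^2$ but multiplies $2^n$ by 1024, which is why the exponential rows are described as practical only for small n.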
Techniques for Determining Which Grows Faster
• Evaluate: $\lim_{N \to \infty} \frac{f(N)}{g(N)}$
limit is | Big-Oh relation
0 | $f(N) = o(g(N))$
$c \ne 0$ | $f(N) = \Theta(g(N))$
$\infty$ | $g(N) = o(f(N))$
no limit | no relation
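Techniques, cont'd
• L'Hôpital's rule:
If $\lim_{N \to \infty} f(N) = \infty$ and $\lim_{N \to \infty} g(N) = \infty$, then $\lim_{N \to \infty} \frac{f(N)}{g(N)} = \lim_{N \to \infty} \frac{f'(N)}{g'(N)}$
example: $f(N) = N$, $g(N) = \log N$
Use L'Hôpital's rule
$f'(N) = 1$, $g'(N) = 1/N$ (up to the constant factor $1/\ln 2$ for a base-2 logarithm)
$\lim_{N \to \infty} \frac{f'(N)}{g'(N)} = \lim_{N \to \infty} N = \infty$, so $g(N) = o(f(N))$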
Program loop runtimes
for (int i = 0; i < n; i += c) // O(n)
    statement(s);
• Adding to the loop counter means that the loop runtime grows linearly when compared to its maximum value n.
– Loop executes its body about n / c times (exactly ⌈n / c⌉, for a constant c ≥ 1).
for (int i = 1; i < n; i *= c) // O(log n); i starts at 1, because 0 * c stays 0
    statement(s);
• Multiplying the loop counter means that the maximum value n must grow exponentially to linearly increase the loop runtime; therefore, it is logarithmic.
– Loop executes its body about $\log_c n$ times (for a constant c > 1).
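The difference between adding to and multiplying the loop counter can be observed by counting iterations. This is a minimal sketch with hypothetical names (`LoopCounts`, `additive`, `multiplicative`); the multiplicative loop starts its counter at 1, since a counter that starts at 0 never grows, and it assumes c ≥ 2.

```java
// Illustrative sketch: counts loop iterations for i += c versus i *= c.
// All names are hypothetical and not from the slides.
class LoopCounts {
    // Counter grows by addition: about n / c iterations (requires c >= 1).
    static int additive(int n, int c) {
        int count = 0;
        for (int i = 0; i < n; i += c) {
            count++;
        }
        return count;
    }

    // Counter grows by multiplication: about log_c(n) iterations (requires c >= 2).
    static int multiplicative(int n, int c) {
        int count = 0;
        for (long i = 1; i < n; i *= c) { // long avoids overflow near Integer.MAX_VALUE
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(additive(1000, 1));        // 1000
        System.out.println(additive(1000, 3));        // 334
        System.out.println(multiplicative(1024, 2));  // 10
        System.out.println(multiplicative(1_000_000, 2)); // 20
    }
}
```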
for (int i = 0; i < n * n; i += c) // O(n^2)
    statement(s);
• The loop maximum is $n^2$, so the runtime is quadratic.
– Loop executes its body about $n^2 / c$ times.
More loop runtimes
• Nesting loops multiplies their runtimes.
for (int i = 0; i < n; i += c) { // O(n^2)
    for (int j = 0; j < n; j += c) { // the inner loop must advance j, not i
        statement;
    }
}
• Loops in sequence add together their runtimes, which means the loop set with the larger runtime dominates.
for (int i = 0; i < n; i += c) { // O(n)
    statement;
}
// O(n log n)
for (int i = 0; i < n; i += c) {
    for (int j = 1; j < n; j *= c) { // j starts at 1 and is the counter that is multiplied
        statement;
    }
}
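The nesting and sequencing rules can be checked the same way, by counting how often the innermost statement runs. The sketch below uses hypothetical names (`NestedLoopCounts`, `nested`, `linearThenLogLinear`) and assumes the inner multiplicative loop starts at 1 and advances its own counter j, with c ≥ 2 in that case.

```java
// Illustrative sketch: counts statement executions for nested and sequential loops.
// All names are hypothetical and not from the slides.
class NestedLoopCounts {
    // Two nested additive loops: about (n / c)^2 executions (requires c >= 1).
    static long nested(int n, int c) {
        long count = 0;
        for (int i = 0; i < n; i += c) {
            for (int j = 0; j < n; j += c) {
                count++;
            }
        }
        return count;
    }

    // An O(n) loop followed by an O(n log n) nest; the totals add (requires c >= 2).
    static long linearThenLogLinear(int n, int c) {
        long count = 0;
        for (int i = 0; i < n; i += c) {
            count++;
        }
        for (int i = 0; i < n; i += c) {
            for (long j = 1; j < n; j *= c) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(nested(100, 1));               // 10000
        System.out.println(linearThenLogLinear(1024, 2)); // 512 + 512 * 10 = 5632
    }
}
```

In the second method the 512 iterations of the first loop are small next to the 5120 of the nest, which illustrates why the larger term dominates the sum.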
Types of runtime analysis
• Express the running time as f(N), where N is the size of the input
• worst case: your enemy gets to pick the input
• average case: need to assume a probability distribution on the inputs
Some rules
When considering the growth rate of a function using Big-O:
• Ignore the lower order terms and the coefficients of the highest-order term
• No need to specify the base of logarithm
– Changing the base from one constant to another changes the value of the logarithm by only a constant factor
• If $T_1(N) = O(f(N))$ and $T_2(N) = O(g(N))$, then
– $T_1(N) + T_2(N) = \max(O(f(N)), O(g(N)))$,
– $T_1(N) \cdot T_2(N) = O(f(N) \cdot g(N))$