CSC 332 – Data Structures
Generics; Analysis of Algorithms
Dr. Curry Guinn

For Next Class, Thursday

• Homework 1 due tonight
  – Quiz 2 – Today, 01/21, before class!
    • Up to 3 submissions
  – Quiz 3 Thursday by class time
• Homework 2 due Monday, 01/27, 11:59pm
• For Thursday
  – Chapter 2, Sections 2.1-2.4.2

Quiz Answers

• Let’s go to Blackboard Learn and see

Generics

• http://people.uncw.edu/guinnc/courses/spring14/332/notes/day2_unix/Generics.ppt

Is This Algorithm Fast?

• Problem: given a problem, how fast does this code solve that problem?
• "My program finds all the primes between 2 and 1,000,000,000 in 1.37 seconds."
  – How good is this solution?
• Could try to measure the time it takes, but that is subject to lots of errors
  – multitasking operating system
  – speed of computer
  – language solution is written in

Math background: exponents

• Exponents
  – $X^Y$, or "X to the Yth power"; X multiplied by itself Y times
• Some useful identities
  – $X^A X^B = X^{A+B}$
  – $X^A / X^B = X^{A-B}$
  – $(X^A)^B = X^{AB}$
  – $X^N + X^N = 2X^N$
  – $2^N + 2^N = 2^{N+1}$
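The last two identities are the ones most often confused, so a short derivation may help. It uses only the product rule $X^A X^B = X^{A+B}$ from the list above:

```latex
\begin{align*}
2^N + 2^N &= 2 \cdot 2^N && \text{(two copies of the same term)} \\
          &= 2^1 \cdot 2^N \\
          &= 2^{N+1}       && \text{(product rule)} \\[4pt]
X^N + X^N &= 2X^N \neq X^{2N} && \text{(adding does not add exponents)}
\end{align*}
```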

Math background: Logarithms

• Logarithms
  – definition: $X^A = B$ if and only if $\log_X B = A$
  – intuition: $\log_X B$ means: "the power X must be raised to, to get B"
  – In this course, a logarithm with no base implies base 2: $\log B$ means $\log_2 B$
• Examples
  – $\log_2 16 = 4$ (because $2^4 = 16$)
  – $\log_{10} 1000 = 3$ (because $10^3 = 1000$)

Logarithm identities

Identities for logs with addition, multiplication, powers:

• $\log(AB) = \log A + \log B$
• $\log(A/B) = \log A - \log B$
• $\log(A^B) = B \log A$
• $\log_A B = \log_C B / \log_C A$
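The identities can be spot-checked numerically. The following is a minimal sketch, not part of the original slides: the class name `LogIdentities` and the helper `logBase` are made up for illustration. Java's `Math.log` is the natural log, so the helper itself is an application of the change-of-base identity. Results are floating-point, so each pair agrees only approximately.

```java
// Illustrative sketch: numeric spot-check of the logarithm identities.
// LogIdentities and logBase are hypothetical names, not from the slides.
class LogIdentities {
    // Change of base: log_base(x) = ln(x) / ln(base).
    static double logBase(double base, double x) {
        return Math.log(x) / Math.log(base);
    }

    public static void main(String[] args) {
        double a = 8.0, b = 32.0;

        // log(AB) = log A + log B
        System.out.println(logBase(2, a * b) + " vs " + (logBase(2, a) + logBase(2, b)));

        // log(A/B) = log A - log B
        System.out.println(logBase(2, a / b) + " vs " + (logBase(2, a) - logBase(2, b)));

        // log(A^B) = B log A (a small exponent keeps the power exactly representable)
        double exponent = 5.0;
        System.out.println(logBase(2, Math.pow(a, exponent)) + " vs " + (exponent * logBase(2, a)));
    }
}
```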

Some helpful mathematics

• N + N + N + … + N (total of N times)
  – $N \cdot N = N^2$, which is $O(N^2)$
• 1 + 2 + 3 + 4 + … + N
  – $N(N+1)/2 = N^2/2 + N/2$, which is $O(N^2)$
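As a quick check of the second sum, the sketch below adds 1 through N with a loop and compares the result with the closed form N(N+1)/2. The class and method names are illustrative only.

```java
// Illustrative sketch: compare the loop sum 1 + 2 + ... + n with n(n+1)/2.
class SumFormulas {
    // Adds the terms one at a time: n additions.
    static long sumByLoop(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Closed form: a constant number of arithmetic operations.
    static long sumByFormula(int n) {
        return (long) n * (n + 1) / 2;
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 100, 1000}) {
            System.out.println(n + ": " + sumByLoop(n) + " == " + sumByFormula(n));
        }
        // prints:
        // 10: 55 == 55
        // 100: 5050 == 5050
        // 1000: 500500 == 500500
    }
}
```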

Analysis of Algorithms

• What do we mean by an “efficient” algorithm?
  – We mean an algorithm that uses few resources.
  – By far the most important resource is time.
  – Thus, when we say an algorithm is efficient (assuming we do not qualify this further), we mean that it can be executed quickly.
• Is there some way to measure efficiency that does not depend on the state of current technology?
  – Yes!
• The Idea
  – Determine the number of “steps” an algorithm requires when given some input.
    • We need to define “step” in some reasonable way.
  – Write this as a formula, based on the input.

Generally, when we determine the efficiency of an algorithm, we are interested in:
– Time Used by the Algorithm
  • Expressed in terms of number of steps.
  • People also talk about “space efficiency”, etc.
– How the Size of the Input Affects Running Time
  • Think about giving an algorithm a list of items to operate on. The size of the problem is the length of the list.
– Worst-Case Behavior
  • What is the slowest the algorithm ever runs for a given input size?
  • Occasionally we also analyze average-case behavior.
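To make "number of steps" concrete, the sketch below counts comparisons in a linear search and treats each comparison as one step. This choice of algorithm and of what counts as a step is an assumption for illustration, and the names are hypothetical. The worst case for a list of length N is N steps, reached when the target is last or absent.

```java
// Illustrative sketch: count "steps" (comparisons) made by a linear search.
class StepCounter {
    // Returns how many comparisons the search performs before stopping.
    static int comparisonsInLinearSearch(int[] items, int target) {
        int steps = 0;
        for (int item : items) {
            steps++;                 // one comparison = one step
            if (item == target) {
                break;               // found: stop early
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] data = {4, 8, 15, 16, 23, 42};
        System.out.println(comparisonsInLinearSearch(data, 4));   // best case: prints 1
        System.out.println(comparisonsInLinearSearch(data, 42));  // worst case (last): prints 6
        System.out.println(comparisonsInLinearSearch(data, 99));  // worst case (absent): prints 6
    }
}
```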

RAM model

• Typically use a simple model for basic operation costs
• RAM (Random Access Machine) model
  – RAM model has all the basic operations: +, -, *, /, =, comparisons
  – fixed sized integers (e.g., 32-bit)
  – infinite memory
  – All basic operations take exactly one time unit (one CPU instruction) to execute

Critique of the model

• Strengths:
  – simple
  – easier to prove things about the model than the real machine
  – can estimate algorithm behavior on any hardware/software
• Weaknesses:
  – not all operations take the same amount of time in a real machine
  – does not account for page faults, disk accesses, limited memory, floating point math, etc.

Relative rates of growth

• Most algorithms' runtime can be expressed as a function of the input size N
• Rate of growth: measure of how quickly the graph of a function rises
• Goal: distinguish between fast- and slow-growing functions
  – We only care about very large input sizes (for small sizes, almost any algorithm is fast enough)

Growth rate example

Consider these graphs of functions. Perhaps each one represents an algorithm:
    $n^3 + 2n^2$
    $100n^2 + 1000$

• Which grows faster?

Growth rate example

• How about now?
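Without the graphs at hand, a small computation can stand in for them. The sketch below finds the first n at which n^3 + 2n^2 exceeds 100n^2 + 1000. Below that point the quadratic looks larger; from that point on the cubic stays ahead. Class and method names are illustrative.

```java
// Illustrative sketch: where does n^3 + 2n^2 overtake 100n^2 + 1000?
class GrowthCrossover {
    static long cubic(long n) {
        return n * n * n + 2 * n * n;
    }

    static long quadratic(long n) {
        return 100 * n * n + 1000;
    }

    // Smallest n >= 1 for which the cubic function is strictly larger.
    static long firstCrossover() {
        long n = 1;
        while (cubic(n) <= quadratic(n)) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(cubic(10) + " vs " + quadratic(10));   // prints 1200 vs 11000
        System.out.println(firstCrossover());                      // prints 99
    }
}
```

For small n the quadratic dominates because of its large constants, which is why growth rates are compared for large inputs only.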

• “The fundamental law of computer science: As machines become more powerful, the efficiency of algorithms grows more important, not less.” — Nick Trefethen

• An algorithm (or function or technique …) that works well when used with large problems & large systems is said to be scalable.
  – Or “it scales well”.

Big O

• Definition: $T(N) = O(f(N))$ if there exist positive constants $c$, $n_0$ such that:
    $T(N) \le c \cdot f(N)$ for all $N \ge n_0$
• Idea: We are concerned with how the function grows when N is large. We are not picky about constant factors: we make only coarse distinctions among functions.
• Lingo: "T(N) grows no faster than f(N)."
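A worked instance of the definition may help. The function $T(N) = 3N + 5$ and the witnesses $c = 4$, $n_0 = 5$ are chosen here for illustration; they are not from the slides, and many other witness pairs work equally well.

```latex
\begin{align*}
T(N) &= 3N + 5 \\
     &\le 3N + N && \text{whenever } N \ge 5 \\
     &= 4N.
\end{align*}
% So with c = 4 and n_0 = 5, T(N) \le c \cdot N for all N \ge n_0,
% which is exactly the statement T(N) = O(N).
```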

Big O

• $f(N) = O(g(N))$: $\exists\, c, n_0 > 0$ such that $f(N) \le c\,g(N)$ when $N \ge n_0$
• f(N) grows no faster than g(N) for “large” N

Preferred big-O usage

• pick tightest bound. If $f(N) = 5N$, then:
    $f(N) = O(N^5)$
    $f(N) = O(N^3)$
    $f(N) = O(N \log N)$
    $f(N) = O(N)$    preferred
• ignore constant factors and low order terms
    $T(N) = O(N)$, not $T(N) = O(5N)$
    $T(N) = O(N^3)$, not $T(N) = O(N^3 + N^2 + N \log N)$
• remove non-base-2 logarithms
    $f(N) = O(N \log_6 N)$
    $f(N) = O(N \log N)$    preferred

Big-O of selected functions

Big Omega, Theta

• Defn: $T(N) = \Omega(g(N))$ if there are positive constants $c$ and $n_0$ such that $T(N) \ge c\,g(N)$ for all $N \ge n_0$
  – Lingo: "T(N) grows no slower than g(N)."
• Defn: $T(N) = \Theta(g(N))$ if and only if $T(N) = O(g(N))$ and $T(N) = \Omega(g(N))$.
  – Big-O, Omega, and Theta establish a relative ordering among all functions of N

Intuition about the notations

notation          intuition
O (Big-O)         ≤
Ω (Big-Omega)     ≥
Θ (Theta)         =
o (little-O)      <

Big-Omega

• $\exists\, c, n_0 > 0$ such that $f(N) \ge c\,g(N)$ when $N \ge n_0$
• f(N) grows no slower than g(N) for “large” N

Big Theta: $f(N) = \Theta(g(N))$

• the growth rate of f(N) is the same as the growth rate of g(N)

• An $O(1)$ algorithm is constant time.
  – The running time of such an algorithm is essentially independent of the input.
  – Such algorithms are rare, since they cannot even read all of their input.
• An $O(\log_b n)$ [for some b] algorithm is logarithmic time.
  – We do not care what b is.
• An $O(n)$ algorithm is linear time.
  – Such algorithms are not rare.
  – This is as fast as an algorithm can be and still read all of its input.
• An $O(n \log_b n)$ [for some b] algorithm is log-linear time.
  – This is about as slow as an algorithm can be and still be truly useful (scalable).
• An $O(n^2)$ algorithm is quadratic time.
  – These are usually too slow.
• An $O(b^n)$ [for some b] algorithm is exponential time.
  – These algorithms are much too slow to be useful.

Hammerin’ the terminology

• $T(N) = O(f(N))$
  – f(N) is an upper bound on T(N)
  – T(N) grows no faster than f(N)
• $T(N) = \Omega(g(N))$
  – g(N) is a lower bound on T(N)
  – T(N) grows at least as fast as g(N)
• $T(N) = o(h(N))$ (little-O)
  – T(N) grows strictly slower than h(N)

Notations

Asymptotically less than or equal to       O (Big-O)
Asymptotically greater than or equal to    Ω (Big-Omega)
Asymptotically equal to                    Θ (Big-Theta)
Asymptotically strictly less               o (Little-O)

Facts about big-O

• If T(N) is a polynomial of degree k, then: $T(N) = \Theta(N^k)$
  – example: $17n^3 + 2n^2 + 4n + 1 = \Theta(n^3)$
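The example can be verified directly from the definitions by sandwiching the polynomial between two multiples of $n^3$. The constants 17 and 24 below are one convenient choice, not the only one.

```latex
\begin{align*}
17n^3 \;\le\; 17n^3 + 2n^2 + 4n + 1
      \;\le\; 17n^3 + 2n^3 + 4n^3 + n^3 \;=\; 24n^3
      \qquad \text{for all } n \ge 1.
\end{align*}
% Lower bound: c = 17 gives \Omega(n^3).
% Upper bound: c = 24 gives O(n^3).
% Both hold with n_0 = 1, so the polynomial is \Theta(n^3).
```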

Hierarchy of Big-O

• Functions, ranked in increasing order of growth:
  – $1$
  – $\log \log n$
  – $\log n$
  – $n$
  – $n \log n$
  – $n^2$
  – $n^2 \log n$
  – $n^3$
  ...
  – $2^n$
  – $n!$
  – $n^n$

Various growth rates

• $T(n) = \Theta(1)$: constant time
• $T(n) = \Theta(\log \log n)$: just about as fast as constant time
• $T(n) = \Theta(\log n)$: logarithmic time
• $T(n) = \Theta(n)$: linear time
• $T(n) = \Theta(n \log n)$: log-linear time, famous for sorting
• $T(n) = \Theta(n^2)$: quadratic time
• $T(n) = \Theta(n^k)$: polynomial time
• $T(n) = \Theta(2^n)$, $T(n) = \Theta(n^n)$, $T(n) = \Theta(n!)$: exponential time; practical for small values of n (e.g., n = 10 or n = 20)
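To see why the fastest-growing functions are practical only for small n, the sketch below evaluates $2^n$, $n!$, and $n^n$ at n = 10 and n = 20. It uses `java.math.BigInteger` because $20^{20}$ does not fit in a `long`. The class and method names are illustrative.

```java
import java.math.BigInteger;

// Illustrative sketch: exact values of 2^n, n!, and n^n for small n.
class FastGrowers {
    static BigInteger powerOfTwo(int n) {
        return BigInteger.valueOf(2).pow(n);
    }

    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    static BigInteger nToTheN(int n) {
        return BigInteger.valueOf(n).pow(n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {10, 20}) {
            System.out.println("n = " + n
                    + ": 2^n = " + powerOfTwo(n)
                    + ", n! = " + factorial(n)
                    + ", n^n = " + nToTheN(n));
        }
        // prints:
        // n = 10: 2^n = 1024, n! = 3628800, n^n = 10000000000
        // n = 20: 2^n = 1048576, n! = 2432902008176640000, n^n = 104857600000000000000000000
    }
}
```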

Techniques for Determining Which Grows Faster

• Evaluate: $\lim_{N \to \infty} \dfrac{f(N)}{g(N)}$

limit is      Big-Oh relation
0             f(N) = o(g(N))
c ≠ 0         f(N) = Θ(g(N))
∞             g(N) = o(f(N))
no limit      no relation

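Techniques, cont'd

• L'Hôpital's rule:
  If $\lim_{N \to \infty} f(N) = \infty$ and $\lim_{N \to \infty} g(N) = \infty$, then
  $\lim_{N \to \infty} \dfrac{f(N)}{g(N)} = \lim_{N \to \infty} \dfrac{f'(N)}{g'(N)}$

  example: f(N) = N, g(N) = log N
  Use L'Hôpital's rule
  f'(N) = 1, g'(N) = 1/N (taking the natural log; any other base changes only a constant factor)
  $\lim_{N \to \infty} \dfrac{f'(N)}{g'(N)} = \lim_{N \to \infty} N = \infty$
  so g(N) = o(f(N))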
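The limit can also be explored numerically. The sketch below prints N / log N for increasing N, using base 2 as elsewhere in these notes. A growing ratio suggests that log N = o(N), but it is evidence rather than a proof. The names are illustrative.

```java
// Illustrative sketch: watch f(N)/g(N) = N / log2(N) grow as N increases.
class LimitRatio {
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // Ratio f(N) / g(N) for f(N) = N and g(N) = log2(N); requires n > 1.
    static double ratio(double n) {
        return n / log2(n);
    }

    public static void main(String[] args) {
        for (double n : new double[] {1e1, 1e3, 1e6, 1e9}) {
            System.out.println("N = " + n + ", N / log N = " + ratio(n));
        }
    }
}
```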

Program loop runtimes

    for (int i = 0; i < n; i += c)   // O(n)
        statement(s);

• Adding to the loop counter means that the loop runtime grows linearly when compared to its maximum value n.
  – Loop executes its body ⌈n / c⌉ times, i.e. about n / c times (for a constant c ≥ 1).
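A minimal sketch that counts the iterations of the additive loop and shows the count matching n / c rounded up. The class and method names are illustrative.

```java
// Illustrative sketch: count iterations of a loop that adds c to its counter.
class LinearLoop {
    // Number of times the body runs for: for (i = 0; i < n; i += c). Requires c >= 1.
    static int countIterations(int n, int c) {
        int count = 0;
        for (int i = 0; i < n; i += c) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countIterations(12, 3));  // i = 0, 3, 6, 9: prints 4
        System.out.println(countIterations(10, 3));  // i = 0, 3, 6, 9: prints 4 (10/3 rounded up)
        System.out.println(countIterations(10, 1));  // prints 10
    }
}
```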

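    for (int i = 1; i < n; i *= c)   // O(log n)
        statement(s);

• Multiplying the loop counter means that the maximum value n must grow exponentially to linearly increase the loop runtime; therefore, it is logarithmic.
  – The counter must start at 1, not 0: zero multiplied by c stays zero, so the loop would never end.
  – Loop executes its body ⌈$\log_c n$⌉ times, i.e. about $\log_c n$ times (for a constant c ≥ 2).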
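A minimal sketch that counts the iterations of a multiplicative loop. It starts the counter at 1, since a counter that starts at 0 never changes when multiplied, and it assumes c ≥ 2. The names are illustrative.

```java
// Illustrative sketch: count iterations of a loop that multiplies its counter by c.
class MultiplicativeLoop {
    // Number of times the body runs for: for (i = 1; i < n; i *= c). Requires c >= 2.
    static int countIterations(int n, int c) {
        int count = 0;
        // long counter so that i * c cannot overflow for large int n
        for (long i = 1; i < n; i *= c) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countIterations(16, 2));    // i = 1, 2, 4, 8: prints 4
        System.out.println(countIterations(1000, 2));  // i = 1, 2, ..., 512: prints 10
        System.out.println(countIterations(1000, 10)); // i = 1, 10, 100: prints 3
    }
}
```

Going from n = 16 to n = 1000 multiplies the input size by more than 60 but adds only 6 iterations, which is the logarithmic behavior described above.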

    for (int i = 0; i < n * n; i += c)   // O(n^2)
        statement(s);

• The loop maximum is $n^2$, so the runtime is quadratic.
  – Loop executes its body ⌈$n^2 / c$⌉ times, i.e. about $n^2 / c$ times.

More loop runtimes

• Nesting loops multiplies their runtimes.

    for (int i = 0; i < n; i += c) {       // O(n^2)
        for (int j = 0; j < n; j += c) {   // inner loop advances j, not i
            statement;
        }
    }

• Loops in sequence add together their runtimes, which means the loop set with the larger runtime dominates.

    for (int i = 0; i < n; i += c) {       // O(n)
        statement;
    }
    // O(n log n)
    for (int i = 0; i < n; i += c) {
        for (int j = 1; j < n; j *= c) {   // inner counter starts at 1 and advances j
            statement;
        }
    }
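The multiplication rule for nested loops can be checked by counting how often the innermost statement runs. In this sketch each inner loop advances its own counter j, and the multiplicative inner loop starts j at 1 so that it terminates. Class and method names are illustrative.

```java
// Illustrative sketch: count innermost executions of nested loops.
class NestedLoops {
    // Additive loop inside additive loop: about (n/c) * (n/c) executions, O(n^2).
    static long nestedAdditive(int n, int c) {
        long count = 0;
        for (int i = 0; i < n; i += c) {
            for (int j = 0; j < n; j += c) {
                count++;
            }
        }
        return count;
    }

    // Multiplicative loop inside additive loop: about (n/c) * log_c(n), O(n log n).
    // Requires c >= 2 so the inner loop makes progress.
    static long additiveThenMultiplicative(int n, int c) {
        long count = 0;
        for (int i = 0; i < n; i += c) {
            for (long j = 1; j < n; j *= c) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(nestedAdditive(10, 1));              // 10 * 10: prints 100
        System.out.println(nestedAdditive(10, 2));              // 5 * 5: prints 25
        System.out.println(additiveThenMultiplicative(16, 2));  // 8 * 4: prints 32
    }
}
```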

Types of runtime analysis

• Express the running time as f(N), where N is the size of the input
• worst case: your enemy gets to pick the input
• average case: need to assume a probability distribution on the inputs

Some rules

When considering the growth rate of a function using Big-O

• Ignore the lower order terms and the coefficients of the highest-order term

• No need to specify the base of logarithm
  – Changing the base from one constant to another changes the value of the logarithm by only a constant factor
• If $T_1(N) = O(f(N))$ and $T_2(N) = O(g(N))$, then
  – $T_1(N) + T_2(N) = O(\max(f(N), g(N)))$,
  – $T_1(N) \cdot T_2(N) = O(f(N) \cdot g(N))$
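A small worked instance of the sum and product rules. The functions $T_1$ and $T_2$ here are hypothetical, chosen only to illustrate the rules.

```latex
% Suppose T_1(N) = O(N^2) (say, a doubly nested loop)
% and     T_2(N) = O(N)   (say, a single loop).
\begin{align*}
T_1(N) + T_2(N) &= O(\max(N^2, N)) = O(N^2)
  && \text{(run one after the other: the larger term dominates)} \\
T_1(N) \cdot T_2(N) &= O(N^2 \cdot N) = O(N^3)
  && \text{(one nested inside the other: the bounds multiply)}
\end{align*}
```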

