theory of algorithms: fundamentals of the analysis of algorithm efficiency

Theory of Algorithms:Theory of Algorithms:Fundamentals of the Analysis of Fundamentals of the Analysis of

Algorithm EfficiencyAlgorithm Efficiency

Theory of Algorithms:Theory of Algorithms:Fundamentals of the Analysis of Fundamentals of the Analysis of

Algorithm EfficiencyAlgorithm EfficiencyJames Gain and Edwin Blakejgain | edwin @cs.uct.ac.za

Department of Computer Science

University of Cape Town

August - October 2004

ObjectivesObjectives

To outline a general Analytic Framework

To introduce the conceptual tools of: Asymptotic Notations

Base Efficiency Classes

To cover Techniques for the Analysis of Algorithms: Mathematical (non-recursive and recursive)

Empirical

Visualisation

Analysis of AlgorithmsAnalysis of Algorithms

Issues: Correctness (Is it guaranteed to produce a correct

correspondence between inputs and outputs?)

Time efficiency (How fast does it run?)

Space efficiency (How much extra space does it require?)

Optimality (Is this provably the best possible solution?)

Approaches: Theoretical analysis

Empirical analysis

Visualisation

Theoretical analysis of time efficiencyTheoretical analysis of time efficiency

Time efficiency is analyzed by determining the number of repetitions of the basic operation as a function of input size

Basic operation: the operation that contributes most towards the running time of the algorithm.

T(n) ≈ copC(n)running time execution time

for basic operation

Number of times basic operation is

executed

input size

Input size and basic operationsInput size and basic operations

ProblemInput size measure

Basic operation

Search for key in list of n items

Number of items in list n

Key comparison

Multiply two matrices of floating point numbers

Dimensions of matrices

Floating point multiplication

Compute an nFloating point multiplication

Graph problem#vertices and/or edges

Visiting a vertex or traversing an edge

Best-, Average- and Worst-casesBest-, Average- and Worst-cases

For some algorithms efficiency depends on type of input:

Worst case: W(n) – max over inputs of size n Best case: B(n) – min over inputs of size n Average case: A(n) – “avg” over inputs of size n

Number of times the basic operation will be executed on typical input

NOT the average of worst and best case Expected number of basic operations repetitions

considered as a random variable under some assumption about the probability distribution of all possible inputs of size n

Amortized EfficiencyAmortized Efficiency

From a data structures perspective the total time of a sequence of operations is important

Real-time apps are an exception

Amortized efficiency: time of an operation averaged over a worst-case sequence of operations

Desire “self-organizing” data structures that adapt to the context of use, e.g. red-black trees

Exercise: Sequential searchExercise: Sequential search

Problem: Given a list of n elements and a search key K, find an element equal to K, if any.

Algorithm: Scan the list and compare its successive elements with K until either a matching element is found (successful search) of the list is exhausted (unsuccessful search)

Calculate the: Worst case?

Best case?

Average case?

Types of Formulas for Operation CountsTypes of Formulas for Operation Counts

Exact formula e.g., C(n) = n(n-1)/2

Formula indicating order of growth with specific multiplicative constant e.g., C(n) ≈ 0.5 n2

Formula indicating order of growth with unknown multiplicative constant e.g., C(n) ≈ cn2

Order of GrowthOrder of Growth

Most important: Order of growth within a constant multiple as n∞

Example:How much faster will algorithm run on computer that is twice as fast?

How much longer does it take to solve problem of double input size?

Asymptotic NotationsAsymptotic Notations

A way of comparing functions that ignores constant factors and small input sizes

O(g(n)): class of functions f(n) that grow no faster than g(n) f(n) ≤ c g(n) for all n ≥ n0

Θ(g(n)): class of functions f(n) that grow at same rate as g(n) c2 g(n) ≤ f(n) ≤ c1 g(n) for all n ≥ n0

Ω (g(n)): class of functions f(n) that grow at least as fast as g(n) f(n) ≥ c g(n) for all n ≥ n0

Big-Oh: t(n) Big-Oh: t(n) O(g(n)) O(g(n))

Big-Omega: t(n) Big-Omega: t(n) Ω(g(n)) Ω(g(n))

Big-Theta: t(n) Big-Theta: t(n) ΘΘ(g(n)) (g(n))

Establishing Rate of Growth: using LimitsEstablishing Rate of Growth: using Limits

limn→∞ t(n)/g(n) =

0 implies order of t(n) < order of g(n)

c implies order of t(n) = order of g(n)

∞ implies order of t(n) > order of g(n)

Example: 10n vs. 2n2

limn→∞ 10n/2n2 = 5 limn→∞ 1/n = 0

Exercises: n(n+1)/2 vs. n2

logb n vs. logc n

L’Hôpital’s ruleL’Hôpital’s rule

If

limn→∞ f(n) = limn → ∞ g(n) = ∞

and the derivatives f ', g' exist,

Then

limn→∞ f(n) / g(n) = limn → ∞ f '(n) / g'(n)

Example: Example: log n vs. n

limn→∞ log n / n = limn→∞ 1/n log e = log e limn→∞ 1/n = 0

So, order log n < order n log n < order n

Establishing Rate of Growth: Establishing Rate of Growth: using Definitionsusing Definitions

f(n) is O(g(n)) if order of growth of f(n) ≤ order of growth of g(n) (within constant multiple)

There exist positive constant c and non-negative integer n0 such that

f(n) ≤ c g(n) for every n ≥ n0

This needs to be mathematically provable

Examples: 10n is O(2n2) because 10n < 2n2 for n > 5

5n+20 is O(10n) because 5n+20 < 10n for n > 4

Basic Asymptotic Efficiency ClassesBasic Asymptotic Efficiency Classes

1 constant Outside best-case, few examples

log n logarithmic Algorithms that decrease by a constant

n linear Algorithms that scan an n-sized list

n log n n log n Algorithms that divide and conquer, e.g. quicksort

n2 quadratic Typically two embedded loops

n3 cubic Typically three embedded loops

2n exponential Algorithms that generate all subsets of an n-element list

n! factorial Algorithms that generate all permutations of an n-element list

Time Efficiency of Non-recursive Time Efficiency of Non-recursive AlgorithmsAlgorithms

Steps in mathematical analysis of nonrecursive algorithms:

1. Decide on parameter n indicating input size

2. Identify algorithm’s basic operation

3. Determine worst, average, and best case for input of size n

4. Set up summation for C(n) reflecting algorithm’s loop structure

5. Simplify summation using standard formulas (see Appendix A of textbook)

Examples: Analysing non-recursive Examples: Analysing non-recursive algorithmsalgorithms

Matrix multiplication Multiply two square matrices of order n using dot

product of rows and columns

Selection sort Find smallest element in remainder of list and swap

with current element

Insertion sort Assume sub-list is sorted an insert current element

Mystery Algorithm Exercise

Matrix MultiplicationMatrix Multiplication

1. n = matrix order

2. Basic op = multiplication or addition

3. Best case = worst case = average case

4. 31

0

21

0

1

0

1

0

1

0

1

0

1)( nnnnMn

i

n

j

n

i

n

k

n

j

n

i

Selection SortSelection Sort

l n = number of list elements

l basic op = comparison

l best case = worst case = average case

l 2

)1(11)(

2

0

1

1

2

0

nninnC

n

i

n

ij

n

i

Insertion SortInsertion Sort

l n = number of list elements

l basic op = comparision

l best case != worst case != average case

l best case: A[j] > v executed only once on each iteration

)(11)(1

1

nnnCn

jbest

Exercise: Mystery AlgorithmExercise: Mystery Algorithm

What does this algorithm compute? What is its basic operation? What is its efficiency class? Suggest an improvement and analyse your improvement

// Input: A matrix A[0..n-1, 0..n-1] of real numbers

for i 0 to n - 1 do

for j i to n-1 do

if A[i,j] A[j,i]

return false

return true

Factorial: a Recursive FunctionFactorial: a Recursive Function

Recursive Evaluation of n!

Definition: n ! = 1*2*…*(n-1)*n

Recursive form for n!: F(n) = F(n-1) * n, F(0) = 1

Algorithm: IF n=0 THEN F(n) 1

ELSE F(n) F(n-1) * n

RETURN F(n)

Need to find the number of multiplications M(n) to compute n!

Recurrence RelationsRecurrence Relations

Definition: An equation or inequality that describes a function in terms of its

value on smaller inputs

Recurrence: The recursive step

E.g., M(n) = M(n-1) + 1 for n > 0

Initial Condition: The terminating step

E.g., M(0) = 0 [call stops when n = 0, no mults when n = 0]

Must differentiate the recursive function (factorial calculation) from the recurrence relation (number of basic operations)

Recurrence Solution for n!Recurrence Solution for n!

Need an analytic solution for M(n)

Solved by Method of Backward Substitution: M(n) = M(n-1) + 1

Substitute M(n-1) = M(n-2) + 1

= [M(n-2) + 1] +1 = M(n-2) + 2

Substitute M(n-2) = M(n-3) + 1

= [M(n-3) + 1] + 2 = M(n-3) + 3

Pattern: M(n) = M(n-i) + i

Ultimately: M(n) = M(n-n)+n = M(0) + n = n

Plan for Analysis of Recursive Plan for Analysis of Recursive AlgorithmsAlgorithms

Steps in mathematical analysis of recursive algorithms:

1. Decide on parameter n indicating input size

2. Identify algorithm’s basic operation

3. Determine worst, average, and best case for input of size n

4. (a) Set up a recurrence relation and initial condition(s) for C(n)-the number of times the basic operation will

be executed for an input of size n OR

(b) Alternatively, count recursive calls

5. (a) Solve the recurrence to obtain a closed form OR

(b) Estimate the order of magnitude of the solution (see Appendix B)

Iterative Methods for Solving Iterative Methods for Solving RecurrencesRecurrences

Method of Forward Substitution: Starting from the initial condition generate the first few

terms Look for a pattern expressible as a closed formula Check validity by direct substitution or induction Limited because pattern is hard to spot

Method of Backward Substitution: Express x(n-1) successively as a function of x(n-2),

x(n-3), … Derive x(n) as a function of x(n-i) Substitute n-i = base condition Surprisingly successful

Templates for Solving RecurrencesTemplates for Solving Recurrences

Decrease by One Recurrence

One (constant) operation reduces problem size by one Examples: Factorial

T(n) = T(n-1) + c T(1) = d

Solution: T(n) = (n-1)c + d

Linear Order

A pass through input reduces problem size by one Examples: Insertion Sort

T(n) = T(n-1) + cn T(1) = d

Solution: T(n) = [n(n+1)/2 – 1] c + d

Quadratic Order

More Templates for Solving More Templates for Solving RecurrencesRecurrences

Decrease by a Constant Factor

One (constant) operation reduces problem size by half Example: binary search

T(n) = T(n/2) + c T(1) = d

Solution: T(n) = c lg n + d

Logarithmic Order

A pass through input reduces problem size by half Example: Merge Sort

T(n) = 2T(n/2) + cn T(1) = d

Solution: T(n) = cn lg n + d n

n log n order

Master Method for Solving Master Method for Solving RecurrencesRecurrences

Divide and Conquer Style Recurrence

T(n) = aT(n/b) + f (n) where f (n) .(nk) is the time spent in dividing and merging

a >= 1, b >= 2

1. a < bk T(n) .(nk)

2. a = bk T(n) .(nk lg n )

3. a > bk T(n) .(nlog b a)

4. Note: the same results hold with O instead of .

Example: Fibonacci NumbersExample: Fibonacci Numbers

The Fibonacci sequence:0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …

Describes the growth pattern of Rabbits. Just short of exponential, ask Australia!

Fibonacci recurrence:F(n) = F(n-1) + F(n-2)

F(0) = 0

F(1) = 1

Another example:A(n) = 3A(n-1) - 2A(n-2) A(0) = 1 A(1) = 3

2nd order linear homogeneous recurrence relation

with constant coefficients

Computing Fibonnaci NumbersComputing Fibonnaci Numbers

Algorithm Alternatives:

1. Definition based recursive algorithm

2. Nonrecursive brute-force algorithm

3. Explicit formula algorithmF(n) = 1/√5 n where is the golden ratio (1 + √5) / 2

4. Logarithmic algorithm based on formula:

FF((nn-1)-1) F F((nn))

F(n) F(n+1)

0 1

1 1=

n

• for n≥1, assuming an efficient way of computing matrix powers.

Empirical Analysis of Time EfficiencyEmpirical Analysis of Time Efficiency

Sometimes a mathematical analysis is difficult for even simple algorithms (limited applicability)

Alternative is to measure algorithm execution

PLAN:

1. Understand the experiment’s purpose Are we checking the accuracy of a theoretical result, comparing

efficiency of different algorithms or implementations, hypothesising the efficiency class?

2. Decide on an efficiency measure Use physical units of time (e.g., milliseconds)

OR

Count actual number of basic operations

Plan for Empirical AnalysisPlan for Empirical Analysis

3. Decide on Characteristics of the Input Sample Can choose Pseudorandom inputs, Patterned inputs or

a Combination Certain problems already have benchmark inputs

4. Implement the Algorithm

5. Generate a Sample Set of Inputs

6. Run the Algorithm and Record the Empirical Results

7. Analyze the Data Regression Analysis is often used to fit a function to the

scatterplot of results

Algorithm VisualisationAlgorithm Visualisation

Definition: Algorithm Visualisation seeks to convey useful algorithm

information through the use of images

Flavours: Static algorithm visualisation Algorithm animation

Some new insights: e.g., odd and even disks in Towers of Hanoi

Attributes of Information Visualisation apply - zoom and filter, detail on demand, etc.

Websource: The Complete Collection of Algorithm Animation http://www.cs.hope.edu/~alganim/ccaa/

theory of algorithms: fundamentals of the analysis of algorithm efficiency

Documents

worst case

inputs of size nbest

inputs of size naverage

algorithms efficiency

space efficiency

problem of double input

average of worst

exceptionamortized efficiency