Unit 2: Complexity Analysis
Part 1: Introduction to Complexity

Engineering 4892: Data Structures

Faculty of Engineering & Applied Science
Memorial University of Newfoundland

May 13, 2010

ENGI 4892 (MUN) Unit 2, Part 1 May 13, 2010 1 / 27


1 Introduction
2 Complexity
3 Asymptotic complexity
4 Big-O notation
5 Properties of big-O notation
6 Logarithm
7 Common Complexity Classes
8 Examples
9 Finding asymptotic complexity


Introduction

We now begin unit 2 of the course, which corresponds to chapter 2 of the textbook.

When we talk about the efficiency of an algorithm, we are usually referring to the algorithm’s efficient use of resources. Such resources include:

Memory

Communications bandwidth

Number of logic gates for hardware implementation

However, most often we are concerned with an algorithm’s efficient use of time.


The time required to execute a program depends on a number of factors:

How the algorithm was coded

Machine-specific details (clock rate, cache size, etc.)

How source code is executed (compiled or interpreted)

The value of the input

The size of the input

However, we wish to compare algorithms, not programs run on specific computers. Therefore, we will focus on the value and size of the input.


Complexity

An algorithm’s efficiency for a particular value of the input is given as a function of the input size $n$. Such a function gives the complexity of the algorithm.

For example, the time $t$ required to process an array of $n$ items might be

$$t = cn$$

where $c$ is a constant (a particular implementation of the algorithm would have a particular constant $c$). Here the complexity is a linear function of the input size $n$. Thus, if we have two input arrays with sizes $n_1$ and $n_2 = 2n_1$,

$$t_1 = cn_1$$

$$t_2 = cn_2 = 2cn_1 = 2t_1$$

The time required to process the second array is twice that of the first.


Another algorithm might require a logarithmic amount of time,

$$t = c \log_2 n$$

Now the times to process our input arrays are as follows,

$$t_1 = c \log_2 n_1$$

$$t_2 = c \log_2 n_2 = c \log_2(2n_1) = c(\log_2 2 + \log_2 n_1) = c + t_1$$

Processing the second array requires just $c$ more time units than processing the first.
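The doubling behaviour of the two cost models above can be checked numerically. The following is a minimal sketch; the function names `linearTime` and `logTime` are hypothetical helpers introduced here, and the constant `c` is an arbitrary illustrative value rather than a measured one.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical cost model t = c * n (linear in the input size).
double linearTime(double c, double n) {
    assert(n > 0);
    return c * n;
}

// Hypothetical cost model t = c * log2(n) (logarithmic in the input size).
double logTime(double c, double n) {
    assert(n > 0);
    return c * std::log2(n);
}
```

With `c = 3`, doubling the input from 1000 to 2000 doubles `linearTime`, while doubling it from 1024 to 2048 adds only `c` to `logTime`.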

Complexity is not measured in real-time units such as seconds, but in logical units. With the appropriate constants (e.g. $c$ above) we could measure complexity in seconds or microseconds. However, we are more concerned with the complexity of the algorithm itself; specific constants would tie the analysis to a particular computer or computer architecture.


Asymptotic complexity

Assume we have analyzed an algorithm and determined its complexity as a function of its input size $n$ as follows,

$$f(n) = n^2 + 100n + \log_{10} n + 1000$$

Which is the largest term?

For $n < 10$, the last term is the largest.

When $n = 10$, the second and last terms are tied.

When $n = 100$, the first and second terms are tied.

As $n$ grows beyond 100, the first term becomes more and more dominant.
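To see which term dominates at a given input size, the four terms can simply be evaluated and compared. This is a small sketch; `terms` and `dominantTerm` are hypothetical names introduced for illustration, and ties are resolved in favour of the earlier term.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// The four terms of f(n) = n^2 + 100n + log10(n) + 1000, in that order.
std::array<double, 4> terms(double n) {
    assert(n >= 1);
    std::array<double, 4> t = {{n * n, 100.0 * n, std::log10(n), 1000.0}};
    return t;
}

// 0-based index of the largest term; on a tie the earlier term wins.
std::size_t dominantTerm(double n) {
    const std::array<double, 4> t = terms(n);
    std::size_t best = 0;
    for (std::size_t i = 1; i < t.size(); ++i)
        if (t[i] > t[best]) best = i;
    return best;
}
```

For example, `dominantTerm(5)` selects the constant term (index 3), `dominantTerm(50)` selects the $100n$ term (index 1), and `dominantTerm(1000)` selects the $n^2$ term (index 0).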


The asymptotic complexity of $f(n)$ is the complexity of this function for large $n$. Thus, the asymptotic complexity of $f(n)$ is $n^2$.

The asymptotic complexity of an algorithm is a rough approximation of the algorithm’s complexity which applies only for large $n$. However, understanding what happens for large input sizes is crucial for many real-world applications.


Big-O notation

We need a precise notation for asymptotic complexity. Here it is:

Definition: Big-O Notation

Given two positive-valued functions $f$ and $g$, $f(n)$ is $O(g(n))$ if there exist positive numbers $c$ and $N$ such that $f(n) \le c\,g(n)$ for all $n \ge N$.

Essentially, if we can find two constants $c$ and $N$ such that the above inequality is true, then $f(n)$ is $O(g(n))$. This means that $c\,g(n)$ is an upper bound on $f(n)$ when $n \ge N$.


e.g. Assume we have the function $f(n) = 2n^2 + 3n + 1$ and we wish to determine if it is $O(n^2)$.

We have to find constants $c$ and $N$ such that

$$2n^2 + 3n + 1 \le cn^2 \quad \text{for all } n \ge N$$

In fact there are an infinite number of possible choices for $c$ and $N$. However, we need some justification for the values chosen. By using the following technique, particular values will emerge naturally.

Technique: Look for a function that is always greater than or equal to $f(n)$ and is easier to compare to $g(n)$. If this function is $O(g(n))$, then so is $f(n)$.

For example, $h(n) = 2n^2 + 3n^2 + n^2 \ge f(n)$ for $n \ge 1$, since $2n^2 = 2n^2$, $3n^2 \ge 3n$, and $n^2 \ge 1$. Simplifying, we have $h(n) = 6n^2$.

$$h(n) = 6n^2 \le cn^2 \quad \text{for all } n \ge N$$

This is satisfied for $c = 6$ and $N = 1$ (the smallest possible $N$).
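A candidate pair of constants can be sanity-checked by testing the inequality over a finite range of $n$. The sketch below does this; `boundHolds`, `exampleF`, and `exampleG` are hypothetical names, and a finite check like this can only refute a choice of $c$ and $N$, never prove the bound for all $n$.

```cpp
#include <cassert>
#include <functional>

// Checks f(n) <= c * g(n) for every integer n in [N, limit].
// Passing this check is evidence for the bound, not a proof of it.
bool boundHolds(const std::function<double(double)>& f,
                const std::function<double(double)>& g,
                double c, long N, long limit) {
    assert(c > 0 && N >= 1 && limit >= N);
    for (long n = N; n <= limit; ++n)
        if (f(static_cast<double>(n)) > c * g(static_cast<double>(n)))
            return false;
    return true;
}

// The example from the text: f(n) = 2n^2 + 3n + 1 and g(n) = n^2.
double exampleF(double n) { return 2 * n * n + 3 * n + 1; }
double exampleG(double n) { return n * n; }
```

The check passes for $c = 6$, $N = 1$. It also passes for $c = 3$, $N = 4$ but fails for $c = 3$, $N = 1$, which illustrates that many different pairs of constants work and that a smaller $c$ may require a larger $N$.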


Note: $O(g(n))$ describes a set of functions, all of which are bounded above by $c\,g(n)$ for some constant $c$ and large $n$. Thus, it is appropriate to write the following,

$$2n^2 + 3n + 1 \in O(n^2)$$

It is also common practice to write,

$$2n^2 + 3n + 1 = O(n^2)$$

This is really an abuse of notation: a function cannot equal a set. However, this notation is common and useful, so we will employ it.


e.g. Is $mn + b = O(n^2)$, where $m$ and $b$ are constants?

We need to find positive constants $c$ and $N$ such that

$$mn + b \le cn^2$$

for all $n \ge N$. We could argue as follows: since $n^2 \ge n$ and $n^2 \ge 1$ for $n \ge 1$ (taking $m$ and $b$ to be positive),

$$mn + b \le mn^2 + bn^2$$

If we can find constants such that the following is true,

$$mn^2 + bn^2 \le cn^2$$

then our first inequality would be satisfied by the same constants (by transitivity). This inequality is clearly satisfied by $c = m + b$, $N = 1$.


The previous example showed that a general linear function of $n$ is $O(n^2)$. In fact, we can always find an infinite number of functions $g$ such that $f(n) = O(g(n))$. For example,

$$\begin{aligned} mn + b &= O(n) \\ &= O(n^2) \\ &= O(n^3) \\ &= \cdots \end{aligned}$$

You should always choose the smallest and simplest. Thus, while it is true that $mn + b = O(n^2)$, it is better to say that $mn + b = O(n)$.


e.g. Is $3n^2 + 4n = O(n)$?

We need to find positive constants $c$ and $N$ such that

$$3n^2 + 4n \le cn$$

for all $n \ge N$. Divide both sides by $n$,

$$3n + 4 \le c$$

The left side grows without bound as $n$ increases, so no constant $c$ can satisfy this inequality for all $n \ge N$. Thus, $3n^2 + 4n \ne O(n)$.
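The failure of every candidate constant can be made concrete: for any proposed $c$, there is a first $n$ at which $3n + 4$ exceeds it. The following sketch finds that $n$ by direct search; `firstViolation` is a hypothetical helper name introduced here.

```cpp
#include <cassert>

// Smallest positive integer n with 3n + 4 > c, i.e. the first n at which
// the inequality 3n^2 + 4n <= c * n fails.
long long firstViolation(long long c) {
    assert(c > 0);
    long long n = 1;
    while (3 * n + 4 <= c) ++n;
    return n;
}
```

For instance, `firstViolation(10)` is 3 and `firstViolation(100)` is 33. Choosing a larger $c$ only postpones the violation; it never removes it.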


Properties of big-O notation

The following facts are given verbatim from the textbook:

Fact 1

If $f(n)$ is $O(g(n))$ and $g(n)$ is $O(h(n))$, then $f(n)$ is $O(h(n))$.

Proof: COVERED ON BOARD
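One standard way to argue Fact 1 directly from the definition is sketched below. This is not necessarily the argument presented on the board; the constants $c_1, c_2, N_1, N_2$ are simply the ones the definition guarantees for the two hypotheses.

```latex
\begin{align*}
f(n) &\le c_1\, g(n) && \text{for all } n \ge N_1 \\
g(n) &\le c_2\, h(n) && \text{for all } n \ge N_2 \\
f(n) &\le c_1\, g(n) \le c_1 c_2\, h(n) && \text{for all } n \ge \max(N_1, N_2)
\end{align*}
```

So the definition is satisfied with $c = c_1 c_2$ and $N = \max(N_1, N_2)$.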

Fact 2

If $f(n)$ is $O(h(n))$ and $g(n)$ is $O(h(n))$, then $f(n) + g(n)$ is $O(h(n))$.

Proof: $f(n) \le c_1 h(n)$ for $n \ge N_1$ and $g(n) \le c_2 h(n)$ for $n \ge N_2$.

Therefore $f(n) + g(n) \le (c_1 + c_2)h(n)$ for $n \ge \max(N_1, N_2)$.

Note: This extends to a summation of a constant number $k$ of terms where each term $f_i(n)$ is $O(h(n))$.


Fact 3

The function $an^k$ is $O(n^k)$.

Fact 4

The function $n^k$ is $O(n^{k+j})$ for any positive $j$.

Observation: If we have a polynomial,

$$f(n) = a_k n^k + a_{k-1} n^{k-1} + \cdots + a_1 n + a_0$$

then $f(n) = O(n^k)$. That is, the complexity of the largest term determines the complexity of the whole polynomial. What facts do we have to invoke to prove this?

Fact 5

If $f(n) = c\,g(n)$, then $f(n)$ is $O(g(n))$.


Logarithm

As illustrated earlier, logarithmic functions grow slowly with increasing $n$. Algorithms with logarithmic complexity are generally considered quite fast.

Definition: $\log_b x = y$ if and only if $x = b^y$.

Informally, $\log_b x$ gives the exponent to which $b$ must be raised to get $x$.

Fact 6

The function $\log_a n$ is $O(\log_b n)$ for any positive numbers $a \ne 1$ and $b \ne 1$.

In other words, all logarithmic functions are big-O of each other.

Proof: COVERED ON BOARD

Note: We use $\lg x$ to refer to $\log_2 x$. In big-O notation, $\lg$ is also used to refer to any logarithmic function.
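Fact 6 follows from the change-of-base identity. The sketch below is one common argument, not necessarily the one given on the board, and it assumes $a > 1$ and $b > 1$ so that both logarithms are non-negative for $n \ge 1$ and the constant $c$ is positive.

```latex
\begin{align*}
\log_a n &= \frac{\log_b n}{\log_b a} = c \log_b n, \qquad c = \frac{1}{\log_b a} > 0 \\
\log_a n &\le c \log_b n \quad \text{for all } n \ge 1
\end{align*}
```

Since the two logarithms differ only by the constant factor $c$, the base can be dropped inside big-O.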


Common Complexity Classes

The most common and important classes of algorithm complexity are shown below in order of increasing size:

$O(1)$

$O(\lg n)$

$O(n)$

$O(n \lg n)$

$O(n^2)$

$O(n^3)$

$O(2^n)$

A function $f(n)$ is $O(g(n))$ if $f(n)$ appears at the same position as or before $g(n)$ in the list given above.

e.g. $n = O(n^3)$, although it is better to say that $n = O(n)$
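To get a feel for how quickly these classes separate, the representative function of each class can be evaluated at a chosen $n$. This is a small sketch; `growth` is a hypothetical helper, and the values are unitless operation counts rather than running times.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Values of 1, lg n, n, n lg n, n^2, n^3, 2^n for a given n,
// in the same order as the list of complexity classes.
std::array<double, 7> growth(double n) {
    assert(n >= 1);
    const double lg = std::log2(n);
    std::array<double, 7> v = {{1.0, lg, n, n * lg, n * n, n * n * n,
                                std::pow(2.0, n)}};
    return v;
}
```

At $n = 16$ the values are 1, 4, 16, 64, 256, 4096, and 65536. The ordering is asymptotic: for very small inputs it need not hold (at $n = 2$, for example, $n^3 = 8$ exceeds $2^n = 4$).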


Roughly speaking, algorithms with polynomial complexity are those which may be practical for larger values of $n$.

Polynomial complexity:

$O(1)$, $O(\lg n)$, $O(n)$, $O(n \lg n)$, $O(n^2)$, $O(n^3)$

Still, there are cases where polynomial algorithms take days or years to execute (see the $10^6$ column in figure 2.4).

Exponential algorithms are useful only if the input size happens to be quite small.


Examples

Determine the complexity of the following functions. Justify the steps taken in terms of facts about big-O and the known common complexity classes:

e.g. $f(n) = 17 \cdot 7^n$

$f(n) = O(7^n)$ (fact 5: constants don’t matter)

e.g. $f(n) = 3n^2 + \lg(2^n)$

First simplify: $f(n) = 3n^2 + n \lg 2$

$f(n) = O(n^2)$ (complexity of polynomial)

e.g. $f(n) = 3n^2 + 2^n$

$3n^2 = O(2^n)$ (fact 5, plus $n^2 = O(2^n)$ and fact 1)

$2^n = O(2^n)$ (fact 5)

$f(n) = O(2^n)$ (fact 2)


Finding asymptotic complexity

To determine an algorithm’s complexity we should count the number of operations executed. What operations should we count?

The number of C++ statements? Some statements take longer to execute than others.

The number of machine code instructions? This is awkward because it is dependent upon the computer’s architecture.

In this part of the course we will count the number of assignment statements. Why? Because for most algorithms the number of assignment statements will be proportional to the total number of operations (whether C++ statements or machine code instructions).

Counting assignment statements is not a perfect strategy, but it is simple and reasonable.

We will count the number of assignment statements in the following code snippets and thereby determine the complexity of these algorithms...


    for (i = sum = 0; i < n; i++)
        sum += a[i];

There are two assignment statements in the initialization section.

Consider the body of the loop. For each execution of the loop there are two assignment statements executed:

sum += a[i] and i++

How many times does this loop execute? Reason it out in your own way or use the formula:

loops = last index - first index + 1

Here last index = $n - 1$ and first index = 0, and therefore loops = $n$. The loop executes $n$ times.

Thus, the total number of assignments is $2 + 2n$. The asymptotic complexity of this code snippet is $O(n)$.
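The count of $2 + 2n$ can be confirmed empirically by instrumenting the loop with a counter. The sketch below wraps the snippet in a hypothetical function `countAssignmentsSimpleSum`; the counter itself is bookkeeping and its own updates are deliberately not counted.

```cpp
#include <cassert>
#include <vector>

// Runs the summation loop and returns the number of assignments executed.
// `count` is bookkeeping only; updates to it are not themselves counted.
long countAssignmentsSimpleSum(const std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    long count = 0;
    int i, sum;
    i = sum = 0; count += 2;          // two assignments in the initialization
    for (; i < n; i++, count++) {     // count++ here tallies the i++ assignment
        sum += a[i]; count++;         // one assignment in the loop body
    }
    (void)sum;                        // the sum itself is not needed here
    assert(count == 2 + 2L * n);      // matches the formula derived in the text
    return count;
}
```

For an array of 5 elements the function returns 12, and for an empty array it returns 2.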


    for (i = 0; i < n; i++) {
        for (j = 1, sum = a[0]; j <= i; j++)
            sum += a[j];
        cout << "sum of 0 through " << i << " is " << sum << endl;
    }

The first assignment statement is i = 0. For each execution of the outer loop there are three single assignments (outside of the inner loop): j = 1, sum = a[0], and i++.

The outer loop is executed $n$ times. Thus, the number of assignment statements is $1 + 3n + X$, where $X$ is the number of assignment statements executed by the inner loop.

There are two assignment statements within the body of the inner loop:

sum += a[j] and j++

For each iteration of the outer loop, the inner loop body is executed $i$ times, so each run of the inner loop executes $2i$ assignment statements. $i$ goes from 0 to $n - 1$.

The number of assignment statements executed is:

$$1 + 3n + \sum_{i=0}^{n-1} 2i$$


How do we evaluate this expression?

$$f(n) = 1 + 3n + \sum_{i=0}^{n-1} 2i$$

Sum of the first $n$ natural numbers:

$$\sum_{i=1}^{n} i = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}$$

Note first that

$$\sum_{i=0}^{n-1} 2i = 2\sum_{i=0}^{n-1} i = 2\sum_{i=1}^{n-1} i$$

Utilizing the formula above (with $n - 1$ in place of $n$) we obtain

$$f(n) = 1 + 3n + (n-1)n = n^2 + 2n + 1 = O(n^2)$$
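The closed form $n^2 + 2n + 1$ can be cross-checked by counting assignments while the nested loop runs. In this sketch the function name `countAssignmentsPrefixSums` is hypothetical, and the output statement from the snippet is left out because it performs no assignment.

```cpp
#include <cassert>
#include <vector>

// Runs the nested prefix-sum loop and returns the number of assignments.
// `count` is bookkeeping only; updates to it are not themselves counted.
long countAssignmentsPrefixSums(const std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    long count = 0;
    int i, j, sum;
    i = 0; count++;                        // one-time initialization
    for (; i < n; i++, count++) {          // count++ tallies i++
        j = 1; sum = a[0]; count += 2;     // inner-loop initialization
        for (; j <= i; j++, count++) {     // count++ tallies j++
            sum += a[j]; count++;
        }
        (void)sum;                         // printing omitted: no assignment
    }
    // n^2 + 2n + 1 = (n + 1)^2, the closed form derived in the text
    assert(count == static_cast<long>(n + 1) * (n + 1));
    return count;
}
```

For $n = 4$ this returns 25, which agrees with $4^2 + 2 \cdot 4 + 1$.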


    for (i = 4; i < n; i++) {
        for (j = i - 3, sum = a[i - 4]; j <= i; j++)
            sum += a[j];
        cout << "sum of " << i - 4 << " through " << i << " is " << sum << endl;
    }

The outer loop executes $n - 4$ times (assuming $n \ge 4$). For each iteration of the outer loop there are 3 assignment statements executed outside of the inner loop. If the number of assignment statements executed in the inner loop is $X$, the total is $1 + 3(n - 4) + X$.

The body of the inner loop always executes 4 times. There are 2 assignments within this loop. Thus, each time the outer loop executes, there will be 8 assignments executed in the inner loop. So $X = 8(n - 4)$.

Counting the one-time initialization i = 4, the complexity is

$$1 + (n - 4)(3 + 8) = 11n - 43 = O(n)$$
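The same instrumentation idea applies to the fixed-width inner loop. The sketch below uses a hypothetical function `countAssignmentsWindowSums`; the formula $11n - 43$ is only asserted for $n \ge 4$, since for smaller arrays the outer loop never runs and only the initialization is counted.

```cpp
#include <cassert>
#include <vector>

// Runs the sliding-window loop and returns the number of assignments.
// `count` is bookkeeping only; updates to it are not themselves counted.
long countAssignmentsWindowSums(const std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    long count = 0;
    int i, j, sum;
    i = 4; count++;                               // one-time initialization
    for (; i < n; i++, count++) {                 // count++ tallies i++
        j = i - 3; sum = a[i - 4]; count += 2;    // inner-loop initialization
        for (; j <= i; j++, count++) {            // always runs 4 times
            sum += a[j]; count++;
        }
        (void)sum;                                // printing omitted
    }
    assert(n < 4 || count == 11L * n - 43);       // formula from the text
    return count;
}
```

For $n = 10$ the function returns 67, and the count grows by exactly 11 for each additional element, which is the linear behaviour the analysis predicts.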

Note: Sometimes a doubly-nested loop is $O(n^2)$, but sometimes it’s not!