
Page 1: Time Complexity

Time Complexity

Foundations of Computer Science Theory

Page 2: Time Complexity

Developing an Algorithm

1. Model the problem
2. Find or create an algorithm to solve it
3. Determine if the algorithm meets performance specifications
   • Is the algorithm correct?
   • Is it fast enough?
   • Does it use too much memory?
4. If performance is not met, figure out why
5. Address the performance issues

Page 3: Time Complexity

Algorithm Analysis

• How long does a program take to run?
• Could use any one (or all) of these techniques:

1. Run the program, time it, make a prediction (a hypothesis), and verify or falsify the prediction
2. Determine the number of operations performed, and the cost/frequency of each operation
3. Compare the algorithm to some other well-known algorithm using big-O notation

Page 4: Time Complexity

Measuring Runtime

• How long does a program take to run on any input?
1. Run the program with various input sizes and record the running time:

N        Time (seconds)
250      0.000
500      0.000
1,000    0.100
2,000    0.828
4,000    6.539
8,000    51.107
16,000   ?

Page 5: Time Complexity

Measuring Runtime

2. Plot running time T(N) vs. input size N using a log-log scale
   – Use regression analysis to fit a straight line through the data points, and use the power law to guess the complexity (c · N^b)
3. Generate a hypothesis: the running time is about 1 × 10^-10 × N^3 seconds
4. Using the hypothesis, make a prediction: when N = 16,000 the runtime should be approximately 408 seconds
5. Make an observation:

N        Time (seconds)
8,000    51.1
8,000    51.0
8,000    51.1
16,000   401.9

The hypothesis is validated.
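The regression step above can be sketched in code. This is a hedged Python illustration (not from the slides): it fits a straight line to the log-log points from the timing table and recovers the exponent b and constant c of the power law c · N^b.

```python
import math

# Timing data from the table above (N, seconds); zero timings are
# dropped because log(0) is undefined on a log-log plot.
data = [(250, 0.000), (500, 0.000), (1000, 0.100),
        (2000, 0.828), (4000, 6.539), (8000, 51.107)]
pts = [(math.log(n), math.log(t)) for n, t in data if t > 0]

# Least-squares fit on the log-log points: the slope is the exponent b,
# and exp(intercept) is the constant c in c * N^b.
xbar = sum(x for x, _ in pts) / len(pts)
ybar = sum(y for _, y in pts) / len(pts)
b = sum((x - xbar) * (y - ybar) for x, y in pts) / \
    sum((x - xbar) ** 2 for x, _ in pts)
c = math.exp(ybar - b * xbar)

print(round(b, 1))        # slope close to 3 -> cubic growth
print(c * 16000 ** b)     # predicted runtime for N = 16,000, roughly 400 s
```

The fitted slope lands very close to 3, matching the N^3 hypothesis on the slide.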

Page 6: Time Complexity

Measuring Runtime

QUESTION: Suppose that you make the following observations of the running time T(N) (in seconds) of a program as a function of the input size N. What is the running time of the algorithm?

N        T(N)
1,000    0.0
2,000    0.0
4,000    0.1
8,000    0.3
16,000   1.3
32,000   5.1
64,000   20.6

Assume T(N) = c · N^b for some constants c and b. If we double the input size N, the running time approximately quadruples, indicating a quadratic algorithm, i.e., b = 2. T(N) = 20.6 for N = 64,000. Solving for c, we obtain c = 20.6/64,000^2 ≈ 5.0 × 10^-9. Thus, the running time hypothesis is 5.0 × 10^-9 × N^2.
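The doubling argument in the answer can be checked mechanically. A hedged Python sketch (not from the slides): if T(N) = c · N^b, then T(2N)/T(N) = 2^b, so the exponent is the base-2 log of the ratio between successive runtimes.

```python
import math

# Observations from the question: (N, T(N)) in seconds.
obs = [(1000, 0.0), (2000, 0.0), (4000, 0.1),
       (8000, 0.3), (16000, 1.3), (32000, 5.1), (64000, 20.6)]

# Doubling ratios T(2N)/T(N); pairs with T(N) = 0 are skipped.
ratios = [t2 / t1 for (_, t1), (_, t2) in zip(obs, obs[1:]) if t1 > 0]
b = sum(math.log2(r) for r in ratios) / len(ratios)

# Solve for c using the largest observation, T(64,000) = 20.6.
c = 20.6 / 64000 ** round(b)
print(round(b))   # 2 -> quadratic
print(c)          # about 5.0e-9
```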

Page 7: Time Complexity

Analyzing Cost of Operations

• The total running time of an algorithm is the sum over all operations of the cost of each operation times the frequency of that operation
• How to determine the cost of an operation?
  – Does the cost depend on the system the program is running on, or does it depend on the size of the input?
• System-dependent effects are constant
  – Hardware: CPU, memory, cache, …
  – Software: compiler, memory, garbage collector, …
  – System: operating system, network, apps, …
• System-independent effects depend on the input size
  – The algorithm and the input data

Page 8: Time Complexity

General Costs of Operations

Operation              Example                Time
Variable declaration   int a                  c1
Assignment statement   a = b                  c2
Integer compare        a < b                  c3
Array element access   a[i]                   c4
Array length           a.length               c5
1D array allocation    new int[N]             c6 · N
2D array allocation    new int[N][N]          c7 · N^2
String length          s.length()             c8
Substring extraction   s.substring(N/2, N)    c9 · N
String concatenation   s + t                  c10 · N

Page 9: Time Complexity

One-Sum Problem

• Given an array of N integers, how many of them are 0?

int count = 0;
for (int i = 0; i < N; i++)
    if (a[i] == 0)
        count++;

Operation              Frequency
Variable declaration   2
Assignment statement   2
Less-than compare      N + 1
Equal-to compare       N
Array access           N
Increment              N to 2N

Page 10: Time Complexity

Two-Sum Problem

• Given an array of N integers (negative, positive, or 0), how many pairs of them sum to exactly 0?

int count = 0;
for (int i = 0; i < N; i++)
    for (int j = i+1; j < N; j++)
        if (a[i] + a[j] == 0)
            count++;

Operation              Frequency
Variable declaration   N + 2
Assignment statement   N + 2
Less-than compare      ½ (N+1)(N+2)
Equal-to compare       ½ N (N-1)
Array access           N (N-1)
Increment              ½ N (N+1) to N^2
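As a sanity check on the frequency table above, here is a hedged Python sketch (not from the slides) of the brute-force Two-Sum count with an instrumented counter for array accesses, which the table says occur N(N-1) times.

```python
def two_sum_count(a):
    """Count pairs summing to 0, and count element accesses."""
    accesses = 0
    count = 0
    n = len(a)
    for i in range(n):
        for j in range(i + 1, n):
            accesses += 2          # a[i] and a[j] are each read once
            if a[i] + a[j] == 0:
                count += 1
    return count, accesses

# The 8-element example used on the Three-Sum slide:
count, accesses = two_sum_count([30, -40, -20, -10, 40, 0, 10, 5])
print(count)      # 2 pairs sum to zero: (-40, 40) and (-10, 10)
print(accesses)   # N = 8 -> N(N-1) = 56 array accesses
```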

Page 11: Time Complexity

Simplification of Two-Sum Problem

• To estimate the running time (or memory) as a function of input size N, we can ignore the lower-order terms
  – When N is large, the lower-order terms are negligible
  – When N is small, we don’t care

Operation              Frequency                Tilde Notation
Variable declaration   N + 2                    ~N
Assignment statement   N + 2                    ~N
Less-than compare      ½ (N+1)(N+2)             ~½ N^2
Equal-to compare       ½ N (N-1)                ~½ N^2
Array access           N (N-1)                  ~N^2
Increment              ½ N (N-1) to N (N-1)     ~½ N^2 to ~N^2

Page 12: Time Complexity

Three-Sum Problem

• Given an array of N integers, how many triples of them sum to exactly zero?
  – For example, given the list of integers (30, -40, -20, -10, 40, 0, 10, 5), there are four triples that sum to zero: (30, -40, 10), (30, -20, -10), (-40, 40, 0), (-10, 0, 10)

int count = 0;
for (int i = 0; i < N; i++)
    for (int j = i+1; j < N; j++)
        for (int k = j+1; k < N; k++)
            if (a[i] + a[j] + a[k] == 0)
                count++;

Operation          Frequency             Tilde Notation
Equal-to compare   (1/6) N (N-1)(N-2)    ~(1/6) N^3
Array access       ½ N (N-1)(N-2)        ~½ N^3

Page 13: Time Complexity

Faster Algorithm for 3-Sum

• Sorting-based algorithm:
  – Step 1: Sort the N (distinct) numbers
  – Step 2: For each pair of numbers a[i] and a[j], calculate a[i] + a[j]
  – Step 3: Check whether the negated sum is one of the input numbers using binary search

• Order of growth is N^2 + N^2 log N = O(N^2 log N)
  – Step 1: N^2 with insertion sort (don’t need a fast sort)
  – Step 2: N^2 to check each pair of numbers
  – Step 3: log N for each binary search
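The three steps above can be sketched in Python. This is a hedged illustration, not the slides' code: Python's built-in sort stands in for insertion sort, and `bisect_left` performs the binary search; distinct input values are assumed, as on the slide.

```python
from bisect import bisect_left

def three_sum_count(a):
    """Count zero-sum triples via sort + binary search (distinct values)."""
    a = sorted(a)                        # step 1: sort
    n, count = len(a), 0
    for i in range(n):                   # step 2: every pair a[i], a[j]
        for j in range(i + 1, n):
            target = -(a[i] + a[j])      # step 3: binary-search for the
            k = bisect_left(a, target)   # negated pair sum
            # require k > j so each triple is counted exactly once
            if k < n and a[k] == target and k > j:
                count += 1
    return count

print(three_sum_count([30, -40, -20, -10, 40, 0, 10, 5]))  # 4 triples
```

The `k > j` check plays the role of the i < j < k ordering in the brute-force triple loop, so each triple is found once rather than six times.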

Page 14: Time Complexity

Faster Algorithm for 3-Sum

• The sorting-based algorithm for 3-Sum is significantly faster than the brute-force (naïve) algorithm

Brute-force (N^3):

N        Time (seconds)
1,000    0.1
2,000    0.8
4,000    6.4
8,000    51.1

Sorting-based (N^2 log N):

N        Time (seconds)
1,000    0.14
2,000    0.18
4,000    0.34
8,000    0.96
16,000   3.67
32,000   14.88
64,000   59.16

Page 15: Time Complexity

Even Faster Algorithm for 3-Sum

Quadratic-time algorithm for 3-Sum:

sort(S);
for i = 0 to N-3
    a = S[i]; k = i+1; j = N-1;
    while (k < j)
        b = S[k]; c = S[j];
        if (a + b + c == 0)
            output a, b, c; exit;
        else if (a + b + c > 0)
            j = j - 1;
        else
            k = k + 1;
        end
    end
end
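The same two-pointer idea in runnable form: a hedged Python sketch that, unlike the pseudocode above (which exits at the first hit), keeps scanning so it collects every zero-sum triple. Distinct input values are assumed, as on the earlier slides.

```python
def three_sum(s):
    """Collect all zero-sum triples in O(N^2) via sort + two pointers."""
    s = sorted(s)
    n, triples = len(s), []
    for i in range(n - 2):
        a, k, j = s[i], i + 1, n - 1
        while k < j:
            total = a + s[k] + s[j]
            if total == 0:
                triples.append((a, s[k], s[j]))
                k, j = k + 1, j - 1   # move both pointers past this pair
            elif total > 0:
                j -= 1                # sum too big: shrink from the right
            else:
                k += 1                # sum too small: advance from the left
    return triples

print(len(three_sum([30, -40, -20, -10, 40, 0, 10, 5])))  # 4
```

Each outer iteration makes one linear sweep of the two pointers, so the total work after sorting is ~N^2 with no log factor.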

Page 16: Time Complexity

Analyzing Cost of Insertion Sort

Page 17: Time Complexity

Analyzing Cost of Insertion Sort

• To compute the running time of insertion sort, sum the products of the cost and times columns
• The best case occurs when the array is already sorted; then t_j = 1, and the c6 and c7 terms go away

Page 18: Time Complexity

Analyzing Cost of Insertion Sort

• If the array is in reverse sorted order (that is, in decreasing order), the worst case results, and we must include the c6 and c7 terms
• In this case, each jth element must be compared with every element of the sorted subarray A[1 .. j-1], and so t_j = j for j = 2, 3, …, n

Page 19: Time Complexity

Quadratic Time Complexity

• For most modern applications, quadratic time complexity is far too slow
  – A typical application may perform billions of operations per second and may hold billions of entries in main memory
• Suppose that N equals 1 million. Approximately how much faster is an algorithm that performs N log N operations than one that performs N^2 operations?
  – N^2 / (N log N) = N / log N = 1,000,000 / log(1,000,000). Since 2^20 is approximately 1 million, this is about 1,000,000 / 20 = 50,000. Thus, N log N is approximately 50,000 times faster than N^2.
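The back-of-the-envelope ratio above can be computed exactly, a hedged Python one-liner:

```python
import math

# Exact speedup of N log N over N^2 at N = 1 million.
N = 1_000_000
speedup = N ** 2 / (N * math.log2(N))
print(round(speedup))   # about 50,000, since log2(10^6) is about 20
```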

Page 20: Time Complexity

Order of Growth

[Plot: running time (y-axis) vs. input size n (x-axis) for common growth rates]

• Using order of growth is an easier way to estimate the runtime efficiency of an algorithm than running code or counting operations
• The rate of growth of a function is how fast the function increases as the input size increases
• A fast-growing function means a slow algorithm

Page 21: Time Complexity

Order of Growth

Order of Growth   Name           Description         Example
1                 Constant       Statement           Add two numbers
log N             Logarithmic    Divide in half      Binary search
N                 Linear         Single loop         Find maximum
N log N           Linearithmic   Divide and conquer  Mergesort
N^2               Quadratic      Double loop         Check all pairs
N^3               Cubic          Triple loop         Check all triples
2^N               Exponential    Exhaustive search   Check all subsets

Page 22: Time Complexity

Order of Growth

Problem size solvable in minutes:

Order of Growth   1970s                   1980s              1990s                  2000s
1                 any                     any                any                    any
log N             any                     any                any                    any
N                 millions                tens of millions   hundreds of millions   billions
N log N           hundreds of thousands   millions           millions               hundreds of millions
N^2               100+                    1,000              1,000+                 tens of thousands
N^3               100                     100+               1,000                  1,000+
2^N               20                      20+                20+                    30

Page 23: Time Complexity

Input Analysis

• Best case: lower bound on cost
  – Determined by “easiest” input
  – Provides a goal for all inputs
• Worst case: upper bound on cost
  – Determined by “most difficult” input
  – Provides a guarantee for all inputs
• Average case: expected cost for random input
  – Need a model for “random” input
  – Provides a way to predict performance

Page 24: Time Complexity

Asymptotic Analysis

Big-O: f(n) = O(g(n))
f(n) is a function that grows no faster than the function g(n), and maybe slower

Big-Omega: f(n) = Ω(g(n))
f(n) is a function that grows at least as fast as the function g(n), and maybe faster

Big-Theta: f(n) = Θ(g(n))
f(n) is a function that grows at the same rate as the function g(n), within constant factors

[Figure: Big-O, Big-Theta, and Big-Omega bounds]

Page 25: Time Complexity

Big-O Definition

• Definition of big-O (memorize): f(n) is O(g(n)) if there exist positive constants c and n0 such that f(n) ≤ c · g(n) for all n ≥ n0
• Interpretation: an algorithm is O(g(n)) if the rate of increase of its running time grows no faster than (is bounded by) a constant times a function of the form g(n)

Page 26: Time Complexity

Big-O Definition

• QUESTION: Which of the following functions is O(n^3)?

a. 5000
b. 11n + 15 log n + 100
c. (1/3) n^2
d. 25,000 n^3
e. All of the above

Page 27: Time Complexity

Proving Runtime Bounds

Prove that 2n = O(n^2):

Start with the definition of big-O: f(n) is O(g(n)) if there exist positive constants c and n0 such that f(n) ≤ c · g(n) for all n ≥ n0.

Let f(n) = 2n and let g(n) = n^2. Goal: find c > 0 and n0 > 0 such that 2n ≤ c · n^2 for all n ≥ n0. There may be many choices; you only need to find any pair that satisfies the definition.

Choose c = 1 and n0 = 2 (2n ≤ n^2 for all n ≥ 2). Since these choices of c and n0 satisfy the definition, the proof is complete.

Page 28: Time Complexity

Proving Runtime Bounds

Prove that 3n = O(n^2):

f(n) is O(g(n)) if there exist positive constants c and n0 such that f(n) ≤ c · g(n) for all n ≥ n0. Let f(n) = 3n and let g(n) = n^2; we need 3n ≤ c · n^2.

Choose c = 1 and n0 = 3 (3n ≤ n^2 for all n ≥ 3). Proof complete.

Notice that choosing any c ≥ 1 with n0 = 3 would also work, but choosing c = 1 and n0 = 2 would not (because 3 · 2 > 2^2). Since there exists at least one choice for both c and n0, the proof is valid. Moral: make sure you make correct choices!

Page 29: Time Complexity

Proving Runtime Bounds

Prove that n^2 ≠ O(n):

f(n) is O(g(n)) if there exist positive constants c and n0 such that f(n) ≤ c · g(n) for all n ≥ n0. Let f(n) = n^2 and let g(n) = n.

Proof by contradiction: suppose c and n0 existed such that n^2 ≤ c · n for all n ≥ n0. Dividing both sides by n gives n ≤ c. But c is a fixed constant, so no value of n0 can work, because n can be arbitrarily large and the inequality must hold for all n ≥ n0. Therefore n^2 ≠ O(n).

Page 30: Time Complexity

Proving Runtime Bounds

Prove that 6n^3 ≠ O(n^2):

f(n) is O(g(n)) if there exist positive constants c and n0 such that f(n) ≤ c · g(n) for all n ≥ n0. Let f(n) = 6n^3 and let g(n) = n^2.

Proof by contradiction: suppose c and n0 existed such that 6n^3 ≤ c · n^2 for all n ≥ n0. Dividing both sides by n^2 gives 6n ≤ c, i.e., n ≤ c/6. This cannot hold for arbitrarily large n because c is a constant. Therefore 6n^3 ≠ O(n^2).

Page 31: Time Complexity

Proving Runtime Bounds

Prove that n^2/2 - 3n = Θ(n^2):

f(n) is Θ(g(n)) if there exist positive constants c1, c2, and n0 such that c1 · g(n) ≤ f(n) ≤ c2 · g(n) for all n ≥ n0. Let f(n) = n^2/2 - 3n and let g(n) = n^2:

c1 · n^2 ≤ n^2/2 - 3n ≤ c2 · n^2    (divide by n^2 to simplify)
c1 ≤ 1/2 - 3/n ≤ c2                 (treat each side separately)

First, the right-hand side: 1/2 - 3/n ≤ c2. Choose c2 = 1/2 and n0 = 1.
Next, the left-hand side: c1 ≤ 1/2 - 3/n. Choose c1 = 1/14 and n0 = 7, since 1/2 - 3/7 = 1/14.
Since you can only have one n0, choose n0 = 7 (because 7 > 1).

Page 32: Time Complexity

Proving Runtime Bounds

Prove or disprove n^n = O(n^(n+1)):

f(n) is O(g(n)) if there exist positive constants c and n0 such that f(n) ≤ c · g(n) for all n ≥ n0. Let f(n) = n^n and let g(n) = n^(n+1):

n^n ≤ c · n^(n+1)
n^n ≤ c · n^n · n    (divide by n^n to simplify)
1 ≤ c · n
1/n ≤ c

Choose c = 1 and n0 = 1. Notice that as n gets larger, 1/n only gets smaller. Since you can find c and n0 that satisfy the definition, n^n = O(n^(n+1)).

Page 33: Time Complexity

Proving Runtime Bounds

Prove or disprove n^2 + n ln n = Θ(n^2 + n log n):

f(n) is Θ(g(n)) if there exist positive constants c1, c2, and n0 such that c1 · g(n) ≤ f(n) ≤ c2 · g(n) for all n ≥ n0. Let f(n) = n^2 + n ln n and let g(n) = n^2 + n log n:

c1 (n^2 + n log n) ≤ n^2 + n ln n ≤ c2 (n^2 + n log n)    (divide by n)
c1 (n + log n) ≤ n + ln n ≤ c2 (n + log n)

Notice that log_x y = log_e y / log_e x, so log_2 n = log_e n / log_e 2 = ln n / ln 2. Also, ln 2 ≈ 0.693 and ln n = log n · ln 2. Choose c1 = ln 2, c2 = 2, and n0 = 1.

continued ...

Page 34: Time Complexity

Proving Runtime Bounds

Continued from the previous slide:

c1 (n + log n) ≤ n + ln n ≤ c2 (n + log n)

Choose c1 = ln 2 and c2 = 2:

ln 2 · (n + log n) ≤ n + ln n ≤ 2 (n + log n)
n ln 2 + ln 2 · log n ≤ n + ln n ≤ 2n + 2 ln n / ln 2
n ln 2 + ln n ≤ n + ln n ≤ 2n + 2 ln n / ln 2

Remember that ln 2 < 1, so n ln 2 < n, and 1/ln 2 > 1, so 2 ln n / ln 2 > ln n. Since the inequalities hold for all n ≥ 1, n^2 + n ln n = Θ(n^2 + n log n).

Page 35: Time Complexity

Proving Runtime Bounds

Prove or disprove n! = Θ(2^n):

f(n) is Θ(g(n)) if there exist positive constants c1, c2, and n0 such that c1 · g(n) ≤ f(n) ≤ c2 · g(n) for all n ≥ n0. Let f(n) = n! and let g(n) = 2^n:

c1 · 2^n ≤ n! ≤ c2 · 2^n

n! ≤ c2 · 2^n    (with c2 = 1, the right-hand side holds only for n ≤ 3; since n!/2^n grows without bound, no constant c2 can work for all large n)
c1 · 2^n ≤ n!    (with c1 = 1, the left-hand side holds only for n ≥ 4)

Since no choice of n0 makes both halves hold for all n ≥ n0, n! ≠ Θ(2^n). However, notice that 2^n = O(n!): choose c = 1 and n0 = 4.

n    2^n    n!
0    1      1
1    2      1
2    4      2
3    8      6
4    16     24
5    32     120

Page 36: Time Complexity

Proving Runtime Bounds

Prove or disprove n! = Θ(e^n):

f(n) is Θ(g(n)) if there exist positive constants c1, c2, and n0 such that c1 · g(n) ≤ f(n) ≤ c2 · g(n) for all n ≥ n0. Let f(n) = n! and let g(n) = e^n.

The upper bound requires n! ≤ c2 · e^n. Since c2 is a constant, this cannot hold for arbitrarily large n (n!/e^n grows without bound); therefore n! ≠ Θ(e^n). However, notice that e^n = O(n!): choose c = 1 and n0 = 6.

n    e^n     n!
0    1       1
1    2.718   1
2    7.39    2
3    20      6
4    55      24
5    148     120
6    403     720
7    1096    5040
8    2981    40,320
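The choice n0 = 6 above can be checked numerically, a hedged Python sketch:

```python
import math

# e^n <= n! fails at n = 5 (148.4 > 120) but holds from n = 6 onward.
print(math.e ** 5 <= math.factorial(5))                              # False
print(all(math.e ** n <= math.factorial(n) for n in range(6, 50)))   # True
```

Once n exceeds e, each further factor in n! beats the fixed factor e in e^n, so the inequality, once established at n = 6, holds for all larger n.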