Post on 30-May-2020

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Extended Introduction to Computer Science CS1001.py ...tau-cs1001-py.wdfiles.com/local--files/lecture... · Lecture 5-6A: Highlights •Binary Numbers •Representing integers with

Extended Introduction to Computer Science CS1001.py

Lecture 7: Basic Algorithms – Part 1

Binary Search, Selection Sort

Instructors: Elhanan Borenstein, Amir Rubinstein

Teaching Assistants: Michal Kleinbort,

Noam Parzanchevsky, Ori Zviran

School of Computer Science Tel-Aviv University

Spring Semester 2020 (the “Corona Semester”) http://tau-cs1001-py.wikidot.com


Lecture 5-6A: Highlights

• Binary Numbers

• Representing integers with bits

• Integer Exponentiation - Naïve method vs. (fast) iterated squaring


Lecture 7: Plan

• Basic algorithms:
  • Binary search
  • Sorting using selection sort
  • Merging sorted lists (next time)

• Complexity of algorithms (next time)
  • The O(…) notation: a formal definition for complexity
  • Worst / best case analysis


Search

1. Sequential (linear) search
2. Binary search (on sorted lists)

(Image taken from http://bizlinksinternational.com/web/web%20seo.php)


Search

• Search has always been a central computational task. In early days, search supposedly took one quarter of all computing time.

• The emergence and popularization of the World Wide Web has literally created a universe of data, and with it the need to pinpoint information in this universe.

• Various search engines have emerged to cope with this challenge. They constantly collect data on the web, organize it, index it, and store it in sophisticated data structures that support efficient (fast) access, resilience to failures, frequent updates (including deletions), and more.

• In this class we will deal with two much simpler search algorithms:
  • Sequential search
  • Binary search


Search in Unordered vs. Ordered Lists

Hands on experience: Searching for a word in a book vs. searching for it in a dictionary.

(We mean a real world, hard copy, dictionary, not Python's dict, which you may have already met)


Sequential Search

The computational problem:

Input: a set of elements, and a key

Output: the index of an element in the set with the given key (if one exists)

Possible solution:

    def sequential_search(lst, key):
        for i in range(len(lst)):
            if lst[i] == key:
                return i
        # we get here when key is not in list
        return None

Efficiency: how many iterations in the worst and best cases?


Sequential Search: Time Analysis

• Any sequential search in an unordered list goes over it, item by item. If the list is of length n, sequential search will take n steps in the worst case (when the item is not found because it is missing).

• For a rather short list, n steps is not a problem. But if n is very large, such a search will take very long.
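To make the best and worst cases concrete, here is a minimal sketch that counts loop iterations. The helper name `sequential_search_count` is an assumption made for illustration and is not part of the lecture code.

```python
def sequential_search_count(lst, key):
    """Like sequential_search, but also reports the number of iterations.

    Hypothetical helper for illustration: returns (index or None, iterations).
    """
    iterations = 0
    for i in range(len(lst)):
        iterations += 1
        if lst[i] == key:
            return i, iterations
    # we get here when key is not in list
    return None, iterations


lst = list(range(1000))
print(sequential_search_count(lst, 0))    # best case: key is the first element
print(sequential_search_count(lst, -1))   # worst case: key is missing
```

The best case ends after a single iteration, while a missing key forces one iteration per element.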


Searching backwards: Code

    def sequential_search_back(lst, key):
        lst = lst[::-1]  # a reversed copy of lst
        for i in range(len(lst)):
            if lst[i] == key:
                return len(lst)-i-1  # index of the match in the original list
        # we get here when key is not in list
        return None

• Is going over the list in reversed order justified?

• And is list reversing a good idea? Think what will happen to the worst and best case inputs.

• What about a random order?
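One point worth seeing in code: `lst[::-1]` builds a new list of length n before the loop even starts, so even the "lucky" inputs cost on the order of n steps. The sketch below is one possible alternative that scans backwards by index without copying. The name `sequential_search_back_no_copy` is hypothetical and this variant is not taken from the lecture.

```python
def sequential_search_back_no_copy(lst, key):
    """Scan lst from its last element to its first, without reversing it.

    Hypothetical variant for illustration: returns the index of the last
    occurrence of key, or None if key is not in lst.
    """
    for i in range(len(lst) - 1, -1, -1):
        if lst[i] == key:
            return i
    # we get here when key is not in list
    return None


print(sequential_search_back_no_copy([7, 3, 9, 3], 3))
```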


Animated Example - success

Searching for the existing item, 18

index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
A:      3  4  5  6  9 10 13 16 18 20 22 28 29 31 32 33 40 42 47 48 50 52

1. mid = 10: A[mid] == 22 > 18
2. mid = 4: A[mid] == 9 < 18
3. mid = 7: A[mid] == 16 < 18
4. mid = 8: A[mid] == 18 Found!


Animated Example - failure

Searching for the non-existing item, 17

index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21
A:      3  4  5  6  9 10 13 16 18 20 22 28 29 31 32 33 40 42 47 48 50 52

1. mid = 10: A[mid] == 22 > 17
2. mid = 4: A[mid] == 9 < 17
3. mid = 7: A[mid] == 16 < 17
4. mid = 8: A[mid] == 18 > 17

Not Found! left > right


Binary Search – the code

    def binary_search(lst, key):
        """ iterative binary search. lst must be sorted """
        n = len(lst)
        left = 0
        right = n-1

        while left <= right:
            middle = (right + left)//2   # middle rounded down
            if key == lst[middle]:       # item found
                return middle
            elif key < lst[middle]:      # item not in top half
                right = middle-1
            else:                        # item not in bottom half
                left = middle+1

        #print(key, "not found")
        return None


Binary Search – the code

Execution examples: in class.
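As a stand-in for the in-class execution examples, the following sketch records the `left`, `middle` and `right` indices probed by the search on the 22-element list from the animated example. The helper `binary_search_trace` is an assumption added for illustration; it follows the same logic as `binary_search` but is not part of the lecture code.

```python
def binary_search_trace(lst, key):
    """Iterative binary search on a sorted list that records each probe.

    Hypothetical helper: returns (index or None, list of (left, middle, right)).
    """
    left, right = 0, len(lst) - 1
    probes = []
    while left <= right:
        middle = (left + right) // 2
        probes.append((left, middle, right))
        if key == lst[middle]:
            return middle, probes
        elif key < lst[middle]:
            right = middle - 1
        else:
            left = middle + 1
    return None, probes


A = [3, 4, 5, 6, 9, 10, 13, 16, 18, 20, 22,
     28, 29, 31, 32, 33, 40, 42, 47, 48, 50, 52]

for key in (18, 17):
    index, probes = binary_search_trace(A, key)
    print("key =", key, "-> result:", index)
    for left, middle, right in probes:
        print("   left =", left, "middle =", middle, "right =", right,
              "A[middle] =", A[middle])
```

Both searches probe the same four positions; they differ only in the outcome of the last comparison.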


Binary Search – Time Analysis

• As a measure for efficiency, we will look at the number of steps/iterations.

• An underlying assumption: the time needed for all the (basic) operations in each iteration is bounded by some constant.
  • Therefore each iteration takes at most a constant amount of time.
  • What is considered a basic operation? This is context dependent (discussion in class).
  • Note that this assumption does not always hold (recall integer exponentiation as an example).

• So, how many iterations are needed, as a function of the input size?
  • Input size in this case is the list’s length, denoted n.

• Does the result depend on the content of the input, or on its length only?
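One way to explore these questions experimentally is to count iterations directly. The sketch below (the helper name `binary_search_iterations` is hypothetical) searches for a key larger than every element, which always sends the search to the upper part, and compares the count with floor(log2(n)) + 1, computed here as `n.bit_length()`.

```python
def binary_search_iterations(lst, key):
    """Return the number of loop iterations binary search performs.

    Hypothetical helper for illustration; lst must be sorted.
    """
    left, right = 0, len(lst) - 1
    iterations = 0
    while left <= right:
        iterations += 1
        middle = (left + right) // 2
        if key == lst[middle]:
            break
        elif key < lst[middle]:
            right = middle - 1
        else:
            left = middle + 1
    return iterations


for n in [1000, 2000, 4000, 8000]:
    lst = list(range(n))
    # n itself is larger than every element of lst, so it is never found
    print("n =", n, "iterations:", binary_search_iterations(lst, n),
          "floor(log2(n)) + 1 =", n.bit_length())
```

Searching for other keys (for example the middle element) shows that the count also depends on the content of the input, not only on its length.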


Binary Search - Real Time Measurements

    import time

    repeat = 20  # repeat execution several times, for more significant results

    for n in [10**6, 2*10**6, 4*10**6]:
        print("n=", n)
        L = [i for i in range(n)]
        key = -1  # why?

        t0 = time.perf_counter()
        for i in range(repeat):
            res = sequential_search(L, key)
        t1 = time.perf_counter()
        print("sequential search:", t1-t0)

        t0 = time.perf_counter()
        for i in range(repeat):
            res = binary_search(L, key)
        t1 = time.perf_counter()
        print("binary search:", t1-t0)


Binary Search - Real Time Measurements

    n= 1000000
    sequential search: 2.9088399801573757
    binary search: 0.0005532383071873426
    n= 2000000
    sequential search: 5.7504573815927795
    binary search: 0.0005583582503900786
    n= 4000000
    sequential search: 11.536035866908783
    binary search: 0.0005953356179659863

• How would the results change if we searched for an element that does exist in the list? Does it depend on where in the list the element is found?


Logarithmic vs. Linear Time Algorithms

[Plot of log(n) and n. Created using https://graphsketch.com/]

• Logarithmic: input ×2, time + constant
• Linear: input ×2, time ×2 (approximately)


Sorting


The Sorting Problem

The computational problem:

Input: a set of elements

Output: a sequence of the same elements, ordered by “size”

Note that a computational problem is described in abstract terms, and is merely a desired relation between legal inputs and their outputs.

Technically, we will represent a sequence as a Python list.

Possible algorithms?

We will see at least 3 in this course, one today

These solutions employ different strategies, which has consequences in terms of efficiency, as we will see


Selection sort

• The algorithm in pseudo-code:

    Selection-Sort (input: lst of size n)
    1. for i=0 to n-1:
       1.1 find the minimum of the sublist of lst from index i onward
       1.2 swap it with the element at index i
    2. end

• Implementation in code:

    def swap(lst, i, j):
        tmp = lst[i]
        lst[i] = lst[j]
        lst[j] = tmp

    def selection_sort(lst):
        ''' sort lst (in-place) '''
        n = len(lst)
        for i in range(n):
            m_index = i  # index of minimum
            for j in range(i+1, n):
                if lst[m_index] > lst[j]:
                    m_index = j
            swap(lst, i, m_index)
        return None  # no need to return lst??
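The question in the last comment ("no need to return lst??") can be explored with a short usage example. The sketch below repeats the algorithm so that it runs on its own; writing `swap` with tuple assignment is a stylistic choice of this sketch, not the lecture's version, and the sample list is made up.

```python
def swap(lst, i, j):
    # tuple assignment swaps two cells without a temporary variable
    lst[i], lst[j] = lst[j], lst[i]


def selection_sort(lst):
    """Sort lst in place (same algorithm, repeated so the example is self-contained)."""
    n = len(lst)
    for i in range(n):
        m_index = i  # index of the minimum found so far
        for j in range(i + 1, n):
            if lst[m_index] > lst[j]:
                m_index = j
        swap(lst, i, m_index)


data = [29, 3, 18, 5, 3]
result = selection_sort(data)
print(result)  # the function itself returns None ...
print(data)    # ... because the caller's list object was reordered
```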


Selection Sort - Efficiency

• We will analyze in class the total number of iterations as a function of the list size, n.

• Then we will measure actual running time and see if it fits the formal analysis.

    def selection_sort(lst):
        ''' sort lst (in-place) '''
        n = len(lst)
        for i in range(n):
            m_index = i  # index of minimum
            for j in range(i+1, n):
                if lst[m_index] > lst[j]:
                    m_index = j
            swap(lst, i, m_index)
        return None  # no need to return lst??


Selection Sort - Analysis

• As a measure for efficiency, we will look at the number of iterations.

• An underlying assumption: the time needed for all the (basic) operations in each iteration is bounded by some constant (not including inner loops).
  • Therefore each iteration takes a constant amount of time.
  • What is considered a basic operation? This is context dependent (discussion in class).
  • Note that this assumption does not always hold (recall integer exponentiation as an example).

• So, how many iterations are needed, as a function of the input size?
  • Input size in this case is the list’s length, denoted n.

• Does the result depend on the content of the list, or on its length only?

Answers: in class and on board
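While the full answers are given in class, the inner-loop count can be checked empirically. For each i the inner loop runs n-1-i times, so the total is (n-1) + (n-2) + ... + 0 = n(n-1)/2. The sketch below counts comparisons; the helper `selection_sort_comparisons` is hypothetical and works on a copy so the input is left unchanged.

```python
import random


def selection_sort_comparisons(lst):
    """Selection-sort a copy of lst and return how many comparisons were made.

    Hypothetical helper for illustration only.
    """
    lst = list(lst)
    n = len(lst)
    comparisons = 0
    for i in range(n):
        m_index = i
        for j in range(i + 1, n):
            comparisons += 1
            if lst[m_index] > lst[j]:
                m_index = j
        lst[i], lst[m_index] = lst[m_index], lst[i]
    return comparisons


for n in [10, 100, 1000]:
    shuffled = random.sample(range(n), n)  # a random permutation of 0..n-1
    print(n, selection_sort_comparisons(shuffled), n * (n - 1) // 2)
```

The two printed counts agree for every input order, which suggests the number of iterations depends only on the list's length.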


Selection Sort – Actual Running Time

    import time
    import random

    # time.clock() was removed in Python 3.8; time.perf_counter() replaces it
    for n in [1000, 2000, 4000]:
        lst = list(range(n))      # [0,1,2,…,n-1]
        random.shuffle(lst)       # make a mess: random order
        t0 = time.perf_counter()  # start the stopwatch
        selection_sort(lst)
        t1 = time.perf_counter()  # stop the stopwatch
        print("n=", n, t1-t0)

• Output:

    n= 1000 0.06043896640494811
    n= 2000 0.27381915858021066
    n= 4000 1.0055912134084082

• How does running time seem to change with input size?


Selection Sort - Efficiency

[Plot of log(n), n and n^2]

• Quadratic: input ×2, time ×2^2 (approximately)
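The "×2^2" rule of thumb can be compared against the running times reported on the Actual Running Time slide. The sketch below only divides those reported numbers; it measures nothing itself.

```python
# Running times reported in the lecture (seconds), keyed by list length n
timings = {
    1000: 0.06043896640494811,
    2000: 0.27381915858021066,
    4000: 1.0055912134084082,
}

sizes = sorted(timings)
# ratio of running times each time the input size doubles
ratios = [timings[large] / timings[small]
          for small, large in zip(sizes, sizes[1:])]

for (small, large), ratio in zip(zip(sizes, sizes[1:]), ratios):
    print(f"n: {small} -> {large}, time ratio: {ratio:.2f}")
```

Both ratios land in the neighbourhood of 4, which is what a quadratic growth rate predicts; measurement noise explains why they are not exactly 4.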


Crash Intro to Complexity


Time Complexity: A Crash Intro

• A computational problem is a relation between input and output:
  • description of parameters (input)
  • description of solution (output)

• An algorithm is a step-by-step procedure, a “recipe”
  • can be represented e.g. as a computer program (but also in natural languages, diagrams, animations, etc.)
  • an abstract notion
  • a formal definition is given in the "Computational Models" course

• Efficient algorithms are usually preferred
  • fastest: time complexity
  • most economical in terms of memory: space complexity

• Time complexity analysis:
  • measured in terms of operations, not actual timings
    • We want to say something about the algorithm, not a specific machine/execution/programming language implementation
  • expressed as a function of the problem size
    • We will be interested in how the number of operations changes with input size


Defining Complexity

We will be interested in how the number of operations changes with input size.

In most cases, we will not care about the exact function, but rather about its “order”, or growth rate (e.g., logarithmic, linear, quadratic, etc.)

Sometimes we will only be interested in, or able to give, an upper bound for this growth rate. We will, however, strive to make this upper bound as tight (=low) as we can. In this course, we will almost always be able to give tight upper bounds.

We need some formal definition for “an upper bound on the growth rate of the number of operations, as a function of input size”.


Big O Notation

• We say that a function $f(n)$ is $O(g(n))$ if there is a constant $c$ such that for large enough $n$,

  $|f(n)| \le c \cdot |g(n)|$

• We denote this as $f(n) = O(g(n))$

• In our context, $f(n)$ will usually denote the number of operations an algorithm performs on an input of size $n$
  • a number with $n$ bits
  • a collection with $n$ elements (list, string, etc.)

• Sometimes $f(n)$ will denote the number of memory cells required by the algorithm on an input of size $n$

• So in our context $f$ and $g$ are positive functions for every $n$, and so we will omit the absolute value notation.


Big O Notation – Visualized

[Plot illustrating $f(n) = O(g(n))$, showing the curves $f(n)$ and $c \cdot g(n)$]


Big O Notation - Examples

Examples:

• $3n + 7 = O(n)$
• $3n + 7 = O(n^2)$ *
• $3n + 7 \ne O(\sqrt{n})$
• $5n \cdot \log_2 n + 1 = O(n \log n)$ [where did the log base disappear?]
• $6\log_2 n = O(n)$ *
• $2\log_2 n + 12 = O(n)$ *
• $1000 \cdot n \cdot \log_2 n = O(n^2)$ *
• $3^n \ne O(2^n)$
• $2^{n/100} \ne O(n^{100})$

* not the tightest possible bound
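A few of these examples can be sanity-checked numerically. The sketch below tests the inequality f(n) ≤ c·g(n) over a finite range for hand-picked constants; the helper `bound_holds` and the constants are assumptions chosen for illustration, and a finite check can only illustrate a bound, never prove it.

```python
import math


def bound_holds(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every integer n with n0 <= n <= n_max."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


# 3n + 7 = O(n): c = 4 works from n0 = 7 on, since 3n + 7 <= 4n iff n >= 7
linear_ok = bound_holds(lambda n: 3 * n + 7, lambda n: n, c=4, n0=7)

# 5n*log2(n) + 1 = O(n log n), with the natural log on the right-hand side:
# log2(n) = ln(n) / ln(2), so changing the base only changes the constant
nlogn_ok = bound_holds(lambda n: 5 * n * math.log2(n) + 1,
                       lambda n: n * math.log(n), c=8, n0=2)

# 3n + 7 is not O(sqrt(n)): even a generous c = 100 eventually fails
sqrt_ok = bound_holds(lambda n: 3 * n + 7, lambda n: math.sqrt(n), c=100, n0=1)

print(linear_ok, nlogn_ok, sqrt_ok)
```

The second check also hints at the answer to the bracketed question: a different log base is absorbed into the constant c.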


The Asymptotic nature of Big $O$

• Consider the two functions $f(n) = 10n \log_2 n + 1$ and $g(n) = n^2 \cdot (2 + \sin(n)/3) + 2$

• It is not hard to verify that $f(n) = O(g(n))$.

• Yet, for small values of $n$, $f(n) > g(n)$, as can be seen in the following plot:

[Plot of $f(n)$ and $g(n)$ for small values of $n$]


The Asymptotic nature of Big $O$ (cont.)

• But for large enough $n$, indeed $f(n) < g(n)$, as can be seen in the next plot:

[Plot of $f(n)$ and $g(n)$ for larger values of $n$]

• Also, remember that for big $O$, $f(n)$ may be larger than $g(n)$, as long as there is a constant $c$ such that $f(n) < c \cdot g(n)$.
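Since the plots did not survive in this transcript, the crossover can be located numerically instead. The sketch below assumes the two functions read as f(n) = 10·n·log2(n) + 1 and g(n) = n²·(2 + sin(n)/3) + 2, and scans the integers 1..1000; the search range is an arbitrary choice.

```python
import math


def f(n):
    return 10 * n * math.log2(n) + 1


def g(n):
    return n ** 2 * (2 + math.sin(n) / 3) + 2


# largest n in 1..1000 for which f(n) is still at least g(n)
last_exceed = max(n for n in range(1, 1001) if f(n) >= g(n))

print("last n with f(n) >= g(n):", last_exceed)
print("f(1000) =", f(1000), " g(1000) =", g(1000))
```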


Complexity Hierarchy

constant:          $O(1)$
logarithmic:       $O(\log n)$
poly-logarithmic:  $O(\log^2 n)$, . . .
linear:            $O(n)$
                   $O(n \log n)$
quadratic:         $O(n^2)$
                   . . .
exponential:       $O(2^n)$, $O(3^n)$, . . .

(bounded by) polynomial: all the classes listed before the exponential ones.

Slide annotation pointing at one of the classes: "We’ll meet this guy later in the course".

Unless asked to prove formally, you can use these hierarchical orderings as facts.


O(1)


What is the meaning of this?

a) A very short running time

b) A running time that is independent of the input size (i.e. constant)

c) 1 operation

d) immediate termination of the algorithm


Worst / Best Case Complexity

• In many cases, for the same size of input, the content of the input itself affects the complexity. We then distinguish between worst case and best case complexity:

  $T_{worst}(n) = \max\{time(Input) : |Input| = n\}$
  $T_{best}(n) = \min\{time(Input) : |Input| = n\}$

• Examples:

                    Worst case     Best case
  Binary search     $O(\log n)$    $O(1)$
  Selection sort    $O(n^2)$       $O(n^2)$

• Note that this statement is complete nonsense: "The best time complexity is when $n$ is very small…"


Summary of Some Previous Results

• All these results refer to worst case scenarios.

• Algorithms on sequences:
  • Binary search on a sorted list of length $n$ takes $O(\log n)$ iterations
  • Selection sort on a list of length $n$ takes $O(n^2)$ iterations
  • Merging of 2 sorted lists of sizes $n$ and $m$ takes $O(n + m)$ iterations
  • Palindrome checking on a string of length $n$ takes $O(n)$ iterations

• Algorithms on integers:
  • Addition of two $n$-bit integers takes $O(n)$ iterations
  • Multiplication of two $n$-bit integers takes $O(n^2)$ iterations
  • Naïve integer exponentiation $a^b$ where $|b| = n$ bits takes $O(2^n)$ multiplications *
  • Iterated squaring for $a^b$ where $|b| = n$ bits takes $O(n)$ multiplications *

* The number of iterations in each multiplication depends on the size of the multiplied numbers, which isn't constant. In modular exponentiation ($a^b \,\%\, c$) where $|c| = n$ bits, each multiplication takes $O(n^2)$ iterations.


Input Size - Clarifications

• We measure running time (or computational complexity) as a function of the input size.

• For integers, input size is the number of bits in the representation of the number in the computer.
  • We normally count the number of "simple" bit operations (such as adding or multiplying two bits).

• For lists/strings/dictionaries/other collections, the input size is typically the number of elements in the collection.
  • We normally count the number of "simple" list element operations (such as comparisons, assignments), assuming that the size of each element is bounded by some constant (therefore there is no need to consider the cost of operations on them, such as comparison, addition, etc.)

• There are exceptions to this, however. For example, a list of $n$ strings, each of size $m$.


How would execution time for a very fast, modern processor ($10^{10}$ ops per second, say) vary for a task with the following time complexities and input sizes n?

        n = 10         n = 20         n = 30         n = 40         n = 50            n = 60
n       1.0E-09 sec    2.0E-09 sec    3.0E-09 sec    4.0E-09 sec    5.0E-09 sec       6.0E-09 sec
n^2     1.0E-08 sec    4.0E-08 sec    9.0E-08 sec    1.6E-07 sec    2.5E-07 sec       3.6E-07 sec
n^3     1.0E-07 sec    8.0E-07 sec    2.7E-06 sec    6.4E-06 sec    1.3E-05 sec       2.2E-05 sec
n^5     1.0E-05 sec    0.00032 sec    0.00243 sec    0.01024 sec    0.03125 sec       0.07776 sec
2^n     1.02E-07 sec   1.05E-04 sec   0.107 sec      1.833 min      1.303 days        3.66 years
3^n     5.9E-06 sec    0.35 sec       5.72 hours     38.55 years    22764 centuries   1.34E+09 centuries

Modified from Garey and Johnson's classical book

Tractability - Basic Distinction:

Polynomial time = tractable. Exponential time = intractable.
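The table entries follow from dividing the number of operations by the machine speed. The sketch below recomputes them under the stated assumption of 10^10 operations per second and a 365-day year; the names `seconds_needed` and `growth` are illustrative only.

```python
OPS_PER_SECOND = 10 ** 10            # machine speed assumed in the table
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumes a 365-day year


def seconds_needed(operations):
    """Time in seconds to perform the given number of operations."""
    return operations / OPS_PER_SECOND


growth = {
    "n": lambda n: n,
    "n^2": lambda n: n ** 2,
    "n^3": lambda n: n ** 3,
    "n^5": lambda n: n ** 5,
    "2^n": lambda n: 2 ** n,
    "3^n": lambda n: 3 ** n,
}

for name, func in growth.items():
    row = [f"{seconds_needed(func(n)):.3g}" for n in range(10, 70, 10)]
    print(f"{name:>4} (seconds):", "  ".join(row))

print("2^40 in minutes:", seconds_needed(2 ** 40) / 60)
print("3^40 in years:", seconds_needed(3 ** 40) / SECONDS_PER_YEAR)
```

Everything is printed in seconds first; converting the large values to minutes, days or years is a matter of dividing by the appropriate factor, as in the last two lines.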


Time Complexity - What is tractable in Practice?

• A polynomial-time algorithm is good.
  • $n^{100}$ is polynomial, hence good.

• An exponential-time algorithm is bad.
  • $2^{n/100}$ is exponential, hence bad.

Yet for input of size $n = 4000$, the $n^{100}$ time algorithm takes more than $10^{35}$ centuries on the above mentioned machine, while the $2^{n/100}$ algorithm runs in just under two minutes.
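These two figures can be checked with a few lines of arithmetic, again assuming 10^10 operations per second and 365-day years. Because n^100 is far too large to be useful as a float, the sketch works with base-10 logarithms for that case.

```python
import math

OPS_PER_SECOND = 10 ** 10
SECONDS_PER_CENTURY = 100 * 365 * 24 * 3600

n = 4000

# 2^(n/100) = 2^40 operations: small enough to compute directly
exp_seconds = 2 ** (n // 100) / OPS_PER_SECOND
print("2^(n/100):", exp_seconds / 60, "minutes")

# n^100: compute log10 of the running time in centuries
log10_ops = 100 * math.log10(n)
log10_centuries = (log10_ops
                   - math.log10(OPS_PER_SECOND)
                   - math.log10(SECONDS_PER_CENTURY))
print("n^100: about 10^%d centuries" % log10_centuries)
```

Under these assumptions the polynomial algorithm's running time comes out far above the stated lower bound, so the point of the comparison only gets stronger.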


Time Complexity - Advice

• Trust, but check! Don't just mumble "polynomial-time algorithms are good", "exponential-time algorithms are bad" because the lecturer told you so.

• Asymptotic run time and the O notation are important, and in most cases help clarify and simplify the analysis.

• But when faced with a concrete task on a specific problem size, you may be far away from "the asymptotic".

• In addition, constants hidden in the O notation may have unexpected impact on actual running time.


Time Complexity – Advice (cont.)

• We will employ both asymptotic analysis and direct measurements of the actual running time.

• For direct measurements, we will use either the time module and its time.perf_counter() function (time.clock(), used in older code, was removed in Python 3.8).

• Or the timeit package and the timeit.timeit() function.

• Both have some deficiencies, yet are highly useful for our needs.
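A minimal sketch of the two measurement styles follows. It uses `time.perf_counter()` and `timeit.timeit()` from the standard library; the list size and the `number=10` repetition count are arbitrary choices for illustration.

```python
import time
import timeit


def sequential_search(lst, key):
    for i in range(len(lst)):
        if lst[i] == key:
            return i
    return None


lst = list(range(10 ** 5))

# Option 1: read a high-resolution clock before and after the call
t0 = time.perf_counter()
sequential_search(lst, -1)
t1 = time.perf_counter()
print("perf_counter:", t1 - t0)

# Option 2: let timeit run the call several times and return the total time
total = timeit.timeit(lambda: sequential_search(lst, -1), number=10)
print("timeit, average per call:", total / 10)
```

Repeating the call, as timeit does, smooths out some of the noise that a single measurement is exposed to.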


Average Complexity

• Often the average complexity is more informative (e.g. when the worst and best cases are rather rare).

• However, analyzing it is usually more complicated, and requires some knowledge of the probability distribution of the inputs.

• Assuming the distribution is uniform:

  $T_{average}(n) = \frac{\sum_{|Input| = n} time(Input)}{\text{no. of inputs of size } n}$

• Examples from our course you will encounter in the near future:
  - Quicksort runs on average in $O(n \log n)$
  - Hash table chains are of average length $O(n/m)$
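As a small worked instance of the averaging formula (not one of the lecture's own examples), consider sequential search when the key is known to be in the list and every position is equally likely. The sketch below averages the iteration count over those n inputs; the helper `iterations_until_found` is hypothetical.

```python
def iterations_until_found(lst, key):
    """Number of loop iterations sequential search makes before returning."""
    for count, item in enumerate(lst, start=1):
        if item == key:
            return count
    return len(lst)  # key is missing: every element was examined


n = 1000
lst = list(range(n))

# average over the n equally likely inputs "the key sits at position i"
average = sum(iterations_until_found(lst, key) for key in lst) / n
print(average, (n + 1) / 2)
```

The average comes out as (n + 1) / 2, roughly half the worst case, so it is still linear in n.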