Efficiency of Algorithms Csci 107 Lecture 5

Post on 19-Dec-2015



Efficiency of Algorithms

Csci 107

Lecture 5


• Last time
  – Algorithms for
    • Find all occurrences of target
    • Find number of occurrences of target
    • Find number of values larger than target
    • Find largest/smallest, sum, average
  – Pattern matching

• Today
  – Pattern matching algorithm
  – Efficiency of algorithms
  – Data cleanup algorithms
  – Reading: start on Chapter 3, textbook


Pattern Matching

Problem: Suppose we have a gene (text) T = TCAGGCTAATCGTAGG and a probe (pattern) P = TA. Design an algorithm that searches T to find the position of every instance of P that appears in T.

– E.g., for this text, the algorithm should return the answer:
    There is a match at position 7
    There is a match at position 13

• Algorithm:
  – What is the idea?
  – Check if the pattern matches starting at position 1, then check if it matches starting at position 2, … and so on
  – How to check if the pattern matches the text starting at position k?
    • Check that every character of the pattern matches the corresponding character of the text


Pattern Matching

• Input
  – Gene (text) of n characters T1, T2, …, Tn
  – Probe (pattern) of m (m < n) characters P1, P2, …, Pm

• Output:
  – Location (index) of every occurrence of the pattern within the text

• Algorithm idea
  – Get input (text and pattern)
  – Set starting location k to 1
  – Repeat until the end of the text is reached
    • Attempt to match every character in the pattern beginning at position k in the text
    • If there was a match, print k
    • Add 1 to k
  – Stop
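The algorithm idea above can be sketched in Python as follows. This is one possible rendering, not the textbook's own code: the function name `find_matches` is an assumption, positions are reported 1-based to match the slides, and the loop stops at k = n - m + 1 because the pattern no longer fits after that point. A non-empty pattern is assumed.

```python
def find_matches(text, pattern):
    """Return the 1-based positions where pattern occurs in text."""
    n, m = len(text), len(pattern)
    positions = []
    k = 1  # current starting location in the text (1-based, as in the slides)
    while k <= n - m + 1:
        i = 1  # current position in the pattern (1-based)
        mismatch = False
        # Compare the pattern character by character against text[k .. k+m-1]
        while i <= m and not mismatch:
            if pattern[i - 1] != text[k + i - 2]:
                mismatch = True
            else:
                i += 1
        if not mismatch:
            positions.append(k)
        k += 1
    return positions


for position in find_matches("TCAGGCTAATCGTAGG", "TA"):
    print("There is a match at position", position)
```

For the gene and probe from the problem statement this reports matches at positions 7 and 13.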


Comparing Algorithms

• Algorithm
  – Design
  – Correctness
  – Efficiency
  – Also clarity, elegance, ease of understanding

• There are many ways to solve a problem
  – Conceptually
  – Also different ways to write pseudocode for the same conceptual idea

• How to compare algorithms?


Efficiency of Algorithms

• Efficiency: amount of resources used by an algorithm
    • Space (number of variables)
    • Time (number of instructions)

• When designing an algorithm, we must be aware of its use of resources

• If there is a choice, pick the more efficient algorithm!


Efficiency of Algorithms

Does efficiency matter?

• Computers are so fast these days…

• Yes, efficiency matters a lot!
  – There are problems (actually a lot of them) for which all known algorithms are so inefficient that they are impractical
  – Remember the shortest-path-through-all-cities problem from Lab 1…


Efficiency of Algorithms

How to measure time efficiency?

• Running time: let it run and see how long it takes
  – On what machine?
  – On what inputs?

Time efficiency depends on input

• Example: the sequential search algorithm
  – In the best case, how fast can the algorithm halt?
  – In the worst case, how fast can the algorithm halt?


Time Efficiency

• We want a measure of time efficiency that is independent of machine, speed, etc.
  – Look at an algorithm's pseudocode and estimate its running time
  – Look at the pseudocode of two algorithms and compare them

• (Time) efficiency of an algorithm:
  – The number of pseudocode instructions (steps) executed

• Is this accurate?
  – Not all instructions take the same amount of time…
  – But it is a good approximation of running time in most cases


(Time) Efficiency of an Algorithm

Worst-case efficiency is the maximum number of steps that an algorithm can take for any input data values.

Best-case efficiency is the minimum number of steps that an algorithm can take for any input data values.

Average-case efficiency
- the efficiency averaged over all possible inputs
- must assume a distribution of the input
- we normally assume a uniform distribution (all keys are equally probable)

If the input has size n, efficiency will be a function of n.


Analysis of Sequential Search

• Time efficiency
  – Best case: 1 comparison
    • Target is found immediately
  – Worst case: 3n + 5 steps
    • Target is not found
  – Average case: 3n/2 + 4 steps
    • Target is found in the middle

• Space efficiency
  – How much space is used in addition to the input?
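As a rough illustration of the best, worst, and average cases, the sketch below counts only the comparisons of a list element against the target, not every pseudocode step, so its constants are smaller than the step totals quoted for sequential search. The function name and the sample list are assumptions made for the example, and the average assumes each stored value is equally likely to be the target.

```python
def sequential_search(values, target):
    """Return (1-based index or None, number of element comparisons)."""
    comparisons = 0
    index = 1
    while index <= len(values):
        comparisons += 1
        if values[index - 1] == target:
            return index, comparisons
        index += 1
    return None, comparisons


data = [7, 2, 9, 4, 1]  # hypothetical list with distinct values
n = len(data)

best = sequential_search(data, data[0])[1]   # target is the first element
worst = sequential_search(data, 8)[1]        # target is not in the list
average = sum(sequential_search(data, v)[1] for v in data) / n

print("best:", best)        # 1 comparison
print("worst:", worst)      # n comparisons
print("average:", average)  # (n + 1) / 2 comparisons for a successful search
```

The growth pattern matches the slide: the best case is constant, while the worst and average cases grow linearly with n.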


Worst Case Efficiency for Sequential Search

Step                                                        Times executed
1.  Get the value of target, n, and the list of n values    1
2.  Set index to 1                                          1
3.  Set found to false                                      1
4.  Repeat steps 5-8 until found = true or index > n        n
5.      if the value of list_index = target then            n
6.          Output the index                                0
7.          Set found to true                               0
8.      else Increment the index by 1                       n
9.  if not found then                                       1
10.     Print a message that target was not found           0
11. Stop                                                    1
Total                                                       3n+5


Order of Magnitude

• Worst case of sequential search:
  – 3n + 5 steps
  – Are these constants accurate? Can we ignore them?

• Simplification:
  – Ignore the constants, look only at the order of magnitude
  – n, 0.5n, 2n, 4n, 3n + 5, 2n + 100, 0.1n + 3, … are all linear
  – We say that their order of magnitude is n
    • 3n + 5 is order of magnitude n: 3n + 5 = Θ(n)
    • 2n + 100 is order of magnitude n: 2n + 100 = Θ(n)
    • 0.1n + 3 is order of magnitude n: 0.1n + 3 = Θ(n)
    • …
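A quick numeric check can show why the constants are ignored. The sketch below, with a hypothetical helper name, evaluates the worst-case step count 3n + 5 for growing n: the ratio to n settles near the constant 3, and doubling n roughly doubles the count, which is what "linear" means here.

```python
def worst_case_steps(n):
    """Worst-case step count for sequential search, as given on the slide."""
    return 3 * n + 5


for n in (10, 100, 1000, 10000):
    steps = worst_case_steps(n)
    ratio_to_n = steps / n                           # approaches 3 as n grows
    doubling = worst_case_steps(2 * n) / steps       # approaches 2 as n grows
    print(n, steps, round(ratio_to_n, 4), round(doubling, 4))
```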


Data Cleanup Algorithms

What are they?

A systematic strategy for removing errors from data.

Why are they important?

Errors occur in all real computing situations.

How are they related to the search algorithm?

To remove errors from a series of values, each value must be examined to determine if it is an error.

E.g., suppose we have a list d of data values, from which we want to remove all the zeroes (they mark errors), and pack the good values to the left. Legit is the number of good values remaining when we are done.

d1  d2  d3  d4  d5  d6  d7  d8
5   3   4   0   6   2   4   0

Legit


Data Cleanup: Copy-Over algorithm

Idea: Scan the list from left to right and copy non-zero values to a new list

Copy-Over Algorithm (Fig 3.2)

Variables: n, A1, …, An, newposition, left, B1, …, Bn
• Get values for n and the list of n values A1, A2, …, An
• Set left to 1
• Set newposition to 1
• While left <= n do
    • If A_left is non-zero
        • Copy A_left into B_newposition (copy it into position newposition in the new list)
        • Increase left by 1
        • Increase newposition by 1
    • Else increase left by 1
• Stop
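The copy-over pseudocode translates almost line for line into Python. The sketch below is one possible rendering under a few assumptions: the function name is made up, indices are 0-based rather than 1-based, and the new list B is pre-allocated with n slots and trimmed to the filled part at the end.

```python
def copy_over(a):
    """Copy the non-zero values of a, in order, into a new list."""
    n = len(a)
    b = [0] * n          # new list B, same size as A
    left = 0             # scans A from left to right
    newposition = 0      # next free slot in B
    while left < n:
        if a[left] != 0:
            b[newposition] = a[left]
            left += 1
            newposition += 1
        else:
            left += 1
    return b[:newposition]  # keep only the slots that were filled


print(copy_over([5, 3, 4, 0, 6, 2, 4, 0]))  # [5, 3, 4, 6, 2, 4]
```

The list is scanned once, so the time grows linearly with n, at the cost of a second list of up to n values.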


Data Cleanup: The Shuffle-Left Algorithm

• Idea:
  – Go over the list from left to right. Every time we see a zero, shift all subsequent elements one position to the left.
  – Keep track of the number of legitimate (non-zero) entries

• How does this work?

• How many loops do we need?


Shuffle-Left Algorithm (Fig 3.1)

Variables: n, A1, …, An, legit, left, right
1  Get values for n and the list of n values A1, A2, …, An
2  Set legit to n
3  Set left to 1
4  Set right to 2
5  Repeat steps 6-14 until left > legit
6      if A_left ≠ 0
7          Increase left by 1
8          Increase right by 1
9      else
10         Reduce legit by 1
11         Repeat steps 12-13 until right > n
12             Copy A_right into A_(right-1)
13             Increase right by 1
14         Set right to left + 1
15 Stop
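A possible Python rendering of the shuffle-left steps is sketched below. The function name and the choice to work on a copy and return the packed prefix together with legit are assumptions for the example; indices are 0-based, so left starts at 0 and right at 1.

```python
def shuffle_left(values):
    """Pack non-zero values to the left; return (packed values, legit)."""
    a = list(values)     # work on a copy so the caller's list is unchanged
    n = len(a)
    legit = n            # number of legitimate entries found so far
    left = 0
    right = 1
    while left < legit:
        if a[left] != 0:
            left += 1
            right += 1
        else:
            legit -= 1
            # Shift everything to the right of the zero one slot left
            while right < n:
                a[right - 1] = a[right]
                right += 1
            right = left + 1
    return a[:legit], legit


print(shuffle_left([5, 3, 4, 0, 6, 2, 4, 0]))  # ([5, 3, 4, 6, 2, 4], 6)
```

There are two nested loops: every zero triggers a shift of the remaining elements, so a list with many zeros costs on the order of n² copies, but no second list is needed.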


Exercising the Shuffle-Left Algorithm

d1  d2  d3  d4  d5  d6  d7  d8
5   3   4   0   6   2   4   0

legit


Data Cleanup: The Converging-Pointers Algorithm

• Idea:
  – One finger moving left to right, one moving right to left
  – Move the left finger over non-zero values
  – If it encounters a zero value then
    • Copy the element at the right finger into this position
    • Shift the right finger to the left


Converging Pointers Algorithm (Fig 3.3)

Variables: n, A1, …, An, legit, left, right
1  Get values for n and the list of n values A1, A2, …, An
2  Set legit to n
3  Set left to 1
4  Set right to n
5  Repeat steps 6-10 until left ≥ right
6      If the value of A_left ≠ 0 then increase left by 1
7      Else
8          Reduce legit by 1
9          Copy the value of A_right to A_left
10         Reduce right by 1
11 If A_left = 0 then reduce legit by 1
12 Stop
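The converging-pointers steps can be sketched in Python as below. As with the other sketches, the function name and return shape are assumptions, indices are 0-based, and an explicit guard for the empty list is added because the pseudocode assumes at least one value.

```python
def converging_pointers(values):
    """Pack non-zero values to the left; return (packed values, legit)."""
    a = list(values)     # work on a copy so the caller's list is unchanged
    n = len(a)
    if n == 0:           # guard added for the sketch; not in the pseudocode
        return [], 0
    legit = n
    left = 0
    right = n - 1
    while left < right:
        if a[left] != 0:
            left += 1
        else:
            legit -= 1
            a[left] = a[right]   # overwrite the zero with the rightmost value
            right -= 1
    # The pointers have met; the value they meet on has not been checked yet
    if a[left] == 0:
        legit -= 1
    return a[:legit], legit


print(converging_pointers([5, 3, 4, 0, 6, 2, 4, 0]))  # ([5, 3, 4, 4, 6, 2], 6)
```

Each element is visited at most once and no extra list is used, but the original order of the good values is not preserved: here the trailing 4 moves into the slot of the first zero.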


Exercising the Converging Pointers Algorithm

d1  d2  d3  d4  d5  d6  d7  d8
5   3   4   0   6   2   4   0

legit