Algorithms @ YMU, 2005


Page 1: Algorithms @ YMU, 2005

2005/6/3

Algorithms @ YMU, 2005

呂學一

http://www.csie.ntu.edu.tw/~hil/

Page 2: Algorithms @ YMU, 2005

Today

Exact string matching in linear time.

Page 3: Algorithms @ YMU, 2005

The Exact String Matching Problem

Input
– a string P (the pattern)
– a string S (the text)

Output
– all the occurrences of P in S.

Page 4: Algorithms @ YMU, 2005

Illustration

    1 2 3 4 5 6 7 8 9 0 1 2 3
S = t a t a t t a t a t a t a
P = t a t a
Output: 1, 6, 8, 10

Page 5: Algorithms @ YMU, 2005

B. Why?

Computer Science
– Dictionary, database
– Search engines: Yahoo!, Google, …

Biology
– BLAST

Warm-up for this course
– A well-studied problem,
– The ideas and techniques behind it.

Page 6: Algorithms @ YMU, 2005

C. How?

Page 7: Algorithms @ YMU, 2005

Notation for strings

S is a string
– |S| = the length of S.
– substring: S[i…j].

Page 8: Algorithms @ YMU, 2005

A naïve algorithm

Input: S and P.
Output: all occurrences of P in S.

for i = 1 to |S| – |P| + 1
    if S[i…i+|P|-1] equals P
        output i;
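The naïve algorithm above can be sketched in Python as follows. This is only an illustration: the function name `naive_match` is an assumption of this sketch, not something from the slides, and the index shift accounts for Python strings being 0-based while the slides count from 1.

```python
def naive_match(text: str, pattern: str) -> list[int]:
    """Return the 1-based starting positions of pattern in text."""
    n, m = len(text), len(pattern)
    occurrences = []
    # Position i on the slide corresponds to Python index i - 1.
    # The last possible starting position is |S| - |P| + 1.
    for i in range(1, n - m + 2):
        if text[i - 1:i - 1 + m] == pattern:
            occurrences.append(i)
    return occurrences


print(naive_match("tatattatatata", "tata"))  # prints [1, 6, 8, 10]
```

Each of the roughly |S| candidate positions costs up to |P| character comparisons, so this sketch runs in O(|S|·|P|) time in the worst case.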

Page 9: Algorithms @ YMU, 2005

Illustration

    1 2 3 4 5 6 7 8 9 0 1 2 3
S = t a t a t t a t a t a t a
P = t a t a
Output: 1, 6, 8, 10

Page 10: Algorithms @ YMU, 2005

Another approach

Dan Gusfield’s Z values

Page 11: Algorithms @ YMU, 2005

The Z values of a string S

Z(i) of a string S is the largest integer d such that S[1…d] = S[i…i+d-1].

[Figure: the string S, with the substring S[i…i+d-1] marked from position i to position i+d-1.]

Page 12: Algorithms @ YMU, 2005

Clearly, …

If Z(1), Z(2), …, Z(|S|) are the Z values of S, then
– Z(1) = |S|;
– Z(i) ≥ 0 for each i = 1, 2, …, |S|.

Page 13: Algorithms @ YMU, 2005

For example, …

S =  a a g c a a t a a a g c
Z = 12 1 0 0 2 1 0 2 4 1 0 0

Page 14: Algorithms @ YMU, 2005

Question

How do we find all occurrences of P in S using Z values (of what)?

Page 15: Algorithms @ YMU, 2005

Exact String Matching with Z values

computing Z values of PS;
for i = 1 to |S|
    if Z(i+|P|) >= |P| then
        output i;

[Figure: the concatenation PS; a match of length d with the prefix P starts at position i+|P| and ends at position i+|P|+d-1 of PS.]
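A minimal Python sketch of matching with Z values, under a few assumptions: the names `z_values` and `z_match` are hypothetical, and the Z values are computed here by the simple quadratic method only to keep the example self-contained (it is not the linear-time method).

```python
def z_values(s: str) -> list[int]:
    """Naive Z values; z[i - 1] holds Z(i) for the 1-based position i."""
    n = len(s)
    z = [0] * n
    for i in range(n):
        d = 0
        # Extend the match between the prefix of s and the substring at i.
        while i + d < n and s[d] == s[i + d]:
            d += 1
        z[i] = d
    return z


def z_match(text: str, pattern: str) -> list[int]:
    """Return the 1-based positions i of text with Z(i + |P|) >= |P| in PS."""
    m = len(pattern)
    z = z_values(pattern + text)
    # Z(i + |P|) of the concatenation PS sits at 0-based index i + m - 1.
    return [i for i in range(1, len(text) + 1) if z[i + m - 1] >= m]


print(z_match("tatattatatata", "tata"))  # prints [1, 6, 8, 10]
```

The test is `>=` rather than `==` because a match starting inside S may extend beyond |P| characters when S continues to agree with the concatenation PS.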

Page 16: Algorithms @ YMU, 2005

Time complexity?

computing Z values of PS;
for i = 1 to |S|
    if Z(i+|P|) >= |P| then
        output i;

O(|S|) + time for computing the Z values of PS.

Page 17: Algorithms @ YMU, 2005

Computing Z(i) naively

for i = 1 to |S| {
    let j = i;
    let Z(i) = 0;
    while (j ≤ |S| and S[j] == S[j-i+1]) {
        Z(i)++;
        j++;
    }
}

Time complexity? O(|S|^2).

Is it tight? Yes: consider S = 000…000.
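To see why the quadratic bound is tight, the following Python sketch counts the character comparisons made by the naive computation. The function name `naive_z_comparisons` is an assumption of this sketch; the loop mirrors the pseudocode above with an explicit end-of-string check.

```python
def naive_z_comparisons(s: str) -> int:
    """Count the character comparisons made by the naive Z computation."""
    n = len(s)
    comparisons = 0
    for i in range(n):
        d = 0
        while i + d < n:
            comparisons += 1
            if s[d] != s[i + d]:
                break  # first mismatch ends the scan for this position
            d += 1
    return comparisons


for n in (10, 100, 1000):
    # On a string of n equal characters every comparison succeeds,
    # so the total is n + (n - 1) + ... + 1 = n(n + 1)/2.
    print(n, naive_z_comparisons("0" * n))
```

For S = 000…000 of length n the count is n(n+1)/2, which grows quadratically in |S|.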

Page 18: Algorithms @ YMU, 2005

Z(i) can be naively computed in O(Z(i)+1) time

for i = 1 to |S| {
    let j = i;
    let Z(i) = 0;
    while (j ≤ |S| and S[j] == S[j-i+1]) {
        Z(i)++;
        j++;
    }
}

The while loop makes Z(i) successful comparisons and at most one failed comparison.

We need this observation later.

Page 19: Algorithms @ YMU, 2005

Z values in linear time

Page 20: Algorithms @ YMU, 2005

Notation

右護法 (i), the "right guard" of i: 右護法 (i) = max{j + Z(j) – 1 | 1 < j ≤ i}.
– Abbreviated as 右 (i).

左護法 (i), the "left guard" of i: 左護法 (i) = min{j | 右 (j) = 右 (i)}.
– Abbreviated as 左 (i).

Observation: both 左 (i) and 右 (i) are nondecreasing in i.

Page 21: Algorithms @ YMU, 2005

Illustration

[Figure: the string S with positions i – 1 and i marked, together with the right ends (i – 1) + Z(i – 1) – 1 and i + Z(i) – 1 of their matching substrings.]

Page 22: Algorithms @ YMU, 2005

Illustration

[Figure: the string S with positions 1, 2, 左 (i), i, and 右 (i) marked.]

Page 23: Algorithms @ YMU, 2005

For example, …

   i     =  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
   S     =  a  a  b  a  a  b  c  a  x  a  a  b  a  a  b  c  y
   Z     =     1  0  3  1  0  0  1  0  7  1  0  3  1  0  0  0
i+Z(i)-1 =     2  2  6  5  5  6  8  8 16 11 11 15 14 14 15 16
  右 (i) =     2  2  6  6  6  6  8  8 16 16 16 16 16 16 16 16
  左 (i) =     2  2  4  4  4  4  8  8 10 10 10 10 10 10 10 10
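The rows of this example can be recomputed with a short Python sketch. The names `z_values` and `guards` are assumptions of this sketch, with `right` and `left` standing for 右 and 左; the Z values are computed naively because only the definitions are being illustrated here.

```python
def z_values(s: str) -> list[int]:
    """Naive Z values; z[i] holds Z(i) for 1-based i (z[0] is unused)."""
    n = len(s)
    z = [0] * (n + 1)
    for i in range(1, n + 1):
        d = 0
        # s[d] is the 1-based character d + 1; s[i + d - 1] is character i + d.
        while i + d <= n and s[d] == s[i + d - 1]:
            d += 1
        z[i] = d
    return z


def guards(s: str) -> tuple[list[int], list[int]]:
    """Return the lists (right, left) for i = 2, ..., |S|."""
    z = z_values(s)
    right, left = [], []
    for i in range(2, len(s) + 1):
        end = i + z[i] - 1
        if not right or end > right[-1]:
            # A new maximum of j + Z(j) - 1 is first attained at j = i.
            right.append(end)
            left.append(i)
        else:
            # The running maximum is unchanged, so both guards carry over.
            right.append(right[-1])
            left.append(left[-1])
    return right, left


right, left = guards("aabaabcaxaabaabcy")
print(right)  # [2, 2, 6, 6, 6, 6, 8, 8, 16, 16, 16, 16, 16, 16, 16, 16]
print(left)   # [2, 2, 4, 4, 4, 4, 8, 8, 10, 10, 10, 10, 10, 10, 10, 10]
```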

Page 24: Algorithms @ YMU, 2005

Strategy

Computing Z(i), 右 (i), 左 (i) from
– Z(1), Z(2), …, Z(i – 1);
– 右 (i – 1);
– 左 (i – 1).

Page 25: Algorithms @ YMU, 2005

Case 1: 右 (i-1) ≤ i-1.

右 (i-1) does not cover i.
– (S[i] is not protected by the right guard.)

Computing Z(i) naively in O(1+Z(i)) time.
左 (i) = i.
右 (i) = i + Z(i) – 1.

Observation (need this later):
1+Z(i) = 右 (i) – i + 2 ≤ 右 (i) – 右 (i – 1) + 1.

Page 26: Algorithms @ YMU, 2005

Case 2: 右 (i-1) ≥ i and Z(j) < 右 (i-1)-i+1.

j = i – 左 (i-1) + 1

[Figure: the string S with positions 左 (i-1), i, and 右 (i-1) marked; position j in the prefix corresponds to position i, and the match of length Z(j) starting at j is marked.]

Page 27: Algorithms @ YMU, 2005

Z(i) = Z(j), 左 (i) = 左 (i-1), 右 (i) = 右 (i-1).

j = i – 左 (i-1) + 1

[Figure: the string S with positions 左 (i-1), i, and 右 (i-1) marked; the match of length Z(j) starting at j is marked.]

Page 28: Algorithms @ YMU, 2005

Case 3: 右 (i-1) ≥ i and Z(j) ≥ 右 (i-1)-i+1.

j = i – 左 (i-1) + 1

[Figure: the string S with positions 左 (i-1), i, and 右 (i-1) marked; the match of length Z(j) starting at j is marked.]

Page 29: Algorithms @ YMU, 2005

Finding Z(i) by comparisons starting from 右 (i-1)+1. Why?

Because Z(j) ≥ 右 (i-1) – i + 1, the characters S[i…右 (i-1)] are already known to match S[1…右 (i-1) – i + 1].

j = i – 左 (i-1) + 1

[Figure: the string S with positions 左 (i-1), i, and 右 (i-1) marked; the match of length Z(j) starting at j is marked.]

Page 30: Algorithms @ YMU, 2005

Computing 左 (i) and 右 (i).

j = i – 左 (i-1) + 1

[Figure: the string S with positions 左 (i-1), i, and 右 (i-1) marked; the match of length Z(j) starting at j is marked.]

右 (i) = i + Z(i) – 1.
左 (i) = i.

How many comparisons?
– 右 (i) – 右 (i-1) + 1.
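The three cases can be put together as in the following Python sketch. It is an illustration under stated assumptions, not the lecturer's code: the names `z_linear`, `left`, `right`, and `ch` are hypothetical (`left` and `right` stand for 左 (i-1) and 右 (i-1)), and `right` starts at 0 as a convention of this sketch so that i = 2 falls into Case 1.

```python
def z_linear(s: str) -> list[int]:
    """Z values of s; the result z satisfies z[i] == Z(i) for i = 1..|S|."""
    n = len(s)
    z = [0] * (n + 1)  # z[0] is unused so that indices match the slides
    if n == 0:
        return z
    z[1] = n

    def ch(k: int) -> str:
        """1-based character access, S[k] on the slides."""
        return s[k - 1]

    left, right = 0, 0  # assumed starting convention: nothing is covered yet
    for i in range(2, n + 1):
        if right <= i - 1:
            # Case 1: position i is not covered; compare from scratch.
            d = 0
            while i + d <= n and ch(1 + d) == ch(i + d):
                d += 1
            z[i] = d
            left, right = i, i + d - 1
        else:
            j = i - left + 1          # position in the prefix matching i
            covered = right - i + 1   # length of S[i..right]
            if z[j] < covered:
                # Case 2: the answer is copied; both guards stay put.
                z[i] = z[j]
            else:
                # Case 3: S[i..right] already matches the prefix,
                # so comparisons resume at position right + 1.
                d = covered
                while i + d <= n and ch(1 + d) == ch(i + d):
                    d += 1
                z[i] = d
                left, right = i, i + d - 1
    return z


print(z_linear("aagcaataaagc")[1:])  # prints [12, 1, 0, 0, 2, 1, 0, 2, 4, 1, 0, 0]
```

Every successful comparison in Cases 1 and 3 moves `right` forward, and `right` never exceeds |S|, which is where the linear bound comes from.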

Page 31: Algorithms @ YMU, 2005

Time complexity is linear.

Case 1:
– O(Z(i)+1) = O(右 (i) – 右 (i-1) + 1).
Case 2:
– O(1) = O(右 (i) – 右 (i-1) + 1).
Case 3:
– O(右 (i) – 右 (i-1) + 1).

Page 32: Algorithms @ YMU, 2005

Overall time complexity

\[
\sum_{i=1}^{|S|} O\bigl(\text{右}(i) - \text{右}(i-1) + 1\bigr) = O(|S|) + O(|S|) = O(|S|).
\]

The differences 右 (i) – 右 (i-1) telescope to at most |S|, and the +1 terms sum to |S|.