chapter 2 arrays and structures

54
Chapter 2 Arrays and Structures Instructors: C. Y. Tang and J. S. Roge r Jang All the material are integrated from the textbook "Fundamental s of Data Structures in C" and some supplement from the slide s of Prof. Hsin-Hsi Chen (NTU).

Upload: marly

Post on 09-Jan-2016

49 views

Category:

Documents


2 download

DESCRIPTION

Chapter 2 Arrays and Structures. Instructors: C. Y. Tang and J. S. Roger Jang. All the material are integrated from the textbook "Fundamentals of Data Structures in C" and some supplement from the slides of Prof. Hsin-Hsi Chen (NTU). Outline. The array as an abstract data type - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Chapter 2  Arrays and Structures

Chapter 2 Arrays and Structures

Instructors: C. Y. Tang and J. S. Roger Jang

All the material are integrated from the textbook "Fundamentals of Data Structures in C" and some supplement from the slides of Prof. Hsin-Hsi Chen (NTU).

Page 2: Chapter 2  Arrays and Structures

Outline The array as an abstract data type Structures and unions The polynomial abstract data type The sparse matrix abstract data type Representation of multidimensional arrays The string abstract data type

Page 3: Chapter 2  Arrays and Structures

Outline The array as an abstract data type Structures and unions The polynomial abstract data type The sparse matrix abstract data type Representation of multidimensional arrays The string abstract data type

Page 4: Chapter 2  Arrays and Structures

Arrays

Array: a set of index and value

data structureFor each index, there is a value associated with

that index.

representation (possible)implemented by using consecutive memory.

Page 5: Chapter 2  Arrays and Structures

Abstract Data Type Array Structure Array is objects : a set of pairs <index, value> where for each value of index there is a value from the set item.

functions: Array Create (j, list) : an array of j dimensions, where list is a j-tuple whose ith element is the size of the ith dimension. Item Retrieve (A, i) : If (i ε index) return the item indexed by i in array A, else return error. Array Store (A, i, x) : If (i ε index) insert <i, x> and return the new array,

else return error. end Array

Page 6: Chapter 2  Arrays and Structures

Implementation in C int list[5], *plist[5];

Variable Memory Address

list[0]base address =

α

list[1] α + sizeof(int)list[2] α + 2*sizeof(int)list[3] α + 3*sizeof(int)list[4] α + 4*sizeof(int)

The names of the five

array elements

Assume the memory address α = 1414, list[0] = 6, list[2] = 8, plist[3] = list = 1414 printf(“%d”, list); // 1414, the variable “list” is the pointer to an int printf(“%d”, &list[0]) // 1414, return the address using “&” printf(“%d”, list[0]); // 6, the value stored in the address “&list[0]” printf(“%d”, (list + 2)); // 1414 + 2*sizeof(int), “(list +2)” is a pointer printf(“%d”, *(list + 2)); // 8 printf(“%d”, plist[3]); // 1414, the value of a pointer is an address printf(“%d”, *plist[3]); // 6

Page 7: Chapter 2  Arrays and Structures

Example: One-dimensional array accessed by address

call print1(&one[0], 5)

void print1(int *ptr, int rows){/* print out a one-dimensional array using a pointer */

int i;printf(“Address Contents\n”);for (i=0; i < rows; i++)

printf(“%8u%5d\n”, ptr+i, *(ptr+i));printf(“\n”);

}

int one[] = {0, 1, 2, 3, 4};Goal: print out address and value

Page 8: Chapter 2  Arrays and Structures

Example: Array program #define MAX_SIZE 100 float sum(float [], int); float input[MAX_SIZE], answer; int i; void main (void) { for (i = 0; i < MAX_SIZE; i++) input[i] = i; answer = sum(input, MAX_SIZE); printf("The sum is: %f\n", answer); } float sum(float list[], int n) { int i; float tempsum = 0; for (i = 0; i < n; i++) tempsum += list[i]; return tempsum; }

Result : The sum is: 4950.000000

Page 9: Chapter 2  Arrays and Structures

Outline The array as an abstract data type Structures and unions The polynomial abstract data type The sparse matrix abstract data type Representation of multidimensional arrays The string abstract data type

Page 10: Chapter 2  Arrays and Structures

Structures (Records)

struct {char name[10]; // a name that is a character arrayint age; // an integer value representing the age of the personfloat salary; // a float value representing the salary of the individual} person;

strcpy(person.name, “james”); person.age=10; person.salary=35000;

An alternate way of grouping data that permits the data to vary in type.

Example:

Page 11: Chapter 2  Arrays and Structures

Create structure data typetypedef struct human_being {

char name[10];int age;float salary;};

or

typedef struct {char name[10];int age;float salary} human_being;

human_being person1, person2;

Page 12: Chapter 2  Arrays and Structures

Example: Embed a structure within a structuretypedef struct {

int month;int day;int year;} date;

typedef struct human_being {char name[10];int age;float salary;

date dob;};

person1.dob.month = 2;person1.dob.day = 11;person1.dob.year = 1944;

Page 13: Chapter 2  Arrays and Structures

UnionsSimilar to struct, but only one field is active.

Example: Add fields for male and female.typedef struct sex_type {

enum tag_field {female, male} sex;union {

int children;boolean beard;} u;

};

typedef struct human_being {char name[10];int age;float salary;date dob;sex_type sex_info;}

human_being person1, person2;person1.sex_info.sex = male;person1.sex_info.u.beard = FALSE;person2.sex_info.sex = female;person2.sex_info.u.children = 4;

Page 14: Chapter 2  Arrays and Structures

Self-Referential Structures

One or more of its components is a pointer to itself.

typedef struct list {char data;list *link;};

list item1, item2, item3;item1.data=‘a’;item2.data=‘b’;item3.data=‘c’;item1.link=item2.link=item3.link=NULL;

a b c

Construct a list with three nodesitem1.link=&item2;item2.link=&item3;malloc: obtain a node

Page 15: Chapter 2  Arrays and Structures

Outline The array as an abstract data type Structures and unions The polynomial abstract data type The sparse matrix abstract data type Representation of multidimensional arrays The string abstract data type

Page 16: Chapter 2  Arrays and Structures

Implementation on other data types

Arrays are not only data structures in their own right, we can also use them to implement other abstract data types.

Some examples of the ordered (linear) list : - Days of the week: (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday). - Values in a deck of cards: (Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King) - Floors of a building: (basement, lobby, mezzanine, first, second) Operations on lists :

- Finding the length, n , of the list.- Reading the items from left to right (or right to left).- Retrieving the ith element from a list, 0 ≦ i < n.- Replacing the ith item of a list, 0 ≦ i < n.- Inserting a new item in the ith position of a list, 0 ≦ i < n.- Deleting an item from the ith position of a list, 0 ≦ i < n.

Page 17: Chapter 2  Arrays and Structures

Abstract data type Polynomial

Structure Polynomial is objects: ; a set of ordered pairs of <ei,ai> where ai in Coefficients and ei in Exponents, ei are integers >= 0functions:for all poly, poly1, poly2 Polynomial, coef Coefficients, expon Exponents Polynomial Zero( ) ::= return the polynomial, p(x) = 0Boolean IsZero(poly) ::= if (poly) return FALSE else return TRUECoefficient Coef(poly, expon) ::= if (expon poly) return its coefficient else return Zero Exponent Lead_Exp(poly) ::= return the largest exponent in polyPolynomial Attach(poly,coef, expon) ::= if (expon poly) return error else return the polynomial poly with the term <coef, expon> inserted Polynomial Remove(poly, expon) ::= if (expon poly) return the polynomial poly with the term whose exponent is expon deleted else return errorPolynomial SingleMult(poly, coef, expon) ::= return the polynomial poly • coef • xexpon

Polynomial Add(poly1, poly2) ::= return the polynomial poly1 +poly2Polynomial Mult(poly1, poly2) ::= return the polynomial poly1 • poly2End Polynomial

nen

e xaxaxp ...)( 11

degree : the largest exponent of a polynomial

Page 18: Chapter 2  Arrays and Structures

Two types of Implementation for the polynomial ADT

Type I :

#define MAX_DEGREE 1001 typedef struct { int degree; float coef[MAX_DEGREE]; } polynomial;

polynomial a; a.degree = n a. coef[i] = an-i , 0 ≦ i ≦ n.

Type II :

MAX_TERMS 1001 /* size of terms array */typedef struct { float coef; int expon; } polynomial;

polynomial terms[MAX_TERMS];int avail = 0; /* recording the available space*/

starta finisha

startb

finishb

avail

coef 2 1 1 10 3 1  

exp 1000 0 4 3 2 0  

0 1 2 3 4 5 6

advantage: easy implementationdisadvantage: waste space when sparse

storage requirements: start, finish, 2*(finish-start+1)nonparse: twice as much as Type I

when all the items are nonzero

PA=2x1000+1

PB=x4+10x3+3x2

+1

Poly A

coef 1 … 0 0 2

exp (index)

0 … 998 999100

0

Poly B

coef 1 0 3 10 1 … 0

exp (index)

0 1 2 3 4 …100

0

Page 19: Chapter 2  Arrays and Structures

Polynomials adding of Type I /* d =a + b, where a, b, and d are polynomials */

d = Zero( )while (! IsZero(a) && ! IsZero(b)) do { switch COMPARE (Lead_Exp(a), Lead_Exp(b)) { case -1: d = Attach(d, Coef (b, Lead_Exp(b)), Lead_Exp(b)); b = Remove(b, Lead_Exp(b)); break; case 0: sum = Coef (a, Lead_Exp (a)) + Coef ( b, Lead_Exp(b)); if (sum) { Attach (d, sum, Lead_Exp(a)); a = Remove(a , Lead_Exp(a)); b = Remove(b , Lead_Exp(b)); } break;

case 1: d = Attach(d, Coef (a, Lead_Exp(a)), Lead_Exp(a)); a = Remove(a, Lead_Exp(a)); } } insert any remaining terms of a or b into d

// case -1: Lead_exp(a) < Lead_exp(b)

// case 0: Lead_exp(a) = Lead_exp(b)

// case 1: Lead_exp(a) < Lead_exp(b)

Page 20: Chapter 2  Arrays and Structures

Polynomials adding of Type II

void padd (int starta, int finisha, int startb, int finishb,int * startd, int *finishd){ /* add A(x) and B(x) to obtain D(x) */ float coefficient; *startd = avail; while (starta <= finisha && startb <= finishb) switch (COMPARE(terms[starta].expon, terms[startb].expon)) { case -1: /* a expon < b expon */ attach(terms[avail], terms[startb].coef, terms[startb].expon); startb++; avail++;

break; case 0: /* equal exponents */ coefficient = terms[starta].coef + terms[startb].coef; if (coefficient) attach (terms[avail], coefficient, terms[starta].expon); starta++; startb++; avail++;

break; case 1: /* a expon > b expon */ attach(terms[avail], terms[starta].coef, terms[starta].expon); starta++; avail++; }/* add in remaining terms of A(x) */for( ; starta <= finisha; starta++) { attach(terms[avail], terms[starta].coef, terms[starta].expon);

avail++; }/* add in remaining terms of B(x) */for( ; startb <= finishb; startb++) { attach(terms[avail], terms[startb].coef, terms[startb].expon);

avail++; } *finishd =avail -1;}

Page 21: Chapter 2  Arrays and Structures

Outline The array as an abstract data type Structures and unions The polynomial abstract data type The sparse matrix abstract data type Representation of multidimensional arrays The string abstract data type

Page 22: Chapter 2  Arrays and Structures

The sparse matrix abstract data type

Matrix aNonzero rate : 15/15

Dense !

col_0 col_1 col_2

row_0 -27 3 4

row_1 6 82 -2

row_2 109 -64 11

row_3 12 8 9

row_4 48 27 47

col_0

col_1

col_2

col_3

col_4

col_5

row_0 15 0 0 22 0 -15

row_1 0 11 3 0 0 0

row_2 0 0 0 -6 0 0

row_3 0 0 0 0 0 0

row_4 91 0 0 0 0 0

row_5 0 0 28 0 0 0

Matrix bNonzero rate : 8/36

Sparse !

Page 23: Chapter 2  Arrays and Structures

Abstract data type Sparse_MatrixStructure Sparse_Matrix is

objects: a set of triples, <row, column, value>, where row and column are integers and form a unique combination, and value comes from the set item. functions: for all a, b Sparse_Matrix, x item, i, j, max_col, max_row index Sparse_Marix Create(max_row, max_col) ::= return a Sparse_matrix that can hold up to max_items = max _row

max_col and whose maximum row size is max_row and whose maximum column size is max_col. Sparse_Matrix Transpose(a) ::=

return the matrix produced by interchanging the row and column value of every triple.

Sparse_Matrix Add(a, b) ::= if the dimensions of a and b are the same

return the matrix produced by adding corresponding items, namely those with identical row and column values.

else return error Sparse_Matrix Multiply(a, b) ::= if number of columns in a equals number of rows in b return the matrix d produced by multiplying a by b according to the formula:

d [i] [j] = (a[i][k]•b[k][j]) where d (i, j) is the (i,j)th element else return error.

Page 24: Chapter 2  Arrays and Structures

Sparse_Matrix Create and transpose

#define MAX_TERMS 101 /* maximum number of terms +1*/ typedef struct { int col; int row; int value; } term;

term a[MAX_TERMS]

for each row i (or column j) take element <i, j, value> and

store it in element <j, i, value> of the transpose.

  row col value

a[0] 6 6 8

[1] 0 0 15

[2] 0 3 22

[3] 0 5 -15

[4] 1 1 11

[5] 1 2 3

[6] 2 3 -6

[7] 4 0 91

[8] 5 2 28

  row col value

b[0]

6 6 8

[1] 0 0 15

[2] 0 4 91

[3] 1 1 11

[4] 2 1 3

[5] 2 5 28

[6] 3 0 22

[7] 3 2 -6

[8] 5 0 -15Sparse matrix and its transpose stored as

triples

#(rows)#(columns)

#(nonzero entries)

any element within a matrix using the triple <row, col, value>

Difficulty: what position to put <j, i, value> ?

Page 25: Chapter 2  Arrays and Structures

Transpose of a sparse matrix in O(columns*elements)void transpose (term a[], term b[])

/* b is set to the transpose of a */{ int n, i, j, currentb; n = a[0].value; /* total number of elements */ b[0].row = a[0].col; /* rows in b = columns in a */ b[0].col = a[0].row; /*columns in b = rows in a */ b[0].value = n; if (n > 0) { /*non zero matrix */ currentb = 1; for (i = 0; i < a[0].col; i++) /* transpose by columns in a */ for( j = 1; j <= n; j++) /* find elements from the current column */ if (a[j].col == i) { /* element is in current column, add it to b */ b[currentb].row = a[j].col; b[currentb].col = a[j].row; b[currentb].value = a[j].value; currentb++ } }}

Scan the array “columns” times.The array has “elements” elements.

==> O(columns*elements)

Page 26: Chapter 2  Arrays and Structures

Analysis and improvement

Discussion: compared with 2-D array representation

O(columns*elements) vs. O(columns*rows)

#(elements) columns * rows, when the matrix is not sparse. O(columns*elements) O(columns*columns*rows)

Problem: Scan the array “columns” times.

Improvement: Determine the number of elements in each column of the original matri

x. => Determine the starting positions of each row in the transpose matrix.

Page 27: Chapter 2  Arrays and Structures

Transpose of a sparse matrix in O(columns + elements)void fast_transpose(term a[ ], term b[ ])

{ /* the transpose of a is placed in b */ int row_terms[MAX_COL], starting_pos[MAX_COL]; int i, j, num_cols = a[0].col, num_terms = a[0].value; b[0].row = num_cols; b[0].col = a[0].row; b[0].value = num_terms; if (num_terms > 0){ /*nonzero matrix*/ for (i = 0; i < num_cols; i++) row_terms[i] = 0; for (i = 1; i <= num_terms; i++) row_terms[a[i].col]++ starting_pos[0] = 1; for (i =1; i < num_cols; i++) starting_pos[i]=starting_pos[i-1] +row_terms [i-1];

for (i=1; i <= num_terms, i++) { j = starting_pos[a[i].col]++; b[j].row = a[i].col; b[j].col = a[i].row; b[j].value = a[i].value; } }}

O(num_cols)

O(num_terms)

O(num_cols-1)

O(num_terms)

O(columns + elements)

Page 28: Chapter 2  Arrays and Structures

Sparse matrix multiplicationvoid mmult (term a[ ], term b[ ], term d[ ] )

/* multiply two sparse matrices */{ int i, j, column, totalb = b[].value, totald = 0; int rows_a = a[0].row, cols_a = a[0].col, totala = a[0].value; int cols_b = b[0].col, int row_begin = 1, row = a[1].row, sum =0; int new_b[MAX_TERMS][3]; if (cols_a != b[0].row){ fprintf (stderr, “Incompatible matrices\n”); exit (1); } fast_transpose(b, new_b); /* set boundary condition */ a[totala+1].row = rows_a; new_b[totalb+1].row = cols_b; new_b[totalb+1].col = 0; for (i = 1; i <= totala; ) { column = new_b[1].row; for (j = 1; j <= totalb+1;) { /* mutiply row of a by column of b */ if (a[i].row != row) { storesum(d, &totald, row, column, &sum); i = row_begin; for (; new_b[j].row == column; j++) ; column =new_b[j].row }

else if (new_b[j].row != column){ storesum(d, &totald, row, column, &sum); i = row_begin; column = new_b[j].row;}else switch (COMPARE (a[i].col, new_b[j].col)) {

case -1: /* go to next term in a */ i++; break; case 0: /* add terms, go to next term in a and b */ sum += (a[i++].value * new_b[j++].value); break; case 1: /* advance to next term in b*/ j++ } } /* end of for j <= totalb+1 */ for (; a[i].row == row; i++) ; row_begin = i; row = a[i].row; } /* end of for i <=totala */ d[0].row = rows_a; d[0].col = cols_b; d[0].value = totald;}

Page 29: Chapter 2  Arrays and Structures

storesum functionvoid storesum(term d[ ], int *totald, int row, int column, int *sum)

{/* if *sum != 0, then it along with its row and column position is stored as the *totald+1 entry in d */ if (*sum) if (*totald < MAX_TERMS) { d[++*totald].row = row; d[*totald].col = column; d[*totald].value = *sum; } else { fprintf(stderr, ”Numbers of terms in product exceed %d\n”, MAX_TERMS); exit(1); }}

Page 30: Chapter 2  Arrays and Structures

Analyzing the algorithm cols_b * termsrow1 + totalb + cols_b * termsrow2 + totalb + … + cols_b * termsrowrows_a + totalb= cols_b * (termsrow1 + termsrow2 + … + termsrowrows_a) + rows_a * totalb= cols_b * totala + row_a * totalb

O(cols_b * totala + rows_a * totalb)

Page 31: Chapter 2  Arrays and Structures

Compared with classic multiplication algorithm

for (i =0; i < rows_a; i++) for (j=0; j < cols_b; j++) { sum =0; for (k=0; k < cols_a; k++) sum += (a[i][k] *b[k][j]); d[i][j] =sum; }

O(rows_a * cols_a * cols_b)

mmult vs. classicO(cols_b * totala + rows_a * totalb) vs. O(rows_a * cols_a * cols_b)

Page 32: Chapter 2  Arrays and Structures

Matrix-chain multiplication n matrices A1, A2, …, An with size p0 p1, p1 p2, p2 p3, …, pn-1 pn

To determine the multiplication order such that # of scalar multiplications is minimized.

To compute Ai Ai+1, we need pi-1pipi+1 scalar multiplications.

e.g. n=4, A1: 3 5, A2: 5 4, A3: 4 2, A4: 2 5((A1 A2) A3) A4, # of scalar multiplications:

3 * 5 * 4 + 3 * 4 * 2 + 3 * 2 * 5 = 114(A1 (A2 A3)) A4, # of scalar multiplications:

3 * 5 * 2 + 5 * 4 * 2 + 3 * 2 * 5 = 100(A1 A2) (A3 A4), # of scalar multiplications:

3 * 5 * 4 + 3 * 4 * 5 + 4 * 2 * 5 = 160

Page 33: Chapter 2  Arrays and Structures

Let m(i, j) denote the minimum cost for computing Ai Ai+1 … Aj

Computation sequence :

Time complexity : O(n3)

jiif

jiifpppj)1,m(kk)m(i,min

0j)m(i,

ik1ijki

Matrix-chain multiplication (cont.)

Page 34: Chapter 2  Arrays and Structures

Outline The array as an abstract data type Structures and unions The polynomial abstract data type The sparse matrix abstract data type Representation of multidimensional arrays The string abstract data type

Page 35: Chapter 2  Arrays and Structures

Representation of multidimensional arrays

The internal representation of multidimensional arrays requires more complex addressing formulas.

a[upper0] [upper1]…[uppern-1] => #(elements of a) = There are two ways to represent multidimensional arrays: row major order and column major order. (row major order stores multidimensional arrays by rows) Ex. A[upper0][upper1] : upper0 rows (row0, row1,… ,rowupper0-1), each row containing upper1 elements Assume that α is the address of A[0][0]

the address of A[i][0] : α+ i *upper1 A[i][j] : α+ i *upper1+j

1

0

n

iiupper

row major :

1 2 3 4

5 6 7 8

9 10 11 12

column major :

1 4 7 10

2 5 8 11

3 6 9 12

Page 36: Chapter 2  Arrays and Structures

Representation of Multidimensional Arrays (cont.)

n dimensional array addressing formula for A[i0][i1]…[in-1] :

Assume that the address of A[0][0]…[0] is α. the address of A[i0][0][0]…[0] : α + i0*upper1*upper2*…*uppern-1

A[i0][i1][0]…[0] : α + i0*upper1*upper2*…*uppern-1

+ i1*upper2*upper3*…*uppern-1

A[i0][i1][i2]…[in-1] : α + i0*upper1*upper2*…*uppern-1

+ i1*upper2*upper3*…*uppern-1

+ i2*upper3*upper4*…*uppern-1

+ in-2*uppern-1

+ in-1

1

01

1

1

1

10,:,

n

in

n

jkkj

jj

a

njupperawhereai

Page 37: Chapter 2  Arrays and Structures

Outline The array as an abstract data type Structures and unions The polynomial abstract data type The sparse matrix abstract data type Representation of multidimensional arrays The string abstract data type

Page 38: Chapter 2  Arrays and Structures

The string abstract data type

Structure String is objects : a finite set of zero or more characters. functions : for all s, t String, i, j, m non-negative integers String Null(m) ::= return a string whose maximum length is m characters, but initially set to NULL. We write NULL as “”.

Integer Compare(s, t) ::= if s equals t return 0 else if s precedes t return -1 else return +1 Boolean IsNull(s) ::= if (Compare(s, NULL)) return FALSE else return TRUE Integer Length(s) ::= if (Compare(s, NULL)) return the number of characters in s else return 0 String Concat(s, t) ::= if (Compare(t, NULL)) return a string whose elements are those of s followed by those of t else return s String Subset(s, i, j) ::= if ((j>0) && (i+j-1)<Length(s)) return the string containing the characters of s at position i, i+1, …, i+j-1. else return NULL.

Page 39: Chapter 2  Arrays and Structures

C string functions

Page 40: Chapter 2  Arrays and Structures

String insertion function

void strnins(char *s, char *t, int i){/* insert string t into string s at position i */ char string[MAX_SIZE], *temp = string;

if (i < 0 && i > strlen(s)) { fprintf(stderr, ”Position is out of bounds \n” ); exist(1); } if (!strlen(s)) strcpy(s, t); else if (strlen(t)) { strncpy(temp, s, i); strcat(temp, t); strcat(temp, (s+i)); strcpy(s, temp); }}

s → a m o b i l e\

0

t → u t o \0

temp → \0

initially

temp → a \0

(a) after strncpy (temp, s, i)

temp → a u t o \0

(b) after strcat (temp, t)

temp → a u t o m o b i l e \0

(c) after strcat (temp, (s+i))

Example:

Never be used in practice, since it’s wasteful in use of time and space !

Page 41: Chapter 2  Arrays and Structures

Pattern matching algorithm (checking end indices first)

int nfind(char *string, char *pat){/* match the lat character of pattern first, and then match from the beginning */ int i, j, start = 0; int lasts = strlen(string) – 1; int lastp = strlen(pat) – 1; int endmatch = lastp;

for (i = 0; endmatch <= lasts; endmatch++, start++){ if (string[endmatch] == pat[lastp]) for (j = 0, i = start; j < lastp && string[i] == pat[j]; i++, j++) ; if (j == lastp) return start; /* successful */ } return -1;}

Page 42: Chapter 2  Arrays and Structures

Simulation of nfind

a a b  

↑ ↑

j lastp

(a) pattern

a b a b b a a b a a  

↑ ↑ ↑

start endmatch lasts

(b) no match

a b a b b a a b a a  

↑ ↑ ↑

start endmatch lasts

(c) no match

a b a b b a a b a a  

↑ ↑ ↑

start endmatch lasts

(d) no match

a b a b b a a b a a  

↑ ↑ ↑

start endmatch lasts

(e) no match

a b a b b a a b a a  

↑ ↑ ↑

start endmatch lasts

(f) no match

a b a b b a a b a a  

↑ ↑ ↑

start

endmatch lasts

(g) match

Page 43: Chapter 2  Arrays and Structures

Failure function used in KMP algorithm

Definition: If P = p0p1…pn-1 is a pattern, then its failure function, f, is defined as:

Example: pat = ‘abcabcacab’

A fast way to compute the failure function:

otherwise

jf,1

)(Largest i < j such that p0p1…pi = pj-ipj-i-1…pj if such an i ≧ 0 exists

1

1)1(

1

)( jfjf m

if j = 0

Where m is the least integer k for which

if there is no k satisfying the above

jjfPp k

1)1(

j 0 1 2 3 4 5 6 7 8 9

pat

a b c a b c a c a b

f-1

-1

-1

0 1 2 3-1

0 1

Page 44: Chapter 2  Arrays and Structures

KMP algorithm (1/5)

Case 1: the first symbol of P dose not appear again in P

Case 2: the first symbol of P appear again in P

Page 45: Chapter 2  Arrays and Structures

KMP algorithm (2/5)

Case 3: not only the first symbol of P appears in P, prefixes of P appear in P twice

Page 46: Chapter 2  Arrays and Structures

KMP algorithm (3/5)

The KMP Algorithm

The Prefix function

Page 47: Chapter 2  Arrays and Structures

KMP algorithm (4/5)

Page 48: Chapter 2  Arrays and Structures

KMP algorithm (5/5)#include <stdio.h>#include <string.h>#define max_string_size 100#define max_pattern_size 100int pmatch();void fail();int failure[max_pattern_size];char string[max_string_size];char pat[max_pattern_size];

int pmatch (char *string, char *pat){/* Knuth, Morris, Pratt string matching algorithm */ int i = 0, j =0; int lens = strlen(string); int lenp = strlen(pat); while (i < lens && j < lenp) { if (string[i] == pat[j]) { i++; j++; } else if (j == 0) i++; else j = failure[j-1] + 1; } return ( (j == lenp) ? (i-lenp) : -1);}

void fail (char *pat){/* compute the pattern’s failure function */ int n = strlen(pat); failure[0] = -1; for (j = 1; j < n; j++) { i = failure[j-1]; while ((pat[j] != pat[i+1]) && (I >= 0)) i = failure[i]; if (pat[j] == pat[i+1]) failure[j] = i + 1; else failure[j] = -1; }}

O(strlen(stirng))

O(strlen(pattern))

O(strlen(stirng)+strlen(pat))

Page 49: Chapter 2  Arrays and Structures

An Example for KMP algorithm (1/6)

P1≠T1, mismatchSince j=1, i = i + 1

Page 50: Chapter 2  Arrays and Structures

An Example for KMP algorithm (2/6)

T5≠P4, mismatchPf(4-1)+1=P0+1=P1

Page 51: Chapter 2  Arrays and Structures

An Example for KMP algorithm (3/6)

P1≠T5, mismatchSince j=1, i = i + 1

Page 52: Chapter 2  Arrays and Structures

An Example for KMP algorithm (4/6)

Match exactly

Pf(12)+1=P4+1=P5

Page 53: Chapter 2  Arrays and Structures

An Example for KMP algorithm (5/6)

T19≠P6, mismatch

Pf(6-1)+1=P0+1=P1

Page 54: Chapter 2  Arrays and Structures

An Example for KMP algorithm (6/6)

.

.

.