arrays and structures

62
Arrays and Structures

Upload: bradford-jerry

Post on 30-Dec-2015

39 views

Category:

Documents


0 download

DESCRIPTION

Arrays and Structures. Data type. A data type is a collection of objects and a set of operations that act on those objects. Abstract data type. An abstract data type (ADT) is a data type organized in such a way that the specification of the objects and - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Arrays and Structures

Arrays and Structures

Page 2: Arrays and Structures

Arrays and Structures 2

Data type

A data type is a collection of objects and a set of operations that act on those objects.

Page 3: Arrays and Structures

Arrays and Structures 3

Abstract data type

An abstract data type (ADT) is a data type organized in such a way that

the specification of the objects and

the specification of the operations on the objects

is separated from the representation of the objects and

the implementation of the operations.

Page 4: Arrays and Structures

Arrays and Structures 4

Why ADT?

An ADT definition of the object can help us to fast grasp the essential elements of the object.

Page 5: Arrays and Structures

The array as an ADT

Page 6: Arrays and Structures

Arrays and Structures 6

Abstract data type of the array

An array is a set of pairs, <index, value>, such that each index that is defined has a value associated with it.• An array is usually

implemented as a consecutive set of memory locations.

• Advantageous to random access

functions

Array Create(j,list);

/* return an array of j dimensions where list is a j-turple whose jth element is the size of the ith dimension */

Item Retrieve(index i);

Array Store(index i,float x);

Page 7: Arrays and Structures

Arrays and Structures 7

C array

Example: #define MAX_SIZE 100

float array[MAX_SIZE];• index set: 0... MAX_SIZE-1

C does not check an array index to ensure that it is in the range for which the array is defined.

Page 8: Arrays and Structures

Arrays and Structures 8

Dynamically allocated arrays in C Dynamically allocate an one-dimensional integer array of size n

int *p = malloc(sizeof(int)*n);

p=realloc(p,s) • resizes memory allocated by malloc or calloc• changes the size of the memory block pointed at by p to s• keeps the contents of the first min{s, oldSize} bytes of the block u

nchanged.

Dynamically allocate a two-dimensional integer array of size mxnint k,**x;x = malloc(sizeof(int*)*m);for(k = 0; k < m; ++k) x[k] = malloc(sizeof(int)*n);

Page 9: Arrays and Structures

Arrays and Structures 9

Structures

A structure is a collection of data items, where each item has its own type and name.

Page 10: Arrays and Structures

Arrays and Structures 10

Define a structure Create our own structure data type.

typedef struct human_being {char name[10];int age;float salary;

};ortypedef struct {

char name[10];int age;float salary;

} human_being; Declare varaibles person1 and person2 of type huma

n_being:human_being person1, person2;

Page 11: Arrays and Structures

Arrays and Structures 11

Define a self-referential structure

typedef struct list {

int data;

list *link;

};

Page 12: Arrays and Structures

Arrays and Structures 12

Ordered (linear) listThe ordered list can be written in the form (a0, a1, a2, ..., an-1).

• Empty list: ()Operations

(1) Find the length, n, of the list.(2) Read the list from left to right (or right to left).(3) Retrieve the ith element.(4) Store a new value into the ith position.(5) Insert a new element at the position i, causing elements numbered i, i+1, ..., n-1 to become numbered i+1, i+2, ..., n.(6) Delete the element at position i, causing elements numbered i+1, ..., n-1 to become numbered i, i+1,..., n-2.

The ordered list can be represented by an array.Only operations (5) and (6) require real effort.(5) void insert(int i, int e) { int j; for(j=n; j >i; j--) a[j]=a[j-1]; a[i]=e; }(6) void delete(int i) { int j; for(j=i; j < n-1; j++) a[j]=a[j+1]; }

Page 13: Arrays and Structures

Polynomials

We try to use arrays to represent polynomials

Page 14: Arrays and Structures

Arrays and Structures 14

Polynomial Examples

• A(x)=3x2+2x+4

• B(x)=x4+10x3+3x2+1

• The largest exponent of a polynomial is called its degree. The coefficients that are zero are not displayed.

The addition and multiplication of two polynomials A(x) and B(x) are defined as

j

ji

i

iii

xbxaxBxA

xbaxBxA

)()(

)()()(

Page 15: Arrays and Structures

Arrays and Structures 15

ADT Polynomialstructure Polynomial

//objects: p(x)=a0xe0+a1xe1+... +anxen; a set of ordered pairs of// <ei, ai> where ai Coefficient and eiExponentfunctions:

Polynomial Zero()Boolean IsZero(poly) Coefficeint Coef(poly, expon);Exponent Lead_Exp(poly);Polynomial Attach(poly,coef,expon)Polynomial Remove(poly,expon)Polynommial SingleMult(poly,coef,expon)Polynomial Add(Polynomial poly);Polynomial Mult(Polynomial Poly);

// Evaluate the polynomial *this at f and return the result.

Page 16: Arrays and Structures

Arrays and Structures 16

Polynomial representation Representation 1

typedef struct {int degree; /* degree < MaxDegree */float coef[MaxDegree];

} polynomial;

polynomial a;a.degree = n;

a.coef[i] = an-i;

Disadvantage: Representation 1 is wasteful in memory.

MaxDegree-1...10

a0an-1an

coef:

Page 17: Arrays and Structures

Arrays and Structures 17

Program 2.5: Initial version of padd/* d = a + b */d = Zero();while (!IsZero(a) && !IsZero(b)) {

switch(COMARE(Lead_Exp(a),Lead_Exp(b)) {case -1: d = Attach(d,Coef(b,Lead_Exp(b)),Lead_Exp(b)); // Lead_Exp(a) < Lead_Exp(b)

b = Remove(b,Lead_Exp(b)); break;

case 0: sum = Coef(a,Lead_Exp(a))+Coef(b,Lead_Exp(a)); // Lead_Exp(a) == Lead_Exp(b)if (sum) {

Attach(d,sum,Lead_Exp(a);}a = Remove(a,Lead_Exp(a));b = Remove(b,Lead_Exp(b));break;

case 1: d = Attach(d,Coef(a,Lead_Exp(a)),Lead_Exp(a)); // Lead_Exp(a) > Lead_Exp(b) a = Remove(a,Lead_Exp(a)); break;

}}

Page 18: Arrays and Structures

A(x)=3x2+2x+4B(x)=x4+10x3+3x2+1C(x)=A(x)+B(x) 210

423

coef:A(x)

1

4210

3101

coef:

0

3B(x)

4210

coef:

3C(x) 1 10 6 2 5

Index=0Cd=2-Index=2

Index=0Cd=4-Index=4

Index=1Cd=4-Index=3

Index=2Cd=4-Index=2

Index=1Cd=4-Index=1

Index=4Cd=4-Index=0

Index=1Cd=2-Index=1

Index=2Cd=2-Index=0

Index=5Cd=4-Index=-1

Index=5Cd=2-Index=-1

Page 19: Arrays and Structures

Arrays and Structures 19

Polynomial representation (cont’d)

Representation 2typedef struct {

int degree; float *coef;

} polynomial;

polynomial a;a.degree = n;a.coef = (float*)malloc(sizeof(float)*(a.degree+1));

Disadvantage: Representation 2 is wasteful in computer memory if there are many ze

ro coefficients in the polynomial. Such polynomials are called sparse. For instance, A(x)=x10000+1.

Page 20: Arrays and Structures

Arrays and Structures 20

Polynomial representation (cont’d)

Representation 3typedef struct {

float coef;

int exp;

} polynomial;

polynomial terms[MaxTerms];

int avail = 0;

int Start, Finish; starta finisha startb finishb avail

coef 100 1 1 1 2 1 5

exp 4 2 0 1000 100 2 0

0 1 2 3 4 5 6 7

A(x)=100x4+x2+1B(x)=x1000+2x100+x2+5

• Disadvantage:When all coefficients are all nonzero, Representation 3 uses twice as mush space as Representation 2.

Unless we know in advance that each of our polynomials has very few zero terms, Representation 3 is preferable.

Page 21: Arrays and Structures

Arrays and Structures 21

Program 2.6: Revised version of paddvoid padd(int starta,int finisha,int startb,int finishb,int *startd,int *finishd){

float c;*startd = avail;while ((starta<=finisha)&&(startb<=finishb))

switch(COMPARE(terms[starta].expon,term[startb].expon)) {case -1: attach(terms[startb].coef,terms[b].expon);

startb++;break;

case 0: c=terms[starta].coef+terms[startb].coef;if (c) attach(c,terms[starta].expon);starta++; startb++;break;

case 1: attach(terms[starta].coef,terms[starta].expon);starta++;break;

}for(;starta<=finisha;starta++) attach(terms[starta].coef,terms[starta].expon);for(;startb<=finishb;startb++) attach(terms[startb].coef,terms[startb].expon);*finishd = avail-1;

}

void attach(float c,int e){ if (avail>=MAX_TERMS) { printf(“Too many terms\n”); exit(1); } terms[avail].coef = c; terms[avail].expon = e; avail++;}

void attach(float c,int e){ if (avail>=MAX_TERMS) { printf(“Too many terms\n”); exit(1); } terms[avail].coef = c; terms[avail].expon = e; avail++;}

Page 22: Arrays and Structures

Arrays and Structures 22

Disadvantages of Representation 3 Disadvantage

• When some polynomials are no longer needed, some functions would be invoked to make the continuous free space at one end.

Alternative approach• Each polynomial has its own array of terms. This array can

be created dynamically.

• Disadvantage: This approach requires us to run the addition algorithm twice: • one to determine the number of items in the resulting

polynomial and

• the other to actually perform the addition.

• Can you propose a better solution?

Page 23: Arrays and Structures

Sparse matrices

Page 24: Arrays and Structures

Arrays and Structures 24

Introduction

A matrix that has many zero entries is called a sparse matrix.

Page 25: Arrays and Structures

Arrays and Structures 25

ADT SparseMatrixstructure Sparse_Matrix {

// objects: A set of triples,<row,column,value>, where row and column are integers

// and form a unique combination; value is also an integer;

functions:

Sparse_Matrix Create(max_row, max_col);

Sparse_Matrix Transpose(a);

Sparse_Matrix Add(a, b);

Sparse_Matrix Multiply(a, b);

};

Page 26: Arrays and Structures

Arrays and Structures 26

Sparse matrix representationtypedef struct {

int row, col, value;

} term;

term a[MAX_TERMS];

The triples are ordered by row and within rows by columns.

2825a[7]

9104a[6]

-632a[5]

321a[4]

1111a[3]

-1550a[2]

2230a[1]

766a[0]

valuecolrow

Number of rows

Number of columns

Number of nonzero terms

Page 27: Arrays and Structures

transpose

row col value

a[0] 6 6 7

a[1] 0 3 22

a[2] 0 5 -15

a[3] 1 1 11

a[4] 1 2 3

a[5] 2 3 -6

a[6] 4 0 91

a[7] 5 2 28

row col value

a[0] 6 6 7

a[1] 0 4 91

a[2] 1 1 11

a[3] 2 1 3

a[4] 2 5 28

a[5] 3 0 22

a[6] 3 2 -6

a[7] 5 0 -15

0002800

0000091

000000

006000

0003110

15022000

0000015

000000

0006022

2800030

0000110

0910000

transpose

The transposed matrix should hold the ordering rule of the entry.

Page 28: Arrays and Structures

Arrays and Structures 28

Program 2.8: Transpose of a sparse matrix

void transpose(term a[], term b[]){

int n, I, j, currentb;n = a[0].value;b[0].row = a[0].col;b[0].col = a[0].row;b[0].value= n;if (n>0) {

currentb = 1;for(i = 0; i < b[0].row; i++) for(j = 1; j <= n; j++)

if (a[j].col == i) {b[currentb].row = a[j].col;b[currentb].col = a[i].row;b[currentb].value = a[i].value;currentb++;

}}

}

a[0].col

Page 29: Arrays and Structures

Arrays and Structures 29

Analysis of Transpose

Time complexity: O(elementscolumns)

for(i = 0; i < a[0].col; i++) for(j = 1; j <= n; j++)

if (a[j].col == i) {b[currentb].row = a[j].col;b[currentb].col = a[i].row;b[currentb].value = a[i].value;currentb++;

}

Page 30: Arrays and Structures

Arrays and Structures 30

Program 2.9: fast_transposevoid fast_transpose(term a[], term b[]){ int row_terms[MAX_COL]; int starting_pos[MAX_COL]; int i,j, num_cols = a[0].col, num_terms = a[0].value; b[0].row = num_cols; b[0].col = a[0].row; b[0].value = num_terms; if (num_terms > 0) {

for(i = 0; i < num_cols;i++) row_terms[i]=0;for(i = 1; i<=num_terms; i++) row_terms[a[i].col]++;starting_pos[0] = 1;for(i=1;i<num_cols;i++) starting_pos[i] = starting_pos[i-1]+row_terms[i-1];

for(i=1; i<=num_terms;i++) {j=starting_pos[a[i].col];b[j].col = a[i].row;b[j].row = a[i].col;b[j].value = a[i].value;start_pos[a[i].col]++;

} }} Time Complexity: O(elements+columns)Time Complexity: O(elements+columns)

Page 31: Arrays and Structures

0 1 2 3 4 5

row_terms 1 1 2 2 0 1

starting_pos 1 2 3 5 7 7

row col value

a[0] 6 6 7

a[1] 0 3 22

a[2] 0 5 -15

a[3] 1 1 11

a[4] 1 2 3

a[5] 2 3 -6

a[6] 4 0 91

a[7] 5 2 28

row col value

a[0] 6 6 7

a[1] 0 4 91

a[2] 1 1 11

a[3] 2 1 3

a[4] 2 5 28

a[5] 3 0 22

a[6] 3 2 -6

a[7] 5 0 -15transpose

Page 32: Arrays and Structures

775321starting_pos

102211row_terms

543210

2825a[7]

9104a[6]

-632a[5]

321a[4]

1111a[3]

-1550a[2]

2230a[1]

766a[0]

valuecolrow

b[7]

b[6]

b[5]

b[4]

b[3]

b[2]

b[1]

766b[0]

valuecolrow

776321starting_pos

102211row_terms

543210

2825a[7]

9104a[6]

-632a[5]

321a[4]

1111a[3]

-1550a[2]

2230a[1]

766a[0]

valuecolrow

b[7]

b[6]

2203b[5]

b[4]

b[3]

b[2]

b[1]

766b[0]

valuecolrow

876321starting_pos

102211row_terms

543210

2825a[7]

9104a[6]

-632a[5]

321a[4]

1111a[3]

-1550a[2]

2230a[1]

766a[0]

valuecolrow

-1505b[7]

b[6]

2203b[5]

b[4]

b[3]

b[2]

b[1]

766b[0]

valuecolrow

876331starting_pos

102211row_terms

543210

2825a[7]

9104a[6]

-632a[5]

321a[4]

1111a[3]

-1550a[2]

2230a[1]

766a[0]

valuecolrow

-1505b[7]

b[6]

2203b[5]

b[4]

b[3]

11 1 1b[2]

b[1]

766b[0]

valuecolrow

876431starting_pos

102211row_terms

543210

2825a[7]

9104a[6]

-632a[5]

321a[4]

1111a[3]

-1550a[2]

2230a[1]

766a[0]

valuecolrow

-1505b[7]

b[6]

2203b[5]

b[4]

3 1 2b[3]

11 1 1b[2]

b[1]

766b[0]

valuecolrow

877431starting_pos

102211row_terms

543210

2825a[7]

9104a[6]

-632a[5]

321a[4]

1111a[3]

-1550a[2]

2230a[1]

766a[0]

valuecolrow

-1505b[7]

-623b[6]

2203b[5]

b[4]

3 1 2b[3]

11 1 1b[2]

b[1]

766b[0]

valuecolrow

877432starting_pos

102211row_terms

543210

2825a[7]

9104a[6]

-632a[5]

321a[4]

1111a[3]

-1550a[2]

2230a[1]

766a[0]

valuecolrow

-1505b[7]

-623b[6]

2203b[5]

b[4]

3 1 2b[3]

11 1 1b[2]

9140b[1]

766b[0]

valuecolrow

877532starting_pos

102211row_terms

543210

2825a[7]

9104a[6]

-632a[5]

321a[4]

1111a[3]

-1550a[2]

2230a[1]

766a[0]

valuecolrow

-1505b[7]

-623b[6]

2203b[5]

2852b[4]

3 1 2b[3]

11 1 1b[2]

9140b[1]

766b[0]

valuecolrow

Page 33: Arrays and Structures

Arrays and Structures 33

Multiplication of sparse matrices

The product of two matrices A and B, A is an mxn matrix and B is an nxp matrix, is an mxp matrix whose ijth element is defined as

1

0

n

kkjikij baresult

003800

000001

000050

006000

000360

502050

0002800

0000091

000000

006000

0003110

15022000

Page 34: Arrays and Structures

Arrays and Structures 34

Multiplication of sparse matrices (cont’d)

The product of two sparse matrices may no longer be sparse.

111

111

111

000

000

111

001

001

001

Page 35: Arrays and Structures

Arrays and Structures 35

Program 2.10: sparse matrix multiplication

void mmult(term a[], term b[], term d[]){

int i,j, column, total_b = b[0].value;int totald = 0,row_a = a[0].row;int cols_a = a[0].col, total_a = a[0].value;

int cols_b = b[0].col,row_begin = 1;int row = a[1].row,sum = 0;int new_b[MAX_TERMS][3];if (cols_a != b[0].row) {

fprintf(stderr,”Imcompatible matrices\n”);exit(1);

}fast_transpose(b,new_b);/* set boundary condition */a[total_a+1].row = rows_a;new_b[total_b+1].row = cols_b;new_b[total_b+1].col = 0;

void mmult(term a[], term b[], term d[]){

int i,j, column, total_b = b[0].value;int totald = 0,row_a = a[0].row;int cols_a = a[0].col, total_a = a[0].value;

int cols_b = b[0].col,row_begin = 1;int row = a[1].row,sum = 0;int new_b[MAX_TERMS][3];if (cols_a != b[0].row) {

fprintf(stderr,”Imcompatible matrices\n”);exit(1);

}fast_transpose(b,new_b);/* set boundary condition */a[total_a+1].row = rows_a;new_b[total_b+1].row = cols_b;new_b[total_b+1].col = 0;

for(i=1; i < total_a;) {column = new_b[1].row;for(j = 1; j <= total_b+1;) {

if(a[i].row != row) {storesum(d,&totald,row,column,&sum);i = row_begin;for(;new_b[j].row == column; j++);column = new_b[j].row;

} else if (new_b[j].row != column) {storesum(d, &totald, row, column, &sum);i = row_begin;column = new_b[j].row;

} else switch(COMPARE(a[i].col, new_b[j].col)) {case -1: i++; break;case 0: sum += (a[i++].value * new_b[j++].value); break;case 1: j++; break;

}}for(;a[i].row == row; i++);row_begin = i; row = a[i].row;

}d[0].row = rows_a; d[0].col = cols_b; d[0].value = totald;

for(i=1; i < total_a;) {column = new_b[1].row;for(j = 1; j <= total_b+1;) {

if(a[i].row != row) {storesum(d,&totald,row,column,&sum);i = row_begin;for(;new_b[j].row == column; j++);column = new_b[j].row;

} else if (new_b[j].row != column) {storesum(d, &totald, row, column, &sum);i = row_begin;column = new_b[j].row;

} else switch(COMPARE(a[i].col, new_b[j].col)) {case -1: i++; break;case 0: sum += (a[i++].value * new_b[j++].value); break;case 1: j++; break;

}}for(;a[i].row == row; i++);row_begin = i; row = a[i].row;

}d[0].row = rows_a; d[0].col = cols_b; d[0].value = totald;

Page 36: Arrays and Structures

Arrays and Structures 36

Program 2.11: Store a matrix termvoid store_sum(term d[],int *totald,int row,int column,int *sum){

if (*sum) {if (*totald < MAX_TERMS) {

*totald++;d[*totald].row = row;d[*totald].col = column;d[*totald].value = *sum;*sum=0;

} else {fprintf(stderr,” Number of terms in product exceeds %d\n”,

MaxTerms);}

}}

Page 37: Arrays and Structures

003800

000001

000050

006000

000360

502050

0002800

0000091

000000

006000

0003110

15022000

000805

000000

300602

800030

005065

010000

.

0002800

0000091

000000

006000

0003110

15022000

transpose

Page 38: Arrays and Structures

row col value

a[0] 6 6 7

a[1] 0 3 22

a[2] 0 5 -15

a[3] 1 1 11

a[4] 1 2 3

a[5] 2 3 -6

a[6] 4 0 91

a[7] 5 2 28

row col value

nb[0] 6 6 7

nb[1] 0 4 91

nb[2] 1 1 11

nb[3] 2 1 3

nb[4] 2 5 28

nb[5] 3 0 22

nb[6] 3 2 -6

nb[7] 5 0 -15

transpose

A*A=

row col value

d[0] 6 6 7

d[1] 0 2 -15*28

d[2] 1 1 11*11

d[3] 1 2 11*3

d[4] 1 3 3*-6

d[5] 4 3 91*22

d[6] 4 5 91*-15

d[7] 5 3 28*-6

*

AA

Page 39: Arrays and Structures

Arrays and Structures 39

AnalysisTime complexity:

O(max{cols_b+total_b,cols_btotal_a + rows_atotal_b})=O(cols_btotal_a + rows_atotal_b) cols_b*total_a + rows_a*total_b

cols_b*rows_a*cols_a + rows_a*rows_b*cols_b=cols_b*rows_a*cols_a + rows_a*cols_a*cols_b=O(cols_b*rows_a*cols_a)

for(int i = 0; i <a_row; i++) {for(int j = 0; j < b_col; j++) {

sum = 0;for(int k = 0; k < a_col; k++) {

sum+= a[i][k]*b[k][j];}c[i][j] = sum;

}} // Time complexity O(a_row*a_col*b_col)

for(int i = 0; i <a_row; i++) {for(int j = 0; j < b_col; j++) {

sum = 0;for(int k = 0; k < a_col; k++) {

sum+= a[i][k]*b[k][j];}c[i][j] = sum;

}} // Time complexity O(a_row*a_col*b_col)

Classical multiplication algorithmClassical multiplication algorithm

fast_transpose(b,new_b)

Page 40: Arrays and Structures

Arrays and Structures 40

Disadvantages of representing sparse matrices by arrays

Disadvantage• Although it is an efficient way to represent all our

sparse matrices in one array, some functions would be invoked to make the continuous free space at one end when some matrices are no longer needed.

Alternative approach• Each sparse matrix has its own array. This array can

be created dynamically.

• Disadvantage: similar to those with the polynomial representation.

Page 41: Arrays and Structures

Representation of arrays

Page 42: Arrays and Structures

Arrays and Structures 42

Representation of arrays

Multidimensional arrays are usually implemented by storing the elements in a one-dimensional array.

If an array is declared A[p1...q1][p2...q2]...[pn...qn] where pi...qi is the range of index values in dimension i, then the number of elements is

A[4..5][2..4][1..2][3..4] has 2*3*2*2=24 elements.

n

iii pq

1

)1(

Page 43: Arrays and Structures

Arrays and Structures 43

Row major order Arrays are often represented in row major

order.• In row major order, the elements of A[4..5][2..4][1..2]

[3..4] will be stored asA[4][2][1][3],A[4][2][1][4],A[4][2][2][3],A[4][2][2][4],...,A[5][4][2][4].

See Sec. 2.6 Exercise 3 for column major order.

Page 44: Arrays and Structures

Arrays and Structures 44

col0 col 1 col 2 ... col u2-1

row 0

row 1

... A[i][j]

row u1-1

Sequential representation of array declared as A[u1][u2]

Location of A[i][j] = +i*u2+j

u2 elements

Location of A[0][0]=

For example, char A[u1][u2];

Page 45: Arrays and Structures

Arrays and Structures 45

General formula

The addressing formula for any element A[i1][i2]...[in] in an n-dimensional array declared as A[u1][u2]...[un] is

.1

1 where

]]...[][[ of address the

1

121

njua

nja

aiiiiA

n

jkkj

n

n

jjjn

Page 46: Arrays and Structures

The string ADT

Page 47: Arrays and Structures

Arrays and Structures 47

ADT Stringstructure String {

// objects: A finite ordered set of zero or more characters.functions: String Null(m); Integer Compare(s,t); Boolean IsNull(s); Integer Length(s); String Concat(s,t); String Substr(s, i, j);

};

Page 48: Arrays and Structures

Arrays and Structures 48

String pattern matching

Problem. Determine whether string pat is in string s or not.

Page 49: Arrays and Structures

Arrays and Structures 49

A simple algorithm

...gfecba

gba

s

pat

...gfecba

gba

s

pat

...gfecba

gba

s

pat

Page 50: Arrays and Structures

Arrays and Structures 50

A simple algorithm (cont’d)int simple_find(char*string,char* pat){

int i = 0,len_s=strlen(string),len_p=strlen(pat);char*p = pat;char*s = string;if (*p&&*s) {

while(i+len_p<=len_s) {if (*p==*s) {

p++; s++;if (!*p) return i;

} else {i++; s=string+i; p = pat;

}}

}return –1;

} Time complexity: O(Length(string) Length(pat))Time complexity: O(Length(string) Length(pat))

Page 51: Arrays and Structures

Arrays and Structures 51

Program 2.13: Pattern matching by checking end indices firstint nfind(char *string,char *pat){

int I,j,start=0;int lasts = strlen(string)-1;int lastp = strlen(pat)-1;int endmatch = lastp;for(i=0; endmatch <=lasts; endmatch++,start++) {

if(string[endmatch]==pat[lastp]) {for(j=0,i=start; j < lastp && string[i]==pat[j];i++,j++);if(j==lastp) return start;

}}return -1;

}

Time complexity: O(strlen(string) strlen(pat))Time complexity: O(strlen(string) strlen(pat))

Page 52: Arrays and Structures

Arrays and Structures 52

The Knuth-Morris-Pratt algorithm

si si+1 si+2

s - a b ? ? ? . . . . ?

pat a b c a b c a c a b

si si+1 si+2 si+3 si+4

s - a b c a c . . . . ?

pat a b c a b c a c a b

It is unnecessary to move backward in s!!

a b c a b …

a b c a b …

The simple algorithm only advances one character

It is possible to advance three characters if pat is analyzed in advance.

Page 53: Arrays and Structures

Arrays and Structures 53

The Knuth-Morris-Pratt algorithm (cont’d)

Definition. If p=p0p1p2...pn-1 is a pattern, then its failure function f is defined as

otherwise1

exists 0 asuch if......

such that largest )(

110

kpppppp

jkjf

jkjkjk

j 0 1 2 3 4 5 6 7 8 9

pat a b c a b c a c a b

f -1 -1 -1 0 1 2 3 -1 0 1

Page 54: Arrays and Structures

Arrays and Structures 54

The Knuth-Morris-Pratt algorithm (cont’d)

111 ... iikikiki sssss

11 ... jjkjkj pppp

110 ... kk pppp=

==

1)()(10 ... jfjf pppp

….

?

Page 55: Arrays and Structures

Arrays and Structures 55

The Knuth-Morris-Pratt algorithm (cont’d)

If a partial match is found such that si-j...si-1=p0p1...pj-1 and sipj then matching may be resumed by comparing si and pf(j-1)+1 if j 0. If j=0, then we may continue be comparing si+1 and p0.

j 0 1 2 3 4 5 6 7 8 9

pat a b c a b c a c a b

f -1 -1 -1 0 1 2 3 -1 0 1

... a b c a b c a ds

j 0 1 2 3 4 5 6 7 8 9

pat a b c a b c a c a b

f -1 -1 -1 0 1 2 3 -1 0 1

Page 56: Arrays and Structures

Arrays and Structures 56

Program 2.14:The Knuth-Morris-Pratt algorithm

int pmatch(char*string,char*pat){

int PosP = 0,PosS = 0;int LengthP=strlen(pat),LengthS=strlen(string);while((PosP<LengthP)&&(PosS<LengthS)) {

if (pat[PosP]==string[PosS]) {PosP++; PosS++;

} else {if (PosP==0) PosS++; else PosP=failure[PosP-1]+1;

}}if (PosP<LengthP) return –1; else return PosS-LengthP;

}

The time complexity of pmatch is O(LengthS). If the fail function is not known in advance, pattern matching can be carried out in timeO(strlen(string)+strlen(pat)).

The time complexity of pmatch is O(LengthS). If the fail function is not known in advance, pattern matching can be carried out in timeO(strlen(string)+strlen(pat)).

Execute at most LengthS times

Page 57: Arrays and Structures

a b a b c

-1 -1 0 1 -1

a b a b c

-1 -1 0 1 -1

a b a b c

-1 -1 0 1 -1

e a b c a b a b cstring

pattern

f

a b a b c

-1 -1 0 1 -1

a b a b c

-1 -1 0 1 -1

a b a b c

-1 -1 0 1 -1

PosP=0

PosS=0

PosP=0

PosS=1

PosP=1

PosS=2

PosP=2

PosS=3

PosP=0

PosS=3

PosP=0

PosS=4

PosP=1

PosS=5

PosP = f(PosP-1)+1

PosP=2

PosS=6

PosP=3

PosS=7

PosP=4

PosS=8

Page 58: Arrays and Structures

Arrays and Structures 58

The Knuth-Morris-Pratt algorithm (cont’d)

The failure function can be equivalently rewritten as

above thesatisfying no is thereif1

for which integer least theis where1)1(

0 if1

)(11

k

ppkmjf

j

jf j)(jf

mk

j 0 1 2 3 4 5 6 7 8 9

pat a b c a b c a c a b

f -1 -1 -1 0 1 2 3 -1 0 1

))(()(),()( 11 jffjfjfjf mm Note

f(j-1)

jf(j-1)+1

=

Page 59: Arrays and Structures

Arrays and Structures 59

j 0 1 2 3 4 5 6 7 8 9 10 11

pat a a a a a a b b b c c c

f -1 0 1 2 3 4 -1 -1 -1 -1 -1 -1

More examples of failure functions

j 0 1 2 3 4 5 6 7 8 9 10 11

pat a b c d a b c d a b c f

f -1 -1 -1 -1 0 1 2 3 4 5 6 -1

j 0 1 2 3 4 5 6 7 8 9 10 11

pat a b c d a b c d a b c a

f -1 -1 -1 -1 0 1 2 3 4 5 6 0

f2(10)=f(6)

-1

f3(10)=f2(6)=f(2) f(10)

Page 60: Arrays and Structures

j 0 1 2 3 4 5 6 7 8 9 10 11

pat a b c d a b c d a b c f

f -1 -1 -1 -1 0 1 2 3 4 5 6 -1

j 0 1 2 3 4 5 6 7 8 9 10 11

pat a b c d a b c d a b c f

f -1 -1 -1 -1 0 1 2 3 4 5 6 -1

j 0 1 2 3 4 5 6 7 8 9 10 11

pat a b c d a b c d a b c f

f -1 -1 -1 -1 0 1 2 3 4 5 6 -1

j 0 1 2 3 4 5 6 7 8 9 10 11

pat a b c d a b c d a b c f

f -1 -1 -1 -1 0 1 2 3 4 5 6 -1

f(f(10))+1

f(10)+1

f(f(f(10)))+1

Page 61: Arrays and Structures

Arrays and Structures 61

Program 2.15: Computing the failure function

void fail(char *pat){

int i,j,LengthP = strlen(pat);failure[0] = -1;for(j=1;j<LengthP; j++) {

i = failure[j-1];while((pat[j]!=pat[i+1])&&(i>=0)) i=failure[i];if (pat[j]==pat[i+1]) failure[j] = i+1;else failure[j]=-1;

}} Time complexity: O(strlen(pat))

Page 62: Arrays and Structures

Arrays and Structures 62

Summary

Representing sparse polynomials and sparse matrices by arrays• Storing pairs of <index, value>

• Analytic formula for the indexes of nonzero elements• e.g. the diagonal marix, tridiagonal matrix, upper/lo

wer triangular matrix,…