chapter 2 arrays and structures

Chapter 2 Arrays and Structures

Instructors: C. Y. Tang and J. S. Roger Jang

All the material are integrated from the textbook "Fundamentals of Data Structures in C" and some supplement from the slides of Prof. Hsin-Hsi Chen (NTU).

Outline The array as an abstract data type Structures and unions The polynomial abstract data type The sparse matrix abstract data type Representation of multidimensional arrays The string abstract data type

Arrays

Array: a set of index and value

data structureFor each index, there is a value associated with

that index.

representation (possible)implemented by using consecutive memory.

Abstract Data Type Array Structure Array is objects : a set of pairs <index, value> where for each value of index there is a value from the set item.

functions: Array Create (j, list) : an array of j dimensions, where list is a j-tuple whose ith element is the size of the ith dimension. Item Retrieve (A, i) : If (i ε index) return the item indexed by i in array A, else return error. Array Store (A, i, x) : If (i ε index) insert <i, x> and return the new array,

else return error. end Array

Implementation in C int list[5], *plist[5];

Variable Memory Address

list[0]base address =

α

list[1] α + sizeof(int)list[2] α + 2*sizeof(int)list[3] α + 3*sizeof(int)list[4] α + 4*sizeof(int)

The names of the five

array elements

Assume the memory address α = 1414, list[0] = 6, list[2] = 8, plist[3] = list = 1414 printf(“%d”, list); // 1414, the variable “list” is the pointer to an int printf(“%d”, &list[0]) // 1414, return the address using “&” printf(“%d”, list[0]); // 6, the value stored in the address “&list[0]” printf(“%d”, (list + 2)); // 1414 + 2*sizeof(int), “(list +2)” is a pointer printf(“%d”, *(list + 2)); // 8 printf(“%d”, plist[3]); // 1414, the value of a pointer is an address printf(“%d”, *plist[3]); // 6

Example: One-dimensional array accessed by address

call print1(&one[0], 5)

void print1(int *ptr, int rows){/* print out a one-dimensional array using a pointer */

int i;printf(“Address Contents\n”);for (i=0; i < rows; i++)

printf(“%8u%5d\n”, ptr+i, *(ptr+i));printf(“\n”);

}

int one[] = {0, 1, 2, 3, 4};Goal: print out address and value

Example: Array program #define MAX_SIZE 100 float sum(float [], int); float input[MAX_SIZE], answer; int i; void main (void) { for (i = 0; i < MAX_SIZE; i++) input[i] = i; answer = sum(input, MAX_SIZE); printf("The sum is: %f\n", answer); } float sum(float list[], int n) { int i; float tempsum = 0; for (i = 0; i < n; i++) tempsum += list[i]; return tempsum; }

Result : The sum is: 4950.000000

Structures (Records)

struct {char name[10]; // a name that is a character arrayint age; // an integer value representing the age of the personfloat salary; // a float value representing the salary of the individual} person;

strcpy(person.name, “james”); person.age=10; person.salary=35000;

An alternate way of grouping data that permits the data to vary in type.

Example:

Create structure data typetypedef struct human_being {

char name[10];int age;float salary;};

or

typedef struct {char name[10];int age;float salary} human_being;

human_being person1, person2;

Example: Embed a structure within a structuretypedef struct {

int month;int day;int year;} date;

typedef struct human_being {char name[10];int age;float salary;

date dob;};

person1.dob.month = 2;person1.dob.day = 11;person1.dob.year = 1944;

UnionsSimilar to struct, but only one field is active.

Example: Add fields for male and female.typedef struct sex_type {

enum tag_field {female, male} sex;union {

int children;boolean beard;} u;

};

typedef struct human_being {char name[10];int age;float salary;date dob;sex_type sex_info;}

human_being person1, person2;person1.sex_info.sex = male;person1.sex_info.u.beard = FALSE;person2.sex_info.sex = female;person2.sex_info.u.children = 4;

Self-Referential Structures

One or more of its components is a pointer to itself.

typedef struct list {char data;list *link;};

list item1, item2, item3;item1.data=‘a’;item2.data=‘b’;item3.data=‘c’;item1.link=item2.link=item3.link=NULL;

a b c

Construct a list with three nodesitem1.link=&item2;item2.link=&item3;malloc: obtain a node

Implementation on other data types

Arrays are not only data structures in their own right, we can also use them to implement other abstract data types.

Some examples of the ordered (linear) list : - Days of the week: (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday). - Values in a deck of cards: (Ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen, King) - Floors of a building: (basement, lobby, mezzanine, first, second) Operations on lists :

- Finding the length, n , of the list.- Reading the items from left to right (or right to left).- Retrieving the ith element from a list, 0 ≦ i < n.- Replacing the ith item of a list, 0 ≦ i < n.- Inserting a new item in the ith position of a list, 0 ≦ i < n.- Deleting an item from the ith position of a list, 0 ≦ i < n.

Abstract data type Polynomial

Structure Polynomial is objects: ; a set of ordered pairs of <ei,ai> where ai in Coefficients and ei in Exponents, ei are integers >= 0functions:for all poly, poly1, poly2 Polynomial, coef Coefficients, expon Exponents Polynomial Zero( ) ::= return the polynomial, p(x) = 0Boolean IsZero(poly) ::= if (poly) return FALSE else return TRUECoefficient Coef(poly, expon) ::= if (expon poly) return its coefficient else return Zero Exponent Lead_Exp(poly) ::= return the largest exponent in polyPolynomial Attach(poly,coef, expon) ::= if (expon poly) return error else return the polynomial poly with the term <coef, expon> inserted Polynomial Remove(poly, expon) ::= if (expon poly) return the polynomial poly with the term whose exponent is expon deleted else return errorPolynomial SingleMult(poly, coef, expon) ::= return the polynomial poly • coef • xexpon

Polynomial Add(poly1, poly2) ::= return the polynomial poly1 +poly2Polynomial Mult(poly1, poly2) ::= return the polynomial poly1 • poly2End Polynomial

nen

e xaxaxp ...)( 11

degree : the largest exponent of a polynomial

Two types of Implementation for the polynomial ADT

Type I :

#define MAX_DEGREE 1001 typedef struct { int degree; float coef[MAX_DEGREE]; } polynomial;

polynomial a; a.degree = n a. coef[i] = an-i , 0 ≦ i ≦ n.

Type II :

MAX_TERMS 1001 /* size of terms array */typedef struct { float coef; int expon; } polynomial;

polynomial terms[MAX_TERMS];int avail = 0; /* recording the available space*/

starta finisha

startb

finishb

avail

coef 2 1 1 10 3 1 　

exp 1000 0 4 3 2 0 　

0 1 2 3 4 5 6

advantage: easy implementationdisadvantage: waste space when sparse

storage requirements: start, finish, 2*(finish-start+1)nonparse: twice as much as Type I

when all the items are nonzero

PA=2x1000+1

PB=x4+10x3+3x2

+1

Poly A

coef 1 … 0 0 2

exp (index)

0 … 998 999100

0

Poly B

coef 1 0 3 10 1 … 0

exp (index)

0 1 2 3 4 …100

0

Polynomials adding of Type I /* d =a + b, where a, b, and d are polynomials */

d = Zero( )while (! IsZero(a) && ! IsZero(b)) do { switch COMPARE (Lead_Exp(a), Lead_Exp(b)) { case -1: d = Attach(d, Coef (b, Lead_Exp(b)), Lead_Exp(b)); b = Remove(b, Lead_Exp(b)); break; case 0: sum = Coef (a, Lead_Exp (a)) + Coef ( b, Lead_Exp(b)); if (sum) { Attach (d, sum, Lead_Exp(a)); a = Remove(a , Lead_Exp(a)); b = Remove(b , Lead_Exp(b)); } break;

case 1: d = Attach(d, Coef (a, Lead_Exp(a)), Lead_Exp(a)); a = Remove(a, Lead_Exp(a)); } } insert any remaining terms of a or b into d

// case -1: Lead_exp(a) < Lead_exp(b)

// case 0: Lead_exp(a) = Lead_exp(b)

// case 1: Lead_exp(a) < Lead_exp(b)

Polynomials adding of Type II

void padd (int starta, int finisha, int startb, int finishb,int * startd, int *finishd){ /* add A(x) and B(x) to obtain D(x) */ float coefficient; *startd = avail; while (starta <= finisha && startb <= finishb) switch (COMPARE(terms[starta].expon, terms[startb].expon)) { case -1: /* a expon < b expon */ attach(terms[avail], terms[startb].coef, terms[startb].expon); startb++; avail++;

break; case 0: /* equal exponents */ coefficient = terms[starta].coef + terms[startb].coef; if (coefficient) attach (terms[avail], coefficient, terms[starta].expon); starta++; startb++; avail++;

break; case 1: /* a expon > b expon */ attach(terms[avail], terms[starta].coef, terms[starta].expon); starta++; avail++; }/* add in remaining terms of A(x) */for( ; starta <= finisha; starta++) { attach(terms[avail], terms[starta].coef, terms[starta].expon);

avail++; }/* add in remaining terms of B(x) */for( ; startb <= finishb; startb++) { attach(terms[avail], terms[startb].coef, terms[startb].expon);

avail++; } *finishd =avail -1;}

The sparse matrix abstract data type

Matrix aNonzero rate : 15/15

Dense !

col_0 col_1 col_2

row_0 -27 3 4

row_1 6 82 -2

row_2 109 -64 11

row_3 12 8 9

row_4 48 27 47

col_0

col_1

col_2

col_3

col_4

col_5

row_0 15 0 0 22 0 -15

row_1 0 11 3 0 0 0

row_2 0 0 0 -6 0 0

row_3 0 0 0 0 0 0

row_4 91 0 0 0 0 0

row_5 0 0 28 0 0 0

Matrix bNonzero rate : 8/36

Sparse !

Abstract data type Sparse_MatrixStructure Sparse_Matrix is

objects: a set of triples, <row, column, value>, where row and column are integers and form a unique combination, and value comes from the set item. functions: for all a, b Sparse_Matrix, x item, i, j, max_col, max_row index Sparse_Marix Create(max_row, max_col) ::= return a Sparse_matrix that can hold up to max_items = max _row

max_col and whose maximum row size is max_row and whose maximum column size is max_col. Sparse_Matrix Transpose(a) ::=

return the matrix produced by interchanging the row and column value of every triple.

Sparse_Matrix Add(a, b) ::= if the dimensions of a and b are the same

return the matrix produced by adding corresponding items, namely those with identical row and column values.

else return error Sparse_Matrix Multiply(a, b) ::= if number of columns in a equals number of rows in b return the matrix d produced by multiplying a by b according to the formula:

d [i] [j] = (a[i][k]•b[k][j]) where d (i, j) is the (i,j)th element else return error.

Sparse_Matrix Create and transpose

#define MAX_TERMS 101 /* maximum number of terms +1*/ typedef struct { int col; int row; int value; } term;

term a[MAX_TERMS]

for each row i (or column j) take element <i, j, value> and

store it in element <j, i, value> of the transpose.

　 row col value

a[0] 6 6 8

[1] 0 0 15

[2] 0 3 22

[3] 0 5 -15

[4] 1 1 11

[5] 1 2 3

[6] 2 3 -6

[7] 4 0 91

[8] 5 2 28

　 row col value

b[0]

6 6 8

[1] 0 0 15

[2] 0 4 91

[3] 1 1 11

[4] 2 1 3

[5] 2 5 28

[6] 3 0 22

[7] 3 2 -6

[8] 5 0 -15Sparse matrix and its transpose stored as

triples

#(rows)#(columns)

#(nonzero entries)

any element within a matrix using the triple <row, col, value>

Difficulty: what position to put <j, i, value> ?

Transpose of a sparse matrix in O(columns*elements)void transpose (term a[], term b[])

/* b is set to the transpose of a */{ int n, i, j, currentb; n = a[0].value; /* total number of elements */ b[0].row = a[0].col; /* rows in b = columns in a */ b[0].col = a[0].row; /*columns in b = rows in a */ b[0].value = n; if (n > 0) { /*non zero matrix */ currentb = 1; for (i = 0; i < a[0].col; i++) /* transpose by columns in a */ for( j = 1; j <= n; j++) /* find elements from the current column */ if (a[j].col == i) { /* element is in current column, add it to b */ b[currentb].row = a[j].col; b[currentb].col = a[j].row; b[currentb].value = a[j].value; currentb++ } }}

Scan the array “columns” times.The array has “elements” elements.

==> O(columns*elements)

Analysis and improvement

Discussion: compared with 2-D array representation

O(columns*elements) vs. O(columns*rows)

#(elements) columns * rows, when the matrix is not sparse. O(columns*elements) O(columns*columns*rows)

Problem: Scan the array “columns” times.

Improvement: Determine the number of elements in each column of the original matri

x. => Determine the starting positions of each row in the transpose matrix.

Transpose of a sparse matrix in O(columns + elements)void fast_transpose(term a[ ], term b[ ])

{ /* the transpose of a is placed in b */ int row_terms[MAX_COL], starting_pos[MAX_COL]; int i, j, num_cols = a[0].col, num_terms = a[0].value; b[0].row = num_cols; b[0].col = a[0].row; b[0].value = num_terms; if (num_terms > 0){ /*nonzero matrix*/ for (i = 0; i < num_cols; i++) row_terms[i] = 0; for (i = 1; i <= num_terms; i++) row_terms[a[i].col]++ starting_pos[0] = 1; for (i =1; i < num_cols; i++) starting_pos[i]=starting_pos[i-1] +row_terms [i-1];

for (i=1; i <= num_terms, i++) { j = starting_pos[a[i].col]++; b[j].row = a[i].col; b[j].col = a[i].row; b[j].value = a[i].value; } }}

O(num_cols)

O(num_terms)

O(num_cols-1)

O(num_terms)

O(columns + elements)

Sparse matrix multiplicationvoid mmult (term a[ ], term b[ ], term d[ ] )

/* multiply two sparse matrices */{ int i, j, column, totalb = b[].value, totald = 0; int rows_a = a[0].row, cols_a = a[0].col, totala = a[0].value; int cols_b = b[0].col, int row_begin = 1, row = a[1].row, sum =0; int new_b[MAX_TERMS][3]; if (cols_a != b[0].row){ fprintf (stderr, “Incompatible matrices\n”); exit (1); } fast_transpose(b, new_b); /* set boundary condition */ a[totala+1].row = rows_a; new_b[totalb+1].row = cols_b; new_b[totalb+1].col = 0; for (i = 1; i <= totala; ) { column = new_b[1].row; for (j = 1; j <= totalb+1;) { /* mutiply row of a by column of b */ if (a[i].row != row) { storesum(d, &totald, row, column, &sum); i = row_begin; for (; new_b[j].row == column; j++) ; column =new_b[j].row }

else if (new_b[j].row != column){ storesum(d, &totald, row, column, &sum); i = row_begin; column = new_b[j].row;}else switch (COMPARE (a[i].col, new_b[j].col)) {

case -1: /* go to next term in a */ i++; break; case 0: /* add terms, go to next term in a and b */ sum += (a[i++].value * new_b[j++].value); break; case 1: /* advance to next term in b*/ j++ } } /* end of for j <= totalb+1 */ for (; a[i].row == row; i++) ; row_begin = i; row = a[i].row; } /* end of for i <=totala */ d[0].row = rows_a; d[0].col = cols_b; d[0].value = totald;}

storesum functionvoid storesum(term d[ ], int *totald, int row, int column, int *sum)

{/* if *sum != 0, then it along with its row and column position is stored as the *totald+1 entry in d */ if (*sum) if (*totald < MAX_TERMS) { d[++*totald].row = row; d[*totald].col = column; d[*totald].value = *sum; } else { fprintf(stderr, ”Numbers of terms in product exceed %d\n”, MAX_TERMS); exit(1); }}

Analyzing the algorithm cols_b * termsrow1 + totalb + cols_b * termsrow2 + totalb + … + cols_b * termsrowrows_a + totalb= cols_b * (termsrow1 + termsrow2 + … + termsrowrows_a) + rows_a * totalb= cols_b * totala + row_a * totalb

O(cols_b * totala + rows_a * totalb)

Compared with classic multiplication algorithm

for (i =0; i < rows_a; i++) for (j=0; j < cols_b; j++) { sum =0; for (k=0; k < cols_a; k++) sum += (a[i][k] *b[k][j]); d[i][j] =sum; }

O(rows_a * cols_a * cols_b)

mmult vs. classicO(cols_b * totala + rows_a * totalb) vs. O(rows_a * cols_a * cols_b)

Matrix-chain multiplication n matrices A1, A2, …, An with size p0 p1, p1 p2, p2 p3, …, pn-1 pn

To determine the multiplication order such that # of scalar multiplications is minimized.

To compute Ai Ai+1, we need pi-1pipi+1 scalar multiplications.

e.g. n=4, A1: 3 5, A2: 5 4, A3: 4 2, A4: 2 5((A1 A2) A3) A4, # of scalar multiplications:

3 * 5 * 4 + 3 * 4 * 2 + 3 * 2 * 5 = 114(A1 (A2 A3)) A4, # of scalar multiplications:

3 * 5 * 2 + 5 * 4 * 2 + 3 * 2 * 5 = 100(A1 A2) (A3 A4), # of scalar multiplications:

3 * 5 * 4 + 3 * 4 * 5 + 4 * 2 * 5 = 160

Let m(i, j) denote the minimum cost for computing Ai Ai+1 … Aj

Computation sequence :

Time complexity : O(n3)

jiif

jiifpppj)1,m(kk)m(i,min

0j)m(i,

ik1ijki

Matrix-chain multiplication (cont.)

Representation of multidimensional arrays

The internal representation of multidimensional arrays requires more complex addressing formulas.

a[upper0] [upper1]…[uppern-1] => #(elements of a) = There are two ways to represent multidimensional arrays: row major order and column major order. (row major order stores multidimensional arrays by rows) Ex. A[upper0][upper1] : upper0 rows (row0, row1,… ,rowupper0-1), each row containing upper1 elements Assume that α is the address of A[0][0]

the address of A[i][0] : α+ i *upper1 A[i][j] : α+ i *upper1+j

1

0

n

iiupper

row major :

1 2 3 4

5 6 7 8

9 10 11 12

column major :

1 4 7 10

2 5 8 11

3 6 9 12

Representation of Multidimensional Arrays (cont.)

n dimensional array addressing formula for A[i0][i1]…[in-1] :

Assume that the address of A[0][0]…[0] is α. the address of A[i0][0][0]…[0] : α + i0*upper1*upper2*…*uppern-1

A[i0][i1][0]…[0] : α + i0*upper1*upper2*…*uppern-1

+ i1*upper2*upper3*…*uppern-1

A[i0][i1][i2]…[in-1] : α + i0*upper1*upper2*…*uppern-1



﹕

+ in-2*uppern-1

+ in-1

1

01

1

1

1

10,:,

n

in

n

jkkj

jj

a

njupperawhereai

The string abstract data type

Structure String is objects : a finite set of zero or more characters. functions : for all s, t String, i, j, m non-negative integers String Null(m) ::= return a string whose maximum length is m characters, but initially set to NULL. We write NULL as “”.

Integer Compare(s, t) ::= if s equals t return 0 else if s precedes t return -1 else return +1 Boolean IsNull(s) ::= if (Compare(s, NULL)) return FALSE else return TRUE Integer Length(s) ::= if (Compare(s, NULL)) return the number of characters in s else return 0 String Concat(s, t) ::= if (Compare(t, NULL)) return a string whose elements are those of s followed by those of t else return s String Subset(s, i, j) ::= if ((j>0) && (i+j-1)<Length(s)) return the string containing the characters of s at position i, i+1, …, i+j-1. else return NULL.

C string functions

String insertion function

void strnins(char *s, char *t, int i){/* insert string t into string s at position i */ char string[MAX_SIZE], *temp = string;

if (i < 0 && i > strlen(s)) { fprintf(stderr, ”Position is out of bounds \n” ); exist(1); } if (!strlen(s)) strcpy(s, t); else if (strlen(t)) { strncpy(temp, s, i); strcat(temp, t); strcat(temp, (s+i)); strcpy(s, temp); }}

s → a m o b i l e\

0

t → u t o \0

temp → \0

initially

temp → a \0

(a) after strncpy (temp, s, i)

temp → a u t o \0

(b) after strcat (temp, t)

temp → a u t o m o b i l e \0

(c) after strcat (temp, (s+i))

Example:

Never be used in practice, since it’s wasteful in use of time and space !

Pattern matching algorithm (checking end indices first)

int nfind(char *string, char *pat){/* match the lat character of pattern first, and then match from the beginning */ int i, j, start = 0; int lasts = strlen(string) – 1; int lastp = strlen(pat) – 1; int endmatch = lastp;

for (i = 0; endmatch <= lasts; endmatch++, start++){ if (string[endmatch] == pat[lastp]) for (j = 0, i = start; j < lastp && string[i] == pat[j]; i++, j++) ; if (j == lastp) return start; /* successful */ } return -1;}

Simulation of nfind

a a b 　

↑ ↑

j lastp

(a) pattern

a b a b b a a b a a 　

↑ ↑ ↑

start endmatch lasts

(b) no match


↑ ↑ ↑


(c) no match


↑ ↑ ↑


(d) no match


↑ ↑ ↑


(e) no match


↑ ↑ ↑


(f) no match


↑ ↑ ↑

start

endmatch lasts

(g) match

Failure function used in KMP algorithm

Definition: If P = p0p1…pn-1 is a pattern, then its failure function, f, is defined as:

Example: pat = ‘abcabcacab’

A fast way to compute the failure function:

otherwise

jf,1

)(Largest i < j such that p0p1…pi = pj-ipj-i-1…pj if such an i ≧ 0 exists

1

1)1(

1

)( jfjf m

if j = 0

Where m is the least integer k for which

if there is no k satisfying the above

jjfPp k

1)1(

j 0 1 2 3 4 5 6 7 8 9

pat

a b c a b c a c a b

f-1

-1

-1

0 1 2 3-1

0 1

KMP algorithm (1/5)

Case 1: the first symbol of P dose not appear again in P

Case 2: the first symbol of P appear again in P

KMP algorithm (2/5)

Case 3: not only the first symbol of P appears in P, prefixes of P appear in P twice

KMP algorithm (3/5)

The KMP Algorithm

The Prefix function

KMP algorithm (4/5)

KMP algorithm (5/5)#include <stdio.h>#include <string.h>#define max_string_size 100#define max_pattern_size 100int pmatch();void fail();int failure[max_pattern_size];char string[max_string_size];char pat[max_pattern_size];

int pmatch (char *string, char *pat){/* Knuth, Morris, Pratt string matching algorithm */ int i = 0, j =0; int lens = strlen(string); int lenp = strlen(pat); while (i < lens && j < lenp) { if (string[i] == pat[j]) { i++; j++; } else if (j == 0) i++; else j = failure[j-1] + 1; } return ( (j == lenp) ? (i-lenp) : -1);}

void fail (char *pat){/* compute the pattern’s failure function */ int n = strlen(pat); failure[0] = -1; for (j = 1; j < n; j++) { i = failure[j-1]; while ((pat[j] != pat[i+1]) && (I >= 0)) i = failure[i]; if (pat[j] == pat[i+1]) failure[j] = i + 1; else failure[j] = -1; }}

O(strlen(stirng))

O(strlen(pattern))

O(strlen(stirng)+strlen(pat))

An Example for KMP algorithm (1/6)

P1≠T1, mismatchSince j=1, i = i + 1


T5≠P4, mismatchPf(4-1)+1=P0+1=P1


P1≠T5, mismatchSince j=1, i = i + 1


Match exactly

Pf(12)+1=P4+1=P5


T19≠P6, mismatch

Pf(6-1)+1=P0+1=P1


.

.

.

chapter 2 arrays and structures

Documents

answer int i

int printf

size i inputi

n i tempsum

onedimensional array

structure data typetypedef

array program

new array