algorithm problems collection

Algorithm Collection

Q1: How would you find a cycle in a linked list? Try to do it in O(n) time. Try it using a constant amount of memory.
Q2: Given a history of URLs, how would you determine if a particular URL had been seen before?
Q3: Since pages can have multiple URLs pointing to them, how can you make sure you've never seen the same CONTENT before?
Q4: Come up with a plan for traversing a graph, as well as for quickly determining whether a given URL is one of the million or so you've previously seen.
Q5: The Web can be modeled as a directed graph. Come up with a graph traversal algorithm. Make the algorithm non-recursive and breadth-first.
Q6: Write a function to print the Fibonacci numbers.
Q7: Write a function to print all of the permutations of a string.
Q8: Design a memory management scheme.
Q9: Give a one-line C expression to test whether a number is a power of 2. Now implement it without using loops.
Q10: How can I swap two integers in a single line statement?
Q11: Implement an algorithm to sort a linked list.
Q12: Implement strstr(), strcat(), strtok(), strrchr, and strcmp.
Q13: Given a linked list which is sorted, how will you insert in sorted way?
Q14: Give me an algorithm and C code to find the subarray with the largest sum given an array containing both positive and negative integers.
Q15: Given an array of size N in which every number is between 1 and N, determine if there are any duplicates in it.
Q16: A square picture is cut into 16 squares and they are shuffled. Write a program to rearrange the 16 squares to get the original big square.
Q17: Implement an algorithm to reverse a singly linked list (with and without recursion).
Q18: Implement an algorithm to reverse a doubly linked list.
Q19: Delete an element from a doubly linked list.
Q20: Implement an algorithm to sort an array.
Q21: Given a sequence of characters, how will you convert the lower case characters to upper case characters?
Q22: Count the number of set bits in a number without using a loop.
Q23: Give me an algorithm and C code to shuffle a deck of cards, given that the cards are stored in an array of ints. Try to come up with a solution that does not require any extra space.
Q24: How would you print out the data in a binary tree, level by level, starting at the top?
Q25: Do a breadth first traversal of a tree.
Q26: Write a routine to draw a circle given a center coordinate (x,y) and a radius (r) without making use of any floating point computations.
Q27: Given an array of characters which form a sentence of words, give an efficient algorithm to reverse the order of the words in it.
Q28: Implement a TIC-TAC-TOE game assuming Computer as one of the players. Optimize it for fast computer play time and space. Do some analysis on memory and processing requirements.
Q29: Write a function to find the depth of a binary tree.
Q30: You are given two strings: S1 and S2. Delete from S2 all those characters which occur in S1 also and create a clean S2 with the relevant characters deleted.
Q31: Write a small lexical analyzer for expressions like "a*b" etc.
Q32: Given an array t[100] which contains numbers between 1 and 99. Return the duplicated value. Try both O(n) and O(n-square).
Q33: Write efficient code for extracting unique elements from a sorted list of array.
Q34: Given a list of numbers (fixed list). Now given any other list, how can you efficiently find out if there is any element in the second list that is an element of the first list (fixed list)?
Q35: Print an integer using only putchar. Try doing it without using extra storage.
Q36: Write a function that allocates memory for a two-dimensional array of given size (parameter x & y).
Q37: Write source code for printHex(int i) in C/C++.
Q38: What sort of technique would you use to update a set of files over a network, where a server contains the master copy?
Q39: How do you handle deadlock on a table that is fed with a live serial feed?
Q40: Do the class/structure description for a Hash table, and write the source code for the insert function.
Q41: How would you implement a hash table? How do you deal with collisions?
Q42: What would you suspect if users report they are seeing someone else's data on their customized pages?
Q43: How would you do conditional compilation?
Q44: Write an algorithm to draw a 3D pie chart.
Q45: Prove that Dijkstra's MST algorithm indeed finds the overall MST.
Q46: How would you implement a queue from a stack?
Q47: Write a function that finds repeating characters in a string.
Q48: Write a routine to reverse a series of numbers without using an array.
Q49: Write a function to find the nth item from the end of a linked list in a single pass.
Q50: Give me an algorithm for telling me the number I didn't give you in a given range of numbers (numbers are given at random).
Q51: Write a random number generator in a specified range.
Q52: Delete a node from a single linked list.
Q53: Say you have three integers between 0 and 9. You have the equation: A! + B! + C! = ABC. Find A, B, and C that satisfy this equation.
Q54: Given 2 nodes in a tree, how do you find the common root of the nodes?
Q99: Write a small lexical analyzer for expressions like a(b|c)d*e+.

Q1: How would you find a cycle in a linked list? Try to do it in O(n) time. Try it using a constant amount of memory.

A1: Walk the list with two pointers, p1 advancing one node per iteration and p2 advancing two. If the list has no cycle, p2 reaches the end of the list before p1, and every link p2 follows is tested by the while condition, so there is no chance of an access violation. If the list has a cycle of k nodes, then once both pointers are inside it the distance from p1 to p2 grows by one node per iteration (mod k), so the pointers must meet within k iterations. In general, if p1 advances by n and p2 by m (n != m), the distance changes by m - n (mod k) per iteration. A difference of 1, as with n == 1 and m == 2, passes through every possible distance and therefore can never skip over the meeting point.

bool HasCycle(Node *pHeadNode) {
    Node *p1, *p2;
    p1 = p2 = pHeadNode;
    while (p2 && p2->Next) {
        p1 = p1->Next;
        p2 = p2->Next->Next;
        if (p1 == p2)
            return true;
    }
    return false;

}

Q2: Given a history of URLs, how would you determine if a particular URL had been seen before?

A2: A hash table is the most efficient way to do this. Several hash functions will work; for example, a checksum of the URL can serve as the hash key. Lookups are then O(1) on average, provided the checksum spreads URLs evenly over the table. A checksum is not guaranteed to be unique, so compare the stored URL itself whenever two keys match. Whenever a page loads, parse all the URLs from the page, compute their checksums, and look them up in the hash table. Links that match can be displayed in a different color.

A hash table is the correct answer, but a constant-time algorithm sounds too good to be true. URLs are already unique, and the hash function should not try to be unique as well; its job is to narrow the search to a small bucket. Otherwise the lookup becomes little more than a linear search, which a binary search can beat.

Though URLs are not inherently alphabetically ordered, one could order them that way, or design a hash function that takes advantage of the ordering. The lookup then combines a hash step with a binary search inside the bucket, and its complexity is straightforward to analyze. A good data structure for this is a hash table whose buckets are binary trees (an array pointing to binary trees).
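The hash-table idea from A2 can be sketched with the standard library's std::unordered_set, which stores the URLs themselves and so resolves collisions by comparing full strings. The class name UrlHistory and its method are made up for this illustration.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper: remembers every URL passed to it.
class UrlHistory {
public:
    // Returns true if the URL was already recorded, false if it is new.
    bool SeenBefore(const std::string &url) {
        // insert() reports in .second whether the element was actually added.
        return !seen_.insert(url).second;
    }

private:
    std::unordered_set<std::string> seen_;
};
```

Each call costs one hash computation plus, on average, a constant number of string comparisons.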

Q3: Since pages can have multiple URLs pointing to them, how can you make sure you've never seen the same CONTENT before?

A3: Keep a list (or a binary tree) of hashes (using MD5, SHA1 or similar hash/digest algorithm) of pages you've visited. Then just compare digest of current page to the hashes in the tree.
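A minimal sketch of the digest idea in A3. To stay self-contained it uses std::hash as a stand-in digest, which is an assumption of this example: std::hash is not a cryptographic hash, and a real crawler would substitute MD5, SHA-1 or similar. The function names are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <set>
#include <string>

// Stand-in digest: std::hash is NOT a cryptographic hash like MD5 or SHA-1.
std::size_t Digest(const std::string &content) {
    return std::hash<std::string>{}(content);
}

// Returns true if a page with the same digest was recorded earlier;
// otherwise records the digest and returns false.
bool ContentSeenBefore(std::set<std::size_t> &digests, const std::string &content) {
    return !digests.insert(Digest(content)).second;
}
```

std::set is a balanced binary tree, matching the "binary tree of hashes" in the answer.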

Q4: Come up with a plan for traversing a graph, as well as for quickly determining whether a given URL is one of the million or so you've previously seen.

A4: Traverse the graph with a breadth-first or depth-first search. (Prim's and Kruskal's algorithms, sometimes suggested here, build minimum spanning trees rather than traverse a graph.) To check for an already seen URL, use a dictionary search tree (trie): a lookup costs time proportional to the length of the URL, independent of how many URLs are in the database.

Q5: The Web can be modeled as a directed graph. Come up with a graph traversal algorithm. Make the algorithm non-recursive and breadth-first.
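One possible answer, as a sketch: a queue-driven breadth-first traversal over adjacency lists. The Graph typedef and the function name are assumptions of this example; vertices are numbered from 0 and stand for pages, edges for links.

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// Graph as adjacency lists: adj[u] holds the pages that page u links to.
typedef std::vector<std::vector<int> > Graph;

// Returns the vertices reachable from start in breadth-first order.
std::vector<int> BreadthFirstOrder(const Graph &adj, int start) {
    std::vector<int> order;
    std::vector<bool> visited(adj.size(), false);
    std::queue<int> pending;
    visited[start] = true;
    pending.push(start);
    while (!pending.empty()) {
        int u = pending.front();
        pending.pop();
        order.push_back(u);
        for (std::size_t i = 0; i < adj[u].size(); i++) {
            int v = adj[u][i];
            if (!visited[v]) {      // mark on enqueue so each vertex is queued once
                visited[v] = true;
                pending.push(v);
            }
        }
    }
    return order;
}
```

Marking vertices when they are enqueued, not when they are dequeued, keeps cycles in the directed graph from queueing a page twice.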

Q6: Write a function to print the Fibonacci numbers

#include <assert.h>

int fib(int n)
{
    assert(n >= 1);
    return (n <= 2 ? 1 : fib(n - 1) + fib(n - 2));
}

Q7: Write a function to print all of the permutations of a string.

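#include <map>
#include <string>
#include <utility>
using namespace std;

map<string, int> map_StrInt;    // each permutation found, with its count

void PermuteString(char *str, int n)
{
    if (n == 1) {
        map_StrInt[string(str)]++;
        return;
    }
    for (int i = 0; i < n; i++) {
        swap(str[i], str[n-1]);     // choose str[i] for position n-1
        PermuteString(str, n-1);    // permute the first n-1 characters
        swap(str[i], str[n-1]);     // undo the choice
    }
}

Q8: Design a memory management scheme.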

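Q9: Give a one-line C expression to test whether a number is a power of 2. Now implement it without using loops.

A9: x = ((x > 0) && !(x & (x - 1)));

Q10: How can I swap two integers in a single line statement?

A10: Use the xor operator: a ^= b, b ^= a, a ^= b; The shorter form a ^= b ^= a ^= b; modifies a twice without an intervening sequence point, which is undefined behavior in C. Either form zeroes the value if a and b refer to the same object.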

Q11: Implement an algorithm to sort a linked list.

A11: The merge sort algorithm:

#include <assert.h>
#include <stddef.h>

typedef struct MNode *PNODE;

struct MNode {
    int Key;
    struct MNode *Next;
};

// Merge two non-empty sorted lists into one sorted list.
PNODE Merge(PNODE p, PNODE q)
{
    assert(p);
    assert(q);

    PNODE pHead;
    if (p->Key < q->Key) {
        pHead = p, p = p->Next;
    } else {
        pHead = q, q = q->Next;
    }

    PNODE r = pHead;
    while (p && q) {
        if (p->Key < q->Key) {
            r->Next = p, r = r->Next, p = p->Next;
        } else {
            r->Next = q, r = r->Next, q = q->Next;
        }
    }
    if (!p) r->Next = q;
    if (!q) r->Next = p;
    return pHead;
}

// Split a list of at least two nodes in half; returns the second half.
PNODE Partition(PNODE pNode)
{
    PNODE p1 = pNode;
    PNODE p2 = pNode->Next->Next;
    while (p2 && p2->Next) {
        p2 = p2->Next->Next;
        p1 = p1->Next;
    }
    PNODE pSav = p1->Next;
    p1->Next = NULL;
    return pSav;
}

PNODE Merge_Sort(PNODE p)
{
    if (!p || !p->Next) return p;
    PNODE q = Partition(p);
    p = Merge_Sort(p);
    q = Merge_Sort(q);
    p = Merge(p, q);
    return p;
}

Q12: Implement strstr(), strcat(), strtok(), strrchr, and strcmp

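char *strstr(const char *str1, const char *str2)
{
    char *cp = (char *)str1;
    if (!*str2) return cp;                      // empty pattern matches at the start
    if (strlen(str2) > strlen(cp)) return NULL; // pattern longer than text
    char *endp = cp + (strlen(cp) - strlen(str2));

    while (*cp && (cp <= endp)) {
        char *s1 = cp;
        char *s2 = (char *)str2;
        while (*s1 && *s2 && (*s1 == *s2))
            s1++, s2++;
        if (!(*s2)) return cp;  // success!
        cp++;                   // bump pointer to next char
    }
    return NULL;
}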

char *strcat(char *dst, const char *src)
{
    char *cp = dst;
    while (*cp) cp++;               // find end of dst
    while ((*cp++ = *src++));       // copy src to end of dst
    return dst;                     // return dst
}

char *strcpy(char *dst, const char *src)
{
    char *cp = dst;
    while ((*cp++ = *src++));       // copy src over dst
    return dst;
}

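char *strtok(char *string, const char *control)
{
    unsigned char *str;
    const unsigned char *ctrl = (const unsigned char *)control;
    unsigned char map[32];
    int count;
    static char *nextoken;

    /* Clear control map */
    for (count = 0; count < 32; count++)
        map[count] = 0;

    // Set bits in delimiter table
    do {
        map[*ctrl >> 3] |= (1 << (*ctrl & 7));
    } while (*ctrl++);

    // Continue from the saved position when string is NULL
    str = (unsigned char *)(string ? string : nextoken);

    // Skip leading delimiters
    while ((map[*str >> 3] & (1 << (*str & 7))) && *str)
        str++;
    string = (char *)str;

    // Find the end of the token and terminate it
    for (; *str; str++)
        if (map[*str >> 3] & (1 << (*str & 7))) {
            *str++ = '\0';
            break;
        }

    // Update nextoken
    nextoken = (char *)str;

    // Determine if a token has been found.
    if (string == (char *)str)
        return NULL;
    else
        return string;
}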

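int strcmp(const char *src, const char *dst)
{
    int ret = 0;
    // Advance while the characters match and the end has not been reached
    while (!(ret = *(unsigned char *)src - *(unsigned char *)dst) && *dst)
        ++src, ++dst;
    if (ret < 0)
        ret = -1;
    else if (ret > 0)
        ret = 1;
    return ret;
}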

char *strrev(char *string)
{
    char *start = (char *)string;
    char *left = (char *)string;
    while (*string++);      // find end of string
    string -= 2;
    while (left < string) {
        char ch = *left;
        *left++ = *string;
        *string-- = ch;
    }
    return start;
}

char *strrchr(const char *string, int ch)
{
    char *start = (char *)string;
    while (*string++);      // find end of string
    // search towards the front
    while (--string != start && *string != (char)ch);
    if (*string == (char)ch)    // char found?
        return (char *)string;
    return NULL;
}

Q13: Given a linked list which is sorted, how will you insert in sorted way?

void Insert(PNODE &pHead, PNODE pThing)
{
    if (pHead == 0) {
        pThing->Next = 0;
        pHead = pThing;
    } else {
        bool fComesFirst = true;
        PNODE pCurrent = pHead;
        PNODE pPrevious = 0;
        while (pCurrent) {
            if (pThing->Key < pCurrent->Key)
                break;
            pPrevious = pCurrent;
            pCurrent = pCurrent->Next;
            fComesFirst = false;
        }
        if (fComesFirst)
            pHead = pThing;
        else
            pPrevious->Next = pThing;
        pThing->Next = pCurrent;
    }

}

Q14: Give me an algorithm and C code to find the subarray with the largest sum given an array containing both positive and negative integers.

A14: For each position i in the array we can find the subarray that ends exactly at that position and has the largest sum. For the first element the only possible subarray is the one containing just that element, so initially the largest sum is a[1]. (For the algorithm's sake, assume the array is 1-indexed.) Following the principle of dynamic programming, the best subarray (the one with the largest sum) ending at position i+1 is related to the best subarray ending at position i.

Let k be the starting index of the best subarray ending at position i, and S the sum of this subarray. Let t be an index strictly less than k. The sum from index t to i+1 is T + S + a[i+1], where T is the sum of the elements from index t to k-1. Because of the way k was chosen, T + S <= S, so T + S + a[i+1] <= S + a[i+1]: starting earlier than k cannot help, and the best subarray ending at i+1 either extends the best one ending at i or, when S is negative, consists of a[i+1] alone. A negative S therefore tells you to start a new subarray, which becomes a candidate for the best subarray of the entire array.

In conclusion, the information to keep while going through the array only once (O(n)) is the best subarray ending at the current position and the best subarray found so far in the entire array. (If necessary, the starting and ending positions of these subarrays can be kept as well.) After the last element has been processed, the algorithm has determined the subarray with the largest sum.

Note: The algorithm works even if there is no negative number in the array, in which case the largest sum is the sum of all array elements.

Algorithm: arr is the array of n integers; the function returns the largest sum and the limits of the subarray producing that value.

int GetLargestSubArray(int *arr, int n, int &iBeginRet, int &iEndRet)
{
    int iBeginSaved = 0, iEndSaved = 0; // the start/end positions of the saved sub array
    int iBegin = 0, iEnd = 0;           // the start/end positions of the current sub array
    int nSumSaved = 0, nSum = 0;        // the sums of the saved largest and current sub arrays
    int i = 0;                          // index to loop in the array

    if (0 == n) {                       // nothing to analyze, return invalid array indexes
        iEndRet = iBeginRet = -1;
        return 0;
    }

    nSumSaved = nSum = arr[i];
    for (i = 1; i < n; i++) {
        /* Compute the current largest sum */
        if (nSum < 0) {                 // a negative running sum cannot help: restart here
            nSum = arr[i];
            iBegin = i;
        } else {
            nSum += arr[i];
        }
        iEnd = i;

        /* Keep the best sub array found so far */
        if (nSum > nSumSaved) {
            nSumSaved = nSum;
            iBeginSaved = iBegin;
            iEndSaved = iEnd;
        }
    }
    iBeginRet = iBeginSaved;
    iEndRet = iEndSaved;
    return nSumSaved;

}

Q15: Given an array of size N in which every number is between 1 and N, determine if there are any duplicates in it.

A15: I'll try to do it in O(N) without using any additional memory. The key is to use the content of the array as an index into the array, checking in O(1) whether that number has been seen already.

bool HasDups(int *a, int N)
{
    bool fHasDup = false;
    for (int i = 0; i < N; i++) {
        int index = a[i] % N;       // a[i] may already be marked, so reduce mod N
        if (a[index] > N) {         // slot already marked: this value was seen before
            fHasDup = true;
            break;
        }
        a[index] += N;              // mark the slot for this value
    }

    // restore the array
    for (int j = 0; j < N; j++)
        if (a[j] > N) a[j] -= N;

    return fHasDup;
}

Q16: A square picture is cut into 16 squares and they are shuffled. Write a program to rearrange the 16 squares to get the original big square.

Q17: Implement an algorithm to reverse a singly linked list. (with and without recursion)

// Recursive version; call as RevSList(pHead, NULL)
Node *RevSList(Node *pCur, Node *pRev)
{
    if (!pCur) return pRev;
    Node *pNext = pCur->Next;
    pCur->Next = pRev;
    pRev = pCur;
    return RevSList(pNext, pRev);
}

// Iterative version
Node *RevSList(Node *pCur)
{
    Node *pRev = NULL;
    while (pCur) {
        Node *pNext = pCur->Next;
        pCur->Next = pRev;
        pRev = pCur;
        pCur = pNext;
    }
    return pRev;
}

Q18: Implement an algorithm to reverse a doubly linked list.

// Recursive version: swaps Next and Prev in every node and returns the new head
Node *RevDList(Node *pCur)
{
    if (!pCur) return pCur;
    Node *pSav = pCur->Next;
    pCur->Next = pCur->Prev;
    pCur->Prev = pSav;
    if (!pSav) return pCur;
    return RevDList(pSav);
}

// Iterative version
Node *RevDList(Node *pCur)
{
    while (pCur) {
        Node *pSav = pCur->Next;
        pCur->Next = pCur->Prev;
        pCur->Prev = pSav;
        if (!pSav) return pCur;
        pCur = pSav;
    }
    return pCur;
}

Q19: Delete an element from a doubly linked list.

Node *DelDLLNode(Node *pNode)
{
    if (!pNode) return pNode;
    if (pNode->Prev) pNode->Prev->Next = pNode->Next;
    if (pNode->Next) pNode->Next->Prev = pNode->Prev;
    return pNode;   // delete it if it's heap-based.
}

Q20: Implement an algorithm to sort an array.

Q21: Given a sequence of characters, how will you convert the lower case characters to upper case characters?

Q22: Count the number of set bits in a number without using a loop.

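Both macros assume x is a 32-bit unsigned variable. reverse(x) reverses the order of the bits. The second macro replaces x with its number of set bits by summing adjacent 1-, 2-, 4-, 8- and 16-bit fields; its name was lost in the source, so count_ones is a placeholder.

#define reverse(x) \
    (x = x >> 16 | (0x0000ffff & x) << 16, \
     x = (0xff00ff00 & x) >> 8 | (0x00ff00ff & x) << 8, \
     x = (0xf0f0f0f0 & x) >> 4 | (0x0f0f0f0f & x) << 4, \
     x = (0xcccccccc & x) >> 2 | (0x33333333 & x) << 2, \
     x = (0xaaaaaaaa & x) >> 1 | (0x55555555 & x) << 1)

#define count_ones(x) \
    (x = ((0xaaaaaaaa & x) >> 1) + (0x55555555 & x), \
     x = ((0xcccccccc & x) >> 2) + (0x33333333 & x), \
     x = ((0xf0f0f0f0 & x) >> 4) + (0x0f0f0f0f & x), \
     x = ((0xff00ff00 & x) >> 8) + (0x00ff00ff & x), \
     x = (x >> 16) + (0x0000ffff & x))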

Q23: Give me an algorithm and C code to shuffle a deck of cards, given that the cards are stored in an array of ints. Try to come up with a solution that does not require any extra space.

for (Src = 0; Src < N; Src++) {
    Dest = rand() % N;      // All N positions equally likely
    Swap(X[Src], X[Dest]);
}

At first glance, it would appear that this algorithm generates all permutations with equal probability. On examination, though, one can see that it generates N^N arrangements of elements: each of the N iterations of the loop positions a value among the N available positions. There are only N! possible permutations of N elements, so for each permutation there are multiple ways to generate it (on average, N^N/N! ways). Since N^N is not evenly divisible by N! for N > 2, some permutations must come up more often than others.

for (Dest = N - 1; Dest > 0; Dest--) {
    Src = rand() % (Dest + 1);  // position between 0 and Dest, inclusive
    Swap(X[Src], X[Dest]);
}

Examination of the structure of this loop shows that it generates N! arrangements of elements. All permutations are equally likely, aside from the very minor deviation from a uniform distribution introduced by selecting a random value between 0 and Dest as (rand() % (Dest+1)).

Q24: How would you print out the data in a binary tree, level by level, starting at the top?
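A sketch of one way to answer this: a breadth-first walk that handles one level per outer iteration. The TreeNode type and the function name are assumptions of this example, and the output is collected in a string (one line per level) so it can be printed or inspected.

```cpp
#include <cstddef>
#include <queue>
#include <sstream>
#include <string>

struct TreeNode {          // hypothetical node type for this sketch
    int value;
    TreeNode *left;
    TreeNode *right;
};

// Builds one line of text per tree level, top level first.
std::string LevelsToString(const TreeNode *root) {
    std::ostringstream out;
    std::queue<const TreeNode *> level;
    if (root) level.push(root);
    while (!level.empty()) {
        std::size_t count = level.size();   // nodes on the current level
        for (std::size_t i = 0; i < count; i++) {
            const TreeNode *node = level.front();
            level.pop();
            out << node->value << (i + 1 < count ? " " : "\n");
            if (node->left) level.push(node->left);
            if (node->right) level.push(node->right);
        }
    }
    return out.str();
}
```

Recording the queue size before the inner loop is what separates one level from the next without any marker nodes.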

Q25: Do a breadth first traversal of a tree.

#include <queue>
#include <stack>
using namespace std;

typedef int VertexType;
typedef class Vertex *LPVertex;

class Vertex {
    int index;
    int weight;
    LPVertex next;
public:
    int GetIndex();
    bool Visited();
    Vertex &Visit();
    Vertex &Mark();
    Vertex *GetNext();
};

enum { kMaxVertices = 7 };
typedef LPVertex HeaderArray[kMaxVertices];
HeaderArray adjacencyList;

void BreadthFirstSearch(HeaderArray adjacencyList, LPVertex pv)
{
    queue<LPVertex> Q;
    pv->Visit().Mark();
    Q.push(pv);
    while (!Q.empty()) {
        pv = Q.front(); Q.pop();
        pv = adjacencyList[pv->GetIndex()];
        for (; pv; pv = pv->GetNext())
            if (!pv->Visited()) {
                pv->Visit().Mark();
                Q.push(pv);
            }
    }
}

void DepthFirstSearch(HeaderArray adjacencyList, LPVertex pv)
{
    stack<LPVertex> S;
    pv->Visit().Mark();
    S.push(pv);
    while (!S.empty()) {
        pv = S.top(); S.pop();      // pop before pushing the neighbours
        pv = adjacencyList[pv->GetIndex()];
        for (; pv; pv = pv->GetNext())
            if (!pv->Visited()) {
                pv->Visit().Mark();
                S.push(pv);
            }
    }
}

Q26: Write a routine to draw a circle given a center coordinate (x,y) and a radius (r) without making use of any floating point computations.
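One common integer-only technique for this question is the midpoint circle algorithm, sketched below. Instead of drawing, the sketch collects the pixel coordinates in a set; the PointSet typedef and both function names are assumptions of this example.

```cpp
#include <set>
#include <utility>

typedef std::set<std::pair<int, int> > PointSet;

// Adds the eight symmetric points of offset (dx, dy) around the center (cx, cy).
static void PlotOctants(PointSet &pts, int cx, int cy, int dx, int dy) {
    pts.insert(std::make_pair(cx + dx, cy + dy));
    pts.insert(std::make_pair(cx - dx, cy + dy));
    pts.insert(std::make_pair(cx + dx, cy - dy));
    pts.insert(std::make_pair(cx - dx, cy - dy));
    pts.insert(std::make_pair(cx + dy, cy + dx));
    pts.insert(std::make_pair(cx - dy, cy + dx));
    pts.insert(std::make_pair(cx + dy, cy - dx));
    pts.insert(std::make_pair(cx - dy, cy - dx));
}

// Midpoint circle algorithm: integer additions and comparisons only.
PointSet CirclePoints(int cx, int cy, int r) {
    PointSet pts;
    int dx = 0, dy = r;
    int d = 1 - r;                  // decision variable for the midpoint test
    while (dx <= dy) {
        PlotOctants(pts, cx, cy, dx, dy);
        if (d < 0) {
            d += 2 * dx + 3;        // midpoint inside the circle: keep dy
        } else {
            d += 2 * (dx - dy) + 5; // midpoint outside: step dy inward
            dy--;
        }
        dx++;
    }
    return pts;
}
```

Only one octant is computed; the other seven follow from symmetry, which is why the loop stops once dx passes dy.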

Q27: Given an array of characters which form a sentence of words, give an efficient algorithm to reverse the order of the words in it.

char *ReverseWords(char *string)
{
    char *start = strrev(string);   // reverse the whole sentence first
    char *left;
    char *right;
    char ch;

    while (*string) {
        while (*string == ' ' && *string)   // skip spaces
            string++;
        left = string;
        while (*string != ' ' && *string)   // find the end of the word
            string++;
        right = string - 1;
        while (left < right) {              // reverse the word in place
            ch = *left;
            *left++ = *right;
            *right-- = ch;
        }
    }
    return start;
}

Q28: Implement a TIC-TAC-TOE game assuming Computer as one of the players. Optimize it for fast computer play time and space. Do some analysis on memory and processing requirements.

Q29: Write a function to find the depth of a binary tree.

long findDepth(Node *pNode)
{
    if (!pNode) return 0;
    long depthLeft = findDepth(pNode->Left);
    long depthRight = findDepth(pNode->Right);
    return (depthLeft > depthRight ? ++depthLeft : ++depthRight);
}

Q30: You are given two strings: S1 and S2. Delete from S2 all those characters which occur in S1 also and create a clean S2 with the relevant characters deleted.

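char *strtokrem(char *string, const char *control)
{
    char *start = string;
    char *str = string;
    const unsigned char *ctrl = (const unsigned char *)control;
    unsigned char map[32];
    int count;

    /* Clear control map */
    for (count = 0; count < 32; count++)
        map[count] = 0;

    // Set bits in delimiter table
    do {
        map[*ctrl >> 3] |= (1 << (*ctrl & 7));
    } while (*ctrl++);

    // Copy down only the characters whose bit is not set
    for (; *str; str++)
        if (!(map[(unsigned char)*str >> 3] & (1 << (*str & 7))))
            *string++ = *str;
    *string = '\0';
    return start;
}

Q31: Write a small lexical analyzer for expressions like "a*b" etc.

    /* ... the beginning of this routine is missing from the source ... */
    if (*string == 'b')
        bRet = (*++string == '\0');
    return bRet;
}

Q32: Given an array t[100] which contains numbers between 1 and 99. Return the duplicated value. Try both O(n) and O(n-square).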

Q33: Write efficient code for extracting unique elements from a sorted list of array.

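void PrintUniqueElements(int rgNumb[], int cNumb)
{
    assert(cNumb > 0);
    int iSav;
    cout << (iSav = rgNumb[0]) << "\t";
    // The list is sorted, so a value is new exactly when it differs from the last one printed
    for (int i = 1; i < cNumb; i++)
        if (iSav != rgNumb[i])
            cout << (iSav = rgNumb[i]) << "\t";
}

Q34: Given a list of numbers (fixed list). Now given any other list, how can you efficiently find out if there is any element in the second list that is an element of the first list (fixed list)?

Q35: Print an integer using only putchar. Try doing it without using extra storage.

void PrintInt(int val)     // name and opening lines assumed; they are missing from the source
{
    stack<int> S;
    S.push(val);
    while (!S.empty()) {
        val = S.top(); S.pop();
        if (val >= 10) {
            S.push(val % 10);
            S.push(val / 10);
        } else
            cout.put(val + '0');
    }
}

Q36: Write a function that allocates memory for a two-dimensional array of given size (parameter x & y).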
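A sketch of one possible layout: an array of row pointers plus a single contiguous data block, so the result can be indexed as table[row][col] and released with two calls to free. The names Alloc2D and Free2D are hypothetical, and int is assumed as the element type.

```cpp
#include <cstdlib>

// Allocates a rows-by-cols array of int, zero-initialized.
// Two allocations: one for the row pointers, one contiguous block for the data.
int **Alloc2D(int rows, int cols) {
    if (rows <= 0 || cols <= 0) return NULL;
    int **table = (int **)std::malloc(rows * sizeof(int *));
    if (!table) return NULL;
    table[0] = (int *)std::calloc((std::size_t)rows * cols, sizeof(int));
    if (!table[0]) {
        std::free(table);
        return NULL;
    }
    for (int r = 1; r < rows; r++)
        table[r] = table[0] + (std::size_t)r * cols;   // row r starts r*cols ints in
    return table;
}

// Releases an array obtained from Alloc2D.
void Free2D(int **table) {
    if (!table) return;
    std::free(table[0]);   // the contiguous data block
    std::free(table);      // the row-pointer block
}
```

Keeping the data in one block means rows are adjacent in memory, which allocating each row separately would not guarantee.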

Q37: Write source code for printHex(int i) in C/C++

void putHex(int val)
{
    if (val < 0) {
        printf("-");
        val = -val;
    }
    if (val >= 0 && val < 16) {
        printf("%c", val > 9 ? (val - 10) + 'A' : val + '0');
        return;
    }
    putHex(val / 16);
    printf("%c", val % 16 > 9 ? (val % 16 - 10) + 'A' : val % 16 + '0');
}

Q38: What sort of technique would you use to update a set of files over a network, where a server contains the master copy?

Q39: How do you handle deadlock on a table that is fed with a live serial feed?

Q40: Do the class/structure description for a Hash table, and write the source code for the insert function.

Q41: How would you implement a hash table? How do you deal with collisions.?
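A minimal sketch covering Q40 and Q41 together: a hash table that resolves collisions by separate chaining, with each bucket holding a linked list of key/value pairs. The class layout, the string-to-int mapping and the hash function are all choices made for this example, not the only reasonable ones.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Minimal chained hash table mapping strings to ints (all names hypothetical).
class HashTable {
public:
    // The bucket count must be at least 1.
    explicit HashTable(std::size_t buckets = 16) : buckets_(buckets) {}

    // Inserts key/value; an existing key has its value overwritten.
    void Insert(const std::string &key, int value) {
        Chain &chain = buckets_[Index(key)];
        for (Chain::iterator it = chain.begin(); it != chain.end(); ++it)
            if (it->first == key) { it->second = value; return; }
        chain.push_back(std::make_pair(key, value));   // new key: extend the chain
    }

    // Returns a pointer to the stored value, or NULL if the key is absent.
    const int *Find(const std::string &key) const {
        const Chain &chain = buckets_[Index(key)];
        for (Chain::const_iterator it = chain.begin(); it != chain.end(); ++it)
            if (it->first == key) return &it->second;
        return NULL;
    }

private:
    typedef std::list<std::pair<std::string, int> > Chain;

    // Simple multiplicative string hash reduced to a bucket index.
    std::size_t Index(const std::string &key) const {
        std::size_t h = 0;
        for (std::size_t i = 0; i < key.size(); i++)
            h = h * 31 + (unsigned char)key[i];
        return h % buckets_.size();
    }

    std::vector<Chain> buckets_;
};
```

Open addressing (probing for the next free slot) is the usual alternative to chaining; this sketch also omits resizing, which a production table would need as the chains grow.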

Q42: What would you suspect if users report they are seeing someone else's data on their customized pages?A42: Overflow; If we're talking about ASP, JSP, CFML that sort of thing, I would suspect unsynchronized access to session data in a multi-threaded Servlet, JavaBean or COM object -- i.e., a non-threadsafe bug in the server application.

Q43: How would you do conditional compilation?A43: The #if directive, with the #elif, #else, and #endif directives, controls compilation of portions of a source file. If the expression you write (after the #if) has a nonzero value, the line group immediately following the #if directive is retained in the translation unit. Plus, #ifdef and #ifndef identifier.

Q44: Write an algorithm to draw a 3D pie chart?

Q45: Prove that Dijkstra's MST algorithm indeed finds the overall MST.

A45: The two common MST algorithms are Kruskal's and Prim's. Dijkstra gave us an algorithm for shortest paths, not for MSTs.

Q46: How would you implement a queue from a stack?

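stack<element> stk1, stk2;      // stk1 receives pushes, stk2 serves pops

void push(element e)
{
    stk1.push(e);
}

element pop()
{
    if (stk2.empty())
        while (!stk1.empty()) {     // reverse stk1 into stk2 so the oldest element is on top
            stk2.push(stk1.top());
            stk1.pop();
        }
    element e = stk2.top();
    stk2.pop();
    return e;
}

Q47: Write a function that finds repeating characters in a string.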

Q48: Write a routine to reverse a series of numbers without using an array.

int iReverse(int iNum)
{
    int iRev = 0;
    while (iNum != 0) {
        iRev = iRev * 10 + iNum % 10;
        iNum /= 10;
    }
    return iRev;
}

Q49: Write a function to find the nth item from the end of a linked list in a single pass.

A49: I would think keeping a gap of "n" between fptr and sptr would do. Then, advance both together until fptr->next (fptr is the one in front) is NULL.
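The gap idea can be sketched as follows. The ListNode type and the function name are assumptions of this example, and n == 1 is taken to mean the last node. The sketch runs the front pointer off the end of the list instead of stopping at the last node, which is a small variation on the description above.

```cpp
#include <cstddef>

struct ListNode {            // hypothetical node type for this sketch
    int key;
    ListNode *next;
};

// Returns the n-th node from the end (n == 1 is the last node),
// or NULL if the list has fewer than n nodes.
ListNode *NthFromEnd(ListNode *head, unsigned n) {
    if (n == 0) return NULL;
    ListNode *front = head;
    for (unsigned i = 0; i < n; i++) {   // open a gap of n nodes
        if (!front) return NULL;
        front = front->next;
    }
    ListNode *back = head;
    while (front) {                      // advance both until front runs off the end
        front = front->next;
        back = back->next;
    }
    return back;
}
```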

Aren't you traversing the list twice, once with fptr and a second time with sptr? I think you should maintain a queue of pointers. Keep pushing a pointer into the queue, and whenever the size of the queue becomes greater than n, remove the pointer at the head of the queue. When you reach the end of the list, the element at the head of the queue is a pointer to the nth element from the end.

#include <queue>
using namespace std;

PNODE GetNthLastElement(PNODE pCur, unsigned nOffset)
{
    queue<PNODE> Q;

    if (nOffset == 0) return NULL;

    // Fill the queue with the first nOffset nodes
    for (; pCur && Q.size() < nOffset; Q.push(pCur), pCur = pCur->Next) ;

    if (Q.size() < nOffset) return NULL;

    // Slide the window of nOffset nodes to the end of the list
    while (pCur) {
        Q.pop();
        Q.push(pCur);
        pCur = pCur->Next;
    }
    return Q.front();
}

Q50: Give me an algorithm for telling me the number I didn't give you in a given range of numbers (numbers are given at random).
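One possible approach, sketched under the assumption that exactly one number of the range [lo, hi] is withheld and the rest arrive once each in any order: XOR every expected value with every received value, and the pairs cancel. The function name and signature are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Given every integer in [lo, hi] except one, in any order, returns the
// missing one. Assumes hi is below the largest int so the loop can terminate.
int MissingNumber(const std::vector<int> &given, int lo, int hi) {
    int acc = 0;
    for (int v = lo; v <= hi; v++) acc ^= v;                          // all expected values
    for (std::size_t i = 0; i < given.size(); i++) acc ^= given[i];   // cancel the received ones
    return acc;
}
```

Summing the range and subtracting the received values works as well, but XOR cannot overflow.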

Q51: Write a random number generator in a specified range.

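#include <stdlib.h>
#include <algorithm>
using namespace std;

int random_range(int lowest_number, int highest_number)
{
    if (lowest_number > highest_number) {
        swap(lowest_number, highest_number);
    }

    int range = highest_number - lowest_number + 1;
    // Scale rand() into [0, 1) first so the multiplication cannot overflow int
    return lowest_number + int(range * (rand() / (RAND_MAX + 1.0)));
}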

Q52: Delete a node from a single linked list.

Node *DelSLLNode(Node *pDelete, Node *pHead)
{
    if (pHead == pDelete)
        return (pHead = pHead->Next);

    Node *pPrev = pHead;
    for ( ; pPrev->Next; pPrev = pPrev->Next) {
        if (pPrev->Next == pDelete) {
            pPrev->Next = pPrev->Next->Next;    // unlink pDelete
            break;
        }
    }
    return pHead;
}

Q53: Say you have three integers between 0 and 9. You have the equation: A! + B! + C! = ABC (where ABC is a three-digit number, not A * B * C). Find A, B, and C that satisfy this equation.

A53: 1! + 4! + 5! = 145
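The answer can be checked by brute force over all digit combinations. The sketch below assumes A is at least 1, so that ABC really has three digits; the function name is made up for the example.

```cpp
#include <vector>

// Returns all numbers 100*A + 10*B + C with A in 1..9 and B, C in 0..9
// such that A! + B! + C! equals the number itself.
std::vector<int> FactorialDigitSums() {
    int fact[10];
    fact[0] = 1;
    for (int d = 1; d < 10; d++) fact[d] = fact[d - 1] * d;   // 0! .. 9!

    std::vector<int> found;
    for (int a = 1; a < 10; a++)
        for (int b = 0; b < 10; b++)
            for (int c = 0; c < 10; c++)
                if (fact[a] + fact[b] + fact[c] == 100 * a + 10 * b + c)
                    found.push_back(100 * a + 10 * b + c);
    return found;
}
```

The search visits only 900 combinations and returns the single solution 145.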

Q54: Given 2 nodes in a tree, how do you find the common root of the nodes?

Q99: Write a small lexical analyzer for expressions like a(b|c)d*e+.

enum State {
    s0 = 0, s1, s2, s3,             // states
    m0 = 0, m1, m2, m3, acc, err    // actions: matches, accept or error
};

State DecisionTable[4][6] = {
//    a    b    c    d    e   other    // input
    { m1, err, err, err, err, err },   // s0
    { err, m2,  m2, err, err, err },   // s1
    { err, err, err, m2,  m3, err },   // s2
    { acc, acc, acc, acc, m3, acc }    // s3
};

State IsAcceptable(char *&theIStream)
{
    State stateCur = s0, theDecision = err;

    do {
        char theChar = *theIStream++;
        // Map 'a'..'e' to columns 0..4 and everything else to "other"
        int theInput = (theChar >= 'a' && theChar <= 'e') ? theChar - 'a' : 5;
        theDecision = DecisionTable[stateCur][theInput];

        switch (theDecision) {
        default:
            stateCur = theDecision;
            break;
        case err:
        case acc:
            ;   // do nothing
        }
    } while (theDecision != err && theDecision != acc);

    return theDecision;
}

The Six Phases of a Compiler

Front End
  o Lexical Analysis - Token Stream
  o Syntactic Analysis - Parse Tree
  o Semantic Analysis - Parse Tree
  o Intermediate Code Generation - IR

Back End
  o Code Optimization - IR
  o Code Generation - Target Program

Error Handling and Symbol Table

Two Parts - Analysis and Synthesis

The Six Components of a Compiler

Scanner, Lexer, or Lexical Analyzer
  o Groups characters into tokens - the basic unit of syntax (the character string forming a token is a lexeme)
  o Eliminates white space (blanks, tabs and returns)

Parser, Syntactic Analyzer
  o Groups tokens into grammatical phrases
  o Represents the grammatical phrases as a parse tree
  o The syntax of a language is specified by a context-free grammar

Semantic Analyzer
  o Checks for variables being redefined and for procedures being called w/ the right number and types of args
  o Type checking with permitted coercions, e.g. detecting an operator called w/ incompatible types

Intermediate Code Generator

Code Optimizer
  o VC++ can perform copy propagation and dead store elimination, common subexpression elimination, register allocation, function inlining, loop optimizations, flow graph optimizations, peephole optimizations, scheduling, and stack packing.

Code Generator

Lexers - Finite State Automata (FSA)

LEX implements this by creating a C program that consists of a general algorithm and a decision table specific to the DFSA:

state = 0; get next input character
while (not end of input) {
    depending on current state and input character
        match:  /* input expected */
            calculate new state; get next input character
        accept: /* current pattern completely matched */
            perform action corresponding to pattern; state = 0
        error:  /* input unexpected */
            reset input; report error; state = 0
}

Parsers - Push Down Automata (PDA)

There are two important classes of parsers, known as top-down parsers and bottom-up parsers. They are important because they provide a good trade-off between speed (they read the input exactly once, from left-to-right) and power (they can deal w/ most computer languages, but are confused by some grammars).

Top-Down Parsers

Here, the stack of the PDA is used to hold what we are expecting to see in the input. Assuming the input is correct, we repeatedly match the top-of-stack to the next input symbol and then discard them, or else replace the top-of-stack by an expansion from the grammar rules.

initialise stack to be the start symbol followed by end-of-input
repeat
    if top-of-stack == next input
        match -- pop stack, get next input symbol
    else if top-of-stack is non-terminal and lookahead (e.g. next input) is as expected
        expand -- pop stack (= LHS of grammar rule) and push expansion for it (= RHS of grammar rule) in reverse order
    else
        error -- expected (top-of-stack) but saw (next input)
until stack empty or end-of-input

e.g. recognising a , b , c using:
    list : id tail;
    tail : ',' id tail | ;

($ represents end-of-input)

stack           next input   rest of input   action
$ list          a            , b , c $       expand list to id tail
$ tail id       a            , b , c $       match id a
$ tail          ,            b , c $         expand tail to ',' id tail
$ tail id ,     ,            b , c $         match ,
$ tail id       b            , c $           match id b
$ tail          ,            c $             expand tail to ',' id tail
$ tail id ,     ,            c $             match ,
$ tail id       c            $               match id c
$ tail          $                            expand tail to (nothing)
$               $                            accept
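The trace above can be reproduced with a small stack-driven recognizer for the same grammar. This is a sketch under simplifying assumptions: each letter counts as an id token, spaces are skipped, and the names Symbol, Tokenize and ParseList are invented for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <stack>
#include <string>
#include <vector>

// Grammar symbols for: list : id tail ;  tail : ',' id tail | ;
enum Symbol { kList, kTail, kId, kComma, kEnd };

// Tokenizer assumption for this sketch: each letter is an id, ',' is a comma,
// spaces are skipped, and any other character makes the input invalid.
static bool Tokenize(const std::string &text, std::vector<Symbol> &tokens) {
    for (std::size_t i = 0; i < text.size(); i++) {
        unsigned char ch = text[i];
        if (ch == ' ') continue;
        if (std::isalpha(ch)) tokens.push_back(kId);
        else if (ch == ',') tokens.push_back(kComma);
        else return false;
    }
    tokens.push_back(kEnd);
    return true;
}

// Returns true if text is a comma-separated list of ids.
bool ParseList(const std::string &text) {
    std::vector<Symbol> tokens;
    if (!Tokenize(text, tokens)) return false;

    std::stack<Symbol> expect;      // what we still expect to see
    expect.push(kEnd);
    expect.push(kList);
    std::size_t pos = 0;
    while (!expect.empty()) {
        Symbol top = expect.top();
        Symbol next = tokens[pos];
        expect.pop();
        if (top == kList && next == kId) {           // expand list -> id tail
            expect.push(kTail);
            expect.push(kId);
        } else if (top == kTail && next == kComma) { // expand tail -> ',' id tail
            expect.push(kTail);
            expect.push(kId);
            expect.push(kComma);
        } else if (top == kTail && next == kEnd) {   // expand tail -> (nothing)
        } else if (top == next) {                    // match a terminal
            pos++;
        } else {
            return false;                            // expected top, saw next
        }
    }
    return pos == tokens.size();
}
```

Right-hand sides are pushed in reverse order so that the leftmost symbol of the expansion ends up on top of the stack, as the pseudocode requires.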

Bottom-Up Parsers

Here, the stack of the PDA is used to hold what we have already seen in the input. Assuming the input is correct, we repeatedly shift input symbols onto the stack until what we have matches a grammar rule, and then we reduce the grammar rule to its left-hand-side.

initialise stack (to end-of-input marker)
repeat
    if complete grammar rule on stack and lookahead (e.g. next input) is as expected
        reduce -- pop grammar rule and push LHS
    else if lookahead (e.g. next input) is as expected
        shift -- push next input, get next input symbol
    else
        error
until stack==start symbol and at end-of-input

e.g. recognising a , b , c using:
    list : id | list ',' id;

stack           next input   rest of input   action
$               a            , b , c $       shift id a
$ id            ,            b , c $         reduce id to list
$ list          ,            b , c $         shift ,
$ list ,        b            , c $           shift id b
$ list , id     ,            c $             reduce list ',' id to list
$ list          ,            c $             shift ,
$ list ,        c            $               shift id c
$ list , id     $                            reduce list ',' id to list
$ list          $                            accept

e.g. here is the y.output and corresponding decision table for list : id | list ',' id ;

state 0
    $accept : _list $end
    id      shift 1
    .       error
    list    goto 2
state 1
    list : id_    (1)
    .       reduce 1
state 2
    $accept : list_$end
    list : list_, id
    $end    accept
    ,       shift 3
    .       error
state 3
    list : list ,_id
    id      shift 4
    .       error
state 4
    list : list , id_    (2)
    .       reduce 2

state   terminal symbols                                  non-terminals
        id                            ','       $end      list
0       shift 1                       .         .         goto 2
1       reduce (list : id)            .         .         .
2       .                             shift 3   accept    .
3       shift 4                       .         .         .
4       reduce (list : list ',' id)   .         .         .

References: How LEX and YACC work and Inside Lex and Yacc
