Introduction to Algorithms

Algorithm Analysis & Data Structures

Jaideep Srivastava

Schedule of topics

Lecture 1: Algorithm analysis
• Concept: what is it?
• Importance: why do it?
• Examples: lots of them
• Formalism

Lecture 2: Recursion
• Concept
• Examples

Lecture 3: Trees
• Concept & properties
• Tree algorithms

Introduction

A famous quote: Program = Algorithm + Data Structure.

All of you have programmed, and thus have already been exposed to algorithms and data structures. Perhaps you didn't see them as separate entities. Perhaps you saw data structures as simple programming constructs (provided by the STL, the Standard Template Library).

However, data structures are quite distinct from algorithms, and very important in their own right.

Lecture 1 – Algorithm Analysis & Complexity

Objectives

The main focus of the course is to introduce you to a systematic study of algorithms and data structures. The two guiding principles of the course are abstraction and formal analysis.

• Abstraction: We focus on topics that are broadly applicable to a variety of problems.
• Analysis: We want a formal way to compare two objects (data structures or algorithms). In particular, we will worry about "always correct"-ness, and worst-case bounds on time and memory (space).

What is Algorithm Analysis For

Foundations of Algorithm Analysis and Data Structures.

Analysis:
• How to predict an algorithm’s performance
• How well an algorithm scales up
• How to compare different algorithms for a problem

Data Structures:
• How to efficiently store, access, and manage data
• Data structures affect an algorithm’s performance

Example Algorithms

Two algorithms for computing the factorial. Which one is better?

Recursive:

    int factorial (int n) {
        if (n <= 1) return 1;
        else return n * factorial(n-1);
    }

Iterative:

    int factorial (int n) {
        if (n <= 1) return 1;
        else {
            int fact = 1;
            for (int k = 2; k <= n; k++)
                fact *= k;
            return fact;
        }
    }

Examples of famous algorithms

• Constructions of Euclid
• Newton's root finding
• Fast Fourier Transform (signal processing)
• Compression (Huffman, Lempel-Ziv, GIF, MPEG)
• DES, RSA encryption (network security)
• Simplex algorithm for linear programming (optimization)
• Shortest Path Algorithms (Dijkstra, Bellman-Ford)
• Error correcting codes (CDs, DVDs)
• TCP congestion control, IP routing (computer networks)
• Pattern matching (Genomics)
• Search Engines (www)

Role of Algorithms in Modern World

Enormous amount of data:
• Network traffic (telecom billing, monitoring)
• Database transactions (sales, inventory)
• Scientific measurements (astrophysics, geology)
• Sensor networks, RFID tags. Radio frequency identification (RFID) is a method of remotely storing and retrieving data using devices called RFID tags.
• Bioinformatics (genome, protein bank)

A real-world Problem

Communication in the Internet:
• A message (email, ftp) is broken down into IP packets.
• Sender and receiver are identified by IP address.
• The packets are routed through the Internet by special computers called routers.
• Each packet is stamped with its destination address, but not the route.
• Because the Internet topology and network load are constantly changing, routers must discover routes dynamically.

What should the routing table look like?

IP Prefixes and Routing

Each router is really a switch: it receives packets at several input ports, and appropriately sends them out to output ports.

Thus, for each packet, the router needs to transfer the packet to the output port that gets it closer to its destination.

Should each router keep a table: IP address x Output Port?
• How big is this table?
• When a link or router fails, how much information would need to be modified?

A router typically forwards several million packets/sec!

Data Structures

IP packet forwarding is a data structure problem! Efficiency and scalability are very important.

Similarly, how does Google find the documents matching your query so fast? It uses sophisticated algorithms to create index structures, which are just data structures.

Algorithms and data structures are ubiquitous. With the data glut created by the new technologies, the need to organize, search, and update MASSIVE amounts of information FAST is more severe than ever before.

Algorithms to Process these Data

• Which are the top K sellers?
• Correlation between time spent at a web site and purchase amount?
• Which flows at a router account for > 1% of traffic?
• Did source S send a packet in the last s seconds?
• Send an alarm if any international arrival matches a profile in the database
• Similarity matches against genome databases
• Etc.

Max Subsequence Problem

Given a sequence of integers A1, A2, …, An, find the maximum possible sum of a contiguous subsequence Ai, …, Aj. Numbers can be negative. You want a contiguous chunk with the largest sum.

Example: -2, 11, -4, 13, -5, -2. The answer is 20 (subsequence A2 through A4).

We will discuss 4 different algorithms, with time complexities O(n^3), O(n^2), O(n log n), and O(n).

With n = 10^6, algorithm 1 may take > 10 years; algorithm 4 will take a fraction of a second!

Algorithm 1 for Max Subsequence Sum

Given A1, …, An, find the maximum value of Ai + Ai+1 + ··· + Aj (0 if the max value is negative).

    int maxSum = 0;                              // O(1)
    for( int i = 0; i < a.size( ); i++ )
        for( int j = i; j < a.size( ); j++ )
        {
            int thisSum = 0;                     // O(1)
            for( int k = i; k <= j; k++ )        // O(j - i)
                thisSum += a[ k ];
            if( thisSum > maxSum )               // O(1)
                maxSum = thisSum;                // O(1)
        }
    return maxSum;

The loop over j costs O(\sum_{j=i}^{n-1} (j - i)), and the loop over i costs O(\sum_{i=0}^{n-1} \sum_{j=i}^{n-1} (j - i)).

Time complexity: O(n^3)

Algorithm 2

Idea: Given the sum from i to j-1, we can compute the sum from i to j in constant time.

This eliminates one nested loop, and reduces the running time to O(n^2).

    int maxSum = 0;
    for( int i = 0; i < a.size( ); i++ )
    {
        int thisSum = 0;
        for( int j = i; j < a.size( ); j++ )
        {
            thisSum += a[ j ];
            if( thisSum > maxSum )
                maxSum = thisSum;
        }
    }
    return maxSum;

Algorithm 3

This algorithm uses the divide-and-conquer paradigm.

Suppose we split the input sequence at the midpoint.

The max subsequence is entirely in the left half, entirely in the right half, or it straddles the midpoint.

Example:

    left half   |  right half
    4 -3  5 -2  |  -1  2  6 -2

Max in left is 6 (A1 through A3); max in right is 8 (A6 through A7). But the straddling max is 11 (A1 through A7).

Algorithm 3 (cont.)

Example:

    left half   |  right half
    4 -3  5 -2  |  -1  2  6 -2

Max subsequences in each half are found by recursion. How do we find the straddling max subsequence?

Key observation:
• The left half of the straddling sequence is the max subsequence ending with -2.
• The right half is the max subsequence beginning with -1.

A linear scan lets us compute these in O(n) time.
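The divide-and-conquer idea above can be sketched in C++ roughly as follows. This is only an illustrative sketch, not code from the slides: the names maxSubSumRec and maxSubSum3 are made up here, and the convention that an all-negative input yields 0 is carried over from Algorithm 1.

```cpp
#include <algorithm>
#include <vector>

// Maximum contiguous subsequence sum of a[left..right] (inclusive).
// Assumed convention: the result is 0 if every possible sum is negative.
int maxSubSumRec(const std::vector<int>& a, int left, int right) {
    if (left > right) return 0;                       // empty range
    if (left == right) return std::max(a[left], 0);   // single element

    int mid = left + (right - left) / 2;
    int maxLeft = maxSubSumRec(a, left, mid);          // entirely in left half
    int maxRight = maxSubSumRec(a, mid + 1, right);    // entirely in right half

    // Best sum of a run ending exactly at mid (0 if all such sums are negative).
    int bestLeftBorder = 0, sum = 0;
    for (int i = mid; i >= left; --i) {
        sum += a[i];
        bestLeftBorder = std::max(bestLeftBorder, sum);
    }

    // Best sum of a run starting exactly at mid + 1.
    int bestRightBorder = 0;
    sum = 0;
    for (int j = mid + 1; j <= right; ++j) {
        sum += a[j];
        bestRightBorder = std::max(bestRightBorder, sum);
    }

    // The answer is the best of: left only, right only, straddling.
    return std::max({maxLeft, maxRight, bestLeftBorder + bestRightBorder});
}

int maxSubSum3(const std::vector<int>& a) {
    return maxSubSumRec(a, 0, static_cast<int>(a.size()) - 1);
}
```

On the example sequence 4, -3, 5, -2, -1, 2, 6, -2 the two border scans give 4 and 7, so the straddling candidate is 11, which beats the left-only (6) and right-only (8) candidates.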

Algorithm 3: Analysis

The divide and conquer is best analyzed through a recurrence:

    T(1) = 1
    T(n) = 2T(n/2) + O(n)

This recurrence solves to T(n) = O(n log n).
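One way to see where the O(n log n) bound comes from is to unroll the recurrence. The sketch below assumes, for simplicity, that n is a power of two and that the O(n) term is exactly cn for some constant c; neither assumption is stated on the slide.

```latex
\begin{aligned}
T(n) &= 2T(n/2) + cn \\
     &= 4T(n/4) + 2cn \\
     &= 8T(n/8) + 3cn \\
     &\;\;\vdots \\
     &= 2^k\,T(n/2^k) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n) \\
     &= O(n \log n)
\end{aligned}
```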

Algorithm 4

Time complexity is clearly O(n). But why does it work? I.e., we need a proof of correctness.

Example input: 2, 3, -2, 1, -5, 4, 1, -3, 4, -1, 2

    int maxSum = 0, thisSum = 0;
    for( int j = 0; j < a.size( ); j++ )
    {
        thisSum += a[ j ];
        if ( thisSum > maxSum )
            maxSum = thisSum;
        else if ( thisSum < 0 )
            thisSum = 0;
    }
    return maxSum;

Proof of Correctness

• The max subsequence cannot start or end at a negative Ai.
• More generally, the max subsequence cannot have a prefix with a negative sum. Ex: -2 11 -4 13 -5 -2
• Thus, if we ever find that Ai through Aj sums to < 0, then we can advance i to j+1.
• Proof. Suppose j is the first index after i at which the sum becomes < 0. The max subsequence cannot start at any p between i and j, because the sum of Ai through Ap-1 is non-negative, so starting at i would have been at least as good.

    int maxSum = 0, thisSum = 0;
    for( int j = 0; j < a.size( ); j++ )
    {
        thisSum += a[ j ];
        if ( thisSum > maxSum )
            maxSum = thisSum;
        else if ( thisSum < 0 )
            thisSum = 0;
    }
    return maxSum;

• The algorithm resets thisSum whenever the running prefix sum is < 0. Otherwise, it forms new sums and updates maxSum in one pass.

Why Efficient Algorithms Matter

Suppose N = 10^6.
• A PC can read/process N records in 1 sec.
• But if some algorithm does N*N computation, then it takes 1M seconds = 11 days!!!

100-city Traveling Salesman Problem:
• A supercomputer checking 100 billion tours/sec still requires 10^100 years!

Fast factoring algorithms can break encryption schemes. Algorithms research determines what is a safe code length (> 100 digits).

How to Measure Algorithm Performance

What metric should be used to judge algorithms?
• Length of the program (lines of code)
• Ease of programming (bugs, maintenance)
• Memory required
• Running time

Running time is the dominant standard.
• Quantifiable and easy to compare
• Often the critical bottleneck

Abstraction

An algorithm may run differently depending on:
• the hardware platform (PC, Cray, Sun)
• the programming language (C, Java, C++)
• the programmer (you, me, Bill Joy)

While different in detail, all hardware and programming models are equivalent in some sense: Turing machines.

It suffices to count basic operations.

This is a crude but valuable measure of an algorithm’s performance as a function of input size.

Average, Best, and Worst-Case

On which input instances should the algorithm’s performance be judged?

• Average case: real-world distributions are difficult to predict
• Best case: seems unrealistic
• Worst case: gives an absolute guarantee

We will use the worst-case measure.

Examples

Vector addition Z = A+B

    for (int i=0; i<n; i++)
        Z[i] = A[i] + B[i];

T(n) = c n

Vector (inner) multiplication z = A*B

    z = 0;
    for (int i=0; i<n; i++)
        z = z + A[i]*B[i];

T(n) = c’ + c1 n

Examples

Vector (outer) multiplication Z = A*B^T

    for (int i=0; i<n; i++)
        for (int j=0; j<n; j++)
            Z[i][j] = A[i] * B[j];

T(n) = c2 n^2

A program does all of the above: T(n) = c0 + c1 n + c2 n^2

Simplifying the Bound

T(n) = ck n^k + ck-1 n^(k-1) + ck-2 n^(k-2) + … + c1 n + c0

• Too complicated
• Too many terms
• Difficult to compare two expressions, each with 10 or 20 terms

Do we really need that many terms?

Simplifications

Keep just one term!
• The fastest growing term (it dominates the runtime)
• No constant coefficients are kept: constant coefficients are affected by machines, languages, etc.

Asymptotic behavior (as n gets large) is determined entirely by the leading term.

Example: T(n) = 10 n^3 + n^2 + 40n + 800
• If n = 1,000, then T(n) = 10,001,040,800
• The error is 0.01% if we drop all but the n^3 term

In an assembly line the slowest worker determines the throughput rate.

Simplification

Drop the constant coefficient:
• Does not affect the relative order

Simplification

The faster growing term (such as 2^n) will eventually outgrow the slower growing terms (e.g., 1000 n) no matter what their coefficients are!

Put another way, given a certain increase in allocated time, a higher order algorithm will not reap the benefit by solving a much larger problem.

Complexity and Tractability

Assume the computer does 1 billion ops per sec.

Running time T(n) for input size n:

    n      n        n log n    n^2       n^3       n^4           n^10           2^n
    10     .01 μs   .03 μs     .1 μs     1 μs      10 μs         10 s           1 μs
    20     .02 μs   .09 μs     .4 μs     8 μs      160 μs        2.84 h         1 ms
    30     .03 μs   .15 μs     .9 μs     27 μs     810 μs        6.83 d         1 s
    40     .04 μs   .21 μs     1.6 μs    64 μs     2.56 ms       121 d          18 m
    50     .05 μs   .28 μs     2.5 μs    125 μs    6.25 ms       3.1 y          13 d
    100    .1 μs    .66 μs     10 μs     1 ms      100 ms        3171 y         4×10^13 y
    10^3   1 μs     9.96 μs    1 ms      1 s       16.67 m       3.17×10^13 y   32×10^283 y
    10^4   10 μs    130 μs     100 ms    16.67 m   115.7 d       3.17×10^23 y
    10^5   100 μs   1.66 ms    10 s      11.57 d   3171 y        3.17×10^33 y
    10^6   1 ms     19.92 ms   16.67 m   31.71 y   3.17×10^7 y   3.17×10^43 y

    log n   n    n log n   n^2     n^3      2^n
    0       1    0         1       1        2
    1       2    2         4       8        4
    2       4    8         16      64       16
    3       8    24        64      512      256
    4       16   64        256     4,096    65,536
    5       32   160       1,024   32,768   4,294,967,296

[Figure: growth curves of log n, n, n log n, n^2, n^3, and 2^n plotted against n.]

Another View

More resources (time and/or processing power) translate into larger problems solved if complexity is low.

    T(n)     Problem size solved   Problem size solved   Increase in
             in 10^3 sec           in 10^4 sec           problem size
    100n     10                    100                   10
    1000n    1                     10                    10
    5n^2     14                    45                    3.2
    n^3      10                    22                    2.2
    2^n      10                    13                    1.3

Asymptotics

    T(n)             keep one term   drop coefficient
    3n^2 + 4n + 1    3n^2            n^2
    101n^2 + 102     101n^2          n^2
    15n^2 + 6n       15n^2           n^2
    an^2 + bn + c    an^2            n^2

They all have the same “growth” rate.

Caveats

Follow the spirit, not the letter:
• A 100n algorithm is more expensive than an n^2 algorithm when n < 100

Other considerations:
• a program used only a few times
• a program run on small data sets
• ease of coding, porting, maintenance
• memory requirements

Asymptotic Notations

• Big-O, “bounded above by”: T(n) = O(f(n)). For some c and N, T(n) ≤ c·f(n) whenever n > N.
• Big-Omega, “bounded below by”: T(n) = Ω(f(n)). For some c > 0 and N, T(n) ≥ c·f(n) whenever n > N. Same as f(n) = O(T(n)).
• Big-Theta, “bounded above and below”: T(n) = Θ(f(n)). T(n) = O(f(n)) and also T(n) = Ω(f(n)).
• Little-o, “strictly bounded above”: T(n) = o(f(n)). T(n)/f(n) → 0 as n → ∞.

By Pictures

• Big-Oh (most commonly used): bounded above
• Big-Omega: bounded below
• Big-Theta: exactly
• Small-o: not as expensive as ...

[Figures: three plots illustrating the bounds, each marking the threshold N0 on the n axis.]

Example

T(n) = n^3 + 2n^2 = O(?) Ω(?)

Candidates: n^0, n^2, n^3, n^5, n^10

Examples

    f(n)                            Asymptotic
    c                               Θ(1)
    \sum_{i=0}^{k} c_i n^i          Θ(n^k)
    \sum_{i=1}^{n} i                Θ(n^2)
    \sum_{i=1}^{n} i^2              Θ(n^3)
    \sum_{i=1}^{n} i^k              Θ(n^{k+1})
    \sum_{i=0}^{n} r^i  (r > 1)     Θ(r^n)
    n!                              Θ(\sqrt{n} (n/e)^n)
    \sum_{i=1}^{n} 1/i              Θ(log n)

Summary (Why O(n)?)

T(n) = ck n^k + ck-1 n^(k-1) + ck-2 n^(k-2) + … + c1 n + c0

• Too complicated
• O(n^k): a single term with the constant coefficient dropped
• Much simpler: extra terms and coefficients do not matter asymptotically
• Other criteria are hard to quantify

Runtime Analysis

Useful rules:
• simple statements (read, write, assign): O(1) (constant)
• simple operations (+ - * / == > >= < <=): O(1)
• sequence of simple statements/operations: rule of sums
• for, do, while loops: rule of products

Runtime Analysis (cont.)

Two important rules:
• Rule of sums: if you do a number of operations in sequence, the runtime is dominated by the most expensive operation.
• Rule of products: if you repeat an operation a number of times, the total runtime is the runtime of the operation multiplied by the iteration count.

    if (cond) then      O(1)
        body1           T1(n)
    else
        body2           T2(n)
    endif

T(n) = O(max(T1(n), T2(n)))

Method calls:
• A calls B
• B calls C
• etc.

A sequence of operations when call sequences are flattened:

T(n) = max(TA(n), TB(n), TC(n))

Example

    for (i=1; i<n; i++)
        if A(i) > maxVal then
            maxVal = A(i);
            maxPos = i;
        endif
    endfor

Asymptotic Complexity: O(n)

Example

    for (i=1; i<=n-1; i++)
        for (j=n; j>=i+1; j--)
            if (A(j-1) > A(j)) then
                temp = A(j-1);
                A(j-1) = A(j);
                A(j) = temp;
            endif
        endfor
    endfor

Asymptotic Complexity is O(n^2)
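To make the quadratic bound concrete, here is a hedged C++ sketch of the same bubble sort that also counts comparisons. The function name bubbleSort, the 0-based indexing, and the comparison counter are illustrative choices, not part of the slide's pseudocode.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort in the style of the pseudocode above, shifted to 0-based
// indexing. Returns the number of element comparisons performed.
std::size_t bubbleSort(std::vector<int>& a) {
    std::size_t comparisons = 0;
    const std::size_t n = a.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {
        // Each pass bubbles the smallest remaining element down to position i.
        for (std::size_t j = n - 1; j > i; --j) {
            ++comparisons;
            if (a[j - 1] > a[j]) std::swap(a[j - 1], a[j]);
        }
    }
    return comparisons;
}
```

The inner loop runs n-1, n-2, ..., 1 times, so the count is always n(n-1)/2 regardless of the input order, which is where the O(n^2) comes from.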

Run Time for Recursive Programs

• T(n) is defined recursively in terms of T(k), k < n.
• The recurrence relations allow T(n) to be “unwound” recursively into some base cases (e.g., T(0) or T(1)).

Examples:
• Factorial
• Hanoi towers

Example: Factorial

    int factorial (int n) {
        if (n <= 1) return 1;
        else return n * factorial(n-1);
    }

factorial(n) = n * (n-1) * (n-2) * … * 1

Each call does a constant amount of work d and then makes one recursive call, down to the base case T(1) = c:

    T(n) = T(n-1) + d
         = T(n-2) + d + d
         = T(n-3) + d + d + d
         = ....
         = T(1) + (n-1)*d
         = c + (n-1)*d
         = O(n)

Example: Factorial (cont.)

    int factorial1(int n) {
        if (n <= 1) return 1;              // O(1)
        else {
            int fact = 1;                  // O(1)
            for (int k = 2; k <= n; k++)   // O(n)
                fact *= k;
            return fact;
        }
    }

Both algorithms are O(n).

Example: Hanoi Towers

Hanoi(n,A,B,C) = Hanoi(n-1,A,C,B) + Hanoi(1,A,B,C) + Hanoi(n-1,C,B,A)

    T(n) = 2T(n-1) + c
         = 2^2 T(n-2) + 2c + c
         = 2^3 T(n-3) + 2^2 c + 2c + c
         = ....
         = 2^(n-1) T(1) + (2^(n-2) + ... + 2 + 1) c
         = (2^(n-1) + 2^(n-2) + ... + 2 + 1) c
         = O(2^n)

Worst Case, Best Case, and Average Case

    template<class T>
    void SelectionSort(T a[], int n)
    {
        for (int size = n; (size > 1); size--) {
            int pos = 0;
            // find largest
            for (int i = 1; i < size; i++)
                if (a[pos] <= a[i]) pos = i;
            Swap(a[pos], a[size - 1]);
        }
    }

Early-terminating version of selection sort: declare "bool sorted = false;" before the outer loop, change its condition to "!sorted && (size > 1)", set "sorted = true;" at the start of each pass, and add "else sorted = false; // out of order" to the inner if.

Worst Case vs. Best Case

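T(N) = O(f(N))

Example: T(N) = 6N + 4, with n0 = 4, c = 7, and f(N) = N: T(N) = 6N + 4 <= c f(N) = 7N for N >= 4.

[Figure: T(N) and c f(N) plotted against N; c f(N) stays above T(N) beyond n0.]

• 7N + 4 = O(N)
• 15N + 20 = O(N)
• N^2 = O(N)?
• N log N = O(N)?
• N log N = O(N^2)?
• N^2 = O(N log N)?
• N^10 = O(2^N)?
• 6N + 4 = Ω(N)? Ω(7N)? Ω(N + 4)? Ω(N^2)? Ω(N log N)?
• N log N = Ω(N^2)?
• 3 = O(1)
• 1000000 = O(1)
• Sum i = O(N)?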

An Analogy: Cooking Recipes

• Algorithms are detailed and precise instructions. Example: bake a chocolate mousse cake.
• Convert raw ingredients into processed output.
• Hardware (PC, supercomputer vs. oven, stove)
• Pots, pans, pantry are data structures.
• Interplay of hardware and algorithms: different recipes for oven, stove, microwave, etc.
• New advances. New models: clusters, Internet, workstations. Microwave cooking, 5-minute recipes, refrigeration.

Lecture 2 - Recursion

What is Recursion?

Recursion is when a function either directly or indirectly makes a call to itself.

Recursion is a powerful problem solving tool.

Many mathematical functions and algorithms are most easily expressed using recursion.

How does it work?

Functions are implemented using an internal stack of activation records.

Each time a function is called a new activation record is pushed on the stack.

When a function returns, the stack is popped and the activation record of the calling method is on top of the stack.

How does it work? (cont.)

The function being called, and whose activation record is being pushed on the stack, can be different from the calling function (e.g., when main calls a function).

The function being called can be a different instance of the calling subprogram.

Each instance of the function has its own parameters and local variables.

Example

Many mathematical functions are defined recursively. For example, the factorial function:

    N! = N * (N-1)!   for N > 0
    0! = 1

We have defined factorial in terms of a smaller (or simpler) instance of itself. We must also define a base case or stopping condition.

Recursive Function Call

A recursive call is a function call in which the called function is the same as the one making the call.

We must avoid making an infinite sequence of function calls (infinite recursion).

Finding a Recursive Solution

Each successive recursive call should bring you closer to a situation in which the answer is known. This is the general (recursive) case.

A case for which the answer is known (and can be expressed without recursion) is called a base case.

General format for many recursive functions

    if (some condition for which answer is known)    // base case
        solution statement
    else                                             // general case
        recursive function call

Tail Recursion

Use only one recursive call, at the end of a function.

    void tail (int i) {
        if (i > 0) {
            System.out.print(i + " ");
            tail(i - 1);
        }
    }

The equivalent iterative version:

    void iterativetail(int i) {
        for ( ; i > 0; i--)
            System.out.print(i + " ");
    }

NonTail Recursion

    void nonTail (int i) {
        if (i > 0) {
            nonTail(i - 1);
            System.out.print(i + " ");
            nonTail(i - 1);
        }
    }

Indirect recursion

    Receive(buffer)
        while buffer is not filled up
            if information is still incoming
                get a char and store it in buffer;
            else exit();
        decode(buffer);

    Decode(buffer)
        decode information in buffer;
        store(buffer);

    Store(buffer)
        transfer information from buffer to file;
        receive(buffer);

Nested recursion

           { 0              if n = 0
    h(n) = { n              if n > 4
           { h(2 + h(2n))   if n <= 4
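The nested definition can be traced with a small C++ sketch. It assumes n is non-negative (negative arguments never reach a base case) and reads the third clause as applying to 1 <= n <= 4, since n = 0 is already covered by the first clause.

```cpp
// Nested recursion: the argument of the outer recursive call is itself the
// result of another recursive call.
// Assumption: n >= 0; the recursion does not terminate for negative n.
int h(int n) {
    if (n == 0) return 0;
    if (n > 4) return n;
    return h(2 + h(2 * n));   // reached only for 1 <= n <= 4
}
```

For instance, h(3) first evaluates the inner call h(6) = 6, then the outer call h(2 + 6) = h(8) = 8.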

Writing a recursive function to find n factorial

DISCUSSION

The function call Factorial(4) should have value 24, because that is 4 * 3 * 2 * 1 .

For a situation in which the answer is known, the value of 0! is 1.

So our base case could be along the lines of

if ( number == 0 ) return 1;

Writing a recursive function to find Factorial(n)

Now for the general case . . .

The value of Factorial(n) can be written as n * the product of the numbers from (n - 1) to 1, that is,

n * (n - 1) * . . . * 1

or, n * Factorial(n - 1)

And notice that the recursive call Factorial(n - 1) gets us “closer” to the base case of Factorial(0).

Recursive Function Example: Factorial

Problem: calculate n! (n factorial)

    n! = 1                      if n = 0
    n! = 1 * 2 * 3 * ... * n    if n > 0

Recursively:

    if n = 0, then n! = 1
    if n > 0, then n! = n * (n-1)!

Factorial Function

    int RecFactorial( /* in */ int n)
    // Calculates n factorial, n!
    // Precondition: n is a non-negative integer
    {
        if (n <= 0)
            return 1;
        else
            return n * RecFactorial(n-1);
    }

    int fact(int n)
    {
        if (n <= 0)
            return 1;
        else
            return n * fact(n-1);
    }

    main(...) {
        ...
        System.out.print(fact(3));
        ...

fact(3) returns 6 to main().

Exponentiation

base^exponent, e.g. 5^3

Could be written as a function:

    Power(base, exp)

Can we write it recursively?

b^e = b * b^(e-1)

What’s the limiting case?

When e = 0 we have b^0, which always equals 1.

Another Recursive Function

    function Power returnsa Num(base, exp)
    // Computes the value of base^exp
    // Pre: exp is a non-negative integer
        if (exp = 0) then
            returns 1
        else
            returns base * Power(base, exp-1)
        endif
    endfunction

Power: x^N = x * x^(N-1) for N > 0, and x^0 = 1.

    main(...) {
        ...
        cout << (Power(2,3));
        ...
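For comparison, the Power pseudocode might look like this in C++. The lowercase name power and the long long return type are choices made for this sketch only; overflow for large results is not handled.

```cpp
// Recursive exponentiation following base^exp = base * base^(exp-1),
// with base^0 = 1 as the base case.
// Precondition (as on the slide): exp is a non-negative integer.
long long power(long long base, int exp) {
    if (exp == 0) return 1;
    return base * power(base, exp - 1);
}
```

A call such as power(2, 3) makes four calls in total (exp = 3, 2, 1, 0), so the running time is O(exp).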

The Puzzle

A man has an infant male-female pair of rabbits in a hutch entirely surrounded by a wall. We wish to know how many rabbits can be bred from this pair in one year, if the nature of the rabbits is such that every month they breed one other male-female pair which begins to breed in the second month after their birth. Assume that no rabbits die during the year.

A Tree Diagram for Fibonacci’s Puzzle

Observations

• The number of rabbit pairs at the beginning of any month equals the number of pairs of the previous month plus the number of new pairs.
• The number of new pairs at the beginning of a month equals the number of pairs two months ago.

One gets the sequence: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233

Fibonacci sequence

Recursive definition:

    f(1) = 1
    f(2) = 1
    f(n) = f(n-1) + f(n-2)   for n > 2

A More Complex Recursive Function

Fibonacci Number Sequence

    if n = 1, then Fib(n) = 1
    if n = 2, then Fib(n) = 1
    if n > 2, then Fib(n) = Fib(n-2) + Fib(n-1)

Numbers in the series: 1, 1, 2, 3, 5, 8, 13, 21, 34, ...

Fibonacci Sequence Function

    function Fib returnsaNum (n)
    // Calculates the nth Fibonacci number
    // Precondition: n is a positive integer
        if ((n = 1) OR (n = 2)) then
            returns 1
        else
            returns Fib(n-2) + Fib(n-1)
        endif
    endfunction //Fibonacci

Tracing with Multiple Recursive Calls

[Figure sequence: step-by-step trace of the recursive calls. Surviving steps:]

    Main Algorithm: answer <- Fib(5)
    Fib(5): Fib returns Fib(3) + Fib(4)
    Fib(1): Fib returns 1
    Fib(3): Fib returns 1 + Fib(2)
    Fib(3): Fib returns 1 + 1
    Main Algorithm: answer <- 5

    Fib(5)
        Fib(4)
            Fib(3)
                Fib(2)
                    Fib(1)
                    Fib(0)
                Fib(1)
            Fib(2)
                Fib(1)
                Fib(0)
        Fib(3)
            Fib(2)
                Fib(1)
                Fib(0)
            Fib(1)

15 calls to Fib to find the 5th Fibonacci number!!!
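The call count can be checked with a small instrumented sketch. The function name fibCounted and the calls counter are hypothetical additions for illustration; the base cases Fib(0) = 0 and Fib(1) = 1 are assumed because the call tree above bottoms out at Fib(1) and Fib(0).

```cpp
// Naive recursive Fibonacci with Fib(0) = 0 and Fib(1) = 1.
// `calls` is incremented once per invocation, including base cases.
long long fibCounted(int n, long long& calls) {
    ++calls;
    if (n < 2) return n;
    return fibCounted(n - 1, calls) + fibCounted(n - 2, calls);
}
```

The number of calls satisfies calls(n) = 1 + calls(n-1) + calls(n-2), so it grows about as fast as the Fibonacci numbers themselves, i.e. exponentially in n.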

Excessive Recursion - Fibonacci recursion

    int iterativeFib ( int n ) {
        if ( n < 2 )
            return n;
        else {
            int i = 2, tmp, current = 1, last = 0;
            for ( ; i <= n; ++i ) {
                tmp = current;
                current += last;
                last = tmp;
            }
            return current;
        }
    }

Rules of Recursion

First two fundamental rules of recursion:
• Base cases: Always have at least one case that can be solved without using recursion.
• Make progress: Any recursive call must make progress towards the base case.

Third fundamental rule of recursion:
• “You gotta believe”: Always assume that the recursive call works.

Fourth fundamental rule of recursion:
• Compound interest rule: Never duplicate work by solving the same instance of a problem in separate recursive calls.

Towers of Hanoi

The puzzle consisted of N disks and three poles: A (the source), B (the destination), and C (the spare)

[Figure sequence: the disks moving step by step among poles A, B, and C.]


A pseudocode description of the solution is:

    Towers(Count, Source, Dest, Spare)
        if (Count is 1)
            Move the disk directly from Source to Dest
        else
        {
            Solve Towers(Count-1, Source, Spare, Dest)
            Solve Towers(1, Source, Dest, Spare)
            Solve Towers(Count-1, Spare, Dest, Source)
        }

    void solveTowers(int count, char source, char dest, char spare)
    {
        if (count == 1)
            cout << "Move disk from pole " << source
                 << " to pole " << dest << endl;
        else {
            solveTowers(count-1, source, spare, dest);
            solveTowers(1, source, dest, spare);
            solveTowers(count-1, spare, dest, source);
        } //end if
    } //end solveTowers
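To connect the recursive solution with the O(2^n) bound derived earlier, the sketch below counts moves instead of printing them. The function hanoiMoves is a hypothetical helper, not part of the slides; pole names are dropped because they do not affect the count.

```cpp
// Counts the moves made by the recursive Towers of Hanoi solution
// for `count` disks. Returns 0 for count <= 0.
unsigned long long hanoiMoves(int count) {
    if (count <= 0) return 0;
    if (count == 1) return 1;          // move one disk directly
    return hanoiMoves(count - 1)       // move count-1 disks to the spare pole
         + 1                           // move the largest disk to the destination
         + hanoiMoves(count - 1);      // move count-1 disks onto the largest
}
```

The count follows M(n) = 2M(n-1) + 1 with M(1) = 1, which solves to 2^n - 1 moves.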

Recall that . . .

Recursion occurs when a function calls itself (directly or indirectly).

Recursion can be used in place of iteration (looping).

Some functions can be written more easily using recursion.

Pascal Triangle (Is this recursive?)

Pascal Triangle

The combinations of n items taken r at a time. For example, three items: a b c; taken 2 at a time: ab ac bc. Thus there are three combinations of 3 items taken 2 at a time.

In general: C(n,r) = n!/(r!(n-r)!). Obviously you can calculate C(n,r) using factorials.

This leads to Pascal Triangle:

    n
    0              1
    1            1   1
    2          1   2   1
    3        1   3   3   1
    4      1   4   6   4   1
    5    1   5  10  10   5   1

This can also be written:

    n \ r   0    1    2    3    4    5
    0       1
    1       1    1
    2       1    2    1
    3       1    3    3    1
    4       1    4    6    4    1
    5       1    5   10   10    5    1

Note from Pascal's Triangle that: C(n,r) = C(n-1, r-1) + C(n-1, r)

This leads to the recurrence for nonnegative r and n:

    C(n,r) = 1                          if r = 0 or r = n
             0                          if r > n
             C(n-1, r-1) + C(n-1, r)    otherwise

This immediately leads to the recursive function for combinations:

    int C(int n, int r)
    {
        if ((r == 0) || (r == n))
            return 1;
        else if (r > n)
            return 0;
        else
            return C(n-1, r-1) + C(n-1, r);
    }
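The recursive C(n, r) recomputes the same entries of Pascal's triangle many times, which runs against the "compound interest rule". One possible alternative, sketched here under the assumption that results fit in an unsigned 64-bit integer, fills the triangle bottom-up so each entry is computed once. The function name combinations is made up for this example.

```cpp
#include <vector>

// Bottom-up Pascal's triangle: each C(i, j) needed for C(n, r) is
// computed exactly once. Returns 0 when r > n or an argument is negative.
unsigned long long combinations(int n, int r) {
    if (n < 0 || r < 0 || r > n) return 0;
    std::vector<std::vector<unsigned long long>> c(
        n + 1, std::vector<unsigned long long>(r + 1, 0));
    for (int i = 0; i <= n; ++i) {
        for (int j = 0; j <= r && j <= i; ++j) {
            // Edges of the triangle are 1; interior entries use the recurrence.
            c[i][j] = (j == 0 || j == i) ? 1 : c[i - 1][j - 1] + c[i - 1][j];
        }
    }
    return c[n][r];
}
```

This does O(n·r) additions, whereas the plain recursive version makes a number of calls proportional to C(n, r) itself.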

What is the value of rose(25)?

    int rose (int n)
    {
        if ( n == 1 )            // base case
            return 0;
        else                     // general case
            return ( 1 + rose ( n / 2 ) );
    }

Finding the value of rose(25)

    rose(25)                                    the original call
    = 1 + rose(12)                              first recursive call
    = 1 + ( 1 + rose(6) )                       second recursive call
    = 1 + ( 1 + ( 1 + rose(3) ) )               third recursive call
    = 1 + ( 1 + ( 1 + ( 1 + rose(1) ) ) )       fourth recursive call
    = 1 + 1 + 1 + 1 + 0
    = 4

Writing recursive functions

There must be at least one base case, and at least one general (recursive) case. The general case should bring you “closer” to the base case.

The parameter(s) in the recursive call cannot all be the same as the formal parameters in the heading. Otherwise, infinite recursion would occur.

In function rose( ), the base case occurred when (n == 1) was true. The general case brought us a step closer to the base case, because in the general case the call was to rose(n/2), and the argument n/2 was closer to 1 (than n was).

Three-Question Method of verifying recursive functions

Base-Case Question: Is there a nonrecursive way out of the function?

Smaller-Caller Question: Does each recursive function call involve a smaller case of the original problem leading to the base case?

General-Case Question: Assuming each recursive call works correctly, does the whole function work correctly?

Guidelines for writing recursive functions

1. Get an exact definition of the problem to be solved.

2. Determine the size of the problem to be solved on this call to the function. On the initial call, the size of the whole problem is expressed by the actual parameter(s).

3. Identify and solve the base case(s) which have non-recursive solutions.

4. Identify and solve the general case(s) in terms of smaller (recursive) cases of the same problem.

struct ListType

    struct ListType
    {
        int length;                // number of elements in the list
        int info[ MAX_ITEMS ];
    };

    ListType list;

Recursive function to determine if value is in list

PROTOTYPE

    bool ValueInList( ListType list, int value, int startIndex )

[Figure: the array list.info, indexed [0] through [length - 1]. Elements before [startIndex] (74, 36, ..., 95) are already searched; elements from [startIndex] onward (75, 29, 47, ...) still need to be searched. startIndex is the index of the current element to examine.]

    bool ValueInList ( ListType list, int value, int startIndex )
    // Searches list for value between positions startIndex
    // and list.length-1
    {
        if ( list.info[startIndex] == value )           // one base case
            return true;
        else if ( startIndex == list.length - 1 )       // another base case
            return false;
        else                                            // general case
            return ValueInList( list, value, startIndex + 1 );
    }

“Why use recursion?”

Many solutions could have been written without recursion, by using iteration instead. The iterative solution uses a loop, and the recursive solution uses an if statement.

However, for certain problems the recursive solution is the most natural solution. This often occurs when pointer variables are used.

When to Use Recursion

• If the problem is recursive in nature, it is likely that a recursive algorithm will be preferable and will be less complex.
• If the difference in running time between the recursive and non-recursive algorithms is hardly perceivable, the recursive version is better.
• If both recursive and non-recursive algorithms have the same development complexity, a non-recursive version should be preferred.
• A third alternative in some problems is to use table-driven techniques. Sometimes we know that we will use only a few values of a particular function. If this is the case, an implementation using a table would probably suffice and the performance will be better.

    int factorial[8] = {1, 1, 2, 6, 24, 120, 720, 5040};
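A table-driven factorial could be wrapped in a small function like the one below. This is a sketch: the names kFactorialTable and factorialLookup, and the choice to throw on out-of-range input, are assumptions made for illustration.

```cpp
#include <stdexcept>

// Precomputed values 0! through 7!, matching the table above.
const int kFactorialTable[8] = {1, 1, 2, 6, 24, 120, 720, 5040};

// Table-driven factorial: one bounds check plus one array read.
// Throws std::out_of_range for arguments outside [0, 7].
int factorialLookup(int n) {
    if (n < 0 || n >= 8) {
        throw std::out_of_range("factorialLookup: n must be in [0, 7]");
    }
    return kFactorialTable[n];
}
```

Each lookup takes O(1) time, at the cost of supporting only the arguments stored in the table.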

Use a recursive solution when:

The depth of recursive calls is relatively “shallow” compared to the size of the problem.

The recursive version does about the same amount of work as the nonrecursive version.

The recursive version is shorter and simpler than the nonrecursive solution.

SHALLOW DEPTH EFFICIENCY CLARITY

Lecture 3 – Binary Trees

Why Trees?

Limitations of:
• Arrays
• Linked lists
• Stacks
• Queues

Trees: Recursive Definition

• A tree is a collection of nodes.
• The collection can be empty, or consist of a “root” node R and zero or more subtrees.
• There is a “directed edge” from R to the root of each subtree.
• The root of each subtree is a “child” of R. R is the “parent” of each subtree root.

Trees: Recursive Definition (cont.)

[Figure: the root of tree T with directed edges to the roots of its subtrees T1, T2, T3, T4, T5.]

Trees: An Example

[Figure: a tree with root A; nodes by level: A / B C D E / F G H / I.]

Trees: More Definitions

• Nodes with no children are leaves: (C, E, F, H, I).
• Nodes with the same parent are siblings: (B, C, D, E) and (G, H).
• A path from node n to node m is the sequence of directed edges from n to m.
• The length of a path is the number of edges in the path.

Trees: More Definitions (cont.)

The level/depth of node n is the length of the path from the root to n. The level of the root is 0.

The height/depth of a tree is equal to the maximum level of a node in the tree.

The height of a node n is the length of the longest path from n to a leaf. The height of a leaf node is 0.

The height of a tree is equal to the height of the root node.

Binary Trees – An Informal Definition

A binary tree is a tree in which no node can have more than two children.

Each node has 0, 1, or 2 children. In this case we can keep direct links to the children:

struct TreeNode
{
    Object element;
    TreeNode *left_child;
    TreeNode *right_child;
};

Binary Trees – Formal Definition

A binary tree is a structure that contains no nodes, or is comprised of three disjoint sets of nodes:
a root
a binary tree called its left subtree
a binary tree called its right subtree

A binary tree that contains no nodes is called empty

Binary Trees: Recursive Definition

[Figure: the root of tree T, with *left_child pointing to subtree T1 and *right_child pointing to subtree T2]

Differences Between A Tree & A Binary Tree

No node in a binary tree may have more than 2 children, whereas there is no limit on the number of children of a node in a tree.

The subtrees of a binary tree are ordered; those of a tree are not ordered.

Differences Between A Tree & A Binary Tree (cont.)

The subtrees of a binary tree are ordered; those of a tree are not ordered.

[Figure: two trees, each with root a and a single child b]

• Are different when viewed as binary trees

• Are the same when viewed as trees

Internal and External Nodes

Because in a binary tree all the nodes must have the same number of children, we are forced to change the concepts slightly.
We say that all internal nodes have two children.
External nodes have no children.

[Figure: a binary tree with one internal node and one external node labeled]

Recursive definition of a Binary Tree

Most of the concepts related to binary trees can be explained recursively.

For instance, a binary tree is:
An external node, or
An internal node connected to a left binary tree and a right binary tree (called left and right subtrees)

In programming terms we can see that our definition for a (singly) linked list can be modified to have two links from each node instead of one.

What is a binary tree? (cont.)

Property 2: a unique path exists from the root to every other node

Mathematical Properties of Binary Trees

Let us look at some important mathematical properties of binary trees.

A good understanding of these properties will help in understanding the performance of algorithms that process trees.

Some of the properties we'll describe also relate to the structural properties of these trees. This is the case because the performance characteristics of many algorithms depend on these structural properties and not only on the number of nodes.

Minimum Number Of Nodes

Minimum number of nodes in a binary tree whose height is h.

At least one node at each level.

minimum number of nodes is h + 1

Maximum Number Of Nodes

All possible nodes at levels 0 through h are present.

Maximum number of nodes

= $1 + 2 + 4 + 8 + \dots + 2^h = 2^{h+1} - 1$

Number of Nodes & Height

Let n be the number of nodes in a binary tree whose height is h.

$h + 1 \le n \le 2^{h+1} - 1$

$\log_2(n+1) - 1 \le h \le n - 1$

The max height of a tree with N nodes is N - 1 (same as a linked list).

The min height of a tree with N nodes is $\lceil \log_2(N+1) \rceil - 1$.
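The bounds relating node count and height can be checked with a few small helpers. This is a sketch under the slide's convention that height counts edges (a single node has height 0); the function names are illustrative.

```cpp
#include <cassert>

// Fewest nodes a binary tree of height h can have: one node per level.
long long minNodes(int h) { return h + 1; }

// Most nodes a binary tree of height h can have: levels 0..h all full.
long long maxNodes(int h) { return (1LL << (h + 1)) - 1; }

// Smallest height that can hold n nodes: the least h with 2^(h+1) - 1 >= n.
int minHeight(long long n) {
    assert(n >= 1);
    int h = 0;
    while (maxNodes(h) < n) ++h;
    return h;
}

// Largest height with n nodes: a chain, like a linked list.
int maxHeight(long long n) {
    assert(n >= 1);
    return static_cast<int>(n - 1);
}
```

For example, 7 nodes fit in height 2, but an 8th node forces height 3, while the worst case for 7 nodes is height 6.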

Relationship Between Number of Nodes (Internal - External)

A binary tree with N internal nodes has N+1 external nodes

Let's try to prove this using induction...
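One way the induction could go is sketched below. It assumes the recursive view used in these slides, where a binary tree is either a single external node or an internal node with a left and a right binary tree; the symbol $k$ is introduced here only for the proof.

```latex
\textbf{Claim.} A binary tree with $N$ internal nodes has $N+1$ external nodes.

\textbf{Base case} ($N = 0$). The tree is a single external node,
so it has $0 + 1 = 1$ external node.

\textbf{Inductive step} ($N \ge 1$). Assume the claim holds for every tree
with fewer than $N$ internal nodes. The root is internal. Let its left
subtree have $k$ internal nodes, so its right subtree has $N - 1 - k$,
with $0 \le k \le N - 1$. By the hypothesis the subtrees have $k + 1$ and
$(N - 1 - k) + 1 = N - k$ external nodes, and every external node of the
tree lies in exactly one subtree, so the total is
\[
  (k + 1) + (N - k) = N + 1 .
\]
```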

Number of edges

A binary tree with N internal nodes has 2N edges


Number of edges

A binary tree with N nodes (internal and external) has N-1 edges


Binary Tree Representation

Array representation
Linked representation

Binary Trees

Full binary tree: All internal nodes have two children.

Complete binary tree: All leaves have the same level, and all internal nodes have two children.

Node Number Properties

Parent of node i is node ⌊i/2⌋. But node 1 is the root and has no parent.

Left child of node i is node 2i. But if 2i > n, node i has no left child.

Right child of node i is node 2i+1. But if 2i+1 > n, node i has no right child.

[Figure: complete binary tree with 15 nodes, numbered 1 through 15 level by level, left to right]
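The node-number properties translate directly into index arithmetic. The sketch below assumes nodes are numbered 1 through n level by level, and uses 0 as a "no such node" marker; the function names are illustrative.

```cpp
#include <cassert>

// Parent of node i. Integer division gives 1 / 2 == 0, which we read as
// "the root has no parent".
int parentOf(int i) {
    assert(i >= 1);
    return i / 2;
}

// Left child of node i in a tree with n nodes, or 0 if it does not exist.
int leftChildOf(int i, int n) { return (2 * i <= n) ? 2 * i : 0; }

// Right child of node i in a tree with n nodes, or 0 if it does not exist.
int rightChildOf(int i, int n) { return (2 * i + 1 <= n) ? 2 * i + 1 : 0; }
```

With n = 15, node 5 has parent 2 and children 10 and 11, while node 8 is a leaf.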

Full Binary Tree With n Nodes

Start with a complete binary tree that has at least n nodes.
Number the nodes as described earlier.
The binary tree defined by the nodes numbered 1 through n is the unique n node full binary tree.

[Figure: full binary tree with 11 nodes, shown within the complete binary tree whose nodes are numbered 1 through 15]

Array Representation

• Number the nodes using the numbering scheme for a full binary tree
• Store the node numbered i in tree[i]

[Figure: binary tree with nodes a through j, numbered 1 through 10 level by level]

tree[]
index:   0  1  2  3  4  5  6  7  8  9  10
element: -  a  b  c  d  e  f  g  h  i  j

Right-Skewed Binary Tree

An n node binary tree needs an array whose length is between n+1 and $2^n$.

If h = n-1 then skewed binary tree

[Figure: right-skewed binary tree with nodes a, b, c, d at positions 1, 3, 7, 15]

tree[]
index:   0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15
element: -  a  -  b  -  -  -  c  -  -  -  -  -  -  -  d

Array Representation (cont.)

Each tree node is represented as a struct

struct TreeNode {
    object element;
    int child1;
    int child2;
    …
    int childn;
};

struct TreeNode tree[100];

Linked Representation

Each tree node is represented as an object whose data type is TreeNode

The space required by an n node binary tree is n * (space required by one node)

Trees: Linked representation Implementation 1

struct TreeNode
{
    Object element;
    TreeNode *child1;
    TreeNode *child2;
    . . .
    TreeNode *childn;
};

Each node contains a link to all of its children. This isn’t a good idea, because a node can have an arbitrary number of children!

Trees: Linked representation Implementation 2

struct TreeNode
{
    Object element;
    TreeNode *child1;
    TreeNode *sibling;
};

Each node contains a link to its first child and a link to its next sibling. This is a better idea.
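A small sketch of how the first-child/next-sibling layout might be used. It assumes char elements in place of the slides' Object type, and the helpers addChild and countChildren are hypothetical names introduced for illustration.

```cpp
#include <cassert>

struct TreeNode {
    char element;
    TreeNode *child1;   // first child
    TreeNode *sibling;  // next sibling
    explicit TreeNode(char e) : element(e), child1(nullptr), sibling(nullptr) {}
};

// Append c as the last child of p by walking p's chain of children.
void addChild(TreeNode &p, TreeNode &c) {
    if (p.child1 == nullptr) {
        p.child1 = &c;
        return;
    }
    TreeNode *last = p.child1;
    while (last->sibling != nullptr) last = last->sibling;
    last->sibling = &c;
}

// Count the children of p by following the sibling links.
int countChildren(const TreeNode &p) {
    int count = 0;
    for (const TreeNode *c = p.child1; c != nullptr; c = c->sibling) ++count;
    return count;
}
```

Every node stores exactly two pointers no matter how many children it has, which is the point of this representation.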

Implementation 2: Example

[Figure: first-child/next-sibling representation of the example tree with nodes A through I; a "/" marks a null link]

The downward links are to the first child; the horizontal links are to the next sibling.

Binary Trees

A binary tree is a tree whose nodes have at most two offspring

Example

struct nodeType {
    object element;
    struct nodeType *left, *right;
};
struct nodeType *tree;

Linked Representation Example

[Figure: linked representation of a binary tree with nodes a through h; each node has leftChild, element, and rightChild fields, and a root pointer refers to the tree]

Some Binary Tree Operations

• Determine the height.
• Determine the number of nodes.
• Make a clone.
• Determine if two binary trees are clones.
• Display the binary tree.
• Evaluate the arithmetic expression represented by a binary tree.
• Obtain the infix form of an expression.
• Obtain the prefix form of an expression.
• Obtain the postfix form of an expression.
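The first four operations in the list each reduce to a short recursion over the left and right subtrees. The sketch below assumes int elements and a hypothetical Node type with a convenience constructor; it is one possible formulation, not the course's reference code.

```cpp
#include <algorithm>
#include <cassert>

struct Node {
    int element;
    Node *left;
    Node *right;
    Node(int e, Node *l = nullptr, Node *r = nullptr)
        : element(e), left(l), right(r) {}
};

// Height in edges. The empty tree is given height -1 so that a single
// node has height 0, matching the definition of node height in the slides.
int height(const Node *t) {
    if (t == nullptr) return -1;
    return 1 + std::max(height(t->left), height(t->right));
}

// Number of nodes in the tree.
int countNodes(const Node *t) {
    if (t == nullptr) return 0;
    return 1 + countNodes(t->left) + countNodes(t->right);
}

// Deep copy: every node of the result is newly allocated.
Node *clone(const Node *t) {
    if (t == nullptr) return nullptr;
    return new Node(t->element, clone(t->left), clone(t->right));
}

// True when both trees have the same shape and the same elements.
bool sameTree(const Node *a, const Node *b) {
    if (a == nullptr || b == nullptr) return a == b;
    return a->element == b->element &&
           sameTree(a->left, b->left) && sameTree(a->right, b->right);
}

// Free every node of the tree (postorder, children before parent).
void destroy(Node *t) {
    if (t == nullptr) return;
    destroy(t->left);
    destroy(t->right);
    delete t;
}
```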

CS122 Algorithms and Data Structures

Week 7: Binary Search Trees

Binary Expression Trees

Uses for Binary Trees… -- Binary Search Trees

Use for storing and retrieving information
Insert, delete, and search faster than with a linked list
Idea: Store information in an ordered way (keys)

A Property of Binary Search Trees

The key of the root is larger than any key in the left subtree

The key of the root is smaller than any key in the right subtree

Note: Duplicated keys are not allowed

A Property of Binary Search Tree

[Figure: root of tree T holding key X, with *left_child pointing to subtree T1 and *right_child pointing to subtree T2]

All nodes in T1 have keys < X.

All nodes in T2 have keys > X.
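The property constrains whole subtrees, not just a node and its two children. A sketch of a checker that carries the allowed key range down the tree is shown below; the BinaryNode layout with an int key and the function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <climits>

struct BinaryNode {
    int key;
    BinaryNode *left;
    BinaryNode *right;
    BinaryNode(int k, BinaryNode *l = nullptr, BinaryNode *r = nullptr)
        : key(k), left(l), right(r) {}
};

// Every key in the subtree rooted at t must lie strictly between lo and hi.
// long long bounds let any int key pass the initial range.
bool isBstInRange(const BinaryNode *t, long long lo, long long hi) {
    if (t == nullptr) return true;
    if (t->key <= lo || t->key >= hi) return false;
    return isBstInRange(t->left, lo, t->key) &&   // left keys < t->key
           isBstInRange(t->right, t->key, hi);    // right keys > t->key
}

// True when the tree satisfies the binary search tree property
// (duplicates are rejected, as in the slides).
bool isBst(const BinaryNode *t) {
    return isBstInRange(t, LLONG_MIN, LLONG_MAX);
}
```

A tree with root 10, left child 5, and 12 as the right child of 5 passes a naive parent-versus-child comparison but fails this check, because 12 sits inside the left subtree of 10.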

Binary Search Trees in C++

We will use two classes:
The class BinaryNode simply constructs individual nodes in the tree.
The class BinarySearchTree maintains a pointer to the root of the binary search tree and includes methods for inserting and removing nodes.

Search Operation

BinaryNode *search (const int &x, BinaryNode *t)

{

if ( t == NULL ) return NULL;

if (x == t->key) return t; // Match

if ( x < t->key )

return search( x, t->left );

else // t ->key < x

return search( x, t->right );

}

FindMin Operation

BinaryNode* findMin (BinaryNode *t) {

if ( t == NULL )

return NULL;

if ( t -> left == NULL )

return t;

return findMin (t -> left);

}

This method returns a pointer to the node containing the smallest element in the tree.

FindMax Operation

BinaryNode* findMax (BinaryNode *t) {

if ( t == NULL )

return NULL;

if ( t -> right == NULL )

return t;

return findMax (t -> right);

}

This function returns a pointer to the node containing the largest element in the tree.

Insert Operation

To insert X into a tree T, proceed down the tree as you would with a find. If X is found, do nothing. Otherwise insert X at the last spot on the path that has been traversed.

Insert Operation (cont.)

void BinarySearchTree::insert (const int &x, BinaryNode *&t) const

{

if (t == NULL)

t = new BinaryNode (x, NULL, NULL);

else if (x < t->key)

insert(x, t->left);

else if( t->key < x)

insert(x, t->right);

else

; // Duplicate entry; do nothing

}

Note the pointer t is passed using call by reference.

Removal Operation

If the node to be removed is a leaf, it can be deleted immediately.

If the node has one child, the node can be deleted after its parent adjusts a link to bypass the deleted node.

Removal Operation (cont.)

If the node to be removed has two children, the general strategy is to replace the data of this node with the smallest key of the right subtree.

Then the node that held that smallest key is removed from the right subtree (this case is easy, since that node has no left child and so cannot have two children).

Removal Operation (cont.)

void remove (const int &x, BinaryNode* &t) const {
    if ( t == NULL ) return; // key is not found; do nothing
    if ( t->key == x ) {
        if ( t->left != NULL && t->right != NULL ) { // Two children
            t->key = findMin( t->right )->key;   // copy the smallest key of the right subtree
            remove( t->key, t->right );          // then remove that key from the right subtree
        }
        else { // Zero or one child
            BinaryNode *oldNode = t;
            t = ( t->left != NULL ) ? t->left : t->right;
            delete oldNode;
        }
    }
    else { // Recurse into the subtree that may contain x
        if ( x < t->key ) remove( x, t->left );
        else if ( t->key < x ) remove( x, t->right );
    }
}

Deleting by merging

Deleting by merging

Deleting by copying

Balancing a binary tree

A binary tree is height-balanced, or simply balanced, if for every node in the tree the heights of its two subtrees differ by zero or one.

A binary tree is perfectly balanced if it is balanced and all leaves are to be found on one level or two levels.
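The height-balance condition can be tested in one pass by computing heights bottom-up and stopping as soon as some node's subtrees differ by more than one. This is a sketch with a hypothetical Node type and an illustrative sentinel value, not code from the course.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

struct Node {
    int element;
    Node *left;
    Node *right;
    Node(int e, Node *l = nullptr, Node *r = nullptr)
        : element(e), left(l), right(r) {}
};

const int kUnbalanced = -2;  // sentinel: some subtree violates the condition

// Returns the height of t in edges (-1 for the empty tree), or kUnbalanced
// if any node in t has subtrees whose heights differ by more than one.
int balancedHeight(const Node *t) {
    if (t == nullptr) return -1;
    int lh = balancedHeight(t->left);
    if (lh == kUnbalanced) return kUnbalanced;
    int rh = balancedHeight(t->right);
    if (rh == kUnbalanced) return kUnbalanced;
    if (std::abs(lh - rh) > 1) return kUnbalanced;
    return 1 + std::max(lh, rh);
}

bool isBalanced(const Node *t) { return balancedHeight(t) != kUnbalanced; }
```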

Balancing a binary tree

Analysis

The running time of these operations is O(lv), where lv is the level of the node containing the accessed item.

What is the average level of the nodes in a binary search tree? It depends on how well balanced the tree is.

Average Level of Nodes

[Figure: binary search tree with root 10; second level 5, 20; third level 1, 8, 13, 34]

Consider this very well-balanced binary search tree. What is the level of its leaf nodes?

$N = \sum_{lv=0}^{L} 2^{lv} = 2^{L+1} - 1$, so $L = \log_2(N+1) - 1$

N=7

Data Order: 10, 5, 1, 8, 20, 13, 34

A Better Analysis

The analysis on the previous slide was for a particularly well-balanced binary search tree. However, not all binary search trees will be this well balanced.

In particular, binary search trees are created via insertions of data. Depending on the order of the data, various trees will emerge.

Effect of Data Order

Obtained if data is 4, 3, 2, 1

Obtained if data is 1, 2, 3, 4

Note in these cases the average depth of nodes is about N/2, not log(N)!
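The effect of insertion order is easy to observe by building trees from different sequences and measuring their height. The sketch below uses a free-function insert with the same logic as the insert operation shown earlier; heightAfterInserting and the other helper names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct BinaryNode {
    int key;
    BinaryNode *left;
    BinaryNode *right;
    explicit BinaryNode(int k) : key(k), left(nullptr), right(nullptr) {}
};

// Standard BST insertion; t is passed by reference so a null link can be
// replaced by the new node. Duplicates are ignored.
void insert(int x, BinaryNode *&t) {
    if (t == nullptr) t = new BinaryNode(x);
    else if (x < t->key) insert(x, t->left);
    else if (t->key < x) insert(x, t->right);
}

// Height in edges; the empty tree has height -1.
int height(const BinaryNode *t) {
    if (t == nullptr) return -1;
    return 1 + std::max(height(t->left), height(t->right));
}

void destroy(BinaryNode *t) {
    if (t == nullptr) return;
    destroy(t->left);
    destroy(t->right);
    delete t;
}

// Insert the keys in the given order and report the height of the result.
int heightAfterInserting(const std::vector<int> &keys) {
    BinaryNode *root = nullptr;
    for (int k : keys) insert(k, root);
    int h = height(root);
    destroy(root);
    return h;
}
```

The order 10, 5, 1, 8, 20, 13, 34 gives height 2 for 7 keys, while 1, 2, 3, 4 and 4, 3, 2, 1 both give height 3 for only 4 keys, that is, a chain.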

Depth of Nodes

In the best case the depth will be about O(log N).

In the worst case, if the data are already ordered, the depth will be about O(N).

Effects of Data Order…

So, if the input data are randomly ordered, what is the average depth of the nodes?

The analysis is beyond the scope of this course, but it can be shown that the average depth is O(log N), which is a very nice result.

Summary

In this lecture we showed that, for an average binary search tree, the average depth of the nodes is O(log N). This is quite amazing, indicating that the bad situations, which are O(N), don’t occur very often.

However, for those who are still concerned about the very bad situations, we can try to “balance” the trees.

Uses for Binary Trees… -- Binary Expression Trees

Binary trees are a good way to express arithmetic expressions.
The leaves are operands and the other nodes are operators.
The left and right subtrees of an operator node represent subexpressions that must be evaluated before applying the operator at the root of the subtree.

Binary Expression Trees: Examples

[Figure: expression trees for a + b, for - a, and for (a + b) * (c – d) / (e + f)]

Merits of Binary Tree Form

Left and right operands are easy to visualize
Code optimization algorithms work with the binary tree form of an expression
Simple recursive evaluation of expression

[Figure: expression tree with operators +, -, +, *, / over operands a, b, c, d, e, f]

Levels Indicate Precedence

The levels of the nodes in the tree indicate their relative precedence of evaluation (we do not need parentheses to indicate precedence).

Operations nearer the root of the tree are evaluated later than those further from the root.

The operation at the root is always the last operation performed.

A Binary Expression Tree

[Figure: expression tree with root ‘*’, whose left subtree is ‘+’ over ‘4’ and ‘2’ and whose right child is ‘3’]

What value does it have?

( 4 + 2 ) * 3 = 18
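The recursive evaluation of an expression tree can be sketched as follows. It assumes single-digit operands and binary operators only (unary minus is not handled); ExprNode and evaluate are illustrative names.

```cpp
#include <cassert>

struct ExprNode {
    char symbol;  // an operator (+ - * /) or a single-digit operand
    ExprNode *left;
    ExprNode *right;
    ExprNode(char s, ExprNode *l = nullptr, ExprNode *r = nullptr)
        : symbol(s), left(l), right(r) {}
};

// A leaf is an operand. An internal node applies its operator to the
// values of its left and right subtrees, which are evaluated first.
int evaluate(const ExprNode *t) {
    assert(t != nullptr);
    if (t->left == nullptr && t->right == nullptr) return t->symbol - '0';
    int a = evaluate(t->left);
    int b = evaluate(t->right);
    switch (t->symbol) {
        case '+': return a + b;
        case '-': return a - b;
        case '*': return a * b;
        case '/': return a / b;  // integer division
    }
    assert(false);  // unknown operator
    return 0;
}
```

For the tree with root * over (+ over 4 and 2) and 3, the addition is evaluated before the multiplication at the root, giving 18.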

Inorder Traversal: (A + H) / (M - Y)

[Figure: tree with root ‘/’, left subtree ‘+’ over ‘A’ and ‘H’, right subtree ‘-’ over ‘M’ and ‘Y’]

Print left subtree first
Print root second
Print right subtree last

Inorder Traversal (cont.)

[Figure: expression tree for (a + (b * c)) + (((d * e) + f) * g)]

Inorder traversal yields: (a + (b * c)) + (((d * e) + f) * g)

Preorder Traversal: / + A H - M Y

[Figure: tree with root ‘/’, left subtree ‘+’ over ‘A’ and ‘H’, right subtree ‘-’ over ‘M’ and ‘Y’]

Print root first
Print left subtree second
Print right subtree last

Preorder Traversal (cont.)

[Figure: expression tree for (a + (b * c)) + (((d * e) + f) * g)]

Preorder traversal yields: (+ (+ a (* b c)) (* (+ (* d e) f) g))

Postorder Traversal: A H + M Y - /

[Figure: tree with root ‘/’, left subtree ‘+’ over ‘A’ and ‘H’, right subtree ‘-’ over ‘M’ and ‘Y’]

Print left subtree first
Print right subtree second
Print root last

Postorder Traversal (cont.)

[Figure: expression tree for (a + (b * c)) + (((d * e) + f) * g)]

Postorder traversal yields: a b c * + d e * f + g * +

Traversals and Expressions

Note that the postorder traversal produces the postfix representation of the expression.

Inorder traversal produces the infix representation of the expression.

Preorder traversal produces a representation that is the same as the way that the programming language Lisp processes arithmetic expressions!
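The three traversals differ only in where the root is visited relative to its subtrees. The sketch below returns each traversal as a string; it assumes single-character symbols, and the inorder version adds parentheses around every operator node so that the grouping survives. ExprNode and the function names are illustrative.

```cpp
#include <cassert>
#include <string>

struct ExprNode {
    char symbol;
    ExprNode *left;
    ExprNode *right;
    ExprNode(char s, ExprNode *l = nullptr, ExprNode *r = nullptr)
        : symbol(s), left(l), right(r) {}
};

// Root first, then left subtree, then right subtree (prefix form).
std::string preorder(const ExprNode *t) {
    if (t == nullptr) return "";
    return std::string(1, t->symbol) + preorder(t->left) + preorder(t->right);
}

// Left subtree, then root, then right subtree (infix form).
std::string inorder(const ExprNode *t) {
    if (t == nullptr) return "";
    if (t->left == nullptr && t->right == nullptr)
        return std::string(1, t->symbol);
    return "(" + inorder(t->left) + t->symbol + inorder(t->right) + ")";
}

// Left subtree, then right subtree, then root (postfix form).
std::string postorder(const ExprNode *t) {
    if (t == nullptr) return "";
    return postorder(t->left) + postorder(t->right) + std::string(1, t->symbol);
}
```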

Constructing an Expression Tree

There is a simple O(N) stack-based algorithm to convert a postfix expression into an expression tree.

Recall we also have an algorithm to convert an infix expression into postfix, so we can also convert an infix expression into an expression tree without difficulty (in O(N) time).

Expression Tree Algorithm

Read the postfix expression one symbol at a time:
If the symbol is an operand, create a one-node tree and push a pointer to it onto the stack.
If the symbol is an operator, pop two tree pointers T1 and T2 from the stack, and form a new tree whose root is the operator and whose left and right children are T2 and T1 (T1, popped first, is the right operand).
Push the new tree pointer on the stack.
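A sketch of the stack-based construction, assuming single-character tokens where letters and digits are operands and every other non-space character is a binary operator. The first tree popped becomes the right child, so operand order is preserved. The names ExprNode, buildFromPostfix, postorder, and destroy are illustrative.

```cpp
#include <cassert>
#include <cctype>
#include <stack>
#include <string>

struct ExprNode {
    char symbol;
    ExprNode *left;
    ExprNode *right;
    ExprNode(char s, ExprNode *l = nullptr, ExprNode *r = nullptr)
        : symbol(s), left(l), right(r) {}
};

// Build an expression tree from a postfix string in one left-to-right pass.
ExprNode *buildFromPostfix(const std::string &postfix) {
    std::stack<ExprNode *> trees;
    for (char c : postfix) {
        unsigned char uc = static_cast<unsigned char>(c);
        if (std::isspace(uc)) continue;      // spaces only separate tokens
        if (std::isalnum(uc)) {
            trees.push(new ExprNode(c));     // operand: one-node tree
        } else {
            assert(trees.size() >= 2);       // operator needs two operands
            ExprNode *right = trees.top(); trees.pop();  // popped first
            ExprNode *left = trees.top(); trees.pop();   // popped second
            trees.push(new ExprNode(c, left, right));
        }
    }
    assert(trees.size() == 1);               // well-formed input leaves one tree
    return trees.top();
}

// Postorder traversal, used here to check the tree against the input.
std::string postorder(const ExprNode *t) {
    if (t == nullptr) return "";
    return postorder(t->left) + postorder(t->right) + std::string(1, t->symbol);
}

void destroy(ExprNode *t) {
    if (t == nullptr) return;
    destroy(t->left);
    destroy(t->right);
    delete t;
}
```

Each symbol is pushed once and popped at most once, which is where the O(N) bound comes from.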

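Example

a b + :

Note: These stacks are depicted horizontally.

[Figure: the stack first holds the one-node trees a and b; after + is read, it holds a single tree with root + and children a and b]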

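Example

a b + c d e + :

[Figure: the stack first holds the tree for a + b followed by the one-node trees c, d, e; after + is read, it holds the tree for a + b, the tree c, and the tree for d + e]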

Example

a b + c d e + * :

[Figure: the stack holds the tree for a + b and a tree with root * over c and (d + e)]

Example

a b + c d e + * * :

[Figure: the stack holds a single tree with root * over (a + b) and (c * (d + e))]
