TRANSCRIPT
Introduction to Analysis of Algorithms
CS342 S2004
How to assess an algorithm
– Correctness
– Ease of use
– Efficiency as a function of n, the number of data elements
  • Time efficiency
  • Memory or space (RAM) efficiency: less important today.
Running time analysis
• Measure actual execution time: depends on machine architecture (processor) and operating system (hardware and software dependencies); not very useful.
• Tally the number of key computations (operations), such as comparisons or assignments, as a function of n, or T(n), where n is the number of data elements being processed by the algorithm. T(n) is independent of hardware and software. From T(n), one can easily obtain the Big-O estimate, O(g(n)).
Examples
• Selection sort algorithm
  – Key operations: comparison operations
  – T(n) = (n-1) + (n-2) + … + 3 + 2 + 1 = n(n-1)/2 = n²/2 – n/2
• Sequential search algorithm
  – Key operations: comparison operations
  – T(n) = 1, best case (not useful)
  – T(n) = n/2, average case
  – T(n) = n, worst case
• Binary search algorithm
  – Key operations: comparison operations
  – T(n) = 2(1 + log₂ n): each iteration requires 2 comparisons (== and <)
How to obtain Big-O from T(n) intuitively
– The Big-O notation describes the asymptotic behavior of T(n)
  • Drop insignificant terms as n becomes very large
– Obtain the Big-O notation from T(n) by inspection:
  • Look for the dominant term in T(n)
  • Drop the constant factor associated with the dominant term
Examples
• Selection sort: T(n) = n²/2 – n/2; the dominant term is n²/2; dropping the 1/2 gives O(n²) (reads "Big-O of n squared"); the algorithm is quadratic.
• Linear search: T(n) = n; the dominant term is n; the algorithm has an efficiency of O(n) (reads "Big-O of n"); the algorithm is linear.
• Binary search: T(n) = 2(1 + log₂ n); the dominant term is 2 log₂ n; dropping the 2 gives O(log₂ n) ("Big-O of log₂ n"); the algorithm is logarithmic.
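The "drop the constant on the dominant term" step can be sanity-checked numerically. The sketch below (the helper names selectionSortCount and dominantRatio are illustrative, not from the slides) shows that T(n)/n² for selection sort settles toward the constant 1/2, which is exactly the constant that Big-O discards:

```cpp
#include <cassert>

// T(n) for selection sort: n(n-1)/2 key comparisons.
long long selectionSortCount(long long n) {
    return n * (n - 1) / 2;
}

// Ratio T(n)/n^2 approaches the constant 1/2 as n grows,
// which is why the dominant term n^2/2 determines the O(n^2) class.
double dominantRatio(long long n) {
    return static_cast<double>(selectionSortCount(n))
         / (static_cast<double>(n) * n);
}
```

For n = 1,000,000 the ratio is already within 10⁻⁶ of 1/2.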
Some observations
– As n increases, so does execution time.
– Running times of practical algorithms typically grow with n (from best to worst):
  • Constant (does not depend on n)
  • log₂ n (logarithmic)
  • n (linear)
  • n log₂ n
  • n² (quadratic)
  • n³ (cubic)
  • 2ⁿ (exponential)
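The ranking above can be observed directly by evaluating each growth function at a common n. This small sketch (the growth helper is illustrative) does exactly that:

```cpp
#include <cassert>
#include <cmath>

// Evaluate each growth function at a given n so the ordering
// constant < log2 n < n < n log2 n < n^2 < n^3 < 2^n can be seen directly.
double growth(char kind, double n) {
    switch (kind) {
        case 'c': return 1.0;               // constant
        case 'l': return std::log2(n);      // log2 n
        case 'n': return n;                 // linear
        case 'm': return n * std::log2(n);  // n log2 n
        case 'q': return n * n;             // quadratic
        case 'u': return n * n * n;         // cubic
        case 'e': return std::pow(2.0, n);  // exponential
        default:  return 0.0;
    }
}
```

At n = 16 the values are 1, 4, 16, 64, 256, 4096, 65536, matching the best-to-worst ordering.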
Formal definition of the Big-O notation
• Given functions T(n) and g(n), we say that T(n) is O(g(n)) if there are positive constants c and n0 such that
• T(n) ≤ c·g(n) for n ≥ n₀
Example
T(n) = 2n + 10 is O(n):
2n + 10 ≤ cn
(c – 2)n ≥ 10
n ≥ 10/(c – 2)
Pick c = 3 and n₀ = 10 (there is an infinite number of solutions)
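The choice c = 3, n₀ = 10 can be checked by brute force over a range of n. The holdsForRange helper below is an illustrative sketch, not part of the slides:

```cpp
#include <cassert>

// Check T(n) = 2n + 10 against c*g(n) = 3n for every n in [n0, nMax].
bool holdsForRange(int n0, int nMax) {
    for (int n = n0; n <= nMax; n++)
        if (2 * n + 10 > 3 * n)  // would violate T(n) <= c*g(n)
            return false;
    return true;
}
```

At n = 9 the bound 2·9 + 10 = 28 > 27 = 3·9 still fails, so n₀ = 10 is the smallest value that works for c = 3.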
Example
T(n) = 3n³ + 20n² + 5 is O(n³): need c > 0 and n₀ ≥ 1 such that 3n³ + 20n² + 5 ≤ cn³ for n ≥ n₀
This is true for c = 4 and n₀ = 21
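Again the constants can be verified numerically; the condition 3n³ + 20n² + 5 ≤ 4n³ reduces to 20n² + 5 ≤ n³. The cubicBoundHolds helper is an illustrative sketch:

```cpp
#include <cassert>

// Check 3n^3 + 20n^2 + 5 <= 4n^3 for every n in [n0, nMax].
bool cubicBoundHolds(long long n0, long long nMax) {
    for (long long n = n0; n <= nMax; n++)
        if (3*n*n*n + 20*n*n + 5 > 4*n*n*n)
            return false;
    return true;
}
```

At n = 20 the bound fails (8005 > 8000), so n₀ = 21 is tight for c = 4.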
Example
T(n) = 3 log₂ n + log₂ log₂ n is O(log₂ n): need c > 0 and n₀ ≥ 1 such that 3 log₂ n + log₂ log₂ n ≤ c·log₂ n for n ≥ n₀
This is true for c = 4 and n₀ = 2
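This bound reduces to log₂ log₂ n ≤ log₂ n, which holds for every n ≥ 2 since log₂ x ≤ x. A brute-force check (logBoundHolds is an illustrative helper):

```cpp
#include <cassert>
#include <cmath>

// Check 3*log2(n) + log2(log2(n)) <= 4*log2(n) for every n in [n0, nMax].
bool logBoundHolds(int n0, int nMax) {
    for (int n = n0; n <= nMax; n++)
        if (3.0 * std::log2(n) + std::log2(std::log2(n)) > 4.0 * std::log2(n))
            return false;
    return true;
}
```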
Observations
– The big-Oh notation gives an upper bound on the growth rate of a function
– The statement “T(n) is O(g(n))” means that the growth rate of T(n) is no more than the growth rate of g(n)
– We can use the big-Oh notation to rank functions according to their growth rate
Big-O rules
• If T(n) is a polynomial of degree d, then T(n) is O(n^d), i.e.,
  – Drop lower-order terms
  – Drop constant factors
• Use the smallest possible class of functions
  – Say "2n is O(n)" instead of "2n is O(n²)"
• Use the simplest expression of the class
  – Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"
Math you need to know
• Summations
• Logarithms and exponents
  – Properties of logarithms:
    • logb(xy) = logb x + logb y
    • logb(x/y) = logb x – logb y
    • logb x^a = a·logb x
    • logb a = logx a / logx b
  – Properties of exponentials:
    • a^(b+c) = a^b · a^c
    • a^(bc) = (a^b)^c
    • a^b / a^c = a^(b-c)
    • b = a^(loga b)
    • b^c = a^(c·loga b)
• Proof techniques
• Basic probability
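The logarithm and exponent identities above can be spot-checked numerically. The logb and nearlyEqual helpers below are illustrative; logb implements the change-of-base formula:

```cpp
#include <cassert>
#include <cmath>

// log base b of x via the change-of-base formula: logb a = logx a / logx b.
double logb(double b, double x) {
    return std::log(x) / std::log(b);
}

// Tolerant floating-point comparison for the identity checks.
bool nearlyEqual(double a, double b) {
    return std::fabs(a - b) < 1e-9;
}
```

For example, logb(2, 32) = 5 equals logb(2, 8) + logb(2, 4) = 3 + 2, matching logb(xy) = logb x + logb y.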
Relatives of Big-O
• big-Omega
  – f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n₀ ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n₀
• big-Theta
  – f(n) is Θ(g(n)) if there are constants c′ > 0 and c′′ > 0 and an integer constant n₀ ≥ 1 such that c′·g(n) ≤ f(n) ≤ c′′·g(n) for n ≥ n₀
• little-oh
  – f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant n₀ ≥ 0 such that f(n) ≤ c·g(n) for n ≥ n₀
• little-omega
  – f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n₀ ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n₀
Intuition for asymptotic notations
– big-Oh
  • f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
– big-Omega
  • f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
– big-Theta
  • f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
– little-oh
  • f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
– little-omega
  • f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n)
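As a concrete instance of big-Theta, the selection-sort count n(n-1)/2 is Θ(n²). The constants c′ = 1/4, c′′ = 1/2, n₀ = 2 used below are one possible (illustrative) choice, verified by brute force:

```cpp
#include <cassert>

// Check c'*n^2 <= n(n-1)/2 <= c''*n^2 with c' = 1/4, c'' = 1/2,
// for every n in [n0, nMax].
bool thetaBoundHolds(long long n0, long long nMax) {
    for (long long n = n0; n <= nMax; n++) {
        double f = n * (n - 1) / 2.0;                  // f(n) = n(n-1)/2
        double g = static_cast<double>(n) * n;         // g(n) = n^2
        if (f < 0.25 * g || f > 0.5 * g)
            return false;
    }
    return true;
}
```

The lower bound n(n-1)/2 ≥ n²/4 is equivalent to n ≥ 2, which is why n₀ = 2.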
Selection sort: The C++ code
template <typename T>
void selectionSort(T arr[], int n)
{
    int smallIndex;   // index of smallest element in the sublist
    int pass, j;
    T temp;

    // pass has the range 0 to n-2
    for (pass = 0; pass < n-1; pass++)   // loop n-1 times;
                                         // =: 1 time; <: n-1 times; ++: n-2 times
    {
        smallIndex = pass;   // scan the sublist starting at index pass; =: n-1 times
        for (j = pass+1; j < n; j++)     // j traverses the sublist arr[pass+1] to arr[n-1]
            if (arr[j] < arr[smallIndex])   // update if smaller element found; <: n(n-1)/2 times
                smallIndex = j;             // worst case, =: n(n-1)/2 times
        // if smallIndex and pass are not the same location,
        // exchange the smallest item in the sublist with arr[pass]
        if (smallIndex != pass)   // !=: n-1 times
        {
            temp = arr[pass];               // worst case, =: n-1 times
            arr[pass] = arr[smallIndex];    // =: n-1 times
            arr[smallIndex] = temp;         // =: n-1 times
        }
    }
}
Total instruction count and Big-O
• Key operations in the doubly nested loop:
  – < operations: T(n) = (n-1) + (n-2) + … + 3 + 2 + 1 = n(n-1)/2 = n²/2 – n/2
  – Same for = operations (worst case): T(n) = n(n-1)/2
• Other operations:
  – = operations: 1 + (n-1) + 3(n-1)
  – < operations: n-1
  – ++ operations: n-2
  – != operations: n-1
  – Total: 7(n-1)
• Assuming all instructions take the same time to execute, the total instruction count is T(n) = 2·n(n-1)/2 + 7(n-1) = n² + 6n – 7
• T(n) ≤ c·g(n) for n ≥ n₀; possible solution: g(n) = n², c = 2, n₀ = 6; therefore T(n) is O(n²).
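The n(n-1)/2 count of key "<" comparisons can be confirmed empirically. This instrumented variant of the slide's selection sort (countComparisons and demoCount are illustrative names) tallies each comparison as it happens:

```cpp
#include <cassert>

// Selection sort instrumented to tally the key "<" comparisons,
// so the count can be checked against T(n) = n(n-1)/2.
long long countComparisons(int arr[], int n) {
    long long comparisons = 0;
    for (int pass = 0; pass < n - 1; pass++) {
        int smallIndex = pass;
        for (int j = pass + 1; j < n; j++) {
            comparisons++;                    // one "<" comparison per inner step
            if (arr[j] < arr[smallIndex])
                smallIndex = j;
        }
        if (smallIndex != pass) {             // exchange smallest with arr[pass]
            int temp = arr[pass];
            arr[pass] = arr[smallIndex];
            arr[smallIndex] = temp;
        }
    }
    return comparisons;
}

// Sort a small fixed sample and return the comparison count.
long long demoCount() {
    int a[6] = {5, 3, 8, 1, 9, 2};
    return countComparisons(a, 6);
}
```

For n = 6 the tally is 5 + 4 + 3 + 2 + 1 = 15 = 6·5/2, regardless of the input ordering.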
C++ code for Sequential or linear search
template <typename T>
int seqSearch(const T arr[], int first, int last, const T& target)
{
    int i;   // scan indices in the range first <= i < last
    for (i = first; i < last; i++)
        if (arr[i] == target)   // assume T has the "==" operator
            return i;           // immediately return on a match
    return last;                // return last if target not found
}
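A minimal usage sketch of the slide's seqSearch (the demoSeqSearch wrapper and the sample data are illustrative):

```cpp
#include <cassert>

// seqSearch as on the slide: returns the index of target in
// arr[first..last-1], or last if target is not found.
template <typename T>
int seqSearch(const T arr[], int first, int last, const T& target)
{
    for (int i = first; i < last; i++)
        if (arr[i] == target)
            return i;
    return last;
}

// Demo: search an unsorted 5-element int array.
int demoSeqSearch(int target) {
    const int a[5] = {7, 3, 9, 1, 5};
    return seqSearch(a, 0, 5, target);
}
```

Note that sequential search makes no assumption about ordering, which is why its worst case is T(n) = n.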
C++ code for binary search

template <typename T>
int binSearch(const T arr[], int first, int last, const T& target)
{
    int mid;              // index of the midpoint
    T midValue;           // object that is assigned arr[mid]
    int origLast = last;  // save original value of last
    while (first < last)  // test for nonempty sublist; m iterations, where 2^m = n or m = log2 n
    {
        mid = (first+last)/2;
        midValue = arr[mid];
        if (target == midValue)
            return mid;   // have a match
        // determine which sublist to search
        else if (target < midValue)
            last = mid;       // search lower sublist; reset last
        else
            first = mid+1;    // search upper sublist; reset first
    }
    return origLast;   // target not found
}
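A minimal usage sketch of the slide's binSearch (the demoBinSearch wrapper and the sample data are illustrative); unlike sequential search, binary search requires the array to be sorted:

```cpp
#include <cassert>

// binSearch as on the slide: requires a sorted array; returns the
// index of target in arr[first..last-1], or the original last if not found.
template <typename T>
int binSearch(const T arr[], int first, int last, const T& target)
{
    int origLast = last;
    while (first < last) {
        int mid = (first + last) / 2;
        T midValue = arr[mid];
        if (target == midValue)
            return mid;
        else if (target < midValue)
            last = mid;
        else
            first = mid + 1;
    }
    return origLast;
}

// Demo: search a sorted 7-element int array.
int demoBinSearch(int target) {
    const int a[7] = {2, 4, 7, 10, 15, 21, 30};
    return binSearch(a, 0, 7, target);
}
```

Searching for 5 in the 7-element array halves the sublist three times before failing, consistent with the O(log₂ n) bound.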