TRANSCRIPT
Introduction to Analysis of Algorithms
CS342 S2004
How to assess an algorithm
– Correctness
– Ease of use
– Efficiency as a function of n, the number of data elements
  • Time efficiency
  • Memory or space (RAM) efficiency: less important today.
Running time analysis
• Measure actual execution time: depends on machine architecture (processor) and operating system (hardware and software dependencies); not very useful.
• Tally the number of key computations (operations), such as comparisons or assignments, as a function of n, or T(n), where n is the number of data elements being processed by the algorithm. T(n) is independent of hardware and software. From T(n), one can easily obtain the Big-O estimate, O(g(n)).
Examples
• Selection sort algorithm
  – Key operations: comparison operations
  – T(n) = (n-1) + (n-2) + … + 3 + 2 + 1 = n(n-1)/2 = n²/2 – n/2
• Sequential search algorithm
  – Key operations: comparison operations
  – T(n) = 1, best case (not useful)
  – T(n) = n/2, average case
  – T(n) = n, worst case
• Binary search algorithm
  – Key operations: comparison operations
  – T(n) = 2(1 + log₂ n): each iteration requires 2 comparisons (== and <)
How to obtain Big-O from T(n) intuitively
– The Big-O notation describes the asymptotic behavior of T(n)
  • Drop insignificant terms as n becomes very large
– Obtain the Big-O notation from T(n) by inspection:
  • Look for the dominant term in T(n)
  • Drop the constant factor associated with the dominant term
Examples
• Selection sort: T(n) = n²/2 – n/2; the dominant term is n²/2; dropping the 1/2 gives O(n²) (reads "Big-O of n squared"); the algorithm is quadratic.
• Linear search: T(n) = n; the dominant term is n; the algorithm has an efficiency of O(n) (reads "Big-O of n"); the algorithm is linear.
• Binary search: T(n) = 2(1 + log₂ n); the dominant term is 2 log₂ n; dropping the 2 gives O(log₂ n) ("Big-O of log₂ n"); the algorithm is logarithmic.
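The "drop the constant on the dominant term" step can be sanity-checked numerically. The sketch below (the helper names selectionSortCount and dominantRatio are illustrative, not from the slides) shows that T(n)/n² for selection sort settles toward the constant 1/2, which is exactly the constant that Big-O discards:

```cpp
#include <cassert>

// T(n) for selection sort: n(n-1)/2 key comparisons.
long long selectionSortCount(long long n) {
    return n * (n - 1) / 2;
}

// Ratio T(n)/n^2 approaches the constant 1/2 as n grows,
// which is why the dominant term n^2/2 determines the O(n^2) class.
double dominantRatio(long long n) {
    return static_cast<double>(selectionSortCount(n))
         / (static_cast<double>(n) * n);
}
```

For n = 1,000,000 the ratio is already within 10⁻⁶ of 1/2.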
Some observations
– As n increases, so does execution time.
– Running times of practical algorithms typically grow with n (from best to worst):
  • Constant (does not depend on n)
  • log₂ n (logarithmic)
  • n (linear)
  • n log₂ n
  • n² (quadratic)
  • n³ (cubic)
  • 2ⁿ (exponential)
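The ranking above can be observed directly by evaluating each growth function at a common n. This small sketch (the growth helper is illustrative) does exactly that:

```cpp
#include <cassert>
#include <cmath>

// Evaluate each growth function at a given n so the ordering
// constant < log2 n < n < n log2 n < n^2 < n^3 < 2^n can be seen directly.
double growth(char kind, double n) {
    switch (kind) {
        case 'c': return 1.0;               // constant
        case 'l': return std::log2(n);      // log2 n
        case 'n': return n;                 // linear
        case 'm': return n * std::log2(n);  // n log2 n
        case 'q': return n * n;             // quadratic
        case 'u': return n * n * n;         // cubic
        case 'e': return std::pow(2.0, n);  // exponential
        default:  return 0.0;
    }
}
```

At n = 16 the values are 1, 4, 16, 64, 256, 4096, 65536, matching the best-to-worst ordering.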
Formal definition of the Big-O notation
• Given functions T(n) and g(n), we say that T(n) is O(g(n)) if there are positive constants c and n0 such that
• T(n) ≤ c·g(n) for n ≥ n₀
Example
T(n) = 2n + 10 is O(n):
2n + 10 ≤ cn
(c – 2)n ≥ 10
n ≥ 10/(c – 2)
Pick c = 3 and n₀ = 10 (there is an infinite number of solutions)
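The choice c = 3, n₀ = 10 can be checked by brute force over a range of n. The holdsForRange helper below is an illustrative sketch, not part of the slides:

```cpp
#include <cassert>

// Check T(n) = 2n + 10 against c*g(n) = 3n for every n in [n0, nMax].
bool holdsForRange(int n0, int nMax) {
    for (int n = n0; n <= nMax; n++)
        if (2 * n + 10 > 3 * n)  // would violate T(n) <= c*g(n)
            return false;
    return true;
}
```

At n = 9 the bound 2·9 + 10 = 28 > 27 = 3·9 still fails, so n₀ = 10 is the smallest value that works for c = 3.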
Example
T(n) = 3n³ + 20n² + 5 is O(n³): need c > 0 and n₀ ≥ 1 such that 3n³ + 20n² + 5 ≤ cn³ for n ≥ n₀
This is true for c = 4 and n₀ = 21
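Again the constants can be verified numerically; the condition 3n³ + 20n² + 5 ≤ 4n³ reduces to 20n² + 5 ≤ n³. The cubicBoundHolds helper is an illustrative sketch:

```cpp
#include <cassert>

// Check 3n^3 + 20n^2 + 5 <= 4n^3 for every n in [n0, nMax].
bool cubicBoundHolds(long long n0, long long nMax) {
    for (long long n = n0; n <= nMax; n++)
        if (3*n*n*n + 20*n*n + 5 > 4*n*n*n)
            return false;
    return true;
}
```

At n = 20 the bound fails (8005 > 8000), so n₀ = 21 is tight for c = 4.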
Example
T(n) = 3 log₂ n + log₂ log₂ n is O(log₂ n): need c > 0 and n₀ ≥ 1 such that 3 log₂ n + log₂ log₂ n ≤ c·log₂ n for n ≥ n₀
This is true for c = 4 and n₀ = 2
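This bound reduces to log₂ log₂ n ≤ log₂ n, which holds for every n ≥ 2 since log₂ x ≤ x. A brute-force check (logBoundHolds is an illustrative helper):

```cpp
#include <cassert>
#include <cmath>

// Check 3*log2(n) + log2(log2(n)) <= 4*log2(n) for every n in [n0, nMax].
bool logBoundHolds(int n0, int nMax) {
    for (int n = n0; n <= nMax; n++)
        if (3.0 * std::log2(n) + std::log2(std::log2(n)) > 4.0 * std::log2(n))
            return false;
    return true;
}
```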
Observations
– The big-Oh notation gives an upper bound on the growth rate of a function
– The statement “T(n) is O(g(n))” means that the growth rate of T(n) is no more than the growth rate of g(n)
– We can use the big-Oh notation to rank functions according to their growth rate
Big-O rules
• If T(n) is a polynomial of degree d, then T(n) is O(n^d), i.e.,
  – Drop lower-order terms
  – Drop constant factors
• Use the smallest possible class of functions
  – Say "2n is O(n)" instead of "2n is O(n²)"
• Use the simplest expression of the class
  – Say "3n + 5 is O(n)" instead of "3n + 5 is O(3n)"
Math you need to know
• Summations
• Logarithms and exponents
  – Properties of logarithms:
    • logb(xy) = logb x + logb y
    • logb(x/y) = logb x – logb y
    • logb x^a = a·logb x
    • logb a = logx a / logx b
  – Properties of exponentials:
    • a^(b+c) = a^b · a^c
    • a^(bc) = (a^b)^c
    • a^b / a^c = a^(b-c)
    • b = a^(loga b)
    • b^c = a^(c·loga b)
• Proof techniques
• Basic probability
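The logarithm and exponent identities above can be spot-checked numerically. The logb and nearlyEqual helpers below are illustrative; logb implements the change-of-base formula:

```cpp
#include <cassert>
#include <cmath>

// log base b of x via the change-of-base formula: logb a = logx a / logx b.
double logb(double b, double x) {
    return std::log(x) / std::log(b);
}

// Tolerant floating-point comparison for the identity checks.
bool nearlyEqual(double a, double b) {
    return std::fabs(a - b) < 1e-9;
}
```

For example, logb(2, 32) = 5 equals logb(2, 8) + logb(2, 4) = 3 + 2, matching logb(xy) = logb x + logb y.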
Relatives of Big-O
• big-Omega
  – f(n) is Ω(g(n)) if there is a constant c > 0 and an integer constant n₀ ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n₀
• big-Theta
  – f(n) is Θ(g(n)) if there are constants c′ > 0 and c′′ > 0 and an integer constant n₀ ≥ 1 such that c′·g(n) ≤ f(n) ≤ c′′·g(n) for n ≥ n₀
• little-oh
  – f(n) is o(g(n)) if, for any constant c > 0, there is an integer constant n₀ ≥ 0 such that f(n) ≤ c·g(n) for n ≥ n₀
• little-omega
  – f(n) is ω(g(n)) if, for any constant c > 0, there is an integer constant n₀ ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n₀
Intuition for asymptotic notations
– big-Oh
  • f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
– big-Omega
  • f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
– big-Theta
  • f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
– little-oh
  • f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
– little-omega
  • f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n)
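As a concrete instance of big-Theta, the selection-sort count n(n-1)/2 is Θ(n²). The constants c′ = 1/4, c′′ = 1/2, n₀ = 2 used below are one possible (illustrative) choice, verified by brute force:

```cpp
#include <cassert>

// Check c'*n^2 <= n(n-1)/2 <= c''*n^2 with c' = 1/4, c'' = 1/2,
// for every n in [n0, nMax].
bool thetaBoundHolds(long long n0, long long nMax) {
    for (long long n = n0; n <= nMax; n++) {
        double f = n * (n - 1) / 2.0;                  // f(n) = n(n-1)/2
        double g = static_cast<double>(n) * n;         // g(n) = n^2
        if (f < 0.25 * g || f > 0.5 * g)
            return false;
    }
    return true;
}
```

The lower bound n(n-1)/2 ≥ n²/4 is equivalent to n ≥ 2, which is why n₀ = 2.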
Selection sort: The C++ code
template <typename T>
void selectionSort(T arr[], int n)
{
    int smallIndex;   // index of smallest element in the sublist
    int pass, j;
    T temp;

    // pass has the range 0 to n-2
    for (pass = 0; pass < n-1; pass++)   // loop n-1 times;
                                         // =: 1 time; <: n-1 times; ++: n-2 times
    {
        smallIndex = pass;   // scan the sublist starting at index pass; =: n-1 times
        for (j = pass+1; j < n; j++)     // j traverses the sublist arr[pass+1] to arr[n-1]
            if (arr[j] < arr[smallIndex])   // update if smaller element found; <: n(n-1)/2 times
                smallIndex = j;             // worst case, =: n(n-1)/2 times
        // if smallIndex and pass are not the same location,
        // exchange the smallest item in the sublist with arr[pass]
        if (smallIndex != pass)   // !=: n-1 times
        {
            temp = arr[pass];               // worst case, =: n-1 times
            arr[pass] = arr[smallIndex];    // =: n-1 times
            arr[smallIndex] = temp;         // =: n-1 times
        }
    }
}
Total instruction count and Big-O
• Key operations in the doubly nested loop:
  – < operations: T(n) = (n-1) + (n-2) + … + 3 + 2 + 1 = n(n-1)/2 = n²/2 – n/2
  – Same for = operations (worst case): T(n) = n(n-1)/2
• Other operations:
  – = operations: 1 + (n-1) + 3(n-1)
  – < operations: n-1
  – ++ operations: n-2
  – != operations: n-1
  – Total: 7(n-1)
• Assuming all instructions take the same time to execute, the total instruction count is T(n) = 2·n(n-1)/2 + 7(n-1) = n² + 6n – 7
• T(n) ≤ c·g(n) for n ≥ n₀; possible solution: g(n) = n², c = 2, n₀ = 6; therefore T(n) is O(n²).
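The n(n-1)/2 count of key "<" comparisons can be confirmed empirically. This instrumented variant of the slide's selection sort (countComparisons and demoCount are illustrative names) tallies each comparison as it happens:

```cpp
#include <cassert>

// Selection sort instrumented to tally the key "<" comparisons,
// so the count can be checked against T(n) = n(n-1)/2.
long long countComparisons(int arr[], int n) {
    long long comparisons = 0;
    for (int pass = 0; pass < n - 1; pass++) {
        int smallIndex = pass;
        for (int j = pass + 1; j < n; j++) {
            comparisons++;                    // one "<" comparison per inner step
            if (arr[j] < arr[smallIndex])
                smallIndex = j;
        }
        if (smallIndex != pass) {             // exchange smallest with arr[pass]
            int temp = arr[pass];
            arr[pass] = arr[smallIndex];
            arr[smallIndex] = temp;
        }
    }
    return comparisons;
}

// Sort a small fixed sample and return the comparison count.
long long demoCount() {
    int a[6] = {5, 3, 8, 1, 9, 2};
    return countComparisons(a, 6);
}
```

For n = 6 the tally is 5 + 4 + 3 + 2 + 1 = 15 = 6·5/2, regardless of the input ordering.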
C++ code for Sequential or linear search
template <typename T>
int seqSearch(const T arr[], int first, int last, const T& target)
{
    int i;   // scan indices in the range first <= i < last
    for (i = first; i < last; i++)
        if (arr[i] == target)   // assume T has the "==" operator
            return i;           // immediately return on a match
    return last;                // return last if target not found
}
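A minimal usage sketch of the slide's seqSearch (the demoSeqSearch wrapper and the sample data are illustrative):

```cpp
#include <cassert>

// seqSearch as on the slide: returns the index of target in
// arr[first..last-1], or last if target is not found.
template <typename T>
int seqSearch(const T arr[], int first, int last, const T& target)
{
    for (int i = first; i < last; i++)
        if (arr[i] == target)
            return i;
    return last;
}

// Demo: search an unsorted 5-element int array.
int demoSeqSearch(int target) {
    const int a[5] = {7, 3, 9, 1, 5};
    return seqSearch(a, 0, 5, target);
}
```

Note that sequential search makes no assumption about ordering, which is why its worst case is T(n) = n.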
C++ code for binary search

template <typename T>
int binSearch(const T arr[], int first, int last, const T& target)
{
    int mid;              // index of the midpoint
    T midValue;           // object that is assigned arr[mid]
    int origLast = last;  // save original value of last
    while (first < last)  // test for nonempty sublist; m iterations, where 2^m = n or m = log2 n
    {
        mid = (first+last)/2;
        midValue = arr[mid];
        if (target == midValue)
            return mid;   // have a match
        // determine which sublist to search
        else if (target < midValue)
            last = mid;       // search lower sublist; reset last
        else
            first = mid+1;    // search upper sublist; reset first
    }
    return origLast;   // target not found
}
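A minimal usage sketch of the slide's binSearch (the demoBinSearch wrapper and the sample data are illustrative); unlike sequential search, binary search requires the array to be sorted:

```cpp
#include <cassert>

// binSearch as on the slide: requires a sorted array; returns the
// index of target in arr[first..last-1], or the original last if not found.
template <typename T>
int binSearch(const T arr[], int first, int last, const T& target)
{
    int origLast = last;
    while (first < last) {
        int mid = (first + last) / 2;
        T midValue = arr[mid];
        if (target == midValue)
            return mid;
        else if (target < midValue)
            last = mid;
        else
            first = mid + 1;
    }
    return origLast;
}

// Demo: search a sorted 7-element int array.
int demoBinSearch(int target) {
    const int a[7] = {2, 4, 7, 10, 15, 21, 30};
    return binSearch(a, 0, 7, target);
}
```

Searching for 5 in the 7-element array halves the sublist three times before failing, consistent with the O(log₂ n) bound.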