struktur data & algoritme -...
TRANSCRIPT
Fundamental Data Types: String, Math, Casting 1
Struktur Data & Algoritme(Data Structures & Algorithms)
Fakultas Ilmu KomputerUniversitas Indonesia
Semester Genap - 2000/2001Version 1.0 - Internal Use Only
Analisa Algoritma
SDA/ALG-AN/DN/V1.0/2
Algoritmen Suatu set instruksi yang harus diikuti oleh computer
untuk memecahkan suatu masalah.
Fundamental Data Types: String, Math, Casting 2
SDA/ALG-AN/DN/V1.0/3
Analisa Algoritme: motivasin perlu diketahui berapa banyak resource (time &
space) yang diperlukan oleh sebuah algoritmen Menggunakan teknik-teknik untuk mengurangi waktu
yang dibutuhkan oleh sebuah algoritme
SDA/ALG-AN/DN/V1.0/4
Objectivesn dapat memperkirakan waktu yang dibutuhkan
n mempelajari teknik yang secara drastis menurunkanrunning time
n mempelajari kerangka matematik yang secararigorously menggambarkan running time
Fundamental Data Types: String, Math, Casting 3
SDA/ALG-AN/DN/V1.0/5
Outlinen Apa itu analisa algoritme?- what
n Bagaimana cara untuk analisa/mengukur? - how
n Notasi Big-Oh
n Case Study: The Maximum Contiguous SubsequenceSum Problem
n Algorithm 1: A cubic algorithm
n Algorithm 2: A quadratic algorithm
n Algorithm 3: An N log N algorithm
n Algorithm 4: A linear algorithm
SDA/ALG-AN/DN/V1.0/6
Analisa Algoritme: what?n Mengukur jumlah sumber daya (time dan space) yang
diperlukan oleh sebuah algoritmen Waktu yang diperlukan (running time) oleh sebuah
algorithm cenderung tergantung pada jumlah inputyang diproses.
n Running time dari sebuah algoritme adalah fungsidari jumlah inputnya
n Selalu tidak terikat pada platform, bahasapemrograman atau bahkan metodologi (mis.Procedural vs OO)
Fundamental Data Types: String, Math, Casting 4
SDA/ALG-AN/DN/V1.0/7
Worst, Best and Average Casen Worst-case running time is a bound over all inputs of
a certain size N. (Guarantee - Pesimistic)n Average-case running time is an average over all
inputs of a certain size N. (Prediction)n Best-case running time is defined by the minimum
number of steps taken on any instance of size n.(Optimistic)
SDA/ALG-AN/DN/V1.0/8
Worst, Best and Average Case (2)
average case
input size
runningtime
n
best case
worst case
Fundamental Data Types: String, Math, Casting 5
SDA/ALG-AN/DN/V1.0/9
Analisa Algoritme: how?n Bagaimana jika kita menggunakan jam?
n Jumlah waktu yang digunakan bervariasi tergantungpada beberapa faktor lain: kecepatan mesin, sistemoperasi, kualitas kompiler, dan bahasa pemrograman.
n Sehingga kurang memberikan gambaran yang tepattentang algoritme
n Notasi O (sering disebut sebagai “notasi big-Oh”)n Digunakan sebagai bahasa untuk membahas efisiensi
dari sebuah algoritme: log n, linier, n log n, n2, n3, ...n Contoh: run time algoritme Insertion Sort adalah O(n2)n Dari hasil run-time, dapat kita buat grafik dari waktu
eksekusi dan jumlah data.
SDA/ALG-AN/DN/V1.0/10
Big Ohn Sebuah fungsi kubic adalah sebuah fungsi yang suku
dominannya (dominant term) adalah sebuah konstandikalikan dengan n3.
n Contoh:n 10 n3 + n2 + 40 n + 80n n3 + 1000 n2 + 40 n + 80n n3 + 10000
Fundamental Data Types: String, Math, Casting 6
SDA/ALG-AN/DN/V1.0/11
Dominant Termn Mengapa hanya suku yang memiliki pangkat
tertinggi/dominan saja yang diperhatikan?n Contoh: kita estimasi n3 + 350n2 + n dengan n3.n Untuk n = 10000
n Nilai aktual 1,003,500,010,000n Nilai estimasi 1,000,000,000,000n Kesalahan dalam estimasi 0.35%, dapat diabaikan.
n Untuk n yang besar, suku dominan lebihmengindikasikan perilaku dari algoritme.
n Untuk n yang kecil, suku dominan tidak selalumengindikasikan perilakunya, tetapi program denganinput kecil umumnya berjalan sangat cepat sehinggakita tidak perlu perhatikan.
SDA/ALG-AN/DN/V1.0/12
Big Oh: issuesn Apakah fungsi linier selalu lebih kecil dari fungsi
kubic?n Untuk n yang kecil, bisa saja fungsi linier > fungsi
kubicn Tetapi untuk n yang besar, fungsi kubic > fungsi liniern Contoh:
• 10 n3 + 20 n + 10• 1000 n + 100
n Mengapa nilai konstan/koefesien pada setiap sukutidak diperhatikan?n Nilai konstan/koefesien tidak berarti pada mesin yang
berbedan ∴∴ Big Oh digunakan untuk merepresentasikan laju
pertumbuhan (growth rate)
Fundamental Data Types: String, Math, Casting 7
SDA/ALG-AN/DN/V1.0/13
Orde Fungsi Running-Time
Function Namec Constantlog N Logarithmiclog2 N Log-squaredN LinearN log N N log NN2 QuadraticN3 Cubic2N ExponentialN! Factorial
SDA/ALG-AN/DN/V1.0/14
Fungsi Running-Timen Untuk input yang sedikit, beberapa fungsi lebih cepat
dibandingkan dengan yang lain.
Fundamental Data Types: String, Math, Casting 8
SDA/ALG-AN/DN/V1.0/15
Fungsi Running-Time (2)n Untuk input yang besar, beberapa fungsi running-
time sangat lambat - tidak berguna.
SDA/ALG-AN/DN/V1.0/16
Contohn Mencari elemen terkecil dalam sebuah array
n Algoritme: sequential scann Orde-nya: O(n) - linear
n Mencari dua titik yang memiliki jarak terpendek dalamsebuah bidang (koordinat X-Y)n Masalah dalam komputer grafis.n Algoritme:
• hitung jarak dari semua pasangan titik• cari jarak yang terpendek
n Jika jumlah titik adalah n, maka jumlah pasangan adan * (n - 1) / 2
n Orde-nya: O(n2) - kuadratikn Ada solusi yang O(n log n) bahkan O(n) dengan cara
algoritme lain.
Fundamental Data Types: String, Math, Casting 9
SDA/ALG-AN/DN/V1.0/17
Contoh (2)n Tentukan apakah ada tiga titik dalam sebuah bidang
yang segaris (colinier).n Algoritme:
• periksa semua pasangan titik yang terdiri dari 3 titik.n Jumlah pasangan: n * (n - 1) * (n - 2) / 6n Orde-nya: O(n3) - kubik. Sangat tidak berguna untuk
10.000 titik.n Ada algoritme yang kuadratik.
SDA/ALG-AN/DN/V1.0/18
Studi Kasusn Mengamati sebuah masalah dengan beberapa solusi.n Masalah Maximum Contiguous Subsequence Sum
n Diberikan (angka integer negatif dimungkinkan) A1, A2,…, AN, cari nilai maksimum dari (Ai + Ai+1 + …+ Aj ).
n maximum contiguous subsequence sum adalah noljika semua integer adalah negatif. (Why?)
n Contoh (maximum subsequences digarisbawahi)n -2, 11, -4, 13, -4, 2n 1, -3, 4, -2, -1, 6
Fundamental Data Types: String, Math, Casting 10
SDA/ALG-AN/DN/V1.0/19
Brute Force Algorithm (1)n Algoritme:
n Hitung jumlah dari semua sub-sequence yangmungkin
n Cari nilai maksimumnya
SDA/ALG-AN/DN/V1.0/20
public static int MaxSubSum1( int [] A ) int MaxSum = 0; for(int i = 0; i < A.length; i++) for( int j = i; j < A.length; j++) int ThisSum = 0; for(int k = i; k <= j; k++) ThisSum += A[ k ]; if( ThisSum > MaxSum ) MaxSum = ThisSum; return MaxSum;
Brute Force Algorithm (2)
Fundamental Data Types: String, Math, Casting 11
SDA/ALG-AN/DN/V1.0/21
Analysisn Loop of size N inside of loop of size N inside of loop
of size N means O(N3), or cubic algorithm.n Slight over-estimate that results from some loops
being of size less than N is not important.
SDA/ALG-AN/DN/V1.0/22
Actual Running Timen For N = 100, actual time is 0.47 seconds on a
particular computer.n Can use this to estimate time for larger inputs:
T(N) = cN 3
T(10N) = c(10N)3 = 1000cN 3 = 1000T(N)n Inputs size increases by a factor of 10 means that
running time increases by a factor of 1,000.n For N = 1000, estimate an actual time of 470 seconds.
(Actual was 449 seconds).n For N = 10,000, estimate 449000 seconds (6 days).
Fundamental Data Types: String, Math, Casting 12
SDA/ALG-AN/DN/V1.0/23
How To Improven Remove a loop; not always possible.n Here it is: innermost loop is unnecessary because it
throws away information.n ThisSum for next j is easily obtained from old value
of ThisSum:n Need Ai + A i+1 + … + A j-1 + Aj
n Just computed Ai +A i+1 + …+ A j-1
n What we need is what we just computed + Aj
SDA/ALG-AN/DN/V1.0/24
The Better Algorithm public static int MaxSubSum2( int [ ] A ) int MaxSum = 0; for( int i = 0; i < A.length; i++ ) int ThisSum = 0; for( int j = i; j < A.length; j++ ) ThisSum += A[ j ]; if( ThisSum > MaxSum ) MaxSum = ThisSum; return MaxSum;
Fundamental Data Types: String, Math, Casting 13
SDA/ALG-AN/DN/V1.0/25
Analysisn Same logic as before: now the running time is
quadratic, or O(N2)n As we will see, this algorithm is still usable for inputs
in the tens of thousands.n Recall that the cubic algorithm was not practical for
this amount of input.
SDA/ALG-AN/DN/V1.0/26
Actual running timen For N = 100, actual time is 0.011 seconds on the same
particular computer.n Can use this to estimate time for larger inputs:
T(N) = cN 2
T(10N) = c(10N)2 = 100cN 2 = 100T(N)n Inputs size increases by a factor of 10 means that
running time increases by a factor of 100.n For N = 1000, estimate a running time of 1.11 seconds.
(Actual was 1.12 seconds).n For N = 10,000, estimate 111 seconds (= actual).
Fundamental Data Types: String, Math, Casting 14
SDA/ALG-AN/DN/V1.0/27
Recursive Algorithmn Use a divide-and-conquer approach.
n Membagi (divide) permasalahan ke dalam bagianyang lebih kecil
n Menyelesaikan (conquer) masalah per bagian secararecursive
n Menggabung penyelesaian per bagian menjadi solusimasalah awal
SDA/ALG-AN/DN/V1.0/28
Recursive Algorithm (2)n The maximum subsequence either
n lies entirely in the first halfn lies entirely in the second halfn starts somewhere in the first half, goes to the last
element in the first half, continues at the first elementin the second half, ends somewhere in the second half.
n Compute all three possibilities, and use themaximum.
n First two possibilities easily computed recursively.
Fundamental Data Types: String, Math, Casting 15
SDA/ALG-AN/DN/V1.0/29
Computing the Third Casen Easily done with two loops; see the coden For maximum sum that starts in the first half and
extends to the last element in the first half, use aright-to-left scan starting at the last element in thefirst half.
n For the other maximum sum, do a left-to-right scan,starting at the first element in the first half.
4 -3 5 -2 -1 2 6 -24* 0 3 -2 -1 1 7* 5
SDA/ALG-AN/DN/V1.0/30
Recursion versionstatic private intmaxSumRec (int[] a, int left, int right) int center = (left + right) / 2;
if(left == right) // Base case return a[left] > 0 ? a[left] : 0;
int maxLeftSum = maxSumRec (a, left, center); int maxRightSum = maxSumRec (a, center+1, right);
for(int i = center; i >= left; i--) leftBorderSum += a[i]; if(leftBorderSum > maxLeftBorderSum) maxLeftBorderSum = leftBorderSum; ...
Fundamental Data Types: String, Math, Casting 16
SDA/ALG-AN/DN/V1.0/31
Recursion Version ... for(int j = center + 1; j <= right; j++) rightBorderSum += a[j]; if(rightBorderSum > maxRightBorderSum) maxRightBorderSum = rightBorderSum;
return max3 (maxLeftSum, maxRightSum, maxLeftBorderSum + maxRightBoderSum);
// publicly visible routinestatic public int maxSubSum (int [] a) return maxSumRec (a, 0, a.length-1);
SDA/ALG-AN/DN/V1.0/32
Coding Detailsn The code is more involvedn Make sure you have a base case that handles zero-
element arrays.n Use a public static driver with a private recursive
routine.n Recursion rules:
n Have a base casen Make progress to the base casen Assume it worksn Avoid computing the same solution twice
Fundamental Data Types: String, Math, Casting 17
SDA/ALG-AN/DN/V1.0/33
Analysisn Let T( N ) = the time for an algorithm to solve a
problem of size N.n Then T( 1 ) = 1 (1 will be the quantum time unit;
remember that constants don't matter).n T( N ) = 2 T( N / 2 ) + N
n Two recursive calls, each of size N / 2. The time tosolve each recursive call is T( N / 2 ) by the abovedefinition
n Case three takes O( N ) time; we use N, because wewill throw out the constants eventually.
SDA/ALG-AN/DN/V1.0/34
Bottom LineT(1) = 1 = 1 * 1T(2) = 2 * T(1) + 2 = 4 = 2 * 2T(4) = 2 * T(2) + 4 = 12 = 4 * 3T(8) = 2 * T(4) + 8 = 32 = 8 * 4T(16) = 2 * T(8) + 16 = 80 = 16 * 5T(32) = 2 * T(16) + 32 = 192 = 32 * 6T(64) = 2 * T(32) + 64 = 448 = 64 * 7
T(N) = N(1 + log N) = O(N log N)
Fundamental Data Types: String, Math, Casting 18
SDA/ALG-AN/DN/V1.0/35
N log Nn Any recursive algorithm that solves two half-sized
problems and does linear non-recursive work tocombine/split these solutions will always take O( Nlog N ) time because the above analysis will alwayshold.
n This is a very significant improvement over quadratic.n It is still not as good as O( N ), but is not that far away
either. There is a linear-time algorithm for thisproblem; see the online code. The running time isclear, but the correctness is non-trivial.
SDA/ALG-AN/DN/V1.0/36
Linear Algorithmsn Linear algorithm would be best.n Remember: linear means O( N ).n Running time is proportional to amount of input. Hard
to do better for an algorithm.n If input increases by a factor of ten, then so does
running time.
Fundamental Data Types: String, Math, Casting 19
SDA/ALG-AN/DN/V1.0/37
Idea
n Let Ai,j be any sequence with Si,j < 0. If q > j, then A i,q
is not the maximum contigous subsequence.
n Proof:
n Si,q = S i,j + Sj+1,q
n Si,j < 0 => Si,q < S j+1,q
SDA/ALG-AN/DN/V1.0/38
The Code - version 1static public intmaximumSubSequenceSum3 (int a[]) int maxSum = 0; int thisSum = 0; for (int j = 0; j < a.length; j++) thisSum += a[j]; if (thisSum > maxSum) maxSum = thisSum; else if (thisSum < 0) thisSum = 0; return maxSum;
Fundamental Data Types: String, Math, Casting 20
SDA/ALG-AN/DN/V1.0/39
The Code - version 2static public intmaximumSubSequenceSum3b (int data[]) int thisSum = 0, maxSum = 0; for (int i = 0; i < data.length; i++) thisSum = Max(0, data[i] + thisSum); maxSum = Max(thisSum, maxSum); return maxSum;
SDA/ALG-AN/DN/V1.0/40
Running Time
Fundamental Data Types: String, Math, Casting 21
SDA/ALG-AN/DN/V1.0/41
Running Time: on different machinesn Cubic Algorithm on Alpha 21164 at 533 Mhz using C compiler
n Linear Algorithm on Radio Shack TRS-80 Model III(a 1980 personal computer with a Z-80 processor running at 2.03Mhz) using interpreted Basic
n Alpha 21164A,C compiled,
Cubic Algorithm
TRS-80,Basic interpreted,Linear Algorithm
10 0.6 microsecs 200 milisecs100 0.6 milisecs 2.0 secs
1,000 0.6 secs 20 secs10,000 10 mins 3.2 mins
100,000 7 days 32 mins1,000,000 19 yrs 5.4 hrs
SDA/ALG-AN/DN/V1.0/42
Running Time: moral storyn Even the most clever programming tricks cannot
make an ineffiecient algorithm fast.n Before we waste effort attempting to optimize code,
we need to optimize the algorithm.
Fundamental Data Types: String, Math, Casting 22
SDA/ALG-AN/DN/V1.0/43
The Logarithmn Formal Definition
n For any B, N > 0, logBN = K if B K = N.n If (the base) B is omitted, it defaults to 2 in computer
science.
n Examples:n log 32 = 5 (because 25 = 32)n log 1024 = 10n log 1048576 = 20n log 1 billion = about 30
n The logarithm grows much more slowly than N, andslower than the square root of N.
SDA/ALG-AN/DN/V1.0/44
Examples of the Logarithmn BITS IN A BINARY NUMBER
n How many bits are required to represent Nconsecutive integers?
n REPEATED DOUBLINGn Starting from X = 1, how many times should X be
doubled before it is at least as large as N?n REPEATED HALVING
n Starting from X = N, if N is repeatedly halved, howmany iterations must be applied to make N smallerthan or equal to 1 (Halving rounds up).
n Answer to all of the above is log N (rounded up).
Fundamental Data Types: String, Math, Casting 23
SDA/ALG-AN/DN/V1.0/45
Why log N?n B bits represents 2B integers. Thus 2B is at least as
big as N, so B is at least log N. Since B must be aninteger, round up if needed.
n Same logic for the other examples.
SDA/ALG-AN/DN/V1.0/46
Repeated Halving Principlen An algorithm is O( log N ) if it takes constant time to
reduce the problem size by a constant fraction (whichis usually 1/2).
n Reason: there will be log N iterations of constantwork.
Fundamental Data Types: String, Math, Casting 24
SDA/ALG-AN/DN/V1.0/47
Static Searchingn Given an integer X and an array A, return the position
of X in A or an indication that it is not present. If Xoccurs more than once, return any occurrence. Thearray A is not altered.
n If input array is not sorted, solution is to use asequential search. Running times:n Unsuccessful search: O( N ); every item is examinedn Successful search:
• Worst case: O( N ); every item is examined• Average case: O( N ); half the items are examined
n Can we do better if we know the array is sorted?
SDA/ALG-AN/DN/V1.0/48
Binary Searchn Yes! Use a binary search.n Look in the middle
n Case 1: If X is less than the item in the middle, thenlook in the subarray to the left of the middle
n Case 2: If X is greater than the item in the middle, thenlook in the subarray to the right of the middle
n Case 3: If X is equal to the item in the middle, then wehave a match
n Base Case: If the subarray is empty, X is not found.
n This is logarithmic by the repeated halving principle.n The first binary search was published in 1946. The
first published binary search without bugs did notappear until 1962.
Fundamental Data Types: String, Math, Casting 25
SDA/ALG-AN/DN/V1.0/49
/** Performs the standard binary search using two comparisons per level. @param a the array @param x the key @exception ItemNotFound if appropriate. @return index where item is found.*/public static int binarySearch (Comparable [ ] a, Comparable x ) throws ItemNotFound int low = 0; int high = a.length - 1; int mid;
while( low <= high ) mid = (low + high) / 2; if (a[mid].compares (x) < 0) low = mid + 1; else if (a[mid].compares (x) > 0) high = mid - 1; else return mid; throw new ItemNotFound( "BinarySearch fails" );
SDA/ALG-AN/DN/V1.0/50
Binary Searchn Can do one comparison per iteration instead of two
by changing the base case.n See online code for details.n Average case and worst case in revised algorithm are
identical. 1 + log N comparisons (rounded down tothe nearest integer) are used. Example: If N =1,000,000, then 20 element comparisons are used.Sequential search would be 25,000 times more costlyon average.
Fundamental Data Types: String, Math, Casting 26
SDA/ALG-AN/DN/V1.0/51
Big-Oh Rules (1)n Mathematical expression relative rates of growth
n DEFINITION: (Big-Oh) T(N) = O(F(N)) if there arepositive constants c and N0 such that T(N) ≤ c F(N)when N ≥ N0.
n DEFINITION: (Big-Omega) T(N) = Ω(F(N)) if there areconstants c and N0 such that T(N) ≥ c F(N) whenN ≥ N0.
n DEFINITION: (Big-Theta) T(N) = Θ(F(N)) if and only ifT(N) = O(F(N)) and T(N) = Ω(F(N)).
n DEFINITION: (Little-Oh) T(N) = o(F(N)) if and only ifT(N) = O(F(N)) and T(N) ≠ Θ (F(N)).
SDA/ALG-AN/DN/V1.0/52
Big-Oh Rules (2)n Meanings of the various growth functions
T(N) = O(F(N)) Growth of T(N) is ≤ growth of F(N)T(N) = Ω(F(N)) Growth of T(N) is ≥ growth of F(N)T(N) = Θ(F(N)) Growth of T(N) is = growth of F(N)
T(N) = o(F(N)) Growth of T(N) is < growth of F(N)
Fundamental Data Types: String, Math, Casting 27
SDA/ALG-AN/DN/V1.0/53
Limitation of Big Ohn Tidak dapat digunakan untuk N yang keciln Faktor X
n Memory access & disk accessn Memory size & disk cache
SDA/ALG-AN/DN/V1.0/54
Summaryn more data means the program takes more timen Big-Oh tidak dapat digunakan untuk N yang kecil
Fundamental Data Types: String, Math, Casting 28
SDA/ALG-AN/DN/V1.0/55
Exercisesn Urutkan fungsi berikut berdasarkan laju pertumbuhan
(growth rate)
n N, √N, N1.5, N2, N log N, N log log N, N log2 N, N log(N2), 2/N, 2N, 2N/2, 37, N3, N2 log N
n A5 + B5 + C5 + D5 + E5 = F5
n 0 < A ≤ B ≤ C ≤ D ≤ E ≤ F ≤ 75
n hanya memiliki satu solusi. berapa?
SDA/ALG-AN/DN/V1.0/56
Assignmentn Critical Mass
n Secara umum dapat dilihat dihttp://pala.acad.cs.ui.ac.id/contest/OnlineJudge/v5/580.html
n Lengkapnya akan diposting di forum.iki.strukturn Due date: 2 Maret 2001
Fundamental Data Types: String, Math, Casting 29
SDA/ALG-AN/DN/V1.0/57
Further Readingn Chapter 5
n http://pala.acad.cs.ui.ac.id/contest/mirror/evo.apm.tuwien.ac.at/AlgDesignManual/INDEX.HTM
n http://pala.acad.cs.ui.ac.id/contest/OnlineJudge/v1/108.html
SDA/ALG-AN/DN/V1.0/58
What’s Nextn Recursion (Chapter 7)