Post on 09-Mar-2019
IMPLEMENTATION OF THE RABIN-KARP ALGORITHM
FOR INDONESIAN NEWS TITLE RECOMMENDATION
FINAL PROJECT
Submitted in Partial Fulfillment of the Requirements for the Bachelor's Degree (Strata 1) in Informatics Engineering
Universitas Muhammadiyah Malang
By:
Adika Ridlo Taqwin
NIM. 201210370311068
DEPARTMENT OF INFORMATICS ENGINEERING
FACULTY OF ENGINEERING
UNIVERSITAS MUHAMMADIYAH MALANG
2016
TABLE OF CONTENTS
CHAPTER I
INTRODUCTION
1.1 BACKGROUND
1.1.1 PROBLEM STATEMENT
1.1.2 RESEARCH OBJECTIVES
1.1.3 SCOPE AND LIMITATIONS
1.1.4 METHODOLOGY
1.1.5 ORGANIZATION OF THE REPORT
CHAPTER II
THEORETICAL FOUNDATION
2.1 NEWS
2.2 DIFFERENCES BETWEEN ELECTRONIC MEDIA NEWS AND PRINT MEDIA NEWS
2.3 INFORMATION
2.4 INFORMATION SYSTEMS
2.5 SIMILARITY
2.5.1 DISTANCE-BASED SIMILARITY MEASURE
2.5.2 FEATURE-BASED SIMILARITY MEASURE
2.5.3 PROBABILISTIC-BASED SIMILARITY MEASURE
2.6 MEASURING THE SIMILARITY VALUE
2.7 KAPPA STATISTIC
2.8 PRECISION
2.9 TEXT MINING
2.10 TEXT PROCESSING
2.11 RABIN-KARP ALGORITHM
2.12 HASHING PROCESS
2.13 RECOMMENDER SYSTEMS
2.13.1 RULE-BASED FILTERING (RULE-BASED RECOMMENDATION)
2.13.2 CONTENT-BASED FILTERING (CONTENT-BASED RECOMMENDATION)
2.13.3 COLLABORATIVE FILTERING (CF)
2.13.4 HYBRID FILTERING (HYBRID FILTERING-BASED RECOMMENDATION)
2.14 LOCAL CITATION RECOMMENDATION
2.15 GLOBAL CITATION RECOMMENDATION
2.16 WEB SCRAPING
2.17 SOCIAL MEDIA
CHAPTER III
SYSTEM ANALYSIS AND DESIGN
3.1 SYSTEM FLOWCHART
3.2 SYSTEM REQUIREMENTS DESIGN
3.2.1 USE CASES AND SYSTEM SCENARIOS
3.2.2 ACTIVITY DIAGRAM
3.3 ANALYSIS-STAGE DESIGN
3.3.1 ROBUSTNESS DIAGRAM
3.3.2 ANALYSIS-LEVEL SEQUENCE DIAGRAM
3.3.3 CLASS DIAGRAM
3.4 SEARCH STAGES IN RABBIT SEARCH
3.5 PREPROCESSING STAGES
3.5.1 CASE FOLDING
3.5.2 WORD NORMALIZATION
3.5.3 TOKENIZING
3.5.4 STEMMING
3.5.5 STOPWORD REMOVAL
3.6 PROCESSING STAGES
3.6.1 K-GRAM FORMATION
3.6.2 HASHING PROCESS
3.6.3 REMOVING DUPLICATE HASH VALUES
3.6.4 COMPUTING THE CLOSENESS VALUE
3.7 INTERFACE DESIGN
3.7.1 MAIN PAGE DESIGN
3.7.2 CATEGORY PAGE DESIGN
3.7.3 SEARCH RESULTS PAGE DESIGN
3.7.4 PROCESS DETAIL PAGE DESIGN
CHAPTER IV
SYSTEM IMPLEMENTATION AND TESTING
4.1 HARDWARE AND SOFTWARE REQUIREMENTS SPECIFICATION
4.1.1 HARDWARE REQUIREMENTS
4.1.2 SOFTWARE REQUIREMENTS
4.2 SYSTEM IMPLEMENTATION
4.2.1 DATABASE CREATION
4.2.1.1 Berita (News) Table
4.2.1.2 Preprocessing Key Table
4.2.1.3 Preprocessing Comparison Text Table
4.2.1.4 Stopword Table
4.2.1.5 Kata Dasar (Root Word) Table
4.2.1.6 Stopword Removal Table
4.2.1.7 Rabin Table
4.2.1.8 Sementara (Temporary) Table
4.2.2 PROGRAM CODE DEVELOPMENT
4.2.2.1 Configuration of the CI Framework
4.2.2.2 Program Code for the Berita (News) Class
4.2.2.2.1 Index Function
4.2.2.2.2 Home Function
4.2.2.2.3 Scrapping Sub Function
4.2.2.2.4 Scrapping All Function
4.2.2.2.5 Cari (Search) Function
4.2.2.2.6 Multi Explode Function
4.2.2.2.7 Hapus Imbuhan (Remove Affixes) Function
4.2.2.2.8 Ambil Judul (Get Title) Function
4.2.2.2.9 Preprocessing Judul (Title Preprocessing) Function
4.2.2.2.10 Stopword Removal Function
4.2.2.3 Program Code for the Site Class
4.2.2.3.1 Kategori (Category) Function
4.2.2.3.2 Hasil Kategori (Category Results) Function
4.2.2.3.3 Rabin Fix Function
4.2.2.3.4 Hasil (Results) Function
4.3 TESTING
4.3.1 INTERFACE TESTING
4.3.1.1 Main Page
4.3.1.2 Category Page
4.3.1.3 Search Results Page
4.3.1.4 Process Detail Page
4.3.2 TESTING WITH THE KAPPA STATISTIC
4.3.4 PRECISION TESTING
4.3.5 TESTING THE SEARCH PROCESS STAGES
CHAPTER V
CONCLUSIONS AND RECOMMENDATIONS
5.1 CONCLUSIONS
5.2 RECOMMENDATIONS
REFERENCES
List of Figures
Figure 3.1 System Flowchart
Figure 3.2 System Use Case
Figure 3.3 Activity Diagram for Data Scraping
Figure 3.4 Activity Diagram for Selecting a Category
Figure 3.5 Activity Diagram for Searching News
Figure 3.6 Activity Diagram for Process Detail
Figure 3.7 Robustness Diagram
Figure 3.8 Sequence Diagram for Searching News Titles
Figure 3.9 Sequence Diagram for Selecting a Category
Figure 3.10 Sequence Diagram for Data Scraping
Figure 3.11 Sequence Diagram for Process Detail
Figure 3.12 Class Diagram
Figure 3.13 Search Stages Chart
Figure 3.14 Case Folding Example
Figure 3.15 Normalization Example
Figure 3.16 Tokenizing Example
Figure 3.17 Stemming Process Flow
Figure 3.18 Unimportant Word Removal Process Flow
Figure 3.19 Example of the K-gram Formation Process
Figure 3.20 Main Design
Figure 3.21 Category Page Design
Figure 3.22 Search Results Page Design
Figure 3.23 Process Detail Page Design
Figure 4.1 Program Code 1
Figure 4.2 Program Code 2
Figure 4.3 Example Using htaccess
Figure 4.4 Example Without htaccess
Figure 4.5 Autoload Configuration
Figure 4.6 Routes Configuration
Figure 4.7 Database Connection Configuration
Figure 4.8 Index Function
Figure 4.9 Home Function
Figure 4.10 Scrapping Sub Function
Figure 4.11 Scrapping All Function
Figure 4.12 Cari (Search) Function
Figure 4.13 Multi Explode Function
Figure 4.14 Hapus Imbuhan (Remove Affixes) Function
Figure 4.15 Hapus Imbuhan (Remove Affixes) Function
Figure 4.16 Program Code for Case Folding
Figure 4.17 Program Code for Tokenizing
Figure 4.18 Program Code for Normalization
Figure 4.19 Program Code for Tokenizing
Figure 4.20 Stopword Removal Function
Figure 4.21 Kategori (Category) Function
Figure 4.22 Hasil Kategori (Category Results) Function
Figure 4.23 Rabin Fix Function
Figure 4.24 Hashing Process Function
Figure 4.25 Hashing Process Code
Figure 4.26 Code for Computing the Closeness Value
Figure 4.27 Process of Storing the Similarity Value
Figure 4.28 Hasil (Results) Function
Figure 4.29 Main Page
Figure 4.30 Main Page on Smartphones and Tablets
Figure 4.31 Category Page
Figure 4.32 Category Page on Smartphones and Tablets
Figure 4.33 Search Results Page
Figure 4.34 Search Results Page on Smartphones and Tablets
Figure 4.35 Process Detail Page
Figure 4.36 Process Detail Page on Smartphones and Tablets
List of Tables
Table 1. Differences Between Electronic Media News and Print Media News
Table 2. Removal of Words in the Comparison Text That Have No Similarity to the Input Text
Table 3. Example K-gram Formation Results
Table 4. Hashing Results
Table 5. Example of Removing Duplicate Values in Hashing
Table 6. Example Hashing Results
Table 7. Berita (News)
Table 8. Preprocessing Key
Table 9. Preprocessing Comparison Text
Table 10. Stopword Table
Table 11. Kata Dasar (Root Word)
Table 12. Stopword Removal Table
Table 13. Rabin Table
Table 14. Sementara (Temporary)
Table 15. Kappa Test Result Scenario Table
Table 16. Kappa Test Result Scenario Table
Table 17. Kappa Test Results Table
Table 18. Precision Test Scenario Table
Table 19. Precision Test Results
Table 20. Search Process Stage Test Results
List of Equations
Equation 2.1 Computing the Similarity Value
Equation 2.2 Computing P(A)
Equation 2.3 Computing P(E)
Equation 2.4 Computing the Kappa Value
Equation 2.5 Computing the Precision Value
Equation 2.6 Computing the Hash Value