romi dm 05 research june2012
-
8/13/2019 Romi Dm 05 Research June2012
Data Mining: Data Mining Research
Romi Satria Wahono
[email protected]
http://romisatriawahono.net
+6281586220090
-
SD Sompok Semarang (elementary school, 1987)
SMPN 8 Semarang (junior high school, 1990)
SMA Taruna Nusantara, Magelang (senior high school, 1993)
Bachelor's, Master's, and PhD (on leave)
Department of Computer Sciences, Saitama University, Japan (1994-2004)
Research Interests: Software Engineering and Intelligent Systems
Founder, IlmuKomputer.Com
Researcher at LIPI (2004-2007)
Founder and CEO, PT Brainmatics Cipta Informatika
Romi Satria Wahono
-
Course Outline
1. Introduction to Data Mining
2. The Data Mining Process
3. Evaluation and Validation in Data Mining
4. Data Mining Methods and Algorithms
5. Data Mining Research
-
Data Mining Research
-
Data Mining Research
1. Standard Research Process in Data Mining
2. Journal Publications on Data Mining
3. Research on Classification
4. Research on Clustering
5. Research on Prediction
6. Research on Association Rule
-
Standard Research Process in Data Mining
-
Data Mining Standard Process (CRISP-DM)
A cross-industry standard was clearly required that is industry-neutral, tool-neutral, and application-neutral
The Cross-Industry Standard Process for Data Mining (CRISP-DM) was developed in 1996 (Chapman, 2000)
CRISP-DM provides a nonproprietary and freely available standard process for fitting data mining into the general problem-solving strategy of a business or research unit
-
CRISP-DM
-
1. Business Understanding Phase
Enunciate the project objectives and requirements clearly in terms of the business or research unit as a whole
Translate these goals and restrictions into the formulation of a data mining problem definition
Prepare a preliminary strategy for achieving these objectives
-
2. Data Understanding Phase
Collect the data
Use exploratory data analysis to familiarize yourself with the data and discover initial insights
Evaluate the quality of the data
If desired, select interesting subsets that may contain actionable patterns
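
A minimal sketch of what this phase might look like in practice, using only the Python standard library. The records, the column names, and the `profile` helper are hypothetical and chosen purely for illustration; a real project would more likely use a dedicated data analysis tool.

```python
from statistics import mean, median

# Hypothetical toy records; None marks a missing value.
records = [
    {"age": 25, "income": 3200, "approved": "yes"},
    {"age": 41, "income": None, "approved": "no"},
    {"age": 37, "income": 5100, "approved": "yes"},
    {"age": None, "income": 2800, "approved": "no"},
    {"age": 52, "income": 7400, "approved": "yes"},
]

def profile(rows, column):
    """Summarise one numeric column: missing count plus basic statistics."""
    present = [r[column] for r in rows if r[column] is not None]
    return {
        "missing": len(rows) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
    }

# Exploratory summary and a simple data quality check per column.
age_profile = profile(records, "age")
income_profile = profile(records, "income")

# Select a subset that may be worth a closer look: approved applicants only.
approved = [r for r in records if r["approved"] == "yes"]

print(age_profile)
print(income_profile)
print(len(approved), "approved records")
```

The missing-value counts address data quality, the summary statistics give a first feel for the data, and the filtered list is one example of selecting an interesting subset.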
-
3. Data Preparation Phase
Prepare from the initial raw data the final data set that is to be used for all subsequent phases. This phase is very labor-intensive
Select the cases and variables you want to analyze and that are appropriate for your analysis
Perform transformations on certain variables, if needed
Clean the raw data so that it is ready for the modeling tools
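
The three preparation steps can be sketched as follows. This is an illustrative example under stated assumptions, not a prescribed procedure: the raw rows, the chosen columns, and the helper names `clean` and `min_max_scale` are hypothetical, and min-max scaling is only one of many possible transformations.

```python
# Hypothetical raw data; None marks a missing value.
raw = [
    {"id": 1, "age": 25, "income": 3200, "note": "walk-in"},
    {"id": 2, "age": 41, "income": None, "note": "online"},
    {"id": 3, "age": 37, "income": 5100, "note": "online"},
    {"id": 4, "age": 52, "income": 7400, "note": "referral"},
]

# Select the variables to analyze (assumed choice for this sketch).
selected_columns = ["age", "income"]

def clean(rows, columns):
    """Keep only the chosen columns and drop rows with a missing value."""
    kept = []
    for row in rows:
        values = {c: row[c] for c in columns}
        if all(v is not None for v in values.values()):
            kept.append(values)
    return kept

def min_max_scale(rows, column):
    """Rescale one column to the range [0, 1]."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    # Guard against a constant column, where the span would be zero.
    return [{**r, column: (r[column] - lo) / span if span else 0.0} for r in rows]

prepared = clean(raw, selected_columns)
for col in selected_columns:
    prepared = min_max_scale(prepared, col)

print(prepared)
```

Dropping incomplete rows is the simplest cleaning strategy; imputing missing values is a common alternative when data is scarce.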
-
4. Modeling Phase
Select and apply appropriate modeling techniques
Calibrate model settings to optimize results
Remember that often, several different techniques may be used for the same data mining problem
If necessary, loop back to the data preparation phase to bring the form of the data into line with the specific requirements of a particular data mining technique
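
As a rough illustration of applying more than one technique to the same problem and calibrating a model setting, the sketch below compares k-nearest neighbors (with several values of k) against a nearest-centroid classifier. The toy data, the holdout split, and all function names are hypothetical; the two techniques were picked only because they fit in a few lines.

```python
from collections import Counter

# Hypothetical toy data: (feature_1, feature_2) -> class label.
train = [((1.0, 1.0), "low"), ((2.0, 1.5), "low"),
         ((6.0, 5.0), "high"), ((7.0, 7.0), "high"),
         ((6.5, 6.0), "high"), ((8.0, 6.5), "high")]
holdout = [((1.2, 1.5), "low"), ((6.8, 6.2), "high"), ((2.5, 2.0), "low")]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_predict(point, k):
    """Technique 1: majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda item: sq_dist(item[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def centroid_predict(point):
    """Technique 2: assign the class whose mean point is closest."""
    centroids = {}
    for label in {lbl for _, lbl in train}:
        members = [f for f, lbl in train if lbl == label]
        centroids[label] = tuple(sum(col) / len(members) for col in zip(*members))
    return min(centroids, key=lambda lbl: sq_dist(centroids[lbl], point))

def accuracy(predict):
    """Fraction of holdout records that a prediction function gets right."""
    hits = sum(predict(features) == label for features, label in holdout)
    return hits / len(holdout)

# Calibrate the model setting k by comparing holdout accuracy.
knn_scores = {k: accuracy(lambda p, k=k: knn_predict(p, k)) for k in (1, 3, 5)}
best_k = max(knn_scores, key=knn_scores.get)
centroid_score = accuracy(centroid_predict)

print(knn_scores, best_k, centroid_score)
```

With only two "low" training points, k = 5 lets the larger class outvote them, which is why the setting matters here. Both techniques expect numeric features, an example of a requirement that may force a loop back to data preparation.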
-
5. Evaluation Phase
Evaluate the one or more models delivered in the modeling phase for quality and effectiveness before deploying them for use in the field
Determine whether the model in fact achieves the objectives set for it in the first phase
Establish whether some important facet of the business or research problem has not been accounted for sufficiently
Come to a decision regarding use of the data mining results
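
One possible way to check a model against the objectives from the first phase is sketched below. The labels, the `evaluate` helper, and the 0.70 accuracy target are all hypothetical; in a real project the target comes from the business understanding phase and may involve measures other than accuracy.

```python
# Hypothetical actual outcomes and model predictions for eight cases.
actual    = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
predicted = ["yes", "no",  "no", "no", "yes", "yes", "yes", "no"]

def evaluate(actual, predicted, positive="yes"):
    """Compute accuracy, precision, and recall from confusion-matrix counts."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

metrics = evaluate(actual, predicted)

# Assumed objective set in the business understanding phase.
target_accuracy = 0.70
meets_objective = metrics["accuracy"] >= target_accuracy

print(metrics, meets_objective)
```

Reporting precision and recall alongside accuracy helps reveal whether an important facet of the problem, such as the cost of false positives, has been overlooked.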
-
6. Deployment Phase
Make use of the models created: model creation does not signify the completion of a project
Example of a simple deployment: generate a report
Example of a more complex deployment: implement a parallel data mining process in another department
For businesses, the customer often carries out the deployment based on your model
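
The simple deployment case, generating a report, could look something like the sketch below. The `build_report` function, the report title, and the metric values are hypothetical placeholders, not results from any real model.

```python
def build_report(title, metrics, decision):
    """Format model results as a short plain-text report."""
    lines = [title, "=" * len(title)]
    for name, value in metrics.items():
        # Left-align the metric name in a 10-character column.
        lines.append(f"{name:<10} {value:.2f}")
    lines.append(f"Decision: {decision}")
    return "\n".join(lines)

# Hypothetical evaluation results handed over from the previous phase.
report = build_report(
    "Credit Model Summary",
    {"accuracy": 0.75, "precision": 0.75, "recall": 0.75},
    "approved for pilot use",
)

print(report)
```

A report like this is the lightest form of deployment; the customer can act on it without running the model themselves.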
-
Exercise
Study and understand Case Studies 1-5 in Chapter 1 of Larose (2005)
Study and understand how CRISP-DM is applied in Firmansyah's thesis (2011) on applying the C4.5 algorithm to determine creditworthiness
-
Journal Publications on Data Mining
-
Transactions and Journals
Review Paper (survey and state-of-the-art):
ACM Computing Surveys (CSUR)
Research Paper (technical):
ACM Transactions on Knowledge Discovery from Data (TKDD)
ACM Transactions on Information Systems (TOIS)
IEEE Transactions on Knowledge and Data Engineering
Springer Data Mining and Knowledge Discovery
International Journal of Business Intelligence and Data Mining (IJBIDM)
-
Cognitive Assignment III
1. Read one scientific paper, published in a journal in 2010-2012, that relates to the data mining methods we have studied
2. Summarize it as slides with the following structure:
   1. Research Background
   2. Problem Statements
   3. Research Questions
   4. Research Objective
   5. Existing Methods
   6. Proposed Method
   7. Results
   8. Conclusion
3. Present it in front of the class at the next session
-
References
1. Ian H. Witten, Eibe Frank, Mark A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition, Elsevier, 2011
2. Daniel T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining, John Wiley & Sons, 2005
3. Florin Gorunescu, Data Mining: Concepts, Models and Techniques, Springer, 2011
4. Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, 2nd Edition, Elsevier, 2006
5. Oded Maimon and Lior Rokach, Data Mining and Knowledge Discovery Handbook, 2nd Edition, Springer, 2010
6. Warren Liao and Evangelos Triantaphyllou (eds.), Recent Advances in Data Mining of Enterprise Data: Algorithms and Applications, World Scientific, 2007