data mining - universitas hasanuddinunhas.ac.id/amil/s1tif/dm2020/04 dm 2020.pdf · algoritma data...
Data Mining
http://www.unhas.ac.id/amil/S1TIF/DM2020/
L4
Amil Ahmad Ilham
Data Mining (DM) Algorithms
1. Estimation:
• Linear Regression, Neural Network, Support Vector Machine, etc.
2. Prediction/Forecasting:
• Linear Regression, Neural Network, Support Vector Machine, etc.
3. Classification:
• Naive Bayes, K-Nearest Neighbor, C4.5, ID3, CART, Linear Discriminant Analysis, Logistic Regression, etc.
4. Clustering:
• K-Means, K-Medoids, Self-Organizing Map (SOM), Fuzzy C-Means, etc.
5. Association:
• FP-Growth, Apriori, Coefficient of Correlation, Chi Square, etc.
Regression Model Evaluation
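Regression Evaluation
Root Mean Squared Error:
$$RMSE = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(y_j - \hat{y}_j\right)^2}$$
Popular because it yields a value on the same scale as the response vector y.
Mean Absolute Error:
$$MAE = \frac{1}{n}\sum_{j=1}^{n}\left|y_j - \hat{y}_j\right|$$
Reflects the average error.
Mean Squared Error:
$$MSE = \frac{1}{n}\sum_{j=1}^{n}\left(y_j - \hat{y}_j\right)^2$$
Popular because it gives more weight to large errors.
In each formula, the error is the difference $y_j - \hat{y}_j$ between the actual value $y_j$ and the predicted value $\hat{y}_j$.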
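The three metrics can be computed directly from their definitions. The following is a minimal NumPy sketch; the function names and the sample arrays `y_true` and `y_pred` are illustrative and do not come from the slides.

```python
import numpy as np

def mae(y, y_hat):
    """Mean Absolute Error: average of |y_j - y_hat_j|."""
    return np.mean(np.abs(np.asarray(y) - np.asarray(y_hat)))

def mse(y, y_hat):
    """Mean Squared Error: average of (y_j - y_hat_j)^2."""
    return np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2)

def rmse(y, y_hat):
    """Root Mean Squared Error: square root of the MSE."""
    return np.sqrt(mse(y, y_hat))

# Illustrative values only (not from the slides); errors are 1, 0, -2.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

print("MAE :", mae(y_true, y_pred))
print("MSE :", mse(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
```

Because the errors are squared before averaging, the single error of size 2 contributes most of the MSE and RMSE, while it counts only in proportion to its size in the MAE.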
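Evaluation Approach (Train/Test Split)

| Row | Engine Size | Cylinder | Fuel Consumption | Co2 Em | Set |
|---|---|---|---|---|---|
| 0 | 2.0 | 4 | 8.5 | 196 | Train |
| 1 | 2.4 | 4 | 9.6 | 221 | Train |
| 2 | 1.5 | 4 | 5.9 | 136 | Train |
| 3 | 3.5 | 6 | 11.1 | 255 | Train |
| 4 | 3.5 | 6 | 10.6 | 244 | Train |
| 5 | 3.5 | 6 | 10.0 | 230 | Train |
| 6 | 3.5 | 6 | 10.1 | 232 | Test |
| 7 | 3.7 | 6 | 11.1 | 255 | Test |
| 8 | 3.7 | 6 | 11.6 | 267 | Test |
| 9 | 2.4 | 4 | 9.2 | 212 | Test |

Compare the actual values with the predicted values on the test rows:

| Row | Actual value | Predicted value |
|---|---|---|
| 6 | 232 | 234 |
| 7 | 255 | 256 |
| 8 | 267 | 267 |
| 9 | 212 | 210 |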
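The comparison on this slide can be reproduced in a few lines. This sketch assumes that rows 0-5 form the training set and rows 6-9 the test set, as the Train/Test labels and the prediction row numbers suggest; the predicted values are copied from the slide rather than produced by a model here.

```python
import numpy as np

# Co2 Em column from the table on the slide, rows 0-9.
co2 = np.array([196, 221, 136, 255, 244, 230, 232, 255, 267, 212], dtype=float)

# Assumed split: rows 0-5 for training, rows 6-9 for testing.
n_train = 6
co2_train, co2_test = co2[:n_train], co2[n_train:]

# Predictions for rows 6-9 as listed on the slide.
co2_pred = np.array([234, 256, 267, 210], dtype=float)

errors = co2_test - co2_pred      # actual minus predicted
mae = np.mean(np.abs(errors))
mse = np.mean(errors ** 2)
rmse = np.sqrt(mse)

print("errors:", errors)
print("MAE =", mae, "MSE =", mse, "RMSE =", rmse)
```

For these four test rows the errors are -2, -1, 0 and 2, which gives MAE = 1.25, MSE = 2.25 and RMSE = 1.5.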
Evaluation Approach (Train/Test Split)
• Example:
Non-linear Models
Should we use linear regression?
GDP data, 1960-2014
Should we use linear regression?
GDP data, 1960-2014: the curve looks like an exponential or logistic function.
Various Types of Regression
Find the model function that fits best.
Linear vs Non-linear Regression
• How can we tell whether a problem is linear or non-linear?
  • Visual inspection (compute the correlation coefficient between the dependent and independent variables; a value > 0.7 indicates a linear relationship)
  • Based on accuracy (when the relationship cannot be modeled accurately with linear parameters)
• How should we model the data if the scatter plot shows non-linear behavior?
  • Polynomial regression
  • Non-linear regression
  • “Transforming” the data…
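The 0.7 rule of thumb from this slide can be checked with NumPy's `np.corrcoef`. The sketch below uses the Engine Size and Co2 Em columns from the train/test slide as a roughly linear case and a synthetic y = x^2 as a non-linear case. The helper name `pearson_r` is illustrative, and taking the absolute value (so that strong negative correlations also count as linear) is an assumption beyond what the slide states.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]

# Columns taken from the train/test slide.
engine = np.array([2.0, 2.4, 1.5, 3.5, 3.5, 3.5, 3.5, 3.7, 3.7, 2.4])
co2 = np.array([196, 221, 136, 255, 244, 230, 232, 255, 267, 212], dtype=float)

# Synthetic non-linear case: a parabola on a symmetric x range.
x = np.linspace(-5.0, 5.0, 21)
y_quad = x ** 2

for name, r in [("engine size vs CO2", pearson_r(engine, co2)),
                ("x vs x^2", pearson_r(x, y_quad))]:
    verdict = "linear" if abs(r) > 0.7 else "possibly non-linear"
    print(f"{name}: r = {r:.2f} -> {verdict}")
```

A low correlation only says that a straight line is a poor description; it does not say which non-linear model to use, so the scatter plot is still needed.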
Tutorial
Linear Equation
• y = ax + b
• Example: y = 2x + 3
• Sketch the graph of y by hand for -5 <= x <= 5
Linear Equation
y = 2x + 3
Using Jupyter Notebook, plot y for -5 <= x <= 5:
• import numpy as np
• import matplotlib.pyplot as plt
• %matplotlib inline
• x = np.arange(-5.0, 5.0, 0.5)
• y = 2*(x) + 3
• plt.plot(x, y, 'r')
• plt.ylabel('y')
• plt.xlabel('x')
• plt.show()
Linear Equation
Suppose the data are generated randomly:
• x = np.arange(-5.0, 5.0, 0.4)
• y = 2*(x) + 3
• y_random = 2 * np.random.normal(size=x.size)
• ydata = y + y_random
• plt.plot(x, ydata, 'bo')
• plt.plot(x, y, 'r')
• plt.ylabel('y')
• plt.xlabel('x')
• plt.show()
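Given noisy points like these, the underlying line can be estimated by least squares. This is a minimal sketch, assuming `np.polyfit` with degree 1 as the fitting method and a fixed random seed for repeatability; neither is specified on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed: an assumption, for repeatability

x = np.arange(-5.0, 5.0, 0.4)
y = 2 * x + 3                             # true line from the slide
ydata = y + 2 * rng.normal(size=x.size)   # same noise scale as on the slide

# Least-squares fit of a degree-1 polynomial: ydata ~ a*x + b
a, b = np.polyfit(x, ydata, 1)
print(f"estimated a = {a:.2f}, b = {b:.2f} (true values: a = 2, b = 3)")
```

The estimates will not match 2 and 3 exactly because of the noise, but they should land close to them.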
Non-Linear Equations
Non-Linear Equation (Polynomial)
• y = ax^3 + bx^2 + cx + d
• Example: y = x^3 + 2x^2 + 3x + 4
• Sketch the graph of y by hand for -5 <= x <= 5
Non-Linear Equation (Polynomial)
y = x^3 + 2x^2 + 3x + 4
• Suppose the data are generated randomly:
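The code for this slide did not survive extraction. The sketch below follows the same pattern as the linear example; the step size, the noise scale of 20, and the fixed seed are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed: an assumption

x = np.arange(-5.0, 5.0, 0.1)
coeffs = [1, 2, 3, 4]                     # y = x^3 + 2x^2 + 3x + 4, highest power first
y = np.polyval(coeffs, x)

y_noise = 20 * rng.normal(size=x.size)    # noise scale of 20 is an assumption
ydata = y + y_noise

print(ydata[:5])
```

In a notebook, `x`, `ydata` and `y` can be passed to `plt.plot` exactly as in the linear example.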
Non-Linear Equation (Quadratic)
y = x^2
• Suppose the data are generated randomly:
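Polynomial regression, listed earlier as one way to model non-linear data, can recover this curve from noisy samples. The sketch assumes a noise scale of 2, a fixed seed, and `np.polyfit` with degree 2; none of these appear on the slide.

```python
import numpy as np

rng = np.random.default_rng(1)            # fixed seed: an assumption

x = np.arange(-5.0, 5.0, 0.1)
y = x ** 2
ydata = y + 2 * rng.normal(size=x.size)   # noise scale of 2 is an assumption

# Fit ydata ~ a*x^2 + b*x + c by least squares.
a, b, c = np.polyfit(x, ydata, 2)
print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f} (true values: 1, 0, 0)")
```

The model is still linear in its coefficients a, b and c, which is why an ordinary least-squares routine is enough here.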
Non-Linear Equation (Exponential)
y = e^x
• Suppose the data are generated randomly:
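For exponential data, the "transform the data" option listed earlier applies: taking the logarithm of y turns y = e^x into a straight line. This sketch assumes multiplicative noise so that every y value stays positive; the noise model, seed, and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)            # fixed seed: an assumption

x = np.arange(-5.0, 5.0, 0.1)
y = np.exp(x)

# Multiplicative noise (an assumption) keeps ydata > 0, so log(ydata) is defined.
ydata = y * np.exp(0.2 * rng.normal(size=x.size))

# log(ydata) = x + noise, so a degree-1 fit should give slope ~ 1 and intercept ~ 0.
slope, intercept = np.polyfit(x, np.log(ydata), 1)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

With additive noise instead, some values near the left end of the range could become negative and the log transform would fail, which is why the noise model matters for this approach.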
Non-Linear Equation (Logarithmic)
y = log x
• Suppose the data are generated randomly:
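A sketch for the logarithmic case. Since log x is defined only for x > 0, the x range here starts at 0.1 instead of -5; that range, the choice of the natural logarithm, the noise scale, and the seed are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)            # fixed seed: an assumption

x = np.arange(0.1, 5.0, 0.1)              # strictly positive x (assumed range)
y = np.log(x)                             # natural logarithm (assumed base)
ydata = y + 0.1 * rng.normal(size=x.size) # noise scale of 0.1 is an assumption

print("y at x = 1:", y[9])                # close to 0, since log(1) = 0
print(ydata[:5])
```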
Non-Linear Equation (Sigmoidal/Logistic)
• Suppose the data are generated randomly:
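The formula on this slide was an image and is missing from the transcript. A common two-parameter form, assumed here, is y = 1 / (1 + e^(-beta_1 (x - beta_2))), where beta_1 controls the steepness and beta_2 the midpoint. The parameter names follow the beta_1 and beta_2 mentioned later in the deck; the values below are illustrative.

```python
import numpy as np

def sigmoid(x, beta_1, beta_2):
    """Logistic curve with steepness beta_1 and midpoint beta_2 (assumed form)."""
    return 1.0 / (1.0 + np.exp(-beta_1 * (x - beta_2)))

rng = np.random.default_rng(4)            # fixed seed: an assumption

x = np.arange(-5.0, 5.0, 0.1)
y = sigmoid(x, beta_1=2.0, beta_2=0.0)    # illustrative parameter values
ydata = y + 0.05 * rng.normal(size=x.size)

print("value at the midpoint:", sigmoid(0.0, 2.0, 0.0))
```

At x = beta_2 the curve passes through 0.5, and it flattens toward 0 on the left and 1 on the right, which is the S shape that distinguishes it from an exponential.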
Case Study: Non-linear Regression
• Download the file china_gdp.csv from http://www.unhas.ac.id/amil/S1TIF/DM2020/
  • Right-click the file => Save Link As => Save as type: All Files
• Open a new file in Jupyter Notebook
Viewing the Dataset
• Start a new Jupyter notebook
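The notebook code for this slide is not in the transcript. This is a minimal sketch for loading the file with NumPy. It assumes china_gdp.csv has one header row followed by two comma-separated numeric columns (year, GDP value); that layout is not shown in the slides, so check it against the actual file. `load_gdp` is a hypothetical helper.

```python
import os
import numpy as np

def load_gdp(source):
    """Read (year, gdp) columns from a CSV path or file-like object.

    Assumes one header row and two numeric, comma-separated columns.
    """
    year, gdp = np.loadtxt(source, delimiter=",", skiprows=1, unpack=True)
    return year, gdp

path = "china_gdp.csv"                    # file name from the slide
if os.path.exists(path):
    year, gdp = load_gdp(path)
    print("rows:", year.size)
    print("first rows:", list(zip(year[:5], gdp[:5])))
else:
    print(f"{path} not found; download it first.")
```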
Plotting dataset
Choosing a Model That Fits the Dataset (?)
[Figure: three plots labeled Dataset, Exponential, and Sigmoidal/Logistic]
Choosing a Model That Fits the Dataset (?)
[Figure: two plots labeled Dataset and Sigmoidal/Logistic]
Building the Model (Sigmoidal/Logistic)
Testing Model (Optional)
Normalizing the Data
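The normalization code is not in the transcript. One simple scheme, assumed here, divides each variable by its maximum so that both fall in a comparable range, which tends to make the later parameter search better behaved. The GDP values below are placeholders, not real data, and `normalize_by_max` is a hypothetical helper.

```python
import numpy as np

def normalize_by_max(values):
    """Scale positive values so that the largest one becomes 1.0."""
    values = np.asarray(values, dtype=float)
    return values / values.max()

x_data = np.arange(1960, 2015)                   # years 1960-2014, as on the slides
y_data = np.linspace(1.0, 100.0, x_data.size)    # placeholder values, NOT real GDP data

xdata = normalize_by_max(x_data)
ydata = normalize_by_max(y_data)
print(xdata.min(), xdata.max(), ydata.max())
```

Keep the original maxima around, since they are needed to convert predictions back to the original units.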
Determining the Values of beta_1 and beta_2
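The fitting code itself is missing from the transcript. A common approach, assumed here, is non-linear least squares with `scipy.optimize.curve_fit`. Because the GDP file is not available in this sketch, it fits synthetic data generated from known parameters (8.0 and 0.6, chosen arbitrarily) to show that the procedure recovers them. The sigmoid form and the starting guess `p0` are also assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, beta_1, beta_2):
    """Logistic curve with steepness beta_1 and midpoint beta_2 (assumed form)."""
    return 1.0 / (1.0 + np.exp(-beta_1 * (x - beta_2)))

rng = np.random.default_rng(5)            # fixed seed: an assumption

# Synthetic, already-normalized data from known parameters (not real GDP data).
xdata = np.linspace(0.0, 1.0, 55)
ydata = sigmoid(xdata, 8.0, 0.6) + 0.01 * rng.normal(size=xdata.size)

# p0 is the starting guess for (beta_1, beta_2); curve_fit refines it.
popt, pcov = curve_fit(sigmoid, xdata, ydata, p0=[5.0, 0.5])
beta_1, beta_2 = popt
print(f"beta_1 = {beta_1:.3f}, beta_2 = {beta_2:.3f}")
```

On real data a reasonable starting guess matters more than it does here, because a poor one can leave the optimizer in a flat region of the sigmoid.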
Compute y_prediksi
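Once beta_1 and beta_2 are known, the prediction is the model evaluated at the normalized inputs, scaled back to the original units. This sketch assumes a two-parameter sigmoid and max-normalization of both variables; `predict` is a hypothetical helper, and the parameter values and `y_max` below are placeholders, not fitted results.

```python
import numpy as np

def sigmoid(x, beta_1, beta_2):
    """Logistic curve with steepness beta_1 and midpoint beta_2 (assumed form)."""
    return 1.0 / (1.0 + np.exp(-beta_1 * (x - beta_2)))

def predict(x_raw, beta_1, beta_2, x_max, y_max):
    """Normalize x, apply the model, and rescale to the original y units."""
    x_norm = np.asarray(x_raw, dtype=float) / x_max
    return y_max * sigmoid(x_norm, beta_1, beta_2)

years = np.arange(1960, 2015)             # 1960-2014, as on the slides
beta_1, beta_2 = 500.0, 0.99              # placeholder parameters, not fitted values
y_max = 100.0                             # placeholder scale, not a real GDP maximum

y_prediksi = predict(years, beta_1, beta_2, x_max=years.max(), y_max=y_max)
print(y_prediksi[-5:])
```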
Plot of the Non-linear Regression Result
Assignment
• Write a program that produces:
  • GDP China 1960 - 2014
  • Predicted GDP China 2015 - 2030