[d2 campus] tech meet-up `data science` presentation slides
TRANSCRIPT
-
&
-
2003
: ?
: . .
-
2017
-
..
-
1.
-
ML DM STAT
Data Mining (KDD)
Machine Learning
( AI )
Statistics
From http://www.kdnuggets.com/2014/06/data-science-skills-business-problems.html
-
1.1 Data Mining
From www.saedasayad.com
- Solving everything - Algorithmic & Efficient
-
1.2 Machine Learning
From http://www.humphreysheil.com/blog/deep-learning-and-machine-learning
- AI is all of computer science - Learn, learn and learn
-
1.3 Statistics
From www.quora.com
- The World is probabilistic - Model and Distribution
Too formal but strong
https://www.quora.com/What-can-you-do-with-a-double-major-in-Computer-Science-and-Statistics
-
1.4 Why statistics?
Data Mining (KDD)
Machine Learning
( AI )
Statistics
DATA Probability inevitably
Association Rule (Conditional Probability)
K-means (EM)
1. NO BLACK BOX
2. BREAKTHROUGH
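The link the slide draws between association rules and conditional probability can be sketched in a few lines. This is only an illustration: the transactions and item names are made up, and `support` and `confidence` are hypothetical helper names, not taken from the talk. The confidence of a rule A -> B is the estimated P(B | A) = P(A and B) / P(A).

```python
# Hypothetical shopping baskets, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset: P(itemset)."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Confidence of antecedent -> consequent, i.e. the estimated P(consequent | antecedent)."""
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)

rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(f"P(milk | bread) is about {rule_confidence:.2f}")
```

Seen this way, a data mining rule is a probability estimate with nothing hidden inside it, which fits the slide's "NO BLACK BOX" point.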
-
2003 2017
2007
2010
2012
AI
-
.
-
NEXT..
-
2.
-
1.
? ?
..
-
2.
. .
-
3.
Data Mining (KDD)
Machine Learning
( AI )
Statistics
-
3.
-
1.
-
1.
? ? ? ?
-
2.
? , , ? ? ? ?
-
&
LDA
Fraud detection
Team matching
ROBOT IR ( )
TOPIC
-
UX - UI
-7%
-
[Figure: boxplots on a pixel axis with ticks from 1000px to 5000px]
1st 2750px, 2nd 3600px, 3rd 4550px
1st 560px, 2nd 1060px, 3rd 2050px
1st 560px, 2nd 610px, 3rd 720px
Labels: 560px, User, Valid max 4050 (98%)
- boxplot
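As a rough sketch of how a boxplot turns such numbers into whisker limits, the snippet below computes the interquartile range and Tukey fences. It assumes the "1st" and "3rd" values on the slide are the first and third quartiles, which the slide does not state. The group names A, B and C are hypothetical, and the 1.5 multiplier is the usual boxplot convention, not something confirmed by the talk.

```python
# (1st, 3rd) values copied from the slide; reading them as Q1 and Q3 is an assumption.
# Group names are hypothetical placeholders.
groups = {"A": (2750, 4550), "B": (560, 2050), "C": (560, 720)}

def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

for name, (q1, q3) in groups.items():
    low, high = tukey_fences(q1, q3)
    print(f"group {name}: IQR={q3 - q1}px, fences=({low}px, {high}px)")
```

Values beyond the fences would be drawn as outliers in a standard boxplot.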
.
-
[Figure: graph with y-axis ticks from -50000 to 350000 and x-axis ticks from 0 to 4500; three points labeled "median"]
- , Graph
-
4.
-
NLP ? text ? ?
-
5. &
-
0. 5 .
(1) Data Collection : ( )
(2) Descriptive Statistics - : .
(3) Exploratory data analysis : , .
(4) Hypothesis testing : .
(5) Estimation : . 5 .
-
1. (Descriptive Statistics) -
?
-> .
-> , ,
Median, quantile, variance,
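A minimal sketch of the summaries named on this slide, using only Python's standard `statistics` module. The sample values are hypothetical and chosen only to show a skewed distribution.

```python
import statistics

# Hypothetical measurements with a long right tail, invented for illustration.
samples = [560, 580, 610, 640, 720, 900, 1060, 2050, 3600, 4550]

summary = {
    "mean": statistics.mean(samples),
    "median": statistics.median(samples),
    # Three cut points that split the data into four equal-probability groups.
    "quartiles": statistics.quantiles(samples, n=4),
    # Sample variance (divides by n - 1).
    "variance": statistics.variance(samples),
}

for name, value in summary.items():
    print(name, value)
```

With skewed data like this the mean sits well above the median, which is one reason to report the median and quantiles alongside it.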
-
1. (Descriptive Statistics) -
[ multi modal ]
, .
! , .
-
2. (Exploratory Data Analysis ) - ( )
?
-
2. (Exploratory Data Analysis ) -
- Sequence Mining - Clustering - Classification - Topic modeling - Deep learning
[ clustering ]
-
3. (Hypothesis testing) -
?
-> : , ..
-
3. (Hypothesis testing) -
- P-value - T test, Chi-square test - Likelihood ratio - Cross validation
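To make the p-value and chi-square test items concrete, here is a small sketch of Pearson's chi-square test of independence on a 2x2 table, written with the standard library only. The counts are hypothetical. The p-value formula used here is valid only for one degree of freedom, where the chi-square tail probability equals `erfc(sqrt(x / 2))`.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (table[i][j] - expected) ** 2 / expected
    # Survival function of chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows are two page designs, columns are (clicked, did not click).
observed = [[30, 70], [50, 50]]
stat, p = chi_square_2x2(observed)
print(f"chi-square = {stat:.3f}, p-value = {p:.4f}")
```

A small p-value means the observed difference between the two rows would be unlikely if the rows really shared the same click rate.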
-
4. (Estimation) - ,
-
4. (Estimation) - ,
- Bayesian Inference - Deep Learning
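A minimal sketch of Bayesian inference as an estimation tool, using the conjugate Beta prior for a Bernoulli success rate. The prior and the counts are hypothetical, and `beta_update` and `beta_mean` are illustrative names, not from the talk.

```python
def beta_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing Bernoulli outcomes."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Uniform prior Beta(1, 1), then 7 successes and 3 failures (made-up counts).
prior = (1.0, 1.0)
posterior = beta_update(*prior, successes=7, failures=3)

print("posterior parameters:", posterior)
print("posterior mean estimate:", round(beta_mean(*posterior), 3))
```

The estimate is a whole distribution over the success rate, so the uncertainty left after ten observations stays visible and is not collapsed into a single number.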
-
5. (Data Collection) -
O
X
? SNS
-
3.1 Agony..
D-
Drop
..
-
3.2 Learn from problem-solving
Gaussian Mixture Model for MUSIC (2012)
Beat
,
.
+
.
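The transcript does not preserve how the Gaussian Mixture Model was applied to music, so the following is only a generic sketch of fitting a one-dimensional, two-component mixture with EM. Treating the data as tempo values in beats per minute is an assumption made for illustration, the data is synthetic, and `fit_gmm_1d` is a hypothetical name.

```python
import math
import random
import statistics

def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, init_means, n_iter=100):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    k = len(init_means)
    means = list(init_means)
    variances = [statistics.pvariance(data)] * k  # start from the overall spread
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate parameters from the weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # guard against a collapsing component
    return weights, means, variances

# Synthetic "tempo" data (assumption): two groups around 90 and 140 BPM.
random.seed(0)
data = [random.gauss(90, 5) for _ in range(200)] + [random.gauss(140, 8) for _ in range(200)]

weights, means, variances = fit_gmm_1d(data, init_means=[min(data), max(data)])
print("weights:", [round(w, 2) for w in weights])
print("means:", [round(m, 1) for m in means])
```

Unlike k-means, each point gets a soft responsibility for every component, which is the EM view that the earlier "K-means (EM)" slide points to.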
-
3.3 Roughly speaking about Statistics..
-
3.5 .
t
F
()
discrete: Bernoulli, binomial, Poisson, multinomial
continuous: multivariate normal, Gaussian, beta, Dirichlet, Student t, Chi-square, F, Gamma
-
-
3.7 .
From wikipedia
-
..
-Welcome!
-
-
Q&A
-
Thank You