Post on 15-Mar-2020
Machine Learning
Eden Au, James Fulton
http://joelgrus.com/2013/06/09/post-prism-data-science-venn-diagram/
https://uk.mathworks.com/discovery/machine-learning.html
Supervised Learning
“Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs.”
The Machine
● Linear regression
● K-nearest neighbours
● Support Vector Machines
● Random Forest
● Naïve Bayes
Wikipedia
Linear Regression - Ordinary Least Squares
Assumptions:
1. Linearity (solved by feature engineering e.g. polynomial regression)
2. Error-free inputs (relaxed by errors-in-variables models such as total least squares)
3. Common variance (solved by weighted least squares)
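A minimal sketch of ordinary least squares with polynomial feature engineering, using NumPy. The data are synthetic and the quadratic form is an assumption made purely for illustration; the model stays linear in its weights even though it is non-linear in x.

```python
import numpy as np

# Synthetic data (made up for illustration): a noisy quadratic.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.1, size=x.size)

# Feature engineering: columns [1, x, x^2] keep the model linear in its weights.
X = np.vander(x, N=3, increasing=True)

# Ordinary least squares: find w that minimises ||X w - y||^2.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print(w)  # estimated [intercept, linear, quadratic] coefficients
```

The fitted weights should land close to the generating values 1.0, 2.0 and -0.5, up to the added noise.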
Wikipedia
K-nearest neighbours
1. Inherently non-linear (non-parametric)
2. Simple
3. Parameter selection
4. High memory requirement - slow prediction stage
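A minimal sketch of a k-nearest-neighbours classifier in NumPy. The function name `knn_predict` and the toy points are hypothetical. Note that the whole training set is stored and scanned for every query, which is where the memory cost and the slow prediction stage come from.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Classify each query point by majority vote among its k nearest training points."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    # Indices of the k closest training points for every query.
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Majority vote over the neighbours' integer labels.
    return np.array([np.bincount(y_train[row]).argmax() for row in nearest])

# Toy data: two small groups of points labelled 0 and 1.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                    [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_train = np.array([0, 0, 0, 1, 1, 1])

pred = knn_predict(X_train, y_train, np.array([[0.05, 0.05], [0.95, 1.0]]), k=3)
print(pred)  # [0 1]
```

The only parameter to select here is k; there is no training step at all.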
Main Challenges - Feature Engineering
Shallow learning requires feature engineering
Artificial Neural Network
Neural Networks
Each layer consists of
1. Linear regression
2. Non-linear ‘activation’
Multiple layers of network enable
1. Sophisticated features to be learned
Problem:
1. So many parameters that much more data is required
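A minimal sketch of the forward pass through a small two-layer network in NumPy. The layer sizes, the ReLU activation and the helper name `layer` are assumptions chosen for illustration; the weights are random rather than trained.

```python
import numpy as np

def layer(x, W, b):
    """One layer: a linear map followed by a non-linear activation (ReLU here)."""
    return np.maximum(0.0, x @ W + b)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))            # 4 samples, 3 input features

# Layer sizes are arbitrary choices for the sketch.
W1, b1 = rng.normal(size=(3, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)

hidden = layer(x, W1, b1)              # intermediate features
output = hidden @ W2 + b2              # final linear read-out

# Even this tiny network has 3*8 + 8 + 8*2 + 2 = 50 trainable parameters.
n_params = W1.size + b1.size + W2.size + b2.size
print(output.shape, n_params)  # (4, 2) 50
```

The parameter count grows with the product of adjacent layer widths, which is why larger networks need so much more data.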
Convolutional Neural Network
CNN
1. Leverages spatio-temporal relationships
2. Applies discrete convolution operations
3. Kernels are the only trainable parameters of a convolutional layer (apart from biases), which reduces the number of parameters
4. Insights in kernels
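A minimal sketch of the discrete convolution a CNN layer applies, written with plain NumPy loops. The function name `conv2d_valid`, the toy image and the hand-picked kernel are assumptions for illustration; in a real network the kernel values would be learned.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image ('valid' mode: no padding, stride 1).

    As in most deep-learning libraries, the kernel is not flipped,
    so strictly speaking this is cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 5x5 image with a vertical edge between columns 1 and 2.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)
# Horizontal-difference kernel: responds only where neighbouring pixels differ.
kernel = np.array([[-1.0, 1.0]])

edges = conv2d_valid(image, kernel)
print(edges[0])  # [0. 1. 0. 0.]
```

The kernel has two parameters regardless of the image size, and the same two values are reused at every position.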
Compressing and separating information
The value of each pixel does not matter; the relationships among neighbouring pixels do.
Challenges
1. Data quality (GIGO)
2. Data quantity (overfitting)
3. Black box (neural networks)
4. Feature engineering (for shallow learning)
5. Domain expertise
6. No guarantee
Unsupervised Learning
“Unsupervised machine learning algorithms infer patterns from data, without reference to known outcomes.”
What is the underlying structure?
https://www.bbc.co.uk/news/science-environment-47267081
https://xkcd.com/1838/
Anomaly Detection
“Can automatically discover unusual data points in your dataset.”
Anomaly Detection: example
- Looking for misrecorded values
Credit: National Science Foundation
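A minimal sketch of one simple way to flag a misrecorded value: a robust z-score based on the median and the median absolute deviation (MAD). This is an illustrative choice, not necessarily the method behind the example; the function name `flag_outliers`, the 3.5 threshold and the temperature readings are assumptions.

```python
import numpy as np

def flag_outliers(values, threshold=3.5):
    """Flag points whose robust z-score (median/MAD based) exceeds the threshold.

    Assumes the MAD is non-zero, i.e. the values are not mostly identical.
    """
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    mad = np.median(np.abs(values - median))
    # 0.6745 rescales the MAD so the score is comparable to a standard
    # z-score for normally distributed data.
    robust_z = 0.6745 * (values - median) / mad
    return np.abs(robust_z) > threshold

# Made-up readings; 121.0 looks like a misplaced decimal point.
temps = [12.1, 11.8, 12.4, 12.0, 121.0, 11.9, 12.2]
mask = flag_outliers(temps)
print(mask)  # only the fifth reading is flagged
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value barely shifts them.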
Anomaly Detection: example
- Classifying extreme events
https://www.ncdc.noaa.gov/extremes/cei/definition
Clustering
“Allows you to automatically split the dataset into groups according to similarity.”
Clustering
Clustering: Example
Antarctic ocean temperature profiles
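A minimal sketch of clustering with k-means (Lloyd's algorithm) in NumPy. The slides do not say which clustering algorithm was used for the temperature profiles, so k-means is an assumption here, and the two synthetic blobs merely stand in for real data.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain k-means: alternate point assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Initialise centroids at k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points (keep it if the cluster is empty).
        centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids

# Two well-separated synthetic blobs of 30 points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, size=(30, 2)),
               rng.normal(5.0, 0.3, size=(30, 2))])

labels, centroids = kmeans(X, k=2)
print(centroids)  # one centroid near (0, 0), the other near (5, 5)
```

No labels are supplied at any point; the grouping comes only from distances between the data points.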
Latent Variable Models
“Decomposing the dataset into multiple components.”
Latent Variable Models
Non-random correlation structures and dimensionality reduction in multivariate climate data - Martin Vejmelka et al.
Latent Variable Models: Example
Component Analysis
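A minimal sketch of principal component analysis via the singular value decomposition, as one common form of component analysis. Treating the example as PCA is an assumption, and the data are synthetic: three observed variables driven by a single hidden factor plus a little noise.

```python
import numpy as np

def pca(X, n_components):
    """Principal component analysis via the SVD of the centred data matrix."""
    X_centred = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X_centred, full_matrices=False)
    components = Vt[:n_components]                  # directions of largest variance
    scores = X_centred @ components.T               # data projected onto them
    explained = (S**2 / np.sum(S**2))[:n_components]  # fraction of variance explained
    return components, scores, explained

# Synthetic data: one latent factor mixed into three observed variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = latent @ np.array([[1.0, 2.0, -1.0]]) + 0.05 * rng.normal(size=(200, 3))

components, scores, explained = pca(X, n_components=1)
print(explained)  # close to 1: a single component captures almost all variance
```

The recovered component is proportional to the mixing direction (1, 2, -1), up to sign.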
Autoencoder
“Tries to learn how to compress data down to the most important components.”
Autoencoder: Basic Structure
Image Credit: https://www.jeremyjordan.me/autoencoders/
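A minimal sketch of the encoder, bottleneck and decoder structure, trained by gradient descent in NumPy. To keep it short this is a purely linear autoencoder without activations or biases, which is a simplification; the synthetic data, layer sizes, learning rate and step count are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 4 observed features that really depend on only 2 hidden factors.
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.0, 1.0, 0.0],
                   [0.0, 1.0, 0.0, 1.0]]) / np.sqrt(2.0)
X = latent @ mixing + 0.01 * rng.normal(size=(200, 4))

# Encoder compresses 4 -> 2 (the bottleneck); decoder expands 2 -> 4.
W_enc = 0.1 * rng.normal(size=(4, 2))
W_dec = 0.1 * rng.normal(size=(2, 4))

learning_rate = 0.1
losses = []
for _ in range(3000):
    code = X @ W_enc                   # compressed representation
    X_hat = code @ W_dec               # reconstruction
    err = X_hat - X
    losses.append(np.mean(err**2))     # mean squared reconstruction error
    # Gradients of the loss with respect to both weight matrices.
    grad_out = 2.0 * err / err.size
    grad_dec = code.T @ grad_out
    grad_enc = X.T @ (grad_out @ W_dec.T)
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

print(losses[0], losses[-1])  # reconstruction error before and after training
```

The network is never told what the two hidden factors are; it only has to reproduce its input through a bottleneck that is too narrow to copy it directly.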
Autoencoder: Application
Dimensionality reduction and finding ‘extreme’ weather events
Topic Modelling
“Latent variable models applied to text to boost your literature searches.”
Topic Modelling: Example
Finding the topics of active research and research network structure
Causal Inference
“What are the dynamics of the system? What drives what?”
Causal Inference: Example
Finding direct and indirect teleconnections
Generative Adversarial Networks
“Learns the distribution function of data so that you can draw more unique samples.”
GANs: What they do
- Generator takes random input and tries to create a fake image
- Discriminator tries to tell the difference between real and fake images
https://thispersondoesnotexist.com/
GANs: Example
Generating new, unique examples using what the network has discovered about the data set
GANs: Example
Used to emulate a simulator
Opportunities/resources
“In data science, 80% of the time is spent preparing data with the remaining 20% spent complaining about the need to prepare data...”
Questions?
Gaussian Processes
“Fit a statistical model with minimum assumptions which will return a value and an uncertainty in that value.”
Gaussian Processes: problem
Gaussian Processes: example
Fitting model to fill gaps in data
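A minimal sketch of Gaussian process regression used to fill a gap in data, written in NumPy. The squared-exponential kernel, its hyperparameters, the noise level and the function names `rbf_kernel` and `gp_predict` are assumptions for illustration; the observations are samples of a sine curve with a hole in the middle.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, x_test, noise=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_train, x_test)
    K_ss = rbf_kernel(x_test, x_test)
    # Cholesky factorisation for a numerically stable solve.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std

# Observations of sin(x) with a gap between x = 2 and x = 5.
x_train = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 5.0, 5.5, 6.0])
y_train = np.sin(x_train)

# One test point on top of the data, one in the middle of the gap.
x_test = np.array([1.0, 3.5])
mean, std = gp_predict(x_train, y_train, x_test)
print(mean, std)
```

The prediction comes with an uncertainty that is small next to the observations and grows inside the gap, which is the property the quote above describes.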
Gaussian Processes: example
Optimising an expensive experiment or physical model