python machine learning step-by-step: modeling financial time series data · 2017-10-08 · python...
Python Machine Learning Step-by-Step: Modeling Financial Time Series Data
Reece Heineke
Director of Big Data, Credibly
February 27, 2017
What is Machine Learning?
Data Preparation: Overview, Python Toolbox, Trade Ideas to Data, Conclusion
Exploratory Data Analysis: Overview, Scatter Plot, Principal Component Analysis (PCA), Conclusion
Fitting Models: Overview, Models and Pipelines, Learning Curves, Interpretability, Conclusion
A Fitted Model
What is Machine Learning?
1. Machine learning is a subfield of computer science that provides computers with the ability to learn without being explicitly programmed.
2. There are two sides to every machine learning problem:
2.1 The learning
2.2 Model produced from the learning
Data Preparation: Overview
- Review the Python software stack
- Motivate the problem
- Discuss some issues specific to time series modeling
Python Toolbox
Source: Scientific Python by Eueung Mulyana
Trump2Cash
Source: Trump2Cash GitHub Project
Input: Trump criticizes Toyota on Twitter
Output: Toyota stock opens lower
Source: Toyota Stock on Yahoo Finance’s Interactive Chart
WSJ Analysis of Trump Tweets
Source: Akane Otani and Shane Shifflett
IPython: A Data Scientist’s Best Friend
Jupyter Notebook
Data Preparation: Conclusion
We now have an illustrative data set to work with:
- Data set has 10 numeric dimensions: 9 inputs, 1 output
- Data set is large (~400 MB compressed)
Exploratory Data Analysis: Overview
- Covariance and correlation matrices
- Scatter plots
- Principal Component Analysis (PCA)
- Kernel PCA
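As a minimal sketch of the first item, covariance and correlation matrices can be computed with NumPy. The data here is synthetic (9 random inputs and 1 output built from input 0); it is an assumption standing in for the talk's data set, which is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): 9 input columns and 1 output column.
X = rng.normal(size=(500, 9))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
data = np.column_stack([X, y])

# rowvar=False treats each column as a variable and each row as an observation.
cov = np.cov(data, rowvar=False)
corr = np.corrcoef(data, rowvar=False)

# Correlation of every input with the output (column 9).
print(np.round(corr[:9, 9], 2))
```

Note that a correlation matrix only measures linear association, so a strongly nonlinear relationship can still show a small coefficient.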
Scatter Plot: What can we say about the data?
scikit-learn Algorithm Cheat-Sheet: Just looking
Source: scikit-learn Cheat-Sheet
Principal Component Analysis (PCA)
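A minimal PCA sketch with scikit-learn, on synthetic data (an assumption; the slide's figure and data are not reproduced). Standardizing first and keeping two components are illustrative choices, not settings taken from the talk.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): 9 inputs, two of them strongly correlated.
X = rng.normal(size=(500, 9))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=500)

# PCA is sensitive to scale, so standardize each column first.
X_std = StandardScaler().fit_transform(X)

# Project onto the two directions of largest variance.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_std)
explained = pca.explained_variance_ratio_
print(explained)
```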
Kernel PCA with Radial Basis Function (RBF)
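Kernel PCA with an RBF kernel can be sketched the same way using scikit-learn's `KernelPCA`. The synthetic data and the `gamma` value below are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption) for the 9 input dimensions.
X = rng.normal(size=(500, 9))
X_std = StandardScaler().fit_transform(X)

# The RBF kernel lets the projection capture nonlinear structure.
# gamma=0.1 is an arbitrary illustrative value, not one from the talk.
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=0.1)
X_kpca = kpca.fit_transform(X_std)
print(X_kpca.shape)
```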
Exploratory Data Analysis: Conclusion
- Nonlinear relationships appear in the dimension pairs (0, 9), (2, 9), and (6, 9)
- All other dimensions look largely random
Fitting Models: Overview
- scikit-learn’s models and pipelines
- Illustrative learning curves
scikit-learn Revisited
Source: scikit-learn Cheat-Sheet
scikit-learn Pipeline
Source: Python Machine Learning by Sebastian Raschka
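A scikit-learn `Pipeline` chains preprocessing steps and a final estimator behind a single `fit`/`predict` interface. The sketch below uses synthetic data, and the step names and the choice of scaler, PCA, and linear regression are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): 9 inputs, 1 output.
X = rng.normal(size=(500, 9))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

# Each step is a (name, estimator) pair; the names are hypothetical.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=5)),
    ("model", LinearRegression()),
])

# fit() runs fit_transform on each transformer in order, then fits the model.
pipe.fit(X, y)
pred = pipe.predict(X[:5])
print(pred)
```

Because the transformers are fitted inside the pipeline, they only ever see the training portion of the data when the pipeline is used in cross-validation.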
Holdout Method
Source: Python Machine Learning by Sebastian Raschka
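A holdout split can be sketched with `train_test_split`. For time series it is common to keep the rows in chronological order so the test set lies strictly after the training set; using `shuffle=False` here is an assumption of this sketch, not a setting confirmed by the slides. The data and the time index `t` are synthetic.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): rows are ordered by a hypothetical time index t.
t = np.arange(500)
X = rng.normal(size=(500, 9))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

# shuffle=False keeps the original order: the last 100 rows become the test set.
X_train, X_test, y_train, y_test, t_train, t_test = train_test_split(
    X, y, t, test_size=100, shuffle=False
)
print(X_train.shape, X_test.shape)
```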
Cross-Validation
Source: Python Machine Learning by Sebastian Raschka
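Cross-validation repeats the holdout idea over several folds. The sketch below uses scikit-learn's `TimeSeriesSplit`, in which every training fold precedes its test fold; choosing it over plain k-fold is an assumption made here because the data is a time series. The data and the model settings are illustrative.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit, cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): 9 inputs, 1 output, rows in time order.
X = rng.normal(size=(500, 9))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

# Five expanding-window splits; each test fold comes after its training fold.
cv = TimeSeriesSplit(n_splits=5)
model = DecisionTreeRegressor(max_depth=3, random_state=0)

# One R^2 score per fold.
scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
print(scores.mean(), scores.std())
```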
Learning Curves: What do they tell us?
Source: Python Machine Learning by Sebastian Raschka
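A learning curve plots training and validation scores against the number of training samples. A minimal sketch with scikit-learn's `learning_curve` follows; the synthetic data, the model, and the five training sizes are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): 9 inputs, 1 output.
X = rng.normal(size=(500, 9))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

# Score the model at 5 training-set sizes, each with 5-fold cross-validation.
train_sizes, train_scores, val_scores = learning_curve(
    DecisionTreeRegressor(max_depth=3, random_state=0),
    X,
    y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    cv=5,
    scoring="r2",
)

# Average over folds: one training and one validation score per size.
train_mean = train_scores.mean(axis=1)
val_mean = val_scores.mean(axis=1)
for n, tr, va in zip(train_sizes, train_mean, val_mean):
    print(f"n={n}: train R^2={tr:.3f}, validation R^2={va:.3f}")
```

A large, persistent gap between the two curves suggests overfitting; two low curves that have converged suggest underfitting.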
Poor fit: Linear Regression even with (K)PCA
Good fits: SVR (RBF) and Decision Tree Learning Curves
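The two model families named on this slide can be fitted and scored as in the sketch below. It uses synthetic data and arbitrary hyperparameters (both assumptions), so the scores it prints say nothing about the results shown in the talk.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): 9 inputs, output depends nonlinearly on input 0.
X = rng.normal(size=(500, 9))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

# Chronological holdout: first 400 rows for training, last 100 for testing.
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

# SVR is scale-sensitive, so it is wrapped with a scaler.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
tree = DecisionTreeRegressor(max_depth=4, random_state=0)

# score() returns R^2 on the held-out rows.
svr_r2 = svr.fit(X_train, y_train).score(X_test, y_test)
tree_r2 = tree.fit(X_train, y_train).score(X_test, y_test)
print(f"SVR (RBF) R^2: {svr_r2:.3f}")
print(f"Decision tree R^2: {tree_r2:.3f}")
```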
Classic Overfitting: Random Forest Regressor
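One simple way to see overfitting is to compare a model's score on its training data with its score on held-out data. The sketch below does this for a random forest on noisy synthetic data; the data, noise level, and hyperparameters are assumptions, not the talk's setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption) with substantial noise in the output.
X = rng.normal(size=(500, 9))
y = np.sin(X[:, 0]) + 0.5 * rng.normal(size=500)

# Chronological holdout: first 400 rows for training, last 100 for testing.
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

# Fully grown trees can fit much of the training noise.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

train_r2 = forest.score(X_train, y_train)
test_r2 = forest.score(X_test, y_test)
gap = train_r2 - test_r2
print(f"train R^2={train_r2:.3f}, test R^2={test_r2:.3f}, gap={gap:.3f}")
```

Limiting tree depth or raising `min_samples_leaf` are common ways to narrow such a gap.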
Decision Trees: Easy to understand
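Part of what makes a decision tree easy to understand is that its fitted rules can be printed as plain text. A minimal sketch with scikit-learn's `export_text` follows; the synthetic data and the feature names `x0` to `x8` are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): output depends only on input 0.
X = rng.normal(size=(500, 9))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

# A shallow tree keeps the rule list short enough to read.
tree = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

# Hypothetical feature names for the 9 input columns.
feature_names = [f"x{i}" for i in range(9)]
rules = export_text(tree, feature_names=feature_names)
print(rules)
```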
Fitting Models: Conclusion
- Support Vector Regression (SVR) with a Radial Basis Function (RBF) kernel has higher accuracy
- Decision Tree is easier to understand
- Choice involves our own priors on the underlying structure
Second Half of Machine Learning: A Persistent Model
Jupyter Notebook
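One common way to persist a fitted scikit-learn model is with `joblib`, sketched below. The talk's notebook is not reproduced here, so the model, the synthetic data, and the file name are all assumptions.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in (assumption): 9 inputs, 1 output.
X = rng.normal(size=(500, 9))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

model = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

# Save the fitted model to disk and load it back (file name is hypothetical).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "fitted_model.joblib")
    dump(model, path)
    restored = load(path)

# The restored model should reproduce the original predictions.
same_predictions = np.allclose(model.predict(X[:5]), restored.predict(X[:5]))
print(same_predictions)
```

Files written this way should only be loaded from trusted sources and with a compatible scikit-learn version.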
Thanks for listening: Q&A
https://github.com/rheineke/time_series_modeling