Benefits of using scikit-learn for a machine learning researcher and instructor
Chloé-Agathe Azencott
CBIO, Mines ParisTech – Institut Curie – INSERM U900, Paris (France)
June 15, 2016 – Scikit Learn Day
http://cazencott.info @cazencott
Precision medicine
- Treatment adapted to the (genetic) specificities of the patient.
- Data-driven biology / medicine
  Identify similarities between patients that exhibit similar susceptibilities / prognoses / responses to treatment.
Develop machine learning methods to determine which genetic markers are most relevant to a given trait.
Research with scikit-learn
What I use scikit-learn for (in research)
- Data preprocessing: sklearn.preprocessing
- Cross-validation tools: sklearn.cross_validation
- Performance metrics: sklearn.metrics
- Baseline algorithms: sklearn.linear_model, sklearn.svm, sklearn.ensemble
For more, see my talk last year at PyData: https://www.youtube.com/watch?v=IpK9GeYs_KA
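The four uses listed on this slide can be combined in a few lines. The following is a minimal sketch, not the speaker's actual code: the synthetic data set, the choice of logistic regression as the baseline, and all parameter values are assumptions made for illustration. It imports the cross-validation tools from sklearn.model_selection, which replaced the sklearn.cross_validation module named on the slide in later scikit-learn releases.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a real data set (assumption: 200 samples, 50 features).
X, y = make_classification(
    n_samples=200, n_features=50, n_informative=5, random_state=0
)

# Preprocessing (sklearn.preprocessing) chained with a baseline
# algorithm (sklearn.linear_model) so that scaling is refit on each
# training fold.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Cross-validation tools: 5 stratified, shuffled folds.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Performance metric: area under the ROC curve, provided by
# sklearn.metrics and selected here by its scorer name.
scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print("AUC per fold:", scores)
print("Mean AUC:", scores.mean())
```

Putting the scaler inside the pipeline keeps the test folds out of the preprocessing step, which is one of the bugs that is easy to introduce when writing this by hand.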
Why I use scikit-learn (in research)
To avoid reinventing the wheel while introducing bugs.
- scikit-learn does things I need.
- I trust scikit-learn to do them right.
Teaching with scikit-learn
Generating examples for lectures
Example: Linear regression (p=1000, n=100, 10 causal features)
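An example with these dimensions could be generated along the following lines. This is a hedged sketch rather than the code behind the original slide: the use of make_regression, the noise level, the random seed, and the "top 10 coefficients" comparison are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# n = 100 samples, p = 1000 features, 10 of which are causal.
# noise and random_state are illustrative choices.
X, y, true_coef = make_regression(
    n_samples=100,
    n_features=1000,
    n_informative=10,
    noise=1.0,
    coef=True,
    random_state=0,
)

# Indices of the features that actually generate y.
causal = np.flatnonzero(true_coef)

# Ordinary least squares. With p > n the problem is underdetermined,
# so the model fits the training data essentially perfectly.
model = LinearRegression().fit(X, y)
print("Training R^2:", model.score(X, y))

# How many causal features rank among the 10 largest fitted weights?
top10 = np.argsort(np.abs(model.coef_))[-10:]
n_recovered = len(set(causal) & set(top10))
print("Causal features among the top 10 weights:", n_recovered)
```

Plotting true_coef against model.coef_ gives students a direct view of how an unregularized fit behaves when features outnumber samples.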
Labs and projects
- For machine learning students
Labs and projects
- For machine learning students
  - SVM micro-project
Labs and projects
- For bioinformatics students
  - Biological data analysis micro-projects
Labs and projects
- For bioinformatics students
  - MSc internship
    Validation of an existing method, by a student previously mostly unfamiliar with machine learning
Why I use scikit-learn (for teaching)
- scikit-learn teaches practice.
- Ease of use and pedagogical value:
  - Detailed documentation
  - Multiple tutorials
  - Reliability
- Jupyter Notebooks.
Thinking about the future
- Creating labs with Jupyter & scikit-learn?
- Lecturing with Jupyter notebooks?
  Example: https://github.com/justmarkham/scikit-learn-videos/blob/master/05_model_evaluation.ipynb