Think Machine Learning with Scikit-learn (Python)
By: Chetan Khatri, Principal Big Data Engineer, Nazara Technologies.
Data Science Lab, The Department of Computer Science, University of Kachchh.
About me
- Principal Big Data Engineer, Nazara Technologies.
- Technical Reviewer - Packt Publication.
- Ex. Developer - Eccella Corporation.
- Alumni, The Department of Computer Science, KSKV Kachchh University.
Outline
- An Introduction to Machine Learning
- Hello World in Machine Learning with 6 lines of code
- Visualizing a Decision Tree
- Classifying Images
- Supervised Learning: Pipeline
- Writing our first Classifier
Early Days AI Programs : Deep Blue
Now, AI Programs
- AlphaGo is the best-known example: it was written to play the game of Go, and similar learning software can also play Atari games.
Machine Learning
- Machine learning makes this possible: it is the study of algorithms that learn from examples and experience instead of relying on a set of hard-coded rules.
- “Learns from Examples and Experience”
Let's take a problem
- Let's take a problem that seems easy but is difficult to solve without machine learning.
Open Source Libraries
Classifier
Scikit-learn
Test ! No error ! Yay !!
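The slide does not record what the test was. A minimal check, assuming scikit-learn has already been installed (for example with pip or conda), is to import the package and print its version; if the import raises no error, the installation works.

```python
# Sanity check: importing scikit-learn should succeed without an error.
import sklearn

installed_version = sklearn.__version__
print("scikit-learn version:", installed_version)
```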
Supervised Learning
Collect Training Data
Train Classifier
Make Predictions
Training Data
Weight  Texture  Label
150g    Bumpy    Orange
170g    Bumpy    Orange
140g    Smooth   Apple
130g    Smooth   Apple
Features: the Weight and Texture columns
Examples: each row of the table
Training Data: the whole table
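The three steps (collect training data, train a classifier, make predictions) can be sketched in a few lines with scikit-learn. This is an illustrative sketch, not necessarily the code shown in the talk: the numeric encoding of texture (0 = bumpy, 1 = smooth) and the new fruit being predicted are assumptions.

```python
from sklearn import tree

# Training data from the table above.
# Texture is encoded as a number (an assumption of this sketch): 0 = bumpy, 1 = smooth.
features = [[150, 0], [170, 0], [140, 1], [130, 1]]
labels = ["orange", "orange", "apple", "apple"]

# Train a decision tree classifier on the examples.
clf = tree.DecisionTreeClassifier(random_state=0)
clf = clf.fit(features, labels)

# Predict the label of a new fruit: 160 g and bumpy (a made-up example).
prediction = clf.predict([[160, 0]])
print(prediction[0])
```

The new fruit is heavy and bumpy, like the oranges in the table, so the classifier labels it "orange".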
Important Concepts
- How does this work in the real world?
- How much training data do you need?
- How is the tree created?
- What makes a good feature?
Many Types of Classifier
- Artificial Neural Network (ANN)
- Support Vector Machine (SVM)
- Nearest Neighbour classifier (KNN)
- Random Forest (RF)
- Gradient Boosting Machine (GBM)
- Etc.
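Scikit-learn classifiers share the same fit and predict methods, so trying a different type of classifier usually means changing one line. The sketch below compares two of the types listed above on the fruit data; the texture encoding (0 = bumpy, 1 = smooth), the test fruit, and the choice of these two classifiers are assumptions made for illustration.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Fruit data: [weight in grams, texture], texture encoded as 0 = bumpy, 1 = smooth (assumed).
features = [[150, 0], [170, 0], [140, 1], [130, 1]]
labels = ["orange", "orange", "apple", "apple"]

# Two classifier types with the same fit/predict interface.
classifiers = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "nearest neighbour": KNeighborsClassifier(n_neighbors=1),
}

predictions = {}
for name, clf in classifiers.items():
    clf.fit(features, labels)                        # train on the same examples
    predictions[name] = clf.predict([[160, 0]])[0]   # classify a 160 g bumpy fruit
    print(name, "->", predictions[name])
```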
Demo
2. Visualizing a Decision Tree
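The slides do not preserve how the tree was visualized. One lightweight option, assuming a scikit-learn version that provides `sklearn.tree.export_text`, is to print the learned rules as text. The fruit data, the texture encoding (0 = bumpy, 1 = smooth), and the feature names below are assumptions for illustration.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Fruit data: [weight in grams, texture], texture encoded as 0 = bumpy, 1 = smooth (assumed).
features = [[150, 0], [170, 0], [140, 1], [130, 1]]
labels = ["orange", "orange", "apple", "apple"]

clf = DecisionTreeClassifier(random_state=0).fit(features, labels)

# Render the tree as indented text: each branch is a yes/no question about one feature,
# and each leaf names the predicted class.
rendered = export_text(clf, feature_names=["weight_g", "texture"])
print(rendered)
```

Reading the output from top to bottom shows which question the tree asks first and which label it assigns at each leaf.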
3. What Makes a Good Feature?
- Imagine we want to write a classifier to tell apart two types of dogs: labs and greyhounds.
Variation in the world!
- Hands-on
About 80% of dogs at this height are labs
About 95% of dogs at this height are greyhounds
- Different features capture different types of information.
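The height percentages can be explored with a small simulation. This is a sketch under stated assumptions: the population parameters (greyhounds averaging 28 inches, labs averaging 24 inches, both with a spread of 4) and the sample size are invented for illustration, so the resulting shares will not match the slide's 80% and 95% exactly.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Assumed populations (not from the slides): heights in inches.
greyhounds = [random.gauss(28, 4) for _ in range(500)]
labs = [random.gauss(24, 4) for _ in range(500)]

def share_of_labs(low, high):
    """Fraction of dogs with a height in [low, high) that are labs."""
    n_labs = sum(low <= h < high for h in labs)
    n_greyhounds = sum(low <= h < high for h in greyhounds)
    total = n_labs + n_greyhounds
    return n_labs / total if total else float("nan")

short_share = share_of_labs(float("-inf"), 20)   # short dogs
middle_share = share_of_labs(25, 27)             # heights where the two breeds overlap
tall_share = share_of_labs(32, float("inf"))     # tall dogs

print(f"share of labs among short dogs:  {short_share:.2f}")
print(f"share of labs among medium dogs: {middle_share:.2f}")
print(f"share of labs among tall dogs:   {tall_share:.2f}")
```

Height is informative at the extremes but ambiguous in the middle of the range, which is why a classifier usually needs more than one feature.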
Thought Experiment
Avoid useless features
Independent features are best
- Height in inches
- Height in centimeters
- Avoid redundant features
- Features should be easy to understand
- Simpler relationships are easier to learn.
- Ideal features are:
  - Informative
  - Independent
  - Simple
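To see why height in inches and height in centimeters are redundant, one can measure how strongly the two features are correlated. The sketch below uses made-up heights and a hand-written Pearson correlation helper (both assumptions for illustration); a correlation of 1 means the second feature adds no new information.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)

# Made-up dog heights in inches, and the same heights converted to centimeters.
heights_in = [20.0, 22.5, 24.0, 26.0, 28.5, 31.0]
heights_cm = [h * 2.54 for h in heights_in]

r = pearson(heights_in, heights_cm)
print(f"correlation between inches and centimeters: {r:.4f}")
```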
4. Pipeline - Machine Learning
- http://playground.tensorflow.org/
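The slide gives only the heading and a link. One common reading of a supervised learning pipeline is: split the data, train on one part, and measure accuracy on the held-out part. The sketch below follows that reading; the Iris dataset, the 50/50 split, and the decision tree are assumptions chosen for illustration and are not taken from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small labelled dataset (assumed here; any labelled dataset would do).
iris = load_iris()

# Hold out half of the examples so the classifier is tested on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.5, random_state=0
)

# Train on the training half, then score predictions on the held-out half.
clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"accuracy on held-out data: {accuracy:.2f}")
```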
5. Writing our first classifier
- Measure distance
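The slide does not name the distance measure. A common choice for nearest neighbor classifiers is the Euclidean (straight-line) distance, assumed in this sketch; the function name is hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of features."""
    if len(a) != len(b):
        raise ValueError("points must have the same number of features")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# A 3-4-5 right triangle: the distance from (0, 0) to (3, 4) is 5.
d = euclidean_distance([0, 0], [3, 4])
print(d)  # 5.0
```

The same formula works for any number of features, so it applies unchanged to examples with more than two measurements.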
Demo
- Implement the nearest neighbor algorithm
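A minimal sketch of a nearest neighbor classifier follows: it memorizes the training examples and labels each new point with the label of the closest stored example. The class name, the fit/predict interface modelled on scikit-learn, the fruit data, and the texture encoding (0 = bumpy, 1 = smooth) are assumptions for illustration, and `math.dist` requires Python 3.8 or later.

```python
import math

class NearestNeighborClassifier:
    """Minimal 1-nearest-neighbor classifier with a scikit-learn-like interface."""

    def fit(self, features, labels):
        if len(features) != len(labels):
            raise ValueError("features and labels must have the same length")
        # "Training" just stores the examples.
        self.features_ = [list(row) for row in features]
        self.labels_ = list(labels)
        return self

    def predict(self, rows):
        return [self._closest_label(row) for row in rows]

    def _closest_label(self, row):
        # Index of the stored example with the smallest Euclidean distance to row.
        best_index = min(
            range(len(self.features_)),
            key=lambda i: math.dist(row, self.features_[i]),
        )
        return self.labels_[best_index]

# Fruit data: [weight in grams, texture], texture encoded as 0 = bumpy, 1 = smooth (assumed).
clf = NearestNeighborClassifier().fit(
    [[150, 0], [170, 0], [140, 1], [130, 1]],
    ["orange", "orange", "apple", "apple"],
)
predictions = clf.predict([[165, 0], [135, 1]])
print(predictions)  # ['orange', 'apple']
```

Because weight is measured in grams and texture is 0 or 1, weight dominates the distance here; rescaling the features is a common remedy.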
Next Step
Thank you