Machine Learning - Empatika Open

Post on 16-Apr-2017


  • Machine Learning. Bayram Annakov, Empatika Open

  • Types of ML

  • Supervised machine learning

  • Supervised learning

    [figure: scatter plot with two labeled groups of points, x and o]

  • Classification

  • Unsupervised learning

    [figure: scatter plot of unlabeled points (o) on axes X1 and X2]

  • Users clustering

  • Process

  • Baby first steps

    GOAL: better purchase conversion from Trial Emails

    Knowledge: internal Empatika Open

    What's next?

    Load new data & apply

  • First results: I'm a genius! KNN: 97% score on test set

  • & first disappointment

    Too many negatives: unbalanced dataset

    On a balanced dataset: 64%

    Better do something: email conversion is 2 times lower
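
The gap between the two scores above is typical of an unbalanced dataset. A minimal sketch, using made-up labels (970 negatives and 30 positives, not the talk's data), of how a "model" that always predicts the majority class still gets high accuracy, while balanced accuracy shows it is no better than chance:

```python
# Hypothetical labels: 970 negatives (0) and 30 positives (1).
y_true = [0] * 970 + [1] * 30
# A "model" that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(y_pred[i] == cls for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f}, balanced accuracy={bal_acc:.2f}")
```

Plain accuracy rewards guessing the common class, which is why a score measured on a balanced set (or a per-class metric) is a more honest baseline.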

  • Start from scratch

    How:

    1. Small dataset (own)

    2. Balanced

    3. Don't hurry

    [diagram labels: Time, Value, lots of answers]

  • Process

    Data (reducing features, size, scaling, other data stuff)

    Model 1, Model 2, Model N

    Results 1, Results 2, Results N

    Best result

    Train model (parameters)
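
The loop on this slide (prepare data, try several models, compare results, keep the best, then train it) might look like the sketch below. It assumes scikit-learn and uses synthetic data from `make_classification` as a stand-in; the candidate models and their parameters are illustrative choices, not the ones used in the talk.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; the talk used its own email dataset).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Each candidate bundles its own preprocessing (scaling, feature reduction).
candidates = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "knn_pca": make_pipeline(
        StandardScaler(), PCA(n_components=5), KNeighborsClassifier(n_neighbors=5)
    ),
    "svm": make_pipeline(StandardScaler(), SVC(gamma="scale")),
}

# Results 1..N: mean 5-fold cross-validated accuracy per candidate.
results = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

# Best result: keep the top candidate and train it on all the data.
best_name = max(results, key=results.get)
best_model = candidates[best_name].fit(X, y)
print(results, best_name)
```

Putting the preprocessing inside each pipeline keeps the comparison fair: scaling and PCA are re-fitted on the training folds only.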

  • Results

    Test dataset: 30% better

    New dataset: 7% better

    Email conversion 2 times better vs. previous model

    416 different inputs

  • Next level

    Features

    Volume

    Understand Model parameters

    Train model harder (24/7)

    Whole picture: not only 1 score, but Precision, Recall, f1-score, etc.
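
To make the "whole picture" point concrete, here is a small sketch that computes accuracy, precision, recall and F1 from the same predictions. The labels are made up for illustration, and `precision_recall_f1` is a hypothetical helper name, not a library function.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up labels: 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

Here accuracy is 0.7, yet recall is only 0.5: half of the positives are missed, which a single score would hide.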

  • Lessons & Knowledge source

    Think about the balance of features (valuable vs. many vs. few)

    Models are sensitive to different data

    Model tuning is important, but it is a long road

    Sources:

    O'Reilly: Introduction to Machine Learning with Python

    scikit-learn.org

    Github

  • Be patient

  • Process

    Data collection & preparation

    Modeling

    Training

    Evaluation

  • Data preparation!!!

  • Tasks

  • Image classification

  • Rhythmic Gymnastics

  • Approach

    Collect data: simple iPhone app that helps draw and export

    Prepare data: Image = Grid. Each cell = 1 (black) or 0 (white). Convert Grid to Line: Image = 000100011000011100011

    Train + Analyze: until satisfied with the score

  • Prepare data

    import skimage
    import numpy
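
The "image = grid, grid = line" preparation can be sketched with NumPy alone. The example assumes a grayscale image already loaded as an array with values in [0, 1], where 0 is black. The 0.5 threshold and the helper names `to_binary_grid` and `grid_to_line` are illustrative assumptions, not taken from the talk.

```python
import numpy as np


def to_binary_grid(gray, threshold=0.5):
    """Dark pixels (below threshold) become 1, light pixels become 0."""
    return (np.asarray(gray) < threshold).astype(int)


def grid_to_line(grid):
    """Flatten the grid row by row into one feature vector."""
    return np.asarray(grid).reshape(-1)


# Hypothetical 3x7 grayscale drawing (0.1 = dark stroke, 0.9 = background).
gray = np.array([
    [0.9, 0.9, 0.9, 0.1, 0.9, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.1],
    [0.1, 0.1, 0.9, 0.9, 0.9, 0.1, 0.1],
])

line = grid_to_line(to_binary_grid(gray))
line_str = "".join(map(str, line))
print(line_str)
```

Each drawing becomes one row of the training matrix, so every image must be resized to the same grid before flattening.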

  • Train and Analyze

    1. K-neighbors: 78%

    from sklearn.neighbors import KNeighborsClassifier
    clf = KNeighborsClassifier(n_neighbors=1)
    clf.fit(x_train, y_train)
    clf.score(x_test, y_test)

  • Train and Analyze

    2. K-neighbors + PCA: 81%

    from sklearn.decomposition import PCA
    pca = PCA(n_components=40, whiten=True)
    pca.fit(x_train)
    x_train_pca = pca.transform(x_train)
    x_test_pca = pca.transform(x_test)
    # repeat KNN on x_train_pca / x_test_pca
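
For readers who want to run the two steps end to end, here is a sketch on scikit-learn's bundled `load_digits` dataset. This is a stand-in chosen for convenience, not the gymnastics drawings from the talk, so the scores will differ from the 78% and 81% quoted on the slides.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data: 8x8 digit images, already flattened to 64 features.
X, y = load_digits(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Step 1: plain K-neighbors on raw pixels.
knn = KNeighborsClassifier(n_neighbors=1).fit(x_train, y_train)
score_raw = knn.score(x_test, y_test)

# Step 2: fit PCA on the training set only, then repeat KNN on the projection.
pca = PCA(n_components=40, whiten=True).fit(x_train)
x_train_pca = pca.transform(x_train)
x_test_pca = pca.transform(x_test)
knn_pca = KNeighborsClassifier(n_neighbors=1).fit(x_train_pca, y_train)
score_pca = knn_pca.score(x_test_pca, y_test)

print(score_raw, score_pca)
```

Fitting PCA on `x_train` alone keeps information from the test set out of the model, so the test score stays a fair estimate.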

  • Maybe someone has already solved it?

  • MNIST

    http://yann.lecun.com/exdb/mnist/

  • 3. SVM: 90%

    from sklearn import svm
    classifier = svm.SVC(gamma=0.001)
    classifier.fit(x_train, y_train)
    predicted = classifier.predict(x_test)

  • Neural networks?

  • Neuron

  • Perceptron

  • Multi-layered (deep)

  • Problems with images

    Vectors are too big (200x200x3 = 120,000)

    Pixel position matters

  • Convolution

  • Pooling (sub-sampling)
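
As a rough illustration of the two operations named on these slides, the sketch below applies one small filter across an image (convolution) and then keeps the maximum of each 2x2 block (pooling). It uses plain NumPy. The toy image, the edge-detecting kernel and the function names are assumptions made for the example, and real CNN libraries implement this far more efficiently.

```python
import numpy as np


def convolve2d_valid(image, kernel):
    """'Valid' 2D cross-correlation, as CNN layers compute it (no kernel flip)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def max_pool(feature_map, size=2):
    """Keep the maximum of each size x size block (ragged edges are dropped)."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


# Toy 6x6 image: left half dark (0), right half bright (1).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A horizontal-difference kernel responds where brightness changes left to right.
kernel = np.array([[-1.0, 1.0]])

features = convolve2d_valid(image, kernel)  # shape (6, 5)
pooled = max_pool(features)                 # shape (3, 2)
print(pooled)
```

The same two kernel weights are reused at every position, which addresses the "too big vectors" problem, and the filter responds to the edge wherever it appears; pooling then shrinks the map while keeping the strongest response.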

  • CNN

  • Object recognition

  • ImageNet

  • Face recognition

  • Eigenfaces

  • LFW

  • Recommendation systems

  • NLP

  • Bag-of-words
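
A bag-of-words representation turns each text into a vector of word counts over a shared vocabulary, ignoring word order. A minimal standard-library sketch follows; the three example texts and the whitespace tokenizer are assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


# Made-up example documents.
docs = ["free trial offer", "trial ends soon", "free free offer"]

# Shared vocabulary: every distinct word, in a fixed (sorted) order.
vocab = sorted({word for doc in docs for word in tokenize(doc)})


def vectorize(text):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocab]


vectors = [vectorize(doc) for doc in docs]
print(vocab)
print(vectors)
```

The resulting vectors can be fed to the same classifiers used for images earlier in the talk, since each document is now just a row of numbers.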

  • Data is key

  • Competitive Advantage?

  • Costs

  • Why?

  • CPU vs GPU

  • Opportunities

  • Better than Google?

  • Attributes

    Proprietary data sets

    Domain-specific tasks

    Domain-specific knowledge

  • So,

  • Useful links

    The Master Algorithm

    Andrew Ng: "AI is the new electricity"

    fast.ai course

    Introduction to ML with Python

    Python Machine Learning

  • one more thing

  • Please donate any sum to any fund

  • Plans

    3 universities in Paris

    Crowdfunding

    Platform

  • Not only academics

  • New Tech

  • How can you help?

    Finances

    Introductions

    Ideas

    Expertise

    Media

    Tech

    even frequent flyer miles :)

  • Thanks

    Lucy Evstratova

    +79165884397

    Unicore.pro

    AlfaBank

    4154 8120 0093 9516

    Sberbank

    4276 3800 1234 3302