practical machine learning

Practical Machine Learning, Jaganadh G [email protected], BarCamp Kerala 9, Amrita Vishwa Vidyapeetham, Karunagapally, 14 November 2010

DESCRIPTION

A talk on practical use of Machine Learning and Mahout

TRANSCRIPT

Page 1: Practical Machine Learning

Practical Machine Learning

Jaganadh G [email protected]

BarCamp Kerala 9

Amrita Vishwa Vidyapeetham, Karunagapally

14 November 2010

Jaganadh G Practical Machine Learning

Page 2: Practical Machine Learning

About Me

Working in natural language processing, machine learning, data mining, etc.

Passionate about free and open source software :-)

Teaches Python in free time and blogs at http://jaganadhg.freeflux.net/blog

Project Lead (NLP) at 365Media Pvt. Ltd., Coimbatore

I am a computational linguist, linguist, and Indologist

Now a software engineer by profession

Page 3: Practical Machine Learning

Machine Learning

Machine learning is a subfield of artificial intelligence (AI) concerned with algorithms that allow computers to learn.

This talk is not intended as an introduction to machine learning.

Don't expect any mathy equations here.

Page 7: Practical Machine Learning

Machine Learning and Our Life

Do you think machine learning has any impact on our lives?

Yes

In our day-to-day lives we use many machine-learning-powered tools

E-mail spam filtering, product recommendations, etc.

Fraud detection

Page 12: Practical Machine Learning

Examples

Page 15: Practical Machine Learning

A tool for building machine-learning-powered products and services

Apache Mahout

Apache Mahout is a machine learning library that supports large data sets. The project's goal is to build scalable machine learning libraries.

Commercially friendly licence (Apache License, Version 2.0)

Well documented

Healthy community

Targeted to developers

Page 16: Practical Machine Learning

Algorithms in Apache Mahout

Collaborative Filtering

User and Item based recommenders

K-Means, Fuzzy K-Means clustering

Mean Shift clustering

Dirichlet process clustering

Latent Dirichlet Allocation

Singular value decomposition

Parallel Frequent Pattern mining

Complementary Naive Bayes classifier

Random forest decision tree based classifier
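To make one entry in the list above concrete, here is a minimal single-machine sketch of k-means clustering in plain Python. It is not Mahout's implementation (Mahout runs the same idea at scale); the function name `kmeans`, its parameters, and the sample points are assumptions made only for illustration.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2-D points (a list of (x, y) tuples) into k groups.

    Illustrative sketch only: alternates an assignment step and an
    update step until the centroids stop moving.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2
                         for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean_x = sum(x for x, _ in cluster) / len(cluster)
                mean_y = sum(y for _, y in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)  # empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


if __name__ == "__main__":
    # Hypothetical sample data: two visually separate groups.
    sample = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
              (8.0, 8.0), (9.0, 9.0), (8.5, 9.5)]
    found_centroids, found_clusters = kmeans(sample, k=2)
    for centroid, members in zip(found_centroids, found_clusters):
        print(centroid, members)
```

The result depends on the starting centroids, which is why the sketch takes a `seed`; on well-separated data such as the sample above, different seeds should still recover the same two groups.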

Page 27: Practical Machine Learning

Demo

Building recommendation engines with Mahout

Document Classification with Mahout

Some Python stuff on Machine Learning
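The demo code itself is not reproduced in this transcript. As a rough idea of what a user-based recommender does, the following plain-Python sketch scores unseen items by a similarity-weighted average of other users' ratings. It does not use Mahout's API; the `RATINGS` data, the function names, and the choice of cosine similarity are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "alice": {"item_a": 5.0, "item_b": 3.0, "item_c": 4.0},
    "bob": {"item_a": 5.0, "item_b": 3.0, "item_c": 4.0, "item_d": 5.0},
    "carol": {"item_a": 1.0, "item_b": 5.0, "item_d": 2.0, "item_e": 5.0},
}


def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts over the items both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Return up to top_n (item, score) pairs for items the user has not rated.

    Each score is the average of other users' ratings for the item,
    weighted by how similar those users are to the target user.
    """
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        if similarity <= 0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                totals[item] = totals.get(item, 0.0) + similarity * rating
                weights[item] = weights.get(item, 0.0) + similarity
    scores = {item: totals[item] / weights[item] for item in totals}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)[:top_n]


if __name__ == "__main__":
    for item, score in recommend(RATINGS, "alice"):
        print(item, round(score, 2))
```

A weighted average like this ignores how many neighbours rated an item, so an item rated highly by a single user can outrank one rated by several; a production recommender would typically restrict scoring to the nearest neighbours or require a minimum number of ratings.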

Page 29: Practical Machine Learning

Reference

Mahout in Action, by Sean Owen and Robin Anil, published by Manning Publications.

Taming Text, by Grant Ingersoll and Tom Morton, published by Manning Publications.

Introducing Apache Mahout, by Grant Ingersoll: an introduction to Apache Mahout focused on clustering, classification, and collaborative filtering. https://www.ibm.com/developerworks/java/library/j-mahout/index.html

Programming Collective Intelligence: Building Smart Web 2.0 Applications. http://www.amazon.com/Programming-Collective-Intelligence-Building-Applications/dp/0596529325

Page 30: Practical Machine Learning

Useful Resources

Apache Mahout site: http://mahout.apache.org/

Apache Mahout mailing list: [email protected]

The code used for the Mahout demo is available at http://bitbucket.org/jaganadhg/blog/src/tip/bck9/java/

Twenty Newsgroups data set: http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz

Page 31: Practical Machine Learning

Questions?

Page 32: Practical Machine Learning

Acknowledgments

Thanks to:

Manning Publications for a review copy of the book "Mahout in Action"

Apache Mahout mailing list members

Ted Dunning and Robin Anil for suggestions

Sreejith S and Biju B for Java help

@chelakkandupoda for review and criticism

Mukundhanchari, R&D Director, 365Media Pvt. Ltd., for support and encouragement

Page 33: Practical Machine Learning

Finally
