online algorithms in machine learning

AMRINDER ARORA ONLINE ALGORITHMS IN MACHINE LEARNING

Upload: amrinder-arora

Post on 12-Apr-2017


TRANSCRIPT

Page 1: Online algorithms in Machine Learning

AMRINDER ARORA

ONLINE ALGORITHMS IN MACHINE LEARNING

Page 2: Online algorithms in Machine Learning

Machine Learning 2

BRIEF INTRODUCTION/CONTACT INFO

CTO at [email protected]

Adjunct Faculty at GWU/[email protected]

+1 571 276 8807

Arora - Online Algorithms

Second Edition, ISBN: 978-1-63487-073-3

Page 3: Online algorithms in Machine Learning

ONLINE ALGORITHMS IN MACHINE LEARNING

• First, let us understand a basic machine learning problem.

• For example, let us consider: classification

Page 4: Online algorithms in Machine Learning

CLASSIFICATION

• Given: A collection of records (training set), where each record contains a set of attributes, and a class.

• Find: A model for the class attribute as a function of the values of the other attributes.

• Goal: Previously unseen records should be assigned a class as accurately as possible.

• A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with training set used to build the model and test set used to validate it.
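
The split into training and test sets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `train_test_split` and `accuracy`, the 30% test fraction, the toy `records`, and the threshold `model` are all hypothetical and do not come from the slides.

```python
import random

def train_test_split(records, test_fraction=0.3, seed=0):
    """Shuffle a copy of the records and return (training set, test set)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, test_set):
    """Fraction of test records whose predicted class equals the true class."""
    correct = sum(1 for attributes, label in test_set if model(attributes) == label)
    return correct / len(test_set)

# Hypothetical data: each record is (attributes, class).
records = [({"income": 30 + 5 * i}, "Yes" if i >= 5 else "No") for i in range(10)]
train, test = train_test_split(records)

# Stand-in for a learned model: a fixed threshold rule (an assumption, not a
# real learning algorithm), evaluated only on the held-out test set.
def model(attributes):
    return "Yes" if attributes["income"] >= 55 else "No"

print(len(train), len(test), accuracy(model, test))
```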

Page 5: Online algorithms in Machine Learning

ILLUSTRATING CLASSIFICATION TASK

[Figure: the Training Set is fed to a Learning algorithm, which learns a Model (Induction); the Model is then applied to the Test Set (Deduction).]

Training Set

Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes

Test Set

Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?

Page 6: Online algorithms in Machine Learning

EXAMPLES OF CLASSIFICATION TASK

• Predicting tax returns as “clean” or “need an audit”

• Predicting tumor cells as benign or malignant

• Classifying credit card transactions as legitimate or fraudulent

• Classifying secondary structures of protein as alpha-helix, beta-sheet, or random coil

• Categorizing news stories as finance, weather, entertainment, sports, etc.

Page 7: Online algorithms in Machine Learning

CLASSIFICATION TECHNIQUES

• Decision Tree based Methods
• Rule-based Methods
• Memory based reasoning
• Neural Networks
• Naïve Bayes and Bayesian Belief Networks
• Support Vector Machines

Page 8: Online algorithms in Machine Learning

EXAMPLE OF A DECISION TREE

• Decision trees are an intuitive example of classification techniques

• income < $40K
    • job > 5 yrs then good risk
    • job < 5 yrs then bad risk

• income > $40K
    • high debt then bad risk
    • low debt then good risk
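
The tree above reads naturally as nested conditionals. The sketch below is one possible encoding; the function name `credit_risk` and its parameters are hypothetical, and because the slide does not say what happens at exactly $40K income or exactly 5 years on the job, the handling of those boundaries is an assumption marked in the comments.

```python
def credit_risk(income, years_on_job, high_debt):
    """Classify an applicant as 'good risk' or 'bad risk' using the example tree."""
    if income < 40_000:
        # Low-income branch: the decision depends on job tenure.
        # Assumption: exactly 5 years is treated like "more than 5 years".
        return "good risk" if years_on_job >= 5 else "bad risk"
    # Higher-income branch: the decision depends on debt.
    # Assumption: an income of exactly $40K falls into this branch.
    return "bad risk" if high_debt else "good risk"

print(credit_risk(30_000, 6, high_debt=False))  # good risk
print(credit_risk(50_000, 1, high_debt=True))   # bad risk
```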

Page 9: Online algorithms in Machine Learning

SO, WE HAVE DIFFERENT KINDS OF CLASSIFIERS...

• Different decision trees based on Hunt’s algorithm
• C4.5
• Naïve Bayes
• Support Vector Machine

• Each of these models can be considered an “expert”.
• We do not know how well each “expert” will perform in an actual setting.
• This is where online algorithms in machine learning can help us.

Page 10: Online algorithms in Machine Learning

ONLINE ALGORITHMS IN MACHINE LEARNING

• Given n experts, each giving an output (0 or 1)
• We want to be able to predict the output
• After each try, we are told the result.
• Goal: After some time, we want to be able to do “not much worse” than the best expert (without knowing beforehand who was a good expert)

Page 11: Online algorithms in Machine Learning

“WEIGHTED MAJORITY” – ALGORITHM 1

• Initialize the weights of all experts w1..wn to 1
• At each step, take the weighted majority decision.
    • That is, output 1 if the experts saying 1 hold at least half of the total weight
• After each step, halve the weight of each expert who was wrong (leave the weight of correct experts unchanged)
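
The three steps above can be sketched as a short Python function. The name `weighted_majority`, the input layout (one list of 0/1 predictions per round), and the three toy experts are assumptions made for illustration.

```python
def weighted_majority(expert_predictions, outcomes):
    """Run deterministic Weighted Majority; return (mistakes, final weights).

    expert_predictions[t][i] is expert i's 0/1 prediction in round t;
    outcomes[t] is the true 0/1 result revealed after round t.
    """
    n = len(expert_predictions[0])
    weights = [1.0] * n  # every expert starts with weight 1
    mistakes = 0
    for predictions, outcome in zip(expert_predictions, outcomes):
        weight_for_one = sum(w for w, p in zip(weights, predictions) if p == 1)
        # Output 1 if the experts saying 1 hold at least half of the total weight.
        guess = 1 if weight_for_one >= 0.5 * sum(weights) else 0
        if guess != outcome:
            mistakes += 1
        # Halve the weight of every expert that was wrong in this round.
        weights = [w / 2 if p != outcome else w for w, p in zip(weights, predictions)]
    return mistakes, weights

# Hypothetical experts: 0 is always right, 1 is always wrong, 2 always says 1.
outcomes = [1, 0, 0, 1, 0]
predictions = [[y, 1 - y, 1] for y in outcomes]
print(weighted_majority(predictions, outcomes))  # (1, [1.0, 0.03125, 0.125])
```

In this toy run the algorithm errs once, in round 2, before the always-right expert dominates the vote.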

Page 12: Online algorithms in Machine Learning

PERFORMANCE OF WM-A1

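Proof
• Suppose WM-A1 makes M mistakes and the best expert makes m mistakes.
• After each mistake by WM-A1, the total weight goes down by at least ¼: at least half of the total weight was on experts that were wrong, and that weight is halved. So, the total weight is no more than $n (3/4)^M$
    • [All initial weights are 1, so initial total weight = n]
• After each of its own mistakes, the best expert’s weight is halved. So, it is exactly $(1/2)^m$
• So, $(1/2)^m \le n (3/4)^M$
    • [Best expert’s weight is no more than the total weight.]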

The number of mistakes made by Weighted Majority Algorithm 1 is never more than $2.41\,(m + \lg n)$, where m is the number of mistakes made by the best expert and n is the number of experts.

Page 13: Online algorithms in Machine Learning

PERFORMANCE OF WM-A1

Proof (cont.)

$(1/2)^m \le n (3/4)^M$

$(4/3)^M \le n \, 2^m$

$M \lg(4/3) \le \lg n + m$

$M \le \frac{1}{\lg(4/3)} \, (m + \lg n)$

$M \le 2.41 \, (m + \lg n)$
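
As a quick numeric check of the last two lines, the constant $1/\lg(4/3)$ can be computed directly. The helper name `wm_a1_bound` and the sample values m = 10, n = 16 are hypothetical.

```python
import math

def wm_a1_bound(m, n):
    """Upper bound on WM-A1 mistakes: (m + lg n) / lg(4/3)."""
    return (m + math.log2(n)) / math.log2(4 / 3)

constant = 1 / math.log2(4 / 3)
print(round(constant, 2))  # 2.41

# Sample values: best expert errs 10 times among 16 experts.
print(wm_a1_bound(10, 16))
```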

The number of mistakes made by Weighted Majority Algorithm 1 is never more than $2.41\,(m + \lg n)$, where m is the number of mistakes made by the best expert and n is the number of experts.

Page 14: Online algorithms in Machine Learning

“WEIGHTED MAJORITY” – ALGORITHM 2

• Initialize the weights of all experts w1..wn to 1
• At each step, take the probabilistic decision. That is, output 1 with probability equal to the sum of the weights of the experts that say 1, divided by the total weight.
• After each step, multiply the weight of each expert who was wrong by β (leave the weight of correct experts unchanged)
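
A minimal sketch of this randomized variant follows. The function name `randomized_weighted_majority`, the `seed` parameter, and the toy experts are assumptions; the sketch also assumes 0 < β < 1, which the slide does not state explicitly.

```python
import random

def randomized_weighted_majority(expert_predictions, outcomes, beta=0.5, seed=0):
    """Run randomized Weighted Majority; return (mistakes, final weights)."""
    rng = random.Random(seed)  # seeded so that a run is reproducible
    n = len(expert_predictions[0])
    weights = [1.0] * n
    mistakes = 0
    for predictions, outcome in zip(expert_predictions, outcomes):
        total = sum(weights)
        weight_for_one = sum(w for w, p in zip(weights, predictions) if p == 1)
        # Output 1 with probability (weight of experts saying 1) / (total weight).
        guess = 1 if rng.random() < weight_for_one / total else 0
        if guess != outcome:
            mistakes += 1
        # Multiply the weight of every wrong expert by beta.
        weights = [w * beta if p != outcome else w for w, p in zip(weights, predictions)]
    return mistakes, weights

# Hypothetical experts: 0 is always right, 1 is always wrong, 2 always says 1.
outcomes = [1, 0, 0, 1, 0]
predictions = [[y, 1 - y, 1] for y in outcomes]
mistakes, weights = randomized_weighted_majority(predictions, outcomes)
print(mistakes, weights)
```

The final weights do not depend on the random draws, only on which experts were wrong; the mistake count does.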

Page 15: Online algorithms in Machine Learning

PERFORMANCE OF WM-A2

The expected number of mistakes made by Weighted Majority Algorithm 2 is never more than $(m \ln(1/\beta) + \ln n)/(1 - \beta)$, where m is the number of mistakes made by the best expert and n is the number of experts.

For β = ½, this is: 1.39m + 2 ln n

For β = 3/4, this is: 1.15m + 4 ln n
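
The two numeric instances can be reproduced by evaluating the coefficients of m and ln n for a given β. The helper names `wm_a2_bound` and `coefficients` are hypothetical.

```python
import math

def wm_a2_bound(m, n, beta):
    """Bound on WM-A2 mistakes: (m ln(1/beta) + ln n) / (1 - beta)."""
    return (m * math.log(1 / beta) + math.log(n)) / (1 - beta)

def coefficients(beta):
    """Return (coefficient of m, coefficient of ln n) in the bound."""
    return math.log(1 / beta) / (1 - beta), 1 / (1 - beta)

for beta in (0.5, 0.75):
    a, b = coefficients(beta)
    print(f"beta={beta}: {a:.2f} m + {b:.0f} ln n")
# beta=0.5: 1.39 m + 2 ln n
# beta=0.75: 1.15 m + 4 ln n
```

A β closer to 1 lowers the multiplier on m but raises the additive ln n term, so the choice trades long-run accuracy against the cost of learning which expert is good.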

Page 16: Online algorithms in Machine Learning

PERFORMANCE OF WM-A2

Proof
• Suppose we have seen t tries so far.
• Let $F_i$ be the fraction of total weight on the wrong answers at the i-th trial.
• Suppose WM-A2 makes M mistakes in expectation. Therefore $M = \sum_{i=1}^{t} F_i$
    • [Why? Because, in each try, probability of mistake = $F_i$]
• Suppose best expert makes m mistakes.
• After each mistake, best expert’s weight gets multiplied by β. So, it is exactly $\beta^m$
• During each round, the total weight changes as: $W \leftarrow W \, (1 - (1-\beta) F_i)$

Page 17: Online algorithms in Machine Learning

PERFORMANCE OF WM-A2

Proof (cont.)
• Therefore, at the end of t tries, total weight: $W = n \prod_{i=1}^{t} (1 - (1-\beta) F_i)$
• Since total weight ≥ weight of best expert: $n \prod_{i=1}^{t} (1 - (1-\beta) F_i) \ge \beta^m$
• Taking natural logs: $\ln n + \sum_{i=1}^{t} \ln(1 - (1-\beta) F_i) \ge m \ln \beta$
• Reversing the inequality (multiply by -1): $-\ln n - \sum_{i=1}^{t} \ln(1 - (1-\beta) F_i) \le m \ln(1/\beta)$
• A bit of math: $-\ln(1 - x) \ge x$ for $0 \le x < 1$, so $-\ln n + (1-\beta) \sum_{i=1}^{t} F_i \le m \ln(1/\beta)$
• $-\ln n + (1-\beta) M \le m \ln(1/\beta)$
• $M \le (m \ln(1/\beta) + \ln n)/(1 - \beta)$

Page 18: Online algorithms in Machine Learning

SUMMARY

The expected number of mistakes made by Weighted Majority Algorithm 2 is never more than $(m \ln(1/\beta) + \ln n)/(1 - \beta)$, where m is the number of mistakes made by the best expert and n is the number of experts.

Page 19: Online algorithms in Machine Learning

WHY DOES THIS ALL MATTER?

• Many practical applications use techniques such as ensemble models.

• Ensemble models are a generalization of the simple majority algorithms we discussed in this presentation

• There are many relevant practical applications
    • Pandora, Netflix and other Recommendation Engines
    • Government and Commercial targeting systems

http://www.fda.gov/predict

Page 20: Online algorithms in Machine Learning

Q&A

• Ask anything you want..

Page 21: Online algorithms in Machine Learning

PIZZA TIME!

You better cut the pizza in four pieces because I'm not hungry enough to eat six. -- Yogi Berra

Page 22: Online algorithms in Machine Learning

APPENDIX 1: MORE ON DECISION TREES

Page 23: Online algorithms in Machine Learning

DECISION TREE INDUCTION

• Many Algorithms:
    • Hunt’s Algorithm (one of the earliest)
    • CART
    • ID3, C4.5
    • SLIQ, SPRINT

Page 24: Online algorithms in Machine Learning

GENERAL STRUCTURE OF HUNT’S ALGORITHM

• Let Dt be the set of training records that reach a node t

• General Procedure:
    • If Dt contains records that all belong to the same class yt, then t is a leaf node labeled as yt
    • If Dt is an empty set, then t is a leaf node labeled by the default class, yd
    • If Dt contains records that belong to more than one class, use an attribute test to split the data into smaller subsets. Recursively apply the procedure to each subset.
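
The recursive procedure can be sketched as follows. This is an illustration, not a full implementation: the names `hunt` and `classify`, the tuple-based tree representation, and the toy records are hypothetical. Two choices are assumptions the slide does not specify: the sketch splits on the first remaining attribute rather than choosing one by an impurity measure, and it falls back to the majority class when no attributes are left.

```python
from collections import Counter

def hunt(records, attributes, default_class):
    """Build a tree from (attribute_dict, class_label) records.

    Returns a class label (leaf) or a tuple (attribute, {value: subtree}).
    """
    if not records:
        return default_class              # empty Dt: leaf with the default class
    labels = [label for _, label in records]
    if len(set(labels)) == 1:
        return labels[0]                  # all records in one class: leaf
    majority = Counter(labels).most_common(1)[0][0]
    if not attributes:
        return majority                   # assumption: no tests left, use majority
    # Assumption: split on the first remaining attribute.
    attribute, remaining = attributes[0], attributes[1:]
    branches = {}
    for value in {attrs[attribute] for attrs, _ in records}:
        subset = [(a, l) for a, l in records if a[attribute] == value]
        branches[value] = hunt(subset, remaining, majority)
    return attribute, branches

def classify(tree, attrs, default_class):
    """Walk the tree until a leaf (class label) is reached."""
    while isinstance(tree, tuple):
        attribute, branches = tree
        tree = branches.get(attrs[attribute], default_class)
    return tree

# Hypothetical training records.
records = [
    ({"income": "high", "debt": "high"}, "bad risk"),
    ({"income": "high", "debt": "low"}, "good risk"),
    ({"income": "low", "debt": "low"}, "bad risk"),
    ({"income": "low", "debt": "high"}, "bad risk"),
]
tree = hunt(records, ["income", "debt"], "bad risk")
print(classify(tree, {"income": "high", "debt": "low"}, "bad risk"))  # good risk
```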

Page 25: Online algorithms in Machine Learning

MEASURES OF NODE IMPURITY

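Here $p(j \mid t)$ is the relative frequency of class j at node t.

• Gini Index: $GINI(t) = 1 - \sum_j [p(j \mid t)]^2$

• Entropy: $Entropy(t) = -\sum_j p(j \mid t) \log p(j \mid t)$

• Misclassification error: $Error(t) = 1 - \max_i P(i \mid t)$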
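
The three impurity measures can be computed from a node's class distribution as sketched below. The function names are hypothetical, and the use of a base-2 logarithm for entropy is an assumption, since the slide does not show the base.

```python
import math

def gini(p):
    """Gini index: 1 minus the sum of squared class frequencies."""
    return 1 - sum(x * x for x in p)

def entropy(p):
    """Entropy in bits (base 2 assumed); terms with zero frequency contribute 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def misclassification_error(p):
    """1 minus the frequency of the most common class."""
    return 1 - max(p)

# A maximally mixed two-class node and a pure node.
for p in ([0.5, 0.5], [1.0, 0.0]):
    print(p, gini(p), entropy(p), misclassification_error(p))
```

All three measures are 0 for a pure node and largest when the classes are evenly mixed.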