NYAI - Interactive Machine Learning by Daniel Hsu


TRANSCRIPT

Page 1: NYAI - Interactive Machine Learning by Daniel Hsu

Interactive machine learning

Daniel Hsu, Columbia University

1

Page 2: NYAI - Interactive Machine Learning by Daniel Hsu

Non-interactive machine learning

“Cartoon” of (non-interactive) machine learning

1. Get labeled data {(input_i, output_i)}_{i=1}^n.

2. Learn a prediction function f̂ (e.g., classifier, regressor, policy) such that

   f̂(input_i) ≈ output_i

   for most i = 1, 2, . . . , n.

Goal: f̂(input) ≈ output for future (input, output) pairs.

Some applications: document classification, face detection, speech recognition, machine translation, credit rating, . . .

There is a lot of technology for doing this, and a lot of mathematical theory for understanding this.

2
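The recipe above can be sketched in a few lines; the data and the threshold-classifier family below are purely illustrative:

```python
# Minimal sketch of the non-interactive recipe: fit a prediction
# function f_hat to labeled pairs, then apply it to future inputs.
# The data set and the threshold-classifier family are illustrative.

def train_threshold(data):
    """Learn f_hat(x) = 1[x >= t], picking t to minimize training error."""
    candidates = sorted({x for x, _ in data})
    best_t, best_err = None, float("inf")
    for cand in candidates:
        err = sum((x >= cand) != y for x, y in data)
        if err < best_err:
            best_t, best_err = cand, err
    return best_t

# 1. Get labeled data {(input_i, output_i)}.
data = [(0.1, 0), (0.3, 0), (0.6, 1), (0.9, 1)]

# 2. Learn f_hat such that f_hat(input_i) ≈ output_i for most i.
t = train_threshold(data)
f_hat = lambda x: int(x >= t)

# Goal: f_hat(input) ≈ output on future pairs.
print(f_hat(0.8))  # 1: a "future" input above the learned threshold
```

Here f̂ is fit by brute-force search over candidate thresholds; any learner that drives down training error plays the same role.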


Page 8: NYAI - Interactive Machine Learning by Daniel Hsu

Interactive machine learning: example #1

Practicing physician

Loop:

1. Patient arrives with symptoms, medical history, genome, . . .
2. Prescribe treatment.
3. Observe impact on patient’s health (e.g., improves, worsens).

Goal: prescribe treatments that yield good health outcomes.

3


Page 13: NYAI - Interactive Machine Learning by Daniel Hsu

Interactive machine learning: example #2

Website operator

Loop:

1. User visits website with profile, browsing history, . . .
2. Choose content to display on website.
3. Observe user reaction to content (e.g., click, “like”).

Goal: choose content that yields desired user behavior.

4


Page 18: NYAI - Interactive Machine Learning by Daniel Hsu

Interactive machine learning: example #3

E-mail service provider

Loop:

1. Receive e-mail messages for users (spam or not).
2. Ask users to provide labels for some borderline cases.
3. Improve spam filter using newly labeled messages.

Goal: maximize accuracy of the spam filter while minimizing the number of queries to users.

5


Page 23: NYAI - Interactive Machine Learning by Daniel Hsu

Characteristics of interactive machine learning problems

1. Learning agent (a.k.a. “learner”) interacts with the world (e.g., patients, users) to gather data.

2. Learner’s performance is based on learner’s decisions.

3. Data available to learner depends on learner’s decisions.

4. State of the world depends on learner’s decisions.

This talk: two interactive machine learning problems,

1. active learning, and
2. contextual bandit learning.

(Our models for these problems have all but the last of the above characteristics.)

6


Page 29: NYAI - Interactive Machine Learning by Daniel Hsu

Active learning

Page 30: NYAI - Interactive Machine Learning by Daniel Hsu

Motivation

Recall non-interactive (a.k.a. passive) machine learning:

1. Get labeled data {(input_i, label_i)}_{i=1}^n.

2. Learn a function f̂ (e.g., classifier, regressor, decision rule, policy) such that

   f̂(input_i) ≈ label_i

   for most i = 1, 2, . . . , n.

Problem: labels are often much more expensive to get than inputs. E.g., can’t ask for the spam/not-spam label of every e-mail.

8


Page 32: NYAI - Interactive Machine Learning by Daniel Hsu

Active learning

Basic structure of active learning:

- Start with a pool of unlabeled inputs (and maybe some (input, label) pairs).

- Repeat:
  1. Train prediction function f̂ using current labeled pairs.
  2. Pick some inputs from the pool, and query their labels.

Goal:

- Final prediction function f̂ satisfies f̂(input) ≈ label for most future (input, label) pairs.

- Number of label queries is small. (Can be selective/adaptive about which labels to query . . . )

9
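The loop above can be sketched as follows; the stand-in learner (a 1-D threshold fit at the midpoint between the classes) and the uncertainty-style query rule are illustrative placeholders, not a specific published algorithm:

```python
# Sketch of the pool-based active learning loop from the slide.
# The learner, the query rule, and the budget are illustrative.
import random

def train(labeled):
    """Stand-in learner for separable 1-D data: f_hat(x) = 1[x >= t],
    with t at the midpoint between the two classes (illustrative only)."""
    ones = [x for x, y in labeled if y == 1]
    zeros = [x for x, y in labeled if y == 0]
    t = (min(ones) + max(zeros)) / 2 if ones and zeros else 0.5
    return (lambda x: int(x >= t)), t

random.seed(0)
true_label = lambda x: int(x >= 0.7)           # hidden labeling rule
pool = [random.random() for _ in range(200)]   # pool of unlabeled inputs

# Start with a few labeled pairs (this seed happens to cover both classes).
labeled = [(x, true_label(x)) for x in pool[:4]]
budget = 20

for _ in range(budget):
    f_hat, t = train(labeled)                  # 1. train on current labels
    x = min(pool, key=lambda x: abs(x - t))    # 2. pick an input, query label
    labeled.append((x, true_label(x)))
    pool.remove(x)

f_hat, t = train(labeled)
print(round(t, 2))  # learned threshold, near the hidden 0.7
```

With only 20 queries out of 200 inputs, the midpoint-style learner homes in on the hidden threshold, which is the point of being selective about which labels to request.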


Page 39: NYAI - Interactive Machine Learning by Daniel Hsu

Sampling bias

Main difficulty of active learning: sampling bias.

- At any point in the active learning process, the labeled data set is typically not representative of the overall data.

- The problem can derail learning and go unnoticed!

Example: input ∈ R, label ∈ {0, 1}, classifier type = threshold functions.

[Figure: 1-D input distribution split into four regions with probability mass 45%, 45%, 5%, 5%.]

Learner converges to a classifier with 5% error rate, even though there’s a classifier with 2.5% error rate. Not statistically consistent!

10


Page 43: NYAI - Interactive Machine Learning by Daniel Hsu

Sampling bias

Typical active learning heuristic:

- Start with a pool of unlabeled inputs.
- Pick a handful of inputs, and query their labels.
- Repeat:
  1. Train classifier f̂ using current labeled pairs.
  2. Pick the input from the pool closest to the decision boundary of f̂, and query its label.

Example: input ∈ R, label ∈ {0, 1}, classifier type = threshold functions.

[Figure: 1-D input distribution split into four regions with probability mass 45%, 45%, 5%, 5%.]

Learner converges to a classifier with 5% error rate, even though there’s a classifier with 2.5% error rate. Not statistically consistent!

10


Page 48: NYAI - Interactive Machine Learning by Daniel Hsu

Framework for statistically consistent active learning

Importance weighted active learning(Beygelzimer, Dasgupta, and Langford, ICML 2009)

- Start with unlabeled pool {input_i}_{i=1}^n.
- For each i = 1, 2, . . . , n (in random order):
  1. Get input_i.
  2. Pick probability p_i ∈ [0, 1], toss a coin with P(heads) = p_i.
  3. If heads, query label_i, and include (input_i, label_i) with importance weight 1/p_i in the data set.

Importance weight = “inverse propensity” that label_i is queried. Used to account for sampling bias in error rate estimates.

11
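The IWAL loop might look like the following sketch; the query-probability rule here (a fixed floor, rising near the current decision boundary) is a hypothetical stand-in, not the rule from the paper, and the point is the unbiased importance-weighted error estimate:

```python
# Sketch of importance weighted active learning (Beygelzimer, Dasgupta,
# and Langford, ICML 2009). The rule for p_i below is illustrative:
# a floor of 0.1, with higher query probability near the current
# boundary t_current.
import random

random.seed(1)
true_label = lambda x: int(x >= 0.5)
inputs = [random.random() for _ in range(1000)]
random.shuffle(inputs)                      # process in random order

p_min = 0.1
t_current = 0.5                             # current classifier's threshold
weighted_data = []

for x in inputs:
    # Pick query probability p_i (never below the floor), toss a coin.
    p = max(p_min, 1.0 - 2.0 * abs(x - t_current))
    if random.random() < p:                 # heads: query the label
        weighted_data.append((x, true_label(x), 1.0 / p))

# The importance-weighted error estimate of a candidate threshold t is
# unbiased despite the biased sampling: each queried point counts 1/p_i.
def iw_error(t, data, n):
    return sum(w * ((x >= t) != y) for x, y, w in data) / n

print(round(iw_error(0.5, weighted_data, len(inputs)), 3))  # 0.0: t = 0.5 matches the true threshold
```

A misspecified threshold such as t = 0.3 gets an importance-weighted estimate close to its true 20% error rate, even though points far from the boundary were rarely queried.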


Page 54: NYAI - Interactive Machine Learning by Daniel Hsu

Importance weighted active learning algorithms

How to choose p_i?

1. Use your favorite heuristic (as long as p_i is never too small).

2. Simple rule based on estimated error rate differences (Beygelzimer, Hsu, Langford, and Zhang, NIPS 2010).

   In many cases, can prove the following:
   - Final classifier is as accurate as a learner that queries all n labels.
   - Expected number of queries ∑_{i=1}^n p_i grows sub-linearly with n.

3. Solve a large convex optimization problem (Huang, Agarwal, Hsu, Langford, and Schapire, NIPS 2015).

   Even stronger theoretical guarantees.

The latter two methods can be made to work with any classifier type, provided a good (non-interactive) learning algorithm.

12
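As a schematic only, and emphatically not the published NIPS 2010 rule: the flavor of option #2 is to make p_i a decreasing function of the estimated error-rate gap between the current best classifier and the best classifier that disagrees on input_i, with a floor so p_i is never too small:

```python
# Schematic of choosing p_i from estimated error rate differences.
# The functional form and constants here are invented for illustration;
# see Beygelzimer, Hsu, Langford, and Zhang (NIPS 2010) for the real rule.

def query_probability(err_best, err_alt, p_min=0.05, scale=5.0):
    """p_i shrinks as the estimated error gap grows (a clear winner means
    the label is less informative); p_i never drops below p_min."""
    gap = max(0.0, err_alt - err_best)
    return max(p_min, 1.0 / (1.0 + scale * gap) ** 2)

print(query_probability(0.10, 0.10))  # no gap: always worth querying
print(query_probability(0.10, 0.50))  # large gap: query rarely
```

The floor p_min is what keeps the importance weights 1/p_i bounded, which is needed for the error estimates to stay well behaved.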


Page 61: NYAI - Interactive Machine Learning by Daniel Hsu

Contextual bandit learning

Page 62: NYAI - Interactive Machine Learning by Daniel Hsu

Contextual bandit learning

Protocol for contextual bandit learning:

For round t = 1, 2, . . . , T:

1. Observe context_t. [e.g., user profile]

2. Choose action_t ∈ Actions. [e.g., display ad]

3. Collect reward_t(action_t) ∈ [0, 1]. [e.g., click or no-click]

Goal: Choose actions that yield high reward.

Note: In round t, you only observe the reward for the chosen action_t, and not the reward you would’ve received had you chosen a different action.

14
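One minimal way to instantiate this protocol is an epsilon-greedy learner over per-(context, action) reward averages; the contexts, actions, and click model below are invented for illustration:

```python
# Sketch of the contextual bandit protocol with an epsilon-greedy
# learner. Contexts are user groups {0, 1}, actions are two ads, and
# the hidden click model is illustrative.
import random

random.seed(2)
ACTIONS = ["ad_a", "ad_b"]

def true_reward(context, action):
    """Hidden click model: ad_a works for group 0, ad_b for group 1."""
    preferred = "ad_a" if context == 0 else "ad_b"
    return 1.0 if action == preferred else 0.0

counts = {}   # per-(context, action) pull counts
means = {}    # per-(context, action) running average reward
total = 0.0
T, eps = 2000, 0.1

for t in range(T):
    context = random.randrange(2)                 # 1. observe context_t
    if random.random() < eps:                     # 2. choose action_t:
        action = random.choice(ACTIONS)           #    explore at random,
    else:                                         #    else exploit estimates
        action = max(ACTIONS, key=lambda a: means.get((context, a), 0.0))
    r = true_reward(context, action)              # 3. collect reward_t(action_t)
    total += r
    key = (context, action)
    counts[key] = counts.get(key, 0) + 1
    means[key] = means.get(key, 0.0) + (r - means.get(key, 0.0)) / counts[key]

print(round(total / T, 2))  # average reward; 1.0 would be optimal
```

Note the partial-feedback constraint in action: only reward_t(action_t) is recorded, so the learner must occasionally explore the other ad to discover that its estimate for an unplayed (context, action) pair is wrong.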


Page 69: NYAI - Interactive Machine Learning by Daniel Hsu

Challenges in contextual bandit learning

1. Explore vs. exploit:
   - Should use what you've already learned about actions.
   - Must also learn about new actions that could be good.

2. Must use context effectively:
   - Which action is good likely depends on the current context.
   - Want to do as well as the best policy (i.e., decision rule)

     π : context ↦ action

     from some rich class of policies Π.

3. Selection bias, especially when exploiting.

15


Page 73: NYAI - Interactive Machine Learning by Daniel Hsu

Why is exploration necessary?

No-exploration approach:

1. Using historical data, learn a "reward predictor" for each action based on context:

   r̂eward(action | context).

2. Then deploy the policy π̂ given by

   π̂(context) := argmax_{action ∈ Actions} r̂eward(action | context),

   and collect more data.

16
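As a sketch, the deployed policy in step 2 is just an argmax over predicted rewards; `reward_predictor` here stands for whatever model was fit to the historical data.

```python
def greedy_policy(reward_predictor, actions):
    """The no-exploration policy: always play the action whose *predicted*
    reward is highest.  reward_predictor(action, context) is any model fit
    on whatever historical data happens to exist."""
    def policy(context):
        return max(actions, key=lambda a: reward_predictor(a, context))
    return policy
```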

Page 74: NYAI - Interactive Machine Learning by Daniel Hsu

Why is exploration necessary?

No-exploration approach:1. Using historical data, learn a “reward predictor” for each action

based on context:

r̂eward(action | context) .

2. Then deploy policy π̂, given by

π̂(context) := argmaxaction∈Actions

r̂eward(action | context) ,

and collect more data.

16

Page 75: NYAI - Interactive Machine Learning by Daniel Hsu

Why is exploration necessary?

No-exploration approach:1. Using historical data, learn a “reward predictor” for each action

based on context:

r̂eward(action | context) .

2. Then deploy policy π̂, given by

π̂(context) := argmaxaction∈Actions

r̂eward(action | context) ,

and collect more data.

16

Page 76: NYAI - Interactive Machine Learning by Daniel Hsu

Using no-exploration

Example: two actions {a1, a2}, two contexts {x1, x2}.

Suppose the initial policy says π̂(x1) = a1 and π̂(x2) = a2.

Observed rewards:
      a1    a2
x1   0.7    —
x2    —    0.1

Reward estimates:
      a1    a2
x1   0.7   0.5
x2   0.5   0.1

New policy: π̂′(x1) = π̂′(x2) = a1.

Observed rewards:
      a1    a2
x1   0.7    —
x2   0.3   0.1

Reward estimates:
      a1    a2
x1   0.7   0.5
x2   0.3   0.1

True rewards:
      a1    a2
x1   0.7   1.0
x2   0.3   0.1

The learner never tries a2 in context x1, even though it is the best action there. Highly sub-optimal.

17
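This failure mode is easy to reproduce in a few lines. The simulation below is a slight simplification of the slide's story: it starts from the true rewards above, fills unobserved cells with the 0.5 placeholder used in the estimate tables, and acts greedily; a2 is never tried in context x1.

```python
def run_greedy(true_rewards, contexts, actions, rounds=10, default=0.5):
    """Greedy (no-exploration) loop on a small tabular problem.

    Unobserved (context, action) cells are estimated at `default`,
    matching the 0.5 placeholders in the estimate tables above.  Each
    round, the greedy policy plays the highest-estimate action in every
    context and records the true reward it receives."""
    observed = {}
    for _ in range(rounds):
        for x in contexts:
            est = {a: observed.get((x, a), default) for a in actions}
            a = max(actions, key=lambda a: est[a])  # exploit only, never explore
            observed[(x, a)] = true_rewards[(x, a)]
    return observed

# True rewards from the table above.
true_rewards = {("x1", "a1"): 0.7, ("x1", "a2"): 1.0,
                ("x2", "a1"): 0.3, ("x2", "a2"): 0.1}
```

However many rounds are run, the cell (x1, a2), the single best (context, action) pair, is never observed.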


Page 84: NYAI - Interactive Machine Learning by Daniel Hsu

Framework for simultaneous exploration and exploitation

- Initialize a distribution W over some policies.

- For t = 1, 2, ..., T:

  1. Observe context_t.
  2. Randomly pick a policy π̂ ~ W, and choose action_t := π̂(context_t) ∈ Actions.
  3. Collect reward_t(action_t) ∈ [0, 1], and update the distribution W.

"Inverse propensity" importance weights can be used to account for sampling bias when estimating the expected rewards of policies.

18
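The inverse-propensity correction mentioned above can be sketched as follows: if the log records the probability p with which each chosen action was sampled, reweighting each reward by 1/p gives an unbiased estimate of any policy's expected reward. This is a standard construction, simplified here to a few lines.

```python
def ips_estimate(logged, policy):
    """Inverse-propensity (IPS) estimate of a policy's expected reward.

    `logged` is a list of (context, action, reward, p) tuples, where p is
    the probability with which the logging procedure chose that action.
    Only rounds where `policy` agrees with the logged action contribute,
    and each contributing reward is reweighted by 1/p to undo the
    sampling bias."""
    total = 0.0
    for context, action, reward, p in logged:
        if policy(context) == action:
            total += reward / p
    return total / len(logged)
```

In the framework above, W can then be updated to favor policies with high estimated reward.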


Page 90: NYAI - Interactive Machine Learning by Daniel Hsu

Algorithms for contextual bandit learning

How to choose the distribution W over policies?

Exp4 (Auer, Cesa-Bianchi, Freund, and Schapire, FOCS 1995)
    Maintains exponential weights over all policies in a policy class Π.

Monster (Dudik, Hsu, Kale, Karampatziakis, Langford, Reyzin, and Zhang, UAI 2011)
    Solves an optimization problem to come up with poly(T, log |Π|)-sparse weights over policies.

Mini-monster (Agarwal, Hsu, Kale, Langford, Li, and Schapire, ICML 2014)
    A simpler and faster algorithm to solve the optimization problem; gets much sparser weights over policies.

The latter two methods rely on a good (non-interactive) learning algorithm for the policy class Π.

19
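As a rough illustration of the exponential-weights idea behind Exp4 (a loose sketch, not the exact algorithm; for instance, real Exp4 spreads its exploration mass uniformly over all actions, and the update below is correspondingly simplified): mix the policies' recommendations according to their weights, sample an action, form an inverse-propensity reward estimate, and multiplicatively boost every policy that recommended the sampled action.

```python
import math
import random

def exp4_step(weights, policies, context, reward_fn, eta=0.1, gamma=0.05):
    """One round of an Exp4-style update (sketch).

    `weights[i]` is the exponential weight of `policies[i]`.  The action
    distribution mixes each policy's recommendation in proportion to its
    weight, plus a small exploration mass gamma."""
    total = sum(weights)
    probs = {}
    for w, pi in zip(weights, policies):
        probs[pi(context)] = probs.get(pi(context), 0.0) + (1 - gamma) * w / total
    for a in probs:  # exploration mass over the recommended actions
        probs[a] += gamma / len(probs)
    actions, ps = zip(*probs.items())
    action = random.choices(actions, weights=ps)[0]
    r_hat = reward_fn(action) / probs[action]  # inverse-propensity estimate
    new_weights = [w * math.exp(eta * r_hat) if pi(context) == action else w
                   for w, pi in zip(weights, policies)]
    return new_weights, action
```

Maintaining a separate weight per policy is exactly what makes Exp4 expensive for rich classes Π; the Monster and Mini-monster lines of work replace it with sparse weights computed via optimization.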


Page 95: NYAI - Interactive Machine Learning by Daniel Hsu

Wrap-up

- Interactive machine learning models capture how machine learning is used in many applications better than non-interactive frameworks do.

- Sampling bias is a pervasive issue.

- Research direction: how to use non-interactive machine learning technology in these new interactive settings.

Thanks!

20
