machine learning with annotator rationales to reduce annotation cost omar f. zaidan jason eisner...

122
Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop on Cost-Sensitive Learning Friday, Dec. 12, 2008 cs.jhu.edu { | ozaidan cpiatko jason @ | }

Upload: alan-ezra-cole

Post on 28-Dec-2015

228 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Machine Learning with Annotator Rationales to Reduce Annotation Cost

Omar F. ZaidanJason Eisner

Christine D. PiatkoJohns Hopkins University

NIPS Workshop on Cost-Sensitive LearningFriday, Dec. 12, 2008

cs.jhu.edu{ |ozaidan cpiatkojason @| }

Page 2: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Training Docs

Per

form

ance

x

x

x

xx

. . .

Supervised Learning

Annotated

Supervised Learning

Page 3: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Training Docs

Per

form

ance

x

x

x

xx

. . .

Supervised Learning

Page 4: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Richer Annotation

Training Docs

Per

form

ance

x

x

x

xx

. . .

x

xx

xx

. . .

Page 5: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Richer Annotation?

• Usually, an annotator indicates what the correct answer is.

• We propose the annotator also indicate why.

• Why is a related task (cf. multi-task learning), so its annotations may help learn what.– In fact, “why” is especially appropriate for this.– It means “tell me something that will help me learn what.”

• Both tasks are on the same training example.– So annotator can attempt both tasks at the same time,

which is cost-efficient.

Page 6: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Armageddon

This disaster flick is a disaster alright. Directed by Tony Scott (Top Gun), it's the story of an asteroid the size of Texas caught on a collision course with Earth. After a great opening, in which an American spaceship, plus NYC, are completely destroyed by a comet shower, NASA detects said asteroid and go into a frenzy. They hire the world's best oil driller (Bruce Willis), and send him and his crew up into space to fix our global problem.

The action scenes are over the top and too ludicrous for words. So much so, I had to sigh and hit my head with my notebook a couple of times. Also, to see a wonderful actor like Billy Bob Thornton in a film like this is a waste of his talents. The only real reason for making this film was to somehow out-perform Deep Impact. Bottom line is, Armageddon is a failure.

“Hey annotator,

is this review

positive or

negative?”

Useful!

Annotators are Underutilized?

Annotator says:

(y = –1)

Dataset from Pang & Lee (2004)

1000 pos & 1000 neg reviewse.g., vs.

Used their features too(bag of words – 18K features)

Page 7: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

الحساب يوم

. الفيلم، كارثي فيلم بحق هو الكارثي الفيلم هذا ( ) ، جن توب فيلم مخرج سكوت توني إخراج منفي تكساس والية بحجم كويكب قصة يروي . مشهد بعد األرضية بالكرة لالصطدام طريقهأمريكية، فضاء سفينة فيه تدمر رائع، افتتاحيالكويكب ناسا تكتشف نيويورك، مدينة إلى إضافة . منقب بأفضل فيستعينون جنونها ويجن الذكر آنف ( ) ويرسلونه ، ويليس بروس العالم في النفط عنالعالمية مشكلتنا إلصالح الفضاء إلى وطاقمه

هذه.

وصف ويصعب بها مبالغ واآلكشن اإلثارة مشاهدكلمات بأي فيه ،سخفها نفسي وجدت حد إلى

. و مرات بضع بمدونتي جبهتي وضربت بل أتنهد،في ثورنتون بوب بيلي مثل ممثل رؤية فإن أيضا . الوحيد السبب لمواهبه مضيعة بحق هو كهذا فيلمفيلم على التفوق محاولة هو الفيلم هذا لصنع . يوم األمر، خالصة ما بطريقة امباكت ديب

. فاشل فيلم الحساب

Annotators are Underutilized!“Hey annotator,

is this review

positive or

negative?”

Useful??

Annotator says:

(y = –1)

Page 8: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

الحساب يوم

. الفيلم، كارثي فيلم بحق هو الكارثي الفيلم هذا ( ) ، جن توب فيلم مخرج سكوت توني إخراج منفي تكساس والية بحجم كويكب قصة يروي . مشهد بعد األرضية بالكرة لالصطدام طريقهأمريكية، فضاء سفينة فيه تدمر رائع، افتتاحيالكويكب ناسا تكتشف نيويورك، مدينة إلى إضافة . منقب بأفضل فيستعينون جنونها ويجن الذكر آنف ( ) ويرسلونه ، ويليس بروس العالم في النفط عنالعالمية مشكلتنا إلصالح الفضاء إلى وطاقمه

هذه.

وصف ويصعب بها مبالغ واآلكشن اإلثارة مشاهدكلمات بأي فيه ،سخفها نفسي وجدت حد إلى

. و مرات بضع بمدونتي جبهتي وضربت بل أتنهد،في ثورنتون بوب بيلي مثل ممثل رؤية فإن أيضا . الوحيد السبب لمواهبه مضيعة بحق هو كهذا فيلمفيلم على التفوق محاولة هو الفيلم هذا لصنع . يوم األمر، خالصة ما بطريقة امباكت ديب

. فاشل فيلم الحساب

Annotators are Underutilized!“Hey annotator,

is this review

positive or

negative?”

Useful!

Annotator says:

(y = –1)

Page 9: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Armageddon

This disaster flick is a disaster alright. Directed by Tony Scott (Top Gun), it's the story of an asteroid the size of Texas caught on a collision course with Earth. After a great opening, in which an American spaceship, plus NYC, are completely destroyed by a comet shower, NASA detects said asteroid and go into a frenzy. They hire the world's best oil driller (Bruce Willis), and send him and his crew up into space to fix our global problem.

The action scenes are over the top and too ludicrous for words. So much so, I had to sigh and hit my head with my notebook a couple of times. Also, to see a wonderful actor like Billy Bob Thornton in a film like this is a waste of his talents. The only real reason for making this film was to somehow out-perform Deep Impact. Bottom line is, Armageddon is a failure.

Annotators are Underutilized!“Hey annotator,

is this review

positive or

negative?”

Useful!

Annotator says:

(y = –1)

Our annotation UI:Boldface the rationales in an HTML editor.

(For other tasks, would be more complicated.) (cf. Peekaboom game (von Ahn et al. 2006) )

Page 10: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

“Annotators are underutilized. Let’s ask them to do more than just annotate class.”

Related Work• Let’s ask annotator to identify relevant features.

– Haghighi & Klein (2006), Raghavan et al. (2006), Druck et al. (2008).

• But should annotators really look at feature set?– Need a human-readable notation for describing features.

– Features could be hard for annotators to understand.– We might want different features in the future.

apple apple- noun

Apple shares up 3%

Page 11: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

. . .. . .

Non-annotated documents

Page 12: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

. . .. . .

Saving Private Ryan

War became a reality to me after seeing Saving Private Ryan. Steve Spielberg goes beyond reality with his latest production. Keep the kids home as the R rating is for Reality. Tom Hanks is stunning as Capt John Miller, set out in France during WW II to rescue and return home a soldier, Private Ryan (Matt Damon) who lost three brothers in the war. Spielberg takes us inside the heads of these individuals as they face death during the horrific battle scenes. Private Ryan is not for everyone, but I felt the time was right for a movie like this to be made. The movie reminds us of the sacrifices made by our fighting men and women. For this I thank them and for Steve Spielberg for making a movie that I will never forget. And I’m sure the Academy will not forget Tom Hanks come April, as another well deserved Oscar with be in Tom's possession.

Page 13: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

. . .. . .

Saving Private Ryan

War became a reality to me after seeing Saving Private Ryan. Steve Spielberg goes beyond reality with his latest production. Keep the kids home as the R rating is for Reality. Tom Hanks is stunning as Capt John Miller, set out in France during WW II to rescue and return home a soldier, Private Ryan (Matt Damon) who lost three brothers in the war. Spielberg takes us inside the heads of these individuals as they face death during the horrific battle scenes. Private Ryan is not for everyone, but I felt the time was right for a movie like this to be made. The movie reminds us of the sacrifices made by our fighting men and women. For this I thank them and for Steve Spielberg for making a movie that I will never forget. And I’m sure the Academy will not forget Tom Hanks come April, as another well deserved Oscar with be in Tom's possession.

Page 14: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

. . .. . .

Page 15: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Kundun

Martin Scorsese's Kundun has been criticized for its lack of narrative structure. Personally, I don't think it needs one: it works perfectly well as a study of Tibetan Buddhist culture and communist China. Scorsese views the Dalai Lama the way many Tibetans probably do, as a larger-than-life symbol of Buddhist spirituality and political leadership: the only glimpses into his head come from several dream sequences, but his portrayal is appropriate for a film that concentrates on the political and spiritual. The set design and cinematography were absolutely outstanding. Scorsese gets carried away with the spectacle, and it helps to augment the cultural contrast when the Dalai Lama travels to China to meet Chairman Mao. Kundun does not succumb to the temptation to start shouting slogans, delivering its message in an artistically interesting way without being overly manipulative.

. . .. . .

Page 16: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Kundun

Martin Scorsese's Kundun has been criticized for its lack of narrative structure. Personally, I don't think it needs one: it works perfectly well as a study of Tibetan Buddhist culture and communist China. Scorsese views the Dalai Lama the way many Tibetans probably do, as a larger-than-life symbol of Buddhist spirituality and political leadership: the only glimpses into his head come from several dream sequences, but his portrayal is appropriate for a film that concentrates on the political and spiritual. The set design and cinematography were absolutely outstanding. Scorsese gets carried away with the spectacle, and it helps to augment the cultural contrast when the Dalai Lama travels to China to meet Chairman Mao. Kundun does not succumb to the temptation to start shouting slogans, delivering its message in an artistically interesting way without being overly manipulative.

. . .. . .

Page 17: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Volcano

Volcano starts with Tommy Lee Jones worrying about a small earthquake enough to leave his daughter at home with a baby sitter. There is one small quake then another quake. Then a geologist points out to Tommy that its takes a geologic event to heat millions of gallons of water in 12 hours. A few hours later large amount of ash start to fall. Then...it starts: the volcanic eruption!

I liked this movie...but it was not as great as I hoped. It was still good none the less. It had excellent special effects. The best view was that of the helecopters flying over the streets of volcanos. Also, there were interesting side stories that made the plot more interesting. So...it was good!!

. . .. . .

Page 18: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Volcano

Volcano starts with Tommy Lee Jones worrying about a small earthquake enough to leave his daughter at home with a baby sitter. There is one small quake then another quake. Then a geologist points out to Tommy that its takes a geologic event to heat millions of gallons of water in 12 hours. A few hours later large amount of ash start to fall. Then...it starts: the volcanic eruption!

I liked this movie...but it was not as great as I hoped. It was still good none the less. It had excellent special effects. The best view was that of the helecopters flying over the streets of volcanos. Also, there were interesting side stories that made the plot more interesting. So...it was good!!

. . .. . .

. . .

Page 19: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

The Postman

Question: after the disaster that was Waterworld, what the fuck were the execs who gave Costner the money to make another movie thinking??

In this 3 hour advertisement for his new hair weave, Costner plays a nameless drifter who dons a long dead postal employee's uniform (I shit you not) and gradually turns a nuked-out USA into an idealized hippy-dippy society. (The main accomplishment of this brave new world is in re-inventing polyester.) When he's not pointing the camera directly at himself, director Costner does have a nice visual sense, but by the time the second hour rolled around, I was reduced to sitting on my hands to keep from clawing out my own eyes. Mark this one "return to sender".

. . .

. . .

Page 20: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

The Postman

Question: after the disaster that was Waterworld, what the fuck were the execs who gave Costner the money to make another movie thinking??

In this 3 hour advertisement for his new hair weave, Costner plays a nameless drifter who dons a long dead postal employee's uniform (I shit you not) and gradually turns a nuked-out USA into an idealized hippy-dippy society. (The main accomplishment of this brave new world is in re-inventing polyester.) When he's not pointing the camera directly at himself, director Costner does have a nice visual sense, but by the time the second hour rolled around, I was reduced to sitting on my hands to keep from clawing out my own eyes. Mark this one "return to sender".

. . .

. . .

Page 21: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

The Postman

Question: after the disaster that was Waterworld, what the fuck were the execs who gave Costner the money to make another movie thinking??

In this 3 hour advertisement for his new hair weave, Costner plays a nameless drifter who dons a long dead postal employee's uniform (I shit you not) and gradually turns a nuked-out USA into an idealized hippy-dippy society. (The main accomplishment of this brave new world is in re-inventing polyester.) When he's not pointing the camera directly at himself, director Costner does have a nice visual sense, but by the time the second hour rolled around, I was reduced to sitting on my hands to keep from clawing out my own eyes. Mark this one "return to sender".

. . .

. . .

Page 22: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Batman & Robin

I once wrote that Speed 2 was the worst film I've ever reviewed. I didn't know that I'd soon encounter Batman & Robin, the picture least worthy of your attention this summer. As directed by Joel Schumacher, B&R is one long excuse for a Taco Bell promotion.

The plot, which has Mr. Freeze and Poison Ivy planning to take over the world, is weighted down by repetitive asides about the nature of trust, partnership, blah, blah, blah. But morals are not the point of this film – topping each bloated, confusing action scene with the next one is. Only George Clooney comes out on top; he underplays nicely and pretends like he's in a real movie… . . .

. . .

Page 23: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Batman & Robin

I once wrote that Speed 2 was the worst film I've ever reviewed. I didn't know that I'd soon encounter Batman & Robin, the picture least worthy of your attention this summer. As directed by Joel Schumacher, B&R is one long excuse for a Taco Bell promotion.

The plot, which has Mr. Freeze and Poison Ivy planning to take over the world, is weighted down by repetitive asides about the nature of trust, partnership, blah, blah, blah. But morals are not the point of this film – topping each bloated, confusing action scene with the next one is. Only George Clooney comes out on top; he underplays nicely and pretends like he's in a real movie… . . .

. . .

Page 24: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Armageddon

This disaster flick is a disaster alright. Directed by Tony Scott (Top Gun), it's the story of an asteroid the size of Texas caught on a collision course with Earth. After a great opening, in which an American spaceship, plus NYC, are completely destroyed by a comet shower, NASA detects said asteroid and go into a frenzy. They hire the world's best oil driller (Bruce Willis), and send him and his crew up into space to fix our global problem.

The action scenes are over the top and too ludicrous for words. So much so, I had to sigh and hit my head with my notebook a couple of times. Also, to see a wonderful actor like Billy Bob Thornton in a film like this is a waste of his talents. The only real reason for making this film was to somehow out-perform Deep Impact. Bottom line is, Armageddon is a failure.

. . .

. . .

Page 25: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Armageddon

This disaster flick is a disaster alright. Directed by Tony Scott (Top Gun), it's the story of an asteroid the size of Texas caught on a collision course with Earth. After a great opening, in which an American spaceship, plus NYC, are completely destroyed by a comet shower, NASA detects said asteroid and go into a frenzy. They hire the world's best oil driller (Bruce Willis), and send him and his crew up into space to fix our global problem.

The action scenes are over the top and too ludicrous for words. So much so, I had to sigh and hit my head with my notebook a couple of times. Also, to see a wonderful actor like Billy Bob Thornton in a film like this is a waste of his talents. The only real reason for making this film was to somehow out-perform Deep Impact. Bottom line is, Armageddon is a failure.

. . .

. . .. . .

Page 26: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

. . .. . .

Annotated documents

Class and also “rationales”

OK … now what??

<x,y,r>

Page 27: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• In our experiments, we use a linear classifier.

• Classifier is represented by a weight vector .

• Rationales will play a role when learning .

• Improvements solely due to a better-learned .

• At test time:– No change to decision rule. – No new features.– No need for rationales.

Linear Classifier

(vs. no rationales)

Page 28: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• At test time, when presented with document x:

• Where:– : feature vector (18k binary unigram features).– : corresponding weights (i.e. 18k weights).

Positive weights favor y = +1

Negative weights favor y = –1

Linear Classifier

)(f

0)( , 1

0)( , 1 Choose *

xf

xfy

Page 29: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Trees Lounge is the directoral debut from one of my favorite actors, Steve Buscemi. He gave memorable performences in in The Soup, Fargo, and Reservoir Dogs. Now he tries his hand at writing, directing and acting all in the same flick. The movie starts out awfully slow with Tommy (Buscemi) hanging around a local bar the "trees lounge" and him pestering his brother. It's obvious he a loser. But as he says "it's better I'm a loser and know I am, then being a loser and not thinking I am." Well put. The story starts to take off when his uncle dies, and Tommy, not having a job, decides to drive an ice cream truck. Well, the movie starts to pick up with him finding a love interest in a 17 year old girl named Debbie (Chloe Sevigny) and... I liked this movie alot even though it did not reach my expectation. After you've seen him in Fargo and Reservoir Dogs, you know he is capable of a better performence. I think his brother, Michael, did an excellent job for his debut performence. Mr. Buscemi is off to a good career as a director!

! " ( ) , . a acting actors after all alot am an and around as at awfully bar being better brother buscemi but capable career chloe cream debbie debut decides did dies directing director dogs drive even excellent expectation fargo favorite finding flick for from gave girl good hand hanging having he him his i ice i'm in interest is it it's job know liked local loser lounge love memorable michael movie mr my named not now obvious of off old one out pick put reach reservoir same says seen sevigny slow soup starts steve story take the then think thinking this though to tommy trees tries truck uncle up well when with writing year you you've

fand = 1, freach = 1, fterrific= 0, …

x

f (x): 0/1 vector; 18k elements

Page 30: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

! " ( ) , . a acting actors after all alot am an and around as at awfully bar being better brother buscemi but capable

career chloe cream debbie .

debut decides did dies directing director dogs drive even excellent expectation fargo favorite

finding flick for from gave girl .

good hand hanging having he him his i ice i'm in interest is it it's job know liked local loser lounge love

memorable michael movie .

mr my named not now obvious of off old one out pick put reach reservoir same says seen sevigny slow soup

starts steve story take .

the then think thinking this though to tommy trees tries truck uncle up well when with writing year

you you've .

! " ( ) , . a acting actors after all alot am an and around as at awfully bar being better brother buscemi but capable career chloe cream debbie debut decides did dies directing director dogs drive even excellent expectation fargo favorite finding flick for from gave girl good hand hanging having he him his i ice i'm in interest is it it's job know liked local loser lounge love memorable michael movie mr my named not now obvious of off old one out pick put reach reservoir same says seen sevigny slow soup starts steve story take the then think thinking this though to tommy trees tries truck uncle up well when with writing year you you've

abc abc abc abc abc-0.2 -0.1 0.0 +0.1 +0.2

favor y = +1favor y = –1

Page 31: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Page 32: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Saving Private Ryan

War became a reality to me after seeing Saving Private Ryan. Steve Spielberg goes beyond reality with his latest production. Keep the kids home as the R rating is for Reality. Tom Hanks is stunning as Capt John Miller, set out in France during WW II to rescue and return home a soldier, Private Ryan (Matt Damon) who lost three brothers in the war. Spielberg takes us inside the heads of these individuals as they face death during the horrific battle scenes. Private Ryan is not for everyone, but I felt the time was right for a movie like this to be made. The movie reminds us of the sacrifices made by our fighting men and women. For this I thank them and for Steve Spielberg for making a movie that I will never forget. And I’m sure the Academy will not forget Tom Hanks come April, as another well deserved Oscar with be in Tom's possession.

How sure was the annotator that this is a positive review?

Page 33: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Saving Private Ryan

War became a reality to me after seeing Saving Private Ryan. Steve Spielberg goes beyond reality with his latest production. Keep the kids home as the R rating is for Reality. Tom Hanks is stunning as Capt John Miller, set out in France during WW II to rescue and return home a soldier, Private Ryan (Matt Damon) who lost three brothers in the war. Spielberg takes us inside the heads of these individuals as they face death during the horrific battle scenes. Private Ryan is not for everyone, but I felt the time was right for a movie like this to be made. The movie reminds us of the sacrifices made by our fighting men and women. For this I thank them and for Steve Spielberg for making a movie that I will never forget. And I’m sure the Academy will not forget Tom Hanks come April, as another well deserved Oscar with be in Tom's possession.

How sure was the annotator that this is a positive review?

Page 34: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Saving Private Ryan

War became a reality to me after seeing Saving Private Ryan. Steve Spielberg goes beyond reality with his latest production. Keep the kids home as the R rating is for Reality. Tom Hanks is stunning as Capt John Miller, set out in France during WW II to rescue and return home a soldier, Private Ryan (Matt Damon) who lost three brothers in the war. Spielberg takes us inside the heads of these individuals as they face death during the horrific battle scenes. Private Ryan is not for everyone, but I felt the time was right for a movie like this to be made. The movie reminds us of the sacrifices made by our fighting men and women. For this I thank them and for Steve Spielberg for making a movie that I will never forget. And I’m sure the Academy will not forget Tom Hanks come April, as another well deserved Oscar with be in Tom's possession.

If a rationale is masked out, the annotator would not be as sure that this is a positive review.

Intuition: a good model should also be less sure.

How sure was the annotator that this is a positive review?

Page 35: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

“Contrast” Examples

Original Example

Contrast ExamplesObtain a contrast by

masking out a rationale

Intuition: a good model should be less sure of a positive classification on contrasts than on the original.

Our work: modified SVM that takes this intuition into account.

Page 36: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Standard SVM

Class +1 examples

Class -1 examples

Page 37: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Standard SVM

Class +1 examples

Class -1 examples

Page 38: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Standard SVM

Class +1 examples

Class -1 examples

Page 39: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Class +1 examples

Class -1 examples

Support vectors

Standard SVM

Class +1 examples

Class -1 examples

Page 40: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Standard SVM

Page 41: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

w1

1x

Standard SVM2x

Page 42: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

w1

1x

Standard SVM2x

:Minimize

2

2

1w

:subject to

1 ixw

Page 43: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

12v11vj1v

1 ixw

:Minimize

2

2

1w

:subject to

w1

1x

Incorporating Contrasts

. . . . . .

2x

. . . . . .22v

21vj2v

Page 44: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

12v11vj1v

1 ixw

:Minimize

2

2

1w

:subject to

w1

1x

Incorporating Contrasts

. . . . . .

2x

. . . . . .22v

21vj2v

Original (x)

Contrast (v)

Page 45: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

w1

1x

Incorporating Contrasts

. . . . . .

12v11vj1v

2x

. . . . . .22v

21vj2v

1 ixw

:Minimize

2

2

1w

:subject to

Original (x)

Contrast (v)

Page 46: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

w1

1x

Incorporating Contrasts

. . . . . .

12v11vj1v

2x

. . . . . .22v

21vj2v

1 ixw

:Minimize

2

2

1w

:subject to

Original (x)

Contrast (v)

Page 47: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

w1

1x

Incorporating Contrasts

. . . . . .

12v11vj1v

2x

. . . . . .22v

21vj2v

1 ixw

:Minimize:subject to

ALSO

2

2

1w

:subject to

iji vwxw

Original (x)

Contrast (v)

Page 48: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

1

iji vxw

w1

1x

Incorporating Contrasts

. . . . . .

12v11vj1v

2x

. . . . . .22v

21vj2v

1 ixw

:Minimize:subject to

ALSO

2

2

1w

:subject to

iji vwxw

Original (x)

Contrast (v)

Page 49: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

w1

1x

. . . . . .

12v11vj1v

2x

. . . . . .22v

21vj2v

1 ixw 1 ijxw

:Minimize:subject to

ALSO

2

2

1w

:subject to

iji vwxw

1

iji vxw

Original (x)

Contrast (v)

def

Incorporating Contrasts

Page 50: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

1 ixw i 1 ijxw

:Minimize:subject to

ALSO

2

2

1w

:subject to

w1

1x

Slack Variables

. . . . . .

12v11vj1v

2x

. . . . . .22v

21vj2v

iji vwxw

1

iji vxw

Original (x)

Contrast (v)

i

iC

Page 51: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

ij1 ixw i 1 ijxw

:Minimize:subject to

ALSO

2

2

1w

:subject to

w1

1x

Slack Variables

. . . . . .

12v11vj1v

2x

. . . . . .22v

21vj2v

iji vwxw

1

iji vxw ij

)1( ij

Original (x)

Contrast (v)

ji

ijcontrastC,

i

iC

Page 52: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

) (iy 1 ixw i ij) (iy 1 ijxw

jii

ijcontrasti CC,

:Minimize:subject to

ALSO

2

2

1w

:subject to

w1

1x

((Include Negative Examples))

. . . . . .

12v11vj1v

2x

. . . . . .22v

21vj2v

iji vwxw

1

iji vxw ij

)1( ij) (iy

) (iy

Original (x)

Contrast (v)

Page 53: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

ij) (iy 1 ijxw

:Minimize:subject to

ALSO

2

2

1w

:subject to

w1

1x

The Modified SVM

. . . . . .

12v11vj1v

2x

. . . . . .22v

21vj2v

iji vwxw

1

iji vxw ij

)1( ij) (iy

) (iy 1 ixw i

) (iy

Original (x)

Contrast (v)

jii

ijcontrasti CC,

Page 54: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

What this Means in Practice

Page 55: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

What this Means in Practice

Standard SVM cares about this margin

Page 56: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

What this Means in Practice

Standard SVM cares about this margin

Modified SVM cares about both margins

Page 57: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

What this Means in Practice

Standard SVM cares about this margin

Modified SVM cares about both margins

i.e. a hyperplane w that might reduce standard margin (to help the other margin)

Page 58: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Recap

• Training examples: (x1,y1), (x2,y2), …

• yi has ni rationales: ri1,ri2,…,rin

• xi gives ni contrast examples: vi1,vi2,…,vin

(obtain jth contrast by masking out jth rationale.)

• We extend the SVM to determine best hyperplane subject to:– Constraints for standard margin,

and also– Constraints for original/contrast separating margin.

Page 59: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

What this is not• In tasks like digit recognition, one can “generate” more

training data from the existing examples

Class-preserving transformations

?

Page 60: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

What this is not

New examples have original class

Automatic (rescale, deslant, etc)

?New examples have other class, or have original class less confidently

Manual (new info via human insight)

Page 61: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Results

Page 62: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Baseline

Contrasts Introduced

Standard vs. Modified SVMS

VM

as constrasts:+

Mod

ified

SV

M

Significantly different

Page 63: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size (Documents)

Ac

cu

rac

y (

%)

Baseline

Contrasts Introduced

Page 64: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Baseline

Contrasts Introduced

SV

M

as constrasts:+

Mod

ified

SV

M

Page 65: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Baseline

Contrasts Introduced

Rationales Removed

SV

MS

VM

as constrasts:+

Mod

ified

SV

M

Removing rationales

hurts baseline

Page 66: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Baseline

Contrasts Introduced

Rationales Removed

SV

MS

VM

as constrasts:+

Mod

ified

SV

M

Page 67: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Baseline

Contrasts Introduced

Rationales Removed

Rationales Only (as concatenations)

SV

MS

VM

SV

M

as constrasts:+

Mod

ified

SV

M

Keeping only rationales

hurts performance

Page 68: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Baseline

Contrasts Introduced

Rationales Removed

Rationales Only (as concatenations)

SV

MS

VM

SV

M

as constrasts:+

Mod

ified

SV

M

Page 69: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Baseline

Contrasts Introduced

Rationales Removed

Rationales Only (as individual docs)

Rationales Only (as concatenations)

SV

M

SV

M

SV

MS

VM

as constrasts:+

Mod

ified

SV

M

Page 70: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Baseline

Contrasts Introduced

Rationales Removed

Rationales Only (as individual docs)

Rationales Only (as concatenations)

SV

M

SV

M

SV

MS

VM

as constrasts:+

Mod

ified

SV

M

Pieces to solving classification puzzlecannot be found solely in the rationales

Page 71: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

We got better performance, but at what increment to the cost?

Page 72: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• With a fixed budget, better to annotate rationales, or just annotate more documents?

• We did timing studies on 3 annotation tasks.– 50 docs/task given to four annotators

Annotation Time

Page 73: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

– T1: given document, annotate the class.– T2: given document and gold standard class,

annotate the rationales.– T3: given document, annotate both the class

and the rationales.

• We found that Time(T3) ≈ 2 x Time(T1)

• Even though on average 8.3 rationales/doc + class!– This task has unusually high why/what ratio– Added cost would presumably be lower for other tasks

• How come it’s so fast?– Annotator already needs to find rationales to determine class.

Extra work is only to make them explicit:

Time(T3) < Time(T1)+Time(T2) by about 20%

Annotation Time

Page 74: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Using Rationales fromsome (and not all) Documents

• To speed up, can we get most of the benefit by using rationales from some training documents instead of all training documents?

Baseline

Contrasts Introduced

Explore this space

Page 75: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Baseline

Contrasts Introduced

Page 76: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

No Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Page 77: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Rationales from Only 200 Documents

T = 800: Class annotation from 800 documents.

R = 200: Rationales from 200 documents only.

Page 78: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Use Rationales from R Documents

Class annotation from T documents.

Rationales from R documents only.

Page 79: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Use Rationales from R Documents

Class annotation from T documents.

Rationales from R documents only.

Page 80: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Use Rationales from R Documents

Class annotation from T documents.

Rationales from R documents only.

Page 81: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Use Rationales from R Documents

Class annotation from T documents.

Rationales from R documents only.

Page 82: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Use Rationales from R Documents

Class annotation from T documents.

Rationales from R documents only.

Page 83: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Use Rationales from R Documents

Class annotation from T documents.

Rationales from R documents only.

Page 84: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Use Rationales from R Documents

Class annotation from T documents.

Rationales from R documents only.

Page 85: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

0 400 800 1200 1600

Training Set Size T (Documents)

93

91

89

87

85

83

81

79

Ac

cu

rac

y (

%)

Use Rationales from R Documents

Class annotation from T documents.

Rationales from R documents only.

Page 86: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• Observation #1: much of the benefit can be obtained without annotating 100% of the documents– e.g. (0%, 50%, 100%) for T = 800 and T = 1600– Recommendation: If rationales are expensive, still collect some.

Using Rationales fromsome (and not all) Documents

Page 87: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• Observation #2: Adding more rationales is often more useful than adding more documents.– E.g., adding rationales provides a fresh benefit even after

conventional annotation flattens out (diminishing returns)– Recommendation: Adaptively decide when to solicit rationales,

using marginal cost/benefitmeasurements.

(Could combine with

active learning.)

Using Rationales fromsome (and not all) Documents

Page 88: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Rationales from Documents

• Observation #3: Lazy annotators would likewise be faster– Fewer rationales per doc (instead of fewer docs with rationales)– Keep random fraction per doc similar results to before

some (and not all)Using

Page 89: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Rationales from Documents

• Observation #3: Lazy annotators would likewise be faster– Fewer rationales per doc (instead of fewer docs with rationales)– Keep random fraction per doc similar results to before– A real live lazy annotator may be even better

• Keeps the obvious rationales: disproportionately fast and useful?

some (and not all)Using

0 400 800 1200 1600

Training Set Size T (Documents)

Diligent Lazy (simulated)

The two (T=800,R=200) points are comparable (same # of rationales)

Page 90: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Page 91: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Problem with previous method:In general, how to construct feature

vectors for contrast examples?• Not obvious which features to turn off if we

“mask out” these rationale regions of the image.– (Ideally, would impute the missing data: expensive!)

Annotator: “this is a one because there is a lot of empty space here and here”

Page 92: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Why are rationales useful, anyway?

The Annotator

Class labelsRationales

Two kinds of annotation

Both give evidence about same parameters

Page 93: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

–1! " ( ) , . a acting actors after all alot am an and around as at awfully bar being better brother buscemi but capable

career chloe cream debbie .

debut decides did dies directing director dogs drive even excellent expectation fargo favorite

finding flick for from gave girl .

good hand hanging having he him his i ice i'm in interest is it it's job know liked local loser lounge love

memorable michael movie .

mr my named not now obvious of off old one out pick put reach reservoir same says seen sevigny slow soup

starts steve story take .

the then think thinking this though to tommy trees tries truck uncle up well when with writing year

you you've .

This was learned in a standard way:

choose that models class labels well

(of training data)

(incorrect)

logistic regression – probabilistic equiv. of SVM

Page 94: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

choose that models class labels & rationales well

Our work: another way to learn :

(of training data)

! “ ( ) , . a acting actors after all alot am an and around as at awfully bar being better brother buscemi but capable career chloe

cream debbie .

debut decides did dies directing director dogs drive even excellent expectation fargo favorite finding flick for from gave girl .

good hand hanging having he him his I ice i'm in interest is it it's job know liked local loser lounge love memorable michael movie .

mr my named not now obvious of off old one out pick put reach reservoir same says seen sevigny slow soup starts steve story take .

the then think thinking this though to tommy trees tries truck uncle up well when with writing year you you've .

Page 95: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

+1

choose that models class labels & rationales well

Our work: another way to learn :

(of training data)

! “ ( ) , . a acting actors after all alot am an and around as at awfully bar being better brother buscemi but

capable career chloe cream debbie .

debut decides did dies directing director dogs drive even excellent expectation fargo favorite

finding flick for from gave girl

good hand hanging having he him his I ice i'm in interest is it it's job know liked local loser lounge love

memorable michael movie .

mr my named not now obvious of off old one out pick put reach reservoir same says seen sevigny slow

soup starts steve story take .

the then think thinking this though to tommy trees tries truck uncle up well when with writing

year you you've .

Page 96: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

–1! " ( ) , . a acting actors after all alot am an and around as at awfully bar being better brother buscemi but capable

career chloe cream debbie .

debut decides did dies directing director dogs drive even excellent expectation fargo favorite

finding flick for from gave girl .

good hand hanging having he him his i ice i'm in interest is it it's job know liked local loser lounge love

memorable michael movie .

mr my named not now obvious of off old one out pick put reach reservoir same says seen sevigny slow soup

starts steve story take .

the then think thinking this though to tommy trees tries truck uncle up well when with writing year

you you've .

This was learned in a standard way:

(of training data)

choose that models class labels well

Page 97: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

+1 ! “ ( ) , . a acting actors after all alot am an and around as at awfully bar being better brother buscemi but

capable career chloe cream debbie .

debut decides did dies directing director dogs drive even excellent expectation fargo favorite

finding flick for from gave girl

good hand hanging having he him his I ice i'm in interest is it it's job know liked local loser lounge love

memorable michael movie .

mr my named not now obvious of off old one out pick put reach reservoir same says seen sevigny slow

soup starts steve story take .

the then think thinking this though to tommy trees tries truck uncle up well when with writing

year you you've .

This was learned in a novel way (our algorithm!):

choose that models class labels & rationales well

(of training data)

Page 98: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

+1 ! “ ( ) , . a acting actors after all alot am an and around as at awfully bar being better brother buscemi but

capable career chloe cream debbie .

debut decides did dies directing director dogs drive even excellent expectation fargo favorite

finding flick for from gave girl

good hand hanging having he him his I ice i'm in interest is it it's job know liked local loser lounge love

memorable michael movie .

mr my named not now obvious of off old one out pick put reach reservoir same says seen sevigny slow

soup starts steve story take .

the then think thinking this though to tommy trees tries truck uncle up well when with writing

year you you've .

This was learned in a novel way (our algorithm!):

(of training data)

choose that models class labels & rationales well

Page 99: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

n

iii xyp

1

),|(maxarg Choose

),,|(

iii xyrp

Page 100: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

n

iii xyp

1

),|(maxarg Choose

),,|(

iii xyrp

Page 101: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

n

iii xyp

1

),|(maxarg Choose

),,|(

iii xyrp

From a review for “Prince of Egypt” ( y = +1 ):

O O O O O I I I I O

51 weeks into '98 , a champ has emerged .

O O O O I I I I I O

the prince of egypt succeeds where other movies failed .

Page 102: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

n

iii xyp

1

),|(maxarg Choose

),,|(

iii xyrp

From a review for “Prince of Egypt” ( y = +1 ):

O O O O O I I I I O

51 weeks into '98 , a champ has emerged .

O O O O I I I I I O

the prince of egypt succeeds where other movies failed .

Page 103: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

n

iii xyp

1

),|(maxarg Choose

x

x

),,|(

iii xyrp

From a review for “Prince of Egypt” ( y = +1 ):

O O O O O I I I I O

51 weeks into '98 , a champ has emerged .

O O O O I I I I I O

the prince of egypt succeeds where other movies failed .

Page 104: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

n

iii xyp

1

),|(maxarg Choose

x

x

),,|(

iii xyrp

From a review for “Prince of Egypt” ( y = +1 ):

O O O O O I I I I O

51 weeks into '98 , a champ has emerged .

O O O O I I I I I O

the prince of egypt succeeds where other movies failed .

Page 105: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

n

iii xyp

1

),|(maxarg Choose

x

x

),,|(

iii xyrp

r

r

From a review for “Prince of Egypt” ( y = +1 ):

O O O O O I I I I O

51 weeks into '98 , a champ has emerged .

O O O O I I I I I O

the prince of egypt succeeds where other movies failed .

Page 106: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

n

iii xyp

1

),|(maxarg Choose

x

x

),,|(

iii xyrp

r

r

From a review for “Prince of Egypt” ( y = +1 ):

O O O O O I I I I O

51 weeks into '98 , a champ has emerged .

O O O O I I I I I O

the prince of egypt succeeds where other movies failed .

• We encode rationales as a tag sequence.

Page 107: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

n

iii xyp

1

),|(maxarg Choose

x

x

),,|(

iii xyrp

r

r

From a review for “Prince of Egypt” ( y = +1 ):

O O O O O I I I I O

51 weeks into '98 , a champ has emerged .

O O O O I I I I I O

the prince of egypt succeeds where other movies failed .

• We encode rationales as a tag sequence.

• The model should predict tag sequence with some help from . Let’s describe it…

),,|(

xyrp

Page 108: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

…r1

x1

r2

x2

rM

xM

r3

x3

(Tags)

(Words)

Designing the Model .• Mission: model {I,O} tag sequence with ’s help.

Hmm… has no role here.

Page 109: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• Makes sense: if for some word w, , then annotator is more likely to mark it I.

• How much more likely? Depends on annotator.• Nonlinear function, parameterized by .• is learned jointly with from annotator’s data.

…r1

x1

r2

x2

rM

xM

r3

x3

(Tags)

(Words)

Designing the Model .• Mission: model {I,O} tag sequence with ’s help.

( )

Page 110: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• Problem with this model: it would like to explain every I tag via high for its word.

• BUT:

( )

… temptation to start shouting slogans, delivering its message in an artistically interesting way without being overly manipulative.

…r1

x1

r2

x2

rM

xM

r3

x3

(Tags)

(Words)

Designing the Model .• Mission: model {I,O} tag sequence with ’s help.

Page 111: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• So, learn 4 more weights:

( )

… temptation to start shouting slogans, delivering its message in an artistically interesting way without being overly manipulative.

…r1

x1

r2

x2

rM

xM

r3

x3

(Tags)

(Words)

Designing the Model .• Mission: model {I,O} tag sequence with ’s help.

Page 112: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• Mission: model {I,O} tag sequence with ’s help.

Designing the Model .

• So, learn 4 more weights:

– These model the length and quantity of rationales.

• Better! Now: “w is marked with I either because of high or being around others marked with I.”

• CRF: Used in other labeling tasks too (POS, NER).

…r1

x1

r2

x2

rM

xM

r3

x3

(Tags)

(Words)

LM CM

( )

Page 113: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• Linear classifier:

• Our work: choose that models all of annotator’s data well: both class labels and rationales.

• Remember, at test time:– No change to decision rule.

– No new features.

– No need for rationales.

Recap

Learned jointly : Standard log-linear model

: CRF predicting I/O tag sequence

Improvement due solely to better-learned .

Page 114: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Results

Page 115: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Discriminative Baseline (SVMs)

Generative Baseline (log-lin.)

Discriminative Rationales

Generative Rationales

0 400 800 1200 1600

Training Set Size (Documents)

95

93

91

89

87

85

83

81

79

Acc

urac

y (%

)

Significantly different

Significantly different

Page 116: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• Remember this?

• Actually, we used an expanded set that includes about 100 “conditional” transition features, e.g.:– How often do you see I-O around punctuation?– How often do you see I-O around syntactic boundaries?

• Extra features model rationales better, but don’t help learn any better than basic feature set.

Fancy -Features

…r1

x1

r2

x2

rM

xM

r3

x3

(Tags)

(Words)

Page 117: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• is a noisy channel, parameterized by .

• Noisy? Annotators use r to tell us about , but they don’t do it “perfectly”:– They may mark too little (or too much).– They may prefer to start rationales at syntactic

boundaries, and/or end around punctuation.

• To what degree? Remember that we train both models jointly:

• So, learned captures style of a specific annotator.

. Captures Annotator’s Style

Page 118: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• So, learned captures style of a specific annotator.

• “So, perhaps you got lucky with A0 (whose data you used in experiments). Would this work with others?”

. Captures Annotator’s Style

SVM baseline 72.0 72.0 72.0 70.0

“SVM Contrast” method 75.0 73.0 74.0 72.0

Smaller experiments (100 documents only) A0 A3 A4 A5

Log-linear baseline 71.0 73.0 71.0 70.0

Generative method 76.0 76.0 77.0 74.0

Each annotator’s own works best. (So there really were different annotation styles to

learn!)

Page 119: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

• 2 baseline methods: SVM, logistic regression.• Improved each by using rationales at training time.

• Our second approach is better, and generalizes:– To other classifiers: incorporate in objective.– To other domains: e.g., model how relevant portions of

image tend to trigger rationales (“channel model”) and what size/shape rationales tend to be (“language model”).

• Doing an annotation project? Collect rationales!– Even a small number could help.– “You may get more bang for the buck!”

Conclusions

Page 120: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

On The Internets

• The enriched dataset (and slides and papers):http://cs.jhu.edu/~ozaidan/rationales

Thank you!

Page 121: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Thank you!

Page 122: Machine Learning with Annotator Rationales to Reduce Annotation Cost Omar F. Zaidan Jason Eisner Christine D. Piatko Johns Hopkins University NIPS Workshop

Zaidan, Eisner, Piatko – Annotator Rationales

Machine Learning with Annotator Rationalesto Reduce Annotation Cost

This talk presented an interesting idea; it does seem to me that rationales would help. I think the authors are on the right track. However the talk was quite long. The animation was OK, I suppose, though I think some people might be put off by it. At any rate, I’d be interested to know more about this research. It should be noted that other ML methods may benefit from this approach even more than SVM’s, if such a learning method emulates the human decision process more than an SVM (e.g. a decision tree). An interesting idea overall though more experimental results are needed to convince me completely.

Thank you!