Alan Ritter, Ohio State University

Data-Driven Response Generation

1950s ~ 2010 Dialog systems mostly rule-based

Goal-Directed Dialogue Systems:

- ATIS Dataset (Hemphill, 1990)
  - 774 flight reservation conversations
  - Manually annotated

Chatbots:

- Rule-Based: Eliza (Weizenbaum 1966)
- Information Retrieval (Isbell et al. 2000)

1990s ~ 2010s Data-Driven Machine Translation

millions of bilingual documents on the web

Findings of WMT 2010 (Callison-Burch et al.)

The Mathematics of Statistical Machine Translation: Parameter Estimation (Brown et al.)

July 2011 Data-Driven Dialogue

500 million conversations per month on Twitter alone (vs. 30m for French-English translation)

Alan Ritter, Colin Cherry, Bill Dolan (EMNLP 2011) “Data-Driven Response Generation in Social Media”

NLP on Noisy User-Generated Text:

- Unsupervised Dialogue Acts (Ritter, Cherry, Dolan, NAACL 2010)
- Named Entity Recognition (Ritter et al. EMNLP 2011)
- Open-Domain Event Extraction (Ritter et al. KDD 2012)
- Minimally-Supervised Event Extraction (Ritter et al. WWW 2015)

MT → Dialogue

… and they lived happily ever after.

But, unlike MT, a message and its response are not semantically equivalent.

Input: Who wants to come over for dinner tomorrow?

Output: Yum ! I want to be there tomorrow !

(Output built up phrase by phrase: {Yum ! I} {want to} {be there} {tomorrow !})

2015 ~ present Neural MT-based Conversation Models

• I. Serban, A. Sordoni, Y. Bengio, A. Courville and J. Pineau. Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Networks. AAAI 2016.

• Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, Jason Weston. Evaluating Prerequisite Qualities for Learning End-to-end Dialog Systems. ICLR 2016.

• Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Meg Mitchell, Jian-Yun Nie, Jianfeng Gao and Bill Dolan. A Neural Network Approach to Context-Sensitive Generation of Conversational Responses. NAACL 2015.

• Lifeng Shang, Zhengdong Lu, Hang Li. Neural Responding Machine for Short Text Conversation. ACL 2015.

• O. Vinyals, Q.V. Le. A Neural Conversational Model. ICML Deep Learning Workshop 2015.

• Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao and Bill Dolan. A Diversity-Promoting Objective Function for Neural Conversation Models. NAACL 2016.

But maximum likelihood responses can be safe and boring.

$$\hat{r}_1, \ldots, \hat{r}_l = \arg\max_{r_1, \ldots, r_l} P(r_1, \ldots, r_l \mid m_1, \ldots, m_k)$$

where $m_1, \ldots, m_k$ is the input message and $r_1, \ldots, r_l$ is the response.

Some replies work for almost any input:

“I don’t know”
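
A minimal sketch of why the arg max can land on a generic reply. This is a toy count-based model, not the system from the talk; the corpus `PAIRS` and the helper names `fit` and `mle_response` are hypothetical. Specific replies split their probability mass across many wordings, so the one reply that fits every message ends up with the highest conditional probability.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus of (message, response) pairs; not data from the talk.
PAIRS = [
    ("how old are you ?", "i 'm 16 ."),
    ("how old are you ?", "i 'm 30 ."),
    ("how old are you ?", "old enough"),
    ("how old are you ?", "i don 't know"),
    ("how old are you ?", "i don 't know"),
    ("where are you from ?", "ohio"),
    ("where are you from ?", "seattle"),
    ("where are you from ?", "i don 't know"),
    ("where are you from ?", "i don 't know"),
]

def fit(pairs):
    """Count responses per message; normalized counts are the MLE of P(r | m)."""
    table = defaultdict(Counter)
    for message, response in pairs:
        table[message][response] += 1
    return table

def mle_response(table, message):
    """Return (arg max_r P(r | message), its probability) under the counts."""
    counts = table[message]
    response, count = counts.most_common(1)[0]
    return response, count / sum(counts.values())

table = fit(PAIRS)
for message in table:
    print(message, "->", mle_response(table, message))
```

In this toy corpus every message maps to the same generic reply, even though most individual training responses are specific.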

2016 Neural Dialogue with Deep Reinforcement Learning

J. Li, W. Monroe, A. Ritter, M. Galley, J. Gao, D. Jurafsky (EMNLP 2016) “Deep Reinforcement Learning for Dialogue Generation”

Jiwei Li (PhD Stanford 2017)

Problem: Short-sighted conversation decisions.

How old are you ?
i 'm 16 .
16 ?
i don 't know what you 're talking about
you don 't know what you 're saying
i don 't know what you 're talking about
you don 't know what you 're saying

Bad Action

Outcome does not emerge until a few turns later

Can Reinforcement Learning Handle This?

Notations: State

How old are you ?

Encoding: how old are you

Notations: Action

How old are you ?

i 'm 16 .

[Figure: the encoder reads “how old are you”; decoder token rows shown are “EOS I’m fine .” and “I’m 16 . EOS”]

Simulation

[Figure: A message from training set → Encode → Decode r1 → Encode → Decode r2]

Compute Accumulated Reward R(S1, S2, …, Sn)

[Figure: Input Message → Encode → Decode (Turn 1) → Encode → Decode (Turn 2) → … → Encode → Decode (Turn N), with S1, S2, …, Sn marked along the turns]
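
A minimal sketch of the simulation and accumulated-reward steps, under loud assumptions: `scripted_policy` is a hypothetical stand-in for the encoder-decoder (it simply replays the looping conversation shown earlier), and `turn_reward` is a made-up reward that penalizes dull or repeated turns. Neither is the reward or the model from the paper.

```python
from typing import Callable, List

# Hypothetical list of dull replies, taken from the example conversation.
DULL = {
    "i don 't know what you 're talking about",
    "you don 't know what you 're saying",
}

def scripted_policy(history: List[str]) -> str:
    """Stand-in for Encode/Decode: replays the looping example conversation."""
    script = ["i 'm 16 .", "16 ?",
              "i don 't know what you 're talking about",
              "you don 't know what you 're saying"]
    i = len(history) - 1          # number of responses produced so far
    if i < len(script):
        return script[i]
    return script[2 + (i % 2)]    # keep alternating between the two dull replies

def simulate(message: str, policy: Callable[[List[str]], str], n_turns: int) -> List[str]:
    """Roll out n_turns; each turn is generated from the dialogue so far."""
    history = [message]
    for _ in range(n_turns):
        history.append(policy(history))
    return history[1:]

def turn_reward(history: List[str], response: str) -> float:
    """Assumed reward: -1 for a dull or already-seen turn, +1 otherwise."""
    return -1.0 if response in DULL or response in history else 1.0

def accumulated_reward(message: str, turns: List[str]) -> float:
    """Sum the per-turn rewards over the whole simulated conversation."""
    history, total = [message], 0.0
    for turn in turns:
        total += turn_reward(history, turn)
        history.append(turn)
    return total

turns = simulate("how old are you ?", scripted_policy, 6)
print(accumulated_reward("how old are you ?", turns))
```

The first two turns each earn +1, but the four looping turns that follow each earn -1, so the total only turns negative several turns after the conversation went wrong.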

Policy Gradient Methods:

REINFORCE Algorithm (Williams, 1992)

[Figure: the same Encode/Decode rollout from Input Message through Turn 1, Turn 2, …, Turn N with S1, S2, …, Sn, annotated “What we want to learn”]
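
A minimal sketch of the REINFORCE update on a toy problem, assuming a softmax policy over two candidate replies with fixed, made-up rewards (a one-step bandit, with no baseline and no sequence model). The function name `reinforce` and its parameters are hypothetical; only the update rule, a step along reward times the gradient of the log-probability of the sampled action, is the technique itself.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce(rewards, steps=2000, lr=0.1, seed=0):
    """Train softmax logits with REINFORCE; rewards[a] is the reward of action a."""
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)              # policy parameters (one logit per action)
    for _ in range(steps):
        probs = softmax(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]
        reward = rewards[action]
        for j in range(len(theta)):
            # d/d theta_j of log pi(action) for a softmax policy
            grad_log_prob = (1.0 if j == action else 0.0) - probs[j]
            theta[j] += lr * reward * grad_log_prob
    return softmax(theta)

# Assumed rewards: action 0 is a dull reply (-1), action 1 an informative one (+1).
print(reinforce([-1.0, 1.0]))
```

With these rewards every update moves probability toward the informative reply, so the learned policy ends up choosing it almost always.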

Q: How to Specify a Reward Signal?

J. Li, W. Monroe, T. Shi, S. Jean, A. Ritter, D. Jurafsky (EMNLP 2017) “Adversarial Learning for Neural Dialogue Generation”

A: Turing Test

Adversarial Learning (Goodfellow et al., 2014)

Adversarial Learning for Neural Dialogue

- Response Generator: generate response
- Real-world conversations: sample human response
- Discriminator: Real or Fake?

(Alternate Between Training Generator and Discriminator)

REINFORCE Algorithm (Williams, 1992)
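
A minimal sketch of the alternating generator/discriminator loop, assuming a tabular toy setup rather than the neural models from the paper. The candidate replies, the `HUMAN` index list, the fixed 0.5 baseline, and the function name `train` are all hypothetical. The discriminator keeps one logit per reply for "this came from a human"; the generator is a softmax policy updated with REINFORCE, using the discriminator's output as its reward.

```python
import math
import random

CANDIDATES = ["i don 't know", "i 'm 16 .", "old enough"]  # hypothetical replies
HUMAN = [1, 2]  # assumption: humans in this toy never give the dull reply (index 0)

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(steps=5000, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = [2.0, 0.0, 0.0]  # generator logits, initially biased to the dull reply
    w = [0.0, 0.0, 0.0]      # discriminator logits: sigmoid(w[c]) = P(human | c)
    for _ in range(steps):
        probs = softmax(theta)
        # Discriminator step: ascend log D(human) + log(1 - D(fake)).
        human = rng.choice(HUMAN)
        fake = rng.choices(range(3), weights=probs)[0]
        w[human] += lr * (1.0 - sigmoid(w[human]))
        w[fake] -= lr * sigmoid(w[fake])
        # Generator step (REINFORCE): reward is D(sampled reply) minus a 0.5 baseline.
        action = rng.choices(range(3), weights=probs)[0]
        advantage = sigmoid(w[action]) - 0.5
        for j in range(3):
            grad_log_prob = (1.0 if j == action else 0.0) - probs[j]
            theta[j] += lr * advantage * grad_log_prob
    return softmax(theta)

for reply, p in zip(CANDIDATES, train()):
    print(f"{reply!r}: {p:.3f}")
```

Because only the generator ever produces the dull reply, the discriminator learns to call it fake, and the generator's probability of producing it shrinks over training.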

Adversarial Learning Improves Response Generation (vs a vanilla generation model)

Human Evaluator:

Adversarial Win   Adversarial Lose   Tie
62%               18%                20%

Machine Evaluator: Adversarial Success (How often can you fool a machine)

Adversarial Learning      8.0%
Standard Seq2Seq model    4.9%
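
A minimal sketch of how an adversarial-success rate could be computed, assuming the machine evaluator's decisions are available as a list of booleans. The function name and the example decisions are hypothetical and are not the evaluation data behind the numbers above.

```python
from typing import Sequence

def adversarial_success(fooled: Sequence[bool]) -> float:
    """Fraction of generated responses the machine evaluator judged to be human."""
    if not fooled:
        raise ValueError("need at least one evaluated response")
    return sum(fooled) / len(fooled)

# Hypothetical decisions for 8 generated responses (True = evaluator was fooled).
decisions = [True, False, False, False, True, False, False, False]
print(f"{adversarial_success(decisions):.1%}")  # 25.0%
```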

Future: Integrating dynamic knowledge graphs

Extract Entities, Relations and Events

[Figure: knowledge graph with nodes Barack Obama, Michelle Obama, David Ige, Hawaii, Honolulu, United States, Princeton and edge labels Born in, President, Mayor, Spouse, Alma Mater, Capitol]

A. Konovalov, B. Strauss, A. Ritter and B. O'Connor (WWW 2017) “Learning to Extract Events from Knowledge Base Revisions”

Takeaways

- MT → Dialogue
- Open-Domain Dialogue
- Learning from Delayed-Reward
- Adversarial Learning for Dialogue

Thank You!