from dialogue systems to social chatbots: … · from dialogue systems to social chatbots:...

37
From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissText keynote, 9 th June 2017 Verena Rieser, Professor in Computer Science Heriot-Watt University, Edinburgh, UK.

Upload: dangphuc

Post on 02-Aug-2018

229 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

FromDialogueSystemstoSocialChatbots:ReinforcementLearning,Seq2Seq,andbackagain.

SwissText keynote,9th June2017

VerenaRieser,ProfessorinComputerScienceHeriot-WattUniversity,Edinburgh,UK.

Page 2: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,
Page 3: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

NLPGroup@HWU

Dr SimonKeizer,PDRA

Dr JekaterinaNovikova,PDRA Dr Ondrej

Dusek,PDRADr XingkunLiu,PDRA

AmandaCurry,PhDstudent

Xinnuo Xu,PhDstudent

NLPgroup@HWU

ProfVerenaRieser,Director

Page 4: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ProjectsIamworkingon:

• StatisticalNaturalLanguageGeneration(EPSRCDILIGENTproject)• TransferLearningforDialogueSystems(EPSRCMADRIGALproject)• AutomaticQualityEstimationforOutputGeneration(NVIDIA)• PersonalizedHumanRobotInteraction(withEmoTech LTD)• AmazonAlexaChallenge(Amazon)• SentimentAnalysisforArabic(SemEval’16winner)

CurrentResearchProjects

Page 5: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

TalkingMachines

Page 6: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ThenewBotsarecoming….“Botsarethenewapps''

becausethey”fundamentallyrevolutionizehowcomputingisexperiencedbyeverybody.”

Microsoft’sCEONardella

Page 7: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

Source:MICJan2015.

Marketforecast

Page 8: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

Overview

• Task-drivenStatisticalDialogueSystems(SDS)• ReinforcementLearningandStateTracking

• SocialChatbots• Seq2Seqmodels• DeepRL?

• Futurechallenges• Evaluation?• Data?• Combiningtask-drivenandsocialsystems?

Overview

Page 9: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiriSDSArchitecture

e.g.Rieser&Lemon,Comp.Ling.2011,ACL’10,’08,’06

e.g.Rieseretal.,ACL’05,’09,’10,’16EMNLP’12,’15,EACL’09,’14

e.g.Boidin &Rieser,Interspeech’09

Page 10: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

Anexampleofaframe

“ShowmemorningflightsfromEdinburghtoLondononTuesday.”

SHOW:FLIGHTS:

ORIGIN:CITY: EdinburghDATE:TuesdayTIME:?

DEST:CITY:LondonDATE:?TIME:?

TaskrepresentationandNLU

Page 11: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

DialogueEngineering:FSAwithVoiceXML etc.

Page 12: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

“Aspokendialoguesystemisacomputeragentthatinteractswithhumansbyunderstandingandproducing

spokenlanguageinacoherentway.”[Rieser&Lemon,Springer2011]

StatisticalDialogueSystems(SDS)

• Planning• Adaptation• Robustness

Data-drivenMachineLearningmethods

Page 13: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiri

Twomainresearchareas:1. BeliefMonitoringusingPartiallyObservableMarkov

DecisionProcesses(POMDPs),e.g.[Williams&Young,2007].

2. ActionSelection/PolicyOptimisation usingReinforcementLearning,e.g.[Singhetal.,2002],[Rieser &Lemon,2008,2011]

StatisticalApproachestotask-baseddialogue

Page 14: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiriReinforcementLearning

Qπ (s,a) = Tss 'a

s '∑ [Rss 'a +γV π (s ')];

Bellmann optimalityequation(1952),see[SuttonandBarto,1998].

Page 15: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiriPolicyOptimisationforStochasticEnvironments:MarkovDecisionProcesses

S1t1“Three”

S1t2 mumble

p=0.5

p=0.5

+1

-1

St-1

S2t2

S2t1+1

Hangup -10

“Yes”p=0.8

p=0.2

a1t-1

“Howmanydoyouwant?”

a2t-1“Doyouwantthree?”

Trade-offproblem

Page 16: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

St-1

at-1

“Howmanydoyouwant?”

machineaction

oldbelief

Stnewbelief

“Threeplease”

otobserveddata

InferenceviaBayes Rule

BeliefMonitoringforPartiallyObservableEnvironments:POMDPs

Page 17: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiriAfullystatisticalsystem(2010)

Page 18: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

• Notenough(annotated)data• Traininsimulation(Rieser &Lemon,ACL2006-2010)• Fasterconvergingalgorithms(Pietquin etal.,2010;Gasic etal.2010)• Domain-transferlearning(Williams,2013;Youngetal.2014)

• InterfacewithNLG.• Mismatchbetween“whattosay”and“howtosay”it.• Hierarchicallearning(Rieser &Lemon,2010;Dethlefs etal.2011)• End-to-endneuralarchitecture(Wenetal.2016)

Challenges

Page 19: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

Overview

• Task-drivenStatisticalDialogueSystems(SDS)• ReinforcementLearningandStateTracking

• SocialChatbots• Seq2Seqmodels• DeepRL?

• Futurechallenges• Evaluation?• Data?

Overview

Page 20: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiriAmazonAlexaChallenge:HWUTeam

Page 21: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiriTuringTest:“Exhibitintelligentbehaviour equivalentto,orindistinguishablefrom,thatofahuman.”(AlanTuring1950)

AmazonAlexaChallenge:“Conversecoherentlyandengaginglywithhumansoverpopulartopicsandeventsfor20minutes.”(Amazon2016-2017)

ChatBots

Page 22: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

AIvs.AI:CleverBot (Carpenter2011)

22

Page 23: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

CleverbotRolloCarpenter(2011)

CleverBot (Carpenter 2011):• n-gram models of question-answer pairs P(xi|xi-(n1),…xi-1) • Trained on BIG data.

Howfarcanyougowithbigdata?

Page 24: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

CleverbotRolloCarpenter(2011)• End-to-end systems.• Learnfrom“raw”dialoguedata(e.g.OpenSubtitles).• Nosemanticorpragmaticannotationrequired.

ResponseGenerationSystems

Input-outputmapping

text-based

Page 25: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiri• InformationRetrieval

• Cleverbot,Xiaoice,Tay etc.• Banchs &Li.,2012,Yuetal.2016:TickTock system.

•StatisticalMachineTranslation• Ritteretal.2011:Data-DrivenResponseGenerationinSocialMedia.

•DeepNeuralNets• Shangetal.,2015;Vinyals &Le,2015;Sordoni etal.,2015.

End-to-EndArchitectures

Page 26: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

Seq2seqmodelarchitectureSeq2SeqModelsforEnd-to-EndSystems

Page 27: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiri

• Trainon:1. Moviedialoguedatabases,e.g.OpenSubtitles.com

(Banchs &Li,2012)2. Twitterconversations(Ritteretal.,2011)3. Reddit(Al-Rfou etal.,2016)4. Ubuntucorpus(Loweetal.,SigDial 2015)

NeuralLearningfromLargeDataSets

Page 28: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

Sampleoutputfromaneuralnetchatbot

# Context c Message m Generated Response r

1 noone can help this time its just i need a support system well im here to support you. whenever youneed me.

thank you, i appreciate it thank you

2 nothin much, and how’s the book?! its good but i’m only like halfway throughcuz i don’t feel like reading. i’m so bored ...

that’s good! i have the book but i’mbored too.

3 i know! what have you been up to? nothing really. just enjoying summer andworking. what about you?

that’s good! nothing much justlistening to music and you?

4 thanks, your way better though i’m just good at expressing it that’s but trustme you are freakin amazing and waay better

i know, i don’t trust you, i love youtoo! thank you

5 thanks. yeah i am good thanks! good, but angry! sorry you’re angry, been there, i hope it diesdown.

i hope so too, i’m not angry!!

6 figures. bass. never ate bass. bass is extremely mild with very few bones.my favorite after catfish

that’s one of the bones, chuck bass ismine too.

7 today i will walk for miles. clear sky clear mind. love walking. it’s too hot to walk this month.i haven’t moved. frozen. heat freeze.

it’s too hot for you to get up to honeyif you have frozen yogurt to you.

8 arresting people over “nasty tweets”? please. i’m notcondoning what the lad did but arresting him is a tadsilly.

it appears he may have sent death threats toothers while all this was kicking off too.

i did! he was armed with nuclearthreats? that’s what’s happening toyou.

Table 5: Sample responses produced by the MT-based DCGM-II+CMM system.

information that conflicts either internally within theresponse itself, or is at odds with the context, as in ex-amples 4-5. This is not unsurprising, since our modellacks mechanisms both for reflecting agent intent inthe response and for maintaining consistency withrespect to sentiment polarity. Longer context andmessage components may also result in responsesthat wander off-topic or lapse into incoherence as in6-8, especially when relatively low frequency uni-grams (“bass”, “threat”) are echoed in the response.In general, we expect that larger datasets and incorpo-ration of more extensive contexts into the model willhelp yield more coherent results in these cases. Con-sistent representation of agent intent is outside thescope of this work, but will likely remain a significantchallenge.

7 ConclusionWe have formulated a neural network architecturefor data-driven response generation trained from so-cial media conversations, in which generation ofresponses is conditioned on past dialog utterancesthat provide contextual information. We have pro-posed a novel multi-reference extraction techniqueallowing for robust automated evaluation using stan-dard SMT metrics such as BLEU and METEOR.Our context-sensitive models consistently outper-form both context-independent and context-sensitivebaselines by up to 11% relative improvement in

BLEU in the MT setting and 24% in the IR setting, al-beit using a minimal number of features. As our mod-els are completely data-driven and self-contained,they hold the potential to improve fluency and con-textual relevance in other types of dialog systems.

Our work suggests several directions for futureresearch. We anticipate that there is much room forimprovement if we employ more complex neural net-work models that take into account word order withinthe message and context utterances. Direct genera-tion from neural network models is an interesting andpotentially promising next step. Future progress inthis area will also greatly benefit from thorough studyof automated evaluation metrics.

Acknowledgments

We thank Alan Ritter, Ray Mooney, Chris Quirk,Lucy Vanderwende, Susan Hendrich and MouniReddy for helpful discussions, as well as the threeanonymous reviewers for their comments.

ReferencesMichael Auli, Michel Galley, Chris Quirk, and Geoffrey

Zweig. 2013. Joint language and translation modelingwith recurrent neural networks. In Proc. of EMNLP,pages 1044–1054.

Satanjeev Banerjee and Alon Lavie. 2005. METEOR:An automatic metric for MT evaluation with improved

204

Sordoni A,GalleyM,Auli M,BrockettC,JiY,MitchellM,Nie JY,GaoJ,DolanB.Aneuralnetworkapproachtocontext-sensitivegenerationofconversationalresponses.NAACL2015

trainedon127MTwittercontext-message-responsetriples

SampleOutputfromaNeuralNetchatbot

Page 29: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

DeepRLProblemswithstandardSeq2Seq

Jiwei Li,MichelGalley,ChrisBrockett,Jianfeng Gao,andBillDolan.2016.ADiversity-PromotingObjectiveFunctionforNeuralConversationModels.

Page 30: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

DeepRLDeepReinforcementLearning(Lietal.,2016)

Jiwei Li,WillMonroe,AlanRitter,MichelGalley,Jianfeng GaoandDanJurafsky:DeepReinforcementLearningforDialogueGeneration.

Page 31: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

DeepRLDeepReinforcementLearning(Lietal.,2016)

Jiwei Li,WillMonroe,AlanRitter,MichelGalley,Jianfeng GaoandDanJurafsky:DeepReinforcementLearningforDialogueGeneration.

Page 32: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

DeepRLRewardmodelling(Lietal.,2016)

Jiwei Li,WillMonroe,AlanRitter,MichelGalley,Jianfeng GaoandDanJurafsky:DeepReinforcementLearningforDialogueGeneration.

Reward=0.25EaseOfAnswering+0.25InformationFlow+0.5SemanticCoherence;

Page 33: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

Overview

• Task-drivenStatisticalDialogueSystems(SDS)• ReinforcementLearningandStateTracking

• SocialChatbots• Seq2Seqmodels• DeepRL?

• Futurechallenges• Evaluation?• Data?• Combiningtask-drivenandsocialsystems?

Overview

Page 34: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

• Noclearindicationof”success”.• Currentlyevaluatedturn-level:

• E.g.BLEU,METEOR,etc.• Lowcorrelationwithhumanscores(Lui etal.2016)(Novikova &Rieser,2017)

• Currentresearch:• Turn-level:Reference-lessqualityestimation(Dusek &Rieser,2017)• System-level:Estimatecustomerratings(Curry&Rieser,2017)

EvaluationforSocialDialogue(Curry&Rieser,2017)

Page 35: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiriPitfallsofData

Page 36: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

ShortcomingsofSiri

• Task-basedSDS:• ReinforcementLearningwith(PO)MDPs• RelyonDialogueActstomeasureprogresstowardsagoal.

• ResponseGenerationSystems/ChatBotsystems:• End-to-endsystems,distributionalsemantics• ChatBots aimfor“engagingstrategies”

• Challenges:• Qualitycontrol,evaluation.• Cleandatasets.• Integratingtask-basedsystemsandchatbots.

Summary:Data-drivenDialogueSystem

Page 37: From Dialogue Systems to Social Chatbots: … · From Dialogue Systems to Social Chatbots: Reinforcement Learning, Seq2Seq, and back again. SwissTextkeynote, 9thJune 2017 Verena Rieser,

Thanksforlistening!

Comingup:End-to-EndSharedChallengeforNLGhttp://www.macs.hw.ac.uk/InteractionLab/E2E/