Layla El Asri, Research Scientist, Maluuba
Post on 05-Apr-2017
A Microsoft company
Teaching AI to Make Decisions and Communicate
Layla El Asri, Research Manager
with slides by Paul Gray, Harm Van Seijen, and Adam Trischler
Maluuba's Vision: Solving AGI by Creating Literate Machines
Machine Reading Comprehension
Teaching artificial agents to read and understand natural language
Advanced Conversational Systems
Building knowledgeable systems that can exchange information with users to help users accomplish tasks or gain knowledge
Reinforcement Learning
Fundamental research in scalability of Reinforcement Learning to allow machines to perform complex tasks in the real world
Maluuba, a Microsoft company
I'm Layla from Maluuba. Our vision is to solve artificial general intelligence by creating machines that can read, think and communicate like humans. We started in 2011 and operate a deep learning and reinforcement learning lab in Montreal. In January, Maluuba was acquired by Microsoft.
Our work focuses on three areas: MRC / Dialogue / RL (quick intro for each)
Teaching AI to Make Decisions and Communicate
Expectations of AI
Learning to Learn
Learning to Perceive
Learning to Communicate
Expectations of AI
When is my appointment with Marc?
You have a meeting with Marc Villeneuve tomorrow at 10am.
Ok, where is it again?
At Starbucks on Maisonneuve and Montagne so you should leave the office at 9:40.
Ok, is it the same Starbucks where I met Harry last week?
Yes
I see. Do you know what Marc's been up to lately?
Yes, there was an article on MIT Tech Review yesterday. His company will start commercializing affordable 3D printers.
Nice, thanks
Learning to Communicate
Learning to Learn
Learning to Perceive
Learning to Learn
Human beings decompose tasks into subtasks in an efficient way.
Subtasks are achieved without conscious awareness.
Learning to Learn: Separation of Concerns
Separation between performance metric and learning objective.
Each agent has its own learning objective.
The goal is to find a reasonable policy efficiently.
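To make the idea of several agents with separate learning objectives more concrete, here is a minimal sketch of how their outputs might be combined into one action. The slide does not say how the agents are aggregated; summing per-agent action values is an assumption made for illustration, and `aggregate_action` and the example values are hypothetical.

```python
from typing import Dict, List

Action = str
QValues = Dict[Action, float]


def aggregate_action(agent_q_values: List[QValues]) -> Action:
    """Pick the action with the highest value summed over all agents.

    Assumption: agents are combined by summing their action values.
    """
    totals: Dict[Action, float] = {}
    for q_values in agent_q_values:
        for action, value in q_values.items():
            totals[action] = totals.get(action, 0.0) + value
    return max(totals, key=lambda action: totals[action])


# One agent is attracted to a fruit on the left, another avoids a ghost there.
fruit_agent = {"left": 0.9, "right": 0.1}
ghost_agent = {"left": -5.0, "right": 0.0}
print(aggregate_action([fruit_agent, ghost_agent]))
```

Each agent only ever scores actions against its own objective; the trade-off between objectives happens in the aggregation step.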
Example of Application
Collecting the fruits
Goal: Get all fruits as quickly as possible
Reward: +1 if all fruits are eaten, 0 otherwise
Number of fruits: n
State space: 100 x 100^n = 10^(2n+2)
NP-complete problem
Using one agent per fruit
State space reduced to n x 100
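The scaling argument on this slide can be checked with a few lines of arithmetic. The sketch below assumes the slide's figures mean 100 agent positions and 100 possible positions for each of the n fruits; the function names are hypothetical.

```python
def joint_state_count(n_fruits: int, cells: int = 100) -> int:
    """States for a single agent: agent position times every fruit's position."""
    return cells * cells ** n_fruits


def decomposed_state_count(n_fruits: int, cells: int = 100) -> int:
    """Total states over n agents with `cells` states each (the slide's n x 100)."""
    return n_fruits * cells


for n in (1, 5, 10):
    print(n, joint_state_count(n), decomposed_state_count(n))
```

The joint count grows exponentially in n, while the decomposed count grows linearly.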
Pac-Boy
Reward: +1 for eating a fruit, -10 for each collision with a ghost
The episode ends after all fruits are eaten or after 300 time steps.
State space: approximately 10^28 states
Configuration
1 agent per fruit
1 agent per ghost
75 fruit agents with 76 states
2 ghost agents with 76x76 states
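As a quick worked example, the total number of states across all agents in this configuration can be added up directly from the figures on the slide. The constant and function names below are hypothetical.

```python
FRUIT_AGENTS = 75
FRUIT_AGENT_STATES = 76
GHOST_AGENTS = 2
GHOST_AGENT_STATES = 76 * 76


def total_agent_states() -> int:
    """Sum of the state counts of every fruit agent and ghost agent."""
    return FRUIT_AGENTS * FRUIT_AGENT_STATES + GHOST_AGENTS * GHOST_AGENT_STATES


print(total_agent_states())  # 17252
```

That is roughly 1.7 x 10^4 states in total across the 77 agents.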
Demo: DQN and SoC
Results
Learning to Perceive
For living creatures, perception is adapted to task achievement
First living creatures: ability to react
Evolution: ability to foresee
Challenge: correlate sensory inputs with events
Modern human beings: ability to focus
Learning to Perceive: Information Gathering
Guessing Game tasks that progress in difficulty
Battleship: sink the enemy's ships quickly
Hangman: guess the phrase quickly
Blockworld
We developed a model that achieves super-human performance on these tasks.
Blockworld
Environment
Observations
Model's World Belief
Peeking Policy
Model's Answer Belief
Is the red sphere above the red cross?
Information Gathering Model
Learning to Communicate
Language is the most precise communication tool that we have but it is still very imprecise
Easier to give orders and strictly define the meaning of words
How to Build a Goal-Driven Dialogue System?
User: I want to go to Rio.
Natural Language Understanding (NLU): Inform(city = Rio)
State tracker: city = Rio, budget = $2000
Database: hotel = Hilton, price = $1950
State after database lookup: city = Rio, budget = $2000, hotel = Hilton, price = $1950
Dialogue Management (DM): Offer(name = Hilton, price = $1950)
Natural Language Generation (NLG): You can book the Hilton for $1950.
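The pipeline on this slide can be sketched as a chain of small functions. This is a toy illustration under stated assumptions, not Maluuba's system: the regex-based NLU, the one-entry `HOTELS` table, and all function names are hypothetical, and the budget is assumed to be already present in the dialogue state.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical toy database; the single entry mirrors the slide's example.
HOTELS = [{"city": "Rio", "name": "Hilton", "price": 1950}]


def nlu(utterance: str) -> Dict[str, str]:
    """Map text to a dialogue act; only handles 'go to <city>'."""
    match = re.search(r"go to (\w+)", utterance)
    if match:
        return {"act": "inform", "city": match.group(1)}
    return {"act": "unknown"}


def track_state(state: dict, act: Dict[str, str]) -> dict:
    """Merge the slots of the latest user act into the dialogue state."""
    updated = dict(state)
    updated.update({k: v for k, v in act.items() if k != "act"})
    return updated


def manage_dialogue(state: dict) -> Optional[dict]:
    """Query the database and choose the next system act."""
    for hotel in HOTELS:
        in_city = hotel["city"] == state.get("city")
        in_budget = hotel["price"] <= state.get("budget", 0)
        if in_city and in_budget:
            return {"act": "offer", "name": hotel["name"], "price": hotel["price"]}
    return None


def nlg(act: Optional[dict]) -> str:
    """Turn the system act into text."""
    if act is None:
        return "Sorry, I could not find a matching hotel."
    return f"You can book the {act['name']} for ${act['price']}."


def respond(utterance: str, state: dict) -> Tuple[str, dict]:
    new_state = track_state(state, nlu(utterance))
    return nlg(manage_dialogue(new_state)), new_state


reply, state = respond("I want to go to Rio.", {"budget": 2000})
print(reply)  # You can book the Hilton for $1950.
```

Keeping each stage as a separate function mirrors the modular layout on the slide, where every component can be replaced independently.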
Going One Step Further: Modelling Memory
Frames Dataset Overview
15 Turns per Dialogue
268 Hotels
109 Cities
19,986 Turns
1,369 Dialogues
Frame Tracking
[Figure: State Tracking vs. Frame Tracking on the user turn "And how much if I were to go to Columbus?". Frames shown: "Curitiba, August 15th"; "Curitiba, August 15th to August 26th, 4 stars, $2877.68"; "Columbus, August 15th, Request(price)".]
Frame Tracking Model
Input: the NLU labels, the list of frames, the previous active frame, and the user utterance
Output: the current active frame and the frames referred to by the dialogue acts
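To illustrate the input/output contract only, here is a toy rule-based tracker. It is not the model from the talk: the rule (open or switch frames when the user contradicts a slot of the active frame) is an assumption, the raw user utterance is ignored, and `track_frame` and the slot names are hypothetical.

```python
from typing import Dict, List, Tuple

Frame = Dict[str, str]


def track_frame(act_slots: Frame, frames: List[Frame], active: int) -> Tuple[int, List[Frame]]:
    """Return the index of the new active frame and the updated frame list."""
    current = frames[active]
    conflict = any(k in current and current[k] != v for k, v in act_slots.items())
    if not conflict:
        # No contradiction: fill the new slots into the active frame.
        updated = frames[:active] + [{**current, **act_slots}] + frames[active + 1:]
        return active, updated
    # Contradiction: the candidate frame inherits the unchanged slots.
    candidate = {**current, **act_slots}
    for index, frame in enumerate(frames):
        if frame == candidate:
            return index, frames  # switch back to an existing frame
    return len(frames), frames + [candidate]  # otherwise open a new frame


frames = [{"city": "Curitiba", "start_date": "August 15th"}]
active, frames = track_frame({"city": "Columbus"}, frames, active=0)
print(active, frames[active])
```

In the Columbus example from the previous slide, the Curitiba frame is kept in the list, so a later mention of Curitiba switches back to it instead of creating a duplicate.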
Model
Thank you!
Papers discussed
Improving Scalability of Reinforcement Learning by Separation of Concerns
Towards Information-Seeking Agents
Frames: A Corpus For Adding Memory To Goal-Oriented Dialogue Systems
We're hiring!
Research Scientists
Research Engineers
Developers
Product/Program Managers
www.maluuba.com/careers