


Carnegie Mellon University THE ROBOTICS INSTITUTE

Thesis Proposal

Zita Marinho

Monday, October 26, 2015, GHC 4405, 10:00 a.m.

Siddhartha Srinivasa, Co-chair

Geoffrey Gordon, Co-chair

Matt Mason

André Martins, Priberam Labs

Thesis Committee

Moment-based Algorithms for Structured Prediction

Abstract

Latent variable models constitute a compact representation for complex, high dimensional data. This is useful in many applications in Robotics and Natural Language Processing. Latent models are however very hard to learn, in part because of the non-convexity of the likelihood function. Maximum likelihood learning is statistically consistent but leads to intractable optimization problems, and local methods such as the Expectation Maximization algorithm present a tractable solution, but provide only local convergence guarantees.

We focus on an alternative school of thought, the so-called Method of Moments (MoM), whose goal is to find model parameters that are in agreement with certain statistical moments of the data. These methods yield (asymptotically) statistically optimal solutions that can be computed efficiently.
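
As a toy illustration of moment matching, the sketch below fits a Gamma distribution by equating its theoretical mean and variance to the empirical ones. The Gamma model, the function name `gamma_method_of_moments`, and the parameter values are assumptions chosen for illustration; they are not taken from the proposal, and the model has no latent variables.

```python
import random
import statistics


def gamma_method_of_moments(samples):
    """Estimate Gamma(shape, scale) by matching the first two moments.

    For a Gamma distribution, mean = shape * scale and
    variance = shape * scale**2, so both parameters follow in closed form.
    """
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples, mu=mean)
    scale = var / mean
    shape = mean / scale
    return shape, scale


# Hypothetical ground truth used only to generate synthetic data.
rng = random.Random(0)
true_shape, true_scale = 3.0, 2.0
samples = [rng.gammavariate(true_shape, true_scale) for _ in range(100_000)]

shape_hat, scale_hat = gamma_method_of_moments(samples)
print(f"shape ~ {shape_hat:.2f}, scale ~ {scale_hat:.2f}")
```

No iterative optimization is involved: the estimate comes from solving two moment equations, and it approaches the true parameters as the sample grows.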

Some of these methods rely at their core on a spectral decomposition of observed statistics. These are known as spectral methods. Other approaches involve finding observations that unambiguously identify hidden states. These are known as anchor learning methods.
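
Spectral methods depend on observed statistics having a low-rank structure governed by the hidden states. The sketch below is a minimal illustration under stated assumptions: a hypothetical hidden Markov model with 2 hidden states and 3 observation symbols (all numbers invented), for which the exact matrix of pairwise observation probabilities has rank 2. It computes the rank by exact elimination rather than by the singular value decomposition a real spectral learner would apply to empirical estimates.

```python
from fractions import Fraction as F

# Hypothetical toy HMM: 2 hidden states, 3 observation symbols.
pi = [F(3, 5), F(2, 5)]                          # initial state distribution
T = [[F(7, 10), F(3, 10)], [F(1, 5), F(4, 5)]]   # T[i][j] = P(h2=j | h1=i)
O = [[F(1, 2), F(3, 10), F(1, 5)],               # O[i][x] = P(x | h=i)
     [F(1, 10), F(3, 10), F(3, 5)]]


def pair_probabilities(pi, T, O):
    """Return P[x1][x2] = P(first symbol = x1, second symbol = x2)."""
    k, n = len(pi), len(O[0])
    return [[sum(pi[i] * T[i][j] * O[i][x1] * O[j][x2]
                 for i in range(k) for j in range(k))
             for x2 in range(n)]
            for x1 in range(n)]


def rank(matrix):
    """Exact matrix rank via Gaussian elimination over fractions."""
    rows = [row[:] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


P = pair_probabilities(pi, T, O)
print(rank(P))  # prints 2: a 3x3 matrix whose rank equals the number of hidden states
```

With finite data the empirical matrix is only approximately low rank, which is why a truncated decomposition is used in practice.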

Moment-based algorithms are faster to compute than likelihood methods, but usually require more data to build good empirical estimates. In many applications, unsupervised data is inexpensive to obtain, which makes moment-based algorithms a preferable choice in unsupervised settings, or even when labeled data is scarce.

In this work we propose different forms of learning hidden state models using moment matching techniques, with large quantities of unsupervised data and little supervised data. We intend to design algorithms that are capable of handling large amounts of data in weakly supervised settings, that are flexible, and that allow the inclusion of model constraints.

In this thesis, we focus on moment-based learning for structured prediction tasks. We propose to use these spectral techniques to learn controllable Predictive State Representations (PSRs) and apply them to two different domains: robotic manipulation tasks and a transition-based parser. We use anchor-based methods in two sequence labeling tasks: semi-supervised part-of-speech tagging, where we learn from a small, annotated Twitter corpus, and weakly supervised named entity recognition for different languages.

Shay Cohen, University of Edinburgh

João Paulo Costeira, Instituto Superior Técnico