Carnegie Mellon University
THE ROBOTICS INSTITUTE
Thesis Proposal
Zita Marinho
Monday, October 26, 2015, GHC 4405, 10:00 a.m.
Siddhartha Srinivasa, Co-chair
Geoffrey Gordon, Co-chair
Matt Mason
André Martins, Priberam Labs
Thesis Committee
Moment-based Algorithms for Structured Prediction
Abstract
Latent variable models constitute a compact representation for complex, high-dimensional data. This is useful in many applications in Robotics and Natural Language Processing. Latent models are, however, very hard to learn, in part because of the non-convexity of the likelihood function. Maximum likelihood learning is statistically consistent but leads to intractable optimization problems, and local methods such as the Expectation Maximization algorithm present a tractable solution but provide only local convergence guarantees.
We focus on an alternative school of thought, the so-called Method of Moments (MoM), whose goal is to find model parameters that are in agreement with certain statistical moments of the data. These methods yield (asymptotically) statistically optimal solutions that can be computed efficiently.
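As a toy illustration of moment matching (not one of the models proposed in this thesis), the sketch below fits a Gamma distribution by equating its first two moments with their empirical counterparts. The function name `fit_gamma_by_moments`, the true parameters, and the sample size are assumptions chosen only for the example.

```python
import random
import statistics


def fit_gamma_by_moments(samples):
    """Estimate Gamma(shape, scale) by matching the mean and variance.

    For a Gamma distribution, mean = shape * scale and
    variance = shape * scale**2, so both parameters follow in closed form.
    """
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples, mu=mean)
    scale = var / mean
    shape = mean / scale
    return shape, scale


# Hypothetical data: 20,000 draws from Gamma(shape=2, scale=3).
rng = random.Random(0)
data = [rng.gammavariate(2.0, 3.0) for _ in range(20000)]

shape_hat, scale_hat = fit_gamma_by_moments(data)
print(f"shape ~ {shape_hat:.2f}, scale ~ {scale_hat:.2f}")
```

No iterative optimization is involved: the estimates come directly from two sample statistics, which is the property that makes moment-based estimators cheap to compute.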
Some of these methods rely at their core on a spectral decomposition of observed statistics. These are known as spectral methods. Other approaches involve finding observations that unambiguously identify hidden states. These are known as anchor learning methods.
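The anchor idea can be sketched on a toy two-state model. This is a minimal illustration under stated assumptions, not the algorithm developed in the thesis: every probability below is made up, the "context" is assumed to be a second observation that is conditionally independent of the word given the hidden state, and exact population statistics stand in for empirical estimates. Words 0 and 1 are anchors, each emitted by only one state, so the context distribution of any other word is a mixture of the two anchor context distributions, and the mixing weight is the posterior over hidden states.

```python
# Toy model; all numbers are hypothetical.
PRIOR = [0.6, 0.4]        # P(h) for hidden states h = 0, 1
EMIT = [[0.5, 0.0],       # P(x = w | h); word 0 is an anchor for state 0
        [0.0, 0.3],       # word 1 is an anchor for state 1
        [0.3, 0.2],       # word 2 is ambiguous
        [0.2, 0.5]]       # word 3 is ambiguous
CONTEXT = [[0.7, 0.1],    # P(y | h); rows index the context symbol y
           [0.2, 0.3],
           [0.1, 0.6]]


def context_given_word(w):
    """P(y | x = w), computed from the observable joint P(x, y)."""
    joint = [sum(PRIOR[h] * EMIT[w][h] * CONTEXT[y][h] for h in range(2))
             for y in range(len(CONTEXT))]
    total = sum(joint)
    return [p / total for p in joint]


def state_posterior(w, anchor0=0, anchor1=1):
    """Recover P(h = 0 | x = w) without looking at the hidden states.

    Writes P(y | w) = c * q0 + (1 - c) * q1, where q0 and q1 are the
    anchor context distributions, and solves for c by projection.
    """
    q0 = context_given_word(anchor0)
    q1 = context_given_word(anchor1)
    p = context_given_word(w)
    diff = [a - b for a, b in zip(q0, q1)]
    num = sum((pv - b) * d for pv, b, d in zip(p, q1, diff))
    den = sum(d * d for d in diff)
    return num / den


for word in range(len(EMIT)):
    print(word, round(state_posterior(word), 4))
```

With more than two states the projection becomes a small constrained least-squares problem over all anchor rows, but the principle is the same: only observable co-occurrence statistics are used.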
Compared with likelihood methods, moment-based algorithms are faster to compute but usually require more data to build good empirical estimates. In many applications, unsupervised data is inexpensive to obtain, which makes moment-based algorithms a preferable choice in unsupervised settings, or even when labeled data is scarce.
In this work we propose different forms of learning hidden state models using moment matching techniques, with large quantities of unsupervised data and little supervised data. We intend to design algorithms that are capable of handling large amounts of data in weakly supervised settings, that are flexible, and that allow the inclusion of model constraints.
In this thesis, we focus on moment-based learning for structured prediction tasks. We propose to use these spectral techniques to learn controllable Predictive State Representations (PSRs) and apply them to two different domains: robotic manipulation tasks and a transition-based parser. We use anchor-based methods in two sequence labeling tasks: semi-supervised part-of-speech tagging, where we learn from a small annotated Twitter corpus, and a weakly supervised named entity recognition system for different languages.
Shay Cohen, University of Edinburgh
João Paulo Costeira, Instituto Superior Técnico