Post on 21-Dec-2015
Generation
Aims of this talk
- Discuss MRS and LKB generation
- Describe a larger research programme: modular generation
- Mention some interactions with other work in progress: RMRS, the SEM-I

Outline of talk
- Towards modular generation
- Why MRS?
- MRS and chart generation
- Data-driven techniques
- SEM-I and documentation
Modular architecture
- Language-independent component
- Meaning representation
- Language-dependent realization
- String or speech output
Desiderata for a portable realization module
- Application independent
- Any well-formed input should be accepted
- No grammar-specific/conventional information should be essential in the input
- Output should be idiomatic
Architecture (preview)
External LF → SEM-I → Internal LF → chart generator → String
(plus control modules and specialization modules)
Why MRS?
- Flat structures
  - Independence of syntax: conventional LFs partially mirror tree structure
  - Manipulation of individual components: can ignore scope structure etc.
- Lexicalised generation: composition by accumulation of EPs
- Robust composition
- Underspecification
An excursion: Robust MRS
- Deep Thought: integration of deep and shallow processing via compatible semantics
- All components construct RMRSs
- A principled way of building robustness into deep processing
- Requirements for consistency etc. help human users too
Extreme flattening of deep output

(Scope trees for the two readings of `every cat chases some dog' omitted.)

lb1:every_q(x), RSTR(lb1,h9), BODY(lb1,h6), lb2:cat_n(x), lb5:dog_n_1(y),
lb4:some_q(y), RSTR(lb4,h8), BODY(lb4,h7), lb3:chase_v(e), ARG1(lb3,x),
ARG2(lb3,y), h9 qeq lb2, h8 qeq lb5
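The factorized representation above can be pictured as plain data: each elementary predication, argument relation, and qeq constraint is a separate, minimal unit, so a shallow processor can emit only what it knows. A minimal Python sketch (the data layout is invented for illustration, not the DELPH-IN encoding):

```python
# Each unit of the RMRS above as a separate, minimal piece of data.
eps = [
    ("lb1", "every_q", "x"),
    ("lb2", "cat_n", "x"),
    ("lb3", "chase_v", "e"),
    ("lb4", "some_q", "y"),
    ("lb5", "dog_n_1", "y"),
]
args = [
    ("RSTR", "lb1", "h9"), ("BODY", "lb1", "h6"),
    ("RSTR", "lb4", "h8"), ("BODY", "lb4", "h7"),
    ("ARG1", "lb3", "x"), ("ARG2", "lb3", "y"),
]
qeqs = [("h9", "lb2"), ("h8", "lb5")]

def predicates(eps):
    """The bag of predicate names the generator will check coverage against."""
    return sorted(name for _, name, _ in eps)
```

Because the arguments are separated out, dropping any single tuple still leaves a well-formed (if less informative) representation.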
Extreme underspecification
- Factorize deep representation to minimal units
- Only represent what you know

Robust MRS:
- Separating relations
- Separate arguments
- Explicit equalities
- Conventions for predicate names and sense distinctions
- Hierarchy of sorts on variables
Chart generation with the LKB
1. Determine lexical signs from MRS
2. Determine possible rules contributing EPs (`construction semantics': compound rule etc.)
3. Instantiate signs (lexical and rule) according to variable equivalences
4. Apply lexical rules
5. Instantiate chart
6. Generate by parsing without string positions
7. Check output against input
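Step 6, generating by parsing without string positions, can be sketched by tracking EP coverage instead of string spans: two edges may combine only if the sets of input EPs they have consumed are disjoint, and an analysis is complete only if it covers every input EP. A hedged sketch (the grammar-rule compatibility check itself is elided):

```python
def combine(edge_a, edge_b):
    """Return the combined EP coverage, or None if the edges overlap."""
    if edge_a & edge_b:   # same input EP consumed twice: not allowed
        return None
    return edge_a | edge_b

def complete(edge, n_eps):
    """A root edge must consume all n_eps EPs in the input MRS."""
    return edge == set(range(n_eps))

# `every cat chases some dog' has 5 EPs; an NP edge covering EPs {0, 1}
# can combine with a VP edge covering {2, 3, 4} into a complete analysis.
s_edge = combine({0, 1}, {2, 3, 4})
```

This replaces the adjacency check of string-based chart parsing: coverage sets play the role that spans play in parsing.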
Lexical lookup for generation
- _like_v_1(e,x,y): return lexical entry for sense 1 of the verb like
- temp_loc_rel(e,x,y): returns multiple lexical entries
- Multiple relations in one lexical entry: e.g., who, where
- Entries with null semantics: heuristics
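Lexical lookup can be pictured as a map from predicate names to candidate lexical entries; a minimal sketch (the entry names below are invented for illustration):

```python
# Invented lexicon fragment: predicate name -> candidate lexical entries.
LEXICON = {
    "_like_v_1": ["like_v1"],
    "temp_loc_rel": ["on_p_temp", "in_p_temp", "at_p_temp"],
}

def lookup(pred):
    """Return candidate lexical entries for one EP's predicate, or fail."""
    entries = LEXICON.get(pred)
    if entries is None:
        raise KeyError(f"lexical lookup failure for {pred!r}")
    return entries
```

An underspecified predicate such as temp_loc_rel fans out into several candidate entries, and the chart later filters the ones that cannot combine.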
Instantiation of entries
- _like_v_1(e,x,y) & named(x,"Kim") & named(y,"Sandy")
- Find the locations corresponding to `x' in all FSs; replace all `x's with a constant; repeat for `y' etc.
- Also for rules contributing construction semantics
- `Skolemization' (a misleading name ...)
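The instantiation step can be sketched as replacing every variable with a distinct constant, so that two EPs share an argument only where the input MRS says they do. A toy sketch (the variable-naming convention assumed here is an illustration, not the LKB's):

```python
import re

def is_var(term):
    """Treat e/x/y/h plus an optional index as a variable (an assumption)."""
    return re.fullmatch(r"[exyh]\d*", term) is not None

def skolemize(eps):
    """Replace each variable with a distinct constant; leave constants alone."""
    return [(pred, tuple(f"sk_{t}" if is_var(t) else t for t in args))
            for pred, args in eps]
```

After this, unification can no longer accidentally identify two variables the input kept distinct, which is why the step is done before chart instantiation.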
Lexical rule application
- Lexical rules that contribute EPs are only used if the EP is in the input
- Inflectional rules will only apply if the variable has the correct sort
- Lexical rule application does morphological generation (e.g., liked, bought)
Chart generation proper
- Possible lexical signs added to a chart structure
- Currently no indexing of chart edges
  - Chart generation can use semantic indices, but current results suggest this doesn't help
- Rules applied as for chart parsing: edges checked for compatibility with the input semantics (bag of EPs)
Root conditions
- Complete structures must consume all the EPs in the input MRS
- Should check for compatibility of scopes
  - Precise qeq matching is (probably) too strict
  - Requiring exactly the same scopes is (probably) unrealistic and too slow
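The first root condition, consuming all the input EPs, amounts to a bag (multiset) comparison between the input semantics and the semantics of the candidate realization. A minimal sketch (the harder scope-compatibility check is not modelled):

```python
from collections import Counter

def consumes_all(input_eps, output_eps):
    """Root condition: the output bag of EPs equals the input bag exactly."""
    return Counter(input_eps) == Counter(output_eps)

INPUT = ["every_q", "cat_n", "chase_v", "some_q", "dog_n_1"]
```

A Counter (rather than a set) matters because the same predicate can legitimately occur twice in one input, e.g. two instances of a quantifier.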
Generation failures due to MRS issues
- Well-formedness check prior to input to the generator (optional)
- Lexical lookup failure: predicate doesn't match an entry, wrong arity, wrong variable types
- Unwanted instantiations of variables
- Missing EPs in the input: syntax (e.g., no noun), lexical selection
- Too many EPs in the input: e.g., two verbs and no coordination
Improving generation via corpus-based techniques
- CONTROL: e.g., intersective modifier order
  - Logical representation does not determine order: wet(x) & weather(x) & cold(x)
- UNDERSPECIFIED INPUT: e.g.,
  - Determiners: none/a/the
  - Prepositions: in/on/at
Constraining generation for idiomatic output
- Intersective modifier order: e.g., adjectives, prepositional phrases
- Logical representation does not determine order: wet(x) & weather(x) & cold(x)
Adjective ordering
- Constraints / preferences:
  - big red car / *red big car
  - cold wet weather / wet cold weather (OK, but dispreferred)
- Difficult to encode in a symbolic grammar
Corpus-derived adjective ordering
- N-grams perform poorly
- Thater: direct evidence plus clustering, positional probability
- Malouf (2000): memory-based learning plus positional probability, 92% on the BNC
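The positional-probability idea can be sketched with toy counts (the numbers below are invented for illustration, not BNC statistics): score each adjective by how often it occurred first in observed adjective pairs, and sort a candidate sequence by that score.

```python
# Invented counts: how often each adjective occurred first in a pair,
# out of all pairs it occurred in.
FIRST = {"big": 90, "red": 10, "cold": 70, "wet": 30}
TOTAL = {"big": 100, "red": 100, "cold": 100, "wet": 100}

def order_adjectives(adjs):
    """Sort adjectives by their positional probability of coming first."""
    return sorted(adjs, key=lambda a: FIRST[a] / TOTAL[a], reverse=True)
```

With these counts the sketch prefers `big red car' and `cold wet weather', matching the preferences on the previous slide.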
Underspecified input to generation
`We bought a car on Friday'
Accept:
  pron(x) & a_quant(y,h1,h2) & car(y) & buy(e_past,x,y) & on(e,z) & named(z,Friday)
and:
  pron(x) & general_q(y,h1,h2) & car(y) & buy(e_past,x,y) & temploc(e,z) & named(z,Friday)
And maybe:
  pron(x_1pl) & car(y) & buy(e_past,x,y) & temp_loc(e,z) & named(z,Friday)
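Accepting the more general general_q or temploc in place of a specific predicate presupposes a subsumption check over a predicate hierarchy. A minimal sketch, using an invented fragment of such a hierarchy (child → parent):

```python
# Invented hierarchy fragment: each predicate maps to its parent.
HIERARCHY = {"a_quant": "general_q", "the_quant": "general_q",
             "on": "temploc", "in": "temploc", "at": "temploc"}

def subsumes(general, specific):
    """True if `general` is `specific` itself or one of its ancestors."""
    while specific is not None:
        if specific == general:
            return True
        specific = HIERARCHY.get(specific)
    return False
```

During lexical lookup, an input EP with general_q would then license any lexical entry whose predicate it subsumes.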
Guess the determiner
- We went climbing in _ Andes
- _ president of _ United States
- I tore _ pyjamas
- I tore _ duvet
- George doesn't like _ vegetables
- We bought _ new car yesterday
Determining determiners
- Determiners are partly conventionalized, often predictable from local context
- Translation from Japanese etc.; speech prosthesis application
- More `meaning-rich' determiners assumed to be specified in the input
- Minnen et al.: 85% on WSJ (using TiMBL)
Preposition guessing
- Choice between temporal in/on/at:
  in the morning, in July; on Wednesday, on Wednesday morning; at three o'clock, at New Year
- The ERG uses hand-coded rules and lexical categories
- A machine learning approach gives very high precision and recall on WSJ, good results on a balanced corpus (Lin Mei, 2004, Cambridge MPhil thesis)
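The hand-coded in/on/at choice can be sketched as a few rules over lexical categories (the category labels below are invented for illustration, not the ERG's):

```python
def temporal_preposition(category):
    """Pick in/on/at from an (invented) lexical category of the time NP."""
    if category == "clock_time":              # three o'clock
        return "at"
    if category == "day":                     # Wednesday, Wednesday morning
        return "on"
    if category in ("month", "part_of_day"):  # July, the morning
        return "in"
    return "at"                               # holidays etc.: at New Year
```

Note that the head of the phrase decides: `Wednesday morning' takes on, like `Wednesday', not in like `the morning'.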
SEM-I: semantic interface
- Meta-level: manually specified `grammar' relations (constructions and closed-class)
- Object-level: linked to the lexical database for deep grammars
- Definitional: e.g., lemma + POS + sense
- Linked test suites, examples, documentation
SEM-I development
- SEM-I eventually forms the `API': stable, changes negotiated
- SEM-I vs the Verbmobil SEMDB
  - Technical limitations of SEMDB: too painful!
- `Munging' rules: external vs internal
- SEM-I development must be incremental
Role of SEM-I in architecture
- Offline
  - Definition of `correct' (R)MRS for developers
  - Documentation
  - Checking of test suites
- Online
  - In unifier/selector: reject invalid RMRSs
  - Patching up input to generation
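The online check against the SEM-I can be sketched as validating each EP against a declared arity (a fuller version would also check argument sorts); the declarations below are illustrative, not the ERG's:

```python
# Illustrative SEM-I fragment: predicate -> declared argument sorts.
SEM_I = {"_like_v_1": ("e", "x", "x"),   # event plus two individuals
         "named": ("x", "string")}

def validate(eps):
    """Collect errors for EPs whose predicate is unknown or arity is wrong."""
    errors = []
    for pred, args in eps:
        decl = SEM_I.get(pred)
        if decl is None:
            errors.append(f"unknown predicate {pred}")
        elif len(decl) != len(args):
            errors.append(f"wrong arity for {pred}")
    return errors
```

Rejecting ill-formed input before lexical lookup turns silent generation failures into diagnosable error messages.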
Goal: semi-automated documentation

(Diagram linking: the Lex DB and the [incr tsdb()] semantic test suite; the object-level SEM-I; the meta-level SEM-I; and the ERG documentation. Links marked: `semi-automatic', `auto-generate examples', `auto-generate appendix', `strings', and `examples, auto-generated on demand'.)
Robust generation
- SEM-I an important preliminary: check whether generator input is semantically compatible with the grammars
- Eventually: a hierarchy of relations outside the grammars, allowing underspecification
  - `Fill-in' of underspecified RMRS
  - Exploit work on determiner guessing etc.
Architecture (again)
External LF → SEM-I → Internal LF → chart generator → String
(plus control modules and specialization modules)
Interface
- External representation: public, documented, reasonably stable
- Internal representation: syntax/semantics interface, convenient for analysis
- External/internal conversion via the SEM-I
Guaranteed generation?
- Given a well-formed input MRS/RMRS, with elementary predications found in the SEM-I (and dependencies)
- Can we generate a string?
  - With input fix-up? Negotiation?
- Semantically bleached lexical items: which, one, piece, do, make
- Defective paradigms, negative polarity, anti-collocations etc.?
Next stages
- SEM-I development
- Documentation and test suite integration
- Generation from RMRSs produced by shallower parsers (or deep/shallow combinations)
- Partially fixed text in generation (cogeneration)
- Further statistical modules: e.g., locational prepositions, other modifiers
- More underspecification
- Gradually increase flexibility of the interface to generation