From Sequential Structure to Semantic Interpretation: More Connectionist Research on Language Processing PDP Class Lecture February 14, 2011

From Sequential Structure to Semantic Interpretation:

More Connectionist Research on Language Processing

PDP Class Lecture, February 14, 2011

The Simple Recurrent Network

• Network is trained on a stream of elements with sequential structure

• At step n, target for output is next element.

• Pattern on hidden units is copied back to the context units.

• After learning, the network retains information about preceding elements of the string, so its expectations can be conditioned on an indefinitely long window of prior context.
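
The bullets above can be sketched as a forward pass in plain Python. This is a minimal illustration, not Elman's implementation: the class name, the layer sizes, the initial context value of 0.5, and the softmax output layer are all assumptions made for the example, and no learning rule is included.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(values):
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


class SimpleRecurrentNetwork:
    """Forward pass only; names and sizes are hypothetical."""

    def __init__(self, n_symbols, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_symbols = n_symbols
        self.w_in = make_matrix(n_hidden, n_symbols, rng)
        self.w_ctx = make_matrix(n_hidden, n_hidden, rng)
        self.w_out = make_matrix(n_symbols, n_hidden, rng)
        self.context = [0.5] * n_hidden  # assumed starting context

    def step(self, symbol_index):
        """Read one element and return a distribution over the next element."""
        one_hot = [0.0] * self.n_symbols
        one_hot[symbol_index] = 1.0
        net = [a + b for a, b in zip(matvec(self.w_in, one_hot),
                                     matvec(self.w_ctx, self.context))]
        hidden = [1.0 / (1.0 + math.exp(-x)) for x in net]
        prediction = softmax(matvec(self.w_out, hidden))
        self.context = list(hidden)  # hidden pattern is copied back to the context units
        return prediction


def training_pairs(stream):
    """At step n the target is simply element n + 1 of the stream."""
    return list(zip(stream[:-1], stream[1:]))
```

Because the context units carry the previous hidden pattern, presenting the same element twice in a row generally yields two different predictions, which is the property the last bullet relies on.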

Learning about sentence structure from streams of words

Learned and imputed hidden-layer representations (average vectors over all contexts)

‘Zog’ representation derived by averaging vectors obtained by inserting the novel item in place of each occurrence of ‘man’.
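
The averaging step described above might look like the following sketch. It assumes hidden-layer vectors have already been recorded for each token occurrence; the function name and the toy two-unit vectors are made up for illustration and are not data from the lecture.

```python
from collections import defaultdict


def mean_hidden_by_word(records):
    """records: iterable of (word, hidden_vector) pairs, one per token occurrence.

    Returns one averaged hidden vector per word, pooled over all contexts.
    """
    grouped = defaultdict(list)
    for word, vector in records:
        grouped[word].append(vector)
    return {
        word: [sum(column) / len(vectors) for column in zip(*vectors)]
        for word, vectors in grouped.items()
    }


# Hypothetical recordings: the 'zog' rows stand for runs in which the novel
# item was substituted for 'man' in the same sentence contexts.
records = [
    ("man", [0.25, 0.75]),
    ("man", [0.75, 0.25]),
    ("zog", [0.5, 1.0]),
    ("zog", [0.0, 0.5]),
]
means = mean_hidden_by_word(records)
```

The imputed ‘zog’ vector can then be compared with the learned vectors for familiar words, for example in a cluster analysis.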

Elman (1991)

Prediction with an embedded clause

Components tracking constituents within clauses of different types.

Can we extend the approach to address comprehension?

Who did what to whom, etc.

Some factors in comprehension

• Sentence structure and constraints on events are both important:

– The boy chased the girl.
– The girl chased the boy.

– The car was parked by the attendant.
– The car was parked by the lamppost.

– We ate some food with some friends that we like.
– We found a painting in the gallery that was painted by Rembrandt.

– The horse raced past the barn…
– The horse dragged past the barn…
– The cart raced past the barn…

Alternative Approaches

• Parsing-based approaches:
– ‘Syntax proposes, semantics disposes’
  • Although data were collected that initially seemed to support this, further studies changed the picture (see next slide).
– Beam search and particle filtering
  • Keep several explicit alternatives active at a time; discard alternatives as they become implausible.

• The PDP approach:
– Use constituents of the sentence as they are encountered to directly construct a representation of the event described by the sentence.
– Keep a single distributed representation that implicitly represents a mixture of possibilities.
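
To make the contrast concrete, here is a minimal sketch of the beam-search idea, in which alternatives are kept explicitly and pruned. The `expand` callback, the toy two-word input, and the probabilities are hypothetical and only show how a narrow beam can discard the analysis that later turns out to be needed.

```python
import math


def beam_search(words, expand, beam_width=2):
    """Keep at most `beam_width` explicit analyses after each word.

    expand(analysis, word) must return a list of (new_analysis, probability).
    """
    beam = [((), 0.0)]  # (analysis so far, log probability)
    for word in words:
        candidates = []
        for analysis, log_p in beam:
            for new_analysis, prob in expand(analysis, word):
                if prob > 0.0:
                    candidates.append((new_analysis, log_p + math.log(prob)))
        candidates.sort(key=lambda item: item[1], reverse=True)
        beam = candidates[:beam_width]  # discard the less plausible alternatives
    return beam


def toy_expand(analysis, word):
    """Hypothetical grammar: reading 'A' is preferred, but only 'B' can continue."""
    if word == "ambiguous":
        return [(analysis + ("A",), 0.9), (analysis + ("B",), 0.1)]
    if word == "resolving" and analysis and analysis[-1] == "B":
        return [(analysis + ("B",), 1.0)]
    return []


narrow = beam_search(["ambiguous", "resolving"], toy_expand, beam_width=1)
wide = beam_search(["ambiguous", "resolving"], toy_expand, beam_width=2)
```

With a width of 1 the only surviving analysis cannot be extended, so `narrow` comes back empty; with a width of 2 the dispreferred reading survives. The PDP alternative replaces the explicit list with a single distributed state.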

A Syntactic Parsing Principle: ‘Minimal Attachment’

• The principle predicts that a prepositional phrase following a direct object will be treated as a constituent of the verb phrase of a sentence.

• This leads to the prediction that subjects will be slower to read the last word of (b) relative to (a) below:
(a) The spy shot the policeman with the revolver.
(b) The spy shot the policeman with the binoculars.

• Although this seemed to be true for the sentences used, the reverse is true for other sentences:
(a) The man read the article in the magazine.
(b) The man read the article in the bathtub.

What about the idea that everything depends on the verb?

– The spy saw the policeman with binoculars
– The spy saw the policeman with a revolver
– The bird saw the birdwatcher with binoculars
– The bird saw its prey with binoculars

– The children collected …
– The rain collected …

Additional Aspects of Sentence Comprehension

• Context helps us:
– Select the correct meaning of ambiguous words
  • The boy hit the ball with the bat.
– Fill in missing information
  • The boy spread the peanut butter on the bread.
– Shade and specify the ‘meaning’ of a particular word
  • The container held the apples.
  • The container held the coffee.
  • The boy kissed someone under the mistletoe.
  • The baby rolled the ball to her daddy.
  • The slugger hit the ball out of the park.

  • John loves Mary.
  • John loves ice cream.
  • The pope loves sinners.

  • The {writer/student/goat} finished the book.

The Role of Situation (Elman, 2009)

• The shopper saved…

• The lifeguard saved…

• There was a big sale at the swimshop. The lifeguard saved…

• … was skating … primes ‘arena’

• … had skated … does not

Do words have meanings, or are they clues to meaning?

• For a first approximation, the lexicon is the store of words in long-term memory from which the grammar constructs phrases and sentences.

• [A lexical entry] lists a small chunk of phonology, a small chunk of syntax, and a small chunk of semantics.

– Ray Jackendoff

• My approach suggests that comprehension, like perception, should be likened to Hebb's (1949) paleontologist, who uses his beliefs and knowledge about dinosaurs in conjunction with the clues provided by the bone fragments available to construct a full-fledged model of the original. In this case the words spoken and the actions taken by the speaker are likened to the clues of the paleontologist, and the dinosaur, to the meaning conveyed through these clues.

– David Rumelhart

The Sentence Gestalt Model

• Input consists of sequences of words.

• After each word, net attempts to complete a set of role-filler pairs (can probe with role or filler).

• Sentence gestalt is used to constrain completion and serves as context for interpretation of next constituent.

• Rohde (2002) extended this model to allow probes for fillers of roles with respect to particular head words (e.g. verbs) so that the model could deal with embedded clauses.
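
The data flow in the first three bullets can be sketched as follows. This is a simplified illustration with untrained random weights, not the published architecture: the class name, the layer sizes, the single-layer gestalt update, and the probe encoding are assumptions made for the example.

```python
import math
import random


def _layer(n_out, n_in, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]


def _forward(weights, inputs):
    """Logistic units: each row of weights feeds one output unit."""
    return [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, inputs))))
            for row in weights]


class SentenceGestaltSketch:
    """Hypothetical names and sizes; no training procedure is included."""

    def __init__(self, n_words, n_gestalt, n_probe, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_words = n_words
        self.update_weights = _layer(n_gestalt, n_words + n_gestalt, rng)
        self.hidden_weights = _layer(n_hidden, n_gestalt + n_probe, rng)
        self.output_weights = _layer(n_probe, n_hidden, rng)
        self.gestalt = [0.0] * n_gestalt

    def read_word(self, word_index):
        """Combine the next word with the previous gestalt to form a new gestalt."""
        word = [0.0] * self.n_words
        word[word_index] = 1.0
        self.gestalt = _forward(self.update_weights, word + self.gestalt)
        return self.gestalt

    def probe(self, probe_vector):
        """Given a partial role-filler pattern, return the completed pattern."""
        hidden = _forward(self.hidden_weights, self.gestalt + probe_vector)
        return _forward(self.output_weights, hidden)
```

In use, one would call `read_word` once per word and `probe` after each word, so that the completed role-filler pattern can change as the sentence unfolds.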

A probabilistic formulation of back propagation

• Think of the activation of a unit as representing the network’s estimate of the probability that the unit should be on in the given context.

• We can measure the degree to which the observed target values match their predicted values using a measure called ‘Cross-Entropy’:

$CE_p = -\sum_i \left[ t_{ip} \log(a_{ip}) + (1 - t_{ip}) \log(1 - a_{ip}) \right]$

• If targets are actually probabilistic, minimizing $CE_p$ maximizes the probability of the observed target values.

• The minimum value of the CE will occur when the activations match the target probabilities. (SSE also has the same minimum, but lacks the explicit probabilistic interpretation).

• [CE has the practical advantage of eliminating the ‘pinned output unit’ problem.]
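
The two claims above can be checked numerically with a short sketch. It assumes logistic output units and, for the comparison, a squared error of the form 1/2 (t - a)^2; the function names and the clipping constant are illustrative choices.

```python
import math


def cross_entropy(targets, activations, eps=1e-12):
    """Cross-entropy for one pattern, summed over output units."""
    total = 0.0
    for t, a in zip(targets, activations):
        a = min(max(a, eps), 1.0 - eps)  # keep log() finite
        total -= t * math.log(a) + (1.0 - t) * math.log(1.0 - a)
    return total


def expected_cross_entropy(p, a):
    """Expected CE for one unit whose target is 1 with probability p."""
    return p * cross_entropy([1.0], [a]) + (1.0 - p) * cross_entropy([0.0], [a])


# The expected CE is smallest when the activation equals the target probability.
p = 0.7
candidates = [i / 100 for i in range(1, 100)]
best = min(candidates, key=lambda a: expected_cross_entropy(p, a))
print(best)  # prints 0.7


def delta_cross_entropy(t, a):
    """Derivative of CE with respect to the net input of a logistic unit."""
    return a - t


def delta_squared_error(t, a):
    """Derivative of 1/2 (t - a)^2 with respect to the net input of a logistic unit."""
    return (a - t) * a * (1.0 - a)


# A unit 'pinned' near 0 when its target is 1:
pinned_ce = delta_cross_entropy(1.0, 0.001)
pinned_sse = delta_squared_error(1.0, 0.001)
```

For the pinned unit the squared-error signal is scaled down by the factor a(1 - a), so it nearly vanishes, while the cross-entropy signal stays close to -1. This is one way to read the bracketed remark about the ‘pinned output unit’ problem.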

Sentences can be active or passive; constituents can be vaguely identified or may be left out if strongly implied.

Changing interpretations of role fillers as a sentence unfolds

St. John’s (1992) Story Gestalt Model

• Learns from stereotyped multi-proposition stories with slots and fillers

• Can answer specific questions, fill in missing propositions based on typical properties of scripts, etc.

Limitations

• Can only deal with ‘one level’ event structures
– Cannot handle embeddings or modifiers or constituents, as in
– ‘The policeman saw that the young girl was bitten by the mean dog’.

• Two follow-on approaches
– Use fuller probes for completions of embedded propositions (Bryant and Miikkulainen, 2001)
– Use a recursively constructed compressed representation of the semantics of the sentence (Rohde, 2002).

Hierarchical Compressed Representation of a Moderately Complex Sentence

Compressed Decodable Representation of a Head-relation-filler triple

Rohde’s (2002) Model

• Used a common representation constrained by three-role propositions and sentences.

• Did prediction and production as well as comprehension.

One complaint remains

• The models presuppose propositional representations of events … that does not seem right.

• Can we get rid of stipulation of structure and query the Gestalt with an English question?

• Can we create a target for learning based on an actual scene representation rather than a propositional representation?

Schematic of a Future Model

[Figure: schematic diagram; the only surviving labels are ‘Event’ and ‘Event’]