syntactic analysis using context free grammars

Post on 15-Jan-2016


Syntactic analysis using Context Free Grammars

Analysis of language

• Morphological analysis
– Chairs: <chair, n, plural>, <chair, v, 3rd person, present>
• Part Of Speech (POS) tagging
– The/DT man/NN left/VBD the/DT room/NN
– The/DT red/ADJ block/NN on/PREP the/DT blue/ADJ cylinder/NN was/AUX moved/VBD onto/PREP the/DT brown/ADJ table/NN
• Any further analysis?
– chunks, clauses, syntax, semantics, word senses, etc.
• Today’s lecture: analyzing syntax

What is Syntax?

• Study of the structure of language
– how words can connect to each other

• Specifically, goal is to relate surface form (i.e. the sentence) to semantics (the meaning)

• Representational device is tree structure

Structure in Strings: Proposal 1

• Some words: the a small nice big very boy girl sees likes
• Some good sentences:
– (the) boy (likes a girl)
– (the small) girl (likes the big girl)
– (a very small nice) boy (sees a very nice boy)
• Some bad sentences:
– *(the) boy (the girl)
– *(small) boy (likes the nice girl)

Structure in Strings: Proposal 2

• Some words: the a small nice big very boy girl sees likes
• Some good sentences:
– (the boy) likes (a girl)
– (the small girl) likes (the big girl)
– (a very small nice boy) sees (a very nice boy)
• Some bad sentences:
– *(the boy) (the girl)
– *(small boy) likes (the nice girl)

More Structure in Strings: Proposal 2 (continued)

• Some words: the a small nice big very boy girl sees likes
• Some good sentences:
– ((the) boy) likes ((a) girl)
– ((the) (small) girl) likes ((the) (big) girl)
– ((a) ((very) small) (nice) boy) sees ((a) ((very) nice) girl)
• Some bad sentences:
– *((the) boy) ((the) girl)
– *((small) boy) likes ((the) (nice) girl)

From Substrings to Trees

• (((the) boy) likes ((a) girl))

[Figure: the bracketing above drawn as a tree over the words the, boy, likes, a, girl]

Context-Free Grammars

• Terminals
– This would be the lexicon/vocabulary
• Non-Terminals
– The constituents in a language
– Like noun phrase, verb phrase, prepositional phrase and sentence
• Rules
– Rules consist of a single non-terminal on the left and any number of terminals and non-terminals on the right.
– Describe the allowed structure of the constituents
– Express the ways in which symbols of the language can be grouped or ordered together
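The three ingredients above can be held in a small data structure. This is a minimal sketch, not a standard API: the names `Rule`, `Grammar` and `is_well_formed` are made up for illustration, and the `start` field is an assumption (the slide does not list a start symbol, although CFG definitions usually include one).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    lhs: str    # a single non-terminal on the left
    rhs: tuple  # any number of terminals and non-terminals on the right


@dataclass
class Grammar:
    terminals: set
    nonterminals: set
    rules: list
    start: str  # assumption: not mentioned on the slide

    def is_well_formed(self):
        """Check that each rule has one non-terminal on the left and only known symbols on the right."""
        symbols = self.terminals | self.nonterminals
        return all(
            r.lhs in self.nonterminals and all(s in symbols for s in r.rhs)
            for r in self.rules
        )


# Hypothetical toy grammar covering just "the boy".
toy = Grammar(
    terminals={"the", "boy"},
    nonterminals={"NP", "DetP", "N"},
    rules=[Rule("NP", ("DetP", "N")), Rule("DetP", ("the",)), Rule("N", ("boy",))],
    start="NP",
)
```

A rule whose left-hand side is a word rather than a non-terminal would make `is_well_formed()` return False.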

Phrase Structure Tree

• (((the/Det) boy/N) likes/V ((a/Det) girl/N))

[Figure: phrase-structure tree for the bracketing above, with node labels S, NP, NP, DetP, DetP over the words the, boy, likes, a, girl]

• nonterminal symbols = constituents
• terminal symbols = words

Context?

• The notion of context in CFGs is not the same as the ordinary meaning of the word context in language.

• All it really means is that the non-terminal on the left-hand side of a rule is out there all by itself
– A -> B C
– Means that I can rewrite an A as a B followed by a C regardless of the context in which A is found

CFG: Example

• Many possible CFGs for English; here is an example (fragment):
– S -> NP VP
– VP -> V NP
– NP -> DetP N | AdjP NP
– AdjP -> Adj | Adv AdjP
– N -> boy | girl
– V -> sees | likes
– Adj -> big | small
– Adv -> very
– DetP -> a | the

the very small boy likes a girl
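To try the fragment out, its rules can be written as a dictionary and checked with a small top-down recognizer. This is a sketch under stated assumptions: the names `GRAMMAR`, `recognizes` and `ends` are illustrative, the start symbol is assumed to be S, and the approach only terminates because this fragment has no left-recursive rules.

```python
from functools import lru_cache

# The fragment from the slide, one entry per non-terminal.
GRAMMAR = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("DetP", "N"), ("AdjP", "NP")],
    "AdjP": [("Adj",), ("Adv", "AdjP")],
    "N": [("boy",), ("girl",)],
    "V": [("sees",), ("likes",)],
    "Adj": [("big",), ("small",)],
    "Adv": [("very",)],
    "DetP": [("a",), ("the",)],
}


def recognizes(words, start="S"):
    """Return True if the word sequence can be derived from `start`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def ends(symbol, i):
        # Set of positions j such that `symbol` derives words[i:j].
        if symbol not in GRAMMAR:  # terminal: must match the next word
            if i < len(words) and words[i] == symbol:
                return frozenset({i + 1})
            return frozenset()
        result = set()
        for rhs in GRAMMAR[symbol]:
            positions = {i}
            for child in rhs:  # thread end positions through the right-hand side
                positions = {k for j in positions for k in ends(child, j)}
            result |= positions
        return frozenset(result)

    return len(words) in ends(start, 0)
```

Read literally, the NP rule attaches AdjP outside the determiner, so this sketch accepts "the boy likes a girl" and "small the boy likes a girl" but rejects "the very small boy likes a girl".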

Derivations in a CFG

• Grammar (the example fragment):
– S -> NP VP
– VP -> V NP
– NP -> DetP N | AdjP NP
– AdjP -> Adj | Adv AdjP
– N -> boy | girl
– V -> sees | likes
– Adj -> big | small
– Adv -> very
– DetP -> a | the
• Derivation of "the boy likes a girl", one step per slide:
– S
– NP VP
– DetP N VP
– the boy VP
– the boy likes NP
– the boy likes a girl

[Figure: the tree grows with each step, ending as S over NP (DetP the, N boy) and VP (V likes, NP (DetP a, N girl))]
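The step-by-step derivation can be replayed mechanically by always rewriting the leftmost non-terminal. A minimal sketch follows; `leftmost_derivation` and `RULE_SEQUENCE` are illustrative names, and the sequence applies one rule at a time, so it shows a few intermediate forms that the slides skip.

```python
NONTERMINALS = {"S", "NP", "VP", "DetP", "N", "V"}


def leftmost_derivation(start, rule_sequence):
    """Apply each (lhs, rhs) rule to the leftmost non-terminal; return all sentential forms."""
    form = [start]
    steps = [" ".join(form)]
    for lhs, rhs in rule_sequence:
        index = next((i for i, s in enumerate(form) if s in NONTERMINALS), None)
        if index is None or form[index] != lhs:
            raise ValueError(f"cannot apply {lhs} -> {' '.join(rhs)} to {' '.join(form)}")
        form[index:index + 1] = rhs  # replace the non-terminal by the rule's right-hand side
        steps.append(" ".join(form))
    return steps


# Rules used to derive "the boy likes a girl", in leftmost order.
RULE_SEQUENCE = [
    ("S", ("NP", "VP")),
    ("NP", ("DetP", "N")),
    ("DetP", ("the",)),
    ("N", ("boy",)),
    ("VP", ("V", "NP")),
    ("V", ("likes",)),
    ("NP", ("DetP", "N")),
    ("DetP", ("a",)),
    ("N", ("girl",)),
]

for step in leftmost_derivation("S", RULE_SEQUENCE):
    print(step)
```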

Simple lexicon


Simple grammar


Generativity

• We can view these rules as either analysis or synthesis machines
– Generate strings in the language
– Reject strings not in the language
– Impose structures (trees) on strings in the language
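The "synthesis" reading can be sketched by enumerating every string a grammar generates. The example below assumes a cut-down, non-recursive version of the example fragment (the AdjP rules are left out so that the language is finite); `generate` is an illustrative name, and the function would not terminate on a recursive grammar.

```python
from itertools import product

# Non-recursive subset of the example fragment (an assumption made to keep the language finite).
GRAMMAR = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("DetP", "N")],
    "N": [("boy",), ("girl",)],
    "V": [("sees",), ("likes",)],
    "DetP": [("a",), ("the",)],
}


def generate(symbol):
    """Yield every word sequence (as a tuple) derivable from `symbol`."""
    if symbol not in GRAMMAR:  # terminal: yields itself
        yield (symbol,)
        return
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each child, left to right.
        for parts in product(*(list(generate(child)) for child in rhs)):
            yield tuple(word for part in parts for word in part)


sentences = {" ".join(s) for s in generate("S")}
print(len(sentences))
```

Membership in `sentences` then plays the "analysis" role for this small language: strings outside the set are rejected.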


A CFG defines a formal language

• Sentences (strings of words) that can be derived by the grammar are in the formal language defined by the grammar

• Sentences that cannot be derived from the grammar are not in the language
– Ungrammatical


Derivations

• A derivation is a sequence of rules applied to a string that accounts for that string
– Covers all the elements in the string
– Covers only the elements in the string

Recursion

• We’ll have to deal with rules such as the following, where the non-terminal on the left also appears somewhere on the right (directly).
– NP -> NP PP [[The flight] [to Boston]]
– VP -> VP PP [[departed Miami] [at noon]]

Recursion

• Of course, this is what makes syntax interesting
– Flights from Denver
– Flights from Denver to Miami
– Flights from Denver to Miami in February
– Flights from Denver to Miami in February on a Friday
– Flights from Denver to Miami in February on a Friday under $300
– Flights from Denver to Miami in February on a Friday under $300 with lunch
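One way to picture the recursive rule NP -> NP PP is to apply it once per prepositional phrase, wrapping the NP built so far. This is only a sketch: the helper names are illustrative, each PP is kept as an unanalyzed string, and attaching every PP to the whole preceding NP is an assumption, since the slide lists the phrases without giving their trees.

```python
def attach_pps(head, pps):
    """Apply NP -> NP PP once per PP, returning nested (label, child, ...) tuples."""
    tree = ("NP", head)
    for pp in pps:
        tree = ("NP", tree, ("PP", pp))  # the NP so far becomes the left child
    return tree


def leaves(tree):
    """Yield the strings at the leaves of the tree, left to right."""
    _label, *children = tree
    for child in children:
        if isinstance(child, str):
            yield child
        else:
            yield from leaves(child)


def count_label(tree, label):
    """Count the nodes in the tree that carry the given label."""
    node_label, *children = tree
    nested = sum(count_label(c, label) for c in children if not isinstance(c, str))
    return nested + (1 if node_label == label else 0)


phrase = attach_pps("flights", ["from Denver", "to Miami", "in February"])
print(" ".join(leaves(phrase)))
print(count_label(phrase, "NP"))
```

Each extra PP adds one more NP layer, which is why a single rule covers phrases of any length.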

The Point

• If you have a rule like
– VP -> V NP
– It only cares that the thing after the verb is an NP. It doesn’t have to know about the internal affairs of that NP

The Point

• VP -> V NP
• I hate
– flights from Denver
– flights from Denver to Miami
– flights from Denver to Miami in February
– flights from Denver to Miami in February on a Friday
– flights from Denver to Miami in February on a Friday under $300
– flights from Denver to Miami in February on a Friday under $300 with lunch

Potential Problems in CFG

• Agreement
• Subcategorization
• Movement


Agreement

• By agreement, we have in mind constraints that hold among various constituents that take part in a rule or set of rules
• For example, in English, determiners and the head nouns in NPs have to agree in their number.
– This flight
– Those flights
– *This flights
– *Those flight


Problem

• Our earlier NP rules are clearly deficient since they don’t capture this constraint
– NP -> Det Nominal
• Accepts, and assigns correct structures to, grammatical examples (this flight)
• But it’s also happy with incorrect examples (*these flight)
– Such a rule is said to overgenerate.
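The overgeneration can be made concrete by comparing the plain rule with a variant that also checks number. This is a sketch only: the slides do not say how to repair the rule here, so the feature check (and the names `LEXICON`, `np_plain`, `np_agree`) is an assumption chosen for illustration.

```python
# Hypothetical mini-lexicon: word -> (category, number).
LEXICON = {
    "this": ("Det", "sg"),
    "that": ("Det", "sg"),
    "these": ("Det", "pl"),
    "those": ("Det", "pl"),
    "flight": ("Nominal", "sg"),
    "flights": ("Nominal", "pl"),
}


def np_plain(det, noun):
    """NP -> Det Nominal, looking at categories only."""
    return LEXICON[det][0] == "Det" and LEXICON[noun][0] == "Nominal"


def np_agree(det, noun):
    """Same rule, plus the requirement that the number features match."""
    return np_plain(det, noun) and LEXICON[det][1] == LEXICON[noun][1]


print(np_plain("these", "flight"))  # the plain rule overgenerates
print(np_agree("these", "flight"))
```

The same effect could be had by splitting the rule into singular and plural versions, at the cost of more rules.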


Verb Phrases

• English VPs consist of a head verb along with 0 or more following constituents, which we’ll call arguments.

Subcategorization

• Sneeze: John sneezed
• Find: Please find [a flight to NY]NP
• Give: Give [me]NP [a cheaper fare]NP
• Help: Can you help [me]NP [with a flight]PP
• Prefer: I prefer [to leave earlier]TO-VP
• Told: I was told [United has a flight]S
• …
• *John sneezed the book
• *I prefer United has a flight
• *Give with a flight
• Subcat expresses the constraints that a predicate (verb for now) places on the number and type of the arguments it wants to take

So?

• So the various rules for VPs overgenerate.
– They permit the presence of strings containing verbs and arguments that don’t go together
– For example: VP -> V NP, therefore “sneezed the book” is a VP, since “sneeze” is a verb and “the book” is a valid NP
• Subcategorization frames can fix this problem (“slow down” overgeneration)
• This is a modern take on the traditional notion of transitive/intransitive.
• Modern grammars may have 100s of such classes.
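Subcategorization frames can be sketched as a table from each verb to the argument sequences it allows. The frames below are just the ones shown on the Subcategorization slide (real verbs usually allow several); `SUBCAT` and `licensed` are illustrative names.

```python
# One frame per verb, taken from the slide examples; () means "no arguments".
SUBCAT = {
    "sneeze": [()],
    "find": [("NP",)],
    "give": [("NP", "NP")],
    "help": [("NP", "PP")],
    "prefer": [("TO-VP",)],
    "told": [("S",)],
}


def licensed(verb, args):
    """Return True if the verb allows exactly this sequence of argument types."""
    return tuple(args) in SUBCAT.get(verb, [])


print(licensed("sneeze", []))      # John sneezed
print(licensed("sneeze", ["NP"]))  # *John sneezed the book
```

A VP rule would then fire only when `licensed` holds for its head verb, which blocks strings like "sneezed the book".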


Movement

• Core example
– [[My travel agent]NP [booked [the flight]NP]VP]S
• I.e. “book” is a straightforward transitive verb. It expects a single NP within the VP as an argument, and a single NP as the subject.

Movement

• What about?
– Which flight do you want me to have the travel agent book?
• The direct object argument to “book” isn’t appearing in the right place. It is in fact a long way from where it’s supposed to appear.
• And note that it’s separated from its verb by 2 other verbs.

Grammar equivalence and normal form

• Equivalence:
– two grammars are strongly equivalent if:
• they generate the same set of strings
• they assign the same phrase structure to each sentence
– two grammars are weakly equivalent if:
• they generate the same set of strings
• they do not necessarily assign the same phrase structure to each sentence
• Normal form
– Restrict the form of productions
– Chomsky Normal Form (CNF)
– Right-hand side of each production is either exactly two non-terminals or a single terminal
– e.g. A -> B C, A -> a
– Any CFG can be translated into a weakly equivalent CNF grammar (setting aside the empty string)
– A -> B C D <=> A -> B X, X -> C D
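The last line, splitting A -> B C D into A -> B X and X -> C D, is easy to sketch in code. This covers only the step that shortens long right-hand sides; a full CNF conversion also has to handle unit rules and terminals mixed in with non-terminals. The function name `binarize` and the `A_X1`-style names for the new symbols are assumptions.

```python
def binarize(lhs, rhs):
    """Split a rule with more than two right-hand-side symbols into a chain of binary rules."""
    rules = []
    current = lhs
    remaining = list(rhs)
    counter = 0
    while len(remaining) > 2:
        counter += 1
        fresh = f"{lhs}_X{counter}"  # new non-terminal standing for the rest of the rule
        rules.append((current, (remaining[0], fresh)))
        current = fresh
        remaining = remaining[1:]
    rules.append((current, tuple(remaining)))
    return rules


for left, right in binarize("A", ("B", "C", "D")):
    print(left, "->", " ".join(right))
```

The new grammar generates the same strings but assigns them deeper trees, so it is weakly rather than strongly equivalent to the original.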

Building tree structures

• Draw tree structures for the following phrases
– Dallas
– from Denver
– arriving in Washington
– I need to fly between Philadelphia and Atlanta
– My flight from Philadelphia to Atlanta has been cancelled
