
Computational Paninian Grammar for Dependency Parsing

Dipti Misra Sharma
LTRC, IIIT, Hyderabad

NLP Winter School
25-12-2008

Outline

• Background
• Paninian Grammar: The Basic Framework
• Some Example Cases
• Conclusion

Background

Indian languages
• Rich morphology
• Relatively flexible word order

For example,

1. a) baccaa phala khaataa hai

‘child’ ‘fruit’ ‘eat+hab’ ‘pres’

b) phala baccaa khaataa hai

c) phala khaataa hai baccaa

d) baccaa khaataa hai phala

Basic Structure in PS

S
    NP
        N: baccaa
    VP
        NP
            N: phala
        VP
            V: khaataa
            Aux: hai

1 a) baccaa phala khaataa hai

‘child’ ‘fruit’ ‘eat+hab’ ‘pres’

•Subject – baccaa ‘child’

•Object - phala ‘fruit’

PS for 1(b)

1 b) phala baccaa khaataa hai

‘fruit’ ‘child’ ‘eat’ ‘pres’

Topic – phala ‘fruit’

Subject - baccaa ‘child’

Object - t

•Movement involved

Tree - I

Problems

• Complex tree
• In what ways is the subject (baccaa) different from the object (phala)?

Agreement does not hold

Position does not hold

How to Draw PSs for 1 (c-d) ?

1 c) phala khaataa hai baccaa

'fruit' 'eat+hab' 'pres' 'child'

1 d) baccaa khaataa hai phala

'child' 'eat+hab' 'pres' 'fruit'

Simple and perfectly natural sentences - difficult to handle in Phrase Structure

Dependency structures make it easy

Dependency Structure

khaataa_hai
    k1: baccaa
    k2: phala

baccaa phala khaataa hai
‘child’ ‘fruit’ ‘eat’ ‘is’

phala baccaa khaataa hai
‘fruit’ ‘child’ ‘eat’ ‘is’

baccaa khaataa hai phala
‘child’ ‘eat’ ‘is’ ‘fruit’

phala khaataa hai baccaa
‘fruit’ ‘eat’ ‘is’ ‘child’

• One dependency structure for all of (1a-d)
• An additional attribute, 'order', can be included to capture the variation in word order
• Case and postpositions can be encoded in the role
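The claim that a single dependency structure covers all four word orders can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ROLES` table and the `analyse` helper are hypothetical names, and the table lookup stands in for a real parser.

```python
# Toy lexicon (an assumption for illustration): the karaka each noun bears
# with respect to the verb group in examples 1(a)-(d).
ROLES = {"baccaa": "k1", "phala": "k2"}
VERB_GROUP = ("khaataa", "hai")


def analyse(sentence):
    """Return (dependencies, order) for one of the toy sentences."""
    tokens = sentence.split()
    head = "_".join(VERB_GROUP)
    # The dependency structure is an unordered set of (head, label, dependent).
    deps = frozenset((head, ROLES[t], t) for t in tokens if t in ROLES)
    # Word order is kept as a separate attribute, outside the tree.
    order = tuple(head if t == VERB_GROUP[0] else t
                  for t in tokens if t != VERB_GROUP[1])
    return deps, order


sentences = [
    "baccaa phala khaataa hai",
    "phala baccaa khaataa hai",
    "phala khaataa hai baccaa",
    "baccaa khaataa hai phala",
]
analyses = [analyse(s) for s in sentences]
```

All four analyses share one set of dependencies and differ only in the `order` attribute.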

Paninian Grammatical Formalism

A dependency-grammar-based approach

Motivation for following the Paninian approach
• Inspired by an inflectionally rich language (Sanskrit)
• Better suited for handling Indian languages (ILs)
• Provides the level of syntactico-semantic interface for parsing
• Various linguistic phenomena handled seamlessly

(Refer to Akshar Bharati et al., Natural Language Processing: A Paninian Perspective (1995), http://ltrc.iiit.net/showfile.php?filename=downloads/nlpbook/index.html)

Paninian Grammar Contd.

The grammar facilitates realisation of the intended meaning as an 'expression' of what the speaker wants to communicate (vivaksha)

The Basic Framework

• Treats a sentence as a series of modifier-modified relations
• A sentence has a primary modified (generally a verb)
• Provides a blueprint to identify these relations
• Syntactic cues help in identifying the relation types

Levels of Representation

(1) Semantic information
    Assignment of karakas (Th-roles) and of abstract tense

(2) Morphosyntactic representation
    Morphological spellout rules

(3) Abstract morphological representation
    Allomorphy and phonology

(4) Phonological output form

(From Kiparsky, Lectures in CIEFL, Hyderabad, pg2)

Some Concepts

• Speaker's intention (vivakshaa)
• Root + Suffix (prakriti + pratyaya)
• Expectancy (aakaankshaa)
• Eligibility (yogyataa)
• Proximity (sannidhi)
• Karaka and vibhakti

Speaker’s Intention (vivakshaa)

• Each sentence reflects the speaker’s intention
• Various sub-actions come into focus
• Participants are assigned various relations accordingly
• ‘key’ gets assigned karta or karana based on the kind of sub-action under focus

Syntax reflects vivaksha

Prakriti and Pratyaya (root and suffix)

The premise

Every word is composed of two parts
1. Content part (root morpheme)
2. Functional part (affix)

For languages such as English and Hindi the auxiliaries can be treated as the functional morphemes

Morph analysers or Local word groupers can provide this information

Aakaankshaa (Expectation/Demand)

Every word has certain demands to be fulfilled. For parsing, the verb is the most critical element.

The demand frames (karaka frames) of verbs list their demands.

For example, the frame of the Hindi verb 'khaa':

Verb: khaa

Sense: to eat
Sense ID: ???

E.g. raam seb khaataa hai

‘Ram eats an apple’

-----------------------------------------------------
arc-label   necessity   vibhakti   lextype   reln
-----------------------------------------------------
k1          m           0          n         c
k2          m           0          n         c
-----------------------------------------------------

k1: karta; k2: karma; m: mandatory; n: noun; c: child
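One way to hold such a demand frame in a program is sketched below. The `Demand` class, the `KHAA_FRAME` list and the `candidates` helper are hypothetical names chosen for this illustration, not part of any existing tool; the field values are taken from the frame for 'khaa'.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Demand:
    arc_label: str   # k1 (karta), k2 (karma), ...
    necessity: str   # 'm' = mandatory
    vibhakti: str    # '0' = no overt marker
    lextype: str     # 'n' = noun
    reln: str        # 'c' = child of the verb


# The two rows of the frame for 'khaa' (to eat).
KHAA_FRAME = [
    Demand("k1", "m", "0", "n", "c"),
    Demand("k2", "m", "0", "n", "c"),
]


def candidates(frame, nouns):
    """For each demand, list the nouns whose vibhakti and lextype match it."""
    return {
        d.arc_label: [word for (word, vib, lextype) in nouns
                      if vib == d.vibhakti and lextype == d.lextype]
        for d in frame
    }


# 'raam seb khaataa hai': both nouns are bare (vibhakti '0').
nouns = [("raam", "0", "n"), ("seb", "0", "n")]
result = candidates(KHAA_FRAME, nouns)
```

Here the frame alone leaves both nouns as candidates for both k1 and k2, so further constraints are needed to choose one assignment.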

Yogyataa (Eligibility)

Selectional restrictions

For example,

baccaa phala khaataa hai

'phala' (fruit) does not have the eligibility to become the 'karta' of the verb 'khaa' (eat).

Constraints based on yogyataa require semantic knowledge for each lexical item.

This knowledge can be obtained from a lexical resource such as a 'WordNet'
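A minimal sketch of a yogyataa check follows. The semantic classes and the restrictions for 'khaa' are assumptions made up for this example; in practice they would come from a lexical resource, and the helper names are hypothetical.

```python
# Hypothetical semantic classes; a real system would look these up in a
# lexical resource such as a WordNet.
SEMANTIC_CLASS = {"baccaa": "animate", "phala": "edible"}

# Assumed selectional restrictions for the verb 'khaa' (eat).
KHAA_RESTRICTIONS = {"k1": {"animate"}, "k2": {"edible"}}


def eligible(word, role, restrictions=KHAA_RESTRICTIONS):
    """True if the word's semantic class is allowed for the given karaka."""
    return SEMANTIC_CLASS.get(word) in restrictions[role]


def assign(words, restrictions=KHAA_RESTRICTIONS):
    """List, for every karaka slot, the words eligible to fill it."""
    return {role: [w for w in words if eligible(w, role, restrictions)]
            for role in restrictions}


assignment = assign(["baccaa", "phala"])
```

With these toy classes, 'phala' is rejected as karta and each slot is left with exactly one candidate.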

Sannidhi (Proximity)

The modifier and the modified tend to occur in close proximity in a sentence

For example,

'raama ne kelaa khaayaa, mohana ne duudha piyaa Ora Hari ne film dekhii'

This Hindi example contains three verbs: khaayaa (ate), piyaa (drank) and dekhii (saw).

Respective arguments of each of these verbs would tend to occur in close proximity to it
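The proximity tendency can be sketched as a simple heuristic: attach every noun to the nearest verb that follows it. This assumes verb-final clauses, as in the example, and is a toy illustration rather than the method of any actual parser; the word lists and the `attach_by_proximity` helper are hypothetical.

```python
# Tiny word lists covering only the example sentence (assumptions).
VERBS = {"khaayaa", "piyaa", "dekhii"}
POSTPOSITIONS = {"ne"}
CONNECTIVES = {"Ora"}


def attach_by_proximity(tokens):
    """Attach each noun to the nearest verb that follows it."""
    attachments = {}
    pending = []  # nouns seen since the previous verb
    for tok in tokens:
        if tok in VERBS:
            attachments[tok] = pending
            pending = []
        elif tok not in POSTPOSITIONS and tok not in CONNECTIVES:
            pending.append(tok)
    return attachments


tokens = ("raama ne kelaa khaayaa mohana ne duudha piyaa "
          "Ora Hari ne film dekhii").split()
groups = attach_by_proximity(tokens)
```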

Karaka and Vibhakti

Two levels of analysis
• Syntactico-semantic relations:
    Direct participants of the action denoted by a verb (karaka)
    Other relations: purpose, genitive, reason etc.
• Relation markers (vibhaktis)

Semantics of the verb

A verbal root denotes:
• The activity
• The result

Locus of activity: karta

Locus of result: karma

Verbal root
    activity: karta
    result: karma

The boy opened the lock

open
    k1: boy
    k2: lock

k1 – karta, k2 – karma

karta and karma sometimes correspond to agent/theme, but not always.

Action – bundle of sub-actions

• The boy opened the lock with the key
• The key opened the lock
• The lock opened

Notion of vivaksha: realization of the speaker's intention in a sentence

Sub-actions - Opening of lock


Action 1 The boy opened the lock with the key

Action 2 The key opened the lock

Action 3 The lock opened

Each sentence reflects speakers’ intention

Sub-actions - Opening of lock

Action 1
open
    k1: boy
    k2: lock
    k3: key

Action 2
open
    k1: key
    k2: lock

Action 3
open
    k1: lock

k1 – karta (doer)
k2 – karma (affected)
k3 – karana (instrument)

Basic karaka relations

Only six

• karta – subject/agent/doer
• karma – object/patient
• karana – instrument
• sampradaan – beneficiary
• apaadaan – source
• adhikarana – location in place/time/other

Basic karaka relations

raama phala khaataa hai
‘Ram eats fruit’

raama caakuu se seba kaaTtaa hai
‘Ram cuts the apple with a knife’

raama ne mohana ko pustaka dii
‘Ram gave a book to Mohan’

Other relations

Other dependency relations
• Purpose, reason, direction etc.
• Causatives, associatives, comparatives etc.
• Genitive, adjective

Vibhaktis : Markers for karaka Relations

• Relation markers (vibhaktis)

raama ne caakuu se seba kaaTaa
'Ram' 'erg' 'knife' 'with' 'apple' 'cut'

raama: karta (doer); caakuu: karana (instrument); seba: karma (theme)

raama ne mohana ke_liye seba kaaTaa
‘Ram’ ‘erg’ ‘Mohan’ ‘for’ ‘apple’ ‘cut’
“Ram cut the apple for Mohan” (purpose)

maiM mohana ke_saatha baazaara gayaa
‘I’ ‘Mohan’ ‘with’ ‘market’ ‘went’
“I went to the market with Mohan” (associative)

However

No one-to-one correspondence between relations and relation markers
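The lack of a one-to-one mapping can be made concrete by tabulating marker and relation pairs from the examples in this talk. The pairing of 'ko' with sampradaan is one reading of the 'mohana ko' example, and all variable names are hypothetical.

```python
from collections import defaultdict

# (vibhakti, relation) pairs as read off the examples in this talk.
OBSERVED = [
    ("0", "karta"),                # raama phala khaataa hai
    ("0", "karma"),                # raama phala khaataa hai
    ("ne", "karta"),               # raama ne caakuu se seba kaaTaa
    ("se", "karana"),              # raama ne caakuu se seba kaaTaa
    ("ko", "sampradaan"),          # raama ne mohana ko pustaka dii (assumed reading)
    ("ke_liye", "purpose"),        # mohana ke_liye
    ("ke_saatha", "associative"),  # mohana ke_saatha
]

by_marker = defaultdict(set)
by_relation = defaultdict(set)
for vib, rel in OBSERVED:
    by_marker[vib].add(rel)
    by_relation[rel].add(vib)

# Markers that signal more than one relation, and relations that can be
# signalled by more than one marker.
ambiguous_markers = {v for v, rels in by_marker.items() if len(rels) > 1}
multi_marked_relations = {r for r, vibs in by_relation.items() if len(vibs) > 1}
```

Even in this small sample the zero marker is ambiguous between karta and karma, and karta appears with two different markers.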

Syntactic Cues

• Verbal inflections (Tense Aspect Modality (TAM))
• Passive: verb agrees with the karma
• Some other cases

raama ko jaanaa paDaa
‘Ram+to’ ‘go’ ‘had to’
“Ram had to go”

raama ko calanaa caahiye
‘Ram’ ‘to’ ‘walk’ ‘should’
“Ram should leave”

Example

raama jaataa hai
‘Ram’ ‘go+hab’ ‘pres’
“Ram goes”

jaa
    karta: raama

raama ko jaanaa paDaa
‘Ram+to’ ‘go’ ‘had to’
“Ram had to go”

jaa
    karta: raama
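The role of TAM as a syntactic cue can be sketched as a lookup from the verb's TAM to the vibhakti expected on its karta. The TAM labels below are assumptions written in the 'taa_hai' style, the table covers only sentences that appear in this talk, and `find_karta` is a hypothetical helper.

```python
# Assumed TAM labels mapped to the vibhakti that the karta carries.
KARTA_VIBHAKTI_BY_TAM = {
    "taa_hai": "0",        # raama jaataa hai
    "yaa": "ne",           # raama ne caakuu se seba kaaTaa
    "naa_paDaa": "ko",     # raama ko jaanaa paDaa
    "naa_caahiye": "ko",   # raama ko calanaa caahiye
}


def find_karta(chunks, tam):
    """chunks: list of (noun, vibhakti) pairs.

    Return the nouns whose marker matches the vibhakti that this TAM
    demands of the karta.
    """
    wanted = KARTA_VIBHAKTI_BY_TAM[tam]
    return [noun for noun, vib in chunks if vib == wanted]


k1_habitual = find_karta([("raama", "0")], "taa_hai")
k1_obligation = find_karta([("raama", "ko")], "naa_paDaa")
k1_perfective = find_karta(
    [("raama", "ne"), ("caakuu", "se"), ("seba", "0")], "yaa")
```

The same noun is found as karta in all three cases even though its surface marker changes with the TAM.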

Some Examples

• Relative Clause
• MWEs
• Change of state verbs
• Conjuncts
• Ellipsis

Relative Clause

A noun is modified by a clause with a relative pronoun as its co-referent

Example

meraa bhaaii jo dillii meM rahataa hai kala aa rahaa hai
‘my’ ‘brother’ ‘who’ ‘Delhi’ ‘in’ ‘live+hab’ ‘pres’ ‘tomorrow’ ‘come’ ‘prog’ ‘pres’
‘My brother who lives in Delhi is coming tomorrow’

How to represent this ? Two possible representations

Alternative 1

aa
    meraa bhaaii
        jo
            raha
                dillii
    kala

Alternative 2

aa
    meraa bhaaii
        raha
            jo (coref: bhaaii)
            dillii
    kala

Other Relative-Correlative Constructions

Adjective having a clausal modifier

tuma aisaa sundara ghara banaao jaisaa unakaa hai
‘you’ ‘such’ ‘beautiful’ ‘house’ ‘build’ ‘such-that’ ‘theirs’ ‘is’
“You build a house as beautiful as theirs”

banaao ‘build’ k1 k2

tuma ghara

adj sundara

jjmod aisaa

coref jo-vo-jjmod hai jaisaa unakaa

jaisaa usakaa

MWEs

Conjunct Verbs

((raama ne)) ((bahuta dera)) ((ravi kii)) ((pratiikshaa kii))
'Ram' 'erg' 'very' 'late' 'Ravi' 'of' 'wait' 'did'
“Ram waited for Ravi for a long time”

((kaaryashaalaa ke liye)) ((biisa logoM kaa)) ((naamaaMkana kiyaa gayaa))
'workshop' 'for' 'twenty' 'people' 'of' 'name registration' 'do+passive'
“Twenty people were registered for the workshop”

Conjunct Verbs

Conjunct verb ‘prashna kiyaa’ below

mohana ne ravi se prashna kiyaa
'Mohan' 'erg' 'Ravi' 'to' 'question' 'did'
“Mohan asked Ravi a question”

A conjunct verb can have partial modification

mohana ne acchaa prashna kiyaa thaa
'Mohan' 'erg' 'good' 'question' 'do+perf' 'past'

The elements in a complex predicate can also be discontinuous

prashna to mohana ne kiyaa thaa
'question' 'part' 'Mohan' 'erg' 'do+perf' 'past'

Conjunct Verbs

However,

Mohan ne ravi se acchaa prashna kiyaa

prashna_kiyaa ‘questioned’
    k1: mohan ne ‘Mohan’
    k2: ravi se ‘to Ravi’
    ?: acchaa ‘good’

'acchaa' is NOT a verb modifier: it modifies 'prashna', not 'prashna kiyaa'.

Solution ?

Conjunct Verbs

Solution

Don't chunk a conjunct verb as a single verbal unit

Thus,

Mohan ne ravi se ((acchaa)) ((prashna kiyaa))_VG

Revise to

Mohan ne ravi se ((acchaa prashna))_NP ((kiyaa))_VG

Conjunct Verbs

Show the 'part-of' relation between the noun and the verb

Add a tag 'pof' to achieve the above

Therefore,

_kiyaa
    k1: mohan ne
    k2: ravi se
    pof: prashna
        nmod: acchaa
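The 'pof' analysis of 'Mohan ne ravi se acchaa prashna kiyaa' can be sketched as a small head-pointer table. The `TREE` encoding and both helper functions are hypothetical; the head verb is written 'kiyaa' here. The point is that the complex predicate stays recoverable while 'acchaa' attaches to the noun only.

```python
# Each entry: dependent -> (head, relation). Postpositions are kept with
# their nouns for simplicity.
TREE = {
    "mohan ne": ("kiyaa", "k1"),
    "ravi se": ("kiyaa", "k2"),
    "prashna": ("kiyaa", "pof"),
    "acchaa": ("prashna", "nmod"),
}


def complex_predicate(tree, verb):
    """Join the verb with any noun attached to it by 'pof'."""
    parts = [dep for dep, (head, rel) in tree.items()
             if head == verb and rel == "pof"]
    return "_".join(parts + [verb])


def modifiers_of(tree, head):
    """Return the direct dependents of a node with their relations."""
    return {dep: rel for dep, (h, rel) in tree.items() if h == head}


predicate = complex_predicate(TREE, "kiyaa")
noun_modifiers = modifiers_of(TREE, "prashna")
```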

DS for Discontinuous Elements

prashna to mohana ne kiyaa thaa

• Use of pof (‘Part Of’ relation)

kiyaa
    k1: mohana
    pof: prashna

MWEs

Idioms

((kisaana kii)) ((patnii ko)) ((vaha ciDiyaa)) ((phuuTii aaMkha nahiiM suhaatii thii))
'farmer' 'of' 'wife' 'to' 'that' 'bird' 'not appealed'

The idiom (phuuTii aaMkha ... suhaatii) is functionally a verb.

Idioms

Two possible solutions

Solution-1

phuuTii aaMkha suhaa <fs tam=nahiiM+taa_thaa> ‘not appealed’
    k1: patnii ‘wife’
        r6: kisaana ‘farmer’
    k2: vaha ciDiyaa ‘that bird’

Idioms

Solution-2

suhaa <fs tam=nahiiM+taa_thaa> ‘not appealed’
    k2: vaha ciDiyaa ‘that bird’
    pof: phuuTii aaMkha ‘burst eye’
    k1: patnii ‘wife’
        r6: kisaana ‘farmer’

Change of State Verbs

Change of state verbs such as ‘raMganaa’ (colour) pose a problem, for example:

((usane)) ((apanaa ghara)) ((piilaa)) ((raMgaa))
'he/she' 'own' 'house' 'yellow' 'coloured'

raMga ‘colour’
    k1: usane ‘he/she’
    k2: ghara ‘house’
    ?: piilaa ‘yellow’

• Is 'piilaa' a complement of 'ghara'? OR
• Is it the k2 of raMgaa?
• If ‘piilaa’ is the k2 of raMgaa, then what is the relation of ‘ghara’ with ‘raMgaa’?
• Can they both be k2?

Proposed Solution

In Panini's framework, verbs denoting 'change of state' can have two 'karma'

• The object which is being changed
• The state after change

Thus,

raMga ‘coloured’
    k1: usane ‘he’
    k2-1: ghara ‘house’
    k2-2: piilaa ‘yellow’

Conjuncts

Need special treatment in a dependency representation

(maiM baazaara gayaa)1 Ora (ve loga ghara para ruke)2
'I' 'market' 'went' 'and' 'those' 'people' 'home' 'at' 'stayed'
“I went to the market and those people stayed at home”

• What is the head of a co-ordinate structure?
• How to represent the equal status of 1 and 2 above?

Conjuncts

• Take the conjunct as the 'head'
• Label the relation as 'ccof'

Ora ‘and’
    ccof: gayaa ‘went’
        k1: maiM ‘I’
        k2: baazaara ‘market’
    ccof: ruke ‘stay’
        k1: loga ‘people’
        k7p: ghara ‘home’

A subordinating conjunct will have a single child node

Some Problem Cases

Certain complex sentences pose problems

For example:

agara tuma aate to hama vahaaM jaate
‘if’ ‘you’ ‘come’ ‘then’ ‘we’ ‘there’ ‘go’
“Had you come, we would have gone there”

• Counterfactual
• ‘agara’ and ‘to’ are two connectives
• How to represent the dependencies?

Main Clause – Subordinate Clause

jaate ‘go+?’
    ?: agara
        ccof: aate
            k1: tuma
    ?: to
    k1: hama
    k7p: vahaaM

This representation fails to capture the relation between ‘agara’ and ‘to’

Representation-Currently Followed

to ‘then’
    ccof: jaate ‘go+?’
        vmod: agara
            ccof: aate 'come'
                k1: tuma 'you'
        k1: hama ‘we’
        k7p: vahaaM ‘there’

Alternative Proposal

agara-to
    pof: agara
        ccof: aate
            k1: tuma
    pof: to
        ccof: jaate
            k1: hama
            k7p: vahaaM

Treat ‘agara-to’ as a complex conjunct

Ellipsis

How to show dependencies when the head is missing ?

bacce baDe ho gaye hEM kisI kI bAta nahIM sunate

“The children have grown up, they don't listen to anyone”

No explicit conjunct !!

Insert a NULL element to show the dependencies

NULL_CCP
    ccof: baDe_ho_gaye
    ccof: nahIM_sunate

Insert a NULL node only if it is essential to represent the dependencies
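The treatment of coordination, with and without an overt conjunct, can be sketched in one small function. The `coordinate` helper and its triple format are hypothetical; only the 'ccof' label and the NULL_CCP node name come from the slides.

```python
def coordinate(clause_heads, conjunct=None):
    """Return (head, relation, dependent) triples for a coordinate structure.

    The conjunct is the head and every clause head is attached to it with
    'ccof'. When no overt conjunct is present, a NULL_CCP node is inserted
    to serve as the head.
    """
    head = conjunct if conjunct is not None else "NULL_CCP"
    return [(head, "ccof", clause) for clause in clause_heads]


# Overt conjunct: maiM baazaara gayaa Ora ve loga ghara para ruke
overt = coordinate(["gayaa", "ruke"], conjunct="Ora")

# No conjunct: bacce baDe ho gaye hEM kisI kI bAta nahIM sunate
elided = coordinate(["baDe_ho_gaye", "nahIM_sunate"])
```

The NULL node is created only in the second call, where the dependencies could not otherwise be represented.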


Applying Paninian Model to English

Some English Examples

English is:
• A configurational language
• Relatively fixed word order
• Relations are not realised in affixes
• Subject and object are positional
• Subject is sacrosanct

Passive

Rama ate a banana

eat <fs tam=PAST>
    k1: Rama
    k2: banana

A banana was eaten by Rama

eat <fs tam=was_en>
    k2: banana
    k1: Rama

Extend the notion of vibhakti to English subject, object positions
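Treating English positions as vibhakti can be sketched as follows: the TAM decides which karaka the subject position marks. The `english_karakas` function is a hypothetical helper, and it handles only the two sentences from the slide.

```python
def english_karakas(tam, subject, other):
    """Map English positions to karakas, given the verb's TAM.

    subject: the noun in subject position.
    other: the object (active) or the noun in the by-phrase (passive).
    The TAM labels 'PAST' and 'was_en' follow the slides.
    """
    if tam == "was_en":
        # Passive: the subject position marks the karma (k2).
        return {"k1": other, "k2": subject}
    # Active: the subject position marks the karta (k1).
    return {"k1": subject, "k2": other}


active = english_karakas("PAST", "Rama", "banana")
passive = english_karakas("was_en", "banana", "Rama")
```

Both sentences receive the same karaka assignment, although the nouns occupy different positions.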

Interrogatives

Did Rama eat a banana ?

A 'Yes-no' interrogative

Structurally,

• Interrogative is realised through word order change
• Subject-auxiliary inversion
• No interrogative morpheme

Interrogatives Contd.

Proposed solution:

eat <fs stype=interrogative__yes-no>
    fragof: Did
    k1: Rama
    k2: banana

Position gives the cues for the constraints

Interrogatives Contd.

What did Rama eat ?

eat <fs stype=interrogative__wh>
    k2: What
    fragof: did
    k1: Rama

The question element 'what' and the auxiliary position provide the syntactic cues

Control Verbs

John persuaded Harry to leave

persuade
    k1: John
    k2: Harry
    rt (?): leave

The object of persuade corefers to the 'missing' 'karta' of 'leave'

John promised Harry to leave

promise
    k1: John
    k4: Harry
    k2: leave

The subject of promise corefers to the 'missing' 'karta' of 'leave'

Verbs such as 'want'

John wanted Harry to leave

want
    k1: John
    k2: leave
        k1: Harry

'want' is a transitive verb and can take 'a clause' as its 'karma'

Empty 'it'

It is raining in Delhi

rain <fs stype=expletive__it>
    k7p: Delhi

Possible representation: empty 'it' can be captured in the feature structure

Conclusion

The Paninian grammatical formalism offers a dependency-based approach to sentence parsing that is better suited to morphologically rich languages with relatively free word order, such as Indian languages.