Computational Paninian Grammar for Dependency Parsing
Dipti Misra Sharma
LTRC, IIIT, Hyderabad
NLP Winter School, 25-12-2008
Background
Indian languages
• Rich morphology
• Relatively flexible word order
For example,
1. a) baccaa phala khaataa hai
‘child’ ‘fruit’ ‘eat+hab’ ‘pres’
b) phala baccaa khaataa hai
c) baccaa khaataa hai phala
d) phala khaataa hai baccaa
Basic Structure in PS
[S [NP [N baccaa]]
   [VP [NP [N phala]]
       [VP [V khaataa] [Aux hai]]]]
1 a) baccaa phala khaataa hai
‘child’ ‘fruit’ ‘eat+hab’ ‘pres’
•Subject – baccaa ‘child’
•Object - phala ‘fruit’
PS for 1(b)
1 b) phala baccaa khaataa hai
‘fruit’ ‘child’ ‘eat’ ‘pres’
Topic – phala ‘fruit’
Subject - baccaa ‘child’
Object - t
•Movement involved
Tree - I
Problems
• Complex tree
• In what ways is the subject (baccaa) different from the object (phala)?
  • Agreement does not hold
  • Position does not hold
How to draw PSs for 1(c-d)?
1 c) baccaa khaataa hai phala
'child' 'eat+hab' 'pres' 'fruit'
1 d) phala khaataa hai baccaa
'fruit' 'eat+hab' 'pres' 'child'
Simple and perfectly natural sentences - difficult to handle in Phrase Structure
Dependency structures make it easy
Dependency Structure
khaataa_hai
  k1: baccaa
  k2: phala
baccaa phala khaataa hai
‘child’ ‘fruit’ ‘eat’ ‘is’
phala baccaa khaataa hai
‘fruit’ ‘child’ ‘eat’ ‘is’
baccaa khaataa hai phala
‘child’ ‘eat’ ‘is’ ‘fruit’
phala khaataa hai baccaa
‘fruit’ ‘eat’ ‘is’ ‘child’
• One dependency for all (1a-d)
• Additional attribute of 'order' can be included to capture the variation in order
• Case and postpositions can be encoded in the role
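The point that one dependency structure covers all four orders can be sketched in a few lines of Python. This is only a toy illustration: the role lexicon `ROLE_OF` and the helper `analyse` are hypothetical names introduced here, and the roles are looked up by hand rather than parsed.

```python
# Toy illustration: the four word orders in (1a-d) share one dependency structure.
# ROLE_OF is a hand-written assumption, not the output of a parser.
ROLE_OF = {"baccaa": "k1", "phala": "k2"}   # k1 = karta, k2 = karma
VERB_GROUP = {"khaataa", "hai"}

def analyse(sentence):
    """Return (head, labelled dependents, surface order of the dependents)."""
    tokens = sentence.split()
    head = "_".join(t for t in tokens if t in VERB_GROUP)
    deps = frozenset((ROLE_OF[t], t) for t in tokens if t in ROLE_OF)
    order = tuple(t for t in tokens if t in ROLE_OF)   # the extra 'order' attribute
    return head, deps, order

SENTENCES = [
    "baccaa phala khaataa hai",
    "phala baccaa khaataa hai",
    "baccaa khaataa hai phala",
    "phala khaataa hai baccaa",
]
ANALYSES = [analyse(s) for s in SENTENCES]

for sentence, (head, deps, order) in zip(SENTENCES, ANALYSES):
    print(sentence, "->", head, sorted(deps), "order:", order)
```

The head and the labelled dependents come out identical for every sentence; only the `order` attribute differs.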
Paninian Grammatical Formalism
A dependency grammar based approach
Motivation for following the Paninian approach
• Inspired by an inflectionally rich language (Sanskrit)
• Better suited for handling ILs (Indian languages)
• Provides the level of syntactico-semantic interface for parsing
• Various linguistic phenomena handled seamlessly
(Refer: Akshar Bharati et al., Natural Language Processing: A Paninian Perspective (1995), http://ltrc.iiit.net/showfile.php?filename=downloads/nlpbook/index.html)
Paninian Grammar Contd.
The grammar facilitates realisation of the intended meaning as an 'expression' of what the speaker wants to communicate (vivaksha)
The Basic Framework
Treats a sentence as a series of modifier-modified relations
A sentence has a primary modified (generally a verb)
Provides a blueprint to identify these relations
Syntactic cues help in identifying the relation types
Levels of Representation
(1) Semantic information
    Assignment of karakas (Th-roles) and of abstract tense
(2) Morphosyntactic representation
    Morphological spellout rules
(3) Abstract morphological representation
    Allomorphy and phonology
(4) Phonological output form
(From Kiparsky, Lectures in CIEFL, Hyderabad, pg2)
Some Concepts
• Speaker's intention (vivakshaa)
• Root + Suffix (prakriti + pratyaya)
• Expectancy (aakaankshaa)
• Eligibility (yogyataa)
• Proximity (sannidhi)
• Karaka
• Vibhakti
Speaker’s Intention (vivakshaa)
• Each sentence reflects the speaker’s intention
• Various sub-actions come into focus
• Participants are assigned various relations accordingly
• ‘key’ gets assigned karta or karana based on the kind of sub-action under focus
Syntax reflects vivaksha
Prakriti and Pratyaya (root and suffix)
The premise
Every word is composed of two parts
1. Content part (root morpheme)
2. Functional part (affix)
For languages such as English and Hindi the auxiliaries can be treated as the functional morphemes
Morph analysers or Local word groupers can provide this information
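As a rough sketch of what a local word grouper hands to the parser, the snippet below splits a word group into a content part and a functional part. The word list `FUNCTION_WORDS` and the helper `split_group` are assumptions made for illustration; a real system would rely on a morphological analyser rather than a fixed list, and this toy does not split suffixes off a single word.

```python
# Minimal sketch: separate the content part from the functional part of a word group.
# FUNCTION_WORDS is a hypothetical, hand-picked list of postpositions and auxiliaries.
FUNCTION_WORDS = {"ne", "se", "ko", "ke", "liye", "saatha", "hai", "thaa"}

def split_group(group):
    """Return (content part, functional part); '0' marks an empty functional part."""
    tokens = group.split()
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    function = [t for t in tokens if t in FUNCTION_WORDS]
    return " ".join(content), "_".join(function) or "0"

for group in ["raama ne", "mohana ke liye", "seba"]:
    print(group, "->", split_group(group))
```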
aakaankshaa (Expectation/Demand)
Every word has certain demands to be fulfilled.
For parsing, the verb is the most critical element
The demand frames (karaka frames) for the verbs list out their demands
For Example, frame of Hindi verb 'khaa'
Verb: khaa
Sense: to eat
Sense ID: ???
Eg: raam seb khaataa hai
‘Ram eats an apple’
------------------------------------------------------
arc-label  necessity  vibhakti  lextype  reln
------------------------------------------------------
k1         m          0         n        c
k2         m          0         n        c
------------------------------------------------------
k1 = karta; k2 = karma; m = mandatory; n = noun; c = child
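One possible way to hold such a demand frame in code is sketched below. The class `Demand`, the constant `KHAA_FRAME` and the helper `unmet_demands` are names invented for this illustration; only the column values come from the frame above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Demand:
    arc_label: str   # karaka label, e.g. "k1" (karta) or "k2" (karma)
    necessity: str   # "m" = mandatory
    vibhakti: str    # "0" = no overt marker
    lextype: str     # "n" = noun
    reln: str        # "c" = child of the verb

# Demand frame of 'khaa' (to eat), copied from the table.
KHAA_FRAME = [
    Demand("k1", "m", "0", "n", "c"),
    Demand("k2", "m", "0", "n", "c"),
]

def unmet_demands(frame, attached_labels):
    """List the mandatory arc labels that no dependent has filled yet."""
    return [d.arc_label for d in frame
            if d.necessity == "m" and d.arc_label not in attached_labels]

print(unmet_demands(KHAA_FRAME, {"k1", "k2"}))  # every demand met
print(unmet_demands(KHAA_FRAME, {"k1"}))        # karma still missing
```

A parse would be accepted only when `unmet_demands` returns an empty list for each verb.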
Yogyataa (Eligibility)
Selectional Restrictions
For example,
baccaa phala khaataa hai
'phala' (fruit) does not have the eligibility to become the 'karta' of the verb 'khaa' (eat)
Constraints based on yogyata require semantic knowledge for each lexical item
This knowledge can be obtained from a lexical resource such as a 'WordNet'
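A yogyataa check can be sketched as a lookup against semantic classes. Everything below is a stand-in: `SEMANTIC_CLASS` plays the role of a lexical resource such as a WordNet, and the class names and the restrictions for 'khaa' are assumptions chosen only to reproduce the example.

```python
# Hypothetical mini-lexicon standing in for a resource such as a WordNet.
SEMANTIC_CLASS = {"baccaa": "animate", "phala": "edible"}

# Hypothetical selectional restrictions for the verb 'khaa' (eat).
KHAA_RESTRICTIONS = {"k1": {"animate"}, "k2": {"edible"}}

def is_eligible(noun, label, restrictions=KHAA_RESTRICTIONS):
    """True if the noun's semantic class is allowed for the given karaka label."""
    return SEMANTIC_CLASS.get(noun) in restrictions.get(label, set())

print(is_eligible("baccaa", "k1"))  # child as karta of 'eat'
print(is_eligible("phala", "k1"))   # fruit as karta of 'eat'
```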
Sannidhi (Proximity)
The modifier and the modified tend to occur in close proximity in a sentence
For example,
'raama ne kelaa khaayaa, mohana ne duudha piyaa Ora hari ne film dekhii'
This Hindi example contains three verbs - khaayaa (ate), piyaa (drank) and dekhii (saw)
Respective arguments of each of these verbs would tend to occur in close proximity to it
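The proximity preference can be approximated with a very simple heuristic: attach each noun to the nearest verb on its right. This is a sketch under the assumption of verb-final clauses, with hand-listed `VERBS` and `NOUNS`; it is not how a full constraint-based parser would decide attachments.

```python
# Hand-listed word classes for the example sentence (assumptions for this sketch).
VERBS = {"khaayaa", "piyaa", "dekhii"}
NOUNS = {"raama", "kelaa", "mohana", "duudha", "hari", "film"}

def attach_to_nearest_verb(tokens):
    """Attach every noun to the closest verb to its right (verb-final assumption)."""
    attachments = {}
    pending = []                      # nouns seen since the last verb
    for tok in tokens:
        if tok in NOUNS:
            pending.append(tok)
        elif tok in VERBS:
            for noun in pending:
                attachments[noun] = tok
            pending = []
    return attachments

TOKENS = "raama ne kelaa khaayaa , mohana ne duudha piyaa Ora hari ne film dekhii".split()
ATTACHMENTS = attach_to_nearest_verb(TOKENS)
print(ATTACHMENTS)
```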
Karaka and Vibhakti
Two levels of analysis
• Syntactico-semantic relations:
  • Direct participants of the action denoted by a verb (Karaka)
  • Other relations: purpose, genitive, reason etc
• Relation markers (Vibhaktis)
Semantics of the verb
A verbal root denotes:
• The activity
• The result
Locus of activity : karta
Locus of result : karma
Verbal Root
  activity: karta
  result: karma
The boy opened the lock
k1 – karta, k2 – karma
karta, karma sometimes correspond to agent/theme, but not always
open
  k1: boy
  k2: lock
Action – bundle of sub-actions
The boy opened the lock with the key
The key opened the lock
The lock opened
Notion of vivaksha
Realization of speakers’ intention in a sentence
Sub-actions - Opening of lock
Action 1 The boy opened the lock with the key
Action 2 The key opened the lock
Action 3 The lock opened
Each sentence reflects speakers’ intention
Sub-actions - Opening of lock
open
  k1: boy
  k2: lock
  k3: key
open
  k1: key
  k2: lock
open
  k1: lock
k1 – karta (doer)
k2 – karma (affected)
k3 – karana (instrument)
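The three analyses of 'open' can be written down as plain data to show that the same participant receives different karaka labels depending on the sub-action in focus. The dictionary `ANALYSES` and the helper `roles_of` are illustrative names, not part of the formalism.

```python
# The three sentences and their karaka assignments, as listed above.
ANALYSES = {
    "The boy opened the lock with the key": {"k1": "boy", "k2": "lock", "k3": "key"},
    "The key opened the lock": {"k1": "key", "k2": "lock"},
    "The lock opened": {"k1": "lock"},
}

def roles_of(participant):
    """Map each sentence to the label the participant carries in it."""
    return {sentence: label
            for sentence, deps in ANALYSES.items()
            for label, word in deps.items()
            if word == participant}

print(roles_of("key"))   # karana in one sentence, karta in another
print(roles_of("lock"))  # karma twice, karta once
```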
Basic karaka relations
Only six
• karta – subject/agent/doer
• karma – object/patient
• karana – instrument
• sampradaan – beneficiary
• apaadaan – source
• adhikarana – location in place/time/other
Other relations
Other dependency relations
• Purpose, reason, direction etc
• Causatives, associatives, comparatives etc
• Genitive, adjective
Vibhaktis : Markers for karaka Relations
• Relation markers (Vibhaktis)
raama ne caakuu se seba kaaTaa
'Ram' 'erg' 'knife' 'with' 'apple' 'cut'
raama ne: karta (doer); caakuu se: karana (instrument); seba: karma (theme)
raama ne mohana ke_liye seba kaaTaa
‘Ram’ ‘erg’ ‘Mohan’ ‘for’ ‘apple’ ‘cut’
“Ram cut the apple for Mohan” (purpose)
maiM mohana ke_saatha baazaara gayaa
‘I’ ‘Mohan’ ‘with’ ‘market’ ‘went’
“I went to the market with Mohan” (associative)
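How vibhaktis narrow down the possible relations can be sketched as a lookup from marker to candidate relations. The table `VIBHAKTI_CANDIDATES` lists only the readings seen in the examples above (a marker such as 'se' can signal other relations as well), and the final choice would still depend on the verb's demand frame; both the table and `candidate_relations` are assumptions for illustration.

```python
# Candidate relations per vibhakti, restricted to the readings in the examples above.
# "0" stands for a noun group with no overt marker.
VIBHAKTI_CANDIDATES = {
    "ne": ["karta"],
    "se": ["karana"],
    "0": ["karta", "karma"],
    "ke_liye": ["purpose"],
    "ke_saatha": ["associative"],
}

def candidate_relations(groups):
    """Map each (noun, vibhakti) pair to the relations its marker allows."""
    return [(noun, VIBHAKTI_CANDIDATES.get(vib, [])) for noun, vib in groups]

# raama ne caakuu se seba kaaTaa
GROUPS = [("raama", "ne"), ("caakuu", "se"), ("seba", "0")]
for noun, relations in candidate_relations(GROUPS):
    print(noun, relations)
```

Here 'seba' stays ambiguous between karta and karma on the marker alone; it is resolved to karma because 'raama ne' already fills the karta demand.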
Syntactic Cues
Verbal inflections (Tense Aspect Modality (TAM))
• Passive: verb agrees with the karma
• Some other cases
raama ko jaanaa paDaa
‘Ram+to’ ‘go’ ‘had to’
“Ram had to go”
raama ko calanaa caahiye
‘Ram’ ‘to’ ‘walk’ ‘should’
“Ram should leave”
Example
raama jaataa hai
‘Ram’ ‘go+hab’ ‘pres’
“Ram goes”
jaa
  karta: raama
raama ko jaanaa paDaa
‘Ram+to’ ‘go’ ‘had to’
“Ram had to go”
jaa
  karta: raama
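The role of TAM as a syntactic cue can be sketched as a table from the verb's TAM to the vibhakti expected on its karta. The TAM labels (`taa_hai`, `yaa`, `naa_paDaa`, `naa_caahiye`) and the function name are assumptions made here; the pairings themselves are read off the examples in this deck.

```python
# Expected karta vibhakti per TAM, read off the examples in the slides.
# The TAM label spellings are hypothetical.
KARTA_VIBHAKTI_BY_TAM = {
    "taa_hai": "0",        # raama jaataa hai
    "yaa": "ne",           # raama ne ... kaaTaa (transitive verb)
    "naa_paDaa": "ko",     # raama ko jaanaa paDaa
    "naa_caahiye": "ko",   # raama ko calanaa caahiye
}

def can_be_karta(vibhakti, tam):
    """True if a noun group with this vibhakti can be the karta under this TAM."""
    return KARTA_VIBHAKTI_BY_TAM.get(tam) == vibhakti

print(can_be_karta("0", "taa_hai"))
print(can_be_karta("ko", "naa_paDaa"))
print(can_be_karta("ko", "taa_hai"))
```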
Relative Clause
A noun is modified by a clause with a relative pronoun as its co-referent
Example
meraa bhaaii jo dillii meM rahataa hai kala aa rahaa hai
‘my’ ‘brother’ ‘who’ ‘Delhi’ ‘in’ ‘live+hab’ ‘pres’ ‘tomorrow’ ‘come’ ‘prog’ ‘pres’
‘My brother who lives in Delhi is coming tomorrow’
How to represent this?
Two possible representations
Other Relative-Corelative Constructions
Adjective having a clausal modifier
tuma aisaa sundara ghara banaao jaisaa unakaa hai
‘you’ ‘such’ ‘beautiful’ ‘house’ ‘build’ ‘such-that’ ‘theirs’ ‘is’
“You build a house as beautiful as theirs”
banaao ‘build’ k1 k2
tuma ghara
adj sundara
jjmod aisaa
coref jo-vo-jjmod hai jaisaa unakaa
jaisaa usakaa
MWEs
Conjunct Verbs
((raama ne)) ((bahuta dera)) ((ravi kii)) ((pratiikshaa kii))
'Ram' 'erg' 'very' 'late' 'Ravi' 'of' 'wait' 'did'
“Ram waited for Ravi for a long time”
((kaaryashaalaa ke liye)) ((biisa logoM kaa)) ((naamaaMkana kiyaa gayaa))
'workshop' 'for' 'twenty' 'people' 'of' 'name registration' 'do+passive'
“Twenty people were registered for the workshop”
Conjunct Verbs
Conjunct verb ‘prashna kiyaa’ below
mohana ne ravi se prashna kiyaa
'Mohan' 'erg' 'Ravi' 'to' 'question' 'did'
“Mohan asked Ravi a question”
A conjunct verb can have partial modification
mohana ne acchaa prashna kiyaa thaa
'Mohan' 'erg' 'good' 'question' 'do+perf' 'past'
The elements in a complex predicate can also be discontinuous
prashna to mohana ne kiyaa thaa
'question' 'part' 'Mohan' 'erg' 'do+perf' 'past'
Conjunct Verbs
However,
Mohan ne ravi se acchaa prashna kiyaa
prashna_kiyaa ‘questioned’
  k1: mohan ne ‘Mohan’
  k2: ravi se ‘to Ravi’
  ?: acchaa ‘good’
'acchaa' is NOT a verb modifier; 'acchaa' modifies 'prashna' and not 'prashna kiyaa'
Solution ?
Conjunct Verbs
Solution
Don't chunk a conjunct verb as a single verbal unit
Thus,
Mohan ne ravi se ((acchaa)) ((prashna kiyaa))_VG
Revise to
Mohan ne ravi se ((acchaa prashna))_NP ((kiyaa))_VG
Conjunct Verbs
Show 'part-of' relation between the noun and the verb
Add a tag 'pof' to achieve the above
Therefore,
kiyaa
  k1: mohan ne
  k2: ravi se
  pof: prashna
    nmod: acchaa
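The effect of the 'pof' tag can be sketched with a small adjacency-list tree: the noun keeps its own modifier, and the complex predicate can still be recovered by joining the 'pof' child with the verb. The dictionary layout and the helpers `complex_predicate` and `modifiers` are assumptions made for this illustration.

```python
# Dependency tree for: Mohan ne ravi se acchaa prashna kiyaa
# Each head maps to a list of (label, dependent) pairs.
TREE = {
    "kiyaa": [("k1", "mohan"), ("k2", "ravi"), ("pof", "prashna")],
    "prashna": [("nmod", "acchaa")],
}

def complex_predicate(tree, verb):
    """Join the verb with its 'pof' children to recover the conjunct verb."""
    parts = [child for label, child in tree.get(verb, []) if label == "pof"]
    return "_".join(parts + [verb])

def modifiers(tree, node):
    """Return the dependents attached directly to a node."""
    return tree.get(node, [])

print(complex_predicate(TREE, "kiyaa"))
print(modifiers(TREE, "prashna"))
```

Because 'acchaa' hangs from 'prashna' and not from the verb, the adjective is never mistaken for a verb modifier.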
DS for Discontinuous Elements
prashna to mohana ne kiyaa thaa
• Use of pof (‘Part Of’ relation)
kiyaa
  k1: mohana
  pof: prashna
MWEs
Idioms
((kisaana kii)) ((patnii ko)) ((vaha ciDiyaa)) ((phuuTii aaMkha nahiiM suhaatii thii))
'farmer' 'of' 'wife' 'to' 'that' 'bird' 'not appealed'
The idiom ((phuuTii aaMkha nahiiM suhaatii thii)) is functionally a verb.
Idioms
Two possible solutions
phuuTii aaMkha suhaa <fs tam=nahiiM+taa_thaa> ‘not appealed’
  k1: patnii ‘wife’
    r6: kisaana ‘farmer’
  k2: vaha ciDiyaa ‘that bird’
Solution-1
Idioms
suhaa <fs tam=nahiiM+taa_thaa> ‘not appealed’
  k1: patnii ‘wife’
    r6: kisaana ‘farmer’
  k2: vaha ciDiyaa ‘that bird’
  pof: phuuTii aaMkha ‘burst eye’
Solution-2
Change of State Verbs
Change of state verbs such as ‘raMganaa’ (colour) pose a problem, as in:
((usane)) ((apanaa ghara)) ((piilaa)) ((raMgaa))
'he/she' 'own' 'house' 'yellow' 'coloured'
raMga ‘colour’
  k1: usane ‘he/she’
  k2: ghara ‘house’
  ?: piilaa ‘yellow’
Is 'piilaa' a complement of 'ghara'? OR is it the k2 of raMgaa?
If ‘piilaa’ is the k2 of raMgaa, then what is the relation of ‘ghara’ with ‘raMgaa’?
Can they both be k2?
Proposed Solution
In Panini's framework, verbs denoting 'change of state' can have two 'karma'
• The object which is being changed
• The state after change
Thus,
raMga ‘coloured’
  k1: usane ‘he’
  k2-1: ghara ‘house’
  k2-2: piilaa ‘yellow’
Conjuncts
Need special treatment in a dependency representation
(maiM baazaara gayaa)1 Ora (ve loga ghara para ruke)2
'I' 'market' 'went' 'and' 'those' 'people' 'home' 'at' 'stayed'
“I went to the market and those people stayed at home”
What is the head of a co-ordinate structure?
How to represent the equal status of 1 and 2 above?
Conjuncts
Take Conjunct as the 'head'
Label the relation as 'ccof'
Ora ‘and’
  ccof: gayaa ‘went’
    k1: maiM ‘I’
    k2: baazaara ‘market’
  ccof: ruke ‘stay’
    k1: loga ‘people’
    k7p: ghara ‘home’
A subordinating conjunct will have a single child node
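The conjunct-as-head convention can be sketched with the same kind of adjacency list: the conjunct node owns the clause heads through 'ccof', and the number of 'ccof' children separates co-ordination from subordination. The structure and the helper names `conjuncts` and `is_coordinating` are assumptions for this illustration.

```python
# Dependency tree for: maiM baazaara gayaa Ora ve loga ghara para ruke
TREE = {
    "Ora": [("ccof", "gayaa"), ("ccof", "ruke")],
    "gayaa": [("k1", "maiM"), ("k2", "baazaara")],
    "ruke": [("k1", "loga"), ("k7p", "ghara")],
}

def conjuncts(tree, conj):
    """Return the heads joined by the conjunct node, in order."""
    return [child for label, child in tree.get(conj, []) if label == "ccof"]

def is_coordinating(tree, conj):
    """A co-ordinating conjunct has two or more 'ccof' children; a subordinating one has a single child."""
    return len(conjuncts(tree, conj)) >= 2

print(conjuncts(TREE, "Ora"))
print(is_coordinating(TREE, "Ora"))
```

Both clause heads sit at the same depth under 'Ora', which is how the equal status of the two clauses is represented.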
Some Problem Cases
Certain complex sentences pose problems
For example:
agara tuma aate to hama vahaaM jaate
‘if’ ‘you’ ‘come’ ‘then’ ‘we’ ‘there’ ‘go’
“Had you come, we would have gone there”
Counterfactual
‘agara’ and ‘to’ are two connectives
How to represent the dependencies ?
Main Clause – Subordinate Clause
jaate ‘go+?’
  ?: agara
    ccof: aate
      k1: tuma
  ?: to
  k1: hama
  k7p: vahaaM
This representation fails to capture the relation between ‘agara’ and ‘to’
Representation-Currently Followed
to ‘then’
  ccof: jaate ‘go+?’
    vmod: agara
      ccof: aate 'come'
        k1: tuma 'you'
    k1: hama ‘we’
    k7p: vahaaM ‘there’
Alternative Proposal
agara-to
  pof: agara
    ccof: aate
      k1: tuma
  pof: to
    ccof: jaate
      k1: hama
      k7p: vahaaM
Treat ‘agara-to’ as a complex conjunct
Ellipsis
How to show dependencies when the head is missing ?
bacce baDe ho gaye haiM kisii kii baata nahiiM sunate
“The children have grown up, they don't listen to anyone”
No explicit conjunct!
Insert a NULL element to show the dependencies
NULL_CCP
  ccof: baDe_ho_gaye
  ccof: nahiiM_sunate
Insert a NULL node only if it is essential to represent the dependencies
Some English Examples
English is :
• A configurational language
• Relatively fixed word order
• Relations are not realised in affixes
• Subject and object are positional
• Subject is sacrosanct
Passive
Rama ate a banana
eat <fs tam=PAST>
  k1: Rama
  k2: banana
A banana was eaten by Rama
eat <fs tam=was_en>
  k2: banana
  k1: Rama
Extend the notion of vibhakti to English subject, object positions
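Treating English positions as vibhaktis can be sketched as a table keyed by TAM: the same slot maps to a different karaka in the active and in the passive. The slot names ("subject", "object", "by") and the function `assign_karakas` are assumptions introduced for this example; the TAM values are the ones in the feature structures above.

```python
# Position (or preposition) to karaka mapping, conditioned on the verb's TAM.
# Slot names are hypothetical labels for this sketch.
POSITION_TO_KARAKA = {
    "PAST": {"subject": "k1", "object": "k2"},   # Rama ate a banana
    "was_en": {"subject": "k2", "by": "k1"},     # A banana was eaten by Rama
}

def assign_karakas(tam, slots):
    """Turn {slot: word} into {karaka label: word} for the given TAM."""
    return {POSITION_TO_KARAKA[tam][slot]: word for slot, word in slots.items()}

ACTIVE = assign_karakas("PAST", {"subject": "Rama", "object": "banana"})
PASSIVE = assign_karakas("was_en", {"subject": "banana", "by": "Rama"})
print(ACTIVE)
print(PASSIVE)
```

Both sentences end up with the same karaka assignment, which mirrors the two trees above.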
Interrogatives
Did Rama eat a banana ?
A 'Yes-no' interrogative
Structurally,
• Interrogative is realised through word order change
• Subject – Auxiliary inversion
• No interrogative morpheme
Interrogative Contd.
Proposed solution:
eat <fs stype=interrogative__yes-no>
  fragof: Did
  k1: Rama
  k2: banana
Position gives the cues for the constraints
Interrogatives Contd.
What did Rama eat ?
eat <fs stype=interrogative__wh>
  k2: What
  fragof: did
  k1: Rama
Question element 'what' and Auxiliary position provide the syntactic cues
Control Verbs
John persuaded Harry to leave
persuade
  k1: John
  k2: Harry
  rt (?): leave
The object of persuade corefers to the 'missing' 'karta' of 'leave'
John promised Harry to leave
promise
  k1: John
  k4: Harry
  k2: leave
The subject of promise corefers to the 'missing' 'karta' of 'leave'
Verbs such as 'want'
John wanted Harry to leave
want
  k1: John
  k2: leave
    k1: Harry
'want' is a transitive verb and can take 'a clause' as its 'karma'
Empty 'it'
It is raining in Delhi
rain <fs stype=expletive__it>
  k7p: Delhi
Possible representation
Empty 'it' can be captured in the feature structure