Two-Stage Constraint Based Sanskrit Parser
Akshar Bharati,
IIIT, Hyderabad
Brief outline
Dependency Paninian framework
  vibhakti-karaka correspondence
  karaka frames (basic + transformation)
  Source groups, demand groups
Constraints
  Three basic constraints
  Constraints as integer programming equations
Notions from Paninian Framework – a) Karaka relations
The parser uses the notion of karaka relations between verbs and nouns in a sentence.
The notion of karaka relations is central to the Paninian model.
The karaka relations are syntactico-semantic (or semantico-syntactic) relations between the verbals and other related constituents in a sentence.
Notions from Paninian Framework – Demand Frames
For the task of karaka assignment, the core parser uses the fundamental principles of 'akanksha' (demand unit) and 'yogyata' (qualification of the source unit).
Ex: CAwraH vixyAlayam gacCawi
    (student) (school) (go)
Verb Frame for this form of “gacCawi”
Demand Frame Gam1:
-----------------------------------------------------------------
arc-label   necessity   vibhakti   lex-type   src-pos   arc-dir
-----------------------------------------------------------------
K1          m           1          n          l         ds
K2          m           2          n          l         ds
K3          m           3          n          l         ds
K5          m           5          n          l         ds
Constraint Based Parsing
Computational Paninian Model
Integer programming with basic constraints:
1. For each mandatory karaka in a karaka chart, there should be exactly one outgoing edge labelled by that karaka from the demand group.
2. For each desirable or optional karaka in a karaka chart, there should be at most one outgoing edge labelled by that karaka from the demand group.
3. There should be exactly one incoming arc into each source group.
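The three constraints can be read as equalities and inequalities over 0/1 variables, one variable per candidate arc. The sketch below is only an illustration of that reading: it enumerates every 0/1 assignment exhaustively instead of calling an integer programming solver, and the demand group `V`, the source groups `A` and `B`, and the function name `solve` are all hypothetical, not taken from the parser.

```python
from itertools import product

def solve(candidates, mandatory, optional, sources):
    """Return every selection of candidate arcs that satisfies the three constraints.

    candidates: list of (demand, label, source) triples; each gets a 0/1 variable.
    mandatory / optional: dict mapping a demand group to a set of karaka labels.
    sources: list of source groups that must each receive one incoming arc.
    NOTE: brute-force enumeration, an assumption made for illustration only.
    """
    solutions = []
    for bits in product((0, 1), repeat=len(candidates)):
        chosen = [arc for arc, bit in zip(candidates, bits) if bit]

        def count(demand, label):
            return sum(1 for d, l, _ in chosen if d == demand and l == label)

        # Constraint 1: exactly one outgoing edge per mandatory karaka.
        ok = all(count(d, l) == 1 for d, labels in mandatory.items() for l in labels)
        # Constraint 2: at most one outgoing edge per optional karaka.
        ok = ok and all(count(d, l) <= 1 for d, labels in optional.items() for l in labels)
        # Constraint 3: exactly one incoming arc per source group.
        ok = ok and all(sum(1 for _, _, s in chosen if s == src) == 1 for src in sources)
        if ok:
            solutions.append(chosen)
    return solutions

# Hypothetical ambiguous input: either source could fill either karaka.
candidates = [("V", "k1", "A"), ("V", "k2", "A"), ("V", "k1", "B"), ("V", "k2", "B")]
solutions = solve(candidates, mandatory={"V": {"k1", "k2"}}, optional={"V": {"k3"}},
                  sources=["A", "B"])
for parse in solutions:
    print(parse)
```

With this toy input two parses survive, which is the situation where a parser has to rank or pick among multiple solutions.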
Parser
Two stage strategy
Stage I (Intra-clausal relations)
  Dependency relations marked
  Relations such as k1, k2, k3, etc. for each verb
Stage II (Inter-clausal relations & conjunct relations)
  Conjuncts and relative clauses
Steps in Parsing
SENTENCE
-> Morph, POS tagging, Chunking
-> Identify Demand Groups
-> Load Frames & Transform
-> Find Candidates
-> Apply Constraints & Solve
-> Is Complex?
   NO: Final Parse
   YES: STAGE - II
Morph, Chunked, Tagged data
((
1 (( NP <fs af='CAwra,n,m,sg,,o,1,1'>
1.1 CAwraH NN <fs af='CAwra,n,m,sg,,o,1,1'>
))
2 (( NP <fs af='vixyAlaya,n,m,sg,,d,2,2'>
2.1 vixyAlayam NN <fs af='vixyAlaya,n,m,sg,,d,2,2'>
))
3 (( VGF <fs af='gam1,v,,sg,3,,karwari_lat,' gaNaH='BvAxiH' paxI='parasmEpaxI' XAwuH='gamLz'>
3.1 gacCawi VM <fs af='gam1,v,,sg,3,,karwari_lat,' paxI='parasmEpaxI' gaNaH='BvAxiH' XAwuH='gamLz'>
))
))
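To make the `af` attribute in the data above easier to read, here is a minimal sketch that splits it into named fields. The field names in `AF_FIELDS` are an assumption based on the usual eight-slot layout of `af` in this annotation format (root, category, gender, number, person, case, vibhakti or TAM, suffix); the slides themselves do not name the slots, and `parse_af` is a hypothetical helper.

```python
import re

# Assumed names for the eight comma-separated slots of af='...'.
AF_FIELDS = ("root", "category", "gender", "number", "person",
             "case", "vibhakti_or_tam", "suffix")

def parse_af(fs: str) -> dict:
    """Extract the af='...' value from a feature structure and name its slots."""
    match = re.search(r"af='([^']*)'", fs)
    if match is None:
        raise ValueError("no af attribute found")
    values = match.group(1).split(",")
    # Pad short values so every field is present (empty string = unspecified).
    values += [""] * (len(AF_FIELDS) - len(values))
    return dict(zip(AF_FIELDS, values))

noun = parse_af("<fs af='CAwra,n,m,sg,,o,1,1'>")
verb = parse_af("<fs af='gam1,v,,sg,3,,karwari_lat,' paxI='parasmEpaxI'>")
print(noun["root"], noun["vibhakti_or_tam"])   # root and vibhakti of the noun
print(verb["root"], verb["vibhakti_or_tam"])   # root and TAM label of the verb
```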
CAwraH <fs af='CAwra,n,m,sg,,o,1,1'>
vixyAlayam <fs af='vixyAlaya,n,m,sg,,d,2,2'>
gacCawi <fs af='gam1,v,,sg,3,,karwari_lat,' paxI='parasmEpaxI' gaNaH='BvAxiH' XAwuH='gamLz'>
Demand Frame Gam1:
-----------------------------------------------------------------
arc-label   necessity   vibhakti   lex-type   src-pos   arc-dir
-----------------------------------------------------------------
K1          m           1          n          l         ds
K2          m           2          n          l         ds
K3          m           3          n          l         ds
K5          m           5          n          l         ds
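The "Find Candidates" step can be pictured as matching each source word against the rows of the demand frame. The sketch below is a simplified illustration under stated assumptions: it matches only on the vibhakti and lex-type columns, ignores src-pos and arc-dir, and the dictionary layout and the name `find_candidates` are hypothetical rather than the parser's own data structures.

```python
# Rows of the Gam1 demand frame shown above (src-pos and arc-dir omitted here).
FRAME_GAM1 = [
    {"arc_label": "K1", "necessity": "m", "vibhakti": "1", "lex_type": "n"},
    {"arc_label": "K2", "necessity": "m", "vibhakti": "2", "lex_type": "n"},
    {"arc_label": "K3", "necessity": "m", "vibhakti": "3", "lex_type": "n"},
    {"arc_label": "K5", "necessity": "m", "vibhakti": "5", "lex_type": "n"},
]

# Source words with the category and vibhakti read off the morph output.
WORDS = [
    {"form": "CAwraH", "category": "n", "vibhakti": "1"},
    {"form": "vixyAlayam", "category": "n", "vibhakti": "2"},
]

def find_candidates(verb, frame, words):
    """Return (verb, arc_label, word) for every word that qualifies for a frame row."""
    return [
        (verb, row["arc_label"], word["form"])
        for word in words
        for row in frame
        if row["vibhakti"] == word["vibhakti"] and row["lex_type"] == word["category"]
    ]

candidate_arcs = find_candidates("gacCawi", FRAME_GAM1, WORDS)
print(candidate_arcs)
```

For this sentence each noun qualifies for exactly one row, so the candidate set is already unambiguous before any constraints are applied.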
Parse of CAwraH vixyAlayam gacCawi:
gacCawi -k1-> CAwraH
gacCawi -k2-> vixyAlayam
Sanskrit Example
CAwraH vixyAlayam gacCawi
Steps (Stage II)
Output of STAGE - I
-> Identify New Demand Groups
-> Load Frames & Transform
-> Find Candidates
-> Apply Constraints & Solve
-> Repair
-> FINAL PARSE
Example – Relative Clause
vaha puswaka jo rAma ne mohana ko xI hE prasixXa hE
that book which Ram ERG. Mohana DAT. gave is famous is
‘The book which Ram gave to Mohana is famous’
Output after Stage - I
[Figure: dependency tree over the nodes _ROOT_, xI, hE, vaha, puswaka, jo, rAma, mohana, prasixXa, with arc labels main (twice), k1 (twice), k2, k4 and k1s]
Identify the demand group
xiyA ‘give’: main verb of the relative clause
Identify the demand group, Load and Transform DF
jo ‘which’ transformation (special)
Transforms the demand frame of the main verb of the relative clause
-----------------------------------------------------------------------------
arc-label    necessity   vibhakti   lextype   src-pos   arc-dir   oprt
-----------------------------------------------------------------------------
nmod__relc   m           any        n         r|l       p         insert
-----------------------------------------------------------------------------
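One way to picture the jo transformation is as a small edit script applied to a copy of the verb's demand frame, where the `oprt` column names the operation. The sketch below is a guess at that mechanism, not the parser's implementation: only the `insert` operation shown on the slide is handled, the base rows in `BASE_FRAME_XE` are hypothetical placeholders, and `apply_transformation` is an invented name.

```python
from copy import deepcopy

# The transformation row shown above, as a dictionary.
JO_TRANSFORMATION = [
    {"arc_label": "nmod__relc", "necessity": "m", "vibhakti": "any",
     "lextype": "n", "src_pos": "r|l", "arc_dir": "p", "oprt": "insert"},
]

# Hypothetical basic frame for the verb; the real rows are not given in the slides.
BASE_FRAME_XE = [
    {"arc_label": "k1", "necessity": "m"},
    {"arc_label": "k2", "necessity": "m"},
    {"arc_label": "k4", "necessity": "d"},
]

def apply_transformation(frame, transformation):
    """Return a transformed copy of frame; the input frame is left untouched."""
    new_frame = deepcopy(frame)
    for row in transformation:
        entry = {key: value for key, value in row.items() if key != "oprt"}
        if row["oprt"] == "insert":
            new_frame.append(entry)  # add a new demand row to the frame
        else:
            # Other operations may exist but are not described in the source.
            raise ValueError(f"unsupported operation: {row['oprt']}")
    return new_frame

transformed = apply_transformation(BASE_FRAME_XE, JO_TRANSFORMATION)
print([row["arc_label"] for row in transformed])
```

Copying the frame before editing keeps the basic frame reusable for other clauses in which the same verb appears without a relative pronoun.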
Karaka Frame
vaha puswaka jo rAma ne mohana ko xI prasixXa hE |
that book which Ram ERG. Mohana DAT. gave famous is
‘The book which Ram gave to Mohana is famous’
Main verb of relative clause
Transformed frame for xe after applying the jo transformation
-----------------------------------------------------------------------------
arc-label    necessity   vibhakti   lextype   src-pos   arc-dir   oprt
-----------------------------------------------------------------------------
nmod__relc   m           any        n         r|l       p         insert
-----------------------------------------------------------------------------
New row inserted after transformation
Possible candidates
vaha puswaka jo rAma ne mohana ko xI hE prasixXa hE |
nmod__relc
Output after Stage - II
[Figure: dependency tree over the nodes _ROOT_, xiyA hE, hE, vaha puswaka, jo, rAma, mohana, prasixXa, with arc labels main, nmod__relc, k1 (twice), k2, k4 and k1s]
Example II – Coordination
rAma Ora siwA kala Aye |
Ram and Sita yesterday came
‘Ram and Sita came yesterday’
Output of Stage - I
[Figure: graph over the nodes _ROOT_, Aye, rAma, siwA, Ora, kala, with arc labels main, k1, k7t and dummy (twice)]
For Stage – II (Constraint Graph)
[Figure: constraint graph over the nodes _ROOT_, Aye, rAma, siwA, Ora, kala, with arc labels main, k1, k7t and ccof (twice)]
Candidate Arcs
[Figure: candidate arcs over the nodes _ROOT_, Aye, rAma, siwA, Ora, kala, with arc labels main, k1 (three times) and ccof (twice)]
Solution Graph
[Figure: solution graph over the nodes _ROOT_, Aye, rAma, siwA, Ora, kala, with arc labels main, k1, k7t and ccof (twice)]
Parse tree (output after Stage II)
_ROOT_ -main-> Aye
Aye -k7t-> kala
Aye -k1-> Ora
Ora -ccof-> rAma
Ora -ccof-> siwA
Results for Hindi
Results
CBP: Results when only the first parse is considered
CBP’’: Results when the best of the first 25 parses is considered
CBP was tested on 220 sentences.
These are the results published in IALP-2008.
Work Progress in Sanskrit
Existing Constraint Based parser for Sanskrit can parse simple sentences.
Over 2000 demand charts
Two stage parsing needs more development
Experiments performed with 268 simple sentences
Re-ranking of parses is not done; only the first parse is considered for results
Results not very accurate due to data problems
Results in Sanskrit
Labelled attachment score: 540 / 1213 * 100 = 44.52 %
Unlabeled attachment score: 876 / 1213 * 100 = 72.22 %
Label accuracy score: 566 / 1213 * 100 = 46.66 %
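The three scores above are simple ratios over the 1213 scored tokens. The sketch below shows how such scores are conventionally computed from gold and predicted (head, label) pairs and re-derives the reported percentages from the counts on the slide. The function name `attachment_scores` and the three-token toy input are hypothetical; only the counts 540, 876, 566 and 1213 come from the slide.

```python
def attachment_scores(gold, predicted):
    """Return (LAS, UAS, LA) in percent for per-token (head, label) pairs."""
    total = len(gold)
    las = sum(1 for g, p in zip(gold, predicted) if g == p)        # head and label correct
    uas = sum(1 for g, p in zip(gold, predicted) if g[0] == p[0])  # head correct
    la = sum(1 for g, p in zip(gold, predicted) if g[1] == p[1])   # label correct
    return tuple(round(100 * n / total, 2) for n in (las, uas, la))

# Toy input: the second token gets the right head but the wrong label.
gold = [(3, "k1"), (3, "k2"), (0, "main")]
predicted = [(3, "k1"), (3, "k1"), (0, "main")]
toy_scores = attachment_scores(gold, predicted)
print(toy_scores)

# Re-deriving the reported percentages from the counts on the slide.
reported = {name: round(100 * correct / 1213, 2)
            for name, correct in (("LAS", 540), ("UAS", 876), ("LA", 566))}
print(reported)
```

The gap between the unlabelled score and the other two indicates that most errors are in choosing the karaka label, not in choosing the head.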
Treebank requirement
Proper gold-tagged, chunked and dependency-marked data for Sanskrit will improve the efficiency of the parser
Annotation with proper tools
It will also help us in using machine learning methods to train statistical parsers for Sanskrit
Further work on Constraint Based Parsing
Extension of the parser using treebank data
Hybrid approaches
  Soft constraints
  Pruning of the graph in data-driven parsers using the constraint graph
  Allow learning of the parser from the treebank data
Better performance
What we expect from data
((
1 (( NP <fs af='CAwra,n,m,sg,,o,1,1' drel='k1:3' name='1'>
1.1 CAwraH NN <fs af='CAwra,n,m,sg,,o,1,1'>
))
2 (( NP <fs af='vixyAlaya,n,m,sg,,d,2,2' drel='k2:3' name='2'>
2.1 vixyAlayam NN <fs af='vixyAlaya,n,m,sg,,d,2,2'>
))
3 (( VGF <fs af='gam1,v,,sg,3,,karwari_lat,' name='3' gaNaH='BvAxiH' paxI='parasmEpaxI' XAwuH='gamLz'>
3.1 gacCawi VM <fs af='gam1,v,,sg,3,,karwari_lat,' paxI='parasmEpaxI' gaNaH='BvAxiH' XAwuH='gamLz'>
))
))
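Data annotated this way can be turned into dependency arcs by reading the `drel` and `name` attributes on the chunk lines. The following is a minimal sketch under the assumption that `drel='label:head'` points at the `name` of the head chunk, as the sample above suggests; the regular expressions and the helper `read_arcs` are illustrative and not part of any existing toolkit.

```python
import re

# Chunk lines look like: 1 (( NP <fs af='...' drel='k1:3' name='1'>
CHUNK_RE = re.compile(r"^\s*\d+\s+\(\(\s+(\S+)\s+<fs (.*)>\s*$")
ATTR_RE = re.compile(r"(\w+)='([^']*)'")

SAMPLE = """((
1 (( NP <fs af='CAwra,n,m,sg,,o,1,1' drel='k1:3' name='1'>
1.1 CAwraH NN <fs af='CAwra,n,m,sg,,o,1,1'>
))
2 (( NP <fs af='vixyAlaya,n,m,sg,,d,2,2' drel='k2:3' name='2'>
2.1 vixyAlayam NN <fs af='vixyAlaya,n,m,sg,,d,2,2'>
))
3 (( VGF <fs af='gam1,v,,sg,3,,karwari_lat,' name='3' gaNaH='BvAxiH' paxI='parasmEpaxI' XAwuH='gamLz'>
3.1 gacCawi VM <fs af='gam1,v,,sg,3,,karwari_lat,' paxI='parasmEpaxI' gaNaH='BvAxiH' XAwuH='gamLz'>
))
))"""

def read_arcs(ssf_text):
    """Return (head_root, label, dependent_root) for every chunk carrying a drel."""
    chunks = []
    for line in ssf_text.splitlines():
        match = CHUNK_RE.match(line)
        if match:  # token lines (1.1, 2.1, ...) and brackets do not match
            chunks.append(dict(ATTR_RE.findall(match.group(2))))
    # Map each chunk name to the root recorded in the first slot of af.
    roots = {c["name"]: c["af"].split(",")[0] for c in chunks}
    arcs = []
    for chunk in chunks:
        if "drel" in chunk:
            label, head = chunk["drel"].split(":", 1)
            arcs.append((roots[head], label, roots[chunk["name"]]))
    return arcs

arcs = read_arcs(SAMPLE)
print(arcs)
```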
THANKS!!