Digital Historical Corpora, Dagstuhl meeting, December 03-08, 2006. Annotation and POS-tagging issues: Participle Constructions in Old Bulgarian. Mila Dimitrova-Vulchanova & Valentin Vulchanov, NTNU, Norway


Page 1

Digital Historical Corpora
Dagstuhl meeting, December 03-08, 2006

Annotation and POS-tagging issues: Participle Constructions in Old Bulgarian

Mila Dimitrova-Vulchanova & Valentin Vulchanov

NTNU, Norway

Page 2

Participles in Old Bulgarian

1. Frequency and general distribution: very high frequency (also compared to the NT Greek source text). Found in (quite!) a number of well-defined contexts.

2. Our data come from the electronic corpus of Old Bulgarian nominal phrases from Codex Suprasliensis (at http://www.hf.ntnu.no/hf/adm/forskning/prosjekter/balkansim/databases.html)

Page 3

Participles in Old Bulgarian

3. A split category: both verbal and nominal morphological marking

Verbal: tense, voice, aspect (present & past1/past2, active & passive)

Nominal: case, gender, number (sg, pl & dual)

Likewise reflected in their function
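The split between verbal and nominal marking can be made concrete with a small sketch. It assumes the comma-separated gloss labels used on the example slides (e.g. "Part,Pres,Act,N,m."), and the two label tables and the function name `split_gloss` are illustrative assumptions, not part of the authors' corpus tools. The case readings (N = nominative, D = dative, and so on) are likewise our reading of the abbreviations.

```python
# Hypothetical label tables, built from the abbreviations seen in the slide glosses.
VERBAL_LABELS = {
    "Pres": ("tense", "present"),
    "Past": ("tense", "past"),
    "Act": ("voice", "active"),
    "Pass": ("voice", "passive"),
}
NOMINAL_LABELS = {
    "N": ("case", "nominative"),
    "GEN": ("case", "genitive"),
    "D": ("case", "dative"),
    "ACC": ("case", "accusative"),
    "INST": ("case", "instrumental"),
    "LOC": ("case", "locative"),
    "m": ("gender", "masculine"),
    "f": ("gender", "feminine"),
    "n": ("gender", "neuter"),
    "Sg": ("number", "singular"),
    "Pl": ("number", "plural"),
    "Du": ("number", "dual"),
}


def split_gloss(gloss):
    """Split a gloss string into (verbal, nominal) feature dictionaries."""
    verbal, nominal = {}, {}
    for raw in gloss.split(","):
        # Labels are case-sensitive: "N" is a case, "n" is a gender.
        label = raw.strip().rstrip(".")
        if label in VERBAL_LABELS:
            key, value = VERBAL_LABELS[label]
            verbal[key] = value
        elif label in NOMINAL_LABELS:
            key, value = NOMINAL_LABELS[label]
            nominal[key] = value
        # Anything else (e.g. "Part") is ignored in this sketch.
    return verbal, nominal


verbal, nominal = split_gloss("Part,Pres,Act,N,m.")
print(verbal)
print(nominal)
```

A single participle token thus carries both feature sets at once, which is the property that makes a single POS label hard to choose.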

Page 4

Participles in OB

4. Main function(s): as a modifier

adnominal (within the extended nominal projection), and

adverbial (within the extended verbal/clausal projection)

Page 5

The adverbial function

4. The adverbial function: a circumstantial modifier according to traditional grammars (e.g. Colwell & Tune 2001)

when related to a noun or pronoun – usually the subject – this Participle supplies the circumstance of some action which supplements or qualifies the action of the main verb

in the Greek parallel this participle occurs without the definite article

Page 6

Examples

(1) a. и [разгнѣвавъ сѧ] аѵрилиаянъ повелѣ
and enrage Past,Act,N,m. Cl,Refl Aurelian, PN,N,m. order, Aor,3,Sg

съвалъмы оловѣны бити я
ball (metal moulding) Pl,INST,m. (of) lead, DA,Pl,INST,m. beat Inf. them Du,ACC

по челюстьма " [глаголѧ ѥмѹ] ...
on (in) jaw, Du,D,f. say, Part,Pres,Act him (CS 1, 10-12)

a’ and become angry Part,Past,N,m. Aurelian, PN,N,m. order Aor,3,Sg.

ball Pl,D (of) lead GEN beat, Inf. he (his), GEN,m.

Art,Pl,ACC,f. cheek, Pl,ACC,f. say, Part,Pres,N,m. him

Page 7

Examples

b. отъвѣштаваи вѣды CP яко
answer, Imper,Sg know, Pres,Act,N,m. that

прѣдъ цѣсаремь стоиши
before king, INST,m. stand, 2,Sg (CS 1, 13-14)

b’ speak, Imper. know, Part,Pres,N,m. that

king, D,m. stand before, 2,SG

Page 8

Examples

c. и отъведъ паѵла свѣне сѫдишта
and lead away Past,Act,N,m. Paul, PN P (away from) court, GEN,n.

призъвавъ иѹаияниѭ рече къ неи"
call Past,Act,N,m. Iuliania, PN,ACC say, Aor to her (CS 1, 15-16)

c’ and remove, Part,Aor. Art,ACC,m. Paul PN,ACC away from Art,GEN,n. court GEN,n.

call (summon) Part,Aor.3,Sg Art,ACC,f. Juliania, PN,ACC,f. say Aor,3,Sg. she, D,f.

d. ѹмолена быв'ши отъ мене" не прѣльштаи сѧ ...
V-main, Pass V-Aux.
entreat Past,Pass,N,f. become Past,Act,N,f. by me Neg delude Imper,Sg Cl,Refl (CS 1, 18-19)

d’ beg, Part,Pass,N,f. by me, GEN,m. Neg. go astray, Imper.

Page 9

More examples

(2) a. и се чисто и прѣчисто_i имѫшти t_i дѣвъство
and here is pure ACC,n and over-pure,ACC,n. have, Pres,Act.N.f. virginity,ACC,n.

възвраштаѥтъ сѧ видѣти ...
(she) come back 3,Sg Cl,Refl (to)see Inf (CS 4, 24-25)

a’. and here is virtuous ACC,f. and pure ACC,f. have,Part, Pres,N,f.

Art,ACC,f. virginity, ACC,f. return, 3,Sg. (to) see, Inf.

Page 10

Adnominal function

5. The adnominal (attributive) function

functions as a modifier inside the nominal expression very much like other modifiers (e.g., adjectives)

the Greek parallel usually has the definite article; however, our data do not attest a clear and reliable correlation

linear position inside the nominal string: predominantly post-nominal (as a heavy modifier, cf. Dimitrova-Vulchanova & Vulchanov 2003, in press/a)

Page 11

Examples

(3) a. стоѧштааго тѹ народа
stand Pres,Act,GEN,m. there, Adv. people, GEN,m. (CS 2, 27)

a’ Art,Pl,GEN stand around or by Part,Pres,Act,Pl,GEN,m. people, Pl,ACC,m.

The adverb тѹ is added in the OB translation and most likely translates the additional semantic information (e.g. around or by) of the verb

Page 12

Examples

b. яко кони ръжѫште о добротѣ ѥѩ
like horse, Pl,N,m. neigh, Part,Pl,N,m. on goodness,LOC,f. she, GEN,f.

b’ like horse Pl,N,m. neigh Pl,N,m. on (at) Art,D,f. goodness, D,f. she, GEN,f.

(CS 2, 30 – 3,1)

Observe that in the Greek text the Participle is without the Article

Page 13

Examples

c. огню вѣчьнѹѹмѹ ѹготованѹѹмѹ
fire D,m. eternal, D,m. prepare, Pres,Pass,D,m.

тебѣ и дияволѹ
you D,m. and devil, D,m. (CS 4, 13)

c’. Art,N,n. fire Art,N,n. eternal Art,N,n. prepare Part,Pass,ACC,n. you, D

and Art,D,m. devil D

Page 14

A third function?

Substantivised function (a verbal Noun)

occurs in a sentence in any position where a noun or pronoun could be used (in our terms stands for an NP)

the Greek parallel usually has the definite article; however, this can be misleading: the article in NT Greek is a marker at the phrasal level (e.g., the NP as a whole) rather than the word/token level

Page 15

Examples

(4) a. загради ѹста плѣжѫштиихъ
block (close) Aor,3,Sg mouth, Pl,ACC,f. crawl (reptile) Substant,Pres,Act, Pl,GEN,n. (CS 1, 8)

a’ shut Aor,3,Sg Art,ACC,n. mouth, ACC,n. Art,Pl,GEN reptile,Noun,Pl,GEN,n

b. покровитель боѧштиихъ сѧ ѥго
protector, N,m. fearing, Pres,Act,Pl,GEN,m. Cl,Refl. he, GEN,m. (CS 3, 14-15)

b’ Art Pl,D,m. fear, Part,Pl,D,m. he, ACC,m.

Page 16

Examples

c. молитвы дръжимыихъ адомъ
prayer, Pl,ACC,f. hold, Pres,Pass,Pl,GEN,m. hell, INST,m. (CS 10, 8-9)

c’ Art,Pl,GEN hold, Part,Pl,GEN by Art,GEN,m. Hades (hell) GEN,m.

d. раждаѥмоѥ отъ тебе с҃то
give birth Part,Pres,Pass,N,n. from (by) you, GEN holy, Subst,N,n. (CS 10, 26)

d’. Art,N,n. give birth Part,Pass,N,n. from you, GEN holy, N,n.

Page 17

A third function

• we consider those instances in no way different from the regular adnominal use, that is, as NPs (nominal expressions) with a non-overt (empty) N head (in line with decisions in the Penn-Helsinki corpus of Middle English)

• Issue: POS inventories adopt a separate POS specification for this category, treating it as a substantivised item (e.g. in ACT); but what about the attributes inherited from V? Moreover, these are not lexicalised items at this stage of the language!

Page 18

Structure

6. Structure of participle constructions depending on function

a general observation: a fully-fledged structure including complements, adjuncts and in some cases also subjects, regardless of function!

The participial subject, when overt, is oblique (Dative) in the default case and Accusative in ECM (V-complement) environments; most commonly, however, it is a PRO

Page 19

Structure

judging by the functional categories marked on the participle, and the occasional presence of subjects, it is at least a TP/IP (a clause-size category); no evidence of CP structure except for complement function (Manolessou 2005 for AG/NTG)

in contrast to other non-finite constructions (e.g., infinitives) the base constituent order is (S)VO (cf. Dimitrova-Vulchanova & Vulchanov, in press/b), however not without its exceptions

Page 20

Examples

(5) a. и [пришедъши ѥи къ сѫдиштѫ]
and come, Part,Past,Act,D,f. she D to court, D,n. (CS 4, 28)

a’ come, Part,GEN,f. she, GEN,f. to Art,ACC,n. court, ACC,n.

Observe that the Dativus Absolutus in the Old Bulgarian text corresponds to Genitive Absolutus in the Greek text. No Article in Greek.

Page 21

Examples

b. дѣвѣ с҃тѣ и прѣчистѣ
virgin, D,f. holy, D,f. and overpure, D,f.

отъ рода цѣсарьска сѫшти ]
from kin, GEN,m. king, DA,GEN,m. be, Part,D,f. (CS 10, 21-22)

b’ virgin D,f. holy pure

Encl.Part from kin, GEN prophet DA,GEN happen to be, Part,Pres,D,f.

Page 22

Other functions

7. Predicative and complement uses

not so frequent, however cannot be ignored

Predicative function: ambiguous between a periphrastic tense (e.g., on a par with the English Progressive be + V-ing construction) and a copular use; however, no such periphrastic form is otherwise attested in Old Bulgarian

most likely a common Indo-European type (cf. Latin participle constructions and the origin of the English be + V-ing ... se him fultumiende wæs (this (one) him helping was), cf. Berndt 1984).

As the head of the complement of a matrix verb (e.g., in Exceptional Case marking environments, cf. the verb мѣнити (consider) in (6c)).

Page 23

Examples

(6) a. мнози PP от DP стоѧштааго тѹ народа
many, Pl,N,m. of stand, Part,GEN,m. there, Adv. people (crowd), GEN,m.

бѣахѫ текѫште • и рьвънѹѭште купьно
be, Imperf,3,Pl. run Pres,Act,Pl,N and strive Pres,Act,Pl,N all together, Adv.

к'то прьвоѥ вьнидеть къ неи
who first, Adv. make way, 3,Sg. to her (CS 2, 27-29)

a’ ... be, Imperf,3,Pl run (rush), Part,Pres,Pl,N

quarrel, Part,Pres,Pl,N to (against) one another, Pl,ACC

Page 24

Examples

b. виждѫ бо SC тѧ дѣвицѫ мѫдрѫ сѫштѫ
see 1,Sg. Cl,ACC girl, ACC,f. wise, ACC,f. be, Pres,Act,ACC,f.

и м'ногѫ прѣмѫдрость имѫштѫ
and much (great) wisdom, ACC,f. have, Pres,Act,ACC,f. (CS 1, 20-21)

b’ see 1,Sg you, Cl,ACC girl, ACC,f. wise, ACC,f. be, Part,Pres,ACC,f

much, GEN,f. wisdom, GEN,f. have, Part,Pres,Act,ACC,f.

Page 25

Examples

c. к'де ѥстъ сестра твоя
where is sister, N,f. your, N,f.

ѭже ты мѣниши IP дѣвѫ сѫштѫ
who, ACC you mean, 2,Sg virgin, ACC be, Part,ACC,f. (CS 4, 15-16)

c’. where is 3,Sg Art sister, N,f. your

who, ACC,f. mean 2,Sg be, Inf. virgin, ACC

Page 26

Internal syntax

8. The internal syntax of participle constructions

in attributive use: participles surface in the specifier of a functional Agr projection inside the extended nominal projection, with agreement features (number, gender, Case) inherited from the head noun (and checked in a spec-head configuration inside the FP)

in complement position: the participle gets features from two sources, number and gender from its subject and Case from the matrix verb

in adjunct position: agreement features are more difficult to account for, even if the participle is assumed to occur in the specifier of a functional projection (à la Cinque 1999, very much like DP-internal modifiers); agreement with the matrix term is still loose (cf. Manolessou 2005 for Greek)
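The three cases above can be summarised as a lookup from syntactic position to the source of each agreement feature. This is only a minimal sketch of the slide's claims: the table name `FEATURE_SOURCES`, the position keys and the use of `None` for the adjunct case (where the slide says the features are hard to account for) are assumptions made for illustration.

```python
# Where each agreement feature on the participle comes from, by position.
# None marks the adjunct case, for which the slide offers no settled source.
FEATURE_SOURCES = {
    "attributive": {
        "number": "head noun",
        "gender": "head noun",
        "case": "head noun",
    },
    "complement": {
        "number": "participial subject",
        "gender": "participial subject",
        "case": "matrix verb",
    },
    "adjunct": {
        "number": None,
        "gender": None,
        "case": None,
    },
}


def feature_source(position, feature):
    """Return the source of an agreement feature, or None if unresolved."""
    return FEATURE_SOURCES[position][feature]


print(feature_source("complement", "case"))
print(feature_source("adjunct", "gender"))
```

The complement row is the one with two distinct sources, which is why a tag assigned from morphology alone cannot say which element the participle is agreeing with.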

Page 27

Internal structure and features

A similar situation obtains in resumptive contexts whereby the participle agrees with an overt Dative subject of its own while being related to a matrix term with no match in case or gender features

(7) не послѹщаѥте ли [DP мене
Neg. listen 2,Pl. QCl. me GEN,m

[IP молѧштѹ ми сѧ вамъ] ] (CS 178, 2-3)
beg Part,D,m. Cl,D Refl,Cl you Pl,D,m.

“If (you) don’t listen to me who is begging you”

Page 28

The issue

Warning in the Penn-Helsinki annotation manual: participle clauses can often be confused with reduced relatives, usually labelled absolute clauses (PP-ABS), however

The problem is

how to distinguish these from other absolutes;

how to establish the referential chain with the matrix term

Page 29

Internal syntax

The diachronic development of Bulgarian participles vs. Greek seems juxtaposed: Manolessou (2005) argues that in Greek the drive is for participle structures to have a subject of their own and to create an unambiguous embedding relationship (e.g., in the case of Greek participle > gerund), while this is not the case in OB

Page 30

Problems

How many POS/POS specifications do we need for participles?

Tag by function or form?

Existing Solutions

The BNC: listed essentially as verbal forms, however, ambiguity tags (AJ0-VVG/VVN or VVG/VVN-AJ0) + disambiguation rules

in addition: the AJ0-NN1/NN1-AJ0 tag for items in the pre-N modifying function
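To make the ambiguity-tag notation concrete, the sketch below expands the slide's shorthand into individual two-part tags. It assumes that "AJ0-VVG/VVN" abbreviates the pair AJ0-VVG and AJ0-VVN (and likewise for "VVG/VVN-AJ0"); that reading, and the helper name `expand_ambiguity_tags`, are our assumptions and not part of any BNC tool.

```python
from itertools import product


def expand_ambiguity_tags(shorthand):
    """Expand shorthand such as 'AJ0-VVG/VVN' into individual two-part tags.

    Assumes exactly one hyphen separating the two sides, with '/' listing
    alternatives on either side.
    """
    left, right = shorthand.split("-")
    return [f"{a}-{b}" for a, b in product(left.split("/"), right.split("/"))]


print(expand_ambiguity_tags("AJ0-VVG/VVN"))
print(expand_ambiguity_tags("VVG/VVN-AJ0"))
```

Each expanded tag names two candidate categories for one token, which is the kind of consistent underspecification the later slides weigh as an option for participles.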

Page 31

Problems & (some) solutions

The Penn-Helsinki corpus: no distinction between verbal and adjectival uses of participles, i.e., consistently tagged as participles;

- two categories VAG and VAN, however

- exceptions: nominal uses of present participle tagged as N;

- present participle for infinitive tagged for function as VB

Page 32

Problems & (some) solutions

The ACT project: 4 POS categories

- AT (adj. – ptc)

- ST (subst. – ptc)

- T (participle)

- DE (adverbial participle)

Problem: supposedly distinguished on the basis of morphology - AT lack aspect, tense and voice, while DE lack all agreement and categorial morphology, which is counter to fact!

Furthermore, some distinctions are only displayed in syntax (e.g., substantivization)

Page 33

Solutions

Morphology (tagging by form) doesn’t help: same in all functions

Consistent ambiguity/underspecified tagging?

Potentially a distinction between structures without overt subject (IP-PPL) and with overt subject (IP-SMCpass or IP-ABS) (as in the Penn corpus)? However, not before syntactic annotation!

Rules over most likely tag sequences word sequences (we already have reliable systematic observations here and data from specialized corpora)

Page 34

The most feasible Solution

A two-level annotation

POS tagging as participles

Syntactic annotation

thus keeping the two levels strictly apart and avoiding the concatenation of non-uniform grammatical information in the same tag
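A two-level scheme of this kind could be sketched as follows. The class names, the tag strings ("PTCP", "NOUN", and so on) and the helper `participles_in` are hypothetical; only the IP label and the glosses of example (6c) come from the slides. The point is that the POS level says only "participle", while function is read off a separate syntactic layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    gloss: str          # English gloss standing in for the Old Bulgarian form
    pos: str            # level 1: POS tag only (hypothetical tag names)
    morph: tuple = ()   # morphological features, kept apart from function


@dataclass(frozen=True)
class Span:
    label: str          # level 2: syntactic label, e.g. "IP"
    start: int          # index of first token in the span
    end: int            # index one past the last token


# Second line of example (6c): "who you mean [IP virgin being]"
tokens = [
    Token("who", "PRON", ("ACC",)),
    Token("you", "PRON"),
    Token("mean", "VERB", ("2", "Sg")),
    Token("virgin", "NOUN", ("ACC",)),
    Token("be", "PTCP", ("ACC", "f")),
]
spans = [Span("IP", 3, 5)]


def participles_in(label, tokens, spans):
    """Glosses of participle tokens found inside spans with the given label."""
    return [
        tok.gloss
        for span in spans
        if span.label == label
        for tok in tokens[span.start:span.end]
        if tok.pos == "PTCP"
    ]


print(participles_in("IP", tokens, spans))
```

Because the syntactic layer is stored separately, it can be added after POS tagging is finished, and changing a span label never requires retagging a token.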