Towards Parsing Croatian Complex Sentences: Dependent Noun Clauses

Vanja Štefanec, Kristina Vučković, Zdravko Dovedan
University of Zagreb, Faculty of Humanities and Social Sciences
{vstefane, kvuckovi, zdovedan}@ffzg.hr

NooJ 2010, Komotini

Our goal

to determine the boundaries of dependent clauses within the complex sentence
  focusing the parser
  performing disambiguation of chunks
  improving the chunker

to test the adequacy of this model as a pre-parsing method for complex sentences

Overview of the work

grammar that can recognize the dependent noun clause (object clause) in the complex sentence
  both simple object clauses and coordinations of object clauses

by defining the co-text in which an object clause can occur
  NOT by describing its structure

relying on
  output of the chunker
  conjunctions, complementizers, punctuation marks, ...

Object clauses in Croatian

very frequent

refer to their superordinate clause predicate as a direct object

three types (according to grammars)
  relative (odnosne)
  interrogative (zavisnoupitne)
  declarative (izrične)

Relative object clauses

introduced by relative pronouns and adjectives

Jeste li našli [što ste tražili]?
Have you found [what you’ve been looking for]?

Kupit ću [kakvog nađem].
*I will buy [of the kind I’ll find].

Interrogative object clauses

1. general (općeupitne)
   introduced by interrogative conjunctions ‘li’, ‘da li’ or by interrogative pronouns (‘tko’, ‘koji’, ‘čiji’, ‘što’, …)

Još ne shvaćaš [što se dogodilo].
You still don’t understand [what happened].

Zaboravio sam [koji je danas dan].
I forgot [which day it is].

Interrogative object clauses

2. of place (mjesne)
   introduced by interrogative adverbs of place

Recite [kamo ste se zaputili].
Tell us [where you are headed].

3. of time (vremenske)
   introduced by interrogative adverbs of time

Nisu rekli [kad će doći].
They didn’t say [when they’ll be coming].

Interrogative object clauses

4. of manner (načinske)
   introduced by interrogative adverb ‘kako’

Još nismo saznali [kako se to dogodilo].
We still haven’t found out [how that happened].

5. qualitative (kvalitativne)
   introduced by interrogative adjectives ‘kakav’, ‘kakva’, ‘kakvo’

Ne znam [kakav si ti to čovjek].
I don’t know [what kind of a person you are].

Interrogative object clauses

6. of amount (količinske)
   introduced by interrogative adverb ‘koliko’

Znaš li [koliko si već popio]?
Do you know [how much you drank already]?

7. of cause (uzročne)
   introduced by interrogative adverbs of cause or prepositional expressions ‘zašto’, ‘zbog čega’, …

Ne razumijem [zašto si zakasnio].
I don’t understand [why you are late].

Declarative object clauses

introduced by conjunctions
  ‘da’ (most common)
  ‘kako’ (less frequent; stylistic variant of ‘da’)
  ‘gdje’ (extremely rare; very stylistically marked)

Obećao si [da ćeš doći].
You promised [that you’ll come].

Rekli su [kako ga nije briga].
They said [that he doesn't care].
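The introducers named on the slides for the three clause types can be collected in a small lookup table. The Python sketch below is only an illustration: the dictionary layout and the function name are assumptions, the entries for place, time and relative clauses come from the example sentences rather than from full lists, and inflected forms are not handled.

```python
# Introducers mentioned on the slides, grouped by the clause type they signal.
# Layout and names are illustrative assumptions; the lists are not exhaustive.
INTRODUCERS = {
    "declarative": {"da", "kako", "gdje"},
    "interrogative: general": {"li", "da li", "tko", "koji", "čiji", "što"},
    "interrogative: place": {"kamo"},        # from the example sentence only
    "interrogative: time": {"kad"},          # from the example sentence only
    "interrogative: manner": {"kako"},
    "interrogative: qualitative": {"kakav", "kakva", "kakvo"},
    "interrogative: amount": {"koliko"},
    "interrogative: cause": {"zašto", "zbog čega"},
    "relative": {"što", "kakvog"},           # from the example sentences only
}


def possible_types(introducer):
    """Return every clause type that the given introducer can open."""
    word = introducer.lower()
    return sorted(t for t, words in INTRODUCERS.items() if word in words)


print(possible_types("kako"))  # ['declarative', 'interrogative: manner']
print(possible_types("što"))   # ['interrogative: general', 'relative']
```

The two printed cases show why the introducer alone does not settle the clause type: ‘kako’ and ‘što’ each belong to more than one group.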

Object clauses in Croatian

have to be preceded by a transitive verb in an active voice form

impossible to predict their function by observing only the structure

(Vidio sam)PRED ([da se igra])OBJ.
I saw that he’s playing. (object clause)

(Vidio sam)PRED (ga)OBJ ([da se igra])ATTR.
I saw him playing. (adjective clause)

(Izišao je)PRED (van)ADV ([da se igra])ADV.
He went out to play. (purpose clause)
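The three sentences above contain the same clause, [da se igra], and differ only in what precedes it. A minimal Python sketch of that dependence on co-text follows. The class MainClause, its two fields and the function name are hypothetical, and the rule covers only these three examples; it is not the authors' NooJ grammar.

```python
from dataclasses import dataclass


@dataclass
class MainClause:
    """Hypothetical summary of the co-text before a 'da' clause."""
    verb_is_transitive: bool  # assumed to come from a valency lexicon
    has_direct_object: bool   # e.g. the clitic 'ga' in "Vidio sam ga"


def da_clause_function(main):
    """Guess the function of a following 'da' clause from its co-text."""
    if not main.verb_is_transitive:
        return "ADV"   # purpose clause: "Izišao je van [da se igra]."
    if main.has_direct_object:
        return "ATTR"  # object slot already filled: "Vidio sam ga [da se igra]."
    return "OBJ"       # object clause: "Vidio sam [da se igra]."


print(da_clause_function(MainClause(True, False)))   # OBJ
print(da_clause_function(MainClause(True, True)))    # ATTR
print(da_clause_function(MainClause(False, False)))  # ADV
```

A rule this simple would mislabel clauses after verbs that take two accusative arguments, one of the problem cases listed later in the talk.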

Object clauses in Croatian

can be easily confused with subject clauses

subject clauses refer either to a nominal predicate or to a verbal predicate in the passive voice

(Poznato je)PRED ([da pušenje uzrokuje rak])SUBJ.

It is well known that smoking causes cancer.

(Kaže se)PRED ([da je bolje spriječiti nego liječiti])SUBJ.

It is said that it is better to be safe than sorry.

The model

can be divided into four parts

1. the predicate
2. what can appear between the predicate and object clause
3. object clause
4. what can appear after the object clause
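The four parts can be read as a left-to-right scan over chunker output. The Python sketch below illustrates that reading under strong assumptions: the tag names (PRED, CONJ, ADV, COMMA, END), the sets of permitted tags and the function name are all hypothetical, since the actual NooJ graphs are not preserved in this transcript.

```python
# Hypothetical chunk tags; the tag set of the real chunker is not shown here.
ALLOWED_BETWEEN = {"ADV", "COMMA"}  # part 2: material tolerated before the clause
INTRODUCER_TAGS = {"CONJ"}          # opens part 3
CLOSERS = {"END"}                   # part 4: what may follow the clause


def find_object_clause(chunks):
    """Return the text of the first object clause candidate, or None.

    chunks is a list of (tag, text) pairs standing in for chunker output.
    """
    # Part 1: locate the predicate.
    for i, (tag, _) in enumerate(chunks):
        if tag == "PRED":
            break
    else:
        return None

    # Part 2: skip permitted material until an introducer is reached.
    j = i + 1
    while j < len(chunks) and chunks[j][0] not in INTRODUCER_TAGS:
        if chunks[j][0] not in ALLOWED_BETWEEN:
            return None
        j += 1
    if j == len(chunks):
        return None

    # Parts 3 and 4: the clause runs from the introducer up to a closer.
    k = j + 1
    while k < len(chunks) and chunks[k][0] not in CLOSERS:
        k += 1
    return " ".join(text for _, text in chunks[j:k])


sentence = [("PRED", "Obećao si"), ("CONJ", "da"), ("VP", "ćeš doći"), ("END", ".")]
print(find_object_clause(sentence))  # da ćeš doći
```

The sketch never inspects the inside of the clause: it only checks what surrounds it, which mirrors the stated choice to define the co-text rather than the clause structure.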

1. the predicate

2. between predicate and the clause

3. object clause - conjunctions

3. object clause - body

4. after the object clause

Examples

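Dodao je ([da približavanje Hrvatske EU ima dvije faze]).
He added ([that Croatia’s rapprochement with the EU has two phases]).

Pretpostavimo ([da imate visoke demokratske standarde], [da manjine imaju puna prava], [da su medijske slobode savršene])...
Let us suppose ([that you have high democratic standards], [that minorities have full rights], [that media freedoms are perfect])...

Zato savjetuje svima koji namjeravaju podići kredite ([da malo pričekaju, ako to mogu]).
That is why he advises everyone who intends to take out a loan ([to wait a little, if they can]).

Odgovarajući na pitanje hoće li na dogovore iz Mokrica djelovati skorašnji slovenski lokalni izbori, Maštruko je rekao ([kako u to ne vjeruje] te [da bi u slučaju kad bi države svaki put čekale ([da prođu izbori]), pregovaranje bilo nemoguće]).
Answering the question of whether the upcoming Slovenian local elections will affect the agreements from Mokrice, Maštruko said ([that he does not believe so] and [that, if states waited every time ([for elections to pass]), negotiating would be impossible]).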
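In these examples square brackets mark individual object clauses and may be nested. The short Python sketch below reads that notation back and reports each clause with its nesting depth. The function name is an assumption, and the code only interprets the bracket annotation; it does not perform any recognition itself.

```python
def bracketed_clauses(annotated):
    """Return (depth, text) for every [...] span; depth 1 is the outermost."""
    stack, found = [], []
    for pos, ch in enumerate(annotated):
        if ch == "[":
            stack.append(pos)
        elif ch == "]" and stack:
            start = stack.pop()
            # After the pop, len(stack) counts the brackets still open around us.
            found.append((len(stack) + 1, annotated[start + 1:pos]))
    return found


example = (
    "Maštruko je rekao ([kako u to ne vjeruje] te [da bi u slučaju kad bi "
    "države svaki put čekale ([da prođu izbori]), pregovaranje bilo nemoguće])."
)
for depth, text in bracketed_clauses(example):
    print(depth, text)
```

Clauses are listed in the order in which their closing brackets occur, so the embedded clause appears before the clause that contains it.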

Problems

chunker cannot identify the whole VP

undisambiguated chunks

subject clauses

some verbs can take two arguments in accusative case
  ‘pitati’ (to ask), ‘učiti’ (to teach), ...

adjective clauses, purpose clauses

identifying the level of subordination
  often a problem beyond syntax

rules of orthography
  proper use of punctuation marks (comma, dash)

Evaluation

performed in ideal circumstances
  predicate is correctly identified (i.e. chunked)
  information about verb valency is present

corpus consists of 174 sentences with 215 object clauses

PRECISION   RECALL   F-MEASURE
0.46        0.82     0.59
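The three scores are mutually consistent if the F-measure is the balanced F1, the harmonic mean of precision and recall. The slide does not state which variant was used, so that is an assumption; the short Python check below recomputes the value from the reported precision and recall.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Reported values: precision 0.46, recall 0.82.
print(round(f_measure(0.46, 0.82), 2))  # 0.59
```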

Evaluation

low precision BUT
  correct identification in 91% of the cases
  average number of results per clause is 2.15
  disambiguation!

high recall
  confirms the adequacy of the model

AND we have identified the critical cases, so improvements can also be expected