FROM SENTENCE STRUCTURE TO “IMMEDIATE” DISCOURSE STRUCTURE: ANNOTATION OF DISCOURSE CONNECTIVES AND THEIR ARGUMENTS

Aravind K. Joshi
University of Pennsylvania, Philadelphia, PA, USA

IIT, Powai, Mumbai, December 30, 2005



Outline

• Introduction
• Transition from sentence to immediate discourse
• Dependencies in discourse structure
• Penn Discourse Treebank (PDTB)
• Some properties of discourse connectives
• Some examples from PDTB
• Some aspects of annotation guidelines
• Semantics of discourse connectives
• Assigning roles to the arguments
• Attributions of arguments and connectives
• Summary

Transition from sentence to immediate discourse

• How much information can be packaged in a sentence?
• When does a transition from a sentence to discourse happen?
• Are there any general principles?
• Beyond some conventions of style, are there any linguistic principles governing this transition?

Transition from sentence to immediate discourse

• Sentences are made up of clauses
• Clause: Predicate (Verb), Arguments, Adjuncts
• Dependency structure
• Connectives
• Composition operations
• Extend dependency structures to discourse
• Extend the same composition operations to discourse
• Extend the sentence-level parser to discourse

Transition from sentence to immediate discourse

• At the sentence level
• Predicates have as their arguments
  -- NPs
  -- NPs and clauses
  -- Clauses

• Discourse connectives can be treated as higher order predicates taking only clauses as their arguments
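The idea of a connective as a higher-order predicate over clauses can be pictured with a small data structure. This is only an illustrative sketch: the `Clause` and `ConnectiveRelation` types and their field names are hypothetical, not part of the PDTB or of any parser mentioned in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """A sentence-level predicate with its arguments (hypothetical type)."""
    predicate: str
    args: tuple


@dataclass(frozen=True)
class ConnectiveRelation:
    """A discourse connective treated as a predicate over exactly two clauses."""
    connective: str
    arg1: Clause
    arg2: Clause

    def __post_init__(self):
        # Unlike verbs, a connective is assumed here to take only clauses.
        for arg in (self.arg1, self.arg2):
            if not isinstance(arg, Clause):
                raise TypeError("connective arguments must be clauses")


likes = Clause("like", ("Fred", "beans"))
allergic = Clause("allergic-to", ("Fred", "beans"))
contrast = ConnectiveRelation(
    "on the one hand ... on the other hand", likes, allergic
)
```

Passing anything other than a `Clause` as an argument raises a `TypeError`, which mirrors the restriction stated on the slide.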

Sentence Structure and Discourse Structure

• At the sentence level
  • Structural composition and associated semantic composition
  • Anaphoric links
  • Other inferences
• At the discourse level
  • Structural composition and associated semantic composition
  • Anaphoric links
  • Other inferences
• Conventionally, work on discourse structure does not consider, and therefore does not allow, such a decomposition

Dependencies in discourse structure

• Discourse connectives as predicates taking clausal arguments
• The dependencies between a predicate and its arguments can be stretched

Nested dependencies:

On the one hand, Fred likes beans.
Not only does he eat them for dinner.
But he also eats them for breakfast and snacks.
On the other hand, he’s allergic to them.

Dependencies in discourse structure

• Dependencies can be stretched by nesting
• Crossed dependencies do not seem to be possible
• Is this cross-linguistically valid?
• Apparent crossing dependencies are resolved by treating one argument of a discourse connective as anaphoric

Webber, Joshi, Stone, and Knott. 2003. Anaphora and discourse structure. Computational Linguistics, 29:545-587.

On the one hand, Fred likes beans.
Not only does he eat them for dinner.
But he also eats them for breakfast and snacks.
On the other hand, he’s allergic to them.

Crossed dependencies:

* On the one hand, Fred likes beans.
  Not only does he eat them for dinner.
  On the other hand, he’s allergic to them.
  But he also eats them for breakfast and snacks.

In this sense, discourse structure may be simpler than sentence structure, even cross-linguistically?

True crossed dependencies do not seem to be possible
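The difference between the nested and the crossed ordering of the Fred example can be checked mechanically. The sketch below assumes a simplified representation that is not in the slides: each sentence gets a position 1 to 4, and each paired connective ("on the one hand"/"on the other hand", "not only"/"but also") is a link between two positions.

```python
def crosses(dep_a, dep_b):
    """Return True if two links between sentence positions cross."""
    first, second = sorted([tuple(sorted(dep_a)), tuple(sorted(dep_b))])
    # Crossing means the second link starts inside the first and ends outside it.
    return first[0] < second[0] < first[1] < second[1]


# Nested order: "on the one hand" (1) ... "on the other hand" (4),
# with "not only" (2) ... "but also" (3) inside it.
nested = [(1, 4), (2, 3)]

# Crossed order: "on the other hand" moves to position 3, "but also" to 4.
crossed = [(1, 3), (2, 4)]

print(crosses(*nested))   # False
print(crosses(*crossed))  # True
```

Links that merely share an endpoint are not counted as crossing under this definition.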

Dependencies in discourse structure

(a) John loves Barolo.
(b) So he ordered three cases of the ’97.
(c) But he had to cancel the order
(d) Because then he discovered he was broke.

because gets its arguments from (c) and (d)

then gets its arguments from (b) and (d), thus crossing the connection between (c) and (d) associated with because

Apparent crossing dependency: Treat the argument from (b) for then as anaphoric

Penn Discourse Treebank: PDTB

• Annotate discourse connectives and their argument structure for the Penn Treebank corpus -- PDTB
• Independent of the specifics of the discourse lexicalized TAG (DLTAG)

People: Aravind Joshi, Eleni Miltsakaki, Rashmi Prasad, annotators
Collaborator: Bonnie Webber (Edinburgh University)

PDTB

• Discourse connectives such as
  -- and, or, but, because, since, while, when, however, instead, although, also, for example, then, so that, insofar as, nonetheless, …, empty connectives
  -- Subordinate conjunctions, coordinate conjunctions, adverbial connectives, implicit connectives
  -- Discourse connectives take clauses as their arguments and express relations between clauses, i.e., relations between the propositions, events, situations, … associated with the clauses
• Towards computing a class of inferences associated with discourse connectives, hence relevant to complex NLP tasks -- IE, MT, QA, …
• Towards discourse structure -- discourse understanding

Research Strategy

• Not shallow vs deep syntactic processing
• Not shallow vs deep semantic processing

But
• Deeper and deeper shallow processing

Some properties of discourse connectives

• Discourse connectives have argument structure (analogous to verbs and their argument structure, as in the Propbank). However, there are crucial differences:
  • The arity of connectives is fixed: they are binary (with some apparent exceptions)
  • One argument is in the same sentence as the connective. The other argument may or may not be in the same sentence; it can be in the preceding or following discourse
  • It is harder to annotate the extent of an argument
  • One of the arguments can be anaphoric
• Very little is known about the semantics of discourse connectives

What is being annotated

• Relation: Connective -- explicit or implicit
• Arguments: Arg1, Arg2
• Attributions of arguments
• Attribution of relation
• Sense of the connective
• Supplementary material

?

Subordinate: because

[The federal government suspended sales of U.S. savings bonds] because [Congress hasn’t lifted the ceiling on government debt.]

• Both arguments are in the same sentence
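The annotation layers listed under "What is being annotated" can be pictured as one record per relation. The sketch below is a hypothetical layout, not the PDTB file format: the class and field names are invented for illustration, and labelling the first bracketed span as Arg1 and the clause introduced by the connective as Arg2 is an assumption based on the ARG1/ARG2 listing given for the "although" example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DiscourseRelation:
    """One annotated relation: connective, two arguments, and optional layers."""
    connective: str
    explicit: bool
    arg1: str
    arg2: str
    # Attribution labels per element, e.g. {"relation": "WA", "arg2": "SA"}.
    attributions: dict = field(default_factory=dict)
    sense: Optional[str] = None
    supplementary: list = field(default_factory=list)


because = DiscourseRelation(
    connective="because",
    explicit=True,
    arg1="The federal government suspended sales of U.S. savings bonds",
    arg2="Congress hasn't lifted the ceiling on government debt.",
)
```

The sense and attribution fields are left empty here because the slide does not state them for this example.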

Some Examples from PDTB

Subordinate: although

Although [started in 1965], [Wedtech didn’t really get rolling until 1975] (when Mr. Neuberger discovered the Federal Government’s Section 8(A) minority business program).

• Both arguments are in the same sentence, one argument has possible supplementary material in ( )

Adverbial: however

[Both Newsweek and U.S. News have been gaining circulation in recent years without heavy use of electronic giveaways to subscribers, such as telephones or watches.]

However, [none of the big three weeklies recorded circulation gains recently.]

• The two arguments are in different sentences

Adverbial: for example

[The computers were crude by today’s standards.]
[Apple II owners, for example, had to use their television sets as screens and stored data on audiocassettes.]

• An argument can be a discontiguous string
• Problems with aligning arguments with Penn Treebank constituents
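A discontiguous argument can be stored as a list of character spans rather than a single span. The sketch below assumes, for illustration only, that the argument of "for example" is everything in its sentence except the connective itself; the helper names are hypothetical and the actual PDTB span conventions may differ.

```python
def connective_free_spans(sentence, connective):
    """Split a sentence into the character spans on either side of the connective."""
    start = sentence.index(connective)
    end = start + len(connective)
    return [(0, start), (end, len(sentence))]


def span_text(sentence, spans):
    """Join the pieces of a discontiguous argument, trimming stray commas and spaces."""
    return " ... ".join(sentence[s:e].strip(" ,") for s, e in spans)


sentence = ("Apple II owners, for example, had to use their television sets "
            "as screens and stored data on audiocassettes.")
arg2_spans = connective_free_spans(sentence, "for example")
print(arg2_spans)
print(span_text(sentence, arg2_spans))
```

Because the two spans need not correspond to a single Penn Treebank constituent, this representation also makes the alignment problem mentioned above visible.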

Discourse adverbials as anaphors: Instead

John wanted to eat a pear. Instead he ate an apple.

John will not eat fruit. Instead, he eats only candy bars and potato chips.

John ate an apple. # Instead he wanted a pear.

Antecedent of instead: salient but unchosen or unrealized alternative -- anaphoric argument of instead
Licensing environment: modal context, negation, …

Adverbial: still

[Some senior advisors argue that with further fights over a capital-gains tax cut and a budget-reduction bill Mr. Bush already has enough pending confrontations with Congress. They prefer to put off the line-item veto until at least next year.]
Still, [Mr. Bush and some other aides are strongly drawn to the idea of trying out a line-item veto.]

ARG1: Some senior … Congress. They prefer … next year
ARG2: Mr. Bush … a line-item veto

ARG1 has two sentences

Adverbial: also

[On the Big Board, Crawford & Co., Atlanta, (CFD) begins trading today.] Crawford evaluates health care plans, manages medical and disability aspects of worker’s compensation injuries and is involved in claims adjustments for insurance companies.
Also, [beginning trading today on the Big Board are El Paso Refinery Limited Partnership, El Paso, Texas, (ELP) and Franklin Multi-Income Trust, San Mateo, Calif., (FMI).]

• The sentence after the left argument of “also” (“Crawford evaluates …”, shown in green on the original slide) can be regarded as a kind of adjunct of the left argument
• Discourse connectives have a fixed arity (2).

Empty connective: EMPTY

[El Paso owns and operates a petroleum refinery.]
EMPTY = whereas
[Franklin is a closed-end management investment company.]

• whereas is the connective that one annotator thought best described the relation expressed by the empty connective

• Analogous to the empty relation in a noun-noun compound at the sentence level

Empty connective

Individuals close to the situation believe Ford officials will seek a meeting this week with Sir John to outline their proposal for a full bid. <CONSEQUENTLY> Any discussion with Ford could postpone the Jaguar-GM deal, headed for completion within the next two weeks.

Empty connectives

But now the companies are getting into trouble because they undertook a record expansion program while they were raising prices sharply. <CONSEQUENTLY/AS A RESULT> Third-quarter profits fell at several companies.

Disagreement on selected connective but agreement over class

Empty connectives

British government restrictions prevent any single shareholder from going beyond 15% before the end of 1990 without government permission. <BECAUSE/HOWEVER> The British government, which owned Jaguar until 1984, still holds a controlling “golden share” in the company.

Disagreement over both the connective and the class it belongs to
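The two kinds of disagreement shown in the empty-connective examples can be measured separately: once on the inserted connective itself and once on its class. The sketch below uses raw proportion agreement over the two annotator pairs from the slides. The class labels in `CLASS_OF` are hypothetical, since the slides do not name the classes, and a real study would likely use a chance-corrected measure.

```python
# Hypothetical class labels; the slides do not name the classes.
CLASS_OF = {
    "consequently": "result",
    "as a result": "result",
    "because": "reason",
    "however": "contrast",
}


def agreement(pairs, key=lambda c: c):
    """Fraction of annotator pairs whose choices match after applying key."""
    matches = sum(1 for a, b in pairs if key(a) == key(b))
    return matches / len(pairs)


# Connectives chosen by two annotators for the two examples above.
choices = [("consequently", "as a result"), ("because", "however")]

connective_level = agreement(choices)                 # 0.0
class_level = agreement(choices, key=CLASS_OF.get)    # 0.5
```

Under these assumed classes the annotators never pick the same connective, yet they agree on the class for the first example.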

Attributions of arguments and relations

Advocates said the 90-cent-an-hour rise, to $4.25 an hour by April 1991, is too small for the working poor, while opponents argued that the increase will still hurt small businesses and cost many thousands of jobs.

Relation: Connective -- while
Arg1: Advocates said … poor
Arg2: opponents … jobs
Attributions:
  Relation: WA (writer attribution)
  Arg1: WA
  Arg2: WA

Attributions of arguments and relations

Factory orders and construction outlays were largely flat in September, while purchasing agents said manufacturing shrank further in October.

Relation: Connective -- while
Arg1: Factory orders … September
Arg2: manufacturing shrank … in October
Attributions:
  Relation: WA
  Arg1: WA
  Arg2: SA (speaker attribution)

How many discourse connectives in PTB?

Types: about 253
(Subordinating: 32, Coordinating: 4, Adverbial/Anaphoric: 217)

Tokens: about 23,620
(Subordinating: 7,011, Coordinating: 6,169, Adverbial/Anaphoric: 10,440)

Empty connectives: Tokens: about 20,000; Types: ??
Total: Tokens: about 43,620
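As a quick sanity check, the per-category counts on this slide do add up to the stated totals. The snippet only re-adds the approximate figures given above; the variable names are illustrative.

```python
# Approximate counts copied from the slide.
types = {"subordinating": 32, "coordinating": 4, "adverbial/anaphoric": 217}
tokens = {"subordinating": 7011, "coordinating": 6169, "adverbial/anaphoric": 10440}
empty_tokens = 20000

total_types = sum(types.values())                # 253
explicit_tokens = sum(tokens.values())           # 23620
total_tokens = explicit_tokens + empty_tokens    # 43620
```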

Annotation Guidelines– some comments

• What counts as a discourse connective?
  -- In general, discourse connectives convey a relation between states, events, situations, etc.
• as a result is a discourse connective. But in

  Strangely, conventional wisdom inside the Beltway regards these transfer payments as …

  “strangely” requires only a single state/event, which it classifies in the set of “strange” events. Hence, it is not a discourse connective.
• What counts as an argument?

Annotation Guidelines– some comments

• How far does an argument extend?

Although [started in 1965], [Wedtech didn’t really get rolling until 1975] (when Mr. Neuberger discovered the Federal Government’s Section 8 minority business program).

“Proper partial overlap”

ARG1: Wedtech didn’t really … 1975
ARG2: started in 1965
SUP2: when Mr. Neuberger … program

Multiple annotations

• In the standard annotation paradigm only one annotation is selected
• At the discourse level multiple annotations cannot be completely avoided

[Big bear doesn’t care for disposable diapers,] which aren’t biodegradable. Yet [parents demand them.]

Big bear doesn’t care for disposable diapers, [which aren’t biodegradable.] Yet [parents demand them.]
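One way to keep both readings of the diaper example is to store a list of alternative spans for the ambiguous argument instead of forcing a single choice. This is a hypothetical sketch; the class and field names are invented for illustration and do not describe how the PDTB actually encodes multiple annotations.

```python
from dataclasses import dataclass


@dataclass
class MultiAnnotation:
    """A relation whose first argument has more than one acceptable span."""
    connective: str
    arg2: str
    arg1_alternatives: list

    def is_ambiguous(self):
        # More than one candidate span means no single annotation was forced.
        return len(self.arg1_alternatives) > 1


yet = MultiAnnotation(
    connective="Yet",
    arg2="parents demand them.",
    arg1_alternatives=[
        "Big bear doesn't care for disposable diapers,",
        "which aren't biodegradable.",
    ],
)
```

Downstream tools can then either pick one alternative or treat all of them as valid.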

Assigning roles to the arguments

For verbs
• In terms of general roles such as agent, theme, goal, instrument, …
• In terms of word-specific roles

He wouldn’t accept anything of value from those he was writing about

REL: accept
Arg0: acceptor
Arg1: thing accepted
Arg2: accepted-from

Prague Dependency Treebank (PDT) (1998, 2001), FrameNet (2000, 2002), PropBank (2002, 2003)

Assigning “roles” to the arguments of a connective

• In terms of general roles -- ???
• In terms of connective-specific “roles”

Roles of arguments of “if” (conditional)

if (hypothetical)

If John studies hard he will pass the examination

REL: if (hypothetical)
ARG0: (Truth condition) circumstances which make ARG1 true
ARG1: (Assertion) expresses assertion

Roles of arguments of “if” (relevance conditional)

if (relevance)

If you are thirsty, there is beer in the fridge

REL: if (relevance conditional)
ARG0: (Relevance condition) circumstances in which ARG1 is relevant
ARG1: (Assertion) expresses assertion

Roles of arguments of “if” (factual conditional)

if (factual)

If Bill is so unhappy here, he should leave

REL: if (factual conditional)
ARG0: (Factual condition) someone other than the speaker believes that ARG0 is true and ARG0 justifies ARG1
ARG1: (Conditional assertion) expresses assertion
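The three role sets for "if" can be collected into a small connective-specific lexicon, in the spirit of word-specific roles for verbs. The role names are taken from the slides; the dictionary layout and the `label_arguments` helper are hypothetical.

```python
# Role names per sense of "if", as listed on the slides.
IF_ROLES = {
    "hypothetical": {"ARG0": "Truth condition", "ARG1": "Assertion"},
    "relevance": {"ARG0": "Relevance condition", "ARG1": "Assertion"},
    "factual": {"ARG0": "Factual condition", "ARG1": "Conditional assertion"},
}


def label_arguments(sense, arg0, arg1):
    """Attach the sense-specific role names to the two arguments of 'if'."""
    roles = IF_ROLES[sense]
    return {roles["ARG0"]: arg0, roles["ARG1"]: arg1}


labelled = label_arguments(
    "relevance", "you are thirsty", "there is beer in the fridge"
)
```

An unknown sense raises a `KeyError`, which is one simple way to flag candidate new senses for review.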

Some possible new senses for if

[It will be at their peril] if [Americans allow another happening like the degrading Bork confirmation circus]

ARG1: it will … peril
ARG2: Americans … circus

ARG1 makes reference to ARG2

Here if is not a hypothetical conditional; it is just a way of making an assertion, much like the hypothetical relevance conditional but not quite like it.

Some possible new senses for if

[Don’t leave home without the American Express card] if [you’d really rather have a Buick.]

Here if is more like the hypothetical relevance conditional, but not quite like it.

Some possible senses for while

[Under Chapter 11, a company operates under protection from creditors’ lawsuits] while [it works out a plan to pay its debts.]

ARG1: Under … lawsuits
ARG2: it works … debts
Con: while
Sense: Temporal

[Some will likely be offered severance packages] while [others will be transferred to overseas operations.]

Sense: Concessive

Some possible senses for while

[Each company remains independent] while [working together to market and sell their products.]

Sense: Temporal/Concessive

While [the insurance index fell 3.56 to 528.56,] [the Nasdaq bank index fell 5.00 to 432.61.]

Sense: ? Compare but no real contrast

Some possible senses for since, when, …

• Senses for since: Temporal, Causal, Temporal/Causal
• Senses for when: Temporal, Causal, Temporal/Causal

Since

• Temporal (T)
  – She hasn’t played any music since the earthquake hit.
• Causal (C)
  – Since the budget measures cash flow, a new $1 direct loan is treated as a $1 expenditure.
• Temporal/Causal (T/C)
  – … and domestic car sales have plunged 19% since the Big Three ended many of their programs Sept 30.

While

• Temporal (T)
  – A nurse contracted the virus while injecting an AIDS patient
• Concession (Con)
  – The basket product, while it has got off to a slow start, is being supported by some firms.
• Opposition (Opp)
  – … one ex-player claims he received $4000 to $5000 for his season football tickets while others said theirs brought only a few hundred dollars.

When

• Temporal
  – The San Francisco earthquake hit when resources in the field already were stretched
• Temporal/Causal
  – When the Trinity Repertory Theatre named Anne Bogart as its artistic director last spring, the nation’s theatrical cognoscenti arched a collective eyebrow
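The sense lists for since, while, and when can be kept as a small inventory against which a proposed sense label is checked. This is a sketch only: the sets below copy the labels from the since/when summary slide and the While slide, leave out the combined Temporal/Concessive reading mentioned earlier, and the `check_sense` helper is hypothetical.

```python
# Sense labels per connective, copied from the slides.
SENSES = {
    "since": {"Temporal", "Causal", "Temporal/Causal"},
    "while": {"Temporal", "Concession", "Opposition"},
    "when": {"Temporal", "Causal", "Temporal/Causal"},
}


def check_sense(connective, sense):
    """Return True if the sense label is in the inventory for this connective."""
    allowed = SENSES.get(connective.lower())
    if allowed is None:
        raise KeyError(f"no sense inventory for {connective!r}")
    return sense in allowed


print(check_sense("Since", "Causal"))  # True
print(check_sense("while", "Causal"))  # False
```

A label that fails the check is not necessarily wrong; as the "possible new senses" slides suggest, it may point to a sense the inventory does not yet cover.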

Summary

• Expected date of release -- April 2006
  -- all explicit connectives (adjudicated)
  -- all implicit connectives, but only about 50% adjudicated
  -- some sense annotations
• All connectives, all senses, and some experimental results -- December 2006

Summary

• Boundary between sentence and discourse
  • Flexible
  • Discourse connectives sit at this boundary
  • Similarities and differences between sentence structure and local discourse structure
• Properties of discourse connectives
  • Arguments of connectives; arity is 2
  • Extent of the arguments and their semantics
  • Annotations of attributions -- mismatch between syntax and discourse
  • Sense annotation -- new opportunities
  • Multiple annotations -- implications