FROM SENTENCE STRUCTURE TO
“IMMEDIATE” DISCOURSE STRUCTURE: ANNOTATION
OF DISCOURSE CONNECTIVES
AND THEIR ARGUMENTS
Aravind K. Joshi
University of Pennsylvania
Philadelphia, PA USA
IIT, Powai, Mumbai, December 30 2005
Outline
• Introduction
• Transition from sentence to immediate discourse
• Dependencies in discourse structure
• Penn Discourse Treebank (PDTB)
• Some properties of discourse connectives
• Some examples from PDTB
• Some aspects of annotation guidelines
• Semantics of discourse connectives
• Assigning roles to the arguments
• Attributions of arguments and connectives
• Summary
Transition from sentence to immediate discourse
• How much information can be packaged in a sentence?
• When does a transition from a sentence to discourse happen?
• Are there any general principles?
• Beyond some conventions of style, are there any linguistic principles to this transition?
Transition from sentence to immediate discourse
• Sentences are made up of clauses
• Clause: Predicate (Verb) Arguments, Adjuncts
• Dependency structure
• Connectives
• Composition operations
• Extend dependency structures to discourse
• Extend the same composition operations to discourse
• Extend the sentence level parser to discourse
Transition from sentence to immediate discourse
• At the sentence level
  • Predicates have as their arguments
    -- NPs
    -- NPs and clauses
    -- Clauses
• Discourse connectives can be treated as higher order predicates taking only clauses as their arguments
Sentence Structure and Discourse Structure
• At the sentence level
  • Structural composition and associated semantic composition
  • Anaphoric links
  • Other inferences
• At the discourse level
  • Structural composition and associated semantic composition
  • Anaphoric links
  • Other inferences
• Conventionally, work on discourse structure does not consider, and therefore does not allow, such a decomposition
Dependencies in discourse structure
• Discourse connectives as predicates taking clausal arguments
• The dependencies between the predicates and their arguments can be stretched
On the one hand, Fred likes beans.
Not only does he eat them for dinner.
But he also eats them for breakfast and snacks.
On the other hand, he’s allergic to them.
Nested Dependencies:
Dependencies in discourse structure
• Dependencies can be stretched by nesting
• Crossed dependencies do not seem to be possible
• Is this cross-linguistically valid?
• Apparent crossing dependencies are resolved by treating one argument of a discourse connective as anaphoric
Webber, Joshi, Stone, and Knott. 2003. Anaphora and discourse structure. Computational Linguistics, 29:545-587.
On the one hand, Fred likes beans.
Not only does he eat them for dinner.
But he also eats them for breakfast and snacks.
On the other hand, he’s allergic to them.
Crossed dependencies
* On the one hand, Fred likes beans.
  Not only does he eat them for dinner.
  On the other hand, he’s allergic to them.
  But he also eats them for breakfast and snacks.
In this sense, discourse structure may be simpler than sentence structure, even cross-linguistically?
True crossed dependencies do not seem to be possible
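The nested versus crossed contrast in the "On the one hand ..." examples can be checked mechanically once each paired connective is reduced to the two sentence positions it links. The following is a minimal sketch under that assumption; the function name `crossing_pairs`, the link labels, and the 0-3 numbering of sentences are illustrative choices, not part of the PDTB.

```python
from itertools import combinations

def crossing_pairs(links):
    """Return pairs of link names whose dependencies cross.

    `links` maps a name to the (left, right) sentence positions it connects.
    Two links cross when exactly one endpoint of one lies strictly inside
    the other; nested or disjoint links do not cross.
    """
    crossed = []
    for (name1, span1), (name2, span2) in combinations(links.items(), 2):
        a, b = sorted(span1)
        c, d = sorted(span2)
        if a < c < b < d or c < a < d < b:
            crossed.append((name1, name2))
    return crossed

# Acceptable order: "not only ... but also" sits inside
# "on the one hand ... on the other hand".
nested = {"one_hand/other_hand": (0, 3), "not_only/but_also": (1, 2)}

# Starred order: "on the other hand" comes before "but ... also".
crossed = {"one_hand/other_hand": (0, 2), "not_only/but_also": (1, 3)}

print(crossing_pairs(nested))
print(crossing_pairs(crossed))
```

Under this encoding the acceptable discourse yields no crossing pair, while the starred one yields exactly one.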
Dependencies in discourse structure
(a) John loves Barolo.
(b) So he ordered three cases of the ’97.
(c) But he had to cancel the order
(d) Because then he discovered he was broke.
because gets its arguments from (c) and (d)
then gets its arguments from (b) and (d), thus crossing the connection between (c) and (d) associated with because
Apparent crossing dependency: Treat the argument from (b) for then as anaphoric
Penn Discourse Treebank: PDTB
• Annotate discourse connectives and their argument structure for the Penn Treebank corpus – PDTB
• Independent of the specifics of the discourse lexicalized TAG (DLTAG)
People: Aravind Joshi, Eleni Miltsakaki, Rashmi Prasad, annotators
Collaborator: Bonnie Webber (Edinburgh University)
PDTB
• Discourse connectives such as
  -- and, or, but, because, since, while, when, however, instead, although, also, for example, then, so that, insofar as, nonetheless, … , Empty Connectives
  -- Subordinate conjunctions, Coordinate conjunctions, Adverbial connectives, Implicit connectives
  -- Discourse connectives take clauses as their arguments and express relations between clauses, i.e., relations between propositions, events, situations, … associated with the clauses
• Towards computing a class of inferences associated with discourse connectives, hence relevant to complex NLP tasks – IE, MT, QA …
• Towards discourse structure - discourse understanding
Research Strategy
• Not shallow vs deep syntactic processing
• Not shallow vs deep semantic processing
But
• Deeper and deeper shallow processing
Some properties of discourse connectives
• Discourse connectives have argument structure (analogous to verbs and their argument structure) as in the Propbank. However, there are crucial differences
  • Arity of connectives is fixed: they are binary (some apparent exceptions)
  • One argument is in the same sentence in which the connective appears. The other argument may or may not be in the same sentence. It can be in the preceding or following discourse
  • Harder to annotate the extent of an argument
  • One of the arguments can be anaphoric
• Very little is known about the semantics of discourse connectives
What is being annotated
• Relation: Connective -- explicit or implicit
• Arguments: Arg1, Arg2
• Attributions of arguments
• Attribution of relation
• Sense of the connective
• Supplementary material
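One way to picture the items listed above is as fields of a single annotation record. The sketch below is an assumption about how such a record could be laid out, not the PDTB file format; the class and field names are hypothetical. The sample values are taken from the "while" example on the attribution slides (WA = writer attribution, SA = speaker attribution).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Argument:
    text: str
    attribution: Optional[str] = None   # e.g. "WA" or "SA"
    supplement: Optional[str] = None    # supplementary material, if any

@dataclass
class Relation:
    connective: str
    explicit: bool                      # False for an implicit (empty) connective
    arg1: Argument
    arg2: Argument
    attribution: Optional[str] = None   # attribution of the relation itself
    sense: Optional[str] = None         # left unset when no sense is annotated

rel = Relation(
    connective="while",
    explicit=True,
    arg1=Argument(
        "Factory orders and construction outlays were largely flat in September",
        attribution="WA",
    ),
    arg2=Argument("manufacturing shrank further in October", attribution="SA"),
    attribution="WA",
)
print(rel.connective, rel.arg1.attribution, rel.arg2.attribution)
```

Keeping attribution on each argument as well as on the relation mirrors the slides, where the two can differ.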
Subordinate: because
[The federal government suspended sales of U.S. savings bonds] because [Congress hasn’t lifted the ceiling on government debt.]
• Both arguments are in the same sentence
Some Examples from PDTB
Subordinate: although
Although [started in 1965], [Wedtech didn’t really get rolling until 1975] (when Mr. Neuberger discovered the Federal Government’s Section 8(A) minority business program).
• Both arguments are in the same sentence, one argument has possible supplementary material in ( )
Adverbial: however
[Both Newsweek and U.S. News have been gaining circulation in recent years without heavy use of electronic giveaways to subscribers, such as telephones or watches.]
However, [none of the big three weeklies recorded circulation gains recently.]
• The two arguments are in different sentences
Adverbial: for example
[The computers were crude by today’s standards.]
[Apple II owners, for example, had to use their television sets as screens and stored data on audiocassettes.]
• An argument can be a discontiguous string
• Problems with aligning arguments with Penn Treebank constituents
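A discontiguous argument can be stored as a list of character spans rather than a single span. The sketch below assumes that the discontinuity in the Apple II example comes from cutting the sentence-medial connective (and its flanking commas) out of the argument; that reading, and the helper name `spans_excluding`, are assumptions made for illustration.

```python
def spans_excluding(sentence, connective):
    """Return (start, end) character spans covering the sentence
    with the connective and its flanking commas/spaces removed."""
    start = sentence.find(connective)
    if start == -1:
        return [(0, len(sentence))]
    # Left piece: everything before the connective, minus trailing ", ".
    left_end = len(sentence[:start].rstrip(" ,"))
    # Right piece: skip any commas/spaces directly after the connective.
    right_start = start + len(connective)
    while right_start < len(sentence) and sentence[right_start] in " ,":
        right_start += 1
    spans = []
    if left_end > 0:
        spans.append((0, left_end))
    if right_start < len(sentence):
        spans.append((right_start, len(sentence)))
    return spans

sentence = ("Apple II owners, for example, had to use their television "
            "sets as screens and stored data on audiocassettes.")
spans = spans_excluding(sentence, "for example")
argument_text = " ".join(sentence[s:e] for s, e in spans)
print(spans)
print(argument_text)
```

Two spans come back for this sentence, which is exactly the situation that makes alignment with single Penn Treebank constituents awkward.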
Discourse adverbials as anaphors: Instead
John wanted to eat a pear. Instead he ate an apple.
John will not eat fruit. Instead, he eats only candy bars and potato chips.
John ate an apple. # Instead he wanted a pear.
Antecedent of instead: salient but unchosen or unrealized alternative -- anaphoric argument of instead
Licensing environment: modal context, negation, …
Adverbial: still
[Some senior advisors argue that with further fights over a capital-gains tax cut and a budget-reduction bill Mr. Bush already has enough pending confrontations with Congress. They prefer to put off the line-item veto until at least next year.]
Still, [Mr. Bush and some other aides are strongly drawn to the idea of trying out a line-item veto.]
ARG1: Some senior … Congress. They prefer … next year
ARG2: Mr. Bush … a line-item veto
ARG1 has two sentences
Adverbial: also
[On the Big Board, Crawford & Co., Atlanta, (CFD) begins trading today.] Crawford evaluates health care plans, manages medical and disability aspects of worker’s compensation injuries and is involved in claims adjustments for insurance companies.
Also, [beginning trading today on the Big Board are El Paso Refinery Limited Partnership, El Paso, Texas, (ELP) and Franklin Multi-Income Trust, San Mateo, Calif., (FMI).]
• The sentence after the left argument of “also” (shown in green on the slide; unbracketed here) can be regarded as a kind of adjunct of the left argument
• Discourse connectives have a fixed arity (2).
Empty connective: EMPTY
[El Paso owns and operates a petroleum refinery.]
EMPTY = whereas
[Franklin is a closed-end management investment company.]
• whereas is the connective that one annotator thought best described the relation expressed by the empty connective
• Analogous to the empty relation in a noun-noun compound at the sentence level
Empty connective
Individuals close to the situation believe Ford officials will seek a meeting this week with Sir John to outline their proposal for a full bid. <CONSEQUENTLY> Any discussion with Ford could postpone the Jaguar-GM deal, headed for completion within the next two weeks.
Empty connectives
But now the companies are getting into trouble because they undertook a record expansion program while they were raising prices sharply. <CONSEQUENTLY/AS A RESULT> Third-quarter profits fell at several companies.
Disagreement on selected connective but agreement over class
Empty connectives
British government restrictions prevent any single shareholder from going beyond 15% before the end of 1990 without government permission. <BECAUSE/HOWEVER> The British government, which owned Jaguar until 1984, still holds a controlling “golden share” in the company.
Disagreement over the connective and also over the class it belongs to
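The two kinds of disagreement in the empty-connective examples can be told apart by comparing annotators' choices at two levels: the connective itself and the class it belongs to. The sketch below is illustrative only; the class labels in `CONNECTIVE_CLASS` ("result", "cause", "contrast") are hypothetical stand-ins, since the slides do not name the classes.

```python
# Hypothetical connective-to-class table; the class names are assumptions.
CONNECTIVE_CLASS = {
    "consequently": "result",
    "as a result": "result",
    "because": "cause",
    "however": "contrast",
}

def compare_choices(choice_a, choice_b):
    """Return (same_connective, same_class) for two annotators' choices."""
    a, b = choice_a.lower(), choice_b.lower()
    return a == b, CONNECTIVE_CLASS[a] == CONNECTIVE_CLASS[b]

# Connectives differ, class agrees.
print(compare_choices("CONSEQUENTLY", "AS A RESULT"))
# Connectives differ, class differs too.
print(compare_choices("BECAUSE", "HOWEVER"))
```

Scoring agreement at the class level is more forgiving than requiring the same connective, which is the point the two examples make.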
Attributions of arguments and relations
Advocates said the 90-cent-an-hour rise to $4.25 an hour by April 1991, is too small for the working poor, while opponents argued that the increase will still hurt small businesses and cost many thousands of jobs.
Relation: Connective - while
Arg1: Advocates said … poor
Arg2: opponents … jobs
Attributions:
  Relation: WA (writer attribution)
  Arg1: WA
  Arg2: WA
Attributions of arguments and relations
Factory orders and construction outlays were largely flat in September, while purchasing agents said manufacturing shrank further in October.
Relation: Connective - while
Arg1: Factory orders … September
Arg2: manufacturing shrank … in October
Attributions:
  Relation: WA
  Arg1: WA
  Arg2: SA (speaker attribution)
How many discourse connectives in PTB?
Types: about 253 (Subordinating: 32, Coordinating: 4, Adverbial/Anaphoric: 217)
Tokens: about 23,620 (Subordinating: 7,011, Coordinating: 6,169, Adverbial/Anaphoric: 10,440)
Empty connectives: Tokens: about 20,000; Types: ??
Total: Tokens: 43,620
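The totals above follow directly from the per-category counts. A short check, using only the figures on this slide (the variable names are illustrative):

```python
# Counts of explicit connectives as given on the slide.
types = {"Subordinating": 32, "Coordinating": 4, "Adverbial/Anaphoric": 217}
tokens = {"Subordinating": 7011, "Coordinating": 6169, "Adverbial/Anaphoric": 10440}
empty_tokens = 20000  # approximate figure for empty connectives

explicit_types = sum(types.values())
explicit_tokens = sum(tokens.values())
total_tokens = explicit_tokens + empty_tokens

print(explicit_types)   # 253
print(explicit_tokens)  # 23620
print(total_tokens)     # 43620
```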
Annotation Guidelines– some comments
• What counts as a discourse connective? -- in general, discourse connectives convey a relation between states, events, situations, etc.
• as a result is a discourse connective
But in
Strangely, conventional wisdom inside the Beltway regards these transfer payments as …
“strangely” requires only a single state/event, which it classifies in the set of “strange” events. Hence, it is not a discourse connective
• What counts as an argument?
Annotation Guidelines– some comments
• How far does an argument extend?
Although [started in 1965], [Wedtech didn’t really get rolling until 1975] (when Mr. Neuberger discovered the Federal Government’s Section 8 minority business program).
“Proper partial overlap”
ARG1: Wedtech didn’t really … 1975
ARG2: started in 1965
SUP2: when Mr. Neuberger … program
Multiple annotations
• In the standard annotation paradigm only one annotation is selected
• At the discourse level multiple annotations cannot be completely avoided
[Big bear doesn’t care for disposable diapers,] which aren’t biodegradable. Yet [parents demand them.]
Big bear doesn’t care for disposable diapers, [which aren’t biodegradable.] Yet [parents demand them.]
Assigning roles to the arguments
For verbs
• In terms of general roles such as agent, theme, goal, instrument, …
• In terms of word specific roles
He wouldn’t accept anything of value from those he was writing about
REL: accept
Arg0: acceptor
Arg1: thing accepted
Arg2: accepted-from
Prague Dependency Treebank (PDT) (1998, 2001), Framenet (2000, 2002), Propbank (2002, 2003)
Assigning “roles” to the arguments of a connective
• In terms of general roles -- ???
• In terms of connective specific “roles”
Roles of arguments of “if” (conditional)
if (hypothetical)
If John studies hard he will pass the examination
REL: if (hypothetical)
ARG0: (Truth condition) circumstances which make ARG1 true
ARG1: (Assertion) expresses assertion
Roles of arguments of “if” (relevance conditional)
if (relevance)
If you are thirsty, there is beer in the fridge
REL: if (relevance conditional)
ARG0: (Relevance condition) circumstances in which ARG1 is relevant
ARG1: (Assertion) expresses assertion
Roles of arguments of “if” (factual conditional)
if (factual)
If Bill is so unhappy here, he should leave
REL: if (factual conditional)
ARG0: (Factual condition) someone other than the speaker believes that ARG0 is true and ARG0 justifies ARG1
ARG1: (Conditional assertion) expresses assertion
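The three role frames for “if” can be collected into a small lookup table, in the spirit of word-specific roles for verbs. This is only a sketch of how connective-specific roles might be stored; the names `IF_ROLES` and `label_if` are hypothetical, while the role labels themselves are the ones given on the slides.

```python
# Role labels per sense of "if", as listed on the slides.
IF_ROLES = {
    "hypothetical": {"ARG0": "Truth condition", "ARG1": "Assertion"},
    "relevance": {"ARG0": "Relevance condition", "ARG1": "Assertion"},
    "factual": {"ARG0": "Factual condition", "ARG1": "Conditional assertion"},
}

def label_if(sense, antecedent, consequent):
    """Attach the sense-specific role labels to the two clauses of an if-sentence."""
    roles = IF_ROLES[sense]
    return {
        "REL": f"if ({sense})",
        "ARG0": (roles["ARG0"], antecedent),
        "ARG1": (roles["ARG1"], consequent),
    }

frame = label_if("relevance", "you are thirsty", "there is beer in the fridge")
for key, value in frame.items():
    print(key, value)
```

A table like this makes it explicit that the sense, not the connective string, determines the roles, so each new sense of “if” would need its own entry.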
Some possible new senses for if
[It will be at their peril] if [Americans allow another happening like the degrading Bork confirmation circus]
ARG1: it will … peril
ARG2: Americans … circus
ARG 1 makes reference to ARG 2
Here “if” is not a hypothetical conditional; it is just a way of making an assertion, much like the hypothetical relevance conditional but not quite like it.
Some possible new senses for if
[Don’t leave home without the American Express card] if [you’d really rather have a Buick.]
Here “if” is more like the hypothetical relevance conditional but not quite like it.
Some possible senses for while
[Under Chapter 11, a company operates under protection from creditors’ lawsuits] while [it works out a plan to pay its debts.]
ARG1: Under … lawsuits
ARG2: it works … debts
Con: while
Sense: Temporal
[Some will likely be offered severance packages] while [others will be transferred to overseas operations.]
Sense: Concessive
Some possible senses for while
[Each company remains independent] while [working together to market and sell their products.]
Sense: Temporal/Concessive
While [the insurance index fell 3.56 to 528.56,] [the Nasdaq bank index fell 5.00 to 432.61.]
Sense: ? Comparison, but no real contrast
Some possible senses for since, when, …
• Senses for since: Temporal, Causal, Temporal/Causal
• Senses for when: Temporal, Causal, Temporal/Causal
Since
• Temporal (T)
  – She hasn’t played any music since the earthquake hit.
• Causal (C)
  – Since the budget measures cash flow, a new $1 direct loan is treated as a $1 expenditure.
• Temporal/Causal (T/C)
  – … and domestic car sales have plunged 19% since the Big Three ended many of their programs Sept 30.
While
• Temporal (T)
  – A nurse contracted the virus while injecting an AIDS patient
• Concession (Con)
  – The basket product, while it has got off to a slow start, is being supported by some firms.
• Opposition (Opp)
  – … one ex-player claims he received $4000 to $5000 for his season football tickets while others said theirs brought only a few hundred dollars.
When
• Temporal
  – The San Francisco earthquake hit when resources in the field already were stretched
• Temporal/Causal
  – When the Trinity Repertory Theatre named Anne Bogart as its artistic director last spring, the nation’s theatrical cognoscenti arched a collective eyebrow
Summary
• Expected date of release – April 2006
  -- all explicit connectives (adjudicated)
  -- all implicit connectives but only about 50% adjudicated
  -- some annotation senses
• All connectives, all senses, and some experimental results – December 2006
Summary
• Boundary between sentence and discourse
  • Flexible
  • Discourse connectives sit at this boundary
  • Similarities and differences between sentence structure and local discourse structure
• Properties of discourse connectives
  • Arguments of connectives, arity is 2
  • Extent of the arguments and their semantics
  • Annotations of attributions -- Mismatch between syntax and discourse
  • Sense annotation -- new opportunities
  • Multiple annotations -- implications