Subjectivity Annotation Update
Josef Ruppenhofer, Jan Wiebe

Post on 20-Dec-2015

TRANSCRIPT

Page 1: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity Annotation Update

Josef Ruppenhofer, Jan Wiebe

Page 2: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Outline

• Update on our annotations
• Exploration of subjectivity and Discourse Treebank annotations

Page 3: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity Annotation

• Before this year: MPQA annotation scheme and corpus @ www.cs.pitt.edu/mpqa
– English language versions of articles from the world press (187 news sources)
– 535 documents; 11,114 sentences

Wiebe, Wilson, Cardie. Annotating Expressions of Opinions and Emotions in Language. LRE 2005.

Page 4: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity Annotation

• Current work (goals have been expanded schemes, not high volume annotation)
– Extended MPQA
   – Additional data annotated
   – 2005 LRE scheme plus extensions
   – Theresa Wilson’s PhD dissertation (2008) [not-yet-released extensions added to the MPQA corpus]
– Discourse level relations between opinions
– Subjectivity in health surveillance texts
– Word sense subjectivity and polarity

Page 5: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Extended MPQA Scheme

• Documents
– 85 Xbank files
– “Boyan” subset of ULA data
– 1/3 completed
– expected completion: early summer

[MPQA]

Page 6: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Extended MPQA Scheme

• Annotators
– Josef
– two undergraduates

Page 7: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Training Time & Effort

• Training
– Josef’s effort: 75 hours (~2 weeks): 10 preparing materials, 40 basic training, 25 extensions
– Annotators: 120 hours combined (~1.5 weeks each)
– Problems with scheduling arose (annotators did not work planned hours per week; redundant one-on-one meetings)
– With perfect scheduling, estimate 1 week to train two annotators (though Josef is involved in production annotation)

Page 8: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Production annotation

• Single annotator per document
• Annotator time per document (very rough est.): 2 hours 45 mins
– 45 mins of which is time spent on consultation, 15 with each other, 30 with Josef

Page 9: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Periodic Agreement Testing

• documents with known gold standard
• no consultation
• ~every 5 documents
• post-mortem meetings (one on one, group)
• Four annotations to compare (Theresa Wilson, Josef, two undergraduate annotators)

[Results of previous agreement studies in previous papers]

Page 10: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Agreement measurement

• So far, average pair-wise agreement calculated per document
• Full analysis forthcoming
• Relative label reliability: agent > direct-subjective > target > attitude > objective-speech-event > expressive-subjective-element
• Given the interactions between the labels, errors are interrelated
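The slide does not say which agreement metric is averaged, so the following is only a minimal sketch of "average pair-wise agreement calculated per document". It assumes each annotator's output is a set of (start, end, label) tuples and uses Jaccard overlap as a stand-in metric; the function names and the toy data are hypothetical.

```python
from itertools import combinations


def pair_agreement(a, b):
    """Jaccard overlap of two annotation sets (an assumed stand-in metric)."""
    if not a and not b:
        return 1.0  # two empty annotations agree trivially
    return len(a & b) / len(a | b)


def average_pairwise_agreement(annotations):
    """Mean agreement over all annotator pairs for a single document."""
    pairs = list(combinations(sorted(annotations), 2))
    total = sum(pair_agreement(annotations[x], annotations[y]) for x, y in pairs)
    return total / len(pairs)


# Toy document: three hypothetical annotators, spans given as (start, end, label).
doc = {
    "ann1": {(0, 5, "agent"), (6, 11, "direct-subjective")},
    "ann2": {(0, 5, "agent"), (6, 11, "direct-subjective")},
    "ann3": {(0, 5, "agent")},
}
score = average_pairwise_agreement(doc)
```

Exact-match comparison is the strictest possible choice; a real study would likely need some form of lenient span matching, which this sketch does not attempt.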

Page 11: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Annotation Schemes

Page 12: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

What is Subjectivity?

• The linguistic expression of somebody’s opinions, sentiments, emotions, evaluations, beliefs, speculations (private states)

Private state: state that is not open to objective observation or verification Quirk, Greenbaum, Leech, Svartvik (1985). A Comprehensive Grammar of the English Language.

Page 13: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Overview

• Fine-grained: expression-level rather than sentence or document level

• Annotate
– Subjective expressions
– material attributed to a source, but presented objectively

Page 14: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Overview

• Focus on three ways private states are expressed in language

Page 15: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Direct Subjective Expressions

• Direct mentions of private states
The United States fears a spill-over from the anti-terrorist campaign.

• Private states expressed in speech events
“We foresaw electoral fraud but not daylight robbery,” Tsvangirai said.

Page 16: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Expressive Subjective Elements [Banfield 1982]

• “We foresaw electoral fraud but not daylight robbery,” Tsvangirai said

• The part of the US human rights report about China is full of absurdities and fabrications

Page 17: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Objective Speech Events

• Material attributed to a source, but presented as objective fact

The government, it added, has amended the Pakistan Citizenship Act 10 of 1951 to enable women of Pakistani descent to claim Pakistani nationality for their children born to foreign husbands.

Page 18: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Nested Sources

“The US fears a spill-over’’, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.

(writer, Xirao-Nima, US) (writer, Xirao-Nima)

(writer)

“The report is full of absurdities,’’ he continued.

(writer, Xirao-Nima) (writer, Xirao-Nima)

(writer)
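One way to read the nested-source notation above is as a path from the writer down to the entity whose private state or speech is reported. The sketch below assumes that reading and stores each source as a tuple; the helper names are hypothetical and not part of the MPQA scheme.

```python
def immediate_source(nested):
    """The innermost entity, i.e. the one the private state or speech belongs to."""
    return nested[-1]


def is_nested_under(inner, outer):
    """True if `outer` is a proper prefix of `inner` (inner is reported via outer)."""
    return len(outer) < len(inner) and inner[: len(outer)] == outer


# Sources from the first example sentence on this slide.
fears = ("writer", "Xirao-Nima", "US")  # "fears"
said = ("writer", "Xirao-Nima")         # "said"
sentence = ("writer",)                  # the whole sentence
```

Here `is_nested_under(fears, said)` holds because the US's fear is only known through what Xirao-Nima said, which in turn is only known through the writer.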

Page 19: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

“The report is full of absurdities,” Xirao-Nima said.

Objective speech event
anchor: the entire sentence
source: <writer>
implicit: true

Direct subjective
anchor: said
source: <writer, Xirao-Nima>
intensity: high
expression intensity: neutral
attitude type: negative
target: report

Expressive subjective element
anchor: full of absurdities
source: <writer, Xirao-Nima>
intensity: high
attitude type: negative
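The three frames on this slide can be held in a small record type. This is a minimal sketch, not the MPQA corpus file format: the class name `Frame` and its field names are hypothetical, chosen to mirror the attribute names shown on the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Frame:
    kind: str                              # frame type as named on the slide
    anchor: str                            # text span the frame is attached to
    source: Tuple[str, ...]                # nested source, outermost first
    intensity: Optional[str] = None
    expression_intensity: Optional[str] = None
    attitude_type: Optional[str] = None
    target: Optional[str] = None
    implicit: bool = False


# Frames for: "The report is full of absurdities," Xirao-Nima said.
frames = [
    Frame("objective speech event", "the entire sentence", ("writer",), implicit=True),
    Frame("direct subjective", "said", ("writer", "Xirao-Nima"),
          intensity="high", expression_intensity="neutral",
          attitude_type="negative", target="report"),
    Frame("expressive subjective element", "full of absurdities",
          ("writer", "Xirao-Nima"), intensity="high", attitude_type="negative"),
]

# Select the frames that carry subjectivity.
subjective = [f for f in frames if f.kind != "objective speech event"]
```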

Page 20: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Objective speech event
anchor: the entire sentence
source: <writer>
implicit: true

Objective speech event
anchor: said
source: <writer, Xirao-Nima>

Direct subjective
anchor: fears
source: <writer, Xirao-Nima, US>
intensity: medium
expression intensity: medium
attitude type: negative
target: [new work]

“The US fears a spill-over’’, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.

Page 21: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Extensions (Wilson 2008)

I think people are happy because Chavez has fallen.

direct subjective span: are happy source: <writer, I, People> attitude:

inferred attitude span: are happy because Chavez has fallen type: neg sentiment intensity: medium target:

target span: Chavez has fallen

target span: Chavez

attitude span: are happy type: pos sentiment intensity: medium target:

direct subjective span: think source: <writer, I> attitude:

attitude span: think type: positive arguing intensity: medium target:

target span: people are happy because Chavez has fallen

Page 22: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity Types (Wilson 2008)

Other (esp. general cognition)

Page 23: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Discourse-Level Opinion Frames in Task-Oriented Dialogs (AMI)

• Frames are defined in terms of their components
– Opinion spans
– Opinion type: Sentiment, Arguing
– Opinion Polarity
– Targets
– Sources
– Relationships between targets: Same or alternative
• Example motivation: polarity and targets interact
– E.g. an argument for one design that is simultaneously an argument against an alternative design
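The example motivation above can be sketched as a simple rule: an opinion's polarity carries over unchanged to a target in a "same" relation and flips for a target in an "alternative" relation. This is a deliberately simplified reading of the slide, not the opinion-frame scheme itself; the function name and the string labels are hypothetical.

```python
def implied_polarity(polarity, relation):
    """Polarity implied toward a related target.

    polarity: "pos" or "neg", the opinion toward the first target.
    relation: "same" or "alternative", the link between the two targets.
    """
    flipped = {"pos": "neg", "neg": "pos"}
    if relation == "same":
        return polarity
    if relation == "alternative":
        return flipped[polarity]
    raise ValueError(f"unknown target relation: {relation!r}")


# An argument for one design implies an argument against the alternative design.
toward_alternative = implied_polarity("pos", "alternative")
```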

Page 24: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity in Health Surveillance Texts

• Types
– Sentiment
– Belief
   – Belief about what is the case
   – Belief about what should or should not be done
– Knowledge/Awareness of facts
– Agreement/Disagreement between sources in the text
– …
• Sources
– Writer
– Media
– Non-media organizations
– Members of the general public
– …
• Targets
– Occurrence of a disease outbreak
– Danger/severity of an outbreak
– Cause of a disease
– Symptoms
– …

Page 25: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Exploring the relationship between PDTB (2) and Extended MPQA

Page 26: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Overview

• Richer interpretations via combination
• Potential disambiguation both ways

Page 27: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

A connective is marked as “Restatement” when it indicates that the semantics of Arg2 restates the semantics of Arg1. It is inferred that the situations described in Arg1 and Arg2 hold true at the same time.

Page 28: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity preserved in a restatement

“This means Nestle is now in the candybar business in a big way," said Lisbeth Echeandia, publisher of Orlando, Fla.-based Confectioner Magazine. “For them, it makes all kinds of sense.”

Page 29: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity preserved in a restatement: PDTB

“[This means Nestle is now in the candybar business in a big way ARG1]," said Lisbeth Echeandia, publisher of Orlando, Fla.-based Confectioner Magazine. “IMPLICIT_IN SHORT [For them, it makes all kinds of sense ARG2].

Page 30: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity preserved in a restatement

"This [means Nestle is now in the candybar business in a big way ARGUING-POS]," said Lisbeth Echeandia, publisher of Orlando, Fla.-based Confectioner Magazine. “[For them, it makes all kinds of sense ARGUING-POS].

Related opinions; part of the same larger opinion

Page 31: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity preserved in a restatement

"This [means Nestle is now in the candybar business in a big way ARGUING-POS]," said Lisbeth Echeandia, publisher of Orlando, Fla.-based Confectioner Magazine. “[For them, it makes all kinds of sense ARGUING-POS].

Same polarity, type, source; Hyp: common pattern with restatement

Page 32: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity preserved in a restatement

"This [means Nestle is now in the candybar business in a big way ARGUING-POS]," said Lisbeth Echeandia, publisher of Orlando, Fla.-based Confectioner Magazine. “[For them, it makes all kinds of sense ARGUING-POS].

Semantics of restatement: sameness includes subjectivity

Page 33: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Note: Sentiment

"This means Nestle is now in the candybar business in a big way," said Lisbeth Echeandia, publisher of Orlando, Fla.-based Confectioner Magazine. “For them, it [makes all kinds of sense SENTIMENT-POS].”

Not directly part of the restatement relation

Page 34: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

The type “Cause” is used when the connective indicates that the situations described in Arg1 and Arg2 are causally influenced and the two are not in a conditional relation …

Page 35: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity preserved across Reason relation

But Mr. Schwarz welcomes the competition in U.S. Trust's flagship businesses, calling it "flattery." Mr. Schwarz says the competition "broadens the base of opportunity for us." Other firms "are dealing with the masses…”

Page 36: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity preserved across Reason relation: PDTB

[But Mr. Schwarz welcomes the competition in U.S. Trust's flagship businesses ARG1], [calling it "flattery SUP1]." Mr. Schwarz says IMPLICIT_BECAUSE [the competition "broadens the base of opportunity for us ARG2]." Other firms "are dealing with the masses…”

ARG2 is a reason for ARG1

Page 37: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity preserved across Reason relation: subjectivity

But Mr. Schwarz [welcomes SENTIMENT-POS] the competition in U.S. Trust's flagship businesses, calling it "flattery." Mr. Schwarz says the competition “[broadens the base of opportunity for us SENTIMENT-POS]." Other firms "are dealing with the masses.

Positive evaluation which is a reason for a positive feeling; same overall opinion

“I like it because it is so good”

Page 38: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity preserved across Reason relation: subjectivity

But Mr. Schwarz [welcomes SENTIMENT-POS] the competition in U.S. Trust's flagship businesses, calling it "flattery." Mr. Schwarz says the competition “[broadens the base of opportunity for us SENTIMENT-POS]." Other firms "are dealing with the masses.

Subjectivity: same source, target, polarity, type; Hyp: common with reason; Help with target recognition, for example.

Page 39: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity preserved across Reason relation: subjectivity

But Mr. Schwarz [welcomes SENTIMENT-POS] the competition in U.S. Trust's flagship businesses, calling it "flattery." Mr. Schwarz says the competition “[broadens the base of opportunity for us SENTIMENT-POS]." Other firms "are dealing with the masses.

Semantics of reason: specific subtype, where an evaluation is a reason for an attitude

Page 40: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

The type “Cause” is used when the connective indicates that the situations described in Arg1 and Arg2 are causally influenced and the two are not in a conditional relation …

Page 41: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Polarity preserved across Result relation

Other firms "are dealing with the masses. I don't believe they have the culture" to adequately service high-net-worth individuals, he adds.

Page 42: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Polarity preserved across Result relation: PDTB

[Other firms "are dealing with the masses ARG1]. I don't believe IMPLICIT_SO [they have the culture" to adequately service high-net-worth individuals ARG2], he adds.

ARG2 is a result of ARG1

Page 43: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Polarity preserved across Result relation: PDTB

[Other firms "are dealing with the masses ARG1]. I don't believe IMPLICIT_SO [they have the culture" to adequately service high-net-worth individuals ARG2], he adds.

• X said Y: “X said”: X’s belief space
• “I don’t believe” explicit in second sentence
• “Schwarz said” implicit in first sentence
• ARG spans: discourse relation within Schwarz’s belief space

Page 44: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Polarity preserved across Result relation: subjectivity

Other firms “[are dealing with the masses SENTIMENT-NEG]. I [don't believe they have the culture" to adequately service high-net-worth individuals SENTIMENT-NEG], he adds.

Attitude span includes “don’t believe”; schemes require different notions of spans

Page 45: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Polarity preserved across Result relation: subjectivity

Other firms “[are dealing with the masses SENTIMENT-NEG]. I [don't believe they have the culture" to adequately service high-net-worth individuals SENTIMENT-NEG], he adds.

Two negative properties, where the second is a result of the first

Page 46: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Polarity preserved across Result relation: subjectivity

Other firms “[are dealing with the masses SENTIMENT-NEG]. I [don't believe they have the culture" to adequately service high-net-worth individuals SENTIMENT-NEG], he adds.

Discourse relation between ARGs inside his belief space

Page 47: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Polarity preserved across Result relation: subjectivity

Other firms “[are dealing with the masses SENTIMENT-NEG]. I [don't believe they have the culture" to adequately service high-net-worth individuals SENTIMENT-NEG], he adds.

Semantics of result: specific subtype, where a negative state of affairs is the result of another one

Page 48: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

The class tag “COMPARISON” applies when the connective indicates that a discourse relation is established between Arg1 and Arg2 in order to highlight prominent differences between the two situations. Semantically, the truth of both arguments is independent of the connective or the established relation.

Page 49: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

In that suit, the SEC accused Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J., over a three-year period. Through his lawyers, Mr. Antar has denied allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others.

Page 50: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

PDTB

[In that suit, the SEC accused Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J., over a three-year period. ARG1]IMPLICIT_HOWEVER [ Through his lawyers, Mr. Antar has denied allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others. ARG2]

Contrast between the SEC accusing Mr. Antar of something, and his denying the accusation

Page 51: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity

In that suit, the SEC [[accused SENTIMENT-NEG] Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J. ARGUING-POS], over a three-year period. Through his lawyers, Mr. Antar [has denied AGREE-NEG] allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others.

Two attitudes combined into one large disagreement between two parties

Page 52: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity

In that suit, the SEC [[accused SENTIMENT-NEG] Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J. ARGUING-POS], over a three-year period. Through his lawyers, Mr. Antar [has denied AGREE-NEG] allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others.

Subjectivity: arguing-pos and agree-neg with different sources; Hyp: common with contrast. Help recognize the implicit contrast.

Page 53: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Subjectivity

In that suit, the SEC [[accused SENTIMENT-NEG] Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J. ARGUING-POS], over a three-year period. Through his lawyers, Mr. Antar [has denied AGREE-NEG] allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others.

Semantics of comparison: specific case of highlighting prominent differences in attitudes of different people

Page 54: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Discourse and opinion relations are not redundant

• Compare (PDTB)

– It's like [people hate him ARG1] because Reason [people love him so much ARG2] and people love him so much because people hate him so much. ARG2 is a reason for ARG1.

– [Some people hate him ARG1]. IMPLICIT Contrast [ Others love him ARG2].

– Subjectivity is the same: contrasting polarities, the same target, different sources.

Page 55: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Discourse and opinion relations are not redundant

• [Some people hate him ARG1]. IMPLICIT Contrast [Others love him ARG2].
– Sentiment-neg, Sentiment-pos
– Source and polarity contrasts

• [I like the Lexus ARG1] but Contrast [my wife likes the Prius ARG2].
– Sentiment-pos, Sentiment-pos
– Source and target contrasts
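The two Contrast examples above share a discourse relation but differ in which opinion attributes contrast. A minimal sketch of that comparison follows; the dictionary keys and the helper name `contrasts` are hypothetical, and the attribute values are filled in by hand from the slide.

```python
def contrasts(a, b):
    """Return the attribute names on which two opinion records differ."""
    return {key for key in a if a[key] != b[key]}


hate = {"source": "some people", "type": "sentiment", "polarity": "neg", "target": "him"}
love = {"source": "others", "type": "sentiment", "polarity": "pos", "target": "him"}

lexus = {"source": "I", "type": "sentiment", "polarity": "pos", "target": "the Lexus"}
prius = {"source": "my wife", "type": "sentiment", "polarity": "pos", "target": "the Prius"}

first = contrasts(hate, love)     # differs in source and polarity
second = contrasts(lexus, prius)  # differs in source and target
```

Both pairs would receive the same PDTB label, while the opinion annotations separate them, which is the sense in which the two schemes are not redundant.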

Page 56: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Discourse and opinion relations are not redundant

• ARGument and subjectivity spans often do not match exactly
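Because the spans do not line up exactly, combining the two annotation layers needs overlap-based rather than exact-match alignment. The sketch below illustrates this on the Result example from the earlier slides, with quotation marks dropped for simplicity; the helper names, the character-offset representation, and the numbered opinion labels are assumptions made for illustration.

```python
def overlap(a, b):
    """Number of characters shared by two half-open (start, end) spans."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def align(arg_spans, opinion_spans):
    """Map each argument name to the labels of the opinion spans overlapping it."""
    return {
        name: [label for label, span in opinion_spans if overlap(arg, span) > 0]
        for name, arg in arg_spans.items()
    }


text = "Other firms are dealing with the masses. I don't believe they have the culture"


def span(sub):
    """Character offsets of the first occurrence of `sub` in `text`."""
    start = text.index(sub)
    return (start, start + len(sub))


args = {
    "ARG1": span("Other firms are dealing with the masses"),
    "ARG2": span("they have the culture"),
}
opinions = [
    ("SENTIMENT-NEG-1", span("are dealing with the masses")),
    ("SENTIMENT-NEG-2", span("don't believe they have the culture")),
]

aligned = align(args, opinions)
# Exact-match alignment finds nothing, since no opinion span equals an ARG span.
exact = {name: [lab for lab, sp in opinions if sp == arg] for name, arg in args.items()}
```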

Page 57: Subjectivity Annotation Update Josef Ruppenhofer Jan Wiebe

Etc.

• Attributions and nested sources
• Though the schemes are not redundant, some relations seem to imply subjectivity
– E.g., Pragmatic cause; implicit assertions
• Discourse relations may help uncover inferred attitudes