Computational Extraction of Social and Interactional Meaning
from Speech
Dan Jurafsky and Mari Ostendorf
Lecture 5: Agreement, Citation, Propositional Attitude
Mari Ostendorf
Agreement, Citation, Propositional Attitude
Agreement vs. disagreement with propositions (and people)
How to make friends & influence people…
  Tool for affiliation, indicator of influence
  Tool for distancing, indicator of factions or rifts in groups
Important component of group problem solving
Speech Examples Revisited
A: This’s probably what the LDC uses. I mean they do a lot of transcription at the LDC.
B: OK.
A: I could ask my contacts at the LDC what it is they actually use.
B: Oh! Good idea, great idea.
A: After all these things, he raises hundreds of millions of dollars. I mean uh the fella
B: but he never stops talking about it.
A: but ok
B: Aren’t you supposed to y- I mean
A: well that’s a little- the Lord says
B: Does charity mean something if you’re constantly using it as a cudgel to beat your enemies over the- I’m better than you. I give money to charity.
A: Well look, now I…
Subgroups Example: Wikipedia Talk Page
By including the "Haditha Massacre" in the Human Rights Abuse section, we are effectively convicting the Marines that are currently on trial. I think we need to wait until the trial is over. – UnregisteredUser1
Disagree. All I see is the listing "Haditha killings (Under investigation)." Is the word Massacre used? If not, I believe it should be because this word fits every version of the story presented in the public, including Time, the US Marines, and the Iraqi Government. – RegisteredUser1
I agree with RegisteredUser1, this is about (current) history, not law. Just because something hasn't been decided by a court doesn't mean it didn't happen. It should be enough in the article to just mention that the marines charged/suspected of the massacre have not yet been convicted. –RegisteredUser2
I disagree, you cannot call it a human rights violation if it’s not stated what happened there. Also your statement "have not yet been convicted" is kind of the thing we are attempting to avoid. Without guilt or a better understanding of the situation I think it’s premature to put it in the human rights violation section. – RegisteredUser3
Actually, as long as NPOV, WP:Verifiability are maintained you can call it a human rights violation even if it is untrue. As Wikipedia says "As counterintuitive as it may seem, the threshold for inclusion in Wikipedia is verifiability, not truth." Like it or not, as long as there are reputable sources calling it a massacre and/or a human rights violation then it can be included in the article. —RegisteredUser4
Calling it a human rights violation in itself is POV. I also do not think anyone would appreciate you attempting to manipulate wiki policy for the sake of adding POV into an article. – RegisteredUser3
Influencing Example
There is a guideline that we shouldn't semi-protect articles linked from front page, so as to allow new editors a chance to edit articles they are most likely to read. But in this case all we are doing is enabling a swarm of socks. Semi-protection is definitely needed in this instance, with an apology should a new, well-intentioned editor actually show up amidst the swarm and be prevented from editing. Semi-protect this sucker, or we'll never determine the appropriate course of action for this article. RegUser2
Even though semi-protection is defidentally good for what is nominally "my" side … it's against policy and not appropriate. Please take it off. RegUser3
Is is absolutely not against policy. Wikipedia:Protection policy is very clear: … For this article at this time, it's necessary. That's in perfect compliance with policy. RegUser2
Removing the image without discussion is aggressively bad editing (which I am often guilty of). It's not vandalism. sprotect is only for vandalism. RegUser3
Repeated violations of 3RR and using sockpuppets, together with admitting that the purpose of removing the image is to curry favour with one's god and not to improve Wikipedia, doesn't so much cross the line from bad editing to vandalism as pole vault it. – RegUser4
Ok, my WP:AGF is falling. I still think sprotect is agressive, but not as badly as I did before. RegUser3
Influenced participant: alignment change
Online Political Discussion Forum
Q: Gavin Newsom- I expected more from him when I supported him in the 2003 election. He showed himself as a family-man/Catholic, but he ended up being the exact oppisate, supporting abortion, and giving homosexuals marriage licenses. I love San Francisco, but I hate the people. Sometimes, the people make me want to move to Sacramento or DC to fix things up.
R: And what is wrong with giving homosexuals the right to settle down with the person they love? What is it to you if a few limp-wrists get married in San Francisco? Homosexuals are people, too, who take out their garbage, pay their taxes, go to work, take care of their dogs, and what they do in their bedroom is none of your business.
Citations (from Teufel et al., 2006)
Following Pereira et al. ‘93, we measure word similarity by the relative entropy or Kullback-Leibler (KL) distance, between the corresponding conditional distributions.
His [Hindle’s] notion of similarity seems to agree with our intuitions in many cases, but it is not clear how it can be used directly to construct word classes and corresponding models of association.
Overview
Common threads
Examples:
  Agreements & disagreements in meetings
  Agreements & disagreements in online discussions
  Citation function
More common threads
(Plus examples from unpublished UW studies on Wikipedia discussions.)
Common Threads
Sentiment detection (sort of)
  Discussions: agreement/disagreement/neutral
  Citations: positive/negative/neutral (opt. contrast)
Most studies detect person/paper as target, not the proposition per se
Challenges
  Cultural bias & infrequent negatives
  Bag of words is not enough
  Identifying person/paper target of agreement (context can extend beyond the “sentiment” sentence)
  Computational modeling
Challenge: Cultural Bias
English meetings: many more agreements than disagreements
Mandarin wiki discussions: fewer explicit disagreements than in English
Citations: several studies find that negative citations are rare (presumably because they are politically dangerous)
People use positive words to soften the blow:
  “right but….”, “yeah” with negative intonation
Challenge: Polarity Words in BOW
Need to account for negation
  “agree” vs. “don’t agree”, “absolutely” vs. “absolutely not”
  BUT fewer than half the positive words in negative turns are lexically negated
Some part-of-speech issues, e.g. “well”
People include positive words to soften the blow
  dissenting turns have more positive words than negative
  “right” occurs 75 times in dissenting turns, 162 times in neutral turns & only 33 times in supporting turns
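The negation point above can be illustrated with a small sketch. It counts polarity words in a turn and marks a positive word as negated when a negator appears just before or right after it. The word lists, the window size, and the function name are hypothetical choices for illustration, not the lexicons used in the cited studies.

```python
import re

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"agree", "absolutely", "right", "yeah", "good", "great"}
NEGATIVE = {"disagree", "wrong", "no"}
NEGATORS = {"not", "never", "don't", "doesn't", "can't"}


def polarity_counts(turn, window=2):
    """Count positive, negative, and lexically negated positive words."""
    # Normalize curly apostrophes so "don’t" matches the negator list.
    tokens = re.findall(r"[a-z']+", turn.lower().replace("’", "'"))
    counts = {"pos": 0, "neg": 0, "negated_pos": 0}
    for i, tok in enumerate(tokens):
        if tok in POSITIVE:
            before = tokens[max(0, i - window):i]   # e.g. "don't agree"
            after = tokens[i + 1:i + 2]             # e.g. "absolutely not"
            if any(t in NEGATORS for t in before + after):
                counts["negated_pos"] += 1
            else:
                counts["pos"] += 1
        elif tok in NEGATIVE:
            counts["neg"] += 1
    return counts
```

Note that a softened disagreement such as "right but you can't say that" still yields one plain positive count for "right", which is exactly the case that negation handling alone does not catch.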
Polarity Word Trickiness (cont.)
Positive negatives
  “yeah larry i i want to correct something randi said of course”
  “right but but you you can't say that punching him in the back of the head is justified”
Negative positives
  “Steph- vent away – that sucks –”
  “no you stick with what you're doing”
Challenge: Identifying the Target
Baseline: The target is the most recent speaker:
  67% accurate for Wiki discussions
  80% accurate for meetings
Adding names doesn’t help much (70% accurate for Wiki discussions)
Target can be more than one person
In political discussion forum (Abbott et al. 11), 82% of posts with quotes have quotes that can be linked to previous post
Citation information often not in the same sentence as the citation (Teufel et al. 06).
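The most-recent-speaker baseline is simple enough to sketch directly. This is a minimal version, assuming a discussion is given as an ordered list of speaker names and that "most recent speaker" means the closest earlier turn by a different speaker; both function names are hypothetical.

```python
def most_recent_speaker_baseline(speakers):
    """Predict each turn's target as the closest earlier different speaker."""
    targets = []
    for i, speaker in enumerate(speakers):
        target = None  # first turn (or a monologue) has no candidate target
        for prev in reversed(speakers[:i]):
            if prev != speaker:
                target = prev
                break
        targets.append(target)
    return targets


def target_accuracy(predicted, gold):
    """Accuracy over turns that have a gold target (gold is not None)."""
    scored = [(p, g) for p, g in zip(predicted, gold) if g is not None]
    return sum(1 for p, g in scored if p == g) / len(scored)
```

Usage: `most_recent_speaker_baseline(["A", "B", "B", "C", "A"])` gives one prediction per turn, which can then be compared against hand-labeled targets with `target_accuracy`.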
Chat: complication of asynchrony
PubCoord: Are we agreed on about 60 for soda?
Acct: yeah, only ourselves are set apart, I think
Secty: They can't take a bottle.
Secty: Okay, I agree on 60 for soda
PubCoord: Vote
PubCoord: agreed
ProjMgr: Yeah, agree
Secty: How much does ice cost?
PubCoord: 2.50 per pack
Acct: how about 50, because project manager won't drink that much soda
PubCoord: probably
PubCoord: What is he a camel?
Acct: and some folks won't drink any?
Secty: lol
Acct: no, some people dont like flavor, carbonation
ProjMgr: Shut up! Soda can be harsh
Acct: or, OMG calories
Secty: please stay on topic
Acct: yeah, i don’t like the carbonation
PubCoord: Alright, I've identified two of you
ProjMgr: I was just going to say that...
Acct: me too!
Secty: so was that $50 for ice?
Acct: actually, I guess I know who everyone is then
PubCoord: What?
Acct: no, 50 for pop
Secty: oh
PubCoord: No, 50 for soda is fine I guess
Secty: please vote between 50 or 60
PubCoord: I think maybe 10 for ice
ProjMgr: Yeah :/
Acct: and someone already volunteered their cooler?
PubCoord: Yessir
Secty: *please vote between 50 or 60 for soda
Secty: I vote 60
PubCoord: 60
ProjMgr: 50
Acct: i vote 50
ProjMgr: TIE!
PubCoord: then?
Secty: 50 it is
Acct: g d it
Acct: yeah, 55
Secty: okay, 55
Secty: so how much is left, accountant?
Computational Modeling -- Review
Standard text classification problem
  Extract feature vector → apply model → score classes
  Choose class with best score
Popular models
  Naïve Bayes
  Decision trees/forests vs. boostexter/icsiboost
  Maximum entropy
  SVMs
  K-nearest neighbor (lazy learning or memory-based)
Feature selection or regularization
Evaluation:
  Classification accuracy or Macro F (mean of F measures)
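The decision rule and the Macro F measure from this review can be written out as a short sketch. The function names are hypothetical, and classes with no correct predictions are assigned an F of 0 here, which is one common convention rather than something the slides specify.

```python
def choose_class(scores):
    """Pick the class with the best score from a {label: score} mapping."""
    return max(scores, key=scores.get)


def macro_f(gold, predicted):
    """Unweighted mean of per-class F measures."""
    labels = sorted(set(gold) | set(predicted))
    f_scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f_scores.append(2 * precision * recall / (precision + recall))
        else:
            f_scores.append(0.0)  # class never predicted correctly
    return sum(f_scores) / len(f_scores)
```

Because every class contributes equally, a system that never finds the rare disagree class is penalized much more under Macro F than under plain accuracy.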
New since Lec 5
Feature Extraction – Noise Issue
Both speech and text have “noise” challenges
  Speech: speech recognition errors (especially when there is overlapping speech)
  Online discussions: typos and funny spellings
    defidentally good
    the exact oppisate
Not a big issue for edited text (e.g. most articles that would have citations)
Challenge: Skewed Priors
Large percentage of sentences are neutral, standard training algorithms emphasize the frequent classes
Some solutions:
  Use development set to tune detection thresholds
  Random sampling using biased priors and bagging (classifier combination)
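The first solution, tuning a detection threshold on a development set, might look like the following sketch. It assumes the classifier already outputs a score for the rare class on each development item and that F measure is the tuning criterion; the function names and the candidate grid are hypothetical.

```python
def f1_at_threshold(scores, gold, threshold):
    """F measure for the rare class when detecting at score >= threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def tune_threshold(dev_scores, dev_gold, candidates):
    """Return the candidate threshold with the best dev-set F measure."""
    return max(candidates,
               key=lambda t: f1_at_threshold(dev_scores, dev_gold, t))
```

With skewed priors the best threshold is often well below 0.5, since the model's scores for the rare class tend to be low even when it is correct.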
Detecting (Dis)Agreements in Meetings
Adjacency pair speaker detection (given B, find A)
  Target detection for agreements & disagreements
  Also includes question/answer, offer/acceptance, etc.
Classify B as agreement/disagreement/other
  (Backchannels modeled separately, but included in “other” for scoring.)
A: I could ask my contacts at the LDC what it is they actually use.
B: Oh! Good idea, great idea.
Galley et al. 2004
Meeting Data
ICSI Meeting corpus
  75 1-hour meetings, average of 6.5 participants/meeting
  Hand transcribed, audio automatically time aligned
  Hand labeled for adjacency pairs
  7 meetings pause-segmented into “spurts”
Class distribution:
  Agree: 12%
  Disagree: 7%
  Other: 81%
Adjacency Pair – Speaker Ranking
Features (B given, A is candidate target)
  Structural: +/- overlap, # of speakers/spurts between A & B, etc.
  Duration: duration of overlap, duration of A, time between A & B, overlap with others, speaking rate
  Lexical: word counts, counts of shared words, cue word indicators, name indicator, …
  Dialog acts (oracle)
Feature selection: incremental
Classifier: Maximum entropy
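A rough sketch of the ranking setup: for a given spurt B, each earlier spurt by another speaker is a candidate A, described by a few structural, duration, and lexical features, and candidates are ranked by a weighted score. The `Spurt` type, the feature names, and the hand-set weights are hypothetical stand-ins; in the study the weights would come from a trained maximum entropy model.

```python
from dataclasses import dataclass


@dataclass
class Spurt:
    speaker: str
    start: float   # seconds
    end: float     # seconds
    words: list


def pair_features(candidate, b, spurts_between):
    """A few illustrative features for the pair (candidate A, given B)."""
    shared = set(candidate.words) & set(b.words)
    return {
        "overlap": 1.0 if candidate.end > b.start else 0.0,
        "spurts_between": float(spurts_between),
        "gap_seconds": max(0.0, b.start - candidate.end),
        "shared_words": float(len(shared)),
    }


def rank_candidates(history, b, weights):
    """Rank earlier speakers as candidate targets of B, best first."""
    scored = []
    for i, cand in enumerate(history):
        if cand.speaker == b.speaker:
            continue  # a speaker is not their own adjacency-pair target
        feats = pair_features(cand, b, len(history) - 1 - i)
        score = sum(weights.get(name, 0.0) * value
                    for name, value in feats.items())
        scored.append((score, cand.speaker))
    scored.sort(reverse=True)
    return [speaker for _, speaker in scored]
```

In this toy setup a candidate who shares several words with B can outrank the most recent speaker, which is how lexical features improve on the recency baseline.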
Adjacency Pair Results
Only small gain from oracle DA information: 91.3%
Agreement/Disagreement Classifier
Features
  Structural: previous/next spurt same/diff
  Duration: spurt, silence & overlap duration, speech rate
  Lexical: similar to adjacency pairs, plus polarity word counts
  Label dependency: contextual tags (a speaker is likely to disagree with someone who disagrees with them)
Classifier
  Conditional Markov model (Max Entropy Markov Model)
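The label-dependency idea can be sketched as a feature that looks up how the target last responded to the current speaker, combined with left-to-right tagging. This sketch uses greedy decoding with hand-set weights purely for illustration; it is a simplification, not the search or training procedure of the cited system, and all names are hypothetical.

```python
def contextual_tag_features(index, spurts, tags_so_far):
    """Feature for the tag of the target's most recent spurt aimed at the speaker.

    spurts is a list of (speaker, target) pairs; tags_so_far holds the tags
    already assigned to spurts[:index].
    """
    speaker, target = spurts[index]
    feats = {}
    for j in range(index - 1, -1, -1):
        prev_speaker, prev_target = spurts[j]
        if prev_speaker == target and prev_target == speaker:
            feats["target_to_me=" + tags_so_far[j]] = 1.0
            break
    return feats


def greedy_decode(spurts, local_scores, weights,
                  labels=("agree", "disagree", "other")):
    """Tag spurts left to right, adding contextual-tag weights to local scores."""
    tags = []
    for i in range(len(spurts)):
        feats = contextual_tag_features(i, spurts, tags)

        def total(label):
            context = sum(weights.get((name, label), 0.0) * value
                          for name, value in feats.items())
            return local_scores[i].get(label, 0.0) + context

        tags.append(max(labels, key=total))
    return tags
```

A positive weight on ("target_to_me=disagree", "disagree") encodes the observation that a speaker is likely to disagree with someone who has just disagreed with them.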
Agreement/Disagreement Results
Detecting (Dis)Agreement in Online Discussions
Abbott et al., 2011
Task: label R in a Q-R (quote-response) pair as agreement/disagreement.
ARGUE Data
110k forum posts (11k discussion threads, 2764 authors) from website 4forums.com
Forums include: evolution, gun control, abortion, gay marriage, healthcare, death penalty, …
Annotations by Mechanical Turkers with [-5,5] scale
  Disagree-agree (Krippendorff’s α = 0.62)
  Other annotations had α < 0.5: attack, fact/emotion, sarcasm, nice/nasty
8k “good” Q-R pairs annotated; sampling & using a (-1,1) threshold gives 682 pairs for testing
Class distribution: resampled to be balanced
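The thresholding and balancing steps can be sketched as follows. The sketch assumes that each Q-R pair has several annotator scores on the [-5,5] scale, that the mean score is compared strictly against -1 and 1, and that balancing is done by downsampling the larger class. These details and the function names are assumptions for illustration, not a description of the authors' exact procedure.

```python
import random


def label_from_scores(annotator_scores, low=-1.0, high=1.0):
    """Map mean annotator score to a label; None means neutral (dropped)."""
    mean = sum(annotator_scores) / len(annotator_scores)
    if mean < low:
        return "disagree"
    if mean > high:
        return "agree"
    return None


def balance(labeled_pairs, seed=0):
    """Downsample so every label has as many items as the rarest label."""
    rng = random.Random(seed)
    by_label = {}
    for item, label in labeled_pairs:
        by_label.setdefault(label, []).append(item)
    n = min(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        balanced.extend((item, label) for item in rng.sample(items, n))
    return balanced
```

Dropping the pairs that fall inside the threshold band is what makes results on this data optimistic relative to a three-way task that keeps the neutral class.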
(Dis)Agree Classifier
Features
  MetaPost: author info, time between posts, # other quotes
  Unigram & Bigram counts, initial unigram/bigram/trigram
  Repeated punctuation (collapsed to ??, !!, ?!)
  LIWC measures
  Parse dependencies <relation,wi,wj>, POS-polarity opinion dependencies
  Tf-idf cosine distance to previous post
Classifier: Naïve Bayes & JRip (WEKA toolkit)
Chi-squared feature selection, plus feature selection implicit in JRip (rule learner)
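Two of the lexical feature types listed here, collapsed repeated punctuation and initial n-grams, are easy to sketch. The tokenization, the feature-name prefixes, and the function names below are hypothetical choices and are not taken from the cited paper.

```python
import re
from collections import Counter


def collapse_punctuation(text):
    """Collapse runs of ? and ! to one of ??, !!, or ?!."""
    def repl(match):
        run = match.group(0)
        if "?" in run and "!" in run:
            return "?!"
        return run[0] * 2
    return re.sub(r"[?!]{2,}", repl, text)


def response_features(text):
    """Unigram, bigram, and initial n-gram features for a response post."""
    collapsed = collapse_punctuation(text.lower())
    # Keep collapsed punctuation marks as tokens alongside words.
    tokens = re.findall(r"\?\?|!!|\?!|[a-z']+", collapsed)
    feats = Counter("uni=" + t for t in tokens)
    feats.update("bi=" + a + "_" + b for a, b in zip(tokens, tokens[1:]))
    for n in (1, 2, 3):
        if len(tokens) >= n:
            feats["init%d=%s" % (n, "_".join(tokens[:n]))] = 1
    return feats
```

Initial n-grams capture how a response opens, for example "and what is", which is often a stronger cue to disagreement than the same words later in the post.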
Sample (Dis)Agree Classifier
(Dis)Agree Classification Results
• JRip beats NB
• JRip Accuracy: Local features: 68%; Other annotations: 81%
Caveat: optimistic, since neutral cases are removed.
Classification of Citation Function
Teufel et al., 2006
Agreement, usage, compatibility (6)
Weakness (4)
Contrast
Neutral
Citation Study Data
26 articles w/ 548 citations
Kappa = 0.72 for 12 categories
Class distribution: >67% neutral + neutral contrast, 4% negative, 19% usage
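For readers unfamiliar with kappa, here is a minimal sketch of a chance-corrected agreement computation. It implements Cohen's kappa for exactly two annotators as an assumption; the study may have used a different kappa variant or more annotators, and the labels in the usage example are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's own label distribution.
    expected = sum(counts_a[label] * counts_b[label]
                   for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

With a dominant neutral class, raw agreement is high by chance alone, which is why kappa is reported for these annotations rather than percent agreement.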
Citation Classifier
Features
  Grammar of 1762 cue phrases, e.g. “as far as we are aware” from other work + 892 from this corpus
  185 POS patterns for recognizing agents (self-cites vs. others) w/ 20 manually acquired verb clusters
  Verb tense, voice, modality
  Sentence location in paragraph & section
Classifier: K-nearest neighbor (WEKA toolkit)
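A toy sketch of the general approach: cue-phrase and location features feed a k-nearest-neighbor vote. The five cue phrases, the overlap-count similarity, and the function names are hypothetical simplifications and bear no relation to the actual cue-phrase grammar or the WEKA configuration used in the study.

```python
from collections import Counter

# Hypothetical cue phrases, for illustration only.
CUE_PHRASES = ("following", "we use", "it is not clear", "however",
               "in contrast to")


def cue_features(sentence, position_in_paragraph):
    """Set-valued features: matched cue phrases plus a coarse location."""
    text = sentence.lower()
    feats = {cue for cue in CUE_PHRASES if cue in text}
    feats.add("loc=first" if position_in_paragraph == 0 else "loc=later")
    return frozenset(feats)


def knn_classify(query, training, k=3):
    """Majority label among the k training items sharing most features."""
    ranked = sorted(training, key=lambda ex: len(query & ex[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

Memory-based classification of this kind does no training beyond storing examples, so the quality of the feature set and similarity measure carries most of the weight.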
Citation Classification Results
K=0.75 for humans for these categories
Collected Observations re Features
Phrase patterns and location-based n-grams are useful
Structural features are useful
  Location of turn relative to other authors/speakers
  Location of sentence in turn & document
Broader context (beyond target sentence) is useful
  Sequential patterns of disagreement
  Emotion context
Simple cosine similarity is not so useful
Prosodic features not being taken advantage of
More Challenges
Explicit agreement & disagreement do not capture all the phenomena associated with alignment & distancing
  Implicit (dis)agreement via stating an opposite opinion
    A: The video is still an allegation
    B: The video is hard evidence
  … or a rhetorical question
    A: Such a topic is far more broad than the current article but should certainly contain a link back to this one.
    B: How is the [[Iraq invasion controversy]] suggestion more broad?
  Support vs. attack
    Well, you have proven yoruself [sic] to be a man with no brain
    Steph- vent away – that sucks
These phenomena are hard for human annotators to label consistently (exception: citation labels?)
Different studies may group or distinguish them
The victims were teenagers, not children. Furthermore, the teenagers were throwing rocks and makeshift grenades at the soldiers. Second, the video is still an allegation. We should wait until the investigation is completed before putting it up. – RegisteredUser1
The video is hard evidence. If this was 1945, you'd be telling us not to include any footage of the Nazi concentration camps until the Germans had concluded that they committed war crimes. As for your suggestions that those children *deserved* what happened because they allegedly throw rocks at soldiers carrying assault rifles, I find that as offensive as suggesting that America deserved the 9/11 attack because of its foreign policies. – AnonymousUser1
THEY WEREN'T CHILDREN! The article makes NO mention of children whatsoever. So before you all let your emotions run wild over this: a) they weren't children b) they had hand grenades. – RegisteredUser1
YES THEY WERE CHILDREN! Watch the video. The soldiers are clearly acting in hatred and blood-lust, not self-defense. Defending them is like defending a child molester or serial murderer. The video SHOWS children being assaulted. – AnonymousUser2
A 14 year old is definitely a child. There's a reason we don't let 14 year-olds drink, vote, drive, "consent" to sex with adults, or sign legal agreements without a guardian. – RegisteredUser2
At 14 you are definitely a teenager, not a child. 14 year olds can throw a grenade and shoot a rifle, and know the consequences of their actions. Furthermore 18 isn't the age of majority in Iraq so far as I know. In much of the world the drinking and driving ages are 14 and 16. The world is not centered upon our American beliefs, and it's high time that we started accepting that in ALL situations, not just the ones we deem acceptable. I'm absolutely sickened by the brainwashed vehemence and anti-US hatred expressed by so many so called "liberals" on Wikipedia. - RegisteredUser1
In the English language the word adult is generally not used for people under the age of 18. If you want to use it differently you need to explain it in the article in order not to be misleading. Please calm down and do not personally attack others as "brainwashed" or spreading "hatred". – RegisteredUser4
Example Wikipedia Talk Page
Summary
Why look for (dis)agreement, support, etc?
  Dissecting discussions for influence, subgroups, affiliation, successful problem solving, etc
  Understanding citation impact
These tasks are closely related to sentiment detection, except that the target is often part of the problem
Different ways of handling agreement vs. support
The neutral class is huge – don’t ignore it
Computational advice:
  Many better alternatives to Naïve Bayes
  Consider features beyond n-grams