TRANSCRIPT
Sentiment Analysis of
Scientific Citations
Awais Athar
Natural Language and Information Processing Group, Computer Lab
Supervised by: Dr. Simone Teufel
Sentiment Analysis
Sentiment Analysis focuses on identifying positive and negative opinions, emotions or expressions in given text.
Subjectivity Analysis
Example: Movie Reviews
Can we do it automatically?
Simple Sentiment Analysis
This movie is absolutely HILARIOUS!!! I hated
the Spice Girls before my friend made me watch
this movie, and now I LOVE them! This movie is
one of the funniest movies I've ever seen in
my life, and I watch comedies all the time.
This is definitely my new favorite movie.
Sentiment = sign(Number of positive words - Number of negative words) = sign(4 - 1) = sign(3) = +ve
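The word-count heuristic above can be sketched in a few lines of Python (the tiny lexicons here are illustrative only, not an actual sentiment lexicon):

```python
# Simple sentiment: sign of (positive word count - negative word count).
# POSITIVE and NEGATIVE are toy lexicons for the movie-review example.
POSITIVE = {"hilarious", "love", "funniest", "favorite"}
NEGATIVE = {"hated"}

def simple_sentiment(text):
    words = [w.strip(".,!?").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "+ve" if score > 0 else "-ve" if score < 0 else "neutral"

print(simple_sentiment("This movie is absolutely HILARIOUS! I LOVE them!"))  # +ve
```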
Does it always work?
I hate the Spice Girls. I hate how their music is so
… I hate how they promote … And I hate how they're
all over … Why I saw this movie is a really, really,
really long story, but I did, and one would think I'd
despise every minute of it. But... Okay, I'm really
ashamed of it, but I enjoyed it. I mean, I admit it's
a really awful movie, a wannabe … filled with excuses
for them to act wacky as hell… the ninth floor of
hell … a cheap ass cameo in a cheap ass movie. The
plot is such a mess that it's terrible. But I loved
it.
http://www.imdb.com/reviews/111/11181.html
I work on scientific text…
• Scientific papers cite other papers
• A citation is any mention of another document
• Used in citation indexes for search
What do researchers think about this paper?
[Figure: four example papers with their citation counts]
Is citation count a good measure?
A citation sub-graph
[Graph over nodes J93-2003, P02-1039, W03-1002, N04-1021, N09-1025]
Colour the edges
[The same sub-graph with its edges coloured]
After a top-sort
J93-2003 P02-1039 W03-1002 N04-1021 N09-1025
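The order above can be reproduced with Python's `graphlib`. The edge set below is an assumed chain for illustration, since the slide does not list the actual edges:

```python
# Topologically sort the citation sub-graph so that cited (older) papers
# come before the papers that cite them.
from graphlib import TopologicalSorter

# predecessor map: paper -> set of papers it cites (assumed edges)
cites = {
    "P02-1039": {"J93-2003"},
    "W03-1002": {"P02-1039"},
    "N04-1021": {"W03-1002"},
    "N09-1025": {"N04-1021"},
}
order = list(TopologicalSorter(cites).static_order())
print(order)
# ['J93-2003', 'P02-1039', 'W03-1002', 'N04-1021', 'N09-1025']
```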
Why not reuse existing classifiers?
• Sentiment is often hidden
• Often neutral
While SCL has been successfully applied to POS tagging and Sentiment Analysis (Blitzer et al., 2006), its effectiveness for parsing was rather unexplored.
There are five different IBM translation models (Brown et al. , 1993).
Scientific Text
• Negative polarity is often expressed in contrastive terms
• Variation in lexicon
This method was shown to outperform the class based model proposed in (Brown et al., 1992) . . .
Similarity-based smoothing (Dagan, Lee, and Pereira 1999) provides an intuitively appealing approach to language modeling.
Scientific Text
• Technical terms play a major role
• Scope of influence of citations varies widely
Current state of the art machine translation systems (Och, 2003) use phrasal (n-gram) features . . .
As reported in Table 3, small increases in METEOR (Banerjee and Lavie, 2005), BLEU (Papineni et al., 2002) and NIST scores (Doddington, 2002) suggest that . . .
Applications
• Determining the quality of a paper for ranking in citation indexes by including negative citations in the weighting scheme
• Identifying the contributions of a piece of research to its domain
• Identifying shortcomings and detecting problems in a particular approach
• Recognising unaddressed issues and possible gaps in current research approaches
• Identifying the personal bias of an author by observing trends in their criticism
Task 1
Given a formal citation, predict its sentiment
Corpus for Citation Sentiment Analysis
• Manually annotated 8736 citations
• From 310 research papers
• ACL Anthology (Bird et al., 2008)
Distribution of Sentiment across Citations
• Objective: 7541
• Negative: 293
• Positive: 902
Features
Word Level
N-grams
Parts of Speech
Science Lexicon
Contextual Polarity*
Subjectivity Clues
Negation Phrases
Valence Shifters
Sentence Structure
Dependency Structures
Sentence Splitting
Negation
* Wilson et al. 2009
Word Level Features
• N-grams: “The results were good”
  – Unigrams: The, results, were, good
  – Bigrams: The results, results were, were good
  – Trigrams: The results were, results were good
• Parts of Speech: “This lead to good results”
  – DT VBP TO JJ NNS
  – This/DT lead/VBP to/TO good/JJ results/NNS
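The n-gram features above can be extracted with a small helper (a minimal sketch, not the actual feature pipeline):

```python
# Extract contiguous n-grams from a token list, joined back into strings.
def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "The results were good".split()
print(ngrams(tokens, 1))  # ['The', 'results', 'were', 'good']
print(ngrams(tokens, 2))  # ['The results', 'results were', 'were good']
print(ngrams(tokens, 3))  # ['The results were', 'results were good']
```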
Word Level Features
• Science-Specific Sentiment Lexicon
  – 83 manually extracted polar phrases
  – From 736 citations
  – Negative: complicated, daunting, deficiencies, degrade, difficult, inability, lack, poor, restrict, unexplored, worse
  – Positive: acceptance, accurately, adequately, aided, appealing, best-performing, better …
Contextual Polarity Features
• Adjectives
• Adverbs
• Subjectivity Clues
  – Strong / Weak
  – Positive / Negative
• Cardinal Numbers
• Modal Auxiliary Verbs (can, may, could, might, …)
• Negation Phrases (no, not, never, …)
• Polarity Shifters (so-called effort)
Sentence Structure Features
<CIT> showed that the results for French-English were competitive
[Dependency parse diagram: edges nsubj, ccomp, complm, det, prep, pobj, cop over the sentence]
The relationship between results and competitive will be missed by trigrams but the dependency representation captures it in the nsubj(competitive, results) feature.
Dependency Relations
Output from Stanford parser
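A sketch of turning such a parse into features. The (relation, head, dependent) triples below approximate the Stanford parser output shown on the slide and are hard-coded for illustration:

```python
# Dependency relations for:
# "<CIT> showed that the results for French-English were competitive"
parse = [
    ("nsubj", "showed", "<CIT>"),
    ("ccomp", "showed", "competitive"),
    ("complm", "competitive", "that"),
    ("det", "results", "the"),
    ("prep", "results", "for"),
    ("pobj", "for", "French-English"),
    ("nsubj", "competitive", "results"),
    ("cop", "competitive", "were"),
]

# One feature per relation; nsubj(competitive, results) links the distant
# words "results" and "competitive" that trigrams would miss.
features = [f"{rel}({head}, {dep})" for rel, head, dep in parse]
print("nsubj(competitive, results)" in features)  # True
```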
Sentence Structure Features
Sentence Trimming
Removing irrelevant polar phrases around a citation might improve results.
Sentence Trimming Algorithm
Sentence Structure Features
“Turney’s method did not work_neg well_neg although they reported 80% accuracy in <CIT>.”
All words inside a k-word window of any negation term are suffixed with the token _neg to distinguish them from their non-polar versions.
Negation
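A minimal sketch of the negation-window marking, assuming a symmetric window of k words and an illustrative negation list:

```python
# Suffix every word within k tokens of a negation term with "_neg".
NEGATION = {"no", "not", "never", "n't"}

def mark_negation(tokens, k=2):
    out = list(tokens)
    for i, tok in enumerate(tokens):
        if tok.lower() in NEGATION:
            for j in range(max(0, i - k), min(len(tokens), i + k + 1)):
                if j != i and not out[j].endswith("_neg"):
                    out[j] += "_neg"
    return out

print(mark_negation("did not work well although".split()))
# ['did_neg', 'not', 'work_neg', 'well_neg', 'although']
```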
Classifier
• Support Vector Machine
• 10-fold cross-validation
[SVM figure: weight vector w; margins w·x + b = 1 and w·x + b = −1; separating hyperplane w·x + b = 0; red and blue points]
Kernel Trick
φ(x) = (x₁², √2·x₁x₂, x₂²)
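As a sanity check (plain Python, not from the slides), this feature map satisfies φ(x)·φ(z) = (x·z)², which is why the quadratic kernel can stand in for the explicit mapping:

```python
# Verify that the quadratic kernel equals the dot product in feature space.
import math

def phi(x):
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, z = (1.0, 2.0), (3.0, 4.0)
print(dot(phi(x), phi(z)))  # 121.0
print(dot(x, z) ** 2)       # 121.0
```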
Evaluation
Citations: 7541 objective, 293 negative, 902 positive (8736 total)

F_micro = F_o · (7541/8736) + F_n · (293/8736) + F_p · (902/8736)
        ≈ 0.86 · F_o + 0.03 · F_n + 0.10 · F_p

F_macro = (F_o + F_n + F_p) / 3
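The averaging above can be written out directly (class sizes from the corpus; the F values passed in are placeholders):

```python
# Micro- and macro-averaged F over the three citation classes.
COUNTS = {"objective": 7541, "negative": 293, "positive": 902}

def f_micro(f_o, f_n, f_p):
    total = sum(COUNTS.values())  # 8736
    return (COUNTS["objective"] * f_o
            + COUNTS["negative"] * f_n
            + COUNTS["positive"] * f_p) / total

def f_macro(f_o, f_n, f_p):
    return (f_o + f_n + f_p) / 3

# Micro-averaging weights the dominant objective class heavily;
# macro-averaging treats all three classes equally.
print(f_micro(0.9, 0.3, 0.6), f_macro(0.9, 0.3, 0.6))
```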
Results
Challenges in Citation Sentiment Analysis
• Negative sentiment is ‘politically dangerous’- (Ziman, 1968)
• Personal biases are hedged - (Hyland, 1995)
• Criticism is ‘sweetened’ - (MacRoberts and MacRoberts, 1984; Hornsey et al., 2008)
“While SCL has been successfully applied to POS tagging and Sentiment Analysis (Blitzer et al., 2006), its effectiveness for parsing was rather unexplored.”
Problem: Context is Ignored
Problem: Informal Citations Are Ignored
Current work assumes that the sentiment present in the citation sentence represents the true sentiment.
Task 2
Given a sentence, predict whether or not it contains an informal citation
Corpus Construction
• Starting point: Athar’s 2011 citation sentence corpus
• Select top 20 papers; treat all incoming citations to these
• 1,741 citations (from >850 papers)
• 4-class scheme
  – objective/neutral
  – positive
  – negative
  – excluded
Distribution of Classes
Features: Formal Citation
Features: Author’s Name
Features: Acronyms
Features: Work Nouns (Teufel, 2010)
Features: Pronoun
Features: Connectors
Features: Section Markers
Features: Citation Lists
Features: Lexical Hooks
Features: n-Grams
• Using n-grams as baseline
• SVM
• 10-fold cross-validation
• F-score
Results
Task 3 (redefinition of Task 1?)
Given a citation, predict its sentiment (taking informal citations into account)
Impact on Sentiment Detection
• n-grams of length 1 to 3
• Dependency triplets (Athar, 2011)
  – det_results_The, nsubj_good_results, cop_good_were
Annotation Unit is the Citation
• Problem: there may be more than one sentiment per citation
• Annotation unit = citation, so a projection is needed:
  – For the gold standard: assume the last sentiment expressed is the one really meant
  – For automatic treatment: merge the citation context into one single sentence
Results: Context Helps!
• SVM• 10-fold cross-validation• F-score
Back to the original question
[Figure: the same example papers and citation counts as before]
Is citation count a good measure?
Referenced Papers and Citation Count
• Traditional measure: citation count
• Misses informal citations
  – e.g. 1 formal vs. 27 informal
• Most papers are cited out of ‘politeness, policy or piety’ (Ziman, 1968)
• Out of 2,300 citations, 80% were cited only to point towards further information (Spiegel-Rosing, 1977)
• Out of 623 references, only 9% were of essential importance to the citing paper (Hanney et al., 2005)
Task 4
Given a referenced paper, predict whether or not it is significant
Features
Features
New Features
Results
Class-based Comparison
Conclusion
• New large citation sentiment corpus
  – more than 200,000 sentences
• Citation contexts carry subjective references
  – ignoring them would lose a lot of sentiment, especially criticism
• Citation sentiment detection covers
  – all forms of citations
  – indirect mentions and acronyms
• New task of detecting ‘in passing’ references
References
• A. Athar, “Detecting Sentiment in Scientific Citations”, PhD Thesis, Computer Lab, University of Cambridge. 2013 (expected)
• A. Athar and S. Teufel, “Detection of implicit citations for sentiment detection”, in Proc. of Workshop on Detecting Structure in Scholarly Discourse 2012, Jeju, Republic of Korea. 2012.
• A. Athar and S. Teufel, “Context-Enhanced Citation Sentiment Detection”, in Proc. of NAACL/HLT 2012, Montréal, Canada. 2012.
• A. Athar, “Sentiment Analysis of Citations using Sentence Structure-Based Features”, in Proc. of ACL 2011, Portland, Oregon, US. 2011.
Thank you!