Learning Subjective Nouns using Extraction Pattern Bootstrapping Ellen Riloff, Janyce Wiebe, Theresa Wilson Presenter: Gabriel Nicolae

Post on 21-Dec-2015


Subjectivity – the Annotation Scheme

http://www.cs.pitt.edu/~wiebe/pubs/ardasummer02/

Goal: to identify and characterize expressions of private states in a sentence.

Private state = opinions, evaluations, emotions and speculations.

Also judge the strength of each private state: low, medium, high, extreme.

Annotation gold standard:

subjective – a sentence that contains at least one private-state expression of medium or higher strength

objective – all the rest

The time has come, gentlemen, for Sharon, the assassin, to realize that injustice cannot last long.
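The gold-standard rule above can be sketched in a few lines. This is only an illustration: the function name `gold_label` and the list-of-strengths input format are assumptions, not part of the annotation scheme itself.

```python
# Strength scale from the annotation scheme, weakest to strongest.
STRENGTHS = ["low", "medium", "high", "extreme"]


def gold_label(private_state_strengths):
    """Label a sentence from the strengths of its annotated private states.

    `private_state_strengths` is a hypothetical input format: one strength
    string per private-state expression annotated in the sentence.
    """
    threshold = STRENGTHS.index("medium")
    for strength in private_state_strengths:
        if STRENGTHS.index(strength) >= threshold:
            return "subjective"
    # No private state at all, or only low-strength ones.
    return "objective"


print(gold_label(["low", "high"]))  # subjective
print(gold_label(["low"]))          # objective
print(gold_label([]))               # objective
```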

Using Extraction Patterns to Learn Subjective Nouns – Meta-Bootstrapping (1/2)
(Riloff and Jones 1999)

Mutual bootstrapping:

Begin with a small set of seed words that represent a targeted semantic category (e.g. begin with 10 words that represent LOCATIONS) and an unannotated corpus.

Produce thousands of extraction patterns for the entire corpus (e.g. “<subject> was hired”).

Compute a score for each pattern based on the number of seed words among its extractions.

Select the best pattern; all of its extracted noun phrases are labeled as the target semantic category.

Re-score the extraction patterns (original seed words + newly labeled words).
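A minimal sketch of the mutual bootstrapping loop, under stated assumptions. The slide only says that the score depends on the number of seed words among a pattern's extractions; the sketch assumes an RlogF-style score (the fraction of extractions that are known category members, times the log of that hit count). The pattern names, the toy extractions, and the function names are all hypothetical.

```python
import math


def rlogf(extracted, lexicon):
    """Assumed RlogF-style score: (hits / total) * log2(hits)."""
    hits = len(extracted & lexicon)
    if hits == 0:
        return 0.0
    return (hits / len(extracted)) * math.log2(hits)


def mutual_bootstrap(pattern_extractions, seeds, max_iterations=10):
    """Repeatedly pick the best unused pattern and absorb its extractions."""
    lexicon = set(seeds)
    chosen = []
    for _ in range(max_iterations):
        remaining = [p for p in pattern_extractions if p not in chosen]
        if not remaining:
            break
        # Re-score against the seeds plus everything labeled so far.
        best = max(remaining, key=lambda p: rlogf(pattern_extractions[p], lexicon))
        if rlogf(pattern_extractions[best], lexicon) <= 0.0:
            break  # no remaining pattern is associated with the category
        chosen.append(best)
        lexicon |= pattern_extractions[best]
    return lexicon, chosen


# Toy data (hypothetical): pattern -> set of head nouns it extracted.
patterns = {
    "expressed <dobj>": {"concern", "hope", "outrage"},
    "voiced <dobj>": {"concern", "support", "outrage"},
    "bought <dobj>": {"car", "house"},
}
lexicon, chosen = mutual_bootstrap(patterns, {"concern", "hope"})
print(sorted(lexicon))  # ['concern', 'hope', 'outrage', 'support']
print(chosen)           # ['expressed <dobj>', 'voiced <dobj>']
```

Note how "voiced <dobj>" scores zero in the first round but becomes the best pattern once "outrage" has been added: that is the re-scoring step at work.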

Using Extraction Patterns to Learn Subjective Nouns – Meta-Bootstrapping (2/2)

Meta-bootstrapping: after the normal (mutual) bootstrapping run,

all nouns that were put into the semantic dictionary are re-evaluated;

each noun is assigned a score based on how many different patterns extracted it;

only the 5 best nouns are allowed to remain in the dictionary, and the others are discarded;

mutual bootstrapping is restarted.
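The re-evaluation step can be sketched as a simple filter. This is an illustration only: the function name, the toy data, and the alphabetical tie-break are assumptions, and the score here is just the raw count of patterns that extracted each noun.

```python
from collections import Counter


def keep_best_nouns(new_nouns, pattern_extractions, keep=5):
    """Keep the `keep` nouns extracted by the largest number of patterns."""
    counts = Counter()
    for extracted in pattern_extractions.values():
        for noun in extracted & new_nouns:
            counts[noun] += 1
    # Ties are broken alphabetically so the result is deterministic
    # (an assumption; the slide does not say how ties are handled).
    ranked = sorted(new_nouns, key=lambda n: (-counts[n], n))
    return set(ranked[:keep])


# Toy data (hypothetical): three patterns and the nouns each extracted.
patterns = {
    "p1": {"outrage", "hope", "table"},
    "p2": {"outrage", "hope"},
    "p3": {"outrage"},
}
print(sorted(keep_best_nouns({"outrage", "hope", "table"}, patterns, keep=2)))
# ['hope', 'outrage']
```

The surviving nouns would then be added to the seed set before mutual bootstrapping is restarted.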

Using Extraction Patterns to Learn Subjective Nouns – Basilisk
(Thelen and Riloff 2002)

Begin with an unannotated text corpus and a small set of seed words for a semantic category.

Bootstrapping:

Basilisk automatically generates a set of extraction patterns for the corpus and scores each pattern based upon the number of seed words among its extractions. The best patterns go into the Pattern Pool.

All nouns extracted by a pattern in the Pattern Pool go into the Candidate Word Pool. Basilisk scores each noun based upon the set of patterns that extracted it and their collective association with the seed words.

The top 10 nouns are labeled as the targeted semantic class and are added to the dictionary.

Repeat the bootstrapping process.
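One Basilisk iteration might look like the sketch below. The slide does not give the scoring formulas, so both are assumptions: patterns are ranked with an RlogF-style score, and each candidate noun gets the average of log2(hits + 1) over the pooled patterns that extracted it. The function names, the pool size, and the toy data are hypothetical.

```python
import math


def rlogf(extracted, lexicon):
    """Assumed pattern score: (hits / total) * log2(hits)."""
    hits = len(extracted & lexicon)
    if hits == 0:
        return 0.0
    return (hits / len(extracted)) * math.log2(hits)


def avg_log(word, pool, pattern_extractions, lexicon):
    """Assumed noun score: mean of log2(hits + 1) over patterns extracting it."""
    extracting = [p for p in pool if word in pattern_extractions[p]]
    total = sum(math.log2(len(pattern_extractions[p] & lexicon) + 1) for p in extracting)
    return total / len(extracting)


def basilisk_iteration(pattern_extractions, lexicon, pool_size=20, top_words=10):
    """Build the Pattern Pool and Candidate Word Pool, then label the top nouns."""
    scored = [p for p in pattern_extractions if rlogf(pattern_extractions[p], lexicon) > 0.0]
    scored.sort(key=lambda p: rlogf(pattern_extractions[p], lexicon), reverse=True)
    pool = scored[:pool_size]
    # Candidate Word Pool: everything the pooled patterns extract, minus known words.
    candidates = set().union(*(pattern_extractions[p] for p in pool)) - lexicon
    ranked = sorted(
        candidates,
        key=lambda w: (-avg_log(w, pool, pattern_extractions, lexicon), w),
    )
    return lexicon | set(ranked[:top_words])


# Toy data (hypothetical): pattern -> set of head nouns it extracted.
patterns = {
    "expressed <dobj>": {"concern", "hope", "outrage"},
    "voiced <dobj>": {"concern", "support", "outrage"},
    "bought <dobj>": {"car", "house"},
}
print(sorted(basilisk_iteration(patterns, {"concern", "hope"}, pool_size=2, top_words=1)))
# ['concern', 'hope', 'outrage']
```

Calling `basilisk_iteration` repeatedly on its own output corresponds to repeating the bootstrapping process; unlike mutual bootstrapping, each noun is judged on the evidence of several patterns at once.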

Using Extraction Patterns to Learn Subjective Nouns – Experimental Results

The graph tracks the accuracy as bootstrapping progressed.

Accuracy was high during the initial iterations but tapered off as the bootstrapping continued.

After 20 words, both algorithms were 95% accurate. After 100 words, Basilisk was 75% accurate and MetaBoot 81%. After 1000 words, MetaBoot was 28% accurate and Basilisk 53%.

Creating Subjectivity Classifiers – Subjective Noun Features

Naïve Bayes classifier using the nouns as features.

Sets:

BA-Strong: the set of StrongSubjective nouns generated by Basilisk

BA-Weak: the set of WeakSubjective nouns generated by Basilisk

MB-Strong: the set of StrongSubjective nouns generated by Meta-Bootstrapping

MB-Weak: the set of WeakSubjective nouns generated by Meta-Bootstrapping

For each set – a three-valued feature: presence of 0, 1, ≥2 words from that set
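The three-valued feature can be sketched as a capped count. This is an illustration under assumptions: matching is done on lowercased tokens, every matching token counts (not only distinct words), and the words placed in the four sets below are made-up placeholders, not the lists learned in the paper.

```python
def set_count_feature(tokens, word_set):
    """Return 0, 1, or 2, where 2 means 'two or more' members of the set."""
    hits = sum(1 for token in tokens if token.lower() in word_set)
    return min(hits, 2)


def noun_features(tokens, noun_sets):
    """One three-valued feature per noun set (BA-Strong, BA-Weak, ...)."""
    return {name: set_count_feature(tokens, words) for name, words in noun_sets.items()}


# Placeholder sets (hypothetical contents, for illustration only).
noun_sets = {
    "BA-Strong": {"outrage", "hypocrisy"},
    "BA-Weak": {"concern", "aim"},
    "MB-Strong": {"outrage"},
    "MB-Weak": {"plan"},
}
tokens = "Their outrage at the hypocrisy was a concern".split()
print(noun_features(tokens, noun_sets))
# {'BA-Strong': 2, 'BA-Weak': 1, 'MB-Strong': 1, 'MB-Weak': 0}
```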

Creating Subjectivity Classifiers – Previously Established Features
(Wiebe, Bruce, O’Hara 1999)

Sets:

a set of stems positively correlated with the subjective training examples – subjStems

a set of stems positively correlated with the objective training examples – objStems

For each set – a three-valued feature: the presence of 0, 1, ≥2 members of the set.

One binary feature for each of the following: the presence in the sentence of a pronoun, an adjective, a cardinal number, a modal other than will, and an adverb other than not.

Other features from other researchers.
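The binary part-of-speech features could be computed as below from a sentence that has already been tagged. The Penn Treebank tag names (PRP, JJ, CD, MD, RB) are an assumption; the slide does not say which tagset was used, and the function name and input format are hypothetical.

```python
def pos_features(tagged_tokens):
    """Binary features from (word, tag) pairs, assuming Penn Treebank tags."""
    feats = {"pronoun": 0, "adjective": 0, "cardinal": 0, "modal": 0, "adverb": 0}
    for word, tag in tagged_tokens:
        lowered = word.lower()
        if tag in ("PRP", "PRP$"):
            feats["pronoun"] = 1
        elif tag.startswith("JJ"):
            feats["adjective"] = 1
        elif tag == "CD":
            feats["cardinal"] = 1
        elif tag == "MD" and lowered != "will":
            feats["modal"] = 1  # modal other than "will"
        elif tag.startswith("RB") and lowered != "not":
            feats["adverb"] = 1  # adverb other than "not"
    return feats


tagged = [("He", "PRP"), ("will", "MD"), ("not", "RB"),
          ("buy", "VB"), ("two", "CD"), ("cars", "NNS")]
print(pos_features(tagged))
# {'pronoun': 1, 'adjective': 0, 'cardinal': 1, 'modal': 0, 'adverb': 0}
```

In the example, "will" and "not" are deliberately ignored, so the modal and adverb features stay at 0.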

Creating Subjectivity Classifiers – Discourse Features

$$\mathrm{ClueRate}_{subj}(S) = \frac{|\mathrm{subjClues\ in\ } S|}{|S|}$$

subjClues = all sets defined before except objStems

$$\mathrm{ClueRate}_{obj}(S) = \frac{|\mathrm{objStems\ in\ } S|}{|S|}$$

Four features: ClueRate_subj for the previous and following sentences, and ClueRate_obj for the previous and following sentences.

Feature for sentence length.
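The discourse features can be sketched as follows. Assumptions are flagged in the comments: |S| is taken to be the number of tokens in the sentence, a missing neighbor (first or last sentence) contributes a rate of 0.0, and the clue sets in the example are placeholders.

```python
def clue_rate(tokens, clues):
    """Number of clue tokens in the sentence divided by its length |S|."""
    if not tokens:
        return 0.0
    return sum(1 for token in tokens if token.lower() in clues) / len(tokens)


def discourse_features(sentences, i, subj_clues, obj_stems):
    """Four neighbor clue rates plus the length of sentence i."""
    # Assumption: a missing previous/following sentence counts as empty.
    prev_s = sentences[i - 1] if i > 0 else []
    next_s = sentences[i + 1] if i + 1 < len(sentences) else []
    return {
        "prev_subj": clue_rate(prev_s, subj_clues),
        "next_subj": clue_rate(next_s, subj_clues),
        "prev_obj": clue_rate(prev_s, obj_stems),
        "next_obj": clue_rate(next_s, obj_stems),
        "length": len(sentences[i]),
    }


# Toy document (hypothetical), already tokenized.
sentences = [
    ["a", "terrible", "idea"],
    ["he", "left"],
    ["the", "report", "said", "so"],
]
feats = discourse_features(sentences, 1, {"terrible"}, {"report", "said"})
print(feats["prev_subj"], feats["next_obj"], feats["length"])
```

For the middle sentence, the previous sentence has one subjective clue in three tokens and the following sentence has two objective stems in four tokens.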

Creating Subjectivity Classifiers – Classification Results

The results of Naïve Bayes classifiers trained with different combinations of features.

Using both the WBO features (Wiebe, Bruce, O’Hara 1999) and the SubjNoun features achieves better performance than either one alone.

The best results are achieved with all the features combined.

Another classification, with a higher precision, can be obtained by classifying a sentence as subjective if it contains any of the StrongSubjective nouns: 87% precision, 26% recall.
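The rule-based classifier is simple enough to sketch directly, together with the precision and recall computation. The noun set and the three labeled sentences are made-up toy data and do not reproduce the reported figures; the function names are hypothetical.

```python
def is_subjective(tokens, strong_nouns):
    """Rule: subjective if the sentence contains any StrongSubjective noun."""
    return any(token.lower() in strong_nouns for token in tokens)


def precision_recall(predicted, gold):
    """Precision and recall for the 'subjective' class."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (hypothetical): tokenized sentences with gold labels.
strong_nouns = {"outrage"}
data = [
    (["their", "outrage", "grew"], True),
    (["she", "hopes", "so"], True),
    (["prices", "rose"], False),
]
predicted = [is_subjective(tokens, strong_nouns) for tokens, _ in data]
gold = [label for _, label in data]
print(precision_recall(predicted, gold))  # (1.0, 0.5)
```

The toy run shows the same trade-off as the slide: the rule rarely fires on objective sentences, so precision is high, but it misses subjective sentences that contain none of the listed nouns, so recall is low.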