Semi-automatic Building Method for a Multidimensional Affect Dictionary
for a New Language
Guillaume Pitel, Gregory Grefenstette
LREC2008
Manually Built Resources
• Defining Semantic Dimensions of Affect
Manually Built Resources
• Creating seed words
– L1: For each dimension, select 2 to 4 words. Total 229 seed words.
– L2: Extend L1 to an average of 10 words per class. Total 881 seed words.
Manually Built Resources
• Creating the gold standard
– L3: Built using a synonym dictionary(*), with some words manually deleted by a human annotator.
– Total: 4980 word-to-class relations over 3513 distinct words (a word can belong to more than one class).
– L3 includes the 881 seed words of L2, which leaves 3513 − 881 = 2632 words for evaluation.
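The bookkeeping behind these counts can be sketched as follows. This is a minimal illustration, not the authors' tooling: the function names and the toy French words are hypothetical, and only the class names (Pleasure, Comfort, Praise) come from the slides.

```python
from collections import defaultdict


def build_lexicon(relations):
    """Group (word, class) relations into a word -> set-of-classes mapping."""
    lexicon = defaultdict(set)
    for word, affect_class in relations:
        lexicon[word].add(affect_class)
    return dict(lexicon)


def evaluation_words(lexicon, seed_words):
    """Keep only the words that were not already used as seeds."""
    return {w: classes for w, classes in lexicon.items() if w not in seed_words}


# Toy data (hypothetical): one word belongs to two classes.
relations = [
    ("joie", "Pleasure"),
    ("plaisir", "Pleasure"),
    ("plaisir", "Comfort"),
    ("confort", "Comfort"),
    ("éloge", "Praise"),
]
seeds = {"joie"}

lexicon = build_lexicon(relations)
held_out = evaluation_words(lexicon, seeds)

print(len(relations), "relations,", len(lexicon), "distinct words,",
      len(held_out), "words left for evaluation")
```

Because a word may map to several classes, the number of relations is larger than the number of distinct words, as in the 4980 versus 3513 figures above.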
Classifying affect words along their dimensions
• SL-dLSA+SVM: Semantic Likeliness from diversified LSA and SVM.
• δ ∈ [1..10, 15, 20, 25, 30]: window size (14 values).
• Three windows are considered for each δ: [0, +δ], [−δ, +δ], [−δ, 0].
• For each word, each window yields a 300-dimensional LSA vector.
• Total: 14 × 3 × 300 = 12600 dimensions.
– Raw co-occurrence matrices would have totaled some 5.3 million dimensions.
– A 44-class SVM classifier was trained.
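The construction of the diversified feature space can be sketched as below. This is only an illustration of the counting and concatenation: `lsa_lookup` is a hypothetical callback standing in for whatever LSA spaces are actually built, and the order of concatenation is an assumption.

```python
from itertools import product

WINDOW_SIZES = list(range(1, 11)) + [15, 20, 25, 30]  # the 14 values of delta
WINDOW_SHAPES = ("right", "both", "left")             # [0,+d], [-d,+d], [-d,0]
LSA_DIMS = 300                                        # dimensions per LSA space


def window_bounds(delta, shape):
    """Return the (start, end) offsets of a window around the target word."""
    if shape == "right":
        return (0, delta)
    if shape == "both":
        return (-delta, delta)
    if shape == "left":
        return (-delta, 0)
    raise ValueError(f"unknown window shape: {shape}")


def window_configs():
    """All (start, end) windows: one per (delta, shape) combination."""
    return [window_bounds(d, s) for d, s in product(WINDOW_SIZES, WINDOW_SHAPES)]


def diversified_vector(word, lsa_lookup):
    """Concatenate the LSA vectors of `word` over every window configuration.

    `lsa_lookup(word, bounds)` is a hypothetical function returning a list of
    LSA_DIMS floats for the LSA space built with that window.
    """
    features = []
    for bounds in window_configs():
        vector = lsa_lookup(word, bounds)
        if len(vector) != LSA_DIMS:
            raise ValueError("unexpected LSA vector length")
        features.extend(vector)
    return features


configs = window_configs()
total_dims = len(configs) * LSA_DIMS
print(len(configs), "windows,", total_dims, "dimensions")
```

The concatenated vector is what a multi-class SVM would receive as input, one vector per word.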
Scores of the SL-dLSA+SVM 44-class classifier
• Trained on L1
• Trained on L2
Scores of the SL-dLSA+SVM 44-class classifier
• Classification of the word “désagrément” (= annoyance, unpleasantness) using SL-dLSA+SVM with L2
• Classification of the word “disgrâce” (= disgrace, disfavour) using SL-dLSA+SVM with L2
Classifying with SL-PMI measure
• Semantic Orientation Pointwise Mutual Information (Turney and Littman, 2002)
– The SO-PMI measure is intended to evaluate the positiveness/negativeness of a given word.
– The authors adapt SO-PMI to a likeliness measure.
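For reference, the original SO-PMI idea can be sketched as the summed PMI with positive seed words minus the summed PMI with negative seed words. The sketch below assumes counts taken from a corpus; the toy counts, the seed words and the small smoothing constant `eps` are illustrative assumptions, and this is the orientation measure, not the adapted likeliness measure.

```python
import math


def pmi(word_a, word_b, count, cooc, total, eps=0.01):
    """Pointwise mutual information estimated from raw counts.

    `eps` is a small smoothing constant (an assumption) that avoids log(0)
    when two words never co-occur.
    """
    joint = cooc.get(frozenset((word_a, word_b)), 0) + eps
    return math.log2(joint * total / (count[word_a] * count[word_b]))


def so_pmi(word, positive_seeds, negative_seeds, count, cooc, total):
    """Semantic orientation: association with positive minus negative seeds."""
    pos = sum(pmi(word, p, count, cooc, total) for p in positive_seeds)
    neg = sum(pmi(word, n, count, cooc, total) for n in negative_seeds)
    return pos - neg


# Toy counts (hypothetical): word frequencies and co-occurrence frequencies.
count = {"superb": 50, "dreadful": 50, "good": 200, "bad": 200}
cooc = {
    frozenset(("superb", "good")): 20,
    frozenset(("superb", "bad")): 2,
    frozenset(("dreadful", "good")): 1,
    frozenset(("dreadful", "bad")): 25,
}
total = 10000

for w in ("superb", "dreadful"):
    print(w, round(so_pmi(w, ["good"], ["bad"], count, cooc, total), 2))
```

A positive score suggests a positive orientation and a negative score a negative one.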
Classifying with SL-PMI measure
• SL-PMI_C (Semantic Likeliness Pointwise Mutual Information from Information Retrieval for class C)
• H_δ(w1, w2) is the number of co-occurrences of words w1 and w2 in a δ-word window.
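A co-occurrence count such as H_δ can be sketched over a tokenized text as follows. The slide does not say whether the window is symmetric or how repeated matches are counted, so this sketch assumes a symmetric window and counts every pair of positions at distance at most δ; the function name and the sample sentence are hypothetical.

```python
def cooccurrence_count(tokens, w1, w2, delta):
    """Count pairs of positions (i, j) with tokens[i] == w1, tokens[j] == w2
    and 0 < |i - j| <= delta (symmetric window, an assumption)."""
    positions_w2 = [j for j, tok in enumerate(tokens) if tok == w2]
    count = 0
    for i, tok in enumerate(tokens):
        if tok == w1:
            count += sum(1 for j in positions_w2 if 0 < abs(i - j) <= delta)
    return count


tokens = "the joy of the feast brought joy and pleasure".split()
for delta in (1, 2, 7):
    print(delta, cooccurrence_count(tokens, "joy", "pleasure", delta))
```

Widening the window can only increase the count, which is why the choice of δ matters for the resulting scores.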
Scores of the SL-PMI 44-class classifier
• Trained on L1
• Trained on L2
Classifying with SL-LSA measure
• As with SO-PMI, the original SO-LSA measure is intended to evaluate the positiveness/negativeness of a given word.
• LSA_δ(w) is the vector representing word w in an LSA space built with a δ-word window.
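The original SO-LSA idea can be sketched with cosine similarity between LSA vectors: similarity to positive seeds minus similarity to negative seeds. The two-dimensional vectors below are toy values standing in for LSA_δ(w), and the sketch shows the orientation measure only, not the adapted likeliness measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def so_lsa(word, positive_seeds, negative_seeds, vectors):
    """Similarity to positive seeds minus similarity to negative seeds."""
    w = vectors[word]
    pos = sum(cosine(w, vectors[p]) for p in positive_seeds)
    neg = sum(cosine(w, vectors[n]) for n in negative_seeds)
    return pos - neg


# Toy vectors (hypothetical) standing in for LSA_delta(w).
vectors = {
    "good": [1.0, 0.0],
    "bad": [0.0, 1.0],
    "superb": [0.9, 0.1],
    "dreadful": [0.1, 0.9],
}

for w in ("superb", "dreadful"):
    print(w, round(so_lsa(w, ["good"], ["bad"], vectors), 2))
```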
F-scores for the SL-LSA 44-class classifiers
• Trained on L1
• Trained on L2
F-scores of the classification methods
• Using L1 as the training data.
• Using L2 as the training data.
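As a reminder of what is being compared, an F-score over word-to-class relations can be computed as below. The slides do not state how the scores were averaged, so this sketch assumes a single F1 (harmonic mean of precision and recall) over all predicted and gold (word, class) pairs; the sample pairs are hypothetical.

```python
def f_score(predicted, gold):
    """F1 over sets of (word, class) pairs; 0.0 when nothing is correct."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


gold = {("joie", "Pleasure"), ("plaisir", "Pleasure"),
        ("plaisir", "Comfort"), ("éloge", "Praise")}
predicted = {("joie", "Pleasure"), ("plaisir", "Comfort"),
             ("éloge", "Admiration")}

print(round(f_score(predicted, gold), 3))
```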
Improvement ratios between L2 and L1 F-scores
Perspectives
• The authors did not evaluate the SVM classifier on simple LSA feature spaces.
• SL-LSA family of classifiers
– The classifiers had similar F-scores, but their kappa agreement was very low (0.26 to 0.34).
– Selecting the correct answers from SL-LSA(L2,30) and SL-LSA(L2,2) would raise the F-score from 0.13 to 0.19.
• Train an SL-dLSA+SVM classifier using L3 data.
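The two observations about the SL-LSA family can be illustrated with a small sketch: agreement between two classifiers and the gain from picking whichever of the two is right. The slides do not say which kappa statistic was used, so Cohen's kappa is an assumption here, and the label sequences are toy data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two label sequences of equal length."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (observed - expected) / (1 - expected)


def oracle_accuracy(gold, pred_a, pred_b):
    """Share of items where at least one of the two classifiers is correct."""
    return sum(g in (a, b) for g, a, b in zip(gold, pred_a, pred_b)) / len(gold)


gold = ["Comfort", "Pleasure", "Pleasure", "Comfort"]
pred_a = ["Comfort", "Comfort", "Pleasure", "Pleasure"]
pred_b = ["Comfort", "Pleasure", "Pleasure", "Pleasure"]

print("kappa:", cohen_kappa(pred_a, pred_b))
print("oracle accuracy:", oracle_accuracy(gold, pred_a, pred_b))
```

Low agreement between classifiers with similar scores means they are right on different words, which is what makes combining them worthwhile.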
Perspectives
• Some of the classes partially overlap.
• Advantage and Facilitation
• Comfort and Pleasure
• Admiration and Praise
• See the classification examples for “désagrément” and “disgrâce” (page 7)