[ieee 2012 16th csi international symposium on artificial intelligence and signal processing (aisp)...


Page 1: [IEEE 2012 16th CSI International Symposium on Artificial Intelligence and Signal Processing (AISP) - Shiraz, Fars, Iran (2012.05.2-2012.05.3)] The 16th CSI International Symposium

Emotions from Farsi Texts with Mutual-Word-Counting and Word-Spotting

Amir Namavar Jahromi Computer Engineering and IT Department

Amirkabir University of Technology Tehran, Iran

[email protected]

Mohammad Mehdi Homayounpour Computer Engineering and IT Department

Amirkabir University of Technology Tehran, Iran

[email protected]

Abstract: Automatic emotion sensing of textual data plays a crucial role in the development of many intelligent applications. The objective of this paper is to design a system for sensing the emotion of a Farsi sentence. The final goal of this work is to use it with Farsi TTS and SDS systems and make it possible to synthesize sentences with their emotions. Four emotions were considered, and four sensing methods were used: (1) word counting, (2) mutual word counting, (3) word-spotting, and (4) a combination of mutual word counting and word-spotting. The performance of the system with each method was evaluated using 10-fold cross-validation, and the combination of mutual word counting and word-spotting was shown to be the best, with an accuracy of about 82%. It was observed that the amount of labeled training data and the type of sentences have an important role in system performance.

Keywords-word-spotting; word-counting; emotion sensing; mutual-word-counting

I. INTRODUCTION

Emotions exist in all kinds of communication, including textual communication, and are an important factor in understanding the full meaning of a message. With the increasing amount of text-based communication being produced (mails, user-created content), researchers are seeking automated language processing techniques that include a model of emotions.

Emotions are also very helpful for understanding the meaning of a sentence and the mood of the speaker. Many methods have been presented to extract emotion from sentences and employ it in TTS systems. These methods are basically divided into two categories: categorical and n-dimensional models.

In this paper we work on categorical models, because of the lack of dimensional resources (such as dictionaries and lexicons) for the Farsi language and the ease of building categorical resources.

Categorical approaches for representing affective states are the most commonly used and are based on a thesaurus that defines the emotional categories. These models are based on the assumption that people using the same language have similar conceptions of different discrete emotions. For example, WordNet, a lexical database of English terms widely used in computational linguistics research [8], was extended with information on affective terms [10].

Several approaches [3,5] rely on the Linguistic Inquiry and Word Count (LIWC), a validated computer tool that analyzes bodies of text using dictionary-based categorization.

Four categorical methods are implemented and tested in this paper. These methods were chosen because they are easy to implement and give good results and high performance.

II. RELATED WORKS

Research on interface agents shows that a system’s capacity for emotional interactions can make the agents valuable [2]. Aiming at enabling computers to express and recognize emotions, emerging technological advances are inspiring the research field of “affective computing”. Since many computer user interfaces today are text-based, automatic emotion recognition from textual data plays an important role in the design of intelligent user interfaces that are more natural and user-friendly.

In the past, many studies have been conducted to automatically detect a user’s affective states from textual data. Some used “keyword-spotting” techniques, but the results were not satisfactory. The keyword-spotting approach cannot be applied to sentences without clearly defined affective keywords. A number of studies applied emotion theories to determine the emotions of interactive agents in intelligent systems [1]. In those approaches, a variety of hand-crafted emotion models based on psychological theories were employed to specify how interactive events, agents and objects are appraised according to an individual’s goals, standards and attitudes. At this stage, emotion sensing based on an emotion theory is only applicable in interactive systems where the interactive events can be precisely defined, enumerated and automatically detected.

Liu, Lieberman, and Selker [6] reported an approach to detect sentence-level emotion based on a large-scale common sense knowledge base, ConceptNet. The approach uses real-world knowledge about the inherent affective nature of everyday situations to classify sentences into basic emotion categories. In the initial stage, concepts in ConceptNet with clearly defined affective keywords were automatically annotated with desired basic emotion categories. Then the emotions for other concepts with a semantic relationship to the affectively annotated concepts are assigned automatically based on certain emotion propagation models. The accuracy of such an emotion propagation process has not yet been investigated. The restricted coverage of the concepts and relationships in ConceptNet seriously limits the use of such an approach in real-life applications.

The 16th CSI International Symposium on Artificial Intelligence and Signal Processing (AISP 2012)

978-1-4673-1479-4/12/$31.00 ©2012 IEEE

Shaikh, Prendinger and Ishizuka [9] recently developed a linguistic tool, “SenseNet”, to detect polarity values of word-, phrase-, and sentence-level textual data. The approach uses WordNet as a linguistic corpus. Polarity values for adjectives and adverbs in WordNet were manually annotated. The polarity of a verb is calculated via some hand-crafted equations that count positive and negative senses from the definitions in WordNet. The polarity of a noun is assigned based on the related verbs obtained from relationships recorded in ConceptNet. One major problem is that, since the data in ConceptNet contains many misspelled words, false concepts, and overly specific data, the correctness of the polarity values of many concepts in SenseNet is questionable. Moreover, concepts not included in WordNet cannot be processed.

Wu, Chuang, and Lin [11] recently proposed a novel approach for sentence-level emotion detection based on the semantic labels (SLs) and attributes (ATTs) of the entities of a sentence. To distinguish the emotions “happy” and “unhappy”, the SLs are manually classified into three categories: Active SLs (e.g. obtain, reach, lost), Negative SLs (e.g. no, never), and Transitive SLs (e.g. finally, but). The ATTs of an entity are automatically obtained by using WordNet as the lexical resource.

Recently, some work has been done toward a robust emotion sensing engine for free text using web mining approaches; one such study proposes a novel approach for detecting the emotion of an individual event embedded in English sentences [7].

A. Word-Counting

This method employs lexical affinity measures. These techniques are a bit more refined than keyword spotting in that they assign each word a probabilistic affinity for a certain emotion. For example, the word “success” has an 80% probability of reflecting a positive event. Similar to keyword spotting, lexical affinity techniques perform poorly when facing intricate sentence structure like “This was not a success at all!”. Additionally, the probability measures may depend on the text corpus used in training. Some measures are computed for each word as the ratio of emotional senses to the total senses the word may have. WordNet and WordNet-Affect are used to find the total number of senses and the number of emotional senses. This task is easy, since it suffices to count how many synsets (sets of synonyms) the word appears in [12].
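The sense-ratio measure described above can be illustrated with a minimal Python sketch. The synsets below are a hand-made toy table, not data taken from WordNet or WordNet-Affect, and the function name `emotional_sense_ratio` is a hypothetical choice made for this illustration.

```python
def emotional_sense_ratio(word, all_synsets, affective_synsets):
    """Ratio of a word's emotional senses to its total senses.

    all_synsets: dict mapping a synset id to the set of words it contains.
    affective_synsets: set of synset ids annotated as affective.
    """
    # A word has one sense per synset in which it appears.
    containing = [sid for sid, words in all_synsets.items() if word in words]
    if not containing:
        return 0.0  # unknown word: no senses, so no affinity
    emotional = sum(1 for sid in containing if sid in affective_synsets)
    return emotional / len(containing)


# Toy synsets, invented for illustration only.
TOY_SYNSETS = {
    "s1": {"cold", "chilly"},
    "s2": {"cold", "unfriendly"},
    "s3": {"cold", "flu"},
    "s4": {"happy", "glad"},
}
TOY_AFFECTIVE = {"s2", "s4"}

if __name__ == "__main__":
    for w in ("cold", "happy", "table"):
        print(w, emotional_sense_ratio(w, TOY_SYNSETS, TOY_AFFECTIVE))
```

In this toy table, "cold" appears in three synsets of which one is affective, so its ratio is one third, while "happy" appears only in an affective synset.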

B. Word-Spotting

This method is based on a lexicon or dictionary that groups words having emotional connections. These techniques predict the emotions of the writer by identifying these affective words in the text. The words are unambiguous and clearly reflect a particular emotion; for instance, “happy” reflects happiness and “scared” reflects fear. These techniques are popular because of their simplicity and economical advantage. However, they rely on individual words, which is why they perform poorly when the sentence structure is more intricate (e.g. use of negation). Additionally, their dependence on the text’s surface features hampers their ability to uncover underlying emotions from the text.

These techniques rely on an available lexicon. One example is WordNet-Affect, which is based on WordNet; the latter is a semantic lexicon in which words are grouped into sets of synonyms (called synsets), and WordNet-Affect further annotates the synsets that have affective content. Another example is SentiWordNet, which assigns WordNet synsets a graded measure with respect to two scales: a positive/negative scale and a subjective/objective scale. It is important to note that the classification is based on synsets, not on words (because a word can have multiple meanings). Unfortunately, due to the large size of the database, it is hard to test the accuracy of the measures for all synsets. That is why some approaches have combined multiple lexicons to increase the accuracy of the results [12].

III. OUR APPROACHES

In this paper, four categorical methods are implemented and tested: Word-Counting, Mutual-Word-Counting, Word-Spotting and the combination of Mutual-Word-Counting and Word-Spotting. In spoken Farsi, the word order may change, for example from “Subject + Object + Verb” to “Verb + Object + Subject”, which makes Farsi a very complicated language for automatic NLP. Such changes make it hard to apply methods known from English emotion sensing, such as using CFGs to extract parts of speech.

The four methods are explained in this section:

A. Word-Counting

The first method developed in this project is Word-Counting. Two types of Word-Counting with different weighting factors were tried in this research. The first one only counts the matching words in all categories and assigns to each word the emotion with the highest score. The second one is based on positive and negative scores: each category has a predefined weighting factor, and the number of matching words is multiplied by this factor. After the emotions of the words are determined, the emotion of the sentence is predicted by adding up the scores of the sentence’s words.

For the second method, many weighting factors were tested, and it was observed that the weights 100, -20, -60 and -120 for the happy, neutral, sad and angry emotions, respectively, lead to the best result.

After some experiments, it was seen that the second method has better results in comparison to the first one, so it was selected as one of the candidate methods.
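As a rough illustration of the weighted variant, the following Python sketch scores a sentence with the weights reported above. Several details are assumptions made for this sketch and are not stated in the paper: that a "matching word" is counted once per occurrence in the training sentences of a category, that tokenization has already been done, and the helper names `build_counts` and `sentence_score`. English placeholder tokens stand in for Farsi words, and the mapping from the final score to an emotion is left out because the text does not specify it.

```python
from collections import Counter

# Weights reported in the text for happy, neutral, sad and angry.
WEIGHTS = {"happy": 100, "neutral": -20, "sad": -60, "angry": -120}


def build_counts(training):
    """training: dict mapping an emotion to a list of tokenized sentences.

    Returns, per emotion, how often each word occurs in that category.
    """
    return {
        emotion: Counter(word for sentence in sentences for word in sentence)
        for emotion, sentences in training.items()
    }


def sentence_score(tokens, counts, weights=WEIGHTS):
    """Sum, over the words of the sentence, of (matches in category) x (category weight)."""
    score = 0
    for word in tokens:
        for emotion, counter in counts.items():
            score += weights[emotion] * counter[word]  # Counter returns 0 for unseen words
    return score


# Toy training data, invented for illustration only.
TOY_TRAINING = {
    "happy": [["good", "day"]],
    "neutral": [["the", "day"]],
    "sad": [["bad", "day"]],
    "angry": [["bad", "bad"]],
}

if __name__ == "__main__":
    counts = build_counts(TOY_TRAINING)
    print(sentence_score(["good", "day"], counts))
    print(sentence_score(["bad"], counts))
```

Because only the happy category carries a positive weight, sentences sharing many words with the other categories are pushed toward large negative scores.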


B. Word-Spotting

After the Word-Counting method, the Word-Spotting method was implemented. For this purpose, a dictionary of 1081 labeled words was built. These words are relevant to all of the emotions except the neutral one.

To extract the emotion of a sentence, its words are looked up in this dictionary; if words of more than one emotion exist in the sentence, the emotion with the largest number of related words is selected as the final result.
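The lookup and vote just described can be sketched in Python as follows. The tiny lexicon is a hypothetical stand-in for the 1081-word dictionary, and two behaviors are assumptions of this sketch rather than statements from the paper: ties go to the emotion whose word is seen first, and a sentence with no dictionary word yields `None`.

```python
from collections import Counter


def spot_emotion(tokens, lexicon):
    """Return the emotion with the most dictionary words in the sentence.

    lexicon: dict mapping a word to its emotion label.
    Returns None when no word of the sentence is in the lexicon.
    """
    hits = Counter(lexicon[word] for word in tokens if word in lexicon)
    if not hits:
        return None
    # most_common keeps first-seen order among equal counts.
    return hits.most_common(1)[0][0]


# Toy lexicon, invented for illustration only (no neutral entries, as in the text).
TOY_LEXICON = {
    "glad": "happy",
    "joy": "happy",
    "tears": "sad",
    "furious": "angry",
}

if __name__ == "__main__":
    print(spot_emotion(["glad", "joy", "furious"], TOY_LEXICON))
    print(spot_emotion(["the", "table"], TOY_LEXICON))
```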

C. Mutual-Word-Counting

In many Farsi sentences with the same emotion, the same words can be seen in their structure. In this technique, training-set sentences that share two words with the input sentence are counted, instead of the single-word matches used in the first method.

Like the Word-Counting method, a weighting factor is needed for each emotional state. As in the first method, two weighting schemes were selected and tested. In the first, the sentences in the files of each emotional state are considered: sentences with two identical words are counted, and the most frequent emotional state is selected as the emotion of the sentence. In the second, one score is calculated for each sentence and the decision is made using this score. After testing several weighting factors, the same factors as in the second Word-Counting technique were selected.
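Both variants can be sketched in a few lines of Python. This is an interpretation under stated assumptions, not the authors' implementation: "two identical words" is read here as at least two distinct shared words between the input sentence and a training sentence, and the function names are hypothetical. English placeholder tokens stand in for Farsi words.

```python
# Weights reported in the text for happy, neutral, sad and angry.
WEIGHTS = {"happy": 100, "neutral": -20, "sad": -60, "angry": -120}


def shares_two_words(a, b):
    """True when two tokenized sentences have at least two distinct words in common."""
    return len(set(a) & set(b)) >= 2


def mutual_counts(tokens, training):
    """Per emotion, the number of training sentences sharing two words with the input."""
    return {
        emotion: sum(1 for sentence in sentences if shares_two_words(tokens, sentence))
        for emotion, sentences in training.items()
    }


def most_frequent_emotion(counts):
    """First variant: pick the emotion with the most matching sentences (None if no match)."""
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def mutual_score(counts, weights=WEIGHTS):
    """Second variant: one weighted score per sentence."""
    return sum(weights[emotion] * n for emotion, n in counts.items())


# Toy training data, invented for illustration only.
TOY_TRAINING = {
    "happy": [["what", "a", "lovely", "day"], ["lovely", "day", "today"]],
    "neutral": [["the", "day", "is", "long"]],
    "sad": [["a", "sad", "day"]],
    "angry": [["go", "away"]],
}

if __name__ == "__main__":
    counts = mutual_counts(["a", "lovely", "day"], TOY_TRAINING)
    print(counts, most_frequent_emotion(counts), mutual_score(counts))
```

In the toy data, the input shares two words with both happy sentences and with the single sad sentence, so the first variant returns "happy" and the second returns 2 x 100 + 1 x (-60) = 140.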

D. Mutual-Word-Counting and Word-Spotting

The results of the previous method are not good enough to be used in real systems, so it is combined with a second method to obtain better results and higher performance. Table I presents the results of the Mutual-Word-Counting method. In this table, the score range is divided into intervals. The number of sentences from each category in each interval is counted, and it is then decided which emotions each interval is related to. After this, Word-Spotting is applied only to the selected emotions instead of all emotions.

TABLE I. EMOTION DISTRIBUTION IN INTERVALS FOR MUTUAL-WORD-COUNTING

Scores          Emotions (Happy, Neutral, Sad, Angry)
>=50            *
10,50           *
-10,10          * *
-100,-10        * *
-200,-100       * * *
-300,-200       * *
-400,-300       * * *
-600,-400       * *
-700,-600       * *
-800,-700       * *
-1000,-800      * *
-1200,-1000     * *
-1300,-1200     * *
-1400,-1300     * *
-1500,-1400     * *
-1600,-1500     * *
-1700,-1600     * * * *
-1800,-1700     * *
-1900,-1800     * *
-2000,-1900     * *
<=-2000         * *

This method performs better than Word-Spotting alone, since the word-spotting step is not run over all emotional dictionaries. In addition, its results are better than those of Mutual-Word-Counting alone, since with that method more than one result is possible for each interval.
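A minimal Python sketch of the two-stage decision follows. The interval-to-emotion map `HYPOTHETICAL_INTERVALS` is invented for illustration and does not reproduce Table I; the behavior when word-spotting finds no candidate word (returning `None`) is also an assumption, since the text does not describe a fallback.

```python
import math
from collections import Counter

# Hypothetical (low, high, candidate emotions) intervals; NOT the values of Table I.
HYPOTHETICAL_INTERVALS = [
    (50, math.inf, {"happy"}),
    (-10, 50, {"happy", "neutral"}),
    (-math.inf, -10, {"sad", "angry"}),
]


def candidates_for(score, intervals=HYPOTHETICAL_INTERVALS):
    """Stage 1: map a Mutual-Word-Counting score to its candidate emotions."""
    for low, high, emotions in intervals:
        if low <= score < high:
            return emotions
    return set()


def classify(score, tokens, lexicon, intervals=HYPOTHETICAL_INTERVALS):
    """Stage 2: word-spotting restricted to the candidate emotions of the interval."""
    candidates = candidates_for(score, intervals)
    if len(candidates) == 1:
        return next(iter(candidates))  # the interval alone decides
    hits = Counter(lexicon[w] for w in tokens if lexicon.get(w) in candidates)
    if not hits:
        return None  # fallback is unspecified in the text
    return hits.most_common(1)[0][0]


# Toy lexicon, invented for illustration only.
TOY_LEXICON = {"glad": "happy", "tears": "sad", "furious": "angry", "shout": "angry"}

if __name__ == "__main__":
    print(classify(120, [], TOY_LEXICON))
    print(classify(-300, ["tears", "furious", "shout"], TOY_LEXICON))
```

Restricting the lookup to the candidate emotions means that a stray word from an unrelated dictionary (for example "glad" in a sentence with a strongly negative score) cannot change the decision.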

IV. EVALUATION

The system was tested with 10-fold cross-validation on 2243 sentences from four emotional categories: happy, neutral, sad and angry. These sentences were extracted in the Laboratory for Intelligent Multimedia Processing from many Farsi plays with different topics, from historic to dramatic dialogues.
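The evaluation protocol can be sketched in Python as below. The shuffling, the seed, the way folds are formed, and the pooling of correct predictions over all folds are assumptions of this sketch; the paper only states that 10-fold cross-validation was used. `train_fn` and `predict_fn` are hypothetical callables standing in for any of the four methods.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k roughly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i, test_idx in enumerate(folds):
        train_idx = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train_idx, test_idx


def cross_validate(samples, labels, train_fn, predict_fn, k=10, seed=0):
    """Accuracy pooled over all folds: each sample is tested exactly once."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(samples), k, seed):
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct += sum(predict_fn(model, samples[i]) == labels[i] for i in test_idx)
    return correct / len(samples)


if __name__ == "__main__":
    # With 2243 sentences and k=10, each test fold holds 224 or 225 sentences.
    sizes = [len(test) for _, test in k_fold_indices(2243, k=10)]
    print(sorted(set(sizes)))
```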

As illustrated in Table II, the Word-Counting method has the worst results, with an accuracy of about 48%, while the most reliable method is the combination of Mutual-Word-Counting and Word-Spotting, with 82% accuracy. We therefore suggest the last method as a suitable way to extract the emotional state of Farsi sentences.

TABLE II. PERFORMANCE COMPARISON OF ALL EMOTION DETECTION IN PERCENT

Extraction method                         %
Word-Counting                             48%
Word-Spotting                             60%
Mutual-Word-Counting                      68%
Mutual-Word-Counting & Word-Spotting      82%

Table III shows the percentage of correctness for each emotion with the last method. It can be seen that neutral sentences are hard to determine with this method, since their words do not show any specific emotion. On the other hand, the angry and happy emotions have the best results, respectively.

TABLE III. EMOTIONAL STATE DETECTION PERFORMANCE USING THE SUGGESTED METHOD

Emotions    Happy    Neutral    Sad    Angry
%           91%      60%        80%    97%

V. CONCLUSION

The aim of this paper is to determine the emotion of a Farsi sentence so that it can be employed in TTS systems to make the output more user-friendly and more natural. Among the four implemented methods, the combination of Mutual-Word-Counting and Word-Spotting showed the best results, with about 82% accuracy in detecting the emotional state of Farsi sentences.

REFERENCES

[1] C. Bartneck, “Integrating the OCC model of emotions in embodied characters,” Workshop on Virtual Conversational Characters: Applications, Methods, and Research Challenges, Melbourne, 2002.

[2] J. Bates, “The role of emotion in believable agents,” Communications of the ACM, vol. 37, no. 7, pp. 122-125, 1994.

[3] M. A. Cohn, M. R. Mehl, and J. W. Pennebaker, “Linguistic markers of psychological change surrounding September 11, 2001,” Psychological Science, vol. 15, pp. 687-693, 2004.

[4] M. Dyer, “Emotions and their computations: Three computer models,” Cognition and Emotion, vol. 1, no. 3, pp. 323-347, 1987.

[5] J. Kahn, R. Tobin, A. Massey, et al., “Measuring emotional expression with the Linguistic Inquiry and Word Count,” American Journal of Psychology, vol. 120, pp. 263-286, 2007.

[6] H. Liu, H. Lieberman, and T. Selker, “A model of textual affect sensing using real-world knowledge,” 8th International Conference on Intelligent User Interfaces, New York, 2003.

[7] C. Lu, S. Lin, J. Liu, et al., “Automatic event-level textual emotion sensing using mutual action histogram,” Expert Systems with Applications, vol. 37, pp. 1643-1653, 2010.

[8] G. Miller, R. Beckwith, C. Fellbaum, et al., “Introduction to WordNet: an on-line lexical database,” International Journal of Lexicography, vol. 3, pp. 235-244, 1990.

[9] M. Shaikh, H. Prendinger, and M. Ishizuka, “SenseNet: A linguistic tool to visualize numerical-valence based sentiment of textual data,” International Conference on Natural Language Processing, pp. 147-152, 2007.

[10] C. Strapparava and A. Valitutti, “WordNet-Affect: an affective extension of WordNet,” 4th International Conference on Language Resources and Evaluation, pp. 1083-1086, 2004.

[11] C. Wu, Z. Chuang, and Y. Lin, “Emotion recognition from text using semantic labels and separable mixture models,” ACM Transactions on Asian Language Information Processing, vol. 5, no. 2, pp. 165-183, 2006.

[12] M. Yassine and H. Hajj, “A framework for emotion mining from text in online social networks,” in International Conference on Data Mining, pp. 1136-1142, 2010.
