![Page 1: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/1.jpg)
Adrian Iftene, Alexandru Lucian Gînscă
ICCCC 2012, 8-12 May, Băile Felix, Oradea, Romania
“Al. I. Cuza” University of Iasi, Romania
Faculty of Computer Science
![Page 2: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/2.jpg)
System overview
Data acquisition
Topic detection
Data processing
Identification of opinions
Results
Visualization
Conclusions
![Page 3: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/3.jpg)
![Page 4: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/4.jpg)
Scenario: Street protests in Romania (between 13 and 26 January, 2012)
Crawler component, RSS feeds
Scraping: removed links, photos, menus, special characters
Data locally stored
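The scraping step above can be sketched as follows. This is a minimal illustration and not the authors' crawler: the set of skipped tags, the list of characters kept, and the name `clean_article` are assumptions made for the example.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores content inside skipped tags."""

    # Assumption: menus and page chrome live in these tags.
    SKIPPED = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many skipped tags are currently open
        self.parts = []   # text fragments kept so far

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.parts.append(data)


def clean_article(html):
    """Return article text without markup, URLs, or special characters."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts)
    text = re.sub(r"https?://\S+", " ", text)      # drop bare links
    text = re.sub(r"[^\w\s.,;:!?'-]", " ", text)   # drop special characters
    return re.sub(r"\s+", " ", text).strip()       # normalize whitespace
```

Images and link targets disappear because only text nodes are kept; letters with diacritics survive because `\w` is Unicode-aware in Python 3.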
![Page 5: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/5.jpg)
The topic is very important in detecting articles referring to a crisis situation
Latent Dirichlet Allocation: state-of-the-art topic model
Problems:
• The number of topics needs to be specified from the start
• The results are lists of representative words for each topic, so human intervention is needed to interpret them
Solution: WordNet-based similarity measures
• Wu and Palmer
• Lin
• Resnik (best results)
![Page 6: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/6.jpg)
Computing the similarity between 2 sets of words
T1, T2 = two sets of words.
sim(t1, t2) = one of the Wu and Palmer, Resnik or Lin similarity measures.
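The formula from this slide did not survive in the text, so the sketch below is only one plausible way to lift a word-level measure sim(t1, t2) to two sets T1 and T2: a symmetric average of best matches. The aggregation, the name `set_similarity`, and the stand-in `toy_sim` are all assumptions. In practice `sim` would wrap a WordNet measure (NLTK synsets, for instance, expose `wup_similarity`, `res_similarity`, and `lin_similarity`).

```python
def set_similarity(T1, T2, sim):
    """Symmetric best-match average between two sets of words.

    Assumption: this aggregation is illustrative and is not taken
    from the slides.
    """
    T1, T2 = list(T1), list(T2)
    if not T1 or not T2:
        return 0.0
    # For each word, keep its best match in the other set, then average.
    best_1 = sum(max(sim(a, b) for b in T2) for a in T1) / len(T1)
    best_2 = sum(max(sim(b, a) for a in T1) for b in T2) / len(T2)
    return (best_1 + best_2) / 2


def toy_sim(a, b):
    """Stand-in word similarity: 1.0 for identical words, otherwise 0.0."""
    return 1.0 if a == b else 0.0
```

With such a function, the word list produced by LDA for a topic could be compared against a list of crisis-related words without a human reading the list.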
![Page 7: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/7.jpg)
LDA results for our street protests corpus when tracking 3 topics
![Page 8: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/8.jpg)
Language-specific resources that contain cities (Iasi, Bucuresti, Ploiesti, etc.) and regions (Bucovina, Moldova, Transilvania, etc.) (Iftene et al., 2011)
Introducing a more localized approach: new resources and rules for identifying streets (Bulevardul Independentei in Iasi, Calea Victoriei in Bucuresti, etc.) and smaller inner-city regions (Pacurari district, center of Iasi, Arch of Triumph Square)
Examples of rules: to identify streets (Street + entity, Boulevard + entity, etc.); to identify small regions (the area between street A and street B, or the area of building A)
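A rule of the form "Street + entity" can be sketched with a regular expression. The trigger words, the definition of an entity as a run of capitalized words, and the name `find_streets` are assumptions for illustration; the actual resources and rules of the system are not shown in the slides.

```python
import re

# Assumption: a small list of Romanian trigger words for streets.
STREET_TRIGGERS = ("Strada", "Bulevardul", "Calea")

# Assumption: an entity is one or more capitalized words.
ENTITY = r"[A-ZĂÂÎȘȚ][\w-]*(?:\s+[A-ZĂÂÎȘȚ][\w-]*)*"

STREET_RULE = re.compile(
    r"\b(?:%s)\s+%s" % ("|".join(STREET_TRIGGERS), ENTITY)
)


def find_streets(text):
    """Return every 'trigger + entity' match found in the text."""
    return [m.group(0) for m in STREET_RULE.finditer(text)]
```

For example, `find_streets("Protesters marched along Calea Victoriei.")` yields `["Calea Victoriei"]`. A rule this simple over-captures when a capitalized word directly follows the street name, which is one reason a real system would combine it with gazetteers.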
![Page 9: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/9.jpg)
538 files with 2,806 entities of "street" and "area" types
The overall quality of the NE identification component is around 92%, and that of the NE classification component is around 67%
Problems:
◦ incorrect spelling
◦ anaphora resolution
◦ ambiguous situations in which the context does not show whether the NE is a person name or a street name
![Page 10: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/10.jpg)
Rule-based opinion mining system (Gînscă et al., 2011)
Easily adaptable from one crisis scenario to another, in contrast to a statistical approach
Use of manually built resources to identify opinion keywords (good, bad, etc.), amplifiers (most, more, etc.), diminishers (less, etc.), and negations (not, never, etc.)
Calculate the valences for groups of feelings and pair named entities with scores based on distance, punctuation and context
Use a dedicated vocabulary for a specific crisis situation with 21 initial words (protest, conflict, fight, etc.) + similar words from WordNet (synonyms, hypernyms, etc.)
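The interplay of opinion keywords, amplifiers, diminishers and negations can be sketched as below. The numeric weights, the rule that modifiers carry over to the next opinion word, and the name `sentence_valence` are assumptions made for the example; the slides do not give the system's actual scoring, and the pairing of scores with named entities is left out.

```python
# Assumption: tiny illustrative lexicons with made-up weights.
OPINION_WORDS = {"good": 1.0, "bad": -1.0}
AMPLIFIERS = {"most": 2.0, "more": 1.5}
DIMINISHERS = {"less": 0.5}
NEGATIONS = {"not", "never"}


def sentence_valence(tokens):
    """Sum the valences of opinion words, adjusted by preceding modifiers."""
    total = 0.0
    factor = 1.0  # combined effect of modifiers seen since the last opinion word
    for token in tokens:
        word = token.lower()
        if word in NEGATIONS:
            factor *= -1.0
        elif word in AMPLIFIERS:
            factor *= AMPLIFIERS[word]
        elif word in DIMINISHERS:
            factor *= DIMINISHERS[word]
        elif word in OPINION_WORDS:
            total += factor * OPINION_WORDS[word]
            factor = 1.0  # modifiers apply to one opinion word only
    return total
```

Under these weights, "the response was not good" scores -1.0 and "less good" scores 0.5. Moving to a new crisis scenario would then mean swapping the lexicons and leaving the rules unchanged.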
![Page 11: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/11.jpg)
Greedy approach: iteratively add intermediate green points to the current path until the solution cannot be improved
Advantages: we reduce the search space for optimal routes, and the Greedy solution is obtained very fast
Disadvantages: the Greedy solution is only close to the optimal solution; it is not guaranteed to be optimal
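The greedy insertion of green points can be sketched as follows. Points are assumed to be (x, y) tuples, and the example cost (path length minus a fixed bonus per green point visited) is an assumption chosen to keep the sketch short; `greedy_route` and `bonus_cost` are hypothetical names.

```python
import math


def bonus_cost(green_points, bonus=3.0):
    """Build a cost function: path length minus a bonus per green point used."""
    green = set(green_points)

    def cost(path):
        length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
        return length - bonus * sum(1 for p in path if p in green)

    return cost


def greedy_route(start, end, green_points, cost):
    """Insert green points one at a time while the cost keeps decreasing."""
    path = [start, end]
    remaining = list(green_points)
    best = cost(path)
    while remaining:
        choice = None
        # Try every remaining point at every position between start and end.
        for point in remaining:
            for i in range(1, len(path)):
                trial = path[:i] + [point] + path[i:]
                trial_cost = cost(trial)
                if trial_cost < best:
                    best, choice = trial_cost, (point, trial)
        if choice is None:
            break  # no single insertion improves the solution
        point, path = choice
        remaining.remove(point)
    return path, best
```

Each round commits to the best single insertion and never revisits it. This keeps the search fast and is also why the result may stop short of the optimum.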
![Page 12: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/12.jpg)
[Chart: Cumulated sentiment values by day. x-axis: day of January 2012 (13 to 25); y-axis: sentiment value (-40 to 30)]
![Page 13: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/13.jpg)
[Chart: Location-type entity mentions by day. x-axis: day of January 2012 (13 to 25); y-axis: number of mentions (0 to 250)]
![Page 14: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/14.jpg)
GoogleMaps API
Our algorithm is able to find an alternative (longer) path that passes near the red islands without crossing them and prefers routes near the green islands
Thus, at every step it is possible to apply penalties when the partial solution crosses red islands (with potential risks) and to add bonuses when the partial solution crosses green islands (without potential risk)
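The penalty and bonus idea can be sketched as a scoring function over a candidate path. Modeling islands as circles given by (center, radius), treating a segment as "crossing" an island when it comes within the radius of the center, and the values of `penalty` and `bonus` are all assumptions for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment between a and b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.dist(p, a)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (ax + t * dx, ay + t * dy))


def route_score(path, red_islands, green_islands, penalty=10.0, bonus=2.0):
    """Path length, plus penalties for red crossings, minus green bonuses."""
    score = 0.0
    for a, b in zip(path, path[1:]):
        score += math.dist(a, b)
        for center, radius in red_islands:
            if point_segment_distance(center, a, b) <= radius:
                score += penalty
        for center, radius in green_islands:
            if point_segment_distance(center, a, b) <= radius:
                score -= bonus
    return score
```

With a red island on the direct line between two points, a longer detour through a green island can end up with the lower score, which matches the behaviour described above.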
![Page 15: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/15.jpg)
![Page 16: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/16.jpg)
![Page 17: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/17.jpg)
When there are no green islands, we must specify another method for selecting intermediate points in order to improve the quality of the current solution
The GoogleMaps API is able to place streets and boulevards on the map, but it cannot do this for specific squares and areas. For such cases we built an additional resource that specifies their GIS coordinates
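The fallback to the additional resource can be sketched as a two-step lookup. The `geocode` argument stands for any function that returns coordinates or `None`; it is a hypothetical placeholder and not a real GoogleMaps API call. The coordinates used in the usage note are dummy values.

```python
def locate(name, geocode, local_resource):
    """Return (latitude, longitude) for an entity, or None if unknown.

    geocode: hypothetical callable returning coordinates or None.
    local_resource: dict mapping entity names to manually added coordinates.
    """
    coords = geocode(name)
    if coords is None:
        # The map service could not place the entity; use the local resource.
        coords = local_resource.get(name)
    return coords
```

For instance, with `local_resource = {"Arch of Triumph Square": (0.0, 0.0)}` (dummy coordinates) and a geocoder that returns `None` for squares, `locate` falls back to the manually built entry.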
![Page 18: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/18.jpg)
We present a system that can be easily adapted from one crisis situation to another (by changing the dictionaries and the topics of interest)
Efficient topic identification using LDA
Intuitive visualization using the GoogleMaps API
![Page 19: Using opinion mining techniques for early crisis detection](https://reader034.vdocuments.site/reader034/viewer/2022052508/559752751a28abe75b8b4647/html5/thumbnails/19.jpg)