Seher Acer, Başak Çakar, Elif Demirli, Şadiye Kaptanoğlu

Post on 29-Mar-2015

Page 2

Introduction
Motivation
Aspect Based Clustering
◦ Modeling Aspects
◦ Aspect Extraction
◦ Framing Cycle-Aware Clustering
User Interface & Demo
Conclusion
References

Page 3

News is produced in multiple stages:
◦ Gathering, writing, editing, etc.
Subjective opinions of producers, owners, and advertisers create a biased environment
Effort is needed for a comprehensive and balanced understanding of a news event
Goal: a system that guides and encourages readers to read news from different perspectives

Page 4

Current systems provide a limited presentation of news
◦ News is listed arbitrarily or by date
A system that helps users reach news from different viewpoints via a single portal
Capture the differences in aspects among articles reporting a common news story
Use of advanced information retrieval techniques

Page 6

Aspect: keyword-weight pairs
Keywords are extracted from
◦ Head, sub-head, lead
GATE (General Architecture for Text Engineering)
◦ Person, organization, location
Event extraction (Zemberek)
◦ Frequently used action words/phrases
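A minimal sketch of an aspect as keyword-weight pairs. The slides do not state how weights are assigned, so the per-section weights in `SECTION_WEIGHTS` and the helper `build_aspect` are illustrative assumptions; in the real system the keyword lists would come from GATE and Zemberek rather than being typed in by hand.

```python
from collections import defaultdict

# Hypothetical section weights (assumption): a keyword in the head counts
# more than one in the sub-head, which counts more than one in the lead.
SECTION_WEIGHTS = {"head": 3.0, "sub_head": 2.0, "lead": 1.0}


def build_aspect(keywords_by_section):
    """Return a normalized {keyword: weight} mapping for one article."""
    raw = defaultdict(float)
    for section, keywords in keywords_by_section.items():
        for keyword in keywords:
            raw[keyword.lower()] += SECTION_WEIGHTS[section]
    total = sum(raw.values())
    # Normalize so that the weights of one article sum to 1.
    return {kw: w / total for kw, w in raw.items()} if total else {}


# Plain keyword lists stand in for the extractor output.
aspect = build_aspect({
    "head": ["minister", "resigns"],
    "sub_head": ["minister", "opposition"],
    "lead": ["resigns", "parliament"],
})
```

With these assumed weights, a keyword that appears in both the head and the sub-head ("minister") ends up with the largest share of the aspect.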

Page 8

The set of articles on a news event shows head-tail characteristics
Head – common aspects
Tail – uncommon aspects
Separating head and tail provides effective classification
Two steps:
◦ Head-tail partitioning
◦ Tail-side clustering

Page 9

Generate common and uncommon keyword sets
HgP: head group proportion
Calculate commonness & uncommonness
Commonness – high for an article with many common keywords with high weight values
Uncommonness – high for an article with many uncommon keywords with high weight values
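One way the head-tail partitioning could be sketched is shown here. The slides give no formulas, so everything concrete is an assumption: a keyword is treated as common when it appears in at least `min_share` of the articles, commonness and uncommonness are sums of keyword weights, and HgP is interpreted as the share of articles placed in the head group. The names `split_keywords`, `partition`, and `min_share` are hypothetical.

```python
from collections import Counter


def split_keywords(aspects, min_share=0.5):
    """Split keywords into common/uncommon sets by document frequency (assumed rule)."""
    doc_freq = Counter(kw for aspect in aspects for kw in aspect)
    needed = min_share * len(aspects)
    common = {kw for kw, df in doc_freq.items() if df >= needed}
    return common, set(doc_freq) - common


def commonness(aspect, common):
    # Sum of the weights of the article's common keywords.
    return sum(w for kw, w in aspect.items() if kw in common)


def uncommonness(aspect, uncommon):
    # Sum of the weights of the article's uncommon keywords.
    return sum(w for kw, w in aspect.items() if kw in uncommon)


def partition(aspects, hgp=0.5, min_share=0.5):
    """Return (head, tail) article indices; the top `hgp` share by score is the head."""
    common, uncommon = split_keywords(aspects, min_share)
    scores = [commonness(a, common) - uncommonness(a, uncommon) for a in aspects]
    order = sorted(range(len(aspects)), key=lambda i: -scores[i])
    n_head = round(hgp * len(aspects))
    return sorted(order[:n_head]), sorted(order[n_head:])


articles = [
    {"earthquake": 0.6, "rescue": 0.4},
    {"earthquake": 0.5, "rescue": 0.5},
    {"earthquake": 0.2, "contractor": 0.8},
    {"rescue": 0.1, "aid": 0.9},
]
head, tail = partition(articles)
```

In this toy input the first two articles consist only of widely shared keywords and land in the head, while the two articles dominated by "contractor" and "aid" fall into the tail.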

Page 10

Agglomerative hierarchical clustering
Similarity measure – Cosine similarity
During agglomerative clustering
◦ Each object initially forms a singleton cluster of its own
◦ Pairs of clusters are merged iteratively until a stopping criterion is met
◦ In the merging process, the similarity between two clusters is the similarity of the most similar pair of objects belonging to these two clusters (the single-link approach)
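The clustering steps above can be sketched in plain Python as follows. The slides do not name the stopping criterion, so the similarity `threshold` is an assumption, as are the function names and the toy aspect vectors.

```python
import math


def cosine(a, b):
    """Cosine similarity between two {keyword: weight} aspects."""
    dot = sum(w * b.get(kw, 0.0) for kw, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def single_link_clusters(aspects, threshold=0.5):
    """Merge the most similar pair of clusters until no pair reaches `threshold`."""
    clusters = [[i] for i in range(len(aspects))]  # every article starts alone
    while len(clusters) > 1:
        best, pair = -1.0, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                # Single link: similarity of the closest pair of members.
                sim = max(
                    cosine(aspects[i], aspects[j])
                    for i in clusters[x]
                    for j in clusters[y]
                )
                if sim > best:
                    best, pair = sim, (x, y)
        if best < threshold:  # assumed stopping criterion
            break
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


tail_aspects = [
    {"contractor": 0.8, "earthquake": 0.2},
    {"contractor": 0.7, "permit": 0.3},
    {"aid": 0.9, "rescue": 0.1},
    {"aid": 0.6, "donation": 0.4},
]
groups = single_link_clusters(tail_aspects)
```

Here the two "contractor" articles merge into one cluster and the two "aid" articles into another; the two groups share no keywords, so their similarity is zero and merging stops.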

Page 11

Simple & user-friendly
Present news from different aspects fairly
Motivate readers to read news from different aspects

Page 12

Existing systems: Google News, Yahoo News
◦ Limited presentation
◦ News listed arbitrarily
Proposed system:
◦ Gathers the same news as existing systems
◦ Clusters news according to aspects
◦ Simple user interface
◦ Easy to track news stories

The approach is suitable for Turkish news

Page 13

[1] Park, S., Kang, S., Lee, S., Chung, S., Song, J. Mitigating Media Bias: A Computational Approach. ACM, 2008, pp. 47-51.

[2] Park, S., Kang, S., Chung, S., Song, J. NewsCube: Delivering Multiple Aspects of News to Mitigate Media Bias. ACM, 2009.

[3] Cunningham, H., Maynard, D., Bontcheva, K., Tablan, V. GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics. ACL'02, 2002.

[4] Park, S., Lee, S., Song, J. Aspect-level News Browsing: Understanding News Events from Multiple Viewpoints. ACM, 2010, pp. 41-50.
