Intelligent Database Systems Lab

Presenter : JIAN-REN CHEN

Authors : Sheng-Tun Li a,b,*, Fu-Ching Tsai a

2013, KBS

A fuzzy conceptualization model for text mining with application in opinion polarity classification

Outlines
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

Motivation
• Most existing document classification algorithms are easily affected by ambiguous terms.
• A classifier's ability to disambiguate is thus as important as its ability to classify accurately.
- Application: opinion polarity classification

Objectives
• We propose a concept-driven text classification approach based on Formal Concept Analysis (FCA) that trains a classifier on concepts instead of documents, so as to reduce the inherent ambiguities.
• We further use fuzzy formal concept analysis (FFCA) to take uncertain information into account.
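
One common way fuzzy FCA handles uncertain information is to give each object-attribute pair a membership degree in [0, 1] and keep only the pairs at or above a threshold (an α-cut). The sketch below illustrates that idea only. The membership values, the `alpha_cut` helper, and the assumption that the model presented here uses this particular scheme are all illustrative and not taken from the slides.

```python
# Hypothetical fuzzy context: membership degree of each term in each review.
# These numbers are made up for illustration; they do not come from the paper.
fuzzy_context = {
    "Review5": {"Cover": 0.9, "Fantastic": 0.1},
    "Review6": {"Phenomenal": 0.9, "Fantastic": 0.7, "Cover": 0.2},
    "Review7": {"Phenomenal": 0.8, "Fantastic": 0.6},
}


def alpha_cut(ctx, alpha):
    """Return a crisp context keeping only pairs with membership >= alpha."""
    return {
        obj: {term for term, degree in terms.items() if degree >= alpha}
        for obj, terms in ctx.items()
    }


# A low threshold keeps weak, possibly noisy associations;
# a higher threshold drops them before concepts are formed.
loose = alpha_cut(fuzzy_context, 0.15)
strict = alpha_cut(fuzzy_context, 0.5)
print(sorted(loose["Review6"]))
print(sorted(strict["Review6"]))
```

With this toy data, raising the threshold from 0.15 to 0.5 removes the weak "Cover" association from Review6, which is the kind of effect that makes the choice of threshold dataset-dependent.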

Formal concept analysis

Objects: {Review6, Review7}
Attributes: {Phenomenal, Fantastic, Love}
=> formal concept

positive class: "Phenomenal", "Fantastic" and "Love" {Review1, Review4, Review6, Review7}
neutral class: "Cover" {Review5}
negative class: "Awful" {Review2, Review3}

Formal concept analysis

positive class: {Review1, Review4, Review6, Review7}
negative class: {Review2, Review3}
neutral class: {Review5}
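
To make the formal-concept definition concrete, here is a minimal Python sketch. A pair (objects, attributes) is a formal concept when the attributes are exactly those shared by all the objects, and the objects are exactly those carrying all the attributes. The review-to-term table is a hypothetical toy context chosen only to be consistent with the slide, which does not list which terms occur in which review. The helper names are likewise illustrative.

```python
# Hypothetical toy context (review -> terms it contains).
# Only the concept ({Review6, Review7}, {Phenomenal, Fantastic, Love}) and the
# class memberships come from the slide; the individual rows are assumed.
context = {
    "Review1": {"Phenomenal", "Love"},
    "Review2": {"Awful"},
    "Review3": {"Awful"},
    "Review4": {"Fantastic"},
    "Review5": {"Cover"},
    "Review6": {"Phenomenal", "Fantastic", "Love"},
    "Review7": {"Phenomenal", "Fantastic", "Love"},
}


def common_attributes(objects):
    """Attributes shared by every object in `objects`."""
    if not objects:
        # By convention, the empty object set shares all attributes.
        return set().union(*context.values())
    return set.intersection(*(context[o] for o in objects))


def common_objects(attributes):
    """Objects that carry every attribute in `attributes`."""
    return {o for o, terms in context.items() if attributes <= terms}


def is_formal_concept(objects, attributes):
    """True when each side is exactly the derivation of the other."""
    return (common_attributes(objects) == attributes
            and common_objects(attributes) == objects)


terms = {"Phenomenal", "Fantastic", "Love"}
print(is_formal_concept({"Review6", "Review7"}, terms))  # True
print(is_formal_concept({"Review6"}, terms))             # False: Review7 also has all three
```

The second call fails because the object set is not maximal: every review that carries all three terms must be included for the pair to be a concept.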

Methodology - Architecture

Methodology
• Term measures: tf-idf, Inverted Conformity Frequency (ICF), Uniformity (Uni)
• Thresholds: tf-idf > 26, ICF < log(2), Uni > 0.2
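
The slide gives three thresholds but not the formulas behind the scores, so the sketch below treats the scores as precomputed inputs. It assumes that the three conditions must hold together (the slide does not say how they are combined) and that log is the natural logarithm (also an assumption). The score values and the `passes_thresholds` helper are hypothetical.

```python
import math

# Threshold values as listed on the slide.
TFIDF_MIN = 26
ICF_MAX = math.log(2)  # natural log assumed; the slide does not state the base
UNI_MIN = 0.2


def passes_thresholds(tfidf, icf, uni):
    """True when a term meets all three conditions (joint AND is an assumption)."""
    return tfidf > TFIDF_MIN and icf < ICF_MAX and uni > UNI_MIN


# Hypothetical precomputed scores: term -> (tf-idf, ICF, Uni).
scores = {
    "fantastic": (31.0, 0.40, 0.35),
    "cover": (12.0, 0.10, 0.50),   # fails the tf-idf condition
    "awful": (28.5, 0.90, 0.30),   # fails the ICF condition
}

selected = [term for term, s in scores.items() if passes_thresholds(*s)]
print(selected)  # ['fantastic']
```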

Experiments - Data set and evaluation

• Data set: Reuters-21578, movie reviews, e-book reviews

• Evaluation

Experiments (parameters)

Experiments (conceptualization)

Conclusions
• FFCM successfully reduces the impact of textual ambiguity.
• The experimental results show that FFCM outperforms other state-of-the-art algorithms on both Reuters-21578 and the two opinion polarity collections.

Comments
• Advantages
- the formal concepts play an important role
• Disadvantages
- α may differ across datasets
- only focuses on single-class classification
• Applications
- text mining
