Improving Text Classification using Local Latent Semantic Indexing
Intelligent Database Systems Lab
Presenter: CHANG, SHIH-JIE
Authors: Tao Liu, Zheng Chen, Benyu Zhang, Wei-ying Ma, Gongyi Wu
ICDM 2004
Improving Text Classification using Local Latent Semantic Indexing
Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments
Motivation
• Global LSI ignores class discrimination. It does nothing to improve the discriminative power between document classes, so it yields no improvement in classification performance.
• In local LSI, the improvement in classification performance is very limited because of the weighting problem.
Objectives
• Propose a new local LSI method, Local Relevancy Weighted LSI (LRW-LSI), to address these problems.
Methodology - Local LSI
• χ² statistic (QS-CHI): measures the association between the term and the topic.
• Mutual Information (QS-MI): measures how important a term is to a topic.
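The slide names the two term-scoring measures but does not give formulas. As a minimal sketch, the functions below assume the standard text-categorization definitions computed from a 2×2 term/topic contingency table; the function names and the argument layout (a, b, c, d) are illustrative assumptions, not taken from the paper.

```python
import math

# Contingency counts for one term t and one topic c (assumed layout):
#   a = documents in c that contain t
#   b = documents not in c that contain t
#   c = documents in c that do not contain t
#   d = documents not in c that do not contain t

def chi_square(a, b, c, d):
    """Chi-square association between a term and a topic."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): no measurable association.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom

def mutual_information(a, b, c, d):
    """Pointwise mutual information between a term and a topic."""
    n = a + b + c + d
    if a == 0:
        # The term never occurs in the topic, so log(0) is treated as -inf.
        return float("-inf")
    return math.log(a * n / ((a + c) * (a + b)))
```

A term that is statistically independent of the topic (for example a = b = c = d) scores 0 under both measures, while a term that occurs only in the topic scores high on both.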
Methodology-Local Relevancy Weighted LSI
LRW-LSI Training
(1) An initial classifier IC of topic c is used to assign an initial relevancy score (rs) to each training document.
(2) Each training document is weighted.
(3) The top n documents are selected to generate the local term-by-document matrix of topic c.
(4) A truncated SVD is performed to generate the local semantic space.
(5) All other weighted training documents are folded into the new space.
(6) All training documents, represented as local LSI vectors, are used to train a real classifier RC of topic c.
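Steps (2) to (5) of LRW-LSI training can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: the relevancy scores rs are taken as given (step 1), the sigmoid weighting function and its parameters a and b are assumptions because the slide only says that each document "is weighted", and the names sigmoid_weight and lrw_lsi_fit are hypothetical.

```python
import numpy as np

def sigmoid_weight(rs, a=1.0, b=0.0):
    """Map relevancy scores to weights in (0, 1). Assumed weighting function."""
    return 1.0 / (1.0 + np.exp(-a * (rs + b)))

def lrw_lsi_fit(X, rs, n_top, k, a=1.0, b=0.0):
    """Build a local LSI space for one topic.

    X     : term-by-document matrix, shape (n_terms, n_docs)
    rs    : relevancy score per document from the initial classifier
    n_top : number of top-scoring documents that define the local region
    k     : number of latent dimensions kept (assumed <= rank of the local matrix)
    """
    # Step 2: weight every document (column) by its relevancy weight.
    w = sigmoid_weight(rs, a, b)
    Xw = X * w  # broadcasting scales column j by w[j]

    # Step 3: the top-n documents form the local term-by-document matrix.
    top = np.argsort(-rs)[:n_top]

    # Step 4: truncated SVD of the local matrix.
    U, s, _ = np.linalg.svd(Xw[:, top], full_matrices=False)
    U_k, s_k = U[:, :k], s[:k]

    # Step 5: fold all weighted documents into the local space,
    # using the usual LSI fold-in d_hat = d^T U_k S_k^{-1}.
    Z = (Xw.T @ U_k) / s_k
    return U_k, s_k, Z

# Usage with random data standing in for a real corpus.
rng = np.random.default_rng(0)
X = rng.random((6, 8))      # 6 terms, 8 documents
rs = rng.normal(size=8)     # stand-in for initial classifier scores
U_k, s_k, Z = lrw_lsi_fit(X, rs, n_top=5, k=2)
print(Z.shape)  # (8, 2)
```

The rows of Z are the local LSI vectors on which the real classifier RC would be trained in step (6); any standard classifier could be used there.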
Experiments
Conclusions
• LRW-LSI can greatly improve classification performance while using far fewer dimensions than the global LSI and local LSI methods.
Comments
• Advantages
- LRW-LSI is quite effective.
• Applications
- Text classification.