a topic map-based ontology ir system versus clustering-based ir system: a comparative study in...
DESCRIPTION
Due to the increasing amount and complexity of digital resources, several critical issues arise in digital environments, such as ill-structured and poorly managed digital information. Different information organization approaches have been used to address these issues. In particular, the Semantic Web has been explored for 10 years; however, there are not many practical applications. This is in part because much attention has been given to the creation of new data rather than the migration of existing data. In addition, the lack of guidelines for choosing the right migration approach, whether Topic Maps or Resource Description Framework (RDF), needs to be addressed. This paper presents a comparison of Semantic Web Data Models (Topic Maps and RDF), followed by an example of migration of existing metadata into ontology-based data for the Semantic Web.
TRANSCRIPT
![Page 1: A Topic map-based ontology IR system versus Clustering-based IR System: A Comparative Study in Security Domain](https://reader035.vdocuments.site/reader035/viewer/2022070303/5496ed80ac795925288b53b7/html5/thumbnails/1.jpg)
A Topic map-based ontology IR system versus Clustering-based IR System: A Comparative Study in Security Domain
Myongho Yi
Texas Woman’s University, TX, USA, [email protected]
Sam Gyun Oh
SungKyunKwan University, Seoul, Korea, [email protected]
Agenda
1. Related Works
2. Research Questions
3. Research Designs
4. Research Results
5. Conclusion
Background
• Many information organization approaches such as taxonomy, thesaurus, classification, and ontology have been attempted to provide effective searching.
• Among them, clustering and ontology approaches have received much attention. However, few studies have compared the two in terms of user performance.
Three Information Org. Approaches
• Term Lists: Synonym Rings*, Authority Files*, Glossaries/Dictionaries, Gazetteers*, Pick lists*
• Classification & Categorization: Subject Headings, Classification schemes*, Taxonomies*, Categorization schemes
• Relationship Groups: Ontologies*, Semantic networks*, Concept maps*, Thesauri*
• Figure axes: natural language to controlled language; weakly structured to strongly structured
(Zeng, 2005)
Clustering 2.0
• Classification of data into different subtopic categories.
• Clustering shows related items according to their similarity.
• Classify related search results into topic folders
• Clustering 2.0
– Remix clustering
• Shows other subtle topics
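The idea of grouping search results into topic folders by similarity can be sketched as follows. This is a minimal illustration only, not the clustering engine used in the study: it assumes a crude word-set representation, Jaccard similarity, and a greedy single-pass assignment. The function names, the threshold, and the sample results are all hypothetical.

```python
def tokens(text):
    """Lowercase word set used as a crude representation of one search result."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_results(results, threshold=0.25):
    """Greedy single-pass clustering.

    Each result joins the first folder whose seed result is similar enough;
    otherwise it opens a new folder and becomes that folder's seed.
    """
    folders = []  # list of (seed_tokens, member_results)
    for result in results:
        t = tokens(result)
        for seed, members in folders:
            if jaccard(seed, t) >= threshold:
                members.append(result)
                break
        else:
            folders.append((t, [result]))
    return [members for _, members in folders]


# Hypothetical search results, for illustration only.
results = [
    "network security firewall software",
    "firewall software for network security",
    "security training for customers",
    "customers security training courses",
]
folders = cluster_results(results)
for i, members in enumerate(folders, start=1):
    print(f"Folder {i}: {members}")
```

Note that the folders are formed from surface word overlap alone, so nothing in the output says how the items in a folder are related to one another.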
Works in Medical Domain
• Fewer Polysemes
• Mainly Hierarchical Relationships
• Cancer
– Breast Cancer
– Prostate Cancer
– Colon Cancer
– Lung Cancer
– Skin Cancer
– ….
Norwegian Electronic Health Library
United States Nat’l Lib of Medicine
How about Other Domains?
• Social Sciences
• Humanities
• Polysemes
• Bank
– Financial institution
– Rely on
Clustering - Limitations
• Relevant results?
– Loosely related associative relationships
• Same / Different category
– Examples
• Security
– Information Security
» Network Security
» PGP
» Customers (?)
» Valuable (?)
» Other Topics (?)
Not classified? Loosely related terms? Term Lists?
Purpose of Study
• To measure how efficiently associative relationships are represented
• To compare the user performance of our Topic Maps-based method with the Clustering-based method.
Related Works
• Yi (2008)
– Compared an ontology-based system to a thesaurus-based system
• 40 subjects, 8 queries, 2 types of queries, search time and recall
• The ontology system showed better recall and search time for relationship-based queries
• Oh (2006)
– Compared a Topic Map-Based Korean Folk Music (Pansori) Retrieval System (TMPRS) to the Current Pansori Retrieval System (CPRS)
• Twenty LIS students in Korea, 7 different search tasks and own query
• TMPRS showed higher performance on objective and subjective measurements in general
• E.K.F. Dang, Luk, Ho, Chan, & Lee (2008)
– Clustering algorithms
– Partitioning and hierarchical
Research Questions
• Are there recall/precision differences between TMIR and CIR?
• Are there search time differences between TMIR and CIR?
• Are there differences in the number of search steps between TMIR and CIR?
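The slides do not state how recall and precision were scored. A minimal sketch, assuming the standard set-based IR definitions (precision = relevant items retrieved / items retrieved; recall = relevant items retrieved / relevant items), might look like this. The function name and the sample answer sets are hypothetical and are not data from the study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search task.

    retrieved: items the participant reported as answers
    relevant:  items in the assumed answer key for the task
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical answer key and participant answer, for illustration only.
answer_key = {"A", "B", "C", "D"}
participant_answer = ["A", "B", "X"]

precision, recall = precision_recall(participant_answer, answer_key)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three reported items are correct and two of the four correct items were found, so the two measures differ even for a single task.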
Research Design
• Subjects
– Information Technology Major Students
• Data Collection
– Questionnaire
– Screen Recording
• TMIR and CIR
– Topic Maps-based Security Information Retrieval (TMIR) system and Clustering-based Security Information Retrieval (CIR) system.
Research Variables
• Independent Variables: Two Retrieval Systems
– Topic Map-Based Ontology Information Retrieval System
– Clustering-Based Information Retrieval System
• Dependent Variables: Quantitative Measurement
– Search steps, Search Time
Search Task Types
| Task # | Degree of Relationships | Task |
| --- | --- | --- |
| 1 | Simple Task | List all the security software |
| 2 | Complex Task | Name all the security engineers who work for Cisco |
| 3 | Complex Task | Find vendors providing security training service |
| 4 | Association and Cross Reference Related Task | List all the security hardware supported by IBM consultants |
| 5 | Association and Cross Reference Related Task | List all software using RSA cryptography and find engineers who specialize in these software packages. |
| 6 | Association and Cross Reference Related Task | Find security system engineers who specialize in firewall and their supervisor and sales representatives |
| 7 | Association and Cross Reference Related Task | Assume that your organization is interested in security training. Who will be the right people to contact? Please provide their e-mail addresses |
Ontology Development Process
• List Terms by Ontology Engineer
– Identify index terms
– List the index terms
– Do not distinguish between preferred and non-preferred terms
• Classify/Categorize by Ontology Engineer
– Classify terms
– Categorize terms
– One term can be in multiple categories
• Add Semantic Relationships by Ontology Engineer
– Identify Equivalent Relationships
– Identify Hierarchical Relationships
– Identify Associative Relationships (same categories, different categories)
• Normalize by Domain Experts
– Verify three relationships
– Add additional relationships
• Implemented by Programmer
– Code using XML Topic Maps (XTM)
Ontology Modeling
• Associations
– Works for
– Maintains
– Applied to
– Embed in
– Provides
– Complies with
– Designs
– Makes
Embed in
• Cryptography embed in Hardware
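Typed associations such as "works for" or "embed in" can be sketched as labeled links between typed topics. The example below is a simplified illustration, not the study's XTM ontology: it assumes binary (source, association type, target) triples and omits the role types that real Topic Maps associations carry. The association names, "Cisco", "IBM", and the cryptography/hardware pair come from the slides; every other topic, topic type, and function name is hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: topic name -> topic type.
topic_types = {
    "Cisco": "Vendor",
    "IBM": "Vendor",
    "Engineer A": "Security Engineer",
    "Engineer B": "Security Engineer",
    "Sales Rep C": "Sales Representative",
    "Firewall": "Security Software",
    "Cryptography": "Security Technology",
    "Hardware": "Security Hardware",
}

# Typed associations as (source, association type, target) triples.
associations = [
    ("Engineer A", "works for", "Cisco"),
    ("Engineer B", "works for", "IBM"),
    ("Sales Rep C", "works for", "Cisco"),
    ("Engineer A", "maintains", "Firewall"),
    ("Cisco", "provides", "Firewall"),
    ("Cryptography", "embed in", "Hardware"),
]


def related(topic):
    """Group everything associated with `topic` by association type.

    Links that point at the topic are listed under "<type> (inverse)".
    """
    grouped = defaultdict(list)
    for source, assoc, target in associations:
        if source == topic:
            grouped[assoc].append(target)
        elif target == topic:
            grouped[assoc + " (inverse)"].append(source)
    return dict(grouped)


def find(topic_type, assoc, target):
    """Topics of the given type that hold association `assoc` with `target`."""
    return [
        source
        for source, a, t in associations
        if a == assoc and t == target and topic_types.get(source) == topic_type
    ]


# In the spirit of search task 2: security engineers who work for Cisco.
cisco_engineers = find("Security Engineer", "works for", "Cisco")
# What a topic page for "Firewall" could list alongside the topic itself.
firewall_view = related("Firewall")
print(cisco_engineers)
print(firewall_view)
```

Because each link carries its association type and each topic carries its topic type, a query can filter on both, which is what lets the sales representative at the same vendor be excluded from the engineer query.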
Two Retrieval Systems Compared
• Search for “Firewall”
• Clustering-based Information Retrieval System
– Shows related terms
Two Retrieval Systems Compared
• Clustering-based Information Retrieval System
– Firewall software listed
– No related information provided, such as vendor or engineers for Firewall
Two Retrieval Systems Compared
• Converted to the Identical Interface using Omnigator
Two Retrieval Systems Compared
• Search for “Firewall”
• Topic Maps-based Information Retrieval System
– Shows topic types and associative relationships
Two Retrieval Systems Compared
• Search for “Firewall”
• Topic Maps-based Information Retrieval System
– Shows the type of information and related information such as developers and sales persons
Research Results
• There was a significant difference in recall between the two groups.
• The estimate shows that recall on TMIR was higher than on CIR.
• The estimates also show that search time and search steps in the experimental group were lower than in the control group.
Discussion
• There were significant differences between the two groups in terms of recall, precision, search time, and search steps.
• Overall, recall was higher when performing simple tasks than when performing complex tasks.
• Performing complex tasks took more search time than performing simple tasks across the two groups. The control group took more total search time than the experimental group.
Conclusion
• This study illustrates that, compared with the clustering-based IR system, a Topic map-based ontology IR system yields improved recall/precision, shorter search time, and fewer search steps for the given search tasks.
• The results of this study attest to the potential of Topic Maps-based ontology to improve information retrieval system performance through better support for associative relationships between terms belonging to different hierarchies by providing explicit relationships among resources.
Q & A
Myongho Yi
Texas Woman’s University, TX, USA, [email protected]
Sam Gyun Oh
SungKyunKwan University, Seoul, Korea, [email protected]