Measuring the usefulness of Knowledge Organization Systems in Information Retrieval applications

Philipp Mayr

Observatory for Knowledge Organisation Systems KNOWeSCAPE workshop, Valletta, Malta
February 01, 2017

GESIS

• We are developing interactive information retrieval systems for searching indexed literature and data sets
• We follow the principle "research-based service": develop research prototypes, test and evaluate them, and implement the features which are working for the end users

Intro

• Typical difficulties in searching digital libraries (DL)
  – Vagueness between search and indexing terms
  – How to support searchers with controlled vocabulary?
• Assumption: a user’s search (experience) should improve by using Knowledge Organization Systems (KOS):
  – Vague search tasks
  – Unfamiliar fields
  – Cross domain searches
• Case studies to demonstrate the effectiveness of KOS in different search scenarios

Case Study 1: Information retrieval experiment

• Intra- and interdisciplinary cross-concordances in the project KoMoHe
  – Social Sciences-SocSci; SocSci-Economics; SocSci-Psychology; Politics-Economics; Medicine-Psychology, …
• Information retrieval evaluation of the mappings (effectiveness of intellectual mapping)

Controlled terms

Case Study 1

• How effective are the mappings in an actual search? Does the application of term mappings (TT) improve search over a non-transformed subject (i.e. controlled vocabulary) search (CT)?
• Real queries, only equivalence relations, 13 thesaurus mappings

Mayr/Petras 2008

• Overlap and more identical terms in intradisciplinary mappings

• Interdisciplinary mappings had the strongest effect
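
The CT versus TT comparison can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the mapping table, the toy index, the relevance judgements, and the Boolean OR search are hypothetical and are not taken from the KoMoHe data or from Mayr/Petras 2008.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical cross-concordance with equivalence relations only
# (source thesaurus term -> equivalent target thesaurus terms).
MAPPING: Dict[str, List[str]] = {
    "unemployment": ["joblessness"],
    "labour market": ["labor market"],
}

# Hypothetical target database: document id -> assigned controlled terms.
TARGET_INDEX: Dict[str, Set[str]] = {
    "d1": {"joblessness", "labor market"},
    "d2": {"joblessness"},
    "d3": {"unemployment"},
    "d4": {"inflation"},
}


def translate(query_terms: List[str]) -> List[str]:
    """Term transformation (TT): replace each term by its mapped equivalents.

    Terms without a mapping are kept unchanged.
    """
    translated: List[str] = []
    for term in query_terms:
        translated.extend(MAPPING.get(term, [term]))
    return translated


def search(query_terms: List[str]) -> Set[str]:
    """Boolean OR search over the controlled-term field (an assumed search model)."""
    wanted = set(query_terms)
    return {doc for doc, terms in TARGET_INDEX.items() if terms & wanted}


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Set-based precision and recall."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


query = ["unemployment"]
relevant = {"d1", "d2", "d3"}  # hypothetical relevance judgements

ct_result = search(query)             # CT: untransformed controlled terms
tt_result = search(translate(query))  # TT: terms after applying the mapping

print("CT", precision_recall(ct_result, relevant))
print("TT", precision_recall(tt_result, relevant))
```

In this toy setup the transformed query retrieves more of the relevant documents than the untransformed one; whether that holds on real data is exactly what the experiment measures.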


Case Study 2: Information retrieval experiment

• Discipline-specific Search-Term-Recommendation (STR) Services in IRM project

• Are recommendations from discipline specific STRs better suited for query expansion than general ones?

• Co-occurrence of terms in title/description and assigned controlled terms
• 17 STR services
  – 16 discipline-spec.
  – 1 global

Lüke et al. 2012
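
A co-occurrence based recommender of this kind could look roughly like the sketch below. The records, the tokenisation, and the Jaccard weighting are assumptions for illustration; the slides do not state which association measure the STR services in the IRM project use.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Iterable, List, Set, Tuple

# Hypothetical records: (tokens from title/description, assigned controlled terms).
RECORDS: List[Tuple[Set[str], Set[str]]] = [
    ({"jobless", "youth"}, {"unemployment", "adolescent"}),
    ({"jobless", "benefits"}, {"unemployment", "social policy"}),
    ({"youth", "school"}, {"adolescent", "education"}),
]


def build_cooccurrence(records: Iterable[Tuple[Set[str], Set[str]]]):
    """Count how often each free-text token co-occurs with each controlled term."""
    cooc: DefaultDict[str, Counter] = defaultdict(Counter)
    token_freq: Counter = Counter()
    term_freq: Counter = Counter()
    for tokens, terms in records:
        token_freq.update(tokens)
        term_freq.update(terms)
        for token in tokens:
            cooc[token].update(terms)
    return cooc, token_freq, term_freq


def recommend(token: str, cooc, token_freq, term_freq, k: int = 4) -> List[str]:
    """Return the k controlled terms most associated with a free-text token.

    Association is scored with the Jaccard coefficient (an assumed choice);
    ties are broken alphabetically so the output is deterministic.
    """
    scores = {}
    for term, both in cooc.get(token, {}).items():
        union = token_freq[token] + term_freq[term] - both
        scores[term] = both / union
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


cooc, token_freq, term_freq = build_cooccurrence(RECORDS)
print(recommend("jobless", cooc, token_freq, term_freq))
```

Under this sketch, a discipline-specific service would simply be one built from the records of a single discipline, and a global service one built from all records.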

Case Study 2

• Are recommendations from discipline specific STRs better suited for query expansion (QE) than general ones?
• 100 topics from the GIRT corpus, top 4 recommendations to expand the original query
• gSTR = global STR; tSTR = topical STR; bSTR = best-performing STR

Lüke et al. 2012

• QE with specific STRs leads to significantly better results than QE with a general STR

• Selecting the best matching specific STR in an automatic way is a major challenge
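
The expansion step and a per-topic comparison can be sketched as follows. The function names, the example rankings, and the use of mean average precision are assumptions for illustration; the slides do not say which retrieval measure or significance test Lüke et al. 2012 used.

```python
from typing import Dict, List, Sequence, Set


def expand_query(query: List[str], recommendations: Sequence[str], k: int = 4) -> List[str]:
    """Expand the original query with the top-k recommended terms.

    Recommendations already present in the query are skipped.
    """
    extra = [term for term in recommendations[:k] if term not in query]
    return query + extra


def average_precision(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked result list."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs: Dict[str, Sequence[str]], qrels: Dict[str, Set[str]]) -> float:
    """Mean of the per-topic average precision values."""
    return sum(average_precision(runs[topic], qrels[topic]) for topic in qrels) / len(qrels)


# Hypothetical relevance judgements and result lists for two topics.
qrels = {"t1": {"d1", "d3"}, "t2": {"d2"}}
run_gstr = {"t1": ["d4", "d1", "d3"], "t2": ["d5", "d2"]}  # after QE with a global STR
run_tstr = {"t1": ["d1", "d3", "d4"], "t2": ["d2", "d5"]}  # after QE with a topical STR

print(expand_query(["youth"], ["adolescent", "education", "training", "school", "family"]))
print(mean_average_precision(run_gstr, qrels), mean_average_precision(run_tstr, qrels))
```

With 100 topics, the per-topic scores of two runs would then be compared with a paired significance test before claiming that one STR variant is better.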


Case Study 3: Interactive IR experiment

• Measuring the utility and performance of Search Term Recommendation (STR) Services in AMUR project

• Logfile-based evaluation of STR usage and later search session success

• We defined positive signals (export, save, email, full text …) in the system

enter_search_term→select_term_from_recommender→search→view_record_1→view_record_2→view_record_3→export_record

• Analysis of one year of log data
• Comparing the effect of using a STR (selecting a term) or not

Hienert/Mutschke 2016
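
A log analysis of this kind can be sketched as below. The action names enter_search_term, select_term_from_recommender, search, view_record_N and export_record come from the example session above; save_record, email_record and view_full_text are hypothetical names for the other positive signals, and the toy sessions are invented. This is not the procedure of Hienert/Mutschke 2016, only a simplified illustration.

```python
from typing import Dict, List, Tuple

STR_ACTION = "select_term_from_recommender"
# export_record appears in the slides; the other three names are hypothetical.
POSITIVE_SIGNALS = {"export_record", "save_record", "email_record", "view_full_text"}

# Invented sessions, each a chronological list of logged actions.
SESSIONS: List[List[str]] = [
    ["enter_search_term", STR_ACTION, "search", "view_record_1",
     "view_record_2", "view_record_3", "export_record"],
    ["enter_search_term", "search", "view_record_1"],
    ["enter_search_term", STR_ACTION, "search", "view_record_1"],
    ["enter_search_term", "search", "save_record"],
]


def has_positive_signal(actions: List[str]) -> bool:
    """True if any action in the list is a positive signal."""
    return any(action in POSITIVE_SIGNALS for action in actions)


def classify(session: List[str]) -> Tuple[bool, bool]:
    """Return (used_str, positive).

    For sessions with STR usage, only the steps after the first term
    selection are checked; otherwise the whole session is checked.
    """
    if STR_ACTION in session:
        following = session[session.index(STR_ACTION) + 1:]
        return True, has_positive_signal(following)
    return False, has_positive_signal(session)


def contingency(sessions: List[List[str]]) -> Dict[Tuple[bool, bool], int]:
    """2x2 table of session counts keyed by (used_str, positive)."""
    counts = {(u, p): 0 for u in (True, False) for p in (True, False)}
    for session in sessions:
        counts[classify(session)] += 1
    return counts


def success_rate(counts: Dict[Tuple[bool, bool], int], used_str: bool) -> float:
    """Share of sessions with a positive signal, for one STR-usage group."""
    total = counts[(used_str, True)] + counts[(used_str, False)]
    return counts[(used_str, True)] / total if total else 0.0


counts = contingency(SESSIONS)
print(success_rate(counts, used_str=True), success_rate(counts, used_str=False))
```

On a year of real log data, the two success rates would be compared with a significance test on the 2x2 table rather than read off directly.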

Case Study 3

• Using the STR is followed significantly often by positive signals in the subsequent session steps

Hienert/Mutschke 2016

Conclusions

• Information retrieval and interactive IR settings are able to demonstrate the utility of KOS usage (usefulness)
  – In experimental settings
  – In user evaluations
• Each methodology has pros and cons
  – Effort and significance in small user studies
  – Too controlled, system-based, without real users
• Terminology mapping projects should perform more empirical studies

                          IR       Interactive IR
Availability of corpora   high     low
Reproducibility           high     low
Control                   high     low
Measures                  medium   medium
Effort                    low      high
Significance              medium   medium
Generalisability          medium   medium
Realistic Scenario        no       high

Outlook

• Integrate different recommender systems in real retrieval tasks (search sessions)
• Use and evaluate recommenders for query expansion and as dynamic features in IR, in the retrieval process (AMUR project)
• Develop new measures of utility of recommender systems
  – E.g. measure task completion rates or goal satisfaction

References

• Hienert, D. & Mutschke, P. (2016). A Usefulness-based Approach for Measuring the Local and Global Effect of IIR Services. In Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval (CHIIR '16). ACM, New York, NY, USA, 153-162. http://dx.doi.org/10.1145/2854946.2854962

• Lüke, T., Schaer, P., & Mayr, P. (2012). Improving Retrieval Results with discipline-specific Query Expansion. In International Conference on Theory and Practice of Digital Libraries (TPDL 2012) (pp. 408–413). Paphos, Cyprus: Springer Berlin Heidelberg. http://doi.org/10.1007/978-3-642-33290-6_44

• Mayr, P., & Petras, V. (2008). Cross-concordances: terminology mapping and its effectiveness for information retrieval. In 74th IFLA World Library and Information Congress. Québec, Canada: IFLA. Retrieved from http://www.ifla.org/IV/ifla74/papers/129-Mayr_Petras-en.pdf


Thank you

Contact:
Dr Philipp Mayr
GESIS - Leibniz Institute for the Social Sciences, Germany
Email: [email protected]
@philipp_mayr