The AGRIS Search System - Some Ideas for a Semantics- and Knowledge Network enabled simple search system

L J Haravu, Kesavan Inst. of Information and Knowledge Management


Post on 02-Jan-2016


Page 1: L J Haravu Kesavan Inst. of Information and Knowledge Management

The AGRIS Search System - Some Ideas for a Semantics- and Knowledge Network enabled simple search system

L J Haravu

Kesavan Inst. of Information and Knowledge Management

Page 2

Problems

√ The human dimension is missing

√ The availability of generic search engines has resulted in end-users sacrificing quality for ease of access, and relevance for speed and convenience

√ A domain-specific search system such as AGRIS must differentiate itself from the generic ones

√ Searching skills of users vary widely, but all of them are treated alike

√ Multilingual systems such as AGRIS pose other problems as well, e.g., the need for language analyzers and language-specific stemming algorithms

Page 3

Challenges in Designing Search Systems

Users need help at various stages of their interactions with the search system

They need guidance to make effective use of thesauri

The possibility of using web-based ontologies opens new means to improve search effectiveness transparently to the user

The addition of a human dimension would add greatly to the satisfaction of real needs

A search engine, if possible, should go beyond just providing a list of 'hits'.

All intervention by the system should be seen as helping the user

Page 4

Use environments

Use environment: the nature of end-users, their preferences, searching behaviours, and the purposes they seek to achieve

Understanding these factors could guide the design and implementation of a search system

Page 5

AGRIS use environments

Very heterogeneous

Many experienced researchers and teachers

Also many inexperienced users, e.g., students

Difficulties in articulating searches

Searches formulated too generically or too specifically

Multilingual, hence the search system should permit searching and retrieval in more than just the 5 UN languages

Work in relatively remote locations

Access to peers is not always available or possible. The possibility of interacting with knowledgeable peers after obtaining the search results is an important element contributing to the satisfaction of an end-user.

Page 6

AGRIS use environments

Poorly formulated searches result in frustration instead of elucidation because of the excessive 'noise' in the retrieval. Help on how to modify and re-submit the query to obtain better results is obviously desirable.

Search results do not always resolve uncertainties, they may in fact add to them, e.g., the presence of two papers with conflicting findings, dated information.

Most users prefer to work with the simple search interface. A simple text box, into which users can enter a word or phrase before submitting it to the search engine, has become the preferred choice of users by default.

Page 7

Some ideas for design of a simple search system for AGRIS

Allow the user complete freedom to articulate his needs in his own words.

Any intervention (human or machine) at the first formulation stage is counterproductive.

The user's need must be captured, however inadequately defined it may be.

Page 8

Some ideas for design of a simple search system for AGRIS

If a single word is entered in the simple search interface: search the Lucene index, and if the number of hits exceeds a maximum threshold, show the first results page (sorted by relevance) and suggest to the user:

1. the use of one or more specific terms (displayed automatically from AGROVOC or the ontology), or

2. the use of other terms in conjunction with the term entered.

If the term entered is not an AGROVOC descriptor and the number of hits exceeds a maximum threshold: show the results page and suggest:

1. the use of one or more terms from a list, drawn from the Lucene index, that are orthographically close to the term entered and in the same language, or

2. the use of other terms in conjunction with the term entered.
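The decision flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold value, the function name `suggest_for_single_term`, and the dictionary standing in for AGROVOC are all hypothetical, and the list of orthographically close terms is assumed to have been obtained from the Lucene index already.

```python
MAX_HITS = 500  # hypothetical value; the slides do not specify a threshold


def suggest_for_single_term(term, hit_count, thesaurus_narrower,
                            close_index_terms, max_hits=MAX_HITS):
    """Return optional suggestions for a one-word query that retrieved too much.

    thesaurus_narrower: dict mapping a descriptor to its narrower terms
                        (a stand-in for AGROVOC, not its real API).
    close_index_terms:  terms from the search index that are spelled
                        similarly to `term` (assumed precomputed).
    """
    suggestions = {
        "narrower_terms": [],
        "close_terms": [],
        "combine_with_other_terms": False,
    }
    if hit_count <= max_hits:
        return suggestions  # manageable result set: show hits, no prompts

    # In both cases the user is invited to add further terms.
    suggestions["combine_with_other_terms"] = True
    key = term.lower()
    if key in thesaurus_narrower:
        # The term is a descriptor: offer its more specific terms.
        suggestions["narrower_terms"] = list(thesaurus_narrower[key])
    else:
        # Not a descriptor: offer orthographically close index terms.
        suggestions["close_terms"] = list(close_index_terms)
    return suggestions
```

The function only returns suggestions; whether and how to display them is left to the interface, in keeping with the idea that every prompt is optional for the user.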

Page 9

Some ideas for design of a simple search system for AGRIS

The semantics of an ontology may also be explored. For example, if the user enters the string “sorghum diseases”, it should be possible for the search system to infer that sorghum belongs to the class “Cereals”, and that “sorghum” and “diseases” belong to mutually exclusive classes, which calls for the search formulation sorghum AND diseases.

If the hits exceed a maximum threshold: suggest that the user look at the narrower terms (sub-classes) of Sorghum and add one or more of these to the search. Also present the narrower terms (sub-classes) of Diseases and ask the user to select one or more of these to add to the search expression.
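A minimal sketch of this inference, assuming a toy in-memory ontology: the class name given for "diseases", the narrower terms, the threshold, and both function names are invented for illustration and are not taken from AGROVOC. The fallback for terms that share a class is also an assumption, since the slides do not cover that case.

```python
# Toy ontology: term -> top-level class and narrower terms (illustrative only).
ONTOLOGY = {
    "sorghum": {"cls": "Cereals",
                "narrower": ["grain sorghum", "sweet sorghum"]},
    "diseases": {"cls": "Disorders",
                 "narrower": ["fungal diseases", "viral diseases"]},
}


def formulate(query, ontology=ONTOLOGY):
    """Combine the words with AND when each belongs to a different class."""
    words = query.lower().split()
    known = [w for w in words if w in ontology]
    classes = [ontology[w]["cls"] for w in known]
    all_known = len(known) == len(words) and len(words) > 1
    if all_known and len(set(classes)) == len(classes):
        return " AND ".join(known)
    return query  # assumption: otherwise leave the query as typed


def narrower_suggestions(query, hit_count, max_hits=500, ontology=ONTOLOGY):
    """When there are too many hits, offer the sub-classes of each known term."""
    if hit_count <= max_hits:
        return {}
    return {w: ontology[w]["narrower"]
            for w in query.lower().split() if w in ontology}


print(formulate("sorghum diseases"))
print(narrower_suggestions("sorghum diseases", hit_count=2000))
```

A real ontology would express disjointness between classes explicitly; comparing class names, as done here, is only a stand-in for that check.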

Page 10

Some ideas for design of a simple search system for AGRIS

If a single word is entered (descriptor or not) and the number of hits is nil, suggest that the user consider:

1. the use of one or more broader terms from AGROVOC that are shown (if the term is a descriptor), or

2. the use of other terms, either orthographically close to the term in AGROVOC and/or chosen from the Lucene index.

If a string, sentence or phrase is entered, parse the string to identify potential single or compound AGROVOC terms that might be searched. Hyperlink each of these to the semantic network in AGROVOC.

If the number of hits exceeds a maximum threshold, show the records retrieved via the Lucene index, but also suggest that the user add one or more terms, selected by clicking the hyperlinked terms, to the terms already entered.
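One simple way to identify single or compound terms in a free-text query is a greedy longest-match scan against the set of descriptors. The sketch below assumes the descriptors are available as a lower-cased Python set; the sample descriptors and the function name `find_descriptors` are illustrative, and a production system would also need language-specific tokenization and stemming.

```python
def find_descriptors(text, descriptors, max_words=3):
    """Greedy longest-match scan for single or compound descriptors.

    descriptors: set of lower-cased thesaurus terms (illustrative stand-in).
    max_words:   longest compound term, in words, that is tried.
    """
    words = text.lower().replace(",", " ").split()
    found = []
    i = 0
    while i < len(words):
        # Try the longest candidate first so that "soil fertility"
        # wins over the single word "soil".
        for n in range(min(max_words, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in descriptors:
                found.append(candidate)
                i += n
                break
        else:
            i += 1  # no descriptor starts at this word; move on
    return found


sample = {"soil", "soil fertility", "sorghum", "crop rotation"}
print(find_descriptors(
    "effect of crop rotation on soil fertility in sorghum", sample))
```

Each term returned could then be rendered as a hyperlink into the thesaurus, as the slide proposes.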

Page 11

Some ideas for design of a simple search system for AGRIS

All requests/suggestions to the user should be optional: the user may or may not use them.

Another option is to ask the user to identify the purpose of the search from a drop-down list. This could determine the kind of recall/precision the user is looking for.

Page 12

Query modification and reformulation

Search systems implicitly assume a user's need is fulfilled after showing the results.

In reality, the user, after looking through his search results, and even after using the help and suggestions offered to formulate his query, may find that the results are not entirely satisfactory.

Page 13

Issues in Query modification, reformulation

How can the search system help the user reformulate his query to obtain better results?

Can it simulate an interaction with an information specialist or a more experienced peer who then helps in reformulating the query?

Can the search system use the inherent semantics in a thesaurus or ontology and, if so, how?

Can the system provide a means for the end-user to actually enter into a dialogue with a knowledgeable peer to obtain a more meaningful interpretation of the results?

Can the search system help the user make use of leads that the search results have provided?

Page 14

Ideas for Query modification, reformulation

Allow selection of one or more result records and ask for 'More like these'. This would require using the terms in the selected records and the relationships of these with other terms in the thesaurus.

Inferences using an ontology may point (along with other user input) to other terms (from titles, authors, geographic areas) that could be used in a reformulated search.
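As a rough sketch of 'More like these', the terms shared by the selected records can be counted, the most frequent ones kept as seeds, and each seed expanded with its related terms from the thesaurus. The record layout (a `descriptors` list per record), the `related_terms` mapping, and the OR-joined output format are assumptions made for this example only.

```python
from collections import Counter


def more_like_these(selected_records, related_terms, top_n=3):
    """Build an expanded query from the records the user selected.

    selected_records: list of dicts, each with a "descriptors" list
                      (hypothetical record layout).
    related_terms:    dict mapping a term to related thesaurus terms
                      (illustrative stand-in for thesaurus relationships).
    """
    counts = Counter(term
                     for record in selected_records
                     for term in record["descriptors"])
    seeds = [term for term, _ in counts.most_common(top_n)]

    expanded = list(seeds)
    for term in seeds:
        for related in related_terms.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return " OR ".join(f'"{term}"' for term in expanded)


records = [
    {"descriptors": ["sorghum", "drought"]},
    {"descriptors": ["sorghum", "drought", "yields"]},
    {"descriptors": ["sorghum"]},
]
related = {"drought": ["drought resistance"]}
print(more_like_these(records, related, top_n=2))
```

The same counting step could serve 'Not like these' by turning the most frequent terms of the rejected records into exclusions rather than additions.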

Page 15

Ideas for Query modification, reformulation

In the reverse case, allow selection of one or more result records and ask for 'Not like these'.

Ontologies make it possible to infer the broad categories to which the entered search terms belong.

In agriculture: thing (e.g., plant, crop, species, soils), action (e.g., breeding, harvesting, measurement), condition/property (e.g., diseases), agent (e.g., bacteria, viruses), space (geographic areas, countries, regions), and time (e.g., seasons). This is a facet-analytical approach to the analysis of a query.

Such an analysis could lead to transparent expansion (and/or restriction) of a reformulated query.

The system presents the categorized (expanded) query and asks which concepts must be present, which may be present, and which should not be in the results.
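The facet grouping and the must/may/must-not choice can be sketched as follows. The facet lookup table is a hand-made stand-in for inference over an ontology, and the function names are hypothetical. The output uses the `+` (required) and `-` (prohibited) prefixes of Lucene's classic query syntax, with multi-word terms quoted as phrases.

```python
# Hand-made facet assignments, standing in for ontology-based inference.
FACETS = {
    "sorghum": "thing",
    "breeding": "action",
    "diseases": "condition/property",
    "bacteria": "agent",
    "india": "space",
    "dry season": "time",
}


def categorize(terms, facets=FACETS):
    """Group query terms by facet; unknown terms go to 'unclassified'."""
    grouped = {}
    for term in terms:
        facet = facets.get(term.lower(), "unclassified")
        grouped.setdefault(facet, []).append(term)
    return grouped


def build_query(must=(), may=(), must_not=()):
    """Turn the user's three choices into a single query string."""
    def quote(term):
        return f'"{term}"' if " " in term else term

    parts = [f"+{quote(t)}" for t in must]
    parts += [quote(t) for t in may]
    parts += [f"-{quote(t)}" for t in must_not]
    return " ".join(parts)


print(categorize(["sorghum", "diseases", "India"]))
print(build_query(must=["sorghum", "diseases"],
                  may=["India"],
                  must_not=["viral diseases"]))
```

Presenting the grouped terms before building the query gives the user the chance to decide, facet by facet, what must, may, or must not appear.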

Page 16

Failed Searches

● One or more terms are incorrectly spelled. Terms orthographically close to the entered term, taken from the Lucene index, are shown.

● The search was too tightly formulated.

● A synonym or near-synonym was used instead of an AGROVOC descriptor.
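A small sketch of how a failed single-term search might be diagnosed. It uses Python's standard `difflib.get_close_matches` as a stand-in for whatever orthographic matching the index would provide, and a plain dictionary for the synonym-to-descriptor mapping. The sample terms, the cutoff of 0.8, and the function name are assumptions for illustration.

```python
import difflib


def diagnose_failed_search(term, index_terms, synonym_to_descriptor):
    """Suggest a likely cause and alternative terms for a zero-hit query.

    index_terms:           terms present in the search index.
    synonym_to_descriptor: maps non-preferred terms to the descriptor
                           used for indexing (illustrative mapping).
    """
    key = term.lower()
    if key in synonym_to_descriptor:
        return {"reason": "synonym of a descriptor",
                "try": [synonym_to_descriptor[key]]}

    # Terms whose spelling is close to what the user typed.
    close = difflib.get_close_matches(key, index_terms, n=3, cutoff=0.8)
    if close:
        return {"reason": "possible misspelling", "try": close}

    return {"reason": "no close term found; consider broadening the search",
            "try": []}


index_terms = ["sorghum", "soybeans", "maize"]
synonyms = {"corn": "maize"}
print(diagnose_failed_search("sorgum", index_terms, synonyms))
print(diagnose_failed_search("corn", index_terms, synonyms))
```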

Page 17

Human dimension in search systems

Information, in general, is entropic. Information alone is not enough for a user to take the action he needs to take (e.g., redo an experiment or revise procedures).

There are situations in which the user does not feel better after searching and getting results. He needs a more experienced peer to guide him.

Page 18

Human dimension in search systems

Many information-use surveys point out that, more than information from a database, it is the one-on-one interaction with a senior or more experienced peer that helps.

If a search system can build a human interface, many users might benefit. They will be using a knowledge network.

The search system would be going beyond 'hits' alone.

Page 19

Human dimension in search systems

One way of doing this is to build a volunteer group of information specialists and subject experts in different areas and sub-areas of agriculture.

If a user is not happy with his search results, or his own knowledge is insufficient for him to resolve uncertainties, he could be given the option to seek synchronous or asynchronous interaction with a member of the volunteer group.

Page 20

Human dimension in search systems

If the user exercises the option, the system searches for the volunteer who is in the best position to guide the user, and puts the end-user in touch with that volunteer by making available the volunteer's email, telephone, or online chat contact.

This would open the doors to knowledge exchange and elucidation and not end with the provision of information alone.
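How the system might pick an appropriate volunteer is not specified in the slides. One plausible sketch, under stated assumptions, ranks volunteers by the overlap between the user's search terms and each volunteer's declared areas of expertise, after filtering on a shared language. The volunteer records, the field names, and the language filter are all hypothetical.

```python
def pick_volunteer(query_terms, user_language, volunteers):
    """Return the volunteer whose expertise best overlaps the query.

    volunteers: list of dicts with "name", "expertise", "languages" and
                "contact" keys (hypothetical record layout).
    Returns None when no volunteer shares a language and a topic.
    """
    wanted = {term.lower() for term in query_terms}
    best, best_score = None, 0
    for volunteer in volunteers:
        if user_language not in volunteer["languages"]:
            continue  # assumption: both sides need a common language
        expertise = {area.lower() for area in volunteer["expertise"]}
        score = len(wanted & expertise)
        if score > best_score:
            best, best_score = volunteer, score
    return best


volunteers = [
    {"name": "Volunteer A", "expertise": ["soil science", "irrigation"],
     "languages": ["en", "fr"], "contact": "a@example.org"},
    {"name": "Volunteer B", "expertise": ["sorghum", "diseases"],
     "languages": ["en"], "contact": "b@example.org"},
]
match = pick_volunteer(["Sorghum", "diseases"], "en", volunteers)
print(match["name"] if match else "no volunteer available")
```

Returning `None` rather than a poor match leaves the interface free to offer an asynchronous request instead.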

Page 21

Human dimension in search systems

Feedback analysis

The collection and analysis of feedback (collected automatically as well as via user input) on user experiences with the search system.

The search system can automatically create a log of: the search terms (single terms, phrases, strings, etc.) entered by users; the language of the search term used; the instances where users actually used the help/prompts/suggestions that the search system offered, and the nature of the help/prompts/suggestions used; the number of hits that a search resulted in; and the country or region of the searcher.

Feedback provided directly by the user, aggregated and analyzed over time together with automatically collected feedback, provides a knowledge base that is valuable in fine-tuning the search system.
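The log fields listed on this slide map naturally onto a small record type. The sketch below is one possible shape, with hypothetical field names, plus a summary function showing the kind of aggregate (zero-hit queries, uptake of suggestions, searches per language) that could feed the fine-tuning described above.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SearchLogEntry:
    """One logged search; field names are illustrative."""
    query: str       # term, phrase or string as entered
    language: str    # language of the search term
    hit_count: int   # number of hits the search produced
    country: str     # country or region of the searcher
    suggestions_used: list = field(default_factory=list)  # prompts accepted


def summarize(entries):
    """Aggregate log entries into a few simple indicators."""
    total = len(entries)
    with_help = sum(1 for e in entries if e.suggestions_used)
    return {
        "searches": total,
        "zero_hit_queries": [e.query for e in entries if e.hit_count == 0],
        "suggestion_uptake": with_help / total if total else 0.0,
        "by_language": dict(Counter(e.language for e in entries)),
    }


log = [
    SearchLogEntry("sorghum", "en", 1200, "IN", ["narrower_terms"]),
    SearchLogEntry("sorgum", "en", 0, "KE"),
]
print(summarize(log))
```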