hotbot ppt

Ammara Muhammad Ashfaq INFORMATION RETRIEVAL TECHNIQUES

Upload: ammara-ashfaq

Post on 17-Aug-2015


TRANSCRIPT

Page 1: Hotbot ppt

Ammara Muhammad Ashfaq

INFORMATION RETRIEVAL TECHNIQUES

Page 2: Hotbot ppt
Page 3: Hotbot ppt

Owned by Terra/Lycos.

One of the largest web search engines.

Uses the Inktomi database combined with Direct Hit.

Basic search screen is simple, but the advanced search allows for a full range of search features.

INTRODUCTION

Page 4: Hotbot ppt

HotBot launched in May 1996.

Its search technology came from Inktomi, founded by Eric Brewer, an assistant professor at the University of California at Berkeley, and Paul Gauthier.

It was originally owned and operated by Wired Magazine.

It was a very popular search engine in the 1990s, with its wild colors and great results.

The search results were provided by the Inktomi database, and directory results by LookSmart and the Open Directory.

HISTORY

Page 5: Hotbot ppt

HISTORY

In 1998 the search engine was acquired by Lycos, and it then languished with limited development and a falling market share.

It was re-launched in 2002 as a meta-search-like tool that gave users the option to search the Google, Inktomi, Teoma, or FAST databases.

HotBot continues to attract a small amount of search traffic and provides results from either the Ask Jeeves (Teoma) or Google database.

Page 6: Hotbot ppt
Page 7: Hotbot ppt
Page 8: Hotbot ppt

WORKING OF HOTBOT

Page 9: Hotbot ppt
Page 10: Hotbot ppt

The HotBot search engine ranking algorithm is based on:

• keywords contained in the title

• keywords in meta tags

• keyword prominence and density in the text content

• document length of each web page (maximum 800 words for HotBot)

HOTBOT RANKING ALGORITHM
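
The factors listed on this slide can be illustrated with a toy scoring function. This is only a sketch: the weights (3.0, 2.0), the function names, and the reading of the 800-word limit as a cutoff on the body text are assumptions made for illustration, not HotBot's actual formula.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score_page(query, title, meta_keywords, body, max_words=800):
    """Return a toy relevance score; the weights are illustrative guesses."""
    terms = set(tokenize(query))
    body_words = tokenize(body)[:max_words]  # assumed cutoff: ignore later words
    score = 0.0
    # Keywords contained in the title.
    score += 3.0 * len(terms & set(tokenize(title)))
    # Keywords contained in the meta tags.
    score += 2.0 * len(terms & set(tokenize(meta_keywords)))
    for term in terms:
        positions = [i for i, w in enumerate(body_words) if w == term]
        if not positions:
            continue
        # Density: share of body words that are this keyword.
        score += len(positions) / len(body_words)
        # Prominence: an earlier first occurrence scores higher.
        score += 1.0 - positions[0] / len(body_words)
    return score
```

A page that repeats the query terms in its title, meta tags, and opening text scores higher than one that mentions them only late in the body.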

Page 11: Hotbot ppt

France: http://www.hotbot.lycos.fr

Germany: http://www.hotbot.lycos.de

Italy: http://www.hotbot.lycos.it

Netherlands: http://www.hotbot.lycos.nl

Spain: http://www.hotbot.lycos.es

United Kingdom: http://www.hotbot.lycos.co.uk

HOTBOT AROUND THE WORLD

Page 12: Hotbot ppt

A crawler is a program that visits Web sites and reads their pages and other information in order to create entries for a search engine index.

A spider is a program that automatically fetches Web pages. Spiders are used to feed pages to search engines. It is called a spider because it crawls over the Web.

SPIDER AND WEB CRAWLER
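
A minimal sketch of the crawling idea on this slide: fetch a page, store it, extract its links, and queue the ones not yet seen. The `fetch` callback and the in-memory `FAKE_WEB` dictionary are hypothetical stand-ins for real HTTP requests, so the example runs without a network; it does not describe how Inktomi's crawler was built.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect href values from anchor tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; returns {url: html} for every page fetched."""
    index = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be fetched
            continue
        index[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return index

# Hypothetical in-memory "web" standing in for real HTTP requests.
FAKE_WEB = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "http://example.test/a.html": '<a href="/">home</a>',
    "http://example.test/b.html": '<a href="/missing.html">broken</a>',
}

pages = crawl("http://example.test/", FAKE_WEB.get)
```

The `seen` set keeps the spider from fetching the same page twice when pages link back to each other.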

Page 13: Hotbot ppt
Page 14: Hotbot ppt

HotBot offers the choice of three search engine databases:

HotBot (which is actually a Yahoo!/Inktomi database)

Google

Ask Jeeves (the Teoma database)

DATABASES

Page 15: Hotbot ppt

• Advanced searching capabilities

• Page depth limit

• Advanced search help

• Truncation

• Quick check of three major databases

STRENGTHS

Page 16: Hotbot ppt

Link searches must be exact

Database size shrunk for a while

Advanced features have not always worked right

Does not include all advanced features of each of the four databases

WEAKNESSES

Page 17: Hotbot ppt

No cached copies of pages

Only displays a few hits from each domain, with no access to the rest in Inktomi

Same ads at the top push regular results below the fold

Should have a file type limit for PDF, MS Word, PowerPoint, and Excel files

WEAKNESSES

Page 18: Hotbot ppt
Page 19: Hotbot ppt

Default Operation: Processed as an AND

Full Boolean Searching: AND, OR, and NOT

Proximity Searching

Truncation with the * symbol

Case sensitive

Extensive, dynamic stop word list

Word Stemming - Search for grammatical word variants including plural, singular, and tense.

SEARCH FEATURES

Page 20: Hotbot ppt

Multiple search terms are processed as an AND operation by default.

DEFAULT OPERATION

Page 21: Hotbot ppt

HotBot offers full Boolean searching. Use the operators AND, OR, and NOT.

Operators must be in upper case. HotBot can also use a symbol for NOT.

Under Word Filters, it has a drop-down menu with the choices All the Words, Any of the Words, Not the Words, Exact Phrase, and Not Exact Phrase.

These can be used to add terms or to combine a phrase search with a Boolean search.

BOOLEAN SEARCHING
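
The Boolean behaviour described on this slide can be sketched over a small inverted index. The sketch assumes a flat query evaluated left to right with no operator precedence, treats adjacent terms as AND (the default operation), and recognises operators only in upper case. The names `build_index`, `search`, and `DOCS` are made up for the example.

```python
def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

def search(query, index):
    """Evaluate a flat query left to right; adjacent terms default to AND."""
    result = None
    op = "AND"
    for token in query.split():
        if token in ("AND", "OR", "NOT"):  # operators must be upper case
            op = token
            continue
        matches = index.get(token.lower(), set())
        if result is None:
            # A leading NOT has nothing to subtract from in this sketch.
            result = set() if op == "NOT" else set(matches)
        elif op == "AND":
            result &= matches
        elif op == "OR":
            result |= matches
        else:  # NOT
            result -= matches
        op = "AND"  # fall back to the default operation
    return result or set()

DOCS = {1: "web search engine", 2: "web crawler", 3: "search engine ranking"}
INDEX = build_index(DOCS)
```

With this index, `search("web search", INDEX)` and `search("web AND search", INDEX)` return the same documents, while a lower-case `and` would be treated as an ordinary search word.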

Page 22: Hotbot ppt

HotBot and the other Inktomi databases were sometimes case sensitive for unusual usages of case. If search terms are entered in all lower case, all upper case, or with an initial capital, all mixtures of upper and lower case are searched.

If a search term contains one or more upper case letters in the middle of a word, such as arXiv, the search is limited to records that exactly match the specified case.

CASE SENSITIVITY
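
The case rule on this slide can be written as a small predicate. This is a sketch of the rule as stated here, with hypothetical function names; it is not taken from Inktomi's implementation.

```python
def is_case_sensitive(term):
    """True when the term has unusual capitalisation, e.g. 'arXiv'."""
    # All lower case, all upper case, and initial-capital terms are "usual".
    return term not in (term.lower(), term.upper(), term.capitalize())

def term_matches(term, word):
    """Compare a query term with an indexed word under the case rule."""
    if is_case_sensitive(term):
        return term == word  # exact case required
    return term.lower() == word.lower()  # any mixture of case matches
```

For example, the query `arxiv` matches the indexed word `arXiv`, but the query `arXiv` does not match `arxiv`.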

Page 23: Hotbot ppt

Any words with characters after the stem will be matched to your query term if the search engine supports truncation.

Thus if we truncate to bird*, our search will also match words such as birdbrain.

Posing bird* to HotBot, we now get this document count:

Bird 1,834,510

WORD STEMMING OR TRUNCATION

Page 24: Hotbot ppt

No. HotBot offers only phrase searching.

PROXIMITY SEARCHING

Page 25: Hotbot ppt

The display includes the relevance score, title, URL, a brief extract, and date.

HotBot displays 10 records at a time, by default. However, users can request displays of 10, 25, 50, 75, or 100 records at a time. More search engines should give such options.

To always go directly to Advanced Search with the default of 100 records and the 'Boolean phrase' option, make a bookmark to these Advanced Search settings, or use HotBot's personalization feature.

DISPLAY

Page 26: Hotbot ppt

Searching title words and links to a specific URL

acrobat/applet/activex/audio/embed/flash/form/frame/image/script/shockwave/table/video/vrml

FIELD SEARCHES

Page 27: Hotbot ppt

Results are sorted by relevance, with groupings by site available at the end of each brief record.

The display includes the relevance score, title, URL, a brief extract, and date. HotBot displays 10 records at a time, by default.

SORTING

Page 28: Hotbot ppt

HotBot and the other Inktomi databases have an extensive, dynamic stop word list.

Many common words and numbers will not be searched.

The list changes as the frequency of terms in the database changes.

When a stop word is in a phrase, it may not be obvious that the whole phrase is not being searched.

STOP WORDS
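
One way to picture a dynamic stop word list is to derive it from how many documents contain each word, so the list shifts as the collection changes. The 50% threshold and the function names below are assumptions for illustration; the slide does not say how Inktomi chose its stop words.

```python
from collections import Counter

def build_stop_words(docs, max_doc_fraction=0.5):
    """Return words that appear in more than max_doc_fraction of documents."""
    doc_freq = Counter()
    for text in docs:
        doc_freq.update(set(text.lower().split()))  # count each doc once
    limit = max_doc_fraction * len(docs)
    return {word for word, count in doc_freq.items() if count > limit}

def strip_stop_words(query, stop_words):
    """Drop stop words from a query; what is left is actually searched."""
    return [w for w in query.lower().split() if w not in stop_words]

DOCS = ["the cat sat", "the dog ran", "the bird flew", "a cat ran"]
```

Adding more documents about cats to `DOCS` would push `cat` over the threshold and drop `the` below it, which is the sense in which the list is dynamic.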

Page 29: Hotbot ppt

WILDCARD SEARCHES

Wildcard searching generally places the symbol "*" after the stem of a word. It tells the database to look for variations of that word. For example:

Investigat*

might pull sites with words such as investigation, investigator, and investigative.
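
Trailing-* matching amounts to a prefix test against the indexed vocabulary. The sketch below assumes a plain list of words as the vocabulary; `expand_wildcard` and `VOCABULARY` are illustrative names.

```python
def expand_wildcard(pattern, vocabulary):
    """Return the vocabulary words matched by a pattern with an optional trailing *."""
    if pattern.endswith("*"):
        stem = pattern[:-1].lower()
        return sorted(w for w in vocabulary if w.lower().startswith(stem))
    # Without a wildcard, only the exact word matches.
    return sorted(w for w in vocabulary if w.lower() == pattern.lower())

VOCABULARY = ["investigation", "investigator", "investigative",
              "invest", "bird", "birdbrain"]
```

Here `expand_wildcard("investigat*", VOCABULARY)` returns all three investigat- words, and `expand_wildcard("bird*", VOCABULARY)` returns both `bird` and `birdbrain`. Where the stem is cut decides which variants are found.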

Page 30: Hotbot ppt

Some search engines allow you to create more complex queries by grouping AND, OR, NOT, and NEAR statements using parentheses.

Investigator NEAR (Texas OR Tx)

The above example should pull investigators in Texas, whether the state name is spelled out in full or abbreviated.

NESTED SEARCHING
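
A nested query such as the one on this slide can be represented as a small tree and evaluated recursively. The sketch below makes several assumptions: queries are written as nested tuples instead of being parsed from text, NEAR means "within 5 words", NOT is left out, and text is split on whitespace only.

```python
def positions(node, words, window=5):
    """Return the set of word positions at which the query node matches."""
    if isinstance(node, str):
        return {i for i, w in enumerate(words) if w == node.lower()}
    op, left, right = node
    lhs = positions(left, words, window)
    rhs = positions(right, words, window)
    if op == "OR":
        return lhs | rhs
    if op == "AND":
        return lhs | rhs if lhs and rhs else set()
    if op == "NEAR":
        # Keep left-hand matches that have a right-hand match close by.
        return {i for i in lhs if any(abs(i - j) <= window for j in rhs)}
    raise ValueError(f"unsupported operator: {op}")

def matches(query, text, window=5):
    """True when the document text satisfies the nested query."""
    return bool(positions(query, text.lower().split(), window))

# Investigator NEAR (Texas OR Tx), written as a nested tuple.
QUERY = ("NEAR", "investigator", ("OR", "texas", "tx"))
```

The parenthesised group is evaluated first, so either spelling of the state can satisfy the NEAR condition.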

Page 31: Hotbot ppt

Page Type –

Default is Any (any pages)

Top Page (the root page of a URL, i.e., www.unca.edu)

Page Depth - Limits how far down a subdirectory hierarchy HotBot searches

These are useful for finding the primary sites for organizations or information

UNIQUE FOR HOTBOT
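
Page depth can be approximated by counting the directory levels in a URL path. This sketch assumes URLs include a scheme such as `http://` and treats a last path segment without a trailing slash as a file name; the function names and the example paths under www.unca.edu are hypothetical.

```python
from urllib.parse import urlparse

def page_depth(url):
    """Count the directory levels in the URL path; a top page has depth 0."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    # Assumption: a last segment without a trailing slash is a file name.
    if segments and not path.endswith("/"):
        segments = segments[:-1]
    return len(segments)

def is_top_page(url):
    """True for the root page of a site."""
    return urlparse(url).path in ("", "/")

def within_depth(url, max_depth):
    """True when the page sits no deeper than max_depth directories."""
    return page_depth(url) <= max_depth
```

A search limited to Top Page would keep only URLs for which `is_top_page` is true, while a Page Depth limit of 1 would also keep pages one directory down.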

Page 32: Hotbot ppt

Smaller databases

Less pointing to external pages

Paid advertising or sponsorship for visibility

Rise of search-only sites

FUTURE POSSIBILITIES

Page 33: Hotbot ppt

HotBot is an interface to advanced web searches, and it presents a dynamically changing backend.

Both the Inktomi and Direct Hit technologies serve, in different ways, to provide a relevant list of results through advanced queries, and both seek to minimize the commercial influence over search results.

All of these technologies are subject to changes in technology developments and changes in the business environment.

CONCLUSION

Page 34: Hotbot ppt

Its weaknesses include that it still doesn't seem to produce the depth and breadth of some other engines, and that its advanced features have not always worked correctly.

As this engine's index and search features continue to grow, these weaknesses should be overcome.

CONCLUSION