Infobrokering and Searching the Deep Web - the New Role of Employee of the Department of Medical Scientific Information. Presentation from EAHIL Workshop Kraków 2007.
Infobrokering and Searching the Deep Web
The New Role of Employee of the Department of Medical Scientific Information
Witold Kozakiewicz, Barbara Grala
Main Library, Medical University of Łódź, Poland
"The Librarian", a c. 1566 painting by Giuseppe Arcimboldo
Information should be meaningful, valuable, adequate, complete, up to date and reliable.
Google is like a box of chocolates...
Deep Web
The deep Web (or Deepnet, invisible Web or hidden Web) refers to World Wide Web content that is not part of the surface Web indexed by search engines.
Deep Web Pages
• Dynamic content - dynamic pages which are returned in response to a submitted query or accessed only through a form (especially if open-domain input elements, e.g. text fields, are used; such fields are hard to navigate without domain knowledge).
• Unlinked content - pages which are not linked to by other pages, which may prevent Web crawling programs from accessing the content. This content is referred to as pages without backlinks (or inlinks).
• Private Web - sites that require registration and login (password-protected resources).
• Contextual Web - pages with content varying for different access contexts (e.g. ranges of client IP addresses or previous navigation sequence).
• Limited access content - sites that limit access to their pages in a technical way (e.g., using the Robots Exclusion Standard, CAPTCHAs or pragma:no-cache/cache-control:no-cache HTTP headers), prohibiting search engines from browsing them and creating cached copies.
• Scripted content - pages that are only accessible through links produced by JavaScript as well as content dynamically downloaded from Web servers via Flash or AJAX solutions.
• Non-HTML/text content - textual content encoded in multimedia (image or video) files or specific file formats not handled by search engines.
Source: Wikipedia
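The "limited access content" category can be illustrated with a short sketch of how a well-behaved crawler consults the Robots Exclusion Standard before fetching a page. This is only an illustration, not part of the original presentation: the robots.txt rules, the paths and the crawler name "ExampleBot" are all invented, and the rules are parsed from a string so that no network access is needed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, invented for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /members/
"""


def crawlable(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the (hypothetical) crawler may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


for path in ("/about.html", "/search/results.html", "/members/profile.html"):
    status = "may be indexed" if crawlable(path) else "excluded, stays in the deep Web"
    print(f"{path}: {status}")
```

Pages under the excluded paths are never requested by a compliant search engine, so they do not appear in its index even though a human visitor can open them.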
Deep Web databases
Source: Bin He, Mitesh Patel, Zhen Zhang, Kevin Chen-Chuan Chang. Accessing the Deep Web http://doi.acm.org/10.1145/1230819.1241670
How to improve the search process?
• Use the full capabilities of search engines: build more complex queries with Boolean operators and keywords, and use the advanced search options or search engine suggestions.
• Try specialized services like Google Scholar, Google Books, MS Live Search Academic, Yahoo Search Subscriptions.
• If you are looking for specific file types, try dedicated search engines like Picsearch or Yahoo Podcast Search.
• Try metasearch engines like friskr.com, dogpile.com, clusty.com, mamma.com, turbo10.com.
• Use specialized web services and database search engines: PubMed, Medic8, WebMD, MammaHealth.
• Use subject gateways, online services that provide links to numerous other sites or documents on the Internet (Intute, Scout Archives, BUBL, Infomine).
• Search open access journals or repositories like DOAJ, OAIster.
• Find a specific database using CompletePlanet or Geniusfind.
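As a rough sketch of the advice on Boolean operators, the snippet below assembles a query from groups of synonyms (joined with OR), combines the groups with AND, and excludes unwanted terms with NOT. The helper names and the example search terms are hypothetical, and the NCBI E-utilities address used to turn the query into a PubMed search URL is an assumption that does not come from the presentation. Nothing is sent over the network; the code only builds the strings.

```python
from urllib.parse import urlencode

# Assumed endpoint of the NCBI E-utilities search service (not from the slides).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def quote_term(term: str) -> str:
    """Wrap multi-word phrases in double quotes so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def any_of(terms) -> str:
    """Join synonyms with OR inside parentheses."""
    return "(" + " OR ".join(quote_term(t) for t in terms) + ")"


def build_query(required_groups, excluded=()) -> str:
    """AND together the synonym groups, then append NOT for each excluded term."""
    query = " AND ".join(any_of(group) for group in required_groups)
    for term in excluded:
        query += f" NOT {quote_term(term)}"
    return query


query = build_query(
    [("myocardial infarction", "heart attack"), ("aspirin",)],
    excluded=("children",),
)
search_url = ESEARCH_URL + "?" + urlencode({"db": "pubmed", "term": query})

print(query)
print(search_url)
```

Grouping synonyms first and then narrowing with AND and NOT mirrors how a librarian would typically refine a search by hand in an advanced search form.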
too many!
too complicated!
The Role of the Librarian
• Help the user find the information.
• Choose the proper search tools.
• Prepare the toolbox.
• Teach how to use it.
Thank You.