Post on 20-Dec-2015
River Campus Libraries
GUF: Getting Users to Full-Text (With Voyager®, ENCompass™, OpenURL, etc.)
Jeff Suszczynski, Senior Web Developer
Library IT Environment
Digital Initiatives Unit
- Software Developers (3)
- Systems Analyst
- Computer Scientist
- Anthropologist
- Art Director / Designer
- Digital Librarian for Public Services
- Usability Team
- Content Groups
Library IT Environment
Usability Testing Lab
Metasearch - Major Issues
Our users fail when they must:
- Deal with Link Resolver menu choices
- Follow long click-paths
- Get stuck at dead-ends
- Resubmit their searches mid-session
Major Issues - Examples
Major Issues - Resolved
GUF (Getting Users to Full-text)
Title links on results screen lead to either:
- Full-text (best)
- Print holdings information with map (2nd best)
- Pre-filled interlibrary loan request form (worst)
Issues Addressed:
- Deal with Link Resolver menus
- Follow long click-paths
- Get stuck at dead-ends
- Resubmit search mid-session
[Diagram: Library Web Server hosting the library website user interface and GUF. Labels: Search, List of results, GUF, Subscription Database, Full text, Map to journal, ILL login w/ request.]
Major Issues - Resolved
In each case, the user only had to click once to get to full-text or the next best option.
Major Issues - Resolved
To resolve user issues:
- Link Resolver menu choices
- Long click-paths
- Dead-ends
- Resubmitting searches
GUF must do the following:
- Improve metadata transfer
- Eliminate/handle errors
- Eliminate clicks
- Check local holdings
- Handle multiple editions
Improved Metadata Transfer
Problem:
Metasearch (ERA) ==> Link Resolver (SFX, LinkFinderPlus)
Metadata hand-offs are extremely important…
…and are generally inadequate:
- inability to handle different sets of metadata for different databases
- inability to handle multiple item types from each database
Improved Metadata Transfer
Example for Database X:
if type (072 $a) = ‘Journal Article’ or ‘Periodical’ then: ISSN will be in the 022 $a, the 773 $x, or a section of the 024 $a (parsable SICI)
BUT
if type (072 $a) = ‘Collected Volume Article’ (series) then: ISSN might be in the 490 $x
Metasearch products generally don’t allow for this granularity.
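The Database X rule could be sketched in JavaScript roughly as follows. This is only an illustration: the record shape (keys such as "072a") and the function name findIssn are assumptions, not part of GUF.

```javascript
// Hypothetical record shape: MARC tag+subfield keys mapped to string values.
// The rules mirror the "Database X" example only; other databases need their own.
function findIssn(record) {
  const type = record["072a"] || "";
  const issnPattern = /\b\d{4}-?\d{3}[\dXx]\b/;

  let candidates;
  if (type === "Journal Article" || type === "Periodical") {
    // 024 $a may hold a SICI, which begins with the ISSN.
    candidates = [record["022a"], record["773x"], record["024a"]];
  } else if (type === "Collected Volume Article") {
    candidates = [record["490x"]];
  } else {
    candidates = [];
  }

  // Return the first ISSN-shaped value found, normalized to 8 characters.
  for (const value of candidates) {
    const match = value ? value.match(issnPattern) : null;
    if (match) return match[0].replace("-", "").toUpperCase();
  }
  return "";
}
```

The point of the sketch is that the lookup order depends on the item type, which is the granularity a generic hand-off lacks.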
Improved Metadata Transfer
GUF succeeds where Metasearch => Link Resolver often fails: advanced metadata parsing tailored for each database’s quirks
Improved Metadata Transfer
How does GUF do this?
results.xsl => objectGUF.xsl
Title link on results.xsl sends metadata to objectGUF.xsl
Improved Metadata Transfer
How?
objectGUF.xsl => parserxxxdb.js
- objectGUF.xsl grabs and massages metadata received
- objectGUF.xsl sends cleaned and selected metadata to database-specific JavaScript file
<xsl:when test="$repocode='E_FSARTF'">
  <!-- Shared path to the MARC metadata of the current ENCompass object. -->
  <xsl:variable name="marc" select="/ENCOMPASS/ENCOMPASS_BATCHQUERY/GET_OBJECT_XML/SETTAGS/OXML/EncRepoObject/ObjMetadata/MARC"/>
  <xsl:variable name="sici" select="$marc/MR024/MR024a"/>
  <xsl:variable name="unique_id" select="$marc/MR035/MR035a"/>
  <xsl:variable name="author" select="$marc/MR100/MR100a"/>
  <!-- A variable cannot be redefined in the same scope, so the raw title gets its own name. -->
  <xsl:variable name="atitle_raw" select="$marc/MR245/MR245a"/>
  <!-- Strip double quotes so the title cannot break the JavaScript string below. -->
  <xsl:variable name="atitle" select="translate($atitle_raw, '&quot;', '')"/>
  <xsl:variable name="jtitle" select="$marc/MR773/MR773t"/>
  <xsl:variable name="volume" select="$marc/MR949/MR949a"/>
  <xsl:variable name="issue" select="$marc/MR949/MR949b"/>
  <xsl:variable name="spage" select="$marc/MR949/MR949f"/>
  <xsl:variable name="epage" select="$marc/MR949/MR949c"/>
  <xsl:variable name="year" select="$marc/MR949/MR949g"/>
  <!-- Hand the selected fields to the database-specific JavaScript parser. -->
  <script language="javascript">
    fa2_articlefirst("<xsl:value-of select="$sici"/>", "<xsl:value-of select="$unique_id"/>", "<xsl:value-of select="$author"/>",
      "<xsl:value-of select="$atitle"/>", "<xsl:value-of select="$jtitle"/>", "<xsl:value-of select="$volume"/>",
      "<xsl:value-of select="$issue"/>", "<xsl:value-of select="$spage"/>", "<xsl:value-of select="$epage"/>",
      "<xsl:value-of select="$year"/>")
  </script>
</xsl:when>
Improved Metadata Transfer
How?
parserxxxdb.js
JavaScript file, tailored for particular database, receives the metadata from objectGUF.xsl
It further massages the metadata using regular expressions
Improved Metadata Transfer
<MR973a>The New Republic$bNew Repub$c233$d4$fJuly$gJuly$h07$i25$j2005$k6</MR973a>
var volume = '';
var regvolume = /\$c/;
if (regvolume.test(journalinfo)) {
  // Drop everything through "$c", then everything from the next "$" onward.
  volume = journalinfo.replace(/.*\$c/, '').replace(/\$.*/, '');
}

var issue = '';
var regissue = /\$d/;
if (regissue.test(journalinfo)) {
  issue = journalinfo.replace(/.*\$d/, '').replace(/\$.*/, '');
  // Remove any letters, leaving only the issue number.
  issue = issue.replace(/[a-zA-Z]/g, '');
}
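The same subfield extraction could be generalized so that every "$"-delimited subfield is parsed in one pass. The sketch below is an illustration only; parseSubfields and the "_lead" key are hypothetical names, not part of the GUF parser files.

```javascript
// Split a "$"-delimited field such as MR973a into a map of subfield code -> value.
// Text before the first "$" is stored under the hypothetical key "_lead".
function parseSubfields(field) {
  const parts = field.split("$");
  const result = { _lead: parts[0] };
  for (const part of parts.slice(1)) {
    if (part.length === 0) continue;
    // The first character is the subfield code; the rest is its value.
    // A repeated code overwrites the earlier value (last occurrence wins).
    result[part[0]] = part.slice(1);
  }
  return result;
}

const journalinfo = "The New Republic$bNew Repub$c233$d4$fJuly$gJuly$h07$i25$j2005$k6";
const subfields = parseSubfields(journalinfo);
const volume = subfields.c || "";
// As in the slide code, strip letters from the issue value.
const issue = (subfields.d || "").replace(/[a-zA-Z]/g, "");
```

For the sample record, this yields volume "233" and issue "4", matching what the two regular-expression passes above extract.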
Improved Metadata Transfer
How?
parserxxxdb.js
JavaScript file forms robust OpenURL using regular expressions and advanced parsing
JavaScript file ships the OpenURL off to GUF
All of this creates a better OpenURL than a typical Metasearch => Link Resolver hand-off, allowing for more and better links to full-text.
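A minimal sketch of how a parser file might assemble the OpenURL from its cleaned fields, assuming OpenURL 0.1-style key names. The function name buildOpenUrl and the base URL in the usage line are placeholders, not the actual GUF address.

```javascript
// Build an OpenURL-style query string from parsed citation fields.
// Empty or missing fields are skipped so the resolver only sees usable metadata.
function buildOpenUrl(baseUrl, citation) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(citation)) {
    if (value !== undefined && value !== null && String(value).trim() !== "") {
      params.append(key, String(value).trim());
    }
  }
  return baseUrl + "?" + params.toString();
}

// Usage with a placeholder base URL:
const exampleOpenUrl = buildOpenUrl("http://example.org/guf", {
  genre: "article",
  issn: "00084360",
  volume: "43",
  issue: "",
  atitle: "A glimpse of something",
});
```

Leaving out empty fields is a design choice of this sketch: a blank value can mislead a resolver more than an absent one.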
Articles – Eliminating Errors
Problem:
Full-text link on Link Resolver menu yields error page fairly often.
Articles – Eliminating Errors
GUF eliminates these dead-ends by:
- pre-fetching pages
- trying other sources on error
Articles – Eliminating Errors
1. GUF receives OpenURL from ERA server (JavaScript file)
2. GUF formats OpenURL into XML
Example:
http://chico.lib.rochester.edu:8080/SFX_API/sfx_local?XML=
<?xml version="1.0" ?>
<open-url>
  <object_description>
    <object_metadata_zone>
      <genre>article</genre>
      <issn>00084360</issn>
      <volume>43</volume>
      <issue>181</issue>
      <spage>149</spage>
      <title>Canadian Literature</title>
      <atitle>A+glimpse+of+something</atitle>
      <date>2004-08</date>
      <aulast>Beauregard</aulast>
      <aufirst>Guy</aufirst>
      <__service_type>getFullTxt</__service_type>
    </object_metadata_zone>
  </object_description>
</open-url>
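Step 2 could be sketched as below, shown in JavaScript for illustration (the GUF code on these slides is ColdFusion). The element layout follows the example above; openUrlToXml and escapeXml are hypothetical helper names, and the sketch assumes every OpenURL key is a valid XML element name.

```javascript
// Escape the characters that would break XML element content.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Turn an OpenURL query string into the XML request layout shown above.
function openUrlToXml(queryString, serviceType) {
  const params = new URLSearchParams(queryString);
  const lines = [
    '<?xml version="1.0" ?>',
    "<open-url>",
    "<object_description>",
    "<object_metadata_zone>",
  ];
  // One element per OpenURL key/value pair.
  for (const [key, value] of params) {
    lines.push(`<${key}>${escapeXml(value)}</${key}>`);
  }
  lines.push(`<__service_type>${escapeXml(serviceType)}</__service_type>`);
  lines.push("</object_metadata_zone>", "</object_description>", "</open-url>");
  return lines.join("\n");
}
```

Note that URLSearchParams decodes "+" to a space, so the sketch emits decoded titles rather than the "+"-encoded form in the example.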
Articles – Eliminating Errors
3. GUF sends XML OpenURL via HTTP Post to SFX API
4. SFX API returns a list of full-text URLs
Example…
Articles – Eliminating Errors
<?xml version="1.0"?>
<openurl_result>
  <record>
    <aulast>Beauregard</aulast>
    <date>2004</date>
    <atitle>A glimpse of something</atitle>
    <spage>149</spage>
    <issn>00084360</issn>
    <__service_type>getFullTxt</__service_type>
    <issue>181</issue>
    <title>Canadian Literature</title>
    <aufirst>Guy</aufirst>
  </record>
  <target>
    <url>http://gateway.proquest.com/openurl?ctx_ver=Z39.88-2003&res_id=xri:pqd&rft_val_fmt=ori:fmt:kev:mtx:journal&genre=article&issn=0008-4360&date=2004&atitle=A+glimpse+of+something&req_dat=xri:pqil:pq_clntid=17941</url>
    <target_name>available via ProQuest Research Library</target_name>
    <service>getFullTxt</service>
  </target>
</openurl_result>
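Pulling the full-text URLs out of a response like this one might look as follows. This is a regex-based sketch for illustration, assuming the target/url/target_name layout shown above; extractTargets is a hypothetical name, and a real implementation would more likely use an XML parser.

```javascript
// Collect { url, name } pairs from each <target> element of an SFX-style response.
function extractTargets(xml) {
  const targets = [];
  const targetPattern = /<target>([\s\S]*?)<\/target>/g;
  let match;
  while ((match = targetPattern.exec(xml)) !== null) {
    const body = match[1];
    const url = /<url>\s*([\s\S]*?)\s*<\/url>/.exec(body);
    const name = /<target_name>([\s\S]*?)<\/target_name>/.exec(body);
    if (url) {
      targets.push({
        // Remove stray whitespace (e.g. a line-wrapped URL) and unescape ampersands.
        url: url[1].replace(/\s+/g, "").replace(/&amp;/g, "&"),
        name: name ? name[1].trim() : "",
      });
    }
  }
  return targets;
}
```

Targets without a url element are skipped, since they give GUF nothing to pre-fetch.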
Articles – Eliminating Errors
5. GUF executes HTTP calls (screen scraping) to the full-text URLs
6. If any error occurs, skip to the next full-text URL until all are exhausted
<CFELSEIF #REFindNoCase("proquest", targets)#>
  <cfloop from="1" to="#ArrayLen(ft_link)#" index="i">
    <CFIF #REFindNoCase("proquest", ft_link[i].XmlText)#>
      <!--- Go to link from ProQuest. --->
      <CFSET target_url = #ft_link[i].XmlText#>
      <CFIF REFindNoCase("issn\=\d\d", target_url) AND REFindNoCase("volume\=\d", target_url) AND REFindNoCase("issue\=\d", target_url) AND REFindNoCase("spage\=\d", target_url)>
        <CFSET target_url = #REReplaceNoCase(target_url, "\&date\=(\d)*\&req", "&req")#>
      </CFIF>
      <CFIF REFindNoCase("\&date\=\d\d\d\d\-\d\d\d\&", target_url)>
        <CFSET target_url = REReplaceNoCase(target_url, "\-\d\d\d\&atitle", "&atitle")>
      </CFIF>
      <CFHTTP URL="#target_url#">
      <CFSET new_url = #cfhttp.filecontent#>
      <!--- Need to insert error checking here. --->
      <CFIF REFindNoCase("did not find any documents", new_url) EQ 0>
        <!--- Check for link to PDF. --->
        <CFIF #REFindNoCase("alt\=""Page Image \- PDF", new_url)#>
          <!--- If it exists, parse out the URL and go to it. --->
          <CFSET new_url = #REReplaceNoCase(new_url, "alt\=""Page Image \- PDF.*", "")#>
          <CFSET new_url = #REReplaceNoCase(new_url, ".*href\=""", "", "ALL")#>
          <CFSET new_url = #REReplaceNoCase(new_url, """.*", "")#>
          <CFSET new_url = "http://proquest.umi.com" & #new_url#>
          <CFLOCATION url="#new_url#" addtoken="no">
        <!--- Otherwise, check for link to Text+Graphics. --->
        <CFELSEIF #REFindNoCase("alt\=""Text\+Graphics", new_url)#>
          <!--- If it exists, parse out the URL and go to it. --->
          <CFSET new_url = #REReplaceNoCase(new_url, "alt\=""Text\+Graphics.*", "")#>
          <CFSET new_url = #REReplaceNoCase(new_url, ".*href\=""", "", "ALL")#>
          <CFSET new_url = #REReplaceNoCase(new_url, """.*", "")#>
          <CFSET new_url = "http://proquest.umi.com" & #new_url#>
          <CFLOCATION url="#new_url#" addtoken="no">
        <!--- Check for link to Full text (text-only format). --->
        <CFELSEIF #REFind("alt\=""Full text", new_url)#>
          <CFSET new_url = #REReplaceNoCase(new_url, "alt\=""Full text.*", "")#>
          <CFSET new_url = #REReplaceNoCase(new_url, ".*href\=""", "", "ALL")#>
          <CFSET new_url = #REReplaceNoCase(new_url, """.*", "")#>
          <CFSET new_url = "http://proquest.umi.com" & #new_url# & "##fulltext">
          <CFLOCATION url="#new_url#" addtoken="no">
        <!--- Otherwise, do nothing. --->
        <CFELSE>
        </CFIF>
      </CFIF>
    <!--- Closes the per-link ProQuest check (this closing tag was missing from the excerpt). --->
    </CFIF>
  </cfloop>
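Steps 5 and 6 amount to a try-the-next-source loop. A language-neutral sketch in JavaScript follows; firstWorkingUrl, fetchPage, and looksLikeError are hypothetical names, with fetchPage standing in for the CFHTTP call and looksLikeError for a vendor-specific check such as the "did not find any documents" test above.

```javascript
// fetchPage: hypothetical injected function, (url) => Promise resolving to page HTML.
// looksLikeError: hypothetical check for vendor error text in the fetched page.
async function firstWorkingUrl(urls, fetchPage, looksLikeError) {
  for (const url of urls) {
    try {
      const html = await fetchPage(url);
      // Accept the first page that does not look like an error page.
      if (!looksLikeError(html)) return url;
    } catch (err) {
      // Network or HTTP failure: fall through to the next candidate.
    }
  }
  return null; // all sources exhausted
}
```

When the function returns null, the caller would move on to the next-best option, such as print holdings or an interlibrary loan form.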
Articles – Eliminating Clicks
Problem:
Full-text links in Link Resolver menus often lead the user to a journal- or abstract-level page.
This forces the user to scan the page and click one or more times to actually see the article.
Articles – Eliminating Clicks
GUF:
- drills down to article level for most databases
- screen scrapes for embedded links to PDF or HTML full-text
<CFELSEIF #REFindNoCase("muse", targets)#>
  <cfloop from="1" to="#ArrayLen(ft_link)#" index="i">
    <CFIF #REFindNoCase("muse", ft_link[i].XmlText)#>
      <!--- If 'muse' was found in data from SFX API, execute CFHTTP call to the URL. --->
      <CFHTTP URL="#ft_link[i].XmlText#">
      <!--- Now screen scrape the data, look to make sure a link to a specific article is returned. --->
      <CFIF (REFindNoCase("Access article in PDF", cfhttp.filecontent)) AND (REFindNoCase("href\=.*\.pdf", cfhttp.filecontent)) AND (REFindNoCase("\<title\>.*Table of Contents\<\/title", cfhttp.filecontent) EQ 0)>
        <!--- If a specific article is returned, parse out just the embedded link to full-text --->
        <CFSET new_url = #cfhttp.filecontent#>
        <CFSET new_url = #REReplaceNoCase(new_url, ".*\<url\>", "")#>
        <CFSET new_url = #REReplaceNoCase(new_url, "\<\/url\>.*", "")#>
        <CFSET new_url = #REReplaceNoCase(new_url, "\.(html|htm)", ".pdf")#>
        <CFSET new_url = "http://muse.uq.edu.au" & new_url>
        <!--- Now go right to the article, without extra clicks. --->
        <CFLOCATION url="#new_url#" addtoken="no">
      <CFELSEIF REFindNoCase("following articles match", cfhttp.filecontent)>
        <CFSET new_url = #cfhttp.filecontent#>
        <CFIF IsDefined('url.atitle') AND url.atitle NEQ "">
          <CFSET new_url = #REReplaceNoCase(new_url, "#url.atitle#.*", "")#>
          <CFSET new_url = #REReplaceNoCase(new_url, ".*href\=""", "", "ALL")#>
          <CFSET new_url = #REReplaceNoCase(new_url, """ target\=""\_new""\>.*", "")#>
          <CFSET base_url = "http://muse.uq.edu.au">
          <CFSET new_url = base_url & new_url>
          <CFLOCATION url="#new_url#" addtoken="no">
        </CFIF>
      <CFELSEIF REFindNoCase("Page Not Found\<\/title\>", cfhttp.filecontent) EQ 0>
        <CFSET new_url = #ft_link[i].XmlText#>
        <CFLOCATION url="#new_url#" addtoken="no">
      <CFELSE>
      </CFIF>
    </CFIF>
  </cfloop>
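The screen-scraping idea, reduced to its core, is "find the embedded PDF link and resolve it against the vendor's host". A small JavaScript sketch follows, for illustration only: findPdfLink and isTableOfContents are hypothetical names, and real vendor pages need the per-database patterns shown in the ColdFusion above.

```javascript
// Return true when the page title marks a journal-level table of contents.
function isTableOfContents(html) {
  return /<title>[^<]*Table of Contents[^<]*<\/title>/i.test(html);
}

// Find the first href ending in ".pdf" and resolve it against the vendor base URL.
// Returns null when the page is a table of contents or has no PDF link.
function findPdfLink(html, baseUrl) {
  if (isTableOfContents(html)) return null;
  const match = /href\s*=\s*"([^"]+\.pdf)"/i.exec(html);
  if (!match) return null;
  return new URL(match[1], baseUrl).href;
}
```

Resolving with the URL constructor handles both relative and absolute links, so the base host does not need to be concatenated by hand.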
Articles – Check local holdings
Problem:
Link resolvers do not compare citation info against actual holdings info.
Articles – Check Local Holdings
GUF is able to check your citation against local print holdings
How?
- Extract of entire print journal holdings from Voyager into a homegrown SQL table
- Print holdings are parsed using a set of regular expressions on our non-standard holdings info
Thus, the user only sees the ‘Library Holdings’ page when the library actually owns the volume/issue/date for a metasearch result.
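A sketch of the holdings comparison, under a loudly stated assumption: the holdings format used here, "v.1(1950)-v.43(2004)", is invented for illustration, since the slides do not show the library's actual non-standard holdings statements. The function names are likewise hypothetical.

```javascript
// Assumed, hypothetical holdings format: "v.1(1950)-v.43(2004)".
// An open-ended run ends with "-", e.g. "v.10(1990)-".
function parseHoldings(statement) {
  const match = /v\.(\d+)\((\d{4})\)-(?:v\.(\d+)\((\d{4})\))?/.exec(statement);
  if (!match) return null;
  return {
    firstVolume: Number(match[1]),
    firstYear: Number(match[2]),
    lastVolume: match[3] ? Number(match[3]) : Infinity,
    lastYear: match[4] ? Number(match[4]) : Infinity,
  };
}

// True only when both the cited volume and year fall inside the held run.
function libraryOwns(statement, citation) {
  const run = parseHoldings(statement);
  if (!run) return false; // unparsable holdings: do not claim ownership
  const volumeOk = citation.volume >= run.firstVolume && citation.volume <= run.lastVolume;
  const yearOk = citation.year >= run.firstYear && citation.year <= run.lastYear;
  return volumeOk && yearOk;
}
```

Treating unparsable holdings as "not owned" errs toward sending the user to another option rather than to a shelf that may be empty.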
Books – Multiple editions
Problem:
Metasearch can yield book results for particular editions
Currently no way that link resolvers can determine whether local holdings include alternate editions
Dead end result for users
Books – Multiple Editions
GUF uses OCLC’s xISBN service to find all editions and their ISBNs.
Example:
<CFHTTP url="http://labs.oclc.org/xisbn/#url.isbn#" method="get">
Yields:
<?xml version="1.0" encoding="UTF-8" ?>
<idlist>
  <isbn>0441172717</isbn>
  <isbn>0801950775</isbn>
  <isbn>0399128964</isbn>
  <isbn>044100590x</isbn>
  <isbn>1556909330</isbn>
  <isbn>0425027066</isbn>
  <isbn>0425036987</isbn>
  <isbn>0425046877</isbn>
  <isbn>042507160x</isbn>
  <isbn>042505313x</isbn>
  <isbn>0441172660</isbn>
  <isbn>0736692401</isbn>
  …
</idlist>
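Turning an idlist response like this into a clean list of ISBNs might be sketched as follows. This is an illustration assuming the layout shown above; extractIsbns is a hypothetical name.

```javascript
// Collect the ISBNs from an <idlist> response, normalized and de-duplicated.
function extractIsbns(xml) {
  const isbns = [];
  // Accept 10-character ISBNs (possibly ending in X) or 13-digit ISBNs.
  const pattern = /<isbn>\s*([0-9Xx]{10}|[0-9]{13})\s*<\/isbn>/g;
  let match;
  while ((match = pattern.exec(xml)) !== null) {
    isbns.push(match[1].toUpperCase());
  }
  // A Set preserves first-seen order while dropping repeats.
  return [...new Set(isbns)];
}
```

Upper-casing the check character means "044100590x" and "044100590X" count as the same edition in later catalog searches.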
Books – Check local holdings
GUF searches catalog for complete list of ISBNs to find ‘good enough’ copy
How?
- ColdFusion custom tag that allows multiple, concurrent HTTP requests against Voyager OPAC
- This allows for reasonable performance over large numbers of ISBN searches
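The concurrent lookup can be illustrated in JavaScript with Promise.all. This is not the ColdFusion custom tag itself; findHeldEditions is a hypothetical name, and searchCatalog is a hypothetical injected function standing in for an HTTP search against the OPAC.

```javascript
// searchCatalog: hypothetical injected function, (isbn) => Promise<boolean>,
// resolving to true when the catalog holds that ISBN.
async function findHeldEditions(isbns, searchCatalog) {
  const unique = [...new Set(isbns.map((isbn) => isbn.toUpperCase()))];
  // Start every lookup at once, then wait for all of them together.
  const results = await Promise.all(
    unique.map(async (isbn) => {
      try {
        return (await searchCatalog(isbn)) ? isbn : null;
      } catch (err) {
        return null; // treat a failed lookup as "not held"
      }
    })
  );
  return results.filter((isbn) => isbn !== null);
}
```

Because the lookups overlap, total wait time is closer to the slowest single search than to the sum of all searches. A production version would probably cap the number of simultaneous requests.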
Miscellaneous Features
- ISSN lookup
- Homemade DOI resolver
- Date formatting
- findarticles.com
Future Enhancements
- Item records for print journals
- Journal abbreviation lookup table
- Google spellchecking
- CrossRef DOI help
- Flash-based maps
Other Ideas?
Any ideas you may have are welcome…