Analytics @ Lancaster University Library
IGeLU 2014
John Krug, Systems and Analytics Manager, Lancaster University Library
http://www.slideshare.net/jhkrug/igelu-analytics-2014
• We are in Lancaster, in the UK North West
• ~12,000 FTE students, ~2,300 FTE staff
• The Library has 55 FTE staff; a building refurbishment is in progress
• The University aims to be 10, 100 – Research, Teaching, Engagement
• Global outlook, with partnerships in Malaysia, India and Pakistan and a new Ghana campus
• Alma implemented January 2013 as an early adopter
• I am Systems and Analytics Manager, at LUL since 2002 (hired to implement Aleph) – systems background, not library
• How can library analytics help?
Lancaster University, the Library and Alma
• Following implementation of Alma, analytics dashboards were rapidly developed for common reporting tasks
• Ongoing work in this area, refining existing reports and developing new ones
Alma Analytics reporting and dashboards
Projects & Challenges
• LDIV – Library Data, Information & Visualisation
• ETL experiments done using PostgreSQL and Python
• Data from Aleph, Alma, EZproxy, etc.
• Smaller projects, e.g. re-shelving performance – requires Alma Analytics returns data along with the number of trolleys re-shelved daily
• Challenges – infrastructure, skills, time
• Lots of new skills/knowledge needed for analytics. For us: Alma Analytics (OBIEE), Python, Django, Postgres, Tableau, nginx, OpenResty, Lua, JSON, XML, XSL, statistics, data preparation, ETL, etc.
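The re-shelving measure mentioned above could be computed along these lines. This is purely an illustrative sketch: the function, field names and sample figures are invented here, and dividing daily returns by trolleys re-shelved is an assumed definition of the metric, not necessarily the library's own.

```python
# Hypothetical sketch: combine a daily returns count (e.g. from an Alma
# Analytics report) with a manually recorded count of trolleys re-shelved
# to get a rough items-per-trolley figure. Names and numbers are invented.

def reshelving_rate(daily_returns, daily_trolleys):
    """Return items re-shelved per trolley for each day both series cover."""
    rates = {}
    for day, returns in daily_returns.items():
        trolleys = daily_trolleys.get(day)
        if trolleys:  # skip days with missing or zero trolley data
            rates[day] = returns / trolleys
    return rates

returns = {"2014-07-10": 420, "2014-07-11": 365}
trolleys = {"2014-07-10": 12, "2014-07-11": 10}
print(reshelving_rate(returns, trolleys))
```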
Alma Analytics data extraction
• Requires using a SOAP API (thankfully a RESTful API is now available for Analytics)
• SOAP support for Python is not very good; much better with REST. Currently using the suds Python library, with a few bug fixes for compression, ‘&’ encoding, etc.
• A script, get_analytics, invokes the required report, manages the collection of multiple ‘gets’ if the data is large, and produces a single XML file as the result
• Needs porting from SOAP to REST
• Data extraction from Alma Analytics is straightforward, especially with REST
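A sketch of what the REST-based paging loop in get_analytics could look like. The IsFinished/ResumptionToken handling follows the documented shape of Alma Analytics API responses; the fetch callable, report path and everything else here are placeholders rather than the production script.

```python
# Sketch of paged extraction from the Alma Analytics REST API, replacing
# the old SOAP flow. The response parsing assumes the documented
# IsFinished/ResumptionToken elements and rowset namespace.
import xml.etree.ElementTree as ET

ROWSET_NS = "urn:schemas-microsoft-com:xml-analysis:rowset"

def _text(root, local_name):
    """Text of the first element with this local name, in any namespace."""
    for el in root.iter():
        if el.tag.split("}")[-1] == local_name:
            return el.text
    return None

def parse_page(xml_text):
    """Return (rows, is_finished, resumption_token) for one response page."""
    root = ET.fromstring(xml_text)
    rows = [
        {col.tag.split("}")[-1]: col.text for col in row}
        for row in root.iter("{%s}Row" % ROWSET_NS)
    ]
    finished = (_text(root, "IsFinished") or "").lower() == "true"
    return rows, finished, _text(root, "ResumptionToken")

def get_report(fetch, path, limit=1000):
    """Keep issuing GETs until IsFinished is true, accumulating rows.

    `fetch(params)` performs the HTTP GET (e.g. with requests against
    /almaws/v1/analytics/reports) and returns the response body.
    """
    rows, token, finished = [], None, False
    while not finished:
        params = {"limit": limit}
        params["token" if token else "path"] = token or path
        page, finished, token = parse_page(fetch(params))
        rows.extend(page)
    return rows
```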
• EZproxy logs
• Enquiry/exit desk query statistics
• Re-shelving performance data
• Shibboleth logs, hopefully soon – we are dependent on central IT services
• Library building usage counts
• Library PC usage statistics
• JUSP & USTAT aggregate usage data
• University faculty and department data
• Social networking
• New Alma Analytics subject areas, especially uResolver data
Data from other places
• Currently we have aggregate data from JUSP and USTAT
• EZproxy gives a partial off-campus picture, but it is web-oriented rather than resource-oriented
• We really want the data from Shibboleth and uResolver
• Why the demand for such low-level data about individuals?
Gaps in the electronic resource picture
The library and learner analytics
• Learner analytics is a growth field
• Driven by a mass of data from VLEs and MOOCs … and libraries
• Student satisfaction & retention
• Intervention(?)

  if low(library borrowing) &
     low(eresource access) &
     high(rate of near-late or late submissions) &
     low_to_middling(grades)
  then
     do_something()

• The library can’t do all that, but the university could/can
• The library can provide data
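The intervention rule sketched on the slide could be expressed as a small predicate. Everything concrete here is invented for illustration: the thresholds, the field names and the flag-for-support action would all be university-level policy decisions, not library code.

```python
# Illustrative only: the slide's pseudocode as a Python predicate.
# Thresholds and field names are invented; a real rule would be
# agreed at university level and evaluated on properly governed data.

def needs_intervention(student):
    """low(borrowing) & low(eresource access) &
    high(near-late/late submissions) & low_to_middling(grades)."""
    return (
        student["loans_per_term"] < 2
        and student["eresource_sessions"] < 5
        and student["late_submission_rate"] > 0.3
        and student["mean_grade"] < 60
    )

student = {
    "loans_per_term": 1,
    "eresource_sessions": 2,
    "late_submission_rate": 0.5,
    "mean_grade": 52,
}
if needs_intervention(student):
    print("flag for support")  # the slide's do_something()
```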
The library as data provider
• LAMP – Library Analytics & Metrics Project, from JISC
• http://jisclamp.mimas.ac.uk
• We will be exporting loan and anonymised student data for use by LAMP
• They are experimenting with dashboards and applications
• A prototype application is due later this year
• Overlap with our own project, LDIV
• The Library API
• For use by analytics projects within the university – the Planning Office, Student Services and others
The Library API
• Built using OpenResty, nginx and Lua
• RESTful-like API interface
• e.g. retrieve physical loans for a patron:
• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml (or json)
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <record>
    <call_no>AZKF.S75 (H)</call_no>
    <loan_date>2014-07-10 15:44:00</loan_date>
    <num_renewals>0</num_renewals>
    <bor_status>03</bor_status>
    <rowid>3212</rowid>
    <returned_date>2014-08-15 10:16:00</returned_date>
    <collection>MAIN</collection>
    <rownum>1</rownum>
    <material>BOOK</material>
    <patron>b3ea5253dd4877c94fa9fac9</patron>
    <item_status>01</item_status>
    <call_no_2>B Floor Red Zone</call_no_2>
    <bor_type>34</bor_type>
    <key>000473908000010-200208151016173</key>
    <due_date>2015-06-19 19:00:00</due_date>
  </record>
</response>
[{
  "rownum": 1,
  "key": "000473908000010-200208151016173",
  "patron": "b3ea5253dd4877c94fa9fac9",
  "loan_date": "2014-07-10 15:44:00",
  "due_date": "2015-06-19 19:00:00",
  "returned_date": "2014-08-15 10:16:00",
  "item_status": "01",
  "num_renewals": 0,
  "material": "BOOK",
  "bor_status": "03",
  "bor_type": "34",
  "call_no": "AZKF.S75 (H)",
  "call_no_2": "B Floor Red Zone",
  "collection": "MAIN",
  "rowid": 3212
}]
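A minimal sketch of how an analytics consumer might use a record like the one above, computing how long the item was on loan and whether it came back after the due date. The function and derived fields are assumptions for illustration, not part of the Library API itself.

```python
# Hypothetical consumer of a /ploans record: derive per-loan measures
# from the date fields in the JSON response shown above.
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def loan_summary(record):
    """Return the (pseudonymised) patron, days on loan and a lateness flag."""
    loaned = datetime.strptime(record["loan_date"], FMT)
    returned = datetime.strptime(record["returned_date"], FMT)
    due = datetime.strptime(record["due_date"], FMT)
    return {
        "patron": record["patron"],
        "days_on_loan": (returned - loaned).days,
        "returned_late": returned > due,
    }
```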
How does it work?
• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml
• The nginx configuration maps the REST URL onto a database query
location ~ /ploans/(?<patron>\w+) {
    ## collect and/or set default parameters
    rewrite ^ /ploans_paged/$patron:$start:$nrows.$fmt;
}

location ~ /ploans_paged/(?<patron>\w+):(?<start>\d+):(?<nrows>\d+)\.json {
    postgres_pass database;
    rds_json on;
    postgres_query HEAD GET "
        select * from ploans
        where patron = $patron
        and row >= $start
        and row < $start + $nrows";
}
Proxy for making Alma Analytics API requests
• e.g. an Analytics report, and the nginx configuration that proxies it
• So users of our API can get data directly from Alma Analytics; we manage the interface they use and shield them from any API changes at Ex Libris
location /aa/patron_count {
    set $b "api-na.hosted.exlibri … lytics/reports";
    set $p "path=%2Fshared%2FLancas … tron_count";
    set $k "apikey=l7xx6c0b1f6188514e388cb361dea3795e73";
    proxy_pass https://$b?$p&$k;
}
Re-thinking approaches
• Requirements workshops
• Application development
• Data provider via API interfaces
• RDF/SPARQL capability
• LDIV – Library Data, Information and Visualisation
• Still experimenting
• Imported data from EZproxy logs, GeoIP databases, student data, Primo logs and a small amount of Alma data
• Really need Shibboleth and uResolver data
• Tableau as the dashboard onto these data sets
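The EZproxy log import mentioned above could start from a parser like this. It assumes EZproxy's default NCSA-style log format ("%h %l %u %t \"%r\" %s %b"); a real LDIV loader would also join the result against GeoIP and anonymised student data.

```python
# Sketch of the first ETL step for EZproxy logs: parse one NCSA-style
# line into a dict ready for loading into Postgres. The format string
# is EZproxy's default; local deployments may differ.
import re

LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\S+)'
)

def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    m = LOG_RE.match(line)
    if not m:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    return rec
```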