Research Data Lifecycle: Data Skills for Librarians
Kathryn Unsworth (ANDS)
RSCD - BoF
13 February 2017
What data skills do librarians require?
There’s a matrix of possibilities - it’s complex!
https://etherpad.openstack.org/p/RSCD_2017_Data_lifecycle_&_libs
Elements of the data skills for librarians matrix
• Current librarian role - where and at what level it connects with research/researchers – data-related scope
• Future aspirations of the librarian – data-related scope
• Aspects of the Data Lifecycle the Library provides services for
• RDM maturity of the institution (dictates services provided)
• Research intensity of the institution (dictates services provided)
= data skills and knowledge required
Where and at what level you connect with your research communities:
● Based mainly in the library?
● Embedded in research teams/labs?
● Hot desking it out of faculty spaces?
● Roving in a Research Commons (not only the Library)?
● Undertaking research support services (e.g. data consultations, data collection/collating, data cleaning) out in the (research) field?
● A combination?
Which parts of the Data Lifecycle your Library actively supports with services:
● Data seeking for analysis
● Data documentation (metadata)
● Data citation
● Data storage and backups
● Data sharing
● Data archiving and preservation
● Data and teaching
● Data cleaning
● Data visualisation
● DMP tools and advice
+ any intended new services
Your role - what proportion is research & data related? Hybrid or specialist role?
Librarian
Liaison Librarian
Metadata Librarian
Subject Librarian
Research Librarian
Scholarly Communications Librarian
Repository Manager
Repository Officer
Data Librarian
Research Data Management – Librarian
Data Science Librarian
and more…
Same title, but different responsibilities at University X vs University Y
Discussion: Are there other elements that we can add to the matrix?
https://etherpad.openstack.org/p/RSCD_2017_Data_lifecycle_&_libs
How do librarian roles and data skills and knowledge intersect with the Data Lifecycle?
Data-related skills and knowledge for librarians across the Data Lifecycle
Domain knowledge – best practice (current research methods and data models)
Connecting researchers to research / research data
“Map knowledge/data gaps”
“Identify emerging disciplinary cross-overs”
“Assist in the formulation and refinement of innovative research questions”
“Digital tools to automate Literature reviews – Meta, CHORUS system?”
“Applying network analysis to visualise trends in emerging research”
“Tools to map key research terms in articles – where are the terms appearing?”
“Text and data mining techniques for refining research questions”
Project planning
• Project management (PRINCE2, Agile, Waterfall, Critical Path, etc.)
• Business analysis – requirements gathering
• Problem solving, troubleshooting
Collaboration tools/platforms (OSF, Confluence, Syncplicity, Google Apps)
RDMPs (DMPTool, DMPonline)
• Project governance – roles and responsibilities
• Data standards
• Data organisation – file formats, file naming conventions, versioning, etc.
• Ethics and privacy – consent for sharing
• Copyright and other IP, Licensing
• Data storage
• Data security
Funder and publisher requirements for data
Digital literacies training
Data search/discovery:
• Discovery tools and services
• Locate existing data
• Full text search
• Text and data mining
• Web APIs to discover, extract, enrich existing data
Data organisation
Data collection methods – generating new data, transforming legacy data, sharing/exchanging data, purchasing data
Metadata capture and creation tools and services
Metadata standards:
• Data description
• Controlled vocabularies
• Metadata modeling
• Interoperability
Pattern recognition
Collaboration tools and platforms
Databases, including relational
Version tracking
Reference/citation management
Storage options for working, master, raw, sensitive and big data
Data appraisal and selection
Licensing – data access/sharing agreements
Data security
Storage options for active data, collaborative research, data and metadata flows
Data security
Access rules
Data cleaning
Data aggregation
Machine learning/algorithms – graphical modeling
Scripting/coding
Data mapping across data sources
Data transforms, e.g. raster to shape files
Lab notebooks (eLNs)
Data screening and preparation
Iterative data changes prompted by analysis
Preparing data for long-term preservation and sharing
Process documentation – process diagrams, workflows, tools and automation
Data visualisation
Storage options for active data
Data security, including access controls
Data manipulation
Text and data mining
Scripting/coding
Machine learning
Analysis: Statistical, Spatial, Image
Analysis documentation
Modeling
Interpretation
Database programming (querying DBs)
Problem solving/troubleshooting
Analytical thinking
Why share data?
Author/Creator rights
Data catalogs and portals
Sensitive data
Access rules
Metadata standards
• Descriptive metadata
• Controlled vocabularies
Persistent identifiers (DOIs, ORCIDs)
Data citation
Data licensing
Performance/Impact metrics
Programming – front-end – editing web page source code, incorporating forms, multimedia
Contributor badges
Communication
Storytelling
Data visualisation
Client engagement
Advocacy
Persistent identifiers (DOIs, ORCIDs)
Using tools to identify file formats
Conversion to access and preservation formats/mediums
Batch/automation
Data decoding
Data warehousing
Data archives and repositories
Long-term archival storage for final-state data
Metadata standards
• Descriptive metadata for discovery
• Provenance and other administrative metadata
Disposition – disposing of obsolete or redundant data, or archival retention
Licensing – legal framework around how data can be (re)used
Reuse documentation (code, simulations, models, protocols, workflows, etc.)
Impact and assessment metrics (Altmetrics, PlumX, ImpactStory)
Data for teaching
Data citation – how and why to cite data
Whole of lifecycle activities
• Describing and contextualising data (metadata, documentation, associated research outputs)
• Managing data quality
• Storage, Backups and Security
Are you kidding me?
Who has the capacity to attain all these skills?
Teams, not unicorns
“Team-building is another important tactic in tackling the skills gap. There is little point looking for the great, single all-rounder who can do everything – the mythical unicorn. Even if such people existed (and they may) they would be too expensive as they can walk into any job. It is much more profitable to look across the skill-set required and build a team to fulfil it.”
Read more at: http://www.techweekeurope.co.uk/e-management/skills/bridging-data-science-skills-gap-requires-team-effort-160818#msRDJHrzR8QUhLHa.99
Copyrighted Image - Data Science Roles https://libraryconnect.elsevier.com/articles/learning-about-research-data-lab-pitt-ischool
So what’s the minimum data skills requirement for librarians? Is there an optimal level? Maybe even an aspirational level?
Are we talking about all librarians or only those with data-related responsibilities?
As an academic librarian is it ok to just be “data aware” or do we all need to be “data savvy” or maybe something in between?
Discussion…
https://etherpad.openstack.org/p/RSCD_2017_Data_lifecycle_&_libs
What is a Data Savvy Librarian?
“...librarians need increasingly to become data-savvy themselves and to have a deeper understanding of the research data lifecycle in order to enhance the services they offer.”
“...the main requirement is a basic familiarity with how various software tools can transform data.” And, “...to learn the basics of some of the latest tools for extracting, analyzing, storing, and visualizing data.”
“...working directly with messy, unavailable or difficult to-access data it is possible to have a more complete vision of the different issues the researchers have to face when working with data.”
Barbaro, A. (2016). On the importance of being a data-savvy librarian. Journal of EAHIL, 12(1):24-27
Then there’s the question of Data Science in Libraries…
The research librarian of the future: data scientist and co-investigator
There remains something of a disconnect between how research librarians themselves see their role and its responsibilities and how these are viewed by their faculty colleagues. Jeannette Ekstrøm, Mikael Elbaek, Chris Erdmann and Ivo Grigorov imagine how the research librarian of the future might work, utilising new data science and digital skills to drive more collaborative and open scholarship. Arguably this future is already upon us but institutions must implement a structured approach to developing librarians’ skills and services to fully realise the benefits.
Core duties versus ‘stretch’ services
The research librarian community is not in consensus as to what exactly are the emerging roles of future librarians in a rapidly evolving digital scholarship environment (see #libraryfutures). Added to the polarised views within that community, a recent survey shows there is also a clear gap in perception and expectations between librarians and faculty staff. While librarians surveyed agreed that “information literacy” and “aiding students one-on-one in conducting research” are primary and essential roles, they viewed “supporting faculty research” as less important than their faculty colleagues. So does this present an opportunity in the digital age?
The Role of Librarians in Data Science: A Call to Action
“All of this hesitancy on the part of librarians to participate in the data movement is happening at a time when we have seen an increase in the money and involvement in data initiatives from a range of other professions and academic disciplines (e.g. computer science, informatics, etc.). For me, this is an especially critical moment for librarians to talk about data and actively plan and implement our strategies collectively.
I want to share with you a proposed framework for the librarian’s role in data science. I come to the discussion with the fear that data science is an evolving academic discipline being defined solely by computer science and that the field of library and information science is being left behind. I would argue that the principles and values of the field of library and information science that form the core of our profession need to be part of this new discipline and that we can add unique perspectives and roles.”
(Opinion piece by Elaine R. Martin, 2015)
Data Science – is there a future where you see librarians filling the DS skills gap?
What’s your next data skills challenge?
https://etherpad.openstack.org/p/RSCD_2017_Data_lifecycle_&_libs
Discussion points:
Acknowledgements
Barbaro, A. (2016). On the importance of being a data-savvy librarian. Journal of EAHIL, 12(1):24-27. https://www.researchgate.net/publication/299394172_On_the_importance_of_being_a_data-savvy_librarian
Ekstrøm, J., Elbaek, M., Erdmann, C., & Grigorov, I. (2016). The research librarian of the future: data scientist and co-investigator. The Impact Blog, LSE. http://blogs.lse.ac.uk/impactofsocialsciences/2016/12/14/the-research-librarian-of-the-future-data-scientist-and-co-investigator/
Faundeen, J.L., Burley, T.E., Carlino, J.A., Govoni, D.L., Henkel, H.S., Holl, S.L., Hutchison, V.B., Martín, Elizabeth, Montgomery, E.T., Ladino, C.C., Tessler, Steven, and Zolly, L.S., 2013, The United States Geological Survey Science Data Lifecycle Model: U.S. Geological Survey Open-File Report 2013–1265, 4 p., https://doi.org/10.3133/ofr20131265.
Macrae, D. (2015). Why Bridging The Data Science Skills Gap Requires A Team Effort. TechWeek Europe. http://www.techweekeurope.co.uk/e-management/skills/bridging-data-science-skills-gap-requires-team-effort-160818#25vTmJ6UpzSfI20F.99
Martin, Elaine R. (2015). "The Role of Librarians in Data Science: A Call to Action." Journal of eScience Librarianship 4(2): e1092. http://dx.doi.org/10.7191/jeslib.2015.1092
Library Journal Research. (2015). Bridging the Librarian-Faculty Gap in the Academic Library. Gale Cengage Learning. https://s3.amazonaws.com/WebVault/surveys/LJ_AcademicLibrarySurvey2015_results.pdf
University of Central Florida Libraries Research Lifecycle Committee. (2012). The research lifecycle at UCF [Online Graphic]. Retrieved February 13, 2017, from library.ucf.edu/ScholarlyCommunication/ResearchLifecycleUCF.php
With the exception of logos, third party images or where otherwise indicated, this work is licensed under the Creative Commons Australia Attribution 3.0 Licence.
ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program. Monash University leads the partnership with the Australian National University and CSIRO.
Kathryn Unsworth - ANDS Outreach Officer and Data [email protected]