Transparency and Trust: Towards the Promise of Open Science
Professor Liz Lyon, School of Information Sciences, University of Pittsburgh
INCONECSS 2016, Berlin
Agenda
1. In the Headlines
2. Unpacking Transparency
3. Towards Open Science
– Scholarship
– Stewardship
4. Making it Happen
– LIS Workforce Development
– Re-engineering Research Data Service Models
In the Headlines
Tensions?
Trusted product?
Trusted service?
https://www.washingtonpost.com/news/morning-mix/wp/2015/11/09/scientist-falsified-data-for-cancer-research-once-described-as-holy-grail-feds-say/
Trusted data?
US institution X experience
• Anil Potti paper in Nature Medicine 2006
• Independent audit of the research by Baggerly & Coombes (bio-statisticians)
• IRB Inquiry & Report
• Lessons learned include (Ince 2011):
– Sloppiness in data curation & software storage
– Institutional reviewers did not verify the provenance of the data
– Institutional data was not released
– Institutional report was not published
Unpacking the concept:
Transparency
2D Continuum of Openness
[Figure: two axes: Access (Open to Closed); Participation (Lone scholar, Team science, Citizen science)]
Liz Lyon (2009) Open Science at Web Scale Report
Towards a third dimension?
Easterbrook Nature Geoscience (2014)
NIST definitions of Repeatability & Reproducibility in Tech Note 1297 (1994)
Open Science: terms & definitions (1)
• Open or Reproducible Research: Auditable research made openly available
• Auditable Research: Sufficient records (including data and software) have been archived so that the research can be defended later if necessary or differences between independent confirmations resolved.
Victoria Stodden et al Setting the Default to Reproducible Workshop Report (2013)
Open Science: terms & definitions (2)
Transparency:
• The outcome of a suite of behaviours which characterise Reproducible Research
• Facilitates enhanced Research Quality, Integrity and Trust
Liz Lyon (2016) LIBER Q
3D Model of Open Science
[Figure: three axes: Access (Open to Closed); Participation (Lone scholar, Team science, Citizen science); Transparency]
Liz Lyon (2016) LIBER Q
20 Terms: What Transparency is not!
Integrity?
1. Confusing
2. Gray/grey
3. Vague
4. Unclear
5. Opaque
6. Ambiguous
7. Obscured
8. Implicit
9. Hidden
10. Secret
Clarity?
11. Not verified
12. Not validated
13. Not auditable
14. Not supported
15. Not described
16. Not documented
17. Not recorded
18. Not versioned
19. Not tracked
20. No provenance
https://www.flickr.com/photos/8885264
What does this mean for Libraries?
….and for Librarians?
https://www.flickr.com/photos/claudia_l/5614406866/
Context: Research Lifecycle
[Figure: research lifecycle stages: Design; Plan; Collect, Find, Acquire; Process, Visualize, Analyze; Store; Publish, Preserve, Archive; Prepare; Track]
Adapted from ULS RDM WG Research Data Lifecycle
Practice: Actions?
[Figure: Tracking and Transparency across the research lifecycle stages (Design; Plan; Collect, Find, Acquire; Process, Visualize, Analyze; Store; Publish, Preserve, Archive; Prepare; Track). Labels shown in the figure:]
• Products, Identifiers, Peer Reviews, Versions
• Workflow tools, Scripts & Software, Graphics, Models & Simulations
• Data, Code, Samples, Reagents, Materials, Methods, Instruments, Tools, Subjects
• Metadata, Annotations, Formats & Standards, Files, Licenses, Methods & Protocols, Results
• Cloud services, Field Notebooks, ELN, Collaboration spaces
• Proposals, Templates, Drafts
• DMPs
• Re-use, Ratings, Credits, Citations, Blogs, Tweets
Liz Lyon (2016) LIBER Q
Open Science: terms & definitions (3)
Transparency Actions:
• Specific interventions as components of processes, protocols and practices
• Applicable throughout the research lifecycle
Liz Lyon (2016) LIBER Q
Transparency research at Pitt iSchool
• Pilot study 2015-16: explore awareness, attitudes and actions towards Transparency & Open Science
• Aim: to inform LIS service development, tools, LIS education programs, professional skills
• Methodology: focus groups with a) disciplinary researchers b) librarians
• Research Lifecycle as the substrate
Substrate: Research Lifecycle
[Figure: research lifecycle stages: Design; Plan; Collect, Find, Acquire; Process, Visualize, Analyze; Store; Publish, Preserve, Archive; Prepare; Track]
Adapted from ULS RDM WG Research Data Lifecycle
Q1 How are transparency actions reflected in open scholarship?
“Recommendation 6
As a condition of publication, scientific journals should enforce a requirement that the data on which the argument of the article depends should be accessible, assessable, usable and traceable through information in the article.”
Science as an Open Enterprise Report, Royal Society, UK
Journals changing (open) data policy……
• “Data deposition in a public repository is mandatory …”
• A step towards Transparency ?
This is accepted practice in some disciplines, but in others, not so much….
this leads to issues of trust……
GigaScience and Publons
Open peer review (CC-BY)
Papers and datasets
Get credit for your reviews!
http://blogs.biomedcentral.com/bmcblog/2014/06/26/gigascience-helping-reviewers-get-credit-through-publons/
http://www.psycontent.com/content/311q281518161139/fulltext.pdf
Reproducibility Project Psychology
Results 2015: only 39% held up
Transparency & Openness Promotion (TOP) Guidelines
• Center for Open Science 2015
• Science article June 2015
• Journal Policies and Practices
• 8 Transparency Standards
• Templates for 3 Levels of each Standard
http://science.sciencemag.org/content/sci/348/6242/1422.full.pdf?ijkey=ha1o5D9wvW4ZQ&keytype=ref&siteid=sci
8 Transparency Standards (TOP)
1. Citation
2. Data transparency
3. Analytic methods (code) transparency
4. Research materials transparency
5. Design & analysis transparency
6. Pre-registration of studies
7. Registration of analysis plans
8. Replication
CISER Replication Service
http://www.dcc.ac.uk/sites/default/files/documents/IDCC16/54_Arguillas%20and%20Block%20-%20Poster%20IDCC%202016.pdf
Reproducibility isn’t always easy…
From peer reviewed …..
…to peer-reproduced?
Gonzalez-Beltran, Li et al 2015 PLoS ONE
Q2 How are transparency actions reflected in data stewardship?
Laboratory notebooks: 3 role models
http://mss.sagepub.com/content/8/4/422.full.pdf+html
http://darwin-online.org.uk/
http://einsteinpapers.press.princeton.edu/
All three role models
• Recorded thoughts, observations, ideas, calculations
• Demonstrated the provenance of their conclusions
• Allowed other scientists to reuse their findings
• Good practice from > 100 years ago!
http://news.utoronto.ca/huntingtons-disease-university-toronto-researcher-first-share-lab-notes-real-time
• Another step towards Transparency ?
LIS data stewardship workflows to support transparency & trust?
http://datasealofapproval.org/en/
Certification….. Trusted
• Data Seal of Approval for repository certification
• Self-assessment approach with external peer review
• DSA online tool to facilitate application process
• DSA is based on 16 guidelines (Version 2 2013)
Making it Happen
Q3 How can workforce development catalyse transparency and trust?
A family of new data science roles (Lyon & Brenner IJDC 2015)
Linking data roles, skills & curriculum (Lyon et al 2016, Lyon & Mattern 2016)
• Analysis of real-world positions for six data roles
• Part 1: data librarian, data archivist, data steward
• Part 2: data analyst, data engineer, data journalist
• Map to current iSchool courses
• Informing development of a Data Stewardship Pathway
Methods: Data Collection
Date Range for Job Postings:
Part 1: January 2014-April 2015
Part 2: October 2015
Keyword searching and visual scanning
Accessed 10 full job descriptions for each role (with IASSIST postings, more abbreviated job advertisements)
Methods: Content Analysis
Competencies: proficiency with specific tools/technologies/programming languages.
Education: Academic qualifications
Experience: direct, hands-on practice
Knowledge: understanding of/familiarity with topics/subjects/issues
Skills: ability to do an action well
Identified all requirements that appeared in at least three of the positions studied for each role and designated these as “Key Requirements”
Chose not to distinguish between “essential” and “desirable” requirements
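The "Key Requirements" rule described above (a requirement counts as key when it appears in at least three of the postings studied for a role) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' actual analysis procedure: the function name `key_requirements`, the parameter `min_postings`, and the sample requirement strings are all hypothetical, and the real study coded requirements by hand through content analysis.

```python
from collections import Counter


def key_requirements(postings, min_postings=3):
    """Return requirements found in at least `min_postings` of the postings.

    `postings` is a list of requirement lists, one per job posting for a
    single role. Essential and desirable requirements are not distinguished,
    mirroring the choice stated in the methods.
    """
    counts = Counter()
    for requirements in postings:
        # A set ensures each requirement is counted once per posting,
        # even if the advertisement repeats it.
        counts.update(set(requirements))
    return sorted(req for req, n in counts.items() if n >= min_postings)


# Hypothetical coded requirements for four postings of one role.
example_postings = [
    ["metadata standards", "MLIS degree", "Python"],
    ["metadata standards", "MLIS degree", "R"],
    ["metadata standards", "data management planning", "MLIS degree"],
    ["metadata standards", "Python", "data management planning"],
]

print(key_requirements(example_postings))
```

With the hypothetical data above, only "MLIS degree" and "metadata standards" reach the threshold of three postings; lowering `min_postings` widens the list.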
Data Librarian
Data Steward/ Curator
Real World Job analysis Part 1 (Lyon et al iPres Proc 2016)
Promote Transparency
Open Science: terms & definitions (4)
These new Data Science roles can act as
Transparency Agents:
• Promote, demonstrate and action specific behaviours and practices for Open Science
Methods: Course Mapping
Data Stewardship Pathway
[Figure labels: Requirements; Course (×4); Data Science Position (Data Librarian, Data Archivist, Data Curator / Steward, Data Analyst, Data Engineer, Data Journalist)]
Transparency & Trust Principles
“Stepping stones” form a Course Pathway
Transparency and Trust are in the Data Stewardship Pathway in the MLIS curriculum at Pitt iSchool
Q4 How should Library research data service models be re-engineered to support transparency and trust?
1. Transactional delivery model
• In the physical Library
• Remote
• Access & Reference
• RDM Advocacy
• RDM LibGuides
Lyon New Review Academic Libraries (2016) In press
https://www.flickr.com/photos/smiling-gardener
2. Hybrid delivery model
• Assigned to Faculty / Department
• Liaison
• Consultancy
• DMP
• RDM training
Lyon New Review Academic Libraries (2016) In press
https://www.flickr.com/photos/brownlessbiomedicallibrary
3. Immersive delivery model – Librarians in the Lab
• Laboratory or clinical setting
• Integrated
• Collaborative team science
• Data description & curation
• Data analysis & visualisation
https://www.flickr.com/photos/79173425@N03/9018554012/1410324768
Lyon New Review Academic Libraries (2016) In press
Photo Credits: Flickr NASA HQ
Economics & Business?
• Collaborations
• Partnerships
• Institutes
• Centres
• Groups
• Alliances
Benefits of Re-engineering?
• Data support at the researchers’ point-of-need (here and now)
• LIS professionals fully integrated at the coalface (in the field, in the business, in the lab….)
• Default listings in citations with attribution + credit (LIS “co-authors”)
• LIS data science roles act as transparency agents (enhance research integrity & open science)
Radical Re-engineering….
…our academic & research libraries
https://en.wikipedia.org/wiki/Heydar_Aliyev_Center
Thank you….
INCONECSS 2016
Professor Liz Lyon, School of Information Sciences, University of Pittsburgh