TRANSCRIPT
A story of preprints and curation networks: efficiently scaling community outreach using public goods infrastructure
Jeffrey Spies, Co-founder and CTO, Center for Open Science
Philip Cohen, Professor of Sociology, University of Maryland
Claire Stewart, Associate University Librarian for Research and Learning, University of Minnesota
Cynthia Hudson-Vitale, Data Services Coordinator, Washington University in St. Louis
SHARE is a free, open dataset of research activity across the research workflow supported by free, open source tools.
The OSF is a free, open source workflow management, integration, and sharing platform.
Public good
● A commodity or service that is provided without profit to all members of a society, either by the government or a private individual or organization. (Oxford)
● A good that is both non-excludable and non-rivalrous in that individuals cannot be effectively excluded from use and where use by one individual does not reduce availability to others. (Wikipedia)
Openness fosters inclusivity, collaboration, and innovation.
Scaling
3^0 = 1
3^1 = 3
3^2 = 9
3^3 = 27
3^4 = 81
3^5 = 243
3^6 = 729
3^7 = 2187
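The powers of three on this slide can be reproduced with a few lines of Python. This is a minimal sketch under an assumption the slide does not state: that each newly engaged person goes on to engage three more, so round k reaches 3^k people. The function name `reach_per_round` is hypothetical.

```python
def reach_per_round(branching: int, rounds: int) -> list[int]:
    """Return branching**k for k = 0..rounds (people reached in each round)."""
    return [branching ** k for k in range(rounds + 1)]


# Assumption: a branching factor of 3, matching the 3^k figures on the slide.
reach = reach_per_round(3, 7)
for k, n in enumerate(reach):
    print(f"3^{k} = {n}")
```

The point of the arithmetic is that the person at round 0 only ever talks to three people; the growth comes from everyone downstream being able to do the same.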
How to achieve efficient scaling
● Engage broadly (and allow others to engage broadly)
● Facilitate experts being experts
○ Especially as force multipliers--make others more efficient
● Facilitate (unknown) innovations
By
● Respecting current incentives and current workflows
● Allowing people to be selfish
● Creating virtuous cycles
● Reusing/repurposing modular, open infrastructure
● Crediting
OSF
Publish Report
Search / Discovery
Develop Idea
Design Study
Collect Data
Store Data
Analyze Data
Write Report
http://osf.io
Let experts be experts.
OSF can integrate rather than append expertise to the research workflow.
Preservation
SHARE can engage local experts to curate descriptions of the research workflow.
Diagram labels: Providers, Gather, API, Consumers
SHARE
● Give this increased audience curation tools and APIs
● Give them incentives via the virtuous cycle of open
○ Make it in the consumer’s best interest to contribute
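The gather, curate, and consume cycle described on this slide can be pictured with a small in-memory sketch. This is not SHARE's code or its API: the `Record` type, the provider names, and the `gather`, `curate`, and `search` functions are all hypothetical, chosen only to show how a consumer's curation improves what every other consumer finds.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Record:
    """One normalized description of a research output (hypothetical shape)."""
    provider: str
    title: str
    tags: set[str] = field(default_factory=set)


def gather(providers: dict[str, list[str]]) -> list[Record]:
    """Collect raw titles from every provider into one normalized dataset."""
    return [
        Record(provider=name, title=title.strip())
        for name, titles in providers.items()
        for title in titles
    ]


def curate(records: list[Record], title: str, tag: str) -> None:
    """A consumer contributes back by tagging a record it cares about."""
    for record in records:
        if record.title == title:
            record.tags.add(tag)


def search(records: list[Record], tag: str) -> list[str]:
    """Any consumer benefits from curation done by others."""
    return [record.title for record in records if tag in record.tags]


# Hypothetical providers and titles, for illustration only.
providers = {
    "repo_a": [" Preprint on family structure "],
    "repo_b": ["Survey dataset"],
}
dataset = gather(providers)
curate(dataset, "Survey dataset", "sociology")
results = search(dataset, "sociology")
print(results)
```

The design choice worth noticing is that curation writes to the shared dataset rather than to a private copy, which is what makes contributing in the consumer's own interest.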
OSF Application Framework
• Workflow
• Authentication
• Permissions
• File Storage
• File Rendering
• Meta-database
• Persistence
• Integrations
• Search
• SHARE
osf.io
osf.io/preprints
osf.io/registries
journals
grants management
university systems
Modularity and abstraction support scaling.
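A rough sketch of why a shared framework helps scaling: each front end declares which services it needs, and a new front end costs nothing extra when the framework already provides them. The class names and the particular services assigned to each front end are assumptions for illustration, not a description of how the OSF is implemented.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Framework:
    """The shared layer: a set of reusable services."""
    services: frozenset[str]


@dataclass(frozen=True)
class FrontEnd:
    """An application built on the shared layer."""
    name: str
    framework: Framework
    needs: frozenset[str]

    def missing(self) -> frozenset[str]:
        """Services this front end needs that the framework lacks."""
        return self.needs - self.framework.services


# Service names come from the slide; which front end uses which is assumed.
framework = Framework(frozenset({
    "Workflow", "Authentication", "Permissions", "File Storage",
    "File Rendering", "Meta-database", "Persistence", "Integrations",
    "Search", "SHARE",
}))
preprints = FrontEnd("osf.io/preprints", framework,
                     frozenset({"Authentication", "File Storage", "File Rendering", "Search"}))
registries = FrontEnd("osf.io/registries", framework,
                      frozenset({"Authentication", "Permissions", "Persistence"}))

for app in (preprints, registries):
    print(app.name, "needs new services:", sorted(app.missing()))
```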
http://osf.io/preprints
▪ Open archive of the social sciences
Free, open-source, open-access
Soft launch July
New interface went up last week
▪ Created by sociologists and librarians
Administered at U. of Maryland
▪ Partners: Center for Open Science
Powered by SHARE
On the Open Science Framework
SocArxiv.org ● @socarxiv ● [email protected] ● Facebook.com/SocArXiv
Philip N. Cohen
U. of Maryland
[email protected]
@familyunequal
Pitch to hesitant social scientists: Reach
Pitch to hesitant social scientists: Time
> Working paper – when it’s ready to share
Yes, most journals will still let you submit it later
> Preprint – when it’s ready to publish
Yes, most journals permit pre-publication posting
> Post-print – when it’s behind a paywall
Yes, most journals permit post-publication posting
On the OSF | Preprints server
> Versions
Update your paper as it evolves
Persistent URL, citation, and optional DOI
> Analytics, social media sharing, linked IDs
> Add optional data and code
Public settings, collaboration
> Down the road
Overlay journals
Post-publication review
Get involved!
Post papers
Spread the word
Volunteer
Raise money / contribute
SHARE Curation Associates
Cynthia Hudson-Vitale
Washington University in St. Louis
@cynhudson
Develop digital curation and computational thinking skills to enhance local institutional repositories in a service-learning setting
Jennifer Phillips, National Center for Atmospheric Research (NCAR)
Jonathan Cain, University of Oregon
Theresa Polk, University of Texas at Austin
Dana Chandler, Tuskegee University
Fred Reiss, University of Oklahoma
Zach Coble, New York University
Wendy Robertson, University of Iowa
Shane Coleman, Virginia Tech
Mark Shelstad, Colorado State University
Deborah Cornell, College of William and Mary
Iyanna Sims, North Carolina A&T State University
Amanda Gooch, The George Washington University
Ashley Adair, University of Texas at Austin
Brianna Marshall, University of Wisconsin
Mary Alexander, University of Alabama
Kim Mears, Augusta University
Talea Anderson, Washington State University
Jeremy Myntti, University of Utah
Elizabeth Bedford, University of Washington
Lisa Palmer, University of Massachusetts Medical School
Lisa Stienbarger, University of Notre Dame
Joanne Paterson, Western University
Cunera Buys, Northwestern University
Julie Hardesty, Indiana University
Vicky Steeves, New York University
Matthew Harp, Arizona State University
Emily Stenberg, Washington University in St. Louis
Steven Holloway, James Madison University
Nicole Sump-Crethar, Oklahoma State University
Salwa Ismail, Georgetown University
Kelly Thompson, University of Minnesota
Sherry Lake, University of Virginia
Kathleen Luschek, University of Hawai’i at Mānoa
Dainan Skeem, Brigham Young University
Local curation enhancements and projects that provide benefits locally
Curation Track
First 6 months:
● Metadata review
● Gap analysis
● Digital preservation review
● Draft 3-3-3 plan
Upcoming:
● Implement 3-3-3 plan
Local enhancement activities
Project Track
Populating an OA IR using the SHARE data set
Members: Zach Coble, NYU; Sherry Lake, UVA; Joanne Paterson, Western University
https://osf.io/c3veb/
Graduate Student Profiles
Members: Cunera Buys, Northwestern University; Brianna Marshall, University of Wisconsin-Madison
https://osf.io/w9dm4/
Research Data Searching
Members: Talea Anderson, Washington State University; Elizabeth Bedford, University of Washington; Sherry Lake, UVA; Kelly Thompson, University of Minnesota
https://osf.io/hy3cq/
ORCID
Members: Jonathan Cain, University of Oregon; Steven Holloway, JMU; Salwa Ismail, Georgetown University; Victoria Steeves, NYU
Data Curation Network
Planning a network of expertise model for curating research data in academic libraries
The Data Curation Network project is supported by a grant from the ALFRED P. SLOAN FOUNDATION.
2016-2017
Rise of a data sharing culture
Researchers are increasingly required/incentivised to share data
● Funder data sharing mandates
● Journal data sharing policies
● Disciplinary practices → emphasis on transparency and reproducibility
Data repositories: it’s not enough to just keep the files!
Goal of data curation ⇒ Prepare and maintain research data in ways that make it findable, accessible, interoperable, and reusable (FAIR).
Data curation = metadata, documentation, access, preservation, and more...
Data curation activities
● Code review
● Contextualize
● Documentation
● Embargo
● File Format Transformations
● Persistent Identifier
● Quality Assurance
● Use Analytics
● Versioning
● Data Citation
● Deidentification
● File Audit
● File Inventory or Manifest
● File validation
● Metadata
● Metadata Brokerage
● Rights Management
● Risk Management
● Terms of Use
● Peer-review
● Technology Monitoring and Refresh
Challenge for institutional data curation services
How to scale data curation services across all disciplines?
Multiple data curation experts are needed to effectively curate the diverse data types an institution typically generates.
Data curation expertise needed:
- File format: GIS, spreadsheet/tabular, statistical/survey, software code, video/audio, images/3D, simulations...
- Discipline-specific: genomic sequence, chemical spectra, biological image...
- Frequency: Centers of excellence, departmental concentration
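One way to picture the scaling problem is as a routing question: given a dataset of some type, is there anyone with matching expertise? The sketch below pools curators from several institutions and looks up matches. The curator labels, dataset names, and function names are hypothetical placeholders; only the data types are taken from the slide.

```python
from __future__ import annotations

# Hypothetical pool of curators across a network, keyed by expertise.
curators: dict[str, set[str]] = {
    "curator_at_institution_a": {"GIS", "spreadsheet/tabular"},
    "curator_at_institution_b": {"statistical/survey", "software code"},
    "curator_at_institution_c": {"genomic sequence", "images/3D"},
}


def find_curators(data_type: str, pool: dict[str, set[str]]) -> list[str]:
    """Return every curator in the pool whose expertise covers the data type."""
    return sorted(name for name, skills in pool.items() if data_type in skills)


def route(datasets: dict[str, str], pool: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each dataset to the curators able to handle its data type."""
    return {name: find_curators(data_type, pool) for name, data_type in datasets.items()}


# Hypothetical incoming datasets and their types.
datasets = {
    "county_boundaries": "GIS",
    "rna_reads": "genomic sequence",
    "lecture_recordings": "video/audio",
}
assignments = route(datasets, curators)
for dataset, matches in assignments.items():
    print(dataset, "->", matches or "no matching expertise in the pool")
```

An empty match list is the interesting case: it marks a data type that no single institution in the pool can cover, which is the gap a larger network is meant to close.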
The Data Curation Network will enable academic institutions to better support researchers who face a growing number of requirements to ethically share their research data.
http://z.umn.edu/datacuration
Kirchner, Joy, Jose Diaz, Geneva Henry, Susan Fliss, John Culshaw, Heather Gendron, and Jon E. Cawthorne. “The Center of Excellence Model for Information Services.” Council on Library and Information Resources (CLIR), February 2015.
Our Vision for the Next 3-5 Years
1. Develop standards-driven data curation techniques for all types of repository workflows and infrastructure.
2. Expand into a sustainable entity that grows beyond our initial six partner institutions.
3. Show that datasets curated by the Data Curation Network advance research and education with measurably greater reuse value than non-curated data.
4. Build an innovative community that enriches capacities for data curation writ large.
Data Curation Network Partners
Planning the Data Curation Network
(Current) Planning phase, supported by the Alfred P. Sloan Foundation, to:
● Develop a Data Curation Network ‘model of expertise’ for data curation staff that includes the projected staffing, costs, skill sets, and demand necessary for implementation.
(Future) Pilot phase will:
● Test the model across our six institutions
● Plan for how to grow and sustain the Network
Draft Model for the Data Curation Network
Our Planning Phase activity to date
✓ Summer → Assessed infrastructure/policy/workflow differences and monitored the demand across institutions. Baseline report.
● Just completed Oct/Nov 2016 → Seek input from researchers to better understand how data curation services fit into their research workflow (focus groups).
● Jan 2017 → ARL Spec Kit survey on library data curation activities.
● Spring 2017 → Develop financial/governance models. Share our draft Data Curation Network model with stakeholders for feedback.
Researcher Engagements
Results: Researcher Engagements
Goal: Identify the value/importance placed on 40+ data curation activities in order to identify gaps: important curation activities that are either not happening or not happening well.
Completed engagements, analysis underway (~90 participants at 6 institutions)
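The gap idea in this goal can be expressed as a simple filter: an activity is a gap when it is rated important but is not happening well. The sketch below uses made-up placeholder numbers on an assumed 1 to 5 scale, with assumed thresholds; these are not results from the researcher engagements, whose analysis was still underway.

```python
from __future__ import annotations

# Placeholder ratings only: activity -> (importance, how well it is happening).
# Activity names come from the slides; every number here is invented.
ratings: dict[str, tuple[float, float]] = {
    "Documentation": (4.6, 2.1),
    "Persistent Identifier": (4.2, 4.4),
    "Code review": (2.0, 1.5),
}


def find_gaps(
    ratings: dict[str, tuple[float, float]],
    important: float = 4.0,   # assumed cut-off for "important"
    satisfied: float = 3.0,   # assumed cut-off for "happening well"
) -> list[str]:
    """Return activities rated important but not happening well."""
    return sorted(
        activity
        for activity, (importance, satisfaction) in ratings.items()
        if importance >= important and satisfaction < satisfied
    )


gaps = find_gaps(ratings)
print(gaps)
```

Note that a low-importance activity that is also going badly is deliberately excluded: the goal as stated is about gaps in important activities, not every shortfall.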
Results: Repository Curation Workflows
Results: Repository Technologies
Results: Repository Policies
Thanks!
Web: https://sites.google.com/site/DataCurationNetwork
Twitter #DataCurationNetwork
Claire Stewart
University of Minnesota Libraries