![Page 1: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/1.jpg)
Data Publishing Workflows: Strategies and Standards
Sünje Dallmeier-Tiessen (CERN)
for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows Group
![Page 2: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/2.jpg)
Outline
• Policy pressure
• Solutions across disciplines
• Standards
   • Persistent Identifier
   • Data Citation
   • Quality Assurance, Peer Review
   • Licensing
• Examples in High-Energy Physics (CERN)
   • INSPIRE
   • Analysis Preservation Framework
   • Open Data Portal
![Page 3: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/3.jpg)
Research data is a first class citizen
Royal Society, 1665 and 2012
![Page 4: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/4.jpg)
Towards Open Science
Open Source
Open Access
Open Data & Code
Open Science
We are here now
Slide provided by Patricia Herterich, CERN
![Page 5: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/5.jpg)
Policy pressure: STFC example
https://www.stfc.ac.uk/Resources/pdf/STFC_Scientific_Data_Policy.pdf
![Page 6: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/6.jpg)
Policy pressure: DOE example
DMPs should provide a plan for making all research data displayed in publications resulting from the proposed research open, machine-readable, and digitally accessible to the public at the time of publication.
…the underlying digital research data used to generate the displayed data should be made as accessible as possible to the public in accordance with the principles stated above.
http://science.energy.gov/funding-opportunities/digital-data-management/
![Page 7: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/7.jpg)
Expectations: PLOS Data Policy
www.plos.org
![Page 8: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/8.jpg)
Concerns across disciplines
Datasets are…
• Not shared or lost
• Difficult to discover and access
• Difficult to understand > context missing
Nature, 2009
![Page 9: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/9.jpg)
How this challenge is addressed
![Page 10: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/10.jpg)
Example: Dedicated Data Repositories
www.pangaea.de
![Page 11: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/11.jpg)
Preserving and promoting data reuse
www.pangaea.de
![Page 12: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/12.jpg)
International sharing and curation of data
www.icgc.org
![Page 13: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/13.jpg)
ICGC – Data Publication Timeline
Time limits for publication moratoriums:
All data shall become free of a publication moratorium when either the data is published by the ICGC member project or one year after a specified quantity of data (e.g. genome dataset from 100 tumors per project) has been released via the ICGC database or other public databases.
[…]
In all cases data shall be free of a publication moratorium two years after its initial release.
https://icgc.org/icgc/goals-structure-policies-guidelines/e3-publication-policy
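The time limits quoted above can be read as "the earliest of several dates". The following is a minimal sketch of that reading, not ICGC's own tooling: the function and parameter names are hypothetical, and the part of the policy elided by "[…]" on the slide is not modelled.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def moratorium_end(initial_release: date,
                   threshold_release: Optional[date] = None,
                   project_publication: Optional[date] = None) -> date:
    """Earliest date on which data is free of the publication moratorium.

    initial_release:     first release of the data (hard limit: two years later)
    threshold_release:   date the specified quantity of data was released, if reached
    project_publication: date the member project published the data, if it has
    """
    # In all cases the moratorium ends two years after the initial release.
    candidates = [add_years(initial_release, 2)]
    # One year after the specified quantity of data has been released.
    if threshold_release is not None:
        candidates.append(add_years(threshold_release, 1))
    # Or as soon as the member project publishes the data itself.
    if project_publication is not None:
        candidates.append(project_publication)
    return min(candidates)
```

For example, data first released on 15 January 2012 with no further events would, under this reading, be free of the moratorium on 15 January 2014.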
![Page 14: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/14.jpg)
Zenodo – Data Repository
www.zenodo.org
![Page 15: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/15.jpg)
How to find a data repository
www.re3data.org
![Page 16: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/16.jpg)
Example: A dedicated data journal
www.nature.com/sdata/
![Page 17: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/17.jpg)
F1000
http://f1000research.com/
![Page 18: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/18.jpg)
Connecting articles and data
Tagged GenBank entry (genetic sequence)
Slide provided by H. Koers, Elsevier. Article: doi: 10.1016/j.biortech.2010.03.063
![Page 19: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/19.jpg)
Towards Open Science
Open Source
Open Access
Open Data & Code
Open Science
We are here now
Slide provided by Patricia Herterich
![Page 20: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/20.jpg)
Publish (Citable) Software
![Page 21: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/21.jpg)
More and more examples
![Page 22: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/22.jpg)
Published Software Papers
http://openresearchsoftware.metajnl.com/
![Page 23: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/23.jpg)
STANDARDS
![Page 24: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/24.jpg)
Licensing
• Enable others to reuse your data and software
• Choose the licenses or public domain dedications accordingly
• As “open” as possible
Re-Use
• There are measures to demand citations to track reuse and the impact of your work
• If you re-use, cite the dataset yourself
![Page 25: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/25.jpg)
Digital Object Identifiers (DOI names) offer a solution
Most widely used identifier for scientific articles
Researchers, authors, publishers know how to use them
Put datasets on the same playing field as articles
Dataset: Yancheva et al. (2007). Analyses on sediment of Lake Maar. PANGAEA. doi:10.1594/PANGAEA.587840
URLs are not persistent
(e.g. Wren JD: URL decay in MEDLINE - a 4-year follow-up study. Bioinformatics. 2008 Jun 1;24(11):1381-5).
DOIs for datasets
Slides by courtesy of Dr. Jan Brase, DataCite
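Because a DOI name is resolved through the central doi.org resolver rather than pointing at a fixed location, the dataset citation above can be turned into a stable link. A minimal sketch, assuming a hypothetical helper name `doi_to_url` and only a rough sanity check on the DOI syntax:

```python
from urllib.parse import quote

# Common ways a DOI is written in citations and on web pages.
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Normalise a DOI name to a resolvable https://doi.org/ URL."""
    doi = doi.strip()
    for prefix in _PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    # Rough check only: DOI names start with "10." and contain a "/" separator.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI name: {doi!r}")
    # Percent-encode unusual characters but keep the "/" separator.
    return "https://doi.org/" + quote(doi, safe="/")


print(doi_to_url("doi:10.1594/PANGAEA.587840"))
# prints https://doi.org/10.1594/PANGAEA.587840
```

If the repository later moves the dataset, the publisher updates the DOI's target and the link above keeps working, which a plain URL cannot guarantee.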
![Page 26: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/26.jpg)
ORCID id
www.orcid.org
![Page 27: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/27.jpg)
Force11 - Data Citation Principles
Author, Publication Year, Dataset Title, Data Repository, Version, Unique Identifier
- should include a persistent method for identification that is machine actionable and globally unique
- should facilitate identification of, access to, and verification of the specific data that support a claim.
www.force11.org
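The citation elements listed above can be held in a small record and rendered in that order. This is an illustrative sketch only: the class name, field names, and punctuation are assumptions, not a format prescribed by Force11.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """One dataset citation with the elements named on the slide."""
    authors: str
    year: int
    title: str
    repository: str
    identifier: str                 # persistent, globally unique, e.g. a DOI
    version: Optional[str] = None   # omitted when the repository gives none

    def format(self) -> str:
        # Order: Author, Publication Year, Dataset Title, Data Repository,
        # Version, Unique Identifier.
        parts = [f"{self.authors} ({self.year}).",
                 f"{self.title}.",
                 f"{self.repository}."]
        if self.version:
            parts.append(f"Version {self.version}.")
        parts.append(self.identifier)
        return " ".join(parts)


citation = DataCitation(
    authors="Yancheva et al",
    year=2007,
    title="Analyses on sediment of Lake Maar",
    repository="PANGAEA",
    identifier="doi:10.1594/PANGAEA.587840",
)
print(citation.format())
# prints Yancheva et al (2007). Analyses on sediment of Lake Maar. PANGAEA. doi:10.1594/PANGAEA.587840
```

Keeping the identifier as its own field, rather than buried in free text, is what makes the citation machine actionable.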
![Page 28: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/28.jpg)
Data Citation in Practice
![Page 29: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/29.jpg)
Quality assurance for data: peer review
Products
• Data records in data repositories
• Data journals
• Data articles
Note: standalone vs. supporting materials
QA Workflows
• Standalone or integrated?
• Blind and invited peer review
• Open peer review
• Citable review reports
![Page 30: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/30.jpg)
How to publish your data
1. Decide which dataset should be preserved or which dataset might be of interest for others to study or reuse
2. Are there issues which restrict the publishing process, e.g. confidentiality for patient data?
3. Which data product?
   • Do I have enough materials for a dedicated data article?
   • Which journal or repository works for me?
4. Prepare the documentation/metadata
5. Publish and let the others know you did
6. Cite the dataset in the resulting papers
7. Track who used and cited your data
![Page 31: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/31.jpg)
HEP: High-Energy Physics
![Page 32: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/32.jpg)
Research data in HEP
![Page 33: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/33.jpg)
Research Data on INSPIRE: starting from the paper
![Page 34: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/34.jpg)
The underlying datasets (HEPdata)
![Page 35: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/35.jpg)
Data Citation (Tracking)
![Page 36: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/36.jpg)
Referenced Data
arXiv: 1311.1113
![Page 37: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/37.jpg)
Code snippets
![Page 38: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/38.jpg)
Code snippets
![Page 39: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/39.jpg)
… and who gets the credit for sharing data?
![Page 40: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/40.jpg)
Kyle’s profile on INSPIRE
![Page 41: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/41.jpg)
Using author IDs for attributing credit
![Page 42: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/42.jpg)
Excerpt from publication list on
![Page 43: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/43.jpg)
Excerpt from publication list on
Make data publications count - alongside your articles
![Page 44: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/44.jpg)
Focusing on reproducibility and reuse
Two important new tools
![Page 45: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/45.jpg)
Capturing the complexity: Analysis Preservation Framework
![Page 46: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/46.jpg)
Open it up: CERN Open Data Portal
![Page 47: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/47.jpg)
How to publish your data
1. Decide which dataset should be preserved or which dataset might be of interest for others to study or reuse
2. Are there issues which restrict the publishing process, e.g. confidentiality for patient data?
3. Which data product?
   • Do I have enough materials for a dedicated data article?
   • Which journal or repository works for me?
4. Prepare the documentation/metadata
5. Publish and let the others know you did
6. Cite the dataset in the resulting papers
7. Track who used and cited your data
![Page 48: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/48.jpg)
Conclusions
• Policy pressure nationally and globally: we need data publishing solutions
• Considerable advancements in many disciplines
We learn from best practices
• HEP with commitment to data preservation and open data releases
• First tools are available to support data preservation and data publishing
![Page 49: Data Publishing Workflows: Strategies and Standards Sünje Dallmeier-Tiessen (CERN) for many collaborators at CERN and in the RDA-WDS Data Publishing Workflows](https://reader035.vdocuments.site/reader035/viewer/2022062320/56649d975503460f94a80492/html5/thumbnails/49.jpg)
Towards Open Science
Open Source
Open Access
Open Data & Code
Open Science
We are here now
Slide provided by Patricia Herterich