DataONE (DataNet Observational Network for the Earth)
TRANSCRIPT
![Page 1: DataONE (DataNet Observational Network for the Earth)](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/1.jpg)
DataONE: Data Observational Network for Earth
Rebecca Koskela, William Michener, Dave Vieglais, Amber Budden
OSTP/NITRD Data Sharing and Metadata Curation: Obstacles and Strategies
May 29, 2013
![Page 2](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/2.jpg)
The metadata problem
![Page 3](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/3.jpg)
Metadata standards
[Bar chart; values in label order: DIF 12, DwC 21, DC 26, EML 95, FGDC 95, Open GIS 96, ISO 97, My Lab 266, none 676]
Scientists want to share data
• Use other researchers’ datasets if easily accessible: 84%
• Willing to share data across a broad group of researchers: 81%
• Appropriate to create new datasets from shared data: 76%
• Currently share all of their data: 6%
but don’t know how to and, if they do, want to get proper credit for doing so.
![Page 4](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/4.jpg)
Some solutions
• Make it easy to describe data
• Provide credit to the data/metadata author
• Citation
• Promote discoverability
• Mandates (ideally, funded!)
![Page 5](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/5.jpg)
Best Practices and Software Tools
![Page 6](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/6.jpg)
Making it easy to describe data
Intercept researchers where they already work
![Page 7](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/7.jpg)
Data & Metadata (EML)
![Page 8](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/8.jpg)
Credit: Dryad repository for journal data & metadata
![Page 9](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/9.jpg)
Promoting data citations via Dryad
Article
Wu D, Wu M, Halpern A, Rusch DB, Yooseph S, Frazier M, Venter JC, Eisen JA (2011) Stalking the fourth domain in metagenomic data: searching for, discovering, and interpreting novel, deep branches in phylogenetic trees of phylogenetic marker genes. PLoS ONE 6(3): e18011. doi:10.1371/journal.pone.0018011
Dryad data package
Wu D, Wu M, Halpern A, Rusch DB, Yooseph S, Frazier M, Venter JC, Eisen JA (2011) Data from: Stalking the fourth domain in metagenomic data: searching for, discovering, and interpreting novel, deep branches in phylogenetic trees of phylogenetic marker genes. Dryad Digital Repository. doi:10.5061/dryad.8384
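The article and its data package carry separate DOIs, so each can be cited and resolved on its own. A minimal Python sketch of that idea follows, assuming the public `https://doi.org/` resolver; the helper names `doi_url` and `doi_prefix` are illustrative and not part of Dryad or DataONE.

```python
from urllib.parse import quote

# Public DOI resolver; prepending it to a DOI gives a resolvable link.
DOI_RESOLVER = "https://doi.org/"


def doi_url(doi: str) -> str:
    """Return a resolver URL for a DOI, percent-encoding unsafe characters."""
    return DOI_RESOLVER + quote(doi.strip(), safe="/")


def doi_prefix(doi: str) -> str:
    """Return the registrant prefix, i.e. the part before the first slash."""
    return doi.strip().split("/", 1)[0]


# The two DOIs quoted on the slide: one for the article, one for the data.
ARTICLE_DOI = "10.1371/journal.pone.0018011"
DATA_DOI = "10.5061/dryad.8384"

if __name__ == "__main__":
    for label, doi in (("article", ARTICLE_DOI), ("data package", DATA_DOI)):
        print(label, doi_prefix(doi), doi_url(doi))
```

The differing prefixes show that the journal and the repository each register their own identifiers, which is what lets the data package collect citations independently of the article.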
![Page 10](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/10.jpg)
Promoting data discovery
Provide universal access to data about life on earth and the environment
1. Building community
2. Developing sustainable data discovery and interoperability solutions
3. Enabling science through tools and services
[Figure labels: Plan, Collect, Assure, Describe, Preserve, Discover, Integrate, Analyze]
![Page 11](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/11.jpg)
DataONE: Three major components for a flexible, scalable, sustainable network
Coordinating Nodes
• retain complete metadata catalog
• indexing for search
• network-wide services
• ensure content availability (preservation)
• replication services
![Page 12](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/12.jpg)
Member Nodes
• diverse institutions
• serve local community
• provide resources for managing their data
• retain copies of data
![Page 13](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/13.jpg)
![Page 14](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/14.jpg)
Investigator Toolkit
![Page 15](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/15.jpg)
DataONE: Enabling data discovery
[Diagram labels]
Member Nodes: ORNL DAAC, KNB, PISCO, SANParks, ESA, USGS CSAS, ONEShare, UC Merritt, LTER, CLO/AKN
Metadata formats supplied: FGDC, ISO, DIF, EML
Processing: Extract and Align Metadata, Augment Metadata, Internal Metadata Index, Search API
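The "Extract and Align Metadata" step above can be pictured as mapping format-specific fields onto one shared index that a search layer then queries. The Python sketch below assumes heavily simplified flat records; `FIELD_MAP`, `align`, and `search` are hypothetical names, and real EML or FGDC records are XML documents with far richer structure. It does not reproduce DataONE's actual indexing code.

```python
from typing import Dict, List

# Hypothetical, simplified mapping from per-format field names to the
# common index fields. Only two of the formats on the slide are shown.
FIELD_MAP: Dict[str, Dict[str, str]] = {
    "EML": {"title": "title", "creator": "author", "abstract": "abstract"},
    "FGDC": {"title": "title", "origin": "author", "abstract": "abstract"},
}


def align(record: Dict[str, str], fmt: str) -> Dict[str, str]:
    """Translate one source record into the common index schema."""
    mapping = FIELD_MAP[fmt]
    aligned = {common: record[src] for src, common in mapping.items() if src in record}
    aligned["format"] = fmt  # remember where the record came from
    return aligned


def search(index: List[Dict[str, str]], term: str) -> List[Dict[str, str]]:
    """Return index entries whose descriptive fields contain the term."""
    needle = term.lower()
    return [
        doc
        for doc in index
        if any(needle in value.lower() for key, value in doc.items() if key != "format")
    ]


if __name__ == "__main__":
    index = [
        align({"title": "Kelp forest surveys", "creator": "A. Diver"}, "EML"),
        align({"title": "Soil carbon grid", "origin": "B. Mapper"}, "FGDC"),
    ]
    print(search(index, "kelp"))
```

Once records share field names, a single query can span nodes that publish in different standards, which is the point of keeping one internal index behind the search API.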
![Page 16](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/16.jpg)
Information Center for the Environment (ICE), UC Davis
[Diagram labels: ICE Collectors; ICE Collects Water Data; ICE Users (agencies, citizens, faculty); DataONE Users; Investigator Toolkit]
![Page 17](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/17.jpg)
Some remaining challenges
• Semantic mediation
• Provenance
• Improving metadata quality over time
![Page 18](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/18.jpg)
Powerful Data Discovery via Semantics
[Diagram labels: query; term matching (TF-IDF); topic model; formal ontologies/controlled vocabularies]
Outcomes
• Enhanced models for knowledge representation in earth and environmental sciences
• Powerful model-driven search interface for data discovery
• Improved precision
• Improved recall
• Automated annotation
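Of the three matching approaches named on this slide, term matching with TF-IDF is the simplest to illustrate. The Python sketch below ranks short metadata descriptions against a query by cosine similarity of TF-IDF vectors. It assumes whitespace tokenization and a smoothed IDF, and the function names are illustrative; it is not the ranking DataONE uses.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def build_idf(docs: List[str]) -> Dict[str, float]:
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df: Counter = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(text: str, idf: Dict[str, float]) -> Dict[str, float]:
    """Term frequency times IDF; terms unseen in the corpus are dropped."""
    counts = Counter(tokenize(text))
    total = sum(counts.values()) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query: str, docs: List[str]) -> List[int]:
    """Indices of matching documents, best match first."""
    idf = build_idf(docs)
    q = tfidf_vector(query, idf)
    scored = [(cosine(q, tfidf_vector(d, idf)), i) for i, d in enumerate(docs)]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ordered if score > 0]
```

Plain term matching only finds records that share literal words with the query, which is the gap the topic models and ontologies on the slide are meant to close.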
![Page 19](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/19.jpg)
Provenance
Origin, context, derivation, ownership, history of (data) artifacts
• Record processing history, data lineage
• dependency graph
• W3C standard: PROV
• DataONE Extension: D-PROV
• Workflow provenance
• System agnostic!
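The dependency graph mentioned above can be sketched with two relations borrowed from W3C PROV vocabulary: an activity "used" entities, and an entity "was generated by" an activity. The Python class below is a minimal in-memory illustration under that assumption; `ProvGraph`, `record`, and `lineage` are hypothetical names, and the sketch does not implement the PROV or D-PROV data models.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


class ProvGraph:
    """Tiny provenance store: which activity made what, from which inputs."""

    def __init__(self) -> None:
        self.used: Dict[str, Set[str]] = defaultdict(set)  # activity -> inputs
        self.generated_by: Dict[str, str] = {}              # entity -> activity

    def record(self, activity: str, inputs: Iterable[str], outputs: Iterable[str]) -> None:
        """Log one processing step with its input and output entities."""
        self.used[activity].update(inputs)
        for entity in outputs:
            self.generated_by[entity] = activity

    def lineage(self, entity: str) -> Set[str]:
        """All upstream entities the given entity was derived from."""
        seen: Set[str] = set()
        stack = [entity]
        while stack:
            current = stack.pop()
            activity = self.generated_by.get(current)
            if activity is None:
                continue  # a source entity with no recorded producer
            for source in self.used[activity]:
                if source not in seen:
                    seen.add(source)
                    stack.append(source)
        return seen


if __name__ == "__main__":
    graph = ProvGraph()
    graph.record("clean", ["raw.csv"], ["clean.csv"])
    graph.record("plot", ["clean.csv", "params.json"], ["figure.png"])
    print(sorted(graph.lineage("figure.png")))
```

Because the store only records names of activities and entities, it stays independent of whichever workflow system produced them, which mirrors the "system agnostic" point on the slide.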
![Page 20](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/20.jpg)
Improving metadata quality for data reuse
[Chart: Information Content versus Time (< 1 yr); stages Planning, Collection, Assure, Documentation, Archive; level marked "Sufficient for Sharing and Reuse"]
![Page 21](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/21.jpg)
Mandates (ideally, funded!)
![Page 22](https://reader031.vdocuments.site/reader031/viewer/2022030600/5acd1d007f8b9a63398db61e/html5/thumbnails/22.jpg)
DataONE: Supporting scientific data preservation, discovery, and innovation