Data Stewardship and the Decentralized Web DANIELLE ROBINSON, PhD Co-Executive Director at Code for Science & Society @daniellecrobins @codeforsociety

Post on 06-Jun-2020


Page 1: Data Stewardship and the Decentralized WebData Stewardship and the Decentralized Web DANIELLE ROBINSON, PhD ... - Knowledge of decentralized computing, data collection & management

Data Stewardship and the Decentralized Web

DANIELLE ROBINSON, PhD

Co-Executive Director at Code for Science & Society

@daniellecrobins @codeforsociety

Page 2:

Code for Science & Society

Supporting open source in the public interest

Page 3:

Code for Science & Society

Civic tech + Scholarly research + New media + Open source + Equity, support, inclusion

= CS&S community

Page 4:

Sharing experiences

Bringing:

- Knowledge of decentralized computing, data collection & management

Seeking:

- Better understanding of needs, challenges of your community

Page 5:

What is the future of data stewardship?

- Bringing together leaders, stakeholders

- Design a cooperative data preservation network

- Push for ‘FAIR’ and save libraries money

Page 7:

1. Data on the web

2. A new model of data stewardship

3. Prototyping decentralized preservation

4. Reimagine data on the web

Page 8:

1. Data on the web

2. A new model of data stewardship

3. Prototyping decentralized preservation

4. Reimagine data on the web

Page 9:

Across domains, data live online

Early work of a writer

Government data

Newspaper archives

Your family photos

Scientific data

Page 10:

Data transparency: Inconsistent practices across domains

Page 11:

Many data publishing options

https://www.ohsu.edu/xd/education/library/data/share-and-archive/index.cfm

Page 12:

Siloed info, centralized gatekeepers control access

Doc Searls

Page 13:

https://imgflip.com/memegenerator/Picard-Wtf

Page 14:

http://som.csudh.edu/fac/lpress/history/arpamaps/

Distributed beginnings

Page 16:

Web centralization

Image courtesy of Beaker Browser

Page 17:

Web centralization

It’s easier to manage and monetize a silo

Image courtesy of Beaker Browser

Page 18:

“We embed values into our technology whether we are aware of it or not”

- Stephen Whitmore (@noffle)

Digital Democracy

See also the work of Safiya Noble

https://blog.datproject.org/2018/03/05/css-community-call-03-2018/

Page 19:

In the centralized web

We trust the server to locate, not change, objects

Silos are the natural state

Data may be in multiple silos

Page 20:

Today’s web relies upon

URLs to identify location of objects

Ability to change information without changing location

Aggregating content for discovery

Page 21:

Today’s web lacks

Persistent identifiers

Transparent change log

Links between silos

Page 22:

“The internet is a terribly unstable way to keep information available”

- Laurie Allen

Penn Libraries' Assistant Director for Digital Scholarship

Page 23:

“Federal data ≅ website”

https://www1.ncdc.noaa.gov/pub/data/

Page 24:

Why are federal data ≅ webpages?

To find an object online:

1. Discover the link

2. Link still works

3. Trust the info at the link

https://www.slideshare.net/shefw/save-the-data-the-role-of-librarians-in-datarescue-collaborations

Page 25:

Why are federal data ≅ webpages?

https://www1.ncdc.noaa.gov/pub/data/annualreports

https://www.slideshare.net/shefw/save-the-data-the-role-of-librarians-in-datarescue-collaborations

Page 26:

M. Klein, several papers and talks, links at end

Link rot: When links fail

Content drift: When referenced content is changed

Link rot + content drift = Reference rot
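
The distinction on this slide can be sketched in a few lines of Python. This is only an illustration, assuming a hash of the content was recorded when the link was first cited; the function names and sample bytes are hypothetical and do not come from the talk or from M. Klein's work.

```python
import hashlib
from typing import Optional


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def classify_reference(cited_digest: str, fetched: Optional[bytes]) -> str:
    """Classify a cited link given what it returns today.

    cited_digest: hash of the content recorded when the link was cited
                  (an assumption; most citations record no such hash).
    fetched: bytes returned now, or None if the link no longer resolves.
    """
    if fetched is None:
        return "link rot"        # the link fails
    if sha256_hex(fetched) != cited_digest:
        return "content drift"   # the link works, but the content changed
    return "intact"


# Hypothetical sample content standing in for a cited web page.
original = b"annual report, 2016 edition"
digest = sha256_hex(original)

print(classify_reference(digest, original))
print(classify_reference(digest, b"annual report, revised"))
print(classify_reference(digest, None))
```

Either failure counts as reference rot: the reader can no longer retrieve what the author cited.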

Page 27:

The Internet is broken

and we are using it to access and distribute all of human knowledge

¯\_(ツ)_/¯

Page 28:

its all about Rock (:

The web is being reimagined

Page 29:

What’s important to you?

romana klee

Page 30:

1. Data on the web

2. A new model of data stewardship

3. Prototyping decentralized preservation

4. Reimagine data on the web

Page 31:

Preservation starts here

Page 32:

“Sharing research data is not well understood, incentivized, or accessible”

Daniella Lowenberg, Research Data Specialist

Product Manager of @uc3dash

California Digital Library

https://medium.com/@UC3CDL/we-are-talking-loudly-and-no-one-is-listening-a108248693f7 / csv

screenshot from https://peerj.com/preprints/2588/

and preserving

^

Page 33:

Preservation requires custody

seagen

Page 34:

Centralized model requires custody to provide access

Image courtesy of Beaker Browser

Page 35:

Web accessible objects

Via Agency

Page 36:

Is custody required?

#WOCinTech Chat

Page 37:

“Preservation in place… Bring preservation services to the content”

- Stephen Abrams

Preservation without Possession

California Digital Library

https://figshare.com/articles/Preservation_without_possession_Content-addressable_identifiers_for_post-custodial_preservation/5844369
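
The cited title mentions content-addressable identifiers. As a minimal sketch of that general idea (not the scheme described in the Abrams presentation, whose details are not on this slide), an identifier can be derived from the bytes themselves, so any holder of a copy can check it without the original custodian. The `sha256:` prefix and the function names are assumptions made for illustration.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an identifier from the content itself (hypothetical 'sha256:' scheme)."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def verify(identifier: str, data: bytes) -> bool:
    """Check that a copy, wherever it is stored, matches the identifier."""
    return content_id(data) == identifier


# Hypothetical dataset bytes; the values are made up for the example.
dataset = b"site,temp_c\nA,11.2\nB,9.8\n"
cid = content_id(dataset)

copy_elsewhere = dataset                             # an exact copy held by another party
copy_altered = dataset.replace(b"9.8", b"9.9")       # a copy that has silently changed

print(verify(cid, copy_elsewhere))
print(verify(cid, copy_altered))
```

Because the identifier depends only on the content, it stays valid wherever the copy lives, which is what lets preservation services come to the content instead of requiring custody.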

Page 38:

Sharing data and costs

Cooperative of trusted entities

Image courtesy of Beaker Browser

Page 39:

SangyaPundir / www.force11.org/group/fairgroup/fairprinciples

Page 40:

www.force11.org/group/fairgroup/fairprinciples

Leverage existing infrastructure

Page 41:

Peter Miller

Visions are nice!

Page 42:

Now let’s get real

vladeb

Page 43:

1. Data on the web

2. A new model of data stewardship

3. Prototyping decentralized preservation

4. Reimagine data on the web

Page 44:

Multiple decentralized approaches

BTC Keychain / Danilo / http://www.ala.org/tools/future/trends/blockchain /

https://gist.github.com/mafintosh/bd9e6d350ebf02441c9707c5f799d05b

Blockchain

Peer-to-peer

Page 45:

Data stored at central location, accessed by independent users

Image courtesy of Beaker Browser

Centralized “hub and spoke” model

Page 46:

Data persistently identified, networked ability to scale

Decentralized models

Image courtesy of Beaker Browser

Page 47:

Peer-to-peer public technology

https://github.com/mafintosh/bws-2017

Page 48:

What’s Dat?

Persistent identifiers

+

Network of peers

https://github.com/datproject/docs/blob/master/papers/dat-paper.pdf

Page 49:

Dat + scholarly data =

- Automate preservation, versioning

- Find data across storage locations

- Spread cost burden across network

- Foundational links between silos
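
To make "automate versioning" concrete, here is a toy append-only version log in which each entry records the hash of the previous entry, so any rewrite of history is detectable. It is a sketch of the general idea only and is not Dat's actual data structure or API; the class name, field names, and sample data are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import List


def _hash_entry(body: dict) -> str:
    """Hash an entry body deterministically (keys sorted before hashing)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


@dataclass
class VersionLog:
    """Toy append-only log: each entry points at the hash of the one before it."""
    entries: List[dict] = field(default_factory=list)

    def append(self, content: bytes, note: str) -> dict:
        prev = self.entries[-1]["entry_hash"] if self.entries else ""
        body = {
            "version": len(self.entries) + 1,
            "content_hash": hashlib.sha256(content).hexdigest(),
            "note": note,
            "prev": prev,
        }
        entry = dict(body, entry_hash=_hash_entry(body))
        self.entries.append(entry)
        return entry

    def is_consistent(self) -> bool:
        """Recompute every hash and check that the chain of 'prev' links is unbroken."""
        prev = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if entry["prev"] != prev or entry["entry_hash"] != _hash_entry(body):
                return False
            prev = entry["entry_hash"]
        return True


# Hypothetical deposits of two versions of the same dataset.
log = VersionLog()
log.append(b"annotation table, draft 1", "initial deposit")
log.append(b"annotation table, draft 2", "corrected gene names")
print(log.is_consistent())
```

Editing an earlier entry after the fact changes its hash, so `is_consistent()` returns False; this is the kind of transparent change log the earlier slides say today's web lacks.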

Page 50:

俍宏葉

Reimagine data preservation

Page 51:

It’s all about TRUST

Image courtesy of Beaker Browser

Page 52:

… and I trust LIBRARIES

Image courtesy of Beaker Browser

Page 53:

Eran Sandler

Building a prototype

Page 54:

Dr. Dannise V. Ruiz-Ramos describes sea star genome annotation pipeline

Start with data creation

Page 55:

Dat in the Lab lessons:

Leverage existing workflows

Automate data versioning, preservation

Link researchers to library

Now linking libraries to each other

https://blog.datproject.org/tag/science/

Page 56:

Prototype: CDL - IA - SDSC

CDL’s DASH corpus (<5 TB)

Copied to IA and SDSC

Deal with technical hurdles (S3)

Next: Monitoring dynamic information
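
One way to picture monitoring copies held at several institutions is a periodic fixity check: hash each replica and compare it with the digest recorded at deposit. The sketch below is an assumption about what such a check could look like, not a description of the prototype; the site names stand in for partner institutions, and the "damaged" copy is simulated.

```python
import hashlib
from typing import Dict, List


def digest(data: bytes) -> str:
    """SHA-256 digest of a replica's bytes, as hex."""
    return hashlib.sha256(data).hexdigest()


def fixity_report(expected: str, replicas: Dict[str, bytes]) -> Dict[str, bool]:
    """Map each site to True if its copy still matches the deposit-time digest."""
    return {site: digest(data) == expected for site, data in replicas.items()}


def sites_needing_repair(report: Dict[str, bool]) -> List[str]:
    """List sites whose copy no longer matches, in alphabetical order."""
    return sorted(site for site, ok in report.items() if not ok)


# Hypothetical deposit and three hypothetical replica sites.
original = b"dataset bytes as deposited"
expected = digest(original)
replicas = {
    "site_a": original,
    "site_b": original,
    "site_c": b"dataset bytes as dep",  # simulated truncated copy
}

report = fixity_report(expected, replicas)
print(report)
print(sites_needing_repair(report))
```

A site flagged here could re-fetch the object from any peer whose copy still verifies, which is one way a cooperative network can share the work of preservation.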

Page 57:

Every institution contributes to the network:

- Storage, bandwidth

- Metadata on their collection

- Commitment to preserve their collection

Page 58:

Any user can access from the network:

- Information on library collections

- History of objects

- Whole or partial data sets

Page 59:

1. Data on the web

2. A new model of data stewardship

3. Prototyping decentralized preservation

4. Reimagine data on the web

Page 60:

What’s important to you?

www.liveoncelivewild.com

Page 61:

Discussion:

● What are the data types that your organization is responsible for?

● How are those data created, stored, used? When do they come to you?

● Who interacts with data? How do they interact with it?

● How are equity, justice addressed (or not) in data stewardship plans?

● What are your concerns around long-term preservation of data?

Page 62:

https://library.auraria.edu/d2pproject/about

The Data to Policy Project (D2P) is an initiative to engage students with their community’s needs through course-based assignments, which culminate into data-driven policy proposals to local governments and agencies.

Cool project alert!

Page 63:

Thank you to the Western States Government Information Conference Planning Committee

DANIELLE ROBINSON, PhD

Co-Executive Director at Code for Science & Society

@daniellecrobins @codeforsociety
