Research Information & Data Management (RIDM)

Post on 24-Jul-2020

Page 1: Data Management (RIDM) Research Information · Non-digital text (lab books, field notebooks, archival texts) Digital texts or digital copies of text Spreadsheets Audio, video Computer

Research Information & Data Management (RIDM)

Introductions:

Ellie Ransom: Research Services Coordinator, @CU_SEL, [email protected]

Amy Nurnberger: Research Data Manager, @DataAtCU, [email protected]

The Plan for Research Information & Data (RID):

➔ Identify it

➔ Manage it

➔ Document it

➔ Secure it

➔ Deal with it

Identify it:

Anthony Elia at http://hdl.handle.net/10022/AC:P:19828 | Oscilloscope by Voltcraft by Hannes Grobe, https://commons.wikimedia.org/wiki/File:Oscilloscope-voltcraft_hg.jpg, cc-by-3.0

Material or information "on which an argument, theory, test or hypothesis, or another research output is based."

Queensland University of Technology. Manual of Procedures and Policies. Section 2.8.3. http://www.mopp.qut.edu.au/D/D_02_08.jsp

Identify it - What is it?

➢ Non-digital text (lab books, field notebooks, archival texts)

➢ Digital texts or digital copies of text

➢ Spreadsheets

➢ Audio, video

➢ Computer Aided Design/CAD

➢ Statistics (SPSS, SAS)

➢ Databases

➢ Geographic Information Systems (GIS) and spatial data

➢ Digital copies of images

➢ Non-digital images

➢ Matlab files & Models

➢ Metadata & Paradata

➢ Data visualizations

➢ Computer code

➢ Standard operating procedures and protocols

➢ Protein or genetic sequences

➢ Artistic products

➢ Web files

➢ Curriculum materials

➢ Collection of digital objects acquired and generated during research

Adapted from: Georgia Tech–http://libguides.gatech.edu/content.php?pid=123776&sid=3067221

Identify it:

Who has it?

Identify it:

has it!

So, what are you going to do with it?

Manage it!

What is Research Information & Data Management (RIDM)?

exists, found, understand, trust, can use

– Rex Sanders

http://openarchaeologydata.metajnl.com/about/ , modified

YOU

Manage it when?

Plan to manage:

1. What information/data are you producing?

2. How are you documenting / describing it?

3. Where are you storing it?

4. When are you sharing it?

5. Who’s responsible?

What are you producing?

Manage it: Volume

Manage it: Velocity

Manage it: Variety / Interoperability

Manage it: Sensitive data

IRB

Classified

Restricted

Intellectual property, e.g. patent or copyright

Ownership

HIPAA

FERPA

PII

How are you documenting it?

Document it:

Take good notes!

???

00100100 00111111 01101010 10001000 10000101 10100011 00001000 11010011 00010011 00011001 10001010 00101110 00000011 01110000 01110011 01000100 10100100 00001001 00111000 00100010 00101001 10011111 00110001 11010000 00001000 00101110 11111010 10011000 11101100 01001110 01101100 10001001

Methods

• What was done

• How it was done

• Instrumentation/Equipment (RASCAL course)

• Limitations

Code

• All of the meanings

Description / Documentation

Labels (w/ units!)

• Codebook

• Data dictionary

• Laboratory notebook

Cd π

There are standards for documentation: http://www.dcc.ac.uk/resources/metadata-standards
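One way to act on the "labels (w/ units!)" and "data dictionary" points is to keep a small machine-readable dictionary next to the data. The sketch below is only an illustration: the variable names, labels, units, and codes are invented, and the four-column layout is an assumption rather than one of the standards linked above.

```python
import csv
import io

# Hypothetical variables, invented for illustration only.
DATA_DICTIONARY = [
    {"variable": "temp_c", "label": "Water temperature",
     "units": "degrees Celsius", "codes": ""},
    {"variable": "site", "label": "Sampling site",
     "units": "", "codes": "1=upstream; 2=downstream"},
]

FIELDS = ["variable", "label", "units", "codes"]


def dictionary_as_csv(rows):
    """Serialize data dictionary rows to CSV text, header first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(dictionary_as_csv(DATA_DICTIONARY))
```

Saving this text in a plain .csv file beside the data keeps the meaning of each column with the numbers it describes.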

Document it:

Speaking of standards…

Standards of scholarship & academia:

Document it:

Plagiarism?

Cite stuff!

Data citation

Table of citation elements

1. A: Authors & Contributors

2. T: Title

3. Ei: Electronic ID, e.g., DOI

4. Pd: Publication date

5. P: Publisher / Distributor

Get Credit • Give Credit

- Track reuse

- Measure impact

- Support reproducibility

https://www.force11.org/group/joint-declaration-data-citation-principles-final

CU-RDM@columbia.edu
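As a rough sketch, the five citation elements can be held in one record and turned into a citation string. The class name, the output order (authors, date, title, publisher, identifier), and every example value below are assumptions made for illustration; the slide lists the elements but does not prescribe a layout, and the DOI shown is a placeholder, not a real dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DataCitation:
    """The five elements from the table of citation elements."""
    authors: List[str]       # 1. Authors & Contributors
    title: str               # 2. Title
    identifier: str          # 3. Electronic ID, e.g. a DOI
    publication_date: str    # 4. Publication date
    publisher: str           # 5. Publisher / Distributor

    def format(self) -> str:
        """Join the elements into one line (the order is an assumption)."""
        names = "; ".join(self.authors)
        return (f"{names} ({self.publication_date}). {self.title}. "
                f"{self.publisher}. {self.identifier}")


if __name__ == "__main__":
    # Placeholder values only; this is not a real dataset.
    example = DataCitation(
        authors=["Researcher, A.", "Colleague, B."],
        title="Example survey responses",
        identifier="doi:10.0000/example",
        publication_date="2014",
        publisher="Example Repository",
    )
    print(example.format())
```

Whatever layout a journal or repository asks for, keeping the elements as separate fields makes it easy to reformat later.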

Document it: Citation managers

http://library.columbia.edu/research/citation-management.html

Intellectual Property & Ownership:

Who owns it:

?????

You

Your PI

Columbia University

Publisher

Funding Agency

?????

How do you store it?

Store it:

Security Storage

Secure it:

Secure it:

How will you protect your own or your participants’:

● Security

● Privacy / confidentiality

● Intellectual property

● Other rights?

Secure it: Backups

Where

● Here

● Near

● Far

When

● Regularly & frequently

● Schedule it

Test it

● File recovery

● Checksums
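The "Test it" step can be sketched with a checksum comparison: hash the original and the backup copy and confirm the digests match. This is a minimal illustration using Python's standard hashlib. The choice of SHA-256 and the function names are assumptions, not something the slides specify.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """True when the backup is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(backup)
```

Recording the digest when a file is first stored also gives a later check something to compare against, even if the original is no longer at hand.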

Secure it: Backups

What about [...], you ask?

Consider:

● Security

● Accessibility

● Cost

● Longevity

Read the fine print (what happens to your stuff when the service inevitably dies?)

Secure it: Security

Who needs to/should see the data when?

IRB, PII, FERPA

HIPAA, Restricted, Classified

Copyright, Patent potential, Licenses & IP

Consider:

● Restricting physical access

● Encryption

● De-identification

● Strong passwords (password manager)

Storing it: Some practicalities

● File formats

● File naming and organization

● Version control

Storing it: File formats (for interoperability & storage)

• Non-proprietary

• Open, documented standard

• Standard representation (e.g., ASCII, Unicode)

• Common, or commonly used by the research community (e.g. FITS, CIF)

• Unencrypted

• Uncompressed

Some commonly recognized formats meeting these criteria: ASCII [e.g., .csv, .txt], PDF [.pdf], FLAC, TIFF, JPEG2000 [.jp2], MPEG-4 [.mp4], XML [.xml, .odf, .rdf], R [.r]

✓ Not sure about the extension? Check https://www.nationalarchives.gov.uk/PRONOM/default.htm

http://www.data-archive.ac.uk/media/2894/managingsharing.pdf | http://www.digitalpreservation.gov/formats/index.shtml?PHPSESSID=c26c5e5101396d5f5ebacedb13cae6e3

Storing it: File naming

● Consistency: Pick a system, write it down, stick with it

● Identify necessary elements & consider their order

● Create brief, understandable names

● Date: YYYY-MM-DD or YYYYMMDD

● Version: v01, v02, … FINAL

● Try to stay away from spaces in filenames as well as the following characters: \ / : * ? " . < > | [ ] & $ (reserve . for file extensions)

● Recognize: At the file level, Firefly/browncoat/shiny.txt = Firefly/alliance/shiny.txt; the folder path is not part of the filename, so both files are just shiny.txt

Make a system. Share the system. Follow the system
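The naming rules above can be written down as a small helper so that every name is built the same way. This is a sketch under assumptions: the element order (project, description, date, version), the underscore separator, and the function name are illustrative choices, not a scheme the slides mandate.

```python
import re
from datetime import date

# Characters the slide says to avoid, plus any whitespace.
FORBIDDEN = re.compile(r'[\\/:*?"<>|\[\]&$\s]')


def build_name(project: str, description: str, day: date,
               version: int, ext: str) -> str:
    """Build project_description_YYYYMMDD_vNN.ext, rejecting risky characters."""
    parts = [project, description, day.strftime("%Y%m%d"), f"v{version:02d}"]
    for part in parts:
        # The period is reserved for the file extension.
        if FORBIDDEN.search(part) or "." in part:
            raise ValueError(f"avoid spaces and special characters: {part!r}")
    return "_".join(parts) + "." + ext.lstrip(".")


if __name__ == "__main__":
    # Prints firefly_interviews_20140706_v01.csv
    print(build_name("firefly", "interviews", date(2014, 7, 6), 1, "csv"))
```

Putting the project name in the filename itself also avoids the shiny.txt problem, since the name stays meaningful once the file leaves its folder.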

Storing it: File organization

● Consider organizing by logical chunks, e.g. project, class, grant

● What makes sense for the work you’re doing? How are you likely to look for related items?

● Identify important elements & how they should be nested

● Don’t make the system too deep

● Choose brief, understandable names

● Document it!

Make a system. Share the system. Follow the system
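One possible way to "make a system" and "document it" at the same time is to create the folder skeleton from a script that also writes a README describing it. The folder names and purposes below are hypothetical; pick whatever nesting matches how you actually look for related items.

```python
from pathlib import Path

# Hypothetical layout, one level below the project folder to keep it shallow.
LAYOUT = {
    "data_raw": "Unmodified files as collected",
    "data_processed": "Cleaned or derived files",
    "docs": "Codebook, data dictionary, protocols",
    "code": "Analysis scripts",
}


def create_project(root: Path, project: str) -> Path:
    """Create the project folders and a README that documents them."""
    base = root / project
    for folder in LAYOUT:
        (base / folder).mkdir(parents=True, exist_ok=True)
    lines = [f"{name}: {purpose}" for name, purpose in LAYOUT.items()]
    (base / "README.txt").write_text("\n".join(lines) + "\n", encoding="utf-8")
    return base
```

Sharing the script with collaborators gives everyone the same structure and the same written description of it.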

Storing it: Versioning

Did you change the file? Change the name!

Indicate versions

● filename_v001

● report_draft_r045

● report_final_r176

● presentation_20140706

Indicate responsibility

● Initials: file_v05_gh

● ID designation: file_v05_iam37

Make a system. Share the system. Follow the system
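"Did you change the file? Change the name!" can be automated for names that carry a _vNN tag. The helper below is a sketch that assumes the _v prefix and zero-padded numbers shown in the examples above; names using the r045 or date styles would need their own pattern.

```python
import re

# Matches the first "_v" followed by digits, e.g. "_v05" in "file_v05_gh".
VERSION = re.compile(r"_v(\d+)")


def next_version(filename: str) -> str:
    """Return the filename with its _vNN tag increased by one, padding kept."""
    match = VERSION.search(filename)
    if match is None:
        raise ValueError(f"no _vNN version tag in {filename!r}")
    digits = match.group(1)
    bumped = str(int(digits) + 1).zfill(len(digits))
    return filename[:match.start(1)] + bumped + filename[match.end(1):]


if __name__ == "__main__":
    print(next_version("file_v05_gh"))    # file_v06_gh
    print(next_version("filename_v001"))  # filename_v002
```

Saving under the new name before editing keeps the previous version untouched.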

Storing it: Columbia resources

● Lionmail drive

● Academic Commons

● The Libraries

● Departmental IT

Sharing it:

But my PI told me to do it this way?

Sharing it: File naming & organization

Collaborating on a complex project?

Make sure to share and agree on your naming, organizational, and versioning systems!

Make a system. Share the system. Follow the system

Research Information & Data (RID) sharing:

● What: Unique, reusable, relevant data

● With whom: Your future self! Your collaborators. Your research community. The world. (mind restrictions, etc.)

● When: During the project with collaborators. At pre-determined project stages. At project completion.

● How: Data Publication

● Frequently required by funding organizations

RID sharing – What?

Not all data should be archived, kept for the same length of time, or kept in the same way. Appraise your data on the following principles:

● Relevance to research mission

● Historical or scientific value

● Uniqueness

● Reliability / Integrity / Usability of data

● Replicability, or lack thereof

● Cost of management and preservation

● Adequate available documentation

● Satisfaction of requirements

RID sharing – With whom

YOU

http://openarchaeologydata.metajnl.com/about/ , modified

RID sharing – When (depends on whom)

all of the time!

RID sharing – How, Data Publishing

● Data publication in repositories

    ○ Institutional: http://academiccommons.columbia.edu/

    ○ Disciplinary, Directory: http://www.re3data.org/

    ○ Requirements

        ■ long-term storage and access to data

        ■ validation of data integrity [checksum]

        ■ a permanent resource locator (e.g., DOI, PURL, hdl) to make its data persistent, unique, and citable

● Data descriptors

● Data papers

● Supplementary material

Using it: Columbia resources

● Open Source Software

● Licensed Software

● Specialized Software

● High Performance Computing

Responsibility

Questions?

Contact us:

Ellie | Research Services Coordinator | [email protected]

Amy | Research Data Manager | [email protected]