
How to succeed in reproducible research without really trying

Geoffrey M. Oxberry

Lawrence Livermore National Laboratory
Computational Engineering Division
Energy Conversion and Storage

This work was performed under the auspices of the US Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Views expressed in this work are solely those of its author and do not reflect the views of Lawrence Livermore National Laboratory.

February 28, 2013

(LLNL-PRES-621574-DRAFT) Reproducibility without trying February 28, 2013 1 / 27


Computational science has culture problems

Lack of verification: Where’s the bug?

  - My method?
  - Its implementation?
  - Its dependencies?

Lack of transparency

  - No public code & data = tough for others to debug
  - Do results only show “good” case studies?

Efficiency

  - Costly to implement everything from scratch
  - Bad record-keeping makes research, revision, collaboration harder


We can fix these problems... we have the technology

Tools available to automate:

  - Record-keeping
  - Testing
  - Building paper

Services available for storing research outputs


Reproducible research practices: a solution

Reproducibility of work yields:

  - Verification: easier to find and fix bugs
  - Transparency: leads to increased citation count, broader impact
  - Efficiency: via de-duplication of effort

In this talk, I ignore restrictions due to:

  - Classified or sensitive material
  - Nondisclosure agreements
  - Software licensing issues

Partial reproducibility may still be possible despite restrictions


How to succeed in reproducible research without really trying

State of reproducible research to date

  - Defining “reproducible research”
  - Motivating reproducible research

Tools and services to do reproducible research at reasonable cost

  - How and where to host code & data
  - Automate verification
  - Where to host everything else, and getting credit for it

Challenges still exist

  - Objections, and overcoming them
  - Needed policies, tools, cultural changes


Reproducibility has a long history

Mathematical proof is one form of reproducibility

  - First: Greek mathematicians (ca. 400 BC)
  - Modern rigorous proof: 1800s

Notable experimental examples

  - Galileo (1620s) built several copies of his telescope
  - Pasteur added “Materials and Methods” sections to his journal articles [1]

Modern scientific movements

  - Structural and protein biology (1980s) [2]
  - Political science (1990s) [3]
  - Genomics and genetics (2000s) [2,4,5]
  - Statistics (2010s) [6]


Reproducible research = “post paper and all supporting materials”

Reproducible research has many definitions [7]

In this presentation, “reproducible research” means submitting at minimum:

  - the paper
  - all code & data to reproduce results under open source licenses [8]
  - README files describing code & data

Minimum standard chosen to minimize cost

Can be helpful to do more


Most computational science not reproducible currently

People do not post the code & data with their work [2,9,10]

Even reproducible research gurus have published papers without code & data

They recount those experiences as cautionary tales [1,7,11,12]


Lack of reproducibility causes problems

Typical anecdotes [1,7,9,11–13]:

  - Which version of code goes with paper?
  - Where’s the bug: method or implementation?
  - Easy to forget research set aside for months
  - Results can depend on “magic parameter settings”
  - New person in lab can’t repeat former grad student’s work

Reproducible research helps avoid these issues


Doing reproducible research has benefits

Reproducible research tends to be cited more [7,14]

In addition, reproducible research has the following anecdotal benefits [1,7,11,12]:

  - Enhanced knowledge transfer
  - Easier to resume projects after hiatus
  - Easier to train new researchers
  - Decreases time to revision
  - Attracts collaborators
  - Decreases debugging time


Reaping benefits of reproducible research reduces to habits and practices

Basic principles of reproducibility in computational science are like those in experimental sciences

Like experimentalists, computational scientists need to:

  - Keep good records: notebooks and version control!
  - Include code, data, proofs, derivations
  - Use tests as control experiments


Tools enable adopting these practices at reasonable cost

Automating habits with tools and services reduces their cognitive burden

  - Version control systems
  - Repository hosting web sites
  - Unit testing frameworks
  - Build systems
  - Figure, data, preprint archives

Examples, payoffs, and estimated costs (to learn basics) given


Version control systems track all code changes in repositories

Examples: Git, Mercurial, SVN, CVS, etc.

Payoffs

  - Eases collaboration
  - Can track changes in any file type, and who made them
  - Can revert file to any point in its tracked history

Costs

  - 2-3 days to learn; learn by using on everything you can
  - SVN, CVS require their own server (Git & Mercurial don’t)
  - Takes a long time to master (much like LaTeX, MATLAB)
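
One way version control can answer "which version of code goes with the paper?" is to stamp every result with the commit that produced it. The following is a minimal sketch, not part of the original talk. It assumes the project lives in a Git repository and that the `git` command is on the PATH; the function names `current_commit` and `stamp_results` are hypothetical.

```python
import json
import subprocess


def current_commit(repo_dir="."):
    """Return the Git commit hash for repo_dir, or "unknown" if unavailable."""
    try:
        completed = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # Git is not installed, or repo_dir does not exist.
        return "unknown"
    if completed.returncode != 0:
        # repo_dir is not inside a Git repository with at least one commit.
        return "unknown"
    return completed.stdout.strip()


def stamp_results(results, repo_dir="."):
    """Return a copy of results with the code version attached."""
    stamped = dict(results)
    stamped["code_version"] = current_commit(repo_dir)
    return stamped


if __name__ == "__main__":
    # Hypothetical result dictionary; replace with real outputs.
    print(json.dumps(stamp_results({"residual_norm": 1.0e-8}), indent=2))
```

Saving the stamped dictionary next to each figure or table makes it possible to check out the exact code that generated it later.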


Repository hosting sites available

Examples: GitHub, BitBucket, Google Code, SourceForge, etc.

Payoffs:

  - Eases collaboration
  - Free backup of project files
  - Publicity
  - Academics: free private space on BitBucket

Costs:

  - 1 hr to register, sync up files
  - Private space usually costs money
  - Space is limited (usually 1 GB or so)
  - Limited by terms of service


Unit testing frameworks ease debugging, verification

Examples: MATLAB xUnit, Python Nose, GoogleTest, etc.

Payoffs:

  - Automates verification
  - Easier to write tests
  - Reduce software development time costs; get papers faster [15]

Costs:

  - 1-2 days to work through examples
  - Each language has its own framework
  - Most frameworks use xUnit standard
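
To illustrate "tests as control experiments", here is a small sketch using Python's built-in `unittest` module, which follows the xUnit style. It is not from the original talk: the `trapezoid` routine and the test names are hypothetical stand-ins for a real numerical method. Each test checks the code against a case whose answer is known in advance.

```python
import math
import unittest


def trapezoid(f, a, b, n):
    """Approximate the integral of f over [a, b] using n equal panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total


class TrapezoidTest(unittest.TestCase):
    def test_exact_for_linear_function(self):
        # Control experiment: the rule is exact for straight lines.
        value = trapezoid(lambda x: 2.0 * x + 1.0, 0.0, 1.0, 4)
        self.assertAlmostEqual(value, 2.0)

    def test_converges_to_known_integral(self):
        # The integral of sin over [0, pi] is exactly 2.
        value = trapezoid(math.sin, 0.0, math.pi, 1000)
        self.assertAlmostEqual(value, 2.0, places=5)

    def test_error_shrinks_with_refinement(self):
        coarse = abs(trapezoid(math.sin, 0.0, math.pi, 10) - 2.0)
        fine = abs(trapezoid(math.sin, 0.0, math.pi, 20) - 2.0)
        # Second-order method: halving the panel width should cut the
        # error by roughly a factor of four.
        self.assertLess(fine, coarse / 3.0)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TrapezoidTest)
    return unittest.TextTestRunner(verbosity=2).run(suite)


if __name__ == "__main__":
    run_tests()
```

The third test is the kind of check that separates "bug in the method" from "bug in the implementation": if the error does not shrink at the expected rate, the implementation is suspect.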


Build systems automate running tests, generating results

Examples: GNU make, SCons, CMake, GNU autotools, etc.

Payoffs:

  - Build code, test, run, all in one command
  - Build presentations, papers from LaTeX (or other) source
  - Avoid mistake-prone long chains of commands

Costs:

  - 1-3 days to work through basic examples
  - Takes a long time to master
  - Tough to debug
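
The core idea behind tools like GNU make can be sketched in a few lines: a target is rebuilt only when it is missing or older than the files it depends on. The Python below is an illustrative toy under that assumption, not a replacement for a real build system; `is_out_of_date`, `build`, and the file names are hypothetical.

```python
import os
import tempfile


def is_out_of_date(target, sources):
    """Return True if target is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_time for src in sources)


def build(target, sources, action):
    """Run action() only when target needs rebuilding; report whether it ran."""
    if is_out_of_date(target, sources):
        action()
        return True
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        data = os.path.join(workdir, "results.csv")
        figure = os.path.join(workdir, "figure.txt")
        with open(data, "w") as handle:
            handle.write("x,y\n0,1\n")

        def make_figure():
            # Stand-in for a plotting or LaTeX step.
            with open(data) as src, open(figure, "w") as dst:
                dst.write("plot of: " + src.read())

        print(build(figure, [data], make_figure))  # rebuilds: figure is missing
        print(build(figure, [data], make_figure))  # skipped: figure is up to date
```

Chaining such rules (data, then figures, then paper) is what lets one command regenerate everything without a long, mistake-prone sequence of manual steps.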


Services host and track pre-prints, data, etc.

Examples:

  - Pre-prints: arXiv, Optimization Online, FigShare, etc.
  - Data, Figures, Presentations: FigShare, DataDryad, ORCID, etc.

Payoffs:

  - Free space for hosting
  - Assignment of persistent DOIs
  - Tracking citation metrics (FigShare, ORCID)

Costs:

  - Registration: 1 hour
  - Sometimes, license restrictions placed on posted material
  - Limited or no private storage space


Tools reduce costs of reproducible research

Recall that reproducibility of work yields:

  - Verification: easier to check and debug work
  - Transparency: increased citation count, broader impact
  - Efficiency: de-duplication of effort

These benefits map to tools as follows:

  - Verification: download hosted code, build it, run tests
  - Transparency: hosting sites store research record, tracked with FigShare & ORCID
  - Efficiency: releasing code promotes reuse, testing speeds debugging, versioning helps collaboration


Despite reducing costs, challenges remain

Tools make it easier to do reproducible research, but...

Reproducible research is uncommon, and resisted [16]

Policies requiring reproducibility have not been effective [17,18]

Technical challenges still exist (big data, supercomputing)


Reproducible research practices: a solution resisted

Most computational science research is not done reproducibly

  - Viewed as costing too much
  - Cost-benefit tradeoff considered unfavorable

Not doing research reproducibly shifts costs from research producers to research consumers

  - Consumers re-implement, then check
  - Computational science community reputation also suffers [19]
  - Contradicts the tradition that burden of proof is on producers

Unreproducible research is a false economy

  - Producer savings: time spent making research reproducible
  - Consumer costs: time spent making others’ research reproducible
  - Producer costs: citations, reputation, wasted time redoing work


Policies requiring reproducibility have not been effective

Some journals and funding agencies require some form of reproducibility

  - PLoS, Science, others require sharing of code and data
  - NIH, NSF, DOE require sharing of data

Despite policies, researchers still don’t share data or code [17,18]

... because policies aren’t enforced

Even if code and data are provided, research still may not be reproducible [17,20–22]; even so, that is better than no code at all

Better policies must align community and personal incentives [2]


Technical challenges also still exist

Reproducing big data and supercomputing research is hard

  - Scarcity of resources (big storage, big supercomputers)
  - Best practices: cache intermediate data and results [23], cloud computing
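
As a rough illustration of caching intermediate data and results, the sketch below stores the output of an expensive step on disk, keyed by a hash of its parameters, so that a rerun with the same parameters reuses the stored result. It is an assumption-laden toy (JSON-serializable parameters, picklable results, a local cache directory); the names `cache_path` and `cached` are hypothetical.

```python
import hashlib
import json
import os
import pickle


def cache_path(cache_dir, name, params):
    """Build a file name from the step name and a hash of its parameters."""
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(encoded).hexdigest()[:16]
    return os.path.join(cache_dir, f"{name}-{digest}.pkl")


def cached(cache_dir, name, params, compute):
    """Return compute(**params), reusing a stored result when one exists."""
    path = cache_path(cache_dir, name, params)
    if os.path.exists(path):
        # Cache hit: skip the expensive computation entirely.
        with open(path, "rb") as handle:
            return pickle.load(handle)
    result = compute(**params)
    os.makedirs(cache_dir, exist_ok=True)
    with open(path, "wb") as handle:
        pickle.dump(result, handle)
    return result
```

A call such as `cached("cache", "simulate", {"n": 1000, "dt": 0.01}, simulate)` (with a user-supplied `simulate` function) would run the simulation once and read the stored result on later runs. Note that this sketch does not detect changes to the code of `compute` itself; pairing the cache key with a code version is one way to cover that.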

Making sure that source code works on other people’s computers is hard

  - Installing software is tedious and hard
  - To run someone’s code, need their whole development environment [1,7,8,11,12,23]
  - Best practices: use virtual machines, provisioning software, reproducibility software

Keeping detailed enough records – provenance – is hard

  - Provenance software: VisTrails, Madagascar, Sumatra, etc.
  - Electronic notebooks like Carl Boettiger’s tackle day-to-day record-keeping
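
For a sense of what a provenance record might contain, the sketch below writes the parameters, input-file checksums, and basic environment details of a run to a JSON file. It is a hand-rolled illustration, not how VisTrails, Madagascar, or Sumatra work; the function names and the choice of fields are assumptions.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone


def file_sha256(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(params, input_files):
    """Collect parameters, input checksums, and environment details."""
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": sys.version.split()[0],
        "platform": platform.platform(),
        "parameters": params,
        "inputs": {path: file_sha256(path) for path in input_files},
    }


def write_provenance(record, path):
    """Save the record as human-readable JSON next to the results."""
    with open(path, "w") as handle:
        json.dump(record, handle, indent=2, sort_keys=True)
```

Writing the "magic parameter settings" into a file like this on every run means they are recorded even when nobody remembers to note them down.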


Ultimately, reproducible research is about verification, transparency, and efficiency

Reproducible research is about sharing data & code with papers

Verification

  - Publicly posted code & data makes checking work easier
  - Use tools like version control, unit testing, file hosting

Transparency

  - Concerns and questions addressed by looking at code & data
  - GitHub and BitBucket list record of all changes to code, data, paper
  - FigShare used to share data publicly, track its citations

Efficiency

  - Others do not necessarily have to redo existing work
  - Get more citations per paper
  - Easier to remember what you did after a long break
  - Easier to build upon, collaborate, transfer knowledge


Available tools let you do reproducible research without really trying

Learning tools requires small investments to automate record-keeping

Good record-keeping protects investments in scholarship

Get more citations, save time later

Posting code much better than not


Acknowledgments

Victoria Stodden (posted a literature review)

ICERM (hosted reproducibility workshop)

Jaydeep Bardhan, Ahmed E. Ismail, Matthew Reuter, and friends (helpful discussions)

Matt McNenly, Dan Flowers, Russell Whitesides, and LLNL colleagues (helpful discussions)

Lawrence Livermore National Laboratory (funding via postdoc account)

Gurpreet Singh (program manager)


Colophon

Presentation written in Markdown

Compiled into LaTeX/Beamer using Pandoc

Custom LaTeX/Beamer template

Bibliography:

  - Content: BibTeX generated by Mendeley (manually curated)
  - Style: custom CSL hacked from existing styles (attribution in comments)

Source code hosting: github/goxberry/siam-cse-2013-presentation

FigShare DOI:

ORCID: orcid.org/0000-0001-7451-8097

Licensing: Reproducible Research Standard-like

  - Presentation: CC-BY-3.0 licensed
  - Source code: BSD (except for CSL file, which is CC-BY-SA-3.0)


References

[1] Buckheit, J.B. and Donoho, D.L. Wavelab and reproducible research. (1995).
[2] Morin, A. et al. Shining light into black boxes. Science. 336, (2012), 159–160.
[3] King, G. Replication, Replication. PS: Political Science and Politics. (1995).
[4] Schofield, P.N. et al. Post-publication sharing of data and tools. Nature. 461, (2009), 171–173.
[5] Birney, E. et al. Prepublication data sharing. Nature. 461, (2009), 168–70.
[6] Peng, R.D. Reproducible research and Biostatistics. Biostatistics (Oxford, England). 10, (2009), 405–408.
[7] Vandewalle, P. et al. Reproducible research in signal processing – What, why, and how. IEEE Signal Processing Magazine. 26, (2009), 37–47.
[8] Stodden, V. The Legal Framework for Reproducible Scientific Research: Licensing and Copyright. Computing in Science & Engineering. 11, (2009), 35–40.
[9] Merali, Z. Error: Why scientific programming does not compute. Nature. (2010), 6–8.
[10] Barnes, N. Publish your computer code: it is good enough. Nature. 467, (2010), 753.
[11] LeVeque, R.J. Python tools for reproducible research on hyperbolic problems. Computing in Science & Engineering. (2009), 19–27.
[12] LeVeque, R.J. Wave propagation software, computational science, and reproducible research. Proceedings of the International Congress of Mathematicians (Madrid, Spain, 2006), 1–27.
[13] Price, K. Anything You Can Do, I Can Do Better (No You Can’t)... Computer Vision, Graphics, and Image Processing. (1986), 387–391.
[14] Piwowar, H.A. et al. Sharing detailed research data is associated with increased citation rate. PLoS ONE. 2, (2007), 308.
[15] Wilson, G. et al. Best Practices for Scientific Computing. 1–6.
[16] Drummond, C. Reproducible Research: a Dissenting Opinion. (2012), 1–10.
[17] Ioannidis, J.P.A. et al. Repeatability of published microarray gene expression analyses. Nature Genetics. 41, (2009), 149–55.
[18] Savage, C.J. and Vickers, A.J. Empirical study of data sharing by authors publishing in PLoS journals. PLoS ONE. 4, (2009), 7078.
[19] Quirk, J. Computational Science “Same Old Silence, Same Old Mistakes” “Something More Is Needed...” Adaptive Mesh Refinement-Theory and Applications. (2005), 4–28.
[20] McCullough, B.D. Got Replicability? The Journal of Money, Credit and Banking Archive. Econ Journal Watch. 4, (2007), 326–337.
[21] McCullough, B.D. Do economics journal archives promote replicable research?. Economics Journal Archives. (2008).
[22] Manolescu, I. et al. Repeatability & Workability Evaluation of SIGMOD 2009. SIGMOD 2009 (2009), 2–4.
[23] Freire, J. et al. Computational reproducibility: state-of-the-art, challenges, and database research opportunities. SIGMOD 2012 (2012), 593–596.
