How to succeed in reproducible research without really trying

Geoffrey M. Oxberry
Lawrence Livermore National Laboratory
Computational Engineering Division
Energy Conversion and Storage
This work was performed under the auspices of the US Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. Views expressed in this work are solely those of its author and do not reflect the views of Lawrence Livermore National Laboratory.
February 28, 2013
(LLNL-PRES-621574-DRAFT) Reproducibility without trying February 28, 2013 1 / 27
Computational science has culture problems
Lack of verification: Where's the bug?
- My method?
- Its implementation?
- Its dependencies?
Lack of transparency
- No public code & data = tough for others to debug
- Do results only show "good" case studies?
Efficiency
- Costly to implement everything from scratch
- Bad record-keeping makes research, revision, and collaboration harder
We can fix these problems... we have the technology
Tools available to automate:
- Record-keeping
- Testing
- Building the paper
Services available for storing research outputs
Reproducible research practices: a solution
Reproducibility of work yields:
- Verification: easier to find and fix bugs
- Transparency: leads to increased citation count, broader impact
- Efficiency: via de-duplication of effort
In this talk, I ignore restrictions due to:
- Classified or sensitive material
- Nondisclosure agreements
- Software licensing issues
Partial reproducibility may still be possible despite restrictions
How to succeed in reproducible research without
really trying
State of reproducible research to date
- Defining "reproducible research"
- Motivating reproducible research
Tools and services to do reproducible research at reasonable cost
- How and where to host code & data
- Automating verification
- Where to host everything else, and getting credit for it
Challenges still exist
- Objections, and how to overcome them
- Needed policies, tools, and cultural changes
Reproducibility has a long history
Mathematical proof is one form of reproducibility
- First: Greek mathematicians (ca. 400 BC)
- Modern rigorous proof: 1800s
Notable experimental examples
- Galileo (1620s) built several copies of his telescope
- Pasteur added "Materials and Methods" sections to his journal articles [1]
Modern scientific movements
- Structural and protein biology (1980s) [2]
- Political science (1990s) [3]
- Genomics and genetics (2000s) [2,4,5]
- Statistics (2010s) [6]
Reproducible research = “post paper and all
supporting materials”
Reproducible research has many definitions [7]
In this presentation, "reproducible research" means submitting, at minimum:
- the paper
- all code & data to reproduce results, under open source licenses [8]
- README files describing the code & data
Minimum standard chosen to minimize cost
Can be helpful to do more
Most computational science not reproducible
currently
People do not post the code & data with their work [2,9,10]
Even reproducible research gurus have published papers without code & data
They recount those experiences as cautionary tales [1,7,11,12]
Lack of reproducibility causes problems
Typical anecdotes [1,7,9,11–13]:
- Which version of the code goes with the paper?
- Where's the bug: method or implementation?
- Easy to forget research set aside for months
- Results can depend on "magic parameter settings"
- A new person in the lab can't repeat a former grad student's work
Reproducible research helps avoid these issues
Doing reproducible research has benefits
Reproducible research tends to be cited more [7,14]
In addition, reproducible research has the following anecdotal benefits [1,7,11,12]:
- Enhanced knowledge transfer
- Easier to resume projects after a hiatus
- Easier to train new researchers
- Decreased time to revision
- Attracts collaborators
- Decreased debugging time
Reaping benefits of reproducible research reduces
to habits and practices
The basic principles of reproducibility in computational science are like those in the experimental sciences
Like experimentalists, computational scientists need to:
- Keep good records: notebooks and version control!
- Include code, data, proofs, derivations
- Use tests as control experiments
Tools enable adopting these practices at reasonable
cost
Automating habits with tools and services reduces their cognitive burden
- Version control systems
- Repository hosting web sites
- Unit testing frameworks
- Build systems
- Figure, data, and preprint archives
Examples, payoffs, and estimated costs (to learn the basics) are given for each
Version control systems track all code changes in
repositories
Examples: Git, Mercurial, SVN, CVS, etc.
Payoffs
- Eases collaboration
- Can track changes in any file type, and who made them
- Can revert a file to any point in its tracked history
Costs
- 2-3 days to learn; learn by using it on everything you can
- SVN and CVS require their own server (Git & Mercurial don't)
- Takes a long time to master (much like LaTeX or MATLAB)
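The core idea behind how Git tracks file versions can be sketched in a few lines: every tracked version of a file is named by a hash of its contents, so identical content always gets the same id and any change gets a new one. A minimal sketch in Python (the function name is mine; the "blob" object format it computes is Git's own):

```python
import hashlib

def git_blob_id(data: bytes) -> str:
    """Compute the SHA-1 object id Git assigns to file contents (a "blob").

    Git hashes a small header ("blob <length>\\0") followed by the raw
    contents; this id is how Git names every tracked version of a file.
    """
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()

# The id depends only on content, so the same bytes always hash the same,
# and any edit (even one byte) yields a different id.
print(git_blob_id(b"hello\n"))  # matches `echo hello | git hash-object --stdin`
```

Because version ids are content hashes, a repository can detect unchanged files for free and store history compactly; the full systems (Git, Mercurial) build commits and branches on top of this primitive.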
Repository hosting sites available
Examples: GitHub, BitBucket, Google Code, SourceForge, etc.
Payoffs:
- Eases collaboration
- Free backup of project files
- Publicity
- Academics: free private space on BitBucket
Costs:
- 1 hour to register and sync up files
- Private space usually costs money
- Space is limited (usually 1 GB or so)
- Limited by terms of service
Unit testing frameworks ease debugging,
verification
Examples: MATLAB xUnit, Python Nose, GoogleTest, etc.
Payoffs:
- Automates verification
- Makes it easier to write tests
- Reduces software development time costs; get papers out faster [15]
Costs:
- 1-2 days to work through examples
- Each language has its own framework
- Most frameworks follow the xUnit standard
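As a small illustration of the xUnit style these frameworks share, here is a sketch using Python's standard-library unittest (the quadrature routine and test names are invented for illustration, not from any cited work):

```python
import unittest

def trapezoid(f, a, b, n=100):
    """Composite trapezoid rule for the integral of f on [a, b] with n panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total

class TestTrapezoid(unittest.TestCase):
    def test_exact_for_linear(self):
        # The trapezoid rule is exact for linear integrands: control experiment.
        self.assertAlmostEqual(trapezoid(lambda x: 2 * x, 0.0, 1.0), 1.0)

    def test_error_shrinks_with_refinement(self):
        # For a quadratic, the error should shrink as the panel count grows.
        exact = 1.0 / 3.0
        coarse = abs(trapezoid(lambda x: x * x, 0.0, 1.0, n=10) - exact)
        fine = abs(trapezoid(lambda x: x * x, 0.0, 1.0, n=100) - exact)
        self.assertLess(fine, coarse)

if __name__ == "__main__":
    unittest.main(exit=False)
```

Each test is a method on a TestCase class, so the framework can discover and run all of them with one command (`python -m unittest`), which is what makes automated verification cheap.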
Build systems automate running tests, generating
results
Examples: GNU make, SCons, CMake, GNU autotools, etc.
Payoffs:
- Build code, test, and run, all in one command
- Build presentations and papers from LaTeX (or other) source
- Avoid mistake-prone long chains of commands
Costs:
- 1-3 days to work through basic examples
- Takes a long time to master
- Tough to debug
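The core rule every build system above applies — re-run a command only when its target is missing or older than its inputs — can be sketched in Python (the function names here are illustrative, not from make or any real tool):

```python
import os
import subprocess

def needs_rebuild(target, sources):
    """True if target is missing or older than any source file.

    This timestamp comparison is the heart of make-style builds:
    up-to-date targets are skipped, so only stale results get recomputed.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(src) > target_mtime for src in sources)

def build(target, sources, command):
    """Run command only when target is out of date; return True if it ran."""
    if needs_rebuild(target, sources):
        subprocess.run(command, check=True)
        return True
    return False
```

Real build systems layer a dependency graph on this check, so one command can rebuild code, rerun tests, regenerate figures, and recompile the paper, touching only what changed.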
Services host and track pre-prints, data, etc.
Examples:
- Pre-prints: arXiv, Optimization Online, FigShare, etc.
- Data, figures, presentations: FigShare, DataDryad, ORCID, etc.
Payoffs:
- Free space for hosting
- Assignment of persistent DOIs
- Tracking of citation metrics (FigShare, ORCID)
Costs:
- Registration: 1 hour
- Sometimes, license restrictions are placed on posted material
- Limited or no private storage space
Tools reduce costs of reproducible research
Recall that reproducibility of work yields:
- Verification: easier to check and debug work
- Transparency: increased citation count, broader impact
- Efficiency: de-duplication of effort
These benefits map to tools as follows:
- Verification: download hosted code, build it, run its tests
- Transparency: hosting sites store the research record, tracked with FigShare & ORCID
- Efficiency: releasing code promotes reuse, testing speeds debugging, versioning helps collaboration
Despite reducing costs, challenges remain
Tools make it easier to do reproducible research, but...
Reproducible research is uncommon, and resisted [16]
Policies requiring reproducibility have not been effective [17,18]
Technical challenges still exist (big data, supercomputing)
Reproducible research practices: a solution resisted
Most computational science research is not done reproducibly
- Viewed as costing too much
- Cost-benefit tradeoff considered unfavorable
It shifts costs from research producers to research consumers
- Consumers re-implement, then check
- The computational science community's reputation also suffers [19]
- Contradicts the tradition that the burden of proof is on producers
Unreproducible research is a false economy
- Producer savings: time spent making research reproducible
- Consumer costs: time spent making others' research reproducible
- Producer costs: citations, reputation, wasted time redoing work
Policies requiring reproducibility have not been
effective
Some journals and funding agencies require some form of reproducibility
- PLoS, Science, and others require sharing of code and data
- NIH, NSF, and DOE require sharing of data
Despite policies, researchers still don't share data or code [17,18]
...because policies aren't enforced
Even if code and data are provided, the research still may not be reproducible [17,20–22]; still, this is better than no code at all
Better policies must align community and personal incentives [2]
Technical challenges also still exist

Reproducing big data and supercomputing research is hard
- Scarcity of resources (big storage, big supercomputers)
- Best practices: cache intermediate data and results [23], cloud computing
Making sure that source code works on other people's computers is hard
- Installing software is tedious and hard
- To run someone's code, you need their whole development environment [1,7,8,11,12,23]
- Best practices: use virtual machines, provisioning software, reproducibility software
Keeping detailed enough records – provenance – is hard
- Provenance software: VisTrails, Madagascar, Sumatra, etc.
- Electronic notebooks like Carl Boettiger's tackle day-to-day record-keeping
Ultimately, reproducible research is about verification, transparency, and efficiency

Reproducible research is about sharing data & code with papers
Verification
- Publicly posted code & data make checking work easier
- Use tools like version control, unit testing, file hosting
Transparency
- Concerns and questions can be addressed by looking at the code & data
- GitHub and BitBucket keep a record of all changes to code, data, and paper
- FigShare can be used to share data publicly and track its citations
Efficiency
- Others do not necessarily have to redo existing work
- Get more citations per paper
- Easier to remember what you did after a long break
- Easier to build upon, collaborate, and transfer knowledge
Available tools let you do reproducible research
without really trying
Learning these tools requires small investments to automate record-keeping
Good record-keeping protects investments in scholarship
Get more citations, save time later
Posting code is much better than not posting it
Acknowledgments
Victoria Stodden (posted a literature review)
ICERM (hosted reproducibility workshop)
Jaydeep Bardhan, Ahmed E. Ismail, Matthew Reuter, and friends (helpful discussions)
Matt McNenly, Dan Flowers, Russell Whitesides, and LLNL colleagues (helpful discussions)
Lawrence Livermore National Laboratory (funding via postdocaccount)
Gurpreet Singh (program manager)
Colophon

Presentation written in Markdown
Compiled into LaTeX/Beamer using Pandoc
Custom LaTeX/Beamer template
Bibliography:
- Content: BibTeX generated by Mendeley (manually curated)
- Style: custom CSL hacked from existing styles (attribution in comments)
Source code hosting: github/goxberry/siam-cse-2013-presentation
FigShare DOI:
ORCID: orcid.org/0000-0001-7451-8097
Licensing: Reproducible Research Standard-like
- Presentation: CC-BY-3.0 licensed
- Source code: BSD (except for the CSL file, which is CC-BY-SA-3.0)
(LLNL-PRES-621574-DRAFT) Reproducibility without trying February 28, 2013 26 / 27
40 60 80 100 120
40
60
80
mm
References
[1] Buckheit, J.B. and Donoho, D.L. Wavelab and reproducible research. (1995).
[2] Morin, A. et al. Shining light into black boxes. Science. 336, (2012), 159–160.
[3] King, G. Replication, replication. PS: Political Science and Politics. (1995).
[4] Schofield, P.N. et al. Post-publication sharing of data and tools. Nature. 461, (2009), 171–173.
[5] Birney, E. et al. Prepublication data sharing. Nature. 461, (2009), 168–170.
[6] Peng, R.D. Reproducible research and Biostatistics. Biostatistics. 10, (2009), 405–408.
[7] Vandewalle, P. et al. Reproducible research in signal processing – what, why, and how. IEEE Signal Processing Magazine. 26, (2009), 37–47.
[8] Stodden, V. The legal framework for reproducible scientific research: licensing and copyright. Computing in Science & Engineering. 11, (2009), 35–40.
[9] Merali, Z. Error: why scientific programming does not compute. Nature. (2010), 6–8.
[10] Barnes, N. Publish your computer code: it is good enough. Nature. 467, (2010), 753.
[11] LeVeque, R.J. Python tools for reproducible research on hyperbolic problems. Computing in Science & Engineering. (2009), 19–27.
[12] LeVeque, R.J. Wave propagation software, computational science, and reproducible research. Proceedings of the International Congress of Mathematicians (Madrid, Spain, 2006), 1–27.
[13] Price, K. Anything you can do, I can do better (no you can't)... Computer Vision, Graphics, and Image Processing. (1986), 387–391.
[14] Piwowar, H.A. et al. Sharing detailed research data is associated with increased citation rate. PLoS ONE. 2, (2007), 308.
[15] Wilson, G. et al. Best practices for scientific computing. 1–6.
[16] Drummond, C. Reproducible research: a dissenting opinion. (2012), 1–10.
[17] Ioannidis, J.P.A. et al. Repeatability of published microarray gene expression analyses. Nature Genetics. 41, (2009), 149–155.
[18] Savage, C.J. and Vickers, A.J. Empirical study of data sharing by authors publishing in PLoS journals. PLoS ONE. 4, (2009), 7078.
[19] Quirk, J. Computational science: "same old silence, same old mistakes," "something more is needed...". Adaptive Mesh Refinement – Theory and Applications. (2005), 4–28.
[20] McCullough, B.D. Got replicability? The Journal of Money, Credit and Banking archive. Econ Journal Watch. 4, (2007), 326–337.
[21] McCullough, B.D. Do economics journal archives promote replicable research? (2008).
[22] Manolescu, I. et al. Repeatability & workability evaluation of SIGMOD 2009. SIGMOD 2009 (2009), 2–4.
[23] Freire, J. et al. Computational reproducibility: state-of-the-art, challenges, and database research opportunities. SIGMOD 2012 (2012), 593–596.