Tools for reproducible and accessible science
KnitR, VMs and OMERO
Rob Davidson
Cardiac Physiome Workshop
Auckland, April 8th 2015
DOI for this talk: 10.6084/m9.figshare.1368774
Today’s message
• Tools that fit with GigaDB
– General-purpose Research Object store
• Enhancing
– Accessibility
– Reproducibility
• Of some of your research objects
– Software
– Images
Problems with scientific software - reproducibility
Measuring software reproducibility
• Systematic study:
• 515 papers (429 conference, 86 journal)
• <30% reproducible
http://reproducibility.cs.arizona.edu
[Figure: Measuring software reproducibility. Source: http://reproducibility.cs.arizona.edu]
Reasons for failure
“The good news is that I was able to find some code. I am just hoping that it is a stable working version of the code... I have lost some data... The bad news is that the code is not commented and/or clean. So, I cannot really guarantee that you will enjoy playing with it.”
http://reproducibility.cs.arizona.edu
Cost of failure
• Waste time
• Waste money
– Ioannidis 2014: 85% of resources wasted
• Frustrating
• Distrust
DOI: 10.1371/journal.pmed.1001747
Literate programming - KnitR
Literate programming
• “Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.”
– Donald E. Knuth, Literate Programming, 1984
Literate programming options
• See listing: http://www.gigasciencejournal.com/content/3/1/19
– R: KnitR, Sweave, R Markdown
– JavaScript: Tangle, Active Markdown (CoffeeScript)
– Python: IPython Notebooks
– iReport links this functionality for Galaxy
KnitR is versatile
[Diagram labels: R, Python, Ruby, Haskell, Perl, SAS, CoffeeScript, .txt, LaTeX, HTML, D3.js, R Markdown, HTML5 slides, Command line, Any text?, WordPress]
KnitR – how does it work?
• Code chunks
– Basic text (or LaTeX or Markdown), interrupted by ‘chunks’ of code
• For LaTeX, similar to Sweave
…some text \Sexpr{rfunc(var)} more text…
…some text
<<language, chunk_name, chunk_options>>=
Some code
@
• Process this combined text/code with knit() in R
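To make the chunk idea concrete, here is a minimal Python sketch of the first step any such tool has to perform: separating prose from code using the `<<...>>=` and `@` markers. This is an illustration only, not how knitr is implemented; `Piece` and `split_chunks` are hypothetical names, and the sketch ignores inline `\Sexpr{}` expressions and never evaluates any code.

```python
from typing import List, NamedTuple


class Piece(NamedTuple):
    kind: str    # "text" or "code"
    header: str  # raw chunk header (empty for text pieces)
    body: str    # the lines belonging to this piece


def split_chunks(source: str) -> List[Piece]:
    """Split a noweb-style document into text and code pieces.

    Hypothetical helper for illustration; empty pieces are dropped.
    """
    pieces: List[Piece] = []
    kind, header, buffer = "text", "", []
    for line in source.splitlines():
        marker = line.strip()
        # A chunk opens with <<header>>= and closes with a lone @.
        opens = kind == "text" and marker.startswith("<<") and marker.endswith(">>=")
        closes = kind == "code" and marker == "@"
        if opens or closes:
            if buffer:
                pieces.append(Piece(kind, header, "\n".join(buffer)))
            kind = "code" if opens else "text"
            header = marker[2:-3].strip() if opens else ""
            buffer = []
        else:
            buffer.append(line)
    if buffer:
        pieces.append(Piece(kind, header, "\n".join(buffer)))
    return pieces


example = "Some text\n<<demo, echo=FALSE>>=\nx <- 1 + 1\n@\nMore text\n"
for piece in split_chunks(example):
    print(piece.kind, repr(piece.header), repr(piece.body))
```

A real tool would then run each code piece and splice its output back into the text; that step is deliberately left out here.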
KnitR uses: easy to explain
http://reproducibility.cs.arizona.edu
KnitR uses: reproducible analysis
• Can string different tools/languages together
• Stores parameters
• Just like a pipeline/workflow system
– E.g. Galaxy, Taverna, KNIME
• But also: codifies your figures…
KnitR uses: codified figures
• Classic problems:
• No description of error bars
• No description of distributions
• Admittedly this could be fixed by ‘proper’ peer review
Source code: http://bit.ly/1NQZlHh
KnitR uses: codified figures
• Code can be found quickly
• Using text as markers
• Plot can be altered – 1 line of code
• New visualisation produced instantaneously
• Better evaluation of results
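As a rough illustration of what a "codified" figure buys you, the Python sketch below keeps the choice of error bar in code and generates the caption from that same choice, so the description cannot drift from the plot. It is not the source code linked from these slides; `summarise` and its `error` parameter are hypothetical names, and only the summary numbers are computed (no plotting library is assumed).

```python
import math
import statistics


def summarise(values, error="sd"):
    """Return the mean, the error-bar half-width and a matching caption.

    Hypothetical helper: `error` selects what the error bars represent.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if error == "sd":
        half_width, label = sd, "standard deviation"
    elif error == "sem":
        half_width, label = sd / math.sqrt(n), "standard error of the mean"
    else:
        raise ValueError(f"unknown error type: {error}")
    caption = f"Points show the mean; error bars show ±1 {label} (n = {n})."
    return {"mean": mean, "half_width": half_width, "caption": caption}


measurements = [1.0, 2.0, 3.0, 4.0, 5.0]
print(summarise(measurements, error="sd")["caption"])
# Changing one argument changes both the bars and their description.
print(summarise(measurements, error="sem")["caption"])
```

Switching `error="sd"` to `error="sem"` is the kind of one-line alteration the slide refers to: a reader with the document source can re-run it and see a different view of the same data.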
Source code: http://bit.ly/1NQZlHh
GigaScience KnitR example
• “This article is an example of a literate programming document. It has been created in R using the knitr package. Figures and tables in this paper are generated dynamically as the document is compiled. Several R packages are required to run the analysis. Materials are archived in the Gigascience database”
DOI: 10.1186/2047-217X-3-3
Environment wrappers - VMs
[Figure: Measuring software reproducibility. Source: http://reproducibility.cs.arizona.edu]
Your environment
• How hard would it be to start from scratch?
• What if you move from Ubuntu to CentOS? Or just upgrade?
• Dependencies / Versions
• System settings
• Hard for you, horrendous for others!
Share your environment
• Virtual machine
– Copy your exact environment
– If it works for you, it works for anyone
– Reproducibility, frozen in time
DOI: 10.1186/2047-217X-3-23
Share your environment
• Docker
– ‘Light’ VM
– Discrete unit of code + environment
– Can be called from command line
– Can be linked together
• New possibilities, e.g. nucleotid.es
– Benchmarking -> “data-driven peer review”?
http://nucleotid.es/
Share your environment
• Some concerns:
– http://ivory.idyll.org/blog/vms-considered-harmful.html
– VM = black box?
– Docker == black box!
Solution -> codify the environment
Codify your environment
• Provisioning scripts are ‘research objects’
• Improves adaptability (easier to recode for alternative OS etc.)
• Builds in extra documentation
• Easier to share, although GigaDB still wants a compiled snapshot (i.e. full machine)
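A full provisioning script depends on the tool chosen, so as a much smaller illustration of "environment as text" here is a Python sketch that writes out a machine-readable description of the interpreter, operating system and installed Python packages. It is not a provisioning script and cannot rebuild a machine; `describe_environment` is a hypothetical name, and the sketch assumes Python 3.8 or later for `importlib.metadata`.

```python
import json
import platform
import sys
from importlib import metadata


def describe_environment():
    """Collect a plain-data description of the current Python environment.

    Hypothetical helper: records versions only, it does not install anything.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken or missing metadata
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "packages": packages,
    }


if __name__ == "__main__":
    # Emit JSON so the description can be archived next to code and data.
    json.dump(describe_environment(), sys.stdout, indent=2, sort_keys=True)
    print()
```

A text file like this is small enough to version alongside an analysis and documents the dependencies and versions in use, which is the part of the environment that a snapshot alone leaves opaque.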
Short list of provisioning systems
• Vagrant
• Chef
• Salt
• Puppet
• Ansible
• Many more: see link for info
Source: http://bit.ly/1wrYiuI
Images: release ALL the images with OMERO
“And now for something completely different”
NO
Phenotyping with microCT
DOI: 10.1186/2047-217X-2-14
NO
Phenotyping with microCT
DOI: 10.1186/2047-217X-3-6
Hosting Images
• Image LIMS
• MetaData!!!
• Can handle most formats
• Web embedding
• View online, no need for software
• Open Source
www.openmicroscopy.org/site/products/omero
OMERO: providing access to imaging data
View, filter, measure raw images with direct links from journal article.
See all image data, not just cherry-picked examples.
Download and reprocess.
OMERO: Adding value
http://jcb-dataviewer.rupress.org/
The alternative...
...look but don't touch
Thanks for listening!
Acknowledgements
• GigaTeam
– Scott Edmunds
– Peter Li
– Chris Hunter
– Jesse Xiao
– Nicole Edmunds
– Laurie Goodman
Where to get these slides
• FigShare DOI: 10.6084/m9.figshare.1368774
• http://bit.ly/1JmnRiU