![Page 1: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/1.jpg)
Software Sustainability Institute
www.software.ac.uk
Doing Science in the Digital Age
Software, Skills and Sociology
http://dx.doi.org/10.6084/m9.figshare.957527
TGAC Science Symposia series, 11 March 2014
Neil Chue Hong (@npch), Software Sustainability Institute
ORCID: 0000-0002-8876-7606 | [email protected]
Unless otherwise indicated, slides licensed under
Supported by Project funding from
![Page 2: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/2.jpg)
Four Paradigms of Research
Empirical
Theoretical
Computational
Data Exploration
![Page 3: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/3.jpg)
Water Swap Reaction Coordinate
A water-swap reaction coordinate for the calculation of absolute protein-ligand binding free energies
Woods CJ, Malaisree M, Hannongbua S, Mulholland AJ
J. Chem. Phys. (2011) vol. 134, pp. 054114
http://dx.doi.org/10.1063/1.3519057
![Page 4: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/4.jpg)
Pleiotropic loci
Selection at pleiotropic loci underlies disease co-occurrence in human populations. Navarro, Haley, Karosas et al. Submitted to Nature Genetics
![Page 5: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/5.jpg)
Behind every great piece of science…
    #go through each SNP of interest
    for(my $x = 0; $x < scalar @pos; $x++){
      #and then each downstream SNP of interest
      for(my $y = $x+1; $y < scalar @pos; $y++) {
        #if SNPs within our chosen distance (500kb) and both present in the haplotypes file
        if((!($trait[$x] eq $trait[$y])) && (abs($pos[$x] - $pos[$y]) <= 500000) && (exists($legArrayPos{$pos[$x]})) && (exists($legArrayPos{$pos[$y]}))) {
          my $snp1ArrayPos = "";
          my $snp2ArrayPos = "";
          my $snp1All = "";
          my $snp2All = "";

          #create output file for this SNP pair
          my $filename = "ConditionedResults2/$chr[$x].$pos[$x]-$pos[$y].EHH.GBR.2.txt";
          print "$filename\n";
          unless (-e $filename) {
            open(OUT, ">$filename");

            #####################CHANGE THESE IF NOT FOCUSING ON SECOND SNP#########################
            my $start = $pos[$y]-500000;
            if ($start < 1) { $start = 1; }
            my $end = $pos[$y]+500000;
            if ($end > $chrLengths{$chr[$x]}) { $end = $chrLengths{$chr[$x]}; }
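The pair-selection and window-clamping steps in the Perl fragment above can be sketched in Python as follows. This is an illustrative sketch only, not the authors' code: the function names `candidate_pairs` and `clamp_window`, the dict-based SNP records, and the sample data are all assumptions made for the example. It mirrors the conditions visible in the fragment (different traits, positions within 500 kb, both positions present in a lookup) and nothing beyond them.

```python
from itertools import combinations

WINDOW = 500_000  # 500 kb, the distance hard-coded in the Perl fragment


def candidate_pairs(snps, known_positions, window=WINDOW):
    """Yield (first, second) SNP pairs that pass the fragment's filter.

    `snps` is a list of dicts with "trait" and "pos" keys (a hypothetical
    layout); `known_positions` plays the role of the %legArrayPos lookup.
    """
    # combinations() visits each SNP with every later SNP, like the nested loops
    for first, second in combinations(snps, 2):
        if (first["trait"] != second["trait"]
                and abs(first["pos"] - second["pos"]) <= window
                and first["pos"] in known_positions
                and second["pos"] in known_positions):
            yield first, second


def clamp_window(centre, chrom_length, window=WINDOW):
    """Return (start, end) around `centre`, clamped to [1, chrom_length]."""
    start = max(1, centre - window)
    end = min(chrom_length, centre + window)
    return start, end


if __name__ == "__main__":
    # Made-up sample data, for illustration only
    snps = [
        {"trait": "A", "pos": 100},
        {"trait": "B", "pos": 400_000},
        {"trait": "A", "pos": 450_000},
    ]
    known = {s["pos"] for s in snps}
    for first, second in candidate_pairs(snps, known):
        print(first["pos"], second["pos"], clamp_window(second["pos"], 1_000_000))
```

Naming the distance once as `WINDOW` and splitting the filter from the clamping makes each step testable on its own, which the inline version does not allow.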
![Page 6: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/6.jpg)
The modern researcher…
• … worries about:
- Data management and analysis
- Reproducible research
- Scalable simulations
- Integration of models and workflows
- Collaboration
Picture of Otto Stern courtesy of Emilio Segre Visual Archives
Where do they learn how to do this?
![Page 7: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/7.jpg)
Observation 1: Software is pervasive across research
Corollary: software is bleeding edge and long-tail
Demanding users are coming from arts + humanities, economics, and social science as well as sciences
![Page 8: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/8.jpg)
Observation 2: A culture of re-use rather than re-invention is not widespread
Corollary: we have wasted effort and increased siloing
![Page 9: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/9.jpg)
Observation 3: Many people are “embarrassed” about software
Corollary: something is broken in the way we regard, recognise and reward software
![Page 10: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/10.jpg)
The Research Cycle
[Diagram: cycle labelled Create, Test, Interpret, Publish, Revise. Research Outputs: Paper, Data, Software]
Research is a continuous cycle.
When we publish we are contributing to the body of knowledge.
![Page 11: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/11.jpg)
Research/Reuse/Reward Cycle
[Diagram: cycles labelled Research and Reuse. Step labels: Index, Identify, Cite, Reward; Create, Test, Interpret, Publish, Revise]
Reuse is also a cycle. We build our research on the work of others.
Reward mechanisms should encourage reuse.
![Page 12: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/12.jpg)
The current process
[Flowchart: Start research; Write software; Use software; Produce results; Publish research paper (which mentions software and data); Release data; Release software]
This process is simple but does not reward production or reuse of good software and data.
It also has a long contribution cycle.
![Page 13: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/13.jpg)
A better process?
[Flowchart: Start research; Identify existing software; Write software; Adapt/extend software; Use software; Produce results; Publish research paper (which references software and data papers); Release data; Release software; Publish software paper; Publish data paper]
Software and data papers are needed as proxies for rewarding reuse.
But it enables a shorter contribution cycle for data and software.
![Page 14: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/14.jpg)
Boundary
What do we choose to identify:
- Workflow?
- Software that runs workflow?
- Software referenced by workflow?
- Software dependencies?
What’s the minimum citable part?
![Page 15: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/15.jpg)
Granularity
Algorithm
Function
Program
Library / Suite / Package
…
![Page 16: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/16.jpg)
Versioning
Personal v1
Personal v2
Personal v3
Personal v2a
Public v1
Personal v3a
Personal v2a
Public v2
Public v3
Why do we version?
- To indicate a change
- To allow sharing
- To confer special status
![Page 17: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/17.jpg)
Authorship
• Which authors have had what impact on each version of the software?
• Who had the largest contribution to the scientific results in a paper?
http://beyond-impact.org/?p=175
OGSA-DAI projects statistics from Ohloh
![Page 18: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/18.jpg)
Observation 4: This is all getting just a little confusing
Corollary: maybe we need to get on to firmer conceptual ground
![Page 19: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/19.jpg)
The Foundations of Digital Research
[Diagram: three blocks each labelled Software, with the labels Re-usable and Re-producible]
www.software.ac.uk/ software-evaluation-guide resources/guides software-carpentry training
www.rse.ac.uk
www.software.ac.uk/blog/2012-11-09-craftsperson-and-scholar
software.ac.uk/blog/2012-08-16-what-research-software-community-and-why-should-you-care
www.software.ac.uk/blog/2011-05-02-publish-or-be-damned-alternative-impact-manifesto-research-software
Prlić A, Procter JB (2012) Ten Simple Rules for the Open Development of Scientific Software. PLoS Comput Biol 8(12): e1002802. doi:10.1371/journal.pcbi.1002802
Wilson G, et al. (2014) Best Practices for Scientific Computing. PLoS Biol 12(1): e1001745. doi:10.1371/journal.pbio.1001745
![Page 20: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/20.jpg)
Gap 1: Software Skills Training
[Chart: training provision placed on two axes, Basic to Advanced and Programming Focussed (Tools) to Research Focussed (methods). Items: Software Carpentry; Programming 101; Programming 201; Summer Schools; HPC Short Courses; Advanced HPC Training; Doctoral Training; MSc in HPC / scientific computing]
Who fills this gap?
![Page 21: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/21.jpg)
Gap 2: Lack of recognition and reward
• There is an anachronism in the way we conduct and recognise research
- REF references software as an output but it is still not easy to get recognition – peer review fails
• Software careers
- Researchers who use software
- Researcher-Developers
- Research Software Engineers
- Research Software Support
- Research Systems Providers
![Page 22: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/22.jpg)
Gap 3: Software Maturity and Management
[Chart: Software proliferation against Time, with phases labelled Customisation, Innovation, Consolidation]
Not all software should make it to the next stage
Management changes through time, requiring planning
![Page 23: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/23.jpg)
Standing on the shoulders of giants
• “If I have seen further it is by standing on the shoulders of giants” Isaac Newton
• As researchers we are honour-bound to share our knowledge so that all may benefit
![Page 24: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/24.jpg)
Observation 5: Most of the issues are not technical, they’re social
Corollary: we can do something to change them
![Page 25: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/25.jpg)
Career Paths in UK
[Diagram: UK STEM graduate career paths. Labels: PhD students; Early Career Research; Permanent Research Staff; Professor; Non-university Research (industry, government etc.); Careers outside academic sector]
Source: The Scientific Century, Royal Society, 2010 (revised to reflect first stage clarification from “What Do PhD’s Do?” study)
![Page 26: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/26.jpg)
We are science
Hear us roar!
Picture by Tamako the Jaguar
![Page 27: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/27.jpg)
Shake up the system
• “Swim or drown” is not an efficient learning method
• “Publish or perish” is not an effective reward mechanism
• “Becoming a Professor” is not a scalable career path
• “I’ll just have to do it myself” is not a modern way of doing science
![Page 28: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/28.jpg)
The Software Sustainability Institute
A national facility for cultivating world-class research through software
• Better software enables better research
• Software reaches boundaries in its development cycle that prevent improvement, growth and adoption
• Providing the expertise and services needed to negotiate to the next stage
• Developing the policy and tools to support the community developing and using research software
Better software
Better research
Supported by EPSRC Grant EP/H043160/1
![Page 29: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/29.jpg)
Campaigning for careers
www.rse.ac.uk
![Page 30: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/30.jpg)
Nurturing a training community
• Bringing together 39+ organisations with interest in e-Infrastructure training
• Raising issues and enablers with RCUK, BIS
software.ac.uk/policy
![Page 31: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/31.jpg)
SSI Fellows 2014
• 2014: 16 fellows
• 2013: 15 fellows
• 2012: 10 fellows
• Range of subjects, career stages
software.ac.uk/fellows
![Page 32: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/32.jpg)
Welcome to the CW14
The Role of Software in Reproducible Research
6th Collaborations Workshop, Oxford, 26-28th March 2014
Organised by the Software Sustainability Institute
Sponsored by Microsoft Research and GitHub
#CollabW14
software.ac.uk/cw14
![Page 33: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/33.jpg)
Publicise your software
http://openresearchsoftware.metajnl.com
http://dx.doi.org/10.6084/m9.figshare.942289
![Page 34: Unless otherwise indicated slides licensed under](https://reader036.vdocuments.site/reader036/viewer/2022062410/568163a2550346895dd4a408/html5/thumbnails/34.jpg)
What you can do now
• Read the Best Practices for Scientific Computing http://dx.doi.org/10.1371/journal.pbio.1001745
• Release your code and publish it in a journal http://bit.ly/softwarejournals
• Learn new software skills and pass them on to others http://www.software-carpentry.org/
• Ask for software and data if you’re reviewing a paper
• Forge a career in research, and change it for those coming behind you
• The DOI for this presentation: 10.6084/m9.figshare.957257
• The Software Sustainability Institute is a collaboration between the universities of Edinburgh, Manchester, Oxford and Southampton. Supported by EPSRC Grant EP/H043160/1.