Glimpses of future research practice: a musical study
David De Roure
Overview
• Generation 1 – Early adopters
• Generation 2 – Embedding
• Generation 3 – Radical sharing
• Music case study
e-Science
• e-Science was defined by John Taylor (Director General of the UK Research Councils) as “global collaboration in key areas of science and the next generation of infrastructure that will enable it”
• e-Science was the name of the destination
• It became the name of the journey
• When we arrive, the destination is just called science
“e-research extends e-Science and cyberinfrastructure to other disciplines, including the humanities and social sciences.”
e-Research
http://mitpress.mit.edu/catalog/item/default.asp?tid=12185&ttype=2
2000 – 2005
Generation 1
...the imminent flood of scientific data expected from the next generation of experiments, simulations, sensors and satellites
Tony Hey and Anne Trefethen
Source: CERN, CERN-EX-0712023, http://cdsweb.cern.ch/record/1203203
Jeremy Frey
• Workflows are the new rock and roll
• Machinery for coordinating the execution of (scientific) services and linking together (scientific) resources
• The era of Service Oriented Applications
• Repetitive and mundane boring stuff made easier
Carole Goble
E. Science laboris
Kepler
Triana
BPEL
Taverna
Trident
Meandre
Galaxy
co-shaping
co-design
co-creation
co-constitution
co-evolution
co-construction
co-
co-realisation
humility: the quality of being modest, reverential, even politely submissive, and never being arrogant, contemptuous, rude
Box of Chemists
My Chemistry Experiment
CombeChem
empower: to equip or supply with an ability; enable
service: the performance of duties or the duties performed as or by a waiter or servant
Current practices of early adopters of tools. Characterised by researchers using tools within their particular problem area, with some re-use of tools, data and methods within the discipline. Traditional publishing is supplemented by publication of some digital artefacts like workflows and links to data. Science is accelerated and practice is beginning to shift to emphasise in silico work.
1st Generation Summary
2005 – 2010
Generation 2
• Paul writes workflows for identifying biological pathways implicated in resistance to Trypanosomiasis in cattle
• Paul meets Jo. Jo is investigating Whipworm in mouse.
• Jo reuses one of Paul’s workflows without change.
• Jo identifies the biological pathways involved in sex dependence in the mouse model, believed to be involved in the ability of mice to expel the parasite.
• Previously a manual two-year study by Jo had failed to do this.
Reuse, Recycling, Repurposing
“A biologist would rather share their toothbrush than their gene name”
Mike Ashburner and others
Professor in Dept of Genetics, University of Cambridge, UK
Data mining: my data’s mine and your data’s mine
mySpace for scientists!
Facebook for scientists!
Not Facebook for scientists!
Web 2
Open Repositories
Researchers
Social Network
The experiment that is
Developers
Social Scientists
“Facebook for Scientists” ...but different to Facebook!
A repository of research methods
A community social network of people and things
A Social Virtual Research Environment
A probe into researcher behaviour
Open source (BSD) Ruby on Rails app
REST and SPARQL interfaces, Linked Data compliant
Inspiration for: BioCatalogue, MethodBox and SysmoDB
myExperiment currently has 3849 members, 234 groups, 1315 workflows, 349 files and 133 packs
Figure: Paul’s Pack, shown as Paul’s Research Object. It links Workflow 16, Workflow 13, QTL, Common pathways, data, method, Results, Logs, Metadata, Paper and Slides through the relations “feeds into”, “produces”, “included in” and “published in”.
Research Objects enable data-intensive research to be:
1. Replayable – go back and see what happened
2. Repeatable – run the experiment again
3. Reproducible – independent experiment to reproduce
4. Reusable – use as part of new experiments
5. Repurposeable – reuse the pieces in new experiments
6. Reliable – robust under automation
7. Referenceable – citable and traceable
The Six Rs of Research Object Behaviours
http://blog.openwetware.org/deroure/?p=56
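One way to picture the behaviours listed above is to treat a Research Object as a named aggregation of resources plus typed links between them. The sketch below is only illustrative: the `ResearchObject` class, its method names and the particular links between items are assumptions made for the example, not the myExperiment data model. Following links backwards gives a simple form of "go back and see what happened".

```python
from dataclasses import dataclass, field


@dataclass
class ResearchObject:
    """Hypothetical model: a named bundle of resources and typed links."""
    name: str
    resources: set = field(default_factory=set)
    relations: list = field(default_factory=list)  # (subject, predicate, object)

    def relate(self, subject, predicate, obj):
        """Record a typed link and make sure both ends are in the bundle."""
        self.resources.update((subject, obj))
        self.relations.append((subject, predicate, obj))

    def upstream(self, target):
        """Return every resource that leads to `target` through any link."""
        found, frontier = set(), [target]
        while frontier:
            node = frontier.pop()
            for subject, _, obj in self.relations:
                if obj == node and subject not in found:
                    found.add(subject)
                    frontier.append(subject)
        return found


# Illustrative links only; the actual pairings in the pack are assumed here.
pack = ResearchObject("Paul's Pack")
pack.relate("data", "feeds into", "Workflow 16")
pack.relate("Workflow 16", "produces", "Results")
pack.relate("Results", "included in", "Paper")

print(sorted(pack.upstream("Paper")))
```

Keeping the links as plain triples is a deliberate choice: the same structure maps directly onto RDF statements if the pack is later published as Linked Data.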
“Scientists and developers journeying together”
Projects delivering now. Some institutional embedding. Key characteristic is re-use of the increasing pool of tools, data and methods across areas/disciplines. Contains some freestanding, recombinant, reproducible research objects. New scientific practices are established and opportunities arise for completely new scientific investigations. Some expert curation.
2nd Generation Summary
2010 – 2015
Generation 3
4th Paradigm
The Fourth Paradigm: Data-Intensive Scientific Discovery
Presenting the first broad look at the rapidly emerging field of data-intensive science
http://research.microsoft.com/en-us/collaboration/fourthparadigm/
BioEssays, 26(1):99–105, January 2004
Francois Belleau
“…to discover proteins that interact with transmembrane proteins, particularly those that can be related to neuro-degenerative diseases in which amyloids play a significant role”
1) Taverna provenance exposed as RDF
2) myExperiment RDF document for a protein discovery workflow
3) Mocked-up BioCatalogue document using myExperiment RDF data as example
4) Provisional RDF documents obtained from the ConceptWiki (conceptwiki.org) development server
5) An RDF document for an example protein, obtained from the RDF interface of the UniProt web site
A Bioinformatics Experiment
Scott Marshall, Marco Roos
LifeGuide http://www.lifeguideonline.org/
http://www.galaxyzoo.org/
MethodBox http://www.methodbox.org/
The solutions we'll be delivering in 5 years. Characterised by global reuse of tools, data and methods across any discipline, and surfacing the right levels of complexity for the researcher. Routine use. Key characteristic is radical sharing. Research is significantly data driven, plundering the backlog of data, results and methods. Increasing automation and decision-support for the researcher: the VRE becomes assistive. Curation is autonomic and social.
3rd Generation Summary
Easy and low risk to start
Progress to advanced skills
For researchers
No obligation
Go as far as you want
Find a service & relax
Intellectual ramps
Malcolm Atkinson
NRAO/AUI/NSF
telescopes for the naked mind
Datascopes
Malcolm Atkinson
From Signal to Understanding
2010 – 2011 and beyond
Music and Linked Data
http://www.openarchives.org/ore/terms/aggregates
http://eprints.ecs.soton.ac.uk/id/eprint/20817
EPrints
It’s about enabling the join
Ben Fields, 6th October 2010
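A minimal sketch of what "enabling the join" can look like with Linked Data: two independent sources can be combined when they refer to the same thing by the same URI. The `ore:aggregates` predicate and the EPrints URI are the ones shown above; the aggregation URI, the `example.org` annotation triple and the helper names `ntriple` and `join` are hypothetical and exist only for illustration.

```python
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"
EPRINT = "http://eprints.ecs.soton.ac.uk/id/eprint/20817"

# Hypothetical URIs, invented for this example.
AGGREGATION = "http://example.org/aggregation/1"
TOPIC_PREDICATE = "http://example.org/terms/topic"
MUSIC_TOPIC = "http://example.org/topic/music"


def ntriple(subject, predicate, obj):
    """Serialise one statement of three URIs as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def join(left, right):
    """Pair statements where the object of one is the subject of the other."""
    return [(l, r) for l in left for r in right if l[2] == r[0]]


# Source 1: a repository saying what an aggregation contains.
repository = [(AGGREGATION, ORE_AGGREGATES, EPRINT)]
# Source 2: someone else describing the same eprint by its URI.
annotations = [(EPRINT, TOPIC_PREDICATE, MUSIC_TOPIC)]

for left, right in join(repository, annotations):
    print(ntriple(*left))
    print(ntriple(*right))
```

Neither source needs to know about the other; the shared URI alone is what makes the join possible.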
SALAMI: Structural Analysis of Large Amounts of Music Information
David De Roure
J. Stephen Downie
Ichiro Fujinaga
www.diggingintodata.org
Digital Music Collections
Crowdsourced ground truth
Community Software
Linked Data Repositories
Supercomputer
23,000 hours of recorded music
250,000 hours NCSA Supercomputer time
Music Information Retrieval Community
The SALAMI collaboration
• DDeR (e-Research South), J. Stephen Downie (Illinois) and Ichiro Fujinaga (McGill)
• NCSA donating 250,000 supercomputer hours
• 350,000 pieces of music (23,000 hours)
– Internet Archive, DRAM, IMIRSEL, McGill
• Feature analysis and structural analysis
• Music Ontology by Yves Raimond (BBC)
• Musicologists from McGill and Southampton
• Sharing of analyses
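Back-of-envelope arithmetic on the figures above gives a feel for the scale. Only the three input numbers come from the slide; the derived ratios are simple divisions, and they say nothing about how the supercomputer time was actually allocated.

```python
# Figures quoted for the SALAMI collaboration.
pieces = 350_000          # pieces of music
audio_hours = 23_000      # hours of recorded music
compute_hours = 250_000   # donated supercomputer hours

# Average length of one piece, in minutes.
avg_minutes_per_piece = audio_hours * 60 / pieces

# Supercomputer hours available per hour of audio.
compute_per_audio_hour = compute_hours / audio_hours

print(f"{avg_minutes_per_piece:.2f} minutes per piece on average")
print(f"{compute_per_audio_hour:.1f} compute hours per audio hour")
```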
Meandre
seasr.org/meandre
“Signal”
Digital Audio
“Ground Truth”
Community
It’s web-like!
Q. If and when should community-generated content be assimilated into managed repositories?
Structural Analysis
How country is my country?
www.nema.ecs.soton.ac.uk/countrycountry
Stephen Downie
Music and computational thinking
• Co-*
• Ramps
• Datascopes
• Linked data rocks
• Computational thinking
• It’s about enabling the join
Take homes
Visit wiki.myexperiment.org
Thanks to: Jeremy Frey & CombeChem; Carole Goble & myGrid; Iain Buchan, Sean Bechhofer and the myExperiment team; Doug Kell; Marco Roos; Stephen Downie, Kevin Page, Ben Fields and the NEMA/SALAMI team; Malcolm Atkinson