Research Data and the Future of Software Engineering
Amir Aryani, Australian National Data Service (ANDS)
Heinz Schmidt, RMIT University
Amir Aryani (twitter: @amir_at_ands)
ANDS is enabling transformation of Australian research data
• Funded by the Australian Commonwealth Government through the National Collaborative Research Infrastructure Strategy (NCRIS) with the mission of transforming Australia’s research data.
• Since 2009 ANDS has spent $80 million building skills, knowledge, services and community around data.
About eResearch@RMIT
• Established to provide IT systems, software and services support to researchers who need high-performance computing (HPC), high-bandwidth networks and data-intensive collaborative spaces, such as:
– Research Data Capture and Curation
– Campus Cloud
– Data Visualization, …
Agenda
Fourth Paradigm
• A thousand years ago: science was empirical – describing natural phenomena
• Last few hundred years: a theoretical branch – using models, generalizations
• Last few decades: a computational branch – simulating complex phenomena
• Today: data exploration (eScience) – unifying theory, experiment, and simulation
Ref: Tony Hey, Stewart Tansley, and Kristin Tolle, The fourth paradigm: data-intensive scientific discovery, Microsoft Research, 2009
Jim Gray on eScience: A Transformed Scientific Method
Presented to Neelie Kroes, European Commission Vice-President for the Digital Agenda, the report "Riding the Wave: How Europe can gain from the rising tide of scientific data" is the result of six months of intense brainstorming and consultations by the High-Level Group on Scientific Data.
Riding the Wave
Ref: http://ec.europa.eu/information_society/newsroom/cf/itemlongdetail.cfm?item_id=6204
Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants.
Ref: http://www.nsf.gov/bfa/dias/policy/dmp.jsp
Making research data widely available to the research community in a timely and responsible manner ensures that these data can be verified, built upon and used to advance knowledge…
Ref: http://www.wellcome.ac.uk/About-us/Policy/Policy-and-position-statements/WTX035043.htm
Ref: http://www.arc.gov.au/pdf/LIEF15/LE15%20Funding%20Rules.pdf
“Researchers and institutions have an obligation to care for and maintain research data in accordance with the Australian Code for the Responsible Conduct of Research (2007). The ARC considers data management planning an important part of the responsible conduct of research and strongly encourages the depositing of data arising from a Project in an appropriate publicly accessible subject and/or institutional repository.”
477 academics and policymakers from around the globe gathered for the Research Data Alliance’s third plenary meeting in Dublin (March 2014)
Research Data Alliance
Data sharing and open access
Research Data
“The data, records, files or other evidence, irrespective of their content or form (e.g. in print, digital, physical or other forms), that comprise research observations, findings or outcomes, including primary materials and analysed data.”
Monash University Research Data Policy
ANDS Guideline: ands.org.au/guides/what-is-research-data.html
Research Data (a simple perspective)
Why share your data?
• Credibility
• Transparency
• Collaboration
– Better return on investment
– New research opportunities
• Data citation
– Adding data to your resume
Data Management Framework*
• Institutional policies and procedures
• IT infrastructure (hardware & software)
• Support services (people & advice)
• Managing metadata
*www.ands.org.au/datamanagement/overview.html
Ref: http://library.ucf.edu/scholarlycommunication/ResearchLifecycleUCF.php
Research Data Repositories
Software engineers are the key enablers
Software is embedded in the research data lifecycle
How about your research data?
Research data in computer science
Storage: ?
Infrastructure: ?
What is missing?
The majority of Australian universities have no domain-specific data curation solution to support researchers in computer science and computer engineering.
Roadmap for future work
• Research: formal approach to data management in Computer Science and Software Engineering (CS & SE)
• Infrastructure: national/international data management infrastructure for CS & SE
• Policy and Practice: building the culture of data citation
Research Problem
Formal approach (domain-based method) to data management in software engineering
Opportunity: collaborative research
Infrastructure Gap
• National/international infrastructure for data management in computer science and software engineering
Requires: cross-institution collaboration
Process/Policy Challenge
Last comment: Open data = new research opportunities
Find these slides at
Twitter: @amir_at_ands
slideshare.net/amiraryani