What business are we in? Data-centric research, service requirements and national responses
DESCRIPTION
Data keynote delivered at the NEIC 2013 conference in Trondheim, Norway. Argues that research infrastructure providers are all in the data business. Video of the presentation is online at http://www.youtube.com/watch?feature=player_embedded&v=oCSqYoaRWR0#! (from 4:00 to 34:24).
TRANSCRIPT
What business are we in?
Data-centric research, service requirements and national responses
Data Keynote, NEIC 2013
Dr Andrew Treloar
Australian National Data Service
CC-BY @atreloar 2
Overview
• What business are we really in?
• Service requirements
• Infrastructure responses
• Research Data Alliance
• Conclusions
Photo CC-BY www.flickr.com/photos/dgjones/7031731377/
Photo CC-BY www.flickr.com/photos/pejmanphotos/1322835717/
What Business are you in?
Theodore Levitt, Marketing Myopia, Harvard Business Review, July–August 1960
“The railroads did not stop growing because the need for passenger and freight transportation declined. That grew. The railroads are in trouble today not because that need was filled by others (cars, trucks, airplanes, and even telephones) but because it was not filled by the railroads themselves. They let others take customers away from them because they assumed themselves to be in the railroad business rather than in the transportation business. The reason they defined their industry incorrectly was that they were railroad oriented instead of transportation oriented; they were product oriented instead of customer oriented....”
Photo CC-BY www.flickr.com/photos/spookman01/4904264919/
Photo CC-BY www.flickr.com/photos/jerryjohn/63351338/
Photo CC-BY www.flickr.com/photos/stiefkind/6454784607/
Photo CC-BY www.flickr.com/photos/torkildr/3462607995/
We are all in the Data business!
• Researchers
  – with some exceptions
• Research infrastructure providers
  – with no exceptions
• But what about publications?
LHC output from 2009-2013 = 100PB
(www.symmetrymagazine.org/article/february-2013/achievement-unlocked-100-petabytes-of-data)
Journal Literature size in context…
Data-centric view of research data re-use
eResearch infrastructure requirements
• Create/Capture
  – automated with capture of associated metadata
• Store
  – with appropriate levels of preservation
• Describe
  – information for discovery, determination of value, access, re-use
• Identify
  – indirection operator to reduce brittleness
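The "indirection operator" idea behind Identify can be sketched as follows. This is a minimal illustration, not how any particular identifier service is implemented: the `IdentifierResolver` class, the `example:` identifier scheme and the `example.org` URLs are all hypothetical. The point is only that a citation stores the identifier rather than the location, so the data can move without breaking the citation.

```python
from dataclasses import dataclass, field


@dataclass
class IdentifierResolver:
    """Toy resolver (hypothetical): maps a stable identifier to a current location."""

    _locations: dict = field(default_factory=dict)

    def register(self, identifier: str, location: str) -> None:
        # Record where the identified object currently lives.
        self._locations[identifier] = location

    def move(self, identifier: str, new_location: str) -> None:
        # Only the resolver entry changes when the data moves.
        if identifier not in self._locations:
            raise KeyError(identifier)
        self._locations[identifier] = new_location

    def resolve(self, identifier: str) -> str:
        # Look up the current location at the time of use.
        return self._locations[identifier]


resolver = IdentifierResolver()
resolver.register("example:dataset-1", "https://old-store.example.org/data/1")

# A paper cites the identifier, never the URL.
citation = "example:dataset-1"

# The data store is migrated; the citation is untouched.
resolver.move("example:dataset-1", "https://new-store.example.org/archive/1")

current_location = resolver.resolve(citation)
print(current_location)
```

Had the citation stored the original URL directly, the migration would have left it pointing at a dead location; that is the brittleness the extra lookup step avoids.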
eResearch infrastructure requirements
• Register
  – in institutional/national/discipline registries
• Discover
  – via general or specialised search interfaces
• Access
  – with appropriate levels of control, including humans
• Exploit
  – by re-analysis or combination
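How Describe, Register and Discover fit together can be sketched in a few lines. This is an illustrative assumption only: the `CollectionRecord` fields, the `discover` function and the two sample records are hypothetical and do not reflect the schema of any real registry. A description is registered once, and a search interface then matches queries against that description rather than against the data itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionRecord:
    """Hypothetical description of a data collection held in a registry."""

    identifier: str
    title: str
    description: str
    registry_level: str  # "institutional", "national" or "discipline"


def discover(records, query):
    """Return records whose title or description contains every query word."""
    words = query.lower().split()
    return [
        record
        for record in records
        if all(word in f"{record.title} {record.description}".lower() for word in words)
    ]


# Made-up sample records, for illustration only.
registry = [
    CollectionRecord(
        "example:1",
        "Reef temperature logs",
        "Sea surface temperature readings, 2001-2010",
        "national",
    ),
    CollectionRecord(
        "example:2",
        "Soil survey",
        "Soil carbon measurements from farm sites",
        "institutional",
    ),
]

hits = discover(registry, "temperature reef")
print([record.identifier for record in hits])
```

The search only ever touches the descriptions, which is why the quality of the Describe step limits what Discover can find.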
Photo CC-BY http://www.flickr.com/photos/vintuitive/6855133329/
I come from a land down under…
AU
• 6 States
• 2 Territories
• 2 islands
• 23M people
NZ
• 2 islands
• 4.5M people
You come from the frozen North…
Nordic Countries
• 5 Countries
• 4 Territories
• So many islands
• 26M people
And yet there are some similarities
• Australia+NZ – 27.5M people
• Nordic countries – 26M people
Australian National Data Service
• An initiative of the Australian Government being conducted as part of the National Collaborative Research Infrastructure Strategy ($A24M) and the Super Science Initiative ($A48M)
• A collaboration between Monash University, the Australian National University and CSIRO
• 30 staff, funded to mid 2015
• More researchers re-using more data more often
• Data as a first-class object
ANDS enables transformation of:
Data that are:
• Unmanaged
• Disconnected
• Invisible
• Single use
To Structured Collections that are:
• Managed
• Connected
• Findable
• Reusable
so that Australian researchers can easily publish, discover, access and use/re-use research data.
Data-centric view of research data re-use
ANDS activities/services
• Plan
  – Data management planning tools and resources (N)
• Create/Capture
  – 69 Data Capture projects at 23 universities
• Store
  – working closely with national Research Data Storage Infrastructure (N)
• Describe
  – 25 institutional Metadata Stores projects
  – National Vocabulary Services (N)
ANDS activities/services
• Identify (N)
  – DataCite DOIs
• Register (N)
  – Registry Interchange Format – Collections and Services (RIF-CS), based on ISO 2146:2010
• Discover (N)
  – Research Data Australia
ANDS activities/services
• Access
  – enforced by underlying data stores
• Exploit
  – 25 institutionally-focussed projects to demonstrate value of combining data
• Advocate (N)
  – Be the voice for data
  – Work with Government and Research Funders to change settings in favour of data sharing
Research Data Alliance
The Research Data Alliance (RDA) is a new international organization (driven now by EC, US, AU, more soon) forming to facilitate specific, short-term efforts that accelerate the sharing and exchange of research data
Unofficial motto: rough consensus and exchanged data
Working groups will run over 12-18 months to produce
• Adopted standards
• Deployed infrastructure
• Adopted policy
• Implemented best practice, etc.
Second Plenary in Washington DC, September 16-18
Slide by Fran Berman
Research Data Alliance
Working Groups and Interest Groups
• Data Type Registries
• Data Foundation and Terminology
• Practical Policy
• PID Information Types
• Metadata Standards WG
• Community Capability Model
• Working Group on Data Citation: Making Data Citable
• Structural Biology
• Defining Urban Data Exchange for Science
• Marine Data Harmonization
• Repository Audit and Certification
• Big Data Analytics
• Metadata Standards Directory Interest Group (MSDIG)
• The Engagement Group
• Legal Interoperability
• Preservation e-Infrastructure
• UPC Code for Data
• Publishing Data
• Data in Context
• Citation of Dynamic Data
• Agricultural Data Interoperability
Slide by Fran Berman
Conclusion
• We are all in the data business
• Researchers need data services from their infrastructure providers
• A number of services can best be provided at national or regional level
• Research Data Alliance is working to develop international solutions for data interoperability – join us!
Questions?
@atreloar
ands.org.au
rd-alliance.org