1 11/09/2015 final reflections apa conference 2011 john wood
TRANSCRIPT
1
21/04/23
Final Reflections
APA Conference 2011
John Wood
2
ERA 2030: ERAB’s STRATEGIC VIEW
October 2009
3
Commissioner Janez Potocnik
... holistic thinking and approach epitomized the first ‘Renaissance’, where scholars and artists moved relatively freely around Europe among the centres of learning and culture. While this privilege was the domain of a few at that time, it should be our ambition, in the new ‘Renaissance’, that this should be the expectation of all citizens, especially in the field of research and innovation.
4
An ERA driven by societal needs to address the ‘Grand Challenges’
5
Highlights - CANFAR connects astronomical dots
July 1, 2009, Victoria, British Columbia
Astronomers were once stereotyped as lone insomniacs tending optical telescopes. But now they do most of their research in
"virtual organizations" - far-flung national and international collaborations of diverse people and institutions that use the
Internet to exchange and crunch vast stores of digital data fed by telescopes of many kinds around the planet.
Scenario IV: Science and the student
Roger is working on an international PhD. It’s a relatively new programme, in which a student applies to become a member of an international team working on a big problem that affects all people. His group is comparing many forms of nonverbal communications between cultures. It has several hundred members and his university tutor is one of the nodal points contributing expertise in ‘synergistic communication between biological components.’ Others in the network are using archaeological evidence to study communications between ancient Mesopotamian and Hellenic cultures; some are studying computer-computer interactions between different systems; yet more are studying communications in refugee camps. Each node contributes to the whole. Results are communicated as they happen, and there are daily, virtual-presence planning sessions. Roger had to sign a contract not to misuse data or contribute anything that is not for the common good – such as externally sourced information that he has not checked for provenance.
[Diagram: discipline communities and shared data services]
Disciplines shown: Crystallography, History, Astronomy, Earth Science, Earth Observation, Ground Truth, Physical Chemistry, Biochemistry, Climatology, Chemistry, Biology, Demography
Layers: Community Support Services; Data Services
• Computing Infrastructure
• Persistent Storage Capacity
• Integrity
• Authentication & Security
• API
• Data Discovery & Navigation
• Workflows Generation
Other labels: Scientific Data (Discipline Specific); Other Data; Scientific World; Non Scientific World; Researcher 1; Researcher 2; Workflows; Aggregation Path; Aggregated Data Sets (Temporary or Permanent)
Source: High-level Group on Scientific Data
9
• Data ingest
  – Managing petabytes+
• Common schema(s)
  – How to organize?
  – How to re-organize?
  – How to coexist & cooperate with other scientists and researchers?
• Data query and visualization tools
• Support/training
• Performance
  – Execute queries in a minute
  – Batch (big) query scheduling
[Diagram: facts from Experiments & Instruments, Simulations, Literature and Other Archives; questions go in, answers come out]
10
Social Science and Humanities Projects
• CESSDA: Council of European Social Science Data Archives (till December 2009)
• CLARIN: Common Language Resources and Technology Infrastructure (till December 2010)
• DARIAH: Digital Research Infrastructure for the Arts and Humanities (till September 2010)
• ESS: European Social Survey (till May 2010)
• SHARE: Survey on Health, Ageing and Retirement in Europe (till December 2009)
Copyright © 2009 Norwegian Social Sciences Data Services Grenoble, September 10, 2009
11
Common Language Resources and technology Initiative - CLARIN
• large-scale pan-European coordinated infrastructure
• language resources and technology to scholars of all disciplines
• based on a Grid-type infrastructure
• using Semantic Web technology
Estimated costs
• Preparation: 4.1 M€ (2008 - 2010)
• Construction: 104 M€ (2011 - 2013)
• Operation: 38 M€ (2014 - 2018)
• Decommissioning: not applicable
www.clarin.eu
Brussels, 25 September 2008
12
RAMIRI Hamburg Sept 2009 - Steven Krauwer 12
What is CLARIN?
• Common Language Resources and Technology Infrastructure (http://www.clarin.eu)
• Basic idea:
  – European federation of digital archives with language data and tools (text, speech, multimodal, gesture …)
  – target audience humanities and social sciences scholars
  – with uniform single sign-on access to the archives
  – with access to language and speech technology tools to retrieve, manipulate, enhance, explore and exploit data
  – all languages are equally important
  – to cover all EU and associated countries
13
Main challenges: Take-up
• Take-up by target audience:
  – aim at humanities and social sciences scholars
  – who have no technical background
  – who have very little tradition in using technological tools
• Special challenges:
  – discovering what they need
  – making them aware of the potential benefits of the infrastructure, e.g. to speed up or innovate their research
14
Main challenges: Legal and ethical
• Legal challenges:
  – making a light access and licensing system for the users
  – protecting owners’ rights and interests
  – respecting national IPR legislation
• Special problems:
  – transnational access and diversity of national IPR and data legislation
  – repurposed data (e.g. using novels or TV news for linguistic studies)
  – ethical & privacy considerations (e.g. use recorded phone calls to train speech recognition systems)
15
Large-scale e-Infrastructures for Biodiversity Research
16
Experimentation on a few parameters is not enough:
Limitations to scaling up results for understanding system properties
The biodiversity system is complex and cannot be described by the simple sum of its components and relations
LifeWatch adds a new technology to support the generation and analysis of large-scale data-sets on biodiversity. Find patterns and learn processes.
17
Distributed data generation
Continental ecological monitoring sites
Marine monitoring sites
Biological collections
Greenhouse gas measurements
Plate observing system
18
Data + users from other infrastructures
19
XFEL: Office and Laboratory Building
20
21
22
Peak brightness of pulsed X-ray sources
[Figure (H.-D. Nuhn, H. Winick): Peak Brightness [Phot./(s · mrad² · mm² · 0.1% bandw.)] plotted against FWHM X-Ray Pulse Duration [ps]. Sources shown: 2nd Gen. SR, 3rd Gen. SR, Laser Slicing, SPPS, ERLs, X-Ray FELs; points are labelled Initial and Future.]
Ultrafast x-ray sources will probe space and time with atomic resolution.
What do we do today, and what tomorrow?
23
Coulomb explosion of lysozyme (50 fs), LCLS
Radiation damage interferes with atomic scattering factors and atomic positions
Pulse parameters: 50 fs; 3×10^12 photons/100 nm spot; 12 keV
R. Neutze, R. Wouts, D. van der Spoel, E. Weckert, J. Hajdu: Nature 406 (2000) 752-757
[Figure: snapshots of the explosion at t=0, t=50 fsec and t=100 fsec]
24
25
DAQ Challenge: 2D X-Ray Detector Systems
• 10^6 pixels per frame for one detector
• O(400-500) frames per train (goal, likely will start with less)
• 10 trains per second (machine allows up to 30 Hz…)
• With 2 Byte/pixel, average rate 10 Gbyte/sec for one 2D detector!
• Time between frames as short as 200 ns: buffering needed
[Timing diagram (LPD): trains repeat every 100 ms; each train lasts 600 µs and is followed by a 99.4 ms gap; frames within a train are 200 ns apart]
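The 10 Gbyte/sec figure on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: it reads the pixel count as 10^6, takes the upper end of the O(400-500) frames range, and uses decimal units (1 Gbyte = 10^9 bytes). The function and constant names are made up for this example and are not from the talk.

```python
# Figures quoted on the slide; names are illustrative assumptions.
PIXELS_PER_FRAME = 10**6      # pixels per frame for one detector
BYTES_PER_PIXEL = 2
FRAMES_PER_TRAIN = 500        # upper end of the O(400-500) goal
TRAINS_PER_SECOND = 10
FRAME_SPACING_NS = 200        # shortest time between frames


def average_rate_bytes_per_s(frames_per_train=FRAMES_PER_TRAIN,
                             trains_per_second=TRAINS_PER_SECOND):
    """Average data rate of one 2D detector, in bytes per second."""
    frame_bytes = PIXELS_PER_FRAME * BYTES_PER_PIXEL
    return frame_bytes * frames_per_train * trains_per_second


def burst_rate_bytes_per_s():
    """Rate inside a train, when a new frame arrives every 200 ns."""
    frame_bytes = PIXELS_PER_FRAME * BYTES_PER_PIXEL
    return frame_bytes * 10**9 // FRAME_SPACING_NS


if __name__ == "__main__":
    print(f"average: {average_rate_bytes_per_s() / 1e9:.0f} Gbyte/sec")
    print(f"in-train burst: {burst_rate_bytes_per_s() / 1e12:.0f} Tbyte/sec")
```

Under these assumptions the average comes out at 10 Gbyte/sec, while the rate inside a train is about a thousand times higher, which is one way to read the slide's remark that buffering is needed.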
26
Technology Forecast – Storage at DESY
Year   Rate Capability [Gbyte/sec]   Storage Space [Petabyte]
2009   1                             3
2012   5                             26
2016   40                            200
• not a technology problem
• money and manpower issues
• to be determined:
  – user behaviour
  – compression and accept/reject algorithms
• potentially critical: access to data!
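One way to read the forecast is to ask how long the planned storage would last if data were written continuously at the full rate capability. The slide does not make this calculation; the sketch below is a rough illustration that assumes decimal units (1 Petabyte = 10^6 Gbyte), no compression and no rejected data. The names are hypothetical.

```python
# Forecast rows from the slide: year -> (rate in Gbyte/sec, space in Petabyte).
FORECAST = {
    2009: (1, 3),
    2012: (5, 26),
    2016: (40, 200),
}

GBYTE_PER_PETABYTE = 10**6   # decimal units, an assumption
SECONDS_PER_DAY = 86_400


def days_to_fill(rate_gbyte_per_s, space_petabyte):
    """Days until the storage is full when writing at the full rate."""
    seconds = space_petabyte * GBYTE_PER_PETABYTE / rate_gbyte_per_s
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for year, (rate, space) in sorted(FORECAST.items()):
        print(f"{year}: about {days_to_fill(rate, space):.0f} days to fill")
```

Under these assumptions each configuration fills in roughly one to two months, which is consistent with the slide listing user behaviour and compression and accept/reject algorithms as open questions.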
27
GÉANT: connecting Europe
Pan-European coverage (40+ countries / 3900 universities / 30+ million students)
Hybrid architecture:
• connectivity at 10 Gb/s (aggregated traffic)
• dark fiber wavelengths (demanding communities)
28
GÉANT: global reach
29
30
31
The role of the Data Scientist?
• Extension of Library or Archive Personnel?
• Where are the data scientists now?
• The role of current large data users in training
• Part of a larger problem in how to manage large research infrastructures including virtual
32
Bernd Panzer-Steindel, CERN
Creating conditions similar to the Big Bang
The most powerful microscope in the world
‘snapshot of nature’
Particle Accelerator
one snapshot == one event == 1.5 MByte
2/17/2010
33
Data and Bookkeeping
[Diagram: RAW data, RAW data copies, Sub-samples, Enhanced copies]
10 PB RAW data per year + derived data, extracted physics data sets, filtered data sets, artificial data sets (Monte-Carlo), multiple versions, multiple copies: 20 PB of data at CERN per year
50 PB of data in addition transferred and stored world-wide per year
Data copies: safety and access/performance
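The CERN figures can be combined into a rough scale estimate. The sketch below is an illustration only: it assumes decimal units, and it assumes that all RAW data consists of events of the quoted 1.5 MByte size, which the slides do not state. The variable names are invented for this example.

```python
# Figures quoted on the CERN slides; names are illustrative assumptions.
EVENT_SIZE_BYTES = 1_500_000      # one snapshot == one event == 1.5 MByte
BYTES_PER_PB = 10**15             # decimal petabyte, an assumption

RAW_PB_PER_YEAR = 10              # RAW data per year
CERN_PB_PER_YEAR = 20             # all data kept at CERN per year
WORLDWIDE_PB_PER_YEAR = 50        # additionally stored world-wide per year

# Number of events per year if every RAW byte belongs to a 1.5 MByte event.
events_per_year = RAW_PB_PER_YEAR * BYTES_PER_PB // EVENT_SIZE_BYTES

# Total volume per year and its ratio to the RAW volume.
total_pb_per_year = CERN_PB_PER_YEAR + WORLDWIDE_PB_PER_YEAR
total_to_raw_ratio = total_pb_per_year / RAW_PB_PER_YEAR

if __name__ == "__main__":
    print(f"events per year: about {events_per_year / 1e9:.1f} billion")
    print(f"total stored per year: {total_pb_per_year} PB "
          f"({total_to_raw_ratio:.0f}x the RAW volume)")
```

On these assumptions the RAW stream corresponds to several billion events per year, and the derived data and copies bring the total to seven times the RAW volume.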
The Bellagio Declaration
• Conclusions
• The group of experts concluded that the following joint activities would provide a strong basis for the development of a common framework that would help demonstrate the overall power of investments in science.
• Aligning efforts to develop and implement common approaches. Examples could include collaboration in the development of persistent researcher identifiers; extending the accessibility, usability and interoperability of U.S. and European publication and patent databases; development of interoperable and authenticated research datasets as well as common analysis tools; and identification of quantitative and qualitative information concerning the contributions of various actors to science, innovation and economic growth beyond existing and emerging national and transnational data sets.
Travel Safely!