The rationale and methodology of the 2nd SC5 pilot
TRANSCRIPT
Framework
• Computational modelling of atmospheric dispersion of hazardous pollutants
• How can the BigDataEurope Integrator tools help perform computational tasks related to atmospheric dispersion of hazardous pollutants more efficiently?
11-oct.-16 www.big-data-europe.eu
Purposes and means
• Air pollution abatement / early warning / countermeasures
o Anthropogenic emissions: routine, accidental (nuclear, chemical), malevolent (terrorist) – unannounced releases
o Natural emissions (e.g., volcanic eruptions)
• Measurements (from earth or space)
• Mathematical modelling
• Combination of the above → “forward” or “inverse” modelling through “data assimilation”
Input data for dispersion modelling
• Meteorology
• “Source term”: knowledge of the emitted pollutant(s) source(s): location, quantity and conditions of release, timing
• Terrain characteristics, geometry of buildings etc.
• Depending on available input and measurement data: “forward” or “inverse” modelling
Cases of “inverse” computations
• The pollutant emission sources are NOT known: location and / or quantity of emitted substances
o Technological accidents (e.g., chemical, nuclear), natural disasters (e.g., volcanoes): known location, unknown emission
o Unannounced technological accidents (e.g. Chernobyl), malevolent intentional releases (terrorism), nuclear tests
• Inverse “source-term” estimation techniques
Inverse source-term estimation
• Available information:
o Measurements indicating the presence of air pollutant
o Meteorological data for now and recent past
• Mathematical techniques blending the above with results of dispersion models to infer position and strength of emitting source
o Special attention: multiple solutions
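The blending step above can be illustrated with a minimal sketch. It assumes a dispersion model has already produced, for each candidate source, the concentration each station would record per unit of released material. A least-squares fit then gives a release strength per candidate, and the residual ranks the candidates. The site names, the `unit_response` values and the least-squares choice are all illustrative assumptions, not the technique used in the pilot.

```python
def fit_strength(unit_response, measured):
    """Least-squares release strength q >= 0 minimising sum((q*u - m)^2)."""
    num = sum(u * m for u, m in zip(unit_response, measured))
    den = sum(u * u for u in unit_response)
    return max(num / den, 0.0) if den > 0 else 0.0


def residual(unit_response, measured, q):
    """Sum of squared differences between scaled model output and measurements."""
    return sum((q * u - m) ** 2 for u, m in zip(unit_response, measured))


def rank_sources(candidates, measured):
    """Return (name, strength, residual) tuples, best fit first."""
    results = []
    for name, unit_response in candidates.items():
        q = fit_strength(unit_response, measured)
        results.append((name, q, residual(unit_response, measured, q)))
    return sorted(results, key=lambda r: r[2])


# Hypothetical unit-release responses at three monitoring stations.
candidates = {
    "site_A": [1.0, 0.5, 0.0],
    "site_B": [0.0, 0.5, 1.0],
    "site_C": [2.0, 1.0, 0.0],  # same footprint shape as site_A
}
measured = [4.0, 2.0, 0.0]

ranking = rank_sources(candidates, measured)
for name, strength, res in ranking:
    print(name, strength, res)
```

In this toy case site_A and site_C fit the measurements equally well with different strengths, which is the "multiple solutions" issue flagged above: the station data alone cannot separate them.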
Introducing the 2nd BDE SC5 Pilot
• The previously mentioned mathematical techniques require large computing times
• Purpose: fast estimation of source location in emergencies
• Proposed solution: pre-calculate a large number of scenarios, store them, and at the time of an emergency select the “most appropriate”
• BDE will provide the tools to perform this functionality efficiently
Structure of the 2nd BDE SC5 Pilot
• Geographic area: Europe
• Cases of interest: accidents at Nuclear Power Plants
• Weather calculations:
o Re-analysis data for 20 years
o Clustering → “typical” weather circulation patterns
o Downscaling through WRF for the “typical” weather circulation patterns
Structure of the 2nd BDE SC5 Pilot
• Dispersion calculations:
o Calculation of dispersion patterns from NPPs for the above downscaled typical weather circulation patterns
o Dispersion results: gridded and (optionally) at monitoring stations
Structure of the 2nd BDE SC5 Pilot
• In the event of radiation signals at some stations:
o Matching of current and recent weather to closest typical circulation pattern
o From the stored dispersion results pertaining to the matched weather circulation patterns, select the one that most closely matches the monitoring data
o The matched dispersion pattern will reveal the most probable emission source
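The lookup described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: weather is matched to the nearest pattern centroid by Euclidean distance, and stored station concentrations are compared with the readings by cosine similarity, so that the unknown release strength does not affect the match. The data layout (`centroids`, `stored`), the names and both metrics are hypothetical choices, not the pilot's implementation.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def closest_pattern(centroids, current_weather):
    """Step 1: match current weather features to the nearest typical pattern."""
    return min(centroids, key=lambda k: euclidean(centroids[k], current_weather))


def most_probable_source(stored, pattern, station_readings):
    """Steps 2-3: among stored dispersions for that pattern, pick the best match."""
    by_site = stored[pattern]
    return max(by_site, key=lambda s: cosine(by_site[s], station_readings))


# Hypothetical pre-computed data: weather centroids and, per pattern and site,
# the modelled concentrations at three monitoring stations.
centroids = {"pattern_0": [0.0, 0.0], "pattern_1": [10.0, 5.0]}
stored = {
    "pattern_0": {"npp_X": [1.0, 0.0, 0.0], "npp_Y": [0.0, 0.0, 1.0]},
    "pattern_1": {"npp_X": [0.0, 1.0, 0.0], "npp_Y": [0.5, 0.5, 0.0]},
}

pattern = closest_pattern(centroids, [9.0, 4.0])
source = most_probable_source(stored, pattern, [3.0, 2.5, 0.1])
print(pattern, source)
```

Because only pre-computed results are searched, the cost at emergency time is a pair of nearest-neighbour lookups rather than a full dispersion run.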
So far …
• Preliminary clustering studies on a limited amount of re-analysis data (while waiting for full download)
o On the basis of different variables on different pressure levels
• Dispersion calculations for a selected NPP for the revealed weather classes
So far …
• Selected a random date, taken as “true” accident day
• Matching of the “true” day’s weather data with the closest weather class from the clustering procedure
• Dispersion calculations with the weather data of the “true” day
• Comparison of dispersion results based on “true” and matched weather data
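One way the final comparison step could be quantified is sketched below. It computes the overlap between the areas where each gridded dispersion field exceeds a threshold (intersection over union, often called the figure of merit in space). The metric, the threshold and the toy grids are assumptions for illustration; the slides do not say which comparison measure was used.

```python
def cells_above(field, threshold):
    """Set of (row, col) grid cells whose value is at or above the threshold."""
    return {
        (i, j)
        for i, row in enumerate(field)
        for j, value in enumerate(row)
        if value >= threshold
    }


def figure_of_merit_in_space(field_a, field_b, threshold):
    """Intersection over union of the two above-threshold areas (1.0 = identical)."""
    a = cells_above(field_a, threshold)
    b = cells_above(field_b, threshold)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical 3x3 concentration grids: "true" day vs matched weather class.
true_day = [
    [0.0, 1.0, 2.0],
    [0.0, 3.0, 1.0],
    [0.0, 0.0, 0.0],
]
matched_class = [
    [0.0, 0.0, 2.0],
    [0.0, 3.0, 1.0],
    [0.0, 1.0, 0.0],
]

fms = figure_of_merit_in_space(true_day, matched_class, threshold=0.5)
print(fms)
```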
Workflow
[Workflow diagram: component → output]
Batch processing
• ECMWF → Weather reanalysis data (20+ years)
• WRF → Pre-processed weather data
• Clustering → Predominant weather patterns
• DIPCOT → Dispersions for weather patterns, for a number of fixed nuclear sites
Interactive workflow
• Detector → Detection of dangerous release
• Weather service → Recent weather (e.g. 3 days)
• Comparison → Candidate release origins
Data
• ECMWF Reanalysis data
• NCAR-UCAR Archive
o Better compatibility with WPS/WRF
• 20-30 years
o Approx. 6 TB in total
• Grib2 format – again for better compatibility with WRF
o NetCDF via WPS
• Many variables at multiple geopotential heights
Architectural Overview
Possible additions as BDE pilot components:
(1) POSTGIS
(2) DIPCOT
Clustering
• Traditional methods
o Agglomerative hierarchical
o K-means
• Soon to implement
o NN-based feature extraction (e.g. autoencoders, convolution nets)
o (Possibly) followed by k-means
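For readers unfamiliar with k-means, a minimal pure-Python sketch follows. Each weather snapshot is assumed to have been flattened into a numeric feature vector. The deterministic farthest-point initialisation is a choice made here for reproducibility (practical runs would more likely use random restarts or k-means++), and the toy data are hypothetical.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Farthest-point initialisation: start at points[0], then repeatedly
    add the point farthest from its nearest already-chosen centroid."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical 2-feature snapshots forming two well-separated groups.
snapshots = [
    [0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
    [5.0, 5.1], [5.2, 4.9], [4.9, 5.0],
]
labels, centroids = kmeans(snapshots, k=2)
print(labels)
```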
Evaluation
• Incremental
o Clustering outcome
o Closeness of constituent weather within clusters / distance between clusters
o Dispersion characteristics
o Different cluster descriptors for
- Creating cluster-based dispersions
- Matching “real data” to clusters
• Complete
o Compare cluster-based dispersion against “real data” dispersion
- For a number of hypothetical scenarios
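The "closeness within clusters / distance between clusters" criterion can be made concrete with a small sketch. It reports the mean distance of each item to its own cluster centroid and the smallest distance between any two centroids; tight, well-separated clusters give a small first value and a large second one. These two specific measures and the toy data are assumptions for illustration, and the function expects at least two clusters.

```python
import math


def centroid(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_quality(points, labels):
    """Return (mean within-cluster distance, minimum between-centroid distance)."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    centroids = {lab: centroid(members) for lab, members in groups.items()}

    within = sum(dist(p, centroids[lab]) for p, lab in zip(points, labels)) / len(points)

    keys = sorted(centroids)
    between = min(
        dist(centroids[a], centroids[b])
        for i, a in enumerate(keys)
        for b in keys[i + 1:]
    )
    return within, between


# Hypothetical feature vectors with known cluster labels.
points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
labels = [0, 0, 1, 1]

within, between = cluster_quality(points, labels)
print(within, between)
```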
Preliminary results
• Clustering over 2-year period (1986, 1987)
o K=6 clusters
• Multiple geopotentials
• Other variables – notably wind speed – at different heights
• “Visual comparison” against “real data” dispersions
• Incrementally combining more vars
Cluster quality / GHT 500hPa
• 1986, 1987
• Resolution=
• Items (6-hr snapshots) =
• K-means, for K=6
• Geopotential height=500hPa
• Dispersions well differentiated for a specific hypothetical origin
• Real data:
Immediate Future Work
• Feature extraction
o Taking into account multiple variables
o At more heights
• Automatic evaluation
o For a number of pre-selected scenarios
• Dockerisation and inclusion into the BDE architecture