big data analytics large hadron collider · big data analytics large hadron collider manuel martín...


Post on 23-Jun-2020


Page 1: BIG DATA ANALYTICS LARGE HADRON COLLIDER · BIG DATA ANALYTICS LARGE HADRON COLLIDER Manuel Martín Márquez, Senior Project Leader CERN –IT Department. 3 CERN EUROPEAN ORGANIZATION

BIG DATA ANALYTICS

LARGE HADRON COLLIDER

Manuel Martín Márquez, Senior Project Leader

CERN – IT Department


CERN EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH

A WORLDWIDE COLLABORATION


FUNDAMENTAL RESEARCH

WHAT IS THE UNIVERSE MADE OF?

HOW DID IT START?


FUNDAMENTAL RESEARCH

Why do particles have mass?


FUNDAMENTAL RESEARCH

Why is there no antimatter left in the Universe?

What is 95% of the Universe made of?


Manuel Martin Marquez, Intel IoT Ignition Lab – Cloud and Big Data

Munich, September 17th


CERN’s PARTICLE ACCELERATORS AND

EXPERIMENTS


12/6/2019 Document reference 9

CERN Aerial View

World’s largest scientific instrument: 27 km (16.8 miles) circumference, 6000+ superconducting magnets

Emptiest place in the solar system: high vacuum inside the magnets

Hottest spot in the galaxy: lead-ion collisions create temperatures 100 000 times hotter than the heart of the Sun

Fastest racetrack on Earth: protons circulate 11245 times/s (99.9999991% of the speed of light)
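
As a quick consistency check on the lap rate quoted above, the revolution frequency is the proton speed divided by the ring circumference. The sketch below assumes a more precise ring length of 26 659 m (the slide rounds this to 27 km). The constant and function names are illustrative only.

```python
C_LIGHT_M_PER_S = 299_792_458   # speed of light in vacuum (exact by SI definition)
LHC_CIRCUMFERENCE_M = 26_659    # assumption: precise ring length; the slide rounds to 27 km
BETA = 0.999999991              # proton speed as a fraction of c, taken from the slide


def revolutions_per_second(beta=BETA, circumference_m=LHC_CIRCUMFERENCE_M):
    """Number of laps per second for a particle moving at beta * c around the ring."""
    return beta * C_LIGHT_M_PER_S / circumference_m


laps = revolutions_per_second()
print(f"{laps:.0f} laps per second")
```

With these inputs the result rounds to the 11245 laps per second quoted on the slide. Using the rounded 27 km figure instead would give a noticeably lower value, which is why the more precise length is assumed here.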


150 million sensors: control and detection sensors

Massive 3D camera

Capturing millions of collisions per second

CMS Detector


FUTURE CHALLENGES

HIGH-LUMINOSITY LHC

FUTURE CIRCULAR COLLIDER


Post-LHC accelerator projects (80-100 km)


CERN’s BIG DATA AND MACHINE LEARNING

HIGH ENERGY PHYSICS


TRIGGER SYSTEM – FILTERING EVENTS & DATA

The trigger system selects approximately 1000 of the 1.7 billion collisions that occur each second in the centre of the ATLAS detector.
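
To put the two rates in perspective, the arithmetic below works out how selective the trigger has to be. It uses only the two numbers quoted above; the function names are illustrative.

```python
COLLISIONS_PER_SECOND = 1.7e9    # from the slide
EVENTS_KEPT_PER_SECOND = 1_000   # "approximately 1000", from the slide


def rejection_factor(rate_in, rate_out):
    """Number of input events for every event that is kept."""
    return rate_in / rate_out


def acceptance_fraction(rate_in, rate_out):
    """Fraction of input events that survive the selection."""
    return rate_out / rate_in


factor = rejection_factor(COLLISIONS_PER_SECOND, EVENTS_KEPT_PER_SECOND)
fraction = acceptance_fraction(COLLISIONS_PER_SECOND, EVENTS_KEPT_PER_SECOND)
print(f"about 1 event kept per {factor:,.0f} collisions ({fraction:.1e} of the input)")
```

In other words, roughly one collision in 1.7 million is kept, so even a small false-positive rate in the filter would consume a large share of the output budget.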


TRIGGER SYSTEM – FILTERING EVENTS & DATA

Improve the efficiency, flexibility and quality of filtering

Remove the costly and rigid hardware-based model

Reduce false-positive rates

From rule-based systems to deep learning classifiers


TRIGGER SYSTEM – ML PIPELINE

Data Ingestion: complex datasets; 801x19 matrix; files of about 4 TB

Feature Preparation: data format preparation; 19 original features; 14 derived from domain knowledge (HLF)

Model Development: feed-forward DNN; recurrent DNN (GRUs); combined models

Training: hyper-parameter tuning; Scikit-learn, Keras, Spark (parallel); distributed training
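
The "combined model" idea can be sketched without any deep learning framework: a GRU summarises the 801x19 matrix, and that summary is concatenated with the 14 high-level features before a final scoring layer. This is a minimal NumPy forward pass under several assumptions, not the Keras code used in the project. It assumes the 801 rows form a sequence with 19 features per row, picks a hypothetical hidden size, uses random untrained weights, and reduces the output to a single score.

```python
import numpy as np

N_ROWS, N_FEATURES, N_HLF = 801, 19, 14   # shapes quoted on the slide
HIDDEN = 8                                 # hypothetical GRU width


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(rng):
    """Small random weights; a real model would learn these during training."""
    def mat(rows, cols):
        return rng.normal(scale=0.1, size=(rows, cols))
    gru = {gate: (mat(HIDDEN, N_FEATURES), mat(HIDDEN, HIDDEN), np.zeros(HIDDEN))
           for gate in ("update", "reset", "candidate")}
    head = (mat(1, HIDDEN + N_HLF), np.zeros(1))
    return gru, head


def gru_final_state(sequence, gru):
    """Run one GRU layer over a (rows, features) array and return the last hidden state."""
    Wz, Uz, bz = gru["update"]
    Wr, Ur, br = gru["reset"]
    Wc, Uc, bc = gru["candidate"]
    h = np.zeros(HIDDEN)
    for x in sequence:
        z = sigmoid(Wz @ x + Uz @ h + bz)          # update gate
        r = sigmoid(Wr @ x + Ur @ h + br)          # reset gate
        c = np.tanh(Wc @ x + Uc @ (r * h) + bc)    # candidate state
        h = (1.0 - z) * h + z * c                  # blend old state and candidate
    return h


def combined_score(rows, hlf, params):
    """Concatenate the GRU summary with the HLF vector and map it to a score in (0, 1)."""
    gru, (W, b) = params
    summary = gru_final_state(rows, gru)
    logit = W @ np.concatenate([summary, hlf]) + b
    return float(sigmoid(logit)[0])


rng = np.random.default_rng(0)
params = init_params(rng)
event = rng.normal(size=(N_ROWS, N_FEATURES))   # synthetic stand-in for one event
hlf = rng.normal(size=N_HLF)                    # synthetic high-level features
print(combined_score(event, hlf, params))
```

The point of the sketch is the data flow: the low-level matrix and the domain-knowledge features enter through separate branches and only meet at the final layer. Training, hyper-parameter search and distribution over a cluster are left out.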


TRIGGER SYSTEM – SCORING

Strict latency constraints: target level-1 trigger

Larger networks with longer latency: neutrino and astronomical experiments, industrial applications, etc.

FPGAs: provide huge flexibility and allow us to cope with the required response time

Performance depends on how well you take advantage of them


QUANTUM COMPUTING – ANOTHER WEAPON?

Can Quantum Computing and Q-ML help? Quantum nearest neighbours, clustering, PCA and SVM

But still hard:

Getting access to emulators and simulators

Getting access to real devices, benchmarking, comparing results

Engineering aspects of QC installation, such as cryogenics and materials science

Use cases:

Track reconstruction in dense environments

Reconstructing neutrino interactions

Optimizing Grid workflows


CERN’s BIG DATA AND MACHINE LEARNING

CERN CONTROL SYSTEMS AND IOT


CERN ACCELERATOR LOGGING SERVICE

More than 2 million signals produce more than 2.5 TB of data per day.

From scalars to arrays of up to 4 million elements.

Data diverse in nature:

Accelerator running modes

Equipment statuses

Magnet currents

Cryogenics temperatures

Particle beam positions
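
For a feel of what these volumes mean for the storage back end, the daily total can be converted into an average sustained rate and an average volume per signal. This is rough arithmetic only: both slide figures are lower bounds, and the terabyte is assumed to be decimal (10^12 bytes).

```python
SIGNALS = 2_000_000              # "more than 2 million", lower bound from the slide
BYTES_PER_DAY = 2.5e12           # 2.5 TB per day, assuming decimal terabytes
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_mb_per_s(bytes_per_day=BYTES_PER_DAY):
    """Average sustained ingest rate in megabytes per second."""
    return bytes_per_day / SECONDS_PER_DAY / 1e6


def average_mb_per_signal_per_day(bytes_per_day=BYTES_PER_DAY, signals=SIGNALS):
    """Average daily volume per signal in megabytes."""
    return bytes_per_day / signals / 1e6


print(f"{average_rate_mb_per_s():.1f} MB/s sustained on average")
print(f"{average_mb_per_signal_per_day():.2f} MB per signal per day on average")
```

The averages hide a very uneven distribution: a scalar status flag and a 4-million-element array are both a single "signal".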


CERN ACCELERATOR LOGGING SERVICE

Control I-IoT data at CERN is dispersed across several data silos

The current system is optimized for real-time serving but not for data exploration:

Finding hidden correlations

Anomaly detection

Post-mortem analysis

Root cause analysis (RCA)

Intelligent alarm systems

Etc.
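
As an illustration of the kind of exploration listed above, anomaly detection on a single logged signal can start from something as simple as a rolling z-score. The sketch below is a generic technique, not the method used at CERN. The window size, the threshold and the synthetic signal are all assumptions chosen for the example.

```python
from collections import deque
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=20, threshold=4.0):
    """Return indices whose value deviates from the preceding window by more than threshold sigmas."""
    history = deque(maxlen=window)
    flagged = []
    for i, v in enumerate(values):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(v - mu) > threshold * sigma:
                flagged.append(i)
        history.append(v)   # the window always holds the most recent readings
    return flagged


# Synthetic stand-in for a logged signal: a steady reading with small ripple and one spike.
signal = [100.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(200)]
signal[150] += 5.0

print(rolling_zscore_anomalies(signal))
```

On this synthetic signal only the injected spike at index 150 is flagged. A real deployment would have to run such checks across many signals at once, which is where a platform built for exploration rather than real-time serving becomes relevant.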


ENHANCE EXPLORATION - AUTONOMOUS TECH

Oracle Autonomous Strategy: automated creation of required resources, administration, patches, backups, memory handling

Flexibility in provisioning and scaling: easy solution prototyping; move fast from PoC to production

Cost-effective solutions: hybrid systems integrating DB and external object storage


DATA INTERFACE – NOTEBOOKS

The human factor

SWAN

Jupyter notebooks

Integrated with CERN

GPUs – Cloud

Oracle collaboration
