
Page 1: CLAS12 Reconstruction and Analysis Framework

V. Gyurjyan
Thomas Jefferson National Accelerator Facility

Page 2: CLAS Computing Challenges and Complications

• Growing software complexity
  • Compilation and portability problems
  • Maintenance and debugging difficulties
  • Scaling issues
• Author volatility
  • Difficulty enforcing standards
  • Drop-out authors
• OS uniformity problems
  • Difficulty meeting the requirements of collaborating universities
• Language uniformity problems
  • Contributions confined to a defined high-level language: Tcl, C, C++, Fortran
• Steep learning curve to operate the software
  • Technology uniformity requirements
  • User/contributor qualification
  • Coherent documentation for ever-growing software applications
• Software fragmentation
  • Decreased user base
  • Decreased contributions
  • Decreased productivity

Page 3: CLAS12 Computing Requirements

• Challenges of the physics data processing environment
  • Long lifetime, evolving technologies
• Complexity through simplicity
  • Build complex applications from small and simple components
• Enhanced utilization, accessibility, contribution and collaboration
  • Reusability of components
  • On-demand data processing
  • Location-independent resource pooling
  • Software agility
• Multi-threading
  • Effective utilization of multicore processor systems
• Production processing
  • Dynamic elasticity
  • Ability to expand computing power with minimal capital expenditure
  • Utilization of the IT resources of collaborating universities
  • Ability to take advantage of available commercial computing resources

Page 4: How to Increase Productivity and Physics Output?

• Increase the number of contributors
• Increase the number of users
• Decrease development time

How?

• Design software applications from reusable building blocks (the LEGO model)
  • Keep LEGO modules as simple as reasonably possible
  • Limit the number of lines of code per module
• Relax programming-language requirements
• Separate the software programmer from the physicist, i.e. separate software programming from the description of an algorithm
  • The physicist focuses on the physics algorithm
  • The programmer implements the described physics algorithm
• Rapid prototyping, deployment and testing cycles

Page 5: Ideas Adopted From GAUDI

• Clear separation between data and algorithms
• Services encapsulate algorithms
• Services communicate data
• Three basic types of data: event, detector, statistics
• Clear separation between persistent and transient data

No code or algorithmic solution was borrowed.

Page 6: Review: Software Architectures

Application characteristic | Traditional OOA                       | SOA
Modularity                 | Through OO                            | Through processes (SaaS)
Composition                | Monolithic                            | Monolithic/distributed
Coupling between modules   | Strong                                | Loose
Language uniformity        | Required                              | Not required
OS uniformity              | Required                              | Not required
Fault tolerance            | Less (shared address space and stack) | More (separate processes)
Testing and maintenance    | Difficult                             | Relatively simple
Updates and extensions     | Difficult                             | Simple (can be done in steps)

Page 7: Review: Large-Scale Deployments

Architecture    | Batch and Grid | Cloud
Traditional OOA | Yes            | Difficult
SOA             | Yes            | Simple/native

Page 8: Meeting Requirements

Functionality                                 | Traditional Computing | Cloud Computing
Reusability                                   | Yes                   | Yes
Legacy/foreign software component integration | Hard                  | Yes
On-demand data processing                     | Hard                  | Yes
Location-independent resource pooling         | Hard                  | Yes
Dynamic elasticity                            | Hard                  | Yes
Software agility                              | Hard                  | Yes
Utilization of off-site computing resources   | Hard                  | Yes

Page 9: Advanced Computing Emerging Trends

[Figure: Google Trends search traffic for "Cloud Computing" vs. "Grid Computing"; the scale is based on the average worldwide traffic of cloud computing across all years.]

• Helix Nebula project at CERN (2 Mar 2012, V3.co.uk, digital publishing company). CERN has announced that it is developing a pan-European cloud computing platform with two other leading scientific organizations, which will take advantage of the increased computing power cloud systems offer to help scientists better analyze data. Interview with Frédéric Hemmer, head of CERN's IT department: "CERN's computing capacity needs to keep up with the enormous amount of data coming from the Large Hadron Collider (LHC) and we see Helix Nebula, the Science Cloud, as a great way of working with industry to meet this challenge."

• Magellan project at the US Department of Energy. The DOE will spend $32 million on a project that deploys a large cloud computing test bed with thousands of Intel Nehalem CPU cores and explores the use of commercial cloud offerings from Amazon, Microsoft and Google for scientific purposes.

Page 10: SOA Defined

• SOA is a way of designing, developing, deploying, and managing software systems characterized by coarse-grained services that represent reusable functionality.

• Service consumers compose applications or systems using the functionality provided by these services through standard interfaces.

• An architecture (NOT a technology) based on well-defined, reusable components: services.
  • Services are loosely coupled.
  • Services have a functional context that is agnostic to the logic of any composed application.
  • Agnostic services participate in multiple algorithmic compositions.
  • Application design is based on available services.

Page 11: Computing Model and Architecture Choice

• SOA is the key architectural choice for ClaRA
  • The major data-processing components are implemented as services
• Multilingual support
  • Services can be written in C++, Java and Python
• Physics application design/composition is based on services
• Supports both traditional and cloud computing models
  • Single-process as well as distributed application design modes
  • Centralized batch processing
  • Distributed cloud processing
• Multi-threaded

Page 12: Attributes of ClaRA Services

• Communicate data through shared memory and/or a pub-sub messaging system
• Well-defined, easy-to-use, data-centric interface (see the sketch below)
• Self-contained, with no dependencies on other services
• Loosely coupled: coupled only through the data they exchange
• Always available, but idle until a request arrives
• Location transparent
  • Services are defined by unique names and are discovered through discovery services
• Existing services can be combined into composite services or applications
• Services can be combined in a single process (run-time environment), communicating data through shared memory (the traditional computing model)
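A minimal Java sketch of what such a data-centric service interface could look like; all names here (ServiceEngine, DataEnvelope) are illustrative assumptions, not the actual ClaRA API:

public interface ServiceEngine {

    // Minimal stand-in for the transient data envelope described on page 22:
    // a payload plus metadata such as the mime-type of the data.
    final class DataEnvelope {
        public final String mimeType;
        public final Object data;
        public DataEnvelope(String mimeType, Object data) {
            this.mimeType = mimeType;
            this.data = data;
        }
    }

    // A service is coupled to the outside world only through the data it
    // receives and returns, never through references to other services.
    DataEnvelope execute(DataEnvelope input);

    // Unique name used for registration and discovery.
    String getName();
}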

Page 13: More on the ClaRA Environment

• User-centric interface
  • ClaRA services are accessed with simple and pervasive methods.
  • The ClaRA interface tries not to force users to change their programming habits, e.g. programming language, compiler or OS.
  • Clients of ClaRA applications or clouds that must be installed outside the ClaRA environment are thin and lightweight.
• On-demand service provisioning
  • ClaRA resources and services are available to users on demand. Users can customize, configure and control services with every request.

Page 14: CLAS12 Physics Data Processing

[Diagram: CLAS12 physics data processing flow. Raw data from the detector and simulated data from simulation feed the Reconstruction Services, which rely on Geometry, Calibration and MFM Services backed by a database. Reconstruction output flows to the DST and Tuple Generation Services (producing DSTs and tuples) and on to the Physics Analysis Services, with Monitors and Displays attached along the chain.]

Page 15: Programming Paradigm

[Diagram: "data" and "algorithm" shown as separate entities — algorithms are encapsulated in services, and services exchange data.]

Page 16: ClaRA Design Architecture

[Diagram: ClaRA layered design, connected by the PDP service bus (pub-sub and/or shared memory).]

• Service layer: the service inventory (SaaS, IaaS, DaaS), with administration and registration services
• Orchestration layer
  • Rule invocation
  • Identification
  • Filtration
  • Subscription
• Control layer
  • Data flow control
  • Load balancing
  • Error recovery
  • Managing errors and exceptions
  • Security, validation and auditing

Page 17: ClaRA Containers

• Cloud Control Node (cloud controller)
  • Service bus (pub-sub server)
  • Registration
  • Discovery
  • Administration
  • Governing
• Computing node
  • A container is a process: a single Java container and a single C++ container per node.
  • Java container (SaaS), acting as the main container
    • Administration: service deployment, service removal, service recovery
    • Monitoring: CPU (Unix only)
  • C++ container (SaaS)
    • Administration: service deployment, service removal, service recovery
    • Monitoring

Page 18: ClaRA Cloud Formation

[Diagram: ClaRA cloud formation. Each computing node (1..N) runs its own service bus plus a Java container and a C++ container hosting services 1..N, with transient data storage attached; the per-node service buses are interconnected to form the cloud.]

Page 19: ClaRA Cloud Interactions

[Diagram: ClaRA cloud interactions. Clouds 1..N, each consisting of a cloud control node and computing nodes 1..N, communicate with one another through their cloud control nodes.]

Page 20: ClaRA SaaS Implementation

[Diagram: ClaRA SaaS implementation. User software (a shared object) is wrapped by an engine interface; a message-processing layer connects the engine to the service bus.]

Page 21: Service Coupling

• A service's functional context is independent of outside logic.
• Contract-to-functional coupling between services:
  • A service has a name.
  • A service receives data and sends data.
• Consumer-to-contract coupling between consumers and services:
  • The user/consumer sends data to the named service.
  • Sending the data triggers execution of the receiving service (see the sketch below).

[Diagram: a consumer sending data to "ServiceA".]
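A self-contained, in-process Java sketch of this consumer-to-contract coupling; the map-based "bus" and all names are illustrative stand-ins for the real pub-sub system:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class CouplingSketch {

    interface Engine { String execute(String data); }

    // Illustrative stand-in for the pub-sub service bus: service name -> engine.
    static final Map<String, Engine> BUS = new ConcurrentHashMap<>();

    // The consumer knows only the service name and the data it sends;
    // delivery of the data is what triggers execution of the receiver.
    static String send(String serviceName, String data) {
        return BUS.get(serviceName).execute(data);
    }

    public static void main(String[] args) {
        BUS.put("ServiceA", data -> "processed(" + data + ")");
        System.out.println(send("ServiceA", "rawEvent"));  // processed(rawEvent)
    }
}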

Page 22: Interface: The Transient Data Envelope

A transient data envelope wraps a data object with two groups of metadata:

Meta data   | Type
MimeType    | String/Integer
Description | String
Version     | Integer
Unit        | String

Govern data | Type          | Access
Status      | String        | R/W
Control     | Property List | R only
Customer    | String        | R only
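A hedged Java sketch of this envelope; the field names follow the table above, while the class name and the concrete Java types are assumptions:

import java.util.List;

public final class TransientDataEnvelope {
    // Meta data
    public String mimeType;      // "String/Integer" in the table; String here
    public String description;
    public int version;
    public String unit;

    // Govern data
    public String status;              // R/W: also carries error/status information
    public final List<String> control; // property list, read-only for services
    public final String customer;      // read-only for services

    public final Object data;          // the wrapped data object

    public TransientDataEnvelope(Object data, List<String> control, String customer) {
        this.data = data;
        this.control = control;
        this.customer = customer;
    }
}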

Page 23: Service Abstraction

• Technology information (hidden)
• Programmatic logic information (hidden)
• Functional information (exposed through service contract metadata)
• Quality-of-service information (available from the platform registration services)

Page 24: Service Types

• Entity services: generic, reusable
• Utility services: legacy systems
• Composite services
  • Primitive compositions: self-governing service compositions, task services
  • Complex compositions: controlled aggregate services

Page 25: Service Composition

• Primitive composition
  • Represents message exchanges across two or more services.
  • Requires a composition initiator (see the sketch below).
• Complex composition
  • Orchestrated task services, seen from the outside as a single service and managed by the orchestration layer.
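A small Java sketch of a primitive composition: an initiator pushes data through a named chain of services. The service names echo page 31; everything else is an illustrative assumption:

import java.util.List;
import java.util.function.UnaryOperator;

public final class CompositionSketch {

    // Each service transforms a payload (a stand-in for the data envelope).
    record Service(String name, UnaryOperator<String> engine) {}

    // The composition initiator feeds each service's output to the next one.
    static String run(List<Service> composition, String input) {
        String data = input;
        for (Service s : composition) {
            data = s.engine().apply(data);
        }
        return data;
    }

    public static void main(String[] args) {
        List<Service> chain = List.of(
            new Service("DCClusterFinder",        d -> "clusters(" + d + ")"),
            new Service("DCRegionSegmentFinder",  d -> "segments(" + d + ")"),
            new Service("DCTrackCandidateFinder", d -> "tracks(" + d + ")"));
        System.out.println(run(chain, "rawHits"));
    }
}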

Page 26: Service Discovery

[Diagram: the SOA discovery triangle. A service provider registers with the service registry; a service consumer discovers the service through the registry; consumer and provider then exchange messages directly.]

Page 27: Multi-Threading

• Only one process is active on a ClaRA cloud node:
  • A single Java container (JVM) per node.
  • A single C++ container per node.
• Service containers run in their own threads.
• A service container executes the service engines it contains in separate threads (see the sketch below).
• A service engine must be thread-safe if it is to run in multi-threaded mode.

Advantages
• Small memory footprint, lower probability of missing the L1/L2 caches, less RAM access, better performance.
• Avoids the bookkeeping complexity of multi-processing, where users must keep track of input/output data distribution and data reconsolidation.
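A hedged Java sketch of a container running one thread-safe (here, stateless) engine over many events within a single JVM process; names and structure are illustrative:

import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.UnaryOperator;

public final class ContainerSketch {

    public static void main(String[] args) {
        // A stateless, hence thread-safe, engine.
        UnaryOperator<String> engine = event -> "reconstructed(" + event + ")";

        // One worker thread per core, all inside the single JVM on the node.
        ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

        for (String event : List.of("evt1", "evt2", "evt3", "evt4")) {
            pool.submit(() -> System.out.println(engine.apply(event)));
        }
        pool.shutdown();
    }
}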

Page 28:

[Diagram: event services feeding chains of services (S) on each of nodes 1..N, with an orchestrator (O) coordinating each chain.]

Page 29: Service Deployment Graphical Interfaces

Page 30:

[Diagram: end-to-end CLAS12 deployment. In the online cloud, the CLAS12 detector electronics, trigger and slow controls feed ET and transient data storage, which serve online EB (event building), online monitoring, event visualization and online calibration services, supported by geometry calibration and run-conditions services, the calibration and conditions databases, and permanent data storage; an online application orchestrator plus cloud control, service registration and service control manage the cloud. Offline, a physics data processing application orchestrator and a cloud scheduler span the offline JLAB cloud and offline university clouds 1..n, each with Geant4/GEMC simulation, EB, DST/histogram/visualization, analysis, calibration, geometry calibration and run-conditions services, backed by calibration and conditions databases and permanent data storage.]

Page 31: Event Reconstruction

[Diagram: event reconstruction services deployed on nodes 1..n (JVM and C++ containers), coordinated by an event reconstruction application orchestrator, with database services reached over pub-sub; stages communicate through shared-memory transient data storage and write to permanent storage.]

• Forward Tracking: DCClusterFinder → DCRoadsInClusterFinder → DCRegionSegmentFinder → DCTrackCandidateFinder → FMTHitsToTrackSegment → ForwardKalmanFilter
• Central Tracking: BSMTHitFilter → BSMTMergeHits → PointHitFinder → HoughTransform → HelixTrackEstimator → TrackFitter
• Calorimeter: ECStripsCreator → ECHitsFinder
• ForwardTOFReconstruction → ParticleIdentification (FTOF PID)

Page 32: [Image-only slide.]

Page 33: [Image-only slide.]

Page 34: Tracking in a Single Cloud Node

[Plot: "CLARA SOT performance in an Intel Xeon 2x6 Westmere processor based node". Inverse processing time per event, 1/T_p [1/ms], vs. number of cores (0-13). Linear fit: 1/T_p = 0.00758 x + 0.00352, R² = 0.997.]
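As a back-of-the-envelope reading of the fit (an interpretation, not stated on the slide):

\[ \frac{1}{T_p}(n) \approx 0.00758\,n + 0.00352\ \mathrm{ms^{-1}} \quad\Rightarrow\quad \frac{1}{T_p}(12) \approx 0.094\ \mathrm{ms^{-1}}, \]

i.e. T_p ≈ 10.6 ms per event at 12 cores, roughly consistent with the ~11 ms/event tracking time in the profiling table on page 40.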

Page 35: Tracking in a 17-Node RU Cloud

[Plot: "CLARA SOT performance in a cluster". Inverse processing time per event, 1/T_p [1/ms], vs. number of nodes (0-18). Linear fit: 1/T_p = 0.0845 x + 0.0963, R² = 0.994.]

Page 36: Challenges

• Communication latency
  • Careful analysis of the criticality of each service; introduce built-in tolerance for variations in network service response times.
• Data integration and distribution
• Increased author and user pools
  • Management and administration: strict service canonization rules.
• Workloads of different clients may introduce "pileups" on a single service.
• Service and cloud governance
• Network security
  • Client authentication and message encryption

Page 37: Summary and Conclusion

• A multi-threaded analysis framework based on SOA
• PDP applications are built from specialized services
  • Small, independent
  • Easy to test, update and maintain
  • Building and running a PDP application does not require computer-science skills
• A list of applications has been developed using the framework:
  • Charged-particle tracking (central, forward)
  • EC reconstruction
  • FTOF reconstruction
  • PID
  • HTCC
  • PCAL
  • Detector calibration
  • Event building
  • Histogram services
  • Database application (geometry, calibration constants and run-conditions services)

ClaRA supports both traditional and cloud computing models, and if need be we are ready for cloud deployment.

Page 38: Service Bus Performance Measurements

Control messages only: < 0.03 ms/message

Page 39: Service Bus Performance Measurements (continued)

[Plot-only slide.]

Page 40: Event Reconstruction Profiling

Component       | ms per event per thread          | Source
Track           | 11.2 (0.03 ms/MPI × 5 = 0.15 ms) | measured
Vertex          | 5                                | estimate
EC              | 3.1 (0.03 ms/MPI × 2 = 0.06 ms)  | measured
PCAL            | 3                                | estimate
HTCC, LTCC, TOF | 10                               | estimate
Total           | 32.3                             | estimate

• Extrapolate the trajectory from hit to hit, along with the uncertainty matrix of the track parameters.
• At each hit the uncertainty decreases and the track parameters are updated (see the update equations below).
• Extrapolation through the magnetic field.
• 3 Kalman-filter iterations: end→start, start→end, end→vertex.
• Magnetic-field extrapolation step size: 2 cm in z and r, every 0.25°.
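For reference, the standard Kalman measurement update that "decreases the uncertainty and modifies the track parameters" at each hit has the form (textbook notation, not taken from the slides):

\[
\begin{aligned}
K_k &= P_{k|k-1} H_k^{\mathsf T}\left(H_k P_{k|k-1} H_k^{\mathsf T} + R_k\right)^{-1},\\
x_{k|k} &= x_{k|k-1} + K_k\left(m_k - H_k x_{k|k-1}\right),\\
P_{k|k} &= \left(I - K_k H_k\right) P_{k|k-1},
\end{aligned}
\]

where x is the track-parameter vector, P its covariance (the uncertainty matrix), m_k the measured hit, H_k the projection to measurement space and R_k the hit-resolution covariance.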

MPI performance (1 Gigabit Ethernet):

Within a node       | 0.03 ms
Network, 10K data   | 0.1 ms
Network, >100K data | 100 MB/s

Page 41: Grid vs. Cloud

• Cloud computing overlaps with many existing technologies, such as Grid, Utility, and distributed computing in general.

• The problems are mostly the same: there is a common need to manage large computing resources to perform large-scale data processing.

Page 42: Grid vs. Cloud

[Figure: "Service Oriented" computing taxonomy, by Ian Foster et al.]

Page 43: How Different Is the Cloud?

• It is elastically scalable.
• It can be encapsulated as an abstract entity that delivers different levels of service to users outside the cloud.
• Services can be dynamically configured and delivered on demand.
• Major differences compared to the Grid:
  • Resources in the cloud are shared by all users at the same time, in contrast to dedicated resources governed by a queuing system (batch).
  • MPI (message passing interface) is used to achieve intra-service communication.
  • Services are loosely coupled, transaction oriented, and always interactive, as opposed to batch-scheduled as they currently are in Grids.

Page 44: Collaboration-Wide Utilization and Contribution

Application users:
• Users manual
• Graphical-interface client bundles
• Application design graphical tool
• Service discovery and monitoring tool
• Wiki how-tos
• Workshops and conferences

Service programmers:
• ClaRA developers manual
• SVN
• JavaDoc, Doxygen
• Wiki forums
• Bug-tracking tools
• ClaRA developers blog
• Service deployment and management tools

Framework developers:
• Release notes
• Discussion forums
• Maintenance and request tracking

Page 45: Transitioning from Development to Operations

• Many test deployments on local and remote desktops and laptops.
• ClaRA operational platforms at RU, ODU and JLAB.
• Deployment of the ClaRA platform on the batch-queue system at JLAB is in progress with the help of the IT department.

Page 46: ClaRA Deployment on the JLAB Batch Farm

[Diagram: the ClaRA cloud control node (a CLAS12 private resource) hosts cloud control, service registration, service control, event services and the PDP application orchestrator, together with run-conditions and geometry calibration services backed by the calibration and conditions databases and permanent data storage. Each farm node (1, 2, ...) runs a service bus with Java and C++ service containers.]

Page 47: Possible Questions

• Grid vs. Cloud
  • Resources in the cloud can be dynamically allocated and released (elasticity), as opposed to the Grid, where fixed, dedicated resources are governed by a batch queuing system.
  • Services are shared by all users at the same time; they are transaction oriented and always interactive, as opposed to batch-scheduled as they currently are in Grids.
  • In the cloud, services can be dynamically configured and delivered on demand.
• How does ClaRA provide a cloud computing environment?
  • A software application can be presented as a service (cloud SaaS support).
  • IaaS (infrastructure as a service): registration, discovery, management and administration services.
• Multi-threading vs. multi-processing
  • Small memory footprint, lower probability of missing the multilevel core/chip caches, less RAM access, better performance.
  • Multi-processing adds bookkeeping complexity.

Page 48: Possible Questions (continued)

• Service communication latency

  • Messages < 1K: 0.03 ms within the same node, 0.043 ms over the 1 Gbit network.
  • Messages of 10K: 0.05 ms within the same node, 0.1 ms over the 1 Gbit network.
  • Messages over 100K: 100 MByte/s over the 1 Gbit network.

• What do you mean by loose coupling?
  • Well-defined, data-centric interface.
  • Self-contained, with no dependencies on other services.
  • Coupled through communicated data (the transient data envelope).
• Error propagation and fault tolerance
  • Carried by the status field of the transient data envelope.
• What communication protocol is used between services?
  • Data are communicated through shared memory and/or the pub-sub messaging system.

Page 49: Possible Questions (continued)

• What do you mean by location-transparent resource allocation?
  • A service is defined by a unique name and is discovered through the discovery services of the ClaRA platform, without any knowledge of where the service is physically deployed and running.
• What do you mean by on-demand data processing?
  • ClaRA resources and services have temporal continuity and are available to users on demand. Users can customize, configure and control services with every request.
• How easy would it be to deploy ClaRA on a commercial cloud?
  • The ClaRA platform is written in pure Java, making the IaaS (infrastructure as a service) layer easily deployable and portable. The C++ container is self-contained and has been ported to many flavors of Unix and macOS.