Myths and Realities of Grid Computing

Presented by: Gilles Tourpe

Technical Director, EMEA

Myths

© Platform Computing Inc. 2004

Grid is …

[Figure: SLA, with segments labeled service wait time and run time per service]


Grid is …

If you ask me to give you a very clear and simplistic definition of grid, I would say grid computing is distributed computing involving multiple sites to integrate and support applications and support collaboration.


Vision:

A single (virtual) computer to run your business

How?

By delivering products that support all types of workloads, applications, standards, resources and computing environments with global enterprise-level scalability and with a common, virtualized infrastructure

How to complete this vision?


Crossing the Finance Application Chasm

[Figure: finance applications plotted by data demand against compute demand, with the grid sweet spot marked]

• Retail Banking: Data Mining (fast interconnect, data-aware scheduling)
• Exotics: Risk Management (load balancing)
• Front Office: Pricing & Hedging (low-latency task distribution)
• Credit Risk (fast interconnect - InfiniBand, scalable I/O storage, data-aware scheduling)


Differences between Grid Computing and Distributed Computing

Power without control is nothing. Grid power of 1000 machines needs to be managed and steered towards business objectives in a systematic, deterministic and predictable fashion

Grid Computing = Distributed Computing + resource and workload management in terms of:

• Resource virtualization
• Resource ownership and sharing
• Dynamic resource allocation
• Resource monitoring, control, failover, and troubleshooting
• Guaranteed SLA (Service Level Agreement) management
• Workload scheduling and prioritization
• Load balance
• High reliability and availability, robustness, resilience, and failover
• Performance and scalability in a large grid
• Workload execution monitoring, control, and troubleshooting
• Resource and workload usage collection, reporting, and accounting
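As a rough illustration of two items on this list, workload scheduling and prioritization plus load balancing, the sketch below dispatches tasks in priority order, each to the least-loaded host. The `Task` and `dispatch` names, the host names, and the idea that each task carries a known cost estimate are assumptions made for illustration; this is not a description of how any particular grid product schedules work.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Task:
    priority: int                        # lower number = more urgent
    name: str = field(compare=False)
    cost: float = field(compare=False)   # estimated CPU-hours (assumed known)


def dispatch(tasks, hosts):
    """Assign tasks in priority order, each to the currently least-loaded host."""
    queue = list(tasks)
    heapq.heapify(queue)
    # Min-heap of (current load, host name): the head is the least-loaded host.
    loads = [(0.0, h) for h in sorted(hosts)]
    heapq.heapify(loads)
    placement = {}
    while queue:
        task = heapq.heappop(queue)
        load, host = heapq.heappop(loads)
        placement[task.name] = host
        heapq.heappush(loads, (load + task.cost, host))
    return placement
```

A real workload manager would also account for failover, SLAs and resource ownership; the sketch covers only the ordering and placement decisions.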


What are the supporting concepts and technologies?

A Virtualized IT environment

Grid Virtualization Strengths

Pool (Virtualize) heterogeneous resources

Allocate and manage resources based on Policy

Server Virtualization Strengths

Partition server into virtual servers that provide a secure “container” for applications.

Data Virtualization Strengths

Data access transparency

.NET Application Virtualization Strengths

Rapid development

Improved operation & maintainability

Agile architecture


Legacy Stovepipes

[Figure: Applications A, B and C running on separate servers (Solaris, Windows, zLinux, Linux, AIX) and databases (Oracle, DB2, SQL). Average utilization rates shown: 40%, 2-5%, 10%, 10%, 52%, 60%, 10%. Durations shown: 15 hours, 8 hours, 2 hours.]

Grid Computing is about virtualizing and sharing resources

Decoupling applications from infrastructure




[Figure: a scheduler distributes application workload(s) to CPUs across Solaris, Windows, zLinux, Linux and AIX servers with Oracle, DB2 and SQL databases; results are returned and integrated into the application(s).]

Collaboration & Resiliency


Enterprise Grid Context

[Figure: layers labeled Users, Applications, Application Managers, Workload Managers and Enterprise Resources]

• "Acceleration" applications: HPC, MDA, EDA, CAE, Risk
• Business applications: BI, .NET, DBs, ERP, CRM, VMs
• Workload types: Batch, Process Flow, SOA, Parallel
• Grid Management Console
• Windows 2003 Server resource pool


Is this a myth?

Shared application interface, scheduling system, and virtual resource pool

Enables sharing and reuse of knowledge, data, resources, and analytics engines

Shared resource pool is dynamically partitioned into “virtual clusters”

Application interface and scheduling system are now commercially supported, fully documented software

Low-cost off-the-shelf hardware replaces expensive SMP boxes

Multi-asset models can now be run by one user, able to access any analytics engines and resources needed, governed by priority-driven policies


Policy Evolution – Supporting the spectrum of ownership vs. sharing

Model 1: Limits

Put hard limits around each consumer

Virtualize the resources instead of dedicating fixed resources

Guaranteed capacity in event of failures

[Figure: consumers A, B and C shown under the Silo Model, the Enterprise Sharing Model and the Utility Computing Model]

Model 2: Borrow/Lend

Each consumer has “owned” capacity

Each consumer can specify lend and borrowing limits around that owned capacity
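One way to picture the borrow/lend model is the sketch below, in which each consumer first uses its own capacity, idle owned slots are offered to a shared pool up to the lend limit, and other consumers draw from that pool up to their borrow limit. The `Consumer` fields, the slot unit, and the first-come, first-served order among borrowers are all assumptions for illustration, not the product's actual policy engine.

```python
from dataclasses import dataclass


@dataclass
class Consumer:
    name: str
    owned: int         # slots the consumer owns
    lend_limit: int    # most owned slots it is willing to lend out
    borrow_limit: int  # most extra slots it may borrow
    demand: int        # slots it currently wants


def allocate_borrow_lend(consumers):
    """Return {name: slots} after idle owned capacity is lent to borrowers."""
    # Every consumer is served from its own capacity first.
    alloc = {c.name: min(c.demand, c.owned) for c in consumers}
    # Capacity offered to the pool: idle owned slots, capped by the lend limit.
    pool = sum(min(c.owned - alloc[c.name], c.lend_limit) for c in consumers)
    for c in consumers:  # borrowers served in list order (an assumption)
        want = min(c.demand - alloc[c.name], c.borrow_limit)
        grant = min(want, pool)
        alloc[c.name] += grant
        pool -= grant
    return alloc
```

For example, a consumer owning 10 slots, needing 2 and lending at most 4 leaves 4 slots for a neighbour that owns 5 but needs 12.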

Model 3: Fairshare

Consumer has % of capacity at each level relative to others

“Owned” capacity is 0 for consumer

Capacity allocated based on need and constrained by shares
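A minimal sketch of the fairshare idea, assuming a single level of shares: capacity is split in proportion to each consumer's share, no consumer receives more than it needs, and whatever a satisfied consumer leaves over is redistributed among the rest. The function name, the dictionary inputs and the single-level hierarchy are assumptions for illustration.

```python
def fairshare(capacity, shares, demand):
    """Split capacity by share weight, never giving a consumer more than it needs."""
    alloc = {name: 0.0 for name in shares}
    active = {n for n in shares if demand[n] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total = sum(shares[n] for n in active)
        satisfied = set()
        handed_out = 0.0
        for n in active:
            offer = remaining * shares[n] / total   # this round's proportional offer
            need = demand[n] - alloc[n]
            give = min(offer, need)
            alloc[n] += give
            handed_out += give
            if give >= need:
                satisfied.add(n)
        remaining -= handed_out
        if not satisfied:
            break  # everyone took the full offer, so capacity is exhausted
        active -= satisfied
    return alloc
```

Nobody "owns" anything here: a consumer with no demand gets nothing, and its share is absorbed by the others.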

Model 4: Economic

Consumers specify budget $

Resource usage has cost ($/cpu-hr, $KB/hr)

System optimizes budget allocations & resource usage driven by application SLA (determined in WLM)
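The accounting side of the economic model can be sketched as below: usage is priced from per-unit rates and a workload is admitted only while the consumer's budget covers it. The function names and the rates are hypothetical, and treating the second rate as a storage charge is an assumption; the sketch does not attempt the optimization step the slide mentions.

```python
def usage_cost(cpu_hours, kb_hours, cpu_rate, storage_rate):
    """Cost of a workload given per-unit rates (both rates are made-up inputs)."""
    return cpu_hours * cpu_rate + kb_hours * storage_rate


def within_budget(budget, spent, cost):
    """True if the consumer can still afford a workload with the given cost."""
    return spent + cost <= budget
```

With a rate of 0.5 per CPU-hour, a 100 CPU-hour job costs 50, so it fits a budget of 100 with 20 already spent but not a budget of 60.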

Silo Model:
- 100% ownership of resources by consumers
- Capped SLA guarantees when peak reached

Enterprise Sharing Model:
- Some minimum ownership of resources
- Ability to share from pool or others

Utility Computing Model:
- 0% ownership of resources by consumer
- All owned by service provider (IB
- Consumer pays for usage only
- SLAs guaranteed in exchange for resource ownership

Realities


Platform Symphony – Road to Grid and Beyond

Today

Start Small (1 or 2 apps)

Grow the grid and measure ROI

Tomorrow

As you grow, throw more apps or complex jobs at the grid

Platform Symphony is designed to grow with you

Supporting all workloads with enterprise-class scalability

Ultimately future-proofing your IT investments via a heterogeneous, standards-based, single, common infrastructure solution - The Virtual Execution Machine (VEM)


[The Policy Evolution slide is shown again here, annotated with the markers "Today: FSI" and "Today: EDA, IM".]


Customer A Architecture

[Figure: deployment across Site A, Site B and Site C; compute nodes run Windows 2000]

• Summit VaR Cluster 148 Blades, Sym 2.1.3
• Summit VaR Cluster 248 Blades
• Summit VaR Cluster 144 CPUs, Lognes, Sym 2.1.3
• Summit VaR, Symphony 2.1.3
• Sophis Pricing, Symphony 2.1.3
• Sophis Portfolio, Symphony 2.1.3


Customer B Architecture

[Figure: deployment across Site A, Site B, Site C and Site D, with Production and UAT environments and four WLM instances. Labels shown: Summit VaR on Sym 2.1.3; Hybrids on Symphony 2.1.3; compute nodes on Windows 2003.]


Customer B – Application focus

[Figure: work flows from WebSphere to the Job Server, then through Symphony to Excel instances on the compute hosts, which read market data]

• Job arrives, contains list of deals
• Job Server decomposes deals into tasks
• Job Server is the Symphony client; tasks are sent to Symphony
• Symphony starts ExcelService, which starts Excel. Each task contains the deal string used for the calculation. ExcelService calls the relevant Excel method to start the computation.
• Excel gets market data after a new task arrives
• Tasks are distributed to available compute hosts and results are returned to Symphony
• Results are returned to the client
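The decompose, distribute and collect pattern described for this application can be sketched in a few lines. This is only an analogy: a local thread pool stands in for the grid, `price_deal` is a dummy placeholder for the Excel computation, and none of the names below come from the Symphony API.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(job):
    """Split a job (a list of deal strings) into one task per deal."""
    return [{"task_id": i, "deal": deal} for i, deal in enumerate(job)]


def price_deal(task):
    """Placeholder for the per-task computation (the source uses Excel here)."""
    # Dummy result: the length of the deal string.
    return task["task_id"], len(task["deal"])


def run_job(job, workers=4):
    """Decompose the job, run the tasks on a worker pool, and gather the results."""
    tasks = decompose(job)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = dict(pool.map(price_deal, tasks))
    # Return results in the original deal order.
    return [results[i] for i in range(len(tasks))]
```

Keying each result by task id lets the client reassemble answers in deal order regardless of which worker finished first.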


Customer A target (End of June)

Platform Symphony 2.2.1

Brokerage of resources

1 dedicated service partition per application:

• Summit VaR: SUMMIT_VaR_H
• Sophis Pricing: SOPHIS_PRICING
• Sophis Portfolio: SOPHIS_PORTFOLIO

Summit VaR BATCH partition

SLA per application with lending and borrowing ENABLED

[Figure: application client and compute nodes drawn from a resource pool of Windows machines, plus a contingency site]


Customer C Architecture

Compute farms

Project A

• Windows 2000/2003 compute nodes
• Exotic derivatives pricers
• C++
• 100s of CPUs active today
• 1000s of CPUs by Q1 CY2005

Project B

• .NET
• 100 CPUs active

Project C

• .NET
• 600-800 CPUs by Q1 CY2005


Customer C – Desktop Support

[Figure: SSM, WLM and compute node, each with a file system (FS) mapping; a dedicated workstation running Symphony, also with an FS mapping; DR site with dedicated positions]


Customer C

[Figure: Grid A, Grid B and Grid C (compute farms) with a Grid Management Console, the Platform Policy Engine and Scheduler, and shared storage. Other labels shown: VMware desktop farm, virtual machines, Platform Grid, "In Disaster Recovery", triggered actions.]


Customer C - Step 4

[Figure: SSM, WLM and compute node with FS mappings; dedicated workstation running Symphony with no FS dependency; CORE building workstations]


Customer C – Final Step

Pool Compute farm

A B C

Disaster Recovery Desktop

Opportunistic CPU stealing


JPMorgan Chase

Silos of resources and applications supporting different risk management apps

No sharing or collaboration of knowledge, data, resources, or analytics engines

Over-provisioning of hardware for peak in each silo

Unstable and poorly documented home-grown distribution software and application interfaces

Expensive SMP servers needed to support spikes in workload

Multi-asset (cross-silo) models had to be run and assembled manually


Today’s situation!

Key steps to implement an enterprise grid?


Four Stages of Enterprise Grid

LEVEL ONE: Single or Isolated Domains

Each “domain” (project, work group, LoB, etc.) controls its own processes, data and compute resources within an enterprise, behind the firewall.

LEVEL TWO: Connected, Multi-Domains

There is an agreement between domains to share compute resources according to owner-controlled policies. A charge-back manager tracks usage and allocates costs.

LEVEL THREE: Cross-enterprise “Compute Backbone”

• Each project, work group or LoB controls its own processes, data and compute resources, behind the firewall

• Compute resources are shared according to owner-controlled policies

• For maximum speed and efficiency, all domains are linked together

• Sophisticated charge-back manager tracks usage and allocates costs

[Figure axes: value against time]


LEVEL FOUR: Internal Utility/Enterprise Utility

• Domains own no or almost no compute resources of their own but pay for the use of any that they require.

• IT is charged with buying sufficient compute resources to meet demand from all parties.

• Domains establish priority requirements and are charged only for actual use.

• Sophisticated charge-back manager tracks usage and allocates costs.

• Sophisticated modeling analysis predicts volume/time requirements
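The charge-back principle in these bullets, where domains are charged only for actual use, can be illustrated with a proportional split. This is a minimal sketch under the assumption that usage is metered in CPU-hours and that IT recovers one total cost figure; the `charge_back` name and the domain names in the example are hypothetical.

```python
def charge_back(usage_cpu_hours, total_cost):
    """Split total_cost across domains in proportion to CPU-hours actually used."""
    total_usage = sum(usage_cpu_hours.values())
    if total_usage == 0:
        # Nothing was used, so nothing is charged.
        return {domain: 0.0 for domain in usage_cpu_hours}
    return {
        domain: total_cost * hours / total_usage
        for domain, hours in usage_cpu_hours.items()
    }
```

A domain that used 300 of 400 metered CPU-hours would carry three quarters of the cost, whatever hardware it nominally sits on.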



Partner Grid - beyond the firewall

LEVEL FIVE: Inter-Enterprise

Each project, work group or LoB controls its own processes, data and compute resources within an enterprise, behind the firewall, and interacts with other base “domains”.

But there is an agreement to share compute resources with partners in other enterprises beyond the firewall.

Utility Grid

LEVEL SIX: Utility Grid Computing

Business owners own little or no hardware and buy from the utility on an as-needed basis. The utility serves many customers; efficiencies drive down costs and drive up scale.

[Figure axes: value against time]

Thank you!
