
Myths and Realities of Grid Computing

Presented by: Gilles Tourpe

Technical Director, EMEA

Myths

© Platform Computing Inc. 2004

Grid is …

Grid is … SLA

[Figure: service wait time and run time per service]

Grid is …

If you ask me to give you a very clear and simplistic definition of grid, I would say grid computing is distributed computing involving multiple sites to integrate and support applications and support collaboration.

Vision:

A single (virtual) computer to run your business

How?

By delivering products that support all types of workloads, applications, standards, resources and computing environments with global enterprise-level scalability and with a common, virtualized infrastructure

How to complete this vision?

Crossing the Finance Application Chasm

[Figure: finance applications plotted by data demand against compute demand, with a "Grid Sweetspot" region]

- Retail Banking, Data Mining (fast interconnect, data-aware scheduling)
- Exotics, Risk Management (load balancing)
- Front Office, Pricing & Hedging (low-latency task distribution)
- Credit Risk (fast interconnect - InfiniBand, scalable I/O storage, data-aware scheduling)


Differences between Grid Computing and Distributed Computing

Power without control is nothing. Grid power of 1000 machines needs to be managed and steered towards business objectives in a systematic, deterministic and predictable fashion.

Grid Computing = Distributed Computing + resource and workload management in terms of:

- Resource virtualization
- Resource ownership and sharing
- Dynamic resource allocation
- Resource monitoring, control, failover, and troubleshooting
- Guaranteed SLA (Service Level Agreement) management
- Workload scheduling and prioritization
- Load balancing
- High reliability and availability, robustness, resilience, and failover
- Performance and scalability in a large grid
- Workload execution monitoring, control, and troubleshooting
- Resource and workload usage collection, reporting, and accounting
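As a rough illustration of two items in the list above, workload prioritization and load balancing, the following sketch dispatches the highest-priority task first and always places it on the least-loaded host. The names (`Task`, `schedule`, the host names) and the cost units are hypothetical, and this is not a description of how any Platform product works internally.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int   # higher value = more urgent (assumed convention)
    cost: float     # estimated run time, arbitrary units


def schedule(tasks, hosts):
    """Assign tasks to hosts: highest priority first, least-loaded host wins."""
    # Min-heap of (current load, host name); the least-loaded host is on top.
    load_heap = [(0.0, host) for host in sorted(hosts)]
    heapq.heapify(load_heap)
    placement = {host: [] for host in hosts}

    # sorted() is stable, so equal-priority tasks keep their submission order.
    for task in sorted(tasks, key=lambda t: -t.priority):
        load, host = heapq.heappop(load_heap)
        placement[host].append(task.name)
        heapq.heappush(load_heap, (load + task.cost, host))
    return placement


if __name__ == "__main__":
    demo = [Task("report", 1, 4.0), Task("var", 5, 3.0), Task("pricing", 9, 1.0)]
    print(schedule(demo, ["host-a", "host-b"]))
```

A production workload manager would also handle failover, preemption and SLA deadlines. The sketch only shows the ordering and placement decisions.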


What are the supportive concepts & technologies

A Virtualized IT environment

Grid Virtualization Strengths

- Pool (virtualize) heterogeneous resources
- Allocate and manage resources based on policy

Server Virtualization Strengths

- Partition server into virtual servers that provide a secure “container” for applications

Data Virtualization Strengths

- Data access transparency

.NET Application Virtualization Strengths

- Rapid development
- Improved operation & maintainability
- Agile architecture

Grid Computing is about virtualizing and sharing resources

Decoupling applications from infrastructure

[Figure: "Legacy Stovepipes". Application A, Application B and Application C each run on dedicated servers (Solaris, Windows, zLinux, Linux, AIX) and databases (Oracle, DB2, SQL). Average utilization rates shown: 40%, 2-5%, 10%, 10%, 52%, 60%, 10%. Run times shown: 15 hours, 8 hours, 2 hours.]

Grid Computing is about virtualizing and sharing resources

Decoupling applications from infrastructure

[Figure: Application A, Application B and Application C share a common pool of servers (Solaris, Windows, zLinux, Linux, AIX) and databases (Oracle, DB2, SQL). The scheduler distributes application workload(s) to CPUs, and results are returned and integrated into the application(s).]

[Figure, next build: the same diagram with the scheduler shown twice, labelled "Collaboration & Resiliency".]

Enterprise Grid Context

[Figure: layered diagram with the following labels]

- Users
- Applications
- Business Applications: BI, .NET, DBs, ERP, CRM, VMs
- “Acceleration” Applications: HPC, MDA, EDA, CAE, Risk
- Application Managers
- Workload Managers
- Batch, Process Flow, SOA, Parallel
- Enterprise Resources
- Grid Management Console
- Windows 2003 Server Resource Pool

Is this a Myth?

Shared application interface, scheduling system, and virtual resource pool

Enables sharing and reuse of knowledge, data, resources, and analytics engines

Shared resource pool is dynamically partitioned into “virtual clusters”

Application interface and scheduling system are now commercially supported, fully documented software

Low-cost off-the-shelf hardware replaces expensive SMP boxes

Multi-asset models can now be run by one user, able to access any analytics engines and resources needed, governed by priority-driven policies

Policy Evolution – Supporting the spectrum of ownership vs. sharing

Model 1: Limits

- Put hard limits around each consumer
- Virtualize the resources instead of dedicating fixed resources
- Guaranteed capacity in event of failures

Model 2: Borrow/Lend

- Each consumer has “owned” capacity
- Each consumer can specify lend and borrowing limits around that owned capacity

Model 3: Fairshare

- Consumer has % of capacity at each level relative to others
- “Owned” capacity is 0 for consumer
- Capacity allocated based on need and constrained by shares

Model 4: Economic

- Consumers specify budget $
- Resource usage has cost ($/cpu-hr, $KB/hr)
- System optimizes budget allocations & resource usage driven by application SLA (determined in WLM)

[Figure: consumers A, B and C shown under the Silo Model, the Enterprise Sharing Model and the Utility Computing Model]

Silo Model

- 100% ownership of resources by consumers
- Capped SLA guarantees when peak reached

Enterprise Sharing Model

- Some minimum ownership of resources
- Ability to share from pool or others

Utility Computing Model

- 0% ownership of resources by consumer
- All owned by service provider (IB
- Consumer pays for usage only
- SLAs guaranteed in exchange for resource ownership
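To make Model 3 (Fairshare) more concrete, here is a minimal sketch of share-based allocation in which no consumer owns capacity, each receives capacity in proportion to its share, and nobody receives more than it needs. The function name `fairshare`, the example consumers and all numbers are assumptions for illustration. Shares are assumed to be positive, and the slide's "at each level" hierarchy is not modelled.

```python
def fairshare(capacity, shares, demand):
    """Split `capacity` CPUs by share, never giving a consumer more than it needs."""
    alloc = {c: 0.0 for c in shares}
    active = {c for c in shares if demand.get(c, 0) > 0}
    remaining = float(capacity)

    while active and remaining > 1e-9:
        total_share = sum(shares[c] for c in active)
        satisfied = set()
        handed_out = 0.0
        for c in active:
            offer = remaining * shares[c] / total_share  # proportional offer
            need = demand[c] - alloc[c]
            grant = min(offer, need)                     # capped by actual need
            alloc[c] += grant
            handed_out += grant
            if alloc[c] >= demand[c] - 1e-9:
                satisfied.add(c)
        remaining -= handed_out
        if not satisfied:
            break  # everyone took the full offer, so capacity is exhausted
        active -= satisfied  # surplus is re-offered to the unsatisfied consumers
    return alloc


if __name__ == "__main__":
    print(fairshare(100, {"A": 50, "B": 30, "C": 20}, {"A": 20, "B": 60, "C": 40}))
```

In the example, consumer A needs less than its share, so the surplus is redistributed to B and C in proportion to their shares.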

Realities

Platform Symphony – Road to Grid and Beyond

Today

Start Small (1 or 2 apps)

Grow the grid and measure ROI

Tomorrow

As you grow, throw more apps or complex jobs at the grid

Platform Symphony is designed to grow with you

Supporting all workloads and enterprise-class scalability

Ultimately future-proofing your IT investments via a heterogeneous, standards-based, single, common infrastructure solution - The Virtual Execution Machine (VEM)

Policy Evolution – Supporting the spectrum of ownership vs. sharing

[Figure: the four policy models and the Silo, Enterprise Sharing and Utility Computing models from the previous Policy Evolution slide, annotated "Today: FSI" and "Today: EDA, IM".]

Customer A Architecture

[Figure: Site A, Site B and Site C]

- Summit VaR Cluster 1, 48 Blades, Sym 2.1.3
- Summit VaR Cluster 2, 48 Blades
- Summit VaR Cluster 144 CPUs, Lognes, Sym 2.1.3
- Summit VaR, Symphony 2.1.3
- Sophis Pricing, Symphony 2.1.3
- Sophis Portfolio, Symphony 2.1.3
- Compute nodes: Windows 2000

Customer B Architecture

[Figure: Site A, Site B, Site C and Site D, with Production and UAT environments and four WLM instances]

- Summit VaR, Sym 2.1.3 (shown four times)
- Hybrids, Symphony 2.1.3 (shown twice)
- Compute nodes: Windows 2003

Customer B – Application focus

[Figure labels: Market Data, Work, WebSphere, Job Server, Symphony, Excel]

- Job arrives, contains list of deals.
- Job Server decomposes deals into tasks.
- Job Server is Symphony client. Tasks sent to Symphony.
- Tasks distributed to available compute hosts and results returned to Symphony.
- Symphony starts ExcelService, which starts Excel. Each task contains the deal string used for the calculation. ExcelService calls the relevant Excel method to start the computation.
- Excel gets market data after new task arrives.
- Results returned to client.
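The decompose, distribute and gather flow on this slide can be sketched in a few lines. This is only an analogy: a local thread pool stands in for Symphony and the compute hosts, `price_deal` is a hypothetical placeholder for the Excel calculation, and the deal string format is invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def price_deal(deal: str) -> float:
    """Placeholder for the per-deal calculation (done by Excel on the slide)."""
    # Hypothetical deal string format: "<id>:<notional>:<rate>".
    _, notional, rate = deal.split(":")
    return float(notional) * float(rate)


def run_job(deals, max_workers=4):
    """Decompose a job into one task per deal, run the tasks in parallel,
    and gather the results in the original deal order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as its inputs.
        results = list(pool.map(price_deal, deals))
    return dict(zip(deals, results))


if __name__ == "__main__":
    print(run_job(["d1:100:0.5", "d2:200:0.25"]))
```

The client only sees `run_job`. Where each task actually runs is hidden behind the pool, which is the decoupling of application from infrastructure that the earlier slides describe.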

Customer A target (End of June)

Platform Symphony 2.2.1

Brokerage of resources

- 1 dedicated service partition per application
- SLA per application with lending and borrowing ENABLED

[Figure: application clients and compute nodes drawn from a Windows machines resource pool, with a contingency site. Partitions shown: Summit VaR BATCH Partition; Summit VaR (SUMMIT_VaR_H); Sophis Pricing (SOPHIS_PRICING); Sophis Portfolio (SOPHIS_PORTFOLIO).]
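Lending and borrowing between per-application partitions (Model 2 in the Policy Evolution slide) can be illustrated roughly as follows. The `Partition` fields, the partition names and every number are made up for the example. The sketch performs a single allocation round and does not model owners reclaiming slots they have lent out.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    owned: int         # slots this application owns
    lend_limit: int    # max owned slots it will lend when idle
    borrow_limit: int  # max slots it may borrow from others
    demand: int        # slots it currently needs


def allocate(partitions):
    """Return {name: slots in use} after one lend/borrow round."""
    # Every partition first uses its own capacity, up to its demand.
    usage = {p.name: min(p.owned, p.demand) for p in partitions}
    # Idle owned slots, capped by each owner's lend limit, form the loan pool.
    pool = sum(min(p.owned - usage[p.name], p.lend_limit) for p in partitions)

    # Serve borrowers in list order (a real broker would apply a policy here).
    for p in partitions:
        shortfall = p.demand - usage[p.name]
        loan = min(shortfall, p.borrow_limit, pool)
        usage[p.name] += loan
        pool -= loan
    return usage


if __name__ == "__main__":
    parts = [
        Partition("var", owned=48, lend_limit=20, borrow_limit=10, demand=10),
        Partition("pricing", owned=16, lend_limit=0, borrow_limit=30, demand=40),
        Partition("portfolio", owned=8, lend_limit=4, borrow_limit=8, demand=12),
    ]
    print(allocate(parts))
```

Here the idle "var" partition lends 20 of its 38 free slots, all of which go to "pricing", so "portfolio" finds the pool empty and stays at its owned capacity.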

Customer C Architecture

Compute farms

Project A

- Windows 2000/2003 compute nodes
- Exotic derivatives pricers
- C++
- 100s of CPUs active today
- 1000s of CPUs by Q1 CY2005

Project B

- Windows 2000/2003 compute nodes
- .NET
- 100 CPUs active

Project C

- Windows 2000/2003 compute nodes
- .NET
- 600-800 CPUs by Q1 CY2005

Customer C – Desktop Support

[Figure: SSM, WLM and compute nodes, each labelled "Mapping FS"; a dedicated workstation running Symphony, also labelled "Mapping FS"; a shared FS; and a DR site with dedicated positions.]

Customer C

[Figure labels: Grid Management Console; Platform Policy Engine and Scheduler; triggered actions; compute farms; Grid A, Grid B, Grid C; shared storage; VMware desktop farm in disaster recovery; Platform Grid virtual machines.]

Customer C - Step 4

[Figure: SSM, WLM and compute nodes, each labelled "Mapping FS"; a shared FS; a dedicated workstation running Symphony with NO FS dependency; CORE building workstations.]

Customer C – Final Step

[Figure labels: pool compute farm (A, B, C); disaster recovery desktop; opportunistic CPU stealing.]


JPMorgan Chase

Silos of resources and applications supporting different risk management apps

No sharing or collaboration of knowledge, data, resources, or analytics engines

Over-provisioning of hardware for peak in each silo

Unstable and poorly documented home-grown distribution software and application interfaces

Expensive SMP servers needed to support spikes in workload

Multi-asset (cross-silo) models had to be run and assembled manually

Today’s situation!

Key steps to implement an enterprise grid?

Four Stages of Enterprise Grid

[Figure: value plotted against time]

LEVEL ONE: Single or Isolated Domains

Each “domain” (project, work group, LoB, etc.) controls its own processes, data and compute resources within an enterprise, behind the firewall.

LEVEL TWO: Connected, Multi-Domains

There is an agreement between domains to share compute resources according to owner-controlled policies. A charge-back manager tracks usage and allocates costs.

LEVEL THREE: Cross-enterprise “Compute Backbone”

• Each project, work group or LoB controls its own processes, data and compute resources, behind the firewall.
• Compute resources are shared according to owner-controlled policies.
• For maximum speed and efficiency all domains are linked together.
• Sophisticated charge-back manager tracks usage and allocates costs.

Four Stages of Enterprise Grid

[Figure: value plotted against time]

LEVEL FOUR: Internal Utility/Enterprise Utility

• Domains own few or no compute resources of their own but pay for the use of any that they require.

• IT is charged with buying sufficient compute resources to meet demand from all parties.

• Domains establish priority requirements and are charged only for actual use.

• Sophisticated charge-back manager tracks usage and allocates costs.

• Sophisticated modeling analysis predicts volume/time requirements.
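The charge-back idea that recurs across these levels (track usage, then allocate costs so that domains pay only for actual use) reduces to a simple aggregation. The sketch below assumes a flat, hypothetical rate per CPU-hour and invented domain names. A real charge-back manager would also account for priorities, storage and time-of-day pricing.

```python
from collections import defaultdict


def chargeback(usage_records, rate_per_cpu_hour):
    """Aggregate (domain, cpu_hours) records into a cost per domain."""
    cpu_hours = defaultdict(float)
    for domain, hours in usage_records:
        cpu_hours[domain] += hours
    # Costs are rounded to cents for reporting.
    return {d: round(h * rate_per_cpu_hour, 2) for d, h in cpu_hours.items()}


if __name__ == "__main__":
    records = [("risk", 120.0), ("pricing", 30.0), ("risk", 80.0)]
    print(chargeback(records, rate_per_cpu_hour=0.5))
```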

Partner Grid - beyond the firewall

[Figure: value plotted against time]

LEVEL FIVE: Inter-Enterprise

Each project, work group or LoB controls its own processes, data and compute resources within an enterprise, behind the firewall and interacts with other base “domains”

But, there is an agreement to share computer resources with partners in other enterprises beyond the firewall.

LEVEL SIX: Utility Grid Computing

Business owners own little or no hardware. Buy from utility on an as needed basis. Utility serves many customers; efficiencies drive down costs and drive up scale

Utility Grid

Thank you!