
Post on 19-Jan-2016


Page 1: Improving System Availability in Distributed Environments

Improving System Availability in Distributed Environments

Sam Malek [email protected]

with Marija Mikic-Rakic [email protected]

Nels Beckman [email protected]

Nenad Medvidovic [email protected]

Page 2: Improving System Availability in Distributed Environments

Motivation

How good is this deployment architecture?

What are its properties?

How should it be modified to ensure higher availability?

Page 3: Improving System Availability in Distributed Environments

Effect of Deployment on Availability

Bad deployment → Low availability
Redeployment
Better deployment → Higher availability

• Redeployment to maximize the availability
– Frequency and volume of interactions, reliability and capacity of network links
• Hard to determine a good deployment in large-scale distributed systems
– In the small example above, there are 3^10 = 59,049 possible deployments

[Figure: two deployments of components 1 to 10 across Host 1, Host 2, and Host 3, before and after redeployment]
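The deployment count quoted on this slide follows from each of the 10 components being placed independently on any of the 3 hosts. A minimal Python sketch, assuming no constraints rule out any placement (the function name is illustrative), checks the figure by brute-force enumeration:

```python
from itertools import product

def count_deployments(num_components: int, num_hosts: int) -> int:
    """Count every assignment of components to hosts by enumerating them."""
    return sum(1 for _ in product(range(num_hosts), repeat=num_components))

# 10 components on 3 hosts, as in the slide's example
print(count_deployments(10, 3))  # 59049
print(3 ** 10)                   # 59049, the closed form k**n
```

Enumeration is only practical at this scale; the closed form k**n shows how quickly the space grows as components or hosts are added.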

Page 4: Improving System Availability in Distributed Environments

Availability Definition

Availability = (# successful component interactions) / (# total component interactions)

The degree to which the system is operational and accessible when required for use
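As a small illustration of this definition, availability can be computed as a ratio of interaction counts. The function name and the counts below are made up for the example:

```python
def availability(successful: int, total: int) -> float:
    """Fraction of component interactions that completed successfully."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successful <= total:
        raise ValueError("successful must be between 0 and total")
    return successful / total

# Hypothetical counts: 850 of 1000 attempted interactions succeeded
print(availability(850, 1000))  # 0.85
```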

Page 5: Improving System Availability in Distributed Environments

System Model Parameters
• Software component properties
– Memory requirements
– Frequency of interaction
– Size of the exchanged data
• Hardware host properties
– Memory capacity
– Network reliability
– Network bandwidth
• Constraints
– Location
– Co-location
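One way to picture these parameters is as a small data model plus a validity check. The sketch below is an assumption for illustration, not DeSi's or Prism-MW's data structures: it covers only memory, location, and co-location, and it treats a co-location constraint as "these two components must share a host".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

@dataclass
class SystemModel:
    comp_memory: Dict[str, int]   # memory required by each component
    host_memory: Dict[str, int]   # memory capacity of each host
    # Location constraints: hosts a component is allowed to run on
    allowed_hosts: Dict[str, Set[str]] = field(default_factory=dict)
    # Co-location constraints (assumed meaning: the pair must share a host)
    must_colocate: List[Tuple[str, str]] = field(default_factory=list)

def is_valid(model: SystemModel, deployment: Dict[str, str]) -> bool:
    """Check memory, location, and co-location constraints for a deployment."""
    used = {h: 0 for h in model.host_memory}
    for comp, host in deployment.items():
        used[host] += model.comp_memory[comp]
    if any(used[h] > model.host_memory[h] for h in used):
        return False
    for comp, hosts in model.allowed_hosts.items():
        if deployment[comp] not in hosts:
            return False
    return all(deployment[a] == deployment[b] for a, b in model.must_colocate)

model = SystemModel(
    comp_memory={"A": 4, "B": 3, "C": 2},
    host_memory={"h1": 6, "h2": 5},
    allowed_hosts={"A": {"h1"}},
    must_colocate=[("B", "C")],
)
print(is_valid(model, {"A": "h1", "B": "h2", "C": "h2"}))  # True
print(is_valid(model, {"A": "h1", "B": "h1", "C": "h1"}))  # False (h1 over capacity)
```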

Page 6: Improving System Availability in Distributed Environments

Problem Definition

• Find a system deployment architecture such that:

• It adheres to the system model parameters and constraints

• It has the greatest availability
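To make "greatest availability" concrete, a deployment can be scored by weighting each interaction by the reliability of the link it crosses. The sketch below is one plausible formulation under stated assumptions, not necessarily the exact objective used by the authors: interactions between components on the same host are assumed to always succeed, and the data is invented.

```python
def deployment_availability(freq, reliability, deployment):
    """Expected fraction of successful interactions for a deployment.

    freq[(a, b)]                 : interaction frequency between components a and b
    reliability[frozenset({x,y})]: reliability (0..1) of the link between hosts x and y
    deployment[a]                : host on which component a runs
    """
    total = sum(freq.values())
    if total == 0:
        return 1.0  # assumption: with no interactions, nothing can fail
    ok = 0.0
    for (a, b), f in freq.items():
        ha, hb = deployment[a], deployment[b]
        # Assumption: local (same-host) interactions are fully reliable.
        r = 1.0 if ha == hb else reliability[frozenset((ha, hb))]
        ok += f * r
    return ok / total

freq = {("A", "B"): 10, ("B", "C"): 2}
reliability = {frozenset(("h1", "h2")): 0.5}

split_ab = {"A": "h1", "B": "h2", "C": "h2"}   # frequent pair crosses the weak link
keep_ab = {"A": "h1", "B": "h1", "C": "h2"}    # frequent pair stays local
print(round(deployment_availability(freq, reliability, split_ab), 3))  # 0.583
print(round(deployment_availability(freq, reliability, keep_ab), 3))   # 0.917
```

Keeping the frequently interacting pair on one host raises the score, which is the intuition behind redeployment.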

Page 7: Improving System Availability in Distributed Environments

Problem Breakdown

1) Lack of knowledge about runtime system parameters
– System model parameters not known at the time of initial deployment
– System model parameters change at runtime
• Reliability of links, frequencies of interaction, etc.
– Prism-MW monitoring support

2) Exponentially complex problem
– n components and k hosts = k^n possible deployments
– DeSi’s polynomial-time approximating algorithms

3) Solution analysis
– Comparison of different solutions and algorithms
– Centralized vs. decentralized, performance vs. complexity, etc.
– DeSi’s visualization and comparison utilities

4) Effecting the selected solution
– Redeploying components
– Requires an automated solution
– Prism-MW deployment support
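For step 4, effecting a solution amounts to moving only the components whose host changes. The following sketch computes such a move list from two deployments. It is a hypothetical helper for illustration and does not reflect Prism-MW's deployment API.

```python
def migration_plan(current, target):
    """List (component, from_host, to_host) moves that turn `current` into `target`."""
    if current.keys() != target.keys():
        raise ValueError("both deployments must cover the same components")
    return [
        (comp, current[comp], target[comp])
        for comp in sorted(current)
        if current[comp] != target[comp]
    ]

current = {"A": "h1", "B": "h2", "C": "h2"}
target = {"A": "h1", "B": "h1", "C": "h2"}
print(migration_plan(current, target))  # [('B', 'h2', 'h1')]
```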

Page 8: Improving System Availability in Distributed Environments

Approach

[Diagram: DeSi and Prism-MW linked in a four-step loop]
1) Monitor (Prism-MW)
2) Monitoring Data (Prism-MW to DeSi)
3) Analyze (DeSi)
4) Redeployment Data (DeSi to Prism-MW)

Page 9: Improving System Availability in Distributed Environments

Prism-MW
– An architectural middleware that enables efficient implementation, deployment, and execution of distributed systems in terms of their architectural elements: components, connectors, configurations, etc.
– Support for monitoring
– Support for redeployment

[Figure: Distributed System. Numbered components (0 to 34) deployed across hosts, together with Admin components and a Deployer.]

[Figure: Simplified Class Diagram of Prism-MW. Classes and interfaces shown: Architecture, Scaffold, Brick, Connector, Component, Deployer, Admin, IMonitor, IAdmin, IScaffold, Serializable, Event, ExtensibleComponent, DistributionConnector, EvtFrequencyMonitor, NetworkReliabilityMonitor.]

Page 10: Improving System Availability in Distributed Environments

Prism-MW’s Role

[Diagram: DeSi and Prism-MW linked by the four steps: 1) Monitor, 2) Monitoring Data, 3) Analyze, 4) Redeployment Data]

Supports:

• Step 1 by monitoring events in the system and calculating the system parameters

• Step 4 by providing an API for the redeployment of components and meta-level components to automate the tasks
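The monitoring step can be pictured as counting observed events and deriving parameters from the counts. The class below is a hypothetical sketch of that idea. It is not Prism-MW's IMonitor interface, and the method names are invented for the example.

```python
from collections import Counter

class InteractionMonitor:
    """Hypothetical monitor that derives frequency and reliability from event counts."""

    def __init__(self):
        self.sent = Counter()       # events attempted per (source, destination) pair
        self.delivered = Counter()  # events that arrived per pair

    def record(self, src, dst, delivered):
        """Register one attempted event and whether it was delivered."""
        self.sent[(src, dst)] += 1
        if delivered:
            self.delivered[(src, dst)] += 1

    def frequency(self, src, dst, window_seconds):
        """Events per second between src and dst over the observation window."""
        return self.sent[(src, dst)] / window_seconds

    def reliability(self, src, dst):
        """Observed delivery ratio, or None if nothing has been sent yet."""
        n = self.sent[(src, dst)]
        return self.delivered[(src, dst)] / n if n else None

monitor = InteractionMonitor()
for outcome in (True, True, False, True):
    monitor.record("A", "B", outcome)

print(monitor.reliability("A", "B"))                   # 0.75
print(monitor.frequency("A", "B", window_seconds=2))   # 2.0
print(monitor.reliability("B", "C"))                   # None
```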

Page 11: Improving System Availability in Distributed Environments

Maximizing Availability

• A family of centralized algorithms
– Exact: exponential
– Stochastic: quadratic
– Adaptive greedy: cubic
• A family of decentralized algorithms
– DecAp (auction-based): cubic
• A set of clustering techniques
– Reduce complexity
– Improve performance
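The trade-off between the exact and the approximate algorithms can be sketched in a few lines. The exhaustive search below tries all k^n deployments. The random-sampling search is a simple stand-in for a stochastic approach and is not the authors' algorithm. The availability function, the memory-only constraint, and the data are all assumptions for illustration.

```python
import random
from itertools import product

def availability(freq, rel, dep):
    """Frequency-weighted link reliability; same-host interactions assumed reliable."""
    total = sum(freq.values())
    ok = sum(f * (1.0 if dep[a] == dep[b] else rel[frozenset((dep[a], dep[b]))])
             for (a, b), f in freq.items())
    return ok / total

def fits(dep, comp_mem, host_mem):
    """Memory constraint: components on a host must fit in its capacity."""
    used = dict.fromkeys(host_mem, 0)
    for comp, host in dep.items():
        used[host] += comp_mem[comp]
    return all(used[h] <= host_mem[h] for h in host_mem)

def exact(freq, rel, comp_mem, host_mem):
    """Try all k**n deployments; feasible only for very small systems."""
    comps, hosts = sorted(comp_mem), sorted(host_mem)
    best, best_a = None, -1.0
    for assignment in product(hosts, repeat=len(comps)):
        dep = dict(zip(comps, assignment))
        if fits(dep, comp_mem, host_mem):
            a = availability(freq, rel, dep)
            if a > best_a:
                best, best_a = dep, a
    return best, best_a

def random_search(freq, rel, comp_mem, host_mem, trials=200, seed=0):
    """Sample random deployments and keep the best valid one found."""
    rng = random.Random(seed)
    comps, hosts = sorted(comp_mem), sorted(host_mem)
    best, best_a = None, -1.0
    for _ in range(trials):
        dep = {c: rng.choice(hosts) for c in comps}
        if fits(dep, comp_mem, host_mem):
            a = availability(freq, rel, dep)
            if a > best_a:
                best, best_a = dep, a
    return best, best_a

comp_mem = {"A": 2, "B": 2, "C": 2}
host_mem = {"h1": 4, "h2": 4}
freq = {("A", "B"): 10, ("B", "C"): 2, ("A", "C"): 1}
rel = {frozenset(("h1", "h2")): 0.5}

best_dep, best_avail = exact(freq, rel, comp_mem, host_mem)
print(best_dep, round(best_avail, 3))
print(random_search(freq, rel, comp_mem, host_mem))
```

The exact search guarantees the optimum but its cost grows as k^n, while the sampling search runs in time proportional to the number of trials and can only approach the optimum.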

Page 12: Improving System Availability in Distributed Environments

Algorithms’ Results

[Chart: availability (y-axis, 0 to 0.9) achieved by the algorithms for four system configurations:
– 10 comps, 4 hosts, 100% connected
– 50 comps, 15 hosts, 80% connected
– 100 comps, 25 hosts, 40% connected
– 250 comps, 50 hosts, 80% connected]

Page 13: Improving System Availability in Distributed Environments

Assessing the Algorithms
• Efficiency
– Execution time vs. precision
• Applicability
– Centralized vs. decentralized
• Effect of system characteristics
• Impact of individual parameter changes
• Addition of new system parameters
• Application to new system properties
• Requires “what if” scenario exploration

In comes DeSi!

Page 14: Improving System Availability in Distributed Environments

DeSi’s Architecture

[Figure: DeSi's architecture, with dataflow and controlflow connections between:
– DeSi Model: SystemData, AlgoResultData, GraphViewData
– DeSi View: TableView, GraphView
– DeSi Controller: Generator, AlgorithmContainer, Modifier, MiddlewareAdapter (Monitor, Effector)
– Middleware Platform]

• Key properties:
– Tailorability
– Scalability
– Efficiency
– Explorability

Page 15: Improving System Availability in Distributed Environments

DeSi’s View (1)

Page 16: Improving System Availability in Distributed Environments

DeSi’s View (2)

Page 17: Improving System Availability in Distributed Environments

DeSi’s View (3)

Page 18: Improving System Availability in Distributed Environments

DeSi’s View (4)

Page 19: Improving System Availability in Distributed Environments

DeSi’s View (5)

Page 20: Improving System Availability in Distributed Environments

DeSi’s Role

[Diagram: DeSi and Prism-MW linked by the four steps: 1) Monitor, 2) Monitoring Data, 3) Analyze, 4) Redeployment Data]

Supports:

• Step 3 by providing several redeployment algorithms and various visualization utilities

• Steps 2 and 4 by providing the appropriate middleware adapter

Page 21: Improving System Availability in Distributed Environments

Conclusion

• Suite of automated tools and techniques for improving the availability of a distributed system

• Currently extending the tools to model, analyze, and improve other non-functional aspects of a distributed system: security, latency, etc.

Page 22: Improving System Availability in Distributed Environments

Questions?