Aaron J. Elmore, Carlo Curino, Divyakant Agrawal, Amr El Abbadi [aelmore,agrawal,amr] @ cs.ucsb.edu ccurino @ microsoft.com


Post on 19-Jun-2018

Page 1: Aaron J. Elmore, Carlo Curino, Divyakant Agrawal, …people.cs.uchicago.edu/~aelmore/papers/VLDB_Tutorial_DBaaS_1-share.pdfAaron J. Elmore, Carlo Curino, Divyakant Agrawal, ... Proliferation

Aaron J. Elmore, Carlo Curino,

Divyakant Agrawal, Amr El Abbadi

[aelmore,agrawal,amr] @ cs.ucsb.edu

ccurino @ microsoft.com


Virtualized Processing

Virtualized Memory

Fifty+ Years of Virtualization


Virtualized I/O

Virtualized Network

Fifty+ Years of Virtualization


Stated simply

Implementation

Such indirection, however, has proven to be extremely powerful and versatile

Virtualization Principles


Multi-tasking/Timesharing systems

Virtual Memory

Virtualization Origins: 1960s


The Notion of Virtual Machine


Virtualization in IBM Systems


Virtualization Renaissance?


Proliferation of Hardware:

Proliferation of Operating Systems:

Proliferation of Applications and Services:

Basically, this led to a system administration nightmare

Enterprise Server Infrastructure Administration


System Administration Realities


The Need


What is Virtualization?


Full Virtualization

Para Virtualization:

Virtualization Nuances


Type I Hypervisor:

Type II Hypervisor:

Types of Hypervisor


Road Map


Xen Architecture (Bare-Metal) [Barham et al. SOSP 2003]


Have to decide how the hypervisor shares resources among VMs (CPU, memory, I/O, …)

Better to have some flexibility in how resource allocation is controlled.

Resource Allocation


Goals of Resource Allocation


Shares

Reservation

Limit

Controls


Benefits of Share Based

Change shares for VM

Dynamic reallocation

Add VM, overcommit

Graceful degradation

Remove VM

Exploit extra resources


Proportional-Share Scheduling

Simplified virtual-time algorithm

Virtual time = usage / shares

Schedule VM with smallest virtual time
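The simplified virtual-time rule above can be sketched in a few lines of Python. This is a minimal illustration, not a hypervisor's actual scheduler: the `VM` class, the `schedule` function, and the fixed one-unit quantum are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    shares: int         # relative weight; more shares means a larger slice
    usage: float = 0.0  # resource time consumed so far

    @property
    def virtual_time(self) -> float:
        # Virtual time = usage / shares, as on the slide.
        return self.usage / self.shares


def schedule(vms, quanta, quantum=1.0):
    """Make `quanta` scheduling decisions and return the chosen VM names."""
    order = []
    for _ in range(quanta):
        # Pick the VM with the smallest virtual time (ties go to the first VM).
        vm = min(vms, key=lambda v: v.virtual_time)
        vm.usage += quantum
        order.append(vm.name)
    return order


vms = [VM("vm1", shares=2), VM("vm2", shares=1)]
order = schedule(vms, quanta=30)
print(order.count("vm1"), order.count("vm2"))
```

With shares of 2 and 1, the first VM is chosen twice as often as the second, which is the proportional behavior the slide describes.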


Reservation Example

reserved

available

VM1 VM2


Other Issues


Extra layer of memory indirection by having hypervisor map memory pages.

VMs are allocated a static amount of the machine's memory.

Virtualized Memory Management


Memory Resource Management in VMware ESX

Focus

Improve how the hypervisor manages and shares memory across VMs

Key Contributions

Memory ballooning

Page sharing to minimize wasted memory

(Limiting idle pages)
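Page sharing can be illustrated with a content-based sketch: pages with identical contents are detected by hashing and backed by a single stored copy. The function name `share_pages`, the use of SHA-256, and the in-memory list of frames are assumptions for illustration. A real hypervisor would also mark shared pages copy-on-write, which is not modeled here.

```python
import hashlib
from collections import defaultdict


def share_pages(pages):
    """Map each page index to a frame id; identical contents share one frame."""
    frames = []                  # unique page contents actually stored
    by_hash = defaultdict(list)  # content hash -> candidate frame ids
    mapping = {}
    for idx, content in enumerate(pages):
        digest = hashlib.sha256(content).digest()
        for frame_id in by_hash[digest]:
            # Confirm with a full comparison so a hash collision never
            # merges pages whose contents differ.
            if frames[frame_id] == content:
                mapping[idx] = frame_id
                break
        else:
            # No identical page seen yet: store a new frame.
            frames.append(content)
            by_hash[digest].append(len(frames) - 1)
            mapping[idx] = len(frames) - 1
    return mapping, frames


pages = [bytes(4096), b"a" * 4096, bytes(4096)]
mapping, frames = share_pages(pages)
print(len(pages), "pages backed by", len(frames), "frames")
```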


Memory Reclamation


Sharing Memory


Other Issues

VLDB'2013 Tutorial 38


Virtualization in Distributed Environments

What is different? Why is it desirable?

Narrow VMM interface minimizes the residual dependencies

Clean migration of in-memory state (both kernel and user level)

Separation of concerns: users (full control over software) and operators (full control over infrastructure)


Live Migration of Virtual Machines

Focus

Want to migrate the state of a live VM (non-persistent state)

Minimize disruption to services

Key Contributions

Canonical live migration for VM

Optimizations minimize downtimes and disruption


Live Migration Design Space


Pre-copy migration

Careful to avoid service degradation

Implementation


Open network connections

Technical challenges: Handling Local Resources

Local Storage


The main overhead in live migration is producing a consistent copy of state that is still being updated

How much time to spend on iterative pre-copy?

Use Xen’s shadow page tables

Technical challenges: Writable Working Set
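The trade-off behind iterative pre-copy can be sketched with a small simulation. Each round resends the pages dirtied during the previous round, and the loop stops once the remaining set is small enough for a short stop-and-copy phase. All names and parameters here (`precopy_rounds`, a constant dirty rate, a constant bandwidth, the round cap) are simplifying assumptions, not details taken from the paper.

```python
def precopy_rounds(total_pages, dirty_rate, bandwidth, stop_threshold,
                   max_rounds=30):
    """Simulate iterative pre-copy.

    total_pages    -- pages in the VM's memory
    dirty_rate     -- pages dirtied per second while the VM keeps running
    bandwidth      -- pages transferred per second
    stop_threshold -- switch to stop-and-copy once this few pages remain
    Returns (pages sent in each round, pages left for stop-and-copy).
    """
    to_send = total_pages
    sent_per_round = []
    for _ in range(max_rounds):
        if to_send <= stop_threshold:
            break
        sent_per_round.append(to_send)
        # Pages dirtied while this round was in flight must be sent again.
        # Integer arithmetic: round time is to_send / bandwidth seconds.
        dirtied = to_send * dirty_rate // bandwidth
        to_send = min(total_pages, dirtied)
    return sent_per_round, to_send


rounds, remaining = precopy_rounds(100_000, 1_000, 10_000, 50)
print(rounds, remaining)
```

When the dirty rate approaches the transfer bandwidth, the remaining set stops shrinking and only the round cap ends the loop, which is why the size of the writable working set determines how long pre-copy is worth running.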


Migrating a Simple Web Server


Current challenges in the context of Cloud Computing:

Internet Services/applications that run 24X7:

Need to migrate live services/applications from one data-center to another

VM migration over WAN

Leveraging Virtualization over Multi-Datacenters


Live Migration over WAN with persistent state

Focus

Extending live migration to work over wide-area networks

Key Challenges

Ability to migrate local persistent (block) storage

Migrate open connections over WAN: lightweight network redirection combined with a new address via dynDNS


VM Migration over WAN requires:

Transfer of network connection (Tunneling and dynDNS)

Transfer of persistent storage (capture block writes)

Throttle write heavy workloads

The paper reports that this is indeed feasible:

Migration of a web server over WAN in less than a minute.

Main Takeaways


Checkpoint and recovery

Compression

Much more

Many Other VM Migration Techniques


Servers multiplexed across multiple applications:

Many apps per server and components distributed across multiple servers

Dynamic workload fluctuations caused by incremental growth, time-of-day effects, and flash crowds.

Virtualization advantage:

The ability to flexibly remap physical resources to virtual servers to handle workload dynamics

Manual detection and migration lack the agility to respond to sudden changes in workload dynamics

Data-center Administration Challenges


Goal:

Automate the task of monitoring and detecting hotspots

Determining a new mapping of physical and virtual resources to mitigate the problem of hotspots

Initiating the necessary VM migrations to realize this new mapping

Data-center Automation using Virtualization


Black- and Gray-box Strategies for VM Migration

Focus

Method to monitor and detect hotspots

Resolving overloaded servers

Key Contributions

Black- and gray-box monitoring

Identifying and migrating VMs to alleviate overloaded servers


Sandpiper Architecture


Infer VM workload from “VM external” observations:

domain-0 (driver domain)

Interested in monitoring during each interval:

CPU Usage

Network Usage

Memory

Disk: assumed to be external

Black-Box Monitoring


CPU

Each VM's CPU usage is derived from scheduling events

Ignores the CPU overhead of the VM's disk I/O and network processing

Network

Hypervisor implements the network interface driver (direct)

Memory

Cannot directly tell memory usage

Use Geiger to derive the working set from swap activity

Black-box Monitoring


Resource Provisioning


Black-box monitoring is useful when we cannot peek inside the VM.

In certain environments looking inside the VM is acceptable, but it is not always feasible.

Lightweight monitoring daemon inside VM to watch:

Gray-box Monitoring


Generate Profile


A hotspot is flagged when thresholds or SLAs are violated for a sustained period.

Want to mitigate sustained resource usage that does not appear to be trending down.

Hotspot Detection
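One simple way to express "violated for a sustained period" is a k-out-of-n rule: flag a hotspot only when at least k of the last n observations exceed the threshold. The sketch below assumes that rule; the class name `HotspotDetector` and the parameter values are illustrative, and the check that usage is not trending down is not modeled.

```python
from collections import deque


class HotspotDetector:
    """Flag a hotspot when at least k of the last n samples exceed a threshold."""

    def __init__(self, threshold, k, n):
        self.threshold = threshold
        self.k = k
        # Sliding window of booleans: True means the sample was a violation.
        self.window = deque(maxlen=n)

    def observe(self, utilization):
        self.window.append(utilization > self.threshold)
        return sum(self.window) >= self.k


detector = HotspotDetector(threshold=0.75, k=3, n=5)
for sample in [0.9, 0.5, 0.8, 0.6, 0.95]:
    print(sample, detector.observe(sample))
```

A single transient spike never reaches k violations, so it does not trigger a migration.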


Migration

Migrate from hottest to coldest which

Minimize the overhead of migration

Start with migrating VM with highest usage to size (RAM) ratio
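The ordering rule above can be sketched as a sort on the usage-to-RAM ratio: a VM with high usage but a small memory footprint relieves the most load per byte copied. The dictionary fields (`name`, `usage`, `ram_mb`) and the function name are hypothetical, and `usage` stands in for whatever load measure the monitor reports.

```python
def migration_candidates(vms):
    """Return VM names ordered by usage / RAM, highest ratio first."""
    ranked = sorted(vms, key=lambda vm: vm["usage"] / vm["ram_mb"], reverse=True)
    return [vm["name"] for vm in ranked]


vms = [
    {"name": "web", "usage": 0.8, "ram_mb": 512},
    {"name": "db", "usage": 0.9, "ram_mb": 4096},
    {"name": "cache", "usage": 0.3, "ram_mb": 256},
]
print(migration_candidates(vms))
```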


Relieving Hotspot


VMware DRS (Distributed Resource Scheduler)

Violin and Virtuoso

Shirako

Other Projects


Motivation


Storage Virtualization Architecture in Data Centers


Basil

Pesto

Romano

Soundararajan et al.

Virtualized Storage Outline


Black Box Relative Modeling

device independently

Use IO latency

BASIL Approach


To a first approximation, the following parameters are inherent to the workload and independent of the underlying device:

Average Outstanding IO (OIO)

IO size

Read %

Random %

Latency = (k1 + OIO)(k2 + IOsize)(k3 + Read%)(k4 + Random%)

k1 = 1.3, k2 = 51, k3 = 0.4, k4 = 0.6 (from empirical data)

OIO dominates; IO size only when change is LARGE.

Workload Characteristics
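The latency model on this slide can be evaluated directly. The sketch below uses the constants from the slide; the units are assumptions, since the slide does not state them: IO size is taken in KB and the read and random percentages are passed as fractions between 0 and 1. The function name is illustrative.

```python
# Constants from the slide (empirically fitted).
K1, K2, K3, K4 = 1.3, 51, 0.4, 0.6


def basil_latency_metric(oio, io_size, read_frac, random_frac):
    """Relative latency metric: (k1+OIO)(k2+IOsize)(k3+Read%)(k4+Random%).

    Assumed units: io_size in KB, read_frac and random_frac in [0, 1].
    """
    return (K1 + oio) * (K2 + io_size) * (K3 + read_frac) * (K4 + random_frac)


base = basil_latency_metric(8, 8, 0.5, 0.5)
double_oio = basil_latency_metric(16, 8, 0.5, 0.5)
double_size = basil_latency_metric(8, 16, 0.5, 0.5)
print(double_oio / base, double_size / base)
```

Because k2 is large relative to typical IO sizes, doubling the IO size changes the metric far less than doubling the outstanding IOs, which matches the slide's remark that OIO dominates.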


Device performance can vary widely depending on the number, type, etc. of disks, but this is hidden from hosts.

P: inverse of the slope of the relationship between average latency and outstanding IOs for given device.

Storage Device Characteristics


Normalized Load (NL)

∑ workload on device/P

equalize NL

hottest coldest

Experiments:

Load Balancing
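Normalized load can be sketched as the sum of the workload metrics on a device divided by that device's performance parameter P. The example below computes NL per device and identifies the hottest and coldest devices as the candidates for a move. The data layout (a dictionary of workload-metric lists and P values) and the function names are assumptions for illustration.

```python
def normalized_load(workload_metrics, p):
    """NL = (sum of workload metrics on the device) / P."""
    return sum(workload_metrics) / p


def hottest_and_coldest(devices):
    """devices maps a device name to (list of workload metrics, P)."""
    nl = {name: normalized_load(w, p) for name, (w, p) in devices.items()}
    hottest = max(nl, key=nl.get)
    coldest = min(nl, key=nl.get)
    return hottest, coldest, nl


devices = {
    "ds1": ([40.0, 20.0], 2.0),  # two workloads on a faster device
    "ds2": ([10.0], 1.0),        # one workload on a slower device
}
print(hottest_and_coldest(devices))
```

Moving workloads from the hottest device toward the coldest one until the NL values are roughly equal is the balancing goal named on the slide.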


BASIL encountered several challenges in real production environments.

Pesto provides IO load balancing with cost-benefits analysis using continuously updated storage performance models.

Motivation


workload injector

Periodically collect stats

models

load balance

Pesto Approach


The LQ slope of a workload can predict the device throughput of that workload closely

Empirically and analytically derive a linear relationship between IO latency and outstanding IO:

Latency = m * OIO + C (L = mQ + C)

m is the LQ slope and C its intercept

Peak Throughput = 1/m (using Little’s law)

Philosophy: Approximate models obtained online (vs offline detailed models).

Pesto Workload Performance Model
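The LQ model can be fitted with ordinary least squares. By Little's law, throughput T = Q / L = Q / (mQ + C), which approaches 1/m as Q grows, giving the peak throughput on the slide. The sketch below is a plain-Python fit; the function names and the sample measurements are made up for illustration.

```python
def fit_lq(oio, latency):
    """Least-squares fit of latency = m * oio + c; returns (m, c)."""
    n = len(oio)
    mean_q = sum(oio) / n
    mean_l = sum(latency) / n
    sxx = sum((q - mean_q) ** 2 for q in oio)
    sxy = sum((q - mean_q) * (l - mean_l) for q, l in zip(oio, latency))
    m = sxy / sxx
    c = mean_l - m * mean_q
    return m, c


def peak_throughput(m):
    """Peak throughput = 1/m, in IOs per unit of the latency measurement."""
    return 1.0 / m


# Hypothetical samples: latency in ms measured at several queue depths.
oio = [1, 2, 4, 8]
latency_ms = [2.5, 3.0, 4.0, 6.0]
m, c = fit_lq(oio, latency_ms)
print(m, c, peak_throughput(m))
```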


Workload injector: run during idle periods to generate performance models periodically.

Compute the slope in the model using least squares.

Fine-grained stats are measured periodically per virtual disk and collected by a manager server, which computes percentiles of the workload metrics (OIOs, IO size, read % and randomness).

Performance Model Generation


Therefore

overall mix of multiple

Dependence on IO workload


IO load balancing: equalize latency across devices.

Ignore moves where ‘median’ latencies are lower at source vs destination.

Cost-Benefit Analysis: Migration occurs based on estimates of both cost and benefits of migration from source to a destination.

Performance-based capacity planning (how many workloads on a given device)

Pesto improved throughput by 10% and reduced peak latency by up to 19%.

Load balancing


Basil and Pesto mainly use heuristics based on empirical studies.

Romano constructs workload specific performance models of storage devices automatically.

Explicitly considers multiple heterogeneous workloads on a single storage device.

Motivation


High level View of Romano


How to model storage performance to incorporate workload/device dependence as well as inherent noise?

A storage system’s performance P depends on both the system S and the workload W: P = fP (S,W)

Using linear regression:

P = MS * MW + ε = MP + ε

Where P is average IO latency and MP is the desired performance model.

Approach: Measure P to determine MW, then solve for MS.

Romano Performance Model


Workload characteristics used to derive MW:

Space of experiments TOO BIG.

Romano uses sampling.

Derive MP = MW*MS

IO Workspace


How to aggregate workloads so that the total work required can be predicted?

Outstanding IO (O): addition

Read Ratio (R) and Size (Z): weighted (by throughput T) average based on ratio of requests.

Randomness (X): trickier, function of seeks within and across virtual disks, proportionally weighted by throughput (fX).

Romano Aggregation Model
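The first two aggregation rules can be sketched as follows: outstanding IOs add up, while read ratio and IO size are averaged with throughput as the weight. Randomness is left out because the function fX is not given on the slide. The field names (`oio`, `read_ratio`, `size_kb`, `throughput`) and the function name are assumptions for illustration.

```python
def aggregate_workloads(workloads):
    """Combine per-workload characteristics into one aggregate workload.

    Each workload is a dict with keys: oio, read_ratio, size_kb, throughput.
    """
    total_t = sum(w["throughput"] for w in workloads)

    def weighted(key):
        # Throughput-weighted average of the given characteristic.
        return sum(w[key] * w["throughput"] for w in workloads) / total_t

    return {
        "oio": sum(w["oio"] for w in workloads),  # O: simple addition
        "read_ratio": weighted("read_ratio"),     # R: weighted by T
        "size_kb": weighted("size_kb"),           # Z: weighted by T
    }


workloads = [
    {"oio": 4, "read_ratio": 1.0, "size_kb": 8, "throughput": 300},
    {"oio": 2, "read_ratio": 0.0, "size_kb": 64, "throughput": 100},
]
print(aggregate_workloads(workloads))
```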


How does Romano load-balance heterogeneous workloads and systems?

Uses simulated annealing to optimize the global state rather than a single storage migration.

Minimize mean and maximum latency.

Load Balancing metric of Merit (M):

α controls weight of max latency.

Proactive: continuously adjusting global state

Romano Load balancer


Romano Load Balancer


A performance predictor with workload and storage device characteristics as input.

Storage performance needs to deal with heterogeneity: different workloads interact differently with different devices.

Load balancing is not a bin-packing problem; Romano uses a probabilistic approach to find a pseudo-optimal solution.

Romano Summary


Adopt a holistic approach.

Focus on interdependencies between resources:

Storage and database caches

Storage bandwidth

Overall goal:

Effective resource partitioning for database servers running on virtual storage

Approach:

Per-application performance model based on minimal statistics

Experimental sampling and statistical interpolation for refinement

Data Center Infrastructure


Simple Motivating Example


Goal: Partition m resources among n applications to minimize the sum of the data access latencies of all applications.

Dynamic Multi-Resource Allocation


Model cache hierarchy as a single-level cache.

Uncoordinated LRU cache hierarchy: max size

Coordinated Demote cache hierarchy: sum of sizes.

Model disk as a closed-loop system (interactive and constant number of users): rate of serviced requests roughly equals rate of incoming requests.

Empirically derive baseline latency (full disk access) for an application: Ld(1)

Given an allocation of disk (ρd), model the expected latency:

Ld(ρd) = Ld(1)/ρd

Per-Application Performance Model
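The disk part of the model, Ld(ρd) = Ld(1)/ρd, can be combined with the goal of minimizing the sum of latencies in a small brute-force search. The sketch below covers only disk bandwidth, not the cache model, and tries every split of the disk into `steps` equal units. The function names and the discretization are assumptions, not the method used in the work being summarized.

```python
from itertools import product


def disk_latency(base_latency, rho):
    """Expected latency for disk share rho, given the full-disk baseline Ld(1)."""
    return base_latency / rho


def best_split(base_latencies, steps=10):
    """Try all splits of the disk into `steps` units; return (total, shares)."""
    n = len(base_latencies)
    best = None
    for units in product(range(1, steps + 1), repeat=n):
        if sum(units) != steps:
            continue  # shares must add up to the whole disk
        total = sum(
            disk_latency(base, u / steps)
            for base, u in zip(base_latencies, units)
        )
        if best is None or total < best[0]:
            best = (total, tuple(u / steps for u in units))
    return best


# Hypothetical baselines: app 1 is four times slower than app 2 at full disk.
print(best_split([4.0, 1.0], steps=3))
```

The search is exponential in the number of applications, so it only serves to illustrate the objective for a handful of applications.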


Many simplifying assumptions introduce inaccuracies.

Fine Tuning:

Provides an end-to-end multi-resource allocation method based on a unified resource-to-performance model.

Putting it all together


Virtual Storage