: tiering storage for data analytics in the cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10...

67
: Tiering Storage for Data Analytics in the Cloud Yue Cheng , M. Safdar Iqbal , Aayush Gupta , Ali R. Butt Virginia Tech , IBM Research – Almaden

Upload: others

Post on 21-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

: Tiering Storage for Data Analytics in the Cloud Yue Cheng★, M. Safdar Iqbal★, Aayush Gupta†, Ali R. Butt★

Virginia Tech★, IBM Research – Almaden†

Page 2: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

2

Cloud enables cost-efficient data analytics

Cloud infrastructure

Google Cloud BigTable

Google BigQuery

Amazon EMR

Microsoft Azure HDInsight

Sahara

Page 3: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

3

Cloud storage enables data analytics in the cloud

Cloud infrastructure

Google Cloud BigTable

Google BigQuery

Amazon EMR

Microsoft Azure HDInsight

Sahara

Page 4: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

4

Vast variety of cloud storage services

Page 5: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

5

Vast variety of cloud storage services

Object storage

Page 6: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

6

Vast variety of cloud storage services

Object storage

Network-attached block storage

Page 7: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

7

Vast variety of cloud storage services

Object storage

Network-attached block storage

VM-local ephemeral storage

Page 8: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

8

Heterogeneity in cloud storage services

Storage type

Capacity (GB/volume)

Throughput (MB/sec)

IOPS (4KB)

Cost ($/month)

ephSSD 375 733 100000 0.218×375

persSSD 100 48 3000 0.17×100

250 118 7500 0.17×250

500 234 15000 0.17×500

persHDD 100 20 150 0.04×100

250 45 375 0.04×250

500 97 750 0.04×500

objStore N/A 265 550 0.026/GB

ephSSD: VM-local ephemeral SSD, persSSD: Network-attached persistent SSD, persHDD: Network-attached persistent HDD, objStore: Google cloud object storage

Page 9: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

9

Heterogeneity in cloud storage services

Storage type

Capacity (GB/volume)

Throughput (MB/sec)

IOPS (4KB)

Cost ($/month)

ephSSD 375 733 100000 0.218×375

persSSD 100 48 3000 0.17×100

250 118 7500 0.17×250

500 234 15000 0.17×500

persHDD 100 20 150 0.04×100

250 45 375 0.04×250

500 97 750 0.04×500

objStore N/A 265 550 0.026/GB

ephSSD: VM-local ephemeral SSD, persSSD: network-attached persistent SSD, persHDD: network-attached persistent HDD, objStore: Google cloud object storage ephSSD offers best performance w/o data persistence.

Page 10: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

10

Heterogeneity in cloud storage services

Storage type

Capacity (GB/volume)

Throughput (MB/sec)

IOPS (4KB)

Cost ($/month)

ephSSD 375 733 100000 0.218×375

persSSD 100 48 3000 0.17×100

250 118 7500 0.17×250

500 234 15000 0.17×500

persHDD 100 20 150 0.04×100

250 45 375 0.04×250

500 97 750 0.04×500

objStore N/A 265 550 0.026/GB

ephSSD: VM-local ephemeral SSD, persSSD: network-attached persistent SSD, persHDD: network-attached persistent HDD, objStore: Google cloud object storage Performance of the network-attached block storage depends on

the size of the volume.

Page 11: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

11

Heterogeneity in cloud storage services

Storage type

Capacity (GB/volume)

Throughput (MB/sec)

IOPS (4KB)

Cost ($/month)

ephSSD 375 733 100000 0.218×375

persSSD 100 48 3000 0.17×100

250 118 7500 0.17×250

500 234 15000 0.17×500

persHDD 100 20 150 0.04×100

250 45 375 0.04×250

500 97 750 0.04×500

objStore N/A 265 550 0.026/GB

ephSSD: VM-local ephemeral SSD, persSSD: network-attached persistent SSD, persHDD: network-attached persistent HDD, objStore: Google cloud object storage objStore provides the cheapest service and offers comparable

sequential throughput compared to that of a 500GB persSSD.

Page 12: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

12

Heterogeneity in data analytics jobs

Application I/O-intensive CPU-intensive

Map Shuffle Reduce

Sort ✗ ✔ ✗ ✗

Join ✗ ✔ ✔ ✗

Grep ✔ ✗ ✗ ✗

KMeans ✗ ✗ ✗ ✔

Page 13: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

13

Decision paralysis

Cloud tenant

How do I get the MOST BANG-for-the-

buck?

Highly heterogeneous cloud storage

services

Highly heterogeneous

analytics workloads

Page 14: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

14

Motivation

¤ A need for a comprehensive experimental analysis ¤  To study the analytics-job to cloud-storage

relationships

¤ How to exploit heterogeneity in cloud storage and analytics workloads ¤  To reduce $ cost ¤  To improve performance ¤  To meet the deadline

Page 15: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

15

Outline

Motivation

Quantitative analysis

CAST design

Evaluation

Page 16: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

16

Outline

Motivation

Quantitative analysis

CAST design

Evaluation

Page 17: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

17

Experimental study methodology

¤ Experiments on Google Cloud ¤ One n1-standard-16 VM (16 vCPUs, 60GB RAM)

Application I/O-intensive CPU-intensive

Map Shuffle Reduce

Sort ✗ ✔ ✗ ✗

Join ✗ ✔ ✔ ✗

Grep ✔ ✗ ✗ ✗

KMeans ✗ ✗ ✗ ✔

Page 18: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

18

Experimental study methodology

¤ Experiments on Google Cloud ¤ One n1-standard-16 VM (16 vCPUs, 60GB RAM)

¤ Application granularity

¤ Workload granularity

Application I/O-intensive CPU-intensive

Map Shuffle Reduce

Sort ✗ ✔ ✗ ✗

Join ✗ ✔ ✔ ✗

Grep ✔ ✗ ✗ ✗

KMeans ✗ ✗ ✗ ✔

Page 19: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

19

Application granularity: Performance

0

500

1000

1500

2000

2500

3000

Runt

ime

(se

c)

0

200

400

600

800

1000

1200

1400

1600

0

200

400

600

800

1000

1200

1400

1600

1800

0

200

400

600

800

1000

1200

1400

1600

Sort Join Grep KMeans Input download Output upload Processing

Page 20: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

20

Application granularity: Performance

0

500

1000

1500

2000

2500

3000

Runt

ime

(se

c)

0

200

400

600

800

1000

1200

1400

1600

0

200

400

600

800

1000

1200

1400

1600

1800

0

200

400

600

800

1000

1200

1400

1600

Sort Join Grep KMeans Input download Output upload Processing

No storage service provides the best raw performance

Page 21: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

21

Application granularity: Tenant utility

0

0.2

0.4

0.6

0.8

1

1.2

0

500

1000

1500

2000

2500

3000

Runt

ime

(se

c)

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

0

200

400

600

800

1000

1200

1400

1600

0

0.5

1

1.5

2

2.5

3

0

200

400

600

800

1000

1200

1400

1600

1800

0

0.5

1

1.5

2

2.5

3

0

200

400

600

800

1000

1200

1400

1600

No

rma

lize

d te

nant

util

ity

Tenant utility = 1T$

Sort Join Grep KMeans Input download Output upload Processing Tenant utility

Performance

$ cost

Page 22: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

22

Application granularity: Tenant utility

0

0.2

0.4

0.6

0.8

1

1.2

0

500

1000

1500

2000

2500

3000

Runt

ime

(se

c)

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

0

200

400

600

800

1000

1200

1400

1600

0

0.5

1

1.5

2

2.5

3

0

200

400

600

800

1000

1200

1400

1600

1800

0

0.5

1

1.5

2

2.5

3

0

200

400

600

800

1000

1200

1400

1600

No

rma

lize

d te

nant

util

ity

Sort Join Grep KMeans Input download Output upload Processing Tenant utility

Slower storage, in some case, may provide higher utility & comparable performance

Page 23: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

23

Workload granularity: Data reuse

0

0.2

0.4

0.6

0.8

1

1.2

1.4

No

rma

lize

d te

nant

util

ity

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

0

0.5

1

1.5

2

2.5

3

0

0.5

1

1.5

2

2.5

3

Sort Join Grep KMeans No reuse Reuse-lifetime (1-hr) Reuse-lifetime (1-week)

Tenant utility = 1T$

Performance

$ cost

Page 24: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

24

Workload granularity: Data reuse

0

0.2

0.4

0.6

0.8

1

1.2

1.4

No

rma

lize

d te

nant

util

ity

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

0

0.5

1

1.5

2

2.5

3

0

0.5

1

1.5

2

2.5

3

Sort Join Grep KMeans No reuse Reuse-lifetime (1-hr) Reuse-lifetime (1-week)

Data reuse patterns affect data placement choices

Page 25: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

25

Workload granularity: Inter dependency

Page 26: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

26

Workload granularity: Inter dependency

2

2.2

2.4

2.6

2.8

3

3.2

7000 8000 9000 10000 11000 12000

Co

st (

$)

Total runtime (sec)

objStore

objStore objStore

objStore

deadline

objStore

Page 27: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

27

Workload granularity: Inter dependency

2

2.2

2.4

2.6

2.8

3

3.2

7000 8000 9000 10000 11000 12000

Co

st (

$)

Total runtime (sec)

persSSD

persSSD

persSSD

persSSD

deadline

objStore persSSD

Page 28: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

28

Workload granularity: Inter dependency

2

2.2

2.4

2.6

2.8

3

3.2

7000 8000 9000 10000 11000 12000

Co

st (

$)

Total runtime (sec)

objStore

objStore ephSSD

ephSSD

objStore

deadline

objStore persSSD

objStore+ephSSD

Page 29: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

29

Workload granularity: Inter dependency

2

2.2

2.4

2.6

2.8

3

3.2

7000 8000 9000 10000 11000 12000

Co

st (

$)

Total runtime (sec)

objStore

objStore ephSSD

persSSD

deadline

objStore persSSD

objStore+ephSSD

objStore+ephSSD+persSSD

Page 30: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

30

Workload granularity: Inter dependency

2

2.2

2.4

2.6

2.8

3

3.2

7000 8000 9000 10000 11000 12000

Co

st (

$)

Total runtime (sec)

objStore

objStore ephSSD

persSSD

deadline

objStore persSSD

objStore+ephSSD

objStore+ephSSD+persSSD

Complex inter-job dependencies require rethinking about use of multiple storage services

Page 31: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

31

CAST: Cloud Analytics Storage Tiering

Different cloud storage services

Heterogeneity in analytics workloads

...

Different application characteristics

Data reuse across jobs Inter-job dependency

...

CAST exploits

Page 32: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

32

CAST: Cloud Analytics Storage Tiering

Different cloud storage services

...

Different application characteristics

Data reuse across jobs Inter-job dependency

...

CAST

Heterogeneity in analytics workloads

exploits

Page 33: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

33

Outline

Motivation

Quantitative analysis

CAST design

Evaluation

Page 34: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

34

CAST framework

Job performance estimator

Tiering solver

Job profiles

Analytics workload spec’s

Job-storage relationships

Cloud service spec’s

Tenant goals

Frameworks

Dryad ...

...

Page 35: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

35

CAST framework

Job performance estimator

Tiering solver

Job profiles

Analytics workload spec’s

Job-storage relationships

Cloud service spec’s

Tenant goals

Frameworks

Dryad ...

...

Page 36: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

36

CAST framework

Job performance estimator

Tiering solver

Job profiles

Analytics workload spec’s

Job-storage relationships

Cloud service spec’s

J0: <S0,C0> J1: <S1,C1>

...

objStore persSSD ephSSD persHDD

Generated job assignment tuples

Tier the cloud storage, execute the workload

Frameworks

Dryad ...

Tenant goals

Page 37: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

37

Job performance estimator Job performance estimator

Tiering solver

EST R̂, M̂ (si,Li )( ) = mnvm ⋅mc

"

##

$

%%⋅

inputim

bwmapsi

&

'

(((

)

*

++++

rnvm ⋅ rc

"

##

$

%%⋅interi

rbwshuffle

si

&

'

(((

)

*

+++

+r

nvm ⋅ rc

"

##

$

%%⋅

ouputir

bwreducesi

&

'

(((

)

*

+++

Map phase Shuffle phase

Reduce phase

# waves Runtime/wave

* MRCute: Bazaar [SoCC’12]

Page 38: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

38

Tiering solver Job performance estimator

Tiering solver

¤ Optimization ¤ Objective function

max Tenant utility =1T

($vm +$store )

Page 39: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

39

Tiering solver Job performance estimator

Tiering solver

¤ Optimization ¤ Objective function

¤ Constraints

=1T

($vm +$store )max Tenant utility

ci ≥ (Ii +Mi +Oi ) (∀i ∈ J )

T = REG si,C[si ], R̂, L̂i( )i=1

J

∑ , where si ∈ F

$vm = nvm ⋅ (Pvm ⋅T )

$store = C[ f ]⋅ (Pstore[ f ]⋅ T 60"#

$%)( )

f

F

Page 40: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

40

Tiering solver Job performance estimator

Tiering solver

¤ Optimization ¤ Objective function

¤ Constraints

=1T

($vm +$store )max Tenant utility

ci ≥ (Ii +Mi +Oi ) (∀i ∈ J )

T = REG si,C[si ], R̂, L̂i( )i=1

J

∑ , where si ∈ F

$vm = nvm ⋅ (Pvm ⋅T )

$store = C[ f ]⋅ (Pstore[ f ]⋅ T 60"#

$%)( )

f

F

Space capacity constraint

Page 41: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

41

Tiering solver Job performance estimator

Tiering solver

¤ Optimization ¤ Objective function

¤ Constraints

=1T

($vm +$store )max Tenant utility

ci ≥ (Ii +Mi +Oi ) (∀i ∈ J )

T = REG si,C[si ], R̂, L̂i( )i=1

J

∑ , where si ∈ F

$vm = nvm ⋅ (Pvm ⋅T )

$store = C[ f ]⋅ (Pstore[ f ]⋅ T 60"#

$%)( )

f

F

Total workload runtime

Page 42: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

42

Tiering solver Job performance estimator

Tiering solver

¤ Optimization ¤ Objective function

¤ Constraints

=1T

($vm +$store )max Tenant utility

ci ≥ (Ii +Mi +Oi ) (∀i ∈ J )

T = REG si,C[si ], R̂, L̂i( )i=1

J

∑ , where si ∈ F

$vm = nvm ⋅ (Pvm ⋅T )

$store = C[ f ]⋅ (Pstore[ f ]⋅ T 60"#

$%)( )

f

F

VM $ cost

Page 43: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

43

Tiering solver Job performance estimator

Tiering solver

¤ Optimization ¤ Objective function

¤ Constraints

=1T

($vm +$store )max Tenant utility

ci ≥ (Ii +Mi +Oi ) (∀i ∈ J )

T = REG si,C[si ], R̂, L̂i( )i=1

J

∑ , where si ∈ F

$vm = nvm ⋅ (Pvm ⋅T )

$store = C[ f ]⋅ (Pstore[ f ]⋅ T 60"#

$%)( )

f

F

∑ Storage $ cost

Page 44: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

44

Tiering solver Job performance estimator

Tiering solver

¤ Optimization ¤ Objective function

¤ Constraints

=1T

($vm +$store )max Tenant utility

ci ≥ (Ii +Mi +Oi ) (∀i ∈ J )

T = REG si,C[si ], R̂, L̂i( )i=1

J

∑ , where si ∈ F

$vm = nvm ⋅ (Pvm ⋅T )

$store = C[ f ]⋅ (Pstore[ f ]⋅ T 60"#

$%)( )

f

F

Tuning knob: Capacity of Ji Tuning knob:

Storage service of Ji

J0: <s0,c0> J1: <s1,c1> J2: <s2,c2> ...

Assigned job storage, adjusted storage capacity

Simulated annealing

Page 45: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

45

Enhancements: CAST++

¤ Enhancement 1: Data reuse awareness ¤ All jobs sharing the same dataset have the same

storage service assigned to them

¤ Enhancement 2: Workflow awareness ¤ Objective

¤ Constraints

Depth-first traversal in workflow DAG for allocating storage capacities

min $total = $vm +$store

T ≤ deadline

Page 46: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

46

Outline

Motivation

Quantitative analysis

CAST design

Evaluation

Page 47: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

47

Methodology

¤ 400-core Hadoop cluster in Google Cloud ¤  25 n1-standard-16 VM (16 vCPUs, 60GB RAM)

¤ Tenant utility measurement ¤ CAST: Effectiveness for general workloads

¤ CAST++: Effectiveness for data reuse

¤ Meeting workflow deadlines with CAST++

Page 48: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

48

Tenant utility improvement

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

Greedy exact-fit

Greedy over-prov

CAST

CAST++

0

0.2

0.4

0.6

0.8

1

1.2

1.4

No

rma

lize

d te

nant

util

ity

Deployment configuration

100-job Hadoop workload, simulating behaviors of Facebook’s 3000-machine Hadoop cluster

Page 49: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

49

Tenant utility improvement

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

Greedy exact-fit

Greedy over-prov

CAST

CAST++

0

0.2

0.4

0.6

0.8

1

1.2

1.4

No

rma

lize

d te

nant

util

ity

Deployment configuration

100-job Hadoop workload, simulating behaviors of Facebook’s 3000-machine Hadoop cluster

No tiering involved

Page 50: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

50

Tenant utility improvement

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

Greedy exact-fit

Greedy over-prov

CAST

CAST++

0

0.2

0.4

0.6

0.8

1

1.2

1.4

No

rma

lize

d te

nant

util

ity

Deployment configuration

100-job Hadoop workload, simulating behaviors of Facebook’s 3000-machine Hadoop cluster

Greedy algo’s

Page 51: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

51

Tenant utility improvement

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

Greedy exact-fit

Greedy over-prov

CAST

CAST++

0

0.2

0.4

0.6

0.8

1

1.2

1.4

No

rma

lize

d te

nant

util

ity

Deployment configuration

100-job Hadoop workload, simulating behaviors of Facebook’s 3000-machine Hadoop cluster

CAST tiering

Page 52: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

52

Tenant utility improvement

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

Greedy exact-fit

Greedy over-prov

CAST

CAST++

0

0.2

0.4

0.6

0.8

1

1.2

1.4

No

rma

lize

d te

nant

util

ity

Deployment configuration

100-job Hadoop workload, simulating behaviors of Facebook’s 3000-machine Hadoop cluster

14%

Page 53: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

53

$ cost vs. runtime

0

50

100

150

200

250

300

350

400

450

0

50

100

150

200

250

300

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

Greedy exact-fit

Greedy over-prov

CAST CAST++

Runt

ime

(m

in)

Co

st (

$)

Cost Runtime

100-job Hadoop workload, simulating behaviors of Facebook’s 3000-machine Hadoop cluster

Page 54: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

54

$ cost vs. runtime

0

50

100

150

200

250

300

350

400

450

0

50

100

150

200

250

300

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

Greedy exact-fit

Greedy over-prov

CAST CAST++

Runt

ime

(m

in)

Co

st (

$)

Cost Runtime

100-job Hadoop workload, simulating behaviors of Facebook’s 3000-machine Hadoop cluster

Page 55: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

55

$ cost vs. runtime

0

50

100

150

200

250

300

350

400

450

0

50

100

150

200

250

300

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

Greedy exact-fit

Greedy over-prov

CAST CAST++

Runt

ime

(m

in)

Co

st (

$)

Cost Runtime

100-job Hadoop workload, simulating behaviors of Facebook’s 3000-machine Hadoop cluster

Page 56: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

56

Meeting workflow deadlines

0

0.2

0.4

0.6

0.8

1

1.2

0

10

20

30

40

50

60

70

80

90

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

CAST CAST++

De

ad

line

mis

s ra

te

Co

st (

$)

Cost Deadline misses

A workload consisting of 5 workflows, with a total of 31 analytics jobs

Page 57: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

57

Meeting workflow deadlines

0

0.2

0.4

0.6

0.8

1

1.2

0

10

20

30

40

50

60

70

80

90

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

CAST CAST++

De

ad

line

mis

s ra

te

Co

st (

$)

Cost Deadline misses

A workload consisting of 5 workflows, with a total of 31 analytics jobs

Page 58: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

58

Meeting workflow deadlines

0

0.2

0.4

0.6

0.8

1

1.2

0

10

20

30

40

50

60

70

80

90

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

CAST CAST++

De

ad

line

mis

s ra

te

Co

st (

$)

Cost Deadline misses

A workload consisting of 5 workflows, with a total of 31 analytics jobs

Page 59: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

59

Conclusion

¤ CAST performs storage allocation and data placement for cloud analytics workloads ¤  Leverages performance and pricing models of cloud

storage services ¤  Leverages analytics workload heterogeneity

¤ CAST++ enhancements detect data reuse and inter-job dependencies ¤  To further improve tenant utility

¤  To effectively meet deadlines while minimizing $ cost

Page 60: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

60

http://research.cs.vt.edu/dssl/

Yue Cheng M. Iqbal Safdar Aayush Gupta Ali R. Butt

Page 61: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

61

Page 62: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

62

Backup Slides

Page 63: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

63

Heterogeneity in cloud storage services

Storage type

Capacity (GB/volume)

Throughput (MB/sec)

IOPS (4KB)

Cost ($/month)

ephSSD 375 733 100000 0.218×375

persSSD 100 48 3000 0.17×100

250 118 7500 0.17×250

500 234 15000 0.17×500

persHDD 100 20 150 0.04×100

250 45 375 0.04×250

500 97 750 0.04×500

objStore N/A 265 550 0.026/GB

ephSSD: VM-local ephemeral SSD, persSSD: network-attached persistent SSD, persHDD: network-attached persistent HDD, objStore: Google cloud object storage A 500GB persSSD provides 1.4X higher throughput & 19X higher

IOPS than a 500GB persHDD.

Page 64: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

64

Straggler issue in fine-grained tiering

0%

100%

200%

300%

400%

500%

ephSSD 100% persSSD 100% persHDD 100% ephSSD 50%persSSD 50%

ephSSD 50%persHDD 50%

Nor

mal

ized

runt

ime

0%

100%

200%

300%

400%

500%

0% 30% 70% 90% 100%

Nor

mal

ized

runt

ime

%-age data stored in ephSSD

persHDD 100% ephSSD 30%persHDD 70%

ephSSD 70%persHDD 30%

ephSSD 90%persHDD 10%

ephSSD 100%

Page 65: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

65

Tiering solver Job performance estimator

Tiering solver

¤ Optimization ¤ Objective function

¤ Constraints

=1T

($vm +$store )max Tenant utility

ci ≥ (Ii +Mi +Oi ) (∀i ∈ J )

T = REG si,C[si ], R̂, L̂i( )i=1

J

∑ , where si ∈ F

$vm = nvm ⋅ (Pvm ⋅T )

$store = C[ f ]⋅ (Pstore[ f ]⋅ T 60"#

$%)( )

f

F

Tuning knob: capacity of Ji

Input size of Ji

Intermediate data size of Ji

Output size of Ji

Tuning knob: Storage service of Ji

Page 66: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

66

Hadoop traces from Facebook

¤ More than 99% of data touched by large jobs that incur most of the storage cost

¤ The aggregated data size for small jobs is only 0.1% of the total dataset size

¤ We focus on large jobs that have enough # mappers & reducers to fully utilize the cluster computing capacity

Page 67: : Tiering Storage for Data Analytics in the Cloudyuecheng/docs/hpdc15_talk.pdf · 2017-08-21 · 10 Heterogeneity in cloud storage services Storage type Capacity (GB/volume) Throughput

67

Storage capacity/service breakdown

0

0.2

0.4

0.6

0.8

1

1.2

ephSSD 100%

persSSD 100%

persHDD 100%

objStore 100%

Greedy exact-fit

Greedy over-prov

CAST CAST++

No

rma

lize

d c

ap

ac

ity

ephSSD persSSD persHDD objStore