jetstream: cluster-scale parallelization of …...mongodb 5.2 9 117 openoffice 7.0 10 32 firefox...

JetStream:Cluster-scaleParallelizationofInformationFlowQueries

AndrewQuinn,DavidDevecsery,PeterChenandJasonFlinn

• DIFT instruments execution to track causality• Also known as “Taint-Tracking”

Dynamic Information Flow Tracking

Sources (Inputs)

Sinks (Outputs)

DIFT for Debugging

o1Server

DIFT for Debugging

Outputs

Inputs

Server

DIFT for Debugging

Outputs

Inputs

Server

DIFT for Debugging

Outputs

Inputs

Server

DIFT for Debugging

Outputs

Inputs

Server

DIFT – limitations

Arnold ‘14 Overheads ~100x

X-Ray ‘12 Long queries

TaintDroid ‘10 No native code

Backtracker ‘03 Coarse-grained causality

Parallelize DIFT

Parallelizing DIFT is HARD

A = read()B = read()C = A + BD = X + Y E = CB = 0Z = A[D]F = E write(F)

SequentialDependencies

Speck (ASPLOS ‘08)

Parallel Lifeguards (ASPLOS ‘08)

Speck (ASPLOS ‘08)

Parallel Lifeguards (APSLOS ‘08)

“EmbarrassinglySequential”- Ruwase etal.

JetStream

Aggregation – pipeline parallelism

Local DIFT – epoch parallelism 2xà21x

Fasterthanoriginalexecution!

• Design of JetStream• Local DIFT• Aggregation

• Evaluation

Outline

Debugging Query

Outputs

Inputs

Local DIFT

Outputs

Inputs

Time slice execution into Epochs

Local DIFT

Outputs

Inputs

Leverage Record and Replay to calculate DIFT in parallel

Local DIFT

Outputs

Inputs

Track mapping between all intermediate locations

Local DIFT

Outputs

Inputs

Mapping is too expensive to calculate:• log operations• defer calculating relationships until aggregation

D = B + C C = A[D]

• Fast Forward: replay execution until start of epoch• Analysis: log operations using a graph

CAFast Forward

Analysis

CA B D

D = B + C C = A[D]

CAFast Forward

Analysis

CA B D

D = B + C C = A[D]

CAFast Forward

Analysis

CA B D

Local DIFT output

Outputs

Inputs

JetStreamLocal DIFT – epoch parallelism

Aggregation

Outputs

Inputs

Calculate paths between source and sinks• Many nodes are not on path between source and sink• Use sequential information to prune work

Forward Pass

Outputs

Inputs

Pass locations which are derived from a source

derivedlocations

Forward Pass

Outputs

Inputs

Pass locations which are derived from a source

derivedlocations

Backward Pass

Outputs

Inputs

usedlocations

Pass locations which are used by a sink

Backward Pass

Outputs

Inputs

usedlocations

Pass locations which are used by a sink

In the paper:• Insights about why naïve approaches fail• Partitioning – challenging to predict the local DIFT time • Pre-pruning – garbage collection of the graph

JetStream

• Evaluation

Outline

• CloudLab cluster of 32 machines, 1–128 cores

Experimental Setup

Benchmark Sequential DIFT Time (Minutes)

Sources(millions)

Sinks(millions)

Ghostscript 1.3 3 0.2Gzip 1.8 64 488Evince 3.9 10 104Nginx 3.3 10 35Mongodb 5.2 9 117OpenOffice 7.0 10 32Firefox 30.6 0.9 2

Benchmark Sequential DIFT Time (Minutes)

Sources(millions)

Sinks(millions)

Ghostscript 1.3 3 0.2Gzip 1.8 64 488Evince 3.9 10 104Nginx 3.3 10 35Mongodb 5.2 9 117OpenOffice 7.0 10 32Firefox 30.6 0.9 2

• CloudLab cluster of 32 machines, 1–128 cores

Experimental Setup

sources:Cookiessinks:suspiciousconnections

sources:homedirectorysinks:all

• Unexpected analysis:• prioritize low record overhead

• Expected analysis• periodic checkpoint• gather partitioning stats

Two Different Scenarios

1 10 100

Normalize

NumberofCoresGzip Ghostscript Evince MongodbNginx OpenOffice Firefox Ideal

mean:13x

Scalability of Unexpected Analysis

1 10 100

Normalize

dSpeedu

NumberofCoresGzip Ghostscript Evince MongodbNginx OpenOffice Firefox Ideal

mean:21x

Scalability of Expected Analysis

JetStream

Local DIFT – epoch parallelism 2xà21x

Fasterthanoriginalexecution!

Questions

Related Work

Epoch Parallelism• Wallace and Hazelwood, “SuperPin: Parallelizing

Dynamic Instrumentation for Real-Time Performance”

Local DIFT• Ruwase et al. “Parallelizing Dynamic Information Flow

Tracking”• Nightengale et al. “Parallelizing security checks on

commodity hardware”

Benchmarks

Benchmark Replay Time (seconds)

JetStream Time(seconds)

Ghostscript 1.0 5.6Gzip 3.0 2.3Evince 13.5 19.5Nginx 4.8 5.6Mongodb 22.8 13.8OpenOffice 7.6 25.0Firefox 67.4 94.4

Aggregation Results

BackwardsPass(seconds)

BothPassesTime(seconds)

FastForward Instrumentation Analysis Pre-pruneForwardPass Prune BackwardPass

0 20 40 60 80 100 120

GzipFirstQuery

0 20 40 60 80 100 120

GzipSecondQuery

FastForward Instrumentation Analysis Pre-pruneForwardPass Prune BackwardPass

OpenOffice

0 20 40 60 80 100 120

OpenOfficeFirstQuery

0 20 40 60 80 100 120

OpenOfficeSecondQuery

jetstream: cluster-scale parallelization of …...mongodb 5.2 9 117 openoffice 7.0 10 32 firefox...

Documents

gige-vision realtime master library documentation (cluster...

design and analysis of a 32-bit embedded high-performance...

dell 32-bit hpcc solutions · web viewdell 32-bit high...

canonical charmed kubernetes on supermicro a+ systems...

updated: 7/13/15 cloudlab. updated: 7/13/15 cloudlab...

dell emc powerstore: virtualization integration...6.5...

application performance optimizations (part 2) ·...

cloudlab overview

32 e réunion cluster wash mali – 25/06/2013 groupe pivot...

cloudlab aditya akella. cloudlab 2 underneath, it’s geni...

902010/32 - bitron...

hpc applications performance and optimizations best...

ciruasland · produk-produk ciruas land adalah cluster gold...

using cluster analysis, cluster validation, and...

arsitagx-master-product.s3-ap-southeast-1.amazonaws.com ·...

the design and operation of cloudlab

design of a cloud computing course...

cluster initiative bavaria - cluster-offensive bayern ·...

32 e réunion cluster wash mali – 25/06/2013

quantum espresso performance benchmark and...