Minimizing Stall Time in Single Disk Susanne Albers, Naveen Garg, Stefano Leonardi, Carsten Witt Presented by Ruibin Xu


Post on 13-Jan-2016



Minimizing Stall Time in Single Disk

Susanne Albers, Naveen Garg, Stefano Leonardi, Carsten Witt

Presented by Ruibin Xu


Introduction

Prefetching and caching are powerful techniques for increasing performance in disk systems

Prefetching: load blocks into the cache before they are actually referenced (each fetch must evict a block at the same time)

Caching: maintain the most frequently accessed blocks in cache


Introduction

Both techniques have been studied extensively, but separately

Now look at them in an integrated manner

Focus on the offline problem


The problem definition

Assume all blocks reside on one disk

The cache size is k

Serving a request takes one time unit

Fetching a block takes F time units

Given a request sequence σ = r1, … , rn, how should the prefetches be scheduled to minimize the total stall time?
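To make the cost model concrete, here is a minimal Python sketch that replays a given prefetching schedule and counts the stall time. The function name `total_stall_time`, the schedule format, and the rule that a fetch scheduled while the disk is busy waits for the running fetch are assumptions made for illustration; they are not taken from the slides.

```python
def total_stall_time(requests, initial_cache, F, schedule):
    """Return the total stall time of a prefetching schedule (illustrative model).

    schedule: list of (start, fetched_block, evicted_block), sorted by start,
    where `start` is the number of requests already served when the fetch begins.
    """
    cache = set(initial_cache)
    pending = list(schedule)
    clock = 0          # elapsed time units
    stall = 0          # time units spent waiting for the disk
    in_flight = None   # (finish_time, block) of the fetch currently running

    def finish_fetch():
        # Wait (if necessary) for the running fetch, then admit its block.
        nonlocal clock, stall, in_flight
        finish, block = in_flight
        if finish > clock:
            stall += finish - clock
            clock = finish
        cache.add(block)
        in_flight = None

    for served, block in enumerate(requests):
        if in_flight and in_flight[0] <= clock:
            finish_fetch()                      # completed in the background
        while pending and pending[0][0] == served:
            _, fetched, evicted = pending.pop(0)
            if in_flight:
                finish_fetch()                  # single disk: fetches run one at a time
            cache.remove(evicted)               # KeyError if the block is not cached
            in_flight = (clock + F, fetched)
        if block not in cache:
            if in_flight is None or in_flight[1] != block:
                raise ValueError(f"block {block!r} is unavailable at request {served + 1}")
            finish_fetch()                      # stall until the block arrives
        clock += 1                              # serving a request takes one time unit
    return stall
```

For a hypothetical sequence a, b, c, d, e with cache {a, b, c, d} and F = 5, starting the fetch of e right after serving a overlaps the fetch with three requests, so the stall is 5 − 3 = 2; starting it only when e is requested costs the full 5 units.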


An example

•k = 4

•F = 5

•Blocks a, b, c and d are initially in the cache

The minimum stall time is 3


Big question

Cao et al. designed a 2-approximation algorithm.

Can this problem be solved exactly in polynomial time?

Yes, this paper answers this question


The idea

Use linear programming

At first thought, we need to prove that the optimum solution is integral by arguing that all vertices of the corresponding polytope are integral

•By showing that the constraint matrix is totally unimodular (e.g., bipartite matching)

•By a combinatorial argument (e.g., matching and matroid polytopes)


Main novelty

At second thought, the polytope of the LP for this problem has nonintegral vertices

Now if we can show that any solution to the LP can be written as a convex combination of (polynomially many) integral solutions, ……
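One standard way to see why such a decomposition is enough is an averaging argument. It is stated here as general LP reasoning, not as the slide's wording; the symbols $x^{*}$, $x^{i}$, $\lambda_i$ and $c$ are notation introduced for this note.

```latex
x^{*} = \sum_{i=1}^{m} \lambda_i x^{i}, \qquad \lambda_i \ge 0, \qquad \sum_{i=1}^{m} \lambda_i = 1
\quad\Longrightarrow\quad
c(x^{*}) = \sum_{i=1}^{m} \lambda_i \, c(x^{i}) \;\ge\; \min_{1 \le i \le m} c(x^{i})
```

Because the objective $c$ is linear, some integral solution $x^{i}$ costs no more than the fractional optimum $x^{*}$. The LP relaxation can never cost more than the best integral solution, so that $x^{i}$ is an optimal integral solution.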


The roadmap

1. Construct the LP

2. Solve the LP

3. Find the convex decomposition to integral solutions


The LP formulation

This is a 0-1 LP

The length of the request sequence is n

The cache size is k

The fetching time is F

The cache initially contains k blocks never requested in the sequence


The variables of the LP

Consider all the intervals of the request sequence of length at most F : interval I = (i, j) of length |I|=j – i – 1, i = 0, … , n-1, j = 1, … , n, i < j
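As a small illustration of this definition, the following Python sketch enumerates the intervals that receive LP variables. The helper name `short_intervals` is hypothetical.

```python
def short_intervals(n, F):
    """All intervals (i, j) with 0 <= i < j <= n and length j - i - 1 <= F."""
    # j - i - 1 <= F is equivalent to j <= i + F + 1.
    return [(i, j)
            for i in range(n)
            for j in range(i + 1, min(n, i + F + 1) + 1)]
```

There are at most n·(F + 1) such intervals, so the number of variables x(I) stays polynomial in the input size.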


The variables of the LP

Associate each interval I with an indicator variable x(I), where x(I) = 1 indicates a prefetch starting after request i and ending before request j, and x(I) = 0 indicates that no prefetch is performed in this interval

With each interval I and distinct block a, associate variables $f_{I,a}$ and $e_{I,a}$: $f_{I,a}$ ($e_{I,a}$) is 1 if block a is fetched (evicted) in interval I, and 0 otherwise


The objective func. of the LP

The prefetch occurring in interval I has a stall time of F − |I|

Thus the objective function is

$$\min \sum_{I} x(I)\,(F - |I|)$$


The constraints of the LP

There are 7 kinds of constraints

A definition: an interval (a, b) is contained in an interval (c, d) if c ≤ a and d ≥ b, denoted by (a, b) ⊆ (c, d)


The 1st constraint

To ensure that two prefetches are not performed simultaneously


The 2nd constraint

For any interval I, the total amount of fetch should be exactly equal to the total amount of eviction, and this value should not exceed x(I), the value of the interval


The 3rd constraint

A block should be in cache when it is referenced

After each reference to a block, the block is in cache. It can then be evicted at most once before the next reference to that block, and if it is, it must also be fetched back prior to that next reference


The 4th and 5th constraint

To ensure that every block is in cache at its first reference, the total fetch of a block on intervals before its first reference should be 1 and the total eviction of the block on these intervals should be 0


The 6th constraint

A block is not evicted for more than 1 unit after its last reference


The last constraint

On each request, the requested block is neither prefetched nor evicted

And all variables take values in {0, 1}
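The formulas on the constraint slides did not survive in this transcript. Reading only the prose descriptions, one plausible way to write the seven constraints is sketched below. This is a reconstruction under stated assumptions, not the slides' own notation: $a_1 < \dots < a_m$ denotes the positions at which block $a$ is referenced, and $\sigma(p)$ is the block requested at position $p$.

```latex
\begin{align*}
&(1)\quad \sum_{I \supseteq (i-1,\,i+1)} x(I) \le 1
  && \text{for every request } i \\
&(2)\quad \sum_{a} f_{I,a} = \sum_{a} e_{I,a} \le x(I)
  && \text{for every interval } I \\
&(3)\quad \sum_{I \subseteq (a_r,\,a_{r+1})} f_{I,a}
        = \sum_{I \subseteq (a_r,\,a_{r+1})} e_{I,a} \le 1
  && \text{for every block } a,\ 1 \le r < m \\
&(4)\quad \sum_{I \subseteq (0,\,a_1)} f_{I,a} = 1
  && \text{for every block } a \\
&(5)\quad \sum_{I \subseteq (0,\,a_1)} e_{I,a} = 0
  && \text{for every block } a \\
&(6)\quad \sum_{I = (i,j),\ i \ge a_m} e_{I,a} \le 1
  && \text{for every block } a \\
&(7)\quad f_{I,a} = e_{I,a} = 0
  && \text{if } I = (i,j) \text{ and } a = \sigma(p) \text{ for some } i < p < j
\end{align*}
```

Constraint (1) uses the containment definition given earlier: an interval contains (i−1, i+1) exactly when a prefetch in it is running while request i is served.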


Solving the LP relaxation

First solve the LP relaxation. If we get an integral solution, we are done.

If not, find the convex combination


Modify the intervals

The goal: to obtain a total order of intervals

An interval I1 = (i1, j1) is properly contained in interval I2 = (i2, j2) iff i1 > i2 and j1 < j2

We want no interval to be properly contained in another interval


Modify the intervals

•For each pair of nested intervals, remove one of them and add two new intervals
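The slide's figure is not preserved, so the exact replacement rule is not visible here. One natural way to carry out this step, assumed for illustration, is an uncrossing move: shift the common amount min(x(I1), x(I2)) from a nested pair I1 = (i1, j1), I2 = (i2, j2) to the two intervals (i2, j1) and (i1, j2). The sketch below handles only the x-values (not the fetch and evict variables), and the names `uncross` and `stall_cost` are hypothetical.

```python
from fractions import Fraction

def uncross(x, inner, outer):
    """Shift weight from a nested pair to two overlapping intervals."""
    (i1, j1), (i2, j2) = inner, outer
    if not (i2 < i1 and j1 < j2):
        raise ValueError("inner must be properly contained in outer")
    delta = min(x[inner], x[outer])     # at least one of the pair drops to 0
    updated = dict(x)
    for interval, change in ((inner, -delta), (outer, -delta),
                             ((i2, j1), delta), ((i1, j2), delta)):
        updated[interval] = updated.get(interval, 0) + change
    return {interval: value for interval, value in updated.items() if value > 0}

def stall_cost(x, F):
    """Objective value: sum of x(I) * (F - |I|) with |I| = j - i - 1."""
    return sum(value * (F - (j - i - 1)) for (i, j), value in x.items())
```

The move keeps the objective unchanged because the two new intervals have the same total length as the nested pair. A single move can create new nested pairs, so it would have to be repeated until none remain.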


Order the intervals

Now we can order the intervals by increasing starting points

If two intervals have the same starting point, they are ordered by increasing end points
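A minimal Python sketch of this ordering, with hypothetical helper names: once no interval is properly contained in another, sorting the (start, end) pairs lexicographically gives the total order, and the end points are then non-decreasing as well.

```python
def properly_contained(inner, outer):
    """True iff inner = (i1, j1) lies strictly inside outer = (i2, j2)."""
    (i1, j1), (i2, j2) = inner, outer
    return i1 > i2 and j1 < j2

def has_nesting(intervals):
    return any(properly_contained(a, b) for a in intervals for b in intervals)

def order_intervals(intervals):
    """Sort by starting point, breaking ties by end point."""
    if has_nesting(intervals):
        raise ValueError("remove nested intervals first")
    return sorted(intervals)    # tuples compare by start, then by end
```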


Properties of the optimum sol.

Let C denote the cache configuration after we have performed the fetches and evicts corresponding to the first i intervals; let I be the (i+1)-st interval

There exists an optimum solution for which the next two claims are satisfied


Properties of the optimum sol.

Claim 1: In interval I, we fetch the block that is not completely in C and whose next reference is earliest

Claim 2: In interval I, we evict the block which is partially or completely in C whose next reference is furthest

Both claims can be proven by contradiction
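For the integral case (every block is either fully in cache or fully out), the two rules can be sketched in Python as follows. The function names and the use of a single `position` (the index of the next request to be served) are simplifying assumptions; the claims on the slide also cover partially cached blocks, which this sketch does not model.

```python
import math

def next_reference(requests, position, block):
    """Index of the next request for `block` at or after `position`."""
    for p in range(position, len(requests)):
        if requests[p] == block:
            return p
    return math.inf             # never referenced again

def choose_fetch_and_evict(requests, position, cache):
    """Claim 1: fetch the missing block needed soonest.
    Claim 2: evict the cached block needed furthest in the future."""
    missing = set(requests[position:]) - set(cache)
    fetch = min(missing,
                key=lambda b: next_reference(requests, position, b),
                default=None)
    evict = max(cache, key=lambda b: next_reference(requests, position, b))
    return fetch, evict
```

For the hypothetical sequence a, b, c, a, d, b, e with a, b, c cached and the first three requests served, the rule fetches d (the earliest missing block) and evicts c (never requested again).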


Properties of the optimum sol.

The amount of fetch of a block prescribed by claim 1 might be less than x(I). In this case, we apply the same rule to fetch another block in I

The same holds for the case of evictions


Another view of the process of fetching/evicting

Define the distance dist(I) of interval I as the total x-value of the intervals ordered before I

View the process of fetching/evicting as a process in time by associating the time interval [dist(I), dist(I)+x(I)) with interval I
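The time view can be sketched in Python. The sketch assumes that dist(I) is the running sum of the x-values of all intervals ordered before I (the slide's formula is not preserved) and that intervals with x(I) = 0 have been dropped; the function names are hypothetical.

```python
from bisect import bisect_right
from fractions import Fraction

def distances(x_values):
    """dist for each interval: sum of the x-values of all earlier intervals."""
    dist, total = [], Fraction(0)
    for x in x_values:
        dist.append(total)
        total += x
    return dist

def interval_at(time, intervals, x_values):
    """Return the interval I with dist(I) <= time < dist(I) + x(I)."""
    dist = distances(x_values)
    k = bisect_right(dist, time) - 1        # last interval starting at or before `time`
    if k < 0 or time >= dist[k] + x_values[k]:
        raise ValueError("time lies outside [0, sum of all x-values)")
    return intervals[k]
```

Because the half-open time intervals tile [0, Σ x(I)) without gaps or overlaps, every time instant maps to exactly one interval.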


Another view of the process of fetching/evicting

There is a unique interval associated with each time instant

Also associate a unique fetch/evict with each time instant


Properties of the optimum sol.

From Claims 1 and 2 and the ordering of fetches/evicts within an interval, it follows that a block a is fetched continuously until it is fully in cache

But the eviction of a could be interrupted before it is completely out of cache


Properties of the optimum sol.

Consider the fetches/evictions of a block a between two consecutive references to a

Lemma 1. Every interruption in the eviction of a is for some integral time units


Properties of the optimum sol.

A block a is partially fetched/evicted if the total extent to which a is fetched/evicted between two consecutive references is strictly less than 1

Lemma 2. If a is partially fetched/evicted, then the fetch of a begins some integral time units after the start of its eviction


Properties of the optimum sol.

Lemma 3. If a is evicted at time t and referenced again, then there is a time t’ = t + i, for some integer i, at which a is fetched back


The convex decomposition

Let t be in the range [0, 1) and let ti = i + t for every integer i, 0 ≤ i ≤ x(I)

Claim 3. Let t1, t2 be two time instants such that t2 = t1 + i for some positive integer i, and let I1, I2 be the intervals associated with these time instants. Then I1 and I2 are disjoint.


The convex decomposition

Lemma 4. For any time t in [0,1), the set of intervals that correspond to ti forms a feasible solution

Note that each solution is obtained not for just one value of t but for a range of values, say for all t in the range [a, b]. We associate a weight b – a in the decomposition.
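The selection of intervals in this decomposition can be sketched in Python. The sketch assumes the time view in which the ordered intervals tile [0, Σ x(I)) by half-open pieces of length x(I), uses exact `Fraction` arithmetic, and picks only the intervals (it does not assign the fetched and evicted blocks). The function name is hypothetical.

```python
from fractions import Fraction
from math import ceil, floor

def convex_decomposition(ordered):
    """ordered: list of (interval, x) in the total order, x given as Fractions.

    Returns (weight, intervals) pairs; the weights sum to 1 and every
    interval I receives total weight x(I).
    """
    starts, total = [], Fraction(0)
    for _, x in ordered:
        starts.append(total)
        total += x
    boundaries = starts + [total]
    # The selected set can only change when t passes the fractional part
    # of some boundary, so these fractional parts cut [0, 1) into ranges.
    cuts = sorted({b - floor(b) for b in boundaries} | {Fraction(0), Fraction(1)})
    pieces = []
    for a, b in zip(cuts, cuts[1:]):
        # For t in [a, b), interval [d, d + x) is hit by some instant i + t
        # exactly when the first such instant at or after d lies before d + x.
        chosen = [interval for (interval, x), d in zip(ordered, starts)
                  if ceil(d - a) + a < d + x]
        pieces.append((b - a, chosen))
    return pieces
```

With four hypothetical intervals (0,2), (1,3), (2,4), (3,5), each of value 1/2, the range [0, 1/2) selects (0,2) and (2,4) and the range [1/2, 1) selects (1,3) and (3,5), each with weight 1/2. Within each selected set the intervals are pairwise disjoint, in line with Claim 3.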


Conclusion

An optimum prefetching/caching schedule for a single disk can be computed in polynomial time


Open problem

The problem can now be solved exactly in polynomial time by using LP. Does there exist a combinatorial, polynomial-time algorithm?

Yes, by using multicommodity network flows


The roadmap

1. Construct the LP

2. Solve the LP

3. Find the convex decomposition to integral solutions

1. Construct the multicommodity network

2. Solve the network


Problem

No combinatorial polynomial-time algorithm for computing a non-integral min-cost multicommodity flow is known

But we know an approximation algorithm: for any ε > 0, δ > 0, the algorithm computes a flow such that a fraction of at least 1 − ε of each demand in the network is satisfied and the cost of the flow is at most (1 + δ) times the optimum


The network

Given a request sequence of length n, construct a network with n+1 commodities

Associate each request σ(i) with a commodity i, which has a source $s_i$, a sink $t_i$ and a demand $d_i = 1$

For each request σ(i), introduce two vertices $x_i$ and $x'_i$


An example network

Sketch of the network for request sequence abcbc and F=2


The problem of previous network

The construction allows a flow algorithm to saturate more than one of the edges that correspond to fetches executed simultaneously

We need to make sure that at most one fetch operation is executed at any time


Solution

Split the “super edge” (si, xj) into several parts and add one more commodity

For any l, 1≤ l ≤ n-1, let [l, l+1) be the time interval starting at the service of σ(l) and ending immediately before the service of σ(l+1)


Solution

For any fixed i and j, with 1 ≤ i ≤ n and $p_i + 1 \le j < i$, introduce vertices $v_{ij}^{l}$ and $w_{ij}^{l}$, where l = j, … , min{j+F, i} − 1

For any fixed i, with 1 ≤ i ≤ n, introduce vertices $v_{ii}^{i-1}$ and $w_{ii}^{i-1}$

How to connect? How to assign cost and capacity?


Solution

Now add the (n+1)-st commodity

Let $f_l$ be the number of prefetches whose execution overlaps with [l, l+1)

Commodity n+1 has a source $s_{n+1}$, a sink $t_{n+1}$ and a demand $d_{n+1}$


Solution

The flow from $s_{n+1}$ to $t_{n+1}$ is routed through the edges ($v_{ij}^{l}$, $w_{ij}^{l}$) and newly introduced “subsinks” $t_{n+1}^{l}$, 1 ≤ l ≤ n−1

How to connect? How to assign cost and weight?


Optimal flows

Any feasible integral flow of cost C in the network corresponds to a feasible prefetching/caching schedule with stall time C for σ, and vice versa

A non-integral flow corresponds to a fractional prefetching/caching schedule


Apply the approximation algo.

Unfortunately, the flow computed by the algorithm does not correspond to a feasible fractional prefetching/caching schedule

It is possible that (1) more than one block is fetched at the same time and (2) blocks are not completely in cache when requested


Apply the approximation algo.

The solution is to choose ε and δ properly and modify the flow

Choose $\varepsilon = 1/(4F^2n^3)$ and $\delta = 1/(3nF)$

Let Φ be the flow returned by the approximation algorithm


Apply the approximation algo.

The flow out of each source $s_i$, i = 1, … , n, is lower bounded by 1 − ε. Moreover, commodity n+1 might lack an amount of $\varepsilon d_{n+1} \le \varepsilon F n^2$

Let $\rho = 1 - \varepsilon - \varepsilon d_{n+1}$, and transform the flow Φ into a uniform flow Φ’ which directs exactly ρ units of flow from $s_i$ to $t_i$
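To get a feel for the magnitudes, the sketch below evaluates the parameters exactly. It assumes that the flattened expressions on the slides read ε = 1/(4F²n³), δ = 1/(3nF) and d_{n+1} ≤ Fn² (the superscripts are lost in this transcript), which gives the bound ρ ≥ 1 − ε − εFn². The function name is hypothetical.

```python
from fractions import Fraction

def approximation_parameters(n, F):
    """Return (eps, delta, lower bound on rho) as exact fractions."""
    eps = Fraction(1, 4 * F**2 * n**3)        # assumed reading of 1/(4F2n3)
    delta = Fraction(1, 3 * n * F)
    rho_lower_bound = 1 - eps - eps * F * n**2   # uses d_{n+1} <= F * n^2
    return eps, delta, rho_lower_bound
```

For example, n = 10 and F = 5 give ε = 1/100000 and ρ ≥ 99499/100000, so the uniform flow Φ’ loses only about half a percent of each unit demand.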


Apply the approximation algo.

The flow Φ’ corresponds to a fractional solution in which all blocks have size ρ and the number of cache slots is upper bounded by k/ ρ

We can interpret the fractional solution corresponding to Φ’ as a convex combination of integral ρ-solutions


Apply the approximation algo.

Let C be the cost of the convex combination of ρ-solutions; we can prove that C ≤ OPT + 1/3

By increasing the block size from ρ to 1, we obtain integral solutions. Let C’ be the cost of the convex combination of these integral solutions; we can prove that C’ < OPT + 1


Apply the approximation algo.

Also, it can be proven that no integral component of the convex combination holds more than k blocks in cache concurrently

Therefore, since stall times are integers and C’ < OPT + 1, the convex combination contains at least one integral solution with optimal cost.


Conclusion

An optimal solution can be computed by a combinatorial algorithm in polynomial time

The running time is $O^*(n^{18})$