Final All-Pairs Shortest Paths


8/10/2019 Final All-pairs Shortest Paths

All-Pairs Shortest Paths

Csc8530, Dr. Prasad

Jon A. Preston

March 17, 2004


    Outline

    Review of graph theory

    Problem definition

    Sequential algorithms

    Properties of interest

    Parallel algorithm

    Analysis

Recent research

References


    Graph Terminology

G = (V, E)

W = weight matrix; wij = weight/length of edge (vi, vj)

wij = ∞ if vi and vj are not connected by an edge; wii = 0

Assume W has positive, zero, and negative values

For this problem, we cannot have a negative-sum cycle in G


    Weighted Graph and Weight Matrix

[Figure: undirected weighted graph on vertices v0–v4 (edge weights include 1, 2, 5, 7, 6, 9, −4, 3) and its 5 × 5 weight matrix W]


    Directed Weighted Graph and Weight Matrix

[Figure: directed weighted graph on vertices v0–v5 (edge weights include −1, 5, −2, 9, 4, 3, 1, 2, 7, 6) and its 6 × 6 weight matrix W]


    All-Pairs Shortest Paths Problem Defined

For every pair of vertices vi and vj in V, it is required to find the length of the shortest path from vi to vj along edges in E.

Specifically, a matrix D is to be constructed such that dij is the length of the shortest path from vi to vj in G, for all i and j.

The length of a path (or cycle) is the sum of the lengths (weights) of the edges forming it.


    Sample Shortest Path

[Figure: the directed weighted graph of slide 5, with the path v0 → v1 → v2 → v4 highlighted]

The shortest path from v0 to v4 is along edges (v0, v1), (v1, v2), (v2, v4) and has length 6.


    Disallowing Negative-length Cycles

APSP does not allow the input to contain negative-length cycles.

This is necessary because:

If such a cycle existed within a path from vi to vj, then one could traverse this cycle indefinitely, producing paths of ever shorter length from vi to vj.

If a negative-length cycle exists, then all paths which contain this cycle would have a length of −∞.


    Recent Work on Sequential Algorithms

Floyd-Warshall algorithm is Θ(V³)
  Appropriate for dense graphs: |E| = O(|V|²)

Johnson's algorithm
  Appropriate for sparse graphs: |E| = O(|V|)
  O(V² log V + V E) if using a Fibonacci heap
  O(V E log V) if using a binary min-heap

Shoshan and Zwick (1999)
  Integer edge weights in {1, 2, …, W}
  O(W V^ω p(V, W)), where ω ≤ 2.376 and p is a polylog function
  The exponent ω comes from fast (Strassen-style) matrix multiplication

Pettie (2002)
  Allows real-weighted edges
  O(V² log log V + V E)
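The Θ(V³) bound above can be made concrete; this is a minimal Floyd-Warshall sketch (not from the slides), assuming the weight matrix uses float('inf') for missing edges and 0 on the diagonal:

```python
def floyd_warshall(w):
    """All-pairs shortest path lengths from weight matrix w.

    w[i][j] is the edge length from vertex i to j, float('inf') if absent,
    and w[i][i] == 0.  Assumes no negative-length cycles.
    """
    n = len(w)
    d = [row[:] for row in w]          # D starts as the weight matrix
    for k in range(n):                 # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

INF = float('inf')
w = [[0, 3, INF], [INF, 0, 1], [2, INF, 0]]
print(floyd_warshall(w))  # → [[0, 3, 4], [3, 0, 1], [2, 5, 0]]
```

The three nested loops over n vertices give the cubic running time directly.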


    Properties of Interest

Let dij^k denote the length of the shortest path from vi to vj that goes through at most k − 1 intermediate vertices (k hops)

dij^1 = wij (edge length from vi to vj); if i ≠ j and there is no edge from vi to vj, then dij^1 = ∞

Also, dii^1 = wii = 0

Given that there are no negative-weight cycles in G, there is no advantage in visiting any vertex more than once in the shortest path from vi to vj.

Since there are only n vertices in G, dij = dij^(n−1)


    Guaranteeing Shortest Paths

If the shortest path from vi to vj contains vr and vs (where vr precedes vs), then the path from vr to vs must be minimal (or it wouldn't be part of the shortest path).

Thus, to obtain the shortest path from vi to vj, we can compute all combinations of optimal sub-paths (whose concatenation is a path from vi to vj), and then select the shortest one.

[Figure: path vi → … → vr → … → vs → … → vj, with each sub-path marked MIN]


    Iteratively Building Shortest Paths

[Figure: vi reaches vj through some last intermediate vertex vl (l = 1, …, n); each candidate combines dil^(k−1) with the edge weight wlj]

dij^k = min( dij^(k−1), min over 1 ≤ l ≤ n of ( dil^(k−1) + wlj ) )

      = min over 1 ≤ l ≤ n of ( dil^(k−1) + wlj )    (since wjj = 0)


    Recurrence Definition

For k > 1,

dij^k = min over 1 ≤ l ≤ n of ( dil^(k/2) + dlj^(k/2) )

Guarantees O(log k) steps to calculate dij^k

[Figure: a path of k vertices split at vl into two halves of k/2 vertices each, each half minimized independently (MIN, MIN)]


    Similarity

dij^k = min over 1 ≤ l ≤ n of ( dil^(k−1) + wlj )

Cij = sum over 1 ≤ l ≤ n of ( Ail · Blj )

The shortest-path recurrence has the same structure as matrix multiplication, with (+, ×) replaced by (min, +).
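The analogy can be sketched directly; a minimal illustration (not from the slides), assuming plain list-of-lists matrices with float('inf') standing in for ∞:

```python
def min_plus(a, b):
    """'Multiply' two n x n matrices in the (min, +) semiring:
    c[i][j] = min over l of (a[i][l] + b[l][j]).
    Same loop structure as ordinary matrix multiplication."""
    n = len(a)
    return [[min(a[i][l] + b[l][j] for l in range(n))
             for j in range(n)]
            for i in range(n)]

INF = float('inf')
a = [[0, 5, 1], [INF, 0, INF], [INF, 2, 0]]
print(min_plus(a, a))  # → [[0, 3, 1], [inf, 0, inf], [inf, 2, 0]]
```

Entry [0][1] drops from 5 to 3 because the two-edge path 0 → 2 → 1 (length 1 + 2) beats the direct edge.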


    Computing D

Let Dk = the matrix with entries dij^k for 0 ≤ i, j ≤ n − 1.

Given D1, compute D2, D4, …, Dm, where m = 2^⌈log(n−1)⌉

D = Dm

To calculate Dk from Dk/2, use the special (min, +) form of matrix multiplication
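The repeated-squaring scheme D1 → D2 → D4 → … → Dm can be sketched as follows (a sequential stand-in for the parallel matrix multiplications, assuming the same list-of-lists convention with float('inf') for ∞):

```python
INF = float('inf')

def min_plus(a, b):
    # c[i][j] = min over l of (a[i][l] + b[l][j])
    n = len(a)
    return [[min(a[i][l] + b[l][j] for l in range(n))
             for j in range(n)]
            for i in range(n)]

def apsp(w):
    """Compute D by (min, +) squaring: D1 -> D2 -> D4 -> ...
    until at least n - 1 hops are covered (m = 2^ceil(log(n-1)))."""
    n = len(w)
    d = [row[:] for row in w]   # D1 = W
    hops = 1
    while hops < n - 1:
        d = min_plus(d, d)      # Dk from Dk/2
        hops *= 2
    return d

w = [[0, 2, INF, INF],
     [INF, 0, 3, INF],
     [INF, INF, 0, 1],
     [INF, INF, INF, 0]]
print(apsp(w)[0][3])  # → 6
```

Only ⌈log(n − 1)⌉ squarings are needed, which is what the parallel algorithm exploits.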


    Modified Matrix Multiplication

Step 2: for r = 0 to N − 1 do in parallel
          Cr = Ar + Br
        end for

Step 3: for m = 2q to 3q − 1 do
          for all r in N (rm = 0) do in parallel
            Cr = min(Cr, Cr(m))
          end for
        end for


    Modified Example

A = [1 2; 3 4], B = [1 2; 3 4]; ordinary multiplication would give C = A·B = [7 10; 15 22]

[Figure: from Section 9.2, register contents of hypercube processors P000–P111 after step (1.3)]


    Modified Example (step 2)

[Figure: from Section 9.2, register contents of processors P000–P111 after the modified step 2 (each Cr = Ar + Br)]


    Modified Example (step 3)

[Figure: the MIN reductions of the modified step 3 combine the partial results in processors P000–P111]

The final result is C = [0 −2; 1 0].


    Hypercube Setup

    Begin with a hypercube of n3processors Each has registersA,B, and C

    Arrange them in an nnnarray (cube)

    SetA(0,j, k) = wjkfor 0 j, k n1 i.e processors in positions (0,j, k) containD1= W

    When done, C(0,j, k) contains APSP =Dm


    APSP Parallel Algorithm

Algorithm HYPERCUBE SHORTEST PATH (A, C)

Step 1: for j = 0 to n − 1 do in parallel
          for k = 0 to n − 1 do in parallel
            B(0, j, k) = A(0, j, k)
          end for
        end for

Step 2: for i = 1 to ⌈log(n − 1)⌉ do
          (2.1) HYPERCUBE MATRIX MULTIPLICATION (A, B, C)
          (2.2) for j = 0 to n − 1 do in parallel
                  for k = 0 to n − 1 do in parallel
                    (i)  A(0, j, k) = C(0, j, k)
                    (ii) B(0, j, k) = C(0, j, k)
                  end for
                end for
        end for


    An Example

[Figure: D1 = W for the directed graph of slide 5 (vertices v0–v5), followed by D2, D4, and D8 = D obtained by successive (min, +) squarings]


    Analysis

Steps 1 and (2.2) require constant time.

There are ⌈log(n − 1)⌉ iterations of Step (2.1); each requires O(log n) time.

The overall running time is t(n) = O(log² n)

p(n) = n³

Cost is c(n) = p(n) t(n) = O(n³ log² n)

Efficiency is E = T(1) / c(n) = O(n³) / O(n³ log² n) = O(1 / log² n)


    Recent Research

Jenq and Sahni (1987) compared various parallel algorithms for solving APSP empirically.

Kumar and Singh (1991) used the isoefficiency metric (developed by Kumar and Rao) to analyze the scalability of parallel APSP algorithms:

    Hardware vs. scalability

    Memory vs. scalability


    Isoefficiency

For scalable algorithms (efficiency increases monotonically as p remains constant and problem size increases), efficiency can be maintained for an increasing number of processors provided that the problem size also increases.

Isoefficiency relates the problem size to the number of processors necessary for an increase in speedup in proportion to the number of processors used.


    Isoefficiency (cont)

Given an architecture, isoefficiency defines the degree of scalability: it tells us the required growth in problem size to be able to efficiently utilize an increasing number of processors.

Example: given an isoefficiency of kp³,
  if p0 and w0 give speedup = 0.8 p0 (efficiency = 0.8),
  then for p1 = 2 p0, to maintain efficiency of 0.8, we need w1 = 2³ w0 = 8 w0.

Isoefficiency indicates the superiority of one algorithm over another only when problem sizes are increased in the range between the two isoefficiency functions.
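The doubling arithmetic in the example can be checked directly; the constant k and the sample processor count below are illustrative:

```python
def required_problem_size(p, k=1.0):
    # isoefficiency relation from the example: w = k * p**3
    return k * p ** 3

p0 = 4
# doubling the processors multiplies the required problem size by 2**3
w_ratio = required_problem_size(2 * p0) / required_problem_size(p0)
print(w_ratio)  # → 8.0
```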



    Memory Overhead Factor (MOF)

Ratio:

  total memory required across all processors
  ÷ memory required for the same problem size on a single processor

We'd like this to be low!


    Architectures Discussed

    Shared Memory (CREW)

    Hypercube (Cube)

    Mesh

Mesh with Cut-Through Routing

Mesh with Cut-Through and Multicast Routing

Also examined fast and slow communication technologies


    Parallel APSP Algorithms

    Floyd Checkerboard

    Floyd Pipelined Checkerboard

    Floyd Striped

Dijkstra Source-Partition

Dijkstra Source-Parallel


    General Parallel Algorithm (Floyd)

Repeat steps 1 through 4 for k := 1 to n:

Step 1: If this processor has a segment of Pk−1[*, k], then transmit it to all processors that need it.

Step 2: If this processor has a segment of Pk−1[k, *], then transmit it to all processors that need it.

Step 3: Wait until the needed segments of Pk−1[*, k] and Pk−1[k, *] have been received.

Step 4: For all i, j in this processor's partition, compute Pk[i, j] := min { Pk−1[i, j], Pk−1[i, k] + Pk−1[k, j] }
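Step 4 above is an ordinary sequential update on each processor's partition; a minimal sketch with the partition passed in explicitly (running it over the whole matrix for every k reproduces sequential Floyd):

```python
def floyd_step4(P, k, rows, cols):
    """Apply the iteration-k Floyd update to the cells (i, j) with
    i in rows and j in cols (this processor's partition).
    Assumes column P[*][k] and row P[k][*] are already up to date
    (delivered by steps 1-3)."""
    for i in rows:
        for j in cols:
            P[i][j] = min(P[i][j], P[i][k] + P[k][j])

INF = float('inf')
P = [[0, 4, INF],
     [INF, 0, 2],
     [1, INF, 0]]
for k in range(3):                      # one "processor" owning the whole matrix
    floyd_step4(P, k, range(3), range(3))
print(P)  # → [[0, 4, 6], [3, 0, 2], [1, 5, 0]]
```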


    Floyd Checkerboard

[Figure: the n × n matrix divided into a √p × √p grid of blocks, each of size (n/√p) × (n/√p)]

Each cell (block) is assigned to a different processor, and this processor is responsible for updating the cost-matrix values at each iteration of the Floyd algorithm.

Steps 1 and 2 of the GPF involve each of the processors sending their data to the neighboring columns and rows.


    Floyd Pipelined Checkerboard

[Figure: the same √p × √p block partition as the checkerboard layout]

Similar to the preceding.

Steps 1 and 2 of the GPF involve each of the processors sending their data to the neighboring columns and rows.

The difference is that the processors are not synchronized: each computes and sends data as soon as possible (or sends as soon as it receives).


    Floyd Striped

[Figure: the n × n matrix divided into p column stripes of width n/p]

Each column stripe is assigned to a different processor, and this processor is responsible for updating the cost-matrix values at each iteration of the Floyd algorithm.

Step 1 of the GPF involves each of the processors sending their data to the neighboring columns. Step 2 is not needed (since the column is contained within the processor).


    Dijkstra Source-Partition

Assumes Dijkstra's single-source shortest path: the n source vertices are equally distributed over the p processors, and the computations are executed in parallel.

Each processor finds the shortest paths from each vertex in its set to all other vertices in the graph.

Fortunately, this approach involves no inter-processor communication.

Unfortunately, only n processors can be kept busy.

Also, the memory overhead is high, since each processor has a copy of the weight matrix.
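The per-processor work here is plain single-source Dijkstra repeated over that processor's share of the sources; a minimal heap-based sketch (the adjacency-list encoding is an assumption, and weights must be non-negative for Dijkstra):

```python
import heapq

def dijkstra(adj, src):
    """Single-source shortest path lengths; adj[u] = list of (v, weight),
    all weights non-negative."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue                       # stale queue entry
        for v, w in adj[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

# Source-partition idea: each processor runs dijkstra() for its own
# subset of the sources, with no inter-processor communication.
adj = {0: [(1, 2), (2, 5)], 1: [(2, 1)], 2: []}
my_sources = [0, 1]                        # this "processor's" share
print({s: dijkstra(adj, s) for s in my_sources})
```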


Dijkstra's Source-Parallel

Motivated by keeping more processors busy.

Run n copies of Dijkstra's SSP; each copy runs on p/n processors (p > n).

[Figure: the p processors divided into n groups of p/n processors each]


    Calculating Isoefficiency

Example: Floyd Checkerboard

At most n² processors can be kept busy.

n must grow as Θ(√p) due to the problem structure.

By Floyd (sequential), Te = Θ(n³).

Thus the isoefficiency is Θ((√p)³) = Θ(p^1.5).

But what about communication?


    Calculating Isoefficiency (cont)

ts = message startup time
tw = per-word communication time
tc = time to compute the next iteration value for one cell in the matrix
m = number of words sent
d = number of hops between nodes

Hypercube: (ts + tw m) log d = time to deliver m words; 2 (ts + tw m) log p = barrier synchronization time (up and down the tree); d = p

Per iteration:
Step 1 = (ts + tw n/√p) log p
Step 2 = (ts + tw n/√p) log p
Step 3 (barrier synch) = 2 (ts + tw) log p
Step 4 = tc n²/p

Over all n iterations:
Tp = n ( tc n²/p + 2 (ts + tw n/√p) log p + 2 (ts + tw) log p )

Isoefficiency = Θ(p^1.5 (log p)³)


    Mathematical Details

To = p Tp − Te

p Tp = tc n³ + 2 n p (ts + tw n/√p) log p + 2 n p (ts + tw) log p

To = p Tp − tc n³ = Θ( ts n p log p + tw n² √p log p )

How are n and p related?


Mathematical Details (cont)

To hold the efficiency E constant, Te must grow in proportion to To:

tc n³ = K ( ts n p log p + tw n² √p log p ), where K = E / (1 − E)

Balancing the dominant term gives n = Θ(√p log p), so the required problem-size growth is

n³ = Θ( p^1.5 (log p)³ )

i.e., the isoefficiency of Floyd Checkerboard on the hypercube is Θ(p^1.5 (log p)³).


    Calculating Isoefficiency (cont)

ts = message startup time; tw = per-word communication time
tc = time to compute the next iteration value for one cell in the matrix
m = number of words sent; d = number of hops between nodes

Mesh (per iteration):
Step 1 = (ts + tw n/√p) √p
Step 2 = (ts + tw n/√p) √p
Step 3 (barrier synch) = Θ(√p)
Step 4 = tc n²/p

Tp = Te/p + n (communication + synchronization terms above)

Isoefficiency = Θ(p³ + p^2.25) = Θ(p³)


    Isoefficiency and MOF for

    Algorithm & Architecture Combinations

Base Algorithm  Parallel Variant        Architecture                          Isoefficiency    MOF
Dijkstra        Source-Partitioned      SM, Cube, Mesh, Mesh-CT, Mesh-CT-MC   p³               p
Dijkstra        Source-Parallel         SM, Cube                              (p log p)^1.5    n
                                        Mesh, Mesh-CT, Mesh-CT-MC             p^1.8            n
Floyd           Stripe                  SM                                    p³               1
                                        Cube                                  (p log p)³       1
                                        Mesh                                  p^4.5            1
                                        Mesh-CT                               (p log p)³       1
                                        Mesh-CT-MC                            p³               1
Floyd           Checkerboard            SM                                    p^1.5            1
                                        Cube                                  p^1.5 (log p)³   1
                                        Mesh                                  p³               1
                                        Mesh-CT                               p^2.25           1
                                        Mesh-CT-MC                            p^2.25           1
Floyd           Pipelined Checkerboard  SM, Cube, Mesh, Mesh-CT, Mesh-CT-MC   p^1.5            1


    Comparing Metrics

We've used cost previously this semester (cost = p Tp).

But notice that the cost of all of the architecture-algorithm combinations discussed here is Θ(n³).

Clearly some are more scalable than others.

Thus isoefficiency is a useful metric when analyzing algorithms and architectures.


    References

Akl, S. G. Parallel Computation: Models and Methods. Prentice Hall, Upper Saddle River, NJ, pp. 381-384, 1997.

Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. Introduction to Algorithms (2nd Edition). The MIT Press, Cambridge, MA, pp. 620-642, 2001.

Jenq, J. and Sahni, S. All Pairs Shortest Path on a Hypercube Multiprocessor. In International Conference on Parallel Processing, pp. 713-716, 1987.

Kumar, V. and Singh, V. Scalability of Parallel Algorithms for the All Pairs Shortest Path Problem. Journal of Parallel and Distributed Computing, vol. 13, no. 2, Academic Press, San Diego, CA, pp. 124-138, 1991.

Pettie, S. A Faster All-pairs Shortest Path Algorithm for Real-weighted Sparse Graphs. In Proc. 29th Int'l Colloq. on Automata, Languages, and Programming (ICALP'02), LNCS vol. 2380, pp. 85-97, 2002.