
Introduction to Distributed Computing Algorithms

Jeff M. Phillips

November 19, 2011

Many Unorganized Computers

"I can't do this by myself! And I don't really know where anyone else is!"

[figure: a single machine facing a huge stream of bits]

Too much data processing for one computer. Not part of an organized cluster.

Could be a huge job. Could be a small computer.


Many Unorganized Computers


Distribute computation out to friends.

Why won’t this work?

Transferring big data is very expensive! Often more expensive than the computation!


Many Unorganized Computers

[figure: the data split into small chunks, one per machine]

Goal: Minimize Communication!

Many Unorganized Computers


Distribute computation out to friends. When might this work?

Computation is very expensive (exponential).



Folding@Home

How large is it?

▶ 439,000 CPUs

▶ 37,000 GPUs

▶ 21,000 PS3s

6.7 petaFLOPS

Molecular dynamics

▶ typically very sequential

▶ run many inaccurate simulations and average them

▶ explore different scenarios

The central server sends out work units. Nodes have a fixed time to complete them. Failures lead to shortened jobs.

BOINC

Berkeley Open Infrastructure for Network Computing

SETI@Home

▶ 451,000 CPUs

5.6 petaFLOPS

More restrictive protocol than Folding@Home. Checks results, and often duplicates work units.

Flat Model

[figure: a central server connected directly to many processors, each holding a chunk of the data]

Each processor is connected to the server.

▶ two-way communication

▶ sometimes, data can originate on the processors

▶ data can stream in, or be static

▶ the server can be overloaded


Random Sampling


Randomly sample t items from k sites with O(k + t) communication.

1. Each node assigns an independent uniform random value to each of its data items.

2. Node i sends its top value ui to the server as the pair (xi, ui).

3. The server keeps the xi with the top ui.

4. It asks the corresponding node for its next top value.

5. Go to step 3.

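A minimal single-process simulation of this protocol (the function name, data layout, and message accounting below are illustrative assumptions, not code from the slides): because every item gets an independent uniform key, the t items with the globally largest keys form a uniform random sample without replacement, and the server requests only one replacement per item it keeps.

```python
# Simulation of the flat-model sampling protocol (illustrative).
import heapq
import random

def distributed_sample(sites, t, seed=None):
    """Draw t items uniformly without replacement from the union of
    the sites' data, counting one message per (x, u) pair sent."""
    rng = random.Random(seed)
    # Step 1: each site tags every local item with an independent
    # Uniform(0,1) key and sorts descending, to serve items top-down.
    tagged = [sorted(((rng.random(), x) for x in site), reverse=True)
              for site in sites]
    cursor = [0] * len(sites)
    messages = 0

    # Step 2: every site sends its single top pair (xi, ui) once.
    heap = []  # max-heap over u, via negated keys
    for i, items in enumerate(tagged):
        if items:
            u, x = items[0]
            cursor[i] = 1
            heapq.heappush(heap, (-u, x, i))
            messages += 1

    # Steps 3-5: keep the top candidate, request one replacement
    # from that site only, and repeat until t items are drawn.
    sample = []
    while heap and len(sample) < t:
        _, x, i = heapq.heappop(heap)
        sample.append(x)
        if cursor[i] < len(tagged[i]):
            u, x2 = tagged[i][cursor[i]]
            cursor[i] += 1
            heapq.heappush(heap, (-u, x2, i))
            messages += 1
    return sample, messages

sites = [list(range(j * 100, (j + 1) * 100)) for j in range(8)]
sample, messages = distributed_sample(sites, t=10, seed=1)
print(sample, messages)  # 10 items; k + t = 18 messages here
```

Each site sends once up front (k messages) and answers one request per sampled item (at most t more), matching the O(k + t) bound.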

Tree Model

[figure: processors connected to the server through a tree, each holding a chunk of the data]

Many processors connected to the server through a tree.

▶ two-way communication

▶ arbitrary (tree) topology

▶ data can stream in, or be static

▶ less stress on the server

▶ higher latency

▶ the server might multicast

▶ sometimes only summaries are passed up


Mergeable Summaries


Aggregation Network

▶ each node i has data Xi

▶ node i creates a summary Si = σ(Xi)

▶ Si has ε-error and size f(ε)

Can merge two summaries:

▶ S = µ(S1, S2)

▶ S has ε-error on S1 ∪ S2

▶ S has size f(ε)

Neither the error nor the size grows under merging, so a summary can be aggregated up the tree like a sum or a max.
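As one concrete instance of σ and µ (the slides do not name a specific summary; the Misra-Gries frequency summary is a standard mergeable example, and ε = 1/8 below is an illustrative choice): keep at most k = ⌈1/ε⌉ counters, undercount any item's frequency by at most εn, and merge two such summaries back down to k counters without the error exceeding ε of the combined data size.

```python
# A minimal mergeable summary: Misra-Gries frequency counters.
from collections import Counter

K = 8  # summary size f(eps) = ceil(1/eps), here eps = 1/8

def summarize(items, k=K):
    """sigma(X): keep at most k counters; any item's true count is
    undercounted by at most n/(k+1) <= eps * n."""
    s = Counter()
    for x in items:
        s[x] += 1
        if len(s) > k:
            # decrement every counter by the smallest count and
            # drop the counters that reach zero
            m = min(s.values())
            s = Counter({y: c - m for y, c in s.items() if c > m})
    return s

def merge(s1, s2, k=K):
    """mu(S1, S2): add counters, subtract the (k+1)-th largest
    count from all, keep the positive ones. The size stays <= k
    and the eps-error guarantee holds on the combined data."""
    s = s1 + s2
    if len(s) > k:
        cut = sorted(s.values(), reverse=True)[k]
        s = Counter({y: c - cut for y, c in s.items() if c > cut})
    return s

# usage: two nodes summarize locally, then merge up the tree
a = summarize(["a"] * 60 + ["b"] * 25 + list("cdefgh") * 3)
b = summarize(["a"] * 40 + ["c"] * 30 + list("ijklmn") * 2)
print(merge(a, b).most_common(3))  # heavy hitters survive the merge
```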

Clique Model

[figure: many machines, every pair able to communicate]

Many computers, all of which can talk to each other (the internet).

▶ may limit the degree (10,000+ nodes)

▶ a central server may control the computation

▶ or there may be no central server

Distributed Hash Tables (sketched below)

▶ store data in a distributed way (like GFS)

▶ distribute files (BitTorrent)

Minimize communication; tolerate failure.

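A minimal sketch of the distributed-hash-table idea via consistent hashing (the node names and the HashRing API are made up for illustration; this is not the specific scheme used by GFS or BitTorrent): keys and nodes hash onto the same ring, each key belongs to the next node clockwise, and any peer holding the node list can locate a key's owner locally, with no central server.

```python
# A minimal consistent-hashing ring (illustrative names and API).
import bisect
import hashlib

def ring_hash(s: str) -> int:
    """Map a string onto a 2^32-point ring."""
    return int(hashlib.sha256(s.encode()).hexdigest(), 16) % (1 << 32)

class HashRing:
    """Each key is owned by the first node clockwise from its hash,
    so any peer can find a key's owner with a local computation."""
    def __init__(self, nodes):
        self._ring = sorted((ring_hash(n), n) for n in nodes)
        self._points = [p for p, _ in self._ring]

    def owner(self, key: str) -> str:
        i = bisect.bisect_right(self._points, ring_hash(key))
        return self._ring[i % len(self._ring)][1]

ring = HashRing([f"node{i:02d}" for i in range(16)])
print(ring.owner("block-37"))  # computed with no central server
```

Because a node's departure only reassigns the keys in its own arc of the ring, the scheme tolerates failures with minimal data movement.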

Lower Bounds


▶ k computers, indexed by i

▶ each computer holds n bits: Xi

Compute f(X1, X2, ..., Xk).

Number-on-forehead

▶ see everyone's data but your own

Blackboard

▶ writing to the blackboard costs; reading is free

Multi-Party

▶ all-pairs communication

▶ f = OR, XOR, ...: Ω(nk) communication (intuition sketched below)

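A hedged intuition for the Ω(nk) bound in the case f = XOR (an informal sketch, not the formal proof): the output together with all the other inputs determines any one site's input exactly.

```latex
f(X_1,\dots,X_k) \;=\; \bigoplus_{i=1}^{k} X_i
\qquad\Longrightarrow\qquad
X_j \;=\; f(X_1,\dots,X_k) \,\oplus\, \bigoplus_{i \neq j} X_i .
```

With each Xi drawn uniformly and independently, the communication must therefore carry roughly n bits of information about every one of the k sites, for Ω(nk) in total; making this precise takes a formal fooling-set or information-complexity argument.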