David P. Anderson Space Sciences Laboratory University of California – Berkeley [email protected] Public Distributed Computing with BOINC

Post on 17-Jan-2018


Page 1

David P. Anderson
Space Sciences Laboratory
University of California – Berkeley
[email protected]

Public Distributed Computing with BOINC

Page 2

Public-resource computing

[Timeline, 1995 to 2004: GIMPS, distributed.net; SETI@home, folding@home; fight*@home; climateprediction.net]

Names:
● public-resource computing
● peer-to-peer computing (no!)
● public distributed computing
● “@home” computing

[Diagram: your computers; academic; business; home PCs]

Page 3

The potential of public computing

● SETI@home: 500,000 CPUs, 65 TeraFLOPs
● 1 billion Internet-connected PCs in 2010, 50% privately owned
● If 100M participate:
– ~ 100 PetaFLOPs
– ~ 1 Exabyte (10^18) storage

[Chart: public computing, Grid computing, cluster computing, and supercomputing compared by CPU power and storage capacity against cost]
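The totals on this slide can be cross-checked with simple division. The sketch below uses only the slide's own numbers; the per-host figures it prints are implied by those totals rather than stated by the author, and the helper name `per_host` is illustrative.

```python
# Per-host figures implied by the slide's totals (plain arithmetic only).
TERA = 10**12
PETA = 10**15
EXA = 10**18

def per_host(total, hosts):
    """Return the average share of `total` contributed by each host."""
    return total / hosts

# SETI@home: 65 TeraFLOPs across 500,000 CPUs.
seti_flops_per_cpu = per_host(65 * TERA, 500_000)

# Projection: 100M participants giving ~100 PetaFLOPs and ~1 Exabyte.
future_flops_per_pc = per_host(100 * PETA, 100_000_000)
future_bytes_per_pc = per_host(1 * EXA, 100_000_000)

print(f"SETI@home: {seti_flops_per_cpu / 1e6:.0f} MFLOPS per CPU")
print(f"Projection: {future_flops_per_pc / 1e9:.0f} GFLOPS per PC")
print(f"Projection: {future_bytes_per_pc / 1e9:.0f} GB per PC")
```

In other words, the projection assumes each participating PC contributes about 1 GFLOPS and about 10 GB of storage.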

Page 4

Public/Grid differences

                        Public               Grid
Managed resources?      no                   yes
Secure resources?       no                   yes
Always on?              no                   yes
Always connected?       no                   yes
Network bandwidth       expensive, scarce    abundant
Network connection      1 way (pull)         2 way (pull or push)
Must be unobtrusive     yes                  no
Credit system           yes                  maybe
How to get resources?   complex              complex
EPO?                    yes                  no
Self-upgrading?         yes                  no

Page 5

Economics (0th order)

[Diagram, cluster/Grid computing: you; Network (free); resources ($$)]
[Diagram, public-resource computing: you; Internet ($$); resources (free)]

$1 buys 1 computer/day or 20 GB data transfer on commercial Internet
Suppose processing 1 GB data takes X computer days
Cost of processing 1 GB:
– cluster/Grid: $X
– PRC: $1/20

So PRC is cheaper if X > 1/20
(SETI@home: X = 1,000)
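The break-even argument on this slide can be written out as a small cost model. This is a minimal sketch using the slide's round prices; the function and constant names are illustrative, not part of BOINC.

```python
# 0th-order cost model from the slide. All prices are the slide's own
# round numbers; the function names are illustrative.
DOLLARS_PER_COMPUTER_DAY = 1.0   # $1 buys 1 computer/day
GB_PER_DOLLAR = 20.0             # $1 buys 20 GB of commercial transfer

def cluster_cost_per_gb(x_days):
    """Cluster/Grid: pay for the computers, the network is free."""
    return x_days * DOLLARS_PER_COMPUTER_DAY

def prc_cost_per_gb():
    """PRC: computers are free, pay only to move the data."""
    return 1.0 / GB_PER_DOLLAR

def prc_is_cheaper(x_days):
    """True when processing 1 GB costs less under PRC than on a cluster."""
    return cluster_cost_per_gb(x_days) > prc_cost_per_gb()

for x in (0.01, 0.05, 1.0, 1000.0):
    print(x, cluster_cost_per_gb(x), prc_cost_per_gb(), prc_is_cheaper(x))
```

With X = 1,000 the cluster cost is $1,000 per GB against $0.05 for PRC, which is why compute-heavy, data-light applications are the natural fit.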

Page 6

Economics revisited

[Diagram: you; underutilized free Internet (e.g. Internet2); commodity Internet; ... other institutions]

Bursty, underutilized flat-rate ISP connection
Traffic shapers can send at zero priority
==> bandwidth may be free also

Page 7

Why isn't PRC more widely used?

● Lack of platform
– jxta, Jabber: not a solution
– Java: apps are in C, FORTRAN
– commercial platforms: business issues
– cosm, XtremWeb: not complete
● Need to make PRC technology easy to use for scientists

Page 8

BOINC: Berkeley Open Infrastructure for Network Computing

● Goals for computing projects
– easy/cheap to create and operate projects
– wide range of applications possible
– no central authority
● Goals for participants
– easy to participate in multiple projects
– invisible use of disk, CPU, network
● NSF-funded; open source; in beta test
– http://boinc.berkeley.edu

Page 9

SETI@home requirements

[Diagram, current: tapes; Berkeley; commercial Internet; participants]
[Diagram, ideal: Internet2; Berkeley, Stanford, USC; commercial Internet; participants]

50 Mbps
0.3 MB = 8 hrs CPU
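The ratio "0.3 MB = 8 hrs CPU" can be converted into the X (computer days per GB) used on the economics slide. The sketch below assumes decimal units (1 GB = 1000 MB); that convention and the function name are assumptions, not something the slides state.

```python
# Convert "0.3 MB = 8 hrs CPU" into computer-days per GB.
# Assumption: 1 GB = 1000 MB (decimal units).
MB_PER_WORKUNIT = 0.3
CPU_HOURS_PER_WORKUNIT = 8.0

def computer_days_per_gb(mb_per_wu, hours_per_wu):
    """Return the CPU time, in days, needed to process 1 GB of input."""
    workunits_per_gb = 1000.0 / mb_per_wu
    return workunits_per_gb * hours_per_wu / 24.0

x = computer_days_per_gb(MB_PER_WORKUNIT, CPU_HOURS_PER_WORKUNIT)
print(f"X is roughly {x:.0f} computer-days per GB")
```

The result is a little over 1,100 computer days per GB, the same order of magnitude as the "X = 1,000" quoted for SETI@home on the economics slide.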

Page 10

Climateprediction.net

● Global climate study (Oxford Univ.)
● Input: ~10MB executable, 1MB data
● CPU time: 2-3 months (can't migrate)
● Output per workunit:
– 10 MB summary (always upload)
– 1 GB detail file (archive on client, may upload)
● Chaotic (incomparable results)

Page 11

Einstein@home (planned)

● Gravity wave detection; LIGO; UW/CalTech
● 30,000 40 MB data sets
● Each data set is analyzed w/ 40,000 different parameter sets; each takes ~6 hrs CPU
● Data distribution: replicated 2TB servers
● Scheduling problem is more complex than “bag of tasks”
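Multiplying out the Einstein@home figures gives a feel for the scale of the planned project. This is plain arithmetic on the slide's numbers, assuming decimal units (1 TB = 1,000,000 MB) and treating "~6 hrs" as exactly 6; the variable names are illustrative.

```python
# Scale of the planned Einstein@home workload, from the slide's figures.
DATA_SETS = 30_000
MB_PER_DATA_SET = 40
PARAMETER_SETS = 40_000      # analyses per data set
CPU_HOURS_PER_JOB = 6        # the slide says "~6 hrs CPU"

total_data_tb = DATA_SETS * MB_PER_DATA_SET / 1_000_000   # decimal TB
total_jobs = DATA_SETS * PARAMETER_SETS
total_cpu_hours = total_jobs * CPU_HOURS_PER_JOB
total_cpu_years = total_cpu_hours / (24 * 365)

print(f"Total input data: {total_data_tb} TB")
print(f"Total jobs:       {total_jobs:,}")
print(f"Total CPU time:   {total_cpu_hours:,} hours "
      f"(about {total_cpu_years:,.0f} CPU-years)")
```

The whole input set (about 1.2 TB) fits on a single 2 TB server, while the CPU demand runs to billions of hours. Because each data set is reused 40,000 times, which host already holds which file matters to the scheduler.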

Page 12

Intel/UCB Network Study (planned)

● Goal: map/measure the Internet
● Each workunit lasts for 1 day but is active only briefly (pings, UDP)
● Need to control time-of-day when active
● Need to turn off other apps
● Need to measure system load indices (network/CPU/VM)

Page 13

General structure of BOINC

[Diagram components]

● Project:
– Scheduling server (C++)
– BOINC DB (MySQL)
– data servers (HTTP)
– Web interfaces (PHP)
– Work generation
– Project back end
– Retry generation
– Result validation
– Result processing
– Garbage collection
● Participant:
– Core client (C++)
– Apps

Page 14

Project web site features

● Download core client
● Create account
● Edit preferences
– General: disk usage, work limits, buffering
– Project-specific: allocation, graphics
– venues (home/school/work)
● Profiles
● Teams
● Message boards, adaptive FAQs

Page 15

General preferences

Page 16

Project-specific preferences

Page 17

Data architecture

● Files
– immutable, replicated
– may originate on client or project
– may remain resident on client
● Executables are digitally signed
● Upload certificates: prevent DOS

<file_info>
    <name>arecibo_3392474_jun_23_01</name>
    <url>http://ds.ssl.berkeley.edu/a3392474</url>
    <url>http://dt.ssl.berkeley.edu/a3392474</url>
    <md5_cksum>uwi7eyufiw8e972h8f9w7</md5_cksum>
    <nbytes>10000000</nbytes>
</file_info>
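A client consuming a `<file_info>` record might parse it and check a downloaded file against the declared size and checksum. The following is a minimal sketch using only the Python standard library; it is not BOINC's own code, and `parse_file_info` and `matches` are hypothetical helper names. The checksum on the slide is a placeholder rather than a valid hexadecimal MD5 digest, so no real file would match it.

```python
import hashlib
import xml.etree.ElementTree as ET

# The record from the slide, copied verbatim.
FILE_INFO = """
<file_info>
    <name>arecibo_3392474_jun_23_01</name>
    <url>http://ds.ssl.berkeley.edu/a3392474</url>
    <url>http://dt.ssl.berkeley.edu/a3392474</url>
    <md5_cksum>uwi7eyufiw8e972h8f9w7</md5_cksum>
    <nbytes>10000000</nbytes>
</file_info>
"""

def parse_file_info(xml_text):
    """Extract the fields shown on the slide into a plain dict."""
    root = ET.fromstring(xml_text.strip())
    return {
        "name": root.findtext("name"),
        "urls": [u.text for u in root.findall("url")],  # replicas
        "md5_cksum": root.findtext("md5_cksum"),
        "nbytes": int(root.findtext("nbytes")),
    }

def matches(info, data):
    """Check downloaded bytes against the declared size and MD5."""
    return (len(data) == info["nbytes"]
            and hashlib.md5(data).hexdigest() == info["md5_cksum"])

info = parse_file_info(FILE_INFO)
print(info["name"], info["urls"], info["nbytes"])
```

The two `<url>` entries reflect the "replicated" property: a client can fall back to the second data server if the first is unreachable.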

Page 18

Computation abstractions

● Applications
● Platforms
● Application versions
– may involve many files
● Work units: inputs to a computation
– soft deadline; CPU/disk/mem estimates
● Results: outputs of a computation
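One way to picture how these abstractions relate is as a handful of record types. The sketch below is an illustration only: every class and field name is hypothetical and does not reflect BOINC's actual database schema.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Platform:
    name: str                      # e.g. an OS/CPU combination

@dataclass(frozen=True)
class Application:
    name: str

@dataclass
class AppVersion:
    app: Application
    platform: Platform
    version: int
    files: list = field(default_factory=list)   # "may involve many files"

@dataclass
class WorkUnit:                    # inputs to a computation
    app: Application
    input_files: list
    soft_deadline_s: float
    est_cpu_s: float
    est_disk_bytes: int
    est_mem_bytes: int

@dataclass
class Result:                      # outputs of a computation
    workunit: WorkUnit
    output_files: list = field(default_factory=list)

def pick_version(versions, app, platform):
    """Return the newest version of `app` built for `platform`, or None."""
    candidates = [v for v in versions
                  if v.app == app and v.platform == platform]
    return max(candidates, key=lambda v: v.version, default=None)

app = Application("demo_app")
linux = Platform("linux_x86")
versions = [AppVersion(app, linux, 1), AppVersion(app, linux, 3)]
print(pick_version(versions, app, linux).version)
```

Keeping application versions separate from applications is what lets one work unit run on whatever platform the requesting host reports.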

Page 19

Scheduling: pull model

[Diagram: core client, scheduling server, data server]

core client -> scheduling server: request X seconds of work, host description
scheduling server -> core client: result 1 ... result n
data server -> core client: download
...compute...
core client -> data server: upload
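The request/reply step of the pull model can be sketched as a function on the server side: the client asks for X seconds of work and describes itself, and the scheduler hands back jobs until the request is covered. This is a simplified illustration under stated assumptions, not BOINC's scheduler. The `Job` and `Host` fields, the speed scaling, and the disk check are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    est_cpu_s: float      # estimated CPU seconds on a reference host
    disk_bytes: int

@dataclass
class Host:               # the "host description" sent with a request
    free_disk_bytes: int
    speed_factor: float   # 1.0 = reference host; 2.0 = twice as fast

def handle_request(queue, host, seconds_requested):
    """Assign jobs from `queue` until the requested seconds are covered.

    Jobs that do not fit in the host's free disk are skipped and stay
    queued. Assigned jobs are removed from `queue` in place.
    """
    assigned, remaining = [], []
    covered_s = 0.0
    free_disk = host.free_disk_bytes
    for job in queue:
        fits = job.disk_bytes <= free_disk
        if covered_s < seconds_requested and fits:
            assigned.append(job)
            covered_s += job.est_cpu_s / host.speed_factor
            free_disk -= job.disk_bytes
        else:
            remaining.append(job)
    queue[:] = remaining
    return assigned

queue = [Job("a", 3600, 10), Job("b", 3600, 10**9),
         Job("c", 3600, 10), Job("d", 3600, 10)]
host = Host(free_disk_bytes=100, speed_factor=1.0)
work = handle_request(queue, host, seconds_requested=7000)
print([j.name for j in work], [j.name for j in queue])
```

Because the client always initiates the exchange, this works for hosts behind firewalls or on intermittent connections, matching the "1 way (pull)" row in the Public/Grid comparison.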

Page 20

Redundant computing

[Diagram components: work generator, replicator, scheduler, clients, validator, assimilator; data item: canonical result]

Annotations: select canonical result; assign credit
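The "select canonical result, assign credit" step can be illustrated with a simple majority rule: once enough replicas of a work unit agree, their common value becomes canonical and the agreeing hosts receive credit. This sketch assumes results can be compared for exact equality and that a quorum of 2 is enough; both are assumptions for illustration, and `validate` is a hypothetical function, not BOINC's validator.

```python
from collections import Counter

def validate(results, quorum=2):
    """Pick a canonical result from replicated computations.

    results: dict mapping host id -> reported result (hashable).
    Returns (canonical_value, credited_hosts), or (None, []) if no
    value has been reported by at least `quorum` hosts yet.
    """
    if not results:
        return None, []
    value, count = Counter(results.values()).most_common(1)[0]
    if count < quorum:
        return None, []
    credited = sorted(h for h, v in results.items() if v == value)
    return value, credited

print(validate({"h1": 42, "h2": 42, "h3": 41}))
print(validate({"h1": 1, "h2": 2}))
```

Exact comparison does not suit every application: the Climateprediction.net slide notes that chaotic models give incomparable results, so such a project would need a different agreement test.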

Page 21

BOINC core client

● file transfers: restartable, concurrent, user limited
● program execution: semi-sandboxed, graphics control, checkpoint control, % done, CPU time

[Diagram: core client; apps, each with the API; shared mem]

Page 22

User interface

[Diagram components: screensaver, control panel, core client, apps; labels: control/state RPCs, activate screensaver, graphics]

Page 24

Anonymous platform mechanism

● User compiles applications from source, registers them with core client
● Report platform as “anonymous” to scheduler
● Purposes:
– obscure platforms
– security-conscious participants
– performance tuning of applications

Page 25

Project management tools

● Python scripts for project creation/start/stop
● Remote debugging
– collect/store crash info (stack trace)
– web-based browsing interface
● Strip charts
– record, graph system performance metrics
● Watchdogs
– detect system failures; dial pager

Page 26

Conclusion

● Public-resource computing is a distinct paradigm from Grid computing
● PRC has tremendous potential for many applications (computing and storage)
● BOINC: enabling technology for PRC
– http://boinc.berkeley.edu