
Task Pipeline Specification and Scheduling
John Schulman and Arjun Singh

Example

Scheduling

Results

Overview

Research pipelines, such as those found in computer vision or computational biology, often consist of a large number of heterogeneous programs. This leads to brittle code that is difficult to maintain and that requires significant effort to parallelize across multiple machines. Our lightweight framework executes pipelines on clusters of machines with minimal effort (and automatic dependency packaging) while scheduling tasks to minimize time until completion (including file transfer time).

Related Work

Task Pipelines
- Ruffus: Python-based, lightweight, no parallelism
- Luigi: Parallelism via Hadoop, requires code changes for parallel execution
- compmake: Python only, parallelism via SGE and Multyvac
- Oozie: DAG workflows for Hadoop, heavy syntax

CS 262A

Possible Extensions
- Dynamically replan upon completion of every job
- Visualization of pipeline state (see Luigi)
- Integration with distributed filesystems (e.g. HDFS)
- Formulate problem as ILP
- Only rerun parts of the pipeline that have changed

Scheduling
- Delay scheduling (used in Spark)
- Job shop scheduling

Example pipeline stages:
- Filter Depth Maps (600): Python
- Create Point Clouds (600): Python
- Segment Point Clouds (120): C++

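class DetectChessboard(Task):
    # Declared inputs and outputs; both are file names.
    input = {'image': 'filename'}
    output = {'board': 'filename'}

    def run(self):
        # Imports live inside run() so they resolve where the task executes.
        # Note: scipy.misc.imread was removed in SciPy 1.2; imageio.imread
        # is the usual replacement on newer installations.
        from scipy.misc import imread
        import pycb
        board_size = self.params['board_size']
        img = imread(self.input['image'])
        corners, chessboards = pycb.extract_chessboards(img, use_corner_thresholding=False)
        pycb.save_chessboard(self.output['board'], corners, chessboards, [board_size])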
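The poster does not show the `Task` base class or how declared inputs and outputs become a dependency graph. As a rough illustration only, the sketch below assumes a hypothetical `Task` that stores concrete file paths per instance (unlike the class-level declarations above) and a hypothetical helper `execution_order` that derives an ordering from which task produces which file. It is not the authors' implementation.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Task:
    """Hypothetical base class for illustration; not the authors' code."""

    def __init__(self, name, input, output, params=None):
        self.name = name
        self.input = input      # slot name -> path of a file the task reads
        self.output = output    # slot name -> path of a file the task writes
        self.params = params or {}

    def run(self):
        raise NotImplementedError


def execution_order(tasks):
    """Order tasks so that every file is produced before it is consumed."""
    # Map each output path to the name of the task that writes it.
    producer = {path: t.name for t in tasks for path in t.output.values()}
    # A task depends on the producers of the files it reads (if any).
    graph = {
        t.name: {producer[p] for p in t.input.values() if p in producer}
        for t in tasks
    }
    return list(TopologicalSorter(graph).static_order())


# Three chained tasks, deliberately listed out of order.
tasks = [
    Task("segment", {"cloud": "cloud.pcd"}, {"segments": "segments.pcd"}),
    Task("filter", {"depth": "raw.png"}, {"depth": "filtered.png"}),
    Task("cloud", {"depth": "filtered.png"}, {"cloud": "cloud.pcd"}),
]
order = execution_order(tasks)
```

Matching tasks by file path keeps the user-facing specification declarative: the graph falls out of the input and output declarations, with no explicit edges to write.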

Software Used
- CDE: http://www.pgbovine.net/cde.html
  Packages up executables + all dependencies
- cloudpickle: pickle (almost) all of Python

When/where should each task be computed (given DAG)?
- Minimize total time to completion (makespan).
- Consider transfer time of files, bandwidth limits, and limited computation.

Input:
- Dependency graph with a set of jobs. Each job has a set of input and output files.
- Estimate of how long each task + transport takes

Output:
- Tuples specifying computation + transportation events
- Computation event: (job, location, start time, end time)
- Transport event: (from loc., to loc., start time, end time)
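The output format described above can be sketched as two small record types. The field names (`src`, `dst`, `start`, `end`) and the helper `makespan` are assumptions chosen for illustration, and the sample schedule is made up rather than taken from the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputationEvent:
    job: str
    location: str
    start: float
    end: float


@dataclass(frozen=True)
class TransportEvent:
    src: str      # "from loc."
    dst: str      # "to loc."
    start: float
    end: float


def makespan(events):
    """Elapsed time from the earliest start to the latest end."""
    return max(e.end for e in events) - min(e.start for e in events)


# Illustrative schedule: compute, ship the result, compute again.
schedule = [
    ComputationEvent("filter", "worker1", 0.0, 4.0),
    TransportEvent("worker1", "worker2", 4.0, 5.5),
    ComputationEvent("segment", "worker2", 5.5, 9.0),
]
total = makespan(schedule)
```

Giving both event types `start` and `end` fields lets one function measure a schedule that mixes computation and file transfer.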

Hill-Climbing Algorithm
- Let a denote an assignment of jobs to computers
- Let makespan(a) be the makespan when simulating a greedy execution of a
- a_best = [1,1,…1]; t_best = makespan(a_best)
- Repeat num_trials times
  - a_trial = copy(a_best)
  - Pick a random index of a_trial and set to random value
  - t_trial = makespan(a_trial)
  - Update a_best if a_trial is better
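The steps above can be sketched in Python. The simulator here is an assumption, since the poster does not specify one: jobs are visited in index order (taken to be topological), each machine runs one job at a time, and a fixed `transfer_time` is charged whenever a dependency ran on a different machine. Machines are numbered from 0 rather than 1.

```python
import random


def makespan(assignment, durations, deps, transfer_time):
    """Simulate a greedy execution of `assignment` (job index -> machine).

    Each job starts as soon as its machine is free and its inputs have arrived.
    """
    machine_free = {}
    finish = [0.0] * len(assignment)
    for job, machine in enumerate(assignment):
        ready = 0.0
        for d in deps[job]:
            # Inputs produced on another machine pay the transfer cost.
            cost = transfer_time if assignment[d] != machine else 0.0
            ready = max(ready, finish[d] + cost)
        start = max(ready, machine_free.get(machine, 0.0))
        finish[job] = start + durations[job]
        machine_free[machine] = finish[job]
    return max(finish)


def hill_climb(durations, deps, num_machines, transfer_time, num_trials, seed=0):
    rng = random.Random(seed)
    a_best = [0] * len(durations)  # every job on the first machine
    t_best = makespan(a_best, durations, deps, transfer_time)
    for _ in range(num_trials):
        a_trial = list(a_best)
        # Reassign one randomly chosen job to a random machine.
        a_trial[rng.randrange(len(a_trial))] = rng.randrange(num_machines)
        t_trial = makespan(a_trial, durations, deps, transfer_time)
        if t_trial < t_best:
            a_best, t_best = a_trial, t_trial
    return a_best, t_best


# Four independent jobs feeding one final job (illustrative numbers).
durations = [3.0, 3.0, 3.0, 3.0, 2.0]
deps = [[], [], [], [], [0, 1, 2, 3]]
t_start = makespan([0] * 5, durations, deps, transfer_time=0.5)
a_best, t_best = hill_climb(durations, deps, num_machines=2,
                            transfer_time=0.5, num_trials=200)
```

Because a trial is accepted only when it strictly improves the makespan, `t_best` never exceeds the all-on-one-machine starting point, though the search can stall in a local optimum.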

Simulated Scheduling Experiments

Pipeline Execution Results

Runtime    Experiment
7m 46s     New pipeline, local networking, 1 machine
7m 45s     New pipeline, no networking, 1 machine
10m 45s    Old pipeline (wait for each stage to finish before next)
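As a quick check on the runtimes reported above, the following arithmetic compares the old pipeline with the new pipeline using local networking. The helper `seconds` is introduced here for illustration. The poster does not state the machine configuration for the old pipeline, so the ratio is indicative only.

```python
def seconds(minutes, secs):
    """Convert a minutes/seconds pair to total seconds."""
    return 60 * minutes + secs


old = seconds(10, 45)   # old pipeline: 645 s
new = seconds(7, 46)    # new pipeline, local networking: 466 s

speedup = old / new             # about 1.38x
reduction = (old - new) / old   # about 28% less wall-clock time
```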

[Figure: user's local machine (all code + data) and five workers (EC2)]
