Parallel Programming in R - Revolutions

Parallel Computing in R
BioC 2009, Seattle, July 2009

Who are REvolution Computing?
REvolution Computing:
– Is a commercial open-source company, founded in 2007
– Provides services and products based on R
  • The "Red Hat"® for R
– Produces free and subscription-based high-performance, enhanced distributions of R
– Offers support, training, validation and other services around R
– Has expertise in high-performance and distributed computing
– Is a financial and technical contributor to the R community
– Has operations in New Haven, Seattle, and San Francisco

The People of REvolution
• Martin Schultz, Chief Scientific Officer (Arthur K Watson Professor of Computer Science, Yale University; founder of Scientific Computing Associates; research in algorithm design, parallel programming environments and architectures)
• David Smith, Director of Community & R blogger (co-author of An Introduction to R, ESS)
• Bryan Lewis, Ambassador of Cool (aka Director of Systems Engineering; applied math interests in numerical analysis of inverse problems; former CEO of Rocketcalc)
• Danese Cooper, Open Source Diva (board of directors, Open Source Initiative; member, Apache Software Foundation; advisory board, Mozilla.org; previously senior director, open source strategies at Intel and Sun)
• Steve Weston, Senior Research Scientist, Director of Engineering (REvolution and Scientific Computing Associates; development of NetWorkSpaces – parallel programming with R, Python, Ruby, and Matlab – Network Linda, Paradise, and Piranha)
• Jay Emerson, Dept Statistics, Yale University (author of bigmemory package)

What is REvolution R?
• REvolution R is the free distribution of R
– Optimized for speed
– Uses multiple CPUs/cores for performance
– For Windows and MacOS (soon: Ubuntu)
– Support via community forums
• REvolution R Enterprise is our enhanced, subscription-only distribution of R
– Telephone/email support from real R experts
– Suitable for use in regulated/validated environments
– Includes proprietary ParallelR packages for reliable distributed computing with R
  • on clusters or in the cloud
– Supported on 64-bit Windows, Linux

Supporting the R Community
We are an open source company supporting the R community:
• Benefactor of R Foundation
• Financial supporter of R conferences and user groups
• New functionality developed in core R to be contributed under GPL
  • 64-bit Windows support
  • Step-debugging support
• REvangelism
"Revolutions" Blog: blog.revolution-computing.com
Daily news about R, statistics, and open-source

In today's lab:
• Introduction to Parallel Processing
• Multi-Threaded Processing
– Computing on the GPU
• Iterators
• The foreach loop
• Using multiple cores: SMP
• Cluster Computing
• Multi-Stratum parallelism
• Q&A / Exercises

Getting Started
• R 2.10.x, and 2.9.x on Windows:
    install.packages("foreach", type="source")
    install.packages("iterators", type="source")
• R 2.9.x, Mac/Linux only:
    install.packages("doMC")
    require(doMC)
• Windows/Mac:
– Install REvolution R Enterprise 2.0 (R 2.7.2)
    require(doNWS)

Introduction to Parallel Processing
With an aside to High-Performance Computing

What is High-Performance Computing (HPC)?
HPC often means efficiently exploiting specialized hardware
Images copyright Cray, Xilinx, NVIDIA from upper-left, clockwise.

What is High-Performance Computing (HPC)?
• These days, HPC is frequently associated with COTS* cluster computing and with SIMD vectorization and pipelining (GPUs)
• New: cloud computing
*Commodity, off the shelf
Image from ncbr.sdsc.edu

What is High-Performance Computing (HPC)?
• HPC is often concerned with multi-processing (parallel processing), the coördination of multiple, simultaneously running (sub)programs
– Threads
– Processes
– Clusters
Image copyright Lawrence Livermore National Lab

What is High-Performance Computing (HPC)?
HPC often involves effectively managing huge data sets
– Parallel file systems (GPFS, PVFS2, Lustre, GFS2, S3…)
– Parallel data operations (map-reduce)
– Working with high-performance databases
– bigmemory package in R
Image copyright HP (a multi-petabyte storage system)

A Taxonomy of Parallel Processing
• Multi-node / cluster / cloud computing (heavyweight processes)
– Memory distributed across network
– Examples: foreach, SNOW, Rmpi, batch processing
• Multi-core / multi-processor computing (heavyweight processes)
– SMP: Symmetric Multi-Processing
– Independent memory in shared space
– Naturally scales to multi-node processing
– Examples: multicore (Linux/MacOS), foreach
• Multi-threaded processing (lightweight processes)
– Usually shared memory
– Harder to scale out across networks
– Examples: threaded linear-algebra libraries for R (ATLAS, MKL); GPU processors (CUDA/NVIDIA; Ct/Intel)

Multi-Threaded Processing

What is threaded programming?
• A thread is a kind of process that shares address space with its parent process
• Created, destroyed, managed and synchronized in C code
– POSIX threads
– OpenMP threads
• Fast, but difficult to program
– Easy to overwrite variables
– Need to worry about synchronization

Exploiting threads with R
• R links to BLAS (Basic Linear Algebra Subprograms) libraries for efficient vector/matrix operations
• Linux: need to compile and link with threaded BLAS (ATLAS)
• Windows/Mac: REvolution R linked to Intel MKL libraries, uses as many threads as cores
– Many higher-level operations optimized as well
• MacOS: CRAN binary uses vecLib BLAS
– Threaded, pretty good performance
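
A rough way to see what the linked BLAS/LAPACK contributes is to time a dense linear-algebra call. The sketch below uses only base R. The matrix size is an arbitrary assumption for illustration, and the timing depends entirely on which BLAS your R build links to; `extSoftVersion()` is available in recent R versions only.

```r
# Sketch: time an SVD, which R delegates to the linked BLAS/LAPACK libraries.
# The 2000 x 200 size is an arbitrary choice for illustration.
set.seed(1)
X <- matrix(rnorm(2000 * 200), nrow = 2000)

elapsed <- system.time(s <- svd(X))["elapsed"]

# Sanity check: the factors reconstruct X up to rounding error.
recon_err <- max(abs(X - s$u %*% diag(s$d) %*% t(s$v)))

cat("BLAS in use:", extSoftVersion()["BLAS"], "\n")
cat("SVD elapsed seconds:", elapsed, "\n")
cat("Max reconstruction error:", recon_err, "\n")
```

Running the same script against a reference BLAS and a threaded BLAS is one way to compare the two on your own hardware.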

REvolution R SVD Performance
Example data matrix: 150,000 x 500, using fast.svd
Quad-core Intel Core 2 CPU, Windows Vista 64-bit workstation

GPU Programming

What is a GPU?
• Processing chip (or card) dedicated to fast floating-point operations
– Originally for 3-D graphics calculations
• Highly parallel: 100's of processors on a single chip, capable of running 1000's of threads
• Usually includes dedicated high-speed RAM, accessible only by GPU
– Need to transfer data in/out
• Programmed directly using custom C dialect/compilers
• >90% of new desktops/laptops have an integrated GPU

GeForce 8800 GT
• Launched Oct 29, 2007
• 512 MB of 256-bit memory
• 128 processors
• 512 simultaneous threads
• <$200
• Download NVIDIA CUDA Tools:
– http://www.nvidia.com/object/cuda_home.html
• Tutorial
– http://www.ddj.com/architect/207200659

Performance Comparison
Convolve 2 vectors of length 2^22 (60 MB of data)
Quad dual-core processor / GPU

Method                      Time (seconds)
Base R convolve function    9.89
AMD ACML                    6.29
FFTW (8 threads)            3.75
CUDA on GeForce 8800 GT     1.88 (single precision)
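
For reference, the base R baseline in the table can be tried with a few lines. This is only a sketch of the base R `convolve` call, not the benchmark script used for the numbers above. The helper `direct_conv` and the vector lengths are assumptions for illustration. The lengths in the timing step are chosen so that the zero-padded length is a power of two, since `fft` is fastest for highly composite lengths.

```r
# Base R convolve() uses the FFT internally. Per its documentation, the usual
# convolution of x and y is convolve(x, rev(y), type = "open").
x <- c(1, 2, 3)
y <- c(0, 1, 0.5)
fft_conv <- convolve(x, rev(y), type = "open")

# Direct O(n * m) definition, for comparison on small inputs (hypothetical helper).
direct_conv <- function(x, y) {
  out <- numeric(length(x) + length(y) - 1)
  for (i in seq_along(x)) {
    for (j in seq_along(y)) {
      out[i + j - 1] <- out[i + j - 1] + x[i] * y[j]
    }
  }
  out
}
print(all.equal(fft_conv, direct_conv(x, y)))

# Timing on longer vectors. The padded length is n + m - 1 = 2^16 here,
# much smaller than the 2^22-element vectors in the table.
a <- runif(2^15)
b <- runif(2^15 + 1)
print(system.time(convolve(a, rev(b), type = "open")))
```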

Introducing Iterators

Thought Experiment: Drawing Cards
• You are the teacher of 10 grade-school pupils.
• Class project: draw each of the 52 playing cards as a poster.
• Each child has supplies of poster paper and crayons, but requires a reference card to copy.
• How to organize the pupils?

Iterators
    > require(iterators)
• Generalized loop variable
• Value need not be atomic
– Row of a matrix
– Random data set
– Chunk of a data file
– Record from a database
• Create with: iter
• Get values with: nextElem
• Used as indexing argument with foreach

Iterators are memory friendly
• Allow data to be split into manageable pieces on the fly
• Help alleviate problems with processing large data structures
• Pieces can be processed in parallel
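
A minimal sketch of the idea, assuming the foreach and iterators packages are installed: a matrix is handed out in blocks of rows, each block is reduced on its own, and the partial results are combined. The tiny matrix and the chunk size of 4 are assumptions for illustration, and `%do%` is used so that no parallel backend is needed.

```r
library(foreach)
library(iterators)

# A small matrix stands in for a large data structure (assumption for illustration).
M <- matrix(1:20, nrow = 10)

# iter() hands out blocks of up to 4 rows at a time; each block is summed
# independently and the partial sums are combined with "+".
total <- foreach(chunk = iter(M, by = "row", chunksize = 4),
                 .combine = "+") %do% sum(chunk)
print(total)
```

Swapping `%do%` for `%dopar%` would let a registered backend process the blocks in parallel.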

Iterators act as adaptors
• Allow your data to be processed by foreach without being converted
• Can iterate over matrices and data frames by row or by column:
    it <- iter(Boston, by="row")
    nextElem(it)

Numeric Iterator
    > i <- iter(1:3)
    > nextElem(i)
    [1] 1
    > nextElem(i)
    [1] 2
    > nextElem(i)
    [1] 3
    > nextElem(i)
    Error: StopIteration

Long sequences
    > i <- icount(1e9)
    > nextElem(i)
    [1] 1
    > nextElem(i)
    [1] 2
    > nextElem(i)
    [1] 3
    > nextElem(i)
    [1] 4
    > nextElem(i)
    [1] 5

Matrix dimensions
    > M <- matrix(1:25, ncol=5)
    > r <- iter(M, by="row")
    > nextElem(r)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    6   11   16   21
    > nextElem(r)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    2    7   12   17   22
    > nextElem(r)
         [,1] [,2] [,3] [,4] [,5]
    [1,]    3    8   13   18   23

Data File
    > rec <- iread.table("MSFT.csv", sep=",", header=T, row.names=NULL)
    > nextElem(rec)
      MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
    1     29.91     30.25     29.4      29.86    76935100         28.73
    > nextElem(rec)
      MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
    1      29.7     29.97    29.44      29.81    45774500         28.68
    > nextElem(rec)
      MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
    1     29.63     29.75    29.45      29.64    44607200         28.52
    > nextElem(rec)
      MSFT.Open MSFT.High MSFT.Low MSFT.Close MSFT.Volume MSFT.Adjusted
    1     29.65      30.1    29.53      29.93    50220200          28.8

Database
    > library(RSQLite)
    > m <- dbDriver('SQLite')
    > con <- dbConnect(m, dbname="arrests")
    > it <- iquery(con, 'select * from USArrests', n=10)
    > nextElem(it)
                Murder Assault UrbanPop Rape
    Alabama       13.2     236       58 21.2
    Alaska        10.0     263       48 44.5
    Arizona        8.1     294       80 31.0
    Arkansas       8.8     190       50 19.5
    California     9.0     276       91 40.6
    Colorado       7.9     204       78 38.7
    Connecticut    3.3     110       77 11.1
    Delaware       5.9     238       72 15.8
    Florida       15.4     335       80 31.9
    Georgia       17.4     211       60 25.8

Infinite & Irregular sequences
    iprime <- function() {
      lastPrime <- 1
      nextEl <- function() {
        lastPrime <<- as.numeric(nextprime(lastPrime))
        lastPrime
      }
      it <- list(nextElem=nextEl)
      class(it) <- c('abstractiter', 'iter')
      it
    }

    > require(gmp)
    > p <- iprime()
    > nextElem(p)
    [1] 2
    > nextElem(p)
    [1] 3
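
The same closure pattern works for any sequence that can be generated one value at a time. As a sketch that avoids the gmp dependency, here is a Fibonacci iterator; `ifib` is a hypothetical name chosen for this illustration, and only the iterators package is assumed.

```r
library(iterators)

# Hypothetical example: an endless Fibonacci iterator built the same way as iprime().
ifib <- function() {
  a <- 0
  b <- 1
  nextEl <- function() {
    val <- b        # value returned on this call
    nxt <- a + b    # next value in the sequence
    a <<- b
    b <<- nxt
    val
  }
  it <- list(nextElem = nextEl)
  class(it) <- c("abstractiter", "iter")
  it
}

fib <- ifib()
first6 <- sapply(1:6, function(k) nextElem(fib))
print(first6)
```

Because the state lives in the closure, the iterator never has to materialize the whole sequence.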

Looping with foreach

Looping with foreach
    foreach (var=iterator) %dopar% { statements }
• Evaluate statements until iterator terminates
• statements will reference variable var
• Values of { … } block collected into a list
• Runs sequentially by default (or force with %do%)

    > foreach (j=1:4) %dopar% sqrt(j)
    [[1]]
    [1] 1

    [[2]]
    [1] 1.414214

    [[3]]
    [1] 1.732051

    [[4]]
    [1] 2

    Warning message:
    executing %dopar% sequentially: no parallel backend registered

Combining Results
    > foreach(j=1:4, .combine=c) %dopar% sqrt(j)
    [1] 1.000000 1.414214 1.732051 2.000000

    > foreach(j=1:4, .combine='+', .inorder=FALSE) %dopar% sqrt(j)
    [1] 6.146264

When order of evaluation is unimportant, use .inorder=FALSE
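
A few more `.combine` variations, as a sketch assuming only the foreach package. `%do%` is used so the example runs without a registered backend. The variable names and the custom maximum-keeping function are illustrative choices, not taken from the slides.

```r
library(foreach)

# Collect results into a vector instead of a list.
as_vector <- foreach(j = 1:4, .combine = c) %do% sqrt(j)

# Reduce results with an operator.
as_sum <- foreach(j = 1:4, .combine = "+") %do% sqrt(j)

# Stack per-iteration vectors as rows of a matrix.
as_matrix <- foreach(j = 1:3, .combine = rbind) %do% c(j, j^2)

# Any two-argument function can serve as the combiner.
keep_max <- foreach(j = c(3, 9, 4),
                    .combine = function(a, b) max(a, b)) %do% j

print(as_vector)
print(as_sum)
print(as_matrix)
print(keep_max)
```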

Referencing global variables
    > z <- 2
    > f <- function (x) sqrt (x + z)
    > foreach (j=1:4, .combine='+') %dopar% f(j)
    [1] 8.417609

foreach automatically inspects code and ensures unbound objects are propagated to the evaluation environment

Nested foreach execution
• foreach operations can be nested using the %:% operator
• Allows parallel execution across multiple loop levels, "unrolling" the inner loops
    foreach(i=1:3, .combine=cbind) %:%
      foreach(j=1:3, .combine=c) %dopar% (i + j)
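
To see what the nested loop produces, the same expression can be run sequentially. This sketch assumes only the foreach package and swaps `%dopar%` for `%do%` so no backend is required.

```r
library(foreach)

# Inner loop builds one column (combined with c); outer loop binds the columns.
m <- foreach(i = 1:3, .combine = cbind) %:%
  foreach(j = 1:3, .combine = c) %do% (i + j)

print(m)

# Entry [j, i] holds i + j, so the result matches an outer sum.
print(all(unname(m) == outer(1:3, 1:3, "+")))
```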

Speeding up code with foreach
SMP Processing

Quick review of parallel and distributed computing in R
• NetWorkSpaces (package nws; SMP, distributed)
– GPL, also commercially supported by REvolution Computing
– Very cross-platform, distributed shared-memory paradigm
– Fault-tolerant
• MultiCore (package multicore; SMP only)
– Linux/MacOS (requires POSIX)
– Uses fork to create new R processes
• Rmpi (package Rmpi; SMP, distributed)
– Fine-grained control allows very high-performance calculations
– Can be tricky to configure
– Limited Windows and heterogeneous cluster support
• SNOW (package snow; SMP, distributed*)
– Limited Windows support (*single machine only)
– Meta-package: supports MPI, sockets, NWS, PVM

Parallel backends for foreach
• %dopar% behaviour depends on current "registered" parallel backend
• Modular parallel backends
  • registerDoSEQ (default)
  • registerDoNWS (NetWorkSpaces)
  • registerDoMC (multicore, MacOS/Linux)
    • From Terminal/ESS only! (R.app GUI will crash.)
  • registerDoSNOW
  • registerDoRMPI
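
As a sketch of how registration affects `%dopar%`, the snippet below registers the sequential backend explicitly and queries it. It assumes only the foreach package. With another backend installed, the corresponding `registerDo...` call would take the place of `registerDoSEQ()` and the loop itself would stay unchanged.

```r
library(foreach)

# Register the sequential backend explicitly; this also avoids the
# "no parallel backend registered" warning from %dopar%.
registerDoSEQ()

cat("Backend:", getDoParName(), "\n")
cat("Workers:", getDoParWorkers(), "\n")

# The loop code does not mention the backend at all.
res <- foreach(j = 1:4, .combine = c) %dopar% sqrt(j)
print(res)
```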

Getting Started: Multi-core Processing
• R 2.10.x: wait until official release
• R 2.9.x
    require(doMC)
    registerDoMC(cores=2)
• REvolution R Enterprise
    require(doNWS)
    s <- sleigh(workerCount=2)
    registerDoNWS()

A simple simulation:
    birthday <- function(n) {
      ntests <- 1000
      pop <- 1:365
      anydup <- function(i)
        any(duplicated(sample(pop, n, replace=TRUE)))
      sum(sapply(seq(ntests), anydup)) / ntests
    }

    x <- foreach (j=1:100) %dopar% birthday(j)
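
One way to sanity-check the simulation is to compare it with the closed-form probability that at least two of n people share a birthday, 1 - prod((365 - 0:(n-1)) / 365). The sketch below assumes the foreach package and runs sequentially with `%do%`. The `exact` helper, the `ntests` argument, the seed, and the three group sizes are assumptions added for illustration.

```r
library(foreach)

# Same simulation as above, with ntests exposed as an argument.
birthday <- function(n, ntests = 1000) {
  pop <- 1:365
  anydup <- function(i) any(duplicated(sample(pop, n, replace = TRUE)))
  sum(sapply(seq(ntests), anydup)) / ntests
}

# Closed-form probability of at least one shared birthday (hypothetical helper).
exact <- function(n) 1 - prod((365 - 0:(n - 1)) / 365)

set.seed(2009)
sizes <- c(10, 23, 40)
simulated <- foreach(j = sizes, .combine = c) %do% birthday(j)

print(cbind(n = sizes, simulated = simulated, exact = sapply(sizes, exact)))
```

The simulated values should land close to the exact ones, with sampling noise on the order of a few percent at 1000 trials.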

Birthday Example - timings
Dual-core 2.4 GHz Intel MacBook:
    system.time({ x <- foreach (j=1:100) %dopar% birthday(j) })  # Elapsed

Backend                        Time (s)
registerDoSEQ()                41
registerDoMC()   # 2 cores     28
registerDoNWS()  # 2 workers   26 (*)

Birthday Simulation: Multicore/NWS
    > x <- foreach (j=1:100) %dopar% birthday(j)
    > plot(1:100, unlist(x), type="l")

Using clusters with NetWorkSpaces

Setting Up a Cluster
1. Identify machines to form nodes on cluster
– Easiest with Linux/MacOS
– Possible with Windows
2. Select a server machine
– OK for this one to be on Windows
3. Make sure passwordless ssh is enabled on each worker node
– ssh nodename Revo --version should work

Setting up a cluster, part 2
4. Log in to server, start REvolution R
5. Create a sleigh
    require(doNWS)
    s <- sleigh(nodeList=c(rep("localhost",2), rep("thor",8), rep("loki",4)),
                launch=sshcmd)
    registerDoNWS(s)
6. Use foreach as before
7. (optional) use joinSleigh to add new nodes

Parallel Random Forest
    # a simple parallel random forest
    library(randomForest)
    x <- matrix(runif(500), 100)
    y <- gl(2, 50)
    wc <- 2
    n <- ceiling(1000 / wc)
    registerDoNWS(s)
    foreach(ntree=rep(n, wc), .combine=combine,
            .packages='randomForest') %dopar%
      randomForest(x, y, ntree=ntree)
• Easier: randomShrubberyNWS()

Converting existing code
• Convert these loops to foreach:
– for: make body return iteration value and use .combine
– apply: use iter(X, by="row") and .combine
  • Or iapply(X, 1)
– lapply: use iter(mylist)
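
The three conversions can be sketched side by side. This example assumes the foreach and iterators packages and uses `%do%` so it runs without a backend. The toy data and variable names are assumptions for illustration.

```r
library(foreach)
library(iterators)

# for loop -> foreach: the body returns the value; .combine assembles the result.
squares_for <- numeric(3)
for (i in 1:3) squares_for[i] <- i^2
squares_fe <- foreach(i = 1:3, .combine = c) %do% { i^2 }

# apply over rows -> iterate over rows with iter(X, by = "row").
X <- matrix(1:6, nrow = 2)
rowsum_apply <- apply(X, 1, sum)
rowsum_fe <- foreach(r = iter(X, by = "row"), .combine = c) %do% sum(r)

# lapply -> iterate over the list with iter(mylist); the result is a list.
mylist <- list(a = 1:3, b = 4:6)
means_lapply <- lapply(mylist, mean)
means_fe <- foreach(v = iter(mylist)) %do% mean(v)

print(squares_fe)
print(rowsum_fe)
print(unlist(means_fe))
```

Once a loop is in this form, changing `%do%` to `%dopar%` is the only edit needed to hand it to a registered backend.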

ppiStats Example
• Sequential:
    bpMats1 <- lapply(bpList, function(x) {
      bpMatrix(x, symMat = TRUE, homodimer = FALSE,
               baitAsPrey = FALSE, unWeighted = FALSE,
               onlyRecip = FALSE, baitsOnly = FALSE)
    })
• Parallel:
    bpMats1 <- foreach(x=iter(bpList), .packages = "ppiStats") %dopar% {
      bpMatrix(x, symMat = TRUE, homodimer = FALSE,
               baitAsPrey = FALSE, unWeighted = FALSE,
               onlyRecip = FALSE, baitsOnly = FALSE)
    }

ppiStats Example
• Sequential:
    bpGraphs <- lapply(bpMats1, function(x) {
      genBPGraph(x, directed = TRUE, bp = FALSE)
    })
• Parallel:
    bpGraphs <- foreach(x=iter(bpMats1), .packages = "ppiStats") %dopar% {
      genBPGraph(x, directed = TRUE, bp = FALSE)
    }

Exercise
• Find other "embarrassingly parallel" Bioconductor examples, and convert to parallel with foreach.

Multi-Stratum Parallelism

An example of explicit multi-stratum parallelism
[Diagram: an outer foreach(iterator) %dopar% {tasks} loop spans the CLUSTER level; each of its tasks runs an inner foreach loop whose tasks run at the SMP level.]

Multi-stratum template: NWS/Multicore
    require("doNWS")
    require("foreach")
    require("doMC")

    s <- sleigh(nodeList=c(rep("localhost",2), rep("bladeserver",8)))
    registerDoNWS(s)

    # outer loop: tasks are distributed across the cluster by NWS
    foreach(iterator_i, .packages=c("foreach", "doMC")) %dopar% {
      # inner loop: each worker spreads its tasks over local cores
      registerDoMC()
      foreach(iterator_j) %dopar% {
        tasks…
      }
    }

Pitfalls to avoid
• Sequential vs Parallel Programming
• Random Number Generation
– library(sprngNWS)
– sleigh(workerCount=8, rngType='sprngLFG')
• Node failure
• Cosmic Rays
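
The random-number pitfall is that workers seeded naively may produce overlapping or non-reproducible streams. The slide's remedy is sprngNWS; the sketch below swaps in a different technique, the L'Ecuyer-CMRG generator and `nextRNGStream` from R's bundled parallel package, because it runs without a sleigh. The helper names `make_streams` and `draw_with_seed`, the seed, and the task count are assumptions for illustration.

```r
library(parallel)

# Use a generator that supports multiple independent streams.
RNGkind("L'Ecuyer-CMRG")
set.seed(2009)

# Derive one stream seed per task (hypothetical helper).
make_streams <- function(n) {
  seeds <- vector("list", n)
  s <- .Random.seed
  for (k in seq_len(n)) {
    seeds[[k]] <- s
    s <- nextRNGStream(s)
  }
  seeds
}

# A task installs its own stream seed before drawing (hypothetical helper).
draw_with_seed <- function(seed, n = 3) {
  assign(".Random.seed", seed, envir = globalenv())
  runif(n)
}

streams <- make_streams(4)
run1 <- lapply(streams, draw_with_seed)
run2 <- lapply(streams, draw_with_seed)

# Each task gets different numbers, and reruns reproduce them exactly.
print(run1)
print(identical(run1, run2))
```

In a real parallel run, each worker would receive one of these seeds along with its task.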

Conclusions
• Parallel computing is easy!
• Write loops with foreach / %dopar%
– Works fine in a single-processor environment
– Third-party users can register backends for multiprocessor or cluster processing
– Speed benefits without modifying code
• Easy performance gains on modern laptops / desktops
• Expand to clusters for meaty jobs

Thank You!
• David Smith
– david@revolution-computing.com, @revodavid
• REvolution Computing
– www.revolution-computing.com
• Revolutions, the R blog
– blog.revolution-computing.com
• Downloads:
– Slides: http://tinyurl.com/R-Bioc-slides
– Script: http://tinyurl.com/R-Bioc-script