Parallel MATLAB: Doing it Right

Ron Choy

CSAIL, MIT

• The case for parallel MATLAB

• Parallel MATLAB Survey

• Star-P

The Rise of Clusters

• Cheap and everyone seems to have one

Some MIT Clusters

Impact of Clusters

• More and more ‘non-experts’ are using clusters in their work/research

• Increased computing power does not necessarily translate to increased productivity

• Someone has to program these parallel machines, and this can be hard

Classes of interesting problems

• Large collections of small problems: easy, usually a matter of running a large number of serial programs (embarrassingly parallel)

• Large problems: might not even fit into the memory of a single machine. Much harder, since they usually involve communication between processes

Parallel programming (1990 – now)

• Dominant model (for distributed memory): MPI (Message Passing Interface)

• Low level control of communications between processes

• Send, Recv, Scatter, Gather …

MPI

• It is not an easy way to do parallel programming.

• Error prone – it is hard to get complex low level communications right

• Hard to debug – MPI programs tend to die with obscure error messages. There is also no good debugger for MPI. (Totalview? Multiple GDBs?)

The result?

• A lot of researchers in the sciences need the power of parallel computing, but do not have the expertise to tackle the programming

• Time saved by using parallel computers is offset by the additional time needed to program them

• Reduced productivity!

The result? (2)

• The parallel programs that do get written might not be efficient either: writing good parallel programs is hard

• A high level tool is needed to hide the complexity away from the users of parallel computers

Let’s go back to the serial world for a moment

What do people want?

• Real applications programmers care less about “p-fold speedups”

• Ease of programming, ease of debugging, and getting the correct answer matter much more

MATLAB

• MATrix LABoratory

• Started out as an interactive frontend to LINPACK and EISPACK

• Has since grown into a powerful computing environment with ‘Toolboxes’ ranging from neural networks to computational finance to image processing

• Many current users know nothing/little about linear algebra (Toolbox users)

MATLAB ‘language’

• MATLAB has a C-like scripting language, with FORTRAN90-like array indexing

• Interpreted language

• A wide range of linear algebra routines

• Very popular in the scientific community as a prototyping tool: easier to use than C/FORTRAN

MATLAB ‘language’ (2)

• MATLAB makes use of ATLAS (optimized BLAS) for basic linear algebra routines

• Still, performance is not as good as C/FORTRAN, especially for loops

• However, MATLAB is catching up with Release 13 (just-in-time compiling?)

Why do people like MATLAB?

• MATLAB made good choices on numerical libraries that are efficient, and combined them into one interactive, easy to use package.

• People were skeptical at first because they felt MATLAB slowed things down. In the end, convenience, not performance, won.

Wouldn’t it be nice to mix the convenience of MATLAB with parallel computing?

Why there should be a parallel MATLAB

• Cleve Moler, ‘Why there isn’t a parallel MATLAB’
  – Memory Model
  – Granularity
  – Business Situation

Parallel MATLAB Survey

• http://theory.lcs.mit.edu/~cly/survey.html

• 27 parallel MATLAB projects found, using 4 approaches
  – Embarrassingly Parallel
  – Message Passing
  – Backend Support
  – Compiler

Parallel MATLABs

Each project is listed with its home institution, in the four groups shown on the slides:

• CMTM (Cornell)
• DP-Toolbox (Rostock, Germany)
• MatlabMPI (Lincoln Labs)
• MATmarks (UIUC)
• MPITB/PVMTB (Granada, Spain)
• MultiMATLAB (Cornell)
• Parallel Toolbox for MATLAB (Wake Forest)

• DistributePP (WUStL)
• MATLAB Parallelization Toolkit (Linköping, Sweden)
• MULTI Toolbox (Purdue)
• Paralize (Chalmers, Sweden)
• Parmatlab (Northeastern)
• Plab (TU Denmark)
• PMI (Lucent)

• Dlab (UIUC)
• Paramat (Alpha Data UK)
• PLAPACK (UT Austin)
• Matpar (JPL)
• MATLAB*P (MIT)
• Netsolve (UTK)

• CONLAB Compiler (Umeå, Sweden)
• FALCON (UIUC)
• MATCH (Accelchip)
• Menhir (IRISA, France)
• Otter (OSU)
• ParAL (Sydney, Australia)
• RTExpress (ISI)

Embarrassingly Parallel

• Multiple MATLABs needed

• Scatter/Gather at the beginning and end

• No lateral communication
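The scatter, compute, gather pattern described above can be sketched in a few lines. This is a minimal illustration in Python rather than MATLAB, so that it runs without any of the surveyed toolboxes. Threads stand in for the separate MATLAB processes, and `scatter`, `solve_small_problem`, `worker` and `run` are hypothetical names, not functions from any surveyed project.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(items, n_workers):
    """Split items into n_workers contiguous chunks of near-equal size."""
    base, extra = divmod(len(items), n_workers)
    chunks, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def solve_small_problem(x):
    # Stand-in for one independent serial job (hypothetical workload).
    return x * x


def worker(chunk):
    # Each worker sees only its own chunk: no lateral communication.
    return [solve_small_problem(x) for x in chunk]


def run(items, n_workers=4):
    chunks = scatter(items, n_workers)                 # scatter at the beginning
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker, chunks))      # independent work
    return [y for part in partials for y in part]      # gather at the end


results = run(list(range(10)))
```

A real system of this kind would start one MATLAB process per chunk; threads are used here only to keep the sketch self-contained.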

Message Passing

• Multiple MATLABs required

• MATLABs talk to each other in a general MPI fashion

• Superset of embarrassingly parallel

Backend support

• Requires 1 MATLAB

• MATLAB is used as front end to a parallel computation engine

Compiler

Compiler (2)

• Requires zero to one MATLAB

• Compiles MATLAB scripts into native code and links them with parallel numerical libraries

What do we learn from the survey?

• The projects themselves are ‘existential proofs’ for what users want from a parallel MATLAB

• There are way too many parallel MATLAB projects

Wouldn’t it be nice to have a single project that covers it all, and does it better than the rest?

Star-P

• Goal: To provide a unified parallel MATLAB for user-friendly parallel computing

• Uses MATLAB as a front end to a parallel linear algebra server

• Encompasses the 4 approaches in the survey

Star-P (2)

• UI idea based on MATLAB*P

• Substantially different from MATLAB*P
  – New code base
  – Unified vs. backend support

Star-P (3)

• Server leverages popular parallel numerical libraries:
  – ScaLAPACK
  – FFTW
  – PARPACK
  – …

• As well as custom-made parallel routines
  – Sorting
  – Sparse Matrix routines

Star-P Example

% Power iteration; sizes multiplied by p are distributed
A = rand(4000*p, 4000*p);
x = randn(4000*p, 1);
y = zeros(size(x));
while norm(x-y) / norm(x) > 1e-11
    y = x;
    x = A*x;
    x = x / norm(x);
end;
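The loop above is a power iteration: multiply by A, renormalize, and stop once successive iterates agree to a relative tolerance of 1e-11. A serial sketch of the same loop follows, written in Python with plain lists so that it runs without Star-P. The problem size, the random seed and the `max_iter` guard are assumptions added for illustration and do not appear on the slide.

```python
import math
import random


def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(c * c for c in v))


def power_iteration(A, x, tol=1e-11, max_iter=10000):
    y = [0.0] * len(x)                       # y = zeros(size(x))
    iters = 0
    while norm([a - b for a, b in zip(x, y)]) / norm(x) > tol and iters < max_iter:
        y = x                                # y = x
        x = matvec(A, x)                     # x = A*x
        nx = norm(x)
        x = [c / nx for c in x]              # x = x / norm(x)
        iters += 1
    return x, iters


random.seed(0)
n = 50                                       # small stand-in for 4000
A = [[random.random() for _ in range(n)] for _ in range(n)]
x0 = [random.gauss(0.0, 1.0) for _ in range(n)]
v, iters = power_iteration(A, x0)

# Rayleigh quotient of the unit vector v estimates the dominant eigenvalue.
lam = sum(a * b for a, b in zip(v, matvec(A, v)))
residual = norm([av - lam * c for av, c in zip(matvec(A, v), v)])
```

In the Star-P version the only change to the serial MATLAB script is the `*p` in the sizes; the loop body is untouched.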

Structure of the System

[Figure: system diagram. Labeled components: Client (MATLAB), Proxy, Client Manager, Server Manager, Package Manager, Matrix Manager (Matrix 1, Matrix 2, Matrix 3), packages (PBlas, FFTW, Scalapack, PBase), and Servers #0 through #4.]

Structure of the system (2)

• Client (C++)
  – Attaches to MATLAB through the MEX interface
  – Relays commands to the server and processes the returns

• Server (C++/MPI)
  – Manipulates/processes data (matrices)
  – Performs the actual parallel computation by calling the appropriate library routines

Functionalities provided

• PBLAS, ScaLAPACK
  – solve, eig, svd, chol, matmul, +, -, …

• FFTW
  – 1D and 2D FFT

• PARPACK
  – eigs, svds

• Sparse Matrix Routines

• ‘MultiMATLAB’ mode

Focus

• Requires minimal learning on the user’s part

• Reuses existing scripts

• Mimics MATLAB behaviour

• Data stay on the backend until explicitly retrieved by the user

• Extendable backend

• A system that fits all users

Comparison

Star-P                                           MATLAB
Frontend to PBLAS, ScaLAPACK, …                  Frontend to LINPACK, EISPACK
Focus is on usability, not performance           Focus is on usability, not performance
Extendable through ‘Packages’ and ‘Toolboxes’    Extendable through ‘Toolboxes’

What is *p?

• Star-P creates a new MATLAB class called dlayout

• By providing overloaded functions for this new class, parallelism is achieved transparently

• *p is dlayout(1)
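The mechanism described above (a special object that tags a size as distributed, plus functions that behave differently when they see the tag) can be illustrated with operator overloading. The sketch below is in Python rather than MATLAB and is not Star-P's implementation: the `DLayout` class, its `size` field and the `describe` function are hypothetical, since the slides do not show the internals of dlayout.

```python
class DLayout:
    """Hypothetical stand-in for dlayout: it only tags a size as distributed."""

    def __init__(self, size=1):
        self.size = size

    def __rmul__(self, n):
        # Handles 1024 * p: the integer does not know DLayout, so Python asks us.
        return DLayout(n * self.size)

    def __mul__(self, n):
        # Handles p * 1024.
        return DLayout(self.size * n)


p = DLayout(1)  # mirrors "*p is dlayout(1)"


def describe(rows, cols):
    """Report what a matrix constructor would build for these dimensions."""
    dims = (rows, cols)
    distributed = any(isinstance(d, DLayout) for d in dims)
    shape = tuple(d.size if isinstance(d, DLayout) else d for d in dims)
    return ("distributed" if distributed else "local"), shape


kind_a, shape_a = describe(1024 * p, 1024 * p)
kind_b, shape_b = describe(3, 4)
```

Because the tag travels with the size argument, a script written for ordinary matrices needs no other change for the overloaded versions of its functions to be selected.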

Example

A = randn(1024*p,1024*p);
E = eig(A);
e = pp2matlab(E);
plot(e,'*');

Example 2

• H = hilb(n) % H = hilb(8000*p)
    J = 1:n;
    J = J(ones(n,1),:);
    I = J';
    E = ones(n,n);
    H = E./(I+J-1);

Built-in MATLAB functions

• hilb is a MATLAB function that gets parallelized automatically

• Another example of a MATLAB function that works is cgs

• We do not know how many of them work
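One way to see why hilb carries over is that its body consists only of whole-array steps: build an index array, transpose it, and divide elementwise. The sketch below transliterates those steps into Python with nested lists, as a reading aid only. It is serial and says nothing about how Star-P executes each step.

```python
def hilb(n):
    """Hilbert matrix built with the same whole-array steps as the MATLAB source."""
    # J = 1:n; J = J(ones(n,1),:);  every row is 1..n, so J(i,j) = j
    J = [list(range(1, n + 1)) for _ in range(n)]
    # I = J';  the transpose, so I(i,j) = i
    I = [[J[j][i] for j in range(n)] for i in range(n)]
    # E = ones(n,n);
    E = [[1.0] * n for _ in range(n)]
    # H = E./(I+J-1);  elementwise division
    return [[E[i][j] / (I[i][j] + J[i][j] - 1) for j in range(n)]
            for i in range(n)]


H = hilb(3)
```

If indexing, transpose and elementwise arithmetic are each overloaded for distributed matrices, a function written this way needs no changes of its own.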

Where is the data?

• One often-asked question in parallel computing is: where are the data residing?

• Star-P supports 3 data distributions:
  – A = randn(m*p,n): row distribution
  – B = randn(m,n*p): column distribution
  – C = randn(m*p,n*p): 2D block cyclic distribution
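For the 2D block cyclic case, the question of where an entry lives has a closed-form answer. The sketch below, in Python, uses the usual block-cyclic rule (block index modulo the number of processes, applied independently to rows and columns). The block sizes, the 2 x 2 process grid and the row-major rank numbering are assumptions for illustration; the slides do not state which values Star-P uses.

```python
def owner_1d(index, block, nprocs):
    """Process owning a zero-based index under a 1D block-cyclic layout."""
    return (index // block) % nprocs


def owner_2d(i, j, block_rows, block_cols, proc_rows, proc_cols):
    """(process row, process column) owning entry (i, j)."""
    return (owner_1d(i, block_rows, proc_rows),
            owner_1d(j, block_cols, proc_cols))


def owner_map(m, n, block_rows, block_cols, proc_rows, proc_cols):
    """Rank of the owning process (row-major grid) for every entry of an m x n matrix."""
    ranks = []
    for i in range(m):
        row = []
        for j in range(n):
            pr, pc = owner_2d(i, j, block_rows, block_cols, proc_rows, proc_cols)
            row.append(pr * proc_cols + pc)
        ranks.append(row)
    return ranks


# 4 x 8 matrix, 2 x 2 blocks, 2 x 2 process grid (all assumed values).
layout = owner_map(4, 8, 2, 2, 2, 2)
for row in layout:
    print(row)
```

Setting `proc_cols` to 1 makes every process own complete rows, and setting `proc_rows` to 1 makes every process own complete columns, which is one way to relate the row and column distributions to the general 2D case.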

MultiMATLAB mode

• Similar to MultiMATLAB from Cornell

• Starts up a MATLAB process on each node, and runs MATLAB scripts on them in parallel

• Interesting feature is that the data can come from the ‘regular’ mode of Star-P

Example 3

% FFT2 in 4 lines
A = randn(10000,10000*p);
B = mm('fft',A);
C = B.';
D = mm('fft',C);
F = D.';
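The four lines rely on the fact that a 2D transform separates into two passes of 1D transforms with a transpose in between. The sketch below checks that identity in Python on a small matrix, using a naive DFT in place of FFTW. It assumes that each `mm('fft', ...)` call amounts to transforming every column independently, which is how MATLAB's fft treats a matrix; the distribution of columns across nodes is not modeled.

```python
import cmath


def dft(v):
    """Naive O(n^2) discrete Fourier transform of one vector."""
    n = len(v)
    return [sum(v[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def fft_columns(M):
    """Transform every column independently."""
    return transpose([dft(col) for col in transpose(M)])


def fft2_by_passes(A):
    B = fft_columns(A)       # B = mm('fft', A)
    C = transpose(B)         # C = B.'
    D = fft_columns(C)       # D = mm('fft', C)
    return transpose(D)      # F = D.'


def dft2_direct(A):
    """Direct evaluation of the 2D DFT definition, used only as a check."""
    m, n = len(A), len(A[0])
    return [[sum(A[r][c] * cmath.exp(-2j * cmath.pi * (u * r / m + v * c / n))
                 for r in range(m) for c in range(n))
             for v in range(n)]
            for u in range(m)]


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
F = fft2_by_passes(A)
G = dft2_direct(A)
max_err = max(abs(F[u][v] - G[u][v]) for u in range(2) for v in range(3))
```

Each column transform is independent of the others, so a column-distributed matrix can be processed without communication; the transposes are the only steps that move data between nodes.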

MPI in MATLAB

• User can use MPI-like commands inside MultiMATLAB mode

function result = simpletest(X)
import mpi.*;
MPI.init;
if MPI.COMM_WORLD.rank == 0;
    data = [4 2];
    MPI.COMM_WORLD.send(data, 2, 1, 13);
    result = 9;
elseif MPI.COMM_WORLD.rank == 1
    got = MPI.COMM_WORLD.recv(2, 0, 13);
    result = got(1) + got(2);
end
MPI.fnlz;
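The exchange in simpletest (rank 0 sends two numbers, rank 1 receives and adds them) can be imitated without MPI to make the control flow concrete. The sketch below is in Python, with threads standing in for ranks and queues standing in for the network. The `Comm` class and its `send` and `recv` signatures are hypothetical and are neither Star-P's interface nor an MPI binding.

```python
import queue
import threading


class Comm:
    """Toy communicator: one mailbox per (source, destination, tag) triple."""

    def __init__(self, size):
        self.size = size
        self._boxes = {}
        self._lock = threading.Lock()

    def _box(self, source, dest, tag):
        with self._lock:
            return self._boxes.setdefault((source, dest, tag), queue.Queue())

    def send(self, data, source, dest, tag):
        self._box(source, dest, tag).put(list(data))

    def recv(self, source, dest, tag):
        # Blocks until the matching message arrives (5 s safety timeout).
        return self._box(source, dest, tag).get(timeout=5)


def simpletest(comm, rank, results):
    if rank == 0:
        comm.send([4, 2], source=0, dest=1, tag=13)
        results[rank] = 9
    elif rank == 1:
        got = comm.recv(source=0, dest=1, tag=13)
        results[rank] = got[0] + got[1]


comm = Comm(2)
results = {}
threads = [threading.Thread(target=simpletest, args=(comm, r, results))
           for r in range(comm.size)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

As in the MATLAB version, every rank runs the same function and branches on its own rank; the tag (13) is what pairs the send with the matching receive.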

Compilation

• MATLAB R14 (prerelease) provides the ability to compile MATLAB m-files that use Star-P functionality into binaries.

• The binaries combined with a running Star-P server enable compiled m-files to be run in parallel.

Types of parallel problems

• Collection of small problems
  – MultiMATLAB mode
  – MPI within MultiMATLAB mode

• Large problems
  – ‘Regular mode’: ScaLAPACK, FFTW, …

• We even allow a mixture of the two modes to give additional flexibility (e.g. FFT2)

Visualization package

• A term project done by a group of students in a parallel computing class at MIT

• Provides the equivalent of mesh, surf and spy for distributed matrices

• Visualization of very large matrices is now possible in MATLAB

Why is Star-P interesting?

• First parallel MATLAB system that encompasses the 4 approaches in the survey

• First system that allows seamless (to the user) integration between a large number of parallel numerical libraries

More

• Processor maps: ScaLAPACK magic?

• Dynamic addition of processors: Singapore

• Star-P on a grid: UCSB, Parry

• Pipelining: ??

• Fault Tolerance: UCSB
