© 2012 The MathWorks, Inc.

Parallel computing with MATLAB

Going Beyond Serial MATLAB Applications

[Diagram: MATLAB Desktop (Client) and six Workers]


Programming Parallel Applications (CPU)

Built-in support with toolboxes

[Figure axis labels: Ease of Use, Greater Control]

Example: Optimizing Cell Tower Position
Built-in parallel support

With Parallel Computing Toolbox, use the built-in parallel algorithms in Optimization Toolbox

Run optimization in parallel

Use pool of MATLAB workers
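A minimal sketch of what switching on built-in parallel support can look like. The objective function, bounds, and start point are hypothetical stand-ins, not the cell tower model from the demo. The sketch assumes Optimization Toolbox and Parallel Computing Toolbox are installed and uses the optimoptions syntax; releases from around 2012 set the same option with optimset('UseParallel', 'always').

```matlab
% Hypothetical smooth objective, used only for illustration
objective = @(p) (p(1) - 1)^2 + (p(2) + 2)^2;

p0 = [0 0];            % start point (assumed)
lb = [-5 -5];          % lower bounds (assumed)
ub = [ 5  5];          % upper bounds (assumed)

% UseParallel asks fmincon to estimate gradients on the pool of workers
opts = optimoptions('fmincon', 'UseParallel', true, 'Display', 'off');

pOpt = fmincon(objective, p0, [], [], [], [], lb, ub, [], opts);
```

The solver call itself is unchanged; only the option differs from the serial version.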


Tools Providing Parallel Computing Support

Optimization Toolbox
Global Optimization Toolbox
Statistics Toolbox
Signal Processing Toolbox
Neural Network Toolbox
Image Processing Toolbox
…

[Figure label: BLOCKSETS]

Directly leverage functions in Parallel Computing Toolbox

www.mathworks.com/builtin-parallel-support

Agenda

Task parallel applications
GPU acceleration
Data parallel applications
Using clusters and grids

Example: Parameter Sweep of ODEs
Parallel for-loops

Parameter sweep of ODE system
– Damped spring oscillator
– Sweep through different values of damping and stiffness
– Record peak value for each simulation

Convert for to parfor

Use pool of MATLAB workers

$m\ddot{x} + b\dot{x} + kx = 0, \quad m = 5, \quad b = 1, 2, \ldots, \quad k = 1, 2, \ldots$
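The sweep on this slide could be sketched as follows. This is an illustration, not the original demo code: the mass m = 5 is read from the slide, while the parameter ranges, time span, and initial conditions are assumptions. With Parallel Computing Toolbox the parfor loop runs on the pool of workers; without it the loop runs serially.

```matlab
m = 5;                              % mass, as on the slide
bVals = linspace(0.1, 5, 20);       % damping values (assumed range)
kVals = linspace(1.5, 5, 20);       % stiffness values (assumed range)
[kGrid, bGrid] = meshgrid(kVals, bVals);
peakVals = nan(size(kGrid));

tspan = [0 25];                     % simulation time span (assumed)
x0 = [0; 1];                        % initial displacement and velocity (assumed)

% Each iteration is independent, so "for" can be replaced by "parfor"
parfor idx = 1:numel(kGrid)
    b = bGrid(idx);
    k = kGrid(idx);
    % First-order form: y(1) = x, y(2) = dx/dt
    odefun = @(t, y) [y(2); -(b*y(2) + k*y(1))/m];
    [~, y] = ode45(odefun, tspan, x0);
    peakVals(idx) = max(y(:, 1));   % record peak displacement
end
```

The loop body is identical to the serial version; the single linear index keeps peakVals a sliced output variable, which parfor requires.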


The Mechanics of parfor Loops

Pool of MATLAB Workers

a = zeros(10, 1)
parfor i = 1:10
    a(i) = i;
end
a

[Diagram: the iterations a(i) = i for i = 1, ..., 10 are divided among four Workers]

Agenda

Task parallel applications
GPU acceleration
Data parallel applications
Using clusters and grids

What is a Graphics Processing Unit (GPU)?

Originally for graphics acceleration, now also used for scientific calculations

Massively parallel array of integer and floating point processors
– Typically hundreds of processors per card

– GPU cores complement CPU cores

Dedicated high-speed memory

* Parallel Computing Toolbox requires NVIDIA GPUs with Compute Capability 1.3 or higher, including NVIDIA Tesla 20-series products. See a complete listing at www.nvidia.com/object/cuda_gpus.html
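A short sketch for checking which GPU, if any, MATLAB can see, and its compute capability. It assumes Parallel Computing Toolbox is installed. The minimum compute capability depends on the MATLAB release, so the value in the footnote above applies to the release these slides were written for.

```matlab
hasGPU = gpuDeviceCount > 0;   % number of GPU devices visible to MATLAB

if hasGPU
    d = gpuDevice;             % currently selected GPU device
    fprintf('%s, compute capability %s\n', d.Name, d.ComputeCapability);
else
    disp('No supported GPU detected.');
end
```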

Performance Gain with More Hardware

Using More Cores (CPUs)
[Diagram: Core 1, Core 2, Core 3, Core 4, Cache]

Using GPUs
[Diagram: GPU cores, Device Memory]

Example: Mandelbrot set

The color of each pixel is the result of hundreds or thousands of iterations

Each pixel is independent of the other pixels

Hundreds of thousands of pixels
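Because every pixel is independent, the iteration can be written as element-wise array operations, which is the form that maps well onto a GPU. The sketch below is an illustration rather than the demo's code: the grid size, iteration limit, and region of the complex plane are assumptions, and it falls back to the CPU when no GPU is found. It assumes Parallel Computing Toolbox is installed.

```matlab
maxIter = 200;                     % iteration limit (assumed)
n = 400;                           % n-by-n pixels (assumed)
x = linspace(-2, 0.5, n);          % real axis range (assumed)
y = linspace(-1.25, 1.25, n);      % imaginary axis range (assumed)

if gpuDeviceCount > 0
    x = gpuArray(x);               % data created from x and y stays on the GPU
    y = gpuArray(y);
end

[X, Y] = meshgrid(x, y);
c = complex(X, Y);
z = c;
count = ones(size(c), 'like', X);  % iteration count per pixel

for k = 1:maxIter
    z = z.*z + c;                        % same update applied to every pixel
    count = count + (abs(z) <= 2);       % count only pixels that have not escaped
end

count = gather(count);             % bring the result back to the client
```

The loop is over iterations, not pixels; all pixels are updated at once in each pass. An image can then be drawn with imagesc(log(count)).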

Real-world performance increase
Solving a wave equation

Intel Xeon Processor X5650, NVIDIA Tesla C2050 GPU

Grid Size      CPU (s)    GPU (s)   Speedup
64 x 64         0.1004     0.3553      0.28
128 x 128       0.1931     0.3368      0.57
256 x 256       0.5888     0.4217      1.4
512 x 512       2.8163     0.8243      3.4
1024 x 1024    13.4797     2.4979      5.4
2048 x 2048    74.9904     9.9567      7.5

Programming Parallel Applications (GPU)

Built-in support with toolboxes

Simple programming constructs: gpuArray, gather

Advanced programming constructs: arrayfun, spmd

Interface for experts: CUDAKernel, MEX support

[Figure axis labels: Ease of Use, Greater Control]

www.mathworks.com/help/distcomp/run-cuda-or-ptx-code-on-gpu

www.mathworks.com/help/distcomp/run-mex-functions-containing-cuda-code
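A minimal sketch of the simpler constructs named on this slide: gpuArray to move data to the device, a built-in function and arrayfun operating on it, and gather to bring results back. The array size and the element-wise function are arbitrary choices for illustration. The code assumes Parallel Computing Toolbox and falls back to ordinary arrays when no GPU is available.

```matlab
A = rand(256, 'single');           % test data (size is an arbitrary choice)

if gpuDeviceCount > 0
    G = gpuArray(A);               % copy the data to GPU memory
else
    G = A;                         % CPU fallback so the sketch still runs
end

F = abs(fft2(G));                  % built-in function called on the array
S = arrayfun(@(v) v.^2 + 1, G);    % element-wise function applied to each entry

F = gather(F);                     % copy results back to the client
S = gather(S);
```

When G is a gpuArray, fft2 and arrayfun execute on the GPU and the code around them does not change.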

Agenda

Task parallel applications
GPU acceleration
Data parallel applications
Using clusters and grids

Big Data: Distributed Arrays

[Diagram labels: TOOLBOXES, BLOCKSETS]

Distributed Array Lives on the Cluster

Remotely Manipulate Array from Desktop

11 26 41
12 27 42
13 28 43
14 29 44
15 30 45
16 31 46
17 32 47
18 33 48
19 34 49
20 35 50
21 36 51
22 37 52


Big Data: Distributed Arrays

Pool of MATLAB Workers

y = distributed(rand(10));

[Diagram: four Workers hold columns 1:3, 4:6, 7:8, and 9:10 of y]
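A short sketch of working with a distributed array from the desktop. It assumes Parallel Computing Toolbox and an available pool of workers (a pool is normally started automatically if none is open). The local matrix A is introduced here only so the result can be compared.

```matlab
A = rand(10);                  % ordinary array on the client
y = distributed(A);            % columns are spread across the workers

colMeans = mean(y, 1);         % computed on the workers, result is distributed
colMeans = gather(colMeans);   % collect the result on the client

% Inspect how many columns each worker holds
spmd
    part = getLocalPart(y);
    fprintf('This worker holds %d column(s) of y\n', size(part, 2));
end
```

How many columns each worker reports depends on the number of workers in the pool.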

Demo: Approximation of π

$\int_0^1 \frac{4}{1+x^2}\,dx = \pi$
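One way to approximate this integral in parallel is the midpoint rule with a parfor reduction, sketched below. This is an assumption about how such a demo could be written, not the demo's actual code, and the number of subintervals is an arbitrary choice.

```matlab
n = 1e6;                       % number of subintervals (assumed)
h = 1/n;                       % width of each subinterval
s = 0;

parfor i = 1:n
    x = (i - 0.5)*h;           % midpoint of the i-th subinterval
    s = s + 4/(1 + x^2);       % s is a reduction variable
end

piApprox = h*s;
fprintf('Approximation error: %g\n', abs(piApprox - pi));
```

Each worker sums its own share of the iterations and parfor combines the partial sums, so the loop needs no explicit communication.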

Programming Parallel Applications (CPU)

Built-in support with toolboxes

Simple programming constructs: parfor, batch, distributed

Advanced programming constructs: createJob, labSend, spmd

[Figure axis labels: Ease of Use, Greater Control]
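As an illustration of the lower-level end of this scale, the sketch below builds a job by hand with createJob and createTask. It assumes Parallel Computing Toolbox and a working default cluster profile. The two tasks are trivial placeholders chosen only to show the mechanics.

```matlab
c = parcluster();                        % cluster object from the default profile
job = createJob(c);                      % empty independent job

% createTask(job, function, number of outputs, {inputs})
createTask(job, @sum, 1, {[1 2 3]});
createTask(job, @max, 1, {[4 9 2]});

submit(job);                             % queue the job on the cluster
wait(job);                               % block until all tasks finish
out = fetchOutputs(job);                 % cell array, one row per task

delete(job);                             % remove the job data from the cluster
```

Compared with parfor, this takes more code but gives explicit control over what runs as a task and when results are retrieved.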

Agenda

Task parallel applications
GPU acceleration
Data parallel applications
Using clusters and grids


Working on C3SE


Apply for a project with SNIC
