1
Parallel Computing with MATLAB
© 2012 The MathWorks, Inc.
2
Going Beyond Serial MATLAB Applications
MATLAB Desktop (Client)
[Diagram: the MATLAB desktop client connected to a pool of MATLAB workers]
3
Programming Parallel Applications (CPU)
Built-in support with toolboxes
[Axis: Ease of Use vs. Greater Control]
4
Example: Optimizing Cell Tower Position (built-in parallel support)
With Parallel Computing Toolbox, use the built-in parallel algorithms in Optimization Toolbox
Run optimization in parallel
Use pool of MATLAB workers
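A minimal sketch of what the built-in support looks like, assuming a release with parpool and optimoptions; the objective function and bounds here are illustrative placeholders, not the cell-tower problem itself:

```matlab
% Enable Optimization Toolbox's built-in parallel support.
% With 'UseParallel', fmincon estimates finite-difference
% gradients on the pool of MATLAB workers.
parpool;                                              % start a pool of workers
opts = optimoptions('fmincon', 'UseParallel', true);
x0 = [0 0]; lb = [-5 -5]; ub = [5 5];                 % placeholder start point and bounds
objective = @(x) peaks(x(1), x(2));                   % placeholder objective
x = fmincon(objective, x0, [], [], [], [], lb, ub, [], opts);
```

No loop has to be written here; the toolbox decides what to run on the workers.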
5
Tools Providing Parallel Computing Support
Optimization Toolbox, Global Optimization Toolbox, Statistics Toolbox, Signal Processing Toolbox, Neural Network Toolbox, Image Processing Toolbox, …
Directly leverage functions in Parallel Computing Toolbox
www.mathworks.com/builtin-parallel-support
6
Agenda
Task parallel applications
GPU acceleration
Data parallel applications
Using clusters and grids
7
Independent Tasks or Iterations
Ideal problem for parallel computing: no dependencies or communication between tasks
Examples: parameter sweeps, Monte Carlo simulations
[Diagram: independent iterations executing concurrently over time]
blogs.mathworks.com/loren/2009/10/02/using-parfor-loops-getting-up-and-running/
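The Monte Carlo case can be sketched as follows; the trial counts and the pi estimate are assumptions for illustration, and the per-trial seeding is simplified:

```matlab
% Embarrassingly parallel Monte Carlo: each trial shares no data
% with the others, so parfor can hand trials to different workers.
nTrials = 8; nSamples = 1e6;
hits = zeros(nTrials, 1);
parfor t = 1:nTrials
    rng(t);                               % simplified independent seed per trial
    p = rand(nSamples, 2);                % random points in the unit square
    hits(t) = sum(sum(p.^2, 2) <= 1);     % points inside the quarter circle
end
piEstimate = 4 * sum(hits) / (nTrials * nSamples);
```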
8
Example: Parameter Sweep of ODEs (parallel for-loops)
Parameter sweep of an ODE system: a damped spring oscillator
– Sweep through different values of damping and stiffness
– Record the peak value for each simulation
Convert for to parfor
Use pool of MATLAB workers
m x'' + b x' + k x = 0,  b = 1, 2, ..., 5,  k = 1, 2, ..., 5
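The sweep above can be sketched like this; the mass, time span, and initial conditions are assumed values for illustration:

```matlab
% Parameter sweep of the damped oscillator m*x'' + b*x' + k*x = 0.
% Converting the outer for to parfor distributes the independent
% (b, k) simulations across the pool of MATLAB workers.
m = 5;                                 % assumed mass
bVals = 1:5; kVals = 1:5;              % damping and stiffness values to sweep
peakVals = zeros(numel(bVals), numel(kVals));
parfor i = 1:numel(bVals)
    row = zeros(1, numel(kVals));
    for j = 1:numel(kVals)
        odefun = @(t, y) [y(2); (-bVals(i)*y(2) - kVals(j)*y(1)) / m];
        [~, y] = ode45(odefun, [0 25], [0 1]);   % x(0) = 0, x'(0) = 1 (assumed)
        row(j) = max(y(:, 1));                   % peak displacement
    end
    peakVals(i, :) = row;              % sliced output variable
end
```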
9
The Mechanics of parfor Loops
Pool of MATLAB Workers
a = zeros(10, 1);
parfor i = 1:10
    a(i) = i;
end
a
[Diagram: iterations 1 through 10 divided among four workers, each executing a(i) = i on its share]
10
Agenda
Task parallel applications GPU acceleration Data parallel applications Using clusters and grids
11
What Is a Graphics Processing Unit (GPU)?
Originally designed for graphics acceleration, now also used for scientific calculations
Massively parallel array of integer and floating-point processors
– Typically hundreds of processors per card
– GPU cores complement CPU cores
Dedicated high-speed memory
* Parallel Computing Toolbox requires NVIDIA GPUs with Compute Capability 1.3 or higher, including NVIDIA Tesla 20-series products. See a complete listing at www.nvidia.com/object/cuda_gpus.html
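Given the hardware requirement above, a quick way to confirm a supported device is present is a sketch like this:

```matlab
% gpuDevice errors if no supported NVIDIA GPU is available,
% so this doubles as a compatibility check before using gpuArray.
g = gpuDevice;
fprintf('%s, compute capability %s\n', g.Name, g.ComputeCapability);
```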
12
Performance Gain with More Hardware
Using More Cores (CPUs) vs. Using GPUs
[Diagram: a multicore CPU (four cores sharing a cache) alongside a GPU (many cores with dedicated device memory)]
13
Example: Mandelbrot set
The color of each pixel is the result of hundreds or thousands of iterations
Each pixel is independent of the other pixels
Hundreds of thousands of pixels
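A sketch of the Mandelbrot escape-time computation on the GPU; the region, resolution, and iteration count are assumed, and the 'like' syntax assumes a reasonably recent release:

```matlab
% gpuArray moves data to the device; the element-wise operations
% then run on the GPU for every pixel at once, and gather brings
% the result back to the host.
n = 1000; maxIter = 500;
x = gpuArray.linspace(-2, 1, n);
y = gpuArray.linspace(-1.5, 1.5, n);
[X, Y] = meshgrid(x, y);
c = X + 1i*Y;                          % one complex constant per pixel
z = zeros(size(c), 'like', c);
count = zeros(size(c), 'like', real(c));
for k = 1:maxIter
    z = z.^2 + c;                      % Mandelbrot iteration, all pixels in parallel
    count = count + (abs(z) <= 2);     % count iterations before escape
end
count = gather(count);                 % escape-time map used to color the image
```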
14
Real-world performance increase: solving a wave equation
Intel Xeon Processor X5650, NVIDIA Tesla C2050 GPU
Grid Size      CPU (s)   GPU (s)   Speedup
64 x 64         0.1004    0.3553      0.28
128 x 128       0.1931    0.3368      0.57
256 x 256       0.5888    0.4217      1.4
512 x 512       2.8163    0.8243      3.4
1024 x 1024    13.4797    2.4979      5.4
2048 x 2048    74.9904    9.9567      7.5
15
Programming Parallel Applications (GPU)
Built-in support with toolboxes
Simple programming constructs: gpuArray, gather
Advanced programming constructs: arrayfun, spmd
Interface for experts: CUDAKernel, MEX support
[Axis: Ease of Use vs. Greater Control]
www.mathworks.com/help/distcomp/run-cuda-or-ptx-code-on-gpu
www.mathworks.com/help/distcomp/run-mex-functions-containing-cuda-code
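A minimal sketch of the simpler constructs named above; the sigmoid function and array size are placeholders:

```matlab
% arrayfun on gpuArray inputs applies the function element-wise
% on the device; gather copies the result back to the host.
g = gpuArray(rand(1000));                    % copy data to the device
h = arrayfun(@(v) 1./(1 + exp(-v)), g);      % runs element-wise on the GPU
result = gather(h);                          % copy result back to the client
```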
16
Agenda
Task parallel applications
GPU acceleration
Data parallel applications
Using clusters and grids
17
Big data: Distributed Arrays
Distributed array lives on the cluster
Remotely manipulate the array from the desktop
[Diagram: a distributed array's columns stored across cluster workers while commands issue from the client]
18
Big Data: Distributed Arrays
Pool of MATLAB Workers
y = distributed(rand(10));
[Diagram: columns 1:3, 4:6, 7:8, and 9:10 of y stored on four different workers]
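A sketch of working with such an array; the sizes are assumed, and distributed.ones reflects the older static-method syntax:

```matlab
% The data lives on the workers; operations like backslash run
% in parallel there, and gather collects results on the client.
parpool;                        % start a pool of workers
A = distributed(rand(1000));    % 1000x1000 array spread across the workers
b = distributed.ones(1000, 1);
x = A \ b;                      % linear solve executes on the workers
err = gather(norm(A*x - b));    % bring the scalar residual back to the client
```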
20
Programming Parallel Applications (CPU)
Built-in support with toolboxes
Simple programming constructs: parfor, batch, distributed
Advanced programming constructs: createJob, labSend, spmd
[Axis: Ease of Use vs. Greater Control]
21
Agenda
Task parallel applications
GPU acceleration
Data parallel applications
Using clusters and grids