MATLAB Acceleration and Programming Techniques · download.ilovematlab.cn/meetup/2019qdu/matlab-coding-accelera…
TRANSCRIPT
© 2017 The MathWorks, Inc.
MATLAB Acceleration and Programming Techniques
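What can MATLAB do?
▪ Interactive data import/export and data statistics
▪ Computation and analysis on data sets
▪ Data visualization
▪ Automating complex tasks by writing MATLAB code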
Programming style
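Accelerating algorithm execution
[Diagram: ways to speed up user code: optimized MATLAB code, System objects, parallel computing, MATLAB to C, custom MEX files]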
Optimizing MATLAB code
Memory allocation for dynamically growing arrays
>> x = 4
>> x(2) = 7
>> x(3) = 12
[Figure: three memory maps (addresses 0x0000 to 0x0038), one per command. The first holds 4; after x(2) = 7 the array is stored as 4, 7; after x(3) = 12 it is stored as 4, 7, 12. Each time the array grows it needs a larger contiguous block.]
Improving code performance: preallocation
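As a minimal sketch of what preallocation looks like in practice, the script below fills the same vector twice: once by letting it grow inside the loop, and once after reserving it with `zeros`. The size `n` and the variable names are illustrative assumptions, and the measured times will differ between machines and MATLAB releases.

```matlab
n = 1e5;  % number of elements (arbitrary size chosen for illustration)

% Growing the array: every assignment past the current end makes the
% array larger, so MATLAB may need a new block and a copy of the data.
tic
grown = [];
for k = 1:n
    grown(k) = k^2;
end
tGrow = toc;

% Preallocating: the full block is reserved once and then filled in place.
tic
prealloc = zeros(1, n);
for k = 1:n
    prealloc(k) = k^2;
end
tPrealloc = toc;

fprintf('growing: %.4f s, preallocated: %.4f s\n', tGrow, tPrealloc);
```

Both loops produce the same row vector; only the allocation pattern differs.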
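Loops and vectorization
▪ MATLAB is optimized for operations on matrices and vectors
▪ Replacing loops with MATLAB matrix and vector operations is called vectorization.
% Generate sample data
M = rand(1000, 100);
mu = mean(M, 1);
[m, n] = size(M);  % loop bounds used below
% Four alternative ways to subtract the column means (use one, not all four)
% 1. No vectorization
for j = 1:n
    for i = 1:m
        M(i,j) = M(i,j) - mu(j);
    end
end
% 2. Semi-vectorized
for j = 1:n
    M(:,j) = M(:,j) - mu(j);
end
% 3. REPMAT
muM = repmat(mu, size(M, 1), 1);
M = M - muM;
% 4. Using BSXFUN
M = bsxfun(@minus, M, mu);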
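The sketch below checks that the loop and `bsxfun` variants give the same centered matrix, and adds a fifth form that is not on the slide: implicit expansion, available from R2016b onward, where `M - mu` expands the row vector automatically. The variable names `A`, `B`, and `C` are illustrative.

```matlab
M  = rand(1000, 100);
mu = mean(M, 1);            % 1-by-100 row vector of column means

% Loop version, written to a copy so M stays unchanged
[m, n] = size(M);
A = M;
for j = 1:n
    for i = 1:m
        A(i,j) = A(i,j) - mu(j);
    end
end

% bsxfun version
B = bsxfun(@minus, M, mu);

% Implicit expansion (R2016b and later): mu is expanded along the rows
C = M - mu;

% Largest absolute difference between the loop result and the vectorized one
maxDiff = max(abs(A(:) - C(:)));
fprintf('max difference: %g\n', maxDiff);
```

On releases that support it, implicit expansion is usually the shortest way to write this operation.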
Improving code performance: vectorization
Optimizing MATLAB code: finding bottlenecks
▪ Use the Code Analyzer to find suboptimal code
▪ Profile the code: "Run and Time"
>> profile % MATLAB
>> performanceadvisor % Simulink
▪ Use the built-in timing functions
>> tic / toc
>> timeit
MATLAB code performance
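A minimal sketch of the two timing tools named above, applied to an arbitrary workload chosen only for illustration. `tic`/`toc` measures one run by wall clock, while `timeit` calls a function handle several times and returns a typical time, which tends to be more stable for short computations.

```matlab
x = rand(1e6, 1);           % sample data (size is an arbitrary choice)

% tic/toc: elapsed wall-clock time of a single run
tic
s1 = sum(x.^2);
tSingle = toc;

% timeit: runs the handle repeatedly and reports a typical time
f = @() sum(x.^2);
tTypical = timeit(f);

fprintf('tic/toc: %.6f s, timeit: %.6f s\n', tSingle, tTypical);
```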
Code Analyzer
▪ Red: syntax errors detected
▪ Orange: warnings or potential code improvements
▪ Green: no errors found
Code profiler - Profile
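The profiler can also be driven from code instead of the "Run and Time" button. The sketch below profiles a placeholder workload (the `pinv(magic(200))` call is only an assumption standing in for real code) and prints the most expensive functions from the collected statistics. Running `profile viewer` afterwards opens the same data in the graphical report.

```matlab
profile on                      % start collecting statistics
for k = 1:5
    P = pinv(magic(200));       % placeholder workload for illustration
end
profile off                     % stop collecting

stats = profile('info');        % structure holding the results
tbl = stats.FunctionTable;      % one entry per profiled function

% List up to five functions, ordered by total time
[~, order] = sort([tbl.TotalTime], 'descend');
for k = order(1:min(5, numel(order)))
    fprintf('%-40s %8.4f s (%d calls)\n', ...
        tbl(k).FunctionName, tbl(k).TotalTime, tbl(k).NumCalls);
end
```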
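Improving code performance: other techniques
▪ Code structure
– Use functions instead of scripts
– Write modular code
▪ Move loop-independent computations outside loops
▪ Use short-circuit operators
▪ Avoid global variables
▪ Save and load constant values (instead of generating them in code)
▪ Avoid frequent reading and saving of external data
▪ Keep code under 500 lines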
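Two of the items above can be shown in a few lines. This is a sketch with made-up variable names: the first part moves a computation that does not depend on the loop index out of the loop, and the second uses the short-circuit operator `&&` so that the right-hand operand is evaluated only when the left-hand one is true.

```matlab
x = linspace(0, 1, 1000);
a = 2.5;

% Invariant recomputed on every iteration
y1 = zeros(size(x));
for k = 1:numel(x)
    scale = exp(-a) / sqrt(2*pi);   % does not depend on k
    y1(k) = scale * x(k);
end

% Invariant computed once, outside the loop
scale = exp(-a) / sqrt(2*pi);
y2 = zeros(size(x));
for k = 1:numel(x)
    y2(k) = scale * x(k);
end

% Short-circuit evaluation: v(1) is never evaluated because v is empty,
% so no indexing error occurs.
v = [];
isPositiveFirst = ~isempty(v) && v(1) > 0;
```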
Demo: image block processing
▪ Calculate a function at grid points
▪ Take the mean of larger blocks
▪ Analyze and improve performance
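The demo code itself is not part of the transcript, so the following is only a sketch of the kind of computation the three steps describe. It assumes a hypothetical function `sin(X).*cos(Y)` on a square grid whose side `n` is divisible by the block size `b`, and compares a double loop over blocks with a vectorized version built on `reshape`.

```matlab
n = 400;                        % grid is n-by-n (assumed)
b = 8;                          % block size; n must be divisible by b
nb = n / b;                     % number of blocks per side

% Step 1: evaluate a function at the grid points (function is an assumption)
[X, Y] = meshgrid(linspace(0, 2*pi, n));
Z = sin(X) .* cos(Y);

% Step 2a: block means with explicit loops
blockMeanLoop = zeros(nb);
for r = 1:nb
    for c = 1:nb
        rows = (r-1)*b + (1:b);
        cols = (c-1)*b + (1:b);
        blk = Z(rows, cols);
        blockMeanLoop(r, c) = mean(blk(:));
    end
end

% Step 2b: vectorized. Split rows and columns into (within-block, block)
% index pairs, then average over the two within-block dimensions.
T = reshape(Z, b, nb, b, nb);
blockMeanVec = squeeze(mean(mean(T, 1), 3));

% Step 3: confirm both versions agree before comparing their timings
maxErr = max(abs(blockMeanLoop(:) - blockMeanVec(:)));
fprintf('max difference between versions: %g\n', maxErr);
```

Wrapping each version in a function handle and passing it to `timeit` would complete the "analyze and improve" step.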
Parallel acceleration in MATLAB
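Parallel computing in practice
➢ Why use MATLAB parallel computing?
• Take advantage of the computing power of more hardware
• Speed up workflows with minimal or no changes to your original code
• Focus on your engineering and research, not on the computation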
Parallel computing on a multicore computer
[Figure: multicore desktop. The MATLAB Desktop (client) sends work to four workers, each running on a core of the same machine.]
MATLAB multicore
Parallel computing on a cluster
[Figure: cluster of computers. The MATLAB Desktop (client) sends work to workers running on the cores of several multicore machines.]
parfor loops
a = zeros(10, 1);
parfor i = 1:10
    a(i) = i;
end
a
[Figure: iterations 1 to 10 are divided among four workers, each executing a(i) = i for its share of the iterations.]
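A slightly larger sketch of the same pattern, with illustrative variable names. It uses the two kinds of variable that `parfor` handles across workers: a sliced output array, where each iteration writes its own element, and a reduction variable, where partial results are combined regardless of iteration order. Without Parallel Computing Toolbox the loop still runs, but serially.

```matlab
n = 10;
a = zeros(n, 1);        % sliced output: iteration i writes only a(i)
total = 0;              % reduction variable: accumulated across iterations

parfor i = 1:n
    a(i) = i^2;
    total = total + i;  % order of accumulation does not matter
end

disp(a.');
disp(total);
```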
Parallel programming
▪ Built-in support
– ..., 'UseParallel', true)
▪ Simple programming constructs
– parfor, batch, parsim
▪ Full-control programming
– spmd, parfeval
Ease of use
Control
Flexibility
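As a sketch of the full-control end of this list, the script below submits a few independent tasks with `parfeval` and collects the results later. It assumes Parallel Computing Toolbox is installed and a parallel pool can be started; the task itself (summing 1 to n) is only a placeholder.

```matlab
nTasks = 4;
futures(1:nTasks) = parallel.FevalFuture;    % preallocate the future array

% Submit the tasks; each call returns immediately with a future
for k = 1:nTasks
    futures(k) = parfeval(@(n) sum(1:n), 1, 10*k);   % request 1 output
end

% The client is free to do other work here while the workers compute.

% Block until all tasks finish and collect their outputs
results = fetchOutputs(futures);
disp(results.');
```

Unlike `parfor`, this form lets the client decide when to wait for results and in what order to use them.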
parfor limitations
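The slide content is not preserved in the transcript. One well-known restriction is that iterations must be independent of each other, and the sketch below illustrates it with an example chosen here, not taken from the slides. The commented-out loop reads a value written by the previous iteration, which `parfor` rejects; the rewritten loop computes each element from its index alone.

```matlab
n = 10;

% Not valid as a parfor loop: iteration i depends on iteration i-1.
%   f = ones(1, n);
%   parfor i = 2:n
%       f(i) = f(i-1) + i;
%   end

% Independent reformulation: this recurrence has the closed form
% f(i) = i*(i+1)/2, so each iteration needs only its own index.
f = zeros(1, n);
parfor i = 1:n
    f(i) = i*(i+1)/2;
end

% Serial reference for comparison
g = ones(1, n);
for i = 2:n
    g(i) = g(i-1) + i;
end
```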
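Accelerating MATLAB code with NVIDIA GPUs
➢ Ideal problems
• Massively parallel and/or vectorized operations
• Computationally intensive
• Element-wise operations
➢ 300+ GPU-enabled MATLAB functions
➢ Toolbox support
– Neural Networks
– Image Processing
– Communications
– Signal Processing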
Transfer data to the GPU from computer memory
A = gpuArray(A);
Perform calculation on the GPU
X = exp(A);   % source reads "exping(A)"; exp is assumed here
Gather data or plot
X = gather(X);
MATLAB GPU computing
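The three steps fit into one short script. This sketch assumes Parallel Computing Toolbox is installed (it provides `gpuDeviceCount`, `gpuArray`, and `gather`) and falls back to the CPU when no supported GPU is found. The element-wise `exp` is just an example of a GPU-enabled function.

```matlab
A = rand(1000, 'single');       % data in computer memory

if gpuDeviceCount > 0
    G = gpuArray(A);            % 1. transfer data to the GPU
    X = exp(G);                 % 2. compute on the GPU (element-wise)
    X = gather(X);              % 3. bring the result back to host memory
else
    X = exp(A);                 % CPU fallback when no GPU is available
end

fprintf('result is %d-by-%d, class %s\n', size(X, 1), size(X, 2), class(X));
```

Transfers between host and GPU memory take time, so the gain is largest when many operations run on the data between `gpuArray` and `gather`.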
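GPU programming
▪ Built-in support
▪ Simple programming constructs
– gpuArray, gather
▪ Full-control programming
– spmd, arrayfun
▪ Specialized interfaces
– CUDAKernel, mex
Ease of use
Control
Flexibility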
MATLAB to C
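Challenges of hand-translating MATLAB to C
▪ Separate functional and implementation specifications
– Leads to multiple, inconsistent implementations
– Makes it hard to modify requirements during development
– Makes it difficult to keep MATLAB code and C code in sync
▪ Hand coding can introduce errors
▪ Time-consuming and expensive process
[Figure: workflow. Algorithm design in MATLAB is re-coded by hand in C/C++ to produce .c, MEX, .lib, .dll, or .exe targets; the two steps are linked by an "iterate" loop.]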
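Automatic MATLAB to C translation
▪ Maintain the design in MATLAB
▪ Design quickly and get C code quickly
▪ Test more often in MATLAB
▪ Spend more time improving the algorithm
[Figure: workflow. MATLAB code is converted automatically to .c, MEX, .lib, .dll, or .exe targets; the loops are labeled "verify / accelerate" and "iterate".]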
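Accelerating with MEX
▪ Speedup varies
▪ When a speedup is likely:
– Typically communications or signal processing algorithms
– Loops that carry state, or cases where vectorization is not possible
– Fixed-point data
▪ When there may be no speedup:
– Computations that MATLAB already multithreads implicitly
– Built-in functions that call BLAS or IPP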
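A loop that carries state is the kind of code the list above names as a good MEX candidate. The function below is a hypothetical example (the name `smoothSignal` and its arguments are not from the slides); saved as `smoothSignal.m`, it could be compiled with the `codegen` command shown in its comments, assuming MATLAB Coder is installed. Whether the MEX version is actually faster has to be measured.

```matlab
function y = smoothSignal(x, alpha) %#codegen
% Exponential smoothing of a non-empty vector x with factor alpha.
% y(k) depends on y(k-1), so the loop carries state between iterations.
%
% To generate and call a MEX version (assumes MATLAB Coder):
%   codegen smoothSignal -args {zeros(1,1000), 0.1}
%   y = smoothSignal_mex(rand(1,1000), 0.1);

y = zeros(size(x));
y(1) = x(1);
for k = 2:numel(x)
    y(k) = alpha * x(k) + (1 - alpha) * y(k-1);
end
end
```

The `-args` example fixes the input to a 1-by-1000 double vector and a double scalar, so the generated MEX function expects inputs of that size and type.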
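Acceleration strategies
Options
1. Better algorithms: solve the same problem with a different algorithm. Example: System objects
2. Best programming practices: MATLAB programming techniques. Example: vectorized computation
3. Multiple cores and processors: use hardware matched to the original platform. Example: parallel computing
4. Refactor the implementation: use a different language. Example: C, FORTRAN, or FPGAs & DSPs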
Why MATLAB?
Accelerating the Pace of Engineering and Science