Programming the PS3
DESCRIPTION
Talk given at FOSDEM 2008 introducing how to program the Cell BE processor of the Playstation 3 under Linux.
TRANSCRIPT
![Page 1: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/1.jpg)
Programming with Linux on the Playstation3
FOSDEM 2008
- Architecture overview: introducing the Cell BE
- Installing Linux
- SIMD programming in C/C++
- Asynchronous data transfer with the DMA
![Page 2: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/2.jpg)
Who am I
- Java / Python developer at Nuxeo (FOSS document management server)
- Interested in Artificial Intelligence (and in need of fast Support Vector Machines)
- Slides to be published at: http://oliviergrisel.name
![Page 3: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/3.jpg)
PS3 architecture overview
- CPU: IBM Cell/BE @ 3.2GHz
  - 218 GFLOPS
- Main RAM: 256MB XDR
- GPU: Nvidia RSX
  - 1.8 TFLOPS (SP) / 356 GFLOPS programmable
- VRAM: 256MB GDDR3 (2x128b@700MHz)
- System Bus: 2.5 GB/s
![Page 4: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/4.jpg)
The Cell Broadband Engine
- 1 PPE core @ 3.2GHz
  - 64-bit hyperthreaded PowerPC
  - 512KB L2 cache
- 8 SPE cores @ 3.2GHz
  - 128-bit SIMD optimized
  - 256KB SRAM
![Page 5: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/5.jpg)
PS3 Clusters
- Cheap cluster for academic researchers
- Carolina State U. and U. Massachusetts at D.
- 8+1 cluster with ssh and MPI
![Page 6: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/6.jpg)
PS3 GRID Computing
- PS3GRID project based on BOINC
  - 30,000 atoms simulation
- Folding@Home
  - 1 PFLOPS with 800 TFLOPS from PS3s
  - BlueGene == 280 TFLOPS
![Page 7: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/7.jpg)
Linux on the PS3
- Lv1 Hypervisor shipped with the default firmware
- Partition utility in the Sony Game OS menu
- Choose your favorite distro:
- Install a powerpc64-smp or ps3 kernel
- Install gcc-spu + libspe2
![Page 8: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/8.jpg)
Programming the Cell/BE in C
- Program the PPE as a conductor that distributes the numerical code to the SPEs
- Use POSIX threads to start SPE subroutines in parallel
- Use SPE intrinsics to perform vector instructions
- Eliminate branches as much as possible in SPE code
- Align your data to 16 bytes
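The "conductor" pattern can be sketched on any POSIX host. The example below is a portable stand-in, not Cell code: `NUM_WORKERS`, `work_item`, `worker_main` and `parallel_sum` are hypothetical names, and ordinary threads play the role of the SPEs. On a real PS3 each thread body would instead load and run an SPE program through libspe2 (`spe_context_create`, `spe_program_load`, `spe_context_run`), which is why one thread per SPE is used: `spe_context_run` blocks until the SPE program stops.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_WORKERS 4  /* hypothetical worker count standing in for the SPEs */

/* One slice of the input handed to one worker. */
typedef struct {
    const float *data;  /* start of a 16-byte aligned slice */
    size_t count;       /* number of floats in the slice */
    float result;       /* partial sum written by the worker */
} work_item;

/* Thread body: on the Cell this is where spe_context_run() would be called. */
static void *worker_main(void *arg) {
    work_item *item = arg;
    float acc = 0.0f;
    for (size_t i = 0; i < item->count; i++)
        acc += item->data[i];
    item->result = acc;
    return NULL;
}

/* The "conductor": split the data, start the workers, gather the results. */
float parallel_sum(const float *data, size_t count) {
    /* Each slice must start on a 16-byte boundary (4 floats). */
    assert(((uintptr_t)data % 16) == 0);
    assert(count % (4 * NUM_WORKERS) == 0);

    pthread_t threads[NUM_WORKERS];
    work_item items[NUM_WORKERS];
    size_t chunk = count / NUM_WORKERS;

    for (int w = 0; w < NUM_WORKERS; w++) {
        items[w].data = data + (size_t)w * chunk;
        items[w].count = chunk;
        items[w].result = 0.0f;
        if (pthread_create(&threads[w], NULL, worker_main, &items[w]) != 0)
            abort();
    }

    float total = 0.0f;
    for (int w = 0; w < NUM_WORKERS; w++) {
        pthread_join(threads[w], NULL);
        total += items[w].result;
    }
    return total;
}
```

Keeping every slice a multiple of 4 floats means each worker starts on a 16-byte boundary, which is the alignment the slide asks for.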
![Page 9: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/9.jpg)
Introduction to SIMD programming
- 128-bit registers (SSE2, Altivec, SPE)
  - 2 x double
  - 4 x float
  - 4 x int
- introduce new vector types
- 1 vector float operation == 4 float operations
- logical (and, or, cmp, ...), arithmetic (+, *, abs, ...), shuffling
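To make "1 vector float operation == 4 float operations" concrete, here is a minimal sketch using GCC's generic vector extensions, which compile on any GCC or Clang host. The type names `float4`, `double2`, `int4` and the function names are hypothetical; on the SPE the equivalent types come from `spu_intrinsics.h` and the multiply-add would be `spu_madd`.

```c
/* 128-bit vector types built with GCC vector extensions (hypothetical names). */
typedef float  float4  __attribute__((vector_size(16)));  /* 4 x float  */
typedef double double2 __attribute__((vector_size(16)));  /* 2 x double */
typedef int    int4    __attribute__((vector_size(16)));  /* 4 x int    */

/* Vector version: one expression computes four multiply-adds. */
float4 madd4(float4 a, float4 b, float4 c) {
    return a * b + c;
}

/* Scalar version of the same work: four separate multiply-adds. */
void madd_scalar(const float a[4], const float b[4], const float c[4],
                 float out[4]) {
    for (int i = 0; i < 4; i++)
        out[i] = a[i] * b[i] + c[i];
}
```

Individual lanes of the result can be read back with array subscripts, for example `madd4(a, b, c)[0]`.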
![Page 10: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/10.jpg)
SIMD programming – the big picture
![Page 11: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/11.jpg)
Not always SIMDizable
![Page 12: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/12.jpg)
SIMD programming with libspe2 and gcc-spu
- #include <spu_intrinsics.h>
- avoid scalar types, use:
  - vec_float4
  - vec_double2
  - vec_char16
  - ...
- d = spu_and(a, b);
- e = spu_madd(a, b, c);
- spu-gcc pure_spe_prog.c -o pure_spe_prog.elf
![Page 13: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/13.jpg)
Branch elimination
- avoid branching (if / else)
- c = spu_sel(a, b, spu_cmpgt(a, d));
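In the expression above, `spu_cmpgt(a, d)` builds a per-lane mask and `spu_sel` takes bits from `b` where the mask is set and from `a` elsewhere, so each lane computes `(a > d) ? b : a` without a branch. The sketch below reproduces that behaviour with GCC vector extensions so it can be tried on any host; `float4`, `int4`, `select4` and `replace_above` are hypothetical names, not part of the SPU API.

```c
typedef float float4 __attribute__((vector_size(16)));
typedef int   int4   __attribute__((vector_size(16)));

/* Bitwise select: lanes whose mask is all ones come from b, the others from a.
   This mirrors what spu_sel(a, b, mask) does on the SPE. */
static float4 select4(float4 a, float4 b, int4 mask) {
    int4 ai = (int4)a;  /* reinterpret the bits, no numeric conversion */
    int4 bi = (int4)b;
    return (float4)((ai & ~mask) | (bi & mask));
}

/* Branch-free equivalent of: c[i] = (a[i] > d[i]) ? b[i] : a[i] */
float4 replace_above(float4 a, float4 b, float4 d) {
    int4 mask = a > d;  /* -1 (all ones) where true, 0 where false */
    return select4(a, b, mask);
}
```

Passing `d` as the second argument as well, `replace_above(a, d, d)`, gives a branch-free per-lane minimum of `a` and `d`.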
![Page 14: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/14.jpg)
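A sample SPE program

    #include <spu_intrinsics.h>

    /* Union used to read back the four lanes of the vector accumulator. */
    volatile union {
        vec_float4 vec;
        float part[4];
    } sum;

    /* xp and yp must be 16-byte aligned; size must be a multiple of 4. */
    float dot_product(const float* xp, const float* yp, const int size) {
        sum.vec = (vec_float4) {0, 0, 0, 0};
        const vec_float4* xvp = (const vec_float4*) xp;
        const vec_float4* yvp = (const vec_float4*) yp;
        const vec_float4* xvp_end = xvp + size / 4;
        /* Hint to the compiler that the loop condition is usually true. */
        while (__builtin_expect(xvp < xvp_end, 1)) {
            /* sum += x * y on four floats at once */
            sum.vec = spu_madd(*xvp, *yvp, sum.vec);
            xvp++;
            yvp++;
        }
        return sum.part[0] + sum.part[1] + sum.part[2] + sum.part[3];
    }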
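The sample above only covers whole groups of four floats: with `size / 4` iterations, any leftover elements are ignored. A possible way to handle arbitrary sizes is sketched below in portable C with GCC vector extensions; `float4` and `dot_product_any_size` are hypothetical names, and `memcpy` is used for the loads so the sketch does not assume 16-byte alignment (on the SPE, aligned data and direct vector loads remain the better choice).

```c
#include <stddef.h>
#include <string.h>

typedef float float4 __attribute__((vector_size(16)));

float dot_product_any_size(const float *x, const float *y, size_t size) {
    float4 acc = {0, 0, 0, 0};
    size_t i = 0;

    /* Vector part: four multiply-adds per iteration. */
    for (; i + 4 <= size; i += 4) {
        float4 xv, yv;
        memcpy(&xv, x + i, sizeof xv);  /* load that is safe for unaligned data */
        memcpy(&yv, y + i, sizeof yv);
        acc += xv * yv;
    }

    /* Horizontal sum of the four lanes. */
    float total = acc[0] + acc[1] + acc[2] + acc[3];

    /* Scalar tail: the 0 to 3 elements that do not fill a vector. */
    for (; i < size; i++)
        total += x[i] * y[i];

    return total;
}
```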
![Page 15: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/15.jpg)
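DMA with the SPUs' Memory Flow Controllers

    #include <spu_mfcio.h>

    /* GET: copy from main memory (effective address) into the local store. */
    mfc_get(&local_data, main_mem_data_ea, sizeof(local_data), DMA_TAG, 0, 0);
    /* PUT: copy from the local store back to main memory. */
    mfc_put(&local_data, main_mem_data_ea, sizeof(local_data), DMA_TAG, 0, 0);
    /* GETB: GET with a barrier, ordered after earlier commands in the same tag group. */
    mfc_getb(&local_data, main_mem_data_ea, sizeof(local_data), DMA_TAG, 0, 0);
    /* Select the tag group to wait on, then block until its transfers complete. */
    mfc_write_tag_mask(1 << DMA_TAG);
    spu_mfcstat(MFC_TAG_UPDATE_ALL);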
![Page 16: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/16.jpg)
Double-buffering – the problem
![Page 17: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/17.jpg)
Double-buffering – the big picture
![Page 18: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/18.jpg)
Double-buffering with MFC
1. SPU queues MFC GET to fill buffer #1
2. SPU queues MFC GET to fill buffer #2
3. SPU waits for buffer #1 to finish filling
4. SPU processes buffer #1
5. SPU queues MFC PUT back content of buffer #1
6. SPU queues MFC GETB to refill buffer #1
7. SPU waits for buffer #2 to finish filling
8. SPU processes buffer #2
(...)
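The control flow of these steps can be sketched as a host-side simulation. This is not SPU code: `dma_get`, `dma_getb`, `dma_put` and `dma_wait` are hypothetical stand-ins for `mfc_get`, `mfc_getb`, `mfc_put` and a tag-group wait, implemented here with a synchronous `memcpy`, and `CHUNK` and `process` are made up for the illustration. On real hardware the transfers would run while the SPU processes the other buffer, and the barrier in GETB is what keeps the refill from overtaking the PUT queued just before it on the same tag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* floats per buffer; hypothetical size */

/* Hypothetical stand-ins for the MFC calls, synchronous on the host. */
static void dma_get(float *local, const float *remote, size_t n, int tag) {
    (void)tag;
    memcpy(local, remote, n * sizeof *local);
}
static void dma_getb(float *local, const float *remote, size_t n, int tag) {
    dma_get(local, remote, n, tag);  /* barrier is implicit when synchronous */
}
static void dma_put(const float *local, float *remote, size_t n, int tag) {
    (void)tag;
    memcpy(remote, local, n * sizeof *local);
}
static void dma_wait(int tag) {
    (void)tag;  /* nothing to wait for in the synchronous simulation */
}

/* Example workload: double every element of the buffer. */
static void process(float *buf, size_t n) {
    for (size_t i = 0; i < n; i++)
        buf[i] *= 2.0f;
}

/* Scales main_mem in place, one chunk at a time, using two local buffers. */
void double_buffered_scale(float *main_mem, size_t total) {
    assert(total % CHUNK == 0);
    size_t nchunks = total / CHUNK;
    float buf[2][CHUNK];  /* buffer index doubles as the DMA tag */

    if (nchunks == 0)
        return;
    dma_get(buf[0], main_mem, CHUNK, 0);              /* step 1 */
    if (nchunks > 1)
        dma_get(buf[1], main_mem + CHUNK, CHUNK, 1);  /* step 2 */

    for (size_t i = 0; i < nchunks; i++) {
        int cur = (int)(i & 1);
        dma_wait(cur);                                          /* steps 3, 7 */
        process(buf[cur], CHUNK);                               /* steps 4, 8 */
        dma_put(buf[cur], main_mem + i * CHUNK, CHUNK, cur);    /* step 5 */
        if (i + 2 < nchunks)                                    /* step 6 */
            dma_getb(buf[cur], main_mem + (i + 2) * CHUNK, CHUNK, cur);
    }

    /* Make sure the last PUTs have completed before returning. */
    dma_wait(0);
    dma_wait(1);
}
```

Using the buffer index as the tag lets each wait cover exactly the transfers that touch that buffer.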
![Page 19: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/19.jpg)
Some resources
- Cell BE Programming Tutorial (ibm.com, 190 pages)
- IBM developerWorks short programming tutorials
  - Search for articles by Jonathan Bartlett
- Barcelona Supercomputing Center (software)
  - http://www.bsc.es/projects/deepcomputing/linuxoncell/
- PS3 programming workshops (videos)
  - http://www.cc.gatech.edu/~bader/CellProgramming.html
- #ps3dev on freenode
![Page 20: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/20.jpg)
Thanks, credits, licensing
- Most diagrams come from the excellent GFDL'd tutorial by Geoff Levand (Sony Corp)
  - http://www.kernel.org/pub/linux/kernel/people/geoff/cell
- Pictures and trademarks belong to their respective owners (Sony, IBM, Universities, Folding@Home, PS3GRID, ...)
- All remaining work is GFDL
![Page 21: Programming the PS3](https://reader033.vdocuments.site/reader033/viewer/2022052210/554e1c27b4c9056b798b4a63/html5/thumbnails/21.jpg)
7 differences