Hybrid access-specific software cache techniques for the Cell BE architecture
April 16, 2015
Cell BE architecture
Traditional cache approach
Software cache
High locality cache
Transactional cache
C code transformation
Cell benchmark
Cache overhead
Cell vs POWER5 benchmark
Scalability of Cell
Scalability of POWER5
Efficient computation of sum-products on GPUs through software-managed cache
April 16, 2015
Marginalization of a product of functions
MPF bucketization
Computing MPF
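As a minimal sketch of what computing an MPF involves, assume two functions f(x, y) and g(y, z) stored as lookup tables, combined with ordinary multiplication and marginalized over the shared variable y by summation. The function name, the variable names, the dictionary layout, and the choice of the (+, ×) operations are all illustrative assumptions, not the kernel from the slides.

```python
from itertools import product


def marginalize_product(f, g, domain_x, domain_y, domain_z):
    """Return h with h[(x, z)] = sum over y of f[(x, y)] * g[(y, z)].

    f and g are dicts keyed by tuples of variable values (an assumed layout).
    """
    h = {}
    for x, z in product(domain_x, domain_z):
        # Multiply the two functions pointwise, then sum out the shared variable y.
        h[(x, z)] = sum(f[(x, y)] * g[(y, z)] for y in domain_y)
    return h


# Hypothetical 2x2 tables over binary variables.
f = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
g = {(0, 0): 5, (0, 1): 6, (1, 0): 7, (1, 1): 8}
h = marginalize_product(f, g, [0, 1], [0, 1], [0, 1])
```

With these operations the result coincides with a matrix product, which is why matrix multiplication is a natural reference point when reasoning about the memory traffic of such a kernel.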
MPF access patterns
MPF problem
MPF kernel
Arithmetic intensity
$A = \frac{\text{compute operations}}{\text{memory operations}}$
Arithmetic intensity (example 1)
$k(x) = f(x) \otimes g(x)$
$A = \frac{1}{3}$
Arithmetic intensity (example 2)
Matrices M × N and N × K
$A = \frac{2N - 1}{2N + 1}$
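The ratio can be checked by counting operations directly. The sketch below assumes a naive product of an M × N matrix with an N × K matrix in which every output element performs N multiplications and N − 1 additions, loads its N operands from each input matrix every time, and stores one result. The function name and the counting model are assumptions made for illustration.

```python
from fractions import Fraction


def naive_matmul_intensity(M, N, K):
    """Count compute and memory operations for an uncached M x N by N x K product."""
    compute = 0
    memory = 0
    for _ in range(M):
        for _ in range(K):
            compute += N        # N multiplications per output element
            compute += N - 1    # N - 1 additions per output element
            memory += 2 * N     # N loads from each of the two input matrices
            memory += 1         # one store of the output element
    return Fraction(compute, memory)


# Hypothetical sizes: the ratio depends only on N.
intensity = naive_matmul_intensity(4, 8, 5)  # Fraction(15, 17)
```

Under this model the intensity stays below 1 for every N, so the uncached kernel remains bound by memory traffic however large the matrices are.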
Arithmetic intensity (example 2 with cache)
Matrices M × N and N × K
$A = \frac{2N - 1}{N\left(\frac{1}{M} + \frac{1}{K}\right) + 1} = \frac{2 - \frac{1}{N}}{\frac{1}{M} + \frac{1}{N} + \frac{1}{K}}$
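One way to read the cached formula, offered here as an assumption, is an ideal cache: every element of both input matrices is fetched from off-chip memory exactly once and every output element is written once, while the compute count is unchanged. The sketch below counts operations under that model and compares the result with the closed form; the function names are hypothetical.

```python
from fractions import Fraction


def cached_matmul_intensity(M, N, K):
    """Arithmetic intensity when each input element is loaded only once (assumed ideal cache)."""
    compute = M * K * (2 * N - 1)        # unchanged: 2N - 1 operations per output element
    memory = M * N + N * K + M * K       # load both inputs once, store the output once
    return Fraction(compute, memory)


def closed_form(M, N, K):
    """Right-hand side of the formula: (2 - 1/N) / (1/M + 1/N + 1/K)."""
    return (2 - Fraction(1, N)) / (Fraction(1, M) + Fraction(1, N) + Fraction(1, K))


# Hypothetical sizes; both expressions give Fraction(75, 23).
counted = cached_matmul_intensity(4, 8, 5)
formula = closed_form(4, 8, 5)
```

Unlike the uncached ratio, this value grows with the matrix dimensions, which is the motivation for managing the cache explicitly.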
Speed
Cached arithmetic intensity
Index vector
<x, z, w, y>
Benchmarks
Random data performance
Random data speedup
Performance by cache size
Performance by # cache pages per thread block
Loop unrolling
Texture cache
Speedup overhead
Speedup with/without overhead