The GLIMPSES Toolkit Rapid code prototyping for SPEs Jaswanth Sreeram, Santosh Pande

Overview of Toolkit

• GLIMPSES Toolkit: GLobal Interprocedural Memory and ParalleliSm Estimator for SPUs
  – Profile instrumentation support
    • Profile parsers and interpreters
  – Analyzers for memory allocation and access behavior
  – Visualization engine

GLIMPSES toolkit

• One of two tools available in the public domain
  – Rapid prototyping, legacy code migration and performance tuning on Cell SPEs
  – The second one is asmvis
• Released on SourceForge in mid-July: http://glimpses.sourceforge.net
• OSI-certified open source license(s)
• Has received interest for adoption in academia and industry
  – Samsung Korea, Codecs and Media Computing Group
  – Sony Computer Entertainment America (SCEA)

GLIMPSES : Motivation

• Prototyping large codebases for porting to SPEs is challenging
  – Find a partition (set of functions)
  – Find a set of upward exposed references
  – DMA transfer them and lay them out (alignment)
  – After execution, store the results back
  – Make sure memory requirements do not exceed capacity

Motivation – contd.

• Challenges due to architectural attributes
  – Limited local store
  – High branch penalty
  – Suited for vectorizable code rather than scalar code
  – SPE/PPE interactions
• Provide the programmer with tools to
  – Understand program behavior (esp. memory usage)
  – Quickly construct candidate partitions for SPEs
  – Evaluate/quantify partitions' suitability for SPEs

GLIMPSES : Details

• Memory estimation tools enable the programmer to:
  – Estimate static and dynamic memory usage
    • Code, stack, heap
  – Understand program behavior
    • Detect program objects affecting dynamic memory behavior
    • Show the correlation between these program objects and memory usage
  – Rank program segments
    • Criteria: memory requirements, vectorizability, branching, etc.
  – Visualize results interactively

Features overview

• Dynamic call graph visualization, with the ability to select a call tree
• Memory requirements
  – Dynamic
  – Analytical: "what if" scenario calculator for memory capacity
• Memory access patterns
  – Locality (spatial, temporal, neighbor affinity)
• Ranking
  – Criteria-based estimates
• Alias and safe prefetching information
  – Multiple alias analyses available

Overview

[Figure: toolkit workflow diagram. Components shown: C/C++ program, LLVM compiler flow, Bytecode, Analysis & Instrumentation Passes, Instrumented Bytecode, Link Runtime, Test Inputs, Execute, Profile Trace, GraphML Trace, Dynamic Memory Estimator, Analytical Memory Estimator, Partition Estimator, Visualization Engine]

Visualization

[Screenshot: visualizer window, with a Graph Visualization Area and a Results Display Panel]

Visualization (contd)

• Zoom view

• Shows dynamic call chains for a program run (in this case the program is mpeg2-decode)

Visualization (contd)

Function Characteristics

Alias Analysis Algorithm used

Type of Aliases displayed (“Must Alias”, “May Alias”, “No Alias”)

Aliasing information for pairs of variables/memory regions.

Analytical Memory Estimation

• Correlate dynamic memory usage with program objects
  – Dynamic memory usage depends on inputs, etc.
• Compiler analysis
  – From each malloc, do a backward traversal to find instructions that influence the arguments to malloc
  – Construct an arithmetic expression for the amount of memory allocated, in terms of inputs or other program objects
  – Handles control flow constructs (if-then-else, loops, etc.)
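
The backward traversal described above can be illustrated with a minimal Python sketch. It is not the toolkit's LLVM pass: the toy IR, the `DEFS` table and the `size_expressions` helper are hypothetical names introduced only for illustration, and the sketch assumes an acyclic definition chain (loops are not modeled).

```python
from itertools import product

# Hypothetical toy IR: each SSA-style name maps to the instruction defining it.
# ("mul", a, b) multiplies two values; ("phi", a, b) merges two control-flow paths.
# Names without a definition are treated as program inputs.
DEFS = {
    "luma": ("mul", "Picture_Width", "Picture_Height"),
    "chroma": ("mul", "Chroma_Width", "Chroma_Height"),
    "size": ("phi", "luma", "chroma"),
}

OPS = {"mul": "*", "add": "+", "sub": "-"}


def wrap(expr):
    """Parenthesize compound sub-expressions so nesting stays unambiguous."""
    return f"({expr})" if " " in expr else expr


def size_expressions(name, defs):
    """Walk backward from `name`; return every expression it may evaluate to."""
    if name not in defs:
        return [name]  # reached an input or constant: stop the traversal
    op, *args = defs[name]
    if op == "phi":
        # One alternative per incoming control-flow path (if-then-else).
        return [e for a in args for e in size_expressions(a, defs)]
    # Combine the alternatives of each operand.
    choices = [size_expressions(a, defs) for a in args]
    return [f" {OPS[op]} ".join(wrap(e) for e in combo)
            for combo in product(*choices)]


# Expressions that may reach a malloc(size) call site.
exprs = size_expressions("size", DEFS)
```

Treating a control-flow merge as a list of alternatives keeps one expression per path, which is the shape of output reported per allocation site in the example that follows.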

Memory Behavior: Analytical Estimation

if (cc == 0)
    size = Picture_Width * Picture_Height;
else
    size = Chroma_Width * Chroma_Height;
…

for (…) {
    if (…)
        malloc(size);
    if (…)
        malloc(size);
}

__Malloc_size__1 = Picture_Width*Picture_Height

__Malloc_size__2 = Picture_Width*Picture_Height

__Malloc_size__3 = Picture_Width*Picture_Height

__Malloc_size__4 = Picture_Width*Picture_Height

__Malloc_size__5 = Chroma_Width*Chroma_Height

__Malloc_size__6 = Chroma_Width*Chroma_Height

__Malloc_size__7 = Chroma_Width*Chroma_Height

__Malloc_size__8 = Chroma_Width*Chroma_Height
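
Expressions of this form lend themselves to a "what if" check against memory capacity. The sketch below is an assumption-laden illustration, not the toolkit's calculator: the `heap_estimate` helper and the input values are hypothetical, and it assumes each of the eight allocation sites executes exactly once. The 256 KB figure is the local store size of a Cell SPE, which code and stack must share with the heap.

```python
LOCAL_STORE_BYTES = 256 * 1024  # Cell SPE local store capacity

# Expressions in the form reported above, keyed by allocation site.
MALLOC_SIZES = {f"__Malloc_size__{i}": "Picture_Width*Picture_Height"
                for i in range(1, 5)}
MALLOC_SIZES.update({f"__Malloc_size__{i}": "Chroma_Width*Chroma_Height"
                     for i in range(5, 9)})


def heap_estimate(inputs):
    """Total bytes requested, assuming every allocation site runs once."""
    # eval only sees the supplied input names; no builtins are exposed.
    return sum(eval(expr, {"__builtins__": {}}, inputs)
               for expr in MALLOC_SIZES.values())


# Hypothetical input values, chosen only for illustration.
scenario = {"Picture_Width": 352, "Picture_Height": 288,
            "Chroma_Width": 176, "Chroma_Height": 144}
total = heap_estimate(scenario)
fits = total <= LOCAL_STORE_BYTES
print(f"estimated heap: {total} bytes, fits in local store: {fits}")
```

Changing the values in `scenario` answers the question of how the heap requirement scales with picture dimensions without rerunning the program.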

Memory References

• Memory reference metrics
  – Temporal (frequency)
  – Spatial
  – Neighbor affinity
• Metrics measured per memory line
• Per-function or per-partition metrics
• Visually represented via a color map
  – Pale violet (low) -> bright red (high)
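
As a rough illustration of a per-line frequency metric, the following sketch buckets an address trace into memory lines and scales the counts for a color map. The function names and the trace are hypothetical; the 1024-byte line size matches the reference map shown for mpeg2decode, and the linear scaling is an assumption rather than the toolkit's documented behavior.

```python
from collections import Counter

LINE_SIZE = 1024  # bytes per memory line


def line_frequencies(addresses, line_size=LINE_SIZE):
    """Count how many references fall on each memory line."""
    return Counter(addr // line_size for addr in addresses)


def heat_levels(freqs):
    """Scale counts to 0.0 (low) .. 1.0 (high) for use with a color map."""
    peak = max(freqs.values())
    return {line: count / peak for line, count in freqs.items()}


# Hypothetical address trace.
trace = [0, 8, 16, 1024, 1032, 4096, 24, 32]
freqs = line_frequencies(trace)
levels = heat_levels(freqs)
```

A level near 0.0 would map to the pale violet end of the scale and a level near 1.0 to bright red.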

Memory Ref. Frequency (mpeg2decode)

[Figure: memory reference map (per partition) with 1024 B memory lines]

Mpeg2decode: Load recurrence

Neighbor Affinity

• Metric to describe how well memory layout is suited to caching

• Consider a slice S of length w of the whole memory access trace and two loads L1, L2 ∈ S
• If |addr(L1) − addr(L2)| < line size, then L1 and L2 exhibit neighbor affinity for slice size w
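
One way to turn this pairwise definition into a single number is to report the fraction of load pairs in each slice that satisfy it. The sketch below does that; the `neighbor_affinity` function, the use of non-overlapping slices and the aggregation into a fraction are all assumptions made for illustration, since the slides define only the pairwise condition.

```python
from itertools import combinations


def neighbor_affinity(load_addrs, w, line_size):
    """Fraction of load pairs within each length-w slice closer than line_size."""
    close = total = 0
    # Non-overlapping slices are an assumption; a sliding window would also
    # fit the definition.
    for start in range(0, len(load_addrs), w):
        window = load_addrs[start:start + w]
        for a, b in combinations(window, 2):
            total += 1
            if abs(a - b) < line_size:
                close += 1
    return close / total if total else 0.0


# Hypothetical load address trace.
trace = [0, 64, 4096, 4100]
tight = neighbor_affinity(trace, w=2, line_size=128)
loose = neighbor_affinity(trace, w=4, line_size=128)
```

In this trace, a slice of length 2 pairs only nearby loads, while a slice of length 4 also pairs loads that are far apart, so the fraction drops as w grows.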


Load Neighbor Affinity

Alias Analysis for libode

• Basic AA (least precise, fastest)
  – Aggressive local analysis
  – Non-context-sensitive
  – Non-flow-sensitive
• Total number of queries: 119520497
• "No Alias": 35924925
• "May Alias": 83492482
• "Must Alias": 103090

Alias Analysis (contd)

• Globals Mod/Ref
  – Context-sensitive mod/ref and alias analysis for internal global variables
  – Very fast, very precise, limited scope
• Total number of queries: 119520497
• "No Alias": 35944215
• "May Alias": 83473192
• "Must Alias": 103090

Alias Analysis (contd)

• Andersen's AA algorithm
  – Subset-based, flow-insensitive, context-insensitive, and field-insensitive alias analysis
  – Very precise, but slow
• Total number of queries: 119520497
• "No Alias": 79361105
• "May Alias": 40057171
• "Must Alias": 102221
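
The query counts reported for libode by the three analyses can be compared directly. The short sketch below only does arithmetic on the numbers given in these slides: it checks that each breakdown adds up to the reported total and computes the share of queries answered "No Alias", which is the definite answer that lets two accesses be treated as independent. The dictionary layout and the helper name are illustrative.

```python
# Query counts as reported for libode by each analysis.
RESULTS = {
    "Basic AA": {"no": 35924925, "may": 83492482, "must": 103090},
    "Globals Mod/Ref": {"no": 35944215, "may": 83473192, "must": 103090},
    "Andersen's AA": {"no": 79361105, "may": 40057171, "must": 102221},
}
TOTAL_QUERIES = 119520497


def no_alias_share(counts):
    """Fraction of all queries that were answered with a definite No Alias."""
    return counts["no"] / sum(counts.values())


shares = {name: no_alias_share(c) for name, c in RESULTS.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1%} of queries resolved as No Alias")
```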

Ranking (MPEG2Encode)

• Criteria based
  – Code size (csize)
  – Stack size (ssize)
  – Heap size (hsize)
  – Branch density (br_density)
  – Autovectorizable loops (av_loops)
  – Is the LS memory limit likely to be hit (ls_limit)

Rank = w1*csize + w2*ssize + w3*hsize + w4*br_density + w5/(1 + av_loops) + w6*ls_limit

(wi are the weights for each criterion)
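
The rank formula can be written out as a small function. This is a direct transcription under stated assumptions: the `Metrics` container, the units of each field, the encoding of ls_limit as 0 or 1, and the sample weights are all hypothetical, since the slides do not specify them.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    csize: float       # code size
    ssize: float       # stack size
    hsize: float       # heap size
    br_density: float  # branch density
    av_loops: int      # number of autovectorizable loops
    ls_limit: int      # assumed encoding: 1 if the LS limit is likely hit, else 0


def rank(m, w):
    """Weighted rank; w holds the six weights w1..w6 in order."""
    w1, w2, w3, w4, w5, w6 = w
    return (w1 * m.csize + w2 * m.ssize + w3 * m.hsize
            + w4 * m.br_density + w5 / (1 + m.av_loops) + w6 * m.ls_limit)


# Hypothetical metrics and weights.
f = Metrics(csize=4.0, ssize=1.0, hsize=2.0, br_density=0.5, av_loops=3, ls_limit=0)
score = rank(f, (1.0, 1.0, 1.0, 2.0, 4.0, 10.0))
```

Because av_loops appears in a denominator, more autovectorizable loops lower the rank, while larger memory footprints and denser branching raise it.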

Partitioning

• Preprocessing: Propagate ranks upwards in the call graph

Rank(n) = Rank(n) + ∑ Rank(n→child[i])

• Input: Call graph consisting of nodes annotated with ranks

• Output: Graph partitions that are suitable for execution on the SPEs

• A partition P is deemed “suitable” if Rank(P→root) < Threshold
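
A minimal sketch of the preprocessing step and the suitability test follows. It assumes the call graph is a tree (no recursion, no callee shared by several callers), and the top-down search for the largest suitable subtrees is one plausible selection strategy rather than the toolkit's documented algorithm. All function names and rank values are hypothetical.

```python
def propagate(node, children, rank):
    """Post-order pass: Rank(n) = Rank(n) + sum of Rank over n's children."""
    total = rank[node]
    for child in children.get(node, []):
        total += propagate(child, children, rank)
    rank[node] = total
    return total


def suitable_partitions(root, children, rank, threshold):
    """Roots of the largest subtrees whose propagated rank is below threshold."""
    if rank[root] < threshold:
        return [root]
    found = []
    for child in children.get(root, []):
        found += suitable_partitions(child, children, rank, threshold)
    return found


# Hypothetical call tree and per-function ranks.
children = {"main": ["decode", "output"], "decode": ["idct", "motion"]}
rank = {"main": 1, "decode": 2, "idct": 3, "motion": 4, "output": 2}
propagate("main", children, rank)
parts = suitable_partitions("main", children, rank, threshold=5)
```

Raising the threshold lets larger subtrees qualify, so fewer and bigger partitions are reported; lowering it splits the program into smaller ones.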


Effect of threshold on partitions

mpeg2decode

GLIMPSES status

• Beta version available for download at: http://glimpses.sourceforge.net
• 300 MB source code package (includes visualizer)
• Lines of code (C/C++): 447,000
• Third-party tools integrated: LLVM (compiler), Prefuse (visualization)
• Executable size: 422 MB (x86 binaries)
• Typical trace size: 900 MB (LIBODE)
• Man-hour effort: ~750
• Releases:
  – v.0.8: based on LLVM version 1.8 (July 7th)
  – v.1.0: based on LLVM version 2.0 (undergoing testing)
• Tested to work with large codebases:
  – LIBODE (115,000 lines of code), mpeg2 (10,000 lines of code), SPEC INT 2000, etc.

Ongoing and future work

• More validation
  – Compare partitions produced with those generated by expert programmers

• An inter-procedural, flow-sensitive, context-sensitive alias analysis algorithm


Ongoing and future work

• Function data dependence graph
  – Encapsulates data flow between functions
  – Arguments, aliases, globals
  – Important factor in partitioning decisions
  – "Affinity between pairs of functions"
