S. Tallam, R. Gupta, and X. Zhang PACT 2005 1 Extended Whole Program Paths Sriraman Tallam Rajiv Gupta Xiangyu Zhang University of Arizona


Extended Whole Program Paths

Sriraman Tallam

Rajiv Gupta Xiangyu Zhang

University of Arizona


Control Flow and Dependence Traces

Control Flow Traces
• Sequence of basic blocks.
• Identification of hot paths.
  - Path-sensitive instruction scheduling and optimization.
  - Path prediction and instruction fetching.

Dependence Traces
• Capture data dependences.
  - Flow from a definition to a use.
• Data speculative optimizations for Itanium.
• Computation of dynamic slices.


Control Flow and Dependence Traces

Control Flow Traces are smaller than Dependence Traces and can be compressed well.
• Average size for Spec 2K benchmarks is 179 MB.
• Compression factor: Sequitur 681, VPC 442.

Dependence Traces are large and do not compress as well as Control Flow Traces.
• Average size for Spec 2K benchmarks is 565 MB.
• Compression factor: Sequitur 1.31, VPC 5.8.

Is there an alternative trace representation?


Our Approach

Extended Control Flow Trace: a unified trace representation.
• Captures both control flow and dependence information.
• The data dependences are embedded as control flow.

The unified trace is smaller than control flow + dependence traces.

Our compressed unified trace is also smaller than the compressed control flow + compressed dependence traces.


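Goals in Designing the Extended Control Flow (eCF) Trace

The dependence can be recovered from the control flow.

[Figure: control flow graph with two definitions of X (X = _) and a later use (= X).]

Once a store through a pointer (*p = _) is added, the dependence can no longer be recovered, due to possible aliasing.

Additional control flow can capture the dependence.

[Figure: control flow graph with blocks numbered 1 to 6, including the use (= X) and an added check "if p == &X".]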
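The idea of turning a possible alias into extra control flow can be sketched as follows. This is a minimal illustration, not the authors' implementation: the block numbering, the `run` and `reaching_definition` names, and the placement of the check in its own block are all assumptions made for the example.

```python
def run(p_aliases_x, take_left):
    """Simulate one pass through a small CFG and return its block trace.

    The block numbers are illustrative and do not reproduce the slide's graph.
    """
    trace = [1]
    # Blocks 2 and 3 each define X; the taken branch is visible in the trace.
    trace.append(2 if take_left else 3)
    # Block 4 stores through pointer p, which may or may not alias X.
    trace.append(4)
    # Added disambiguation check "if p == &X": taking it routes execution
    # through an extra block (5), so the alias outcome lands in the trace.
    if p_aliases_x:
        trace.append(5)
    # Block 6 uses X.
    trace.append(6)
    return trace


def reaching_definition(trace):
    """Recover which block's definition of X reaches the use in block 6."""
    if 5 in trace:
        return 4  # the store *p = _ overwrote X
    return 2 if 2 in trace else 3
```

With the check in place, `reaching_definition(run(True, True))` yields block 4, while `reaching_definition(run(False, False))` yields block 3. Without block 5 the two aliasing outcomes would produce identical traces.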


Cost of Capturing Dependences

No-cost capture
• For these dependences, no disambiguation checks are needed.

Fixed-cost capture
• The number of disambiguation checks needed is a constant.

Variable-cost capture
• The number of disambiguation checks varies.


No Cost Capture

All instances of the dependence can be recovered from the control flow trace.
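As a rough sketch of how such dependences could be read back from a plain control flow trace, the following replays a block trace and pairs each use of a scalar with the most recent block instance that defined it. The `defs` and `uses` maps, the function name, and the rule that a block reads before it writes are assumptions for illustration only.

```python
def recover_dependences(trace, defs, uses):
    """Replay a block trace and recover def-use pairs for scalar variables.

    trace: list of executed block ids, in order.
    defs / uses: dict mapping block id -> set of variable names (hypothetical
    static information about each block).
    Returns (variable, def_position, use_position) tuples, where positions
    index into the trace.
    """
    last_def = {}  # variable -> trace position of its latest definition
    deps = []
    for pos, block in enumerate(trace):
        # Assumption: a block reads its operands before it writes results.
        for var in sorted(uses.get(block, ())):
            if var in last_def:
                deps.append((var, last_def[var], pos))
        for var in defs.get(block, ()):
            last_def[var] = pos
    return deps
```

For the trace `[1, 2, 4, 1, 3, 4]` with X defined in blocks 2 and 3 and used in block 4, the replay links the first use to the instance of block 2 and the second use to the instance of block 3, with no run-time checks involved.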


Fixed Cost Capture

A single disambiguation check is sufficient to capture this dependence.

[Figure: Single Check]


Variable Cost Capture

The instances of the dependence can be caused by any instance of the definition statement.

[Figure: Multiple Checks]


Cost of Instrumentation and Trace Compressibility

Reducing the number of checks
• Reduces the size of the generated trace.
• Reduces run-time overhead.

Improving the compressibility
• Similar control flow signatures.


Two Phased Approach

Conservative nature of static pointer analysis
• Too many potential dependences per use.

Two-phased approach
• Filtering phase: find all dependences exercised.
• Profiling phase: add disambiguation checks only for those dependences exercised.
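The two phases can be sketched roughly as below. This is an illustrative outline under stated assumptions, not the tool described in the talk: the event tuples, statement labels, and both function names are hypothetical.

```python
def filtering_phase(events):
    """Run once to find which (definition, use) statement pairs actually occur.

    events: iterable of (kind, statement, address) tuples, where kind is
    "def" or "use" (a hypothetical memory access log).
    """
    last_writer = {}  # address -> statement that last wrote it
    exercised = set()
    for kind, stmt, addr in events:
        if kind == "def":
            last_writer[addr] = stmt
        elif addr in last_writer:
            exercised.add((last_writer[addr], stmt))
    return exercised


def checks_to_insert(static_candidates, exercised):
    """Keep only the statically possible pairs that the filtering run exercised."""
    return sorted(pair for pair in static_candidates if pair in exercised)
```

If static analysis reports three possible definitions for a use but the filtering run sees only one of them reach it, the profiling phase would instrument a single check instead of three.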


Binary Search vs. Linear Search

Track the last definition and instance of every write to a memory address.

Search the address array using binary search instead of linear search.
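One way to picture this is a table of written addresses kept in sorted order, so that a lookup takes a logarithmic number of comparisons. The slides do not say how the address array is organized; the sorted parallel lists and the `LastWriteTable` class below are assumptions made for the sketch.

```python
import bisect


class LastWriteTable:
    """Maps each written address to its last (statement, instance) writer."""

    def __init__(self):
        self.addrs = []  # written addresses, kept sorted
        self.last = []   # parallel list: (statement, instance) per address

    def record_write(self, addr, stmt, instance):
        # Binary search for the slot of addr in the sorted address array.
        i = bisect.bisect_left(self.addrs, addr)
        if i < len(self.addrs) and self.addrs[i] == addr:
            self.last[i] = (stmt, instance)  # overwrite the previous writer
        else:
            self.addrs.insert(i, addr)
            self.last.insert(i, (stmt, instance))

    def lookup(self, addr):
        """Return the last (statement, instance) that wrote addr, or None."""
        i = bisect.bisect_left(self.addrs, addr)
        if i < len(self.addrs) and self.addrs[i] == addr:
            return self.last[i]
        return None
```

A use at address `a` then calls `lookup(a)` to find the defining statement and instance. Note that inserting into a Python list is linear in the worst case; the sketch only illustrates the search step.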


Optimizing Trace Length and Compressibility


Experimental Results

Implementation on the Microsoft Phoenix RDK.

Spec 2K benchmark binaries were rewritten to obtain instrumented versions.

• Easy to implement using Phoenix.

The intermediate representation was the low-level x86 instruction set.
• Dependences were split into register and memory dependences.
• Register dependences are always recoverable from the control flow trace.
• Memory dependences were recovered using our approach.


Register and Memory dependences

A significant fraction (76%) of dependences, namely the register dependences, can be recovered from the control flow trace.


Uncompressed Trace Sizes

The unified trace is 62% of the size of the Control Flow + Dependence Trace.

[Table omitted. Columns: Cont. + Dep., Unified, Ratio.]


Sequitur Compressed

[Table omitted. Columns: Cont. + Dep., Unified, Ratio.]

The compressed unified trace is 4% of the size of the compressed Control Flow + Dependence Trace.


VPC Compressed

[Table omitted. Columns: Cont. + Dep., Unified, Ratio.]

The compressed unified trace is 21% of the size of the compressed Control Flow + Dependence Trace.


Memory Dependence Types

30 % of dependences can be recovered at no cost.


Address Comparisons

Binary Search reduces the address comparisons by 4 orders of magnitude.


Run-time Overhead

There is a 20 % increase in run-time overhead in collecting the unified trace.


Conclusions

We have designed an extended control flow trace that captures both control flow and data dependence history.

The key to the unified trace is the ability to convert memory data dependences into control flow.

• The resulting unified trace is smaller than the combined control flow + dependence trace.

• The run-time overhead increases by 20 %.

Our Thanks to Hoi Vo of Microsoft Corporation and the Phoenix Compiler Infrastructure Group.