A Case for Unlimited Watchpoints

Joseph L. Greathouse†, Hongyi Xin*, Yixin Luo†‡, Todd Austin†
†University of Michigan, *Carnegie Mellon University, ‡Shanghai Jiao Tong University
ASPLOS, London, UK, March 5, 2012


TRANSCRIPT

Page 1: A Case for Unlimited Watchpoints

A Case for Unlimited Watchpoints

Joseph L. Greathouse†, Hongyi Xin*, Yixin Luo†‡, Todd Austin†

†University of Michigan ‡Shanghai Jiao Tong University

*Carnegie Mellon University

ASPLOS, London, UK, March 5, 2012

Page 2: A Case for Unlimited Watchpoints

Goal of This Work

MAKE dynamic SOFTWARE analysis FASTer

Page 3: A Case for Unlimited Watchpoints

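Dynamic Software Analysis

Bounds Checking

Taint Analysis

Data Race Detection

Deterministic Execution

Transactional Memory

Speculative Parallelization

[Overhead figures shown on the slide; their pairing with the analyses above is not recoverable from the transcript: 10-80x, 2-300x, 2-30x, 2-10x, 2-50x, 2-4x]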

Page 4: A Case for Unlimited Watchpoints

Real Goal of This Work

Generic Hardware to Accelerate Many Dynamic Software Analyses

WATCHPOINTS

Page 5: A Case for Unlimited Watchpoints

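Hardware-Assisted Watchpoints

HW interrupt when touching watched data

SW knows it’s touching important data AT NO OVERHEAD

[Figure: memory addresses 0-7 holding values A-H, with a read watchpoint on addresses 2-4 (R-Watch 2-4), a write watchpoint on addresses 6-7 (W-Watch 6-7), and example accesses LD 2 and WR X→7.]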
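The watchpoint behavior on this slide can be modeled in a few lines of software. This is only a sketch: the class and method names (`WatchedMemory`, `watch`, `load`, `store`, `WatchpointFault`) are made up for illustration, and a Python exception stands in for the hardware interrupt. It reuses the slide's example of eight locations holding A-H, a read watchpoint on 2-4, and a write watchpoint on 6-7.

```python
class WatchpointFault(Exception):
    """Stands in for the hardware interrupt raised on a watched access."""


class WatchedMemory:
    """Toy memory with per-address read and write watchpoints (hypothetical model)."""

    def __init__(self, values):
        self.cells = list(values)
        self.read_watch = set()   # addresses with a read watchpoint
        self.write_watch = set()  # addresses with a write watchpoint

    def watch(self, start, end, kind):
        """Watch the inclusive range [start, end]; kind is 'R' or 'W'."""
        target = self.read_watch if kind == "R" else self.write_watch
        target.update(range(start, end + 1))

    def load(self, addr):
        if addr in self.read_watch:
            raise WatchpointFault(f"read of watched address {addr}")
        return self.cells[addr]

    def store(self, addr, value):
        if addr in self.write_watch:
            raise WatchpointFault(f"write to watched address {addr}")
        self.cells[addr] = value


mem = WatchedMemory("ABCDEFGH")
mem.watch(2, 4, "R")   # R-Watch 2-4
mem.watch(6, 7, "W")   # W-Watch 6-7
```

With this setup, `mem.load(2)` and `mem.store(7, "X")` raise `WatchpointFault`, while accesses to unwatched locations proceed with no extra work.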

Page 6: A Case for Unlimited Watchpoints

Dynamic Software Analysis

Bounds Checking

Taint Analysis

Data Race Detection

Deterministic Execution

Transactional Memory

Speculative Parallelization

Page 7: A Case for Unlimited Watchpoints

Watchpoint-Based Taint Analysis

Taint analysis works on shadow values

Set R/W watchpoints on tainted values

No tainted data? → Run at full speed

Example code on the slide:

x = tainted()
y = x * 1024
w = x + 42
validate(x)

[Figure: data and its shadow data side by side, with Propagate and Clear steps marked against the example code.]
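A rough software model of this idea follows. It is a sketch under assumptions, not the authors' implementation: `TaintTracker` and its methods are hypothetical names, variables stand in for memory addresses, and it assumes that `validate(x)` clears the taint on `x`. The set of tainted names plays the role of the watched locations, so an assignment that touches no watched name skips the analysis entirely.

```python
class TaintTracker:
    """Toy taint analysis that only runs when a 'watched' (tainted) name is touched."""

    def __init__(self):
        self.tainted = set()    # names that currently have watchpoints set
        self.handler_calls = 0  # number of times the slow analysis path ran

    def taint_source(self, dst):
        """Model `dst = tainted()`: mark dst tainted and watch it."""
        self.tainted.add(dst)

    def assign(self, dst, srcs):
        """Model `dst = f(*srcs)`."""
        touches_watched = dst in self.tainted or any(s in self.tainted for s in srcs)
        if not touches_watched:
            return  # no watchpoint hit: run at full speed
        self.handler_calls += 1  # watchpoint fault: run the analysis
        if any(s in self.tainted for s in srcs):
            self.tainted.add(dst)       # propagate taint to the destination
        else:
            self.tainted.discard(dst)   # clean data overwrote a tainted value

    def validate(self, name):
        """Model `validate(name)`: assumed to clear the taint."""
        self.tainted.discard(name)


t = TaintTracker()
t.taint_source("x")       # x = tainted()
t.assign("y", ["x"])      # y = x * 1024  (propagate)
t.assign("w", ["x"])      # w = x + 42    (propagate)
t.assign("z", ["k"])      # untainted work, no handler call
t.validate("x")           # validate(x)   (clear)
```

In this model the handler runs only for the two assignments that read `x`; the untainted assignment never enters the analysis.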

Page 8: A Case for Unlimited Watchpoints

Watchpoint-Based Data Race Detection

Find inter-thread data sharing, check locks

No sharing, no possible data race

Turn off detector until HW finds sharing!

[Figure: accesses from multiple threads; inter-thread sharing triggers a FAULT.]
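The "turn off the detector until sharing is found" idea can be sketched as below. The ownership scheme is an assumption made for illustration (the slide does not say how watchpoints are placed): each address remembers the first thread that touched it, and an access by any other thread is treated as the watchpoint fault that switches the detector on. `SharingDetector` and its fields are hypothetical names.

```python
class SharingDetector:
    """Toy model: enable the race detector only after inter-thread sharing is seen."""

    def __init__(self):
        self.owner = {}           # address -> first thread that touched it
        self.detector_on = False  # is the expensive race detector enabled?
        self.faults = []          # (thread id, address) pairs that revealed sharing

    def access(self, tid, addr):
        """Record an access; return True if the race detector must analyze it."""
        first = self.owner.setdefault(addr, tid)
        if first != tid:
            # A different thread touched this address earlier. In hardware this
            # would be a watchpoint fault delivered to the analysis tool.
            self.faults.append((tid, addr))
            self.detector_on = True
        return self.detector_on


d = SharingDetector()
d.access(1, 0x10)   # thread 1, private data: detector stays off
d.access(2, 0x20)   # thread 2, private data: detector stays off
d.access(2, 0x10)   # thread 2 touches thread 1's data: sharing found
```

A real tool would also need a policy for switching the detector back off; that is left out here.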

Page 9: A Case for Unlimited Watchpoints

Needed Watchpoint Capabilities

Large Number

Fine-grained

Per Thread

Ranges

Others in Paper

[Figure: data items V, W, X, Y, Z, with labels contrasting a WP Fault with False Faults.]

Page 10: A Case for Unlimited Watchpoints

Existing Watchpoint Solutions

Watchpoint Registers
– Limited number (4-16), small reach (4-8 bytes)

Virtual Memory
– Coarse-grained, per-process, only aligned ranges

ECC Mangling
– Per physical address, all cores, no ranges
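To make the "coarse-grained" limitation of virtual-memory watchpoints concrete, here is a small sketch. It assumes a 4096-byte page and page-granularity protection; the function names are hypothetical. Watching a few bytes forces the whole page to be protected, so accesses elsewhere on that page fault even though they never touch watched data.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def pages_for_range(start, end):
    """Pages that must be protected to watch the inclusive byte range [start, end]."""
    return set(range(start // PAGE_SIZE, end // PAGE_SIZE + 1))


def vm_access_faults(watched_pages, addr):
    """A page-protection watchpoint faults on any access to a protected page."""
    return addr // PAGE_SIZE in watched_pages


def is_false_fault(start, end, addr):
    """True when addr faults only because it shares a page with the watched range."""
    faults = vm_access_faults(pages_for_range(start, end), addr)
    return faults and not (start <= addr <= end)
```

For example, with bytes 0x1000 to 0x1003 watched, an access to 0x1800 is a false fault, an access to 0x1002 is a true fault, and an access to 0x2000 does not fault at all.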

Page 11: A Case for Unlimited Watchpoints

Meeting These Requirements

Unlimited Number of Watchpoints
– Store in memory, cache on chip

Fine-Grained
– Watch full virtual addresses

Per-Thread
– Watchpoints cached per core/thread
– TID Registers

Ranges
– Range Cache

Page 12: A Case for Unlimited Watchpoints

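Range Cache

Initial state: one valid entry covering 0x0 – 0xffff_ffff, Not Watched.

Set addresses 0x5 – 0x2000 R-Watched:

Start Address   End Address   Watchpoint?   Valid
0x0             0x4           Not Watched   1
0x5             0x2000        R Watched     1
0x2001          0xffff_ffff   Not Watched   1

Load address 0x400: each entry is checked with start ≤ 0x400? and end ≥ 0x400? The 0x5 – 0x2000 entry matches and is R Watched, so the load raises a WP interrupt.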
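The range-splitting and lookup on this slide can be sketched in software as follows. This is an illustrative model only: `RangeCache`, `set_range`, and `lookup` are hypothetical names, the entries are kept in a plain sorted list with no capacity limit, and adjacent ranges with the same status are not merged.

```python
NOT_WATCHED = "Not Watched"
R_WATCHED = "R Watched"


class RangeCache:
    """Toy range cache: sorted, non-overlapping, inclusive (start, end, status) entries."""

    def __init__(self, max_addr=0xFFFF_FFFF):
        # Start with one entry covering the whole address space, not watched.
        self.entries = [(0x0, max_addr, NOT_WATCHED)]

    def set_range(self, start, end, status):
        """Set the status of [start, end], splitting any overlapping entries."""
        updated = []
        for s, e, st in self.entries:
            if e < start or s > end:
                updated.append((s, e, st))          # no overlap: keep as is
                continue
            if s < start:
                updated.append((s, start - 1, st))  # piece left of the new range
            if e > end:
                updated.append((end + 1, e, st))    # piece right of the new range
        updated.append((start, end, status))
        updated.sort()
        self.entries = updated

    def lookup(self, addr):
        """Return the status of the entry with start <= addr <= end."""
        for s, e, st in self.entries:
            if s <= addr <= e:
                return st
        raise KeyError(addr)


rc = RangeCache()
rc.set_range(0x5, 0x2000, R_WATCHED)
status = rc.lookup(0x400)  # the slide's load address
```

After `set_range`, the cache holds the three entries from the slide's example, and the lookup of 0x400 finds the R Watched entry. Hardware would compare all entries in parallel rather than scanning a list.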

Page 13: A Case for Unlimited Watchpoints

Watchpoint System Design I

Store Ranges in Main Memory

Per-Thread Ranges, Per-Core Range Cache

Software Handler on RC miss or overflow

Write-back RC works as a write filter

Precise, user-level watchpoint faults

[Figure: main memory holds per-thread ranges (T1 Memory, T2 Memory); Core 1 and Core 2 each have a range cache; WP changes flow between the cores and memory.]
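The "write-back RC works as a write filter" point can be illustrated with a small model. It is a sketch under simplifying assumptions: ranges are treated as opaque keys (no splitting), eviction is least-recently-used, and the class and counter names are hypothetical. Repeated updates to a cached range stay on chip; memory is written only when a dirty entry is evicted on overflow.

```python
from collections import OrderedDict


class WriteBackRangeCache:
    """Toy write-back cache of watchpoint ranges in front of main memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()  # (start, end) -> [status, dirty]
        self.memory = {}            # backing store: (start, end) -> status
        self.memory_writes = 0      # writes that actually reached memory
        self.handler_calls = 0      # software handler invocations on overflow

    def _evict(self):
        """Evict the least recently used entry, writing it back if dirty."""
        key, (status, dirty) = self.cache.popitem(last=False)
        if dirty:
            self.memory[key] = status
            self.memory_writes += 1

    def update(self, key, status):
        """Change the watchpoint status of a range, touching only the cache."""
        if key not in self.cache and len(self.cache) >= self.capacity:
            self.handler_calls += 1  # overflow: the software handler makes room
            self._evict()
        self.cache[key] = [status, True]
        self.cache.move_to_end(key)


wb = WriteBackRangeCache(capacity=2)
for new_status in ("R", "W", "RW"):
    wb.update((0x0, 0x3), new_status)  # three changes, zero memory writes so far
```

Only when two further ranges are inserted and the first one is evicted does its final status reach memory, as a single write.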

Page 14: A Case for Unlimited Watchpoints

Experimental Evaluation Setup

Pin-based Simulation
– Every memory access through HW simulator
– Count pipeline-exposed events
– Record all other events

Trace-based timing simulator

Taint analysis on SPEC INT2000

Race Detection on Phoenix and PARSEC

Comparing only shadow value checks

Page 15: A Case for Unlimited Watchpoints

Watchpoint-Based Taint Analysis

[Figure: bar chart of slowdown (x), y-axis 0 to 8, comparing MINEMU, Umbra, VM, and RC on 164.gzip, 175.vpr, 176.gcc, 181.mcf, 186.crafty, 197.parser, 252.eon, 253.perlbmk, 254.gap, 255.vortex, 256.bzip2, 300.twolf, and GeoMean. Off-scale bars are labeled 19x, 10x, 30x, 207x, 423x, 23x, and 1429x. Annotations: "20% Slowdown" and "128 entry Range Cache".]

Page 16: A Case for Unlimited Watchpoints

The Need for Many Small Ranges

Some watchpoints better suited for ranges
– 32b Addresses: 2 ranges x 64b each = 16B

Some need large # of small watchpoints
– 51 ranges x 64b each = 408B
– Better stored as bitmap? 51 bits!

Taint analysis has good ranges

Byte-accurate race detection does not.
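The storage arithmetic on this slide can be checked with a short calculation. The helper names are hypothetical, and the bitmap cost assumes one bit per watched unit, which is what the slide's "51 bits" figure for 51 small ranges implies.

```python
def range_storage_bits(num_ranges, addr_bits=32):
    """Each range stores a start and an end address: 2 x 32b = 64b per range."""
    return num_ranges * 2 * addr_bits


def bitmap_storage_bits(num_units):
    """Assumed bitmap cost: one bit per watched unit."""
    return num_units


two_ranges_bytes = range_storage_bits(2) // 8     # 2 ranges x 64b  = 16 B
many_ranges_bytes = range_storage_bits(51) // 8   # 51 ranges x 64b = 408 B
bitmap_bits = bitmap_storage_bits(51)             # 51 bits
```

Ranges are compact when a few of them cover a lot of memory; for many tiny watchpoints, the per-range cost of two full addresses dominates and a bitmap is far smaller.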

Page 17: A Case for Unlimited Watchpoints

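Watchpoint System Design II

Make some RC entries point to bitmaps

[Figure: an RC entry with fields Start Addr, End Addr, R, W, V, and B. In the example, R and W are "-", V = 1, and B = 1, and the entry holds a pointer to a WP bitmap.]

[Figure: memory holds Ranges and Bitmaps; the core holds a Range Cache and a Bitmap Cache, accessed in parallel.]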
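One way to picture RC entries that point to bitmaps is the sketch below. It is an assumption-laden model, not the hardware design: `RCEntry` and `check` are hypothetical names, the bitmap is a Python list with one (read, write) pair per byte of the range, and a non-empty `bitmap` field plays the role of the B bit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RCEntry:
    """Range cache entry: either range-wide R/W flags or a per-byte bitmap."""
    start: int
    end: int
    read: bool = False
    write: bool = False
    valid: bool = True
    bitmap: Optional[list] = None  # per-byte (read, write) flags when the B bit is set

    def is_watched(self, addr, is_write):
        if self.bitmap is not None:
            # B = 1: the range-wide flags are ignored and the bitmap decides.
            r, w = self.bitmap[addr - self.start]
            return w if is_write else r
        return self.write if is_write else self.read


def check(entries, addr, is_write):
    """Return True if the access hits a watchpoint in any valid entry."""
    for e in entries:
        if e.valid and e.start <= addr <= e.end:
            return e.is_watched(addr, is_write)
    return False


entries = [
    RCEntry(0x0, 0xFF, read=True),  # one large read-watched range
    RCEntry(0x100, 0x103, bitmap=[(True, False), (False, False),
                                  (False, True), (False, False)]),
]
```

Here a read of 0x10 hits the range-wide watchpoint, while accesses inside 0x100 to 0x103 are resolved byte by byte through the bitmap.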

Page 18: A Case for Unlimited Watchpoints

Watchpoint-Based Data Race Detection

[Figure: bar chart of speedup (x), y-axis up to 30, comparing VM, RC, and RC + Bitmap. Phoenix benchmarks: histogram, linear_regression, pca, word_count, GeoMean. PARSEC benchmarks: blackscholes, facesim, freqmine, swaptions, vips, canneal, streamcluster, GeoMean. Annotations: "+20%", "+10%", and "RC now 64 entries, added 2KB bitmap cache".]

Page 19: A Case for Unlimited Watchpoints

Conclusions & Future Directions

Watchpoints a useful generic mechanism

Numerous SW systems can utilize a well-designed WP system

In the future:
– Clear microarchitectural analysis
– More software systems, different algorithms

Page 20: A Case for Unlimited Watchpoints

Thank You

Page 21: A Case for Unlimited Watchpoints

BACKUP SLIDES

Page 22: A Case for Unlimited Watchpoints

Existing Watchpoint Solutions

Watchpoint Registers
+ Fine-grained, can be per-thread
– Limited number (4-16), small reach (4-8 bytes)

Virtual Memory
+ Virtually unlimited number
– Coarse-grained, per-process, only aligned ranges

ECC Mangling
+ Unlimited, finer-grained
– Per physical address, no ranges

Page 23: A Case for Unlimited Watchpoints

Width Test