Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Page 1: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

By Paul Sack, Brian E. Bliss, Zhiqiang Ma, Paul Petersen, Josep Torrellas

04/20/23

OS Lab, Ok-Kyoon Ha

2006 ACM

Page 2: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

SBMP06 2

Motivation

debugging data races is a difficult task

race detectors use two common types of algorithm: lockset-based and vector-clock-based

Data-race detection tools

- have reasonable overheads (2x slowdowns)

- do not provide much useful information or have limited usage models

Intel Thread Checker

- provides an abundance of useful information and has few usage constraints

- has high performance costs (233x slowdowns)

Page 3: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Overheads of Intel’s Thread Checker

- instrumentation alone: slowdown of 22x

- full algorithm: slowdown of 233x

- memory overhead: 20x

Page 4: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Approach

Objective

- to reduce the amount of work done by the algorithm

Filtering useless references

Page 5: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Three Filters (1/3)

Stack Filter

- filter if one thread accesses another’s stack

- cannot cause data races to be lost and is very efficient

Implementation Issues of Stack Filter

- the simplest filter, with the lowest overhead

- compares the memory reference address with the stack base and limit addresses
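The address comparison described above can be sketched roughly as follows. This is a minimal illustration, not Thread Checker's code: it assumes the filter is applied to the bounds of the accessing thread's own stack, that the stack grows downward from `base` to `limit`, and that a reference is skipped only when it lies entirely inside that range. The names `ThreadStack` and `stack_filter` are hypothetical, and the handling of accesses to another thread's stack is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThreadStack:
    """Hypothetical per-thread stack bounds (stack assumed to grow downward)."""
    base: int   # one past the highest stack address
    limit: int  # lowest stack address


def stack_filter(addr: int, size: int, stack: ThreadStack) -> bool:
    """Return True if [addr, addr + size) lies entirely within the stack bounds.

    A True result means the reference would be filtered, i.e. not passed on
    to the race-detection algorithm (an assumption of this sketch).
    """
    return stack.limit <= addr and addr + size <= stack.base


stack = ThreadStack(base=0x8000, limit=0x7000)
in_stack = stack_filter(0x7FF0, 8, stack)   # inside the bounds
on_heap = stack_filter(0x9000, 4, stack)    # outside the bounds
```

Two integer comparisons per reference is consistent with the slide's note that this is the cheapest of the three filters.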

Page 6: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Three Filters (2/3)

Duplicate Filter

- maintains the first load and store references to a variable in each segment

- filters duplicate references within a segment

- can only cause Thread Checker to lose duplicate data races

Implementation Issues of Duplicate Filter

- slower than the stack filter

- maintains filter tables whose entries are organized into 4 fields

[Figure: filter tables for threads T1 and T2, each with the fields address, size, type, ID]
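A per-thread filter table of this kind might look like the sketch below. It is an assumption-laden illustration: the class `DuplicateFilter`, its method names, and the use of a dictionary keyed by (address, size, type) are all hypothetical, and the slide does not say what the ID field holds, so it is stored here as an opaque value.

```python
from typing import Dict, Tuple


class DuplicateFilter:
    """Hypothetical per-thread filter table, reset at every segment boundary."""

    def __init__(self, thread_id: int) -> None:
        self.thread_id = thread_id
        # (address, size, type) -> ID of the first reference seen in this segment
        self.table: Dict[Tuple[int, int, str], int] = {}

    def should_filter(self, addr: int, size: int, kind: str, ref_id: int) -> bool:
        """Return True if an identical reference was already seen in this segment."""
        key = (addr, size, kind)
        if key in self.table:
            return True           # duplicate: skip the full algorithm
        self.table[key] = ref_id  # first load or store: record and pass through
        return False

    def new_segment(self) -> None:
        """Forget everything when the thread starts a new segment."""
        self.table.clear()


t1 = DuplicateFilter(thread_id=1)
first = t1.should_filter(0x100, 4, "load", ref_id=1)   # first load passes through
again = t1.should_filter(0x100, 4, "load", ref_id=2)   # duplicate is filtered
```

Keeping loads and stores under separate keys reflects the slide's statement that the first load and the first store to a variable are both retained.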

Page 7: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Three Filters (3/3)

FSM Filter

- based on the Eraser state machine

- filters references in the Private state and in the Shared Read-Only state

- filters the initial references (Uninit → Private, Private → SHR RO)

[Figure: Eraser state machine. States: UNINIT, PRIVATE, SHR RO, SHR RW. Transition labels: R, W, R1, W1, R’, W’]
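The filtering rule stated on this slide can be sketched as a small state machine per variable. The transition rules below follow the usual Eraser scheme and are an assumption, since the diagram did not survive extraction: the first access makes the variable private to that thread, a read by a second thread moves it to shared read-only, and a write by a second thread (or any write in shared read-only) moves it to shared read-write. The names `State` and `VarState` are hypothetical.

```python
from enum import Enum, auto
from typing import Optional


class State(Enum):
    UNINIT = auto()
    PRIVATE = auto()
    SHR_RO = auto()
    SHR_RW = auto()


class VarState:
    """Hypothetical per-variable state for an Eraser-style FSM filter."""

    def __init__(self) -> None:
        self.state = State.UNINIT
        self.owner: Optional[int] = None  # first thread to touch the variable

    def access(self, thread_id: int, is_write: bool) -> bool:
        """Advance the state machine; return True if the reference is filtered."""
        if self.state is State.UNINIT:
            self.state, self.owner = State.PRIVATE, thread_id
        elif self.state is State.PRIVATE:
            if thread_id != self.owner:
                self.state = State.SHR_RW if is_write else State.SHR_RO
        elif self.state is State.SHR_RO:
            if is_write:
                self.state = State.SHR_RW
        # References that leave the variable in UNINIT → PRIVATE, PRIVATE,
        # or SHR_RO are filtered; anything reaching SHR_RW is analyzed.
        return self.state is not State.SHR_RW


v = VarState()
filtered = [v.access(1, True), v.access(1, False), v.access(2, False), v.access(2, True)]
```

In this sketch only references that reach or stay in the shared read-write state are handed to the full detection algorithm.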

Page 8: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Experimental Setup

Environments

- 4-way 2.5GHz Pentium 4 workstation

- use the SPLASH-2 applications

- run with 4 threads on 4 processors

Measurements

- filtering statistics are collected by running each application three times

- performance results are collected by running each application nine times

- each application is run in Thread Checker with and without the three filters

- the number of data-race bugs reported with and without the filters is compared

Page 9: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Filtering Effectiveness

Different filter combinations

Incremental filtering effectiveness

Page 10: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Performance

Speedups obtained with filtering

Page 11: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Data-race Detection

Characterizing the impact of the three filters combined

Page 12: Accurate and Efficient Filtering for the Intel Thread Checker Race Detector

Conclusions and Future Work

Conclusion

- Intel Thread Checker has a slowdown of 233x on average

- the approach is to filter out the vast majority of memory references

- three filters were developed that filter 98% of all memory references

- speedups of 3.3x on average

Future Work

- improve the FSM filter

- reduce the other overhead sources in Thread Checker