Concurrency Analysis Platform And Tools For Finding Concurrency Bugs

Thomas Ball, Principal Researcher, Microsoft Corporation
Sebastian Burckhardt, Researcher, Microsoft Corporation
Madan Musuvathi, Researcher, Microsoft Corporation
Shaz Qadeer, Senior Researcher, Microsoft Corporation

TL58

Concurrency Is HARD!

Rare thread interleavings can result in bugs

These bugs are hard to find, reproduce, and debug
• Heisenbugs: Observing the bug can “fix” it!

A huge productivity problem
• Developers and testers can spend weeks chasing a single Heisenbug

Main Takeaways

You can find and reproduce Heisenbugs
• New automatic tool called CHESS for Win32 and .NET

CHESS used extensively inside Microsoft
• Parallel Computing Platform (PCP)
• Singularity
• Dryad/Cosmos

Releasing via DevLabs

Why Is Concurrency Hard?

Madan Musuvathi, Researcher, Microsoft Corporation

demo

Concurrency Analysis Platform (CAP)

Goal: Drive a program along an interleaving of choice
• Interleaving decided by user or by a program/tool

Today: Controlling/observing concurrency is difficult
• Manual and intrusive process

Enables lots of concurrency tools:
• Test a program along a set of interleavings
• Reproduce Heisenbugs
• Program understanding / debugging
• ...

Taming Concurrency

Madan Musuvathi, Researcher, Microsoft Corporation

demo

CAP Architecture

[Diagram labels: CAP; tools built on CAP: Testing, Data races, Memory-model bugs, Monitors, Coverage, Repro, Debugging, Visualization; programs under CAP: Unmanaged Program on Windows, Managed Program on the .NET CLR]

• Record the interleaving executed
• Drive the program along an interleaving
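The two CAP capabilities named on this slide, recording an interleaving and driving a program along one, can be illustrated with a toy model. This is a minimal sketch, not CAP's implementation: threads are modeled as Python generators, every `yield` stands in for a scheduling point, and the names `increment_twice`, `run`, `record`, and `drive` are hypothetical.

```python
import random

def increment_twice(shared):
    """A thread modeled as a generator; each yield is a point where the scheduler may switch."""
    for _ in range(2):
        tmp = shared["x"]        # read
        yield
        shared["x"] = tmp + 1    # write (the increment is deliberately non-atomic)
        yield

def run(choose):
    """Run two threads; choose(enabled) picks who moves next. Returns (trace, final x)."""
    shared = {"x": 0}
    threads = {"A": increment_twice(shared), "B": increment_twice(shared)}
    trace = []
    while threads:
        tid = choose(sorted(threads))
        trace.append(tid)
        try:
            next(threads[tid])
        except StopIteration:
            del threads[tid]     # thread finished
    return trace, shared["x"]

def record(seed):
    """Let a seeded random scheduler decide, and record the interleaving it took."""
    rng = random.Random(seed)
    return run(lambda enabled: rng.choice(enabled))

def drive(trace):
    """Drive the program along a given interleaving instead of letting a scheduler choose."""
    steps = iter(trace)
    return run(lambda enabled: next(steps))

trace, x = record(seed=7)
assert drive(trace) == (trace, x)      # replaying the recording reproduces the run
print(drive(list("AAAAABBBBB"))[1])    # serial schedule: prints 4
print(drive(list("ABABABABAB"))[1])    # fully interleaved: prints 2 (lost updates)
```

Once the schedule is an explicit input, a Heisenbug stops being a matter of luck: the same trace always produces the same result.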

CAP Specifics

Ability to explore all interleavings
• Need to understand complex concurrency APIs (Win32 and System.Threading)
• Threads, threadpools, locks, semaphores, async I/O, APCs, timers, …

Does not introduce false behaviors
• Any interleaving produced by CAP is possible on the real scheduler

Overview

Concurrency Analysis Platform (CAP)

CHESS: Find/reproduce Heisenbugs
• Integration with Visual Studio
• Demo on CCR Heisenbug

Future CAP tools
• FeatherLite: Data-race detection
• Sober: Memory-model bugs

CHESS: Find And Reproduce Heisenbugs

CHESS runs the scenario in a loop
• Every run takes a different interleaving
• Every run is repeatable

Uses the CAP scheduler
• To control and direct interleavings

Detect
• Assertion violations
• Deadlocks
• Dataraces
• Livelocks

[Diagram layers: Program (TestScenario() { … }) and CHESS (while (not done) { TestScenario() }); CAP; Win32/.NET; Kernel: Threads, Scheduler, Synchronization Objects]
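The loop described on this slide (run the scenario repeatedly, take a different interleaving each time, keep every run repeatable) can be sketched on a toy program. This is an illustrative model only: the real tool controls Win32/.NET threads, whereas here each thread is a list of atomic steps and the helper names (`make_threads`, `run`, `schedules`, `chess_loop`) are hypothetical.

```python
def make_threads(shared):
    """Two threads, each doing a non-atomic increment split into a read and a write step."""
    def thread():
        local = {}
        def read():
            local["tmp"] = shared["x"]
        def write():
            shared["x"] = local["tmp"] + 1
        return [read, write]
    return {"A": thread(), "B": thread()}

def run(schedule):
    """Execute the scenario along one interleaving; the same schedule always gives the same result."""
    shared = {"x": 0}
    threads = make_threads(shared)
    pc = {tid: 0 for tid in threads}
    for tid in schedule:
        threads[tid][pc[tid]]()
        pc[tid] += 1
    return shared

def schedules(remaining, prefix=()):
    """Yield every complete interleaving, given the number of steps left per thread."""
    if not any(remaining.values()):
        yield prefix
        return
    for tid in sorted(remaining):
        if remaining[tid] > 0:
            rest = dict(remaining)
            rest[tid] -= 1
            yield from schedules(rest, prefix + (tid,))

def chess_loop():
    """While not done: run the scenario under the next interleaving and check the assertion."""
    failures = []
    for schedule in schedules({"A": 2, "B": 2}):
        if run(schedule)["x"] != 2:      # assertion under test: no update is lost
            failures.append(schedule)
    return failures

failures = chess_loop()
print(len(failures), "of 6 interleavings violate the assertion")
print("repro:", failures[0], "->", run(failures[0]))
```

Each failing schedule doubles as a repro: passing it back to `run` reproduces the violation deterministically.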

CHESS challenge

Programs have LOTS of interleavings

[Diagram: Thread 1 through Thread n, each executing k steps: x = 1; … x = k;]

• n threads, k steps each
• Number of executions: $n^{nk}$
• Exponential in both n and k
• For n = 2, k = 100: > # of atoms in the universe

Limits scalability to large programs

Goal: Scale CHESS to large programs (large k)
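To put numbers on the growth, the sketch below evaluates the slide's $n^{nk}$ figure next to the exact count of interleavings of n threads with k atomic steps each, which is the multinomial $(nk)!/(k!)^n$. The function names are hypothetical. For n = 2, k = 100 the slide's figure is $2^{200} \approx 1.6 \times 10^{60}$ and the exact count is roughly $9 \times 10^{58}$. Common estimates for the number of atoms in the observable universe are around $10^{80}$, so the slide's comparison is best read as rhetorical, but either number is far beyond what any test run can enumerate.

```python
from math import factorial

def upper_bound(n, k):
    """The slide's figure: n choices at each of the n*k steps."""
    return n ** (n * k)

def exact_interleavings(n, k):
    """Exact number of ways to interleave n threads of k steps each: (nk)! / (k!)^n."""
    return factorial(n * k) // factorial(k) ** n

for n, k in [(2, 2), (2, 10), (2, 100), (3, 10)]:
    print(f"n={n} k={k}: exact={exact_interleavings(n, k):.3e} bound={upper_bound(n, k):.3e}")
```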

Preemption Bounding

Focus on executions with small number of preemptions

Unexpected preemptions cause bugs

Thread 1                      Thread 2
x = 1;
if (p != 0) {
        ---- preemption ---->
                              p = 0;
        <-- non-preemption --
  x = p->f;
}
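The example on this slide can be checked with a small bounded search. The sketch below is an assumption-laden model rather than the tool's algorithm: Thread 1 is split into three atomic steps, Thread 2 into one, a "preemption" is counted whenever the scheduler switches away from a thread that still has steps left, and all names are hypothetical.

```python
def run(schedule):
    """Execute the slide's two threads along one schedule; raise if p is dereferenced after being cleared."""
    state = {"p": {"f": 42}, "x": 0, "taken": False}

    def t1_assign():
        state["x"] = 1
    def t1_check():
        state["taken"] = state["p"] is not None      # if (p != 0)
    def t1_deref():
        if state["taken"]:
            if state["p"] is None:
                raise RuntimeError("null dereference")
            state["x"] = state["p"]["f"]             # x = p->f
    def t2_clear():
        state["p"] = None                            # p = 0

    threads = {1: [t1_assign, t1_check, t1_deref], 2: [t2_clear]}
    pc = {1: 0, 2: 0}
    for tid in schedule:
        threads[tid][pc[tid]]()
        pc[tid] += 1

def bounded_schedules(steps, bound, current=None, prefix=()):
    """Yield complete schedules that use at most `bound` preemptions."""
    if not any(steps.values()):
        yield prefix
        return
    for tid in sorted(steps):
        if steps[tid] == 0:
            continue
        # Leaving a thread that could still run is a preemption; leaving a finished one is free.
        preempts = current is not None and tid != current and steps[current] > 0
        if preempts and bound == 0:
            continue
        rest = dict(steps)
        rest[tid] -= 1
        yield from bounded_schedules(rest, bound - int(preempts), tid, prefix + (tid,))

def crashing(bound):
    bad = []
    for schedule in bounded_schedules({1: 3, 2: 1}, bound):
        try:
            run(schedule)
        except RuntimeError:
            bad.append(schedule)
    return bad

print(crashing(0))   # no preemptions allowed: []
print(crashing(1))   # one preemption is enough: [(1, 1, 2, 1)]
```

The only crashing schedule preempts Thread 1 between its null check and the dereference, which matches the arrows on the slide.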

Managing Astronomical Number Of Interleavings

[Diagram: Thread 1 through Thread n, each executing k steps: x = 1; … x = k;]

• n threads, k steps each
• Number of interleavings: $n^{nk}$
• Interleavings with c preemptions: $(n^2 k)^c \cdot n^n$
• For n = 2, k = 100, c = 2: < 1 million interleavings

Analysis techniques reduce this further
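The bound on this slide is easy to evaluate: for n = 2, k = 100, c = 2 it gives $(2^2 \cdot 100)^2 \cdot 2^2 = 640{,}000$. The sketch below computes that figure and, for comparison, counts schedules exactly under one reasonable definition of a preemption (switching away from a thread that still has steps to run). The definition and the function names are assumptions made for illustration, not taken from the talk.

```python
from functools import lru_cache

def slide_bound(n, k, c):
    """The slide's bound on interleavings with c preemptions: (n^2 * k)^c * n^n."""
    return (n * n * k) ** c * n ** n

def count_schedules(n, k, c):
    """Exact number of complete schedules of n threads x k steps with at most c preemptions."""
    @lru_cache(maxsize=None)
    def go(remaining, current, budget):
        if not any(remaining):
            return 1
        total = 0
        for t in range(n):
            if remaining[t] == 0:
                continue
            # Switching away from a thread with steps left costs one preemption.
            cost = int(current != -1 and t != current and remaining[current] > 0)
            if cost > budget:
                continue
            nxt = remaining[:t] + (remaining[t] - 1,) + remaining[t + 1:]
            total += go(nxt, t, budget - cost)
        return total
    return go((k,) * n, -1, c)

print(slide_bound(2, 100, 2))        # 640000
print(count_schedules(2, 100, 2))    # exact count under the definition above
```

Under this definition the exact count comes out well below the bound, and both are tiny next to the unbounded count.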

CHESS In VSTS

Thomas Ball, Principal Researcher, Microsoft Corporation

demo

Real Scenario: Heisenbug In CCR

demo

George Chrysanthakopoulos’ Challenge

CHESS Internal Customers

Parallel Computing Platform
• PLINQ: Parallel LINQ
• CDS: Concurrent Data Structures
• STM: Software Transactional Memory
• TPL: Task Parallel Library
• ConcRT: Concurrency Runtime
• CCR: Concurrency and Coordination Runtime

Dryad/Cosmos

Singularity (Research OS from MSR)
• CHESS can systematically test the boot and shutdown process

Overview

Concurrency Analysis Platform (CAP)

CHESS: Find/reproduce Heisenbugs
• Integration with Visual Studio
• Demo on CCR Heisenbug

Future CAP tools
• FeatherLite: Data-race detection
• Sober: Memory-model bugs

FeatherLite: Lightweight data-race detection

Data-races: Access to data with insufficient synchronization

Data-races are a common source of concurrency errors

Thread 1                      Thread 2
EnterCS(cs)                   p = 0;
if (p != 0) {
  x = p->f;
}
LeaveCS(cs)
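In the example on this slide, Thread 1 touches p only inside the critical section, while Thread 2 writes p without entering it. The slides do not describe FeatherLite's detection algorithm, so the sketch below swaps in a classic lockset check (in the style of the Eraser tool) purely to show why this access pattern counts as a data race. The trace format and all names are hypothetical.

```python
def find_races(trace):
    """Lockset check: a variable races if threads share it, one writes, and no common lock guards it.

    trace: iterable of (thread, op, variable, locks_held), with op "read" or "write".
    """
    common_locks = {}   # variable -> locks held on every access so far
    accessors = {}      # variable -> threads that touched it
    written = set()     # variables with at least one write
    races = set()
    for thread, op, var, held in trace:
        held = set(held)
        common_locks[var] = common_locks.get(var, held) & held
        accessors.setdefault(var, set()).add(thread)
        if op == "write":
            written.add(var)
        if len(accessors[var]) > 1 and var in written and not common_locks[var]:
            races.add(var)
    return races

# The slide's example: Thread 1 holds cs, Thread 2 holds nothing.
slide_trace = [
    (1, "read", "p", {"cs"}),    # if (p != 0)
    (1, "read", "p", {"cs"}),    # p->f
    (1, "write", "x", {"cs"}),   # x = ...
    (2, "write", "p", set()),    # p = 0, outside the critical section
]
print(find_races(slide_trace))   # {'p'}

# If Thread 2 also entered the critical section, nothing is reported.
fixed_trace = slide_trace[:3] + [(2, "write", "p", {"cs"})]
print(find_races(fixed_trace))   # set()
```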

Sampling To Reduce Overhead

Existing data-race detection tools have a large runtime overhead
• More than 10X slowdown
• Process every memory access

Intelligent sampling algorithms
• Process < 5% of the memory accesses

Less than 30% runtime overhead
• Existing tools > 1000% overhead

FeatherLite + CAP: "Active" data-race detection

Programs have “benign” data-races
• Do not result in program crashes
• Example: Updating statistical counters without holding a lock

FeatherLite drives the program along the two outcomes of the data-race
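The idea of driving the program along both outcomes of a race can be sketched as follows: run the program once with each ordering of the two racing accesses and see whether either ordering crashes. How the real tool does this is not described in the slides. The classifier and both toy programs below are hypothetical, and "benign" here only means "neither ordering crashed", as on the slide.

```python
def classify(run_with_first):
    """Run the program twice, once per ordering of the two racing accesses."""
    outcomes = {}
    for first in ("T1", "T2"):
        try:
            run_with_first(first)
            outcomes[first] = "ok"
        except Exception as exc:
            outcomes[first] = "crash: " + type(exc).__name__
    verdict = "benign" if set(outcomes.values()) == {"ok"} else "harmful"
    return verdict, outcomes

def counter_race(first):
    """Two threads bump a statistics counter without holding a lock."""
    stats = {"hits": 0}
    second = "T2" if first == "T1" else "T1"
    for _tid in (first, second):
        stats["hits"] = stats["hits"] + 1    # the racing access
    return stats

def pointer_race(first):
    """T1 dereferences p after its null check passed; T2 clears p."""
    state = {"p": {"f": 42}, "x": 1}
    def t1():
        state["x"] = state["p"]["f"]         # raises TypeError if p is None
    def t2():
        state["p"] = None
    order = (t1, t2) if first == "T1" else (t2, t1)
    for access in order:
        access()
    return state

print(classify(counter_race))   # ('benign', {'T1': 'ok', 'T2': 'ok'})
print(classify(pointer_race))   # ('harmful', {'T1': 'ok', 'T2': 'crash: TypeError'})
```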

Sober: Tool for finding memory-model errors

Expert programmers use “lock-free” techniques
• Use low-level synchronizations, volatile variables, …
• For performance

Such programs are exposed to memory-model issues
• Compiler can reorder instructions
• Hardware can reorder/delay memory accesses

Result: Hard-to-find bugs that are hard to understand

Sober Quiz

Can both threads DoWork() at the same time?

// initial state
volatile bool flag1 = false;
volatile bool flag2 = false;

Thread 1                      Thread 2
flag1 = true;                 flag2 = true;
if( !flag2 )                  if( !flag1 )
  DoWork();                     DoWork();
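The slides do not state the answer to the quiz, so the sketch below only explores it under a stated assumption: each thread's store may sit in a private store buffer and reach shared memory after that thread's subsequent load, which is the kind of delayed memory access mentioned on the previous slide. With that assumption switched off, the model is sequentially consistent. Whether `volatile` rules out the delay depends on the language and platform, and all names here are hypothetical.

```python
from itertools import product

def interleavings(a, b):
    """Yield every merge of sequences a and b that keeps each one's internal order."""
    if not a or not b:
        yield a + b
        return
    for rest in interleavings(a[1:], b):
        yield a[:1] + rest
    for rest in interleavings(a, b[1:]):
        yield b[:1] + rest

def run(events):
    """Execute (thread, op) events against shared memory; return the set of threads that call DoWork()."""
    memory = {"flag1": False, "flag2": False}
    did_work = set()
    for tid, op in events:
        mine, other = ("flag1", "flag2") if tid == 1 else ("flag2", "flag1")
        if op == "flush":            # the buffered store becomes visible to the other thread
            memory[mine] = True
        elif op == "load":           # if (!other_flag) DoWork();
            if not memory[other]:
                did_work.add(tid)
        # op == "store" only places the value in the thread's private buffer
    return did_work

def can_both_work(store_buffering):
    """True if some execution lets both threads reach DoWork()."""
    orders = [("store", "flush", "load")]            # store is visible before the load
    if store_buffering:
        orders.append(("store", "load", "flush"))    # flush delayed past the load
    for o1, o2 in product(orders, repeat=2):
        t1 = tuple((1, op) for op in o1)
        t2 = tuple((2, op) for op in o2)
        if any(run(ev) == {1, 2} for ev in interleavings(t1, t2)):
            return True
    return False

print(can_both_work(store_buffering=False))   # False
print(can_both_work(store_buffering=True))    # True
```

In this model, no sequentially consistent interleaving lets both threads into `DoWork()`, but delaying both flushes past both loads does.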

Conclusion

Concurrency Analysis Platform (CAP) for controlling thread interleavings
• Enables lots of concurrency tools

CHESS: Find and reproduce Heisenbugs
• Don’t stress, use CHESS
• Look for download on http://msdn.microsoft.com/devlabs/

Future CAP tools
• FeatherLite: Lightweight data-race detection
• Sober: Tool for finding memory-model bugs

Evals & Recordings

Please fill out your evaluation for this session at:
This session will be available as a recording at:

www.microsoftpdc.com

Please use the microphones provided

Q&A

© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
