Asynchronous Programming, Analysis and Testing with State Machines

Pantazis Deligiannis, Alastair Donaldson, Jeroen Ketema, Akash Lal, Paul Thomson

PLDI 2015

Upload: pantazis-deligiannis

Post on 12-Aug-2015


Async programming is HARD!

infamous heisenbugs due to exponentially many thread interleavings!

very challenging to detect, diagnose and fix!

What is a good way to develop async systems?

- actor-based languages (e.g. Scala, Erlang and P), which allow you to declaratively write complex asynchronous code

- event-driven approach: two or more components communicate by sending and receiving events

But how to analyse and test such systems?

- lots of research, really cool techniques and papers about e.g. data race detection

- but arguably still an open problem

The P# approach

We developed P#, a new language co-designed with static race analysis and systematic testing:

- communicating state machines (the same concept as actors) are first-class citizens

- P# is an extension of C#: you can blend sequential C# code with P#, which makes it easier to apply in existing projects

- the compiler exploits the state-machine structure of a P# program to create a more scalable and precise static data race analysis

- it then exploits the race-freedom guarantees from the static analysis to accelerate systematic testing

Case study

We used P# inside Microsoft to port “AsyncSystem”:

- a real asynchronous C# application, built on top of an earlier internal version of Azure Service Fabric

- 14 P# state machines, ~6K LoC

Porting, analysing and testing the application component by component with P#:

- revealed multiple asynchronous bugs in the original application that are hard to find with traditional stress testing

- one of these bugs had gone undetected across multiple internal releases

[Figure: a Microsoft Azure cluster with nodes Node 0 to Node n. Each node runs the Azure Service Fabric runtime and the P# runtime, hosting machines and their states. Other labels: distributed shared memory, P# application.]

P# ecosystem

[Figure: the P# ecosystem. The P# compiler (parsing, static race analysis, systematic testing) is built on Microsoft’s Roslyn compiler-as-a-service framework and sits alongside the P# asynchronous runtime; a P# program is drawn as state machines containing states. Later builds of the slide add the label "synergy" and two annotations: static race analysis is the focus of this talk; for systematic testing, see the paper.]

How to code in P#

Machine Declarations

Each P# program contains one or more machines.

Machines execute concurrently with each other.

The main machine is the first machine to be created when the P# program starts executing.

    main machine Client {
    }

    machine Server {
    }

State Declarations

Each P# machine has one or more states.

The start state is the initial state of the machine.

    main machine Client {
        start state Init {
        }
    }

    machine Server {
        start state Init {
        }
    }

Entry point for a P# state

An optional entry declaration denotes an action that executes upon entering the state.

    main machine Client {
        start state Init {
            entry {
            }
        }
    }

    machine Server {
        start state Init {
        }
    }

Field Declaration

Fields cannot be public.

A machine is not allowed to directly access another machine’s fields!

    main machine Client {
        machine Server;

        start state Init {
            entry {
            }
        }
    }

    machine Server {
        start state Init {
        }
    }

Statement for instantiating a new P# machine

The create statement executes asynchronously.

    main machine Client {
        machine Server;

        start state Init {
            entry {
                Server = create Server();
            }
        }
    }

    machine Server {
        start state Init {
        }
    }

State Transition Declaration

A P# machine can transition to a new state by raising or receiving an event.

The raise statement “sends” an event to the machine itself.

    main machine Client {
        machine Server;

        start state Init {
            entry {
                Server = create Server();
                raise Unit;
            }
            on Unit goto Requesting;
        }

        state Requesting {
            entry {
            }
        }
    }

    machine Server {
        start state Init {
        }
    }
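The transition above can be mimicked in a few lines of Python. This is only a toy model for intuition, not P# and not the P# runtime: the class layout, `raise_event`, `goto` and the `entry_*` naming are all hypothetical, and the model assumes a raised event is handled once the entry action returns.

```python
class Client:
    """Toy model of the Client machine: states, entry actions, `on E goto S`."""

    def __init__(self):
        self.trace = []                      # states entered, for inspection
        # Transition table: (current state, event) -> next state.
        self.transitions = {("Init", "Unit"): "Requesting"}
        self.raised = None
        self.state = None
        self.goto("Init")                    # Init is the start state

    def raise_event(self, event):
        # Models `raise`: the event targets the machine itself.
        self.raised = event

    def goto(self, state):
        while state is not None:
            self.state = state
            self.trace.append(state)
            self.raised = None
            getattr(self, "entry_" + state)()    # run the entry action
            # Follow a transition only if the entry action raised a handled event.
            state = self.transitions.get((self.state, self.raised))

    def entry_Init(self):
        self.raise_event("Unit")             # models `raise Unit;`

    def entry_Requesting(self):
        pass                                 # empty entry block, as on the slide


client = Client()
print(client.trace)
```

Creating the client runs the entry action of Init, which raises Unit and thereby moves the machine to Requesting.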

Event Declaration

    event Unit;

    main machine Client {
        machine Server;

        start state Init {
            entry {
                Server = create Server();
                raise Unit;
            }
            on Unit goto Requesting;
        }

        state Requesting {
            entry {
            }
        }
    }

    machine Server {
        start state Init {
        }
    }

Call sequential C# methods

Methods cannot be public.

    event Unit;

    main machine Client {
        machine Server;

        start state Init {
            entry {
                Server = create Server();
                raise Unit;
            }
            on Unit goto Requesting;
        }

        state Requesting {
            entry {
                var f = new File(...);
                PrepareRequest(f);
            }
        }

        void PrepareRequest (File f) {
        }
    }

    machine Server {
        start state Init {
        }
    }

Statement for asynchronously sending an event to another machine

    event Unit;
    event Request;

    main machine Client {
        machine Server;

        start state Init {
            entry {
                Server = create Server();
                raise Unit;
            }
            on Unit goto Requesting;
        }

        state Requesting {
            entry {
                var f = new File(...);
                PrepareRequest(f);
            }
        }

        void PrepareRequest (File f) { SendRequest(f); }

        void SendRequest (File f) { send Request ( f ) to Server; }
    }

    machine Server {
        start state Init {
            on Request goto Processing;
        }

        state Processing {
        }
    }

The payload is sent by reference (not cloned)

    event Unit;
    event Request;

    main machine Client {
        machine Server;

        start state Init {
            entry {
                Server = create Server();
                raise Unit;
            }
            on Unit goto Requesting;
        }

        state Requesting {
            entry {
                var f = new File(...);
                PrepareRequest(f);
            }
        }

        void PrepareRequest (File f) { SendRequest(f); }

        void SendRequest (File f) { send Request ( f ) to Server; }
    }

    machine Server {
        start state Init {
            on Request goto Processing;
        }

        state Processing {
            entry {
                var file = (File)payload;
                if (file.Log.Check(...)) { ... }
            }
        }
    }
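Why passing the payload by reference matters can be seen in a small Python model. This is a sketch, not P#: the `File` class, the `log` attribute and the use of `queue.Queue` as the Server's event queue are stand-ins chosen for illustration.

```python
import queue


class File:
    def __init__(self):
        self.log = None


inbox = queue.Queue()            # stands in for the Server machine's event queue

f = File()
inbox.put(("Request", f))        # models `send Request(f) to Server;`
f.log = "written after send"     # the sender still holds a reference to f

event, payload = inbox.get()     # the Server dequeues the event
print(payload is f)              # the receiver holds the very same object
print(payload.log)               # and observes the sender's later write
```

Because nothing is copied, sender and receiver share the object after the send, which is exactly the situation in which a data race can arise once the two run concurrently.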


P# Static Race Analysis

Previous example, with one added statement in Requesting: f.Log = new Log(...);

    main machine Client {
        machine Server;

        ...

        state Requesting {
            entry {
                var f = new File(...);
                PrepareRequest(f);
                f.Log = new Log(...);
            }
        }

        void PrepareRequest (File f) { SendRequest(f); }

        void SendRequest (File f) { send Request ( f ) to Server; }
    }

    machine Server {
        ...

        state Processing {
            entry {
                var file = (File)payload;
                if (file.Log.Check(...)) { ... }
            }
        }
    }

Data race!

Read-Write Race: the Client writes f.Log after it has sent f to the Server, while the Server reads file.Log on the same object.

How to detect?

Static ownership-based data race analysis that exploits the state-machine structure of a P# program. Key idea:

- objects are owned by machines

- ownership is transferred when sending a payload via a send or create statement

- once ownership has been given up, the machine must not access the given-up object

- if it does, this denotes a potential data race
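The ownership rule can be illustrated with a small Python model. Note the assumptions: the P# analysis is static, whereas this sketch tracks ownership dynamically while the statements run, and `OwnershipTracker` with its methods is a hypothetical helper, not part of P#.

```python
class OwnershipTracker:
    """Records which machine owns each object and flags accesses by non-owners."""

    def __init__(self):
        self.owner = {}          # id(obj) -> name of the owning machine
        self.warnings = []

    def create(self, machine, obj):
        self.owner[id(obj)] = machine

    def send(self, sender, receiver, obj):
        self.access(sender, obj, "send")
        self.owner[id(obj)] = receiver       # ownership moves with the payload

    def access(self, machine, obj, kind):
        if self.owner.get(id(obj)) != machine:
            self.warnings.append(f"{machine}: {kind} access to given-up object")


tracker = OwnershipTracker()
f = {"Log": None}                            # stand-in for the File object
tracker.create("Client", f)
tracker.send("Client", "Server", f)          # send Request(f) to Server
tracker.access("Client", f, "write")         # f.Log = new Log(...): flagged
tracker.access("Server", f, "read")          # file.Log.Check(...): allowed
print(tracker.warnings)
```

Only the Client's write after the send is reported; the Server's read is fine because the Server owns the object by then.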

The example is simple on purpose. In reality we have to deal with well-known challenges such as:

- aliasing

- complex data types in C#

- recursion

We do not handle reflection and we assume no non-P# concurrency is used

We conservatively deal with library APIs by over-approximating their effects

We collapse objects: whenever a machine gives up ownership of an object, everything reachable in the heap from that object is also given up
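Collapsing objects amounts to taking everything reachable from the given-up object. A minimal Python sketch, assuming the heap is modelled as a dictionary from an object name to the names it references (the heap contents below are invented for illustration):

```python
def given_up_closure(heap, root):
    """All objects reachable from `root` in `heap` (object -> referenced objects)."""
    seen, stack = set(), [root]
    while stack:
        obj = stack.pop()
        if obj not in seen:
            seen.add(obj)
            stack.extend(heap.get(obj, []))
    return seen


heap = {
    "file": ["log", "name"],
    "log": ["entries"],
    "entries": ["log"],      # a cycle, handled by the `seen` set
    "config": ["name"],      # an alias: config also references name
}
print(sorted(given_up_closure(heap, "file")))
```

Giving up `file` also gives up `log`, `entries` and `name`. Since `config` still references `name`, a later access through `config` would touch a given-up object, which is why aliasing is one of the challenges listed above.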

Why is this interesting?

- we have full control of the language, so we can tailor our analysis to the needs of P#: more flexibility!

- our goal is to find the tricky data races that arise when state machines communicate asynchronously

- if no races are detected, we speed up systematic testing by scheduling only at asynchronous communication points

- no need for sophisticated analyses (e.g. Thresher from PLDI’13): we manage to efficiently detect races with a low false-alarm rate using a relatively simple but effective taint-tracking analysis
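To get a feel for why scheduling only at asynchronous communication points helps, the Python sketch below counts schedules by depth-first search at two granularities. It is an illustration under simplifying assumptions, not the P# testing engine: the step lists are invented, and every run of local steps is merged with the send that ends it.

```python
def count_schedules(machines):
    """Depth-first count of all interleavings of the machines' atomic blocks."""
    def dfs(positions):
        if all(pos == len(blocks) for pos, blocks in zip(positions, machines)):
            return 1                          # every machine has finished
        total = 0
        for i, blocks in enumerate(machines):
            if positions[i] < len(blocks):    # machine i can take its next block
                nxt = positions[:i] + (positions[i] + 1,) + positions[i + 1:]
                total += dfs(nxt)
        return total
    return dfs(tuple(0 for _ in machines))


def coarsen(steps):
    """Merge each run of local steps with the send that ends it into one block."""
    blocks, current = [], []
    for step in steps:
        current.append(step)
        if step == "send":
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


client = ["local", "local", "send", "local", "send"]
server = ["local", "send", "local", "local", "send"]
fine = count_schedules([[[s] for s in client], [[s] for s in server]])
coarse = count_schedules([coarsen(client), coarsen(server)])
print(fine, coarse)
```

For these two five-step machines, treating every step as a scheduling point yields 252 schedules, while scheduling only at sends yields 6. The coarser granularity is sound only if the local steps cannot race, which is what the static analysis is meant to establish.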

Race Analysis

Machine- and method-modular data flow analysis on the Roslyn IR (C# in CFG form), in three phases:

- Gives-up analysis (computes summaries of which method parameters are given up)

- Respects-ownership analysis (uses the summaries to detect potential data races, and to scale)

- Cross-state analysis (considers the transition graph of a state machine to discard false positives about given-up machine fields)

Gives Up Analysis

    main machine Client {
        machine Server;

        state Requesting {
            entry {
                var f = new File(...);
                PrepareRequest(f);
                f.Log = new Log(...);
            }
        }

        void PrepareRequest (File f) { SendRequest(f); }

        void SendRequest (File f) { send Request ( f ) to Server; }
    }

Gives-up Sets for Client, step by step:

- initially: PrepareRequest { }, SendRequest { }

- PrepareRequest() calls SendRequest(), but SendRequest() has no gives-up yet: PrepareRequest { }, SendRequest { }

- SendRequest() is sending! f is given up: PrepareRequest { }, SendRequest { f }

- SendRequest() now has gives-up, so in PrepareRequest() f is given up as well: PrepareRequest { f }, SendRequest { f }

- the sets no longer change: fixpoint reached!
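The iteration to a fixpoint can be sketched in Python. This is a deliberately simplified model, not the P# implementation: methods are described by a hypothetical dictionary format, the analysis is flow-insensitive, and aliasing is ignored.

```python
def gives_up_sets(methods):
    """Compute, per method, the set of its parameters that are given up.

    methods: name -> {"params": [...], "sends": [vars], "calls": [(callee, [args])]}
    """
    summary = {name: set() for name in methods}
    changed = True
    while changed:                            # iterate until nothing changes
        changed = False
        for name, body in methods.items():
            given = set(body["sends"])        # sent directly by this method
            for callee, args in body["calls"]:
                callee_params = methods[callee]["params"]
                for param, arg in zip(callee_params, args):
                    if param in summary[callee]:
                        given.add(arg)        # given up through the callee
            given &= set(body["params"])      # summaries cover parameters only
            if given != summary[name]:
                summary[name] = given
                changed = True
    return summary


client = {
    "PrepareRequest": {"params": ["f"], "sends": [], "calls": [("SendRequest", ["f"])]},
    "SendRequest": {"params": ["f"], "sends": ["f"], "calls": []},
}
print(gives_up_sets(client))
```

On this input the first pass finds that SendRequest gives up `f`, the second pass propagates this to PrepareRequest, and the third pass changes nothing, mirroring the steps on the slides.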

Respects Ownership Analysis

Gives-up Sets for Client

PrepareRequest: { f }

SendRequest: { f }

    state Requesting {
        entry {
            var f = new File(...);
            PrepareRequest(f);
            f.Log = new Log(...);
        }
    }

- f is in the gives-up set of PrepareRequest(), so the call PrepareRequest(f) gives up f

- the next statement, f.Log = new Log(...), writes to f

WARNING: Write Access to Given Up Object
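A minimal Python sketch of this check over straight-line code follows. The statement encoding and the `respects_ownership` helper are assumptions made for illustration; the real analysis works on control-flow graphs and must handle branches and aliasing.

```python
def respects_ownership(statements, gives_up):
    """Warn on any read or write of a variable after it has been given up.

    statements: ("new", var) | ("call", method, [args]) | ("read"/"write", var)
    gives_up:   method name -> positions of the parameters that method gives up
    """
    given_up, warnings = set(), []
    for stmt in statements:
        kind = stmt[0]
        if kind == "new":
            given_up.discard(stmt[1])         # fresh object, owned again
        elif kind == "call":
            method, args = stmt[1], stmt[2]
            for position, arg in enumerate(args):
                if position in gives_up.get(method, set()):
                    given_up.add(arg)         # the callee gives this argument up
        elif stmt[1] in given_up:             # a read or write of a given-up var
            warnings.append(f"WARNING: {kind} access to given-up object '{stmt[1]}'")
    return warnings


entry_of_requesting = [
    ("new", "f"),                             # var f = new File(...);
    ("call", "PrepareRequest", ["f"]),        # PrepareRequest(f);
    ("write", "f"),                           # f.Log = new Log(...);
]
summaries = {"PrepareRequest": {0}, "SendRequest": {0}}
print(respects_ownership(entry_of_requesting, summaries))
```

Only the summaries are consulted at the call site, so the body of PrepareRequest does not need to be re-analysed here; this is what makes the check method-modular.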

Modify example

    main machine Client {
        machine Server;
        File File;

        state Requesting {
            entry {
                File = new File(...);
                PrepareRequest(File);
            }
        }

        void PrepareRequest (File f) { SendRequest(f); }

        void SendRequest (File f) { send Request ( f ) to Server; }
    }

Gives-up Sets for Client

PrepareRequest: { f }

SendRequest: { f }

WARNING: Giving up a field of the machine

xSA: Cross-State Analysis

Gives-up Sets for Client

PrepareRequest: { f }

SendRequest: { f }

    state Requesting {
        entry {
            File = new File(...);
            PrepareRequest(File);
        }
    }

No Data Race: the field File is not accessed after it is given up
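One way to picture the cross-state check is as a reachability question on the machine's transition graph. The Python sketch below is a simplified model under stated assumptions: states are plain names, `accesses` lists the states whose actions touch the field, and the `Logging` state in the second call is hypothetical, added only to show a positive case.

```python
def field_accessed_after_give_up(transitions, accesses, giving_state):
    """True if a state reachable after `giving_state` accesses the given-up field.

    transitions: state -> list of successor states
    accesses:    set of states whose actions read or write the field
    """
    seen, stack = set(), list(transitions.get(giving_state, []))
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        if state in accesses:
            return True                       # the warning must be kept
        stack.extend(transitions.get(state, []))
    return False                              # the warning can be discarded


# The modified Client: Init goes to Requesting, which gives up the field File.
client = {"Init": ["Requesting"], "Requesting": []}
print(field_accessed_after_give_up(client, set(), "Requesting"))

# Hypothetical variant with an extra state that reads File afterwards.
variant = {"Init": ["Requesting"], "Requesting": ["Logging"], "Logging": []}
print(field_accessed_after_give_up(variant, {"Logging"}, "Requesting"))
```

For the Client on the slide no later state touches the field, so the warning about giving up a field is a false positive; in the variant it would have to stay.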


Evaluation

Case study in Microsoft

Implemented 12 asynchronous applications:

- between 400 and 2300 lines of code

- single-box implementations

- simulate the environment (e.g. timeouts, failures) using nondeterministic P# machines

- we statically analysed them for races and then systematically tested them for bugs

Overview of results

Static race analysis:

- with xSA off:
  - 6 false races in case study (< 15 seconds)
  - 7 out of 12 verified (each < 6 seconds)

- with xSA on:
  - 2 false races in case study (< 15 seconds)
  - 11 out of 12 verified (each < 6 seconds)

Systematic testing:

- Compared with Microsoft’s CHESS

- P# DFS was 7.6x faster (schedules/sec) than CHESS on average, with CHESS’s data race detection off

Exciting directions

Short term: implement new complex, large-scale asynchronous applications in P#

This will give us direct feedback on how to improve the language, analysis and testing

Long term: promote the use of P# inside Microsoft and to the general .NET community

How to check out P#?

The compiler and samples will be open-sourced soon on GitHub under a permissive license.

Please send queries and feedback to my email: [email protected]