
Using Static Analysis to Detect Type Errors & Concurrency Defects in Erlang Programs

Kostis Sagonas

National Technical University of Athens, Greece

Uppsala University, Sweden

Kostis Sagonas Dialyzer @ UCM 2010

Erlang

A concurrent programming language
− syntax influenced by Prolog
− functional: strict, higher-order, pattern matching, guards, list comprehensions
− dynamically typed

The main implementation of Erlang is the Erlang/OTP system from Ericsson

Successfully used within telecommunications and in application domains which require a high degree of concurrency and/or fault tolerance


Dynamically typed languages

... such as Erlang can be seen as unityped:
− Only one type: term() or any()

However, some primitive functions are only defined on subtypes of all terms and their arguments need to be checked at runtime

Type safety is not an issue as it is provided by the runtime system: all terms are tagged with their type, which is checked in primitive operations

However, programmers often make mistakes that a statically typed language would normally catch


Type system to the rescue?

Type system: an analysis that is fast and sound for correctness
− “well-typed programs do not go wrong” (w.r.t. some kinds of errors)
− pessimistic and inflexible: if it cannot prove safety, it rejects the program

Using Static Analysis for Detecting Type Errors


False error reports: show-stopper

Flanagan et al. (ESC/Java people), 2002:
− “[T]he tool has not reached the desired level of cost effectiveness. In particular, users complain about an annotation burden that is perceived to be heavy, and about excessive warnings about non-bugs, particularly on unannotated or partially-annotated programs.”

Rutar et al., 2004:
− > 9k NPE warnings in 170k non-commented source statements
− “[T]here are too many warnings to be easily useful by themselves.”


The problem statement

Infer types for all functions in the program, without imposing any constraints that the programmer never intended

foo([], L) -> L;
foo([H|T], L) -> [H|foo(T, L)].

foo(list(), any()) -> any()

To be sound for defect detection, the inferred types have to respect the operational semantics of the language

Should this type be different if the function was called append instead of foo?
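A minimal sketch of why the second argument and the result are typed any(). It reuses the slide's definition unchanged; the module name foo_demo is an assumption made for illustration.

```erlang
-module(foo_demo).
-export([foo/2]).

%% Same definition as on the slide: the second argument is never inspected,
%% it is only returned or placed in the tail of a list cell.
foo([], L) -> L;
foo([H|T], L) -> [H|foo(T, L)].
```

Calling foo_demo:foo([1,2], [3]) gives [1,2,3], but foo_demo:foo([], not_a_list) also succeeds and returns not_a_list, and foo_demo:foo([1], x) returns the improper list [1|x]. A typing that demanded a list in the second position would exclude calls that the operational semantics allows, whatever the function happens to be called.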

Women, Men, and Restroom Signs


Restroom signs in the U.S.A.


Restroom signs at NTUA, Greece


Tempting conclusion

In the U.S., everything is forbidden unless it is allowed

In Greece, everything is allowed unless it is forbidden

That’s not the point of the story...


More permissive?


Which is better?

+
− More restrictive: no dogs allowed
− More permissive: what if both?

-
− Too restrictive: what if neither? (e.g., boy/girl)


Which is better?

+
− More permissive: what if neither? (“just visiting your planet”)

-
− Too permissive: dogs allowed
− Too restrictive: what if both?


All is a matter of desired specification

No definition is better by default
− all is a matter of desired behavior
− usual precision and recall metrics for quality
  − what percentage of allowed behaviors are desired?
  − what percentage of desired behaviors are allowed?

Back to Detecting Type Errors by Static Analysis...


The moral of all this is...

Being sound for correctness is not always a superior approach
− restricts expressiveness
− rejects valid programs
  − think of sound analyses for “no division by zero”

All depends on desired behavior
− for defect detection tools the goal is to find definite bugs, rather than to prove correctness
− soundness for incorrectness is very valuable!


Type analysis for defect detection

The inferred types should:
− Be easy to interpret by the programmer
− Never lie: Capture all possible, however unintended, uses of functions

The type inference algorithm should:
− Be completely automatic
  − Not require any user annotations
  − Not require any type declarations
− Handle cases where not all code is available
− Be relatively fast


Dialyzer

“A DIscrepancy AnaLYZer for ERlang programs”

Part of the Erlang/OTP distribution since 2007

A lightweight static analysis tool for finding discrepancies in Erlang programs
− Managed to uncover many bugs in large, well-tested, commercial applications
− Heavily used in the Erlang community


Dialyzer

Characteristics of dialyzer (for type error detection):

Sound for defect detection – not for correctness!

Push-button technology, completely automatic

Fast and scalable

Very successful


An Erlang implementation of and

Erlang program

and(true, true) -> true;
and(false, _) -> false;
and(_, false) -> false.

Trial runs

> and(true, true).
true
> and(false, true).
false
> and(false, gazonk).
false
> and(3.14, false).
false

bool() ::= true | false




HM-type signature

and(bool(), bool()) -> bool()




Subtyping signature

and(any(), false) -> bool()

Typing inferred by the algorithm from S. Marlow and P. Wadler, “A practical subtyping system for Erlang”


A quick look at inferred domains

Dynamic typing domain

Static typing domain

We need to capture all of the dynamic domain!


Success typings

Definition:

A success typing for a function f is a type signature, α → β, such that whenever an application f(x) reduces to a value v, then x ∈ α and v ∈ β.

Intuition:
− If the arguments are in the domain of the function, the application might succeed, but
− if they are not, the application will definitely fail.
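The definition can be checked by hand on the and example. The sketch below is illustrative only: the function is renamed and_/2 because and is a reserved word in Erlang, and the module and_demo and the helper succeeds/2 are assumptions, not part of the slides.

```erlang
-module(and_demo).
-export([and_/2, succeeds/2]).

%% The slide's function, renamed because `and` is a reserved word.
and_(true, true) -> true;
and_(false, _) -> false;
and_(_, false) -> false.

%% Hypothetical helper: reports whether and_(X, Y) reduces to a value.
succeeds(X, Y) ->
    try and_(X, Y) of
        V when is_boolean(V) -> {value, V}
    catch
        error:function_clause -> fails
    end.
```

Here and_demo:succeeds(3.14, false) returns {value, false} and and_demo:succeeds(true, gazonk) returns fails. Every call that does return a value returns a boolean, and non-boolean arguments can appear in either position of a successful call, so (any(), any()) -> bool() is a success typing. It does not claim that every pair of arguments succeeds.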


Function domains revisited

Success typing domain

Dynamic typing domain

Static typing domain


Recap

HM-type

and(bool(), bool()) -> bool()

Subtyping

and(any(), false) -> bool()

Success typing

and(any(), any()) -> bool()


Two sides to the story

Static typing view
− Well-typed programs do not go wrong!
− Pessimism: If we cannot prove type safety we must reject the program.

Success typing view
− Ill-typed programs will surely fail!
− Optimism: If we cannot detect a type clash we need to accept the program as it might work.


Inferring success typings

There is a most general success typing for all functions of a certain arity
− (any()) -> any() for all functions of arity 1
− (any(), any()) -> any() for all functions of arity 2
− ...

The aim of the inference algorithm is to reduce both the domain and the range of the success typing as much as possible without excluding any valid terms


The inference algorithm [PPDP’06]

Constraint-based algorithm
− Constraint generation
− Constraint solving, bottom-up per SCC

Constraints are organized in disjunctions and conjunctions of subtype constraints

C ::= (T1 ⊆ T2) | (C1 ∧ … ∧ Cn) | (C1 ∨ … ∨ Cn)

Conjunctions come from straight-line code and disjunctions come from choices (case statements)
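The shape of such constraints can be illustrated with a toy solver. This is a simplified sketch, not Dialyzer's implementation: types are modelled as sets of four base-type tags, constraints only bound a variable by a fixed type, and the module constraint_sketch, its tuple encoding ({sub, Var, Type}, {conj, Cs}, {disj, Cs}) and the hand-written constraints in demo/0 are all assumptions made for illustration.

```erlang
-module(constraint_sketch).
-export([solve/2, demo/0]).

%% Toy universe: a type is an ordset of these base-type tags (an assumption).
universe() -> ordsets:from_list([atom, float, integer, list]).

%% solve(Constraint, Env) -> {ok, NewEnv} | none
%% Env maps variable names to types; a missing variable means "any".
solve({sub, Var, Type}, Env) ->
    Old = maps:get(Var, Env, universe()),
    case ordsets:intersection(Old, ordsets:from_list(Type)) of
        [] -> none;                          % no value fits: definite failure
        New -> {ok, Env#{Var => New}}
    end;
solve({conj, Cs}, Env) ->
    %% Straight-line code: every constraint must hold, so narrow in sequence.
    lists:foldl(fun(_, none) -> none;
                   (C, {ok, E}) -> solve(C, E)
                end, {ok, Env}, Cs);
solve({disj, Cs}, Env) ->
    %% Choices: drop branches that fail, join the rest variable by variable.
    case [E || C <- Cs, {ok, E} <- [solve(C, Env)]] of
        [] -> none;
        [First | Rest] -> {ok, lists:foldl(fun join/2, First, Rest)}
    end.

join(E1, E2) ->
    Vars = lists:usort(maps:keys(E1) ++ maps:keys(E2)),
    maps:from_list([{V, ordsets:union(maps:get(V, E1, universe()),
                                      maps:get(V, E2, universe()))}
                    || V <- Vars]).

%% Hand-written constraints for a two-clause function:
%%   foo(X) when is_integer(X) -> X + 1;
%%   foo(X) -> atom_to_list(X).
%% x stands for the argument, r for the result.
demo() ->
    Clause1 = {conj, [{sub, x, [integer]},
                      {sub, x, [integer, float]},
                      {sub, r, [integer]}]},
    Clause2 = {conj, [{sub, x, [atom]}, {sub, r, [list]}]},
    solve({disj, [Clause1, Clause2]}, #{}).
```

In this toy model, demo() yields x bound to atom or integer and r bound to integer or list. A conjunction that requires x to be both an atom and a number solves to none, which is how a function that can never return shows up.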


Some examples

a(X) -> X + 1.
%% a(integer() | float()) -> integer() | float().

b(X) when is_integer(X) -> a(X).
%% b(integer()) -> integer() | float().

c(X) ->
    case b(X) of
        42 -> foo;
        gazonk -> bar
    end.
%% c(integer()) -> foo.


More examples

foo(X) when is_integer(X) -> X + 1;
foo(X) -> atom_to_list(X).
%% foo(integer() | atom()) -> integer() | list().

gazonk(X) when is_atom(X) -> X + 42.
%% gazonk(none()) -> none()
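What the typing none() -> none() means at runtime can be seen by calling gazonk directly. This is a small sketch; the module name none_demo and the helper outcome/1 are assumptions added for illustration.

```erlang
-module(none_demo).
-export([gazonk/1, outcome/1]).

%% Same definition as on the slide: the guard admits only atoms,
%% but the body needs a number.
gazonk(X) when is_atom(X) -> X + 42.

%% Hypothetical helper: turns the result of gazonk/1 into a tagged term.
outcome(X) ->
    try gazonk(X) of
        V -> {value, V}
    catch
        error:Reason -> {error, Reason}
    end.
```

none_demo:outcome(a) returns {error, badarith} because the addition fails, and none_demo:outcome(1) returns {error, function_clause} because the guard rejects the argument. No argument makes the call return a value, so both the domain and the range are empty.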


A higher-order example

foo() ->
    F = fun (X) when is_integer(X) -> 54 end,
    h(F).
%% foo() -> none().

h(F) -> F(true) + 42.
%% h((any()) -> any()) -> number()


A slight disturbance...

length_1([]) -> 0;
length_1([_|T]) -> length_1(T) + 1.
%% length_1(list()) -> non_neg_integer()

length_2(L) -> length_3(L, 0).
%% length_2(list()) -> any()

length_3([], N) -> N;
length_3([_|T], N) -> length_3(T, N+1).
%% length_3(list(), any()) -> any()
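The any() in the typing of length_3 is not an imprecision of the analysis. It reflects calls that really succeed, as the sketch below shows. The module name length_demo is an assumption, and all three functions are exported here on purpose so that any caller can reach length_3/2.

```erlang
-module(length_demo).
-export([length_1/1, length_2/1, length_3/2]).

length_1([]) -> 0;
length_1([_|T]) -> length_1(T) + 1.

length_2(L) -> length_3(L, 0).

%% The accumulator is returned untouched for the empty list,
%% so any term at all is a valid second argument in that case.
length_3([], N) -> N;
length_3([_|T], N) -> length_3(T, N+1).
```

length_demo:length_3([], foo) returns foo, so the range of length_3 cannot be narrowed below any(). Since length_2 simply returns whatever length_3 returns, its range is any() as well, even though length_demo:length_2([a,b,c]) is always 3.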


Module system to the rescue

In Erlang, the module system cannot be bypassed
− Code resides in modules
− Modules have declared interfaces (exported functions)

Since the module system protects local functions from arbitrary use, we can collect the types of the parameters of all call sites of these functions

We can use this information to restrict the domains of module-local functions
− “refined success typings”


The length example revisited

-module(my_list_utils).
-export([length_2/1]).

%% length_2(list()) -> any().
%% length_3(list(), any()) -> any().

%% length_3(list(), non_neg_integer()) -> non_neg_integer().
%% length_2(list()) -> non_neg_integer().
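A complete module along these lines might look as follows. The function bodies are assumed to be the length_2/length_3 definitions from the earlier length example, since the slide shows only the module header and the typings.

```erlang
-module(my_list_utils).
-export([length_2/1]).

%% Only length_2/1 is exported, so its call to length_3(L, 0) is the
%% only way into length_3/2 from outside the module.
length_2(L) -> length_3(L, 0).

%% Every call site passes 0 or N+1 for a previous accumulator, so the
%% second argument is always a non-negative integer.
length_3([], N) -> N;
length_3([_|T], N) -> length_3(T, N+1).
```

Because length_3/2 is not exported, a call such as my_list_utils:length_3([], foo) fails with an undef error instead of returning foo. The analysis can therefore restrict the second argument to non_neg_integer() and propagate that to the result of length_2/1.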


Adding function specifications




-spec length_2(list()) -> non_neg_integer().


Adding contracts




-spec length_2(list(atom())) -> integer().


How Erlang modules used to look


How modern Erlang modules look

Using Static Analysis for Detecting Concurrency Defects


Concurrency

A method to better structure programs

A means to speed up their execution

A necessity these days??

The catch: Concurrent programming is harder and more error-prone than its sequential counterpart


Data race detection in Erlang

Erlang’s concurrency model is based on user-level processes that communicate via asynchronous message passing
− copying semantics (“shared-nothing”)

If there is nothing shared between processes, how can there be data races?

System built-ins allow processes to share data

Erlang currently provides no atomicity constructs


What is considered a data race?

A process reads some variable and then decides to take some write action based on the value of that variable

If it is possible for another process to succeed in changing the value stored in that variable in between the read and the action, in such a way that the action about to be taken is no longer appropriate, then we say that the program has a race condition


Data races in the process registry

proc_reg(Name) ->
    ...
    case whereis(Name) of
        undefined ->
            Pid = spawn(...),
            register(Name, Pid);
        Pid ->      % already
            ok      % registered
    end,
    ...
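One way to make such code tolerate the race is to treat a failing register/2 as "somebody else got there first". The following is only a sketch under assumptions: the module reg_demo, the body of the spawned process and the returned tuples are made up for illustration, while whereis/1 and register/2 are the standard built-ins (register/2 raises badarg when the name is already taken).

```erlang
-module(reg_demo).
-export([proc_reg/1]).

proc_reg(Name) ->
    case whereis(Name) of
        undefined ->
            %% Assumption: the worker just waits until it is told to stop.
            Pid = spawn(fun() -> receive stop -> ok end end),
            try register(Name, Pid) of
                true -> {registered, Pid}
            catch
                error:badarg ->
                    %% Name was registered by another process between
                    %% whereis/1 and register/2: discard our worker.
                    Pid ! stop,
                    {already_registered, whereis(Name)}
            end;
        Existing when is_pid(Existing) ->
            {already_registered, Existing}
    end.
```

The check-then-act window is still there, but losing the race no longer crashes the caller. Whether the loser should adopt the winner's process, retry, or fail is an application decision that this sketch does not settle.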


Data races in ETS

run() ->
    Tab = ets:new(some_tab_name, [public]),
    Inc = compute_inc(),
    Fun = fun () -> ets_inc(Tab, Inc) end,
    spawn_some_processes(Fun).

ets_inc(Tab, Inc) ->
    case ets:lookup(Tab, some_key) of
        [] ->
            ets:insert(Tab, {some_key, Inc});
        [{some_key, OldValue}] ->
            NewValue = OldValue + Inc,
            ets:insert(Tab, {some_key, NewValue})
    end.
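On current Erlang/OTP releases, the lookup-then-insert pair in ets_inc can be replaced by a single atomic operation, ets:update_counter/4. The sketch below assumes that replacement. The module ets_demo and the driver run/2, which stands in for the slide's compute_inc() and spawn_some_processes(Fun), are hypothetical.

```erlang
-module(ets_demo).
-export([run/2]).

%% Hypothetical driver: N processes each add Inc to the counter some_key,
%% and the final value is returned once all of them have finished.
run(N, Inc) ->
    Tab = ets:new(some_tab_name, [public]),
    Parent = self(),
    Pids = [spawn(fun() -> ets_inc(Tab, Inc), Parent ! {done, self()} end)
            || _ <- lists:seq(1, N)],
    [receive {done, Pid} -> ok end || Pid <- Pids],
    [{some_key, Total}] = ets:lookup(Tab, some_key),
    ets:delete(Tab),
    Total.

%% Atomic read-modify-write: if some_key is absent, {some_key, 0} is
%% inserted first and then incremented, all in one table operation.
ets_inc(Tab, Inc) ->
    ets:update_counter(Tab, some_key, Inc, {some_key, 0}).
```

With this version ets_demo:run(100, 2) always returns 200, whereas the lookup/insert version can lose updates when two processes read the same old value before either writes.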


Data races in mnesia

-export([table_func/2]).

table_func(...) ->
    create_time_stamp_table(),
    ...

create_time_stamp_table() ->
    Props = [{type, set}, ...],
    create_table(time_stamp, Props, ram_copies, false),
    NRef = case mnesia:dirty_read(time_stamp, ref_count) of
               [] -> 1;
               [#time_stamp{data = Ref}] -> Ref + 1
           end,
    mnesia:dirty_write(#time_stamp{data = NRef}).


Single-threaded Erlang

A single scheduler picks up processes from a single ready queue

The selected process gets assigned a number of reductions to execute

Each time the process does a function call, a reduction is consumed

A process gets suspended when the number of remaining reductions reaches zero, or when it gets stuck


Single-threaded Erlang

Being struck by lightning seems more likely!

proc_reg(Name) ->
    ...
    case whereis(Name) of
        undefined ->
            Pid = spawn(...),
            register(Name, Pid);
        Pid ->      % already
            ok      % registered
    end,
    ...


Multi-threaded Erlang

Since May 2006, a multi-threaded version of the system has been released, which is the default on multi-core architectures

There are multiple schedulers, each having its own ready queue

Since March 2009, the runtime system employs a redistribution scheme based on work stealing when some scheduler’s run queue becomes empty


Race analysis in Dialyzer

Characteristics:

Sound for either correctness or defect detection

Completely automatic

Fast and scalable

Smoothly integrated into dialyzer


The analysis: a three-step process

1. Collecting information

− Control-flow graphs of functions and closures

− Escape analysis

− Inter-modular call graph

− Sharing/alias analysis

− Fine-grained type information (singleton types)‏



2. Determining all code points with possible race conditions

− Find the root nodes in the forest of call graphs

− Traverse their CFGs using DFS

− Special cases:
  − Statically known function or closure calls
  − Unknown higher-order calls
  − Recursion



3. Filtering false alarms

− Variable sharing

− Type information

− Characteristics of race conditions

foo(Fun, N, M) ->
    ...
    case whereis(N) of
        undefined -> ..., Fun(M);
        Pid -> ...
    end,
    ...


Some optimizations

Control-flow graph minimization

Avoiding repeated traversals and benefiting from temporal locality

Making unknown function calls less unknown


Detecting data races

mod.erl

 1: proc_reg(Name) ->
 2:   ...
 3:   case whereis(Name) of
 4:     undefined ->
 5:       Pid = spawn(...),
 6:       register(Name, Pid);
 7:     Pid ->        % already
 8:       ok          % registered
 9:   end,
10:   ...

mod.erl:6: The call erlang:register(Name::atom(),Pid::pid()) might fail due to a possible race condition caused by its combination with the erlang:whereis(Name::atom()) in mod.erl on line 3


Effectiveness and Performance


Current status and impact

The race analysis has been publicly released as part of the latest Erlang/OTP distribution (mid November 2009)

From: Bernard Duggan (Erlang developer)

Sent on 27 November 2009

“Our Erlang codebase comprises 5 applications and a few little ancillary bits and pieces on the side – it's about 40k lines. So far it's turned up three race conditions. … Thanks for a brilliant tool.”


Race detection in Erlang (ICFP’09)

QuickCheck: A property-based testing tool

PULSE is a ProTest User Level Scheduler for Erlang that randomly schedules the test case processes and records a detailed trace

A race condition is a possibility of non-deterministic execution that can make a program fail to meet its specification


Current and Future Work

Extend dialyzer to detect:
− more kinds of race conditions
− more types of concurrency errors
− violations in the requirements of Erlang behaviours (concurrency design patterns)

Extend the language of -spec’s to specify concurrency properties and requirements

Generating tests from -spec’s

Gracias!


What are types good for?

Document programmers' intentions

Can be used to prove properties of programs
− e.g., type safety, ...

Often help the compiler generate better code by avoiding some checks during runtime

Detect some programmer errors


Erlang terms

Primitive terms:
− integers : 42, 1593405849584548049385
− floats : 2.56, 3.14
− atoms : foo, true
− binaries : <<0100100111011>>

Structured terms:
− tuples : {foo, 42}, {1, 2, 3}
− lists : [1, 2, 3.14]

Higher-order terms:
− funs : fun(X) when is_atom(X) -> X == a end


Refined success typings

Definition:

Let f be a function with success typing α → β. A refined success typing for f is a typing of the form α' → β', such that
− α' ⊆ α and β' ⊆ β, and
− for all p ∈ α' for which the application f(p) reduces to a value, f(p) ∈ β'.
