
Verification Based on Run-Time, Field-Data, and Beyond

Séverine Colin

Laboratoire d’Informatique (LIFC), Université de Franche-Comté-CNRS-INRIA

Leonardo Mariani

Dipartimento di Informatica, Sistemistica e Comunicazione (DISCo), Università di Milano Bicocca

Tope Omitola

Computer Laboratory, University of Cambridge, UK

2/36

Outline

Traditional Run-Time Verification Techniques

– checking properties on execution data at run-time

Test and Verification Techniques based on Field-Data

– gathering execution data to increase effectiveness of (off-line) test and verification techniques

Discussion on Test, Verification and Model-Checking

Conclusions

3/36

Run-Time Verification Techniques

Basic idea: extract an execution trace from a running program and analyze it to detect errors

It can check for classical error patterns (data races, deadlocks)

It can verify a program against a formal specification

4/36

Data races detection

Data race: two concurrent threads access a shared variable at the same time, and at least one of the accesses is a write

The Eraser tool detects data races dynamically

It enforces the discipline that every shared variable is protected by some lock

The Eraser algorithm is used by PathExplorer and Visual Threads
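The locking discipline above can be illustrated with a minimal lockset sketch. It is a simplification written for these notes, not Eraser's actual implementation: it keeps, for each shared variable, the set of locks held on every access so far and warns when that set becomes empty. The class and method names are hypothetical, and the refinements a real tool needs (for example, tolerating unlocked initialisation) are left out.

```python
class LocksetChecker:
    """Simplified lockset check: warn when no single lock protects a variable."""

    def __init__(self):
        self.held = {}        # thread id -> set of locks currently held
        self.candidates = {}  # variable -> locks held on every access so far
        self.warnings = []    # variables with an empty candidate set

    def acquire(self, thread, lock):
        self.held.setdefault(thread, set()).add(lock)

    def release(self, thread, lock):
        self.held.setdefault(thread, set()).discard(lock)

    def access(self, thread, var):
        locks = self.held.get(thread, set())
        if var not in self.candidates:
            # First access: every lock held now is a candidate protector.
            self.candidates[var] = set(locks)
        else:
            # Later accesses: keep only locks held on all accesses.
            self.candidates[var] &= locks
        if not self.candidates[var] and var not in self.warnings:
            self.warnings.append(var)


checker = LocksetChecker()
checker.acquire("t1", "m")
checker.access("t1", "x")      # x accessed under lock m
checker.release("t1", "m")
checker.acquire("t2", "n")
checker.access("t2", "x")      # x accessed under a different lock n
checker.release("t2", "n")
print(checker.warnings)
```

The check reports a variable even when the two accesses never actually overlapped in the observed run, which is the point of the discipline: it flags a missing common lock rather than waiting for the race to occur.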

5/36

Deadlock Detection

Deadlock: can occur whenever multiple shared resources are required to accomplish a task

A model of the program is constructed during the program execution

A deadlock corresponds to a cycle in the dependency graph of this model

This approach is used by Visual Threads and PathExplorer
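As a rough illustration of the cycle check, the sketch below builds a lock-order graph from a trace of acquire and release events and searches it for a cycle. The trace format, the function names and the choice of locks as graph nodes are assumptions made for this example; the tools named above may build their dependency graphs differently.

```python
def lock_order_edges(trace):
    """Return edges (held, acquired) from a well-formed (thread, op, lock) trace."""
    held = {}      # thread id -> list of locks currently held
    edges = set()
    for thread, op, lock in trace:
        locks = held.setdefault(thread, [])
        if op == "acquire":
            # Each lock already held must be ordered before the new one.
            for h in locks:
                edges.add((h, lock))
            locks.append(lock)
        elif op == "release":
            locks.remove(lock)
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:
                return True          # back edge: cycle found
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


trace = [
    ("t1", "acquire", "A"), ("t1", "acquire", "B"),
    ("t1", "release", "B"), ("t1", "release", "A"),
    ("t2", "acquire", "B"), ("t2", "acquire", "A"),
    ("t2", "release", "A"), ("t2", "release", "B"),
]
print(has_cycle(lock_order_edges(trace)))
```

In this trace the two threads never block each other, yet the graph contains the cycle A, B, A, so the potential deadlock is reported from a run in which it did not happen.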

6/36

Monitoring and Checking (MaC)

System requirements are formalised

A monitoring script is constructed:

– to instrument the code

– to establish a mapping from low-level information into high-level events

At run-time, generated events are monitored for compliance with the requirements specification

7/36

MaC: Events and Conditions

Events occur instantaneously during the system execution

Conditions are pieces of information that hold for a duration of time

Three-valued logic: true, false, undefined

PEDL (Primitive Event Definition Language): language for monitoring scripts

MEDL (Meta Event Definition Language): language for safety requirements
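The distinction between events and conditions, and the role of the undefined value, can be sketched as follows. This is not PEDL or MEDL syntax. It assumes Kleene-style three-valued connectives with Python's None standing for undefined, and it derives instantaneous start and end events from the sampled values of a condition; all names are hypothetical.

```python
def and3(a, b):
    """Three-valued conjunction; None stands for 'undefined'."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def not3(a):
    """Three-valued negation; undefined stays undefined."""
    return None if a is None else (not a)


def start_end_events(samples):
    """Turn (time, value) samples of a condition into instantaneous events.

    A 'start' event is emitted when the condition becomes true and an
    'end' event when it stops being true (false or undefined).
    """
    events = []
    previous = None
    for time, value in samples:
        if value is True and previous is not True:
            events.append(("start", time))
        if previous is True and value is not True:
            events.append(("end", time))
        previous = value
    return events


# A condition that is undefined at first, holds during [1, 3), then is false.
samples = [(0, None), (1, True), (2, True), (3, False)]
print(start_end_events(samples))
```

Treating a condition that is already true at the first sample as a start event is a design choice of this sketch, not something stated in the slides.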

8/36

PathExplorer (1/2)

An instrumentation module (using Jtrek): it emits relevant events

An interaction module: it sends events to the observer module

An observer module: it verifies the requirement specification

9/36

PathExplorer (2/2)

Requirements are written using past-time LTL (monitoring operators are added: ↑F, ↓F, [F,F)s, [F,F)w)

It uses the recursive nature of past-time temporal logic: the satisfaction relation for a formula can be calculated along the execution trace by looking only one step backwards (see our paper for the algorithm)
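The one-step-backwards idea can be sketched with a small evaluator. It is an illustration under assumptions, not the algorithm from the paper: formulas are nested tuples, states are sets of atomic propositions, only a handful of past-time operators are covered (the monitoring operators listed above are omitted), and "previously" at the first state is taken to mean the value in that same state.

```python
def subformulas(formula):
    """List subformulas so that children come before their parents."""
    ordered = []

    def walk(f):
        for child in f[1:]:
            if isinstance(child, tuple):
                walk(child)
        if f not in ordered:
            ordered.append(f)

    walk(formula)
    return ordered


def monitor(formula, trace):
    """Yield the truth value of a past-time formula at each state of the trace.

    Only the values computed for the previous state are kept, so memory use
    does not grow with the length of the trace.
    """
    subs = subformulas(formula)
    prev = None
    for state in trace:
        now = {}
        for f in subs:
            op = f[0]
            if op == "atom":
                now[f] = f[1] in state
            elif op == "not":
                now[f] = not now[f[1]]
            elif op == "and":
                now[f] = now[f[1]] and now[f[2]]
            elif op == "or":
                now[f] = now[f[1]] or now[f[2]]
            elif op == "prev":      # previously F
                now[f] = prev[f[1]] if prev is not None else now[f[1]]
            elif op == "once":      # F held at some point in the past
                now[f] = now[f[1]] or (prev is not None and prev[f])
            elif op == "always":    # F held at every point in the past
                now[f] = now[f[1]] and (prev is None or prev[f])
            elif op == "since":     # f[1] since f[2]
                now[f] = now[f[2]] or (now[f[1]] and prev is not None and prev[f])
            else:
                raise ValueError("unknown operator: %r" % (op,))
        yield now[formula]
        prev = now


# "A close is only allowed if an open has been seen at some point."
prop = ("or", ("not", ("atom", "close")), ("once", ("atom", "open")))
print(list(monitor(prop, [{"open"}, set(), {"close"}])))
```

Because each operator is defined from its value one step earlier, the monitor can run alongside the program and report a violation at the first state where the formula evaluates to false.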

10/36

T&V Techniques based on Field-data

Field-data: “run-time data collected from the field”

Why collect field data for Test and Verification?

– limited knowledge about the final system: e.g., software components are usually developed in isolation, assembled with third-party components and, finally, deployed in unknown environments

– uncertainty of the final environment: e.g., in ubiquitous computing, pervasive computing, mobile computing, and wireless networks, it is not possible to predict every possible situation in advance

– dynamic environments: e.g., with mobile code, self-adaptive systems and peer-to-peer systems, resources suddenly appear and disappear

11/36

Existing Approaches

Field-data has been collected for:

– Evaluating usability of an application (usability testing)

– Modelling usage of the system: which components, modules and functionalities are used?

– Learning properties of the implementation

– Modelling program faults: which failures have been recognized on the target system?

12/36

Evaluating Usability

Traditionally, data for usability testing has been gathered by running testing sessions

Novel approaches: silent data-gathering systems

– Automatic Navigability Testing System (ANTS) [Rod02]

– Web Variable Instrumented Program (Webvip) [VG]

– Gamma System [OLHL02]

13/36

Silent Data-Gathering Systems (1/2)

[Figure: architectures of ANTS and Webvip. Labels: client-side agent, server agent, ANTS server, data server, script, multimedia content, user’s actions, communication, session file upload]

14/36

Silent Data-Gathering Systems (2/2)

Gamma

[Figure from [OLHL02]]

15/36

Modelling Usage of the System (1/2)

For performing system-specific impact analysis

– Law and Rothermel’s impact analysis [LR03]

the program is instrumented to produce execution traces representing the procedure-level execution flow, e.g., MBrACDrErrrrx

the impacted set for procedure P is computed by selecting the procedures that are called by P and the procedures that are on the call stack when P returns

– Orso et al.’s impact analysis [OAH03]

entity-level instrumentation: an execution trace is a sequence of traversed entities

a change c on entity e potentially affects all entities of the traces containing e

the impact set is the intersection between the potentially affected entities and the result of a forward slice with the variables used in change c as slicing criterion
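To make the trace-based computation of Law and Rothermel concrete, here is a minimal sketch that follows the description on this slide rather than the full published algorithm. It assumes that in the example trace an upper-case letter is a procedure entry, `r` is a return and `x` is program exit; the function name is hypothetical.

```python
def impacted_set(trace, changed):
    """Impacted procedures for `changed`, following the slide's description.

    Collects the procedures entered while `changed` is on the call stack
    (called by it, directly or indirectly) and the procedures that are on
    the call stack at the moment `changed` returns.
    """
    stack = []
    impacted = set()
    active = 0  # activations of `changed` currently on the stack
    for symbol in trace:
        if symbol == "x":          # program exit
            break
        if symbol == "r":          # return from the procedure on top
            returned = stack.pop()
            if returned == changed:
                active -= 1
                impacted.update(stack)   # callers it returns into
        else:                      # procedure entry
            stack.append(symbol)
            if symbol == changed:
                active += 1
            if active > 0:
                impacted.add(symbol)
    return impacted


trace = list("MBrACDrErrrrx")
print(sorted(impacted_set(trace, "A")))
```

For the trace on the slide, a change to A is taken to affect A itself, the procedures C, D and E that run while A is active, and M, into which A returns.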

16/36

Modelling Usage of the System (2/2)

Information from impact analysis can be used in regression testing

– Orso et al.’s regression testing [OAH03]

entity-level instrumentation

test suite T’ is initialized with all test cases of the existing test suite T that traverse the change

T’ is augmented with test cases covering the uncovered impacted entities computed with Orso et al.’s impact analysis technique

test suite prioritization is performed by favouring test cases that cover more impacted entities

For increasing confidence in the program

– Pavlopoulou and Young’s perpetual testing [PY99]

normal executions are considered as tests

instrumentation measures statement coverage of uncovered blocks, even in the final environment

the program can be iteratively regenerated to reduce instrumentation
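The selection, augmentation and prioritization steps attributed to Orso et al. above can be sketched with plain set operations. This is a simplified reading of the slide, not the published technique: the forward slice is assumed to be supplied by some other tool, traces and test coverage are given as collections of entity names, and all function and parameter names are hypothetical.

```python
def impact_set(traces, changed_entity, forward_slice):
    """Entities sharing a field trace with the change, restricted to the slice."""
    affected = set()
    for trace in traces:
        if changed_entity in trace:
            affected.update(trace)
    return affected & set(forward_slice)


def select_and_prioritize(coverage, changed_entity, impacted):
    """Build T' from a mapping test name -> set of covered entities."""
    # Step 1: tests of T that traverse the change.
    selected = {t for t, ents in coverage.items() if changed_entity in ents}
    covered = set().union(*(coverage[t] for t in selected))
    # Step 2: add tests that cover impacted entities still uncovered.
    for test, ents in coverage.items():
        if test not in selected and (ents & impacted) - covered:
            selected.add(test)
            covered |= ents
    # Step 3: tests covering more impacted entities run first.
    return sorted(selected, key=lambda t: (-len(coverage[t] & impacted), t))


traces = [["a", "b", "c"], ["d", "e"], ["b", "f"]]
impacted = impact_set(traces, "b", {"b", "c", "f", "g"})
coverage = {
    "t1": {"a", "b"},
    "t2": {"c"},
    "t3": {"d"},
    "t4": {"b", "c"},
    "t5": {"f"},
}
print(select_and_prioritize(coverage, "b", impacted))
```

Ties in the ordering are broken by test name only to keep the output deterministic; a real prioritization would use a more meaningful secondary criterion.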

17/36

Learning Properties (1/2)

Automatic synthesis of properties/invariants

– Ernst et al.’s approach [ECGN01]

initially, a large set of invariants is assumed to hold over the monitored variables

each execution can falsify some invariants; falsified invariants are deleted

for each surviving invariant, the probability that it “randomly holds” is computed

if this probability is below a given threshold, the invariant is accepted

the synthesized properties are the set of accepted invariants

Automatic synthesis of programs

– Many approaches from machine learning, but they learn very simple functions

– Lau et al.’s approach [LDW03]

it is still simple, but it learns small computer programs based on accurate execution traces and programming constructs
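The falsification step of the invariant synthesis described above (Ernst et al.) can be sketched in a few lines. The candidate invariants, the variable names and the function name are hypothetical, and the statistical filter that discards invariants likely to hold by chance is deliberately left out.

```python
def surviving_invariants(candidates, executions):
    """Drop every candidate invariant that some observed execution falsifies.

    `candidates` maps a readable name to a predicate over a dict of
    monitored variable values; `executions` is a sequence of such dicts.
    """
    surviving = dict(candidates)
    for sample in executions:
        for name in list(surviving):
            if not surviving[name](sample):
                del surviving[name]   # falsified: delete it
    return set(surviving)


candidates = {
    "x >= 0": lambda v: v["x"] >= 0,
    "x <= y": lambda v: v["x"] <= v["y"],
    "x == y": lambda v: v["x"] == v["y"],
}
executions = [{"x": 1, "y": 2}, {"x": 0, "y": 0}]
print(sorted(surviving_invariants(candidates, executions)))
```

An invariant that survives is only consistent with the observed runs, which is why the approach adds the probabilistic acceptance test before reporting it.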

18/36

Learning Properties (2/2)

Synthesized properties, invariants and programs can be used to

– check the implementation with respect to the specification

– verify safety of updates (in terms of components’ replacements)

Ernst et al.’s approach has been used to verify the Pre-cond, Post-cond and Inv corresponding to implemented services when replacing components [ME03]

– derive test suites

– give the programmer confidence in the implementation

19/36

Test, Verification and Model-Checking (TVM)

Evolution of Testing, Model Checking, and Run-time Verification

Their advantages and disadvantages

Future research agenda

Conclusion

20/36

TVM

It started with “The Software Crisis” [NATO, 1968]

Led to calls for software “Engineering” [Bauer, 1968]

Focus on methodology for constructing software (e.g. Structured Programming [Dijkstra, 1969]; Chief Programmer Team [Harlan Mills @ IBM, 1973])

21/36

TVM

Higher level languages viewed as panacea (C, Java, ML, Meta-ML)

Buggy software was still being produced

Focus shifted to detecting and preventing mistakes during software construction: Testing

22/36

TVM - Testing

Two main approaches to Testing: Reliability Growth Modelling (RGM) and Random Testing

In RGM, the program is tested, fails, is corrected, and is tested again; this cycle repeats many times

The MTBF (Mean Time Between Failures) is entered into a mathematical model derived from previous experience

23/36

TVM - Testing

When the model indicates a very long MTBF, we stop testing, and ship product
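The stopping rule can be illustrated with a deliberately crude stand-in for a reliability growth model. It assumes that the MTBF is estimated as the mean of the most recent gaps between failures and that testing stops once this estimate reaches a target; the function name, the window size and the target are hypothetical and do not come from any particular RGM.

```python
def should_stop_testing(failure_times, target_mtbf, window=3):
    """Return True when the recent mean time between failures reaches the target.

    `failure_times` are the cumulative testing times at which failures
    were observed, in increasing order.
    """
    if len(failure_times) < window + 1:
        return False   # not enough failures to estimate anything
    gaps = [later - earlier
            for earlier, later in zip(failure_times, failure_times[1:])]
    recent = gaps[-window:]
    return sum(recent) / len(recent) >= target_mtbf


# Failures get rarer as bugs are corrected: gaps of 1, 2, 4, 8 and 16 hours.
print(should_stop_testing([0, 1, 3, 7, 15, 31], target_mtbf=9))
```

A real model would also extrapolate the trend from earlier projects, which is exactly the link to past development processes that the pitfalls question.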

Pitfalls of RGM:

– Very tenuous (weak) link between past development processes and the current one

– Correction of a bug can introduce new bugs, which reduces dependability, and

24/36

TVM - Testing

– Industrial practice found that you need extremely large amounts of failure-free testing, which makes RGM not cost-effective

Random Testing: test cases are selected randomly from a domain of possible inputs

Advantages of Random Testing over RGM:

– Random, therefore non-automatable, you are more likely to find errors, and

25/36

TVM - Testing

– Random testing draws on tools from information theory to analyse results

Pitfalls of Random Testing:

– The distribution of random test cases may not be the same as the real usage of the system

– Random testing takes no account of program size: a 10-line program is treated the same as a 10000-line program
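Selecting test cases at random from an input domain can be sketched as follows. The sketch assumes a finite input domain, a uniform distribution over it and an oracle that gives the expected result; the function names and the deliberately buggy program are hypothetical.

```python
import random


def random_test(program, oracle, domain, runs, seed=0):
    """Run `program` on inputs drawn uniformly from `domain`.

    Returns the inputs on which the program disagrees with the oracle.
    A fixed seed keeps the test run reproducible.
    """
    rng = random.Random(seed)
    failures = []
    for _ in range(runs):
        value = rng.choice(domain)
        if program(value) != oracle(value):
            failures.append(value)
    return failures


def buggy_abs(x):
    # Deliberate bug: negative inputs are returned unchanged.
    return x if x >= 0 else x


failures = random_test(buggy_abs, abs, list(range(-5, 6)), runs=200)
print(sorted(set(failures)))
```

The uniform draw is where the first pitfall above enters: if real users mostly supply non-negative values, the failure rate seen here says little about the failure rate in the field.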

26/36

TVM - Program Review

Buggy software was still being produced

Another panacea tried was Program Review (Software Inspection)

It depends on humans making the right decisions

It is vulnerable to human error

27/36

TVM - Program Proving (Theorem Provers)

Solution then became Formal Deductive Reasoning – Program Proving

Automated Theorem Provers (e.g. Isabelle [Camb]) were developed to prove programs

A main problem with theorem provers is the impracticality of proving all layers of the system from software programs to hardware to circuits

28/36

TVM - Model Checking

An alternative approach to theorem proving is model checking

In model checking, the specification of a system is expressed in temporal logic and the system is modelled as a finite graph of state transitions; a model checker then checks whether the graph satisfies the temporal logic specification

29/36

TVM - Model Checking

Advantages over theorem provers:

– Algorithmic: the user need only press a button and wait for the result, whereas with a theorem prover the user may need to direct the prover towards a solution

– Gives counterexamples if the formula is not satisfied
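The push-button character and the counterexample output can be illustrated with a minimal explicit-state check. It handles only the simplest kind of temporal property, an invariant that must hold in every reachable state, and it assumes a finite state graph given by an initial state and a successor function; the names are hypothetical and real model checkers support far richer logics.

```python
from collections import deque


def check_invariant(initial, successors, holds):
    """Breadth-first search of the state graph for a state violating `holds`.

    Returns None if the invariant holds in every reachable state, otherwise
    a shortest path from the initial state to a violating state.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not holds(state):
            path = []
            while state is not None:   # walk back to the initial state
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


# A counter that wraps around modulo 4; the property "counter is never 3" fails.
print(check_invariant(0, lambda s: [(s + 1) % 4], lambda s: s != 3))
```

The returned path is the counterexample: a concrete sequence of states that leads from the initial state to the violation.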

30/36

Model Checking

Disadvantages of model checking:

– Computational complexity, and

– Some information about the system is lost when you turn a system with an infinite number of states into one with a finite number

There are calls for Run-Time Verification of software

31/36

TVM - Run-Time Verification (RTV)

Some ideas of this were presented above.

Observations of some RTV tools:

– Simply debuggers with fancy features

– Or they provide good tracing mechanisms

Encouraging observations of RTV tools:

– Some use LTL (or extensions) to describe the program monitor

32/36

TVM - RTV

Some use LTL as the basis for a Property Specification Language, such as PEDL, MEDL

May be used as a basis for understanding and for theory

33/36

Call to Arms - Future Research Agenda

We need a Theory of Testing

Such a theory should integrate the good aspects of testing, model checking, and run-time verification

I shall mention some approaches (references in our paper)

34/36

Some Approaches to Theory of Testing

Type Systems/Abstract Interpretation

– Work from compiling and type systems directed towards optimisation of code can provide good information to direct the selection of test cases

– Polymorphism and linearity can help

Very little work so far on Semantics of Testing (encouraging work from this workshop)

35/36

Some Approaches to Theory of Testing

Developing semantic structures (e.g. of domain) that facilitate testing may be something to look at

Semantics of A.I. Planning to provide a basis for semantics of run-time verification (ref. in our paper)

Domain theory in concurrency to provide semantics for distributed system testing (ref. in paper)

36/36

Conclusions

Call to arms for theory builders and tool builders

Come up with good theories and better tools

Provide tools that software professionals can use to specify, design, build, test, audit and monitor systems

Let’s do it !!!
