unit system testing

7/29/2019 Unit System Testing

1/46

From Unit Testing to System Testing

(enhanced 6 + 4)

Course Software Testing & Verification

2011/12, Wishnu Prasetya


2/46

Plan

Discuss the more practical aspects of testing, e.g. unit testing, so that you can start your project.

Ch. 7 gives more background on the practical aspects, e.g. a concrete work-out of the V-model and outlines of a test plan. Read the chapter yourself!

We will work out some subjects in further detail: regression (6.1), mocks (6.2.1).

It seems proper to also discuss Ch. 4 (Input Partitioning) here.


3/46

Unit Testing

Obviously, postponing testing until we have a full system is a bad idea. So, test the units that build up the system.

Experience: the cost of finding the source of an error at the system-test level is much higher than at the unit-test level (some reports suggest more than 10x).

With some effort, the infrastructure you have for unit testing can also be used for integration and system testing.


4/46

But what is a unit?

In principle, you decide. Possibilities: function, method, or class.

But different types of units have their own characteristics, which may require different unit-testing approaches:

function: its behavior depends only on its parameters; it has no side effects.
procedure: depends on / affects its parameters and global variables.
method: static variables, parameters, instance variables.
class: a collection of interacting methods.


5/46

Unit testing in C#

You need at least Visual Studio Professional; for code coverage feedback you need at least Premium.

Check these tutorials/docs (Visual Studio 2010):

Walkthrough: Creating and Running Unit Tests
Walkthrough: Run Tests and View Code Coverage
Walkthrough: Using the Command-line Test Utility
API Reference for Testing Tools for Visual Studio, in particular Microsoft.VisualStudio.TestTools.UnitTesting, containing classes like Assert, CollectionAssert, ...

In this lecture we will just go through the concepts.


6/46

The structure of a test project

class Thermometer
  private double val
  private double scale
  private double offset
  public Thermometer(double s, double o)
  public double value()
  public double warmUp(double v)
  public double coolDown(double v)

A test project is just a project in your solution that contains your test classes.


7/46

The structure of a test project

A solution may contain multiple projects; it may thus contain multiple test projects.

A test project is used to group related test classes. You decide what "related" means; e.g. you may want to put all test-cases for the class Thermometer in its own test project.

A test class is used to group related test methods.

A test method does the actual testing work. It may encode a single test-case, or multiple test-cases. You decide.


8/46

Test Class and Test Method

[TestClass()]
public class ThermometerTest {

    private TestContext testContextInstance;

    //[ClassInitialize()]
    //public static void MyClassInitialize(...) ...

    //[ClassCleanup()]
    //public static void MyClassCleanup() ...

    //[TestInitialize()]
    //public void MyTestInitialize() ...

    //[TestCleanup()]
    //public void MyTestCleanup() ...

    [TestMethod()]
    public void valueTest1() ...

    [TestMethod()]
    public void valueTest2() ...
}

public void valueTest1() {
    Thermometer target = new Thermometer(1, 273.15);
    double expected = -273.15;
    double actual = target.value();
    Assert.AreEqual(expected, actual);
}

Be careful when comparing floating-point numbers; you may have to take imprecision into account, e.g. use this instead:

Assert.AreEqual(expected, actual, delta, ...)
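The effect of floating-point imprecision on an exact-equality oracle can be sketched as follows. The sketch is in Python rather than C#, and the helper name are_equal is a hypothetical stand-in for a delta-based assertion; it is not the MSTest API.

```python
import math


def are_equal(expected, actual, delta):
    """Hypothetical helper: true when actual lies within delta of expected."""
    return abs(expected - actual) <= delta


# 0.1 + 0.2 is mathematically 0.3, but not in binary floating point.
computed = 0.1 + 0.2

exact_match = (computed == 0.3)                             # exact comparison fails
tolerant_match = are_equal(0.3, computed, 1e-9)             # delta-based comparison
library_match = math.isclose(0.3, computed, abs_tol=1e-9)   # standard-library variant
```

The choice of delta is part of the oracle: too small and correct results are rejected, too large and real errors slip through.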


9/46

Test Oracle (some patterns will return later)

An oracle specifies your expectation on the program's responses.

public void valueTest1() {
    target = new Thermometer(1, 273.15);
    double expected = -273.15;
    double actual = target.value();
    Assert.AreEqual(expected, actual);
}

Full: Assert.IsTrue(actual == 273.15)
  More costly to maintain, e.g. if you change the intended behavior of the program.

Partial: Assert.IsTrue(actual > 0)

Property-based: Assert.IsTrue(safe(actual))
  Usually partial, but allows re-use of the predicate safe in other test-cases.


10/46

Discussion: propose test cases, in particular the oracles...

reverse(a) {
  N = a.length
  if (N ≤ 1) return
  for (int i = 0; i < N/2; i++)
    swap(a, i, N-1-i)
}

incomeTax(i) {
  if (i ≤ 18218) return 0.023 * i
  t = 419
  if (i ≤ 32738)
    return t + 0.138 * (i - 18218)
  t += 1568
  if (i ≤ 54367)
    return t + 0.42 * (i - 32738)
  ...
}

Property-based testing fits nicely for reverse, but not for incomeTax; for the latter we'll have to fall back to concrete-value oracles, which unfortunately tend to be more costly to maintain.
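One possible set of oracles for the two functions, sketched in Python. The two properties chosen for reverse (element i ends up at position N-1-i, and reversing twice gives back the original) and the concrete incomes picked for income_tax are assumptions made for illustration. The bracket above 54367 is not shown on the slide, so the sketch raises an error there.

```python
import math
import random


def reverse(a):
    """In-place reversal, transcribed from the pseudocode."""
    n = len(a)
    if n <= 1:
        return
    for i in range(n // 2):
        a[i], a[n - 1 - i] = a[n - 1 - i], a[i]


def check_reverse_properties(trials=200, seed=0):
    """Property-based oracle: no expected output is written by hand."""
    rng = random.Random(seed)
    for _ in range(trials):
        original = [rng.randint(-50, 50) for _ in range(rng.randint(0, 12))]
        a = list(original)
        reverse(a)
        n = len(a)
        assert all(a[i] == original[n - 1 - i] for i in range(n))
        reverse(a)
        assert a == original  # reversing twice is the identity
    return True


def income_tax(i):
    """Transcribed from the pseudocode; the last bracket is not in the source."""
    if i <= 18218:
        return 0.023 * i
    t = 419
    if i <= 32738:
        return t + 0.138 * (i - 18218)
    t += 1568
    if i <= 54367:
        return t + 0.42 * (i - 32738)
    raise NotImplementedError("bracket above 54367 is not shown in the source")


# Concrete-value oracles: each expected value was computed by hand and
# must be recomputed whenever a rate or a bracket boundary changes.
CONCRETE_CASES = [(10000, 230.0), (20000, 664.916), (40000, 5037.04)]


def check_income_tax():
    return all(math.isclose(income_tax(i), expected, abs_tol=1e-6)
               for i, expected in CONCRETE_CASES)
```

Note how the reverse oracle survives any reimplementation of reverse, whereas every entry of CONCRETE_CASES is tied to the current tax table.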


11/46

Discussion: the Oracle Problem (6.5)

Every test-case needs an oracle; how to construct it!? Always a big problem!

Using concrete values as oracles is often powerful, but potentially expensive to maintain.

Using properties, on the other hand, has the problem that it is often hard to write a complete yet simple property capturing correctness. E.g. how to express an oracle for a sorting function?

Alternatively we can fall back to redundancy-based testing.
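For the sorting question, a minimal Python sketch of both options follows. The function insertion_sort is a hypothetical implementation under test; sort_oracle is a property-based oracle, and redundancy_oracle compares against a second, trusted implementation (here Python's built-in sorted).

```python
import random
from collections import Counter


def insertion_sort(xs):
    """Hypothetical implementation under test."""
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def sort_oracle(inp, out):
    """Property-based oracle: the output is ordered AND a permutation of the input."""
    ordered = all(out[i] <= out[i + 1] for i in range(len(out) - 1))
    same_elements = Counter(inp) == Counter(out)
    return ordered and same_elements


def redundancy_oracle(inp, out):
    """Redundancy-based oracle: agree with an independent implementation."""
    return out == sorted(inp)


def check_sort(trials=200, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(0, 20) for _ in range(rng.randint(0, 15))]
        ys = insertion_sort(xs)
        assert sort_oracle(xs, ys)
        assert redundancy_oracle(xs, ys)
    return True
```

Checking only that the output is ordered would be an incomplete property: an output such as [1, 1] for input [2, 1] is ordered but wrong, which is why the permutation check is needed.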


12/46

Inspecting Test Result



13/46

Inspecting Coverage



14/46

Finding the source of an error: use a debugger!

Add breakpoints (BPs); execution is stopped at every BP.

You can proceed to the next BP, or execute one step at a time: step-into, step-over, step-out.

Visual Studio uses IntelliTrace logging, so you can even inspect previous BPs.


15/46

The Debug class

Debug.Print("therm. created")

Debug.Assert(t.scale() > 0) to check for properties that should hold in your program.

These calls will be removed if you build with debug off (for release).

Check the doc of System.Diagnostics.Debug.


16/46

When you are in a larger (multi-team) project...

You want to test your class Heater, but it uses Thermometer, which is non-existent or unstable. This would pollute your testing results! We can opt to use a mock Thermometer.

A mock of a program P:
  has the same interface as P
  only implements a very small subset of P's behavior
  is fully under your control

Analogously we have the concept of a mock object.

Make mocks yourself, e.g. by exploiting inheritance, or use a mocking tool.


17/46

Mocking with Moq

test1() {
  Heater heater = new Heater()
  var mock = new Mock<IThermometer>()
  mock.Setup(t => t.value()).Returns(303.00001)
  heater.thermometer = mock.Object
  heater.limit = 303.0
  heater.check()
  Assert.IsFalse(heater.active)
}

interface IThermometer
  double value()
  double warmUp(double v)

class Heater
  double limit
  bool active
  public check() {
    if (thermometer.value() >= limit)
      active = false
  }
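The same test idea can be sketched in Python with the standard library's unittest.mock instead of Moq. The Heater class below is a hypothetical transcription of the outline above; the constructor arguments and the initial value of active are assumptions.

```python
from unittest.mock import Mock


class Heater:
    """Hypothetical transcription of the Heater outline."""

    def __init__(self, thermometer, limit):
        self.thermometer = thermometer
        self.limit = limit
        self.active = True  # assumption: a heater starts switched on

    def check(self):
        if self.thermometer.value() >= self.limit:
            self.active = False


def make_heater(reading, limit=303.0):
    """Build a Heater whose thermometer is a mock that always returns reading."""
    mock_thermometer = Mock()
    mock_thermometer.value.return_value = reading
    return Heater(mock_thermometer, limit), mock_thermometer


def heater_switches_off_above_limit():
    heater, mock_thermometer = make_heater(303.00001)
    heater.check()
    mock_thermometer.value.assert_called_once_with()  # the mock records its calls
    return heater.active is False
```

Because the mock is fully under the test's control, the outcome no longer depends on whether a real Thermometer exists or is stable.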


18/46

Mocking with Moq (google it for more info!)

var mock = new Mock<IThermometer>()

mock.Setup(t => t.value()).Returns(303.00001)

mock.Setup(t => t.warmUp(0)).Returns(0)

mock.Setup(t => t.warmUp(It.IsInRange<double>(-10, 0, Range.Inclusive)))
    .Returns(0)

mock.Setup(t => t.warmUp(It.IsAny<double>()))
    .Returns((double s) => s + 273.15)

There are many more mock functionalities in Moq, but in general mocking can be tedious. E.g. what to do when your Heater wants to call warmUp in an iteration?


19/46

Discussion: testing a method vs testing a class

How do you want to test these?

that the method pop() will actually pop out the top element of a non-empty stack
that the class Stack is always in a consistent state after any sequence of pushes and pops

Stack<T>
  private Object[] content
  private int top
  public Stack()
  push(T x)
  T pop()


20/46

Testing a class (postponed: more in-depth discussion on OO testing)

It makes sense to first test each method individually.

Many classes have methods that interact with each other (e.g. as in Stack). At the class level we want to test that these interactions are safe.

Express this with a class invariant, e.g.

0 ≤ top < content.length
for all x in content, typeOf(x) is T

There is nothing that prevents you from reusing the unit testing infrastructure to do class testing!


21/46

To be more concrete...

Stack<T>
  private Object[] content
  private int top
  public Stack()
  push(T x)
  T pop()

bool classinv() {
  return
    0 ≤ top
    && top < content.length
    ...
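A minimal Python sketch of class testing with an invariant, assuming an array-backed stack in which top is the index of the next free slot and the array grows on demand. These implementation details, and the restriction of the type check to occupied slots, are assumptions, not the slide's code. The test drives random sequences of push and pop and re-checks classinv after every call.

```python
import random


class Stack:
    """Array-backed stack of elements of one type (hypothetical sketch)."""

    def __init__(self, elem_type, capacity=4):
        self.elem_type = elem_type
        self.content = [None] * capacity
        self.top = 0  # index of the next free slot

    def push(self, x):
        self.content[self.top] = x
        self.top += 1
        if self.top == len(self.content):  # grow so that top stays in range
            self.content.extend([None] * len(self.content))

    def pop(self):
        if self.top == 0:
            raise IndexError("pop from empty stack")
        self.top -= 1
        x = self.content[self.top]
        self.content[self.top] = None
        return x

    def classinv(self):
        # 0 <= top < content.length, and every stored element has type T.
        return (0 <= self.top < len(self.content)
                and all(isinstance(x, self.elem_type)
                        for x in self.content[:self.top]))


def check_invariant_over_random_sequences(trials=100, seed=1):
    rng = random.Random(seed)
    for _ in range(trials):
        s = Stack(int)
        assert s.classinv()
        for _ in range(rng.randint(0, 30)):
            if s.top > 0 and rng.random() < 0.4:
                s.pop()
            else:
                s.push(rng.randint(0, 99))
            assert s.classinv()  # must hold after every operation
    return True
```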


22/46

But that was a familiar idiom!

An Abstract Data Type (ADT) is a model of a (stateful) data structure. The data structure is modeled abstractly by only describing a set of operations (without exposing the actual state).

The semantics is described in terms of logical properties (also called the ADT's axioms) over those operations.

This generalizes the concept of class invariant.

It can be used to model how you want to test a bigger unit, e.g. a subsystem or even a system.


23/46

Example: stack

Stack<T>
  bool isEmpty()
  push(T x)
  T pop()

Stack axioms:

For all x, s:
  s.push(x) ; y = s.pop() ;
  assert (y == x)

For all x and s:
  s.push(x) ;
  assert (s is not empty)

For all x and empty s:
  s.push(x) ; s.pop() ;
  assert (s is empty)

Depending on the offered operations, it may be hard or not possible to get a complete axiomatization.

For some ADTs, not all sequences of operations are allowed. How to express this? (next slide)

24/46

A finite state machine (FSM) can be used to express valid sequences of the operations (2.5.1, 2.5.2)

[Figure: two FSMs. File ADT, with operations open(), write(), close(); Stack, with operations isEmpty(), push(x), pop().]

Relevant concepts to apply: node coverage, edge coverage, edge-pair coverage.

FSMs can also come from your UML models.
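A minimal Python sketch of an FSM for the File ADT. The states and transitions are assumptions, since the original diagram is not available: a file starts closed, open() moves it to opened, write() keeps it opened, and close() moves it back to closed. The helper uncovered_edges reports which transitions a set of operation sequences has not exercised, which is what edge coverage asks about.

```python
# Assumed transition table: (state, operation) -> next state.
FILE_FSM = {
    ("closed", "open"): "opened",
    ("opened", "write"): "opened",
    ("opened", "close"): "closed",
}


def run(fsm, start, ops):
    """Return the final state, or None if the sequence is not allowed."""
    state = start
    for op in ops:
        key = (state, op)
        if key not in fsm:
            return None
        state = fsm[key]
    return state


def uncovered_edges(fsm, start, sequences):
    """Edges of the FSM that none of the given sequences passes through."""
    covered = set()
    for ops in sequences:
        state = start
        for op in ops:
            key = (state, op)
            if key not in fsm:
                break
            covered.add(key)
            state = fsm[key]
    return set(fsm) - covered
```

A test suite reaches edge coverage on this model exactly when uncovered_edges returns the empty set.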


25/46

Suggesting the following design pattern for your System Under Test (SUT)

[Figure: the Test Suite accesses the SUT through Test Interface-1 (an ADT) and Test Interface-2 (an ADT).]

The ADT abstractly models the SUT. We use the ADT's list of logical properties as the specifications for the SUT.

Multiple ADTs can be used if we want to model multiple aspects of the SUT separately.

But you'll have to invest in building and specifying these ADTs! They don't drop out of the sky.


26/46

Discussion: propose test-cases

save(f,o) saves the object o in a file named f. It throws an exception X if o is not serializable, and exception Y if the IO operation fails.

(Def 1.26) White box testing: common at the unit-testing level.

(Def 1.25) Black box testing: common at the system-testing level.

Positive test: test the program on its normal parameter range.

But can we afford to assume that the program is always called in its normal range? Else do a negative test: test the program beyond its normal range.


27/46

Partitioning the inputs

Based on your best understanding of save's semantics.

save(String fname, Object o)

fname : (A) existing file
        (B) non-existing file

o :     (P) null
        (Q) non-null serializable
        (R) non-null non-serializable

Terminology: characteristic, block. The domain of a characteristic is divided into disjoint blocks; the union of these blocks must cover the entire domain of the characteristic.

Assumption: all values of the same block are equally good.


28/46

So, what input values to choose?

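(C4.23, ALL) All combinations must be tested.

$|T| = \prod_{i=0}^{k-1} B_i$, where $k$ is the number of characteristics and $B_i$ is the number of blocks of characteristic $i$.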
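For the save example, the ALL criterion can be sketched in Python with itertools.product. The block names A, B, P, Q, R are those of the partitioning of fname and o; the function name all_combinations is hypothetical.

```python
from itertools import product

FNAME_BLOCKS = ["A", "B"]        # existing file, non-existing file
OBJECT_BLOCKS = ["P", "Q", "R"]  # null, serializable, non-serializable


def all_combinations(*characteristics):
    """One abstract test per combination of blocks (the ALL criterion)."""
    return list(product(*characteristics))


ALL_TESTS = all_combinations(FNAME_BLOCKS, OBJECT_BLOCKS)
```

With 2 blocks for fname and 3 for o this yields 2 × 3 = 6 abstract tests; the count grows multiplicatively with every added characteristic, which is why weaker criteria are of interest.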


29/46

t-wise coverage

(C4.25, pair-wise coverage) Each pair of blocks (from different characteristics) must be tested.

(C4.26, t-wise coverage) Generalization of pair-wise.

Obviously stronger than EACH CHOICE, and still scalable.

Problem: we just blindly combine; no semantic awareness.

Example with three characteristics, with blocks {A, B}, {P, Q}, and {X, Y}:

T : (A,P,X), (A,Q,Y), ... more?
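To answer the "more?" question mechanically, the following Python sketch computes which pairs of blocks a test set still misses. The function name uncovered_pairs is hypothetical, and the two extra tests used to complete T are one possible choice, not the only one.

```python
from itertools import combinations, product


def uncovered_pairs(characteristics, tests):
    """Pairs of blocks from different characteristics that no test combines."""
    required = set()
    for (i, blocks_i), (j, blocks_j) in combinations(enumerate(characteristics), 2):
        for b_i, b_j in product(blocks_i, blocks_j):
            required.add(((i, b_i), (j, b_j)))
    covered = set()
    for t in tests:
        for i, j in combinations(range(len(t)), 2):
            covered.add(((i, t[i]), (j, t[j])))
    return required - covered


CHARACTERISTICS = [["A", "B"], ["P", "Q"], ["X", "Y"]]
T_PARTIAL = [("A", "P", "X"), ("A", "Q", "Y")]
T_PAIRWISE = T_PARTIAL + [("B", "P", "Y"), ("B", "Q", "X")]
```

Here T_PARTIAL leaves 6 of the 12 required pairs uncovered, while the 4 tests of T_PAIRWISE cover all of them, compared to 8 tests for ALL.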


30/46

Adding a bit of semantics

(C4.27, Base Choice Coverage, BCC) Decide a single base test t0. Make more tests by each time replacing one block of t0 with each of the remaining blocks of the same characteristic.

$|T| = 1 + \sum_{i=0}^{k-1} (B_i - 1)$, where $k$ is the number of characteristics and $B_i$ is the number of blocks of characteristic $i$.
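A minimal Python sketch of Base Choice Coverage for the save example. Choosing (A, Q), i.e. an existing file and a non-null serializable object, as the base test is an assumption made for illustration; the function name base_choice_tests is hypothetical.

```python
def base_choice_tests(characteristics, base):
    """The base test, plus every test that differs from it in exactly one block."""
    tests = [tuple(base)]
    for i, blocks in enumerate(characteristics):
        for block in blocks:
            if block != base[i]:
                variant = list(base)
                variant[i] = block  # swap one block, keep the rest of the base
                tests.append(tuple(variant))
    return tests


SAVE_CHARACTERISTICS = [["A", "B"], ["P", "Q", "R"]]
BCC_TESTS = base_choice_tests(SAVE_CHARACTERISTICS, ("A", "Q"))
```

The result has 1 + (2 - 1) + (3 - 1) = 4 tests, and the semantic knowledge sits entirely in the choice of the base test.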


31/46

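Or, more bits of semantics

(C4.28, Multiple Base Choices) For each characteristic we decide at least one base block. Then decide a set of base tests; each only includes base blocks. For each base test, generate more tests by each time removing one base block, and forming combinations with the remaining non-base blocks.

$|T| \le M + \sum_{i=0}^{k-1} M \cdot (B_i - m_i)$, where $M$ is the number of base tests, $k$ the number of characteristics, $B_i$ the number of blocks of characteristic $i$, and $m_i$ the number of base blocks of characteristic $i$.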


32/46

Constraints, to exclude nonsensical cases

Example:
  the combination (A,P) is not allowed.
  if P is selected, then X must also be selected.

General problem: given a coverage criterion C and a set of constraints, find a test set T satisfying both.

Solvable example: pair-wise coverage + (A,P,Y) is not allowed.

May not be solvable, e.g. pair-wise coverage + (A,P) is not allowed.

In general the problem is not trivial to solve.


33/46

Overview of partition-based coverage

[Figure: diagram relating the criteria ALL, t-Wise, Pair-Wise, Multiple Base Choice Coverage, Base Choice Coverage, and EACH CHOICE.]


34/46

How much does it cost me?

In my experience, a unit-test set with 100% coverage is almost always as big as the method you test, if not bigger.

The test sets are valuable assets! You want to reuse them when you modify your program in the future.

However, maintenance is a major problem:
  if your test sets cannot be exposed to refactoring tools;
  concrete oracles are costly to adjust.

Are you going to pay this cost!?


35/46

Recall: Basic Problems in Testing

How to produce a strong enough test-set?

What is "enough"?

Can we automate?
  the execution of the test-set (related to testability)
  the construction of the test-set

Can we reproduce the result of the test-set?

How to find the source of an error?


36/46

What do we want to automate?

1. The execution of the test-cases.

2. Construction of the test-inputs and steps in the test-cases.
   Successful use: QuickCheck.
   Massive use is useless without oracles, so only use it with property-based testing.

3. Construction of the oracles.

Notes:

Even level-1 can already be a problem in practice. A system is testable if it exposes an interface you can program against. This is a requirement for level-1.

Level-2 can easily be automated by e.g. a random generation approach (QuickCheck is a famous example of this). It is often quite productive, but in general producing a strong level-2 is an undecidable problem. Massively generating test-inputs is useless unless we have a way to construct oracles. However, since level-3 is problematic, level-2 is typically used in conjunction with property-based testing.

Level-3 is even harder. Very few tools can do it (e.g. Daikon); even then they can only infer partial oracles. The problem is in general undecidable.


37/46

Regression Test

To test that a new modification in your program does not break old functionalities. To be efficient, people typically reuse existing test sets.

Usually applied for system testing, where the problem is considered more urgent. Challenge: very time consuming (hours/days!).

For unit testing: there has been research on applying it continuously as you edit your program; see it as extended type-checking. The result was positive!

(Saff & Ernst, An experimental evaluation of continuous testing during development. ISSTA 2004)


38/46

Some concepts first...

Test Selection Problem: suppose P has been modified to P'. Let T be the test set used on P. Choose a subset T' ⊆ T to test P'.

Obviously, exclude obsolete test cases: those that can't execute or whose oracles no longer reflect the semantics of P'. Let's assume we can identify them.

You want the selection to be safe: T' includes all test-cases that will execute differently on P'.

Only attractive if the cost of calculating T' plus executing T' is less than simply re-executing the whole T.


39/46

Idea: select those that pass through modified code

If m is the only method in P that changes, the obvious strategy is to select only test-cases that pass through m.

Better: only select test-cases that pass m's modified branch.

m(x) { if (d) y = x+1 else y = 0 }     (original)

m(x) { if (d) y = x-1 else y = 0 }     (modified)
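Both strategies can be sketched in Python, assuming that coverage data was recorded when the tests last ran on the original program. The coverage table, the test names and the item names such as "m:then" are hypothetical.

```python
# Hypothetical coverage recorded on the original P:
# test name -> set of covered methods and branches.
COVERAGE = {
    "t1": {"m", "m:then"},   # ran m with d true
    "t2": {"m", "m:else"},   # ran m with d false
    "t3": {"other"},         # never entered m
}


def select_tests(coverage, modified):
    """Select the tests that passed through at least one modified item."""
    return sorted(test for test, items in coverage.items() if items & modified)


BY_METHOD = select_tests(COVERAGE, {"m"})        # the obvious strategy
BY_BRANCH = select_tests(COVERAGE, {"m:then"})   # only the modified branch
```

Selecting by branch drops t2, which executes m but never reaches the changed statement y = x-1.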


40/46

Corner case

The first if is modified by removing an else-branch.

Using the previous strategy means that we have to select all test-cases that pass m. Yet we see that the paths [d, e, stmt] and [d, e] are present in both the old and the new m, so there is actually no need to select them.

m(x) { if (d) y = x+1 ;
       if (e) stmt }

m(x) { if (d) {
         y = x+1 ;
         if (e) stmt ;
         u = 0 }
       else if (e) stmt
}


41/46

Looking at it abstractly with CFG

m(x) { if (d) y = x+1 ;
       if (e) stmt }

m(x) { if (d) {
         y = x+1 ;
         if (e) stmt ;
         u = 0 }
       else if (e) stmt }

[Figure: the control-flow graphs of the two versions. One has nodes d, y=x+1, e, stmt, u=0, e, stmt, end; the other has nodes d, y=x+1, e, stmt, end.]

Notice that [d, e, stmt, end] and [d, e, end] appear in both, and are equivalent.


42/46

Some concepts

We assume: P is deterministic, so each test-case always generates the same test path.

Let p and p' be the test-paths of a test-case t when executed on P and P'; t is modification traversing if not (p ~ p'). Let's select the modification traversing test-cases.

p ~ p' if they have the same length, and for each i, p_i and p'_i contain the same sequence of instructions.

So far this is not helpful, because such a selection strategy requires us to first execute t on P'. Then it is not attractive anymore!


43/46

Intersection Graph

Let G and G' be the CFGs of P and P', and G'' their intersection graph.

First, extend the CFGs so that branches are labelled by the corresponding decision value (e.g. T/F for an if-branch). Label non-branch edges with some constant value.

Each node of G'' is a pair (u,u'). Then G'' is defined like this:

The pair of initial nodes (s0,s0') is in G''.

If (u,u') is in G'', and u ~ u', and u→v is an edge in G, and u'→v' an edge in G', both with the same label, then (u,u')→(v,v') should be an edge in G''.

[Figure: G has nodes a, b, c, end; G' has nodes a, b, c1, c2, d, end; G'' = G ∩ G' has nodes (a,a), (b,b), (c,c1), (c,c2), (end,end), (end,d).]


44/46

Intersection Graph

Each path p in G'' describes how a path in G would be executed on G' if the same decisions are taken along the way. Note that this is calculated without re-executing any test-case on P'.

Any path in G'' ends either in a proper exit node (green), or in a pair (u,u') where not u ~ u' (red). This would be the first time a test-case hits modified code when re-executed on P'.

[Figure: G has nodes a, b, c, end; G' has nodes a, b, c1, c2, d, end; G'' = G ∩ G' has nodes (a,a), (b,b), (c,c1), (c,c2), (end,end), (end,d).]


45/46

Selection Algorithm

(Safe but not minimalistic) Select test-cases that pass a node u in G that is part of a red node in G''. Same problem as before: it will also select [a,c,end], which is not modification traversing.

(Rothermel-Harold, 1997) Select test-cases that pass an edge e in G that in G'' leads to a red node. Actually the same problem.

[Figure: G has nodes a, b, c, end; G' has nodes a, b, c1, c2, d, end; G'' = G ∩ G' has nodes (a,a), (b,b), (c,c1), (c,c2), (end,end), (end,d).]


46/46

Selection Algorithm

(Ball algorithm, 1998) Partition the nodes of G'' into those that can reach a green node (G-partition), and those that cannot (NG-partition). Look at edges in G'' that cross these partitions (so, from G to NG).

A test path p is modification traversing if and only if it passes through a crossing edge (as meant above). Use this as the selection criterion.

[Figure: G has nodes a, b, c, end; G' has nodes a, b, c1, c2, d, end; G'' = G ∩ G' has nodes (a,a), (b,b), (c,c1), (c,c2), (end,end), (end,d), divided into a G-partition and an NG-partition.]
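The construction of the intersection graph and the crossing-edge criterion can be sketched in Python as follows. This is a simplified illustration, not Ball's published algorithm: the CFG encoding, the function names, and the example graphs (the earlier m(x) with y = x+1 changed to y = x-1, rather than the lost figure) are all assumptions. A test is represented by the sequence of edge labels it followed on the original program.

```python
def intersection_graph(g, g_new, start, start_new):
    """Build G'' from two labelled CFGs.

    Each CFG maps a node to (instruction, {edge label: successor}).
    The result maps a pair (u, u') to {edge label: (v, v')}.
    """
    edges = {}
    todo = [(start, start_new)]
    while todo:
        pair = todo.pop()
        if pair in edges:
            continue
        edges[pair] = {}
        (instr, succ), (instr_new, succ_new) = g[pair[0]], g_new[pair[1]]
        if instr != instr_new:
            continue  # red node: the first point where the versions differ
        for label, v in succ.items():
            if label in succ_new:
                target = (v, succ_new[label])
                edges[pair][label] = target
                todo.append(target)
    return edges


def green_nodes(edges, g, g_new):
    """Pairs of equivalent exit nodes."""
    return {(u, u2) for (u, u2) in edges
            if g[u][0] == g_new[u2][0] and not g[u][1] and not g_new[u2][1]}


def can_reach_green(edges, green):
    """The G-partition: nodes of G'' from which a green node is reachable."""
    good = set(green)
    changed = True
    while changed:
        changed = False
        for pair, out in edges.items():
            if pair not in good and any(t in good for t in out.values()):
                good.add(pair)
                changed = True
    return good


def is_modification_traversing(edges, good, start_pair, labels):
    """True if the path crosses an edge from the G- to the NG-partition."""
    pair = start_pair
    for label in labels:
        nxt = edges[pair].get(label)
        if nxt is None:
            return False
        if pair in good and nxt not in good:
            return True
        pair = nxt
    return False


# Assumed example: original and modified version of m(x).
G_OLD = {"d": ("if d", {"T": "then", "F": "else"}),
         "then": ("y = x+1", {"-": "end"}),
         "else": ("y = 0", {"-": "end"}),
         "end": ("end", {})}
G_NEW = dict(G_OLD, then=("y = x-1", {"-": "end"}))

G_BOTH = intersection_graph(G_OLD, G_NEW, "d", "d")
GOOD = can_reach_green(G_BOTH, green_nodes(G_BOTH, G_OLD, G_NEW))
```

Everything here is computed from the two graphs and the paths recorded on the original program; no test is re-executed on the modified version to make the selection.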
