Model-Based Testing: Why, What, How
DESCRIPTION
Juniper Networks Ignite! Testing Conference. Sunnyvale, California, November 9, 2011. Overview of model-based testing. Two case studies. Thumbnail introduction to fee and free MBT tools.
TRANSCRIPT
Model-Based Testing:
Why, What, How
Bob Binder, System Verification Associates
Juniper Systems Testing Conference
November 9, 2011
Overview
• What is Model-Based Testing?
• Testing Economics
• Case Studies – Automated Derivatives Trading
– Microsoft Protocol Interoperability
• Product Thumbnails
• Real Testers of …
• Q&A
Why?
• For Juniper:
– Reduce cost of testing
– Reduce time to market
– Reduce cost of quality
– Increase competitive advantage
• For you:
– Focus on System Under Test (SUT), not test hassles
– Engineering discipline with rigorous foundation
– Enhanced effectiveness and prestige
– Future of testing
WHAT IS MODEL-BASED TESTING?
“All Testing is Model-Based”
• Patterns for test design
– Methods
– Classes
– Package and System Integration
– Regression
– Test Automation
– Oracles
• 35 patterns, each a test meta-model
What is a Test Model?
[Figure: UML statechart test models. TwoPlayerGame: from α, TwoPlayerGame( ) enters Game Started; p1_Start( ) / simulateVolley( ) and p2_Start( ) / simulateVolley( ) lead to Player 1 Served and Player 2 Served. Transitions have the form event [guard] / action, e.g., p1_WinsVolley( ) [this.p1_Score( ) < 20] / this.p1AddPoint( ) simulateVolley( ) loops on Player 1 Served, while p1_WinsVolley( ) [this.p1_Score( ) == 20] / this.p1AddPoint( ) enters Player 1 Won; p1_IsWinner( ) / return TRUE; and ~( ) lead to ω (and symmetrically for player 2). ThreePlayerGame extends this machine: from α, ThreePlayerGame( ) / TwoPlayerGame( ), plus Player 3 Served and Player 3 Won with the corresponding p3_Start( ), p3_WinsVolley( ), and p3_IsWinner( ) transitions.]
Mode Machine test design pattern
[Figure: SUT design model beside the test model. Class diagram: TwoPlayerGame with +TwoPlayerGame( ), +p1_Start( ), +p1_WinsVolley( ), -p1_AddPoint( ), +p1_IsWinner( ), +p1_IsServer( ), +p1_Points( ), +p2_Start( ), +p2_WinsVolley( ), -p2_AddPoint( ), +p2_IsWinner( ), +p2_IsServer( ), +p2_Points( ), +~( ); ThreePlayerGame (subclass) with +ThreePlayerGame( ), +p3_Start( ), +p3_WinsVolley( ), -p3_AddPoint( ), +p3_IsWinner( ), +p3_IsServer( ), +p3_Points( ), +~( ). The test model is the statechart shown above.]
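A transition table is one convenient encoding of that statechart for a generator to walk. A minimal Java sketch, not from the deck: the Transition record and table layout are illustrative, while the state, event, guard, and action names follow the diagram.

```java
import java.util.List;

// Sketch: the ThreePlayerGame statechart as an explicit transition table.
// Names follow the diagram; the encoding itself is illustrative.
public class ThreePlayerGameModel {

    // One statechart edge: event [guard] / action, from one state to another.
    public record Transition(String from, String event, String guard,
                             String action, String to) {}

    public static final List<Transition> TRANSITIONS = List.of(
        new Transition("alpha", "ThreePlayerGame( )", null, "TwoPlayerGame( )", "Game Started"),
        new Transition("Game Started", "p1_Start( )", null, "simulateVolley( )", "Player 1 Served"),
        new Transition("Game Started", "p2_Start( )", null, "simulateVolley( )", "Player 2 Served"),
        new Transition("Game Started", "p3_Start( )", null, "simulateVolley( )", "Player 3 Served"),
        new Transition("Player 1 Served", "p1_WinsVolley( )", "this.p1_Score( ) < 20",
                       "this.p1AddPoint( ) simulateVolley( )", "Player 1 Served"),
        new Transition("Player 1 Served", "p1_WinsVolley( )", "this.p1_Score( ) == 20",
                       "this.p1AddPoint( )", "Player 1 Won"),
        new Transition("Player 1 Served", "p2_WinsVolley( )", null, "simulateVolley( )", "Player 2 Served"),
        new Transition("Player 1 Served", "p3_WinsVolley( )", null, "simulateVolley( )", "Player 3 Served"),
        new Transition("Player 1 Won", "p1_IsWinner( )", null, "return TRUE;", "Player 1 Won"),
        new Transition("Player 1 Won", "~( )", null, null, "omega")
        // ... rows for Player 2/3 Served and Won follow the same pattern.
    );
}
```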
Model-based Test Suite
[Figure: transition tree for the N+ test suite, from alpha through Game Started and the Served/Won states to omega. Numbered transitions: 1 ThreePlayerGame( ), 2 p1_Start( ), 3 p2_Start( ), 4 p3_Start( ), 5 p1_WinsVolley( ), 6 p1_WinsVolley( ) [this.p1_Score( ) < 20], 7 p1_WinsVolley( ) [this.p1_Score( ) == 20], 8 p2_WinsVolley( ), 9 p2_WinsVolley( ) [this.p2_Score( ) < 20], 10 p2_WinsVolley( ) [this.p2_Score( ) == 20], 11 p3_WinsVolley( ), 12 p3_WinsVolley( ) [this.p3_Score( ) < 20], 13 p3_WinsVolley( ) [this.p3_Score( ) == 20], 14 p1_IsWinner( ), 15 p2_IsWinner( ), 16 p3_IsWinner( ), 17 ~( ).]
• N+ Strategy
– Start at α
– Follow a transition path
– Stop at ω or at an already-visited state
– Three loop iterations
– Assumes a state observer
– Try all sneak paths
The result is the N+ test suite; one generated path is sketched below.
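For example, a path through transitions 1, 2, a run of 6, then 7, 14, 17 (create the game, player 1 serves, volleys won up to the guard boundary, win the game, check the winner, destroy) would come out roughly as the JUnit sketch below. The SUT method names follow the class diagram; the getState( ) observer is the state observer the strategy assumes and is hypothetical here.

```java
import static org.junit.Assert.*;
import org.junit.Test;

// Sketch of one generated N+ test: alpha -> Game Started -> Player 1 Served
// (loop on volleys won) -> Player 1 Won -> omega. getState( ) is an assumed
// state observer; the other methods come from the class diagram.
public class ThreePlayerGamePathTest {

    @Test
    public void player1WinsFromServe() {
        ThreePlayerGame game = new ThreePlayerGame();       // 1  ThreePlayerGame( )
        game.p1_Start();                                    // 2  p1_Start( )
        assertEquals("Player 1 Served", game.getState());

        while (game.p1_Points() < 20) {                     // 6  p1_WinsVolley( ) [score < 20]
            game.p1_WinsVolley();
            assertEquals("Player 1 Served", game.getState());
        }
        game.p1_WinsVolley();                               // 7  p1_WinsVolley( ) [score == 20]
        assertEquals("Player 1 Won", game.getState());

        assertTrue(game.p1_IsWinner());                     // 14 p1_IsWinner( )
        // 17 ~( ): the model's destructor; left to garbage collection in Java.
    }
}
```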
Automated Model-based Testing
• Software that represents an SUT so that test inputs and expected results can be computed
– Useful abstraction of SUT aspects
– Algorithmic test input generation
– Algorithmic expected result generation
– Many possible data structures and algorithms
• SUT interface for control and observation
– Abstraction critical
– Generated and/or hand-coded
(A toy input generator is sketched below.)
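As a toy illustration of algorithmic input generation, the following self-contained sketch does a seeded random walk over a simplified transition map, emitting the event sequence as test input and the end state as the expected result. The map is a deliberately cut-down fragment, not the full game model, and the code has no relation to any particular tool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Toy generator: seeded random walk over a state-transition map, emitting the
// event sequence (test input) and the final state (expected result).
public class RandomWalkGenerator {

    // state -> list of {event, next state}; a cut-down, illustrative model
    static final Map<String, List<String[]>> MODEL = Map.of(
        "GameStarted", List.of(new String[]{"p1_Start", "P1Served"},
                               new String[]{"p2_Start", "P2Served"}),
        "P1Served",    List.of(new String[]{"p1_WinsVolley", "P1Served"},
                               new String[]{"p2_WinsVolley", "P2Served"}),
        "P2Served",    List.of(new String[]{"p1_WinsVolley", "P1Served"},
                               new String[]{"~", "omega"})
    );

    public static void main(String[] args) {
        Random rng = new Random(42);                    // fixed seed: reproducible
        String state = "GameStarted";
        List<String> inputs = new ArrayList<>();
        while (MODEL.containsKey(state) && inputs.size() < 10) {
            List<String[]> outgoing = MODEL.get(state);
            String[] t = outgoing.get(rng.nextInt(outgoing.size()));
            inputs.add(t[0]);                           // the event is the test input
            state = t[1];
        }
        System.out.println("inputs = " + inputs + ", expected end state = " + state);
    }
}
```

Real tools replace the random walk with coverage-directed traversal or constraint solving, but the division of labor is the same: the model supplies both the stimulus and the expectation.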
How MBT Improves Quality
[Figure: the MBT feedback loop. Requirements drive development of the SUT and of the model; requirements defects (ambiguous, missing, contradictory, incorrect, obscured, incomplete) and model errors or omissions surface here. The model generates inputs (test sequences) that control the SUT, and expected outputs (the test oracle); observed SUT behavior is evaluated against the oracle, revealing SUT bugs (missing, incorrect). Byproducts: coverage of requirements, model, and code, plus a reliability estimate.]
Stobie et al., © 2010 Microsoft. Adapted with permission.
Typical Test Configuration
[Figure: the test suite runs on a test suite host (OS plus transport); an adapter connects it over the transport to the system under test (OS plus transport), where a second adapter and a control agent drive and observe the SUT.]
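The adapter is where the suite's abstract events meet a concrete transport. A minimal Java sketch of that boundary; the interface, its method names, and the loopback stand-in are invented for illustration.

```java
// Sketch of the adapter boundary: the test suite speaks abstract model events;
// the adapter maps them onto concrete SUT stimuli and back. Illustrative only.
public interface TestAdapter {
    void apply(String abstractEvent);   // model event -> concrete stimulus on the wire
    String observe();                   // concrete SUT response -> abstract result
}

// Trivial in-memory stand-in, handy for debugging the suite itself
// before pointing it at a real transport.
class LoopbackAdapter implements TestAdapter {
    private String last = "none";
    @Override public void apply(String abstractEvent) { last = "echo:" + abstractEvent; }
    @Override public String observe() { return last; }
}
```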
Typical MBT Environment
[Figure: the test configuration above embedded in a development environment. An MBT tool and code stack sit on the test host alongside the test suite; the environment connects a requirements DB, design DB, bug DB, test manager, and configuration management.]
TESTING ECONOMICS
Show Me the Money
How much of this … for one of these?
Testing by Poking Around
Manual "exploratory" testing of the System Under Test
+ No tooling costs
+ No testware costs
+ Quick start
+ Opportunistic
+ Qualitative feedback
– Subjective, wide variation
– Low coverage
– Not repeatable
– Can't scale
– Inconsistent
Manual Testing
Manual test design, manual test input, and manual test results evaluation, plus test setup, against the System Under Test
+ Flexible, no SUT coupling
+ Systematic coverage
+ No tooling costs
+ No testware costs
+ Usage validation
– 1 test per hour
– Usually not repeatable
– Not scalable
– Inconsistent
– Tends toward "sunny day" tests
Hand-coded Test Driver
Manual test design plus test driver programming against the System Under Test
+ 10+ tests per hour
+ Repeatable
+ Predictable
+ Consistent
+ Supports Continuous Integration and TDD
– Tooling costs
– Testware costs
– Brittle, high maintenance cost
– Short half-life
– Technology focus
Model-based Testing
Modeling with automated generation; automated setup and execution against the System Under Test
+ 1,000+ tests per hour
+ Maintain the model, not testware
+ Intellectual control
+ Explore complex spaces
+ Consistent coverage
– Tooling costs
– Training costs
– Paradigm shift
– Still need some manual and hand-coded tests
Test Automation Envelope
[Chart: reliability (effectiveness), from 1 to 5 "nines", plotted against productivity in tests per hour (efficiency), from 1 to 10,000 on a log scale. Manual testing occupies the low end of both axes, automated drivers the middle, and model-based testing the high end.]
CASE STUDY: REAL TIME DERIVATIVES TRADING
Real Time Derivatives Trading
• “Screen-based trading” over private network
– 3 million transactions per hour
– 15 billion dollars per day
• Six development increments
– 3 years
– 3 to 5 months per iteration
– Testing cycle shadows dev increments
• QA staff test productivity
– One test per hour
System Under Test
• Unified process
• About 90 use-cases, 650 KLOC Java
• CORBA/IDL distributed object model
• HA Sun server farm
• Multi-host Oracle DBMS
• Many interfaces
– GUI (trading floor)
– Many high-speed program trading users
– Many legacy inputs/outputs
MBT: Challenges and Solutions
• One-time sampling is not effective, but fresh test suites are too expensive → simulator generates a fresh, accurate sample on demand
• Too expensive to develop expected results → oracle generates expected results on demand
• Too many test cases to evaluate → comparator automates checking
• Profile/requirements change → incremental changes to the rule base
• SUT interfaces change → common agent interface
Test Input Generation
• Simulation of users
– Use-case profile: 50 KLOC Prolog
• Load profile: time-domain variation
– Orthogonal to event profile
• Each generated event assigned a "port" and submit time
• 1,000 to 750,000 unique tests for a 4-hour session
(A rough sketch follows the charts below.)
[Charts: generated events per second versus time in seconds over the session, on a scale up to about 3,500 events per second; and event counts per category on a log scale from 1 to 10,000,000 across 12 categories.]
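A rough, self-contained Java sketch of the idea: every generated event is assigned a port and a submit time, with per-second counts drawn from a time-varying profile. The profile shape, rates, and all names are invented; the actual generator was the 50 KLOC Prolog simulator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch: give every generated event a "port" and a submit time so that a
// replayed session follows a time-varying load profile. Illustrative only.
public class LoadProfileGenerator {

    record Event(String type, int port, double submitTimeSec) {}

    // Invented profile: ramp up for 30 minutes, then decay over a 4-hour session.
    static double eventsPerSecond(double t) {
        double peak = 200.0, rampEnd = 1800.0, tau = 7200.0;
        return (t < rampEnd) ? peak * t / rampEnd
                             : peak * Math.exp(-(t - rampEnd) / tau);
    }

    public static void main(String[] args) {
        Random rng = new Random(7);
        List<Event> sample = new ArrayList<>();
        long total = 0;
        for (double t = 0; t < 14_400.0; t += 1.0) {            // one slot per second
            long n = Math.round(eventsPerSecond(t));            // events due this second
            for (long i = 0; i < n; i++, total++) {
                Event e = new Event("order",                    // type from use-case profile
                                    rng.nextInt(64),            // port: which agent submits
                                    t + rng.nextDouble());      // submit time within the slot
                if (sample.size() < 5) sample.add(e);           // keep a few for display
            }
        }
        System.out.println(total + " events; first few: " + sample);
    }
}
```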
Automated Evaluation
• Oracle
– Processes all test inputs
– About 500 unique rules
– Generates end of session “book”
• Comparator
– Compares SUT “book” to oracle “book”
• Verification
– “Splainer” rule backtracking
– Rule/Run coverage analyzer
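The comparator's job is mechanical: diff the SUT's end-of-session book against the oracle's. A minimal sketch, treating each book as a map from position key to quantity; the field names and record layout are invented, and the real books were far messier.

```java
import java.util.*;

// Sketch: compare the SUT's end-of-session "book" with the oracle's.
// Books are simplified to key -> value maps; names are illustrative.
public class BookComparator {

    // Returns human-readable discrepancies; an empty list means the run passed.
    static List<String> compare(Map<String, Long> oracleBook, Map<String, Long> sutBook) {
        List<String> diffs = new ArrayList<>();
        Set<String> keys = new TreeSet<>(oracleBook.keySet());
        keys.addAll(sutBook.keySet());
        for (String k : keys) {
            Long expected = oracleBook.get(k), actual = sutBook.get(k);
            if (!Objects.equals(expected, actual)) {
                diffs.add(k + ": expected " + expected + ", got " + actual);
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        Map<String, Long> oracle = Map.of("ACME:pos", 100L, "WIDG:pos", -20L);
        Map<String, Long> sut    = Map.of("ACME:pos", 100L, "WIDG:pos", -25L);
        compare(oracle, sut).forEach(System.out::println);  // WIDG:pos: expected -20, got -25
    }
}
```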
Test Harness
[Figure: the simulator feeds generated inputs through adapters to the SUT and to the oracle; the comparator checks the SUT's results against the oracle's to produce test verdicts and run reports, and the "Splainer" supports rule backtracking.]
Technical Achievements
• AI-based user simulation generates test suites
• All inputs generated under operational profile
• High volume oracle and evaluation
• Every test run unique and realistic (about 200 runs)
• Evaluated functionality and load response with fresh tests
• Effective control of many different test agents (COTS/custom; Java, 4Test, Perl, SQL, proprietary)
Technical Problems
• Stamp coupling
– Simulator, Agents, Oracle, Comparator
• Refactoring rule relationships; Prolog limitations
• Configuration hassles
• Scale-up constraints
• Distributed schedule brittleness
• Horn Clause Shock Syndrome
Results
• Revealed about 1,500 bugs over two years
– 5% showstoppers
• Five person team, huge productivity increase
– 1 test per hour versus 1,800 tests per hour
• Achieved proven high reliability
– Last pre-release test run: 500,000 events in two hours, no failures detected
– No production failures
• Abandoned by successor QA staff
CASE STUDY: MICROSOFT PROTOCOL INTEROPERABILITY
Challenges
• Prove interoperability to Federal Judge and court-appointed scrutineers
• Validation of documentation, not as-built implementation
• Is each TD all a third party needs to develop:
– A client that interoperates with an existing service?
– A service that interoperates with existing clients?
• Only use over-the-wire messages
Microsoft Protocols
• All product groups
– Windows Server
– Office
– Exchange
– SQL Server
– Others
• 500+ protocols
– Remote Desktop
– Active Directory
– File System
– Security
– Many others
• Remote API for a service
Microsoft Technical Document (TD)
• Publish protocols as “Technical Documents”
• One TD for each protocol
• Black-box spec – no internals
• All data and behavior specified with text
Published Technical Docs
http://msdn.microsoft.com/en-us/library/cc216513(PROT.10).aspx
Validating Interoperability with MBT
[Figure: the Technical Document is analyzed into data and behavior statements, which become a test requirements specification. Modeling approximates a third-party implementation and validates consistency with the actual Windows implementation. The model-based test suite is executed against Windows Server 2008, 2003, and 2000, with model assertions generating and checking the responses of the actual Windows services.]
Stobie et al., © 2010 Microsoft. Adapted with permission.
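To make "model assertions generate and check responses" concrete: a generic Java sketch of one such assertion, a postcondition that any observed wire response must satisfy. The message shape, field names, and the rule itself are invented; real TDs specify exact message formats.

```java
// Generic sketch of a model assertion over an observed wire response. The
// message shape and rule are invented; a real TD specifies exact formats.
public class ResponseAssertion {

    record Response(int status, long handle) {}   // abstracted over-the-wire reply

    // Model postcondition: a successful "open" must return status 0 and a
    // nonzero handle. A violation points at the TD or at the implementation.
    static void checkOpenResponse(Response actual) {
        if (actual.status() != 0)
            throw new AssertionError("TD requires status 0, observed " + actual.status());
        if (actual.handle() == 0)
            throw new AssertionError("TD requires a nonzero handle on success");
    }

    public static void main(String[] args) {
        checkOpenResponse(new Response(0, 0x1234L));   // passes silently
    }
}
```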
Protocol Quality Assurance Process
[Figure: review-gated process run by authors, test suite developers, and reviewers as the TD moves from v1 through v2 to vn:
Study (scrutinize TD; define test strategy), then Review (TD ready? Strategy OK?)
Plan (complete test requirements; high-level test plan), then Review (Test requirements OK? Plan OK?)
Design (complete model; complete adapters), then Review (Model OK? Adapters OK?)
Final (generate and run test suite; prepare user documentation), then Review (Coverage OK? Test code OK?)]
Productivity
“On average, model-based testing took 42% less time than hand-coding tests”
Average hours per test requirement (Grieskamp et al.):

Task                          Hours
Document review                 1.1
Test requirement extraction     0.8
Model authoring                 0.5
Traditional test coding         0.6
Adapter coding                  1.2
Test case execution             0.6
Final adjustments               0.3
Total, all phases               5.1

Threshold result:
• Nearly all requirements had fewer than three tests
• Much greater gain for full coverage
Results
• Published 500+ TDs, ~150,000 test requirements
• 50,000+ bugs, most identified before tests were run
• Many Plugfests, many 3rd party users
• Released high interest test suites as open source
• Met all regulator requirements, on time
– Judge closed the DOJ antitrust case on May 12, 2011
• ~20 MSFT product teams now using Spec Explorer
TOOL THUMBNAILS
All product or company names mentioned herein may be trademarks or registered trademarks of their respective owners.
CertifyIt (Smartesting)
Model: Use cases, OCL; custom test stereotypes; keyword/action abstraction
Notation: UML 2, OCL, custom stereotypes, UML Testing Profile
UML support: Yes
Requirements traceability: Interface to DOORS, HP QC, others
Generation: Constraint solver selects a minimal set of boundary values
Oracle: Postconditions in OCL; computed result for each test point
Adapter: Natural-language option; HP GUI drivers
Typical SUT: Financial, smart card
Notable: Top-down formally defined behavior; data stores; GUI model
Conformiq Designer
Model: State machines with coded events/actions
Notation: Statecharts, Java
UML support: Yes
Requirements traceability: Integrated requirements, traceability matrix
Generation: Graph traversal (state, transition, 2-switch coverage)
Oracle: Model postconditions; any custom function
Adapter: Output formatter; TTCN and user-defined
Typical SUT: Telecom, embedded
Notable: Timers; parallelism and concurrency; on-the-fly mode
MaTeLo (All4Tec)
Model: State machine with transition probabilities (Markov); data domains, event timing
Notation: Decorated state machine
UML support: No
Requirements traceability: Integrated requirements and trace matrix; import from DOORS, others
Generation: Most likely path, user-defined, all transitions, Markov simulation; subset or full model
Oracle: User conditions; Matlab and Simulink
Adapter: EXAM mappers; Python output formatter
Typical SUT: Hardware-in-the-loop; automotive, rail
Notable: Many standards-based device interfaces; supports software reliability engineering
Automatic Test Generation (IBM/Rational)
Model: Sequence diagrams, flow charts, statecharts, codebase
Notation: UML, SysML, UML Testing Profile
UML support: Yes
Requirements traceability: DOORS integration; design model traceability
Generation: Parses generated C++ to generate test cases; reaches states, transitions, operations, and events for modeled classes
Oracle: User code
Adapter: User code, merge generation
Typical SUT: Embedded
Notable: Part of a systems engineering tool chain
Spec Explorer (Microsoft)
Model: C# classes with "action" method pre/postconditions; regular expressions define a "machine" of classes/actions
Notation: C#
UML support: Sequence diagrams
Requirements traceability: API for logging user-defined requirements
Generation: For any machine, a constraint solver finds feasible short or long paths of actions; generates C# runtime
Oracle: Action postconditions; any custom function
Adapter: User code
Typical SUT: Microsoft protocols, APIs, products
Notable: Pairwise data selection; on-the-fly mode; use any .NET capability
T-VEC/RAVE (T-VEC)
Model: Boolean system with data boundaries; SCR types and modules; hierarchic modules
Notation: SCR-based tabular definition; accepts Simulink
UML support: No
Requirements traceability: RAVE requirements management; interface to DOORS, others
Generation: Constraint solver identifies test points
Oracle: Solves constraints for expected values
Adapter: Output formatter; HTML, C++, Java, Perl, others
Typical SUT: Aerospace, DoD
Notable: Simulink for input, oracle, model checking; MC/DC model coverage; nonlinear and real-valued constraints
Close Cousins
• Data Generators
– Grammar based
– Pairwise, combinatoric
– Fuzzers
• TTCN-3 Compilers
• Load Generators
• Model Checkers
• Model-driven Development tool chains
REAL TESTERS OF …
MBT User Survey
• Part of 1st Model-based Testing User Conference
– Offered to many other tester communities
• In progress
• Preliminary analysis of responses to date
• https://www.surveymonkey.com/s/JSJVDJW
MBT Users, SUT Domain
[Bar chart, 0% to 40% of respondents: Transaction Processing, Embedded, Software Infrastructure, Communications, Supercomputing, Other, Social Media, Gaming]
MBT Users, Company Size
[Bar chart, 0% to 35% of respondents, by number of employees: 1-10, 11-100, 101-500, 501-1000, 1001-10,000, 10,000+]
MBT Users, Software Process
[Bar chart, 0% to 25% of respondents: Agile, CMMI level 2+, XP/TDD, Incremental, Spiral, Waterfall, Ad Hoc, Other]
How Used?
What stage of adoption? [Bar chart, 0% to 60%: Routine use, Rollout, Pilot Project, Evaluation]
Who is the tool provider? [Bar chart, 0% to 80%: Commercial, Open Source, In House]
What is the Overall MBT Role?
At what scope is MBT used? [Bar chart, 0% to 80%: Unit, Component, System]
What is the overall test effort for each testing mode? [Bar chart, 25% to 40%: Model-based, Hand-coded, Manual]
How Long to be Proficient?
[Bar chart, 0% to 50% of respondents, by hours: 1-40, 80-120, 160+]
Median: 100 hours of training/use to become proficient
How Bad are Common Problems?
[Bar chart; each problem rated "worse than expected", "not an issue", or "better than expected":]
• Model "blows up"
• Too difficult to update model
• Oracle ineffective
• Developing test models is too difficult
• Inadequate coverage
• Developing SUT interfaces too hard
• Can't integrate with other test assets
• Misses bugs
MBT Effect on Time, Cost, Quality?
Percent change from baseline (e.g., 35% fewer escaped bugs, 0% more bugs):

                        Better   Worse
Bugs escaped              35%      0%
Overall testing costs     28%     23%
Overall testing time      36%     18%
MBT Traction
Overall, how effective is MBT?
• Extremely: 42%
• Moderately: 42%
• Slightly: 13%
• No effect: 4%
How likely are you to continue using MBT?
• Extremely: 38%
• Very: 38%
• Moderately: 21%
• Slightly: 4%
• Not at all: 0%
CONCLUSIONS
What Have We Learned?
• Test engineering with rigorous foundation
• Global best practice
• Broad applicability
• Mature commercial offerings
• Many proof points
• Commitment and planning necessary
• 10x to 1,000x improvement possible
Q & A
Image Credits
Unless noted below, all content herein Copyright © Robert V. Binder, 2011.
• Pensive Boy: Resource Rack, http://sites.google.com/site/resourcerack/mental
• Isoquant Chart: MA Economics Blog, http://ma-economics.blogspot.com/2011/09/optimum-factor-combination.html
• Derivatives Trading Floor: Money Mavens, http://medillmoneymavens.com/2009/02/11/cboe-and-cbot-a-story-in-two-floors/
• Barrett Pettyman US Federal Courthouse: Earth in Pictures, http://www.earthinpictures.com/world/usa/washington,_d.c./e._barrett_prettyman_united_states_courthouse.html
• Server Room: 1U Server Rack, http://1userverrack.net/2011/05/03/server-room-4/
• Utility Knife: Marketing Tenerife, http://marketingtenerife.com/marketing-tools-in-tenerife/
• Software Tester: IT Career Coach, http://www.it-career-coach.net/2010/02/14/the-job-of-software-testing-quality-assurance-career
• Conclusion: European Network and Information Security Agency (ENISA), http://www.enisa.europa.eu/media/news-items/summary-of-summer-school/image/image_view_fullscreen