Example Experiments in Software Engineering
Experiments in SE: Some Examples
Dr Atul Gupta, [email protected]
Some Real Experiments

1. Effectiveness of three testing techniques (Basili '87)
   - Factors and alternatives: testing technique (CR, F, S); programs (3); subject experience (3)
   - Response variables: fault detection effectiveness, fault detection time, fault detection rate
   - Design: 3-factor fractional factorial (3x3x3)
   - Data analysis: ANOVA
   - Results: discussed later

2. Assessing the effectiveness of PBR at NASA (Basili '96)
   - Factors and alternatives: inspection technique (PBR, usual); documents (NASA, generic)
   - Response variables: defects identified
   - Design: 2-factor block design (2x2)
   - Data analysis: ANOVA
   - Results: defect rate G1 = G2; defect rate PBR = usual; defect rate NASA = generic

3. Comparing flowchart and pseudocode (Scanlan '89)
   - Factors and alternatives: comprehension aid (flowchart, pseudocode); program complexity (L, M, H)
   - Response variables: % questions answered, number of errors made, subject confidence
   - Design: 2-factor nested design
   - Data analysis: t-test
   - Results: flowchart better than pseudocode on % questions answered, number of errors made, and subject confidence

4. Comparing OO and structured design (Briand '97)
   - Factors and alternatives: design (OO, structured); design quality (good, bad)
   - Response variables: % questions answered, % modifications, modification rate
   - Design: 2-factor nested
   - Data analysis: ANOVA
   - Results: good OO > bad OO; bad structured = bad OO; good OO = good structured

5. Comparing three inspection techniques (Porter '95)
   - Factors and alternatives: inspection technique (ad hoc, checklist, scenarios); SRS documents (A, B)
   - Response variables: defects identified
   - Design: 2-factor (3x2)
   - Data analysis: ANOVA
   - Results: scenarios > checklist = ad hoc
Experiment #1: Comparing the Effectiveness of Software Testing Strategies (Basili '87)
- Code reading by stepwise abstraction
- Functional testing using equivalence partitioning and boundary value analysis
- Structural testing (100 percent statement coverage criterion)

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1702179&tag=1
Experimental Setup
Independent variables:
- Testing technique (code reading, functional testing, structural testing)
- Program type
- Level of expertise

Dependent variables:
- Fault detection effectiveness (by fault type)
- Total fault detection time
- Fault detection rate

Subjects: 32 professionals and 42 students (junior, intermediate, advanced)
Test programs: 3 programs with natural and seeded faults
Fault types:
- Omission vs. commission
- Initialization, computation, control, interface, data, and cosmetic

Experimental design: fractional factorial (3x3x3)

       Code Reading   Functional Test   Structural Test
       P1  P2  P3     P1  P2  P3        P1  P2  P3
G1     X   -   -      -   X   -         -   -   X
G2     -   X   -      -   -   X         X   -   -
G3     -   -   X      X   -   -         -   X   -
Experimental Execution
- Three phases: the first two at the University of Maryland (1982, 1983), the third at Computer Sciences Corporation and NASA (1984)
- Training sessions, three testing sessions, and follow-up sessions
- Data analysis was done using box plots and ANOVA
Results: Fault Detection Effectiveness (FDE)
Major results of the comparison of fault detection effectiveness:
- In phase 3, code reading detected a greater number and percentage of faults than the other methods.
- In phase 1, code reading and functional testing were equally effective, while structural testing was inferior to both; in phase 2 there was no difference among the three techniques.
- The number of faults observed depended on the type of software: most were detected in the data abstraction program (P3), followed by the plotter (P1), and fewest in the database maintainer (P2).
- Functionally generated test data revealed more observable faults than structurally generated test data in phase 1, but not in phase 3.
- Junior and intermediate subjects were equally effective in finding faults, whereas advanced subjects found more faults.
- Self-estimates of faults detected were most accurate from subjects applying code reading, followed by those doing structural testing; the estimates from subjects doing functional testing showed no relation to actual detection.
Results: Fault Detection Cost
Major results of the comparison of fault detection costs:
- In phase 3, code reading had a higher fault detection rate than the other methods, with no difference between functional and structural testing.
- In phases 1 and 2, the three techniques did not differ in fault detection rate.
- In phases 2 and 3, total detection effort did not differ among the techniques; in phase 1, less effort was spent on structural testing than on the other techniques, while code reading and functional testing did not differ.
- Fault detection rate and total detection effort depended on the type of software:
  - The abstract data type had the highest detection rate and lowest total detection effort.
  - The plotter and database maintainer had the lowest detection rate and highest total detection effort.
- In phases 2 and 3, subjects across expertise levels did not differ in fault detection rate or total detection time; in phase 1, intermediate subjects had a higher detection rate.
- There was a moderate correlation between fault detection rate and years of professional experience across all subjects.
Results: Fault Types
Major results of the comparison of the classes of faults detected:
- Code reading and functional testing both detected more omission faults and initialization faults than did structural testing.
- Code reading detected more interface faults than did the other methods.
- Functional testing found more control faults than did the other methods.
- Code reading detected more computational faults than did structural testing.
- Functional and structural testing did not differ in any class of faults that were observable but not reported.
Conclusions
- With the professional programmers: faults identified, CR > F > S; fault detection rate, CR > F = S.
- In one University of Maryland subject group, code reading and functional testing did not differ in faults found, but both were superior to structural testing; in the other UoM subject group there was no difference among the techniques.
- With the UoM subjects, the fault detection rate was CR = F = S.
- The number of faults observed, the fault detection rate, and the total detection effort depended on the programs.
- Code reading detected more interface faults than the other methods.
- Functional testing detected more control faults than the other methods.
- When asked to estimate the percentage of faults detected, code readers gave the most accurate estimates, while functional testers gave the least accurate ones.
Experiment #2: An Experimental Comparison of the Effectiveness and Efficiency of Control Flow Based Testing Approaches on Seeded Faults
An Experiment: Evaluating Block, Branch, and Predicate Coverage Criteria
- Block coverage: A block is a set of sequential statements with no flow of control in between, neither inward nor outward. Complete block coverage requires that every such block in the program be exercised at least once in the test executions.
- Branch coverage: An evaluation point in the code may produce one of two outcomes, true or false, each of which represents a branch. Complete branch coverage requires that every such branch be exercised at least once in the test executions.
- Predicate coverage (or condition coverage): A predicate is a simple atomic condition in a logical expression. Complete predicate coverage requires that every such simple condition evaluate to both TRUE and FALSE at least once in the test executions.
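As a concrete illustration, here is a minimal sketch in Java (a hypothetical method, not one of the study programs) with its blocks, branches, and predicates marked in comments:

```java
// Hypothetical example, not from the study: coverage targets in a small method.
public class CoverageDemo {
    static int classify(int x, int y) {
        int result = 0;               // Block 1: entry block
        if (x > 0 && y > 0) {         // Decision with two predicates: (x > 0), (y > 0)
            result = 1;               // Block 2: true branch
        } else {
            result = -1;              // Block 3: false branch
        }
        return result;                // Block 4: exit block
    }

    public static void main(String[] args) {
        System.out.println(classify(1, 1));   // covers the true branch
        System.out.println(classify(-1, 1));  // covers the false branch; (x > 0) is false
        System.out.println(classify(1, -1));  // makes (y > 0) evaluate to false
    }
}
```

The first two calls already achieve full block and branch coverage, but predicate coverage additionally requires the third call, because (y > 0) is short-circuited in the second call and must still evaluate to false at least once.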
Research Questions
Which coverage criterion
- has better effectiveness?
- needs more testing effort?
- is more efficient?
- is more reliable?
Goals for the Experiment
The goals of our experiment are to answer the following questions:
- Which coverage criteria have more fault detection ability?
- Which coverage criteria need more testing effort?
- How do the coverage-adequate tests perform?
- Are there specific types of bugs that result in different effectiveness?
- What is the correlation between the elements of the program and the testing approaches?
- How should a suitable criterion be chosen for a given program?
Some Terms
- Test case: a set of inputs, execution preconditions, and expected outcomes for testing a specific aspect of the CUT (code under test)
- Test suite: a collection of test cases for the CUT
- Test criterion: a set of test requirements
- Mutation operator: a handle for seeding faults in a program in some specific context
- Mutant: a faulty version of a program containing exactly one known fault
- Effectiveness: fault detection capability
- Efficiency: the average testing cost (i.e., effort) to identify a fault in the program
The Experiment
- Three control flow based criteria were considered: block, branch, and predicate
- Five Java programs (between 400 and 1500 LOC) were used in the study
- The JUnit framework was used for test management, and JavaCodeCoverage for obtaining coverage information
- Bugs were inserted manually to obtain 'mutants' (through the use of mutation operators)
- Multiple test suites were used for each coverage criterion for each program
- Minimal test suites were used so as to facilitate comparison of the performance of these coverage criteria
Criteria for Comparison
- The criterion used to measure the fault detection effectiveness FDE_T of a test suite T is:
  FDE_T = (number of mutants killed by T) / (total number of mutants of the program)
- The criterion used to measure the testing effort TE_T of a test suite T is:
  TE_T = number of test cases in the suite needed to satisfy the testing criterion
- The performance index PI_T of a test suite T is obtained as:
  PI_T = (number of mutants killed by T) / (size of T)
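To make the three measures concrete, here is a minimal sketch with assumed, illustrative numbers (not data from the study):

```java
// Minimal sketch (assumed numbers, not data from the study) of the three
// comparison measures for a test suite T.
public class SuiteMetrics {
    // Fault detection effectiveness: fraction of all mutants the suite kills.
    static double fde(int mutantsKilled, int totalMutants) {
        return (double) mutantsKilled / totalMutants;
    }

    // Performance index: mutants killed per test case in the suite.
    static double pi(int mutantsKilled, int suiteSize) {
        return (double) mutantsKilled / suiteSize;
    }

    public static void main(String[] args) {
        // Illustrative values: a 20-test suite kills 60 of 93 mutants.
        int killed = 60, total = 93, size = 20;
        System.out.printf("FDE_T = %.3f%n", fde(killed, total));              // ~0.645
        System.out.println("TE_T  = " + size + " test cases");
        System.out.printf("PI_T  = %.1f mutants per test%n", pi(killed, size)); // 3.0
    }
}
```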
Experimental Setup: Test Programs

S.No.  Program          NCLOC  Faults seeded  Classes (total / with seeded faults)  Test-pool size
1      HotelManagement  390    56             6/4                                    55
2      PostalCodes      340    93             6/4                                    105
3      CruiseControl    320    41             6/4                                    72
4      JavaVector       310    72             1/1                                    70
5      Monopoly         1490   56             17/8                                   84
Experimental Setup (cont.)
Mutation operators (two of them are sketched in code after this list):
- Incorrect Initialization Operator (IIO): incorrect or missing initialization, incorrect or missing state assignment
- Literal Change Operator (LCO): changing increment to decrement or vice versa, incorrect or missing increment
- Language Operator Replacement (LOR): replacing one relational or logical operator with another
- Control Flow Disruption (CFD): missing or incorrectly placed block markers, break, continue, or return
- Method Name Replacement (MNR): replacing a method with another method of similar definition but different behavior
- Statement Swap Operator (SSO): swapping two statements in the same scope
- Argument Order Interchange (AOI): interchanging arguments of the same type in the parameter list of a method, either in the definition or in the method call
- Variable Replacement Operator (VRO): replacing a variable with another of a similar type
- Missing Condition Operator (MCO): leaving out a condition in a composite conditional statement
- Null Reference Operator (NRO): causing a null reference
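As a hedged illustration (the method is hypothetical, not code from the study programs), the sketch below applies LOR and MCO to the same original method:

```java
// Hypothetical illustration (not a program from the study) of two mutation
// operators applied to the same method.
public class AccountExample {
    // Original: withdrawal allowed only when funds suffice and the amount is positive.
    static boolean canWithdrawOriginal(int balance, int amount) {
        return balance >= amount && amount > 0;
    }

    // LOR mutant: the relational operator '>=' is replaced with '>'.
    static boolean canWithdrawLOR(int balance, int amount) {
        return balance > amount && amount > 0;
    }

    // MCO mutant: the 'amount > 0' condition is missing from the composite conditional.
    static boolean canWithdrawMCO(int balance, int amount) {
        return balance >= amount;
    }

    public static void main(String[] args) {
        System.out.println(canWithdrawOriginal(100, 100)); // true
        System.out.println(canWithdrawLOR(100, 100));      // false -> this test kills the LOR mutant
        System.out.println(canWithdrawMCO(100, -5));       // true, but the original says false -> kills the MCO mutant
    }
}
```

A mutant is "killed" when some test case distinguishes its output from the original's, which is exactly what the comments in main() indicate.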
Experimental Setup (cont.)
Coverage tool used: JavaCodeCoverage
- Computes test coverage for the method, block, branch, and predicate coverage criteria
- Provides test coverage information visually using a color scheme
- Performs program analysis at the bytecode level
- Records coverage information for each test case in a MySQL database
Process for Comparing Coverage Criteria

[Process diagram, reproduced here in words: the program, its abstractions/specifications, and mutant generation feed test-case generation and a test pool; a coverage tool records per-test coverage data, which is stored together with the fault data from the mutants in a database (DB), from which coverage-adequate minimal test suites are prepared.]

- Phase I: Construct a test pool and obtain test coverage information
- Phase II: Construct minimal test suites and perform testing
Experiment Execution
For each test program:
- A large test pool of JUnit test cases was constructed (an illustrative sketch of such a test case follows)
- Test coverage information for the three coverage criteria was obtained
- The program's mutants were generated
- 25 minimal coverage-adequate test suites were constructed for each coverage criterion
- Testing was performed and fault data were recorded
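The test pools consisted of JUnit test cases; the following is a minimal JUnit 4-style sketch of what one such case might look like (PostalCodeValidator and its isValid method are stand-ins for illustration, not code from the study):

```java
// Hypothetical JUnit 4 test case of the kind a test pool might contain.
// PostalCodeValidator is a stand-in class, not code from the study.
import org.junit.Test;
import static org.junit.Assert.*;

public class PostalCodeValidatorTest {
    // Minimal stand-in implementation so the example is self-contained.
    static class PostalCodeValidator {
        boolean isValid(String code) {
            return code != null && code.matches("\\d{6}");
        }
    }

    @Test
    public void acceptsWellFormedCode() {
        assertTrue(new PostalCodeValidator().isValid("110016"));
    }

    @Test
    public void rejectsCodeContainingLetters() {
        assertFalse(new PostalCodeValidator().isValid("11A016"));
    }
}
```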
Results: Test Program PostalCodes

[Four charts, reproduced here only as captions and the data recoverable from them:]
1. Faults seeded — mutation operators applied (PostalCodes); mutants per operator: LOR (8), LCO (30), SSO (10), MNR (15), CFD (6), MCO (9), VRO (2), IIO (9)
2. Effectiveness (FDE_T) — fault detection effectiveness per mutation operator for block, branch, and predicate test suites
3. Testing effort (TE_T) and coverage — coverage estimates (BLC, BRC, PC) for block test suites (avg. size = 33), branch test suites (avg. size = 35), and predicate test suites (avg. size = 49)
4. Efficiency (PI_T) — box-and-whisker plot (min, 25th percentile, median, 75th percentile, max) of the performance index for block, branch, and predicate suites, roughly in the range 1.7-2.5
Statistical Analysis at Method Level

S.No.  Null hypothesis  Alternate hypothesis  Sample size  Result (at α = 0.05)  p-value
1      µ_Br = µ_Bl      µ_Br > µ_Bl           22           Br > Bl               0.001
2      µ_Pr = µ_Br      µ_Pr > µ_Br           22           Not rejected          0.760
3*     µ_Pr = µ_Br      µ_Pr > µ_Br           6            Pr > Br               0.030

µ = mean fault detection effectiveness; Bl = block, Br = branch, Pr = predicate
* Only the methods having composite conditions (6)
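Each row is a one-sided test on mean fault detection effectiveness; as a sketch of the decision rule for row 1:

  H0: µ_Br = µ_Bl   vs.   H1: µ_Br > µ_Bl
  reject H0 at α = 0.05 whenever p < 0.05
  row 1: p = 0.001 < 0.05  ->  reject H0, conclude Br > Bl
  row 2: p = 0.760 >= 0.05 ->  H0 not rejected

So branch suites beat block suites overall, while the predicate-over-branch advantage shows up only on the six methods with composite conditions (row 3).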
Threats to Validity: Criteria Evaluation
- Construct validity: "Are we actually measuring what we intend to measure?"
  - Use of seeded faults
  - Effort measured as the size of the test suite
  - Construction of minimal test suites
- Internal validity: "Does the data really follow from the experimental concepts?"
- Conclusion validity: "Are the analysis methods appropriate?" (normality assumption?)
- External validity: "Can the results of the experiment be generalized?"
  - Results are from 5 Java programs (300-1500 NCLOC)
  - Mutation operators used and fault densities
Experiment Summary
- Results are affected by program structure and complexity
- Branch test suites offered better trade-offs in general
- On average, we found:
  - Effectiveness: Predicate > Branch > Block
  - Effort: Predicate > Branch > Block
  - Efficiency: Block > Branch > Predicate
  - Reliability: Predicate > Branch > Block
- Validity considerations (see the preceding slide) apply to these findings
Further Work
- Inclusion of object-oriented-specific bugs
- Larger programs and industrial settings
- Other coverage criteria, such as MC/DC and simple path coverage
Experiment #3: An Experimental Evaluation of the Effectiveness and Efficiency of Test-Driven Development
TDD
- A program development style
- The most influential practice in XP
- Can be applied on a standalone basis

Claims about TDD:
- Improves code quality
- Improves developer's productivity
- Reduces development time
- Reduces maintenance cost
The TDD Cycle
- Write a test
- Run the test together with all previously written tests and see it fail
- Implement just enough to make the test pass
- Run all the tests and see that the newly written test also passes
- Refactor the code (and the tests) if desired
One such cycle is sketched below.
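A minimal sketch of one TDD cycle in Java with JUnit (the Counter class is hypothetical, not one of the experiment's programs):

```java
// Steps 1-2: write the test first and watch it fail (Counter does not exist yet).
import org.junit.Test;
import static org.junit.Assert.*;

public class CounterTest {
    @Test
    public void incrementRaisesValueByOne() {
        Counter c = new Counter();
        c.increment();
        assertEquals(1, c.value());
    }
}

// Steps 3-4: implement just enough for this test (and all earlier tests) to pass.
class Counter {
    private int value = 0;
    void increment() { value++; }
    int value() { return value; }
}
// Step 5: refactor the code (and the test) if desired, re-running the whole suite.
```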
Some Results about TDD
- Initial investigations reveal that TDD improves quality, but at the expense of time [Williams'03, Williams'04, Bhat'06]
- A closer look provides further insight: TDD improves unit testing but slows down the development process [Erdogmus'05, Canfora'06]
Motivation
Hypotheses:
- (+) TDD does not require detailed up-front design; rather, the design of the program gradually evolves, so it should yield savings in development effort
- (-) TDD requires code and test refactoring, so it should incur additional development effort

Approach: include the design aspect of program development and then compare TDD with conventional code development (CCD)
Inception
- A course on advanced object-oriented modeling and analysis (CS 655 AOOAM) was offered during fall 2004 at IIT Kanpur
- The instructor agreed to include TDD as one of the topics of the course as well as for the experiment
- The experiment was undertaken as a graded assignment for the CS 655 course
Research Questions
Compared to CCD, when design effort is also taken into consideration, does TDD result in
- better code quality (CQ)?
- reduced development effort (DE)?
- higher developer's productivity (PP)?

The relevant null hypotheses are:
- H1: CQ_TDD = CQ_CCD
- H2: DE_TDD = DE_CCD
- H3: PP_TDD = PP_CCD
An Experiment (TDD vs. CCD)
- A graded assignment in the course CS 655 (AOOA&M) during fall 2004 at IIT Kanpur
- Response variables: code quality, development effort, and developer's productivity
- Experimental design: one-factor block design
- Blocking variable: subject experience
- Development environment: Java programming using the DrJava editor (with built-in support for JUnit)
- Two test programs: Student Registration System (SRS) and Automated Teller Machine (ATM), each with an estimated size of around 1200 lines of code
Subjects (22)
- Mostly graduate students in CS
- All had done at least two programming courses and a course on software engineering
- 4-10 years of programming experience in Java
- Comfortable in developing analysis and design models
Test Programs
Student Registration System (SRS):
- Registration module for the different academic programs
- Course registration module for the current semester
- Instructor module for evaluation
- An administrator module that can query relevant details about course registrations, the status of a student, etc.

ATM System (ATM):
- A consortium of banking organizations (Bank module)
- Individual ATM units may belong to different banking organizations, but a user of the system can be serviced by any of them (ATM module)
- Typical functionality incorporated: transaction management for customer accounts (User module)
- ATM administration module
Preparation
- Subjects were trained to develop Java code following TDD using JUnit; relevant material and exercises were distributed to increase their understanding of TDD
- A detailed set of instructions was given to the subjects
- For each test program:
  - Clear and complete specifications
  - A use case diagram
  - A desired command line interface
  - A carefully constructed acceptance test suite
Experiment Scheduling

Schedule for the Experiment

Program  Development Phase (DP)                  Acceptance Phase (AP)
         Week #1 (G1 / G2)   Week #2 (G1 / G2)   Week #3 (G1 / G2)
P1       CCD / TDD           —                   CCD / TDD
P2       —                   CCD / TDD           TDD / CCD

G1, G2 – groups of students (11 in each group)
P1 – SRS, P2 – ATM
Experiment Schedule

Schedule for the Experiment (CCD vs. TDD)

           Development Phase (DP)             Acceptance Phase (AP)
           Week #1 (CCD)    Week #2 (TDD)     Week #3 (CCD + TDD)
Subjects   S1 S2 S3 …       S1 S2 S3 …        S1 S2 S3 …
Programs   P1               Pi                P1', Pi'
           P2               Pj                P2', Pj'
           P3               Pk                P3', Pk'
           …                …                 …
Experimental Steps (CCD vs. TDD)

CCD
Development Phase (DP):
- Code:
  - Code the class diagram
  - When done, record the effort data
- Test:
  - Design and run manual tests for the objects
  - Correct any errors observed
  - Record the code and effort data for the DP phase
Acceptance Phase (AP):
- For i = 1 to size of the AP test suite:
  - Execute the i-th test
  - If an error occurs, fix it
- Record the code and effort data for the AP phase

TDD
Development Phase (DP):
- Repeat the following until the desired functionality is coded:
  - Select a class
  - Write a functional test for a method of the class
  - Insert just enough code to make the test pass; if it does not pass, insert more code until it does
- Record the code and effort data
Acceptance Phase (AP):
- For i = 1 to size of the AP test suite:
  - Execute the i-th test
  - If an error occurs, write a test that would reveal that bug and enter just enough code to fix it
- Record the code and effort data for the AP phase
Experimental Steps

CCD
Development Phase (DP):
- Design:
  - Derive an analysis diagram (the RUP approach)
  - Draw a set of functional scenarios for the objects identified in the analysis diagram
  - Identify the communications between the objects and correspondingly develop a class diagram
  - Record the effort data
- Code:
  - Code the class diagram
  - When done, record the effort data
- Test:
  - Design and run manual tests for the objects
  - Correct any errors observed
  - Record the code and effort data for the DP phase
Acceptance Phase (AP):
- For i = 1 to size of the AP test suite:
  - Execute the i-th test
  - If an error occurs, fix it
- Record the code and effort data for the AP phase

TDD
Development Phase (DP):
- Design:
  - Find domain objects from the use case diagram
  - Attach the desired functionality to these objects and construct an initial class diagram
- Test-before-coding:
  - Repeat the following until the desired functionality is coded:
    - Select a class
    - Write a functional test for a method of the class
    - Do just enough coding to make the test pass
    - Refactor the code (and tests) if necessary
  - Record the code and effort data
Acceptance Phase (AP):
- For i = 1 to size of the AP test suite:
  - Execute the i-th test
  - If an error occurs, write a test (or modify a previously written test) that would reveal that bug, and enter just enough code to fix it
  - Refactor the code (and tests) if required
- Record the code and effort data for the AP phase
Measurements

CCD
Development Phase (DP):
- Coding effort (person-hours)
- Number of unit tests executed
- Unit testing effort (person-hours)
- Size of the program code (UCLOC)
Acceptance Phase (AP):
- Number of bugs from the development phase
- Time taken to correct the reported bugs (person-hours)
- Size of the program code, final (UCLOC)

TDD
Development Phase (DP):
- Coding effort (person-hours)
- Number of unit test cases written
- Size of the program code (UCLOC)
- Size of the test code (UCLOC)
Acceptance Phase (AP):
- Number of bugs from the development phase
- Time taken to correct the reported bugs (person-hours)
- Number of test cases written, final
- Size of the program code, final (UCLOC)
- Size of the test code, final (UCLOC)
Measurements
- Code quality (CQ): percentage of acceptance test cases passed by the developed program
- Development effort (DE): effort applied in DP + effort applied in AP (in person-hours)
- Developer's productivity (PP): delivered NCLOC per person-hour
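As a worked sketch with assumed numbers (illustrative only, not the experiment's data):

  CQ = 46 passed / 50 acceptance tests = 92%
  DE = 24 person-hours (DP) + 6 person-hours (AP) = 30 person-hours
  PP = 1200 NCLOC / 30 person-hours = 40 NCLOC per person-hour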
Additional Measures
- Initial design effort
- Testing effort (TE): testing effort applied in DP + testing effort applied in AP (in person-hours)
  - TE_TDD in DP = (test code size / (test code size + program code size)) x coding time in DP
  - TE_CCD in DP = recorded by the subjects
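For instance, with assumed numbers (illustrative only, not the experiment's data), the apportioning rule for TDD gives:

  TE_TDD in DP = (400 / (400 + 800)) x 12 person-hours = 4 person-hours

where 400 and 800 are the test-code and program-code sizes in LOC, and 12 person-hours is the coding time recorded in DP.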
Results – Statistical Tests

SRS
Response variable          Hypothesis             p-value  Result
Code quality               H01: CQ_TDD = CQ_CCD   0.001    CQ_TDD > CQ_CCD
Development effort         H02: DE_TDD = DE_CCD   0.0207   DE_TDD < DE_CCD
Developer's productivity   H03: PP_TDD = PP_CCD   0.21     Not rejected

ATM
Response variable          Hypothesis             p-value  Result
Code quality               H01: CQ_TDD = CQ_CCD   0.173    Not rejected
Development effort         H02: DE_TDD = DE_CCD   0.334    Not rejected
Developer's productivity   H03: PP_TDD = PP_CCD   0.999    Not rejected
Results – SRS

[Four box-and-whisker plots (min, 25th percentile, median, 75th percentile, max), CCD vs. TDD, reproduced here only as captions:]
- Code quality [% of tests passed in AP], axis range roughly 74-98
- Overall development effort [# of person-hours], axis range roughly 15-55
- Developer's productivity [NCLOC/hour], axis range roughly 30-90
- Initial design effort – SRS [# of person-hours], axis range roughly 0-9
Results – ATM

[Four box-and-whisker plots (min, 25th percentile, median, 75th percentile, max), CCD vs. TDD, reproduced here only as captions:]
- Code quality [% of tests passed in AP], axis range roughly 70-98
- Overall development effort [# of person-hours], axis range roughly 24-52
- Developer's productivity [NCLOC/person-hour], axis range roughly 25-75
- Initial design effort – ATM [# of person-hours], axis range roughly 0-12
Results – Testing Effort Applied

[Four box-and-whisker plots (min, 25th percentile, median, 75th percentile, max), each showing DP, AP, and total testing effort in person-hours, reproduced here only as captions:]
- Testing effort [ATM] – CCD, axis range roughly 0-14
- Testing effort [SRS] – TDD, axis range roughly 0-12
- Testing effort [SRS] – CCD, axis range roughly 0-20
- Testing effort [ATM] – TDD, axis range roughly 0-20
Result of Qualitative Analysis

Questionnaire responses:

Aspect                                    TDD      CCD
Ease of use                               64.706   70.59
Confidence in completeness of testing     82.353   47.06
Better debugging effort                   70.588   70.59
Adherence to the followed approach        70.588   88.24
More training needed                      47.059   23.53
Confidence about the design               47.059   82.35
Better approach for program development   17.647   29.41

52.94% of the subjects favored a mixed approach (TDD + CCD)
Conclusions (TDD vs. CCD)
- Reduced development time
- Improved developer's productivity
- Code quality is affected by the testing effort applied in the development style
- A combination may work better (?)
Threats to Validity
- Subject experience
- Data collection process
- Plagiarism
- Large variations in the results
Further Work
- Further validation
- Industrial studies
- Assessing the quality of the designs resulting from applying TDD
- The issue of change management
References
- Basili VR, Selby RW. Comparing the effectiveness of software testing strategies. IEEE Transactions on Software Engineering, 13(12):1278-1296, 1987.
- Basili VR, Green S, Laitenberger O, Lanubile F, Shull F, Sorumgard S, Zelkowitz MV. The empirical investigation of perspective-based reading. Empirical Software Engineering, 1(2):133-164, 1996.
- Scanlan DA. Structured flowcharts outperform pseudocode: An experimental comparison. IEEE Software, pp. 28-36, September 1989.
- Briand LC, Bunse C, Daly JW, Differding C. An experimental comparison of the maintainability of object-oriented and structured design documents. Empirical Software Engineering, 2(2):291-312, 1997.
- Porter A, Votta LG, Basili V. Comparing detection methods for software requirements inspections: A replicated experiment. IEEE Transactions on Software Engineering, 21(6):563-575, 1995.
- Gupta A, Jalote P. Comparing control flow based coverage criteria based on seeded faults. In 12th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS '06), pp. 365-378, Austria, 2006. Springer.
- Gupta A, Jalote P. An experimental evaluation of the effectiveness and efficiency of the test driven development. In 1st International Symposium on Empirical Software Engineering and Measurement (ESEM '07), pp. 285-294, Madrid, Spain, 2007. IEEE Computer Society.
Final Comments
- It is not the case that one kind of empirical research is inherently better than another!
- Plan and use a combination of empirical research methods
- Avoid anything more complex than you understand
- Get statistical advice
- Experimental investigations should be exploratory in nature