Example Experiments in Software Engineering
Experiments in SE: Some Examples
Dr Atul Gupta, [email protected]
Some Real Experiments

1. Effectiveness of three testing techniques (Basili '87)
   - Factors and alternatives: testing technique (CR, F, S); programs (3); subject experience (3)
   - Response variables: fault detection effectiveness, fault detection time, fault detection rate
   - Design: 3-factor fractional factorial (3x3x3)
   - Data analysis: ANOVA
   - Results: discussed later

2. Assessing the effectiveness of PBR at NASA (Basili '96)
   - Factors and alternatives: inspection technique (PBR, usual); documents (NASA, generic)
   - Response variables: defects identified
   - Design: 2-factor block design (2x2)
   - Data analysis: ANOVA
   - Results: defect rate G1 = G2; defect rate PBR = usual; defect rate NASA = generic

3. Comparing flowchart and pseudocode (Scanlan '89)
   - Factors and alternatives: comprehension aid (flowchart, pseudocode); program complexity (L, M, H)
   - Response variables: % questions answered, number of errors made, subject confidence
   - Design: 2-factor nested design
   - Data analysis: t-test
   - Results: flowchart better than pseudocode on % questions answered, number of errors made, and subject confidence

4. Comparing OO and structured design (Briand '97)
   - Factors and alternatives: design (OO, structured); design quality (good, bad)
   - Response variables: % questions answered, % modifications, modification rate
   - Design: 2-factor nested
   - Data analysis: ANOVA
   - Results: good OO > bad OO; bad structured = bad OO; good OO = good structured

5. Comparing three inspection techniques (Porter '95)
   - Factors and alternatives: inspection technique (ad hoc, checklist, scenarios); SRS documents (A, B)
   - Response variables: defects identified
   - Design: 2-factor (3x2)
   - Data analysis: ANOVA
   - Results: scenarios > checklist = ad hoc
Experiment #1: Comparing the Effectiveness of Software Testing Strategies (Basili '87)
- Code reading by stepwise abstraction
- Functional testing using equivalence partitioning and boundary value analysis
- Structural testing (100 percent statement coverage criterion)

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1702179&tag=1
Experimental Setup
Independent variables:
- Testing technique (code reading, functional testing, structural testing)
- Program type
- Level of expertise

Dependent variables:
- Fault detection effectiveness (by fault type)
- Total fault detection time
- Fault detection rate

Subjects: 32 professionals and 42 students (junior, intermediate, advanced)
Test programs: 3 programs with natural and seeded faults
Fault types:
- Omission vs. commission
- Initialization, computation, control, interface, data, and cosmetic

Experimental design: fractional factorial (3x3x3)

       Code Reading   Functional Test   Structural Test
       P1  P2  P3     P1  P2  P3        P1  P2  P3
G1     X   -   -      -   X   -         -   -   X
G2     -   X   -      -   -   X         X   -   -
G3     -   -   X      X   -   -         -   X   -
Experimental Execution
- Three phases: the first two at the University of Maryland (1982, 1983), the third at Computer Sciences Corporation and NASA (1984)
- Training sessions, three testing sessions, and follow-up sessions
- Data analysis was done using box plots and ANOVA
Results: Fault Detection Effectiveness (FDE)
Major results of the comparison of fault detection effectiveness:
- In phase 3, code reading detected a greater number and percentage of faults than the other methods.
- In phase 1, code reading and functional testing were equally effective, while structural testing was inferior to both; in phase 2 there was no difference among the three techniques.
- The number of faults observed depended on the type of software: most were detected in the data abstraction program (P3), followed by the plotter (P1), and fewest in the database maintainer (P2).
- Functionally generated test data revealed more observable faults than structurally generated test data in phase 1, but not in phase 3.
- Junior and intermediate subjects were equally effective in finding faults, whereas advanced subjects found more faults.
- Self-estimates of faults detected were most accurate from subjects applying code reading, followed by those doing structural testing; the estimates from subjects doing functional testing showed no relation to actual detection.
Results: Fault Detection Cost
Major results of the comparison of fault detection costs:
- In phase 3, code reading had a higher fault detection rate than the other methods, with no difference between functional and structural testing.
- In phases 1 and 2, the three techniques did not differ in fault detection rate.
- In phases 2 and 3, total detection effort did not differ among the techniques; in phase 1, less effort was spent on structural testing than on the other techniques, while code reading and functional testing did not differ.
- Fault detection rate and total detection effort depended on the type of software:
  - The abstract data type had the highest detection rate and lowest total detection effort.
  - The plotter and database maintainer had the lowest detection rate and highest total detection effort.
- In phases 2 and 3, subjects across expertise levels did not differ in fault detection rate or total detection time; in phase 1, intermediate subjects had a higher detection rate.
- There was a moderate correlation between fault detection rate and years of professional experience across all subjects.
Results: Fault Types
Major results of the comparison of the classes of faults detected:
- Code reading and functional testing both detected more omission faults and initialization faults than did structural testing.
- Code reading detected more interface faults than did the other methods.
- Functional testing found more control faults than did the other methods.
- Code reading detected more computational faults than did structural testing.
- Functional and structural testing did not differ in any class of faults that were observable but not reported.
Conclusions
- With the professional programmers: faults identified, CR > F > S; fault detection rate, CR > F = S.
- In one University of Maryland subject group, code reading and functional testing did not differ in faults found, but both were superior to structural testing; in the other UoM subject group there was no difference among the techniques.
- With the UoM subjects, the fault detection rate was CR = F = S.
- The number of faults observed, the fault detection rate, and the total detection effort depended on the programs.
- Code reading detected more interface faults than the other methods.
- Functional testing detected more control faults than the other methods.
- When asked to estimate the percentage of faults detected, code readers gave the most accurate estimates, while functional testers gave the least accurate ones.
Experiment #2: An Experimental Comparison of the Effectiveness and Efficiency of Control Flow Based Testing Approaches on Seeded Faults
An Experiment: Evaluating Block, Branch, and Predicate Coverage Criteria
- Block coverage: A block is a set of sequential statements with no flow of control in between, neither inward nor outward. Complete block coverage requires that every such block in the program be exercised at least once in the test executions.
- Branch coverage: An evaluation point in the code may produce one of two outcomes, true or false, each of which represents a branch. Complete branch coverage requires that every such branch be exercised at least once in the test executions.
- Predicate coverage (or condition coverage): A predicate is a simple atomic condition in a logical expression. Complete predicate coverage requires that every such simple condition evaluate to both TRUE and FALSE at least once in the test executions.
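As a concrete illustration, here is a minimal sketch in Java (a hypothetical method, not one of the study programs) with its blocks, branches, and predicates marked in comments:

```java
// Hypothetical example, not from the study: coverage targets in a small method.
public class CoverageDemo {
    static int classify(int x, int y) {
        int result = 0;               // Block 1: entry block
        if (x > 0 && y > 0) {         // Decision with two predicates: (x > 0), (y > 0)
            result = 1;               // Block 2: true branch
        } else {
            result = -1;              // Block 3: false branch
        }
        return result;                // Block 4: exit block
    }

    public static void main(String[] args) {
        System.out.println(classify(1, 1));   // covers the true branch
        System.out.println(classify(-1, 1));  // covers the false branch; (x > 0) is false
        System.out.println(classify(1, -1));  // makes (y > 0) evaluate to false
    }
}
```

The first two calls already achieve full block and branch coverage, but predicate coverage additionally requires the third call, because (y > 0) is short-circuited in the second call and must still evaluate to false at least once.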
Research Questions
Which coverage criterion
- has better effectiveness?
- needs more testing effort?
- is more efficient?
- is more reliable?
Goals for the Experiment
The goals of our experiment are to answer the following questions:
- Which coverage criteria have more fault detection ability?
- Which coverage criteria need more testing effort?
- How do the coverage-adequate tests perform?
- Are there specific types of bugs that result in different effectiveness?
- What is the correlation between the elements of the program and the testing approaches?
- How should a suitable criterion be chosen for a given program?
Some Terms
- Test case: a set of inputs, execution preconditions, and expected outcomes for testing a specific aspect of the CUT (code under test)
- Test suite: a collection of test cases for the CUT
- Test criterion: a set of test requirements
- Mutation operator: a handle for seeding faults in a program in some specific context
- Mutant: a faulty version of a program containing exactly one known fault
- Effectiveness: fault detection capability
- Efficiency: the average testing cost (i.e., effort) to identify a fault in the program
The Experiment
- Three control flow based criteria were considered: block, branch, and predicate
- Five Java programs (between 400 and 1500 LOC) were used in the study
- The JUnit framework was used for test management, and JavaCodeCoverage for obtaining coverage information
- Bugs were inserted manually to obtain 'mutants' (through the use of mutation operators)
- Multiple test suites were used for each coverage criterion for each program
- Minimal test suites were used so as to facilitate comparison of the performance of these coverage criteria
Criteria for Comparison
- The criterion used to measure the fault detection effectiveness FDE_T of a test suite T is:
  FDE_T = (number of mutants killed by T) / (total number of mutants of the program)
- The criterion used to measure the testing effort TE_T of a test suite T is:
  TE_T = number of test cases in the suite needed to satisfy the testing criterion
- The performance index PI_T of a test suite T is obtained as:
  PI_T = (number of mutants killed by T) / (size of T)
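To make the three measures concrete, here is a minimal sketch with assumed, illustrative numbers (not data from the study):

```java
// Minimal sketch (assumed numbers, not data from the study) of the three
// comparison measures for a test suite T.
public class SuiteMetrics {
    // Fault detection effectiveness: fraction of all mutants the suite kills.
    static double fde(int mutantsKilled, int totalMutants) {
        return (double) mutantsKilled / totalMutants;
    }

    // Performance index: mutants killed per test case in the suite.
    static double pi(int mutantsKilled, int suiteSize) {
        return (double) mutantsKilled / suiteSize;
    }

    public static void main(String[] args) {
        // Illustrative values: a 20-test suite kills 60 of 93 mutants.
        int killed = 60, total = 93, size = 20;
        System.out.printf("FDE_T = %.3f%n", fde(killed, total));              // ~0.645
        System.out.println("TE_T  = " + size + " test cases");
        System.out.printf("PI_T  = %.1f mutants per test%n", pi(killed, size)); // 3.0
    }
}
```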
Experimental Setup: Test Programs

S.No.  Program          NCLOC  Faults seeded  Classes (total / with seeded faults)  Test-pool size
1      HotelManagement  390    56             6/4                                    55
2      PostalCodes      340    93             6/4                                    105
3      CruiseControl    320    41             6/4                                    72
4      JavaVector       310    72             1/1                                    70
5      Monopoly         1490   56             17/8                                   84
Experimental Setup (cont.)
Mutation operators (two of them are sketched in code after this list):
- Incorrect Initialization Operator (IIO): incorrect or missing initialization, incorrect or missing state assignment
- Literal Change Operator (LCO): changing increment to decrement or vice versa, incorrect or missing increment
- Language Operator Replacement (LOR): replacing one relational or logical operator with another
- Control Flow Disruption (CFD): missing or incorrectly placed block markers, break, continue, or return
- Method Name Replacement (MNR): replacing a method with another method of similar definition but different behavior
- Statement Swap Operator (SSO): swapping two statements in the same scope
- Argument Order Interchange (AOI): interchanging arguments of the same type in the parameter list of a method, either in the definition or in the method call
- Variable Replacement Operator (VRO): replacing a variable with another of a similar type
- Missing Condition Operator (MCO): leaving out a condition in a composite conditional statement
- Null Reference Operator (NRO): causing a null reference
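As a hedged illustration (the method is hypothetical, not code from the study programs), the sketch below applies LOR and MCO to the same original method:

```java
// Hypothetical illustration (not a program from the study) of two mutation
// operators applied to the same method.
public class AccountExample {
    // Original: withdrawal allowed only when funds suffice and the amount is positive.
    static boolean canWithdrawOriginal(int balance, int amount) {
        return balance >= amount && amount > 0;
    }

    // LOR mutant: the relational operator '>=' is replaced with '>'.
    static boolean canWithdrawLOR(int balance, int amount) {
        return balance > amount && amount > 0;
    }

    // MCO mutant: the 'amount > 0' condition is missing from the composite conditional.
    static boolean canWithdrawMCO(int balance, int amount) {
        return balance >= amount;
    }

    public static void main(String[] args) {
        System.out.println(canWithdrawOriginal(100, 100)); // true
        System.out.println(canWithdrawLOR(100, 100));      // false -> this test kills the LOR mutant
        System.out.println(canWithdrawMCO(100, -5));       // true, but the original says false -> kills the MCO mutant
    }
}
```

A mutant is "killed" when some test case distinguishes its output from the original's, which is exactly what the comments in main() indicate.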
Experimental Setup (cont.)
Coverage tool used: JavaCodeCoverage
- Computes test coverage for the method, block, branch, and predicate coverage criteria
- Provides test coverage information visually using a color scheme
- Performs program analysis at the bytecode level
- Records coverage information for each test case in a MySQL database
Process for Comparing Coverage Criteria

[Process diagram, reproduced here in words: the program, its abstractions/specifications, and mutant generation feed test-case generation and a test pool; a coverage tool records per-test coverage data, which is stored together with the fault data from the mutants in a database (DB), from which coverage-adequate minimal test suites are prepared.]

- Phase I: Construct a test pool and obtain test coverage information
- Phase II: Construct minimal test suites and perform testing
Experiment Execution
For each test program:
- A large test pool of JUnit test cases was constructed (an illustrative sketch of such a test case follows)
- Test coverage information for the three coverage criteria was obtained
- The program's mutants were generated
- 25 minimal coverage-adequate test suites were constructed for each coverage criterion
- Testing was performed and fault data were recorded
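The test pools consisted of JUnit test cases; the following is a minimal JUnit 4-style sketch of what one such case might look like (PostalCodeValidator and its isValid method are stand-ins for illustration, not code from the study):

```java
// Hypothetical JUnit 4 test case of the kind a test pool might contain.
// PostalCodeValidator is a stand-in class, not code from the study.
import org.junit.Test;
import static org.junit.Assert.*;

public class PostalCodeValidatorTest {
    // Minimal stand-in implementation so the example is self-contained.
    static class PostalCodeValidator {
        boolean isValid(String code) {
            return code != null && code.matches("\\d{6}");
        }
    }

    @Test
    public void acceptsWellFormedCode() {
        assertTrue(new PostalCodeValidator().isValid("110016"));
    }

    @Test
    public void rejectsCodeContainingLetters() {
        assertFalse(new PostalCodeValidator().isValid("11A016"));
    }
}
```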
Results: Test Program PostalCodes

[Four charts, reproduced here only as captions and the data recoverable from them:]
1. Faults seeded — mutation operators applied (PostalCodes); mutants per operator: LOR (8), LCO (30), SSO (10), MNR (15), CFD (6), MCO (9), VRO (2), IIO (9)
2. Effectiveness (FDE_T) — fault detection effectiveness per mutation operator for block, branch, and predicate test suites
3. Testing effort (TE_T) and coverage — coverage estimates (BLC, BRC, PC) for block test suites (avg. size = 33), branch test suites (avg. size = 35), and predicate test suites (avg. size = 49)
4. Efficiency (PI_T) — box-and-whisker plot (min, 25th percentile, median, 75th percentile, max) of the performance index for block, branch, and predicate suites, roughly in the range 1.7-2.5
Statistical Analysis at Method Level

S.No.  Null hypothesis  Alternate hypothesis  Sample size  Result (at α = 0.05)  p-value
1      µ_Br = µ_Bl      µ_Br > µ_Bl           22           Br > Bl               0.001
2      µ_Pr = µ_Br      µ_Pr > µ_Br           22           Not rejected          0.760
3*     µ_Pr = µ_Br      µ_Pr > µ_Br           6            Pr > Br               0.030

µ = mean fault detection effectiveness; Bl = block, Br = branch, Pr = predicate
* Only the methods having composite conditions (6)
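Each row is a one-sided test on mean fault detection effectiveness; as a sketch of the decision rule for row 1:

  H0: µ_Br = µ_Bl   vs.   H1: µ_Br > µ_Bl
  reject H0 at α = 0.05 whenever p < 0.05
  row 1: p = 0.001 < 0.05  ->  reject H0, conclude Br > Bl
  row 2: p = 0.760 >= 0.05 ->  H0 not rejected

So branch suites beat block suites overall, while the predicate-over-branch advantage shows up only on the six methods with composite conditions (row 3).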
Threats to Validity: Criteria Evaluation
- Construct validity: "Are we actually measuring what we intend to measure?"
  - Use of seeded faults
  - Effort measured as the size of the test suite
  - Construction of minimal test suites
- Internal validity: "Does the data really follow from the experimental concepts?"
- Conclusion validity: "Are the analysis methods appropriate?" (normality assumption?)
- External validity: "Can the results of the experiment be generalized?"
  - Results are from 5 Java programs (300-1500 NCLOC)
  - Mutation operators used and fault densities
Experiment Summary
- Results are affected by program structure and complexity
- Branch test suites offered better trade-offs in general
- On average, we found:
  - Effectiveness: Predicate > Branch > Block
  - Effort: Predicate > Branch > Block
  - Efficiency: Block > Branch > Predicate
  - Reliability: Predicate > Branch > Block
- Validity considerations (see the preceding slide) apply to these findings
Further Work
- Inclusion of object-oriented-specific bugs
- Larger programs and industrial settings
- Other coverage criteria, such as MC/DC and simple path coverage
Experiment #3: An Experimental Evaluation of the Effectiveness and Efficiency of Test-Driven Development
TDD
- A program development style
- The most influential practice in XP
- Can be applied on a standalone basis

Claims about TDD:
- Improves code quality
- Improves developer's productivity
- Reduces development time
- Reduces maintenance cost
The TDD Cycle
- Write a test
- Run the test together with all previously written tests and see it fail
- Implement just enough to make the test pass
- Run all the tests and see that the newly written test also passes
- Refactor the code (and the tests) if desired
One such cycle is sketched below.
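A minimal sketch of one TDD cycle in Java with JUnit (the Counter class is hypothetical, not one of the experiment's programs):

```java
// Steps 1-2: write the test first and watch it fail (Counter does not exist yet).
import org.junit.Test;
import static org.junit.Assert.*;

public class CounterTest {
    @Test
    public void incrementRaisesValueByOne() {
        Counter c = new Counter();
        c.increment();
        assertEquals(1, c.value());
    }
}

// Steps 3-4: implement just enough for this test (and all earlier tests) to pass.
class Counter {
    private int value = 0;
    void increment() { value++; }
    int value() { return value; }
}
// Step 5: refactor the code (and the test) if desired, re-running the whole suite.
```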
Some Results about TDD
- Initial investigations reveal that TDD improves quality, but at the expense of time [Williams'03, Williams'04, Bhat'06]
- A closer look provides further insight: TDD improves unit testing but slows down the development process [Erdogmus'05, Canfora'06]
Motivation
Hypotheses:
- (+) TDD does not require detailed up-front design; rather, the design of the program gradually evolves, so it should yield savings in development effort
- (-) TDD requires code and test refactoring, so it should incur additional development effort

Approach: include the design aspect of program development and then compare TDD with conventional code development (CCD)
Inception
- A course on advanced object-oriented modeling and analysis (CS 655 AOOAM) was offered during fall 2004 at IIT Kanpur
- The instructor agreed to include TDD as one of the topics of the course as well as for the experiment
- The experiment was undertaken as a graded assignment for the CS 655 course
Research Questions
Compared to CCD, when design effort is also taken into consideration, does TDD result in
- better code quality (CQ)?
- reduced development effort (DE)?
- higher developer's productivity (PP)?

The relevant null hypotheses are:
- H1: CQ_TDD = CQ_CCD
- H2: DE_TDD = DE_CCD
- H3: PP_TDD = PP_CCD
An Experiment (TDD vs. CCD)
- A graded assignment in the course CS 655 (AOOA&M) during fall 2004 at IIT Kanpur
- Response variables: code quality, development effort, and developer's productivity
- Experimental design: one-factor block design
- Blocking variable: subject experience
- Development environment: Java programming using the DrJava editor (with built-in support for JUnit)
- Two test programs: Student Registration System (SRS) and Automated Teller Machine (ATM), each with an estimated size of around 1200 lines of code
Subjects (22)
- Mostly graduate students in CS
- All had done at least two programming courses and a course on software engineering
- 4-10 years of programming experience in Java
- Comfortable in developing analysis and design models
Test Programs
Student Registration System (SRS):
- Registration module for the different academic programs
- Course registration module for the current semester
- Instructor module for evaluation
- An administrator module that can query relevant details about course registrations, the status of a student, etc.

ATM System (ATM):
- A consortium of banking organizations (Bank module)
- Individual ATM units may belong to different banking organizations, but a user of the system can be serviced by any of them (ATM module)
- Typical functionality incorporated: transaction management for customer accounts (User module)
- ATM administration module
Preparation
- Subjects were trained to develop Java code following TDD using JUnit; relevant material and exercises were distributed to increase their understanding of TDD
- A detailed set of instructions was given to the subjects
- For each test program:
  - Clear and complete specifications
  - A use case diagram
  - A desired command line interface
  - A carefully constructed acceptance test suite
Experiment Scheduling

Schedule for the Experiment

Program  Development Phase (DP)                  Acceptance Phase (AP)
         Week #1 (G1 / G2)   Week #2 (G1 / G2)   Week #3 (G1 / G2)
P1       CCD / TDD           —                   CCD / TDD
P2       —                   CCD / TDD           TDD / CCD

G1, G2 – groups of students (11 in each group)
P1 – SRS, P2 – ATM
Experiment Schedule

Schedule for the Experiment (CCD vs. TDD)

           Development Phase (DP)             Acceptance Phase (AP)
           Week #1 (CCD)    Week #2 (TDD)     Week #3 (CCD + TDD)
Subjects   S1 S2 S3 …       S1 S2 S3 …        S1 S2 S3 …
Programs   P1               Pi                P1', Pi'
           P2               Pj                P2', Pj'
           P3               Pk                P3', Pk'
           …                …                 …
Experimental Steps (CCD vs. TDD)

CCD
Development Phase (DP):
- Code:
  - Code the class diagram
  - When done, record the effort data
- Test:
  - Design and run manual tests for the objects
  - Correct any errors observed
  - Record the code and effort data for the DP phase
Acceptance Phase (AP):
- For i = 1 to size of the AP test suite:
  - Execute the i-th test
  - If an error occurs, fix it
- Record the code and effort data for the AP phase

TDD
Development Phase (DP):
- Repeat the following until the desired functionality is coded:
  - Select a class
  - Write a functional test for a method of the class
  - Insert just enough code to make the test pass; if it does not pass, insert more code until it does
- Record the code and effort data
Acceptance Phase (AP):
- For i = 1 to size of the AP test suite:
  - Execute the i-th test
  - If an error occurs, write a test that would reveal that bug and enter just enough code to fix it
- Record the code and effort data for the AP phase
Experimental Steps

CCD
Development Phase (DP):
- Design:
  - Derive an analysis diagram (the RUP approach)
  - Draw a set of functional scenarios for the objects identified in the analysis diagram
  - Identify the communications between the objects and correspondingly develop a class diagram
  - Record the effort data
- Code:
  - Code the class diagram
  - When done, record the effort data
- Test:
  - Design and run manual tests for the objects
  - Correct any errors observed
  - Record the code and effort data for the DP phase
Acceptance Phase (AP):
- For i = 1 to size of the AP test suite:
  - Execute the i-th test
  - If an error occurs, fix it
- Record the code and effort data for the AP phase

TDD
Development Phase (DP):
- Design:
  - Find domain objects from the use case diagram
  - Attach the desired functionality to these objects and construct an initial class diagram
- Test-before-coding:
  - Repeat the following until the desired functionality is coded:
    - Select a class
    - Write a functional test for a method of the class
    - Do just enough coding to make the test pass
    - Refactor the code (and tests) if necessary
  - Record the code and effort data
Acceptance Phase (AP):
- For i = 1 to size of the AP test suite:
  - Execute the i-th test
  - If an error occurs, write a test (or modify a previously written test) that would reveal that bug, and enter just enough code to fix it
  - Refactor the code (and tests) if required
- Record the code and effort data for the AP phase
Measurements

CCD
Development Phase (DP):
- Coding effort (person-hours)
- Number of unit tests executed
- Unit testing effort (person-hours)
- Size of the program code (UCLOC)
Acceptance Phase (AP):
- Number of bugs from the development phase
- Time taken to correct the reported bugs (person-hours)
- Size of the program code, final (UCLOC)

TDD
Development Phase (DP):
- Coding effort (person-hours)
- Number of unit test cases written
- Size of the program code (UCLOC)
- Size of the test code (UCLOC)
Acceptance Phase (AP):
- Number of bugs from the development phase
- Time taken to correct the reported bugs (person-hours)
- Number of test cases written, final
- Size of the program code, final (UCLOC)
- Size of the test code, final (UCLOC)
Measurements
- Code quality (CQ): percentage of acceptance test cases passed by the developed program
- Development effort (DE): effort applied in DP + effort applied in AP (in person-hours)
- Developer's productivity (PP): delivered NCLOC per person-hour
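As a worked sketch with assumed numbers (illustrative only, not the experiment's data):

  CQ = 46 passed / 50 acceptance tests = 92%
  DE = 24 person-hours (DP) + 6 person-hours (AP) = 30 person-hours
  PP = 1200 NCLOC / 30 person-hours = 40 NCLOC per person-hour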
Additional Measures
- Initial design effort
- Testing effort (TE): testing effort applied in DP + testing effort applied in AP (in person-hours)
  - TE_TDD in DP = (test code size / (test code size + program code size)) x coding time in DP
  - TE_CCD in DP = recorded by the subjects
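For instance, with assumed numbers (illustrative only, not the experiment's data), the apportioning rule for TDD gives:

  TE_TDD in DP = (400 / (400 + 800)) x 12 person-hours = 4 person-hours

where 400 and 800 are the test-code and program-code sizes in LOC, and 12 person-hours is the coding time recorded in DP.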
Results – Statistical Tests

SRS
Response variable          Hypothesis             p-value  Result
Code quality               H01: CQ_TDD = CQ_CCD   0.001    CQ_TDD > CQ_CCD
Development effort         H02: DE_TDD = DE_CCD   0.0207   DE_TDD < DE_CCD
Developer's productivity   H03: PP_TDD = PP_CCD   0.21     Not rejected

ATM
Response variable          Hypothesis             p-value  Result
Code quality               H01: CQ_TDD = CQ_CCD   0.173    Not rejected
Development effort         H02: DE_TDD = DE_CCD   0.334    Not rejected
Developer's productivity   H03: PP_TDD = PP_CCD   0.999    Not rejected
Results – SRS

[Four box-and-whisker plots (min, 25th percentile, median, 75th percentile, max), CCD vs. TDD, reproduced here only as captions:]
- Code quality [% of tests passed in AP], axis range roughly 74-98
- Overall development effort [# of person-hours], axis range roughly 15-55
- Developer's productivity [NCLOC/hour], axis range roughly 30-90
- Initial design effort – SRS [# of person-hours], axis range roughly 0-9
Results – ATM

[Four box-and-whisker plots (min, 25th percentile, median, 75th percentile, max), CCD vs. TDD, reproduced here only as captions:]
- Code quality [% of tests passed in AP], axis range roughly 70-98
- Overall development effort [# of person-hours], axis range roughly 24-52
- Developer's productivity [NCLOC/person-hour], axis range roughly 25-75
- Initial design effort – ATM [# of person-hours], axis range roughly 0-12
Results – Testing Effort Applied

[Four box-and-whisker plots (min, 25th percentile, median, 75th percentile, max), each showing DP, AP, and total testing effort in person-hours, reproduced here only as captions:]
- Testing effort [ATM] – CCD, axis range roughly 0-14
- Testing effort [SRS] – TDD, axis range roughly 0-12
- Testing effort [SRS] – CCD, axis range roughly 0-20
- Testing effort [ATM] – TDD, axis range roughly 0-20
Result of Qualitative Analysis

Questionnaire responses:

Aspect                                    TDD      CCD
Ease of use                               64.706   70.59
Confidence in completeness of testing     82.353   47.06
Better debugging effort                   70.588   70.59
Adherence to the followed approach        70.588   88.24
More training needed                      47.059   23.53
Confidence about the design               47.059   82.35
Better approach for program development   17.647   29.41

52.94% of the subjects favored a mixed approach (TDD + CCD)
Conclusions (TDD vs. CCD)
- Reduced development time
- Improved developer's productivity
- Code quality is affected by the testing effort applied in the development style
- A combination may work better (?)
Threats to Validity
- Subject experience
- Data collection process
- Plagiarism
- Large variations in the results
Further Work
- Further validation
- Industrial studies
- Assessing the quality of the designs resulting from applying TDD
- The issue of change management
References
- Basili VR, Selby RW. Comparing the effectiveness of software testing strategies. IEEE Transactions on Software Engineering, 13(12):1278-1296, 1987.
- Basili VR, Green S, Laitenberger O, Lanubile F, Shull F, Sorumgard S, Zelkowitz MV. The empirical investigation of perspective-based reading. Empirical Software Engineering, 1(2):133-164, 1996.
- Scanlan DA. Structured flowcharts outperform pseudocode: An experimental comparison. IEEE Software, pp. 28-36, September 1989.
- Briand LC, Bunse C, Daly JW, Differding C. An experimental comparison of the maintainability of object-oriented and structured design documents. Empirical Software Engineering, 2(2):291-312, 1997.
- Porter A, Votta LG, Basili V. Comparing detection methods for software requirements inspections: A replicated experiment. IEEE Transactions on Software Engineering, 21(6):563-575, 1995.
- Gupta A, Jalote P. Comparing control flow based coverage criteria based on seeded faults. In 12th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS '06), pp. 365-378, Austria, 2006. Springer.
- Gupta A, Jalote P. An experimental evaluation of the effectiveness and efficiency of the test driven development. In 1st International Symposium on Empirical Software Engineering and Measurement (ESEM '07), pp. 285-294, Madrid, Spain, 2007. IEEE Computer Society.
Final Comments
- It is not the case that one kind of empirical research is inherently better than another!
- Plan and use a combination of empirical research methods
- Avoid anything more complex than you understand
- Get statistical advice
- Experimental investigations should be exploratory in nature