The deviance problem in effort estimation, and defect prediction, and software engineering (tim@menzies.us, PROMISE-2)
![Page 1: 1 The deviance problem in effort estimation tim@menzies.us PROMISE-2 And defect prediction And software engineering](https://reader036.vdocuments.site/reader036/viewer/2022062422/56649f1b5503460f94c307d6/html5/thumbnails/1.jpg)
1
The deviance problem in effort estimation
And defect prediction
And software engineering
2
Variance: confusing prior results?
Software effort estimation
– Jorgensen: most effort estimation is “expert-based”, so “model-based” estimation is a waste of time
– Model- vs expert-based studies: 5 better, 5 even, 5 worse
Software defect prediction
– Shepperd & Ince: static code measures are uninformative for software quality
– Dumb LOC vs McCabe studies: 6 better, 6 even, 6 worse
I smell a rat
Selecting Best Practices for Effort Estimation - Menzies, Chen, Hihn, Lum. TSE 200X
Data Mining Static Code Attributes to Learn Defect Predictors - Menzies, Greenwald, Frank. TSE 200X
3
What you never want to hear…
“This isn't right. This isn't even wrong.” – Wolfgang Pauli
4
Standard disclaimer
An excessive focus on empiricism …
– … stunts the development of novel, pre-experimental speculations
But currently:
– there is no danger of an excess of empiricism in SE
– SE = a field flooded by pre-experimental speculations
5
Sample experiments
Public domain data
Don’t test using your training data
– N-way cross val
– M * randomize order
Straw man
Feature subset selection
Thou shalt script
– you will run it again
Study mean and variance over M * N
Defect predictions
defect prediction
effort estimation
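The checklist above (M randomized orderings, N-way cross-validation, a straw man, and a study of mean and variance over all M * N results) can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the talk: the function names, the median-guessing straw man, and the toy data are all assumptions made for the example.

```python
import random
import statistics

def m_by_n_cross_val(rows, m, n, evaluate, seed=1):
    """Repeat an N-way cross-validation M times, reshuffling the rows each time.

    `evaluate(train, test)` is a caller-supplied function returning one score.
    Returns the list of M * N scores.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(m):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        for fold in range(n):
            # Fold k tests on every n-th row (offset k) and trains on the rest,
            # so the learner is never tested on its own training data.
            test = shuffled[fold::n]
            train = [r for i, r in enumerate(shuffled) if i % n != fold]
            scores.append(evaluate(train, test))
    return scores

def straw_man(train, test):
    """Hypothetical straw man: always guess the median training target.

    Scored by mean absolute error on the test rows.
    """
    guess = statistics.median(y for _, y in train)
    return statistics.mean(abs(y - guess) for _, y in test)

# Toy data invented for this example: (feature, target) pairs.
rows = [(x, 2.0 * x + 1.0) for x in range(50)]
scores = m_by_n_cross_val(rows, m=10, n=10, evaluate=straw_man)
print("runs:", len(scores))
print("mean:", statistics.mean(scores), "stdev:", statistics.stdev(scores))
```

Any real learner can be passed in place of `straw_man`; reporting the spread alongside the mean is the point of the exercise.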
6
Data summation: K.I.S.S.
Combine PD/PF
Compute & sort combined performance deltas, method A vs all others
Summarize as quartiles
400,000 runs
– Nb = naïve Bayes
– J48 = entropy-based decision tree learner
– oneR = straw man
– logNums = log the numerics
Massive FSS
Singletons, including LOC, not enough
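The summation steps above (combine PD and PF, take the deltas of method A against all others, sort, report quartiles) might look like the sketch below. The slide does not say how PD and PF are combined, so the `balance` function here is an assumption: one minus the normalized distance to the ideal point (PD = 1, PF = 0). The function names are hypothetical.

```python
import math
import statistics

def balance(pd, pf):
    """Assumed PD/PF combination: 1 minus the normalized distance to (pd=1, pf=0)."""
    return 1 - math.sqrt((1 - pd) ** 2 + (0 - pf) ** 2) / math.sqrt(2)

def delta_quartiles(runs_a, runs_others):
    """Sort the deltas of method A's scores against the other methods' scores.

    Returns (min, q1, median, q3, max) of the sorted deltas.
    """
    deltas = sorted(a - b for a in runs_a for b in runs_others)
    q1, median, q3 = statistics.quantiles(deltas, n=4)
    return deltas[0], q1, median, q3, deltas[-1]

# Invented (pd, pf) pairs, for illustration only.
method_a = [balance(pd, pf) for pd, pf in [(0.7, 0.2), (0.8, 0.3), (0.75, 0.25)]]
others = [balance(pd, pf) for pd, pf in [(0.5, 0.1), (0.9, 0.6), (0.6, 0.3)]]
print(delta_quartiles(method_a, others))
```

A method whose delta quartiles sit mostly above zero is doing better than its rivals on the combined measure.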
7
Variance: confusing prior results?
Software effort estimation
– Jorgensen: most effort estimation is “expert-based”, so “model-based” estimation is a waste of time
– Model- vs expert-based studies: 5 better, 5 even, 5 worse
Software defect prediction
– Shepperd & Ince: static code measures are uninformative for software quality
– Dumb LOC vs McCabe studies: 6 better, 6 even, 6 worse
I smell a rat
8
Sources of variance
Software effort estimation
30 * { shuffle,
       test = data[1..10],
       train = data - test,
       <a,b> = LC(train),
       MRE = Estimate(a,b,test) }
Software defect prediction
10 * { randomly select 90% of data,
       score each attribute via “INFOGAIN” }
Numerous candidates for “most informative” attributes
Large deviations confuse comparisons of competing methods
Can be reduced by FSS
Target class: continuous
Target class: discrete
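The effort-estimation loop above can be made concrete as follows. The slide does not define LC, so this sketch assumes it fits effort = a * size^b by least squares in log space; `lc`, `mean_mre`, and `repeated_holdout` are hypothetical names, and the data is synthetic.

```python
import math
import random
import statistics

def lc(train):
    """Assumed stand-in for LC: fit log(effort) = log(a) + b * log(size) by least squares."""
    xs = [math.log(size) for size, _ in train]
    ys = [math.log(effort) for _, effort in train]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = math.exp(my - b * mx)
    return a, b

def mean_mre(a, b, test):
    """Mean magnitude of relative error of the estimate a * size**b on the test rows."""
    return statistics.mean(abs(effort - a * size ** b) / effort for size, effort in test)

def repeated_holdout(data, repeats=30, test_size=10, seed=1):
    """30 * { shuffle; test = first 10 rows; train = the rest; fit; score }."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        rows = data[:]
        rng.shuffle(rows)
        test, train = rows[:test_size], rows[test_size:]
        a, b = lc(train)
        results.append((a, b, mean_mre(a, b, test)))
    return results

# Synthetic, noise-free data invented for the example: (size, effort) pairs.
data = [(float(s), 3.0 * s ** 1.1) for s in range(1, 41)]
results = repeated_holdout(data)
print("spread of a:", statistics.stdev(r[0] for r in results))
print("spread of b:", statistics.stdev(r[1] for r in results))
```

On real project data the learned <a,b> pairs and the MRE values differ from shuffle to shuffle, and that spread is the deviance the slide refers to.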
9
What is Feature Subset Selection?
PCA worse (empirically)
INFOGAIN fastest
– Useful for defect detection, e.g. 10,000 modules in defect logs
WRAPPER slowest
– Performs best
– Practical for effort estimation, e.g. dozens of past projects in company databases
Turned blue to green
a = 10.1 + 0.3x + 0.9y - 1.2z
1) “wiggle” in x,y,z causes “wiggle” in “a”
2) Removing x,y,z reduces “wiggle” in “a”
3) But can damage mean performance
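The “wiggle” argument can be stated as a variance calculation. The lines below are a worked sketch that assumes x, y and z vary independently of one another; the slide itself does not state that assumption.

```latex
\begin{aligned}
a &= 10.1 + 0.3x + 0.9y - 1.2z \\
\operatorname{Var}(a) &= 0.3^2\operatorname{Var}(x) + 0.9^2\operatorname{Var}(y) + 1.2^2\operatorname{Var}(z) \\
 &= 0.09\operatorname{Var}(x) + 0.81\operatorname{Var}(y) + 1.44\operatorname{Var}(z)
\end{aligned}
```

Under that assumption, dropping z from the model removes the 1.44 Var(z) term, the largest contribution to the wiggle in a, but it also discards whatever z explained about the target, which is why point 3 warns about mean performance.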
10
Warning: no single “best” theory
defect prediction
effort estimation
11
Committee-based learning
Ensemble-based learning
– bagging, boosting, stacking, etc.
Conclusions by voting across a committee
– 10 identical experts are a waste of money
– 10 slightly different experts can offer different insights into a problem
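One way to get “slightly different experts” is bagging: train each committee member on a different bootstrap sample, then take a majority vote. The sketch below assumes a one-dimensional nearest-neighbor learner as the expert, chosen only to keep the example short; all names and data are illustrative.

```python
import random
from collections import Counter

def nearest_neighbor(train):
    """Return a classifier that labels x with the class of its closest training row."""
    def classify(x):
        return min(train, key=lambda row: abs(row[0] - x))[1]
    return classify

def bagged_committee(rows, size=11, seed=1):
    """Build `size` experts, each trained on its own bootstrap sample of the rows."""
    rng = random.Random(seed)
    return [nearest_neighbor([rng.choice(rows) for _ in rows]) for _ in range(size)]

def majority_vote(committee, x):
    """Ask every expert for a label and return the most common answer."""
    votes = Counter(member(x) for member in committee)
    return votes.most_common(1)[0][0]

# Toy data invented for the example: (value, label) pairs.
rows = [(v, "low") for v in range(10)] + [(v, "high") for v in range(20, 30)]
committee = bagged_committee(rows)
print(majority_vote(committee, 2), majority_vote(committee, 27))
```

Training every member on the same data would produce identical experts and identical votes, which is the “waste of money” case on the slide.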
12
Using committees
Classification ensembles:
– “Majority vote does as good as anything else” - Tom Dietterich
Numeric prediction ensembles
– Can use other measures: “heuristic rejection rules”
– Theorists: “gasp horror”
– Seasoned cost-estimation practitioners: “of course”
Standard statistics failing. T-tests report that none of these are “worse”
13
• For any pair of treatments,
• If one is “worse”
• Vote it off
• Repeat till none “worse”
survivors
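The vote-off procedure can be written as a short elimination loop. The slides do not spell out the “worse” test, so it is passed in as a function; `median_worse` below is a hypothetical rejection rule used only to make the example run.

```python
import statistics

def survivors(treatments, is_worse):
    """Repeatedly vote off any treatment judged worse than another, until none is.

    `treatments` maps a name to its list of error scores;
    `is_worse(a_scores, b_scores)` is the caller's rejection rule.
    """
    alive = dict(treatments)
    changed = True
    while changed:
        changed = False
        for a in list(alive):
            if any(b != a and is_worse(alive[a], alive[b]) for b in alive):
                del alive[a]       # vote it off, then rescan the remaining pairs
                changed = True
                break
    return sorted(alive)

def median_worse(errors_a, errors_b, margin=0.1):
    """Hypothetical rule: a is worse if its median error exceeds b's by more than margin."""
    return statistics.median(errors_a) > statistics.median(errors_b) + margin

# Error scores invented for the example.
treatments = {
    "A": [0.20, 0.30, 0.25],
    "B": [0.80, 0.90, 0.70],
    "C": [0.28, 0.30, 0.22],
}
print(survivors(treatments, median_worse))
```

Here B is voted off while A and C both survive, because neither is worse than the other by the chosen rule; ending with more than one survivor is expected.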
14
So…
So, those M*N-way cross-vals
– Time to use them.
New research area
– Automatic model selection methods are now required
– Data fusion in biometrics
The technical problem is not the challenge
– Issues with explanation and expectation
15
Why so many unverified ideas in software engineering?
Humans use language to mark territory
– Repeated effect: linguistic drift
Villages, separated by just a few miles, evolve different dialects
– Language choice = who you talk to, what tools you buy
US vs THEM: SUN built JAVA as a weapon against Microsoft
– Result: never-ending stream of new language systems
Vendors want to sell new tools, not assess them.
16
But, the tide is turning
Text mining of NFRs, traceability:
– IEEE RE’06 (Minneapolis, 2006): The Detection and Classification of Non-Functional Requirements – Cleland-Huang, Settimi, Zou, Solc
– IEEE TSE Jan, 2006, p 4-19: Advancing Candidate Link Generation for Requirements Tracing – Hayes, Dekhtyar, Sundaram
Software effort estimation
– IEEE TSE 200?: Selecting Best Practices for Effort Estimation – Menzies, Chen, Hihn, Lum
Software defect prediction
– IEEE TSE 200?: Data Mining Static Code Attributes to Learn Defect Predictors – Menzies, Greenwald, Frank
Yes Timmy, senior forums endorse empirical rigor
Best paper