On the application of GP for software engineering predictive modeling: A systematic review Expert systems with Applications, Vol. 38 no. 9, 2011 Wasif Afzal, Richard Torkar Blekinge Institute of Technology, Karlskrona, Sweden. {waf,rto}@bth.se

On the application of GP for software engineering predictive modeling: A systematic review

Expert systems with Applications, Vol. 38 no. 9, 2011

Wasif Afzal, Richard Torkar

Blekinge Institute of Technology,

Karlskrona, Sweden.

{waf,rto}@bth.se

Agenda

• Research question

• Symbolic regression

• Prediction and estimation in sw engineering

• GP for prediction and estimation in sw engineering

• Application of GP for sw quality classification

• Application of GP for sw cost/effort/size estimation

• Application of GP for sw fault prediction and sw reliability growth modeling

• Future work

• Conclusions

• Recommendations

Our research question

• Is there evidence that symbolic regression using GP is an effective method for prediction and estimation, in comparison with regression, machine learning and other models (including expert opinion and different improvements over the standard GP algorithm)?

It is about symbolic regression!

• Symbolic regression

– One of the many application areas of GP

– Finds a function whose outputs match the desired outcomes.

– Makes no assumptions about:

• Structure of the function

• Data distribution

• Relationship between independent and dependent variables

• Helps in identifying the significant variables in subsequent modeling attempts
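
To make the idea concrete, here is a minimal sketch of symbolic regression with GP. It is not the setup of any reviewed study: the function set, terminal set, size cap, mutation-only variation (crossover is omitted for brevity), and all parameter values are assumptions chosen to keep the example short. Candidate models are expression trees, and fitness is the root mean square error on the sample points.

```python
import math
import operator
import random

# Assumed function and terminal sets for this sketch (not from the slides).
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]
MAX_NODES = 31  # crude size cap to limit bloat


def random_tree(rng, depth):
    """Grow a random expression tree: a terminal or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Evaluate a tree for one value of the independent variable x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def size(tree):
    """Number of nodes in a tree."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1


def fitness(tree, data):
    """Root mean square error over (x, y) pairs; lower is better."""
    total = sum((evaluate(tree, x) - y) ** 2 for x, y in data)
    return math.sqrt(total / len(data))


def mutate(rng, tree):
    """Replace one randomly chosen subtree with a freshly grown one."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, 2)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(rng, left), right)
    return (op, left, mutate(rng, right))


def evolve(data, generations=30, pop_size=60, seed=0):
    """Mutation-only evolution with tournament selection and elitism."""
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(pop_size)]
    for _ in range(generations):
        best = min(population, key=lambda t: fitness(t, data))
        next_gen = [best]  # elitism: never lose the best model so far
        while len(next_gen) < pop_size:
            parent = min(rng.sample(population, 3), key=lambda t: fitness(t, data))
            child = mutate(rng, parent)
            next_gen.append(child if size(child) <= MAX_NODES else parent)
        population = next_gen
    return min(population, key=lambda t: fitness(t, data))


if __name__ == "__main__":
    # Toy target y = x^2 + x + 1; GP is not told the functional form.
    samples = [(x, x * x + x + 1.0) for x in range(-3, 4)]
    model = evolve(samples)
    print(model, fitness(model, samples))
```

Note that the search works on the structure of the expression itself, which is why no functional form has to be assumed in advance.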

Prediction and estimation in sw engineering

• Software quality

– Software quality classification

– Software fault prediction

– Software reliability growth modeling

• Software size

• Software development cost/effort

• Maintenance task effort

• Software release timing

GP for prediction and estimation in sw engineering

• 23 identified primary studies

– Software quality classification (8)

– Software cost/effort/size estimation (7)

– Software fault prediction and software reliability growth modeling (8)

GP for prediction and estimation in sw engineering cntd…

Application of GP for sw quality classification (8 studies)

• Variations of the dependent variable:

– Fault proneness

– Quality ranking of program modules (high risk to low risk)

• Variations in sampling of training and testing sets:

– Simple hold-out and 10-fold CV.
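
The two sampling strategies named above can be sketched as follows. This is an illustrative implementation using only the standard library; the function names, the 30% test fraction, and the fixed seeds are assumptions, not settings reported by the studies.

```python
import random


def hold_out_split(items, test_fraction=0.3, seed=0):
    """Shuffle once and split into a single (train, test) pair."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_splits(items, k=10, seed=0):
    """Yield k (train, test) pairs; each item is tested exactly once."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, folds[i]


if __name__ == "__main__":
    modules = list(range(50))  # stand-ins for program modules
    train, test = hold_out_split(modules)
    print(len(train), len(test))
    for train, test in k_fold_splits(modules, k=10):
        print(len(train), len(test))
```

Hold-out evaluates a model on one split only, whereas 10-fold CV averages over ten splits, so results obtained with the two strategies are not directly comparable.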

Application of GP for sw quality classification cntd…

• Variations in fitness function

– Single objective

• Minimization of root mean square

• Minimization of average cost of misclassification

– Multi-objective

• Minimization of average cost of misclassification + minimization of tree size

• Maximization of the best percentage of the actual faults averaged over the percentiles level of interest + controlling the tree size.

• Balancing the over sampling and under sampling in each class for a decision tree.
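
As a rough illustration of the fitness functions listed above, the sketch below assumes that "root mean square" means the root mean square error, that the average cost of misclassification is the cost-weighted count of Type I and Type II errors divided by the number of modules, and that the two objectives are combined by comparing cost first and tree size second. The slides do not give these formulas, and the cost values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


def average_misclassification_cost(actual, predicted, cost_type1=1.0, cost_type2=10.0):
    """Cost-weighted misclassifications per module (True = fault-prone).

    Type I: a not fault-prone module flagged as fault-prone.
    Type II: a fault-prone module missed. Cost values are assumptions.
    """
    type1 = sum(1 for a, p in zip(actual, predicted) if not a and p)
    type2 = sum(1 for a, p in zip(actual, predicted) if a and not p)
    return (cost_type1 * type1 + cost_type2 * type2) / len(actual)


def multi_objective_key(cost, tree_size):
    """Sort key: lower cost wins, tree size breaks ties (an assumed scheme)."""
    return (cost, tree_size)


if __name__ == "__main__":
    actual = [True, True, False, False, False]
    predicted = [True, False, True, False, False]
    cost = average_misclassification_cost(actual, predicted)
    # Hypothetical candidates: (name, cost, tree size)
    candidates = [("a", cost, 15), ("b", cost, 9), ("c", 3.0, 3)]
    best = min(candidates, key=lambda c: multi_objective_key(c[1], c[2]))
    print(cost, best[0])
```

With this ordering, tree size only matters between models of equal cost; a weighted sum or a Pareto ranking would trade the two objectives off differently.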

Application of GP for sw quality classification cntd…

• Variations in comparison groups:

– Neural networks

– k-nearest neighbour

– Regression (linear, logistic)

– Humans

Application of GP for sw quality classification cntd…

• Results:

– The majority of the studies (6 out of 8) reported results in favor of using GP for the classification task.

• Limitations:

– Increase the comparisons with a more representative set of techniques.

– Increase the use of publicly available data sets for easier replications.

Application of GP for sw quality classification cntd…

• Encouraging aspects:

– The datasets used represent real-world projects.

– Problem dependent objectives represented in fitness functions perform better than standard GP.

Application of GP for sw cost/effort/size (CES) estimation (7 studies)

• Variations of the dependent variable

– Software effort

– Software cost

– Software size

• Variations in fitness function

– Single objective

• Minimization of mean squared error or MMRE
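
For reference, the two error measures named above can be computed as in the sketch below. MMRE is taken here in its usual sense, the mean of |actual − predicted| / actual over all projects; the effort figures in the usage lines are made-up illustration values, not data from the reviewed studies.

```python
def mean_squared_error(actual, predicted):
    """Mean of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mmre(actual, predicted):
    """Mean magnitude of relative error; actual values must be non-zero."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    actual_effort = [100.0, 200.0]      # hypothetical person-hours
    predicted_effort = [110.0, 150.0]   # hypothetical model output
    print(mean_squared_error(actual_effort, predicted_effort))
    print(mmre(actual_effort, predicted_effort))
```

The two measures can rank models differently: MSE penalizes large absolute errors on big projects, while MMRE weights every project by its relative error.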

Application of GP for sw cost/effort/size (CES) estimation cntd…

• Variations in comparison groups

– ANN, nearest neighbour and different forms of regression.

• Variations in sampling of training and testing sets

– Simple hold-out.

Application of GP for sw cost/effort/size (CES) estimation cntd…

• Results

– No strong evidence of GP performing consistently on all evaluation measures used.

• Limitations

– Evaluation measures used are not standardized.

– Different hold-out samplings for train and test sets.

– Lack of statistical hypothesis testing.

– Lack of comparison groups.

Application of GP for sw fault prediction and sw reliability growth modeling (8 studies)

• Variations of the dependent variable

– SW fault prediction

– SW reliability growth modeling

• Variations in fitness function

– Single objective:

• Minimization of standard error

Application of GP for sw fault prediction and sw reliability growth modeling cntd …

• Variations in comparison groups

– Standard GP, Naive Bayes, traditional software reliability growth models.

• Variations in sampling of training and testing sets

– Hold-out and 10-fold CV

Application of GP for sw fault prediction and sw reliability growth modeling cntd …

• Results:

– 7 out of 8 studies favor the use of GP.

• Limitations:

– Poor representation of comparison groups

– Absence of a baseline to compare to.

Promising future work to undertake

• Multi-objective fitness evaluation (e.g. Minimization of standard error and maximization of correlation coefficient)

• Simplification of GP solutions to help interpretation of relationships between variables.

• Evaluation of techniques to minimize overfitting of GP solutions.
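
One possible way to handle the two objectives in the first bullet (lower standard error, higher correlation coefficient) is Pareto dominance, sketched below. This is a suggestion under assumptions rather than a method taken from the reviewed studies; the model names and scores are hypothetical.

```python
def dominates(a, b):
    """True if model a Pareto-dominates model b.

    Each model is (standard_error, correlation): the first value is
    minimized, the second maximized.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(models):
    """Return the names of models not dominated by any other model."""
    return [
        name
        for name, scores in models.items()
        if not any(dominates(other, scores) for other in models.values())
    ]


if __name__ == "__main__":
    # Hypothetical (standard error, correlation) scores for three GP models.
    scores = {"m1": (0.5, 0.9), "m2": (0.4, 0.8), "m3": (0.6, 0.7)}
    print(pareto_front(scores))
```

Models on the front represent different trade-offs between the two objectives, so choosing among them still requires a judgment about which objective matters more.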

Conclusions

• A total of 23 studies apply GP for predictive studies in sw engineering:

– sw quality classification (8)

– sw cost/effort/size estimation (7)

– sw fault prediction and sw reliability growth modeling (8)

• There is evidence in support of using GP for:

– sw quality classification

– sw fault prediction and sw reliability growth modeling

• but not for:

– sw cost/effort/size estimation.

Recommendations

• Use public data sets wherever possible.

• Apply commonly used sampling strategies.

• Use techniques to avoid overfitting in GP solutions.

• Report the settings of GP parameters.

• Compare the performances against a commonly used baseline.

• Use statistical experimental designs.