frank xu, ph.d. gannon university mining decision trees as test oracles for java bytecode xu, w.,...

Frank Xu, Ph.D.
Gannon University
Mining Decision Trees as Test Oracles for Java Bytecode

Xu, W., Ding, T., Wang, H., Xu, D., Mining Test Oracles for Test Inputs Generated from Java Bytecode, Proc. of the 37th Annual International Computer Software & Applications Conference, pp. 27-32, Kyoto, Japan, July 2013
Mining Decision Trees as Test Oracles for Java Bytecode (extended version of conference paper), accepted by Journal of Systems and Software

Slide 2
Bio
Frank Xu
Education
- Ph.D. in Software Engineering, North Dakota State University
- M.S. in Computer Science, Towson University
- B.S. in Computer Science, Southeast Missouri State University
Working Experience
- GE Transportation, 2008-present, Consultant of Locomotive Remote Diagnostics Service Center
- Gannon University, 2008-present, Assistant Professor of Software Engineering, Director of Keystone Software Development Institute
- University VA Wise, 2007-2008, Assistant Professor of Software Engineering
- Swanson Health Products, 2005-2007, Sr. Programmer Analyst
- Volt Information Science Inc., 2004-2005, Software Engineer

Slide 3
Teaching
Source: Student Evaluation Report

Slide 4
Research
Source: Google scholar: http://scholar.google.com/citations?user=9_I4ZUgAAAAJ&hl=en

Slide 5
Mining Decision Trees as Test Oracles
INTRODUCTION
RUNNING EXAMPLE
TEST INPUT GENERATION
MODEL MINER
EMPIRICAL STUDY
RELATED WORK
CONCLUSIONS

Slide 6
INTRODUCTION

Slide 7
Exercise
Implementing a method to solve the Triangle problem

Slide 8
What is the Triangle Problem?

Slide 9
How to test Triangle?
String getTriangleType (int a, int b, int c){ if((a=b+c

Slide 21
Apply Rules to a Predicate Tree for Generating Test Inputs
For a given seed value, we adjust the value to guide the execution path based on rules
10 4 7

Slide 22
Predicate | Expected Evaluation Outcomes | Advising Rules
i0 > i1 | (i0 > i1) = true | (i0 , i1) (i0, i1 )
 | (i0 > i1) = false | (i0 ,b) (i0, i1 )
i0 == i1 | (i0 == i1) = true | (i0 D, i1) (i0, i1 D)
 | (i0 == i1) = false | (i0 ,i1) (i0, i1 ) (i0 ,i1) (i0, i1 )
i2 = i0 + i1 | i2 | (i0 , i1) (i0, i1 )
 | i2 | (i0 , i1) (i0, i1 )
i2 = i0 - i1 | i2 | (i0 , i1) (i0, i1 )
 | i2 | (i0 , i1) (i0, i1 )
i2 = i0 * i1 (i0>0, i1>0) | i2 | (i0 , i1) (i0, i1 )
 | i2 | (i0 , i1) (i0, i1 )
i2 = i0 / i1 (i0>0, i0 > 0) | i2 | (i0 , i1) (i0, i1 )
 | i2 | (i0- i1) (i0, i1 )
..
s0 > s1 | (s0 > s1) = true | (s0[k] , s1) (s0, s1[l] )
 | (s0 > s1) = false | (s0[k] ,s2) (s0, s1[l] )

Slide 23
MODEL MINER
ID a b c
1777 21173 38 19 41173 522 9 6304330 72213 8 1620 1333 45 14523052 153147 16272822

Slide 24
Jimple Predicates and Attributes of Triangle Program
Jimple Predicate | Attribute of UUT
i0 >= $i3 | a >= b + c
i1 >= $i4 | b >= a + c
i2 >= $i5 | c >= a + b
i0 != i1 | a != b
i1 != i2 | b != c
i0 == i1 | a = b
i0 == i2 | a = c
i1 == i2 | b = c
For a given test input generated by the rule-based method, the predicates produce a set of T or F values
Input (a=7, b=7, c=7): f f f f f t t t

Slide 25
Convert Test Inputs Using Attributes
Test input ID | a b c | a1 a2 a3 a4 a5 a6 a7 a8 | O
1 | 777 | fffffttt | 1
2 | 1173 | ftfttfff | 4
3 | 8 19 | fftttfff | 4
4 | 1173 | tffttfff | 4
13 | 33 45 | fffttftf | 2
14 | 523052 | fffttfft | 2
15 | 3147 | ffftffff | 3
16 | 272822 | tffttfff | 4

Slide 26
C4.5 mining algorithm
The key idea of the algorithm is to find the attribute with the highest normalized information gain and then build a decision node that splits on that attribute
Tool: Weka 3: http://www.cs.waikato.ac.nz/ml/weka/
ID | a b c | a1 a2 a3 a4 a5 a6 a7 a8 | o
1 | 777 | fffffttt | 1
2 | 1173 | ftfttfff | 4
3 | 8 19 | fftttfff | 4
4 | 1173 | tffttfff | 4
13 | 33 45 | fffttftf | 2
14 | 523052 | fffttfft | 2
15 | 3147 | ffftffff | 3
16 | 272822 | tffttfff | 4

Slide 27
EMPIRICAL STUDY

Slide 28
THREE STUDY SUBJECTS
Columns: Line of Code (Java, Jimple); Number of Predicates (Java, Jimple (allow duplications)); Attributes (no duplication)
Triangle 2227388
Next Date 4851919
Vending Machine 8268214110

Slide 29
GOAL OF EMPIRICAL STUDIES
Measure fault detection capability
Fault detection capability = (# mutants killed / # mutants) * 100%

Slide 30
Measure fault detection capability: Process
Step 1: Implant mutants
Step 2: Build a decision tree model
Step 3: Find mismatches, find possible causes
Step 4: Calculate fault detectability
Mutation Operator Examples
Category | ID | Type | Original | Replaced
Arithmetic Operations | 1 | Arithmetic Operator Replacement | a + b | a - b
 | 2 | Arithmetic Operator Insertion | b + c | -b + c
Relations | 3 | Relational Operator Replacement | a != b | a == b
Conditions | 4 | Conditional Operator Replacement | (a==b) && (b==c) | (a==b) || (b==c)
Constants | 5 | Constant Value Modification | s = a | s = b
Return Values | 6 | Return Value Modification | return s | return s
Insert bug, Faulty version, Find mismatches
Two possible causes
- Found bugs
  assertEquals(Equilateral, new Triangle(7,7,7).getTriType())
- Model is not correct
  assertEquals(Isosceles, new Triangle(7,7,7).getTriType())

Slide 31
Columns: ID; # of Mutants; # of Tests Executed; Oracle Results; # Mutants Discovered; # Faults in Models
S D U S D U S D U
Triangle Problem: 13444333000 2416 333111 33555333000 41666111000 53666333000 63444333000
Next Date Problem: 131315 233100 262737 466200 34221924434010 43261827323010 52666222000
Vending Machine: 2330 333000 3445 444000 5439 444000 6441 444000
Total 508861428

Slide 32
RELATED WORK
Lo et al. (Lo, Cheng, Han, Khoo, & Sun, 2009) and Milea et al. (Milea, Khoo, Lo, & Pop, 2012) mine a set of discriminative features capturing repetitive series of events from program execution traces. These features are then used to train a classifier to detect failures.
Bowring et al. (Bowring, Rehg, & Harrold, 2004) model program executions as Markov models and use a clustering method for Markov models that aggregates multiple program executions into effective behavior classifiers.
Pacheco and Ernst (Pacheco & Ernst, 2005) build an operational model from observations of the software running properly. The operational model includes object invariants and properties.
The object invariants are the conditions that hold on entry and exit of all public methods.
Our approach generates and classifies inputs based on the internal structure of the UUT.
Briand (Briand, 2008) has proposed the use of machine learning techniques, including decision trees, for the test oracle problem. The decision tree model he proposed is built manually from software requirements.

Slide 33
CONCLUSIONS
- The first attempt to mine decision tree models from auto-generated test inputs based on static analysis of Java bytecode
- Our empirical study indicates that, using the mined test oracles, an average of 94.67% of mutants are killed by the generated test inputs.

Slide 34
Thanks

Slide 35
Future Research Direction
Requirements Engineering & Natural Language Processing
- Generating UML diagrams, e.g., use case and class diagrams
- Validating SRS
- Deriving test cases from SRS
Software Design & Social Networks Analysis
- Utilizing SSA for analyzing communication diagrams, class diagrams, and sequence diagrams to improve the quality of the software
Software Implementation & Big Data
- Mining repositories for software quality assurance using Hadoop
Software Testing & Mobile/Cloud Application
- Testing mobile applications and distributed applications

Slide 36
Build Variable Dependency Tree (VDT)
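
To illustrate the attribute conversion described on Slides 24 and 25, the following is a minimal sketch, not the authors' tool. It evaluates the eight Triangle attributes listed on Slide 24 for one test input and prints the resulting t/f vector. The class name `TriangleAttributes` and the method `encode` are hypothetical names chosen for this example only.

```java
// Hypothetical helper: the class and method names are assumptions for illustration.
final class TriangleAttributes {

    // Evaluates attributes a1..a8 in the order listed on Slide 24
    // and returns them as a string of 't' and 'f' characters.
    static String encode(int a, int b, int c) {
        boolean[] attributes = {
            a >= b + c, // a1
            b >= a + c, // a2
            c >= a + b, // a3
            a != b,     // a4
            b != c,     // a5
            a == b,     // a6
            a == c,     // a7
            b == c      // a8
        };
        StringBuilder vector = new StringBuilder();
        for (boolean value : attributes) {
            vector.append(value ? 't' : 'f');
        }
        return vector.toString();
    }

    public static void main(String[] args) {
        // The input (a=7, b=7, c=7) from Slide 24 yields f f f f f t t t.
        System.out.println(encode(7, 7, 7));
    }
}
```

Each encoded vector, paired with the observed output of the program under test, would form one training row for a decision tree learner such as the C4.5 implementation in Weka.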
