

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. SE-10, NO. 6, NOVEMBER 1984

On Required Element Testing

SIMEON C. NTAFOS, MEMBER, IEEE

Abstract-In this paper we introduce two classes of program testing strategies that consist of specifying a set of required elements for the program and then covering those elements with appropriate test inputs. In general, a required element has a structural and a functional component and is covered by a test case if the test case causes the features specified in the structural component to be executed under the conditions specified in the functional component. Data flow analysis is used to specify the structural component and data flow interactions are used as a basis for developing the functional component. The strategies are illustrated with examples and some experimental evaluations of their effectiveness are presented.

Index Terms-Data flow analysis, required element testing, testing strategies.

Manuscript received September 8, 1982; revised November 23, 1983. This work was supported in part by the National Science Foundation under Grant MCS-8003322. The author is with the Computer Science Program, University of Texas at Dallas, Richardson, TX 75080.

I. INTRODUCTION

PROGRAM testing is widely used to improve the reliability of software. It consists of generating a set of test cases according to some testing strategy and then checking the outputs produced by the test cases against the expected results. Most of the testing strategies that have been proposed can be classified into three categories: structural, black-box, and error-driven strategies. Structural strategies ask for varying degrees of coverage of the control structure of the program. They range from segment and branch testing to (exhaustive) path testing. A program segment is a sequence of consecutive statements that are always executed together. A set of test cases achieves segment testing if it causes each segment in the program to be executed at least once. Branch testing requires that each possible transfer of control in the program be exercised at least once, while path testing asks that all possible execution sequences in the program be tested. Path testing is usually impractical since even small programs can have a very large number of paths. Many variations of path testing have been proposed where the number of paths is reduced by grouping similar paths together and then testing a few representative paths from each class of similar paths (e.g., structured path testing [8]). The main shortcoming of structural testing is that tests are generated using the possibly incorrect code, and thus, certain types of errors, especially errors in the specifications, are hard to detect. Black-box strategies treat the program as a black box and concentrate on the input specification instead. Test case generation methods range from using random inputs (random testing) to identifying important test data for each of the functions that the program performs (as in functional testing [9]). Random testing has the advantage of simplicity and performs surprisingly well in many cases [2]. However, it is not effective for detecting certain types of errors (e.g., errors in the handling of special conditions). Strategies that carefully analyze the input specification are more effective but they may require effort and skill on the part of the user commensurate with that required to produce a correct program in the first place. Error-driven strategies use known errors to guide the generation of test cases. For example, in mutation analysis [1] a number of simple changes (mutants) are introduced into the program and test cases are developed so that they can detect the mutants. The validity of this approach is based on the existence of a strong correlation between the performance of a test set on the mutants and its performance on real errors (the coupling effect [1]). Disadvantages of mutation analysis include the potentially large number of runs needed and questions regarding the validity of its assumptions.

The ultimate goal of program testing would be to guarantee that the program is correct. Goodenough and Gerhart [4] proposed a set of conditions under which testing is equivalent to a proof of correctness. Although test sets that satisfy such conditions can be generated for certain classes of errors and programs [22], [23], in general, guaranteeing correctness is not a realistic goal for program testing [7]. Formal proofs of correctness are also impractical because of their complexity. Still, programs that have not been shown to be correct can be, and are, useful. Then, we take as the goal of program testing to improve the reliability of software at an acceptable cost and, thus, increase its usefulness. Each of the strategies discussed above emphasizes a different approach to testing, i.e., testing the control structure, the input specification, or for certain types of errors. It is felt that better strategies can be developed by combining features of these approaches. For example, Weyuker and Ostrand [21] proposed a strategy that combines structural and black-box testing. The input domain of the program is partitioned into path domains using the control structure of the program and also into subdomains that have distinct functional characteristics. Test cases are selected from each subdomain in the intersection of the two partitions.

Required element testing is a class of testing strategies that allow the various approaches to testing to be combined. In general, a required element testing strategy is a rule for specifying a set of required elements for a program. Required elements have the general form {S; F}, where S is the structural component and F is the functional component. S describes what structural features (i.e., segments, branches, paths) are to be executed. F describes (in the form of assertions) conditions that a test case that covers the required element must satisfy.



This definition is general and flexible enough to allow most testing strategies to be formulated in terms of required elements. Of course, this generality also makes it of little practical use. We arrive at a more concrete definition by looking at a method that uses data flow analysis to generate the required elements. In the next section we develop the needed background from data flow analysis and graph theory and discuss previous results. In Section III, we define two classes of required element testing strategies that use data flow analysis. In Section IV, we illustrate the strategies with examples and present the results of some experimental evaluations of their effectiveness.

s1:  DOUBLE PRECISION FUNCTION SIN(X,E)
s2:  DOUBLE PRECISION E,TERM,SUM
s3:  REAL X
s4:  TERM = X
s5:  DO 20 I=3,100,2
s6:  TERM = TERM*X**2/(I*(I-1))
s7:  IF(TERM.LT.E) GO TO 30
s8:  SUM = SUM + (-1**(I/2))*TERM
s9:  20 CONTINUE
s10: 30 SIN = SUM
s11: RETURN
s12: END

Fig. 1. Function SIN(X, E) computes the sine of X to accuracy E.

II. DATA FLOW ANALYSIS

Data flow analysis is a static analysis of the program that provides information about actions taken on the program variables [5] and effects of such actions at various points in the program. The actions that are of interest are: definitions (d), undefinitions (u), and references (r). A variable is defined in a statement if it is assigned a value in that statement. A variable is referenced in a statement if execution of the statement requires that a value for the variable be obtained. A variable is undefined when no value for it is available. For example, local variables become undefined upon exit from the current block (in block-structured languages), a loop index may be undefined upon exit from the loop (DO loops in Fortran), and all variables are considered undefined upon entry to the program.

An essential tool for data flow analysis is the control structure of the program, which is commonly modeled as a directed graph (digraph). A digraph G = (V, E) consists of a set of vertices V and a set of edges E. A vertex vi represents a program segment (i.e., sequential code with a single entry and a single exit), and an edge (vi, vj) represents a possible transfer of control from segment i to segment j. Usually, a vertex is taken to represent a maximal segment (i.e., a segment to which no statements can be added) but may also be associated with simple statements or parts of segments. Unless otherwise indicated we let vertices represent maximal segments. If (vi, vj) ∈ E, vertex vi is an (immediate) predecessor of vj and vj is an (immediate) successor of vi. Without loss of generality we assume that G has a unique vertex with no predecessors (the source, s) and a unique vertex with no successors (the sink, t). A path from vi to vj is an alternating sequence of vertices and edges starting with vi, terminating with vj, and such that an edge (x, y) appears in the path between x and y. A test case in the program corresponds to a source to sink (s-t) path in G which represents the execution sequence evoked by the test case.

Data flow analysis associates with each statement the sets of variables that are defined, referenced, or undefined in that statement (in many cases undefinitions are more naturally associated with branches). Of more interest are sequences of actions taken on a variable along a path. Such sequences are described by path expressions. For example, the path expression for the variable TERM along the path through statements s4-s5-s6-s7-s8 in the routine SIN (Fig. 1) is drdrr. A definition of X in statement i is said to reach a reference to X in statement j if there is an undefinition-free and redefinition-free path from i to j (i.e., a path along which the path expression for X contains no u's and only one d representing the definition of X at vi). Data flow analysis is very simple to perform in straight line code, i.e., within a segment. Thus, it is common to first perform data flow analysis within the segments and then consider the data flow between segments by associating definitions, references, and undefinitions with segments instead of statements. We say that a variable X is referenced in a segment if the path expression for X within the segment is of the form re, where e is any path expression. We say that a variable is defined in a segment if the corresponding path expression has the form ede' and e' contains no d's or u's. Dummy vertices may be added to allow undefinitions to be associated with vertices.

Data flow analysis has been used in program testing mainly to detect data flow anomalies [3], [11], [17]. Anomalies are patterns of actions that are indicative of error, e.g., the pattern ur is anomalous since the variable is referenced without first being defined. Also, the patterns du and dd are considered anomalous. Data flow analysis was also proposed as a basis for a testing strategy in [6], [16], and [19]. The proposed strategies ask that certain types of data flow interactions be tested (where a data flow interaction is a sequence of actions from the set {d, r, u}). The strategy proposed in [6] is to test each interaction involving a reference and a definition that reaches that reference (dr interaction). The main shortcoming of this strategy is that it does not guarantee that branch testing (which is commonly considered to be a minimal testing requirement) is achieved. The required pairs strategy, studied in [16], also tests dr interactions but augments them so that branch testing is achieved and also loops are tested with at least two distinct iteration counts. In [19], a distinction is made between references to variables in a computation (c-use) and in a predicate (p-use). The p-uses are associated with branches corresponding to the outcomes of the predicate. Then a class of six strategies is proposed. Of those, the "all uses" strategy is similar to the required pairs strategy. It asks that all interactions between a c-use or a p-use and a definition that reaches it be tested. Four other strategies are limited versions of "all uses" asking that 1) each definition be tested with a path along which the definition reaches a c-use or a p-use (all-defs), 2) each interaction between a p-use and a definition that reaches it be tested (all-p-uses), 3) each interaction between a c-use and a definition that reaches it be tested and each definition be tested by some path along which it reaches a use (all-c-uses/some-p-uses), and 4) each interaction between a p-use and a definition that reaches it be tested and each definition be tested by some path along which it reaches a use (all-p-uses/some-c-uses).
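
As a concrete instance of these interactions, consider the variable TERM in the routine SIN of Fig. 1 (a worked example of ours, using the definitions above). The definition of TERM in s4 reaches only the reference in s6, since s6 redefines TERM; the definition in s6 reaches the references in s7 and s8 and, around the loop, the reference in s6 itself on the next iteration. In the terminology of [19], the references in s6 and s8 are c-uses (they occur in the computations of TERM and SUM), while the reference in s7 is a p-use associated with the two outcomes of the predicate TERM.LT.E. The dr interactions for TERM are thus (s4, s6), (s6, s6), (s6, s7), and (s6, s8), and "all uses" would ask that each of them be exercised by some test case. Note also that, as listed in Fig. 1, SUM is referenced in s8 without any prior definition, so its path expression from program entry begins with the anomalous pattern ur.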



These four strategies have limited applicability. The first two are variations of segment and branch testing, respectively, and, as is pointed out in [19], the third emphasizes the detection of computation errors while the fourth emphasizes domain errors. The sixth strategy, "all-du-paths," asks that each interaction between a p-use or a c-use and a definition that reaches it be tested along all cycle-free paths connecting the appropriate statements. The "all-du-paths" strategy is more extensive than the required pairs strategy but may be impractical in many cases because of the large number of paths that will be produced. Perhaps the major shortcoming of all the strategies discussed above is that all of them are structural in nature, and therefore, one would expect them to suffer from the same deficiencies as other structural strategies.

III. REQUIRED ELEMENT TESTING STRATEGIES

Required element testing consists of specifying a set of required elements for a program and then finding test inputs to cover these elements. In general, a required element has the form {S; F} where S describes the structural aspects of the element (i.e., what features of the program's structure are to be executed) and F is a set of assertions describing conditions under which the structural features should be executed. A test case is said to cover the required element if it causes the features specified in S to be executed and also satisfies the assertions in F. For example, if we want to include a test of the statement s: X = Y/2 with an odd value for Y, we can specify the required element {vi; P(Y)} where vi represents the segment containing s and P(Y) is true if and only if Y is odd. The structural component can be a sequence of vertices, edges, or subpaths. Thus, any structural strategy can be described using required elements. For example, branch testing corresponds to the set of required elements Rb = {{(vi, vj); } | (vi, vj) ∈ E}. Black-box strategies can be described in terms of required elements by attaching appropriate assertions in F. To allow required elements that are specified in terms of the input specification to be treated in a uniform manner, we extend the program's digraph so that it includes a new source s0 and a new sink t0 representing the input and output specifications, respectively. We also add edges (s0, s) and (t, t0). Then, required elements that model black-box testing strategies will be of the form {{s0, t0}; F} where F reflects the conditions specified in the black-box strategy. Also, information about commonly occurring errors can be incorporated in the form of assertions. Thus, required element testing is flexible enough to include most other strategies. Since our goal is to combine features from the various approaches to testing, this flexibility is needed. However, it does not provide us with a practically useful strategy without further specification. Our goal in terms of structural testing is to subsume branch testing (which many consider a minimal requirement) and stay short of path testing, which is impractical. The goal in terms of black-box testing is to provide a connection between the program's structure and the specifications so that potential sources of error can be identified without the extensive analysis that is characteristic of many black-box strategies. We feel that data flow analysis is useful in both respects.

Consider a program and the corresponding digraph G. In most applications, a vertex in G represents a (maximal) segment in the program. For the purposes of this discussion, we let vertices be associated with segments as long as the segments do not contain data flow interactions of the form dr or compound statements (e.g., branch statements with compound predicates). If path expressions of the form dr occur within a segment, the segment is split so that this is no longer the case and a distinct vertex is associated with each part. Compound predicates are also broken up so that hidden paths are not overlooked. We also assume that G includes two additional vertices, s0 and t0, representing the input and output specification, respectively. In terms of data flow analysis we consider that all variables are defined in s0 and referenced in t0, and that the definitions at s0 reach every reference in the program while the references at t0 are reached by every definition in the program.

The basic structural unit that we test in a program is a k-dr interaction which we define as follows. A k-dr interaction consists of k - 1 variables X1, X2, ..., Xk-1 (not necessarily distinct) and k distinct statements s1, s2, ..., sk, such that there is a path that visits the statements in the given order, variable Xi is defined in si, and that definition reaches a reference to Xi in si+1 which is used to define variable Xi+1 in si+1, for 1 ≤ i < k. In a k-dr interaction, the definition of X1 in s1 (i.e., the first definition) and the reference to Xk-1 in sk (the last reference) are treated as the focal points of the interaction. The rest of the interaction may be viewed as a sequence of actions leading up to the last reference or a sequence of actions that follows the first definition. In general, there will be a set of k-dr interactions that have the same last reference, each representing one of a variety of paths along which that reference can be reached. Similarly, there will be a set of k-dr interactions that have the same first definition, each describing one of a variety of consequences of that definition.

The definition of a k-dr interaction given above specifies that the reference to Xi in si+1 is used directly (i.e., in the same statement) to define Xi+1. An alternative would be to also allow indirect uses, e.g., variable Xi is referenced in a branch predicate and the outcome of that predicate determines what the next definition of Xi+1 would be. This is a viable alternative representing a tradeoff between a more thorough coverage of the program and increased complexity (cost) in developing the k-dr interactions. A number of similar choices (more extensive strategy versus higher cost) arise in the remainder of this section. We will take the approach of defining what we consider to be a minimal strategy and point out some of the alternatives as we go along.

A variety of required elements can be associated with each interaction. A required element that represents a k-dr interaction consists of an n-tuple (n ≥ k) of vertices and a set of assertions. Of the vertices, k represent the segments containing the statements involved in the interaction. The remaining vertices, together with the assertions, are introduced as needed to describe a path, or a set of paths, along which the interaction is covered, i.e., the definition of Xi in si reaches the reference to Xi in si+1 for i < k. A number of required elements can represent a single k-dr interaction since different iteration counts may be specified for statements that occur in loops and, also, different assertions may be included in it.
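
To tie the definition to the earlier example, consider the routine SIN of Fig. 1 (our reading of the definition; the statement numbers are those of Fig. 1). The definition of TERM in s4 reaches the reference to TERM in s6, where it is used to define TERM again, so the statement pair (s4, s6) together with the variable TERM forms a 2-dr interaction. Extending it by one step, the definition of TERM in s6 reaches the reference to TERM in the predicate of s7, so the statements s4, s6, s7 with the variables TERM, TERM form a 3-dr interaction whose first definition is the one in s4 and whose last reference is the p-use in s7. A different 3-dr interaction with the same first definition ends instead at the reference to TERM in s8, where it is used to define SUM.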



As a general rule, we produce one required element for each k-dr interaction. Additional required elements are produced when the last reference of the interaction occurs in a branch predicate and also when the last reference or the first definition of the interaction occurs within a loop. At least two required elements are produced when the last reference of the interaction occurs in a branch predicate. These elements differ only in that they specify different outcomes of the branch predicate. This guarantees that branch testing is achieved. The problem posed by variables defined and referenced within loops is that such a variable represents a sequence of definitions/references (one for each iteration). If definitions/references in each iteration are considered distinct, we have to deal with a possibly very large number of required elements. Thus, we take an approach similar to those taken in structural testing to limit the number of iterations that are considered. If the first definition occurs in a loop, we produce two required elements that specify two distinct iteration counts for the loop. Similarly, if the last reference occurs in a loop, we produce two required elements with different iteration counts. In both cases, whatever part of the required element lies within the loop will also be iterated the corresponding number of times. Since boundary conditions in loops are likely sources of error, it is preferable that the two iteration counts that are specified for a loop represent the boundary conditions of the loop. If the first definition or the last reference occurs within more than one loop, then two required elements are produced for the innermost loop (or for any one loop, if the loops are not nested). It would be preferable to test each loop in the interaction with at least two iteration counts as long as that does not produce too large a set of required elements.

Subscripted variables pose another problem. In order to specify the required elements using static analysis methods, we treat all elements of an array as occurrences of the same variable. This is certainly a limitation. One way to deal with it may be to flag required elements that represent interactions involving subscripted variables and test these elements with more than one test case. Run time instrumentation techniques (like the ones described in [11]) can be used if array elements are to be treated as distinct variables. Another issue is raised by subroutine calls and aliasing. In terms of data flow analysis, input parameters to the subroutine are considered to be referenced in the subroutine call and output variables are considered to be defined in it. It is assumed that all aliases are known and that data flow analysis is done bottom up so that the data flow of variables within a subroutine can be used to establish the appropriate actions for the subroutine call. Recursive routines can be handled using induction methods.

Perhaps the most serious problem that has to be faced is that of nonexecutable paths. Since data flow analysis is static and since the k-dr interactions are generated statically, it is possible that no executable paths that cover some interactions exist in the program. The problem of determining whether or not an executable path that covers a k-dr interaction exists is, in general, undecidable. Then, some k-dr interactions will have to be discarded at the test data generation phase. In many cases, symbolic execution [13] may be useful in determining whether or not a given path is executable. Note that the choice of required element(s) to represent a k-dr interaction can affect the chances that an executable path that covers it exists in the program. The more vertices and assertions are used to specify the required element, the less freedom there is in selecting a path to cover it, and it is more likely that required elements that are not covered by any executable path will be produced. Thus, it is important to specify the required elements only to the extent needed to represent the interaction. This may also allow a greater number of required elements to be covered by the same test case, thus reducing the number of test cases needed to cover the required elements.

We define the required k-tuples strategy as follows. We consider all k-dr interactions in the program. For each k-dr interaction involving a last reference in a branch predicate, we produce one required element for each outcome of the branch predicate. If the first definition or the last reference of the interaction occurs in a loop, two iteration counts for that loop are considered in producing the required elements. The required elements will correspond to the boundary conditions of the loop. One required element is produced for all other k-dr interactions.

"Required k-tuples" is a class of strategies obtained by varying k. For k = 2 we have a strategy similar to the required pairs strategy discussed in [16]. The only difference is that the new strategy specifies that two (as opposed to one) required elements be produced when the first definition of the interaction occurs within a loop. Higher values of k result in more extensive strategies in which more complex data flow interactions are tested. Alternatively, we can define a class of strategies by considering the basic required 2-tuples strategy and asking that combinations of required 2-tuples be tested, i.e., test k required 2-tuples along the same path. Required k-tuples may also be viewed as testing combinations of required 2-tuples, but it has the advantage that only combinations that are related by data flow are tested, and thus, it should be more cost effective. As k increases, the interactions span more and more of the program and the strategy becomes a variation of structured path testing [8].

It should be noted that the required k-tuples strategy is a structural strategy. Although the required elements include assertions, the role of the assertions is to specify structural information. Next, we discuss how the required k-tuples strategy can be used as a basis for a more extensive strategy that combines features of black-box and error-driven testing as well.

There are two main sources of information that is useful in testing aside from the code itself. They are the input and output specifications (formal and informal) and gathered knowledge about common sources of error and likely errors with various constructs. Next we expand the required k-tuples strategy so that such information can be incorporated. Data flow analysis is not directly useful in this since it does not deal with the semantics of variable use. However, a k-dr interaction can be used as a link to the specifications and to information on common errors. In many cases, testing for likely errors can be done by attaching appropriate assertions to the required elements.
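
Before turning to assertions, it may help to see what the required k-tuples strategy produces for a small case (an illustration of ours, based on the definitions above). Take the 3-dr interaction in SIN formed by the definition of TERM in s4, its reference and redefinition in s6, and the reference to TERM in the predicate of s7. The last reference lies in a branch predicate, so one element is produced for each outcome of TERM < E; and since s6 and s7 lie inside the DO loop, the elements are produced with two iteration counts corresponding to the loop's boundary conditions (for example, a single pass and the maximum of 49 passes). A test input with a small X and a relatively large E covers the elements that leave the loop at once, while a larger X with a small E covers those that iterate further. The assertions discussed next add functional conditions on top of such elements.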



Consider a k-dr interaction in which the last reference occurs in a branch predicate of the form X > Y. The required k-tuples strategy will produce at least one required element that includes the assertion X > Y, and at least one required element including the assertion X ≤ Y. A common error in such predicates is branching the wrong way on equality. We can test for this error by adding a required element that includes the assertion X = Y. Other domain errors in branch predicates can be tested in a similar fashion. The k-dr interaction can be used to develop additional required elements by 1) focusing on the last reference in the interaction and considering likely errors for the operations in which the variable is involved, and 2) considering the series of computations represented by the k-dr interaction as a whole. Examples of the first include checking for domain errors in branch predicates, using even and odd values in an integer division by 2, using zero to test divisions, etc. This approach simply expresses gathered knowledge on likely errors and existing testing strategies in terms of required elements. The premise in the second approach is that, if one looks at a statement by itself, it may be hard to spot problems. However, once that statement is placed in the context of a series of related computations, in many cases it should be easier to determine what is going on and uncover possible sources of error. These two approaches indicate different uses of the required elements. When looking for errors within a statement by itself the k-dr interaction is not really useful. Thus, assertions should be attached to required 2-tuples that involve the input-output specification and the statement in question. The full k-dr interaction should be considered only in trying to discover errors along it.

As an example, consider the routine in Fig. 1 which computes the sine of X to accuracy E, using the Maclaurin series expansion. This program appears in the "Common Blunders" section of [12]. Consider the error in statement s8 which should be: SUM = SUM + ((-1)**(I/2))*TERM. This error can be detected by considering the statement by itself since -1**(I/2) is always equal to -1. While one might miss this error by inspecting the code, one would probably detect it if asked to examine the operations in which the variable I is involved. Another error in the sine program is that the test for exiting the loop takes place before the value of the sine is updated. This error will be detected with high probability by the required k-tuples strategy, but it may also be detected by inspection if attention is focused on the 2-dr interaction involving the variable TERM in statements s6, s7 and the predicate TERM < E.

Another important source of information about the program are its input and output specifications. We take it that the input specification describes the input domain of the program and defines the role of the programming variables. The output specification is taken to describe what the program is intended to do and what its outputs are. The specifications need not be formal, although that would simplify the task of automating some of the process. Documentation may be used if no specifications are available. As with information on common errors, one way to use the specifications would be to specify known black-box strategies in terms of required elements. This approach would suffer from the same problems as the existing black-box strategies. What would be more useful is to find a way that the specifications can be used effectively, that does not require the extensive analysis needed in most black-box strategies. Again, we feel that by concentrating on a k-dr interaction the task is eased. A k-dr interaction involves some variables in a series of computations. If the interaction is covered by an executable path, it can be viewed as a partial computation. On the other hand, the specifications provide information about what the program is intended to do. Then, by contrasting the internal view of the computation that is described by the k-dr interaction with the external view described in the specifications, conditions that need to be tested and likely sources of error can be discovered. For example, consider the sine program of Fig. 1 and the error in statement s7 (it should be: IF(DABS(TERM).LT.E) GO TO 30). The required k-tuples strategy will produce at least one required element that represents the 2-dr interaction between the input specification and the reference to TERM in s7. Assuming that the input specification defines the range of X to include negative values, the error should be discovered when this interaction is considered. As a second example consider the binary search routine shown in Fig. 3 and suppose that the statement s11: HIGH = MID - 1 is changed to HIGH = HIGH - 1 (this is one of the mutants introduced by mutation analysis [1]). The resulting routine still produces the correct result but performs a combination of a sequential and a binary search and should be considered incorrect. This error would be difficult to detect by many testing strategies, since intermediate values of MID are unlikely to be checked. The required k-tuples strategy will produce a required element for the interaction between the "definition" of MID in the input specification and the reference to MID in s7. By concentrating on this interaction, and assuming that the input specification defines MID as the index in a binary search, it may well be that the user would ask that a trace be performed on MID to verify that a binary search is indeed performed.

The required k-tuples with assertions strategy consists of developing required elements according to the required k-tuples strategy and then producing additional required elements by contrasting each k-dr interaction against the specifications and information on commonly occurring errors. This definition is necessarily vague since it depends on what common errors are considered and on what assertions are produced by considering the input specification. Again, this strategy is based on the premise that, by concentrating on a specific interaction, sources of error can be discovered more easily. This view is supported by work on "program slices" reported in [20]. Program slices are sets of statements related by the flow of data. They are produced using data flow analysis and they are related to k-dr interactions. It is reported in [20] that experiments indicate that programmers use slices when debugging a program.

There are some differences between required k-tuples on one hand and k-tuples with assertions on the other besides the obvious differences in the functional part. First, in the required k-tuples strategy, the required elements that involve the specifications are redundant since they will be covered by any set of test paths that covers the remaining elements. Also, interactions within segments will be covered automatically, so we need only consider interactions between segments and let the vertices of the program's digraph represent maximal segments. We have included these redundant required elements because they are useful in the required k-tuples with assertions strategy. The required elements associated with the specifications can be easily removed.
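
Returning to the sine routine for a moment, the three faults identified above (the missing DABS in s7, the sign term in s8, and the exit test preceding the update of the sum) can be collected into a repaired version. The sketch below is ours, not the paper's; in particular, the initialization of SUM is an assumption on our part, since the listing of Fig. 1 never assigns SUM before it is referenced in s8.

      DOUBLE PRECISION FUNCTION SIN(X,E)
      DOUBLE PRECISION E,TERM,SUM
      REAL X
C     First term of the Maclaurin series; SUM is initialized here
C     (our addition; the Fig. 1 listing leaves SUM undefined).
      TERM = X
      SUM = X
      DO 20 I=3,100,2
        TERM = TERM*X**2/(I*(I-1))
C       Corrected sign: (-1)**(I/2) alternates, whereas -1**(I/2) is always -1.
        SUM = SUM + ((-1)**(I/2))*TERM
C       Corrected exit test: magnitude of TERM, checked after the sum
C       has been updated rather than before.
        IF (DABS(TERM).LT.E) GO TO 30
   20 CONTINUE
   30 SIN = SUM
      RETURN
      END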



Indications are that the inclusion of required elements that represent interactions within segments does not increase the size of the required element set significantly. For example, in [3], it is reported that only a 0.56 reduction in the size of the program's digraph resulted from associating vertices with maximal segments instead of single statements in a set of Fortran programs.

IV. ILLUSTRATIONS AND EVALUATIONS OF REQUIRED ELEMENT TESTING STRATEGIES

In this section, we illustrate some of the strategies defined in the previous section and present the results of some experimental evaluations of these strategies.

First, consider the subroutine TRIANGLE [18]. A listing of TRIANGLE and a corresponding digraph are shown in Fig. 2(a) and (b), respectively. The vertex indexes correspond to the segment numbers shown in Fig. 2(a). Vertex s0 represents the input specification and t0 represents the output specification. Note that the compound predicates in statements s3, s7, and s23 are broken up so that hidden branches are not overlooked (compound predicates can, in general, be interpreted in a number of ways, all of which should be considered for a thorough test of the predicate). Also, note that segment 7 is split into three segments since there is a drdr interaction within the segment. The subroutine TRIANGLE accepts three positive, nonzero, integer values A, B, C in nonincreasing order of their size and determines if the triangle with sides of length A, B, C is a right, acute, obtuse, isosceles, or equilateral triangle. We let the input specification define A, B, C as positive nonzero integers with A ≥ B ≥ C and let the output specification define the triangle types for each expected output (e.g., that A, B, C are the sides of an isosceles triangle if a) A < B + C and b) exactly two of the sides are equal). Thus, the routine contains an error since it does not check that the three lengths satisfy the triangle inequality.

Fig. 2. (a) The triangle classification routine TRIANGLE. (b) The digraph for the subroutine TRIANGLE.

The required elements for the required 2-tuples and required 3-tuples strategies are generated using data flow analysis. The set of required 2-tuples has a total of 35 elements. The set of required 3-tuples contains five additional elements for a total of 40. Table I contains some representative elements including a) all 2-tuples involving statement s3, b) all 2-tuples involving the output specification, c) all 2-tuples associated with the definitions in segment 7, and d) the five 3-tuples that correspond to 3-dr interactions. Note that required elements are associated with the output statements although those do not involve any variables. The required elements involve only the specifications and the WRITE statements. For the required 2-tuples strategy, these are redundant since coverage of all segments is achieved without them. These elements gain importance when the k-tuples with assertions strategy is considered. It can be seen from the table that there is significant redundancy in the set of required 2-tuples. Also, in this program, the elements corresponding to 3-dr interactions are redundant since the tree-like structure of the program ensures that any set of test paths that covers the 2-tuples will also cover any k-tuples. It is interesting to note that both the required 2-tuples and the required 3-tuples strategies may fail to detect the error in the program, since they do not force test cases with values that violate the triangle inequality to be produced. Other structural strategies would have difficulty detecting this error as well. This emphasizes the need to include the specifications in the testing process. The required 2-tuples with assertions strategy will ask the user to contrast the statements s4, etc., with the output specification that defines what a triangle is and, thus, should result in the detection of the error.
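
The paper does not show what the missing check would look like; one possible repair, sketched by us in the style of the Fig. 2(a) listing, is to reject the lengths before any classification is attempted (the statement labels and the FORMAT number are hypothetical):

      IF (A.LT.B+C) GO TO 80
      PRINT 60
   60 FORMAT(1H ,'NOT A TRIANGLE')
      RETURN
   80 CONTINUE

Since the routine already requires A ≥ B ≥ C, comparing the largest side A against B + C suffices; placed right after the ordering test, the guard would put the triangle inequality into the input domain of every subsequent path.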



TABLE I
REPRESENTATIVE REQUIRED 2-TUPLES AND 3-TUPLES FOR TRIANGLE

s1:  SUBROUTINE SEARCH(F,N,A,NPOS)
s2:  INTEGER F,N,NPOS,LOW,HIGH,MID
s3:  INTEGER A(N)
s4:  NPOS = 0
s5:  LOW = 1
s6:  HIGH = N
s7:  5 MID = (LOW+HIGH)/2
s8:  IF(HIGH.LT.LOW) GO TO 20
s9:  IF(F.EQ.A(MID)) GO TO 25
s10: IF(F.GT.A(MID)) GO TO 30
s11: HIGH = MID - 1
s12: GO TO 5
s13: 30 LOW = MID + 1
s14: GO TO 5
s15: 25 NPOS = MID
s16: 20 RETURN
s17: END

Fig. 3. A binary search routine and the corresponding digraph.

The set of required elements for the 2-tuples with assertions strategy cannot be defined precisely since it depends on what set of common errors is considered and on how the specifications are used. With respect to checking likely sources of error, we may produce assertions that use ideas from domain testing [22] to test the branch predicates. With respect to the specifications, assertions testing the various types of triangles with legal and illegal values should be produced. These assertions will be associated with the 2-tuples involving the WRITE statements and the output specification and with the 2-tuple {[s0, t0]; }. The resulting set of required 2-tuples with assertions may include the following elements: {[v1, v2]; A = B}, {[v1, v2']; B = C}, {[v9, t0]; A = B > C}, {[v9, t0]; A > B > C, A < B + C}, {[v9, t0]; A > B > C, A > B + C}. The first two of these test branching on equality in segment 2 and the remaining three test the WRITE statement for isosceles triangles with legal and illegal values.

The subroutine TRIANGLE was tested using required 2-tuples and required 3-tuples. The performance of the strategies was measured using mutation analysis. Mutation analysis generates a set of simple changes that are introduced into the program one at a time. Then the original program and all its mutants are run and a mutant is considered killed if its output differs from the output of the original program. Additional test cases are applied to the remaining live mutants. Nine test cases were used to cover the set of required 2-tuples (six correspond to the various triangle types, and the remaining three cover the hidden branches in the compound predicates). The test cases detected all but 86 out of a total 741 mutants. Of the remaining live mutants, 75 are actually equivalent to the original program. Most of the undetected mutants were mutants that introduce slight shifts in the branch predicates (e.g., replace statement s3 with: IF((A + 1).GE.B.AND.B.GE.C) GO TO 100). Those mutants would be detected by the required 2-tuples with assertions strategy if domain testing approaches are used to produce the assertions. The set of required 3-tuples is covered by the same nine test cases as the set of 2-tuples.

As a second example, consider the binary search routine SEARCH [12] shown in Fig. 3. The input to this routine is an integer array A of size N and an integer F. The output is the variable NPOS whose value should be such that A(NPOS) = F, or zero if F is not in A. The variables LOW and HIGH specify the range of indexes in A over which F is to be searched for, and MID is the search index for the binary search. SEARCH contains two loops (not nested). We will refer to the loop that includes v8 as L1 and to the loop that includes v9 as L2. The sets of required 2-tuples and 3-tuples contain 50 and 87 elements, respectively. Unlike TRIANGLE, the set of 3-tuples will force the execution of paths that may not be used to cover 2-tuples. Table II shows a) the required 2-tuples corresponding to the 2-dr interactions between segments 1 and 2 and between segments 8 and 2, and b) the required 3-tuples representing the 3-dr interactions in which the first definition is the definition of LOW in segment 9. The notation ZE (exit Z), ZR (repeat Z) is used to specify that loop Z is to be exited at the first opportunity or iterated some larger number of times, respectively. L refers to either loop L1 or loop L2 since the two loops are not nested. The subroutine SEARCH was also tested using mutation analysis. The set of required 2-tuples is covered by 10 test cases which detected 357 out of 399 mutants.
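
As an aside, the point made in Section III about intermediate values of MID can be seen on a small input. Tracing by hand through the listing of Fig. 3, and through the HIGH = HIGH - 1 mutant of s11, with A = (1, 3, 5, 7, 9, 11, 13), N = 7, and F = 2 gives

    original s11 (HIGH = MID - 1):   MID = 4, 2, 1, 1
    mutated s11 (HIGH = HIGH - 1):   MID = 4, 3, 3, 2, 2, 1, 2, 1

with NPOS = 0 returned in both cases. Since the outputs agree, mutation analysis based on outputs alone cannot kill this mutant with such an input; only an assertion or a trace on MID, as suggested earlier, separates the two versions.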



TABLE II
SOME OF THE REQUIRED ELEMENTS FOR SEARCH

Three additional test cases were used to cover the set of required 3-tuples, bringing the number of detected mutants to 365. Of the remaining live mutants, 27 are equivalent to the original program and another three should be detected by the required 2-tuples with assertions strategy.

Required element testing strategies were evaluated in two other experiments. In [16], an experimental comparison of required pairs (a limited version of required 2-tuples), branch testing, and random testing on a set of seven programs is reported. The strategies were evaluated using mutation analysis as a measure of test set effectiveness. Test cases for the three strategies were selected from a large set of random test cases that was generated for each program. For random testing, test cases were selected directly from this set. For branch and required pairs testing, a minimal set of paths that covered the branches and the required pairs was developed and then one test case from the input domain of each of these paths was selected from the set of random cases (slight modifications were needed in some cases). The required pairs strategy detected 90 percent of the mutants in a set of seven programs as compared to 85.5 percent for branch testing and 79.5 percent for random testing. A total of 84 test cases were used to cover the required pairs as compared with 34 for branch testing and 100 for random testing. The undetected mutants include those that are equivalent to the original program. Despite the high degree of detection, there were a number of simple errors that were not detected.

In a second experiment we considered the collection of programs and program segments in the "Common Blunders" section of [12]. There are 14 different programs and program segments in the two editions of [12], containing a total of 23 errors. Of those, five will always be detected by the required 2-tuples strategy. Three of them are data flow anomalies of the form ur. While those would be trivially detected by data flow analysis, they may be missed by other strategies depending on what default initializations are performed by the compiler. Ten other errors will be detected by required 2-tuples with very high probability. Of the remaining, required 2-tuples has a fair chance of detecting two, and is unlikely to detect six. If required 2-tuples with assertions is used, all the errors should be detected, since the "Common Blunders" section of [12] is intended as a source of information on commonly occurring errors.

V. CONCLUSIONS

We have introduced two classes of required element strategies, the required k-tuples and the required k-tuples with assertions strategies. Although the latter includes the former for any given k, there is a tradeoff between using k-tuples on one hand and m-tuples with assertions (with m < k) on the other. Our experiments with required 2-tuples with assertions and required 3-tuples indicate that the incorporation of assertions is the most promising approach. A lot more needs to be known about effectiveness, and cost models need to be developed, before a complete analysis of the tradeoff can be obtained.

An important consideration for both strategies is their cost. The required elements for the k-tuples strategy can be generated automatically with static analysis tools. The k-tuples with assertions strategy is best used interactively where the user will formulate the assertions from the specifications, information on common errors, and a description of the k-dr interaction. The number of required k-tuples in a program can, in the worst case, be O(n^k) where n is the number of statements and it is assumed that the number of variables acted upon in a statement is bounded by a constant. In practice, we expect the number of required k-tuples to be much smaller. The number of required k-tuples with assertions will depend on how extensive an analysis is performed, but we expect it to be of the same order of magnitude as the number of required k-tuples. Probably the more important consideration is the number of test cases needed to cover the set of required elements. Since the number of test runs contributes to the cost of testing, it is important to minimize it. There are two difficulties with achieving this goal. First, trying to cover more required elements with a single path may make it more difficult (if not impossible) to find such a path that is also an executable path. Even if nonexecutable paths were not an issue, it has been shown that the related problem of finding a minimum set of test paths that cover a set of pairs of vertices in an acyclic structured graph is intractable [14], [15].



Further research is needed to establish the extent to which data flow interactions are useful and to choose the best alternatives in the formulation of strategies. In general, the tradeoffs are between a more extensive strategy and a corresponding increase in cost and complexity. Another interesting issue is to determine how the number of k-dr interactions varies with k in practice and whether or not there are program-independent values of k that are best for the k-tuples with assertions strategy (i.e., how long should the interaction be so that the user can determine what the interaction represents).

ACKNOWLEDGMENT

I wish to thank Prof. DeMillo for the use of the mutation system at Georgia Tech and the referees of an earlier version of this paper for their helpful suggestions.

REFERENCES

[1] R. A. DeMillo, R. J. Lipton, and F. G. Sayward, "Hints on test data selection: Help for the practicing programmer," Computer, vol. 11, pp. 34-41, Apr. 1978.
[2] J. W. Duran and S. C. Ntafos, "A report on random testing," in Proc. 5th Int. Conf. Software Eng., Mar. 1981, pp. 179-183.
[3] L. D. Fosdick and L. J. Osterweil, "Data flow analysis in software reliability," ACM Comput. Surveys, vol. 8, no. 3, pp. 305-330, 1976.
[4] J. B. Goodenough and S. L. Gerhart, "Toward a theory of test data selection," IEEE Trans. Software Eng., vol. SE-1, pp. 156-173, June 1975.
[5] M. S. Hecht, Flow Analysis of Computer Programs. New York: North-Holland, 1977.
[6] P. M. Herman, "A data flow analysis approach to program testing," Aust. Comput. J., vol. 8, pp. 92-96, Nov. 1976.
[7] W. E. Howden, "Reliability of the path analysis testing strategy," IEEE Trans. Software Eng., vol. SE-2, pp. 208-214, Sept. 1976.
[8] -, "Symbolic testing-Design techniques, costs and effectiveness," NTIS PB-268517, May 1977.
[9] -, "Functional program testing," IEEE Trans. Software Eng., vol. SE-6, pp. 160-169, Mar. 1980.
[10] J. C. Huang, "An approach to program testing," ACM Comput. Surveys, vol. 7, pp. 113-128, Sept. 1975.
[11] -, "Detection of data flow anomaly through program instrumentation," IEEE Trans. Software Eng., vol. SE-5, pp. 226-236, May 1979.
[12] B. W. Kernighan and P. J. Plauger, The Elements of Programming Style. New York: McGraw-Hill, 1978.
[13] J. C. King, "Symbolic execution and program testing," Commun. Ass. Comput. Mach., vol. 19, pp. 385-394, July 1976.
[14] S. C. Ntafos and S. L. Hakimi, "On path cover problems in digraphs and applications to program testing," IEEE Trans. Software Eng., vol. SE-5, pp. 520-529, Sept. 1979.
[15] -, "On structured digraphs and program testing," IEEE Trans. Comput., vol. C-30, pp. 67-77, Jan. 1981.
[16] S. C. Ntafos, "On testing with required elements," in Proc. COMPSAC 81, Nov. 1981, pp. 132-139.
[17] L. J. Osterweil and L. D. Fosdick, "DAVE-A validation, error detection and documentation system for FORTRAN programs," Software Practice Exp., vol. 6, pp. 473-486, 1976.
[18] C. V. Ramamoorthy, S. F. Ho, and W. T. Chen, "On the automated generation of program test data," IEEE Trans. Software Eng., vol. SE-2, pp. 293-300, Dec. 1976.
[19] S. Rapps and E. J. Weyuker, "Data flow analysis techniques for program test data selection," Dep. Comput. Sci., Courant Inst. Math. Sci., Tech. Rep. 23, Dec. 1981.
[20] M. Weiser, "Programmers use slices when debugging," Commun. Ass. Comput. Mach., vol. 25, pp. 446-452, July 1982.
[21] E. J. Weyuker and T. J. Ostrand, "Theories of program testing and the application of revealing subdomains," IEEE Trans. Software Eng., vol. SE-6, pp. 236-246, May 1980.
[22] L. J. White and E. J. Cohen, "A domain strategy for computer program testing," IEEE Trans. Software Eng., vol. SE-6, pp. 247-257, May 1980.
[23] S. J. Zeil and L. J. White, "Sufficient test sets for path analysis testing strategies," in Proc. 5th Int. Conf. Software Eng., Mar. 1981, pp. 184-194.

Simeon C. Ntafos (S'76-M'78) was born in Trikala, Greece, in 1952. He received the B.S. degree in electrical engineering from Wilkes College, Wilkes-Barre, PA, in 1976, and the M.S. degree in electrical engineering and the Ph.D. degree in computer science from Northwestern University, Evanston, IL, in 1976 and 1979, respectively.

He was a Visiting Assistant Professor in the Department of Electrical Engineering and Computer Science at Northwestern University from September 1978 to August 1979. Since September 1979, he has been with the Computer Science Program at the University of Texas at Dallas where he is now an Associate Professor. His current research interests are in the areas of software reliability and computational complexity.