automated reasoning for multi-step feature model …dbc/dbc_archivos/pubs/white09-splc.pdf · 2009....

Automated Reasoning for Multi-step Feature Model Configuration Problems

Jules White, Brian Dougherty, and Doulas C. SchmidtVanderbilt University, Nashville, TN USA

Email:{jules, briand, schmidt}@dre.vanderbilt.edu

David BenavidesUniversity of Seville, Seville, Spain

Email:[email protected]

Abstract

The increasing complexity and cost of software-intensivesystems has led developers to seek ways of increasing soft-ware reusability. One software reuse approach is to developa Software Product-line (SPL), which is a reconfigurablesoftware architecture that can be reused across projects.Creating configurations of the SPL that meets arbitrary re-quirements is hard.

Existing research has focused on techniques that pro-duce a configuration of the SPL in a single step. This paperprovides three contributions to the study of multi-step con-figuration for SPLs. First, we present a formal model ofmulti-step SPL configuration and map this model to con-straint satisfaction problems (CSPs). Second, we show howsolutions to these CSP configuration problem CSPs can bederived automatically with a constraint solver. Third, wepresent empirical results demonstrating that our CSP-basedtechnique can solve multi-step configuration problems in-volving hundreds of features in seconds.

1 IntroductionThe high-cost of developing distributed real-time and

embedded (DRE) systems has pushed developers to findnovel solutions to increase the reusability of software.One promising reuse approach is Software Product-lines(SPLs) [6], which are software architectures that are de-signed with built-in points of variability that can be al-tered so that the software can be more readily reused acrossprojects. For example, an SPL for a car can be built with theability to use multiple engine control software componentsso it can be adapted to cars with different engine types.

Ensuring that a correct software product is producedfrom an SPL involves building models of the rules for con-figuring the points of variabilty. For example, an SPL con-figuration for a car cannot simultaneously employ two dif-ferent engine control software components or the wrongcomponent for the given engine type. A common tech-nique for specifying SPL configuration rules is afeaturemodel[11], which abstracts the components and points of

variaiblity in a software product asfeatures.Feature models are typically implemented as tree-like

structures that specify how the components and pointsof variability affect one another. For example, the fea-ture model of a car in Figure 1 can optionally includeanAutomated Driving Controller. If the car includesthis feature it must also include theCollision AvoidanceBreaking feature. Any arbitrary configuration can bechecked against the feature model to determine if it is acomplete and correct configuration of a software product.

When an SPL is configured for a new set of require-ments, developers must find a selection of the features fromthe feature model that (1) satisfy the requirements and (2)adhere to the rules in the feature model. This configurationprocess involves reasoning over a complex set of constraintsto meet an end goal. Various tools [2, 13, 1, 4, 5, 15] havebeen developed to help reduce the complexity of this pro-cess by automating parts of the feature selection process.

Open problems. Some configuration problems requirestarting at an arbitrary state and deriving a new configura-tion that meets the target requirements. For instance, anautomotive software designer using an SPL may start withno features selected and derive a selection of features forthe automobile software to meet the needs of a new modelyear car. Often, however, constraints limit developers fromdirectly transitioning from the starting state to the desiredend configuration.

For example, assume that a group of automotive SPLdevelopers want to modify the configuration of an existingSPL car model to include automated driving capabilities, asshown in Figure 1. The developers have determined that thecost of adding all the new features will be 88 million dollars.The developers only have a annual development budget of35 million dollars to reconfigure the SPL variant, however,which means that the developers cannot simply add all fea-tures in a single year. Management has also asked the de-velopers to make continual progress on developing the carby adding new features to it every year.

To manage these constraints, the developers must incre-mentally add the desired features over a series of steps,i.e.,over several years the developers will produce a series of in-

1

Figure 1: A Configuration Problem Requiring Multiple Steps

termediate configurations that leverage each other to reachthe desired configuration. For example, they can developa subset of the new features in the first year’s car config-uration, add more of the remaining desired features in thesecond year, and add the rest of the desired features andreach the new configuration in the third year. This processof producing a series of intermediate configurations—i.e.,a configuration path—is shown in Figure 2. We call thissequence of activities amulti-step configuration problem.

A key challenge is that the developers cannot arbitrar-ily pick and choose features to add in a given year to meetthe budget constraint. For example, developers will vio-late the feature model rules if they choose to addParallelParking in year 3 withoutLateral Range Finder, whichis required via a cross-tree constraint, as shown in Figure 2.Developers must therefore not only adhere to their con-straints on the changes that can be produced in a given year(such as the maximum allowed annual development bud-get) but also ensure that the changes they choose create avalid configuration at the end of each year. The developersalso cannot choose an intermediate configuration to transi-tion through that will not function and hence cannot be sold.

Further complicating the multi-step configuration prob-lem is that developers may need to foresee tradeoffs thatmust be made along the way. For example, it is not possibleto simultaneously addCollision Avoidance Brakingand Enhanced Avoidance in the same year since theirtotal development cost is 36 million dollars, as shownin Figure 2. Developers must therefore addCollisionAvoidance Braking andStandard Avoidance one year(to simultaneously meet the budget constraint and the fea-ture model constraints) and then remove theStandardAvoidance feature at a later step to add the desiredEnhanced Avoidance feature.

Prior work on configuration analysis and automation [13,1, 4, 5] focused on creating one configuration that meets aspecific set of requirements,i.e., they find a configuration inone step and assume that it is possible to directly transitionto it. These techniques do not, however, support the need

to split the configuration over multiple steps to adhere to achange constraint, such as the maximum development bud-get per year. A gap therefore exists in current techniqueswhen developers need to reason about and automate config-uration over multiple steps.

Solution overview and contributions.To fill the gap inexisting research, we have developed an automated methodfor deriving a set of configurations that meet a series ofrequirements over a span of configuration steps. We callour technique theMUlti-step Software Configuration prob-LEm solver(MUSCLE). MUSCLE transforms multi-stepfeature configuration problems into constraint satisfactionproblems (CSPs) [9]. Once a CSP has been produced forthe problem, MUSCLE uses a constraint solver (which is anautomated tool for finding solutions to CSPs) to generate aseries of configurations that meet the multi-step constraints.

This paper provides the following contributions to thestudy of feature model configuration over a span of multiplesteps:

1. We provide a formal model of multi-step configura-tion,

2. We show how the formal model of multi-step configu-ration can be mapped to a CSP,

3. We show how multi-step requirements, such as limitson the cost of feature changes between two successiveconfigurations, can be specified using our CSP formu-lation of multi-step configuration,

4. We describe mechanisms for optimally deriving a setof configurations that meet the requirements and min-imize or maximize a property of the configurations orconfiguration process, such as total configuration cost,

5. We show how multi-step optimizations can be per-formed, such as deriving the series of configurationsthat meet a set of end-goals in the fewest time steps,and

Paper organization. The remainder of the paper is orga-nized as follows: Section 2 summarizes the challenges ofperforming automated configuration reasoning over a se-quence of steps; Section 3 describes a formal model of

Figure 2: Potential Configuration Paths

multi-step configuration; Section 4 explains MUSCLE’sCSP-based automated multi-step configuration reasoningapproach; Section 5 analyzes empirical results from ex-periments demonstrating the scalability of MUSCLE; Sec-tion 6 compares MUSCLE with related work; and Section 7presents concluding remarks.

2 Challenges

A multi-step configuration problem for an SPL involvestransitioning from a starting configuration through a seriesof intermediate configurations to a configuration that meetsa desired set of end state requirements. The solution spacefor producing a series of successive intermediate configura-tions to reach the desired end state can be represented as adirected graph, as shown in Figure 3(a).

(a) A Graph of a Multi-step Configuration Problem

(b) Optimization of Total Steps

Figure 3: Multi-step Configuration Graphs

Each successive series of points represents potential con-figurations of the feature model at a given step. For ex-ample, the configurationsB0 . . .Bi represent the intermedi-ate configurations that can be reached in one step from thestarting configuration. In this section we use this graph for-mulation of the problem’s solution space to showcase thechallenges of finding valid solutions.

2.1 Challenge 1: Graph ComplexityA critical challenge to developers attempting to derive

solutions to multi-step configuration problems manually orto use a graph algorithm is that there are an exponentialnumber of potential intermediate configurations and pathsthat could be used to reach the desired end state. In theworst case, at any given intermediate step, there can beO(2n) points (wheren is the number of features in the fea-ture model). In the worst case, therefore, there are 2n poten-tial subsets of the features in the feature model that couldform a configuration. Moreover, for a multi-step configura-tion problem overK time steps, there areO(K2n) possibleintermediate points.

Further compounding this problem is that for any inter-mediate configuration at stepT, there are in the worst case2n−1 points at stepT +1 that could be reached from it byadding or removing features to its feature selection. The in-termediate configurations that do not precede the end pointwill therefore have 2n − 1 outgoing edges. Section 4 dis-cusses how MUSCLE uses CSP-based automation to elim-inate the need for developers to manually find solutionsto these multi-step configuration problems, which reducesconfiguration time and cost.

2.2 Challenge 2: Point Configuration ConstraintsAlthough there are a substantial number of potential

intermediate configurations, many of these configurationswill not meet developer requirements. For example, manyof theK2n arbitrary subsets of feature selections will repre-sent configurations that do not adhere to the feature model

constraints. Moreover, other external constraints, such assafety constraints requiring a specific feature to be selectedat all times, may not be met. We term these constraints onthe allowed configurations at a given steppoint configura-tion constraints.

Point configuration constraints eliminate many potentialconfiguration paths. These constraints may create small ad-ditional restrictions, such as that a particular feature mustalways be selected. Complex step-based constraints mayalso be present, such as a particular automotive feature be-comes unavailable after a specific time step (year) becausethe supplier discontinues it. Finally, a multi-step configu-ration problem may not dictate an exact starting and end-ing configuration, but merely a series of point configurationconstraints that must hold for the start and end points ofthe configuration path. The myriad of possible point con-figuration constraints significantly increases the challengeof finding a valid configuration path for a multi-step con-figuration problem. Section 4.3 describes how MUSCLEmodels these constraints using a CSP, which enables a CSPsolver to automatically derive solutions that adhere to theseconstraints and thus reduce tedious and error-prone manualconfiguration.

2.3 Challenge 3: Configuration Change/EdgeConstraints

The automotive example in Figure 1 requires that devel-opers adding new features spend no more than 35 milliondollars in one year. The cost of adding/removing featurescan be captured as the length or weight of the edges con-necting two transitions. For example, to transition directlyfrom the starting configuration to the desired end configu-ration requires 88 million dollars and has an edge weight of88.

Developers must not only find a path that reaches thedesired end state without violating the point configurationconstraints in Section 2.2, but also ensure that any con-straints on the edges connecting successive configurationsare met. Transitioning directly from the start configurationto end configuration would violate the edge constraint ofthe 35 million dollar yearly development budget. Edge con-straints further reduce the number of valid paths and addcomplexity to the problem. Section 4.4 shows how theseedge restrictions can be encoded as constraints on MUS-CLE’s CSP variables to plan configuration paths that adhereto development budgets, which is hard to determine manu-ally.

2.4 Challenge 4: Configuration Path Optimiza-tion

There may often be multiple correct configuration pathsthat reach the desired end point. In these cases, develop-ers would like to optimize the path chosen, for example to

minimize total cost (the sum of the edge weights). In othercases, it may be more imperative to meet the desired endpoint constraints in as few time steps as possible.

For example, in Figure 3(b), developers have an initialdevelopment budget of 35 million dollars and then a subse-quent yearly budget of 50 million dollars.

Although the cost of the path through intermediate con-figurationsBi andCi is cheaper (70 million), developersmay prefer to pass throughB0 andC0 since they will al-ready have a configuration that meets the end goals atC0.Developers must therefore not only contend with numer-ous multi-step constraints, but must also perform complexoptimizations on the properties of the configuration path.Section 4.5 shows how optimization can be performed onMUSCLE’s CSP formulation of multi-step configuration toallow developers to find the fastest and most cost-effectivemeans of achieving a configuration goal.

3 A Formal Definition of Multi-step Configu-ration

This section presents a formal model of multi-step con-figuration. In its most general form, multi-step configura-tion involves finding a sequence of at mostK configura-tions that satisfy a series of point configuration constraintsand edge constraints. This definition requires the start andend configurations meet a set of point constraints, but doesnot dictate that there be asinglevalid starting and endingconfiguration.

General formal model. We define a multi-step configuration problem using the 6-tupleMsc =<

E,PC,∆(FT ,FU),K,FStart,Fend >, where:• E is the set of edge constraints, such as the maximum

development cost per year for features,• PC is the set of point configuration constraints that

must be met at each step, such as the feature modelrules that developers may require to be adhered toacross all steps (feature model rules do not have to beenforced at each time step),

• ∆(FT ,FU) is a function that calculates the change costor edge weight of moving from a configurationFT atstepT to a configurationFU at stepU ,

• K is the maximum number of steps in the configurationproblem,

• FStart is a set of configuration constraints on the start-ing configuration, such as a list of features that mustinitially be selected,

• Fend is a set of configuration constraints on the finalconfiguration, such as a list of features that must beselected or maximum cost of the final configuration.

We define a configuration path from stepT overK stepsas a K-tuple

P =< FT ,FT+1, . . .FT+K−1 >

, where the configuration at stepT is denoted byFT . Eachconfiguration,FT , denotes the set of selected features at stepT.

Section 4 shows how this formal model can be specifiedas a CSP. Although we use CSPs for reasoning on the for-mal model, we could also use SAT solvers, propositionallogic, or other techniques to reason about this model. Theformal model is thus applicable to a wide range of reasoningapproaches.

3.1 Constraint and Function ExamplesWe now describe how the formal model presented above

can be used to model typical SPL configuration constraints.We show how common configuration needs, such as the se-lection of specific features or budgetary constraints, can bemapped to portions of our multi-step configuration problemtuple.

Edge constraint examples.The set of edge constraintsE can include numerous types of constraints on the tran-sition from one configuration to another. For example, aconstrainte1 ∈ E may dictate that the maximum weight ofany edge between successive configurations inFT ,FT+1 ∈Phave at most weight 35 (for the automotive problem fromFigure 1):

∀T ∈ (0..K−1), ∆(FT ,FT+1) ≤ 35

Edge constraints may also vary depending on the step, forexample a development budget may start at $35 million andmay expand as a function of the step:

∀T ∈ (0..K−1), ∆(FT ,FT+1) ≤35

1− (.01∗T)

Edge constraints may also be attached to specific time steps:

∀T ∈ (0..4,6..K−1), ∆(FT ,FT+1) ≤ 351−(.01∗T)

∆(F5,F6) ≤ 40

Point configuration constraint examples.The point con-figuration constraints specify properties that must hold forthe feature selection at a given step. Both the starting andending points for the multi-step configuration problem aredefined as point configuration constraints on the first andlast steps. For example, we want to start at a specific con-figurationFstart and reach another configurationFend:

(F0 = Fstart)∧ (FK = Fend)

Another general constraintpc1 ∈ PC could require that forany stepT, the feature selectionFT satisfies the featuremodel constraintsFc:

∀T ∈ (0..K−1), FT ⇒ Fc

Developers could also require that a specific set of featuresFstart, such as safety critical braking features, be selected atall times:

∀T ∈ (0..K−1), Fstart ⊂ FT

Change calculation function examples.The function∆(FT ,FU) calculates the cost of changing from one config-uration to another configuration at a different step. For ex-ample, the following change calculation function computesthe cost of changing from one configuration to another:

Fadded = FU −FT

∆(FT ,FU) = ∑ fi ∗ ci, fi ∈ Fadded

wherefi is theith added feature andci is the price of addingthat feature.

4 A CSP Model of Multi-step ConfigurationThis section describes how MUSCLE uses CSPs to auto-

matically derive solutions to multi-step configuration prob-lems. To address the challenges outlined in Section 2 weshow that deriving a configuration path for a multi-step con-figuration problem can be modeled as a CSP [9] using theformal framework from Section 3. After a CSP formulationof a multi-step configuration problem is built, MUSCLE canuse a CSP solver to automatically derive a valid configura-tion path on behalf of the developer. Automating the con-figuration path derivation helps reduce the complexity fromChallenge 1 in Section 2.1. Moreover, the CSP solver canbe used to perform optimizations that would be extremelyhard to achieve manually.

Prior work on automated feature model configuration [3,15, 16] has yielded a framework for representing featuremodels and configuration problems as CSPs. This sectionshows how a new formulation of feature models and con-figuration problems can be developed that (1) incorporatesmultiple steps, (2) allows a constraint solver to derive a con-figuration path for evolving a feature selection over mul-tiple intermediate steps to meet an end goal, (3) permitsthe specification of intermediate configuration constraints,(4) allows for change/edge constraints on the transition be-tween feature selections, and (5) can be leveraged to opti-mize configuration path properties, such as path length orcost.

4.1 CSP Automated Configuration BackgroundA CSP is a set of variables and a set of constraints over

the variables. For example,(X −Y > 0)∧ (X < 10) is asimple CSP involving the integer variablesX andY. Aconstraint solver is an automated tool that takes a CSP asinput and produces alabelingor set of values for the vari-ables that simultaneously satisfies all of the constraints.Thesolver can also be used to find a labeling of the variablesthat maximizes or minimizes a function of the variablese.g.maximizeX +Y yieldsX = 9,Y = 8.

A feature model can be modeled as a CSP through a se-ries of integer variablesF, where the variablefi ∈ F cor-responds to theith feature in the feature model. A config-uration is defined as a series of values for these variables

Figure 4: Mapping Feature Model Rules to a CSP

such thatfi = 1 implies that theith feature is selected in theconfiguration. If theith feature is not selected,fi = 0. Con-figuration rules from the feature model are represented asconstraints over the variables inF , as shown in Figure 4.

More details on building a CSP from a feature model aredescribed in [15, 3].

4.2 Introducing Multiple Steps into the CSPThe goal of automated configuration over multiple-steps

is to find a configuration path that permutes a given startingconfiguration through a sequence of intermediate configura-tions to reach a desired end state. For example, the configu-ration paths in Figure 2 capture sequential modifications tothe car configuration, shown in Figure 1, that will incorpo-rate high-end features into the base automobile model. Toreason about a configuration path over a span of steps, wefirst introduce a notion of a configuration step into MUS-CLE’s CSP model of configuration.

CSP model of configuration steps.To introduce con-figuration steps into MUSCLE’s configuration CSP, wemodify the configuration CSP formulation outlined in Sec-tion 4.1. We no longer use a variablefi to refer to whetheror not theith feature is selected or deselected. Instead,werefer to the selection state of each feature at a specificstep T with the variablefiT , i.e., if the ith feature is se-lected at stepT, fiT = 1. We refer to an entire configura-tion at a specific step as a set of values for these variables,fiT ∈ FT . A solution to the CSP is configuration path de-fined by a labeling of all of the variables in the K-tuple:< FT ,FT+1 . . .FT+K−1 >.

For example, if theABS feature (denotedfa) is not se-lected at stepT and is selected at stepT +1, then:

faT = 0faT+1 = 1

Figure 5 shows a visualization of how thefiT ∈ FT vari-ables map to feature selections.4.3 CSP Point Configuration Constraints

To address Challenge 2 from Section 2.2, the point con-figuration constraints (which are the constraints that define

Figure 5: An Example of Variables Representing FeatureSelection State at Specific Steps

what constitutes a valid intermediate configuration) can bemodeled as constraints on the variablesfiT ∈FT . Each pointconfiguration constraint has a specific set of steps,Tpc, dur-ing which it must be met,i.e., the constraint must only eval-uate to true on the precise steps for which it is in effect.For example, a simple constraint would be that the 2nd and3rd configurations must have the featuref1 selected. Theset of steps for which this constraint must hold would beTpc = {2,3}.

CSP model of point configuration constraints.A CSPpoint configuration constraint,pci ∈ PC, requires that:

∀T ∈ Tpc, FT ⇒ pci

Arbitrary point configuration constraints can be built usingthis model to restrict the valid configurations that are passedthrough by the configuration path. This flexible point con-figuration constraint mechanism allows developers to spec-ify and automatically find solutions to problems involvingthe constraints from Challenge 2 in Section 2.2.

CSP point configuration constraint example.Assumethat we want to find values forFT . . .FT+K such that wenever violate any of the feature model constraints at anystep. Further assume that the constraints in the featuremodel remain static over theK steps (feature model changesover multiple steps can also be modeled). If thejth featureis a mandatory child of theith feature, we add the constraint:

∀T ∈ (0. . .K), ( fiT = 1) ⇔ (FjT = 1)

That is, we require that at any stepT, if the ith feature (FiT )is selected, thejth feature (f jT ) is also selected. Further-more, at any stepT, if the jth feature (FjT ) is selected, theith feature (fiT ) is also selected. Other example point con-figuration constraints can be mapped to the CSP as shownin Figure 6(a) and Figure 6(b).

(a) Point Configuration Constraints for Feature Model Structure

(b) Point Configuration Constraint for Feature Selection

Figure 6: Point Configuration Constraint Examples

4.4 CSP Edge/Change ConstraintsChallenge 3 from Section 2.3 described how developers

must be able to specify and adhere to constraints on the dif-ference between two configurations at different steps. Theseedge/change constraints can be modeled in the CSP as con-straints over the variables in two configurationsFT andFU .By extending the CSP techniques we have developed in pastwork [16], we can specifically capture which features areselected or deselected between any two steps and constrainthese changes via budget or other restrictions.

CSP model of edge/change constraints.To capture dif-ferences between feature selections between stepsT andU ,we create two new sets of variablesSTU andDTU. Thesevariables have the following constraints applied to them:

∀siTU ∈ STU, (siTU = 1) ⇔ ( fiT = 0)∧ ( fiU = 1)∀diTU ∈ DTU, (diTU = 1) ⇔ ( fiT = 1)∧ ( fiU = 0)

If a feature is selected at time stepT and not at time stepU ,thendiTU is equal to 1. Similarly, if a feature is not selectedat stepT and selected at stepU , siTU is equal to 1.

An edgeedge(T,U) between the configurations at stepsT andU is defined as a 2-tuple:

edge(T,U) =< DTU,STU >

An edge is thus defined by the features deselected and se-lected to reach configurationFU from configurationFT . The

weight of the edgeweight(edge(T,U)) can then be calcu-lated as a function of the edge tuple. For example, if theithfeature costsci to add or remove then

weight(edge(T,U)) =n

∑i=0

siTU ∗ ci +n

∑i=0

diTU ∗ ci

CSP edge/change constraint example.The cost of in-cluding a particular feature may change over time. Forexample, the cost of adding a GPS guidance system to acar does not remain fixed, but instead typically decreasesfrom one year to the next as GPS technology is commodi-tized. We can model and account for these changes inMUSCLE’s CSP formulation and constrain the configura-tion path so that it adds features at times when they are suf-ficiently cheap. We will define an edge constraint that takesinto account changing feature modification costs and limitsthe change in cost between two successive configurations to$35 million dollars.

We assume we can calculate that the price of includingthe ith feature so that it is included in the feature selectionat stepT by the function:

Cost(i,T) =ci

T +1

We can then define the cost of adding features to a configu-ration as:

weight(edge(T,T +1)) =n

∑i=1

(siT T+1∗Cost(i,T +1))

We can now limit the cost of any two successive configura-tions via the edge constraint:

∀T ∈ (0..K−1), weight(edge(T,T +1))≤ 35

4.5 Multi-step Configuration OptimizationChallenge 4 from Section 2.4 showed that optimizing the

configuration path is an important issue. CSP solvers canautomatically perform optimization while finding values forthe variables in a CSP (though it may be impractical time-wise for some problems). We can define goal functions overthe CSP variables to leverage these optimization capabilitiesand address Challenge 4.

In some cases, developers may not want to just find anyconfiguration path that ends in the desired state. Instead,they may want a path that produces a configuration thatmeets the end goals as early as possible. For example, in theautomotive problem from Section 1 developers may want tofind a configuration path that meets their constraints and in-cludes the high-end features in the base model in fewer thanfive years.

CSP model of path length. To support path lengthoptimization, we define a measure of the number of stepsneeded to reach a valid end state. We must therefore deter-mine if the constraints on the final configurationFend (whichis the goal state) are met by some configuration prior to the

last configuration (FT whereT < K −1). If we meet the fi-nal state constraints sooner than the final configuration, thenwe have found a configuration process that requires fewerconfiguration steps.

To track whether or not a configuration has met the con-straints on the ending configurationFend, we create a seriesof variableswT ∈W to represent whether or not the configu-rationFT ∈ P satisfiesFend. For each configuration,FT ∈ P,if Fend is satisifed:

(FT ⇒ Fend) ⇒ (wT = 1)

i.e., if at any step (up to and including the last step) wesatisfy the end state requirements, setwT equal to 1. Wealso require that after one step has reached a correct end-ing configuration, the remaining steps also keep the correctconfiguration and do not alter it:

(wT = 1) ⇒ (wT+1 = 1)(wT = 1) ⇒ (∑n

i=0siT T+1 + ∑ni=0diTT+1 = 0)

Optimization examples.We can optimize to find the short-est configuration path to reach the goals over K steps byasking the solver to maximize:

K−1

∑T=0

wT

The reason that maximizing this sum will minimize thenumber of steps taken to reach the desired end state is thatthe sooner the state is reached, the more stepswT will equal1.

The most straightforward optimization functions are de-fined as functions of the variables in the configuration pathP. For example, we can instruct the solver to minimize thecost of the ending configuration. Assume that the cost ofithfeature at stepK is denoted by the variableci ∈ CK , mini-mizeCK , where:

CK =n

∑i=0

fi ∗ ci

Other optimizations can be performed on the weights ofthe edges. For example, to find the configuration path withthe lowest development cost, where the development cost isthe edge weight the goal is to minimize:

K−1

∑T=0

weight(edge(T,T +1))

4.6 A Complete Multi-step CSP Example

We now provide a complete mapping of the automotiveconfiguration problem in Section 1 to MUSCLE’s multi-step CSP. For this problem, the automotive developers wantto include the high-end features into the base model overthe course of five years (K = 5). We first create a series ofconfiguration variables to represent the feature selectionatthe end of each of the five years:

Figure 7: Point Configuration Constraints for the Automo-bile Example

F0 = ( f00, f10, . . . f80)F1 = ( f00, f11, . . . f81)

. . .

F4 = ( f00, f14, . . . f84)

The mappings of the automobile features from Figure 1 toCSP variables can be seen in Figure 7. A configuration pathis defined by a set of feature selections for each of the fiveyears:

P =< F0,F1, . . .F4 >

The point constraint,pc0 ∈ PC, ensures that the featuremodel constraints are met by each year’s configuration, asshown in Figure 7. We must also specify the point configu-ration constraint for the starting configuration:

Fstart = ( f00 = 1)∧ ( f10 = 1)∧ ( f20 = 0)∧ ( f30 = 0)∧( f40 = 0)∧ ( f50 = 0)∧ ( f60 = 0)∧ f70 = 0)∧( f80 = 0)

Moreover, we must ensure that the high-end features areincluded in the last configuration:

Fend = ( f74 = 0)∧ ( f04 = 1)∧ ( f14 = 1)∧ ( f24 = 1)∧( f34 = 1)∧ ( f44 = 1)∧ ( f54 = 1)∧ ( f64 = 1)∧( f84 = 1)

Our complete set of point configuration constraints isPC=(pc0).

Finally, we must specify how the change cost betweentwo configurations is calculated and enforce the edge con-straint that at most $35 million dollars is spent per year.

∆(FT ,FT+1) = 20s2TT+1 +14s3TT+1 +19s4TT+1 +8s5TT+1

+11s6TT+1 +1s7TT+1 +16s8TT+1

E = (∀T ∈ (0..3), ∆(FT ,FT+1) ≤ 35)

Given this CSP formulation, we can use a constraint solverto automatically derive a solution to the multi-step automo-tive configuration problem described in Section 1.

5 Results

As described in Section 2.1, configuring an SPL overmultiple steps is a highly combinatorial problem. An au-tomated multi-step SPL configuration technique should beable to scale to hundreds of features and multiple steps. Thissection presents empirical results from experiments we per-formed to determine the scalability of MUSCLE.

Our experiments were performed with an implemen-tation of the MUSCLE provided by the open-source As-cent Design Studio (available fromcode.google.com/p/ascent-design-studio). The Ascent Design Stu-dio’s implementation of MUSCLE is built using the JavaChoco open-source CSP solver (available fromchoco.sourceforge.net). The experiments were performed ona computer with an Intel Core DUO 2.4GHZ CPU, 2 gi-gabytes of memory, Windows XP, and a version 1.6 JavaVirtual Machine (JVM). The JVM was run in server modeusing a heap size of 40 megabytes (-Xms40m) and a maxi-mum memory size of 256 megabytes (-Xmx256m).

To test the scalability of MUSCLE we needed 1,000s offeature models to test with, which posed a problem sincethere are not many large-scale feature models available toresearchers. To solve this problem, we used a random fea-ture model generator developed in prior work [16]. Thefeature model generator and code for these experiments isavailable in open-source form along with the Ascent DesignStudio. We used a maximum branching factor of 5 childrenper feature and a maximum of 1/3 of the features were in anXOR group.1

We measured the solving time of MUSCLE by gener-ating random multi-step configuration problems and solv-ing for configuration paths that involved larger and largernumbers of steps. The problems were created by generatinga semi-random feature model with 500 features as well asstarting and ending configurations. MUSCLE was used toderive a configuration path between the two configurations.

Our experiments were performed withlarge-scale con-figuration paths, which were produced by forcing the solverto find a configuration path that involved switching betweentwo children of the root feature that were involved in anXOR group. For a feature model with 500 features config-ured over 3 steps, the worst case solving time we observedwas roughly 3 seconds. The worst case solving time for fea-ture models configured over 10 steps was 16s. These initialresults indicate that the technique should be sufficiently fastfor feature models with hundreds of features.

6 Related Work

This section compares MUSCLE with related work.

1XOR feature groups are features that require the set of theirselectedchildren to satisfy a cardinality constraint (the constraint is 1..1 for XOR).

Automated Single-step Configuration:Several single-step feature model configuration and validation techniqueshave been proposed [2, 13, 1, 4, 5, 15]. These techniquesuse CSPs and propositional logic to derive feature modelconfigurations in a single stage as well as assure their va-lidity. These techniques help address the high complexityof finding a valid feature selection for a feature model thatmeets a set of intricate constraints.

While these techniques are useful for the derivation andvalidation of configurations in a single step, they do notconsider feature configuration over the course of multiplesteps. In scenarios, such as the automotive example fromSection 1, the ability to reason about configuration overmultiple steps is critical. MUSCLE provides this automatedreasoning across multiple steps. Moreover, MUSCLE canalso be used for single-step configurations since it is a spe-cial case of multi-step configuration where there is only onestepK = 1.

Staged Configuration: Czarnecki et al. [7] describe amethod for using staged feature selection to achieve a fi-nal target configuration. This multi-stage selection con-siders cases in which the selection of features in a previ-ous stage impacts the validitiy of later stage feature selec-tions. Our technique also examines the production of a fea-ture model configuration over multiple configuration steps.MUSCLE is complementary to Czarnecki et al.’s work andprovides a general formal framework that can be used toperform automated reasoning on staged configuration pro-cesses. Moreover, MUSCLE can also be used to reasonabout other multi-step configuration processes that do notfit into the staged configuration model, such as the the ex-ample from Section 1 where each step must reach a validconfiguration.

Staged configuration can be modeled as a special in-stance of multi-step configuration. Specifically, staged con-figuration is an instance of a multi-step configuration prob-lem where:E = /0, Fstart = /0, Fend = (FK−1 ⇒ Fc), K isset to the number of stages,∆(FT ,FU) is not defined, andFc is the set of feature model constraints. That is, thereare no limitations on the changes that can be made betweensuccessive configurations, the starting configuration has nofeatures selected, and the ending configuration yields a validfeature model configuration. The staged configuration def-inition can be refined to guarantee that successive stagesonly add features:∀T ∈ (0..K −1),FT ⊂ FT+1.

Classen et al. [12] have investigated creating a formalsemantics for staged configuration. Furthermore, they haveprovided a definition of a configuration path through a se-ries of stages for a feature model. Whereas Classen et al.have focused on configuration paths that continually reducevariability, MUSCLE is a formal model that allows for boththe reduction and introduction of variability in the config-uration process. Moreover, MUSCLE allows for complete

configurations to be met at multiple points in the configura-tion process.

Quality Attribute Evaluation: Several techniques havebeen proposed for using quality attribute evaluation [8, 10,14] to guide a configuration process. These techniques pro-vide a framework for assessing the impact of each featureselection on the overall capabilities of the configured sys-tem. As a result, quality characteristics, such as reliability,can be taken into account when selecting features. Thesetechniques are also designed for single step configurationprocesses. These techniques could be used in a complemen-tary fashion to MUSCLE to produce the point configuration,edge, and other constraints in the multi-step configurationmodel.

7 Concluding RemarksMany production SPL configuration problems require

developers to evolve a configuration over multiple steps,rather than in a single iteration. Multi-step configuration,however, must take into account constraints on the changebetween successive configurations, such as the increase incost of an automobile’s configuration from one year to thenext. Moreover, even though configuration is performedover multiple steps, a valid configuration must still be pro-duced at the end of each year, further adding complexitywhile maintaining a functional system configuration.

It is hard to determine a sequence of feature model con-figurations and feature selections such that an initial config-uration can be transformed into a desired target configura-tion. This paper introduces a technique, called theMUlti-step Software Configuration probLEm solver(MUSCLE),for modeling and solving multi-step configuration prob-lems. MUSCLE represents the problem as a CSP, whichenables CSP solvers to determine a path from a starting con-figuration to a target configuration. The output from MUS-CLE is a valid sequence of feature selections that will leadfrom a starting configuration to the desired target configu-ration while also taking into account resource constraints.

Open-source implementations of MUSCLE are availablein the Ascent Design Studio (ascent-design-studio.googlecode.com) and FAMA (www.isa.us.es/fama)2.

References

[1] D. Batory. Feature Models, Grammars, and PrepositionalFormulas.Software Product Lines: 9th International Confer-ence, SPLC 2005, Rennes, France, September 26-29, 2005:Proceedings, 2005.

[2] D. Benavides, S. Segura, P. Trinidad, and A. Ruiz-Cortés.FAMA: Tooling a framework for the automated analysis offeature models. InProceeding of the First International

2This work has been partially supported by the European Commission(FEDER) and Spanish Government under CICYT project Web-Factories(TIN2006-00472) and the Andalusian Government project ISABEL (TIC-2533)

Workshop on Variability Modelling of Software-intensiveSystems (VAMOS), 2007.

[3] D. Benavides, P. Trinidad, and A. Ruiz-Cortes. AutomatedReasoning on Feature Models. InProceedings of the 17thConference on Advanced Information Systems Engineering,Porto, Portugal, 2005. ACM/IFIP/USENIX.

[4] D. Beuche. Variant Management with Pure:: variants.Technical report, Pure-Systems GmbH, http://www.pure-systems.com, 2003.

[5] R. Buhrdorf, D. Churchett, and C. Krueger. Salion’s Expe-rience with a Reactive Software Product Line Approach. InProceedings of the 5th International Workshop on ProductFamily Engineering, Siena, Italy, November 2003.

[6] P. Clements and L. Northrop.Software Product Lines: Prac-tices and Patterns. Addison-Wesley, Boston, USA, 2002.

[7] K. Czarnecki, S. Helsen, and U. Eisenecker. Staged Con-figuration Using Feature Models.Software Product Lines:Third International Conference, SPLC 2004, Boston, MA,USA, August 30-September 2, 2004: Proceedings, 2004.

[8] L. Etxeberria and G. Sagardui. Variability Driven QualityEvaluation in Software Product Lines. InSoftware ProductLine Conference, 2008. SPLC’08. 12th International, pages243–252, 2008.

[9] P. V. Hentenryck.Constraint Satisfaction in Logic Program-ming. MIT Press, Cambridge, MA, USA, 1989.

[10] A. Immonen. A method for predicting reliability and avail-ability at the architectural level.Research Issues in SoftwareProduct-Lines-Engineering and Management, T. Käkölä andJC Dueñas, Editors, 2005.

[11] K. C. Kang, S. Kim, J. Lee, K. Kim, E. Shin, and M. Huh.FORM: A Feature-Oriented Reuse Method with Domain-specific Reference Architectures.Annals of Software Engi-neering, 5(0):143–168, January 1998.

[12] B. Littlewood and L. Strigini. A Formal Semantics for Multi-level Staged Configuration. InProceedings of the ThirdWorkshop on Variability Modelling of Software-intensiveSystems, pages 51–60, January 2009.

[13] M. Mannion. Using first-order logic for product line modelvalidation. Proceedings of the Second International Confer-ence on Software Product Lines, 2379:176–187, 2002.

[14] F. Olumofin and V. Misic. Extending the ATAM Ar-chitecture Evaluation to Product Line Architectures. InIEEE/IFIP Working Conference on Software Architecture,WICSA, 2005.

[15] J. White, A. Nechypurenko, E. Wuchner, and D. C. Schmidt.Automating Product-Line Variant Selection for Mobile De-vices. InProceedings of the 11th Annual Software ProductLine Conference (SPLC), Kyoto, Japan, Sept. 2007.

[16] J. White, D. C. Schmidt, D. Benavides, P. Trinidad, andA. Ruiz-Cortez. Automated Diagnosis of Product-line Con-figuration Errors in Feature Models. InProceedings of theSoftware Product Lines Conference (SPLC), Limerick, Ire-land, Sept. 2008.

automated reasoning for multi-step feature model …dbc/dbc_archivos/pubs/white09-splc.pdf · 2009....

Documents