knowledge-based risk assessment and cost estimation


Automated Software Engineering, 2, 219-230 (1995) © 1995 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.

Knowledge-Based Risk Assessment and Cost Estimation

RAYMOND J. MADACHY [email protected] USC Center for Software Engineering, University of Southern California, Los Angeles, CA 90089-0781; and Software Engineering Process Group, Litton Data Systems, Agoura Hills, CA 91376-6008

Abstract. A knowledge-based method for software project risk assessment and cost estimation has been implemented on multiple platforms. As an extension to the Constructive Cost Model (COCOMO), it aids in project planning by identifying, categorizing, quantifying and prioritizing project risks. It also detects cost estimate input anomalies and provides risk control advice in addition to conventional COCOMO cost and schedule calculation.

The method has been developed in conjunction with a system dynamics model of the software development process, and serves as an intelligent front end to the simulation model. It extends previous research in the knowledge-based cost estimation domain by focusing on risk assessment, incorporating substantially more rules, going beyond standard COCOMO, performing quantitative validation, providing a user-friendly interface, and integrating it with a dynamic simulation model.

Results of the validation are promising, and the method is being used at Litton Data Systems and other industrial environments. It will be undergoing further enhancement as part of an integrated capability for software engineering to assist in system acquisition, project planning and risk management.

Keywords: software cost estimation, software risk management, knowledge-based software engineering, COCOMO.

Introduction

The objective of software risk management is to identify, address and eliminate risk items before undesirable outcomes occur. It is often very difficult to implement because of the scarcity of seasoned experts and the unique characteristics of individual projects. However, the practice of risk management can be improved by leveraging on existing knowledge and expertise. In particular, expert knowledge can be employed during cost estimation activities by using cost factors for risk identification and assessment to detect patterns of project risk. During cost estimation, consistency constraints and cost model assumptions may be violated or an estimator may overlook project planning discrepancies and fail to realize risks. Approaches for identifying risks are usually separate from cost estimation, thus a technique that identifies risk in conjunction with cost estimation is an improvement.

COCOMO is a widely used cost model that incorporates the use of cost drivers to adjust effort calculations. As significant project factors, cost drivers can be used for risk assessment using sensitivity analysis or Monte-Carlo simulation, but this approach uses them to infer specific risk situations.

At project inception, for example, a manager who is inexperienced and/or lacking sufficient time to do a thorough analysis may have a vague idea that the project is risky, but will not know exactly which risks to mitigate and how. With automated assistance, the identified risks derived from cost inputs are used to create mitigation plans based on the relative risk severities and provided advice.

The method described herein is encapsulated in a tool called Expert COCOMO. In conjunction with a dynamic model of an inspection-based software lifecycle process to support quantitative evaluation of the process, these modeling techniques can support project planning and management, and aid in process improvement.

The remainder of this paper provides background and related work, the methodology used during development, implementation details including an example session, and conclusions. The background and details of the simulation aspect of this work can be found in (Madachy, 1994).

Background

This research has drawn upon the related software engineering disciplines of knowledge-based methods, risk management and cost estimation. The sections below provide a brief background to major areas relative to this work.

Recent research in knowledge-based assistance for software engineering is on supporting all lifecycle activities (Green et al., 1983), though much past work has focused on automating coding activities. Improvements have been made in transformational methods, but there has been much less progress towards accumulating knowledge bases for large scale software engineering processes (Boehm, 1992). Despite the potential of capturing expertise to assist in project management functions such as cost estimation and risk management, few applications have specifically addressed such concerns.

Cost Estimation

Cost models are commonly used for project planning and estimation to predict both the person effort and elapsed time of a project. The most widely accepted and thoroughly documented software cost model is Boehm's COCOMO (Boehm, 1981). The model is incorporated in many of the estimation tools used in industry and research. The multi-level model provides formulas for estimating effort and schedule using cost driver ratings to adjust the estimated effort.

The COCOMO model estimates software effort as a nonlinear function of the product size and modifies it by a geometric product of effort multipliers associated with cost driver ratings. The cost driver variables include product attributes, computer attributes, personnel attributes and project attributes. The revised Ada COCOMO model adds some more cost drivers, including process attributes (Boehm-Royce, 1989). The COCOMO 2.0 project is currently underway to update the model for new development processes and products, and incorporates a revised set of cost drivers (Boehm et al., 1995).
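The effort relationship described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's implementation: the coefficients are the intermediate COCOMO values published in Boehm (1981), while the function name `cocomo_estimate` and the sample multiplier values are assumptions made for the example.

```python
from math import prod

# Intermediate COCOMO coefficients (a, b) for effort and exponent c for
# schedule, per development mode, as published in Boehm (1981).
MODES = {
    "organic": (3.2, 1.05, 0.38),
    "semidetached": (3.0, 1.12, 0.35),
    "embedded": (2.8, 1.20, 0.32),
}


def cocomo_estimate(kdsi, mode, effort_multipliers=()):
    """Return (effort in person-months, schedule in months).

    kdsi is the size in thousands of delivered source instructions.
    effort_multipliers holds one multiplier per rated cost driver;
    their geometric product adjusts the nominal, size-based effort.
    """
    a, b, c = MODES[mode]
    adjustment = prod(effort_multipliers)  # empty input gives 1.0 (all nominal)
    effort = a * kdsi ** b * adjustment
    schedule = 2.5 * effort ** c
    return effort, schedule


# Illustrative multipliers only (assumed ratings for two drivers).
effort, months = cocomo_estimate(32, "embedded", [1.30, 1.19])
print(f"{effort:.0f} person-months over {months:.1f} months")
```

Because the multipliers enter as a product, a few drivers rated at their costliest values compound quickly, which is the property the risk rules later exploit.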

Mitre developed an Expert System Cost Model (ESCOMO) employing an expert system shell on a PC (Day, 1987). It used 46 rules involving COCOMO cost drivers and other model inputs to focus on input anomalies and consistency checks with no quantitative risk assessment.

Risk Management

Risk is the possibility of undesirable outcome, or a loss. Risk impact, or risk exposure is defined as the probability of loss multiplied by the cost of the loss.
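As a worked illustration of this definition, with symbols and numbers chosen only for the example (they do not come from the paper), write RE for risk exposure, P(L) for the probability of the loss and C(L) for its cost:

$$RE = P(L) \times C(L), \qquad \text{e.g. } P(L) = 0.3,\; C(L) = 200 \text{ person-months} \;\Rightarrow\; RE = 0.3 \times 200 = 60 \text{ person-months}.$$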

Risk management is a new discipline whose objectives are to identify, address and eliminate software risk items before they become either threats to successful software operation or major sources of software rework (Boehm, 1989). There is documented evidence of many software development failures to highlight the need for risk management practice (Charette, 1989).

Examples of risk in software development include exceeding budget, schedule overrun, or delivering an unsuitable product. These illustrate the classic risk taxonomy of cost, schedule and performance (technical). Boehm identifies the top 10 generic software risk items in (Boehm, 1989). In practice, risks must be identified as specific instances to be manageable.

Software risk management involves a combination of methods used to assess and control risk, and is an ongoing activity throughout a development project. Some common techniques used include performance models, cost models, checklists, network analysis, decision analysis, quality factor analysis and others (Boehm, 1989; Rook, 1993).

Management of risk involves both risk assessment and risk control (Boehm, 1989; Charette, 1989). The substeps in risk assessment are risk identification, risk analysis (evaluating the magnitudes of loss probability and consequence) and risk prioritization, whereas risk control entails risk management planning, risk resolution and risk monitoring.

Risk management is heavily allied with cost estimation (Boehm, 1989; Charette, 1989; Rook, 1993). Cost estimates are used to evaluate risk and perform risk tradeoffs, risk methods such as Monte-Carlo simulation can be applied to cost models, and the likelihood of meeting cost estimates depends on risk management.

Risk management attempts to balance the triad of cost-schedule-functionality (Charette, 1989; Boehm, 1989). Though cost, schedule and product risks are interrelated, they can also be analyzed independently. Some methods used to quantify cost, schedule and performance risk include table methods, analytical methods, knowledge based techniques, questionnaire-based methods and others.

A risk identification scheme has been developed by the Software Engineering Institute (SEI) that is based on a risk taxonomy (Carr et al., 1993). A hierarchical questionnaire is used by trained assessors to interview project personnel. The risk classes are product engineering, development environment and program constraints.

Knowledge-based methods can be used to assess risk and provide advice for risk mitigation. Incorporation of expert system rules can place considerable added knowledge at the disposal of the software project planner or manager to help avoid high-risk development situations and cost overruns.

Toth has developed a knowledge-based software technology risk advisor (STRA) (Toth, 1994), which provides assistance in identifying and managing software technology risks. Whereas Expert COCOMO uses knowledge of risk situations based on cost factors to identify and quantify risks, STRA uses a knowledge base of software product and process needs, satisfying capabilities and maturity factors. Risk areas are inferred by evaluating disparities between needs and capabilities. STRA focuses on technical product risk while Expert COCOMO focuses on cost and schedule risk.

A knowledge-based project management tool has also been developed to assist in choosing the software development process model that best fits the needs of a given project (Sabo, 1993). It also performs remedial risk management tasks by alerting the developer to potential conflicts in the project metrics. This work was largely based on a process structuring decision table in (Boehm, 1989), and operates on knowledge of the growth envelope of the project, understanding of the requirements, robustness, available technology, budget, schedule, haste, downstream requirements, size, nucleus type, phasing, and architecture understanding.


Method

Knowledge engineering involved choosing appropriate abstractions for formulating heuristics, iterative elicitation of expert knowledge, representation of the knowledge for diagnosis, and testing of the expert system. Additionally, a risk quantification scheme was devised. Cost drivers in the COCOMO model were identified very early as a complete set of attributes for project risk diagnosis, and this approach leveraged off of them.

Knowledge was acquired from written sources on cost estimation (Boehm, 1981; Boehm-Royce, 1989; Day, 1987), risk management (Boehm, 1989; Charette, 1989) and domain experts including Dr. Barry Boehm, Walker Royce and this author.

A matrix of COCOMO cost drivers was used as a starting point for identifying risk situations as a combination of multiple cost attributes, and the risks were formulated into a set of rules. As such, the risk assessment scheme represents a heuristic decomposition of cost driver effects into constituent risk escalating situations.

A risk situation can be described as a combination of extreme cost driver values indicating increased effort, whereas an input anomaly may be a violation of COCOMO consistency constraints such as an invalid development mode given size or certain cost driver ratings. Risk items are identified, quantified, prioritized and classified depending on the cost drivers involved and their ratings.

A typical risk situation can be visualized in a 2-D plane as shown in Figure 1, where each axis is defined as a cost attribute rating range. The curves represent iso-risk contours, and this figure shows risk increasing towards the top right corner. An example would be for complex product development in conjunction with low analyst capability; risk would be increasing in the direction of increasing product complexity (CPLX) and decreasing analyst capability (ACAP). As seen in the figure, the continuous representation is discretized into a table. A risk condition corresponds to an individual cell containing an identified risk level. The rules use cost driver ratings to index directly into these tables of risk levels. The tables constitute the knowledge base for risk situations defined as interactions of cost attributes.

Figure 1. Typical assignment of risk levels. (Iso-risk contours over Attribute 1, rated very low to extra high, and Attribute 2, rated very low to very high, discretized into a table whose cells hold the risk levels MODERATE, HIGH and VERY HIGH.)
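The table lookup described above can be sketched as follows. The cell contents are hypothetical: they only mimic the triangular corner pattern of Figure 1 for the CPLX and ACAP example and are not the tool's actual risk table; the names `CPLX_ACAP_TABLE` and `risk_level` are likewise invented for the illustration.

```python
# Hypothetical risk table for the interaction of product complexity (CPLX)
# and analyst capability (ACAP). Keys are (ACAP rating, CPLX rating);
# combinations that are absent carry no identified risk.
CPLX_ACAP_TABLE = {
    ("very low", "high"): "moderate",
    ("very low", "very high"): "high",
    ("very low", "extra high"): "very high",
    ("low", "very high"): "moderate",
    ("low", "extra high"): "high",
    ("nominal", "extra high"): "moderate",
}


def risk_level(table, rating_1, rating_2):
    """Index a risk table by two cost driver ratings.

    Returns the risk level stored in the cell, or None when the
    combination is not a risk condition.
    """
    return table.get((rating_1, rating_2))


print(risk_level(CPLX_ACAP_TABLE, "very low", "extra high"))  # very high
print(risk_level(CPLX_ACAP_TABLE, "nominal", "nominal"))      # None
```

Storing only the risky cells keeps each table small, and a user-editable table amounts to changing dictionary entries rather than rule logic.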

After several iterations of the prototype, the experts were engaged again to help quantify the risks. A quantitative risk weighting scheme was developed that accounts for the nonlinearity of the assigned risk levels and cost multiplier data to compute overall risks for each category and the entire project according to

$$\text{project risk} = \sum_{j=1}^{\#\text{categories}} \; \sum_{i=1}^{\#\text{category risks}} \text{risk level}_{i,j} \times \text{effort multiplier product}_{i,j}$$

where risk level = 1 (moderate), 2 (high) or 4 (very high), and

effort multiplier product = (driver #1 effort multiplier) * (driver #2 effort multiplier) ... * (driver #n effort multiplier).

If the risk involves a schedule constraint (SCED), then

effort multiplier product = (SCED effort multiplier)/(relative schedule) * (driver #2 effort multiplier) ... * (driver #n effort multiplier).

The risk level corresponds to the probability of the risk occurring and the effort multiplier product represents the cost consequence of the risk. The product involves those effort multipliers involved in the risk situation. When the risk involves a schedule constraint, the product is divided by the relative schedule to obtain the change in the average personnel level (the near-term cost) since the staffing profile is compressed into a shorter project time. The risk assessment calculates general project risks, indicating a probability of not meeting cost, schedule or performance goals.

The risk levels were normalized to provide meaningful relative risk indications. Sensitivity analysis was performed to determine the sensitivity of the quantified risks with varying inputs, and extreme conditions were tested. An initial scale with benchmarks for low, medium and high overall project risk was developed as follows: 0-15 low risk, 15-50 medium risk, 50-350 high risk.
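A minimal sketch of the weighting scheme and the benchmark scale follows. Several details are assumptions: the function names are invented, the multiplier values are illustrative, the normalization step is not specified in the text and is therefore omitted, and the treatment of the shared boundaries 15 and 50 (assigned here to the higher band) is a guess.

```python
from math import prod

# Risk level weights as defined in the text.
RISK_LEVEL = {"moderate": 1, "high": 2, "very high": 4}


def risk_weight(level, multipliers, relative_schedule=None):
    """Weight of one risk: risk level times the effort multiplier product.

    multipliers maps each cost driver in the risk situation to its effort
    multiplier. When SCED is involved, the product is divided by the
    relative schedule (a fraction of the nominal schedule, e.g. 0.75).
    """
    product = prod(multipliers.values())
    if "SCED" in multipliers:
        if relative_schedule is None:
            raise ValueError("a SCED risk needs the relative schedule")
        product /= relative_schedule
    return RISK_LEVEL[level] * product


def project_risk(risks_by_category):
    """Sum the risk weights over every risk in every category."""
    return sum(
        risk_weight(*risk)
        for risks in risks_by_category.values()
        for risk in risks
    )


def risk_band(value):
    """Classify an overall risk value on the initial benchmark scale."""
    if value < 15:
        return "low"
    if value < 50:
        return "medium"
    return "high"


# Illustrative inputs: one schedule risk and one personnel risk.
risks = {
    "schedule": [("very high", {"SCED": 1.23, "CPLX": 1.65}, 0.75)],
    "personnel": [("high", {"CPLX": 1.65, "ACAP": 1.19})],
}
total = project_risk(risks)
print(round(total, 2), risk_band(total))
```

Since a rule can belong to more than one category, the double sum counts such a risk once per category it appears in.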

Implementation

A working prototype assistant called Expert COCOMO was developed that runs on a Macintosh using HyperCard. Earlier versions utilized an expert system shell, but the prototype was recoded to eliminate the need for a separate inference engine.

The Litton Software Engineering Process Group (SEPG) has also ported the rule base to a Windows environment, and has incorporated the risk assessment technique into standard planning and management practices. The tool, Litton COCOMO, encapsulates cost equations calibrated to historical Litton data and was written in Visual Basic as a set of macros within Microsoft Excel.

The tools evaluate user inputs for risk situations or inconsistencies and perform calculations for the intermediate versions of standard COCOMO (Boehm, 1981) and Ada COCOMO (Boehm-Royce, 1989; Royce, 1990). They operate on project inputs, encoded knowledge about the cost drivers, associated project risks, cost model constraints and other information received from the user.

Figure 2. Sample input screen. (Expert COCOMO form with fields for project name, size, schedule and development mode (organic, semidetached, embedded), and radio-button ratings from very low to extra high for each cost driver. Product attributes: RELY, required software reliability; DATA, data base size; CPLX, product complexity. Computer attributes: TIME, execution time constraint; STOR, main storage constraint; VIRT, virtual machine volatility; TURN, computer turnaround time. Personnel attributes include AEXP, applications experience; PCAP, programmer capability; VEXP, virtual machine experience; LEXP, programming language experience. Project attributes: MODP, use of modern programming practices; TOOL, use of software tools; SCED, required development schedule. Ada process attributes include RISK, risks eliminated by PDR, and RVOL, requirements volatility.)

The screen snapshots in subsequent figures are from the prototype, and Litton COCOMO looks very similar. The graphical user interface provides an interactive form for project input using radio buttons, provides access to the knowledge base, and provides output in the form of dialog box warnings, risk summary tables and charts, calculated cost and schedule and graphs of effort phase distributions. It provides multiple windowing, hypertext help utilities and several operational modes. The risk weight tables (seen in the bottom of Figure 1) are user-editable and can be dynamically changed for specific environments, resulting in different risk weights.

The following example is for a very risky project where many cost drivers are rated at their costliest values. Figure 2 is the input screen showing the rated attributes for the project. This data also constitutes the input for a cost estimate. In this example, it is seen that the project has a tightly constrained schedule as well as stringent product attributes and less than ideal personnel attributes. With this input data, the expert system identifies specific risk situations and quantifies them per the aforementioned formulas.

The individual risks are also ranked, and the different risk summaries are presented in a set of tables. The interface supports embedded hypertext links, so the user can click on a risk item in a list to traverse to a screen containing the associated risk table and related information.

An example risk output is seen in Figure 3, showing the overall project risk and risks for subcategories. It is seen that the leading subcategories of risk are schedule, product and personnel. Other outputs include a prioritized list of risk situations as seen in Figure 4 and a list of advice to help manage the risks. The highest risks in this example deal with schedule, and appropriate advice is provided to the user. Standard COCOMO cost and schedule estimates are also provided.

Figure 3. Sample risk outputs. (Risk summary screen showing the overall project risk and the subcategory risks: schedule 66, product 37, personnel 51, process 17, computer 28. Below the summary, one table per subcategory lists the contributing rules with their weights, together with the detected input anomalies SIZE_SCED and MODE_AEXP.)

Figure 4. Prioritized risks.

Rank  Rule       Warning
1     SCED_TIME  Tight schedule and high computer time constraint.
2     SCED_CPLX  Tight schedule and a highly complex system.
3     SCED_AEXP  Tight schedule with low applications experience.
4     SCED_MODP  Tight schedule and low use of modern programming practices.
5     SCED_VEXP  Tight schedule with low virtual machine experience.
6     RELY_MODP  High reliability with low use of modern programming practices.
7     SCED_RELY  Tight schedule and a highly reliable system.
8     TIME_ACAP  Execution time constraint and low analyst capability.
9     CPLX_ACAP  High complexity and low analyst capability.
10    SCED_ACAP  Tight schedule with low analyst capability.
11    TIME_PCAP  Execution time constraint and low programmer capability.
12    CPLX_PCAP  High complexity and low programmer capability.
13    SCED_PCAP  Tight schedule with low programmer capability.
14    SCED_TURN  Tight schedule with high turnaround time.
15    RELY_ACAP  High reliability with low analyst capability.
16    RELY_PCAP  High reliability with low programmer capability.
17    SCED_VIRT  Tight schedule with high virtual machine volatility.
18    SCED_TOOL  Tight schedule with low use of software tools.
19    STOR_ACAP  Storage constraint and low analyst capability.
20    STOR_PCAP  Storage constraint and low programmer capability.

Rule base

Currently, 77 rules have been identified of which 52 deal with project risk, 17 are input anomalies and 8 provide advice. There are over 320 risk conditions, or discrete combinations of input parameters that are covered by the rule base.

The knowledge is represented as a set of production rules for a forward chaining inference engine. The current rule base for project risks, anomalies and advice is shown in Table 1. The four letter identifiers in the rulenames are the standard abbreviations for the COCOMO cost drivers.
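How production rules of this kind might be matched against project inputs can be sketched as below. This is a simplified, single-pass illustration in the data-driven spirit of forward chaining, with no derived facts or agenda. The rule names follow the paper's naming convention and two warning texts are taken from Figure 4, but every triggering condition, the `mode_cplx` message and the helper names are assumptions made for the example.

```python
ORDER = ["very low", "low", "nominal", "high", "very high", "extra high"]


def at_least(rating, threshold):
    """True when rating is at or above threshold on the rating scale."""
    return ORDER.index(rating) >= ORDER.index(threshold)


def at_most(rating, threshold):
    """True when rating is at or below threshold on the rating scale."""
    return ORDER.index(rating) <= ORDER.index(threshold)


# Each rule: (name, type, condition over the input facts, message).
# The conditions are hypothetical stand-ins for the tool's risk tables.
RULES = [
    ("sced_cplx", "risk",
     lambda f: at_most(f["SCED"], "low") and at_least(f["CPLX"], "very high"),
     "Tight schedule and a highly complex system."),
    ("cplx_acap", "risk",
     lambda f: at_least(f["CPLX"], "very high") and at_most(f["ACAP"], "low"),
     "High complexity and low analyst capability."),
    ("mode_cplx", "anomaly",
     lambda f: f["MODE"] == "organic" and at_least(f["CPLX"], "extra high"),
     "Organic mode is unusual for an extremely complex product."),
]


def fire_rules(facts):
    """Return (name, type, message) for every rule whose condition holds."""
    return [(name, kind, msg) for name, kind, cond, msg in RULES if cond(facts)]


facts = {"MODE": "embedded", "SCED": "very low",
         "CPLX": "extra high", "ACAP": "low"}
for name, kind, message in fire_rules(facts):
    print(f"{kind:8} {name:10} {message}")
```

The fired risk rules would then be weighted and sorted to produce a ranking like the one in Figure 4.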

Figure 5 shows the rule taxonomy and corresponding risk taxonomy as previously described. For each risk category, the cost drivers involved for the particular risk type are shown in boldface. Note that most rules show up in more than one category. Figure 5 also shows the cost factors and rules for input anomalies and advice.

Validation

Testing and evaluation of the expert system has been done against the COCOMO project database and other industrial data. In one test, correlation is performed between the quantified risks versus actual cost and schedule project performance. Using the rule set on the COCOMO database shows a correlation coefficient of .74 between the calculated risk and actual realized cost in person-months/KDSI, as shown in Figure 6. Figure 7 shows risk by project number and grouped by project type for the COCOMO database. This depiction also appears reasonable and provides confidence in the method. For example, a control application is on average riskier than a business or support application.

Table 1. Rulebase

Risk: sced_cplx, sced_rely, sced_time, sced_virt, sced_tool, sced_turn, sced_acap, sced_aexp, sced_pcap, sced_vexp, sced_lexp, sced_modp, rely_acap, modp_acap, modp_pcap, tool_acap, tool_pcap, tool_modp, rely_pcap, rely_modp, cplx_acap, cplx_pcap, time_acap, time_pcap, stor_acap, stor_pcap, virt_vexp, rvol_rely, rvol_acap, rvol_aexp, rvol_sced, rvol_cplx, rvol_stor, rvol_time, rvol_turn, size_pcap, cplx_tool, time_tool, cplx_acap_pcap, rely_acap_pcap, rely_data_sced, rely_stor_sced, cplx_time_sced, cplx_stor_sced, time_stor_sced, time_virt_sced, acap_risk, increment_drivers, sced_vexp_pcap, virt_sced_pcap, lexp_aexp_sced, ruse_aexp, ruse_lexp

Anomaly: mode_cplx, mode_rely, size_sced, size_mode, size_cplx, mode_virt, mode_time, mode_aexp, mode_vexp, size_pcap, size_acap, tool_modp, tool_modp_turn, pmex_pdrt_risk_rvol, increment_personnel, modp_pcap, sced, acap_anomaly

Advice: size_turn, rely_data, data_turn, time, stor, pcap_acap, data

Figure 5. Rule taxonomy. (Tree of rules under overall project risk, grouped into schedule, product, personnel, process and computer risk and, within each category, by the cost factors involved; separate trees give the cost factors and rules for input anomalies and for advice.)

Figure 6. Correlation against actual cost. (Scatter plot of actual cost against calculated risk; the horizontal axis, RISK, runs from 0 to 120.)
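The correlation test described above can be reproduced in outline with a plain Pearson coefficient. The sketch below is illustrative only: the data points are invented, not taken from the COCOMO database, and the function name `pearson_r` is an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical projects: calculated risk and realized cost (person-months/KDSI).
calculated_risk = [4.0, 11.0, 23.0, 37.0, 58.0, 96.0]
realized_cost = [2.1, 3.0, 6.5, 5.2, 12.4, 19.8]
print(f"r = {pearson_r(calculated_risk, realized_cost):.2f}")
```

A coefficient near 1 would indicate that projects scored as riskier also tended to cost more per unit of size, which is the relationship the validation looks for.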

Industrial data from Litton and other affiliates of the USC Center for Software Engineering is also being used for evaluation, where the calculated risks are compared to actual cost and schedule variances from estimates. Data is still being collected, and correlation will be performed against the actual cost and schedule variance from past projects. In another test, the risk taxonomy is being used as a basis for post-mortem assessments of completed projects.

Figure 7. Risk by COCOMO project number. (Bar chart of calculated risk, on a vertical axis from 0 to 120, by project number, with projects grouped by type: BUS, CTL, HMI, SCI, SUP, SYS.)

Software engineering practitioners have been evaluating the system and providing feedback and additional project data. Many of the USC Affiliate companies are testing the tool in-house. At Litton, nine evaluators consisting of the SEPG and other software managers have unanimously evaluated the risk output of the tool as reasonable for a given set of test cases, including past projects and sensitivity tests.

Conclusions and Future Work

The existing set of COCOMO cost drivers served well as a core set of abstractions to map onto decision drivers for project risk assessment. The completeness of the attribute set for the cost estimation domain was vital for generating a critical mass of rules from them. Common inputs between the expert system and cost model also ensured unambiguous mapping from project data to the ruleset.

This work is another example of cost drivers providing a powerful mechanism to identify risks. Explication of risky attribute interactions helps illuminate underlying reasons for risk escalation as embodied in cost drivers, thus providing insight into software development risk. Analysis has shown that risk is highly correlated with the total effort multiplier product associated with the cost drivers, and the value of this approach is that it identifies specific risk situations that need attention.

More refined calibrations are still needed for meaningful risk scales. Consistency is also desired with other risk taxonomies and assessment schemes, such as the SEI risk assessment method (Carr et al., 1993) and the Software Technology Risk Advisor method of calculating the disparities between functional needs and capabilities (Toth, 1994).

At Litton, the knowledge base will be extended for specific product lines and environments to assist in consistent estimation and risk assessment. It is being implemented in the risk management process, and is being used as a basis for risk assessment of ongoing projects.

Additional risk data from industrial projects will be collected and reported on. The following additional features are also planned for the tool: incremental development support, additional and refined rules, cost risk analysis using Monte-Carlo simulation to assess the effects of uncertainty in the COCOMO inputs, and automatic sensitivity analysis to vary cost drivers for risk minimization. For Monte-Carlo simulation, users will be able to specify probabilistic distributions for cost factors.

The prototype will continue to be enhanced and rehosted if necessary to attain wider usage. Additional rules will be identified and incorporated to handle more cost driver interactions, cost model constraints, incremental development inheritance rules, rating of consistency violations and advice. Substantial additions are expected for process related factors and advice to control the risks. The domain experts will continue to provide feedback and clarification.
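The planned Monte-Carlo cost risk analysis can be sketched as follows. This is a minimal illustration, not the tool's implementation: it assumes the Intermediate COCOMO effort form (effort = a * KSLOC^b * EAF) with the semidetached-mode constants, and it assumes that the user supplies each uncertain input as a triangular (low, mode, high) distribution. The function names and all input values are hypothetical.

```python
import random


def cocomo_effort(ksloc, eaf, a=3.0, b=1.12):
    """Intermediate COCOMO effort in person-months: a * KSLOC**b * EAF.

    The defaults are assumed to be the semidetached-mode constants;
    substitute a locally calibrated pair where one is available.
    """
    return a * ksloc ** b * eaf


def simulate_effort(size_dist, driver_dists, trials=10_000, seed=1):
    """Return sorted effort samples from repeated random draws of the inputs.

    size_dist and each value of driver_dists are hypothetical
    (low, mode, high) triples describing a triangular distribution.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(trials):
        low, mode, high = size_dist
        ksloc = rng.triangular(low, high, mode)
        eaf = 1.0  # effort adjustment factor: product of sampled multipliers
        for low, mode, high in driver_dists.values():
            eaf *= rng.triangular(low, high, mode)
        samples.append(cocomo_effort(ksloc, eaf))
    samples.sort()
    return samples


def percentile(sorted_samples, p):
    """Approximate percentile for p in [0, 1] by rounding to the nearest index."""
    return sorted_samples[round(p * (len(sorted_samples) - 1))]


# Illustrative inputs only: size in KSLOC and two uncertain cost drivers.
size = (40.0, 50.0, 70.0)
drivers = {"CPLX": (1.00, 1.15, 1.30), "ACAP": (0.86, 1.00, 1.19)}
samples = simulate_effort(size, drivers)
for p in (0.1, 0.5, 0.9):
    print(f"P{int(p * 100)} effort: {percentile(samples, p):.0f} person-months")
```

Reporting a range of percentiles rather than a single point estimate is one way to express the effect of input uncertainty on cost; the choice of triangular distributions here is only a convenience for the sketch.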

The current risk model is well-suited for integration with a dynamic project model as demonstrated in (Madachy, 1994), since cost (drivers), schedule and risk are interrelated factors that interact in a dynamic fashion throughout a project lifecycle. By combining quantitative techniques with expert judgement in a common model, an intelligent simulation capability results to support planning and management functions for software development projects.

This work is also coordinated with other relevant research at the USC Center for Software Engineering. The evolving COCOMO 2.0 model has updated cost and scale drivers relative to the original COCOMO upon which the risk assessment heuristics are based. The risk and anomaly rulebases will be updated to correspond with the new set of cost factors and model definitions.

A working hypothesis for COCOMO 2.0 is that risk assessment should be a feature of the cost model (Boehm et al., 1994). Towards this end, graduate students at USC have incorporated the rule base into the next revision of the public domain USC COCOMO tool. The assessment scheme will also be incorporated into the WinWin spiral model prototype (Boehm et al., 1993) to support negotiation based on COCOMO parameters.

Acknowledgments

The author would like to thank Dr. Barry Boehm for his guidance and inspiration in this work in addition to serving as a domain expert, and Dr. Prasanta Bose for his insightful comments and suggestions on this paper. Thanks also to the Litton Data Systems division SEPG personnel and management for their support.

References

Boehm, B. 1981. Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall.

Boehm, B. 1989. Software Risk Management. Washington, D.C.: IEEE-CS Press.

Boehm, B. 1992. Knowledge-based process assistance for large software projects, white paper in response to Rome Laboratories PRDS #92-08-PKRD, USC.

Boehm, B., Bose, P., Horowitz, E., Scacchi, W., et al. 1993. Next generation process models and their environment support. Proceedings of the USC Center for Software Engineering Convocation, USC.

Boehm, B., Clark, B., Horowitz, E., Westland, C., Madachy, R., and Selby, R. 1995. Cost models for future software life cycle processes: COCOMO 2.0, to appear in Annals of Software Engineering Special Volume on Software Process and Product Measurement, J.D. Arthur and S.M. Henry (Eds.), J.C. Baltzer AG, Science Publishers, Amsterdam, The Netherlands.

Boehm, B. and Royce, W. 1989. Ada COCOMO and the Ada process model. Proceedings, Fifth COCOMO Users' Group Meeting, SEI.


Boehm, B. and Bose, P. 1994. Critical success factors for knowledge-based software engineering applications. Proceedings of the Ninth Knowledge-Based Software Engineering Conference, Monterey, CA: IEEE Computer Society Press.

Carr, M., Konda, S., Monarch, I., Ulrich, F., and Walker, C. 1993. Taxonomy-Based Risk Identification. Technical Report CMU/SEI-93-TR-06, Software Engineering Institute.

Charette, R. 1989. Software Engineering Risk Analysis and Management. Intertext Publications/Multiscience Press and McGraw-Hill, New York, NY.

Conte, S., Dunsmore, H., and Shen, V. 1986. Software Engineering Metrics and Models. Menlo Park, CA: Benjamin/Cummings Publishing Co., Inc.

Day, V. 1987. Expert System Cost Model (ESCOMO) Prototype. Proceedings, Third Annual COCOMO Users' Group Meeting, SEI.

Green, C., Luckham, D., Balzer, R., Cheatham, T., and Rich, C. 1983. Report on a Knowledge-Based Software Assistant. Kestrel Institute, RADC#TR83-195, Rome Air Development Center, NY.

Madachy, R. 1994. A software project dynamics model for process cost, schedule and risk assessment. Ph.D. Dissertation, Department of Industrial and Systems Engineering, USC.

Rook, P. 1993. Cost estimation and risk management tutorial. Proceedings of the Eighth International Forum on COCOMO and Software Cost Modeling, SEI, Pittsburgh, PA.

Royce, W. 1990. TRW's Ada process model for incremental development of large software systems. TRW-TS-90-01, TRW, Redondo Beach, CA.

Sabo, J. 1993. Process model advisor. CSCI 577A class project, University of Southern California.

Toth, G. 1994. Software technology risk advisor. Proceedings of the Ninth Knowledge-Based Software Engineering Conference, Monterey, CA: IEEE Computer Society Press.