cs 568 spring 10 lecture 5 estimation


Upload: lawrence-bernstein

Post on 10-May-2015


DESCRIPTION

Estimation insights

TRANSCRIPT

Page 1: Cs 568 Spring 10  Lecture 5 Estimation

Lecture 5: Estimation

Estimate size, then estimate effort, schedule and cost from size & complexity

CS 568

Page 2: Cs 568 Spring 10  Lecture 5 Estimation

Project Metrics

• Cost and schedule estimation
• Measure progress
• Calibrate models for future estimating

Metric/Scope Manager Product

Number of projects x number of metrics = 15-20

Page 3: Cs 568 Spring 10  Lecture 5 Estimation

Approaches to Cost Estimation

• By experts
• By analogies
• Decomposition
• Parkinson’s Law: work expands to fill time
• Pricing to win: customer willingness to pay
• Lines of Code
• Function Points
• Mathematical Models: Function Points & COCOMO

Page 4: Cs 568 Spring 10  Lecture 5 Estimation

[Figure: staff-months plotted against time. Labels: T_theoretical, 75% × T_theoretical, "Impossible design", "Linear increase".]

Boehm: “A project cannot be done in less than 75% of theoretical time”

$T_{\text{theoretical}} = 2.5 \times \sqrt[3]{\text{staff-months}}$

But, how can I estimate staff months?

Page 5: Cs 568 Spring 10  Lecture 5 Estimation

PERT estimation

Mean schedule date = [earliest date + 4 × (most likely date) + latest date] / 6

Standard deviation = [latest date − earliest date] / 6 (the variance is the square of this value)

These approximations assume a β distribution.

Page 6: Cs 568 Spring 10  Lecture 5 Estimation

Example

If min = 10 months

mode = 13.5 months

max = 20 months, then

Mean = 14 months

Std. Dev = 1.67 months
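
The example above can be checked with a few lines of Python. This is a minimal sketch; the function name `pert_estimate` is made up for illustration and is not from the lecture.

```python
def pert_estimate(earliest, likely, latest):
    """Return (mean, standard deviation) using the PERT three-point formulas."""
    mean = (earliest + 4 * likely + latest) / 6
    std_dev = (latest - earliest) / 6
    return mean, std_dev


# Values from the example slide: min = 10, mode = 13.5, max = 20 months.
mean, std_dev = pert_estimate(10, 13.5, 20)
print(f"mean = {mean:.2f} months, std dev = {std_dev:.2f} months")
```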

Page 7: Cs 568 Spring 10  Lecture 5 Estimation

Probability Distributions

Page 8: Cs 568 Spring 10  Lecture 5 Estimation

See www.brighton-webs.co.uk/distributions/beta.asp

                      Beta     Triangular
Mean                  14.00    14.50
Mode                  13.65    13.5
Standard Deviation    1.67     2.07
Q1 (25%)              12.75    12.96
Q2 (50% - Median)     13.91    14.30
Q3 (75%)              15.17    15.97

The mean, mode and standard deviation in the above table are derived from the minimum, maximum and shape factors which resulted from the use of the PERT approximations.
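
The triangular column can be reproduced from the standard closed-form mean and variance of a triangular distribution. The sketch below assumes the same min, mode and max as the PERT example (10, 13.5, 20 months); `triangular_stats` is an illustrative name.

```python
import math


def triangular_stats(low, mode, high):
    """Return (mean, standard deviation) of a triangular distribution."""
    mean = (low + mode + high) / 3
    variance = (low**2 + mode**2 + high**2
                - low * mode - low * high - mode * high) / 18
    return mean, math.sqrt(variance)


tri_mean, tri_std = triangular_stats(10, 13.5, 20)
print(f"triangular mean = {tri_mean:.2f}, std dev = {tri_std:.2f}")
```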

Page 9: Cs 568 Spring 10  Lecture 5 Estimation

Sizing Software Projects

$\text{Effort} = (\text{productivity})^{-1} \times (\text{size})^{c}$

productivity ≡ KSLOC per staff-month

size ≡ NCKSLOC

c is a function of staff skills

[Figure: staff-months versus size in lines of code or function points; an axis label of 500 is visible]

Page 10: Cs 568 Spring 10  Lecture 5 Estimation

Understanding the equations

Consider a transaction project of 38,000 lines of code. What is the shortest time it will take to develop? Module development is about 400 SLOC per staff-month.

$\text{Effort} = (\text{productivity})^{-1} (\text{size})^{c} = \dfrac{1}{0.400\ \text{KSLOC/SM}} \times (38\ \text{KSLOC})^{1.02} = 2.5 \times 38^{1.02} \approx 100\ \text{SM}$

$\text{Min time} = 0.75\,T = (0.75)(2.5)(\text{SM})^{1/3} \approx 1.875 \times 100^{1/3} \approx 1.875 \times 4.64 \approx 9\ \text{months}$
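
The same calculation as a small Python sketch. The function names are illustrative; the exponent c = 1.02 and the 2.5 and 0.75 constants are the ones used on this slide. Without rounding the effort to 100 SM first, the figures come out slightly different but still round to the slide's answers.

```python
def effort_staff_months(size_ksloc, productivity_ksloc_per_sm, c=1.02):
    """Effort = (productivity)^-1 * (size)^c, with productivity in KSLOC/SM."""
    return size_ksloc ** c / productivity_ksloc_per_sm


def theoretical_time_months(effort_sm):
    """T_theoretical = 2.5 * cube root of staff-months."""
    return 2.5 * effort_sm ** (1 / 3)


def minimum_time_months(effort_sm):
    """Boehm's limit: no less than 75% of the theoretical time."""
    return 0.75 * theoretical_time_months(effort_sm)


effort = effort_staff_months(38, 0.400)   # 38 KSLOC at 400 SLOC per staff-month
min_time = minimum_time_months(effort)
print(f"effort = {effort:.0f} SM, minimum time = {min_time:.1f} months")
```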

Page 11: Cs 568 Spring 10  Lecture 5 Estimation

How many software engineers?

1 full-time staff week = 60 hours, half spent on the project (30 hours).

1 student week = 20 hours.

Therefore, an estimate of 100 staff-months is actually 100 × 30/20 = 150 student-months.

150 student-months / 5 months per semester = 30 student software engineers; therefore simplification is mandatory.
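
A short sketch of the conversion, using the hours quoted on this slide. The constant and function names are made up for illustration.

```python
import math

PROJECT_HOURS_PER_STAFF_WEEK = 30   # half of a 60-hour staff week
HOURS_PER_STUDENT_WEEK = 20


def student_months(staff_months):
    """Convert staff-months to student-months by the ratio of weekly hours."""
    return staff_months * PROJECT_HOURS_PER_STAFF_WEEK / HOURS_PER_STUDENT_WEEK


def students_needed(staff_months, semester_months=5):
    """Head count required to finish within one semester."""
    return math.ceil(student_months(staff_months) / semester_months)


print(student_months(100), students_needed(100))
```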

Page 12: Cs 568 Spring 10  Lecture 5 Estimation

Lines of Code

LOC ≡ Line of Code
KLOC ≡ Thousands of LOC
KSLOC ≡ Thousands of Source LOC
NCSLOC ≡ New or Changed Source LOC

Page 13: Cs 568 Spring 10  Lecture 5 Estimation

Productivity per staff-month:

» 50 NCSLOC for OS code (or real-time system)

» 250-500 NCSLOC for intermediary applications (high risk, on-line)

» 500-1000 NCSLOC for normal applications (low risk, on-line)

» 10,000 – 20,000 NCSLOC for reused code

Reuse note: Sometimes, reusing code that does not provide the exact functionality needed can be achieved by reformatting input/output. This decreases performance but dramatically shortens development time.

Bernstein’s rule of thumb for small components

Page 14: Cs 568 Spring 10  Lecture 5 Estimation

“Productivity” as measured in 2000

Classical rates: 130 to 195 NCSLOC/sm

Evolutionary or Incremental approaches (customized): 244 to 325 NCSLOC/sm

New embedded flight software (customized): 17 to 105 NCSLOC/sm

Reused Code: 1000 to 2000 NCSLOC/sm

Code for reuse: 3 × the effort of customized code

Page 15: Cs 568 Spring 10  Lecture 5 Estimation

QSE Lambda Protocol

• Prospectus
• Measurable Operational Value
• Prototyping or Modeling
• sQFD
• Schedule, Staffing, Quality Estimates
• ICED-T
• Trade-off Analysis

Page 16: Cs 568 Spring 10  Lecture 5 Estimation

Universal Software Engineering Equation

$\text{Reliability}(t) = e^{-k \lambda t}$

when the error rate is constant, where k is a normalizing constant for a software shop and

$\lambda = \dfrac{\text{Complexity}}{\text{effectiveness} \times \text{staffing}}$
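
A minimal sketch of how the equation behaves. It assumes the exponent is k·λ·t with λ = complexity / (effectiveness × staffing). The input values are arbitrary and only show the direction of the effect; they are not calibrated to any real shop.

```python
import math


def reliability(t, complexity, effectiveness, staffing, k=1.0):
    """R(t) = exp(-k * lambda * t), lambda = complexity / (effectiveness * staffing)."""
    lam = complexity / (effectiveness * staffing)
    return math.exp(-k * lam * t)


# Hypothetical inputs: doubling the staffing halves lambda and raises reliability.
r_small_team = reliability(t=10, complexity=2.0, effectiveness=4.0, staffing=5)
r_large_team = reliability(t=10, complexity=2.0, effectiveness=4.0, staffing=10)
print(f"{r_small_team:.3f} {r_large_team:.3f}")
```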

Page 17: Cs 568 Spring 10  Lecture 5 Estimation

Post-Release Reliability Growth in Software Products

Authors: Pankaj Jalote, Brendan Murphy, Vibhu Saujanya Sharma

Guided By: Prof. Lawrence Bernstein

Prepared By: Mautik Shah

Page 18: Cs 568 Spring 10  Lecture 5 Estimation

Introduction

The failure rate of software products decreases with time, even when no software changes are being made.

This violates our intuition: reliability grows without any fault removal.

Modeling this reliability growth in the initial stages after product release is the focus of this paper.

Page 19: Cs 568 Spring 10  Lecture 5 Estimation

Three possible reasons:

Users learn to avoid the faults that cause failures; a failure is never random.

After initially exploring many different features and options, users choose a small set of product features, thereby reducing the number of fault-carrying paths that are actually exercised.

Installing new software onto existing systems often results in versioning and configuration issues which cause failures.

Page 20: Cs 568 Spring 10  Lecture 5 Estimation

Failure rate model

Page 21: Cs 568 Spring 10  Lecture 5 Estimation

Using product support data

Page 22: Cs 568 Spring 10  Lecture 5 Estimation

Using data from Automated Reporting

Page 23: Cs 568 Spring 10  Lecture 5 Estimation

Product stabilization time

Stabilization time indicates the product’s transient defects as well as the user experience.

A smaller value of stabilization time means that the end users will have fewer troubles.

If the steady state failure rate of a product is acceptable, then instead of investing in system testing the vendor may need to focus on improving issues related to installation, configuration, usage, etc. to reduce stabilization time.

A high stabilization time will require a different strategy for improving the user experience than is needed for dealing with a high steady state failure rate of a product.

Page 24: Cs 568 Spring 10  Lecture 5 Estimation

Conclusion

Traditional software reliability models generally assume that software reliability is primarily a function of the fault content and remains unchanged if the software is unchanged. But the failure rate often gets smaller with time, even without any changes being made to the product.

This may be due to users learning to avoid the situations that cause failures, using a limited set of features, or resolving configuration issues.

Stabilization time is the time it takes after installation for the failure rate to reach its steady state value.

For an organization which plans to have its employees use a software product, the stabilization time could indicate the period after which the organization could expect the production usage of the product.

Page 25: Cs 568 Spring 10  Lecture 5 Estimation

Derivation of the reliability equation, valid after the stabilization interval.

Let T be the stabilization time; then g(T) is some constant failure rate, F.

To convert from a rate to a time function we need to integrate the Fourier transform:

$R(t-T) = \int_{0}^{\infty} g(\omega)\, \exp(-\lambda (t-T))$

With $g(\omega)$ a constant F and $\tau = t - T$,

$R(\tau) = F \exp(-\lambda \tau)$ and

$\lambda = \text{complexity} / \text{effective staffing}$

Page 26: Cs 568 Spring 10  Lecture 5 Estimation

Function Point (FP) Analysis

• Useful during requirement phase
• Substantial data supports the methodology
• Software skills and project characteristics are accounted for in the Adjusted Function Points
• FP is technology and project process dependent, so technology changes require recalibration of project models.
• Convert Unadjusted FPs (UFP) to LOC for a specific language (technology) and then use a model such as COCOMO.

(start here)

Page 27: Cs 568 Spring 10  Lecture 5 Estimation

Productivity = f(size)

[Figure: productivity (function points / staff month, axis 0 to 12) versus size in function points (20 to 40960, doubling scale); series: Bell Laboratories data and Capers Jones data]

Page 28: Cs 568 Spring 10  Lecture 5 Estimation

Adjusted Function Points

Accounting for Physical System Characteristics

Characteristic Rated by System User

• 0-5 based on “degree of influence”

• 3 is average

Unadjusted Function Points (UFP) × General System Characteristics (GSC) = Adjusted Function Points (AFP)

$\text{AFP} = \text{UFP} \times (0.65 + 0.01 \times \text{GSC})$, where GSC here is the sum of the 14 ratings listed below (the Total Degree of Influence, TDI) and the factor in parentheses is the Value Adjustment Factor (VAF).

1. Data Communications

2. Distributed Data/Processing

3. Performance Objectives

4. Heavily Used Configuration

5. Transaction Rate

6. On-Line Data Entry

7. End-User Efficiency

8. On-Line Update

9. Complex Processing

10. Reusability

11. Conversion/Installation Ease

12. Operational Ease

13. Multiple Site Use

14. Facilitate Change
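
As a sketch, the adjustment can be written as a function of the UFP count and the 14 ratings. The function name is illustrative. The inputs are hypothetical: 78 UFP with every characteristic rated 3 (average).

```python
def adjusted_function_points(ufp, gsc_ratings):
    """AFP = UFP * (0.65 + 0.01 * sum of the 14 ratings, each 0 to 5)."""
    if len(gsc_ratings) != 14:
        raise ValueError("expected 14 general system characteristic ratings")
    if any(not 0 <= r <= 5 for r in gsc_ratings):
        raise ValueError("each rating must be between 0 and 5")
    total_degree_of_influence = sum(gsc_ratings)
    return ufp * (0.65 + 0.01 * total_degree_of_influence)


# Hypothetical project: 78 UFP, all 14 characteristics rated average (3).
afp = adjusted_function_points(78, [3] * 14)
print(f"AFP = {afp:.2f}")
```

With every rating at 0 the factor is 0.65, and with every rating at 5 it is 1.35, so the adjustment moves the count by at most 35% either way.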

Page 29: Cs 568 Spring 10  Lecture 5 Estimation

Function Point Calculations

Unadjusted Function Points

UFP = 4I + 5O + 4E + 10L + 7F, where

I ≡ Count of input types that are user inputs and change data structures.

O ≡ Count of output types.

E ≡ Count of inquiry types or inputs controlling execution. [think menu selections]

L ≡ Count of logical internal files, internal data used by system. [think index files; they are groups of logically related data entirely within the application's boundary and maintained by external inputs.]

F ≡ Count of interfaces: data output or shared with another application.

Note that the constants in the nominal equation can be calibrated to a specific software product line.

Page 30: Cs 568 Spring 10  Lecture 5 Estimation

Complexity Table

TYPE              SIMPLE   AVERAGE   COMPLEX
INPUT (I)         3        4         6
OUTPUT (O)        4        5         7
INQUIRY (E)       3        4         6
LOG INT (L)       7        10        15
INTERFACES (F)    5        7         10
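
The table translates directly into a lookup. In this sketch the weights are the ones in the complexity table; the counts in the example are made up for illustration.

```python
# Weights from the complexity table, keyed by component type and complexity.
WEIGHTS = {
    "I": {"simple": 3, "average": 4, "complex": 6},    # inputs
    "O": {"simple": 4, "average": 5, "complex": 7},    # outputs
    "E": {"simple": 3, "average": 4, "complex": 6},    # inquiries
    "L": {"simple": 7, "average": 10, "complex": 15},  # logical internal files
    "F": {"simple": 5, "average": 7, "complex": 10},   # interfaces
}


def unadjusted_function_points(counts):
    """counts maps (type, complexity) to the number of items of that kind."""
    return sum(n * WEIGHTS[kind][level] for (kind, level), n in counts.items())


# Hypothetical counts, not taken from the lecture.
example_counts = {
    ("I", "average"): 5,
    ("O", "average"): 4,
    ("E", "average"): 2,
    ("L", "average"): 2,
    ("F", "complex"): 1,
}
ufp = unadjusted_function_points(example_counts)
print(ufp)
```

When every item is rated average, the result equals the nominal equation UFP = 4I + 5O + 4E + 10L + 7F.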

Page 31: Cs 568 Spring 10  Lecture 5 Estimation

Complexity Factors

1. Problem Domain ___
2. Architecture Complexity ___
3. Logic Design - Data ___
4. Logic Design - Code ___

Total ___

Complexity = Total/4 = _________

Page 32: Cs 568 Spring 10  Lecture 5 Estimation

Problem Domain

Measure of Complexity (1 is simple and 5 is complex)

1. All algorithms and calculations are simple.
2. Most algorithms and calculations are simple.
3. Most algorithms and calculations are moderately complex.
4. Some algorithms and calculations are difficult.
5. Many algorithms and calculations are difficult.

Score ____

Page 33: Cs 568 Spring 10  Lecture 5 Estimation

Architecture Complexity

Measure of Complexity (1 is simple and 5 is complex)

1. Code ported from one known environment to another. Application does not change more than 5%.
2. Architecture follows an existing pattern. Process design is straightforward. No complex hardware/software interfaces.
3. Architecture created from scratch. Process design is straightforward. No complex hardware/software interfaces.
4. Architecture created from scratch. Process design is complex. Complex hardware/software interfaces exist but they are well defined and unchanging.
5. Architecture created from scratch. Process design is complex. Complex hardware/software interfaces are ill defined and changing.

Score ____

Page 34: Cs 568 Spring 10  Lecture 5 Estimation

Logic Design -Data

1. Simple well defined and unchanging data structures. Shallow inheritance in class structures. No object classes have inheritance greater than 3.

2. Several data element types with straightforward relationships. No object classes have inheritance greater than

3. Multiple data files, complex data relationships, many libraries, large object library. No more than ten percent of the object classes have inheritance greater than three. The number of object classes is less than 1% of the function points.

4. Complex data elements, parameter passing module-to-module, complex data relationships and many object classes have inheritance greater than three. A large but stable number of object classes.

5. Complex data elements, parameter passing module-to-module, complex data relationships and many object classes have inheritance greater than three. A large and growing number of object classes. No attempt to normalize data between modules.

Score ____

Page 35: Cs 568 Spring 10  Lecture 5 Estimation

Logic Design- Code

1. Nonprocedural code (4GL, generated code, screen skeletons). High cohesion. Programs inspected. Module size constrained between 50 and 500 Source Lines of Code (SLOCs).

2. Program skeletons or patterns used. High cohesion. Programs inspected. Module size constrained between 50 and 500 SLOCs. Reused modules. Commercial object libraries relied on.

3. Well-structured, small modules with low coupling. Object class methods well focused and generalized. Modules with single entry and exit points. Programs reviewed.

4. Complex but known structure, randomly sized modules. Some complex object classes. Error paths unknown. High coupling.

5. Code structure unknown, randomly sized modules, complex object classes and error paths unknown. High coupling.

Score __

Page 36: Cs 568 Spring 10  Lecture 5 Estimation

Computing Function Points

See http://www.engin.umd.umich.edu/CIS/course.des/cis525/js/f00/artan/functionpoints.htm

Page 37: Cs 568 Spring 10  Lecture 5 Estimation

Adjusted Function Points- review

• Now account for 14 characteristics on a 6 point scale (0-5)
• Total Degree of Influence (DI) is the sum of scores.
• DI is converted to a technical complexity factor (TCF): TCF = 0.65 + 0.01 × DI
• Adjusted Function Point is computed by FP = UFP × TCF
• For any language there is a direct mapping from Unadjusted Function Points to LOC

Beware: function point counting is hard and needs special skills.

Page 38: Cs 568 Spring 10  Lecture 5 Estimation

Function Points Qualifiers

• Based on counting data structures
• Focus is on-line data base systems
• Less accurate for WEB applications
• Even less accurate for games, finite state machine and algorithm software
• Not useful for extended machine software and compilers

An alternative to NCKSLOC because estimates can be based on requirements and design data.

Page 39: Cs 568 Spring 10  Lecture 5 Estimation

Function Point pros and cons

Pros:

• Language independent

• Understandable by client

• Simple modeling

• Hard to fudge

• Visible feature creep

Cons:

• Labor intensive

• Extensive training

• Inexperience results in inconsistent results

• Weighted to file manipulation and transactions

• Systematic error introduced by single person, multiple raters advised

Page 40: Cs 568 Spring 10  Lecture 5 Estimation

Initial Conversion

Language        Median SLOC/UFP
C               104
C++             53
HTML            42
JAVA            59
Perl            60
J2EE            50
Visual Basic    42

http://www.qsm.com/FPGearing.html

Page 41: Cs 568 Spring 10  Lecture 5 Estimation
Page 42: Cs 568 Spring 10  Lecture 5 Estimation
Page 43: Cs 568 Spring 10  Lecture 5 Estimation

SLOC

78 UFP × 53 SLOC/UFP (C++) = 4,134 SLOC ≈ 4.1 KSLOC

(Reference for SLOC per function point: http://www.qsm.com/FPGearing.html)
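
The conversion is a table lookup followed by a multiplication. This sketch uses the median SLOC/UFP values from the Initial Conversion slide; the names `SLOC_PER_UFP` and `ufp_to_sloc` are illustrative.

```python
# Median SLOC per unadjusted function point, from the Initial Conversion slide.
SLOC_PER_UFP = {
    "C": 104,
    "C++": 53,
    "HTML": 42,
    "JAVA": 59,
    "Perl": 60,
    "J2EE": 50,
    "Visual Basic": 42,
}


def ufp_to_sloc(ufp, language):
    """Estimate source lines of code from unadjusted function points."""
    return ufp * SLOC_PER_UFP[language]


sloc = ufp_to_sloc(78, "C++")
print(f"{sloc} SLOC = {sloc / 1000:.1f} KSLOC")
```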

Page 44: Cs 568 Spring 10  Lecture 5 Estimation

Expansion Trends

[Figure: expansion factor (log scale, 1 to 1000) versus year (1955 to 2005). The expansion factor is the ratio of source lines of code to machine-level lines of code. Technology changes marked along the curve: Machine Instructions, Macro Assemblers, High Level Languages, Database Managers, On-Line Dev, Prototyping, Subsec Time Sharing, Regression Testing, 4GL, Small Scale Reuse, Object Oriented Programming, Large Scale Reuse. Each date is an estimate of widespread use of a software technology. Data-point labels run from 3 up to several hundred. Trend: an order of magnitude every twenty years.]

Page 45: Cs 568 Spring 10  Lecture 5 Estimation

Heuristics to do Better Estimates

• Decompose Work Breakdown Structure to lowest possible level and type of software.
• Review assumptions with all stakeholders
• Do your homework - past organizational experience
• Retain contact with developers
• Update estimates and track new projections (and warn)
• Use multiple methods
• Reuse makes it easier (and more difficult)
• Use ‘current estimate’ scheme

Page 46: Cs 568 Spring 10  Lecture 5 Estimation

Heuristics to meet aggressive schedules

• Eliminate features
• Simplify features & relax specific feature specifications
• Reduce gold plating
• Delay some desired functionality to version 2
• Deliver functions to integration team incrementally
• Deliver product in periodic releases

Page 47: Cs 568 Spring 10  Lecture 5 Estimation

Specification for Development Plan

• Project Feature List
• Development Process
• Size Estimates
• Staff Estimates
• Schedule Estimates
• Organization
• Gantt Chart

Page 48: Cs 568 Spring 10  Lecture 5 Estimation

COCOMO

• COnstructive COst MOdel
• Based on Boehm’s analysis of a database of 63 projects - models based on regression analysis of these systems
• Linked to classic waterfall model
• Effort is computed from the number of Source Lines of Code (SLOC), expressed in thousands of delivered source instructions - excludes comments and unmodified utility software

Page 49: Cs 568 Spring 10  Lecture 5 Estimation

COCOMO Formula

$\text{Effort in staff months} = a \times \text{KDLOC}^{b}$

                 a      b
organic          2.4    1.05
semi-detached    3.0    1.12
embedded         3.6    1.20
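
A minimal sketch of the formula, with the (a, b) pairs from the table above. The 38 KDLOC input is only an illustration, and the function name is made up.

```python
# (a, b) coefficients per project mode, from the COCOMO Formula table.
COCOMO_MODES = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def cocomo_effort(kdloc, mode="organic"):
    """Effort in staff-months = a * KDLOC ** b."""
    a, b = COCOMO_MODES[mode]
    return a * kdloc ** b


for mode in COCOMO_MODES:
    print(f"{mode}: {cocomo_effort(38, mode):.0f} staff-months")
```

For the same size, effort rises from organic to embedded because both the multiplier and the exponent grow.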

Page 50: Cs 568 Spring 10  Lecture 5 Estimation

A Retrospective on the Regression Models

They came to similar conclusions:

• Time:
» Walston-Felix: $T = 2.5\,E^{0.35}$
» COCOMO (organic): $T = 2.5\,E^{0.38}$
» Putnam: $T = 2.4\,E^{0.33}$

• Effort:
» Halstead: $E = 0.7\,\text{KLOC}^{1.50}$
» Boehm: $E = 2.4\,\text{KLOC}^{1.05}$
» Walston-Felix: $E = 5.2\,\text{KLOC}^{0.91}$
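
To see how close the three schedule models are, they can be evaluated for the same effort. This sketch uses the coefficients from this slide; Walston-Felix is sometimes misspelled Watson-Felix. The 100 staff-month input is an arbitrary illustration.

```python
# (multiplier, exponent) for T = multiplier * E ** exponent, from this slide.
SCHEDULE_MODELS = {
    "Walston-Felix": (2.5, 0.35),
    "COCOMO (organic)": (2.5, 0.38),
    "Putnam": (2.4, 0.33),
}


def schedule_months(effort_sm, model):
    """Development time in months for a given effort in staff-months."""
    multiplier, exponent = SCHEDULE_MODELS[model]
    return multiplier * effort_sm ** exponent


schedules = {name: schedule_months(100, name) for name in SCHEDULE_MODELS}
for name, months in schedules.items():
    print(f"{name}: {months:.1f} months")
```

All three exponents sit near 1/3, which is why they agree with the cube-root rule used earlier in the lecture.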

Page 51: Cs 568 Spring 10  Lecture 5 Estimation

Initial Conversion

Language        Median SLOC/function point
C               104
C++             53
HTML            42
JAVA            59
Perl            60
J2EE            50
Visual Basic    42

http://www.qsm.com/FPGearing.html

Page 52: Cs 568 Spring 10  Lecture 5 Estimation

Delphi Method

A group of experts can give a better estimate.

The Delphi Method:

• Coordinator provides each expert with spec

• Experts discuss estimates in initial group meeting

• Each expert gives estimate in interval format: most likely value and an upper and lower bound

• Coordinator prepares summary report indicating group and individual estimates

• Group iterates until consensus

Page 53: Cs 568 Spring 10  Lecture 5 Estimation

Function Point Method

Five key components are identified based on logical user view:

• External Inputs
• External Outputs
• External Inquiries
• Internal Logical Files
• External Interface Files

[Figure: an application boundary with External Input, External Inquiry, External Output, Internal Logical Files and External Interface File]

Page 54: Cs 568 Spring 10  Lecture 5 Estimation

Downside

• Function Point terms are confusing
• Too long to learn, need an expert
• Need too much detailed data
• Does not reflect the complexity of the application
• Does not fit with new technologies
• Takes too much time
• “We tried it once”

Page 55: Cs 568 Spring 10  Lecture 5 Estimation

Complexity

For each component compute a Function Point value based on its make-up and complexity of its data

[Figure: complexity matrix rating each component Low, Average or High from its Record Element Types (data relationships) against its Data Elements (# of unique data fields) or File Types Referenced]

Components:                      Low       Avg.      High      Total
Internal Logical File (ILF)      __ x 7    __ x 10   __ x 15   ___
External Interface File (EIF)    __ x 5    __ x 7    __ x 10   ___
External Input (EI)              __ x 3    __ x 4    __ x 6    ___
External Output (EO)             __ x 4    __ x 5    __ x 7    ___
External Inquiry (EQ)            __ x 3    __ x 4    __ x 6    ___
Total Unadjusted FPs                                           ___

Page 56: Cs 568 Spring 10  Lecture 5 Estimation

When to Count

[Figure: project life cycle with phases labeled PROSPECTUS, REQUIREMENTS, ARCHITECTURE, IMPLEMENTATION, TESTING, DELIVERY and CORRECTIVE MAINTENANCE; SIZING is repeated along the life cycle and at each Change Request]

Page 57: Cs 568 Spring 10  Lecture 5 Estimation

Estimates vary: f{risk factors}

• Technology (tools, languages, reuse, platforms)
• Processes including tasks performed, reviews, testing, object oriented
• Customer/User and Developer skills
• Environment including locations & office space
• System type such as information systems; control systems, telecom, real-time, client server, scientific, knowledge-based, web
• Industry such as automotive, banking, financial, insurance, retail, telecommunications, DoD

Page 58: Cs 568 Spring 10  Lecture 5 Estimation

Using the equations

For a 59 function point project to be written in C++, we need to write 59 x 53 = 3127 SLOC

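$\text{Effort} = (\text{productivity})^{-1} (\text{size})^{c}$

$= \dfrac{1}{0.9 \times 0.53\ \text{KSLOC/SM}} \times (3.127\ \text{KSLOC})^{1.02}$

$= 2.1 \times 3.127^{1.02} = 2.1 \times 3.127^{1} \times 3.127^{0.02}$

$\approx 7\ \text{SM}$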
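
The whole chain, from function points to staff-months, fits in one short function. This is a sketch under one assumption: productivity is taken as 0.477 KSLOC per staff-month, the value whose reciprocal gives the 2.1 multiplier used above. The function name is illustrative.

```python
def effort_from_function_points(fp, sloc_per_fp, productivity_ksloc_per_sm, c=1.02):
    """Convert function points to KSLOC, then apply Effort = size**c / productivity."""
    size_ksloc = fp * sloc_per_fp / 1000
    return size_ksloc ** c / productivity_ksloc_per_sm


# 59 function points in C++ (53 SLOC per function point);
# assumed productivity of 0.477 KSLOC per staff-month.
fp_effort = effort_from_function_points(59, 53, 0.477)
print(f"effort = {fp_effort:.1f} staff-months")
```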

Page 59: Cs 568 Spring 10  Lecture 5 Estimation

Baseline current performance levels

[Figure: a measured baseline feeding software process improvement. Labels: PERFORMANCE, PRODUCTIVITY, CAPABILITIES, TIME TO MARKET, EFFORT, DEFECTS, MANAGEMENT, SKILL LEVELS, PROCESS, TECHNOLOGY, IMPROVEMENT INITIATIVES / BEST PRACTICES, RISKS, MEASURED BASELINE.]

[Chart: values from 0.00 to 35.00 plotted against sizes from 0 to 6400 (doubling scale), with series labeled Sub Performance, Industry Averages, Best Practices and Organization Baseline.]

Page 60: Cs 568 Spring 10  Lecture 5 Estimation

Modeling Estimation

[Figure: estimation workflow. Labels: REQUIREMENT, SIZE REQUIREMENT, ESTABLISH PROFILE, SELECT MATCHING PROFILE, GENERATE ESTIMATE, WHAT IF ANALYSIS, ACTUALS, Plan vs. Actual Report, Metrics Database (Profile, Size, Time); roles: Analyst, Counter, Project Manager, Software PM / User.]

The estimate is based on the best available information. A poor requirements document will result in a poor estimate.

Accurate estimating is a function of using historical data with an effective estimating process.

Page 61: Cs 568 Spring 10  Lecture 5 Estimation

Establish a baseline

Performance: Productivity

• A representative selection of projects is measured
• Size is expressed in terms of functionality delivered to the user
• Rate of delivery is a measure of productivity

[Figure: rate of delivery (function points per staff month, 0 to 36) versus software size (0 to 2200); the label 9 appears beside "Organizational Baseline"]

Page 62: Cs 568 Spring 10  Lecture 5 Estimation

Monitoring improvements

Track Progress

[Figure: rate of delivery (function points per person month, 0 to 36) versus software size (0 to 2200); labeled "Second year"]

Page 63: Cs 568 Spring 10  Lecture 5 Estimation

Brooks Calling the Shot

• Do not estimate the whole task by estimating coding and multiplying by 6 or 9!
• Effort increases as a power of size
• Unrealistic assumptions about developer’s time - studies show at most 50% of the time is allotted to development
• Productivity is also related to complexity of the task: the more complex, the fewer lines/year - high level languages & reuse are critical