The Journal of Systems and Software 72 (2004) 225–234
A flexible method for maintaining software metrics data: a universal metrics repository
Warren Harrison
Department of Computer Science, Portland State University, Portland, OR 97207-0751, USA
Received 11 June 2001; received in revised form 15 August 2001; accepted 6 June 2002
Abstract
A neglected aspect of software measurement programs is what will be done with the metrics once they are collected. Often, databases of metrics information tend to be developed as an afterthought, with little, if any, concession to future data needs or to long-term, sustained metrics collection efforts. A metrics repository should facilitate an on-going metrics collection effort, as well as serve as the "corporate memory" of past projects, their histories and experiences. In order to address these issues, we describe a transformational view of software development that treats the software development process as a series of artifact transformations. Each transformation has inputs and produces outputs. The use of this approach supports a very flexible software engineering metrics repository.
© 2003 Elsevier Inc. All rights reserved.
Keywords: Software metrics; Software engineering repositories; Software engineering data collection
1. Introduction
When metrics are collected pertaining to a software product or process, the measurements are usually stored
for later retrieval. Such a collection of metrics data is
known as a metrics repository.
The importance of repositories has been recognized in
the past. A recommendation from the Workshop on
Executive Software Issues held in 1988 (Martin et al.,
1989) stated that "Software organizations should promptly implement programs to: Define, collect, store in databases, analyze, and use process data". Twenty
years ago Basili (1980) wrote ‘‘All the data collected on
the project should be stored in a computerized data
base. Data analysis routines can be written to collect
derived data from the raw data in the data base. The
data collection process is clearly iterative. The more we
learn, the more we know about what other data we need
and how better to collect it’’. Even then a data collection
process that would change over time was envisioned. As
users learn more about the products, processes and
metrics, we should expect the information they desire to change.
The significance of a persistent collection of software
engineering data becomes more obvious when we note
that it represents the only objective source of an organizational memory. When did project X finish?
How many programmers worked on it? How reliable
was it?
Unfortunately, a neglected aspect of software measurement programs is what will be done with the metrics
once they are collected. As a consequence, databases of
metrics information––we hesitate to call some of these
efforts metric repositories––tend to be developed as an
afterthought, with little, if any, concession to future data needs or to long-term, sustained metrics collection efforts. For instance, in what was described as a study in "the process and methods used, experience gained, and some lessons learned in establishing a software measurement program", the associated metrics database was
simply a collection of ten separate spreadsheet files
(Rozum, 1993). We feel that this state of affairs has a
significant effect on the viability of measurement pro-
grams and software engineering research in general.
This paper describes a framework for representing the software development process that is generalizable to many different lifecycle models, together with a corresponding repository of metrics data that addresses these issues.
2. Past efforts at metrics repositories
Over the past decades, there has been some interest within the software engineering community in making data available. In this section, we consider some illus-
trative examples. While we do not claim these examples
represent every metrics dataset, we feel that they are
notable in that they were ostensibly designed to foster
distribution and sharing of data among a variety of
unrelated organizations.
2.1. The DACS productivity dataset (DPDS)
One of the earliest examples of a publicly available
‘‘metrics repository’’ is one that was (and continues to
be) disseminated by the Data and Analysis Center for Software (DACS) at Rome Air Force Base. The repository is known as the DACS Productivity Dataset (http://
www.thedacs.com/databases/sled/prod.shtml) and con-
sists of summary data on over 500 software projects
dating from the early 1960s through the early 1980s. The
DACS Productivity Dataset (DPDS) was intended to
support research in cost modeling and estimation, and
as a consequence includes most of the parameters that were in vogue within the estimation community at the time the repository was created. The parameters main-
tained by the DACS Productivity Dataset include:
• Project size. Number of delivered source lines of code (DSLOC) in the delivered project.
• Project effort. Effort in man-months required to produce the software product.
• Project duration. Duration of the project in total months, derived from the begin and end dates of the project, less any "dead time" in the project.
• Source language. Programming languages used on the project, recorded by name and expressed as a percentage of the total DSLOC written in each different language.
• Errors. The number of formally recorded Software Problem Reports (SPRs) for which a fix has been generated during the period covered by the project.
• Documentation. Delivered pages of documentation including program listings, flow charts (low and high level), operating procedures, maintenance procedures, etc.
• Implementation technique. The implementation techniques used on the software project, expressed as a percentage of the DSLOC built using specific techniques of Structured Coding, Top Down Design and Programming, Chief Programmer Teams, Code Reviews or Inspections, and Librarian or Program Support Library.
• Productivity. Ratio consisting of DSLOC to total man-months.
• Average number of personnel. Ratio consisting of total man-months to total months.
• Error rate. Ratio consisting of errors to DSLOC.
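As a purely illustrative sketch (the table and column names are ours, not the actual DACS field names), this one-record-per-project layout might be declared as:

-- Hypothetical rendering of the DPDS layout: one fixed-width record per
-- project, with every measure hard-wired into the schema.
CREATE TABLE productivity_dataset (
  project_id        INT PRIMARY KEY,
  dsloc             INT,             -- delivered source lines of code
  effort_man_months DECIMAL(8,2),
  duration_months   DECIMAL(6,2),
  source_languages  VARCHAR(80),     -- e.g. 'FORTRAN 80%, ASM 20%'
  errors            INT,             -- SPRs for which a fix was generated
  doc_pages         INT,
  productivity      DECIMAL(10,2),   -- dsloc / effort_man_months
  avg_personnel     DECIMAL(6,2),    -- effort_man_months / duration_months
  error_rate        DECIMAL(10,6)    -- errors / dsloc
);

Adding a new measure, or retiring an old one, means altering this table or starting a new one; that is the rigidity discussed below.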
The repository consists of fixed columns containing each of these pieces of information on each project for which information was captured. This rigid, inflexible format has serious ramifications for later software engineering data collection.
First, by mandating specific measures, it codifies
metrics that may or may not be the ones we really want
to collect. Boehm et al. (1976) suggest that ‘‘. . . the
software field is still evolving too rapidly to establish
metrics in some areas. In fact, doing so would tend to
reinforce current practice, which may not be good’’.
This is just as true today as it was in 1976. Secondly,
some metrics may be perfectly fine for past projects, but obsolete for more recent ones. Development techniques such as Chief Programmer Teams and Structured
Coding are unknown to many contemporary software
engineers. Likewise, for some very popular languages
such as Visual BASIC, the concept of ‘‘Delivered Source
Instructions’’ may be inapplicable.
For purposes of illustration, assume an organization collects Logical Source Statements a la the IEEE Standard for Software Productivity Metrics (IEEE, 1992)
rather than delivered source lines of code as is done in
the Productivity Data Set. The size parameters between
the earlier and later data are no longer comparable.
Does this mean that we create yet another repository to
maintain the post-DSLOC data? If so, it makes it diffi-
cult to easily combine data that does continue to be
comparable from the two repositories––for instance, average number of personnel or average defects for a
project. This difficulty is compounded when one oper-
ates within a culture that promotes a "metric of the week" mentality. Existing metric repositories can fall by the wayside, erasing all previous objective organizational knowledge. Conversely, an organization may resist modifying its metrics collection efforts, even when it should, because doing so would result in the loss of a significant investment of time, effort and organizational
knowledge.
2.2. The software reliability dataset
The DACS also maintains a dataset of failure data
from 16 projects compiled by John Musa in the 1970s
(http://www.thedacs.com/databases/sled/swrel.shtml). The dataset consists of failure interval data to assist
software managers in monitoring test status, predicting
schedules and assisting software researchers in validat-
ing software reliability models. It represents projects
from a variety of applications including real time com-
mand and control, word processing, commercial, and
military applications.
• Failure number. A sequential number identifying a
particular failure.
• Failure interval. The time elapsed from the previous
failure to the current failure.
• Day of failure. The day on which the failure occurred
in terms of the number of working days from the start
of the current phase or data collection period.
As with the Productivity Dataset, the focus of the Software Reliability dataset on a specific application results in very specific (and era-specific) data being maintained. As a result, the applicability of the data is limited.
The dataset in its current form is not compatible with
most defect datasets, even though the information itself
on these 16 projects may be of interest, and comparable
to other datasets.
2.3. The architecture research facility (ARF) dataset
This dataset was produced by Weiss (1979) in the late
1970s. It contains information on a 192 man-week soft-
ware project developed at the Naval Research Labora-
tory and the 142 errors that were discovered in the code.
As are the other datasets we discuss, the ARF dataset is available from the DACS (http://www.thedacs.com/
about/services/pdf/Data-Brochure.pdf). This represents
an improvement in terms of schema sophistication over
the Productivity and Reliability Datasets in that the data
is separated into two tables, one containing component
information and one containing information on software
problem reports. Individual problem reports are asso-
ciated with one or more components through the use of one or more instances of the foreign key "Code of Changed Component".
• Software components
  ◦ Component Code
  ◦ Total Statements in Segment
  ◦ Number of Comments in Segment
  ◦ Number of Pre-Process Statements in Segment
  ◦ Subjective Complexity (Easy, Moderate, Hard)
  ◦ Function (Computational, Control, Data Accessing, Error Handling, Initialization)
• Software problem reports
  ◦ Form Number
  ◦ Project Code
  ◦ Programmer Code
  ◦ Form Date
  ◦ Number of Components Changed
  ◦ Number of Components Examined
  ◦ More than One Component Affected
  ◦ Date Change was Determined
  ◦ Date Change was Started
  ◦ Effort for Change (Less than 1 h, 1 h to 1 day, 1 day to 3 days, more than 3 days, unknown)
  ◦ Type of Change (Error Correction, Planned Enhancement, Implement Requirements Change, Improve Clarity, Improve User Service, Develop Utility, Optimization, Adapt to Environment, Other)
  ◦ Code of Changed Component
  ◦ Type of Error (Requirements Incorrect, Functional Specs Incorrect, Design Error, Several, Misunderstand External Environment, Error in Language Use, Clerical Error, Other)
  ◦ Phase When Error Entered System (Requirements, Functional Specs, Design, Code and Test, System Test, Unknown)
  ◦ Data Structure Error (Y/N)
  ◦ Control Logic Error Flag (Y/N)
  ◦ Error Isolation Activities (Pre-acceptance Test, Acceptance Test, Post-acceptance Test, Inspection of Output, Code Reading by Programmer, Code Reading by Another, Talk with Other Programmers, Special Debug Code, System Error Message, Project Specific Error Message, Reading Documentation, Trace, Dump, Cross Reference, Proof Technique, Other)
  ◦ Time to Isolate Error (Less than 1 h, 1 h to 1 Day, More than 1 Day, Never Found)
  ◦ Work Around Used (Y/N)
  ◦ Related to Previous Change (Y/N)
  ◦ Previous Form Number
  ◦ Previous Form Date
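As a purely illustrative sketch (the table and column names are ours and abbreviate the fields above; the actual ARF layout differs), the two tables and their foreign-key link might be rendered as:

-- Hypothetical two-table rendering of the ARF data.
CREATE TABLE arf_component (
  component_code        VARCHAR(16) PRIMARY KEY,
  project_id            INT,          -- hypothetical link to a project-level record
  total_statements      INT,
  comment_count         INT,
  preprocess_statements INT,
  subjective_complexity VARCHAR(8),   -- Easy, Moderate, Hard
  function_class        VARCHAR(16)   -- Computational, Control, ...
);

CREATE TABLE arf_problem_report (
  form_number       INT,
  changed_component VARCHAR(16),      -- the 'Code of Changed Component' foreign key
  type_of_change    VARCHAR(40),
  phase_entered     VARCHAR(20),
  PRIMARY KEY (form_number, changed_component),
  FOREIGN KEY (changed_component) REFERENCES arf_component(component_code)
);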
Because of the increased schema sophistication, we
can retrieve a much richer set of information than with
the earlier "flat file" datasets. However, the dataset still suffers from a lack of generality. Clearly Weiss was
interested in collecting information about errors. As
such, much potentially useful information about devel-
opment is missing, such as length of time to develop,
hours of design, etc. This raises serious doubts as to the
ease of combining the information with other datasets.
However, we would claim that under some circum-
stances, this is exactly what an analyst may wish to do.
For the sake of argument, assume that the definition
of a defect in the Productivity Dataset is comparable to a
‘‘problem’’ in the ARF dataset and the definition of
DSLOC in the Productivity Dataset is comparable to the
definition of a statement in the ARF dataset. Given this
assumption, we may find that it would be desirable to roll
the component data from the ARF dataset up to the level
of a "Project" as represented by the Productivity Dataset, so we could fully utilize all our data to compute
statistics about defects per DSLOC for projects.
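Under those assumptions (and further assuming the hypothetical project_id link and productivity_dataset table sketched earlier; neither name appears in the actual datasets), the roll-up is a join plus an aggregation:

-- Sketch: problem reports per thousand DSLOC, rolled up from components to projects.
SELECT p.project_id,
       COUNT(DISTINCT r.form_number) / (p.dsloc / 1000.0) as problems_per_kdsloc
FROM productivity_dataset as p, arf_component as c, arf_problem_report as r
WHERE c.project_id = p.project_id
  AND r.changed_component = c.component_code
GROUP BY p.project_id, p.dsloc;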
Consolidating this information would require the
integration of at least two radically different databases.
If the ARF dataset actually contained data on multiple
projects, we would find this to be a non-trivial task. In
fact, we might very well limit our analysis to only one
dataset or the other, thereby ignoring a significant information asset, and making decisions with less than
all the information available to us. In order to be able to
do this easily, not only must the definitions be com-
patible, but also the schemas.
2.4. The NASA/SEL Dataset
The DACS also maintains a more general dataset, the NASA/SEL Dataset (http://www.dacs.com/databases/
sled/sel.shtml), which contains information collected
since 1976 at the NASA Goddard Software Engineering
Laboratory (also see http://sel.gsfc.nasa.gov/). One of
the major functions of the SEL is the collection of de-
tailed software engineering data, describing all facets of
the development process, and the archival of this data
for future use. The SEL maintains a tremendous amount of such information (the dataset is distributed on a CD-ROM containing 424 MB of data and consisting of over 100 tables).
This dataset is a good example of a more flexible
repository. Rather than being developed to fulfill a
specific function such as reliability estimation or cost
estimation, the SEL Dataset is meant to provide a much
more flexible source of data for use by researchers. While the size of the schema (the schema requires a
number of pages to fully describe the tables and their
relationships) prohibits a complete listing of factors
here, the dataset contains the following information:
• General project summary (completed at each mile-
stone)
• Component summary (completed during the design phase)
• Programmer–Analyst Survey (completed once by
each programmer)
• Resource summary (completed weekly)
• Component status report (completed weekly)
• Computer program run analysis (completed each
time the program is run)
• Change report (completed for each change made)
Unlike the other datasets, the NASA/SEL dataset
enjoys on-going maintenance, last being updated (at
DACS) in 1997.
The NASA/SEL schema is much more flexible than
the earlier attempts. By virtue of the organization of
the data as a collection of tables whose members are
connected via foreign keys, the dataset can be augmented by future data that may contain additional
characteristics.
However, the NASA/SEL repository still suffers from
serious limitations. This is because it does not appear to
be built around a coherent model of the software pro-
duction process. Rather, items appear to be inserted into
the database in an opportunistic manner, as data is ac-
quired. This makes repository augmentation a more difficult proposition.
3. General limitations of contemporary metrics repositories
If the only purpose of a metrics repository is as a
place to maintain a "data dump" of a single project or set of completed projects, then the limitations of most
repositories are minor. However, if the repository is
expected to serve as a home for ongoing collection ef-
forts, then most contemporary metrics repositories are
seriously deficient. In our discussions we assume not
only that the metric repository is intended to support an
ongoing metrics collection effort, but we additionally
consider the metrics repository as the "corporate memory" of past projects, their histories and experiences.
Within this context, four important limitations of con-
temporary metrics repositories are:
1. Obsolescence. In most examples of contemporary
metrics repositories we have surveyed, specific metrics
are built into the schema. For instance, consider the
measure of "Delivered Source Lines" or "Cyclomatic Complexity". What happens when a specific
metric goes out of vogue? Do we continue to collect
the obsolete metrics in order to remain compatible
with the schema? If new metrics become popular,
do we avoid collecting them because ‘‘there’s no place
to put them?’’ Or do we simply discard what we have
and start over?
2. Ambiguity. It may often be the case that users of a particular repository do not know what a specific
field within the repository means unless they were
involved in the original collection efforts. Florac
(1992) points out the importance of Communication
(will others know precisely what has been measured
and what has been included and excluded?) and
Repeatability (would someone else be able to repeat
the measurement and get the same results?). Layout documentation of most repositories is limited (at
best) to often out-of-date documentation files. When
a user sees the word ‘‘Design’’ in an error report,
does that mean the error was injected, found or
fixed during the design phase? Similar issues exist
for almost every other property of a project we
may wish to retain––for instance Goethert et al.
(1992) propose a definition for counting staff hours, and Park (1992) suggests a method for counting source
statements.
3. Augmentation. It is expected that as we gain experi-
ence with the use of metrics, we’ll wish to augment
the information set they provide us. A manager
who becomes used to obtaining defect information
might then desire information on design events or
maintenance effort. If the repository does not embrace such augmentation of data, we'll find groups
of additional repositories springing up to meet the
information needs of various stakeholders. The dan-
ger in this is that multiple repositories are terribly dif-
ficult to keep consistent and multiple repositories
represent significant inefficiencies in effort to both
capture and utilize the information. Additionally,
the difficulty of mining information from diverse repositories makes it less likely that the entire store
of organizational knowledge will be accessed.
4. Focus. Most repositories are historical. That is, they
reflect occurrences and properties from past projects.
In addition, because they focus on what has hap-
pened as opposed to what is currently happening,
they tend to be product-centric. As such, they are
quite good at reflecting information about products, but not so good at reflecting information about pro-
cesses or events. While product information is impor-
tant, much decision-making deals with process and
event information more than product information.
The current state of repository technology leads to an
inflexible, product-centric view of software engineering
decision-making. In the following section, we describe amethod of viewing the software development process
that integrates product and process data in a natural,
flexible manner.
The goal of this work is to define, implement, eval-
uate and populate a ‘‘universal’’ metrics repository that
will address these issues.
Fig. 1. The software design activity as a transformation.
Fig. 2. A transformation that generates multiple artifacts.
4. A model of development to support data collection
In order to address the issue, the repository design
must view a software project as a dynamic, growing
collection of artifacts as opposed to a static, monolithic
bucket of code. Unfortunately, the current view of
metrics repositories is exactly that. The general view
considers the software development process as simply a collection of intermediate products. For instance, we
might characterize a software project as consisting of a
requirements specification, a design, and finally a col-
lection of code modules.
4.1. A transformational view of software development
The transformational view of software development
considers the software development process as a series of
artifact transformations. An artifact is some identifiable
product of an activity that takes place during the soft-
ware development process. For example, the needs
analysis phase of a software development project pro-
duces an artifact we sometimes refer to as a Require-
ments Specification. In turn, these artifacts are used as
inputs to other transformations to create new artifacts. That is, an activity transforms one or more artifacts into
a new kind of artifact or collection of artifacts. Each
transformation has inputs (artifacts) and produces out-
puts (artifacts). For example, the process we usually
refer to as ‘‘design’’ transforms a requirements specifi-
cation into a design document as illustrated in Fig. 1.
The transformations can be described at arbitrary
levels of abstraction. For instance, the requirements could actually be logically (and perhaps physically)
separated into two pieces––say the user interaction and
internal processing segments. This implies that the
‘‘Design’’ transformation actually creates two artifacts:
the user interaction and internal processing artifacts.
This situation can be illustrated by Fig. 2.
Of course, the paradigm supports any level of
abstraction desired. For instance, the requirements artifacts could be paragraphs, pages, chapters, docu-
ments, or the entire set of requirements depending upon
the needs of the user. Fig. 3 illustrates an even more
abstract representation where the entire software
development process that transforms a requirements
specification into a delivered product is considered a
single transformation.
Fig. 3. A highly abstract transformation: requirements to delivered product.
Fig. 4. Representing code repairs for artifacts.
4.2. The transformational approach to iterative software
development
A major weakness of the traditional ‘‘intermediate
product’’ view of repositories is its inability to accom-
modate iterative development. When a product is re-
turned due to a change request, it may be awkward to
represent the change in a new version of the product. For instance, should an error be found during testing
which results in a code change, either the revised code
unit will not be represented in the repository, or the
changed code unit will replace the original design
product. On the other hand, the transformational view
accommodates this situation quite naturally. Fig. 4
illustrates the original code artifact and specific change
request being transformed into an "updated code artifact".
5. Schema issues
The transformational view of software development
gives rise to the organizational aspect of our proposed
repository, which consists of Artifact entities connected via transforms relationships, as shown in Fig. 5.
Fig. 5. The Transforms relation.
This provides the flexibility to represent an arbitrary
flow of artifacts through a software development pro-
cess, regardless of process model or level of granularity.
In order to maintain appropriate information about the transformation, each must be annotated with
descriptive items. A transformation is associated with an
event that denotes the sort of activity or event that
took place to effect the transformation, a count of re-
sources consumed during the transformation, and a
completion date. This gives rise to the second aspect of
our proposed repository, shown in Fig. 6, which asso-
ciates each transformation with an Event and a collection of resources consumed.
Each artifact also is associated with certain specific
information. However, rather than enforcing a standard
selection of characteristics (which is what most previous
repositories have chosen to do) the third aspect of the
proposed repository is that artifacts are linked via
relationships to as many or as few characteristic entities
as data permits.
For instance, an artifact such as a code module may
be data rich or poor. It may be linked to dozens of
characteristics, or it may be linked to a single one. Each
characteristic includes a name, type and descrip-
tion and the possesses relationship is annotated to
reflect the quantity of the characteristic. This provides
maximum flexibility by avoiding mandating specific
metrics for a given artifact.
Thus, an artifact characteristic has a property
describing the ‘‘domain’’ of interest, such as complexity,
size, application area, etc. In addition, each character-
istic has a ‘‘working name’’, such as Cyclomatic Com-
plexity, Software Science Effort, Lines of Code, etc.
Because the characteristics that can be associated with
an artifact may include measures made over many dif-
ferent time periods, by many different people for many different reasons, the schema includes the descrip-
tions property. This meta-data allows each data item
to be defined in order to avoid inappropriate combining
of data (or discover opportunities to appropriately
combine data which have different names) simply be-
cause the characteristics have similar names (e.g., ‘‘Lines
of Code’’ could mean a variety of different measures
which should not be combined).
Fig. 6. The Universal Metrics Repository Schema.
6. A prototype implementation
We have implemented a proof-of-concept prototype
using MySQL. The project homepage can be found at
http://www.cs.pdx.edu/~reposit/. This site describes the
schema and provides limited access to the prototype
implementation. The Prototype Repository consists of
the following tables or relations (currently we assume the repository consists of metrics for a single project, thus we omit a Project entity; future versions of the repository will support multiple projects):
Entities:
• Artifact (id, name)
• Event (id, name)
• Resource (id, name)
• Characteristic (id, name, type, description)
Relationships (all are n-to-n, and the entire tuple
comprises the key):
• transforms (id, Artifact.id, Artifact.id, Event.id, EndDate)
• possesses ({Artifact, Resource, Event}.id, Characteristic.id, qty)
• consumes (transforms.id, Resource.id, quantity)
Each entity instance is assigned a unique unsigned
integer identifier (id). This is done by assigning consec-
utive, unsigned integers to each entity instance as it is
added to the database. In addition, each Transforms
relationship is also assigned a unique id from this pool of unsigned integers.
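One possible MySQL rendering of these tables follows (a sketch only: the column names are taken from the queries in Section 8, and the prototype's actual declarations may differ):

-- Entities: each instance gets a unique unsigned integer id.
CREATE TABLE Artifact       (id INT UNSIGNED PRIMARY KEY, name VARCHAR(80));
CREATE TABLE Event          (id INT UNSIGNED PRIMARY KEY, name VARCHAR(80));
CREATE TABLE Resource       (id INT UNSIGNED PRIMARY KEY, name VARCHAR(80));
CREATE TABLE Characteristic (id INT UNSIGNED PRIMARY KEY, name VARCHAR(80),
                             type VARCHAR(80), description TEXT);

-- Relationships: n-to-n, the entire tuple comprises the key.
CREATE TABLE Transforms (
  tID         INT UNSIGNED,    -- transformation id, drawn from the same id pool
  artifactID1 INT UNSIGNED,    -- input artifact
  artifactID2 INT UNSIGNED,    -- output artifact
  eventID     INT UNSIGNED,
  EndDate     DATE,
  PRIMARY KEY (tID, artifactID1, artifactID2, eventID, EndDate)
);
CREATE TABLE Possesses (
  entityID         INT UNSIGNED,   -- an Artifact, Resource, or Event id
  characteristicID INT UNSIGNED,
  qty              DECIMAL(12,2),
  PRIMARY KEY (entityID, characteristicID, qty)
);
CREATE TABLE Consumes (
  tid      INT UNSIGNED,           -- Transforms.tID
  rid      INT UNSIGNED,           -- Resource.id
  quantity DECIMAL(12,2),
  PRIMARY KEY (tid, rid, quantity)
);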
The intention is to capture the transformation of one
or more artifacts (say a design specification) to another
artifact (say a piece of code), and the resources necessary
to perform the transformation.
Artifacts represent the specific ‘‘trackable’’ items that
make up a project––code units (files, modules, packages,
functions––the level of granularity is arbitrary), specifi-
cation documents, design documents, test cases, etc.
Events represent types of activities that occur, such as an
inspection, a modification, an error correction, etc. Re-
sources represent the types of things valued by the pro-
ject that are consumed in the process of creating the artifacts––effort, calendar time, computer time, etc.
Characteristics represent types of information about the
entities––size, complexity, source, etc., as appropriate
and available.
The relationships connect various entity instances to
other entity instances. The most significant relationship is
the transforms relationship. transforms represents the
transformation of one Artifact into another Artifact because of an Event that occurs. We are also interested in the
Resources that are consumed by a particular transfor-
mation, so the consumes relationship associates a partic-
ular transformation with a particular resource type as well
as the quantity of this type of resource that was consumed
during the transformation. Every Artifact, Event, and
Resource possesses a certain set of Characteristics.
A partial example follows:
Artifact(001, A1)
Artifact(002, A2)
Artifact(005, A3)
Artifact(013, A4)
Event(003, Bug Fix)
Resource(004, Programmer Effort in Hours)
Characteristic(007, Cyclomatic Number,
Complexity, V(G) per XYZ metric tool)
Characteristic(008, LOC, Size, Number of
non-blank lines in module)
Characteristic(009, Pages, Size, Number
of non-blank pages in document)
Characteristic(010, C Code, Type, File
consisting of C source code)
Characteristic(011, Requirements,
Type, Document consisting of features
the product must implement)
Characteristic(015, Bug Report, Type,
Record of a bug)
Characteristic(016, Bug Description,
Info, Missing Reply Feature)
transforms(012, 002, 005, 003, 11-01-99)
transforms(012, 013, 005, 003, 11-01-99)
possesses(001, 011, 0)
possesses(002, 010, 0)
possesses(005, 010, 0)
possesses(002, 008, 49)
possesses(005, 008, 59)
possesses(001, 009, 29)
possesses(013, 015, 0)
possesses(013, 016, 0)
consumes(012, 004, 10)
This example represents the following facts (among others):
1. Artifact 1 is a 29 page requirements document
2. Artifact 2 is a 49 line C file
3. Artifact 5 is a 59 line C file that resulted from a mod-
ification of Artifact 2 in response to a Bug Fix to cor-
rect a "Missing Reply Feature", which took 10 h of programmer effort
4. Artifact 13 is a bug report
7. Metadata in the universal metrics repository
One of the issues that contemporary data repositories suffer from is ambiguity. It is common for various terms such as "Effort", "Size", "Duration", "Complexity", etc. to be used in metrics repositories with
little opportunity for documentation. However, each
specific characteristic contains a description:
Characteristic (id, name, type, description)
This imposes the maintenance of metadata describing
each characteristic. Naturally, there is no mechanism to
ensure that the description is useful or meaningful,
however this at least makes the description of each metric an actual part of the repository as opposed to a
separate file description page that is unlikely to be
maintained, and usually produced as an afterthought.
8. Queries against the repository
Retrieving information from contemporary "flat file" repositories is usually fairly straightforward. A simple
database projection or selection can be used to retrieve
either information about a particular metric or a par-
ticular project. Unfortunately, the price we pay for the
enhanced flexibility of the Universal Metrics Repository
is more complicated retrieval queries.
For instance, to determine the resources consumed in
creating a given project, an SQL query would have to join the various entities and relationships as follows:
SELECT * FROM
  Artifact as A1, Transforms as T, Artifact as A2,
  Consumes as C, Resource as R
WHERE A2.name = '$artifact' AND
  A1.id = T.artifactID1 AND
  A2.id = T.artifactID2 AND
  C.tid = T.tID AND C.rid = R.id;
Or to retrieve information about the characteristics of
a particular code unit, an SQL query may look like:
SELECT * FROM
  Artifact as A, Possesses as P, Characteristic as C
WHERE A.name = 'code_unit1.C' AND
  A.id = P.entityID AND
  P.characteristicID = C.id;
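Aggregations follow the same pattern. For instance, a sketch (under the same naming assumptions as the queries above) that totals the quantity of each resource type consumed across all transformations:

SELECT R.name, SUM(C.quantity) as total_consumed
FROM Consumes as C, Resource as R
WHERE C.rid = R.id
GROUP BY R.name;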
However, in our opinion, the additional power pro-
vided by the Repository more than compensates for the
additional query complexity.
9. Discussion
Software measurement has been a significant issue
within software engineering for many years. It is im-
portant that relationships between new concepts and
past work be explicitly addressed.
9.1. The universal metrics repository and GQM
Basili and Rombach (1988) introduced a concept
called the ‘‘Goal-Question-Metric’’ (GQM) Framework
that provides guidance in selecting specific metrics to
capture. The framework entails first specifying the goals
for the measurement effort, then relating those goals to questions that are to be answered by the measurement,
and finally identifying the operational data necessary to
answer those questions.
A significant implication from the GQM Framework
is that different data will be captured in response to
different questions and goals. Since it is reasonable to expect that over its lifetime an organization will have many different measurement goals and questions, a static repository schema is wholly inadequate. An
organization using the GQM Framework will require a
much more flexible schema, such as that embodied
within the Universal Metrics Repository. In the Uni-
versal Repository, new metrics can be added, and
metrics that are no longer relevant can be dropped
without requiring the creation of new databases with
different schemas. More importantly, data that is common across different measurement ef-
forts can continue to be housed within a single logical
database so common aspects from different efforts can
be combined.
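For instance (a sketch against the prototype tables of Section 6; the metric, its description, and the id values are invented), a metric introduced to answer a new GQM question is simply new rows rather than a new schema:

-- Register the new metric as a Characteristic, with its defining metadata.
INSERT INTO Characteristic (id, name, type, description)
VALUES (17, 'Fan-out', 'Coupling',
        'Number of distinct modules called, per a hypothetical static-analysis tool');
-- Attach it to an existing code artifact (artifact 5 of the earlier example).
INSERT INTO Possesses (entityID, characteristicID, qty)
VALUES (5, 17, 12);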
9.2. Alternate lifecycle models and the universal metrics
repository
A number of different lifecycle models have been
proposed over the past thirty years. In order for a
Metric Repository schema to be truly ‘‘universal’’, it
must adequately support a variety of lifecycle models.
Since a generic lifecycle model describes the process in-
volved in developing a software product as a series of
transformations, the Universal Repository Schema is flexible enough to represent projects utilizing different lifecycles.
The Waterfall Lifecycle Model. The well-known
Waterfall Lifecycle (Royce, 1970), describes a series of
sequential phases:
1. Requirements (determine what the user wants)
2. Design (convert the requirements into a design to
solve the problem)
3. Implementation (transform the design into a pro-
gram)
4. Testing (test and debug the program to yield a ‘‘deliv-
ered program’’)
Each phase results in the transformation of an earlier
artifact into a new artifact which is in turn transformed
by subsequent phases. This sequential collection of phases and their artifacts can be modeled quite naturally
using the Universal Metrics Repository Schema, since it
is based on the transformation of one or more artifacts
into one or more new artifacts.
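As an illustration (a sketch only; the artifact, event, and transformation ids and the dates are invented, and the corresponding Artifact and Event rows are assumed to exist), a waterfall project maps onto a chain of Transforms rows, one per phase:

-- Artifacts: 101 requirements, 102 design, 103 program, 104 delivered program.
-- Events:    201 Design, 202 Implementation, 203 Testing.
INSERT INTO Transforms (tID, artifactID1, artifactID2, eventID, EndDate) VALUES
  (301, 101, 102, 201, '2003-02-01'),   -- requirements transformed into design
  (302, 102, 103, 202, '2003-05-01'),   -- design transformed into program
  (303, 103, 104, 203, '2003-07-01');   -- program transformed into delivered program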
Spiral Lifecycle Model. The Spiral Lifecycle model
(Boehm, 1988) is an iteration through four stages:
1. Planning (gathering of requirements for this iteration)
2. Risk Analysis (do conditions that threaten the project
exist?)
3. Engineering (if risk analysis says GO, then build the
next increment of the system)
4. Customer evaluation (get evaluation and feedback
from customer)
Unlike the Waterfall Lifecycle, requirements are
uncovered incrementally, and interleaved with system
implementation, thus feedback from customer use of a
running system is provided early and continuously
during the project. Nevertheless, requirements gathered
at the Planning phase are transformed into ‘‘GO arti-
facts’’ which are then transformed into executable arti-
facts to make up the next increment of the system.
Within the Spiral Lifecycle, artifacts (i.e., require-
ments for various iterations of the project) appear
‘‘spontaneously’’ at various points throughout the
development. These artifacts may be transformed into
other artifacts (i.e., the Risk Analysis says ‘‘GO’’), or
their evolution may be terminated (i.e., the Risk Ana-
lysis says ‘‘NO GO’’). In either case, they should be of
interest to the data collection. Most alternative approaches to maintaining project measurement data are deficient in this model, since they typically ignore the temporal aspects of a project and simply collect data at project completion. Aborted implementations of requirements and other "non-delivered" code often simply do not appear in the final project database.
However, the Universal Metrics Repository Schema
explicitly addresses the issue of artifacts that appear and then disappear before project completion because it re-
cords the introduction, transformation, and ultimately
termination of every artifact in the project. Thus, the
schema can provide a more accurate picture of artifacts,
allocation of effort and schedule than traditional meth-
ods used to record project metrics.
Extreme Programming. Extreme programming (Beck,
2000) (XP) is an iterative development model in which:
1. Tests are written before code.
2. Code is regularly refactored.
3. The emphasis is on very small increments of functionality.
4. User stories are used instead of formal requirements
and design.
User stories can be represented as artifacts. As stories
are implemented, they are transformed into a series of
tests, which are then transformed into the code neces-
sary to implement the story. This process is typically
performed as a series of small iterations, and integrated
weekly. It is said that in XP actual product release is
almost a non-event since the customer can take inter-
mediate versions of the system and deliver them to their users. As new iterations encompassing enough addi-
tional functionality are ready, users can upgrade.
In traditional ‘‘end of project’’ measurement schemes,
it is difficult to determine when to begin populating the
repository. Because the Universal Metrics Repository
schema maintains information on artifacts concurrently
with development progress, it provides a snapshot of the
state of the system at any given point in the project. Thus, metrics are always available for fielded systems
because data collection is on-going.
Combining Artifacts from Different Process Models.
Because the transformation model considers software
development as a series of transformations of artifacts,
it is not at all unreasonable to consider merging repos-
itories of data from projects using different life cycle
models. While we may not necessarily wish to compare a user story describing a customer purchasing a ticket to a concert written on a 4 × 6 card with a formal specifica-
tion of an autopilot system in an airliner, neither may we
wish to maintain separate measurement repositories just
because some projects are developed using different
methodologies. Because the Universal Metrics Reposi-
tory associates simple metadata with many of its objects,
such as Characteristics, and allows the storage of more elaborate metadata through the use of the possesses relationship, plenty of opportunity exists to ensure that different lifecycles are logically segregated while participating in the same physical measurement database.
10. Summary
In this paper, we have addressed the issue of repre-
senting and managing the data associated with a soft-
ware metrics effort. In order to avoid many of the
problems inherent in earlier repository efforts, we have
devised a model by which to represent the activities that
are performed during a software development project.
We refer to this model as the Transformational View of
Software Development. By using the Transformational
View of Software Development, we are able to represent
both activities and artifacts in a flexible repository of
metrics data. This approach can be used equally well on
lifecycle models ranging from heavyweight processes
such as the Waterfall Lifecycle Model to agile, light-
weight processes such as Extreme Programming.
We also described a proof-of-concept prototype Repository. While we have populated the prototype
repository with some software engineering data, we plan
on acquiring a richer variety of software engineering
repositories. From this we hope to gain experience in
using the UMR to consolidate incompatible metrics
datasets.
Acknowledgements
The author would like to thank the reviewers for their
helpful comments and suggestions, as well as Linda
Keast and Subu Swayam for their work in implementing
the proof-of-concept implementation. This work was partially funded by the National Science Foundation's
Software Engineering Research Center (http://www.
serc.net) and Northrop Grumman Corporation.
References
Basili, V.R., 1980. Data collection, validation, and analysis. In:
Tutorial on Models and Metrics for Software Management and
Engineering. IEEE Computer Society Press.
Basili, V.R., Rombach, H.D., 1988. The TAME project: towards
improvement-oriented software environments. IEEE Transactions
on Software Engineering SE-14 (6), 758–773.
Beck, K., 2000. Extreme Programming Explained: Embrace Change.
Addison Wesley.
Boehm, B.W., Brown, J.R., Lipow, M., 1976. Quantitative evaluation
of software quality. In: Proceedings, Second International confer-
ence on Software Engineering. pp. 592–605.
Boehm, B.W., 1988. A spiral model for software development and
enhancement. Computer 21 (5), 61–72.
Florac, W., 1992. Software quality measurement: a framework for
counting problems and defects. SEI Technical Report CMU/SEI-
92-TR-022.
Goethert, W.B., Bailey, E.K., Busby, M.B., 1992. Software
effort and schedule measurement: a framework for counting staff-
hours and reporting schedule information. SEI Technical Report
CMU/SEI-92-TR-021.
IEEE Standard for Software Productivity Metrics, 1992. IEEE Std
1045-1992.
Martin, R., Carey, S., Coticchia, M., Fowler, P., Maher, J., 1989. In:
Proceedings of the Workshop on Executive Software Issues August
2–3 and November 18, 1988. SEI Technical Report CMU/SEI-89-
TR-006.
Park, R., 1992. Software size measurement: a framework for counting
source statements. SEI Technical Report CMU/SEI-92-TR-020
ADA258304.
Royce, W.W., 1970. Managing the development of large software
systems. In: Proceedings of WESTCON, San Francisco.
Rozum, J., 1993. The SEI and NAWC: working together to establish a
software measurement program. SEI Technical Report: CMU/SEI-
93-TR-07.
Weiss, D., 1979. Evaluating software development by error analysis:
The data from the architecture research facility. The Journal of
Systems and Software 1 (1), 57–70.