


Introduction to the Special Issue on Software Engineering

Software engineering, like economics and psychology, is a field of conjecture. In all three fields, the researcher makes a conjecture and then attempts to substantiate it through an analysis of observed data.

For example, an economist may make a conjecture about the impact of deregulation of airlines on competition. Whether this conjecture is true can be determined only through observation of the state of the industry after deregulation and comparison with the state of the industry before deregulation. Likewise, a psychologist may form a hypothesis that a person's short-term memory consists of a stack-type data structure. This theory can be validated only by having a subject try to remember words, and determining which of the words are forgotten first as the limit of short-term memory is approached.

Clearly, economics and psychology share a similar paradigm (conjecture, then proof through observation); it is significant, however, that their specific methodologies differ. The economist examines the workings of the "real world" via empirical data, never quite sure if the activity observed is due to the factor under examination or is a side effect of some other unanticipated occurrence. Conversely, the psychologist, by using finely controlled experimentation, can isolate events and examine them in sharp focus. However, the researcher can never be sure that what is observed in the laboratory setting will generalize to the real world, where many other variables interact with the one of interest.

Software engineering is rooted quite strongly in the conjecture-proof paradigm, and in fact uses both the empirical and experimental approaches to testing a particular hypothesis. For example, a researcher may conjecture that deep control nesting results in programs that are more likely to have errors than programs that avoid nesting through the use of multiple compound conditionals (a necessary trade-off). This conjecture may be validated in one of two ways:

1. The researcher may split a group of programmers into two groups, one of which is to write a program using shallow nesting and many compound conditionals. The second group will write the same program with a great deal of nesting but few compound conditionals. After having the subjects develop the programs, the experimenter can then determine which group's code contains more errors. A statistically significant difference in error rates between the two groups might lead the experimenter to conclude that the hypothesis is either true or false. Unfortunately, in such cases the program to be written is usually quite trivial, the development environment artificial, and the amount of time devoted to carrying out the experiment limited to an hour or less. Thus, it is not at all clear that a manager in industry can generalize these results from the laboratory to a given project.

2. A second approach to validating the hypothesis would be for the researcher to acquire a large number of industrial software systems, along with their error histories. The degree of nesting in each system could be determined, and the level of nesting and error histories among the sample could be statistically compared. However, because the software systems would no doubt come from a wide variety of environments, application domains, and programmers, the impacts of these differences might overwhelm that of the difference due to the level of nesting. Thus, the manager would not be able to trust these validation efforts any more than those done in the experimental setting.
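To make the conjecture above concrete, the two coding styles being compared might look something like the following sketch. The validity check, its conditions, and the function names are invented purely for illustration and are not drawn from any of the studies discussed here.

#include <stdbool.h>
#include <stdio.h>

/* Style under suspicion: deep control nesting, one condition per "if". */
static bool record_is_valid_nested(int age, int salary, int dept)
{
    if (age >= 18)
        if (age <= 70)
            if (salary > 0)
                if (dept >= 1)
                    if (dept <= 40)
                        return true;
    return false;
}

/* The alternative: shallow nesting obtained by folding the same test
   into a single compound conditional (the "necessary trade-off"). */
static bool record_is_valid_flat(int age, int salary, int dept)
{
    return age >= 18 && age <= 70 && salary > 0
        && dept >= 1 && dept <= 40;
}

int main(void)
{
    /* Both versions should agree on any input. */
    printf("%d %d\n", record_is_valid_nested(30, 50000, 7),
                      record_is_valid_flat(30, 50000, 7));
    return 0;
}

Either validation approach would then ask whether code written in the first style really does accumulate more errors than code written in the second.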

The purpose of this special issue is to attempt to improve the current situation in empirical software engineering studies. The issue will present tools and techniques that make collecting, sharing, and using empirical software engineering data as painless as possible. The papers in the issue can be divided into three categories.

First, the papers by Sallie Henry ("A Technique for Hiding Proprietary Details While Providing Sufficient Information for Researchers") and Jim Bieman, Al Baker, Paul Clites, David Gustafson, and Austin Melton ("A Standard Representation of Imperative Language Programs for Data Collection and Software Measures Specification") discuss an extremely important problem that is often encountered in the attempt to obtain empirical data: the issue of proprietary software.

The typical project manager would be reluctant to give a copy of a $100,000 software system to a professor to "study." This reluctance should not be surprising; vendors seldom even provide source code to their customers.


The papers just cited describe some novel methods of producing special program formats that let us get at the characteristics we might be interested in, while barring reproduction of the original source code.
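As a rough illustration of the general idea (and not the specific technique of either paper), a program could reduce C-like source to a few structural counts before it ever leaves the owner's site, so that researchers receive measurable characteristics but never the code itself. The sketch below makes the simplifying assumption that braces and semicolons can be counted directly, ignoring the complications introduced by strings and comments.

/* Minimal sketch: strip a C-like source file down to aggregate
   structural counts so the counts can be shared while the source
   text itself is never reproduced. Reads from standard input. */
#include <stdio.h>

int main(void)
{
    int c;
    long lines = 0, statements = 0, depth = 0, max_depth = 0;

    while ((c = getchar()) != EOF) {
        switch (c) {
        case '\n':                 /* physical lines             */
            lines++;
            break;
        case ';':                  /* crude statement count      */
            statements++;
            break;
        case '{':                  /* track block nesting depth  */
            if (++depth > max_depth)
                max_depth = depth;
            break;
        case '}':
            if (depth > 0)
                depth--;
            break;
        default:                   /* identifiers, literals, and
                                      comments are simply dropped */
            break;
        }
    }

    /* Only the aggregate characteristics are emitted. */
    printf("lines=%ld statements=%ld max_nesting=%ld\n",
           lines, statements, max_depth);
    return 0;
}

A realistic representation, such as those described in the two papers, would of course have to parse the program properly; the point here is only that the level of nesting and similar measures can be computed and shared without releasing proprietary source.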

The second set of papers is by T. J. Yu, B. A. Nejmeh, H. E. Dunsmore, and V. Y. Shen ("SMDC: An Interactive Software Metrics Data Collection and Analysis System"), William Farr and Oliver Smith ("A Tool for Statistical Modeling and Estimation of Reliability Functions for Software: SMERFS"), and Warren Harrison ("MAE: A Syntactic Metric Analysis Environment"). Quite often, even if the data are available, we may have a hard time summarizing and interpreting them. These papers report on tools and environments used to analyze and maintain collections of software engineering data.

Finally, the paper by Burt Swanson and Cynthia Beath ("The Use of Case Study Data in Software Management Research") explores the role of case studies in software engineering research. The content of this paper differs radically from that of the other papers, since these authors look at what might be termed the human side of software, as opposed to the automatic generation and analysis of code characteristics addressed in the other papers.

If not for the help and encouragement of everyone involved, this special issue would not have been possible. In particular, I thank Connie Helm for her help in contacting potential authors and reviewers. I also thank Bob Glass for his enthusiasm and encouragement, as well as for arranging for the review of my paper so that the process of anonymous review could be maintained. Most important, I thank the referees for their hard work and willingness to participate in this special issue. Some of the referees reviewed as many as three papers. A partial list of the referees follows.

Bahram Adrangi, University of Portland
James M. Bieman, Iowa State University
Curt Cook, Oregon State University
Stewart G. Crawford, AT&T Bell Laboratories
Nancy J. Currans, Hewlett-Packard Corporation
Sallie Henry, Virginia Tech
Rocco F. Iuorno, IIT Research Institute, Data & Analysis Center for Software (DACS)
Kenneth Magel, North Dakota State University
Thomas G. Moher, University of Illinois at Chicago
Jai Navlakha, Florida International University
Brian A. Nejmeh, AT&T Bell Laboratories
Linda M. Ott, Michigan Technological University
T. M. Steinbock, SET Laboratories, Inc.

Several others participated in the review of papers; however, they failed to return a release in time to allow their names to be included in this list.

To all these people I extend a heartfelt thank you. I hope they enjoy reading this special issue as much as I enjoyed organizing it.

Warren Harrison

Portland, Oregon
April 21, 1987