18/04/23 Dr Andy Brooks 1
Lectures 5 & 6
The Benefits of Modularity
I really don't understand these monoliths.
Time to break it down into pieces?
MSc Software Maintenance
Case Study
Original Korson work:
T D Korson and V K Vaishnavi, "An empirical study of the effects of modularity on program modifiability". In E Soloway and S S Iyengar, editors, Empirical Studies of Programmers: First Workshop, pages 168-186, Ablex Publishing Corporation, 1986. A volume in the Ablex Human/Computer Interaction Series.
External replication:
J Daly, A Brooks, J Miller, M Roper, and M Wood, "An External Replication of a Korson Experiment", EFoCS-4-94, Department of Computer Science, University of Strathclyde, Glasgow, Scotland. http://www.cis.strath.ac.uk/research/efocs/index.html
Korson's Original Results
"The study provides strong evidence a modular program is faster to modify than a non-modular, but otherwise equivalent version of the same program, when the following conditions hold:
(a) Modularity has been used to implement 'information hiding' which localises changes required by modification.
(b) Existing modules in a program perform useful generic operations, some of which can be used in implementing a modification.
(c) A significant understanding of, and changes to, the existing code are required for performing a modification.
In contrast, the study provides evidence that modifications not fitting into the above categories are unaided by the presence of modularity in the source code."
Korson's conditions can be viewed as an attempt to define a subset of the state space of all modular programs for which modularity allows program modifications to be made faster.
Replicating Experiments
• There are always doubts about drawing conclusions from a single experiment: every stage of an experiment from design through to result interpretation is prone to error.
• Scientists demand that experimental results are externally reproducible, i.e. that an independent group of researchers can repeat the experiment and obtain similar results.
• External replication is a cornerstone of modern scientific disciplines.
• When experiments are repeated by the same researchers, the replication is said to be internal.
Results From Replication
• If you obtain the same results, confidence is built up in both the original experimental design and the original experimental results.
• If you obtain different results, there is every reason to be suspicious of both the original experimental design and the original experimental results.
Modular And Monolithic Program Versions
• Inventory, point-of-sale program.
• The non-modular (monolithic) program was created by replacing every procedure and function call in the modular version with the body of the procedure or function.
• Modular version was approximately 1,000 lines long.
• Monolithic version was approximately 1,400 lines long.
Modification Process
• Phase 1 think: on paper, the modifications are coded, noting deletions, additions, and changes to the original source code.
• Phase 2 edit: at the computer, the original program is edited to reflect the modifications made on paper.
• Phase 3 syntax: the syntax errors are interactively removed from the modifications.
• Phase 4 logic: logic errors are interactively debugged until the program passes a standard test.
Familiarization Of Subjects
• Prior to the experiment itself, each subject completed a pre-test. This was used simply as a way of familiarizing the subjects with the experimental design and the computer hardware and software environments.
• Before starting Phase 1, subjects were given ample time to read the instructions and to ask questions on the experimental process and the perfective maintenance tasks.
Experiment 1: Initial Concerns
1. There is a global array variable InventoryArray in the modular program.
True information hiding is not implemented.
2. Modular subjects while performing one modification are drawn to the following very helpful comment:
{All access to inventory information is via the following four procedures}
3. Korson relies solely on timing data on which to base his interpretations.
It is important to search for alternative interpretations by gathering different kinds of data.
4. It is more natural for a programmer to locate code to change by using the search facility of a text editor.
Pilot Studies
• A pilot study involves performing any pre-test and experiment on a small number of subjects to determine if there are any problems with experimental materials or the execution of the tasks in the computer environment being used.
• The EFoCS team performed a pilot study with four members of the EFoCS team.
• The results of the pilot study should always be indicated in any published report of an experiment.
Problems Encountered During Pilot Study
• Keyboard command keys used for testing failed to work properly.
– A hot-key customisation tool fixed this problem.
• Spelling mistakes were found in the documentation.
– They were corrected for the experiment.
• Subjects were unfamiliar with American-specific terminology.
– Explanatory footnotes were provided in the experiment.
• One subject had to ask what file name to use.
– The name of the file was highlighted in bold in the experiment.
– Key numbers were also highlighted in bold in the experiment.
• A large sheet of test results caused data identification problems.
– Data changes to look for were circled in the experiment. (Korson had actually done this; EFoCS had overlooked doing it.)
Pilot Study Debriefing Comments
• "the pretest taught the semantics of the task"
• "no intellectual capacity needed to perform changes" (monolithic and modular programs)
• "the comment... I no longer have to consider the program, just looking at the four procedures"
• "in the monolithic program, the task turns into a global search and replace operation"
• "I can't use the editing environment... feel disadvantaged at having to manually search through the listing"
EFoCS Experimental Design Differences
• Minor recipe-improving changes
– Explanatory footnotes for American-specific terminology, debriefing questionnaire, etc.
• 4 or 5 subjects to each monitor
– Ratio unknown for the Korson study.
• Turbo Pascal reference manual to every 4 or 5 subjects
– 1:1 ratio for the Korson study.
– No reference conflicts occurred.
• Pre-test and experiment between 2pm and 6pm
– 6pm and 10pm for the Korson experiment.
– Unknown time for the Korson pre-test.
• Subjects received photocopied lecture notes as reward
– Korson allowed subjects to keep the Turbo Pascal reference manuals.
Debriefing Questionnaire
Personal Details
Name ...
Age ...
Sex ...
Position (e.g. 2nd year CS) ...
Qualifications if you are a graduate (e.g. BSc Comp Sci 2(i)) ...

1. Did you find the task easy?
2. Did you make good use of the editor?
3. What caused you the most difficulty?
4. How well do you understand the code?
5. Have you learned anything? If so, what?
Any other comments?
Replication Subjects
• 23 volunteers, all from the Computer Science department:
– 5 second year students
– 2 third year students
– 9 final year students
– 4 research students
– 3 research assistants
• All met the Korson criteria: “fluency in Pascal, knowledge of the IBM-PC, and an amount of programming experience”
Replication Methodology
• Subjects were randomly assigned to two groups.
• All worked in the same laboratory.
• Subjects with monolithic programs sat next to subjects with modular programs to discourage any form of cooperation between subjects.
• Subjects were advised they were working on different versions of a program, so some might finish earlier than others.
– Done to reduce subjects' concern about their performances.
Replication Procedure
• Pre-test to familiarize subjects.
• Short break for refreshments (10-15 minutes).
• Experiment.
• Each subject was asked for brief personal details and asked to complete a debriefing questionnaire.
• Pre-test and experiment were of the same format as Korson's.
EFoCS Expectation
On average, Korson's monolithic subjects took more than 4 times as long as his modular subjects.
Given the size of the effect Korson found, EFoCS hypothesized that the results from the replication would be similar to Korson´s i.e. that the times for modular subjects would be significantly faster than for the monolithic subjects.
Result Times of Experiment

Mean total times:

         Modular   Monolithic   Ratio
EFoCS     48.0       59.1       ~1.3x
Korson    19.3       85.9       ~4x
Bar Charts Displaying Total Mean Times
[Bar charts of total mean times: Korson's monolithic/modular gap is ~4x; the EFoCS gap is ~1.3x.]
If EFoCS subjects were simply less able (longer modular times), then monolithic times should also be longer – they were not.
Statistics
• Korson used a Wilcoxon Rank Sum test. He rejected the null hypothesis (H0: information hiding has no effect on maintainability) with p<0.001.
• EFoCS used a two-tailed, independent t-test (df = 15, t = -0.870). They could not reject H0 with p<0.4.
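The pooled-variance t-test EFoCS used can be sketched in a few lines of stdlib Python. The timing data below is invented for illustration (8 modular and 9 monolithic subjects, giving df = 15 as in the EFoCS report); it is not the real EFoCS data:

```python
import math
from statistics import mean, variance

def independent_t(a, b):
    """Two-tailed independent-samples t statistic with pooled variance."""
    na, nb = len(a), len(b)
    # Pooled sample variance across the two groups.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2  # t statistic and degrees of freedom

# Hypothetical total times in minutes, chosen to have group means near
# the reported 48.0 and 59.1 but with wide within-group spread.
modular    = [20, 35, 45, 50, 55, 60, 54, 65]
monolithic = [25, 30, 45, 55, 60, 70, 85, 90, 72]
t, df = independent_t(modular, monolithic)
print(f"t = {t:.3f}, df = {df}")
```

With this much within-group variation the difference in means does not reach significance, which is the shape of the EFoCS result.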
Inductive Analysis
• To help explain the different result, a database of experimental results was created.
• A rule induction system called IRIS was used to find patterns (If... then... rules) in the induction database. IRIS takes each database variable in turn as the dependent variable (the then... part).
• New variables (noted after the experiment) may be introduced into the induction database, as long as variable values can be obtained.
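IRIS itself is not described here, but the core idea of If... then... rule induction can be sketched as follows. This is a simplified, hypothetical illustration: the attribute names are invented and are not the study's actual induction variables:

```python
from collections import defaultdict

def induce_rules(rows, dependent):
    """Find rules 'if attr == value then dependent == outcome' that hold
    for every row where attr == value (a simplified, IRIS-like idea)."""
    rules = []
    for attr in rows[0]:
        if attr == dependent:
            continue
        seen = defaultdict(set)  # attribute value -> dependent outcomes observed
        for row in rows:
            seen[row[attr]].add(row[dependent])
        for value, outcomes in seen.items():
            if len(outcomes) == 1:  # the value always predicts one outcome
                rules.append((attr, value, next(iter(outcomes))))
    return rules

# Invented example: does a slow pretest predict a slow experiment?
rows = [
    {"pretest": "slow", "version": "modular",    "exp": "slow"},
    {"pretest": "slow", "version": "monolithic", "exp": "slow"},
    {"pretest": "fast", "version": "modular",    "exp": "fast"},
    {"pretest": "fast", "version": "monolithic", "exp": "fast"},
]
for rule in induce_rules(rows, "exp"):
    print("if {} == {!r} then exp == {!r}".format(*rule))
```

Here pretest speed predicts experiment speed but program version does not, so only pretest rules are induced — the same flavour of pattern as Rule 1 below.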
Induction Variables
Grading Of Induction Variables
• Variable (14) diff was graded:
– 1 pretest; 2 finding the correct places to modify; 3 other.
• Variable (15) code was graded:
– 0 no understanding; 1 only the relevant parts; 2 fairly well; 3 well or very well.
• Variable (16) learn was graded:
– 1 nothing; 2 read the instructions fully; 3 other; 4 to make modifications the code does not have to be understood.
• Variable (17) extra was graded:
– 1 comments in the code not noticed; 2 none; 3 comments in the code read but not of any help; 4 Pascal syntax forgotten; 5 comments in code helped.
Result Times Of Pretest
Pretest times were graded similarly to Variables (1)-(6) and included in the induction process.
Induced Rules
1. A suggested relationship was found between total experimental times and total pretest times. All 9 of the monolithic subjects appeared in the top 12 places when ranked by pretest timings.
2. A high number of changes missed in the think phase of the experiment was associated with an excessively high syntax or logic time.
Ability Effect Rule 1
• Rule 1 suggests an ability effect.
• Programming ability differences have been reported between 4:1 and even 25:1.
• Modular Subject A, who finished fastest on both the pretest and the experiment, was known to be a person of high ability.
• So one interpretation is that, by accident, more able subjects were assigned to the monolithic program.
Subjects B, Q, and O
• Subject B had the longest pretest time but the second fastest modular time.
– This subject had not taken time to read the pretest instructions but had learnt from doing the pretest.
• Subject Q took the longest with the monolithic program despite being second fastest on the pretest.
– This subject had inadvertently deleted a line during the experiment and eventually a monitor had to advise him of his problem.
• Subject O had a good pretest but a relatively poor experimental performance.
– No substantive explanation as to why.
Artificial Experiment? Rule 2
• The 4 subjects with the highest syntax times all made edits during the syntax phase correcting semantics, specifically against instructions.
– 3 of the 4 were working on the modular program.
• Perhaps separation into the four phases (think, edit, syntax, and logic) was too artificial.
– Subjects K and P specifically commented on this artificiality.
• Perhaps more monitors would have helped control subjects' progression through the four phases.
– Korson reported no problems with subjects moving prematurely between phases.
Logic Time Variations
• Subjects with a correct modular program from the syntax phase took between 1 and 20 minutes to perform the testing in the logic phase.
• Subjects with a correct monolithic program from the syntax phase took between 1 and 5 minutes to perform testing in the logic phase.
Three Ways To Test
• Programs could be tested in three ways:
– using a single hot key (1m 15s)
– using a series of hot keys (55s)
– manually entering test data (2m 30s)
• Rereading the testing instructions could take another minute.
– So times between 1 and 5 minutes are explainable.
• A thought experiment was conducted where all the logic times for correct programs from the syntax phase were set to 1 minute.
– There was still no statistically significant difference between modular and monolithic subjects (p < 0.259), with the gap between the means of total time widening only slightly (~1.38x).
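The thought experiment amounts to clamping each correct subject's logic time to one minute and re-computing the totals. A minimal sketch, with invented phase times rather than the EFoCS data:

```python
def clamp_logic_times(subjects, floor=1.0):
    """Set every logic-phase time to `floor` minutes and return new totals."""
    return [
        {**s, "logic": floor,
         "total": s["think"] + s["edit"] + s["syntax"] + floor}
        for s in subjects
    ]

# Hypothetical phase times (minutes) for two subjects whose programs
# were correct after the syntax phase.
subjects = [
    {"think": 20.0, "edit": 10.0, "syntax": 5.0, "logic": 20.0},
    {"think": 18.0, "edit": 12.0, "syntax": 4.0, "logic": 3.0},
]
for s in clamp_logic_times(subjects):
    print(s["total"])  # totals recomputed with logic time = 1 minute
```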
Subjects' Code Understanding
• Monitors noticed several subjects attempting to follow program execution flow.
• I and J, the two fastest finishers with the monolithic program, said they had no understanding of the code and no understanding was required to make the modifications.
• G and H, who took the longest times with the modular program, had written on the listings in a way indicating they had attempted to follow program execution.
• L and O both wrote procedures for the monolithic version.
Possible Improvements To Experimental Recipe
• Improve program layout and commentary.
• Remove the biased comment in the modular program.
• Replace the global variable in the modular program to fully implement information hiding.
• Reduce the size of the experimental instructions.
• Employ more monitors.
• Instruct monitors to closely control the four different phases.
• Use only one way to test the program.
• Use a more specific group of subjects.
– More homogeneous.
– Fewer problems with an ability effect?
Main Reasons For Different Results
• Possible ability effect
– accidental assignment of more able subjects to the monolithic program
• Varying levels of code understanding
– 2 fast subjects had no understanding of the code: "pragmatic maintenance"
– 2 slow subjects tried to understand the code
Conclusions
• Korson's results were not confirmed by the replication.
• Software engineering experiments must be externally replicated at research centres around the world to determine if the results are really generalizable.
• It is vital to get a better understanding of the role of “pragmatic maintenance”.