Page 1: Examining A Semantic Metrics Suite for Object-Oriented Design

Examining A Semantic Metrics Suite for Object-Oriented Design

Dr. Letha Etzkorn (PI), Ms. Cara Stein, Dr. Glenn Cox, Dr. Sampson Gholston, Dr. Dawn Utley, Dr. Phil Farrington

The University of Alabama in Huntsville

Page 2: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Problem

Standard software metrics have some problems:

• They are implementation dependent, since they are calculated strictly from the code.

• They count code items, and it is sometimes arguable whether the items counted accurately reflect the qualities the metrics are supposed to measure. Does the number of lines of code always reflect complexity?

Page 3: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Problem

Segment #1 and Segment #2 do the same thing but have very different LOC metrics:

Code Segment #1:

    test[--cnt]->test = val1[input_count++].counter + val2[tmp_count--]->mycount;

Code Segment #2:

    temp = val1[input_count].counter + val2[tmp_count]->mycount;
    input_count++;
    tmp_count--;
    --cnt;
    test[cnt]->test = temp;
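The LOC gap between the two segments can be checked mechanically. The following is a minimal sketch, not part of the original slides: the helper names `physical_loc` and `statement_count` are hypothetical, and counting `;` characters is only a naive proxy for C statements that happens to work for these two fragments.

```python
# Hypothetical helpers for illustration; the slides do not prescribe a LOC counter.
SEGMENT_1 = """\
test[--cnt]->test = val1[input_count++].counter + val2[tmp_count--]->mycount;
"""

SEGMENT_2 = """\
temp = val1[input_count].counter + val2[tmp_count]->mycount;
input_count++;
tmp_count--;
--cnt;
test[cnt]->test = temp;
"""


def physical_loc(source: str) -> int:
    """Count non-blank physical lines."""
    return sum(1 for line in source.splitlines() if line.strip())


def statement_count(source: str) -> int:
    """Count ';' characters as a naive proxy for C statements.

    This is only valid for fragments like these, which contain no
    string literals, comments, or for-loop headers.
    """
    return source.count(";")


for name, segment in (("Segment #1", SEGMENT_1), ("Segment #2", SEGMENT_2)):
    print(name, "physical LOC:", physical_loc(segment),
          "statements:", statement_count(segment))
```

Both counts differ by a factor of five even though the two fragments compute the same result, which is the point the slide makes about syntactic metrics.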

Page 4: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Problem

These problems with software metrics all arise because the metrics are calculated on syntactic aspects of the code.

The solution is to define metrics based on semantic aspects of the code (“what the code means”: the code design versus the code implementation).

This relies on program understanding, which includes any activity that uses dynamic or static methods to reveal program properties.

Page 5: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Approach

Existing Etzkorn informal tokens approach:

• Used natural language processing and information extraction techniques
• Used informal tokens: comments and identifier names
• Used a knowledge base consisting of a hierarchical semantic network
• Originally implemented in the PATRicia system (Program Analysis Tool for Reuse)
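To make "informal tokens" concrete, the sketch below pulls comments and identifier names out of a small C++ fragment and splits the identifiers into words. It is an illustration under stated assumptions, not the PATRicia implementation: the sample source, the keyword list, and all function names are hypothetical, and the regular expressions ignore string literals and preprocessor directives.

```python
import re

# Hypothetical C++ fragment used only as input for this sketch.
SOURCE = """\
// Move the text cursor to the insert location
void moveCursor(int insert_loc) {
    cursorPos = insert_loc;  /* update cursor position */
}
"""

# Deliberately tiny keyword list (an assumption, not a full C++ keyword set).
CPP_KEYWORDS = {"void", "int", "return", "if", "else", "for", "while", "class"}

COMMENT_RE = re.compile(r"//(.*?)$|/\*(.*?)\*/", re.MULTILINE | re.DOTALL)


def extract_comments(source):
    """Return the text of // and /* */ comments, in source order."""
    return [(m.group(1) or m.group(2) or "").strip()
            for m in COMMENT_RE.finditer(source)]


def extract_identifiers(source):
    """Return identifier names found outside comments, minus keywords."""
    code = COMMENT_RE.sub(" ", source)
    names = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", code)
    return [n for n in names if n not in CPP_KEYWORDS]


def split_identifier(name):
    """Split snake_case and camelCase identifiers into lowercase words."""
    words = []
    for part in name.split("_"):
        words.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", part))
    return [w.lower() for w in words]


print(extract_comments(SOURCE))
for identifier in extract_identifiers(SOURCE):
    print(identifier, "->", split_identifier(identifier))
```

The resulting words (for example "cursor", "insert", "loc") are the kind of keywords that a knowledge base could then map to domain concepts.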

Page 6: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Approach

Under the NASA grant, the informal tokens portion of the PATRicia system was extended to compute semantic metrics.

The PATRicia system was also extended to analyze software design documents in IEEE Design Document format.

The new tool is called semMet.

Page 7: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Approach

[Figure: example keywords (CURSOR, LOC, OBJ, INSERT, TEXT) in the interface layer, which consists of keywords tagged with the part of speech (noun, adjective, verb, etc.). Arrows labeled "infer" connect the interface layer to conceptual graphs and to concepts in conceptual graphs.]

Page 8: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Approach

Class Domain Complexity (CDC):

$$\mathrm{CDC} = \sum_{i=1}^{m} \left|\text{concept} + \text{conceptual relations}\right|_i \times \text{weight}_i$$

Here |concept + conceptual relations| is 1 + the number of conceptual relations linking the current concept to another concept recognized by the class. Concepts linking to concepts in another class are not included in the count. Only outgoing conceptual relations are included in the count (to prevent counting the same conceptual relation twice).
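The CDC definition can be sketched in a few lines. This is a minimal illustration, not semMet's code: it assumes that m is the number of concepts recognized by the class, and the `Concept` type, the concept names, and the weights are all hypothetical (the slides do not say how weights are assigned).

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A concept recognized by a class (hypothetical structure)."""
    name: str
    weight: float = 1.0
    # Names of the concepts that this concept's OUTGOING relations point to.
    outgoing: list = field(default_factory=list)


def class_domain_complexity(concepts):
    """Sum (1 + internal outgoing relations) * weight over a class's concepts.

    Relations whose target is not recognized by the same class are skipped,
    and only outgoing relations are counted, so no relation is counted twice.
    """
    recognized = {c.name for c in concepts}
    total = 0.0
    for concept in concepts:
        internal = sum(1 for target in concept.outgoing if target in recognized)
        total += (1 + internal) * concept.weight
    return total


# Toy class: "cursor" relates to "text" (same class) and to "window",
# which is assumed to belong to another class and is therefore ignored.
toy_class = [
    Concept("cursor", weight=1.0, outgoing=["text", "window"]),
    Concept("text", weight=0.5),
]
print(class_domain_complexity(toy_class))  # (1 + 1) * 1.0 + (1 + 0) * 0.5 = 2.5
```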

Page 9: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Approach

Two suites of semantic metrics have been examined:

• Etzkorn and Delugach suite (2000)
• Stein et al. suite (2004)

Page 10: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Approach

• Class Domain Complexity: variations were CDC0, CDC1, CDC2, CDC2a
• Class Domain Entropy: variations were CDE, CDEa, CDEb
• Relative Class Domain Complexity: variations were RCDC0, RCDC1, RCDC2, RCDC2a
• Relative Class Domain Entropy: variations were RCDE, RCDEa, RCDEb
• Key Class Identity: variations were KCI0, KCI1, KCI2, KCI2a

Page 11: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Approach

• Entropy Key Class Identity: variations were EKCIa, EKCIb
• Logical Relatedness of Methods: variations were LORM, LORM2, LORM2a, LORM2b, LORM3’
• Key Class Factor (KCF)
• Logical Disparity of Methods (LDM)

Page 12: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Approach

• Percentage of Shared Ideas (PSI)
• Percentage of Universal Ideas (PUI)
• Percentage of Closely Related Classes (PCRC)
• Average Proportion of Ideas Shared with other Classes (APISOC)

Page 13: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Results

These metrics have been compared, over 3 graphical user interface software packages, to:

• Traditional object-oriented syntactic metrics
• Expert analyses of software

Theoretical analyses of these metrics have been performed using:

• Kitchenham criteria
• Weyuker criteria
• Briand, Morasca, and Basili criteria

Page 14: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Results

Work is continuing on analyzing the metrics over additional data (this requires new knowledge bases, or KBs):

• Real-time software packages
• Data from NASA MSFC

Also, Mike Chapman and Edward Aycoth are currently running semMet and collecting semantic metrics on data from the Metrics Data Program (a KB has already been built).

Page 15: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Sample Results

Metric   WMC                 DIT
CDC0     0.7037 (<0.0001)    0.1235 (0.2601)
CDC      0.7199 (<0.0001)    0.1602 (0.2989)
CDC2     0.7596 (<0.0001)    0.1492 (0.3338)
CDE      0.7461 (<0.0001)    0.2439 (0.1104)
CDEa     0.7534 (<0.0001)    0.4254 (0.0040)
CDEb     0.6538 (<0.0001)    0.2010 (0.1907)
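Each table entry appears to be a correlation coefficient between a semantic metric and a traditional metric, with what looks like a p-value in parentheses. The slides do not state which coefficient was used, so the sketch below assumes a plain Pearson correlation purely for illustration. The per-class values are made up and are not the study's data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up per-class measurements (NOT data from the study):
# one semantic metric value and one WMC value for each of five classes.
toy_cdc = [2.5, 4.0, 1.0, 6.5, 3.0]
toy_wmc = [5, 9, 2, 12, 7]

print(f"r = {pearson_r(toy_cdc, toy_wmc):.4f}")
```

A significance test (the parenthesized values in the table) would be a separate step and is not shown here.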

Page 16: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Results

To provide a proof of concept for calculating semantic metrics from design specifications, the semantic metrics were computed separately from the design specification and the source code of the same system, and then compared:

• To ensure a baseline case where the system was clearly implemented according to the design (which is not always the case), 3 systems were compared to design specifications reverse-engineered from their source code using Javadoc

• Semantic metrics were computed from human-generated design documents from a GUI package, then separately from the source code for this package, and compared (for two versions of the GUI package)

• A Perl script was used to reformat the design documents into the format expected by semMet

Page 17: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Results

• Semantic metrics for the reverse-engineered systems had statistically significant moderate to large correlations (as was expected)

• Semantic metrics on the GUI packages:
  • Cohesion metrics correlated well for wxWindows version 2.4.2; complexity metrics performed less well
  • For wxWindows version 1.6, most of the semantic metrics had statistically significant large or very large correlations

Page 18: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Results

• Work is continuing on analyzing the semantic metrics as calculated on design documents (this will require new KBs)

Page 19: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Importance/Benefits

Advantages of semantic metrics:

• Provide a seamless set of metrics from Software Design Document creation through implementation and maintenance
• Provide metrics early in the software development cycle
• Metrics do not vary based on code syntax alone
• Metrics may provide additional insight based on measuring domain complexity rather than implementation complexity

Page 20: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Relevance to NASA

• Can be used anywhere traditional metrics are currently used, but have potential advantages compared to existing metrics
• We have received data from MSFC
  • The plan is to examine this data using both semantic metrics and traditional metrics
• Mike Chapman will analyze data from MDP using semMet

Page 21: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Accomplishments

• Completed the initial feature of semMet that measures semantic metrics on source code, both on class headers alone and on entire source code

• Defined several new semantic metrics in addition to those originally defined

• Completed the initial feature of semMet that analyzes Software Design Documents in IEEE Design Document format

• Performed some initial statistical analyses on code and Software Design Documents

Page 22: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Accomplishments -- Current Publications

Etzkorn, L.H., Gholston, S.E., Fortune, J.L., Stein, C.E., Utley, D., Farrington, P.A., and Cox, G.W., "A Comparison of Cohesion Metrics for Object-Oriented Systems," Information and Software Technology, Vol. 46, August 2004, pp. 677-687.

Stein, C., Etzkorn, L., Cox, G., Farrington, P., Gholston, S., Utley, D., and Fortune, J., "A New Suite of Semantic Metrics for Object-Oriented Software," 1st International Workshop on Software Audit and Metrics, Porto, Portugal, April 13-14, 2004.

Stein, C., Etzkorn, L., Utley, D., Farrington, P., Cox, G., Fortune, J., and Gholston, S., "Computing Software Metrics from Design Documents," 42nd Annual ACM Southeast Conference, Huntsville, AL, April 2-3, 2004.

Stein, C., "Fine-Grained Semantic Metrics for Object-Oriented Software," International Conference on Software Engineering Research and Practice (SERP ’04), Las Vegas, NV, June 21-24, 2004. (Student paper)

Stein, C., Semantic Metrics for Source Code and Design, doctoral dissertation (successfully defended June 9, 2004).

Additionally, other papers are currently under submission.

Page 23: Examining A Semantic Metrics Suite for Object-Oriented Design

A Semantic Metrics Suite for Object-Oriented Design -- Accomplishments -- Next Steps

• Continue statistical tests:
  • Analyze semantic metrics over additional code drawn from several different areas
    • Analyze data from MSFC
    • Analyze data from MDP (Mike Chapman)
    • Analyze other datasets
  • Analyze semantic metrics over design documents
    • Requires building multiple additional knowledge bases

• Examine additional new or improved semantic metrics