Quality Assessment of Embedded Language Modules: Targeting the spotlight on heterogeneous systems

Martin van der Vlist
Laboratory for Quality Software (LaQuSo)

PAGE 2 03-02-10

Overview

• Introduction
• COBOL
  • Approach
  • Results
• Java
  • Approach
  • Results
• Conclusions
• Current work
• Future work

Introduction

• LaQuSo: code analysis is core activity
• Embedded language recurring issue
• Often SQL/Database intensive applications
• Metrics/dependencies for host and embedded language available
• No measurement for interaction
• Not much research information available

Research questions

• How can the combination of embedded language fragments and host language statements be analyzed?

• When analyzing embedded language fragments and host language statements, can relevant metrics be calculated and dependencies be analyzed?

Common approach

• Based on existing tools: SQuAVisiT
• 2 different host-languages:
  • COBOL + PL/SQL
  • Java + SQL
• Include case studies
  • Legacy COBOL system
  • 2 Java ERP systems
  • Small Java program

COBOL: example

IDENTIFICATION DIVISION.
PROGRAM-ID. XAG1P626.

* skipped all initializations etc
    MOVE HIA-DATUM-INGANG OF PAR TO HIA-DATUM-INGANG OF Q4
    MOVE HIA-VOLGNUMMER OF PAR TO HIA-VOLGNUMMER OF Q4
    PERFORM M2-1002-ACCESS-BLOCK
* a lot of methods in between
M2-1002-ACCESS-BLOCK SECTION.
M2-1002-ACCESS-BLOCK-PAR.
    EXEC SQL
        SELECT XAP626.HIA_HEFFINGSGRONDSLAG
        INTO :A4.HIA-HEFFINGSGRONDSLAG
        FROM XAP626
        WHERE ( TO_CHAR(XAP626.HIA_DATUM_INGANG, 'YYYYMMDD') = :Q4.HIA-DATUM-INGANG
            AND XAP626.HIA_VOLGNUMMER = :Q4.HIA-VOLGNUMMER )
    END-EXEC.
M2-1002-ACCESS-BLOCK-EX.
    EXIT.

COBOL: approach

• Parse COBOL program using generated parser

[Figure: parse tree of y = 42 * (x + 481), with "=" at the root, y and "*" below it, and 42 and "+" (over x and 481) below "*"]

COBOL: approach

• Parse COBOL program using generated parser
  • regarding SQL statements as 1 token

[Figure: parse tree for M2-1002-ACCESS-BLOCK-PAR containing one SQLStatement node with three children: EXEC SQL, the token "SELECT XAP626.HIA_HEFFINGSGRONDSLAG … XAP626.HIA_VOLGNUMMER = :Q4.HIA-VOLGNUMMER )", and END-EXEC]

COBOL: approach

• Parse COBOL program
  • regarding SQL statements as 1 token
• After COBOL-analysis, parse SQL statements
  • (using existing SQL parser)
• Analyze dependencies between modules and tables
• Compute metrics separately
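
The two-pass idea above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the SQuAVisiT implementation: the class name ExecSqlExtractor and its methods are hypothetical, a regular expression stands in for the generated COBOL parser, and a naive FROM/JOIN pattern stands in for the existing SQL parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (assumption, not part of SQuAVisiT): isolates embedded SQL in COBOL source.
class ExecSqlExtractor {

    // Pass 1: everything between EXEC SQL and END-EXEC is captured as one opaque token.
    private static final Pattern EXEC_SQL = Pattern.compile(
            "EXEC\\s+SQL\\s+(.*?)\\s*END-EXEC",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    // Pass 2 (naive stand-in for a real SQL parser): the identifier after FROM or JOIN.
    private static final Pattern TABLE = Pattern.compile(
            "\\b(?:FROM|JOIN)\\s+([A-Za-z_][A-Za-z0-9_]*)",
            Pattern.CASE_INSENSITIVE);

    /** Returns each embedded statement as a single whitespace-normalized token. */
    static List<String> sqlTokens(String cobolSource) {
        List<String> tokens = new ArrayList<>();
        Matcher m = EXEC_SQL.matcher(cobolSource);
        while (m.find()) {
            tokens.add(m.group(1).replaceAll("\\s+", " "));
        }
        return tokens;
    }

    /** Returns the table names a single SQL token depends on. */
    static List<String> tables(String sql) {
        List<String> names = new ArrayList<>();
        Matcher m = TABLE.matcher(sql);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String cobol = String.join("\n",
                "M2-1002-ACCESS-BLOCK-PAR.",
                "    EXEC SQL",
                "      SELECT XAP626.HIA_HEFFINGSGRONDSLAG",
                "      INTO :A4.HIA-HEFFINGSGRONDSLAG",
                "      FROM XAP626",
                "      WHERE XAP626.HIA_VOLGNUMMER = :Q4.HIA-VOLGNUMMER",
                "    END-EXEC.");
        for (String sql : sqlTokens(cobol)) {
            System.out.println(sql + " -> tables " + tables(sql));
        }
    }
}
```

Keeping the two passes separate means the host-language analysis never has to understand SQL; it only has to find the delimiters. Counting the tokens per module and the distinct tables per module would give figures of the kind reported in the metrics slides.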

COBOL: case study

• Large pension fund
• COBOL/PL-SQL system
• 2816 files, 1603 files with queries
• 1.62 million lines of code

COBOL results: metrics

• Number of queries per module
  • 7 modules: > 15
  • 644 modules: 1
  • One file contains all stored procedures
• Number of tables per module
  • Max number of tables: 4
  • 143 modules > 1 (out of 1603)

COBOL results: metrics

[Figure: metrics chart; legible values 106 and 33; overlay text "NOT SUPPOSED TO READ THIS"]

COBOL results: dependencies

[Figures: dependency diagrams on slides 13 to 17]

Java: example

// running example
public int getCustomerCredit(int id) throws SQLException {
    String sql = "SELECT cd.credit ";
    // trailing space after the alias keeps "cd" and "WHERE" apart
    sql += "FROM CustomerDetails cd " + "WHERE cd.category = " + this.getCategory();
    if (this.restrict) {
        sql += " AND cd.restriction = 1";
    }
    sql += " AND cd.id = ?";
    PreparedStatement s = this.con.prepareStatement(sql);
    s.setInt(1, id);
    ResultSet rs = s.executeQuery();
    rs.next();
    return rs.getInt("credit");
}

Java: the issue

(COBOL example XAG1P626 from the "COBOL: example" slide, shown again for comparison)

// running example
public int getCustomerCredit(int id) throws SQLException {
    String sql = "SELECT cd.credit ";
    // trailing space after the alias keeps "cd" and "WHERE" apart
    sql += "FROM CustomerDetails cd " + "WHERE cd.category = " + this.getCategory();
    if (this.restrict) {
        sql += " AND cd.restriction = 1";
    }
    sql += " AND cd.id = ?";
    PreparedStatement s = this.con.prepareStatement(sql);
    s.setInt(1, id);
    ResultSet rs = s.executeQuery();
    rs.next();
    System.out.println("Query executed");
    return rs.getInt("credit");
}

[Figure annotations on the code: "SQL parts", "Not SQL", "SQL"]

Java: approach

[Code: the getCustomerCredit running example, annotated with "Starting point"]

• Determine starting point
• Slicing to determine fragment
• Compute metrics

Metrics for the running example:
# of String Parts = 6
# of Decisions = 1
# of Alternatives = 2
# of Unknown Parts = 1
% syntactically correct alternatives = 100%
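
The metrics listed for the running example can be reproduced with a small model of the sliced fragment. The sketch below is an assumption-laden illustration, not the tool used in the talk: the class QueryFragment and its methods are hypothetical, each `if` without an `else` is modelled as one independent decision, and the number of alternatives is therefore taken to be 2 to the power of the number of decisions. The syntactic check of each alternative is left out, since it needs a real SQL parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a sliced query fragment; all names are illustrative assumptions.
class QueryFragment {

    // One string part: literal text or an unknown value, optionally guarded by a decision.
    private static final class Part {
        final String text;
        final boolean unknown;
        final int decision;   // 0 = unconditional, n = guarded by the n-th decision

        Part(String text, boolean unknown, int decision) {
            this.text = text;
            this.unknown = unknown;
            this.decision = decision;
        }
    }

    private final List<Part> parts = new ArrayList<>();
    private int decisions = 0;

    QueryFragment literal(String text) { parts.add(new Part(text, false, 0)); return this; }

    // A value the analysis cannot interpret, e.g. the result of a method call.
    QueryFragment unknown() { parts.add(new Part("<?>", true, 0)); return this; }

    // Models `if (cond) { sql += text; }` as a new, independent decision.
    QueryFragment optional(String text) { parts.add(new Part(text, false, ++decisions)); return this; }

    int stringParts()  { return parts.size(); }
    int decisions()    { return decisions; }
    int unknownParts() { return (int) parts.stream().filter(p -> p.unknown).count(); }

    // Enumerates every alternative: each decision is either taken or not taken.
    List<String> alternatives() {
        List<String> result = new ArrayList<>();
        for (int mask = 0; mask < (1 << decisions); mask++) {
            StringBuilder sb = new StringBuilder();
            for (Part p : parts) {
                boolean taken = p.decision == 0 || (mask & (1 << (p.decision - 1))) != 0;
                if (taken) {
                    sb.append(p.text);
                }
            }
            result.add(sb.toString());
        }
        return result;
    }

    // The running example; a separating space is placed after the alias "cd"
    // so that both alternatives are well-formed SQL.
    static QueryFragment runningExample() {
        return new QueryFragment()
                .literal("SELECT cd.credit ")
                .literal("FROM CustomerDetails cd ")
                .literal("WHERE cd.category = ")
                .unknown()                               // this.getCategory()
                .optional(" AND cd.restriction = 1")     // if (this.restrict)
                .literal(" AND cd.id = ?");
    }

    public static void main(String[] args) {
        QueryFragment q = runningExample();
        System.out.println("# of String Parts = " + q.stringParts());
        System.out.println("# of Decisions = " + q.decisions());
        System.out.println("# of Alternatives = " + q.alternatives().size());
        System.out.println("# of Unknown Parts = " + q.unknownParts());
        q.alternatives().forEach(System.out::println);
    }
}
```

Treating every decision as independent is a deliberate over-approximation: correlated conditions would make some enumerated alternatives impossible at run time, which is one way an analysis can report more alternatives than actually occur.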

Java: limitations on analysis

• Analysis limited to methods
• Only interpretation of:
  • String
  • StringBuilder
  • StringBuffer
  • Numeric types
• Decisions regarded as non-deterministic
  • More or fewer alternatives may be found than are actually possible

Java: case studies

• ERP system (Compiere)
  • GPL license
  • Company-maintained, little community feedback used
  • 0.7 million lines of code
  • 2056 DB calls detected
• ERP system (Adempiere)
  • GPL license
  • Community-driven fork of Compiere
  • 1.1 million lines of code
  • 2631 DB calls detected
• DJI
  • GUI-based interview application
  • 2 student assistants, 1 month
  • 12,600 lines of code

Compiere results: categorization

Queries categorized by complexity:
1. Known value: 67%
2. Decisions: 15%
3. Unknown parts: 12%
4. Both: 6%

Compiere results: parsability

[Figure: bar chart of statements detected and parsed per category]

Statements   Cat. 1   Cat. 2   Cat. 3   Cat. 4   Total
detected       1466      317      267      128    2180
parsed         1405      310      243      114    2072
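
The share of detected statements that could be parsed follows directly from the chart values. The sketch below assumes one particular reading of the chart, namely that the larger value in each category is "detected" and the smaller one "parsed"; the class name ParseRates is hypothetical.

```java
// Hypothetical helper: parse rate per category, under the stated reading of the chart.
class ParseRates {
    static final String[] LABELS   = {"Cat. 1", "Cat. 2", "Cat. 3", "Cat. 4", "Total"};
    static final int[]    DETECTED = {1466, 317, 267, 128, 2180};
    static final int[]    PARSED   = {1405, 310, 243, 114, 2072};

    /** Percentage of detected statements that were parsed, rounded to one decimal. */
    static double rate(int parsed, int detected) {
        return Math.round(1000.0 * parsed / detected) / 10.0;
    }

    public static void main(String[] args) {
        for (int i = 0; i < LABELS.length; i++) {
            System.out.println(LABELS[i] + ": " + rate(PARSED[i], DETECTED[i]) + "% parsed");
        }
    }
}
```

Under this reading, roughly 96% of category 1 statements parse, against about 91% for category 3 and 89% for category 4, so the categories with unknown parts appear to be the hardest to parse.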

Compiere: parse errors

• Analysis errors: 54
• Incomplete SQL parser: 5
• Syntax errors: 8

String sql = "SELECT * " +
             "FROM tableX " +
             "WHERE value IS NOT NULL" +
             "ORDER BY value";
stmt.executeQuery(sql);
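
The slide does not say which error class this snippet illustrates; the sketch below assumes it is a syntax error caused by the missing space between two concatenated parts. The class and method names are hypothetical, and the repair shown is only one possible fix.

```java
// Hypothetical demo of the concatenation above; no database is needed to see the problem.
class MissingSpaceDemo {

    // Same concatenation as on the slide: no space between "NULL" and "ORDER".
    static String asWritten() {
        return "SELECT * " + "FROM tableX " + "WHERE value IS NOT NULL" + "ORDER BY value";
    }

    // One possible repair (assumption): end the WHERE part with a space.
    static String repaired() {
        return "SELECT * " + "FROM tableX " + "WHERE value IS NOT NULL " + "ORDER BY value";
    }

    public static void main(String[] args) {
        System.out.println(asWritten());
        System.out.println(repaired());
    }
}
```

The Java compiler accepts both versions, because to it the query is just a string. The defect only shows up once the concatenated text is handed to an SQL parser or to the database, which is why the embedded fragments need their own analysis.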

Compiere: quality metrics

[Figure: quality metrics; panels: Java, SQL]

Compiere: Dependencies

Global dependencies

Compiere: Dependencies

Database dependencies

Code duplication between files

[Figure panels: Adempiere, Compiere]

Code duplication: DB error handling

[Figure panels: Adempiere, Compiere]

Case study: DJI

• GUI-based interview application
• 2 student assistants, 1 month
• 12,600 lines of code
• Prototype

DJI: Dependencies

Internal dependencies

DJI: GUI & DB dependencies

[Figure panels: Database, GUI]

Conclusions

• Analysis of embedded languages can be done

• Tools need optimizations for optimal results

• Useful: results give insight into the structure

• Assessment remains a job for a human

• No 100% analysis possible

Conclusions: answer 1

• How can the combination of embedded language fragments and host language statements be analyzed?

Language dependent:
− COBOL: recognize extended language
− Java: interpret language

No generic approach, each language needs specialized tools

Conclusions: answer 2

• When analyzing embedded language fragments and host language statements, can relevant metrics be calculated and dependencies be analyzed?

− Complexity of embedded code
− Interaction between host- and guest-language
− Dependencies between host-language and database
− Relevant insight

Current work: Evolution

• Compiere and Adempiere compared over time
  • Compiere: 5 versions between 2006 and 2008
  • Adempiere: 16 versions between 2006 and Nov 2009

• Gain confidence in proposed interaction metrics

Evolution: Categories

2631 fragments

2056 fragments

Evolution: outcome

• Analysis of either Java or SQL alone is not enough
• Categorization distinguishes case studies
• Using this:
  • Insight into evolution of case studies
  • Insight into complexity of case studies

Current work: Hibernate

• Analysis of Hibernate categorization performed
• 3 days to make modifications
• Vast differences between 3 case studies

Compared by files using DB system:

[Figure: one chart each for OpenPM, OpenBravo and Plazma; legend: JDBC, SQLC, Hibernate]

Future work

• Validation of metrics
• Analysis of C#/ADO.NET

Questions?
