knowledge extraction and analysis of software systems for

33
Knowledge Extraction and Analysis of Software Systems for Modernization and Risk Management: A Standards Approach Rama S. Moorthy, CEO Mike Oara, CTO Hatha Systems

Upload: others

Post on 03-Feb-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Knowledge Extraction and Analysis of Software Systems for

Knowledge Extraction and Analysis of Software Systems

for Modernization and Risk Management: A Standards Approach

Rama S. Moorthy, CEOMike Oara, CTOHatha Systems

Page 2: Knowledge Extraction and Analysis of Software Systems for

Issues raised in Managing Operational Software Management

• Legacy and operational code continues to grow

• Staying current for functional, security and compliance reasons does not scale to address amount of existing code

• Various attempts in tools development (non‐standard) seem to address a very narrow field of knowledge extraction (e.g. Business rules, Code quality weaknesses, code security weaknesses) 

• System Integrators, despite high revenue generation, are frustrated with the lack of success in modernization

• CIOs (industry and government) continue to rank risk managing operational systems as one of their top 2 priorities two years running. 

10/23/2012 © Hatha Systems™, LLC 2

Page 3: Knowledge Extraction and Analysis of Software Systems for

Industry and Standards progress for managing operational software

• In an attempt to bridge the knowledge gap, Industry through standards innovation invested in the development of a number of standards– Knowledge Discovery Metamodel (KDM)

– Business Process Modeling Notation (BPMN)

– Semantics of Business Vocabulary and Business Rules (SBVR)

– Rules Interface Framework (RIF)

– Resource Description Framework (RDF)

• Significant efforts were made toward interoperability as later standards were developed  

• Focus on common data format enabling ease of movement from one analysis environment to another

10/23/2012 © Hatha Systems™, LLC 3

Page 4: Knowledge Extraction and Analysis of Software Systems for

Static Analysis Helping to Manage Software Systems’ Risk

10/23/2012 © Hatha Systems™, LLC 4

Knowledge Refinery (KDM)Extraction of code characteristics| Detailed Design| Architecture| Business Terms & Rules

Fort

ify

Cov

erity

Vera

code

Gra

mm

aTe

ch

Open Source Code Weakness Analysis

Tools

Non

-Sta

ndar

dC

ode

Wea

knes

s A

naly

sis

Tool

s (S

ourc

e an

d B

inar

y)

KDM NativeBinary

Code Weakness Analysis Tools

KDM NativeSource Code Extraction

(e.g. COBOL, Java, C, C++, C#, PL1, assembler,

Fortran, ADA, Mumps, etc.)

Business Process Extraction and Analysis(BPMN)

Business Process Simulation (BPMN) Code Generation

for migration(KDM/UML)

System Function Simulation

(KDM)

Compliance(kPath/SBVR)

(FISMA, CC, FIPS, CWE)

Rules Engines(RIF)

XML Schema Mapping

Code QualityAnalysis

Tools

Page 5: Knowledge Extraction and Analysis of Software Systems for

The StandardsKnowledge Extraction and Analysis

10/23/2012 © Hatha Systems™, LLC 5

Page 6: Knowledge Extraction and Analysis of Software Systems for

™Standard models: KDM

• Knowledge Discovery Metamodel (KDM): an ISO/OMG Standard providing ontology (a set of definitions) for system knowledge extraction and analysis.  KDM provides a framework for the capture of code, platform and other software system characteristics. This further allows the extraction of data flows, control flows, architectures, business/operational rules, business/operational terms, and the derivation of business/operational process; the extraction can be delivered from source, binary, or byte code.  Additionally the intermediate representation of the extraction is in executable models creating the possibility of simulation and code generation. 

10/23/2012 © Hatha Systems™, LLC 6

Page 7: Knowledge Extraction and Analysis of Software Systems for

™Standard models: SBVR

• SBVR (Semantics of Business Vocabulary and Business Rules): An ISO/OMG standard, this specification provides a structured process for formalizing, in natural language, the existing English language representation of compliance points.   The standard enables the various compliance specifications (e.g. FISMA, HIPAA, SOX, FIPs, CWEs, etc) to be formalized reducing the room for interpretation from organization to organization when implementing the compliance and auditing requirements. 

10/23/2012 © Hatha Systems™, LLC 7

Page 8: Knowledge Extraction and Analysis of Software Systems for

™Standard models: BPMN

• Business Process Modeling Notation (BPMN): an OMG standard delivering a modeling notation used to capture business/operational processes in support of system and organizational process simulation and analysis.  It is used today to capture both human and IT system processes for the purposes of simulating environments both ‘as is’ and ‘to be’ for software modernization.   This notation is compatible with KDM so that system extraction can be represented in BPMN for gap analysis of the current state of the system vs. what is thought to be the current state of the system – critical for modernization and compliance.

10/23/2012 © Hatha Systems™, LLC 8

Page 9: Knowledge Extraction and Analysis of Software Systems for

Standard modelsRDF, RDBMS, XMI

• Data/Metadata Storage Standards:  With the emergence of the standards noted above and the need for storing this information for analysis, a set of storage standards needed to be embraced.   XMI, RDMBS, and RDF (Resource Description Framework) are the three formats that are compatible with these standards.   – RDF ‐ perhaps the least known of them ‐ is a W3C standard that is compatible with KDM 

and BPMN.   There is a specific approach in the standard called RDF triple store which is currently being used in semantic web applications.  The value of RDF is that it can manage large amounts of data and metadata which is critical for doing comprehensive static analysis.  

10/23/2012 © Hatha Systems™, LLC 9

Page 10: Knowledge Extraction and Analysis of Software Systems for

What knowledge does Integrating the Standards 

deliver?

10/23/2012 © Hatha Systems™, LLC 10

Page 11: Knowledge Extraction and Analysis of Software Systems for

© Hatha Systems, LLC, Aug 2010

Automated Extraction of System  Architecture(KDM Standard)

Automated Extraction of Business Processes (BPMN Standard)

Business Transaction Process Modeling/Simulation(BPMN Standard)

Automated Extraction of Business Rules(KDM Standard)

Extraction of system Data Flows and 

Control Flows (KDM Standard) 

Automated Extraction of Code WeaknessesWithin System

(KDM Standard, etc)

Enterprise SystemStatic Analysis 

(Modernization, System Integration,  Security, Compliance, Assurance) 

Common Repository/ Common Data Format

Standard Querying Language

Legend: • KDM: Knowledge Discovery Metamodel ISO standard• RDF:  Resource Description Framework – W3C standard• BPMN: Business Process Modeling Notation – OMG standard• SBVR: Semantics of Business Vocabulary and Rules ‐ ISO standard

Extraction of Configuration Data  (including black box 

interface data)(KDM Standard)

Inputs  (BPMN & SBVR) into Analysis Environment (KDM)  for  gaps and compliance analysis

Formal Representation of Semantic Controls (FISMA, CC, FIPs, CWE, HIPAA etc) 

(SBVR Standard)

Data Store using XMI, RDF, RDBMS; Analytics/Correlation

Business Rules Analysis (RIF Standard)

Enterprise Software Systems Static Analysis Standards Mapping

Page 12: Knowledge Extraction and Analysis of Software Systems for

Static Analysis & StandardsA Scenario

10/23/2012 © Hatha Systems™, LLC 12

Cobol/CICS/DB2

C/C++

Java 

KDM Repository

ParsingTool A

Tool B

Tool C

Analysis

Architecture

Common Code Weaknesses

ExtractionBusiness rules

(Business Rules Eng)

Processes(Simulation)

Tool D

Tool E

Tool F

Tool GC# 

…….

Page 13: Knowledge Extraction and Analysis of Software Systems for

™Static Analysis

Methods Benefits Limitations

Architecture diagramsData flowsControl flowsInvestigative queriesSoftware measurementsPattern discoverySimulation

System understanding in breath and depthDirect discovery of non‐compliancePinpointing errors in the code, architecture, processProof of complianceNo “guesswork” testing

No performance dataSome run‐time conditions may not be capturedSource code is required

10/23/2012 © Hatha Systems™, LLC 13

Source code Repository model

Parsing

Comprehensive System DiagramsUser Definable QueriesBoth comprehensive and Custom/specialized analysis tools

Analysis

Page 14: Knowledge Extraction and Analysis of Software Systems for

™Integrated Analysis Environment

10/23/2012 © Hatha Systems™, LLC 14

Business Process Simulation (BPMN)

Functional Simulation (KDM)

Rules extraction forAnalysis or 

Rules engine use(RIF)

SystemDocumentation

(KDM)

Compliance(kPath/SBVR)

(FISMA, CC, FIPSHIPPA, SOX)(SBVR/kPath)

Simulation(BPMN/KDM)

Security Analysis:Weakness & Criticality

(KDM/kPath)

Generation (Code, D/B schema,WSDL, Service Definitions)(KDM/UML)

Software System Static Analysis Platform(s) Extraction of Data Flows| Control Flows| Architecture| Business Rules|

Code/Architecture/Process Weaknesses| Business Processes| System Compliance|KDM based platform – Querying environment (kPath or SparQL)

Storage Standard (XMI, RDMBS, RDF)

Extraction

Automated Analysis

Parsing

Legacy Applications (COBOL/CICS/JCL/IMS/D

B2/PL1/VB6, etc)

Java Application

Java/C++ Program

Others (ADA, Fortran, etc) 

C#/VBProgram

Standards based analyses and migration solutionsStandards based Analysis Platform

Legend

Software Systems & Applications (COTS, GOTS, Custom)

C Program

COBOL/C 

Program…

Page 15: Knowledge Extraction and Analysis of Software Systems for

Software System Knowledge using Standards‐Based Tools

Examples of Extractable Static Knowledge Retrievable in Java and 

COBOL Code

10/23/2012 © Hatha Systems™, LLC 15

Page 16: Knowledge Extraction and Analysis of Software Systems for

™Static Analysis: Data Flows

Definition Compliance applications

Shows dependencies between variables

What data appears in a user interface? (HIPAA)Was data encrypted before being stored or transmitted?

10/23/2012 © Hatha Systems™, LLC 16

Page 17: Knowledge Extraction and Analysis of Software Systems for

™ Data Flow

10/23/2012 © Hatha Systems™, LLC 17

Data flow variant (in a Cobol program)

Page 18: Knowledge Extraction and Analysis of Software Systems for

™Static Analysis: Control Flow

Definition Compliance applications

Shows flow dependencies between statements

Is data validated before being processed?What conditions lead to a particular action? (example: approval of a request)

10/23/2012 © Hatha Systems™, LLC 18

Page 19: Knowledge Extraction and Analysis of Software Systems for

™ Call Maps

10/23/2012 © Hatha Systems™, LLC 19

Page 20: Knowledge Extraction and Analysis of Software Systems for

™Static Analysis: Platform Diagrams

10/23/2012 © Hatha Systems™, LLC 20

Definition Compliance applications

A diagram of relationshipsbetween runtime elements of the application (programs, transactions, tables, files, queues, etc.)

Is the application properly layered?  Are all data access methods done through an API layer?Which elements of the application work with confidential or secret information?

Page 21: Knowledge Extraction and Analysis of Software Systems for

™ Business Rules Report

10/23/2012 © Hatha Systems™, LLC 21

Page 22: Knowledge Extraction and Analysis of Software Systems for

™Database Schema – Table References

10/23/2012 © Hatha Systems™, LLC 22

Page 23: Knowledge Extraction and Analysis of Software Systems for

™ User Navigation

10/23/2012 © Hatha Systems™, LLC 23

Example: JSP application

Page 24: Knowledge Extraction and Analysis of Software Systems for

™ Package Dependency

10/23/2012 © Hatha Systems™, LLC 24

Page 25: Knowledge Extraction and Analysis of Software Systems for

™ Architecture Extraction

Architecture Colorized by Layers – client facing, data facing, mixed

Page 26: Knowledge Extraction and Analysis of Software Systems for

Source Code Version Differential Analysis(Metadata)

10/23/2012 © Hatha Systems™, LLC 26

Comparing two version of the same source using a differential analysisTechnique to show architecture, business layer, design layer differencesWhat changed? What is impacted by the change?

Page 27: Knowledge Extraction and Analysis of Software Systems for

™Rules Extraction Coverage Report

10/23/2012 © Hatha Systems™, LLC 27

Shows the number of rules for each function in the application

Page 28: Knowledge Extraction and Analysis of Software Systems for

™ Weaknesses 

10/23/2012 © Hatha Systems™, LLC 28

Deep nested ‘ifs’ increasing the cost of maintenanceNot to mention quality and security issues

Page 29: Knowledge Extraction and Analysis of Software Systems for

™ Program Access to Files or Tables

10/23/2012 © Hatha Systems™, LLC 29

Page 30: Knowledge Extraction and Analysis of Software Systems for

Example Commercial Solution:Hatha Systems Knowledge Refinery 

Platform

10/23/2012 © Hatha Systems™, LLC 30

Page 31: Knowledge Extraction and Analysis of Software Systems for

Knowledge Refinery Platform:Main Window

10/23/2012 © Hatha Systems™, LLC 31

An Example of a Standards based Commercial Platform

Page 32: Knowledge Extraction and Analysis of Software Systems for

Demonstration

10/23/2012 © Hatha Systems™, LLC 32

Page 33: Knowledge Extraction and Analysis of Software Systems for

Thank YouHatha Systems, LLC

202‐756‐2974www.hathasystems.com

Rama S. [email protected]

Mike [email protected]