Dipartimento di Ingegneria e Scienze dell’Informazione e Matematica
Università degli Studi dell’Aquila
Semantic-based Model Matching with EMFCompare
Davide Di Ruscio
@ddiruscio
Models and Evolution Workshop at MoDELS 2016 – October 2, 2016 – Saint-Malo, France
ME‘16 – October 2, 2016 – Saint-Malo, France
Joint work with
Alfonso Pierantonio, University of L’Aquila (Italy)
Ludovico Iovino, Gran Sasso Science Institute (Italy)
Juri Di Rocco, University of L’Aquila (Italy)
Lorenzo Addazi, Mälardalen University (Sweden)
Antonio Cicchetti, Mälardalen University (Sweden)
Introduction
Model comparison is one of the most challenging operations in MDE
It underpins a wide range of modelling activities
• E.g., model versioning, evolution, collaborative modelling, …
Calculating model differences relies on the model matching problem
• It can be reduced to the problem of finding correspondences between two given graphs (Graph Isomorphism Problem, NP-Hard)
Introduction
[Figure: two versions of a graph model. Version 1 has nodes a, b, c, d, e, f; Version 2 has nodes a, c, d, e, k, l, m]
Establish correspondences
Calculate differences
> Rename node b as k
> Rename node f as l
> Add node m
> Add edge from k to m
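The two steps above (establish correspondences, then calculate differences) can be illustrated with a minimal Python sketch. It assumes the correspondences are already available as a mapping from Version 1 node names to Version 2 node names. The edge sets are hypothetical, since the transcript does not preserve the edges of the figure; the function name `calculate_differences` is also an assumption made for illustration.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def calculate_differences(
    nodes_v1: Set[str], edges_v1: Set[Edge],
    nodes_v2: Set[str], edges_v2: Set[Edge],
    correspondences: Dict[str, str],
) -> List[str]:
    """Derive edit operations from a node correspondence (v1 name -> v2 name)."""
    diffs: List[str] = []
    # Matched nodes whose name changed are reported as renames.
    for old, new in sorted(correspondences.items()):
        if old != new:
            diffs.append(f"Rename node {old} as {new}")
    # Unmatched nodes on either side are deletions or additions.
    for node in sorted(nodes_v1 - set(correspondences)):
        diffs.append(f"Delete node {node}")
    for node in sorted(nodes_v2 - set(correspondences.values())):
        diffs.append(f"Add node {node}")
    # Translate v1 edges into v2 names before comparing the edge sets.
    # Edges touching an unmatched v1 node disappear together with that node.
    mapped = {(correspondences[s], correspondences[t])
              for s, t in edges_v1
              if s in correspondences and t in correspondences}
    for s, t in sorted(mapped - edges_v2):
        diffs.append(f"Delete edge from {s} to {t}")
    for s, t in sorted(edges_v2 - mapped):
        diffs.append(f"Add edge from {s} to {t}")
    return diffs


# Node names come from the slide; the edges are hypothetical.
nodes_v1 = {"a", "b", "c", "d", "e", "f"}
edges_v1 = {("a", "b"), ("a", "f"), ("b", "c"), ("f", "e"), ("e", "d")}
nodes_v2 = {"a", "k", "l", "c", "e", "d", "m"}
edges_v2 = {("a", "k"), ("a", "l"), ("k", "c"), ("l", "e"), ("e", "d"), ("k", "m")}
correspondences = {"a": "a", "b": "k", "c": "c", "d": "d", "e": "e", "f": "l"}

diffs = calculate_differences(nodes_v1, edges_v1, nodes_v2, edges_v2, correspondences)
for line in diffs:
    print(">", line)
```

With these inputs the sketch reports the same four differences listed on the slide. The hard part, finding the correspondences themselves, is exactly the matching problem the rest of the talk addresses.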
Model matching
Static Identity-Based Matching: each model element has a persistent unique identifier that is assigned to it upon creation
Signature-Based Matching: the identifier of each model element is dynamically calculated by combining the values of its features
Similarity-Based Matching: models are typed attribute graphs and matching elements are identified by considering the aggregated similarity of their features.
Language-Specific Matching: matching algorithms are tailored to a particular modelling language
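As a rough illustration of the first two strategies, the sketch below pairs elements by a key function: either a persistent identifier or a signature computed from feature values. The `Element` class, its fields, and the choice of `kind:name` as signature are assumptions made for this example, not part of any tool's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Element:
    uid: str   # persistent identifier assigned at creation (hypothetical field)
    kind: str  # e.g. "EClass" or "EAttribute"
    name: str


def match_by_key(left: List[Element], right: List[Element],
                 key: Callable[[Element], str]) -> List[Tuple[Element, Element]]:
    """Pair elements whose key (static id or computed signature) is equal."""
    index: Dict[str, Element] = {key(e): e for e in right}
    return [(e, index[key(e)]) for e in left if key(e) in index]


def static_identity(e: Element) -> str:
    return e.uid


def signature(e: Element) -> str:
    # Signature dynamically computed from feature values.
    return f"{e.kind}:{e.name}"


left = [Element("id-1", "EClass", "Student"), Element("id-2", "EAttribute", "username")]
right = [Element("id-1", "EClass", "User"), Element("id-7", "EAttribute", "username")]

by_id = match_by_key(left, right, static_identity)
by_signature = match_by_key(left, right, signature)
print([(l.name, r.name) for l, r in by_id])         # renamed class still matched
print([(l.name, r.name) for l, r in by_signature])  # regenerated id still matched
```

The example shows the trade-off: identifiers survive renames but not re-creation of an element, while signatures survive re-creation but not changes to the features they are built from.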
Similarity-based matching
Extensible
• Static identity-based or signature-based matching can also be added by defining custom generator functions
The default match engine
The Levenshtein distance algorithm is applied to the string representation of the elements
• For optimisation purposes, the models are compared by considering only the elements selected within a suitable search window
...
foreach (elM1 : Model1.getElements())
    foreach (elM2 : elM1.getWindowElements())
        result[elM1][elM2] = calculateSimilarity(elM1, elM2)
return createMatches(result)
...
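The loop above can be approximated in Python as follows. This is only a sketch under stated assumptions: elements are reduced to their string representation, the search window is taken to be positional (elements of the second model within `window` positions of the current index), and the `threshold` value is hypothetical. None of these details are given in the slides, and the actual EMFCompare engine may select its window differently.

```python
from typing import Dict, List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                 # deletion
                               current[j - 1] + 1,              # insertion
                               previous[j - 1] + (ca != cb)))   # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalise the distance into [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def match_elements(model1: List[str], model2: List[str],
                   window: int = 2, threshold: float = 0.5) -> Dict[str, Tuple[str, float]]:
    """For each element of model1, keep the best candidate inside its search window."""
    matches: Dict[str, Tuple[str, float]] = {}
    for i, el1 in enumerate(model1):
        lo, hi = max(0, i - window), min(len(model2), i + window + 1)
        best = max(((el2, similarity(el1, el2)) for el2 in model2[lo:hi]),
                   key=lambda pair: pair[1], default=None)
        if best is not None and best[1] >= threshold:
            matches[el1] = best
    return matches


matches = match_elements(["Student", "username", "password"],
                         ["User", "username", "password"])
print(matches)
```

Restricting candidates to a window avoids comparing every pair of elements, at the cost of missing matches for elements that moved far from their original position.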
A meta-model evolution scenario
A University theses management metamodel
• Extract super class
• Attribute renaming
Contextual issues: limited consideration of the features characterising the elements surrounding/containing the compared one
Linguistic issues: lack of semantic evaluation of the features characterising the compared elements
• False negatives, e.g., renaming a given class using a syntactically different name
• False positives, e.g., renaming a given class using a semantically different term that nonetheless has a strong syntactic similarity
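The two linguistic issues can be reproduced with a purely syntactic measure. The sketch below uses a normalised, case-insensitive Levenshtein similarity; the word pairs and the 0.5 threshold are hypothetical examples chosen for illustration and do not come from the slides.

```python
from functools import lru_cache


def syntactic_similarity(a: str, b: str) -> float:
    """Normalised Levenshtein similarity in [0, 1], case-insensitive."""
    a, b = a.lower(), b.lower()

    @lru_cache(maxsize=None)
    def dist(i: int, j: int) -> int:
        # Edit distance between the prefixes a[:i] and b[:j].
        if i == 0 or j == 0:
            return i + j
        return min(dist(i - 1, j) + 1,
                   dist(i, j - 1) + 1,
                   dist(i - 1, j - 1) + (a[i - 1] != b[j - 1]))

    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - dist(len(a), len(b)) / longest


THRESHOLD = 0.5  # hypothetical cut-off, not taken from the slides

# Same concept, different spelling: rejected by a syntactic matcher (false negative).
print(syntactic_similarity("Student", "Pupil") >= THRESHOLD)        # False
# Different concepts, near-identical spelling: accepted (false positive).
print(syntactic_similarity("Professor", "Processor") >= THRESHOLD)  # True
```

A semantic measure would ideally invert both outcomes, which is the motivation for bringing a lexical resource into the match engine.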
Proposed approach
Semantic Match Engine
• Use of the WordNet lexical dictionary as ontological source
WordNet in a nutshell
Lexical database for the English language
English words are grouped into sets of synonyms (synsets)
Each synset includes
- a generic definition joining the contained words
- semantic relationships connecting it to other synsets
http://www.cs.princeton.edu/courses/archive/fall16/cos226/assignments/wordnet.html
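To make the synset structure concrete, here is a small Python sketch of a lexical database with synonyms, a definition, and links to more general synsets, plus a path-based word similarity. The three entries are hand-written toy data, not actual WordNet content, and the `1 / (1 + path length)` measure is an assumption: the slides do not say which semantic distance the approach uses.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Synset:
    name: str
    words: List[str]       # synonyms grouped in this synset
    definition: str        # gloss shared by the words
    hypernyms: List[str] = field(default_factory=list)  # links to more general synsets


# Toy, hand-written entries for illustration only; NOT actual WordNet content.
LEXICON: Dict[str, Synset] = {s.name: s for s in [
    Synset("person", ["person", "individual"], "a human being"),
    Synset("learner", ["student", "pupil"], "a person who is learning", ["person"]),
    Synset("teacher", ["teacher", "instructor"], "a person who teaches", ["person"]),
]}


def synsets_of(word: str) -> List[Synset]:
    return [s for s in LEXICON.values() if word in s.words]


def neighbours(name: str) -> List[str]:
    up = LEXICON[name].hypernyms
    down = [s.name for s in LEXICON.values() if name in s.hypernyms]
    return up + down


def path_length(a: str, b: str) -> Optional[int]:
    """Shortest number of relation hops between two synsets (breadth-first search)."""
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def word_similarity(w1: str, w2: str) -> float:
    """Best 1 / (1 + path length) over all synset pairs; 0.0 if unknown or unrelated."""
    best = 0.0
    for s1 in synsets_of(w1):
        for s2 in synsets_of(w2):
            d = path_length(s1.name, s2.name)
            if d is not None:
                best = max(best, 1.0 / (1 + d))
    return best


print(word_similarity("student", "pupil"))    # same synset
print(word_similarity("student", "teacher"))  # related through "person"
```

Words in the same synset score 1.0, while words connected only through a shared, more general synset score lower. Words missing from the dictionary score 0.0, which hints at why the choice of dictionary matters for the artifacts being compared.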
The proposed semantic model matching
function createMatches(Comparison comparison, List leftEObjects, List rightEObjects) {
    SemanticMatch root = createSemanticMatch(null, null);
    exploreMatches(root, leftEObjects, rightEObjects);  // Exploration
    evaluateMatches(root);                              // Evaluation
    filterMatches(root, comparison);                    // Filtering
}
Exploration: a labelled graph representation of the compared models is produced
• each node represents a semantic match
• each incoming or outgoing labelled edge represents a connection with its parent or child elements
[Figure: SemanticMatch graph produced by the exploration phase; similarity values are not yet computed]
Type: EAttribute, Source: Student.username, Target: User.password, Sim: null
Type: EClass, Source: Student, Target: User, Sim: null
Type: EClass, Source: Student, Target: Student, Sim: null
Type: EAttribute, Source: Student.password, Target: User.password, Sim: null
Type: EAttribute, Source: Student.password, Target: User.username, Sim: null
Type: EAttribute, Source: Student.username, Target: User.username, Sim: null
Evaluation: each SemanticMatch node is annotated with the semantic distance value between the encapsulated elements
[Figure: SemanticMatch graph after the evaluation phase, with a similarity value on each node]
Type: EAttribute, Source: Student.username, Target: User.password, Sim: 0.2
Type: EClass, Source: Student, Target: User, Sim: 0.4
Type: EClass, Source: Student, Target: Student, Sim: 1
Type: EAttribute, Source: Student.password, Target: User.password, Sim: 0.6
Type: EAttribute, Source: Student.password, Target: User.username, Sim: 0.2
Type: EAttribute, Source: Student.username, Target: User.username, Sim: 0.6
Filtering: the set of SemanticMatch elements is filtered with respect to a predefined threshold
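The exploration, evaluation, and filtering phases can be sketched end to end in Python on the Student/User example. This is not the authors' implementation: the similarity values are simply looked up from the figures in the slides rather than computed from WordNet, the 0.5 threshold is hypothetical (the slides only say "predefined"), and the sketch returns the kept pairs instead of adding them to an EMFCompare `Comparison` object.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class SemanticMatch:
    kind: str
    source: Optional[str]
    target: Optional[str]
    sim: Optional[float] = None
    children: List["SemanticMatch"] = field(default_factory=list)


# Classes and their attributes, reconstructed from the match nodes on the slides.
LEFT: Dict[str, List[str]] = {"Student": ["username", "password"]}
RIGHT: Dict[str, List[str]] = {"Student": [], "User": ["username", "password"]}

# Similarity values copied from the slides, standing in for a semantic measure.
SCORES: Dict[Tuple[str, str], float] = {
    ("Student", "Student"): 1.0,
    ("Student", "User"): 0.4,
    ("Student.username", "User.username"): 0.6,
    ("Student.username", "User.password"): 0.2,
    ("Student.password", "User.password"): 0.6,
    ("Student.password", "User.username"): 0.2,
}


def explore(root: SemanticMatch, left, right) -> None:
    """Exploration: one node per candidate pair of same-typed elements."""
    for lc, lattrs in left.items():
        for rc, rattrs in right.items():
            node = SemanticMatch("EClass", lc, rc)
            node.children = [SemanticMatch("EAttribute", f"{lc}.{la}", f"{rc}.{ra}")
                             for la in lattrs for ra in rattrs]
            root.children.append(node)


def evaluate(node: SemanticMatch, score: Callable[[str, str], float]) -> None:
    """Evaluation: annotate every non-root node with its similarity value."""
    if node.source is not None:
        node.sim = score(node.source, node.target)
    for child in node.children:
        evaluate(child, score)


def filter_matches(node: SemanticMatch, threshold: float) -> List[Tuple[str, str]]:
    """Filtering: collect the pairs whose similarity reaches the threshold."""
    kept = []
    if node.sim is not None and node.sim >= threshold:
        kept.append((node.source, node.target))
    for child in node.children:
        kept.extend(filter_matches(child, threshold))
    return kept


def create_matches(left, right, score, threshold: float = 0.5):
    root = SemanticMatch("root", None, None)
    explore(root, left, right)
    evaluate(root, score)
    return filter_matches(root, threshold)


result = create_matches(LEFT, RIGHT, lambda s, t: SCORES[(s, t)])
print(result)
```

With the assumed threshold, the Student class matches itself and its two attributes match their counterparts in User, even though the Student/User class pair itself falls below the cut-off.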
Experiments
The Model Exchange Benchmark
• 5 structural modelling languages
• All the possible pairs of metamodels are given as input to:
  • Semantic EMFCompare
  • EMFCompare
  • GAMMA(*)
  • Coma++, FOAM, Crosi, Alignment API, AMW
(*) M. Kessentini, A. Ouni, P. Langer, M. Wimmer, and S. Bechikh, “Search-based metamodel matching with structural and syntactic measures,” J. Syst. Softw., vol. 97, no. C, pp. 1–14, Oct. 2014.
Experiments
Measures
• Precision: the percentage of correctly matched elements with respect to all the proposed matches
• Recall: the percentage of correctly matched elements with respect to all the expected matches
• F-Measure: combines Precision and Recall into an equally weighted average of the two measures
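The formulas themselves are not preserved in this transcript, so the sketch below uses the standard definitions, assuming the F-Measure is the balanced harmonic mean of Precision and Recall (F1). The example matches are hypothetical and only illustrate a matcher that proposes more matches than expected.

```python
from typing import Set, Tuple

Match = Tuple[str, str]


def precision_recall_f(proposed: Set[Match], expected: Set[Match]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for a set of proposed matches."""
    correct = len(proposed & expected)
    precision = correct / len(proposed) if proposed else 0.0
    recall = correct / len(expected) if expected else 0.0
    # Balanced F-measure: harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical data: every expected match is found, plus one spurious match.
expected = {("Student", "Student"), ("username", "username"), ("password", "password")}
proposed = expected | {("Student", "User")}

p, r, f = precision_recall_f(proposed, expected)
print(f"precision={p:.2f} recall={r:.2f} f-measure={f:.2f}")
```

Proposing extra matches leaves Recall untouched but lowers Precision, so the F-Measure drops as well.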
Experiments
GAMMA provides the best results with respect to Precision, Recall, and F-Measure
GAMMA uses search-based software engineering (SBSE) approaches and needs to be initialised with a set of initial solutions (knowledge base)
Experiments
Semantic EMFCompare:
• produces more matches than expected
• in some cases has lower Precision than EMFCompare
• has a lower F-Measure than EMFCompare in only one case
Lessons learnt
Extending EMFCompare with semantic aspects can be done in a lightweight manner
Increased matching power can come at the price of increased imprecision (more false positives and false negatives)
The selection of the appropriate dictionary (depending on the artifacts to be compared) can make the difference
• Comparing metamodels is semantically different from comparing models of specific domains
Performing experiments can be an issue due to the lack of models to be used as test cases
• Existing model mutation approaches should be extended to implement “semantics-aware” mutations
Conclusion and Future Work
Model comparison is a very complex task
It underpins the management of a wide range of (meta-)model (co-)evolution scenarios
An extension of the EMFCompare tool has been proposed to enable “semantics-aware” matches
Further experiments will be performed by considering the application of different dictionaries depending on the kinds of artifacts to be matched