model-independent schema and data translation paolo atzeni università roma tre based on tutorial at...

99
Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and P. Bernstein) JISBD 2006 --- Sitges, 5 October 2006

Upload: emily-hines

Post on 03-Jan-2016

220 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

Model-Independent Schema and Data Translation

Paolo AtzeniUniversità Roma Tre

Based onTutorial at ICDE 2006

Paper in EDBT 2006 (with P. Cappellari and P. Bernstein)

JISBD 2006 --- Sitges, 5 October 2006

Page 2: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 2

Outline

• Introduction• Model management• Schema and data translation: the problem• A metamodel based approach

Page 3: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 3

A ten-year goal for database research

• The “Asilomar report”(Bernstein et al. Sigmod Record 1999 www.acm.org/sigmod):– The information utility:

make it easy for everyone to store, organize, access, and analyze the majority of human information online

Page 4: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 4

A general framework: cooperation

• "The capacity of a system to interact (effectively) with other systems, possibly managed by different organizations"

• Forms of cooperation– Process-centered cooperation:

• the systems offer services one another, by exchanging messages, or by triggering activities, without making remote data explicitly visible

– Data-centered cooperation: • the systems offer data one another; data is distributed,

heterogeneous and autonomous

Page 5: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 5

Databases in the Internet era

• Databases before the Internet– An internal infrastructure, a precious resource, but usually

hidden, with some controlled cooperation• Internet changes the requirements

– More users (not only humans), more diverse cooperating systems (distributed, heterogeneous, autonomous), more types of data

• "Future" Internet changes more– New devices (embedded everywhere), even more users

(many “per person”), real mobility, need for personalization and adaptation

Page 6: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 6

The most studied form of data-centered cooperation: integration

• We are interested in data-centered cooperation, often referred to as integration

“The unification of related, heterogeneous data from disparate sources, for example, to enable collaboration” (Hammer & Stonebraker 2005)

• Some "paradigms" …

Page 7: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 7

Multidatabase

Global Manager

Local mgr

DB

Mediator

Local mgr

DB

Mediator

Local mgr

DB

Mediator

Page 8: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 8

Data Warehousing System

Mediator

Local manager

Mediator

Local manager

Mediator

Local manager

“Integrator”

DB DB DB

Data Warehouse

DW manager

Page 9: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 9

Intermediate solutions in practice

Mediator

Local manager

DB

Local manager

DB

Integrator

Mediator

Local manager

Mediator

Local manager

DB DB

DB

Local Manager

Application

Page 10: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 10

Peer-based architecturePeer mgr

Local mgr

DB

Local mgr

DB

Mediator

Mediator

Peer mgr

Local mgr

DB

Local mgr

DB

Mediator

Mediator

Peer mgr

Local mgr

DB

Local mgr

DB

Mediator

Mediator

Page 11: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 11

Data is not just in databases

• Mail messages• Web pages• Spreadsheets• Textual documents• Palmtop devices, mobile phones• Multimedia annotations (e.g., in digital photos)• XML documents

Page 12: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 12

The same data in the same form?

• Adaptivity:– Personalization: content adapted to the user

• upon system's decision• upon user's request

– Customization: structure adapted to the user• according to the user's role• upon user's request

– Context dependence• User, Device, Network, Place, Time, Rate

Page 13: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 13

Summarizing: a general need

• We have data at various places, and data has to be– exchanged– replicated – migrated– integrated – adapted

Page 14: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 14

A major difficulty

• Heterogeneity– System level– Model level– Structural (different structure for similar data)– Semantic (different meaning for the same structure)

• Many efforts, but current techniques are mostly manual and ad hoc

Page 15: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 15

A direction for the solutions

• Be general! – ad hoc solution could work in-the-small, but they

• are repetitive and time consuming • do not scale• are not maintainable

• Historical notes:– W. C. McGee: Generalization: Key to Successful Electronic

Data Processing. J. ACM 1959• Indeed, databases are the result of generalization applied to

secondary storage management!

Page 16: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 16

Generality requires …

• … high-level descriptions of problems within the family of interest:– Metadata:

• “data about data”• (formal or informal) description of structures and

meaning

• General solutions leverage on metadata (and then operate on data as a consequence)

Page 17: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 17

Outline

• Introduction and motivation• Model management• Schema and data translation: the problem• A metamodel based approach

Page 18: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 18

A wider perspective

• (Generic) Model Management:– A proposal by Bernstein et al (2000 +)– Includes a set of operators on

• schemas and • mappings between schemas

Page 19: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 19

Terminology: a warning

Model Mgmt people Traditional DB people

Meta-metamodel Metamodel

Metamodel Model

Model Schema

Page 20: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 20

Schemas and mappings

• A simplified approach:– Schema:

• a set of elements, related in some way to one another– Mapping:

• a set of correspondences (pair of elements) or• its reification, a third schema related to the other two via

two sets of correspondences

Page 21: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 21

Model mgmt operators, a first set

• map = Match (S1, S2) • S3 = Merge (S1, S2, map)• S2 = Diff (S1, map) • and more

– map3 = Compose (map1, map2)

– S2 = Select (S1, pred)

– Apply (S, f)

– list = Enumerate (S)

– S2 = Copy (S1)

– …

Page 22: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 22

Match

• map = Match (S1, S2)– given

• two schemas S1, S2– returns

• a mapping between them• the “classical” initial step in data integration:

– find the common elements of two schemas and the correspondences between them

Page 23: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 23

Merge

• S3 = Merge (S1, S2, map)– given

• two schemas and a mapping between them– returns

• a third schema (and two mappings)• the “classical” second step in data integration:

– given the correspondences, find a way to obtain one schema out of two

Page 24: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 24

Diff

• S2 = Diff (S1, map) – given

• a schema and a mapping from it (to some other schema, not relevant)

– returns • a (sub-)schema, with the elements that do not participate

in the mapping

Page 25: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 25

DW2

Example

(Bernstein and Rahm, ER 2000)• A database (a “source”), a data warehouse and a mapping

between the two• we get a second source, with some similarity to the first one• and we want to update the DW

DB1 DW1

DB2

Page 26: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 26

DW2

Example, the "solution"

DB1 DW1

DB2

m1

m2m3

DB2’

m2 = Match(DB1,DB2)

m3= Compose(m2,m1)

DB2’=Diff(DB2,m3)

DW2’, m4 user defined

m5 = Match(DW1,DW2’)

DW2 = Merge(DW1,DW2’,m5)

DW2’m4

m5

Page 27: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 27

Magic does not exist

• Operators might require human intervention:– Match is the main case

• Scripts involving operators might require human intervention as well (or at least benefit from it):– a full implementation of each operator might not always

available– a mapping might require manual specification– incomparable alternatives might exist

Page 28: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 28

The “data level”

• The major operators have also an extended version that operates on data, and not only on schemas

• Especially apparent for– Merge

Page 29: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 29

We also have heterogeneity

• Round trip engineering (Bernstein, CIDR 2003)– A specification, an implementation– then a change to the implementation: want to revise the

specification• We need a translation from the implementation model to the

specification one

S1

I1 I2

S2

Page 30: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 30

Model management with heterogeneity

• The previous operators have to be “model generic” (capable of working on different models)

• We need a “translation” operator– <S2, map12> = ModelGen (S1)

Page 31: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 31

ModelGen, an additional operator

• <S2, map12> = ModelGen (S1) – given

• a schema (in a model)– returns

• a schema (in a different data model) and a mapping between the two

• A “translation” from a model to another• I should call it “SchemaGen” …• We should better write

– <S2, map12> = ModelGen (S1,mod2)

Page 32: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 32

S2

Round trip engineering

S1

I1

m1

I2

m2 = Match (I1,I2)m3 = Compose (m1,m2)I2’= Diff(I2,m3)<S2’,m4 > = Modelgen(I2’)… Match, Merge

m2

m3

I2’

S2’

m4

Page 33: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 33

Outline

• Introduction• Model management• Schema and data translation: the problem• A metamodel based approach

Page 34: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 34

Schema and data translation, a long standing issue

• Schema translation:– given schema S1 in model M1 and model M2– find a schema S2 in M2 that “corresponds” to S1

• Schema and data translation:– given also a database D1 for S1– find also a database D2 for S2 that “contains the same data”

to D1

Page 35: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 35

Schema and data translation, a long standing issue

• Translations from a model to another have been studied since the 1970’s

• Whenever a new model is defined, techniques and tools to generate translations are studied

• However, proposals and solutions are usually model specific

Page 36: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 36

Model specific solutions

• Given an ER schema, find the suitable relational schema that “implements” it – the original paper (Chen 1976) contains the basics– further discussions by many (e.g. Markowitz and Shoshani

1989)– illustrated in every textbook

• Similarly with – any other conceptual model and any other logical one– XML and relational (or object)

Page 37: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 37

Another problem in the picture: data exchange

• Given a source S1 and a target schema S2 (in different models or even in the same one), find a translation, that is, a function that given a database D1 for S1 produces a database D2 for S2 that “correspond” to D1

• Often emphasized with reference to materialized solutions

Page 38: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 38

Integration

• Given two or more sources, build an integrated schema or database

Page 39: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 39

Schema translation

• Given a schema find another one with respect to some specific goal (better quality, another model, …)

Page 40: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 40

Data exchange

• Given a source and a target schema, find a transformation from the former to the latter

Page 41: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 41

Schema translation and data exchange

• Can be seen a complementary:– Data translation = schema translation + data exchange

• Given a source schema and database• Schema translation produces the target schema• Data exchange generates the target database

• In model management terms we could write– Schema translation:

• <S2, map12> = ModelGen (S1,mod2)– Data exchange:

• i2 = DataGen (S1,i1,S2,map12)

Page 42: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 42

Outline

• Introduction• Model management• Schema and data translation: the problem• A metamodel based approach

Page 43: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 43

 The problem

• ModelGen– given two data models M1 and M2, and a schema S1 of M1

(the source schema and model), • generate a schema S2 of M2 (the target schema and

model), corresponding to S1• and, for each database D1 over S1, generate an

equivalent database D2 over S2

Page 44: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 44

We have been doing this for a while

• Initial work more than ten years ago (Atzeni & Torlone, 1996)• Major novelty recently

– translation of both schemas and data– data-level translations generated, from schema-level ones

• Moreover:– a visible, multilevel and (in part) self-generating dictionary– high-level, visible and customizable translation rules

• in Datalog with OID-invention– mappings between elements generated as a by-product

Page 45: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 45

Heterogeneity

• We need to handle artifacts and data in various models– Data are defined wrt to schemas– Schemas are defined wrt to models – How models can be defined?

Models

Schemas

Data

Page 46: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 46

A metamodel approach

• The constructs in the various models are rather similar:

– can be classified into a few categories (Hull & King 1986):

• Lexical: set of printable values (domain)

• Abstract (entity, class, …)

• Aggregation: a construction based on (subsets of) cartesian products (relationship, table)

• Function (attribute, property)

• Hierarchies

• …

• We can fix a set of metaconstructs (each with variants):

– lexical, abstract, aggregation, function, ...

– the set can be extended if needed, but this will not be frequent

• A model is defined in terms of the metaconstructs it uses

Page 47: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 47

 The metamodel approach, example

• The ER model:– Abstract (called Entity)– Function from Abstract to Lexical (Attribute)– Aggregation of abstracts (Relationship) – …

• The OR model:– Abstract (Table with ID)– Function from Abstract to Lexical (value-based Attribute)– Function from Abstract to Abstract (reference Attribute)– Aggregation of lexicals (value-based Table)– Component of Aggregation of Lexicals (Column)– …

Page 48: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 48

 The supermodel

• A model that includes all the meta-constructs (in their most general forms)– Each model is subsumed by the supermodel (modulo

construct renaming) – Each schema for any model is also a schema for the

supermodel (modulo construct renaming)

Page 49: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 49

A Multi-Level Dictionary

• Handles models, schemas and data• Has both a model specific and a model independent component

Page 50: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 50

Multi-Level Dictionary

model

schema

modelgeneric

modelspecific

des

crip

tion

modelindependence

Supermodel description

(mSM)

Model descriptions

(mM)

Supermodel schemas

(SM)

Model specific schemas

(M)

Supermodel instances

(i-SM)

Model specific instances

(i-M)data

Page 51: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 51

The supermodel description

MSM-Construct

OID Name IsLex

1 AggregationOfLexicals F

2 ComponentOfAggrOfLex T

3 Abstract F

4 AttributeOfAbstract T

5 BinaryAggregationOfAbstracts F

MSM-Property

OID Name Constr. Type

11 Name 1 String

12 Name 2 String

13 IsKey 2 Boolean

14 IsNullable 2 Boolean

15 Type 2 String

16 Name 3 String

17 Name 4 String

18 IsIdentifier 4 Boolean

19 IsNullable 4 Boolean

20 Type 4 String

21 IsFunct1 5 Boolean

22 IsOptional1 5 Boolean

23 Role1 5 String

24 IsFunct2 5 Boolean

25 IsOptional2 5 Boolean

26 Role2 5 String

MSM-Reference

OID Name Construct Target

30 Aggregation 2 1

31 Abstract 4 3

32 Abstract1 5 3

33 Abstract2 5 3

mSMmM

SMM

i-SMi-M

Page 52: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 52

Schemas in the supermodel

SM-Abstract

OID Schema Name

301 1 Employees

302 1 Departments

201 3 Clerks

202 3 Offices

SM-AttributeOfAbstract

OID Schema Name isIdent isNullable Type AbstrOID

401 1 EmpNo T F Int 301

402 1 Name F F Text 301

404 1 Name T F Char 302

405 1 Address F F Text 302

501 3 Code T F Int 201

… … … … … … …

MSM-Construct

OID Name IsLex

… … …

3 Abstract F

4 AttributeOfAbstract T

… … …

Supermodel schemas

mSMmM

SMM

i-SMi-M

Employees

Departments

EmpNo

Name

Name

Address

Page 53: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 53

i-SM-Abstract

OID AbsOID InstOID

3010 301 1

3011 301 1

… … …

3010 301 2

i-SM-AttributeOfAbstract

OID AttrOfAbsOID InstOID Value i- AbsOID

4010 401 1 75432 3010

4020 402 1 John Doe 3010

4021 402 1 Bob White 3011

… … … … …

Instances in the supermodel

SM-Abstract

OID Schema Name

301 1 Employees

302 1 Departments

201 3 Clerks

202 3 Offices

SM-AttributeOfAbstract

OID Schema Name isIdent isNullable Type AbstrOID

401 1 EmpNo T F Int 301

402 1 Name F F Text 301

404 1 Name T F Char 302

405 1 Address F F Text 302

501 3 Code T F Int 201

… … … … … … …Supermodel schemasSupermodel instances

mSMmM

SMM

i-SMi-M

Page 54: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 54

Multi-Level Repository

model

schema

modelgeneric

modelspecific

des

crip

tion

modelindependence

Supermodel description

(mSM)

Model descriptions

(mM)

Supermodel schemas

(SM)

Model specific schemas

(M)

Supermodel instances

(i-SM)

Model specific instances

(i-M)data

Page 55: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 55

Model descriptions

MSM-Construct

OID Name IsLex

1 AggregationOfLexicals F

2 ComponentOfAggrOfLex T

3 Abstract F

4 AttributeOfAbstract T

5 BinaryAggregationOfAbstracts F

MM-Construct

OID Model MSM-Constr Name

1 2 3 ER_Entity

2 2 4 ER_Attribute

3 2 5 ER_Relationship

4 1 1 Rel_Table

5 1 2 Rel_Column

6 3 3 OO_Class

MM-Model

OID Name

1 Relational

2 Entity-Relationship

3 Object

MSM-Property

OID Name Construct Type

… … … …

MSM-Reference

OID Name Construct …

… … … …

MM-Property

… … … …

MM-Reference

… … … …

mSMmM

SMM

i-SMi-M

Page 56: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 56

Multi-Level Repository, generation and use

model

schema

modelgeneric

modelspecific

des

crip

tion

modelindependence

Supermodel description

(mSM)

Model descriptions

(mM)

Supermodel schemas

(SM)

Model specific schemas

(M)

Supermodel instances

(i-SM)

Model specific instances

(i-M)data

Structure fixed, content provided by tool

designers

Structure generated by the tool from the content of mSM

Structure generated by the tool from the content of mSM

Structure fixed, content provided by model

designers out of mSM

Structure generated by the tool from the content of mM

Page 57: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 57

The metamodel approach, translations

• The constructs in the various models are rather similar: – can be classified into a few categories (“metaconstructs'')– translations can be defined on metaconstructs,

• and there are “standard”, accepted ways to deal with translations of metaconstructs

• they can be performed within the supermodel– each translation from the supermodel SM to a target model

M is also a translation from any other model to M:• given n models, we need n translations, not n2 

Page 58: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 58

Generic translation environment

1. Copy

2. Translation

3. Copy

Supermodel

Source model

Target model

Translation:composition 1,2 & 3

Page 59: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 59

Translations within the supermodel

• We still have too many models:– Just within simple ER model versions, we have 4 or 5

constructs, and each has several independent features which give rise to variants

• for example, relationships can be– binary or N-ary– with all possible cardinalities or without many-to-

many– with or without the possibility of specifying optionality– with or without attributes– …

– Combining all these, we get hundreds of models!– The management of a specific translation for each model

would be hopeless

Page 60: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 60

Translations, the approach

• Elementary translation steps to be combined• Each translation step handles a supermodel construct (or a

feature thereof) "to be eliminated" or "transformed"• A translation is the concatenation of elementary translation

steps

Page 61: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 61

A complex translation, example(0,N) (0,N)

• Eliminate N-ary relationships • Eliminate attributes from relationships • Eliminate many-to-many relationships• Replace relationships with references• Eliminate generalizations

Page 62: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 62

Complex translations

Binary ER w/o gen

N-ary ER w/ gen

Binary ERw/ gen

Relational

Bin ER w/ gen w/o attr on rel

OO w/ gen

N-ary ER w/o gen

Bin ER w/o gen w/o attr on rel

Bin ER w/o gen w/o M:N rel

Elim. N-ary relationships Elim. Relationship attr.sElim. M:N relationshipsReplace relationships with

referencesElim OO generalizations Elim ER generalizations

Bin ER w/ gen w/o M:N rel

OO w/o gen…

Page 63: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 63

Translations

• Basic translations are written in a variant of Datalog, with OID invention– We specify them at the schema level– The tool "translates them down" to the data level– Some completion or tuning may be needed

Page 64: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 64

A basic translation

• From (a simple) binary ER model to the relational model– a table for each entity– a column (in the table for E) for each attribute of an entity E– for each M:N relationship

• a table for the relationship• columns …

– for each 1:N and 1:1 relationship:• a column for each attribute of the identifier …

Page 65: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 65

A basic translation application

Departments

Name Address

EmployeesEmpNo

Name

Affiliation

Departments

1,1

0,N

Name

Address

Employees

EmpNo Name Affiliation

Page 66: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 66

A basic translation (in supermodel terms)

• From (a simple) binary ER model to the relational model– an aggregation of lexicals for each abstract– a component of the aggregation for each attribute of abstract– for each M:N aggregation of abstracts …

• …

• From (a simple) binary ER model to the relational model– a table for each entity– a column (in the table for E) for each attribute of an entity E– for each M:N relationship

• a table for the relationship• columns …

– for each 1:N and 1:1 relationship:• a column for each attribute of the identifier …

Page 67: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 67

"An aggregation of lexicals for each abstract"

SM_AggregationOfLexicals(

OID: #aggregationOID_1(OID),

Name: name)

SM_Abstract (

OID: OID,

Name: name ) ;

Page 68: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 68

Datalog with OID invention

• Datalog (informally):– a logic programming language with no function symbols and

predicates that correspond to relations in a database– we use a non-positional notation

• Datalog with OID invention:– an extension of Datalog that uses Skolem functions to

generate new identifiers when needed• Skolem functions:

– injective functions that generate "new" values (value that do not appear anywhere else); so different Skolem functions have disjoint ranges

Page 69: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 69

"An aggregation of lexicals for each abstract"

SM_AggregationOfLexicals(

OID: #aggregationOID_1(OID),

Name: n)

SM_Abstract (

OID: OID,

Name: n ) ;

• the value for the attribute Name is copied (by using variable n)

• the value for OID is "invented": a new value for the function #aggregationOID_1(OID) for each different value of OID, so a different value for each value of SM_Abstract.OID

Page 70: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 70

"An aggregation of lexicals for each abstract"

SM-Abstract

OID Schema Name

301 1 Employees

302 1 Departments

… … …

SM-AttributeOfAbstract

OID Schema Name isIdent isNullable Type AbstrOID

401 1 EmpNo T F Int 301

402 1 Name F F Text 301

… … … … … … …

EmployeesEmpNoName

11

11

Departments1002

Employees1001

SM_AggregationOfLexicals(

OID: #aggregationOID_1(OID),

Name: n)

SM_Abstract (

OID: OID,

Name: n ) ;

3021002

……

1001

SM-AggregationOfLexicals

Schema NameOID

SM-aggregationOID_1_SK

absOIDOID

301

Employees

Page 71: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 71

"A component of the aggregation for each attribute of abstract"

SM_ComponentOfAggregation… ( OID: #componentOID_1(attOID),Name: name, AggrOID: #aggregationOID_1(absOID),IsNullable: isNullable, IsKey: isIdent, Type : type )

←SM_AttributeOfAbstract(

OID: attOID,Name: name,AbstractOID: absOID, IsIdent: isIdent, IsNullable: isNullable , Type : type ) ;

• Skolem functions

– are functions

– are injective

– have disjoint ranges

• the first function "generates" a new value

• the second "reuses" the value generated by the first rule

Page 72: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 72

A component of the aggregation for each attribute of abstract"

SM-Abstract

OID Schema Name

301 1 Employees

302 1 Departments

… … …

EmployeesEmpNoName

11

11

Departments1002

Employees1001

3021002

……

1001

SM-AggregationOfLexicals

Schema NameOID

SM-aggregationOID_1_SK

absOIDOID

301

SM_ComponentOfAggregation… ( OID: #componentOID_1(attOID),Name: name, AggrOID: #aggregationOID_1(absOID),IsNullable: isNullable, IsKey: isIdent, Type : type )

←SM_AttributeOfAbstract(

OID: attOID,Name: name,AbstractOID: absOID, IsIdent: isIdent, IsNullable: isNullable , Type : type ) ;

SM-componentOID_1_SK

absOIDOID

Employees

EmpNo Name

SM-AttributeOfAbstract

OID Schema Name isIdent isNullable Type AbstrOID

401 1 EmpNo T F Int 301

402 1 Name F F Text 301

… … … … … … …

SM-ComponentOfAggregationOfLexicals

Schema Type AggrOIDisNullableisIdentNameOID

Text 1001FFName111004

1001IntFTEmpNo11

1003 401

1004 402

1003

Page 73: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 73

Generating data-level translations

Schema translation

Data translation

• Same environment

• Same language

• High level translation specificationSupermodel description

(mSM)

Supermodel schemas

(SM)

Supermodel instances

(i-SM)

Page 74: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 74

Translation rules, data level

i-SM_ ComponentOfAggregation… (OID: #i-componentOID_1 (i-attOID),i-AggrOID: #i-aggregationOID_1(i-absOID),ComponentOfAggregationOfLexicalsOID:

#componentOID_1(attOID),Value: Value )

←i-SM_AttributeOfAbstract(

OID: i-attOID,i-AbstractOID: i-absOID,AttributeOfAbstractOID: attOID,Value: Value ),

SM_AttributeOfAbstract(OID: attOID,AbstractOID: absOID,Name: attName,IsNullable: isNull,IsID: isIdent,Type: type )

SM_ComponentOfAggregation… ( OID: #componentOID_1(attOID), Name: name, AggrOID:

#aggregationOID_1(absOID),IsNullable: isNullable, IsKey: isIdent, Type : type )

←SM_AttributeOfAbstract(

OID: attOID,Name: name,AbstractOID: absOID, IsIdent: isIdent, IsNullable: isNullable , Type: type ) ;

Page 75: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 75

Correctness

• Usually modelled in terms of information capacity equivalence/dominance (Hull 1986, Miller 1993, 1994)

• Mainly negative results in practical settings that are non-trivial• Probably hopeless to have correctness in general• We follow an "axiomatic" approach:

– We have to verify the correctness of the basic translations, and then infer that of complex ones

Page 76: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 76

Experiments

• A significant set of models– ER (in many variants and extensions)– Relational– OR– XSD– UML

Page 77: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 77

Summary

• ModelGen was studied a few years ago• New interest on it within the "Model management" framework• New approach

– Translation of schema and data– Visible (and in part self generated) dictionary– Visible and modifiable rules– Skolem functions describe mappings

Page 78: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 78

Conclusion

• The ten-year goal of the Asilomar report:– The information utility:

make it easy for everyone to store, organize, access, and analyze the majority of human information online

• A lot of interesting work has been done but …• …integration, translation, exchange are still difficult…• … 2009 is approaching … we are late!

Page 79: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 79

GrazzieGraciesGracias

Grazie

Thank you

Page 80: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 80

Page 81: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

Leftovers

Page 82: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 82

i-SM-Abstract

OID AbsOID InstOID

3010 301 1

3011 301 1

3012 302 1

… … …

i-SM-AttributeOfAbstract

OID AttrOfAbsOID InstOID Value i- AbsOID

4010 401 1 75432 3010

4020 402 1 John Doe 3010

4021 402 1 Bob White 3011

4022 404 1 CS 3012

… … … … …

Instances in our dictionary

SM-Abstract

OID Schema Name

301 1 Employees

302 1 Departments

SM-AttributeOfAbstract

OID Schema Name isIdent isNullable Type AbstrOID

401 1 EmpNo T F Int 301

402 1 Name F F Text 301

404 1 Name T F Char 302

… … … … … … …

mSMmM

SMM

i-SMi-M

75432

John Doe

Bob White

CS

Page 83: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 83

Translation of instances

75432

John Doe

Bob White

CS

Employees

EmpNo Name Affiliation

75432 John Doe CS

Bob White

Departments

Name

CS

Departments

Name Address

EmployeesEmpNo

Name

Affiliation

Departments

1,1

0,N

Name

Address

Employees

EmpNo Name Affiliation

Page 84: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 84

i-SM-Abstract

OID AbsOID InstOID

3010 301 1

3011 301 1

3012 302 1

… … …

i-SM-AttributeOfAbstract

OID AttrOfAbsOID InstOID Value i- AbsOID

4010 401 1 75432 3010

4020 402 1 John Doe 3010

4022 404 1 CS 3012

… … … … …

Instances in our dictionary75432

John Doe

Bob White

…CS

Employees

EmpNo Name Affiliation

75432 John Doe CS

Departments

Name

CS

i-SM-AggregationOfLexicals

1002

1001

AggOID

115011

115010

InstOIDOID

5010CS1110054030

5010754321110034010

5010John Doe1110044020

i-SM-ComponentOfAggregationOfLexicals

i- AggOIDValueInstOIDCompOfAggOIDOID

Page 85: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 85

i-SM-Abstract

OID AbsOID InstOID

3010 301 1

3011 301 1

3012 302 1

… … …

i-SM-AttributeOfAbstract

OID AttrOfAbsOID InstOID Value i- AbsOID

4010 401 1 75432 3010

4020 402 1 John Doe 3010

4022 404 1 CS 3012

… … … … …

Instances in our dictionary

i-SM-AggregationOfLexicals

1002

1001

AggrOID

115011

115010

InstOIDOID5010754321110036010

5010John Doe1110046020

5010CS1110056030

i-SM-ComponentOfAggregationOfLexicals

i- AggrOIDValueInstOIDCompOfAggOIDOID

i-SM_ComponentOfAggregation… (OID: #i-componentOID_1 (i-attOID),i-AggrOID: #i-aggregationOID_1(i-absOID),ComponentOfAggregationOfLexicalsOID: #componentOID_1(attOID),Value: Val )

← i-SM_AttributeOfAbstract( OID: i-attOID, i-AbstractOID: i-absOID,AttributeOfAbstractOID: attOID, Value: Val ),

SM_AttributeOfAbstract( OID: attOID, AbstractOID: absOID)

Page 86: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 86

 The problem

• ModelGen (a model management operator, [Bernstein 2003])– given two data models M1 and M2, and a schema S1 of M1

(the source schema and model), • generate a schema S2 of M2 (the target schema and

model), corresponding to S1• and, for each database D1 over S1, generate an

equivalent database D2 over S2

Generality: what are the models we consider?

Correctness: what do corresponding and equivalent mean?

Page 87: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 87

Many artifacts and models for them

• Many notations, each with variants and conventions:

– Object diagrams

– Interface definitions

– Database schemas

– Web site layouts

– Control flow diagrams

– XML schemas

– Form definitions

– …

Page 88: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 88

Many models just in the database world

• With different features and goals– semantic models and logical models:

• E-R, functional, (conceptual) object• relational, object-relational, object, network

– general purpose models (for all seasons) and problem oriented models (for specific contexts: DW, statistical, spatial, temporal)

• Variations of models– models with different levels of abstraction– versions within a family (e.g. many versions of the ER

model)• More models recently with the Web and XML

Page 89: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 89

What do we need to exchange and translate

• In design:– Artifacts (schemas and other descriptions)

• In operations:– Data (in databases, files, documents, …)

Page 90: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 90

Elementary steps

• There can be a set of predefined basic translations, for example– eliminate n-ary aggregations; replace them with binary ones

(and abstracts)– eliminate binary aggregations; replace them with functions– eliminate functions to abstracts; replace them with

aggregations– eliminate complex attributes; replace them with simple

attributes and abstracts• Assumed to be correct and so complex translations built over

them are correct by definition (an “axiomatic” approach)

Page 91: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 91

A complex translation

• From an N-ary ER model with generalizations to a simple Object model with only single valued references and no generalizations– Eliminate N-ary relationships (replaced by binary ones and

new entities)– Eliminate attributes from relationships – Eliminate many-to-many relationships– Transform relationships to references– Eliminate generalizations

Page 92: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 92

"An aggregation of lexicals for each M:N aggregation of abstracts"

SM_AggregationOfLexicals(

OID: #aggregationOID_2(aggOID),

Name: n )

SM_BinaryAggregationOfAbstracts(

OID: aggrOID,

Name: n,

isFun1: "False",

isFun2: "False") ;

• another Skolem function #aggregationOID_2():

– generates a "table" for a "relationship"

• the rule applies only to M:N "relationships", due to the conditions on isFun1 and isFun2

Page 93: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 93

Translation rules, data level

i-SM_ ComponentOfAggregation… (OID: #i-componentOID_1 (i-attOID),i-AggrOID: #i-aggregationOID_1(i-absOID),

ComponentOfAggregationOfLexicalsOID: #componentOID_1(attOID),

Value: Value )←…

SM_ComponentOfAggregation… ( OID: #componentOID_1(attOID), AggrOID:

#aggregationOID_1(absOID),Name: name, IsNullable: isNullable, IsKey: isIdent, Type : type )

←…

Page 94: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 94

Translation rules, data level

←i-SM_AttributeOfAbstract(

OID: i-attOID,i-AbstractOID: i-absOID,AttributeOfAbstractOID: attOID,Value: Value ),

SM_AttributeOfAbstract(OID: attOID,AbstractOID: absOID,Name: name,IsNullable: isNullable,IsID: isIdent,Type: type )

←SM_AttributeOfAbstract(

OID: attOID,AbstractOID: absOID, Name: name,IsIdent: isIdent, IsNullable: isNullable , Type: type ) ;

Page 95: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 95

Management of translations

• Basic properties:– Correctness, minimality, …

• Construction of complex translations by picking basic translations in the library

Page 96: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 96

ModelGen: the architecture

Page 97: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 97

 A simple example

• An object relational database, to be translated in a relational one

• Source: the OR-model

• Target: the relational model

System managed ids

Page 98: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 98

 Example, 2

• Does the OR model allow for keys? • Assume EmpNo and Name are keys

Page 99: Model-Independent Schema and Data Translation Paolo Atzeni Università Roma Tre Based on Tutorial at ICDE 2006 Paper in EDBT 2006 (with P. Cappellari and

P. Atzeni JISBD - October 5, 2006 99

 Example, 3

• Does the OR model allow for keys? • Assume no keys are specified