G. Papastefanatos1, P. Vassiliadis2, A. Simitsis3, K. Aggistalis2 , F. Pechlivani2, Yannis Vassiliou1
(1) National Technical University of Athens{gpapas,yv}@dbnet.ece.ntua.gr
(2) University of Ioannina
{pvassil,kostazz,fpechliv}@cs.uoi.gr
(3) IBM Almaden Research [email protected]
Language Extensions for the Automation of Database Schema
Evolution
ICEIS'08, Barcelona, June 2008 2
Outline
• Motivation• Graph-based modeling• A language for policy
annotation • Evaluation• Conclusions
ICEIS'08, Barcelona, June 2008 3
Outline
• Motivation• Graph-based modeling• A language for policy
annotation • Evaluation• Conclusions
ICEIS'08, Barcelona, June 2008 4
Data Warehouse Environment
WWW Act1
Act2
Act3
Act4
Act5
ICEIS'08, Barcelona, June 2008 5
Motivation
WWW Act1
Act2
Act3
Act4
Act5
ICEIS'08, Barcelona, June 2008 6
Database Schema Evolution
• Current database systems are continuously evolving environments, where design constructs are– Added– Removed– Modified
• Evolution is not handled by current DBMS with an automatic way– Syntactic as well as semantic adaptation of queries
and views is a time-consuming task, treated in most of the cases manually
• Evolution-driven database design is missing
ICEIS'08, Barcelona, June 2008 7
Evolution Effects
• SW artifacts around the DBMS are affected:– Syntactically – i.e., become invalid– Semantically – i.e., query must conform to
the new source database semantics
• Adaptation of activities, queries and views– time-consuming task– treated in most of the cases manually by the
administrators/developers
ICEIS'08, Barcelona, June 2008 8
We would like to know…
• What part of the process is affected and how if e.g., an attribute is deleted?
• Can we predict the impact of changes? • To what extent can readjustment be
automated? • Can we perform what-if analysis for
potential changes of source configurations?
ICEIS'08, Barcelona, June 2008 9
Outline
• Motivation• Graph-based modeling• A language for policy
annotation • Evaluation• Conclusions
ICEIS'08, Barcelona, June 2008 10
Database Schema Evolution – Our approach
Mechanism for performing what-if analysis for potential changes of database configurations
Graph based representation of database constructs (i.e., relations, views, constraints, queries)Annotation of graph with rules for adapting queries to database schema evolution
Evolving databases
QueriesDatabase Schema
Graph-based modeling
for uniform representation
Metrics for Evaluating Evolution
Design
Evolving applications
Rules for Handling Evolution
ICEIS'08, Barcelona, June 2008 11
SELECT EMP.Emp#, Sum(WORKS.Hours) as T_Hours
FROM EMP, WORKSWHERE EMP.Emp# = WORKS.Emp#GROUP BY EMP.Emp#
Graph based representation
ICEIS'08, Barcelona, June 2008 12
Annotating the graph with adaptation rules
Set of evolving database constructs:• relations• attributes• constraints
11
Set of potential evolution changes: • addition• deletion• modification
22
Set of reaction policies: • propagate• block• prompt
33
ICEIS'08, Barcelona, June 2008 13
Query Adaptation - Example
Annotated Query Graph Event
Add attribute Phone to relation EMP
Transformed Query Graph
Q
Name
EID
Name
EID
S
S
EMPS
S
map-select
map-select …
ON attribute addition TO EMP THEN propagate
Q: SELECT EID, Name FROM EMP
Q: SELECT EID, Name, Phone FROM EMP
Q
Name
EID
Name
EID
S
S
EMPS
S
map-select
map-select …
ON attribute addition TO EMP THEN propagate
Phone
S
Phone
S
map-select
ICEIS'08, Barcelona, June 2008 14
Contributions
• Based on– a graph-based model for database constructs
(relational tables, views, database constraints) and SQL queries
• we present – a mechanism for the annotation of the graph’s
constructs with elements that facilitate what-if analysis and predetermine the reaction to evolution events occurring in the database schema
– SQL extensions that enable the implementation of our approach for evolution management
• Case study
ICEIS'08, Barcelona, June 2008 15
Outline
• Motivation• Graph-based modeling• A language for policy
annotation • Evaluation• Conclusions
ICEIS'08, Barcelona, June 2008 16
Annotation of graph constructs with policies
• {relation, query, view, attribute constraint, condition}
• specific nodes
ON <event> TO <element> THEN <policy>
{delete update insert}
{block propagate prompt}
ICEIS'08, Barcelona, June 2008 17
Global constraints
ON DELETE NODE THEN PROPAGATE
ON DELETE ATTRIBUTE THEN PROMPT
Example:CREATE DATABASE company
ON DELETE ATTRIBUTE THEN PROMPT
ICEIS'08, Barcelona, June 2008 18
Top – level constructs
Example for relations:CREATE TABLE works (EMP# NUMBER(3), PROJ# NUMBER(3), HOURS NUMBER(5),
ON Add Attribute TO works THEN propagate)
This statement for relation works allows the addition of attributes and propagates this addition to all queries and views accessing this relation
ICEIS'08, Barcelona, June 2008 19
Top – level constructs
Example for views:CREATE VIEW emps-prjs AS SELECT E.Emp#, E.Name, P.Projname FROM Emp E,Works W,Proj P WHERE E.EMP#=W.EMP# AND W.Proj#=P.Proj#ON Modify Condition TO emps-prjs THEN block
This syntax blocks changes in the WHERE clause of the view definition for view emps-prjs
ICEIS'08, Barcelona, June 2008 20
Top – level constructs
Example for queries:Q: SELECT EP.Emp#, EP.Name
FROM emps-prjs EPWHERE EP.PRJNAME = ‘Olympic Games’ON Add Attribute TO emps-prjs THEN
block
The above syntax blocks the inclusion of added attributes in the underlying view emps-prjs in the select clause of the query.
ICEIS'08, Barcelona, June 2008 21
Specific node annotation
CREATE TABLE emp
(EMP# NUMBER(3),
Name Varchar2(150),
... ,
ON Delete Attribute TO Name THEN block)
Q: SELECT E.Emp#, E.Name, P.Projname
FROM Emp E,Works W,Proj P
WHERE E.EMP#=W.EMP# AND W.Proj#=P.Proj#
ON Delete Attribute TO Name THEN propagate
ICEIS'08, Barcelona, June 2008 22
Constraints
CREATE TABLE emp
(EMP# NUMBER(3),
Name Varchar2(150),
Constraint EMP.PK PRIMARY KEY (EMP#),
ON Modify Constraint TO EMP.PK THEN propagate)
ICEIS'08, Barcelona, June 2008 23
Conditions
CREATE CONDITION <condition> AS <expression>CREATE CONDITION Emp_Age_Cond AS AGE>50CREATE CONDITION Works_Emp_FK AS WORKS.EMP# IN
EMP.EMP#CREATE CONDITION Works_Emp_J AS WORKS.EMP#=EMP.EMP#
• Then, a query SELECT * FROM EMP WHERE AGE_COND would simply use the condition as a macro.
• Parametric conditions, to allow referring to aliases in SQL queries are straightforward.
• Reverse eng. of existing code: automatic condition names can be assigned to all the queries.
ICEIS'08, Barcelona, June 2008 24
Outline
• Motivation• Graph-based modeling• A language for policy
annotation • Evaluation• Conclusions
ICEIS'08, Barcelona, June 2008 25
Testbed
• A data mart of an agency for the Greek public sector
• 6 months of monitoring• 52 “queries” (in reports, scripts)
over 18 relations• When converted to our graph
structure: 2500 nodes
ICEIS'08, Barcelona, June 2008 26
Distribution of policies over the annotated nodes
Event# of nodes
Propagate Block
Rename relations 18 0
Add attributes 64 13
Delete Attributes 1608 92
Rename Attributes 1615 85
Domain Modification 1690 10
Condition Modification0 21
Total Annotations 4995 221
ICEIS'08, Barcelona, June 2008 27
Global policy and individual annotations
Scope
# of operations
AnnotationsPolicy Clauses
Query scope 486 9
Relation Scope
5180 293
Database Scope
36 2
ICEIS'08, Barcelona, June 2008 28
Outline
• Motivation• Graph-based modeling• A language for policy
annotation • Evaluation• Conclusions
ICEIS'08, Barcelona, June 2008 29
Conclusions
• Coherent framework for propagating potential changes of the database to all the software artifacts around the database based on a model-policy-event scheme
• Extension to the SQL language specifically tailored for the management of evolution.
• Limited overhead imposed on both the system and the humans, who design and maintain it.
• Tested in a real-world scenario of the Greek public sector.
ICEIS'08, Barcelona, June 2008 30
On-going/Future Work
• Hecataeus: A tool for visualizing and performing what-if analysis for several evolution scenarios.
• Patterns of evolution sequences
ICEIS'08, Barcelona, June 2008 31
Acknowledgment
Information dissemination of this work was supported by the European Union in the framework of the project “Support of Computer Science Studies in the University of Ioannina” of the “Operational Program for Education and Initial Vocational Training” of the 3rd Community Support Framework of the Hellenic Ministry of Education, funded by national sources and by the European Social Fund (ESF).
ICEIS'08, Barcelona, June 2008 32
Gracias!
Sources: http://en.wikipedia.org/wiki/Image:Barcelona_-_planol_ciutat_vella_1860.jpg
http://maps.google.com
ICEIS'08, Barcelona, June 2008 33http://www.cs.uoi.gr/~pvassil/projects/architecture_graph/
Questions?
ICEIS'08, Barcelona, June 2008 34
Algorithm Propagate changeS (DaWaK 2007)
Input: an ETL summary S over a graph Go=(Vo,Eo) and an event eOutput: a graph Gn=(Vn,En)Variables: a set of events E, and an affected node A Begin dps(S, Go, Gn, {e}, A)End
dps(S, Gn, Go, E, A) { I = Ins_by_policy(affected(E)) D = Del_by_policy(affected(E)) Gn = Go – D I E = E–{e}action(affected(E)) if consumer(A)nil for each consumer(A) dps(S,Gn,Go,E,consumer(A))}
ICEIS'08, Barcelona, June 2008 35
Conflict resolution (DaWaK 2007)
• Graph constructs may have contradictory policies for the same event
Policies defined on query graph structures are stronger than policies defined on view graph structures which in turn prevail on policies defined on relation graph structures
RuleRule
Query Module
View Module
Relation Module
Query Conditions
Query Attributes
Query
View Conditions
View Attributes
View
Relation Conditions
Relation Attributes
Relation
User’s Choice
ICEIS'08, Barcelona, June 2008 36
Configuration of our setting (DaWaK 2007)
7 ETL processes
S1 Tmp1
L1 L2 Tmp2
Join
S2
T1
T2
Add field
Load to DSA
γ V3
Load to DSA
π
π Add field
Patch text fields
Sources DW Data Staging Area (DSA)
Patch text fields
53 ETL activities3 lookup tables
7 source tables 9 target tables
ICEIS'08, Barcelona, June 2008 37
Adapted activities per Event (DaWaK 2007)
Affected Activities Per Event
0
10
20
30
40
50
60
70
80
RenameTable
RenameAttributes
AddAttributes
DeleteAttributes
ModifyAttributes
Change PK
Affected
Adapted
ICEIS'08, Barcelona, June 2008 38
Related work
• DB schema Evolution– Schema Versioning
• DW schema Evolution• Materialized View Evolution• Evolution wrt Model Mappings