the need for temporal databases

19
The need for Temporal Databases (1) Need for coping with temporal data (both VT and TT) (2) Just adding 1 (or 2, or 4) temporal attributes (and maybe some ad-hoc procedures) does not work! (3) First, a rigorous semantic framework is needed, to give formal specification to the implementation. Properties: clearness, expressiveness, upward compatibiliy. Ex. BCDM (4) Second, the implementation must be proven to respect the semantics. Core issue here: efficient (1-NF) implementations hardly grant uniqueness of representation. Ex TSQL2

Upload: tyrone-leon

Post on 31-Dec-2015

35 views

Category:

Documents


3 download

DESCRIPTION

The need for Temporal Databases. (1)Need for coping with temporal data (both VT and TT). (2)Just adding 1 (or 2, or 4) temporal attributes (and maybe some ad-hoc procedures) does not work!. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: The need for Temporal Databases

The need for Temporal Databases

(1) Need for coping with temporal data (both VT and TT)

(2) Just adding 1 (or 2, or 4) temporal attributes (and maybe some ad-hoc procedures) does not work!

(3) First, a rigorous semantic framework is needed, to give formal specification to the implementation. Properties: clearness, expressiveness, upward compatibiliy. Ex. BCDM

(4) Second, the implementation must be proven to respect the semantics. Core issue here: efficient (1-NF)

implementations hardly grant uniqueness of representation. Ex TSQL2

Page 2: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

Example 1. Projection (and temporal coalescing)

Name Salary Title VT_start VT_end

Andrea 60000 Ass. Provost 1/1/1993 30/5/1993

Andrea 70000 Ass. Provost 1/6/1993 30/9/1993

Andrea 70000 Provost 1/10/1993 31/1/1994

Andrea 70000 Professor 1/2/1994 31/12/1994

Question: salary history of Andrea

“Intuitive” SQL query:SELECT Salary,VT_start,VT_endFROM EmployeeWHERE Name=“Andrea”

Page 3: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

Result obtained: Salary VT_start VT_end

60000 1/1/1993 30/5/1993

70000 1/6/1993 30/9/1993

70000 1/10/1993 31/1/1994

70000 1/2/1994 31/12/1994

Desired result: Salary VT_start VT_end

60000 1/1/1993 30/5/1993

70000 1/6/1993 31/12/1994

Page 4: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

How to get the desired result using SQL92:

CREATE TABLE Temp(Salary, VT_start,VT_end)AS SELECT Salary, VT_start, VT_end

FROM EmployeeWHERE Name = “Andrea”;

RepeatUPDATE Temp T1SET (T1.VT_end)=SELECT MAX(T2.VT_end)WHERE T1.Salary=T2.Salary AND T1.VT_start < T2.VT_Start AND

T1.VT_end >= T2.VT_start AND T1.VT_end < T2.VT_endWHERE EXISTS (SELECT *

FROM Temp AS T2WHERE T1.Salary=T2.Salary AND T1.VT_start < T2.VT_Start

AND T1.VT_end >= T2.VT_start AND T1.VT_end < T2.VT_end)

Until no tuples updated

Page 5: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

How to get the desired result using SQL92 (continues!):

DELETE FROM Temp T1WHERE EXISTS (SELECT *

FROM Temp AS T2WHERE T1.Salary = T2.Salary AND ((T1.VT_start > T2.VT_Start) AND (T1.VT_end <= T2.VT_end)) OR ((T1.VT_start >= T2.VT_Start) AND (T1.VT_end < T2.VT_end))

Page 6: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

Underlying semantic phenomenon:

Projection on temporal relations involves temporal coalescing about value equivalent tuples

When it occurs (SQL):

Whenever a proper subset of the attributes of the relations is chosen in the SELECT part of the query

Page 7: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

How to get the desired result using a Temporal DB (ex. TSQL2)

SELECT SalaryFROM EmployeeWHERE Name = “Andrea”

Page 8: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

Example 2. Join (and temporal intersection)

Name Salary VT_start VT_end

Andrea 60000 1/1/1993 30/5/1993

Andrea 70000 1/6/1993 31/12/1994

Employee1

Name Title VT_start VT_end

Andrea Ass. Provost 1/1/1993 30/9/1993

Andrea Provost 1/10/1993 31/1/1994

Andrea Professor 1/2/1994 31/12/1994

Employee2

Query: “combined” history of both Andrea’s salary and title

Page 9: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

“Intuitive” SQL query:SELECT Salary, Title, Emp1.VT_start, Emp1.VT_end

Emp2.VT_start, Emp2.VT_end FROM Employee1, Employee2WHERE Employee1.Name=“Andrea” AND

Employee1.Name=“Andrea”

Page 10: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

Result obtained:

Salary Emp1.VT_start

Emp1.VT_end

Title Emp2.VT_start

Emp2.VT_end

60000 1/1/1993 30/5/1993 Ass. Provost 1/1/1993 30/9/1993

60000 1/1/1993 30/5/1993 Provost 1/10/1993 31/1/1994

60000 1/1/1993 30/5/1993 Professor 1/2/1994 31/12/1994

70000 1/6/1993 31/12/1994 Ass. Provost 1/1/1993 30/9/1993

70000 1/6/1993 31/12/1994 Provost 1/10/1993 31/1/1994

70000 1/6/1993 31/12/1994 Professor 1/2/1994 31/12/1994

Page 11: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

Result desired:

Salary Title VT_start VT_end

60000 Ass. Provost 1/1/1993 30/5/1993

70000 Ass. Provost 1/6/1993 30/9/1993

70000 Provost 1/10/1993 31/1/1994

70000 Professor 1/2/1994 31/12/1994

Page 12: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

How to get the desired result using SQL92:SELECT Employee1.Name,Salary,Dept,Employee1.VT_start,Employee1.VT_endFROM Employee1, Employee2WHERE Employee1.Name=Employee2.Name AND Employee2.VT_start <= Employee1.VT_start AND Employee1.VT_end < Employee2.VT_endUNIONSELECT Employee1.Name,Salary,Dept,Employee1.VT_start,Employee2.VT_endFROM Employee1, Employee2WHERE Employee1.Name=Employee2.Name AND Employee1.VT_start >= Employee2.VT_start AND Employee2.VT_end < Employee1.VT_end AND Employye1.VT_start < Employee2.VT_endUNIONSELECT Employee1.Name,Salary,Dept,Employee2.VT_start,Employee1.VT_endFROM Employee1, Employee2WHERE Employee1.Name=Employee2.Name AND Employee2.VT_start > Employee2.VT_start AND Employee1.VT_end < Employee2.VT_end AND Employye2.VT_start < Employee1.VT_endUNIONSELECT Employee1.Name,Salary,Dept,Employee2.VT_start,Employee2.VT_endFROM Employee1, Employee2WHERE Employee1.Name=Employee2.Name AND Employee2.VT_start > Employee1.VT_start AND Employee2.VT_end < Employee1.VT_end

Page 13: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

Underlying semantic phenomenon:

Join (Cartesian product) on temporal relations involves temporal intersection

When it occurs (SQL):

Whenever more than one relation is used in the FROM part of the query

Note: the number of terms in the SQL union is 2n, where n is the number of relations in the FROM part

Page 14: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

How to get the desired result using a Temporal DB (ex. TSQL2)

SELECT Salary, TitleFROM Employee1, Employee2WHERE Employee1.Name=“Andrea” AND

Employee1.Name=“Andrea”

Page 15: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

And what about:- Union, difference, …, nested queries- Temporal predicates- Primary\secondary keys- Aggregate functions- Integrity constraints- Multiple (user-defined!) granularities- ……- arbitrary combinations of all the above issues

Until now, just two simple examples concerning:- SELECT a subset of attributes ( loop to do colaescing)- FROM with >1 relations (exponential union to do intersection)

?

Page 16: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

Key message:- Dealing with temporal data is a general problem in DB’s- Difficult problems (often “hidden” ones) have to be faced

WHY? (informally)

(1) Relational model assumes independence of attribute values

Time attributes predicates about when the values of the other attributes hold in the modeled reality (VT) or are current in the DB (TT)

Page 17: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

WHY? (informally)

(2) Relational model assumes all data\knowledge is explicit

The association of time to tuples involves “semantic assumptions”

Such assumptions are part of our “commonsense” way of looking at time. However, they are NOT supported by the relational model (as long as they are left implicit)

Making the semantic of time explicit is a complex task (everywhere, not only in DBs)

Page 18: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

From a Software Engeneering point of view:Letting applications solve the problem in an ad-hoc way is:- Both cost and time expensive- Likely to lead to errors in the applications- Likely to make integration (shared data) between applications impossible

Page 19: The need for Temporal Databases

Ad-hoc approaches are complex and are not likely to work

Temporal DB: an area of research aiming at

providing once-and-forall principled and integrated solution to the problem