Copyright © C. J. Date 2005 page 1
AGENDA :
1. A few preliminaries
2. Laying the foundations
3. Building on the foundations
OBJECTIVES OF THIS PRESENTATION :
Time dimension is becoming increasingly important —especially in data warehouse context
No one has done it right yet!
We’re lobbying to get it done right ...
"If you do it the stupid way, then you’ll have to do it again" (Gregory Chudnovsky)
In particular: The relational model is THE RIGHT AND PROPER FOUNDATION for temporal support
Requires NO extension / revision / violations!
THE RUNNING EXAMPLE : SUPPLIERS AND SHIPMENTS (nontemporal version, sample values) :
S
S#  SNAME  STATUS  CITY
S1  Smith  20      London
S2  Jones  10      Paris
S3  Blake  30      Paris
S4  Clark  20      London
S5  Adams  30      Athens
Note supplier S5 in particular
SP
S#  P#
S1  P1
S1  P2
S1  P3
S1  P4
S1  P5
S1  P6
S2  P1
S2  P2
S3  P2
S4  P2
S4  P4
S4  P5
Potential shipments: Supplier Sx is currently able to supply part Py
• DB consists of relation variables or RELVARS
• Relvar value at any time = RELATION
• Relvar represents a PREDICATE
• Tuples represent PROPOSITIONS
• Relvar at time t contains ALL AND ONLY the tuples that represent TRUE propositions (= true "instantiations" of corresponding predicate): CLOSED WORLD ASSUMPTION (CWA)
• Double underlining used in figures to indicate primary key attributes
FOR EXAMPLE :
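[Figure: relvar SP at time t1 and the same relvar at time t2. The value at each time is a relation with heading S#, P#. Both relations contain the tuples (S1, P1) and (S2, P1); one of the two also contains (S3, P2).]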
Predicate: Supplier S# is able to supply part P#
Propositions: Supplier S1 is able to supply part P1 (etc.)
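The relvar / relation / predicate vocabulary can be sketched in a few lines of Python (not Tutorial D). This is only an illustration: it assumes a relation is modeled as a frozenset of (S#, P#) tuples, and the names `sp_at_t1`, `sp_at_t2`, `proposition`, and `holds`, like the sample values, are hypothetical.

```python
# A relation is a set of tuples; a relvar is a variable whose value at any
# given time is one such relation. Sample values are illustrative only.
sp_at_t1 = frozenset({("S1", "P1"), ("S2", "P1")})
sp_at_t2 = sp_at_t1 | {("S3", "P2")}  # same relvar, new relation value


def proposition(sno: str, pno: str) -> str:
    """Instantiate the SP predicate for one tuple."""
    return f"Supplier {sno} is able to supply part {pno}"


def holds(relation: frozenset, sno: str, pno: str) -> bool:
    """Closed World Assumption: a proposition is true iff its tuple is present."""
    return (sno, pno) in relation


print(proposition("S1", "P1"))
print(holds(sp_at_t1, "S3", "P2"), holds(sp_at_t2, "S3", "P2"))
```

Under the CWA, the absence of (S3, P2) from the first relation means the corresponding proposition is taken to be false at that time.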
AGENDA :
1. A few preliminaries
2. Laying the foundations
3. Building on the foundations
2. LAYING THE FOUNDATIONS :
Time and the DB
What’s the problem?
Intervals
Interval operators
The EXPAND and COLLAPSE operators
The PACK and UNPACK operators
Relational operators
TEMPORAL vs. NONTEMPORAL DATABASES :
Nontemporal DB (i.e., DB as conventionally understood—sometimes unfortunately called a "snapshot DB"):
Current data only—e.g., status of supplier S1 is currently ("now") 20
Temporal DB:
Historical data instead of or as well as current data —e.g., status of supplier S1 is currently 20 AND has been 20 ever since July 1st AND was 15 from April 5th to June 30th (etc., etc.)
If no DELETE or UPDATE then historical data only (as in some data warehouses)
WHY ARE TEMPORAL DATABASES BECOMING IMPORTANT ?
Cheap disk storage means we can now store large volumes of historical data (data warehouses again)
Direct consequence: Users beginning to be faced with temporal DB problems … and they want solutions!
A note on the research (there’s been some controversy):
Two approaches: Treat temporal data as special and depart from relational principles? …OR … Abide firmly by those principles?
Fortunately, the good guys won!
CAVEAT :
Various questions regarding nature of time—e.g.,
Does time have a beginning or an end?
Is time a continuum or is it divided into discrete quanta?
How can we best characterize the concept now (aka "the moving point now")?
These aren’t really DB questions as such! So we won’t try to answer them definitively, but just make what we hope are reasonable assumptions as we proceed ...
NOTE, HOWEVER, THAT (important!) :
The (good) temporal DB research has led to certain INTERESTING GENERALIZATIONS …
We’ll be touching on some of those generalizations from time to time
However, we follow convention in referring to (e.g.) "temporal" ops, "temporal" relations, etc., even though the concepts are often not exclusive to temporal data as such
We also take "history" to include the future, where appropriate
SOME TENTATIVE DEFINITIONS :
If data = encoded representation of propositions, then temporal data = encoded representation of timestamped propositions …
Temporal relation: relation in which each tuple includes at least one timestamp
(i.e., heading includes at least one attrib of some timestamp type)
Temporal relvar: relvar whose heading is that of a temporal relation
Temporal DB: DB in which all relvars are temporal
BUT … !!!
"Temporal DB" concept as just defined isn’t very useful! (Why not?)
So … from this point forward, we take a temporal DB to be a DB that includes but is not limited to temporal relvars
Note: With one exception, all of the new ops and other constructs to be discussed are just shorthand!
Exception = INTERVAL type generator (see later)
CONSIDER THIS 2-TUPLE :
S# FROM
S1 July 1st, 1999
Possible interpretations include:
T1: Supplier S1 was placed under contract on July 1st, 1999
T2: Supplier S1 has been under contract since July 1st, 1999
T3: Supplier S1 was under contract during the interval from July 1st, 1999, to the present day
TIMESTAMPED PROPOSITIONS :
Any of T1, T2, T3 could be intended interpretation, depending on applicable predicate
Prepositions on, since, and during characterize the three interpretations
On = "at some instant" (not very interesting)
Since = "ever since"
During = "throughout (the interval in question)"
TIMESTAMPED PROPOSITIONS (cont.) :
But don’t T1, T2, T3 really all say the same thing?
Well … they need to be tightened up!
T1: Supplier S1 was most recently appointed on July 1st, 1999 (but the contract might subsequently have been terminated)
T2: Supplier S1 was not under contract on June 30th, 1999, but has been so ever since July 1st, 1999
T3: Supplier S1 was not under contract on June 30th, 1999, but has been so during the interval from July 1st, 1999, to the present day
T2 and T3 are equivalent to each other but not to T1
TIMESTAMPED PROPOSITIONS (cont.) :
T2 and T3 are logically equivalent but significantly different in form … Reverting to simpler versions for brevity:
T2: Supplier Sx has been under contract since date d
T3: Supplier Sx was under contract during the interval from date b to date e
Form of T3 can be used for historical records! —which do usually involve intervals
Concept of DURING is important (all-pervasive)
"VALID TIME" vs. "TRANSACTION TIME" :
Can historical data be updated ???
Well … "historical data" in the DB represents, not history as such, but rather our beliefs about that history … and beliefs can change
Valid time for p (updatable): set of times at which (by our current knowledge) p is/was/will be true
Transaction time for q (not updatable): set of times at which q is represented in DB as true
FOR EXAMPLE :
Let p be "Supplier S1 was under contract"
Suppose we currently believe this state of affairs held from July 1st, 1999, until May 1st, 2000, so we insert:
S# FROM TO
S1 July 1st, 1999 May 1st, 2000
This tuple does not correspond to proposition p! —but to a timestamped extension of p
FROM/TO represents "valid time" for p—time when (according to our current beliefs) p represented a "true fact"
Later we discover date of appointment was June 1st, so we "update the tuple"
S# FROM TO
S1 June 1st, 1999 May 1st, 2000
Changes "valid time", not p!
Later we discover S1 was never under contract at all and therefore delete the tuple—p now known to be false and has no "valid time" at all
Suppose tuple was
INSERT’d at time t1
UPDATE’d at time t2
DELETE’d at time t3
Interval from t1 to t3 = "transaction time"—not for p, but for proposition "p was true throughout some interval"
Interval from t1 to t2 = "transaction time" for timestamped extension of p with timestamp July 1st, 1999 - May 1st, 2000
Interval from t2 to t3 = "transaction time" for timestamped extension of p with timestamp June 1st, 1999 - May 1st, 2000
We will revisit these concepts later ...
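A minimal Python sketch of the valid time / transaction time distinction, using the S1 example. It makes several assumptions: transaction times t1, t2, t3 are stood in for by the integers 1, 2, 3; a transaction-time interval is treated as running from the recording time up to (not including) the time the tuple was replaced or deleted; and the names `LoggedBelief` and `believed_at` are hypothetical, not part of any proposal in the deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoggedBelief:
    valid_from: str  # valid time: when we believe p was true
    valid_to: str
    tx_from: int     # transaction time: when the DB started saying so
    tx_to: int       # ...and when it stopped (update or delete)


# INSERT at t1, UPDATE at t2, DELETE at t3 (t1..t3 modeled as 1..3).
log = [
    LoggedBelief("1999-07-01", "2000-05-01", tx_from=1, tx_to=2),
    LoggedBelief("1999-06-01", "2000-05-01", tx_from=2, tx_to=3),
]


def believed_at(entries, tx_time):
    """Valid-time intervals the DB represented as true at transaction time tx_time."""
    return [(e.valid_from, e.valid_to)
            for e in entries
            if e.tx_from <= tx_time < e.tx_to]


print(believed_at(log, 1))  # the belief as first inserted
print(believed_at(log, 2))  # the corrected belief
print(believed_at(log, 3))  # nothing: the tuple has been deleted
```

The valid-time columns can be corrected as beliefs change, whereas the transaction-time columns only record what the DB actually said and when.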
OBVIOUS BUT FAR-REACHING ASIDE :
interval with begin time b and end time e
can be thought of as
set of all times t such that b ≤ t ≤ e
(where "<" means "earlier than", so "≤" means "earlier than or the same as")
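As a rough Python illustration of "interval = set of points", assume time points are numbered with integers (day 4 for d04, and so on) and that both endpoints belong to the interval, which is how the one-day interval from d03 to d03 in the later sample data has to be read. The function name `interval_points` is hypothetical.

```python
def interval_points(b: int, e: int) -> set:
    """All time points t with b <= t <= e; empty if e is earlier than b."""
    return set(range(b, e + 1))


print(sorted(interval_points(4, 10)))  # d04 through d10 inclusive
print(sorted(interval_points(3, 3)))   # a single-point interval
```

Because the timeline is assumed to be a finite sequence of discrete points, every such set is finite.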
SOME FUNDAMENTAL QUESTIONS :
Doesn’t "all times t such that b ≤ t ≤ e" mean we’re dealing with infinite sets?
Assumption:
Timeline = finite sequence of discrete, indivisible time quanta
Time quantum (or chronon) = smallest time unit representable in the system
Propositions T1-T3 seem to assume time quanta are days … Doesn’t the system support time units down to (e.g.) microseconds?
If S1 was appointed on July 1st, what do we do about the interval from the beginning of July 1st up to the very instant of appointment?
Distinguish time quanta vs. time units relevant for some particular purpose (e.g., years, months, days, msecs) = time points aka "points" aka "granules"
Granularity = "size" of applicable time points = "size" of gap between adjacent points
If we regard the timeline (for some given purpose) as a finite sequence of time points, each time point has a unique successor and a unique predecessor, right?
Yes, except for the points corresponding to "the end of time" and "the beginning of time", of course
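Python's standard `datetime.date` type happens to behave like a finite timeline with day granularity, so it can serve as a small illustration (an analogy only, not something the deck relies on). The helper names `successor` and `predecessor` are hypothetical.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)  # granularity: one day


def successor(d: date) -> date:
    """The unique next time point; undefined at 'the end of time' (date.max)."""
    return d + ONE_DAY


def predecessor(d: date) -> date:
    """The unique previous time point; undefined at 'the beginning of time' (date.min)."""
    return d - ONE_DAY


print(successor(date(1999, 6, 30)))    # 1999-07-01
print(predecessor(date(1999, 7, 1)))   # 1999-06-30

try:
    successor(date.max)
except OverflowError:
    print("the last representable day has no successor")
```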
T3: Supplier Sx was under contract during the interval from date b to date e
If corresp relation includes 3-tuple—
S# FROM TO
S1 July 1st, 1999 September 25th, 2000
—doesn’t CWA imply it must also include:
S# FROM TO
S1 July 2nd, 1999 September 24th, 2000
etc., etc., etc.?
GOOD POINT !!!
Clearly, T3 needs to be tightened up:
T3: Supplier Sx was under contract on every day from date b to date e, but not on the day immediately before b, nor on the day immediately after e
Since = "ever since and not immediately before"
During = "throughout and not immediately before or immediately after (the interval in question)"
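The tightened reading of "during" says each tuple must describe a maximal interval. A small Python sketch of that idea, assuming days are numbered with integers and that we start from the set of individual days on which a supplier was under contract; the function name `maximal_intervals` is hypothetical.

```python
def maximal_intervals(days):
    """Group a set of integer day numbers into maximal runs [(b, e), ...].

    Each run satisfies the tightened predicate: the supplier was under
    contract on every day from b to e, but not on day b-1 nor on day e+1.
    """
    runs = []
    for d in sorted(days):
        if runs and d == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], d)   # extend the current run
        else:
            runs.append((d, d))           # start a new run
    return runs


# S2 in the later sample data: under contract on d02-d04 and again on d07-d10.
print(maximal_intervals({2, 3, 4, 7, 8, 9, 10}))
```

With maximal runs only, sub-intervals such as (3, 4) are not represented by separate tuples, which is what resolves the CWA puzzle above.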
SOME ASSUMPTIONS :
Henceforth we assume that:
No supplier can end one contract on one day and start another on the very next day
No supplier can be under two distinct contracts at the same time
Contracts can be open-ended—i.e., a supplier can be currently under contract and the end date for that contract can be currently unknown
2. LAYING THE FOUNDATIONS :
Time and the DB
What’s the problem?
Intervals
Interval operators
The EXPAND and COLLAPSE operators
The PACK and UNPACK operators
Relational operators
SUPPLIERS AND SHIPMENTS: SIMPLIFIED VERSION (sample values) :
S
S#
S1
S2
S3
S4
S5
Supplier Sx is currently under contract
SP
S#  P#
S1  P1
S1  P2
S1  P3
S1  P4
S1  P5
S1  P6
S2  P1
S2  P2
S3  P2
S4  P2
S4  P4
S4  P5
Potential shipments: Supplier Sx is currently able to supply part Py
SAMPLE CONSTRAINTS :
Consider key constraints only (until further notice):
{S#} is primary key for S
{S#,P#} is primary key for SP
{S#} is foreign key in SP matching primary key of S
SAMPLE QUERIES :
Query A: Get supplier numbers of suppliers who are currently able to supply at least one part
SP { S# }
/* projection */
Query B: Get supplier numbers of suppliers who are currently unable to supply any part at all
S { S# } MINUS SP { S# }
/* difference between two projections */
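For readers who do not know Tutorial D, the two queries can be mirrored with Python sets. This is only an analogy: relations are modeled as sets of tuples, and the variable names `query_a` and `query_b` are hypothetical.

```python
# Sample values from the simplified running example.
S = {"S1", "S2", "S3", "S4", "S5"}
SP = {
    ("S1", "P1"), ("S1", "P2"), ("S1", "P3"),
    ("S1", "P4"), ("S1", "P5"), ("S1", "P6"),
    ("S2", "P1"), ("S2", "P2"), ("S3", "P2"),
    ("S4", "P2"), ("S4", "P4"), ("S4", "P5"),
}

# Query A: SP { S# } is a projection; duplicates collapse because the result is a set.
query_a = {sno for (sno, _pno) in SP}

# Query B: S { S# } MINUS SP { S# } is a set difference.
query_b = S - query_a

print(sorted(query_a))
print(sorted(query_b))
```

With the sample values, Query B returns only S5, the supplier singled out earlier.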
"SEMITEMPORALIZING" SUPPLIERS AND SHIPMENTS :
S_SINCE
S#  SINCE
S1  d04
S2  d07
S3  d03
S4  d04
S5  d02
SP_SINCE
S#  P#  SINCE
S1  P1  d04
S1  P2  d05
S1  P3  d09
S1  P4  d05
S1  P5  d04
S1  P6  d06
S2  P1  d08
S2  P2  d09
S3  P2  d08
S4  P2  d06
S4  P4  d04
S4  P5  d05
Granularity = one day
Assume day 1 immediately precedes day 2, etc., etc.
PREDICATES :
S_SINCE:
Ever since day SINCE (and not on the day immediately before day SINCE), supplier S# has been under contract
SP_SINCE:
Ever since day SINCE (and not on the day immediately before day SINCE), supplier S# has been able to supply part P#
SAMPLE CONSTRAINTS :
PK and FK constraints are as for nontemporal version
But we also need to "augment" the FK constraint to say no supplier can supply any part before that supplier is under contract …
CONSTRAINT XST1
   IS_EMPTY ( ( ( S_SINCE RENAME ( SINCE AS SS ) ) JOIN
                ( SP_SINCE RENAME ( SINCE AS SPS ) ) )
              WHERE SPS < SS ) ;
/* if tuple sp in SP_SINCE references tuple s in S_SINCE, */
/* SINCE value in sp must not be less than that in s      */
Nice to have some convenient shorthand ...
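The same check can be sketched procedurally in Python. Assumptions: days are integers (4 for d04), relvars are modeled as dictionaries keyed on their primary keys, only a few of the sample tuples are shown, and the function name `xst1_violations` is hypothetical.

```python
# A few tuples from the semitemporal sample values (day numbers as integers).
S_SINCE = {"S1": 4, "S2": 7, "S3": 3}
SP_SINCE = {("S1", "P1"): 4, ("S1", "P3"): 9, ("S2", "P1"): 8, ("S3", "P2"): 8}


def xst1_violations(s_since, sp_since):
    """Shipments whose SINCE precedes the supplier's own SINCE (join on S#)."""
    return {
        (sno, pno)
        for (sno, pno), sps in sp_since.items()
        if sno in s_since and sps < s_since[sno]
    }


print(xst1_violations(S_SINCE, SP_SINCE))  # empty: the sample data satisfies XST1

# A hypothetical bad update: S2 able to supply P1 since d05, before its contract (d07).
bad = dict(SP_SINCE)
bad[("S2", "P1")] = 5
print(xst1_violations(S_SINCE, bad))
```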
SAMPLE QUERIES :
Query A: Get supplier numbers of suppliers who are currently able to supply at least one part, together with the date since when they have been able to do so
SUMMARIZE SP_SINCE BY { S# } ADD (MIN ( SINCE ) AS SINCE)
S#  SINCE
S1  d04
S2  d08
S3  d08
S4  d04
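What the SUMMARIZE expression computes is a per-supplier minimum. A Python sketch, assuming days are integers and SP_SINCE is modeled as a dictionary keyed on (S#, P#); the function name `min_since_by_supplier` is hypothetical.

```python
SP_SINCE = {
    ("S1", "P1"): 4, ("S1", "P2"): 5, ("S1", "P3"): 9,
    ("S1", "P4"): 5, ("S1", "P5"): 4, ("S1", "P6"): 6,
    ("S2", "P1"): 8, ("S2", "P2"): 9, ("S3", "P2"): 8,
    ("S4", "P2"): 6, ("S4", "P4"): 4, ("S4", "P5"): 5,
}


def min_since_by_supplier(sp_since):
    """Group by S# and keep the earliest SINCE value in each group."""
    result = {}
    for (sno, _pno), since in sp_since.items():
        result[sno] = min(since, result.get(sno, since))
    return result


print(min_since_by_supplier(SP_SINCE))
```

The result matches the table above: d04 for S1 and S4, d08 for S2 and S3.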
Query B: Get supplier numbers of suppliers who are currently unable to supply any part at all, together with the date since when they have been unable to do so
Supplier S5 is currently unable to supply any parts at all … But we don’t know the date since when S5 has been unable to supply any parts (insufficient information in the DB). The DB is still only "semitemporalized"
I.e., we need to keep historical records!
FULLY TEMPORALIZING SUPPLIERS AND SHIPMENTS (first attempt) :
S_FROM_TO
S#  FROM  TO
S1  d04   d10
S2  d02   d04
S2  d07   d10
S3  d03   d10
S4  d04   d10
S5  d02   d10
SP_FROM_TO
S#  P#  FROM  TO
S1  P1  d04   d10
S1  P2  d05   d10
S1  P3  d09   d10
S1  P4  d05   d10
S1  P5  d04   d10
S1  P6  d06   d10
S2  P1  d02   d04
S2  P1  d08   d10
S2  P2  d03   d03
S2  P2  d09   d10
S3  P2  d08   d10
S4  P2  d06   d09
S4  P4  d04   d08
S4  P5  d05   d10
POINTS ARISING :
Assume for definiteness that "date today" = d10
Have shown d10 as TO value for each tuple pertaining to current state of affairs
How can all of those d10’s become d11’s on the stroke of midnight? See later!
More tuples than before … This fully temporal DB includes everything from semitemporal DB,* plus historical records regarding previous interval of time (from d02 to d04) during which S2 was also under contract and able to supply certain parts
* Except that TO value for two of S4’s shipments < today, i.e., those shipments are now "historical" instead of "current"
PREDICATES :
S_FROM_TO:
From day FROM (and not on the day immediately before FROM) to day TO (and not on the day immediately after TO), supplier S# was under contract
SP_FROM_TO:
From day FROM (and not on the day immediately before FROM) to day TO (and not on the day immediately after TO), supplier S# was able to supply part P#
SAMPLE CONSTRAINTS :
Must prohibit FROM-TO pairs in which TO < FROM:
CONSTRAINT S_FROM_TO_OK
   IS_EMPTY ( S_FROM_TO WHERE TO < FROM ) ;
CONSTRAINT SP_FROM_TO_OK
   IS_EMPTY ( SP_FROM_TO WHERE TO < FROM ) ;
Primary keys:
For S_FROM_TO = {S#,FROM} —or {S#,TO}?
For SP_FROM_TO = {S#,P#,FROM} —or {S#,P#,TO}?
But these constraints aren’t sufficient!
SAMPLE CONSTRAINTS (cont.) :
If S_FROM_TO includes (e.g.)—
S# FROM TO
S1 d04 d10
—then it mustn’t also include (e.g.):
S# FROM TO
S1 d02 d06
SAMPLE CONSTRAINTS (cont.) :
These two tuples need to be combined into one …
S# FROM TO
S1 d02 d10
Note that not combining the tuples would be as bad as permitting duplicates!—and would mean S_FROM_TO violated its own predicate (contains contradiction)
PK constraint insufficient to prohibit overlapping tuples
SAMPLE CONSTRAINTS (cont.) :
If S_FROM_TO includes (e.g.)—
S# FROM TO
S1 d04 d10
—then it mustn’t also include (e.g.):
S# FROM TO
S1 d02 d03
SAMPLE CONSTRAINTS (cont.) :
These two tuples need to be combined into one …
S# FROM TO
S1 d02 d10
No redundancy as such, but circumlocution (and violation of predicate, hence a contradiction)
PK constraint insufficient to prohibit abutting tuples
SAMPLE CONSTRAINTS (cont.) :
CONSTRAINT XFT1 IS_EMPTY
   ( ( ( S_FROM_TO RENAME ( FROM AS F1, TO AS T1 ) ) JOIN
       ( S_FROM_TO RENAME ( FROM AS F2, TO AS T2 ) ) )
     WHERE ( T1 ≥ F2 AND T2 ≥ F1 ) OR        /* intervals overlap */
           ( F2 = T1+1 OR F1 = T2+1 ) ) ;    /* intervals abut    */
Complicated !!!
"T1+1" ??? "T2+1" ???
Do we begin to see the problem ???
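To see what the constraint is trying to say, here is a Python sketch of the overlap-or-abut test. Assumptions: days are integers, intervals include both endpoints, and only distinct tuples for the same supplier are compared (a tuple trivially overlaps itself, so the sketch skips self-pairs). The names `overlaps_or_abuts` and `xft1_violations` are hypothetical.

```python
def overlaps_or_abuts(f1, t1, f2, t2):
    """True if closed intervals [f1, t1] and [f2, t2] share a day or are adjacent."""
    overlaps = t1 >= f2 and t2 >= f1
    abuts = f2 == t1 + 1 or f1 == t2 + 1
    return overlaps or abuts


def xft1_violations(s_from_to):
    """Pairs of distinct tuples for one supplier that should have been one tuple."""
    rows = sorted(s_from_to)
    return [
        (a, b)
        for i, a in enumerate(rows)
        for b in rows[i + 1:]
        if a[0] == b[0] and overlaps_or_abuts(a[1], a[2], b[1], b[2])
    ]


S_FROM_TO = {("S1", 4, 10), ("S2", 2, 4), ("S2", 7, 10),
             ("S3", 3, 10), ("S4", 4, 10), ("S5", 2, 10)}

print(xft1_violations(S_FROM_TO))                        # sample data is clean
print(xft1_violations({("S1", 4, 10), ("S1", 2, 6)}))    # overlapping tuples
print(xft1_violations({("S1", 4, 10), ("S1", 2, 3)}))    # abutting tuples
```

The `t1 + 1` in the abutting test is the Python counterpart of the "T1+1" that the slide questions: it only makes sense once a successor function is defined for the point type.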
SAMPLE CONSTRAINTS (cont.) :
{S#,FROM} is not a FK from SP_FROM_TO to S_FROM_TO
But if supplier s appears in SP_FROM_TO, then supplier s must appear in S_FROM_TO as well:
CONSTRAINT XFT2
   SP_FROM_TO { S# } ⊆ S_FROM_TO { S# } ;
Example of an inclusion dependency (FK constraints are a special case)
Note the relation comparison in this example
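An inclusion dependency of this kind is a subset test between two projections. A small Python sketch, using a few of the sample tuples with integer days; the function name `xft2_holds` is hypothetical.

```python
S_FROM_TO = {("S1", 4, 10), ("S2", 2, 4), ("S2", 7, 10)}
SP_FROM_TO = {("S1", "P1", 4, 10), ("S2", "P1", 2, 4), ("S2", "P1", 8, 10)}


def xft2_holds(sp_from_to, s_from_to):
    """Every S# appearing in SP_FROM_TO must also appear in S_FROM_TO."""
    sp_suppliers = {row[0] for row in sp_from_to}  # projection on S#
    s_suppliers = {row[0] for row in s_from_to}    # projection on S#
    return sp_suppliers <= s_suppliers             # relation comparison: subset


print(xft2_holds(SP_FROM_TO, S_FROM_TO))
print(xft2_holds(SP_FROM_TO | {("S9", "P1", 1, 2)}, S_FROM_TO))
```

Unlike a foreign key, the target of the comparison (S#) is not a key of S_FROM_TO, since S2 appears there twice.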
SAMPLE CONSTRAINTS (cont.) :
But Constraint XFT2 is not enough! … If SP_FROM_TO shows supplier s is able to supply some part during some interval of time, then S_FROM_TO must show supplier s is under contract during that same interval of time
CONSTRAINT XFT3
   COUNT ( SP_FROM_TO { ALL BUT P# } ) =
   COUNT ( ( ( SP_FROM_TO RENAME ( FROM AS SPF, TO AS SPT ) ) { ALL BUT P# }
             JOIN
             ( S_FROM_TO RENAME ( FROM AS SF, TO AS ST ) ) )
           WHERE SF ≤ SPF AND ST ≥ SPT ) ;
Draw your own conclusions ...
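The intent behind XFT3 is interval containment: each shipment interval must lie inside some contract interval for the same supplier. A Python sketch that states this directly rather than by counting, assuming integer days and closed intervals; the function name `xft3_violations` is hypothetical.

```python
S_FROM_TO = {("S1", 4, 10), ("S2", 2, 4), ("S2", 7, 10),
             ("S3", 3, 10), ("S4", 4, 10), ("S5", 2, 10)}
SP_FROM_TO = {
    ("S1", "P1", 4, 10), ("S1", "P2", 5, 10), ("S1", "P3", 9, 10),
    ("S1", "P4", 5, 10), ("S1", "P5", 4, 10), ("S1", "P6", 6, 10),
    ("S2", "P1", 2, 4), ("S2", "P1", 8, 10), ("S2", "P2", 3, 3),
    ("S2", "P2", 9, 10), ("S3", "P2", 8, 10), ("S4", "P2", 6, 9),
    ("S4", "P4", 4, 8), ("S4", "P5", 5, 10),
}


def xft3_violations(s_from_to, sp_from_to):
    """(S#, FROM, TO) combinations in SP_FROM_TO not covered by any contract interval."""
    return {
        (sno, spf, spt)
        for (sno, _pno, spf, spt) in sp_from_to
        if not any(s == sno and sf <= spf and st >= spt
                   for (s, sf, st) in s_from_to)
    }


print(xft3_violations(S_FROM_TO, SP_FROM_TO))  # sample data: no violations

# Hypothetical bad tuple: S2 supplying P1 on d05-d06, between its two contracts.
print(xft3_violations(S_FROM_TO, SP_FROM_TO | {("S2", "P1", 5, 6)}))
```

Even in a general-purpose language the check needs explicit endpoint arithmetic, which is the point the slide is making about working with separate FROM and TO attributes.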