04/19/23 CS319 Theory of Databases1
Why Not Relational?
Evaluating the Manifestos
The future of databases
Evaluating the Database manifestos
A closer look at the OODB Manifesto
+ commentary from 3GDB Manifesto
Motivation for OODB revisited (see intro to 3G Manifesto)
• rich objects (e.g. different abstractions, multi-media etc)
• design databases (e.g. 1-off structures, evolutionary character)
• behavioural elements (e.g. program code)
• better interfaces
‘Object’-oriented approaches in general sense
E-R modelling:
objects as ‘organising real-world observations’
generic classes + methods to transform state
look at the example O2 DB extract
objects substituted for relation tables
methods to get the information hiding
inheritance to share methods
Features of ‘object’-oriented approaches 1
Grouping of real-world observables by locality and by existence dependency … a conspicuous omission is the concept of functional dependency (cf. the way in which observations are organised across objects)
General themes:
tension with mathematical abstraction
e.g. non-procedural flavour of set not list
not ‘variables with an identity’
Features of ‘object’-oriented approaches 2
Parody of the OODB position ...
OOP can solve the problems of HLL programming
indeed can achieve end-user-programming
DBs are an aberration that came about because programming used to be much more difficult; now that it's easy, need to break down the conceptual barrier between queries and APs
[possible subagenda: why need for DB anyway?]
Principal points in the OODB Manifesto 1
complex objects
need for orderings etc where not present in relations
[A2.1] any constructor should apply to any object
BUT does this rule out non-1NF relational models?
constructors not orthogonal in the relational model
object identity
[A2.2] object sharing and object updates
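The distinction between object identity and value equality, and its consequences for sharing and updates, can be illustrated with a minimal Python sketch. The `Person` class and the sample data are hypothetical and are not taken from the Manifesto or from O2.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


# Two objects with identical state are equal in value but remain distinct objects.
ann_1 = Person("Ann", 30)
ann_2 = Person("Ann", 30)
equal_in_value = ann_1 == ann_2   # dataclass equality compares the fields
same_object = ann_1 is ann_2      # identity test

# Object sharing: two parents hold references to one and the same child object.
child = Person("Kim", 5)
children_of_parent_1 = [child]
children_of_parent_2 = [child]

# Object update: a single change to the shared object is seen through both references.
child.age = 6
age_seen_by_parent_2 = children_of_parent_2[0].age
```

In a purely value-based model the two parents would each hold a copy of the child's data, and the update would have to be applied twice to keep the copies consistent.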
Principal points in the OODB Manifesto 2
encapsulation
data part + procedure part
[A2.3] in PL context, use information hiding
data part is part of the implementation
in classical DB context, is data structure hidden?
e.g. APs may use the file structure, queries don't
ambiguity: is relation table part of interface?
in OO setting use info hiding paradigm for DB:
can gain access to tuples etc only as objects
Principal points in the OODB Manifesto 3
encapsulation (cont.)
[A2.5] PS why bother encapsulating ad hoc queries?!
philosophy seems to be OOP is easy enough anyway
cf [3GM 1.3 especially p36]:
“encapsulated operators not enough”
Principal points in the OODB Manifesto 3
types and classes
[A2.4] Controversy in OOP itself re type vs class
class more of run-time notion
Issue
does the user or the system maintain classes?
cf design context: 1-off structures / environments
class type rectangle vs class of universal interest
Principal points in the OODB Manifesto 4
class or type hierarchies: inheritance
see the illustrative example from O2
[A2.5] have many varieties of inheritance ....
overriding, overloading and late binding
one display method interpreted differently by objects
e.g. display person, bitmap, graph
need late binding for this:
decide which display to invoke at run-time
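A minimal Python sketch of the display example may help here. The three classes and their output strings are hypothetical; the only point is that a single call site, `obj.display()`, is resolved at run time according to the class of each object.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def display(self):
        return f"person: {self.name}"


class Bitmap:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def display(self):
        return f"bitmap: {self.width}x{self.height} pixels"


class Graph:
    def __init__(self, edges):
        self.edges = edges

    def display(self):
        return f"graph: {len(self.edges)} edges"


def display_all(objects):
    # The call site names only 'display'; which body runs is decided
    # at run time from the class of each object (late binding).
    return [obj.display() for obj in objects]
```

Calling `display_all` on a mixed list of these objects invokes three different method bodies through the same name, which is what a statically bound call could not do.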
Principal points in the OODB Manifesto 5
persistence
PLs can be more DB-like: data can survive process execution
computational completeness
[A2.7] PLs with persistence gives character of DB
why live with restricted computational power of SQL?
ideally resource complete … [next step for OO?]
Principal points in the OODB Manifesto 6
extensibility
user defines own types
+ no distinction user/system-defined
[A2.9-2.11] wish list:
large DBs, concurrency, recovery ...
NB gives no evidence to suggest OO handles these!
Issues raised by the OODB approach 1
• consistency of update: how to maintain FDs?
A proposal: active data values + rules [3GM 1.5, p36]
This is a problematic programming style
• flexibility in schema evolution
object redesign is non-trivial
• a database is quite unlike a program:
is representation of state, not a state-transition model
• is OOP really that easy?
evidence against OOP for end-user programming
... still needs the guru?
Issues raised by the OODB approach 2
• OO modelling doesn't link real-world observations and computer model as simply as relational DB design
• what is the counterpart of the query language?
[3GM 3.2 p37] can't return to navigation!
can't by-pass the optimiser
schema evolution makes problematic
[3GM p38] "arguments against navigation are compelling & some programmers just need educating"
invidious arguments from efficiency [3GM 2.4]
Issues raised by the OODB approach 3
Disadvantages of OODBs
• no formal semantics
• loss of relational simplicity
• navigational queries
• no general query language
• lack of support for dynamic processes
Brown, A.W.
Object-oriented DBs: Applications in S/W Engineering
Appraisal of the 3G DB Manifesto
Philosophy of the 3G DB manifesto based on
Extended relational models
[3GM] we can get there from where we are ... you can't throw away the benefits of relational DB theory, can exploit it ...
NB Date and Darwen:
relations with objects as data elements
not objects in place of relations
so that relation = interrelationship amongst objects
Concerns re 3G DB manifesto
Approach to issues seems unsatisfactory in many ways:
• emphasises pragmatism: it attaches too much importance to whether the solution is realistic NOW
• suggests no need to change, just subsume, things
• presumes that only evolutionary change is required
A proposal commercially not academically motivated?
Concerns re 3G DB manifesto
A proposal commercially not academically motivated?
doesn't clarify what principles matter in relational DBs
buries any pretensions to a good underlying theory
cf Codd and Kent's concern for:
understanding what a good data model is
A database is not interesting just as a utility
Theory of DBs is concerned with fundamental issues for data modelling that are profoundly relevant to the design of PLs & data representation beyond computers
… this is a good point at which to revisit the case for the relational model, recognising the potential need to generalise what relational theory offers …
whyrel.ppt (slide 34)
Where is the future of databases? 1
A personal perspective
1. Tension between DB as
real-world model vs program generator
Good way of real-world modelling
? good way of programming
This was the thesis of Simula (1967), BUT
OOP doesn't deliver on this front?
Where is the future of databases? 2
1. Tension between DB as real-world model vs program generator
Contrast association of observables in RDB and OODB
Compare with agent-oriented modelling perspective:
• model what each agent observes
• model what each agent can act to change
DB as defining real-world STATES
programming as defining BEHAVIOURS
program constructs are about TRANSITIONS
Where is the future of databases? 3
2. Functional dependency
FDs ... seem to have an ambivalent role on the fringe of the relational theory [Kent: Data & Reality p138]
Fundamental to RDB design: powerful link content-form
? Idea not sufficiently general in the RDB context
e.g. consider 4NF, 5NF
e.g. relationships within a record
Where is the future of databases? 4
2. Functional dependency
? Idea not sufficiently general in the RDB context
e.g. consider 4NF
value in one set of columns determines the set of possible values in another set of columns etc
e.g. relationships within tuple Kent: Data & Reality p111
(Emp, DoB, Spouse, Spouse_DoB, Wedding_Date)
Issues
This is information re employee (the primary key)
Can find that E's S_DoB is .... but not (E's S)'s DoB
Model doesn't know W_D concerns relationship
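The issues with this tuple can be made concrete with a small Python sketch. The rows, the helper `holds_fd` and the lookup `dob_of_person` are all hypothetical illustrations; only the attribute names come from the example above.

```python
# Hypothetical rows for (Emp, DoB, Spouse, Spouse_DoB, Wedding_Date).
ROWS = [
    {"Emp": "Ann", "DoB": "1960-01-05", "Spouse": "Bob",
     "Spouse_DoB": "1958-07-12", "Wedding_Date": "1985-06-01"},
    {"Emp": "Cy", "DoB": "1971-03-20", "Spouse": "Di",
     "Spouse_DoB": "1972-11-02", "Wedding_Date": "1999-09-09"},
]


def holds_fd(rows, lhs, rhs):
    """Return True if any two rows that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True


# The only access path the relation offers is the primary key, Emp.
by_emp = {row["Emp"]: row for row in ROWS}


def dob_of_person(name):
    """Look a person up as a person: works only if they appear as an employee."""
    row = by_emp.get(name)
    return row["DoB"] if row else None
```

With this data, `by_emp["Ann"]["Spouse_DoB"]` finds Bob's date of birth as a fact about Ann, whereas `dob_of_person("Bob")` returns `None` because Bob is not represented as a person in his own right. Nothing in the structure records that `Wedding_Date` belongs to the pair rather than to the employee.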
Where is the future of databases? 5
2. Functional dependency
? Idea not sufficiently general in the RDB context
Relational tables can serve as enumerated functions
so …
Why not functions returning non-tabular structures?
Why not functions that can't be tabulated?
Important semantic distinctions: compare
• student determines slot in project timetable
• student determines supervisor
• student determines project mark
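The idea of a table as an enumerated function, and the two generalisations asked about above, can be sketched in Python. All names and data here are hypothetical, and the averaging rule for the mark is an assumption made purely for illustration.

```python
# An enumerated function: the table lists every (student, supervisor) pair.
SUPERVISOR_TABLE = {"s1": "Dr A", "s2": "Dr B"}  # hypothetical data


def supervisor_of(student):
    return SUPERVISOR_TABLE[student]


def timetable_slot(student):
    """A function returning a non-tabular structure: a nested record, not a flat row."""
    slots = {
        "s1": {"day": "Mon", "talks": ["intro", "demo"]},
        "s2": {"day": "Tue", "talks": ["intro"]},
    }
    return slots[student]


def project_mark(component_marks):
    """A function that cannot be tabulated in full: its domain (all non-empty
    lists of marks) is unbounded. The averaging rule is an assumption."""
    return sum(component_marks) // len(component_marks)


def tabulate(function, domain):
    """Enumerate a function over a finite domain, giving a relational-style table."""
    return {x: function(x) for x in domain}
```

Tabulating `supervisor_of` over the known students recovers the original table, which is the sense in which a relation is an enumerated function. The same uniform treatment hides the semantic differences between an assigned slot, a recorded supervisor and a computed mark.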
Where is the future of databases? 6
3. Dependencies between observations are
viewpoint dependent
atomicity of data
indivisibility of association between observables
are BOTH influenced by
who you are, and what you're doing
must distinguish between
rules, triggers and constraints
Where is the future of databases? 7
3. Dependencies between observations are viewpoint dependent
must distinguish between rules, triggers and constraints
cf. observations about a game of cricket include
dependencies that declare indivisibility:
boundary is scored as ball crosses rope
event-driven action:
when ball is received batsman plays shot
constraint:
always at most 4 of the batting side on the field
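One way to see why the three need distinguishing is to give each a different mechanism in code. The following Python sketch is an assumption-laden illustration: the class `Innings` and its member names are hypothetical, and only the three cricket observations come from the slide.

```python
class Innings:
    MAX_BATTING_SIDE_ON_FIELD = 4  # figure taken from the constraint stated above

    def __init__(self):
        self.rope_crossings = 0
        self.shots_played = 0
        self.batting_side_on_field = set()

    @property
    def boundaries_scored(self):
        # Dependency: defined in terms of rope_crossings, so the two
        # observations can never be seen to disagree.
        return self.rope_crossings

    def ball_received(self):
        # Trigger: the event causes an action, which is a separate change of state.
        self.shots_played += 1

    def enter_field(self, player):
        # Constraint: an update that would break the invariant is rejected.
        already_on = player in self.batting_side_on_field
        if not already_on and len(self.batting_side_on_field) >= self.MAX_BATTING_SIDE_ON_FIELD:
            raise ValueError("too many of the batting side on the field")
        self.batting_side_on_field.add(player)
```

The dependency involves no update at all, the trigger performs one update in response to another, and the constraint forbids certain updates. Treating all three as "rules" obscures these differences.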
Where is the future of databases? 8
3. Dependencies between observations are viewpoint dependent
must distinguish between rules, triggers and constraints
expert systems, deductive databases, ad hoc triggering, prototyping tools, hypercard, spreadsheets ...
... all use these powerful mechanisms, but have no satisfactory theoretical data modelling foundation
Where is the future of databases? 9
4. "Computers are only good for logic?!" [HD]
logic is emphatically not about state
- need variables with identity for state
Many great mathematicians contributed to formalising mathematics “... unfortunately, they also died." [BC-S]
logical variables don't have identity …
cf HD mode of reference to data (cf nested relations, atomicity of data) is not a logical concept
cf HD - what is the ROBIN attribute if not the identifier of a BIRD object?
Where is the future of databases? 10
4. "Computers are only good for logic?!" [HD]
logic isn’t always an appropriate medium for knowledge
representation [Mensa]
This is the basis of a very significant philosophical
argument in AI: the logicist vs. the non-logicist position
A Mensa problem (slides 37-45) illustrates the logicist view of knowledge as rational in an extreme form ….
Looking to the future 1
Emphasis of modern computing:
metaphor not symbolic representation
metaphor: the form reflects the content
e.g. metaphor is behind virtual reality
cf. a postscript file and the image it defines:
- the image is a metaphor for the thing itself
sensory elements are involved in metaphor
cf. no good objective criteria by which the user can choose
the "right" way to represent some given piece of data (p6)
Looking to the future 2
Crystal Ball Gazing ...
• key ideas of relational DBs will be taken over & generalised away from relational algebras
• the emphasis will shift from representation to metaphor: "database as a real-world model"
• a new focus for foundations will emerge, more general than classical logic
Where is the future of databases? 11
5. Where next?
database is about generating views for different agents
database is a generator of metaphors [cf virtual reality]
technology / medium dependent: if computers could
only generate smells would we have relational DBs?
in general (e.g. concurrent engineering)
no guaranteed consistent view, hence conflicts
+ need to represent outside framework of logic
Where is the future of databases? 12
5. Where next?
… in general, no guaranteed consistent view, hence conflicts
+ need to represent outside framework of logic
Classical DB suits where there is sharing + consensus
BUT harder to represent cooperation than consensus
cf individual idiosyncratic representations + many different perspectives on data
[cf. Brooks: No Silver Bullet - “the essence of software development”]
Compare seminars and books: book is a milestone, seminars are elusive, incomplete, but fundamentally just as important and more primary
Where is the future of databases? 13
5. Where next?
… in general, no guaranteed consistent view, hence conflicts
+ need to represent outside framework of logic
Consensus operates in many ways at many levels:
• agreement about experimental outcomes
• language and ritual assigns meaning
• object and domain identification
• essential milestones in design "progress"
Where is the future of databases? 14
6. Technical concepts being used to support
Definitive scripts to express FDs between observations via metaphor
Agents + redefinitions to model changes of state
Functions in underlying algebra encapsulate + displace tables in RDB
Modes of definition of variables, different agent viewpoints
... from experiment to theory aspect see http://www.dcs.warwick.ac.uk/modelling/
project_table_LHS_FD is project(current_table, makestrlist(FDs[current_FD][1]));
project_table_RHS_FD is project(current_table, [FDs[current_FD][2]]);
pattern_duplicate_rows is index_duplicated(tail(project_table_LHS_FD));
newcol is transformcol(makelistcol(project_table_RHS_FD), pattern_duplicate_rows);
newtable is apply_current_FD_current_table(current_table, newcol);
Listing 1: Observables and dependencies in the TLJ construal
An observation-oriented model of the testing lossless join algorithm (constructed using tkeden)
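For comparison, the conventional tableau (chase) formulation of the lossless join test can be sketched in Python. This is not a translation of the tkeden model in Listing 1; the function name, the argument format and the symbol encoding are all assumptions. The inner step does seem to play the same role as the listing's definitions: for the current FD, find rows that agree on the left-hand side and make their right-hand-side column entries agree.

```python
def is_lossless_join(attributes, decomposition, fds):
    """Tableau (chase) test: True if the decomposition has a lossless join under fds.

    attributes: list of attribute names of the original relation scheme
    decomposition: list of sets of attribute names
    fds: list of (lhs, rhs) pairs, each a tuple of attribute names
    """
    # One row per component: "a" is the distinguished symbol for a column,
    # ("b", i) is a symbol initially unique to row i.
    tableau = [
        {attr: "a" if attr in part else ("b", i) for attr in attributes}
        for i, part in enumerate(decomposition)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Group rows by their projection onto the left-hand side of the FD.
            groups = {}
            for row in tableau:
                groups.setdefault(tuple(row[x] for x in lhs), []).append(row)
            for rows in groups.values():
                for attr in rhs:
                    symbols = {row[attr] for row in rows}
                    if len(symbols) > 1:
                        # Equate the symbols, preferring the distinguished one,
                        # and rename every occurrence in the column.
                        target = "a" if "a" in symbols else min(symbols)
                        for row in tableau:
                            if row[attr] in symbols:
                                row[attr] = target
                        changed = True
    # Lossless exactly when some row consists only of distinguished symbols.
    return any(all(row[attr] == "a" for attr in attributes) for row in tableau)
```

For example, with attributes A, B, C and the single FD A → B, the decomposition into {A, B} and {A, C} passes the test, whereas the decomposition into {A, B} and {B, C} does not.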
End of the module
Logic and Commonsense Knowledge 1
A logical (?) problem [taken from a MENSA publication]
The Captain of the darts team needs 72 to win. Before throwing a dart, he remarks that (coincidentally) 72 is the product of the ages of his three daughters. After throwing one dart, he remarks that (coincidentally) the score for the dart he has just thrown is the sum of the ages of his daughters. Fred, his opponent, observes at this point that he doesn't know the ages of the Captain's daughters. "I'll give you a clue", says the Captain. "My eldest daughter is called Vanessa." "I see", says Fred. "Now I know their ages."
Exercise in inference: What were their ages?
Logic and Commonsense Knowledge 2
There is much domain knowledge and convention that is - or might be - relevant to the solution
• ages are integers
• ages are positive
• ages are restricted to a plausible range of values
• “knowing their ages” actually means “knowing the abstract set of ages” (in Mensa-speak) …
Logic and Commonsense Knowledge 3
… “knowing their ages” means “knowing the abstract set of ages”
• when Fred observes that he doesn't know their ages, he refers to knowing the set of ages, and not to being able to associate an age with any particular daughter who might turn up at the darts match.
• even when Fred says "Now I know their ages", were one or more of the daughters to turn up at the darts match, much more domain knowledge would be required to identify their ages.
Logic and Commonsense Knowledge 4
• correct use of "eldest" presupposes that there is only one eldest daughter
• what can be scored with one dart is restricted
There are also many conventions of the problem …
For instance, who's doing the reasoning?
• if Fred said "Now I know (the set of) their ages" before he knew that the eldest daughter was called Vanessa, would we know their ages?
Logic and Commonsense Knowledge 5
In any case:
• why should we attach any significance to Fred's observation that he doesn't know their ages? As will emerge ... we are meant to suppose that he is very clever, can be sure that everyone else is equally clever, and has taken full account of all the available information. But he might just be too lazy, ignorant or drunk to be able to factorise 72, or might not realise the significance of such factorisation.
Logic and Commonsense Knowledge 6
Solution to the problem
Because Fred doesn't know the ages before he knows that the Captain has an eldest daughter, we know that the value of the first dart is some number v such that xyz=72 and x+y+z=v has more than one solution set {x,y,z}.
The possible sets of factors of 72 are
{1,1,72}, {1,2,36}, {1,3,24}, {1,4,18}, {1,6,12}, {1,8,9},
{2,2,18}, {2,3,12}, {2,4,9}, {2,6,6}, {3,3,8}, {3,4,6}
Logic and Commonsense Knowledge 7
Solution to the problem
The possible sets of factors of 72 are
{1,1,72}, {1,2,36}, {1,3,24}, {1,4,18}, {1,6,12}, {1,8,9},
{2,2,18}, {2,3,12}, {2,4,9}, {2,6,6}, {3,3,8}, {3,4,6}
These are the associated sums of factors; they correspond to the value of first dart:
{1,1,72}: 74, {1,2,36}: 39, {1,3,24}: 28, {1,4,18}: 23, {1,6,12}: 19, {1,8,9}: 18,
{2,2,18}: 22, {2,3,12}: 17, {2,4,9}: 15, {2,6,6}: 14,
{3,3,8}: 14, {3,4,6}: 13
The only relevant information here is that there is just one way in which two distinct sets of ages generate the same sum viz. {2,6,6}: 14, {3,3,8}: 14
Logic and Commonsense Knowledge 8
Solution to the problem
there is just one way in which two distinct sets of ages generate the same sum viz. {2,6,6}: 14, {3,3,8}: 14
If we know that there is an eldest daughter, this rules
out the possibility that their set of ages is {2,6,6}, so
Vanessa is 8 etc.
Some interesting irrelevant information might have played a role in getting the answer had the problem been more subtle. For instance: {1,1,72} & {1,4,18} are impossible because of the constraints on the value of the first dart, whilst {1,2,36} is implausible if the girls really have the same mother.
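The purely logical part of the solution can be checked mechanically. The Python sketch below enumerates the age triples, keeps those whose sum is ambiguous, and then applies the "eldest" clue. The function names are hypothetical, and `single_dart_scores` assumes the standard dartboard (singles, doubles and trebles of 1 to 20, plus 25 and 50).

```python
from collections import defaultdict
from itertools import combinations_with_replacement


def age_triples(product=72):
    """All multisets of three positive integer ages with the given product."""
    candidates = combinations_with_replacement(range(1, product + 1), 3)
    return [t for t in candidates if t[0] * t[1] * t[2] == product]


def single_dart_scores():
    """Scores available with one dart on a standard board (an assumption)."""
    scores = {25, 50}
    for n in range(1, 21):
        scores.update({n, 2 * n, 3 * n})
    return scores


def solve(product=72):
    by_sum = defaultdict(list)
    for triple in age_triples(product):
        by_sum[sum(triple)].append(triple)
    # Fred knows the sum yet cannot decide, so the sum is shared by several triples.
    ambiguous = [triples for triples in by_sum.values() if len(triples) > 1]
    # "Eldest" presupposes a unique largest age; triples are in non-decreasing order.
    return [t for triples in ambiguous for t in triples if t[2] > t[1]]
```

`solve()` returns `[(3, 3, 8)]`, and 74 and 23 are indeed absent from `single_dart_scores()`. What the code cannot supply is the layer of convention discussed in the preceding slides: that Fred reasons perfectly, that "knowing their ages" means knowing the set of ages, and so on. Those assumptions are built into the program by its author.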
Logic and Commonsense Knowledge 9
Moral: real-world inference is not abstract logic but situated reasoning in which many incidental observations about the nature of the world determine what can be inferred. Such inference uses premises that are acts of faith.
Data modelling techniques need to be suitable for this ...
Logic and Commonsense Knowledge 10
I am at a conference in the Netherlands.
I arrive late at night and hardly notice where my room is.
Next morning, I notice that my room is on the top floor.
I walk down to breakfast thinking about my talk later on.
After breakfast I meet two other delegates X and Y.
We get in the lift to return to our rooms.
Logic and Commonsense Knowledge 11
X presses the button for floor 3.
Y says he is on the floor above X, and selects floor 4.
Since the top button is selected, I don’t press a button.
We talk as we ascend. The lift stops. The door opens.
The floor numbers aren’t clearly marked.
I say to X – ‘this must be floor 3’ – he gets out.
Logic and Commonsense Knowledge 12
Y and I carry on talking.
When the lift next stops, the floor is still unclear.
I say to Y ‘X is on the floor below you; this is your floor’.
Y gets out. I think something is not quite right.
I think ‘is this the top floor?’ and ‘should I get out?’.
I’m unsure, but notice that the button for floor 4 is still lit.
Logic and Commonsense Knowledge 13
I proceed to the top floor which is the next floor, floor 4.
When I get out of the lift, I can’t find my room.
There’s no room where my room is on floor 4.
I walk down to floor 3, and pass Y on his way to floor 4.
When I reach floor 3, I meet X coming up from floor 2 …
How did I manage to get all 3 of us to the wrong floor?
Logic and Commonsense Knowledge 14
Two key facts help to explain this …
1. Someone called the lift to floor 2 and didn’t wait for it to come. I persuaded X to get out at floor 2 thinking it was floor 3.
2. I was on floor 3, which was ‘locally’ the top floor, but the lift was in a part of the building where there were 4 floors.
How do we model this kind of commonsense scenario?
… now back to slide 29
Theory of Databases: themes 1
Data modelling: object-oriented
relational
entity-relationship
dependency
persistence/transience
Mathematical semantics
algebra + logic
procedural / declarative
batch / interactive
closed-world / open development
Theory of Databases: themes 2
Experiential: VR “current state”
perception of state
situation
efficiency of retrieval
mobile computing
Features of DBs: object abstractions
deductive elements
triggers
interfaces
Theory of Databases: themes 3
Pragmatic: SQL
education of designers / users
standards/diversity/integration
commercial influences
‘no need for normalisation’
Foundational: agenda of the DBA
DB design and anomalies
avoiding chaotic data organisation
logical + physical data independence
Theory of Databases: themes 4
Applications: classical applications
design DBs
real-time applications
distributed DBs
personal DBs
interactive applications
spreadsheets
4GL