Upload: roanna-fleming

Post on 31-Dec-2015

Page 1: Why Not Relational?

04/19/23 CS319 Theory of Databases1

Why Not Relational?

Evaluating the Manifestos

The future of databases

Page 2: Why Not Relational?

Evaluating the Database manifestos

A closer look at the OODB Manifesto

+ commentary from 3GDB Manifesto

Motivation for OODB revisited

(see intro to 3G Manifesto)

• rich objects (e.g. different abstractions, multi-media etc)

• design databases (e.g. 1-off structures, evolutionary character)

• behavioural elements (e.g. program code)

• better interfaces

Page 3: Why Not Relational?

‘Object’-oriented approaches in general sense

E-R modelling:

objects as ‘organising real-world observations’

generic classes + methods to transform state

look at the example O2 DB extract

objects substituted for relation tables

methods to get the information hiding

inheritance to share methods

Page 4: Why Not Relational?

Features of ‘object’-oriented approaches 1

Grouping of real-world observables by locality and by existence dependency … conspicuous omission is the concept of functional dependency cf. the way in which observations are organised across objects

General themes:

tension with mathematical abstraction

e.g. non-procedural flavour of set not list

not ‘variables with an identity’

Page 5: Why Not Relational?

Features of ‘object’-oriented approaches 2

Parody of the OODB position ...

OOP can solve the problems of HLL programming

indeed can achieve end-user-programming

DBs are an aberration that came about because programming used to be much more difficult; now that it's easy, need to break down the conceptual barrier between queries and APs

[possible subagenda: why need for DB anyway?]

Page 6: Why Not Relational?

Principal points in the OODB Manifesto 1

complex objects

need for orderings etc where not present in relations

[A2.1] any constructor should apply to any object

BUT does this rule out non-1NF relational models?

constructors not orthogonal in the relational model

object identity

[A2.2] object sharing and object updates

Page 7: Why Not Relational?

Principal points in the OODB Manifesto 2

encapsulation

data part + procedure part

[A2.3] in PL context, use information hiding

data part is part of the implementation

in classical DB context, is data structure hidden?

e.g. APs may use the file structure, queries don't

ambiguity: is relation table part of interface?

in OO setting use info hiding paradigm for DB:

can gain access to tuples etc only as objects

Page 8: Why Not Relational?

Principal points in the OODB Manifesto 3

encapsulation (cont.)

[A2.5] PS why bother encapsulating ad hoc queries?!

philosophy seems to be OOP is easy enough anyway

cf [3GM 1.3 especially p36]:

“encapsulated operators not enough”

Page 9: Why Not Relational?

Principal points in the OODB Manifesto 3

types and classes

[A2.4] Controversy in OOP itself re type vs class

class more of run-time notion

Issue

does the user or the system maintain classes?

cf design context: 1-off structures / environments

class type rectangle vs class of universal interest

Page 10: Why Not Relational?

Principal points in the OODB Manifesto 4

class or type hierarchies: inheritance

see the illustrative example from O2

[A2.5] have many varieties of inheritance ....

overriding, overloading and late binding

one display method interpreted differently by objects

e.g. display person, bitmap, graph

need late binding for this:

decide which display to invoke at run-time
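
The late-binding point can be sketched in a few lines of Python. The class names Person, Bitmap and Graph follow the slide's examples; the common base class, the constructor arguments and the strings returned by display are assumptions made for illustration and are not taken from the O2 extract.

```python
from __future__ import annotations


class Displayable:
    """Common supertype: each subclass overrides display()."""

    def display(self) -> str:
        raise NotImplementedError


class Person(Displayable):
    def __init__(self, name: str) -> None:
        self.name = name

    def display(self) -> str:
        return f"Person: {self.name}"


class Bitmap(Displayable):
    def __init__(self, width: int, height: int) -> None:
        self.width, self.height = width, height

    def display(self) -> str:
        return f"Bitmap: {self.width}x{self.height} pixels"


class Graph(Displayable):
    def __init__(self, edges: list[tuple[str, str]]) -> None:
        self.edges = edges

    def display(self) -> str:
        return f"Graph: {len(self.edges)} edges"


def display_all(objects: list[Displayable]) -> list[str]:
    # The call site names only display(); which override runs is decided
    # at run time from the class of each object (late binding).
    return [obj.display() for obj in objects]


shown = display_all([Person("Ann"), Bitmap(640, 480), Graph([("a", "b")])])
```

The caller never tests the type of an object; adding a new displayable class requires no change to display_all.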

Page 11: Why Not Relational?

Principal points in the OODB Manifesto 5

persistence

PLs can be more DB-like: data can survive process execution

computational completeness

[A2.7] PLs with persistence gives character of DB

why live with restricted computational power of SQL?

ideally resource complete … [next step for OO?]

Page 12: Why Not Relational?

Principal points in the OODB Manifesto 6

extensibility

user defines own types

+ no distinction user/system-defined

[A2.9-2.11] wish list:

large DBs, concurrency, recovery ...

NB gives no evidence to suggest OO handles these!

Page 13: Why Not Relational?

Issues raised by the OODB approach 1

• consistency of update: how to maintain FDs?

A proposal: active data values + rules [3GM 1.5, p36]

This is a problematic programming style

• flexibility in schema evolution

object redesign is non-trivial

• a database is quite unlike a program:

is representation of state, not a state-transition model

• is OOP really that easy?

evidence against OOP for end-user programming

... still needs the guru?

Page 14: Why Not Relational?

Issues raised by the OODB approach 2

• OO modelling doesn't link real-world observations and computer model as simply as relational DB design

• what is the counterpart of the query language?

[3GM 3.2 p37] can't return to navigation!

can't by-pass the optimiser

schema evolution makes problematic

[3GM p38] "arguments against navigation are compelling & some programmers just need educating"

invidious arguments from efficiency [3GM 2.4]

Page 15: Why Not Relational?

Issues raised by the OODB approach 3

Disadvantages of OODBs

• no formal semantics

• loss of relational simplicity

• navigational queries

• no general query language

• lack of support for dynamic processes

Brown, A.W.

Object-oriented DBs: Applications in S/W Engineering

Page 16: Why Not Relational?

Appraisal of the 3G DB Manifesto

Philosophy of the 3G DB manifesto based on

Extended relational models

[3GM] we can get there from where we are ... you can't throw away the benefits of relational DB theory, can exploit it ...

NB Date and Darwen:

relations with objects as data elements

not objects in place of relations

so that relation = interrelationship amongst objects

Page 17: Why Not Relational?

Concerns re 3G DB manifesto

Approach to issues seems unsatisfactory in many ways:

• emphasises pragmatism: it attaches too much importance to whether the solution is realistic NOW

• suggests no need to change, just subsume, things

• presumes that only evolutionary change is required

A proposal commercially not academically motivated?

Page 18: Why Not Relational?

Concerns re 3G DB manifesto

A proposal commercially not academically motivated?

doesn't clarify what principles matter in relational DBs

buries any pretensions to a good underlying theory

cf Codd and Kent's concern for:

understanding what a good data model is

A database is not interesting just as a utility

Theory of DBs is concerned with fundamental issues for data modelling that are profoundly relevant to the design of PLs & data representation beyond computers

Page 19: Why Not Relational?

… this is a good point at which to revisit the case for the relational model, recognising the potential need to generalise what relational theory offers …

whyrel.ppt (slide 34)

Page 20: Why Not Relational?

Where is the future of databases? 1

A personal perspective

1. Tension between DB as

real-world model vs program generator

Good way of real-world modelling

? good way of programming

This was the thesis of Simula (1967), BUT

OOP doesn't deliver on this front?

Page 21: Why Not Relational?

Where is the future of databases? 2

1. Tension between DB as real-world model vs program generator

Contrast association of observables in RDB and OODB

Compare with agent-oriented modelling perspective:

• model what each agent observes

• model what each agent can act to change

DB as defining real-world STATES

programming as defining BEHAVIOURS

program constructs are about TRANSITIONS

Page 22: Why Not Relational?

Where is the future of databases? 3

2. Functional dependency

FDs ... seem to have an ambivalent role on the fringe of the relational theory [Kent: Data & Reality p138]

Fundamental to RDB design: powerful link content-form

? Idea not sufficiently general in the RDB context

e.g. consider 4NF, 5NF

e.g. relationships within a record

Page 23: Why Not Relational?

Where is the future of databases? 4

2. Functional dependency

? Idea not sufficiently general in the RDB context

e.g. consider 4NF

value in one set of columns determines the set of possible values in another set of columns etc

e.g. relationships within tuple Kent: Data & Reality p111

(Emp, DoB, Spouse, Spouse_DoB, Wedding_Date)

Issues

This is information re employee (the primary key)

Can find that E's S_DoB is .... but not (E'S)'s DoB

Model doesn't know W_D concerns relationship
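
A small Python sketch of the issues listed above, using the five-attribute tuple from Kent's example. The row contents ("Ann", "Bob" and the dates) are hypothetical, and the two lookup functions are names chosen here for illustration.

```python
from typing import NamedTuple, Optional


class EmpRow(NamedTuple):
    emp: str
    dob: str
    spouse: str
    spouse_dob: str
    wedding_date: str


# Hypothetical data. The relation is keyed on emp only, so every other
# column is recorded as a fact about the employee.
employees = {
    "Ann": EmpRow("Ann", "1960-01-05", "Bob", "1958-07-12", "1985-06-01"),
}


def spouse_dob_of(emp: str) -> Optional[str]:
    # "E's S_DoB": reachable, because it is an attribute of the key.
    row = employees.get(emp)
    return row.spouse_dob if row else None


def dob_of(person: str) -> Optional[str]:
    # A date of birth can only be looked up for someone who is a key value.
    row = employees.get(person)
    return row.dob if row else None
```

Here spouse_dob_of("Ann") finds Bob's date of birth, but dob_of("Bob") finds nothing: the model holds the date without knowing that it belongs to Bob, and nothing in the schema says that wedding_date is a fact about the pair rather than about Ann.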

Page 24: Why Not Relational?

Where is the future of databases? 5

2. Functional dependency

? Idea not sufficiently general in the RDB context

Relational tables can serve as enumerated functions

so …

Why not functions returning non-tabular structures?

Why not functions that can't be tabulated?

Important semantic distinctions: compare

• student determines slot in project timetable

• student determines supervisor

• student determines project mark
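
To make the "table as enumerated function" reading concrete, here is a minimal Python sketch. All names and data (supervisor_table, timetable_slot, square, the students and supervisors) are assumptions introduced for illustration.

```python
# Enumerated function: the dependency "student determines supervisor"
# stored extensionally, one row per argument (hypothetical data).
supervisor_table = {"alice": "Dr X", "bob": "Dr Y"}


def supervisor(student):
    return supervisor_table[student]


def tabulate(f, domain):
    """Enumerate f over a finite domain as (argument, value) rows."""
    return [(x, f(x)) for x in domain]


def timetable_slot(student):
    # A function whose value is a nested, non-tabular structure.
    return {"day": "Mon", "talks": [student, "questions"]}


def square(n):
    # Defined on every integer, so no finite table holds all of it;
    # only a fragment over a chosen finite domain can be tabulated.
    return n * n
```

tabulate(supervisor, ["alice", "bob"]) recovers the relation, whereas timetable_slot and square show the two generalisations asked about: a value that is not a flat tuple, and a function with no finite tabulation.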

Page 25: Why Not Relational?

Where is the future of databases? 6

3. Dependencies between observations are

viewpoint dependent

atomicity of data

indivisibility of association between observables

are BOTH influenced by

who you are, and what you're doing

must distinguish between

rules, triggers and constraints

Page 26: Why Not Relational?

Where is the future of databases? 7

3. Dependencies between observations are viewpoint dependent

must distinguish between rules, triggers and constraints

cf. observations about a game of cricket include

dependencies that declare indivisibility: boundary is scored as ball crosses rope

event-driven action: when ball is received batsman plays shot

constraint: always at most 4 of the batting side on the field
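
One way to see the distinction is to code the three cricket observations differently. The sketch below is an illustration only: the Match class, its attribute names and the distances are assumptions, and the limit of 4 is simply the figure given on the slide.

```python
class Match:
    """Toy model of the three kinds of statement on the slide."""

    MAX_BATTING_SIDE_ON_FIELD = 4  # the figure given on the slide

    def __init__(self, rope_distance):
        self.rope_distance = rope_distance
        self.ball_distance = 0.0
        self.batting_side_on_field = 2
        self.events = []

    @property
    def boundary_scored(self):
        # Dependency: derived from the ball's position, never stored,
        # so it cannot get out of step with what it depends on.
        return self.ball_distance >= self.rope_distance

    def ball_received(self):
        # Trigger: an event causes an action (a state transition).
        self.events.append("batsman plays shot")

    def set_batting_side_on_field(self, count):
        # Constraint: restricts which states are admissible; it rejects
        # an update but does not itself cause any action.
        if count > self.MAX_BATTING_SIDE_ON_FIELD:
            raise ValueError("too many of the batting side on the field")
        self.batting_side_on_field = count
```

The dependency holds in every state by construction, the trigger fires only when its event occurs, and the constraint is checked only when an update is attempted.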

Page 27: Why Not Relational?

Where is the future of databases? 8

3. Dependencies between observations are viewpoint dependent

must distinguish between rules, triggers and constraints

expert systems, deductive databases, ad hoc triggering, prototyping tools, hypercard, spreadsheets ...

... all use these powerful mechanisms, but have no satisfactory theoretical data modelling foundation

Page 28: Why Not Relational?

Where is the future of databases? 9

4. "Computers are only good for logic?!" [HD]

logic is emphatically not about state

- need variables with identity for state

Many great mathematicians contributed to formalising mathematics “... unfortunately, they also died.” [BC-S]

logical variables don't have identity …

cf HD mode of reference to data (cf nested relations, atomicity of data) is not a logical concept

cf HD - what is the ROBIN attribute if not the identifier of a BIRD object?

Page 29: Why Not Relational?

Where is the future of databases? 10

4. "Computers are only good for logic?!" [HD]

logic isn’t always an appropriate medium for knowledge

representation [Mensa]

This is the basis of a very significant philosophical

argument in AI: the logicist vs. the non-logicist position

A Mensa problem (slides 37-45) illustrates the logicist view of knowledge as rational in an extreme form ….

Page 30: Why Not Relational?

Looking to the future 1

Emphasis of modern computing:

metaphor not symbolic representation

metaphor: the form reflects the content

e.g. metaphor is behind virtual reality

cf. a postscript file and the image it defines:

- the image is a metaphor for the thing itself

sensory elements are involved in metaphor

cf. no good objective criteria by which the user can choose

the "right" way to represent some given piece of data (p6)

Page 31: Why Not Relational?

Looking to the future 2

Crystal Ball Gazing ...

• key ideas of relational DBs will be taken over & generalised away from relational algebras

• the emphasis will shift from representation to metaphor: "database as a real-world model"

• a new focus for foundations will emerge, more general than classical logic

Page 32: Why Not Relational?

Where is the future of databases? 11

5. Where next?

database is about generating views for different agents

database is a generator of metaphors [cf virtual reality]

technology / medium dependent: if computers could

only generate smells would we have relational DBs?

in general (e.g. concurrent engineering)

no guaranteed consistent view, hence conflicts

+ need to represent outside framework of logic

Page 33: Why Not Relational?

Where is the future of databases? 12

5. Where next?

… in general, no guaranteed consistent view, hence conflicts

+ need to represent outside framework of logic

Classical DB suits where there is sharing + consensus

BUT harder to represent cooperation than consensus

cf individual idiosyncratic representations + many different perspectives on data

[cf. Brooks: No Silver Bullet - “the essence of software development”]

Compare seminars and books: book is a milestone, seminars are elusive, incomplete, but fundamentally just as important and more primary

Page 34: Why Not Relational?

Where is the future of databases? 13

5. Where next?

… in general, no guaranteed consistent view, hence conflicts

+ need to represent outside framework of logic

Consensus operates in many ways at many levels:

• agreement about experimental outcomes

• language and ritual assigns meaning

• object and domain identification

• essential milestones in design “progress”

Page 35: Why Not Relational?

Where is the future of databases? 14

6. Technical concepts being used to support

Definitive scripts to express FDs between observations via metaphor

Agents + redefinitions to model changes of state

Functions in underlying algebra encapsulate + displace tables in RDB

Modes of definition of variables, different agent viewpoints

... from experiment to theory aspect see http://www.dcs.warwick.ac.uk/modelling/

Page 36: Why Not Relational?

project_table_LHS_FD is project(current_table, makestrlist(FDs[current_FD][1]));

project_table_RHS_FD is project(current_table, [FDs[current_FD][2]]);

pattern_duplicate_rows is index_duplicated(tail(project_table_LHS_FD));

newcol is transformcol(makelistcol(project_table_RHS_FD), pattern_duplicate_rows);

newtable is apply_current_FD_current_table(current_table, newcol);

Listing 1: Observables and dependencies in the TLJ construal

An observation-oriented model of the testing lossless join algorithm (constructed using tkeden)
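
The five definitions in Listing 1 appear to describe one step of the test: project on the two sides of an FD, find rows that agree on the left-hand side, and rewrite the right-hand-side column accordingly. The Python sketch below is not a translation of the tkeden script, whose helper functions are not shown on the slide. It is the textbook tableau test for a lossless join, with the function name lossless_join and the (lhs, rhs) tuple encoding of FDs chosen here for illustration.

```python
def lossless_join(attributes, decomposition, fds):
    """Tableau test: True if the decomposition has a lossless join.

    attributes:    list of attribute names of the original relation
    decomposition: list of sets of attribute names
    fds:           list of (lhs, rhs) pairs, each a tuple of names
    """
    # One row per component: "a" where the component has the attribute,
    # otherwise a symbol unique to that row and column.
    table = [
        {a: ("a" if a in comp else f"b{i}_{a}") for a in attributes}
        for i, comp in enumerate(decomposition)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Group rows by their projection on the left-hand side.
            groups = {}
            for row in table:
                groups.setdefault(tuple(row[a] for a in lhs), []).append(row)
            for rows in groups.values():
                for attr in rhs:
                    values = {row[attr] for row in rows}
                    if len(values) > 1:
                        # Equate the symbols, preferring "a".
                        target = "a" if "a" in values else min(values)
                        for row in table:
                            if row[attr] in values:
                                row[attr] = target
                        changed = True
    # Lossless exactly when some row has become all "a".
    return any(all(v == "a" for v in row.values()) for row in table)
```

For R(A, B, C) with the single FD A → B, the decomposition {A, B}, {A, C} passes the test and {A, B}, {B, C} fails it. Each change merges two symbols in a column, so the loop terminates.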

Page 37: Why Not Relational?

End of the module

Page 38: Why Not Relational?

Logic and Commonsense Knowledge 1

A logical (?) problem [taken from a MENSA publication]

The Captain of the darts team needs 72 to win. Before throwing a dart, he remarks that (coincidentally) 72 is the product of the ages of his three daughters. After throwing one dart, he remarks that (coincidentally) the score for the dart he has just thrown is the sum of the ages of his daughters. Fred, his opponent, observes at this point that he doesn't know the ages of the Captain's daughters. "I'll give you a clue", says the Captain. "My eldest daughter is called Vanessa." "I see", says Fred. "Now I know their ages."

Exercise in inference: What were their ages?

Page 39: Why Not Relational?

Logic and Commonsense Knowledge 2

There is much domain knowledge and convention that is - or might be - relevant to the solution

• ages are integers

• ages are positive

• ages are restricted to a plausible range of values

• “knowing their ages” actually means “knowing the abstract set of ages” (in Mensa-speak) …

Page 40: Why Not Relational?

Logic and Commonsense Knowledge 3

… “knowing their ages” means “knowing the abstract set of ages”

• when Fred observes that he doesn't know their ages, he refers to knowing the set of ages, and not to being able to associate an age with any particular daughter who might turn up at the darts match.

• even when Fred says "Now I know their ages", were one or more of the daughters to turn up at the darts match, much more domain knowledge would be required to identify their ages.

Page 41: Why Not Relational?

Logic and Commonsense Knowledge 4

• correct use of "eldest" presupposes that there is only one eldest daughter

• what can be scored with one dart is restricted

There are also many conventions of the problem …

For instance, who's doing the reasoning?

• if Fred said "Now I know (the set of) their ages" before he knew that the eldest daughter was called Vanessa, would we know their ages?

Page 42: Why Not Relational?

Logic and Commonsense Knowledge 5

In any case:

• why should we attach any significance to Fred's observation that he doesn't know their ages? As will emerge ... we are meant to suppose that he is very clever, can be sure that everyone else is equally clever, and has taken full account of all the available information. But he might just be too lazy, ignorant or drunk to factorise 72, or might not realise the significance of such a factorisation.

Page 43: Why Not Relational?

Logic and Commonsense Knowledge 6

Solution to the problem

Because Fred doesn't know the ages before he knows that the Captain has an eldest daughter, we know that the value of the first dart is some number v such that xyz=72 and x+y+z=v has more than one solution set {x,y,z}.

The possible sets of factors of 72 are

{1,1,72}, {1,2,36}, {1,3,24}, {1,4,18}, {1,6,12}, {1,8,9},

{2,2,18}, {2,3,12}, {2,4,9}, {2,6,6}, {3,3,8}, {3,4,6}

Page 44: Why Not Relational?

Logic and Commonsense Knowledge 7

Solution to the problem

The possible sets of factors of 72 are

{1,1,72}, {1,2,36}, {1,3,24}, {1,4,18}, {1,6,12}, {1,8,9},

{2,2,18}, {2,3,12}, {2,4,9}, {2,6,6}, {3,3,8}, {3,4,6}

These are the associated sums of factors; they correspond to the value of first dart:

{1,1,72}: 74, {1,2,36}: 39, {1,3,24}: 28, {1,4,18}: 23, {1,6,12}: 19, {1,8,9}: 18

{2,2,18}: 22, {2,3,12}: 17, {2,4,9}: 15, {2,6,6}: 14

{3,3,8}: 14, {3,4,6}: 13

The only relevant information here is that there is just one way in which two distinct sets of ages generate the same sum viz. {2,6,6}: 14, {3,3,8}: 14

Page 45: Why Not Relational?

Logic and Commonsense Knowledge 8

Solution to the problem

there is just one way in which two distinct sets of ages generate the same sum viz. {2,6,6}: 14, {3,3,8}: 14

If we know that there is an eldest daughter, this rules

out the possibility that their set of ages is {2,6,6}, so

Vanessa is 8 etc.
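
The purely logical part of this solution can be checked mechanically. The following Python sketch enumerates the candidate age sets, groups them by sum and applies the "eldest" clue; the variable names are chosen here for illustration, and the domain assumptions (positive integer ages, a unique eldest daughter) are exactly the ones discussed on the earlier slides.

```python
from collections import defaultdict
from itertools import combinations_with_replacement

PRODUCT = 72

# All multisets of three positive integer ages whose product is 72,
# each written in non-decreasing order.
triples = [
    t
    for t in combinations_with_replacement(range(1, PRODUCT + 1), 3)
    if t[0] * t[1] * t[2] == PRODUCT
]

# Group the candidate age sets by their sum (the value of the first dart).
by_sum = defaultdict(list)
for t in triples:
    by_sum[sum(t)].append(t)

# Fred still cannot decide, so the sum must be shared by several sets.
ambiguous = {s: ts for s, ts in by_sum.items() if len(ts) > 1}

# "Eldest daughter" presupposes a unique largest age.
solutions = [t for ts in ambiguous.values() for t in ts if t[2] > t[1]]
```

The code encodes only what the slides call the relevant information. Everything that makes the puzzle solvable in practice (who is reasoning, what a dart can score, what "eldest" presupposes) had to be supplied by the programmer as assumptions.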

Some interesting irrelevant information might have played a role in getting the answer had the problem been more subtle. For instance: {1,1,72} & {1,4,18} are impossible because of the constraints on the value of the first dart, whilst {1,2,36} is implausible if the girls really have the same mother.

Page 46: Why Not Relational?

Logic and Commonsense Knowledge 9

Moral: real-world inference is not abstract logic but situated reasoning in which many incidental observations about the nature of the world determine what can be inferred. Such inference uses premises that are acts of faith.

Data modelling techniques need to be suitable for this ...

Page 47: Why Not Relational?

Logic and Commonsense Knowledge 10

I am at a conference in the Netherlands.

I arrive late at night and hardly notice where my room is.

Next morning, I notice that my room is on the top floor.

I walk down to breakfast thinking about my talk later on.

After breakfast I meet two other delegates X and Y.

We get in the lift to return to our rooms.

Page 48: Why Not Relational?

Logic and Commonsense Knowledge 11

X presses the button for floor 3.

Y says he is on the floor above X, and selects floor 4.

Since the top button is selected, I don’t press a button.

We talk as we ascend. The lift stops. The door opens.

The floor numbers aren’t clearly marked.

I say to X – ‘this must be floor 3’ – he gets out.

Page 49: Why Not Relational?

Logic and Commonsense Knowledge 12

Y and I carry on talking.

When the lift next stops, the floor is still unclear.

I say to Y ‘X is on the floor below you; this is your floor’.

Y gets out. I think something is not quite right.

I think ‘is this the top floor?’ and ‘should I get out?’.

I’m unsure, but notice that the button for floor 4 is still lit.

Page 50: Why Not Relational?

Logic and Commonsense Knowledge 13

I proceed to the top floor which is the next floor, floor 4.

When I get out of the lift, I can’t find my room.

There’s no room where my room is on floor 4.

I walk down to floor 3, and pass Y on his way to floor 4.

When I reach floor 3, I meet X coming up from floor 2 …

How did I manage to get all 3 of us to the wrong floor?

Page 51: Why Not Relational?

Logic and Commonsense Knowledge 14

Two key facts help to explain this …

1. Someone called the lift to floor 2 and didn’t wait for it to come. I persuaded X to get out at floor 2 thinking it was floor 3.

2. I was on floor 3, which was ‘locally’ the top floor, but the lift was in a part of the building where there were 4 floors.

How do we model this kind of commonsense scenario?

Page 52: Why Not Relational?

… now back to slide 29

Page 53: Why Not Relational?

Theory of Databases: themes 1

Data modelling

object-oriented

relational

entity-relationship

dependency

persistence/transience

Mathematical semantics

algebra + logic

procedural / declarative

batch / interactive

closed-world / open development

Page 54: Why Not Relational?

Theory of Databases themes 2

Experiential

VR “current state”

perception of state

situation

efficiency of retrieval

mobile computing

Features of DBs

object abstractions

deductive elements

triggers

interfaces

Page 55: Why Not Relational?

Theory of Databases themes 3

Pragmatic

SQL

education of designers / users

standards/diversity/integration

commercial influences

‘no need for normalisation’

Foundational

agenda of the DBA

DB design and anomalies

avoiding chaotic data organisation

logical + physical data independence

Page 56: Why Not Relational?

Theory of Databases themes 4

Applications

classical applications

design DBs

real-time applications

distributed DBs

personal DBs

interactive applications

spreadsheets

4GL