8 October, 2001 DBIS4 - RLC 1

Databases and Information Systems 4

Richard Cooper (rich@dcs)

and

Tony Printezis (tony@dcs)

8 October, 2001 DBIS4 - RLC 2

The Fundamental Problem

• Database Systems have been very successful in providing good support for managing data which is fairly large and fairly complex

• What happens when:
  – the data gets very much larger
  – the data gets very much more complex

8 October, 2001 DBIS4 - RLC 3

Contents of Course

• Week 1 (Richard)
  – Introduction
  – Overview of RDB/ORDB/OODB

• Week 2 (Richard)
  – Orthogonal Persistence
  – Object Oriented Database Systems

8 October, 2001 DBIS4 - RLC 4

Contents of Course

• Week 3 (Tony)
  – Java Object Serialization
  – The PJama API

• Week 4 (Tony)
  – Object Caching and Object Faulting
  – Pointer Swizzling

8 October, 2001 DBIS4 - RLC 5

Contents of Course 3

• Week 5 (Tony)
  – Garbage Collection - Disk Behaviour
  – Object Promotion

• Week 6 (Tony)
  – Object Eviction
  – Orthogonal Persistence for Java

8 October, 2001 DBIS4 - RLC 6

Contents of Course 4

• Week 7 (Tony)
  – Store Organisation
  – Garbage Collection

• Week 8 (Richard)
  – Object Query Languages
  – Transaction Models

8 October, 2001 DBIS4 - RLC 7

Contents of Course 5

• Week 9 (Richard)
  – Transaction Models for Multi-Site Databases
  – Schema Evolution

• Week 10
  – Specialised Indexing (Ela)
  – XML (Richard)

8 October, 2001 DBIS4 - RLC 8

Assumptions about Database Use

As database systems evolved, it was assumed that:

1. There was a central data store with lots of distributed users.

2. The data was relatively simple (largely alphanumeric).

3. The data was regular and complete.

4. There was a lot of data, but there was also an implicit limit to the size.

5. The users were either consumers or specialised creators

8 October, 2001 DBIS4 - RLC 9

The Real World

• Now we have:
  – data all over the place
  – in all kinds of structures
  – much of it is text
  – even more of it is graphical or aural
  – vast amounts of it
  – some of it is missing or is structured differently in different places
  – users with various kinds of interest/involvement

8 October, 2001 DBIS4 - RLC 10

When Data is Small

• You can get away with:
  – non-linear algorithms
  – hand-crafted code and data
  – an ad hoc structure
  – implicit rules and informal conventions

8 October, 2001 DBIS4 - RLC 11

When Data Gets Large

• You must have
  – linear or (better still) incremental algorithms
  – systematic code and data management
  – regular structures, frameworks and tools to support them
  – explicit, visible and interpretable rules

8 October, 2001 DBIS4 - RLC 12

When Data is also Long Lived

• We have the hardware to keep data for a very long time
  – and there are often laws forcing us to do so

• However, long-lived data tends to change:
  – new data is added
  – it is restructured
  – the software expected to handle it evolves

• Can you read a ten year old floppy???

8 October, 2001 DBIS4 - RLC 13

When Data is also Heterogeneous

• Information Systems increasingly must bring together data
  – of different kinds (numeric and multi-media)
  – produced separately (e.g. in merged companies)
  – produced for different purposes
  – produced using different technologies

• As though they were all designed to work together

8 October, 2001 DBIS4 - RLC 14

Large, Long-lived, Heterogeneous and Unstoppable

• Because the data supports continuous operations:
  – utilities, banking, airlines, public service

• You may not stop such systems if:
  – you want to change the hardware or software
  – you want to change your database
  – you want to change the application
  – there are hardware or software failures
  – there are operations which require exclusive access

8 October, 2001 DBIS4 - RLC 15

This is the Reality we Live With

• There are lots of examples:
  – shared scientific data (e.g. genomic data)
  – e-business
  – governmental systems and health-care data
  – computer aided design and manufacturing
  – geographic information systems
  – etc., etc.

8 October, 2001 DBIS4 - RLC 16

And There Are Many More Media for Data Access

• Not just a private network, but also:
  – the internet
  – digital television
  – mobile devices
  – etc., etc.

8 October, 2001 DBIS4 - RLC 17

How To Cope 1

• Software Re-use:
  – not just small libraries such as Java APIs
  – but large components, such as:
    • databases, payroll packages, GUI packages, etc.

• Standardised Frameworks
  – CORBA, DCOM, EJB, .NET, XML

8 October, 2001 DBIS4 - RLC 18

How to Cope 2

• Generate code rather than write it
  – since much code is repetitious and can be generated from
    • a high-level notation or by reflecting over data

• Work incrementally
  – revolution is never affordable
  – plan and resource route for transition
    • remember the users!

8 October, 2001 DBIS4 - RLC 19

The Fundamental Coping Device

• Effective high-level and complex standards for representing:
  – data (relations not enough)
  – applications (regular, strict languages needed)
  – distributed systems (CORBA, etc.)
  – processes (UML, business processes, etc.)
  – etc., etc.

8 October, 2001 DBIS4 - RLC 20

But also ...

• It may be necessary to create new storage techniques to fit new data structures

• It will be necessary to invent new storage structures to manage the new complexity

• There is need for work at both
  – the implementation level and
  – the usability level

8 October, 2001 DBIS4 - RLC 21

Lecture 2

• New Requirements on DB Functions

• Why Relations Won't Do

• Extending Relations:
  – Historical and Deductive Databases

• Object Relational Databases
  – Oracle Objects, SQL3, etc.

• Object Oriented Databases
  – intro only

8 October, 2001 DBIS4 - RLC 22

New Applications with New Requirements

1. CAD, CAE, CIM

2. Computer Aided Software Engineering

3. Office Information Systems

4. Geographic Information Systems

5. Hypermedia Systems

Data is large, often graphical, multiple versions required, data is complex

8 October, 2001 DBIS4 - RLC 23

Requirements which carry over from Traditional Applications

• Efficient access to large amounts of data

• Recovery mechanisms

• Security mechanisms

• Data independence

• Distribution of data

8 October, 2001 DBIS4 - RLC 24

Requirements Modified by the New Applications I

• Transactions
  – in traditional applications, these are short - milliseconds to book a seat
  – in novel applications, they may be long - hours or days to edit a design
  – in traditional applications they are competitive - don't book the same seat twice
  – in novel applications they may be co-operative, e.g. collaboration on design development

8 October, 2001 DBIS4 - RLC 25

Requirements Modified by the New Applications II

• Integrity Constraints are much more important
  – as the data is more semantically complex
  – some of the semantics is best expressed as constraints

• User Interfaces play a greater rôle
  – the data is manageable only if appropriately visualised
  – complex operations must be made usable

8 October, 2001 DBIS4 - RLC 26

Requirements Modified by the New Applications III

• Data is organised differently

                        Trad. Apps    Novel Apps
    Numbers of Objects  Large         Small
    Number of Types     Small         Large
    Object size         Small         Large/Huge

8 October, 2001 DBIS4 - RLC 27

New Requirements Made by the New Applications I

• Complex Data Structures
  – Just sets of records won't do
  – Object identity easier than primary keys
  – Implicit references easier than foreign keys
  – First Normal Form is a Killer!

• Multimedia Data Types

8 October, 2001 DBIS4 - RLC 28

New Requirements Made by the New Applications II

• The Database must hold Code
  – to hold complex derived data
  – to hold "active values"

• Multiple Versions
  – We only want one bank account record at any time
  – But many alternative designs
  – Building configurations becomes a problem

8 October, 2001 DBIS4 - RLC 29

Can We Go On Using Relational DBMS?

Only with increased mapping problems

The RM only has two ways of relating two pieces of data:

1. They are in the same record.

2. They are in two records connected by a foreign key.

8 October, 2001 DBIS4 - RLC 30

The Semantic Poverty of the RM

• The former is used for:
    • grouping attributes
    • 1-1 relationships
    • compound attributes
    • connecting keys of M-N relationships

• The latter is used for:
    • multi-valued attributes
    • sub-typing
    • one-many attributes

8 October, 2001 DBIS4 - RLC 31

Other Problems with RDBs

• You can't do recursive queries
  – e.g. "Return all the ancestors of X"

• Nor is there much support for constraints
  – e.g. "All employees earn less than their boss"

• You can't add new operations
  – e.g. "Return the volume of a building"

• Impedance mismatch
  – if you have to use a PL, this has a different data model than does SQL

8 October, 2001 DBIS4 - RLC 32

Three Approaches for Progress

Start with traditional DBMS and extend its modelling power -> Object-Relational System

or

Start with rich data model and add DBMS facilities -> Object Oriented DBMS

or

Start with a Programming Language and add DBMS facilities -> Persistent Prog Language

Manifesto Wars

8 October, 2001 DBIS4 - RLC 33

The Third-Generation Database System Manifesto I

Three tenets:

1. Besides traditional data management services, third generation DBMSs will provide support for richer object structures and rules

2. Third generation DBMSs must subsume second generation systems

3. Third generation DBMSs must be open to other subsystems

8 October, 2001 DBIS4 - RLC 34

The Third-Generation Database System Manifesto II

Thirteen Propositions:
  – Rich type system
  – Inheritance
  – Functions/encapsulation
  – OIDs only if no primary key
  – Rules (triggers and constraints) are important
  – The query language should be central to all access
  – Manual & Automatic Collections
  – Update through views
  – Performance and data model should be kept separate
  – Multiple Prog. Languages
  – SQL is the de facto standard
  – Persistent extension of languages is good
  – Network communication through queries and results

8 October, 2001 DBIS4 - RLC 35

The Object Oriented System Manifesto I

Mandatory Features:
  – Complex Objects
  – Object Identity
  – Encapsulation
  – Types and Classes
  – Inheritance
  – Late binding
  – Ad hoc querying
  – Extensibility
  – Persistence
  – Efficient storage
  – Concurrency
  – Recovery
  – Computational completeness

Disagreement:
  – Integrity constraints
  – DB Admin Tools
  – Views
  – Schema Evolution Tools

8 October, 2001 DBIS4 - RLC 36

The Object Oriented System Manifesto II

Optional Features:
  – Multiple inheritance
  – Type checking
  – Distribution
  – Design Transactions
  – Versions

Open Choices:
  – Programming paradigm
  – Type system
  – Uniformity

8 October, 2001 DBIS4 - RLC 37

The Third Manifesto

The relational model is still important and OO features should be orthogonal

Like:
  – relations
  – relational algebra up front
  – integrity constraints
  – multiple and single inheritance
  – computational completeness
  – static type checking

Don't like:
  – SQL, object IDs and null values

8 October, 2001 DBIS4 - RLC 38

Two Extensions of RDBMS

• Historical DBMS
  – keep all past states of the database

• Deductive DBMS
  – derived data as well as base data
  – uses a language like Prolog to add the derived data

8 October, 2001 DBIS4 - RLC 39

Historical DBMS

Old records are kept when they are deleted, to answer queries like "give balance on 1/10/88?"

Records have two extra fields - creation and deletion dates:
  – delete sets the deletion field
  – insert sets the creation field
  – update sets the deletion field and creates a new record

Two notions of time: when the data is valid and when it is entered
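
The creation/deletion-date scheme on this slide can be sketched in a few lines. This is only an illustration using Python's built-in sqlite3 module, not how any particular historical DBMS is built; the Account table, its column names and the helper functions are assumptions made up for the example, and only one notion of time (when the record was entered) is modelled.

```python
import sqlite3

def open_history_db():
    """Create an in-memory database with a history-keeping Account table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "create table Account ("
        " accNo integer, balance real,"
        " created text not null,"   # date this version of the record was inserted
        " deleted text)"            # date it was superseded; NULL while current
    )
    return db

def insert(db, acc_no, balance, today):
    # insert sets the creation field
    db.execute("insert into Account values (?, ?, ?, null)", (acc_no, balance, today))

def delete(db, acc_no, today):
    # delete only sets the deletion field; the old record is kept
    db.execute(
        "update Account set deleted = ? where accNo = ? and deleted is null",
        (today, acc_no),
    )

def update(db, acc_no, balance, today):
    # update sets the deletion field of the old record and creates a new record
    delete(db, acc_no, today)
    insert(db, acc_no, balance, today)

def balance_on(db, acc_no, day):
    """Return the balance that was current on the given ISO date, or None."""
    row = db.execute(
        "select balance from Account where accNo = ? and created <= ?"
        " and (deleted is null or deleted > ?)",
        (acc_no, day, day),
    ).fetchone()
    return row[0] if row else None

db = open_history_db()
insert(db, 1, 100.0, "1988-01-01")
update(db, 1, 250.0, "1988-11-15")
print(balance_on(db, 1, "1988-10-01"))   # the balance as it stood on 1/10/88
print(balance_on(db, 1, "1988-12-01"))   # the balance after the update
```

Dates are stored as ISO strings so that plain string comparison orders them correctly; a fuller design would add a second pair of fields for valid time.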

8 October, 2001 DBIS4 - RLC 40

Deductive DBMS (DDB)

A DDB is made up of two kinds of component:

facts are simple base assertions - i.e. records
    father( jane, john )    mother( jill, jane )

rules are ways of deriving more facts
    grandfather( C, G ) :- parent( C, P ), father( P, G )
    parent( C, P ) :- father( C, P ), etc.

Queries are rules with variables to be filled in:
    grandfather( X, john )? - who are john's grandchildren
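
To see how the rules derive new facts from the base facts, the slide's example can be evaluated by hand or with a small sketch like the one below. It is plain Python, not a Prolog engine; it assumes that the "etc." in the parent rule stands for a second clause parent( C, P ) :- mother( C, P ), and that the arguments are ordered (child, parent) as the grandfather rule implies.

```python
# Base facts, written (child, parent) as on the slide.
father = {("jane", "john")}
mother = {("jill", "jane")}

def parent():
    # parent(C, P) :- father(C, P).
    # parent(C, P) :- mother(C, P).   (assumed meaning of "etc.")
    return father | mother

def grandfather():
    # grandfather(C, G) :- parent(C, P), father(P, G).
    return {(c, g) for (c, p) in parent() for (p2, g) in father if p == p2}

def grandchildren_of(g):
    # The query grandfather(X, g)? returns every X that satisfies the rule.
    return sorted(c for (c, g2) in grandfather() if g2 == g)

print(grandchildren_of("john"))   # ['jill']
```

Here grandfather( jill, john ) is never stored; it is derived each time from the two base facts, which is the distinction between base and derived data that a DDB draws.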

8 October, 2001 DBIS4 - RLC 41

Object-Relational Databases

• Also known as:
  – Extended relational databases
  – Complex object databases

• Main features
  – get rid of First Normal Form
  – add methods to tables

• Main examples
  – Oracle 8/i onwards, SQL3, Informix

8 October, 2001 DBIS4 - RLC 42

The Main Additions to RDBs

• User defined abstract data types

• Row types so that one value can include a nested complex value

• Collection types for domains

• Inclusion of user-defined functions defined on types

• Inheritance

• Multimedia data types and large objects

8 October, 2001 DBIS4 - RLC 43

SQL3 (Evolving Standard)

• This is a massive extension to SQL and has:
  – computational completeness
  – row types
  – user-defined types
  – user-defined procedures, functions and operators
  – type constructors for arrays, sets, lists and multisets
  – support for large objects - BLOBs and CLOBs
  – recursion

8 October, 2001 DBIS4 - RLC 44

Row Types in SQL3

• A row type is a sequence of field name/type pairs - i.e. the type of a row of a table

• In SQL3 it can also be the domain of a column

    create table Branch( branchNo longInt,
                         address row( street varchar(20),
                                      city varchar(20) ) );

• Row types can be named:

    create row type EmpRT( Ename varchar(35), age integer );

    create table Employee of type EmpRT;

8 October, 2001 DBIS4 - RLC 45

User-Defined Types (UDTs) in SQL3

• These are a means of defining new domain types in SQL3, e.g.:

create type StaffNumberType as varchar(5) final;

• More generally a UDT is an abstract data type with:
  – (non First Normal Form) fields
  – constructor methods
  – observer and mutator (get and set) methods
  – general methods

8 October, 2001 DBIS4 - RLC 46

UDT Example

create type PersonType as (
    private dateOfBirth Date,
    public fname VARCHAR(15) not null,
    public lname VARCHAR(15) not null,
    function age(p PersonType) returns integer
        return /* code to calculate age */
    end )
    ref is system generated    // see later
    instantiable               // if not, only subtypes are
    not final;                 // can have sub-types

8 October, 2001 DBIS4 - RLC 47

Subtypes and Supertypes

• Given a type, we can create a subtype, e.g.:

    create type StaffType under PersonType as
        ( staffNo varchar(6), etc.

• This works by creating an extra attribute which refers to a PersonType value

• This also works at the table level:

    create table Manager under Staff( MgrStartDate Date);

  – This creates a table with all the columns of Staff duplicated and all manager records in both tables

8 October, 2001 DBIS4 - RLC 48

References

• In SQL3 it is possible to set up OID style references.

• On slide 46 we said that PersonType had system-generated references, so we can do:

    create table Branch as ( branchNo integer,
                             address addressType,
                             manager ref(PersonType) ..... )

• In this, the value is a system-generated OID

8 October, 2001 DBIS4 - RLC 49

Collection Types

• SQL3 supports four collection types:
  – ARRAY - one dimensional fixed length array
  – LIST - ordered and allows duplicates
  – SET - unordered and does not allow duplicates
  – MULTISET - unordered and allows duplicates

• E.g. if PersonType has an attribute:

    nextOfKin set(PersonType)

• The following makes sense:

    select fName, lName, count(NextOfKin)
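
The difference between LIST, SET and MULTISET shows up as soon as the same values are loaded into each. The sketch below uses Python's built-in list and set and collections.Counter as rough stand-ins; these are analogies chosen for illustration, not SQL3 semantics, and the count helper is a made-up name.

```python
from collections import Counter

values = ["ann", "bob", "ann"]

as_list = list(values)          # LIST: order and duplicates both preserved
as_set = set(values)            # SET: duplicates collapse, order is not significant
as_multiset = Counter(values)   # MULTISET: duplicates counted, order is not significant

def count(collection):
    """Number of elements, counting duplicates wherever the collection type keeps them."""
    if isinstance(collection, Counter):
        return sum(collection.values())
    return len(collection)

print(count(as_list), count(as_set), count(as_multiset))   # 3 2 3

# Order matters for a list but not for a multiset.
print(as_list == ["bob", "ann", "ann"])                    # False
print(as_multiset == Counter(["bob", "ann", "ann"]))       # True
```

So a count over a set-valued attribute such as nextOfKin counts distinct people, whereas the same count over a list or multiset would include repeats.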

8 October, 2001 DBIS4 - RLC 50

Triggers

• Triggers are pieces of code which act when some condition is met. Each trigger defines:
  – the event and whether to act before or after it occurs
  – whether to operate on each row or only once
  – what to do

    create trigger MailNewStaffNextOfKin
    after insert on Staff referencing new row as ST
    begin
        insert into StaffToMail values
            ( select P.name, P.address from Person P
              where ST.nextOfKin[1] = ST.staffNo )
    end
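
A runnable variant of the same idea is sketched below with SQLite (through Python's sqlite3 module) rather than SQL3, so the syntax differs: SQLite exposes the inserted row as "new" instead of a "referencing" clause, and it has no collection types, so nextOfKin is assumed here to be a single reference to a Person row. The table layouts and the personNo column are assumptions for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
create table Person (personNo integer primary key, name text, address text);
create table Staff (staffNo integer primary key, nextOfKin integer references Person);
create table StaffToMail (name text, address text);

create trigger MailNewStaffNextOfKin
after insert on Staff          -- the event, and that we act after it occurs
for each row                   -- operate once per inserted row
begin                          -- what to do
    insert into StaffToMail
        select P.name, P.address from Person P
        where P.personNo = new.nextOfKin;
end;
""")

db.execute("insert into Person values (1, 'Jill', '12 High St')")
db.execute("insert into Staff values (100, 1)")   # fires the trigger

mailing_list = db.execute("select name, address from StaffToMail").fetchall()
print(mailing_list)   # [('Jill', '12 High St')]
```

The application code only inserts into Staff; the row in StaffToMail appears as a side effect, which is what makes triggers useful for "active values" and for enforcing rules close to the data.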

8 October, 2001 DBIS4 - RLC 51

Large Objects

• Large objects are increasingly important and there are two kinds:
  – Binary Large Objects (BLOBs)
  – Character Large Objects (CLOBs)

• You can
  – Concatenate them and do "substring" operations
  – Overlay and trim them
  – Return the length

8 October, 2001 DBIS4 - RLC 52

Recursion

• SQL3 permits linearly recursive queries, such as:

    with recursive AllManagers( staffNo, managerStaffNo) as
      ( select staffNo, managerStaffNo
        from Staff
        union
        select in.staffNo, out.managerStaffNo
        from AllManagers in, Staff out
        where in.managerStaffNo = out.staffNo )
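
The recursive query can be tried out on a three-person chain of command. The sketch below runs it in SQLite via Python's sqlite3 module; the sample rows are invented, the aliases are renamed from in/out to a/s because "in" is a reserved word, and a final select is added to read the result and drop the rows whose manager is NULL.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("create table Staff (staffNo integer primary key, managerStaffNo integer)")
# 1 is the top manager, 2 reports to 1, 3 reports to 2.
db.executemany("insert into Staff values (?, ?)", [(1, None), (2, 1), (3, 2)])

rows = db.execute("""
with recursive AllManagers(staffNo, managerStaffNo) as (
    select staffNo, managerStaffNo from Staff          -- direct managers
    union
    select a.staffNo, s.managerStaffNo                 -- managers of managers
    from AllManagers a, Staff s
    where a.managerStaffNo = s.staffNo
)
select staffNo, managerStaffNo from AllManagers
where managerStaffNo is not null
order by staffNo, managerStaffNo
""").fetchall()

print(rows)   # [(2, 1), (3, 1), (3, 2)]
```

The pair (3, 1) is the interesting one: it is not in Staff and is only found by following the managerStaffNo link twice, which a non-recursive SQL query of fixed shape cannot do for chains of arbitrary length.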

8 October, 2001 DBIS4 - RLC 53

Objects in Oracle

• The object option in Oracle8 provides, among other things:
  – user-defined data types
  – the use of objects directly by use of the ref keyword
  – collection types including variable length arrays
  – multimedia data types

8 October, 2001 DBIS4 - RLC 54

User Defined Types

• UDTs have a name, attributes and methods:

    create type Person as object
      ( name varchar2(30),
        address varchar2(40),
        member function getName return varchar2 );

• Constructor methods - as usual

• Comparison methods - to help order objects

• General methods

8 October, 2001 DBIS4 - RLC 55

Ref Types

• Attributes with object types have their domains declared using ref:

create type Person as OBJECT

( name VARCHAR2(30),

spouse ref Person );

• For an object P of type Person, you can then do:

P.spouse.name // to get the name of P's spouse

8 October, 2001 DBIS4 - RLC 56

Collection Types

• There are two collection types:
  – Arrays (called VARRAYs)

    create type Prices as varray(10) of number(1,2)

  – Tables (called nested tables)

    create type PersonTable as table of Person;

• Now we can have columns whose domains are either of the above

8 October, 2001 DBIS4 - RLC 57

Object Views

• An object view is a virtual table of objects
  – useful to evolve relational applications into object applications

    create table Person (NINum varchar2(9),
                         Name varchar2(30), Age number)

    create view OldView with object oid (NINum) as
        select NINum, Name, Age from Person
        where Age > 40;

• Update through views permitted where sensible

8 October, 2001 DBIS4 - RLC 58

Comparing ORDBs and OODBs

• ORDBs are better for:
  – integrating a pre-existing RDB
  – traditional DBMS facilities (security, recovery etc.)

• OODBs are better for:
  – advanced transactions, navigational queries
  – schema evolution
  – integrating a programming language

8 October, 2001 DBIS4 - RLC 59

Lecture 3: Orthogonal Persistence

• Why Orthogonal Persistence is important

• What Orthogonal Persistence is

• Principles of Orthogonal Persistence

• How to achieve Orthogonal Persistence

• Examples of Persistence Mechanisms

8 October, 2001 DBIS4 - RLC 60

The Problem

• Traditional data intensive programming requires programmers to be distracted trying to arrange storage for the data:
  – Fortran programs + files
  – Cobol + Network Databases
  – C, etc. + Relations

• This distraction slows productivity

8 October, 2001 DBIS4 - RLC 61

Too Many Mappings!

[Figure: two panels. "Traditional Persistence (Using a File)": Real World Understanding; In Memory Model (Piano: Legs, Keys, Keyboard, Top); Stored Model (12 31 9 3.5 4.7 6.3 2.5). "Orthogonal Persistence": Real World Understanding; Computer Model (Piano: Legs, Keys, Keyboard, Top).]

8 October, 2001 DBIS4 - RLC 62

Defining Persistence

• Persistence is the length of time for which a piece of data (including program) continues to exist. It ranges:
  – from lasting only until the end of the block it was declared in
  – to outliving the program which constructed it

• Most systems provide different persistence mechanisms for different data.
  – Often systems only permit long term persistence for some data - e.g. JOS.

8 October, 2001 DBIS4 - RLC 63

Orthogonal Persistence

• is the automatic management of data so that it may:
  – outlive an individual program execution
    • automatically moving to and from backing store
  – be used concurrently by more than one program
    • not just storing a heap image - e.g. LISP, SmallTalk
    • dynamic binding of names and types
  – be used by successive program versions
    • requires an evolution mechanism

8 October, 2001 DBIS4 - RLC 64

Principles of OP

• Data of any type (including multimedia and code fragments) should have an equal right to all levels of persistence

• All of the data is stored completely

• The data retains its structure when stored

• The code is the same whatever the persistence of its data

8 October, 2001 DBIS4 - RLC 65

Why is this Important?

• Every departure from these rules creates an irregularity that the programmer has to work around:
  – data types which cannot be stored in the same way as everything else
  – rebuilding incomplete structures
  – dealing with referential integrity problems
  – different code for transient and persistent data

8 October, 2001 DBIS4 - RLC 66

Other Benefits of OP

• Only one persistence technique to learn

• Avoids extra code which obscures the application logic

• Permits code re-use

• But how does the programmer assign different persistence levels to the data?
  – Any data can persist, but for this application which should?

8 October, 2001 DBIS4 - RLC 67

Mechanisms for Indicating Persistence

• Explicit write statements - not in the spirit of OP

• Persistence indicated by class or type
  – ODMG supports this
  – The E language had "Shadow" classes - one for each real class

• Persistence indicated at object declaration or at object creation
  – some OODBs do this

• Persistence by reachability
  – this will be our favourite, you'll see!

8 October, 2001 DBIS4 - RLC 68

Persistent Class Examples

• Classes declared to be persistent:
  – persistent class Person
    • early ODMG proposal
  – class Person implements Serializable
    • Java - native code can't play
  – class Person : public d_Object
    • ODMG proposal for C++

8 October, 2001 DBIS4 - RLC 69

Persistent Object Examples

• persistent Person P;

• Person P = new Person(MyDB);
  – Person P is created in the database

• Person Q = new Person( P );
  – Person Q is created in the database "near to" Person P.

8 October, 2001 DBIS4 - RLC 70

Persistence by Reachability

• Some objects are explicitly stored - persistent roots

• Any other object which is pointed to by a root is automatically stored as well

• Objects pointed to by those objects are also stored
  – in fact, the transitive closure of references from the roots is stored

• This is similar to Garbage Collection
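
The rule "store whatever is reachable from a root" amounts to a graph traversal, much like the marking phase of a garbage collector. The sketch below is a minimal illustration in Python, not the mechanism of PJama or any other persistent system; the Node class and the reachable_from function are hypothetical names invented for the example.

```python
class Node:
    """A toy object whose only references are to its children."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def reachable_from(roots):
    """Return every object in the transitive closure of references from the roots."""
    seen = {}                      # id(obj) -> obj, so shared objects are visited once
    pending = list(roots)
    while pending:
        obj = pending.pop()
        if id(obj) in seen:
            continue               # already marked; this also makes cycles safe
        seen[id(obj)] = obj
        pending.extend(obj.children)
    return list(seen.values())

leaf = Node("leaf")
root = Node("root", [Node("inner", [leaf]), leaf])   # leaf is shared by two parents
orphan = Node("orphan")                               # not referenced from root

to_store = sorted(node.label for node in reachable_from([root]))
print(to_store)   # ['inner', 'leaf', 'root']
```

Only root was named explicitly; inner and leaf are stored because they are reachable from it, leaf is stored once even though two objects point to it, and orphan stays transient.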

8 October, 2001 DBIS4 - RLC 71

Example of Reachability

[Figure: Memory and The Database. A tree in memory; explicit storage of the tree root; the rest of the tree is dragged in as well.]

8 October, 2001 DBIS4 - RLC 72

Using Persistence by Reachability

• The data must be organised around the idea of persistent roots and their transitive closures

• Note this is not new
  – An RDB has each relation as a root whose transitive closure is the set of records
  – ORDB and OODB databases can be organised the same way
  – Except other structures may now be used - e.g. a tree

8 October, 2001 DBIS4 - RLC 73

History of OP

1978 - Identified by Atkinson

1978 - 1982 - Search for a suitable language

1983 - 1988 - PS-algol

1988 - 1995 - Napier88

1985 - present, ideas gradually appear in commercial systems

1995 - 2000 - PJama, Persistent Java

8 October, 2001 DBIS4 - RLC 74

What the Research has Entailed

• Identification of language with suitable properties
– regularity, popularity

• Identification of necessary techniques
– store organisation, memory management, organising the movement of data

• Implementation of those techniques efficiently

8 October, 2001 DBIS4 - RLC 75

Lecture 4

• Persistent Programming Languages
– What is a suitable language?
– Some examples

• Object Oriented Database Systems
– Features
– Examples
– The Object Data Management Group Standard

8 October, 2001 DBIS4 - RLC 76

A Suitable Language to Make Persistent

• A persistent programming language is one which accords with the principles of persistence (slide 64)

• In building a persistent language other aspects of a language are desirable:
– regularity and small number of constructs
– since irregularities and more constructs increase the number of aspects that the persistence layer must cope with

8 October, 2001 DBIS4 - RLC 77

PS-algol

• This added persistence to S-algol, a simple and regular form of algol developed at St. Andrews
– complex object structure, but object domains were all of the same type
– procedures are first-class objects, which means an object can have a piece of code as a component, there are variables which hold procedures, etc.
– databases as objects in which you can enter name, value pairs to be persistent roots
– persistence by reachability from those
– anything can persist
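Two of these ideas, a database as a table of named roots and an object that carries a piece of code as a component, can be imitated in Java. This is a loose sketch and not PS-algol: Database, enter, lookup and Account are hypothetical names, and the table lives only in memory.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

class NamedRootsDemo {
    // Hypothetical stand-in for a database object: a table of name, value pairs.
    static class Database {
        private final Map<String, Object> roots = new HashMap<>();

        // Entering a pair makes the value a root in this sketch.
        void enter(String name, Object value) {
            roots.put(name, value);
        }

        Object lookup(String name) {
            return roots.get(name);
        }
    }

    // An object with a procedure as one of its components.
    static class Account {
        final int balance;
        final IntUnaryOperator interest;   // code held as a field value

        Account(int balance, IntUnaryOperator interest) {
            this.balance = balance;
            this.interest = interest;
        }
    }

    public static void main(String[] args) {
        Database db = new Database();
        db.enter("savings", new Account(100, b -> b + b / 10));

        // Anything reachable from the named root, including the procedure,
        // would persist under persistence by reachability.
        Account a = (Account) db.lookup("savings");
        System.out.println(a.interest.applyAsInt(a.balance));
    }
}
```

Because the procedure is an ordinary component of the Account, a store that follows references from the root would have to be able to store code as well as data, which is the point of "anything can persist".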

8 October, 2001 DBIS4 - RLC 78

Napier88

• More powerful version of PS-algol developed at St Andrews and Glasgow

• complex objects but now the domains are typed

• single procedure to return the persistent store as the sole persistent root object

• databases inserted immediately below this

• abstract data types and other type constructors

• hyper-programming allows programming directly against the database

• image data type

8 October, 2001 DBIS4 - RLC 79

Persistent Java

• PJama was developed in Glasgow from 1995 onwards

• Allows Java objects to be bound into the persistent store and retrieved

• Much more on this in subsequent lectures

8 October, 2001 DBIS4 - RLC 80

Object Oriented Databases

• An Object Oriented Database has the following features:
– Objects can persist
– Object identifiers and references
– Encapsulation of data and methods
– Inheritance
– Dynamic binding of code to data

8 October, 2001 DBIS4 - RLC 81

Example

OID 56789
name: Richard
department: OID 43215

OID 56543
name: David
department: OID 43215

OID 43215
name: Computing Science
head: OID 56543

[Figure labels: Memory, Disc, Object Table]

Object Table entries: 56789, 56543, 43215
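The role of the object table can be sketched as a lookup from OID to object, with references between objects held as OIDs rather than pointers. This is an illustrative sketch: the class and method names are hypothetical, the table maps OIDs to in-memory objects rather than to disc locations, and it assumes that both people's department field refers to the department object, OID 43215.

```java
import java.util.HashMap;
import java.util.Map;

class ObjectTableDemo {
    static class Person {
        final String name;
        final int departmentOid;   // reference held as an OID, not a pointer

        Person(String name, int departmentOid) {
            this.name = name;
            this.departmentOid = departmentOid;
        }
    }

    static class Department {
        final String name;
        final int headOid;

        Department(String name, int headOid) {
            this.name = name;
            this.headOid = headOid;
        }
    }

    // The object table: one entry per OID.
    static Map<Integer, Object> buildTable() {
        Map<Integer, Object> table = new HashMap<>();
        table.put(56789, new Person("Richard", 43215));
        table.put(56543, new Person("David", 43215));
        table.put(43215, new Department("Computing Science", 56543));
        return table;
    }

    // Every reference is followed by going back through the table.
    static String headOfDepartmentOf(Map<Integer, Object> table, int personOid) {
        Person p = (Person) table.get(personOid);
        Department d = (Department) table.get(p.departmentOid);
        Person head = (Person) table.get(d.headOid);
        return head.name;
    }

    public static void main(String[] args) {
        System.out.println(headOfDepartmentOf(buildTable(), 56789));
    }
}
```

The indirection means an object can be moved, for example between disc and memory, by updating one table entry instead of every reference to it.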

8 October, 2001 DBIS4 - RLC 82

Problems with OODBs

• They are hard to implement
– Adding concurrency, distribution, efficiency, reliability and querying to an OO system is difficult

• They use different persistence mechanisms

• They use different OO models
– and different OO languages

• They have been produced by small, unstable companies

8 October, 2001 DBIS4 - RLC 83

Differences in Object Models

• Are scalars objects?

• Can properties be public?
– if not, how is the optimiser going to work?

• Are there other information hiding controls? - e.g. friends

• Multiple or single inheritance?

• What can be made persistent and how?

8 October, 2001 DBIS4 - RLC 84

History of OODBMSs

• First products in the field use Smalltalk/Own Language:
– 1986/7 - GemStone and Vbase

• Big companies toy with the idea
– 1987 - DEC (Trellis/Owl) and Hewlett-Packard (IRIS)

• C++ Products in Late Eighties
– Ontos, Versant, Objectivity, ObjectStore

• Other models 1990 onwards
– O2, POET, UniSQL, Jasmine, etc.

8 October, 2001 DBIS4 - RLC 85

GemStone/J

• Started as persistent Smalltalk

• Switch now to Java
– Distributed Java Beans and EJBs
– Servlets and JSP
– CORBA
– etc.

• OQL, Transactions, etc.

8 October, 2001 DBIS4 - RLC 86

Jasmine

• From Computer Associates (INGRES RDB)

• Studio for application development

• Java

• Multimedia classes

• Authoring tools

• Web development facilities

8 October, 2001 DBIS4 - RLC 87

POET

• Java and C++

• OQL

• Targeted at small applications

• Transactions and Locking

• Schema versions

• Event Notification

• Security and Authorisation

• Object factory - putting objects into RDBs

8 October, 2001 DBIS4 - RLC 88

The Object Data Management Group (ODMG)

• Set up by Rick Cattell at Sun and the main OODB vendors:
– voting members - Sun, POET, Objectivity, Excelon
– reviewer members - CERN, Versant, CA, NEC and Micro Data Base Systems
– academic members
– membership always changing!

8 October, 2001 DBIS4 - RLC 89

What are the ODMG Doing?

– an architecture for OODBMS
– a logical data model expressed as a class hierarchy
– a data definition language, ODL
– a data interchange format, OIF
– a query language, OQL
– a number of Object Manipulation Languages (OMLs)
   • bindings to Java, C++ and Smalltalk

8 October, 2001 DBIS4 - RLC 90

ODMG - OO Features Appropriate for Databases

• Special Treatment of Literal Values
– A DB cannot afford to make an integer an object

• Separate Provision for Relationships
– Most OO models are not very good at relationships
– ODMG provides for automatically maintained relationships - i.e. when one side changes so does the other

• Domain Types - date and time domains

• Objects for Database Management
– databases, transactions, locks, sessions, schemata

• Metadata Management

8 October, 2001 DBIS4 - RLC 91

The ODMG Data Model

• The data model is defined in terms of a number of types which include:
– Interfaces - describe the abstract behaviour of objects
– Classes - describe the abstract behaviour and state of objects
– Collections - sets, bags, lists, arrays, dictionaries
– Constructed Types - enumerations, structures and unions
– Objects (with identity) and Literals (no identity)

8 October, 2001 DBIS4 - RLC 92

The Type Hierarchy

[Figure: the ODMG type hierarchy, with the type names recovered from the diagram]

Denotable_Object
– Object
   • Atomic_Object: Type, Exception, Iterator
   • Structural_Object: Structure <e1: T1 .... en: Tn>, Collection <T> (Array <T>, Set <T>, Bag <T>, List <T>, String, Bit_String)
– Literal
   • Atomic_Literal
   • Structured_Literal: Immutable_Structure <e1: T1 .... en: Tn> (Date, Time, TimeStamp, Interval), Enumeration, Immutable_Collection <T> (Immutable_Array <T>, Immutable_Set <T>, Immutable_Bag <T>, Immutable_List <T>, Immutable_String, Immutable_Bit_String)

Characteristic
– Property: Attribute, Relationship
– Operation

8 October, 2001 DBIS4 - RLC 93

Example (ODL)

• struct Address { int house; String road; ... }
– defines a complex literal (not an object)

• interface Person { String name; int age; ... }
– defines an uninstantiable object structure

• class Employee : Person { int StaffNo; Dept d; ... }
– defines an instantiable object structure
– ":" is inheritance which can be multiple

8 October, 2001 DBIS4 - RLC 94

Relationships

• Attributes and relationships are distinguished

class Employee : Person {
    attribute int StaffNo;
    relationship Dept d inverse Dept::Employees; ... }

class Dept {
    relationship set<Employee> Employees inverse Employee::d; ... }

• Relationships can have automatically maintained inverses
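What "automatically maintained" means can be sketched in plain Java, where the programmer has to write by hand the bookkeeping that an ODMG system would perform. This is an illustrative sketch under that assumption: the setDept method and the field layout are hypothetical, and "Mathematics" is an invented second department.

```java
import java.util.HashSet;
import java.util.Set;

class InverseRelationshipDemo {
    static class Dept {
        final String name;
        // Inverse side: kept in step by Employee.setDept.
        final Set<Employee> employees = new HashSet<>();

        Dept(String name) {
            this.name = name;
        }
    }

    static class Employee {
        final int staffNo;
        private Dept d;

        Employee(int staffNo) {
            this.staffNo = staffNo;
        }

        Dept getDept() {
            return d;
        }

        // Changing one side of the relationship updates the other side too.
        void setDept(Dept newDept) {
            if (d != null) {
                d.employees.remove(this);
            }
            d = newDept;
            if (newDept != null) {
                newDept.employees.add(this);
            }
        }
    }

    public static void main(String[] args) {
        Dept cs = new Dept("Computing Science");
        Dept maths = new Dept("Mathematics");
        Employee e = new Employee(1);
        e.setDept(cs);
        e.setDept(maths);   // moving the employee also updates both sets
        System.out.println(cs.employees.size() + " " + maths.employees.size());
    }
}
```

In this sketch the sets are still directly modifiable, so consistency relies on callers going through setDept; a declared inverse removes that burden from the application code.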

8 October, 2001 DBIS4 - RLC 95

Extents

• The extent of a type is the set of instances of that type in the database

• The extent of a subtype is a subset of the extent of the supertype

• The DB designer can request that the extent of a class is maintained automatically

• A particular implementation may include indexes and keys
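The first three points can be sketched as a store that records each new object in the extent of its class and of every superclass. This is an illustrative sketch only: Store, add and extentOf are hypothetical names, the extents are held in memory, and indexes and keys are left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ExtentDemo {
    static class Person { }

    static class Employee extends Person { }

    // Hypothetical store that maintains extents automatically.
    static class Store {
        private final Map<Class<?>, Set<Object>> extents = new HashMap<>();

        // Adds the object to the extent of its class and of each superclass,
        // so a subtype's extent is always a subset of its supertype's extent.
        <T> T add(T obj) {
            for (Class<?> c = obj.getClass(); c != Object.class; c = c.getSuperclass()) {
                extents.computeIfAbsent(c, k -> new HashSet<>()).add(obj);
            }
            return obj;
        }

        Set<Object> extentOf(Class<?> type) {
            return extents.getOrDefault(type, Collections.emptySet());
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        store.add(new Person());
        store.add(new Employee());
        System.out.println(store.extentOf(Person.class).size() + " "
                + store.extentOf(Employee.class).size());
    }
}
```

A query such as "all employees" can then be answered from the Employee extent without scanning the whole store.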