Information Retrieval and Use

De-normalisation and Distributed database systems

Geoff Leese September 2008, revised October 2009

Mapping the logical model onto physical design

Entities become tables (more often than not!)
Attributes become fields (columns)
Unique identifiers become primary keys
Relationships are implemented by foreign key columns
Resolve M:N relationships by inserting an intersection table
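
The mapping rules above can be sketched with Python's built-in sqlite3 module. The Student/Course model, the table names and the column names are assumptions made up for illustration; they do not come from the slides.

```python
import sqlite3

# Hypothetical logical model: Student and Course entities in an M:N relationship.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,   -- unique identifier -> primary key
    name       TEXT NOT NULL          -- attribute -> column
);
CREATE TABLE course (
    course_id  INTEGER PRIMARY KEY,
    title      TEXT NOT NULL
);
-- Intersection table: resolves the M:N relationship into two 1:N relationships,
-- each implemented by a foreign key column.
CREATE TABLE enrolment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
""")

conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO course VALUES (?, ?)",
                 [(10, "Databases"), (20, "Networks")])
conn.executemany("INSERT INTO enrolment VALUES (?, ?)",
                 [(1, 10), (1, 20), (2, 10)])

# Courses taken by student 1, reached through the intersection table.
titles_for_ann = [
    row[0]
    for row in conn.execute(
        "SELECT c.title FROM enrolment e "
        "JOIN course c ON c.course_id = e.course_id "
        "WHERE e.student_id = ? ORDER BY c.title",
        (1,),
    )
]
```

The composite primary key on the intersection table stops the same student being enrolled on the same course twice.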

Mapping considerations

Independence
Privacy
Efficiency of queries

Denormalisation

Joins take time!
Split or merge normalised entities based on frequent associated use
Remove redundant relationships
Merge entities with 1:1 relationships
Use summary fields
Use summary tables and views

Using summary field (1)

Consider running a query “give the total value of all orders for customer X”

How many joins?

Using summary field (2)

Note summary field in Orders table

How many joins now?
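
The diagrams for these two slides are not reproduced here, so the following is only a sketch of the idea under an assumed schema. The tables orders and order_line, the column order_total and the trigger are hypothetical, and prices are held as whole pence to keep the arithmetic exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    order_total INTEGER NOT NULL DEFAULT 0   -- summary field (derived, redundant)
);
CREATE TABLE order_line (
    order_id   INTEGER NOT NULL REFERENCES orders(order_id),
    line_no    INTEGER NOT NULL,
    quantity   INTEGER NOT NULL,
    unit_price INTEGER NOT NULL,             -- pence
    PRIMARY KEY (order_id, line_no)
);
-- Keep the summary field in step with the detail rows when a line is inserted.
CREATE TRIGGER order_line_ai AFTER INSERT ON order_line
BEGIN
    UPDATE orders
    SET order_total = order_total + NEW.quantity * NEW.unit_price
    WHERE order_id = NEW.order_id;
END;
""")

conn.executemany("INSERT INTO orders (order_id, customer_id) VALUES (?, ?)",
                 [(1, 7), (2, 7), (3, 8)])
conn.executemany("INSERT INTO order_line VALUES (?, ?, ?, ?)",
                 [(1, 1, 2, 500), (1, 2, 1, 250), (2, 1, 3, 100), (3, 1, 1, 999)])

# Normalised route: join orders to order lines and add up the line values.
total_with_join = conn.execute(
    "SELECT SUM(l.quantity * l.unit_price) FROM orders o "
    "JOIN order_line l ON l.order_id = o.order_id WHERE o.customer_id = ?",
    (7,),
).fetchone()[0]

# Denormalised route: read the summary field, no join to the order lines.
total_from_summary = conn.execute(
    "SELECT SUM(order_total) FROM orders WHERE customer_id = ?", (7,)
).fetchone()[0]
```

The price of the faster read is extra work on every write. This sketch only handles inserts, so updates and deletes of order lines would need their own triggers to keep the summary field correct.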

Distributed database systems

Special rules apply!

The traditional model

One centralised database
Terminals at remote locations
Disadvantages
  Networks are slow (esp. WANs!)
  Central machine does all processing
  If central machine fails, database is down

(Integrity, redundancy and disaster recovery considered in later lectures!)

The Client/Server model

Client – application – “front end”
Server – DBMS – “back end”
Still dependent on central database

Client responsibilities

Manages user interface
Accepts user data
Has local processing capability within the application
Generates database requests and transmits them via network to server
Receives results from server and formats them as required by application

Server responsibilities

Accepts database requests from client
Processes database requests
  Handles security issues
  Deals with concurrency issues
  Optimises queries
  Handles recovery/rollback issues
Returns results to client

Distributed database architecture

A collection of logically related “sites”, connected together so that the user’s view is that of a single database at a single location.
Each site is a database in its own right
Sites are not necessarily physically or geographically separated, but often are, and they are logically separated.

Advantages

Organisations are distributed, why shouldn’t their data be?

Improved efficiency
  Store data close to where it’s used

Types of DDS

Homogeneous – same type of RDBMS at each site (easy!)

Heterogeneous – different types of DBMS at each site (not so easy!)

Implementation methods (1)

Fragmentation – splitting data between sites
  Horizontal – row based – e.g. store all employee records for a location at that location
  Vertical – column based – e.g. store all payroll columns in payroll department, all other employee data in HR
  Either way, fragments must be able to be put back together!
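
Putting fragments back together can be sketched as follows. For simplicity all fragments sit in one in-memory SQLite database, whereas in a real DDS each would live at its own site; the table names, column names and sample rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Horizontal fragments: same columns, rows split by location.
CREATE TABLE emp_london (emp_id INTEGER PRIMARY KEY, name TEXT, location TEXT);
CREATE TABLE emp_leeds  (emp_id INTEGER PRIMARY KEY, name TEXT, location TEXT);

-- Vertical fragments: columns split, primary key repeated in each fragment.
CREATE TABLE emp_hr      (emp_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE emp_payroll (emp_id INTEGER PRIMARY KEY, salary INTEGER);

INSERT INTO emp_london  VALUES (1, 'Ann', 'London');
INSERT INTO emp_leeds   VALUES (2, 'Bob', 'Leeds');
INSERT INTO emp_hr      VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO emp_payroll VALUES (1, 30000), (2, 28000);
""")

# Horizontal fragments are recombined with a union of the row sets.
horizontal = conn.execute(
    "SELECT emp_id, name, location FROM emp_london "
    "UNION ALL "
    "SELECT emp_id, name, location FROM emp_leeds "
    "ORDER BY emp_id"
).fetchall()

# Vertical fragments are recombined with a join on the shared primary key.
vertical = conn.execute(
    "SELECT h.emp_id, h.name, p.salary FROM emp_hr h "
    "JOIN emp_payroll p ON p.emp_id = h.emp_id "
    "ORDER BY h.emp_id"
).fetchall()
```

Vertical fragmentation only works if every fragment carries the key, which is why emp_id appears in both emp_hr and emp_payroll.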

Implementation methods (2)

Replication
  Controlled duplication of data at more than one site
  Update propagation?

Objectives (1)

Local autonomy
  Local data locally owned and managed – minimal data requirements from remote sites.
No reliance on central site
Continuous operation
  Reliability
  Availability

Objectives (2)

Location independence
  From user’s view, all data is at their site.
Fragmentation independence
  Needs joins and unions to put fragments back together
Replication independence

Objectives (3)

Distributed query processing
Distributed transaction management
  Transactions carried out by “agents” at distributed sites
  Two-phase commit
  Locking issues (later lecture)
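
Two-phase commit can be sketched in a few lines of Python. This is a simplified illustration only: the Participant class and two_phase_commit function are hypothetical names, and the logging, timeouts and failure handling a real protocol needs are left out.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """An agent at one site taking part in a distributed transaction."""
    name: str
    can_commit: bool = True   # assumption: whether local work can be made durable
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: vote yes (and promise to commit if asked) or vote no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants: list[Participant]) -> bool:
    """Coordinator logic: commit everywhere only if every site votes yes."""
    # Phase 1 (voting): ask every participant to prepare and collect the votes.
    votes = [p.prepare() for p in participants]

    # Phase 2 (decision): a single "no" vote aborts the whole transaction.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False
```

For example, two_phase_commit([Participant("leeds"), Participant("london", can_commit=False)]) returns False and leaves both participants in the "aborted" state, so no site ends up with a partial update.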

Objectives (4)

Hardware independence
Operating system independence
Network independence
DBMS independence

DDS issues

Query processing
  Optimisation even more important
Catalogue (data dictionary) management
  Centralised?
  Fully replicated?
  Partitioned?
  Combination of first and third?

DDS issues

Update propagation
  An issue where replication is used.
  “Primary copy” system
Recovery
  Two-phase commit

Concurrency
  Locking strategies
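
One way to picture a “primary copy” system is the sketch below. The slides do not say when secondary copies are refreshed, so deferred propagation is assumed here, and the class and method names are hypothetical.

```python
class ReplicatedItem:
    """One logical data item held at several sites, one of which is primary."""

    def __init__(self, sites, primary, value):
        if primary not in sites:
            raise ValueError("primary must be one of the sites")
        self.primary = primary
        self.copies = {site: value for site in sites}
        self.pending = []  # updates applied to the primary but not yet propagated

    def write(self, value):
        # Every update is applied to the primary copy first.
        self.copies[self.primary] = value
        self.pending.append(value)

    def propagate(self):
        # Later, push the queued updates to the secondary copies in order.
        for value in self.pending:
            for site in self.copies:
                if site != self.primary:
                    self.copies[site] = value
        self.pending.clear()

    def read(self, site):
        # A read at a secondary site may see stale data until propagate() runs.
        return self.copies[site]
```

With item = ReplicatedItem(["leeds", "london"], "leeds", 1), calling item.write(2) changes the Leeds copy at once, while the London copy still holds the old value until item.propagate() is called. That window of staleness is the update propagation issue.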

Summary

Mapping the logical model
Denormalisation
Traditional database architecture
Client/server model
Distributed database systems
  Advantages
  Objectives
  Implementation methods
  Issues

Further reading

Rolland, chapter 10
Hoffer, chapter 12
Denormalisation - click to follow the link!