
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide 25- 1

Types of Distributed Database Systems

Homogeneous: All sites of the database system have an identical setup, i.e., they run the same database system software.

The underlying operating systems may differ.

For example, all sites run Oracle, or all run DB2, or all run Sybase or some other database system, while the operating systems can be a mixture of Linux, Windows, Unix, etc.

[Figure: Sites 1 to 5, each running Oracle on Linux, Windows, or Unix, connected by a communications network.]


Types of Distributed Database Systems

Heterogeneous

Federated: Each site may run a different database system, but data access is managed through a single conceptual schema. This implies that the degree of local autonomy is minimal: each site must adhere to a centralized access policy. There may be a global schema.

Multidatabase: There is no single global conceptual schema. For data access, a schema is constructed dynamically as needed by the application software.

[Figure: Sites 1 to 5, running different DBMS types (network, relational, object-oriented, hierarchical) on Linux, Unix, or Windows, connected by a communications network.]


Types of Distributed Database Systems

Federated Database Management Systems Issues

Differences in data models: relational, object-oriented, hierarchical, network, etc.

Differences in constraints: Each site may have its own data access and processing constraints.

Differences in query languages: Even among sites that use SQL, some may use SQL-89, some may use SQL-92, and so on.


Concurrency Control and Recovery

Distributed databases encounter a number of concurrency control and recovery problems that are not present in centralized databases. Some of them are listed below.

Dealing with multiple copies of data items

Failure of individual sites

Communication link failure

Distributed commit

Distributed deadlock


Concurrency Control and Recovery

Details

Dealing with multiple copies of data items: The concurrency control method must maintain global consistency among the copies. Likewise, the recovery mechanism must recover all copies and keep them consistent after recovery.

Failure of individual sites: Database availability must not be affected by the failure of one or two sites, and the recovery scheme must recover the failed sites before they are made available for use again.


Concurrency Control and Recovery

Details (contd.)

Communication link failure: This failure may create a network partition, which would affect database availability even though all database sites may be running.

Distributed commit: A transaction may be fragmented, and its fragments may be executed at a number of sites. This requires a two-phase or three-phase commit approach for transaction commit.

Distributed deadlock: Since transactions are processed at multiple sites, two or more sites may become involved in a deadlock. This must be resolved in a distributed manner.
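To make the distributed commit item more concrete, here is a minimal sketch of the decision rule in a two-phase commit. It is an illustration under stated assumptions, not the slides' algorithm: the function name `two_phase_commit` and the idea of modelling each participating site as a callable that returns its vote are hypothetical, and logging, timeouts, and the actual delivery of the decision to the sites are left out.

```python
from typing import Callable, Dict


def two_phase_commit(participants: Dict[str, Callable[[], bool]]) -> str:
    """Return "commit" only if every participant votes yes in phase 1.

    `participants` maps a (hypothetical) site name to a prepare function
    that returns True when the site is ready to commit its fragment.
    """
    # Phase 1 (voting): ask each site whether it can commit.
    votes = {}
    for site, prepare in participants.items():
        try:
            votes[site] = prepare()
        except Exception:
            # Assumption: a failed or unreachable site counts as a "no" vote.
            votes[site] = False

    # Phase 2 (decision): commit only on a unanimous yes, otherwise abort.
    return "commit" if all(votes.values()) else "abort"
```

For example, `two_phase_commit({"S1": lambda: True, "S2": lambda: False})` yields `"abort"`, because a single negative vote forces every site to abort.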


Concurrency Control in Distributed Databases

Single-Lock-Manager Approach

Distributed Lock Manager: primary copy, majority protocol, biased protocol, quorum consensus


Single-Lock-Manager Approach

The system maintains a single lock manager that resides at a single chosen site, say Si (primary site technique).

When a transaction needs to lock a data item, it sends a lock request to Si, and the lock manager determines whether the lock can be granted immediately.

If yes, the lock manager sends a message to the site that initiated the request.

If no, the request is delayed until it can be granted, at which time a message is sent to the initiating site.
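The grant-or-delay behaviour described above can be sketched as follows. This is a simplified illustration, assuming exclusive locks only and a first-come, first-served wait queue; the class `CentralLockManager` and its method names are hypothetical, and the messages to the initiating site are modelled as return values.

```python
from collections import deque


class CentralLockManager:
    """Hypothetical lock manager at site Si (exclusive locks only)."""

    def __init__(self):
        self.holder = {}    # data item -> transaction holding the lock
        self.waiting = {}   # data item -> queue of delayed transactions

    def request(self, txn, item):
        """Grant the lock immediately if the item is free, else delay."""
        if item not in self.holder:
            self.holder[item] = txn
            return "granted"
        self.waiting.setdefault(item, deque()).append(txn)
        return "delayed"

    def release(self, txn, item):
        """Release the lock and return the delayed transaction that is
        granted next, or None if nobody is waiting."""
        if self.holder.get(item) != txn:
            raise ValueError(f"{txn} does not hold a lock on {item}")
        queue = self.waiting.get(item)
        if queue:
            next_txn = queue.popleft()
            self.holder[item] = next_txn
            return next_txn
        del self.holder[item]
        return None
```

Because every request passes through this one object, its state is a complete picture of who waits for whom, which is why deadlock handling is simple in this scheme.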


Single-Lock-Manager Approach (Cont.)

The transaction can read the data item from any one of the sites at which a replica of the data item resides.

Writes must be performed on all replicas of a data item.

Advantages of the scheme: simple implementation and simple deadlock handling.

Disadvantages of the scheme:

Bottleneck: The lock manager site becomes a bottleneck.

Vulnerability: The system is vulnerable to lock manager site failure.


Distributed Lock Manager

In this approach, locking functionality is implemented by a lock manager at each site.

Lock managers control access to local data items, but special protocols may be used for replicas.

Advantage: Work is distributed and can be made robust to failures.

Disadvantage: Deadlock detection is more complicated, because the lock managers must cooperate to detect deadlocks.
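One common way for lock managers to cooperate is to combine their local wait-for graphs and look for a cycle. The sketch below assumes that approach; the slides do not name a specific detection algorithm, and the function name `has_global_deadlock` and the dictionary representation of a wait-for graph are hypothetical.

```python
def has_global_deadlock(local_graphs):
    """Return True if the union of the sites' wait-for graphs has a cycle.

    `local_graphs` holds one dict per site, mapping a waiting transaction
    to the set of transactions it is waiting for at that site.
    """
    # Merge the local graphs into one global wait-for graph.
    graph = {}
    for local in local_graphs:
        for txn, blockers in local.items():
            graph.setdefault(txn, set()).update(blockers)

    visiting, done = set(), set()

    def dfs(txn):
        if txn in done:
            return False
        if txn in visiting:
            return True  # reached a transaction already on the current path
        visiting.add(txn)
        if any(dfs(nxt) for nxt in graph.get(txn, ())):
            return True
        visiting.discard(txn)
        done.add(txn)
        return False

    return any(dfs(txn) for txn in list(graph))
```

With `site1 = {"T1": {"T2"}}` and `site2 = {"T2": {"T1"}}`, neither site sees a cycle on its own, but `has_global_deadlock([site1, site2])` returns `True`. This is the situation that makes detection harder than in the single-lock-manager scheme.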

Several variants of this approach exist: primary copy, majority protocol, biased protocol, and quorum consensus.


Primary Copy

Choose one replica of a data item to be the primary copy.

The site containing that replica is called the primary site for the data item.

Different data items can have different primary sites.

When a transaction needs to lock a data item Q, it requests a lock at the primary site of Q. It thereby implicitly gets a lock on all replicas of the data item.

Benefit: Concurrency control for replicated data is handled similarly to unreplicated data, so implementation is simple.

Drawback: If the primary site of Q fails, Q is inaccessible even though other sites containing a replica may be accessible.
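The routing rule and its drawback can be sketched in a few lines. This is an illustrative model only: the class `PrimaryCopyLocks`, the item-to-site mapping, and the set of reachable sites are hypothetical, and only exclusive locks are modelled.

```python
class PrimaryCopyLocks:
    """Hypothetical model: each item is locked only at its primary site."""

    def __init__(self, primary_site, up_sites):
        self.primary_site = primary_site   # data item -> its primary site
        self.up_sites = set(up_sites)      # sites that are currently reachable
        self.locked_by = {}                # data item -> holding transaction

    def lock(self, txn, item):
        site = self.primary_site[item]
        if site not in self.up_sites:
            # Drawback from the slide: other replicas may be reachable,
            # but the item cannot be locked without its primary site.
            return "unavailable"
        if item in self.locked_by and self.locked_by[item] != txn:
            return "delayed"
        self.locked_by[item] = txn  # implicitly covers all replicas
        return "granted"
```

For instance, if Q has primary site S1 and only S2 and S3 are up, a lock request on Q returns `"unavailable"` even though replicas of Q may exist at S2 or S3.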


Majority Protocol

The local lock manager at each site administers lock and unlock requests for data items stored at that site.

When a transaction wishes to lock an unreplicated data item Q residing at site Si, a message is sent to Si's lock manager.

If Q is locked in an incompatible mode, the request is delayed until it can be granted.

When the lock request can be granted, the lock manager sends a message back to the initiator indicating that the request has been granted.


Majority Protocol (Cont.)

In the case of replicated data:

If Q is replicated at n sites, a lock request message must be sent to more than half of the n sites at which Q is stored.

The transaction does not operate on Q until it has obtained a lock on a majority of the replicas of Q.

When writing the data item, the transaction performs writes on all replicas.

Benefit: Can be used even when some sites are unavailable.

Drawbacks:

Requires 2(n/2 + 1) messages for handling lock requests and (n/2 + 1) messages for handling unlock requests.

Potential for deadlock even with a single item, e.g., each of 3 transactions may hold locks on 1/3 of the replicas of a data item.
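The message counts and the single-item deadlock can be checked with a short calculation. The sketch assumes that n/2 in the formula means integer (floor) division, so that n/2 + 1 is the smallest majority; the function names are hypothetical.

```python
def majority(n):
    """Smallest number of replicas that is more than half of n."""
    return n // 2 + 1


def message_counts(n):
    """Messages needed to lock and unlock a majority of n replicas."""
    m = majority(n)
    # Each lock request needs a request and a reply; an unlock needs one message.
    return {"lock": 2 * m, "unlock": m}


def can_proceed(granted_sites, n):
    """True once locks are held at a majority of the n replica sites."""
    return len(set(granted_sites)) >= majority(n)
```

With n = 5, a majority is 3 sites, so locking costs 6 messages and unlocking costs 3. With n = 3 and three transactions each holding a lock at one site, `can_proceed` is false for all of them and no replica is left to request, which is the deadlock mentioned above.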


Biased Protocol

A local lock manager runs at each site, as in the majority protocol; however, requests for shared locks are handled differently from requests for exclusive locks.

Shared locks: When a transaction needs to lock data item Q, it simply requests a lock on Q from the lock manager at one site containing a replica of Q.

Exclusive locks: When a transaction needs to lock data item Q, it requests a lock on Q from the lock managers at all sites containing a replica of Q.

Advantage: Imposes less overhead on read operations.

Disadvantage: Additional overhead on writes.
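The asymmetry between the two lock modes can be sketched as a function that returns the sites to contact. The name `sites_to_lock` is hypothetical, and picking the first replica site for a shared lock is an arbitrary choice made for the example; any one replica site would do.

```python
def sites_to_lock(mode, replica_sites):
    """Return the replica sites whose lock managers must be contacted."""
    if mode == "shared":
        # Read-one: a single replica site is enough (first one, arbitrarily).
        return list(replica_sites[:1])
    if mode == "exclusive":
        # Write-all: every site holding a replica must grant the lock.
        return list(replica_sites)
    raise ValueError(f"unknown lock mode: {mode}")
```

For an item replicated at S1, S2, and S3, a shared lock contacts one site and an exclusive lock contacts all three, which is where the extra write overhead comes from.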


Quorum Consensus Protocol

A generalization of both the majority and biased protocols.

Each site is assigned a weight. Let S be the total of all site weights.

Choose two values, a read quorum Qr and a write quorum Qw, such that Qr + Qw > S and 2 * Qw > S.

Quorums can be chosen (and S computed) separately for each item.

Each read must lock enough replicas that the sum of their site weights is >= Qr.

Each write must lock enough replicas that the sum of their site weights is >= Qw.
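The two quorum conditions can be checked directly. The sketch below uses hypothetical function names and represents the site weights as a dictionary; it also shows how the majority and biased protocols fall out as special cases when every site has weight 1.

```python
def valid_quorums(weights, qr, qw):
    """Check Qr + Qw > S and 2 * Qw > S, where S is the total site weight."""
    s = sum(weights.values())
    return qr + qw > s and 2 * qw > s


def has_quorum(locked_sites, weights, quorum):
    """True if the locked sites' weights add up to at least the quorum."""
    return sum(weights[site] for site in set(locked_sites)) >= quorum


# Five replica sites with equal weight 1, so S = 5.
weights = {"S1": 1, "S2": 1, "S3": 1, "S4": 1, "S5": 1}
majority_like = valid_quorums(weights, qr=3, qw=3)   # majority protocol
biased_like = valid_quorums(weights, qr=1, qw=5)     # read-one, write-all
too_small = valid_quorums(weights, qr=2, qw=3)       # 2 + 3 is not > 5
```

The first condition guarantees that any read quorum overlaps any write quorum, and the second guarantees that two write quorums overlap, so conflicting operations always meet at some replica.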


Recovery in a Distributed Database

Single Lock Manager Approach (Primary Site Approach)

All transaction management activities go to the primary site, which is likely to overload that site.

If the primary site fails, the entire system is inaccessible.

To aid recovery, a backup site is designated, which behaves as a shadow of the primary site.

In case of primary site failure, the backup site can act as the primary site.


Recovery in a Distributed Database

Recovery from a coordinator failure

In both approaches, a coordinator site or copy may become unavailable. This requires the selection of a new coordinator.

Primary site approach with no backup site: Abort and restart all active transactions at all sites, elect a new coordinator, and initiate transaction processing.

Primary site approach with backup site: Suspend all active transactions, designate the backup site as the primary site, and identify a new backup site. The new primary site receives all transaction management information to resume processing.

Primary and backup sites fail, or no backup site: Use an election process to select a new coordinator site.
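The three cases above can be summarised as a small decision function. This is a sketch under assumptions: the function name `recover_coordinator` and the action labels are hypothetical, and the election rule used here (the live site with the highest identifier wins) is an assumption chosen for illustration, since the slides do not specify how the election is carried out.

```python
def recover_coordinator(live_sites, backup=None):
    """Return (new_primary, action_for_active_transactions).

    `live_sites` is the set of sites still running after the primary failed;
    `backup` is the designated backup site, if one exists.
    """
    if not live_sites:
        raise ValueError("no live site can become coordinator")
    if backup is not None and backup in live_sites:
        # Backup available: suspend active transactions and promote it.
        return backup, "suspend"
    # No usable backup: abort and restart transactions, then hold an election.
    # Assumed election rule: the highest site identifier wins.
    return max(live_sites), "abort_and_restart"
```

For example, `recover_coordinator({"S2", "S3"}, backup="S2")` returns `("S2", "suspend")`, while the same call without a backup returns `("S3", "abort_and_restart")`. After promoting a backup, the system would still need to pick a new backup site, which this sketch does not model.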
