my slide distributed database management systems
Post on 19-Jan-2015
Rushdi Shams, Dept of CSE, KUET 1
Database Systems
Distributed Database Systems
Version 1.0
Introduction
A distributed database system is a database system that is fragmented or replicated across several machines.
These machines are usually located at different geographical locations of an organization.
Fragmentation splits the original database into subsets.
Replication refers to keeping copies of the whole database or of part of the original database.
Idea of Distributed Database Systems
4 sites are connected by a communication network.
Sites 1, 2 and 4 run a single database.
Site 3 has no database. It accesses the other 3 sites for data manipulation.
Fragmentation
There are 2 basic types of fragmentation:
1. Horizontal fragmentation
2. Vertical fragmentation
Horizontal Fragmentation
A horizontal fragment is a subset of the rows of a single table.
Say we need to manipulate a table that contains information about British people.
We have 3 sites:
The Edinburgh site will have those rows of the table that have information about Scottish people.
The Cardiff site will have those rows of the table that have information about Welsh people.
The London site will have those rows of the table that have information about English people.
The 3 sites work as distributed processors, so together they represent information about all the British people.
Horizontal Fragmentation (continued)
Horizontal fragmentation is done by restricting the table with a WHERE condition in a query language.
In the previous example, you can fragment the table like this:
1. WHERE LOCATION = 'EDINBURGH'
2. WHERE LOCATION = 'CARDIFF'
3. WHERE LOCATION = 'LONDON'
To get the original table back, you just union all the fragments. Easy, huh?
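The WHERE-then-union idea can be tried out with a small sketch. It uses Python's built-in sqlite3 module with a single in-memory database standing in for the three sites; the table name people, its columns and the sample rows are assumptions made for illustration, since the slides only name the LOCATION attribute.

```python
import sqlite3

# One in-memory database stands in for all three sites (an assumption:
# in a real distributed system each fragment would live on its own server).
conn = sqlite3.connect(":memory:")
schema = "(id INTEGER PRIMARY KEY, name TEXT, location TEXT)"
conn.execute(f"CREATE TABLE people {schema}")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "Ailsa", "EDINBURGH"),
        (2, "Rhys", "CARDIFF"),
        (3, "Emma", "LONDON"),
        (4, "Callum", "EDINBURGH"),
    ],
)

# Each horizontal fragment is a restriction (WHERE) of the original table.
for site in ("EDINBURGH", "CARDIFF", "LONDON"):
    fragment = f"people_{site.lower()}"
    conn.execute(f"CREATE TABLE {fragment} {schema}")
    conn.execute(
        f"INSERT INTO {fragment} SELECT * FROM people WHERE location = ?",
        (site,),
    )

# Reconstruction: the union of all fragments gives the original table back.
reconstructed = conn.execute(
    "SELECT * FROM people_edinburgh "
    "UNION SELECT * FROM people_cardiff "
    "UNION SELECT * FROM people_london "
    "ORDER BY id"
).fetchall()
original = conn.execute("SELECT * FROM people ORDER BY id").fetchall()
```

Comparing reconstructed with original is a quick way to confirm that no row was lost or duplicated by the fragmentation.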
Horizontal Fragmentation (continued)
Consider the horizontal fragmentation of relation Proj according to its BUDGET value.
Tuples with BUDGET > 200000 go into Proj1 and the rest go into Proj2.
$\mathrm{Proj}_1 = \sigma_{\mathrm{budget} > 200000}(\mathrm{Proj})$
$\mathrm{Proj}_2 = \sigma_{\mathrm{budget} \le 200000}(\mathrm{Proj})$
Vertical Fragmentation
Vertical fragmentation is a method of fragmenting a table by projecting subsets of its columns, each together with the primary key.
To get the original table back, you just need to join the newly created tables on the primary key!
Again, it's easy, huh?
Vertical Fragmentation (continued)
The table Proj is fragmented into 2 tables, Proj1 and Proj2.
Both tables have the primary key, PNO. Keep an eye on it, fellows!
If you join them on the PNO of both tables, what do you get? Answer: the Proj table again!
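Here is a minimal sketch of that project-then-join round trip, again using Python's sqlite3 module. Only PNO and BUDGET come from the slides; the PNAME column, the split of columns between the fragments and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE proj (pno TEXT PRIMARY KEY, pname TEXT, budget INTEGER)"
)
conn.executemany(
    "INSERT INTO proj VALUES (?, ?, ?)",
    [
        ("P1", "Instrumentation", 150000),
        ("P2", "Database Development", 135000),
        ("P3", "CAD/CAM", 250000),
    ],
)

# Each vertical fragment is a projection that keeps the primary key PNO.
conn.execute("CREATE TABLE proj1 AS SELECT pno, budget FROM proj")
conn.execute("CREATE TABLE proj2 AS SELECT pno, pname FROM proj")

# Reconstruction: join the fragments on the repeated primary key.
rejoined = conn.execute(
    "SELECT proj1.pno, proj2.pname, proj1.budget "
    "FROM proj1 JOIN proj2 ON proj1.pno = proj2.pno "
    "ORDER BY proj1.pno"
).fetchall()
original = conn.execute(
    "SELECT pno, pname, budget FROM proj ORDER BY pno"
).fetchall()
```

If PNO were left out of either fragment there would be nothing to join on, which is why the key has to be repeated.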
Both Fragmentations at a Glance
Why Fragmentation
Usage:
Applications work with views rather than entire relations.
Efficiency:
Data is stored close to where it is most frequently used.
Data that is not needed by local applications is not stored.
Why Fragmentation (continued)
Parallelism:
A transaction can be divided into several subqueries that operate on fragments.
Security:
Data that is not needed by local applications is not stored, and so is not exposed to unauthorized users.
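The parallelism point can be sketched as follows: one global query (a salary total) is split into one subquery per fragment, the subqueries run concurrently, and the partial results are combined. The fragment contents and function names are hypothetical, and threads over in-memory lists stand in for real remote sites.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fragments: the salaries stored at each site.
fragments = {
    "edinburgh": [30000, 42000],
    "cardiff": [28000],
    "london": [51000, 39000, 45000],
}


def local_total(salaries):
    """Subquery that runs against a single fragment."""
    return sum(salaries)


def total_salary(all_fragments):
    """Run one subquery per fragment in parallel, then combine the partials."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(local_total, all_fragments.values()))
    return sum(partials)


grand_total = total_salary(fragments)
```

This only works so neatly because a sum can be computed from partial sums; other operations may need more data shipped between sites.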
Disadvantage of Fragmentation
Performance:
If a query has to fetch data from tables that are on different sites, it requires additional processing time to fetch and combine the remote data.
Correctness of Fragmentation
Well, when I first heard "correctness", I was baffled! Actually it means nothing more than some properties of fragmentation.
So, don't worry about it. It is called CORRECTNESS in database jargon, so don't call it property, a'right?
Correctness of Fragmentation (continued)
There are 3 correctness rules:
1. Completeness
2. Reconstruction
3. Disjointness
Correctness of Fragmentation (continued)
1. Completeness:
If relation R is fragmented into fragments R1, R2, R3, …, Rn, each data item that can be found in R must appear in at least one fragment.
So, why not say it this way: no data item of the original relation R goes missing!
Man, I hate theoretical definitions!
Correctness of Fragmentation (continued)
2. Reconstruction:
There must be a relational operation by which we can reconstruct R from the fragments.
We already saw that by unioning (∪) horizontal fragments we get the original R back, and by joining vertical fragments we get R too!
Correctness of Fragmentation (continued)
3. Disjointness:
If a data item Di appears in fragment Ri, then it should not appear in any other fragment.
The exception is vertical fragmentation, where the primary key attributes must be repeated to allow reconstruction.
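The three rules can be checked mechanically for a horizontal fragmentation. The sketch below treats a relation as a list of row tuples and reuses the Proj budget split from earlier; the function names and the sample rows are assumptions, and the disjointness check as written does not cover the vertical case, where the primary key is deliberately repeated.

```python
def is_complete(relation, fragments):
    """Completeness: every row of the relation is in at least one fragment."""
    return set(relation) <= set().union(*fragments)


def is_reconstructable(relation, fragments):
    """Reconstruction: the union of the fragments is exactly the relation."""
    return set().union(*fragments) == set(relation)


def is_disjoint(fragments):
    """Disjointness: no row appears in more than one fragment."""
    seen = set()
    for fragment in fragments:
        rows = set(fragment)
        if seen & rows:
            return False
        seen |= rows
    return True


# Hypothetical Proj rows as (PNO, BUDGET) pairs, split on BUDGET > 200000.
proj = [("P1", 150000), ("P2", 135000), ("P3", 250000), ("P4", 310000)]
proj1 = [row for row in proj if row[1] > 200000]
proj2 = [row for row in proj if row[1] <= 200000]

all_rules_hold = (
    is_complete(proj, [proj1, proj2])
    and is_reconstructable(proj, [proj1, proj2])
    and is_disjoint([proj1, proj2])
)
```

Because the two predicates (budget > 200000 and budget ≤ 200000) cover every row and never overlap, all three checks pass for this split.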
Transparency
You have distributed one table to 3 sites just now. The user, when he requires data, should not need to know this!
This process of hiding the fragmentation and the distribution of the fragments to different sites is called transparency.
Types of Transparency
1. Location transparency
The user should not be aware of the location of the data. This simplifies the user interface and the user programs that are used to query the table.
2. Fragmentation transparency
The user must not need to know that the data have been fragmented, or how they have been fragmented.
3. Replication transparency
Replication is sometimes necessary because it makes processing faster, but the user should not be aware of it.
Need of Transparency
A manager wishing to find the total number of employees at the Scottish subsidiary need not be aware that he is querying a remote database.
A manager running a query in London should not need to be aware that, to produce the aggregate salary bill for the company, all three sites (London, Cardiff and Edinburgh) need to be interrogated.
When data need to be updated periodically, the user need not know that three sites are in effect being updated.
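The manager's two queries above can be sketched with a small wrapper that presents the fragments as one logical table. The class name, its methods and the sample rows are all hypothetical, and in-memory lists stand in for the remote sites; the point is only that the caller never names a site.

```python
class GlobalEmployeeTable:
    """Presents fragments held at several sites as one logical table."""

    def __init__(self, fragments_by_site):
        # Where the rows live is kept private: callers never see site names.
        self._fragments_by_site = fragments_by_site

    def rows(self):
        for fragment in self._fragments_by_site.values():
            yield from fragment

    def count(self, predicate=lambda row: True):
        return sum(1 for row in self.rows() if predicate(row))

    def total_salary(self):
        return sum(row["salary"] for row in self.rows())


employees = GlobalEmployeeTable(
    {
        "Edinburgh": [
            {"name": "Ailsa", "region": "Scotland", "salary": 30000},
            {"name": "Callum", "region": "Scotland", "salary": 42000},
        ],
        "Cardiff": [{"name": "Rhys", "region": "Wales", "salary": 28000}],
        "London": [{"name": "Emma", "region": "England", "salary": 51000}],
    }
)

# The manager asks about the data, not about where the data is stored.
scottish_headcount = employees.count(lambda row: row["region"] == "Scotland")
salary_bill = employees.total_salary()
```

Moving a fragment to another site would change only the dictionary passed to the constructor, not the manager's queries.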
Foundation Rule
The foundation rule of distributed database systems states:
"Although the database system is distributed over several sites, it must look like a centralised database system to the user."
Then how do you make this foundation rule true?
Answer: by applying the 3 types of transparency.
Advantages of Distributed Database Systems
Reflects organizational structure: database fragments are located in the departments they relate to.
Local autonomy: a department can control the data about itself (as its people are the ones familiar with it).
Improved availability: a fault in one database system will only affect one fragment, instead of the entire database.
Advantages of Distributed Database Systems (continued)
Improved performance: data is located near the site of greatest demand, and the database systems themselves are parallelized, allowing load on the databases to be balanced among servers. (A high load on one module of the database won't affect other modules of the database in a distributed database.)
Economics: it often costs less to create a network of smaller computers with the power of a single large computer.
Modularity: systems can be modified, added and removed from the distributed database without affecting other modules (systems).
Disadvantages of Distributed Database Systems
Complexity: extra work must be done by the DBAs to ensure that the distributed nature of the system is transparent. Extra work must also be done to maintain multiple disparate systems, instead of one big one. Extra database design work must also be done to account for the disconnected nature of the database; for example, joins can become prohibitively expensive when performed across multiple systems.
Economics: increased complexity and a more extensive infrastructure mean extra labour costs.
Disadvantages of Distributed Database Systems (continued)
Security: remote database fragments must be secured, and because they are not centralized, the remote sites must be secured as well. The infrastructure must also be secured (e.g. by encrypting the network links between remote sites).
Difficult to maintain integrity: in a distributed database, enforcing integrity over a network may require too many networking resources to be feasible.
Inexperience: distributed databases are difficult to work with, and as this is a young field there is not much readily available experience on proper practice.
Types of Distributed Database Systems
1. Homogeneous Database Systems
2. Heterogeneous Database Systems
3. Federated Database Systems
Homogeneous Distributed Database Systems
Data is distributed across 2 or more systems.
All the systems have to run the same DBMS (e.g. Oracle).
Moreover, the systems should run on the same hardware platform.
And the systems should run on the same operating system.
Hmm, pretty weird??
Heterogeneous Distributed Database Systems
Data is distributed across 2 or more systems.
Those systems' hardware and software configurations are diverse.
One site might be running Oracle under Windows NT, another site Informix under UNIX, and yet another site Ingres under Windows NT.
Pretty cool, huh?
Heterogeneous Distributed Database Systems (continued)
[Figure: sites running different platforms; surviving labels: UNIX, Informix, Ingres]
Federated Distributed Database Systems
Switzerland is a federal country made up of several political units (cantons).
These units are autonomous.
National-level decisions are made by combining their individual decisions.
In the same way, a federated database system is made up of a number of relatively independent, autonomous databases.
Centralized DBMS vs Distributed DBMS
The system catalogue of a distributed database has to be more complex. For instance, it has to store details about the location of fragments and replicas.
Concurrency problems are multiplied in distributed systems. The problems of propagating updates to a series of different sites are very involved.
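To make the catalogue point concrete, here is a rough sketch of the extra bookkeeping: for each relation, which fragments exist, which predicate defines each one, and at which sites the fragment and its replicas are stored. The class and field names are hypothetical and do not follow any particular DBMS's catalogue layout.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FragmentEntry:
    fragment: str              # name of the fragment
    predicate: str             # restriction that defines the fragment
    primary_site: str          # site holding the main copy
    replica_sites: tuple = ()  # sites holding replicas, if any


@dataclass
class Catalogue:
    entries: dict = field(default_factory=dict)

    def register(self, relation, entry):
        self.entries.setdefault(relation, []).append(entry)

    def sites_for(self, relation):
        """All sites that hold some fragment or replica of the relation."""
        sites = set()
        for entry in self.entries.get(relation, []):
            sites.add(entry.primary_site)
            sites.update(entry.replica_sites)
        return sorted(sites)


catalogue = Catalogue()
catalogue.register(
    "people",
    FragmentEntry(
        "people_edinburgh", "location = 'EDINBURGH'", "Edinburgh", ("London",)
    ),
)
catalogue.register(
    "people",
    FragmentEntry("people_cardiff", "location = 'CARDIFF'", "Cardiff"),
)
people_sites = catalogue.sites_for("people")
```

A centralised catalogue needs none of the site fields, which is one way to see where the added complexity comes from.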
Centralized DBMS vs Distributed DBMS (continued)
A query optimiser in a true distributed system should be able to utilise information about the structure of the network in deciding how best to satisfy a given query.
To ensure a robust system, the distributed DBMS should not be located solely at one site. Software as well as data need to be distributed.
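As a toy illustration of the optimiser point, suppose the same data is replicated at several sites and the optimiser knows a cost for each network link. The cost table, its values and the function name are made up for this sketch; a real optimiser would weigh much more than a single number per link.

```python
def cheapest_site(candidate_sites, link_cost):
    """Pick the replica site that is cheapest to reach from the querying site."""
    return min(candidate_sites, key=lambda site: link_cost[site])


# Hypothetical link costs as seen from the London site.
link_cost_from_london = {"London": 0, "Cardiff": 12, "Edinburgh": 30}

# The needed data is replicated at Edinburgh and Cardiff only.
chosen = cheapest_site(["Edinburgh", "Cardiff"], link_cost_from_london)
```

With these made-up costs the query would be sent to Cardiff rather than Edinburgh.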
Implementation Phases of Distributed DBMS
1. In the first phase we distribute queries between sites but apply updates at only a single site.
2. In the second phase we not only distribute queries, we also distribute transactions between sites.
The latter scenario is clearly the more technically challenging of the two
Most existing distributed database systems are in phase 1
Very few organisations seem to have solved all of the problems associated with phase 2 applications