
Data Independence

Objectives of the Lecture :

•To consider the problems solved by a DBMS;

•To consider how a DBMS uses Data Independence to solve the problems;

•To consider the nature of Logical & Physical Data Independence;

•To consider View relations.

What Problem Does a DBMS Solve ?

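[Figure: Problem of Shared Data ! Application programs P1, P2, P3, Q1, Q2, Q3, Q4, R1, R2 and R3 each use their own files. The files hold the data items A B C, D E F G H J, R S T U V and W X Y Z, together with the duplicated copies D’ E’ F’ (stored with M) and R’ S’ T’ (stored with N).]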

Initial Problems

Duplicate Data

Data common to 2 or more applications is duplicated in each application’s files.

Each version of the data may be physically stored differently : different data types (e.g. integer vs. floating point) &/or organisations (e.g. differently structured records) &/or access methods (e.g. one reached by hashing, the other by an index).

Applications Constrained by Existing Data/Applications

Where data already exists, newer applications are constrained to use it, to minimise data input and storage.

Newer applications are handicapped by data entry timings and methods, storage structures and organisations that are unsuitable for the new application.

The 2 problems are at opposite ends of the spectrum.

Even More Problems !

Duplicate data leads to Inconsistent Data, or Updating Overheads.

If the updating of all copies is not synchronised, they will become inconsistent. Applications using inconsistent data will cause chaos.

On-line transactions must update multiple copies simultaneously; batch update runs must be highly integrated.

Constraining applications leads to Reduced Performance and/or Excessive Maintenance.

Although each application may be well planned, the overall data storage situation will become complex and ill thought-out because it is unplanned.

To simplify it and make it efficient will require extra maintenance to re-configure existing application data storage every time a new application is added.

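Problems Solved !

[Figure: application programs P1, P2, P3, Q2, Q3, R1, R2 and R3 all reach their data through the Database Management System, which holds a single pool of data items A B C D E F G H J M N R S T U V W X Y Z with no duplicated copies.]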

Benefits of a DBMS

•Data duplication is eliminated.

•Maintenance of the data storage is significantly reduced.

•Physical storage can be optimised for overall performance, and easily altered to maintain performance without altering any applications.

•Applications are simpler because they no longer deal with data storage, which is handed over to the DBMS.

•Each application can obtain its data in a form that is optimised to meet its requirements.

•Maintenance of the applications is reduced, in that it is simple to meet their revised data needs if they are altered.

How are the Benefits Provided ?

The DBMS provides :

•Data Independence. The DBMS acts as a layer of insulation between application programs and data. An application requests whatever data it needs, and the DBMS provides it. The DBMS can store data with a variety of different storage organisations & methods, and use the best one(s) for its applications.

•One integrated & coherent pool of data, i.e. a DB. Since all the data is separated from applications, it can be viewed together and designed according to its inherent meaning & structure.

•A bonus is the ability to ask ad hoc queries of the DB. This is possible as all the data can now be made visible in a coherent structure. Queries and updates can be done directly on the DB without an intervening application. (The user interface is actually an application program).

Physical Data Independence

Definition : the ability to change the way data is physically stored in the computer, while leaving the logical structure of the DB - i.e. all the relations in it - unchanged.

Physical storage consists of :

•physical record formats, which determine how a few data values (usually corresponding to a tuple) are stored;

•the arrangement of the physical records into physical files;

•the methods by which the physical data is accessed; e.g. by reading sequentially through records, using an index, etc.

Physical independence always allows the user to see data as relations, regardless of how relations are physically stored underneath. Physical storage can be altered, while the user still sees the same relation.
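A minimal sketch of this idea, using Python's built-in sqlite3 module as a stand-in DBMS. The Salary column, the sample rows and the index name are assumptions made for illustration; only EMPLOYEE, ENo and EName come from the lecture. An index (an access method) is added, and the same query still returns the same relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical EMPLOYEE relation; the Salary column and the rows are illustrative.
conn.execute("CREATE TABLE EMPLOYEE (ENo INTEGER, EName TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "Smith", 30000), (2, "Jones", 25000), (3, "Patel", 40000)],
)

query = "SELECT EName FROM EMPLOYEE WHERE ENo = 2"
before = conn.execute(query).fetchall()

# Physical change only: add an index on ENo as a new access method.
# The logical structure (the relation and its columns) is untouched.
conn.execute("CREATE INDEX EMPLOYEE_ENO_IDX ON EMPLOYEE (ENo)")

after = conn.execute(query).fetchall()

# The query text and its result are the same before and after the change.
print(before == after)
```

The application's query is not edited at any point; only the way the DBMS can reach the rows has changed.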

Logical Data Independence (1)

Definition : the ability to change the logical structure required by an application - i.e. all the relations it uses - without having to change the logical structure of the DB - i.e. all the relations in the DB.

It also includes the converse, viz. the ability to change the logical structure of the DB - i.e. all the relations in it - while still providing any application with the same logical structure that it requires - i.e. all the relations it requires.

To achieve this requires Views.

Definition : a View is a new relation derived from pre-existing relations; it is the specification of the view that is stored, not the data that appears in a view.

A view is a named query; e.g. an SQL query or an algebra expression to be evaluated & retrieved. So in SQL :

Create View VIEW_NAME As ( some valid SQL query ) ;
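The point that only the specification of a view is stored can be sketched with Python's built-in sqlite3 module. The definition of EMPLOYEE_LESS_SAL as EMPLOYEE without a Salary column is an assumption suggested by its name, and the sample rows are illustrative. SQLite is used here without the parentheses around the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base relation; Salary and the rows are illustrative.
conn.execute("CREATE TABLE EMPLOYEE (ENo INTEGER, EName TEXT, Salary INTEGER)")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Smith', 30000)")

# Assumed definition of the view: EMPLOYEE with the salary projected away.
conn.execute("CREATE VIEW EMPLOYEE_LESS_SAL AS SELECT ENo, EName FROM EMPLOYEE")

first = conn.execute("SELECT * FROM EMPLOYEE_LESS_SAL ORDER BY ENo").fetchall()

# Change the base relation only; the view itself is not touched.
conn.execute("INSERT INTO EMPLOYEE VALUES (2, 'Jones', 25000)")
second = conn.execute("SELECT * FROM EMPLOYEE_LESS_SAL ORDER BY ENo").fetchall()

# SQLite keeps the text of the view definition in its catalogue table.
stored_definition = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'view' AND name = 'EMPLOYEE_LESS_SAL'"
).fetchone()[0]

print(first)
print(second)
print(stored_definition)
```

The second retrieval reflects the new row because the view's query is re-evaluated against the base relation each time it is used.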

Logical Data Independence (2)

[Figure: the base relations PROJECT, EMPLOYEE and CAR, with the views PROJECT_EMPLOYEE (Join), EMPLOYEE_LESS_SAL (Project) and CAR_OWNER (Join) derived from them.]

Choose a set of base and/or view relations that give an application the data it wants.

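The ANSI/SPARC 3-Layer Architecture

[Figure: Application Programs (P1, P2, Q2, Q3, R1) use Sub Schemas. Within the DBMS, the Sub Schemas map to the Logical Schema, which describes relations, and the Logical Schema maps to the Physical Schema, which describes files. Below the DBMS are the Operating System Files F1, F2, F3, F4.]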

The Provision of Data Independence

•Let an application program Q3 use a view, say EMPLOYEE_LESS_SAL, which is a member of some Sub Schema.

•The view is mapped to its base relation, EMPLOYEE in this case, which is a member of the Logical Schema.

•The base relation is mapped to its storage specification, which is a member of the Physical Schema.

•When program Q3 wants to do something with EMPLOYEE_LESS_SAL, it sends an instruction - a query or update written in SQL - to the DBMS.

•The DBMS follows the mappings through to the storage specification and determines what it must do with the actual stored data to accomplish this instruction.

•The DBMS carries out the action on which it has decided.

•From the result of the action, the DBMS uses the mappings in the reverse direction to generate what Q3 requires, and passes the result to Q3.
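The view-to-base mapping is what lets the Logical Schema change underneath Q3. A rough sketch with Python's sqlite3 follows; the function q3 stands in for the application, and the Salary column, the sample rows and the replacement relations EMP_NAME and EMP_PAY are all hypothetical. The base relation is split in two and the view is re-mapped, while q3 is left as it was.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original Logical Schema: one base relation (Salary and rows are illustrative).
conn.execute("CREATE TABLE EMPLOYEE (ENo INTEGER, EName TEXT, Salary INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [(1, "Smith", 30000), (2, "Jones", 25000)],
)
conn.execute("CREATE VIEW EMPLOYEE_LESS_SAL AS SELECT ENo, EName FROM EMPLOYEE")


def q3(connection):
    """Stand-in for application Q3: it only ever names the view."""
    return connection.execute(
        "SELECT ENo, EName FROM EMPLOYEE_LESS_SAL ORDER BY ENo"
    ).fetchall()


before = q3(conn)

# Restructure the Logical Schema: split EMPLOYEE into two base relations
# (EMP_NAME and EMP_PAY are hypothetical names).
conn.execute("CREATE TABLE EMP_NAME (ENo INTEGER, EName TEXT)")
conn.execute("CREATE TABLE EMP_PAY (ENo INTEGER, Salary INTEGER)")
conn.execute("INSERT INTO EMP_NAME SELECT ENo, EName FROM EMPLOYEE")
conn.execute("INSERT INTO EMP_PAY SELECT ENo, Salary FROM EMPLOYEE")

# Re-map the view onto the new base relation, then drop the old one.
conn.execute("DROP VIEW EMPLOYEE_LESS_SAL")
conn.execute("DROP TABLE EMPLOYEE")
conn.execute("CREATE VIEW EMPLOYEE_LESS_SAL AS SELECT ENo, EName FROM EMP_NAME")

after = q3(conn)

# Q3's code is unchanged and it receives the same relation as before.
print(before == after)
```

Only the mapping held by the DBMS (the view definition) was rewritten; the application's SQL was not.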

More about Schemas

Logical Schema

There is only one in a DB; so all the base relations automatically form it.

Physical Schema

There is only one in a DB; so all the storage specifications attached to base relations automatically form it.

Sub Schemas

SQL has no means of providing them directly. Instead it provides them indirectly by the use of :

•a Grant statement to give to certain DB users the privilege(s) of being able to carry out certain statements (e.g. Selects, Inserts) on certain view and/or base relations;

•a Revoke statement to remove from certain DB users the privilege(s) of being able to carry out certain statements (e.g. Selects, Inserts) on certain view and/or base relations.

Using Views

If a user is to be able to use a view like a base relation, then they must be able to retrieve the data in a view and update a view.

Retrieval

Replace the view by its (query) definition, and evaluate that definition. If the view is a component of a query, then use this value for the component.

Update

When a view is updated, only its definition is stored, not its value; so the underlying relations, i.e. those appearing in the view definition, must be updated instead to create the effect.

Unfortunately, SQL only implements a few of the logical possibilities. This means that often a view cannot be used as if it were a base relation.
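A small sketch of what updating the underlying relation instead means, using Python's sqlite3. SQLite is an extreme case of the limitation above: its views are read-only, so an Instead Of trigger is written by hand here to play the role that a DBMS with updateable views would perform itself. The view definition, the Salary column, the trigger name and the sample row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base relation and view (Salary and the row are illustrative).
conn.execute("CREATE TABLE EMPLOYEE (ENo INTEGER, EName TEXT, Salary INTEGER)")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Smith', 30000)")
conn.execute("CREATE VIEW EMPLOYEE_LESS_SAL AS SELECT ENo, EName FROM EMPLOYEE")

# In SQLite a plain view cannot be updated directly.
try:
    conn.execute("UPDATE EMPLOYEE_LESS_SAL SET EName = 'Smythe' WHERE ENo = 1")
    direct_update_allowed = True
except sqlite3.Error:
    direct_update_allowed = False

# Hand-written mapping: an update on the view becomes an update on EMPLOYEE.
conn.execute(
    """
    CREATE TRIGGER EMPLOYEE_LESS_SAL_UPD
    INSTEAD OF UPDATE ON EMPLOYEE_LESS_SAL
    BEGIN
        UPDATE EMPLOYEE SET EName = NEW.EName WHERE ENo = OLD.ENo;
    END
    """
)

# The same statement now succeeds, and it is the base relation that changes.
conn.execute("UPDATE EMPLOYEE_LESS_SAL SET EName = 'Smythe' WHERE ENo = 1")

base_rows = conn.execute("SELECT ENo, EName, Salary FROM EMPLOYEE").fetchall()
view_rows = conn.execute("SELECT ENo, EName FROM EMPLOYEE_LESS_SAL").fetchall()

print(direct_update_allowed)
print(base_rows)
print(view_rows)
```

The salary, which is not visible through the view, is left as it was; only the column reachable through the view is changed in the base relation.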

Syntax of SQL Views

The full syntax of an SQL view is :

Create View VIEW_NAME ( list of column names ) As Select statement With Check Option ;

Both ‘( list of column names )’ and ‘With Check Option’ are optional.

•The ‘list of column names’ is only required if the default names arising as a result of the Select statement are not appropriate.

•The Select statement can be any legitimate query, although it must conform to the limitations if view updates are required.

•For updateable views, the ‘With Check Option’ option means that an update made through the view is rejected if the resulting row would not satisfy the view’s defining query; so use the option. (If it is not used, the underlying tables will be updated, but the results will not appear in the view !)

Views as Shorthands

The normal method of using views, assumed so far, is to provide Logical Data Independence, where a view is used indistinguishably from a base relation if possible.

Another use is to create views, additional to relations in the Sub Schema, in order to make it easier to write commonly occurring queries, or make complex queries easier to write.

Example :

Create View CAR_OWNER As ( Select EName, ENo, RegNo From CAR Join EMPLOYEE On ( Owner = ENo ) ) ;

is created because there are many queries on car owners, and it saves work to have this view as a starting point for them.

Here users know that these are views; they do not need updating.
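The CAR_OWNER example can be tried out with Python's sqlite3 as sketched below. The column types and the sample rows are assumptions; the column names EName, ENo, RegNo and Owner are those used in the view definition above. SQLite is given the query without the outer parentheses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base relations; column types and rows are illustrative.
conn.execute("CREATE TABLE EMPLOYEE (ENo INTEGER, EName TEXT)")
conn.execute("CREATE TABLE CAR (RegNo TEXT, Owner INTEGER)")
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?)",
    [(1, "Smith"), (2, "Jones"), (3, "Patel")],
)
conn.executemany(
    "INSERT INTO CAR VALUES (?, ?)",
    [("ABC123", 1), ("XYZ789", 3)],
)

# The shorthand view from the lecture.
conn.execute(
    "CREATE VIEW CAR_OWNER AS "
    "SELECT EName, ENo, RegNo FROM CAR JOIN EMPLOYEE ON (Owner = ENo)"
)

# Queries on car owners now start from the view instead of repeating the join.
smith_cars = conn.execute(
    "SELECT RegNo FROM CAR_OWNER WHERE EName = 'Smith'"
).fetchall()
owner_count = conn.execute("SELECT COUNT(*) FROM CAR_OWNER").fetchone()[0]

print(smith_cars)
print(owner_count)
```

Each later query is one short Select on CAR_OWNER; the join is written once, in the view.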
