intro to dbms

Notes of Database Management System

Mrs Mousmi Ajay Chaurasia Lect.Deptt of INFORMATION TECHNOLOGY

UNIT-1

INTRODUCTION TO DATABASE MANAGEMENT SYSTEM

Data - Data is meaningful known raw facts that can be processed and stored as information.

Database - Database is a collection of interrelated and organized data.

DBMS - Database Management System (DBMS) is a collection of interrelated data [usually called database]

and a set of programs to access, update and manage those data [which form part of management system.

OR

It is a software package to facilitate creation and maintenance of computerized database. It is general purpose

software that facilitates the following:

1. Defining: Specifying data types and structures, and constraints for data to be stored.

2. Constructing: Storing data in a storage medium.

3. Manipulating: Involves querying, updating and generating reports.

4. Sharing: Allowing multiple users and programs to access data simultaneously.

Eg. Of DBMS - Access, dBase, FileMaker Pro, and FoxBASE, ORACLE etc.

FIGURE 1.1 Navathe Page-6

Primary goals of DBMS are:

1. To provide a way to store and retrieve database information that is both convenient and efficient.

2. To manage large and small bodies of information. It involves defining structures for storage of information and

providing mechanism for manipulation of information.

3. It should ensure safety of information stored, despite system crashes or attempts at unauthorized access.

4. If data are to be shared among several users, then system should avoid possible anomalous results.

Eg. Of DBMS applications

1. Banking – For customer information, accounts, and loans, and banking transactions. [all transactions]

2. Airlines – For reservation and schedule information. [reservations, schedules]

3. Universities – For student information, course registrations, and grades. [registration, grades]

4. Credit Card Transactions – For purchases on credit card and generation of monthly statements.

5. Telecommunication – For keeping records of calls made, generating monthly bills, maintaining balances on

prepaid calling cards, and storing information about communication networks.

6. Finance – For storing information about holdings, sales, and purchases of financial instruments such as stocks

and bonds.

7. Sales – For customer, product, and purchase information. [customers, products, purchases]



8. Manufacturing – For management of supply chain and for tracking production of items in factories, inventories of

items in warehouses/stores, and orders for items. [production, inventory, orders, supply chain]

9. Human Resources – For information about employees, salaries, payroll taxes and benefits, and generation of

paychecks. [employee records, salaries, tax deductions]

File systems/File processing systems

A file system is basically storing information in data structures called ‘files’ in the operating

system and manipulating this information via application programs that manipulate the files.

Disadvantages of File Processing System

1. Data Redundancy – Since different programmers create the files and application programs over a long

period, the various files are likely to have different formats and the programs may be written in several

programming languages. Moreover, the same information may be duplicated in several files, this duplication

of data over several files is known as data redundancy. Eg. The address and telephone number of a particular

customer may appear in a file that consists of saving- account records and in a file that consists of checking-

account records.This reduanduncy leads to higher storage & excess cost also leads to inconsistency discussed

in the next.

2. Data Inconsistency – The various copies of same data may no longer agree i.e. various copies of the same

data may contain different information. Eg. A changed customer address may be reflected in savings-account

records but not elsewhere in the system.

3. Difficulty in accessing data – In a conventional file processing system it is difficult to access the data in a

specific manner and it is require creating an application program to carry out each new task. Eg. Suppose that

one of the bank officers needs to find out the names of all customers who live within a particular postal-code

area. We ask the officer of data-processing department to generate such a list.We have 2 choices:

- List all cust_names & manually do the information required

- Ask the data processing dept. to have system programmer to write necessity application programe

Because the designers of the original system did not anticipate this request, there is no application

program on hand to meet it.

4. Data Isolation – Because data are scattered in various files, and files may be in different formats, writing new

application programs to retrieve the appropriate data is difficult.

5. Integrity problems – The data stored in the database must satisfy certain types of consistency constraints.

Eg. The balance of a bank account may never fall below a prescribed amount (say, ICICI 2500/- ). Developers

enforce these constraints in the system by adding appropriate code in the various application programs.

However, when new constraints are added, it is difficult to change the programs to enforce them. The problem

is compounded when constraints involve several data items from different files.



6. Atomicity problems – A computer system, like any other mechanical or electrical device, is subject to

failure. In many applications, it is crucial that, if a failure occurs, the data be restored to the consistent state

that existed prior to the failure. Eg. Before computer formatt,we require to have a backup first. It is difficult

to ensure atomicity in a conventional file processing system.

7. Concurrent-access anomalies – For the sake of overall performance of the system and faster response, many

systems allow multiple users to update the data simultaneously. In such an environment, interaction of

concurrent updates may result in inconsistent data. Eg. Consider bank account A, containing $500. If two

customers withdraw funds (say $50 and $100 respectively) from account A at about the same time, the result

of the concurrent executions may leave the account in an incorrect (or inconsistent) state. Suppose that the

programs executing on behalf of each withdrawal read the old balance, reduce that value by the amount being

withdrawn, and write the result back. If the two programs run concurrently, they may both read the value

$500, and write back $450 and $400, respectively. Depending on which one writes the value last, the account

may contain either $450 or $400, rather than the correct value of $350.

To guard against this possibility, the system must maintain some form of supervision. But supervision is

difficult to provide because data may be accessed by many different application programs that have not been

coordinated previously.

8. Security Problems – Not every user of the database system should be able to access all the data. Eg. In a

bank system, payroll personnel need to see only that part of the database that has information about the

various bank employees. They do not need access to information about customer accounts. But, since

application programs are added to the system in an ad hoc manner, enforcing such security constraints is

difficult.



Difference between DBMS and File-processing system

DBMS File-processing Systems

1. Redundancies and inconsistencies in data are reduced

due to single file formats and duplication of data is

eliminated.

1. Redundancies and inconsistencies in data exist due to

single file formats and duplication of data.

2. Data is easily accessed due to standard query

procedures.

2. Data cannot be easily accessed due to special

application programs needed to access data.

3. Isolation/retrieval of required data is possible due to

common file format, and there are provisions to easily

retrieve data

3. Data isolation is difficult due to different file formats,

and also because new application programs have to be

written.

4. Integrity constraints, whether new or old, can be

created or modified as per need.

4. Introduction of integrity constraints is tedious and

again new application programs have to be written.

5. Atomicity of updates is possible. 5. Atomicity of updates may not be maintained.

6. Several users can access data at the same time i.e

concurrently without problems

6. Concurrent accesses may cause problems such as .

inconsistencies.

7. Security features can be enabled in DBMS very easily. 7. It may be difficult to enforce security features.

Advantages of a DBMS over file-processing system

A DBMS has three main features:

(1) Centralized data management,

(2) Data independence and

(3) Data integration /System integration

In a DB system,the DBMS provides the interface b/w the application programes & data.when changes are made to the

data representation, the metadata maintained by the DBMS is changed but the DBMS contiues to provide data to

application programs in previously used way. The DBMS handles the task of information of data whereever

necessary.This independence b/w the programs & the data is called data independence.this made programe to

continue irrespective of changes made in it. To provide a high degree of data independence , a DBMS must include a

sophisticated metadata mgmt system.In DBMS , all files are integrated into one system thus reducing redundancies &

making data mgmt more efficient.In addition, DBMS provides centralised control of the operational data.Some of

advantages of above mention three features are:

Due to its centralized nature, the database system can overcome the disadvantages of the file-based system as

discussed below.



• Minimal Data Redundancy - Since the whole data resides in one central database, the various programs in the

application can access data in different data files. Hence data present in one file need not be duplicated in another.

This reduces data redundancy. However, this does not mean all redundancy can be eliminated. There could be

business or technical reasons for having some amount of redundancy. Any such redundancy should be carefully

controlled and the DBMS should be aware of it.

• Data Consistency - Reduced data redundancy leads to better data consistency.

• Data Integration - Since related data is stored in one single database, enforcing data integrity is much easier.

Moreover, the functions in the DBMS can be used to enforce the integrity rules with minimum programming in the

application programs.

• Data Sharing - Related data can be shared across programs since the data is stored in a centralized manner. Even

new applications can be developed to operate against the same data.

• Enforcement of Standards - Enforcing standards in the organization and structure of data files is required and

also easy in a Database System, since it is one single set of programs which is always interacting with the data files.

• Application Development Ease - The application programmer need not build the functions for handling issues

like concurrent access, security, data integrity, etc. The programmer only needs to implement the application business

rules. This brings in application development ease. Adding additional functional modules is also easier than in file-

based systems.

• Better Controls - Better controls can be achieved due to the centralized nature of the system.

• Data Independence - The architecture of the DBMS can be viewed as a 3-level system comprising the

following:

- The internal or the physical level where the data resides.

- The conceptual level which is the level of the DBMS functions

- The external level which is the level of the application programs or the end user.

Data Independence is isolating an upper level from the changes in the organization or structure of a lower level.

For example, if changes in the file organization of a data file do not demand for changes in the functions in the

DBMS or in the application programs, data independence is achieved. Thus Data Independence can be defined as

immunity of applications to change in physical representation and access technique. The provision of data

independence is a major objective for database systems.

• Reduced Maintenance - Maintenance is less and easy, again, due to the centralized nature of the system.

Disadvantages of a DBMS

The following are disadvantages of DBMS

1. Setup of the database system requires more knowledge, money, skills, and time.

2. The complexity of the database may result in poor performance.



Functions of a DBMS

The functions performed by a typical DBMS are the following:

• Data Definition - The DBMS provides functions to define the structure of the data in the application. These

include defining and modifying the record structure, the type and size of fields and the various constraints/conditions

to be satisfied by the data in each field.

• Data Manipulation - Once the data structure is defined, data needs to be inserted, modified or deleted. The

functions which perform these operations are also part of the DBMS. These function can handle planned and

unplanned data manipulation needs. Planned queries are those which form part of the application. Unplanned queries

are ad-hoc queries which are performed on a need basis.

• Data Security & Integrity - The DBMS contains functions which handle the security and integrity of data in

the application. These can be easily invoked by the application and hence the application programmer need not code

these functions in his/her programs.

• Data Recovery & Concurrency - Recovery of data after a system failure and concurrent access of records by

multiple users are also handled by the DBMS.

• Data Dictionary Maintenance - Maintaining the Data Dictionary which contains the data definition of the

application is also one of the functions of a DBMS.

• Performance - Optimizing the performance of the queries is one of the important functions of a DBMS. Hence

the DBMS has a set of programs forming the Query Optimizer which evaluates the different implementations of a

query and chooses the best among them.

Thus the DBMS provides an environment that is both convenient and efficient to use when there is a large volume of

data and many transactions to be processed.

Data abstraction

It can be summed up as follows.

1. When the DBMS hides certain details of how data is stored and maintained, it provides what is

called as the abstract view of data.

2. This is to simplify user-interaction with the system.

3. Complexity (of data and data structure) is hidden from users through several levels of abstraction.

Data abstraction is used for following purposes:

1. To provide abstract view of data.

2. To hide complexity from user.

3. To simplify user interaction with DBMS.



Levels of data abstraction

There are three levels of data abstraction.

1. Physical level: It describes how a record (e.g., customer) is stored.

Features:

a) Lowest level of abstraction.

b) It describes how data are actually stored.

c) It describes low-level complex data structures in detail.

d) At this level, efficient algorithms to access data are defined.

2. Logical level: It describes what data stored in database, and the relationships among the data.

Features:

a) It is next-higher level of abstraction. Here whole Database is divided into small simple structures.

b) Users at this level need not be aware of the physical-level complexity used to implement the simple

structures.

c) Here the aim is ease of use.

d) Generally, database administrators (DBAs) work at logical level of abstraction.

3. View level: Application programs hide details of data types. Views can also hide information (e.g., salary) for

security purposes.

Features:

a) It is the highest level of abstraction.

b) It describes only a part of the whole Database for particular group of users.

c) This view hides all complexity.

d) It exists only to simplify user interaction with system.

e) The system may provide many views for the whole system.



Instances and Schemas

Instances and Schemas are similar to types and variables in programming languages.

1. Schema:

The overall design of a database is called database schema. E.g., the database consists of information about a set

of customers and accounts and the relationship between them. It is analogous to variable along with its type

information in a program.

Types of Schemas (partitioned according to levels of abstraction):

a. Physical schema: It is database design at the physical level. It is hidden below logical schema, and can be

changed easily without affecting application programs.

b. Logical schema: It is database design at the logical level. Programmers construct applications using

logical schema. It is by far the most important schema, in terms of its effect on application programs.

c. Subschema: It is schema at view level.

2. Instance:

It is the actual content of the database at a particular point in time. It is analogous to the value of a variable.

The ANSI/SPARC Architecture

A DBMS can be considered as a buffer between application programs, end users and a database designed to fulfill

features of data independence. In 1975 the American National Standards Institute Standards Planning and

Requirements Committee (ANSI-SPARC) proposed three-level architecture identified three levels of abstraction.

These levels are sometimes referred to as schemas or views.

1. The External or User Level: This level describes the user’s or application program’s view of the database.

Several programs or users may share the same view.

2. The Conceptual Level: This level describes the organization’s view of all the data in the database, the

relationships between the data and the constraints applicable to the database. This level describes a logical view of

the database i.e. a view locking implementation detail.

3. The Internal or Physical Level: This level describes the way in which data is stored and the way in which data

may be accessed. This level describes a physical view of the database.

Each level is described in terms of schema – a map of the database. The three- level architecture is

used to implement data independence through two levels of mapping: that between the external schema and the

conceptual schema and that between the conceptual schema and the physical schema.



Foreg., Consider a Bank System, It uses

• Customer_Details Table. • Customer_Transactions Table.

At the internal level, a Customer_Details or Customer_Transaction record can be described as a block of consecutive

storage locations (for example, words of bytes). The language compiler hides this level of detail from programmers.

Similarly, the database system hides the lowest-level storage details (how data is stored and accessed) from database

programmers. At the conceptual level, the table definition (the attributes data type and width definition) and the

interrelationship among the data is described. Finally at the external level, several views of the database are defined,

and database end users are also to see those views. In addition to hiding details of the conceptual level of the database,

the views also provide a security mechanism to prevent users from accessing parts of the database. For e.g., tellers in

the bank will be able to see only that part of the database that has information on customer accounts; they cannot

access information concerning salaries of bank employees.



Data Independence

Data independence is the ability to modify a schema definition in one level without affecting a schema definition in a

higher level is called data independence.

There are two types of ‘data independence’:

1. Physical data independence

a. It is the ability to modify the physical scheme without causing application programs to be rewritten.

b. Modifications at this level are usually to improve performance.

2. Logical data independence

a. It is the ability to modify the conceptual scheme without causing application programs to be rewritten

b. It is usually done when logical structure of database is altered.

Logical data independence is harder to achieve as the application programs are usually heavily dependent on the

logical structure of the data. An analogy is made to abstract data types in programming languages.

Database Users

Users are differentiated by the way they expect to interact with the system. They fall into the following categories:

1. Application programmers: They are computer professionals interacting with the system through DML calls

embedded in a program written in a host language (e.g. C, PL/1, Pascal).

a. These programs are called application programs.

b. The DML precompiler converts DML calls (prefaced by a special character like $, #, etc.) to normal

procedure calls in a host language.

c. The host language compiler then generates the object code.

d. Some special types of programming languages combine Pascal-like control structures with control structures

for the manipulation of a database.

e. These are sometimes called fourth-generation languages.

f. They often include features to help generate forms and display data.

2. Sophisticated users: They interact with the system without writing programs.

a. They form requests by writing queries in a database query language.

b. These are submitted to a query processor that breaks a DML statement down into instructions for the database

manager module.

3. Specialized users: They are sophisticated users writing special database application programs. These may be

CAD systems, knowledge-based and expert systems, complex data systems (audio/video), etc.

4. Naive users: They are unsophisticated users who interact with the system by using permanent application

programs (e.g. automated teller machine).



Database Administrator

The database administrator is a person having central control over data and programs accessing that data. He

coordinates all the activities of the database system; the database administrator has a good understanding of the

enterprise’s information resources and needs.

Functions of a DBA

Database administrator's duties include:

1. Schema definition: the creation of the original database schema. This involves writing a set of definitions in a

DDL (data storage and definition language), compiled by the DDL compiler into a set of tables stored in the data

dictionary.

2. Storage structure and access method definition: writing a set of definitions translated by the data storage and

definition language compiler.

3. Schema and physical organization modification: writing a set of definitions used by the DDL compiler to

generate modifications to appropriate internal system tables (e.g. data dictionary). This is done rarely, but sometimes

the database schema or physical organization must be modified.

4. Granting user authority to access the database: granting different types of authorization for data access to

various users

5. Specifying integrity constraints: generating integrity constraints. These are consulted by the database manager

module whenever updates occur.

6. Routine Maintenance: It includes the following:

a. Acting as liaison with users.

b. Monitoring performance and responding to changes in requirements.

c. Periodically backing up the database.

Database languages

We have Data Definition Languages (DDL) to specify database schemas and Data Manipulation Language (DML) to

express database updates and queries. In practice, these are not to separate languages but are part of a single database

language, like SQL.

1. Data Definition Languages (DDL)

It is the language that is used to specify database schemas by a set of definitions contained in it.

2. Data Manipulation Language (DML)

It is a language for accessing and manipulating the data organized by the appropriate data model.

DML is also known as query language.



There are two types of DML

a) Procedural DMLs

b) Declarative DMLs (non-procedural DMLs)

1. Procedural DMLs - This language requires user to specify what data is required and how to get those data.

2. Declarative DMLs (non-procedural DMLs) - This language requires user to specify what data is required

without specifying how to get those data.

Features of DDL

1. Used to specify a database schema as a set of definitions expressed in a DDL

2. DDL statements are compiled, resulting in a set of tables stored in a special file called a data dictionary or data

directory.

3. The data directory contains metadata (data about data).

4. The storage structure and access methods used by the database system are specified by a set of definitions in a

special type of DDL called a data storage and definition language.

5. Basic idea of DDL: To hide implementation details of the database schemes from the users.

Features of DML

1. A DML is a language which enables users to access and manipulate data.

2. The goal is to provide efficient human interaction with the system.

3. There are two types of DMLs

a. Procedural: Here user specifies what data is needed and how to get it.

b. Non-procedural: Here user only specifies what data is needed.

- Easier for user

- May not generate code as efficient as that produced by procedural languages

DBMS System Structure and its Components

We can explain the overall structure of DBMS/System structure and its components by the diagram given below:

1. Database systems are partitioned into modules for different functions. Some functions (e.g. file systems) may be

provided by the operating system.

2. Broadly the functional components of a database system are:

a. Query Processor: It is one of the functional components of DBMS. It translates statements in a query language

into low-level instructions the database manager understands. It may also attempt to find an equivalent but more

efficient form.



It contains following components:

a. DML compiler - It converts DDL statements to a set of tables containing metadata stored in a data dictionary.

It also performs query optimization.

b. DDL interpreter – It interprets DDL statements and records definitions into data dictionary.

c. Query evaluation engine – It executes low-level instructions generated by DML compiler. They mainly deal

with solving all problems related to queries and query processing. It helps database system simplify and facilitate

access to data.

b. Storage Manger (Database Manager)

1. Storage manager is a program module that provides the interface between the low-level data stored

in the database and the application programs and queries submitted to the system.

2. The storage manager is responsible to the following tasks:

1. interaction with the file manager

2. efficient storing, retrieving and updating of data

3. The important components include:

a. File manager: It manages allocation of disk space and data structures used to represent information on disk.

b. Database manager: It is the interface between low-level data and application programs and queries.

c. Transaction manager: Transaction manager ensures that the database remains in a consistent (correct) state

despite system failures (e.g., power failures and operating system crashes) and transaction failures.

d. DML precompiler: It converts DML statements embedded in an application program to normal procedure

calls in a host language. The precompiler interacts with the query processor.

e. DDL compiler: It converts DDL statements to a set of tables containing metadata stored in a data dictionary.

f. Authorization and integrity manger – It conducts integrity checks and user authority to access data.

g. Buffer manger – It is critical part of DB and stores temporary data.

In addition, several data structures are required for physical system implementation:

a. Data files: They store the database itself.

b. Data dictionary: It stores information about the structure of the database. It is used heavily. Great emphasis

should be placed on developing a good design and efficient implementation of the dictionary. In short, it stores

metadata.

c. Indices: They provide fast access to data items holding particular values.



Data Model

A data model is collection of tools for describing

1. data 2. data relationships 3. data semantics 4. data constraints

Types of Data Models

There are basically two types of data models

1. Record based Data Models. 2. Object based Data Models.

1. Record based Data Models – In Record-based models, the database is organized in fixed-format records of several

types. A fixed number of fields, or attributes, are defined in each record type, and each field is usually of a fixed

length.

The three most popular record-based data models are

1. Relational Data Model 2. Network Data Model 3. Hierarchical Data Model



1. Relational Data Model –

1. The relational model uses tables to represent the data and the relationships among those data.

2. Each table has multiple columns, and each column is identified by a unique name.

3. It is a low level model.

In this database, each row in the table represents a different customer. Relationships link rows from two tables on the

basis of the key field, in this case – number.

Advantages of Relational Data Model

a. Structural Independence – Relational database model has structural independence, i.e. changes made

in the database structure do not affect the DBMS’s capability to access data.

b. Simplicity – The relational model is the simplest model at the conceptual level. It allows the designer

to concentrate on the logical view of the database, leaving the physical data storage details.

c. Ease of designing, implementation, maintenance, and usage – Due to the inherent features of data

independence and structural independence, and the relational model makes it easy to design,

implement, maintain and use the databases.

d. Adhoc query capability – One of the main reasons for the huge popularity of the relational database

model is the presence of powerful, flexible and easy-to-use query capability. The query language of the

relational database model – Structure Query Language or SQL – is a fourth generation language

(4GL). A 4GL concentrates on the ‘what’ and not on the ‘how’ of the problem. Selective output can be

achieved by giving a simple query. The relational database translates the user queries into the code

required to extract the desired information.



Disadvantages of Relational Data Model

a. Hardware overheads – The RDBMS needs comparatively powerful hardware as it hides the implementation

complexities and the physical data storage details from the users.

b. Ease of design can result in bad design – As the relational database is an easy-to-design and use system, it can

result in the development and implementation of poorly designed database management systems. As the size of the

database increases, several problems may creep in – system shutdown, performance degradation and data corruption.

2. Network Data Model –

1. In the network model, data are represented by collections of records.

2. Relationships among data are represented by links.

3. In this model Graph data structure is used.

4. A network model permits a record to have more than one parent

Advantages of Network Data Model

a. Simplicity – The network data model is also conceptually simple and easy to design.

b. Ability to handle more relationship types – The network model can handle the one-to-many and many-to-many

relationships.

c. Ease of data access – In the network database terminology, a relationship is a set. Each set comprises of two types

of records – an owner record and a member record. In a network model an application can access an owner record

and all the member records within a set.

d. Data Integrity – In a network model, no member can exist without an owner. A user must therefore first define

the owner record and then the member record. This ensures the data integrity.

e. Data Independence – The network model draws a clear line of demarcation between the programs and the

complex physical storage details. The application programs work independently of the data. Any changes made in the

data characteristics do not affect the application program.

f. Database standards – The standards devised by the DBTG (Database Task Group of CODASYL Committee) form

the basis of the network model. These standards were further enhanced by ANSI/SPARC (American National

Standards Institute/Standards Planning and Requirements Committee) in the 1970s. All the network database



management systems adhere to these standards. These standards comprise of a DDL and a DML that augments the

database administration and portability.

Disadvantages of Network Data Model

a. System complexity – In a network model, data are accessed one record at a time. This makes it essential for the

database designers, administrators, and programmers to be familiar with the internal data structures to gain access to

the data. Therefore, a user-friendly database management system cannot be created using the network model.

b. Lack of structural independence – Making structural modifications to the database is very difficult in the network

database model as the data access method is navigational. Any changes made to the database structure require the

application programs to be modified before they can access data. Though the network database model achieves data

independence, it still fails to achieve structural independence.

3. Hierarchical Model

1. In the hierarchical model, data are represented by collections of records.

2. Relationships among data are represented by links.

3. In this model Tree data structure is used.

4. There are two concepts associated with the hierarchical model – segment types and parent-child relationships.

Segment type is similar to the record types in the network models. The information retrieved only by navigating from

the root segment type to the nodes segment types. Thus you can access a segment type only via its parent segment

type in the parent-child relationship. The operators provided for manipulating such structures include operators for

traversing hierarchic paths up and down the trees.



Advantages of hierarchical Model

1. Simplicity – Since the database is based on the hierarchical structure, the relationship between the various layers is

logically simple. Thus, the design of a hierarchical database is simple.

2. Data Security – Hierarchical model was the first database that offered the data security that is provided and

enforced by the DBMS.

3. Data Integrity – Since the hierarchical model is based on the parent/child relationship, there is always a link

between the parent segment and the child segment under it. The child segments are always automatically referenced

to its parent, this model promotes data integrity.

4. Efficiency – The hierarchical database model is a very efficient one when the database contains a large number of

one-to-many relationships and when the users require large number of transactions, using data whose relationships are

fixed.

Disadvantages of hierarchical Model

1. Implementation Complexity – Although the hierarchical database model is conceptually simple and easy to

design, it is quite complex to implement. The database designers should have very good knowledge of the physical

data storage characteristics.

2. Database management problems – If you make any changes in the database structure of a hierarchical database,

then it is required to make the necessary changes in all the application programs that access the database. Thus,

maintaining the database and the applications can become very cumbersome.

3. Lack of structural independence – Structural independence exists when the changes made to the database

structure does not affect the DBMS’s ability to access data. Hierarchical database systems use physical storage paths

to navigate to the different data segments. So the application programmer should have a good knowledge of the

relevant access paths to access the data. So if the physical structure is changed the applications will also have to be

altered. Thus, in a hierarchical database the benefits of data independence are limited by structural dependence.

4. Programming complexity – Due to the structural dependence and the navigational structure, the application

programmers and the end users must know precisely how the data is distributed physically in the database in order to

access data. This requires knowledge of complex pointer systems, which is difficult for users who have little or no

programming knowledge.



5. Implementation limitation – Many of the common relationships do not conform to the one-to-many format

required by the hierarchical model. The many-to-many relationships, which are more common in real life, are very

difficult to implement in a hierarchical model.

S.No Hierarchical Data Model Network Data Model Relational Data Model

1. Relationship between records

is of parent child type.

Relationship between records is

expressed in the form of pointers

or links.

Relationship between record is

represented by a relation that contains a

key for each record involved in the

relations.

2. Many-to-many relationship

cannot be expressed in this

model.

Many-to-many relationship can

also be implemented.

Many-to-many relationship can be easily

implemented

3. It is a simple, Straight forward

and natural method of

implementing record

relationships

Record relationship

implementation is quite complex

due to the use of pointers.

Relationship implementation is very easy

though the use of a key or composite key

field(s).

4. This type of model is useful

only when there is some

hierarchical character in the

database.

Network model is useful for

representing such records

which have many-to-many

relationships.

Relational model is useful for

representing most of the real world

objects and relationships among them.

5. In order to represent links

among records, pointers are

used. Thus relationships

among records are physical.

In Network model also the

relationship among records are

physical.

Relational model does not maintain

physical connection among records. Data

is organized logically in the form of

rows and columns and stored in table.

6. Searching for a record is very

difficult since one can retrieve

a child only after going though

its parent record.

Searching a record is easy since

there are multiple access paths to

a data element.

A unique, indexed key field is used to

search for a data element.

7. During updation or deletion

process, chance of data

inconsistency is involved.

No problem of inconsistency

exists in network model because

a data element is physically

located at just one place.

Data integrity maintaining methods like

Normalization process, etc. are adopted

for consistency.



Object Based Data Models – In Object-based models, the database is organized in real world objects of several

types. A number of fields, or attributes, are defined in each object type, and each field is usually of a variable length.

The two most popular object-based data models are

a. Object oriented model b. E R Model

1. Object Oriented Model -

1. The object-oriented model is based on a collection of objects, like the E-R model.

2. An object contains values stored in instance variables within the object.

3. Unlike the record-oriented models, these values are themselves objects.

4. Thus objects contain objects to an arbitrarily deep level of nesting.

5. An object also contains bodies of code that operate on the object. These bodies of code are called methods.

6. Objects that contain the same types of values and the same methods are grouped into classes.

7. A class may be viewed as a type definition for objects.

8. Analogy: the programming language concept of an abstract data type.

9. The only way in which one object can access the data of another object is by invoking the method of that other

object. This is called sending a message to the object.

10. Internal parts of the object, the instance variables and method code, are not visible externally.

11. Result is two levels of data abstraction.

For example, consider an object representing a bank account.

a. The object contains instance variables number and balance.

b. The object contains a method pay-interest which adds interest to the balance.

c. Under most data models, changing the interest rate entails changing code in application programs.

d. In the object-oriented model, this only entails a change within the pay-interest method.

12. Unlike entities in the E-R model, each object has its own unique identity, independent of the values it contains:

a. Two objects containing the same values are distinct.

b. Distinction is maintained in physical level by assigning distinct object identifiers.



Advantages of Object Oriented Data Model

a. Capability to handle large number of different data types – Traditional database models like hierarchical, network

and relational database are limited in their capability to store the different types of data. For e.g., one cannot store

pictures, voices and video in these databases. But the object-oriented database can store any type of data including

text, numbers, pictures, voice and video.

b. Combination of object-oriented programming and database technology – Perhaps the most significant

characteristic of object-oriented database technology is that it combines object-oriented programming with database

technology to provide an integrated application development system.

c. Object-oriented features improve productivity – Inheritance allows one to develop solutions to complex problems

incrementally by defining new objects in terms of previously defined objects. Polymorphism and dynamic binding

allow one to define operations for one object and then to share the specification of the operation with other objects.

These objects can further extend this operation to provide behaviors that are unique to those objects. Dynamic

binding determines at runtime, which of these operations is actually executed, depending on the class of the object

requested to perform the operation. Polymorphism and dynamic binding are powerful object-oriented features that

allow one to compose objects to provide solutions without having to write code that is specific to each object. All of

these capabilities come together to provide significant productivity advantages to database application developers.

d. Data access – Object-oriented database represent relationships explicitly, supporting both navigational and

associative access to information. As the complexity of interrelationships between information within the database

increases, the greater the advantages of representing relationships explicitly. Another benefit of using explicit

relationships is the improvement in data access performance over relational value-based relationships.

Disadvantages of Object Oriented Data Model

a. Difficult to maintain – In the real world, the data model is not static. It changes as organizational information

needs change and as missing information is identified. Consequently, the definition of objects must be changed

periodically and existing databases migrated to conform to the new object definitions. Object-oriented databases are

semantically rich introducing a number of challenges when changing object definitions and migrating databases.

Object-oriented databases have a greater challenge handling schema migration because it is not sufficient to simply

migrate the data representation to conform to the changes in class specifications. One must also update the behavioral

code associated with each object.

b. Not suited for all applications – Object-oriented database systems are not suited for all applications. If it is used in

situations where it is not required, then it will result in performance degradation and high processing requirements.

OODBMS is popular in area such as e-commerce, engineering product data management, and special purpose



databases in securities and medicine. The strength of the object model is in applications where there is an underlying

needed to manage complex relationships among data objects.

E R Model (Entity Relational Model)

1. The entity-relationship model is based on a perception of the world as consisting of a collection of basic objects

(entities) and relationships among these objects.

2. It is an object-based logical model.

3. It is a high-level data model.

4. An entity is a distinguishable object that exists.

5. Each entity has associated with it a set of attributes describing it. E.g. number and balance for an account entity.

6. A relationship is an association among several entities.

E.g. A cust_acct relationship associates a customer with each account he or she has.

7. The set of all entities or relationships of the same type is called the entity set or relationship set.

8. Another essential element is the E-R diagram in which the mapping cardinalities express the number of entities

to which another entity can be associated via a relationship set.

9. The overall logical structure of a database can be expressed graphically by an E-R diagram:

a. Rectangles: represent entity sets.

b. Ellipses: represent attributes.

c. Diamonds: represent relationships among entity sets.

d. Lines: link attributes to entity sets and entity sets to relationships.

intro to dbms

Documents