questions n answers unit 1 - dbms

14
Q1 : What is the purpose of using Database ? OR Q1: What are the advantages of using database over file system ? Answer : We use Database in place of file system because they reduces the following problems : 1 > Data redundancy and inconsistency. Since different programmers create the files and application programs over a long period, the various files are likely to have different formats and the programs may be written in several programming languages. Moreover, the same information may be duplicated in several places (files). For example, the address and telephone number of a particular customer may appear in a file that consists of savings-account records and in a file that consists of checking-account records. This redundancy leads to higher storage and access cost. In addition, it may lead to data inconsistency; that is, the various copies of the same data may no longer agree. For example, a changed customer address may be reflected in savings- account records but not elsewhere in the system. 2 > Difficulty in accessing data. Suppose that one of the bank officers needs to find out the names of all customers who live within a particular postal-code area. The officer asks the data-processing department to generate such a list. Because the designers of the original system did not anticipate this request, there is no application program on hand to meet it. There is, however, an application program to generate the list of all customers. The bank officer has now two choices: either obtain the list of all customers and extract the needed information manually or ask a system programmer to write the necessary application program. Both alternatives are obviously unsatisfactory. Suppose that such a program is written, and that, several days later, the same officer needs to trim that list to include only those customers who have an account balance of $10,000 or more. As expected, a program to generate such a list does not exist. Again, the officer has the preceding two options, neither of which is satisfactory. The point here is that conventional file-processing environments do not allow needed data to be retrieved in a convenient and efficient manner. More responsive data-retrieval systems are required for general use. 3 > Data isolation. Because data are scattered in various files, and files may be in different formats, writing new application programs to retrieve the appropriate data is difficult. 4 > Integrity problems. The data values stored in the database must satisfy certain types of consistency constraints. For example, the balance of a bank account may never fall below a prescribed amount (say, $25). Developers enforce these constraints in the system by adding appropriate code in the various application programs. However, when new constraints are added, it is difficult to change the programs to enforce them. The problem is compounded when constraints involve several data items from different files. 5 > Atomicity problems. A computer system, like any other mechanical or electrical device, is subject to failure. In many applications, it is crucial that, if a failure occurs, the data be restored to the consistent state that existed prior to the failure. Consider a program to transfer $50 from account A to account B. If a system failure occurs during the execution of the program, it is possible that the $50 was removed from account A but was not credited to account B, resulting in an inconsistent database state. Clearly, it is essential to database consistency that either both the credit and debit occur, or that neither occur. That is, the funds

Upload: weirdwolfvortex

Post on 27-Dec-2015

53 views

Category:

Documents


1 download

DESCRIPTION

DBMS - Questions and Answers notes

TRANSCRIPT

Q1 : What is the purpose of using Database ?

OR

Q1: What are the advantages of using database over file system ?

Answer :

We use Database in place of file system because they reduces the following problems :

1 > Data redundancy and inconsistency.

Since different programmers create the files and application programs over a long period, the various files are likely to have different formats and the programs may be written in several programming languages. Moreover, the same information may be duplicated in several places (files). For example, the address and telephone number of a particular customer may appear in a file that consists of savings-account records and in a file that consists of checking-account records. This redundancy leads to higher storage and access cost. In addition, it may lead to data inconsistency; that is, the various copies of the same data may no longer agree. For example, a changed customer address may be reflected in savings-account records but not elsewhere in the system.

2 > Difficulty in accessing data.

Suppose that one of the bank officers needs to find out the names of all customers who live within a particular postal-code area. The officer asks the data-processing department to generate such a list. Because the designers of the original system did not anticipate this request, there is no application program on hand to meet it. There is, however, an application program to generate the list of all customers. The bank officer has now two choices: either obtain the list of all customers and extract the needed information manually or ask a system programmer to write the necessary application program. Both alternatives are obviously unsatisfactory. Suppose that such a program is written, and that, several days later, the same officer needs to trim that list to include only those customers who have an account balance of $10,000 or more. As expected, a program to generate such a list does not exist. Again, the officer has the preceding two options, neither of which is satisfactory. The point here is that conventional file-processing environments do not allow needed data to be retrieved in a convenient and efficient manner. More responsive data-retrieval systems are required for general use.

3 > Data isolation.

Because data are scattered in various files, and files may be in different formats, writing new application programs to retrieve the appropriate data is difficult.

4 > Integrity problems.

The data values stored in the database must satisfy certain types of consistency constraints. For example, the balance of a bank account may never fall below a prescribed amount (say, $25). Developers enforce these constraints in the system by adding appropriate code in the various application programs. However, when new constraints are added, it is difficult to change the programs to enforce them. The problem is compounded when constraints involve several data items from different files.

5 > Atomicity problems.

A computer system, like any other mechanical or electrical device, is subject to failure. In many applications, it is crucial that, if a failure occurs, the data be restored to the consistent state that existed prior to the failure. Consider a program to transfer $50 from account A to account B. If a system failure occurs during the execution of the program, it is possible that the $50 was removed from account A but was not credited to account B, resulting in an inconsistent database state. Clearly, it is essential to database consistency that either both the credit and debit occur, or that neither occur. That is, the funds

transfer must be atomic—it must happen in its entirety or not at all. It is difficult to ensure atomicity in a conventional file-processing system.

6 > Concurrent-access anomalies.

For the sake of overall performance of the system and faster response, many systems allow multiple users to update the data simultaneously. In such an environment, interaction of concurrent updates may result in inconsistent data. Consider bank account A, containing $500. If two customers withdraw funds (say $50 and $100 respectively) from account A at about the same time, the result of the concurrent executions may leave the account in an incorrect (or inconsistent) state. Suppose that the programs executing on behalf of each withdrawal read the old balance, reduce that value by the amount being withdrawn, and write the result back. If the two programs run concurrently, they may both read the value $500, and write back $450 and $400, respectively. Depending on which one writes the value last, the account may contain either $450 or $400, rather than the correct value of $350. To guard against this possibility, the system must maintain some form of supervision. But supervision is difficult to provide because data may be accessed by many different application programs that have not been coordinated previously.

7 > Security problems.

Not every user of the database system should be able to access all the data. For example, in a banking system, payroll personnel need to see only that part of the database that has information about the various bank employees. They do not need access to information about customer accounts. But, since application programs are added to the system in an ad hoc manner, enforcing such security constraints is difficult.

Q2 :Explain Components of dbms with the help of a diagram . OR Q2 : Expain architecture / structure of DBMS with the help of a diagram . Answer : The functional components of a database system can be broadly divided into the storage

manager and the query processor components.

1 . Storage Manager A storage manager is a program module that provides the interface between the lowlevel data stored in the database and the application programs and queries submitted to the system. The storage manager components include:

1 > Authorization and integrity manager, which tests for the satisfaction of integrity constraints and

checks the authority of users to access data.

2 > Transaction manager, which ensures that the database remains in a consistent (correct) state

despite system failures, and that concurrent transaction executions proceed without conflicting.

3 > File manager, which manages the allocation of space on disk storage and the data structures used

to represent information stored on disk.

4 > Buffer manager, which is responsible for fetching data from disk storage into main memory, and

deciding what data to cache in main memory. The buffer manager is a critical part of the database system, since it enables the database to handle data sizes that are much larger than the size of main memory. The storage manager implements several data structures as part of the physical system implementation: 1 >> Data files, which store the database itself.

2 >>Data dictionary, which stores metadata about the structure of the database, in particular the schema

of the database.

3 >> Indices, which provide fast access to data items that hold particular values.

2 . The Query Processor The query processor components include

1 >> DDL interpreter, which interprets DDL statements and records the definitions in the data dictionary.

2 >> DML compiler, which translates DML statements in a query language into an evaluation plan

consisting of low-level instructions that the query evaluation engine understands. A query can usually be translated into any of a number of alternative evaluation plans that all give the same result. The DML compiler also performs 3 >> query optimization, that is, it picks the lowest cost evaluation plan from among the alternatives.

4 >> Query evaluation engine, which executes low-level instructions generated by the DML compiler.

Q4 : Explain 3-tier Architecture of DBMS with a diagram.

OR Q4 : Explain levels of abstraction in DBMS with a diagram. OR Q4 : Explain level of schemas with a diagram. Answer :

levels of abstraction

• Physical level: describes how a record (e.g., customer) is stored. • Logical level: describes data stored in database, and the relationships among the data.

type customer = record

customer_id : string; customer_name : string; customer_street : string; customer_city : string;

end;

• View level: application programs hide details of data types. Views can also hide information (such as an employee’s salary) for security purposes.

• Schema – the logical structure of the database – Example: The database consists of information about a set of customers and accounts

and the relationship between them) – Analogous to type information of a variable in a program – Physical schema: database design at the physical level – Logical schema: database design at the logical level

• Instance – the actual content of the database at a particular point in time – Analogous to the value of a variable

• Physical Data Independence – the ability to modify the physical schema without changing the logical schema

– Applications depend on the logical schema – In general, the interfaces between the various levels and components should be well

defined so that changes in some parts do not seriously influence others.

Q 5 : Explain different Data models in DBMS . Answer : A data model is a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints.

HIERARCHICAL MODEL

A hierarchical database model is a data model in which the data is organized into a tree-like structure.

The structure allows representing information using parent/child relationships: each parent can have

many children, but each child has only one parent (also known as a 1-to-many relationship). All

attributes of a specific record are listed under an entity type.

One of the oldest data model

Tree like structure

One of the hierarchical database model is IBM - IMS

Organizes the data in the tabular rows

Different operation were:-

Insert

Update

Delete

Retrieval of data

Advantages : -

simlicity

Data integrity

Efficiency

Disadvantages :

Implementation complexity

Lack of structural independency

Implementation limitations

Program complexity

Network Model

The network model expands upon the hierarchical structure, allowing many-to-many relationships in a

tree-like structure that allows multiple parents.

Network model replaces the hierarchical tree with a graph

Able to handle different relations

Some well-known database systems that use the network model include:

Integrated Data Store (IDS)

IDMS (Integrated Database Management System)

Advantages :

Capable to handle different relationships

Ease in data access

Data integrity

Database standards

Data independence

Disadvantages :

System complexity

Operational anomalies

Absence of structural independence

Object-oriented database models

An object-relational database (ORD), or object-relational database management system (ORDBMS), is a

database management system (DBMS) similar to a relational database, but with an object-oriented

database model: objects, classes and inheritance are directly supported in database schemas and in the

query language. In addition, just as with pure relational systems, it supports extension of the data model

with custom data-types and methods.

Q 6 : What are different types of KEYS ?

Answer :

• Candidate key

• Primary key

• Foreign key

• Unique key

• Superkey

• Alternate key

Unique Key :

• A unique key is a key that uniquely defines the characteristics of each row.

• The primary key has to consist of characteristics that cannot collectively be duplicated by any

other row.

• A Unique and Foreign key is needed in order to link tables

Foreign Key :

• In the context of relational databases, a foreign key is a field (or collection of fields) in

one table that uniquely identifies a row of another table.

• In other words, a foreign key is a column or a combination of columns that is used to establish

and enforce a link between two tables.

Primary Key :

The primary key of a relational table uniquely identifies each record in the table. It can either be a

normal attribute that is guaranteed to be unique (such as Social Security Number in a table with no

more than one record per person) or it can be generated by the DBMS (such as a globally unique

identifier, or GUID, in Microsoft SQL Server). Primary keys may consist of a single attribute or multiple

attributes in combination.

Candidate Key :

A candidate key is a combination of attributes that can be uniquely used to identify a database record

without any extraneous data. Each table may have one or more candidate keys. One of these candidate

keys is selected as the table primary key.

Alternate Key :

There can be a key apart from primary key in a table that can also be a key. This key may or may not be a

unique key. For example, in an employee table, empno is a primary key, empname is a alternate key that

may not be unique but still helps in identifying a row of the table.

Super Key :

A superkey is a combination of columns that uniquely identifies any row within a relational database

management system (RDBMS) table.

Candidate key is a subset of superkey .

A candidate key is a closely related concept where the superkey is reduced to the minimum number of

columns required to uniquely identify each row.

Q 7 : What are the responsibilities of a DBA (Database Administrator ) ?

Answer :

One of the main reasons for using DBMSs is to have central control of both the data and the programs that access those data. A person who has such central control over the system is called a database administrator (DBA). The functions of a DBA include:

• Schema definition. The DBA creates the original database schema by executing a set of data definition

statements in the DDL.

• Storage structure and access-method definition.

• Schema and physical-organization modification. The DBA carries out changes to the schema and

physical organization to reflect the changing needs of the organization, or to alter the physical organization to improve performance.

• Granting of authorization for data access. By granting different types of authorization, the database

administrator can regulate which parts of the database various users can access. The authorization information is kept in a special system structure that the database system consults whenever someone attempts to access the data in the system.

• Routine maintenance. Examples of the database administrator’s routine maintenance activities are:

Periodically backing up the database, either onto tapes or onto remote servers, to prevent loss of data in case of disasters such as flooding.

Ensuring that enough free disk space is available for normal operations, and upgrading disk space as required.

Monitoring jobs running on the database and ensuring that performance is not degraded by very expensive tasks submitted by some users.

E R Model

Q 8 : What is Generalization and Specialization ? Explain with a diagram . Answer : Specialization and generalization define a containment relationship between a higher-level entity set and one or more lower-level entity sets. Specialization is the result of taking a subset of a higher-level entity set to form a lower level entity set. Generalization is the result of taking the union of two or more disjoint (lower-level) entity sets to produce a higher-level entity set. The attributes of higher-level entity sets are inherited by lower-level entity sets.

Q 9 : What is Aggregation ? Explain with a diagram . Answer : Aggregation is an abstraction in which relationship sets (along with their associated entity sets) are treated as higher-level entity sets, and can participate in relationships.