dbms
UNIT I
1. Database system
A database system is nothing more than a computer-based record-keeping system, that is, a system whose overall purpose is to record and maintain information. The information concerned can be anything that is deemed to be of significance to the organization the system serves, in particular anything that may support the decision-making processes involved in managing that organization.
The database system involves four major components: data, hardware, software, and users.
Fig: Simplified picture of a database system (users, that is application programs and end users, access the database through the Database Management System)
Data
The data stored in the system is partitioned into one or more databases. A database is a repository for stored data; it is both integrated and shared.
Integrated: By integrated we mean that the database can be thought of as a unification of several distinct files, with the redundancy among those files eliminated. Example: combining the EMPLOYEE and ENROLLMENT data files.
Shared: By shared we mean that individual pieces of data in the database can be shared among different users, that is, many users can have access to the same piece of data. Example: the department information in the EMPLOYEE file would be shared by users in the personnel department, the education department, and so on.
Hardware
The hardware consists of the secondary storage devices (disks, drums, etc.) on which the database resides, together with the associated devices, control units, channels, and so forth.
Software
Between the physical database and the users of the system is a layer of software, usually called the DBMS. All requests from users for access to the database are handled by the DBMS. One general function provided by the DBMS is thus to shield database users from hardware-level detail. The DBMS provides a view of the database that is elevated somewhat above the hardware level and supports user operations that are expressed in terms of that higher-level view.
Users
We consider three broad categories of database users:
* application programmers
* end-users
* the DBA
1. Application programmers
The application programmer is responsible for writing application programs that use the database. These application programs operate on the data in all the usual ways: retrieving information, creating new information, and deleting or changing existing information.
2. End-users
End-users access the database from a terminal. An end-user may employ a query language provided as an integral part of the system, or may invoke a user-written application program that accepts commands from the terminal and in turn issues requests to the DBMS on the end-user's behalf.
3. Database Administrator
A DBMS provides central control of both the data and the programs that access those data. The person who has such control over the system is called the DBA. The main functions of the DBA are:
* Schema definition
* Storage structure and access-method definition
* Schema and physical-organization modification
* Granting of authorization for data access
* Integrity-constraint specification
These are the various components of a database system.
2. Operational data
A database is a collection of stored operational data used by the application systems of some particular enterprise. Here "enterprise" is a conventional generic term for any reasonably self-contained commercial, scientific, technical, or other organization. Examples: a manufacturing company, a bank, a hospital, a university, a government department. An enterprise must maintain a lot of data about its operation. The "operational data" for the enterprises listed above would be, respectively, product data, account data, patient data, student data, and planning data.
Example for the illustration of operational data
Consider a manufacturing company. The enterprise will wish to retain information about the projects it has on hand, the parts used in those projects, the suppliers who supply the parts, the warehouses in which the parts are stored, the employees who work on the projects, and so on. These are the basic entities about which data is recorded in the database (an entity is any distinguishable object). In general there will be associations or relationships linking the basic entities together.
For example, there is an association between suppliers and parts: each supplier supplies certain parts and, conversely, each part is supplied by certain suppliers.
Fig: An example of operational data (entities shown: suppliers, projects, warehouses, parts, employees, locations, departments)

The figure illustrates the following points.
1. Most of the associations are between two entities, but some involve more than two. For example, the arrow connecting suppliers, parts, and projects represents facts such as: supplier s2 supplies part p4 to project j3.
2. The example also shows one arrow involving only one type of entity (parts). For example, some parts are components of other parts (a screw is a component of a larger assembly such as a chair).
3. Some entities may be associated in more than one relationship. For example, projects and employees are linked in two relationships:
a. the employee works on the project
b. the employee is the manager of the project
This example illustrates operational data and the relationships it records.
3. Data Independence
The ability to modify a schema definition at one level without affecting the schema definition at the next higher level is called data independence. Most present-day applications are data-dependent. This means that the way in which the data is organized in secondary storage and the way in which it is accessed are both dictated by the requirements of the application, and moreover that knowledge of the data organization and access technique is built into the application logic. For example, if a file is stored in indexed sequential form, the application must know that the index exists and how it is defined in order to modify the file. Here the application is data-dependent, and a change to the storage structure requires the application program to be rewritten. In a database system, applications are insulated from such changes: a modification made at the physical or conceptual level need not affect the applications that use the database.
The two types of data independence are:
1. Physical data independence
Physical data independence is the ability to modify the physical schema without causing application programs to be rewritten. Modifications at the physical level are occasionally necessary to improve performance. Example: creating an index on a file or changing how a stored file is organized.
2. Logical data independence
Logical data independence is the ability to modify the logical schema without causing the application programs to be rewritten.
Example: modifications such as adding new columns or fields to the database.
Most of these modifications are made by the DBA. The types of change that the DBA may wish to make can be explained with the help of the following definitions:
Stored field: A stored field is the smallest unit of data stored in the database.
Example: a database containing information about parts would probably include a stored field type called part number.
Stored record: A stored record is a named collection of associated stored fields.
Stored file: A stored file is the collection of all occurrences of one type of stored record.
A change to the data type of a stored field is likewise made by the DBA. The aspects of the storage structure that may be subject to change include the following.
1. Representation of numeric data
Data may be stored in internal arithmetic form or as a character string.
2. Representation of character data
A character field may be stored in any of several character codes (e.g. EBCDIC, ASCII).
3. Units for numeric data
The units in a numeric field may change, for example from inches to centimeters.
4. Data coding
In some situations it may be desirable to represent data in storage by coded values. For example, the part color RED may be stored as the code 1, interpreted as 1 = 'RED'.
5. Structure of stored records
Two existing types of stored record may be combined into one. For example, the record types (part number, color) and (part number, weight) may be integrated to give (part number, color, weight). Also, a single type of stored record may be split into two. For example, (part number, color, weight) may be broken down into (part number, color) and (part number, weight).
6. Structure of stored files
A given stored file may be physically implemented in storage in a wide variety of ways, for example stored in a single storage volume or spread across several volumes.
Because such changes can be made without rewriting applications, the database is able to grow and change without affecting existing applications.
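
The way data coding and unit changes can be hidden from an application may be sketched in Python. This is only an illustration under stated assumptions, not how any particular DBMS works: the names `COLOR_CODES`, `stored_record`, and `read_part` are hypothetical, and only the code 1 = 'RED' comes from the notes above.

```python
# Assumed coding scheme; only 1 = 'RED' appears in the notes, the rest is illustrative.
COLOR_CODES = {1: "RED", 2: "GREEN", 3: "BLUE"}

# Hypothetical stored record: color is held as a code, length in centimeters.
stored_record = {"part_number": "P1", "color_code": 1, "length_cm": 25.4}


def read_part(stored):
    """Map a stored record to the form the application expects.

    The application always sees a color name and a length in inches,
    whatever coding or unit is used in storage.
    """
    return {
        "part_number": stored["part_number"],
        "color": COLOR_CODES[stored["color_code"]],
        "length_in": round(stored["length_cm"] / 2.54, 2),
    }


part = read_part(stored_record)
print(part["color"], part["length_in"])
```

If the stored unit or the coding scheme changes, only `read_part` (standing in for the mapping maintained by the DBA) has to change; code that uses `part["color"]` stays as it is.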
4. Architecture for a database system
The architecture is divided into three general levels: internal, conceptual, and external.
External level (individual user views)
Conceptual level (community user view)
Internal level (storage view)
Fig: Three levels of the architecture
* Internal level (physical level)
This level is the one closest to physical storage. It is a low-level representation of the entire database; it consists of many occurrences of each of many types of internal record. The storage view is described by means of the internal schema, which not only defines the various stored record types but also specifies what indexes exist, how stored files are represented, what physical sequence the stored records are in, and so on.
* Conceptual level (community logical level)
This level is a representation of the entire information content of the database. It consists of many occurrences of each of many types of conceptual record. It is also a level of indirection between the other two levels.
* External level (user logical level)
This level is closest to the users and is concerned with the way the data is seen by individual users. The users may be application programmers, end-users, the DBA, and so on. Each user has a language at his or her disposal to interact with the database. For the application programmer the language will be a conventional programming language such as C++ or Java. For end-users the language will be either a query language or some special-purpose language. All such languages include a data sub language (DSL), which is the subset of the total language that is concerned with database objects and operations. The DSL is embedded within the corresponding host language. A given system might support any number of host languages and any number of data sub languages; however, one particular data sub language that is supported by almost all current systems is SQL.
Any given data sub language is a combination of at least two subordinate languages: a data definition language (DDL) and a data manipulation language (DML). The DDL portion consists of declarative constructs and the DML portion consists of executable statements. The individual user will generally be interested only in some portion of the total database; moreover, that user's view of that portion will generally be somewhat abstract when compared with the way the data is physically stored. The term for an individual user's view is an external view. An external view is thus the content of the database as seen by some particular user.
For example, a user from the Personnel Department might view the details of employees and departments and nothing else.
Fig: Database system architecture. Users A1 and A2 (each working in a host language + DSL) share external view A, described by external schema A; users B1 and B2 share external view B, described by external schema B. External/conceptual mappings A and B relate these views to the conceptual view (conceptual schema). The conceptual/internal mapping relates the conceptual view to the stored database at the internal level (storage structure definition, or internal schema). The DBMS spans all of these levels, and the user interface lies between the users and their external views.
Mappings
The mappings involved in the architecture are the conceptual/internal mapping and the external/conceptual mappings. The conceptual/internal mapping defines the correspondence between the conceptual view and the stored database; it specifies how conceptual records and fields are represented at the internal level. If the structure of the stored database is changed, then the conceptual/internal mapping must be changed accordingly, so that the conceptual schema can remain invariant. The effects of such changes must be isolated below the conceptual level in order to preserve physical data independence. An external/conceptual mapping defines the correspondence between a particular external view and the conceptual view.
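
The two kinds of mapping can be pictured with a small Python sketch. It is an illustration only: the stored field names (`EMP#`, `DEPT#`, `PAY`), the dictionaries, and the helper functions are all hypothetical, and a real DBMS holds such mappings in its schemas rather than in program variables.

```python
# Internal level: a stored record with hypothetical physical field names.
stored_employee = {"EMP#": "E3", "DEPT#": "D8", "PAY": 42000}

# Conceptual/internal mapping: conceptual field -> stored field.
conceptual_internal = {
    "employee_number": "EMP#",
    "department_number": "DEPT#",
    "salary": "PAY",
}

# External/conceptual mapping for one user: external field -> conceptual field.
# This user's view leaves out the salary.
external_conceptual = {"emp": "employee_number", "dept": "department_number"}


def conceptual_view(stored):
    """Build the conceptual record from the stored record."""
    return {c: stored[s] for c, s in conceptual_internal.items()}


def external_view(stored):
    """Build this user's external record from the conceptual record."""
    conceptual = conceptual_view(stored)
    return {e: conceptual[c] for e, c in external_conceptual.items()}


print(external_view(stored_employee))
```

If the stored field `PAY` were renamed or re-encoded, only `conceptual_internal` would need to change; `external_conceptual` and anything written against the external view would be untouched, which is the point of physical data independence.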
Database administrator (DBA)
The Data Administrator (DA) is the person who makes the strategic and policy decisions regarding the data of the enterprise, and the DBA is the person who provides the necessary technical support for implementing those decisions. Thus the DBA is responsible for the overall control of the system at a technical level. The major tasks of the DBA are:
* defining the conceptual schema (schema definition)
* storage structure and access-method definition
* schema and physical-organization modification
* granting of authorization for data access
* integrity-constraint specification
DBMS
The DBMS is the software that handles all access to the database. Conceptually, it works as follows:
1. A user issues an access request using some particular data sub language.
2. The DBMS intercepts that request and analyses it.
3. The DBMS, in turn, inspects the external schema for that user, the corresponding external/conceptual mapping, the conceptual schema, the conceptual/internal mapping, and the storage structure definition.
4. The DBMS executes the necessary operations on the stored database.
Fig: Major functions and components of the DBMS. Components shown: DDL processors (taking source schemas and mappings), DML processor (taking planned DML requests), query language processor (taking unplanned DML requests), optimizer (turning compiled requests into optimized requests), run time manager (operating on the database), and the metadata, or data dictionary (holding source and object schemas and mappings). The DBMS also enforces security and integrity constraints.
5. Distributed databases
The key objective of a distributed system is that it should look like a centralized system to its users. Distributed processing means that distinct machines can be connected together into a communication network such as the Internet, so that a single data-processing task can span several machines in the network. A distributed database is typically a database that is not stored in its entirety at a single physical location, but rather is spread across a network of computers that are geographically dispersed and connected through communication links. For example, consider a banking system in which the customer accounts database is distributed across the bank's branch offices, such that each individual customer account record is stored at the customer's local branch. In other words, the data is stored at the location at which it is most frequently used, but is still available through the communication network to users at other locations, for example users at the bank's central office.
Fig: A distributed database system (client/server sites connected by a communication network)

Advantages
* Efficiency of local processing
* Data sharing

Disadvantages
* Overhead may be quite high
* Technical difficulties
6. Storage structures and their purposes
Data is maintained for future reference, so it has to be stored, and various techniques (sequential access, direct access, etc.) exist for storing and accessing it. Once the data is stored at the internal level (physical storage), it is accessed through DML operations expressed in terms of external records. These operations must be converted by the DBMS into operations on stored records, which must in turn be converted to operations at the actual hardware level, that is, operations on physical records or blocks. The component responsible for this internal/physical conversion is called an access method. The access method consists of a set of routines whose function is to conceal all device-dependent details from the DBMS and to present the DBMS with a stored record interface.
Fig: The stored record interface. The USER exchanges external record occurrences with the DBMS across the user interface; the DBMS exchanges stored record occurrences with the access method across the stored record interface; the access method exchanges physical record occurrences with storage across the physical record interface.

The stored record interface thus corresponds to the internal level, just as the user interface corresponds to the external level. The stored record interface allows the DBMS to view the storage structure as a collection of stored files, each consisting of all occurrences of one type of stored record. The DBMS knows:
* what stored files exist
* the structure of the corresponding stored records
* the stored fields on which each file is sequenced
* the stored fields which can be used for direct access
This information is specified as part of the storage structure definition. The DBMS does not know:
a) anything about physical records
b) how sequencing is performed
c) how direct access is performed
This information is specified to the access method, not to the DBMS.
Also, when a new stored record occurrence is first created and entered into the database, the access method is responsible for assigning it a unique stored record address (SRA). This value distinguishes that stored record occurrence from all others. The SRA for a particular occurrence is returned to the DBMS by the access method when the occurrence is first created, and may be used by the DBMS for subsequent direct access to the occurrence concerned. The SRA for a given occurrence does not change until the occurrence is physically moved as part of a database reorganization.
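
The division of labour between the DBMS and the access method can be sketched in Python. This is a toy model under stated assumptions: the class `AccessMethod` and its methods `create` and `fetch` are hypothetical names, and a dictionary stands in for physical storage.

```python
class AccessMethod:
    """Toy access method: hides where records live and hands out SRAs."""

    def __init__(self):
        self._storage = {}   # SRA -> stored record (stands in for physical blocks)
        self._next_sra = 0   # next unused stored record address

    def create(self, record):
        """Store a new record occurrence and return its unique SRA."""
        sra = self._next_sra
        self._next_sra += 1
        self._storage[sra] = dict(record)
        return sra

    def fetch(self, sra):
        """Direct access to a stored record occurrence by its SRA."""
        return self._storage[sra]


# The caller (playing the role of the DBMS) sees only records and SRAs.
am = AccessMethod()
sra_s1 = am.create({"S#": "S1", "Sname": "Smith", "Status": 20, "City": "London"})
sra_s2 = am.create({"S#": "S2", "Sname": "Jones", "Status": 10, "City": "Paris"})
print(am.fetch(sra_s1)["Sname"])
```

The caller never learns how or where the record is physically held; it keeps the SRA returned by `create` and uses it later for direct access.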
7. How is data stored in physical storage?
There are various possible representations of data in storage, and some of them are explained here. Consider the following example.

S#   Sname   Status   City
S1   Smith   20       London
S2   Jones   10       Paris
S3   Blake   30       Paris
S4   Clark   20       London
S5   Adams   30       Athens

The table holds information about five suppliers; for each supplier a supplier number, a supplier name, a status value, and a location are recorded. The supplier number for each supplier is unique, and the records are sequenced on this primary key. This is the simplest form of data representation: a single stored file containing five record occurrences. If there were 10000 suppliers rather than five, located in only 10 different cities, storage would be wasted repeating the 10 city names across 10000 supplier records. Instead, the city attribute can be separated into a file of its own, with a pointer from each supplier record to the appropriate city record.
The following is this second form of data representation.

Fig: Supplier file (S#, Sname, Status, City pointer) with pointers into a city file (Athens, London, Paris)

In the above figure the pointers run from the supplier file to the city file, and they are SRAs (stored record addresses). The advantage of this form of representation over the previous one is that storage space is saved.
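
The pointer representation can be sketched in Python using the five suppliers from the example. It is a simplified illustration: list positions play the role of SRAs, and the names `city_file`, `supplier_file`, `city_ptr`, and `city_of` are hypothetical.

```python
# City file: each city name is stored once. The list position stands in for an SRA.
city_file = ["Athens", "London", "Paris"]

# Supplier file: each record holds a pointer into the city file instead of the name.
supplier_file = [
    {"S#": "S1", "Sname": "Smith", "Status": 20, "city_ptr": 1},
    {"S#": "S2", "Sname": "Jones", "Status": 10, "city_ptr": 2},
    {"S#": "S3", "Sname": "Blake", "Status": 30, "city_ptr": 2},
    {"S#": "S4", "Sname": "Clark", "Status": 20, "city_ptr": 1},
    {"S#": "S5", "Sname": "Adams", "Status": 30, "city_ptr": 0},
]


def city_of(supplier):
    """Follow the pointer from a supplier record to its city record."""
    return city_file[supplier["city_ptr"]]


for supplier in supplier_file:
    print(supplier["S#"], city_of(supplier))
```

With 10000 suppliers and 10 cities, each city name would still be stored only once; every supplier record would carry just a small pointer.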
The third form of data representation is indexing. If a file is indexed on one of its attributes (usually one that is frequently used for retrieval), then access to the file by that attribute becomes much easier. The representation can be:

Fig: Supplier file (S#, Sname, Status) for suppliers S1 to S5, indexed on city; the city index (City, Supplier ptr) has entries for Athens, London, and Paris, each pointing to the corresponding supplier records

For example, the query "Find all suppliers in a given city" can be answered easily when the data is represented in this indexed form.
The purpose of indexing is to provide an access path to the file. An index is a file in which each entry (record) consists of a data value together with one or more pointers. The data value is a value for some field of the indexed file, and the pointers identify records in the indexed file having that value for that field. An index can be used in two ways: first, for sequential access to the indexed file, and second, for direct access to individual records in the indexed file on the basis of a given value for that same field.
Another form of data representation is the multilist organisation.
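
Both uses of an index can be sketched in Python with the supplier data from the example. This is only an illustration: the names `suppliers`, `city_index`, and `suppliers_in` are hypothetical, and list positions stand in for stored record addresses.

```python
from collections import defaultdict

# The indexed file: five supplier records.
suppliers = [
    {"S#": "S1", "Sname": "Smith", "Status": 20, "City": "London"},
    {"S#": "S2", "Sname": "Jones", "Status": 10, "City": "Paris"},
    {"S#": "S3", "Sname": "Blake", "Status": 30, "City": "Paris"},
    {"S#": "S4", "Sname": "Clark", "Status": 20, "City": "London"},
    {"S#": "S5", "Sname": "Adams", "Status": 30, "City": "Athens"},
]

# The index: each entry is a data value (a city) plus pointers to matching records.
city_index = defaultdict(list)
for position, record in enumerate(suppliers):
    city_index[record["City"]].append(position)


def suppliers_in(city):
    """Direct access: follow the index pointers for one city value."""
    return [suppliers[p]["S#"] for p in city_index.get(city, [])]


print(suppliers_in("Paris"))

# Sequential access: visit the records in city order by walking the index.
for city in sorted(city_index):
    print(city, [suppliers[p]["S#"] for p in city_index[city]])
```

Without the index, answering "Find all suppliers in a given city" would mean scanning every supplier record; with it, only the pointers for that one city are followed.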
8. DATA STRUCTURES AND CORRESPONDING OPERATORS
The range of data structures supported at the user level is a factor that critically affects many components of the system. It dictates the design of the corresponding data manipulation language, since each DML operation must be defined in terms of its effect on those data structures. Database systems may be categorized according to the approach they take; the best-known approaches are:
* Relational approach
* Hierarchical approach
* Network approach
The relational approach
The relational approach uses a collection of tables to represent both data and the relationships among those data. Each table has multiple columns and each column has a unique name.
Sample relational database
Customer
customer-name   social-security-no.   customer-street   customer-city   account-no.
Johnson         192-83-7465           Alma              Palo Alto       A-101
Smith           019-28-3746           North             Rye             A-215
Hayes           677-28-9011           Main              Harrison        A-102
Turner          182-73-6091           Putnam            Stamford        A-305
Johnson         192-83-7465           Alma              Palo Alto       A-201
Jones           321-12-3123           Main              Harrison        A-217
Lindsay         336-66-9999           Park              Pittsfield      A-222
Smith           019-28-3746           North             Rye             A-201

Accounts
account-no   balance
A-101        500
A-215        700
A-102        400
A-305        350
A-201        900
A-217        750
A-222        700
For example, customer Johnson, whose social-security-no. is 192-83-7465, lives on Alma in Palo Alto and has two accounts: A-101 with balance 500 and A-201 with balance 900. Also, Smith and Johnson share account A-201.
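
The way the two tables are combined through the shared account-no. column can be sketched in Python. This is a plain-Python illustration rather than a relational query language; the names `customers`, `accounts`, `accounts_of`, and `holders_of` are hypothetical, while the data is taken from the sample tables.

```python
# Customer table: (customer-name, social-security-no., street, city, account-no.)
customers = [
    ("Johnson", "192-83-7465", "Alma", "Palo Alto", "A-101"),
    ("Smith", "019-28-3746", "North", "Rye", "A-215"),
    ("Hayes", "677-28-9011", "Main", "Harrison", "A-102"),
    ("Turner", "182-73-6091", "Putnam", "Stamford", "A-305"),
    ("Johnson", "192-83-7465", "Alma", "Palo Alto", "A-201"),
    ("Jones", "321-12-3123", "Main", "Harrison", "A-217"),
    ("Lindsay", "336-66-9999", "Park", "Pittsfield", "A-222"),
    ("Smith", "019-28-3746", "North", "Rye", "A-201"),
]

# Accounts table: account-no. -> balance
accounts = {
    "A-101": 500, "A-215": 700, "A-102": 400, "A-305": 350,
    "A-201": 900, "A-217": 750, "A-222": 700,
}


def accounts_of(name):
    """Match customer rows to account rows on the shared account number."""
    return sorted((acct, accounts[acct])
                  for (n, _ssn, _street, _city, acct) in customers if n == name)


def holders_of(account_no):
    """Find every customer who holds the given account."""
    return sorted(n for (n, _ssn, _street, _city, acct) in customers
                  if acct == account_no)


print(accounts_of("Johnson"))
print(holders_of("A-201"))
```

The relationship between a customer and an account is carried by a value (the account number) appearing in both tables, not by a pointer; this is what distinguishes the relational approach from the network and hierarchical approaches.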
Network model
Data in the network model are represented by collections of records, and relationships among data are represented by links, which can be viewed as pointers.

Fig: Sample network database (customer records such as Johnson, 192-83-7465, Alma, Palo Alto and Smith, 019-28-3746, North, Rye are linked to account records such as A-101, 500 and A-215, 700)

Hierarchical Model
This form of data representation is similar to the network model in that data and relationships among data are represented by records and links, respectively. It differs from the network model in that the records are organized as collections of trees rather than arbitrary graphs.
9. Advantages of using a DBMS
Many enterprises choose to store their operational data in an integrated database because it provides the enterprise with centralized control of that data, which is a valuable asset.
The DBA has central responsibility for the operational data. The advantages of storing data under centralized control are as follows.
1. Redundancy can be reduced
In a non-database system each application has its own private files, which may cause redundancy in stored data. Integration avoids this.
2. Inconsistency can be avoided (to some extent)
Suppose the fact that employee E3 works in department D8 is represented by two distinct entries in the database, and the system is not aware of this duplication. If only one of the entries is updated, the two will no longer agree and the database will be in an inconsistent state. If the redundancy is controlled, the system can guarantee that the database is never inconsistent as seen by the user, by ensuring that any change made to either of the two entries is automatically made to the other. This process is known as propagating updates.
3. The data can be shared
Existing applications can share the data, and new applications can be developed to operate against the same stored data.
4. Security restrictions can be applied
Users can access the database only if they hold the necessary permissions. The permissions are granted by the DBA, which keeps the data secure.
5. Integrity can be maintained
Integrity means ensuring that the data in the database is accurate. Under centralized control, validation checks can be defined and applied whenever the data is updated.
10. Database Administrator
One of the main reasons for using a DBMS is to have central control of both the data and the programs that access those data. The person who has such central control over the system is called the database administrator (DBA). The functions of the DBA include the following.
Schema definition: The DBA creates the original database schema by writing a set of definitions that is translated by the DDL compiler into a set of tables that is stored permanently in the data dictionary.
Storage structure and access-method definition: The DBA creates appropriate storage structures and access methods by writing a set of definitions, which is translated by the data-storage and data-definition-language compiler.
Schema and physical-organization modification: The relatively rare modifications either to the database schema or to the description of the physical storage organization are accomplished by writing a set of definitions that is used by either the DDL compiler or the data-storage and data-definition-language compiler.
Granting of authorization for data access: Granting different types of authorization allows the DBA to regulate which parts of the database various users can access.
Integrity-constraint specification: The DBA specifies constraints (conditions) that data entered into the database must satisfy. For example, the balance in an account must be at least 500.
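
Schema definition and integrity-constraint specification can be tried out with Python's built-in `sqlite3` module. SQLite is used here only as a convenient stand-in for a DBMS; the table name `account`, its columns, and the account number A-999 are assumptions made for the illustration.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Schema definition with an integrity constraint: the balance must be at least 500.
conn.execute(
    "CREATE TABLE account ("
    " account_no TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 500))"
)

# This row satisfies the constraint and is accepted.
conn.execute("INSERT INTO account VALUES ('A-101', 500)")

# This row violates the constraint, so the DBMS rejects it.
try:
    conn.execute("INSERT INTO account VALUES ('A-999', 400)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute("SELECT account_no, balance FROM account").fetchall()
print(rejected, rows)
```

The constraint is stated once, in the schema, and is then enforced on every insertion without each application program having to repeat the check.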
DATABASE MANAGEMENT SYSTEM UNIT I
Objective questions
1. Database is a) Computer-based billing system b) Computer-based record keeping system c) Computer-based animation system
2. The software used for access to the database is a) BASIC b) PASCAL c) DBMS
3. The end-users access the database from the terminal using a) Query language b) English language c) C language
4. DBA stands for a) Data Base Administrator b) Data Base Access c) Data Batch Administration
5. Which of the following is not operational data a) Product data b) Account data c) Two numbers
6. The database system provides the enterprise with ___________ control of its operational data a) Centralized b) Single c) Shared
7. The ability to modify the schema definition in one level without affecting the schema in the other level is called a) Data dependence b) Data independence c) Data abstraction
8. Which of the following is not a level of database architecture a) External b) Logical c) Super d) Conceptual
9. Data sub language is a combination of a) DDL and DML b) DDL and TCL c) C and C++
10. A database that is not stored in a single physical location in its entirety and is spread across the network is a) Centralized database b) Distributed database c) Shared database
11. DBMS is a) A software that handles all access to the database b) A hardware c) An interface between end-user and computer
12. The component responsible for internal/physical conversion is called a) Access method b) Internal conversion c) A hardware
13. SRA is a) Stored Record Array b) Stored Record Access c) Stored Record Address
14. Primary key is the key which a) Avoids duplication of data b) Supports duplication of data c) Allows null values
15. The data is represented in terms of 1) Relational approach 2) Hierarchical approach 3) Network approach a) 1,2 b) 1,2,3 c) None of the above
16. The representation of data in relational approach 1) Tables 2) Tuples 3) Relations a) 1 b) 1,2 c) 1,2,3 d) None
17. The data represented in network approach is through a) Records and links b) Tables c) Trees
18. The ___________ permits the DBMS to view the storage structure as a collection of stored files. a) Stored record interface b) Stored record address c) Access method
19. Entity is a) Any distinguishable real world object b) Not an object c) Incident
20. DBMS stands for a) Data Base Management System b) Database Multimedia System c) Data Base Management Standards
Short questions
1. What are the basic components of a database system?
2. Explain the components of a database system with the simplified diagram.
3. What is operational data?
4. Explain operational data with an example.
5. Explain data independence.
6. Why is a database system adopted rather than a file system? Or: write down the advantages of a database system.
7. Distinguish between input, output, and operational data.
8. Explain the three levels of a database system in brief.
9. What is the role of the DBA?
10. What are the functions of the DBMS?
11. Explain distributed databases in brief.
12. Relate distributed databases to client/server architecture.
13. Explain access method, SRA, SRI.
14. Differentiate the relational, network, and hierarchical approaches.
15. Explain any one form of data representation.
Elaborate questions
1. Role of the DBA, with a detailed explanation of any one function.
2. DBMS and its functions, advantages, and disadvantages.
3. Database systems are widely adopted nowadays. Justify.
4. Explain the architecture of a database system.
5. Explain a database system with the simplified structure.
6. Explain storage structures with at least one representation.
7. Explain the various data structures used to represent data in a database system.
Course : B.Com CA
Semester : III
Subject : Data Base Management System
Unit : Two
Unit II
Syllabus
Relational approach: Relational data structure: relation, domain, attributes, keys
Relational algebra: Introduction, traditional set operations, attribute names for derived relations, special relational operations.
Books for Reference:
Database System Concepts - Abraham Silberschatz, Henry F. Korth, S. Sudarshan
An Introduction to Database Systems - C. J. Date
Principles of Database Systems - Jeffrey D. Ullman
An Introduction to Database Systems - Bipin C. Desai
Relational Approach
Introduction:
The relational model has established itself as the primary data model for commercial data-processing applications. The first database systems were based on either the network model or the hierarchical model. The relational model is now being used in numerous applications outside the domain of traditional data processing.
Structure of relational databases.
A relational database consists of a collection of tables, each of which is assigned a unique name. A row in a table represents a relationship among a set of values. The rows are termed as tuples and columns are termed as attributes. Since a table is a collection of such relationships, there is a close correspondence between the concept of table and the mathematical concept relation, from which the relational data model takes its name.
The following account table, or relation, has three column headers: branch-name, account-number, and balance. These are its attributes (columns are referred to as attributes). For each attribute there is a set of permitted values, called the domain of that attribute. For the attribute branch-name, for example, the domain is the set of all branch names.
The account relation

Branch-name   Account-number   Balance
Downtown      A-101            500
Mianus        A-215            700
Perryridge    A-102            400
Round Hill    A-305            350
Brighton      A-201            900
Redwood       A-222            700
Brighton      A-217            750

Let D1 denote the set of all branch names, D2 the set of all account numbers, and D3 the set of all balances. Any row of the account relation must be a 3-tuple (v1, v2, v3), where v1 is a branch name (v1 is in D1), v2 is an account number (v2 is in D2), and v3 is a balance (v3 is in D3). In general, account will contain only a subset of the set of all possible rows, so account is a subset of D1 × D2 × D3. In general, a table of n attributes must be a subset of D1 × D2 × … × Dn-1 × Dn.
A relation is thus a subset of a Cartesian product of a list of domains. Because tables are essentially relations, the mathematical terms relation and tuple are used in place of the terms table and row, respectively. In the account relation above there are seven tuples. Let the tuple variable t refer to the first tuple of the relation. We use the notation t[branch-name] to denote the value of t on the branch-name attribute. Thus t[branch-name] = "Downtown" and t[balance] = 500. Since a relation is a set of tuples, we use the mathematical notation t ∈ r to denote that tuple t is in relation r.
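
The idea of a relation as a subset of a Cartesian product can be checked with a short Python sketch. The domains here are deliberately tiny, illustrative subsets of the real ones, and the names `all_rows`, `attributes`, and `value` are hypothetical.

```python
from itertools import product

# Small illustrative domains (the real domains are far larger).
D1 = {"Downtown", "Mianus"}   # branch names
D2 = {"A-101", "A-215"}       # account numbers
D3 = {500, 700}               # balances

# D1 x D2 x D3: every possible 3-tuple, 2 * 2 * 2 = 8 in all.
all_rows = set(product(D1, D2, D3))

# The relation holds only the tuples that are actually true of the bank.
account = {("Downtown", "A-101", 500), ("Mianus", "A-215", 700)}

attributes = ("branch-name", "account-number", "balance")


def value(t, attribute):
    """The notation t[attribute]: the value of tuple t on the named attribute."""
    return t[attributes.index(attribute)]


t = ("Downtown", "A-101", 500)
print(len(all_rows), account <= all_rows)   # size of the product, subset test
print(value(t, "branch-name"), t in account)
```

Here `account <= all_rows` is the subset test, and `t in account` corresponds to the notation t ∈ r.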
Domain: A domain is a pool of values. A domain is atomic if the elements of the domain are considered to be indivisible units. For example, the set of integers is an atomic domain, but the set of all sets of integers is a nonatomic domain. The distinction is that we do not normally consider integers to have subparts, but we do consider sets of integers to have subparts, namely the integers comprising the set.
It is possible for several attributes to have the same domain. For example, suppose that we have a relation customer that has the three attributes customer-name, customer-street, and customer-city, and a relation employee that includes the attribute employee-name. It is possible that the attributes customer-name and employee-name will have the same domain: the set of all person names. The domains of balance and branch-name are certainly distinct. It is perhaps less clear whether customer-name and branch-name should have the same domain. At the physical level, both customer names and branch names are character strings. However, at the logical level, we may want customer-name and branch-name to have distinct domains.

The customer relation

Customer-name   Customer-street   Customer-city
Jones           Main              Harrison
Smith           North             Rye
Hayes           Main              Harrison
Curry           North             Rye
Lindsay         Park              Pittsfield
Turner          Putnam            Stamford
Williams        Nassau            Princeton
Adams           Spring            Pittsfield
Johnson         Alma              Palo Alto
Glenn           Sand Hill         Woodside
Brooks          Senator           Brooklyn
Green           Walnut            Stamford
Relation:
Definition of a relation (mathematically): Given a collection of sets D1, D2, …, Dn (not necessarily distinct), R is a relation on those n sets if it is a set of ordered n-tuples <d1, d2, …, dn> such that d1 belongs to D1, d2 belongs to D2, …, dn belongs to Dn. The sets D1, D2, …, Dn are the domains of R. The value of n is the degree of R.
The concept of a relation corresponds to the programming-language notion of a variable. The concept of a relation schema corresponds to the programming-language notion of a type definition. It is convenient to give a name to a relation schema, just as we give names to type definitions in programming languages. We adopt the convention of using lowercase names for relations, and names beginning with an uppercase letter for relation schemas. For example,
Account-schema=(branch-name, account-number, balance)
Relations can be expressed diagrammatically with the help of E-R diagrams. Before discussing E-R diagrams, we look at the common terms used in them.
Entity: An entity is a thing or object in the real world that is distinguishable from all other objects. For example, each person in an enterprise is an entity. An entity has a set of properties, and the values for some set of properties may uniquely identify an entity. For example, the social-security number 677-89-9011 (employee number 1111) uniquely identifies one particular person in the enterprise.
Entity Set: An entity set is a set of entities of the same type that share the same properties or attributes. The set of all persons who are customers at a given bank, for example, can be defined as the entity set customer.
Attributes: An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of an entity set. Possible attributes of the customer entity set are customer-number, customer-street, and customer-city. An attribute, as used in the E-R model, can be characterized by the following attribute types.
Simple and Composite attributes: Attributes that cannot be divided into subparts are simple attributes; attributes that can be divided into subparts are composite attributes. For example, name is a composite attribute, which is a combination of first-name, middle-name, and last-name.
Single-valued and Multivalued attributes: The attributes that we have specified in our examples all have a single value for a particular entity. For instance, the loan-number attribute for a specific loan entity refers to only one loan number. Such attributes are said to be single valued. There may be instances where an attribute has a set of values for a specific entity. Such attributes are said to be multivalued.
Null attributes: A null value is used when an entity does not have a value for an attribute.
Derived attribute: The value for this type of attribute can be derived from the values of other related attributes or entities. For instance, let us say that the customer entity set has an attribute loans-held, which represents how many loans a customer has from the bank. We can derive the value for this attribute by counting the number of loan entities associated with that customer.
Relationship sets
Consider the relation loan.

Branch-name    Loan-number    Amount
Downtown       L-17           1000
Redwood        L-23           2000
Perryridge     L-15           1500
Downtown       L-14           1500
Mianus         L-93           500
Round Hill     L-11           900
Perryridge     L-16           1300
A relationship is an association among several entities. For example, we can define a relationship that associates customer Hayes with loan number L-15. This relationship specifies that Hayes is a customer with loan number L-15.
A relationship set is a set of relationships of the same type. Formally, it is a mathematical relation on n ≥ 2 (possibly nondistinct) entity sets. If E1, E2, …, En are entity sets, then a relationship set R is a subset of
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}
where (e1, e2, …, en) is a relationship.
Consider the two entity sets customer and loan. We can define the relationship set borrower to denote the association between customers and the bank loans that the customers have. As another example, consider the two entity sets loan and branch. We can define the relationship set loan-branch to denote the association between a bank loan and the branch in which that loan is maintained.
Each row of the table represents one n-tuple of the relation. The number of tuples in the relation is called the cardinality of the relation. Eg. The cardinality of the relation loan is 7.
The relations may be unary, binary, ternary, n-ary etc.
Unary: Relations of degree one are unary.
For example, the query "Find the branch name that issued the loan with number L-17" produces a unary relation. The output will be

Branch-name
Downtown

Binary: Relations of degree two are binary.
For example, the query "Find the branch-name and amount for loan-number L-17 from the loan relation" produces a binary relation. The output will be

Branch-name    Amount
Downtown       1000
Ternary: Relations of degree three are ternary
N-ary: Relations of degree n are n-ary.
Mapping cardinalities: Mapping cardinalities, or cardinality ratios, express the number of entities to which another entity can be associated via relationship set. Mapping cardinalities are most useful in describing binary relationship sets, although occasionally they contribute to the description of relationship sets that involve more than two entity sets. For binary relationship set R between sets A and B, the mapping cardinality must be one of the following:
One to one: An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A.
One to many: An entity in A is associated with any number of entities in B. An entity in B, however, can be associated with at most one entity in A.
Many to one: An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number of entities in A.
Many to many: An entity in A is associated with any number of entities in B, and an entity in B is associated with any number of entities in A.
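The four cases can be told apart mechanically. The following Python sketch classifies the pairs currently stored in a binary relationship set; the function name mapping_cardinality is hypothetical. It only inspects one instance of the data, so it shows which case the stored pairs fall into, not which constraint the designer intended.

```python
def mapping_cardinality(pairs):
    """Classify the (a, b) pairs of a binary relationship set between A and B."""
    a_to_b, b_to_a = {}, {}
    for a, b in pairs:
        a_to_b.setdefault(a, set()).add(b)
        b_to_a.setdefault(b, set()).add(a)
    a_has_many = any(len(bs) > 1 for bs in a_to_b.values())   # some a has several b's
    b_has_many = any(len(as_) > 1 for as_ in b_to_a.values())  # some b has several a's
    if a_has_many and b_has_many:
        return "many to many"
    if a_has_many:
        return "one to many"
    if b_has_many:
        return "many to one"
    return "one to one"

# (customer-name, loan-number) and (customer-name, account-number) pairs
borrower = {("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"), ("Jackson", "L-14"),
            ("Curry", "L-93"), ("Smith", "L-11"), ("Williams", "L-17"), ("Adams", "L-16")}
depositor = {("Johnson", "A-101"), ("Smith", "A-215"), ("Hayes", "A-102"), ("Turner", "A-305"),
             ("Johnson", "A-201"), ("Jones", "A-217"), ("Lindsay", "A-222")}

print(mapping_cardinality(borrower))    # Smith has two loans, L-17 has two customers
print(mapping_cardinality(depositor))   # Johnson has two accounts, each account one customer
```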
Keys:
In a relation there is usually an attribute, or a set of attributes, whose values are unique within the relation and can therefore be used to identify the tuples of that relation.
For example, in the loan relation above, the loan number can be considered a key: it is unique, and it distinguishes each tuple from all other tuples in that relation. Before discussing the various keys, let us glance at integrity constraints.
Integrity constraints:
An integrity constraint is a mechanism used by Oracle to prevent invalid data entry into a table. It is a rule enforced on a column of a table. The following are the various types of integrity constraints:
*Domain integrity constraints
Maintain values according to a specification such as the 'not null' condition, so that the user has to enter a value for the column on which it is specified. 'Not null' and 'Check' constraints fall under this category.
*Entity integrity constraint
Maintains uniqueness in a record.
*Referential integrity constraint
Enforces relationship between tables
To establish a 'parent-child' or 'master-detail' relationship between two tables having a common column, we make use of referential integrity constraints. To implement this, we define the column in the parent table as a primary key and the same column in the child table as a foreign key referring to the corresponding parent entry. A constraint can be defined at either the table level or the column level. If it is defined at the table level, it can apply to any number of columns in the table. If it is defined at the column level, it applies only to the column for which it is defined.
The various keys related to the relational approach are
Primary Key: A primary key is a set of one or more attributes that, taken collectively, allow us to identify uniquely an entity in the entity set.
Examples: 1) Loan-number in the loan relation. 2) The combination of branch-name and loan-number also identifies each tuple uniquely, although it is not minimal.
Candidate Key: A minimal set of attributes that uniquely identifies a tuple is a candidate key. Several distinct sets of attributes could serve as candidate keys; one of them is chosen as the primary key.
Referenced key: A unique or primary key that is defined on a column belonging to the parent table.
Foreign Key: A column or combination of columns included in the definition of referential integrity, which refers to a referenced key.
Child table: This table depends upon the values present in the referenced key of the parent table, to which its foreign key refers.
Parent table: This table determines whether an insertion or update of data can be done in the child table. It is referred to by the child table's foreign key.
On delete cascade clause
If rows of the parent table are deleted, then all rows in the child table whose foreign key depends on the deleted referenced-key values are also deleted automatically.
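The parent-child behaviour described above can be tried out with a small Python sketch. It uses the sqlite3 module from the Python standard library as a stand-in for Oracle (an assumption made so that the example is self-contained); the table and column names follow the branch and loan relations of this unit, with hyphens replaced by underscores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled

# Parent table: branch_name is the referenced (primary) key.
conn.execute("CREATE TABLE branch (branch_name TEXT PRIMARY KEY, branch_city TEXT)")
# Child table: branch_name is a foreign key with ON DELETE CASCADE.
conn.execute("""CREATE TABLE loan (
    loan_number TEXT PRIMARY KEY,
    branch_name TEXT REFERENCES branch(branch_name) ON DELETE CASCADE,
    amount      INTEGER)""")

conn.execute("INSERT INTO branch VALUES ('Downtown', 'Brooklyn')")
conn.execute("INSERT INTO loan VALUES ('L-17', 'Downtown', 1000)")

# Referential integrity: a child row must refer to an existing parent row.
try:
    conn.execute("INSERT INTO loan VALUES ('L-99', 'Nowhere', 500)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# ON DELETE CASCADE: deleting the parent row removes its dependent child rows.
conn.execute("DELETE FROM branch WHERE branch_name = 'Downtown'")
remaining = conn.execute("SELECT COUNT(*) FROM loan").fetchone()[0]

print(rejected, remaining)
```

The insert of loan L-99 is refused because no branch named Nowhere exists, and deleting the Downtown branch removes loan L-17 along with it.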
Entity-Relationship Diagrams:
An E-R diagram can express the overall logical structure of a database graphically. Such a diagram consists of the following major components:
The symbol used to represent an entity set is a rectangle.
The symbol used to represent an attribute is an ellipse.
The symbol used to represent links is a line.
The symbol used to represent a relationship set is a diamond.
The symbol used to represent a multivalued attribute is a double ellipse.
The symbol used to represent a derived attribute is a dashed ellipse.
The symbol used to represent the total participation of an entity in a relationship set is a double line.
E-R diagram for a Banking-Enterprise

[Figure: E-R diagram for a banking enterprise. Entity sets and their attributes: account (account-number, balance), branch (branch-name, branch-city, assets), customer (customer-name, customer-street, customer-city), loan (loan-number, amount). Relationship sets: account-branch, depositor, borrower, loan-branch.]

Various relations used for the discussion of this chapter are

1. Account relation

Branch-name    Account-number    Balance
Downtown       A-101             500
Mianus         A-215             700
Perryridge     A-102             400
Round Hill     A-305             350
Brighton       A-201             900
Redwood        A-222             700
Brighton       A-217             750

2. Loan relation

Branch-name    Loan-number    Amount
Downtown       L-17           1000
Redwood        L-23           2000
Perryridge     L-15           1500
Downtown       L-14           1500
Mianus         L-93           500
Round Hill     L-11           900
Perryridge     L-16           1300

3. Branch relation

Branch-name    Branch-city    Assets
Downtown       Brooklyn       9000000
Redwood        Palo Alto      2100000
Perryridge     Horse neck     1200000
Mianus         Horse neck     400000
Round Hill     Horse neck     8000000
Pownal         Bennington     300000
North town     Rye            3700000
Brighton       Brooklyn       7100000

4. Customer relation

Customer-name    Customer-street    Customer-city
Jones            Main               Harrison
Smith            North              Rye
Hayes            Main               Harrison
Curry            North              Rye
Lindsay          Park               Pittsfield
Turner           Putnam             Stamford
Williams         Nassau             Princeton
Adams            Spring             Pittsfield
Johnson          Alma               Palo Alto
Glenn            Sand Hill          Woodside
Brooks           Senator            Brooklyn
Green            Walnut             Stamford

5. Depositor relation

Customer-name    Account-number
Johnson          A-101
Smith            A-215
Hayes            A-102
Turner           A-305
Johnson          A-201
Jones            A-217
Lindsay          A-222

6. Borrower relation

Customer-name    Loan-number
Jones            L-17
Smith            L-23
Hayes            L-15
Jackson          L-14
Curry            L-93
Smith            L-11
Williams         L-17
Adams            L-16
Relational Algebra
Note: Query languages
A query language is a language in which a user requests information from the database. These languages are typically of a level higher than that of a standard programming language. Query languages can be categorized as being either procedural or non-procedural. In a procedural language, the user instructs the system to perform a sequence of operations on the database to compute the desired result. In a non-procedural language, the user describes the information desired without giving a specific procedure for obtaining that information.
Introduction
Relational algebra is a collection of operations on relations. It is a procedural query language: it consists of a set of operations that take one or two relations as input and produce a new relation as their result.
The fundamental operations available in relational algebra are select, project, set difference, Cartesian product, rename and union. In addition to the fundamental operations, there are several other operations, namely set intersection, natural join, division, and assignment. These operations are defined in terms of the fundamental operations. Union, intersection, difference and Cartesian product are the traditional set operations, while selection, projection, join and division are the special relational operators.
Fundamental operations
The select, project and rename operations are called unary operations, because they operate on one relation. The other three operations, union, set difference and Cartesian product, operate on pairs of relations and are therefore called binary operations.
The select operation
The select operation selects tuples that satisfy a given predicate. The lowercase Greek letter sigma (σ) is used to denote selection. The predicate appears as a subscript to σ. The argument relation is given in parentheses following the σ.

Example:
1. Select those tuples of the loan relation where the branch is "Perryridge".

σ branch-name = "Perryridge" (loan)

The result of the query is

Branch-name    Loan-number    Amount
Perryridge     L-15           1500
Perryridge     L-16           1300

2. Find all tuples in which the amount lent is more than $1200.

σ amount > 1200 (loan)

Comparisons using =, ≠, <, ≤, >, ≥ are allowed in the selection predicate. We can also combine several predicates into a larger predicate using the connectives and (∧) and or (∨).

3. Find those tuples pertaining to loans of more than $1200 made by the Perryridge branch.

σ branch-name = "Perryridge" ∧ amount > 1200 (loan)

Since both Perryridge loans exceed $1200, the result is the same relation as in example 1.
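Selection can be sketched in Python by treating a relation as a set of tuples and the predicate as a function. The helper name select and the positional layout (branch-name, loan-number, amount) are assumptions made for illustration.

```python
# loan relation as a set of (branch_name, loan_number, amount) tuples
loan = {
    ("Downtown", "L-17", 1000), ("Redwood", "L-23", 2000),
    ("Perryridge", "L-15", 1500), ("Downtown", "L-14", 1500),
    ("Mianus", "L-93", 500), ("Round Hill", "L-11", 900),
    ("Perryridge", "L-16", 1300),
}

def select(relation, predicate):
    """Return the tuples of the relation that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

perryridge = select(loan, lambda t: t[0] == "Perryridge")
over_1200 = select(loan, lambda t: t[2] > 1200)
both = select(loan, lambda t: t[0] == "Perryridge" and t[2] > 1200)

print(sorted(perryridge))
print(len(over_1200))
```

The three calls correspond to examples 1, 2 and 3 above; the connective "and" in the last predicate plays the role of ∧.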
The project operation
Suppose we want to list all loan numbers and the amounts of the loans, but do not care about the branch name. The project operation allows us to produce this relation. The project operation is a unary operation that returns its argument relation with certain attributes left out. Since a relation is a set, any duplicate rows are eliminated. Projection is denoted by the Greek letter pi (π). We list the attributes that we wish to appear in the result as a subscript to π. The argument relation follows in parentheses.

Example:
1. List all loan numbers and the amount of each loan. The corresponding query is

π loan-number, amount (loan)

The relation that results from this query is

Loan-number    Amount
L-17           1000
L-23           2000
L-15           1500
L-14           1500
L-93           500
L-11           900
L-16           1300
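A minimal Python sketch of projection follows, again with a relation held as a set of tuples. The helper project and the explicit schema list are assumptions for illustration. Because the result is a set, duplicate rows disappear automatically, which is easiest to see when projecting on amount alone.

```python
loan_schema = ["branch_name", "loan_number", "amount"]
loan = {
    ("Downtown", "L-17", 1000), ("Redwood", "L-23", 2000),
    ("Perryridge", "L-15", 1500), ("Downtown", "L-14", 1500),
    ("Mianus", "L-93", 500), ("Round Hill", "L-11", 900),
    ("Perryridge", "L-16", 1300),
}

def project(relation, schema, attrs):
    """Keep only the named attributes; the set removes duplicate rows."""
    positions = [schema.index(a) for a in attrs]
    return {tuple(t[i] for i in positions) for t in relation}

numbers_and_amounts = project(loan, loan_schema, ["loan_number", "amount"])
amounts = project(loan, loan_schema, ["amount"])

print(len(numbers_and_amounts))   # one row per loan
print(len(amounts))               # 1500 occurs twice in loan but once here
```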
The set difference operation
The set-difference operation, denoted by -, allows us to find tuples that are in one relation but are not in another. The expression r – s results in a relation containing those tuples in r but not in s.
Example:
1. Find all customers of the bank who have an account but not a loan.

π customer-name (depositor) − π customer-name (borrower)

The result will be

Customer-name
Johnson
Turner
Lindsay
For a set difference operation r-s to be valid, we require that the relations r and s be of the same arity, and that the domains of the ith attribute of r and the ith attribute of s be the same.
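Python's built-in set type has a difference operator, so the example can be checked directly. The two sets below are the customer-name projections of the depositor and borrower relations listed earlier in this unit.

```python
# π customer-name (depositor) and π customer-name (borrower)
depositor_names = {"Johnson", "Smith", "Hayes", "Turner", "Jones", "Lindsay"}
borrower_names = {"Jones", "Smith", "Hayes", "Jackson", "Curry", "Williams", "Adams"}

# r - s: tuples that are in r but not in s
account_but_no_loan = depositor_names - borrower_names

print(sorted(account_but_no_loan))
```

Both operands here have a single attribute drawn from the same domain, which is the compatibility condition stated above.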
The cartesian – product operation
The Cartesian-product operation, denoted by a cross (×), allows us to combine information from any two relations. We write the Cartesian product of relations r1 and r2 as r1 × r2. Since the same attribute name may appear in both r1 and r2, we need to devise a naming scheme to distinguish between these attributes. We do so here by attaching to an attribute the name of the relation from which the attribute originally came. For example, the relation schema for r = borrower × loan is
(borrower.customer-name, borrower.loan-number, loan.branch-name, loan.loan-number, loan.amount)
So now we can distinguish borrower.loan-number from loan.loan-number. For those attributes that appear in only one of the two schemas, we shall usually drop the relation-name prefix. We can then write the relation schema for r as
(customer-name, borrower.loan-number, branch-name, loan.loan-number, amount)
This naming convention requires that the relations that are arguments of the Cartesian-product operation have distinct names.
Assume that we have n1 tuples in borrower and n2 tuples in loan. Then there are n1 * n2 ways of choosing a pair of tuples, one tuple from each relation, so there are n1 * n2 tuples in r. In particular, note that for some tuples t in r, it may be that t[borrower.loan-number] is not equal to t[loan.loan-number]. In general, if we have relations r1(R1) and r2(R2), then r1 × r2 is a relation whose schema is the concatenation of R1 and R2. Relation r contains all tuples t for which there is a tuple t1 in r1 and a tuple t2 in r2 for which t[R1] = t1[R1] and t[R2] = t2[R2].
For example

1. Suppose we want to find the names of all customers who have a loan at the Perryridge branch. We need the information in both the loan relation and the borrower relation to do so. If we write

σ branch-name = "Perryridge" (borrower × loan)

then the selection is applied to the relation borrower × loan, part of which is shown below.

Customer-name    Borrower.loan-number    Branch-name    Loan.loan-number    Amount
Jones            L-17                    Downtown       L-17                1000
Jones            L-17                    Redwood        L-23                2000
…                …                       …              …                   …
Adams            L-16                    Round Hill     L-11                900
Adams            L-16                    Perryridge     L-16                1300

Table: Result of borrower × loan

The output of the query stated above will be

Customer-name    Borrower.loan-number    Branch-name    Loan.loan-number    Amount
Jones            L-17                    Perryridge     L-15                1500
Jones            L-17                    Perryridge     L-16                1300
Smith            L-23                    Perryridge     L-15                1500
Smith            L-23                    Perryridge     L-16                1300
Hayes            L-15                    Perryridge     L-15                1500
Hayes            L-15                    Perryridge     L-16                1300
Jackson          L-14                    Perryridge     L-15                1500
Jackson          L-14                    Perryridge     L-16                1300
Curry            L-93                    Perryridge     L-15                1500
Curry            L-93                    Perryridge     L-16                1300
Smith            L-11                    Perryridge     L-15                1500
Smith            L-11                    Perryridge     L-16                1300
Williams         L-17                    Perryridge     L-15                1500
Williams         L-17                    Perryridge     L-16                1300
Adams            L-16                    Perryridge     L-15                1500
Adams            L-16                    Perryridge     L-16                1300

Table: Result of σ branch-name = "Perryridge" (borrower × loan)

This relation contains only loans made by the Perryridge branch. However, because the Cartesian product pairs every borrower tuple with every loan tuple, the customer-name column may contain customers who do not have a loan at the Perryridge branch. To keep only the pairs that refer to the same loan, the query is rewritten as

σ borrower.loan-number = loan.loan-number (σ branch-name = "Perryridge" (borrower × loan))

In order to retrieve only the customer-name, we apply the projection operation:

π customer-name (σ borrower.loan-number = loan.loan-number (σ branch-name = "Perryridge" (borrower × loan)))

The result is as shown below

Customer-name
Hayes
Adams

Table: Result of π customer-name (σ borrower.loan-number = loan.loan-number (σ branch-name = "Perryridge" (borrower × loan)))
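The whole pipeline of this example, Cartesian product, then two selections, then a projection, can be traced in a short Python sketch. Relations are held as sets of tuples, and the column positions noted in the comments are assumptions of the sketch.

```python
from itertools import product

borrower = {("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"), ("Jackson", "L-14"),
            ("Curry", "L-93"), ("Smith", "L-11"), ("Williams", "L-17"), ("Adams", "L-16")}
loan = {("Downtown", "L-17", 1000), ("Redwood", "L-23", 2000), ("Perryridge", "L-15", 1500),
        ("Downtown", "L-14", 1500), ("Mianus", "L-93", 500), ("Round Hill", "L-11", 900),
        ("Perryridge", "L-16", 1300)}

# borrower x loan: every borrower tuple concatenated with every loan tuple.
# Columns: 0 customer-name, 1 borrower.loan-number, 2 branch-name,
#          3 loan.loan-number, 4 amount
r = {b + l for b, l in product(borrower, loan)}

at_perryridge = {t for t in r if t[2] == "Perryridge"}    # first selection
same_loan = {t for t in at_perryridge if t[1] == t[3]}    # second selection
names = {t[0] for t in same_loan}                         # projection

print(len(r), len(at_perryridge), sorted(names))
```

The product has n1 * n2 = 8 * 7 tuples, the first selection keeps 16 of them, and only the rows where the two loan numbers agree survive the second selection.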
The rename operation
Unlike relations in the database, the results of relational-algebra expressions do not have a name that we can use to refer to them. It is useful to be able to give them names; the rename operator, denoted by the lowercase Greek letter rho (ρ), lets us perform this task.
Given a relational-algebra expression E, the expression ρ x (E) returns the result of expression E under the name x.
A relation r by itself is considered to be a trivial relational-algebra expression. Thus, we can also apply the rename operation to a relation r to get the same relation under a new name.
A second form of the rename operation is as follows. Assume that a relational-algebra expression E has arity n. Then the expression ρ x(A1, A2, …, An) (E) returns the result of expression E under the name x, and with the attributes renamed to A1, A2, …, An.

For example,

1. Find the largest balance in the bank.
The steps involved are
First, compute a temporary relation consisting of those balances that are not the largest.
Then take the set difference between the relation π balance (account) and the temporary relation.
The corresponding queries are

π account.balance (σ account.balance < d.balance (account × ρ d (account)))

This expression gives those balances in the account relation for which a larger balance appears somewhere in the account relation (renamed as d). The result contains all balances except the largest one. The relation is

Balance
500
700
400
350
750

The query to find the largest account balance in the bank can be written as follows:

π balance (account) − π account.balance (σ account.balance < d.balance (account × ρ d (account)))

The result of this query is

Balance
900

Fig: Largest account balance in the bank
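The two steps can be followed in Python. In this sketch the renamed copy d of account is simply the second loop variable over the same set, which is an assumption about how to mimic ρ in a general-purpose language rather than part of the algebra itself.

```python
from itertools import product

# account relation as (branch_name, account_number, balance) tuples
account = {("Downtown", "A-101", 500), ("Mianus", "A-215", 700), ("Perryridge", "A-102", 400),
           ("Round Hill", "A-305", 350), ("Brighton", "A-201", 900), ("Redwood", "A-222", 700),
           ("Brighton", "A-217", 750)}

balances = {a[2] for a in account}   # π balance (account)

# Step 1: balances for which some larger balance exists in the copy d.
not_largest = {a[2] for a, d in product(account, account) if a[2] < d[2]}

# Step 2: set difference leaves only the largest balance.
largest = balances - not_largest

print(sorted(not_largest), largest)
```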
2. Find the names of all customers who live on the same street and in the same city as Smith.
The street and city of Smith can be obtained by writing

π customer-street, customer-city (σ customer-name = "Smith" (customer))

In order to find other customers with this street and city, we must reference the customer relation a second time. In the following query, we use the rename operation on the preceding expression to give its result the name smith-addr, and to rename its attributes to street and city, instead of customer-street and customer-city:

π customer.customer-name (σ customer.customer-street = smith-addr.street ∧ customer.customer-city = smith-addr.city (customer × ρ smith-addr(street, city) (π customer-street, customer-city (σ customer-name = "Smith" (customer)))))

The result of this query is as shown below

Customer-name
Smith
Curry
Additional operations or special relational operations
1. The set-intersection operation
The symbol used to denote it is ∩.

Example:
1. Find all customers who have both a loan and an account. The query is

π customer-name (borrower) ∩ π customer-name (depositor)

The result will be

Customer-name
Hayes
Jones
Smith

Table: Customers with both an account and a loan at the bank

The intersection operation can be expressed using the set-difference operation as r ∩ s = r − (r − s).
The Union operation
With the help of this operation we can find the tuples that are present in either of two relations.

For example:

1. Find the names of all bank customers who have either an account or a loan or both. The customer relation does not contain this information, since a customer does not need to have either an account or a loan at the bank. To answer this query we need the information in the depositor relation and in the borrower relation.
*To find the names of all customers with a loan at the bank we use

π customer-name (borrower)

*To find the names of all customers with an account in the bank we use

π customer-name (depositor)

To find the customers holding an account, a loan, or both, we take the union of these two:

π customer-name (borrower) ∪ π customer-name (depositor)

The result of this query is

Customer-name
Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Adams

For a union operation r ∪ s to be valid, we require two conditions:
1. The relations r and s must be of the same arity. That is, they must have the same number of attributes.
2. The domain of the ith attribute of r and the ith attribute of s must be the same, for all i.
Here r and s can be, in general, temporary relations that are the result of relational-algebra expressions.
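Union and intersection map directly onto Python's set operators, and the identity r ∩ s = r − (r − s) can be checked on the same data. The two sets are the customer-name projections of borrower and depositor used in the examples.

```python
borrower_names = {"Jones", "Smith", "Hayes", "Jackson", "Curry", "Williams", "Adams"}
depositor_names = {"Johnson", "Smith", "Hayes", "Turner", "Jones", "Lindsay"}

either = borrower_names | depositor_names    # union: a loan, an account, or both
both = borrower_names & depositor_names      # intersection: a loan and an account

# Intersection written with set difference only: r - (r - s)
both_via_difference = borrower_names - (borrower_names - depositor_names)

print(len(either), sorted(both))
```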
The natural-join operation
It is often desirable to simplify certain queries that require a Cartesian product. Typically, a query that involves a Cartesian product includes a selection operation on the result of the Cartesian product.

Consider the query: Find the names of all customers who have a loan at the bank, and find the amount of the loan.
Steps:
1. Form the Cartesian product of the borrower and loan relations.
2. Select those tuples that pertain to the same loan-number.
3. Project the customer-name, loan-number and amount.

π customer-name, loan.loan-number, amount (σ borrower.loan-number = loan.loan-number (borrower × loan))

The natural join is a binary operation that allows us to combine certain selections and a Cartesian product into one operation. It is denoted by the "join" symbol ⋈. The natural-join operation forms a Cartesian product of its two arguments, performs a selection forcing equality on those attributes that appear in both relation schemas, and finally removes duplicate attributes.

For example:
1. Find the names of all customers who have a loan at the bank, and find the amount of the loan.

π customer-name, loan-number, amount (borrower ⋈ loan)

The result of the query is

Customer-name    Loan-number    Amount
Jones            L-17           1000
Smith            L-23           2000
Hayes            L-15           1500
Jackson          L-14           1500
Curry            L-93           500
Smith            L-11           900
Williams         L-17           1000
Adams            L-16           1300

2. Find the names of all branches with customers who have an account in the bank and who live in Harrison.

π branch-name (σ customer-city = "Harrison" (customer ⋈ account ⋈ depositor))

The result of the query is

Branch-name
Brighton
Perryridge
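The three ingredients of the natural join (product, equality on shared attributes, removal of duplicate attributes) can be sketched in Python with each tuple stored as a dictionary keyed by attribute name. The function natural_join is a hypothetical helper, and only a few rows of borrower and loan are used to keep the sketch short.

```python
def natural_join(r, s):
    """Join two lists of dict-tuples on every attribute name they share."""
    result = []
    for t1 in r:
        for t2 in s:
            common = t1.keys() & t2.keys()             # attributes in both schemas
            if all(t1[a] == t2[a] for a in common):    # equality on shared attributes
                merged = {**t1, **t2}                  # shared attributes appear once
                if merged not in result:               # a relation has no duplicate tuples
                    result.append(merged)
    return result

borrower = [{"customer_name": "Jones", "loan_number": "L-17"},
            {"customer_name": "Smith", "loan_number": "L-23"},
            {"customer_name": "Williams", "loan_number": "L-17"}]
loan = [{"branch_name": "Downtown", "loan_number": "L-17", "amount": 1000},
        {"branch_name": "Redwood", "loan_number": "L-23", "amount": 2000},
        {"branch_name": "Perryridge", "loan_number": "L-15", "amount": 1500}]

joined = natural_join(borrower, loan)
rows = {(t["customer_name"], t["loan_number"], t["amount"]) for t in joined}
print(sorted(rows))
```

Loan L-15 has no matching borrower in this reduced data, so it does not appear in the result. If the two schemas share no attribute, the function degenerates to a plain Cartesian product.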
The division operation
The division operation, denoted by ÷, is suited to queries that include the phrase "for all".

Example:
1. Find all customers who have an account at all the branches located in Brooklyn.

Steps:
1. All branches in Brooklyn can be obtained as

r1 = π branch-name (σ branch-city = "Brooklyn" (branch))

The result is

Branch-name
Brighton
Downtown

2. We can find all (customer-name, branch-name) pairs for which the customer has an account at a branch by writing

r2 = π customer-name, branch-name (depositor ⋈ account)

Customer-name    Branch-name
Johnson          Downtown
Smith            Mianus
Hayes            Perryridge
Turner           Round Hill
Williams         Perryridge
Lindsay          Redwood
Johnson          Brighton
Jones            Brighton

Table: Result of π customer-name, branch-name (depositor ⋈ account)

3. Our question is to find those customers who appear in r2 with every branch name in r1. We formulate the query by writing

π customer-name, branch-name (depositor ⋈ account) ÷ π branch-name (σ branch-city = "Brooklyn" (branch))

With the relations shown above, the result contains the single customer Johnson, who appears in r2 with both Brighton and Downtown.

Extended relational-algebra operations

The basic relational-algebra operations have been extended in several ways. A simple extension is to allow arithmetic operations as part of projection. An important extension is to allow aggregate operations, such as computing the sum of the elements of a set, or their average. Another important extension is the outer-join operation, which allows relational-algebra expressions to deal with null values, which model missing information.

Generalized Projection
The generalized projection operation extends the projection operation by allowing arithmetic functions to be used in the projection list. The generalized projection has the form

π F1, F2, …, Fn (E)

where E is any relational-algebra expression, and each of F1, F2, …, Fn is an arithmetic expression involving constants and attributes in the schema of E. As a special case, the arithmetic expression may be simply an attribute or a constant. The following example demonstrates the use of the generalized projection operation. Suppose we have a relation credit-info, as shown, which lists the credit limit and the expenses so far (the credit-balance). If we want to find how much more each person can spend, we can write the following expression:

π customer-name, limit − credit-balance (credit-info)

Customer-name    Limit    Credit-balance
Jones            6000     700
Smith            2000     400
Hayes            1500     1500
Curry            2000     1750

Table: The credit-info relation

Customer-name    Limit − credit-balance
Jones            5300
Smith            1600
Hayes            0
Curry            250

Table: The result of π customer-name, limit − credit-balance (credit-info)
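Division and generalized projection, the two operations worked through above, can both be sketched in a few lines of Python. The helper name divide is hypothetical; r2 is the (customer-name, branch-name) table and r1 the set of Brooklyn branches from the division example, and credit_info is the credit-info relation.

```python
def divide(r, s):
    """r: set of (x, y) pairs, s: set of y values.
    Return every x that is paired in r with all y in s."""
    xs = {x for x, _ in r}
    return {x for x in xs if all((x, y) in r for y in s)}

r2 = {("Johnson", "Downtown"), ("Smith", "Mianus"), ("Hayes", "Perryridge"),
      ("Turner", "Round Hill"), ("Williams", "Perryridge"), ("Lindsay", "Redwood"),
      ("Johnson", "Brighton"), ("Jones", "Brighton")}
r1 = {"Brighton", "Downtown"}

all_brooklyn_branches = divide(r2, r1)

# Generalized projection: an arithmetic expression in the projection list.
credit_info = [("Jones", 6000, 700), ("Smith", 2000, 400),
               ("Hayes", 1500, 1500), ("Curry", 2000, 1750)]
can_still_spend = [(name, limit - balance) for name, limit, balance in credit_info]

print(all_brooklyn_branches)
print(can_still_spend)
```

Jones has an account at Brighton but not at Downtown, so only Johnson passes the "for all" test in divide.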
Outer join
The outer-join operation is an extension of the join operation to deal with missing information.
Aggregate functions
Aggregate functions are functions that take a collection of values and return a single value as a result. For example, the aggregate function sum takes a collection of values and returns the sum of the values.
The function sum applied to the collection <1, 1, 3, 4, 4, 11> returns the value 24.
The function avg returns the average of the values. The average of the above collection is 4.
The function count returns the number of elements in the collection, and would return 6 on the preceding collection.
The functions min and max return the minimum and maximum values in a collection; on the preceding collection they return 1 and 11.
Examples:
1. Find the total sum of the salaries of all part-time employees in the bank.
The query is

sum salary (pt-works)

The result of this query is a relation with a single attribute, containing a single row with a numerical value corresponding to the sum of the salaries of all employees working part-time in the bank.
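The values quoted for the sample collection can be reproduced with Python's built-in functions. A list is used rather than a set because the collection contains repeated values, and an aggregate function must see every occurrence.

```python
values = [1, 1, 3, 4, 4, 11]   # duplicates are kept, so a list is used

total = sum(values)
average = total / len(values)
count = len(values)
smallest, largest = min(values), max(values)

print(total, average, count, smallest, largest)
```

Had the collection been turned into a set first, the repeated 1 and 4 would have been dropped and sum, avg and count would all change.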
For further details of aggregate functions, refer to the following texts:
1. Database System Concepts - Abraham Silberschatz, Henry F. Korth
2. An Introduction to Database Systems, chapter 4 - Bipin P. Desai (for the relational approach)
Short questions:
1. What is the relational approach?
2. What is relational algebra?
3. Write the definition of relational algebra.
4. What are the fundamental operations of relational algebra?
5. What is an entity, relation, entity set, relationship, relationship set, attribute?
6. Briefly explain mapping cardinalities.
7. Draw the entity-relationship diagram for a banking enterprise.
8. Explain the selection and projection operations with an example.
9. Explain aggregate functions in brief.
10. Explain set operations.
11. Explain binary, unary, ternary and n-ary relations.
12. What are the various symbols used in an entity-relationship diagram?
13. What is a constraint?
14. Write a note on integrity rules.
15. What is a key?

Elaborate questions:
1. Write the definition of a key and explain the various keys with examples.
2. Explain the structure of relational databases with an example.
3. Explain the referential integrity constraint or rule, with an example.
4. Explain all fundamental operations of relational algebra, or the traditional set operations, with examples.
5. List all aggregate functions and explain them in detail with examples.
6. What are the extended relational operations? Explain all the available operations.
STUDY MATERIAL
Course: B.Com CA
Semester: III
Subject: Data Base Management System
Unit: Three
_______________________________________________________________________
Unit III Syllabus
Embedded SQL: Introduction, operations not involving cursors, operations involving cursors, dynamic statements. Query by Example: retrieval operations, built-in functions, update operations, QBE dictionary. Normalization: functional dependency; first, second and third normal forms; relations with more than one candidate key; good and bad decomposition.
Books for Reference:
An Introduction to Database Systems - C. J. Date
Database System Concepts - Abraham Silberschatz, Henry F. Korth, S. Sudarshan
Principles of Database Systems - Jeffrey D. Ullman
Embedded SQL
SQL provides a powerful declarative query language; writing queries in SQL is typically much easier than coding the same queries in a general-purpose programming language. However, access to a database from a general-purpose programming language is still required, for the following two reasons.
1. Not all queries can be expressed in SQL, since SQL does not provide the full expressive power of a general-purpose language. That is, there exist queries that can be expressed in a language such as Pascal, C, COBOL or FORTRAN that cannot be expressed in SQL. To write such queries, we can embed SQL within a more powerful language.
2. Nondeclarative actions, such as printing a report, interacting with a user, or sending the results of a query to a graphical user interface, cannot be done from within SQL.
A language in which SQL queries are embedded is referred to as a host language, and the SQL structures permitted in the host language constitute embedded SQL.
SQL operates on whole sets of records at a time. Languages such as PL/I, however, are not well equipped to handle more than one record at a time. It is therefore necessary to provide some form of bridge between the two levels, and embedded SQL provides such a bridge by means of a new type of object called a cursor.
Operations not involving cursors
The DML statements that do not need cursors are as follows:
"Singleton SELECT"
UPDATE
INSERT
DELETE

Singleton SELECT
We use the term "singleton SELECT" to mean a SELECT statement for which the retrieved table contains at most one row. Example: the SELECT statement of SQL, when it retrieves a single row.

UPDATE
This statement is executed to change data already stored in the database. Example: the UPDATE statement of SQL.

INSERT
This statement is used to add a new row of information. Example: the INSERT statement of SQL.

DELETE
This statement is used to delete information from the database. Example: the DELETE statement of SQL.
Operations involving cursors
Consider the case of a SELECT that selects a whole set of records, not just one. What is needed is a mechanism for accessing the records in the set one by one, and cursors provide such a mechanism. Explicitly defined cursors are constructs that enable the user to name an area of memory that holds a specific statement for access at a later time. The programmer defines explicit cursors to process a multiple-row active set one record at a time. The following are the steps for using explicitly defined cursors within PL/SQL.
1. Declare the cursor
* Name the cursor
* Associate a query with the cursor
Syntax
Declare cursor cursor-name is select-statement;
Example
Declare cursor c_names is select branch_name from branch where branch_city='Brooklyn';
2. Open the cursor
Opening the cursor activates the query and identifies the active set. Open also initializes the cursor pointer to just before the first row of the active set.
Syntax
Open cursor-name;
3. Fetch from the cursor
Getting data out of the cursor is accomplished with the fetch command. The fetch command retrieves the rows in the active set one row at a time.
Syntax
Fetch cursor-name into record-list;
4. Close the cursor
The close statement closes or deactivates the previously opened cursor and makes the active set undefined. Oracle will implicitly close a cursor when the user's program or session is terminated. After a cursor is closed, we cannot perform any operation on it until it is opened again.
Syntax
Close cursor-name;
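The four steps above can also be tried outside PL/SQL. The following is a minimal sketch in Python using the standard sqlite3 module, whose cursor objects behave in a broadly similar way. The table contents and the branch names are assumptions made up for illustration; only the branch relation and the Brooklyn query come from the text.

```python
import sqlite3

# Hypothetical in-memory copy of the branch relation (sample rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branch (branch_name TEXT, branch_city TEXT, assets INTEGER)")
conn.executemany(
    "INSERT INTO branch VALUES (?, ?, ?)",
    [("Downtown", "Brooklyn", 9000000),
     ("Brighton", "Brooklyn", 7100000),
     ("Perryridge", "Horseneck", 1700000)],
)

# 1. Declare: create the cursor object.
c_names = conn.cursor()

# 2. Open: executing the query identifies the active set.
c_names.execute(
    "SELECT branch_name FROM branch WHERE branch_city = ? ORDER BY branch_name",
    ("Brooklyn",),
)

# 3. Fetch: retrieve the active set one row at a time.
names = []
while True:
    row = c_names.fetchone()  # returns None once the active set is exhausted
    if row is None:
        break
    names.append(row[0])

# 4. Close: the cursor can no longer be used after this point.
c_names.close()
conn.close()

print(names)
```

The loop plays the role of FETCH followed by a test of %NOTFOUND: it stops as soon as a fetch returns no row.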
Attributes involved in cursors
%ISOPEN returns TRUE if the cursor is already open.
%FOUND returns TRUE if the last FETCH returned a row, and FALSE if the last FETCH failed to return a row.
%NOTFOUND is the logical opposite of %FOUND.
%ROWCOUNT yields the number of rows fetched so far.
Example to illustrate cursors
1)
Declare
  cursor c4 is select salary, job from emp where job = 'CLERK';
Begin
  -- A cursor that has only been declared is not yet open
  if c4%isopen then
    dbms_output.put_line('This message will not be displayed');
  else
    open c4;
    dbms_output.put_line('Cursor not found');
  end if;
  close c4;
end;
2) The procedure to update students' information by finding the total and average.
Declare
  st stu%rowtype;
  cursor c1 is select * from stu;
Begin
  open c1;
  loop
    fetch c1 into st;
    exit when c1%notfound;
    st.total := st.m1 + st.m2 + st.m3;
    st.average := st.total / 3;
    -- A student passes only with at least 50 marks in every subject
    if st.m1 >= 50 and st.m2 >= 50 and st.m3 >= 50 then
      st.result := 'PASS';
    else
      st.result := 'FAIL';
    end if;
    update stu set total = st.total, average = st.average, result = st.result where regno = st.regno;
  end loop;
  close c1;
  commit;
end;
Dynamic Statements
Embedded SQL provides certain features to facilitate the writing of on-line application programs, that is, programs that support on-line access to the database from an end-user at a terminal. The steps involved are:
1. Accept a command from the terminal
2. Analyze the command
3. Issue appropriate SQL statements
4. Return a message and/or results to the terminal
The precompiler is a compiler for the SQL language. Suppose the application programmer has written a program P that includes some embedded SQL statements.
Pre-compilation proceeds as follows.
* The precompiler scans the source program P and locates the embedded SQL statements.
* For each statement it finds, the precompiler decides on a strategy for implementing that statement in terms of RSI operations. This process is referred to as optimization.
* The precompiler replaces each of the original embedded SQL statements by an ordinary PL/I statement.
The dynamic SQL component of SQL-92 allows programs to construct and submit SQL queries at run time. In embedded SQL, by contrast, each statement must be completely present at compile time and is compiled by the embedded SQL preprocessor. Using dynamic SQL, programs can create SQL queries as strings at run time (perhaps based on input from the user) and can either have them executed immediately or have them prepared for subsequent use. The two principal dynamic statements are PREPARE and EXECUTE.
DCL SQLSOURCE CHAR (256);
SQLSOURCE = 'DELETE FROM BRANCH WHERE BRANCH_CITY = ''PERRYRIDGE''';
$PREPARE SQLOBJ FROM SQLSOURCE;
$EXECUTE SQLOBJ;
The PREPARE statement passes the SQLSOURCE string to the RDS precompiler, which goes through its normal process of parsing, optimization and code generation and builds a machine language version of the statement called SQLOBJ. The EXECUTE statement causes this machine language routine to be executed and thus causes the actual deletions to occur. Once PREPAREd, a given dynamically generated SQL statement can be EXECUTEd many times. The generated statement can be replaced by another by issuing PREPARE again with the same target and a different source.
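The idea of building a statement as a string at run time and then executing it several times can be sketched in Python with the standard sqlite3 module. This is only an analogy: sqlite3 has no separate PREPARE call and compiles the statement when it is first executed. The helper build_delete, the sample rows and the whitelist of column names are assumptions introduced for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE branch (branch_name TEXT, branch_city TEXT)")
conn.executemany(
    "INSERT INTO branch VALUES (?, ?)",
    [("Downtown", "Brooklyn"), ("Perryridge", "Horseneck"), ("Redwood", "Palo Alto")],
)

def build_delete(table, column):
    """Build a DELETE statement as a string at run time (hypothetical helper)."""
    # Table and column names cannot be bound as parameters,
    # so they are checked against a fixed whitelist first.
    allowed = {"branch": {"branch_name", "branch_city"}}
    if column not in allowed.get(table, set()):
        raise ValueError("unknown table or column")
    return f"DELETE FROM {table} WHERE {column} = ?"

# The statement text is only known at run time (the role of SQLSOURCE).
sql_source = build_delete("branch", "branch_city")

# The same statement is executed several times with different values.
deleted = []
for city in ("Horseneck", "Palo Alto"):
    cur = conn.execute(sql_source, (city,))
    deleted.append(cur.rowcount)  # rows removed by this execution

remaining = [r[0] for r in conn.execute("SELECT branch_name FROM branch")]
print(deleted, remaining)
```

Passing the city as a bound parameter rather than pasting it into the string keeps user input out of the statement text, which is the usual precaution when statements are assembled dynamically.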
QUERY-BY-EXAMPLE
Query-by-Example (QBE) is the name of both a data-manipulation language and the database system that included this language. The QBE database system was developed at IBM's T. J. Watson Research Center in the early 1970s. Today, some database systems for personal computers support variants of the QBE language. It has two distinctive features:
1. Unlike most query languages and programming languages, QBE has a two-dimensional syntax: queries look like tables. A query in a one-dimensional language can be written in one line. A two-dimensional language requires two dimensions for its expression.
2. QBE queries are expressed “by example”. Instead of giving a procedure for obtaining the desired answer, the user gives an example of what is desired. The system generalizes this example to compute the answer to the query.
We express queries in QBE using skeleton tables. These tables show the relation schema, as shown below.
Example: the representation of the branch relation
branch | branch-name | branch-city | assets
       |             |             |
Retrieval operations
Queries on One relation
Examples:
1. Find all loan numbers at the Perryridge branch.
loan | branch-name | loan-number | amount
     | Perryridge  | P._x        |
The preceding query causes the system to look for tuples in loan that have “Perryridge” as the value for the branch-name attribute. For each such tuple, the value of the loan-number attribute is assigned to the variable x. The value of the variable x is “printed”, because the command P. appears in the loan-number column next to the variable x. QBE assumes that a blank position in a row contains a unique variable. As a result, if a variable does not appear more than once in a query, it may be omitted.
Thus the previous query can be rewritten as
loan | branch-name | loan-number | amount
     | Perryridge  | P.          |
QBE performs duplicate elimination automatically. To suppress duplicate elimination, we insert the command ALL. after the P. command:
loan | branch-name | loan-number | amount
     | Perryridge  | P.ALL.      |
To display the entire loan relation, we can create a single row consisting of P. in every field.
loan | branch-name | loan-number | amount
     | P.          | P.          | P.
QBE allows queries that involve arithmetic comparisons.
Example
1. Find the loan numbers of all loans with a loan amount of more than $700.
loan | branch-name | loan-no. | amount
     |             | P.       | >700
The comparison operations that QBE supports are =, <, ≤, >, ≥ and ¬.
2. Find the names of all branches that are not located in Brooklyn.
branch | branch-name | branch-city | assets
       | P.          | ¬Brooklyn   |
3. Find the loan numbers of all loans made jointly to Smith and Jones.
borrower | customer-name | loan-no.
         | 'Smith'       | P._x
         | 'Jones'       | _x
4. Find the loan numbers of all loans made to Smith, to Jones, or to both jointly.
borrower | customer-name | loan-no.
         | 'Smith'       | P._x
         | 'Jones'       | P._y
5. Find all customers who live in the same city as Jones.
customer | customer-name | customer-street | customer-city
         | P._x          |                 | _y
         | Jones         |                 | _y
Queries on several relations
QBE allows queries that span several different relations. The connections among the various relations are achieved through variables that force certain tuples to have the same value on certain attributes.
Example
1. Find the names of all customers who have a loan from the Perryridge branch.
loan | branch-name | loan-no. | amount
     | Perryridge  | _x       |
borrower | customer-name | loan-no.
         | P._y          | _x
2. Find the names of all customers who have both an account and a loan at the bank.
depositor | customer-name | account-no.
          | P._x          |
borrower | customer-name | loan-no.
         | _x            |
3. Find the names of all customers who have an account at the bank, but who do not have a loan from the bank.
depositor | customer-name | account-no.
          | P._x          |
borrower | customer-name | loan-no.
¬        | _x            |
The ¬ under the relation name negates the row: no borrower tuple may exist for the customer _x.
4. Find all customers who have at least two accounts.
depositor | customer-name | account-no.
          | P._x          | _y
          | _x            | ¬_y
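One way to read these multi-relation skeletons is that a variable shared between two rows acts as an equality condition between them, which corresponds to a join in SQL. The sketch below expresses queries 1 and 4 of this kind in SQL through Python's standard sqlite3 module. The sample rows, customer names and column names such as cust_name are assumptions made up for illustration, not part of the QBE examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE loan (branch_name TEXT, loan_no TEXT, amount INTEGER);
CREATE TABLE borrower (cust_name TEXT, loan_no TEXT);
CREATE TABLE depositor (cust_name TEXT, account_no TEXT);
INSERT INTO loan VALUES ('Perryridge', 'L-15', 1500), ('Downtown', 'L-17', 1000);
INSERT INTO borrower VALUES ('Hayes', 'L-15'), ('Jones', 'L-17');
INSERT INTO depositor VALUES ('Hayes', 'A-102'), ('Hayes', 'A-305'), ('Jones', 'A-217');
""")

# Query 1: the variable _x appears in loan.loan_no and borrower.loan_no,
# so the two rows are connected by an equality join on loan_no.
perryridge_customers = [r[0] for r in conn.execute(
    "SELECT DISTINCT b.cust_name "
    "FROM loan l JOIN borrower b ON l.loan_no = b.loan_no "
    "WHERE l.branch_name = 'Perryridge'"
)]

# Query 4: two depositor rows share the customer _x, and the second row
# carries an account number different from _y (a self-join with <>).
two_accounts = [r[0] for r in conn.execute(
    "SELECT DISTINCT d1.cust_name "
    "FROM depositor d1 JOIN depositor d2 ON d1.cust_name = d2.cust_name "
    "WHERE d1.account_no <> d2.account_no"
)]

print(perryridge_customers, two_accounts)
```

DISTINCT mirrors the automatic duplicate elimination that QBE performs on retrieval.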
The condition box
It is not convenient to express all the constraints on the domain variables within the skeleton tables. To overcome this, QBE includes a condition box feature that allows the expression of general constraints over any of the domain variables.
Example:
1. Find all customers who are not named 'Jones' and who have at least two accounts.
depositor | customer-name | account-no.
          | P._x          | _y
          | _x            | ¬_y
Conditions
_x ¬= Jones
2. Find all account numbers with a balance between $1300 and $1500.
account | branch-name | account-no. | balance
        |             | P.          | _x
Conditions
_x ≥ 1300
_x ≤ 1500
3. Find all branches that have assets greater than those of at least one branch located in 'Brooklyn'.
branch | branch-name | branch-city | assets
       | P._x        |             | _y
       |             | Brooklyn    | _z
Conditions
_y > _z
Options available with the condition box
1. QBE allows complex arithmetic expressions to appear in a condition box.
Example: Find all branches that have assets that are at least twice as large as the assets of one of the branches located in Brooklyn.
branch | branch-name | branch-city | assets
       | P._x        |             | _y
       |             | Brooklyn    | _z
Conditions
_y ≥ 2 * _z
2. QBE allows logical expressions to appear in a condition box. The operators used are and (&) and or (|).
Example
Find all account numbers with a balance between $1300 and $2000 but not exactly $1500.
account | branch-name | account-no. | balance
        |             | P.          | _x
Conditions
_x = (≥ 1300 and ≤ 2000 and ¬1500)
The result relation
If the result of a query includes attributes from several relation schemas, we need a mechanism to display the desired result in a single table.
Example
1. Find the customer-name, account-no. and balance for all accounts at the Perryridge branch.
In relational algebra, we would:
1. Join the depositor and account relations
2. Project customer-name, account-no. and balance
In QBE, we proceed as follows:
1. Create a skeleton table called result with attributes customer-name, account-no. and balance.
2. Write the query.
account | branch-name | account-no. | balance
        | Perryridge  | _y          | _z
depositor | customer-name | account-no.
          | _x            | _y
result | customer-name | account-no. | balance
P.     | _x            | _y          | _z
Ordering of the display of tuples
By using the commands AO. (ascending order) and DO. (descending order), we can order the displayed tuples.
Example
1. List all customers in descending alphabetical order.
depositor | customer-name | account-no.
          | P.DO.         |
Aggregate functions [Built-in functions]
QBE includes the aggregate operators AVG, MAX, MIN, SUM and CNT. We must postfix these operators with ALL. to create a multiset on which the aggregate operation is evaluated.
Example
1. Find the total balance of all the accounts maintained at the Perryridge branch.
account | branch-name | account-no. | balance
        | Perryridge  |             | P.SUM.ALL.
2. Find the total number of customers who have an account at the bank.
depositor | customer-name   | account-no.
          | P.CNT.UNQ.ALL.  |
3. Find the name, street and city of all customers who have more than one account at the bank.
customer | cust-name | cust-street | cust-city
P.       | _x        |             |
depositor | cust-name | account-no.
          | G._x      | CNT.ALL._y
Conditions
CNT.ALL._y > 1
The G. operator groups the depositor tuples by customer name, so that the count is evaluated once per customer.
Update operations/Modification of the database
This section deals with how to add, remove or change information using QBE.
Deletion
Deletion of tuples from a relation is expressed in much the same way as a query. The major difference is the use of D. in the place of P.. In QBE we can delete whole tuples, as well as values in selected columns. To delete information in only some of the columns, null values, specified by -, are inserted.
D. operates on only one relation. To delete tuples from several relations, we must use one D. operator for each relation.
* Delete customer Smith.
customer | cust_name | cust_street | cust_city
D.       | Smith     |             |
* Delete the branch-city value of the branch whose name is “Perryridge”.
branch | branch-name | branch-city | assets
       | Perryridge  | D.          |
* Delete all loans with a loan amount between $1300 and $1500.
loan | branch-name | loan-no. | amount
D.   |             | _y       | _x
borrower | cust_name | loan_no.
D.       |           | _y
Conditions
_x = (≥ 1300 and ≤ 1500)
* Delete all accounts at all branches located in Brooklyn.
account | branch_name | account_no. | balance
D.      | _x          | _y          |
depositor | cust_name | acc_no.
D.        |           | _y
branch | branch_name | branch_city | assets
       | _x          | Brooklyn    |
Insertion
We do the insertion by placing the I. operator in the query expression. The attribute values for inserted tuples must be members of the attribute's domain.
Example
* To insert into the branch relation information about a new branch with name “Capital” and city “Queens”, but with a null asset value, we write
branch | branch_name | branch_city | assets
I.     | Capital     | Queens      |
* To insert the fact that account A-9732 at the Perryridge branch has a balance of $700, we write
account | branch-name | account_no. | balance
I.      | Perryridge  | A-9732      | 700
Updates
If we want to change one value in a tuple without changing all values in the tuple, we use the update facility; the operator used is U. . QBE, however, does not allow users to update the primary key fields.
Update the asset value of the Perryridge branch to $10,000,000
branch | branch-name | branch-city | assets
       | Perryridge  |             | U.10000000
The query updates the assets of the Perryridge branch to $10,000,000 regardless of the old value. If we want to update a value using the previous value, we must express the request using two rows: one specifying the old tuples that need to be updated, and the other indicating the new updated tuples to be inserted in the database.
Interest payments are being made, and all balances are to be increased by 5%.
account | branch-name | account-no. | balance
U.      |             |             | _x * 1.05
        |             |             | _x
QBE Dictionary
QBE has a built-in dictionary that is presented to the user as a collection of tables. The dictionary includes, for example, a TABLE table and a DOMAIN table, giving details of all tables and all domains currently known to the system. The dictionary tables can be interrogated using the ordinary retrieval operations of the DML.
Retrieval of table-names
Get the names of all tables known to the system.
P. |   |   |
Instead of having to build a skeleton for the TABLE table and entering “P.” in the NAME column of that skeleton, the user can formulate this query by simply entering “P.” in the table-name position of the blank table.
Retrieval of column-names for a given table
Get the names of all columns in table S.
S P. |   |   |
The user enters the table-name (S) followed by “P.” against the row of (blank) column-names.
Creation of a new table
1. Create table branch
I. branch I. | branch name | branch city | branch street
The first I. creates a dictionary entry for table branch; the second I. creates dictionary entries for the columns of table branch. In addition, information must be specified for each column. This information includes the name of the underlying domain and, if that domain is not already known to QBE, the data type of the domain.
Dropping a table
Drop table branch.
A table can be dropped only if it is currently empty.
1) Delete all branch details
branch | branch name | branch city | branch street
D.     |             |             |
2) Drop the table
D. branch | branch name | branch city | branch street
Expanding a table
Add an assets column to the table branch.
QBE does not directly support the dynamic addition of a new column to an existing table unless that table is currently empty.
So the following steps should be followed.
1) Define a new table of the same shape as the existing table plus the new column.
2) Load the new table from the old using a multiple-record insert.
3) Delete all data from the old table.
4) Drop the old table.
5) Change the name of the new table to that of the old table.
Normalization
Introduction
Normalization theory is built around the concept of normal forms. A relation is said to be in a particular normal form if it satisfies a certain specified set of constraints. For example, a relation is said to be in first normal form if and only if it satisfies the constraint that it contains atomic values only. The various normal forms include First Normal Form, Second Normal Form, Third Normal Form, BCNF and DKNF. The need for normalization arises when we want to design a relational database without unnecessary redundancy and with easy retrieval of data.
For the description of normalization, we shall consider the supplier-and-parts database. The database or relation is as follows:
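S
S# | Sname | Status | City
S1 | Smith | 20     | London
S2 | Jones | 10     | Paris
S3 | Blake | 30     | Paris
S4 | Clark | 20     | London
S5 | Adams | 30     | Athens
PART---P
P# | Pname | Color | Weight | City
P1 | Nut   | Red   | 12     | London
P2 | Bolt  | Green | 17     | Paris
P3 | Screw | Blue  | 17     | Rome
P4 | Screw | Red   | 14     | London
P5 | Cam   | Blue  | 12     | Paris
P6 | Cog   | Red   | 19     | London
SP
S# | P# | QTY
S1 | P1 | 300
S1 | P2 | 200
S1 | P3 | 400
S1 | P4 | 200
S1 | P5 | 100
S1 | P6 | 100
S2 | P1 | 300
S2 | P2 | 400
S3 | P2 | 200
S4 | P2 | 200
S4 | P4 | 300
S4 | P5 | 400
FIG:1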
Functional Dependency
Definition:
Given a relation R, attribute Y of R is functionally dependent on attribute X of R if and only if each X-value in R has associated with it precisely one Y-value in R.
In the supplier-and-parts database, the attributes SNAME, STATUS and CITY of relation S are each functionally dependent on attribute S#. For a particular value of S# there exists precisely one corresponding value for each of SNAME, STATUS and CITY.
S.S# → S.SNAME
S.S# → S.STATUS
S.S# → S.CITY
Or we can represent this as
S.S# → S.(SNAME, STATUS, CITY)
The statement S.S# → S.CITY is read as “attribute S.CITY is functionally dependent on attribute S.S#”, or “attribute S.S# functionally determines attribute S.CITY”.
Alternate definition for functional dependence
Given a relation R, attribute Y of R is functionally dependent on attribute X of R if and only if, whenever two tuples of R agree on their X-value, they also agree on their Y-value.
S# | P# | Qty | Status
S1 | P1 | 300 | 20
S1 | P2 | 200 | 20
S1 | P3 | 400 | 20
S1 | P4 | 100 | 20
Fig: Partial tabulation of relation SP’.
For example, in this relation SP’
SP’.S# → SP’.STATUS
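The alternate definition translates directly into a check over a tabulation: whenever two tuples agree on X they must agree on Y. The sketch below applies it to the partial tabulation of SP’. The function name satisfies_fd and the dictionary representation of tuples are assumptions chosen for illustration. Note that a sample tabulation can only show that an FD is violated; it cannot prove that the FD holds for every legal extension of the relation.

```python
def satisfies_fd(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}  # maps each X-value to the Y-value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same X-value, different Y-value
    return True

# Partial tabulation of SP' from the text.
sp_prime = [
    {"S#": "S1", "P#": "P1", "QTY": 300, "STATUS": 20},
    {"S#": "S1", "P#": "P2", "QTY": 200, "STATUS": 20},
    {"S#": "S1", "P#": "P3", "QTY": 400, "STATUS": 20},
    {"S#": "S1", "P#": "P4", "QTY": 100, "STATUS": 20},
]

holds_status = satisfies_fd(sp_prime, ["S#"], ["STATUS"])   # S# -> STATUS
holds_qty = satisfies_fd(sp_prime, ["S#"], ["QTY"])         # S# -> QTY
holds_full = satisfies_fd(sp_prime, ["S#", "P#"], ["QTY"])  # (S#, P#) -> QTY
print(holds_status, holds_qty, holds_full)
```

Here S# → STATUS is consistent with the sample, S# → QTY is refuted by it, and (S#, P#) → QTY is again consistent.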
A functional dependence is a special form of integrity constraint. For example, if we say that relation S satisfies the FD S.S# → S.CITY, we mean that every legal extension of that relation satisfies that constraint. It is convenient to represent the FDs in a given set of relations by means of a functional dependency diagram.
Example:
S: S# → SNAME, S# → STATUS, S# → CITY
P: P# → PNAME, P# → COLOR, P# → WEIGHT, P# → CITY
SP: (S#, P#) → QTY
Fig: Functional dependencies in relations S, P, SP.
Various Normal Forms
Brief description of Normal forms
First Normal Form
* Eliminates repetition of data, that is, converts each data value to its atomic form
* No two rows should be identical
* Each table entry should be single valued
* Every table has a primary key, which is a unique label or identifier for each row
Second Normal Form
Requires taking out data that is only dependent on a part of the key
Each non-key attribute is functionally dependent on the entire key
Third Normal form
Involves getting rid of anything in the tables that does not depend solely on the primary key
3NF is sometimes characterized as “the key, the whole key, and nothing but the key”
First Normal Form
Definition:
A relation R is in first normal form (1NF) if and only if all underlying domains contain atomic values only.
A relation that is only in first normal form has a structure that is undesirable for a number of reasons.
For example:
Let us assume that information concerning suppliers and shipments, rather than being split into two separate relations (S and SP) is combined into a single relation and let the name be FIRST with fields (S#, STATUS, CITY, P#, QTY).
Here S# represents the supplier number, STATUS represents the supplier's status value, CITY represents the city in which the supplier is located, P# represents the part number, and QTY represents the quantity supplied.
We introduce the additional constraint that STATUS is functionally dependent on CITY. The meaning of this constraint is that a supplier's status is determined by the corresponding location: e.g., all London suppliers must have a status of 20. Also, we ignore the attribute SNAME for simplicity. The primary key of FIRST is the combination (S#, P#). The functional dependencies for this relation are as follows:
(S#, P#) → QTY
S# → STATUS
S# → CITY
CITY → STATUS
Fig: Functional dependencies in the relation FIRST
In the diagram
i) STATUS and CITY are not fully functionally dependent on the primary key.
ii) STATUS and CITY are not mutually independent.
Certain difficulties arise with the FIRST relation during update operations. They are explained below.
Insert: We cannot enter the fact that a particular supplier is located in a particular city until that supplier supplies at least one part. The following is the tabulation of FIRST.
S# | STATUS | CITY   | P# | QTY
S1 | 20     | London | P1 | 300
S1 | 20     | London | P2 | 200
S1 | 20     | London | P3 | 400
S1 | 20     | London | P4 | 200
S1 | 20     | London | P5 | 100
S1 | 20     | London | P6 | 100
S2 | 10     | Paris  | P1 | 300
S2 | 10     | Paris  | P2 | 400
S3 | 10     | Paris  | P2 | 200
S4 | 20     | London | P2 | 200
S4 | 20     | London | P4 | 300
S4 | 20     | London | P5 | 400
Table: FIRST
The FIRST relation does not show that supplier S5 is located in Athens, because until S5 supplies some part we have no appropriate primary key value.
Deletion: If we delete the only FIRST tuple for a particular supplier, we destroy not only the shipment connecting that supplier to some part but also the information that the supplier is located in a particular city.
For example, if we delete the FIRST tuple with S# value S3 and P# value P2, we lose the information that S3 is located in Paris.
Updation: The city value for a given supplier appears in FIRST many times. This redundancy causes update problems.
For example, if supplier S1 moves from London to Amsterdam, we are faced with either the problem of searching the FIRST relation to find every tuple connecting S1 and London, or the possibility of producing an inconsistent result.
The solution to these problems is to replace the relation FIRST by the two relations SECOND (S#, STATUS, CITY) and SP (S#, P#, QTY). The functional dependencies for these two relations are as shown here.
SECOND: S# → STATUS, S# → CITY, CITY → STATUS
SP: (S#, P#) → QTY
Fig: Functional dependencies in the relation SECOND and SP.
The following tables show sample tabulations corresponding to the data values of FIG:1, except that the information for supplier S5 has been included in SECOND and not in SP.
SECOND
S# | Status | City
S1 | 20     | London
S2 | 10     | Paris
S3 | 10     | Paris
S4 | 20     | London
S5 | 30     | Athens
SP
S# | P# | QTY
S1 | P1 | 300
S1 | P2 | 200
S1 | P3 | 400
S1 | P4 | 200
S1 | P5 | 100
S1 | P6 | 100
S2 | P1 | 300
S2 | P2 | 400
S3 | P2 | 200
S4 | P2 | 200
S4 | P4 | 300
S4 | P5 | 400
Fig: Sample tabulations of SECOND and SP.
After building the tables as shown we overcome the difficulties of FIRST relation. Now we can easily do the operations on the tables. This is about first normal form.
SECOND NORMAL FORM:
DEFINITION: A relation R is in second normal form (2NF) if and only if it is in 1NF and every nonkey attribute is fully dependent on the primary key.
Relations SECOND and SP are both 2NF (the primary keys are S# and the combination (S#, P#), respectively). Relation FIRST is not in 2NF. A relation that is in first normal form and not in second can always be reduced to an equivalent collection of 2NF relations. The reduction consists of replacing the relation by suitable projections; the collection of these projections is equivalent to the original relation, in the sense that the original relation can always be recovered by taking the natural join of these projections, so no information is lost in the process. In other words, the process is reversible.
In our example: SECOND and SP relations are projections of FIRST, and FIRST is the natural join of SECOND and SP over S#.
The reduction of FIRST to the pair (SECOND, SP) is an example of nonloss decomposition. In general, given a relation R with possibly composite attributes A, B, C satisfying the FD R.A → R.B, R can always be “nonloss-decomposed” into its projections R1 (A, B) and R2 (A, C). Since no information is lost in the reduction process, any information that can be derived from the original structure can also be derived from the new structure. The converse is not true, however: the new structure may contain information (such as the fact that S5 is located in Athens) that could not be represented in the original. In this sense the new structure is a slightly more faithful reflection of the real world.
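The reversibility claim can be checked on the sample data. The sketch below projects the tabulation of FIRST onto (S#, STATUS, CITY) and (S#, P#, QTY) and then takes the natural join over S#. The helper name project and the tuple layout are assumptions made for illustration.

```python
# Tabulation of FIRST as (S#, STATUS, CITY, P#, QTY) tuples.
FIRST = [
    ("S1", 20, "London", "P1", 300), ("S1", 20, "London", "P2", 200),
    ("S1", 20, "London", "P3", 400), ("S1", 20, "London", "P4", 200),
    ("S1", 20, "London", "P5", 100), ("S1", 20, "London", "P6", 100),
    ("S2", 10, "Paris", "P1", 300), ("S2", 10, "Paris", "P2", 400),
    ("S3", 10, "Paris", "P2", 200), ("S4", 20, "London", "P2", 200),
    ("S4", 20, "London", "P4", 300), ("S4", 20, "London", "P5", 400),
]

def project(rows, positions):
    """Projection: keep the given column positions and drop duplicate tuples."""
    return sorted({tuple(r[i] for i in positions) for r in rows})

second = project(FIRST, (0, 1, 2))  # SECOND (S#, STATUS, CITY)
sp = project(FIRST, (0, 3, 4))      # SP (S#, P#, QTY)

# Natural join of SECOND and SP over S#.
joined = sorted(
    (s, status, city, p, qty)
    for (s, status, city) in second
    for (s2, p, qty) in sp
    if s == s2
)

print(len(second), len(sp), joined == sorted(FIRST))
```

The join returns exactly the original tuples, while SECOND holds each supplier's city and status only once.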
The SECOND/SP structure still causes problems, however. Relation SP is satisfactory; as a matter of fact, relation SP is now in third normal form, and we shall ignore it for the remainder of this section. Relation SECOND, on the other hand, still suffers from a lack of mutual independence among its nonkey attributes. The dependency diagram for SECOND is still more complex than a 3NF diagram. To be specific, the dependency of STATUS on S#, though it is functional, is transitive (via CITY): each S# value determines a CITY value, and this in turn determines the STATUS value. This transitivity leads, once again, to difficulties over update operations. (We now concentrate on the association between cities and status values, i.e., on the functional dependency of STATUS on CITY.)
INSERTING: We cannot enter the fact that a particular city has a particular status value-for example, we cannot state that any supplier in Rome must have a status of 50-until we have some supplier located in that city. The reason is, again, that until such a supplier exists we have no appropriate primary key value.
DELETING: If we delete the only SECOND tuple for a particular city, we destroy not only the information for the supplier concerned but also the information that the city has that particular status value. For example, if we delete the SECOND tuple for S5, we lose the information that the status for Athens is 30.
UPDATING: The status value for a given city appears in SECOND many times. Thus, if we need to change the status value for London from 20 to 30, we are faced with either the problem of searching the SECOND relation to find every tuple for London or the possibility of producing an inconsistent result.
The solution to the problems is to replace the original relation (SECOND) by two projections SC (S#, CITY) and CS (CITY, STATUS). The corresponding functional dependencies are shown here.
SC: S# → CITY
CS: CITY → STATUS
The tabulations corresponding to these are
SC
S# | City
S1 | London
S2 | Paris
S3 | Paris
S4 | London
S5 | Athens
CS
City   | Status
Athens | 30
London | 20
Paris  | 10
Fig:2 Sample tabulations of SC and CS.
It should be clear that this new structure overcomes all the problems over update operations concerning the CITY-STATUS association.
Third Normal Form
Definition: A relation R is in third normal form (3NF) if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key.
Relations SC and CS (shown in Fig:2) are both 3NF; relation SECOND (tabulated earlier) is not in 3NF. A relation that is in second normal form and not in third can always be reduced to an equivalent collection of 3NF relations.
Relations with more than one candidate key or BCNF (Boyce-Codd normal form)
Definition:
A relation R is in BCNF if and only if every determinant is a candidate key.
The objective of BCNF is to handle a relation having two or more composite and overlapping candidate keys. Although BCNF is stronger than 3NF, it is still true that any relation can be decomposed in a nonloss way into an equivalent collection of BCNF relations.
Relation FIRST contains three determinants: S#, CITY and the combination (S#, P#). Among these, (S#, P#) alone is a candidate key; hence FIRST is not in BCNF.
Relation SECOND is also not in BCNF, because the determinant CITY is not a candidate key.
Relations SP, SC and CS are in BCNF because in each case the primary key is the only determinant in the relation.
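The BCNF test for these relations can be sketched with the usual attribute-closure computation: a determinant is acceptable if its closure under the FDs covers every attribute of the relation. The function names closure and is_bcnf are assumptions for illustration, and the check tests whether each determinant is a superkey, which coincides with "is a candidate key" when the left sides of the FDs are irreducible, as they are here.

```python
def closure(attrs, fds):
    """All attributes functionally determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_bcnf(attributes, fds):
    """True if the determinant of every nontrivial FD determines all attributes."""
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial FD, ignore
        if closure(lhs, fds) != set(attributes):
            return False
    return True

# FDs as (determinant, dependent) pairs, taken from the text.
first_fds = [(("S#", "P#"), ("QTY",)), (("S#",), ("CITY",)), (("CITY",), ("STATUS",))]
second_fds = [(("S#",), ("CITY",)), (("CITY",), ("STATUS",))]

results = {
    "FIRST": is_bcnf(["S#", "STATUS", "CITY", "P#", "QTY"], first_fds),
    "SECOND": is_bcnf(["S#", "STATUS", "CITY"], second_fds),
    "SP": is_bcnf(["S#", "P#", "QTY"], [(("S#", "P#"), ("QTY",))]),
    "SC": is_bcnf(["S#", "CITY"], [(("S#",), ("CITY",))]),
    "CS": is_bcnf(["CITY", "STATUS"], [(("CITY",), ("STATUS",))]),
}
print(results)
```

FIRST and SECOND fail because S# (in FIRST) and CITY (in both) determine only part of the relation, which matches the argument given in the text.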
Example involving two disjoint (non-overlapping) candidate keys: Let us consider relation S (S#, SNAME, STATUS, CITY), in which we assume that supplier names are unique and that STATUS and CITY are mutually independent. The candidate keys are S# and SNAME, and the relation S is in BCNF. However, it is desirable to specify both keys in the definition of the relation:
a) To inform the DBMS, so that it may enforce the constraints implied by the two-way dependency between the two keys, namely, that corresponding to each supplier number there exists a unique supplier name, and conversely
b) To inform the users, since of course the uniqueness of the two attributes is an aspect of the semantics of the relation and is therefore of interest to people using it.
Example where the candidate keys overlap: Two candidate keys overlap if they involve two or more attributes each and have an attribute in common.
1) We suppose again that supplier names are unique, and we consider the relation SSP (S#, SNAME, P#, QTY). The keys are (S#, P#) and (SNAME, P#). This relation is not in BCNF because we have two determinants, S# and SNAME, which are not keys for the relation (S# determines SNAME, and conversely). But the relation is in 3NF under the original definition: a relation R is in 3NF if and only if it is in 2NF and every non-key attribute is non-transitively dependent on the primary key. That definition does not require an attribute to be fully dependent on the primary key if it is itself a component of some other key in the relation, and so the fact that SNAME is not fully dependent on (S#, P#) was ignored. But this fact leads to redundancy and hence to update problems in the relation SSP. For example, updating the name of supplier S1 from Smith to Robinson leads either to search problems or to possibly inconsistent results. The solution to the problems, as usual, is to decompose the relation SSP into two projections, in this case SS (S#, SNAME) and either SP (S#, P#, QTY) or SP (SNAME, P#, QTY). These projections are both BCNF.
2) Second example: Consider the relation SJT with attributes S (student), J (subject) and T (teacher). The meaning of an SJT tuple is that the specified student is taught the specified subject by the specified teacher. The semantic rules follow:
1. For each subject, each student of that subject is taught by only one teacher.
2. Each teacher teaches only one subject.
3. Each subject is taught by several teachers.
The sample tabulation of this relation is as follows
SJT
S     | J       | T
Smith | Math    | Prof. White
Smith | Physics | Prof. Green
Jones | Math    | Prof. White
Jones | Physics | Prof. Brown
The functional dependencies of SJT are as follows. From the first semantic rule we have a functional dependency of T on the composite attribute (S, J). From the second semantic rule we have a functional dependency of J on T. From the third semantic rule it is understood that there is no functional dependency of T on J. So the diagram is as follows
(S, J) → T
T → J
Fig: Functional dependencies in the relation SJT.
Here again we are having two overlapping candidate keys: the combination (S, J) and the combination (S, T). Once again the relation is 3NF and not BCNF; and once again the relation suffers from certain anomalies in connection with update operations. For example, if we wish to delete the information that Jones is studying physics, we cannot do so without at the same time losing information that professor Brown teaches physics.
The difficulties are caused by the fact that T is a determinant but not a candidate key. Again we can get over the problem by replacing the original relation by two BCNF projections, in this case ST (S, T) and TJ (T, J).
Finally, we say that the concept of BCNF eliminates certain problem cases that could occur under the old definition of 3NF. Moreover, BCNF is conceptually simpler than 3NF, in that it involves no reference to the concepts of primary key, transitive dependence and full dependence. The reference to candidate keys can also be replaced by a reference to the more fundamental notion of functional dependence.
Good and Bad decompositions
During the reduction process it is frequently the case that a given relation can be decomposed in a variety of different ways. Consider the relation SECOND (S#, STATUS, CITY) with functional dependencies (FDs).
SECOND.S# → SECOND.CITY
SECOND.CITY → SECOND.STATUS
And therefore, by transitivity,
SECOND.S# → SECOND.STATUS
The representation of SECOND relation is
Fig: Functional dependencies in relations S, P, SP
The above diagram clearly states that the update problems encountered with SECOND could be overcome by replacing it by its decomposition into the two 3NF projections
SC (S#, CITY) and CS (CITY, STATUS) ------ A
Let this decomposition be A.
An alternative decomposition is
SC (S#, CITY) and SS (S#, STATUS) ------ B
Decomposition B is also nonloss, and the two projections are again BCNF. But decomposition B is less satisfactory than decomposition A.
For example, it is still not possible (in B) to insert the fact that a particular city has a particular status value unless supplier is located in that city. The explanation of this example is as follows:
In decomposition A the two projections are independent of each other, in the sense that updates can be made to either one without regard for the other; So joining them will not violate the FD constraints on SECOND.
In decomposition B, updates to either of the two projections must be monitored to ensure that the FD SECOND.CITY → SECOND.STATUS is not violated. Thus projections SC and SS are not independent of each other.
A relation that cannot be decomposed into independent components is said to be atomic.
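The difference between decompositions A and B can be checked mechanically. The following is a minimal Python sketch, not part of the original notes. The helper names `join` and `satisfies_fd` and the sample rows (suppliers S1 and S4, both in London) are hypothetical and chosen only for illustration.

```python
def join(left, right, attr):
    """Natural join of two lists of dicts on one shared attribute."""
    return [{**l, **r} for l in left for r in right if l[attr] == r[attr]]

def satisfies_fd(rows, lhs, rhs):
    """True if each lhs value is paired with only one rhs value (lhs -> rhs)."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True

# Hypothetical rows: suppliers S1 and S4 are both located in London.
sc = [{"S#": "S1", "CITY": "London"}, {"S#": "S4", "CITY": "London"}]

# Decomposition B: the status of S4 alone was changed in SS (S#, STATUS).
ss = [{"S#": "S1", "STATUS": 20}, {"S#": "S4", "STATUS": 30}]
b_ok = satisfies_fd(join(sc, ss, "S#"), "CITY", "STATUS")

# Decomposition A: a status is stored once per city in CS (CITY, STATUS),
# and a city can be recorded before any supplier is located there.
cs = [{"CITY": "London", "STATUS": 30}, {"CITY": "Paris", "STATUS": 10}]
a_ok = satisfies_fd(join(sc, cs, "CITY"), "CITY", "STATUS")

print(b_ok, a_ok)  # False True
```

In B the update looked valid inside SS, yet the join breaks CITY → STATUS, which is why updates to SC and SS must be monitored together.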
Questions:
1. What is embedded SQL?
2. Define QBE.
3. Explain operations involving cursors and not involving cursors.
4. What do you mean by dynamic statements?
5. Explain retrieval operations of QBE.
6. Explain update operations of QBE.
7. Explain built-in functions of QBE.
8. Define Normalization.
9. What are the various forms of normalization?
10. What do you mean by QBE dictionary?
11. Explain first, second and third normal forms.
12. Explain relations with more than one candidate key [BCNF].
13. What do you mean by good and bad decomposition?
14. What are QBE aggregate functions?
15. What is functional dependency?
STUDY MATERIAL
Course: B.Com CA
Subject: Data base management system
Semester: III
Unit: Four
Unit IV Syllabus
Hierarchical Approach: IMS data structure. Physical database, database description, Hierarchical sequence. External level of IMS: Logical Databases, the program communication block. IMS data manipulation: Defining the program communication block: DL/I Examples.
Books for Reference:
An Introduction to Database Systems - C. J. Date
Database System Concepts - Abraham Silberschatz, Henry F. Korth, S. Sudarshan
Principles of Database Systems - Jeffrey D. Ullman
IMS data structure(Information Management System)
A physical database (PDB) is an ordered set, the elements of which consist of all occurrences of one type of physical database record (PDBR). A PDBR occurrence in turn consists of a hierarchical arrangement of fixed-length segment occurrences, and a segment occurrence consists of a set of associated fixed-length field occurrences.
As an example we consider a PDB that contains information about the internal education system of a large industrial company. The hierarchical structure of this PDB-that is the PDBR type is shown here
COURSE (Course#, Title, Description)
    PREREQ (Course#, Title)
    OFFERING (Date, Location, Format)
        TEACHER (Emp#, Name)
        STUDENT (Emp#, Name, Grade)
Fig: PDBR type for the education database.
In this example we are assuming that the company maintains an education department whose function is to run a number of training courses. Each course is offered at a number of different locations within the company. The PDB contains details both of offerings already given and of offerings scheduled to be given in the future. The details are as follows:
For each course: course number (unique), course title, course description, details of prerequisite courses if any, and details of all offerings.
For each prerequisite course for a given course: course number and title.
For each offering of a given course: date, location, format, details of all teachers and details of all students.
For each teacher of a given offering: employee number and name.
For each student of a given offering: employee number, name and grade.
In the PDBR structure shown, we have five types of segments: COURSE, PREREQ, OFFERING, TEACHER and STUDENT, each one consisting of the field types indicated.
COURSE is the root segment type and the others are dependent segment types. Each dependent has a parent; for example, the parent of TEACHER is OFFERING. Similarly each parent has at least one child; for example, COURSE has two children. For one occurrence of any given segment type there may be any number of occurrences of each of its child segment types.
COURSE:   M23 Dynamics …
PREREQ:   M16 Trigonometry; M19 Calculus
OFFERING: 730813 Madrid F3; 750106 Oslo F2; 751104 Dublin F3
TEACHER:  421633 Sharp, R.
STUDENT:  102141 Byrd, W. (B); 183009 Gibbons, O. (A); 761620 Tallis, T. (B)
Fig: Sample PDBR occurrence for the education database.
The database Description
Each physical database is defined, together with its mapping to storage, by a database description (DBD). The source form of the DBD is written using special System/370 Assembler language macro statements. Once written, the DBD is assembled and the object form is stored away in a system library, from which it may be extracted when required by the IMS control program. The following is the DBD for the education database.
 1  DBD    NAME=EDUCPDBD
 2  SEGM   NAME=COURSE,BYTES=256
 3  FIELD  NAME=(COURSE#,SEQ),BYTES=3,START=1
 4  FIELD  NAME=TITLE,BYTES=33,START=4
 5  FIELD  NAME=DESCRIPN,BYTES=220,START=37
 6  SEGM   NAME=PREREQ,PARENT=COURSE,BYTES=36
 7  FIELD  NAME=(COURSE#,SEQ),BYTES=3,START=1
 8  FIELD  NAME=TITLE,BYTES=33,START=4
 9  SEGM   NAME=OFFERING,PARENT=COURSE,BYTES=20
10  FIELD  NAME=(DATE,SEQ,M),BYTES=6,START=1
11  FIELD  NAME=LOCATION,BYTES=12,START=7
12  FIELD  NAME=FORMAT,BYTES=2,START=19
13  SEGM   NAME=TEACHER,PARENT=OFFERING,BYTES=24
14  FIELD  NAME=(EMP#,SEQ),BYTES=6,START=1
15  FIELD  NAME=NAME,BYTES=18,START=7
16  SEGM   NAME=STUDENT,PARENT=OFFERING,BYTES=25
17  FIELD  NAME=(EMP#,SEQ),BYTES=6,START=1
18  FIELD  NAME=NAME,BYTES=18,START=7
19  FIELD  NAME=GRADE,BYTES=1,START=25
FIG: DBD for the education PDB.
Explanation
Statement 1: Assigns the name EDUCPDBD (“education physical database description”) to the DBD. All names in IMS are limited to a maximum length of eight characters.
Statement 2: Defines the root segment type, with the name COURSE and a total length of 256 bytes.
Statements 3-5: Define the field types that go to make up COURSE. Each is given a name, a length in bytes, and a start position within the segment. The first field, COURSE#, is defined to be the sequence field for the segment, so the PDBR occurrences will be sequenced in ascending course number order.
Statement 6: Defines PREREQ as a 36-byte segment that is dependent on COURSE.
Statements 7-8: Define the fields of PREREQ.
Statement 9: Defines OFFERING as a child of COURSE.
Statements 10-12: Define the fields of OFFERING. DATE is defined as the sequence field for OFFERING. The specification M (multiple) means that twin OFFERING occurrences may contain the same date value.
Statements 13-15: Define the TEACHER segment and its fields.
Statements 16-19: Define the STUDENT segment and its fields.
The sequence of statements in the DBD is significant. Specifically, SEGM statements must appear in the sequence that reflects the hierarchical structure, and each SEGM statement must be immediately followed by the appropriate FIELD statements.
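Since every FIELD statement gives a length and a 1-based start position, it is easy to check that the fields of a segment fit inside the BYTES declared for it. The sketch below is only an illustration in Python and is not an IMS facility. The function `layout_errors` is a hypothetical helper, and the field lengths assume that DATE and EMP# are 6 bytes each, which agrees with the sample values 730813 and 102141.

```python
# segment name -> (segment BYTES, [(field name, BYTES, START), ...])
DBD = {
    "COURSE":   (256, [("COURSE#", 3, 1), ("TITLE", 33, 4), ("DESCRIPN", 220, 37)]),
    "PREREQ":   (36,  [("COURSE#", 3, 1), ("TITLE", 33, 4)]),
    "OFFERING": (20,  [("DATE", 6, 1), ("LOCATION", 12, 7), ("FORMAT", 2, 19)]),
    "TEACHER":  (24,  [("EMP#", 6, 1), ("NAME", 18, 7)]),
    "STUDENT":  (25,  [("EMP#", 6, 1), ("NAME", 18, 7), ("GRADE", 1, 25)]),
}

def layout_errors(dbd):
    """Report every field whose last byte lies beyond the end of its segment."""
    errors = []
    for seg, (seg_bytes, fields) in dbd.items():
        for name, length, start in fields:
            last_byte = start + length - 1      # START is 1-based
            if last_byte > seg_bytes:
                errors.append(f"{seg}.{name} ends at byte {last_byte} > {seg_bytes}")
    return errors

print(layout_errors(DBD))  # []
```

A 12-byte LOCATION field starting at byte 19 of the 20-byte OFFERING segment, for instance, would be reported as running past the end of the segment.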
Hierarchical Sequence
The concept of hierarchical sequence within a database is a very important one in IMS. The definition is as follows:
For each segment occurrence, we define the “hierarchical sequence key value” to consist of the sequence field value for that segment, prefixed with the type code for that segment, prefixed with the hierarchical sequence key value of its parent, if any. For example, the hierarchical sequence key value for the STUDENT occurrence for “Byrd,W.” is
1M2337308135102141
Here 1 is the type code for COURSE, M23 the course#, 3 is the type code of OFFERING, 730813 is the DATE of OFFERING, 5 is the type code of STUDENT, 102141 is the EMP# of STUDENT.
The hierarchical sequence for an IMS database is then the sequence of segment occurrences defined by ascending values of the hierarchical sequence key. This notion is important because IMS databases are stored in hierarchical sequence.
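The definition above can be tried out directly. The following Python sketch is illustrative only. The type codes 1, 3 and 5 come from the example, while the codes 2 for PREREQ and 4 for TEACHER are an assumption based on the order of the SEGM statements in the DBD. The function name is hypothetical.

```python
# Type codes; 2 and 4 are assumed from the order of segments in the DBD.
TYPE_CODE = {"COURSE": "1", "PREREQ": "2", "OFFERING": "3",
             "TEACHER": "4", "STUDENT": "5"}

def hierarchical_sequence_key(path):
    """path: list of (segment type, sequence field value), root first."""
    return "".join(TYPE_CODE[seg] + value for seg, value in path)

byrd = [("COURSE", "M23"), ("OFFERING", "730813"), ("STUDENT", "102141")]
print(hierarchical_sequence_key(byrd))  # 1M2337308135102141

# Sorting occurrences by this key puts them in hierarchical sequence.
occurrences = [
    [("COURSE", "M23"), ("OFFERING", "750106")],
    [("COURSE", "M23"), ("OFFERING", "730813"), ("STUDENT", "102141")],
    [("COURSE", "M23"), ("PREREQ", "M19")],
    [("COURSE", "M23")],
    [("COURSE", "M23"), ("OFFERING", "730813")],
    [("COURSE", "M23"), ("PREREQ", "M16")],
]
in_sequence = sorted(occurrences, key=hierarchical_sequence_key)
```

After sorting, the course comes first, then its two prerequisites, then the 730813 offering followed by its student, and finally the 750106 offering: a parent always precedes its dependents.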
External Level OF IMS
Logical databases: In the architecture, the user’s external view is defined as a subset of the corresponding physical database. A logical database (LDB) is an ordered set, the elements of which consist of all occurrences of one type of logical database record (LDBR). An LDBR type is a hierarchical arrangement of segment types, and is derived from the corresponding PDBR hierarchy in accordance with the following rules.
Any segment type of the PDBR hierarchy together with all its dependents can be omitted from the LDBR hierarchy
The fields of an LDBR segment type can be a subset of those of the corresponding PDBR segment type, and can be rearranged within that LDBR segment type.
Example:
COURSE (Course#, Title, Description)
    OFFERING (Date, Location, Format)
        STUDENT (Emp#, Name, Grade)
Fig: Sample LDBR type for the education database.
Sensitive Segments:
The segments that are present in the PDB and are included in the LDB are said to be sensitive segments. In the above example COURSE, OFFERING and STUDENT are sensitive segments. The user of this LDB will not be aware of the existence of any other segments.
For example, the DL/I “get next” operation, which in general is used for sequential retrieval, will simply skip over any segments that are not sensitive for the user. If the user deletes a sensitive segment, all children of that segment will be deleted regardless of whether they are sensitive. So a user should not be given the authority to delete a segment if doing so would also delete segments hidden from that user.
The sensitive-segment concept also protects the user from certain types of growth in the database: a new segment type can be added to the PDB without affecting existing users, provided that the addition does not affect any existing parent-child relationship.
The sensitive-segment concept also provides a degree of control over data security, inasmuch as users can be prevented from accessing particular segment types by the omission of those segments from the LDB.
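The way “get next” skips non-sensitive segments can be pictured with a short Python sketch. This is only a model of the idea, not of IMS itself. The function `get_next` is a hypothetical name, and placing the teacher and all three students under the 730813 offering is an assumption made for illustration.

```python
# One PDBR occurrence listed in hierarchical sequence: (segment type, key).
PDBR = [
    ("COURSE", "M23"),
    ("PREREQ", "M16"), ("PREREQ", "M19"),
    ("OFFERING", "730813"),
    ("TEACHER", "421633"),
    ("STUDENT", "102141"), ("STUDENT", "183009"), ("STUDENT", "761620"),
]

# The sensitive segments of the sample LDB.
SENSITIVE = {"COURSE", "OFFERING", "STUDENT"}

def get_next(segments, sensitive):
    """Yield segments in hierarchical sequence, skipping non-sensitive types."""
    for seg_type, key in segments:
        if seg_type in sensitive:
            yield seg_type, key

visible = list(get_next(PDBR, SENSITIVE))
print(visible)
```

The user of this LDB sees the course, the offering and the three students; the PREREQ and TEACHER occurrences are passed over as if they did not exist.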
Sensitive fields
Sensitive fields are those fields of the PDB that are included in the LDB. Every sensitive field must be contained within a sensitive segment. In general, a given LDB may include or exclude any combination of fields from the PDB, except that if the program intends to insert new occurrences of a given segment type, then it must be “sensitive to” the sequence field for that segment type.
Field sensitivity, like segment sensitivity, protects the user from certain types of growth in the database and provides a simple level of data security.
The program communication block (PCB)
Each LDB is defined by a PCB. The PCB includes the specification of the mapping between the LDB and the corresponding PDB. Like the DBD (database description), a PCB is written using special System/370 Assembler language macro statements. These statements constitute the “external DDL” for IMS. The set of all PCBs for a given user forms that user’s program specification block (PSB); the object form of the PSB is stored in a system library, from which it may be extracted when required by the IMS control program.
Example:
1  PCB     TYPE=DB,DBDNAME=EDUCPDBD,KEYLEN=15
2  SENSEG  NAME=COURSE,PROCOPT=G
3  SENSEG  NAME=OFFERING,PARENT=COURSE,PROCOPT=G
4  SENSEG  NAME=STUDENT,PARENT=OFFERING,PROCOPT=G
Fig: PCB for the LDB
Explanation
Statement 1: Specifies that this is a database PCB, that the underlying DBD is named EDUCPDBD, and that the key feedback area is 15 bytes long.
Key Feedback: When the user accesses an LDB, the corresponding PCB is held in storage and acts as a communication area between the user’s program and IMS. One of the fields in the PCB is the key feedback area. When the user retrieves a segment from the LDB, IMS not only fetches the requested segment but also places a “fully concatenated key” into the key feedback area. The fully concatenated key consists of the concatenation of the sequence field values of all segments in the hierarchical path from the root down to the retrieved segment.
For example, if the user retrieves the STUDENT occurrence for Byrd, W., IMS will place the value M23730813102141 in the key feedback area. The fully concatenated key of a segment is not quite the same as the “hierarchical sequence key”, as it does not include segment type code information.
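The fully concatenated key, and the reason a KEYLEN of 15 is enough for this LDB, can be sketched in Python as follows. The function name and the `SEQ_FIELD_BYTES` table are hypothetical; the lengths assume a 3-byte COURSE#, a 6-byte DATE and a 6-byte EMP#, in agreement with the sample values.

```python
def fully_concatenated_key(path):
    """Concatenate the sequence field values from the root down (no type codes)."""
    return "".join(value for _segment, value in path)

# Assumed sequence-field lengths of the three sensitive segments.
SEQ_FIELD_BYTES = {"COURSE": 3, "OFFERING": 6, "STUDENT": 6}

byrd = [("COURSE", "M23"), ("OFFERING", "730813"), ("STUDENT", "102141")]
key = fully_concatenated_key(byrd)
longest_key = sum(SEQ_FIELD_BYTES.values())

print(key, len(key), longest_key)  # M23730813102141 15 15
```

The longest hierarchical path in the LDB runs from COURSE through OFFERING to STUDENT, so the key feedback area never needs more than 3 + 6 + 6 = 15 bytes.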
Statement 2: Specifies the first sensitive segment in the LDB. The name of the sensitive segment must be the same as the name assigned to the segment in the DBD.
The PROCOPT (“processing options”) entry specifies the types of operation that the user will be permitted to perform on this segment. In this example the entry is G (“get”), indicating retrieval only. Other options are I (“insert”), R (“replace”) and D (“delete”).
Statement 3: Defines the next sensitive segment in the LDB.
Statement 4: Defines the last sensitive segment.
In our example statements 3 and 4 are very similar. The PROCOPT entry is the same for each of the three sensitive segments. In such a situation we may specify PROCOPT in the PCB statement instead of in each SENSEG statement.
If PROCOPT=K is specified in the SENSEG statement for OFFERING, the user may largely ignore the presence of OFFERINGs in the hierarchy. The output for this modification is shown as follows.
COURSE (Course#, Title, Description)
    STUDENT (Emp#, Name, Grade)
Fig: Effect of specifying PROCOPT=K for OFFERING
The main difference is that when a STUDENT occurrence is retrieved, the fully concatenated key in the key feedback area will include the date value from the parent OFFERING.
The LDB defined by the PCB example above is sensitive to all fields in segments COURSE, OFFERING and STUDENT of the underlying PDB. Suppose we wish to exclude the LOCATION field of the OFFERING segment from the LDB while still remaining sensitive to all other fields. This is done with SENFLD statements, as shown here:
SENFLD NAME=FORMAT,START=1
SENFLD NAME=DATE,START=3
These statements specify the fields to be included in the LDB segment and their start position within that segment. If no SENFLD statement is given for a particular SENSEG statement, then by default that segment is taken to be identical to the underlying PDB segment.
IMS Data Manipulation
Defining the Program Communication Block (PCB)
The IMS data manipulation language (DL/I) is invoked from the host language (PL/I) by means of ordinary subroutine calls. When an application program is operating on a particular logical database (LDB), the PCB for that LDB is kept in storage to serve as a communication area between the program and IMS; in fact, when the program calls DL/I, it has to quote the storage address of the appropriate PCB to identify to DL/I which LDB it is to operate on.
The PCB address is supplied to the program by IMS when the program is first entered. What actually happens is this. When a database application is to be run, IMS is given control first. IMS determines which PSB and DBD(s) are required, fetches them from their respective libraries and loads them into storage. IMS then fetches the application program and gives it control, passing it the PCB address as a parameter.
In order for the application program to be able to access the information in the PCB for a particular LDB, it must contain a definition of that PCB.
DLITPLI: PROCEDURE (COSPCB_ADDR) OPTIONS (MAIN);
   ...
   DECLARE 1 COSPCB BASED (COSPCB_ADDR),
             2 DBDNAME   CHARACTER (8),
             2 SEGLEVEL  CHARACTER (2),
             2 STATUS    CHARACTER (2),
             2 PROCOPT   CHARACTER (4),
             2 RESERVED  FIXED BINARY (31),
             2 SEGNAME   CHARACTER (8),
             2 KEYFBLEN  FIXED BINARY (31),
             2 #SENSEGS  FIXED BINARY (31),
             2 KEYFBAREA CHARACTER (15);
Fig A: Example of program entry and PCB definition (PL/I).
Explanation:
The PROCEDURE statement (labeled DLITPLI) is the program entry point. The expression in parentheses following the keyword PROCEDURE represents the parameter to be passed to the program by IMS; it consists of a pointer giving the address of the PCB. The rest of Fig A consists of a DECLARE statement that defines a structure to represent the single PCB used in the application.
The field DBDNAME contains the name of the underlying DBD throughout the execution of the program.
The SEGLEVEL field is set after the DL/I operation to contain the segment level number of the segment just accessed.
The STATUS field is the most important field in the PCB. After each DL/I call, a two-character value is placed in this field to indicate the success or otherwise of the requested operation. A blank value indicates that the operation was completed satisfactorily; any other value represents an exceptional or error condition.
The PROCOPT field contains the PROCOPT value as specified in the PCB statement when the PCB was originally defined.
The SEGNAME field contains the name of the segment last accessed.
The KEYFBLEN field contains the length of the fully concatenated key.
The #SENSEGS field contains a count of the number of sensitive segments.
The KEYFBAREA field is the key feedback area; it contains the fully concatenated key.
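To make the layout of the PCB concrete, the sketch below unpacks a PCB image with Python's standard `struct` module. It is only an illustration under stated assumptions: CHARACTER(n) is taken as n bytes, FIXED BINARY(31) as a 4-byte big-endian integer with no alignment padding, and the text is ASCII (a real IMS system uses EBCDIC). The names `parse_pcb` and `call_succeeded` are hypothetical, and #SENSEGS is written NSENSEGS.

```python
import struct

# Field order and sizes follow the PL/I declaration of COSPCB.
PCB_LAYOUT = struct.Struct(">8s2s2s4si8sii15s")
FIELDS = ("DBDNAME", "SEGLEVEL", "STATUS", "PROCOPT", "RESERVED",
          "SEGNAME", "KEYFBLEN", "NSENSEGS", "KEYFBAREA")

def parse_pcb(raw, encoding="ascii"):
    """Turn a raw PCB image into a dict of field name -> value."""
    pcb = {}
    for name, value in zip(FIELDS, PCB_LAYOUT.unpack(raw)):
        pcb[name] = value.decode(encoding) if isinstance(value, bytes) else value
    return pcb

def call_succeeded(pcb):
    """A blank STATUS means the DL/I call completed satisfactorily."""
    return pcb["STATUS"].strip() == ""

# A made-up PCB image as it might look after retrieving the STUDENT Byrd, W.
raw = PCB_LAYOUT.pack(b"EDUCPDBD", b"03", b"  ", b"G   ", 0,
                      b"STUDENT ", 15, 3, b"M23730813102141")
pcb = parse_pcb(raw)
print(call_succeeded(pcb), pcb["SEGNAME"], pcb["KEYFBAREA"])
```

A program would test the status in this way after every DL/I call before using the segment returned.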
DL/I Examples
Get unique (GU)                  Direct retrieval
Get next (GN)                    Sequential retrieval
Get next within parent (GNP)     Sequential retrieval under current parent
Get hold (GHU, GHN, GHNP)        Allows subsequent DLET/REPL
Insert (ISRT)                    Add new segment occurrence
Delete (DLET)                    Delete existing segment occurrence
Replace (REPL)                   Replace existing segment occurrence
Tab: DL/I Operations
Direct retrieval: Get the first OFFERING occurrence where the location is Stockholm.
    GU    COURSE
          OFFERING (LOCATION='STOCKHOLM')
Sequential retrieval with an SSA (segment search argument): Get all STUDENT occurrences in the LDB, starting with the first student for the first offering in Stockholm.
    GU    COURSE
          OFFERING (LOCATION='STOCKHOLM')
          STUDENT
NS  GN    STUDENT
    GOTO  NS
Sequential retrieval with an SSA within a parent: Get all students for the offering on 13 August 1973 of course M23.
    GU    COURSE (COURSE#='M23')
          OFFERING (DATE='730813')
NP  GNP   STUDENT
    GOTO  NP
Segment occurrence insertion: Add a new STUDENT occurrence for the offering on 13 August 1973 of course M23.
    ISRT  COURSE (COURSE#='M23')
          OFFERING (DATE='730813')
          STUDENT
Segment deletion: Delete the offering of course M23 on 13 August 1973.
    GHU   COURSE (COURSE#='M23')
          OFFERING (DATE='730813')
    DLET
Segment replacement: Change the location of the 13 August 1973 offering of course M23 to Helsinki.
    GHU   COURSE (COURSE#='M23')
          OFFERING (DATE='730813')
    REPL
Between the GHU and the REPL, the program changes the LOCATION value to HELSINKI in its copy of the retrieved segment.
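The retrieval calls can be imitated on a small in-memory hierarchy. The Python sketch below is a simplified model, not DL/I itself: `Segment`, `gu` and `gnp` are hypothetical names, the conditions only test for equality, and `gnp` looks only at the direct children of the parent. Placing the two students under the 730813 offering is an assumption for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Segment:
    type: str
    fields: dict
    children: list = field(default_factory=list)   # kept in hierarchical sequence

def gu(candidates, path):
    """Get unique: return the first segment reached by a path of
    (segment type, {field: value}) pairs, or None if there is none."""
    (seg_type, cond), rest = path[0], path[1:]
    for seg in candidates:
        if seg.type == seg_type and all(seg.fields.get(k) == v for k, v in cond.items()):
            if not rest:
                return seg
            hit = gu(seg.children, rest)
            if hit is not None:
                return hit
    return None

def gnp(parent, seg_type):
    """Get next within parent: the children of the given type, in order."""
    return [child for child in parent.children if child.type == seg_type]

database = [
    Segment("COURSE", {"COURSE#": "M23", "TITLE": "Dynamics"}, [
        Segment("OFFERING", {"DATE": "730813", "LOCATION": "MADRID"}, [
            Segment("STUDENT", {"EMP#": "102141", "NAME": "Byrd, W."}),
            Segment("STUDENT", {"EMP#": "183009", "NAME": "Gibbons, O."}),
        ]),
        Segment("OFFERING", {"DATE": "750106", "LOCATION": "OSLO"}),
    ]),
]

# GU COURSE (COURSE#='M23') OFFERING (DATE='730813'), then GNP STUDENT.
offering = gu(database, [("COURSE", {"COURSE#": "M23"}), ("OFFERING", {"DATE": "730813"})])
students = [s.fields["NAME"] for s in gnp(offering, "STUDENT")]

# GU COURSE OFFERING (LOCATION='STOCKHOLM'): nothing qualifies in this data.
stockholm = gu(database, [("COURSE", {}), ("OFFERING", {"LOCATION": "STOCKHOLM"})])
```

Here `students` holds the two student names, and `stockholm` is `None`, which corresponds to IMS returning a non-blank STATUS value.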
Questions:
1. Explain physical and logical database of hierarchical approach with example.
2. Explain DataBase Description (DBD) with example.
3. Explain Hierarchical sequence key value.
4. Explain Program communication block (PCB).
5. Discuss DL/I operations with some examples.
STUDY MATERIAL
Course: B.Com CA
Subject: Data base management system
Semester: III
Unit: Five
UNIT-V
Syllabus
Network approach: Architecture of DBTG system. DBTG data structure: The set construct, singular sets, sample schema, and the external level of DBTG-DBTG Data manipulation
Books for reference:
1: Database System Concepts - Abraham Silberschatz and Henry F. Korth
2: An Introduction to Database Systems - C. J. Date
Basic concepts:
A network database consists of a collection of records, which are connected to one another through links. A record is in many respects similar to an entity in the entity-relationship model. Each record is a collection of fields (attributes), each of which contains only one value. A link can be viewed as a restricted (binary) form of relationship in the sense of the E-R model.
To illustrate, consider a database representing a customer-account relationship in a banking system. There are two record types, customer and account. As we saw earlier, the customer record type can be defined, using Pascal-like notation, as follows:
type customer = record
    name: string;
    street: string;
    city: string;
end
The account record type can be defined as follows:
type account = record
    number: integer;
    balance: integer;
end
The sample database in Fig:1 shows that Lowman has account 305, Camp has accounts 226 and 177, and Kahn has account 155.
customer (name, street, city)     account (number, balance)
Lowman   Square      Dallas       305   500
Camp     Downridge   Garland      226   336
                                  177   205
Kahn                              155   62
Fig:1 Sample database
Data-structure diagrams: [Architecture of network model]
A data-structure diagram is a schema representing the design of a network database. Such a diagram consists of two basic components:
*Boxes, which correspond to record types.
*Lines, which correspond to links.
A data-structure diagram serves the same purpose as an entity-relationship diagram; namely, it specifies the overall logical structure of the database. We shall consider the representation of binary, ternary etc. relationships of entity-relationship diagrams.
Binary relationship
The entity-relationship diagram for banking example is shown as follows:
(a) E-R diagram: entity sets customer (name, street, city) and account (number, balance), related by CustAcct
(b) Data-structure diagram: record types customer (name, street, city) and account (number, balance), connected by the link CustAcct
FIG:2
Diagram (a) above is the entity-relationship diagram. It consists of two entity sets, customer and account, related through a binary many-to-many relationship CustAcct with no descriptive attributes.
The diagram shows that a customer may have several accounts and that an account may belong to several different customers. The corresponding data-structure diagram is shown in figure (b). Here the record type customer corresponds to the entity set customer. It includes three fields: name, street and city.
Similarly, account is the record type corresponding to the account entity set and includes the fields number and balance. Since the CustAcct relationship in the E-R diagram above is many-to-many, we draw no arrows on the link CustAcct in the data-structure diagram. If the relationship CustAcct were one-to-many from customer to account, then the link CustAcct would have an arrow pointing to the customer record type. The two representations are shown as follows:
(a) customer and account joined by a link with no arrows (many-to-many)
(b) customer and account joined by a link with an arrow pointing to customer (one-to-many)
FIG:3
A sample database corresponding to the data-structure diagram of FIG:3a is shown here. Since the relationship is many-to-many, we show that Katz has accounts 256 and 347 and that account 347 is owned by both Katz and Doner.
customer (name, street, city)        account (number, balance)
Beck    Maple      San Francisco     200   55
Katz    North      San Jose          256   100 000
                                     347   667
Doner   Sidehill   Palo Alto         301   10 533
Fig:4 Sample database corresponding to the diagram of FIG:3a
If the relationship is one-to-many from customer to account, a customer may have more than one account, as is the case with Camp, who owns both 226 and 177. An account, however, cannot belong to more than one customer, as is indeed observed in the sample database. A sample database corresponding to the data-structure diagram of FIG:3b is shown in FIG:1.
How is the E-R diagram of FIG:2a transformed if the relationship has a descriptive attribute?
The transformation is more complicated, because a link cannot contain any data value. So a new record type has to be created and links established as follows.
Suppose, for example, that we add the descriptive attribute date to the custacct relationship of the E-R diagram in FIG:2a, to denote the last time the customer accessed the account. The resulting E-R diagram has the entity sets customer and account related by custacct, which now carries the attribute date.
To transform this diagram to a data-structure diagram we need to:
1: Replace entities customer and account with record types customer and account.
2: Create a new record type date with a single field to represent the date.
3: Create the following many-to-one links:
*custdate from the date record type to the customer record type
*acctdate from the date record type to the account record type
The DBTG CODASYL Model
The Database Task Group (DBTG) wrote the first database standard specification, called the CODASYL DBTG 1971 report, in the late 1960s. A number of changes to that report have since been suggested, the last official one in 1978. The rules or standards laid down by the DBTG concern:
Link restriction
DBTG sets
Repeating groups
Link Restriction
In the DBTG model, only many-to-one links can be used. Many-to-many links are disallowed in order to simplify the implementation. One-to-one links are represented using a many-to-one link. Let us illustrate this with the help of an example:
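Consider a binary relationship that is either one-to-many or one-to-one. For our customer-account database, if the custacct relationship is one-to-many, the data-structure diagrams without and with a descriptive attribute are as shown in the following figure:
(a) customer (name, street, city) and account (number, balance), joined by the link custacct
(b) customer (name, street, city) and account (number, balance), each linked to a date record type
Fig: Two data-structure diagrams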
If the custacct relationship is many-to-many then our transformation algorithm must be refined as follows. If the relationships have no descriptive attributes then the following algorithm must be employed:
1: Replace the entity sets customer and account with record types customer and account.
2: Create a new dummy record type rlink that may either have no fields or have a single field containing an externally defined unique identifier.
3: Create the following two many-to-one links:
*custrlink from the rlink record type to the customer record type
*acctlink from the rlink record type to the account record type
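The effect of the rlink algorithm can be seen in a small Python sketch. It is an illustration only; the function names are hypothetical, and the pairs are those of the many-to-many sample in which Katz has accounts 256 and 347 and account 347 is shared with Doner.

```python
from itertools import count

def to_many_to_one(pairs):
    """Replace many-to-many (customer, account) pairs by rlink records.
    Returns two dicts, each mapping an rlink identifier to its single owner."""
    ids = count(1)
    custrlink, acctlink = {}, {}
    for customer, account in pairs:
        rlink = next(ids)             # stands in for the unique identifier field
        custrlink[rlink] = customer   # many rlink records -> one customer
        acctlink[rlink] = account     # many rlink records -> one account
    return custrlink, acctlink

pairs = [("Katz", 256), ("Katz", 347), ("Doner", 347)]
custrlink, acctlink = to_many_to_one(pairs)

def accounts_of(customer):
    return sorted(acctlink[r] for r, c in custrlink.items() if c == customer)

def owners_of(account):
    return sorted(custrlink[r] for r, a in acctlink.items() if a == account)

print(accounts_of("Katz"), owners_of(347))  # [256, 347] ['Doner', 'Katz']
```

Each rlink record has exactly one customer and exactly one account, so both links are many-to-one, yet the original many-to-many information can still be recovered by following both links.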
DBTG sets
Given that only many-to-one links can be used in the DBTG model, a data-structure diagram consisting of two record types that are linked together has the general form of the following figure:
record type A (owner) connected by a many-to-one link to record type B (member)
Fig:A
The structure shown above is referred to in the DBTG model as a DBTG-set. The name of the set is usually chosen to be the same as the name of the link connecting the two record types.
In each such DBTG-set, the record type A is designated the owner (or parent) of the set, and the record type B is designated the member (or child) of the set. Each DBTG-set can have any number of set occurrences, that is, actual instances of linked records.
For example, the figure shows three set occurrences corresponding to the DBTG-set of figure A.
Since many-to-many links are disallowed, each set occurrence has precisely one owner and zero or more member records. In addition, no member record of a set can participate in more than one occurrence of that set at any point. A member record can, however, participate simultaneously in several set occurrences of different DBTG-sets.
To illustrate, consider the data-structure diagram shown here. There are two DBTG-sets:
*custacct, having customer as the owner of the DBTG-set and account as the member of the DBTG-set
*brncacct, having branch as the owner of the DBTG-set and account as the member of the DBTG-set
The set custacct may be defined as follows:
set name is custacct
    owner is customer
    member is account
The set brncacct may be defined similarly as
set name is brncacct
    owner is branch
    member is account
An instance of the database is shown here:
Five set occurrences are shown: three of set custacct and two of set brncacct.
1: Owner is customer record Lowman, with a single member account record 305.
2: Owner is customer record Camp, with two member account records 177 and 226.
3: Owner is customer record Kahn, with three member account records 155, 402 and 408.
4: Owner is branch record Hillside, with three member account records 305, 226 and 155.
5: Owner is branch record Valleyview, with three member account records 177, 402 and 408.
Note that an account record cannot appear in more than one set occurrence of one individual set type. This is because an account can belong to exactly one customer, and can be associated with only one bank branch. An account can, however, appear in two set occurrences of different set types. For example, account 305 is a member of set occurrence 1 of type custacct and is also a member of set occurrence 4 of type brncacct.
The member records of a set occurrence may be ordered in a variety of ways.
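The rule that a member record belongs to at most one occurrence of a given set type can be checked with a few lines of Python. The sketch below uses the five set occurrences listed above; the function `violations` is a hypothetical helper written for this illustration.

```python
# set type -> owner -> member account numbers (the five occurrences above)
OCCURRENCES = {
    "custacct": {"Lowman": [305], "Camp": [177, 226], "Kahn": [155, 402, 408]},
    "brncacct": {"Hillside": [305, 226, 155], "Valleyview": [177, 402, 408]},
}

def violations(occurrences):
    """List members that appear in two occurrences of the same set type."""
    bad = []
    for set_type, by_owner in occurrences.items():
        owner_of = {}
        for owner, members in by_owner.items():
            for member in members:
                if member in owner_of:
                    bad.append((set_type, member, owner_of[member], owner))
                else:
                    owner_of[member] = owner
    return bad

print(violations(OCCURRENCES))  # []
```

Account 305 appears twice in the data, once under Lowman in custacct and once under Hillside in brncacct, but this is allowed because the two occurrences are of different set types.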
Repeating Groups:
The DBTG model provides a mechanism for a field to have a set of values, rather than one single value.
For example, suppose that a customer has several addresses. In this case, the customer record type will have the (street, city) pair of fields defined as a repeating group. The customer record for Kahn is shown here:
The repeating-group construct is another way of representing the notion of weak entities in the E-R model. To illustrate, we shall split the entity set customer into two sets:
*customer, with descriptive attribute name
*address, with descriptive attributes street and city
The address entity set is a weak entity set, since it depends on the strong entity set customer.
DBTG data retrieval facility
The data manipulation language of the DBTG proposal consists of a number of commands that are embedded in a host language. The commands are explained as follows:
The Find and Get commands
The two most frequently used DBTG commands are
*find, which locates a record in the database and sets the appropriate currency pointers
*get, which copies the record to which the current of run-unit points from the database to the appropriate program work area template
Access of individual records:
The find command has a number of forms. There are two different find commands for locating individual records in the database. The simplest command has the form:
Find any <record type> using <record-field>
Purpose: Locates a record of type <record type> whose <record-field> value is the same as the value of <record-field> in the <record type> template in the program work area. The following currency pointers are set to point to that record:
*The current of run-unit pointer
*The record-type currency pointer for <record type>
*For each set to which that record belongs, the appropriate set currency pointer
For example: Construct the DBTG query that prints the street address of Lowman.
customer.name := "Lowman";
find any customer using name;
get customer;
print (customer.street);
To display the duplicate records the command is
Find duplicate <record type> using <record-field>
Which locates the next record, which matches the <record-field>.
Example: Construct the DBTG query that prints the names of all the customers who live in Dallas:
customer.city := "Dallas";
find any customer using city;
while DB-status = 0 do
begin
    get customer;
    print (customer.name);
    find duplicate customer using city;
end;
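The interplay of find any, find duplicate and DB-status can be imitated in Python. The class below is a toy model and not a DBTG implementation. `RunUnit` and its method names are hypothetical, the non-zero status value 1 is an arbitrary choice, and the third customer (Smith) is invented so that two records match.

```python
class RunUnit:
    """Toy run unit over one record type; DB-status 0 means the find succeeded."""

    def __init__(self, records):
        self.records = records
        self.current = None          # index of the current record
        self.db_status = 0

    def _find_from(self, start, field, value):
        for i in range(start, len(self.records)):
            if self.records[i][field] == value:
                self.current, self.db_status = i, 0
                return
        self.db_status = 1           # any non-zero value: no record found

    def find_any(self, field, value):
        self._find_from(0, field, value)

    def find_duplicate(self, field):
        # Next record whose field matches that of the current record.
        value = self.records[self.current][field]
        self._find_from(self.current + 1, field, value)

    def get(self):
        return self.records[self.current]

customers = [
    {"name": "Lowman", "street": "Square", "city": "Dallas"},
    {"name": "Camp", "street": "Downridge", "city": "Garland"},
    {"name": "Smith", "street": "Elm", "city": "Dallas"},   # hypothetical record
]

run_unit = RunUnit(customers)
names = []
run_unit.find_any("city", "Dallas")
while run_unit.db_status == 0:
    names.append(run_unit.get()["name"])
    run_unit.find_duplicate("city")

print(names)  # ['Lowman', 'Smith']
```

The loop ends when find duplicate runs off the end of the records and sets a non-zero status, exactly as the while condition in the DBTG program expects.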
Access of records within a set
Purpose: Locate records in a particular DBTG-set.
There are three different types of commands.
The basic find command is
Find first <record type> within <set-type>
Which locates the first database record of type <record type> belonging to the current <set-type>.
To locate the other members of a set the command is
Find next <record-type> within <set-type>
This command finds the next element in the set <set-type>.
Example: Construct the DBTG query that prints the total balance of all accounts belonging to Lowman.
sum := 0;
customer.name := "Lowman";
find any customer using name;
find first account within custacct;
while DB-status = 0 do
begin
    get account;
    sum := sum + account.balance;
    find next account within custacct;
end
print (sum);
To find the owner of a particular DBTG-set occurrence, the command used is
Find owner within <set-type>
Example: Construct the DBTG query that prints all the customers of the Hillside branch:
branch.name := "Hillside";
find any branch using name;
find first account within brncacct;
while DB-status = 0 do
begin
    find owner within custacct;
    get customer;
    print (customer.name);
    find next account within brncacct;
end
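Both set-oriented queries above can be followed on the sample data with a short Python sketch. This is an illustration under simplifying assumptions: a set type is modelled as a dictionary from owner to member accounts, and the helpers `members` and `owner_within` are hypothetical stand-ins for find first / find next and find owner. Balances are given only for the four accounts whose balances appear in the sample database figure.

```python
CUSTACCT = {"Lowman": [305], "Camp": [226, 177], "Kahn": [155, 402, 408]}
BRNCACCT = {"Hillside": [305, 226, 155], "Valleyview": [177, 402, 408]}
BALANCE = {305: 500, 226: 336, 177: 205, 155: 62}

def members(set_occurrences, owner):
    """find first / find next <record type> within <set-type>, as one loop."""
    yield from set_occurrences.get(owner, [])

def owner_within(set_occurrences, member):
    """find owner within <set-type>."""
    for owner, accounts in set_occurrences.items():
        if member in accounts:
            return owner
    return None

# Total balance of all accounts belonging to Lowman.
total = sum(BALANCE[a] for a in members(CUSTACCT, "Lowman"))

# All customers of the Hillside branch.
hillside_customers = [owner_within(CUSTACCT, a) for a in members(BRNCACCT, "Hillside")]

print(total, hillside_customers)  # 500 ['Lowman', 'Camp', 'Kahn']
```

The second query walks one set type (brncacct) and, for each member found, climbs to the owner in the other set type (custacct), which is possible because an account may be a member of both.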
DBTG update facility
Creating new records
To create a new record of type <record type> we insert the appropriate values in the corresponding <record type> template. And the command used is
Store <record type>
Example: Construct the DBTG query to add a new customer Jackson to the database.
customer.name := "Jackson";
customer.street := "Old road";
customer.city := "Richardson";
store customer;
Modifying an existing record
In order to modify an existing record of type <record type> we must find the record in the database, get that record into the memory, and then change the desired fields in the template of <record type>. Once this is accomplished, we reflect the changes to the record to which the currency pointer of <record type> points by executing the command:
Modify <record type>
The DBTG model requires that the find command executed prior to modifying a record have the additional clause “for update”, so that the system is aware that the record is to be modified.
Example: Construct the DBTG program to change the street address of Kahn to North Loop.
customer.name := "Kahn";
find for update any customer using name;
get customer;
customer.street := "North Loop";
modify customer;
Deleting a record
To delete an existing record of type <record type> we use the command:
Erase <record type>
Example: Construct the DBTG program to delete account 402 belonging to Kahn:
finish := false;
customer.name := "Kahn";
find any customer using name;
find for update first account within custacct;
while DB-status = 0 and not finish do
begin
    get account;
    if account.number = 402 then
    begin
        erase account;
        finish := true;
    end
    else
        find for update next account within custacct;
end;
It is possible to delete an entire set occurrence by finding the owner of the set (say, a record of type <record type>) and executing
Erase all <record type>
This deletes the owner of the set as well as all of its members. If a member of the set is itself the owner of another set, the members of that set are deleted as well. Thus, the erase all operation is recursive.
Example: Construct the DBTG program to delete customer "Camp" and all of her accounts.
Customer.name := "Camp";
Find for update any customer using name;
Erase all customer;
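The recursive nature of erase all can be illustrated with a short Python sketch. The function erase_all, the dictionary owned_members and the record names (including a hypothetical transaction record owned by an account) are assumptions made for illustration only.

```python
def erase_all(record, owned_members, database):
    """Delete `record` and, recursively, every member of every set it owns."""
    for member in owned_members.get(record, []):
        erase_all(member, owned_members, database)  # a member may own a set too
    owned_members.pop(record, None)  # the set occurrence disappears with its owner
    database.discard(record)


# Hypothetical data: Camp owns two accounts, and acct-1 owns one transaction.
database = {"Camp", "acct-1", "acct-2", "txn-1", "Kahn", "acct-3"}
owned_members = {
    "Camp": ["acct-1", "acct-2"],
    "acct-1": ["txn-1"],
    "Kahn": ["acct-3"],
}

erase_all("Camp", owned_members, database)
```

After the call, only Kahn and Kahn's account remain in this toy database.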
DBTG set-processing facility
This facility is mainly concerned with the mechanisms for inserting records into and removing records from a particular set occurrence.
The connect statement
To insert a new record of type <record type> into a particular occurrence of <set-type> we must first insert the record into the database, then set the currency pointers of <record type> and <set type> to point to the appropriate record and set occurrence.
The command used is
Connect <record type> to <set-type>
A new record can be inserted as follows:
1. Create a new record of type <record type>.
2. Find the appropriate owner of the set <set-type>.
3. Insert the new record into the set by executing the connect statement.
Example: Construct the DBTG program to create a new account 267 that belongs to Jackson.
Account.number := 267;
Account.balance := 0;
Store account;
Customer.name := "Jackson";
Find any customer using name;
Connect account to custacct;
The Disconnect statement
In order to remove a record of type <record type> from a set occurrence of <set-type>, we need to set the currency pointers of <record type> and <set-type> to point to the appropriate record and set occurrence. Once this is accomplished, the record can be removed from the set by executing
Disconnect <record-type> from <set-type>
Example: Construct the DBTG program to remove account 177 from its set occurrence of type custacct.
Account.number := 177;
Find for update any account using number;
Get account;
Find owner within custacct;
Disconnect account from custacct;
The reconnect statement
In order to move a record of type <record-type> from one set occurrence to another set occurrence of type <set-type>, we need to find the appropriate record and the owner of the set occurrence to which the record is to be moved. Once this is done, we can move the record by executing:
Reconnect <record-type> to <set-type>
Consider the DBTG program to move all accounts of Lowman that are currently at the Hillside branch to the Valley view branch.
Customer.name := "Lowman";
Find any customer using name;
Find first account within custacct;
While DB-status = 0 do
Begin
    Find owner within brncacct;
    Get branch;
    If branch.name = "Hillside" then
    Begin
        Branch.name := "Valley view";
        Find any branch using name;
        Reconnect account to brncacct;
    End;
    Find next account within custacct;
End;
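The connect, disconnect and reconnect statements can be pictured as operations on a mapping from each owner to its list of members. The following Python sketch is a simplified illustration under that assumption; the class SetType, its methods and the sample placement of accounts 177 and 402 are hypothetical.

```python
class SetType:
    """Toy model of one DBTG set type: each owner maps to its list of members."""

    def __init__(self):
        self.occurrences = {}      # owner -> list of member records
        self.current_owner = None  # currency pointer of the set type

    def find_owner(self, owner):
        # Point the set's currency pointer at the occurrence owned by `owner`.
        self.occurrences.setdefault(owner, [])
        self.current_owner = owner

    def owner_of(self, member):
        for owner, members in self.occurrences.items():
            if member in members:
                return owner
        return None

    def connect(self, member):
        # A member may belong to at most one occurrence of a given set type.
        if self.owner_of(member) is not None:
            raise ValueError("member already belongs to an occurrence of this set")
        self.occurrences[self.current_owner].append(member)

    def disconnect(self, member):
        owner = self.owner_of(member)
        if owner is None:
            raise ValueError("member is not connected to this set")
        self.occurrences[owner].remove(member)

    def reconnect(self, member):
        # Move the member from its present occurrence to the current one.
        self.disconnect(member)
        self.connect(member)


brncacct = SetType()
brncacct.find_owner("Hillside")
brncacct.connect(177)
brncacct.connect(402)
brncacct.find_owner("Valley view")
brncacct.reconnect(177)  # account 177 moves from Hillside to Valley view
```

The sketch treats reconnect as a disconnect followed by a connect, which is enough to show the effect on the set occurrences.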
Set Insertion and Retention
When a new set is defined, we must specify how member records are to be inserted. In addition, we must specify the conditions under which a record must be retained in the set occurrence in which it was initially inserted.
Set Insertion
A newly created record of type <record type> of a set type <set-type> can be added to a set occurrence either explicitly (manually) or implicitly (automatically). This distinction is specified at set definition time via
Insertion is <insert mode>
Where <insert mode> can take one of two forms.
Manual: The new record can be inserted into the set manually (explicitly) by executing
Connect <record type> to <set-type>
Automatic: The new record is inserted into the set automatically (implicitly) when it is created, that is, when we execute
Store <record type>
In either case, just prior to insertion, the <set-type> currency pointer must point to the set occurrence into which the insertion is to be made.
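The difference between the two insertion modes can be sketched in Python as follows. The functions store and connect, their parameters, and account number 268 are assumptions introduced only for this illustration.

```python
def store(record, database, current_occurrence, insertion):
    """Create `record`; with automatic insertion it also joins the current set occurrence."""
    if insertion not in ("manual", "automatic"):
        raise ValueError(f"unknown insertion mode: {insertion}")
    database.append(record)
    if insertion == "automatic":
        current_occurrence.append(record)


def connect(record, current_occurrence):
    """Explicitly add an already stored record to the current set occurrence."""
    current_occurrence.append(record)


database, jackson_accounts = [], []

# Manual insertion: store alone does not place the record in the set.
store(267, database, jackson_accounts, insertion="manual")
after_manual_store = list(jackson_accounts)
connect(267, jackson_accounts)

# Automatic insertion: store also inserts the record into the set.
store(268, database, jackson_accounts, insertion="automatic")
```

In both cases the list passed as current_occurrence plays the role of the set occurrence that the <set-type> currency pointer designates.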
Set Retention
There are various restrictions on how and when a member record can be removed from a set occurrence into which it has been inserted previously. These restrictions are specified at set definition time via
Retention is <retention-mode>
Where <retention-mode> can take one of three forms.
Fixed: Once a member record has been inserted into a particular set occurrence, it cannot be removed from that set. If retention is fixed, then to reconnect a record to another set, we must first erase that record, re-create it, and then insert it into the new set occurrence.
Mandatory: Once a member record has been inserted into a particular set occurrence, it can be reconnected only to another set occurrence of type <set-type>. It can neither be disconnected nor be reconnected to a set of another type.
Optional: No restrictions are placed on the member record. It can be reconnected, disconnected, and connected at will.
The decision as to which option to choose depends on the application.
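The three retention modes amount to a small table of which removal operations are permitted. A minimal Python sketch of that table follows; the names ALLOWED and is_allowed are hypothetical.

```python
# Which operations may remove a member from its set occurrence, per retention mode.
ALLOWED = {
    "fixed": set(),                           # member can never leave its occurrence
    "mandatory": {"reconnect"},               # may only move within the same set type
    "optional": {"disconnect", "reconnect"},  # no restrictions
}


def is_allowed(operation, retention):
    """Return True if `operation` is permitted on a member under `retention`."""
    return operation in ALLOWED[retention]
```

For example, is_allowed("disconnect", "mandatory") is False, while is_allowed("reconnect", "mandatory") is True.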
Deletion
When a record is deleted (erased) and that record is the owner of a set occurrence of type <set-type>, the way the deletion is handled depends on the retention specification of <set-type>.
If the retention status is optional, then the record will be deleted and every member of the set it owns will be disconnected. These member records, however, are kept in the database.
If the retention status is fixed, then the record and all of its owned members will be deleted. This follows from the fact that the fixed status indicates that a member record cannot be removed from the set occurrence without being deleted.
If the retention status is mandatory, then the record cannot be erased. This is because the mandatory status indicates that a member record must belong to a set occurrence; it cannot be disconnected from that set.
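The three deletion rules can be summarised in a short Python sketch. It follows the rules exactly as stated above; the function erase_owner, its parameters and the sample record names are assumptions for illustration.

```python
def erase_owner(owner, members, retention, database):
    """Erase `owner`, treating its set members according to the retention mode."""
    if retention == "mandatory":
        # Members must always belong to a set occurrence, so the erase is refused.
        raise PermissionError("owner of a mandatory set cannot be erased")
    if retention == "fixed":
        # Members cannot leave the set without being deleted, so they go too.
        database.difference_update(members)
    elif retention != "optional":
        raise ValueError(f"unknown retention mode: {retention}")
    # For optional retention the members are only disconnected and stay stored.
    database.discard(owner)
    members.clear()


# Hypothetical data: Camp owns two accounts.
db_optional = {"Camp", "acct-1", "acct-2"}
erase_owner("Camp", ["acct-1", "acct-2"], "optional", db_optional)

db_fixed = {"Camp", "acct-1", "acct-2"}
erase_owner("Camp", ["acct-1", "acct-2"], "fixed", db_fixed)
```

With optional retention the two accounts survive in the database; with fixed retention they are deleted together with their owner.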
Set Ordering
The members of a set occurrence of <set-type> may be ordered in a variety of ways. The programmer specifies the order when the set is defined, via
Order is <order-mode>
Where <order-mode> can be one of the following.
First: When a new record is added to a set, it is inserted in the first position. Thus, the set is in reverse chronological order.
Last: When a new record is added to a set, it is inserted in the last position. Thus, the set is in chronological order.
Next: Suppose that the currency pointer of <set-type> points to record X. If X is a member type, then when a new record is added to the set, it is inserted in the position following X. If X is an owner type, then when a new record is added, it is inserted in the first position.
Prior: Suppose that the currency pointer of <set-type> points to record X. If X is a member type, then when a new record is added to the set, it is inserted in the position just prior to X. If X is an owner type, then when a new record is added, it is inserted in the last position.
System default: When a new record is added to a set, it is inserted in an arbitrary position determined by the system.
Sorted: When a new record is added to a set, it is inserted in a position that ensures that the set will remain sorted. The sorting order is specified by a particular key value when the programmer defines the set. The programmer must specify whether members are ordered in ascending or descending order relative to that key.
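The effect of each order mode on where a new member lands can be sketched with a Python list standing in for a set occurrence. The function insert_member and its parameters are hypothetical. The sketch covers only the case where the current record X is a member, sorts in ascending order only, and simply appends for the system default, which is one arbitrary choice among many.

```python
import bisect


def insert_member(members, new, order, current=None, key=None):
    """Insert `new` into the list `members` according to the DBTG order mode."""
    if order == "first":
        members.insert(0, new)
    elif order in ("last", "system default"):
        members.append(new)
    elif order == "next":
        # Position following the current member X.
        members.insert(members.index(current) + 1, new)
    elif order == "prior":
        # Position just before the current member X.
        members.insert(members.index(current), new)
    elif order == "sorted":
        # Keep the members in ascending order of key(member).
        keys = [key(m) for m in members]
        members.insert(bisect.bisect_right(keys, key(new)), new)
    else:
        raise ValueError(f"unknown order mode: {order}")


accounts = []
insert_member(accounts, 1, "first")
insert_member(accounts, 2, "first")             # newest first: [2, 1]
insert_member(accounts, 3, "last")              # [2, 1, 3]
insert_member(accounts, 4, "next", current=1)   # [2, 1, 4, 3]
insert_member(accounts, 5, "prior", current=1)  # [2, 5, 1, 4, 3]

by_number = [10, 30]
insert_member(by_number, 20, "sorted", key=lambda n: n)
```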
REFER TO THE TEXT BOOK FOR FURTHER REFERENCE
Questions:
1. Explain the architecture of the network model.
2. Write short notes on
a) Link restriction
b) DBTG sets
c) Repeating groups
3. Explain the DBTG data retrieval facility.
4. Explain the DBTG set-processing facility.
5. Explain the DBTG update facility.
6. What is set insertion and retention?