dbms unit 1 notes

Upload: ravi-kiran

Post on 04-Nov-2015


  • Unit 1

    Database:

    A database is a collection of stored operational data used by the

    various applications and/or users of some particular enterprise, or by a

    set of outside authorized applications and authorized users.

    Database Management System:

    A Database Management System (DBMS) is a software system that

    manages the execution of user applications that access and modify

    database data, so that data security, data integrity, and data

    reliability are guaranteed for each application, and each application

    can be written on the assumption that it is the only application active

    in the database.

    What Is Data?

    Different viewpoints:

    A sequence of characters stored in computer memory or storage

    An interpreted sequence of characters stored in computer memory or

    storage

    An interpreted set of objects

    A database supports concurrent access to the data.

    File Systems:

    A file is an uninterpreted, unstructured collection of information.

    File operations: delete, catalog, create, rename, open, close, read,

    write, find.

    Access methods: Algorithms to implement operations along with

    internal file organization

    Examples: File of Customers, File of Students; Access method:

    implementation of a set of operations on a file of students or

    customers.

    File Management System Problems:

    Data redundancy

    Data access: each new request requires a new program

    Data is not isolated from the access implementation

    Concurrent program execution on the same file

    Difficulties with security enforcement

    Integrity issues

    Database Applications:

    Airline reservation systems: data items are single passenger

    reservations; information about flights and airports; information about

    ticket prices and ticket restrictions.

    Banking systems: data items are accounts, customers, loans,

    mortgages, balances, etc. Failures are not tolerable. Concurrent access

    must be provided.

    Corporate records: data items are sales, accounts, bill of materials

    dbms notes

    www.jntuworld.com

    www.jntuworld.com

    www.jwjobs.net

  • records, employees and their dependents.

    ADVANTAGES OF A DBMS:

    Data independence: Application programs should be as independent as

    possible from details of data representation and storage. The DBMS can

    provide an abstract view of the data to insulate application code from

    such details.

    Efficient data access: A DBMS utilizes a variety of sophisticated techniques

    to store and retrieve data efficiently. This feature is especially important

    if the data is stored on external storage devices.

    Data integrity and security: If data is always accessed through the

    DBMS, the DBMS can enforce integrity constraints on the data. For

    example, before inserting salary information for an employee, the DBMS

    can check that the department budget is not exceeded. Also, the DBMS

    can enforce access controls that govern what data is visible to different

    classes of users.

    Data administration: When several users share the data, centralizing

    the administration of data can offer significant improvements.

    Experienced professionals who understand the nature of the data being

    managed, and how different groups of users use it, can be responsible

    for organizing the data representation to minimize redundancy and for

    fine-tuning the storage of the data to make retrieval efficient.

    Concurrent access and crash recovery: A DBMS schedules concurrent

    accesses to the data in such a manner that users can think of the data

    as being accessed by only one user at a time. Further, the DBMS protects

    users from the effects of system failures.

    Reduced application development time: Clearly, the DBMS supports

    many important functions that are common to many applications

    accessing data stored in the DBMS. This, in conjunction with the high-

    level interface to the data, facilitates quick development of

    applications. Such applications are also likely to be more robust than

    applications developed from scratch because many important tasks are

    handled by the DBMS instead of being implemented by the application.
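
    The integrity example above (checking the department budget before a
    salary is inserted) can be sketched with Python's built-in sqlite3
    module. This is only an illustration under stated assumptions, not how
    any particular DBMS implements the check: the department and employee
    tables, their columns, and the salary_within_budget trigger are all
    hypothetical names chosen for the sketch.

```python
import sqlite3

# Hypothetical schema: table, column, and trigger names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE department (dname TEXT PRIMARY KEY, budget REAL NOT NULL);
CREATE TABLE employee (
    eid    INTEGER PRIMARY KEY,
    dname  TEXT NOT NULL REFERENCES department(dname),
    salary REAL NOT NULL CHECK (salary >= 0)
);
-- Reject an insert when existing salaries plus the new one exceed the budget.
CREATE TRIGGER salary_within_budget
BEFORE INSERT ON employee
WHEN (SELECT COALESCE(SUM(salary), 0) FROM employee WHERE dname = NEW.dname)
     + NEW.salary
     > (SELECT budget FROM department WHERE dname = NEW.dname)
BEGIN
    SELECT RAISE(ABORT, 'department budget exceeded');
END;
""")

conn.execute("INSERT INTO department VALUES ('CS', 100000)")
conn.execute("INSERT INTO employee VALUES (1, 'CS', 60000)")   # within budget

try:
    conn.execute("INSERT INTO employee VALUES (2, 'CS', 50000)")  # 110000 > 100000
    rejected = False
except sqlite3.DatabaseError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

    Because the rule lives in the database rather than in each program,
    every application that inserts into employee is subject to the same
    check.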

    Data Levels and their Roles:

    Physical corresponds to the first view of data: how data is stored,

    how it is accessed, how it is modified, whether it is ordered, how it is

    allocated to computer memory and/or peripheral devices, and how data

  • items are actually represented (ASCII, EBCDIC, ...). The physical schema

    specifies additional storage details. Essentially, the physical schema

    summarizes how the relations described in the conceptual schema are

    actually stored on secondary storage devices such as disks and tapes.

    We must decide what file organizations to use to store the relations, and

    create auxiliary data structures called indexes to speed up data retrieval

    operations.
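
    As a small illustration of an index as an auxiliary structure, the
    sketch below uses Python's sqlite3 module. The students table mirrors
    the Students relation of the sample university database; the index
    name idx_students_age and the query are assumptions made for the
    example. The exact wording of SQLite's plan output varies between
    versions, so the code only checks that the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "sid TEXT PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)

# Physical-level decision: add an index on age (hypothetical choice).
conn.execute("CREATE INDEX idx_students_age ON students(age)")

# Ask SQLite how it would evaluate a lookup by age.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM students WHERE age = 19"
).fetchall()

# The last column of each plan row is a human-readable description.
plan_text = " ".join(str(row[-1]) for row in plan)
uses_index = "idx_students_age" in plan_text
```

    The query itself never mentions the index. Adding or dropping it
    changes only how the data is reached, not what the query means.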

    Conceptual corresponds to the second view of data: what we want

    the data to express, what relationships between data we must

    express, what story the data tells, and whether all the data is

    necessary for the story. The conceptual schema (sometimes called the logical

    schema) describes the stored data in terms of the data model of the

    DBMS. In a relational DBMS, the conceptual schema describes all

    relations that are stored in the database. In our sample university

    database, these relations contain information about entities, such as

    students and faculty, and about relationships, such as students'

    enrollment in courses. All student entities can be described using

    records in a Students relation, as we saw earlier. In fact, each collection

    of entities and each collection of relationships can be described as a

    relation, leading to the following conceptual schema:

    Students(sid: string, name: string, login: string, age: integer, gpa: real)

    Faculty(fid: string, fname: string, sal: real)

    Courses(cid: string, cname: string, credits: integer)

    Rooms(rno: integer, address: string, capacity: integer)

    Enrolled(sid: string, cid: string, grade: string)

    Teaches(fid: string, cid: string)

    Meets_In(cid: string, rno: integer, time: string)

    The choice of relations, and the choice of fields for each relation, is not

    always obvious, and the process of arriving at a good conceptual schema is

    called conceptual database design.
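
    One way to write this conceptual schema down in a relational DBMS is
    sketched below with Python's sqlite3 module. The mapping of string,
    integer, and real to TEXT, INTEGER, and REAL, the choice of primary
    keys, and the spelling fid for the faculty identifier are assumptions
    made for the sketch; the notes give only field names and types.

```python
import sqlite3

# Relation and field names follow the schema above; keys are assumed.
SCHEMA = """
CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT,
                       age INTEGER, gpa REAL);
CREATE TABLE Faculty  (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
CREATE TABLE Courses  (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER);
CREATE TABLE Rooms    (rno INTEGER PRIMARY KEY, address TEXT, capacity INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);
CREATE TABLE Teaches  (fid TEXT, cid TEXT);
CREATE TABLE Meets_In (cid TEXT, rno INTEGER, time TEXT);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the relations the DBMS now knows about.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

    Entities (Students, Faculty, Courses, Rooms) and relationships
    (Enrolled, Teaches, Meets_In) are both stored as relations.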

    View corresponds to the third view of data: what part of the data

    is seen by a specific application. External schemas, which usually are

    also in terms of the data model of the DBMS, allow data access to be

    customized (and authorized) at the level of individual users or groups of

    users. The external schema design is guided by end user requirements.

    For example, we might want to allow students to find out the names of

    faculty members teaching courses, as well as course enrollments. This

    can be done by defining the following view:

    Courseinfo(cid: string, fname: string, enrollment: integer)
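
    A possible definition of the Courseinfo view is sketched below with
    Python's sqlite3 module. The notes do not say how the view is computed,
    so the join over Teaches, Faculty, and Enrolled, the spelling fid for
    the faculty identifier, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Faculty  (fid TEXT PRIMARY KEY, fname TEXT, sal REAL);
CREATE TABLE Teaches  (fid TEXT, cid TEXT);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- External schema: one row per course with its teacher and head count.
CREATE VIEW Courseinfo AS
SELECT T.cid AS cid, F.fname AS fname, COUNT(E.sid) AS enrollment
FROM Teaches T
JOIN Faculty F ON F.fid = T.fid
LEFT JOIN Enrolled E ON E.cid = T.cid
GROUP BY T.cid, F.fname;

-- Hypothetical sample data.
INSERT INTO Faculty  VALUES ('f1', 'Rao', 50000);
INSERT INTO Teaches  VALUES ('f1', 'c1');
INSERT INTO Enrolled VALUES ('s1', 'c1', 'A'), ('s2', 'c1', 'B');
""")

rows = conn.execute("SELECT cid, fname, enrollment FROM Courseinfo").fetchall()
```

    A student querying Courseinfo sees the teacher's name and the
    enrollment count, but not faculty salaries or other students' grades.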

    STRUCTURE OF A DBMS:

  • When a user issues a query, the parsed query is presented to a query

    optimizer, which uses information about how the data is stored to

    produce an efficient execution plan for evaluating the query. An

    execution plan is a blueprint for evaluating a query, and is usually

    represented as a tree of relational operators.
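
    To make the idea of a plan as a tree of relational operators concrete,
    here is a minimal Python sketch. The class names Scan, Select, and
    Project, the run method, and the sample data are assumptions for
    illustration; a real DBMS has many more operators and chooses among
    alternative plans.

```python
from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass
class Scan:
    """Leaf operator: produces every record of a stored relation."""
    rows: list

    def run(self) -> Iterator[dict]:
        yield from self.rows


@dataclass
class Select:
    """Keeps only the records of the child that satisfy the predicate."""
    child: object
    predicate: Callable[[dict], bool]

    def run(self) -> Iterator[dict]:
        return (row for row in self.child.run() if self.predicate(row))


@dataclass
class Project:
    """Keeps only the named fields of each record of the child."""
    child: object
    fields: tuple

    def run(self) -> Iterator[dict]:
        return ({f: row[f] for f in self.fields} for row in self.child.run())


students = [
    {"sid": "s1", "name": "Asha", "gpa": 3.8},
    {"sid": "s2", "name": "Ravi", "gpa": 3.1},
    {"sid": "s3", "name": "Meena", "gpa": 3.6},
]

# Plan tree for: names of students with gpa > 3.5
plan = Project(Select(Scan(students), lambda row: row["gpa"] > 3.5), ("name",))
result = list(plan.run())
```

    Each operator pulls records from its child, so the tree is evaluated
    from the leaves (scans of stored data) up to the root.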

    The code that implements relational operators sits on top of the file and

    access methods layer. This layer includes a variety of software for

    supporting the concept of a file, which, in a DBMS, is a collection of pages

    or a collection of records. This layer typically supports a heap file, or file of

    unordered pages, as well as indexes. In addition to keeping track of the

    pages in a file, this layer organizes the information within a page. The files

    and access methods layer code sits on top of the buffer manager, which

    brings pages in from disk to main memory as needed in response to

    read requests.
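
    The role of the buffer manager can be sketched in a few lines of
    Python. The notes do not say which replacement policy is used; the
    least-recently-used policy below, the class name BufferManager, and the
    dictionary standing in for the disk are all assumptions made for the
    illustration.

```python
from collections import OrderedDict


class BufferManager:
    """Keeps up to `capacity` pages in memory, evicting the least recently used."""

    def __init__(self, disk, capacity):
        self.disk = disk              # page_id -> contents (stand-in for the disk)
        self.capacity = capacity
        self.frames = OrderedDict()   # pages currently in main memory
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.frames:
            # Hit: mark the page as most recently used.
            self.frames.move_to_end(page_id)
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            # Pool is full: drop the least recently used page.
            self.frames.popitem(last=False)
        # Miss: bring the page in from disk.
        self.disk_reads += 1
        self.frames[page_id] = self.disk[page_id]
        return self.frames[page_id]


disk = {1: "page one", 2: "page two", 3: "page three"}
pool = BufferManager(disk, capacity=2)
for page_id in (1, 2, 1, 3, 2):
    pool.get_page(page_id)

reads = pool.disk_reads
resident = list(pool.frames)
```

    Of the five requests, only the second request for page 1 is served
    from memory, so four disk reads are needed.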

    The lowest layer of the DBMS software deals with management of space

    on disk, where the data is stored. Higher layers allocate, deallocate,

    read, and write pages through (routines provided by) this layer, called

    the disk space manager.

    The DBMS supports concurrency and crash recovery by carefully

    scheduling user requests and maintaining a log of all changes to the

    database. DBMS components associated with concurrency control and

    recovery include the transaction manager, which ensures that

    transactions request and release locks according to a suitable locking

    protocol and schedules the execution of transactions; the lock manager,

    which keeps track of requests for locks and grants locks on database

    objects when they become available; and the recovery manager, which

    is responsible for maintaining a log, and restoring the system to a

    consistent state after a crash. The disk space manager, buffer manager,

    and file and access method layers must interact with these components.

    Data Models:

    A collection of tools for describing:

    Data.

    Data relationships.

    Data semantics.

  • Data constraints.

    Relational model.

    Entity-Relationship data model (mainly for database design).

    Object-based data models (object-oriented and object-relational).

    Semistructured data model (XML).

    Other older models:

    Network model.

    Hierarchical model.

    Database Access from Application Programs:

    To access the database, DML statements need to be executed from the

    host language. There are two ways to do this:

    By providing an application program interface (a set of procedures) that

    can be used to send DML and DDL statements to the database and

    retrieve the results. The Open Database Connectivity (ODBC) standard

    defined by Microsoft for use with the C language is a commonly used

    application program interface standard. The Java Database Connectivity

    (JDBC) standard provides corresponding features to the Java language.

    By extending the host language syntax to embed DML calls within the

    host language program. Usually, a special character prefaces DML calls,

    and a preprocessor, called the DML precompiler, converts the DML

    statements to normal procedure calls in the host language.
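
    The first approach (an application program interface) can be
    illustrated with Python's built-in sqlite3 module, which plays a role
    comparable to ODBC or JDBC for Python programs. The account table and
    its values are hypothetical; the point of the sketch is only that DDL
    and DML travel to the database as strings through procedure calls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL sent through the API.
cur.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance REAL)")

# DML sent through the API; '?' placeholders carry host-language values.
cur.execute("INSERT INTO account VALUES (?, ?)", ("A-101", 500.0))
cur.execute(
    "UPDATE account SET balance = balance - ? WHERE account_number = ?",
    (100.0, "A-101"),
)
conn.commit()

# Results come back to the host language as ordinary values.
cur.execute("SELECT balance FROM account WHERE account_number = ?", ("A-101",))
balance = cur.fetchone()[0]
```

    No preprocessor is involved here. The host language is unchanged, and
    the statements are plain strings passed to execute.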

    Database Users and Administrators:

    Naive users are unsophisticated users who interact with the

    system by invoking one of the application programs that have

    been written previously.

    Application programmers are computer professionals who write

    application programs.

    Sophisticated users interact with the system without writing

    programs. Instead, they form their requests in a database query

    language. They submit each such query to a query processor,

    whose function is to break down DML statements into

    instructions that the storage manager understands. Analysts who

    submit queries to explore data in the database fall in this

    category.

    Specialized users are sophisticated users who write specialized

    database applications that do not fit into the traditional data-

    processing framework.

    Database Administrator: A person who has central control over the

    system is called a database administrator (DBA). The functions of a DBA

    include:

    Schema definition. The DBA creates the original database schema

    by executing a set of data definition statements in the DDL.

    Storage structure and access-method definition.

    Schema and physical-organization modification. The DBA carries

    out changes to the schema and physical organization to reflect

  • the changing needs of the organization, or to alter the physical

    organization to improve performance.

    Granting of authorization for data access.

    Routine maintenance.

    Data Model: A data model is a collection of conceptual tools for describing data, data

    relationships, data semantics, and consistency constraints.

    Entity: An entity is a thing or object in the real world that is

    distinguishable from all other objects. For example, each person in an

    enterprise is an entity.

    Entity set: An entity set is a set of entities of the same type that share

    the same properties, or attributes. The set of all persons who are

    customers at a given bank, for example, can be defined as the entity

    set customer. Similarly, the entity set loan might represent the set of all

    loans awarded by a particular bank.

    An entity is represented by a set of attributes. Attributes are

    descriptive properties possessed by each member of an entity set. The

    designation of an attribute for an entity set expresses that the database

    stores similar information concerning each entity in the entity set;

    however, each entity may have its own value for each attribute.

    Simple and composite attributes: Attributes that are not divided into

    subparts are called simple attributes. Attributes that can be divided

    into subparts are called composite attributes. For example, an

    attribute name could be structured as a composite attribute consisting

    of first-name, middle-initial, and last-name.

    Single-valued and multivalued attributes: For instance, the

    loan-number attribute for a specific loan entity refers to only one loan

    number. Such attributes are said to be single valued. There may be

    instances where an attribute has a set of values for a specific entity.

    Consider an employee entity set with the attribute phone-number. An

    employee may have zero, one, or several phone numbers, and different

    employees may have different numbers of phones.

    This type of attribute is said to be multivalued.

    Derived attribute: The value for this type of attribute can be derived

    from the values of other related attributes or entities. For instance, let

    us say that the customer entity set has an attribute loans-held, which

    represents how many loans a customer has from the bank. We can derive

    the value for this attribute by counting the number of loan entities

    associated with that customer.
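
    The attribute kinds above can be pictured with a small Python sketch.
    The class and field names are assumptions for illustration, and placing
    phone_numbers on the customer (the notes use an employee for that
    example) is a simplification to keep everything in one class.

```python
from dataclasses import dataclass, field


@dataclass
class Name:
    """Composite attribute: divisible into subparts."""
    first_name: str
    middle_initial: str
    last_name: str


@dataclass
class Customer:
    customer_id: str                                    # simple, single-valued
    name: Name                                          # composite
    phone_numbers: list = field(default_factory=list)   # multivalued: zero or more
    loan_numbers: list = field(default_factory=list)    # loans associated with the customer

    @property
    def loans_held(self) -> int:
        """Derived attribute: computed by counting the associated loans."""
        return len(self.loan_numbers)


customer = Customer(
    customer_id="C-1",
    name=Name("Asha", "K", "Rao"),
    phone_numbers=["555-0100", "555-0101"],
    loan_numbers=["L-17", "L-23"],
)
loans_held = customer.loans_held
```

    loans_held is never stored. It is recomputed from the loans whenever
    it is read, which is what makes it a derived attribute.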

    Relationship Sets: A relationship is an association among several

    entities. A relationship set is a set of relationships of the same type.

    Mapping Cardinalities: Mapping cardinalities, or cardinality ratios,

    express the number of entities to which another entity can be

    associated via a relationship set. Mapping cardinalities are most useful

    in describing binary relationship sets, although they can contribute to

    the description of relationship sets that involve more than two entity

  • sets.

    One to one. An entity in A is associated with at most one entity

    in B, and an entity in B is associated with at most one entity in A.

    One to many. An entity in A is associated with any number (zero

    or more) of entities in B. An entity in B, however, can be

    associated with at most one entity in A.

    Many to one. An entity in A is associated with at most one entity

    in B. An entity in B, however, can be associated with any number

    (zero or more) of entities in A.

    Many to many. An entity in A is associated with any number (zero

    or more) of entities in B, and an entity in B is associated with any

    number (zero or more) of entities in A.
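
    The four cases can be checked mechanically for a given set of
    relationship instances. The Python sketch below is an illustration
    only: the function name mapping_cardinality is an assumption, and it
    reports the most restrictive case consistent with one sample of (A, B)
    pairs, whereas a mapping cardinality is really a constraint on every
    possible instance.

```python
from collections import Counter


def mapping_cardinality(pairs):
    """Classify a binary relationship instance given as (a, b) pairs."""
    pairs = set(pairs)
    b_per_a = Counter(a for a, _ in pairs)   # how many B entities each A has
    a_per_b = Counter(b for _, b in pairs)   # how many A entities each B has
    a_has_many = any(n > 1 for n in b_per_a.values())
    b_has_many = any(n > 1 for n in a_per_b.values())
    if a_has_many and b_has_many:
        return "many to many"
    if a_has_many:
        return "one to many"    # an A with several Bs, each B with at most one A
    if b_has_many:
        return "many to one"    # a B with several As, each A with at most one B
    return "one to one"


# Hypothetical samples.
dept_course = [("CS", "c1"), ("CS", "c2"), ("EE", "c3")]
course_instructor = [("c1", "Rao"), ("c2", "Rao")]
student_course = [("s1", "c1"), ("s1", "c2"), ("s2", "c1")]
dept_head = [("CS", "Rao"), ("EE", "Iyer")]

results = [
    mapping_cardinality(dept_course),
    mapping_cardinality(course_instructor),
    mapping_cardinality(student_course),
    mapping_cardinality(dept_head),
]
```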

    Keys: A key allows us to identify a set of attributes that suffice to

    distinguish entities from each other. Keys also help uniquely identify

    relationships, and thus distinguish relationships from each other.

    Superkey: A superkey is a set of one or more attributes that, taken

    collectively, allow us to identify uniquely an entity in the entity set. For

    example, the customer-id attribute of the entity set customer is

    sufficient to distinguish one customer entity from another. Thus,

    customer-id is a superkey. Similarly, the combination of customer-name

    and customer-id is a superkey for the entity set customer. The

    customer-name attribute of customer is not a superkey, because several

    people might have the same name.

    Candidate key: If K is a superkey, then so is any superset of K. We are

    often interested in superkeys for which no proper subset is a superkey.

    Such minimal superkeys are called candidate keys. It is possible that

    several distinct sets of attributes could serve as a candidate key.

    Suppose that a combination of customer-name and customer-street

    is sufficient to distinguish among members of the customer entity set.

    Then, both {customer-id} and {customer-name, customer-street} are

    candidate keys. Although the attributes customer-id and customer-name

    together can distinguish customer entities, their combination does not

    form a candidate key, since the attribute customer-id alone is a

    candidate key.
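
    The difference between superkeys and candidate keys can be tried out
    on sample data. The Python sketch below is an illustration under an
    important assumption: it inspects a single set of customer entities,
    whereas a key is a constraint that must hold for every possible set.
    The function names and the sample customers are hypothetical.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Minimal superkeys, found by trying attribute sets from small to large."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A superset of a key already found is a superkey but not minimal.
            if any(set(key) <= set(attrs) for key in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


customers = [
    {"customer-id": "C1", "customer-name": "Hayes", "customer-street": "Main"},
    {"customer-id": "C2", "customer-name": "Hayes", "customer-street": "North"},
    {"customer-id": "C3", "customer-name": "Jones", "customer-street": "Main"},
]
attributes = ("customer-id", "customer-name", "customer-street")

keys = candidate_keys(customers, attributes)
id_and_name = is_superkey(customers, ("customer-id", "customer-name"))
```

    On this sample, {customer-id, customer-name} is a superkey but is
    skipped as a candidate key because {customer-id} alone already
    suffices.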

    Primary key: The primary key is the candidate key that is chosen by the

    database designer as the principal means of identifying entities within

    an entity set. A key (primary, candidate, and super) is a property of the

    entity set, rather than of the individual entities. Any two individual

    entities in the set are prohibited from having the same value on the key

    attributes at the same time. The designation of a key represents a

    constraint in the real-world enterprise being modeled.

    Weak Entity Sets: An entity set may not have sufficient attributes to

    form a primary key. Such an entity set is termed a weak entity set. An

    entity set that has a primary key is termed a strong entity set.

  • For a weak entity set to be meaningful, it must be associated with

    another entity set, called the identifying or owner entity set. Every weak

    entity must be associated with an identifying entity; that is, the weak

    entity set is said to be existence dependent on the identifying entity

    set. The identifying entity set is said to own the weak entity set that it

    identifies. The relationship associating the weak entity set with the

    identifying entity set is called the identifying relationship. The

    identifying relationship is many to one from the weak entity set to the

    identifying entity set, and the participation of the weak entity set in the

    relationship is total.

    In our example, the identifying entity set for payment is loan, and a

    relationship loan-payment that associates payment entities with their

    corresponding loan entities is the identifying relationship.

    Although a weak entity set does not have a primary key, we

    nevertheless need a means of distinguishing among all those entities in

    the weak entity set that depend on one particular strong entity. The

    discriminator of a weak entity set is a set of attributes that allows this

    distinction to be made.

    In E-R diagrams, a doubly outlined box indicates a weak entity set, and

    a doubly outlined diamond indicates the corresponding identifying

    relationship. In the figure, the weak entity set payment depends on the

    strong entity set loan via the relationship set loan-payment.

    The figure also illustrates the use of double lines to indicate total

    participation: the participation of the (weak) entity set payment in the

    relationship loan-payment is total, meaning that every payment must be

    related via loan-payment to some loan. Finally, the arrow from

    loan-payment to loan indicates that each payment is for a single loan.

    The discriminator of a weak entity set also is underlined, but with a

    dashed, rather than a solid, line.
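
    One common way to realize a weak entity set in a relational database
    is sketched below with Python's sqlite3 module. The notes do not name
    the discriminator of payment, so payment_number, the other column
    names, and the sample rows are assumptions. The payment table is
    identified by the owner's key together with the discriminator, and a
    payment cannot exist without its loan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE loan (
    loan_number TEXT PRIMARY KEY,
    amount      REAL
);
CREATE TABLE payment (
    -- Total participation: every payment must reference a loan.
    loan_number    TEXT NOT NULL REFERENCES loan(loan_number) ON DELETE CASCADE,
    payment_number INTEGER NOT NULL,          -- assumed discriminator
    payment_amount REAL,
    PRIMARY KEY (loan_number, payment_number)
);
INSERT INTO loan VALUES ('L-17', 1000), ('L-23', 2000);
-- payment_number 1 appears twice, but under different loans.
INSERT INTO payment VALUES ('L-17', 1, 100), ('L-17', 2, 100), ('L-23', 1, 250);
""")

try:
    conn.execute("INSERT INTO payment VALUES ('L-99', 1, 50)")  # no such loan
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Existence dependence: removing a loan removes its payments.
conn.execute("DELETE FROM loan WHERE loan_number = 'L-17'")
remaining_payments = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
```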

    Specialization: An entity set may include subgroupings of entities that

    are distinct in some way from other entities in the set. For instance, a

    subset of entities within an entity set may have attributes that are not

    shared by all the entities in the entity set. The E-R model provides a

  • means for representing these distinctive entity groupings. Consider an

    entity set person, with attributes name, street, and city. A person may

    be further classified as one of the following:

    customer

    employee

    Each of these person types is described by a set of attributes that

    includes all the attributes of entity set person plus possibly additional

    attributes. For example, customer entities may be described further by

    the attribute customer-id, whereas employee entities may be described

    further by the attributes employee-id and salary. The process of

    designating subgroupings within an entity set is called specialization.

    The specialization of person allows us to distinguish among persons

    according to whether they are employees or customers.

    Generalization: The design process may also proceed in a bottom-up

    manner, in which multiple entity sets are synthesized into a higher-level

    entity set on the basis of common features. The database designer may

    have first identified a customer entity set with the attributes name,

    street, city, and customer-id, and an employee entity set with the

    attributes name, street, city, employee-id, and salary. There are

    similarities between the customer entity set and the employee entity set

    in the sense that they have several attributes in common. This

    commonality can be expressed by generalization, which is a containment

    relationship that exists between a higher-level entity set and one or

    more lower-level entity sets. In our example, person is the higher-level

    entity set and customer and employee are lower-level entity sets.

    Higher- and lower-level entity sets also may be designated by the terms

    superclass and subclass, respectively. The person entity set is the

    superclass of the customer and employee subclasses. For all practical

    purposes, generalization is a simple inversion of specialization. We will

    apply both processes, in combination, in the course of designing the E-R

    schema for an enterprise. In terms of the E-R diagram itself, we do not

    distinguish between specialization and generalization. New levels of

    entity representation will be distinguished (specialization) or

    synthesized (generalization) as the design schema comes to express

    fully the database application and the user requirements of the

    database. Differences in the two approaches may be characterized by

    their starting point and overall goal. Generalization proceeds from the

    recognition that a number of entity sets share some common features

    (namely, they are described by the same attributes and participate in the

    same relationship sets).
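
    The superclass and subclass terminology maps naturally onto class
    inheritance in a programming language. The Python sketch below is only
    an analogy for the person, customer, and employee example; the field
    names follow the attributes listed in the notes, and the sample values
    are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Higher-level entity set (superclass)."""
    name: str
    street: str
    city: str


@dataclass
class Customer(Person):
    """Lower-level entity set: all person attributes plus customer_id."""
    customer_id: str


@dataclass
class Employee(Person):
    """Lower-level entity set: all person attributes plus employee_id and salary."""
    employee_id: str
    salary: float


employee = Employee("Ravi", "Main", "Hyderabad", "E-1", 30000.0)
customer_attributes = [f.name for f in fields(Customer)]
employee_is_person = isinstance(employee, Person)
```

    Read top-down, the class hierarchy is a specialization of Person. Read
    bottom-up, Person is the generalization of Customer and Employee.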

  • Aggregation:

    Aggregation is an abstraction in which relationship sets (along with their

    associated entity sets) are treated as higher-level entity sets, and can

    participate in relationships.

    Symbols used in the E-R notation:

  • ER Model For a college DB:

    Assumptions :

    A college contains many departments

    Each department can offer any number of courses

    Many instructors can work in a department

    An instructor can work only in one department

    For each department there is a Head

    An instructor can be head of only one department

    Each instructor can teach any number of courses

    A course can be taught by only one instructor

    A student can enroll for any number of courses

    Each course can have any number of students

    Steps in ER Modeling:

    Identify the Entities

    Find relationships

    Identify the key attributes for every Entity

    Identify other relevant attributes

    Draw complete E-R diagram with all attributes including Primary

    Key

    Step 1: Identify the Entities:

    DEPARTMENT

    STUDENT

    COURSE

    INSTRUCTOR

    Step 2: Find the relationships:

    One course has multiple students enrolled, and one student

    enrolls in multiple courses,

  • hence the cardinality between course and student is Many to Many.

    The department offers many courses and each course belongs to only one

    department, hence the cardinality between department and course is One

    to Many.

    One department has multiple instructors and one instructor belongs to

    one and only one department, hence the cardinality between department

    and instructor is One to Many.

    For each department there is a Head of department, and one instructor

    can be Head of only one department, hence the cardinality is One to One.

    One course is taught by only one instructor, but the instructor teaches

    many courses, hence the cardinality between course and instructor is

    Many to One.

    Step 3: Identify the key attributes

    Deptname is the key attribute for the Entity Department, as it

    identifies the Department uniquely.

    Course# (CourseId) is the key attribute for Course Entity.

    Student# (Student Number) is the key attribute for Student

    Entity.

    Instructor Name is the key attribute for Instructor Entity.

    Step 4: Identify other relevant attributes

    For the department entity, the relevant attribute is location

    For course entity, course name, duration, prerequisite

    For instructor entity, room#, telephone#

    For student entity, student name, date of birth
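
    As a rough check on the college model, the entities, key attributes,
    and cardinalities above can be written as tables using Python's sqlite3
    module. Turning an E-R model into tables is one possible design, not
    something the notes prescribe: the column names, the head column on
    department, and the enrolls table are assumptions of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (
    deptname TEXT PRIMARY KEY,
    location TEXT,
    head     TEXT UNIQUE   -- one to one: an instructor heads at most one department
);
CREATE TABLE instructor (
    instructor_name TEXT PRIMARY KEY,
    room_no         TEXT,
    telephone_no    TEXT,
    deptname        TEXT NOT NULL REFERENCES department(deptname)  -- one to many
);
CREATE TABLE course (
    course_id       TEXT PRIMARY KEY,
    course_name     TEXT,
    duration        TEXT,
    prerequisite    TEXT,
    deptname        TEXT NOT NULL REFERENCES department(deptname), -- one to many
    instructor_name TEXT REFERENCES instructor(instructor_name)    -- many to one
);
CREATE TABLE student (
    student_no    TEXT PRIMARY KEY,
    student_name  TEXT,
    date_of_birth TEXT
);
CREATE TABLE enrolls (          -- many to many between student and course
    student_no TEXT REFERENCES student(student_no),
    course_id  TEXT REFERENCES course(course_id),
    PRIMARY KEY (student_no, course_id)
);
-- Hypothetical sample data.
INSERT INTO department VALUES ('CS', 'Block A', 'Rao');
INSERT INTO instructor VALUES ('Rao', 'R-1', '555-0100', 'CS');
INSERT INTO course VALUES ('c1', 'DBMS', '1 semester', NULL, 'CS', 'Rao'),
                          ('c2', 'Networks', '1 semester', NULL, 'CS', 'Rao');
INSERT INTO student VALUES ('s1', 'Asha', '1996-01-01'), ('s2', 'Ravi', '1996-02-02');
INSERT INTO enrolls VALUES ('s1', 'c1'), ('s1', 'c2'), ('s2', 'c1');
""")

try:
    conn.execute("INSERT INTO department VALUES ('EE', 'Block B', 'Rao')")
    second_headship_rejected = False
except sqlite3.IntegrityError:
    second_headship_rejected = True

students_in_c1 = conn.execute(
    "SELECT COUNT(*) FROM enrolls WHERE course_id = 'c1'").fetchone()[0]
courses_of_s1 = conn.execute(
    "SELECT COUNT(*) FROM enrolls WHERE student_no = 's1'").fetchone()[0]
```

    The head column is left without a foreign key to avoid a circular
    reference between department and instructor in this small sketch.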

  • ER model for Banking Business :

    Assumptions :

    There are multiple banks and each bank has many branches. Each

    branch has multiple customers

    Customers have various types of accounts

    Some customers have also taken different types of loans from

    these bank branches

    One customer can have multiple accounts and Loans

    Step 1: Identify the Entities

    BANK

    BRANCH

    LOAN

    ACCOUNT

    CUSTOMER

    Step 2: Find the relationships

    One Bank has many branches and each branch belongs to only one bank,

    hence the cardinality between Bank and Branch is One to Many.

    One Branch offers many loans and each loan is associated with one

    branch, hence the cardinality between Branch and Loan is One to Many.

    One Branch maintains multiple accounts and each account is associated

    with one and only one Branch, hence the cardinality between Branch and

    Account is One to Many.

    One Loan can be availed by multiple customers, and each Customer can

    avail multiple loans, hence the cardinality between Loan and Customer

    is Many to Many.

    One Customer can hold multiple accounts, and each Account can be held

    by multiple Customers, hence the cardinality between Customer and

    Account is Many to Many.

    Step 3: Identify the key attributes

    BankCode (Bank Code) is the key attribute for the Entity Bank, as it

    identifies the bank uniquely.

    Branch# (Branch Number) is the key attribute for Branch Entity.

    Customer# (Customer Number) is the key attribute for Customer

    Entity.

    Loan# (Loan Number) is the key attribute for Loan Entity.

    Account No (Account Number) is the key attribute for Account

    Entity.

    Step 4: Identify other relevant attributes

    For the Bank Entity, the relevant attributes other than BankCode would

    be Name and Address.

    For the Branch Entity, the relevant attributes other than Branch# would

    be Name and Address.

  • For the Loan Entity, the relevant attribute other than Loan# would be

    Loan Type.

    For the Account Entity, the relevant attribute other than Account No

    would be Account Type.

    For the Customer Entity, the relevant attributes other than Customer#

    would be Name, Telephone# and Address.

    E-R diagram with all attributes including Primary Key:
