cop5725 database management system final review

36
COP5725 Database Management System Final Review Reviewed by Ramakrishna. Parts of this are taken from Fernando Farfan’s presentation

Upload: laird

Post on 25-Feb-2016

78 views

Category:

Documents


1 download

DESCRIPTION

COP5725 Database Management System Final Review. Reviewed by Ramakrishna . Parts of this are taken from Fernando Farfan’s presentation. AGENDA. Exercises to do…. Ch6 JDBC Ch7 3-Tier Architecture Ch8.Storage and Indexing Ch10.Tree-Structured Indexing Ch7.XML Documents - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: COP5725 Database Management System Final Review

COP5725 Database Management SystemFinal Review

Reviewed by Ramakrishna.

Parts of this are taken from Fernando Farfan’s presentation

Page 2: COP5725 Database Management System Final Review

AGENDA

Exercises to do…. Ch6 JDBC Ch7 3-Tier Architecture Ch8. Storage and Indexing Ch10. Tree-Structured Indexing Ch7. XML Documents Ch27. XQUERY: Querying XML Data

Page 3: COP5725 Database Management System Final Review

Exercises to do….

Remember to practice even numbered exercises of the book

Solutions are available online at http://www.cs.wisc.edu/~dbbook/openAccess/thirdEdition/supporting_material.htm

Important Recommended exercises: 8.3, 8.4, 8.5, 8.7, 8.10, 8.11

Page 4: COP5725 Database Management System Final Review

JDBC What is JDBC ? Explain its purpose.

JDBC is Java DataBase Connectivity used to enable integration of SQL with a general purpose programming language.

Explain JDBC Architecture4 components :

Application, Driver Manager, Data Source Specific Drivers, Data Sources (stored in

MySql,Oracle,DB2,MSSQL,Access and so on).

Page 5: COP5725 Database Management System Final Review

JDBC Explain the individual steps required to

submit a query to a data source and to retrieve results in JDBC ?JDBC Driver Management (class.forName(..)),Establishing Connection (Connection Object),Executing SQL Statements:

Statement, PreparedStatement, CallableStatement.

Retrieving results ( Examining ResultSets).

Page 6: COP5725 Database Management System Final Review

Stored Procedures Explain the term stored procedure, and give

examples why stored procedures are useful.

Stored procedures are programs that run on the database server and can be called with a single SQL statement.

Runs inside the process space of the database server.

Stored procedures can also be used to reduce network communication.

They are locally-executed and results are packaged in to one big result.

Different users can re-use the stored procedure.

Page 7: COP5725 Database Management System Final Review

3-Tier Application Architecture

What is a 2-tier architecture ? Explain different types of 2-tier architectures. It’s a client-server architecture.

Application Logic

DBMS

Network

Client

Client

Architecture 1: Thin Clients ( web browsers)

Page 8: COP5725 Database Management System Final Review

3-Tier Application Architecture

Architecture 2: Thick Clients

Application Logic

DBMSNetwork

Client

Client

Application Logic

Disadvantages ?

1) No central place to update

2) Need to trust client

Page 9: COP5725 Database Management System Final Review

3-tier architecture What are the advantages of 3-tier

architecture ?Heterogeneous SystemsThin Clients Integrated Data AccessScalable

DBMS (database)

Network

Client (user interface)

Client (web browser)

Application Logic (in C++ or Java..)

Network

Page 10: COP5725 Database Management System Final Review

CH8. OVERVIEW OF STORAGE AND INDEXING DBMS stores vast quantities of data Data must persist across program executions

Data is stored on external storage devices. The unit of information read from or written to

disk is a page. The cost of page I/O dominates the cost of

typical database operations. Input from disk to main memory Output from main memory to disk

Page 11: COP5725 Database Management System Final Review

CH8. OVERVIEW OF STORAGE AND INDEXING Simplest file structure is an unordered file, or

heap file. An index is a data structure to optimize certain

kinds of retrieval operations.

CLUSTERED - When a file is organized so that the ordering of data records is the same as the ordering of data entries in some index

UNCLUSTERED - It is un-clustered otherwise

There is UTMOST one clustered index and several un-clustered ones for a table.

Page 12: COP5725 Database Management System Final Review

Example

Employee (Eid: integer, name: String, age: integer…..)

Assume data records sorted by name Assume an index on name

Index on name CLUSTERED Assume an index on age

Index on age UNCLUSTERED

Page 13: COP5725 Database Management System Final Review

Example: Using Indexes Employee Data Records : Sorted by Name

Page 1 : (1,Alex,30) (2,Amy,21)

Page 2: (3, Bob,31) (4, Brenda,21)

Select * from Employee where name like “A%”; Use index on name Retrieve Page 1.

Select * from Employee where age = 21; Use index on age Retrieve Page 1 & Page 2.

Page 14: COP5725 Database Management System Final Review

Clustered and UnClustered Indexes

Retrieval using a Clustered indexData retrieval using minimum number of

Data page I/Os.

Retrieval using an Unclustered index In worst case, 1 page I/O for every

qualifying tuple.

Page 15: COP5725 Database Management System Final Review

Another Example

Consider “Select E.dno from Employee E where E.age > 40”;Assume B+ Tree index on age. Is it better to use this index ? Or just do a

segment scan ? Answer : Depends on several factors:

Depends on Selectivity of the condition.Whether the index is clustered or unclustered.

Page 16: COP5725 Database Management System Final Review

CH8. OVERVIEW OF STORAGE AND INDEXING 8.3 Consider a relation stored as a randomly ordered

file for which the only index is an unclustered index on a field called sal. If you want to retrieve all records with sal > 20, is using the index always the best alternative? Explain.

No. In this case, the index is unclustered, each qualifying data entry could contain an rid that points to a distinct data page, leading to as many data page I/Os as the number of data entries that match the range query. In this situation, using index is actually worse than file scan.

Page 17: COP5725 Database Management System Final Review

CH8. OVERVIEW OF STORAGE AND INDEXING 8.5 Explain the difference between Hash indexes and

B+-tree indexes. In particular, discuss how equality and range searches work.

Hash Index: Hashing function. Quickly maps a search key value to a bucket. Inserts/Deletes are simple. Linked list for collisions. Good at equality searches. Terrible for range queries.

B+-tree Index: Hierarchical search data structure. Maintenance is costly. Costly for individual record lookup. Efficient for range queries.

Page 18: COP5725 Database Management System Final Review

CH8. OVERVIEW OF STORAGE AND INDEXING Consider the following relation

Answer: Update the salary of an employee using the employee id.

Page 19: COP5725 Database Management System Final Review

CH8. OVERVIEW OF STORAGE AND INDEXING

Update the age or employee id for some department id

Update the salaries of all employees in some department

Page 20: COP5725 Database Management System Final Review

Constraints

Page 21: COP5725 Database Management System Final Review

Constraints

Page 22: COP5725 Database Management System Final Review

Constraints

Page 23: COP5725 Database Management System Final Review

Constraints

Assume tables are already created.

ALTER TABLE ENROLLED add constraint cons1 check((select count(*) from ENROLLED group by cname)>=5)

ALTER TABLE ENROLLED add constraint cons2 check((select count(*) from ENROLLED group by cname)<=30)

Page 24: COP5725 Database Management System Final Review

Constraints

Page 25: COP5725 Database Management System Final Review

Views

Page 26: COP5725 Database Management System Final Review

Views

Page 27: COP5725 Database Management System Final Review

Tree-Structured Indexing Explain ISAM and B+ Trees ? What are the

differences, advantages and disadvantages ? ISAM – Indexes Sequential access Method

Effective for static files Unsuitable for dynamically changing files

B+ Tree Dynamic index structure Adjusts well to changes Does well for both range and equality selections

Page 28: COP5725 Database Management System Final Review

Tree-Structured Indexing

Insertions and deletions in ISAMHappen only in leaf pages Insertions – adding records to the overflow chainSearch – inefficient as the chain grows

In B+ tree even after several insertions and deletions,The tree is kept balancedHeight of the tree length of the path from root

to leaf

Page 29: COP5725 Database Management System Final Review

Query EvaluationConsider a relation R(a,b,c,d,e) containing 5,000,000 records, where each data page of the relation holds 10 records. R is organized as a sorted file with secondary indexes. Assume that R.a is a candidate key for R, with values lying in the range 0 to 4,999,999, and that R is stored in R.a order. For each of the following relational algebra queries, state which of the following three approaches is most likely to be the cheapest:

Access the sorted file for R directly. Use a (clustered) B+ tree index on attribute R.a. Use a linear hashed index on attribute R.a.

Page 30: COP5725 Database Management System Final Review

Query Evaluation

Page 31: COP5725 Database Management System Final Review

CH7. XML DOCUMENTS 7.1 When is an XML document well-formed?

When is an XML document valid?

An XML document is valid if it has an associated DTD and the document follows the rules of the DTD. An XML document is well-formed if it follows three guidelines:

1. It starts with an XML declaration.2. It contains a root element that contains all other

elements.3. All elements are properly nested.

Page 32: COP5725 Database Management System Final Review

CH27. XQUERY <bookstore>

<book category="COOKING"><title lang="en">Everyday Italian</title><author>Giada De Laurentiis</author><year>2005</year><price>30.00</price>

</book><book category="CHILDREN">

<title lang="en">Harry Potter</title><author>J K. Rowling</author><year>2005</year><price>29.99</price>

</book><book category="WEB">

<title lang="en">XQuery Kick Start</title><author>James McGovern</author><author>Per Bothner</author><author>Kurt Cagle</author><author>James Linn</author><author>Vaidyanathan Nagarajan</author><year>2003</year><price>49.99</price>

</book><book category="WEB">

<title lang="en">Learning XML</title><author>Erik T. Ray</author><year>2003</year><price>39.95</price>

</book></bookstore>

Page 33: COP5725 Database Management System Final Review

CH27. XQUERY Query:

doc("books.xml")/bookstore/book/title Result:

<title lang="en">Everyday Italian</title><title lang="en">Harry Potter</title><title lang="en">XQuery Kick Start</title><title lang="en">Learning XML</title>

Page 34: COP5725 Database Management System Final Review

CH27. XQUERY Query:

doc("books.xml")/bookstore/book[price<30]

Result:<book category="CHILDREN">

<title lang="en">Harry Potter</title><author>J K. Rowling</author><year>2005</year><price>29.99</price>

</book>

Page 35: COP5725 Database Management System Final Review

CH27. XQUERY Query:

for $x in doc("books.xml")/bookstore/book where $x/price>30 order by $x/title return $x/title

Result:<title lang="en">Learning XML</title><title lang="en">XQuery Kick Start</title>

Page 36: COP5725 Database Management System Final Review

CH27. XQUERY Query:

<ul> {for $x in doc("books.xml")/bookstore/book/title order by $x return <li>{$x}</li>

} </ul> Result:

<ul><li><title lang="en">Everyday Italian</title></li> <li><title lang="en">Harry Potter</title></li> <li><title lang="en">Learning XML</title></li>

<li><title lang="en">XQuery Kick Start</title></li></ul>