views and indexing - computer science at rpisibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. ·...

21
Views and Indexing CSCI 4380 Database Systems Sunday, October 24, 2010

Upload: others

Post on 27-Feb-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Views and Indexing

CSCI 4380 Database Systems

Sunday, October 24, 2010

Page 2: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Views• A view is a query.

• Views can be anonymous

SELECT *

FROM

(SELECT id, name, username, password

FROM student

WHERE password is null) R

WHERE R.id = 5 ;

• The relation R above is an anonymous view (it is not known outside of this query).

• This query is combined with the remaining query to find the optimal query plan.

Sunday, October 24, 2010

Page 3: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Views optimization• For example, after optimization the query below

SELECT *

FROM

(SELECT id, name, username, password

FROM student

WHERE password is null) R

WHERE R.id = 5 ;

• may become:

SELECT *

FROM student

WHERE password is NULL and id = 5 ;

Sunday, October 24, 2010

Page 4: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Views optimization

• It is best to use an anonymous view if the query cannot be written without it.

• Otherwise, the optimizer may loose some optimizations and rewritings of the query when views are used.

Sunday, October 24, 2010

Page 5: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

(non anonymous) Views

• Views can also be given a name. This allows them to be used in many different queries.CREATE VIEW nullpwd(id, name, username, password)

AS

SELECT id, name, username, password

FROM student

WHERE password is null ;

Sunday, October 24, 2010

Page 6: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Views

• Attributes can be renamed in views.CREATE VIEW facultyWCourse(faculty_id, name)

AS

SELECT DISTINCT f.id, f.name

FROM faculty f, class c

WHERE f.id = c.instructor_id ;

Sunday, October 24, 2010

Page 7: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Using views in queries• Views can be used in any query as if they were a table.

• Remember, views are just queries. No tuples are stored for them. SELECT * FROM nullpwdWHERE id = 5 ;

• When executing this query, the query processor first takes the query definition and replaces the query name with its definition (getting the expression in slide 2). Then, the query is optimized.

Sunday, October 24, 2010

Page 8: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Why use views?• Creating views allows the system designer to customize the

application code so that:

• The functionality for different users can be built on top of views.

• For example, faculty cannot access financial information of students and can only the information about the students who are currently taking a course from them.

• Solution: Create a view for the students in a specific class which only includes the relevant attributes. The application code will be built on top of this view.

Sunday, October 24, 2010

Page 9: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Other uses of views• Views can also be used to insert/update/delete tuples

instead of the table they are based on.

• This builds on the philosophy of building functionality based on views.

• However, this is only possible for a very restricted subset of views, called updatable views.

• Updatable views are such that each tuple in the view maps to one and only one tuple in the table it is based on.

Sunday, October 24, 2010

Page 10: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Updatable views

• A view is updatable if:

• It has only one table T in its from clause

• It contains all attributes in that cannot be null

• It does not have any distinct, group by statements

Sunday, October 24, 2010

Page 11: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Updatable views

• Example:

CREATE VIEW nullpwd(id, name, username, password)

AS

SELECT id, name, username, password

FROM student

WHERE password is null ;

UPDATE nullpwd SET password = ‘abc’ WHERE id = 5 ;

Sunday, October 24, 2010

Page 12: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Updatable views

• Note: nullpwd does not store any tuples. This expression allows only those tuples of student that are accessible through view to be updated.

• Furthermore, after the update, the resulting tuple may not even be in the view (unless the view is created with the CHECK OPTION).

UPDATE nullpwd SET password = ‘abc’ WHERE id = 5 ;

Sunday, October 24, 2010

Page 13: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Why use views?

• Using views to create functionality hides data complexity from developers.

• Also, if the data model changes, the application code does not have to change as long as the new model can be mapped to the same view.

Sunday, October 24, 2010

Page 14: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Why not use views?

• Writing a query using views may hide some optimizations from the database, creating sub optimal query plans.

Sunday, October 24, 2010

Page 15: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

INDEXING• Views do not improve performance,

sometimes they may even cause a loss of performance.

• One way to improve performance is store (cache) the result of some queries in the database.

• Indexes are exactly that, cached results of queries.

Sunday, October 24, 2010

Page 16: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Indexing exampleSELECT CrsCode, Grade

FROM Transcript

WHERE studId = 5 AND semester = ‘Fall’ AND year = 2010

Answering this query requires reading all of transcript from disk to find the matching tuples to return. Note that the matching tuples will be few, 4-5 in this case. However, transcript is large.

Sunday, October 24, 2010

Page 17: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Cost Analysis• Let us some basic cost analysis.• Suppose transcript is stored on disk in 100 disk pages.

Then, answering this query requires reading 100 disk pages.

• Suppose instead we had an index that allowed us to find all the matching tuples. Example:• Index on Transcript (studid, semester, year)• Then, answering this query will cost:

• Finding the tuples from index• Reading them from disk, at most 4-5 pages.

Sunday, October 24, 2010

Page 18: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Indexing• Indexes are just query results stored explicitly.• They are also stored on disk, but can be cheaper to

use because:• They have fewer disk pages as they store only a

subset of the attributes in the relation.• They are stored in a way to make it easy to find

queries on specific values in the index (we will see how).

• For now, we can assume that querying the index is small for most queries.

Sunday, October 24, 2010

Page 19: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

Index cost/benefit analysis• Indices are good if

• they reduce the cost of frequently asked queries

• the reduction is considerable

• Indices must be kept up to date when the tables change

• Indices increase the cost insert/update/delete operations (at least one extra disk page access for each index created)

Sunday, October 24, 2010

Page 20: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

What are good indices

• A good index will reduce the total number of matching tuples to 1 or a few

• Examples: attributes in a key

• An index on student(id) will greatly improve queries like

• SELECT * FROM student WHERE id = 5;

Sunday, October 24, 2010

Page 21: Views and Indexing - Computer Science at RPIsibel/dbs_notes/views_indexing.pdf · 2013. 2. 25. · Views • A view is a query. • Views can be anonymous SELECT * FROM (SELECT id,

What are good indices• If the underlying relation is sorted with respect to some

attribute, then an index on that attribute will help performance.

• Suppose, transcript tuples are sorted by semester, year.

• Create an index on Transcript(semester, year)

• Given the query:

SELECT CrsCode, Grade FROM Transcript

WHERE semester = ‘Fall’ AND year = 2010

use the index to find the first tuple for Fall/2010, and then scan the Transcript relation starting from that point.

Sunday, October 24, 2010