databases on 1 foot joe hellerstein computer science division uc berkeley

Databases on 1 Foot

Joe HellersteinComputer Science Division

UC Berkeley

Overview for the day• Motivation• Relational model• Relational Algebra• SQL• Concurrency Control• Recovery

What we’re skipping• Access Methods

– disk layout for tuples and pages– indexes (B-trees, linear hashing)

• Query optimization– how to map a declarative query (SQL) to a query

plan (relational algebra + implementations)• Query processing algorithms

– sort, hash, join algorithms• Database design

– logical design: E-R models & normalization– physical design: indexes, clustering, tuning

More Background• Overview Texts:

– Ramakrishnan, Database Management Systems (the cow book)

– Silberschatz, Korth, Sudarshan: Database System Concepts (the sailboat book)

– O’Neil: Database Principles, Programming, Performance– Ullman & Widom: A 1st Course in Database Systems

• Graduate-level Texts– Stonebraker & Hellerstein, eds. Readings in Database

Systems (http://redbook.cs.berkeley.edu)– Gray & Reuter: Transaction Processing: Concepts and

Techniques.– Ullman: Principles of Database Systems

Database Management Systems• What more could we want than a file

system?– Simple, efficient ad hoc1 queries– concurrency control– recovery– benefits of good data modeling

1ad hoc: formed or used for specific or immediate problems or needs

Describing Data: Data Models• A data model is a collection of concepts for

describing data.• A schema is a description of a particular

collection of data, using the a given data model.

• The relational model of data is the most widely used model today.– Main concept: relation, basically a table with

rows and columns.– Every relation has a schema, which describes

the columns, or fields.

Levels of Abstraction• Many views, single

conceptual (logical) schema and physical schema.– Views describe how

users see the data.

– Conceptual schema defines logical structure

– Physical schema describes the files and indexes used.

Physical Schema

Conceptual Schema

View 1 View 2 View 3

Example: University Database• Conceptual schema:

– Students(sid: string, name: string, login: string, age: integer, gpa:real)

– Courses(cid: string, cname:string, credits:integer) – Enrolled(sid:string, cid:string, grade:string)

• Physical schema:– Relations stored as unordered files. – Index on first column of Students.

• External Schema (View): – Course_info(cid:string,enrollment:integer)

Data Independence• Applications insulated from how data is

structured and stored.• Logical data independence: Protection

from changes in logical structure of data.• Physical data independence: Protection

from changes in physical structure of data.

One of the most important benefits of using a DBMS!

Structure of a DBMS• A typical DBMS has a

layered architecture.• The figure does not

show the concurrency control and recovery components.

• This is one of several possible architectures; each system has its own variations.

Query Optimizationand Execution

Relational Operators

Files and Access Methods

Buffer Management

Disk Space Management

DB

These layersmust considerconcurrencycontrol andrecovery

Advantages of a DBMS• Data independence• Efficient data access• Data integrity & security• Data administration• Concurrent access, crash recovery• Reduced application development time• So why not use them always?

– Can be expensive, complicated to set up and maintain

– This cost & complexity must be offset by need

- Often worth it!

By relieving the brain of all unnecessary work, a good notation sets it free to

concentrate on more advanced problems, and, in effect, increases the mental power of

the race.

-- Alfred North Whitehead (1861 - 1947)

Relational Algebra

Relational Query Languages• Query languages: Allow manipulation and

retrieval of data from a database.• Relational model supports simple, powerful QLs:

– Strong formal foundation based on logic.– Allows for much optimization.

• Query Languages != programming languages!– QLs not expected to be “Turing complete”.– QLs not intended to be used for complex calculations.– QLs support easy, efficient access to large data sets.

Formal Relational Query LanguagesTwo mathematical Query Languages form

the basis for “real” languages (e.g. SQL), and for implementation:

Relational Algebra: More operational, very useful for representing internal execution plans. (Database byte-code…)

Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-operational, declarative -- SQL comes from here.)

Preliminaries• A query is applied to relation instances, and

the result of a query is also a relation instance.– Schemas of input relations for a query are

fixed (but query will run regardless of instance!)

– The schema for the result of a given query is also fixed! Determined by definition of query language constructs.

– Languages are closed (can compose queries)

Example Instancessid sname rating age22 dustin 7 45.031 lubber 8 55.558 rusty 10 35.0

sid sname rating age28 yuppy 9 35.031 lubber 8 55.544 guppy 5 35.058 rusty 10 35.0

sid bid day22 101 10/10/9658 103 11/12/96

R1

S1

S2

• “Sailors” and “Reserves” relations for our examples.

• We’ll use positional or named field notation, assume that names of fields in query results are `inherited’ from names of fields in query input relations.

Relational Algebra• Basic operations:

– Selection ( ) Selects a subset of rows from relation.– Projection ( ) Deletes unwanted columns from relation.– Cross-product ( ) Allows us to combine two relations.– Set-difference ( ) Tuples in reln. 1, but not in reln. 2.– Union ( ) Tuples in reln. 1 and in reln. 2.

• Additional operations:– Intersection, join, division, renaming: Not essential, but

(very!) useful.

Projectionsname ratingyuppy 9lubber 8guppy 5rusty 10

sname rating S, ( )2

age35.055.5

age S( )2

• Deletes attributes that are not in projection list.

• Schema of result:– exactly the fields in the

projection list, with the same names that they had in the (only) input relation.

• Projection operator has to eliminate duplicates! (Why??)– Note: real systems typically

don’t do duplicate elimination unless the user explicitly asks for it. (Why not?)

Selection

rating S8 2( )

sid sname rating age28 yuppy 9 35.058 rusty 10 35.0

sname ratingyuppy 9rusty 10

sname rating rating S, ( ( ))8 2

• Selects rows that satisfy selection condition.

• No duplicates in result! • Schema of result:

– identical to schema of (only) input relation.

• Result relation can be the input for another relational algebra operation! (Operator composition.)

Union, Intersection, Set-Difference

• All of these operations take two input relations, which must be union-compatible:– Same number of

fields.– `Corresponding’ fields

have the same type.• What is the schema of

result?

sid sname rating age22 dustin 7 45.031 lubber 8 55.558 rusty 10 35.044 guppy 5 35.028 yuppy 9 35.0

sid sname rating age31 lubber 8 55.558 rusty 10 35.0

S S1 2

S S1 2

sid sname rating age22 dustin 7 45.0

S S1 2

Cross-Product• S1 x R1: Each row of S1 is paired with each

row of R1.• Result schema has one field per field of S1

and R1, with field names `inherited’ if possible.– Conflict: Both S1 and R1 have a field

called sid.

( ( , ), )C sid sid S R1 1 5 2 1 1

(sid) sname rating age (sid) bid day22 dustin 7 45.0 22 101 10/ 10/ 9622 dustin 7 45.0 58 103 11/ 12/ 9631 lubber 8 55.5 22 101 10/ 10/ 9631 lubber 8 55.5 58 103 11/ 12/ 9658 rusty 10 35.0 22 101 10/ 10/ 9658 rusty 10 35.0 58 103 11/ 12/ 96

Renaming operator:

Joins• Condition Join:

• Result schema same as that of cross-product.• Fewer tuples than cross-product, might be able to

compute more efficiently• Sometimes called a theta-join.

R c S c R S ( )

(sid) sname rating age (sid) bid day22 dustin 7 45.0 58 103 11/ 12/ 9631 lubber 8 55.5 58 103 11/ 12/ 96

11 .1.1 RS sidRsidS

Joins• Equi-Join: Special case: condition c contains

only conjunction of equalities.

• Result schema similar to cross-product, but only one copy of fields for which equality is specified.

• Natural Join: Equijoin on all common fields.

sid sname rating age bid day22 dustin 7 45.0 101 10/ 10/ 9658 rusty 10 35.0 103 11/ 12/ 96

S Rsid1 1

Basic SQL Query

• relation-list : A list of relation names – possibly with a range-variable after each name

• target-list : A list of attributes of tables in relation-list

• qualification : Comparisons combined using AND, OR and NOT.– Comparisons are Attr op const or Attr1 op Attr2,

where op is one of• DISTINCT: optional keyword indicating that the

answer should not contain duplicates. – Default is that duplicates are not eliminated!

SELECT [DISTINCT] target-listFROM relation-listWHERE qualification

, , , , ,

Conceptual Evaluation Strategy• Semantics of an SQL query defined in terms

of the following conceptual evaluation strategy:– Compute the cross-product of relation-list.– Discard resulting tuples if they fail qualifications.– Delete attributes that are not in target-list.– If DISTINCT is specified, eliminate duplicate rows.

• Probably the least efficient way to compute a query! – An optimizer will find more efficient strategies

same answers.

Example of Conceptual EvaluationSELECT S.sname

FROM Sailors S, Reserves RWHERE S.sid=R.sid AND R.bid=103

(sid) sname rating age (sid) bid day22 dustin 7 45.0 22 101 10/ 10/ 9622 dustin 7 45.0 58 103 11/ 12/ 9631 lubber 8 55.5 22 101 10/ 10/ 9631 lubber 8 55.5 58 103 11/ 12/ 9658 rusty 10 35.0 22 101 10/ 10/ 9658 rusty 10 35.0 58 103 11/ 12/ 96

A Note on Range Variables• Really needed only if the same relation

appears twice in the FROM clause. The previous query can also be written as:

SELECT S.snameFROM Sailors S, Reserves RWHERE S.sid=R.sid AND bid=103

SELECT snameFROM Sailors, Reserves WHERE Sailors.sid=Reserves.sid AND bid=103

Some folks suggestusing range variables always!

OR

Find sailors who’ve reserved at least one boat

• Would adding DISTINCT to this query make a difference?

• What is the effect of replacing S.sid by S.sname in the SELECT clause? – Would adding DISTINCT to this variant of the

query make a difference?

SELECT S.sidFROM Sailors S, Reserves RWHERE S.sid=R.sid

Expressions and Strings

• Arithmetic expressions, string pattern matching.

• AS and = are two ways to name fields in result.• LIKE is used for string matching.

– `_’ stands for any one character and `%’ stands for 0 or more arbitrary characters.

SELECT S.age, age1=S.age-5, 2*S.age AS age2FROM Sailors SWHERE S.sname LIKE ‘B_%B’

Aggregate Operators• Significant extension of

relational algebra.

COUNT (*)COUNT ( [DISTINCT] A)SUM ( [DISTINCT] A)AVG ( [DISTINCT] A)MAX (A)MIN (A)

SELECT AVG (S.age)FROM Sailors SWHERE S.rating=10

SELECT COUNT (*)FROM Sailors S

SELECT AVG ( DISTINCT S.age)FROM Sailors SWHERE S.rating=10

SELECT S.snameFROM Sailors SWHERE S.rating= (SELECT MAX(S2.rating) FROM Sailors S2)

single column

SELECT COUNT (DISTINCT S.rating)FROM Sailors SWHERE S.sname=‘Bob’

Find name and age of the oldest sailor(s)

• The first query is illegal!– We’ll look into the

reason a bit later, when we discuss GROUP BY.

• Third query equivalent to second query– allowed in SQL/92

standard, but not supported in some systems.

SELECT S.sname, MAX (S.age)FROM Sailors SSELECT S.sname, S.ageFROM Sailors SWHERE S.age = (SELECT MAX (S2.age) FROM Sailors S2)

SELECT S.sname, S.ageFROM Sailors SWHERE (SELECT MAX (S2.age) FROM Sailors S2) = S.age

GROUP BY and HAVING• So far, we’ve applied aggregate operators to all

(qualifying) tuples. – Sometimes, we want to apply them to each of several

groups of tuples.• Consider: Find the age of the youngest sailor for

each rating level.– In general, we don’t know how many rating levels

exist, and what the rating values for these levels are!– Suppose we know that rating values go from 1 to 10;

we can write 10 queries that look like this (!):

SELECT MIN (S.age)FROM Sailors SWHERE S.rating = i

For i = 1, 2, ... , 10:

Queries With GROUP BY and HAVING

• The target-list contains (i) attribute names (ii) terms with aggregate operations (e.g., MIN (S.age)).– The attribute list (i) must be a subset of grouping-list.

Intuitively, each answer tuple corresponds to a group, and these attributes must have a single value per group. (A group is a set of tuples that have the same value for all attributes in grouping-list.)

SELECT [DISTINCT] target-listFROM relation-listWHERE qualificationGROUP BY grouping-listHAVING group-qualification

Conceptual Evaluation• The cross-product of relation-list is computed, tuples

that fail qualification are discarded, `unnecessary’ fields are deleted, and the remaining tuples are partitioned into groups by the value of attributes in grouping-list.

• The group-qualification is then applied to eliminate some groups. Expressions in group-qualification must have a single value per group!– In effect, an attribute in group-qualification that is not an

argument of an aggregate op also appears in grouping-list. (SQL does not exploit primary key semantics here!)

• One answer tuple is generated per qualifying group.

Find the age of the youngest sailor with age 18, for each rating with at least 2 such sailors

• Only S.rating and S.age are mentioned in the SELECT, GROUP BY or HAVING clauses; other attributes `unnecessary’.

• 2nd column of result is unnamed. (Use AS to name it.)

SELECT S.rating, MIN (S.age)FROM Sailors SWHERE S.age >= 18GROUP BY S.ratingHAVING COUNT (*) > 1

sid sname rating age22 dustin 7 45.031 lubber 8 55.571 zorba 10 16.064 horatio 7 35.029 brutus 1 33.058 rusty 10 35.0rating age

1 33.07 45.07 35.08 55.510 35.0

rating7 35.0

Answer relation

For each red boat, find the number of reservations for this boat

• Grouping over a join of three relations.• What do we get if we remove B.color=‘red’ from

the WHERE clause and add a HAVING clause with this condition?

• What if we drop Sailors and the condition involving S.sid?

SELECT B.bid, COUNT (*) AS scountFROM Sailors S, Boats B, Reserves RWHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’GROUP BY B.bid

Find the age of the youngest sailor with age > 18, for each rating with at least 2 sailors (of any age)

• Shows HAVING clause can also contain a subquery. • Compare this with the query where we considered only

ratings with 2 sailors over 18!• What if HAVING clause is replaced by:

– HAVING COUNT(*) >1

SELECT S.rating, MIN (S.age)FROM Sailors SWHERE S.age > 18GROUP BY S.ratingHAVING 1 < (SELECT COUNT (*) FROM Sailors S2 WHERE S.rating=S2.rating)

Find those ratings for which the average age is the minimum over all ratings

• Aggregate operations cannot be nested! WRONG:

SELECT S.ratingFROM Sailors SWHERE S.age = (SELECT MIN (AVG (S2.age)) FROM Sailors S2)

SELECT Temp.rating, Temp.avgageFROM (SELECT S.rating, AVG (S.age) AS avgage FROM Sailors S GROUP BY S.rating) AS TempWHERE Temp.avgage = (SELECT MIN (Temp.avgage) FROM Temp)

• Correct solution (in SQL/92):

Null Values• Field values in a tuple are sometimes unknown (e.g.,

a rating has not been assigned) or inapplicable (e.g., no spouse’s name). – SQL provides a special value null for such situations.

• The presence of null complicates many issues. E.g.:– Special operators needed to check if value is/is not null. – Is rating>8 true or false when rating is equal to null?

What about AND, OR and NOT connectives?– We need a 3-valued logic (true, false and unknown).– Meaning of constructs must be defined carefully. (e.g.,

WHERE clause eliminates rows that don’t evaluate to true.)– New operators (in particular, outer joins) possible/needed.

Embedded SQL

• SQL commands can be called from within a host language (e.g., C or COBOL) program.– SQL statements can refer to host variables

(including special variables used to return status).

– Must include a statement to connect to the right database.

– Requires compiler preprocessing– SQL relations are (multi-) sets of records, with

no a priori bound on the number of records. No such data structure in C.

– SQL supports a mechanism called a cursor to handle this.

Cursors• Can declare a cursor on a relation or query

statement (which generates a relation).• Can open a cursor, and repeatedly fetch a tuple

then move the cursor, until all tuples have been retrieved.– Can use ORDER BY to control the order in which

tuples are returned.• Fields in ORDER BY clause must also appear in SELECT

clause.• Can also modify/delete tuple pointed to by a

cursor.

Cursor that gets names of sailors who’ve reserved a red boat, in alphabetical order

• Note that it is illegal to replace S.sname by, say, S.sid in the ORDER BY clause! (Why?)

• Can we add S.sid to the SELECT clause and replace S.sname by S.sid in the ORDER BY clause?

DECLARE sinfo CURSOR FORSELECT S.snameFROM Sailors S, Boats B, Reserves RWHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ORDER BY S.sname;

FETCH 5 IN sinfo;

Embedding SQL in C: An Examplechar SQLSTATE[6];EXEC SQL BEGIN DECLARE SECTIONchar c_sname[20]; short c_minrating; float c_age;EXEC SQL END DECLARE SECTIONc_minrating = random();EXEC SQL DECLARE sinfo CURSOR FOR

SELECT S.sname, S.age FROM Sailors SWHERE S.rating > :c_minratingORDER BY S.sname;

do {EXEC SQL FETCH sinfo INTO :c_sname, :c_age;printf(“%s is %d years old\n”, c_sname, c_age);

} while (SQLSTATE != ‘02000’);EXEC SQL CLOSE sinfo;

Database APIs: alternative to embedding

• Rather than modify compiler, add library with database calls (API)– special procedures/objects– passes SQL strings from language, presents result

sets in a language-friendly way– Microsoft’s ODBC becoming C/C++ standard on

Windows– Sun’s JDBC a Java equivalent– Supposedly DBMS-neutral

• a “driver” traps the calls and translates them into DBMS-specific code

• database can be across a network

SQL API in Java (JDBC)Connection con = // connect DriverManager.getConnection(url, ”login", ”pass"); Statement stmt = con.createStatement(); // set up stmtString query = "SELECT COF_NAME, PRICE FROM COFFEES";ResultSet rs = stmt.executeQuery(query);try { // handle exceptions // loop through result tuples while (rs.next()) { String s = rs.getString("COF_NAME"); Float n = rs.getFloat("PRICE"); System.out.println(s + " " + n); }} catch(SQLException ex) { System.out.println(ex.getMessage () + ex.getSQLState () + ex.getErrorCode ());}

Concurrency Control & Recovery.Why Have Concurrent Processes?• Better transaction throughput, response time

• Done via better utilization of resources:– While one processes is doing a disk read, another

can be using the CPU or reading another disk.• DANGER DANGER!DANGER DANGER! Concurrency could lead to

incorrectness!– Must carefully manage concurrent data access.– There’s (much!) more here than the usual OS

tricks!

Query Optimization & Processing• Optimizer maps SQL to algebra tree with

specific algorithms– access methods and join algorithms

• relational operators implemented as iterators– open()– next(possible with condition)– close

• processing engine is a pull-based, single-threaded data flow– parallelizes naturally

Transactions• Basic concurrency/recovery concept: a

transaction (Xact).– A sequence of many actions which are

considered to be one atomic unit of work.• DBMS “actions”:

– reads, writes– Special actions: commit, abort– for now, assume reads and writes are on

tuples; we’ll revisit this assumption later.

The ACIDACID Properties

• AA tomicity: All actions in the Xact happen, or none happen.

• CC onsistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent.

• II solation: Execution of one Xact is isolated from that of other Xacts.

• DDurability: If a Xact commits, its effects persist.

Passing the ACID Test• Concurrency Control

– Guarantees Consistency and Isolation, given Atomicity.

• Logging and Recovery– Guarantees Atomicity and Durability.

• C. C.:– What problems could arise?– What is acceptable behavior?– How do we guarantee acceptable behavior?

Schedules• Schedule: An interleaving of actions

from a set of Xacts, where the actions of any 1 Xact are in the original order.– Represents some actual sequence of

database actions.– Example: R1(A), W1(A), R2(B), W2(B),

R1(C), W1(C)– In a complete schedule, each Xact ends

in commit or abort.• Initial State + Schedule Final State

Acceptable Schedules• One sensible “isolated, consistent” schedule:

– Run Xacts one at a time, in a series.– This is called a serial schedule.– NOTE: Different serial schedules can have different final

states; all are “OK” -- DBMS makes no guarantees about the order in which concurrently submitted Xacts are executed.

• Serializable schedules:– Final state is what some serial schedule would have

produced.– Aborted Xacts are not part of schedule; ignore them for now

(they are made to `disappear’ by using logging).

Serializability Violations

• Two actions conflict when 2 xacts access the same item:– W-R conflict: T2 reads

something T1 wrote.– R-W and W-W conflicts:

Similar.• WR conflict (dirty read):

– Result is not equal to any serial execution!

transfer $100 from A to B

add 6% interest to A & B

Database is inconsistent!

More Conflicts• RW Conflicts (Unrepeatable Read)

– T2 overwrites what T1 read.– If T1 reads it again, it will see something new!

• Example when this would happen?• The increment/decrement example.

– Again, not equivalent to a serial execution.• WW Conflicts (Overwriting Uncommited Data)

– T2 overwrites what T1 wrote.• Example: 2 Xacts to update items to be kept equal.

– Usually occurs in conjunction w/other anomalies.• Unless you have “blind writes”.

Locking: A Technique for C. C.• Concurrency control usually done via

locking.• Lock info maintained by a “lock manager”:

– Stores (XID, RID, Mode) triples. • This is a simplistic view; suffices for now.

– Mode {S,X}– Lock compatibility table:

• If a Xact can’t get a lock, it is suspended on a wait queue.

-- S X--SX

Two-Phase Locking (2PL)• 2PL:

– If T wants to read an object, first obtains an S lock.

– If T wants to modify an object, first obtains X lock.

– If T releases any lock, it can acquire no new locks!

• Locks are automatically obtained by DBMS.• Guarantees serializability!

– Why?

Time

# oflocks

lock point

growing phase

shrinking phase

Strict 2PL• Strict 2PL:

– If T wants to read an object, first obtains an S lock.

– If T wants to modify an object, first obtains X lock.

– Hold all locks until end of transaction.• Guarantees serializability, and avoids

cascading aborts, too!– also avoids WW problems!

Time

# oflocks

Precedence Graph• A Precedence (or Serializability) graph:

– Node for each commited Xact.– Arc from Ti to Tj if an action of Ti precedes

and conflicts with an action of Tj.• T1 transfers $100 from A to B, T2 adds

6%– R1(A), W1(A), R2(A), W2(A), R2(B), W2(B),

R1(B), W1(B)T1 T2

Conflict Serializability• 2 schedules are conflict equivalent if:

– they have the same sets of actions, and– each pair of conflicting actions is ordered in

the same way.• A schedule is conflict serializable if it is

conflict equivalent to a serial schedule.– Note: Some serializable schedules are not

conflict serializable!

Conflict Serializability & Graphs• Theorem: A schedule is conflict serializable

iff its precedence graph is acyclic.• Theorem: 2PL ensures that the precedence

graph will be acyclic! • Strict 2PL improves on this by avoiding

cascading aborts, problems with undoing WW conflicts.

Lock Manager Implementation• Question 1: What are we locking?

– Tuples, pages, or tables?– Finer granularity increases concurrency, but also

increases locking overhead.• Question 2: How do you “lock” something??• Lock Table: A hash table of Lock Entries.

– Lock Entry:• OID• Mode• List: Xacts holding lock• List: Wait Queue

Handling a Lock RequestLock Request (XID, OID, Mode)

Currently Locked?

Grant Lock

Empty Wait Queue?

Currently X-locked?

Mode==X Mode==S

No

YesNo Yes

Put on QueueYes

No

More Lock Manager Logic• On lock release (OID, XID):

– Update list of Xacts holding lock.– Examine head of wait queue.– If Xact there can run, add it to list of Xacts

holding lock (change mode as needed).– Repeat until head of wait queue cannot be

run.• Note: Lock request handled atomically!

– via latches (i.e. semaphores/mutex; OS stuff).

Lock Upgrades• Think about this scenario:

– T1 locks A in S mode, T2 requests X lock on A, T3 requests S lock on A. What should we do?

• In contrast: – T1 locks A in S mode, T2 requests X lock on A, T1

requests X lock on A. What should we do? • Allow such upgrades to supersede lock requests.

– Consider this scenario:• S1(A), X2(A), X1(A): DEADLOCK!

• BTW: Deadlock can occur even w/o upgrades:– X1(A), X2(B), S1(B), S2(A)

Deadlock Detection• Lock Mgr maintains a “Waits-for” graph:

– Node for each Xact.– Arc from Ti to Tj if Tj holds a lock and Ti is

waiting for it.• Periodically check graph for cycles.• “Shoot” some Xact to break the cycle.• Simpler hack: time-outs.

– T1 made no progress for a while? Shoot it.

To lock such rascal counters from his friends, Be ready, gods, with all your thunderbolts: Dash him to pieces!

-- Shakespeare, Julius Caesar

Prevention vs. Detection• Prevention might abort too many Xacts.• Detection might allow deadlocks to tie up

resources for a while.– Can detect more often, but it’s time-

consuming.• The usual answer:

– Detection is the winner.– Deadlocks are pretty rare.– If you get a lot of deadlocks, reconsider your

schema/workload!

Multiple-Granularity Locks• Hard to decide what granularity to lock

(tuples vs. pages vs. tables).• Shouldn’t have to decide!• Data “containers” are nested:

Tuples

TablesPages

Database

contains

Solution: New Lock Modes, Protocol• Allow Xacts to lock at each level, but with

a special protocol using new “intention” locks:

Before locking an item, Xact must set “intention locks” on all its ancestors.

For unlock, go from specific to general (i.e., bottom-up).

SIX mode: Like S & IX at the same time.

-- IS IX--ISIX

S X

SX

Examples• T1 scans R, and updates a few tuples:

– T1 gets an SIX lock on R, and occasionally upgrades to X on a tuple.

• T2 uses an index to read only part of R:– T2 gets an IS lock on R, and repeatedly

gets an S lock on tuples of R.• T3 reads all of R:

– T3 gets an S lock on R. – OR, T3 could behave like T2; can

use lock escalation to decide which.

-- IS IX--ISIX

S X

SX

Logging and Recovery: Motivation

• Atomicity: – Transactions may abort (“Rollback”).

• Durability:– What if DBMS stops running? (Causes?)

crash! Desired Behavior after

system restarts:– T1, T2 & T3 should be

durable.– T4 & T5 should be

aborted (effects not seen).

T1T2T3T4T5

Assumptions• Concurrency control is in effect.

– Strict 2PL, in particular.• Updates are happening “in place”.

– i.e. data is overwritten on (deleted from) the disk.

• A simple scheme to guarantee Atomicity & Durability?

Handling the Buffer Pool• Force write to disk at

commit?– Poor response time.– But provides durability.

• Steal buffer-pool frames from uncommited Xacts?– If not, poor throughput.– If so, how can we ensure

atomicity?

Force

No Force

No Steal Steal

Trivial

Desired

More on Steal and Force• STEAL (why enforcing Atomicity is hard)

– To steal frame F: Current page in F (say P) is written to disk; some Xact holds lock on P.

• What if the Xact with the lock on P aborts?• Must remember the old value of P at steal time (to

support UNDOing the write to page P).• NO FORCE (why enforcing Durability is hard)

– What if system crashes before a modified page is written to disk?

– Write as little as possible, in a convenient place, at commit time,to support REDOing modifications.

Basic Idea: Logging• Record REDO and UNDO information, for every

update, in a log.– Sequential writes to log (put it on a separate disk).– Minimal info (diff) written to log, so multiple

updates fit in a single log page.• Log: An ordered list of REDO/UNDO actions

– Log record contains: <XID, pageID, offset, length, old data, new data>

– and additional control info

Write-Ahead Logging (WAL)• The Write-Ahead Logging Protocol:

Must force the log record for an update before the corresponding data page gets to disk.

Must write all log records for a Xact before commit.

• #1 guarantees Atomicity.• #2 guarantees Durability.

databases on 1 foot joe hellerstein computer science division uc berkeley

Documents