concurrency control in production environments, it is unlikely that we can limit our system to just...

Concurrency control• In production environments, it is unlikely that we can

limit our system to just one user at a time.– Consequently, it is possible for multiple queries or

transactions to be submitted at approximately the same time.

• If all of the queries were very small (i.e., in terms of time), we could probably just execute them on a first-come-first-served basis.

• However, many queries are both complex and time consuming.– Executing these queries would make other queries wait a

long time for a chance to execute.• So, in practice, the DBMS may be running many

different transactions at about the same time.

The problem• Transactions consist of multiple actions (reads, writes,

updates, etc.)• Since transactions may access or modify the same

elements as other transactions (e.g., fields), conflicts may result.– Specifically, arbitrarily interleaving the individual actions of

various transactions may produce unexpected results.• A primary goal of the DBSM is to maintain the database

in a consistent and predictable state.– Therefore we must take steps to ensure or guarantee this

consistency. • We assume that each transaction in isolation takes the

database from a consistent state to a consistent state

• It is the job of the Transaction Manager to convert queries into actions

• To maintain database consistency, restrictions must be placed on the order of the transactions

– The Transaction Manager does not do this.

• Instead it passes requests to the Scheduler sub-system.

• Its job is to create an order that preserves the consistency of the DB. Scheduler may:

– delay a request from a transaction

– abort a transaction

The Scheduler

Serial and serializable schedulesExample with two transactions•Assume A = B is required for consistency.

• We deal only with Reads and Writes in the main memory buffers.

• T1 and T2 individually preserve DB consistency.

An Acceptable (serial) schedule S1

• Assume initially A = B = 25. Here is one way to execute

S1= (T1; T2) so they do not interfere.

Another acceptable (serial) schedule S2

• Here, transactions are executed as S2= (T2; T1). The result is different, but consistency is maintained.

Interleaving Doesn't Necessarily Hurt (S3)

But Then Again, It Might!

The Semantics of transactions is also important. Here T2 adds 200 to A and to B

We Need a Simpler Model

• Assume that whenever a transaction T writes X, it changes X in some unique way.

– Arithmetic coincidence never happens

• Thus, we focus on the reads and writes only, assuming that whenever transaction T reads X and then writes it, X has changed in some unique way.

– Notation: rT(X) denotes T reads the DB element X

wT(X) denotes T writes X

• If transactions are T1,…,Tk, then we will simply use ri and wi,

instead of rTi and wTi

Transactions and Schedules

• A transaction (model) is a sequence of r and w actions on database elements.

• A schedule is a sequence of reads/writes actions performed by a collection of transactions.

• Serial Schedule = All actions for each transaction are consecutive.

• r1(A); w1(A); r1(B); w1(B); r2(A); w2(A); r2(B); w2(B); …

• Serializable Schedule: A schedule whose “effect” is equivalent to that of some serial schedule.

• We will introduce a sufficient condition for serializability.

Conflicts

• Suppose for fixed DB elements X & Y,

ri(X); rj(Y) is part of a schedule, and we flip the order of these operations.

– ri(X); rj(Y) ≡ rj(Y); ri(X) … In what sense?

– This holds always (even when X=Y)

• We can flip ri(X); wj(Y), as long as X≠Y

• That is, ri(X); wj (X) wj(X); ri (X)

– In the RHS, Ti reads the value of X written

by Tj, whereas it is not so in the LHS.

Conflicts (Cont’d)• We can flip wi(X); wj(Y); provided X≠Y

• However, wi(X); wj(X) ≢ wj(X); wi(X);

– The final value of X may be different depending on which write occurs last.

• There is a conflict if 2 conditions hold.

• A read and a write of the same X, or

• Two writes of X conflict in general and may not be swapped in order.

All other events (reads/writes) may be swapped without changing the effect of the schedule (on the DB).

Example

• Two scheduless are conflict-equivalent if they can be converted into the other by a series of non-conflicting swaps of adjacent elements

• A schedule is conflict-serializable if it can be converted into a serializable schedule in the same way

r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B)

r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B)

r1(A); w1(A); r2(A); r1(B); w2(A); w1(B); r2(B); w2(B)

r1(A); w1(A); r1(B); r2(A); w2(A); w1(B); r2(B); w2(B)

r1(A); w1(A); r1(B); r2(A); w1(B); w2(A); r2(B); w2(B)

r1(A); w1(A); r1(B); w1(B); r2(A)w2(A); r2(B); w2(B)

• This final result is a serial schedule.

Precedence Graphs

• Non-swappable pairs of actions represent potential conflicts between transactions.

• The existence of non-swappable actions enforces an ordering on the transactions that perform these actions.

• We say that T1 takes precedence over T2 in the schedule S, written T1 <S T2, if there are actions A1 of T1 and A2 of T2 such that1. A1 is ahead of A2 in S AND2. Both A1 and A2 involve the same element AND3. At least one of A1 and A2 is a write.

• If there is a cycle in the graph – Then, there is no serial schedule which is conflict

equivalent to S. • Each arc represents a requirement on the order of

transactions in a conflict equivalent serial schedule.

• A cycle puts too many requirements on any linear order of transactions.

• If there is no cycle in the graph– Then any topological order of the graph suggests a

conflict equivalent schedule.

Why the Precedence-Graph Test Works

• Idea: if the precedence graph is acyclic, then we can swap actions to form a serial schedule consistent with some topological order;

Proof: By induction on n, number of transactions.

• Basis: n = 1. That is, S={T1}; then S is already serial.

• Induction: S={T1,T2,…,Tn}. Given that SG(S) is acyclic, then pick Ti in S such that Ti has no incoming arcs

– We swap all actions of Ti to the front (of S).

– (Actions of Ti)(Actions of the other n-1 transactions)

– The tail is a precedence graph that is the same as the original without Ti, i.e. it has n-1 nodes and it is acyclic.

By the induction hyposthesis, we can reorder the actions of the other transactions to turn it into a serial schedule

Schedulers• A scheduler takes requests from transactions for reads and

writes, and decides if it is “OK” to allow them to operate on DB or defer them until it is safe to do so.

• Ideal: a scheduler forwards a request iff it cannot lead to inconsistency of DB

– Too hard to decide this in real time.

• Real: it forwards a request if it cannot result in a violation of conflict serializability.

• We thus need to develop schedulers which ensure conflict-serializablility.

• To prevent conflicts, a Scheduler employs Lock Tables.

– The idea is to record information about the current or requested locks.

• Consequently, by blocking access to certain elements, the Scheduler can enforce an access order

– Basically, we can prevent actions on the same element from being improperly swapped

– This essentially enforces the order of the precedence graph.

Lock tables

Lock Actions• Before reading or writing an element X, a transaction Ti requests a

lock on X from the scheduler.

• The scheduler can either grant the lock to Ti or make Ti wait for the lock.

• If granted, Ti should eventually unlock (release) the lock on X.

• Shorthands:

– li(X) = “transaction Ti requests a lock on X”

– ui(X) = “Ti unlocks/releases the lock on X”

• The use of lock must be proper in 2 senses:– Consistency of Transactions:

• Read or write X only when hold a lock on X.

– ri(X) or wi(X) must be preceded by some li(X) with no intervening ui(X).

• If Ti locks X, Ti must eventually unlock X.

– Every li(X) must be followed by ui(X).

– Legality of Schedules: • Two transactions may not have locked the same element X without

one having first released the lock.

– A schedule with li(X) cannot have another lj(X) until ui(X) appears in between

Legal Schedule Doesn’t Mean Serializable

Two Phase LockingThere is a simple condition, which guarantees confict-serializability: In every transaction, all lock requests (phase 1) precede all unlock requests (phase 2).

Why 2PL Works• Precisely: a legal schedule S of 2PL transactions is conflict

serializable.

• Proof is an induction on n, the number of transactions.

• Remember, conflicts involve only read/write actions, not locks, but the legality of the transaction requires that the r/w's be consistent with the l/u's.

Why 2PL Works (Cont’d)• Basis: if n=1, then S={T1}, and hence S is conflict-serializable.

• Induction: S={T1,…,Tn}. Find the first transaction, say Ti, to perform an unlock action, say ui(X).

• Can we show that the r/w actions of Ti can be moved to the front of the other transactions without conflict?

• Consider some action such as wi(Y). Can it be preceded by some conflicting action wj(Y) or rj(Y)? In such a case we cannot swap them.

– If so, then uj(Y) and li(Y) must intervene, as

wj(Y)...uj(Y)...li(Y)...wi(Y).

– Since Ti is the first to unlock, ui(X) appears before uj(Y).

– But then li(Y) appears after ui(X), contradicting 2PL.

• Conclusion: wi(Y) can slide forward in the schedule without conflict; similar argument for a ri(Y) action.

Risk of deadlocks

The scheduler must perform deadlock resolution(usually by aborting one of the locked transactions)

Shared/Exclusive Locks

• Problem: while simple locks + 2PL guarantee conflict serializability,

they do not allow two readers of DB element X at the same time.

• But having multiple readers is not a problem for conflict serializability (since read actions commute).

Shared/Exclusive Locks (Cont’d)

• Solution: Two kinds of locks:

1. Shared lock sli(X) allows Ti to read, but not write X. – It prevents other transactions from writing X but not

from reading X.

2. Exclusive lock xli(X) allows Ti to read and/or write X; no other transaction may read or write X.

• Consistency of transaction conditions:

– A read ri(X) must be preceded by sli(X) or xli(X), with no intervening ui(X).

– A write wi(X) must be preceded by xli(X), with no intervening ui(X).

• Legal schedules:

– No two exclusive locks.

• If xli(X) appears in a schedule, then there cannot be a xlj(X) until after a ui(X) appears.

– No exclusive and shared locks.

• If xli(X) appears, there can be no slj(X) until after ui(X).

• If sli(X) appears, there can be no xlj(X) until after ui(X).

• 2PL condition:

– No transaction may have a sl(X) or xl(X) after a u(Y).

Scheduler Rules• When there is more than one kind of lock, the

scheduler needs a rule that says “if there is already a lock of type A on DB element X, can I grant a lock of type B on X?”

• The compatibility matrix answers the question. Compatibility Matrix for Shared/Exclusive Locks is:

Upgrading Locks• Instead of taking an exclusive lock immediately, a

transaction can take a shared lock on X, read X, and then upgrade the lock to exclusive so that it can write X.

Upgrading Locks (Cont.)

Had T1 asked for an exclusive lock on B before reading B, the request would have been denied, because T2 already has a

shared lock on B.

Deadlocks

Problem: when we allow upgrades, it is easy to get into a deadlock situation.

Example:T1 and T2 each reads X and later writes X.

Solution: Update Locks

• Update lock uli(X) with asymetric compatibility matrix. – Only an update (not read) can be upgraded to write

(If there are no shared locks anymore).

– Legal schedules: read action permitted when there is either a shared or update lock.

– An update can be granted while there is a shared lock, but the scheduler will not grant a shared lock when there is an update.

Compatibility Matrix for Shared, Exclusive,

and Update Locks

Example: T1 and T2 each read X and later write X.

concurrency control in production environments, it is unlikely that we can limit our system to just...

Documents

different transactions

transaction t

wti transactions

collection of transactions

semantics of transactions

database consistency

db consistency

multiple queries