concurrency control in production environments, it is unlikely that we can limit our system to just...

37
Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. Consequently, it is possible for multiple queries or transactions to be submitted at approximately the same time. If all of the queries were very small (i.e., in terms of time), we could probably just execute them on a first-come-first-served basis. However, many queries are both complex and time consuming. Executing these queries would make other queries wait a long time for a chance to execute. So, in practice, the DBMS may be running many different transactions at about the same time.

Upload: randell-lamb

Post on 03-Jan-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Concurrency control• In production environments, it is unlikely that we can

limit our system to just one user at a time.– Consequently, it is possible for multiple queries or

transactions to be submitted at approximately the same time.

• If all of the queries were very small (i.e., in terms of time), we could probably just execute them on a first-come-first-served basis.

• However, many queries are both complex and time consuming.– Executing these queries would make other queries wait a

long time for a chance to execute.• So, in practice, the DBMS may be running many

different transactions at about the same time.

Page 2: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

The problem• Transactions consist of multiple actions (reads, writes,

updates, etc.)• Since transactions may access or modify the same

elements as other transactions (e.g., fields), conflicts may result.– Specifically, arbitrarily interleaving the individual actions of

various transactions may produce unexpected results.• A primary goal of the DBSM is to maintain the database

in a consistent and predictable state.– Therefore we must take steps to ensure or guarantee this

consistency. • We assume that each transaction in isolation takes the

database from a consistent state to a consistent state

Page 3: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

• It is the job of the Transaction Manager to convert queries into actions

• To maintain database consistency, restrictions must be placed on the order of the transactions

– The Transaction Manager does not do this.

• Instead it passes requests to the Scheduler sub-system.

• Its job is to create an order that preserves the consistency of the DB. Scheduler may:

– delay a request from a transaction

– abort a transaction

The Scheduler

Page 4: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Serial and serializable schedulesExample with two transactions•Assume A = B is required for consistency.

• We deal only with Reads and Writes in the main memory buffers.

• T1 and T2 individually preserve DB consistency.

Page 5: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

An Acceptable (serial) schedule S1

• Assume initially A = B = 25. Here is one way to execute

S1= (T1; T2) so they do not interfere.

Page 6: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Another acceptable (serial) schedule S2

• Here, transactions are executed as S2= (T2; T1). The result is different, but consistency is maintained.

Page 7: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Interleaving Doesn't Necessarily Hurt (S3)

Page 8: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

But Then Again, It Might!

Page 9: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

The Semantics of transactions is also important. Here T2 adds 200 to A and to B

Page 10: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

We Need a Simpler Model

• Assume that whenever a transaction T writes X, it changes X in some unique way.

– Arithmetic coincidence never happens

• Thus, we focus on the reads and writes only, assuming that whenever transaction T reads X and then writes it, X has changed in some unique way.

– Notation: rT(X) denotes T reads the DB element X

wT(X) denotes T writes X

• If transactions are T1,…,Tk, then we will simply use ri and wi,

instead of rTi and wTi

Page 11: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Transactions and Schedules

• A transaction (model) is a sequence of r and w actions on database elements.

• A schedule is a sequence of reads/writes actions performed by a collection of transactions.

• Serial Schedule = All actions for each transaction are consecutive.

• r1(A); w1(A); r1(B); w1(B); r2(A); w2(A); r2(B); w2(B); …

• Serializable Schedule: A schedule whose “effect” is equivalent to that of some serial schedule.

• We will introduce a sufficient condition for serializability.

Page 12: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Conflicts

• Suppose for fixed DB elements X & Y,

ri(X); rj(Y) is part of a schedule, and we flip the order of these operations.

– ri(X); rj(Y) ≡ rj(Y); ri(X) … In what sense?

– This holds always (even when X=Y)

• We can flip ri(X); wj(Y), as long as X≠Y

• That is, ri(X); wj (X) wj(X); ri (X)

– In the RHS, Ti reads the value of X written

by Tj, whereas it is not so in the LHS.

Page 13: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Conflicts (Cont’d)• We can flip wi(X); wj(Y); provided X≠Y

• However, wi(X); wj(X) ≢ wj(X); wi(X);

– The final value of X may be different depending on which write occurs last.

• There is a conflict if 2 conditions hold.

• A read and a write of the same X, or

• Two writes of X conflict in general and may not be swapped in order.

All other events (reads/writes) may be swapped without changing the effect of the schedule (on the DB).

Page 14: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Example

• Two scheduless are conflict-equivalent if they can be converted into the other by a series of non-conflicting swaps of adjacent elements

• A schedule is conflict-serializable if it can be converted into a serializable schedule in the same way

r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B)

r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B)

r1(A); w1(A); r2(A); r1(B); w2(A); w1(B); r2(B); w2(B)

r1(A); w1(A); r1(B); r2(A); w2(A); w1(B); r2(B); w2(B)

r1(A); w1(A); r1(B); r2(A); w1(B); w2(A); r2(B); w2(B)

r1(A); w1(A); r1(B); w1(B); r2(A)w2(A); r2(B); w2(B)

• This final result is a serial schedule.

Page 15: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Precedence Graphs

• Non-swappable pairs of actions represent potential conflicts between transactions.

• The existence of non-swappable actions enforces an ordering on the transactions that perform these actions.

• We say that T1 takes precedence over T2 in the schedule S, written T1 <S T2, if there are actions A1 of T1 and A2 of T2 such that1. A1 is ahead of A2 in S AND2. Both A1 and A2 involve the same element AND3. At least one of A1 and A2 is a write.

Page 16: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible
Page 17: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

• If there is a cycle in the graph – Then, there is no serial schedule which is conflict

equivalent to S. • Each arc represents a requirement on the order of

transactions in a conflict equivalent serial schedule.

• A cycle puts too many requirements on any linear order of transactions.

• If there is no cycle in the graph– Then any topological order of the graph suggests a

conflict equivalent schedule.

Page 18: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Why the Precedence-Graph Test Works

• Idea: if the precedence graph is acyclic, then we can swap actions to form a serial schedule consistent with some topological order;

Proof: By induction on n, number of transactions.

• Basis: n = 1. That is, S={T1}; then S is already serial.

• Induction: S={T1,T2,…,Tn}. Given that SG(S) is acyclic, then pick Ti in S such that Ti has no incoming arcs

– We swap all actions of Ti to the front (of S).

– (Actions of Ti)(Actions of the other n-1 transactions)

– The tail is a precedence graph that is the same as the original without Ti, i.e. it has n-1 nodes and it is acyclic.

By the induction hyposthesis, we can reorder the actions of the other transactions to turn it into a serial schedule

Page 19: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Schedulers• A scheduler takes requests from transactions for reads and

writes, and decides if it is “OK” to allow them to operate on DB or defer them until it is safe to do so.

• Ideal: a scheduler forwards a request iff it cannot lead to inconsistency of DB

– Too hard to decide this in real time.

• Real: it forwards a request if it cannot result in a violation of conflict serializability.

• We thus need to develop schedulers which ensure conflict-serializablility.

Page 20: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

• To prevent conflicts, a Scheduler employs Lock Tables.

– The idea is to record information about the current or requested locks.

• Consequently, by blocking access to certain elements, the Scheduler can enforce an access order

– Basically, we can prevent actions on the same element from being improperly swapped

– This essentially enforces the order of the precedence graph.

Lock tables

Page 21: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Lock Actions• Before reading or writing an element X, a transaction Ti requests a

lock on X from the scheduler.

• The scheduler can either grant the lock to Ti or make Ti wait for the lock.

• If granted, Ti should eventually unlock (release) the lock on X.

• Shorthands:

– li(X) = “transaction Ti requests a lock on X”

– ui(X) = “Ti unlocks/releases the lock on X”

Page 22: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

• The use of lock must be proper in 2 senses:– Consistency of Transactions:

• Read or write X only when hold a lock on X.

– ri(X) or wi(X) must be preceded by some li(X) with no intervening ui(X).

• If Ti locks X, Ti must eventually unlock X.

– Every li(X) must be followed by ui(X).

– Legality of Schedules: • Two transactions may not have locked the same element X without

one having first released the lock.

– A schedule with li(X) cannot have another lj(X) until ui(X) appears in between

Page 23: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Legal Schedule Doesn’t Mean Serializable

Page 24: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Two Phase LockingThere is a simple condition, which guarantees confict-serializability: In every transaction, all lock requests (phase 1) precede all unlock requests (phase 2).

Page 25: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Why 2PL Works• Precisely: a legal schedule S of 2PL transactions is conflict

serializable.

• Proof is an induction on n, the number of transactions.

• Remember, conflicts involve only read/write actions, not locks, but the legality of the transaction requires that the r/w's be consistent with the l/u's.

Page 26: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Why 2PL Works (Cont’d)• Basis: if n=1, then S={T1}, and hence S is conflict-serializable.

• Induction: S={T1,…,Tn}. Find the first transaction, say Ti, to perform an unlock action, say ui(X).

• Can we show that the r/w actions of Ti can be moved to the front of the other transactions without conflict?

• Consider some action such as wi(Y). Can it be preceded by some conflicting action wj(Y) or rj(Y)? In such a case we cannot swap them.

– If so, then uj(Y) and li(Y) must intervene, as

wj(Y)...uj(Y)...li(Y)...wi(Y).

– Since Ti is the first to unlock, ui(X) appears before uj(Y).

– But then li(Y) appears after ui(X), contradicting 2PL.

• Conclusion: wi(Y) can slide forward in the schedule without conflict; similar argument for a ri(Y) action.

Page 27: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Risk of deadlocks

The scheduler must perform deadlock resolution(usually by aborting one of the locked transactions)

Page 28: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Shared/Exclusive Locks

• Problem: while simple locks + 2PL guarantee conflict serializability,

they do not allow two readers of DB element X at the same time.

• But having multiple readers is not a problem for conflict serializability (since read actions commute).

Page 29: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Shared/Exclusive Locks (Cont’d)

• Solution: Two kinds of locks:

1. Shared lock sli(X) allows Ti to read, but not write X. – It prevents other transactions from writing X but not

from reading X.

2. Exclusive lock xli(X) allows Ti to read and/or write X; no other transaction may read or write X.

Page 30: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

• Consistency of transaction conditions:

– A read ri(X) must be preceded by sli(X) or xli(X), with no intervening ui(X).

– A write wi(X) must be preceded by xli(X), with no intervening ui(X).

• Legal schedules:

– No two exclusive locks.

• If xli(X) appears in a schedule, then there cannot be a xlj(X) until after a ui(X) appears.

– No exclusive and shared locks.

• If xli(X) appears, there can be no slj(X) until after ui(X).

• If sli(X) appears, there can be no xlj(X) until after ui(X).

• 2PL condition:

– No transaction may have a sl(X) or xl(X) after a u(Y).

Page 31: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Scheduler Rules• When there is more than one kind of lock, the

scheduler needs a rule that says “if there is already a lock of type A on DB element X, can I grant a lock of type B on X?”

• The compatibility matrix answers the question. Compatibility Matrix for Shared/Exclusive Locks is:

Page 32: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Upgrading Locks• Instead of taking an exclusive lock immediately, a

transaction can take a shared lock on X, read X, and then upgrade the lock to exclusive so that it can write X.

Page 33: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Upgrading Locks (Cont.)

Had T1 asked for an exclusive lock on B before reading B, the request would have been denied, because T2 already has a

shared lock on B.

Page 34: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Deadlocks

Problem: when we allow upgrades, it is easy to get into a deadlock situation.

Example:T1 and T2 each reads X and later writes X.

Page 35: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Solution: Update Locks

• Update lock uli(X) with asymetric compatibility matrix. – Only an update (not read) can be upgraded to write

(If there are no shared locks anymore).

– Legal schedules: read action permitted when there is either a shared or update lock.

– An update can be granted while there is a shared lock, but the scheduler will not grant a shared lock when there is an update.

Page 36: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Compatibility Matrix for Shared, Exclusive,

and Update Locks

Page 37: Concurrency control In production environments, it is unlikely that we can limit our system to just one user at a time. – Consequently, it is possible

Example: T1 and T2 each read X and later write X.