Download - Concurrency Control (Chapter 10.4-10.6)
Concurrency Control (Chapter 10.4-10.6)
Concurrency Control Techniques
• Protocols that guarantee serializability 1. Locking 2. Timestamps 3. Multiversion 4. Validation or certification
Locking
• Locking - main technique to control concurrent execution
• execution based on concept of locking data items
• locks granted to a specific transaction for a particular data item
• a lock forces mutual exclusion on data items • lock manager subsystem to keep track of and
control access to locks
Binary locks
• binary lock - 2 states or values – locked, unlocked
• distinct lock for each data item • Suppose:
– transaction T must issue a lock request before R, W
– issue unlock after R, W – Too restrictive, what if want transactions to
read at same time?
Multiple-mode lock
• multiple-mode lock - 3 types, indivisible – R_lock (RL) share lock – W_lock (WL) exclusive lock – Unlock (UL)
a lock needs 3 fields:
<data, lock(R,W), # of reads>
Locking Rules for multiple-mode
• Rules: 1. T must issue RL(X) or WL(X) before any R(X) 2. T must issue WL(X) before any W(X) 3. T must issue UL after R or W completed 4. If issue WL(X) and hold RL on X, must upgrade RL to WL(X) 5. If issue RL(X) and hold WL on X, must downgrade WL to RL(X) 6. T will not issue UL(X) unless hold RL or WL(X)
Lock conflicts
• If lock conflict, force requesting transaction to wait for transaction holding lock, proceed with other transactions
dirty write example R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1
Will locking prevent dirty write?
RL1(Y) R1(Y) UL1(Y) RL2(X) R2(X) UL2(X) WL2(Y) W2(Y) UL2(Y) C2 WL1(X) R1(X) W1(X) UL1(X) C1
• Locks do not guarantee serializability - need 2PL
Two-Phase Locking 2PL
• 2PL has a: growing phase - locks acquired shrinking phase - locks released –
cannot request new lock during this phase
• 2PL limits concurrency – Can upgrade RL to WL - must be done in growing
phase
• Can downgrade WL to RL - must be done in shrinking phase
Releasing locks
• Theoretically, can release locks whenever done with item, as long as don't request new locks on any other data items
• Commercial systems that use 2PL assume unlock at last stage of commit
• If R(X) then W(X), should request as WL(X)
Any problems?
R1(A) R2(A) W1(A) C1 W2(A) C2WL1(A) R1(A) WL2(A)* W1(A) UL1(A) C1 W2(A) C2
*T2 blocked then both commit – no lost update
R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1RL1(Y) R1(Y) RL2(X) R2(X) WL2(Y)* WL1(X)**
*T2 blocked then **T1 blocked
no lost update, but DEADLOCK
Types of 2PL
1. Basic 2PL enforces serializability, but deadlocks
2. Strict 2PL does not release any locks until commits or aborts - what most commercial system assume - Not deadlock free (But easier to recover from)
3. Conservative 2PL requires transactions to lock all data items before begin executing - prevents deadlock. Declare readset, writeset or request in order
Strategies for deadlock in 2PL
1. No waiting - if Ti unable to obtain a lock, aborted and restarted after a time delay without checking if deadlock can occur. T's can abort and restart needlessly.
2. Cautious waiting - if Ti tried to lock X and already locked by Tj and if Tj is not blocked:
Ti waits else Tj aborts
Deadlock free - illustrate through the total ordering of blocking times
3. Timeouts - if Ti waits > some system defined time-out, assume Ti is deadlocked, Ti is aborted
Strategies cont’d
4. Deadlock detection - useful if T's rarely access the same items and each T only locks few items– construct a waits for graph (maintained by lock
scheduler) – deadlock when cycle in graph – victim selection - which to abort – Livelock - if repeatedly choose the same victim to
abort and restart– if wait indefinite period of time, need a fair waiting
scheme – FCFS
Strategies cont’d
5. Timestamps to prevent deadlocks– Transactions can be assigned timestamps, TS(Ti)
if T1 starts before T2, TS(T1) < TS(T2) if T1 starts after T2, TS(T1) > TS(T2)
– older Ti has a smaller TS – Timestamp can be: a counter, or the current value
of the system clock– wait-die or wound-wait strategies use TS’s
Wait-die
Suppose Ti tries to lock X, and a CONFLICTING lock is already held by Tj
Wait-die: (aborts Transaction requesting lock) if TS(Ti) < TS(Tj) // Ti is older then Ti waits else Ti aborts and restart with the same
timestamp // Tj is older R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1 where T1<T2
RL1(Y) R1(Y) RL2(X) R2(X) WL2(Y) Abort T2 WL1(X) R1(X) W1(X) C1 T2 restarts and commits
Wound-wait
Wound-wait: (aborts Transaction holding lock) if TS(Ti) < TS(Tj) // Ti is older then abort Tj // if does not finish
// in the mean time restart with same // timestamp
else Ti waits // Ti is younger R1(Y) R2(X) W2(Y) C2 R1(X) W1(X) C1 where T1<T2
RL1(Y) R1(Y) RL2(X) R2(X) WL2(Y) Block T2 WL1(X) T2 aborted R1(X) W1(X) C1 T2 restarts and commits
Comparisons
• Wait-die - older waits on younger, else younger is aborted and restarted – favors younger lock holder
• Wound-wait - younger waits on older, older T requesting items held by younger, preempts younger by abort –– favors older requester
• T's aborted and restarted even if not deadlocked. • Wait-die: Ti aborted and restarted in a row• Wound-wait: can be aborted even it obtain all of its locks
(not true for wait-die)
Problems with serializability
• Scheduling that guarantees perfect serializability can be intrusive on performance
• too many transaction in wait state
• if increase number of threads, can reduce the number of transactions active
• CPU never fully utilized
Alternatives to serializability
• Weakened forms of 2PL locking in SQL-99 - levels of isolation
• Used instead of degrees of isolation • Can set the isolation level with set transaction
statement - can specify R only, W only 1) read uncommitted - dirty reads 2) read committed (in DB2 cursor stability) 3) repeatable read 4) serializable
Read uncommitted
• Short-term lock (guarantees R, W, is atomic) vs. long-term lock (held until commit or abort)
• Read uncommitted - allow for Read only operations - no long-term locks used – Since Read only, no dirty writes, but dirty
reads can occur
Read committed
• Read committed - WL long term, RL short term • Can only read data that has been written by
committed transactions • cursor stability - lock held on each row fetched
by cursor (R short, W long)• Read and update before move to next row
affects Ri(A) -> Wj(A) • Problems: repeatable read • Usually no lost update - but it can occur (see
Fig. 10.12)
Repeatable read
• Repeatable read - WL, RL long term predicate locking is not guaranteed– Predicate locking – lock only rows that satisfy
specified condition (e.g. major =‘CS’)– therefore can have phantom updates due to
inserting new rows • e.g. if branch totals in branch table, and insert
while computing total
Serializability
• Serializable requires RL long term on all data satisfying predicate – lock entire table
Timestamp Ordering
• Timestamp Ordering
• concurrency control techniques based on TS's - do not use locks
• Can deadlock occur?
Timestamps
• Use timestamp ordering (TO)
• in 2PL schedule, serializable by being equivalent to some serial schedule allowed by locking protocols
• In TO schedule, equivalent to particular order that corresponds to order of transaction TS's
Timestamps (TO)
• Basic TO algorithm: – associated with each X, 2 TS values– R_TS(X) - largest TS that has successfully
read X – W_TS(X) - largest TS that has successfully
written X– If T is aborted, it is restarted with LATER
timestamp
TO Algorithms
If T issues W(X): if R_TS(X) > TS(T) or W_TS(X) > TS(T) then abort and rollback T
// reject operation request else W(X) and set W_TS(X) = TS(T)
If T issues R(X): if W_TS(X) > TS(T) then abort and reject operation request else
//W_TS(X) ≤ TS(T)
R(X), set R_TS(X) = Max(TS(T), R_TS(X))
R1(X)R2(X)W1(X)W2(X)
ExampleR1(X)R2(X)W1(X)W2(X) X
R_TS W_TS0 0 1 0 R1(X) 2 0 R2(X)
W1(X), rollback, restart with T1 = 3
2 2 W2(X) C2
3 2 R1(X) 3 3 R1(X)
C1
TO vs. 2PL
• TO and 2PL guarantee serializability – – Some schedules possible under each, not
allowed under the other – If T is aborted and rolled back, any value
written by T also must be rolled back (Can have cascading rollback)
– Schedules produced are not recoverable, does not ensure recoverable and cascade- less or strict schedules
TO vs. 2PL
• Strict TO - ensures strict schedules and conflict serializability
• if T issues R(X) or W(X), s.t. TS(T) > W_TS(X) – R,W operation delayed until T that wrote to
value of X committed or aborted – to implement, must simulate locking of X, but
doesn't cause deadlocks
Multiversion Concurrency Control
• Multiversion Concurrency Control • Oracle uses multiversions to enforce levels of isolation• Useful for temporal DBs and RTDBs • Keep old values when item is updated - several versions
maintained • When operation accesses item, appropriate version
chosen to ensure serializability • Read older version of item instead of abort• Write new version, keep old one • View serializability not conflict serializability is ensured
Multiversion
• Disadvantage - more storage
• however, may keep older versions for recovery anyway
Multiversion Algorithm
• If T issues: R(X), find version i of X with largest W_TS s.t. W_TS(Xi) ≤ TS(T) set R_TS(Xi) = max(TS(T), R_TS(Xi))
• If T issues: W(X), find version i of X with largest W_TS s.t. W_TS (Xi) ≤ TS(T) if TS(T) >= R_TS(Xi)
create new version Xj with R_TS(Xj) = W_TS(Xj) = TS(T)
else abort // TS(T) < R_TS(Xi) so must abort
T1: W1(X) T2: R2(X)W2(X) T3: W3(X) T4: W4(X)
W1(X)R2(X)W3(X)W2(X)W4(X)
Multiversion
W1(X)R2(X)W3(X)W2(X)W4(X)
Version R_TS W_TS X0 0 0 W0(X) X1 0 1 W1(X) 2 R2(X) X2 3 3 W3(X) X3 2 2 W2(X) X4 4 4 W4(X)
View Equivalence and View Serializability • Less restrictive than conflict equivalence • View equivalent if given 2 schedules, S and S‘
1. same set of transactions in S and S' (same operations)
2. for any operation on X in S, if value X read has been written by Wj(X), same condition must hold in S'
3. If the operation Wk(Y) is the last operation to write to Y in S, then Wk(Y) must also be the last in S'
Multiversion
• Premise – As long as each read reads a result of the
same write1. the W operations produce the same results
2. the read operations see the same view
– If the final write operations are the same, the database state is the same
Multiversion
• How is view equivalence less restrictive? • constraints vs. unconstrained write
– constrained - any W1(X) preceded by R1(X) – unconstrained - no read before the W1(X) - independent of any
previous value, also called blind write
• If the write is constrained, view and conflict equivalence are the same
• The difference occurs with blind writes • There is an algorithm to test for view serializability - test
is NP-complete • Conflict seralizability is a subset of view seralizability
Multiversion
• Several versions of X X1, X2, ... Xk
• For each version Xi, keep the timestamps:– R_TS(Xi) - largest of all TS's of transactions that sucessfully
read version Xi – W_TS(Xi) - TS of transaction that wrote values of version Xi
• Want to read version closest to, but less than your TS• Always write a new version• Only abort when a version with W_TS ≤ TS(T) has a
R_TS > TS(T)
Multiversion
Is the example schedule serializable?
W1(X)R2(X)W3(X)W2(X)W4(X)
T1->T2->T3->(T3->T2)-> T4
View equivalent but not conflict equivalent
Example (using TO)
W1(X)R2(X)W3(X)W2(X)W4(X)
R_TS W_TS 0 0 0 1 W1(X) 2 R2(X) 3 W3(X) W2(X) aborts
// W_TS > TS(T) 4 W4(X)
Example
W0(X)R1(X)W1(X)R2(X)R3(X)W3(X)W2(X)
Version R_TS W_TS X0 0 0 W0(X) 1 R1(X) X1 1 1 W1(X) 2 R2(X) 3 R3(X) X2 3 3 W3(X) W2(X) aborts,
version X1 has W_TS <= TS(T2) but R_TS(X1) > T2(T2)
because T3 should have read values written by T2
Example
W0(X)R1(X)W3(Y)W2(X)R3(X)W3(X)W4(X)W4(Y)
Optimistic (validation) Concurrency Control Techniques (or certification)
• Optimistic (validation) Concurrency Control Techniques (or certification)
• In other concurrency control techniques, checking is done to the DB before operations are executed – e.g. TO, 2PL
• In optimistic, checking is done after
Optimistic
• Updates are applied to local copies of the data (transaction workspace)
• If serializability is not violated, transactions are committed
• The DB is updated from local copies
• Otherwise T is aborted and restarted
Optimistic
3 phases:
1. R phase - R item from DB, updates to local copies
2. validation phase - check for serializability
3. W phase – if validation successful, update DB else, restart transaction
Optimistic
• If little interference, T's validated successfully and optimistic protocol works well else it doesn't
• This protocol (there are others) uses TS's and W-sets and R-sets
Optimistic
• Validation phase– For each Ti ready to commit and in validation phase, – check for each Tj, committed or in validation phase
1) Tj completes W-phase before Ti starts to read 2) Ti starts W-phase after Tj completes W-phase and R-set of Ti has no items in common with W-set of Tj 3) Both R-set and W-set of Ti, have no items in common with W-set of Tj and Tj commits R-phase before Ti completes R-phase
Optimistic
• If any 1 of the previous holds, no interference Ti validated successfully, Else Ti aborted and restarted later.
• Interference may have occurred.
Is this topic relevant?
• Check out the Data Concurrency and Consistency link for Oracle
• http://www-rohan.sdsu.edu/doc/oracle/server803/A54643_01/toc.htm