© Chinese University, CSE Dept. Distributed Systems / 6 - 1

Distributed Systems

Topic 6: Concurrency Control

Dr. Michael R. Lyu

Computer Science & Engineering Department

The Chinese University of Hong Kong

Outline

1 Motivation

2 Concurrency Control Techniques

– two-phase locking protocol

– optimistic concurrency control protocol

– comparison

3 Lock Compatibility Matrix in Concurrency Control Service

4 Summary

1 Motivation

Components of distributed systems use shared resources concurrently:

» Hardware components

» Operating system resources

» Databases

» Objects

Resources may have to be accessed in mutual exclusion (requiring safety, liveness, and ordering).

1 Motivation

Concurrent access to and updates of resources that maintain state information may lead to:

» lost updates

» inconsistent retrievals.

Motivating example for lost updates:

» Cash withdrawal from an ATM, and concurrently

» Credit of a check.

Motivating example for inconsistent retrievals:

» Funds transfer between accounts of one customer, and concurrently

» Summary of account balances (report for the Inland Revenue Department).

1 Motivating Examples

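    class Account {
    protected:
        float balance;
    public:
        // `new` is a reserved word in C++, so the local is named new_balance;
        // the traces on the following slides abbreviate it as "new".
        void debit(float amount) {
            float new_balance = balance - amount;  // read the current balance
            balance = new_balance;                 // write the result back
        }
        void credit(float amount) {
            float new_balance = balance + amount;  // read the current balance
            balance = new_balance;                 // write the result back
        }
        float get_balance() { return balance; }
    };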

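1 Lost Updates

Balance of account anAcc at t0 is 100. Time runs from t0 to t6; each column lists one process's steps in program order.

    Customer@ATM:            Clerk@Counter:
    anAcc->debit(50):        anAcc->credit(50):
      new=50;                  new=150;
      balance=50;              balance=150;

Both operations compute new from the balance of 100 before either one writes balance, so the later write overwrites the earlier one and one update is lost. The correct final balance is 100.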
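The interleaving behind the lost update can be replayed deterministically. The following is a minimal sketch, not the lecture's code: it assumes a hypothetical `Account` that exposes the read and write steps of `debit` and `credit` separately (`read`, `write`, `replay_lost_update`, and `replay_serial` are illustrative names), and it picks one possible order for the two writes.

```cpp
#include <cassert>

// Hypothetical variant of Account that exposes the read and write steps
// separately, so that a specific interleaving can be replayed by hand.
struct Account {
    float balance;
    float read() const { return balance; }
    void write(float value) { balance = value; }
};

// Replays one lost-update interleaving: both operations read the balance
// before either of them writes it back.
float replay_lost_update() {
    Account anAcc{100.0f};
    float atm_new = anAcc.read() - 50.0f;    // debit(50):  new=50
    float clerk_new = anAcc.read() + 50.0f;  // credit(50): new=150
    anAcc.write(atm_new);                    // balance=50
    anAcc.write(clerk_new);                  // balance=150, the debit is lost
    return anAcc.balance;
}

// Serial execution of the same two operations, for comparison.
float replay_serial() {
    Account anAcc{100.0f};
    anAcc.write(anAcc.read() - 50.0f);  // debit(50)
    anAcc.write(anAcc.read() + 50.0f);  // credit(50)
    return anAcc.balance;
}
```

With the two writes in this order the account ends at 150; running the operations one after the other ends at 100.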

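1 Inconsistent Retrievals

Balances at t0: Acc1: 7500, Acc2: 0. Time runs from t0 to t7; each column lists one process's steps in program order.

    Funds transfer:              Inland Revenue Report:
    Acc1->debit(7500):           float sum=0;
      Acc1.new=0;                sum+=Acc2->get_balance():
      Acc1.balance=0;              sum=0;
    Acc2->credit(7500):          sum+=Acc1->get_balance():
      Acc2.new=7500;               sum=0;
      Acc2.balance=7500;

The report reads Acc2 before the credit and Acc1 after the debit, so it reports a total of 0 although the customer holds 7500 throughout.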

2 Concurrency Control Techniques

1 Assessment Criteria

2 Two Phase Locking (2PL)

3 Optimistic Concurrency Control

4 Comparison

2.1 Assessment Criteria

Serializability

Deadlock Freedom

Fairness

Degree of Concurrency

Complexity

2.1 Serial Equivalence

Serially equivalent interleaving: an interleaving of the operations of transactions in which the combined effect is the same as if the transactions had been performed one at a time in some order.

The use of serial equivalence as a criterion for correct concurrent execution prevents the occurrence of lost updates and inconsistent retrievals.

2.1 Serial Equivalence

When a pair of operations conflicts, it means their combined effect depends on the order in which they are executed.

For two transactions to be serially equivalent, it is necessary and sufficient that all pairs of conflicting operations of the two transactions be executed in the same order at all of the objects they both access.
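The criterion above can be checked mechanically for a schedule of two transactions. The sketch below is illustrative only: `Op`, `conflicts`, and `serially_equivalent` are assumed names, and it takes the usual read/write rule for conflicts (same object, different transactions, at least one write).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One step of an interleaved schedule (all names here are illustrative).
struct Op {
    char txn;            // 'T' or 'U'
    bool is_write;       // false = read, true = write
    std::string object;  // the object accessed
};

// Two operations conflict if they belong to different transactions, access
// the same object, and at least one of them is a write.
bool conflicts(const Op& x, const Op& y) {
    return x.txn != y.txn && x.object == y.object && (x.is_write || y.is_write);
}

// A schedule of two transactions T and U is serially equivalent iff every
// conflicting pair is ordered the same way: always T first or always U first.
bool serially_equivalent(const std::vector<Op>& schedule) {
    bool t_first = false;
    bool u_first = false;
    for (std::size_t i = 0; i < schedule.size(); ++i) {
        for (std::size_t j = i + 1; j < schedule.size(); ++j) {
            if (!conflicts(schedule[i], schedule[j])) continue;
            if (schedule[i].txn == 'T') {
                t_first = true;
            } else {
                u_first = true;
            }
        }
    }
    return !(t_first && u_first);
}
```

For example, the schedule Read(a) by T, Write(a,2) by U, Write(b,4) by U, Write(b,3) by T is rejected: T comes first at a but U comes first at b.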

2.1 Serial Equivalence Examples

Transaction T: Read(a); Write(b,3); Read(c);

Transaction U: Write(a,2); Write(b,4);

Assume a is 0 and b is 0 at t0.

[Figure: Example 1 and Example 2, two interleavings of Transaction T and Transaction U on a time axis from t0 to t6]

2.1 Recoverability from Aborts

Dirty reads

» A transaction reads the uncommitted state of another transaction.

» In Example 2, transaction U aborts after transaction T has committed.

Premature writes

» A transaction writes an object that was written by another transaction that may still abort.

» In Example 1, transaction U commits and then transaction T aborts.

Cascading aborts

» The abort of one transaction may cause further transactions to be aborted. Such situations are called cascading aborts.

Strict transactions

» The service delays both read and write operations on an object until all transactions that previously wrote that object have either committed or aborted. This avoids premature writes and cascading aborts.

2.2 Two Phase Locking (2PL)

The most popular concurrency control technique. Used in:

» Relational Database Management Systems (RDBMS)

» Object Database Management Systems (ODBMS)

» Transaction Monitors

Concurrent processes acquire locks on shared resources from a lock manager.

The lock manager grants a lock if the request does not conflict with already granted locks.

Guarantees serializability.

2.2 Locks

A lock is a token that indicates that a process accesses a resource in a particular mode.

Minimal lock modes: read and write.

Locks are used to indicate to concurrent processes the current use of that resource.

2.2 Lock Compatibility

The lock manager grants locks.

Whether a lock is granted depends on the compatibility of the acquisition request with the modes of already granted locks.

Compatibility is defined in a lock compatibility matrix.

Minimal lock compatibility matrix (+ compatible, - conflict):

             Read   Write
    Read      +      -
    Write     -      -

2.2 Locking Conflicts

Lock requests cannot be granted if incompatible locks are held by concurrent processes.

This is referred to as a locking conflict.

Approaches to handle conflicts:

» Force the requesting process to wait until the conflicting locks are released.

» Tell the process that the lock cannot be granted.
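The second approach, reporting the conflict to the caller, can be sketched as a small single-threaded lock manager built on the minimal read/write matrix. Everything named here (`LockManager`, `try_lock`, `unlock`, `compatible`, integer process ids) is an assumption made for illustration, not an API from the lecture.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

enum class Mode { Read, Write };

// Minimal compatibility matrix: only Read/Read is compatible.
bool compatible(Mode held, Mode requested) {
    return held == Mode::Read && requested == Mode::Read;
}

// Hypothetical single-threaded lock manager; processes are plain integers.
class LockManager {
    struct Grant {
        int process;
        Mode mode;
    };
    std::map<std::string, std::vector<Grant>> granted_;

public:
    // Grants the lock and returns true if no other process holds an
    // incompatible lock on the resource; otherwise returns false so the
    // caller learns that the lock cannot be granted.
    bool try_lock(int process, const std::string& resource, Mode mode) {
        for (const Grant& g : granted_[resource]) {
            if (g.process != process && !compatible(g.mode, mode)) return false;
        }
        granted_[resource].push_back({process, mode});
        return true;
    }

    // Releases every lock the process holds on the resource.
    void unlock(int process, const std::string& resource) {
        std::vector<Grant>& grants = granted_[resource];
        for (auto it = grants.begin(); it != grants.end();) {
            it = (it->process == process) ? grants.erase(it) : it + 1;
        }
    }
};
```

A waiting variant would queue the request instead of returning false; that is where deadlocks become possible.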

2.2 Example (Avoiding Lost Updates)

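Balance of account anAcc at t0 is 100. Time runs from t0 to t6.

    Customer@ATM:                 Clerk@Counter:
    anAcc->debit(50):
      anAcc->lock(write);
      new=50;
      balance=50;
      anAcc->unlock(write);
                                  anAcc->credit(50):
                                    anAcc->lock(write);
                                    new=100;
                                    balance=100;
                                    anAcc->unlock(write);

The write lock serializes the two operations: credit computes new from the balance of 50 left by debit, so the final balance is the correct 100.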

2.2 Deadlocks

Locking may lead to situations where processes are waiting for each other to release locks.

These situations are called deadlocks.

Deadlocks have to be detected by the lock manager.

Deadlocks have to be resolved by aborting one or several of the processes involved.

This requires undoing all the actions that these processes have performed.
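The slides do not say how the lock manager detects deadlocks. One common technique, assumed here for illustration, is to keep a wait-for graph (an edge from p to q means that p waits for a lock held by q) and to search it for a cycle. `WaitForGraph`, `reaches_cycle`, and `has_deadlock` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>

// Wait-for graph: an edge p -> q means process p waits for a lock held by q.
using WaitForGraph = std::map<int, std::set<int>>;

// Depth-first search; returns true if a cycle is reachable from `node`.
// `on_path` holds the nodes of the current DFS path, `done` the nodes that
// were fully explored without finding a cycle.
bool reaches_cycle(const WaitForGraph& g, int node,
                   std::set<int>& on_path, std::set<int>& done) {
    if (on_path.count(node)) return true;  // back edge: a cycle exists
    if (done.count(node)) return false;
    on_path.insert(node);
    auto it = g.find(node);
    if (it != g.end()) {
        for (int next : it->second) {
            if (reaches_cycle(g, next, on_path, done)) return true;
        }
    }
    on_path.erase(node);
    done.insert(node);
    return false;
}

// A deadlock exists iff the wait-for graph contains a cycle.
bool has_deadlock(const WaitForGraph& g) {
    std::set<int> on_path;
    std::set<int> done;
    for (const auto& entry : g) {
        if (reaches_cycle(g, entry.first, on_path, done)) return true;
    }
    return false;
}
```

Once a cycle is found, the lock manager would pick one or more processes on it to abort; the choice of victim is a separate policy that this sketch leaves out.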

2.2 Locking

Processes acquire locks before they access shared resources and release locks afterwards.

2-phase locking: Processes do not acquire locks once they have released a lock.

Strict 2-phase locking: A transaction does not release any of its locks until after it terminates (by committing or aborting).

Typical 2PL locking profile of a process:

[Figure: number of locks held by one process, plotted against time]
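The two-phase rule itself is easy to enforce per process. The sketch below uses a hypothetical `TwoPhaseLocker` that only does the bookkeeping for one process. It does not check conflicts with other processes, which would remain the lock manager's job.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical per-process bookkeeping that enforces the two-phase rule:
// no lock may be acquired once any lock has been released.
class TwoPhaseLocker {
    std::set<std::string> held_;
    bool shrinking_ = false;  // becomes true at the first release

public:
    // Growing phase: acquisition is allowed only before the first release.
    bool acquire(const std::string& resource) {
        if (shrinking_) return false;  // would violate two-phase locking
        held_.insert(resource);
        return true;
    }

    // The first release starts the shrinking phase.
    void release(const std::string& resource) {
        if (held_.erase(resource) > 0) shrinking_ = true;
    }

    // Number of locks currently held (the quantity in the locking profile).
    std::size_t locks_held() const { return held_.size(); }
};
```

Strict 2-phase locking would go one step further and release all locks together at commit or abort.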

2.2 Example

Apply the two-phase locking protocol to the schedule below by adding statements like S-lock(A), X-lock(A), and unlock(A).

S-lock stands for shared lock (i.e. read lock); X-lock stands for exclusive lock (i.e. write lock).

    T1            T2
    write(A)
                  write(A)
    write(B)
                  write(B)

2.2 Example Answer

    T1            T2
    X-lock(A)
    write(A)
    X-lock(B)
    unlock(A)
                  X-lock(A)
                  write(A)
    write(B)
    unlock(B)
                  X-lock(B)
                  write(B)
                  unlock(A)
                  unlock(B)

2.2 Locking Granularity

2PL is applicable to resources of any granularity.

Small locking granularity gives a high degree of concurrency.

Small granules require a large number of locks, which may involve significant locking overhead.

Trade-off between degree of concurrency and locking overhead.

Hierarchic locking as a compromise.

2.2 Hierarchic Locking

Used with container resources, e.g.

» file (containing records)

» set or sequence (containing objects)

Lock modes intention read (IR) and intention write (IW).

Lock compatibility (+ compatible, - conflict):

          IR   R    IW   W
    IR    +    +    +    -
    R     +    +    -    -
    IW    +    -    +    -
    W     -    -    -    -

2.2 Transparency of Locking

Who is acquiring locks? Options:

» Concurrency control infrastructure

» Implementation of resource components

» Clients of resource components

First option desirable but not always possible:

» Infrastructure must manage all resources

» Infrastructure must know all resource accesses.

Last option is undesirable and avoidable!

2.3 Optimistic Concurrency Control

The complexity of 2PL is linear in the number of accessed resources.

This overhead is unreasonable if conflicts are unlikely.

Idea of optimistic concurrency control:

» Processes modify (logical) copies of resources.

» Validate the access pattern against conflicts with concurrent processes.

» If there are no conflicts: write the copies back.

» Otherwise: undo all modifications and start over again.

2.3 Validation Prerequisites

Units of concurrency control (distinguishable segments of the operation sequence) are required.

For each unit u the following information has to be gathered:

» Starting time of the unit, st(u).

» Time stamp for the start of validation, TS(u).

» Ending time of the unit, E(u).

» Read and write sets, RS(u) and WS(u).

This needs precise time information and therefore requires synchronization of local clocks.

2.3 Validation Set

A unit u has to be validated against a set of other units VU(u) that have been validated earlier.

VU(u) is formally defined as:

$VU(u) := \{\, u' \mid st(u) < E(u') \text{ and } u' \text{ has been validated} \,\}$

2.3 Read/Write Conflict Detection

A read/write conflict occurred during the course of a unit u iff:

$\exists\, u' \in VU(u) : WS(u) \cap RS(u') \neq \emptyset \;\lor\; RS(u) \cap WS(u') \neq \emptyset$

Then u has written a resource that this other unit u' has read, or vice versa.

Then u cannot be completed and has to be undone.

2.3 Write/Write Conflict Detection

A write/write conflict occurred during the course of a unit u iff:

$\exists\, u' \in VU(u) : WS(u) \cap WS(u') \neq \emptyset$

Then u has modified a resource that this other unit u' has modified as well.

Then u cannot be completed and has to be undone.
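The validation step can be sketched directly from the definitions of VU(u) and the two conflict conditions. This is a minimal illustration under stated assumptions: times are plain `double` values from an already synchronized clock, resources are identified by strings, and `Unit`, `intersects`, and `validate` are hypothetical names.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

using ResourceSet = std::set<std::string>;

// Information gathered per unit, following the notation of the slides.
struct Unit {
    double st;       // st(u): starting time of the unit
    double e;        // E(u): ending time of the unit
    ResourceSet rs;  // RS(u): read set
    ResourceSet ws;  // WS(u): write set
};

// True if the two sets share at least one resource.
bool intersects(const ResourceSet& x, const ResourceSet& y) {
    for (const std::string& r : x) {
        if (y.count(r)) return true;
    }
    return false;
}

// `validated` holds the units that were validated earlier; those with
// st(u) < E(u') form VU(u). Returns false if u has to be undone.
bool validate(const Unit& u, const std::vector<Unit>& validated) {
    for (const Unit& other : validated) {
        if (!(u.st < other.e)) continue;  // other is not in VU(u)
        bool read_write = intersects(u.ws, other.rs) || intersects(u.rs, other.ws);
        bool write_write = intersects(u.ws, other.ws);
        if (read_write || write_write) return false;
    }
    return true;  // no conflict: u may write its copies back
}
```

A unit that starts only after every validated unit has ended has an empty VU(u) and always passes.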

2.4 Comparison

Both approaches

+ guarantee serializability.

– need the ability to undo processes.

Pessimistic control (2PL):

– overhead for locking.

– not deadlock free.

+ works efficiently even when conflicts are likely.

Optimistic control:

+ small overhead when conflicts are unlikely.

+ deadlock free.

– computation of conflict sets difficult in distributed systems.

– overhead for time synchronization.

3 An Additional Lock Compatibility

Some concurrency control services support hierarchic locking.

Upgrade locks (U) decrease the probability of deadlocks.

Compatibility matrix (+ compatible, - conflict):

          IR   R    U    IW   W
    IR    +    +    +    +    -
    R     +    +    +    -    -
    U     +    +    -    -    -
    IW    +    -    -    +    -
    W     -    -    -    -    -
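The five-mode matrix translates directly into a lookup table. The sketch below is illustrative (`Mode`, `kCompatible`, and `compatible` are assumed names). It shows why upgrade locks help: U is compatible with R but not with another U, so two processes that both intend to upgrade cannot hold the resource at the same time.

```cpp
#include <cassert>

// Lock modes in the order used by the matrix on the slide.
enum Mode { IR = 0, R = 1, U = 2, IW = 3, W = 4 };

// kCompatible[held][requested]; true corresponds to '+' in the matrix.
const bool kCompatible[5][5] = {
    //        IR     R      U      IW     W
    /* IR */ {true,  true,  true,  true,  false},
    /* R  */ {true,  true,  true,  false, false},
    /* U  */ {true,  true,  false, false, false},
    /* IW */ {true,  false, false, true,  false},
    /* W  */ {false, false, false, false, false},
};

// Returns true if a lock in mode `requested` can be granted while another
// process holds a lock in mode `held` on the same resource.
bool compatible(Mode held, Mode requested) {
    return kCompatible[held][requested];
}
```

The table is symmetric, so the order of the two arguments does not matter.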

4 Summary

Lost updates and inconsistent retrievals.

Pessimistic vs. optimistic concurrency control

» Pessimistic control:

– higher overhead for locking.

+ works efficiently in cases where conflicts are likely.

» Optimistic control:

+ small overhead when conflicts are unlikely.

– distributed computation of conflict sets expensive.

– requires global clock.

Additional lock compatibility is introduced.

Read Textbook Chapter 16 (16.2 and 16.3 in next lecture).