1 DC8: Transactions Chapter 12 Transactions and Concurrency Control

Upload: henry-jones

Post on 19-Jan-2016

DC8: Transactions Chapter 12

Transactions and Concurrency Control


Transactions

We study transactions here because they require a lot of synchronization and coordination.

Transactions (think databases) have database tables and data items as shared resources. Transactions have the additional capability of coordinating the update of several of these “resources” at once. It is as if a process must have the CR (critical region) for several resources at the same time.


Topics

The transaction environment
Serializability theory
    Schedules and conflicts
    Recoverability, cascading aborts
Mechanisms to enforce serializability
    Two-phase locking
    Timestamp concurrency control


The Transaction Model

A transaction is a unit of program execution that accesses and possibly updates various data items.

A transaction must see a consistent database.

During transaction execution the database may be inconsistent.

When the transaction is committed, the database must be consistent.

Two main issues to deal with:
    Failures of various kinds, such as hardware failures and system crashes
    Concurrent execution of multiple transactions


The Transaction Model

Concurrent execution of user programs is essential for good DBMS performance.

A user’s program may carry out many operations on the data retrieved from the DB; however, the DBMS is only concerned with reads and writes.

Users submit transactions and the DBMS interleaves the operations to achieve concurrency.

This is called concurrency control.


The Transaction Model (ACID)

Atomicity. Either all operations of the transaction are properly reflected in the database or none are.

Consistency. Execution of a transaction in isolation preserves the consistency of the database.

Isolation. Although multiple transactions may execute concurrently, each transaction must be unaware of other concurrently executing transactions. Intermediate transaction results must be hidden from other concurrently executed transactions.

Durability. After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.


Example: Funds Transfer

Transaction to transfer $50 from account A to account B:
    1. read(A)
    2. A := A - 50
    3. write(A)
    4. read(B)
    5. B := B + 50
    6. write(B)

Consistency requirement: the sum of A and B is unchanged by the execution of the transaction.

Atomicity requirement: if the transaction fails after step 3 and before step 6, the system ensures that its updates are not reflected in the database.


Example: Funds Transfer, continued

Durability requirement: once the user has been notified that the transaction has completed (i.e., the transfer of the $50 has taken place), the updates to the DB must persist despite failures.

Isolation requirement: if, between steps 3 and 6, another transaction is allowed to access the partially updated database, it will see an inconsistent database (the sum A + B will be less than it should be), violating the isolation requirement. Isolation can be ensured by running transactions serially.
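The atomicity and consistency requirements of the funds transfer can be illustrated with a minimal Python sketch. The dictionary-based database, the function name transfer, and the fail_after_debit flag are assumptions made for illustration; a real DBMS would use logging on stable storage rather than an in-memory copy.

```python
class TransferFailed(Exception):
    """Hypothetical failure raised between write(A) and write(B)."""


def transfer(db, src, dst, amount, fail_after_debit=False):
    """Move `amount` from src to dst; undo everything if a step fails."""
    before = dict(db)              # remember the old values of every item
    try:
        a = db[src]                # 1. read(A)
        db[src] = a - amount       # 2-3. A := A - amount; write(A)
        if fail_after_debit:       # simulated crash after step 3
            raise TransferFailed("failure before write(B)")
        b = db[dst]                # 4. read(B)
        db[dst] = b + amount       # 5-6. B := B + amount; write(B)
    except TransferFailed:
        db.clear()                 # atomicity: none of the updates survive
        db.update(before)
        return False
    return True


accounts = {"A": 100, "B": 200}
transfer(accounts, "A", "B", 50)                          # commits
transfer(accounts, "A", "B", 50, fail_after_debit=True)   # rolled back
print(accounts, sum(accounts.values()))
```

In both calls the sum A + B is the same before and after, which is the consistency requirement stated above.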


The Transaction Model

Examples of primitives for transactions.

Primitive            Description
BEGIN_TRANSACTION    Mark the start of a transaction
END_TRANSACTION      Terminate the transaction and try to commit
ABORT_TRANSACTION    Kill the transaction and restore the old values
READ                 Read data from a file, a table, or otherwise
WRITE                Write data to a file, a table, or otherwise


The Transaction Model: Aborts

(a) Transaction to reserve three flights commits:

    BEGIN_TRANSACTION
        reserve SFO -> JFK;
        reserve JFK -> Nairobi;
        reserve Nairobi -> Malindi;
    END_TRANSACTION

(b) Transaction aborts when the third flight is unavailable:

    BEGIN_TRANSACTION
        reserve SFO -> JFK;
        reserve JFK -> Nairobi;
        reserve Nairobi -> Malindi full => ABORT_TRANSACTION


Transaction States

Active: the transaction is executing.

Failed: after the discovery that normal execution can no longer proceed.

Aborted: after the transaction has been rolled back and the database restored to its state prior to the start of the transaction. Two options after it has been aborted:
    restart the transaction (only if there was no internal logical error)
    kill the transaction

Committed: after successful completion.


Reasons for Concurrency

Multiple transactions are allowed to run concurrently in the system. Advantages are:
    increased processor and disk utilization, leading to better transaction throughput: one transaction can be using the CPU while another is reading from or writing to the disk
    reduced average response time for transactions: short transactions need not wait behind long ones.


Distributed Transactions

Flat transactions versus nested transactions (which allow partial results to be committed).

A nested transaction is a transaction that is logically decomposed into a hierarchy of subtransactions.

A distributed transaction is a logically flat indivisible transaction that operates on distributed data.


Distributed Transactions

Figure: (a) A nested transaction. (b) A distributed transaction.


Transaction Atomicity, Isolation, and Durability

Conceptually, when a transaction starts, it is given a private workspace to make its changes to.

When it commits, the private workspace replaces the corresponding data items in the permanent workspace. If the transaction aborts, the private workspace can simply be discarded.

This type of implementation leads to many private workspaces and thus consumes a lot of space. Also, if a transaction only reads a data table or item, it doesn’t need a private copy.


Private Workspace

Figure: (a) The file index and disk blocks for a three-block file. (b) After a transaction has modified block 0 and appended block 3. (c) After committing.


More Efficient Implementation

Two common methods of implementation are write-ahead logs and before images.

With write-ahead logs, the transactions act on the permanent workspace, but before they can make a change, a log record is written to stable storage with the transaction and data item ID and the old and new values.

This log can then be used if the transaction aborts and the changes need to be rolled back.


Write-Ahead Log

Figure: (a) A transaction. (b)-(d) The log before each statement is executed.

(a) x = 0;
    y = 0;
    BEGIN_TRANSACTION;
        x = x + 1;
        y = y + 2;
        x = y * y;
    END_TRANSACTION;

(b) Log: [x = 0/1]

(c) Log: [x = 0/1] [y = 0/2]

(d) Log: [x = 0/1] [y = 0/2] [x = 1/4]
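The write-ahead log for this transaction can be reproduced with a short Python sketch. The class name WriteAheadLog and its methods are hypothetical, and the in-memory list only stands in for stable storage; the point is that each old/new pair is recorded before the store is changed, so a rollback can replay the log backwards.

```python
class WriteAheadLog:
    """Toy write-ahead log: record (txn, item, old, new) before each update."""

    def __init__(self, store):
        self.store = store   # the permanent workspace
        self.log = []        # stand-in for a log on stable storage

    def write(self, txn, item, new_value):
        old_value = self.store[item]
        self.log.append((txn, item, old_value, new_value))  # log first
        self.store[item] = new_value                        # then update

    def rollback(self, txn):
        # Undo the transaction's changes in reverse order using old values.
        for logged_txn, item, old_value, _ in reversed(self.log):
            if logged_txn == txn:
                self.store[item] = old_value


store = {"x": 0, "y": 0}
wal = WriteAheadLog(store)
wal.write("T", "x", store["x"] + 1)           # log: [x = 0/1]
wal.write("T", "y", store["y"] + 2)           # log: [y = 0/2]
wal.write("T", "x", store["y"] * store["y"])  # log: [x = 1/4]
wal.rollback("T")                             # abort: x and y return to 0
```

Undoing in reverse order matters here because x is written twice: the last undo applied to x must restore the oldest value.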


Before- and After- Images

A before- and after-image is kept for each data item.

When a data item is changed, the old value is written to the before-image and the new value is the after-image.

Other transactions are not allowed to “see” the new value until the current transaction commits.

The after-image is made permanent and durable once the transaction which wrote it commits.

If the transaction aborts, the before-image is restored.


DBMS Organization

Figure: General organization of managers for handling transactions.


DBMS Organization

Figure: General organization of managers for handling distributed transactions.


Concurrency Control

Concurrency control schemes – mechanisms to achieve isolation, i.e., to control the interaction among the concurrent transactions in order to prevent them from destroying the consistency of the database.

These schemes are used along with logs and before- and after-images.

Need a definition for correct execution of transactions: serializability


Transaction Schedules

Schedules: sequences that indicate the chronological order in which instructions of concurrent transactions are executed.
    A schedule for a set of transactions must consist of all instructions of those transactions.
    A schedule must preserve the order in which the instructions appear in each individual transaction.


Example Schedule

Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B. Here is a serial schedule, in which T1 is followed by T2.


Example Continued

Let T1 and T2 be the transactions defined previously. This schedule is not a serial schedule, but it is equivalent to Schedule 1. That is, the sum A+B is preserved. All the effects are the same as they would be if the schedule were serial.


Not Serializable

This concurrent schedule does not preserve the value of the sum A + B and is not equivalent to the serial schedule.
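The difference between these kinds of schedule can be tried out with a small Python sketch. T1 and T2 follow the descriptions given earlier (T1 transfers $50, T2 transfers 10% of A), but the starting balances, the way each transaction is cut into steps, and the three interleavings below are assumptions chosen for illustration; they are not necessarily the schedules shown on the slides.

```python
def make_t1():
    """T1: transfer 50 from A to B, as a list of read/write steps."""
    return [
        lambda db, loc: loc.update(a=db["A"]),        # read(A)
        lambda db, loc: db.update(A=loc["a"] - 50),   # A := A - 50; write(A)
        lambda db, loc: loc.update(b=db["B"]),        # read(B)
        lambda db, loc: db.update(B=loc["b"] + 50),   # B := B + 50; write(B)
    ]


def make_t2():
    """T2: transfer 10% of A's balance from A to B."""
    return [
        lambda db, loc: loc.update(a=db["A"], temp=db["A"] // 10),  # read(A)
        lambda db, loc: db.update(A=loc["a"] - loc["temp"]),        # write(A)
        lambda db, loc: loc.update(b=db["B"]),                      # read(B)
        lambda db, loc: db.update(B=loc["b"] + loc["temp"]),        # write(B)
    ]


def run(schedule, initial):
    """Execute the next step of the named transaction, in schedule order."""
    db = dict(initial)
    steps = {"T1": iter(make_t1()), "T2": iter(make_t2())}
    local = {"T1": {}, "T2": {}}   # each transaction's local buffer
    for name in schedule:
        next(steps[name])(db, local[name])
    return db


start = {"A": 1000, "B": 2000}
serial = run(["T1"] * 4 + ["T2"] * 4, start)
equivalent = run(["T1", "T1", "T2", "T2", "T1", "T1", "T2", "T2"], start)
bad = run(["T1", "T2", "T2", "T2", "T1", "T1", "T1", "T2"], start)
print(serial, equivalent, bad)
```

Because each transaction's steps are pulled from an iterator, every schedule automatically preserves the order of instructions within a transaction. In the last interleaving T1 overwrites T2's update of A using a stale value, so the sum A + B changes.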


Serializability

Assumption – Each transaction preserves database consistency. Thus a serial execution of a set of transactions preserves database consistency.

A (possibly concurrent) schedule is serializable if it is equivalent to a serial schedule.

We assume that transactions may perform arbitrary computations on data in local buffers in between reads and writes. Our simplified schedules consist of only read and write instructions.


Conflict Serializability

Instructions li and lj of transactions Ti and Tj respectively, conflict if and only if there exists some item Q accessed by both li and lj, and at least one of these instructions wrote Q.

    1. li = read(Q), lj = read(Q). li and lj don’t conflict.
    2. li = read(Q), lj = write(Q). They conflict.
    3. li = write(Q), lj = read(Q). They conflict.
    4. li = write(Q), lj = write(Q). They conflict.

Intuitively, a conflict between li and lj forces a (logical) temporal order between them. If li and lj are consecutive in a schedule and they do not conflict, their results would remain the same even if they were interchanged.


Conflict Serializable

If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting instructions, we say that S and S´ are conflict equivalent.

We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule.

Example of a schedule that is not conflict serializable:

    T3          T4
    read(Q)
                write(Q)
    write(Q)
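One standard way to test conflict serializability, not shown on these slides, is to build a precedence graph (an edge Ti -> Tj for every pair of conflicting operations where Ti's comes first) and check it for cycles. The sketch below assumes a schedule is a list of (transaction, action, item) tuples; that representation and the function names are illustrative choices.

```python
from collections import defaultdict


def conflicts(op1, op2):
    """Different transactions, same item, and at least one write."""
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return t1 != t2 and x1 == x2 and "write" in (a1, a2)


def precedence_graph(schedule):
    edges = defaultdict(set)
    for i, earlier in enumerate(schedule):
        for later in schedule[i + 1:]:
            if conflicts(earlier, later):
                edges[earlier[0]].add(later[0])   # earlier txn must come first
    return edges


def serial_order(schedule):
    """Return a conflict-equivalent serial order, or None if there is a cycle."""
    edges = precedence_graph(schedule)
    txns = list(dict.fromkeys(t for t, _, _ in schedule))
    indegree = {t: 0 for t in txns}
    for src in edges:
        for dst in edges[src]:
            indegree[dst] += 1
    ready = [t for t in txns if indegree[t] == 0]
    order = []
    while ready:                       # topological sort
        t = ready.pop(0)
        order.append(t)
        for dst in edges.get(t, ()):
            indegree[dst] -= 1
            if indegree[dst] == 0:
                ready.append(dst)
    return order if len(order) == len(txns) else None


t3_t4 = [("T3", "read", "Q"), ("T4", "write", "Q"), ("T3", "write", "Q")]
print(serial_order(t3_t4))
```

For the T3/T4 example the graph has both T3 -> T4 (read then write) and T4 -> T3 (write then write), so no serial order exists and the function returns None.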


Conflict Serializable

This schedule is conflict serializable.


Remember Recovery?

The DB must behave as if it contains all of the effects of committed transactions and none of the effects of uncommitted ones. So, when a transaction aborts, the DBMS must wipe out all its effects.

If a transaction, t1, writes a value to data item x, and t2 reads that value, what happens when t1 subsequently aborts?


Recoverable Schedules

If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent database state. Hence if T9 is allowed to commit before T8, the schedule is not recoverable.

For a schedule to be recoverable, a transaction must not be allowed to commit until every transaction it has read from has committed.
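The recoverability condition can be checked mechanically. The Python sketch below assumes a schedule made only of read, write, and commit operations, written as (transaction, action, item) tuples, and ignores aborts for simplicity; the two T8/T9 schedules are hypothetical instances of the situation described above, not the schedule from the slide.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after all transactions it
    read from have committed."""
    last_writer = {}   # item -> transaction that most recently wrote it
    reads_from = {}    # transaction -> set of transactions it read from
    committed = set()
    for txn, action, item in schedule:
        if action == "write":
            last_writer[item] = txn
        elif action == "read":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif action == "commit":
            if not reads_from.get(txn, set()) <= committed:
                return False   # commits before a transaction it read from
            committed.add(txn)
    return True


# T9 reads T8's uncommitted write and commits first: not recoverable.
unsafe = [("T8", "write", "A"), ("T9", "read", "A"),
          ("T9", "commit", None), ("T8", "commit", None)]
# Same reads and writes, but T9 waits for T8 to commit: recoverable.
safe = [("T8", "write", "A"), ("T9", "read", "A"),
        ("T8", "commit", None), ("T9", "commit", None)]
print(is_recoverable(unsafe), is_recoverable(safe))
```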


Cascading Aborts

Cascading aborts: a single transaction failure can lead to a series of transaction rollbacks. Consider the following schedule, where none of the transactions has yet committed (so the schedule is recoverable).

If T10 fails, T11 and T12 must also be rolled back.


How to Avoid Cascading Aborts

If we ensure that every transaction reads only those items whose values were written by committed transactions, the schedule will avoid cascading aborts.

This restricts the reads of a transaction. What about the writes? Will restricting writes give us a useful property?


Strict Executions

If we ensure that every transaction writes to only those items whose values were written by committed transactions, the schedule is strict.

This nice property ensures that we only have to keep one before-image.

T1    T2    T3

write(A)

write(A)    write(A)

abort    abort


Levels of Consistency (SQL92)

Serializable: the default.

Repeatable read: only committed records can be read, and repeated reads of the same record must return the same value. However, a transaction may not be serializable. The scheduler must maintain the RR property: if t1 makes a second read request and X has been modified, then t1 will be aborted.

Read committed: only committed records can be read, but successive reads of a record may return different (but committed) values.

Read uncommitted: even uncommitted records may be read (browse).


Repeatable Read but not Serializable

    T1          T2
    Read(A)
    B=A+1
    Write(B)
                Read(B)
                Read(C)
                Commit
    Read(A)
    C=A+2
    Write(C)


Read Committed but not Repeatable Read

    T1          T2
    Read(A)
    B=A+1
    Write(B)
                Read(A)
                A=A*2
                Write(A)
                Commit
    Read(A)
    B=A+2
    Write(B)


Example

Is the schedule on the next slide conflict serializable? If so, find a conflict-equivalent serial order.

Remember: two operations conflict if they are from different transactions, they access the same data item, and at least one of them is a write.


T1    T2    T3    T4    T5

read(X)    read(Y)    read(Z)

read(V)    read(W)    read(W)

read(Y)    write(Y)

write(Z)    read(U)

read(Y)    write(Y)    read(Z)    write(Z)

read(U)    write(U)


How to Enforce Serializability?

Pessimistic approach: prevent transactions from accessing data that might lead to a conflict.

Optimistic approach: allow transactions to access the data, but require them to “validate” before committing.


Two Phase Locking (1)

Pessimistic approach.

Easiest and most widely used way.

Scheduler maintains a lock for each data item. An item is locked on behalf of a transaction and then no other transaction can access it.

Refinement: distinguish between read locks and write locks. Read locks can be shared with other readers.


Rules for Two Phase Locking (2)

Transaction must get a read or write lock on data item d before reading d and must get a write lock on d before writing to d.

After a transaction relinquishes a lock, it may not acquire any new locks.
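The two rules above can be sketched in Python as a lock table with shared read locks and exclusive write locks, plus a per-transaction flag that forbids acquiring locks once any lock has been released. The class names are hypothetical, and as a simplifying assumption a lock request that cannot be granted returns False instead of blocking the caller.

```python
class LockError(Exception):
    """Raised when a transaction breaks the two-phase rule."""


class LockTable:
    def __init__(self):
        self.readers = {}   # item -> set of transactions holding read locks
        self.writer = {}    # item -> transaction holding the write lock

    def acquire(self, txn, item, mode):
        holders = self.readers.setdefault(item, set())
        writer = self.writer.get(item)
        if mode == "read":
            if writer is not None and writer != txn:
                return False            # someone else is writing
            holders.add(txn)            # read locks are shared
            return True
        if (writer is not None and writer != txn) or (holders - {txn}):
            return False                # write locks are exclusive
        self.writer[item] = txn
        return True

    def release(self, txn, item):
        self.readers.get(item, set()).discard(txn)
        if self.writer.get(item) == txn:
            del self.writer[item]


class Transaction:
    def __init__(self, name, table):
        self.name, self.table = name, table
        self.shrinking = False          # becomes True at the first unlock

    def lock(self, item, mode):
        if self.shrinking:
            raise LockError(f"{self.name}: no new locks after an unlock")
        return self.table.acquire(self.name, item, mode)

    def unlock(self, item):
        self.shrinking = True           # growing phase is over
        self.table.release(self.name, item)


table = LockTable()
t1, t2 = Transaction("T1", table), Transaction("T2", table)
t1.lock("Q", "read")     # granted
t2.lock("Q", "read")     # granted: read locks can be shared
t1.lock("Q", "write")    # refused while T2 still holds its read lock
t2.unlock("Q")
t1.lock("Q", "write")    # granted now
```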


Two-Phase Locking (3)

Figure: Two-phase locking.


Two Phase Locking (4)

Strict 2PL avoids cascading aborts by preventing transactions from seeing uncommitted values.

Locks are acquired then held until the transaction is ready to commit or is aborted.


Two-Phase Locking (5)

Figure: Strict two-phase locking.


Two Phase Locking

    T1          T2
    RL(Q)
    read(Q)     RL(Q)
                read(Q)
                UL(Q)
    WL(Q)       commit
    write(Q)
    UL(Q)
    commit

RL(X) means acquire a read lock on X.

WL(X) means acquire a write lock on X.

UL(X) means unlock X.


Two Phase Locking Prevents Schedules That Are Not Serializable

    T1          T2
    Read(A)
    B=A+1
    Write(B)
                Read(B)
                Read(C)
                Commit
    Read(A)
    C=A+2
    Write(C)

The repeatable read example. T2 will not be able to get the locks it needs.


Some Serializable Schedules Are Also Prevented

The scheduler either acquires all locks at the start of all transactions or it acquires them as needed for all transactions. This schedule can only be done with 2PL by a combination of those strategies.


Pessimistic Timestamp Ordering

Every transaction gets a (Lamport, totally ordered) timestamp when it starts.

Every data item has a read ts, a write ts, and a commit bit c.

The read ts is the ts of the transaction that most recently read the data item. The write ts is the ts of the transaction that most recently wrote to the item.

The commit bit c is true if and only if the most recent transaction to write to that item has committed.

The scheduler maintains the item timestamps and checks to make sure the reads and writes are correct. The goal is to enforce serializability.


Read Too Late

T1 tries to read X, but ts(T1) < write-ts(X) meaning X has been written to by a later transaction.

T1 should not be allowed to read X because it was written by a transaction that occurs later in the serialization order (transactions are serialized by start time).

Solution: T1 is aborted.

Timeline: T1 starts; T2 starts; T2 writes X; T1 reads X?


Write Too Late

T1 tries to write X, but the read-ts indicates that some other transaction should have read the value about to be written. write-ts(X) < ts(T1) < read-ts(X)

Solution: T1 is aborted.

Timeline: T1 starts; T2 starts; T2 reads X; T1 writes X?


Dirty Reads

T1 reads X that was last written by T2. The timestamps are properly ordered, but the commit bit c=false so if T2 later aborts then T1 must abort.

Solution: We can avoid cascading aborts by delaying T1’s read until T2 has committed (though not necessary to ensure serializability).

Timeline: T2 starts; T1 starts; T2 writes X; T1 reads X?; T2 aborts


Thomas Write Rule

T2 has written to X before T1. When T1 tries to write, the appropriate action is to do nothing. No other transaction T3 that should have read T1’s value of X got T2’s value instead, because it would have been aborted because of a too late read. Future reads of X want T2’s value or a later value, not T1’s value.

Solution: T1’s write can be skipped if T2 commits.

Timeline: T1 starts; T2 starts; T2 writes X; T1 writes X?


Commit Requests

Transaction commit requests are also passed to the scheduler.

To ensure strict executions, a commit request can be delayed until all transactions that wrote items that it overwrote have committed.

The scheduler sets the commit bit c on data items in the write set when it services the commit request.


TS Ordering Rules

When the scheduler receives a read request from transaction T:

    If ts(T) >= write-ts(X) and c(X) is true, grant the request and set read-ts(X) to MAX{ts(T), read-ts(X)}.

    If ts(T) >= write-ts(X) and c(X) is false, delay T until c(X) becomes true or the transaction that wrote X aborts.

    If ts(T) < write-ts(X), abort T and restart it with a new timestamp.


TS Ordering Rules, continued

When the scheduler receives a write request from transaction T:

    If ts(T) >= read-ts(X) and ts(T) >= write-ts(X), grant the request, set write-ts(X) to ts(T), and set c(X) = false.

    If ts(T) >= read-ts(X) and ts(T) < write-ts(X), don’t do the operation but allow T to continue as if it had been done (Thomas write rule).

    If ts(T) < read-ts(X), abort T and restart it with a new timestamp.
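The read and write rules translate almost line for line into code. The Python sketch below keeps read-ts, write-ts, and the commit bit per data item and returns the scheduler's decision as a string; the names Item, on_read, and on_write and the string results are assumptions made for illustration. It follows the rules literally, so it does not special-case a transaction reading its own uncommitted write.

```python
from dataclasses import dataclass


@dataclass
class Item:
    read_ts: int = 0        # largest timestamp that has read the item
    write_ts: int = 0       # timestamp of the most recent writer
    committed: bool = True  # commit bit c


def on_read(ts, item):
    """Scheduler decision for a read request with timestamp ts."""
    if ts < item.write_ts:
        return "abort"                    # read too late
    if not item.committed:
        return "delay"                    # wait for the writer to finish
    item.read_ts = max(ts, item.read_ts)
    return "grant"


def on_write(ts, item):
    """Scheduler decision for a write request with timestamp ts."""
    if ts < item.read_ts:
        return "abort"                    # write too late
    if ts < item.write_ts:
        return "skip"                     # Thomas write rule
    item.write_ts = ts
    item.committed = False
    return "grant"


x = Item()
print(on_write(2, x))   # grant: write-ts(X) becomes 2, c(X) false
print(on_read(1, x))    # abort: ts 1 < write-ts 2
print(on_read(3, x))    # delay: writer has not committed
x.committed = True      # the writer commits
print(on_read(3, x))    # grant: read-ts(X) becomes 3
print(on_write(2, x))   # abort: ts 2 < read-ts 3
```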


Pessimistic TS Ordering

If the scheduler enforces these rules, transactions will be serializable. The serial order is the order of their timestamps.

The next slide is an example of 3 transactions T1, T2, and T3. T1 runs first and completes and has used every item T2 and T3 want. In a, b, c and d, T2 requests a write(x) at the end of the given sequence. In e, f, g and h, T2 requests a read(x) at the end of the sequence.

In (d), T2 could continue (Thomas write rule) if there are no intervening reads. In (f) timestamps of T2 and T3 are reversed.


Pessimistic Timestamp Ordering

Figure: Concurrency control using timestamps.

Tent (tentative) means the value has been written but the transaction has not yet committed.


Optimistic Timestamp Ordering

In any optimistic concurrency control, each transaction does its writes to a private workspace until completion of a validation phase.

In the validate phase, the scheduler validates the transaction by comparing its read set and write set with those of other transactions.

After validation, the write set values are written to the database and the transaction commits.

Validation is frequently done with the help of timestamps.
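The three phases (work in a private workspace, validate, then write) can be sketched in Python as follows. The validation test used here is one common choice and an assumption of this sketch: the transaction passes only if nothing it read was written by a transaction that committed while it was running. The slides do not specify the exact test, and the class and parameter names are hypothetical.

```python
def validate(read_set, overlapping_write_sets):
    """Pass if no overlapping committed transaction wrote an item we read."""
    return all(read_set.isdisjoint(ws) for ws in overlapping_write_sets)


class OptimisticTxn:
    def __init__(self, db):
        self.db = db
        self.workspace = {}    # private copies of written items
        self.read_set = set()

    def read(self, key):
        self.read_set.add(key)
        return self.workspace.get(key, self.db[key])

    def write(self, key, value):
        self.workspace[key] = value      # database untouched until commit

    def commit(self, overlapping_write_sets):
        if not validate(self.read_set, overlapping_write_sets):
            self.workspace.clear()       # abort: discard the workspace
            return False
        self.db.update(self.workspace)   # write phase
        return True


db = {"x": 1, "y": 2}
t = OptimisticTxn(db)
t.write("y", t.read("x") + 10)
print(db["y"])                 # still the old value before commit
print(t.commit([{"z"}]), db)   # no overlap with read set {"x"}: commits
```

Discarding the workspace on a failed validation is what makes aborts cheap in this scheme: nothing in the database has to be undone.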


Summary and Examples


Rules for Two Phase Locking

Transaction must get a read or write lock on data item d before reading d and must get a write lock on d before writing to d.

A read lock can be shared with other read locks. A write lock is an exclusive lock.

After a transaction relinquishes a lock, it may not acquire any new locks.


Example One

Is this schedule serializable and would it be permitted under the rules of two phase locking? Where would the locks be acquired?

What about Strict 2PL?

    T1          T2
    Read(A)
    B=A+1
    Write(B)
                Read(A)
                Read(B)
                Commit
    Read(A)
    C=A+2
    Write(C)
    Commit


Example Two

Is this schedule serializable and would it be permitted under the rules of two phase locking?


Pessimistic Timestamp Ordering

Every transaction gets a (Lamport, totally ordered) timestamp when it starts.

Every data item has a read ts, a write ts, and a commit bit c.

The read ts is the ts of the transaction that most recently read the data item. The write ts is the ts of the transaction that most recently wrote to the item.

The commit bit c is true if and only if the most recent transaction to write to that item has committed.


TS Ordering Rules

When the scheduler receives a read request from transaction T:

    If ts(T) >= write-ts(X) and c(X) is true, grant the request and set read-ts(X) to MAX{ts(T), read-ts(X)}.

    If ts(T) >= write-ts(X) and c(X) is false, delay T until c(X) becomes true or the transaction that wrote X aborts.

    If ts(T) < write-ts(X), abort T and restart it with a new timestamp.


TS Ordering Rules, continued

When the scheduler receives a write request from transaction T:

    If ts(T) >= read-ts(X) and ts(T) >= write-ts(X), grant the request, set write-ts(X) to ts(T), and set c(X) = false.

    If ts(T) >= read-ts(X) and ts(T) < write-ts(X), don’t do the operation but allow T to continue as if it had been done (Thomas write rule).

    If ts(T) < read-ts(X), abort T and restart it with a new timestamp.


Example One

Is this schedule serializable and how would it be handled in timestamp ordering?


Example Two

Is this schedule serializable and how would it be handled in timestamp ordering?


End