chapter 21 introduction to transaction processing concepts and theory
DESCRIPTION
Chapters 21 & 22 Introduction to Transaction Processing & Concurrency Control links: http://en.wikipedia.org/wiki/Concurrency_control http://docs.oracle.com/cd/E11882_01/server.112/e25789/consist.htm#CNCPT020 Transaction processing systems: Systems with large databases and hundreds of concurrent users that are executing database transactions. Examples: systems for reservations, banking, credit card processing, stock markets, supermarket checkout, They require high availability and fast response time for hundreds of concurrent users. Chapters 21-22TRANSCRIPT
1
Chapter 21 Introduction to Transaction Processing Concepts and Theory
Chapter 22Concurrency Control Techniques
Chapters 21-22
Chapters 21-22 2
Chapters 21 & 22Introduction to Transaction Processing& Concurrency Control links: http://en.wikipedia.org/wiki/Concurrency_control
http://docs.oracle.com/cd/E11882_01/server.112/e25789/consist.htm#CNCPT020
Transaction processing systems: Systems with large databases and hundreds of
concurrent users that are executing database transactions. Examples: systems for reservations, banking, credit card
processing, stock markets, supermarket checkout, They require high availability and fast response time for
hundreds of concurrent users.
Oracle links http://www.oracle.com/pls/db112/sear
ch?remark=quick_search&word=concurrency&partno=
Chapters 21-22 3
Chapters 21-22 4
A transaction is a logical unit of database processing that includes one or more database access operations—these can include insertion, deletion, modification, or retrieval operations.
Transactions
Chapters 21-22 5
Multiprogramming: Computers can execute multiple programs—or processes—at the same time
Interleaved: if single central processing unit (CPU):
execute at most one process at a time , then suspend that process and execute some commands from the next process, and so on. A process is resumed at the point where it was suspended whenever it gets its turn to use the CPU again. Hence, concurrent execution of processes is actually interleaved.
Chapters 21-22 6
Figure 17.1 Interleaved processing versus parallel processing of concurrent transactions
Concurrency control in databases: developed in terms of interleaved concurrencyParallel processing: computer systems have multiple hardware processors (CPUs) (processes C and D)
Chapters 21-22 7
READ AND WRITE Items read_item(X): Reads a database item named X into a
program variable.
write_item(X): Writes the value of program variable X into the database item named X.
Chapters 21-22 8
Executing a read_item(X) command includes the following steps: 1. Find the address of the disk block that contains item X. 2. Copy that disk block into a buffer in main memory (if that
disk block is not already in some main memory buffer). 3. Copy item X from the buffer to the program variable named
X.
Executing a write_item(X) command includes the following steps: 1. Find the address of the disk block that contains item X. 2. Copy that disk block into a buffer in main memory (if that
disk block is not already in some main memory buffer). 3. Copy item X from the program variable named X into its
correct location in the buffer. 4. Store the updated block from the buffer back to disk (either
immediately or at some later point in time). Step 4 is the one that actually updates the database on
disk.
Chapters 21-22 9
Figure 17.2 Two sample transactions (a) Transaction T1, (b) Transaction T2
Chapters 21-22 10
Concurrency Control Why Concurrency Control Is
Needed1. The Lost Update Problem2. The Temporary Update (or Dirty Read) Problem3. The Incorrect Summary Problem
Chapters 21-22 11
The Lost Update Problem This problem occurs when two transactions that access the same database items have their operations interleaved in a way that makes the value of some database item incorrect.
Time Transaction A Transaction B Balance
T1 Read Bal … $1000.00
T2 … Read Bal $1000.00
T3 Bal := Bal-50 … $1000.00
T4 Write BAL … $950.00
T5 … Bal := Bal+100 $950.00
T6 … Write BAL $1100.00
Chapters 21-22 12
Figure 17.3 (a) The lost update problem
Chapters 21-22 13
The Temporary Update (or Dirty Read) ProblemThis problem occurs when one transaction updates a database item and then the transaction fails for some reason. The updated item is accessed by another transaction before it is changed back to its original value.
Time DepositTransaction
InterestTransaction
Balance
T1 Read Bal $1000.00
T2 BAL := BAL+1,000
… $1000.00
T3 Write BAL Read Bal $2000.00
Interest = Bal*.05
Total = Total + Interest
T7 Rollback … ???
Chapters 21-22 14
Figure 17.3 (b) The temporary update problem
Chapters 21-22 15
The Incorrect Summary Problem
If one transaction is calculating an aggregate summary function on a number of records while other transactions are updating some of these records, the aggregate function may calculate some values before they are updated and others after they are updated.
Chapters 21-22 16
Time Sum_interest Transfer
T1 TOTINT := 0 …….
T2 Read BAL_A(5,000) …….
T3 INT := BAL_A*0.05 …….
T4 TOTINT := TOTINT+INT …….
T5 Read BAL_B …….
T6 INT := BAL_B*0.05 …….
T7 TOTINT := TOTINT+INT …….
T8 ……. Read BAL_A(5000)
T9 ……. BAL_A := BAL_A-1000
T10 ……. Write BAL_A(4000)
T11 Read BAL_C …….
T12 INT := BAL_C*0.05 …….
T13 TOTINT := TOTINT+INT …….
T14 ……. Read BAL_E(5000)
T15 ……. BAL_E := BAL_E+1000
T16 ……. Write BAL_E(6000)
T17 Read BAL_D …….
T18 INT := BAL_D*0.05 …….
T19 TOTINT := TOTINT+INT …….
T20 Read BAL_E(6,000) …….
T21 INT := BAL_E*0.05 …….
T22 TOTINT := TOTINT+INT …….
Chapters 21-22 17
Figure 17.3 (c) The incorrect summary problem
Chapters 21-22 18
Transaction States and Additional Operations
A transaction is an atomic unit of work that is either completed in its entirety or not done at all.
Two Possible Operations on atransaction:
Commit Rollback
Chapters 21-22 19
Commit Commit Point
all operations that access the database have been executed successfully
the effect of all the transaction operations on the database has been recorded in the log
Beyond the commit point the transaction is said to be committed and its effect
is assumed to be permanently recorded in the database
It indicates the successful end-of -transaction
Chapters 21-22 20
ROLLBACK Signals unsuccessful end-of-transaction
so that any changes or effects that the transaction may have applied to the database must be undone
Two Possible Operations on a Transaction(cont.)
create table temp (f_id char(8), f_pin Char(8));insert into temp values(1,1181);insert into temp values(2,1075);insert into temp values(3, 8531);insert into temp values(4,1690);insert into temp values(5, 1222);Commit;F_ID F_PIN -------- --------1 1181 2 1075 3 8531 4 1690 5 1222 Select * from temp;F_ID F_PIN -------- --------1 1181 2 1075 3 8531 4 1690 5 1222
Chapters 21-22 21
Chapters 21-22 22
SQL> create table temp (f_id char(8), f_pin Char(8));SQL> select * from temp;
F_ID F_PIN---------- ---------- 1 1181 2 1075 3 8531 4 1690 5 1222
SQL> insert into temp values (99, 9999);
1 row created.
SQL> select * from temp;
F_ID F_PIN---------- ---------- 1 1181 2 1075 3 8531 4 1690 5 1222 99 9999
SQL> rollback;
Rollback complete.
SQL> select * from temp;
F_ID F_PIN---------- ---------- 1 1181 2 1075 3 8531 4 1690 5 1222
SQL> insert into temp values (99, 9999);
1 row created.
SQL> select * from temp;
F_ID F_PIN---------- ---------- 1 1181 2 1075 3 8531 4 1690 5 1222 99 9999
6 rows selected.
SQL> commit;
Commit complete.
SQL> delete from temp where f_id = 99;
1 row deleted.
SQL> select * from temp;
F_ID F_PIN---------- ---------- 1 1181 2 1075 3 8531 4 1690 5 1222
SQL> commit;
Commit complete.
Chapters 21-22 23
SQL> select * from temp;
F_ID F_PIN ---------- ---------- 1 1181 2 1075 3 8531 4 1690 5 1222
SQL> insert into temp values (99, 9999);
1 row created.
SQL> select * from temp;
F_ID F_PIN ---------- ---------- 1 1181 2 1075 3 8531 4 1690 5 1222 99 9999
SQL> rollback;
Rollback complete.
SQL> select * from temp;
F_ID F_PIN ---------- ---------- 1 1181 2 1075 3 8531 4 1690 5 1222
SQL> select * from temp;
F_ID F_PIN ---------- ---------- 1 1181 2 1075 3 8531 4 1690 5 1222
Chapters 21-22 24
Chapters 21-22 25
Introduction to Transactions Transaction A logical unit of work or atomic transaction
(It is usually a series of actions to be taken on the database such that either all the actions are done successfully or the database remains unchanged)
Chapters 21-22 26
Example of a transaction A transaction to enter the customer order
might include the following actions:
1. Change the customer record to enter new order
2. Change the salesperson record to enter the sales details of the new order
3. Insert the new order into order table in the database
Chapters 21-22 27
* Suppose the last step failed, perhaps because of insufficient file space
* Imagine the confusion that would take place if the first two changes were applied but the third one is not
* The customer might receive an invoice for all item never received, and a salesperson might receive a commission on an item never sent to the customer
Example of a transaction (cont.)
Chapters 21-22 28
There are basically two ways to perform these transaction activities 1. As a series of independent steps. 2. As an atomic transaction.
Defining a series of steps as a transaction is an important aspect of concurrency control.
Example of a transaction (cont.)
Chapters 21-22 29
Desirable Properties of Transactions(Also called as AClD properties
Atomicity: A transaction is an atomic unit of processing; it is either performed in its entirety or not performed at all.
Consistency preservation: A transaction is consistency preserving if its complete execution take(s) the database from one consistent state to another.
Isolation: A transaction should appear as though it is being executed in isolation from other transactions. That is, the execution of a transaction should not be interfered with by any other transactions executing concurrently.
Durability or permanency: The changes applied to the database by a committed transaction must persist in the database. These changes must not be lost because of any failure.
Chapters 21-22 30
database state is a collection of all the stored data items (values) in the database at a given point in time.
A consistent state of the database satisfies the constraints specified in the schema as well as any other constraints that should hold on the database.
Chapters 21-22 31
Serializability Schedule: Given a set of transactions,
any execution of those transactions is called a schedule
Serial Schedule: A schedule S is serial if for every transaction T participating in the schedule, all the operations of the transactions are executed consecutively.
Chapters 21-22 32
Serial Schedule 1. Read Item count for A (count
=10) 2. Reduce count by 5 for A (count =
5) 3. Write Item count for A (count =
5) 4. Read Item count for B (count =
5) 5. Reduce count by 3 for B (count =
2) Write Item count for B (count = 2)
B starts after A finishes
Chapters 21-22 33
Interleaved Schedule: A schedule which is not serial is called interleaved
Interleaved or Non-serial Schedule 1. Read Item count for A (count =10) 2. Read Item count for B (count = 10) 3. Reduce count by 5 for A (count = 5) 4. Write Item count for A (count = 5) 5. Reduce count by 3 for B (count = 7) Write Item count for B (count = 7)
B starts before A finishes
Chapters 21-22 34
Serializability is the generally accepted criterion for correctness for concurrency control.
* A given interleaved execution of a set of transactions is considered to be correct if it is serializable
-- if it produces the same result as some serial execution of the same transaction running them
one at a time A Schedule S of n transactions is serializable if it is
equivalent to some serial schedule of the same n transactions
A B C Result R
Chapters 21-22 35
Serializable
Serial
A B C R
Parallel
A B R C
Chapters 21-22 36
Concurrency Control Techniques (guarantee serializability)
Locking Time Stamping
Chapters 21-22 37
If all transactions obey the "two-phase locking protocol," then all possible interleaved schedules are serializable
The two-phase locking protocol
1.Before operating on any object (i.e., database tuple), a transaction must acquire a lock on that object.
2. After releasing a lock, a transaction must never go on to acquire any more locks.
Two-phase locking theorem
Chapters 21-22 38
Locking
A lock is a variable associated with a data item in the database and it describes the status of that
item
Generally, there is one lock for each data item in the database
When a database wants to access a data item it acquires a lock on that object.
Chapters 21-22 39
Types of Locks
Binary Shared and Exclusive
Chapters 21-22 40
Binary Locks:
A binary lock can have two states or values: locked and unlocked
Locked: Value of lock is 1 Unlocked: Value of lock is 0
Chapters 21-22 41
When the binary locking scheme is used, every transaction must obey the following rules:
1. A transaction T must issue the operation lock_item(X) before any read_item(X) or write_item(X) operations
2. A transaction T must issue the operation unlock_item(X) after all read_item(X) or write_item(X) operations are completed
3. A transaction T will not issue a lock_item operation if it already holds the lock on item X
4. A transaction T will not issue an unlock item operation unless it already holds the lock on item X
If a transaction holds the lock for item X, and if another
transaction requests a lock on that item, it is forced to wait until the first transaction is completed and releases the lock
Chapters 21-22 42
Shared and Exclusive Locks
Disadvantage of binary locking -- too restrictive (at most one transaction can hold a lock on
that item) We can allow several transactions to access the same
item X if they all access X for reading purposes only However, if a transaction is to write an item X, it must
have exclusive access to X
Two kinds of locks read or shared locks (S) write or exclusive lock (X) If a transaction wants to read an item X it requests read lock. If a transaction wants to update an item X it requests write lock.
Chapters 21-22 43
Rules for granting a lock:
1. If no transactions have any kind of lock on item X then any requested Read or Write lock is granted
2. Suppose transaction A has a read lock (S) on item X - A request from some distinct transaction B for a
read lock will be granted - A request from some distinct transaction B for a
write lock will be denied. 3. Suppose a transaction B has a write lock (X) on item X
- A request from some distinct transaction B for a read lock will be denied
- A request from some distinct transaction B for a write lock will be denied
Chapters 21-22 44
The rules can be summarized by means of a compatibility matrix
Read WriteRead | Y NWrite | N N
If a transaction A has a read lock for data item X, and if it wants to update the item, it can upgrade its read lock to write lock. This will be granted only if there are no other readers for that item.
Chapters 21-22 45
Time K L t1 req S lock a . t2 S lock a .
t3 . req S lock a t4 . S lock aTime K L t1 req S lock a . t2 S lock a . t3 . req X lock a t4 . wait
Time K L t1 req X lock a . t2 X lock a . t3 . req X lock a t4 . wait
Chapters 21-22 46
When to Lock with ROW SHARE and ROW EXCLUSIVE Mode
LOCK TABLE Emp_tab IN ROW SHARE MODE; LOCK TABLE Emp_tab IN ROW EXCLUSIVE MODE;
ROW SHARE and ROW EXCLUSIVE table locks offer the highest degree of concurrency.
You might use these locks if: Your transaction needs to prevent another transaction from acquiring an
intervening share, share row, or exclusive table lock for a table before the table can be updated in your transaction. If another transaction acquires an intervening share, share row, or exclusive table lock, no other transactions can update the table until the locking transaction commits or rolls back.
Your transaction needs to prevent a table from being altered or dropped before the table can be modified later in your transaction.
http://download-east.oracle.com/docs/cd/B14117_01/appdev.101/b10795/adfns_sq.htm#1025374
Chapters 21-22 47
When to Lock with SHARE Mode LOCK TABLE Emp_tab IN SHARE MODE;
Your transaction only queries the table, and requires a consistent set of the table data for the duration of the transaction.
You can hold up other transactions that try to update the locked table, until all transactions that hold SHARE locks on the table either commit or roll back.
Other transactions may acquire concurrent SHARE table locks on the same table, also allowing them the option of transaction-level read consistency.
Chapters 21-22 48
LOCK TABLE Emp_tab IN SHARE ROW EXCLUSIVE MODE;
LOCK TABLE Emp_tab IN EXCLUSIVE MODE;
Explicitly Acquiring Row Locks SELECT... FOR UPDATE
Chapters 21-22 49
Deadlocks * Locking solves the three basic
problems of Concurrency * But it introduces problems of its own,
principally the problem of deadlocks Deadlocks occur when each of two transactions is
waiting for the other to release the lock on an item
Chapters 21-22 50
Time K L t1 req S lock a . t2 S lock a . t3 . req X lock b t4 . X lock b t5 req S lock b . t6 wait . t7 . req X lock a t8 . wait
Chapters 21-22 51
AA B C DA
Wait-for Graph
A B
Chapters 21-22 52
Two-Phase Locking protocol (2PL) Before operating on any object (i.e., database tuple),
a transaction must acquire a lock on that object
After releasing a lock, a transaction must never go onto acquire any more locks
A transaction is said to follow the two-phase locking protocol if all locking operations proceed the first unlock operation in the transaction
Chapters 21-22 53
Two-Phase Locking protocol (2PL) 2 Phases Expanding or growing phase
New locks on items can be acquires but none can be released
Shrinking Phase Existing locks can be released but no
new locks can be acquired
Chapters 21-22 54
Variations of two-phrase locking (2PL)
Basic 2PL Conservative 2 PL Strict 2PL Rigorous 2PL
Chapters 21-22 55
Conservative 2 PL T locks all items it accesses before
the transaction begins executions predeclaring read set and write set wait until gets all locks before the T can start
Chapters 21-22 56
Strict 2PL Most popular T does not release X locks until
after it commits or abort Not deadlock free
Chapters 21-22 57
Rigorous 2 PL Guarantees strict 2PL T does not release X or S locks
until after it commits or abort Easier to implement than strict 2 PL
Chapters 21-22 58
Deadlock Prevention Protocols Deadlock Detection and Timeouts
Two Fundamental strategies used to deal with deadlocks
Chapters 21-22 59
Deadlock Prevention Protocols Used in conservative 2 PL
Acquires every T locks all items it needs in advance
If any items cannot obtain, none of the items are locked – wait
Limits concurrency
Chapters 21-22 60
Deadlock Detection System checks if deadlock exists If deadlock
Create a wait-for graph (if deadlock – has a cycle)
Victim selection – avoid T that has run for a long time and has many update operations
Chapters 21-22 61
Starvation The algorithm selects the same
transaction as victim repeatedly– cause it to abort and never finish execution
To solve give higher priority to transactions that have been aborted multiple times
Chapters 21-22 62
Timeouts If T waits for a long time, abort it. Low overhead cost, simple
Chapters 21-22 63
Disadvantages of Two Phase Locking Limits the amount of concurrency
A transaction T may not be able to release an item X after it is through using it, if T needs an additional lock later
Meanwhile another transaction seeking to access X may be forced to wait even though T is through using X
Chapters 21-22 64
Two problems with 2PL deadlock livelock Livelock: A transaction cannot proceed for an
indefinite period of time while other transactions in the system continue normally
Problem: waiting scheme for locked items is unfair Solution: to have a fair waiting scheme. e.g. First-come-first-serve
Chapters 21-22 65
Granularity of Data Items A database item could be one of the
following A database record A field value of a database record A disk block A whole file The whole database
Chapters 21-22 66
Granularity of Data Items (cont.) The size of the data items are called
the data item granularity. Fine granularity
small item sizes coarse granularity
large item sizes.
Chapters 21-22 67
If the data item size is larger Degree of concurrency permitted is lower.
For example, if the data item size is a disk block, a transaction T that needs to lock a record A must lock the whole disk block X that contains A.
Chapters 21-22 68
If the data item size is smaller If the data size is smaller, more items will exist in
the database. More lock and unlock operations will be performed,
causing a higher overhead. more storage space will be required for the lock
table. For Timestamps, - storage is required for read_TS and write_TS - overhead of handling a larger number of
items is similar to that in the case of locking.
Chapters 21-22 69
The best data size depends on the types of transactions involved.
If a typical transaction accesses a small number of records, it is advantageous to have the data item granularity be one record.
On the other hand, if a transaction typically accesses many records of the same file, it may be better to have block or file granularity so that the transaction will consider all those records as one data items.
Most concurrency control techniques have a uniform data item size. In some techniques, the data item size may be changed to the granularity that best suits the transactions that are currently executing on the system.
Chapters 21-22 70
Timestamp Ordering Do not use locks so deadlock cannot occur
The algorithm must ensure that, for each item accessed, by more than one transaction in the schedule, the order in which the item is accessed does not violate the serializability of the schedule.
Chapters 21-22 71
Timestamp Time stamp -- unique identifier created by the DBMS -- identify a transaction start time Time Stamps use: 1. Counter that is incremented each time its value is
assigned to a transaction (1,2,3…) 2. Use system clock and ensure that no two timestamp
values are generated during the same tick of the clock.
Chapters 21-22 72
Notations Read_TS(X): largest TS among the TS that have
successfully read item X
Read_TS(X) = TS(T) where T is the youngest transaction that has read X successfully
Write_TS(X) = TS(T) where T is the youngest transaction that has written X successfully
Chapters 21-22 73
Basic Timestamp Ordering Transaction T issues a write_item(X) operation:
If Read_TS(X) > TS(T) or if Write_TS(X) > TS(T) – Abort and Rollback T Else Write_item(X); Write_TS(X) = TS(T)
Chapters 21-22 74
Basic Timestamp Ordering Whenever some transaction T tries to issue a
read_item(X) or a write_item(X) operation, the basic TO algorithm compares the timestamp of T with read_TS(X) and write_TS(X) to ensure that the timestamp order of transaction execution is not violated. If this order is violated, then transaction T is aborted and resubmitted to the system as a new transaction with a new timestamp.
Chapters 21-22 75
Cascading rollback
If T is aborted and rolled back, any transaction T1 that may have used a value written by T must also be rolled back. Similarly, any transaction T2 that may have used a value written by T1 must also be rolled back, and so on.
Chapters 21-22 76
Basic Timestamp Ordering (cont.) Transaction T issues a read_item(X) operation: If Write_TS(X) > TS(T) – Abort and Rollback T Else (Write_TS(X) <= TS(T)) - Read_item(X); - Read_TS(X) = TS(T)
TS(T) -- timestamp of transaction T
Chapters 21-22 77
conflicting operations that occur in the incorrect order, it rejects the later of the two operations by aborting the transaction that issued it.
The schedules produced by basic TO are hence guaranteed to be conflict serializable, like the 2PL protocol.
Chapters 21-22 78
Strict Timestamp Ordering
a transaction T that issues a read_item(X) or write_item(X) such that TS(T) > write_TS(X) has its read or write operation delayed until the transaction T that wrote the value of X (hence TS(T ) = write_TS(X)) has committed or aborted.
Chapters 21-22 79
Oracle Documentation
http://www.oracle.com/pls/db112/search?remark=quick_search&word=concurrency&partno=