CS 377
Database Systems
Transaction Processing and Recovery
Li Xiong
Department of Mathematics and Computer Science
Emory University
Transaction Processing
• Basic DB Functionalities
  • Data Storage
  • Query Processing
  • Transaction Processing Systems
Transaction Processing Systems
• Systems with large databases and concurrent users executing database transactions
• A transaction is a logical unit of DB processing that may include one or more database access operations
• Issues
  • Transactions may fail – how to recover?
  • Multiple transactions are executed at once – how to ensure consistency?
Why is Recovery Needed?
Pooh was sitting in his house one day, counting his pots of
honey, when there came a knock on the door.
“Fourteen,” said Pooh. “Come in. Fourteen. Or was it fifteen?
Bother. That’s muddled me.”
“Hallo, Pooh,” said Rabbit.
“Hallo, Rabbit. Fourteen, wasn’t it?”
“What was?”
“My pots of honey what I was counting.”
“Fourteen, that’s right.”
“Are you sure?”
“No, ” said Rabbit. “Does it matter?”
Why is Recovery Needed? Really?
• Computer failure (system crash)
  • Main memory failure
• Transaction error
  • Integer overflow; division by zero; erroneous parameter values; logical programming error
• Disk failure
  • Read/write malfunction or head crash
• Catastrophes
  • Power or air-conditioning failure, fire, theft, sabotage, overwriting disks or tapes by mistake, and mounting of a wrong tape by the operator
Basic Model and Notations
• A database: a collection of named data items
• Granularity of data: a field, a record, or a whole disk block (concepts are independent of granularity)
• Basic query operations are read and write
• read_item (x):
  • block containing x -> memory
  • program variable x ← value of x in block
• write_item (x):
  • value of x in block ← program variable x
  • block containing x -> disk
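The two operations above can be sketched in a few lines of Python. This is only an illustrative model, not the slides' implementation: the `disk` and `buffer` dicts and the `BLOCK_OF` mapping are hypothetical names, and `write_item` flushes the block immediately, which a real buffer manager need not do.

```python
# Hypothetical model: the disk is a dict of blocks, each block a dict of items.
disk = {"block0": {"x": 100}}
buffer = {}                      # in-memory copies of disk blocks
BLOCK_OF = {"x": "block0"}       # assumed mapping from item name to block

def load_block(block_id):
    """Copy a block from disk into memory if it is not already there."""
    if block_id not in buffer:
        buffer[block_id] = dict(disk[block_id])
    return buffer[block_id]

def read_item(name):
    block = load_block(BLOCK_OF[name])       # block containing x -> memory
    return block[name]                       # program variable <- value in block

def write_item(name, value):
    block_id = BLOCK_OF[name]
    block = load_block(block_id)
    block[name] = value                      # value in block <- program variable
    disk[block_id] = dict(block)             # block containing x -> disk

x = read_item("x")
write_item("x", x - 10)
print(disk["block0"]["x"])
```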
Sample Transaction with Failure
• E.g. X = 100, Y = 50, N = 10
[Figure: sample transaction with a failure partway through]
Multiple Transactions
• Serial schedules: transactions are executed in isolation and consecutively
• E.g. X = 100, Y = 50, N = 10, M = 20
• Drawback?
Concurrency
• Interleaved processing:
  • Concurrent execution of processes is interleaved in a single CPU
• Parallel processing:
  • Processes are concurrently executed in multiple CPUs
Transaction Execution with Concurrency
• E.g. X = 100, Y = 50, N = 10, M = 20
Outline
• Motivation
• Transaction Concepts
• Recovery
• Concurrency Control (next lecture)
Transaction Concepts
• Transaction: logical unit of data processing that includes one or more basic access operations (read: retrieval; write: insert or update, delete)
• ACID Properties of Transactions
  • Atomicity: an atomic unit; either performed in its entirety or not performed at all
  • Consistency: preserve consistency; take the database from one consistent state to another
  • Isolation: appear to execute in isolation from other transactions
  • Durability: changes by a committed transaction must persist
• Transaction Management
  • Recovery – atomicity, durability
  • Concurrency control – isolation
  • Application programs – consistency
Writing to Disk
• In-place updating
  • Write the buffer to the same original disk location, overwriting the old value
  • Before image and after image
• Shadowing
  • Write the updated buffer to a different disk location, so multiple versions of data items can be maintained
Unfinished Transaction Example
Constraint: A = B
T1: A ← A × 2
    B ← B × 2
T1: Read (A,t); t ← t×2; Write (A,t);
    Read (B,t); t ← t×2; Write (B,t);
Initially: A: 8, B: 8
At the failure (after the new A reaches disk but before the new B does):
  memory: A: 16, B: 16
  disk: A: 16, B: 8
• Violates atomicity
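The failure scenario can be replayed with a small Python sketch. The `Crash` exception and the `run_t1` function are hypothetical stand-ins for a system failure and for transaction T1. The sketch assumes the crash happens after A is written to disk and before B is.

```python
class Crash(Exception):
    """Stands in for a system failure (hypothetical helper)."""

def run_t1(disk, crash_after_first_output):
    memory = dict(disk)          # Read(A, t) and Read(B, t) bring values into memory
    memory["A"] *= 2             # t <- t * 2; Write(A, t)
    memory["B"] *= 2             # t <- t * 2; Write(B, t)
    disk["A"] = memory["A"]      # new A reaches disk
    if crash_after_first_output:
        raise Crash("failure before B reaches disk")
    disk["B"] = memory["B"]      # new B reaches disk

disk = {"A": 8, "B": 8}
try:
    run_t1(disk, crash_after_first_output=True)
except Crash:
    pass

print(disk)                      # {'A': 16, 'B': 8}
print(disk["A"] == disk["B"])    # False: the constraint A = B no longer holds
```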
Recovery
Credits: Hansel and Gretel, 782 AD
• Keep a system log and perform recovery when necessary
• System log
  • Separate, non-volatile
  • Periodically backed up to archival storage (tape)
  • Append-only file consisting of entries called log records
  • Records the operations that each transaction has performed on the data
Recovery
• Log records
  • start: beginning of transaction execution
  • read or write: read or write operations on database items
  • commit: successful end of the transaction – any updates should be permanently applied to the DB (appear on disk)
  • rollback or abort: unsuccessful end – any changes should not be applied to the DB, or should be undone if already applied
• Write-ahead logging (WAL): all modifications are written to a log before they are applied to the database
• Logging
  • Undo – immediate update
  • Redo – deferred update
• Recovery
  • undo certain transaction operations to ensure all operations of an uncommitted transaction are not applied
  • redo certain transaction operations to ensure all operations of a committed transaction are applied successfully
Undo Logging
• Idea: undo operations for uncommitted transactions to go back to the original state of the DB
• In order to undo the updates made by a transaction, we save the original (old) value of every updated data item
• An UNDO log:
  • [start, TID]: indicates that transaction TID has started
  • [write, TID, X, old_value]: indicates that transaction TID has overwritten data item X, whose value was old_value
  • [commit, TID]: indicates that transaction TID has completed successfully
  • [abort, TID]: indicates that transaction TID has been aborted
Undo Logging
• When a new transaction begins:
  • Append [start, T] to the UNDO log
• When transaction T reads a data item X:
  • Don't need to do anything...
• When transaction T writes a data item X:
  • Append [write, T, X, old_value] to the UNDO log
  • AFTER the log record has been written successfully, update X (with the new value)
• When transaction T completes successfully:
  • Append [commit, T] to the UNDO log
• When transaction T is aborted:
  • Append [abort, T] to the UNDO log
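The logging rules above can be sketched as follows. This is a minimal in-memory model, not a real storage engine: the `UndoLoggedDB` class and its method names are hypothetical, a Python list stands in for the non-volatile log, and a dict stands in for the database on disk.

```python
class UndoLoggedDB:
    """Toy database with an undo log (hypothetical class, for illustration only)."""

    def __init__(self, data):
        self.data = dict(data)   # stands in for the on-disk database
        self.log = []            # stands in for the append-only, non-volatile log

    def start(self, tid):
        self.log.append(("start", tid))

    def read(self, tid, item):
        return self.data[item]   # reads are not logged

    def write(self, tid, item, new_value):
        # Log the OLD value first, then update the item (immediate update).
        self.log.append(("write", tid, item, self.data[item]))
        self.data[item] = new_value

    def commit(self, tid):
        self.log.append(("commit", tid))

    def abort(self, tid):
        self.log.append(("abort", tid))

db = UndoLoggedDB({"A": 8, "B": 8})
db.start("T1")
db.write("T1", "A", db.read("T1", "A") * 2)
db.write("T1", "B", db.read("T1", "B") * 2)
db.commit("T1")
for record in db.log:
    print(record)
```

The printed log holds the old value 8 for both A and B, which is exactly what recovery needs in order to restore them.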
Undo Logging: Disk Writing Order
a) Log records of changed data items
b) Changed data items (immediate modifications)
c) Commit log record
Undo logging
T1: Read (A,t); t ← t×2; Write (A,t);
    Read (B,t); t ← t×2; Write (B,t);
Initially: A: 8, B: 8
After T1:
  memory: A: 16, B: 16
  disk: A: 16, B: 16
  log: <T1, start> <T1, A, 8> <T1, B, 8> <T1, commit>
Undo logging: Possible Recovery Rules
• For every Ti with <Ti, start> in log:
  If <Ti, commit> or <Ti, abort> in log, do nothing
  Else, in forward order:
    For all <Ti, X, v> in log:
      write (X, v); output (X)
    Write <Ti, abort> to log
Undo logging: Recovery Rules
Scan the log in reverse order (latest → earliest)
(1) Remember transactions with a <Ti, commit> (or <Ti, abort>) record
(2) For each <Ti, X, v>:
    if Ti does not have <Ti, commit> (or <Ti, abort>)
    then write (X, v); output (X)
(3) For each Ti that does not have <Ti, commit> (or <Ti, abort>):
    write <Ti, abort> to log
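A minimal Python sketch of the reverse-scan rules above. The tuple log format and the name `undo_recover` are assumptions made for this illustration, and checkpoints are ignored. The reverse scan works because a transaction's commit or abort record always follows its write records, so it is seen first when scanning backward.

```python
def undo_recover(log, data):
    """Undo the writes of unfinished transactions (hypothetical helper)."""
    finished = set()
    # Steps (1) and (2): scan latest -> earliest.
    for record in reversed(list(log)):
        kind, tid = record[0], record[1]
        if kind in ("commit", "abort"):
            finished.add(tid)
        elif kind == "write" and tid not in finished:
            _, _, item, old_value = record
            data[item] = old_value          # write(X, v); output(X)
    # Step (3): mark every unfinished transaction as aborted.
    unfinished = []
    for record in log:
        if record[0] == "start" and record[1] not in finished and record[1] not in unfinished:
            unfinished.append(record[1])
    for tid in unfinished:
        log.append(("abort", tid))
    return data

# Crash after the new A reached disk but before T1 committed.
log = [("start", "T1"), ("write", "T1", "A", 8)]
disk = {"A": 16, "B": 8}
undo_recover(log, disk)
print(disk)       # {'A': 8, 'B': 8}
print(log[-1])    # ('abort', 'T1')
```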
Checkpointing
Periodically:
(1) Do not accept new transactions
(2) Wait for all active transactions to finish
(3) Flush all log records to disk (log)
(4) Write “checkpoint” record on disk (log)
(5) Resume transaction processing
Nonquiescent Checkpointing
(1) Write a log record <Start CKPT (T1, …, Tk)> listing all active transactions
(2) Wait until all of those active transactions commit or abort; do not prohibit other transactions from starting
(3) Flush all log records to disk (log)
(4) Write a log record <End CKPT>
Exercise: Undo Logging
• An undo logging database starts a nonquiescent checkpoint after line 5. The initial value of A in the database (on disk) is 2.
• Show the log file entries that would be generated by this execution.
• If the system crashes, what is the value of A in the database? What recovery would have to be done?
  • immediately after line 12
  • immediately after line 11
T1 T2 T3
--------------------------------------------------------------------
0 start
1 READ A
2 A := A + 1
3 start
4 WRITE A
5 commit
6 start
7 READ A
8 A := A + 1
9 READ A
10 commit
11 WRITE A
12 commit
Outline
• Transaction Basics
• Recovery
  • Undo Logging
  • Redo Logging
  • Undo/Redo Logging
• Concurrency Control
REDO logging
• Idea: save disk I/Os by deferring data changes – (re)do the changes for committed transactions
• In order to redo the updates made by the transaction, we save the NEW value of every updated data item
• A REDO log:
  • [start, TID]: indicates that transaction TID has started
  • [write, TID, X, new_value]: indicates that transaction TID has overwritten data item X with new_value
  • [commit, TID]: indicates that transaction TID has completed successfully
  • [abort, TID]: indicates that transaction TID has been aborted
REDO Logging
• When a new transaction begins:
  • Append [start, T] to the REDO log
• When transaction T reads a data item X:
  • Don't need to do anything...
• When transaction T writes a data item X:
  • Append [write, T, X, new_value] to the REDO log
• When transaction T completes successfully:
  • Append [commit, T]
  • Update the database
  • Append [end, T]
• When transaction T is aborted:
  • Append [abort, T]
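The redo logging rules above can be sketched in Python in the same toy style. The `RedoLoggedDB` class and its `pending` dict of deferred writes are hypothetical, and so is the choice to let a transaction read its own deferred writes. A list stands in for the log and a dict for the database on disk.

```python
class RedoLoggedDB:
    """Toy database with a redo log and deferred updates (illustration only)."""

    def __init__(self, data):
        self.data = dict(data)   # stands in for the on-disk database
        self.log = []            # stands in for the append-only log
        self.pending = {}        # tid -> {item: new_value}, held in memory

    def start(self, tid):
        self.log.append(("start", tid))
        self.pending[tid] = {}

    def read(self, tid, item):
        # Assumption: a transaction sees its own deferred writes.
        return self.pending[tid].get(item, self.data[item])

    def write(self, tid, item, new_value):
        self.log.append(("write", tid, item, new_value))   # log the NEW value
        self.pending[tid][item] = new_value                # database untouched

    def commit(self, tid):
        self.log.append(("commit", tid))          # commit record first
        self.data.update(self.pending.pop(tid))   # then update the database
        self.log.append(("end", tid))             # then mark the updates as done

    def abort(self, tid):
        self.log.append(("abort", tid))
        self.pending.pop(tid, None)               # nothing reached the database

db = RedoLoggedDB({"A": 8, "B": 8})
db.start("T1")
db.write("T1", "A", db.read("T1", "A") * 2)
db.write("T1", "B", db.read("T1", "B") * 2)
before_commit = dict(db.data)    # still {'A': 8, 'B': 8}
db.commit("T1")
print(before_commit, db.data)
```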
Redo Logging: Disk Writing Order
a) Log records of changed data items
b) Commit log record
c) Changed data items (deferred modification)
Redo logging (deferred modification)
T1: Read (A,t); t ← t×2; Write (A,t);
    Read (B,t); t ← t×2; Write (B,t);
    Output (A); Output (B)
Initially: A: 8, B: 8
After T1:
  memory: A: 16, B: 16
  LOG: <T1, start> <T1, A, 16> <T1, B, 16> <T1, commit> <T1, end>
  DB (after output): A: 16, B: 16
Recovery rules: Redo logging
(1) Let S = set of transactions with <Ti, commit> (and no <Ti, end>) in log
(2) For each <Ti, X, v> in log, in forward order (earliest → latest) do:
    if Ti ∈ S then
      Write (X, v); Output (X)
(3) For each Ti ∈ S, write <Ti, end>
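A minimal Python sketch of the three redo recovery steps. The tuple log format and the name `redo_recover` are assumptions made for this illustration, and checkpoints are ignored.

```python
def redo_recover(log, data):
    """Reapply the writes of committed but unfinished transactions (hypothetical helper)."""
    # Step (1): S = transactions with a commit record and no end record.
    committed = {r[1] for r in log if r[0] == "commit"}
    ended = {r[1] for r in log if r[0] == "end"}
    s = committed - ended
    # Step (2): scan earliest -> latest and redo the writes of transactions in S.
    for record in list(log):
        if record[0] == "write" and record[1] in s:
            _, _, item, new_value = record
            data[item] = new_value          # Write(X, v); Output(X)
    # Step (3): record that the updates of each transaction in S are on disk.
    for tid in sorted(s):
        log.append(("end", tid))
    return data

# T1 committed but crashed before its updates reached the database;
# T2 never committed, so its write is ignored.
log = [
    ("start", "T1"), ("write", "T1", "A", 16), ("write", "T1", "B", 16),
    ("commit", "T1"),
    ("start", "T2"), ("write", "T2", "A", 99),
]
disk = {"A": 8, "B": 8}
redo_recover(log, disk)
print(disk)       # {'A': 16, 'B': 16}
print(log[-1])    # ('end', 'T1')
```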