TRANSCRIPT
Databases might get lost
We don’t like that, because …
Solutions are based on logging techniques
General term: write-ahead logging
Wrong user data: avoid by using constraints
System failure: loss of main memory (= volatile memory)
Media failure (disk errors)
Catastrophe (several levels possible)
Traffic between disk and main memory: I/O
Unit size: page or block
A block in memory is buffered
Typically 32 kbyte
Access time: 5–10 msec
The block is often the unit for concurrency (sometimes the tuple)
The log file is a separate file, containing information to support database reconstruction
Entries have a record structure
Entries are (in most cases) related to a specific transaction
Good practice: keep the log file on a separate disk
Recent development: log files on SSD
INPUT(X): copy block X from disk to memory
READ(X,t): assign the value of X to variable t
WRITE(X,t): copy the value of t into X in memory
OUTPUT(X): copy block X from memory to disk (also called flushing X)
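A minimal sketch of these four primitives in Python; the dict-based disk and buffer (and all names below) are illustrative assumptions, not a real buffer manager:

    disk = {"Account1": 1000, "Account2": 500}   # persistent storage (example contents)
    buffer = {}                                  # in-memory block buffer

    def INPUT(x):        # copy block x from disk to memory
        buffer[x] = disk[x]

    def READ(x):         # return the value of x (stands in for READ(X,t))
        return buffer[x]

    def WRITE(x, t):     # copy the value of t into x in memory
        buffer[x] = t

    def OUTPUT(x):       # copy block x from memory to disk ("flushing x")
        disk[x] = buffer[x]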
Example (financial transaction):
INPUT(Account1); READ(Account1, v1); v1 := v1 – 200;
WRITE(Account1, v1); OUTPUT(Account1);
INPUT(Account2); READ(Account2, v2); v2 := v2 + 200;
WRITE(Account2, v2); OUTPUT(Account2);
INPUT(Account1); READ(Account1, v1); v1 := v1 – 200;
WRITE(Account1, v1); OUTPUT(Account1);
>> CRASH <<
INPUT(Account2); READ(Account2, v2); v2 := v2 + 200;
WRITE(Account2, v2); OUTPUT(Account2);
After the crash, Account1 has been debited on disk but Account2 has not been credited: the database is left inconsistent.
The old value of each data element X should be written to the log file: <T, X, oldvalue>
Such a record is often called the before image of X
Before doing an OUTPUT(X), the log record for this X should be flushed
The <T, commit> record is written to the log after all database elements have been updated on disk
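A sketch of this undo-logging discipline on top of the primitives above; m_log, d_log and the helper names are assumptions for illustration:

    m_log, d_log = [], []            # in-memory log and its on-disk copy

    def log(record):
        m_log.append(record)

    def flush_log():                 # after this, D-Log = M-Log
        d_log.extend(m_log[len(d_log):])

    def undo_write(t, x, new_value):
        INPUT(x)
        log((t, x, READ(x)))         # <T, X, oldvalue>: the before image
        WRITE(x, new_value)

    def undo_commit(t, touched):
        flush_log()                  # before images reach the disk first
        for x in touched:
            OUTPUT(x)                # only now may X be flushed
        log((t, "commit"))           # commit record comes last
        flush_log()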
Example transaction run + log entries
Distinguish values in M(emory) and on D(isk)
Note: Log means M-Log; after FLUSH LOG: D-Log = M-Log
Check the log file for uncommitted transactions
Roll back these transactions using the before images
The order of undoing transactions is essential
What about … a crash during the recovery process?
By the way: restart the transactions that were undone
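A sketch of undo recovery under the log format assumed above: scan the log backwards and restore the before images of every transaction without a commit record:

    def undo_recovery():
        committed = {r[0] for r in d_log if r[1:] == ("commit",)}
        for record in reversed(d_log):      # latest update is undone first
            if len(record) == 3:
                t, x, old_value = record
                if t not in committed:
                    disk[x] = old_value     # restore the before image
        # Restoring a before image is idempotent, so a crash during
        # recovery is harmless: simply run undo_recovery() again.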
Dealing with all the transactions since the DB started operating (possibly a few years ago) is undesirable; checkpoints bound how far back recovery has to look
Checkpoint steps:
1. No new transactions accepted
2. Wait until all active transactions are finished (COMMIT or ABORT)
3. Flush the log records
4. Write a <CKPT> record into the log
5. Resume transaction processing
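A sketch mirroring these five steps; the scheduler hooks are hypothetical stubs:

    def stop_accepting_new_transactions(): pass   # hypothetical scheduler hook
    def wait_until_finished(transactions): pass   # hypothetical: block until all finish
    def resume_transaction_processing(): pass     # hypothetical scheduler hook

    def checkpoint(active_transactions):
        stop_accepting_new_transactions()         # 1. no new transactions
        wait_until_finished(active_transactions)  # 2. wait for COMMIT or ABORT
        flush_log()                               # 3. flush the log records
        log(("CKPT",))                            # 4. write <CKPT> into the log
        flush_log()
        resume_transaction_processing()           # 5. resume processing

After a crash, the backward scan of undo recovery can stop at the first <CKPT> record: everything before it is already safely on disk.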
INPUT(Account1); READ(Account1, v); v := v – 200;
WRITE(Account1, v);
INPUT(Account2); READ(Account2, v); v := v + 200;
WRITE(Account2, v); COMMIT;
>> CRASH <<
OUTPUT(Account1);
OUTPUT(Account2);
Why would you delay OUTPUT?
Before commitment, the new value of X should be written to the log file: <T, X, newvalue>
Such a record is often called the after image of X
After the logging of all after images, the <T, commit> record is written to the log
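A sketch of the redo-logging discipline, reusing the hypothetical helpers from the undo sketch; note that OUTPUT is delayed until the commit record is safely on disk:

    def redo_write(t, x, new_value):
        log((t, x, new_value))   # <T, X, newvalue>: the after image
        WRITE(x, new_value)      # update in memory only; no OUTPUT yet

    def redo_commit(t, touched):
        log((t, "commit"))       # commit record after all after images
        flush_log()              # commit must be on disk before flushing data
        for x in touched:
            OUTPUT(x)            # may even be postponed further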
Example transaction run + log entries
Distinguish values in M(emory) and on D(isk)
Check the log file for committed transactions
Restore the effects of these transactions using the after images
The order of redoing transactions is essential
What about … a crash during the recovery process?
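A sketch of redo recovery, the mirror image of the undo version above: scan the log forwards and reinstall the after images of committed transactions:

    def redo_recovery():
        committed = {r[0] for r in d_log if r[1:] == ("commit",)}
        for record in d_log:               # earliest update is redone first
            if len(record) == 3:
                t, x, new_value = record
                if t in committed:
                    disk[x] = new_value    # reinstall the after image
        # Idempotent again: a crash during recovery means rerunning it.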
Combining before image and after image: <T, X, oldvalue, newvalue>
Optimal freedom for the buffer manager
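Because the buffer manager may now flush X before or after commit, recovery needs both passes. A sketch under the same assumed log format, with combined <T, X, oldvalue, newvalue> records:

    def undo_redo_recovery():
        committed = {r[0] for r in d_log if r[1:] == ("commit",)}
        for record in d_log:                   # redo committed, forwards
            if len(record) == 4 and record[0] in committed:
                t, x, old_value, new_value = record
                disk[x] = new_value
        for record in reversed(d_log):         # undo uncommitted, backwards
            if len(record) == 4 and record[0] not in committed:
                t, x, old_value, new_value = record
                disk[x] = old_value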
Definition: Tj RF (reads from) Ti if there is an X such that Wi[X] is the last write on X before Rj[X]
Definition: if Tj RF Ti and Ti is not yet committed, then this read is called a dirty read
A schedule is recoverable if for each pair of transactions the following property holds: Tj RF Ti => COMMITi < COMMITj (in the log)
A schedule avoids cascading rollbacks if: Tj RF Ti => COMMITi < Rj[X]
A schedule is strict if: Wi[X] < Oj[X] => COMMITi < Oj[X] or ABORTi < Oj[X], for any read or write operation Oj[X]
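These properties can be checked mechanically. A sketch that tests recoverability of a schedule, where the tuple encoding of operations is an assumption for illustration (ACR and strictness checks would be analogous):

    def is_recoverable(schedule):
        # schedule: list of ("R"|"W", txn, item) and ("C", txn) tuples
        commit_pos = {op[1]: i for i, op in enumerate(schedule) if op[0] == "C"}
        last_writer = {}                       # item -> last txn that wrote it
        inf = float("inf")
        for op in schedule:
            if op[0] == "W":
                last_writer[op[2]] = op[1]
            elif op[0] == "R":
                tj, ti = op[1], last_writer.get(op[2])
                if ti is not None and ti != tj:               # Tj RF Ti
                    if commit_pos.get(ti, inf) > commit_pos.get(tj, inf):
                        return False          # COMMITi does not precede COMMITj
        return True

    # T2 reads X from uncommitted T1 and commits first: not recoverable.
    s = [("W", "T1", "X"), ("R", "T2", "X"), ("C", "T2"), ("C", "T1")]
    print(is_recoverable(s))   # False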
Implementation of strictness by 2PL: hold your locks until ABORT/COMMIT
Relates to SQL Isolation Levels
Up till now: system failures
Memory lost; data on disk undamaged
What to do in case of disk failure?
Archiving: always keep a copy of your entire DB
Full dump or incremental dump
Keep the logs since the last dump
Archive copy + log = actual DB
… that’s why you should keep your log file on a separate disk
Checkpoint for REDO
Non-quiescent checkpoint
>> see the exercises