1 cs411 database systems 12: recovery obama and eric schmidt sysadmin song


Upload: jase-bebb

Post on 29-Mar-2015


Page 1: 1 CS411 Database Systems 12: Recovery obama and eric schmidt  sysadmin song

1

CS411 Database Systems

12: Recovery

obama and eric schmidt http://www.youtube.com/watch?v=k4RRi_ntQc8
sysadmin song http://www.youtube.com/watch?v=udhd9fmOdCs
14th century sysadmin http://www.youtube.com/watch?v=8UXAF-CUmIA

Page 2

Bad things happen, but the DB contents must live on regardless.

System crashes are the most common problem.

We’ll worry about media failure later.

Page 3

On restart, some transactions should be aborted, others must be durable.

[Figure: timeline of transactions T1 through T5, with the crash point marked.]

T1, T2, T3 should be durable.

T4, T5 should be aborted.

Page 4

Recovery has a big impact on buffer management.

Force T’s writes to disk at commit time?
– Poor response time.
– If not, how do we guarantee durability?

Steal dirty buffer pool pages from uncommitted Tns?
– If not, poor throughput.
– If so, what about atomicity? If T aborts, must undo T’s writes on disk!

            No Steal    Steal
Force       Trivial
No Force                Desired

Page 5

The log helps us guarantee atomicity and durability.

Append-only file with all info needed to REDO or UNDO every write

Give it its own disk (why?)

Page 6

Undo Logging

(force, steal)

Page 7

Undo logs don’t need to save after-images

Log record types:

<START T> – transaction T has begun

<COMMIT T> – T has committed

<ABORT T> – T has aborted

<T, X, old_v> – T has updated element X, and its old value was old_v

Page 8

Undo logging has 2.5 rules.

U1: If T modifies X, then the log record <T, X, old_v> must be on disk before X is written to disk

U2: If T commits, then <COMMIT T> can’t be written to disk until all data changes by T are on disk (“early OUTPUTs”)

(There may be many pages to force, and other Tns may want them in memory.)

U2.5: Need to do the right thing when a transaction aborts (what?)

(Buffer management rule, not a logging rule.)
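
Rules U1 and U2 are about the order in which records and data reach disk. The following is a minimal sketch of that ordering under a toy in-memory model. The class `UndoLoggedStore`, its fields, and its method names are hypothetical (they are not from the slides), and every log append is assumed to be flushed immediately.

```python
class UndoLoggedStore:
    """Toy model of undo logging (force, steal). All names are illustrative."""

    def __init__(self, disk):
        self.disk = dict(disk)   # element -> value currently "on disk"
        self.log = []            # log records, assumed flushed as soon as appended
        self.pending = {}        # txn -> {element: new value} held only in memory

    def start(self, t):
        self.log.append(("START", t))
        self.pending[t] = {}

    def write(self, t, x, new_v):
        # In-memory update only; nothing reaches disk yet.
        self.pending[t][x] = new_v

    def output(self, t, x):
        # U1: <T, X, old_v> is on disk before X itself is written to disk.
        self.log.append((t, x, self.disk[x]))
        self.disk[x] = self.pending[t].pop(x)

    def commit(self, t):
        # U2: all of T's data changes reach disk before <COMMIT T> is logged.
        for x in list(self.pending[t]):
            self.output(t, x)
        self.log.append(("COMMIT", t))
        del self.pending[t]
```

Calling `output` before `commit` models a stolen buffer page: the old value is already in the log, so the write can be undone if T never commits.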

Page 9

Crash recovery is easy with an undo log.

1. Scan log, decide which transactions T completed.
   <START T>….<COMMIT T>….
   <START T>….<ABORT T>…….
   <START T>………………………

2. Starting from the end of the log, undo all modifications made by incomplete transactions.

The chance of crashing during recovery is relatively high!

But undo recovery is idempotent: just restart it if it crashes.

Page 10

Detailed algorithm for undo log recovery

From the last entry in the log to the first:
– <COMMIT T>: mark T as completed
– <ABORT T>: mark T as completed
– <T,X,v>: if T is not completed then write X=v to disk, else ignore
– <START T>: ignore

So how should we handle ordinary Tn aborts?
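
The backward scan above can be sketched in a few lines. This is an illustration only: the tuple encoding of log records (`("START", T)`, `("COMMIT", T)`, `("ABORT", T)`, and `(T, X, old_v)` for updates) and the dictionary standing in for the disk are assumptions, not something the slides prescribe.

```python
def undo_recover(log, disk):
    """Scan an undo log from the last record to the first, restoring old values."""
    completed = set()
    for record in reversed(log):
        kind = record[0]
        if kind in ("COMMIT", "ABORT"):
            completed.add(record[1])      # mark T as completed
        elif kind == "START":
            continue                      # <START T>: ignore
        else:
            t, x, old_v = record          # update record <T, X, v>
            if t not in completed:
                disk[x] = old_v           # write X = v to disk
    return disk


log = [
    ("START", "T1"), ("T1", "A", 1),
    ("START", "T2"), ("T2", "B", 2),
    ("COMMIT", "T1"),
    ("T2", "B", 5),
]
disk = undo_recover(log, {"A": 10, "B": 7})
```

Running the function a second time on its own output changes nothing, which is the idempotence the previous slide relies on.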

Page 11

Undo recovery practice

…
<T6,X6,v6>
…
…
<START T5>
<START T4>
<T1,X1,v1>
<T5,X5,v5>
<T4,X4,v4>
<COMMIT T5>
<T3,X3,v3>
<T2,X2,v2>

Which actions do we undo, in which order?

What could go wrong if we undid them in a different order?

Page 12

Scanning a year-long log is SLOW and businesses lose money every minute their DB is down.

Solution: checkpoint the database periodically.

Easy version:

1. Stop accepting new transactions

2. Wait until all current transactions complete

3. Flush log to disk

4. Write a <CKPT> log record, flush

5. Resume transactions

Page 13

During undo recovery, stop at first checkpoint.

…
…
<T9,X9,v9>
…
…
(all completed)
<CKPT>
<START T2>
<START T3>
<START T5>
<START T4>
<T1,X1,v1>
<T5,X5,v5>
<T4,X4,v4>
<COMMIT T5>
<T3,X3,v3>
<T2,X2,v2>

[Figure labels: "other transactions" for the part of the log before <CKPT>; "T2,T3,T4,T5" for the part after it.]

Page 14

This “quiescent checkpointing” isn’t good enough for 24/7 applications. Instead:

1. Write <START CKPT(T1,…,Tk)>, where T1,…,Tk are all active transactions

2. Continue normal operation

3. When all of T1,…,Tk have completed, write <END CKPT>

Page 15

Example of undo recovery with nonquiescent checkpointing

…                            (earlier transactions plus T4, T5, T6)
<START CKPT T4, T5, T6>
…                            (T4, T5, T6, plus later transactions)
<END CKPT>
…                            (later transactions)

What would go wrong if we didn’t use <END CKPT>?

Page 16

Crash recovery algorithm with undo log, nonquiescent checkpoints.

1. Scan log backwards until the start of the latest completed checkpoint, deciding which transactions T completed.
   <START T>….<COMMIT T>….
   <START T>….<ABORT T>…….
   <START CKPT {T…}>….<COMMIT T>….
   <START CKPT {T…}>….<ABORT T>…….
   <START T>………………………

2. Starting from the end of the log, undo all modifications made by incomplete transactions.
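
As a rough sketch of the two steps above: first find the <START CKPT> that belongs to the latest <END CKPT>, then run the usual backward undo pass over the log from that point on. The tuple encoding (`("START_CKPT", active_list)`, `("END_CKPT",)`, `(T, X, old_v)` and so on) and both function names are assumptions made for illustration, and checkpoints are assumed not to overlap.

```python
def undo_scan_start(log):
    """Index of the <START CKPT> of the latest completed checkpoint, else 0."""
    end_seen = False
    for i in range(len(log) - 1, -1, -1):
        kind = log[i][0]
        if kind == "END_CKPT":
            end_seen = True
        elif kind == "START_CKPT" and end_seen:
            return i          # first <START CKPT> behind an <END CKPT>
    return 0                  # no completed checkpoint: scan the whole log


def undo_recover_from_checkpoint(log, disk):
    """Backward undo pass restricted to the log after the checkpoint start."""
    completed = set()
    for record in reversed(log[undo_scan_start(log):]):
        kind = record[0]
        if kind in ("COMMIT", "ABORT"):
            completed.add(record[1])
        elif kind in ("START", "START_CKPT", "END_CKPT"):
            continue
        elif record[0] not in completed:
            _, x, old_v = record
            disk[x] = old_v
    return disk
```

A <START CKPT> with no <END CKPT> after it is skipped, because some transaction on its active list may still have older records that need undoing.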

Page 17

Redo Logging

(no force, no steal)

Page 18

Redo log entries are just slightly different from undo log entries.

<START T>, <COMMIT T>, <ABORT T> – same as before

<T, X, new_v> – T has updated element X, and its new value is new_v

Page 19

Redo logging has one rule.

R1: If T modifies X, then both <T, X, new_v> and <COMMIT T> must be written to disk before X is written to disk (“late OUTPUT”)

Don’t have to force all those dirty data pages to disk before committing!

Don’t steal dirty buffer pages from uncommitted tns!

Implicit and reasonable assumption: log records reach disk in order; otherwise terrible things will happen.

Page 20

Recovery is easy with a redo log.

1. Decide which transactions T completed.
   <START T>….<COMMIT T>….
   <START T>….<ABORT T>…….
   <START T>………………………

2. Read log from the beginning, redo all updates of committed transactions.

The chance of crashing during recovery is relatively high!

But REDO recovery is idempotent: just restart it if it crashes.
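
A minimal sketch of the two redo steps, assuming log records are encoded as tuples (`("START", T)`, `("COMMIT", T)`, `("ABORT", T)`, and `(T, X, new_v)` for updates) and the disk is a dictionary. The encoding and the function name `redo_recover` are illustrative assumptions.

```python
def redo_recover(log, disk):
    """Replay, in log order, the updates of committed transactions only."""
    # Step 1: decide which transactions committed.
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    # Step 2: read the log from the beginning and redo their updates.
    for rec in log:
        if rec[0] in ("START", "COMMIT", "ABORT"):
            continue
        t, x, new_v = rec
        if t in committed:
            disk[x] = new_v
    return disk


log = [
    ("START", "T1"), ("T1", "A", 5),
    ("START", "T2"), ("T2", "B", 6),
    ("COMMIT", "T2"),
    ("T1", "C", 7),
]
disk = redo_recover(log, {"A": 0, "B": 0, "C": 0})
```

Uncommitted transactions need no work here because, under the redo rule, none of their writes can have reached disk.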

Page 21

Example of redo recovery

<START T1>
<T1,X1,v1>
<START T2>
<T2, X2, v2>
<START T3>
<T1,X3,v3>
<COMMIT T2>
<T3,X4,v4>
<T1,X5,v5>
…
…

Which actions do we redo, in which order?

What could go wrong if we redid them in a different order?

Page 22

Nonquiescent checkpointing is trickier with a redo log than an undo log

1. Write a <START CKPT(T1,…,Tk)>, where T1,…,Tk are the active transactions

2. Flush to disk all dirty data pages of transactions committed by the time the checkpoint started, while continuing normal operation

3. After that, write <END CKPT>

dirty = written

Page 23

What exactly does “dirty” mean?

• When you are talking about buffer management and which buffers you can steal, a dirty page is a data page in memory that has been modified but not yet sent back to disk.

• When you are talking about concurrency control, a dirty page is a data page in memory that has been modified but not yet committed. A dirty read is a read of a dirty page.

Either way, the dirty pages are the ones that can get you in trouble.

Page 24

Example of redo recovery with nonquiescent checkpointing

…
<START T1>
…
<COMMIT T1>
…
…
<START CKPT T4, T5, T6>
…
…
<END CKPT>
…
…
<START CKPT T9, T10>
…

1. Look for the last <END CKPT>.

2. Redo from <START T>, for committed T in {T4, T5, T6}.

3. Normal redo for committed Tns that started after this point.

All data written by T1 is known to be on disk.
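
The steps above mostly come down to deciding where the forward redo scan may begin. A hedged sketch, assuming tuple-encoded records (`("START_CKPT", active_list)`, `("END_CKPT",)`, `("START", T)`, and so on) and non-overlapping checkpoints; the function name `redo_scan_start` is hypothetical.

```python
def redo_scan_start(log):
    """Earliest log index that redo recovery has to read."""
    end_seen = False
    for i in range(len(log) - 1, -1, -1):
        kind = log[i][0]
        if kind == "END_CKPT":
            end_seen = True                      # step 1: the last <END CKPT>
        elif kind == "START_CKPT" and end_seen:
            active = set(log[i][1])
            # Step 2: go back to the earliest <START T> of a listed transaction.
            starts = [j for j in range(i)
                      if log[j][0] == "START" and log[j][1] in active]
            return min(starts, default=i)
    return 0                                     # no completed checkpoint


log = [
    ("START", "T1"), ("T1", "A", 1), ("COMMIT", "T1"),
    ("START", "T4"), ("T4", "B", 2),
    ("START_CKPT", ("T4",)),
    ("START", "T5"),
    ("END_CKPT",),
    ("COMMIT", "T4"),
    ("START_CKPT", ("T5",)),
]
first = redo_scan_start(log)
```

In this log the scan starts at <START T4>. T1's records lie before that point and are never read, since T1 committed before the checkpoint started and its data is known to be on disk.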

Page 25

But neither undo nor redo logging matches what we would like to have for buffer management

            No Steal        Steal
Force       Trivial         Undo Logging
No Force    Redo Logging    Desired

Use undo/redo logging to attain this nirvana.

Page 26

Redo/undo logs save both before-images and after-images.

<START T>, <COMMIT T>, <ABORT T>

<T, X, old_v, new_v> – T has written element X; its old value was old_v, and its new value is new_v

Page 27

Undo/redo recovery has 1.5 rules.

1. Must force the log record for an update to disk before the corresponding data page goes to disk. (“Write-ahead logging”)

1.5: Need to do the right thing when a transaction aborts (what?)

As usual, T committed iff <T commits> is on disk.

Item X can be updated on disk once <T wrote X> is on disk, even before <T commits> is on disk (i.e., early or late OUTPUT).

Page 28

Recovery is more complex with undo/redo logging.

1. Redo all committed transactions, starting at the beginning of the log

2. Undo all incomplete transactions, starting from the end of the log

“incomplete” = started & not committed or aborted

<START T1>
<T1,X1,v1>
<START T2>
<T2, X2, v2>
<START T3>
<T1,X3,v3>
<COMMIT T2>
<T3,X4,v4>
<T1,X5,v5>
…
…

[Figure: REDO arrow running forward along the log, UNDO arrow running backward.]

How do we know these undos won’t undo some committed writes?
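
The two passes can be sketched as follows. This is an illustration under stated assumptions: update records are encoded as `(T, X, old_v, new_v)` tuples, the disk is a dictionary, aborted transactions are assumed to have been rolled back already, and the function name `undo_redo_recover` is hypothetical.

```python
def undo_redo_recover(log, disk):
    """Redo committed transactions forward, then undo incomplete ones backward."""
    committed = {r[1] for r in log if r[0] == "COMMIT"}
    aborted = {r[1] for r in log if r[0] == "ABORT"}
    updates = [r for r in log if r[0] not in ("START", "COMMIT", "ABORT")]

    # Pass 1: redo, from the beginning of the log.
    for t, x, old_v, new_v in updates:
        if t in committed:
            disk[x] = new_v

    # Pass 2: undo, from the end of the log.
    for t, x, old_v, new_v in reversed(updates):
        if t not in committed and t not in aborted:
            disk[x] = old_v
    return disk


log = [
    ("START", "T1"), ("T1", "A", 1, 2),
    ("START", "T2"), ("T2", "B", 10, 20),
    ("COMMIT", "T2"),
    ("T1", "C", 5, 6),
]
# A was stolen to disk early; B had not been written out when the crash hit.
disk = undo_redo_recover(log, {"A": 2, "B": 10, "C": 5})
```

Here T2's write to B is redone and T1's early write to A is rolled back, which is what allowing both steal and no-force requires of recovery.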

Page 29

Algorithm for non-quiescent checkpoint for undo/redo

1. Write <start checkpoint, list of all active transactions> to log

2. Flush log to disk

3. Write to disk all dirty buffers, whether or not their transaction has committed (this implies some log records may need to be written to disk (WAL))

4. Write <end checkpoint> to log

5. Flush log to disk

[Figure: log containing <start checkpoint, active Tns are T1, T2, …> and <end checkpoint>; labels: "Flush dirty buffer pool pages", "Active Tns".]

Pointers are one of many tricks to speed up future undos.

Page 30

Algorithm for undo/redo recovery with nonquiescent checkpoint

1. Backwards undo pass (end of log to start of last completed checkpoint)
   a. C = transactions that committed after the checkpoint started
   b. Undo actions of transactions that (are in A or started after the checkpoint started) and (are not in C)

2. Undo remaining actions by incomplete transactions
   a. Follow undo chains for transactions in (checkpoint active list) – C

3. Forward pass (start of last completed checkpoint to end of log)
   a. Redo actions of transactions in C

Here A is the list of active Tns recorded in <start checkpoint, A=active Tns>.

[Figure: log with <start checkpoint, A=active Tns> … <end checkpoint>; labels: "Active Tns", UNDO, REDO.]

Page 31

Example: what to do at recovery time?

…
T1 wrote A
…
checkpoint start (T1 active)
…
T1 wrote B
…
checkpoint end
…
T1 wrote C
…

no <T1 commit>: Undo T1 (undo A, B, C)

Page 32

Example: what to do at recovery time?

…
T1 wrote A
…
checkpoint start (T1 active)
…
T1 wrote B
…
checkpoint end
…
T1 wrote C
…
T1 commit

Redo T1 (redo B, C)

Page 33

Real world actions

E.g., dispense cash at ATM

Ti = a1 … aj … an

[Figure: $ marking the cash-dispensing action.]

Why are these a problem from a DB perspective?

“Solution”:

(1) try to make idempotent

(2) execute real-world actions after commit

Page 34

PHYSICAL DISASTERS

Page 35

These recovery algorithms won’t help you if your disk fails.

Solution: careful replication!

Page 36

Example 1: Triple modular redundancy

Keep 3 copies on separate disks
• Output(X) --> three outputs
• Input(X) --> three inputs + vote

[Figure: three disks labeled Copy 1, Copy 2, Copy 3.]
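
A small sketch of the output and vote steps, with three dictionaries standing in for the three disks. The function names `tmr_output` and `tmr_input` are hypothetical, and stored values are assumed to be hashable so they can be counted.

```python
from collections import Counter


def tmr_output(copies, x, value):
    """Output(X): write the value to every copy."""
    for copy in copies:
        copy[x] = value


def tmr_input(copies, x):
    """Input(X): read all copies and return the value at least two agree on."""
    value, votes = Counter(copy.get(x) for copy in copies).most_common(1)[0]
    if votes < 2:
        raise ValueError("no two copies agree on " + repr(x))
    return value


copies = [{}, {}, {}]
tmr_output(copies, "X", 42)
copies[1]["X"] = -1            # simulate one copy going bad
result = tmr_input(copies, "X")
```

The vote masks a single bad copy without needing to know which copy is the bad one.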

Page 37

Example 2: Redundant writes, single reads

Keep N copies on separate disks
• Output(X) --> N outputs
• Input(X) --> Input one copy
  - if ok, done
  - else try another one

Assumes bad data can be detected (traditional but false)

[Figure: N disks, each holding a copy.]
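
One possible sketch of this scheme, assuming a CRC-32 checksum stored next to each copy as the "is it ok?" test. The checksum choice and the function names are assumptions, not from the slides, and a checksum only catches some corruption, which is the caveat the slide raises.

```python
import zlib


def output_n(copies, x, data):
    """Output(X): write the bytes and their checksum to every copy."""
    for copy in copies:
        copy[x] = (data, zlib.crc32(data))


def input_one(copies, x):
    """Input(X): return the first copy whose checksum still matches."""
    for copy in copies:
        data, crc = copy[x]
        if zlib.crc32(data) == crc:
            return data            # if ok, done
        # else try another one
    raise IOError("every copy of " + repr(x) + " failed its check")


copies = [{}, {}, {}]
output_n(copies, "X", b"hello")
copies[0]["X"] = (b"hellp", copies[0]["X"][1])   # corrupt the first copy
data = input_one(copies, "X")
```

Reads touch one disk in the common case, so this is cheaper than voting on every input.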

Page 38

Example 3: DB dump + log

[Figure: backup database, active database, log.]

If active database is lost,
– restore active database from backup
– bring up-to-date using redo entries in log
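
The two recovery steps can be sketched as below, assuming the log holds only records written since the dump, that update records carry new values as `(T, X, new_v)` tuples, and that the databases are dictionaries. The function name `media_recover` is hypothetical.

```python
def media_recover(backup, log):
    """Rebuild the active database from a dump plus the redo entries in the log."""
    db = dict(backup)                       # restore active database from backup
    committed = {r[1] for r in log if r[0] == "COMMIT"}
    for r in log:                           # bring up to date using redo entries
        if r[0] in ("START", "COMMIT", "ABORT"):
            continue
        t, x, new_v = r
        if t in committed:
            db[x] = new_v
    return db


backup = {"A": 1, "B": 2}
log = [
    ("START", "T1"), ("T1", "A", 5), ("COMMIT", "T1"),
    ("START", "T2"), ("T2", "B", 9),
]
db = media_recover(backup, log)
```

The dump is copied rather than modified, so it remains usable if recovery has to be repeated.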

Page 39

When can log be discarded?

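[Figure: the log drawn along a time axis, with markers for the checkpoint, the db dump, and the last needed undo, and regions labeled "not needed for media recovery", "not needed for undo after system failure", and "not needed for redo after system failure".]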

Page 40

The real picture: what’s stored where

LOG
– LogRecords, each with: prevLSN, XID, type, length, pageID, offset, before-image, after-image
– LSN = log sequence number

DB
– Data pages, each with a pageLSN (LSN of last write to that data page)
– Master record

RAM
– Xact Table: lastLSN, status
– Dirty Page Table: recLSN
– flushedLSN
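
To make the log record fields concrete, here is a hedged sketch of a record type and of walking one transaction's records backward. It assumes prevLSN points at the same transaction's previous log record, as in ARIES-style logs; the Python names are illustrative, the slide's "type" field is called `kind` here, and "length" is omitted because it follows from the images.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int                      # log sequence number of this record
    prev_lsn: Optional[int]       # assumed: previous record of the same transaction
    xid: str                      # transaction id
    kind: str                     # the slide's "type" field
    page_id: Optional[int] = None
    offset: int = 0
    before_image: bytes = b""
    after_image: bytes = b""


def undo_chain(log, last_lsn):
    """Yield one transaction's records newest-first by following prev_lsn."""
    lsn = last_lsn
    while lsn is not None:
        record = log[lsn]
        yield record
        lsn = record.prev_lsn


log = {
    1: LogRecord(1, None, "T1", "update"),
    2: LogRecord(2, None, "T2", "update"),
    3: LogRecord(3, 1, "T1", "update"),
}
chain = [r.lsn for r in undo_chain(log, 3)]
```

Starting from a transaction's lastLSN in the Xact Table, the chain visits only that transaction's records instead of scanning the whole log.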

Page 41

Summary of Logging/Recovery

• Recovery manager guarantees atomicity & durability, two of the ACID properties.

• Redo logging and undo logging are simple but make the system too slow in practice for serious applications.

• Use write-ahead logging with undo/redo logging to speed up the system (by allowing STEAL/NO-FORCE) without sacrificing correctness.

Page 42

Summary, Cont.

• Checkpointing: A quick way to limit the amount of log to scan on recovery. Nonquiescent checkpoints are especially useful.

• Recovery works in 3 phases:
– Analysis: Forward from checkpoint.
– Redo: Forward.
– Undo: Backward.