Log File Management in DB2 for Linux, UNIX, and Windows
Ron Castelletto, IBM Canada [email protected]
March 1, 2009
© 2009 IBM Corporation
Information Management – DB2
Agenda
How does DB2 ensure no data loss on a crash?
Log file management
Log file archiving
Removing single points of failure
Database Fundamentals
The ACID properties
– Atomicity – all actions in the TXN happen, or none happen
– Consistency – if each TXN is consistent, and the DB starts as consistent, the DB ends up consistent
– Isolation – execution of one TXN is isolated from that of other TXNs
– Durability – if a TXN commits, its effects persist
The Recovery Manager ensures Atomicity and Durability
Atomicity and Durability
Atomicity
– Transactions may abort (roll back)
Durability
– What if DB2 stops running (e.g., power failure)?
After a system crash
– T1, T2, T3 should be durable
– T4 and T5 should abort
Properties of Database Transactions
Concurrency control is in effect
– Share (read) and exclusive (write) locking
Updates happen in place
– Data is overwritten on disk
A simple solution for atomicity and durability?
– Use a “force” and “no steal” buffer pool
The Buffer Pool
“Force” write of data pages to disk at commit
– Provides durability
– But poor performance
“No stealing” of pages from uncommitted TXNs
– However, “no steal” implies poor performance
– But, if “steal” is allowed, how to ensure atomicity?
[Figure: 2×2 grid with columns No Steal / Steal and rows Force / No Force]
Forcing and Stealing Pages
No Force (why Durability is hard)
– Reminder: “Force” is writing all modified pages for this TXN at commit time
– What if the system crashes before all pages are written?
– Need to know what changes were lost before the I/O completed, so they can be redone.
Steal (why enforcing Atomicity is hard)
– Reminder: “Steal” is allowing a new TXN to steal a slot in the buffer pool, i.e., writing a modified page to disk before the TXN commits
– What if the TXN aborts? Need to know what modifications were written to disk, so the changes can be undone.
Solution: write log records to support redo/undo of changes
Logging
DB2 records all modifications for every update in log records
– Log records are written sequentially to log file(s)
– Log files should be placed on a different disk
– A log record should contain as little as possible (old/new data)
– Multiple log records fit on a single log page
– Log records represent an ordered list of update TXNs
Write Ahead Logging
Write Ahead Logging Protocol
– 1) Log records for an update must be written to disk before the corresponding data page is written
– 2) All log records for a TXN must be written to disk when it commits
1) guarantees Atomicity
2) guarantees Durability
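The two rules can be illustrated with a toy model. The Python sketch below is not DB2 code: the `ToyDatabase` class, its method names, and the string page values are hypothetical, chosen only to show how flushing the log before a page write (rule 1) and at commit (rule 2) lets recovery undo uncommitted changes and redo committed ones.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRecord:
    lsn: int
    txn: str
    kind: str                      # "update" or "commit"
    page: Optional[str] = None
    old: Optional[str] = None
    new: Optional[str] = None


class ToyDatabase:
    """Hypothetical in-memory model of a buffer pool plus a write-ahead log."""

    def __init__(self, pages):
        self.disk_pages = dict(pages)  # survives a crash
        self.disk_log = []             # survives a crash
        self.buffer_pool = {}          # page id -> (value, LSN of last update)
        self.log_buffer = []           # lost on a crash
        self.next_lsn = 1

    def _log(self, txn, kind, page=None, old=None, new=None):
        rec = LogRecord(self.next_lsn, txn, kind, page, old, new)
        self.next_lsn += 1
        self.log_buffer.append(rec)
        return rec

    def flush_log(self, up_to_lsn):
        """Move log records with LSN <= up_to_lsn from the buffer to disk."""
        remaining = []
        for rec in self.log_buffer:
            if rec.lsn <= up_to_lsn:
                self.disk_log.append(rec)
            else:
                remaining.append(rec)
        self.log_buffer = remaining

    def update(self, txn, page, new):
        if page in self.buffer_pool:
            old = self.buffer_pool[page][0]
        else:
            old = self.disk_pages[page]
        rec = self._log(txn, "update", page, old, new)
        self.buffer_pool[page] = (new, rec.lsn)

    def write_page(self, page):
        """Steal: write a page to disk, possibly before its TXN commits."""
        value, page_lsn = self.buffer_pool[page]
        self.flush_log(page_lsn)       # rule 1: log record reaches disk first
        self.disk_pages[page] = value

    def commit(self, txn):
        rec = self._log(txn, "commit")
        self.flush_log(rec.lsn)        # rule 2: all log records flushed at commit


def recover(disk_pages, disk_log):
    """Rebuild page contents from what survived the crash."""
    pages = dict(disk_pages)
    committed = {r.txn for r in disk_log if r.kind == "commit"}
    for r in disk_log:                 # redo committed updates, oldest first
        if r.kind == "update" and r.txn in committed:
            pages[r.page] = r.new
    for r in reversed(disk_log):       # undo uncommitted updates, newest first
        if r.kind == "update" and r.txn not in committed:
            pages[r.page] = r.old
    return pages
```

In this model, a committed TXN whose page never reached disk is redone from the log, and an uncommitted TXN whose page was stolen to disk is undone from the old value in its log record. The simplified recovery relies on the exclusive write locking mentioned earlier, so two in-flight TXNs never update the same page.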
Bufferpool and Log buffer – all log records flushed

[Figure: DB2 agents, buffer pools, table space containers, page cleaners, I/O servers, and the logger. The log buffer holds old/new values for rows 200, 201, 300 and 301 plus commit records, and is written to S0000020.log and S0000021.log in the active log directory. The log control file is SQLOGCTL.LFH.1 / SQLOGCTL.LFH.2. Legend: Read Only, Committed Update, Uncommitted Update.]

Min Buff LSN = oldest changed page in the buffer pool
Low Tran LSN = oldest uncommitted log record
Bufferpool and Log buffer – some logrecs flushed

[Figure: same buffer pool, log buffer, and active log directory layout as the preceding diagram (rows 200, 201, 300, 301; S0000020.log and S0000021.log; SQLOGCTL.LFH.1 / SQLOGCTL.LFH.2). Only some of the log records have been flushed, and the table space containers still hold old values for some rows.]

Min Buff LSN = oldest changed page in the buffer pool
Low Tran LSN = oldest uncommitted log record
ERROR - commit not flushed

[Figure: same buffer pool, log buffer, and active log directory layout as the preceding diagrams (rows 200, 201, 300, 301; S0000020.log and S0000021.log; SQLOGCTL.LFH.1 / SQLOGCTL.LFH.2). The pages show new values, but a commit record is missing from the flushed log.]

Min Buff LSN = oldest changed page in the buffer pool
Low Tran LSN = oldest uncommitted log record
ERROR – page flushed before log record (row 201)

[Figure: same buffer pool, log buffer, and active log directory layout as the preceding diagrams (rows 200, 201, 300, 301; S0000020.log and S0000021.log; SQLOGCTL.LFH.1 / SQLOGCTL.LFH.2). The page holding row 201 has been written to the table space container before its log record reached the log files.]

Min Buff LSN = oldest changed page in the buffer pool
Low Tran LSN = oldest uncommitted log record
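The two ERROR cases correspond to the two write-ahead logging rules being broken. As a rough illustration, the hypothetical Python helper below checks both conditions against the highest LSN that has reached the log files. The function name, its arguments, and the LSN values in the usage note are assumptions made for this sketch, not DB2 internals.

```python
def wal_violations(flushed_lsn, page_lsns_on_disk, acknowledged_commit_lsns):
    """Return a list of write-ahead logging violations.

    flushed_lsn: highest LSN already written to the log files.
    page_lsns_on_disk: page id -> LSN of the last update contained in the
        copy of the page that is on disk.
    acknowledged_commit_lsns: TXN id -> LSN of its commit record, for TXNs
        that were told their commit succeeded.
    """
    problems = []
    for page, lsn in sorted(page_lsns_on_disk.items()):
        if lsn > flushed_lsn:
            # Rule 1 broken: the data page is on disk, its log record is not.
            problems.append(f"page {page} flushed before log record (LSN {lsn})")
    for txn, lsn in sorted(acknowledged_commit_lsns.items()):
        if lsn > flushed_lsn:
            # Rule 2 broken: the commit record never reached the log files.
            problems.append(f"commit of {txn} not flushed (LSN {lsn})")
    return problems
```

For example, with a flushed LSN of 5, a page for row 201 whose on-disk copy carries LSN 6 is reported as flushed before its log record, while a page with LSN 3 and a commit at LSN 4 produce no violations.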
Log Space Reservation

[Figure: active log directory with S0000020.log through S0000025.log. The log files hold records of transactions 1000 and 1001 interleaved with records of other, committed transactions.]
Out Of Log Space

[Figure: active log directory with S0000020.log through S0000024.log. Besides the records of transactions 1000 and 1001 and of other, committed transactions, the remaining log space is marked as reserved for transactions 1000 and 1001.]
MAX_LOG database configuration parameter

Database CFG
logfilsiz 1000
logprimary 6
max_log 50

Limit for a single transaction:
max_log * (logfilsiz * 4096 * logprimary) / 100
50 * (1000 * 4096 * 6) / 100 = 12,288,000 bytes, about 12 MB

[Figure: active log directory with S0000020.log through S0000024.log. Transaction 1000, interleaved with other transactions, reaches the 12 MB limit and receives sqlcode -964.]
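The MAX_LOG arithmetic can be checked with a few lines of Python. This is only a sketch of the formula on the slide; the function name and the `LOG_PAGE_BYTES` constant are hypothetical names introduced here.

```python
LOG_PAGE_BYTES = 4096  # logfilsiz is expressed in 4 KB pages, as in the formula


def max_log_limit_bytes(max_log, logfilsiz, logprimary):
    """Log space one transaction may consume: max_log percent of primary log space."""
    primary_log_bytes = logfilsiz * LOG_PAGE_BYTES * logprimary
    return max_log * primary_log_bytes // 100


print(max_log_limit_bytes(50, 1000, 6))  # prints 12288000
```

With the configuration shown (logfilsiz 1000, logprimary 6, max_log 50), the result is 12,288,000 bytes, which the slide rounds to 12 MB.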
NUM_LOG_SPAN database cfg parameter

Database CFG
logfilsiz 1000
logprimary 4
num_log_span 2

[Figure: active log directory with S0000020.log through S0000023.log. Transaction 1000, interleaved with other transactions, exceeds num_log_span = 2; DB2 forces the application, which receives SQLCODE -964.]
Infinite active logs - LOGSECOND = -1

Database Configuration
LOGSECOND = -1
LOGPRIMARY = 6
LOGARCHMETH1 = DISK:/db/archive
OVERFLOWLOGPATH = /db/tp1/ologs
FIRST ACTIVE LOG = S0000100.LOG

[Figure: the Log Manager moves log files between the active log directory and the archived logs; LOG 98 through LOG 126 are shown. The Log CTL File records the Low Tran LSN.]
Controlling Minbuff - SOFTMAX
Specifies the percentage of logfilsiz at which a soft checkpoint will:
– Write the log control file to disk (SQLOGCTL.LFH)
– Call an asynchronous page cleaner (log space page cleaners)
Range: 1 to 100 * number of primary logs.
Default = 100 (a soft checkpoint in every log file).
A lower value reduces the time required to restart a database during crash recovery.
The smaller the value, the greater the overhead on normal database logging activity, because of the extra page cleaning.
Database crash recovery - Softmax = 200

LOGFILSIZ = 1000
SOFTMAX = 200
200% * ( 1000 * 4K ) = 8MB

[Figure: buffer pools, table space containers, page cleaners, and the active log directory (S0000020.log, S0000021.log, S0000022.log) holding log records 1001 to 1010. Markers show Min Buff LSN, the current log position, and the updates to the Log CTL File. Legend: change written to disk / change NOT written to disk.]
Database crash recovery - Softmax = 50

LOGFILSIZ = 1000
SOFTMAX = 50
50% * ( 1000 * 4K ) = 2MB

[Figure: buffer pools, table space containers, page cleaners, and the active log directory (S0000020.log, S0000021.log, S0000022.log) holding log records 1001 to 1010. Markers show Min Buff LSN, the current log position, and the updates to the Log CTL File. Legend: change written to disk / change NOT written to disk.]
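The two SOFTMAX examples use the same arithmetic: a percentage of one log file's size. A minimal Python sketch follows; the function name is hypothetical, and it uses exactly 4096 bytes per page where the slides round to 4K.

```python
def soft_checkpoint_interval_bytes(softmax, logfilsiz):
    """Amount of log written between soft checkpoints: softmax percent of one log file."""
    log_file_bytes = logfilsiz * 4096  # logfilsiz is in 4 KB pages
    return softmax * log_file_bytes // 100
```

With LOGFILSIZ = 1000, SOFTMAX = 200 gives 8,192,000 bytes (the 8MB on the slide) and SOFTMAX = 50 gives 2,048,000 bytes (the 2MB on the slide). The smaller interval leaves less log to replay during crash recovery, at the cost of more page cleaning during normal operation.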
DB2 log archive processing

[Figure: active log directory with S0000020.log, S0000021.log and S0000022.log. Transactions 1000, 1001, 1002 and 1003 write log records across the files; the DB2 Log Manager acts at points 1 to 4.]

Point 1
Activity: Log S0000020 full
Action taken: Log Manager archives Log 20. Transaction 1000 is the oldest current transaction (lowtran lsn). Cannot rename log 20.

Point 2
Activity: Transaction 1000 commits
Action taken: Log 20 is no longer an active log and can be renamed to Log 23.

Point 3
Activity: Log S0000021 full
Action taken: Log Manager archives Log 21. Transaction 1002 is the oldest current transaction (lowtran lsn). Cannot rename log 21.

Point 4
Activity: Transaction 1002 commits
Action taken: Log S0000021 can be renamed to Log 24.
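The four points suggest a simple rule: a log file can be renamed for reuse only once it has been archived and no active transaction still needs it. The Python sketch below is one reading of the slide, not DB2's implementation; the helper names and the idea of tracking the first active log as a file number are assumptions.

```python
def log_file_name(number):
    """Format a log file name in the style shown on the slides, e.g. S0000020.log."""
    return f"S{number:07d}.log"


def can_rename(log_number, archived, first_active_log):
    """A log file may be reused when it is archived and older than the first active log."""
    return archived and log_number < first_active_log


# Point 1: log 20 is archived, but transaction 1000 still keeps it active.
print(can_rename(20, archived=True, first_active_log=20))   # prints False
# Point 2: transaction 1000 has committed, so log 21 is now the first active log.
print(can_rename(20, archived=True, first_active_log=21))   # prints True
print(log_file_name(23))                                    # prints S0000023.log
```

The same check returns False when `archived` is False, which is the situation where archiving fails and an inactive log file has to stay on disk.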
DB2 log archive processing - Backlog

[Figure: active log directory with S0000020.log, S0000021.log and S0000022.log. Transactions 1000, 1001, 1002 and 1003 write log records across the files; the DB2 Log Manager acts at points 1 to 4.]

Point 1
Activity: Log S0000020 full
Action taken: Log Manager begins to archive Log 20 but FAILS. Transaction 1000 is the oldest current transaction (lowtran lsn).

Point 2
Activity: Log S0000021 full
Action taken: Log Manager does not archive log 21 until log 20 is done.

Point 3
Activity: Transaction 1000 commits
Action taken: Log 20 is not an active log and is not archived. Txn 1003 is lowtran lsn. Log 21 is the first active log. Log 23 is created.

Point 4
Activity: Active log dir contains 4 logs
Action taken: Log 20 is not active, and unarchived. Logs 21-23 are active logs.
What if Archiving does not work for a long time?
Eventually, the filesystem becomes full ....
DB2 cannot pre-allocate log files to maintain LOGPRIMARY ...
Later, DB2 runs out of the disk space provided by the pre-allocated log files.
This scenario is 'out of disk space', distinct from 'out of log space'
Default behaviour: the database comes down with a severe error
How to avoid this? 2 solutions ...
Log disk full condition handling

CFG
.....
BLK_LOG_DSK_FUL = YES
.....

[Figure: DB2 database, log buffer, active log files in the active log path, inactive unarchived logs, offline archive log files, and the DB2 Log Manager.]

DISK FULL!
If NO (default)
– Database comes down with severe error (out of disk space)
If YES
– Update transactions wait for new log
– Read only transactions continue
– Write message to db2diag.log
– Wait 5 minutes and retry creating new log
– Check for completed log archive and rename log
DB2 log archive processing - FAILARCHPATH

[Figure: active log directory with S0000020.log, S0000021.log and S0000022.log, and a FAILARCH directory that receives S0000020.log. Transactions 1000, 1001, 1002 and 1003 write log records across the files; the DB2 Log Manager acts at points 1 to 4.]

Point 1
Activity: Log S0000020 full
Action taken: Log Manager begins to archive Log 20 but FAILS. Transaction 1000 is the oldest current transaction (lowtran lsn).

Point 2
Activity: Log S0000021 full
Action taken: Log Manager does not archive log 21 until log 20 is done.

Point 3
Activity: Transaction 1000 commits
Action taken: Log 20 is no longer an active log, but is not archived. Txn 1003 is lowtran lsn. Log 21 is the first active log. Log 20 is moved to FAILARCHPATH. Log 23 is created.

Point 4
Activity: Archiving works again
Action taken: Log 20 is archived from FAILARCHPATH, Log 21 is archived from the active log dir.
Protecting the DB2 logs

Recovery from log media failure:
– Operating system facilities: software RAID and disk mirrors
– RAID hardware: RAID 1 (mirroring), RAID 5 (parity based)

No protection from accidental deletion of active logs.

[Figure: the DB2 database writes its log buffer to the DB2 logs on a primary disk; operating system mirroring or hardware RAID maintains a mirror copy on a mirror disk.]
Protecting the DB2 Active logs - Log mirroring

CFG
.....
LOGPATH = /database/tp1/logs
MIRRORLOGPATH = /database/tp1/logmir
.....

Protection from accidental deletion of active logs.

[Figure: the DB2 database writes its log buffer to DB2 active logs copy 1 in /database/tp1/logs and copy 2 in /database/tp1/logmir.]
Archiving logs to Local and Remote disks

db2 update db cfg for salesdb using logarchmeth1 disk:/dbarchlcl
db2 update db cfg for salesdb using logarchmeth2 disk:/dbarchdr

Local archive logs: /dbarchlcl/INST1/SALESDB/NODE0000
Remote archive logs: /dbarchdr/INST1/SALESDB/NODE0000

Path includes:
– Instance Name
– Database Name
– Database Partition (NODExxxx)

[Figure: the salesdb database (table spaces, database directory, active logs, DB CFG, recovery history) and the db2 Log Manager, which archives logs to the local and remote archive locations on systems SYS1 and SYS2.]
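The archive paths on this slide are built from the archive target followed by the instance name, database name, and database partition. A small Python sketch of that composition follows; the function name is hypothetical, and it covers only the three components the slide lists.

```python
def archive_directory(target, instance, database, partition=0):
    """Join the archive target with instance, database and NODExxxx partition directory."""
    return f"{target.rstrip('/')}/{instance}/{database}/NODE{partition:04d}"


print(archive_directory("/dbarchlcl", "INST1", "SALESDB"))
# prints /dbarchlcl/INST1/SALESDB/NODE0000
```

Calling it with "/dbarchdr" instead gives the remote archive location shown on the slide, /dbarchdr/INST1/SALESDB/NODE0000.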