CMU Database Systems - 19 Database Logging & Shadow Paging
Failure classification
- Transaction failure (logical errors, internal state errors, deadlock)
- System failure (software failure, hardware failure)
- Storage Media failure
UNDO VS REDO
undo: The process of removing the effects of an incomplete or aborted txn
redo: The process of re-applying the effects of a committed txn for durability




Question: when is it okay to write dirty pages out to disk? And when we say a transaction has committed, what must we have done with its dirty pages?
There are two policies:
STEAL policy
Whether the DBMS allows an uncommitted txn to overwrite the most recent committed value of an object in non-volatile storage
- STEAL: allowed
- NO-STEAL: not allowed. Any page modified by a txn that has not committed cannot leave the buffer pool, i.e. it cannot be written to disk
FORCE policy
Whether the DBMS requires that all updates made by a txn are reflected on non-volatile storage before the txn can commit
- FORCE: required. Every page the txn modified must be flushed to disk before the DBMS is allowed to say that the txn has committed
- NO-FORCE: not required
Example: NO-STEAL + FORCE


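The problem with this approach is that, under the NO-STEAL policy, a transaction cannot modify a portion of the database that exceeds the amount of memory available in the buffer pool
The DBMS never has to undo the changes of an aborted txn (they were never written to disk) and never has to redo the changes of a committed txn (they were all written to disk at commit time)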
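A minimal sketch of where the two policies show up in a buffer pool. The names (`Page`, `can_evict`, `pages_to_flush_at_commit`) are made up for illustration and are not from the lecture.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Page:
    page_id: int
    dirty: bool = False
    modified_by_uncommitted: bool = False  # True while an active txn has touched it

def can_evict(page: Page, steal: bool) -> bool:
    """May the buffer pool write this page to disk and reuse its frame?"""
    if page.dirty and page.modified_by_uncommitted:
        return steal  # only STEAL lets uncommitted changes reach disk
    return True

def pages_to_flush_at_commit(dirty_pages_of_txn: List[Page], force: bool) -> List[Page]:
    """FORCE: every page the txn dirtied must be on disk before commit returns."""
    return list(dirty_pages_of_txn) if force else []
```

Under NO-STEAL, `can_evict` returns False for every page an active txn has dirtied, which is why a txn cannot touch more pages than fit in the buffer pool.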
Shadow paging
Shadow paging is an implementation of NO-STEAL + FORCE
Instead of copying the entire database, the DBMS copies pages on write to create two versions
- Master: contains only changes from committed txns
- Shadow: temporary database with changes made by uncommitted txns
To install the updates when a txn commits, overwrite the root so that it points to the shadow, thereby swapping the master and shadow






Undo: remove the shadow pages; leave the master and the DB root pointer alone
Redo: not needed
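A rough sketch of the copy-on-write idea with a single active txn. The class `ShadowPagedDB` and its in-memory `slots` dict (standing in for disk pages) are assumptions for illustration only.

```python
class ShadowPagedDB:
    def __init__(self, initial):
        self.slots = {}      # slot number -> page contents (stands in for disk)
        self.master = {}     # root page table: page id -> slot (committed state)
        self.shadow = None   # page table of the one active txn, if any
        self._next_slot = 0
        for page_id, value in initial.items():
            self.master[page_id] = self._alloc(value)

    def _alloc(self, value):
        slot = self._next_slot
        self._next_slot += 1
        self.slots[slot] = value
        return slot

    def begin(self):
        self.shadow = dict(self.master)  # copy the page table, not the pages

    def write(self, page_id, value):
        if self.shadow.get(page_id) == self.master.get(page_id):
            # first write to this page: copy-on-write into a fresh slot
            self.shadow[page_id] = self._alloc(value)
        else:
            self.slots[self.shadow[page_id]] = value

    def read_committed(self, page_id):
        return self.slots[self.master[page_id]]

    def commit(self):
        old = self.master
        self.master, self.shadow = self.shadow, None  # swap the root pointer
        for page_id, slot in old.items():             # free the replaced pages
            if self.master.get(page_id) != slot:
                del self.slots[slot]

    def abort(self):
        for page_id, slot in self.shadow.items():     # drop shadow-only pages
            if self.master.get(page_id) != slot:
                del self.slots[slot]
        self.shadow = None
```

Abort never touches `master`, which matches the undo rule above: remove the shadow pages and leave the root alone.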
Write-ahead log
The DBMS must write to disk the log file records that correspond to changes made to a database object before it can flush that object to disk
The log record and the database modification are produced together. The log must be flushed to disk before the database page is flushed to disk
log structure storage
STEAL + NO-FORCE
WAL protocol
Write a <BEGIN> record to the log for each txn to mark its starting point. When a txn finishes, the DBMS will
- Write a <COMMIT> record to the log
- Make sure that all log records are flushed before it returns an acknowledgement to the application
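The protocol above can be sketched as follows. `WriteAheadLog` and `run_txn` are hypothetical names, and `flush` just moves records between two lists where a real system would write and fsync a file.

```python
class WriteAheadLog:
    def __init__(self):
        self.buffer = []  # log tail in memory
        self.disk = []    # log records that are durable

    def append(self, record):
        self.buffer.append(record)

    def flush(self):
        self.disk.extend(self.buffer)  # stands in for write + fsync
        self.buffer.clear()

def run_txn(log, txn_id, updates):
    """updates: list of (object, before value, after value)."""
    log.append(("BEGIN", txn_id))
    for obj, before, after in updates:
        log.append(("UPDATE", txn_id, obj, before, after))
    log.append(("COMMIT", txn_id))
    log.flush()  # every record must be durable before we acknowledge
    return "committed"
```

Note that no data page is written here at all. Under STEAL + NO-FORCE the dirty pages can reach disk later, as long as their log records got there first.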

Example





Flushing the log buffer to disk every time a txn commits will become a bottleneck -> use group commit (batch the log flushes of several txns into one write)
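A small sketch of group commit, assuming the flush is triggered either when the buffer reaches a size limit or by a timer that calls `flush()` itself. `GroupCommitLog` and `max_buffered` are illustrative names, not part of any real DBMS API.

```python
class GroupCommitLog:
    def __init__(self, max_buffered=4):
        self.max_buffered = max_buffered
        self.buffer = []      # log tail in memory
        self.disk = []        # durable log
        self.flush_count = 0  # number of disk writes performed
        self.waiting = []     # txns whose COMMIT is buffered but not yet durable
        self.acked = []       # txns that have been told "committed"

    def commit(self, txn_id):
        self.buffer.append(("COMMIT", txn_id))
        self.waiting.append(txn_id)
        if len(self.buffer) >= self.max_buffered:
            self.flush()

    def flush(self):
        """Called when the buffer is full, or by a timer after a timeout."""
        if not self.buffer:
            return
        self.disk.extend(self.buffer)  # one write covers the whole group
        self.buffer.clear()
        self.flush_count += 1
        self.acked.extend(self.waiting)
        self.waiting.clear()
```

Three txns committing with `max_buffered=3` cost one flush instead of three. The price is that each txn waits for the group before it is acknowledged.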





Checkpoint
- Pause all queries
- Flush all WAL records in memory to disk
- Flush all modified pages in the buffer pool to disk
- Write a <CHECKPOINT> entry to WAL and flush to disk
- Resume queries
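The five steps above, as a toy function over a dict. The `db` layout (`log_buffer`, `log_disk`, `dirty_pages`, `disk_pages`, `accepting_queries`) is an assumption made up for this sketch.

```python
def take_checkpoint(db):
    db["accepting_queries"] = False             # 1. pause all queries
    db["log_disk"].extend(db["log_buffer"])     # 2. flush WAL records in memory
    db["log_buffer"].clear()
    db["disk_pages"].update(db["dirty_pages"])  # 3. flush modified pages
    db["dirty_pages"].clear()
    db["log_disk"].append(("CHECKPOINT",))      # 4. write <CHECKPOINT>, flushed
    db["accepting_queries"] = True              # 5. resume queries
```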

- Any txn that committed before the checkpoint is ignored (T1). Since T1 committed before the checkpoint, its changes have already been written to disk
- T2 and T3 did not commit before the last checkpoint, so recovery has to deal with their changes
- Need to redo T2, because it committed after the checkpoint and its commit record was flushed to the log on disk
- Need to undo T3, because it did not commit before the crash, so every change made by T3 must be removed
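The T1/T2/T3 reasoning above can be sketched as a pass over the log that survived on disk. The tuple record layout and the function name `classify_txns` are assumptions for illustration.

```python
def classify_txns(log):
    """log: list of tuples such as ("BEGIN", t), ("COMMIT", t), ("CHECKPOINT",)."""
    # position of the last checkpoint, or -1 if there is none
    last_ckpt = max((i for i, r in enumerate(log) if r[0] == "CHECKPOINT"), default=-1)
    began, commit_pos = [], {}
    for i, rec in enumerate(log):
        if rec[0] == "BEGIN":
            began.append(rec[1])
        elif rec[0] == "COMMIT":
            commit_pos[rec[1]] = i
    result = {}
    for t in began:
        if t not in commit_pos:
            result[t] = "undo"    # never committed before the crash
        elif commit_pos[t] < last_ckpt:
            result[t] = "ignore"  # committed before the checkpoint
        else:
            result[t] = "redo"    # committed after the checkpoint
    return result
```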

- flushedLSN: in memory, we keep track of the largest LSN that has been written to disk so far. It marks the end of the log on disk and the beginning of the log tail in memory

Each data page in the database has a pageLSN: the LSN of the most recent log record for an update to that page
pageLSN <= flushedLSN -> the log record for the latest update to this page has already been flushed to disk
In this picture, the blue buffer page points to an LSN in the log tail that is greater than the flushedLSN. We cannot write this blue page to disk yet, because the WAL property is not satisfied: the log record has not been flushed

As time goes on, the pageLSN points to an LSN <= flushedLSN. At this point, we can flush the blue page to the database on disk
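The check described here is a single comparison. A minimal sketch, where `LogManager` and `can_flush_page` are hypothetical names and LSNs are plain increasing integers:

```python
class LogManager:
    def __init__(self):
        self.next_lsn = 1
        self.flushed_lsn = 0  # largest LSN that is on disk so far
        self.tail = []        # (lsn, payload) records still in memory

    def append(self, payload):
        lsn = self.next_lsn
        self.next_lsn += 1
        self.tail.append((lsn, payload))
        return lsn  # the caller stores this as the page's pageLSN

    def flush(self):
        if self.tail:
            self.flushed_lsn = self.tail[-1][0]
            self.tail.clear()

def can_flush_page(page_lsn, flushed_lsn):
    """WAL rule: a page may go to disk only once its latest log record has."""
    return page_lsn <= flushed_lsn
```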

ARIES Logging



Transaction commit
When a txn commits, the DBMS writes a COMMIT record to the log and guarantees that all log records up to the txn's COMMIT record are flushed to disk
When the commit succeeds, write a special TXN-END record to the log. This does not need to be flushed immediately



Transaction abort
We need to add another field to our log records: prevLSN, the previous LSN for the txn. The prevLSN values form a linked list of the txn's log records
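A sketch of walking that linked list backwards from a txn's last record. The log is modelled as a dict from LSN to record, and the field layout is an assumption for illustration.

```python
def txn_lsns_newest_first(log, last_lsn):
    """Follow prevLSN pointers from the txn's last record back to its first."""
    chain = []
    lsn = last_lsn
    while lsn is not None:
        chain.append(lsn)
        lsn = log[lsn]["prevLSN"]
    return chain
```

With records of two txns interleaved in the log, the chain visits only the records of the txn being aborted, without scanning the rest.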


Compensation log records (CLR)
A CLR describes the actions taken to undo the actions of a previous update record
It has all the fields of an update log record plus the undoNext pointer (the next-to-be-undone LSN)
CLRs are added to the log like any other record, but the DBMS does not wait for them to be flushed before notifying the application that the txn aborted
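A sketch of an abort that writes one CLR per undone update. The record fields, the `ABORT` record in the example and the function name `abort_txn` are assumptions for illustration. The sketch sets each CLR's undoNext to the prevLSN of the update it undoes, and closes the txn with a TXN-END record.

```python
def abort_txn(log, txn_id, last_lsn, pages):
    """Undo txn_id's updates newest first; log is a dict LSN -> record."""
    next_lsn = max(log) + 1
    prev = last_lsn  # prevLSN for the next record we append
    lsn = last_lsn
    while lsn is not None:
        rec = log[lsn]
        if rec["type"] == "UPDATE":
            pages[rec["obj"]] = rec["before"]  # restore the old value
            log[next_lsn] = {
                "type": "CLR", "txn": txn_id, "prevLSN": prev,
                "obj": rec["obj"],
                "before": rec["after"], "after": rec["before"],
                "undoNext": rec["prevLSN"],    # next LSN to be undone
            }
            prev = next_lsn
            next_lsn += 1
        lsn = rec["prevLSN"]
    log[next_lsn] = {"type": "TXN-END", "txn": txn_id, "prevLSN": prev}
    return log
```

Nothing here flushes the log, in line with the point that the DBMS does not wait for CLRs to reach disk before reporting the abort.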