Chapter 10 - WAL
Summary
Logging is required to guarantee data durability. Because some data lives only in RAM before it is written to disk, the database needs a way to make sure that every update it accepts from a user survives a failure.
Notes During Reading
Logging
- Writing every change to disk immediately is not practical: flushing each update to the drive is expensive and would make the system inefficient.
- Hence, while the server is running, some of the current data is available only in RAM, and writing it to permanent storage is deferred. As a result, the data stored on disk is always inconsistent during server operation, because pages are never flushed all at once.
- Alongside each operation, Postgres creates a log entry that contains the essential information required to repeat the operation if the need arises. The log entry must reach disk before the modified page does, hence the name write-ahead log.
- The following table shows which actions are recorded in WAL:
Operation | Logged | Comments |
---|---|---|
Modifications in the buffer cache | Yes | Writes are deferred |
Transaction commits and rollbacks | Yes | Status change happens in CLOG buffers |
File operations | Yes | Must be in sync with data changes |
Operations on UNLOGGED tables | No | |
Operations on temporary tables | No | Their lifetime is limited by the session anyway |
- Postgres does not keep an undo log; its WAL is a redo log only.
- ℹ️ ==Undo logs are used to restore data to the state it had before a transaction changed it, while redo logs are used to “redo” changes to a database during recovery.==
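
The redo idea from the notes above can be sketched in a few lines. This is a toy model only: `ToyRedoLog` and `ToyStore` are hypothetical names, the "disk" and "RAM" are plain dictionaries, and nothing here reflects the actual PostgreSQL record format.

```python
import json


class ToyRedoLog:
    """Hypothetical stand-in for a write-ahead log (not PostgreSQL's format)."""

    def __init__(self):
        self.entries = []  # treated as durable: survives the simulated crash

    def append(self, key, value):
        # Each entry holds just enough information to repeat the update.
        self.entries.append(json.dumps({"key": key, "value": value}))


class ToyStore:
    def __init__(self, log):
        self.log = log
        self.cache = {}  # volatile: pages that live in RAM
        self.disk = {}   # durable: pages that have already been flushed

    def update(self, key, value):
        self.log.append(key, value)  # write-ahead: log the change first
        self.cache[key] = value      # then modify the cached page

    def flush(self, key):
        # Deferred write of a single page to permanent storage.
        self.disk[key] = self.cache[key]

    def crash_and_recover(self):
        self.cache = {}  # everything in RAM is lost
        # Redo: repeat every logged update on top of what reached the disk.
        recovered = dict(self.disk)
        for raw in self.log.entries:
            entry = json.loads(raw)
            recovered[entry["key"]] = entry["value"]
        self.disk = recovered
        return recovered


log = ToyRedoLog()
store = ToyStore(log)
store.update("a", 1)
store.flush("a")      # only this page version reaches the disk
store.update("a", 3)  # these two updates exist only in RAM and in the log
store.update("b", 2)
print(store.crash_and_recover())
```

The sketch replays the whole log from the start for simplicity; it has no notion of where replay could safely begin.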
WAL Structure
- Each WAL entry has a variable length but is preceded by a fixed-length header.
- The header contains the following information:
  - transaction ID related to the entry
  - resource manager that interprets the entry
    - ℹ️ Resource managers in Postgres handle different kinds of database operations (heap updates, index changes, etc.). During WAL replay, this field tells Postgres which resource manager should interpret the entry.
  - checksum to detect data corruption
  - WAL entry length
  - reference to the previous WAL entry
- WAL entries are kept in special buffers in the server’s shared memory (the WAL cache). Its size is set by the `wal_buffers` parameter; by default, it is 1/32 of the total buffer cache size.
- The WAL cache is similar to the buffer cache, but it usually operates in ring buffer mode:
  - New entries are added at the head.
  - Older entries are saved to disk starting at the tail.
- The WAL cache shouldn’t be too small, as this leads to disk syncs happening more often than necessary.
- A particular WAL entry is identified by its LSN (log sequence number): a 64-bit byte offset from the start of the WAL, that is, the byte address of the entry in the WAL stream.
- Each time a WAL entry is created, the LSN advances to the position where the next entry will start.
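
To make the header fields and the LSN arithmetic concrete, here is a minimal sketch. The layout in `HEADER`, the `ToyWal` class, and the helper names are assumptions made for illustration: the real PostgreSQL header differs in detail, uses CRC-32C rather than `zlib.crc32`, aligns records, and does not start its LSNs at zero. The one part taken from PostgreSQL itself is the textual LSN form, two hexadecimal numbers separated by a slash (high 32 bits, then low 32 bits).

```python
import struct
import zlib

# Hypothetical fixed-length header, little-endian, 24 bytes:
# total length (u32), xid (u32), previous LSN (u64),
# resource manager id (u8), 3 padding bytes, checksum (u32).
HEADER = struct.Struct("<IIQB3xI")


def make_record(xid, rmgr_id, prev_lsn, payload):
    total_len = HEADER.size + len(payload)
    # The checksum covers the header (with the checksum field zeroed) and the payload.
    crc = zlib.crc32(HEADER.pack(total_len, xid, prev_lsn, rmgr_id, 0) + payload)
    return HEADER.pack(total_len, xid, prev_lsn, rmgr_id, crc) + payload


def read_record(stream, lsn):
    total_len, xid, prev_lsn, rmgr_id, crc = HEADER.unpack_from(stream, lsn)
    payload = bytes(stream[lsn + HEADER.size:lsn + total_len])
    expected = zlib.crc32(HEADER.pack(total_len, xid, prev_lsn, rmgr_id, 0) + payload)
    if crc != expected:
        raise ValueError("checksum mismatch: entry is corrupted")
    return xid, rmgr_id, prev_lsn, payload


def format_lsn(lsn):
    # High and low 32-bit halves in hexadecimal, separated by a slash.
    return f"{lsn >> 32:X}/{lsn & 0xFFFFFFFF:X}"


def parse_lsn(text):
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


class ToyWal:
    """Hypothetical append-only WAL stream held in memory."""

    def __init__(self):
        self.stream = bytearray()
        self.prev_lsn = 0

    def append(self, xid, rmgr_id, payload):
        lsn = len(self.stream)  # byte offset of the new entry from the start
        self.stream += make_record(xid, rmgr_id, self.prev_lsn, payload)
        self.prev_lsn = lsn     # the next entry will point back to this one
        return lsn


wal = ToyWal()
first = wal.append(100, 10, b"heap insert")
second = wal.append(100, 10, b"commit")
print(format_lsn(first), format_lsn(second))  # prints "0/0 0/23"
print(read_record(wal.stream, second))
```

The second entry starts at byte 35 (a 24-byte header plus an 11-byte payload), which is `0/23` in the slash notation, so subtracting two parsed LSNs gives the amount of WAL in bytes between them.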