27 Nov 2024

Chapter 10 - WAL

Summary
Logging is required to guarantee data durability. Because some data lives only in RAM (main memory) before it is written to disk, the database must ensure that every update received from a user survives a failure.

Notes During Reading

Logging

Writing every change to disk immediately is not practical: flushing each update to the hard drive is expensive and would make the system inefficient.
Hence, while the server is running, some of the current data exists only in RAM, and its write to permanent storage is deferred. As a result, the data stored on disk is always inconsistent during server operation, because pages are never flushed all at once.
To recover from this, Postgres creates a log entry that contains all the essential information required to repeat an update operation if the need arises.
The following table shows which operations are recorded in the WAL:

Operation	Logged	Comments
Modifications in the buffer cache	Yes	Writes are deferred
Transaction commits and rollbacks	Yes	Status change happens in CLOG buffers
File operations	Yes	Must be in sync with data changes
Operations on UNLOGGED tables	No
Operations on temporary tables	No	Their lifetime is limited by the session anyway

Postgres does not use an undo log; its WAL is a redo log only.
- ℹ️ ==An undo log is used to restore the original state of data from before a transaction changed it, while a redo log is used to reapply ("redo") changes to a database during recovery.==
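
The redo idea can be sketched in a few lines of Python. This is a toy model, not PostgreSQL code: the class `TinyStore` and its methods are hypothetical names, the "disk" and "WAL" are plain in-memory structures, and the log is assumed to survive a crash while the cache does not.

```python
# Minimal redo-only write-ahead log sketch (illustrative, not PostgreSQL code).

class TinyStore:
    def __init__(self):
        self.wal = []      # durable log: assumed to survive a crash
        self.disk = {}     # durable pages: may lag behind the log
        self.cache = {}    # volatile buffer cache: lost on crash

    def update(self, key, value):
        # Write-ahead rule: log the change before touching the cached page.
        self.wal.append((key, value))
        self.cache[key] = value

    def flush(self, key):
        # Deferred write of a single page to permanent storage.
        self.disk[key] = self.cache[key]

    def crash(self):
        # RAM contents disappear; disk and WAL remain.
        self.cache = {}

    def recover(self):
        # Redo: replay every logged change in order on top of the disk state.
        for key, value in self.wal:
            self.disk[key] = value
        self.cache = dict(self.disk)


store = TinyStore()
store.update("a", 1)
store.update("b", 2)
store.flush("a")          # only "a" reached disk before the crash
store.crash()
store.recover()
print(store.disk)         # {'a': 1, 'b': 2}
```

Nothing is ever rolled back here: recovery only repeats logged changes, which is what makes the log redo-only.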

WAL Structure

Each WAL entry has a variable length but starts with a fixed-length header.
The header contains the following information:
- transaction ID related to entry
- resource manager that interprets the entry
  - ℹ️ Resource managers in Postgres handle different kinds of database operations (heap updates, index changes, etc.). During WAL replay, this field tells Postgres which resource manager should interpret the entry.
- checksum to detect data corruption
- WAL entry length
- reference to previous WAL entry
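
To make the header fields concrete, here is a rough Python sketch of a fixed-length header followed by a variable-length payload. The field order, the field widths, and the use of `zlib.crc32` are assumptions chosen for illustration; this is not PostgreSQL's actual on-disk record layout, and `make_entry` / `read_entry` are hypothetical helpers.

```python
import struct
import zlib

# Hypothetical fixed-length header: total length, transaction ID,
# offset of the previous entry, resource manager id, checksum.
HEADER = struct.Struct("<IIQBI")
FIELDS_BEFORE_CRC = struct.Struct("<IIQB")

def make_entry(xid, rmgr_id, prev_offset, payload):
    total_len = HEADER.size + len(payload)
    # The checksum covers every header field except itself, plus the payload.
    body = FIELDS_BEFORE_CRC.pack(total_len, xid, prev_offset, rmgr_id) + payload
    crc = zlib.crc32(body)
    return HEADER.pack(total_len, xid, prev_offset, rmgr_id, crc) + payload

def read_entry(buf):
    total_len, xid, prev_offset, rmgr_id, crc = HEADER.unpack_from(buf)
    payload = buf[HEADER.size:total_len]
    body = FIELDS_BEFORE_CRC.pack(total_len, xid, prev_offset, rmgr_id) + payload
    if zlib.crc32(body) != crc:
        raise ValueError("checksum mismatch: entry is corrupted")
    return xid, rmgr_id, prev_offset, payload


entry = make_entry(xid=742, rmgr_id=10, prev_offset=0, payload=b"heap update")
print(read_entry(entry))   # (742, 10, 0, b'heap update')
```

The length field lets a reader skip to the next entry, the previous-entry reference lets it walk backwards, and the checksum lets replay stop at the first damaged entry.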
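WAL entries are held in special buffers in the server's shared memory before they reach the WAL files on disk. The size of this WAL cache is set by the `wal_buffers` parameter; by default it is chosen automatically as 1/32 of the total buffer cache size.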
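
The default sizing is simple arithmetic, sketched below. The 1/32 ratio comes from the notes above; the lower bound of 64 kB and the upper bound of one WAL segment (typically 16 MB) are my recollection of the PostgreSQL documentation for `wal_buffers` and should be checked against the version in use. The function name `auto_wal_buffers` is hypothetical.

```python
KB = 1024
MB = 1024 * KB

def auto_wal_buffers(shared_buffers_bytes, wal_segment_bytes=16 * MB):
    # 1/32 of the buffer cache, clamped to [64 kB, one WAL segment].
    size = shared_buffers_bytes // 32
    return max(64 * KB, min(size, wal_segment_bytes))

print(auto_wal_buffers(128 * MB) // KB)   # 4096 (kB), i.e. 4 MB
```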
The WAL cache is similar to the buffer cache, but it usually operates in ring buffer mode:
- New entries are added at the head.
- Older entries are saved to disk starting at the tail.
The WAL cache should not be too small, as that forces frequent writes to disk.
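
A small Python sketch of the ring buffer behaviour described above, showing why a small cache causes more disk writes. It is deliberately simplified: the class `WalRing` is hypothetical, entries are written out only when the buffer is full, and the "disk" is a Python list. A real server also writes WAL at other times, for example at commit.

```python
class WalRing:
    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.head = 0        # next position for a new entry
        self.tail = 0        # oldest entry not yet saved to disk
        self.count = 0       # entries currently held in the ring
        self.disk = []       # stand-in for WAL files on disk
        self.syncs = 0       # number of write-outs performed

    def append(self, entry):
        if self.count == len(self.slots):
            self.flush()     # ring is full: older entries must go to disk first
        self.slots[self.head] = entry
        self.head = (self.head + 1) % len(self.slots)
        self.count += 1

    def flush(self):
        # Save entries to disk starting at the tail, oldest first.
        while self.count:
            self.disk.append(self.slots[self.tail])
            self.tail = (self.tail + 1) % len(self.slots)
            self.count -= 1
        self.syncs += 1


small, large = WalRing(4), WalRing(64)
for i in range(100):
    small.append(i)
    large.append(i)
print(small.syncs, large.syncs)   # 24 1
```

For the same 100 entries, the 4-slot ring has to write out 24 times while the 64-slot ring does so once.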
A particular WAL entry is identified by its LSN (log sequence number), a 64-bit offset in bytes from the start of the WAL to that entry.
- Because an LSN is a byte address, it locates an entry directly within the WAL stream.
- Each new WAL entry advances the LSN to the position where the next entry will start.
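
Postgres displays an LSN as two hexadecimal numbers separated by a slash: the high and low 32 bits of the 64-bit value. Since an LSN is a byte offset, subtracting two LSNs gives the amount of WAL between them in bytes. The helpers `parse_lsn` and `format_lsn` below are hypothetical names, and the sample LSN values are made up for illustration.

```python
def parse_lsn(text):
    # "0/3DF70948" -> 64-bit integer: high 32 bits / low 32 bits, both hex.
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)

def format_lsn(value):
    return f"{value >> 32:X}/{value & 0xFFFFFFFF:X}"


start = parse_lsn("0/3DF70948")
end = parse_lsn("0/3DF70A18")
print(end - start)        # 208 bytes of WAL between the two positions
print(format_lsn(end))    # 0/3DF70A18
```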

Prashant Shubham
