Guide to Postgres Vacuum

Why Vacuum?

Vacuum is Postgre’s GC to reclaim space, recycle transaction ID, update statistics for query planner. Postgres has built-in Autovacuum Daemon that we 100% want to turn on: Routine Vacuuming.

The autovacuum process takes care of several maintenance chores inside your database that you really need. If you think you need to turn regular vacuuming off because it's taking too much time or resources, that means you're doing it wrong. The answer to almost all vacuuming problems is to vacuum more often, not less. So that each individual vacuum operation has less to clean up.
https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

VACUUM marks the disk space reference that this is available to use again, it did not free the space back to OS. VACUUM FULL will get the disk space back but in order to do so, it needs to lock the table.

The visibility map is bitmap with two bits (all-visible and all-frozen). Vaccum will skip all-frozen and only vacuum on all-visible.

  • Use VACUUM and avoid VACUUM FULL
    VACUUM FULL requires exclusive lock on the table it is working on

  • VACUUM creates a substantial amount of I/O traffic, which can cause poor performance for other active sessions.

  • TRUNCATE removes the entire content of the table immediately, without requiring a subsequent VACUUM or VACUUM FULL to reclaim the now-unused disk space. But this violates MVCC.

  • ⭐VACUUM every table in every database at least once every 2 Billion transactions to avoid Transaction ID Wraparound

  • When will the Autovacuum Daemon vacuum?

    vacuum threshold = vacuum base threshold +
                       vacuum scale factor * number of tuples
    
  • When will the Autovacuum Daemon analyze?

    analyze threshold = analyze base threshold +
                        analyze scale factor * number of tuples
    

Metrics

These are the things you want to monitor in order to make a data-informed decision:

  • Processes running Auto Vacuum
  • Auto Vacuum workers per table
  • Active/Idle Transaction Connections
  • CPU usage
  • Disk Read
  • Disk Pending Write
  • Disk I/O time
  • Disk Free Space

Terms

  • Memory: shared buffer cache
  • Row: Tuple

Understanding Vacuum Configs

Good thing about Vacuum is when you changed the configuration, you do not need to restart the database.

Imagine Vacuum is a Cleaner in the airport (Database) that has many customers (IO operations) going around. The Cleaner’s job is to keep the airport tidy and clean. But Cleaner cannot always do vacuum, because the airport has many customers go around to do important things. So when the cleaner does vacuum work, it keeps track of the cost of each vacuum work item.

When the airport is shut down, you can do a full clean up VACUUM FULL but the airport never shuts down and it would take 100% Disk IO. Cleaner is like doing this micro clean up here and there while the airport (Database) is still functional.

  • vacuum_cost_limit — Defaults to 200. The total points of vacuum work before Cleaner takes a break. So the customers (IO operations) can use the airport (Disk).
  • vacuum_cost_page_hit — Defaults to 1, When Cleaner vacuums a row in memory (cache hit), it costs 1 point.
  • vacuum_cost_page_miss — Defaults to 10, When Cleaner vacuums a row found in the disk (cache miss), it costs 10 points.
  • vacuum_cost_page_dirty — Defaults to 20, When Postgres vacuums a page that was clean, when Postgres vacuums it becomes dirty and need to make it clean again.
  • vacuum_cost_delay — The break time for Cleaner when her vacuum work points exceeded vacuum_cost_limit. Defaults to 0, which never lets Cleaner take a break. This value should be set a multiple of 10.

Auto Vacuum configs

  • autovacuum_start_daemon
  • autovacuum_max_workers
  • autovacuum_work_mem
  • autovacuum_naptime
  • autovacuum_vac_thresh
  • autovacuum_vac_scale
  • autovacuum_vac_ins_thresh
  • autovacuum_vac_ins_scale
  • autovacuum_anl_thresh
  • autovacuum_anl_scale
  • autovacuum_freeze_max_age
  • autovacuum_multixact_freeze_max_age
  • autovacuum_vac_cost_delay
  • autovacuum_vac_cost_limit

Note

There is a window (caused by pgstat delay) on which a vacuum worker may choose a table that was already vacuumed; This is a bug in the current design. Source