One key technique to address tombstones is to understand how the autovacuum process actually works.
every UPDATE or DELETE on an existing tuple writes a new tuple version and sets t_xmax on the old tuple to the XID of the transaction performing the update.
If it's an UPDATE, the new tuple's t_xmin is set to that same XID. t_xmin, here is just the XID of the transaction that inserted (or, in this case, wrote) the tuple. t_xmax is the XID of the transaction that deleted or updated it, zero means no such transaction yet. the XID of the transaction performing the update.
Another concept that brings everything together is the Visibility Map. So, it is just one byte per heap page holding two bits: ALL_VISIBLE and ALL_FROZEN. ALL_VISIBLE set means every tuple on that page is visible to all current and future transactions, no dead tuples to clean. ALL_VISIBLE clear means the page might contain dead tuples or uncommitted rows and needs scanning. The VM's primary purpose is index-only scans: if the planner can answer a query from the index alone and the VM says the page is all-visible, it skips the heap fetch entirely. Without that bit set, every "index-only" scan degrades into index-scan-plus-heap-fetch.
The process of cleaning dead tuples is VACUUM. A naive VACUUM would scan every page sequentially. The visibility map lets it skip that work — pages marked ALL_VISIBLE have nothing to clean and are skipped; pages where the bit is clear get scanned and any dead tuples are reclaimed.
So how does this answer the original question? Autovacuum only runs once a trigger condition is met. To find that trigger, you need two parameters:
1. SHOW autovacuum_vacuum_threshold
2. SHOW autovacuum_vacuum_scale_factor
The trigger formular is,
trigger = autovacuum_vacuum_threshold (autovacuum_vacuum_scale_factor x n_live_tup)
Now, the catch is this by default, autovacuum_vacuum_threshold is always 50, and then imagine in a high-throughput system like a booking application where we have a large state-machine table, we are going to experience a high n_live_tup.
let say we have 50,000 tickets, we are going to be having 50,000 n_live_tup tuples
if SHOW autovacuum_vacuum_scale_factor = 0.2
It implies that auto vacuum will trigger at:
50 (0.2 * 50,000) = 10,050 dead tuples
So, in a high volume system, you have to adjust the 2 factors that triggers auto vacuum by doing this
ALTER TABLE tickets SET (
autovacuum_vacuum_scale_factor = 0.01,
autovacuum_vacuum_threshold = 50
);
so at a scale factor of 0.01 we will have:
trigger = 50 (0.01 x 50,000) = 550 dead tuples
In summary, to resolve these dead tuple issues, just tune based on autovacuum_vacuum_scale_factor, and
autovacuum_vacuum_threshold
Did you know that Postgres doesn't actually delete rows when you run DELETE?
It marks them as invisible to future transactions and leaves the physical row on disk.
That dead row sits there consuming space, bloating your indexes, and slowing down every sequential scan until autovacuum comes to clean it up.
At high delete volume, autovacuum can't keep up.
Your table grows forever, even though the data is "gone".
Run this right now and see how many dead rows are sitting in your most-written tables:
If n_dead_tup is close to or larger than n_live_tup on any table, you have a problem worth fixing today.