Bloat removal without table swapping

Some time ago I wrote about my favorite method of bloat removal. Around one year earlier, I wrote about another idea for bloat removal. This older idea was great – it didn't involve usage of triggers, overhead on all writes, table swapping. It had just one small, tiny, minuscule little issue. It was unbearably slow.

My idea was explored by Nathan Thom, but his blogpost disappeared.

Recently, Sergey Konoplev wrote to me about his tool, that he wrote using the same idea – updating rows to move them to other pages. So I decided that I have to check it.

Obviously first I needed some test data. To avoid autovacuum cleaning bloat, I disabled it, and then created the table:

All in all – it works pretty good. The 800 pages per second means 6.5MB/s. Maybe not the fastest thing on earth, but faster than I expected.

One thing to keep in mind – it works by doing a lot of updates in the table. So, if your table has “ON UPDATE" triggers, you have to put some caution into using the tool (and check if your triggers shouldn't be fixed, not to do anything if the update doesn't change any data in row).

The tool has some rough edges (hardcoded default port for example), but generally – it surprised me how well it behaved. Great work Sergey, thanks a lot.

I successfully use pgcompact on about 40 production servers for more then 1 year. Some of the servers are with 10k+ queries per seconds in avg. The tool adjusts its pace to the current load of the server automatically to not affect user queries. The main idea of the tool was to put it to crontab and forget about bloat.

@Depesz

I would like to add a little bit to the article:

1. If you specify ‘–reindex’ it will reindex bloated indexes as well, including PKs and UKs.
2. You don’t need to specify a table name, just ‘-d dbname’ (one or more time) or ‘-a’ for all databases the cluster, the tool will automatically find all the tables that have bloat and compact them.
3. You can use ‘./fatpack/pgcompact’ instead of ‘./bin/pgcompact’, because the former one has all the dependencies packed in, in other words it fully autonomous and you do not need to specify PERL5LIB for it.
4. You do not need to be wary about ‘ON UPDATE’ triggers because the tool sets ‘session_replication_role’ to ‘replica’ so user defined triggers will not work.
5. You can ‘pgcompact –man’ to see the full manual with all the parameters and notes.