Postgres 10 and auto vacuum

We have a busy Postgres 10 database with a principal table that holds about 15 million active rows and sees about 90M inserts, updates and deletes a day.

Everything performs well, except for a small number of monitoring SQL statements of this form:

select code, max(timestamp) from mainTable group by code;

After running ANALYZE or VACUUM ANALYZE on the table, the Query Plan uses an index on columns (code, timestamp) and takes less than 1s to report the latest value for each code.

However, after about 20 minutes, the Query Plan has changed to a full table scan of mainTable, and this takes about 30s. This is very puzzling because the nature of the table and data has not changed, although many values may have changed and about 5M changes may have happened in that time. The only possible cause can be auto vacuum.

I have been playing with these settings:

autovacuum_analyze_threshold

autovacuum_analyze_scale_factor

default_statistics_target

What is the best option to ensure that the table statistics stay up to date and that the best Query Plan is generated? One option would be to use CRON to regenerate the table statistics every 15 minutes, or to disable the auto vacuum ANALYZE function, but neither of these options feels right.

Re: Postgres 10 and auto vacuum

Changing autovacuum_analyze_threshold and autovacuum_analyze_scale_factor will impact how often the table is analyzed, based on the rough count of changed rows. You may want to adjust the autovacuum (vacuum) settings as well so that dead space can be reused.
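As a rough illustration of how these two settings interact: an automatic ANALYZE is triggered once the number of rows changed since the last analyze exceeds autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * (estimated row count). If the defaults (50 and 0.1) were still in effect, that is about 1.5 million changes for a 15 million row table, and at 90M changes a day (roughly 62,500 a minute, assuming they are spread evenly) it would be reached about every 24 minutes. The sketch below overrides the settings for this one table only; the values are illustrative assumptions, not tuned recommendations.

```sql
-- Per-table override: analyze after roughly 10,000 + 1% of the rows have changed.
-- The numbers are placeholders; choose them from your own change rate.
ALTER TABLE mainTable SET (
    autovacuum_analyze_threshold    = 10000,
    autovacuum_analyze_scale_factor = 0.01
);

-- Confirm which per-table options are now stored for the table.
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'maintable';  -- unquoted identifiers are folded to lower case
```

Setting the options per table leaves the global autovacuum behaviour for every other table unchanged.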

default_statistics_target

Increasing this from the default of 100 will result in longer planning time, but you may get a better plan (more consistently).

What is the best option to ensure that the table statistics stay up to date and that the best Query Plan is generated? One option would be to use CRON to regenerate the table statistics every 15 minutes, or to disable the auto vacuum ANALYZE function, but neither of these options feels right.
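If only one or two columns matter for the problem query, the statistics target can also be raised for those columns alone rather than globally. A minimal sketch, assuming the code column is the one that needs better estimates and using 500 purely as an example value:

```sql
-- Raise the statistics target for a single column (example value, not a recommendation).
ALTER TABLE mainTable ALTER COLUMN code SET STATISTICS 500;

-- The new target only takes effect the next time the table is analyzed.
ANALYZE mainTable;
```

This limits the extra planning and ANALYZE cost to the columns that need it.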

You can check if autovacuum is working on this table by checking pg_stat_user_tables or turning on logging of autovacuum and reviewing your logs.
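For example, a query along these lines shows when the table was last vacuumed and analyzed automatically, and how many rows have changed since the last analyze. The table name is taken from the question; the logging setting at the end needs superuser rights and is shown only as one possible way to turn the logging on.

```sql
-- When did autovacuum last touch the table, and how stale are the statistics?
SELECT relname,
       last_autovacuum,
       last_autoanalyze,
       autovacuum_count,
       autoanalyze_count,
       n_live_tup,
       n_dead_tup,
       n_mod_since_analyze
FROM pg_stat_user_tables
WHERE relname = 'maintable';

-- Log every autovacuum and autoanalyze run (0 = log all, regardless of duration).
ALTER SYSTEM SET log_autovacuum_min_duration = 0;
SELECT pg_reload_conf();
```

Comparing last_autoanalyze with the time the plan flips to a full table scan should show whether the two events are related.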

How does the auto vacuum analyze function work? Does it use a subset of the whole table to regenerate the table statistics?

Yes. It samples 300 * default_statistics_target rows (30,000 rows at the default target of 100) and, for each column, estimates the null fraction, the most common values and their frequencies, histogram_bounds and the other information found in pg_stats.
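The estimates produced by the last ANALYZE can be inspected directly. A sketch, assuming the table and column names from the question:

```sql
-- Per-column statistics the planner is currently working from.
SELECT attname,
       null_frac,
       n_distinct,
       most_common_vals,
       most_common_freqs,
       histogram_bounds,
       correlation
FROM pg_stats
WHERE tablename = 'maintable'
  AND attname IN ('code', 'timestamp');
```

Running this once right after a manual ANALYZE and again after the plan has changed may show which estimate (n_distinct for code, for instance) differs between the two samples.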