PostgreSQL's statistics collector is a subsystem that supports
collection and reporting of information about server activity.
Presently, the collector can count accesses to tables and indexes
in both disk-block and individual-row terms. It also supports
determining the exact command currently being executed by other
server processes.

Since collection of statistics adds some overhead to query
execution, the system can be configured to collect or not
collect information. This is controlled by configuration
parameters that are normally set in postgresql.conf. (See Section 16.4 for details about
setting configuration parameters.)

The parameter stats_start_collector
must be set to true for the statistics
collector to be launched at all. This is the default and
recommended setting, but it may be turned off if you have no
interest in statistics and want to squeeze out every last drop
of overhead. (The savings is likely to be small, however.) Note
that this option cannot be changed while the server is
running.

The parameters stats_command_string,
stats_block_level,
and stats_row_level
control how much information is actually sent to the collector
and thus determine how much run-time overhead occurs. These
respectively determine whether a server process sends its
current command string, disk-block-level access statistics, and
row-level access statistics to the collector. Normally these
parameters are set in postgresql.conf
so that they apply to all server processes, but it is possible
to turn them on or off in individual sessions using the
SET command. (To prevent
ordinary users from hiding their activity from the
administrator, only superusers are allowed to change these
parameters with SET.)

Note: Since the parameters stats_command_string, stats_block_level, and stats_row_level default to false, very few statistics are collected in
the default configuration. Enabling one or more of these
configuration variables will significantly enhance the
amount of useful data produced by the statistics collector,
at the expense of additional run-time overhead.

Several predefined views, listed in Table
23-1, are available to show the results of statistics
collection. Alternatively, one can build custom views using the
underlying statistics functions.

When using the statistics to monitor current activity, it is
important to realize that the information does not update
instantaneously. Each individual server process transmits new
block and row access counts to the collector just before going
idle; so a query or transaction still in progress does not
affect the displayed totals. Also, the collector itself emits a
new report at most once per pgstat_stat_interval milliseconds (500 by
default). So the displayed information lags behind actual
activity. Current-query information is reported to the
collector immediately, but is still subject to the pgstat_stat_interval delay before it becomes
visible.

Another important point is that when a server process is
asked to display any of these statistics, it first fetches the
most recent report emitted by the collector process and then
continues to use this snapshot for all statistical views and
functions until the end of its current transaction. So the
statistics will appear not to change as long as you continue
the current transaction. This is a feature, not a bug, because
it allows you to perform several queries on the statistics and
correlate the results without worrying that the numbers are
changing underneath you. But if you want to see new results
with each query, be sure to do the queries outside any
transaction block.

Table 23-1. Standard Statistics Views

View Name

Description

pg_stat_activity

One row per server process, showing process
ID, database, user,
current query, and the time at which the current query
began execution. The columns that report data on the
current query are only available if the parameter
stats_command_string has been
turned on. Furthermore, these columns read as null
unless the user examining the view is a superuser or
the same as the user owning the process being reported
on. (Note that because of the collector's reporting
delay, current query will only be up-to-date for
long-running queries.)

pg_stat_database

One row per database, showing the number of active
backend server processes, total transactions committed
and total rolled back in that database, total disk
blocks read, and total number of buffer hits (i.e.,
block read requests avoided by finding the block
already in buffer cache).

pg_stat_all_tables

For each table in the current database, total
numbers of sequential and index scans, total numbers of
rows returned by each type of scan, and totals of row
insertions, updates, and deletions.

pg_stat_sys_tables

Same as pg_stat_all_tables, except that only
system tables are shown.

pg_stat_user_tables

Same as pg_stat_all_tables, except that only
user tables are shown.

pg_stat_all_indexes

For each index in the current database, the total
number of index scans that have used that index, the
number of index rows read, and the number of
successfully fetched heap rows. (This may be less when
there are index entries pointing to expired heap
rows.)

pg_stat_sys_indexes

Same as pg_stat_all_indexes, except that only
indexes on system tables are shown.

pg_stat_user_indexes

Same as pg_stat_all_indexes, except that only
indexes on user tables are shown.

pg_statio_all_tables

For each table in the current database, the total
number of disk blocks read from that table, the number
of buffer hits, the numbers of disk blocks read and
buffer hits in all the indexes of that table, the
numbers of disk blocks read and buffer hits from the
table's auxiliary TOAST table (if any), and the numbers
of disk blocks read and buffer hits for the TOAST
table's index.

pg_statio_sys_tables

Same as pg_statio_all_tables, except that
only system tables are shown.

pg_statio_user_tables

Same as pg_statio_all_tables, except that
only user tables are shown.

pg_statio_all_indexes

For each index in the current database, the numbers
of disk blocks read and buffer hits in that index.

pg_statio_sys_indexes

Same as pg_statio_all_indexes, except that
only indexes on system tables are shown.

pg_statio_user_indexes

Same as pg_statio_all_indexes, except that
only indexes on user tables are shown.

pg_statio_all_sequences

For each sequence object in the current database,
the numbers of disk blocks read and buffer hits in that
sequence.

pg_statio_sys_sequences

Same as pg_statio_all_sequences, except that
only system sequences are shown. (Presently, no system
sequences are defined, so this view is always
empty.)

pg_statio_user_sequences

Same as pg_statio_all_sequences, except that
only user sequences are shown.

The per-index statistics are particularly useful to
determine which indexes are being used and how effective they
are.

The pg_statio_ views are
primarily useful to determine the effectiveness of the buffer
cache. When the number of actual disk reads is much smaller
than the number of buffer hits, then the cache is satisfying
most read requests without invoking a kernel call. However,
these statistics do not give the entire story: due to the way
in which PostgreSQL handles
disk I/O, data that is not in the PostgreSQL buffer cache may still reside
in the kernel's I/O cache, and may therefore still be fetched
without requiring a physical read. Users interested in
obtaining more detailed information on PostgreSQL I/O behavior are advised to use
the PostgreSQL statistics
collector in combination with operating system utilities that
allow insight into the kernel's handling of I/O.

Other ways of looking at the statistics can be set up by
writing queries that use the same underlying statistics access
functions as these standard views do. These functions are
listed in Table
23-2. The per-database access functions take a database OID
as argument to identify which database to report on. The
per-table and per-index functions take a table or index OID.
(Note that only tables and indexes in the current database can
be seen with these functions.) The per-backend process access
functions take a backend process ID number, which ranges from
one to the number of currently active backend processes.

Table 23-2. Statistics Access Functions

Function

Return Type

Description

pg_stat_get_db_numbackends(oid)

integer

Number of active backend processes for
database

pg_stat_get_db_xact_commit(oid)

bigint

Transactions committed in database

pg_stat_get_db_xact_rollback(oid)

bigint

Transactions rolled back in database

pg_stat_get_db_blocks_fetched(oid)

bigint

Number of disk block fetch requests for
database

pg_stat_get_db_blocks_hit(oid)

bigint

Number of disk block fetch requests found in cache
for database

pg_stat_get_numscans(oid)

bigint

Number of sequential scans done when argument is a
table, or number of index scans done when argument is
an index

pg_stat_get_tuples_returned(oid)

bigint

Number of rows read by sequential scans when
argument is a table, or number of index rows read when
argument is an index

pg_stat_get_tuples_fetched(oid)

bigint

Number of valid (unexpired) table rows fetched by
sequential scans when argument is a table, or fetched
by index scans using this index when argument is an
index

pg_stat_get_tuples_inserted(oid)

bigint

Number of rows inserted into table

pg_stat_get_tuples_updated(oid)

bigint

Number of rows updated in table

pg_stat_get_tuples_deleted(oid)

bigint

Number of rows deleted from table

pg_stat_get_blocks_fetched(oid)

bigint

Number of disk block fetch requests for table or
index

pg_stat_get_blocks_hit(oid)

bigint

Number of disk block requests found in cache for
table or index

pg_stat_get_backend_idset()

set of integer

Set of currently active backend process IDs (from 1
to the number of active backend processes). See usage
example in the text.

pg_backend_pid()

integer

Process ID of the backend process attached to the
current session

pg_stat_get_backend_pid(integer)

integer

Process ID of the given backend process

pg_stat_get_backend_dbid(integer)

oid

Database ID of the given backend process

pg_stat_get_backend_userid(integer)

oid

User ID of the given backend process

pg_stat_get_backend_activity(integer)

text

Active command of the given backend process (null
if the current user is not a superuser nor the same
user as that of the session being queried, or
stats_command_string is not
on)

pg_stat_get_backend_activity_start(integer)

timestamp with time zone

The time at which the given backend process'
currently executing query was started (null if the
current user is not a superuser nor the same user as
that of the session being queried, or stats_command_string is not on)

pg_stat_reset()

boolean

Reset all currently collected statistics

Note:pg_stat_get_db_blocks_fetched minus
pg_stat_get_db_blocks_hit
gives the number of kernel read() calls issued for the table, index,
or database; but the actual number of physical reads is
usually lower due to kernel-level buffering.

The function pg_stat_get_backend_idset provides a
convenient way to generate one row for each active backend
process. For example, to show the PIDs and current queries of all backend
processes:

SELECT pg_stat_get_backend_pid(s.backendid) AS procpid,
pg_stat_get_backend_activity(s.backendid) AS current_query
FROM (SELECT pg_stat_get_backend_idset() AS backendid) AS s;