Performance Analysis Tools

This page focuses on tools for collecting data outside of PostgreSQL, in order to learn more about the system as a whole, about PostgreSQL's use of system resources, and about possible bottlenecks in PostgreSQL's performance.

Most of the time, the tools PostgreSQL provides internally will be more than adequate for your needs. The most important tool in your toolbox is the SQL EXPLAIN command and its EXPLAIN ANALYZE variant. The pg_catalog.pg_stat_activity and pg_catalog.pg_locks views are also vital.
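
For example, a quick way to look at a query's actual plan and at current backend activity from the shell (the table name here is a placeholder, and the pg_stat_activity columns are those of recent PostgreSQL releases):

    # EXPLAIN ANALYZE executes the query and reports the real plan and timings:
    psql -c 'EXPLAIN ANALYZE SELECT count(*) FROM my_table;'
    # See what each backend is currently doing:
    psql -c 'SELECT pid, state, wait_event, query FROM pg_stat_activity;'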

System-level tools for investigating I/O, CPU and memory usage

Tools like ps (with the wchan format specifier), vmstat, top, iotop, blktrace + blkparse, btrace, sar, alt-sysrq-t, etc. can help you learn much more about what your system is doing and where things are being delayed.
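
For instance (a minimal sketch; "-C postgres" assumes your server processes are named postgres):

    # One summary line per second; watch the r (runnable), b (blocked),
    # wa (I/O wait) and si/so (swap) columns:
    vmstat 1
    # Show the kernel wait channel each PostgreSQL process is sleeping in:
    ps -o pid,state,wchan:30,cmd -C postgres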

On systems that support it (such as Solaris, FreeBSD, and macOS), dtrace is also a powerful tool. PostgreSQL has DTrace hooks to allow you to investigate its internal workings and performance, as well as that of the operating system it is running on.
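
A minimal sketch, assuming a server built with --enable-dtrace so the query-start probe is available:

    # Count queries by their text as they begin executing (Ctrl-C to print totals):
    dtrace -n 'postgresql*:::query-start { @queries[copyinstr(arg0)] = count(); }'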

top

Like vmstat, gives you some system overview info, though isn't as useful for disk information. Quite configurable.
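
Batch mode is handy for capturing a one-off snapshot from a script (a sketch; the exact output layout varies between top versions):

    # One non-interactive iteration, first 20 lines:
    top -b -n 1 | head -n 20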

free

Don't be fooled by the "used" and "free" memory values. The real memory in use is roughly the value in "used" minus the value in "buffers", since "buffers" includes the kernel's disk cache. The kernel will use most of the free memory for disk cache, but will shrink that cache as required to fit other things into memory. After all, truly free memory is wasted memory that does you no good.

The "free" value beside "-/+ buffers/cache" is the one you should generally consider the "real" amount of free memory in the system. The "-m" flag asks for values in megabytes.

sar

The system activity reporter from the sysstat package: collects and reports statistics on CPU, memory, I/O, and more, including historical data. See the man page.
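
For example (assuming the sysstat package is installed):

    # CPU utilization, one sample per second, five samples:
    sar -u 1 5
    # Per-device disk activity over the same interval:
    sar -d 1 5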

gdb

Why is a debugger listed in performance analysis tools?

Because sometimes, attaching a debugger to a backend, interrupting the backend periodically to get a backtrace, and then trying to figure out what on earth it's up to is a helpful way to track down an issue.
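
A minimal sketch of that technique (the PID is a placeholder; get a real one from pg_stat_activity):

    # Attach non-interactively, print a backtrace of the backend, and detach:
    gdb -p 12345 -batch -ex bt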

Wireshark, tshark, and tcpdump

These tools are for monitoring and analysis of traffic on a network interface. If you can't figure out what's keeping your network interface pegged, they might help you figure out what data is being transmitted/received.
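
For example (the interface name and port are assumptions about your setup):

    # Capture PostgreSQL traffic to a file for later analysis in Wireshark:
    tcpdump -i eth0 -w pg.pcap tcp port 5432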

pktstat

A top-like utility for network interfaces. Doesn't, alas, show process IDs or names, but you can figure those out from source and destination port information. Handy for tracking down a backend that's flooding the interface with traffic.
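
For example (the interface name is an assumption about your setup):

    # Live, top-like display of traffic on eth0:
    pktstat -i eth0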

Linux-only tools

iostat

Provides summary information about I/O activity and load on block devices in the system. Doesn't identify the processes responsible for the load, but produces short, easily read output in real time. Like vmstat, it is most useful in continuous mode, e.g. "iostat 1".

See the man page for details.
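
For example (flags as in the sysstat version of iostat):

    # Extended per-device statistics, in megabytes, refreshed every second:
    iostat -x -m 1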

blktrace, blkparse and btrace

These are tools to get very low level information about the activity on a given block device, what process(es) are causing that activity, and what they're doing. It produces a huge amount of information, but can be filtered somewhat. It takes a bit of thought to interpret, but can be great for those "what the hell is thrashing my disk" moments.

Because of the background writer and WAL writer, it's not usually easy to see which PostgreSQL backends are keeping a disk busy with writes. It's still good for tracking down heavy read loads and figuring out which backend (and thus which query, via the pg_catalog.pg_stat_activity view) is responsible.
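
A minimal sketch, assuming your data directory lives on /dev/sda:

    # Live trace, with blkparse run for you:
    btrace /dev/sda
    # Or record now and analyze later:
    blktrace -d /dev/sda -o trace
    blkparse -i trace | less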

strace

strace is a useful tool that can attach to a process and report all the system calls that process makes, including arguments to those system calls. It's great for figuring out what files a process opens, or if it's waiting for something, etc.

Useful interpretation requires some C programming knowledge and some idea about POSIX APIs.

(Note that other systems have similar tools with different names - for example, some BSDs have truss).
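
For example (the PID is a placeholder for a backend from pg_stat_activity):

    # Follow a backend's system calls, with timestamps and time spent in each:
    strace -p 12345 -tt -T
    # Or just summarize call counts and time per syscall (Ctrl-C to stop):
    strace -c -p 12345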

oprofile

A system-wide sampling profiler. Probably not useful for general users, but a handy tool if something within Pg, or a library used by Pg, is running unexpectedly slowly and you don't know why.
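
A minimal sketch using the newer operf front end (the PID is a placeholder):

    # Sample a single backend until you press Ctrl-C, then report by symbol:
    operf --pid 12345
    opreport --symbols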

alt-sysrq-t

The "magic sysrq key". Mainly useful for tracking down processes mysteriously stuck in 'D' state in the kernel, ie hung in a system call, since it will output a stack trace of the kernel stack for all processes in the system. Mostly used when trying to figure out whether Pg is having problems due to a kernel bug, hardware issue, file system bug, etc.

Must be enabled with 'sysctl -w kernel.sysrq=1' before it can be used.
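
For example, you can trigger it without a console keyboard via /proc:

    sysctl -w kernel.sysrq=1
    # Dump kernel stack traces for all tasks:
    echo t > /proc/sysrq-trigger
    # The traces land in the kernel log:
    dmesg | tail -n 100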