Proposal

This page will contain information about the best way to install and maintain a Postgresql database, including environment variables, paths and other relevant stuff.

Fell free to add your suggestions and knowledge to make it a 'hands on' resorce.

Self compiled vs. package distributed

Before installing, consider whether to use packages distributed with your operating system or whether to roll your own. (Compiling PostgreSQL on Windows might be a difficult task. It's quite easy on Linux/Unix boxes.)

Let's compare using a pre-built package and compiling PostgreSQL yourself. (Note: This is a rather Linux-centric view. For Windows, you'll likely want to use the binary packages provided for each release.)

Using pre-built package from distribution

Compiling yourself.

Very easy to install - just use your package manager.

You might need to install gcc and some development packages just for building PostgreSQL.

The packages might be out of date or new minor versions might not become available frequently.

You are free to use the latest stable version and perform upgrades at your will.

The package management knows about the PostgreSQL installation and will update it.

Your package management doesn't know anything about the installation. Dependent libraries might get uninstalled or replaced by newer, incompatible versions. (Note: This is rather unlikely. I've never seen it happen. PostgreSQL doesn't depend on any strange or fast-evolving packages.)

Issues

UUID: There is a problem with UUID library in Open Solaris 200911. If some one have a workaround in this, please post it.

Multiple Versions on the same host

If you have to install multiple PostgreSQL versions at the same host, compile from source and call configure like this:

./configure --prefix=/opt/postgresql-8.2.11 --with-pgport=8200

That way, you never need to worry what version you are talking with - you just look at the port number.

Other way is changing port in postgresql.conf. Beware of that if you have am own init script, remeber to change values of PGDATA and PGUSER.

Making sure it starts up at system boot time

TODO: Provide a default init script (if there's not already one in contrib/).

Recommended values to be changed in big servers

In Linux, the SHMMAX value can often be set rather low, especially in older 32-bit distributions. Depending on your PostgreSQL configuration, you might need to tweak the values of SHMMAX and/or SHMALL.

How to calculate? One of the equations is: (FIXME: does this formula refer to SHMMAX?)

250kb + 8.2kb * shared_buffers +14.2kb * max_connections

The SHMMAX variable controls the maximum amount of memory to be allocated for shared memory use. If you try to assign high values for e.g. the shared_buffers GUC in PostgreSQL without adjusting SHMMAX, you might see an error message in Postgres' log like " ... Failed system call was shmget ... usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX parameter", and you'll have to adjust SHMMAX upward accordingly.

Directory Locations Recommended

WAL Directory

Write Ahead Logging (WAL) is a critical part of Postgres' operation. Due to Postgres' use of the WAL to ensure data consistency, the WAL will receive significant I/O activity. Especially if your PGDATA directory isn't located on a capable RAID array, you might consider relocating the WAL (pg_xlog directory) to a separate disk to ease I/O load on the rest of the database. The WAL should be written in perfectly sequential fashion, so the I/O savings from relocating this directory can be substantial. However, be aware of additional data consistency risks you will take on by relocating the WAL.

(FIXME: awkward and vague paragraph) Configuration: If you have the PGDATA in very different location, you maybe want to have a /etc/postgresql/data<number>. Or, if you have several versions for testing purposes you maybe want to have 'debian' like tree (/etc/postgresql/<version>/<pgdata_number>). The default way is have config files inside the PGDATA.

See here for information on offloading various PostgreSQL data onto different drives.

Versioning sql scripts and configuration files

As you are doing right now (versioning the sql scripts), other best practice is to version the configuration files. Not only a simple versioning, remember that you have several envoronments (Development, Test, Production).

Other good practice, is to have versioned the DBA modifications in a separated script (SET STORAGE modifications, special indexes and rules, etc).

Backup and Recovery strategies

Users athentications

One recommended installation for access by host mode is the md5 method for encrypted password. We can draw something like this:

host all all 192.168.1.0/24 md5

This mode, reduces the the access to the IP addreses that are included in 192.168.1.x and using the correct password for the user.
Adding to this restriction, you must remember that in the postgres.conf you should modify listen_addresses variable listen_addresses.

Remember that you can create users wiht VALID UNTIL option. When you create a user you can calculate the timestamp with something like this:

SELECT (CURRENT_DATE+1)::timestamp;

You might replace 1 with the number of days that you want to enable the user(7=1 week, 30=month, etc.). Then, copy result in the ALTER or CREATE statement.

Remember: The first step before change trust for other method based in passwords, you should assign one to the user.

Changes in the pg_hba.conf only need reload signal. So, there is no need downtime.

You can create superusers for the databases, but remember that if youwant to restrict the access to the databases, the superusers still have permissions to drop the other BD's and other mainteneance tasks. Try to reduce the number of superusers or almost one.