General Bits is a column loosely based on the PostgreSQL mailing list
pgsql-general.
To find out more about the pgsql-general list and PostgreSQL, see
www.PostgreSQL.org.

Tracking the row count

[GENERAL] create triggers

20-Oct-2003

A trigger to maintain a row count is needed.
The exact row count is important to the
application, and count(*) is by its nature too slow.

The way to implement a proper row counter is to
create a trigger on the table which needs to
be counted. This trigger will increment or
decrement the count on insert or delete.

There are a couple of interesting issues with the
row counting implementation. The first is that the
counter trigger function can work for both insert and
delete and with any table by using trigger information in
plpgsql. The second is the issue of initializing the rowcount table.

Suppose you had the table mystuff and
set up the table rowcount to hold the
row count of many tables.
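
The original table definitions were not shown; a minimal sketch of what they might look like (all column names here are assumptions for illustration):

```sql
-- A hypothetical application table whose rows we want to count
CREATE TABLE mystuff (
   name   text NOT NULL,
   value  integer
);

-- One row per counted table: the table's name and its current row count
CREATE TABLE rowcount (
   table_name  text NOT NULL,
   total_rows  bigint
);
```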

Most people are familiar with the use of OLD and NEW variables
in trigger functions, but there are several other informational
variables available.

Name          Meaning
------------  --------------------------------------------------
TG_NAME       Name of the trigger
TG_WHEN       BEFORE or AFTER
TG_LEVEL      ROW or STATEMENT
TG_OP         INSERT, UPDATE or DELETE
TG_RELID      Relation OID of the table that fired the trigger
TG_RELNAME    Name of the table that fired the trigger
TG_NARGS      Number of arguments given to the trigger function
              in the CREATE TRIGGER statement
TG_ARGV[]     Text array of the arguments from the CREATE
              TRIGGER statement

Note that statement-level triggers are available only in 7.4 and forward.

The count_rows() trigger function uses TG_OP and TG_RELNAME.
TG_OP indicates whether the operation is an INSERT or a DELETE, and TG_RELNAME
supplies the table name under which the count is stored in the rowcount table.
This is the function:
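A sketch of such a function, assuming the rowcount table has columns table_name and total_rows (the quoted-string function body with doubled quotes matches the 7.4-era syntax, before dollar quoting existed):

```sql
CREATE OR REPLACE FUNCTION count_rows() RETURNS TRIGGER AS '
BEGIN
   IF TG_OP = ''INSERT'' THEN
      UPDATE rowcount
         SET total_rows = total_rows + 1
       WHERE table_name = TG_RELNAME;
   ELSIF TG_OP = ''DELETE'' THEN
      UPDATE rowcount
         SET total_rows = total_rows - 1
       WHERE table_name = TG_RELNAME;
   END IF;
   -- The return value of an AFTER row trigger is ignored;
   -- returning NULL is conventional.
   RETURN NULL;
END;
' LANGUAGE plpgsql;
```

Because the function consults only TG_OP and TG_RELNAME, the same function can be attached to any number of tables without modification.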

There are many ways the various TG variables can be used to
write generic triggers. Different courses of action can
be taken based on the exact definition of the trigger.
The data in the row can be accessed via NEW and OLD and
information about their datatypes is available if decisions
or actions need to be done based on the data type.

Now we will look at what is necessary to initialize
the trigger. If the trigger function and the tables
are all defined in the same transaction, then initialization
is not necessary, except for the creation of the row in
rowcount for the table in question. However, most of the
time if row counting functionality
is being added to an existing table, the base row count
must be ascertained.

For an existing table, the initialization must be done in a
single transaction. The target table is locked to prevent
updates during the initialization. Then the trigger is created,
and a row with the current row count is inserted into rowcount.
Once the initialization transaction is committed, your counter
is operational. Don't forget to test it!
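
Under the same assumed table and column names as above, the initialization might look like this:

```sql
BEGIN;
   -- Block concurrent writes so the count cannot drift during setup;
   -- SHARE ROW EXCLUSIVE still allows reads.
   LOCK TABLE mystuff IN SHARE ROW EXCLUSIVE MODE;

   CREATE TRIGGER countrows
      AFTER INSERT OR DELETE ON mystuff
      FOR EACH ROW EXECUTE PROCEDURE count_rows();

   -- Remove any stale entry, then record the current count
   DELETE FROM rowcount WHERE table_name = 'mystuff';
   INSERT INTO rowcount (table_name, total_rows)
      VALUES ('mystuff', (SELECT count(*) FROM mystuff));
COMMIT;
```

To test it, insert and delete a few rows and check that
SELECT total_rows FROM rowcount WHERE table_name = 'mystuff'
still agrees with SELECT count(*) FROM mystuff.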

Beta 5 of PostgreSQL v7.4 has now been released. All testers are encouraged
to download and use this version. Debian packages are in the experimental part
of the Debian archive.

NOTE: This version requires an initdb.

The decision to have a Beta 5 was made due to bug fixes which required the
initdb. With that type of major change it is not sensible to release
without a beta cycle.

We look forward to all of the features and new documentation that will
be available in RC1!

Contributors:
Marc G. Fournier scrappy at postgresql.org

Fast Port Reports for 7.4

[HACKERS] Call for port reports

25-Oct-2003

Over the last 24 hours or so, about 22 platforms have been certified
with PostgreSQL v7.4. The current list is available
on Bruce's site.

Rapid identification and correction of problems by Bruce, Peter Eisentraut,
and many others helped speed this process, as did all of the people involved
in building and running the regression tests on their platforms.

Knowing what sort of effort this takes in a commercial enterprise I am very
impressed with the cooperation, speed and good will invested in this testing.

A discussion of the degradation of query speed during vacuum led
to a Linux tuning solution. It is common for queries to slow
down during vacuum; however, this specific case showed a very large slowdown.
The initial measurements showed a query running normally in
25.52 msec ran in 2290.22 msec during vacuum.

Initial analysis looked at CPU speed; however, further investigation
suggested that the machine was I/O bound. If the system is near
I/O saturation, vacuum can push it over the edge.

The use of IDE drives affects this. It is generally agreed that
SCSI drives would be better; however, many people prefer
the less expensive IDE option.

Analysis using vmstat, paying particular attention
to the bi/bo (blocks in/blocks out) columns, was suggested to determine
whether there were actually I/O problems.

The solution pointed to the settings of the Linux utility
elvtune. This utility tunes the disk I/O elevator in Linux,
which reorders read/write requests according to disk sector
in order to reduce disk seeks. Different people reported
widely differing values from elvtune.

It was suggested that max_bomb_segments has been disabled and should
always be zero because of some inefficiencies in the elevator algorithm;
however, documentation on this change was not found in a cursory search.

The read_latency field changed between kernels 2.4.18-x and 2.4.20-x,
accounting for the differences in that field between RH8 and RH9.