Production rule systems are conceptually simple, but there are
many subtle points involved in actually using them. Some of these
points and the theoretical foundations of the Postgres rule system can be found in
[Stonebraker et al, ACM,
1990].

Some other database systems define active database rules. These
are usually stored procedures and triggers and are implemented in
Postgres as functions and
triggers.

The query rewrite rule system (the "rule system" from now on) is
totally different from stored procedures and triggers. It modifies
queries to take rules into consideration, and then passes the
modified query to the query optimizer for execution. It is very
powerful, and can be used for many things such as query language
procedures, views, and versions. The power of this rule system is
discussed in [Ong and Goh,
1990] as well as [Stonebraker et al, ACM, 1990].

To understand how the rule system works it is necessary to know
when it is invoked and what it's input and results are.

The rule system is located between the query parser and the
optimizer. It takes the output of the parser, one querytree, and
the rewrite rules from the pg_rewrite
catalog, which are querytrees too with some extra information, and
creates zero or many querytrees as result. So it's input and output
are always things the parser itself could have produced and thus,
anything it sees is basically representable as an SQL statement.

Now what is a querytree? It is an internal representation of an
SQL statement where the single parts
that built it are stored separately. These querytrees are visible
when starting the Postgres backend
with debuglevel 4 and typing queries into the interactive backend
interface. The rule actions in the pg_rewrite system catalog are also stored as
querytrees. They are not formatted like the debug output, but they
contain exactly the same information.

Reading a querytree requires some experience and it was a hard
time when I started to work on the rule system. I can remember that
I was standing at the coffee machine and I saw the cup in a
targetlist, water and coffee powder in a rangetable and all the
buttons in a qualification expression. Since SQL representations of querytrees are sufficient
to understand the rule system, this document will not teach how to
read them. It might help to learn it and the naming conventions are
required in the later following descriptions.

When reading the SQL
representations of the querytrees in this document it is necessary
to be able to identify the parts the statement is broken into when
it is in the querytree structure. The parts of a querytree are

the commandtype

This is a simple value telling which command (SELECT,
INSERT, UPDATE, DELETE) produced the parsetree.

the rangetable

The rangtable is a list of relations that are used in the
query. In a SELECT statement that are the relations given after
the FROM keyword.

Every rangetable entry identifies a table or view and tells
by which name it is called in the other parts of the query. In
the querytree the rangetable entries are referenced by index
rather than by name, so here it doesn't matter if there are
duplicate names as it would in an SQL statement. This can happen after the
rangetables of rules have been merged in. The examples in this
document will not have this situation.

the resultrelation

This is an index into the rangetable that identifies the
relation where the results of the query go.

SELECT queries normally don't have a result relation. The
special case of a SELECT INTO is mostly identical to a CREATE
TABLE, INSERT ... SELECT sequence and is not discussed
separately here.

On INSERT, UPDATE and DELETE queries the resultrelation is
the table (or view!) where the changes take effect.

the targetlist

The targetlist is a list of expressions that define the
result of the query. In the case of a SELECT, the expressions
are what builds the final output of the query. They are the
expressions between the SELECT and the FROM keywords (* is just
an abbreviation for all the attribute names of a relation).

DELETE queries don't need a targetlist because they don't
produce any result. In fact the optimizer will add a special
entry to the empty targetlist. But this is after the rule
system and will be discussed later. For the rule system the
targetlist is empty.

In INSERT queries the targetlist describes the new rows that
should go into the resultrelation. Missing columns of the
resultrelation will be added by the optimizer with a constant
NULL expression. It is the expressions in the VALUES clause or
the ones from the SELECT clause on INSERT ... SELECT.

On UPDATE queries, it describes the new rows that should
replace the old ones. Here now the optimizer will add missing
columns by inserting expressions that put the values from the
old rows into the new one. And it will add the special entry
like for DELETE too. It is the expressions from the SET
attribute = expression part of the query.

Every entry in the targetlist contains an expression that
can be a constant value, a variable pointing to an attribute of
one of the relations in the rangetable, a parameter or an
expression tree made of function calls, constants, variables,
operators etc.

the qualification

The queries qualification is an expression much like one of
those contained in the targetlist entries. The result value of
this expression is a boolean that tells if the operation
(INSERT, UPDATE, DELETE or SELECT) for the final result row
should be executed or not. It is the WHERE clause of an
SQL statement.

the others

The other parts of the querytree like the ORDER BY clause
arent of interest here. The rule system substitutes entries
there while applying rules, but that doesn't have much to do
with the fundamentals of the rule system. GROUP BY is a special
thing when it appears in a view definition and still needs to
be documented.