Aggregate functions compute a single
result from a set of input values. The built-in aggregate
functions are listed in Table
9-43 and Table
9-44. The special syntax considerations for aggregate
functions are explained in Section 4.2.7.
Consult Section 2.7 for
additional introductory information.

Table 9-43. General-Purpose Aggregate
Functions

Function

Argument Type(s)

Return Type

Description

array_agg(expression)

any

array of the argument type

input values, including nulls, concatenated into an
array

avg(expression)

smallint, int, bigint, real, double precision,
numeric, or interval

numeric for any integer-type
argument, double precision for a
floating-point argument, otherwise the same as the
argument data type

It should be noted that except for count, these functions return a null value when
no rows are selected. In particular, sum of no rows returns null, not zero as one
might expect, and array_agg returns
null rather than an empty array when there are no input rows. The
coalesce function can be used to
substitute zero or an empty array for null when necessary.

Note: Boolean aggregates bool_and and bool_or correspond to standard SQL
aggregates every and
any or some. As for any and some,
it seems that there is an ambiguity built into the standard
syntax:

SELECT b1 = ANY((SELECT b2 FROM t2 ...)) FROM t1 ...;

Here ANY can be considered
either as introducing a subquery, or as being an aggregate
function, if the subquery returns one row with a Boolean
value. Thus the standard name cannot be given to these
aggregates.

Note: Users accustomed to working with other SQL
database management systems might be disappointed by the
performance of the count
aggregate when it is applied to the entire table. A query
like:

SELECT count(*) FROM sometable;

will be executed by PostgreSQL using a sequential scan of
the entire table.

The aggregate functions array_agg, string_agg, and xmlagg, as well as similar user-defined
aggregate functions, produce meaningfully different result values
depending on the order of the input values. This ordering is
unspecified by default, but can be controlled by writing an
ORDER BY clause within the aggregate
call, as shown in Section 4.2.7.
Alternatively, supplying the input values from a sorted subquery
will usually work. For example:

SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;

But this syntax is not allowed in the SQL standard, and is not
portable to other database systems.

Table
9-44 shows aggregate functions typically used in statistical
analysis. (These are separated out merely to avoid cluttering the
listing of more-commonly-used aggregates.) Where the description
mentions N, it means the number
of input rows for which all the input expressions are non-null.
In all cases, null is returned if the computation is meaningless,
for example when N is zero.

Table 9-44. Aggregate Functions for
Statistics

Function

Argument Type

Return Type

Description

corr(Y, X)

double precision

double precision

correlation coefficient

covar_pop(Y, X)

double precision

double precision

population covariance

covar_samp(Y, X)

double precision

double precision

sample covariance

regr_avgx(Y, X)

double precision

double precision

average of the independent variable (sum(X)/N)

regr_avgy(Y, X)

double precision

double precision

average of the dependent variable (sum(Y)/N)

regr_count(Y, X)

double precision

bigint

number of input rows in which both expressions are
nonnull

regr_intercept(Y, X)

double precision

double precision

y-intercept of the least-squares-fit linear equation
determined by the (X,
Y) pairs

regr_r2(Y, X)

double precision

double precision

square of the correlation coefficient

regr_slope(Y, X)

double precision

double precision

slope of the least-squares-fit linear equation
determined by the (X,
Y) pairs