This chapter is from the book

This chapter is from the book

The preceding chapter described scalar functions, which operate on individual row values. This chapter introduces SQLs aggregate functions, or set functions, which operate on a group of values to produce a single, summarizing value. You apply an aggregate to a set of rows, which may be:

All the rows in a table

Only those rows specified by a WHERE clause

Those rows created by a GROUP BY clause

No matter how many rows the set contains, an aggregate function returns a single statistic: a sum, minimum, or average, for example.

In this chapter, Ill also cover the SELECT statements GROUP BY clause, which groups rows, and HAVING clause, which filters groups.

In Table 6.1, expr often is a column name, but it also can be a literal, function, or any combination of column names, literals, and functions coupled by operators.

SUM() and AVG() work with only numeric data types. MIN() and MAX() work with characters, numeric, and datetime datatypes. COUNT(expr) and COUNT(*) workwith all data types.

All aggregate functions except COUNT(*) ignore nulls. (You can use COALESCE() in an aggregate function argument to substitute a value for a null; see Checking for Nulls with COALESCE() in Chapter 5.)

COUNT(expr) and COUNT(*) never returnnull but return either a positive integer or zero. The other aggregate functions return null if the set contains no rows or contains rows with only nulls.

Use the DISTINCT keyword to aggregate distinct values; see Aggregating Distinct Values with DISTINCT later in this chapter.

Aggregate functions often are used with the GROUP BY clause; see Grouping Rows with GROUP BY later in this chapter.

Use a WHERE clause to restrict the rows used in aggregate calculations; see Filtering Rows with WHERE in Chapter 4.

Default column headings for aggregate expressions vary by DBMS; use AS to name the result column. See Creating Column Aliases with AS in Chapter 4.

An aggregate expression cant appear in a WHERE clause. If you want to find the title of the book with the highest sales, you cant use:

SELECT title_id --Illegal
FROM titles
WHERE sales = MAX(sales);

You cant mix nonaggregate (row-by-row) and aggregate expressions in a SELECT clause. A SELECT clause must contain either all nonaggregate expressions or all aggregate expressions. If you want to find the title of the book with the highest sales, you cant use:

SELECT title_id, MAX(sales)
FROM titles; --Illegal

The one exception to this rule is that you can mix nonaggregate and aggregate expressions for grouping columns (see Grouping Rows with GROUP BY later in this chapter):

SELECT type, SUM(sales)
FROM titles
GROUP BY type; --Legal

You can use more than one aggregate expression in a SELECT clause:

SELECT MIN(sales), MAX(sales)
FROM titles; --Legal

You cant nest aggregate functions:

SELECT SUM(AVG(sales))
FROM titles; --Illegal

You can use aggregate expressions in sub-queries. This statement finds the title of the book with the highest sales:

You cant use subqueries (see Chapter 8) in aggregate expressions: AVG(SELECT price FROM titles) is illegal.

TIPS

Oracle lets you nest aggregate expressions in GROUP BY queries. The following example calculates the average of the maximum sales of all book types. Oracle evaluates the inner aggregate MAX(sales) for the grouping column type and then aggregates the results again:

SELECT AVG(MAX(sales))
FROM titles
GROUP BY type; --Legal in Oracle

MySQL 4.0 and earlier versions dont support subqueries.

DBMSes provide additional aggregate functions to calculate other statistics, such as the standard deviation; search your DBMS documentation for aggregate functions or group functions.