Introducing Unions

Added to MySQL in version 4.0 is the UNION. UNIONs are like JOINs in that they work on multiple tables (or the same table multiple times). With a JOIN, the returned results are normally based upon a condition, where a column in one table relates to a column in another. With a UNION, no such relationship is necessary, but the selected data from both tables should be of the same type. Specifically, the nth column selected in the first query should be of the same type as the nth column selected in the second query.

The basic syntax of a UNION is

SELECTquery1 UNION SELECTquery2

Assuming you have the two dummy tables shown in Figure 10.22, you can see the result of the following UNION in Figure 10.23.

Figure 10.22. I'll use these two dummy tables to quickly demonstrate how UNIONs behave.

Figure 10.23. The contents of two unrelated columns from two unrelated tables are merged into one result set.

SELECT a_string FROM table1 UNION SELECT this_string FROM table2

You should notice a couple of things. First, the column names from the first table are used in the results. Second, any duplicate records will be removed (cat only appears once even though it's in both tables). This is true unless you add the ALL keyword (Figure 10.24):

Figure 10.24. With a UNION ALL query, duplicates are not removed (compare with Figure 10.23).

SELECT a_string FROM table1
UNION ALL
SELECT this_string FROM table2;

I'll play with some examples in the following steps, but also see the sidebar for a discussion as to when UNIONs are most useful.

When to Use UNIONs

Unions aren't needed often, but they can be a lifesaver. A common example of when to use them is if you have a query where two unrelated conditions could be met:

SELECT * FROM tablename WHERE col1='this value' OR col2='that value'

Such a query will work, but inefficiently (at least prior to MySQL 5.0). Instead you could use:

Figure 10.25. This basic UNION retrieves a list of dates and times.

The first SELECT retrieves every expense_date. The second retrieves every invoice_date. These are unrelated values but of the same type. The UNION will return them all together, with exact duplicates removed.

3.

Show the same dates formatted (Figure 10.26).

Figure 10.26. A side effect of formatting the returned dates is that five fewer (non-distinct) records are returned (compare with Figure 10.25). This occurred because the formatting removed the time part of the dates from consideration.

This alteration on the previous query uses the DATE_FORMAT() function to also format the results. This has the added effect of removing some records that were returned by the other query (because they happened on the same date but not at the same time).

4.

Show all of the accounting events that took place this year, including any paid invoices (Figure 10.27).

Figure 10.27. This query uses conditionals in each WHERE clause to limit the dates to the current year. It also performs the UNION on three query results.

UNIONs, like JOINs, can involve multiple queries and tables. In other words, you can UNION two or more query results. To do so, you just continue on with the UNION syntax (SELECTqueryUNIONSELECTquery2UNIONSELECTquery3).

Each SELECT query differs from the other ones in two ways. First, I've changed the formatting of the date so that I can sort it better. If I sorted the dates as they were formatted before, they would be sorted alphabetically, not chronologically. Second, each query uses a WHERE clause to restrict the returned results to only those that took place in the current year.

Finally, all of the results are ordered by the formatted date. Parentheses are used to clarify each query and to make clear that the ORDER BY applies to the whole UNION.

Tips

You can apply a LIMIT clause to a UNION result, but you'll need to use parentheses to make clear where the LIMIT applies. Take these two slightly different examples:

The first query returns every row from the first table unioned with five rows from the second table. The second query takes everything from the first table unioned with everything from the second, and then returns only five records from that whole result.

You can also use ORDER BY clauses with your UNIONs. You can use them on either or both queries (within parentheses as with the LIMIT clauses) or use ORDER BY on the whole UNION (outside of any parentheses). If you do the former, you must also use a LIMIT clause or else the ORDER BY will have no effect. If you do the latter, you cannot use the tablename.columnname syntax. Instead, define an alias within the appropriate query and then refer to it. Step 4 (and Figure 10.27) is an example of a UNION with an ORDER BY.

An Introduction to Subqueries

Added to MySQL in version 4.1 is the ability to do subqueries, specifically subSELECT queries. The premise is simple: part of a query is based upon the results of another query.

The knock against a subSELECT is that you can often accomplish the same thing using a UNION or a more efficient JOIN. For this reason, and because the MySQL manual has a pretty good discussion of the topic, I'm not going to demonstrate the subquery in this book.

If you get bored with all of the other SQL you've learned, or otherwise think you might need to use subqueries, check out the manual or the handful of articles available online for the syntax and usage. But before going hog wild with subqueries, do some investigating to see if a JOIN isn't what you actually need. (You can use EXPLAIN to compare how MySQL executes a query.)