In MySQL, JOIN, CROSS
JOIN, and INNER JOIN are syntactic
equivalents (they can replace each other). In standard SQL, they
are not equivalent. INNER JOIN is used with
an ON clause, CROSS JOIN
is used otherwise.

INNER JOIN and ,
(comma) are semantically equivalent in the absence of a join
condition: both produce a Cartesian product between the
specified tables (that is, each and every row in the first
table is joined to each and every row in the second table).

However, the precedence of the comma operator is less than
that of INNER JOIN, CROSS
JOIN, LEFT JOIN, and so on. If
you mix comma joins with the other join types when there is
a join condition, an error of the form Unknown
column 'col_name' in 'on
clause' may occur. Information about dealing with
this problem is given later in this section.

The conditional_expr used with
ON is any conditional expression of the
form that can be used in a WHERE clause.
Generally, the ON clause serves for
conditions that specify how to join tables, and the
WHERE clause restricts which rows to
include in the result set.

If there is no matching row for the right table in the
ON or USING part in a
LEFT JOIN, a row with all columns set to
NULL is used for the right table. You can
use this fact to find rows in a table that have no
counterpart in another table:

SELECT left_tbl.*
FROM left_tbl LEFT JOIN right_tbl ON left_tbl.id = right_tbl.id
WHERE right_tbl.id IS NULL;

The
USING(column_list)
clause names a list of columns that must exist in both
tables. If tables a and
b both contain columns
c1, c2, and
c3, the following join compares
corresponding columns from the two tables:

a LEFT JOIN b USING (c1, c2, c3)

The NATURAL [LEFT] JOIN of two tables is
defined to be semantically equivalent to an INNER
JOIN or a LEFT JOIN with a
USING clause that names all columns that
exist in both tables.

RIGHT JOIN works analogously to
LEFT JOIN. To keep code portable across
databases, it is recommended that you use LEFT
JOIN instead of RIGHT JOIN.

The { OJ ... } syntax shown in the join
syntax description exists only for compatibility with ODBC.
The curly braces in the syntax should be written literally;
they are not metasyntax as used elsewhere in syntax
descriptions.

You can use other types of joins within { OJ ...
}, such as INNER JOIN or
RIGHT OUTER JOIN. This helps with
compatibility with some third-party applications, but is not
official ODBC syntax.

The parser does not permit nested { OJ ...
} constructs (which are not legal ODBC syntax,
anyway). Queries that use such constructs should be
rewritten.

STRAIGHT_JOIN is similar to
JOIN, except that the left table is
always read before the right table. This can be used for
those (few) cases for which the join optimizer processes the
tables in a suboptimal order.

In the first SELECT
statement, column j appears in both
tables and thus becomes a join column, so, according to
standard SQL, it should appear only once in the output, not
twice. Similarly, in the second SELECT statement, column
j is named in the
USING clause and should appear only once
in the output, not twice.

First, coalesced common columns of the two joined
tables, in the order in which they occur in the first
table

Second, columns unique to the first table, in order in
which they occur in that table

Third, columns unique to the second table, in order in
which they occur in that table

The single result column that replaces two common columns is
defined using the coalesce operation. That is, for two
t1.a and t2.a the
resulting single join column a is defined
as a = COALESCE(t1.a, t2.a), where:

COALESCE(x, y) = (CASE WHEN x IS NOT NULL THEN x ELSE y END)

If the join operation is any other join, the result columns
of the join consist of the concatenation of all columns of
the joined tables.

A consequence of the definition of coalesced columns is
that, for outer joins, the coalesced column contains the
value of the non-NULL column if one of
the two columns is always NULL. If
neither or both columns are NULL, both
common columns have the same value, so it doesn't matter
which one is chosen as the value of the coalesced column. A
simple way to interpret this is to consider that a coalesced
column of an outer join is represented by the common column
of the inner table of a JOIN. Suppose
that the tables t1(a, b) and
t2(a, c) have the following contents:

A USING clause can be rewritten as an
ON clause that compares corresponding
columns. However, although USING and
ON are similar, they are not quite the
same. Consider the following two queries:

a LEFT JOIN b USING (c1, c2, c3)
a LEFT JOIN b ON a.c1 = b.c1 AND a.c2 = b.c2 AND a.c3 = b.c3

With respect to determining which rows satisfy the join
condition, both joins are semantically identical.

With respect to determining which columns to display for
SELECT * expansion, the two joins are not
semantically identical. The USING join
selects the coalesced value of corresponding columns,
whereas the ON join selects all columns
from all tables. For the USING join,
SELECT * selects these values:

COALESCE(a.c1, b.c1), COALESCE(a.c2, b.c2), COALESCE(a.c3, b.c3)

For the ON join, SELECT
* selects these values:

a.c1, a.c2, a.c3, b.c1, b.c2, b.c3

With an inner join, COALESCE(a.c1,
b.c1) is the same as either
a.c1 or b.c1 because
both columns will have the same value. With an outer join
(such as LEFT JOIN), one of the two
columns can be NULL. That column is
omitted from the result.

The statement fails with an Unknown column 'i3' in
'on clause' error because i3 is
a column in t3, which is not an operand
of the ON clause. To enable the join to
be processed, rewrite the statement as follows:

SELECT * FROM t1 JOIN t2 JOIN t3 ON (i1 = i3);

JOIN has higher precedence than the comma
operator (,), so the join expression
t1, t2 JOIN t3 is interpreted as
(t1, (t2 JOIN t3)), not as ((t1,
t2) JOIN t3). This affects statements that use an
ON clause because that clause can refer
only to columns in the operands of the join, and the
precedence affects interpretation of what those operands
are.

The JOIN takes precedence over the comma
operator, so the operands for the ON
clause are t2 and t3.
Because t1.i1 is not a column in either
of the operands, the result is an Unknown column
't1.i1' in 'on clause' error.

To enable the join to be processed, use either of these
strategies:

Group the first two tables explicitly with parentheses
so that the operands for the ON
clause are (t1, t2) and
t3:

SELECT * FROM (t1, t2) JOIN t3 ON (t1.i1 = t3.i3);

Avoid the use of the comma operator and use
JOIN instead:

SELECT * FROM t1 JOIN t2 JOIN t3 ON (t1.i1 = t3.i3);

The same precedence interpretation also applies to
statements that mix the comma operator with INNER
JOIN, CROSS JOIN, LEFT
JOIN, and RIGHT JOIN, all of
which have higher precedence than the comma operator.

A MySQL extension compared to the SQL:2003 standard is that
MySQL permits you to qualify the common (coalesced) columns
of NATURAL or USING
joins, whereas the standard disallows that.

(Background: This database is used to keep track of scores for students in my classes.)

So in this case, I have three tables, one has student's "codename" (as posting their real name on the web is a no-no) and an index (there is more data in this table, but this is all you really need to know.) Then there's a table with the assignments, containing the assignment name, and an index for each assignment. Finally, there is a scores table, which has for each paper I get turned in, a student_id (releated to the student index) an act_id (related to the assignments index) and a score.

So here's how the sql statement ended up:SELECT names.codename,s1.score AS "Score1", s1.comment AS "Comments1",s2.score AS "Score2", s2.comment AS "Comments2",SUM(st.score) AS "Total"FROM students names LEFT JOIN scores s1 ON s1.act_id=1 AND names.id=s1.student_id LEFT JOIN scores s2 ON s2.act_id=2 AND names.id=s2.student_id LEFT JOIN scores st ON names.id=st.student_idWHERE names.codename <> ''GROUP BY names.codenameORDER BY names.codename;

As you can see, for each activity, I need to add another left join, but it looks exactly like the last one, thus it is easy to build through a program like php. I hope this helps someone out.

Posted by
Thomas Mayer
on
April 21, 2003

I use left joins to generate sums on one table using different conditions:t1 to make sure that ALL grouped records are shownt(n+1) for use per conditionand as mentioned above, the JOIN condition must be used as well for the primary key AND for the condtion per sum!

-- returns correct sumsselect t1.mygroup, sum(t2.value) as cond_1, sum(t3.value) as cond_2, sum(t4.value) as cond_3from testtable t1 left join testtable t2 on t1.mykey=t2.mykey and t2.cond=1 left join testtable t3 on t1.mykey=t3.mykey and t3.cond=2 left join testtable t4 on t1.mykey=t4.mykey and t4.cond=3group by 1 order by 1;

Martin, your comment & others helped emensely!Here's a left-join select that also has a regular join to another table.I want to get all Plans (& associated SubType info), but also see which plans user 13 is signed upfor, but only if the expire_date hasn't passed. This select will show all Plan & SubType info,but user-info only if the user is signed up,and the expire-date hasn't passed.

This is an example of using a left to get lookup values from a table twice. The reason that an outer join was used instead of an inner join, was that in this case, there may be values that are null inside of the degree table.

And now the 'id' column will be preserved since the alias is evaluated *after* the LEFT JOIN.

Posted by
on
December 14, 2004

Note that table_reference in the above grammar may also be a "anonymous table" resulting from a join, as shown below. I don't know if this is intended or by occasion, but for me it works (MySQL 4.0.22):

This enables you to OUTER JOIN with the results of an INNER JOIN within one query.

Posted by
Ben Griffin
on
June 30, 2005

Scenario: A group of companies have a set of publications. Any company in the group can subscribe to any number of publications.Requirement: Create a query to generate a list of companies, showing the count of publications and subscriptions for each company.

This checks for an event that should happen yearly, depending on other factors - it could be applied to checking for any event that should occur at regular intervals - ie monthly or weekly - with adjustments to the code for selecting max_calves.

This is a fairly complex example, with lots of joins, aliases and grouping... Its aim is to extract a list of cows which have not calved every year in a given period. It accounts for cows which have not been in the database for the entire period, are not currently on our property (locations.local==1), or which are too young to have calves for the entire period (they can't calve before 2 years old).There are many actions that signal entry/exit of an animal - eg birth/buy/sell/death. many (or none) of them may be recorded in the history table. As a fallback, animal.yob is the year of birth of the animal.Cows and their calves are all stored as "animals" in the animals table (the mob field distinguishes between them).

The query returns the number of calves they did have in the time frame (num_calves) and the number of calves they should have had (max_calves).

The SQL to make it happen:The original formatting is easier to read, but the leading spaces are not preserved here (the site won't let me use html to get it either)$xxx are the input variables:$start_year, $end_year: date range to check (inclusive)$all_animals: 0= alive and on property, 1= everything$nomiss: 0= those that missed a year, 1=those that didn't miss$allow_miss: number of misses considered acceptable.

select a1.*,aquire.date,disposal.date,count(a2.id) num_calves, (IF(IsNull(disposal.date),$end_year, IF(DATE_FORMAT(disposal.date,'%Y') < $end_year,DATE_FORMAT(disposal.date,'%Y'),$end_year) - IF(DATE_FORMAT(aquire.date,'%Y')>$start_year, IF(DATE_FORMAT(aquire.date,'%Y')>(a1.yob+2),DATE_FORMAT(aquire.date,'%Y'),a1.yob+2),IF(a1.yob>($start_year-2),a1.yob+2,$start_year) ) +1) as max_calves from animals as a1 left join animals as a2 on a1.id=a2.mother and a2.yob>=$start_year left join history as aquire on a1.id=aquire.id and aquire.action in (select id from actions where type='aquire') left join history as disposal on a1.id=disposal.id and disposal.action in (select id from actions where type='disposal') where a1.mob in (select id from mobs where mobtype in (select id from mob_types where name='Cow')) and (($all_animals>0) or a1.location in (select id from locations where alive=1 and local=1)) group by a1.id having IF($nomiss>0, num_calves>=(max_calves-$allow_miss), num_calves<(max_calves-$allow_miss)) order by a1.location,cast(a1.tag1 as unsigned),a1.tag1;

Speed: with about 6000 records in the animals table, and appropriate indexes, this returns in under half a second on a modern machine (versus several seconds without indexes)

I still need to add support for twins (two events in same year should be counted as one), but I'm not sure how to do that yet (any suggestions?).

The result of a full outer join betwee tables A and B includes:- rows from the result of the inner join- rows from A that don't have corresponding rows in B- rows from B that don't have corresponding rows in A

Due to the fact that the first union represents a left outer join, the statement can be simplified:

SELECT *FROM A LEFT JOIN B ON A.id = B.idUNION ALLSELECT *FROM A RIGHT JOIN B ON A.id = B.idWHERE A.id IS NULL

Posted by
on
March 16, 2007

The example posted on September 12 2006 about how to emulate FULL OUTER JOIN using UNION has a subtle problem. Assume you have 3 tables, each with one single colum "id" and 4 rows.t0 contains 1,3,5,7,t1 contains 2,3,6,7,t2 contains 4,5,6,7(ti contains j iff 2^i xor j = 1).

Scott Atkins on January 23 2003 posted about pivoting the rows in a table into the columns, by performing a separate join for each value based column. I've found it easier to use a single join. Then use the CASE statement in the SELECT clause to separate the columns. Given Scott's example, I would use the following syntax: