Fun with Outer Joins

I’ve been working with OUTER JOIN’s for almost twenty years now and still stumble every now and again. I’ve learned to be very careful in how I use them and in fact I’ve limited how I use them at times to make sure that I don’t create more problems than I’m trying to solve. I recently had the opportunity to help out one of my co-workers with an OUTER JOIN and in the process figured out a big part of my confusion over the years. Below I’m going to try to explain what they are and how to use them in clear easy steps.

Let’s start with the definition of an OUTER JOIN. When a LEFT OUTER JOIN is specified you want all of the rows from the “left” or “first” table and only the data from those rows that match on the “right” or “second” table. On any rows where the “right” table doesn’t have a match the “left” then in the results the columns from that table are NULL. Below I have an example of an OUTER JOIN with a mistake I frequently see along with a query that has a very strange result. Hopefully there is enough of an explanation for everyone to understand the whys of this structure.

I’m going to use LEFT OUTER JOINs in my examples below because they are the most common. All of the queries could certainly be re-written as RIGHT OUTER JOINs. First some setup.

They come to me and ask something along the lines of “Now wait just a minute. What happened to Dr. Sprite and Dr. 7-up? They have tenure and I used a LEFT OUTER JOIN!”. OK, that's not really what happens. Let’s face it, they come to me and say “It’s broke, why?”

Looking good so far but we still have entries for classes in 2010. And we saw before if we add

AND Class.ClassYear >= 2011

to the WHERE clause we lose Dr Sprite and Dr 7-up.

“Why is that! “ I hear some of you cry. The rest are split between 75% who already know the answer, 15% who don’t care, and the remaining 10% are still trying to figure out what an OUTER JOIN is and why I’m going on about it.

If you look at the last output you will notice that rows 6 & 7 have NULLs for any field from the Class table. As we all know, the expression NULL = any value is always false. (I am NOT getting into ANSI_NULLS discussions here.)

Finally I was starting to get somewhere. By using the FULL JOIN I was now able to see the larger picture. You will note a fair number of the classes that had professors are not actually matched up. For example rows 8-11 have a ProfessorId of 4, which we might reasonably expect to match up with row 6, “Mr Pepper”. Why aren’t they?

The reason for this is the same reason the final “correct” query worked. It lies in the difference between the ON clause and the WHERE clause. The WHERE clause eliminates rows from the result set, or really determines which results to include, but the net result is the same. The ON clause is the join between the two tables.

In the case of a LEFT OUTER JOIN the ON clause will have absolutely no effect on what rows from the left hand table will get returned. For some reason I had always persisted in thinking of the ON clause as having two separate parts. The part that does the join and the part that eliminates rows from the result set. In reality the ON clause is of course one piece that only determines the JOIN between the referenced tables. That is it. No more, no less. The WHERE clause is the part that determines which rows are returned and which aren’t. I should probably mention that the ON clause can increase the number of rows returned in a one to many or many to many situation.

Back to the LEFT OUTER JOIN version of the query.

Note again that the ON clause only affects how the two tables are joined together. In other words, I only want to join the two tables together if the Id’s match, Tenure is TRUE and ClassYear is 2011 or greater. The LEFT OUTER JOIN will eliminate rows 8 (see previous image) and above from the result set since there is no match from Professor (the left table). Then since there is no WHERE clause no other rows are eliminated. Leaving us not only with the rows we expected (1-5) but with the previously confusing rows 6 and 7.

In this particular case the WHERE clause gets rid of any professors who don’t have tenure. The ON clause is only going to join the two tables together if the ids match and the ClassYear >= 2010. Finally the LEFT OUTER JOIN keeps all of the professors that meet the WHERE clause, and gets rid of any classes that don’t have a match with Professor.

OUTER JOINs are very useful and highly utilized tools. Unfortunately they can also cause a great deal of confusion in even relatively simple queries. If you can keep in mind the true use of the ON clause (the join) VS the true use of the WHERE clause (eliminating rows from the result set) then hopefully some of the confusion can be eliminated. Of course it should also be said that the difference between ON and WHERE is the same for all types of JOINs, it just isn’t always as obvious as it can be with OUTER JOINs.

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands
of articles and SQL scripts, a library of free eBooks, a weekly database news roundup,
a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals
that makes it such a success.