This will select all employees if there are any subemployees (and conversely it will select no employees if there are no subemployees).
The IN clause establishes a relationship between the two tables. The EXISTS clause does not require any relationship between the two tables.
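To make that difference concrete, here is a minimal sketch using SQLite via Python's sqlite3 module. The table and column names (`employees`, `subemployees`, `employee_id`) are hypothetical, chosen only to illustrate the point: an EXISTS with no correlating predicate is true for every outer row whenever the subquery table is non-empty, while IN (or a correlated EXISTS) relates the two tables through the compared column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (id INTEGER, name TEXT);
CREATE TABLE subemployees (employee_id INTEGER);
INSERT INTO employees VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cat');
INSERT INTO subemployees VALUES (1);
""")

# Uncorrelated EXISTS: no relationship to the outer row, so it is true
# for every employee as long as subemployees has at least one row.
uncorrelated = conn.execute(
    "SELECT name FROM employees e "
    "WHERE EXISTS (SELECT 1 FROM subemployees)").fetchall()

# Correlated EXISTS: the added predicate ties each outer row to the subquery.
correlated = conn.execute(
    "SELECT name FROM employees e "
    "WHERE EXISTS (SELECT 1 FROM subemployees s "
    "              WHERE s.employee_id = e.id)").fetchall()

# IN relates the two tables through the compared column.
in_rows = conn.execute(
    "SELECT name FROM employees "
    "WHERE id IN (SELECT employee_id FROM subemployees)").fetchall()

print(len(uncorrelated))  # 3 -- every employee qualifies
print(correlated)         # [('Ann',)]
print(in_rows)            # [('Ann',)]
```

The correlated EXISTS and the IN form return the same rows here; the uncorrelated EXISTS is the "all or nothing" behavior described above.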

Alex - I agree. Adding the predicate to the subquery changes it from a non-correlated (independent) subquery to a correlated subquery, and correlated subqueries are usually very poor performers.

The poster also specifically asks about SQL Server 2008 R2. I don't know how that optimizer works, but other optimizers invoke query rewrite to reconcile both forms and produce the same optimal plan. For those optimizers, JOIN, EXISTS and IN (with the appropriate predicates) are recognized as functionally equivalent, so they all use the same plan. If SQL Server doesn't do this now, I'm confident it will in time.

Brian, I agree with your word "usually" but not with the generalization. As the article linked by enthusiasts shows, it greatly depends on the size of the [correlated] sub-query.

About a month ago I was asked to help with a performance issue in a similar SQL statement. The correlated subquery was very simple:

(SELECT user FROM something
 WHERE state NOT IN (<a list of 50-some US states and territories>))

They also tried "WHERE state IN ('Puerto-Rico state code')". Either way, the execution time varied between 20 and 25 minutes.

A simple change to a left join returning NULLs, which is equivalent to NOT EXISTS, returned results in less than a minute. The list of states was also moved into a common table expression to avoid treating the list as OR'ed predicates.

The reason, of course, was the size of the temp table generated by the uncorrelated sub-query: millions and millions of unindexed rows. With the uncorrelated form, the database engine (DB2 for LUW version 9.5) was subsequently performing a join of two humongous tables. The correlated subquery, or its equivalent left join, required a join between one table of the same very large size and a tiny temporary table.
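The rewrite described above can be sketched in miniature with SQLite from Python. The table and column names (`something`, `user`, `state`, `excluded_states`) follow the fragment quoted earlier but are otherwise hypothetical, and `excluded_states` stands in for the common table expression holding the state list. The sketch shows that the LEFT JOIN keeping only unmatched (NULL) rows returns the same result set as the NOT IN form:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE something (user TEXT, state TEXT);
CREATE TABLE excluded_states (state TEXT);  -- stands in for the CTE
INSERT INTO something VALUES ('u1', 'NY'), ('u2', 'PR'), ('u3', 'CA');
INSERT INTO excluded_states VALUES ('NY'), ('CA');  -- the "50-some" list
""")

# Original form: NOT IN against the list of states.
not_in = conn.execute(
    "SELECT user FROM something "
    "WHERE state NOT IN (SELECT state FROM excluded_states)").fetchall()

# Rewrite: LEFT JOIN and keep only the unmatched rows, i.e. the
# NOT EXISTS / anti-join form.
left_join = conn.execute(
    "SELECT s.user FROM something s "
    "LEFT JOIN excluded_states x ON s.state = x.state "
    "WHERE x.state IS NULL").fetchall()

print(sorted(not_in) == sorted(left_join))  # True -- only 'u2' survives
```

One caveat worth noting: the two forms diverge if the list column can contain NULL, because NOT IN then returns no rows at all, while the anti-join still returns the unmatched ones. Guarding the list with an IS NOT NULL predicate keeps them equivalent.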

I agree again. "Usually" and "it greatly depends" are the watchwords of all performance-related answers.

I'm a fan of letting the database designers and optimizer-writers solve performance problems, rather than trying to solve them at the "What, not How" level of the query language. This is really the only option available when SQL queries are being written or built by non-technical users -- they are the ones interested in "What not How". Sadly, not all optimizers are powerful, and not all database designers understand performance and usage.

So -- given the non-specific circumstances of the poster's question -- is there a right answer? Only that (1) the queries as given are different and need to be changed to be equivalent; and (2) "faster" is a function of the population size of each table, available access paths, uniqueness of the employee ids, distribution of the values of the predicate columns, capabilities of the DBMS optimizer (SQL Server 2008 R2), power of the platform (CPU size and count, memory size, disk speed and count), concurrent workload, user urgency and a host of other parameters.

But usually correlated subqueries don't perform well for tables of any significant size, usually indexed paths improve performance, usually unique values perform better than non-unique values and usually two tables of the same large size will present a performance challenge. Of course, it greatly depends.