Browse by Tags

You might recall (from my last post) that query plans containing a row goal tend to favour nested loops or merge join over hashing. This is because a hash join has to fully process its build input (to populate its hash table) before it can start probing for matches from its second input. Hash join therefore has a high start-up cost, which is balanced by a lower per-row cost once probing begins. In this post, I’ll take a look at how row goals affect grouping operations.
Grouping Strategies
While the start-up cost of hash join often makes it unsuitable for plans with a row goal, there are times when hashing operations may feature in such plans, since the Hash Match iterator also supports a streaming mode.
As an example, say we are asked to list one hundred unique first names from the AdventureWorks Contacts table: Read More...

Background
One of the core assumptions made by the SQL Server query optimiser’s model is that clients will consume all of the rows produced by a query. This results in plans that favour the overall execution cost, though it may take longer to begin producing rows. Let’s look at an example:
The optimiser chooses to perform the logical join using a Hash Match physical iterator, resulting in a plan with a total estimated cost of around 1.4 units. By forcing alternative physical joins using a query hint, we see that a plan based on Sort-Merge would have an estimated cost of just under 10, and using Nested Loops would cost over 18 units. All these cost estimates are based on the assumption that all rows are required.
Hash Match
As detailed in a previous post, the Hash Match iterator starts by consuming all rows produced by its build input (the Product table) in order to build a hash table. This makes Hash Match a semi-blocking iterator: it can only start producing output rows once the build phase is complete. If we need the first few rows from the query quickly, this join type may not be optimal. Read More...