Merge Join

Introduction

The Merge Join operator is one of four operators that join data from two input streams into a single combined output stream. As such, it has two inputs, called the left and right input. In a graphical execution plan, the left input is displayed on the top.

Merge Join is the most effective of all join operators. However, it requires all input data to be sorted by the join columns. Often this means that a Merge Join can’t be used without adding extra Sort operators. These extra sorts increase the total plan cost. In such cases, the optimizer tends to choose other join operators instead. The alternatives are Nested Loops (ideal for joining a small data stream with a cheap input), Hash Match (most effective for joining large unsorted sets), and Adaptive Join (which can be used when the optimizer finds viable plans for both Nested Loops and Hash Match and wants to postpone the final choice until run time).

The Merge Join operator supports all ten logical join operations: inner join; left, right, and full outer join; left and right semi and anti semi join; as well as concatenation and union. The algorithm requires at least one equality-based join predicate.

Visual appearance in execution plans

Depending on the tool being used, a Merge Join operator is displayed in a graphical execution plan as shown below:

SQL Server Management Studio

Azure Data Studio

(until version 17.3)

(version 17.4 and up)

Algorithm

The basic algorithm for the Merge Join operator is as shown below:

Note that this flowchart is a simplification. It doesn’t show that execution stops whenever a row is returned, and resumes where it was upon next being called. It also does not show some of the extra logic required to handle many to many joins.

The flowchart shows how the operator processes the data from two inputs at the same time. It compares the values from the two inputs, handles the data as appropriate, and then advances the input with the lowest value. This ensures that matching rows are always processed simultaneously.

In this post you can see some nice (slightly simplified) animations showing the algorithm in action. Thanks, Bert!

Variations by join type

The flowchart above shows the basic outline of Merge Join processing. Depending on the requested join type (the logical operation), some of the steps will change.

Inner Join

For an inner join, “Handle matching rows” returns the combined data to the calling operator. On the next call, the “Next?” test directs to read the next row from the right input, because a one to many merge join always arranges the side with potential duplicates as the right input. The “Done?” test for an inner join results in yes if either of the inputs is depleted. Execution only continues as long as both inputs still have data.

“Handle unmatched left row” and “Handle unmatched right row” do nothing for the inner join operation. Because of that, the “Left row matched?” and “Right row matched?” tests are irrelevant, and there is no need to track whether rows were matched. It is unknown whether this operator is actually implemented to test the logical operation and then skip these steps, or whether Microsoft chose to simply run these steps in all cases.

Left Outer Join

The left outer join operation returns the same combined data from “Handle matching rows” as an inner join, and also continues with reading the right input after finding a match. However, “Handle unmatched left row” now also returns data, from the left input and with null values for the columns from the right input. “Handle unmatched right row” does nothing; its associated logic has no effect and may or may not be skipped during processing.

After reading the next row from the left or right input (depending on where the row was returned), the “Done?” test results in yes if the left input is depleted. Execution continues as long as the left input has data, even if the right input is already fully processed.

Right Outer Join

The right outer join operation returns the same combined data from “Handle matching rows” as an inner join, and also continues with reading the right input after finding a match. However, “Handle unmatched right row” now also returns data, from the right input and with null values for the columns from the left input. “Handle unmatched left row” does nothing; its associated logic has no effect and may or may not be skipped during processing.

After reading the next row from the right input, the “Done?” test results in yes if the right input is depleted. Execution continues as long as the right input has data, even if the left input is already fully processed.

Full Outer Join

The full outer join operation returns the same combined data from “Handle matching rows” as an inner join, and also continues with reading the right input after finding a match. However, both “Handle unmatched left row” and “Handle unmatched right row” now also return data, from the unmatched input and with null values for the columns from the other input.

After reading the next row from the left or right input (depending on where the row was returned), the “Done?” test results in yes if both inputs are depleted. Execution continues until both inputs are fully processed.

Left Semi Join

The left semi join operation returns data from “Handle matching rows”. The data returned will be from the left input only. After regaining control, the “Next?” test directs the algorithm to advance the left input. In a one to many join, there may be more matching rows in the right input; immediately advancing the left input ensures that these don’t cause duplicates. (The remaining rows from the right input that would have matched the last value from left will now be treated as unmatched).

“Handle unmatched left row” and “Handle unmatched right row” do nothing in this case. As with the other logical operations, the logic associated with testing and tracking matches has no effect and may or may not be skipped during processing.

The “Done?” test for a left semi join results in yes if either of the inputs is depleted. Execution only continues as long as both inputs still have data.

Left Semi Join (probed)

The Merge Join operator does not have an explicit property to show when a left semi join actually executes as a probed left semi join. The only indication is the presence of a column in the Output List that does not come from the left input. There is no Defined Values property for this new column.

For a probed left semi join, both “Handle unmatched left row” and “Handle matching rows” return data from the left input, with either NULL or a value assigned to the Probe column. Execution after returning a row always moves on to advance the left input.

“Handle unmatched right row” does nothing; the logic associated with testing and tracking matches has no effect and may or may not be skipped during processing. The “Done?” test for a probed left semi join results in yes if the left input is depleted. Execution continues as long as the left input has data, even if the right input is already fully processed.

Left Anti Semi Join

For a left anti semi join , “Handle matching rows” does nothing; the “Next?” test after it directs the operator to advance the left input. In other words, it immediately moves to the next left row upon finding but a single match in the right input.

The “Left row matched?” test can only branch to “No” in this case, because control never gets here when a match exists. This test and the logic to track this can be skipped or executed, the effect is the same. The “Handle unmatched left row” then is where a row is returned to the caller.

“Handle unmatched right row” does nothing; the logic associated with testing and tracking matches has no effect and may or may not be skipped during processing. The “Done?” test for a left anti semi join results in yes if the left input is depleted. Execution continues as long as the left input has data, even if the right input is already fully processed.

Right Semi Join

The right semi join operation returns data from “Handle matching rows”. The data returned will be from the right input only. After regaining control, the “Next?” test directs the algorithm to advance the right input.

“Handle unmatched left row” and “Handle unmatched right row” do nothing in this case. As with the other logical operations, the logic associated with testing and tracking matches has no effect and may or may not be skipped during processing.

The “Done?” test for a right semi join results in yes if either of the inputs is depleted. Execution only continues as long as both inputs still have data.

Right Anti Semi Join

For a right anti semi join , “Handle matching rows” does nothing; the “Next?” test after it directs the operator to advance the right input. In other words, it immediately moves to the next right row upon finding but a single match in the right input.

The “Right row matched?” test can only branch to “No” in this case, because control never gets here when a match exists. This test and the logic to track this can be skipped or executed, the effect is the same. The “Handle unmatched right row” then is where a row is returned to the caller.

“Handle unmatched left row” does nothing; the logic associated with testing and tracking matches has no effect and may or may not be skipped during processing. The “Done?” test for a right anti semi join results in yes if the right input is depleted. Execution continues as long as the right input has data, even if the left input is already fully processed.

Concatenation

It is rare to see Merge Join (Concatenation) in an execution plan. The Concatenate operator is a cheaper way to achieve the same result, but it doesn’t preserve order. When the concatenated results are required to be ordered, and both inputs are already sorted in that order, then Merge Join (Concatenate) is more effective than using Concatenate and adding a Sort operator to restore the correct order.

For a concatenation, both “Handle unmatched right row” and “Handle unmatched left row” return the unmatched row. “Handle matching rows” in this case returns a row from the right input, and does not mark the left row as matched; on the next call the “Next?” test will direct the algorithm to read the next row from the right input. Because of this, both the “Left row matched” and the “Right row matched” tests will always result in no. It is, as always, not clear whether this test is still performed or whether it is simply skipped for this logical operation.

The “Done?” test only results in yes if both inputs are depleted; execution continues until both inputs are fully processed.

Union

A Merge Join (Union) is often a more effective way to obtain the union of two input sets then using a normal Concatenation and then adding e.g. a Sort (Distinct) to remove the duplicates. This is especially the case if the inputs are already sorted and guaranteed to have no duplicates. If one or both of the inputs can have duplicates, then the optimizer has to add extra steps to remove these before inputting the data into the Merge Join operator. The union logic of Merge Join removes duplicates between the two sets, but does not remove duplicates within either of the sets. It is not uncommon to see a Merge Join (Union) with a Stream Aggregate on both inputs.

For a union, both “Handle unmatched right row” and “Handle unmatched left row” return the unmatched row. “Handle matching rows” in this case returns a row from the right input. Other than for concatenation, it does mark the left row as matched. On the next call the “Next?” test will direct the algorithm to read the next row from the right input. The “Right row matched” test always results in no; it is not clear whether this test is still performed or whether it is simply skipped for this logical operation.

The “Done?” test only results in yes if both inputs are depleted; execution continues until both inputs are fully processed.

One to many

For the “real” join types (all logical join types except Concatenation and Union), the algorithm above only works correctly if the left input has no duplicates in the join columns. This is called a one to many join (even when the right input has no duplicates either). The optimizer uses (trusted) constraints and logic of plan elements (e.g. aggregations) to determine whether one of the inputs is guaranteed to have no duplicates. If it finds one such input, it arranges that input to be on the left side and marks the Merge Join as one to many.

Many to many

If neither input is guaranteed to have no duplicates, the join is many to many. For a many to many merge join, a worktable is used in tempdb to store values from the right input that need to be used multiple times. This is only done when it is really needed. To achieve this, a few alterations are made to the “Handle matching rows” part of the algorithm.

Once a match is found, the operator will first read the next row from the right input and store this in the “peek-ahead buffer”, a temporary holding area in memory. It then compares the values in the current row from the right input and the “next” row in the peek-ahead buffer. What happens next depends on whether or not they are the same.

Unique row

When the values in the join columns of the current row from the right input and the peek-ahead buffer are not equal, the current row is unique in the input. The operator returns the combined results. Once it re-gains control, the “Next?” test directs to read the next row from the left input. If this row has the same values in the join columns as the previous row, it will again end up in “Handle matching rows”; the next row already is in the peek-ahead buffer and the values are still different so another match is returned. This repeats until the values in the left input’s join columns change. This would normally direct the algorithm to run GetNext() on the right input. In this case that has already been done and that row is stored in the peek-ahead buffer. So instead of calling its child operator, the Merge Join now promotes the peek-ahead buffer to be the current row and then resumes from the top.

Duplicated rows

When the values in the join columns of the current row from the right input and the peek-ahead buffer are the same, these two (and possibly more) rows in the right input are duplicates on the join columns. In this case, the Merge Join stores both the current row from the right input and the row in the peek-ahead buffer in a worktable (in tempdb), then continues to request rows from the right input and add them to the worktable. This continues until a row is returned with a different value in the join columns, which is then stored in the peek-ahead buffer. At this point, all rows from the right input with the same join value are processed and stored in the worktable, and one extra row from the right input is stored in the peek-ahead buffer.

The algorithm then reads the first row from the worktable, produces the combined row of this row and the matching row from the left input, and returns this. Upon regaining control, it reads the next row from the worktable, matches this with the current row from the left input, and so on. Once all rows from the worktable have been processed, the algorithm moves to “Next?”, which directs to read the next row from the left input.

If this row has the same values in the join columns as the previous row, it will again end up in “Handle matching rows”. All matching rows are still in the worktable, so the above process is repeated: rows are read one by one from the worktable and matching rows are returned, and then the next left row is requested. This repeats until the values in the left input’s join columns change. This would normally direct the algorithm to run GetNext() on the right input. In this case that has already been done and that row is stored in the peek-ahead buffer. So instead of calling its child operator, the Merge Join now clears out the worktable, promotes the peek-ahead buffer to be the current row, and then resumes from the top.

Table order

As described above, the many to many merge join algorithm minimizes performance loss by only using the worktable when there is an actual duplicate in the right input. The optimizer tries to maximize the benefit of this optimization by placing the input with the lowest expected number of actual duplicates at the bottom. This will usually be the input with the highest estimated number of rows.

Operator properties

The properties below are specific to the Merge Join operator, or have a specific meaning when appearing on it. For all other properties, see Common properties.

Property name

Description

Defined Values

Observed on Concatenation and Union only.
This property lists, for each output column, the two corresponding input columns.

Used on all logical operations except Concatenation and Union.
See the main text for further details.

Residual

Used on all logical operations except Concatenation and Union.
Specifies the full logical condition that has to be met for a combination of rows from the left and right input to be considered a match. The Residual property always includes all elements of the Where (join columns) property, but may also include additional predicates.

Where (join columns)

Used on all logical operations except Concatenation and Union.
Specifies the part of the join condition that is used to drive the Merge Join algorithm. Additional predicates that are applied after finding a match on this property are described in the Residual property.
The Where (join columns) property is broken down into two subproperties: Inner Side Join columns (for the right input) and Outer Side Join columns (for the left input). The names of these properties are a confusing reference to the input names of the Nested Loops operator.
For Concatenation and Union, the Merge Join algorithm is always based on comparing all columns from both inputs. The Where (join columns) property is not listed in this case.

Implicit properties

This table below lists the behavior of the implicit properties for the Merge Join operator.

Property name

Description

Batch Mode enabled

The Merge Join operator supports row mode execution only.

Blocking

Merge Join is a non-blocking operator.
Technically, there is some (small) amount of blocking in the case of a many to many Merge Join: when duplicates are detected in the right input, they are all loaded into the worktable before the operator can start to return data.

Memory requirement

The Merge Join operator does not have any special memory requirement.

Order-preserving

The Merge Join operator is in most cases fully order-preserving for both the left and the right input.
For a many to many Merge Join, the order of the left input is fully preserved; the order of the right input is only partially conserved. The operator’s output is ordered by all the join columns, but the order of rows from the right input that have the same values in these columns can be affected.

By continuing to use the site, you agree to the use of cookies. more information

The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.