Join algorithms

SQL Anywhere supports a variety of different join implementations that the query optimizer chooses from. Each of the join
algorithms has specific characteristics that make it more or less suitable for a given query and a given execution environment.

The order of the joins in an access plan may or may not correspond to the ordering of the joins in the original SQL statement;
the query optimizer is responsible for choosing the best join strategy for each query based on the lowest execution cost.
In some situations, query rewrite optimizations may be utilized for complex statements that either increase, or decrease,
the number of joins computed for any particular statement.

There are three classes of join algorithms supported by SQL Anywhere, though each of them has additional variants:

Nested Loops Join The most straightforward algorithm is Nested Loops Join. For each row on the left-hand side, the right-hand side is scanned
for a match based on the join condition. Ordinarily, rows on the right-hand side are accessed through an index to reduce the
overall execution cost. This scenario is frequently referred to as an Index Nested Loops Join.

Nested Loops Join has variants that support LEFT OUTER and FULL OUTER joins. A nested loops implementation can also be used
for semijoins (most often used for processing EXISTS subqueries).

A Nested Loops Join can be utilized no matter what the characteristics of the join condition, although a join over inequality
conditions can be very inefficient to compute.

A Nested Loops FULL OUTER join is very expensive to execute over inputs of any size, and is only chosen by the query optimizer
as a last resort when no other join algorithm is possible.

Merge Join A Merge Join relies on its two inputs being sorted on the join attributes. The join condition must contain at least one equality
predicate in order for this method to be chosen by the query optimizer. The basic algorithm is a straightforward merge of
the two inputs: when the values of the two join attributes differ, the algorithm scrolls to the next row of the left or right-hand
side, depending on which side has the lower of the two values. Backtracking may be necessary when there is more than one match.

There are Merge Join variants to support LEFT OUTER and FULL OUTER joins. Merge Join for FULL OUTER Joins is considerably
more efficient than its nested loops counterpart.

The basic Merge Join algorithm is also used to support the SQL set operators EXCEPT and INTERSECT, although these variants
are explicitly named as EXCEPT or INTERSECT algorithms within an access plan.

Hash Join A Hash Join is the most versatile join method supported by the SQL Anywhere database server. In a nutshell, the Hash Join
algorithm builds an in-memory hash table of the smaller of its two inputs, and then reads the larger input and probes the
in-memory hash table to find matches.

Hash Join variants exist to support LEFT OUTER join, FULL OUTER join, semijoin, and anti-semijoin. In addition, SQL Anywhere
supports hash join variants for recursive INNER and LEFT OUTER joins when a recursive UNION query expression is being used.

The Hash Inner Join, Left Outer Join, Semijoin, and Antisemijoin algorithms can be executed in parallel.

If the in-memory hash table constructed by the algorithm does not fit into available memory, the Hash Join algorithm splits
the input into partitions (possibly recursively for very large inputs) and performs the join on each partition independently.
If there is not enough cache memory to hold all the rows that have a particular value of the join attributes, then, if possible,
each Hash Join dynamically switches to an index-based nested loops strategy after first discarding the interim results to
avoid exhausting the statement's memory consumption quota.

Variants of Hash Join are also utilized to support the SQL query expressions EXCEPT and INTERSECT, although these variants
are explicitly named as EXCEPT or INTERSECT algorithms within an access plan.