5.7. Eager Fetching

Eager fetching is the ability to efficiently load subclass data and
related objects along with the base instances being queried.
Typically, Kodo has to make a trip to the database whenever a
relation is loaded, or when you first access data that is mapped to a
table other than the least-derived superclass table. If you perform a
query that returns 100 Person objects, and then
you have to retrieve the Address for each
person, Kodo may make as many as 101 queries (the initial
query, plus one for the address of each person returned). Or if some
of the Person instances turn out to be
Employees, where Employee
has additional data in its own joined table, Kodo once again might need
to make extra database trips to access the additional employee data.
With eager fetching, Kodo can reduce these cases to a single query.
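
The arithmetic behind the Person and Address example can be sketched in a few lines of Java. The class and method names below are hypothetical and only count database trips; nothing here calls a Kodo API.

```java
/** Hypothetical helper that counts database round trips; not part of Kodo. */
final class FetchTripCounter {
    private FetchTripCounter() {}

    /** Lazy loading: one query for the base instances, then one more per instance for the relation. */
    static int lazyTrips(int instances) {
        return 1 + instances;
    }

    /** Eager fetching: the relation is loaded by the same query as the base instances. */
    static int eagerTrips(int instances) {
        return 1;
    }
}
```

For the 100 Person objects in the example, `FetchTripCounter.lazyTrips(100)` gives the 101 queries mentioned above, while `FetchTripCounter.eagerTrips(100)` stays at one regardless of the result size.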

Eager fetching only affects relations in the active fetch groups, and
is limited by the declared maximum fetch depth and field recursion
depth (see Section 5.6, “Fetch Groups”). In
other words, relations that would not normally be loaded immediately
when retrieving an object or accessing a field are not affected by
eager fetching. In our example above, the address of each person would
only be eagerly fetched if the query were configured to include the
address field or its fetch group, or if the address were in the default
fetch group. This allows you to control exactly which fields are
eagerly fetched in different situations. Similarly, queries that
exclude subclasses aren't affected by eager subclass fetching,
described below.

Eager fetching has three modes:

none: No eager fetching is performed.
Related objects are always loaded in an independent select
statement. No joined subclass data is loaded unless it is in
the table(s) for the base type being queried. Unjoined subclass
data is loaded using separate select statements rather than
a SQL UNION operation.

join: In this mode, Kodo joins to to-one
relations in the configured fetch groups. If Kodo is loading
data for a single instance, then Kodo will also
join to any collection field in the configured
fetch groups. When loading data for multiple instances, however
(such as when executing a Query), Kodo
will not join to collections by default. Instead, Kodo defaults
to parallel mode for collections, as
described below. You can force Kodo to use a join rather than
parallel mode for a collection field using the metadata
extension described in Section 7.9.2.1, “Eager Fetch Mode”.

Under join mode, Kodo uses a left outer join
(or an inner join, if the relation's field metadata declares the
relation non-nullable) to select the
related data along with the data for the target objects.
This process works recursively for to-one joins, so that if
Person has an
Address, and
Address has a
TelephoneNumber, and the fetch groups
are configured correctly, Kodo might issue a single select that
joins across the tables for all three classes. To-many joins
cannot recursively spawn other to-many joins, but they can
spawn recursive to-one joins.
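
To make the shape of a recursive to-one join concrete, the following Java sketch assembles the kind of select described for Person, Address, and TelephoneNumber. The table names, column names, and the `ToOneJoinSketch` class are all hypothetical; the SQL that Kodo actually generates depends on your mappings and database dictionary and will look different.

```java
/** Illustrative only: builds the shape of a recursive to-one join. Not Kodo's SQL generator. */
final class ToOneJoinSketch {
    /** One to-one relation hop between two hypothetical tables. */
    static final class Hop {
        final String fromAlias;
        final String fkColumn;
        final String table;
        final String alias;
        final boolean nullable;

        Hop(String fromAlias, String fkColumn, String table, String alias, boolean nullable) {
            this.fromAlias = fromAlias;
            this.fkColumn = fkColumn;
            this.table = table;
            this.alias = alias;
            this.nullable = nullable;
        }
    }

    /** Chains one join per hop onto a select from the base table. */
    static String select(String baseTable, String baseAlias, Hop... hops) {
        StringBuilder sql = new StringBuilder("SELECT * FROM ")
            .append(baseTable).append(' ').append(baseAlias);
        for (Hop h : hops) {
            // Nullable relations need an outer join so that owners without a related row are kept;
            // relations declared non-nullable can use an inner join.
            sql.append(h.nullable ? " LEFT OUTER JOIN " : " INNER JOIN ")
               .append(h.table).append(' ').append(h.alias)
               .append(" ON ").append(h.fromAlias).append('.').append(h.fkColumn)
               .append(" = ").append(h.alias).append(".ID");
        }
        return sql.toString();
    }
}
```

Calling `select("PERSON", "p", new Hop("p", "ADDRESS_ID", "ADDRESS", "a", true), new Hop("a", "PHONE_ID", "TELEPHONE_NUMBER", "t", true))` yields a single statement that joins across all three tables, which is the behavior the paragraph above describes.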

Under the join subclass fetch mode, subclass
data in joined tables is selected by outer joining to all
possible subclass tables of the type being queried. Unjoined
subclass data is selected with a SQL UNION where possible.
As you'll see below, subclass data fetching is configured
separately from relation fetching, and can be disabled for
specific classes.

parallel: Under this mode, Kodo selects
to-one relations and joined collections as outlined
in the join mode description above. Unjoined
collection fields, however, are eagerly fetched using a
separate select statement for each collection, executed in
parallel with the select statement for the target objects.
The parallel selects use the WHERE
conditions from the primary select, but add their own joins to
reach the related data. Thus, if you perform a query that
returns 100 Company objects, where each
company has a list of Employee objects
and Department objects, Kodo will make
3 queries. The first will select the company objects, the second
will select the employees for those companies, and
the third will select the departments for the same companies.
Just as for joins, this process can be
recursively applied to the objects in the relations being
eagerly fetched. Continuing our example, if the
Employee class
had a list of Projects in one of the
fetch groups being loaded, Kodo would execute a single
additional select in parallel to load the projects of all
employees of the matching companies.
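
The select count in this example depends on the number of eagerly fetched collection fields, not on the number of objects returned. A minimal Java sketch of that counting rule follows; `ParallelSelectCount` and `Node` are hypothetical names used only for illustration and do not correspond to Kodo classes.

```java
/** Hypothetical model of parallel-mode select counting; not a Kodo API. */
final class ParallelSelectCount {
    /** A class being fetched, plus the element types of its eagerly fetched collection fields. */
    static final class Node {
        final String name;
        final Node[] collections;

        Node(String name, Node... collections) {
            this.name = name;
            this.collections = collections;
        }
    }

    /** One select per collection field, applied recursively to the element types. */
    static int collectionSelects(Node node) {
        int count = 0;
        for (Node child : node.collections) {
            count += 1 + collectionSelects(child);
        }
        return count;
    }

    /** The primary select plus one select per eagerly fetched collection. */
    static int totalSelects(Node root) {
        return 1 + collectionSelects(root);
    }
}
```

A Company node with Employee and Department collections gives three selects. Adding a Project collection under Employee gives four, matching the single additional select described above.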

Using an additional select to load each collection avoids
transferring more data than necessary from the database to
the application. If eager joins were used instead of parallel
select statements, every joined row would repeat the data of the
owning object, and each collection added to the
configured fetch groups would multiply the number of rows
returned. The amount of data transferred could
easily overwhelm the network.
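
A rough row count shows why. The sketch below assumes every company has the same number of employees and departments, and that a join-only strategy would join both collections in one select. The figures and the `TransferEstimate` class are hypothetical and serve only to compare the two approaches.

```java
/** Hypothetical row-count comparison between joined and parallel collection loading. */
final class TransferEstimate {
    private TransferEstimate() {}

    /** Joining two collections in one select returns one row per (owner, element A, element B) combination. */
    static long joinedRows(long owners, long perOwnerA, long perOwnerB) {
        return owners * perOwnerA * perOwnerB;
    }

    /** Parallel selects return the owners once, then each collection's elements once. */
    static long parallelRows(long owners, long perOwnerA, long perOwnerB) {
        return owners + owners * perOwnerA + owners * perOwnerB;
    }
}
```

With 100 companies, 50 employees each, and 10 departments each, `joinedRows(100, 50, 10)` is 50,000 rows, each of which also repeats the company columns, while `parallelRows(100, 50, 10)` is 6,100 rows spread over three selects.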

Polymorphic to-one relations to table-per-class mappings use
parallel eager fetching because proper joins are impossible.
You can force other to-one relations to use parallel rather than
join mode eager fetching using the metadata extension described
in Section 7.9.2.1, “Eager Fetch Mode”.

Setting your subclass fetch mode to parallel
affects table-per-class and vertical inheritance hierarchies.
Under parallel mode, Kodo issues separate selects for each
subclass in a table-per-class inheritance hierarchy, rather
than UNIONing all subclass tables together as in join mode.
This applies to any operation on a table-per-class base class:
query, by-id lookup, or relation traversal.

When dealing with a vertically-mapped hierarchy, on the other
hand, parallel subclass fetch mode only applies to queries.
Rather than outer-joining to subclass tables, Kodo will issue
the query separately for each subclass. In all other
situations, parallel subclass fetch mode acts just like join
mode with regard to vertically-mapped subclasses.

When Kodo knows that it is selecting for a single object only,
it never uses parallel mode, because the
additional selects can be made lazily just as efficiently.
This mode only increases efficiency over join
mode when multiple objects with eager relations
are being loaded, or when multiple selects might be faster than
joining to all possible subclasses.

5.7.1. Configuring Eager Fetching

You can control Kodo's default eager fetch mode through the
kodo.jdbc.EagerFetchMode and
kodo.jdbc.SubclassFetchMode configuration
properties. Set each of these properties to one of the mode names
described in the previous section: none, join, or
parallel. If left unset, the eager fetch mode defaults
to parallel and the subclass fetch mode defaults
to join. These are generally the most robust and
performant strategies.
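
As a minimal sketch, the two properties can be collected in a standard `java.util.Properties` object. The property names and values come from this section; the `EagerFetchDefaults` class is hypothetical, and how the properties are handed to Kodo (for example, through your usual configuration file) is outside the scope of this sketch.

```java
import java.util.Properties;

/** Hypothetical holder for the eager fetching properties discussed in this section. */
final class EagerFetchDefaults {
    private EagerFetchDefaults() {}

    /** Spells out the documented defaults explicitly. */
    static Properties explicitDefaults() {
        Properties props = new Properties();
        // Relation fetching: none, join, or parallel
        props.setProperty("kodo.jdbc.EagerFetchMode", "parallel");
        // Subclass data fetching: none, join, or parallel
        props.setProperty("kodo.jdbc.SubclassFetchMode", "join");
        return props;
    }
}
```

Setting both values explicitly has the same effect as leaving them unset, but documents the intent alongside the rest of your configuration.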

You can easily override the default fetch modes at runtime for any
lookup or query through Kodo's fetch configuration APIs. See
Chapter 9, Runtime Extensions for details.

You can specify a default subclass fetch mode for an individual
class with the metadata extension described in
Section 7.9.1.1, “Subclass Fetch Mode”.
Note, however, that you cannot "upgrade" the runtime fetch mode
with your class setting. If the runtime fetch mode is
none, no eager subclass data fetching will
take place, regardless of your metadata setting.

This applies to the eager fetch mode metadata extension as well
(see Section 7.9.2.1, “Eager Fetch Mode”).
You can use this extension to disable eager fetching on a field, or
to declare that a collection should use joins rather than parallel
selects, or vice versa. But an extension value of
join won't cause any eager joining if the fetch
configuration's setting is none.
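
One way to picture the "no upgrade" rule is to treat the runtime setting as a ceiling on the metadata setting. The sketch below assumes the modes are ordered none, join, parallel and that the effective ceiling is the lesser of the two values. That ordering is an assumption made for illustration, consistent with the cases described above, and the `EffectiveFetchMode` class is hypothetical rather than a Kodo API.

```java
/** Hypothetical model of how a metadata setting combines with the runtime fetch mode. */
final class EffectiveFetchMode {
    /** Modes in assumed order of increasing eagerness. */
    enum Mode { NONE, JOIN, PARALLEL }

    private EffectiveFetchMode() {}

    /**
     * Returns the most eager mode allowed for a class or field.
     * A null metadata value means no extension was declared.
     */
    static Mode resolve(Mode runtime, Mode metadata) {
        if (metadata == null) {
            return runtime;
        }
        // Metadata may lower the mode but never raise it above the runtime setting.
        return metadata.compareTo(runtime) < 0 ? metadata : runtime;
    }
}
```

Under this model, `resolve(Mode.NONE, Mode.JOIN)` is `NONE`, matching the statement that a join extension has no effect when the fetch configuration is set to none, while `resolve(Mode.PARALLEL, Mode.JOIN)` is `JOIN`, matching a collection declared to use joins instead of parallel selects.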

5.7.2. Eager Fetching Considerations and Limitations

There are several important points that you should consider when
using eager fetching:

When you are using parallel eager fetch
mode and you have large result sets enabled (see
Section 4.11, “Large Result Sets”) or you place
a range on a query, Kodo performs the needed parallel
selects on one page of results at a time. For example,
suppose your FetchBatchSize is set to
20, and you perform a large result set
query on a class that has collection fields in the
configured fetch groups. Kodo will immediately cache
the first 20 results of the query using
join mode eager fetching only. Then,
it will issue the extra selects needed to eager fetch
your collection fields according to parallel
mode. Each select will use a SQL
IN clause (or multiple OR
clauses if your class has a compound primary key)
to limit the selected collection elements to those owned
by the 20 cached results.

Once you iterate past the first 20 results, Kodo will
cache the next 20 and again issue any needed extra selects
for collection fields, and so on. This pattern ensures that
you get the benefits of eager fetching without bringing
more data into memory than anticipated.

Once Kodo eager-joins into a class, it cannot issue any
further eager to-many joins or parallel selects from that
class in the same query. To-one joins, however, can
recurse to any level.

Using a to-many join makes it impossible to determine the
number of instances the result set contains without
traversing the entire set. This is because each result
object might be represented by multiple rows. Thus, queries
with a range specification or queries configured for lazy
result set traversal automatically turn off eager to-many
joining.

Kodo cannot eagerly join to polymorphic relations to
non-leaf classes in a table-per-class inheritance hierarchy.
You can work around this restriction using the mapping
extensions described in Section 7.9.2.2, “Nonpolymorphic”.
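
The page-at-a-time behavior described in the first point above can be illustrated with a short Java sketch that builds one IN clause per page of owner ids. The column name, the `PagedCollectionSelects` class, and the use of literal ids instead of bound parameters are assumptions made for readability. Kodo issues each select only when you iterate into the corresponding page, whereas this sketch computes all the clauses up front.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of limiting parallel selects to one page of owners at a time. */
final class PagedCollectionSelects {
    private PagedCollectionSelects() {}

    /** Builds one WHERE fragment per page of owner ids, with at most fetchBatchSize ids each. */
    static List<String> inClauses(String ownerColumn, List<Long> ownerIds, int fetchBatchSize) {
        if (fetchBatchSize <= 0) {
            throw new IllegalArgumentException("fetchBatchSize must be positive");
        }
        List<String> clauses = new ArrayList<>();
        for (int start = 0; start < ownerIds.size(); start += fetchBatchSize) {
            int end = Math.min(start + fetchBatchSize, ownerIds.size());
            StringBuilder sb = new StringBuilder(ownerColumn).append(" IN (");
            for (int i = start; i < end; i++) {
                if (i > start) {
                    sb.append(", ");
                }
                sb.append(ownerIds.get(i));
            }
            clauses.add(sb.append(')').toString());
        }
        return clauses;
    }
}
```

With 45 matching owners and a FetchBatchSize of 20, this produces three clauses covering 20, 20, and 5 owners, so each collection select only ever covers the owners in the page currently cached.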