Batching reading using exist and IN

Typically batch reading uses a join of the batch query to the source query.

This join has some issues:

If a m-1 or m-m, join causes duplicate rows to be selected, (a distinct is used to filter rows for m-1, but duplicates are returned for m-m as join table info is needed).

DISTINCT does not work with LOBs.

DISTINCT may be less efficient on some databases than alternatives.

Does not work well with cursors.

Needs verification if works with pagination.

Two new configuration options are introduced in EclipseLink 2.1. One such option is to use an EXISTS with a sub-select instead of a JOIN. This should not result in duplicate rows, so avoid issues with DISTINCT.

Another option is to load the target objects using a IN clause containing the source query object's primary keys. This would also work with cursors, but as the limitation of requiring custom SQL support for composite primary keys, and produces a large dynamic SQL query.

A new BatchFetchType enum will be define and the usesBatchReading flag will enhance to setBatchFetch allowing for JOIN, EXISTS or IN. This option will also be added to ObjectLevelReadQuery, rolling up the current 4 batch reading properties into a new BatchFetchPolicy, also moving them up from ReadAllQuery to allow ReadObjectQuery to also specify nested batched attributes. A new BatchFetch annotation and query hint will be added.

The EXISTS option will operate very similar to the existing JOIN batching, just putting the same join condition inside a sub-select. Although it should have be straight forward, I hit several issues with our current sub-selects support that I had to debug and fix. This also lead to discovering some issues with our JPQL support, that also needed to be fixed. EXISTS supports all of the mappings the same as JOIN, but does not require a distinct. m-1 will not use a DISTINCT by default for EXISTS (even though it would avoid duplicates, as the reason for using the EXISTS is normally to avoid the distinct), but the distinct can be enabled on the originating query if desired. EXISTS will still select duplicates for m-m, and not work well with cursors.

The IN option will query a batch of the target objects using an SQL IN clause containing the key values. For a 1-1 the foreign keys from the source rows will be extracted, if these are the primary keys of the target, they will first be filtered by checking the cache for each object. The remaining keys will be split into batches of a query configurable size, defaulting to 256. For composite keys the multi array SQL syntax for ((key1, key2), (...)) IN ((:key1, :key2), (...)) will be used. Only some databases support this syntax, so composite primary keys will only be supported on some databases. For 1-m or m-m the source rows do not contain the primary keys of the target objects. The IN will still be supported, but will be based on join the source query as in JOIN (or maybe EXISTS?) for the set of keys. For cursors the IN size will default to the cursor pageSize, and each cursor page will be processed separately.

Query Casting

JPA/ORM developers will use this query feature to query across attributes in subclasses. This feature is available in JPQL, EclipseLink Expressions and Criteria API.

These examples show on how to use EclipseLink to define queries on inheritance hierarchies with down casting to specific classes (bug 259266).

JPA2.0 Type

Extensions to the expression framework to limit the results to those of a specific subclass have already been implemented as part of the JPA 2.0 effort. Expression.type(Class) is available in the expression framework and equivalent functionality is available in JPQL.

e.g. "select p from Employee e join e.projects p where type(p) = LargeProject" can be used to retrieve all the LargeProjects (Subclass of Project) from Employee.

JPQL Extensions to use Downcast

JPQL is extended to cast in the FROM clause. The format of this will use the keyword "TREAT" and be part of the join clause. The following is an example:

Criteria API

As it is defined by JPA 2.0, Expression.as(type) does a simple cast that allows matching of types within the generics.

EclipseLink 2.1 extends criteria API to allow a cast using Expression.as(type). The as method has been extended to check the hierarchy and if type is a subclass of the type for the expression that as is being called on a cast will be implemented. Here is a criteria query that will do a cast:

Should be written as: select f from Foo f join cast(f.bars, BarSubclass) b where type(b) = BarSubclass And b.subclassAttribute = "value" by users that wish to enforce the type.

EclipseLink will automatically append type information for cases where the cast results in a single type, but for classes in the middle of a hierarchy, no type information will not be appended to the SQL

Query Keys in Queries

As of EclipseLink 2.1 it is now possible to reference manually defined query keys within your JPA queries. Prior to this release these query keys could only referenced in native queries using expressions.

By default EclipseLink creates query keys for all of your mapped attributes but in some scenarios users find it beneficial to add their own.