Query pagination with JPA and Hibernate

Introduction

Inspired by this StackOverflow answer I gave recently, I decided it’s time to write an article about query pagination when using JPA and Hibernate.

In this article, you are going to see how to use query pagination to restrict the JDBC ResultSet size and avoid fetching more data than necessary.

Domain Model

Now, let’s assume we defined the following Post and PostComment entity classes in our application:

The Post class is the parent entity while the PostComment is the child as it has a @ManyToOne association with the Post entity. Both entities implement the Identifiable interface which provides a contract for accessing the underlying entity identifier.

Next, we are going to save the following Post and PostComment entities in the database:

SELECT p.id AS id1_0_,
p.created_on AS created_2_0_,
p.title AS title3_0_
FROM post p
ORDER BY p.created_on
LIMIT 10

On SQL Server 2012 (or newer), Hibernate will execute the following SQL query:

SELECT p.id AS id1_0_,
p.created_on AS created_2_0_,
p.title AS title3_0_
FROM post p
ORDER BY p.created_on
OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY

Therefore, the SQL pagination query is adapted to the underlying database engine capabilities.

The use of ORDER BY is mandatory when using query pagination because SQL does not guarantee any particular order unless we provide one via the ORDER BY clause.

Using offset to position the result set

If the previous query was typical for the first page of a given pagination query, navigating the following page requires positioning the result set where the last page ended. For this purpose, the JPA Query interface provides the setFirstResult method.

SELECT p.id AS id1_0_0_,
c.id AS id1_1_1_,
p.created_on AS created_2_0_0_,
p.title AS title3_0_0_,
c.created_on AS created_2_1_1_,
c.post_id AS post_id4_1_1_,
c.review AS review3_1_1_,
c.post_id AS post_id4_1_0__,
c.id AS id1_1_0__
FROM post p
LEFT OUTER JOIN post_comment c ON p.id=c.post_id
ORDER BY p.created_on

This is because Hibernate wants to fetch entities fully along with their collections as indicated by the JOIN FETCH clause while the SQL-level pagination could truncate the ResultSet possibly leaving a parent Post entity with fewer elements in the comments collection.

The problem with the HHH000104 warning is that Hibernate will fetch the product of Post and PostComment entities, and due to the result set size, the query response time is going to be significant.

In order to work around this limitation, you have to use a Window Function query:

For more details about using Window Functions to fix the HHH000104 issue as well as the code for DistinctPostResultTransformer, check out this article.

Why not use query streaming instead?

JPA 2.2 added the getResultStreamQuery method, which you might think it’s a valid alternative to pagination. However, the stream result will not provide the result set size to the query planner, hence a suboptimal execution plan might be picked. For this reason, it’s much more efficient to use pagination than streaming when it comes to fetching small amounts of data.

For more details about why pagination is more efficient than streaming, check out this article.

Keyset pagination

Markus Winand, who wrote the SQL Performance Explained book, advocates for Keyset pagination instead of Offset. Although Offset pagination is an SQL standard feature, there are two reasons why you would prefer Keyset pagination:

performance (the index must be scanned up to the offset while, for keyset pagination, we can go directly to the first index entry that matches our order by predicate and filtering criteria)

correctness (if elements are being added in between, offset pagination will not provide consistent reads)

Even if Hibernate does not support keyset pagination, you can use Blaze Persistence for this purpose.

If you enjoyed this article, I bet you are going to love my Book and Video Courses as well.

Conclusion

Fetching just as much data as you need is one of the most important tips when it comes to data access performance. When fetching data, pagination allows you to control the result set size so that performance is stable even if the underlying dataset grows with time.

While keyset pagination offers better performance for large result sets, if you can narrow the scanned data set using the right filtering predicates, then offset pagination is going to perform rather well. To get consistent reads, you have to make sure that the scanned data set is always sorted in such a way that new entries are appended at the end of the set, rather than being mixed in between old entries.

Subscribe to our Newsletter

* indicates required

Email Address *

10 000readers have found this blog worth following!

If you subscribeto my newsletter, you'll get:

A free sampleof my Video Course about running Integration tests at warp-speed using Docker and tmpfs

Post navigation

2 thoughts on “Query pagination with JPA and Hibernate”

From my experience using the setMaxResults(10) method still selects the whole dataset and then only returns the amount requested after wards. For MS SQL I would have expected for the query to be ‘select top(10) from table’ but that was not the case, which ended up being a performance hit. Does the driver still operate that way?