So, you have read through the previous parts of this series (1, 2 and 3) and you now have a Spring Boot project with jOOQ enabled, an Oracle Database with a simple schema containing musical data. What can you do with it?

Content

In the previous post I had a simple select, getting all artists from my library ordered by name. For queries like that I wouldn’t bother investigating into jOOQ. I have written extensively about my use of Hibernate and summarized my thoughts in this interview. In short: You can come very far with JPA / Hibernate, you can even create portable JPQL queries that support some analytics, as Thorben shows. Thorben also has a nice little tool that helps deciding wether JPA / Hibernate is the right tool for the job, checkout it out: Should you use JPA for your next project?

Imagine the following use cases based on this schema:

I want a list of top n albums by a list of artists ordered by play count

I want a list of artists and how many times I played a track by them cumulative by day

Given a specific months, I want a top n list of tracks played in this month ordered by play count and also including the change in result to the previous month

Please read through the links at the top of this post to grok each step.

An introductory example

I want a list of top n albums by a list of artists ordered by play count

For the younger generation: An album was a coherent collection of tracks that you often listened from the first to the last track. You could buy those on CDs or LPs, some even owned compact cassettes.

The task looks simple but if you look closely, I haven’t bothered normalizing the tracks table into 2nf in regard to the “album” column. Stuff like this exists! The specific reason the schema above looks like this is that I just was to lazy to normalize it all the way. I would have at least adding the artists to albums as well but even then album is non unique in some cases. Anyway.

SELECT
tracks.album,
count(*) as cnt
FROM plays
JOIN tracks
ON plays.track_id = tracks.id
WHERE tracks.artist_id IN (54, 86)
GROUP BY
tracks.artist_id,
tracks.album
ORDER BY
cnt DESC,
tracks.album ASC

It should be a bigger problem writing that down in JPQL. But then you would have to use a kind of projection, a custom object to be instantiated and mapped through ResultSet mappings or constructor call inside the query itself. Not a nice outlook.

How does this look with jOOQ? Having the imports in place, that is: The generated Tables and there records, it pretty much looks identical:

So far nothing got executed. At this point you also can use #getSQL to just retrieve the generated SQL. #fetch will get you a result of records, any of the overloaded methods allow you to retrieve just the columns you need into a record. Note that no mapping to concrete objects is necessary at this point! You can however use generated POJOs or my idea of providing RecordMapper from a Spring Context.

The result you get from calling #fetch has some nice additional methods that all start with “format” and we are gonna use one of those to get useful representation. As the post is called “An HTTP api for analytic queries” I’m not targeting a real REST api speaking of resources but a representation of analytic queries that is useful for a well defined set of analytic dashboards and such. Looking from that perspective, one may call those queries themselves resources.

jOOQ offers formatJSON to get an actually useful JSON result. We are not gonna use the method that returns a string and pass that as the response value from our Spring Boot controller but instead write the contend directly to the response, avoid a possible overhead from a String to JSON converter which isn’t necessary.

But wouldn’t that be boring, no limits, no possibility to query stuff? Let’s introduce both: A query that allows restriction to certain dates and certain artists ids as well as limiting the query result:

Getting the parameters into the controller is super easy with Spring Boot. You’ll see @PathVariable which matches an array of ids, labelled “artistIds” inside the request mapping. You can call this URL with a comma separated list of values. Then you’ll find several parameters annotated with @RequestParam. Those are query parameters that even get converted in a usable format, as you can see for the date parameters final Optional<LocalDate> from and to:

You get those into your jOOQ query like this with quite some cool things to discover: The array of Ids can be passed to a “in()” clause as is… Without having to fear SQL injections anywhere. The worst thing that could happen is a type mismatch exception at controller level.

For a correct query on the days, I want to truncate the played on date to the day. Easy to be done with SQL, even easier with jOOQ as a can reference that expression as a variable PLAYED_ON_TRUNCATED_TO_DAY. And last but not least, you’ll see how nicely Java 8 optionals fit into such an expression: If the date is set create a greaterOrEqual expression, otherwise just a simple true condition:

Window functions with jOOQ

I want a list of artists and how many times I played a track by them cumulative by month

The second use case. I want a list Something you could achieve with client side programming working on the result set of a query very similar to the above, having group the plays by month and artist and keep a running total. Throw some nice lambdas in and you’re done.

The whole idea behind window functions is to allow you to process several values of the result set at a time: you see through the window some peer rows and are able to compute a single output value from them, much like when using an aggregate function.

I have heard a lot of Motörhead after Lemmy passing away by the end of 2015. I want to know how many times I heard Motörhead on January 1st, how many times on 2nd, 3rd and so on in total, like 5 times, than additional 10 times (makes 15) and another 7 (should show as 22). Result should look like this:

To make it a bit more interesting, I’ve added the additional option to query multiple artists. So, what do we have? I have added some comments to the query:

@RequestMapping(path ="/{artistIds}/cumulativePlays")publicvoid getCumulativePlays(
@PathVariable finalBigDecimal[] artistIds,
@RequestParam
@DateTimeFormat(iso = ISO.DATE)final Optional<LocalDate> from,
@RequestParam
@DateTimeFormat(iso = ISO.DATE)final Optional<LocalDate> to,
final HttpServletResponse response
)throwsIOException{
response.setContentType(MediaType.APPLICATION_JSON_UTF8_VALUE);this.create// What to select
.select(// The play date, truncated to the day as explained above
PLAYED_ON_TRUNCATED_TO_DAY,
// The artists
ARTISTS.ARTIST,
// The sum of the play counts in different partitions
sum(count())
.over(// This is where the definition of the window begins// without further restriction the window moves from// the first result up and including to the current
partitionBy(ARTISTS.ARTIST)// The window has to move through multiple partions// here, the artist as I want each cumulated playcount separately
.orderBy(PLAYED_ON_TRUNCATED_TO_DAY)// The window entries are ordered by playdate ).as("cumulativePlays"))
.from(PLAYS)
.join(TRACKS).onKey()
.join(ARTISTS).onKey()
.where(ARTISTS.ID.in(artistIds))
.and(from.map(Date::valueOf)
.map(PLAYED_ON_TRUNCATED_TO_DAY::greaterOrEqual)
.orElseGet(DSL::trueCondition))
.and(to.map(Date::valueOf)
.map(PLAYED_ON_TRUNCATED_TO_DAY::lessOrEqual)
.orElseGet(DSL::trueCondition))
.groupBy(PLAYED_ON_TRUNCATED_TO_DAY, ARTISTS.ARTIST)
.orderBy(PLAYED_ON_TRUNCATED_TO_DAY, ARTISTS.ARTIST)
.fetch()
.formatJSON(response.getOutputStream());}

@RequestMapping(path = "/{artistIds}/cumulativePlays")
public void getCumulativePlays(
@PathVariable final BigDecimal[] artistIds,
@RequestParam
@DateTimeFormat(iso = ISO.DATE)
final Optional<LocalDate> from,
@RequestParam
@DateTimeFormat(iso = ISO.DATE)
final Optional<LocalDate> to,
final HttpServletResponse response
) throws IOException {
response.setContentType(MediaType.APPLICATION_JSON_UTF8_VALUE);
this.create
// What to select
.select(
// The play date, truncated to the day as explained above
PLAYED_ON_TRUNCATED_TO_DAY,
// The artists
ARTISTS.ARTIST,
// The sum of the play counts in different partitions
sum(count())
.over( // This is where the definition of the window begins
// without further restriction the window moves from
// the first result up and including to the current
partitionBy(ARTISTS.ARTIST) // The window has to move through multiple partions
// here, the artist as I want each cumulated playcount separately
.orderBy(PLAYED_ON_TRUNCATED_TO_DAY) // The window entries are ordered by playdate
).as("cumulativePlays")
)
.from(PLAYS)
.join(TRACKS).onKey()
.join(ARTISTS).onKey()
.where(ARTISTS.ID.in(artistIds))
.and(from.map(Date::valueOf)
.map(PLAYED_ON_TRUNCATED_TO_DAY::greaterOrEqual)
.orElseGet(DSL::trueCondition)
)
.and(to.map(Date::valueOf)
.map(PLAYED_ON_TRUNCATED_TO_DAY::lessOrEqual)
.orElseGet(DSL::trueCondition)
)
.groupBy(PLAYED_ON_TRUNCATED_TO_DAY, ARTISTS.ARTIST)
.orderBy(PLAYED_ON_TRUNCATED_TO_DAY, ARTISTS.ARTIST)
.fetch()
.formatJSON(response.getOutputStream());
}

Important thing here is that jOOQ not only makes it particularly easy to use those standard SQL methods but it also helps me to get them right. I could hide this beautiful query inside an annotation passing it to the JPA provider of my choice, but than I would not only have no compile time checking, I would have to deal with the result set on my own.

If you query the URL above, this is the query which gets generated and executed by jOOQ as formatted by jOOQ:

Given a specific months, I want a top n list of tracks played in this month ordered by play count

Do you know those charts list that present the top 10 or top 40 of musical tracks during the last months? With changes in regard to the previous month (How many positions a track won or lost)? I have such a chart in my daily photo project that looks like this:

This is already the UI I’m gonna present in the next article, though. Back then, I have written an atrocious MySQL statement which I nearly sent to #CTMMC some months ago. That beast is huge, slow and complicated.

What does it do? There are to CTEs, previous and current month, both including a window function over their “position”, where position is defined as the count of plays in the given period of time. The dense_rank() method works much like row num, but “equal” rows (same count in this case) are given the same number, with “dense_rank” having no gaps. Exactly what we need to compute the chart. In the actual query I can join those CTEs like normal tables and I can compute the change very easily.

If I use jOOQ to break this down into chunks, it probably looks more readable to non-SQL developers, see ChartReportController:

Those CTEs can be “saved” onto variables and subsequently be used in a pretty much ordinary query.

Summary and use cases

The above query is not a query I write on an everyday use case and maybe it should be written at all. In real live though, I often haven to deal with complex data models and data silos that need to be queried in many different ways. Those analyses are really useful and helpful to a lot of people.

With the combination of Spring Boot and jOOQ I can provide those analysis in no time. Having those queries available as an HTTP API without depending on some HTTP tool inside a database provides much value. I can map simple and complex queries a like and get great logging for free.

Using Springs awesome MVC support, I can pass URL query parameters directly to the jOOQ builder without risking injection scenarios.

But: I strongly believe that JPA / Hibernate is not evil at all. Quite the contrary. Combine it with Spring Data JPA, implement the repository pattern and you’ll get easy to test modules that provide a lot of cruft for you.

If I would have write access of some kind, aka OLTP, in my scenario here, I’d rather would use JPA for it than filling those inserts with jOOQ myself for example.

Gavin King, creator of Hibernate, said on reddit: “Just because you’re using Hibernate, doesn’t mean you have to use it for everything.” and i fully agree with him. After working now on a jOOQ based project for 13 months, this is how an ideal workflow for a database centric but not database dependent application looks like for me:

Always use automatic database migrations, for example Flyway or Liquibase

Get your domain model as “right” as possible, normalize your database and use JPA / Hibernate in combination with Spring Data JPA.

Use JPQL queries if necessary but don’t hide native queries somewhere in annotations

For complex queries, projections or use cases like merge statements, use jOOQ. Either directly or pass the generated SQL to JPA or if you like it, to Springs JDBCTemplate. You can write your statements as strings, but then the burden of using prepared statements and the like correctly is on you.

But the most important point to make: SQL, JPA and jOOQ are expert tools. You have to know them to get the most out of them. Read about SQL and persistence in common, read the reference docs. I personally recommend the following sites and their authors:

If you enjoyed this post, you may follow me on twitter, leave some feedback or even star the demo. Next week I’m gonna publish the last entry in this series where I add an UI in which those analytic queries really shine.

[…] So what has this todo with my talk? My SQL talk was all about doing analytics. That is, retrieving data like in this image from a relational database with build-in analytic functions. Computing running totals, differences from previous windows and so on (read more here). […]

Post a Comment

Name *

Email *

Website

Sum of three + six ? *

Comment

Notify me of followup comments via e-mail

Your email is never published. We need your name and email address only for verifying a legitimate comment. For more information, a copy of your saved data or a request to delete any data under this address, please send a short notice to michael@simons.ac from the address you used to comment on this entry.
By entering and submitting a comment, wether with or without name or email address, you'll agree that all data you have entered including your IP address will be checked and stored for a limited time by Automattic Inc., 60 29th Street #343, San Francisco, CA 94110-4929, USA. only for the purpose of avoiding spam. You can deny further storage of your data by sending an email to support@wordpress.com, with subject “Deletion of Data stored by Akismet”.
Required fields are marked *