
In the previous blog post we looked at a best-practice architecture for Java batch applications that runs successfully in many places.

Still, we see challenges that affect productivity and costs. Three of them are the following:

Monoliths

Application server

Meta framework

Let’s take a look at them now.

Monoliths

Conway’s law states that

organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations

We noticed that organization units often start out with one batch application, and then it grows and grows as more and more batch jobs are deployed to it. You end up with one big application per organization unit. It’s somehow natural, because it’s far easier to just add another job to an existing application than to create a new one. The pain comes later, and it comes with a vengeance. When updating code or a library, all jobs have to be tested, even if the changes affect only one job. And in general, batch jobs are hard to test and don’t change a lot, so unnecessary testing is really painful.

So what do you do about it? One application per job – and you make it really easy to create a batch application. You may call this a Micro-Batch-Service if you want.

Application server

Yes, there are differences between application servers, and maybe I always had to deal with the most painful ones, but they have always been an issue for productivity. Eberhard Wolff claims in his article “Application Servers are dead” that there is a circular dependency between Application Server and application: the application uses libraries and infrastructure of the Application Server, while the Application Server needs to provide DataSource pools, JMS ConnectionFactories and shared libraries customized for the application, and often the Application Server itself is tweaked for one application. This has real consequences: you cannot just download a server and deploy your application to it, somebody needs to package and script it for your applications. I spent a lot of time installing and fixing Application Server installations during my career, time that could have been spent better. And even once you have it running, it still slows you down.

So what do you do about it? You don’t use one. You embed a Servlet container in your application because it’s all one anyway.
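To make the idea of embedding concrete, here is a minimal sketch in which the application starts its own HTTP server in-process instead of being deployed to one. It uses the JDK's built-in `com.sun.net.httpserver.HttpServer` only as a stand-in for an embedded Servlet container; the class name `EmbeddedServerSketch` and the `/health` path are assumptions made for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical batch application that brings its own HTTP server along. */
final class EmbeddedServerSketch {

    /** Starts an in-process HTTP server on the given port (0 picks any free port). */
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        // "/health" is an illustrative path, not an endpoint of any real framework.
        server.createContext("/health", exchange -> {
            byte[] body = "UP".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // The server lives and dies with the application process.
        HttpServer server = start(8080);
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

The point of the sketch is the direction of the dependency: the application owns the server, so there is nothing to install or script separately before it can run.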

Meta framework

If you decided to use Spring Batch or some other JSR-352 implementation, you’re not done with coding. You need what I call a meta batch framework on top of that: something that adapts it to your company’s needs and provides HTTP endpoints for operating and monitoring jobs, special batch components for special use cases, metrics, logging and so on. You shouldn’t underestimate the effort needed for that.
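As a rough illustration of the kind of glue code this means, the sketch below shows the operations that such operating and monitoring endpoints typically sit on top of: starting a job by name and querying the status of an execution. Everything here (`JobOperationsSketch`, `Status`, the method names) is hypothetical and deliberately framework-free; it is not the API of Spring Batch, JSR-352 or any starter.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical facade behind "start job" and "get status" endpoints. */
final class JobOperationsSketch {

    enum Status { STARTED, COMPLETED, FAILED }

    private final Map<String, Runnable> jobs = new ConcurrentHashMap<>();
    private final Map<Long, Status> executions = new ConcurrentHashMap<>();
    private final AtomicLong ids = new AtomicLong();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    void register(String name, Runnable job) {
        jobs.put(name, job);
    }

    /** Launches the job asynchronously and returns an execution id for later status queries. */
    long start(String name) {
        Runnable job = jobs.get(name);
        if (job == null) {
            throw new IllegalArgumentException("Unknown job: " + name);
        }
        long id = ids.incrementAndGet();
        executions.put(id, Status.STARTED);
        executor.submit(() -> {
            try {
                job.run();
                executions.put(id, Status.COMPLETED);
            } catch (RuntimeException e) {
                executions.put(id, Status.FAILED);
            }
        });
        return id;
    }

    /** Returns the last known status, or null if the execution id is unknown. */
    Status status(long executionId) {
        return executions.get(executionId);
    }

    /** Stops accepting new jobs and waits up to ten seconds for running ones. */
    void shutdown() {
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Even this toy version needs decisions about threading, error handling and identifiers, which hints at why the real thing takes effort.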

So, what do you do about it? There’s not much you can do, but if your idea of batch processing looks a little bit like ours, you may use (and help us optimize) our open source solution.

Solution

When developing our Spring Boot starter for batch applications, spring-boot-starter-batch-web, we had these three issues in mind. We wanted a solution that is so easy to use that creating an application really isn’t an issue. We wanted to get rid of the Application Server overhead. And we wanted to put all the meta framework stuff that we implemented again and again at every customer into one library that is free to use. You will probably still have to customize it for your company, but the customization layer should be much thinner than without our starter. If you want to try it, here is our Getting started page.

1. Short answer: no. Long answer: of course you could trigger jobs that are deployed using our boot starter in a Spring XD stream, but that’s not the native Spring Batch integration Spring XD offers.

2. I would always look for simpler solutions than application server clustering. If your application is stateless, you only need a load balancer in front of several instances. If it isn’t, take a look at Spring Session to externalize the session. In our case of batch servers you wouldn’t use application server clustering anyway. The only thing you sometimes need is load balancing (putting a job on another server because one server is busy or unreachable), but you wouldn’t want to share any session state.
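The load balancing mentioned in the second point can be very simple when no session state is shared. The following sketch picks the first server that can currently take a job; the class `JobDispatchSketch`, the server names and the availability check are all assumptions for illustration, and a real setup would more likely use an off-the-shelf load balancer.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical dispatcher: picks the first batch server that can take a job right now. */
final class JobDispatchSketch {

    /**
     * Returns the first server in the list for which the check succeeds,
     * or an empty Optional if every server is busy or unreachable.
     */
    static Optional<String> pickServer(List<String> servers, Predicate<String> canAcceptJob) {
        return servers.stream().filter(canAcceptJob).findFirst();
    }

    public static void main(String[] args) {
        List<String> servers = List.of("batch-1", "batch-2", "batch-3");
        // Pretend batch-1 is busy or unreachable; a real check would ask the server.
        Optional<String> target = pickServer(servers, s -> !s.equals("batch-1"));
        System.out.println(target.orElse("no server available"));
    }
}
```

Because the servers are stateless with respect to each other, any server that accepts the job is as good as any other, which is why nothing like session replication is needed.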