
Spring Batch

Does your application need to process large volumes of data in batches? Then you are in the right place: this article introduces Spring Batch.

Spring Batch is a lightweight framework for processing huge amounts of data at scheduled intervals; enterprises use it to process billions of transactions every day. Out of the box it provides logging/tracing, transaction management, job processing statistics, job restart, skip handling, and resource management.

A Spring Batch job is made up of several components wired together: the JobLauncher, the JobRepository, the Job itself, and its Steps.

Let us look at each of them briefly.

JobLauncher: As the name suggests, it launches a job. A trigger such as a cron schedule or a command-line invocation asks the JobLauncher to start a job. It interacts with both the Job and the JobRepository.

JobRepository: Manages the state of the job and of its steps during execution. It holds the job's metadata and execution statistics.

Job: The actual unit of work to be executed. It encapsulates the processing that makes up the batch run.

Step: A job is executed as a sequence of steps. Steps come in two flavours: chunk-based and tasklet-based. We will look at both below.

To understand the remaining three components (ItemReader, ItemProcessor, and ItemWriter), let us look at the chunk-based model.

When the JobLauncher launches a job, these three components come into play.

The input is first read by an ItemReader, then processed by an ItemProcessor that applies business logic, and finally written by an ItemWriter. For example, say you have a job that reads a CSV file of employee records (name, age, address) and writes only the employees whose age is greater than 30 to XML. Here you read the CSV file with an ItemReader, filter for employees older than 30 with an ItemProcessor, and then write the result to XML with an ItemWriter. That is the chunk-based processing model.

The other model is the tasklet model. It is used when you do not need the full read-process-write sequence between input and output: sending an email or reminder, or stopping the job once a certain condition is met, are typical cases where a tasklet-based step fits.
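A tasklet is essentially a single execute method that Spring Batch calls until it reports that it is finished. A minimal sketch is shown below; the class name and the notification message are illustrative, not from the article:

```java
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

// Hypothetical tasklet: runs as a one-off step, e.g. to send a reminder.
public class ReminderTasklet implements Tasklet {

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) {
        // Put the one-off work here (send mail, clean up files, etc.).
        System.out.println("Sending reminder notification...");
        // FINISHED tells Spring Batch the tasklet is done and the step can complete.
        return RepeatStatus.FINISHED;
    }
}
```

The tasklet is then wired into a step in the job configuration like any other step component.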

Enough theory for now; let us walk through an example. Say you have a pipe-delimited flat file containing student records (name, address, age) and you want to write that data into an XML file. Below is the input file:

examResult.txt

John Kennedy | London| 34

Jimmy Snuka | Sweden | 39

Renard konig | France | 21
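Each record is delimited by "|" with padding spaces around the fields. Before wiring up Spring Batch's line tokenizer, it helps to see in plain Java what tokenizing one record should produce (this demo class is ours, not part of the batch job):

```java
public class LineSplitDemo {

    // Split a pipe-delimited record and trim the padding around each field,
    // mirroring what a delimited line tokenizer produces for this file.
    public static String[] tokenize(String line) {
        String[] raw = line.split("\\|");
        String[] fields = new String[raw.length];
        for (int i = 0; i < raw.length; i++) {
            fields[i] = raw[i].trim();
        }
        return fields;
    }

    public static void main(String[] args) {
        String[] f = tokenize("John Kennedy | London| 34");
        System.out.println(f[0] + "," + f[1] + "," + f[2]); // prints "John Kennedy,London,34"
    }
}
```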

And the mapped POJO with fields corresponding to the row content of above file:

package com.techninfo.springbatch;

import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name = "ExamResult")
public class ExamResult {

    private String studentName;
    private String address;
    private String age;

    @XmlElement(name = "studentName")
    public String getStudentName() {
        return studentName;
    }

    public void setStudentName(String studentName) {
        this.studentName = studentName;
    }

    @XmlElement(name = "address")
    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }

    @XmlElement(name = "age")
    public String getAge() {
        return age;
    }

    public void setAge(String age) {
        this.age = age;
    }
}

Also note that we have used JAXB annotations in order to map the class properties to XML tags.

Step 4: Create a FieldSetMapper

A FieldSetMapper is responsible for mapping each field from the input record to a domain object.
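A sketch of such a mapper for our file might look like this (the class name is ours; the field indices follow the column order in examResult.txt):

```java
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

// Maps one tokenized line (studentName | address | age) to the ExamResult POJO.
public class ExamResultFieldSetMapper implements FieldSetMapper<ExamResult> {

    @Override
    public ExamResult mapFieldSet(FieldSet fieldSet) throws BindException {
        ExamResult result = new ExamResult();
        result.setStudentName(fieldSet.readString(0));
        result.setAddress(fieldSet.readString(1));
        result.setAge(fieldSet.readString(2));
        return result;
    }
}
```

The mapper is handed to the reader's line mapper together with the tokenizer, so every line of the file becomes one ExamResult instance.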

The ItemProcessor is optional; it is called after an item is read but before it is written, giving us the chance to apply business logic to each item. In our case, for example, we will filter out all items whose age is less than 30, so the final output contains only records with age >= 30.
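A sketch of that filtering processor (the class name is ours) could look like this; in Spring Batch, returning null from an ItemProcessor filters the item out:

```java
import org.springframework.batch.item.ItemProcessor;

// Keeps only records with age >= 30; the rest never reach the ItemWriter.
public class ExamResultItemProcessor implements ItemProcessor<ExamResult, ExamResult> {

    @Override
    public ExamResult process(ExamResult result) throws Exception {
        if (Integer.parseInt(result.getAge()) < 30) {
            return null; // returning null tells Spring Batch to drop this item
        }
        return result;
    }
}
```

With the sample data above, John Kennedy (34) and Jimmy Snuka (39) pass through, while Renard konig (21) is filtered out.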

As you can see, we have set up a job with a single step. The step uses a FlatFileItemReader to read the records, an ItemProcessor to process them, and a StaxEventItemWriter to write them out. The commit-interval specifies how many items are processed before the transaction is committed, i.e. before the write happens. Grouping several records into a single transaction and writing them as one chunk improves performance. We have also shown the use of a job listener, which can contain any logic you need to run before and after the job.
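In XML configuration, the step described above could be sketched roughly like this (bean ids are illustrative; the reader, processor, writer, and listener beans are assumed to be defined elsewhere):

```xml
<batch:job id="examResultJob">
    <batch:step id="step1">
        <batch:tasklet>
            <!-- commit-interval: number of items per chunk/transaction -->
            <batch:chunk reader="flatFileItemReader"
                         processor="itemProcessor"
                         writer="xmlItemWriter"
                         commit-interval="10" />
        </batch:tasklet>
    </batch:step>
    <batch:listeners>
        <batch:listener ref="jobListener" />
    </batch:listeners>
</batch:job>
```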