Manual chunk commit

Sep 4th, 2010, 11:33 AM

Hello,

I'm using Spring Batch for data extraction tasks, in most cases from XML to a database. I have one situation where I need to control the commit interval manually.

Example: The application reads and processes some records from the XML within one commit interval (chunk). The next record (still in the same chunk) depends on information read previously. When this happens, I need to write and commit the chunk before reading the next record, because that record depends on information already committed to the DB. If the two records fall in different chunks, there is no problem.

I can intercept the information through a StepListener and identify whether a record depends on a previous one, but I don't know how to commit the chunk manually.

The <chunk/> element in the step configuration allows you to inject a chunk-completion-policy that can be used to control the commit. I didn't understand your use case 100%, but I'd be surprised if you can't do something with a completion policy and a listener of some type (or something that implements both). Another common use case is forcing a commit after a timeout or time window.
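To make the idea concrete, here is a minimal plain-Java sketch of how a completion policy decides where a chunk (and therefore a commit) ends. It deliberately does not use the Spring Batch classes: ChunkPolicy, ChunkLoopSketch and the "!" marker on an item are hypothetical names invented for illustration, and the real framework interface has a different shape.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a chunk completion policy (not the Spring Batch interface). */
interface ChunkPolicy<T> {
    /** Called after each item is read; true means "close the current chunk now". */
    boolean isComplete(T lastItem, int itemsInChunk);
}

final class ChunkLoopSketch {

    /** Splits the input into chunks; each inner list stands for one write plus one commit. */
    static <T> List<List<T>> chunk(Iterator<T> reader, ChunkPolicy<T> policy) {
        List<List<T>> commits = new ArrayList<>();
        List<T> current = new ArrayList<>();
        while (reader.hasNext()) {
            T item = reader.next();
            current.add(item);
            if (policy.isComplete(item, current.size())) {
                commits.add(current);        // the write and commit would happen here
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            commits.add(current);            // final, partially filled chunk
        }
        return commits;
    }

    public static void main(String[] args) {
        // Assumed rule: commit every 3 items, or early when an item is flagged with "!".
        ChunkPolicy<String> policy = (item, n) -> n >= 3 || item.endsWith("!");
        List<String> input = List.of("a", "b!", "c", "d", "e", "f");
        System.out.println(chunk(input.iterator(), policy)); // [[a, b!], [c, d, e], [f]]
    }
}
```

The point is only that the policy, not a fixed counter, decides when the chunk closes, so a listener that sees something interesting in the input can flip the policy's answer.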

Comment

Let me try to explain my question better. I need to control the chunk commits depending on the input, after reading and before processing. I will study the completion policy to see whether it helps me. Do you have a usage example of a completion policy with manual control of the commit interval?

In your case I would say you need to write a reader that pulls together a PeekableItemReader and a CompletionPolicy (maybe implements both interfaces, maybe just injects them and changes their state). Because of the state, there may be restrictions on multi-threaded use, unless you take special steps.
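As a rough illustration of that combination, the plain-Java sketch below keeps a one-item look-ahead and a "chunk complete" check in the same stateful object. Nothing here is Spring Batch API: Rec, DependencyAwareReader, PeekSketch and the dependsOn field are hypothetical, and the sketch assumes a record names the id of the record it depends on.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

/** Hypothetical input record; dependsOn is null when the record stands alone. */
final class Rec {
    final String id;
    final String dependsOn;
    Rec(String id, String dependsOn) { this.id = id; this.dependsOn = dependsOn; }
}

/** Reader with one-item look-ahead that also decides when the chunk must end. */
final class DependencyAwareReader {
    private final Iterator<Rec> source;
    private final int commitInterval;
    private final Set<String> uncommittedIds = new HashSet<>();
    private Rec peeked;

    DependencyAwareReader(Iterator<Rec> source, int commitInterval) {
        this.source = source;
        this.commitInterval = commitInterval;
    }

    /** Returns the next record without consuming it, or null at end of input. */
    Rec peek() {
        if (peeked == null && source.hasNext()) {
            peeked = source.next();
        }
        return peeked;
    }

    /** Consumes the next record and remembers its id as not yet committed. */
    Rec read() {
        Rec next = peek();
        peeked = null;
        if (next != null) {
            uncommittedIds.add(next.id);
        }
        return next;
    }

    /** True at end of input, at the commit interval, or when the next record needs uncommitted data. */
    boolean isChunkComplete() {
        Rec next = peek();
        return next == null
                || uncommittedIds.size() >= commitInterval
                || (next.dependsOn != null && uncommittedIds.contains(next.dependsOn));
    }

    /** To be called once the chunk has been written and committed. */
    void chunkCommitted() {
        uncommittedIds.clear();
    }
}

final class PeekSketch {
    /** Drives the reader the way a chunk loop would; each inner list is one commit. */
    static List<List<String>> run(List<Rec> input, int commitInterval) {
        DependencyAwareReader reader = new DependencyAwareReader(input.iterator(), commitInterval);
        List<List<String>> commits = new ArrayList<>();
        while (reader.peek() != null) {
            List<String> chunk = new ArrayList<>();
            do {
                chunk.add(reader.read().id);
            } while (!reader.isChunkComplete());
            commits.add(chunk);
            reader.chunkCommitted();
        }
        return commits;
    }

    public static void main(String[] args) {
        List<Rec> input = List.of(new Rec("r1", null), new Rec("r2", null),
                new Rec("r3", "r1"), new Rec("r4", null), new Rec("r5", "r4"));
        System.out.println(run(input, 10)); // [[r1, r2], [r3, r4], [r5]]
    }
}
```

The peeked item and the set of uncommitted ids are exactly the state that makes this unsafe to share between threads without extra care.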

Comment

I have a similar issue to this one, but I'm unsure if this meets my requirements. Does a chunk completion policy end the chunk? All I want to do is commit what's in the write buffer manually, then continue processing the chunk. How would I go about that?

Comment

A completion policy signals the end of the chunk (i.e. no more items will be taken from the reader in this transaction). Can you describe your use case in a bit more detail? I'm surprised to hear that you want to manipulate the transaction from inside your business logic.

Comment

We are parsing a log file from a piece of hardware. It has a line that starts a group within the database, then we load the corresponding data into related tables as we go. Sometimes we'll get a power interrupt, which basically means that a new group starts on the very next line, and we have to mark all previous groups that didn't complete successfully as invalid. The problem we're running into is that sometimes these invalid groups exist only in the chunk write buffer, so we can't update them. We want to flush the write buffer, read any groups that have a certain status from the database, update them to a new status, and then continue from the next line.

How would you implement something like that?

Comment

Could you peek ahead and try to detect the invalid group (as per the original post in this thread), then force a commit using a completion policy and deal with marking incomplete groups in a listener, or possibly the writer? Then the next chunk will start with the peeked item (which you know is part of the next group).

Or you could do the group validation as a completely separate step?
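A very rough plain-Java sketch of the two-step variant follows. The log format ("START id", "END id", anything else is data), the Status values and the TwoStepSketch class are all assumptions made up for illustration, and the map stands in for the group table. It also assumes the most recent group may still be in progress, so only earlier open groups are marked invalid.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class TwoStepSketch {
    enum Status { OPEN, COMPLETE, INVALID }

    /** Step 1: load groups in file order, with no cross-group fix-ups. */
    static Map<String, Status> loadStep(List<String> lines) {
        Map<String, Status> groups = new LinkedHashMap<>(); // stands in for the group table
        for (String line : lines) {
            String[] parts = line.split(" ", 2);
            if (parts[0].equals("START")) {
                groups.put(parts[1], Status.OPEN);
            } else if (parts[0].equals("END")) {
                groups.put(parts[1], Status.COMPLETE);
            }
            // any other line would be loaded into the related tables; omitted here
        }
        return groups;
    }

    /** Step 2: runs after step 1 has committed everything; marks interrupted groups. */
    static void validateStep(Map<String, Status> groups) {
        int remaining = groups.size();
        for (Map.Entry<String, Status> entry : groups.entrySet()) {
            remaining--;
            boolean hasLaterGroup = remaining > 0;
            if (entry.getValue() == Status.OPEN && hasLaterGroup) {
                entry.setValue(Status.INVALID); // a newer group started before this one ended
            }
        }
    }

    public static void main(String[] args) {
        List<String> log = List.of("START g1", "DATA 1", "END g1",
                "START g2", "DATA 2",            // power interrupt: g2 never ends
                "START g3", "DATA 3", "END g3",
                "START g4");
        Map<String, Status> groups = loadStep(log);
        validateStep(groups);
        System.out.println(groups); // {g1=COMPLETE, g2=INVALID, g3=COMPLETE, g4=OPEN}
    }
}
```

Because the second step only starts once the first has finished, every group is already in the database when it is checked, and the write-buffer problem does not arise.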

Comment

The group validation as a separate step is an interesting idea that we hadn't thought of. That might work, although right now we're working on a hack to get it working before we go through and completely rewrite the batch job for a new log format.