This forum is now a read-only archive. All commenting, posting, registration services have been turned off. Those needing community support and/or wanting to ask questions should refer to the Tag/Forum map, and to http://spring.io/questions for a curated list of stackoverflow tags that Pivotal engineers, and the community, monitor.

Pessimistic locking reader

Mar 18th, 2011, 04:24 PM

Hi,
I would like to pessimistically lock a page of rows from a table in the reader and have the locks released when the chunk commits. Is this reader a transactional reader? I will be configuring skippable exceptions which can be thrown in the writer as well, which I believe will rollback the transaction and release the lock. Then as part of the fault tolerant automatic retry/scan the reader query needs to run again to lock those rows (I may not be guaranteed to get the same rows again). I haven't seen any example of this, so any suggestions would be helpful.
Thanks,
Hari.

Comment

So I should do the pessimistic query in beforeChunk and put the items in the step execution context? The reader then reads from the context?
When individual item processing happens as part of the fault tolerant scan, is beforeChunk invoked for each item? The query would normally lock a chunk-size of rows. When individual item processing happens, how do I figure out that the chunk-size is 1 and query now needs to fetch just one locked row?

Comment

I wouldn't use StepExecution ExecutionContext for storing items, unless you need that for restartability (i.e. there is no way to determine which record you are reading without storing them all).

The ChunkListener is called once per chunk (not once per item) on a sunny day, but it will be called again in the scan for failed items if something fails in your writer (I assume, but you can check). Since you are storing state anyway, you can use that to determine whether or not to call your listener again.

When the ChunkProcessor scans for failed items the reader is not called again unless you set the reader-transactional-queue flag.

This isn't a very common use case. Can you describe in more detail what you need to happen and why? Why do you need to lock the items before they are read? How would you do that anyway, given that you don't know how many items will be read until the chunk is complete? Maybe a locking strategy involving the ItemProcessor or ItemWriter would be more sensible?

Comment

Hi,
I would like to pessimistically lock a page of rows from a table in the reader and have the locks released when the chunk commits. Is this reader a transactional reader?

Hari, as I understood, you wand to achieve the following scenario:

Reader reads N records from DB, with "SELECT FOR UPDATE" or with "SERIALIZABLE" isolation level

Processor modifies these records

Writer writes them back and transaction commits, releasing locks in step 1

The points here are:

In Spring Batch ideology, reader does not know anything about what is "chunk size". This parameter is configured outside reader.

Chunk processing and writing is executed in separate transaction. Even in case the reader knows the chunk size, it should somehow populate the transaction to writer, so reader and writer operate in the same transaction.

This is vary similar to what I want to achieve, but my case get worse due to Hibernate.

The main trap here is that you need to propagate the transaction from the reader (that lasts for the whole job execution) to writer like this:

Transaction T2 is explicitly opened by String Batch, you can't influence that. In your case you need T2 to be a continuation of T1 and there are only two cases, when it happens: PROPAGATION_REQUIRED and PROPAGATION_NESTED. In both cases you need to start the transaction T1 somewhere in ItemStream.open() and in ChunkListener.afterChunk() one need to commit T1, so that T2 is also committed. If you don't do it, ChunkTransactionCallback will be locked in the next chunk cycle. The main problem is after the transaction is committed, it is unbinded from TransactionSynchronizationManager, so you need to bind it back.

If you succeed with above adventures – let me know

Comment

@dma_k
That is exactly what I am trying to do
@Dave
The need for pessimistic locking comes because I need to run multiple instances of the same batch (for increased throughput). It is simpler for both instances to run as peers and grab their chunk of data by locking rows from the driving table. This is typically done using the query (not syntactically correct where rownum is concerned):

select id, priority...
where rownum < n
order by priority
for update of id skip locked

I cannot use optimistic locking because the query is sorting the data and the chances of multiple queries grabbing the same rows is very high.
Since the row lock is held for the duration of the transaction, I need the same transaction to span the reader, processor and writer

1. It appears to me that this would work if my reader buffers the n rows returned by the query and commit-interval is also 'n'.
2. It would not work if fault-tolerant processing occurs on write. I can skip reader and processor errors but I need to let the chunk fail in case of any write error. I need to disable fault-tolerant processing in this case so that it can "skip the chunk" on write error and start a new chunk which will then cause the reader to grab these rows again (query may not return the same rows) since they are now unlocked.
3. I still need to identify the failed item, so that it is not read again. I can do that by implementing a writer and handling the write exception for each item. When an item write fails, I log the error and create a new transaction to delete the item from the driving table.

@Dave
The need for pessimistic locking comes because I need to run multiple instances of the same batch (for increased throughput). It is simpler for both instances to run as peers and grab their chunk of data by locking rows from the driving table.

Perhaps you could think about "first – partition, then – run the batch" (partitioning is possible in Spring Batch), or if you really need a eager querying system, you can introduce the concept of a ticket, which has statuses NEW, IN_PROCESS, PROCESSED, FAILED and optimistic locking. The tickets are created somewhere else, and when the processor gets the ticket, if first tries to change it's status. If succeeded – the job may continue with processing of the task.

Comment

I believe I can do pessimistic locking as long as I add a processing indicator column. So the reader locks and buffers unprocessed rows using the query (with a filter on processing indicator), sets the indicator and commits. The processor and writer can now work with an item and not have to bother about a chunk-level transaction.