
Skipping records with StaxEventItemReader

Mar 19th, 2008, 09:19 AM

I'm trying to get SkipLimitStepFactoryBean to work with StaxEventItemReader but can't seem to get it right.

I've set up a skip limit step that uses a StaxEventItemReader as its reader. When a skippable exception is thrown from my deserializer, the record number is added to the reader's skipRecords list, the chunk iteration is terminated, and a new step iteration starts.

Right before processing a new chunk, ItemOrientedStep calls itemHandler.mark(), which delegates to StaxEventItemReader, which in turn clears the skipRecords list. As a result, StaxEventItemReader only progresses to the end of the current fragment and concludes that there are no more fragments to process.

Is this a bug or did I get it wrong?

I think either the reader should seek to the next fragment before mark() is called, or the skip limit exception handler should be set on the chunkOperations instead of the stepOperations.

"Right before processing a new chunk, ItemOrientedStep calls itemHandler.mark() which delegates to StaxEventItemReader which clears the skipRecords list, so StaxEventItemReader will only progress to the end of the current fragment and think there are no more fragments to process."

Sorry, I don't get the "clears the skipRecords list" => "will only progress to end of current fragment and think there are no more fragments to process". Can you elaborate on that?

Looking at the skip logic I think the reader should simply not clear the skipped list.

Now keep in mind this is called after an exception was thrown in the middle of a fragment, so moveCursorToNextFragment will read all events from the stream until it reaches the end of the fragment and will return false (since the fragmentReader has a fake end of document there - its peek() returns null).

The while loop could have helped if the skipRecords list still contained the record number, because it would then loop again and seek to the next fragment.

I'm not entirely sure if that's how it's supposed to work.

Maybe markFragmentProcessed() should be called explicitly?
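To make the scenario easier to follow, here is a toy model of it. This is only a sketch under assumptions, not the StaxEventItemReader source: the class ToyFragmentReader, the string "events", the retry loop in read(), and the clearSkipListOnMark flag are all hypothetical and built purely from the description in this thread. The only method name borrowed from the discussion is moveCursorToNextFragment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of fragment seeking. NOT the real StaxEventItemReader.
class ToyFragmentReader {
    private static final String START = "<item>";
    private static final String END = "</item>";

    private final List<String> events;            // plain strings stand in for StAX events
    private final Set<Integer> skipRecords = new HashSet<>();
    private int cursor = 0;
    private int currentRecord = 0;                // 1-based number of the fragment last entered
    private boolean insideFragment = false;

    ToyFragmentReader(List<String> events) {
        this.events = events;
    }

    // When called mid-fragment, the view ends at the fragment's end tag (the
    // "fake end of document"), so the rest is consumed and false is returned.
    boolean moveCursorToNextFragment() {
        if (insideFragment) {
            while (cursor < events.size()) {
                if (END.equals(events.get(cursor++))) {
                    break;
                }
            }
            insideFragment = false;
            return false;
        }
        while (cursor < events.size()) {
            if (START.equals(events.get(cursor++))) {
                insideFragment = true;
                currentRecord++;
                return true;
            }
        }
        return false;
    }

    // Reads the fragment body; fails part-way on "bad" content, which leaves
    // the cursor in the middle of the fragment.
    private String readBody() {
        StringBuilder body = new StringBuilder();
        while (cursor < events.size()) {
            String event = events.get(cursor++);
            if (END.equals(event)) {
                break;
            }
            if ("bad".equals(event)) {
                throw new IllegalStateException("cannot deserialize record " + currentRecord);
            }
            body.append(event);
        }
        insideFragment = false;
        return body.toString();
    }

    // Returns the next item, or null when the reader believes the input is exhausted.
    String read() {
        while (true) {
            if (moveCursorToNextFragment()) {
                try {
                    return readBody();
                } catch (IllegalStateException e) {
                    skipRecords.add(currentRecord);   // remember the failed record
                    throw e;
                }
            }
            // Assumption: a failed seek is retried only if it merely finished a skipped fragment.
            if (!skipRecords.remove(currentRecord)) {
                return null;
            }
        }
    }

    // Reads everything; clearSkipListOnMark models mark() wiping the list after a failure.
    static List<String> readAll(List<String> events, boolean clearSkipListOnMark) {
        ToyFragmentReader reader = new ToyFragmentReader(events);
        List<String> items = new ArrayList<>();
        while (true) {
            try {
                String item = reader.read();
                if (item == null) {
                    return items;
                }
                items.add(item);
            } catch (IllegalStateException e) {
                if (clearSkipListOnMark) {
                    reader.skipRecords.clear();
                }
            }
        }
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                START, "a", END, START, "bad", "x", END, START, "c", END);
        System.out.println(readAll(events, false));
        System.out.println(readAll(events, true));
    }
}
```

In this model, keeping the skip list yields [a, c], while clearing it yields only [a]: the seek after the failure stops at the end of the broken fragment, returns false, and nothing tells the reader to try again.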

Comment

"Now keep in mind this is called after an exception was thrown in the middle of a fragment"

When an exception is thrown, it currently always means a transaction rollback, so reset() is called, which brings the reader's state back to the last mark() call. So unless I'm missing something, there's no issue with that. However, before the chunk is reprocessed after the rollback, mark() is called again and the list of skipped records is lost, so I assume the reader will once more try to process the item that should have been skipped. Does that make sense?
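The rollback case can be sketched the same way. Again, this is a minimal illustration under assumptions and not Spring Batch code: ToyMarkResetReader, its index-based skip set, the clearOnMark flag, and the maxAttempts cap are hypothetical. Only the mark()/reset()/skip idea comes from the discussion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical reader with mark/reset/skip semantics. NOT a Spring Batch class.
class ToyMarkResetReader {
    private final List<String> items;
    private final boolean clearOnMark;            // assumption: models "mark() clears the skipped list"
    private final Set<Integer> skipRecords = new HashSet<>();
    private int position = 0;                     // index of the next item to read
    private int markedPosition = 0;               // position saved by the last mark()

    ToyMarkResetReader(List<String> items, boolean clearOnMark) {
        this.items = items;
        this.clearOnMark = clearOnMark;
    }

    // Returns the next item that is not flagged as skipped, or null at end of input.
    String read() {
        while (position < items.size() && skipRecords.contains(position)) {
            position++;
        }
        return position < items.size() ? items.get(position++) : null;
    }

    // Flags the most recently read record so later reads pass over it.
    void skip() {
        skipRecords.add(position - 1);
    }

    void mark() {
        markedPosition = position;
        if (clearOnMark) {
            skipRecords.clear();
        }
    }

    // Rollback: return to the position saved by the last mark().
    void reset() {
        position = markedPosition;
    }

    // Treats the whole input as one chunk: mark, read, and on a "bad" item
    // skip + reset, then retry. Returns what was written, or an empty list
    // if the chunk never completed within maxAttempts.
    static List<String> process(List<String> input, boolean clearOnMark, int maxAttempts) {
        ToyMarkResetReader reader = new ToyMarkResetReader(input, clearOnMark);
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            reader.mark();
            List<String> chunk = new ArrayList<>();
            boolean failed = false;
            String item;
            while ((item = reader.read()) != null) {
                if (item.startsWith("bad")) {
                    reader.skip();
                    reader.reset();
                    failed = true;
                    break;
                }
                chunk.add(item);
            }
            if (!failed) {
                return chunk;
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "bad", "c");
        System.out.println(process(input, false, 3));
        System.out.println(process(input, true, 3));
    }
}
```

With the skip list preserved across mark(), the retry passes over the bad record and the chunk completes as [a, c]. With the list cleared in mark(), every retry hits the same bad record again and the chunk never completes, which is the reprocessing problem suspected above.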

Comment

I'd expect skipping to actually skip the record, even if a rollback was made.
Anyway, I'm running this in the debugger, and after the exception occurs it goes into the loop with moveCursorToNextFragment returning false.

Comment

"Anyway, I'm running this in debugger and after the exception occurs it goes into the loop with moveCursorToNextFragment returning false."

When an exception is thrown, you must end up in a catch block somewhere, so it's not clear to me what you mean by "it goes into the loop with moveCursorToNextFragment returning false". UPDATE: OK, I guess I know what you mean, but it doesn't make sense to me why it would be so (yet).

The only functional test we have for skip is the skipSample job in the samples. That one however uses FlatFileItemReader for input, which doesn't clear the skipped records list. There is also the SkipLimitStepFactoryBeanTests that you may find worthwhile to look at.

I'll take a close look at this soon. So far I assume the skip doesn't work as expected for StaxEventItemReader and JdbcCursorItemReader because mark() clears the skipped items.

Comment

One of the greatest features of Servoy is that you can interact directly with the database, using Servoy’s built-in data binding to create, modify, edit, and search records through a particular form. This tip addresses one important point to keep in mind when looping through the records of a form.

We can modify or update the record data of a form by looping through its records. We generally use the code below to do so.