Spring Integration *max-messages-per-poll* and duplicate messages

Jul 5th, 2013, 06:34 PM

My configuration is given below. My two questions are:

1) Even if there is only one file in the directory, with *max-messages-per-poll* set to 5 the adapter sends 5 messages with the same file information. I think that is correct per the spec. However, what would be the best way to prevent the four duplicate messages from reaching the Service Activator?

2) Currently, based on this configuration, the file ends up in the outbound directory. That is fine if there are no errors. What would be the best way to redirect the file to a different directory from the Service Activator when there is an error? (Pass an error-channel and send it there from the Service Activator?)

With prevent-duplicates=false, the same file will be sent 5 times if you don't remove the file.

However, since you are multi-threading the flow (with a task executor), it's certainly possible for another thread to see the file.

Probably the easiest thing to do is to add another service before the filesInboundChannel to rename the file (on the poller thread) - that way it (or another thread) won't "see" the file.
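As a rough, framework-free illustration of the rename idea: the sketch below claims a file by renaming it in place, so a second attempt on the same path finds nothing. The class name `FileClaimer` and the `.processing` suffix are hypothetical, not part of Spring Integration, and the sketch assumes the adapter's file pattern does not match the suffixed name. Keeping the original name as a prefix means the business-significant name can still be recovered.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

// Hypothetical helper for illustration; not a Spring Integration class.
final class FileClaimer {

    // Assumed marker suffix appended to the original file name.
    static final String CLAIM_SUFFIX = ".processing";

    /**
     * Tries to claim a file by renaming it within its own directory.
     * Returns the new path, or empty if the file no longer exists under
     * its original name (for example, because it was already claimed).
     */
    static Optional<Path> claim(Path file) throws IOException {
        Path claimed = file.resolveSibling(file.getFileName() + CLAIM_SUFFIX);
        try {
            return Optional.of(Files.move(file, claimed, StandardCopyOption.ATOMIC_MOVE));
        } catch (NoSuchFileException e) {
            return Optional.empty();
        }
    }

    /** Recovers the original file name from a claimed path. */
    static String originalName(Path claimed) {
        String name = claimed.getFileName().toString();
        return name.endsWith(CLAIM_SUFFIX)
                ? name.substring(0, name.length() - CLAIM_SUFFIX.length())
                : name;
    }
}
```

In a real flow the equivalent step would run in a component on the poller thread, ahead of the executor hand-off, so that only one message per file continues downstream.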

You can handle errors by putting an error-channel on the poller; or you could use an ExpressionEvaluatingRequestHandlerAdvice on the service-activator; there's an example of using this advice in the retry-and-more sample.
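To make the success/failure routing concrete without tying it to a specific configuration, here is a minimal plain-Java sketch of the behaviour being asked for: process the file, then move it to one directory on success and another on failure. It is only a stand-in for what an error-channel or the advice would do inside the framework; `OutcomeRouter`, `FileHandler`, and the directory parameters are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not a Spring Integration class.
final class OutcomeRouter {

    /** Hypothetical callback standing in for the service activator's work. */
    interface FileHandler {
        void handle(Path file) throws Exception;
    }

    /**
     * Runs the handler, then moves the file to successDir if the handler
     * returned normally, or to errorDir if it threw. Returns the new path.
     */
    static Path processAndRoute(Path file, Path successDir, Path errorDir, FileHandler handler)
            throws IOException {
        Path targetDir = successDir;
        try {
            handler.handle(file);
        } catch (Exception e) {
            // A real flow would also log or forward the exception.
            targetDir = errorDir;
        }
        Files.createDirectories(targetDir);
        return Files.move(file, targetDir.resolve(file.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
    }
}
```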

Comment

Hi Gary,
Thanks very much, that worked. It also raised a couple of follow-up questions.
1. When we set *prevent-duplicates=true*, is there a setting that limits how many file entries Spring keeps in its queue for duplicate checking? Or do the entries keep piling up during a running session until memory is exhausted? If there is a configurable maximum, how can we specify it, and can we reset it through an admin option while the app is running?

2. This is a multiple-JVM question. Ideally we want the code in only one JVM to poll the directory. If multiple JVMs poll the same directory, the same file could end up in each of their service activators, which is a sticky situation. I presume even *prevent-duplicates* won't work across multiple JVMs. If that is the case, what would be the best option in that type of situation? Is there an admin way to start and stop the pollers on the JVMs while the app is running? Is *auto-startup* the correct config to handle this?

Please advise. And oh, the main thing: you guys have done an amazing job that saved me thousands of coding hours. Keep up the good work!

Comment

1. By default, with a regex-pattern, the adapter gets a CompositeFileListFilter (with a RegexPatternFileListFilter and an AcceptOnceFileListFilter with an unbounded "memory").

To change this, you'd have to define your own <bean/> for a CompositeFileListFilter containing a RegexPatternFileListFilter and an AcceptOnceFileListFilter with a 'maxCapacity' constructor-arg. You'd then supply the filter to the adapter with the 'filter' attribute (you can't have both a filter and a pattern attribute, hence the need to build a CompositeFileListFilter).
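To show what a capacity-limited "memory" implies, here is a small plain-Java sketch of the general idea: remember the most recent items up to a maximum, and forget the oldest once the limit is exceeded. This is an illustration under assumptions, not the framework's implementation; `BoundedAcceptOnce` is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Illustration only; a stand-in for the idea of an accept-once filter with a bounded memory.
final class BoundedAcceptOnce<T> {

    private final int maxCapacity;
    private final Queue<T> order = new ArrayDeque<>(); // insertion order, oldest first
    private final Set<T> seen = new HashSet<>();       // fast membership check

    BoundedAcceptOnce(int maxCapacity) {
        if (maxCapacity < 1) {
            throw new IllegalArgumentException("maxCapacity must be at least 1");
        }
        this.maxCapacity = maxCapacity;
    }

    /** Returns true the first time an item is offered while it is still remembered. */
    synchronized boolean accept(T item) {
        if (!seen.add(item)) {
            return false; // already remembered: treat as a duplicate
        }
        order.add(item);
        if (order.size() > maxCapacity) {
            seen.remove(order.remove()); // forget the oldest entry
        }
        return true;
    }

    synchronized int size() {
        return seen.size();
    }
}
```

One consequence of any bound like this: once an entry has been forgotten, the same file would be accepted again if it is still in the directory, so a limit is only safe when processed files are moved or deleted.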

But, yes, you can start/stop the adapters at will; setting auto-startup to false will prevent them starting automatically.

You can control the adapters directly (get a reference to the SourcePollingChannelAdapter and invoke stop(), start()). Or, you can expose the endpoints as JMX MBeans and use stop/start operations; or you can use a <control-bus/> to send messages; e.g. "@filesInbound.stop()"

Comment

Thanks very much, Gary. Lots of useful info. Now that you mentioned the file locker, I have a question about it. Currently, I am unlocking the file as the first thing in my Service Activator. Is that the right way to do it? I mean, without doing that, wouldn't we be unable to read the file inside the Service Activator? Also, that brings up a sticky situation: couldn't another process manipulate the file while the Service Activator is reading its contents?

Comment

Hi Gary,
I did not rename the file, as the name of the file has business significance, so I am not sure renaming will work in my case. Anyway, with *prevent-duplicates=true* that issue no longer exists. However, you mentioned above: "With prevent-duplicates=false, the same file will be sent 5 times if you don't remove the file". I am trying to get a handle on that statement. When you say remove the file, who would be responsible for removing it? As it stands now, the runtime invokes the service activator with duplicate messages/files. Where would I step in, and what specifically should I do, to prevent the duplicates before the Service Activator gets invoked with them?

Comment

Hi Gary,
It takes less than 5 seconds to process the file, and obviously I am going to increase the polling interval to 1 minute. However, I feel we will be at the mercy of timing, and in the worst case, if file processing takes more than a minute for whatever reason, we might end up in the soup. So, I am going to take a look at the alternative that you suggested, and I have a feeling that it will solve it.

I am thrilled with your "move it to another directory" option. To connect the service activator to that new directory, would I have to create another poller on the *new directory*, or is there another way to invoke the service activator once the file appears there?
Also, *move it to another directory* could bring up another well-documented issue: the app picking up the file before it's completely moved. I guess we need to do the *renaming file* trick so that the app will pick up only the complete file?

You can use a custom FileListFilter to apply whatever algorithm you need or, if you can get the sender to send it with a temporary suffix and rename it when it's complete, that is the best solution (it's what we use in our (s)ftp adapters).
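As a minimal sketch of the temporary-suffix convention, the plain `java.io.FileFilter` below accepts only files whose names end in `.txt` (case-insensitively), so a file still being written under a name such as `File1.tmp` is skipped. The class name `CompletedTextFileFilter` is hypothetical, and the suffixes are assumptions for illustration; a Spring Integration FileListFilter would apply the same check to the list of files it is given.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.Locale;

// Hypothetical filter for illustration: only completed files (renamed to .txt/.TXT) pass.
final class CompletedTextFileFilter implements FileFilter {

    @Override
    public boolean accept(File file) {
        // Compare case-insensitively so both .txt and .TXT are accepted.
        String name = file.getName().toLowerCase(Locale.ROOT);
        return name.endsWith(".txt");
    }
}
```

This relies on the sender's final rename being a single step within the same directory, so the file only becomes visible under its final name once it is complete.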

Regarding the renaming; you can just do it with a couple of extra components; you don't need another adapter...

Comment

Yes, the sender is actually sending with a .tmp and then once the writing is complete, the sender will rename it to .txt/.TXT. So, we are good there.

So, coming back to the original question, as my original confusion has resurfaced. With "prevent-duplicates=false" and "max-messages-per-poll=5" and only "File1.txt" in the directory, the runtime will send 3 messages (File payload) to the service activator in parallel (3 threads), all with the same "File1.txt" payload. So I am wondering how renaming "File1.txt" would help to prevent handling the duplicate messages.
I was looking for a way to detect that those are duplicate messages and not process them at all. Ideally the Service Activator should not be invoked at all for those duplicates. Sorry for being naive; I hope I did not confuse you.

Comment

Hi Gary,
I still need to figure out why the presence of "expression" is preventing JBoss from starting up. But on the "rename" part, will the expression take care of creating the two java.io.File objects, deleting the file, and then renaming it?