Is there any documentation on the circumstances under which Flume NG will either drop events or send events twice, resulting in duplicates?

I seem to be able to run into both situations with a test setup under high contention, using agent1[syslog source --> file channel --> avro sink] --> agent2[avro source --> file channel --> hdfs sink]. I drop events when the file channel timeouts are left at their default values and agent1 becomes unavailable for some period of time (causing rsyslog to build up a queue). The same scenario with higher timeouts leads to a number of duplicate events (about 500 after 2.5M events).
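
For reference, the topology above could be written roughly as the following Flume properties sketch. It is not the exact config from the test: the component names (syslog-src, fc1, avro-out, avro-in, fc2, hdfs-out), host names, ports and paths are all placeholders, and keep-alive is shown only as an assumed example of one of the file channel timeouts mentioned above.

```properties
# --- agent1: syslog source --> file channel --> avro sink ---
agent1.sources = syslog-src
agent1.channels = fc1
agent1.sinks = avro-out

agent1.sources.syslog-src.type = syslogtcp
agent1.sources.syslog-src.host = 0.0.0.0
# placeholder port that rsyslog forwards to
agent1.sources.syslog-src.port = 5140
agent1.sources.syslog-src.channels = fc1

agent1.channels.fc1.type = file
# placeholder directories
agent1.channels.fc1.checkpointDir = /var/flume/agent1/checkpoint
agent1.channels.fc1.dataDirs = /var/flume/agent1/data
# assumption: one of the timeouts in question; seconds to wait for a put
agent1.channels.fc1.keep-alive = 3

agent1.sinks.avro-out.type = avro
# placeholder host and port of agent2
agent1.sinks.avro-out.hostname = agent2.example.com
agent1.sinks.avro-out.port = 4545
agent1.sinks.avro-out.channel = fc1

# --- agent2: avro source --> file channel --> hdfs sink ---
agent2.sources = avro-in
agent2.channels = fc2
agent2.sinks = hdfs-out

agent2.sources.avro-in.type = avro
agent2.sources.avro-in.bind = 0.0.0.0
agent2.sources.avro-in.port = 4545
agent2.sources.avro-in.channels = fc2

agent2.channels.fc2.type = file
agent2.channels.fc2.checkpointDir = /var/flume/agent2/checkpoint
agent2.channels.fc2.dataDirs = /var/flume/agent2/data

agent2.sinks.hdfs-out.type = hdfs
# placeholder HDFS path
agent2.sinks.hdfs-out.hdfs.path = hdfs://namenode.example.com/flume/events
agent2.sinks.hdfs-out.channel = fc2
```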

(BTW: is there an official ASCII art notation for Flume setups?)

Thanks for any pointers,
Friso
