Github user tzulitai commented on the issue:
https://github.com/apache/flink/pull/5282
@aljoscha regarding the potential flakiness of the test you mentioned:
I think the test will be stable, as long as the recorded timestamp of the
second run is strictly larger than that of the first run. We can add a loop
(with max retries) around the test topic generation, until that condition is met.
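A rough sketch of what such a retry loop could look like. The class name, method name, and the `LongSupplier` standing in for "generate the test topic and return its recorded timestamp" are all hypothetical and not taken from the PR:

```java
import java.util.function.LongSupplier;

public class TimestampRetrySketch {

    /**
     * Repeats the second run until it records a timestamp strictly larger
     * than the first run's timestamp, giving up after maxRetries attempts.
     *
     * secondRun is a hypothetical stand-in for the step that (re)generates
     * the test topic data and returns the timestamp it recorded.
     */
    public static long recordLaterTimestamp(
            long firstTimestamp, LongSupplier secondRun, int maxRetries) {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            long candidate = secondRun.getAsLong();
            if (candidate > firstTimestamp) {
                // Condition met: the second run is strictly later than the first.
                return candidate;
            }
        }
        // Fail the test loudly instead of continuing with an ambiguous timestamp.
        throw new IllegalStateException(
                "Second run did not record a timestamp larger than "
                        + firstTimestamp + " within " + maxRetries + " attempts");
    }
}
```

In the actual test, the supplier would wrap whatever writes the second batch of records, so that a retry also rewrites the data rather than just re-reading the clock.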
For the verification side (reading from Kafka), we'll essentially also be
relying on Kafka to return the correct offsets for a given timestamp,
but that is the case for almost all Kafka ITCases.
Am I missing any other potential flakiness aspects of this?