Field Detail

BACKLOG_UNKNOWN

Constructor Detail

UnboundedReader

public UnboundedReader()

Method Detail

start

public abstract boolean start()
throws java.io.IOException

Initializes the reader and advances the reader to the first record. If the reader has been
restored from a checkpoint then it should advance to the next unread record at the point
the checkpoint was taken.

This method will be called exactly once. The invocation will occur prior to calling
advance() or Source.Reader.getCurrent(). This method may perform expensive operations that
are needed to initialize the reader.

Returns true if a record was read, false if there is no more input
currently available. Future calls to advance() may return true once more data
is available. Regardless of the return value of start, start will not be
called again on the same UnboundedReader object; it will only be called again when a
new reader object is constructed for the same source, e.g. on recovery.
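
The calling contract described above can be illustrated with a minimal sketch. The class below is a hypothetical in-memory stand-in, not the real UnboundedReader: the name QueueReaderSketch and the offer method are assumptions made for illustration, and the checked java.io.IOException is omitted because nothing here performs I/O.

```java
import java.util.ArrayDeque;
import java.util.NoSuchElementException;
import java.util.Queue;

/** Hypothetical stand-in for a reader; it does not extend the real UnboundedReader. */
class QueueReaderSketch {
  private final Queue<String> pending = new ArrayDeque<>();
  private String current;   // null until a record has been read
  private boolean started;  // guards the "exactly once" rule for start()

  /** Illustration only: simulates new data arriving at the source. */
  void offer(String record) {
    pending.add(record);
  }

  /** Called exactly once; tries to position the reader on the first record. */
  boolean start() {
    if (started) {
      throw new IllegalStateException("start() must not be called twice on one reader");
    }
    started = true;
    return advance();
  }

  /** Returns true if a record was read, false if none is currently available. */
  boolean advance() {
    if (!started) {
      throw new IllegalStateException("start() must be called before advance()");
    }
    String next = pending.poll();
    if (next == null) {
      return false; // nothing available right now; a later call may succeed
    }
    current = next;
    return true;
  }

  /** Returns the record most recently read by start() or advance(). */
  String getCurrent() {
    if (current == null) {
      throw new NoSuchElementException("no record has been read yet");
    }
    return current;
  }
}
```

In this sketch, a start() call on an empty source returns false, and a later advance() returns true once offer has supplied a record, which mirrors the "no more input currently available" behavior described above.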

For example, the record ID could be a hash of the record contents, or a logical ID present in
the record. If it is generated as a hash of the record contents, it should be at least 16
bytes (128 bits) to avoid collisions.
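
One possible way to derive a 16-byte ID from record contents is sketched below. The class and method names are hypothetical, and the choice of SHA-256 truncated to 128 bits is an assumption made for illustration; the text above only requires that a hash-based ID be at least 16 bytes.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

/** Hypothetical helper; not part of the documented API. */
class RecordIdSketch {
  static final int ID_BYTES = 16; // 128 bits, the minimum suggested above

  /** Derives a fixed-length ID from the raw bytes of a record. */
  static byte[] recordId(byte[] recordContents) {
    try {
      MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
      // SHA-256 yields 32 bytes; keep the first 16 as the ID.
      return Arrays.copyOf(sha256.digest(recordContents), ID_BYTES);
    } catch (NoSuchAlgorithmException e) {
      // Every Java platform is required to provide SHA-256.
      throw new IllegalStateException("SHA-256 is unavailable", e);
    }
  }
}
```

Because the ID depends only on the record bytes, two reads of the same logical record produce the same ID, provided the record is serialized identically each time.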

getWatermark

Returns a timestamp before or at the timestamps of all future elements read by this reader.

This can be approximate. If records are read that violate this guarantee, they will be
considered late, which will affect how they will be processed. See Window for more
information on late data and how to handle it.

However, this value should be as late as possible. Downstream windows may not be able
to close until this watermark passes their end.

For example, a source may know that the records it reads will be in timestamp order. In
this case, the watermark can be the timestamp of the last record read. For a
source that does not have natural timestamps, timestamps can be set to the time of
reading, in which case the watermark is the current clock time.
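
The two strategies in this example can be sketched as follows. Both classes are hypothetical and use java.time types so that the sketch stays self-contained; the names, the recordRead and timestampForNewRecord methods, and the use of Instant.MIN as the "nothing read yet" value are all assumptions, and a real reader may use a different timestamp type.

```java
import java.time.Clock;
import java.time.Instant;

/** Hypothetical: watermark for a source whose records arrive in timestamp order. */
class InOrderWatermarkSketch {
  private Instant lastTimestamp = Instant.MIN; // assumed "nothing read yet" value

  /** Call once for each record read, passing that record's timestamp. */
  void recordRead(Instant recordTimestamp) {
    // With in-order input this is always the latest timestamp seen;
    // the check keeps the watermark from moving backwards if that assumption breaks.
    if (recordTimestamp.isAfter(lastTimestamp)) {
      lastTimestamp = recordTimestamp;
    }
  }

  /** The watermark is the timestamp of the last record read. */
  Instant getWatermark() {
    return lastTimestamp;
  }
}

/** Hypothetical: watermark for a source with no natural timestamps. */
class ReadTimeWatermarkSketch {
  private final Clock clock;

  ReadTimeWatermarkSketch(Clock clock) {
    this.clock = clock;
  }

  /** Records are stamped with the time at which they are read. */
  Instant timestampForNewRecord() {
    return clock.instant();
  }

  /** Future records are stamped no earlier than now, so now is a valid watermark. */
  Instant getWatermark() {
    return clock.instant();
  }
}
```

Passing a Clock into the second class, rather than calling Instant.now() directly, is a design choice that makes the behavior testable with a fixed clock.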

See Window and Trigger for more information on timestamps and watermarks.

All elements read between the previous call to this method (or the creation of this reader,
if this method has not yet been called on it) and the current call will be processed
together as a bundle. (An element is considered 'read' if it could be returned by a call to
Source.Reader.getCurrent().)