[DISCUSSION] read-whole-page feature should be defaulted to true?

Hi folks,

Recently we've worked hard to fix some issues around AMQP and
paging, and this has made me want to understand in more
detail the implications of the read-whole-page feature, which is false by
default.

For those who don't know, setting it to false allows a paged message to be
read without reading all the messages in the page as a whole (which is what we
did before, and what setting read-whole-page to true still does): the read
just goes as far as the requested message in the page.

The history behind this feature (introduced by
https://issues.apache.org/jira/browse/ARTEMIS-2399) is that it's based on
another fix/improvement I made in
https://issues.apache.org/jira/browse/ARTEMIS-2317, which uses a
sliding-window direct buffer to stream encoded paged messages into
memory, reading them in batches from the (file-based) pages.
The whole optimization was made to solve a not-so-well-known OS & JVM issue
that happens with heavy use of mmapped files (the mapped journal is not
affected because it should reuse files as much as possible ;)), and it works
at its best *because* of the sliding-window algorithm, which tries to reduce
the number of syscalls by reading data in batches from the OS page cache. This
same feature has been used by
https://issues.apache.org/jira/browse/ARTEMIS-2399 to read single messages
and, like the original, it works at its best if messages are read in batches
(assuming messages < 4K in size).
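
To make the batching idea concrete, here is a minimal sketch of a sliding-window reader over a file. It is not the Artemis implementation: the class names, the length-prefixed record layout and the 4096-byte window size are all assumptions chosen for illustration. It only shows how one direct buffer can serve many small records per channel read, and how it has to grow when a record is larger than the window.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SlidingWindowSketch {
    // Assumed window size; the thread only says messages < 4K batch well.
    static final int WINDOW = 4096;

    /** Reads hypothetical length-prefixed records through one reusable direct buffer. */
    static final class WindowReader {
        private final FileChannel channel;
        private ByteBuffer window = ByteBuffer.allocateDirect(WINDOW);
        int channelReads = 0; // number of channel.read calls issued

        WindowReader(FileChannel channel) {
            this.channel = channel;
            window.limit(0); // the window starts with nothing to read
        }

        /** Makes at least n bytes readable; returns false if the file ends first. */
        private boolean ensure(int n) throws IOException {
            if (window.remaining() >= n) {
                return true; // served from the window, no channel read needed
            }
            if (n > window.capacity()) {
                // Record larger than the window: grow and keep the unread bytes.
                ByteBuffer bigger = ByteBuffer.allocateDirect(n);
                bigger.put(window);
                window = bigger;
            } else {
                window.compact(); // slide the unread bytes to the front
            }
            while (window.position() < n) {
                int read = channel.read(window);
                channelReads++;
                if (read < 0) {
                    break; // end of file
                }
            }
            window.flip();
            return window.remaining() >= n;
        }

        /** Returns the next record body, or null at end of file. */
        byte[] next() throws IOException {
            if (!ensure(Integer.BYTES)) {
                return null;
            }
            int length = window.getInt(window.position()); // peek at the prefix
            if (!ensure(Integer.BYTES + length)) {
                return null;
            }
            window.getInt(); // consume the prefix
            byte[] body = new byte[length];
            window.get(body);
            return body;
        }
    }

    /** Writes a temp file of records, reads it back; returns {recordsRead, channelReads}. */
    static int[] demo(int records, int payloadBytes) {
        try {
            Path file = Files.createTempFile("page-sketch", ".bin");
            try {
                ByteBuffer out = ByteBuffer.allocate(records * (Integer.BYTES + payloadBytes));
                for (int i = 0; i < records; i++) {
                    out.putInt(payloadBytes).put(new byte[payloadBytes]);
                }
                Files.write(file, out.array());
                try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                    WindowReader reader = new WindowReader(channel);
                    int count = 0;
                    while (reader.next() != null) {
                        count++;
                    }
                    return new int[] {count, reader.channelReads};
                }
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = demo(100, 32);
        System.out.println("records=" + result[0] + " channelReads=" + result[1]);
    }
}
```

With small records the number of channel reads stays far below the number of records, which is the effect described above. With records larger than the window, the buffer in this sketch grows to the record size and stays that large.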

In order to work well, it has to "leak" the sliding-window
buffer used to read the messages in batches from the Page *into* the page
itself. This means a per-page increase in off-heap memory footprint of
*at least* 4K or, *at worst*, the maximum size of the last read paged message
(if > 4K). With a slow/intermittent consumer this can lead to off-heap memory
exhaustion, and users should be aware of it.
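
As a rough back-of-the-envelope sketch of that footprint, the per-page cost can be modelled as max(4K, size of the last read message) and multiplied by the number of pages that still hold a buffer. The method names and the page counts below are hypothetical; only the 4K floor and the "last read message" ceiling come from the description above.

```java
public class PageFootprintSketch {
    // Minimum retained buffer per page, as described above (4K).
    static final long MIN_WINDOW_BYTES = 4096;

    /** Off-heap bytes one page keeps alive, given the last message read from it. */
    static long retainedBytesPerPage(long lastMessageBytes) {
        return Math.max(MIN_WINDOW_BYTES, lastMessageBytes);
    }

    /** Estimate for a number of pages that each still hold their buffer (hypothetical input). */
    static long estimate(int pagesHoldingBuffer, long lastMessageBytes) {
        return pagesHoldingBuffer * retainedBytesPerPage(lastMessageBytes);
    }

    public static void main(String[] args) {
        // 10,000 pages with 1 KiB messages: each page keeps the 4K minimum.
        System.out.println(estimate(10_000, 1024));
        // 10,000 pages whose last read message was 100 KiB.
        System.out.println(estimate(10_000, 100 * 1024));
    }
}
```

Under these assumed numbers the first case retains about 41 MB and the second about 1 GB of off-heap memory, which is why a slow consumer spread over many pages matters.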

My proposal is to document this and/or
make read-whole-page true by default, disabling this feature so that it
doesn't hit common users who don't want to further tune their system to
account for this increased memory need.

Re: [DISCUSSION] read-whole-page feature should be defaulted to true?

I thought that defaulting read-whole-page to false would let the existing
tests cover the newly added lines of code. I didn't consider the implications
of it.

There is a tradeoff between the performance gain this feature gives when
there are many subscribers and the increase in off-heap memory and page file
handles. But what you've said makes sense to me. I agree with your
suggestions.