There are general guidelines for the most efficient heap sizes.
Firstly, for basic scaling, redundancy and failover capability, the
recommendation is to have two or three copies of a JVM instance (three
gives you as much as you need; two may be enough). Beyond that, you
are better off vertically scaling those instances (vertical
scaling means increasing the memory, CPU and I/O for those instances
rather than adding more JVMs). That's because a JVM has a base cost just
to run it, so you gain efficiency by increasing the use of an existing
JVM. Of course there may be isolation reasons why you want to run
other JVMs, but you need to be aware that you're making a tradeoff:
greater isolation for less efficiency.

But there is an upper limit on that efficiency for JVMs, at the RAM
available per NUMA node (which also has to cover any non-heap memory
the JVM needs on that NUMA node). For a typical system the RAM is
evenly distributed and the NUMA node corresponds to the socket (i.e.
each multi-core chip, not each core),
so if you don't know the hardware architecture for your particular
system you can divide total RAM by the number of sockets and that
gives you the efficient memory limit for your JVM (of course you are
better off finding the exact architecture and using that data). Beyond
that point it depends very much on your application whether increasing
the JVM heap further or adding another JVM is more efficient.
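
As a rough illustration of the arithmetic above, here is a minimal
sketch. The class name, the method names, and the example machine (256GB
RAM, 2 sockets, 8GB set aside for non-heap memory) are all made up for
illustration, and it assumes one NUMA node per socket with RAM evenly
distributed, which you should verify for your own hardware.

```java
// Rough per-NUMA-node memory estimate, following the rule of thumb above.
// Assumes one NUMA node per socket and evenly distributed RAM.
final class NumaHeapEstimate {

    private static final long GB = 1024L * 1024L * 1024L;

    /** Total RAM divided evenly across sockets. */
    static long perNodeBytes(long totalRamBytes, int sockets) {
        if (totalRamBytes <= 0 || sockets <= 0) {
            throw new IllegalArgumentException("RAM and socket count must be positive");
        }
        return totalRamBytes / sockets;
    }

    /** Heap ceiling that leaves room for non-heap memory on the same node; never negative. */
    static long maxHeapBytes(long nodeBytes, long nonHeapBytes) {
        return Math.max(0L, nodeBytes - nonHeapBytes);
    }

    public static void main(String[] args) {
        // Hypothetical machine: 256GB RAM, 2 sockets, 8GB non-heap allowance.
        long perNode = perNodeBytes(256 * GB, 2);
        long heap = maxHeapBytes(perNode, 8 * GB);
        System.out.println("Per-node RAM: " + perNode / GB + "GB");
        System.out.println("Heap ceiling: " + heap / GB + "GB");
    }
}
```

The non-heap allowance is a guess you would replace with a measured
figure for your own application; the point is only that the heap
ceiling is the per-node RAM minus everything else the process needs.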

There are other considerations too. Heaps under 4GB are a little more
efficient, and the 32GB-64GB heap range has some peculiarities (related
to pointer addressing of objects) that make heaps under 32GB more
efficient, so if you would otherwise stray into that range, especially
up to around 48GB, you want to stay below 32GB. Above 64GB there are no
differences in efficiency up to the NUMA node memory (but remember
that the NUMA node limit relates to the process memory, which will be
larger than the heap once the heap reaches maximum usage).
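
The heap-size bands described above can be sketched as a simple
classifier. The class and enum names are hypothetical, and the exact
boundary handling (for example treating exactly 32GB as already inside
the 32GB-64GB band) is an assumption made for the example rather than a
statement about any particular JVM.

```java
// Classifies a maximum heap size into the efficiency bands discussed above.
// Boundary handling is an assumption made for this sketch.
final class HeapBands {

    private static final long GB = 1024L * 1024L * 1024L;

    enum Band { UNDER_4GB, UNDER_32GB, FROM_32GB_TO_64GB, ABOVE_64GB }

    static Band classify(long maxHeapBytes) {
        if (maxHeapBytes <= 0) {
            throw new IllegalArgumentException("heap size must be positive");
        }
        if (maxHeapBytes < 4 * GB) return Band.UNDER_4GB;          // a little more efficient
        if (maxHeapBytes < 32 * GB) return Band.UNDER_32GB;        // still below the 32GB threshold
        if (maxHeapBytes <= 64 * GB) return Band.FROM_32GB_TO_64GB; // the range to avoid if you can
        return Band.ABOVE_64GB;                                     // no further differences
    }

    public static void main(String[] args) {
        // Reports the band for the JVM this is run in.
        long maxHeap = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap bytes: " + maxHeap);
        System.out.println("Band: " + classify(maxHeap));
    }
}
```

Note that Runtime.maxMemory() can report slightly less than the -Xmx
value you set, so treat results right at a boundary with some caution.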

So, as with most things in performance, there are tradeoffs to make,
but at least now you're informed about them. Target two JVM instances
(three if you need more scale or redundancy) and grow those instances
up to the NUMA-node size, but preferably keep max heaps below either
4GB or 32GB. Above NUMA-node RAM you need to test whether a single JVM
with a bigger heap or several JVMs each restricted to one NUMA node's
memory is more efficient. And of course replicate the whole
thing in another datacentre if you need disaster recovery or regional
capabilities.