Details

Description

Having no upper memory limit on an Erlang process is problematic for processes that consume memory based on messages received, in a way where the memory usage is not easily anticipated. Two main areas that people have asked about in the past are:

A way to limit the heap of an Erlang process

A way to limit the incoming message queue of an Erlang process

Problem #2 above can be avoided by adding a queue data structure to the Erlang process, so it is avoidable without modifications to the Erlang VM; it is also difficult to make an incoming message queue limit correct when some messages should always be received, ignoring the limit (e.g., system messages). Problem #1, however, requires modification of the Erlang VM to be implemented dependably. An ad-hoc solution without VM modification would likely require an Erlang process running at high priority with a timer, checking processes that opt in to the checking; memory growth could be missed during the timer period, potentially failing to prevent the death of the Erlang VM due to memory consumption.
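The ad-hoc approach described above can be sketched in a few lines. This is an illustrative watchdog only (the module name, function names, and the 100 ms polling interval are all made up for the example); it uses the standard `erlang:process_info/2` and `erlang:exit/2` BIFs, and it demonstrates exactly the weakness mentioned: a process can blow past the limit between polls.

```erlang
%% Hypothetical ad-hoc heap watchdog: a high-priority process that
%% periodically polls another process's heap size and brutally kills
%% it if the limit (in words) is exceeded.  Illustrative only.
-module(heap_watchdog).
-export([start/2]).

start(Pid, LimitWords) ->
    spawn(fun() ->
              process_flag(priority, high),
              loop(Pid, LimitWords)
          end).

loop(Pid, LimitWords) ->
    case erlang:process_info(Pid, total_heap_size) of
        {total_heap_size, Size} when Size > LimitWords ->
            exit(Pid, kill);          % brutal kill, cannot be caught
        {total_heap_size, _Size} ->
            timer:sleep(100),         % window in which the limit can be exceeded
            loop(Pid, LimitWords);
        undefined ->                  % target process already dead
            ok
    end.
```

Between two polls the watched process can allocate arbitrarily much, which is why a dependable limit needs VM support.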

In the past (around 2007, supposedly, but I don't have a link for reference) there was a discussion about a max_heap_size for Erlang processes and how it would be enforced. The simplest approach that satisfies the problem would be best: compare the current process heap size to max_heap_size when the GC attempts to provide more process heap memory, so that max_heap_size is enforced as an upper limit on the amount of memory available (with no chance of extra garbage collections due to the higher amount of memory). I have not looked into the source code to see how this would be implemented, so my view may be naive.

The feature should be important for the fault-tolerance goal of Erlang, since the availability of an infinite amount of memory is not realistic (setrlimit is the POSIX function normally used for this purpose on UNIX OS processes). The value of max_heap_size could initially default to infinity, so there would be no impact on source code that does not use it. Source code that does use max_heap_size would be able to avoid crashing the Erlang VM due to unanticipated memory consumption (whether from input that is impossible to validate beforehand or from erroneous source code).


Michael Truog
added a comment - 24/Nov/15 2:23 AM - edited

Yes, the Erlang process should crash. The implementation could use something equivalent to erlang:exit(Pid, kill) when the GC discovers that the process is asking for memory that exceeds max_heap_size. That situation is much better than allowing the whole Erlang VM to crash when available memory is exhausted. The kill needs to be brutal, to avoid the possibility of the process attempting to handle the exception.

Björn-Egil Dahlberg
added a comment - 11/Dec/15 4:58 PM

There isn't a single heap for an Erlang process, and there are other memory considerations as well, so what should be included in "max_heap_size"?

The nursery (what we normally refer to simply as the heap)? Yep.
The old_heap? Yep.
The heap that we allocated and are about to copy to? There is a period when the heap is essentially doubled in size. Is this included?
Heap fragments? These almost always come from BIFs and are about to become part of the heap. In OTP 19 they essentially are the heap.
Message heap fragments? Essentially messages not yet read. Tracking these could be very costly.

What are the guarantees? For example, if we set max_heap_size to a certain threshold, will the process never try to allocate above this threshold, or is there some leeway?


Peer Stritzinger
added a comment - 11/Dec/15 5:49 PM

I have a use case for something similar in embedded systems: preallocating the heap to make behaviour more predictable.

In this case the reason for specifying a max heap size would be to allocate two heaps and copy between them, never allocating a new (larger) heap. I don't want to go into details unrelated to this issue here.

For my future use case, max_heap_size would be the maximum-sized heap the process would allocate, so the check can be limited to heap allocation time. It would either kill the process when it wants to allocate a larger-than-allowed heap to copy to or, better, first limit the size of the new allocation to max_heap_size and kill the process only if it cannot run with this heap. The latter is hairier, since it could lead to GCing much more often when a process is using just about the whole heap all the time. I wouldn't mind for my use case, since I'd GC much more often anyway to get more predictable realtime behaviour from the process (reducing throughput, as it usually does).

So in the non-"realtime" case, maybe two numbers would be interesting to specify: max_heap_alloc and max_heap_usage.

max_heap_alloc: would be checked/limited when allocating a heap
max_heap_usage: (maybe not the best name for this) would be checked after GC; if it overshoots, kill it with fire

max_heap_usage might default to a fixed percentage of max_heap_alloc when only max_heap_alloc is specified (e.g., 75%).
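The two checks described above could be expressed roughly as follows. This is an illustrative sketch in Erlang, not VM code: the module name and function names are invented, and only `max_heap_alloc`, `max_heap_usage`, and the 75% default are taken from the comment.

```erlang
%% Sketch of the two proposed checks, as pure functions (not VM code).
-module(heap_limits).
-export([check_alloc/2, check_usage/2]).

%% max_heap_alloc: clamp a requested new heap size to the limit; the
%% caller would kill the process only if even the clamped heap is too
%% small to run with.
check_alloc(Requested, MaxHeapAlloc) when Requested =< MaxHeapAlloc ->
    {ok, Requested};
check_alloc(_Requested, MaxHeapAlloc) ->
    {clamped, MaxHeapAlloc}.

%% max_heap_usage: checked after GC against the live data size;
%% defaults to 75% of max_heap_alloc when not specified.
check_usage(LiveWords, Limits = #{max_heap_alloc := Alloc}) ->
    Max = maps:get(max_heap_usage, Limits, Alloc * 3 div 4),
    case LiveWords > Max of
        true  -> kill;   % overshoots after GC -> kill it with fire
        false -> ok
    end.
```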


Michael Truog
added a comment - 11/Dec/15 7:44 PM - edited

Björn-Egil: max_heap_size is best as an absolute maximum, so it would include the nursery, old_heap, new heap, heap fragments, and message heap fragments. If determining the size of the message heap fragments is problematic, a slightly larger size based on the allocator could be used, since the main point is to make sure the process can die, not that the size is always accurate (since it is an absolute maximum).

Peer: Your desire to preallocate the heap of an Erlang process should already be handled by min_heap_size and min_bin_vheap_size. If not, you should probably create a separate entry in JIRA to capture it. max_heap_size could be used to influence GC behaviour together with other settings, but that would be risky, since it could add extra latency that normally is not present (assuming extra collections would be done).
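Most of the areas listed above can already be observed per process via the standard `erlang:process_info/2` BIF; the module wrapper below is illustrative. Note that `total_heap_size` combines the young heap, old heap, and heap fragments, while the message heap fragments that Björn-Egil flags as costly to track are not fully reflected in it.

```erlang
%% Illustrative helper: inspect the per-process sizes discussed above.
%% All sizes are in words, not bytes.
-module(heap_inspect).
-export([inspect/1]).

inspect(Pid) ->
    erlang:process_info(Pid, [heap_size,         % current young heap (nursery)
                              total_heap_size,   % all heap fragments combined
                              message_queue_len]).
```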


Peer Stritzinger
added a comment - 12/Dec/15 3:45 PM

okeuday: No, it can't already be handled by min_heap_size, and yes, this discussion doesn't belong here. I have clear semantics in mind for a special sort of process, and this is part of a larger project.

All I wanted to contribute here is how I could imagine a max heap size being implemented. If you overdo the precision, it will affect performance. If you only check when allocating a new heap, you can effectively prevent many causes of runaway memory usage for everything that will eventually end up on the heap.


Lukas Larsson
added a comment - 26/Apr/16 5:52 PM

I've just opened a PR implementing max_heap_size: https://github.com/erlang/otp/pull/1032

It is not exactly what you describe wanting, but it is the closest we have come without sacrificing the SMP scalability of erts message passing. Please review it and put any comments you have in the PR.
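For reference, this is roughly how the max_heap_size feature from the PR above is used once shipped (in OTP 19 and later); the demo module itself is made up, and the limit of 100000 words is arbitrary. The limit is set per process via `spawn_opt/2` or `process_flag/2`, in words rather than bytes, and a breaching process is killed with reason `killed`.

```erlang
%% Demo of the per-process max_heap_size limit: spawn a process that
%% grows its heap without bound and observe that the VM kills it.
-module(max_heap_demo).
-export([run/0]).

run() ->
    {Pid, Ref} =
        spawn_opt(fun() -> grow([0]) end,
                  [monitor,
                   {max_heap_size, #{size => 100000,       % limit in words
                                     kill => true,         % kill on breach
                                     error_logger => false}}]),
    receive
        {'DOWN', Ref, process, Pid, Reason} ->
            Reason    % the breaching process exits with reason 'killed'
    end.

grow(L) ->
    grow([L | L]).    % heap usage grows on every iteration
```

A process can also limit itself with `process_flag(max_heap_size, Limit)`, and a bare integer can be given instead of the option map.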


Michael Truog
added a comment - 26/Apr/16 6:59 PM - edited

Lukas: The pull request looks good. It is important that a feature like this not impact SMP scalability, and providing a maximum heap limit without impacting performance is the best approach. Thank you for adding this feature!