drill-issues mailing list archives

[jira] [Commented] (DRILL-5741) Drillbit during startup should not exceed the available memory on a node

Date: Fri, 25 Aug 2017 20:48:00 GMT

[ https://issues.apache.org/jira/browse/DRILL-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142160#comment-16142160 ]
Paul Rogers commented on DRILL-5741:
------------------------------------
Not sure this entirely makes sense. It is again asking the user to add a new variable to check
the user's other settings.
In general, Drill should not use the entire memory on a node.
When running under YARN, YARN will assign memory. When running under other managers (MapR
Warden, Mesos, etc.), those systems take care of the total memory allocations across tasks.
Perhaps we could, on Drillbit start, sum the memory allocations and check against total OS
memory. But, how much should we reserve for the OS? For file system caching? For ZK? For other
apps? Pretty soon we are trying to do node-level resource management "blind" inside the Drillbit.
In Drill-on-YARN, we considered the percentage-based allocation suggested above. But this
is not as simple as it seems. Certain memory units are fixed (such as code cache); others can
be adjusted. And should the ratio between heap and direct be the same at small levels (2 GB
and 4 GB, say) as at large levels (50 GB and 100 GB)?
Instead, we worked the other way. We summed the memory allocation for code cache, heap and
direct to get the total memory requested from YARN.
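As a rough illustration of that summation, here is a minimal Python sketch. It is not the Drill-on-YARN code: the helper names (parse_jvm_size, total_requested_bytes) and the example sizes are assumptions made up for this note, and the parser handles only the K/M/G suffixes.

```python
import re

# Hypothetical helpers for illustration only; not part of Drill or Drill-on-YARN.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_jvm_size(text):
    """Parse a JVM-style size such as '8G', '512m' or '1048576' into bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"not a JVM-style size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit]


def total_requested_bytes(heap, direct, code_cache):
    """Sum the three memory areas to get the total to request from the resource manager."""
    return sum(parse_jvm_size(value) for value in (heap, direct, code_cache))


if __name__ == "__main__":
    # Example sizes are assumptions, not Drill defaults.
    total = total_requested_bytes(heap="4G", direct="8G", code_cache="1G")
    print(f"container request: {total // 1024**2} MB")
```

Working from the individual settings toward a total avoids having to decide how a single percentage should be split between fixed and adjustable memory areas.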
I think we can consider memory oversubscription as a user error; a bit like configuring storage
plugins wrong, or running too many processes for a node, or configuring the OS wrong, etc.
> Drillbit during startup should not exceed the available memory on a node
> ------------------------------------------------------------------------
>
> Key: DRILL-5741
> URL: https://issues.apache.org/jira/browse/DRILL-5741
> Project: Apache Drill
> Issue Type: Improvement
> Components: Server
> Affects Versions: 1.11.0
> Reporter: Kunal Khatua
> Fix For: 1.12.0
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> Currently, during startup, a Drillbit can be assigned large values for the following:
> * -Xmx (heap)
> * -XX:MaxDirectMemorySize
> * -XX:ReservedCodeCacheSize
> * -XX:MaxPermSize
> Together, these can potentially exceed the available memory on a system when a Drillbit
> is under heavy load. It would be good to have the Drillbit ensure, during startup itself, that
> the cumulative value of these parameters does not exceed a pre-defined upper limit for the
> Drill process.
> The proposal is to have the [runbit|https://github.com/apache/drill/blob/master/distribution/src/resources/runbit]
> script look for an additional environment variable:
> {{DRILLBIT_MAX_PROC_MEM}}
> The parameter can specify the maximum in GB/MB (similar in syntax to how Java's maximum heap
> size is specified), or as a percentage of available memory (not to exceed 95%).
> The [runbit|https://github.com/apache/drill/blob/master/distribution/src/resources/runbit]
> script will perform the calculation of the sum of memory required by the memory spaces (heap,
> direct, etc) and ensure that it is within the limit defined by the {{DRILLBIT_MAX_PROC_MEM}}
> env variable.
> In the absence of this parameter, there will be no restriction. A node admin can then
> define this variable in the default terminal's environment files (e.g. {{/root/.bashrc}}).
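The check proposed in the description could look roughly like the Python sketch below. This is an illustration only, not the runbit script (which is a shell script) and not an agreed design. The function names, the accepted syntax ("16G", "8192M", "25%"), and passing the node's memory in as an argument are all assumptions. The only details taken from the description are the DRILLBIT_MAX_PROC_MEM variable, the 95% cap, and "no restriction" when the variable is unset.

```python
import re

# Hypothetical sketch of the proposed startup check; names and syntax are assumptions.
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}
MAX_PERCENT = 95  # upper bound on the percentage form, per the issue description


def parse_size(text):
    """Parse a JVM-style size such as '8G' or '512M' into bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"not a JVM-style size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def resolve_limit(setting, node_memory_bytes):
    """Turn a DRILLBIT_MAX_PROC_MEM value into bytes; None means no restriction."""
    if setting is None or not setting.strip():
        return None
    percent = re.fullmatch(r"(\d+)%", setting.strip())
    if percent is not None:
        value = int(percent.group(1))
        if not 0 < value <= MAX_PERCENT:
            raise ValueError(f"percentage must be between 1 and {MAX_PERCENT}")
        return node_memory_bytes * value // 100
    return parse_size(setting)


def within_limit(memory_settings, limit_setting, node_memory_bytes):
    """Return True if the summed memory areas fit within the configured limit."""
    total = sum(parse_size(value) for value in memory_settings.values())
    limit = resolve_limit(limit_setting, node_memory_bytes)
    return limit is None or total <= limit


if __name__ == "__main__":
    # Example values are assumptions: a 64 GB node and 13.5 GB of configured memory.
    settings = {"heap": "4G", "direct": "8G", "code_cache": "1G", "perm": "512M"}
    node = 64 * 1024**3
    for limit in (None, "16G", "10G", "25%", "20%"):
        print(limit, within_limit(settings, limit, node))
```

Note that the sketch still leaves open the question raised in the comment above: it checks against whatever limit the admin supplies, but says nothing about how much memory should be left for the OS, file system cache, or other processes.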
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)