Description

Overview

This feature provides better support for running stateful services on Mesos, such as HDFS (Distributed Filesystem), Cassandra (Distributed Database), or MySQL (Local Database).
Current resource reservations (henceforth called "static" reservations) are fixed by the slave operator at slave start time, and individual frameworks have no authority to reserve resources themselves.
Dynamic reservations allow a framework to reserve offered resources itself, so that those resources will only be re-offered to the same framework (or to other frameworks with the same role).
This is especially useful if the framework's task has stored some state on the slave and needs a guaranteed set of resources reserved so that it can re-launch a task on the same slave to recover that state.

Planned Stages

The goal of this stage is to allow the framework to send back a Reserve/Unreserve operation, which the master validates and then uses to update the allocator's resources. The allocator's allocation logic is left unchanged, and the resources are offered back to the framework's role as desired.
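
The flow described for this stage can be sketched roughly as follows. This is a toy model, not Mesos code: Pool, validate, apply_operation and offerable are hypothetical names, resources are reduced to a single number per slave, and the rule that a role is offered unreserved resources plus its own reservations is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Toy per-slave resource pool tracked by a hypothetical allocator."""
    unreserved: float                              # resources not reserved for any role
    reserved: dict = field(default_factory=dict)   # role -> amount reserved for that role


def validate(pool, op, role, amount):
    """Master-side check. Returns an error string, or None if the operation is valid."""
    if op not in ("RESERVE", "UNRESERVE"):
        return "unknown operation"
    if amount <= 0:
        return "amount must be positive"
    if op == "RESERVE" and amount > pool.unreserved:
        return "not enough unreserved resources"
    if op == "UNRESERVE" and amount > pool.reserved.get(role, 0.0):
        return "role does not hold that much reserved"
    return None


def apply_operation(pool, op, role, amount):
    """Validate the operation, then update the allocator's view of the pool."""
    error = validate(pool, op, role, amount)
    if error is not None:
        raise ValueError(error)
    if op == "RESERVE":
        pool.unreserved -= amount
        pool.reserved[role] = pool.reserved.get(role, 0.0) + amount
    else:
        pool.reserved[role] -= amount
        pool.unreserved += amount
        if pool.reserved[role] == 0:
            del pool.reserved[role]


def offerable(pool, role):
    """Allocation rule, untouched by reservations: unreserved plus the role's own share."""
    return pool.unreserved + pool.reserved.get(role, 0.0)
```

In this sketch, reserving resources for one role shrinks what offerable returns for every other role, while the allocation rule itself never changes. Only the bookkeeping it reads from is updated.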

The goal of this stage is to persist the reservation state on the slave. Currently the master stores persistent volumes in the checkpointedResources data structure, which is sent to individual slaves to be checkpointed. We will update the master so that dynamically reserved resources are stored in checkpointedResources as well. This stage also involves subtasks such as updating the slave (re)registration logic to support slave restarts.
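
A minimal sketch of the persistence idea, under loud assumptions: the dictionary fields (persistent_volume, dynamically_reserved), the JSON file format, and the function names are all hypothetical and chosen only for illustration. They do not reflect how Mesos actually encodes or stores checkpointedResources.

```python
import json
import os
import tempfile


def checkpointed_resources(resources):
    """Select what the master would send to a slave for checkpointing:
    persistent volumes (already the case) plus dynamic reservations (new)."""
    return [r for r in resources
            if r.get("persistent_volume") or r.get("dynamically_reserved")]


def checkpoint(path, resources):
    """Slave side: write to a temporary file, then rename it over the target,
    so a crash mid-write never leaves a half-written checkpoint."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(resources, f)
    os.replace(tmp, path)


def recover(path):
    """Slave side, on restart: reload the checkpoint (empty if none exists).
    The recovered list is what the slave would report when it re-registers."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return json.load(f)
```

A slave that restarts would call recover before re-registering, so the reservations it reports match what the master last asked it to checkpoint.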