Description

Overview

This is a feature to provide better support for running stateful services on Mesos such as HDFS (Distributed Filesystem), Cassandra (Distributed Database), or MySQL (Local Database).
Current resource reservations (henceforth called "static" reservations) are statically determined by the slave operator at slave start time, and individual frameworks have no authority to reserve resources themselves.
Dynamic reservations allow a framework to dynamically reserve offered resources, such that those resources will only be re-offered to the same framework (or other frameworks with the same role).
This is especially useful if the framework's task stored some state on the slave, and needs a guaranteed set of resources reserved so that it can re-launch a task on the same slave to recover that state.
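
The role-based offer rule described above can be illustrated with a small Python sketch. This is a toy model, not Mesos code: the `Resource` and `Framework` classes and the `offerable` function are hypothetical names introduced here, and the only detail borrowed from Mesos is the `*` name for the default (unreserved) role.

```python
from dataclasses import dataclass

UNRESERVED = "*"  # default role: resources not reserved for anyone


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for one resource on a slave."""
    name: str            # e.g. "cpus", "disk"
    amount: float
    role: str = UNRESERVED


@dataclass(frozen=True)
class Framework:
    """Hypothetical stand-in for a registered framework."""
    name: str
    role: str


def offerable(resources, framework):
    """Return the resources that may be offered to `framework`.

    Unreserved resources go to anyone; reserved resources are only
    offered to frameworks registered with the matching role.
    """
    return [r for r in resources if r.role in (UNRESERVED, framework.role)]


slave_resources = [
    Resource("cpus", 4),
    Resource("cpus", 2, role="hdfs"),     # reserved for the "hdfs" role
    Resource("disk", 1024, role="hdfs"),  # reserved for the "hdfs" role
]

hdfs = Framework("hdfs-framework", role="hdfs")
web = Framework("web-framework", role="web")

hdfs_offer = offerable(slave_resources, hdfs)
web_offer = offerable(slave_resources, web)
```

In this sketch the `web` framework would see only the unreserved CPUs, while any framework registered with the `hdfs` role would also see the reserved CPUs and disk.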

Planned Stages

The goal of the first stage is to allow the framework to send back a Reserve/Unreserve operation, which the master validates and then applies to the allocator's resources. The allocator's allocate logic is left unchanged, and the resources get offered back to the framework's role as desired.
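
A minimal sketch of this validate-then-apply flow, assuming a single slave and a single scalar resource. The `ToyAllocator` class, its operation strings, and its error messages are all hypothetical; the checks the master actually performs are not spelled out in this ticket.

```python
from collections import defaultdict

UNRESERVED = "*"  # default role: resources not reserved for anyone


class ToyAllocator:
    """Hypothetical model: CPUs on one slave, tracked per role."""

    def __init__(self, cpus):
        self.pool = defaultdict(float)
        self.pool[UNRESERVED] = cpus

    def validate(self, op, role, cpus):
        """Return an error string, or None if the operation is acceptable."""
        if op not in ("RESERVE", "UNRESERVE"):
            return f"unknown operation: {op}"
        if cpus <= 0:
            return "amount must be positive"
        if role == UNRESERVED:
            return "cannot reserve for the default role"
        # RESERVE draws from the unreserved pool, UNRESERVE from the role's pool.
        source = UNRESERVED if op == "RESERVE" else role
        if self.pool[source] < cpus:
            return f"not enough resources held by role '{source}'"
        return None

    def apply(self, op, role, cpus):
        """Validate first; only a valid operation changes the pools."""
        error = self.validate(op, role, cpus)
        if error is not None:
            raise ValueError(error)
        if op == "RESERVE":
            source, target = UNRESERVED, role
        else:
            source, target = role, UNRESERVED
        self.pool[source] -= cpus
        self.pool[target] += cpus


allocator = ToyAllocator(cpus=4.0)
allocator.apply("RESERVE", "hdfs", 2.0)  # 2 CPUs move from "*" to "hdfs"
```

Offer generation is deliberately left out: once the pools are updated, an unchanged role-based allocation step would hand the reserved CPUs back only to frameworks in the `hdfs` role.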

The goal of the second stage is to persist the reservation state on the slave. Currently, the master stores persistent volumes in the checkpointedResources data structure, which is sent to individual slaves to be checkpointed. We will update the master so that dynamically reserved resources are stored in checkpointedResources as well. This stage also involves subtasks such as updating the slave (re)register logic to support slave restarts.
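
The checkpoint-and-recover idea can be sketched as follows. This is only an illustration under stated assumptions: the JSON file layout and the `checkpoint` and `recover` functions are hypothetical and do not reflect the format or code Mesos uses on the slave.

```python
import json
import os
import tempfile


def checkpoint(path, checkpointed_resources):
    """Persist the resource list so that it survives a restart.

    Writes to a temporary file in the same directory and then renames
    it over `path`, so a crash never leaves a half-written checkpoint.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(checkpointed_resources, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def recover(path):
    """Load the last checkpoint, or an empty list if none exists."""
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return json.load(f)


# Hypothetical content: one persistent volume and one dynamic reservation.
example_resources = [
    {"name": "disk", "amount": 1024, "role": "hdfs", "volume": "data"},
    {"name": "cpus", "amount": 2, "role": "hdfs"},
]
```

On restart, a slave in this model would call `recover` and report the result when it re-registers, so that the master's view and the slave's view of the reservations agree.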

Adam B
added a comment - 18/May/16 21:12
Michael Park, what's left before we can say that "Dynamic Reservations" has shipped?
Can we move the unresolved tasks from this JIRA into a Dynamic Reservations v2 Epic, so we can close this one out?

Adam B
added a comment - 01/Jul/15 02:07
I'll explain more in the 0.23 release blog post, but the short answer is that SSL was the gating feature, and now that it's landed, we're ready to cut 0.23. Persistent Volumes (MESOS-1554) and Dynamic Reservations are both mostly complete, but there are a few unresolved issues left in those Epics. The biggest remaining issue/feature is operator endpoints for /reserve, /unreserve, and /destroy. Without these endpoints, a framework that does not shut down cleanly may leave reservations/volumes lingering, without an easy way for the operator to clean them up. (Workaround: create a cleanup framework that registers as the same role and unreserves/destroys any reservations/volumes it is offered.) As such, these features will be included in Mesos 0.23 in alpha/experimental state, so you can start modifying your frameworks to use the new APIs, but we would not recommend using them in a multi-framework production environment. Mesos 0.24 is targeted for early August (before MesosCon), and I am confident that those features will be production-ready by then.

Jay Taylor
added a comment - 01/Jul/15 01:47
Dear Adam B,
Hey there, I've been keeping an eye on this story and am keenly looking forward to the new capabilities which are made possible by this!
Will you please shed some light on why MESOS-2018 "Dynamic Reservation" has been moved to 0.24x (further into the future)?
Thank You!
Jay