automatically undo when a recipe is removed from the node's run_list

Description

An undo feature would be very useful for enterprise adoption of Chef. The feature means:

When a recipe is removed from the node's run_list (e.g. a role is removed from the node's role list, or a recipe is removed from a role's run_list), chef-client detects the difference between the node's new run_list and the run_list used in its last run, and undoes the changes made by the removed recipe (e.g. stops services and removes packages/files installed by that recipe).

Proposed Solution
1. Manually. Write a new recipe abc_undo that cleanly uninstalls whatever recipe abc set up, and add it to the node's run_list. Once the uninstall has run, remove abc_undo from the node's run_list.
2. Automatically. chef-client detects the diff between the old and new run_list, automatically runs the abc_undo recipe, then runs the recipes that are actually needed.

We would prefer the automatic approach, because the manual approach is complicated and not straightforward.
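
To make the automatic option concrete, here is a minimal sketch in plain Ruby of the run_list diff it would need. Nothing here is an existing chef-client API: `undo_plan`, the `_undo` suffix convention, and the example run_lists are all assumptions made for illustration.

```ruby
# Hypothetical helper, not part of chef-client.
# Given the run_list from the last run and the new run_list, return the
# recipes to execute: undo recipes for everything removed, then the new list.
def undo_plan(previous_run_list, current_run_list, undo_suffix: '_undo')
  # Array#- keeps the order of the previous run_list.
  removed = previous_run_list - current_run_list
  # Assumed naming convention: the undo recipe for "abc" is "abc_undo".
  undo_recipes = removed.map { |recipe| "#{recipe}#{undo_suffix}" }
  undo_recipes + current_run_list
end

previous = ['base', 'abc', 'nginx::source']
current  = ['base', 'nginx::package']

puts undo_plan(previous, current).inspect
# prints ["abc_undo", "nginx::source_undo", "base", "nginx::package"]
```

The diff itself is simple. The sketch only works if every cookbook ships a matching undo recipe, and it has no way to know whether nginx::source_undo would remove something that nginx::package needs.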

Jesse Hu
added a comment - 19/Oct/13 7:58 AM
Thanks Nathan L Smith. I don't need this feature urgently for my current project, but I think it would be really useful. The typical scenario is that version N of a deployment has fewer recipes than version N-1, and version N is deployed on the same machine that version N-1 was deployed on.

Serdar Sutay
added a comment - 23/Oct/13 10:04 PM
This is a complicated scenario, Jesse Hu.
Currently we don't have any reliable way to make sure that an item is automatically added to the run list when some recipe is removed from a cookbook. Enforcing this consistently for every cookbook is also hard, let alone the questions of how long the "reverting" recipe lives in the run list and what happens if you remove "nginx::source" but add "nginx::package".
Let me know if you have any further questions about this.

Lamont Granquist
added a comment - 23/Oct/13 10:18 PM
the way to do this is to have something like:
apache::enable – asserts apache is installed, running, configured, etc
apache::disable – asserts apache is stopped, init scripts removed, package deinstalled from the system
apache::default – reads an attribute and runs either apache::enable or apache::disable
then from your role[base] you would always include apache::default with the attribute defaulting to disable, so all servers would run apache::disable by default. then twiddle the attribute in roles/attribute files/recipes as desired to conditionally set up apache on a server.
this is 'level-triggered' and asserts a given state. it does not try to 'undo' anything (which is, literally, impossible) and forces you to write a 'forwards-looking' definition of what it means to uninstall apache – it is mathematically provable that chef cannot reliably guess what you want to do. it is also going to be more auditable, since you can search your infrastructure for servers which set or do not set that attribute (and if we could search recipes that were include_recipe'd then you could search that as well, but unfortunately you can't right now...)
so the automatic solution suffers from mathematical impossibility (http://markburgess.org/papers/totalfield.pdf) and you've already got the tools that you need to do the manual solution.
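
The enable/disable/default layout described above could be sketched roughly as the cookbook fragment below. It is an illustration only: the attribute name `node['apache']['enabled']` and the `apache2` package/service name are assumptions (the service name differs by platform), and apache::enable is left out. The fragment needs a Chef run to execute and is not standalone Ruby.

```ruby
# attributes/default.rb
# Hypothetical attribute name; every node defaults to the disabled state.
default['apache']['enabled'] = false

# recipes/default.rb
# Level-triggered: every run asserts one of the two states.
if node['apache']['enabled']
  include_recipe 'apache::enable'
else
  include_recipe 'apache::disable'
end

# recipes/disable.rb
# Asserts the "absent" state. 'apache2' is an assumed name; a real cookbook
# may also need to handle nodes where the service was never installed.
service 'apache2' do
  action [:stop, :disable]
end

package 'apache2' do
  action :remove
end
```

A web-server role would then override the attribute to true, while every other node keeps converging on apache::disable.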

Lamont Granquist
added a comment - 23/Oct/13 10:22 PM
Also, my solution is really what you want for auditing. You want to be able to assert that every server in your entire infrastructure is in a state of either enabled or disabled, and that Chef is running every day/hour/whatever and enforcing either enabled or disabled on all servers. Enforcing that "if a server had been enabled then we have run the disable recipe once" is much harder than running it all the time, and it does not prevent an admin from installing and firing up an apache server by hand, so the "disabled" state becomes wishy-washy and is only a "maybe-disabled-if-nobody-twiddled-with-it-i-kinda-think", which doesn't stand up to auditing... So the magic solution is harder, less useful, and impossible to do.

Jesse Hu
added a comment - 25/Oct/13 2:24 AM
Thanks Serdar. I understand this feature is complicated, since the atomic unit of operation in Chef is the recipe, and a recipe does more than install or uninstall packages: it can also add or remove files and execute scripts, which might not be easily reverted. If we plan to support it, we should work out a practical and easy approach.