Production Updates with a Big Red Button

Imagine, if you would, a modified game of Jeopardy hosted by Alex Trebek, of course. This game is all about getting changes to production, and a smart Dev is in the hot seat.

Dev: Can I have Daily Production Pushes for $200?

Alex: You just deployed a new code change to production. It is causing major performance issues and you need to roll it back. How do you roll back the troublesome change without rolling back the entire release? (Jeopardy theme music playing)…

Dev: What is a Feature Toggle?

Alex: Correct.

Dev: Can I have Daily Production Pushes for $500?

Alex: You just started work on a new feature that is nowhere near ready, and you check your changes in to source control. The powers that be want to push the branch containing your new feature to production… and they want to do it NOW! How do you push the branch without exposing the unfinished feature, and without having to revert the feature and the other work mixed in between your feature changes?

Dev: What is a Feature Flag?

Alex: Correct.

Dev: Can I have Daily Production Pushes for $1,000?

Alex: Daily Double

Alex: Marketing wants to be able to turn a feature on and off for customers depending on their transaction volume. How do you accomplish it?

Dev: What is a Ticket Flag?

Alex: Correct.

Alex: You are our Production Jeopardy Champion!

Feature Flags, Feature Toggles, and, in our example below, Ticket Flags all refer to the same thing: a mechanism for switching changes on and off live in production. In the method below we changed how a value is selected in a drop-down list. The change is trivial and may not warrant a feature flag, but it illustrates the usage. The old way of selecting the item was causing errors, and the bug fix is active only if TicketFlag() returns true. If TicketFlag() queries a database or config file that Ops can easily change, then Ops can turn features and changes on and off as necessary. More robust implementations could also roll out features to users by role, group, user opt-in, A/B test, soft rollout…and more.

protected void SelectDropDownListItem(string item)
{
    // TicketFlag() is used to turn code changes off and on in production
    if (TicketFlag("OTF9906"))
    {
        // This is the fix that was deployed to prod
        if (ddl.Items.FindByValue(item) != null)
        {
            ddl.SelectedValue = item;
        }
    }
    else
    {
        // This is the old code
        ddl.SelectedValue = item;
    }
}

// This type of construct would most likely be in a separate application project and
// would return if the ticket is active or not, maybe by calling a service connected
// to db or config file, in this instance a simple switch statement
private bool TicketFlag(string ticketId)
{
    switch (ticketId)
    {
        // Active Tickets
        case "OTF9906":
        {
            return true;
        }
    }
    return false;
}
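
To make the "config file that Ops can change" idea concrete, here is a minimal sketch of a file-backed TicketFlag(). The class name FileTicketFlags and the "ticketId=true|false" file format are assumptions made up for this illustration, not part of any existing library.

```csharp
using System;
using System.IO;

// Hypothetical flag store: reads "ticketId=true|false" lines from a text file
// that Ops can edit without a redeploy. The file format is an assumption.
public sealed class FileTicketFlags
{
    private readonly string _path;

    public FileTicketFlags(string path)
    {
        _path = path;
    }

    // Returns true only when the ticket is listed and set to "true".
    // An unknown ticket, a missing file, or an unparsable value all mean "off",
    // so the old code path stays the default.
    public bool TicketFlag(string ticketId)
    {
        if (!File.Exists(_path))
        {
            return false;
        }

        foreach (string rawLine in File.ReadAllLines(_path))
        {
            string line = rawLine.Trim();
            if (line.Length == 0 || line.StartsWith("#", StringComparison.Ordinal))
            {
                continue; // skip blank lines and comments
            }

            string[] parts = line.Split(new[] { '=' }, 2);
            if (parts.Length == 2 &&
                string.Equals(parts[0].Trim(), ticketId, StringComparison.OrdinalIgnoreCase))
            {
                bool enabled;
                return bool.TryParse(parts[1].Trim(), out enabled) && enabled;
            }
        }

        return false;
    }
}
```

This sketch rereads the file on every call, which keeps it simple and means an edit by Ops takes effect on the next check. A real implementation would probably cache the values and refresh them periodically.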

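The Daily Double scenario, where Marketing turns a feature on for customers based on transaction volume, needs a flag that is evaluated per customer rather than globally. The following is only a sketch under assumptions: the Customer class, its MonthlyTransactions property, and the CustomerTicketFlags class are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical customer shape; only the field the rule needs.
public sealed class Customer
{
    public string Id { get; set; }
    public int MonthlyTransactions { get; set; }
}

// Maps a ticket ID to a rule that is evaluated for each customer.
public sealed class CustomerTicketFlags
{
    private readonly Dictionary<string, Func<Customer, bool>> _rules =
        new Dictionary<string, Func<Customer, bool>>(StringComparer.OrdinalIgnoreCase);

    // Registers (or replaces) the rule for a ticket.
    public void Register(string ticketId, Func<Customer, bool> rule)
    {
        if (ticketId == null) throw new ArgumentNullException("ticketId");
        if (rule == null) throw new ArgumentNullException("rule");
        _rules[ticketId] = rule;
    }

    // Tickets with no registered rule are off for everyone.
    public bool TicketFlag(string ticketId, Customer customer)
    {
        if (ticketId == null || customer == null)
        {
            return false;
        }

        Func<Customer, bool> rule;
        return _rules.TryGetValue(ticketId, out rule) && rule(customer);
    }
}
```

With this in place, a call such as `flags.Register("OTF9906", c => c.MonthlyTransactions >= 1000)` would enable the change only for higher-volume customers. The threshold of 1000 is made up for the example, and in practice it would come from a database or config file so Marketing can adjust it.
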
The problem with this approach is that it tends to pollute your code base. You need policies and procedures in place to remove feature flags that are no longer needed. This could be mitigated somewhat by automated processes that find removal candidates. If feature flag names are tokenized consistently, an algorithm can search the code base for flags that are ready for removal. When a change is approved in production, the algorithm can sniff out the feature flags for that change, record the file name and line number of every place a flag should be deleted, create a ticket in the project management system for the developer who created the flag, and send that developer an email or IM reminder to remove it.
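
The search step could look something like the sketch below, which records the file name and line number of every TicketFlag("...") call for a given ticket. FlagScanner and FlagUsage are hypothetical names, and the sketch assumes flags are always checked with a string literal, as in the example above. Creating the ticket and sending the reminder are left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// One place in the code base where a flag is checked.
public sealed class FlagUsage
{
    public string FileName { get; set; }
    public int Line { get; set; }
    public string TicketId { get; set; }
}

public static class FlagScanner
{
    // Matches calls such as TicketFlag("OTF9906") and captures the ticket ID.
    private static readonly Regex FlagCall = new Regex(
        @"\bTicketFlag\s*\(\s*""(?<id>[^""]+)""\s*\)");

    // Scans the lines of one file; line numbers are 1-based.
    public static List<FlagUsage> ScanLines(string fileName, IEnumerable<string> lines, string ticketId)
    {
        var found = new List<FlagUsage>();
        int lineNumber = 0;
        foreach (string line in lines)
        {
            lineNumber++;
            foreach (Match match in FlagCall.Matches(line))
            {
                if (match.Groups["id"].Value == ticketId)
                {
                    found.Add(new FlagUsage { FileName = fileName, Line = lineNumber, TicketId = ticketId });
                }
            }
        }
        return found;
    }

    // Walks every .cs file under the given root directory.
    public static List<FlagUsage> ScanDirectory(string root, string ticketId)
    {
        var found = new List<FlagUsage>();
        foreach (string path in Directory.EnumerateFiles(root, "*.cs", SearchOption.AllDirectories))
        {
            found.AddRange(ScanLines(path, File.ReadLines(path), ticketId));
        }
        return found;
    }
}
```

A plain text match like this will also report calls that appear inside comments, and it will miss the `case "OTF9906":` entry in the switch statement, so the results are removal candidates for a developer to review rather than a list of safe deletions.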

Another problem is premature activation. There could be a situation where a feature flag is activated before the feature is ready for prime time. Facebook did this and they “took the site down for a few minutes” to fix it (http://thenextweb.com/facebook/2010/12/16/facebook-who-just-pressed-the-big-red-button/). TNW called it pushing the big red button, and we can have one too. To mitigate the problem of premature feature flag activation, the activation could be wrapped in an approval process in which more than one person must authorize the flag before it goes live.
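
One way to picture that approval process is a flag that only reports itself as active once enough distinct people have signed off. This is a rough sketch, not a description of how Facebook or any library does it. ApprovedFlag and its default of two approvals are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical flag that stays off until enough distinct people approve it.
public sealed class ApprovedFlag
{
    private readonly HashSet<string> _approvers =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    private readonly int _requiredApprovals;

    public ApprovedFlag(string ticketId, int requiredApprovals = 2)
    {
        if (requiredApprovals < 1)
        {
            throw new ArgumentOutOfRangeException("requiredApprovals");
        }
        TicketId = ticketId;
        _requiredApprovals = requiredApprovals;
    }

    public string TicketId { get; private set; }

    // Records one sign-off. The same person approving twice counts once.
    public void Approve(string approver)
    {
        if (string.IsNullOrWhiteSpace(approver))
        {
            throw new ArgumentException("Approver name is required.", "approver");
        }
        _approvers.Add(approver.Trim());
    }

    // The flag goes live only after enough distinct approvals.
    public bool IsActive
    {
        get { return _approvers.Count >= _requiredApprovals; }
    }
}
```

Approver names are compared case-insensitively here so that "alice" and "Alice" count as the same person. A real system would tie approvals to authenticated user accounts rather than free-text names.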

Obviously, there are other ways to slip up, like reversing the if statement on a flag. Instead of making the old code the default, the dev may inadvertently put the new code in the else branch. Not much can be done here, but these types of mishaps happen with or without feature flags.

I liked this idea a lot and believe there is no reason .NET shops can’t deploy daily with this type of approach, but nothing existed out of the box for .NET. I found an open source Java project (Togglz) and decided to create something similar for .NET (Featurz – https://github.com/charleslbryant/featurz). It’s still in preliminary development. If you like the idea, please check it out and suggest features and changes. When it’s finally somewhat usable, please help out with some code, documentation, testing, blog posts, tweets, coin…whatever you can.