Thinking About Fallback Values In Circuit Breakers In ColdFusion

In my previous noodling on Circuit Breakers in ColdFusion, I talked about the ability for a Circuit Breaker to throw special types of errors - errors that the calling context could catch and respond to. But, the more I thought about this, the less I could think of a reason why the calling context would want to differentiate between a "CircuitBreakerOpen" error and, for example, a "ConnectionTimeout" error. If you go through the trouble of providing a fallback value, it makes the most sense to simply provide that fallback value for all errors. As such, I wanted to revisit the Circuit Breaker, this time building the concept of a fallback directly into the action marshaling.

In my current approach to Circuit Breakers in ColdFusion, the breaker can be tripped open if the target component or closure appears to be unhealthy. An unhealthy target is one that throws too many errors in a fixed period of time, or one that fails to respond in a timely manner, holding too many requests open. In either case, the Circuit Breaker will start to "fail fast", throwing "CircuitBreakerOpen" errors for all subsequent requests that need to be marshaled.

At first, I thought that the calling context would catch and respond to these types of errors:

<cfscript>

	// Create our Circuit Breaker and the gateway whose actions we'll proxy.
	breaker = new CircuitBreaker();
	testGateway = new TestGateway();

	// In my previous pass on the Circuit Breaker, I had imagined the calling context
	// could try-catch errors and then return a fallback value if it so desired.
	try {

		result = breaker.executeMethod( testGateway, "makeBadCall", [ "Meh" ] );

	} catch ( CircuitBreakerOpen error ) {

		// Define a fallback specifically for the case in which the Circuit Breaker
		// had been tripped open.
		result = "Some fallback value";

	} catch ( any error ) {

		// Or, a fallback for any other type of error.
		result = "Some generic fallback value";

	}

	writeOutput( result );

</cfscript>

As you can see in this approach, the calling context has the ability to respond specifically to "CircuitBreakerOpen" errors. But, as I stated above, I can't really think of a good reason why the calling context would want to differentiate. As such, I went back and updated the Circuit Breaker execution methods to accept an optional fallback argument that the Circuit Breaker returns in the case of an error. The fallback can either be a static value or a function / closure. In the case of a function or closure, the fallback will only be invoked (and its result returned) if an error occurs.

<cfscript>

	// Create our Circuit Breaker and the gateway that we'll proxy its actions.
	breaker = new CircuitBreaker();
	testGateway = new TestGateway();

	// Now, you can pass the fallback value in as an optional argument. This fallback
	// value will then be used for any error that occurs during the action marshaling.
	// The ".executeMethod()" works with a method name but the ".execute()" method
	// works with Closures and Functions for marshaled invocation:
	result = breaker.execute(
		function() {

			return( testGateway.makeBadCall( "Meh" ) );

		},
		"Static fallback value."
	);

	writeOutput( "#result# <br />" );

	// And of course, the fallback value for .execute() can be either a static value
	// or a Function / Closure.
	result = breaker.execute(
		function() {

			return( testGateway.makeBadCall( "Meh" ) );

		},
		function() {

			return( "Fallback value from evaluated closure." );

		}
	);

	writeOutput( "#result# <br />" );

</cfscript>
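Going by the .executeMethod() signature, the same fallback argument should work there as well, and leaving the fallback out entirely should let the error bubble up to the calling context. Here's a rough sketch of both cases, assuming the same CircuitBreaker and TestGateway components as the demo above, and assuming that makeBadCall() always throws:

<cfscript>

	// Same setup as the demo above.
	breaker = new CircuitBreaker();
	testGateway = new TestGateway();

	// The fallback is passed as the fourth argument of .executeMethod(), after the
	// target, the method name, and the method arguments.
	result = breaker.executeMethod(
		testGateway,
		"makeBadCall",
		[ "Meh" ],
		"Static fallback value."
	);

	writeOutput( "#result# <br />" );

	// With no fallback provided, the error is not swallowed by the Circuit Breaker;
	// the calling context has to deal with it.
	try {

		result = breaker.execute(
			function() {

				return( testGateway.makeBadCall( "Meh" ) );

			}
		);

	} catch ( any error ) {

		writeOutput( "Error propagated: #error.message# <br />" );

	}

</cfscript>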

Because the fallback value is optional, I ended up creating two different execution methods to simplify the logic - one that takes a target component and a method name; and, one that takes a function or closure.

Internally to the Circuit Breaker, there are two different reasons that an error can be propagated. Either the Circuit Breaker is open and the marshaled request needed to "fail fast"; or the target itself threw an error. In order to keep the internal logic cleaner, I didn't want to build the fallback concept directly into that internal workflow. Instead, I moved the request marshaling into a private method - run() - and kept the top-level execute() methods as fallback-aware entry points:

<cfscript>

	/**
	* I marshal the given action inside the Circuit Breaker.
	*
	* @target I am the function or closure to be invoked.
	* @fallback I am the value to be evaluated if the action fails to complete successfully.
	* @output false
	*/
	public any function execute(
		required any target,
		any fallback
		) {

		try {

			return( run( target ) );

		} catch ( any error ) {

			// If a fallback has been provided, return the fallback instead of letting
			// the error propagate to the calling context.
			if ( structKeyExists( arguments, "fallback" ) ) {

				return( evaluateFallback( fallback ) );

			}

			rethrow;

		}

	}

	/**
	* I marshal the given action inside the Circuit Breaker.
	*
	* @target I am the component receiving the message.
	* @methodName I am the message being sent to the target.
	* @methodArguments I am the message arguments being sent to the target.
	* @fallback I am the value to be evaluated if the action fails to complete successfully.
	* @output false
	*/
	public any function executeMethod(
		required any target,
		required string methodName,
		any methodArguments = [],
		any fallback
		) {

		try {

			return( run( target, methodName, methodArguments ) );

		} catch ( any error ) {

			// If a fallback has been provided, return the fallback instead of letting
			// the error propagate to the calling context.
			if ( structKeyExists( arguments, "fallback" ) ) {

				return( evaluateFallback( fallback ) );

			}

			rethrow;

		}

	}

</cfscript>

As you can see, the execute() methods do nothing more than initiate the request marshaling and handle the fallback value (if one is provided). This creates a convenient way to consume the Circuit Breaker while the "guts" of the Circuit Breaker don't need to know anything about the concept of a fallback.
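The evaluateFallback() method that both entry points call isn't shown here. As a minimal sketch (an assumption about its shape, not the actual implementation), it only needs to check whether the fallback is a function or closure and, if so, invoke it; otherwise, it can hand back the static value as-is:

<cfscript>

	/**
	* I evaluate the given fallback (sketch only - the real method may differ).
	*
	* @fallback I am the static value, function, or closure provided by the caller.
	* @output false
	*/
	private any function evaluateFallback( required any fallback ) {

		// Functions and closures are invoked lazily - only now that an error has
		// actually occurred - and their return value becomes the result.
		if ( isClosure( fallback ) || isCustomFunction( fallback ) ) {

			return( fallback() );

		}

		// Anything else is treated as a static fallback value.
		return( fallback );

	}

</cfscript>

This uses the same isClosure() / isCustomFunction() check that the request marshaling uses to tell invocable targets apart from components.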

				// allow a single test (the current request) to be run against the target
				// in order to see if the target has reached a healthy state (at which
				// point the circuit can be closed once again). To make sure that no
				// parallel requests try to perform the same test, push out the timeout.
				// --
				// NOTE: This is an implied HALF-OPEN state.
				checkTargetHealthAtTick = ( currentTick + openStateTimeout );

			}

			activeRequestCount++;

		} // END: Lock.

		try {

			// Try to execute the requested action.
			var result = ( isClosure( target ) || isCustomFunction( target ) )
				? target()
				: invoke( target, methodName, methodArguments )
			;

			lock
				name = lockName
				type = "exclusive"
				timeout = 1
				throwOnTimeout = true
				{

				activeRequestCount--; // Circuit breaker is no longer at-capacity.

				// If we made it this far, it means that the target method invocation has
				// completed successfully. As such, we can clean up any opened state.
				if ( state == states.OPENED ) {

					state = states.CLOSED;
					failedRequestCount = 0;

				}

			} // END: Lock.

			// The target method may not return a defined value, even in a successful
			// invocation. As such, we have to check to see if the result exists before
			// we try to return the result upstream.
			if ( structKeyExists( local, "result" ) ) {

				return( result );

			} else {

				return; // void.

			}

		// Catch any errors thrown by target invocation.
		} catch ( any error ) {

			lock
				name = lockName
				type = "exclusive"
				timeout = 1
				throwOnTimeout = true
				{

				activeRequestCount--; // Circuit breaker is no longer at-capacity.

				var currentTick = getTickCount();

				// If the previous error occurred in the distant past (ie, a time greater
				// than the open-state timeout), reset the error count before we record
				// the current failure.
				if ( lastFailedRequestAtTick < ( currentTick - openStateTimeout ) ) {

					failedRequestCount = 0;

				}

				lastFailedRequestAtTick = currentTick;

				// If we made it here, the invocation of the target method failed (ie,
				// threw an error); as such, we need to check to see if this failure
				// pushed us past the failure capacity of the circuit breaker.
				if ( ++failedRequestCount > failedRequestThreshold ) {

					// Too many requests against the target have failed. The target is
					// likely in an unhealthy state. Trip the circuit open.
					state = states.OPENED;

					// Keep the breaker open until some time in the future (giving the
					// target a chance to return to a healthy state).
					checkTargetHealthAtTick = ( currentTick + openStateTimeout );

				}

			} // END: Lock.

			rethrow;

		}

	}

}
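The fail-fast check that runs before the target is invoked isn't shown in the fragment above. To illustrate the two "unhealthy" conditions described earlier (too many requests held open, or a tripped breaker that isn't ready to be re-tested), here's a standalone sketch of that decision. The shouldFailFast() function, the breakerState struct, and its activeRequestThreshold and isOpened keys are hypothetical names made up for this illustration; the actual component tracks this state in its own private variables and throws a "CircuitBreakerOpen" error rather than returning a Boolean.

<cfscript>

	// Hypothetical helper (not part of the actual component): decides whether the
	// current request should fail fast instead of being sent to the target.
	function shouldFailFast(
		required struct breakerState,
		required numeric currentTick
		) {

		// The target is already holding too many requests open.
		if ( breakerState.activeRequestCount >= breakerState.activeRequestThreshold ) {

			return( true );

		}

		// The breaker is open and it's too early to re-test the target's health.
		if ( breakerState.isOpened && ( currentTick < breakerState.checkTargetHealthAtTick ) ) {

			return( true );

		}

		return( false );

	}

	// Example state: the breaker is open and the next health check is five seconds
	// in the future, so the request should fail fast.
	breakerState = {
		activeRequestCount: 0,
		activeRequestThreshold: 10,
		isOpened: true,
		checkTargetHealthAtTick: ( getTickCount() + 5000 )
	};

	writeOutput( shouldFailFast( breakerState, getTickCount() ) );

</cfscript>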


From the various Circuit Breaker implementations that I've looked at online, the complexity can range from super simple to mind-bogglingly complex. I hope to keep my experimentation on the simple side. I think building the fallback value into the Circuit Breaker itself simplifies consumption without increasing the complexity of the internal code.

It's actually kind of nice - it forces the handling of requests to follow a generic run-book, allowing the state management implementation to be polymorphic. This also makes things easier to test since you can test the state management directly (making the test surface area of the Circuit Breaker itself quite small).

I am the co-founder and lead engineer at InVision App, Inc — the world's leading prototyping,
collaboration & workflow platform. I also rock out in JavaScript and ColdFusion 24x7 and I dream about
promise resolving asynchronously.