A Serious CFThread Bug In ColdFusion?

NOTE: This is not really a bug. This is a side-effect of ColdFusion 8's new deep copy functionality. In ColdFusion 8, sadly, deep copy will also deep copy CFCs.

Yesterday, I spent almost an hour trying to debug a problem with KinkyTwits, my ColdFusion Twitter client. After almost pulling all of my hair out, I finally realized that I wasn't fighting a bug in my code but rather a bug in ColdFusion itself. My KinkyTwits code is quite complex so this morning, I boiled it down to the most mundane example. I was a little nervous that it wasn't actually a bug, but even in this simple example, the problem is easily repeatable.

First, I created a small ColdFusion component that internally caches an integer and then increments it with an Increment() method call:

<cfcomponent output="false">

<cffunction

name="Init"

access="public"

returntype="any"

output="false"

hint="I return an intialized variable.">

<!--- Configure the variables. --->

<cfset VARIABLES.Instance = {

Index = 0,

LogFile = (

GetDirectoryFromPath( GetCurrentTemplatePath() ) &

"log.txt"

)

} />

<!--- Return This reference. --->

<cfreturn THIS />

</cffunction>

<cffunction

name="Increment"

access="public"

returntype="any"

output="false"

hint="I increment the internal index in a thread-safe way and log value to file.">

<!---

Single-thread this action. Because this incrementing

a shared scope (and because we know it will be used

in a CFThread tag) we want to make sure multiple

calls to this cannot possible corrupt each other.

--->

<cflock

name="Incrementor.Increment"

type="exclusive"

timeout="5">

<!--- Increment internal value. --->

<cfset VARIABLES.Instance.Index++ />

<!--- Log value. --->

<cffile

action="append"

file="#VARIABLES.Instance.LogFile#"

output="#VARIABLES.Instance.Index#"

addnewline="true"

/>

<!--- Return This scope for method chaining. --->

<cfreturn THIS />

</cflock>

</cffunction>

</cfcomponent>

As you can see, this is very straightforward. The internal value is cached during ColdFusion component initialization and then incremented with each Increment() method call. Inside the Increment() method call, I am single-threading (using an exclusive, named CFLock around) the actions that read from and update the internal index. While I don't think CFLocking is necessary in my scenario, since I am going to be using CFThread later on, I figured it would cut down on possible causes of the erratic behavior.

Once I had this ColdFusion component ready to go, I created a very simple Application.cfm file so that I would have a persistent scope in which to cache my Incrementor.cfc instance:

<!--- Define application. --->

<cfapplication

name="CFThreadBugTest"

applicationtimeout="#CreateTimeSpan( 0, 0, 5, 0 )#"

/>

Then, I created a small test page that created and cached an Incrementor.cfc instance in the APPLICATION scope:

<!---

Create an instance of the incrementor and cached it

in the APPLICATION scope. We are doing this only to make

sure that the "request" is not having some weird time-out

issue with the CFThread.

--->

<cfset APPLICATION.Incrementor = CreateObject(

"component",

"Incrementor"

).Init()

/>

<!--- Call the incrementor three times. --->

<cfset APPLICATION.Incrementor.Increment() />

<cfset APPLICATION.Incrementor.Increment() />

<cfset APPLICATION.Incrementor.Increment() />

Once the Incrementor ColdFusion component is cached, I then proceed to call the Increment() method three times. As expected, our log file reads:

1.02.03.0

Each subsequent call incremented the shared index correctly.

That was all executed as expected. Now, let's introduce ColdFusion 8's new CFThread tag to the equation and see what happens:

<!---

Create an instance of the incrementor and cached it

in the APPLICATION scope. We are doing this only to make

sure that the "request" is not having some weird time-out

issue with the CFThread.

--->

<cfset APPLICATION.Incrementor = CreateObject(

"component",

"Incrementor"

).Init()

/>

<!---

Call the Increment() method three times. Even though we are

using CFThread here to run the calls in parallel, the CFLock

within the Increment() method should synchronize them.

--->

<cfthread

name="T1"

action="run"

incrementor="#APPLICATION.Incrementor#">

<cfset ATTRIBUTES.Incrementor.Increment() />

</cfthread>

<cfthread

name="T2"

action="run"

incrementor="#APPLICATION.Incrementor#">

<cfset ATTRIBUTES.Incrementor.Increment() />

</cfthread>

<cfthread

name="T3"

action="run"

incrementor="#APPLICATION.Incrementor#">

<cfset ATTRIBUTES.Incrementor.Increment() />

</cfthread>

<!--- Make sure all the threads join back in. --->

<cfthread action="join" />

Here, I am running the three method calls to Increment() from within individual CFThread tags. To make sure that there is no confusion over object references, I am explicitly passing the cached instance of the Incrementor ColdFusion component into each CFThread tag context via its ATTRIBUTES. Even through CFThread runs code in parallel, remember that the named, exclusive CFLock I used inside the Incrementor() method should single thread these calls.

We should get the same exact output to our log file; but, when we run this code, our log file reads this:

1.01.01.0

Crazy right?

My first thought was that maybe there was just something going wrong in the parallel nature of the ColdFusion CFThread tags. So, to make sure that wasn't the problem, I modified the above example to have the requesting page wait for each thread to return before firing off the next one:

<!---

Create an instance of the incrementor and cached it

in the APPLICATION scope. We are doing this only to make

sure that the "request" is not having some weird time-out

issue with the CFThread.

--->

<cfset APPLICATION.Incrementor = CreateObject(

"component",

"Incrementor"

).Init()

/>

<!---

Call the Increment() method three times. Even though we are

using CFThread here to run the calls in parallel, the CFLock

within the Increment() method should synchronize them.

--->

<cfthread

name="T1"

action="run"

incrementor="#APPLICATION.Incrementor#">

<cfset ATTRIBUTES.Incrementor.Increment() />

</cfthread>

<!--- Wait for thread to join. --->

<cfthread action="join" name="T1" />

<cfthread

name="T2"

action="run"

incrementor="#APPLICATION.Incrementor#">

<cfset ATTRIBUTES.Incrementor.Increment() />

</cfthread>

<!--- Wait for thread to join. --->

<cfthread action="join" name="T2" />

<cfthread

name="T3"

action="run"

incrementor="#APPLICATION.Incrementor#">

<cfset ATTRIBUTES.Incrementor.Increment() />

</cfthread>

<!--- Wait for thread to join. --->

<cfthread action="join" name="T3" />

Here, the parallel nature of the CFThread tag doesn't even matter because we are purposefully single-threading all of our threads. But, even so, when we run the above example, our log file reads:

1.01.01.0

Bananas!!!

I have no idea what is going on here. I have picked through every inch of this code and of my knowledge of CFThread and I cannot for the life of me find any issues with the code. These examples are not complex - I wouldn't even know how to make them any more simple. My only conclusion is that this is some kind of crazy bug in ColdFusion's CFThread tag. I can't even quite describe what is going on.

Reader Comments

not sure if this is related, but according to the cfthread docs, attributes are deep-copied. I wonder if what you think is your single cached instance of that component is in fact being serialized and thus copied into a new version, which is why it's always starting out at 1.

You know more about the internals then I do and I could be way off base here, but I wonder if the issue is related to the fact that CF uses duplicate() behind the scenes on your variables that you pass into cfthread.

Try calling increment a few times before you kick off the threads and see what happens. If you call it 3 times prior to the threads does it output 3 each time?

does a deep copy really duplicate CFCs? My guess is yes. This should be pretty easy to test though. stick an instance var and assign it createUUID() in the constructor and then inside the thread, fetch the value so that you can see it. my suspicion is that when you pass in your instance into the cfthread, you're gonna see a different uuid value every time. if not... that'd be weird.

From what Rupesh has told me at conferences about CFThread, it is that it is, behind the scenes, actually a CFFunction call. It seems, that as a best practice for CFFunction calls in general, we should pass in all the required values:

function Run( Incrementor){. . . . arguments.Incrementor.Increment()}

Maybe that's a bit of stretch. But, also due to the parallel and unpredictable run-time of a thread (maybe it executes now, maybe in 10 minutes), I always just felt it was safer to explicitly pass in all values rather than assume that those "global" values would exist at the time the thread is run.

Now, obviously, this is an APPLICATION-scoped value, so its existence should never be in question. But, what about a VARIABLES-scoped or a REQUEST-scopes value? Is that guaranteed to be there when the thread executes?

My neurotic self says, NO! Therefore, I sort of developed a habit of explicitly passing all required values through the attributes.

Maybe this is crazy of me, but referring to the APPLICATION scope inside a CFThread tag feels akin to referring to the APPLICATION scope from within a CFC. My gut tells me that it breaks scope encapsulation, especially when you consider that it doesn't immediately execute.

I think the rule of not breaking scope is just that - a rule,a guideline, not a Law as in you shall always follow it 100% of the time.

Remote APIs are a perfect example of this. Imagine you build api.cfc for your web site. This is a remote-enabled CFC that lets folks get crap. Why would you NOT then use App scoped CFCs within that CFC in order to return data? It would be silly not to.

It's funny you mention that. Remote API are a great example. And, in fact, in my KinkyTwits, this holds true as well. My remote API CFC must refer to the APPLICATION.Factory as they cannot be "initialized" properly during a remote invocation.

I guess I am still growing my feelings on this. My use of CFThread is fairly limited, so my thoughts have not had time to mature.

Ben, I think your hesitance to use the shared application variable is because of a bias introduced by working in CF, which I think we all share. In other languages where threading is much more natural (and much more common), using shared variables is an extremely common practice for state maintenance. Many languages that are better at threading have constucts built in that are meant specifically to ensure that things stay in the proper state when using a shared variable or object.

I think if you poke around a few wikipedia articles on shared state maintenance, you'll get much more comfortable with the concept of threads accessing a shared object. Take a look at the wiki article on semaphores to start. Also, if you're at all comfortable with .NET, this site has an absolutely fantastic introduction to multithreaded programming in .NET that may give you a different point of view on the topic: http://www.yoda.arachsys.com/csharp/threads/

Perhaps part of this also stems from the fact that web-based applications have very different request life cycles than desktop-stlye software. With a web page, you make a request, it runs, and ends. There is a sense of start and end with a page request which I think is absent for a desktop application whose only delimiters are start up and shut down.

But of course, the fear I have might just be the fear of the unknown. The fear of not deeply considering the task at hand. If I stop and really think about all the "worst case" scenarios, I am sure that you are correct - I will see that threads act fine in this way.

ColdFusion makes a complete (deep) copy of all the attribute variables before passing them to the thread; therefore, the values of the variables inside the thread are independent of the values of any corresponding variables in other threads, including the page thread. For example, if you pass a CFC instance as an attribute to a thread, the thread gets a complete new copy of the CFC, including the contents of its This scope at the time that you create the thread. Any changes made to the original CFC outside the thread, for example, by calling a CFC function, have no effect on the copy that is in the thread. Similarly, any changes to the CFC instance in the thread have no effect on the original CFC instance.

Glad this post could help in some way. I feel a bit silly that my "problem" was clearly documented in the LiveDocs. But, I suppose that now I know more than ever about ColdFusion... and that's NEVER a bad thing.

Yes, cfthread is a pain in the ass because of the deep copying of attributes. It made it really painful for me to implement an encapsulated threading model in Edmund (my event-driven framework). I solved it by creating a thread execution context CFC. You create the thread helper, and initialize it with *references* to your variables and then call a method on that CFC to trigger cfthread, which then refers directly to the variables scope of the helper.

I suspect it's still not 100% safe due to contexts going out of scope but I really was at a loss to create a totally bullet-proof way to do this... I'm still working on it (although it hasn't failed for me yet in production load scenarios).

Yes, the thread context CFC - like the view context CFC - is a lightweight CFC that you create (on each request) and initialize with just the data needed to run the thread (or view). The CFC then runs the cfthread tag (or includes the view file) so that you have thread safety as well as a variables scope that can pass through from initializer to thread (include file).

Maybe the same issue here, but slightly different? I've found that if I create cfthreads in a loop, then join them and do a cfdump var="#cfthread#" the output is correct. But if I try to explicitly reference the cfthread struct (cfthread['t1'].output) it returns the same value each time.

The cfdump outputs the struct properly, but the collection loop doesn't. It "normally" outputs:

this is from t3this is from t3this is from t3

but if I re-run the page several times I'll end up with something like:this is from t3this is from t2this is from t3

Now, if I change my example to:<cfset lst = "t1, t2, t3"><cfthread name="t1" action="run"> this is from t1</cfthread><cfthread name="t2" action="run"> this is from t2</cfthread><cfthread name="t3" action="run"> this is from t3</cfthread>

The issue you are seeing is a side-effect of the asynchronous processing of threads. By the time any one of your threads executes, the #idx# variable might be completely exhausted (at the end of the list).

To get around this, and as a best practice, you should really pass variables into the cfthread tag:

I ran into the same problem but what ever was inside the cfthread was cache'd until i cleared both template and class cache in administrator. but what a pain. (my example i am sending an email cfmail and using cfthread to fire the email and forget it.)