Exploring The Possibility Of Parallelizing Queries In ColdFusion Using CFThread

I recently worked on a data-migration system built in Node.js. One of the really nice features of Node.js is that, by default, every asynchronous task has the ability to run in parallel. So, when it comes to data-migration, it's actually easier to run queries in parallel than it is to run them in serial. This can make the overall workflow incredibly fast. In ColdFusion, the CFQuery tag is a blocking operation. Theoretically, you could parallelize queries using the CFThread tag; but, the overhead of threading is probably too great. Or is it? I had absolutely no basis for that assumption; so, I wanted to try and run some sanity checks to see if my assumption held any water.

NOTE: I know that there is really just a single event-loop in Node.js; and, that everything in a single Node.js process runs on a single CPU. So, the idea of "parallelization" in Node.js is used very loosely. I know there's probably people who would love to argue that I'm actually talking about "concurrency" in Node.js, not "parallelization". But, to be frank, I'm just not smart enough to make such distinctions.

The test that I'm about to perform is in absolutely no way comprehensive. What I'm looking for here is not a proof-of-concept but rather, I'm looking to see if there are any immediately obvious failures-of-concept. I'm looking to see if using CFThread to run queries in parallel is blatantly slower or problematic. But, even if it doesn't appear to be slower on my local machine, again to be clear, that doesn't really prove anything.

The primary reasons that this test is not comprehensive:

My local development environment is not under any load.

There is no possibility of CFThread exhaustion due to competing requests.

The entire MySQL database likely fits in my machine's RAM.

ColdFusion can't spawn threads within threads, so this approach wouldn't be universally applicable (like it is in Node.js).

Ok, with the tiny-print out of the way, let's do some experimenting. First, to create a control case, I'm going to run a few queries that I might run in order to assemble a blog post (it's the largest data-set I have locally). And, to get a sense of trending, I'm going to do this in a loop, each time with a different primary-key value:

As you can see, nothing special is going on here. I'm simply keeping track of how long, in milliseconds, each iteration takes. Then, I'm outputting the time at the end of each iteration. And, when we run the above code, we get the following output:

... but, in the vast majority of cases, each iteration runs in low single-digit millisecond time.

Now, let's take the same exact concept and try to run these queries in parallel inside of individual CFThread blocks. And, since each query is going to be wrapped in its own CFThread body, we need to pass-in the iteration primary-key and, when we join the threads back to the page, we need to pull the query results out of the thread scope:

Since each CFThread needs to be uniquely named within the confines of a single web-request, I'm tacking a UUID onto the end of each thread name. This is only needed because we're looping over the thread-spawning. You wouldn't need to do this in a "standard" web request.

Now, my gut instinct - the assumptions that we're testing here - is that this approach will obviously be slower simply because of all the mechanics that go into spawning and running threads in ColdFusion. However, when we run the above code, we get the following output:

There are a few more outliers in the parallel version; and, the "serial" test looks to possibly trend one or two milliseconds faster; but, overall, this is way more performant than I expected it to be. Honestly, I was expecting a CFThread-based approach to be laughably slower. But, it looks like that assumption wasn't really based in reality.

So, what we see here is that the idea of parallelizing queries in ColdFusion using CFThread isn't immediately dead-in-the-water. However, that's not to say that this is truly a viable approach. After all, my local development environment is nothing like my production environment. No one is competing for thread resources. No one is competing for CPU resources. There's no reason for me to assume that local performance is any indicator of production performance.

The biggest fear that I still have in my head hinges on the fact that each production instance can only have 10 parallel CFThreads at any time (yay for Standard license limitations!). According to the documentation:

Standard Edition: Only the Async CFML, DirectoryWatcher, Socket, and user written event gateways are enabled. The maximum number of threads in the pool for the cfthread tag is limited to 10. Only the CFM page request limit is adjustable.

This number is much lower that the limit on concurrent page requests. As such, I'm much more likely to bump into a CFThread queuing issue before I run into a load problem on the application server. And that worries me way more than the performance of any individual CFThread block.

I'm actually a little angry that ColdFusion Standard makes the CFThread limit so low. Really Adobe? Really? This saddens me. "Performance" should never be a licensing concern. Only features should be.

To be honest, I am not sure where to go from here. On the one hand, I am curious to continue experimenting because running parallel queries was more performant than I had expected it to be. But, on the other hand, I have such FUD (Fear, Uncertainty, Doubt) about how likely I am to exhaust the CFThread pool. What a nail-biter!

I am the co-founder and lead engineer at InVision App, Inc — the world's leading prototyping,
collaboration & workflow platform. I also rock out in JavaScript and ColdFusion 24x7 and I dream about
promise resolving asynchronously.