Caching ColdFusion Queries Inside Other ColdFusion Queries

Over the weekend, I was playing around with some query caching for the blog when I found myself in a situation where I had two primary tables and a join table. Up until then, I had only had to deal with the caching of a single table, so persisting multiple tables in a relationship wasn't immediately obvious. I figured I could have cached all three tables and then just performed a query-of-queries in my output, joining the three tables at render-time; but, this felt somewhat sloppy. Then it occurred to me - ColdFusion query objects can hold any type of data. Why not perform the JOIN at cache-time and then simply store the resultant query inside of the primary query recordset.

Since I don't have a true caching framework set up for this demo, we're just going to use the ColdFusion Request scope as our makeshift "cache." After all, this post isn't really about caching best practices - it's about the awesome flexibility of the ColdFusion query object and how it can be leveraged within a caching mechanism.

For the demo, we need to build two primary tables - girlTable, traitTable - and our join table - girlTraitJNTable. Without having to read the build-code below, the girlTraitJNTable contains a girlID and a traitID which allows us to define a many-to-many relationship between girls and traits.

<!---

The first thing we are going to do is build three data tables

to describe our girl data. We need the girls, the traits, and

then the table that joins them:

girl: id, name

trait: id, name

girl_trait_jn: girlID, traitID

--->

<!--- Build our girls query. --->

<cfset girlTable = queryNew( "" ) />

<!--- Add ID column. --->

<cfset queryAddColumn(

girlTable,

"id",

"cf_sql_integer",

listToArray( "1,2,3" )

) />

<!--- Add NAME column. --->

<cfset queryAddColumn(

girlTable,

"name",

"cf_sql_varchar",

listToArray( "Sarah,Joanna,Tricia" )

) />

<!--- ----------------------------------------------------- --->

<!--- ----------------------------------------------------- --->

<!--- Build our traits query. --->

<cfset traitTable = queryNew( "" ) />

<!--- Add ID column. --->

<cfset queryAddColumn(

traitTable,

"id",

"cf_sql_integer",

listToArray( "1,2,3" )

) />

<!--- Add NAME column. --->

<cfset queryAddColumn(

traitTable,

"name",

"cf_sql_varchar",

listToArray( "Spunky,Seductive,Silly" )

) />

<!--- ----------------------------------------------------- --->

<!--- ----------------------------------------------------- --->

<!--- Build our join query. --->

<cfset girlTraitJNTable = queryNew( "" ) />

<!--- Add girlID column. --->

<cfset queryAddColumn(

girlTraitJNTable,

"girlID",

"cf_sql_integer",

listToArray( "1,1,2,3,3,3" )

) />

<!--- Add traitID column. --->

<cfset queryAddColumn(

girlTraitJNTable,

"traitID",

"cf_sql_integer",

listToArray( "1,3,2,1,2,3" )

) />

Now that we have our fake database in place, let's take a look at a page request that needs to perform a double-check lock in order cache our girls query in memory:

<!---

NOTE: I am using the REQUEST scope here as my cache scope. This,

of course, makes no sense as the request scope is short lived.

For demo purposes, however, this will work.

--->

<!---

Param the query as a simple value. Normally, we would just get

our cached item back from the cache; but, since we don't have a

true caching mechanism for this demo, we can fake our experiment

with CFParam.

--->

<cfparam

name="request.girls"

type="any"

default=""

/>

<!--- Check to see if the cached query is a query. --->

<cfif !isQuery( request.girls )>

<!---

The cached query does not exist in the proper format; now,

we have to create it. Let's implement a lock (a double-

check lock) to make sure we only create this cached query

as needed.

--->

<cflock

name="girls_cache_lock"

type="exclusive"

timeout="5">

<!---

Perform the double-check to make sure that another

thread didn't create the cached query while we were

waiting to obtain the lock.

--->

<cfif !isQuery( request.girls )>

<!--- Query for the girls. --->

<cfquery name="girls" dbtype="query">

SELECT

id,

name,

<!---

When we query for the girls, we are going

to create an empty column to hold our JOIN

table. This way, we can cache all of the

related tables.

--->

( '' ) AS traitsQuery

FROM

girlTable

ORDER BY

name ASC

</cfquery>

<!---

Now that we have the girls query, let's populate

the traitQuery column with the traits that apply to

this girl.

--->

<cfloop query="girls">

<!--- Query for the linked-traits for this girl. --->

<cfquery name="traits" dbtype="query">

SELECT

id,

name

FROM

traitTable,

girlTraitJNTable

WHERE

traitTable.id = girlTraitJNTable.traitID

AND

girlTraitJNTable.girlID = #girls.id#

</cfquery>

<!---

Cache the associated traits directly in the girls

query as the traitsQuery proprety.

--->

<cfset girls[ "traitsQuery" ][ girls.currentRow ] = traits />

</cfloop>

<!---

Now that we have created the girls, cache the

query in the request scope (our make-shift

caching mechanism).

--->

<cfset request.girls = girls />

</cfif>

</cflock>

</cfif>

<!--- ----------------------------------------------------- --->

<!--- ----------------------------------------------------- --->

<!---

Now that we know that our cached query exists, let's out

the girls and their traits.

--->

<cfoutput>

<!--- Loop over the cached girls. --->

<cfloop query="request.girls">

#request.girls.name#<br />

<!---

Get teh traits from the girls record.

NOTE: We purposely made the query column a non-practical

name so that it wouldn't conflict with our local

variable. An unfortunate side-effect of the fact that

query objects allow non-scoped columns.

--->

<cfset traits = request.girls.traitsQuery />

<!--- Output traits for this girl. --->

<cfloop query="traits">

--- #traits.name#<br />

</cfloop>

<br />

</cfloop>

</cfoutput>

When the request comes in, we check to see if the girls cache item is a query object. If it is not, we then single-thread the cache creation and before gathering the girls data. As we do this, notice that we are performing a sub-query for each girl record in order to find the traits associated with the given girl. Once we have this sub-query, we store it as a column in our girls query. If we were to CFDump the result query out, it would look like this:

As you can see, each related traits query is stored as a property of a particular girl's record. In this way, when we cache the girls query, we are caching all of the necessary derived traits queries with it. This allows us to render the future output without having to do any kind of render-time query-of-queries; all of our relationships have been found and cached ahead of time and can be used with a few simple CFLoop tags.

When we run the above code, we get the following output:

Joanna--- Seductive

Sarah--- Spunky--- Silly

Tricia--- Spunky--- Seductive--- Silly

As you can see, this worked extremely well.

When storing related queries within another query, it is best to use a column name that is not practical. For example, I used "traitsQuery" as my sub-query column name rather than something more convenient like "traits." The reason for doing this is that ColdFusion, unfortunately, allows for non-scoped query column references. By using a less practical column name, traitsQuery, it allows us to create reference to the sub-query, traits, without having to worry about naming collisions between our sub-query reference and our primary query column.

If you are using ColdFusion 9 with ORM integration, I am sure that this post is moot as you can immediately load an entity with all of its composed properties. If you're not dealing with an ORM-enabled system, however, being able to cache queries inside of other queries is simply beautiful. Doing this over the weekend made me fall in love with the ColdFusion query object all over again.

Reader Comments

I'd be a bit cautious with storing complex data in a query. Have you tried to QoQ against your final query? CF scans the first few rows to make guesses on the types and I wonder if it will throw a fit on the complex data.

I haven't tried doing query of queries with this particular setup only because the intent was designed to not have to do any querying at render time (on looping).

But, I know that Elliott Sprehn used the ColdFusion query object to do query-of-queries with complex data in one of his CFUNITED talks (the second one about trusting the application code). In that, he basically stored a complex object and then "sort" column and used the sort column to sort the list of complex objects. I don't recall him talking about any complications.

As awesome as ColdFusion query objects are, I know they are fraught with all kinds of type-casting headaches; however, I think those only apply to the column you are actually trying to do something within when queries. Meaning, ORDER BY and WHERE clauses where you are performing logic on the given column. I am not sure that ColdFusion does much with the data in column that only simply being retrieved.

If he used a sort col to allow for sorting of the complex data, then it must mean CF won't barf on sorting the other columns. That was my concern (since it wouldn't make sense to sort on the complex data itself). So if CF blissfully ignores those columns than it should be good to go.

I don't use this for cache oriented systems. When I freelance, I often get clients who change the spec a dozen times before we get to the final product.

A neat little 'RAD' trick is to serialize forms or data structures that change often. In this way, the application still works, and I don't spend time optimizing a table structure, or even changing a single variable. I just add or remove fields.

I use this to set nested structures as columns. This would probably require my own blog post, but we define forms as YAML structures. We parse them into Structs, render them and cache them. Structs are stored straight into the database and read like query.column.struct...

I've definitely used some XML-style approaches to data storage on small apps. It definitely makes for some sweet RAD approaches.

Have you looked into the NoSQL databases? I went to a preso recently on CouchDB and MongoDB. Both of them use "documented" databases in which each key can point to a non-uniform document. This sounds like something you might like to look into.

I am still in my NoSQL discovery phase. I have a hundred ideas swimming in my 'lizard brain.' In fact, it's why I am up right now! The tactics from my previous post are from previous experience and are tried and tested (by me at least). Until I become more versed in NoSQL I refrain from selling my ideas.

Yeah, I know what you mean, re: "There is simply so much new material to learn that it's crazy". I couldn't agree more :) It's both at the same time super exciting and intensely daunting. Hopefully, I'll get some time to try out the NoSQL stuff for myself - if I can, you better believe I'll be writing about it.

I think most of the NoSQL engines are HTTP/REST based. It definitely makes it a very interesting setup. Heck, any time you add HTTP to invocation approach, it just makes for more possibilities.

Regarding the "double check lock", this would lock out all users while the cache is repopulated, yes?

I'm trying to come up with a way to allow users to get data, even while a cache is being populated. Because, if it takes 5 minutes to populate the cache, as soon as one user hits the expired cache and locks the segment of code to repopulate, all users will be sitting for 5 minutes.

I know this is an old post but I wanted to share something I discovered when playing with queries nested within another query object.

I had a function within a CFC that created a query object and then nested another query object inside it (masterQuery > subQuery). The CFC returned the query object and to make things easier I looped through the query columns and brought everything into the local VARIABLES scope using the following format: "#a#" = masterQuery["#a#"] where "a" is the name of the query column.

After I had all of the query columns in the VARIABLES scope I tried to run a QofQ on the subQuery and low and behold I got an error message:

"coldfusion.sql.QueryColumn cannot be cast to coldfusion.sql.QueryTable"

However, when I tried a different method for bringing the masterQuery columns to the VARIABLES scope, I was able to get it to work: "#a#" = evaluate('masterQuery.#a#')

After doing some playing around using GetMetaData().getName() to determine what ColdFusion was seeing when a variable was set using masterQuery["#a#"] vs evaluate('masterQuery.#a#') I learned that ColdFusion continues to see the variable as a type: "QueryColumn" instead of its actual type (string, query, array, struct, etc...).

Currently the only way I can get CF to recognize the proper type is to use the evaluate() function.

I wonder if this a bug within ColdFusion 10. I haven't tested it in any earlier versions and after extensive searching I couldn't find anyone else with a similar problem. Regardless I thought it was pretty interesting and noteworthy.

In this article it mentions some complications when trying to reference complex objects within queries.

What I was missing in my pastebin code when trying to assign a variable to a column in the query was the bracket notation designating the desired row. I guess you have to do this when referring to complex object types.