Content Publishing Models

When you get neck deep into a content management implementation, you can lose sight of the actual publishing mechanism: how the content gets from your system to the end user’s browser. No matter how sophisticated your CMS is, at some point, a user enters a URL and some content comes out. How does a URL map to and retrieve content?

Given all the different content management systems I have my mitts in these days, I’ve seen what I think are the three big models. There is good and bad in each:

Template Pull

This is probably the most common. Your URL maps to a simple script file (PHP, ASP, Rails, whatever) that pulls content out of your repository. You may use a framework (ASP.Net server controls, like Ektron; a PHP framework; a Web service, etc.) or straight SQL, but the end result is that your templates are completely separate from your repository. They connect to it to retrieve data.

The first content management system you wrote? I’m fairly sure it was Template Pull.
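To make the shape concrete, here is a minimal Template Pull sketch in Python. Everything in it is an assumption for illustration: the `content` table, the `build_repository` and `render_help_page` names, and the use of an in-memory SQLite database as the "repository." The point is only that the template is a standalone script that reaches into the repository for the bits it wants.

```python
import sqlite3
from html import escape
from string import Template

def build_repository():
    """Create a hypothetical content repository (in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE content (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.execute(
        "INSERT INTO content (id, title, body) VALUES (?, ?, ?)",
        (1, "Help: Invoices", "Click New to create an invoice."),
    )
    conn.commit()
    return conn

# The template lives entirely outside the repository.
PAGE = Template("<html><body><h1>$title</h1><p>$body</p></body></html>")

def render_help_page(conn, content_id):
    """Pull one content item by id and drop it into the template."""
    row = conn.execute(
        "SELECT title, body FROM content WHERE id = ?", (content_id,)
    ).fetchone()
    if row is None:
        return "<html><body><h1>Not found</h1></body></html>"
    title, body = row
    return PAGE.substitute(title=escape(title), body=escape(body))
```

Note that nothing in the repository knows which URL this script answers to, which is the loose coupling discussed further down.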

The benefit is simplicity and ease of integration. Every Web developer knows how to get the data, and your templates can use CMS-sourced data for as much or as little as they like. Perhaps you’re using a CMS to manage the help screens for your CRM app. In this case, you just need to drop managed content into various spots on a page, while the rest of the page is deeply hooked into something else.

The drawbacks here are the same as with standard Web development. Oftentimes there’s no centralized architecture to your templating system, and you can get “spaghetti templates” with inbound requests flying all over the place.

Additionally, there’s a very loose coupling between a content item and its addressable URL: your CMS has no idea what URLs its content will appear on unless you specifically tell it. (And this is often just what happens. Ektron allows you to specify a template file for a content object for one reason only: so Ektron knows how to automatically form links to it.)

Full Stack

This is the eZ publish model, but it’s not all that common. In this case, your CMS “owns” every inbound request. It’s usually accomplished via goofy URLs or cleaner rewrite rules. Every request is fielded by the CMS, which maps it to content and templates and spits out the result. Content and presentation are managed in the same system.
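As a rough sketch of the idea (not how eZ publish is actually built), a Full Stack system boils down to one entry point that takes every path, finds the content, picks the template, and renders. The `CONTENT` tree, the `TEMPLATES` table, and the `handle_request` and `url_for` names below are all hypothetical.

```python
from html import escape
from string import Template

# Hypothetical content tree keyed by URL path; the CMS owns this mapping.
CONTENT = {
    "/": {"type": "folder", "title": "Home", "body": "Welcome."},
    "/news/launch": {"type": "article", "title": "Launch", "body": "We launched."},
}

# Presentation is managed in the same system, keyed by content type.
TEMPLATES = {
    "folder": Template("<h1>$title</h1><div>$body</div>"),
    "article": Template("<article><h1>$title</h1><p>$body</p></article>"),
}

def handle_request(path):
    """Single entry point: map any inbound path to (status, html)."""
    normalized = "/" + path.strip("/")
    item = CONTENT.get(normalized)
    if item is None:
        return 404, "<h1>Not found</h1>"
    template = TEMPLATES[item["type"]]
    html = template.substitute(
        title=escape(item["title"]), body=escape(item["body"])
    )
    return 200, html

def url_for(title):
    """Because the CMS owns the URL space, it can always form a link."""
    for path, item in CONTENT.items():
        if item["title"] == title:
            return path
    return None
```

In a real deployment, a rewrite rule would route every request to something like `handle_request`; here you would just call it with the path.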

The advantage is just that: oftentimes when managing content, presentation gets rolled into the mix (see our prior post about View Patterns). Additionally, the CMS manages the URLs as well, which ensures link quality. Also, there are often methods for putting content back into the system just as simply as getting it out.

The drawback is that perhaps you don’t want your CMS to do everything. If your site is primarily managed content, then it works well, but if there’s a lot of other stuff going on, you often find that you have to write code to manage this stuff from within the CMS. For instance, to integrate with systems outside eZ publish, you have to write a “module” that exposes URLs — it’s a jealous mistress, oftentimes.

(Where this limitation is most obvious is if someone wants to do something with a client-side WYSIWYG editor — something so simple can be a huge pain to bring about. See our post about incorporating static HTML into a CMS for more on this.)

Additionally, there’s often an…ickiness with this model for a lot of developers. To get used to a Full Stack system, they often have to leave what they know about Web development at the door and start over. Learning curves can be steep and developer buy-in can be slow.

Data Push

In this instance, your CMS essentially becomes a big code and/or data generator. During the publishing process, it creates files and writes them to the local file system or a remote server, via a file copy, FTP, or some other process.
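A bare-bones sketch of that publishing step, assuming a hypothetical list of content items with `slug`, `title`, and `body` fields and a local output directory (a remote target would swap the file write for an FTP or similar upload):

```python
from html import escape
from pathlib import Path

def publish(items, output_dir):
    """Write one static HTML file per content item; return the paths written."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for item in items:
        page = "<html><body><h1>{}</h1><p>{}</p></body></html>".format(
            escape(item["title"]), escape(item["body"])
        )
        target = output_dir / (item["slug"] + ".html")
        target.write_text(page, encoding="utf-8")
        written.append(target)
    return written
```

Once `publish` has run, the Web server needs nothing from the CMS to serve those pages.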

Documentum Web Publisher did this back when I worked with it, and I just had a demo of Cascade Server the other day which did the same thing in much the same way. Closer to home, this has been Movable Type’s model for years (writing to the local server only). Even Blogger can do this: you manage your stuff on Blogger’s site, then it writes to your server via FTP.

The advantage here is extreme flexibility — there’s very little you can’t do with this model. Additionally, you can publish everything statically, which is handy for a lot of reasons. Gadgetopia probably would have curled up and died a long time ago if not for the fact that Movable Type writes a bunch of static files out.

Sounds great, but there’s a drawback here, and it can be a big one: getting data back into the system can be tricky.

First, since the files the user interacts with are often separate from the CMS itself, any requests flowing back into the CMS will usually need to be addressed to the CMS itself, which assumes the CMS can be accessed (remember, it can be on an entirely different machine than the Web site). Second, if the data coming back in needs to change content that’s already published, there’s the extra step of republishing the content.

Consider Movable Type’s commenting. MT spits out a bunch of files for Gadgetopia, with which you’re interacting now. To take in a comment, however, you need to connect directly to the mt-comments.cgi script. This is pretty transparent here since it just fields the posted content, but on systems where you have real-time interaction with the CMS (you fail validation, for example, and get an error page), you need to make sure that page looks like the rest of your site, which may mean duplicating some template code in the CMS. Then, after all this is done, MT still needs to regenerate the page on which you made the newly absorbed comment.
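The round trip can be sketched like this. To be clear, `TinyPushCms` is a made-up stand-in, not Movable Type’s actual design: the comment has to be submitted to the CMS itself, and the static page only changes once the CMS republishes it.

```python
from html import escape
from pathlib import Path

class TinyPushCms:
    """Hypothetical Data Push CMS that republishes a page when a comment arrives."""

    def __init__(self, output_dir):
        self.output_dir = Path(output_dir)
        self.entries = {}   # slug -> (title, body)
        self.comments = {}  # slug -> list of (author, text)

    def add_entry(self, slug, title, body):
        self.entries[slug] = (title, body)
        self.comments.setdefault(slug, [])
        self.publish(slug)

    def publish(self, slug):
        """Regenerate the static file for one entry, comments included."""
        title, body = self.entries[slug]
        comment_html = "".join(
            "<li><b>{}</b>: {}</li>".format(escape(author), escape(text))
            for author, text in self.comments[slug]
        )
        page = "<html><body><h1>{}</h1><p>{}</p><ul>{}</ul></body></html>".format(
            escape(title), escape(body), comment_html
        )
        self.output_dir.mkdir(parents=True, exist_ok=True)
        target = self.output_dir / (slug + ".html")
        target.write_text(page, encoding="utf-8")
        return target

    def submit_comment(self, slug, author, text):
        """Inbound request goes to the CMS, not to the static page."""
        if slug not in self.entries or not text.strip():
            return False  # a real system would render an error page here
        self.comments[slug].append((author, text))
        self.publish(slug)  # the extra republishing step
        return True
```

The `return False` branch is where the "error page has to look like the rest of the site" problem would show up in a real system.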

This is not so much of a big deal here, since MT is running on the same machine from which the site is published. However, Data Push architectures enable you to geographically separate your CMS from the publishing server. If you do this, it gets more complicated.

And Data Push shares the URL-management problem of Template Pull. It has little idea of what URLs its content maps to, leaving URL management to the user.

However, what’s interesting is how both Data Push and Full Stack systems can power a Template Pull system:

Configure your Full Stack system to generate XML, then use it as a Web service datasource for a Template Pull system.

Instead of writing out PHP or ASP files, there’s no reason a Data Push system can’t write out XML, which your template files use as a source of data for a Template Pull architecture.

(RSS is a lot like this. If your Full Stack system exposes RSS, and you syndicate it somewhere, then the target is just using Template Pull from your Full Stack datasource.)

Some Data Push systems have connectors that don’t even move files — they write and refresh database records. So your CMS can manage your content, then refresh a table of records in the database on your Web server which your templates can use as a simple source of data.
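Here is a small sketch of the XML variant described above: one function plays the Data Push side and writes an XML file, and another plays the Template Pull side and reads from it. The element names (`items`, `item`, `title`, `body`) and the function names are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

def push_xml(items, path):
    """Data Push side: the CMS writes content out as an XML file."""
    root = ET.Element("items")
    for item in items:
        node = ET.SubElement(root, "item", id=str(item["id"]))
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "body").text = item["body"]
    ET.ElementTree(root).write(str(path), encoding="utf-8", xml_declaration=True)

def pull_fragment(path, item_id):
    """Template Pull side: a template reads the XML as its data source."""
    root = ET.parse(str(path)).getroot()
    for node in root.findall("item"):
        if node.get("id") == str(item_id):
            title = escape(node.findtext("title", default=""))
            body = escape(node.findtext("body", default=""))
            return "<h2>{}</h2><p>{}</p>".format(title, body)
    return ""  # unknown id: the template renders nothing
```

The database-record variant works the same way, with a table refresh in place of the file write.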

So, which is the best? As you can imagine, each one can be better depending on what you’re doing.

Template Pull is rock simple, and there’s a lot to be said for that. If your publication model is simple enough to limit the number of templates, there’s a lot to like here.

(What gets funny, however, is when you start refining your front-end so much that you implement a central controller backed by some rewrite rules…so your front-end becomes Full Stack itself.)

Data Push is great for one-way publishing, but two-way gets complicated. But without the need for data coming back into the system, Data Push is my first choice. It’s so flexible that once you get into it, your mind starts racing with different ways you can integrate it and different things you can do with it.

I’ve found that Full Stack is great when your CMS encompasses everything you like, but if you have to color outside the lines of your CMS, integration can get complicated. While I love eZ publish, this is actually my least favorite of the models — I always feel like the functionality of the system will run out at some point, and then what happens?

(But here’s the thing, we haven’t gotten there yet. I’m still waiting for eZ publish to leave us hanging. I think I’ll be waiting for a while. The same can be said of any system with a well-realized and implemented extension and plugin model.)
