Forum OpenACS CMS: Some basic reflections on what a CMS has to do

If there is one thing I like on a web page, it's a spinning logo that is on fire. When they take these arm restraints off and let me have a geocities account, first thing I'm going to do is set the background to yellow, then write all my text in italics, in light pink, with the really important bits flashing in violent orange. And, as they say genius must out, the world will quiver as I set my thoughts into action. Seeing as I'm yet to have a thought, I've decided to copy and paste the last report I wrote in its entirety onto my page. It's a cunning and unstoppable plan except that I predict no one will sit there long enough to make their eyes bleed, because no one is going to read a report on the web that was written to be read on the toilet.

Here is a fundamental truth: long reports written for the page are not appropriate for the web. The best content is written for the web, by good writers. These days, almost any web site that mixes up this idea ("It's all content right?") is doomed to being boring and unsuccessful. And this crucially affects how you build a CMS, because it means you still have to offer the ability to download those boring... er... scintillating reports. But you also have to build a way to write content to a website - and as far as I know, everyone does this by using web forms.

In terms of publishing, I've seen people use a CMS to do 2 basic things. System users usually log in and then paste a report they've written in Microsoft Word into a field, and that report becomes the top story on the news page that day. This is relatively dynamic content - it can change every hour, or every day, and it's basically stored in the database as text. The other way they publish is they go and upload a physical file, like a pdf, or a video, or a document. This is what I call an 'asset'. This gets stored in the database in its original form.

What the browsing public reads on the site is the news story that has been pasted into the field in the administration screens. So they'll read the latest news, and at the bottom of the story it might say "For more information on how to inflate rubber pigs, download our 33k pdf file here". When they click there, they will get the PDF file, which they can save or read through their browser.

So, in a way, a CMS is partially about creating screens that will allow people to publish content to a web site, and partially about creating a system that can handle resources such as documents, videos, sound files and graphics. You could call that a digital asset management system. The real trick is getting these two things to work flexibly together, and yet make each part do what it does to the best of industry standards.

But publishing isn't the end of the story by a long shot. From the perspective of the system users, the other side of a CMS is the finding of content. If you are running a medium sized site, with news changing say every day, you are going to be creating masses of content. Finding it afterwards is the trick. And you will always need to go back and find and edit stuff you've published. (And I think we should aim for medium sized here folks, or higher, because we ain't no two bit operation.)

Most importantly, and here is where the 'getting the CMS and the asset management stuff working together' becomes really important: you will need to link news stories together with assets. Say you write a story called "Growing sweet Grass on your window sill". Then you want to go through your archives and associate a picture with it. You remember that 4 years ago you took a picture of some fine five pointed leaves that would do splendidly. You have to somehow find them in the online library of images that you have. Then you have to link this image to your web story. And maybe you want a few more images ... and a video "Stashing the pot plants when being raided" ... and that sound file of you swearing from inside the police wagon... ok, I'll stop now.

As I understand, this type of cross linking of the stuff you have in your database has to be set up from the beginning. You would take a web story, entered in using web fields, then you would associate assets like images and files to it, and then theoretically, you could even associate a template to it. So each news story could have anywhere between 0 - 5 images and files linked to it, and system users could even choose how it lays out. But I get ahead of myself.
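To make that concrete, here is a minimal sketch of a story that carries its own text, a chosen template, and links to a handful of assets. It is written in Python purely for illustration (OpenACS itself is Tcl and SQL), and every name in it (Story, Asset, MAX_ASSETS, the template names) is made up for the example rather than taken from any real package.

```python
from dataclasses import dataclass, field

MAX_ASSETS = 5  # assumption: mirrors the "0 - 5 images and files" figure above


@dataclass
class Asset:
    asset_id: int
    kind: str   # e.g. "image", "pdf", "video"
    title: str


@dataclass
class Story:
    story_id: int
    title: str
    body: str
    template: str = "default"               # layout chosen by the system user
    assets: list = field(default_factory=list)

    def attach(self, asset):
        """Link an existing asset to this story, up to MAX_ASSETS."""
        if len(self.assets) >= MAX_ASSETS:
            raise ValueError("story already has the maximum number of assets")
        self.assets.append(asset)


story = Story(1, "Inflating rubber pigs", "Four short paragraphs...", template="two-pictures")
story.attach(Asset(10, "image", "Pig, inflated"))
story.attach(Asset(11, "pdf", "Full report"))
print([a.kind for a in story.assets])
```

The point of the sketch is only that the story and the assets are separate things with their own metadata, and the link between them is stored explicitly rather than buried in the story text.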

The important thing to know is that a CMS needs to handle two types of interaction from system users - uploading files and documents, and the putting of data into web fields. It also needs to be a damn fine and clever little asset handler and offer lots of search facilities.

The ambiguity comes because the words 'publish' and 'content' are not specific enough. Content can mean content you write straight into a field on a web page, or it can mean a document that you upload through a web page and which is stored in its entirety in a database.

Let me simplify. Say you have a page on a website - it's your homepage - and on it you usually put an article that is your latest news. This news article is about 4 paragraphs long - and it has 3 pictures and one link to a document that is much more detailed.

Our system user comes in with a coffee in hand, opens her email, sees that the boss has sent another report that is extremely long and detailed in Microsoft Word format. No one is going to read this, but what the hell. Sighing, she uploads it to the system. She tells the system it's a report, she clicks the button 'browse', finds the file on her file system in her computer, clicks 'open', and then adds the relevant metadata, like the title, the number of pages, the authors, and perhaps an executive summary. Then she realises her coffee has gone cold so she goes and gets another one.

Later, the system user has to go in and write a news article to go on the front page. This article will be funny, short, and well written - it will make people want to download the boss's report. She does this by logging in, then going to a web page that asks for the title, perhaps a subtitle, a teaser, the author, and then the news story. It will probably ask for news article related attributes, like a field called 'notes to editors' and 'who to contact about this story'. When she is done, perhaps the next screen asks what layout she'd like. She chooses a layout with 2 pictures and she chooses to attach a report.

The interface then displays a library of images to fill the template associated with that unique story. She searches through the library for the ones she wants, then she modifies the captions a little. Finally, she searches for the document she wants to associate to her news story. She finds the one she wants, and designates it as attached to her news story. Then she previews the entire page, checks it thoroughly, and presses publish.

According to the workflow implemented in her situation, it might go through a number of people before going live. In this example, it goes live immediately as she has the highest privileges.

What she has done to create that article is to put words into a web form - this is the main story that goes on the homepage. As I reckon it, this is like the unique story to which all other things relate. She has also uploaded a Word file and associated it with this article. The two types of content have gone into the system with different metadata attached to each one. The really cool bit is that you should be able to link the original article to any number of 'assets' - like reports, images, sound files etc. This kind of flexibility is the key to success IMHO.

What the CMS has to do is handle the 'assets' in a way that has some thought behind it. (I keep putting the word 'assets' in quotes because I remember that codey types call different things assets and they find my use of the word confusing.) So anyway, handling assets thoughtfully is the real trick.

Someone at the Copenhagen skillshare started talking along the lines of creating a kind of centralised list of all the types of things you might have in the system. This would be pretty cool, I think, because in order to link one thing to another, you need a quick way of figuring out how many bloody reports you have - or even if you have reports at all! - by looking up a kind of menu. Yep, we have reports and we have images and we have sound files - you can find them here and here and here. This kind of mmm... sorry I don't quite know the technical name... menu of things would be pretty key if we wanted to cross link bits and pieces of a system together.

Yeah like, say you wrote a document handling module. You'd need to tell that centralised menu that the system now has documents. That way all the modules could just refer to the main menu and see what the system is offering. I guess if you don't do that, if you wrote say a module that just publishes articles through a web form, and you want to attach a document to it, you'd have to say to your article module - go and check if there is a document module, query it, and get back to me. I reckon that would be rather time consuming.
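Here is roughly what I mean by that centralised menu, as a tiny Python sketch. It is not how OpenACS actually does it; TypeRegistry, register and find are names invented for the example. Each module announces its type once, and any other module can ask the registry what exists instead of probing every module in turn.

```python
class TypeRegistry:
    """Hypothetical central list of the content types a system offers."""

    def __init__(self):
        self._finders = {}

    def register(self, type_name, finder):
        """A module announces a type and a function that lists its items."""
        self._finders[type_name] = finder

    def types(self):
        """Return the names of all registered types, sorted."""
        return sorted(self._finders)

    def find(self, type_name):
        """List items of one type, or an empty list if no module offers it."""
        if type_name not in self._finders:
            return []
        return self._finders[type_name]()


registry = TypeRegistry()
# The document module and the image module each register themselves once.
registry.register("report", lambda: ["Rubber pig inflation report"])
registry.register("image", lambda: ["leaves.jpg", "logo.png"])

print(registry.types())
print(registry.find("report"))
print(registry.find("video"))   # nobody registered videos
```

An article module that wants to attach a document would then only need to call registry.find("report"), without knowing which module supplies reports.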

So anyway, sorry to ramble on Dave, I hope I answered your original question somewhere! I also have to say I do feel a little uncertain posting still, since at Copenhagen I had to internalise a lot of technical talk and turn it into some basic concepts. Sometimes the beauty is in the details and when you talk about big picture stuff you can lose a few of them pesky leetle particulars. Of course, I'm open to being educated :) I would like to know what it is that does that centralised menu thing.

Relating content with assets is already part of BCMS, since it's part of CR. The admin UI in the demo site already shows this. Unfortunately the template that I made does not show this functionality, but it's really there. Basically I just used the existing CR relations feature, so any item in CR can be related to any other item.

There are 2 problems in linking items with one another that I have encountered while working in the CMS space, in both OpenACS and CCM.

1. Putting the link within a content item's body. It's very easy to have the template pull the related images and put them beside the article (which is already implemented in BCMS). What is hard is putting the link inline. It's like an a href, but an internal one within the CMS. It must be internal to eliminate broken links (404). I have done something similar in my CCM project, linking by an internal id. Doubly hard is migrating existing content and generating the links to the new system, but then again that is another story. I am thinking of something along the lines of a href="cms://123", wherein 123 is the internal item id.

2. Using a unique id such as the object id. It would be nice if the id were not tied to OpenACS, so that migration to another CMS would be easier. I think the debate about using an id other than the object id is still going on, though.
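The cms://123 idea from point 1 could be sketched like this. It is only an illustration in Python, not BCMS or CCM code; ITEM_URLS and resolve_links are hypothetical names, and the lookup table stands in for whatever the CMS uses to map an item id to its current URL. The body is stored with internal links, and they are swapped for real URLs only when the page is rendered.

```python
import re

# Hypothetical lookup from internal item id to its current public URL.
ITEM_URLS = {123: "/news/growing-sweet-grass", 456: "/library/leaves.jpg"}

CMS_LINK = re.compile(r'href="cms://(\d+)"')


def resolve_links(html, item_urls):
    """Replace href="cms://ID" with the item's current URL at render time."""
    def swap(match):
        item_id = int(match.group(1))
        url = item_urls.get(item_id)
        if url is None:
            # Unknown id: leave the internal link untouched so it can be reported.
            return match.group(0)
        return 'href="%s"' % url
    return CMS_LINK.sub(swap, html)


body = '<p>See <a href="cms://123">the story</a> and <a href="cms://999">this</a>.</p>'
print(resolve_links(body, ITEM_URLS))
```

Because the stored body never contains the public URL, moving or renaming item 123 only changes the lookup table, and the id scheme could be swapped for something other than the object id without touching the stored content format.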

It is also great that you have pointed out that searching through the library of existing content to link it with new content is very important. I'm not exactly sure what the best approach is in terms of UI. We also have a terrible search engine when it comes to saying that you just want a particular subset to be searched. Say I want to link an image, so we should only search the metadata of images. I don't think this is easy to do in OpenFTS. It will search the whole system, not just the image library.
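What I'd want from the search looks something like the sketch below. This is plain Python substring matching over a made-up list of items, not OpenFTS, and the field names (type, title, keywords) are assumptions for the example. The only point is the optional content_type argument that narrows the search to one library.

```python
# Made-up sample items standing in for content repository metadata.
ITEMS = [
    {"id": 1, "type": "image", "title": "Five pointed leaves", "keywords": "plant leaves"},
    {"id": 2, "type": "report", "title": "Leaves and soil", "keywords": "plant report"},
    {"id": 3, "type": "image", "title": "Company logo", "keywords": "logo"},
]


def search(items, query, content_type=None):
    """Case-insensitive metadata match, optionally limited to one content type."""
    q = query.lower()
    hits = []
    for item in items:
        if content_type is not None and item["type"] != content_type:
            continue  # skip everything outside the requested library
        text = (item["title"] + " " + item["keywords"]).lower()
        if q in text:
            hits.append(item["id"])
    return hits


print(search(ITEMS, "leaves"))                        # whole system
print(search(ITEMS, "leaves", content_type="image"))  # image library only
```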

Here is a great quote, and, I think, a goal to work towards for OpenACS:

Go back to May 1999 for an explanation. "When I'm writing for the web, and I'm browsing my own site, every bit of text that I created has a button that says Edit this Page when I view it. When I click the button, a new page opens with the text in an HTML textarea. I edit. Click on Submit. The original page displays with the change. Three easy steps."

From: http://davenet.userland.com/1999/05/24/editThisPage

Our edit-this-page package was inspired by this, and is a step in the right direction in some ways, but could use the CR more effectively.

Dave, that sounds like a great application of the idea. In what way do you think the CR could be better used?

Here's a thought... Say you have a website about vacuum cleaners. It has an 'About our company' section, and a 'latest news' section. The about us section has a mission statement and history of the company page and a few other pages. They might need updating say once every 6 months if that.

Then in the latest news section, you have a page which is an overview of the three latest news stories. You display a paragraph and a link to the full story on the next page. You also have three supporting pictures, and this whole section turns over very quickly, say two news stories a day about vacuum cleaners. (PR department gone mad)

I think that ETP would be perfect for the about us section, but pretty miserable for that latest news section. One thing we have to do with all our stories is keep them in the system for legal reasons... sometimes we get legal charges on what we say and we need to be able to retrieve whole pages and different versions of the stories. (This is very rare, but we need to be able to do it.) You could probably keep old versions of pages using ETP right?

I think that you need a page that breaks news down into fields specific to the content type. Why? Well, I think because you might want a 'Review our old stories' page to display all the stories published, by title and author, latest first.
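Here is the sort of thing I mean, as a small Python sketch with invented names and made-up sample stories. Because the title, author and date are stored as separate fields rather than one blob of page text, the archive page is just a sort and a projection.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class NewsStory:
    title: str
    author: str
    published: date
    body: str


# Made-up sample data for the vacuum cleaner site.
stories = [
    NewsStory("Quieter motors", "A. Writer", date(2004, 3, 1), "..."),
    NewsStory("Bagless at last", "B. Editor", date(2004, 3, 3), "..."),
    NewsStory("New nozzle range", "A. Writer", date(2004, 3, 2), "..."),
]


def archive_listing(stories):
    """Return (title, author) pairs, most recently published first."""
    ordered = sorted(stories, key=lambda s: s.published, reverse=True)
    return [(s.title, s.author) for s in ordered]


for title, author in archive_listing(stories):
    print(title, "-", author)
```

If the whole story were a single free-text page, there would be nothing to sort on and no reliable way to pull out just the author.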

Or you might want to just do something with one little attribute associated with that content type in another module. Could you do that with ETP? (Dani thinks she must really look into ETP very soon!)

Very good points. Actually, when I first developed modETP that was one of the problems. One of my workarounds was to make an "Edit This Item" link. So say a press release page template displays several news items. Even though it's a single page, the content is taken from different pages. It's not that bad, and the template page did become one of the UIs for entering content. I guess I took some lessons from that.

The ETP style of presenting the UI to the user may work better for other kinds of organization, particularly smaller sites where each page holds a single piece of content.

Although it's a KMS, I wanted to contribute some of the things we did in sharenet for the so-called knowledge library:

You can define any new object-type and assign attributes to these types. These attributes can be shorttext, text, files, dates, integers, categories, or links (to specific object-types or general links). In order to display lists of objects you can then assign attributes to be the shortname and/or longname of the objects. To speed up the lookup of existing objects, the categorization system is used to browse the objects in all assigned categories. In addition to that, the linking package is used to link objects by link-attributes or by letting users contribute links along with comments/feedback.
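As a rough illustration of that idea (in Python, and not the actual sharenet code), an object-type can be modelled as a name plus a mapping of attribute names to attribute kinds, with one attribute designated as the shortname used in listings. ObjectType, ALLOWED_KINDS and the sample report type are all hypothetical.

```python
# Attribute kinds listed in the description above.
ALLOWED_KINDS = {"shorttext", "text", "file", "date", "integer", "category", "link"}


class ObjectType:
    """Hypothetical user-defined object type with typed attributes."""

    def __init__(self, name, attributes, shortname_attr):
        unknown = set(attributes.values()) - ALLOWED_KINDS
        if unknown:
            raise ValueError("unknown attribute kinds: %s" % sorted(unknown))
        if shortname_attr not in attributes:
            raise ValueError("shortname attribute must be one of the type's attributes")
        self.name = name
        self.attributes = attributes
        self.shortname_attr = shortname_attr

    def shortname(self, obj):
        """Return the value to show for this object in lists."""
        return obj[self.shortname_attr]


report_type = ObjectType(
    "report",
    {"title": "shorttext", "summary": "text", "pages": "integer", "document": "file"},
    shortname_attr="title",
)
obj = {"title": "Inflating rubber pigs", "summary": "...", "pages": 12, "document": "pigs.pdf"}
print(report_type.shortname(obj))
```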

Maybe this knowledge library could be of help in designing a CMS since it could be used to store and retrieve all kinds of content.