A step-by-step guide to building an RSS 1.0 document by hand. (Updated for RSS 1.0 RC1)

(This article assumes a certain familiarity with the basics of XML markup (the "pointies") and perhaps even a little fiddling with RSS itself. The introductory material is brief, focusing on the distinguishing characteristics of the recently proposed RSS 1.0.)

RSS ("RDF Site Summary") is a lightweight, multipurpose, extensible metadata description and syndication format. Whew, that was a mouthful! Let's take that bit by bit, shall we?

Lightweight
Much of the reason RSS has been successful stems from the fact that it is simply an XML document. You can write an RSS document by hand. With minimal effort, you can have your content-management system write it for you. Or, if you're a programmer at heart, you can utilize one of the abundant XML libraries available for your programming language of choice.

Multipurpose
While originally conceived as a portal language, RSS has been repurposed again and again for aggregation, discussion threads, home and job listings, sports scores, and more. It's not just for breakfast -- or headline syndication -- anymore.

Metadata
Metadata is data about data, answering questions like "Who wrote this?", "When was this published?", and "What is/are the topic(s) of discussion?" While the proposed RSS version 1.0 sports a rich metadata framework through RDF ("Resource Description Framework"), we'll only touch those bits of RDF that are mechanically necessary to include and not wander off beyond the scope of the task at hand.

Syndication
Now here's the fun bit. RSS is a snapshot-in-a-document of what you consider most interesting or important about your site at the moment. That could be your latest couple of Weblog entries, up-to-the-minute sports commentary ... anything. And you make this available for the world to grab, pass on, aggregate, or publish online -- with links right back to your site for each item.

That's about all I'll say about the overall picture of RSS. I do realize that this was a rather brief overview, but since our intention is to actually create an RSS document, I'll leave further introduction to the many wonderful RSS articles already in existence; visit the Resources section below for a list.

Let's start at the top and work our way down, shall we? And since the proposed RSS 1.0 builds on the foundation of RSS 0.9, we'll start by building a 0.9 document and then cover the few basic mechanical changes necessary to bring it into compliance with 1.0; if you're already familiar with RSS 0.9, feel free to breeze through the first part of the tutorial.

While XML documents are not required to begin with an XML declaration, it is generally good practice to do so. The declaration says "This is an XML document" and specifies the version thereof -- the current version of XML itself is 1.0.

Now the XML declaration does also afford you the opportunity to specify your preferred encoding type -- the way you'll be dealing with special characters. Unless specified otherwise, RSS 1.0 assumes UTF-8; let's go ahead and add it for pedantic/illustrative purposes. So the first line of our document (make sure it's the first line!) looks a lot like this:

<?xml version="1.0" encoding="utf-8"?>
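For the programmers at heart, here is a minimal sketch of why the declaration really does have to come first. It leans on Python's standard xml.etree.ElementTree module (my pick for illustration; nothing about RSS requires it), and both the <doc/> element and the parses() helper are made-up stand-ins rather than anything from RSS.

```python
import xml.etree.ElementTree as ET

# A stand-in document: the XML declaration followed by a placeholder element.
GOOD = b'<?xml version="1.0" encoding="utf-8"?>\n<doc/>'

# The same document with a blank line in front of the declaration.
BAD = b"\n" + GOOD


def parses(data):
    """Return True if the bytes parse as well-formed XML, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


print(parses(GOOD))  # declaration on the very first line
print(parses(BAD))   # declaration pushed down by a blank line
```

The first document should parse and the second should be rejected, since a parser only accepts the declaration at the very start of the document.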

(By the way, I'll be calling out changes in our evolving document as we go along by highlighting new bits in orange.)

Every XML structure can have one and only one outer container -- the "root element." RSS 1.0's root element is borrowed from the earlier 0.9 version. The root element also affords us the opportunity to declare the namespaces we'll be using in our document.
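If you'd like to see the "one and only one outer container" rule enforced, the following sketch hands two tiny documents to Python's standard ElementTree parser. The <outer> and <item> names and the well_formed() helper are placeholders of my own, not RSS elements.

```python
import xml.etree.ElementTree as ET


def well_formed(data):
    """Return True if the bytes parse as a single well-formed XML document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# Two items wrapped in a single outer container.
ONE_ROOT = b"<outer><item/><item/></outer>"

# The same two items with nothing wrapped around them.
TWO_ROOTS = b"<item/><item/>"

print(well_formed(ONE_ROOT))
print(well_formed(TWO_ROOTS))
```

Only the wrapped version should come back as well formed; the parser treats anything after the first top-level element as an error.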

Let's take a pit stop and see what we mean by namespaces.

In my sphere, there exist two Tims, two Jons, and a number of Daves (or variations thereof). To avoid confusion (never mind embarrassment), I have to be sure to clarify which Tim or Jon or Dave I'm referring to. Thank goodness they all have different last names, making Tim O'Reilly distinct from Tim Berners-Lee.

Now, since XML elements and attributes don't have last names, it can be difficult to differentiate between <title> as in the title of a Web page and <title> as in the title of a book. The distinction, using XML namespaces, may be expressed so:

html:title
book:title

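Now these namespace prefixes (the bit before the colon) are not particularly useful if you don't have a decent definition for what html and book are. Each is, therefore, associated with a URL through a declaration of the form xmlns:prefix="URL"; the book prefix, for instance, would be tied to http://www.oreilly.com/book.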

This scheme effectively identifies the former as "title as defined by the HTML 4.0 specification" and the latter as "title as in O'Reilly book." URLs are used because they're a convenient way for everyone to invent unique names under their own control. The URLs don't have to point to anything useful, but it's nice if they do (documentation, for instance).
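To watch the prefix-to-URL association at work, here is a small sketch, again using Python's standard ElementTree. The book URL is the one used in this article; the HTML URL, the <library> wrapper element, and the title text are assumptions of mine, picked purely for illustration.

```python
import xml.etree.ElementTree as ET

# Both prefixes are bound to URLs on the outer element. The book URL comes
# from the article; the HTML URL and the <library> wrapper are assumptions.
DOC = b"""<library
  xmlns:html="http://www.w3.org/TR/REC-html40"
  xmlns:book="http://www.oreilly.com/book">
  <html:title>A Web page title</html:title>
  <book:title>A book title</book:title>
</library>"""

root = ET.fromstring(DOC)

# ElementTree reports each element as {namespace URL}local-name, so the two
# titles stay distinct even though both are called "title".
tags = [child.tag for child in root]
for tag in tags:
    print(tag)
```

The parser swaps each prefix for the URL it stands for, which is why the prefix itself can be anything you like so long as the URL behind it is the right one.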

Now, since I work for O'Reilly & Associates, a book company, it's fair to assume that when I say the word "title" in the office I'm referring to a book title. When talking about an HTML document title, I would always qualify it by saying, well, "Web page title" or the like. So my "default namespace," then, in the book world, is declared in XML like so:

xmlns="http://www.oreilly.com/book"

You'll notice the lack of a prefix associated with the http://www.oreilly.com/book realm, allowing me to refer to the two types of titles as:

title
html:title

Mind you, the namespace doesn't refer to the "title" itself, but to the vocabulary which defines it. This rather simplistic example should hopefully provide enough on namespaces to get you going; for more information, be sure to visit the Namespaces in XML W3C recommendation and Tim Bray's "XML Namespaces by Example."
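The default-namespace idea can be sketched the same way. Here the book vocabulary is declared without a prefix, so a bare title lands in the book namespace while html:title keeps its own. As before, the HTML URL, the <catalog> wrapper, and the title text are my own assumptions for the sake of a runnable example.

```python
import xml.etree.ElementTree as ET

# The book vocabulary is the default namespace (no prefix); HTML keeps its
# prefix. The HTML URL and the <catalog> wrapper are assumptions.
DOC = b"""<catalog
  xmlns="http://www.oreilly.com/book"
  xmlns:html="http://www.w3.org/TR/REC-html40">
  <title>A book title</title>
  <html:title>A Web page title</html:title>
</catalog>"""

root = ET.fromstring(DOC)

BOOK = "{http://www.oreilly.com/book}"
HTML = "{http://www.w3.org/TR/REC-html40}"

# Unprefixed elements, the wrapper included, fall into the default namespace.
book_title = root.find(BOOK + "title")
html_title = root.find(HTML + "title")

print(root.tag)
print(book_title.text)
print(html_title.text)
```

Note that the unprefixed wrapper element picks up the default namespace too, which is exactly the point: whatever carries no prefix belongs to the default vocabulary.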

We'll add the default namespace for RSS 0.9 and one for RDF itself to our outer rdf:RDF element and drop the opening and closing tags into our document: