RSS Workshop

Publish and Syndicate Your News to the Web

"If you build it and have great content,
they will come"

Tutorial URL: http://rssgov.com/rssworkshop.html

Workshop Description

In this workshop you'll learn how to create, validate, parse, publish,
and syndicate your own RSS news channel. The emphasis will be the practical
application of the two most popular varieties of RSS for dynamic publishing.

You can use RSS channels to allow customers to keep up on industry-specific
news, check weather, look for jobs, view upcoming concerts or university
lectures, monitor specific websites, and much more. Some examples of the
varieties of applications that government agencies and others have created:

This workshop will also teach you how others can incorporate your news into
their pages automatically. The workshop will showcase the use of tools that
are readily available to you.


What is RSS?

RSS is a protocol, an application of XML, that provides an open method
of syndicating and aggregating Web content. Using RSS files, you can create
a data feed that supplies headlines, links, and article summaries from
your Web site. Users can have constantly updated content from web sites
delivered to them via a news aggregator, a piece of software specifically
tailored to receive these types of feeds. RSS is the hottest thing in
Web communication. It powers many popular applications such as weblogs,
knowledge management networks, and news syndication.

Weblogging, a term coined by Jorn
Barger in December 1997, is one of the most popular and fast growing
applications of RSS. A blog is someone's personal dated 'log' frequently
updated with new information about a particular subject or range of subjects.

RSS is changing the world of publishing news and searching for news.
Here's how Chad Dickerson, the Chief Technology Officer at InfoWorld,
describes the change:

"I've spent a lot of time in conference rooms
patiently explaining RSS to business folks here at InfoWorld and to
developers at some other IDG publications, always emphasizing how simple
it is: you can do it and here's how. ...It was only recently (April
28, 2003, to be exact) that I felt like RSS had completely turned the
corner at InfoWorld when the various constituencies at InfoWorld (sales,
editorial, and technology) agreed to include links to our feeds prominently
on our home page. Getting something on the home page is recognition
that something has been politically mainstreamed within an organization.
So, what's my point? RSS is now burned into InfoWorld's organizational
brain like the words Kleenex and Xerox. Names can change, specs can
change, people involved can change, and any number of other things can
change, but in the end, if any of the business folks ask me.... I'll
just say: don't worry about the name, it's really just RSS...." Source.

RSS 0.91 (Rich Site Summary)

Netscape released version 0.91 in July 1999. It has since been upgraded
by Dave Winer of Userland to 0.92, and in August 2002 to 0.94 and 2.0.
The latest version is the first to support extensibility with optional
namespaces; its first module is blogChannel.
For more detail see Ben Hammersley's Content Syndication with RSS,
chapter 4 (pdf) and chapter 8, and Sam Ruby's RSS in Depth with Quick
Summary.

RSS 1.0 (RDF Site Summary)

RDF, or Resource Description Framework, provides an XML structure for
describing document metadata content. The RSS-DEV Working Group created
RSS version 1.0 (official specification) in December 2000. It supports
RDF, thus allowing the description and syndication of site content and
metadata. It differs from the earlier version because of its extensibility
via modules based on XML-Namespace technology. This lets content providers
use it in their own documents, plugging functionality into a basic
syndication platform, saving time and effort, and ensuring compatibility.
RSS 1.0 documents can draw upon any RDF-compatible extension syntaxes,
called modules. Current standardized modules exist for Dublin Core,
Syndication, Content, and Annotation.
For more detail see Ben Hammersley's Content Syndication with RSS,
chapter 6 and Sam Ruby's RSS in Depth with Quick Summary.


What Does an RSS File Look Like?

It's a good idea to learn how an RSS file is structured before you begin
creating them. RSS files are written in XML [this orange
logo is used to represent a link or pointer to the syndicated form of
a news feed].

Notice the differences. The 1.0 version feed is wrapped in the RDF namespace
declaration, <rdf:RDF> ... </rdf:RDF>, has a table of contents
in the channel block (used by aggregators), and employs Dublin Core metadata
fields.
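
As a rough illustration of those differences, the sketch below parses two
tiny feeds with Python's standard library. The feed contents, titles, and
example.com URLs are made up for this example; only the element names and
the RDF, RSS 1.0, and Dublin Core namespace URIs come from the formats
themselves.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 0.91 feed: items are nested inside <channel>.
RSS_091 = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Example News</title>
    <link>http://www.example.com/</link>
    <description>Sample headlines</description>
    <language>en-us</language>
    <item>
      <title>First story</title>
      <link>http://www.example.com/story1.html</link>
      <description>Summary of the first story.</description>
    </item>
  </channel>
</rss>"""

# Hypothetical RSS 1.0 feed: wrapped in <rdf:RDF>, with a table of
# contents (<items>) in the channel and Dublin Core (dc:) fields.
RSS_10 = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://www.example.com/news.rdf">
    <title>Example News</title>
    <link>http://www.example.com/</link>
    <description>Sample headlines</description>
    <dc:language>en-us</dc:language>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://www.example.com/story1.html"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://www.example.com/story1.html">
    <title>First story</title>
    <link>http://www.example.com/story1.html</link>
    <description>Summary of the first story.</description>
    <dc:date>2003-06-01</dc:date>
  </item>
</rdf:RDF>"""

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS1 = "http://purl.org/rss/1.0/"

old = ET.fromstring(RSS_091)
new = ET.fromstring(RSS_10)

old_root = old.tag                            # plain <rss> root
new_root = new.tag                            # namespaced rdf:RDF root
old_items = old.findall("channel/item")       # 0.91: items inside channel
new_items = new.findall(f"{{{RSS1}}}item")    # 1.0: items beside channel
# The 1.0 table of contents lists each item's URL in the channel block.
toc = [li.get(f"{{{RDF}}}resource") for li in new.iter(f"{{{RDF}}}li")]

print(old_root, len(old_items))
print(new_root, len(new_items), toc)
```

Note that ElementTree reports namespaced tags in {namespace-URI}name form,
which is why the 1.0 lookups spell out the namespace.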

RSS 1.0 conforms to the W3C RDF specification. RDF is a model for
describing metadata about resources: a collection of web sites, a single
web site, parts of web pages, a specific HTML or XML element, documents,
printed books, recipes, etc. A property is an attribute or characteristic
used to describe a resource and is specified in the RDF Schema specification.
The resource together with a named property plus the value of that property
is an RDF statement. All statements are enclosed in an RDF
element which has a namespace prefix pointing to the RDF syntax specification.

The RSS schema calls for each channel to have a title, description, link
to the channel on your Web site, and then item elements. You can include
optional information about the channel: language, PICS rating, copyright
statement, pubDate or publication date of the channel, lastBuildDate showing
the date it was last updated, docs, link to the managingEditor, link to
the webMaster, an image such as a logo, textinput strings, skipHours telling
automated aggregators when not to collect RSS data, and skipDays telling
aggregators days that data should not be collected.
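
The channel elements listed above can be assembled programmatically rather
than typed by hand. What follows is only a minimal sketch using Python's
standard library: the function name build_channel, the sample headline, and
the example.com URLs are hypothetical, and only a few of the optional
elements are shown.

```python
import xml.etree.ElementTree as ET

def build_channel(title, link, description, items, language="en-us"):
    """Return an RSS 0.91 document (as text) for the given channel data."""
    rss = ET.Element("rss", version="0.91")
    channel = ET.SubElement(rss, "channel")
    # Channel-level elements: title, link, description, plus language.
    for tag, text in (("title", title), ("link", link),
                      ("description", description), ("language", language)):
        ET.SubElement(channel, tag).text = text
    # One <item> per news story; the description is optional.
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        if item.get("description"):
            ET.SubElement(node, "description").text = item["description"]
    # The serializer escapes &, < and > in text, so the output stays
    # well-formed. No XML declaration is emitted with encoding="unicode".
    return ET.tostring(rss, encoding="unicode")

feed = build_channel(
    "Documents in the News",
    "http://www.example.com/docs-news.html",
    "Government documents behind current headlines",
    [{"title": "Budget & appropriations report",
      "link": "http://www.example.com/budget.html",
      "description": "Text with <b>markup</b> in it"}],
)
print(feed)
```

Letting a library serialize the XML avoids the most common hand-editing
mistakes, such as unescaped ampersands and unclosed tags.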

News stories in your RSS feed are items defined by the <item> tag,
usually containing a headline <title>, the URL of the story
<link>, and an optional <description>. RDF feeds were traditionally
limited to 15 items, but that limitation is largely a matter of convention.
Syndic8's Headlines
Per Day shows that some feeds can number in the hundreds, but keep
them reasonable so they readily load into feed readers.

Don't put HTML markup code within a headline. If HTML is necessary in a
description, such as for a link, the characters &, ", <, and >
will need to be encoded as &amp;, &quot;, &lt;, and &gt;. The RSS
2.0 specification specifically allows encoded HTML in descriptions.
Ben Hammersley argues against putting HTML coding in description for
two reasons. It requires that client software have the ability to parse
it, and combining presentation markup with the content diminishes
the ability of RSS to provide indexable metadata.
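
The encoding described above does not have to be done by hand. As a small
sketch, Python's standard html module performs the same substitutions; the
sample link text and example.com URL are placeholders.

```python
from html import escape, unescape

# A description containing an HTML link (placeholder URL).
raw = 'See <a href="http://www.example.com/?a=1&b=2">the report</a>'

# escape() encodes &, <, > and, by default, quotation marks.
encoded = escape(raw)
print(encoded)

# A reader that supports HTML descriptions reverses the encoding.
decoded = unescape(encoded)
print(decoded == raw)
```

If a feed is written with an XML library that already escapes text for you,
apply this step only where the library does not, so the content is not
encoded twice.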

Find a Viewer to View RSS Channels

To get acquainted with RSS, we'll first view an RSS channel through an
RSS reader or viewer. In this workshop, we'll view ResearchBuzz (0.91)
and Perl News (1.0)
using an online viewer and a client-side news aggregator. Try this exercise:

Click "view" to see the HTML presentation. Click "edit" and then "save"
or "source" to view the XML.
Optional: For comparison, view Perl News using the Redland
RSS 1.0 Viewer. Check "yes" to format the results in
a simple box.

Now we'll find these same two feeds in Syndic8, the largest RSS
syndicator. Notice the tabs at the bottom of the page. Click "headlines"
to read the current feed. Click the "integration" tab to
access files prepared for several other alternative readers including
NewsIsFree, Headline Viewer, Fyuze, Radio Userland, Snewp, Amphetadesk,
and BottomFeeder.

Let's take a look at one of these client-side readers that you may
wish to install on your computer:

Andrey Tumashinov's NewzCrawler ($25). This Windows reader has many
features and is easy to
set up. It synthesizes RSS news feeds, NNTP newsgroups, and news web
sites. It can write as well as read news.

Create your own channel and news headlines

You can create an RSS file using any text editor, but you will almost
certainly introduce syntax errors. You may find it preferable to create
channels using one of the editors listed below and then maintain them using
a text editor. This isn't a bad practice if the channel is for something
ephemeral. If the channel items are not archived and the channel isn't
integrally associated with a particular site, editing the channel with a
text editor is easy and fast. Otherwise, you'll probably prefer to use a
content management system (CMS) tool to maintain channel data.

In this workshop, we are going to construct a simple RSS channel. We're
going to start with "Documents in the News," a news page constructed
in standard HTML by Government Documents Librarian Peter Kraus at the
University of Utah Marriott Library. The "before"
page is the page as it existed on September 24, 2002, and "current"
is the page as it currently exists.

When you're done with today's workshop, you will have created an RSS channel
(either in 0.9 or 1.0) and you will have published the same
page with the information now dynamically appearing as an RSS channel
(style sheet optional).

We'll create a channel for "Documents in the News" by using
the first two tools listed here:

WebReference's RSS
Channel Editor - like UKOLN's, this editor is limited to 15 items,
but this online form will generate valid RSS 0.91 feeds. Build a new
channel or "fetch" an existing one, click "build RSS,"
and then save the resulting file. A similar web-based generator that
can be used to construct RSS 0.91 feeds is Andy Holt's RSS
Headline Generator.

If you're interested in more than an individual channel, then you're better
off using a blogger or content management product that uses RSS. Most support
automatic archiving to store old posts, permalinks (to link to them),
and date headers and time stamps that record when new headlines were posted.
Some provide hosting for your site for free or for a low cost. Some require
a client-side download; some are managed entirely through a web interface.
[Suggestion: include a "generator" comment line such as <!--
generator="Movable Type/2.51" --> on the line following the
XML declaration. This allows Syndic8 and others to track the usage of
these tools.]

Here is an incomplete list that links to several of the more popular
alternatives.

Blogger Products
is a server-side product line, including subscription services, for
creating RSS 0.91 feeds. Blogger Pro has spellcheck, image uploading
and team blogging. Blogger is a creation of Evan Williams and Meg Hourihan
of Pyra and is generally credited for first popularizing blogging. Pyra
was recently purchased by Google.

Radio Userland - Dave Winer's
easy-to-use blog host with client-side CMS product for Windows and Mac
with facilities for creating, reading, and archiving RSS feeds for a
single blog. It can be made collaborative by use of the Multi-Author
Weblog Tool. It is designed to be used from one main location, though
updates can be made remotely if you have remote access to your workstation.
Manila is a server-side product that supports building a community
of blogs and Frontier is the overall CMS.

Movable Type by Benjamin
and Mena G. Trott -- popular and feature-rich, extensible, server-based
Perl CMS, free for personal or non-profit use; $150 for government agencies.
Commercial Pro version due out in summer. Easy to use for maintaining
regularly-updated news or journal sites, like weblogs. Supports XML-RPC
and creates custom 0.92 and 2.0 (.xml) and 1.0 (.rdf) RSS channels.
Supports the creation of "collaborative weblogs", i.e. multiple
authors and visitors can contribute postings and/or make comments. Has
language packs that translate the display into other languages such
as Spanish. Imports Radio Userland created channels and archives, but
comments are lost. Must have server access. Example of a subscription
list. Utilizes "trackback
pings" enabling you to see all sites that have referenced a
post on your site and to read related posts (trackback
example) enabling distributed communications. See admin interface
screenshots.

Typepad - Ben and Mena Trott's centrally
hosted commercial weblogging service. It's like using Movable Type only
easier and with some unique features.

NewzCrawler - this client-side
news reader also creates outgoing channels in RSS 0.92 to which you
can add your news using its news composer (template-based news feed
output allows you to create any output text format). If you have a weblog,
the program also has a Blog Client allowing you to post news via the
blogger XML-RPC API.

Onclave - a free web-based system
in beta testing by Drew Peloso, Dave Reid, Steve Hatch, and Per Kreipke
for creating and managing multiple online collaborative weblogs, managing
the information with directory-like taxonomies. Users can put information
into appropriate categories, or create new topics as needed. You can
create RSS 0.91 channels by creating a personal or collaborative
(cblog) onclave, adding a weblog channel, and syndicating as RSS.
There is no software to download. Publishing is done by dragging the Share-it!
editor to your browser toolbar. When you see an article you want to
share with others, click Share-it, enter the item title and description,
and click publish. Click the syndicate button to create the RSS XML
file or the simple javascript code to parse and display the channel
to your site. Visitors can subscribe by email to your syndicated channel
by clicking the notify button, and you can post entries by email. gilsUtah
example - receive channel by email.

WebCrimson - content management
tool by John Hiler's company for free, easy-to-use online browser creation
of a "weblog", adding a weblog to an existing site, creating a WebZine
on your site or hosted on theirs, or contributing "articles"
using a browser editor. Includes full FTP, templates, WYSIWYG editor,
permissions and admin control for group contributions, and optional hosting
on their servers.

Blosxom - a free
lightweight, yet feature-laden weblog creator by Rael
Dornfest; in Perl (runs on any OS, but built to take advantage of
the Mac OS X). Possibly the most useful "61 lines of code"
on the Internet. A simple Perl plug-in
by DJ Adams allows you to post entries by email. Pyblosxom
is a Python version (Abe Fettig's example).

Serence Corporation's KlipFarm
- register as a Klip
Provider to make channels. The service also has an alert service,
an XML-based Windows desktop application that grabs streaming RSS feeds,
and Klip
Folio, a free Windows client-side viewer that you can download.

Tom Dyson's Mailbucket
- some aggregators allow you to post entries to a blog via email. Mailbucket
is an experimental email-to-RSS service that allows you to create a
public RSS 1.0 feed and post to it by email. Perhaps useful for creating
mailing list feeds.

Some popular open source, server-side, portal content management systems
such as PostNuke (php), PHP-Nuke (php), phpWebLog (php), SlashDot
(Perl), Squishdot (Zope), Rusty Foster's Scoop (Perl), Roller (Java),
and Drupal (php) can also create
and display RSS feeds using their built-in news aggregators. They are
very popular for community sites. Scott Johnson has a tutorial
for creating feeds using Drupal. Both Drupal and Scoop can support multiple
blogs, and Drupal has a module for customized news aggregation. Here
are two of my favorite community implementations:

Machine Created RSS Channels

Content created in HTML by yourself or others can be converted to RSS
feeds by means of "scraping". Scrapers try to identify "headlines"
in a page and they create feeds that are OK but not of the quality of
human produced feeds. You can use one of the free online scraping services
or download their source code to your own server and run a scraping service
of your own.

You can assist these scraping programs by putting special span tags around
content that you want syndicated. The text inside the tags is pulled out
and put into an RSS file.

Put <span class="rss:item"> ... </span> tags around
any items on a page that you want to syndicate by RSS (such as a list
of links or events).
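
To make the idea concrete, here is a small sketch of how a scraper might
pull the marked spans out of a page. It is not the code of any of the
services listed below; the class name RssItemSpans, the sample page, and the
example.com URLs are all hypothetical.

```python
from html.parser import HTMLParser

class RssItemSpans(HTMLParser):
    """Collect the text and first link inside each <span class="rss:item">."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._current = None   # item being collected, or None
        self._depth = 0        # span nesting depth inside a marked span

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span":
            if self._current is not None:
                self._depth += 1               # nested span inside an item
            elif attrs.get("class") == "rss:item":
                self._current = {"title": "", "link": None}
                self._depth = 1
        elif (tag == "a" and self._current is not None
              and self._current["link"] is None):
            self._current["link"] = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            self._depth -= 1
            if self._depth == 0:               # marked span has closed
                title = " ".join(self._current["title"].split())
                self._current["title"] = title
                self.items.append(self._current)
                self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

PAGE = """<ul>
<li><span class="rss:item"><a href="http://www.example.com/a.html">Library hours change</a></span></li>
<li><a href="http://www.example.com/skip.html">Not syndicated</a></li>
<li><span class="rss:item"><a href="http://www.example.com/b.html">New <span>census</span> data</a></span></li>
</ul>"""

parser = RssItemSpans()
parser.feed(PAGE)
parser.close()
items = parser.items
for entry in items:
    print(entry["title"], entry["link"])
```

Each collected title and link pair could then be written out as an <item>
in the generated channel.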

Then submit the URL to:

Ian Davis' myRSS is a Python
program that converts web data into XHTML RSS channels. Enter
a web page URL in the Create a
Channel form. Channels are created in RSS 0.91, RSS 1.0, and
in javascript include formats, and are updated daily. myRSS delivers,
using a heuristic algorithm, the last 15 headlines or hyperlinked
resources that have been added to the page. This could be a very useful
tool for syndicating a site's what's-new page, agency news, new acquisition
lists, donations received, and the like. A myRSS
style sheet can be associated with a feed if you wish to display
a myRSS channel on a page. The service uses item link redirects on
the myRSS server that in turn link to the referred-to resources. Anyone
can "sponsor" a channel for a $10 annual fee and have the
redirect links removed. Anyone can pay $25 per year to have a channel
updated hourly instead of daily. Noteworthy channels created in this
manner are reviewed, and if acceptable, are annotated and are included
in the DMOZ Open Directory project and in Syndic8.com's database.
You can download the entire myRSS
Channel Catalog in OCS, an xml format for describing channel catalogs.
MyRSS turned this workshop into these two RSS 1.0
channels; new items posted to the State of Utah homepage are in this
channel.

Syndic8 Syndicate Your Page - an online form to generate the feed and
list it with the Syndic8 aggregator. Want to syndicate content created
by someone else? Use the Syndic8 "suggest"
service to create channels using myRSS.

eVictor's rssDistiller
- a commercial tool for Radio Userland that extracts RSS feeds from most
HTML pages and allows you to join the results of several filters into
a single feed. For example, Bruce Loebrich has used this tool to create
RSS channels from Google News and Columbia University Newsblaster
syndications.

Validate Your RSS

As a channel editor, it is your responsibility to ensure that your file
can be parsed by the XML parser of any subscribing site. Your RSS creation
software should validate XML at the time of creation, but some do not.
Minor errors can make the feed unreadable. You may wish to load your RSS
file to your server and then enter the URL in one of the following validators
to check the syntax.

Feed Validator (formerly RSS
Validator) by Mark Pilgrim and Sam Ruby. Version 1.2.3 now validates
multiple syndication formats: RSS 1.0,
RSS 0.9x/2.0,
and the new Pie/Echo/Atom
0.2 feeds. It includes validation for common namespaces. If you
have access to a server with a Python distribution, you can download
the open source code and follow the instructions
to install it locally.

Dave Beckett's Redland
RSS 1.0 Viewer. RSS to HTML converter, but also acts as an RSS 1.0
validator. Viewer offers several parsing methods. Links to a great collection
of feeds about RSS.
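
Before submitting a feed to one of the validators above, a quick local check
can catch the most common problem, XML that is not well-formed. The sketch
below is not a substitute for those validators: the function name check_feed
is hypothetical, and it only confirms that the file parses and that an RSS
0.9x/2.0 style channel has a title, link, and description.

```python
import xml.etree.ElementTree as ET

def check_feed(xml_text):
    """Return (ok, message) after a basic well-formedness check."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Typical causes: a bare & in a headline, or an unclosed tag.
        return False, f"XML error: {err}"
    channel = root.find("channel")
    if root.tag != "rss" or channel is None:
        return False, "expected an <rss> root with a <channel> child"
    missing = [tag for tag in ("title", "link", "description")
               if channel.find(tag) is None]
    if missing:
        return False, "channel is missing: " + ", ".join(missing)
    return True, "ok"

GOOD = """<rss version="0.91"><channel>
<title>Fish &amp; Wildlife News</title>
<link>http://www.example.com/</link>
<description>Sample channel</description>
</channel></rss>"""

# Same feed with an unescaped ampersand, a minor error that makes
# the whole file unreadable to an XML parser.
BAD = GOOD.replace("&amp;", "&")

good_result = check_feed(GOOD)
bad_result = check_feed(BAD)
print(good_result)
print(bad_result)
```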

Parse and Display Your RSS Channel on Your Site

Since RSS files are written in XML, you cannot readily display them in
a page without parsing them for the information you want to show. Content
management and blogging systems like Radio Userland, Movable Type, and
Drupal automatically parse and publish this to your site without any extra
effort on your part.

If you are using a text editor to create channels, or if you want to
display your RSS content on another site such as one created with DreamWeaver
or Frontpage, you'll need to insert an "include" that calls
upon an external parsing program.

These parsing programs coupled with stylesheets allow you to display
this content the way that you want it to appear. You can choose the fonts
and colors, the number of channel items to display, whether or not to
show the headline summaries or descriptions, and whether or not to show
the time/date stamps. Parsers will allow you to display multiple channels,
if you so desire, all on the same page.
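
As a rough idea of what such a parsing program does, the sketch below turns
an RSS 0.9x/2.0 style channel into an HTML fragment. It is not the code of
any parser mentioned in this workshop; the function name render_channel, the
rss-box class name, and the sample feed are hypothetical, and RSS 1.0 feeds
would need namespace-aware lookups.

```python
import xml.etree.ElementTree as ET
from html import escape

def render_channel(xml_text, max_items=5, show_descriptions=True):
    """Return an HTML list of headlines from an RSS 0.9x/2.0 channel."""
    channel = ET.fromstring(xml_text).find("channel")
    heading = escape(channel.findtext("title", ""))
    lines = [f'<div class="rss-box"><h3>{heading}</h3>', "<ul>"]
    # Limit the number of channel items displayed.
    for item in channel.findall("item")[:max_items]:
        title = escape(item.findtext("title", ""))
        link = escape(item.findtext("link", ""))
        line = f'<li><a href="{link}">{title}</a>'
        description = item.findtext("description")
        if show_descriptions and description:
            line += f"<br>{escape(description)}"
        lines.append(line + "</li>")
    lines.append("</ul></div>")
    return "\n".join(lines)

FEED = """<rss version="0.91"><channel>
<title>Documents in the News</title>
<link>http://www.example.com/</link>
<description>Sample channel</description>
<item><title>Fish &amp; Wildlife report</title><link>http://www.example.com/1.html</link><description>First summary</description></item>
<item><title>Second</title><link>http://www.example.com/2.html</link></item>
<item><title>Third</title><link>http://www.example.com/3.html</link></item>
</channel></rss>"""

html_out = render_channel(FEED, max_items=2)
print(html_out)
```

Fonts and colors are left to a stylesheet attached to the wrapper class,
which keeps the parser itself small.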

There are many solutions to choose from. In many cases all you need
to do is to put a single-line javascript into a page, connecting the location
of the RSS channel to the location of the parser. If the feed already
exists as a javascript (.js file), you just need to call it. For example,
to display the LockerGnome daily tip just insert:

You have two choices. You can either install one of these parsing programs
on your own server, or you can point the "include" to a script
residing on someone else's server. Let's test the waters first by pointing
to a parser on another server. Here are a few options:

Adam Curry's RSS-Box Viewer
[mirror] - a service in beta
that parses all versions of RSS. Use the online form to select table size,
fonts, colors, maximum item numbers, and whether to display compact or
expanded. It then creates the javascript with embedded variables. Supported
stylesheet classes are: .rssBoxTitle, .rssBoxContent, .rssBoxItemTitle
and .rssBoxItemContent. Write to Adam
for the full code
if you want to host it locally (written in Rebol, an Internet messaging
language). The script to insert, for our "Documents in the News"
example, looks like this:
<script language="javascript" src="http://publish.curry.com/rss/rss-box.r?url=http://gils.utah.gov/secure/rss/kraus91.rss&align=left&width=350&frameColor=white&titleBarColor=%23ffffff&titleBarTextColor=cc0000&boxFillColor=white&textColor=black&fontFace=Times New Roman&maxItems=7&compact=&xmlButton=&javascript=true"></script>

"Documents in the News" demo - stylesheet
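
If you generate a script include like the one above from your own code
rather than from the online form, the option values should be URL-encoded
(note the %23 standing in for # in the color value). The following is only
a sketch: the parameter names are copied from the example above, the choice
of values is illustrative, and it assumes the viewer decodes standard
URL-encoded query strings.

```python
from html import escape
from urllib.parse import urlencode

VIEWER = "http://publish.curry.com/rss/rss-box.r"

# Display options, using parameter names taken from the example above.
options = {
    "url": "http://gils.utah.gov/secure/rss/kraus91.rss",
    "align": "left",
    "width": 350,
    "frameColor": "white",
    "titleBarColor": "#ffffff",
    "fontFace": "Times New Roman",
    "maxItems": 7,
    "javascript": "true",
}

# urlencode() percent-encodes reserved characters such as # and /
# and turns spaces into + signs.
src = VIEWER + "?" + urlencode(options)

# Inside an HTML attribute, the & separators are written as &amp;.
tag = f'<script language="javascript" src="{escape(src)}"></script>'
print(tag)
```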

RSSxpress-Lite
by Andy Powell and Pete Cliff of UKOLN. Click "try it" and
"Select a channel," then
Enter the URL of the channel that you just created and loaded to your
server. It produces a line of javascript that you just cut and paste.
You can use style sheets to customize the colors, fonts, and display.
Support documentation.
For example:

RSSViewer is the Utah State Library's modification (by permission) of
Pete Cliff's simplified RSSxpress-Lite. This Perl script uses customizable
stylesheets. Contact Ray Matthews
to be added to the user list or to use the script on your own agency's
server.

After you have tried this, you'll probably want to load a script to your
own server so that you have more customization options and assurances
of reliability. Most are open source and free to download and use, and
they come in a wide variety of programming languages.

Our RSS
Parsing Programs page describes many of these options
with links to working examples, documentation, and the urls where you
can download them.

Allow other Sites to Publish Your RSS Channel

You'll want to let others read your channel and publish it to their site.
Syndication is the process of sharing content among sites. This allows
other websites and applications to include your updated headlines. Websites
should create an information page about syndicating their headlines.
One standard that was created by UserLand and is being used increasingly
is to include this
image somewhere on your site and link it to the RSS XML for that page.
To publish other feeds on your site, first check licensing agreements;
an example is that of MagPortal.com.

Another thing you can do is to allow people to subscribe to your site
by email. Onclave supports this as well as:

Bloglet - a super slick
XML-RPC, RSS-to-email conversion
subscription service by Monsur Hossain
for the blogs you create. Allows others to receive email subscriptions
of your blog and you can receive daily stats on your subscribers. Also
can be used to stay up-to-date on your favorite sites by subscribing to
any existing RSS feed. Add a subscription box to your site, for your own
or anyone else's channel. It creates the code to add. Use the FeedMe
toolbar icon to automatically subscribe to feeds that incorporate auto-discovery.
For example:

Register Your Channel with an RSS Aggregator

Before going public, proofread to make sure that your channel has a correct
URL link, descriptive title, and an informative and accurate description.
DMOZ weblog editor, Laura, has compiled some useful tips.

Next, submit your channel to aggregators just as you would submit your
website to search engines. You'll be amazed at the traffic it will generate.

An aggregator is a web site or system that collects RSS feeds from multiple
sources and then does something with them. Usually this will involve collating
and displaying the contents of each feed and perhaps creating new composite
feeds from them. Here are a few of the larger aggregators:

Syndic8 - the
largest aggregator with almost 10,000 feeds-- recommend your own or
another's; created by Jeff Barr

NewsIsFree - by
Mike Krus has headlines from a fast-growing collection of more than
3,600 feeds.

News4Sites - a commercial
aggregator that monitors 8,000 web pages from 2,500 domains producing
more than 25,000 news headlines organized into over 2,200 channels
of up to 20 headlines per channel. Feeds are generated in ten formats:
javascript, RSS, PHP, ASP, C#, VB.NET, WDDX, XML RPC, CDF, and PERL.
They can be parsed or delivered by email. The feeds are available for
free with advertisements. Suggest
a site for inclusion.

Backwash - a PHP driven community
site of independent columnists who recommend the best specific Internet
content. Bills itself "the ultimate recommendation engine."
Register and then submit.

Morton Frederickson's Syndication
Subscription Service consolidates the multiplicity of aggregation
services. A green subscribe icon
on a page will take you to a page (example)
that lets you choose the subscription link for the aggregator of your
choice. Clicking the icon there will add the linked RSS feed to your
subscription list.

Some of these sites provide categorized lists of their channels in OCS
(Open Content Syndication) and OPML
(Outline Processor Markup Language) formats. OCS is an XML format for
describing channel catalogs.


Use Auto-Discovery Aggregation

You can facilitate aggregators finding your channel by inserting a LINK
specification within the HTML <head> that tells the location of
your RSS feed. If your tool hasn't already implemented this, you can add
it with a simple one-line statement using the HTML link tag:

Each RSS feed that you have should have its own defined LINK tag with
its own unique title.
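
By the usual auto-discovery convention, each such LINK tag uses
rel="alternate" and type="application/rss+xml", with the feed address in
href. The sketch below shows two such tags in a sample page and how an
aggregator might find them; the class name FeedLinkFinder, the page, and
the example.com URLs are hypothetical.

```python
from html.parser import HTMLParser

class FeedLinkFinder(HTMLParser):
    """Collect (title, href) for each RSS auto-discovery LINK tag."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        mime = (attrs.get("type") or "").lower()
        if ("alternate" in rel and mime == "application/rss+xml"
                and attrs.get("href")):
            self.feeds.append((attrs.get("title") or "", attrs["href"]))

# Sample page: two feeds, each with its own LINK tag and unique title.
PAGE = """<html><head>
<title>Agency Home</title>
<link rel="stylesheet" type="text/css" href="/site.css">
<link rel="alternate" type="application/rss+xml" title="Agency News"
      href="http://www.example.com/news.rss">
<link rel="alternate" type="application/rss+xml" title="Agency Events"
      href="http://www.example.com/events.rss">
</head><body></body></html>"""

finder = FeedLinkFinder()
finder.feed(PAGE)
finder.close()
feeds = finder.feeds
for title, href in feeds:
    print(title, href)
```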

Mark Pilgrim's Auto-subscribe Bookmarklets. Drag the
link to your link toolbar and whenever you visit a site that you want
to add to your news aggregator of choice, click your "Subscribe"
button on your link toolbar, and it will try to find the site's RSS feed
and auto-subscribe you. Nifty!

Likewise, Onclave comes with a Share-It bookmarklet and the latest version
of Movable Type comes with a "MT It!" allowing you to use the
bookmarklet from your right-click menu.

More and more news aggregators and programs are supporting auto-discovery
including:

NewsIsFree - If you create
an account with this headline aggregator by Mike Krus, you can use
the web site as a cloud-based personal aggregator. Any of the feeds
it collects can be added to custom pages that you define. Their OCS
Service List is an XML of exportable RSS channels.

LiveJournal, a collaborative
open source free service for most
platforms allowing you to create an online journal from syndicated
feeds. Once you've created a journal, add /RSS to the end of the URL
to view it in RSS format.

BlogLinker.com is a unique
free tool from Zaeem Maqsood for
reciprocal linking and increasing traffic to your site. Register, insert
a javascript and add
links. If the website you link to is also registered, then your
site will automatically appear in their list of links.

Mark Pilgrim, of dive
into mark, has created autorss.py,
a Python script to find a site’s RSS feed and to search for a
referring site's RSS feed in auto-linkbacks. It can be used programmatically
(the autorss.getRSSLink function) or on the command line:

Technorati - enter the URL
of your page and it will show you the pages currently linked to you
ranked by "blog authority" and "freshness." For
a fee, create
a "watchlist" based on your own URL or search terms and
have Technorati send you a daily email or RSS feed.

Julian Bond's GNews2RSS
- a free online form (and PHP script that you can put on your own server)
that will perform a Google News search query and turn the
results into an RSS feed.

Phil Pearson's Blogging
Ecosystem - analyzes links between members of weblog communities.
It scans nearly 14,000 blogs and lists the 501 most popular blogs (by
back links) and the blogs that each listed blog in turn links to (forward
links). Use it to check blog popularity and to see who is linked to
whom. For example, at this writing, Phil Windley's Enterprise Computing
is the 220th most popular because 74 others link to it. Calculates nearest
neighbors, levels of linkedness, and degrees of separation between any
two blogs.

Chromatic, Aker, B., and Krieger, D. (2002). Running
Weblogs with Slash. Though Slash is no longer the platform
of choice for an RSS-driven community portal, this is a detailed illustration
of how RSS can be used to manage web content.