Rick Klau and Eric Enge talk RSS feeds

Rick Klau is Vice President of Publisher Services at FeedBurner, the market-leading feed management provider. He is responsible for cultivating and managing relationships with large media companies, entertainment networks, newspapers and a variety of other commercial publishers. Prior to FeedBurner, Rick held the same role at Socialtext, the first enterprise social software company. Previously, Rick was Vice President of Vertical Markets at Interface Software, now part of the Lexis-Nexis family where he held the position of company spokesperson. Rick has also held senior marketing positions at iManage through an IPO.

An accomplished public speaker, moderator, panelist and author of a popular blog, Rick has received extensive coverage in a variety of publications including The Wall Street Journal, USA Today, CIO, InfoWorld, Inc. Magazine, Internet World, The Washington Post and more. He has published a number of books and columns covering the topics of technology, law, ecommerce and online security. Rick earned AB Degrees in International Affairs and French from Lafayette College, a JD from the University of Richmond School of Law and a Baccalaureate Degree from l’Universite de Bourgogne.

Interview Transcript

Eric Enge: One of the things that was really intriguing in your SES NY presentation was the data on FeedBurner’s market share. I believe you said that FeedBurner has 350,000 customers, 600,000 feeds, and a reach of more than 2 million people per day. Can you update those numbers for me, and do you have any broader market data?

Rick Klau: Those numbers are dated already, particularly the number of people we reach each day. The slide I presented, I think, showed data for the end of 2006. We now know that, in terms of subscribers, we are reaching well over 60 million a day. The numbers are actually getting closer to 70 million even as we speak. In terms of the overall market, no one really has any idea. Technorati tells us that there are tens of millions of blogs out there; that’s certainly true. I would guess that there are tens of millions, or possibly hundreds of millions, of feeds out there. So, with us sitting at 721,074 feeds (as of May 21, 2007), we are a long way from 100% of the feed market. However, anecdotally, from talking with the major aggregators and so forth, we are pretty sure we have a very good representative sample of the most highly subscribed feeds. If you look at lists like the Technorati 100, for example, somewhere between 75% and 80% of them use FeedBurner. We think we have somewhere between one third and two thirds of the active blogs and the active podcasts, but we don’t really know. There is no objective way that we know of to measure that. We certainly know that we have a large set, and that we manage more than anyone else.

Eric Enge: It’s fascinating that tracking is so hard in this space at this point.

Rick Klau: It’s just like trying to figure out how many websites there are. That’s another number that no one knows at any given point in time. So, you end up applying some estimates. In this particular case, since we’ve been growing so quickly, we’ve really focused on ensuring that the pace of the growth and our momentum remain continuous.

Eric Enge: It seems that there is a feed for almost any topic. Whatever topic you have in mind, you can probably find an audience as well. It’s no longer the domain of the uber geek.

Rick Klau: That’s right. I think when anyone can say that they are reaching an audience of 50 plus million people a day, you are long past the point at which anyone could claim that that’s an early adopter crowd. When you look at the breadth of producers of content that are using FeedBurner, they do cover every possible topic. It’s Mommy Bloggers, it’s news publishers, it’s Podcasters. There really is a feed for every topic, and many of them are using FeedBurner to measure their audience and figure out which content is driving engagement. Applications like Google Reader, My Yahoo, and Internet Explorer 7 support feeds, as well as blogs. We’ve seen feeds really become mainstream. People are increasingly aware that they can subscribe to content.

Eric Enge: Right. So, if you have a very niche interest area that you want to write about, you probably could find an audience for a feed that had good content on that topic.

Rick Klau: Absolutely true.

Eric Enge: What I like about this is that it brings the web back to the notion of leveling the playing field. A small player that does a really good job with a feed can potentially develop a really nice audience.

Rick Klau: No question. Looking at my own blog, I have two very distinct audiences. I have about a thousand people who subscribe to my feed, and they read my content whenever I post something to my blog. Then, there are hundreds of people who are searching for something specific on a search engine and who find a page on my site that I may have written 5 years ago. Those are people who have never read my stuff before, and they may not ever come back, so that’s a very different audience. And so, you find that people are tailoring how they publish content, and what their expectations are, very differently. The feed, I think, is where the most loyal of your readers are going to be. Whether that’s true today or whether it’s going to take 6 to 12 months, I think that’s almost a foregone conclusion. The subscriber is the most likely to leave comments there and contribute content. The visitors from the search engines are often drive-by visitors: they come in, they look at a page, and they go away. They are far less loyal, and far less repetitive in their consumption of your content.

Eric Enge: Are you familiar with Google’s new AJAX Feed API?

Rick Klau: I am aware that it exists, but I have not spent any real time looking at it or understanding the implications.

Eric Enge: It provides a method for a much less technical programmer to extract and render data from a feed and publish it on their site. It makes it easy to programmatically extract things like the title, the URL, the snippet, the author, etc. It’s just a way of allowing you to get at that data much more simply.
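As a rough illustration of the kind of extraction Eric describes, here is a minimal Python sketch that pulls the title, URL, snippet and author out of an RSS 2.0 feed using only the standard library. It is not the Google AJAX Feed API (which is a JavaScript service); the sample feed, the function name `extract_items` and the dictionary keys are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written RSS 2.0 document standing in for a real feed (assumed data).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <link>http://example.com/</link>
    <description>Sample feed</description>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>A short snippet of the first post.</description>
      <author>editor@example.com (Example Author)</author>
    </item>
  </channel>
</rss>"""


def extract_items(rss_text):
    """Return one dict per <item> with its title, URL, snippet and author."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
            "snippet": item.findtext("description", default=""),
            "author": item.findtext("author", default=""),
        })
    return items


for entry in extract_items(SAMPLE_RSS):
    print(entry["title"], entry["url"])
```

A real feed may be Atom rather than RSS, or may use namespaced elements for the author, so a production reader would need to handle more cases than this sketch does.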

Rick Klau: Right. Another example of a service that’s built around making feed content more repurposable would be Yahoo’s Pipes. These are the things that are not explicitly targeted at publishers, which is who we typically work with, but are much more of a developer tool. I think we will see more of these in the next couple of years. The FeedBurner service is more designed to help publishers better understand how their content is consumed. That may include learning how developers are accessing their feed content. But, today we don’t really focus on helping publishers leverage those kinds of APIs.

Eric Enge: Right, okay. The other thing that’s really interesting is that there are tons of feed bots out there.

Rick Klau: Yes, there are thousands.

Eric Enge: How many of them are worth the trouble to pursue?

Rick Klau: Increasingly, I don’t think you have to pursue them at all. That’s the beauty of how the feed ecosystem works. If you do a couple of simple things, such as setting up autodiscovery and pinging properly within your site, automated systems can find your feed. Then they add your feed to their crawler, so that they keep checking it. It’s increasingly likely that just by publishing the content, your feed, and therefore your content, will be visible and discoverable by those downstream services without your having to do anything (once you have it properly set up). Publishers don’t have to work very hard at all to make their content discoverable. I can say for my own blog that I have never gone to a site and added my URL to a directory or to an index in the hopes that it would get indexed. Yet in the last ninety days, I think we’ve had hundreds of discrete aggregators consume my content and several hundred blogs consume my feed as well.
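To make the two setup steps concrete, here is a hedged Python sketch of what each one looks like from the outside. Autodiscovery is conventionally done with a `<link rel="alternate">` element in the page head, and pinging is commonly done with the `weblogUpdates.ping` XML-RPC call. The sample page, the blog name and the helper names `discover_feeds` and `build_ping_payload` are assumptions for illustration, and the sketch only builds the ping request body; it does not send it anywhere.

```python
from html.parser import HTMLParser
import xmlrpc.client

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

# Assumed sample page with one autodiscovery link and one unrelated link.
PAGE = """<html><head>
<title>Example Blog</title>
<link rel="alternate" type="application/rss+xml" title="RSS" href="http://example.com/feed.xml">
<link rel="stylesheet" href="/style.css">
</head><body></body></html>"""


class FeedLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that point at feeds."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        if "alternate" in rel and mime in FEED_TYPES and attributes.get("href"):
            self.feeds.append(attributes["href"])


def discover_feeds(html_text):
    """Return the feed URLs a crawler could autodiscover in this page."""
    finder = FeedLinkFinder()
    finder.feed(html_text)
    return finder.feeds


def build_ping_payload(site_name, site_url):
    """Build (but do not send) the XML-RPC body of a weblogUpdates.ping call."""
    return xmlrpc.client.dumps((site_name, site_url), methodname="weblogUpdates.ping")


print(discover_feeds(PAGE))
print(build_ping_payload("Example Blog", "http://example.com/"))
```

Most blogging platforms emit the autodiscovery link and send the ping automatically, which is why publishers rarely need to write anything like this themselves.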

Eric Enge: Right; yeah. So, if you get autodiscovery and pinging set up, you are pretty much off to the races.

Rick Klau: Absolutely.

Eric Enge: Have you seen problems with RSS feeds being indexed by the search engines?

Rick Klau: I have talked quite a bit with folks at Google and Yahoo about this. I don’t actually believe it’s a big problem. Publishers should want their feed indexed, because it’s often a very easy way for search engines to learn where the newest content is. Sitemaps can also solve that programmatically, but not everyone is producing a sitemap yet. Certainly the number of people producing feeds is much greater than the number of people who are producing a sitemap.

And, I think the search engines explicitly understand what a feed is. It’s very simple for them to recognize a feed during their crawl and they act on that by concluding that the feed is not a permanent place for the content, and therefore eliminating the duplicate content concern that you alluded to. It’s not to say that it’s an irrational concern, but in many cases, it’s not one that the publisher should be really concerned about. I think the search engines are increasingly very smart about understanding the difference between a feed and webpage, and understand the purpose of the feed as indicating where the content lives.

Eric Enge: It’s interesting, because at the SES Conference in New York, I heard a lot of people recommending that you Noindex your feed.

Rick Klau: Yes. I am not sure I agree, though I will say we’ve added support for that, so that if you as a publisher want to flag your feed for noindex, you can very easily do that. We’ve also ensured that both Yahoo and Google will support it, which they already do. From my own perspective as a publisher, not to mention as the guy who runs publisher services at FeedBurner, I can’t imagine many scenarios in which someone who is producing content for public consumption is not interested in having their content indexed. The fact that it’s indexed in a feed doesn’t mean they are going to get penalized because that content is also indexed from their site. Hopefully I can reassure people that feeds are helpful and beneficial, and are not potentially creating penalty situations.

Eric Enge: If somebody is doing a feed manually and they want to find out what the format is for a noindex tag, where can they go to find that?

Rick Klau: There are a couple of different cases where redirects come into play with regard to feeds. The first is very easy to talk about: publishers who have an existing feed and have chosen to use a service like ours to manage the feed itself, measure consumption, and get the size of the audience. We would generally recommend that you redirect requests for that feed to its FeedBurner equivalent, and we would recommend you use a 302, so that no downstream aggregator sees that redirection as a permanent one. If it did, the aggregator would update its pointers, stop looking at the URL on the publisher’s domain, and start going to the FeedBurner URL directly. We do have protections in place, should a publisher do that, and I’ll talk about those in a moment. But, as a best practice, if somebody is looking at doing this today, the recommendation would be to do a 302 redirect of your pre-existing feed URL and have that result in FeedBurner serving up the feed.

The protection that we’ve built in is that should anyone choose to leave FeedBurner at any time, all they need to do is delete that feed, and then we present them with an option to do 301 redirects of requests for their FeedBurner feed back to the publisher’s source feed. So, if someone deletes a feed from FeedBurner today, we will for a period of time do a 301 redirect back to that publisher’s URL. Therefore, anyone subscribed to the FeedBurner version of the feed will get an update pointing to the publisher’s URL directly.
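The difference between the two status codes can be sketched from the aggregator’s side. The following Python function is an assumption about how a well-behaved feed reader treats redirects (a 301 replaces the stored subscription URL, a 302 does not); the function name and the URLs are hypothetical and are not taken from FeedBurner.

```python
def updated_subscription_url(stored_url, status, location=None):
    """Return the URL an aggregator should remember after one fetch.

    301 (permanent): store the redirect target and poll it from now on.
    302 (temporary) or anything else: keep polling the original URL.
    """
    if status == 301 and location:
        return location
    return stored_url


PUBLISHER_FEED = "http://example.com/feed.xml"         # hypothetical source feed
MANAGED_FEED = "http://feeds.example.net/exampleblog"  # hypothetical managed feed

# Joining the service: a 302 keeps subscribers pointed at the publisher's URL.
print(updated_subscription_url(PUBLISHER_FEED, 302, MANAGED_FEED))

# Leaving the service: a 301 from the managed feed moves subscribers back.
print(updated_subscription_url(MANAGED_FEED, 301, PUBLISHER_FEED))
```

Under this assumption, the publisher keeps control of the subscription address in both directions, which is the point of the recommendation above.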

The second case for redirects comes into play as a result of (optional) clickthrough tracking. Publishers can choose to enable a feature in FeedBurner that will rewrite the links within the publisher’s feed. This allows us to capture the clickthroughs and measure them for the publishers, so that they can know which items are driving more clicks back to their sites.

Now, they have a choice; those users who use our TotalStats service, which costs a few bucks a month, can choose to make that clickthrough URL either a 302 or a 301. So, if there is any concern about the clickthrough URL being interpreted as a permanent location of that content, which could result in a duplicate content concern, we would recommend that you use a 301 to eliminate that concern. In general, given our earlier discussion about the noindex situation, and the fact that the search engines understand that feeds are not themselves a permanent repository of content, I think it’s less and less likely that the clickthrough URL itself will be seen as authoritative, even if it’s a 302. But, if you are worried about this issue, you can use a 301 and eliminate that concern.
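A minimal sketch of how link rewriting for clickthrough tracking can work in general is shown below. It is not FeedBurner’s implementation: the tracker address, the `to` query parameter and both function names are hypothetical, chosen only to show the rewrite step and the choice between a 302 and a 301 response.

```python
from collections import Counter
from urllib.parse import parse_qs, quote, urlsplit

TRACKER = "http://tracker.example.net/click"  # hypothetical tracking endpoint
click_counts = Counter()                       # clicks recorded per original URL


def tracking_link(item_url):
    """Rewrite an item link so that clicks pass through the tracker first."""
    return f"{TRACKER}?to={quote(item_url, safe='')}"


def resolve_click(tracking_url, permanent=False):
    """Record the click and return the (status, Location) pair to send back."""
    query = parse_qs(urlsplit(tracking_url).query)
    target = query["to"][0]
    click_counts[target] += 1
    status = 301 if permanent else 302
    return status, target


link = tracking_link("http://example.com/first?ref=feed")
print(link)
print(resolve_click(link, permanent=True))
```

The `permanent` flag corresponds to the choice described above: with a 301, the response itself says that the content lives at the publisher’s URL rather than at the tracking URL.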

Eric Enge: What about the impact of Style Sheets? It seems like that’s a really important thing for people to take advantage of in their feeds.

Rick Klau: Since day one we have added an XSLT style sheet, so that when feeds are viewed in a browser, they look more like web pages than XML code. When most people look at angle brackets and XML code in single-spaced fonts, they don’t really know what it means, and they are not likely to subscribe to it. Using an XSLT style sheet to improve the usability of feeds is something that we started doing several years ago. In the past nine months this has become a little bit more complex, because the most recent versions of both Firefox and Internet Explorer override any style sheet settings in the XML document.

Instead, they format the feed according to their internal style sheet settings. This renders the embedded style sheet ineffective, and it also limits, to some extent, publishers’ ability to control the user experience, which is something that frustrates many publishers. As a result, we rolled out an enhancement to our service which we call BrowserFriendly. It gives you the option of forcing the feed to render as a webpage whenever it is loaded in a browser, regardless of which version of the browser is viewing it. There is some complex decision-making we are doing in the background to decide when to deliver that feed to a browser as a webpage. As a result, the browser’s own feed styling doesn’t take precedence. This way the presentation of the feed content is consistent with how the publisher wants it as frequently as possible; maybe not a hundred percent of the time, but certainly far more often than would otherwise happen.
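For readers who have not seen one, a style sheet is attached to a feed with the standard `xml-stylesheet` processing instruction. The Python sketch below shows where that instruction goes; the function name and the `feed.xsl` file name are assumptions, and it illustrates only the embedded style sheet approach, not how BrowserFriendly works internally.

```python
def add_stylesheet(feed_xml, xsl_href):
    """Insert an xml-stylesheet instruction ahead of the feed's root element."""
    instruction = f'<?xml-stylesheet type="text/xsl" href="{xsl_href}"?>'
    if feed_xml.startswith("<?xml"):
        # Keep the XML declaration first; it must open the document.
        end = feed_xml.index("?>") + 2
        return feed_xml[:end] + "\n" + instruction + feed_xml[end:]
    return instruction + "\n" + feed_xml


feed = '<?xml version="1.0"?>\n<rss version="2.0"><channel></channel></rss>'
print(add_stylesheet(feed, "feed.xsl"))
```

As noted in the answer above, browsers that apply their own feed view ignore this instruction, which is the limitation BrowserFriendly was built to work around.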

Eric Enge: There are a couple of other things that I have heard are good common practice. These are things that we do ourselves at this point, and a lot of people are now recommending that you include your full content in the feed itself rather than just a summary.

Rick Klau: You can certainly count me among those who are advocating this. I published a post on our corporate blog a couple of weeks ago on this particular subject. I think too often the full feed versus partial feed debate focuses on the wrong issue. One side of the argument says that the readers want full content, so forget about what the publisher wants; it’s all about what the readers want. I think that’s a valid perspective, but I think there are equally compelling reasons why a publisher would want full feeds that are rarely talked about. For example, an increasing number of services are discovering feeds and indexing feed content. That’s an important notion, because it means that publishers that don’t put full content in their feeds are going to have only a portion of their content indexed in these search engines. If they instead put the full content in the feed, this will dramatically increase the visibility and the discoverability of their content.

Eric Enge: Certainly the SEO community understands the value of good content with lots of words to improve your long tail indexing.

Rick Klau: Yes. The second element of this is the value of a link, which matters in the context of feeds for a very different reason. Services that index feed content, such as Techmeme, are looking for relationships among posts in feeds. They use those relationships to cluster content, so that you can very quickly see all the conversation happening about a particular subject. The way they are able to do that is by looking at the full content in the feed and looking for the href in the body of the post, to establish that the post that’s being indexed is related to the post that’s linked to. If those links are not visible, because the full content of the post is not in the feed to be indexed, then the links don’t exist as far as these services are concerned. And they will have no notion of the relevance of that particular post to whatever the conversation happens to be.

As a publisher I may be trying to participate in a conversation, and my participation will be completely invisible to any of these services that are using links as the pivot point on which they evaluate the relationships among these posts. As I mentioned in my blog post, the new version of FeedDemon that just went to release candidate this week does a great job of surfacing popular posts among your subscriptions. So, if I take my 200 subscriptions and use FeedDemon as my reader, instead of looking publisher by publisher, feed by feed, I can very quickly look at which things my subscriptions are talking about today.

It’s a great way of sifting through oodles of data, and using the people who I’ve effectively designated as filters for me to find out what they think collectively is important or interesting. If those links don’t exist in those posts, FeedDemon and other services won’t know about them, and won’t have any way of leveraging the power of that content that’s included in the post.
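The link-based relationship described above can be sketched in a few lines of Python. This is only an illustration of the general idea (collect the href values in each post body, then count how many posts point at the same target); it is not how Techmeme or FeedDemon actually work, and the sample posts, URLs and function names are assumptions.

```python
from collections import Counter
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href of every <a> tag found in a post body."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_in(content_html):
    """Return all link targets found in one post's feed content."""
    collector = HrefCollector()
    collector.feed(content_html)
    return collector.hrefs


def most_linked(posts):
    """posts maps a post URL to the HTML the feed carries for that post."""
    counts = Counter()
    for url, content in posts.items():
        for target in set(links_in(content)):
            if target != url:
                counts[target] += 1
    return counts.most_common()


POSTS = {
    # Two full-content posts that link to the same story.
    "http://a.example/1": '<p>Good take on <a href="http://example.org/story">this story</a>.</p>',
    "http://b.example/7": '<p>I disagree with <a href="http://example.org/story">this</a>.</p>',
    # A summary-only post: the link in the full article never reaches the feed.
    "http://c.example/3": "My thoughts on the story everyone is discussing...",
}

print(most_linked(POSTS))
```

In this sample, the third post takes part in the same conversation but contributes nothing to the count, because its feed carries only a summary with no links.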

Eric Enge: Old-fashioned promotional smarts come into play here. For example, if putting the full content in your feed means that only half the people click through to your site, but your audience grows by four times, you have still doubled your clickthroughs, so you probably gained something.

Rick Klau: Yes. Also, we manage an ad network, which means that publishers who use FeedBurner can monetize their feeds and make a fair amount of money on them. So, it’s not an all-or-nothing proposition. You can make money on your site, or you can make money in the feed. And I think too often people presume that the only reason people would click through is to read the same article they have just read in the aggregator. But even that’s not true. People can click a link to vote for the story at Digg. They can bookmark it at del.icio.us, they can submit it to Netscape, they can submit it to Reddit, or they can look for related content at Sphere. There are all kinds of additional things that the publisher can do to add value to the post itself. Surfacing those opportunities, which we make available through a platform we call FeedFlare, dramatically increases the likelihood that the reader will interact with the content in a meaningful way. And, by the way, in a way that will accrue vastly more benefit to the publisher than getting one incremental page view.

Eric Enge: Is FeedFlare part of the TotalStats package or is it separate?

Rick Klau: It’s a free platform with an open API, in addition to the core FeedFlare integration points to sites like Digg, del.icio.us, Technorati, Reddit, and a number of others. Anyone can create their own FeedFlare. They can be dynamically generated, which means they can key off of content that’s in the feed (e.g. “Technorati: 91 links to this item”; “Save to del.icio.us (30 saves, tagged: rss FeedBurner feeds)”; “Digg This! (3 Diggs)”). They can execute programs in the background that look for certain conditions to be met, and then insert text or links into the bottom of the post based on a whole variety of factors. It’s, I think, a very exciting way of extending the core feed UI. It can make it much more about providing value to the reader and tailoring the presentation to what the reader wants and what benefits the publisher wants to deliver.
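To show what “dynamically generated” can mean in practice, here is a small Python sketch that builds footer lines like the ones quoted above from per-item statistics. It does not use the FeedFlare API; the function name and the keys of the `stats` dictionary are hypothetical, and the numbers are the ones from the example in the answer.

```python
def flare_lines(stats):
    """Build footer lines for one feed item from whatever stats are available."""
    lines = []
    links = stats.get("technorati_links", 0)
    if links > 0:
        # Only shown when the condition (at least one inbound link) is met.
        lines.append(f"Technorati: {links} links to this item")
    saves = stats.get("delicious_saves", 0)
    if saves > 0:
        tags = " ".join(stats.get("delicious_tags", []))
        lines.append(f"Save to del.icio.us ({saves} saves, tagged: {tags})")
    else:
        lines.append("Save to del.icio.us")
    diggs = stats.get("diggs", 0)
    lines.append(f"Digg This! ({diggs} Diggs)" if diggs else "Digg This!")
    return lines


item_stats = {
    "technorati_links": 91,
    "delicious_saves": 30,
    "delicious_tags": ["rss", "FeedBurner", "feeds"],
    "diggs": 3,
}
for line in flare_lines(item_stats):
    print(line)
```

An item with no activity yet would still get the plain “Save to del.icio.us” and “Digg This!” links, which is one way of keying the footer off the state of each item.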

Eric Enge: Excellent. Another thing you should do is put links to other related articles at the bottom of your article, right?

Rick Klau: Absolutely. I was talking with a commercial publisher the other day about this. They were really concerned about doing full content, for all the normal reasons about not wanting to give away content, etc. I said, “Look, you’ve got a hundred years of content on your website that most people who are reading the daily news have no idea exists. If, when I am reading a story about something going on in the news, you link to all the stories that are related, you suddenly are increasing the likelihood of a click. You don’t care if they click through to read that particular story. But if they click through to see a list of a hundred other places they could go, all within your website, do you think they are going to click a couple of times within there? Absolutely they will.”

Eric Enge: Do you have any other promotional suggestions?

Rick Klau: One idea is to re-syndicate your content. For example, we offer BuzzBoost, which turns a feed into JavaScript that republishes the feed headlines and the feed content in any HTML page. So, if you have a bunch of people who like your content and would like to reproduce it on their own sites, you can encourage that by using tools like BuzzBoost or widgets. Then there is SpringWidgets, a Fox Interactive company, which turns your feed into a Flash app that can be embedded on websites or downloaded to desktops. As your content updates, their widgets update. And then finally we have something called Headline Animator, which will turn your feed into an animated GIF that can again be rendered on any HTML page or in any email program. I use it in my email signature file. Whenever I send an email out, below my title, company and email address are the last 5 things I have written on my blog, in a little rotating banner with a link through to the blog itself. When I included that, subscriptions to my feed went up 10% in under 3 weeks.
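The general idea of republishing feed headlines on another page can be sketched as follows. This Python example renders the most recent items of an RSS 2.0 feed as an HTML list; it is not BuzzBoost (which delivers JavaScript), and the sample feed, the function name `headlines_html` and the default of five items are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed sample feed with two items.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>Feeds &amp; readers</title>
      <link>http://example.com/feeds-and-readers</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>"""


def headlines_html(rss_text, limit=5):
    """Render up to `limit` feed items as an HTML list of linked headlines."""
    root = ET.fromstring(rss_text)
    rows = []
    for item in list(root.iter("item"))[:limit]:
        title = escape(item.findtext("title", default=""))
        link = escape(item.findtext("link", default=""), quote=True)
        rows.append(f'<li><a href="{link}">{title}</a></li>')
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


print(headlines_html(SAMPLE_FEED))
```

Escaping the title and link before writing them into HTML matters here, because the snippet ends up on someone else’s page.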

Eric Enge: I would love to learn how to do that in my signature.

Rick Klau: What do you use for email?

Eric Enge: Outlook.

Rick Klau: Okay. It will be very easy to do, depending on which version of Outlook. There are a couple of different tasks that need to be done. But once it’s set up, it’s just part of your email signature file, and it gets embedded with every email you send.

Eric Enge: Right. That will be cool, and BuzzBoost sounds cool too, because you can basically give people a very simple way to have a rolling list of your articles, whatever they may be, show up on their site.

Rick Klau: Yes. We are in an age when the media itself is becoming increasingly distributed. It’s no longer about the four corners of your website. It’s about the four corners of the globe; it’s about where the audience wants to be, not where the publisher wants it to be. That means that they might be reading your content in an aggregator, or they might be reading it on somebody else’s website. They could be reading it through a search engine, or they could be looking at a video on YouTube or reading it on someone’s blog. Basically, it’s all over the place. As a publisher, if you are looking to grow your audience and expose more people to your content, you should embrace those ideas instead of worrying about controlling where they are when they see your content.