It’s a new year, so it’s time for a slight change of direction. You may have noticed your feed reader of choice just barfed up a few dozen posts from these here parts. I’m hoping that little bit of necessary unpleasantness will be one time only.

I’ve come to realize that my content creation has become a lot more distributed, which means the long-form post format of this site has been seeing less and less love in recent years. Much has been written about Twitter killing the urge to write longer blog posts, and I won’t dispute that as a cause. I liked Andy Budd’s take on why his site has been suffering; I can relate to a lot of those reasons.

So for the past month I’ve been working on a way of piecing together the content I produce on other sites and funnelling the relevant bits into a stream I can present here.

Totally nuts, right? The volume will be too high, and nobody wants to see every photo I upload or hear every inane thought I come up with while out for dinner. So that’s why I’m exercising editorial control and only bringing over the bits and pieces I’ve hash- or machine-tagged.

On Twitter I’m using a hash tag (#mb) which shows up in the original but is stripped from the on-site version. Google Reader pulls in shared items tagged with mezzoblue. And for now I’m just throwing in everything from Delicious, since I got into the habit of using it for the now-deprecated mezzoblue Dailies.
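If you’re curious, the filter-and-strip step for tweets amounts to something like this. (A Python sketch for illustration, not my actual code, which is PHP; the function names and sample tweets are made up.)

```python
import re

TAG = "#mb"

def wants_syndication(text):
    """Keep only tweets carrying the magic hash tag as a whole word."""
    return TAG in text.split()

def strip_tag(text):
    """Remove the tag and tidy up whitespace for the on-site copy."""
    cleaned = text.replace(TAG, "")
    return re.sub(r"\s{2,}", " ", cleaned).strip()

tweets = [
    "Working on a new stream for the site #mb",
    "What should I have for dinner?",
]
# Only the tagged tweet survives, minus the tag itself.
kept = [strip_tag(t) for t in tweets if wants_syndication(t)]
```

Checking for the tag as a whole word (rather than a substring) avoids accidentally syndicating tweets that merely contain those characters.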

I’m still not sure if I’m going to write up the scripts I built to make this happen, or package it up into some kind of actual open source release. I think the latter way would be more interesting, but there’s a lot of work that would have to happen to get to something even slightly worthy of putting out there for public consumption.

I do realize that not everyone will want this, so the way this site used to work isn’t gone. You can follow the clutter-free post-only feed, or browse just my original posts on the traditional archive pages. Both are accessible from the main archives page, and will continue to exist. It’s just the defaults that have changed; you can go ahead and ignore all the new stuff if you want.

Expect a few bugs as I stress-test my scripts live over the next few weeks, and let me know if you find anything horribly wrong.

Update: the first major bug has been found: the full Atom feeds weren’t ready for prime time at all. For now I’ve backed out and made the default feed post-only again until I can figure out what’s causing old items to duplicate. Sorry about the collateral damage to your feed reader.

That’s similar to what I’m doing on my relaunched site, except I’m running cron jobs on the server to make the API calls (Tumblr/Delicious, All Consuming, Flickr, and Twitter in my case) and then inserting the data directly into my site database. Different RSS feeds then give visitors the choice of ‘blog only’ or ‘blog plus links’.

The reason for importing the data into my own database is that it affords the opportunity to present the link (and my comments) in context, and invite comments (similar to how Jeff Croft’s site handles his linklog).

I hadn’t considered including tweets in my RSS - I don’t think I’ve ever said anything that interesting on Twitter…

I think all of us who have been on the web for a while are ending up doing this sort of thing in one way or another. The personal site is no longer the end-all, be-all of content it used to be, but rather a place we can use as a hub to concentrate all the bits and pieces we generate that end up scattered all over the web now. Zeldman summarized this brilliantly a while ago ( http://is.gd/9nb ).

That’s precisely what I am doing on my own site as well. If you look at the home page, besides the main menu and doodles, there are the latest three Twitter posts, the latest 15 or so Flickr photos, and the Delicious links I personally choose to show on the site by means of a special tag.

Since I’ve never been much of a programmer I’m pulling all this in by means of a few customized WordPress plugins, and I wish I knew a way to select the tweets I want to show on the site by means of a tag as you suggest above (I’d rather not have replies and other irrelevant messages show up there). It will be interesting to see how your developments on these matters turn out.

It sounds like this will be the first recognizable web trend of 2009. I’m in the process of making a similar conversion myself but will be doing so using Matthew’s logic above. I still want Mike Industries to be the permanent home for all of that stuff… not some random third-party services I happen to be using at the time. You’ve probably set yours up so they live in your database, which is great, but I think the ability to comment on everything is essential (although some people I know feel the opposite way and are turning comments off on all of their stuff).

@Matthew Pennell - I took a slightly different approach. I initially tried to avoid storing anything locally and run it all off RSS feeds, but realized I had to use the APIs in order to get anything past the most recent 15 or so items.

In the end I gave in and went with a file system-based cache instead of a database. Nothing’s running live, unless the cache has expired (some live for 15 mins, older ones live forever). It’s good enough for now, and did manage to give me enough control to write out custom Atom feeds with and without the extra stuff.
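The general shape of that cache is nothing fancy. Sketched in Python (mine is PHP, and the paths, names, and TTL constant here are mine for illustration):

```python
import os
import time

CACHE_DIR = "cache"
FRESH_TTL = 15 * 60  # recent data lives 15 minutes

def cached_fetch(key, fetch, ttl=FRESH_TTL):
    """Return the cached copy if it's still fresh; otherwise re-fetch and store.

    Passing ttl=None means the file never expires, which is how the
    older, frozen date ranges are handled.
    """
    path = os.path.join(CACHE_DIR, key)
    if os.path.exists(path):
        age = time.time() - os.path.getmtime(path)
        if ttl is None or age < ttl:
            with open(path) as f:
                return f.read()
    data = fetch()  # only hit the remote API on a miss or expiry
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(path, "w") as f:
        f.write(data)
    return data
```

The nice side effect of keying on file modification time is that “expiring” old data is just a matter of deleting the file.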

@Joey Baker - if Tumblr has an API that can return an RSS/Atom/other XML file for a specific date range (the date range is essential for the way I’ve got things set up) then it could fairly easily be adapted to work with the scripts I’m using.

@Geoff - hadn’t heard of Sweetcron, but it looks close to what I’m doing here. Probably even closer to what Matthew Pennell describes in his comment.

@Kenneth - fixed.

@Ethan - and thank you sir. I was a bit leery that the increased volume would be a problem, but so far the response seems to be positive. Having opt-out ability is pretty key, I’d say.

@Beto - the Twitter Search API is what I used. Getting the data in from the API actually wasn’t terribly difficult with a bit of PHP knowledge and a few minutes reading the documentation; it was processing it and doing something useful with it that took a bit more work. I’m using a PHP library called MiniXML to parse the query results, and then my own custom stuff to do the rest. Not sure if that helps a non-programmer, but I don’t really consider myself much of one and I figured it out, so hopefully it’s not that hard. I’ll see what I can do about providing some more concrete code that will help you out.
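To give a rough idea of the parsing half: the Search API hands back an Atom feed, and pulling the tweets out is just a matter of walking the entries. A Python sketch (I’m doing the equivalent with MiniXML in PHP; the sample feed below is a made-up, trimmed-down stand-in):

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Stand-in for a real Search API response, cut down to the basics.
sample = """<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:search.twitter.com,2005:1</id>
    <title>Working on a new stream for the site #mb</title>
    <published>2009-01-03T12:00:00Z</published>
  </entry>
</feed>"""

def parse_tweets(xml_text):
    """Turn an Atom search result into a list of simple dicts."""
    root = ET.fromstring(xml_text)
    out = []
    for entry in root.findall(ATOM_NS + "entry"):
        out.append({
            "id": entry.findtext(ATOM_NS + "id"),
            "text": entry.findtext(ATOM_NS + "title"),
            "published": entry.findtext(ATOM_NS + "published"),
        })
    return out

tweets = parse_tweets(sample)
```

Once the entries are plain data structures like this, writing them out to the site (or to a custom feed) is the easy part.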

@Mike D. - I thought about permalinks and comments for all the off-site content here on mezzoblue. I didn’t do permalinks, but that’s next on my list.

I chose not to do comments though. I used to run a linkblog (Dailies) that was first its own Movable Type blog, then lived in Delicious. In the first incarnation comments were open, but after a few hundred links there were only a handful of useful comments. So I’m of the mind that comments on links are pretty much pointless. Photo comments could be a bit more meaningful, but I’m just passing those through to Flickr because it doesn’t make sense to have them live in two different places. (Maybe one day I’ll do some slick data back-and-forth to post comments to Flickr directly from a form on here, but I’ve thrown back enough Tylenol already for now.)

Definitely a trend for 2009. I’m also looking to do this with cron jobs (actually, I’m looking for an easier way: messing with the WP admin so that it controls where everything should go from its own interface).

Let’s see how this works :)

And maybe we should start a list or something to help others do this… even if we don’t completely open our sources…

I like that you cached it locally in the FS. I do it differently on a few sites, but I mostly cache to a local DB, allowing it to be used in connection with a bigger picture of things. It’s somewhat of a two-layer cache: pulling in the API (by cron + rake task), then page/action caching different elements. It still gives me the speed factor, while also giving me the flexibility to use the data models however I see fit.

Not sure this is a new trend (per Mike D.) - but I do see more and more people moving towards pulling their content into their own domain. I like to push and pull the content, but ultimately it’s about really interacting with the APIs.

I love that you go above and beyond the typical JS widgets - those seem almost useless to me. The real power comes when you can work, form, and shape that data into a bigger system. Well done :)

I’d be very excited to get ANY sort of peek behind the curtain, especially to see your technique for Flickr. I’m another of the masses who are starting to tackle this right now, and I’d hate to re-invent the wheel.

Beyond the news, though, I just thought I’d issue a more general compliment for your entire site. I’ve visited many times over the years, but for some reason I’ve never really taken in the completeness of it all. It’s a very well-cared-for little home on the web.

I’ve been looking at doing this on my site for some time now, but scripting it from scratch seems to be well over my head. I’ve looked into some EE plugins, but haven’t found anything suitable yet. Perhaps 2009 will be the year I get it figured.

@Spencer - that’s certainly the goal. I’m planning on using them sparingly, and only if they feel on-topic and relevant for this site.

@Nate - I started out actually using a PHP caching library for the raw source, but realized that the hit of parsing XML on every page request made that a little silly. In the end I ditched the library and just wrote everything directly to PHP files.

And yeah, the JS widgets never really felt like a good solution in my mind. Tack-on content is a second-class citizen, and I’d like to do it in a way that makes everything I choose to post here first class.

@Jonathan Snook - good call, never thought of that. Actually I don’t think I knew you could even favourite your own, à la Flickr. The big downside I can see is that other services make assumptions about why someone favourites (cough Favrd cough) that a) wouldn’t be true in this case, and b) would make me look like a self-promoting tool. Hash tags on tweets are lame, but for now they seem like the best fit.

@punkassjim - alright, I’ll see what I can do. I might write up the various stages (data acquisition, parsing, caching, and display) in separate posts, but I’d have to modify my source so that each stage builds off the previous one. Still, that might be the best way to get this out there. For now I’d recommend taking a look at the Flickr REST API; it’s well-documented, and armed with a bit of PHP knowledge and the MiniXML library you can probably get further than you’d think in a few hours of tinkering.

@Jason Landry - yeah, I wouldn’t say it’s a small undertaking; there are a lot of individual little problems to solve along the way that can be head-scratchers. You might want to check out the sweetcron.com tool mentioned in a previous comment; it’s still in beta, but it might be what you need.

Oh, and a general note in case you missed the post update – the full Atom feed wasn’t working out, so I’ve backed out to post-only for now. Every time a source updated, all the posts from that source refreshed in the feed, which is more than a little annoying.

@Matt - It’s a good resource, and it actually gets me about 80% to where I want to be. But, two things: 1) I’d like to keep the number of ready-made plugins down, and 2) since it’s just based on RSS feeds, there isn’t as much information to play around with, so it kinda limits the possibilities. For example, in Dave’s system, with every Flickr post, it shows how many comments that photo has at any given time.

@punkassjim - nope, they’re coming straight out of the API along with the rest of the data. You have to run two queries, one to search for photo IDs that meet your criteria, then a second to get the info from those photos. The second one is when the comments show up, then I just plug ‘em in with everything else.
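For the curious, the two-step flow looks roughly like this. (Python sketch with a stubbed-out transport and made-up sample data; in real use you’d swap `flickr_call` for an actual HTTP request to the Flickr REST API with your key.)

```python
# Stand-in for a real REST call to the Flickr API; the fake data below
# is invented purely so the two-step flow can be shown end to end.
def flickr_call(method, **params):
    fake = {
        "flickr.photos.search": {"photos": [{"id": "101"}, {"id": "102"}]},
        "flickr.photos.getInfo": {
            "101": {"title": "Snowy street", "comments": 3},
            "102": {"title": "Coffee", "comments": 0},
        },
    }
    if method == "flickr.photos.search":
        return fake[method]
    return fake[method][params["photo_id"]]

def fetch_photos():
    # Query 1: just the IDs of photos matching your criteria.
    result = flickr_call("flickr.photos.search", user_id="me", tags="mb")
    ids = [p["id"] for p in result["photos"]]
    # Query 2: full info per photo, which is where comment counts appear.
    return [flickr_call("flickr.photos.getInfo", photo_id=i) for i in ids]

photos = fetch_photos()
```

The second query is the expensive one since it’s per photo, which is another reason the caching layer matters.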

In terms of how often they’re updated, well, that’s still a problem I’m working through. Relatively recent data is being refreshed once an hour (you could cron it, or you could just check to see if the cache has expired when someone loads the page and run the request live if it has, which is what I’m doing).

But the flaw in my plan is that Twitter’s Search API only promises results for the last six months, so for anything older than that, refreshing my cache to pick up new Flickr comments (or even blog comments, though after six months they’re usually closed anyway) would lose the Twitter posts for that range. Right now everything older than six months is permanently cached. The only way I can see around this is putting it all into a database, which I really didn’t want to have to do.

About This Entry:

You are reading “Not a Test”, an entry posted on 3 January, 2009, to the Misty collection. See other posts in this collection.