Julien Couvreur's programming blog and more

Open-ended links, link re-writing

30 Mar 2005

... and rich but loose service integration.

A couple of weeks back, I was thinking about an architecture for modularized/loose integration between services. Open-ended links are an idea to help with scenarios where you go from one website to another, which today involves a lot of hard coupling.
In essence, these links would be simple but structured data that the browser could interpret and pass on to the relevant service provider. This would work for RSS subscription, sending email, and a bunch of other scenarios.
I'll also discuss how Open-ended links relate to the recent Greasemonkey developments, with more structure and focus on cross-site linking (rather than mixing the functionality of two sites into one page).

Open-ended links:
Loosely binding two services is one of the core design issues of one-click subscription. One service is the news/update provider and the other is the user's aggregator (be it a web or a desktop-based solution).

Something has always bothered me about this scenario: there seems to be a more general problem underneath, one that calls for a more general solution.
Other examples are map/direction, mail, identity, search, pizza ordering or payment services.

Taking the case of mapping directions, I was on a web site that linked to MapQuest for some directions. But Yahoo Maps is the service I mostly use, and it caches some convenient data, like the addresses I recently used. That hard-coded link to MapQuest actually did me a disservice, since I ended up having to copy the address anyway...

Previous solutions, like the "mailto" or "feed" URL schemes, seem to be tailored for desktop applications and they don't work unless you already have that app installed. Same thing with MIME types (but at least they behave better when the browser has no association for them).
For web pages, more often than not, there is tight coupling: search boxes are hard-coded to Google, direction links to MapQuest, book references to Amazon, feed subscription to Bloglines,...

More browser support:
It seems that solutions should involve the browser more, although the user's data and preferences may not actually be stored there (but instead on a server).
The browser would act as an intermediary between services and domains. To ensure things plug in correctly, a number of interfaces would be defined for various services (search, mapping, RSS subscription, email,...), maybe in the form of an XML schema, and the service providers would need to declare which schemas their service supports.
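
To make the intermediary role concrete, here is a minimal sketch of how a browser might dispatch an open-ended link. Everything in it is an assumption for illustration: the `OpenLink` and `Provider` shapes, the `resolveOpenLink` function, the "mapping" schema name and the `.example` URLs are all hypothetical, and plain objects stand in for the XML schemas mentioned above.

```typescript
// Hypothetical shapes: none of these names come from an existing standard.
interface OpenLink {
  schema: string; // which service interface the link targets, e.g. "mapping"
  version: number; // schema version the linking page was written against
  data: Record<string, string>; // the structured payload itself
}

interface Provider {
  name: string;
  // Schemas the provider declares, with the highest version it understands.
  supports: Record<string, number>;
  // Turns the structured data into a concrete URL on that provider.
  buildUrl(link: OpenLink): string;
}

// Browser-side step: use the user's preferred provider for the schema when it
// can handle the requested version, otherwise fall back to any capable one.
function resolveOpenLink(
  link: OpenLink,
  providers: Provider[],
  preferences: Record<string, string>, // schema name -> preferred provider name
): string | undefined {
  const capable = providers.filter(
    (p) => (p.supports[link.schema] ?? 0) >= link.version,
  );
  const preferred = capable.find((p) => p.name === preferences[link.schema]);
  const chosen = preferred ?? capable[0];
  return chosen?.buildUrl(link);
}

// Two made-up mapping providers with different URL conventions.
const providers: Provider[] = [
  {
    name: "maps-a",
    supports: { mapping: 1 },
    buildUrl: (l) =>
      "https://maps-a.example/?q=" + encodeURIComponent(l.data.address ?? ""),
  },
  {
    name: "maps-b",
    supports: { mapping: 2 },
    buildUrl: (l) =>
      "https://maps-b.example/search?addr=" +
      encodeURIComponent(l.data.address ?? ""),
  },
];

const link: OpenLink = {
  schema: "mapping",
  version: 1,
  data: { address: "1 Main St, Springfield" },
};
const url = resolveOpenLink(link, providers, { mapping: "maps-b" });
console.log(url);
```

The linking page only says "this is an address for a mapping service"; which site ends up handling it depends on the user's preferences and on what each provider declares.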

Including versions in the schemas should allow adding new schemas without stifling innovation... It would actually put some competitive pressure on the service providers. But you would still need many websites to increment the schema versions they reference, the same way that you'd need to change your URLs to support new parameters.
There is also the problem of providers that support a certain interface, but don't have a large enough datastore. For example, what if my bookstore doesn't know about that book?

SmartTag/AutoLink systems could be plugged into that system, by converting unstructured text into a little XML blob for the right schema/service, which the browser knows how to handle. So instead of having AutoLink support three different mapping services (MapQuest, Yahoo or Google), it would support any other provider that offers that service (MSN,...).

Unstructured approach:
The approach described so far relies on structured data, based on schemas. But it turns out a more organic solution is emerging, shifting power to the user and the browser: Greasemonkey user scripts.
Greasemonkey is a Firefox extension that lets the user install some JavaScript to customize webpages as they are rendered.

Some user scripts go beyond removing non-content, instead integrating multiple sites, adding features and re-writing urls.
For example, I wrote a script to have "mailto:" urls be handled by Gmail. Some user scripts, like Google Butler, add links to competitors. Some integrate the functionality of two sites, like the BBC Radio & Del.icio.us mash-up.

The current user script approach is much more organic than the structured approach I was thinking of: most scripts have to be tailored for specific sites, and URL re-writing works through reverse engineering of querystring parameters.
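
As an illustration of that kind of URL re-writing, here is a rough sketch of the core of a "mailto:" script. It is not the actual Gmail script mentioned above: `COMPOSE_URL`, its parameter names and the `rewriteMailto` function are hypothetical placeholders, where a real script would hard-code whatever it reverse-engineered from the target webmail site.

```typescript
// Hypothetical compose endpoint and parameter names, for illustration only.
const COMPOSE_URL = "https://webmail.example/compose";

// Converts a "mailto:" href into a webmail compose URL.
// Returns null for any href that is not a mailto link.
function rewriteMailto(href: string): string | null {
  if (!href.toLowerCase().startsWith("mailto:")) return null;

  // Split "address?field=value&..." into the address and the query part.
  const rest = href.slice("mailto:".length);
  const queryStart = rest.indexOf("?");
  const to = queryStart === -1 ? rest : rest.slice(0, queryStart);
  const query = queryStart === -1 ? "" : rest.slice(queryStart + 1);

  const out = new URLSearchParams();
  // Note: decodeURIComponent throws on malformed escapes; a real script
  // would want to guard against that.
  out.set("to", decodeURIComponent(to));

  // Carry over the common mailto fields if present.
  const incoming = new URLSearchParams(query);
  for (const field of ["subject", "body", "cc", "bcc"]) {
    const value = incoming.get(field);
    if (value !== null) out.set(field, value);
  }
  return COMPOSE_URL + "?" + out.toString();
}

console.log(rewriteMailto("mailto:a@b.com?subject=Hello%20world"));
```

In a user script, a function like this would be applied to the href of each link on the page. The fragile part is the target URL format, which can change without notice since it is not a published interface.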

Referral ecology:
Beyond the problems with stripping ads, re-writing links to use your preferred service could have some serious effects on the referral business.
For example, it is possible to replace links to books at Barnes & Noble with Amazon links. In that case, the owner of the content that linked to the book wouldn't get any referral fees.
Are there some creative solutions to ensure the ecosystem remains healthy, providing incentives to writers?

The referral business would also be impacted indirectly, as users would become more "sticky" to their preferred service. This would create a second barrier to switching services, in addition to the current strategy of locking up user data.
For example, if I re-write all my map links to MSN Maps, there is little chance that I will switch over to Google Maps.

We need to start thinking about it now, as complete control from the user over the content he consumes could make the web implode. There has to be some incentive to prevent users from removing all the ads and re-writing links with their own affiliate ID. Or we have to find other ways of remunerating authors.
Cory Doctorow sees re-writing ISBN links as an opportunity to support authors, with browser extensions that would insert my favorite author's affiliate IDs in the links I browse. But the same re-writing technique can also be used to steal website authors' affiliate fees, for example by replacing any Amazon affiliate ID in links that the user clicks with his own.
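
To show how little code either use takes, here is a minimal sketch of affiliate ID re-writing. The `setAffiliateId` function, the `bookstore.example` host and the `ref` parameter name are all made up for illustration; each real store has its own parameter, which a script would have to look up or reverse-engineer.

```typescript
// Sets (or replaces) an affiliate parameter on links to one specific host.
// Links to other hosts, and strings that are not absolute URLs, are
// returned unchanged.
function setAffiliateId(
  href: string,
  host: string, // hypothetical store hostname
  param: string, // hypothetical affiliate query parameter
  affiliateId: string,
): string {
  let url: URL;
  try {
    url = new URL(href);
  } catch {
    return href; // not an absolute URL, leave it alone
  }
  if (url.hostname !== host) return href;
  url.searchParams.set(param, affiliateId); // overwrites any existing ID
  return url.toString();
}

console.log(
  setAffiliateId(
    "https://bookstore.example/book/123?ref=site-owner",
    "bookstore.example",
    "ref",
    "favorite-author",
  ),
);
```

The same call supports an author or diverts fees from the site owner; the only difference is whose ID is passed in.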

Conclusion:
The need for richer service integration and customization is becoming increasingly apparent. Greasemonkey and other site tweaks (user styles, bookmarklets,...) are already exploring this space, in a very bazaar-like fashion and at a small scale. But the tough problems raised are not only technical (functionality, security, privacy, scalability,...) but also economic. This means solutions are going to take even longer to mature and we have lots of interesting debates ahead of us (blogosphere rejoice!).
Also, the question of how to finance XML web APIs is going to rear its head again, as one site's APIs (like the del.icio.us HTTP/XML APIs) get integrated into foreign sites without the user seeing any ads to pay for that service.

Related links:

AutoBlink is a script for website owners that can make AutoLink-generated links blink, or modify them to include the owner's Amazon affiliate ID.

Some thinking on 43 Folders about Greasemonkey and having websites respect the users' preferences:
"Think about the benefits of taking web standards to the next level and making sites that can anticipate and acknowledge your visitor’s preferences from their first visit (via standard DIV names or calls to your public “preferences” file). I wouldn’t begin to know how to make this stuff, but I can definitely see myself becoming a grateful consumer."
But I would think that privacy-wise, it would be better if the browser kept the user's preferences to itself.

Matt from "Peer Pressure" posted some similar thoughts about blobs of structured data as a better architecture for contracts and flexible intermediation than the user script hack, using street addresses as an example.

User scripts are coming to IE with GreaseMonkIE (still in early development). It's going to be "fun" to re-write all user scripts with IE support... (update: GreasemonkIE and TrixIE are dead, Turnabout is still carrying the flame)

Tony from ponderer.org points out that Greasemonkey user script installation has the same usability problem as RSS feed subscription.

Update (2005/10/26): Mark Birbeck defines a URI format, CURIE, that combines a namespace and a parameter.
It's useful for shortening urls and adding some semantic value to links, but also for controlling how these references are resolved (you could resolve ISBNs at Amazon or alternatively at B&N).
In his solution, the resolving (converting a CURIE to an actual URI) is not necessarily done by the browser, but could also be performed by the server (such as a wiki software).
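
A small sketch of that resolving step, under the assumption that a CURIE is split at its first colon and the prefix is looked up in a table of base URIs. The `resolveCurie` function, the "isbn" prefix and both `.example` store URLs are invented for illustration.

```typescript
// Resolves "prefix:reference" by looking up the prefix and appending the
// reference to the mapped base URI. Returns undefined when the input has no
// colon or the prefix is unknown.
function resolveCurie(
  curie: string,
  prefixes: Map<string, string>,
): string | undefined {
  const sep = curie.indexOf(":");
  if (sep === -1) return undefined;
  const prefix = curie.slice(0, sep);
  const reference = curie.slice(sep + 1);
  const base = prefixes.get(prefix);
  if (base === undefined) return undefined;
  return base + reference;
}

// The same reference resolved against two hypothetical stores: swapping the
// prefix table is what changes where the link ends up.
const storeA = new Map([["isbn", "https://store-a.example/isbn/"]]);
const storeB = new Map([["isbn", "https://store-b.example/books?isbn="]]);

console.log(resolveCurie("isbn:0131103628", storeA));
console.log(resolveCurie("isbn:0131103628", storeB));
```

Whoever holds the prefix table (the browser, or a server such as wiki software) decides which service the reference points at.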

Update (2015/10/28): Since I wrote this, the concept has emerged in a number of platforms: custom protocol handlers (in Chrome and other browsers), Android Intents and Activities, iOS cross-app Sharing, ...

______________________________________

I'm lost. I keep seeing references to relevant user scripts for Trixie. What are they, and where do I get them?
All I want to do is use trixie to bypass the Microsoft validation tool.
Can you help me?