Strategies for Internet citizens

The social scripting continuum

Back in June, IBM’s Tessa Lau joined me on my ITConversations podcast to discuss Koala, “a system for recording, automating, and sharing business processes performed in a web browser.” The service is now available on the AlphaWorks site as CoScripter, where the first script I tried was Tessa’s own Update your Facebook status. Here is the text of the script as it appears in the CoScripter wiki:

Interestingly, there was a bug in that script. The fourth step was originally:

* click the "Password" button

Because there is no button labeled “Password” on Facebook’s login page, the script failed.[1] When I changed “Password” to “Login” in the CoScripter sidebar, I simultaneously fixed the script and added the corrected version to the wiki. After posting this entry, I added a comment to the wiki that points back here. All in all, it’s a nice illustration of the emerging style of social programming that we also see in applications like Yahoo! Pipes and Popfly.

As Tessa explains in the podcast, many scripts — including this Facebook example — require secrets, notably usernames and passwords. These you can conveniently record as name/value pairs stored in a personal database. I have two observations about that. First, secrets appear to be stored remotely. If so, I’d prefer to keep them local. (Update: They are indeed local, see Tessa’s comment below.) Second, there should be a way to qualify them by domain, because names like “Email Address” and “Password” will soon become overloaded.
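To make the domain-qualification idea concrete, here is a minimal Python sketch. It is not how CoScripter stores or resolves secrets; the `domain/name` key scheme, the `PERSONAL_DB` contents, and the `lookup` function are all assumptions made up for illustration. The idea is simply that a lookup prefers an entry qualified by the current page's domain and falls back to the unqualified name.

```python
from urllib.parse import urlparse

# Hypothetical personal database. The "domain/name" key convention is an
# assumption for this sketch, not CoScripter's actual format.
PERSONAL_DB = {
    "email address": "me@example.com",
    "password": "generic-secret",
    "facebook.com/email address": "me+fb@example.com",
    "facebook.com/password": "fb-secret",
}


def lookup(name, page_url, db=PERSONAL_DB):
    """Return the value for `name`, preferring a domain-qualified entry."""
    host = urlparse(page_url).hostname or ""
    parts = host.split(".")
    # Try the full host first, then each parent domain
    # (www.facebook.com, then facebook.com), stopping before the bare TLD.
    for i in range(len(parts) - 1):
        key = ".".join(parts[i:]) + "/" + name.lower()
        if key in db:
            return db[key]
    # Fall back to the unqualified name, or None if there is no entry at all.
    return db.get(name.lower())
```

With this scheme, a script running on Facebook's login page would pick up the Facebook-specific password, while the same step on any other site would get the generic one.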

One of the delightful things about CoScripter is the simple and natural language used to express sequences of actions. It looks just like the instructions an ordinary user would write down for another ordinary user to follow. By embedding those instructions in an interpreter that makes it easy for anyone to run and debug them step by step, and by reflecting them into a versioned wiki, CoScripter creates a rich environment in which people can record, exchange, and refine their operational knowledge of web applications.
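As a rough illustration of what "embedding those instructions in an interpreter" could look like, here is a toy Python sketch. It is not CoScripter's parser. Only the `click the "..." button` phrasing comes from the script quoted above; the other step shapes, the `parse_step` and `run` functions, and the set of page labels are hypothetical. It parses each step and reports the first click step whose target label is missing from the page, which is the kind of failure the "Password" bug produced.

```python
import re

# A few step shapes. Only the "click the ... button" phrasing appears in the
# post; the "go to" and "enter your" phrasings are assumed for illustration.
STEP_PATTERNS = [
    ("goto", re.compile(r'^go to "(?P<target>[^"]+)"$')),
    ("enter", re.compile(
        r'^enter your "(?P<value>[^"]+)" into the "(?P<target>[^"]+)" textbox$')),
    ("click", re.compile(r'^click the "(?P<target>[^"]+)" (?P<kind>button|link)$')),
]


def parse_step(line):
    """Turn one instruction line into a dict describing the action."""
    text = line.strip().lstrip("*").strip()
    for action, pattern in STEP_PATTERNS:
        match = pattern.match(text)
        if match:
            return {"action": action, **match.groupdict()}
    raise ValueError(f"cannot interpret step: {text!r}")


def run(script, page_labels):
    """Walk the script step by step.

    Returns (step_number, message) for the first click step whose label is
    not among `page_labels`, or None if every step can be carried out.
    """
    for number, line in enumerate(script, start=1):
        step = parse_step(line)
        if step["action"] == "click" and step["target"] not in page_labels:
            return number, f'no {step["kind"]} labeled "{step["target"]}"'
    return None
```

Because each step is parsed and checked on its own, a failure points at a single line that a person can read and fix, which is what makes the step-by-step debugging loop practical.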

Currently CoScripter is a creature of the web, and specifically of a Firefox-based, Flash-free web. Adapting it to another browser would be hard but doable. Adapting it to work with RIA (rich Internet application) plug-ins like Flash or Silverlight is really problematic, though, because RIA plug-ins don’t mesh very well with the web’s RESTful style.

There are minor exceptions. Back in 2004 I raised that issue in terms of Flash, and Adobe’s Kevin Lynch showed how to materialize URLs for states within a Flash application. But this doesn’t occur normally and naturally when you write a Flash application, as it does when you write a web application. Or rather, as it used to when you wrote a web application, because AJAX also tends to hide an application’s URL namespace.
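For readers who haven't seen the technique, here is a generic Python sketch of what it means to materialize a URL for an application state. It is not the approach Kevin Lynch demonstrated; it only shows the general idea of serializing state into the URL fragment and reading it back. The function names and the example state keys are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def state_to_url(base_url, state):
    """Return `base_url` with `state` (a dict of strings) in the fragment."""
    parts = urlsplit(base_url)
    # Sort the keys so the same state always yields the same URL.
    fragment = urlencode(sorted(state.items()))
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, parts.query, fragment))


def url_to_state(url):
    """Recover the state dict from a URL produced by state_to_url."""
    return dict(parse_qsl(urlsplit(url).fragment))
```

For example, `state_to_url("http://example.com/app", {"view": "inbox", "msg": "42"})` yields a link that can be bookmarked or emailed, and `url_to_state` restores the same dictionary on the other end. An application that does this for its significant states gives automation tools something addressable to work with.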

Because the same issue is going to come up all over again in the context of Silverlight, now would be a good time to think about how Silverlight apps can expose automation interfaces that cooperate with the RESTful web they’re part of.

With any flavor of web application, whether it’s based on simple HTML and JavaScript, or enriched with AJAX, or turbocharged with Flash or Silverlight, it would be great not only to be able to automate as CoScripter can, but also to share and collaboratively refine the scripts. How can we best assure that possibility? Tessa Lau thinks that web accessibility guidelines represent our best hope. If CoScripter-style automation were to catch on, it would provide a further incentive to adopt those guidelines, and would likely reshape them in useful ways as well.

But why stop there? In principle there’s no reason why desktop applications can’t play the same game, and there are compelling reasons why they should. Today, for example, I found the answers to the 25 top “How do I?” questions asked about Word. Those answers are pointers to articles in the Microsoft knowledge base. For the ever-popular “How do I create mailing labels?”, the answer includes instructions like these:

Open the document in Word, and then start the mail merge. To start a mail merge, follow these steps, as appropriate for the version of Word that you are running:

Under Select starting document, click Change document layout or Start from existing document. With the Change document layout option, you can use one of the mail-merge templates to set your label options. When you click Label options, the Label Options dialog box appears. Select the type of printer (dot matrix or laser), the type of label product (such as Avery), and the product number. If you are using a custom label, click Details, and then type the size of the label. Click OK. With the Start from existing document option, you can open an existing mail-merge document and use that as your main document.

Click Next: Select Recipients

The resemblance to CoScripter’s step-by-step instructions is striking. Why shouldn’t instructions like these be able to drive Word’s automation interfaces? Why couldn’t users create and share their own instructions? Sure it’s a desktop application, but nowadays that’s just an endpoint along a continuum of application styles — HTML, JavaScript, AJAX, RIA, desktop app — all of which are connected and can communicate. Collaborative automation is just one of many opportunities to exploit that ability to communicate, but it’s a huge one.

[1] I suspect that Tessa planted that bug intentionally to see if we were paying attention!

18 thoughts on “The social scripting continuum”

Thanks for the thought-provoking write-up, and for fixing my script. :)

The personal database is stored as a text file in your Firefox profile directory, not on the server. It will be interesting to watch how the vocabulary of personal database entries evolves over time. I imagine people will start creating a namespace for things like email addresses and passwords, so for example you’d call it a “facebook login” and “facebook password”. Different scripts could then refer to different entries by name. But I expect this to be driven by the community and how they decide to use this tool.

I absolutely agree with you about scripting desktop applications. I’d love to see CoScripter’s “sloppy programming” approach be used to control all sorts of applications, and not just on the desktop. Could we use it to program VCRs? Or teach our parents how to use newfangled cellphones?

But I think what’s most interesting is not CoScripter itself, but what it enables. We’ve watched Facebook grow from an application into a platform for developing social-network applications. I can’t wait to see what people build on the CoScripter platform.

Someone else in the community has already made a copy of my script that uses the variables “facebook e-mail” and “facebook password” instead. So that’s what I mean by getting the community to standardize on variable naming. People who favor using site-specific names will tend to favor that script over mine, and that script will become more popular. And I’m hoping that eventually conventions will arise over what exactly to name things so that people can reuse variables across scripts.

“Someone else in the community has already made a copy of my script that uses the variables “facebook e-mail” and “facebook password” instead. So that’s what I mean by getting the community to standardize on variable naming.”

I came across this post while searching for other people’s experiences with CoScripter. I find the natural language approach truly amazing, but the execution… well, it simply does not work with 90% of the websites that I use.

On the other hand, over the last few months I have been using iMacros, a Firefox extension very similar to CoScripter. It uses the “classical” record & replay approach. With iMacros I have been able to automate about 60% of the websites I use with visual recording alone, and I got another 30% working after tweaking the recorded macro manually.

>Because there is no button labeled “Password” on Facebook’s login page, the script failed….
If you have to fix something like this manually, I do not think this is “natural” language processing. Every human would have executed the script correctly.

This looks like useful fun. Useful if we can indeed link it up to Pipes and the like. More useful if applications get these kinds of hooks built in, à la AppleScript. Very useful for things like unit-testing web apps.

As you note, it’s very similar to how you write howtos, or for my line of work, how I write acceptance tests and specifications/scenarios.

This, for me, is the strength of rules engines. And you can prove they don’t break.

I’ve been toying with similar concepts for rich client applications. Using URL monikers, data inside rich client apps can be directly accessed, forms and dialogs too. These URLs can be emailed from a user to another. What’s not to love about this kind of application?