Tag: AppleScript

In the last part we built a useful workflow that opens a given number of unread articles from your Instapaper feed. But we stopped short of the goal: converting the text of the articles to speech files.

If you look through the library of Automator actions, there is one with the promising name “Get Text from Webpage.” However, this extracts all the text, usually including the menus, ads and all the other detritus that clutters webpages these days. The latest version of Safari[1] has a feature called “Reader,” which removes all this clutter and lets the user focus on just the text. Unfortunately, the “Reader” feature in Safari is not scriptable.

But before Safari had “Reader” there was the Readability JavaScript scriptlet from Arc90’s lab, which does very much the same thing. Since Safari’s AppleScript dictionary allows us to execute arbitrary JavaScript against a webpage, we can use that to extract the relevant text from the article. That saves us from having to recreate the logic of the Readability scriptlet in AppleScript.

Do the following with the workflow we built in Part 1:

Duplicate the workflow file and name the copy: Speak Instapaper Articles to iTunes

Add a new empty “Run AppleScript” action at the end of the workflow and enter the following code:

on run {input, parameters}
    -- uses the 'Readability' javascript from
    -- http://lab.arc90.com/experiments/readability/
    set readabilityScript to "javascript:(function(){readConvertLinksToFootnotes=false;readStyle='style-newspaper';readSize='size-medium';readMargin='margin-medium';_readability_script=document.createElement('script');_readability_script.type='text/javascript';_readability_script.src='http://lab.arc90.com/experiments/readability/js/readability.js?x='+(Math.random());document.documentElement.appendChild(_readability_script);_readability_css=document.createElement('link');_readability_css.rel='stylesheet';_readability_css.href='http://lab.arc90.com/experiments/readability/css/readability.css';_readability_css.type='text/css';_readability_css.media='all';document.documentElement.appendChild(_readability_css);_readability_print_css=document.createElement('link');_readability_print_css.rel='stylesheet';_readability_print_css.href='http://lab.arc90.com/experiments/readability/css/readability-print.css';_readability_print_css.media='print';_readability_print_css.type='text/css';document.getElementsByTagName('head')[0].appendChild(_readability_print_css);})();"
    set output to {}
    tell application "Safari"
        repeat with x in input
            set theURL to contents of x
            make new document with properties {URL:theURL}
            delay 0.5
            repeat until ((do JavaScript "document.readyState;" in document of window 1) is equal to "complete")
                delay 0.5
            end repeat
            set d to document of window 1
            do JavaScript readabilityScript in d
            delay 3
            repeat until ((do JavaScript "document.readyState;" in d) is equal to "complete")
                delay 1
            end repeat
            set thetext to text of d
            -- remove first three and last four paragraphs since these are Readability links
            set AppleScript's text item delimiters to return
            set thetext to (paragraphs 4 through -5 of thetext) as text
            close d
            set output to output & {thetext}
        end repeat
    end tell
    return output
end run

Then we loop through the items that were passed into the action in the input variable. In this case the items are the URLs of the Instapaper posts.

set theURL to contents of x
This de-references the iterator variable. Due to some oddities of the AppleScript language, this is usually a wise thing to do in a repeat loop.
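To see why the de-referencing matters, here is a minimal sketch that can be run in AppleScript Editor. The handler name countMatches and the example.com URLs are made up for illustration. Inside the loop, x is a reference to a list item rather than the string itself, so comparing it directly should never match, while comparing contents of x should.

```applescript
-- Hypothetical helper: counts matches for wanted in theList, once by
-- comparing the loop variable directly and once via "contents of".
on countMatches(theList, wanted)
    set plainCount to 0
    set contentsCount to 0
    repeat with x in theList
        -- x is a reference such as "item 1 of theList", not the string itself
        if x is equal to wanted then set plainCount to plainCount + 1
        if (contents of x) is equal to wanted then set contentsCount to contentsCount + 1
    end repeat
    return {plainCount, contentsCount}
end countMatches

countMatches({"http://example.com/a", "http://example.com/b"}, "http://example.com/a")
```

The first number in the result is the count from the direct comparison, the second the count after de-referencing.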

make new document with properties {URL:theURL}
delay 0.5

We tell Safari to open a new document with the given URL and pause for a moment to let Safari start loading.

repeat until ((do JavaScript "document.readyState;" in document of window 1) is equal to "complete")
    delay 0.5
end repeat
set d to document of window 1

We have to wait until the page is completely loaded before we can run the Readability script against it. Unfortunately, Safari does not expose the state of the page (loading or complete) to AppleScript. It is, however, exposed through the JavaScript DOM within the page, and we can access DOM information from AppleScript with the do JavaScript command. So we poll the document.readyState attribute in JavaScript until it reports complete. Then we keep a reference to this document in a variable.[3]
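The polling loop waits forever if a page never finishes loading. As a variation, here is a minimal sketch of the same readyState polling with a timeout. The handler name waitUntilLoaded and the 30-second limit are assumptions for illustration and are not part of the workflow above.

```applescript
-- Hypothetical helper: waits until the front Safari document reports
-- readyState "complete", or gives up after maxSeconds.
on waitUntilLoaded(maxSeconds)
    set waited to 0
    tell application "Safari"
        repeat until ((do JavaScript "document.readyState;" in document of window 1) is equal to "complete")
            if waited >= maxSeconds then return false
            delay 0.5
            set waited to waited + 0.5
        end repeat
    end tell
    return true
end waitUntilLoaded

-- usage: stop with an error instead of hanging on a page that never loads
if not waitUntilLoaded(30) then error "Page did not finish loading in time"
```

Like the loop in the action, this polls document of window 1 rather than a stored reference, because of the Safari bug with references to freshly created documents.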

Now we can execute the Readability script against the page:

do JavaScript readabilityScript in d
delay 3
repeat until ((do JavaScript "document.readyState;" in d) is equal to "complete")
    delay 1
end repeat

We use the same DOM trick to wait until Safari is done.

Now the text property of the document contains the cleaned-up text of the article. We can extract that, remove some extra lines that Readability inserts, close the Safari window and append the text as its own element to the output list.

set thetext to text of d
-- remove first three and last four paragraphs since these are Readability links
set AppleScript's text item delimiters to return
set thetext to (paragraphs 4 through -5 of thetext) as text
close d
set output to output & {thetext}
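The paragraph trimming can also be tried on its own in AppleScript Editor. The following is a minimal sketch, not the code of the action: trimParagraphs is a hypothetical handler name, and it adds two things the action does not do, namely returning an empty string when the text is too short to trim and restoring the text item delimiters afterwards.

```applescript
-- Hypothetical helper: drops the first skipFirst and the last skipLast
-- paragraphs of theText and joins the rest with returns.
on trimParagraphs(theText, skipFirst, skipLast)
    set allParagraphs to paragraphs of theText
    -- nothing would be left, so avoid an out-of-range error
    if (count of allParagraphs) <= (skipFirst + skipLast) then return ""
    set savedDelimiters to AppleScript's text item delimiters
    set AppleScript's text item delimiters to return
    set trimmed to (items (skipFirst + 1) through (-(skipLast + 1)) of allParagraphs) as text
    -- put the delimiters back so later code is not affected
    set AppleScript's text item delimiters to savedDelimiters
    return trimmed
end trimParagraphs

-- the action's "paragraphs 4 through -5" corresponds to skipping 3 and 4
trimParagraphs("one" & return & "two" & return & "three" & return & "four", 1, 1)
```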

This would be a good time to save the workflow, and do a test run. You can show the results of the workflow in Automator to see if the text is extracted properly. Readability is not perfect and does not work on all pages, but the success rate is quite high.

The remaining work of converting the text into audio is very straightforward. Add the following workflow actions:

There is a bug in Safari’s AppleScript implementation where document references to freshly created web documents go stale once the page is loaded. This also affects the “New Safari Documents” action. We will work around this bug in our AppleScript.

[3] Safari has a bug where an AppleScript reference to a document changes while the page is loading, resulting in broken references. All this is a clumsy but effective workaround.

First we have to get the unread articles from Instapaper. If you go to Instapaper, log in, and go to your unread articles, you can see the RSS button in the URL field in Safari. To get to the RSS feed in Automator, do the following:

Open Automator, create a new Workflow

Open www.instapaper.com/u in Safari and drag the link from the URL bar to the empty Automator Workflow window. Automator will create a new “Get Specified URLs” action with the Instapaper unread URL in it

Add a “Get Feeds from URLs” action next

Add a “Get Link URLs from Articles” action. Unselect the “only in the same domain” option

If you are anything like me, this workflow will open quite a large number of pages. I think Instapaper limits the RSS feed to 25 items. That’s still a lot of new Safari tabs/windows you are opening there. We want to add an action that restricts the number of items passed through it. Surprisingly, there is none in the default actions, but this is fairly easy to add. Insert a new “Run AppleScript” action before the “New Safari Documents” action and replace the default code with the following:

on run {input, parameters}
    set maxNum to 3
    -- keeps only the first maxNum items of the input, change as appropriate
    -- enter '-1' or remove this action entirely to get all urls
    if maxNum >= 0 and (count of input) > maxNum then
        set output to items 1 through maxNum of input
    else
        -- a negative maxNum, or a short enough list, passes everything through
        set output to input
    end if
    return output
end run

This will only pass through the first maxNum of items passed into it, regardless of type. You can change maxNum to fit your taste and/or needs. You can also set maxNum to -1 if you want to pass all items without removing the AppleScript action.
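To try the limiting logic outside of Automator, it can be wrapped in a handler and run in AppleScript Editor. This is a minimal sketch. The handler name limitItems is hypothetical, and returning an empty list for a maxNum of 0 is an added assumption, not something the action handles.

```applescript
-- Hypothetical helper: returns at most maxNum items of theItems.
-- A negative maxNum means "no limit".
on limitItems(theItems, maxNum)
    if maxNum < 0 or (count of theItems) <= maxNum then return theItems
    if maxNum is 0 then return {}
    return items 1 through maxNum of theItems
end limitItems

limitItems({"a", "b", "c", "d", "e"}, 3) -- first three items
limitItems({"a", "b"}, 3) -- list is short enough, unchanged
limitItems({"a", "b", "c"}, -1) -- no limit, unchanged
```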

Save again and try running it. The next step is to filter the actual text out of the web page, which is a little tougher and is the main topic of Part 2.

The following AppleScripts are extremely handy, and I use them on a daily basis. There are many others available on the forum and other places, but many of them didn’t really add much value to my workflow, which is a pretty standard one, or they solve problems I never encountered.

Update: In one of those embarrassing “You know there is a checkbox for that!?” moments, @rogueamoeba points out there is in fact a checkbox for this under Preferences “Automatically Transmit To:”

Airfoil from Rogue Amoeba is a wonderful application that allows you to stream audio from your computer to any device that will receive AirPlay audio or run the Airfoil Speakers application. This includes all iOS devices. I use it to stream audio to any room I may be in, where I just hook an iPod touch or the iPhone up to the stereo.

The one drawback is the UI, which only lets you select the devices to stream to on the Mac that is doing the streaming. So I’ll be in the kitchen, where a 1G iPod touch is permanently hooked up to some speakers, turn on the iPod touch, start the Airfoil Speakers app, then walk to the Mac in the living room, select the iPod touch in the kitchen and walk back to enjoy the music. Wouldn’t it be great if Airfoil automatically picked up the iPod when it appears in the list?

Luckily, Airfoil has AppleScript support, and it is actually very easy to use. I have named all my iOS devices to start with either “iPhone”, “iPad” or “iPod touch”, so I can make Airfoil connect to all devices that are running the Airfoil Speakers app with

tell application "Airfoil"
    connect to every speaker whose name starts with "iP"
end tell
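Before connecting anything, it can help to check which speakers the filter actually matches. This is a small sketch that only uses the speaker class and name property already used in the command above, and it assumes your devices follow the same “iP” naming.

```applescript
tell application "Airfoil"
    -- names of the speakers the "iP" filter would connect to
    get name of every speaker whose name starts with "iP"
end tell
```

If the list in the result pane contains a device you do not want, rename the device or tighten the filter.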

Now we need to keep running this command periodically in the background. I could set up a launchd plist for that, but AppleScript provides a simpler solution. Scripts that are saved as “Stay Open” applications have an idle handler that is called after a certain number of seconds. See the AppleScript Language Guide for details.

So we wrap the command in an idle handler and add some checking to see if Airfoil is running so we don’t force launch Airfoil:

property idleTime : 30 -- in seconds
on run
    idle -- call idle on launch
end run
on idle
    tell application "System Events"
        if exists application process "Airfoil" then -- check if Airfoil is running
            tell application "Airfoil"
                connect to (every speaker whose name starts with "iP" and connected is false)
            end tell
        else -- if Airfoil is not running script can quit, too
            tell me to quit
        end if
    end tell
    return idleTime
end idle

The value returned from the idle handler is the time (in seconds) until it gets called again. This will leave other speakers (that don’t start with “iP”) such as the local speakers and any Airport Express speakers unaffected.

Save this as an application and make sure to select the “Stay Open” option. Then find the application and double-click to launch it. Start and quit the Airfoil Speakers app on your iOS devices and listen to Airfoil connect automatically.

So there you are, doing your work, and of course, being the geeks that we all are, you have about three hundred windows open, give or take a few hundred. Then the iChat icon starts bouncing…

Now you might be very organized and have the iChat window in a certain spot on the screen or even on a certain space in Spaces. You might be a whiz with Exposé and immediately find the right iChat window in the myriad of windows that are open. Or you might have one or two 27″ displays and not really care about this.

But for the rest of us, wouldn’t it be nice if, say, a small window floated onto the screen with a notification, the person who sent the message and maybe even the message itself? And it would have to be unobtrusive and float away just as quickly. You could glance at the notification and decide there and then whether it is necessary to dig out that iChat window.

This window will float serenely in front of everything for a few seconds.

And with the scripting interface in iChat it is fairly simple to set up. After installing Growl, take this script:

property growlAppName : "Growl iChat"
property notificationNames : {"Buddy Became Available", ¬
    "Buddy Became Unavailable", ¬
    "Message Received", ¬
    "Completed File Transfer"}
property defaultNotificationNames : {"Buddy Became Available", ¬
    "Buddy Became Unavailable", ¬
    "Message Received", ¬
    "Completed File Transfer"}
using terms from application "iChat"
    on buddy became available theBuddy
        my registerWithGrowl()
        tell application "iChat"
            tell theBuddy
                set theTitle to full name & " became available"
                set theDesc to status message
                set theIcon to image
            end tell
        end tell
        my notify(theTitle, theDesc, theIcon, "Buddy Became Available")
    end buddy became available
    on buddy became unavailable theBuddy
        my registerWithGrowl()
        tell application "iChat"
            tell theBuddy
                set theTitle to full name & " went away"
                set theDesc to status message
                set theIcon to image
            end tell
        end tell
        my notify(theTitle, theDesc, theIcon, "Buddy Became Unavailable")
    end buddy became unavailable
    on message received theText from theBuddy for theTextChat
        my registerWithGrowl()
        tell application "iChat"
            set theIcon to image of theBuddy
            set theTitle to full name of theBuddy
        end tell
        my notify(theTitle, theText, theIcon, "Message Received")
    end message received
    on completed file transfer theTransfer
        my registerWithGrowl()
        set theTitle to "" -- stays empty unless the transfer has finished
        tell application "iChat"
            tell theTransfer
                if transfer status is finished then
                    if direction is incoming then
                        set theTitle to "Received File "
                        set theDesc to "from "
                    else
                        set theTitle to "Sent File "
                        set theDesc to "to "
                    end if
                    set theTitle to theTitle & (file as string)
                    set theDesc to theDesc & full name of buddy
                end if
            end tell
        end tell
        -- only notify for finished transfers; "" makes notify fall back to the iChat icon
        if theTitle is not "" then my notify(theTitle, theDesc, "", "Completed File Transfer")
    end completed file transfer
end using terms from
on registerWithGrowl()
    tell application "GrowlHelperApp"
        register as application growlAppName all notifications notificationNames default notifications defaultNotificationNames icon of application "iChat"
    end tell
end registerWithGrowl
on notify(theTitle, desc, icondata, notificationName)
    tell application "GrowlHelperApp"
        if icondata is "" or icondata is missing value then
            notify with name notificationName title theTitle description desc application name growlAppName icon of application "iChat"
        else
            notify with name notificationName title theTitle description desc application name growlAppName image icondata
        end if
    end tell
end notify

Copy the code into AppleScript Editor and save the file as “Growl iChat” (as a script file) in ~/Library/Scripts/iChat/.

I also set up the script to react to “Buddy Becomes Available,” “Buddy Becomes Unavailable” and “File Transfer Completed.” The great thing is that you don’t have to enable all notifications. You can even set it up individually so that you get notifications for some buddies, but not others. It should be easy to adapt the script to more actions if you want to.