tech: April 2011 Archives

What does a Thai restaurant have in common with a scholarly article? Not much, but emerging standards for machine-readable semantic content online can help both reach more people. Publishers of academic reviews can take advantage of new techniques to surface quality content. And these new technical standards can help a core value of the academy — peer review — maintain visibility.

One of my side projects is a database of nearly 300 reviews of journal articles, books and festschrifts about the Norwegian playwright Henrik Ibsen, called the Survey of Articles. This online resource is based on many years’ work by the Ibsen Society of America, an international scholarly society. Over the course of decades, leading academics have read, reviewed and critiqued articles about Ibsen in many different journals and languages. As Ibsen is one of the most ‘international’ figures in Scandinavian literature, putting these critiques online helps researchers all over the world. And though the Society had placed PDFs online previously, we felt that chunking everything up into a database would provide more granular ways of digging into the data. But if we built it, would anybody come?

It turns out that search engines such as Google are embracing technical markup known as “microformats” to help solve problems like this. Try this query as an example of microformats in action: Thai Restaurant. If you’re near a city with Thai food (and if Google can determine your position through your IP address), you’re likely to see a mix of things including a map and one or two reviews like this:

How does Google get those four stars, the number of reviews, etc., for this particular restaurant? And how does Yelp! communicate these pieces of data back to the search engine? Through machine-readable markup: a kind of secret code for web spiders.

Google supports three different kinds of this technical markup: RDFa, the hReview microformat, and Microdata. For a technical overview of these three options, see Philip Jägenstedt’s analysis here. Jägenstedt comes to the conclusion that they all have their flaws (“Microformats, you’re a class attribute kludge”) but that Microdata held the most promise for the future. Seeing as his article was written 18 months ago, I decided to implement Microdata, the newest of the triptych. (I had had some experience using hReview before, and found it cumbersome to express cleanly in my HTML.)

So now the next step was figuring out how to implement Microdata in my existing database-driven HTML. Google maintains a Tips and Tricks page for how to do this. Their example (bear with me here) is a pizza restaurant:

Here’s where some human value judgements ensued. There’s no way that a critical review of an academic work can be broken down into a “5-star” rating. Even though Google allows for other types of quantitative evaluation (such as Thumbs Up/Thumbs Down, 100-point scales, etc.), none of these systems map in any way to what the Society is trying to accomplish by putting these Critical Annotations online. So obviously I jettisoned the Star-Rating aspect of the pizza example. In addition, I only wanted to note the year an article was reviewed, rather than the specific month and day (which is pretty meaningless for an annual publication). Finally, I didn’t want to further whittle down the Annotation into a one-sentence ‘summary’, despite the existence of that feature in the Microdata standard. With these adjustments in mind, I decided to express the data I had available through the following snippet of HTML and PHP:
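
As a rough illustration of those choices (not the site's actual HTML and PHP), here is a minimal Python sketch that renders one annotation as Microdata. It assumes the data-vocabulary.org Review vocabulary and its property names (itemreviewed, reviewer, dtreviewed, description); the function name and the sample values are hypothetical.

```python
from html import escape

# Assumption: the data-vocabulary.org Review vocabulary and its property names.
REVIEW_TYPE = "http://data-vocabulary.org/Review"


def render_review(article_title, reviewer, year, annotation):
    """Return a Microdata-annotated HTML fragment for one annotation.

    Deliberately omitted: any rating and any one-sentence summary.
    The review date is reduced to a bare year.
    """
    return "\n".join([
        f'<div itemscope itemtype="{REVIEW_TYPE}">',
        f'  <h2 itemprop="itemreviewed">{escape(article_title)}</h2>',
        f'  <p>Reviewed by <span itemprop="reviewer">{escape(reviewer)}</span>,',
        f'    <span itemprop="dtreviewed">{int(year)}</span></p>',
        f'  <div itemprop="description">{escape(annotation)}</div>',
        "</div>",
    ])


if __name__ == "__main__":
    # Hypothetical sample values, for illustration only.
    print(render_review("Example Article Title", "A. Reviewer", 2011,
                        "Full text of the critical annotation."))
```

The full annotation goes into the description property untouched, so nothing in the markup forces the review into a score or a one-line summary.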

The next step was to test whether Google could parse my embedded Microdata and recognize each page as a review. For this, I used the Rich Snippets Testing Tool. This system will ingest a web page and, in real time, expose the semantic markup it finds — along with any errors. After a bit of debugging and re-arranging the nesting and position of the various elements, I got the code to do what I wanted. Here’s an example of a successfully-parsed page:
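
For a quick local check before reaching for Google's tool, something like the following Python sketch can list which itemprop values a page exposes. It is only an approximation of what a real Microdata parser does: it assumes well-nested markup, ignores itemscope boundaries and content attributes, and the names ItempropCollector, extract_itemprops and missing_props are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ItempropCollector(HTMLParser):
    """Collect the text content of every element carrying an itemprop."""

    def __init__(self):
        super().__init__()
        self.props = {}    # itemprop name -> collected text
        self._stack = []   # one entry per open element: itemprop name or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        name = dict(attrs).get("itemprop")
        self._stack.append(name)
        if name is not None:
            self.props.setdefault(name, "")

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Credit text to the innermost open element that has an itemprop.
        for name in reversed(self._stack):
            if name is not None:
                self.props[name] += data
                break


def extract_itemprops(html):
    """Return {itemprop: text} with whitespace collapsed."""
    collector = ItempropCollector()
    collector.feed(html)
    collector.close()
    return {k: " ".join(v.split()) for k, v in collector.props.items()}


def missing_props(html, required=("itemreviewed", "reviewer",
                                  "dtreviewed", "description")):
    """List the required properties that are absent or empty on the page."""
    found = extract_itemprops(html)
    return [name for name in required if not found.get(name)]
```

Running missing_props over a rendered page gives a short list of what still needs fixing; an empty list means every property the sketch looks for was found.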

Finally, and most importantly, you’ll need to submit your site to Google for review. It took my site a few weeks to go through — I don’t know the frequency with which Google looks at these, but it’s possible there’s a periodic review. You’ll know your site has made it through when search results start exposing the additional metadata your markup is indicating. For my content, the result was this:

Here’s a mapping of the information contained in the Microdata markup to the results page of a Google search:

There are a few things to point out here.

First, although the title of the article is crucial to Google’s system picking up on the review, that title is not actually exposed in the search result. Instead, the title of the page itself is displayed. This makes it important that your TITLE tags are accurate and informative. In this case, I settled on the following template:

Name of Site: Article Title from Journal Title

In this particular case, that works out to

ISA: Ibsen and the Dramaturgy of Uncertainty from Ibsen Studies

I could go further with this and include the author of the article in the title, but at a certain point you start running into concerns about the length of page titles as exposed in bookmarking interfaces and similar tools.
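
The template and the length concern can be sketched in a few lines of Python. The function name page_title and the 70-character default are assumptions made for illustration; the right limit depends on where the title will be displayed.

```python
def page_title(site, article, journal, max_length=70):
    """Build 'Site: Article Title from Journal Title'.

    If the result is longer than max_length (a hypothetical limit), the
    article title is shortened and marked with '...'.
    """
    full = f"{site}: {article} from {journal}"
    if len(full) <= max_length:
        return full
    overflow = len(full) - max_length
    keep = len(article) - overflow - 3  # leave room for the "..."
    if keep <= 0:
        return full  # nothing sensible to trim; leave the title untouched
    return f"{site}: {article[:keep].rstrip()}... from {journal}"
```

Trimming the article title rather than the site or journal name keeps both ends of the template intact, which is the part a reader scanning bookmarks is most likely to recognise.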

Secondly, although we are marking up the review itself with itemprop=”description” tags, Google doesn’t necessarily return the start of that text in its summary on the results page. The instructions on the Rich Snippets Testing Tool explain this fact: “The reason we can’t show text from your webpage is because the text depends on the query the user types.” Indeed, as the above screen shot shows, the actual returned text depends on the keyword terms that a user feeds into the query box. Still, having this itemprop is crucial to Google parsing your content as an actual review, so don’t leave it out.

Finally, even though we left out the stars or any other system of quantitative measurement, Google still identified our page as a review and returned it as one of the top results (the actual first result, at least as of this writing and in this particular test case). The name of the reviewer and the year in which the review was written were presented alongside the result, hopefully encouraging a user to click through to this critical evaluation. This, I think, is proof that a system originally designed for Thai restaurant reviews can be appropriated and used in a way that helps academic web surfers find quality content. As the web gets more crowded with more and more data, it’s incumbent on people in the academy to ensure their content is marked up in ways that help people looking for serious research.

Support for Thunderbolt, Apple/Intel’s new high-speed bus, has been slow in coming. Though all new MacBook Pros ship with the new port, neither Apple nor any third parties have shipped any peripherals that take advantage of the technology.

The first evidence that this could change showed up on the show floor at the National Association of Broadcasters last week. LaCie, which has an existing line of USB2/3 and FireWire 400/800 hard drives, was showing off a four-drive setup configured in a RAID, with each external drive connected serially with Thunderbolt cables:

You can see the new connection in heavy use on the back of the setup — the cables physically resemble MiniDisplayPort, but are capable of extremely high data throughput:

Here’s the back of one of these new drives. Note that there are no labels yet for the ports, but they will presumably have the lightning icon which Apple is using for the technology:

Also on display was BlackMagic’s Ultrastudio SDI and HDMI to Thunderbolt converter, which will get high-definition video formats into and out of the new standard:

Turns out the drive was worth it: Here’s Randy Ubillos, the architect of Final Cut Pro (as well as the new iMovie for iPad) demonstrating a beta version of Final Cut Pro at the National Association of Broadcasters convention in Las Vegas:

The video below may seem odd — a standing ovation for the announced price of $300 — but the entire Final Cut Pro Studio package used to cost a thousand dollars more, so these video editors are cheering a dramatic price drop.

2011 will be 12 years since I first attended NAB, and I thought I’d go through my photo archives to see what I had from visiting Apple’s booth on the show floor over the years. Although Apple stopped having a large floor presence at the convention in 2008 (and won’t have one in 2011, no matter what they announce tomorrow), they’ve still been a major player in a multi-way tug-of-war between Adobe, Avid, Discreet and others through the years. We’ll start in…

1992

…when MacWeek Magazine covered the show in its April 20th issue:

Apple […] had a booth at the show; the company is aiming at the estimated 300,000 U.S. video-production professionals, according to Kirk Shorte, Apple marketing manager for integrated media. “If Apple can win over the top video professionals to the Mac, there’s a multiplier effect of millions of people who produce scripts, budgets, animations and graphic images for video. Then there’s a trickle-down effect to an even bigger market of corporate video users and consumers who want to make home movies on their VCRs,” Shorte said.

I’m still working on getting print back issues of MacWeek from other years in the 1990s, so we’ll skip forward to…

1999

The first year after Apple had acquired video editing software called “Final Cut” from Macromedia. This NAB saw a large Apple booth on the show floor, with many blue-and-white G3s demonstrating the newly-renamed Final Cut Pro. Note the dual CRT monitors in use here: flat-panels were still a way off as affordable, color-accurate output devices. DV and FireWire/1394 were everywhere. As the core of FCP, they formed a new standard that everyone was trying to figure out.

2000

I was in New York, but PowerBooks made their debut as “Portable Movie Studios” featuring FireWire for connection to DV cameras. Blue-and-white G3s have turned into the first generation of silver G4s, and Cinema Displays have displaced CRTs. An Italian-language site has three good pictures online, including this one:

3DRender.com also has three good quality pictures of Apple’s booth from 2000. From examining them, it’s interesting to note that iMovie and the new iMac DV were featured in the booth as a hardware/software pairing for consumers. The previous year, in 1999, only the Pro-level blue-and-white G3 had DV ports as a standard feature.

2001

I can only find four photos from this year, and none of them are from inside the Convention Center, much less Apple’s booth. QuickTime 5 was released, and Apple announced that many (still) digital camera manufacturers were supporting QuickTime natively — presumably, for saving small movies onto flash memory.

2002

Sorry for the blurry shot, but it’s the one I have that shows the most of the booth. Silver G4s replace the blue-and-white G3s of 1999, and Apple shows off an alpha of their recently-acquired Shake (it would finally ship in July). Here it is being demo’d with some footage from Lord of the Rings on the show floor:

Another shot of Apple’s demonstration area, highlighting the promotion of FCP3’s real-time color correction. Please note the Apple rep pictured here is a really nice guy and is in no way responsible for the movie Scorpion King:

2003

Phil Schiller actually gave the NAB keynote this year, introducing FCP4, DVDSP2, and Shake 3. Schiller shared the stage with Paul Saccone, and Compressor made its debut as well. All seven parts of the keynote are on YouTube, starting with this one:

Judging from the pictures I took, this NAB show stood out most for the appearance of XRaid, Apple’s disk array technology:

2007

I was in Copenhagen. It was a busy show for Apple: they introduced Final Cut Server and Final Cut Studio 2 (containing FCP 6), together with a new addition, Color. Underlying it all was a new editing codec, ProRes 422. You can review Engadget’s live coverage, and here’s Apple’s Demo Reel:

MacBidouille has a great gallery of the keynote, and AppleInsider covers the show floor in pictures.

2008

I was in Sweden, and Apple didn’t have a booth. This was the first time they had skipped a large presence on the show floor, which would continue through 2011.

2009

Again, “Apple did not have a booth at NAB 2009,” claims Wikipedia, “however the product was well represented on the show floor in various booths. The RED Camera team relied heavily on FCP during development.” My pictures from the show are here.

2010

Rumors that Apple would return to the show floor with an updated version of Final Cut proved unfounded, and I wasn’t there to snap pictures of the companies that did exhibit.