Google’s API: For Fun, Not Profit (Yet)

The Google API is a fun way to put Google to work on tasks ranging from solving crossword puzzles to creating recipes, but it’s not yet ready for prime-time applications.

A special report from the Search Engine Strategies 2003 Conference, August 18-21, San Jose, CA.

The Google API is a way for all sorts of programs to send queries to Google and get back results as data, rather than the web pages we’re familiar with when we do a Google search. Nelson Minar, the Google engineer who designed the API, and Rael Dornfest, co-author of O’Reilly’s “Google Hacks” book, talked a little about what it is, and what people can do with it in a panel called “Up Close with the Google API.”

An API is an “Application Programming Interface.” Simply put, it’s a way to have one program talk to another program. For example, most systems have ways for applications to tell a browser to open a specific web page. The browser has an API that defines what commands it will accept and in what format.

Most webwide search engines have developed APIs. Using APIs, portals can get search results or ads to display on their local pages. In fact, Google had an informal XML search results option before it officially released its API: programs could send a browser-like request in the form of a URL, and Google would send back XML results.

But that was not meant for large-scale use. Neither was the hack (workaround) called “screen scraping,” in which a program sends a URL request, gets back HTML, and then programmatically takes apart the HTML to locate the pieces of information it contains. In Google or any other search engine, these would be the parts of each search result item: title, URL, text snippet, date, size, etc. Search engines hate that: the queries are mechanical rather than human, so there’s never a click on any result, and they essentially waste the search engine’s resources.
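To make the idea of screen scraping concrete, here is a minimal sketch in Python. The HTML sample, its class names and the `ResultScraper` helper are all invented for illustration; this is not the markup of Google or any other real search engine, and real pages are far messier, which is part of why the technique is so brittle.

```python
from html.parser import HTMLParser

# Hypothetical result-page markup, invented for illustration only;
# it is not the HTML of any real search engine.
SAMPLE_HTML = """
<div class="result"><a href="http://example.com/a">First title</a>
<span class="snippet">First snippet text.</span></div>
<div class="result"><a href="http://example.com/b">Second title</a>
<span class="snippet">Second snippet text.</span></div>
"""


class ResultScraper(HTMLParser):
    """Collects url/title/snippet records from the sample markup above."""

    def __init__(self):
        super().__init__()
        self.results = []    # one dict per result item found so far
        self._field = None   # field that the current text belongs to, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            # In this sample, every link starts a new result record.
            self.results.append({"url": attrs["href"], "title": "", "snippet": ""})
            self._field = "title"
        elif tag == "span" and attrs.get("class") == "snippet" and self.results:
            self._field = "snippet"

    def handle_endtag(self, tag):
        if tag in ("a", "span"):
            self._field = None

    def handle_data(self, data):
        # Text outside a link or snippet span is ignored.
        if self._field and self.results:
            self.results[-1][self._field] += data


def scrape(html):
    """Return the list of result records found in an HTML string."""
    parser = ResultScraper()
    parser.feed(html)
    parser.close()
    return parser.results
```

Calling `scrape(SAMPLE_HTML)` yields one record per result item. Any change to the page layout would silently break a parser like this, which is one more reason a documented API is the better route.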

Minar talked a little about the goals of the Google API project. When Google decided to let people have programmatic access to the search engine in a more controlled way, they did two special things. One was to use hot new technologies called “WSDL” (Web Services Description Language) and “SOAP,” which are designed for application-to-application communication over the Web. The other was to make it very easy for researchers, independent programmers and companies to use this API for creative and interesting projects.

The API is still in beta, non-commercial, free and limited. It provides access to just three of the Google services: search results, cached pages and spelling suggestions. And, so that Google is not overwhelmed by requests, each developer is limited to 1,000 queries per day. Note that there is not yet API access to the Google Search Appliance (enterprise search engine).
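A program that makes many requests may want to keep its own count so it stops before hitting the 1,000-query ceiling. The sketch below is one way to do that on the client side. The `DailyQuota` class is hypothetical, and it assumes the count resets with the local calendar date; how Google actually counts a “day” is not described here.

```python
from datetime import date


class DailyQuota:
    """Client-side counter that refuses calls once the daily limit is used up."""

    def __init__(self, limit=1000):
        self.limit = limit   # queries allowed per day (1,000 per the article)
        self._day = None     # date the current count applies to
        self._used = 0       # queries counted so far on that date

    def try_consume(self, today=None):
        """Count one query; return False if today's allowance is exhausted."""
        today = today or date.today()
        if today != self._day:
            # Assumption: the allowance resets when the calendar date changes.
            self._day = today
            self._used = 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True
```

A program would call `try_consume()` before each API request and skip or postpone the request when it returns `False`.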

It’s easy to get access to the developer kit, posted online at http://www.google.com/apis. The developer kit includes the WSDL file and a lot of documentation and sample code (mainly Java, .NET, and Perl), and people have written samples for pretty much every known computer language, from C++ to AppleScript. Each developer signs up and gets their own key, which must be sent with every API request.

Using the API is also simple. The program sets up a request by creating an XML message, sends it and gets the reply. For a spelling request, the message carries just the phrase to check, and the result is a little package including the spelling suggestion. Similarly, a search request has some parameters (like whether to use SafeSearch). When you send it, you get a batch of search results in a tidy format.
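As a rough illustration of the “create an XML message” step, the following Python sketch builds a SOAP envelope for a spelling request. The service namespace `urn:GoogleSearch`, the operation name `doSpellingSuggestion` and the parameter names `key` and `phrase` are assumptions, not details given in this report; the WSDL file in the developer kit is the authority on the real names and types. The sketch only builds the message and does not send it anywhere.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Assumed service namespace; verify against the WSDL in the developer kit.
SERVICE_NS = "urn:GoogleSearch"


def build_spelling_request(key, phrase):
    """Return a SOAP envelope (as a string) asking for a spelling suggestion.

    The operation and parameter names are assumptions for illustration.
    A real client generated from the WSDL may also add type attributes
    that this sketch omits.
    """
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}doSpellingSuggestion")
    ET.SubElement(call, "key").text = key        # the developer's own key
    ET.SubElement(call, "phrase").text = phrase  # the phrase to check
    return ET.tostring(envelope, encoding="unicode")
```

In practice most developers would not build the envelope by hand. A SOAP toolkit can read the WSDL file and generate these messages, which is much of the appeal of the WSDL approach.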

For example, when you submit a search API request for Google Hacks author Rael Dornfest, you get some header information, and then a list with elements that look like this:

The content should look familiar. It consists of the pieces of each Google search result item, in what’s known as a name-value pair format. Unfortunately, the output is not in XML format, which is a shame. The program that sent that request can now use those tidbits for all sorts of interesting purposes.

Minar also gave some examples of interesting uses of this data, including Mockery Bird, connecting Amazon book reviews and web commentary, people doing data mining on complex topics such as the Web ecosystem, and finding solutions to crossword puzzle clues. Minar encourages anyone doing something cool and different with the API to send email via the API support address.

Author Dornfest talked about the fun he had working on Google Hacks, noting that the sheer size of the Google database is interesting to play with. He showed how using an undocumented date function could take you “back in time” displaying only page information that hasn’t changed in several years. The Weblog Bookwatch by Paul Marsh combs weblogs for mentions of Amazon URLs. AvaQuest’s GoogleMovies gets the overall “temperature” of posts about new movies.

These are all fun and non-profit uses. Google is still working through the process of making the API into a commercial service. Obviously, they’ve already talked to a lot of SEO and SEM companies, but have not come to any public decision on how to handle this issue. Minar encouraged people to contact Google if they are interested in business arrangements, and Dornfest encouraged people to develop additional innovative tools.

Overture Search Leader Moves to MSN

Paul Ryan, the former Chief Technology Officer of Overture, has been hired as the General Manager of MSN Monetization.

According to MSN spokesperson Malina Bragg, “Paul joins the management team as we continue to invest and build our search and ad sales platforms. Further, he brings a tremendous amount of search industry experience to MSN, and will help our efforts to move MSN monetization into the future.”

Ryan was a key player at Overture, and his move to MSN is likely a significant loss to Yahoo, Overture’s new owner. We’ll have more coverage as the battle between MSN and Yahoo/Overture inevitably ramps up over the course of the coming year.
