Using Google Search AJAX API in Python

This article hopefully teaches you how to use Google Search AJAX API in Python.

In this article, I will attempt to teach you how to use the
"Google AJAX Search API". Let's break this down and look at each part.

1. Google -> Well, it's obvious... duh

2. AJAX -> It stands for "Asynchronous JavaScript and XML". It is not a programming
language but a technique that lets a web page exchange data with a server in the
background. It is used to create better, faster and more interactive web applications.

3. Search -> It means that the user "queries" for something and gets the result.
Much like Google Search.

4. API -> It stands for "Application Programming Interface". It specifies how
some software components must interact with each other. In this case, we can say
that it is a set of programming instructions for accessing a Web-Tool (Web-based
Software).

So what we want to do is write a program to interact with the Google AJAX Search API.
That way we can get the results of a Google search by using a program. This will
be helpful in Timed 6 (I think; I haven't completed that yet).

As the title suggests, I will be explaining a Python program to do this. One can
obviously use other programming languages too, but that's for some other article.

Before you continue from here, it is essential that you know basic Python syntax
and certain modules (urllib, urllib2). For anything that appears in the article but
is not explained here, I suggest reading up on it side by side for easier understanding.

SO LET'S GET CODING !! :D

The first line is normally the import line. I know that I omitted the
"#!/usr/bin/python" line, but that is understood. The import line is the line
where we import the modules we are about to use, which in this case is:

import urllib, simplejson

I will explain the module simplejson as you continue through the article. For now,
you can say that it is used as an interpreter between Python and JavaScript Object
Notation (JSON). I would suggest going through this:
http://pymotw.com/2/json/

Now, we need something to search for. So we can ask the user to input a query.

query = raw_input("Please enter your query: ")

Now comes something that is probably new. We have the query, or the thing we want to
search for. But before sending it to the site we have to convert it into something
the search engine will understand. This is called URL encoding.
It's a pretty simple concept: characters that are not safe in a URL, such as spaces,
are replaced with escape sequences. You can google it if you want.
To do this in Python we use a function in the urllib module called "urlencode".
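
The line that performs this step appears to be missing from the text at this point. A minimal sketch of what it might look like, assuming the endpoint quoted in the next paragraph. The try/except import is an addition so that the snippet runs on both Python 2 (urllib.urlencode, as used in this article) and Python 3 (urllib.parse.urlencode), and the hard-coded query is only a stand-in for whatever the user typed:

```python
try:
    from urllib import urlencode        # Python 2, the version this article targets
except ImportError:
    from urllib.parse import urlencode  # Python 3 moved the function to urllib.parse

# Stand-in for the text the user typed at the raw_input prompt (illustrative value).
query = "hack this site"

# urlencode takes a dict and returns "key=value" text that is safe to put in a URL;
# spaces become '+', and other special characters become %XX escapes.
encoded_query = urlencode({'q': query})

# Address of the web search endpoint with the API version set, plus the encoded query.
url = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&" + encoded_query

print(url)
```

Run as written, this prints http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=hack+this+site.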

The URL-building line stores the url we want to open to search for the user-defined
query. What it actually does is set the variable url to
"http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=url_encoded_query"
Here we can see that there are two variables in the url, i.e. 'v' and 'q'.
'v' holds the version number of the search API, which at the time of writing has
only one value, '1.0'. 'q' holds the user-defined query. There are a few more variables,
which I will cover at the end.
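
To see the two variables for yourself, you can take a URL of this shape apart again. The sketch below is only an illustration and not part of the search program; the example query is made up, and the try/except import is there so it runs on Python 2 (urlparse module) and Python 3 (urllib.parse):

```python
try:
    from urlparse import urlparse, parse_qs      # Python 2
except ImportError:
    from urllib.parse import urlparse, parse_qs  # Python 3

# Example URL in the format described above (the query itself is made up).
url = "http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=python+tutorial"

# Everything after the '?' is the query string; parse_qs decodes it into a dict
# that maps each variable name to a list of its values.
params = parse_qs(urlparse(url).query)

print(params['v'])  # the API version
print(params['q'])  # the search query, with '+' decoded back to a space
```

Note that each value comes back as a list, because a URL is allowed to repeat the same variable more than once.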

Now what we want to do is to make the Python program open the url so that it will
return the search results. To do this we will make use of the 'urlopen' function in
the urllib module.

search_results = urllib.urlopen(url)

Now the contents of the results page are stored in search_results. But they are not
stored in a directly user-readable format; they are in JSON format. To extract
information from the results page we will make use of the simplejson module, which
can parse the information stored in the JSON format. So what we are going to do now
is load the information into another variable.
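
The loading line itself appears to be missing from the text here. A minimal sketch of the step, under two assumptions that you should check against a real response: that the JSON text comes from search_results.read(), and that the hits sit under the keys 'responseData' and then 'results'. The sample text is hand-written so the snippet runs without a network connection, and the fallback to the standard json module is an addition for machines without simplejson:

```python
try:
    import simplejson as json  # the third-party module used in this article
except ImportError:
    import json                # standard-library module with the same loads() function

# Hand-written stand-in for search_results.read(); the titles and URLs are made up
# and only two of the keys of each result are shown.
raw = '''
{"responseData": {"results": [
    {"title": "First <b>example</b>", "url": "http://example.com/1"},
    {"title": "Second <b>example</b>", "url": "http://example.com/2"}
]},
 "responseStatus": 200}
'''

# loads() turns the JSON text into ordinary Python dictionaries and lists.
response = json.loads(raw)

# Assumed layout: the list of hits is stored under responseData -> results.
results = response['responseData']['results']

print(len(results))
print(results[0]['url'])
```

With live data, raw would be replaced by the text read from search_results.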

Now the search results are stored in the variable results. I am not really sure how to
put this, but the results variable is a LIST of DICTIONARIES. Meaning results[0] is a
dictionary, results[1] is a dictionary and so on. One thing to remember is that this
ONLY returns the first four search results.

Now it's obvious: all we have left to do is print those results on screen. Each element
of the variable 'results' is a dictionary with the following keys.
1. GsearchResultClass
2. visibleUrl
3. titleNoFormatting
4. title
5. url
6. cacheUrl
7. unescapedUrl
8. content
We will be using only the keys 'title' and 'url' and printing those to the screen, but
you can print out any of the above.
To print the info on the screen:

for item in results:
    print item['title'] + ": " + item['url']
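
If you want to try the printing step without a live search, here is a small sketch with a made-up results list. It uses the 'titleNoFormatting' key from the list above on the assumption that 'title' may contain HTML tags such as <b>. The helper format_result is hypothetical and not part of the API, and print is called with a single string so the snippet runs on both Python 2 and Python 3:

```python
# Made-up stand-in for the 'results' list described above.
results = [
    {'title': 'First <b>example</b>', 'titleNoFormatting': 'First example',
     'url': 'http://example.com/1'},
    {'title': 'Second <b>example</b>', 'titleNoFormatting': 'Second example',
     'url': 'http://example.com/2'},
]

def format_result(item):
    """Return one result as 'title: url' (hypothetical helper, not part of the API)."""
    return item['titleNoFormatting'] + ": " + item['url']

# Print every result on its own line.
for item in results:
    print(format_result(item))
```

Keeping the formatting in its own function makes it easy to switch to other keys, such as 'content' or 'visibleUrl', later on.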

And that's it. You have just used a Python program to do a Google search. Now, as I
said before, this only returns 4 search results, which means the maximum index of
results is 3 (because computer counting starts from 0 (zero), viz. 0, 1, 2, 3).
To increase the number of results we will have to make a few changes to the url
variable. Now, one thing you have to realise is that the API will return only 4 results
at a time. I had read about a method for the API to return 8 results but that did not
work for me. So if the API returns only 4 results at a time, we can first ask for 4
results and then ask for another 4 and so on.
The change to the url (NOT the variable... the actual URL) is :