Use VBS to Search for Oracle Books using Google’s Book Library

13122009

December 12, 2009

Below is a somewhat complicated VBS script that interacts with Internet Explorer to submit a query to books.google.com. Once Google prepares the web page, the VBS script parses the raw HTML code in the web page to generate a new web page containing the list of matching books supplied by Google. Selecting a book and then clicking the View button opens that book on the Google books site.

The search keyword is specified on the objIESource.Navigate line at the end of the website address. A plus sign should appear between each search keyword. To exclude a particular word from the search, prefix the word with a minus sign, for example to search for the keywords Oracle and SQL, but exclude the word dummy:

Dim objIE 'For the data entry form where we present the harvested book list
Dim objShell 'To add a delay, may throw an error when executed in a Visual macro
Dim strHTML 'Holds what we are displaying on our data entry form
Dim intFlag 'Indicates if the user clicked OK or closed the browser window
Dim i 'For our counter
Dim objIESource 'Holds the data source web page
Dim strHTMLSource 'Holds the HTML contents of the source web page
Dim intStart 'The starting position of title=" in the source web page
Dim intEnd 'The position of the next " after title=" in the source web page
Dim intApproxStart 'The book titles appear after the first entry of coverthumb, so we will start there
Dim strBook(1000) 'Holds the book titles found on this page
Dim intBookCount 'Counter for the number of books found
Dim strFind 'What to find in the HTML code which indicates that the data of interest will follow
Dim intFindLen 'The length of the what to find string
Dim intLinkStart 'The starting position of title=" in the source web page
Dim intLinkEnd 'The position of the next " after title=" in the source web page
Dim strBookLink(1000)'Holds the link to the book found on this page
Dim strLinkFind 'What to find in the HTML code which indicates that the page link will follow
Dim intLinkFindLen 'The length of the what to find link string
Dim strSelectedBook 'The name of the selected book
Dim strBookAddress 'The web address of the book
Dim adsFile 'Used if we want to write the downloaded web page to the hard drive
On Error Resume Next
'Set objShell = CreateObject("WScript.Shell")
Set objIESource = CreateObject("InternetExplorer.Application")
objIESource.Navigate "http://books.google.com/books?um=1&q=oracle+tuning"
Set objIE = CreateObject("InternetExplorer.Application")
objIE.Navigate "about:blank"
For i = 1 to 10000
'Give it some time to prepare the objIESource
Next
Do While objIESource.Busy <> False
'objShell.Sleep 500 'Edit: This line was supposed to be replaced with the following line Dec 13, 2009
Wscript.Sleep 200
Loop
intBookCount = 0
strHTMLSource = cStr(objIESource.Document.Body.InnerHTML) 'Retrieve the raw HTML code from the web page
'Uncomment to save the web page to the hard drive
'Set adsFile = CreateObject("ADODB.Stream")
'adsFile.Type = 2
'adsFile.Charset = "iso-8859-1"
'adsFile.Open
'adsFile.WriteText(strHTMLSource)
'adsFile.SaveToFile "c:\InnerHTML.txt", 2
'adsFile.close
'Set adsFile = Nothing
'In the same HTML page, the first book title starts after the first entry of:
intApproxStart = InStr(strHTMLSource, "coverthumb") + 1
strFind = "title=" & Chr(34)
intFindLen = Len(strFind)
strLinkFind = "<A href=" & Chr(34)
intLinkFindLen = Len(strLinkFind)
'Find the start of the first book title
'Might be listed like this in the HTML code: title="Excel 2007 VBA Programming For Dummies"
intStart = InStr(intApproxStart, strHTMLSource, strFind)
Do While intStart > 0
'Find the end of the book title
intEnd = InStr(intStart + intFindLen, strHTMLSource, Chr(34))
If intEnd > 0 Then
intBookCount = intBookCount + 1
strBook(intBookCount) = Mid(strHTMLSource, intStart + intFindLen, intEnd - (intStart + intFindLen))
'Find the link to the book title
intLinkStart = InStr(intEnd, strHTMLSource, strLinkFind)
If intLinkStart > 0 Then
intLinkEnd = InStr(intLinkStart + intLinkFindLen, strHTMLSource, Chr(34))
If intLinkEnd > 0 Then
strBookLink(intBookCount) = Mid(strHTMLSource, intLinkStart + intLinkFindLen, intLinkEnd - (intLinkStart + intLinkFindLen))
Else
strBookLink(intBookCount) = ""
End If
Else
strBookLink(intBookCount) = ""
End If
'Find the start of the next book title
intStart = InStr(intEnd, strHTMLSource, strFind)
Else
Exit Do
End If
Loop
'Edit: place a single quote in front of the following two lines to prevent the list of books from disappearing, Dec 13, 2009
objIESource.Quit
Set objIESource = Nothing
strHTML = strHTML & "<form name=""BookFind"">" & vbCrLf
strHTML = strHTML & "<input type=hidden id=""txtOK"" value="" "">" & vbCrLf
strHTML = strHTML & "Book:<br /> <select size=""23"" id=""lstBooks"" style=""width:450"">" & vbCrLf
For i = 1 To intBookCount
strHTML = strHTML & "<option value=" & Chr(34) & strBook(i) & Chr(34) & ">" & strBook(i) & "</option>" & vbCrLf
Next
strHTML = strHTML & "</select>" & vbCrLf
strHTML = strHTML & "<p><center><input type=button value=""View"" id=""cmdOK"" onclick=""document.getElementById('txtOK').value='OK';""></center>" & vbCrLf
strHTML = strHTML & "</form>" & vbCrLf
objIE.Document.Body.InnerHTML = strHTML
objIE.Document.Title = "Select Book Title from Google Books"
objIE.Left = 0
objIE.Top = 0
objIE.Width = 500
objIE.Height = 520
objIE.Statusbar = False
objIE.Menubar = False
objIE.Toolbar = False
objIE.Visible = True
For i = 1 to 10000
'Give it some time to prepare the objIE
Next
Do While objIE.Busy <> False
Wscript.Sleep 200
Loop
intFlag = 0
'loop until the button is clicked
Do While intFlag = 0
If Err <> 0 Then
intFlag = -1
End If
If objIE Is Nothing Then
'User closed ID
intFlag = -1
Else
If objIE.Document.All.txtOK.Value <> " " Then
intFlag = 1
End If
End If
Wscript.Sleep 250
Loop
If intFlag = 1 Then
'Copy in the values from the web page
strSelectedBook = objIE.Document.Body.All.lstBooks.Value
'Try to find the associated link to the book
For i = 1 to intBookCount
If strBook(i) = strSelectedBook Then
'Found the book
strBookAddress = strBookLink(i)
'Extra credit - display the link associated with the selected book
'Comment out the following objIE lines and uncomment the objIE.Quit
' if the link associated with the book should not be displayed
objIE.Navigate "about:blank"
objIE.Width = 800
objIE.Height = 600
objIE.Statusbar = True
objIE.Menubar = True
objIE.Toolbar = True
objIE.Navigate strBookLink(i)
Exit For
End If
Next
'objIE.Quit
End If
Set objIE = Nothing
Set objShell = Nothing

Like this:

Related

Actions

Information

2 responses

6112010

Jeremy(14:53:13) :

I tried this out. Nice! the first time i tried it though, before hitting the view button, i grabbed the sides of the window and opened it up a little bit and the view button wouldn’t work. i closed the window and tried again, this time without adjusting the window size and the view button opened up the webpage for the book. Other than that, very nice.

I almost forgot about this VBS code – I think that it is the only code example on my blog that does not require a connected Oracle database. If you want to experiment with the script a little, you could add a text box to allow a person to enter their search keywords, rather than having those keywords hardcoded into the source code. Some of the other VBS code examples show how to add a text box to a web page and read the contents of the text box. Of course, this code could be adapted for other purposes also.

I am not sure why the code did not work the first time for you – I am not sure if I had similar problems or not. Thanks for stopping by and leaving the compliment.

Hints for Posting Code Sections in Comments

********************
When the spacing of text in a comment section is important for readability (execution plans, PL/SQL blocks, SQL, SQL*Plus output, etc.) please use a <pre> tag before the code section and a </pre> tag after the code section:

<pre>

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;
TODAY
---------
01-MAR-12

</pre>
********************
When posting test case samples, it is much easier for people to reproduce the test case when the SQL*Plus line prefixes are not included - if possible, please remove those line prefixes. This:

SELECT
SYSDATE TODAY
FROM
DUAL;

Is easier to execute in a test case script than this:

SQL> SELECT
2 SYSDATE TODAY
3 FROM
4 DUAL;

********************
Greater than and Less than signs in code sections are often interpretted as HTML formatting commands. Please replace these characters in the code sections with the HTML equivalents for these characters: