Building Wireless Web Clients, Part 1: Pitfalls of MIDP HTTP

Wireless Java applications are, by their nature, network-centric. The devices that these applications run on, however, are less predictable. Most notably, the precise nature of the network connection depends both on the device and on the services provided by the network to which it is connected. Some wireless devices may be directly connected to the Internet, while others can reach it only through a gateway. Whatever the nature of the underlying network, a wireless Java device that conforms to the Mobile Information Device Profile (MIDP) specification is required to provide the illusion that it is directly connected to the Internet by implementing the HTTP support that is part of the MIDP Generic Connection Framework API. A description of this framework can be found in the article "Invoking JavaServer Pages from MIDlets" by Qusay Mahmoud. The device is not required to include a TCP/IP protocol stack to carry the HTTP messages; it is permitted to use the network's existing protocols as the bearer, as long as it preserves the behavior that an HTTP-based application would expect.

Because a TCP/IP stack is not required, access to lower-level programming interfaces, such as sockets, is not guaranteed to be available to a MIDP application, even though the Generic Connection Framework provides an interface to such low-level services and the next version of the MIDP specification is likely to require their inclusion in all devices. For the time being, then, wireless Java applications will have to use HTTP to communicate with the outside world. However, several features of the wireless Java environment make it slightly more difficult to write a MIDP HTTP client than would be the case with J2SE. This article highlights some of the pitfalls that are unique to the J2ME environment, using an example taken from Chapter 6 of my recently published book, J2ME in a Nutshell.

The Bookstore Web Client

The Web client used in this article is, in principle, very simple. Given the ISBN for a book, we want to connect to Amazon's online bookstore, retrieve the book's details, and display its title, sales ranking, and the number of customer reviews. In the second part of this article, we'll get a little smarter and store these details, along with the book's title, so that we can go back to the site later and get updated details without having to remember the ISBN. For now, our problem will be how to get the information that we need and how to display it to the user.

Before proceeding with the technical details, let's take a look at the completed client in action. To build and run this example, you'll need the source code, which is available in this .zip file, and a suitable development environment. The example source code is packaged for the J2ME Wireless Toolkit, which can be downloaded for free and includes emulators for several types of mobile devices. This article assumes that you are using this toolkit. The source code can, however, be used with other development tools, such as Forte for Java.

Having downloaded the source code, unpack it below the J2ME Wireless Toolkit's apps directory, start the toolkit's KToolbar application, press the Open Project button, and select the project J2MEHttp. If this project does not appear, make sure that you have a directory called J2MEHttp below the apps directory of the J2ME Wireless Toolkit installation. If you don't, then you have not unpacked the example code to the correct location.

With the project open, press Build to compile the source code, select an emulated device, and press Run to start the emulator. When the emulator starts, it will offer the choice of two MIDlets to run. In this case, choose RankingMidlet; the other one, PersistentRankingMidlet, is the subject of the second article in this series.

Figure 1: The Bookstore Web Client

The RankingMidlet MIDlet presents a form where you can input an ISBN, as shown on the left side of Figure 1. Supply the ISBN of your favorite book and select OK. After a short while (assuming the ISBN is valid), you'll see the title, sales ranking, and the number of reader reviews for your chosen book, as shown on the right side of Figure 1.

So how does this client work? Apart from the details of the user interface, which we're not going to cover in this article, the client does three things:

1. Connects to the Amazon Web server and requests the store page for the chosen book.

2. Reads the HTML page that the server returns.

3. Interprets the HTML page to extract the details that it needs.

This all sounds simple enough, but there are a few pitfalls waiting to trap the unwary along the way. Let's see what can go wrong by examining each of these three steps in more detail.

Connecting to the Server

The first problem we encounter is how to get the correct HTML page for a book, given its ISBN. It isn't difficult to work this out -- just point your browser at Amazon's Web site, enter an ISBN in the search box on the home page and look at the URL of the page that is returned. If you do this, you'll find that the browser ends up loading a URL that looks something like "http://www.amazon.com/exec/obidos/ASIN/156592455X/102-0259985-4227363", for a book with ISBN 156592455X.

In fact, everything after the ISBN in this URL is concerned with tracking your user session with Amazon, and does not have to be supplied on initial contact. Therefore, to get the details for this book, we only have to make an HTTP GET request with the URL
http://www.amazon.com/exec/obidos/ASIN/156592455X.

This isn't how the browser got the page, however; when you entered an ISBN on the home page, this URL was not constructed directly. Instead, a query was created and sent to the server, which allowed it to return the correct page. The fact that Amazon also recognizes a more explicit URL that gives the same result is useful for this client, but you might not be so lucky if you had the task of creating a client for a less cooperative server. To demonstrate how to handle the more general case, we'll show you how the browser actually fetched the correct page and show you the Java code that achieves the same result.

The search feature on the Amazon home page is implemented using HTML form tags. When it comes to the nitty-gritty detail, the form causes an HTTP POST request to be sent to the URL
http://www.amazon.com/exec/obidos/search-handle-form/0
in which the body of the message contains the query itself, in the form:

index=books&field-keywords=isbn

This query causes a search of all books on the site (as distinct from software, electronics, etc.) for the given ISBN. Recreating this in Java is quite straightforward -- we simply use the Connector class from the Generic Connection Framework to open a connection to the URL shown earlier, set the request method to POST, open an output stream, and write the query to it. (If you're not familiar with the basics of using HTTP with MIDP, or with the Generic Connection Framework, you should first read Qusay Mahmoud's article, which covers the necessary groundwork.)
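As a small illustration of the request just described, the helper below assembles the two URLs and the form body for a given ISBN. The class and method names (AmazonQuery, formBody, directUrl) are hypothetical and are not taken from the example source code; only the URLs and the query format come from the text above. The sketch also assumes that the ISBN needs no URL encoding, which holds for the digits and trailing X of a ten-character ISBN.

```java
/** Hypothetical helper: builds the request pieces described in the text. */
final class AmazonQuery {
    /** URL that the home-page search form posts to (quoted in the text). */
    static final String SEARCH_URL =
        "http://www.amazon.com/exec/obidos/search-handle-form/0";

    /** Prefix of the explicit per-book URL (quoted in the text). */
    static final String DIRECT_PREFIX =
        "http://www.amazon.com/exec/obidos/ASIN/";

    private AmazonQuery() { }

    /** Body of the POST request: search all books for the given ISBN. */
    static String formBody(String isbn) {
        return "index=books&field-keywords=" + isbn.trim();
    }

    /** Explicit URL that can be fetched with a plain GET request. */
    static String directUrl(String isbn) {
        return DIRECT_PREFIX + isbn.trim();
    }
}
```

In a MIDlet, the bytes of formBody(isbn) would be written to the stream returned by HttpConnection.openOutputStream() after calling setRequestMethod(HttpConnection.POST) on a connection opened to SEARCH_URL.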

Using this reasoning, our first attempt at emulating the browser might look like this:

This code extract shows a class called Fetcher that has a method called fetch(), which requests information about a book whose ISBN is in an object of type BookInfo (which will be shown in the second article in this series), using the algorithm that was just described. The expectation is that, once the request has been sent, the HTML page for the book will be accessible from the input stream obtained from the HttpConnection's openInputStream() method. If this were a J2SE program and we had used a URL object to get a URLConnection and then made the same request as the one shown here, we would indeed get the HTML page from the input stream. Unfortunately, in the J2ME world, things are a little different.

We deliberately made this example more difficult by using a POST request instead of GET, in order to show you how different the MIDP HTTP implementation is from its J2SE counterpart when the server does not directly provide the data that you require. Instead of replying with a response code of 200 (or, more correctly, HttpConnection.HTTP_OK), the Amazon Web server sends a response code of 302, without any useful data. Since the data is missing, the usual, simple-minded approach of reading the content of the input stream isn't going to work here. So why is the server sending this response code, and what should we do about it?

Response code 302 is one of several codes that a Web server can use to indicate a redirection. The full list of these codes, and their official meanings, can be found in Table 1. A complete specification of the HTTP client's expected follow-up action when receiving each of these codes can be found in the HTTP 1.1 specification.

Table 1: HTTP redirect response codes

Code    Meaning
301     Moved permanently
302     Found
303     See other
305     Use proxy
307     Temporary redirect

These codes require the client to look for the requested resource at a different URL, which is included with the server's response in a header called Location. As well as connecting to a different location, if the server responded with either 302 or 303, and the original request was a POST, the new request should instead be a GET; in all other cases, the original request method should be used.
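The rule in the previous paragraph is compact enough to write down directly. The following is a minimal sketch, not code from the example source: the class RedirectRules and its methods are hypothetical names, and plain integers and strings are used in place of the MIDP constants so that the logic can be read on its own.

```java
/** Hypothetical helper encoding the redirect rules described in the text. */
final class RedirectRules {
    private RedirectRules() { }

    /** True for the redirect codes listed in Table 1. */
    static boolean isRedirect(int code) {
        return code == 301 || code == 302 || code == 303
            || code == 305 || code == 307;
    }

    /**
     * Method to use for the follow-up request: a POST answered with
     * 302 or 303 becomes a GET; otherwise the original method is kept.
     */
    static String nextMethod(int code, String method) {
        if ("POST".equals(method) && (code == 302 || code == 303)) {
            return "GET";
        }
        return method;
    }
}
```

In a MIDlet, the literals could be replaced by the corresponding HttpConnection constants: HTTP_MOVED_PERM (301), HTTP_MOVED_TEMP (302), HTTP_SEE_OTHER (303), HTTP_USE_PROXY (305), and HTTP_TEMP_REDIRECT (307).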

There is no guarantee that a second request following a redirection will result in the required HTML page being returned, because multiple redirections are permitted. In other words, we have to keep following redirections until we get to the actual location of the information that we need. In order to avoid loops caused by incorrect server configuration, however, it is normal to impose an upper limit on the number of times a redirection will be followed, or to detect a loop by keeping a history of redirections.
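Both safeguards mentioned above, a limit and a history, can be combined in one small object. The sketch below is only one possible arrangement; RedirectGuard is a hypothetical class that does not appear in the example source. It deliberately uses nothing but an array, so that it does not depend on collection classes that a small device may lack.

```java
/** Hypothetical guard against redirect loops, built on a plain array. */
final class RedirectGuard {
    private final String[] visited;  // redirect targets seen so far
    private int count;               // number of entries in use

    RedirectGuard(int maxRedirects) {
        visited = new String[maxRedirects];
    }

    /**
     * Records a redirect target. Returns false if the URL has been
     * seen before (a loop) or the redirect limit has been reached.
     */
    boolean accept(String url) {
        if (count == visited.length) {
            return false;            // redirected too many times
        }
        for (int i = 0; i < count; i++) {
            if (visited[i].equals(url)) {
                return false;        // been here before: a loop
            }
        }
        visited[count++] = url;
        return true;
    }
}
```

A caller would create one guard per fetch and pass each Location value to accept() before following it. The guard records only redirect targets, so to catch a redirection back to the starting URL, the caller can seed it with the original URL and raise the limit by one.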

In terms of our Fetcher class, the need to follow server redirections means that we convert the simple fetch method shown earlier into a loop that terminates when either the data is returned, an error occurs, or we get redirected too many times. Each pass of the loop will use the Connector class's open() method to open a new connection to the URL obtained from the previous redirection. The final version of this method is shown below, with the most important changes shown in bold.

As you can see, the first pass of the loop makes a POST request using the original URL, writing the query to the output stream obtained from the openOutputStream() method. If a redirect is returned, the new URL is obtained from the Location header of the response and the request method is converted to GET if necessary, so that on the next pass the query will not be written. Conversion to a GET request implies that the server obtains all of the necessary information to locate the resource from the URL in the Location field.
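The shape of such a loop can also be sketched independently of the MIDP classes. The code below is an illustration under stated assumptions, not the Fetcher class from the example source: Response, Transport, and RedirectingFetcher are hypothetical names, and Transport stands in for one HTTP exchange, which in a MIDlet would be a fresh HttpConnection obtained from Connector.open() on each pass. The limit of five redirections is arbitrary, and code 305 (use proxy) is treated as an error for simplicity.

```java
/** Minimal view of a response: status code, Location header, and body. */
final class Response {
    final int code;
    final String location;  // value of the Location header, or null
    final String body;      // page content, or null

    Response(int code, String location, String body) {
        this.code = code;
        this.location = location;
        this.body = body;
    }
}

/** Hypothetical stand-in for one HTTP exchange. */
interface Transport {
    /** Sends one request; body is null for requests that carry none. */
    Response send(String method, String url, String body);
}

final class RedirectingFetcher {
    private static final int MAX_REDIRECTS = 5;  // arbitrary limit

    private RedirectingFetcher() { }

    /** Returns the page body, or null on error or too many redirects. */
    static String fetch(Transport transport, String url, String query) {
        String method = "POST";
        for (int hops = 0; hops <= MAX_REDIRECTS; hops++) {
            // The query is written only while the request is still a POST.
            String body = "POST".equals(method) ? query : null;
            Response r = transport.send(method, url, body);

            if (r.code == 200) {
                return r.body;                  // the data has arrived
            }
            boolean redirect = r.code == 301 || r.code == 302
                || r.code == 303 || r.code == 307;
            if (!redirect || r.location == null) {
                return null;                    // error, or nowhere to go
            }
            if (r.code == 302 || r.code == 303) {
                method = "GET";                 // POST becomes GET
            }
            url = r.location;                   // follow the redirection
        }
        return null;                            // redirected too many times
    }
}
```

Keeping the exchange behind an interface is purely a convenience for this sketch: it lets the loop be exercised with a scripted Transport that answers the POST with 302 and the following GET with 200, without any network access.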

In fact, in the case of the Amazon Web server, the URL that is supplied is exactly the one that you finally see in the Web browser when the page is displayed, and which contains the book's ISBN. Eventually, the server should return a response code of 200, at which point the HTML can be read from the input stream. This, however, is not the end of our problems, as we'll see in the next section.

Before moving on, you are probably wondering why it is that a J2SE application can get away without concerning itself with server redirects, whereas a MIDlet cannot. The answer to this question is very simple -- in J2SE, the code to handle redirects is built into the core libraries, and therefore happens automatically without the application being aware of it (although it can be turned off using the static setFollowRedirects() method or the per-connection setInstanceFollowRedirects() method of java.net.HttpURLConnection). Unfortunately, the MIDP HTTP implementation does not include this feature.
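For comparison, the J2SE switch mentioned above looks like this. The sketch applies to J2SE only, since java.net.HttpURLConnection is not part of MIDP; the class J2seRedirects and its open() method are hypothetical names chosen for illustration. Nothing is sent over the network here, because openConnection() only creates the connection object.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

/** J2SE only: these classes are not available to a MIDlet. */
final class J2seRedirects {
    private J2seRedirects() { }

    /**
     * Creates (without yet connecting) an HTTP connection whose automatic
     * redirect handling is switched on or off for this instance only.
     */
    static HttpURLConnection open(String url, boolean follow) {
        try {
            HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setInstanceFollowRedirects(follow);
            return conn;
        } catch (IOException e) {
            throw new IllegalArgumentException("Cannot open " + url);
        }
    }
}
```

With redirect handling switched off in this way, a J2SE client sees the 302 response itself, which is the situation a MIDP client is always in.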

A final word of caution -- if you think you can avoid the problems shown here by working out what the "real" URL is and using that to make the initial request, think again! In some cases, that might be possible, but in other cases it won't be. If you are faced with writing a J2ME client to interface with somebody else's server, the only way to work out what you have to do is to try it and see -- or ask the server's owner, if that is possible. Be warned, though, that application servers hosting the server side of J2EE applications might use the redirection techniques shown here to point your client to a different URL following authentication, which is itself a topic that requires special treatment (similar to that shown here) for a MIDP application. That, however, is beyond the scope of this article.