Reading weather with Java

October 2009, by Mario

Applet MeteoApplet

In the August issue of LinuxPro I read an article that uses Python to process the weather information provided by the Yahoo! Weather service through its RSS feed interface, as defined in the API specification.

I was struck by how simply the information can be retrieved with Python and the chosen library, although I have some reservations about whether the feed is used appropriately in this case. But let's take things in order.

I tried to do the same thing in Java and, for the feeds, I settled on the Rome project because it is open source and is supported by Sun. At the time of writing, the latest version of Rome is 1.0, but note that it requires another library that is not included, JDOM, which you should download from its own site.

Copy rome-1.0.jar and jdom.jar into the lib/ext directory of the Java Runtime Environment.

The heart of the Python code that reads the feed is as follows:
import feedparser
url = "http://weather.yahooapis.com/forecastrss?p=ITXX0090&u=c"
data = feedparser.parse(url)
summary = data.entries[0].summary

Reading the Yahoo! Weather specification, I realized that the Python library retrieves the information from the standard feed node <description> (contained in the standard node <item>) which, in the way Yahoo! uses it, contains a fragment of HTML that is already formatted and ready to be displayed. The article processes this text to extract the desired information. I will say up front that I do not think this is the right approach, but for now let's implement the same functionality in Java.

By analogy with the Python code, I have omitted the check that item objects are actually present in the feed, since their presence is guaranteed by the Yahoo! specification. The Java code is not much more complex than the Python, but Java is not a scripting language and is more verbose: variables must be declared, exceptions must be caught, and the libraries are often more complex in their architecture in order to be more flexible.

The Python code, for example, does not account for connections that pass through a proxy, which is certainly the case if you run the code from a corporate network. This information cannot be passed in a simple string, and that is why the more complicated Java library works on data streams, which are more technical. That said, the basic code is more complex because we must first create the stream with the instruction:
Reader reader = new InputStreamReader(
    new URL("URL feed").openConnection().getInputStream());
The greater power lies in being able to create a connection that uses a proxy. The proxy object must obviously be configured with the correct address and port:
Proxy proxy = new Proxy(Proxy.Type.HTTP,
new InetSocketAddress("IP proxy", Port proxy));
The opening of the stream then becomes:
Reader reader = new InputStreamReader(
    new URL("URL feed").openConnection(proxy).getInputStream());
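
The two ways of opening the stream can be gathered into one small helper. What follows is only a minimal sketch using the standard java.net classes: the class name FeedConnection and the method names httpProxy and open are my own, chosen for illustration, and the host and port must be replaced with those of your network.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper, not part of Rome or of the original article.
final class FeedConnection {

    // Describes an HTTP proxy; nothing is connected at this point.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Opens the feed stream, going through the proxy only when one is given.
    static InputStream open(String feedUrl, Proxy proxy) throws IOException {
        URL url = new URL(feedUrl);
        URLConnection connection =
                (proxy == null) ? url.openConnection() : url.openConnection(proxy);
        return connection.getInputStream();
    }
}
```

Passing null as the proxy gives the direct connection, so the same call works both at home and behind a corporate proxy.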

The actual reading of the feed is as simple as in Python, as this code shows:
WireFeedInput inputFeed = new WireFeedInput();
Channel feed = (Channel) inputFeed.build(reader);
List<Item> items = feed.getItems();
// getDescription() returns a Description object; getValue() gives its text
String summary = items.get(0).getDescription().getValue();
The difference is that the method that reads the feed is not static, so the class must first be instantiated.

Note that it is necessary to close the resources and to catch the exceptions. To keep the code simple, I caught a generic Exception, but you can have finer control by catching the more specific exceptions thrown by the instructions I used:
MalformedURLException, IOException, IllegalArgumentException and
FeedException
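
As an illustration of the finer-grained handling, here is a sketch limited to the two exceptions that come from the JDK itself. The class FeedFetch and the method tryOpen are hypothetical names of mine, and the Rome call is only indicated by a comment, so IllegalArgumentException and FeedException do not appear in the catch list. The try-with-resources form assumes Java 7 or later; on older versions the reader must be closed in a finally block.

```java
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: returns a status string to show which branch handled the failure.
final class FeedFetch {

    static String tryOpen(String feedUrl) {
        // The reader is closed automatically when the try block ends.
        try (Reader reader = new InputStreamReader(
                new URL(feedUrl).openConnection().getInputStream(),
                StandardCharsets.UTF_8)) {
            // The Rome call (inputFeed.build(reader)) would go here; it can also
            // throw IllegalArgumentException and FeedException.
            return "opened";
        } catch (MalformedURLException e) {
            // Must come before IOException, of which it is a subclass.
            return "malformed URL: " + e.getMessage();
        } catch (IOException e) {
            return "I/O problem: " + e.getMessage();
        }
    }
}
```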

The cast to the Channel class handles feeds of type RSS; if we had been reading an Atom feed, we would have had to use the Feed class.

The article I mentioned at the beginning goes on to process the contents of the feed's <description> node, but I think this is not the proper way to proceed. The Yahoo! Weather specification describes that node as follows:
A simple summary of the current conditions and tomorrow's forecast, in HTML format, including a link to Yahoo! Weather for the full forecast.
It specifies nothing! The node contains a ready-made fragment of HTML, but who can assure us that the text will keep the same form, with the same words in the same order, even within the same version of the service API?

The actual information is contained in the specific node <yweather:condition />, which is clearly documented in the specification of the current version of the API and will remain so unless the specification changes. The specification gives an example, which shows that only the values change and not the attributes the code refers to, unlike what can happen with the text of the <description> node:
<yweather:condition text="Fair" code="33" temp="8"
date="Thu, 29 Oct 2009 6:56 am PDT" />
The specification describes it this way:
The current weather conditions. Attributes:
- text: a textual description of conditions, for example, "Partly Cloudy" (string)
- code: the condition code for this forecast. You could use this code to choose a text description or image for the forecast. The possible values for this element are described in Condition Codes (integer)
- temp: the current temperature, in the units specified by the yweather:units element (integer)
- date: the current date and time for which this forecast applies. The date is in RFC822 Section 5 format, for example "Wed, 30 Nov 2005 1:56 pm PST" (string)
It is clear that the feed always contains one and only one <yweather:condition /> node, and that this node always has the attribute "temp", indicating the temperature, and the attribute "code", identifying the weather condition (which we can translate into our own language, as the attribute "text" already does for English). This node is not part of the standard feed, and I guess that is why the Python article uses the <description> node instead, which is standard.

That said, recall that the initial aim is to read the weather, so the easiest way is to work at a lower level: read the data as XML and find the <yweather:condition /> node.
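
A possible way to do this with the DOM parser included in the JDK is sketched below. It is only an illustration under stated assumptions: the class name ConditionReader and its method are hypothetical, and the parser is left in its default mode, which is not namespace-aware, so the prefixed name "yweather:condition" is matched literally.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: extracts the attributes of the first yweather:condition node.
final class ConditionReader {

    // Returns text, code, temp and date, or an empty map if the node is missing.
    static Map<String, String> readCondition(String xml) {
        Map<String, String> condition = new LinkedHashMap<String, String>();
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            // Default factory is not namespace-aware: the qualified name is compared as-is.
            NodeList nodes = doc.getElementsByTagName("yweather:condition");
            if (nodes.getLength() > 0) {
                Element node = (Element) nodes.item(0);
                for (String name : new String[] {"text", "code", "temp", "date"}) {
                    condition.put(name, node.getAttribute(name));
                }
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("The feed could not be parsed as XML", e);
        }
        return condition;
    }
}
```

The method takes the XML as a string only to keep the sketch short; DocumentBuilder.parse also accepts an InputStream, so the stream opened on the feed URL could be passed to it directly.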