Using Hierarchical Data Sets with Aspire and Tomcat

What are Hierarchical Data Sets and Why Do You Care?

Hierarchical Data Sets are not new. They already exist in the form of CICS
transactional data, files in directories, and plain Java objects, as well as
the obvious XML. In the XML Journal in early 2001, I floated the idea that
programmers can benefit from hierarchical data abstractions even though many of
their data sources are predominantly relational (databases such as MySQL,
Oracle, SQL Server, and DB2).

The .NET world has a similar idea taking root in the notion of
"datasets." Although there are important differences between my proposed
Hierarchical Data Sets and the nature of Microsoft's datasets, it is evident
that Hierarchical Data Sets enhance relational abstractions with richer
detail.

This article examines the structure of, and a Java API for, Hierarchical Data
Sets. Unlike the XML Journal article two years ago, this one comes with
executable code you can use to start taking advantage of Hierarchical Data Sets.
Although programmers can code in Java to access various data sources and
construct the final Hierarchical Data Set, this article provides an
implementation that you can readily use to construct these Hierarchical Data
Sets declaratively by simply composing pre-built relational adapters. Relational
adapters include file readers, SQL readers, Stored Procedure readers, and so
on.

The question you're probably asking is "What good are these
Hierarchical Data Sets?" Although they can't rival the salutary effects
of large expensive pieces of Carbon on your most certainly deserving
companions, Hierarchical Data Sets are quite useful in the programming world.
For starters, an entire HTML page worth of data can be satisfied by a single
Hierarchical Data Set. In an MVC model, a controller servlet can deliver a
Hierarchical Data Set to a JSP page, which will paint it without further ado.
For a warmup, it can be converted to XML and directly returned to the caller
by the controller servlet. For the appeal, the Hierarchical Data Set can be
converted to Excel. For the stylish, the Hierarchical Data Set can be
redirected to a reporting engine or a charting engine that supports XML
data.

Although the primary focus of the article is the Java programming API for
Java programmers, Hierarchical Data Sets can be used by non-Java programmers
quite effectively to obtain XML, HTML, or Excel formats directly from relational
databases and other data sources by using a J2EE server such as Tomcat.
Let us now investigate the structure of Hierarchical Data Sets
and see how these data sets can be obtained declaratively (while relaxing your
programming muscles a bit).

Structure of Hierarchical Data

A Hierarchical Data Structure can be conceptually represented as a Java API,
or XML, or some other format. It is easiest to visualize as XML.

<AspireDataSet>
    <!-- A set of key value pairs at the root level -->
    <key1>val1</key1>
    <key2>val2</key2>
    <!-- A set of named loops -->
    <loop name="loop1">
    </loop>
    <loop name="loop2">
    </loop>
</AspireDataSet>

At the root level, this is a set of key/value pairs. A given set of key/value
pairs can yield n independent loops. Each loop is essentially a table of
data; I use the term "loop" as a synonym for "table." I
haven't used "table" because people might take
"table" literally to mean only data from a relational table. Having mentioned
that a loop is a collection of rows (RowSet!), let us look closer at the
structure of a loop.

The only odd thing here is the structure of a row. A row is, expectedly, a
collection of key/value pairs. Here, however, a row includes not only key/value
pairs but also, recursively, its own set of n independent loops. This
extension can produce trees of any depth. (Or should I say,
height!)
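To make the recursion concrete, here is a minimal in-memory sketch of that shape in plain Java. Everything in it is my own illustration rather than part of Aspire: the class names (HdsShapeSketch, Row, Loop), the `<row>` element used when printing, and the assumption that values need no XML escaping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the structure described above; none of these names come from Aspire.
class HdsShapeSketch {

    /** A row: key/value pairs plus any number of named child loops. */
    static class Row {
        final Map<String, String> values = new LinkedHashMap<>();
        final Map<String, Loop> childLoops = new LinkedHashMap<>();

        Row put(String key, String value) { values.put(key, value); return this; }
        Row child(Loop loop) { childLoops.put(loop.name, loop); return this; }
    }

    /** A loop: a named, ordered collection of rows (a "table"). */
    static class Loop {
        final String name;
        final List<Row> rows = new ArrayList<>();

        Loop(String name) { this.name = name; }
        Loop add(Row row) { rows.add(row); return this; }
    }

    /** Writes a row's pairs, then recurses into each of its child loops. */
    static void writeRow(Row row, String indent, StringBuilder out) {
        for (Map.Entry<String, String> e : row.values.entrySet()) {
            out.append(indent).append('<').append(e.getKey()).append('>')
               .append(e.getValue())
               .append("</").append(e.getKey()).append(">\n");
        }
        for (Loop loop : row.childLoops.values()) {
            out.append(indent).append("<loop name=\"").append(loop.name).append("\">\n");
            for (Row child : loop.rows) {
                out.append(indent).append("  <row>\n");   // element name is an assumption
                writeRow(child, indent + "    ", out);
                out.append(indent).append("  </row>\n");
            }
            out.append(indent).append("</loop>\n");
        }
    }

    /** The root is treated as a row: key/value pairs plus named loops. */
    static String toXml(Row root) {
        StringBuilder out = new StringBuilder("<AspireDataSet>\n");
        writeRow(root, "  ", out);
        return out.append("</AspireDataSet>\n").toString();
    }

    public static void main(String[] args) {
        Row root = new Row().put("key1", "val1")
            .child(new Loop("loop1")
                .add(new Row().put("name", "a")
                    .child(new Loop("inner").add(new Row().put("x", "1")))));
        System.out.print(toXml(root));
    }
}
```

Because a Row can hold Loops and a Loop holds Rows, the same two types describe a tree of any depth. The sketch holds everything in memory, which a real data set need not do.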

Structure of Hierarchical Data in Java

Having shown the hierarchical data as XML, I run the risk
that people might take a Hierarchical Data Set to be literally XML and, hence,
literally DOM and, hence, a lot of memory inside of the JVM. No need to panic.
The Hierarchical Data Set can have its own Java API and need not be represented
as a DOM. The majority of the time it is a
forward-only-traversing-cursor-like-lazy-loading tree. Here is a working Java
API for a Hierarchical Data Set:

package com.ai.htmlgen;

import com.ai.data.*;

/**
 * Represents a Hierarchical Data Set.
 * An hds is a collection of rows.
 * You can step through the rows using ILoopForwardIterator.
 * You can find out about the columns via IMetaData.
 * An hds is also a collection of loops originating from the current row.
 */
public interface ihds extends ILoopForwardIterator
{
    /**
     * Returns the parent if available.
     * Returns null if there is no parent.
     */
    public ihds getParent() throws DataException;

    /**
     * For the current row, returns a set of
     * child loop names. ILoopForwardIterator determines
     * what the current row is.
     *
     * @see ILoopForwardIterator
     */
    public IIterator getChildNames() throws DataException;

    /**
     * Given a child name, returns the child Java object,
     * which is again represented by ihds.
     */
    public ihds getChild(String childName) throws DataException;

    /**
     * Returns an aggregate value, similar to SUM or AVG, computed
     * over the set of rows that are children of this row.
     */
    public String getAggregateValue(String keyname) throws DataException;

    /**
     * Returns the column names of this loop or table.
     * @see IMetaData
     */
    public IMetaData getMetaData() throws DataException;

    /**
     * Releases any resources that may be held by this loop of data
     * or table.
     */
    public void close() throws DataException;
}

The interface name ihds is short for "Interface to
Hierarchical Data Set." This API allows you to step through your loops
recursively. An implementation has the option to load the loops only when they
are requested. It can also support either forward-only or random traversal.
The API relies on two additional interfaces: ILoopForwardIterator, which steps
through the rows, and IMetaData, which describes the columns.
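To show what stepping through loops recursively can look like, here is a small forward-only walk. Note the assumptions: the Cursor interface and its method names (moveToFirst, moveToNext, isAtTheEnd, getValue, and so on) are hypothetical stand-ins that I made up for this sketch, not the actual signatures of ihds, ILoopForwardIterator, or IMetaData, and ListCursor is just an in-memory test fixture.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ForwardWalkSketch {

    /** Hypothetical stand-in for a forward-only loop cursor; NOT Aspire's real API. */
    interface Cursor {
        void moveToFirst();
        void moveToNext();
        boolean isAtTheEnd();
        List<String> getColumnNames();   // plays the role of metadata
        String getValue(String column);  // value in the current row
        List<String> getChildNames();    // child loops of the current row
        Cursor getChild(String childName);
    }

    /** One in-memory row: column values plus named child loops. */
    static class Row {
        final Map<String, String> values = new LinkedHashMap<>();
        final Map<String, Cursor> children = new LinkedHashMap<>();

        Row(String... keyValuePairs) {
            for (int i = 0; i + 1 < keyValuePairs.length; i += 2) {
                values.put(keyValuePairs[i], keyValuePairs[i + 1]);
            }
        }
        Row with(String childName, Cursor child) { children.put(childName, child); return this; }
    }

    /** In-memory cursor used only to exercise the walk. */
    static class ListCursor implements Cursor {
        private final List<String> columns;
        private final List<Row> rows;
        private int index = 0;

        ListCursor(List<String> columns, Row... rows) {
            this.columns = columns;
            this.rows = Arrays.asList(rows);
        }
        public void moveToFirst() { index = 0; }
        public void moveToNext() { index++; }
        public boolean isAtTheEnd() { return index >= rows.size(); }
        public List<String> getColumnNames() { return columns; }
        public String getValue(String column) { return rows.get(index).values.get(column); }
        public List<String> getChildNames() { return new ArrayList<>(rows.get(index).children.keySet()); }
        public Cursor getChild(String childName) { return rows.get(index).children.get(childName); }
    }

    /** Visits every row once, front to back, descending into child loops per row. */
    static void walk(Cursor loop, String path, List<String> out) {
        for (loop.moveToFirst(); !loop.isAtTheEnd(); loop.moveToNext()) {
            for (String column : loop.getColumnNames()) {
                out.add(path + "/" + column + "=" + loop.getValue(column));
            }
            for (String childName : loop.getChildNames()) {
                walk(loop.getChild(childName), path + "/" + childName, out);
            }
        }
    }

    static Cursor sample() {
        Cursor child = new ListCursor(Arrays.asList("title"), new Row("title", "t1"));
        return new ListCursor(Arrays.asList("author"),
            new Row("author", "a1").with("childloop1", child),
            new Row("author", "a2"));
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        walk(sample(), "works", out);
        out.forEach(System.out::println);
    }
}
```

The walk never moves backward and only asks for a child loop when it reaches the row that owns it, which is the access pattern a lazily loading implementation can serve without building the whole tree first.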

How Can You Obtain a Hierarchical Data Set, So You Can Use It?

Now that we know the structure of a Hierarchical Data Set, how do you get
hold of one? As I stated earlier, this is easy under Aspire. The steps are as
follows:

1. Learn the basics of Aspire.

2. Create a definition file for your Hierarchical Data Set.

3. Call your definition and receive ihds in your Java code.

Each of these steps is explained in some detail below.

Read the Basics on the Usage of the Aspire JAR

Aspire is a small JAR file that can complement your Java programming,
particularly when used with an app server such as Tomcat. At the heart of Aspire
is a set of configuration files, where you declare your data access mechanisms
in terms of Java classes and arguments to those Java classes. Aspire will
execute those Java classes and return the resulting objects. Hierarchical Data
Sets are no exception.
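The general idea of declaring a class in configuration and letting a factory instantiate it can be sketched with nothing more than java.util.Properties and reflection. This is only an illustration of the pattern, not Aspire's implementation: the `<name>.className` key layout, the ConfigFactorySketch class, and the GreetingSupplier part are all hypothetical.

```java
import java.util.Properties;
import java.util.function.Supplier;

class ConfigFactorySketch {

    /** Example part that a configuration entry can point to. */
    static class GreetingSupplier implements Supplier<String> {
        public String get() { return "hello from a configured class"; }
    }

    /**
     * Looks up the class configured under a symbolic name and instantiates it
     * reflectively. The "<name>.className" key layout is hypothetical.
     */
    static Object create(Properties config, String symbolicName) {
        String className = config.getProperty(symbolicName + ".className");
        if (className == null) {
            throw new IllegalArgumentException("No class configured for " + symbolicName);
        }
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("greeting.className", GreetingSupplier.class.getName());

        // The caller knows only the symbolic name, never the concrete class.
        Supplier<?> part = (Supplier<?>) create(config, "greeting");
        System.out.println(part.get());
    }
}
```

The point of the design is that swapping one data access class for another becomes an edit to a configuration file rather than a change to calling code.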

An earlier O'Reilly article introduced Aspire: "For Tomcat Developers, Aspire Comes in a JAR." That article will familiarize you with defining
databases and calling SQL and Stored Procedures, as well as configuring and
initializing Aspire.

This definition has three sections. The data set is named
ihdsTest. The first section tells Aspire that the Java class
com.ai.htmlgen.DBHashTableFormHandler1 is responsible for
returning an object implementing ihds. Unless you code your own
implementation of ihds, you will use this class in every data set
definition. It's the pre-fabricated class that knows how to compose relational
assets into hierarchical assets. Line 2 of section 1 tells
DBHashTableFormHandler1 that this main data set has one loop
called works.

Section2 defines the loop works. A loop structure in Aspire
uses two Java classes: a class request (GenericTableHandler6) and
a query request (RowFileReader).
RowFileReader reads a set of records from a flat file and makes
them look like a collection of rows and columns.
GenericTableHandler6 takes this collection, applies such
features as aggregate values and row numbers, and implements the
ihds interface at the loop level. As with
DBHashTableFormHandler1, GenericTableHandler6 is
present in most definitions. RowFileReader might change, depending
on your data sources. For example, the following parts exist in this
category:

RowFileReader.

DBRequestExecutor2 (for reading SQL).

StoredProcedureExecutor2 (for reading from Stored
Procedures).

XMLReader (for reading XML files).

Or, you can write your own reader that implements
IDataCollection.
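As a rough picture of what a reader part does, the sketch below turns delimited text into rows and columns and then layers row numbers and a simple aggregate on top. It is not RowFileReader or GenericTableHandler6: the file layout (a header line followed by '|'-separated records), the class name RowReaderSketch, and the sample data are all assumptions of mine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowReaderSketch {

    /** Parses delimited text (assumed: header line, then '|'-separated records) into rows. */
    static List<Map<String, String>> readRows(BufferedReader in) {
        try {
            List<Map<String, String>> rows = new ArrayList<>();
            String header = in.readLine();
            if (header == null) {
                return rows;                       // empty input, no rows
            }
            String[] columns = header.split("\\|", -1);
            String line;
            while ((line = in.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue;                      // skip blank lines
                }
                String[] fields = line.split("\\|", -1);
                Map<String, String> row = new LinkedHashMap<>();
                for (int i = 0; i < columns.length; i++) {
                    // Missing trailing fields become empty strings.
                    row.put(columns[i].trim(), i < fields.length ? fields[i].trim() : "");
                }
                rows.add(row);
            }
            return rows;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Adds a 1-based row number column to every row. */
    static void addRowNumbers(List<Map<String, String>> rows, String columnName) {
        for (int i = 0; i < rows.size(); i++) {
            rows.get(i).put(columnName, String.valueOf(i + 1));
        }
    }

    /** Sums a numeric column, as one example of an aggregate value. */
    static long sum(List<Map<String, String>> rows, String column) {
        long total = 0;
        for (Map<String, String> row : rows) {
            total += Long.parseLong(row.get(column));
        }
        return total;
    }

    public static void main(String[] args) {
        String data = "title|pages\nw1|10\nw2|20\n";   // made-up sample records
        List<Map<String, String>> rows = readRows(new BufferedReader(new StringReader(data)));
        addRowNumbers(rows, "rownum");
        System.out.println(rows);
        System.out.println("total pages = " + sum(rows, "pages"));
    }
}
```

Keeping the reading step separate from the decoration step mirrors the split described above: the reader can be swapped for a SQL or XML source while the row numbering and aggregation stay the same.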

Section2 also indicates that it has a child called childloop1.
GenericTableHandler6 will take this cue and look for section3,
identified by childloop1.

Section3 defines childloop1. The definition is identical to
section2, except that childloop1 has no children. Both section2 and
section3 use RowFileReaders. In practice, they can use any
combination of data reader parts.
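The way one loop definition points to the next by name can be pictured as a simple lookup that recurses on child names. In the sketch below, the LoopDef class and the map of definitions are hypothetical stand-ins for whatever Aspire builds from its configuration; only the loop and reader names (works, childloop1, RowFileReader) come from the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LoopResolutionSketch {

    /** Hypothetical loop definition: which reader feeds it and which child loops it declares. */
    static class LoopDef {
        final String reader;
        final List<String> childNames;

        LoopDef(String reader, String... childNames) {
            this.reader = reader;
            this.childNames = Arrays.asList(childNames);
        }
    }

    /** Follows child names from one definition to the next and records an indented outline. */
    static void outline(Map<String, LoopDef> defs, String loopName, int depth, List<String> out) {
        LoopDef def = defs.get(loopName);
        if (def == null) {
            throw new IllegalArgumentException("No definition found for loop " + loopName);
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        out.add(line.append(loopName).append(" <- ").append(def.reader).toString());
        for (String child : def.childNames) {
            outline(defs, child, depth + 1, out);   // assumes definitions contain no cycles
        }
    }

    public static void main(String[] args) {
        Map<String, LoopDef> defs = new HashMap<>();
        defs.put("works", new LoopDef("RowFileReader", "childloop1"));
        defs.put("childloop1", new LoopDef("RowFileReader"));

        List<String> out = new ArrayList<>();
        outline(defs, "works", 0, out);
        out.forEach(System.out::println);
    }
}
```

Each definition names its own reader, so a parent loop fed by SQL could just as well have a child loop fed by a flat file.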

Let me call this file ihds-test.properties. Include this file
in Aspire's master aspire.properties as follows:

Aspire has a factory service, represented by the IFactory
interface. This factory interface allows you to call a Java class, identified by
a symbolic name called ihdsTest, with any arguments passed in as a
hashtable. The arguments are expected to be lowercase strings for the
downstream relational adapters.
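From the caller's side, the pattern looks roughly like the sketch below. Please read it as an illustration only: the Factory interface here is a hypothetical stand-in whose getObject signature I have assumed, not Aspire's IFactory, and I am assuming that "lowercase" refers to the argument keys.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Locale;
import java.util.Map;

class FactoryCallSketch {

    /** Hypothetical stand-in for a factory service; Aspire's IFactory may differ. */
    interface Factory {
        Object getObject(String symbolicName, Hashtable<String, String> args);
    }

    /** Copies the arguments into a Hashtable with lowercase keys (assumed convention). */
    static Hashtable<String, String> lowercaseKeys(Map<String, String> args) {
        Hashtable<String, String> result = new Hashtable<>();
        for (Map.Entry<String, String> e : args.entrySet()) {
            result.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // A fake factory that simply echoes the request it received.
        Factory echo = (name, arguments) -> name + arguments;

        Map<String, String> raw = new HashMap<>();
        raw.put("AuthorId", "42");   // made-up argument

        Object result = echo.getObject("ihdsTest", lowercaseKeys(raw));
        System.out.println(result);
    }
}
```

Normalizing keys in one place before the call keeps the lowercase convention out of the rest of the calling code.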