Transparent Data Pipelines for JSP

Despite the undeniable popularity of JSP among Java programmers, there is a substantial amount of doubt, if not criticism, over its suitability as a front-end language for delivering HTML pages. One of the main complaints is that it breaks the MVC (Model-View-Controller) paradigm. Other issues include:

Exceptions are awkward, at best, to work with on a JSP page.

Resource reclamation is harder to achieve, particularly for large result sets, in the face of exceptions.

HTML pages, once embedded with JSP, are harder to visualize through graphical editors.

For non-trivial pages, it is difficult to see where HTML ends and JSP starts and vice versa, resulting in a very difficult maintenance situation.

Taglibs present only a partial solution for addressing the inviolable MVC rule, as they only facilitate but do not impose MVC.

Some of these issues arise from current JSP usage patterns, rather than from JSP itself. I would like to present here a concept called "Transparent Data Pipelines" (TDP), which borrows elements from the following proven metaphors:

MVC.

Horizontal Interfaces.

Infosets.

Once implemented, such a facility will allow one to treat the Java code on a JSP page as merely a subset of the Java language, where the programmer will only use:

Assignments.

Conditionals.

Loops.

Although I do believe a well-designed template language would help with most of these issues, for the majority of JSP users the TDP solution should largely achieve the MVC goal. In addition, TDP addresses resource-reclamation and exception-handling issues, while preserving a single-programming-language solution for advanced Java programmers.

In addition to bolstering the "View" portion of the MVC, the TDP architecture proposes solutions to the rarely addressed "model" part of the MVC and provides declarative solutions for data gathering.

TDP also addresses an increasingly important standard called "InfoSets" and delivers the model data as a hierarchical data set to the view portion. InfoSets matter here because the InfoSet is represented as a Java object tree and not a DOM tree (allowing lazy loading, etc.).

MVC Explained

Figure 1. Traditional MVC with vertical interfaces.

Every architecture is eager to confirm that it supports, if not embraces, MVC. What exactly is MVC and why is it my starting point? As depicted in Figure 1, the Web is built with the request/response model. A request is received by a "controller" on the Web server, which then orchestrates a response to be sent back. To generate a response, data has to be gathered and presented in a requisite format, in this case, HTML. The gathered data becomes the "model" part of the MVC. The presentation logic is said to constitute the "view" part of the MVC.

In a well-architected system, a controller is designed as a servlet, the model as a set of EJBs, and the view as a set of JSPs. Of course, variations of this concept are not uncommon. For example, one might choose to represent the model as a set of Java Beans, as opposed to Enterprise Java Beans. And one might decide to use template engines to present views instead of JSP.

Now on to the details of what MVC does not prescribe as a hard rule. A controller servlet typically receives a request, retrieves an appropriate business object or objects, and passes them over to the JSP. The JSP then uses those data objects merely to paint the page. How simple the JSP page can be depends on how well your model data object is designed.

If you have two JSPs, then each page will typically receive its own business object. For example, a customer detail page will receive a customer business object, and an invoice detail page will receive an invoice business object. These types of specific business objects are called "vertical interfaces": each JSP is presented with its own interface. For this reason, these objects are drawn as a circle and a triangle in Figure 1.

The role of the JSP is to walk through these distinct interfaces, retrieve data, and paint the page. If these distinct objects are called "vertical interfaces," are there also "horizontal interfaces"? To answer this question, let us move on to the next section.

MVC with Horizontal Interfaces

In an MVC paradigm, the "model" varies, depending on the nature of the business objects it is carrying. Horizontal interfaces represent uniformity among the interfaces presented by objects. Perhaps the most widely known horizontal interface is the base interface of the COM spec, IUnknown, which every COM object exposes and whose QueryInterface method is the uniform way to discover what else an object supports.

If the model of the MVC is such that it always returns the same interface to the view, the JSP programmer's job becomes easier. This is because every JSP will deal with the model the same way. For example, the same methods are used to retrieve key value pairs from the model; the same mechanism is used to retrieve table data and list box data, etc. In other words, the interface of this unified model is the same, regardless of which JSP the model is delivered to by the controller servlet.

This is demonstrated in the following diagram, where a star represents this unified model interface. The only difference between this diagram and the previous diagram is that the controller servlet now delivers the same interface to all of the JSPs.

Figure 2. MVC with horizontal interfaces.

Inquisitive programmers will ask: "This uniform interface is still very abstract; please give us a concrete example of one such interface." Let me give an example:

Although the API still seems somewhat esoteric, the following data model should make the API very clear. In essence, a dataset is a collection of rows, where each row represents a set of columns and 0 to n datasets derived from the key value pairs in that row.
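A minimal sketch of what such a uniform interface might look like in Java. Only the name IDataSetNode comes from this article; IRow, MemoryDataSet, MemoryRow, and every method name here are hypothetical, chosen to illustrate the data model described above (rows of columns, with each row optionally carrying child data sets).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical horizontal interface: every JSP sees the same shape,
// whatever the data source behind it.
interface IDataSetNode {
    String getName();
    Iterator<IRow> rows();
}

// One row: named column values plus zero or more child data sets.
interface IRow {
    String getValue(String column);
    IDataSetNode getChild(String name);   // null if there is no such child
}

// Simple in-memory implementation, just enough to show the navigation.
class MemoryDataSet implements IDataSetNode {
    private final String name;
    private final List<IRow> rows = new ArrayList<>();

    MemoryDataSet(String name) { this.name = name; }

    MemoryRow addRow() {
        MemoryRow row = new MemoryRow();
        rows.add(row);
        return row;
    }

    @Override public String getName() { return name; }
    @Override public Iterator<IRow> rows() { return rows.iterator(); }
}

class MemoryRow implements IRow {
    private final Map<String, String> columns = new LinkedHashMap<>();
    private final Map<String, IDataSetNode> children = new LinkedHashMap<>();

    MemoryRow set(String column, String value) { columns.put(column, value); return this; }
    MemoryRow attach(IDataSetNode child) { children.put(child.getName(), child); return this; }

    @Override public String getValue(String column) { return columns.get(column); }
    @Override public IDataSetNode getChild(String name) { return children.get(name); }
}

class DataSetDemo {
    // Walks customers and their invoices using only loops, conditionals,
    // and lookups, which is all a JSP would need to do.
    static String render(IDataSetNode customers) {
        StringBuilder out = new StringBuilder();
        for (Iterator<IRow> c = customers.rows(); c.hasNext(); ) {
            IRow customer = c.next();
            out.append(customer.getValue("name")).append(':');
            IDataSetNode invoices = customer.getChild("invoices");
            if (invoices != null) {
                for (Iterator<IRow> i = invoices.rows(); i.hasNext(); ) {
                    out.append(' ').append(i.next().getValue("total"));
                }
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        MemoryDataSet invoices = new MemoryDataSet("invoices");
        invoices.addRow().set("total", "10.00");
        invoices.addRow().set("total", "25.50");
        MemoryDataSet customers = new MemoryDataSet("customers");
        customers.addRow().set("name", "Ada").attach(invoices);
        System.out.print(render(customers));   // prints "Ada: 10.00 25.50"
    }
}
```

The render method never needs to know whether the rows came from SQL, a session attribute, or an EJB, which is the point of a horizontal interface.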

Underneath an implementation of a DataSet, one can hide the data, coming from multiple data sources, that will include:

URL parameters (Request Scope).

Session attributes (Session Scope).

Memory-based tables.

Global configuration values (Application Scope).

SQL output.

Stored Procedure output.

Java-Object-based and EJB-based output rows

Connector-based output.

As you can see from the above, a unified interface gives you an idea of how to represent the data needs for any JSP. One can improve upon these interfaces to suit more complex needs. But the fact remains that one can design interfaces that have commonality and reap the benefits of that commonality.

At this juncture you might ask, how is this different from an XML DOM tree? The important consideration is one of storage efficiency and navigational ease. The presented navigational interface has the following characteristics:

Pull-based data model for higher scalability; data from such sources as the URL, the session, and global configuration is never collected until you ask for it.

Data gatherers are not executed unless you ask for them.

Multiple strategies for either preloading the data or demand loading.

Further simplifications with JSP in mind.

DOM interface provided only if an XSL transform is in the offing, thus not paying the penalty of creating all the DOM nodes required.
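The pull-based, demand-loading behavior listed above can be sketched as a small wrapper around a data gatherer. LazyRows is a hypothetical name, and a Supplier stands in for whatever actually fetches the rows (a SQL query, a stored procedure call, and so on); this is one possible strategy, not the TDP implementation.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical wrapper: the gatherer runs at most once, and only when
// a caller actually asks for the rows.
class LazyRows<T> {
    private final Supplier<List<T>> gatherer;
    private List<T> loaded;            // stays null until the first get()

    LazyRows(Supplier<List<T>> gatherer) {
        this.gatherer = gatherer;
    }

    synchronized List<T> get() {
        if (loaded == null) {
            loaded = gatherer.get();   // demand load on first access
        }
        return loaded;
    }

    synchronized boolean isLoaded() {
        return loaded != null;
    }
}
```

A page that never touches a given data set never pays for gathering it; a preloading strategy would simply call get() before control reaches the view.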

Transparent Data Pipelines

Our exploration into the nature of MVC and its adaptation to JSP doesn't stop here. In this section, I will transform this idea into the idea of Transparent Data Pipelines. Figure 3 shows what happens when MVC with horizontal interfaces is transformed into a TDP.

The idea is that there is a data pipeline between your data sources and your views (in this case, the JSP page). One end of this data pipeline yields a horizontal interface called IDataSetNode, representing a hierarchical dataset. At the other end of the pipeline are the data sources.

Even in this scenario, the controller servlet will continue to receive the incoming request. Once the preliminaries of the request are complete (authentication, authorization, logging, etc.), the request is submitted to a data pipeline. Attached to the pipeline are such entities as EJBs, stored procedures, etc. A set of EJBs and stored procedures is chosen for an incoming request, and their output is collected, consolidated into a hierarchical data set, and finally delivered to the JSP.

Another important component of this architecture is the XML definition file defining what entities (such as EJBs and Stored Procedures) belong to what request. The programmer will declaratively define these pipeline connections in a definition file. So, in essence, the TDP becomes the "model" part of the MVC architecture.

Figure 3. Transparent data pipelines.

TDP Composition: The Model Tier

The following diagram explains TDP from a compositional perspective. The core of TDP is that it knows how to combine relational data sets into a hierarchical data set. Let us first find out where we get the relational data sets from.

When we execute a "Select" SQL from JDBC, we end up with a relational data set represented by a result set. When we execute a stored procedure from JDBC, we could end up with a relational data set. When we have a Java class that returns a Java object that implements the concept of rows and columns, we again have a relational data set.

So we have multiple data sources that could yield the concept of relational data sets. This relational data set could be encapsulated into an API (similar to the rowset API of JDBC 2.0). In the example shown below, we have stored procedures, EJBs, Java procedures, and flat files yielding relational data sets.

The transparent data pipeline, assisted by an XML definition file, will organize these relational data sets into a hierarchical data set identified by the IDataSetNode interface.
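As a rough illustration of the composition step, the sketch below nests one relational data set under another by matching a key column. The Composer class, its nest method, and the use of plain maps for rows are all assumptions made for this example; in TDP the pairing of parent and child sets would come from the XML definition file rather than from method arguments.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class Composer {
    // Nests each child row under the parent row whose key column matches,
    // turning two flat row sets into one hierarchical set.
    static List<Map<String, Object>> nest(List<Map<String, String>> parents,
                                          List<Map<String, String>> children,
                                          String key,
                                          String childName) {
        // Index the child rows by key value.
        Map<String, List<Map<String, String>>> byKey = new LinkedHashMap<>();
        for (Map<String, String> child : children) {
            byKey.computeIfAbsent(child.get(key), k -> new ArrayList<>()).add(child);
        }
        // Copy each parent row and attach its (possibly empty) child list.
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, String> parent : parents) {
            Map<String, Object> node = new LinkedHashMap<>(parent);
            node.put(childName, byKey.getOrDefault(parent.get(key), new ArrayList<>()));
            result.add(node);
        }
        return result;
    }
}
```

For example, customer rows from one stored procedure and invoice rows from another could be nested on a shared customer id column, giving the JSP one tree to walk.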

Figure 4. TDP Composition.

Although the composition is taking place inside of the TDP architecture, the developer is exposed only to the XML definition file and the JSP. Because of this there seems to exist, for the developer, a transparent data pipeline between the data sources and the JSPs.

TDP Characteristics

Let's pause at this point and review the characteristics of the TDP:

MVC design pattern is preserved.

JSPs work with a uniform set of interfaces presented to them by the model.

When applied to external entities such as stored procedures, the programming model eliminates the middle tier entirely, as JSPs can now receive hierarchical data sets simply by composing the stored procedures declaratively in an XML file. This is a powerful metaphor for database shops.

The TDP programming model is completely open, allowing all sorts of relational adapters to be plugged in.

TDP, InfoSets, and XPath

Because TDP delivers hierarchical data sets, there is an interesting connection between TDP and XML, by way of InfoSets. The InfoSets specification recognizes that XML data is fundamentally hierarchical, and hence any hierarchical data, regardless of its representation, can be treated as XML.

Because of this, many of the tools that were previously defined for XML (schemas, xdata, etc.) can now be used against hierarchical data, as long as it follows the InfoSet model. So, potentially, one can apply XPath queries on this hierarchical data to populate HTML via JSPs.
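To give a feel for what an XPath query over such data looks like, here is a small sketch using the JDK's own javax.xml.xpath API. Note the swap: the JDK implementation evaluates paths against DOM nodes, so this example first builds a tiny DOM tree by hand, whereas a TDP implementation would presumably evaluate paths directly against its object tree. The element and attribute names (customers, customer, invoice, total) are made up for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class XPathSketch {
    // Builds a tiny hierarchical data set as a DOM tree, then evaluates
    // the given XPath expression against it and returns the string result.
    static String evaluate(String path) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element customers = doc.createElement("customers");
            doc.appendChild(customers);

            Element customer = doc.createElement("customer");
            customer.setAttribute("name", "Ada");
            customers.appendChild(customer);

            Element invoice = doc.createElement("invoice");
            invoice.setAttribute("total", "10.00");
            customer.appendChild(invoice);

            return XPathFactory.newInstance().newXPath().evaluate(path, doc);
        } catch (ParserConfigurationException | XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this tree, the path /customers/customer[@name='Ada']/invoice[1]/@total selects the total of Ada's first invoice.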

Realizing that this hierarchical data is indeed XML, one can even turn over the presentation of this data to XSL.

Choice of Transformations

Figure 5. Choice of transformations.

If we can apply a JSP transformation or an XSL transformation at the end point of a transparent data pipeline, what other transformations are possible? The hierarchical data set, due to its horizontal interface, is extremely suitable for your own homegrown template languages, which could circumvent a minor irritation of JSP and XSL (namely, the inability to visualize the HTML template in which JSP and XSL are embedded).

Why do horizontal interfaces aid template languages? Because a template language does not have to distinguish between what objects are being accessed, when all objects sport the same interface. This uniformity also helps in translating pages between templates and JSPs when needed.

Let's take a trivial template language and see how this can be converted to a JSP when the need arises.

The above language will loop through the HTML section between the tags while replacing the {{var_name}} and {{var_lastname}} for each row that is in that loop. By examining the IDataSetNode interface, it is not difficult to imagine how the template engine could use that interface to walk through this HTML segment. By leaving the tags in comments, and exposing the substitution variables, the HTML design is kept intact, even after the tags are placed in the page.
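A minimal sketch of how a template engine of this kind could work. The comment-based loop markers used here (`<!-- loop -->` and `<!-- endloop -->`) are a hypothetical syntax invented for this example, as is the TinyTemplate class; rows are plain maps standing in for the rows of an IDataSetNode.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TinyTemplate {
    // Hypothetical tag syntax: the loop markers live inside HTML comments,
    // so a graphical editor still sees a valid, viewable page.
    private static final Pattern LOOP =
            Pattern.compile("<!-- loop -->(.*?)<!-- endloop -->", Pattern.DOTALL);
    private static final Pattern VAR = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    // Repeats each loop segment once per row, substituting {{name}} variables.
    static String render(String template, List<Map<String, String>> rows) {
        Matcher loop = LOOP.matcher(template);
        StringBuffer out = new StringBuffer();
        while (loop.find()) {
            StringBuilder body = new StringBuilder();
            for (Map<String, String> row : rows) {
                body.append(substitute(loop.group(1), row));
            }
            loop.appendReplacement(out, Matcher.quoteReplacement(body.toString()));
        }
        loop.appendTail(out);
        return out.toString();
    }

    // Replaces every {{column}} with the row's value, or "" if absent.
    private static String substitute(String segment, Map<String, String> row) {
        Matcher var = VAR.matcher(segment);
        StringBuffer out = new StringBuffer();
        while (var.find()) {
            String value = row.getOrDefault(var.group(1), "");
            var.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        var.appendTail(out);
        return out.toString();
    }
}
```

Because the engine only asks each row for a value by name, the same page could be translated mechanically into a JSP loop over the same data set.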

The implication is that TDP readily lets users apply their own transformations to the incoming data sets.

Exception Handling and Resource Reclamation

Facilities for exception handling in JSPs leave much to be desired. As I said, this is not a flaw of the JSP concept, but a problem of the current implementations. I am sure future releases of JSP will rectify this. In the meantime, there's no need to lose heart. There are some excellent solutions to circumvent the problem.

Why do we need exception handling on JSP pages? Although we are providing a nice, all-encompassing IDataSetNode object to the JSP, the navigation of this object might encounter exceptions, errors and runtime exceptions. And when an exception happens, the resources that you may have acquired will have to be returned.

Although it is possible to register error pages when an exception occurs, it has to be enforced on all of the JSPs, and programmers, as a rule, are a forgetful lot (if I am at all representative). The solution again lies in MVC. If there is a controller servlet that has transferred control to the JSP page, the controller servlet is a good place to perform this clean-up.

All that a controller servlet has to do is set up a CleanupRegistry in the HttpServletRequest (or more generally, the current thread) and let the model and the view register ICleanUpTask objects with this registry. The controller servlet will call clean-up on this CleanupRegistry when the request returns from the JSP.

When using TDP, this clean-up registration typically happens in the model and the clean-up itself in the controller; the view usually stays out of it. So the JSPs are completely exception-safe, as long as each JSP limits itself to accessing data from the hierarchical data set passed to it. Even in other cases, a JSP can always register a clean-up task if it needs any clean-up as a post process.
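The registry idea can be sketched in a few lines. The names CleanupRegistry and ICleanUpTask follow the article; the method names, the reverse-order policy, and the ControllerSketch class are assumptions. In a real servlet the registry would typically be stored as a request attribute, and a Runnable stands in here for forwarding to the JSP.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A unit of clean-up work, e.g. closing a ResultSet or returning a
// Connection to its pool.
interface ICleanUpTask {
    void cleanUp() throws Exception;
}

class CleanupRegistry {
    private final Deque<ICleanUpTask> tasks = new ArrayDeque<>();

    void register(ICleanUpTask task) {
        tasks.push(task);
    }

    // Runs every task in reverse registration order. A failing task does
    // not prevent the remaining tasks from running. Returns the number of
    // tasks that threw.
    int cleanUp() {
        int failures = 0;
        while (!tasks.isEmpty()) {
            try {
                tasks.pop().cleanUp();
            } catch (Exception e) {
                failures++;   // a real system would log this
            }
        }
        return failures;
    }
}

class ControllerSketch {
    // Stand-in for the controller servlet's request handling: the view runs
    // inside try/finally, so clean-up happens even if the JSP throws.
    static void handle(CleanupRegistry registry, Runnable view) {
        try {
            view.run();
        } finally {
            registry.cleanUp();
        }
    }
}
```

Running tasks in reverse order means a result set registered after its connection is closed before that connection is released.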

Why are we going to such great lengths for cleaning up? How often do we run into this situation? Unfortunately, quite often, when we are dealing with relational databases and JDBC. Every time we access a result set we are dealing with three important resources:

ResultSets.

Statements.

Connections.

These resources will have to be either closed or returned to their pools once the data is collected from them. One cheap solution is to retrieve the result sets into vectors and close these resources at the data source level. This could be a bad idea for high-volume Web sites: it increases your memory footprint, as you are committing memory for all the rows at one time instead of processing them one at a time.

Processing one at a time is not that simple architecturally, because this processing takes place on the JSP, and the JSP is not good at cleanup. So the solution is to allow an abstraction like TDP and provide a close method on IDataSetNode, which will ripple through all of the inner relational data sets.
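A sketch of how such a rippling close might be structured. ClosableNode is a hypothetical class standing in for an IDataSetNode implementation; each node optionally holds one underlying resource, typed as AutoCloseable because JDBC's ResultSet, Statement, and Connection all implement that interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical closable node: closing the root closes every nested
// data set beneath it, and each node releases its own resource.
class ClosableNode implements AutoCloseable {
    private final List<ClosableNode> children = new ArrayList<>();
    private final AutoCloseable resource;   // may be null for pure containers
    private boolean closed;

    ClosableNode(AutoCloseable resource) {
        this.resource = resource;
    }

    void add(ClosableNode child) {
        children.add(child);
    }

    @Override
    public void close() {
        if (closed) {
            return;                          // closing twice is harmless
        }
        closed = true;
        for (ClosableNode child : children) {
            child.close();                   // ripple down first
        }
        if (resource != null) {
            try {
                resource.close();
            } catch (Exception e) {
                // A real system would log this; one failed close must not
                // stop the remaining resources from being released.
            }
        }
    }
}
```

The model could register a single clean-up task that closes the root node, so rows can still be streamed to the JSP one at a time without leaking result sets.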

There is also something of a prevailing wisdom that one should not allow any exceptions to happen on a JSP page. The idea is that we can pre-collect all the data that a JSP page needs so that we can deal with exceptions before the JSP page is invoked. To me, this seems a tall compromise, in terms of scalability. In my view, one can minimize the exceptions that can happen on a JSP page, but one can never really stop runtime exceptions. I do agree that you have to minimize exceptions and find reasonable ways to deal with those severe exceptions.

This single aspect of server-side programming, resource reclamation in the face of exceptions, combined with connection pooling, goes a long way toward determining how scalable and reliable your application server is.

Summary

TDP has much broader applications than the Web space. For instance, if one can standardize on the declarative definition of the compositional aspects of TDP, it is possible for multiple vendors to ship out-of-the-box TDP executors that one can plug into one's applications, just as one does now with JDBC drivers. Programmers can then be relieved of writing JDBC code, CICS code, etc., and focus instead on the business logic that manipulates the returned data.

As the data sets are already InfoSets, it makes a lot of sense to return them in SOAP envelopes to make your back ends SOAP-enabled, thereby serving your B2B and B2C needs simultaneously.

TDP will come in handy for enterprise reporting systems. Today, these reporting engines are constrained and pigeon-holed for a specific database and a specific kind of data source. Something like TDP will open up these reporting engines by supplying data as XML and letting the reporting engines do what they do best: design GUI-based report templates.

The above analogy will apply to charting software as well. Let the charting engines design GUI templates for charts, and supply the needed data utilizing the declarative TDP. This approach has been successfully applied in the industry.

TDP is highly effective in bringing J2EE and XML to a much larger audience, allowing one to develop solutions in a much shorter timeframe, while maintaining the following characteristics: