Smoothly Blending Java and SQL with pureQuery

Just like oil and water don't mix effectively without an emulsifier, so too is the problem with Java and SQL. With IBM's pureQuery data access platform, developers can more simply blend the two together.

by Rick Cook

Dec 11, 2008

Page 1 of 2

Modern software development is, famously, a collaborative process. Unfortunately the tools we use to develop software don't always reflect that home truth.

Consider an application written in Java that uses SQL to manipulate information in a database. SQL and Java are two of the primary tools for developing modern data-driven applications. Unfortunately they mix about as well as oil and water. Developing in SQL and Java usually requires two sets of tools, two developers and a certain amount of pain and suffering.

Developers working with Java and SQL don't call it anything—at least anything printable. They just acknowledge it's a problem, put their heads down, and keep slogging.

One of the things they have to slog their way through is that most Java IDEs don't speak SQL. To them a SQL statement is merely a string. They can't do syntax and semantic checking, much less optimization on SQL statements, the way they can on Java.

Since people who are skilled in both Java and SQL are rare, most Java SQL projects need to team Java developers with SQL developers and DBAs. "There's a need for a person strong in Java collaborating with someone who is strong in SQL," says Rafael Coss, an IBM Solution Architect leading the pureQuery solution efforts in the Data Server Solutions team at IBM's Silicon Valley Lab. "A good running application is going to need the entire team to make it work." However most tools do a poor job of enabling teamwork between SQL and Java developers and DBAs.

Java is object-oriented with encapsulation, inheritance and polymorphism as major features. The closest relational databases come to objects are rows with primary key values. SQL is a declarative query language as are most languages for manipulating SQL queries.

The result is that programs using Java and SQL are hard to write, prone to errors and it is very hard for database administrators and Java programmers to collaborate effectively.

Some Java programmers have attempted to solve this challenge by hiding the database with object relational mapping (ORM) frameworks like Hibernate. They have attempted to create an abstraction layer so they don't have to concern themselves with SQL. In ORMs, developers setup a configuration or mapping file to define how their Java objects are mapped to tables with the goal of having the ORM generate some basic SQL CRUD (Create Read Update Delete). For more specific queries they also provide some Object Query Language, like HQL so they don't have to do SQL. However most ORM object query languages are very, very SQL-like and eventually generate SQL. The "what" and "how" of the translation depends on the ORM.

What if the DBA or SQL developer wants to help tune or troubleshoot the data access layer? With an ORM the Java developers now have no idea what the actual SQL is that is being generated. The ORM is like a SQL amnesia layer which prevents the Java developers from working with the DBAs and SQL developers because the database is hidden.

Of course, as every cook knows, oil and water will mix nicely if you add an emulsifier, a third ingredient to bind them together. For Java and SQL it's IBM's new pureQuery data access platform. Put simply, pureQuery acts as a software emulsifier to let Java and SQL blend together seamlessly. The result is quicker, easier development and greatly improved productivity.

How much improved productivity? YMMV, of course, but according to IBM, customers have seen productivity gains of 25 percent and more in Java-SQL applications.

"pureQuery is not just for developers. It's for developers and application-centric DBAs," says Coss. "Data Studio Developer is an Eclipse-based tool for developing data oriented applications." Most of pureQuery's features will work with any SQL database that has a JDBC driver (pureQuery is the data access layer for sMash/project Zero), but others are currently available to users of DB2 and Informix.

Two Ways of Working with Java and SQL
Broadly speaking, existing data access APIs take one of two approaches. They either provide direct access to the SQL or they treat SQL code as managed objects. Both have their drawbacks.

Direct access is easier to understand and gives the developer control over the SQL, but it means that someone has to map the JDBC result sets to Java objects, create statements and bind parameter markers. That's a lot of work, much of it fairly tedious.

The managed objects approach, such as Object Relational Mapping (ORM) makes life seem easier for the developers by creating a data access abstraction layer that often generates the SQL statements from the Java objects, thus hiding the database. This does make things easier for the Java developers, at least initially, but it provides only limited control over the SQL statements—in some cases so limited that the Java programmers don't know which SQL relates to which Java objects. The decision to hide the database basically taxes the DBA and Developer when it comes to tuning, trouble shooting and performing problem determination because the Java to Relational mapping is hidden and you don't have access to the actual SQL to modify.

This greatly complicates life at deployment Not only does the additional abstraction layer complicate analysis of runtime performance, tuning and diagnostics become much harder because the developers have lost control of the SQL.