EJB 3.0 Persistence – Introducing EJBQL 3.0

EJB QL or Enterprise Java Beans Query Language has been introduced in the EJB 2.0 Specification, back in 2001. For some reason, it never grew on me. I did not like EJB QL and I even ridiculed it. Having yet another language to learn in the the J2EE stack, trying to imitate SQL, in a very poor way. I was wary of it and avoided it.

However, at the same time I was a happy user of the Top Query Language. And of Hibernate HQL. I liked the elegant way in which queries could be written in terms of objects. I especially liked the ‘navigation style’ of writing query expressions. And even though at times I missed the liberty of writing direct SQL – being convinced that only I could write superior SQL, efficient and functionally complete. The main thing of writing queries in TopLink’s query language or HQL was that the result of the query was a bunch of domain objects – POJOs or Entities. I did not have to convert JDBC resultsets. And those domain objects lived in the cache and could be used for further manipulation.

So how come EJB QL was not good enough and those specialized OO Query Languages were? Part of it was the fact that EJB QL had huge functional limitations. I also tried to avoid EJBs in general – as they often brought much more complexity than could be justified by what they had to offer; I rarely needed to distribute my application for example.

In this article, I will introduce EJB QL as it is defined in EJB 3.0. And I will try to explain why now all of a sudden I am interested in using it. In a subsequent article, I will go through code examples of using the new EJBQL 3.0 language, on top of the GlassFish JEE 5 Reference Implementation.

History

EJB QL was introduced in September 2001, in EJB 2.0. Before that time, EJB had finder methods for retrieving objects based on database data: find all objects and find by primary key. There was a clear need for more refined queries, was CMP going to really fly. However, there were many limitations to the EJB 2.0 QL, such as a very poor selection of SQL functions, no support for joins or subqueries and not even order by and aggregation.

Some of these shortcomings were rectified in EJB 2.1 in November 2003. In J2EE 1.4, the EJB 2.1 specification tackled two widely anticipated enhancements to EJB QL. These included a new ORDER BY clause enabling sorting capabilities that previously were either hand-coded by developers or offered through proprietary extensions and new aggregate functions such as AVG, MIN, MAX, and COUNT. Because EJB QL naturally maps to SQL, these set-based manipulation additions were seen as critical by many developers and were now required by the specification for EJB QL 2.1. EJB 2.1 also allows string comparisons using the greater-than and less-than symbols ( > , < ) in WHERE statements.

However, even then there were large holes in EJB QL compared to proper Query Languages such as SQL and even Toplink’s Query Language and Hibernate’s HQL. And of course EJB QL was only used for EJBs – and those were becoming less and less popular due to their – perceived – complexity and relatively limited benefit.

In 2005 the EJB 3.0 specification evolved and in 2006 it will be finalized as part of the JEE 5 standard. It is rapidly becoming clear that EJB 3.0 will revolutionize EJB. And will mean the breakthrough for EJB usage by a much wider audience than before. The main reasons for this:

broad support by the main O/R players in the industry, such as Sun, Hibernate/JBoss, Oracle and SolarMetric

the ability to use EJB 3.0 Persistence outside the context of JEE containers; this not only means that you can much easier (unit-)test EJBs, it basically means that any Java application that needs to access a database can do so using EJB 3.0. That in turn means that your application is much more portable: from outside to inside an EJB Container as well as from one O/R vendor to another. Besides, EJB 3.0 Persistence and EJB QL 3.0 add support for Native Queries: whenever the functionality in EJB QL is too limited, you have the option of stepping outside the boundaries without sacrifing most of its advantages – except for database portability

a very good and open source Reference Implementation (GlassFish)

Before diving into EJB QL in EJB 3.0, let’s take a brief look at the discussions that have surrounded EJB QL from its incarnation in EJB 2.0:

Advantages and Benefits of EJB QL

What is EJB QL to start with? Unlike an SQL-based query, an EJB QL query queries the abstract schema, not the database. When a finder method is invoked, the EJB container executes the EJB QL query associated with that method and returns references to one or more entity beans that represent data satisfying the query criteria. An example of a finder method might be one that identifies bank accounts that have an account balance that is less than a certain threshold. The EJB QL query for that finder method queries the abstract schema and returns references to one or more entity beans that represent accounts meeting the query criteria. So you do not write SQL in terms of tables and columns but a query in similar syntax in terms of objects, properties and references or relations,supporting ‘navigation’ such as in: select e from employee e where e.department.location = ‘DALLAS’.

EJB QL provides a very good way to turn query results into objects: the result of a EJB QL query is a Collection of Objects or Entities. The alternative using JDBC and PreparedStatements based on Strings with SQL statements results in ResultSet objects that the application somehow has to turn into Entities, Domain Objects or POJOs.

EJB QL is portable across Application Servers and across databases – any J2EE compliant application server had to implement EJB QL. So an application using EJB QL deployed in any of the compliant Application Servers would function okay – at least in theory.

Java Developer does not have to learn SQL to access database and he/she most certainly does not have to learn the SQL dialect of each RDBMS that the application may be deployed against. However of course, they must learn EJB QL – even if they already happen to know SQL…

An EJB Container parses and validates EJB QL queries before entity beans are deployed, so there is not really compile time checking of queries but definitely a verification before run-time, which is better than Strings with JDBC SQL statements that are only tested when the application is run. On the other hand: such Strings typically can easily be tested in SQL frontend tools such as Oracle’s SQL*Plus, so in fact they are easier to verify than EJB QL queries that can only be verified inside the EJB container. Or at least: up until EJB 2.1 that used to be the case.

Disadvantages/Limitations of EJB QL (at least before 3.0)

No native SQL so no way to leverage the SQL features of a particular database such as calling PL/SQL functions from a Query, using advanced features such as InLine Views, Scalar SubQueries and Analytical Functions. Sometimes we could work around such limitations through the use of Database Views, but typically the results could not be easily mapped to the Entities.

Since no SQL knowledge is required, queries may not be optimally efficient – how much faith do you have in the OO and Java specialists that developed your J2EE Server’s EJB container when it comes to writing generic, yet optimally performing SQL statements for your particular flavor of Database (version 8.1.7.4 of Oracle Standard Edition or version 3.23 of MySQL with InnoDB).

EJBQL up to and including 2.1 missed essential features like HAVING for testing on aggregate results in the where clause. Also: the number of functions that is supported in EJB QL 2.1 and before is relatively small, compared to the richness of SQL in popular relational databases such as MySQL, PostgreSQL and Oracle. In 2.0, EJB QL had just five functions: CONCAT(), LENGTH(), LOCATE(), SUBSTRING(), ABS(), and SQRT(). In 2.1 the MOD() function was added, along with aggregate functions: COUNT(), MAX( ), MIN( ), AVG( ) and SUM( ).EJB QL 2.0 even lacked features like the ORDER BY identifier, which becomes very handy as the application becomes large and complex. Some of the application servers may provide these features, but usage of the same may limit portability across application servers. Date and time values should be passed as millisecond value using Java primitive type, long. EJB QL 2.0 did not support fixed decimal comparison in arithmetic expressions. String and Boolean comparison is restricted to = and <>. However, the built-in functions(CONCAT, SUBSTRING, etc.) can be used to perform other operations on String. EJB QL 2.0 does not support comments.

EJB 2.1 did (or I should say does since it is still the current specification) not support Group By, Having or SubQueries.

Since EJBQL looks so much like SQL, you can hardly claim that developers do not have to learn SQL: they have to learn EJBQL which is very close to it. In fact, since most know already some SQL you can also call this a disadvantage, as developers need to learn yet another syntax. On the other hand, they do not have to learn the SQL dialect for each database their application will run against. So there are pros and cons here.

Portability across databases sounds great. However, in actual practice this is not frequently required or made use of. Most organizations tend to standardize on one or two Enterprise RDBMS’s, typically Oracle, DB2 or SQL Server, and there is very little migration between those platforms.

And of course compared to the out-of-container options of EJB 3.0, 2.1 and before are somewhat limited as they can only be used for CMP in EJB inside J2EE containers.

New and Improved in EJB QL in the 3.0 Specification

Native Queries – whose results are still returned as objects! Allow direct SQL over actual physical database schema. Very useful for some applications where database portability is not important. Allow SQL query results to be mapped into entity beans and/or instances of other Java classes.

Bulk DML – EJBQL has a new update and delete statement, so it’s not just for queries anymore. Update and delete let developers do things like, give all of your favorite customers discount or something like that, without having to load all of those customer objects into the application server’s cache, so you can do the updates directly in the database and that should be a big win for performance. Statements such as: delete from Customer cust where cust.id = 12345 and update OrderLine ol set ol.fulfilled = â€˜Yâ€™ where ol.order.id = 9876543

â€“Named parameter binding – The use of named parameters applies to EJB QL, and is not defined for native queries. Only positional parameter binding may be portably used for native queries.

â€“

Pagination control – the Query interface provides methods to set the first record to return (for example starting at record 50) and the maximum number of records to return – for example the next 25.

Projections – Multiple object types or partial objects returned from a single query. Previously EJBQL was restricted to retrieving an entire object or one column of data. The new EJBQL 3.0 specification allows developers to create projections, which is a list of arbitrary columns from one or more objects in the database. It is a powerful method for querying reporting functions which can produce subsets of objects, etc. This makes it unnecessary to retrieve an entire large object when only a partial list of fields needs to be represented. Retrieving and presenting data is now possible without the extra overhead of retrieving the entire complex object or object model. This provides a much simpler and useful method for querying complex object model data which also increases application performance at the same time. It is nice to be able to run queries that return a data structure which spans multiple tables, without having to go to multiple entities. At the moment people use fast reader patterns to bypass EJB for this task.

A projection query is basically a query that spans one or more mapped entity returning a select set of attributes. A projection could optionally reference the relational schema directly but most usages involve leveraging the mappings and defining the attributes needed and selection criteria based on these mappings.

Join Support: Explicit support for both inner and outer JOIN operations (select o from Order o left join o.lineItems li where li.amount > 100). A FETCH JOIN allows you to indicate that the related entitiesâ€”e.g., Orders for a Customerâ€”should also be fetched, though the SELECT clause only selects Customers.

Support for dynamic queries through EntityManager APIs. So in addition to static queries, EJB 3.0 supports dynamic queries, which I think is a very useful thing and I am very excited that that has been added.

Support for setting Hints on queries These hints can be passed to the specific EJB 3.0 Implementation – for example Hibernate or TopLink or perhaps even to the underlying SQL engine of the RDBMS in use. If the Hint is not recognized, it is ignored, so it should not impede portability.PolyMorphic Queries

By default, all queries are polymorphic. That is, the FROM clause of a query designates not only instances of the specific entity class(es) to which it explicitly refers, but subclasses as well. The instances returned by a query include instances of the subclasses that satisfy the query conditions. For example, the query

select avg(e.salary) from Employee e where e.salary > 80000

returns the average salary of all employees, including subtypes of Employee, such as Manager and Exempt.