Thursday, December 27, 2012

Basic access authentication is a crude authentication mechanism that's part of the HTTP standard. It allows an agent to send username/password credentials and a server to request that the agent authenticate itself. This happens in a simple but standardized way.

The mechanism can be easily implemented using Java EE's JASPIC and a sprinkle of utility code from the experimental OmniSecurity project (which is currently being discussed as one of the possible options to simplify security in Java EE 8).

Note that the JASPIC auth module as shown here is responsible for implementing the client/server interaction details. Validating the credentials (username/password here) and obtaining the username and roles is delegated to an identity store (which can e.g. be database or LDAP based).
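To make the client/server interaction concrete: the agent sends an Authorization header whose payload is base64("username:password"), and a server that wants authentication replies with status 401 and a WWW-Authenticate: Basic header. The following self-contained sketch (class and method names are made up for illustration) shows the header parsing half; in the auth module discussed here this logic would sit in validateRequest, with the actual credential check delegated to the identity store.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/**
 * Hypothetical helper illustrating the HTTP Basic authentication wire format.
 * A Basic Authorization header looks like "Basic anVzdDp0ZXN0aW5n", where the
 * payload is base64("username:password").
 */
public class BasicAuthSupport {

    /**
     * Returns {username, password} if the header is a well-formed Basic header,
     * or null when absent/malformed (the server should then challenge with
     * "WWW-Authenticate: Basic realm=..." and status 401).
     */
    public static String[] parseCredentials(String authorizationHeader) {
        if (authorizationHeader == null || !authorizationHeader.startsWith("Basic ")) {
            return null;
        }
        String decoded = new String(
            Base64.getDecoder().decode(authorizationHeader.substring("Basic ".length())),
            StandardCharsets.UTF_8);
        int colon = decoded.indexOf(':');
        if (colon < 0) {
            return null;
        }
        return new String[] { decoded.substring(0, colon), decoded.substring(colon + 1) };
    }
}
```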

Saturday, December 22, 2012

While Java EE applications could directly use the Undertow events, it's not immediately clear how to do this. Furthermore, having Undertow-specific dependencies sprinkled throughout the code of an otherwise general Java EE application is perhaps not entirely optimal.

The following code shows how the Undertow dependencies can be centralized to a single drop-in jar, by creating an Undertow extension (handler) that bridges the native Undertow events to standard CDI ones. Upon adding such jar to a Java EE application, the application code only has to know about general CDI events.
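A minimal sketch of such a bridge, assuming Undertow's ServletExtension and security notification APIs (the class names SecurityEventBridgeExtension and AuthenticatedEvent are made up here), could look as follows; the extension is picked up via a META-INF/services/io.undertow.servlet.ServletExtension entry in the drop-in jar:

```java
import javax.enterprise.inject.spi.BeanManager;
import javax.naming.InitialContext;
import javax.servlet.ServletContext;

import io.undertow.security.api.NotificationReceiver;
import io.undertow.security.api.SecurityNotification;
import io.undertow.security.idm.Account;
import io.undertow.server.HandlerWrapper;
import io.undertow.server.HttpHandler;
import io.undertow.server.HttpServerExchange;
import io.undertow.servlet.ServletExtension;
import io.undertow.servlet.api.DeploymentInfo;

// Drop-in Undertow extension that translates native security notifications
// into CDI events, so application code only has to observe AuthenticatedEvent.
public class SecurityEventBridgeExtension implements ServletExtension {

    // Plain event payload; application code observes it with @Observes AuthenticatedEvent
    public static class AuthenticatedEvent {
        private final Account account;

        public AuthenticatedEvent(Account account) {
            this.account = account;
        }

        public Account getAccount() {
            return account;
        }
    }

    @Override
    public void handleDeployment(DeploymentInfo deploymentInfo, ServletContext servletContext) {
        deploymentInfo.addInnerHandlerChainWrapper(new HandlerWrapper() {
            @Override
            public HttpHandler wrap(final HttpHandler next) {
                return new HttpHandler() {
                    @Override
                    public void handleRequest(HttpServerExchange exchange) throws Exception {
                        if (exchange.getSecurityContext() != null) {
                            exchange.getSecurityContext().registerNotificationReceiver(
                                new NotificationReceiver() {
                                    @Override
                                    public void handleNotification(SecurityNotification notification) {
                                        if (notification.getEventType() == SecurityNotification.EventType.AUTHENTICATED) {
                                            // Bridge the native Undertow event to a standard CDI event
                                            beanManager().fireEvent(new AuthenticatedEvent(notification.getAccount()));
                                        }
                                    }
                                });
                        }
                        next.handleRequest(exchange);
                    }
                };
            }
        });
    }

    private static BeanManager beanManager() {
        try {
            // Standard JNDI location of the CDI bean manager
            return (BeanManager) new InitialContext().lookup("java:comp/BeanManager");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```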

Experimenting with the above code proved that it indeed worked and it appears to be incredibly useful. Unfortunately this is now all specific to Undertow and thus only usable there and in servers that use Undertow (e.g. JBoss). It would be a real step forward for security in Java EE if it would support these simple but highly effective authentication events using a standardized API.

Wednesday, November 7, 2012

This article takes a look at the state of security support in Java EE 6, with a focus on applications that wish to do their own authentication and the usage of the JASPI/JASPIC/JSR 196 API.

Update: the further reading section has been moved to my ZEEF page about JASPIC. This contains links to articles, background, questions and answers, and more.

Declarative security is easy

In Java EE it has always been relatively straightforward to specify to which resources security constraints should be applied.

For web resources (Servlets, JSP pages, etc) there is the <security-constraint> element in web.xml, while for EJB beans there's the @RolesAllowed annotation. Via this so called 'declarative security' the programmer can specify that only a user having the given roles is allowed access to the protected web resource, or may invoke methods on the protected bean.

The declarative model has a programmatic counterpart via methods like HttpServletRequest#isUserInRole, where the same kind of role checks can be done from within code (allowing for more elaborate combinations, e.g. has role 'admin' but not has role 'manager', or if price > 5 and not has role 'manager', etc).
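For illustration (all names here are arbitrary), the declarative and programmatic variants look roughly as follows:

```java
import java.io.IOException;

import javax.annotation.security.RolesAllowed;
import javax.ejb.Stateless;
import javax.servlet.annotation.HttpConstraint;
import javax.servlet.annotation.ServletSecurity;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Declarative: only callers in the "admin" role may access this Servlet
// (the annotation equivalent of a <security-constraint> in web.xml).
@WebServlet("/protected/admin")
@ServletSecurity(@HttpConstraint(rolesAllowed = "admin"))
public class AdminServlet extends HttpServlet {

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {
        // Programmatic: more elaborate combinations from within code
        if (request.isUserInRole("admin") && !request.isUserInRole("manager")) {
            response.getWriter().write("admin, but not manager");
        }
    }
}

// Declarative security on an EJB bean
@Stateless
@RolesAllowed("admin")
class AdminService {
    public void deleteUser(String name) { /* only callable by "admin" */ }
}
```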

This is indeed straightforward and easy to use. Unfortunately, when it comes to implementing the actual authentication code (the code that actually loads a user and associated roles from some place and checks e.g. the password), things are not so simple.

How is authentication traditionally implemented?

Traditionally, Java EE simply didn't say how authentication should be done at all, which greatly confused (new) users. The idea here is that security is setup inside the application server, and is done in a vendor specific way. In addition to that, a WAR or EAR will typically also have to contain vendor specific deployment descriptors which require setting up and configuring vendor specific things, often using vendor specific terminology. For instance, some application servers require specifying something called a "domain", which then approximately but not exactly corresponds to what another server may call a "realm", "zone", or "region". Roles can also rarely just be... roles. Many servers, but not all, require you to first map them to things like a "group", "principal", or "right" (which again are all roughly the same thing).

Ignoring the terminology confusion, this model works well for the situation where externally obtained applications need to be integrated in the existing intranet of an enterprise, and where existing user accounts residing in e.g. the enterprise's LDAP server need to access those applications. Examples of such applications are things like JIRA or Sonar. In that situation, if JIRA would use the role name "admin" and your organization uses the name "administrator", it's convenient that there's a way to map between those roles.

However, for applications that are developed in-house and are solely aimed to be deployed by the same organization that developed them and which are intended for the general Internet public (i.e. your typical web app), all this mandatory mapping is completely unnecessary.

Having the security setup inside the application server is an abstraction that only gets in the way if there is only ever one application deployed to that application server. Worse, because the functionality to create a user account is typically an integrated part of the above mentioned web applications, being forced to set up security outside the application really doesn't work nicely. Among other things, it prevents the application from easily using its own domain models for the authentication process (like e.g. a JPA entity User). There are some popular workarounds for this, like login modules that allow one to directly query a user and its roles from the same database that the application is using, but these are inelegant at best and require the details of how the User entity is persisted to reside in two places.

Heavyweight

The fact that the authentication mechanism is vendor specific doesn't just hurt the portability of Java EE applications, it also hurts learning about Java EE. Namely, in order to secure a Java EE application, you can't just study Java EE books and tutorials, but you also have to learn e.g. JBoss, or GlassFish. Especially for lesser known application servers, it can be very frustrating to dig up that information. Essentially it makes Java EE developers less able to move between jobs and makes it harder for companies to hire experienced employees.

All of this unfortunately seems to add to the feeling that Java EE is heavyweight, a reputation that Sun, now Oracle and partners, have been trying hard to shake off. Indeed, a technology like EJB has been massively slimmed down, among other things by simply not forcing certain restrictions upon users and by having smart defaults. Yes, (business) interfaces for services and separating business code into its own layer may be a best practice in some situations, but it's a choice users should make, and in EJB 3.1 this choice was finally given to the user.

What about JAAS?

A common mistake is to think that JAAS (Java Authentication and Authorization Service) is the standardized and portable API that can be used to take care of authentication in Java EE without having to resort to vendor specific APIs.

Unfortunately this is not the case. JAAS goes a long way in introducing a basic set of security primitives and overall establishing a very comprehensive security framework, but one thing it doesn't have knowledge about is how to integrate with a Java EE container. Practically this means that JAAS has no way of communicating a successful authentication to the container. A user may be logged-in to some JAAS module, but Java EE will be totally unaware of this fact. The reverse is also true; when a protected resource is accessed by the user, or when an explicit login is triggered via the Servlet 3 HttpServletRequest#login method, the container has no notion of which JAAS login module should be called. Finally, there is a mismatch between the very general JAAS concept of a so-called Subject having a bag of Principals and the Java EE notion of a caller principal and a collection of roles. For further reading about this particular subject, Raymond Ng wrote an excellent article a few years ago.

Nearly all vendor specific authentication mechanisms are in fact based on JAAS, but each vendor has taken its own approach to implementing the container integration, how to map the above mentioned caller principal and roles to the JAAS Subject, and how to let a user install and specify which authentication modules should be used for a given application (or domain, realm, zone, etc).

JASPIC to the rescue... sort of

In actuality, the idea that there should be an API in Java EE that standardized the above mentioned integration already existed a long time ago, in 2002 to be precise, when the JASPIC JSR (JSR 196) was created. For some reason or other, it took a very long time for this JSR to be completed, and it wasn't included in Java EE until Java EE 6 (2009).

JASPIC finally standardizes how an authentication module is integrated into a Java EE container. However, it's not without its problems and has a few quirks.

Probably in order to maintain compatibility with the existing ways that containers use the JAAS Subject, JASPIC did not specify which parts of this Subject correspond to the caller principal and roles. Instead, it uses a trick involving a so-called callback handler. This works in two steps. First, JASPIC introduces two types (called callbacks) that do contain this information in clearly specified fields: CallerPrincipalCallback and GroupPrincipalCallback. Secondly, the authentication module is given a handler implementation that reads the data from those two types and then stores it in a container-specific way in the JAAS Subject. It's a bit convoluted, but it does do the trick.
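In code, the two-step trick boils down to a fragment like the following, where handler is the container-provided callback handler just mentioned (the caller name "test" and role "architect" follow this article's running example; the surrounding class is illustrative):

```java
import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.message.callback.CallerPrincipalCallback;
import javax.security.auth.message.callback.GroupPrincipalCallback;

public class CallbackExample {

    // Communicate the authenticated caller "test" with role "architect" to the
    // container; the container-provided handler stores both in the clientSubject
    // in whatever container-specific way it likes.
    static void notifyContainer(CallbackHandler handler, Subject clientSubject) throws Exception {
        handler.handle(new Callback[] {
            new CallerPrincipalCallback(clientSubject, "test"),
            new GroupPrincipalCallback(clientSubject, new String[] { "architect" })
        });
    }
}
```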

Another strange aspect of JASPIC is its name. Seemingly people can't agree on whether it should be JASPIC or JASPI. Important vendors like JBoss and IBM call it "JASPI" in e.g. documentation and package names of source code. Oracle calls it "JASPIC". It's a small thing perhaps, but terminology is important, and even though the difference is just one letter, it makes searching more difficult since many search engines emphasize full words. In e.g. JIRA and on Google I found that searching for just "JASPI" did not always give me the results that "JASPIC" would give me. And although it has now mostly faded away, JASPIC was once known by yet another name: JMAC. It looks like a rather different name, but it's an abbreviation for "Java Message Authentication SPI for Container(s)", which is almost identical to the current "Java Authentication SPI for Containers". The term "jmac" is still used in the GlassFish source code, which has apparently not been refactored after the name change.

For something that was added to Java EE 6, it really feels out of place that JASPIC is still limited to the Java 1.4 syntax. From Servlet to JSF to EJB and JPA, pretty much everything has adopted at least the Java 5 syntax. The JASPIC 1.0mr1 spec does mention this issue, but merely states that "There is a requirement that the SPI be used in J2SE 1.4 environments". -Why- this requirement is there, and why it holds for JASPIC but not for most of the other specifications in Java EE 6, is however not really clear.

A serious problem at the moment is the fact that adoption of JASPIC by vendors has been slow. JASPIC may be mandated for a Java EE 6 implementation, but only for the full profile. This means the very important web profile (with implementations like TomEE and Resin) does not need to implement it (not even the Servlet Container Profile, which is a subset of the full JASPIC spec). Web profile implementations do need to implement authentication modules (since security is a mandatory part of both Servlet and EJB-lite), but they have chosen to implement those using their own APIs. This is worrying. JASPIC isn't about something that web profile apps don't need, but is about doing something they need in a specific way. Perhaps this way is not yet good enough or not yet mature enough, or maybe there is just too much investment in proprietary solutions and the advantages of JASPIC are not seen as compelling enough, since otherwise those web profile implementations might have adopted JASPIC of their own accord by now, without needing to be forced to implement it. (Tomcat in particular does ship with a JAASRealm (javadoc), which it says is an early prototype of JASPIC that was probably created somewhere around 2004.) UPDATE: In 2016, Tomcat 9 M4 implemented JASPIC.

Full profile implementations have implemented JASPIC of course, but most present it as a secondary option; for those use cases where a user happens to have a JASPIC authentication module that needs to be used. For "normal" security, the vendors' proprietary solutions are still being presented as the primary solution. As a result, various JASPIC implementations are a little buggy. This is however not a rare situation. The story is rather similar for the standardized embedded data source that was introduced in Java EE 6 (@DataSource or data-source in web.xml). Initially adoption of this was rather slow and most if not all vendors kept plugging their own proprietary ways of defining data sources. Lately the situation has improved somewhat, but perhaps the Java EE certification process should be a bit stricter here.

Then there's the question of how to tell a container to use a particular authentication module for a particular application or perhaps for the entire server. In order to do this there are typically a number of options; via some kind of admin UI or console offered by the application server, declarative via configuration files or annotations (where those configuration files can either reside inside a WAR/EAR or inside the server itself), or programmatically via some API.

Unfortunately, the only method that JASPIC standardized is the programmatic option. And this programmatic option seems to be aimed more at vendors needing an internal API to register modules than at user code registering its own at startup. So in practice the already ill-advertised and sometimes buggy standardized method appears to be not that standard at all. The JASPIC documentation of all vendors encourages the user to install the authentication module inside the server and to create or edit proprietary configuration files, and, as if that isn't insulting enough to a developer, often requires interacting with a graphical UI as well. Clearly such documentation is aimed at system administrators setting up the kind of traditional servers that are used to run externally obtained applications that need to integrate with the existing infrastructure. Developers creating applications that need to manage their own users are largely left in the cold here. (The GlassFish developer documentation mentions the programmatic option, but doesn't go into detail about how that exactly works.)

Programmatically registering JASPIC auth modules

Nevertheless, the programmatic API is sort of useable for applications to register their own internal authentication module. Of the 5 servers that I tested; JBoss EAP 6.0 (JBoss AS 7.1.2.Final-redhat-1), GlassFish 3.1.2.2, WebLogic 12c, Geronimo v3 and WebSphere 8.5, only Geronimo seemed to have overlooked the possibility of web apps registering their own authentication module. For the other servers it did more or less work, but it's striking that even when using programmatic registration, none of them could actually do their job without a vendor-specific deployment descriptor being present (which is something related to the general concept of security in Java EE and not a specific fault of JASPIC).

The first hurdle when attempting to use JASPIC for programmatically registering just an authentication module is the fact that there isn't a convenience API to do just that. Instead there's something that's essentially a factory-factory-factory for a delegator to an actual authentication module. That's right, it's a quadruple indirection. Useful and flexible for those situations that require it, no doubt, but more than a little intimidating for novice JASPIC users.

Another hurdle is that the initial factory used for registering the factory-factory requires an "appContext" identifier. This identifier is specified to be either null, or be composed of the pattern [hostname] [space] [context path]. When the identifier is null, the registration is for all (web) applications, otherwise it's only for a specific one. Clearly when an application registers its own internal authentication module the latter form is needed. The problem is that this "hostname" part is not that easy to guess when doing programmatic registration at startup time. It's further defined as being a "logical host", but how does an app know what its own logical host is? The situation is further complicated by the fact that all servers except JBoss EAP just use a constant here, which is simply "server" in the case of GlassFish, Geronimo and WebLogic and "default_host" in the case of WebSphere. JBoss EAP however uses ServletRequest#getLocalName here, which is a value that's only available during request processing and not during startup time. It seems likely that if internal application server code is doing both the registration and the subsequent lookups, this is not really a problem. The AS itself knows which key it used for registration and can easily use the same one for lookups later. But when user code needs to do a registration independent of the application server that later on does the lookup, this becomes a problem. Maybe JBoss has interpreted the spec wrongly and the logical host should really be the constant "server", but then the spec needs to be clarified here. If it really should be a logical host of some kind, then there also needs to be a way to express that the application doesn't care about this (for example by specifying "*" as a kind of bind-all). As it stands, the situation is highly confusing. UPDATE: In JASPIC 1.1 this problem has been solved.

Sample code

The code below shows how to programmatically register a sample JASPIC authentication module. The module itself will be as simple as can be, and always just "returns" a user with a name and one role.

Step 1 - Registering via the factory-factory-factory

We first obtain a reference to the factory-factory-factory (AuthConfigFactory), which we use to register our own factory-factory (shown highlighted). We need to specify for which layer we're doing the registration, which needs to be the constant "HttpServlet" for the Servlet Container Profile. For this example we evade the problems with the appContext and provide a null, which means we're doing the registration for all applications running on the server.
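A sketch of this registration, here done from a hypothetical ServletContextListener at startup (TestAuthConfigProvider is the factory-factory from the next step; the name is taken from this article's own code, the rest is illustrative):

```java
import javax.security.auth.message.config.AuthConfigFactory;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import javax.servlet.annotation.WebListener;

// Hypothetical startup listener that registers our factory-factory
// when the web application deploys.
@WebListener
public class SamRegistrationListener implements ServletContextListener {

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        // The factory-factory-factory
        AuthConfigFactory factory = AuthConfigFactory.getFactory();

        // Layer "HttpServlet" = the Servlet Container Profile; a null
        // appContext means the registration is for all applications.
        factory.registerConfigProvider(
            new TestAuthConfigProvider(), "HttpServlet", null,
            "Test authentication config provider");
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
    }
}
```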

Step 2 - Implementing the factory-factory

In the next step we look at the factory-factory that we registered above, which is an implementation of AuthConfigProvider. This factory-factory has a required constructor that we must implement. The implementation seemed trivial; we need to do a self-registration. As I wasn't sure where some of the parameters had to be obtained from, I used my good friend null again.

The real meat of this class is in the getServerAuthConfig method (shown highlighted), which simply has to return a factory. The flexibility that this factory-factory offers is the ability to create factories with a given handler, or, when this is null, the chance for the factory-factory to create a default handler of some sort. There's also a refresh method, which I believe asks the factory-factory to update all factories it has created if needed. It's only for dynamic factory-factories though, so I left it unimplemented.
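A minimal sketch of such an AuthConfigProvider (the class names TestAuthConfigProvider and TestServerAuthConfig appear in this article's own stack trace; the rest is illustrative):

```java
import java.util.Map;

import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.message.AuthException;
import javax.security.auth.message.config.AuthConfigFactory;
import javax.security.auth.message.config.AuthConfigProvider;
import javax.security.auth.message.config.ClientAuthConfig;
import javax.security.auth.message.config.ServerAuthConfig;

public class TestAuthConfigProvider implements AuthConfigProvider {

    public TestAuthConfigProvider() {
    }

    // Required constructor; when the factory instantiates the provider it
    // passes a Map of properties, and the provider self-registers.
    public TestAuthConfigProvider(Map properties, AuthConfigFactory factory) {
        if (factory != null) {
            factory.registerConfigProvider(this, null, null, "Self-registration");
        }
    }

    // The real meat: return our factory (a ServerAuthConfig implementation)
    @Override
    public ServerAuthConfig getServerAuthConfig(String layer, String appContext,
            CallbackHandler handler) throws AuthException {
        return new TestServerAuthConfig(layer, appContext, handler);
    }

    @Override
    public ClientAuthConfig getClientAuthConfig(String layer, String appContext,
            CallbackHandler handler) throws AuthException {
        return null; // not used in the Servlet Container Profile
    }

    @Override
    public void refresh() {
        // only needed for dynamic factory-factories
    }
}
```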

Step 3 - Implementing the factory

The factory that we returned in the previous step is an implementation of ServerAuthConfig. Its main functionality is creating instances of delegators for the authentication module (shown highlighted).

In our case the factory functionality is very simple; it just creates a new instance of the delegator, passing only the handler through. The factories that are provided by the application servers themselves typically read in and process the proprietary configuration files here.
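A sketch of the factory together with the delegator it creates (TestServerAuthConfig is the name from this article's stack trace; TestServerAuthContext and TestServerAuthModule are illustrative names):

```java
import java.util.Collections;
import java.util.Map;

import javax.security.auth.Subject;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.message.AuthException;
import javax.security.auth.message.AuthStatus;
import javax.security.auth.message.MessageInfo;
import javax.security.auth.message.config.ServerAuthConfig;
import javax.security.auth.message.config.ServerAuthContext;
import javax.security.auth.message.module.ServerAuthModule;

public class TestServerAuthConfig implements ServerAuthConfig {

    private final String layer;
    private final String appContext;
    private final CallbackHandler handler;

    public TestServerAuthConfig(String layer, String appContext, CallbackHandler handler) {
        this.layer = layer;
        this.appContext = appContext;
        this.handler = handler;
    }

    // Main functionality: create an instance of the delegator, passing only
    // the handler through. Server-provided factories typically read in and
    // process their proprietary configuration files here instead.
    @Override
    public ServerAuthContext getAuthContext(String authContextID, Subject serviceSubject,
            Map properties) throws AuthException {
        return new TestServerAuthContext(handler);
    }

    @Override
    public String getMessageLayer() {
        return layer;
    }

    @Override
    public String getAppContext() {
        return appContext;
    }

    @Override
    public String getAuthContextID(MessageInfo messageInfo) {
        return appContext;
    }

    @Override
    public void refresh() {
    }

    @Override
    public boolean isProtected() {
        return false;
    }
}

// The delegator: a thin ServerAuthContext wrapper around the actual module.
class TestServerAuthContext implements ServerAuthContext {

    private final ServerAuthModule serverAuthModule;

    TestServerAuthContext(CallbackHandler handler) throws AuthException {
        serverAuthModule = new TestServerAuthModule();
        serverAuthModule.initialize(null, null, handler, Collections.EMPTY_MAP);
    }

    @Override
    public AuthStatus validateRequest(MessageInfo messageInfo, Subject clientSubject,
            Subject serviceSubject) throws AuthException {
        return serverAuthModule.validateRequest(messageInfo, clientSubject, serviceSubject);
    }

    @Override
    public AuthStatus secureResponse(MessageInfo messageInfo, Subject serviceSubject)
            throws AuthException {
        return serverAuthModule.secureResponse(messageInfo, serviceSubject);
    }

    @Override
    public void cleanSubject(MessageInfo messageInfo, Subject subject) throws AuthException {
        serverAuthModule.cleanSubject(messageInfo, subject);
    }
}
```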

I observed an interesting difference here between Geronimo and the other servers tested; Geronimo calls the getAuthContext method twice per request, while the others only do so once.

Step 5 - Implementing the authentication module

At long last, we finally get to implement our authentication module, which is an instance of ServerAuthModule. With respect to the API, it's interesting to note that this time around there's an initialize method present instead of a mandatory constructor.

As mentioned before, we don't do an actual authentication but just "install" the caller principal and a role into the JAAS Subject. For this example, getSupportedMessageTypes actually doesn't need to be implemented since it's only called by the delegator that encapsulates it. Since we own that delegator, we know it's not going to call this method. For completeness though I implemented it anyway to be compliant with the Servlet Container Profile.
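A minimal sketch of such a module (the fixed caller "test" and role "architect" follow this article's architect-to-architect mapping example; the class name TestServerAuthModule is illustrative):

```java
import java.util.Map;

import javax.security.auth.Subject;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.message.AuthException;
import javax.security.auth.message.AuthStatus;
import javax.security.auth.message.MessageInfo;
import javax.security.auth.message.MessagePolicy;
import javax.security.auth.message.callback.CallerPrincipalCallback;
import javax.security.auth.message.callback.GroupPrincipalCallback;
import javax.security.auth.message.module.ServerAuthModule;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class TestServerAuthModule implements ServerAuthModule {

    private CallbackHandler handler;

    // Note: an initialize method instead of a mandatory constructor
    @Override
    public void initialize(MessagePolicy requestPolicy, MessagePolicy responsePolicy,
            CallbackHandler handler, Map options) throws AuthException {
        this.handler = handler;
    }

    // No real credential check; just "install" a fixed caller principal and
    // one role into the Subject via the container-provided handler.
    @Override
    public AuthStatus validateRequest(MessageInfo messageInfo, Subject clientSubject,
            Subject serviceSubject) throws AuthException {
        try {
            handler.handle(new Callback[] {
                new CallerPrincipalCallback(clientSubject, "test"),
                new GroupPrincipalCallback(clientSubject, new String[] { "architect" })
            });
        } catch (Exception e) {
            throw (AuthException) new AuthException().initCause(e);
        }
        return AuthStatus.SUCCESS;
    }

    // WebLogic insists on seeing SEND_SUCCESS returned here
    @Override
    public AuthStatus secureResponse(MessageInfo messageInfo, Subject serviceSubject)
            throws AuthException {
        return AuthStatus.SEND_SUCCESS;
    }

    @Override
    public void cleanSubject(MessageInfo messageInfo, Subject subject) throws AuthException {
        if (subject != null) {
            subject.getPrincipals().clear();
        }
    }

    @Override
    public Class[] getSupportedMessageTypes() {
        return new Class[] { HttpServletRequest.class, HttpServletResponse.class };
    }
}
```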

Interesting to note is that secureResponse was treated differently by most servers. Only WebLogic and Geronimo call this method, but where WebLogic insists on seeing SEND_SUCCESS returned, Geronimo just ignores the return value. Its class org.apache.geronimo.tomcat.security.SecurityValve contains the following code fragment:

// This returns a success code but I'm not sure what to do with it.
authenticator.secureResponse(request, response, authResult);

Another difference for this same secureResponse method, is that WebLogic calls it before a protected resource (e.g. Servlet) is called, while Geronimo does so after.

Step 7 - Setting up the mandatory proprietary descriptors

A very unfortunate and nasty step is that we -have- to set up proprietary deployment descriptors for each container.

The majority of them (all, except JBoss EAP) don't directly accept the roles that our authentication module puts into the JAAS Subject, but force us to map them. This necessitates a rather silly and pointless mapping where every time we map architect to architect. This will be extra painful when we are building an application that uses, say, 20 roles and we want to support those 3 servers out of the box. It will mean no fewer than 60 completely pointless mapping directives have to be added :(

Two servers require us to specify something that JBoss calls a domain, but Geronimo calls a security realm. The idea behind this concept is that it's a kind of alias for a whole slew of security configuration options (typically which authentication modules should be used). Of course, if we're registering our own authentication modules programmatically this is rather pointless as well. JBoss actually seems to want us to modify a file called domain.xml inside the JBoss installation directory (a horror for portable apps that take care of their own security configuration), but luckily there's already a domain defined there by default that we can use. The problem with these default things in JBoss is that JBoss does like to change them on a whim between (major) releases. Today I found a domain called "other" to be usable, but unfortunately I know from experience this might have another name in the next release.

Two servers, Geronimo and JBoss, also needed extra configuration to work around bugs. In the case of JBoss this configuration was needed because, despite being Java EE 6 certified, JBoss seemingly does not want to make JASPIC available by default; the user has to explicitly activate it. In the case of Geronimo, it was required to specify something called a moduleId, or otherwise ClassNotFoundExceptions would be thrown:

java.lang.NoClassDefFoundError: jaspic/TestServerAuthConfig
at jaspi.TestAuthConfigProvider.getServerAuthConfig(TestAuthConfigProvider.java:50)
at org.apache.geronimo.tomcat.BaseGeronimoContextConfig.configureSecurity(BaseGeronimoContextConfig.java:177)
at org.apache.geronimo.tomcat.WebContextConfig.authenticatorConfig(WebContextConfig.java:51)
at org.apache.geronimo.tomcat.BaseGeronimoContextConfig.configureStart(BaseGeronimoContextConfig.java:116)

WebSphere 8.5 was particularly troublesome here. The example application is a WAR, but the file in which the role mapping had to be done could only reside in an EAR. So, specifically for WebSphere, an extra wrapping EAR had to be created. Even more troublesome was that with WebSphere, security itself first had to be activated in a graphical admin console (by default at https://localhost:9043/ibm/console). It's a well known caveat. After security was activated, JASPIC had to be separately activated as well. There seemed to be an option for this in the proprietary deployment descriptor, but unfortunately this didn't work. Likely this option is there to register a SAM declaratively, and it doesn't do anything without this SAM being given. More precisely, the two settings that needed to be changed via the admin console are:

After that and adding the role mapping, authentication kept failing. The only thing that was logged was: "SECJ0056E: Authentication failed for reason ", which is a rather poor problem description. After hours of searching, the proprietary alternative to the JASPIC callback handlers hinted at a solution. Namely, most JASPIC handlers are just wrappers around whatever proprietary mechanism the server has or had in place before JASPIC. In this case, the alternative solution asked to "get a unique user id" from some "registry". But how does WebSphere know about these users? As it appeared; creating them via the Admin Console again:

After this, authentication finally succeeded. However, it's of course not workable to manually add all groups and especially all users to the admin console. How would this even work when users register themselves via the web? Hopefully there's an option somewhere to disable this, but I haven't found it yet.

When it comes to proprietary stuff, WebSphere was clearly the worst offender. Having to muck around with a GUI before the app can run is just not tolerable for the kind of application we're trying to build here. But Geronimo was not innocent either. As it stands, Geronimo requires both the "security realm" thing and the role mapping to be specified, as well as some gibberish for working around what seems to be a bug.

It's amazing really how the exact same nonsense mapping of architect to architect can be expressed in so many nearly identical but still different ways.

Step 8 - Implementing a test Servlet

In order to test that a request is getting authenticated, we also need an actual resource. For this I used a simple Servlet that just prints the name of the caller principal. Note that should the authentication module fail to put a caller principal into the JAAS Subject, this will result in a NullPointerException.
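A minimal version of such a test Servlet might look like this (the URL pattern is arbitrary; the path must of course be covered by a security constraint for the authentication to be triggered):

```java
import java.io.IOException;

import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Prints the name of the caller principal; note this throws a
// NullPointerException if the authentication module failed to install one.
@WebServlet("/protected/test")
public class TestServlet extends HttpServlet {

    private static final long serialVersionUID = 1L;

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {
        response.getWriter().write("Caller: " + request.getUserPrincipal().getName());
    }
}
```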

Step 9 - Working around bugs

Of the 4 servers tested, 2 of them have severe bugs that make the sample authentication module and programmatic registration as shown above unusable.

JBoss AS 7.1.1 and JBoss EAP 6 ignore the GroupPrincipalCallback, which makes it impossible to assign any roles. The way the code is set up, a not yet mentioned callback, the PasswordValidationCallback, happens to be required, even though the JASPIC spec does not require this one to be used at all. I reported this issue in June 2012 along with a proposal for a fix. Since then, the issue has been cloned, and the patch I proposed was committed around mid-October of that year. Unfortunately, it wasn't included in JBoss EAP 6.0.1/JBoss AS 7.1.3.Final-redhat-4 that was released the following December, but instead is slated for JBoss AS 7.2 and JBoss AS 7.1.4. It might still take a considerable amount of time before either of those two is released. Since the bug appears in a Tomcat Valve, which is the class we explicitly reference in jboss-web.xml, it's relatively easy to patch ourselves.

Geronimo v3.0 needs the extra gibberish in geronimo-web.xml, in order to prevent various class not found exceptions. Unfortunately, JASPIC authentication still doesn't work after that. It seems that if a web application registers a JASPIC authentication module, then this registration doesn't take effect for that application itself. In order to make this work we need to start up Geronimo with the app in question deployed, then undeploy the app while the server is still running and immediately deploy it again. After this sequence JASPIC authentication works correctly. An issue for this has been created at https://issues.apache.org/jira/browse/GERONIMO-6423

Step 10 - Taking behavioral differences into account

With respect to the life-cycle of an authentication module and interaction with the rest of the Java platform, no two servers of the ones tested behaved exactly the same.

For all application servers, the authentication module was invoked when a protected resource (the TestServlet from our example) was invoked. This is a good thing, otherwise JASPIC wouldn't be working at all. However, there was no universal agreement on what to do with non-protected resources. JBoss EAP didn't call the SAM in this case, but all other servers did. After an initial successful authentication (e.g. request.getUserPrincipal() subsequently returns a non-null value during the same request), the behavior differed with respect to the follow-up request. JBoss EAP would remember the full authentication, and would not call the SAM again until either the session expired or an explicit call to request#logout was made. All other servers did call the SAM again. If we didn't re-authenticate, WebLogic would still remember the principal (request.getUserPrincipal() would return the one for which we authenticated), but accessing protected resources for which the authenticated principal has the correct roles was still not allowed. GlassFish and Geronimo both didn't remember a single thing.

As for accessing environmental Java EE resources, in both JBoss EAP and Geronimo it was possible to request the CDI bean manager from the standardized "java:comp/" JNDI namespace. GlassFish and WebLogic would throw binding exceptions here. When the SAM was called at the initial point during a request (before Servlet Filters are invoked) then in JBoss EAP the CDI request and session scope were already active. In GlassFish the scope seemed to be active, but when requesting a bean reference (after obtaining the bean manager via a globally accessible EJB), a scary warning was logged: "SEVERE: No valid EE environment for injection of ...". In WebLogic the mentioned contexts definitely weren't active and context not active exceptions were thrown. Geronimo was hard to test at this point, since the SAM seemingly runs in a different class loader. Things changed when request#authenticate was called from e.g. a JSF managed bean. In that case the SAM was invoked, and for most servers the CDI scopes simply remained active. Judging from the call stack between the authenticate() call and the invocation of the SAM, the CDI scopes are most likely also still active for Geronimo, but because of the class loader issues this was again hard to test.

Which brings us to our last point; for all servers the SAM that was embedded and installed by the application would run with the same class loaders as said application, except for Geronimo. Remembering that we needed the trick with the deploy/undeploy/deploy cycle, this perhaps doesn't come as a surprise.

To summarize, with respect to calls to the validateRequest() method of an authentication module, the following differences were observed:

Source code

Update

As of August 2015, the situation regarding implementation differences and bugs has considerably improved. requestDispatcher#forward and request#logout are now mandated by the spec to be supported, and wrapping the request (and response), which at the time this article was written didn't work with a single server, now works everywhere. Furthermore, WebLogic doesn't require the mandatory role mapping anymore. In 2016, Payara, JBoss/WildFly and Liberty were re-tested. GlassFish and WebLogic were additionally re-tested at the end of 2015.

Conclusion

JASPIC is one of those things that should have been there relatively early (e.g. for J2EE 1.4, had the original timeline held). By now it could have had its ease-of-use treatment in Java EE 5 and subsequent tuning in Java EE 6. Vendors then might not have had the ~10 years' worth of their own proprietary technology in place, which is perhaps currently one of the reasons not all of them are embracing JASPIC beyond what the spec mandates.

Originally a JASPIC 1.1 seemed to have been planned for Java EE 6, but eventually this turned into a smaller maintenance release. Given the various issues outlined above, a true JASPIC 1.1 for Java EE 7 would still be very welcome, but as Java EE 7 is nearing completion and, to the best of my knowledge, no such work has been started, the chance that we'll see any improvements in the short term is slim.

As it stands, JASPIC is not very well known among users and not universally embraced by vendors. Some users who do know JASPIC find it a "little technical". Where unfortunately some vendors go as far as to call it "bloated", other vendors are waiting for more "widespread adoption" before fully embracing it (a kind of chicken-and-egg problem).

Despite all this doom and gloom, the fact is that JASPIC -is- here and it really does offer a good portable way to integrate with container authentication. The bugs that are currently present in some implementations can of course be fixed, and since the API is standardized there's nothing stopping a third party library from offering some convenience utilities that make things a little easier for 'casual' users (like we also see for e.g. JSF and JPA).

Wednesday, August 1, 2012

CDI has the well known concept of producers. Simply put, a producer is a kind of general factory method for some type. It's defined by annotating a method with @Produces. An alternative "factory" for a type is simply a class itself; a class is a factory of objects of its own type.

In CDI both these factories are represented by the Bean type. The name may be somewhat confusing, but a Bean in CDI is thus not directly a bean itself but a type used to create instances (aka a factory). An interesting aspect of CDI is that those Bean instances are not just internally created by CDI after encountering class definitions and producer methods, but can be added manually by user code as well.

Via this mechanism we can thus dynamically register factories, or in CDI terms producers. This can be handy in a variety of cases, for instance when a lot of similar producer methods would have to be defined statically, or when generic producers are needed. Unfortunately, generics are not particularly well supported in CDI. Instead of trying to create a somewhat generic producer an alternative strategy could be to actually scan which types an application is using and then dynamically create a producer for each such type.

The following code gives a very bare bones example using the plain CDI API:
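The example listing itself appears to be missing from this page. Based on the discussion that follows, a bare bones dynamic producer could look roughly like the sketch below (the class name DynamicIntegerProducer is made up, and the CDI 1.x javax.enterprise API is assumed):

```java
import java.lang.annotation.Annotation;
import java.lang.reflect.Type;
import java.util.Collections;
import java.util.Set;

import javax.enterprise.context.Dependent;
import javax.enterprise.context.spi.CreationalContext;
import javax.enterprise.inject.Default;
import javax.enterprise.inject.spi.Bean;
import javax.enterprise.inject.spi.InjectionPoint;
import javax.enterprise.util.AnnotationLiteral;

public class DynamicIntegerProducer implements Bean<Integer> {

    @Override
    public Integer create(CreationalContext<Integer> creationalContext) {
        // Nothing fancy, just return a new Integer (not a good idea
        // normally, but it's just an example).
        return new Integer(5);
    }

    @Override
    public Set<Type> getTypes() {
        // The range of types for which this dynamic producer produces
        // instances; CDI wants this to be defined explicitly.
        return Collections.<Type>singleton(Integer.class);
    }

    @Override
    @SuppressWarnings("all")
    public Set<Annotation> getQualifiers() {
        // The implicit Default qualifier has to be returned explicitly here.
        return Collections.<Annotation>singleton(new AnnotationLiteral<Default>() {});
    }

    @Override
    public Class<? extends Annotation> getScope() {
        // Can't return null; return the CDI default scope explicitly.
        return Dependent.class;
    }

    @Override
    public Set<Class<? extends Annotation>> getStereotypes() {
        // Can't return null either; an empty set means "none".
        return Collections.emptySet();
    }

    @Override
    public Set<InjectionPoint> getInjectionPoints() {
        return Collections.emptySet();
    }

    @Override
    public Class<?> getBeanClass() {
        return Integer.class;
    }

    @Override
    public String getName() {
        return null; // the only method that may return null
    }

    @Override
    public boolean isAlternative() {
        return false;
    }

    @Override
    public boolean isNullable() {
        return false;
    }

    @Override
    public void destroy(Integer instance, CreationalContext<Integer> creationalContext) {
        // Nothing to clean up for a plain Integer.
    }
}
```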

There are a few things to remark here. First of all, the actual producer method is create. This one does nothing fancy and just returns a new Integer instance (normally not a good idea to do it this way, but it's just an example). The getTypes method is used to indicate the range of types for which this dynamic producer produces instances. In this example it could have been deduced from the generic class parameter as well, but CDI still wants it to be defined explicitly.

The getQualifiers method is somewhat nasty. Normally, if no explicit qualifiers are used in CDI, the Default one applies. This default however doesn't seem to be implemented in the core CDI system, but by virtue of what this method returns. In our case it means we have to explicitly return the default qualifier here via an AnnotationLiteral instance. These are a tad nasty to create, as they require a new class definition that extends AnnotationLiteral, and the actual annotation needs to be present both as a (super) interface AND as a generic parameter. To add insult to injury, Eclipse in particular doesn't like us doing this (even though it's the documented approach in the CDI documentation) and complains loudly about it. We silenced Eclipse here by using the @SuppressWarnings("all") annotation. To make the code even more nasty, due to the way generics and type inference work in Java we have to add an explicit cast here (alternatively we could have used Collections.<Annotation>singleton).

For the scope we can't return a null either, but have to return the CDI default explicitly if we want that default. This time it's an easy return. For the stereotypes we can't return a null if we don't use them, but have to return an empty set. The isNullable method (deprecated since CDI 1.1) can return false. Finally, getName is the only method that can return a null.

Dynamic producers like this have to be added via a CDI extension observing the AfterBeanDiscovery event:
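The extension listing also seems to be missing here. A minimal sketch (assuming the dynamic producer Bean implementation from above is called DynamicIntegerProducer) could look as follows; such an extension additionally has to be registered in a META-INF/services/javax.enterprise.inject.spi.Extension file:

```java
import javax.enterprise.event.Observes;
import javax.enterprise.inject.spi.AfterBeanDiscovery;
import javax.enterprise.inject.spi.BeanManager;
import javax.enterprise.inject.spi.Extension;

public class DynamicProducerExtension implements Extension {

    // Called by CDI after bean discovery has finished; at this point
    // extra beans (our dynamic producers) can still be registered.
    public void afterBeanDiscovery(@Observes AfterBeanDiscovery afterBeanDiscovery, BeanManager beanManager) {
        afterBeanDiscovery.addBean(new DynamicIntegerProducer());
    }
}
```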

Sunday, June 24, 2012

Despite being almost ten years old, the JPA specification to this day has rather poor support for basic paging/sorting/filtering. Paging/sorting/filtering is used in a lot of (CRUD) applications where the result from a query is shown in a table, and where the user can scroll through the results one page at a time, and where this result can be sorted by clicking on any of the table column headers.

In order to support this a number of things are generally needed:

The total number of rows (or entities) in the full result must be known

There should be support for an offset in the full result and a limit for the number of rows that will be obtained

The column (attribute) on which to sort must be dynamically added to the query

Search expressions must be dynamically added to the query

As it turns out, only offset/limit is directly supported in JPA. A sorting column can only be added dynamically when using the overly verbose and hard to work with Criteria API. Search expressions are somewhat possible to add via the Criteria API as well, but it's an awkward and rather poor mechanism.

Surprisingly, universally counting the number of rows is not possible at all in JPA. In this article we'll look at a very hairy workaround for this using Hibernate specific code.

Strange as it may seem, this query is uncountable in JPA, while in SQL this is usually not a problem. So what we could do is generate the corresponding SQL query, surround it by an outer count(*) query and then execute that.

But here we hit another wall. While by definition every JPA implementation must be able to generate SQL from a JPA query, there's no actual standard API to get just this query text.

Now one particular aspect of JPA is that it's almost never a pure implementation (such as e.g. JSF), but a standardization API layered on top of another API. This other API is typically richer. In the case of Hibernate there indeed appears to be a public API available to do the transformation that we need, including handling query parameters (if any).

To demonstrate this, let's first create the Query object in Java. Here we assume that the JPQL query shown above is available as a query named "Statistic.perDate":
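The listing is missing here, but based on the text it would essentially be the following (the Statistic result type and the entityManager variable are assumptions):

```java
// Create a JPA query instance from the named query "Statistic.perDate"
TypedQuery<Statistic> typedQuery = entityManager.createNamedQuery("Statistic.perDate", Statistic.class);
```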

From this typed query we can obtain the Hibernate Query, and from that get the query string. This query string always represents the JPQL (technically, HQL) independent of whether the query was created from JPQL or from a Criteria:
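In code this could look roughly as follows (typedQuery being the JPA query created from the named query, and org.hibernate.Query being the classic, pre-Hibernate 5 native query API):

```java
// Unwrap the native Hibernate query from the JPA one, and obtain the
// JPQL (HQL) query text from it
org.hibernate.Query hibernateQuery = typedQuery.unwrap(org.hibernate.Query.class);
String queryString = hibernateQuery.getQueryString();
```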

In order to parse this JPQL (HQL) query text we need to make use of the ASTQueryTranslatorFactory. Using this and the JPA EntityManagerFactory one can get hold of the SQL query text and a collection of parameters:
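The corresponding listing is missing here; a sketch of how this could be done on Hibernate 4.x is shown below. Note that these are internal-ish APIs: the package names of ASTQueryTranslatorFactory and friends moved around between Hibernate versions, and EntityManagerFactory#unwrap is JPA 2.1 (on JPA 2.0 a cast to HibernateEntityManagerFactory would be needed instead):

```java
// Get hold of Hibernate's SessionFactoryImplementor via the JPA EntityManagerFactory
SessionFactoryImplementor sessionFactory =
    (SessionFactoryImplementor) entityManagerFactory.unwrap(SessionFactory.class);

// Parse the JPQL (HQL) query text
QueryTranslatorFactory translatorFactory = new ASTQueryTranslatorFactory();
QueryTranslator translator = translatorFactory.createQueryTranslator(
    queryString, queryString, Collections.emptyMap(), sessionFactory);
translator.compile(Collections.emptyMap(), false);

// The generated SQL text, plus the positions of any parameters in it
String sqlString = translator.getSQLString();
ParameterTranslations parameterTranslations = translator.getParameterTranslations();
```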

Note that the +1 on the position is needed because of a mismatch between 0-based and 1-based indexing of both APIs.

With all this in place we can now finally execute the query and obtain the count:

Long cnt = ((Number) nativeQuery.getSingleResult()).longValue();

The casting here looks a bit nasty. In the case of PostgreSQL a BigInteger was returned. I'm not entirely sure whether this would be the case for all databases, hence the cast to Number first and then getting the long value from that.

Conclusion

Using the Hibernate specific API it's more or less possible to universally count the results of a query. Still, it's not entirely perfect, as values set on a JPQL query can often be richer than those set on a native query. For example, you can often set an entity itself as a parameter and the JPA provider will then automatically use the ID of that.

Furthermore, using provider specific APIs when using JPA, especially for such essential functionality, is just not so nice.

Finally, some providers such as EclipseLink do support subqueries in the select clause. For those providers no vendor specific APIs have to be used (and therefore there are no compile time concerns), but the code is of course still not portable.

If/when there will ever be a new JPA version again it would really be nice if the current problems with paging/sorting/filtering could be addressed.

Friday, May 11, 2012

In JSF, components play a central role, it being a component based framework after all.

As mentioned in a previous blog posting, creating custom components was a lot of effort in JSF 1.x, but became significantly easier in JSF 2.0.

Nevertheless, there were a few tedious things left that needed to be done if the component needed to be used on a Facelet (which is the overwhelmingly common case): having a -taglib.xml file where a tag for the component is declared, and, when the component's Java code resides directly in a web project (as opposed to a jar), an entry in web.xml pointing to the -taglib.xml file.
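For reference, such a component could look roughly like the sketch below; in JSF 2.2 the createTag attribute on @FacesComponent makes both the -taglib.xml file and the web.xml entry unnecessary (the component name, family and package are made up here):

```java
import java.io.IOException;

import javax.faces.component.FacesComponent;
import javax.faces.component.UIComponentBase;
import javax.faces.context.FacesContext;

@FacesComponent(value = "customComponent", createTag = true)
public class CustomComponent extends UIComponentBase {

    @Override
    public String getFamily() {
        // Still has to be implemented manually
        return "components";
    }

    @Override
    public void encodeEnd(FacesContext context) throws IOException {
        // Render the static output shown by the example page
        context.getResponseWriter().write("TEST");
    }
}
```

On a Facelet the component's tag then becomes available via the default component namespace, e.g. xmlns:t="http://xmlns.jcp.org/jsf/component" and <t:customComponent/>.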

Just these two files (and only these two files) fully constitute a Java EE/JSF application. The .java file does need to be compiled to a .class of course, but then just these two can be deployed to a Java EE 7 server. There's not a single extra (XML) file, manifest, lib, or whatever else needed as shown in the image below:

Using Payara 4.x, requesting http://localhost:8080/customcomponent/page.jsf will simply result in a page displaying:

TEST

So can this be made any simpler? Well, maybe there's still some room for improvement. What about the getFamily method that still needs to be implemented? It would be great if that too could be defaulted to something. Likewise, the component name could be defaulted to something as well, and while we're at it, let's give createTag a default value of true in case the component name is defaulted (only in that case, so as not to cause backwards compatibility issues).

Wednesday, April 25, 2012

In Java EE, JPA (Java Persistence API) is used to store and retrieve graphs of objects. This works by specifying relations between objects via annotations (or optionally XML). Hand over the root of an object graph to the entity manager and it will persist it. Ask the entity manager for an object with a given Id and you'll get the graph back.

This is all fine and well, but how in this model do we control which branches of the graph are retrieved and to which depth branches should be followed?

The primary mechanism to control this is the eager/lazy mechanism. Mark a relation as eager and JPA will fetch it upfront; mark it as lazy and it will dynamically fetch it when the relation is traversed. In practice, both approaches have their pros and cons. Mark everything eager and you risk pulling in the entire DB for every little bit of data that you need. Mark everything lazy, and you'll not only have to keep the persistence context around (which by itself can be troublesome), but you also risk running into the 1 + N query problem (1 base query is fired, and then an unknown number of N queries when iterating over its relations). If fetching 1000 items in one query took approximately as long as fetching 1 item per query and firing 1000 queries, this wouldn't be a problem. Unfortunately, for a relational database this is not the case, not even when using heaps of memory and tons of fast SSDs in RAID.
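To illustrate: the fetch strategy is declared statically on the entity's relations. A hypothetical sketch (note that to-many relations actually default to lazy in JPA):

```java
import java.util.List;

import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.OneToMany;

@Entity
public class User {

    @Id
    private Long id;

    // Always fetched together with the User, whether needed or not
    @OneToMany(fetch = FetchType.EAGER)
    private List<Address> addresses;

    // Only fetched when traversed; requires an open persistence context
    // and risks 1 + N queries when iterating over many users
    @OneToMany(fetch = FetchType.LAZY)
    private List<User> friends;
}
```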

There are various ways to overcome this. For instance there are proprietary mechanisms for setting the batch size, so not 1000 queries are fired but 10. We could also assume that all entities relating to those 1000 items are all in the (JPA) cache. Then 1000 fetches of 1 entity are indeed about as costly as 1 fetch of 1000 entities, but this is a dangerous assumption. Assume wrong and you might bring down your DB.

The fundamental problem however is that eager/lazy are static properties of the entity model. In practice, the part of the graph that you want often depends on the use case. For a master overview of all Users in a system, you'd probably want a rather shallow graph, but for the detail view of a particular User you most likely need a somewhat deeper one.

Again, there are various solutions for this. One is to write individual JPQL queries for each use case. This certainly works, but the number of queries can rapidly grow out of hand this way (allUsersWithAddress, allUsersWithAddressAndFriends, allUsersWithAddressAndFriendsWithAddress, ...). Another solution that addresses exactly this problem is the fetch profiles feature that was introduced in Hibernate 3.5. As can be seen in the official documentation, this solution is not particularly JPA friendly. You need access to the native Hibernate session, which is possible, but not pretty. One way or the other, fetch profiles are Hibernate specific.

In this posting I would like to present an alternative solution. It feels a little like fetch profiles, but the graph to be fetched can be specified dynamically and it uses the JPA API only. It works by using the criteria API to programmatically add one or more JOIN FETCH clauses to a query. Unfortunately JPA does not yet have the capabilities to turn a JPQL query into a Criteria query, so either the query must already be a Criteria or it should be a simple find. The following code demonstrates the latter case:
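The code listing seems to have gone missing from this page; based on the description that follows, the helper could be sketched roughly as follows (the class, method and parameter names are made up):

```java
import javax.persistence.EntityManager;
import javax.persistence.criteria.CriteriaBuilder;
import javax.persistence.criteria.CriteriaQuery;
import javax.persistence.criteria.FetchParent;
import javax.persistence.criteria.JoinType;
import javax.persistence.criteria.Root;

public final class JpaUtils {

    public static <T> T findWithDepth(EntityManager entityManager, Class<T> type, Object id, String... fetchRelations) {
        CriteriaBuilder criteriaBuilder = entityManager.getCriteriaBuilder();
        CriteriaQuery<T> criteriaQuery = criteriaBuilder.createQuery(type);
        Root<T> root = criteriaQuery.from(type);

        // Programmatically add a (left) JOIN FETCH for every requested
        // relation; nested relations are expressed with a dot,
        // e.g. "friends.addresses"
        for (String relation : fetchRelations) {
            FetchParent<?, ?> fetch = root;
            for (String pathSegment : relation.split("\\.")) {
                fetch = fetch.fetch(pathSegment, JoinType.LEFT);
            }
        }

        // The @Id attribute is hardcoded to be called "id" here
        criteriaQuery.where(criteriaBuilder.equal(root.get("id"), id));

        return entityManager.createQuery(criteriaQuery).getSingleResult();
    }
}
```

A call could then look like: User user = JpaUtils.findWithDepth(entityManager, User.class, 15L, "addresses", "friends.addresses");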

The above line would fetch the user with "id" 15, and pre-fetches the addresses associated with that user, as well as the friends and their addresses. (Note that the @Id field is hardcoded to be called "id" here. A more fancy implementation could query the object for it)

This solution, though handy, is not perfect. While all JPA vendors support fetching multiple relations of one level deep (addresses and friends in the example above), not all of them support fetching chained relations (friends.addresses in the example above). Specifically for Hibernate, care should be taken to avoid fetching so-called "multiple bags" (sets and @OrderColumn are a typical solution). Of course it's always wise to avoid creating a huge Cartesian product, which is unfortunately one low-level effect of the underlying relational DB you have to be aware of, even when purely dealing with object graphs.

Despite the problems I outlined with this approach above, I hope it's still useful to someone. Thanks go to my co-workers Jan Beernink and Hongqin Chen for coming up with the original idea and refining it, respectively.

Sunday, April 22, 2012

In JPA one can define JPQL queries as well as native queries. Each of those can return either an Entity or one or more scalar values. Queries can be created on demand at run-time from a String, or at start-up time from an annotation (or a corresponding XML variant; see Where to put named queries in JPA?).

Of all those combinations, curiously Hibernate has never supported named native queries returning a scalar result, including insert, update and delete queries which all don't return a result set, but merely the number of rows affected.

It's a curious case, since Hibernate does support scalar returns in non-native named queries (thus a scalar return and named queries is not the problem), and it does support scalar returns in dynamically created native queries (thus scalar returns in native queries are not the problem either).

If you do try to start up with such a query, Hibernate will throw an exception with the notorious message:

Pure native scalar queries are not yet supported

Extra peculiar is that this has been reported as a bug nearly 6 years(!) ago (see HHH-4412). In that timespan the advances in IT have been huge, but apparently not big enough to be able to fix this particular bug. "Not yet" certainly is a relative term in Hibernate's world.
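The listing of the workaround seems to be missing here. The idea, sketched below with made-up entity, table and column names, is to attach a dummy @SqlResultSetMapping to the named native query, so that Hibernate no longer considers it a "pure native scalar query":

```java
import javax.persistence.ColumnResult;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.NamedNativeQuery;
import javax.persistence.SqlResultSetMapping;

// The "dummy" mapping is never actually used; it's only there to
// satisfy Hibernate's check at start-up time.
@SqlResultSetMapping(name = "dummy", columns = @ColumnResult(name = "dummy"))
@NamedNativeQuery(
    name = "SomeEntity.insert",
    query = "INSERT INTO some_table (some_column) VALUES (?)",
    resultSetMapping = "dummy"
)
@Entity
public class SomeEntity {

    @Id
    private Long id;
}
```

The query would then be executed as usual, e.g. via entityManager.createNamedQuery("SomeEntity.insert").setParameter(1, "value").executeUpdate().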

And lo and behold, this actually works. Hibernate starts up and adds the query to its named query repository, and when subsequently executing the query there is no exception and the insert happens correctly.

Looking at the Hibernate code again it looks like this shouldn't be that impossible to fix. It's almost as if the original programmer just went out for lunch while working on that code fragment, temporarily put the exception there, and then after lunch completely forgot about it.

Until this has been fixed in Hibernate itself, the result-set-mapping workaround might be useful.

Thursday, March 22, 2012

In JSF, input components have a label attribute that is typically used in (error) messages to let the user know which component a message is about. E.g.

Name: a value is required here.

If the label attribute isn't used, JSF will show a generated Id instead that is nearly always completely incomprehensible to users. So, this label is something you definitely want to use.

Of course, if the label is going to be used in the error message to identify said component, it should also be rendered on screen somewhere so the user knows which component has that label. For this JSF has the separate <h:outputLabel> component, which is typically but not necessarily placed right before the input component it labels.

The problem

The thing is though that this label component should nearly always have the exact same value as the label attribute of the component it labels, e.g.
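The (missing) example would be along these lines; note the duplicated "Name" value:

```xhtml
<h:outputLabel for="name" value="Name" />
<h:inputText id="name" label="Name" required="true" />
```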

There's a duplication here that feels rather unnecessary (it feels even worse when the label comes from a somewhat longer expression, which is typical for I18N). My co-worker Bauke identified this problem quite some time ago.

Finding a solution

It appears though that an implementation that automatically sets the label attribute of the target "for" component to the value of the outputLabel isn't that difficult, although there are a couple of things to keep in mind.

For starters, a component in JSF doesn't directly have something akin to an @PostConstruct method in which you can set things up. There are tag handlers and meta rules in which you can set up attributes, but when they execute not all components necessarily exist yet.

Luckily, we always have the plain old constructor, and since JSF 2 components can register themselves for system events. This gets us into a method where we can set things up.

Additionally, we have to be aware of state. System event listeners are luckily not stateful, so they are perfectly suited for tasks that need to be set up once (phase listeners are stateful though, and will 'come back' after every postback). Attributes of a component are by default stateful, so we only need to set those once, not at every postback. Finally, the API distinguishes between deferred expressions (value expressions) and literals. If we want to support dynamic labels and only want to set up the wiring once, it's important to take this distinction into account.

Finally, when searching for the target "for" component we can take advantage of the fact that this component will typically be close by. Compared to a regular search starting at the view root, the well-known "relative-up/down" search algorithm is probably more efficient here. This algorithm starts the search in the first naming container that is a parent of the component from which we search, and works its way up until there are no more parents. If the component still isn't found then (which in practice is rare if the component indeed exists), a downward sweep is done starting from the root.

So, this all comes down to the following piece of code then (slightly abbreviated):
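That code isn't included on this page; based on the approach described above it could look roughly like the following sketch (the component name is made up, and the standard findComponent is used here as a simplification of the relative search that was described):

```java
import javax.el.ValueExpression;
import javax.faces.component.FacesComponent;
import javax.faces.component.UIComponent;
import javax.faces.component.html.HtmlOutputLabel;
import javax.faces.event.ComponentSystemEvent;
import javax.faces.event.ComponentSystemEventListener;
import javax.faces.event.PostAddToViewEvent;

@FacesComponent("example.OutputLabel")
public class OutputLabel extends HtmlOutputLabel {

    public OutputLabel() {
        // Components have no @PostConstruct; the constructor is where we
        // can register ourselves for a system event.
        subscribeToEvent(PostAddToViewEvent.class, new ComponentSystemEventListener() {

            @Override
            public void processEvent(ComponentSystemEvent event) {
                UIComponent label = event.getComponent();
                String forId = (String) label.getAttributes().get("for");
                if (forId == null) {
                    return;
                }

                UIComponent target = label.findComponent(forId);
                if (target == null || target.getAttributes().get("label") != null) {
                    return; // no target found, or it already has an explicit label
                }

                ValueExpression valueExpression = label.getValueExpression("value");
                if (valueExpression != null) {
                    // Deferred expression: wire the expression itself, so
                    // dynamic (e.g. I18N) labels keep working
                    target.setValueExpression("label", valueExpression);
                } else {
                    // Literal: copying the value once is enough, since
                    // attributes are stateful by default
                    target.getAttributes().put("label", label.getAttributes().get("value"));
                }
            }
        });
    }
}
```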