When invoking rollback() on a UserTransaction after catching an Exception, the underlying UserTransactionImple throws an NPE when calling abortWithoutAck():

Caused by: com.arjuna.wst.SystemException: java.lang.NullPointerException
at com.arjuna.mwlabs.wst.at.remote.UserTransactionImple.abortWithoutAck(UserTransactionImple.java:360)
at com.arjuna.mwlabs.wst.at.remote.UserTransactionImple.rollback(UserTransactionImple.java:153)
at

This is using JBossTS 3.3.0 (I think) and JBossAS 4.3 EAP.

Could someone point me in the right direction for debugging this? Thanks!

Hi Brice. It's not really clear to me where the error is coming from, as the exception trace shows the line where the SystemException is created rather than the wrapped NullPointerException which is being caught. What I suggest you do is attach a debugger to your program and add a breakpoint on line 360 (inside the catch). When the breakpoint is reached, evaluate

ex.printStackTrace(System.out)

This will display a stack trace for the NPE which should show you the line which is trying to dereference a null pointer. I strongly suspect it will be line 326 indicating that ctx is null (i.e. there is no currently active transaction).
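To illustrate why the trace above stops at the SystemException, here is a minimal sketch. It assumes, as the trace suggests, that the wrapper keeps only the text of the caught exception and not the exception itself. DemoSystemException and abortLikeOperation are hypothetical stand-ins, not the real XTS classes or methods.

```java
// Hypothetical stand-in for com.arjuna.wst.SystemException; not the real class.
class DemoSystemException extends Exception {
    DemoSystemException(String message) {
        super(message);
    }
}

final class LostCauseDemo {
    // Mimics a method that catches an NPE and throws a new exception carrying
    // only ex.toString(), so the stack trace of the NPE is dropped.
    static void abortLikeOperation(Object ctx) throws DemoSystemException {
        try {
            ctx.toString(); // throws NullPointerException when ctx is null
        } catch (Exception ex) {
            // Uncommenting the next line is the in-source equivalent of
            // evaluating ex.printStackTrace(System.out) at a breakpoint here.
            // ex.printStackTrace(System.out);
            throw new DemoSystemException(ex.toString());
        }
    }

    public static void main(String[] args) {
        try {
            abortLikeOperation(null);
        } catch (DemoSystemException e) {
            // The message names the NPE, but no cause is attached, so the
            // line that dereferenced null cannot be recovered from e.
            System.out.println("message: " + e.getMessage());
            System.out.println("cause:   " + e.getCause());
        }
    }
}
```

Because the new exception is built from a string only, getCause() returns null and the printed trace points at the throw site rather than at the dereference, which is why the breakpoint is needed.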

I still don't quite grok the internal workings between UserTransaction and the JTA Tx, but it *almost* seems (from incidental log output) that the JTA Tx is rolling back *before* the ut.rollback(). Not sure if that leaves resources in an invalid state when abortWithoutAck() is invoked ...

This would be where my knowledge of how rollback works in a 2PC world is fairly slim.

Thanks for debugging this, Brice. As I suspected, the NPE comes from line 326, which means that the transaction has already been disassociated from the current thread when you call rollback.

Just in case you are unsure where this is happening, I'll put this in context: .at.remote.UserTransactionImple is the class used to implement a WSTX transaction in an XTS client. So, it looks like you have configured Spring to create a WSTX transaction inside one of your web servlets and then, presumably, invoked some other web service which has failed, causing Spring to roll back the transaction. If this is so, then is this second service using bridging to allow it to access XA resources? Anyway, at the point where the exception occurs, Spring has called rollback on the WSTX transaction in the client, but the transaction has already been disassociated from the thread, indicating that rollback has already been called.

Firstly, I'll note that I think the WSTX implementation is wrong here because it should not try to dereference a null pointer. This is what is causing the SystemException shown in your trace to be thrown. The WSTX code ought to detect the null pointer and explicitly throw an exception to notify that the TX has already been disassociated.

The obvious candidate for this exception is UnknownTransactionException but throwing that may actually be misleading. UnknownTransactionException is currently thrown when the client thinks it is in a TX and talks to the coordinator but gets a no such TX response i.e. it indicates something is wrong within the WSTX layer. If the transaction has already been rolled back then, arguably, a further attempt to rollback should get a SystemException to indicate that the client is doing something wrong. I'll think about this before I decide what to do but I need more input from you. So, the question is why has your transaction already been rolled back.

Am I right in assuming that your servlet is invoking a second service via the TX bridge?

Have you installed handlers on your client connections to the second service which roll back the WSTX transaction?

If so, then you may be confusing Spring because it is expecting to do the rollback. Can you please let me know a bit more about your setup here?

Request comes into app A, advice around the top-level servlet gets a UserTransaction, then starts a bridge to JTA

processing continues

processing invokes a remote web service supporting WS-TX

in-handler joins coordination context provided in SOAP header and starts a bridge to local JTA (note: service is *optionally* able to join WS-TX, but will always join/create a UserTransaction - it detects if a *remote* coordinator is present and if so, doesn't drive the UserTransaction to completion, rather just calling suspend() if a remote coordinator *is* present)

if an exception occurs, I *think* Spring on the remote end will mark the JTA transaction as rollbackOnly ... it should notice that the Spring transaction is jca-subordinate to a parent and not try to commit/rollback

if a remote coordinator is present (as it is in this case), remote service should just suspend() the UserTransaction at its end

control comes back to "local" application and the remote service has thrown a fault .. local JTA transaction is marked rollbackOnly because an exception in the business layer was detected

I'm not 100% sure what happens here .. sometimes exceptions in the business layer propagate up to the top-level servlet and we call ib.stop() and ut.rollback() - what I *think* is happening in this case (maybe?) is we call ib.stop() and then ut.commit() and in the commit(), rollbackOnly is detected ... maybe that throws an exception?

if an exception is detected within the try/catch for starting/stopping/committing the bridge & usertransaction, we always try to stop() the bridge and rollback() the usertransaction. Since the bridge may have already been stopped, we first check to see if it's still available; otherwise we just skip it and try to rollback the usertransaction.

I'll grab the WsTransactionalSourceAdvice and post that here, as well as the FaultHandler on the remote end. I'll also try to step through this scenario and clear up the exact behavior in the last couple points above.

Firstly, I'll note that I think the WSTX implementation is wrong here because it should not try to dereference a null pointer. This is what is causing the SystemException shown in your trace to be thrown. The WSTX code ought to detect the null pointer and explicitly throw an exception to notify that the TX has already been disassociated.

+1

It should throw IllegalStateException

The obvious candidate for this exception is UnknownTransactionException but throwing that may actually be misleading.

See above. IllegalStateException is the better choice because that more closely matches JTA semantics.

UnknownTransactionException is currently thrown when the client thinks it is in a TX and talks to the coordinator but gets a no such TX response i.e. it indicates something is wrong within the WSTX layer. If the transaction has already been rolled back then, arguably, a further attempt to rollback should get a SystemException to indicate that the client is doing something wrong. I'll think about this before I decide what to do but I need more input from you. So, the question is why has your transaction already been rolled back.

I think we should change the signature ;-) Of course that would happen naturally if we supported JAXTX ;-)

Ok, we actually already have a WrongStateException which e.g. is thrown when a begin is attempted and the thread is already associated with a TX. I guess this is the equivalent of JTA IllegalStateException.
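As a rough illustration of the guard being discussed, here is a minimal sketch. It is not the XTS implementation: DemoWrongStateException is a local stand-in for the WrongStateException mentioned above, and GuardedUserTransaction with its thread-local context is entirely hypothetical.

```java
// Hypothetical stand-in for the WrongStateException mentioned above.
class DemoWrongStateException extends Exception {
    DemoWrongStateException(String message) {
        super(message);
    }
}

final class GuardedUserTransaction {
    // Stand-in for the transaction context associated with the current thread.
    private static final ThreadLocal<Object> CURRENT = new ThreadLocal<>();

    static void begin() throws DemoWrongStateException {
        if (CURRENT.get() != null) {
            throw new DemoWrongStateException("thread is already associated with a transaction");
        }
        CURRENT.set(new Object());
    }

    static void rollback() throws DemoWrongStateException {
        // Check for a missing context explicitly instead of dereferencing it
        // and letting a NullPointerException escape.
        if (CURRENT.get() == null) {
            throw new DemoWrongStateException("no transaction is associated with this thread");
        }
        CURRENT.remove(); // disassociate the transaction from the thread
    }
}
```

With a guard like this, a second rollback on the same thread reports a state error that names the actual problem, rather than surfacing as an NPE wrapped in a SystemException.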

I think we should change the signature ;-)

Not just the signature but also the javadoc ;-). It currently says

/**
* The rollback operation will terminate the transaction and return
* normally if it succeeded, while throwing an appropriate exception if it
* didn't. If there is no transaction associated with the invoking thread
* then UnknownTransactionException is thrown.
*/

This should probably say IllegalStateException for this case and identify UnknownTransactionException as marking the situation where the coordinator does not know about the existence of the TX. Same applies for commit which is currently documented as follows:

/**
* The transaction is committed by the commit method. This will execute
* the PhaseZero, 2PC and OutcomeNotification protocols prior to returning.
* If there is no transaction associated with the invoking thread then
* UnknownTransactionException is thrown. If the transaction ultimately
* rolls back then the TransactionRolledBackException is thrown. When
* complete, this operation disassociates the transaction from the current
* thread such that it becomes associated with no transaction.
*/

n.b. commit does currently detect a null local context and throws UnknownTransactionException.

Ok Brice, thanks for the code. The problem is that your cleanup routine in the advice code can end up calling stop twice, or end up calling rollback after a commit has disassociated the tx from the thread. Here is what you should have:

Notice that an exception in stop passes null instead of ib, avoiding a second call to ib.stop. An exception during commit passes null for both arguments, avoiding calls to stop and rollback.

This is because when you call ib.stop the bridge ensures the bridged-to transaction is no longer left active, so you must not call stop again even if it throws an error. The same is true once you call commit: it disassociates the transaction even if it throws an error. So if you call rollback after calling commit it will throw another exception. In the current code it throws a SystemException because of the NPE. But even if we patch this to throw a WrongStateException your code still will not work.

As a general comment I'll note that exception handling during cleanup is very awkward when you have multiple cleanups to do. You have to keep very careful track of which things still need cleaning up and which ones are already cleared by previous cleanup attempts. Your original code was nice and neat but that's a sure sign that something is up because handling cleanup errors is almost always not neat :-)
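The code this post refers to is not preserved in the thread. A minimal sketch of the pattern it describes might look like the following. Bridge, Tx, CleanupSketch and run are hypothetical names for illustration; only processException, ib and ut are names taken from the discussion, and the signature of processException here is an assumption.

```java
// Hypothetical stand-ins for the inbound bridge and the UserTransaction.
interface Bridge {
    void stop() throws Exception;
}

interface Tx {
    void commit() throws Exception;
    void rollback() throws Exception;
}

final class CleanupSketch {
    // Cleans up whatever is still outstanding, then rethrows.
    // A null argument means "already dealt with, do not touch again".
    static void processException(Bridge ib, Tx ut, Exception cause) {
        if (ib != null) {
            try {
                ib.stop();
            } catch (Exception ignored) {
                // a failed stop must not prevent the rollback attempt
            }
        }
        if (ut != null) {
            try {
                ut.rollback();
            } catch (Exception ignored) {
                // nothing further can be cleaned up here
            }
        }
        throw new RuntimeException(cause);
    }

    static void run(Bridge ib, Tx ut, Runnable work) {
        try {
            work.run();
        } catch (RuntimeException e) {
            processException(ib, ut, e); // nothing cleaned up yet: stop and roll back
        }
        try {
            ib.stop();
        } catch (Exception e) {
            processException(null, ut, e); // stop was attempted: only roll back
        }
        try {
            ut.commit();
        } catch (Exception e) {
            processException(null, null, e); // commit disassociated the tx: nothing left
        }
    }
}
```

The three catch blocks are deliberately not nested, so each failure reaches processException exactly once with the arguments that match what has already been attempted.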

The change you recommended worked, except that I had to refactor processException() slightly to throw a specific named exception, not just RuntimeException - so that I could check in the outer catch {} if the exception being caught was that one, if so, I just rethrow and don't invoke processException again (otherwise we're back to where we started).
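A minimal sketch of the refactoring described here, assuming the stop and commit handlers sit inside an outer try/catch. CleanupDoneException, MarkerRethrowSketch, Step and invoke are hypothetical names, and the shape of the advice code is a guess.

```java
// Hypothetical marker exception: signals that cleanup has already been done.
class CleanupDoneException extends RuntimeException {
    CleanupDoneException(Throwable cause) {
        super(cause);
    }
}

final class MarkerRethrowSketch {
    // A unit of work or cleanup that may fail with a checked exception.
    interface Step {
        void run() throws Exception;
    }

    // Runs the remaining cleanup steps (null means skip), then throws the marker.
    static void processException(Step stop, Step rollback, Exception cause) {
        runQuietly(stop);
        runQuietly(rollback);
        throw new CleanupDoneException(cause);
    }

    private static void runQuietly(Step step) {
        if (step == null) {
            return;
        }
        try {
            step.run();
        } catch (Exception ignored) {
            // cleanup failures are swallowed in this sketch
        }
    }

    static void invoke(Step work, Step stop, Step commit, Step rollback) {
        try {
            work.run();
            try {
                stop.run();
            } catch (Exception e) {
                processException(null, rollback, e);
            }
            try {
                commit.run();
            } catch (Exception e) {
                processException(null, null, e);
            }
        } catch (CleanupDoneException e) {
            throw e; // cleanup already ran in an inner handler: just rethrow
        } catch (Exception e) {
            processException(stop, rollback, e); // failure in the work itself
        }
    }
}
```

Without the dedicated catch for the marker type, the generic outer handler would see the exception thrown by the inner handlers and run stop and rollback a second time.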

So, in a way, it's even less nice and neat now ;-)

I have a follow-up question now on how exactly rollback semantics work between coordinators, but I'll start a new topic on that.