org.apache.derby.client.am.SqlException: Java exception: 'FWK005 parse may not be called while parsing.: org.xml.sax.SAXException'.
at org.apache.derby.client.am.SqlException.<init>(Unknown Source)
at org.apache.derby.client.am.SqlException.<init>(Unknown Source)

We believe that this comes from the dBuilder.parse(InputSource) method.

Bryan Pendleton
added a comment - 10/Sep/08 03:45 What sort of errors are in your derby.log file? Can you attach it?
Also, can you try extracting as much information as possible from
the client-side exception by using the techniques described at
http://wiki.apache.org/db-derby/UnwindExceptionChain
then posting that information?
What JDK and XML parser are you using?

i was able to fix the problem i am attaching ZIP file that contains the fix and simple test that can be built with ant after lib directory is created and populated with derby / xerces/ xalan / serializer jars (see README.txt has details).

> What JDK and XML parser are you using?

mostly JDK 1.5 and relatively new xerces/xalan version - i could not find exactly which one by looking into jar file of xerces but xalan from MANIFEST.MF is 2.7.1

i tested with db-derby-10.4.1.3-bin and db-derby-10.4.2.0-bin and also built from SVN db-derby-snapshot-10.5.0.0-694562 - for all the the same error happens

BTW: the problem seems to stem from the fact that Xerces DocumentBuilder
isn't thread safe as mentioned in http://www.oxygenxml.com/archives/xml-dev/200405/msg00149.html
and the same seems to be true about Serializer.
DocumentBuilderFactory and SerializerFactory seems to be OK
when used/shared by multiple threads.

If SqlXmlUtil was not shared between threads for exmaple
made thread private (and possibly recycled after connection is closed)
then using one DocumentBuilder instance should be fine.

However i was not able to understand life cycle of SqlXmlUtil
(sho is calling SqlXmlExecutor.getSqlXmlUtil()) and what is
elation between activations, connections and threads ...

So proper fix may be to make sure that SqlXmlUtil is safely reused
or not shared between threads ...

Aleksander Slominski
added a comment - 12/Sep/08 21:45 BTW: the problem seems to stem from the fact that Xerces DocumentBuilder
isn't thread safe as mentioned in
http://www.oxygenxml.com/archives/xml-dev/200405/msg00149.html
and the same seems to be true about Serializer.
DocumentBuilderFactory and SerializerFactory seems to be OK
when used/shared by multiple threads.
If SqlXmlUtil was not shared between threads for exmaple
made thread private (and possibly recycled after connection is closed)
then using one DocumentBuilder instance should be fine.
However i was not able to understand life cycle of SqlXmlUtil
(sho is calling SqlXmlExecutor.getSqlXmlUtil()) and what is
elation between activations, connections and threads ...
So proper fix may be to make sure that SqlXmlUtil is safely reused
or not shared between threads ...

Attached is a JUnit test case that exposes the bug. It runs queries that use all the XML operators in four threads concurrently. The actual failure varies. Sometimes it's a SAXException (FWK005 parse may not be called while parsing), other times it's a NullPointerException.

I agree that it's wrong to store the SqlXmlUtil instance in the compiled plan. Every activation of the plan should have its own copy. The comments in the code indicate that the caching was there to prevent recompilation of XPath queries for each row when using the XMLQUERY or XMLEXISTS operators. Having one instance per activation would still provide that optimization.

Knut Anders Hatlen
added a comment - 11/May/11 12:39 Attached is a JUnit test case that exposes the bug. It runs queries that use all the XML operators in four threads concurrently. The actual failure varies. Sometimes it's a SAXException (FWK005 parse may not be called while parsing), other times it's a NullPointerException.
I agree that it's wrong to store the SqlXmlUtil instance in the compiled plan. Every activation of the plan should have its own copy. The comments in the code indicate that the caching was there to prevent recompilation of XPath queries for each row when using the XMLQUERY or XMLEXISTS operators. Having one instance per activation would still provide that optimization.

Knut Anders Hatlen
added a comment - 11/May/11 12:45 I checked in the new test case (XMLConcurrencyTest), but disabled it first in XMLSuite so it doesn't get run as part of the regression tests until the bug is fixed.
Committed revision 1101839.

The attached patch (1a) seems to fix the bug and allow XML operators to be executed concurrently.

The patch makes the following changes:

1) The SqlXmlUtil instance isn't stored in the compiled plan. Instead, byte code is generated so that a new instance is created the first time the operator is used in an activation (and reused subsequently in the same activation only).

2) Since the SqlXmlUtil instance is no longer persisted, it no longer implements Formatable, which allowed for removal of a number of methods in the class, as well as entries for the class in RegisteredFormatIds and StoredFormatIds.

3) Created a new abstract super-class of UnaryOperatorNode, BinaryOperatorNode and TernaryOperatorNode, in which code that generates byte-code for the different XML operators could be shared. Currently, their only common super-class is ValueNode, which sounds too general for code that's only used by operator nodes.

4) Enables XMLConcurrencyTest in the regression test suite.

I've verified that we still don't create/compile the SqlXmlUtil instance for each row with the patch. We do create an instance and compile the XML query on the first row that uses the operator per activation. This also means that if you re-execute a previously executed PreparedStatement, the cached value will be used.

In addition to the per-activation compilation of the XML query, we still compile the XML query when creating the SQL query execution plan. This isn't strictly necessary anymore, since the compiled XML query will now be thrown away instead of being stored in the plan. I left the code in there so that syntax errors in the XML query would still generate a compile-time error instead of being delayed to execution-time. I'm open to changing this, but wanted to prevent any behaviour changes in the first iteration.

Other potential improvements:

More common code can probably be pulled from the *naryOperatorNode classes up to their new parent class. But that's outside the scope of this issue.

As far as I can see, there's no reason to create a new SqlXmlExecutor instance for every row, as we currently do. The instances are (effectively) immutable, and contain the same values for all rows, so it may make sense to cache the SqlXmlExecutor instances instead of the SqlXmlUtil instances (which can be retrieved from the executors anyway).

The SqlXmlExecutor class contains a comment indicating that it belongs in the types package, but it was placed in the execute package because it accessed the Activation class, and the types layer shouldn't depend on the SQL layer. SqlXmlExecutor no longer accesses the Activation class, so it may be possible to move it to the types package now.

I've run all the regression tests on an earlier version of the patch without seeing any failures. Rerunning the tests now just to be safe, since I've made some small changes to the patch later.

Knut Anders Hatlen
added a comment - 19/May/11 13:36 The attached patch (1a) seems to fix the bug and allow XML operators to be executed concurrently.
The patch makes the following changes:
1) The SqlXmlUtil instance isn't stored in the compiled plan. Instead, byte code is generated so that a new instance is created the first time the operator is used in an activation (and reused subsequently in the same activation only).
2) Since the SqlXmlUtil instance is no longer persisted, it no longer implements Formatable, which allowed for removal of a number of methods in the class, as well as entries for the class in RegisteredFormatIds and StoredFormatIds.
3) Created a new abstract super-class of UnaryOperatorNode, BinaryOperatorNode and TernaryOperatorNode, in which code that generates byte-code for the different XML operators could be shared. Currently, their only common super-class is ValueNode, which sounds too general for code that's only used by operator nodes.
4) Enables XMLConcurrencyTest in the regression test suite.
I've verified that we still don't create/compile the SqlXmlUtil instance for each row with the patch. We do create an instance and compile the XML query on the first row that uses the operator per activation. This also means that if you re-execute a previously executed PreparedStatement, the cached value will be used.
In addition to the per-activation compilation of the XML query, we still compile the XML query when creating the SQL query execution plan. This isn't strictly necessary anymore, since the compiled XML query will now be thrown away instead of being stored in the plan. I left the code in there so that syntax errors in the XML query would still generate a compile-time error instead of being delayed to execution-time. I'm open to changing this, but wanted to prevent any behaviour changes in the first iteration.
Other potential improvements:
More common code can probably be pulled from the *naryOperatorNode classes up to their new parent class. But that's outside the scope of this issue.
As far as I can see, there's no reason to create a new SqlXmlExecutor instance for every row, as we currently do. The instances are (effectively) immutable, and contain the same values for all rows, so it may make sense to cache the SqlXmlExecutor instances instead of the SqlXmlUtil instances (which can be retrieved from the executors anyway).
The SqlXmlExecutor class contains a comment indicating that it belongs in the types package, but it was placed in the execute package because it accessed the Activation class, and the types layer shouldn't depend on the SQL layer. SqlXmlExecutor no longer accesses the Activation class, so it may be possible to move it to the types package now.
I've run all the regression tests on an earlier version of the patch without seeing any failures. Rerunning the tests now just to be safe, since I've made some small changes to the patch later.

Your approach seems fine to me. When I was working on the XPLAIN tables I remember thinking
it would be nice if the operator nodes had a closer base class than ValueNode, so that seems
like a good idea to me. Isn't it true that the compile classes and the execution classes have a
common class hierarchy? Is there a parallel re-factoring of a common base class to be done in
the other tree? I think that if we move the SqlXmlExecutor from the execute package to the
types package, it might be nice to change its name (to remove "Executor"). Thanks for cleaning
up this code, it will be nice to have this feature be more reliable.

Bryan Pendleton
added a comment - 19/May/11 15:31 Your approach seems fine to me. When I was working on the XPLAIN tables I remember thinking
it would be nice if the operator nodes had a closer base class than ValueNode, so that seems
like a good idea to me. Isn't it true that the compile classes and the execution classes have a
common class hierarchy? Is there a parallel re-factoring of a common base class to be done in
the other tree? I think that if we move the SqlXmlExecutor from the execute package to the
types package, it might be nice to change its name (to remove "Executor"). Thanks for cleaning
up this code, it will be nice to have this feature be more reliable.

I think the execution hierarchy only partially mirrors the compile hierarchy. At least I cannot find any execution classes that correspond to the various operator node classes. The operator nodes seem to be transformed into methods at code-generation time, so there's no class or instance representing them anymore.

I agree that the "executor" name is problematic if we move SqlXmlExecutor to the types package. I'm actually considering a somewhat more drastic approach and remove the class altogether, merging its methods into SqlXmlUtil (which is already in the types package). SqlXmlExecutor is just a thin wrapper around SqlXmlUtil, so I think that would work. Currently, SqlXmlUtil is instantiated once per activation, and SqlXmlExecutor once per row accessed, so they are not a perfect match right now. But as I mentioned in my previous comment, I don't really see any reason why SqlXmlExecutor needs to be instantiated per row, so I'm pretty sure the SqlXmlExecutor class can go away now.

Knut Anders Hatlen
added a comment - 19/May/11 20:11 Thanks, Bryan.
I think the execution hierarchy only partially mirrors the compile hierarchy. At least I cannot find any execution classes that correspond to the various operator node classes. The operator nodes seem to be transformed into methods at code-generation time, so there's no class or instance representing them anymore.
I agree that the "executor" name is problematic if we move SqlXmlExecutor to the types package. I'm actually considering a somewhat more drastic approach and remove the class altogether, merging its methods into SqlXmlUtil (which is already in the types package). SqlXmlExecutor is just a thin wrapper around SqlXmlUtil, so I think that would work. Currently, SqlXmlUtil is instantiated once per activation, and SqlXmlExecutor once per row accessed, so they are not a perfect match right now. But as I mentioned in my previous comment, I don't really see any reason why SqlXmlExecutor needs to be instantiated per row, so I'm pretty sure the SqlXmlExecutor class can go away now.

Attaching a followup patch, d3870-2a.diff, which implements the following optimization:

Instead of doing lazy initialization of the field that holds the SqlXmlUtil instance, when the first row is accessed, the field is now initialized in the activation's constructor. The code generated for the operator can therefore be simplified, as it doesn't need to check if the field is initialized.

This change also allows us to remove the redundant compilation of the XML query when the execution plan is created. That redundant compilation was left in the code only to ensure that syntax errors in the XML query were reported at compile-time. Since the activation is created when the statement is prepared, exceptions raised in the activation's constructor will be reported at compile-time (whereas the previous behaviour with lazy initialization would have delayed the error until execution-time).

As a consequence, the XML query will now be compiled exactly once per activation. Before this patch, it would be compiled once when creating the plan, and then again once for each activation. Probably not much of an optimization, but it's simpler and feels cleaner that way.

All the XML tests passed with the patch. I'll run the full regression test suite too, just in case.

Knut Anders Hatlen
added a comment - 20/May/11 12:30 Attaching a followup patch, d3870-2a.diff, which implements the following optimization:
Instead of doing lazy initialization of the field that holds the SqlXmlUtil instance, when the first row is accessed, the field is now initialized in the activation's constructor. The code generated for the operator can therefore be simplified, as it doesn't need to check if the field is initialized.
This change also allows us to remove the redundant compilation of the XML query when the execution plan is created. That redundant compilation was left in the code only to ensure that syntax errors in the XML query were reported at compile-time. Since the activation is created when the statement is prepared, exceptions raised in the activation's constructor will be reported at compile-time (whereas the previous behaviour with lazy initialization would have delayed the error until execution-time).
As a consequence, the XML query will now be compiled exactly once per activation. Before this patch, it would be compiled once when creating the plan, and then again once for each activation. Probably not much of an optimization, but it's simpler and feels cleaner that way.
All the XML tests passed with the patch. I'll run the full regression test suite too, just in case.

I'm attaching a new followup patch (3a) which removes the SqlXmlExecutor class. By doing this, we avoid the allocation of one SqlXmlExecutor instance per row seen by an XML operator, and we reduce the amount of code. We still cache the SqlXmlUtil instance in the activation so that the XML queries only have to be compiled once per activation.

Since SqlXmlExecutor is just a thin wrapper around SqlXmlUtil and XMLDataType, the byte-code generation for the XML operator could be changed to invoke the methods directly on the XMLDataValue objects, passing the cached SqlXmlUtil instance as an argument. Apart from that, only small changes were needed (changing the order and type of arguments in some method signatures where SqlXmlExecutor and XMLDataValue differed, and adding some null handling previously performed by SqlXmlExecutor to the XMLPARSE implementation).

Knut Anders Hatlen
added a comment - 25/May/11 10:33 I'm attaching a new followup patch (3a) which removes the SqlXmlExecutor class. By doing this, we avoid the allocation of one SqlXmlExecutor instance per row seen by an XML operator, and we reduce the amount of code. We still cache the SqlXmlUtil instance in the activation so that the XML queries only have to be compiled once per activation.
Since SqlXmlExecutor is just a thin wrapper around SqlXmlUtil and XMLDataType, the byte-code generation for the XML operator could be changed to invoke the methods directly on the XMLDataValue objects, passing the cached SqlXmlUtil instance as an argument. Apart from that, only small changes were needed (changing the order and type of arguments in some method signatures where SqlXmlExecutor and XMLDataValue differed, and adding some null handling previously performed by SqlXmlExecutor to the XMLPARSE implementation).
All the regression tests passed with the patch.

I think we also need to bump the last digit in the version number on the branch in case someone following the updates on the branch has triggers that use XML operators. This is needed because the format of the compiled execution plans has changed, so the triggers need to be recompiled. Bumping the last digit in the version number will make the triggers recompile automatically.

Knut Anders Hatlen
added a comment - 03/Jun/11 12:15 Backported the fix to the 10.8 branch (revision 1130983).
I think we also need to bump the last digit in the version number on the branch in case someone following the updates on the branch has triggers that use XML operators. This is needed because the format of the compiled execution plans has changed, so the triggers need to be recompiled. Bumping the last digit in the version number will make the triggers recompile automatically.

Uh-oh! Bumping the last digit wasn't enough to get the triggers to recompile cleanly. They fail to read the old plan, which they're just supposed to throw away anyway, because of some serialization error. Possibly because the SqlXmlExecutor class was removed while it's still needed to read the old plans. Seems like the same problem exists when upgrading from 10.8.1.2 to trunk as well, so I'm reopening to fix the upgrade issues.

I've backed out the fix from the branch until the upgrade issue has been resolved (revision 1131002).

Knut Anders Hatlen
added a comment - 03/Jun/11 12:49 Uh-oh! Bumping the last digit wasn't enough to get the triggers to recompile cleanly. They fail to read the old plan, which they're just supposed to throw away anyway, because of some serialization error. Possibly because the SqlXmlExecutor class was removed while it's still needed to read the old plans. Seems like the same problem exists when upgrading from 10.8.1.2 to trunk as well, so I'm reopening to fix the upgrade issues.
I've backed out the fix from the branch until the upgrade issue has been resolved (revision 1131002).

It looks like the problems are caused by the code that drops JDBC metadata queries and invalidates stored prepared statement on upgrade. This code scans through SYS.SYSSTATEMENTS and reads every column in every row of the table. When it comes across a statement that uses one of the XML operators, it fails to deserialize the old plan because it contains SqlXmlUtil instances in the savedObject field, and the SqlXmlUtil class doesn't implement Formatable anymore.

I think the fix is simple. See the attached patch upgrade.diff, which simply makes the upgrade code skip the column that contains the compiled plan. Then the upgrade code is able to fetch the stored statements from the system table and flag them as invalid.

I would like to add a test case for this in the upgrade tests, but it's a bit tricky since the XML operators aren't always available, and the requirements vary between different versions of Derby, so it's hard to determine whether or not the test case is expected to pass on the platform. I'll give it a try, though.

Knut Anders Hatlen
added a comment - 03/Jun/11 15:45 It looks like the problems are caused by the code that drops JDBC metadata queries and invalidates stored prepared statement on upgrade. This code scans through SYS.SYSSTATEMENTS and reads every column in every row of the table. When it comes across a statement that uses one of the XML operators, it fails to deserialize the old plan because it contains SqlXmlUtil instances in the savedObject field, and the SqlXmlUtil class doesn't implement Formatable anymore.
I think the fix is simple. See the attached patch upgrade.diff, which simply makes the upgrade code skip the column that contains the compiled plan. Then the upgrade code is able to fetch the stored statements from the system table and flag them as invalid.
I would like to add a test case for this in the upgrade tests, but it's a bit tricky since the XML operators aren't always available, and the requirements vary between different versions of Derby, so it's hard to determine whether or not the test case is expected to pass on the platform. I'll give it a try, though.

Thanks for tracking this down! I'm not sure I'd use the word "simple" here, but your description
is quite clear, and your upgrade.diff patch looks good to me. I understand your frustration about
the delicacy of constructing a test to demonstrate this problem, given the platform dependent
XML behavior. Even if you can't easily construct the test, I think that the technique of avoiding
the problematic compiled-plan column is a reasonable solution, so +1 from me.

Bryan Pendleton
added a comment - 05/Jun/11 00:46 Thanks for tracking this down! I'm not sure I'd use the word "simple" here, but your description
is quite clear, and your upgrade.diff patch looks good to me. I understand your frustration about
the delicacy of constructing a test to demonstrate this problem, given the platform dependent
XML behavior. Even if you can't easily construct the test, I think that the technique of avoiding
the problematic compiled-plan column is a reasonable solution, so +1 from me.

Thanks for taking a look at the patch, Bryan. I'm uploading a new patch which adds a test case for the upgrade scenario. The test creates a trigger using the old version and verifies that the trigger works after upgrading. Without the fix in the data dictionary, the test fails to boot the database with the new version.

I think the test handles all combinations of XML ops being supported/not-supported with the old/new Derby version in the running environment. The different cases are handled like this:

1) If XML ops aren't supported with the new version, the test case is disabled in the test setup (we have helper methods to detect this).

2) If XML ops are supported with the new version, but not with the old version, the test case runs, but it will detect that CREATE TRIGGER fails with an exception saying XML requirements aren't satisfied. It will immediately return without reporting a failure.

3) If the situation described in (2) happens, the test case will run in all phases (create, soft-upgrade, post-soft-upgrade, hard-upgrade), but there is no trigger to test. The test case queries the system tables to see if the trigger has been created, and returns immediately if the trigger doesn't exist.

4) Since XML operators aren't supported at all before version 10.2, and creating the trigger will fail with a different exception than the one we detect as XML requirements not satisfied, the test case is made a no-op if the tested combination uses a too old version. (Actually, it only tests upgrade from 10.3 and newer now. That's because the CREATE TRIGGER statement fails on 10.2 because it lacks the fix for DERBY-1770.)

Knut Anders Hatlen
added a comment - 05/Jun/11 19:38 Thanks for taking a look at the patch, Bryan. I'm uploading a new patch which adds a test case for the upgrade scenario. The test creates a trigger using the old version and verifies that the trigger works after upgrading. Without the fix in the data dictionary, the test fails to boot the database with the new version.
I think the test handles all combinations of XML ops being supported/not-supported with the old/new Derby version in the running environment. The different cases are handled like this:
1) If XML ops aren't supported with the new version, the test case is disabled in the test setup (we have helper methods to detect this).
2) If XML ops are supported with the new version, but not with the old version, the test case runs, but it will detect that CREATE TRIGGER fails with an exception saying XML requirements aren't satisfied. It will immediately return without reporting a failure.
3) If the situation described in (2) happens, the test case will run in all phases (create, soft-upgrade, post-soft-upgrade, hard-upgrade), but there is no trigger to test. The test case queries the system tables to see if the trigger has been created, and returns immediately if the trigger doesn't exist.
4) Since XML operators aren't supported at all before version 10.2, and creating the trigger will fail with a different exception than the one we detect as XML requirements not satisfied, the test case is made a no-op if the tested combination uses a too old version. (Actually, it only tests upgrade from 10.3 and newer now. That's because the CREATE TRIGGER statement fails on 10.2 because it lacks the fix for DERBY-1770 .)
All regression tests passed in my environment.

Knut Anders Hatlen
added a comment - 06/Jun/11 14:24 I've now backported the fixes, including the upgrade fix, to the 10.8 branch. I also bumped the version on the branch to 10.8.1.4 in order to make upgrade work for those who follow the branch.
Committed revision 1132626.

Reopen for 10.5 backport consideration. If working on the backport for this issue. Temporarily assign yourself and add a comment that you are working on it. When finished, reresolve with the new fix versions or label backport_reject_10_x as appropriate.

Kathey Marsden
added a comment - 14/Mar/12 17:19 Reopen for 10.5 backport consideration. If working on the backport for this issue. Temporarily assign yourself and add a comment that you are working on it. When finished, reresolve with the new fix versions or label backport_reject_10_x as appropriate.

Re-resolving this issue without any further backport.
The portion of this issue suitable for backport was backported all the way to 10.1 with DERBY-5289. The rest looks too involved for a simple backport. It was actually backported to 10.8 from trunk with a forth digit version bump, so maybe if someone really needed it, they could look at it again. Marking derby_backport_reject_10_7 for now.

Kathey Marsden
added a comment - 16/Mar/12 16:30 Re-resolving this issue without any further backport.
The portion of this issue suitable for backport was backported all the way to 10.1 with DERBY-5289 . The rest looks too involved for a simple backport. It was actually backported to 10.8 from trunk with a forth digit version bump, so maybe if someone really needed it, they could look at it again. Marking derby_backport_reject_10_7 for now.