Re: [saxon] .net components, performance question

Ah, ok - I see now that was (another) XSLT question, not a Saxon question. Sorry about that.
With regards to Saxon-SA - we have a web app that we sell to our clients. We're going to use XML/XSLT to import/export data from our relational database. All our XML has a schema defined for it. Users can only transform our data using our web app - meaning it's not some kind of generic XSLT tool.
And say we have 30 clients that install our web app on their LAN. Each client has an average of 10 users who use that application. Plus we have 5 developers on our team.
How does the SA licensing work for that situation?
Steve
-----Original Message-----
From: Michael Kay <mike@...>
Sent: Wednesday, November 12, 2008 6:41 PM
To: Mailing list for the SAXON XSLT and XQuery processor <saxon-help@...>
Subject: Re: [saxon] .net components, performance question
You can either use Saxon-SA, which will optimize this automatically, or you can use keys.
<xsl:key name="mapxsd" match="map:Mapping" use="@xsd"/>
<xsl:key name="mapsql" match="map:Mapping" use="@sql"/>
then
<xsl:variable name="MappingDocument" select="document('....')"/>
<xsl:variable name="MappingElement" select="
key('mapxsd', $PathToCurrentNode, $MappingDocument)[$Mapping eq 'XSDtoSQL'] |
key('mapsql', $PathToCurrentNode, $MappingDocument)[$Mapping eq 'SQLtoXSD'] "/>
Michael Kay
http://www.saxonica.com/
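The reason the key() approach is so much faster is that xsl:key builds a hash index over the mapping elements once, so each lookup avoids rescanning every map:Mapping element. A rough Python sketch of the difference (the mapping data here is hypothetical, purely for illustration):

```python
# Hypothetical mapping entries standing in for the map:Mapping elements.
mappings = [
    {"xsd": "/Portfolio/Name", "sql": "portfolio.name"},
    {"xsd": "/Portfolio/Value", "sql": "portfolio.value"},
]

# Predicate-style lookup: scans the whole list on every call (O(M) per lookup).
def lookup_scan(path):
    return [m for m in mappings if m["xsd"] == path]

# Key-style lookup: the index is built once, then each call is a
# dict hit (O(1) on average) - analogous to what xsl:key provides.
index = {}
for m in mappings:
    index.setdefault(m["xsd"], []).append(m)

def lookup_key(path):
    return index.get(path, [])

# Both strategies return the same results; only the cost differs.
assert lookup_scan("/Portfolio/Name") == lookup_key("/Portfolio/Name")
```

With the predicate approach, each of N source nodes scans all M mapping entries (O(N·M) overall); with keys, the index is built once and each lookup is constant time on average, which is why the slowdown grows so sharply with file size.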
From: Stephen Caffo [mailto:steve@...]
Sent: 12 November 2008 22:49
To: Mailing list for the SAXON XSLT and XQuery processor
Subject: Re: [saxon] .net components, performance question
Ok, follow-up question. As my files get larger, my processing time (even from the command-line transform.exe) is slowing way, way down. Is there a better way to do this:
<!--Load all the mapping elements-->
<xsl:variable name="MappingElements" select="document('PortfolioSnapshotMapping_In.xslt')/descendant::map:Mapping"/>
<!--process each element in the large xml file, and look up the mapping element-->
<xsl:variable name="MappingElement" select='$MappingElements[($Mapping eq "XSDtoSQL" and string(@xsd) eq $PathToCurrentNode) or ($Mapping eq "SQLtoXSD" and string(@sql) eq $PathToCurrentNode)]'/>
Steve
From: Stephen Caffo [mailto:steve@...]
Sent: Wednesday, November 12, 2008 12:00 PM
To: Mailing list for the SAXON XSLT and XQuery processor
Subject: Re: [saxon] .net components, performance question
Thank you so much. Your quick responsiveness is almost inhuman considering the volume of work/email you must process daily!
Steve
From: Michael Kay [mailto:mike@...]
Sent: Tuesday, November 11, 2008 5:43 PM
To: 'Mailing list for the SAXON XSLT and XQuery processor'
Subject: Re: [saxon] .net components, performance question
I have now finally established why the stylesheet runs so much faster when it is known in advance that all nodes will be untyped. It is not, as I thought, because atomizing the nodes is significantly faster, or because the logic for doing a sequence comparison is slower than a singleton comparison in the case where the sequence turns out to be a singleton. Rather, it is because when nodes are untyped, a dedicated "comparer" is allocated at compile time, whose task is to compare strings using the Unicode codepoint collation; whereas when it is not known what type the nodes will be, a generic "comparer" is allocated at compile time, which then does some complex run-time decision making to decide how to perform the comparison and (crucially) ends up choosing a less-than-optimal strategy.
It actually relates to the problem described here:
http://saxonica.blogharbor.com/blog/_archives/2006/8/13/2226871.html
(I enjoyed the title of that blog...)
In fact, I actually describe the bug in the blog posting! "That means implementing a comparesEqual() method in the collator that's separate from the compare() method, and changing ValueComparisons to use this method rather than calling the general compare() method and testing the result against zero."
But on this path, I'm not using a ValueComparison, I'm using code that still uses the general compare() method, which because of the UTF-16 problem described in the blog posting, is looking at the characters in the string one-by-one rather than doing a string compare.
Once identified, the problem turns out to be quite easy to fix - at any rate, the main part of it, which is choosing an efficient strategy for doing the comparisons. There's still a small overhead because the decision making is done at run-time rather than at compile time, but that's almost unnoticeable.
It's also not all that surprising that the overhead of doing this low-level manipulation of strings should be higher on the .NET platform than on Java.
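The UTF-16 subtlety behind the slow compare() path can be sketched in Python (the specific characters are arbitrary examples): ordering strings by raw UTF-16 code units and by Unicode codepoints can disagree for supplementary characters, so a correct codepoint-collation compare() must walk the characters carefully - whereas equality is the same under both views, which is why a separate comparesEqual() can fall back to a fast whole-string equality check.

```python
# U+E000 is a BMP character; U+10000 is a supplementary character that
# UTF-16 encodes as the surrogate pair D800 DC00.
a = "\ue000"
b = "\U00010000"

# Codepoint order (what the Unicode codepoint collation requires):
codepoint_order = a < b  # Python compares by codepoint, so this is True

# UTF-16 code-unit order (what a naive 16-bit string compare gives):
# 0xE000 sorts *after* the high surrogate 0xD800, reversing the order.
utf16_order = a.encode("utf-16-be") < b.encode("utf-16-be")  # False

assert codepoint_order != utf16_order  # the two orderings disagree...
# ...but equality agrees under both encodings, so an equality-only
# comparer never needs the careful character-by-character walk.
assert (a == b) == (a.encode("utf-16-be") == b.encode("utf-16-be"))
```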

> With regards to saxon sa - we have a web app that we sell to
> our clients. We're going to use xml/xslt to import/export
> data from our relational database. All our xml has schema
> defined for it. Users can only transform our data using our
> web app - meaning it's not some kind of generic xslt tool.
>
> And say we have 30 clients that install our web app on their
> lan. Each client has an average of 10 users that use that
> application. Plus we have 5 developers on our team.
>
> How does the sa licensing work for that situation ?
You have two options: you can either tell your clients to purchase a
Saxon-SA license (they need one for each computer on which the software
actually runs, which might just be one from your description), or you can
negotiate an OEM contract with Saxonica that allows you to distribute the
product with your application, and activate it by means of a license key
supplied programmatically. Feel free to contact me off-list to discuss the
commercial terms for this. Our terms and conditions for OEM distributors
typically include unlimited use by the application vendor for development,
testing, and marketing of the application.
Regards,
Michael Kay
http://www.saxonica.com/