Put as many <xsl:sort> elements as you need:
<xsl:sort select="col1"/>
<xsl:sort select="col2"/>
...

where the first <xsl:sort> element specifies the primary sort key, the
second specifies the secondary sort key and so on. When the apply-templates
or for-each builds its nodelist, it sorts according to the sort keys; if two
or more nodes compare equal on every sort key, they are returned in
document order.

Steve Muench adds:

List each sort key in its own <xsl:sort> element.
The first one that appears in document order is
the "primary" sort, the second one that appears
is the "secondary" sort, etc.

The advantage of this syntax over a comma-separated list is that you can
have different properties attached to the two sorts, such as the order in
which the list is sorted by these cols, or whether the cols are treated as
text or numbers:

You can add as many xsl:sorts as you want within an xsl:for-each or an
xsl:apply-templates.
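
As a minimal sketch of that point (the table/row/col1/col2 element names
are assumed for illustration, they are not from the original post), each
sort key carries its own properties:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Hypothetical input: <table><row><col1>..</col1><col2>..</col2></row>...</table> -->
  <xsl:template match="table">
    <xsl:copy>
      <xsl:for-each select="row">
        <!-- primary key: col1 compared as text, ascending (the defaults) -->
        <xsl:sort select="col1"/>
        <!-- secondary key: col2 compared as a number, largest first -->
        <xsl:sort select="col2" data-type="number" order="descending"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```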

3.

Sorting

Mike Kay

Q: Expansion.
> I'm a bit confused by the interaction of xsl:sort and the various
> axes. I suppose basically my question is: does xsl:sort affect the
> ordering of nodes for the purpose of reference within the stylesheet,
> or just for the purpose of the output?

xsl:sort affects the order in which the nodes are
processed. It does not affect the position of the nodes on
any axis, such as the following-siblings axis.

Hopefully this will work. The questioner wanted a
comparison of "content" of term elements, which is more
difficult to test than string values because descendant
nodes would be considered "content". If each <term>
contains only text, it will be fine.

> Is there a way to achieve a sort with results?:
> title 1, title2, title 3,..., title 9, title 10, title 11

XSL isn't especially good (actually it's normally hopeless) at inferring
structure from character data, so you would have had a much easier time
if the characters and numbers had been separated in the input

<title
name="this string"
number="42"/>
or some such.

I'll give an example that sorts strings
of the form "abc 123", i.e. characters, space, numbers,
first on the first word, then numerically on the digits.
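
A minimal sketch of such a sort, assuming hypothetical <title> elements
inside a <titles> wrapper whose text always has the form "word, space,
digits" (this is not the original poster's stylesheet):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="titles">
    <xsl:for-each select="title">
      <!-- first key: the word before the space, compared as text -->
      <xsl:sort select="substring-before(., ' ')"/>
      <!-- second key: the digits after the space, compared as a number -->
      <xsl:sort select="substring-after(., ' ')" data-type="number"/>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the numeric second key, "title 9" is placed before "title 10", which
a plain text sort would not do.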

This passes all topics in this section
to the named template,
which will bomb out with the message if
the two 'orders' are not equal.

I'm curious why you cast them to string prior
to the comparison? Is it not possible
to compare a result tree fragment held
in the two variables?

The cast to a string was there mainly for clarity, and also for robustness:
the current version of MSXML doesn't follow the rules correctly when casting
from a result-tree-fragment, though I think this example would be OK.

That is to say, for every unique surname, consult the key again to
get the list of people with that surname. That way, you do not have to
navigate the tree again at all, since the list of relevant nodes is
already known.

13.

Removing duplicates Muenchian solution

Jeni Tennison

I'm going to assume you *were* actually referring to removing duplicate elements and, to make the answer more general and more accurate, I'm also going to assume that you have a number of different elements within your content. Finally, I'm going to assume that it is the content of the element that makes it a duplicate (rather than the value of an attribute, say), so something like:

Rather than using the preceding-sibling axis, I'm going to use the Muenchian technique to identify the first unique elements, because it's a lot easier to use in this case, as well as being more efficient generally.

First, define a key so that you can index on the unique features of the particular elements that you want. In this case, there are two unique features: the name of the element, and the content of the element. To make a key that includes both, I'm concatenating these two bits of information together (with a separator to hopefully account for odd occurrences that could generate the same key despite having different element/content combinations):

<xsl:key name="elements" match="*" use="concat(name(), '::', .)" />

So all the <employee>Bill</employee> elements are indexed under 'employee::Bill'. The unique elements are those that appear first in the list of elements that are indexed by the same key. Identifying those involves testing to see whether the node you're currently looking at is the same node as the first node in the list that is indexed by the key for the node. So if the <employee>Bill</employee> node that we're looking at is the first one in the list that we get when we retrieve the 'employee::Bill' nodes from the 'elements' key, then we know it hasn't been processed before.
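
A minimal sketch of that test, assuming for illustration that the
possibly duplicated elements are all children of the document element
(the placement is an assumption, the key is the one defined above):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:key name="elements" match="*" use="concat(name(), '::', .)"/>

  <xsl:template match="/*">
    <xsl:copy>
      <!-- Keep a child only if it is the first node that the key
           returns for its own name/content combination. -->
      <xsl:copy-of select="*[generate-id() =
          generate-id(key('elements', concat(name(), '::', .))[1])]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

generate-id() is used here only as a way of asking "are these two the
same node?"; count(. | $other) = 1 is the usual alternative.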

>1. sort them by 'priority'
>2. leave, say, only 3 nodes in the result

Here's a solution. First, specify the number of nodes you want in a
parameter, so that you can change it whenever you like:
<xsl:param name="nodes" select="'3'" />

Next, you want to treat the nodes individually despite them being nested
inside each other, and you want to sort them within your output in order of
priority. You can use either xsl:for-each or xsl:apply-templates to select
the nodes within the document, whatever their level (using //node) and
xsl:sort within whichever you use to sort in order of priority. For example:
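
A minimal sketch along those lines. It assumes the priority is held in a
numeric 'priority' attribute on each node element, which is a guess at
the input rather than something stated in the question:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="nodes" select="'3'"/>

  <xsl:template match="/">
    <result>
      <!-- //node reaches every node element, however deeply nested -->
      <xsl:for-each select="//node">
        <xsl:sort select="@priority" data-type="number"/>
        <!-- position() counts within the sorted list, so this keeps
             only the first $nodes entries after sorting -->
        <xsl:if test="position() &lt;= $nodes">
          <node priority="{@priority}"/>
        </xsl:if>
      </xsl:for-each>
    </result>
  </xsl:template>
</xsl:stylesheet>
```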

To sort descending,
xsl:sort has an 'order' attribute, with possible values
'ascending' (the default) or 'descending'.

16.

Arbitrary sorting

David Marston

>supposing I have elements with a month attribute
<report month="Jan" />
<report month="Feb" />
>and so on.
>Of course unordered :-)
>Now I want them in chronological order....
>I know how to translate Jan->1, Feb->2 etc via a named template
>[and] xsl:choose, but that doesn't help much in this case.

Naturally, what you want is to map names to numbers using keys,
which can be very efficient. Keys were made for just this
purpose! So far, I've been able to get this to work if the
month table is in the input document.

The first part of the above document is the month table.
For demonstration purposes, I have both abbreviated and full
month names (look at September) as synonyms, and you could
easily add names in other languages. There's a many-to-one
structure: look up a name, get back the correct month
number. The rest of the document is the set of records that
we want to sort in chronological order. The <day> elements
will work as a simple numeric sort, but that's secondary to
the sort by months. Following Oliver's request, we want
<xsl:sort select="key('MonthNum',month)" data-type="number"/>
where we will take the <month> as a string and get its
number out of the 'MonthNum' keyspace.

I'll supply an example of the above sort working in an
apply-templates situation, but it can work similarly in a
for-each loop. The current node will be the outer <doc>
element at the time we sort, so the keyspace definition will
also be based on that context:
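
A sketch of how this might look. The element names (monthtab, entry,
name, number, birthday) are assumptions made for illustration; only the
'MonthNum' key name and the xsl:sort line come from the text above:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Assumed month table inside the input document:
       <monthtab>
         <entry><name>Sep</name><name>September</name><number>9</number></entry>
         ...
       </monthtab>
       Every <name> of an entry leads to that entry's <number>. -->
  <xsl:key name="MonthNum" match="monthtab/entry/number" use="../name"/>

  <xsl:template match="doc">
    <out>
      <xsl:apply-templates select="birthday">
        <xsl:sort select="key('MonthNum', month)" data-type="number"/>
        <xsl:sort select="day" data-type="number"/>
      </xsl:apply-templates>
    </out>
  </xsl:template>

  <xsl:template match="birthday">
    <xsl:value-of select="concat(month, ' ', day)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```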

The <out> element is just something we commonly use as a
tracing aid. The above works on both Xalan and Saxon.

Ideally, one would want to put the month table in a
completely separate file, so it could be shared among all
stylesheets that needed it. Depending on your situation,
you might prefer to have the month table right in the
stylesheet. Either way, you have to use document(), which
certainly complicates the procedure. The point of this
message is to show that key() can be used in sort keys.

The xsl:for-each loops over each of the functions, sorted in terms of
pair/word and then... does nothing with them! Since there's nothing there
that actually produces any output, then the variable $fns is set to an
empty rtf. When it's passed to the next for-each, there's nothing to
iterate over.

As we've learned from the intersection expression: never say never! :-)

In the following I try to explain a method to construct such an
expression - only by XPath means. Decide yourself whether the result
is rather pretty or rather ugly ...

xsl:sort requires an expression used as the sort key.
What we want is the following:

if condition1 then use string1 as sort key
if condition2 then use string2 as sort key
etc.

How to achieve that?
The following expression gives $string if $condition is true,
otherwise it results in an empty string:
substring($string, 1 div number($condition))
According to Mike's book, this is perfectly valid.
(Note: works with Saxon and XT, but not with my versions of Xalan and
Oracle XSL - but I've not installed the latest versions ...)

If you don't like "infinity" - here's another solution:
substring($string, 1, number($condition)*string-length($string))
but then you need $string twice ...

The concatenation of all these substring expressions forms the sort key.
Requirement: all conditions must be mutually exclusive.

That's all! :-)

Here's my example which demonstrates the handling of leading "Re: "s.
If the string starts with "Re: ", an equivalent string without this
prefix but with an appended ", Re" forms the key, otherwise the
original string is used:
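
A sketch of such a sort key, assuming hypothetical <subject> elements
inside a <subjects> wrapper. It relies on 1 div 0 evaluating to Infinity,
so the processor caveat given above applies:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="subjects">
    <xsl:for-each select="subject">
      <!-- First substring(): the text after "Re: " plus ", Re", kept
           only when the string starts with "Re: ".
           Second substring(): the unchanged string, kept only when it
           does not. The two conditions are mutually exclusive. -->
      <xsl:sort select="concat(
          substring(concat(substring-after(., 'Re: '), ', Re'),
                    1 div starts-with(., 'Re: ')),
          substring(., 1 div not(starts-with(., 'Re: '))))"/>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```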

As you may imagine these expressions could become very complex the more
arbitrary you want to sort.

Jeni Tennison adds

This is obviously the 'Becker Method' :)

It is, of course, hideous when you actually use it. You can make it a
little less hideous by dropping the number() - 'div' automatically converts
its arguments to a number anyway. Do make sure, as well, to use boolean()
if the condition is a node set to convert it into a boolean value (true if
such a node exists, false if it doesn't). In other words, the pattern is:

If the list were a string separated by commas, say, then you have to use
recursion, and the current node doesn't matter, so named templates are the
best choice, but you can use xsl:apply-templates instead if you want to:

In the comparison that does not involve case-insensitive translation, you
select: //LEAGUE[not(@NAME = preceding::*/@NAME)]

Within equality tests, the way the result is worked out depends on the type
of the nodes that are involved. When they involve node sets (as in this
case), the equality expression returns true if there are nodes within the
node set(s) for which the equality expression will be true. In other
words, "@NAME = preceding::*/@NAME" returns true if *any* of the preceding
elements has a NAME attribute that matches the NAME attribute of the
current node.

When you select with the translation:
//LEAGUE[not(translate(@NAME,$lower,$upper) = translate(preceding::*/@NAME,$lower,$upper))]
things work differently because the translate() function returns a string.
Doing:
translate(preceding::*/@NAME, $lower, $upper)
translates the string value of the node set preceding::*/@NAME from lower
case to upper case, and returns this string. The string value of a node
set is the string value of the first node in the node set (in document
order) - the value of the first preceding element's NAME attribute. That
means that you're testing the equality of the translated @NAME of the
current element with the translated @NAME of the first preceding element,
not comparing it with all the other preceding elements' @NAME attributes.

I don't *think* it's possible to do the selection you're after with a
single select expression, but you could get around it by doing a
xsl:for-each on all the //LEAGUE elements, and containing within it an
xsl:if that only retrieved those who don't have a preceding element with
the same (translated) name:
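
A sketch of that approach. The $lower and $upper variables are assumed to
hold the plain a-z and A-Z alphabets, and the <leagues> wrapper is only
there to give the output a root:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <xsl:template match="/">
    <leagues>
      <xsl:for-each select="//LEAGUE">
        <!-- remember this element's upper-cased name -->
        <xsl:variable name="name"
                      select="translate(@NAME, $lower, $upper)"/>
        <!-- translate() now runs once per preceding element, inside
             the predicate, so every one of them is compared -->
        <xsl:if test="not(preceding::*[translate(@NAME, $lower, $upper)
                                       = $name])">
          <xsl:copy-of select="."/>
        </xsl:if>
      </xsl:for-each>
    </leagues>
  </xsl:template>
</xsl:stylesheet>
```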

If possible, for efficiency, you should probably give a more exact
indication of the LEAGUE elements you're interested in (e.g.
'/SCORES/LEAGUE') and if you're only interested in the
preceding-sibling::LEAGUE elements, you should use this rather than the
general preceding::*. Generally, more specific XPath expressions are more
efficient.

Use translate() when creating the key to convert lowercase
to uppercase letters (e.g., so that 'a' and 'A' would be
grouped together), and to convert all numbers/relevant
symbols to a single symbol (so 1, 2, 3, etc. would all
be in the same group).

For each group, create a heading.

For each group, sort the entries in the group, again using
translate() to convert lower to uppercase (e.g., so 'aa'
would be sorted as if it were 'AA', thus performing a
case-insensitive sort).

I want to implement a topological sort in XSLT. This is necessary
for generating program code (IDL for example) out of an XML file.

A stylesheet for processing elements in
topological sorted order. The trick is to carefully select
elements from the document into variables so that they are
node sets and can be selected later on.

The complete problem is stated in an earlier post (2000-11-04)
and can be found in the archive.

Pseudocode:
  select structs with no dependencies
  process them
  repeat
    if not all structs are processed
      select structs which
        are not processed
        have only dependencies which are processed
      if empty
        stop
      else
        process them
  done
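
One way the pseudocode might be written in XSLT 1.0 is sketched below.
This is not the poster's stylesheet: the input layout
(structs/struct/name, with dependencies in field/type/ref) is assumed,
and every ref is assumed to name a struct that exists in the document.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="structs">
    <xsl:call-template name="process">
      <xsl:with-param name="todo" select="struct"/>
      <!-- /.. is the usual XPath 1.0 idiom for an empty node-set -->
      <xsl:with-param name="done" select="/.."/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="process">
    <xsl:param name="todo"/>
    <xsl:param name="done"/>
    <!-- ready: unprocessed structs all of whose refs name a struct
         that has already been processed (true at once for structs
         with no refs) -->
    <xsl:variable name="ready"
        select="$todo[count(field/type/ref) =
                      count(field/type/ref[. = $done/name])]"/>
    <!-- if nothing is ready we stop: either all structs are done or
         the remaining ones depend on each other -->
    <xsl:if test="$ready">
      <xsl:for-each select="$ready">
        <xsl:value-of select="name"/>
      </xsl:for-each>
      <xsl:call-template name="process">
        <!-- todo minus ready -->
        <xsl:with-param name="todo"
            select="$todo[count(. | $ready) != count($ready)]"/>
        <xsl:with-param name="done" select="$done | $ready"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```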

The structs are processed in increasing distance from leaves in the
dependency graph.
Can one of the gurus please comment on the "count(field/type/ref)=count(...)"
construct, and whether this could be substituted by a possibly more
efficient condition? In my real world examples with some 200+ structs
it takes quite some time; if there are cheap optimisations I would
appreciate it.
Applying the stylesheet to the following example document gives the
expected result.
This outputs "ACBED", which is the correct dependency order.

Which includes your changes to test for the extra cases.
(and your minor correction)

this gives

ABCD
ABCDE
ABCDEF
ABCDEF
spirit-level
XYZ

25.

Ordering and iteration problem

Jeni Tennison

> My thinking is that I need to do something like
>
> for each row
> for each column
> output the <circuit-breaker> with that row and column

I'd probably do this using the Piez Method/Hack of having an
xsl:for-each iterate over the correct number of random nodes and using
the position of the node to give the row/column number.

You need to define some random nodes - I usually use nodes from the
stylesheet:

<xsl:variable name="random-nodes" select="document('')//node()" />

And since you'll be iterating over them, you need some way of getting
back to the data:

<xsl:variable name="data" select="/" />

I've used two keys to get the relevant circuit breakers quickly. One
just indexes them by column (this is so you can work out whether you
need to add a cell or whether there's a circuit breaker from higher in
the column that covers it). The other indexes them by row and column.

I've assumed that you've stored the maximum number of rows in a
global variable called $max-row and the maximum number of columns in a
global variable called $max-col. Here's the template that does the
work:
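
The original template is not reproduced here. What follows is a cut-down
sketch of the iteration only: it uses a single key, ignores breakers that
span several rows, and assumes hypothetical 'row' and 'column' attributes
on <circuit-breaker> plus example values for $max-row and $max-col.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="max-row" select="3"/>
  <xsl:param name="max-col" select="4"/>
  <xsl:variable name="random-nodes" select="document('')//node()"/>
  <xsl:variable name="data" select="/"/>

  <xsl:key name="breakers-by-cell" match="circuit-breaker"
           use="concat(@row, ':', @column)"/>

  <xsl:template match="/">
    <table>
      <!-- one iteration per row; the nodes themselves are ignored -->
      <xsl:for-each select="$random-nodes[position() &lt;= $max-row]">
        <xsl:variable name="row" select="position()"/>
        <tr>
          <xsl:for-each select="$random-nodes[position() &lt;= $max-col]">
            <xsl:variable name="col" select="position()"/>
            <!-- key() looks in the document of the current node, so
                 step back into the source data first -->
            <xsl:for-each select="$data">
              <td>
                <xsl:value-of
                    select="key('breakers-by-cell',
                                concat($row, ':', $col))"/>
              </td>
            </xsl:for-each>
          </xsl:for-each>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
```

This only works if the stylesheet contains at least as many nodes as the
larger of the two maxima, and if the processor can resolve document('')
back to the stylesheet.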

> I am trying to dynamically set sort order and sort column in my
> XSLT. It seems that I can not use an expression for "order".
>
> <xsl:apply-templates select="Uow">
> <xsl:sort select="$sortColumn" order="$sortOrder"/>
> <xsl:with-param name="from" select="$startRow"/>
> <xsl:with-param name="to" select="$endRow"/>
> </xsl:apply-templates>

All attributes of xsl:sort with the exception of "select" can be specified
as attribute value templates (AVTs).

The "select" attribute can be any XPath expression. However, putting
an XPath expression in a variable and passing that variable as the
(complete) value of the "select" attribute will fail for xsl:sort,
as it does anywhere else in XSLT, because XPath expressions held
in strings are not evaluated dynamically.

However, if you need to specify the name of a child element, then you can use an
expression like this:

*[name()=$sortColumn]

Therefore, one possible way to achieve your wanted results is:

<xsl:sort order="{$order}" select="*[name()=$sortColumn]"/>
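
Put together as a complete stylesheet, a sketch might look like this. The
default parameter values and the 'name' child element are assumptions,
and the paging parameters from the question are left out:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- both values would normally be supplied by the caller -->
  <xsl:param name="sortColumn" select="'name'"/>
  <xsl:param name="sortOrder" select="'ascending'"/>

  <xsl:template match="/*">
    <xsl:copy>
      <xsl:apply-templates select="Uow">
        <!-- order is an AVT, hence the braces; select is a normal
             XPath expression that picks the child named by
             $sortColumn -->
        <xsl:sort select="*[name() = $sortColumn]"
                  order="{$sortOrder}"/>
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="Uow">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```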

28.

Can xsl:sort use a variable

Mike Kay

> Can the <xsl:sort/> use a variable directly? e.g. <xsl:sort
> select="$orderBy"></xsl:sort>

The value of $orderBy doesn't depend on the current node, so
you'll get the same sort key value for every node. You probably want
select="*[name()=$orderBy]".

I did a slightly different take on this and assumed that you would want to
collect all of joe's preferences together, like this:

Ann says: "Pears."
Joe says: "Apples, Bananas,Oranges."

Even if this is not what you really want, it's interesting to see how it
works out. I have not completely handled putting in commas everywhere
except a period for the last item - I leave this to the reader. I also
haven't translated the first character of the name to upper case.
I also sorted the result by name. The solution is very
compact without those refinements:

The slightly odd formatting is an easy way to control the output format
while still having short line lengths in the stylesheet (better for
emailing), and the &#13; character reference is necessary on my (Windows)
system to get the line feed to display. Here is the result (I added another
person, bob, to the data, just for fun):

This was interesting because the usual examples for getting unique node-sets
assume you know what the target elements are named, but not in this case.

30.

Sort, upper case then lower case

Andrew Welch

>Is there any way to cause the sort command to sort all the upper case
>first and then the lower case? I don't mean using the upper-case or
>lower-case settings, because that just determines which order they are
>in. But I want the following results:

I'm looking now to see if I can work this out and I was wondering if
anybody would be able to help me with the correct sort selection.

The only other issue to be aware of is that the decimal points can go on
indefinitely and I don't know until runtime what the highest number
in any one id will be.

> In a transform, is it possible to correctly sort these poorly formed
> id's listed below

Tricky. I did have a recursive solution for you until I noticed that
just because there's an ID CM09.18.2 doesn't mean that there's an ID
CM09.18. This irregularity makes the task very difficult.

I think that I'd pick one of the following general approaches:

1. Decide that your stylesheet is only going to cope with IDs that
have 5 components; or 10 components; or however many seems to be a
reasonable maximum. You can always test the XML to make sure that this
assumption holds and generate an error if it doesn't. But this allows
you to do:

2. Create an extension function that can select the Nth component from
an ID. Then create a recursive template that groups and sorts the
nodes based on their Nth component.

3. Have a pre-processing phase that changes the IDs such that the
number in each component of the ID is formatted with an appropriate
number of leading zeros. You will then be able to sort the nodes by ID
using alphabetical sorting.

4. Generate the stylesheet dynamically based on the data, creating a
stylesheet that contains the appropriate number of sorts for the depth
of the IDs that you encounter in the XML.
Jeni

Mike offers

If you're prepared to write some recursive XSLT code to transform the
keys, you could achieve this by the technique of prefixing each numeric
component with a digit indicating its length. Thus 1 becomes 11, 10
becomes 210, 15 becomes 215, 109 becomes 3109. This will give you a key
that collates alphabetically.
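
A sketch of the key transformation as a recursive named template. The
item/@id input and the template name are assumptions; the template only
builds the key. In XSLT 1.0 you cannot call a template from inside
xsl:sort, so the keys would have to be computed in an earlier pass.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Demo: print the collation key for each (hypothetical) item/@id -->
  <xsl:template match="/">
    <xsl:for-each select="//item">
      <xsl:call-template name="collation-key">
        <xsl:with-param name="id" select="@id"/>
      </xsl:call-template>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <xsl:template name="collation-key">
    <xsl:param name="id"/>
    <!-- head: text up to the first '.', or the whole id if none -->
    <xsl:variable name="head"
                  select="substring-before(concat($id, '.'), '.')"/>
    <xsl:variable name="tail" select="substring-after($id, '.')"/>
    <!-- prefix the component with its own length: 10 becomes 210 -->
    <xsl:value-of select="concat(string-length($head), $head)"/>
    <xsl:if test="$tail">
      <xsl:text>.</xsl:text>
      <xsl:call-template name="collation-key">
        <xsl:with-param name="id" select="$tail"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

A single length digit only covers components of up to nine characters,
and a non-numeric component such as "CM09" is treated the same way as a
number.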

32.

Sort by date

Jarno.Elovirta

> I need to sort records in my XSL stylesheet descending by
> date (i.e. newest
> date first).
>
> The problem is that the dates are in a text field in
> localized form, i.e.
>
> <date>24. April 2003</date>
>
> I have no clue how to approach this or if it will be possible at all.

ZZZZZ would come before AAAAAAAA. The sort was being performed by IE 6.0.
After much hair pulling, I finally figured out it was because of the
carriage return
that preceded the ZZZZZZ (the actual XML doc was much bigger, hiding the
problem).

First question: It seems odd to me that the newline character would be
considered significant and not get stripped. Why is this not so?

Answer

Newlines that are followed by non-white space characters are _never_
considered insignificant by XML or stripped by XSLT.

(Mike Kay adds: But they may be ignored when sorting - the spec leaves detailed
decisions on how strings are sorted to the implementation.)

You want the normalize-space function,

select="normalize-space(vendor_name)"
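
In context, a sketch of the sort might look like this. The <vendors> and
<vendor> wrapper elements are assumed; only vendor_name comes from the
question:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="vendors">
    <xsl:for-each select="vendor">
      <!-- normalize-space() drops leading and trailing white space
           (including newlines) before the key is compared -->
      <xsl:sort select="normalize-space(vendor_name)"/>
      <xsl:value-of select="normalize-space(vendor_name)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```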

34.

How do I sort Hiragana and Katakana Japanese characters?

Eric Promislow

First, a note on what exactly these two kinds of characters are.
Written Japanese text uses four kinds of characters:

Kanji, the so-called Chinese characters. These are originally
based on written Chinese text.

Hiragana, used for writing out Japanese words phonetically, as
opposed to in Kanji.

Katakana, a syllabary that indicates that a word or phrase has been
borrowed from a non-Japanese alphabet.

Romaji, Roman characters.

Children's books often put small Hiragana characters below a Kanji
character, so the student can subvocalize the word and learn the Kanji
that way.

The Hiragana and Katakana alphabets are called "syllabaries". With
one exception, each member is either a single-vowel syllable or a
consonant-vowel syllable. The exception is "-n", as in "san", which
would be written [sa][n]. Some of the syllables can be complex, such
as the "kyo" in [To][kyo].

Both syllabaries follow the same order, following the a-i-u-e-o
(vocalized as short 'u', long 'e', 'oo' as in "shoe", short e,
semi-long o as in "beau")
form horizontally, and the a-ka-sa-ta-na-ha-ma-ya-ra-wa-n order
vertically (consonants like the hard g, d, sh, ch (as in chew), b, and
p come by modifying the so-called unvoiced consonants). Modified
syllables sort immediately after their base syllable. For example,
"ga" sits between "ka" and "ki".

Hiragana characters occupy Unicode code points U+3042 - U+3094.

Katakana characters occupy Unicode code points U+30A0 - U+30FF.

The Katakana alphabet is growing, at a slow rate, and contains
syllables that are not in the Hiragana table.

From what I know, the Unicode tables follow Japanese dictionary
sorting order, as long as you stay within either the Hiragana or
Katakana table. If all the items in your list are either one or
the other, you should be able to use XSLT's simple Unicode-based
xsl:sort element. Otherwise, you would need to write an extension
function that would map Hiragana characters to their corresponding
Katakana values (since the former is a proper subset of the latter).

Here's an example, where I have a list of Japanese characters. The
attribute "a" is used to indicate where I would expect each item to
appear in a sort based on Unicode-values only.

If you are doing any work in this area, Ken Lunde's book "CJKV
Information Processing" (ISBN 1565922247) is a worthwhile investment.
I supplement it with a copy of Unicode 3.0 I found at a local
discount store for remaindered computer books. The unicode.org site
is useful, too, but I prefer turning pages in hard copy to waiting for
PDF files to open.

But XML data is supposed to all be Unicode. C# supposedly stores
all chars as "Unicode" (whatever that means, probably 16-bit
ignoring issues with surrogates), so I'm surprised this sort
didn't occur. Sort works fine with ASCII text. And it seems
to work when I mix ASCII and Japanese. Looks like I found a
boundary condition violation. I wouldn't call it processor-specific.

37.

Sorting Upper-Case first, or in *your* way

Mike Kay et al

I don't know exactly what the intent of the XSLT 1.0 spec for case-order
was, but you need to read the definition in the light of the two
(non-normative) notes that follow it.

The first says that two implementations may produce different results -
in other words, the spec does not attempt to be completely prescriptive
about the output order (therefore, by definition, this is not a
Microsoft non-conformance).

The second note points to Unicode TR-10:
http://www.unicode.org/unicode/reports/tr10/index.html

Section 6.6 of this report recommends that implementations should allow
the user to decide whether lower-case should sort before or after
upper-case, and my guess is that the xsl:sort parameter was intended to
implement this recommendation.

In turn this should be read in the context of the collation algorithm
given in the report, which sorts strings in three phases:

- alphabetic ordering

- diacritic ordering

- case ordering

The key thing here is that case is only considered if the two strings
(as a whole) are the same except in case. So Xaaaa will sort before
xaaaa if upper-case comes first; but Xaaaa will always sort before
xaaab, regardless of case order.
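
For reference, a sketch of how the option is requested (the words/word
element names are assumed). As the notes above say, the exact result can
differ between processors:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="words">
    <xsl:for-each select="word">
      <!-- case-order takes 'upper-first' or 'lower-first'; it only
           decides between strings that differ in case alone -->
      <xsl:sort select="." case-order="upper-first"/>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```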

It looks to me from this evidence as if Microsoft is implementing
something close to the Unicode TR10 algorithm.

> Of course XSLT 1.0 doesn't actually define "lexicographic" but
> my understanding is that it always implies a direct extension
> on an ordering on characters to an ordering on strings by
> comparing the first different position. If that isn't what is
> intended I think XSLT shouldn't use this term and should just
> directly refer to TR10.

My dictionary defines "lexicographical" [sic] as "pertaining to the
making of dictionaries", so on that basis "lexicographic order" means
"the order that headwords might appear in a dictionary". And in my
dictionary, "Johnsonian" comes after "johnny" and before
"joie-de-vivre". I think the great man would have been surprised if he
had appeared before "a" or after "zymotic".

I know that the word lexicographic is also used to describe a class of
sorting algorithms, but I don't think the XSLT 1.0 spec is using the
word in that sense. This is clear from the phrase "lexicographically in
the culturally correct manner for the language..." and from the fact
that it recommends Unicode TR10, which is not a lexicographic sort in
that sense.

David C adds.
See for example the definition given here:
http://mathworld.wolfram.com/LexicographicOrder.html

Note that (despite the etymology) "lexicographic order" doesn't
necessarily mean "the order used in a dictionary" as dictionaries are
compiled by human compilers and words can appear in whatever order the
compiler chooses which may reflect personal and cultural preferences as
much as logic. However lexicographic ordering is used in a technical
sense as a method of extending the ordering on one set (the alphabet)
to a derived set (strings over that alphabet).
I don't believe that the first note authorises this behaviour.
It does not give a blanket licence to produce any result, it is an
observation that because character order is language and system
dependent the resulting lexicographic ordering will be too.
The exact places that are system dependent are listed in the normative
text above.

MK continues

The second note points to Unicode TR-10:
http://www.unicode.org/unicode/reports/tr10/index.html

The key thing here is that case is only considered if the two strings
(as a whole) are the same except in case.

DC retorts.
You mean that this is a feature of the algorithm in TR-10 (I didn't
follow it closely enough to derive this property just now)?

Of course XSLT 1.0 doesn't actually define "lexicographic" but my
understanding is that it always implies a direct extension on an
ordering on characters to an ordering on strings by comparing the first
different position. If that isn't what is intended I think XSLT
shouldn't use this term and should just directly refer to TR10.

Yours truly adds:

With more than a little help from Eliot, below is a means of providing
your own sort order.
It's Saxon-specific, tested with 6.5.2. Sorry.

Assuming that the java file is in location
com/icl/saxon/sort/Compare_er.java

then make sure that '.' is in the classpath, so it finds it.

With your own collator.txt file you can then sort text to your
heart's content and to your own rules.

It's even easier in Saxon 7, but that's another story.

Last word goes to Mike Kay.

> *It would be interesting to know how Saxon implements
> this behaviour..* if M. Kay will be kind to answer..
>

I thought you would never ask. I'm an optimist ;-)

The answer is different for Saxon 6.x and Saxon 7.x.

In Saxon 6.x, you can write your own collating functions as a plug-in,
but if you don't, then two strings are compared as follows:

1. The two strings are compared with case normalized and accents
stripped, using Unicode codepoint order of the normalized characters.

2. If step (1) finds that the strings are equal, they are compared
with case normalized but without accents stripped, again using codepoint
order.

3. If step (2) finds that the strings are equal, the outcome depends
on the case of the first character that differs in the two strings,
taking account of the case-order option on xsl:sort.

Case normalization relies on the Java method toLowerCase. Accent
stripping is implemented only for characters in the upper half of the
Latin-1 set.

The above is essentially a simplified implementation of the Unicode
Collation Algorithm.

In Saxon 7.x, Saxon uses the collation capabilities of JDK 1.4. You can
select any collation supported by the JDK. The default is selected
according to your locale, or according to the language if lang is
specified on xsl:sort. If case-order is upper-first, then the action of
the selected Java collation is modified as follows: if the Java
collation decides that two strings collate as equal, then Saxon examines
the two strings, looking for the first character that differs between
the two strings. If one of these is upper case, then that string comes
first in the sorted order.

38.

Topological sort

Bill Keese and Dimitre Novatchev

Regarding the post from two years ago about topological sorting
(Archive),
here is another approach that I came up with. To me it seems to be more
in the spirit of XSLT, ie, writing functionally rather than
procedurally. Tell me what you think.

Topological sort refers to printing the nodes in a graph such that you
print a node before you print any nodes that reference that node.
Here's an example of a topologically sorted list:

1. each node gets a weight which is greater than the weight of any
nodes it references

2. sort by weight

The algorithm is O(n^2) for a simple XSLT processor, but it would be
O(n) if the XSLT processor was smart enough to cache the values returned
from the computeWeight(node) function. Does Saxon do this? Maybe it
would if I used keys.

Here is the code. Note that it's XSLT V2 (although it could be written
more verbosely in XSLT V1).
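
The original code is not reproduced here. A sketch of the two steps in
XSLT 2.0, under an assumed input layout where each <element> has an 'id'
attribute and <idref id="..."/> children naming the elements it
references, might look like this (the my: namespace and the function
body are illustrative, and the graph is assumed to have no cycles):

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:topsort"
    exclude-result-prefixes="xs my">

  <!-- Step 1: weight = 0 for an element that references nothing,
       otherwise one more than the largest weight it references.
       A cyclic reference would recurse for ever. -->
  <xsl:function name="my:computeWeight" as="xs:integer">
    <xsl:param name="node" as="element()"/>
    <xsl:variable name="refs"
        select="root($node)//element[@id = $node/idref/@id]"/>
    <xsl:sequence select="
        if (empty($refs)) then 0
        else max(for $r in $refs return my:computeWeight($r)) + 1"/>
  </xsl:function>

  <!-- Step 2: sort by weight, so referenced elements come first -->
  <xsl:template match="/*">
    <xsl:copy>
      <xsl:for-each select="element">
        <xsl:sort select="my:computeWeight(.)"/>
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```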

> I am attempting to sort a list of personal names. All of the names
> consist of either a first name followed by a last name or of a last
> name only (there are no middle names). Both parts of the name, when
> present, are enclosed within the one tag (span) which has a
> class='person'
> attribute, the same tag is used to enclose a last name only. I am
> attempting to sort by last name like so
>
> <xsl:for-each select="html/body//span[@class='person']">
> <xsl:sort select="substring-after(., ' ')"/> <xsl:sort select="."/>
> <xsl:sort select="substring-before(., ' ')"/>
>
> The problem is that names consisting of a last name only appear first
> in my alphabetical sequence and are sorted; these are followed by
> names with a first name and a last name and these are also sorted. I
> require one alphabetical list rather than two.
>
> Can this be done in one fell swoop, without having to write an XSL
> style sheet for the file consisting of two alphabetical sequences?