Expand your shell powers

Xapply expands your shell powers by adding
parallel processing to iterative loops. Items from files, the command
line, and ptbw instances form shell commands
that are managed in a wrapper stack. It also allows you to catch
the else case, where
no items were provided.

To understand this document

This document assumes you are quite familiar with the
standard UNIX™ shell, sh(1), that you understand the
UNIX process model and exit codes, and that you have coded
several scripts and used gzip(1)
and find(1).

It also assumes that you can read the manual page for any other
example command.
Having some exposure to printf(3) or
some other percent-markup function would help a little.
I use this "expander markup" in many of my programs.

What is xapply?

Simply stated, xapply is a generic loop. It iterates
over items you provide, running a customized shell command for
each pass through the loop.
One might code this loop as something like:

for Item in $ARGV
do
body-part $Item
done

and feel pretty good about it, so why would you need xapply?

The number 1 reason to use xapply is that it runs some
of the body-parts in parallel. It starts as many
as you ask it to (using the -P option),
then, as processes finish, it launches the next iteration of
body-part, until they are all started.
It waits for the running ones to finish before it exits.
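The throttle described above can be approximated, clumsily, in plain shell. Here is a minimal sketch (not xapply itself, and the item names are invented) that starts at most 2 body-parts at a time and reaps them all before exiting:

```shell
# Crude emulation of xapply's -P throttle in plain sh (a sketch, not xapply).
P=2
n=0
results=$(mktemp)
for item in one two three four five; do
    ( echo "$item" >>"$results" ) &   # the body-part for this iteration
    n=$((n+1))
    [ $((n % P)) -eq 0 ] && wait      # crude: drain after every batch of P
done
wait                                  # like xapply: reap stragglers before exit
sort "$results"
```

xapply does better than this batch-of-P sketch: it starts a new task the moment any one finishes, rather than waiting for a whole batch to drain.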

The benefit is that we might take advantage of more CPU resources
(either as threads on CPU cores, or multiple CPU packages in a host).

Even better, it can manage the output from those parallel
tasks so that each is not all mixed with the others.
Without the -m switch, xapply assumes you
can figure out which iteration of body-part produced each line.
Under -m, xapply groups
the output from each iteration together such that
one finishes completely before the next one starts.

Like most loops, xapply can step through the list more
than 1 item at a time.
The -count option allows you to
visit the items in the argument list in pairs (or groups of count).
This is handy for programs like
diff(1) that need 2 targets.

Unlike common loops, xapply keeps track of
critical resources for each iteration.
A body can be bound to a token which it uses
for the life of its task. That resource token (for example a modem) won't
be issued to another iteration until the owner-task is complete, then
it will be allocated to a waiting task.
This allows xapply to make very efficient use of
limited resources
(and it honors -P as an upper limit as well).

Xapply has other friends.
In fact, it is the core node that connects
xclate(1), ptbw(1) and
hxmd(8) to each other.
We'll come back to the usefulness of that fact in a bit.

In summary, xapply lets you take advantage of
all the CPU resources on a host while keeping the tasks and resources
straight.
To raise the overall torque even more it reaches out to share resources,
to collate output, and reuse configuration data.
These features are all coordinated across multiple instances of
xapply and the related tools.

Basic examples

The gzip
utility can be pretty expensive in terms of CPU.
If we want to compress many output files (say *.txt), we could run
something like:

gzip -9 *.txt

Most modern hosts have more CPUs than the single one gzip is going to use.
We might break the list up with some shell magic (like split(1)),
then start a copy of gzip for each file.
That won't balance the CPUs, as 1 list will inevitably
have most of the small files.
This short list finishes long before the others, leaving an idle
CPU while the larger task still has files left to compress.

The shell code to split the list up is also pretty complex, and needs
a temporary file. The xapply spelling is a single line:

xapply -P4 "gzip -9" *.txt

The quoted gzip command is the template used
to build a shell command for each file that matches the glob. If no
files match the glob then the literal string "*.txt" is passed to
the shell, which passes it to gzip, which
complains that it cannot stat(2) that file.

With a few dozen files matching the glob, we would keep our machine busy for
a while! If there are fewer than 4
files we just start as many as we can. More than that will queue until
(the smallest or first) one finishes, then start another. This actually
sustains a load average on my test machine right at 4.0.
The xapply process itself is blocked in the
wait system call and therefore uses no CPU until
it is ready to start another task.

When the list of files might be too long for an argument list,
provide them on stdin (or from a file)
with xapply's -f option:

find . -name \*.txt -print |
xapply -f -P4 "gzip -9" -

This is also good because it won't try to compress
a file named "*.txt" in the case where the glob doesn't match
anything.
The other great thing about that is that the first gzip
task starts as soon as find sends the first
filename through the pipe!

When find has queued enough files to block on
the pipe, it gives up the CPU to the gzips,
which is exactly what you want. Just before that there are actually
5 tasks on the CPU, which is OK as find is largely
blocked on I/O while gzip is busy on the CPU.

In other cases, we'll need to specify the name of the matched file
someplace else, or more than once, in the template command.
We use the markup %1 to specify where in
the template command the current parameter should be expanded.

find . -name \*.txt -print |
xapply -f -P4 "gzip -9 %1; ls -ls %1" -

But it is never a good idea to create more processes than
you really need: compare these two spellings of the same idea.
We want to find all the RCS semaphore files that are under
/usr/msrc:

By using a filter (fmt) to group arguments into
bigger bundles we saved 90% of the time. Over larger tasks the
savings may be much larger. Remember fork(2) is
a really expensive system call, no matter how fast your machine might be.

In rare cases, we may want to discard the parameter; then we
use %0 to expand the empty string, and
suppress the default expansion on the end of the template.

I/O features -- input

Under UNIX's nifty pipe abstraction, it is best to think of
xapply as a filter, reading from stdin and writing
to stdout, like awk would. We'll see in the custom
command section that this is closer to the truth than it looks.
For now, just play along.

Because of the parallel tasks, xapply has some unique
issues with I/O.

On the input side, we have issues with processes competing for input
from stdin. We take several measures to keep the books balanced.

the -count switch and stdin

This xapply command folds input lines 1 and 2 to a single line,
then 3 and 4, then 5 and 6 and so on to the end of the file:

xapply -f -2 'echo' - -

The 2 occurrences of stdin, spelled dash "-" like most
UNIX filters, share a common reference. That is, the code knows to
read 1 thing from stdin for each dash, for
each iteration, rather than reading all of stdin for
the first dash leaving nothing for the second.

In other words, it does what you'd expect.
Using -3 and 3 dashes reformats the output to
present 3 lines as a single output line.
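In plain sh, the pair-folding that `xapply -f -2 'echo' - -` performs looks roughly like this read loop (a sketch of the behavior, not xapply's implementation; note the empty-string pad for a missing second line, which matches the padding discussion later in this document):

```shell
# Fold pairs of input lines onto one output line, as xapply -f -2 'echo' - - does.
fold2() {
    while read -r a; do
        read -r b || b=''     # a missing second line is padded with ""
        echo "$a $b"
    done
}
printf '1\n2\n3\n4\n' | fold2
```

Feed it an odd number of lines and the last output line ends with the pad (a trailing space), just as the cat -ve experiment below shows.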

speaking in terms of lines

Sometimes, newline is not a good separator. Find
has the -print0 option for just this reason.
Xapply has the -z option to read
-print0 output. Some other programs, like
hxmd,
also use the NUL-terminated format.

So the compress example might become:

find . -name \*.txt -print0 |
xapply -fz -P4 "gzip -9" -

the command line option -i input

This option opens a different file as the common stdin for
all the inferior tasks.
Under -f, the default value is /dev/null.
This lets the parent xapply use stdin for
input without random child processes consuming bits from it.

To provide a unique word from $HOME/pass.words to
each of 5 tasks:

xapply -i $HOME/pass.words 'read U && echo %1 $U' 1 2 3 4 5

This has some limits; when the file is too short for the number of tasks,
the read will fail and
the echo won't be executed. (Put 3 words in the
file and try it.) We might want to recycle the words after they've been
used; see below where we explain how
-t does that.

Since the read is part of a program, it could be part of
a loop, so a variable number of words from
the input file could
be read for each task. Under -P this could be problematic.
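A sketch of what such a task body could do, in plain sh (the word count per task is just a function argument here; under xapply each task would share the -i file as its stdin):

```shell
# take N: consume N words, one per line, from stdin, and echo them on one line.
take() {
    out=''
    i=0
    while [ "$i" -lt "$1" ] && read -r w; do
        out="$out${out:+ }$w"
        i=$((i+1))
    done
    echo "$out"
}
printf 'alpha\nbeta\ngamma\ndelta\n' | take 3
```

Under -P the parallel tasks would race each other for the shared descriptor, which is why the text above calls this problematic.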

I/O features -- output

Without the -m option, xapply tasks each
send output to stdout all jumbled together. This is not
evident until you try a large -Pjobs case with
a task that outputs over time (like a long running make).
If you want an example of this you might compare:

xapply -P2 -J4 'ptbw -' '' ''

to the collated version:

xapply -m -P2 -J4 'ptbw -' '' ''

The xclate processor is xapply's output
friend. It is not usually your friend, as it is hard to follow all
the rules. In fact some programs, like gzip, don't
follow the rules very well.
You'll have to compensate for that in
your xapply spells.

In our example above, we'd like to add the -v switch to
gzip to see how much compression we are getting:

find . -name \*.txt -print0 |
xapply -fz -P4 "gzip -9 -v" -

Which looks OK until you run it. The start of all the compression
lines comes out at once (the first 4 of them), then the statistics
get mixed up with the new headers as they are output. It is a mess.

By adding the -m switch to
the xapply, we should be able to collate the output.
However, it doesn't work because
the statistics are sent to stderr,
so we must compensate with the addition of a shell descriptor duplication:

find . -name \*.txt -print0 |
xapply -fzm -P4 "gzip -9 -v 2>&1" -
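The descriptor duplication in isolation, for readers who have not seen 2>&1 before (a minimal demonstration, unrelated to gzip itself):

```shell
# speak writes one line to stdout and one to stderr.
speak() { echo out; echo err >&2; }
speak 2>&1 | sort    # with stderr folded onto stdout, the pipe sees both lines
```

Without the 2>&1, only "out" would travel down the pipe; "err" would bypass it and land on the terminal, outside of any collation.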

The logic in xapply to manage xclate is
usually enough for even nested calls. When it is not, you'll have
to learn more about xclate; I'd save that for a major
blizzard, rain storm, or long plane trip.

The xapply's command line option -s passes
the squeeze option (also spelled -s) down
to xclate. This option allows any task which
doesn't output any text to stdout to exit without
waiting for exclusive access to the collated output stream.
This speeds the start of the next task substantially in cases
where output is rare (and either long, or evenly distributed).

Building a custom command

The old-school UNIX command apply uses a printf-like percent expander to
help customize commands. As a direct descendant of apply,
xapply has a similar expander.
As one of my tools, it has a lot more power in that expander.

In addition to the apply feature of binding %1 to the
first parameter, %2 to the second, and so forth,
xapply has access to a facility called the
dicer.

The dicer is a shorthand notation used to pull substrings out of
a larger string with a known format. For example, a line in the
/etc/passwd file has a well-known format which uses
colons (":") to separate the fields. In every password file
I've ever seen, the first field is the login name of the account.
The xapply command

xapply -f 'echo %[1:1]' /etc/passwd

filters the /etc/passwd file into a list of login names.

The dicer expression %[1:1] says "take the first parameter,
split it on colon (:), then extract the first subfield".
Here are several possible dicer expressions and their expansions:

Expression      Expansion
%1              /usr/share/man/man1/ls.1.gz
%[1/2]          usr
%[1.1]          /usr/share/man/man1/ls
%[1.1].%[1.2]   /usr/share/man/man1/ls.1
%[1/$.1]        ls

I stuck a nifty one in there: the dollar sign always stands for the
last field. The other important point is that %[1/1]
would expand to the empty string, since the first field is empty.

The dicer also lets us remove a field with a negative
number:

Expression      Expansion
%1              /usr/share/man/man1/ls.1.gz
%[1/-1]         usr/share/man/man1/ls.1.gz
%[1/-2]         /share/man/man1/ls.1.gz
%[1.-$]         /usr/share/man/man1/ls.1

Because splitting on white-space is so common, the blank
character is special in that it matches any number of white-space
characters. Escape any of blank, a digit, close-bracket, or
backslash with a backslash to force it to be taken literally.
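The positive dicer forms map onto cut(1) for readers who think in standard tools. Here is a sketch of the equivalence (not xapply's own code; note that cut has no notion of the dicer's $ for the last field):

```shell
# dice STRING SEP FIELD: roughly what %[1<SEP><FIELD>] does for positive FIELDs.
dice() {
    printf '%s\n' "$1" | cut -d "$2" -f "$3"
}
dice /usr/share/man/man1/ls.1.gz / 2    # like %[1/2] -> usr
dice /usr/share/man/man1/ls.1.gz . 1    # like %[1.1] -> /usr/share/man/man1/ls
```

Like the dicer, cut counts the empty field before a leading separator, which is why field 2 of the / split is "usr".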

Later versions of xapply also allow access to the
mixer, which allows the selection of characters from a
dicer expression. That is slightly beyond the scope of
this document. As an example, %(3,$-1) is the
expression to reverse the characters in %3.
All my tools use the same mixer and dicer expression syntax:
xapply, mk,
oue, and sbp.
Because some programs call xapply, they also
provide a dicer interface (for example, hxmd).
The dicer documentation is in the explode library
as dicer.html (found usually in
/usr/local/lib/explode).

Preferences for the picky coder

There are some options that let you select details about the environment
that xapply provides: viz. shells, escape characters, and padding.

I like to use perl

The -S shell option lets you select a shell for
the command built to start each task. I would use ksh or
sh, if it were me. You could set $SHELL
to anything you like, but that might confuse other programs that use
xapply, so stick to -S.

As a special case, when you set -S perl,
it changes the behavior of xapply.
To introduce the command string,
it uses perl -e rather than the Bourne shell
compatible $SHELL -c.
It might also set up -A differently (see below).

Input file padding

Given a count of 2 and 2 file parameters under -f,
xapply matches the corresponding lines from each file as
parameter pairs. When only 1 of the files runs out of lines, the
empty string is provided as the element from the other. You can change this
pad string to anything you like, for example -p /dev/null.

In one of our first examples, we joined pairs of lines. What happens if
there is only 1 line?
The echo command gets an extra space on the end,
which it trims. To see that, we can replace the default expansion with
a quoted one, and run it through cat -v:

echo A |xapply -f -2 'echo "%*"' - - | cat -ve

This outputs "A $" (without the quotes). Try %~*
for a nice Easter Egg; see the -p option below to understand the output.

There are alternatives. Under -p we can provide a
sentinel value for the missing line. Say, for example, that a comma on
a line by itself could never be an element of the input; then
-p , would let us detect the missing even line with

xapply -p , ... '... if [ _"%2" = _"," ] ; then ...'

It is usually considered good form to exit from a task
as soon as possible. With this in mind, the above trap might be better
coded as:

... [ _"%2" = _"," ] && exit;...'

Percent marks are so vulgar

If you don't like the escape character, you can change it with the
-a option. Take care that the symbol you pick is quoted
from the shell.
Viz. "xapply -a~ ..." is not what you'd want under
csh or ksh, since the tilde gets expanded to a
path to someone's home directory.

Because xapply is driven from
mkcmd, it takes the full list of
character expressions (-a "^B" is
ASCII stx, -a M-A is code 230); that
doesn't mean you should use them. Try to stick with percent when you can.
In ksh, that makes some let,
$((...)), and ${NAME%glob}
parameter substitutions require %% to
get a literal percent sign.

More advanced escapes

Since xapply is emulating a generic loop, it stands
to reason that there would be a "loop counter".
The loop counter is named %u, which stands
for "unique". It also would be nice to be able to
break out of the loop. For that
we send a signal to xapply with
%p, which stands for "pid".

When you use -F to
load xapply as an interpreter, then
the markup %c expands to
the cmd read from script,
while %fc expands to the path to
the script specified.
Also %ct expands to the load line, or a
synthetic one representing the shell used to run each task.

The loop counter %u

Since I'm a C programmer, I start the loop counter at zero (0) and
bump it up one after each trip through the loop.
For example, to output the numbers 0 to 4 next to the letter 'A' to 'E':

xapply 'echo %u %1' A B C D E

A better use of this might be to process data from one iteration to
the next (making generations of a file with the extension .%u).
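In plain sh, the numbering that xapply provides through %u amounts to this (a sketch of the behavior, not xapply's expander):

```shell
# Emulate 'xapply "echo %u %1" A B C D E': a zero-based counter per iteration.
u=0
for item in A B C D E; do
    echo "$u $item"
    u=$((u+1))
done
```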

Use of ksh's built-in math operations to build a
function based on %u is common. To queue many
at jobs about 5 minutes apart:

xapply -x 'at + $((%u*5)) minutes < %1' *.job

The -x option lets you see the commands executed on stderr.
This emulates set -x in Bourne shell.
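After xapply substitutes %u, what each task's shell actually evaluates is ordinary arithmetic; for the third job the at delay works out like this (a sketch):

```shell
u=2                              # the third task: %u counts from zero
echo "at + $((u*5)) minutes"     # the delay that at(1) job would be queued with
```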

If the unique loop counter is provided by gtfw
then the source of the counter is available as %fu.

Short-circuit of the loop with %p

To break the xapply
loop we signal the process with a USR1
signal. Usually this is done conditionally in cmd
with the kill command.
The markup %p expands to the pid of
the enclosing xapply.

For example, to search a list of integers for a prime number I might
code:

xapply -f 'is-prime %1 || exit; echo %1; kill -USR1 %p' numbers.cl

In that example the exit command acts as
a C continue statement, and the
kill -USR1 %p command acts as a
break statement (or maybe more like a
longjmp call).

Note that the kill command does
not terminate any already running tasks. So under
-P some tasks might already be processing in
parallel with the task that short-circuited the loop.
Since the USR1 sent to
xapply didn't terminate the current task either,
you may need to exit explicitly as well.

This feature is allowed under hxmd as well.
Short-circuited commands are assigned the synthetic status 3000,
as a sentinel value to distinguish them from other failed commands.

As a way to fetch the run-time name xapply
was called by, the expansion of %fp is
the program name.

Safer escapes

Expanding a name like "Paul d`Abrose" into a Mail subject without any
quoting leaves an unbalanced grave quote in the argument. Even worse,
the shell might try to run "Abrose" as a command.

A program should be safe from such corner cases, like a filename with
a quote or control character in the name.
On input, xapply
can use the -print0-style, on
output we depend on the shell. To make a parameter safer, there is a
q modifier that
tells xapply
that you are going to wrap the expansion in shell double-quotes, and that
you'd like the resulting dequoted text to be the original value.

We're asking xapply to backslash any of double-quote, grave,
dollar, or backslash in the target text, so the command is presented to
the shell as:

Mail -s "Hi Paul d\`Abrose" "pa@example.com"...
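What the q modifier does can be approximated with sed(1); a sketch (not xapply's own expander) that backslashes the four characters named above:

```shell
# qquote VALUE: escape double-quote, grave, dollar, and backslash, so the
# result may be wrapped in shell double-quotes safely.
qquote() {
    printf '%s\n' "$1" | sed -e 's/[\\"`$]/\\&/g'
}
qquote 'Hi Paul d`Abrose'
```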

In versions of xapply above 3.60, two more quote
forms are available: %Q quotes all the
shell meta characters with a backslash (\),
and %W quotes all the shell meta characters
and all the default IFS characters (space, tabs
and newlines). These are mostly useful to pass commands to a
remote machine via ssh. This example sends
commands (from cmds) round-robin to
a list of hosts (in /tmp/ksb/hosts):

xapply -ft /tmp/ksb/hosts 'ssh %t1 %Q*' cmds

Note that the number of hosts doesn't have to match the number of commands.

This is not always enough; sometimes the data should be passed through
a scrubber, or sent to /dev/null, when
you don't trust it.

Nested markup

If you really want to get all crazy you can pass more markup in as
a parameter. The escape %+ shifts the
parameters over one to the left, then expands
the new cmd (replacing the
%+), then continues with the rest of
the original cmd.

An example makes this a little clearer:

xapply -n -2 "( %+ )" "echo %1 %1" ksb rm /tmp/bob

Outputs

( echo ksb ksb )
( rm /tmp/bob )

This is really a lot more useful when the input is a pipe
(viz. under -fz).
A program can match commands to parameters and send the
paired stream to xapply
for parallel execution.
This is exactly how hxmd works.

It is also possible to use the first token (from -t)
as a command template via %t+. In this
case the token is consumed, so subsequent references
to %t+ will expand the second token (if any).
As an example:

This strange meme is actually really useful, if you use
ptbw to hold a list of commands that need to be
applied remotely. But you'll just have to trust me on that until you
see gtfw and sshw in
action.

What if I didn't find anything to do?

In most cases, if xapply didn't get any arguments to use
as parameters it shouldn't run anything (unlike busted xargs).
In a few cases, it might be nice to have an "else" part (like a Python
while loop). The -N else option allows
a command to run when we didn't get any tasks started.

Let's rework our compression filter; we'll misspell the extension we
are looking for (so we don't match anything) and put in a message
when we do not find anything to compress.

This is mostly used in scripts to give the Customer a warm feeling that
we looked, but didn't find anything to do.

In the else command %u
expands to 00 (double zero). This allows
other processors (like hxmd) to tell the
difference between the first task (0) and
the else clause. It doesn't help to
send a USR1 to xapply from
the else command, because no commands will be
run, in any event.

As an example of how to recover the notification stream, I'll make a list
of just the else clause:

The -x showed us the shell command built from
else and told us that xapply
did get the signal. The -m option forced
xapply to send notification to
the xclate we started, which diverted those
notifications to the local file res. We converted
the NUL terminated records into text via tr.

Other resources

In all the examples above, xapply is very predictable.
When we run the examples on the same input, we are apt to get the same
output. All that changes when we allow xapply to
start a ptbw to manage a resource.

Each line of a ptbw resource file represents a
unique resource that is allocated to a single task at any one time.
A resource could be anything: a CPU, a filesystem, a
VX disk group, or a network address. I picked a modem in these
examples because the exclusive use (to dial a phone number) is clear.

If we have 3 modems connected to a host
on /dev/cuaa0, /dev/cuaa1, and
/dev/ttyCA, we can put those strings in a file
called ~/lib/modems. Then we can ask xapply
to reserve 1 modem for each command:

xapply -f -t ~/lib/modems -R 1 'myDialer -d %t1 %1' phone.list

No matter how many phone numbers are in phone.list, we
will never try to dial different numbers on the same modem.
This is because xapply and ptbw know how
to work with each other to keep the books straight.

We can force a new ptbw instance into our
process tree by using the -t option, the -J,
or a -R option with any value greater than 0.
If we don't use any of those options, xapply uses
the internal function iota just as ptbw
does, but doesn't insert an instance in the process tree, so any
enclosing ptbw will be directly visible to each task.

The new expander form, %t1, expands to the modem selected.
The -R option specifies how many resources to allocate
to each task.

All of the dicer forms we saw above might be applied to a resource:
given that %t1 expands to /dev/cuaa1:

Expression      Expansion
%t[1/$]         cuaa1
%t[1/-$]        /dev
%t[1.-$]        /dev/cuaa1

If we use the resource to allocate CPUs we might want to get
more than 1 to a task. In that case we can tell ptbw
to just bind unique integers as the resources. On a 16 CPU machine
we could divide the host into 5 partitions of 3 CPUs:

xapply -J5 -R3 -f -P5 'myWorker %t*' task.cl

The -J5 -R3 is passed along to ptbw to
build a tableau that is 5 by 3, then xapply
consults that to allocate resources. The %t* passes
the names of the CPUs provided down to myWorker.

The markup %ft expands to the source of
the tokens. If the tokens are the internal default the name expanded
is iota.

Ways to access data from xapply in xclate

Some programs need to send data through the environment to descendent processes.
The -evar=dicer option allows any
environment variable to be set to a dicer expression.

Here is why xapply has to set the variable: the xclate
output filter is launched as a peer process to the echo command,
so changing $L in the command won't give it a new value
in the (already running) process. We can't set it in the parent shell
as it won't change for each task, so xapply needs to be able
to set it.

I have a list of SHA512 signatures for a set of files I just downloaded,
and I want to check those against the files themselves. The list is
directly from the OpenBSD.org website, with lines like:

So I need to snip out the filename with %[1(2)1]
(take the first parameter, snip at the open-paren, choose the second field,
snip at the close-paren, and choose the first element). Then I need to
use openssl to compute the SHA512 and compare that output to the line itself.

Since different versions of openssl don't all put the spaces
in the same way, we'll have to delete the blanks to make the lines match.
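The blank-stripping comparison can be sketched like this (the digest lines here are invented stand-ins, not real OpenBSD data):

```shell
# Compare a published digest line against a computed one, ignoring spacing.
listed='SHA512 (base70.tgz) = 0123abcd'      # hypothetical line from the list
computed='SHA512(base70.tgz)= 0123abcd'      # hypothetical openssl output
if [ "$(printf %s "$listed" | tr -d ' ')" = "$(printf %s "$computed" | tr -d ' ')" ]
then echo match
else echo MISMATCH
fi
```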

Another way to access %u

The option -u forces xapply to pass the value
of %u to any output xclate as the xid.
Using that, the above example becomes

XCLATE_1='-T "loop %x"' xapply -m -u 'echo' A B C

but that's not the reason this option exists.

When another processor (say hxmd) wants to know which of
several tasks has completed, it can call xapply with
-u and xclate with -N notify.
Then, xclate reports the completion of each task with
the number of the task as the xid on the resource
given to -N.

This makes xapply an excellent "back-end" program to manage
parallel tasks, although it works best from a C or perl program.
Here is an example where we use notify to
show the order of complete tasks:

This would be more useful if we could get the exit
code from each task, and we can under -r.
Try the same example with a -r
switch passed to xclate (-Nr).
The 2 numbers are the exit status, and
the xid.

Also, try both of those without
the -u option to xapply, in
1 case, you get the number of the task, in the other the number of
seconds slept (which is the value of %1).

The observant student might think
this looks like it was designed to be given as input to an instance
of xapply -fz.
Another possible use is hxmd's retry logic.

One last corner case: the -r output for -N's
command is encoded as task "00". Thus, it is distinguishable, as a
string, from the first task (given as "0"). This is the same hack
the new rmt program uses to tell the client it has
a new more advanced command set.

Looks like ptbw to me

The ptbw program allows a shorthand to
access the recovered
resources as shell positional parameters. For historical reasons, this
option is also provided by xapply. In the xapply
case the shell parameters ($1, $2, ...) become
run-time versions of the expander names (%t1, %t2, ...).

That makes our command line modem example look like:

xapply -f -t ~/lib/modems -R 1 -A 'myDialer -d $1 %1' phone.list

We don't have to specify a -e MODEM, we can just force
the name into $1 and use it from there. This even works
when the -S option selects
perl as the shell, or
even worse tcsh.

See the ptbw HTML document for more ideas
about how to set up resource pools and use them from the command-line
and from scripts.

Using xapply as a co-process

A co-process allows multiple shell pipelines to share a common data
source or data sink. This is a very powerful construct in scripts,
which is often used to reduce common code and focus multiple data sources
into a single pipeline.
See the ksh manual page
under Co-Processes, if you've never heard of these before.

Because of the way xapply is designed,
it makes a really great co-process. It manages a list of tasks
given to it on stdin, and outputs a list
of results on stdout -- which is exactly
what a co-process service should do.

For a real turbo, let's start our gzip loop
as a co-process in a fair mockup of a workstation dump structure.

Say we want to dump many workstations in parallel to a large file server.
We are going to ssh to each client to run
dump(8) over a list of filesystems.
But we need to limit the impact to each workstation owner's
desktop, so let's run the compression for the files
locally on the file server. For a start, I'm going to assume
that the file server can run at least 4 processes at a time.

I'm going to simplify the code a little to show the inner loop
for a single host here.
We'll start a co-process that keeps 3 gzip tasks
running. To do that, it reads the names
of the files to compress from stdin, so
the main script outputs each completed dump archive to the co-process
with print -p, if it is marked in the list
as gzip. After all the hosts are
finished, we close the co-process's input, then wait for
it to finish.

In the real code, we run several hosts in parallel. Also, the list of
target filesystems is not from a here document, but that would be
much harder to explain here. I put in a comment where one might
display (or process) the log from all the gzip
processes. This might be used to feed-back and tune the compression
levels or exclude dumps that grow when compressed (viz. compressed tar files
tend to do that from /var/ftp).

The reason this is a good structure is that the number of compression
tasks is controlled with a single -P3
specification; when we move the process to a newer host, we can tune it
up to use most of the CPU, saving just enough to run ssh
to fetch backups from our client hosts. In the production script,
the parallel factor is a command-line option, and an outer loop also
processes multiple client hosts in
parallel with xapply.

Conversely, when we need more resources for the incoming dump streams we
can reduce -P, or
tune the nice options to
focus more effort on the ssh encryption tasks.
And to simplify the code, we could use a pipeline to compress the dumps
as they stream in from the client, but that slows down the over-all
throughput of the process to the speed of the backup host, which may
have more disks than brains.

Another co-process example, from your shell

If you run xapply as a co-process, you might look at
a pstree (aka. ptree) of
the processes doing the work. What you should see is the peer
instance of xapply with some workers below it,
and sometimes a defunct process or 2
waiting to be reaped. These don't hurt anything; it is just the way
xapply blocks reading input before it checks
for finished tasks. Here is a simple example, using your own
ksh as the master process:

The reason we see 2 exited children under
the co-process xapply
is that xapply was blocked waiting for a child to
exit until one did (to free up a slot), then it
noticed that there were no more tasks to launch (when we moved and closed the
p descriptor). So it waited for the
other children, then exited itself.

Always remember that the co-process can be an entire pipeline, which is
better than just a single xapply.
I use nice to start my co-process command
and |& to end it, as
structural documentation in the script.

The nice also puts the main script at an advantage,
but you could do the opposite and use op (or
sudo) to get better scheduling priority,
a different effective uid, or some other escalation for
the co-process. If you need the exit codes from the processes see
a note above about using
a wrapped xclate to do that.

Using xapply as an interpreter

Sometimes it is handy to keep the cmd in
a file, rather than specify it literally on the command-line. In that
case the -F forces the specification of the
command as a filename. The file is read into memory, then the first
line is removed if it looks like an
execve(2)
interpreter line.

The resulting text is used as normal cmd.
That is to say that all the common markup is replaced for every
set of parameters.

Note that we set the parallel factor to 1 to
override the default in the loader line. We could also set
PARALLEL to force a known value.

An example, which draws on those above, would be to gzip
each file presented. This is called "gzall":

$ cat /tmp/gzall
#!/usr/bin/env -S xapply -P -F
exec nice gzip -9 %W1

Also note the use of
env(1) to
do split-string processing for the arguments to xapply.
This is not supported on some versions of env,
which makes this facility less useful.

When would you use xapply as an interpreter?

If the cmd you need is very complex, or hard to
remember, you should put it in a script. If the script just calls
xapply, you might as well make it the interpreter.
Why not? Setting -N and -S
on the loader line is often helpful, and it really reduces the shell
quoting needed to make the script.

The other use for -F

I find it easier to match the markup in
a literal command to the
context when cmd is specified literally to
the xapply command. So I don't use
-F to hide the template in a file most of
the time.
I believe this makes my code more clear to the reader. Really,
my code is almost never clear to the reader, but I do try.

The exception to this is when the command I need is (itself) built
by a make recipe or another process. In that
case I'll use -F as the last option in
the command-line specification, which places
the script where cmd
would normally separate the options from the args
(or files).

Note that -F will not take
a dash as stdin. That would be hard to
justify, since there is really no use-case for it.
(Just read the command into a shell variable,
then specify that variable as cmd.)

Version output, like any of ksb's tools

Every one of my tools should take -V to
output a useful version banner, and -h to
output a brief on-line help message. So xapply does.

Like any of my tools with markup

Most of my tools that accept command-line markup (like
%1) have quick-reference output under
-H, so xapply does.