What you need to know to understand this document

This document assumes you are quite familiar with the standard UNIX shell, sh(1), have an understanding of the UNIX™ process model and exit codes, and have coded several scripts and used gzip and find.

It also assumes that you can read the manual page for any other example command. It would help a little if you've used printf(3) or some other percent-markup function, but it's OK if you've not used any before.

What is xapply?

Simply stated, xapply is a generic loop. It iterates over items you provide, running a customized shell command for each pass through the loop. One might code this loop as something like:
for Item in $ARGV
do
	body-part $Item
done
and feel pretty good about it, so why would you need xapply?
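For comparison, the plain xapply spelling of that same loop (one task at a time, which is the default) would be just:

xapply 'body-part %1' $ARGV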

The number one reason to use xapply is that it runs some of the body-parts in parallel. It starts as many as you ask it to (using the -P option); then, as processes finish, it launches the next iteration of body-part, until they are all started. It waits for the running ones to finish before it exits.

The benefit is that we might take advantage of more CPU resources (either as threads on CPU cores, or as multiple CPU packages in a host).

Even better, it can manage the output from those parallel tasks so that each is not mixed in with the others. Without the -m switch xapply assumes you can figure out which iteration of body-part produced each line. Under the -m option xapply groups the output from each iteration together, so that one iteration's output finishes completely before the next one starts.

Like most loops, xapply can step through the list more than one item at a time. The -count option allows you to visit the items in the argument list in pairs (or groups of count). This is handy for programs like diff that need two targets.
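For example, a sketch comparing corresponding pairs from the argument list (the file names here are hypothetical):

xapply -2 'diff %1 %2' chap1.old chap1.new chap2.old chap2.new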

Unlike common loops xapply keeps track of critical resources for each iteration. A body can be bound to a token which it uses for the life of its task. That resource token (for example a modem) won't be issued to another iteration until the owner-task is complete; then it will be allocated to a waiting task. This allows xapply to make very efficient use of limited resources (and it honors -P as an upper limit as well).

Xapply has other friends. In fact it is the core node that connects xclate, ptbw and hxmd to each other. We'll come back to the usefulness of that fact in a bit.

In summary, xapply lets you take advantage of all the CPU resources on a host while keeping the tasks and resources straight. To raise the overall torque even more it reaches out to share resources, to collate output, and reuse configuration data. These features are all coordinated across multiple instances of xapply and the related tools.

Basic examples

The gzip utility can be pretty expensive in terms of CPU. If we want to compress many output files (say *.txt) we could run something like:
gzip -9 *.txt

Most modern hosts have more than the single CPU that gzip is going to use. We might break the list up with some shell magic (like split(1)), then start a copy of gzip for each piece of the list. That won't balance the CPUs, as one list will inevitably have most of the small files. That short list finishes long before the others, leaving an idle CPU with files left to compress.

The shell code to split the list up is also pretty complex. Given a temporary file, it might look like this:

/bin/ls -1 *.txt >$TMP_FILE			# one file name per line
LINES=`wc -l <$TMP_FILE`
split -l $((LINES/4+1)) $TMP_FILE $TMP_FILE,	# cut into 4 nearly even lists
for Start in $TMP_FILE,*
do
	xargs gzip -9 <$Start &			# one background gzip per list
done
wait						# for all 4 to finish
rm $TMP_FILE $TMP_FILE,*

With xapply, we can keep 4 processes running in parallel with:

xapply -P4 "gzip -9" *.txt
That will keep our machine busy for a while! If there are fewer than 4 files we just start as many as we can. Any more than that queue until (the smallest or first) one finishes, then another starts. This actually sustains a load average right at 4.0 on my test machine. The xapply process itself is blocked in the wait system call, and therefore uses no CPU until it is ready to start another task.

In some cases the list of files might be too long for an argument list. We can provide the arguments on stdin (or from a file) with the -f switch to xapply:

find . -name \*.txt -print |
xapply -f -P4 "gzip -9" -
This is also good because it won't try to compress a file named "*.txt" in the case where the glob doesn't match anything. The other great thing is that the first gzip task starts as soon as find can send the first filename through the pipe!

When find has queued enough files to block on the pipe it gives up the CPU to the gzips, which is exactly what you want. Just before that there are actually 5 tasks on the CPU, which is OK as find is largely blocked on I/O while gzip is busy on the CPU.

I/O features -- input

Under UNIX's nifty pipe abstraction it is best to think of xapply as a filter, reading from stdin and writing to stdout, like awk would. We'll see in the custom command section that this is closer to the truth than it looks. For now just play along.

Because of the parallel tasks xapply has some unique issues with I/O.

On the input side we have issues with processes competing for input from stdin. We take several measures to keep the books balanced.

the -count switch and stdin
This xapply command folds input lines 1 and 2 to a single line, then 3 and 4, then 5 and 6 -- and so on to the end of the file:
xapply -f -2 'echo' - -
The two occurrences of stdin, spelled dash ("-") like most UNIX filters, share a common reference. That is, the code knows to read one item from stdin for each dash, for each iteration, rather than reading all of stdin for the first dash and leaving nothing for the second.

In other words it does what you'd expect. Using -3 and three dashes reformats the output to present 3 lines as a single output line.
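A quick sketch of that:

printf 'a\nb\nc\nd\ne\nf\n' |
xapply -f -3 'echo' - - -
This outputs "a b c" then "d e f".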

speaking in terms of lines
Sometimes newline is not a good separator. Find has the -print0 option for just this reason. Xapply has the -z option to read -print0 output. Some other programs, like hxmd, also use the nul terminated format.

So the compress example might become:

find . -name \*.txt -print0 |
xapply -fz -P4 "gzip -9" -
the command line option -i input
This option opens a different file as the common stdin for all the inferior tasks. Under -f the default value is /dev/null. This lets the parent xapply use stdin for input without random child processes consuming bits from it.

To provide a unique word from $HOME/pass.words to each of 5 tasks:

xapply -i $HOME/pass.words 'read U && echo %1 $U' 1 2 3 4 5
This has some limits: when the file is too short for the number of tasks, the read will fail and the echo won't be executed. (Put 3 words in the file and try it.) We might want to recycle the words after they've been used; see below where we explain how -t does that.
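A sketch of that experiment (3.words is a hypothetical file name):

printf 'red\ngreen\nblue\n' >3.words
xapply -i 3.words 'read U && echo %1 $U' 1 2 3 4 5
Only the first 3 tasks win a word; the last 2 reads fail, so their echos never run.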

Since the read is part of a program it could be part of a loop, so a variable number of words from the input file could be read for each task. Under -P this could be problematic.

I/O features -- output

Without the -m option, xapply's tasks each send output to stdout, all jumbled together. This is not evident until you try a large -Pjobs case with a task that produces output over time (like a long-running make). If you want an example of this you might compare:
xapply -P2 -J4 'ptbw -' '' ''
to the collated version:
xapply -m -P2 -J4 'ptbw -' '' ''

The xclate processor is xapply's output friend. It is not usually your friend, as it is hard to follow all the rules. In fact some programs, like gzip, don't follow the rules very well. You'll have to compensate for that in your xapply spells.

In our example above we'd like to add the -v switch to gzip to see how much compression we are getting:

find . -name \*.txt -print0 |
xapply -fz -P4 "gzip -9 -v" -
Which looks OK, until you run it. The starts of all the compression lines come out at once (the first 4 of them), then the statistics get mixed up with the new headers as they are output. It is a mess.

By adding the -m switch to the xapply we should be able to collate the output. However it doesn't work because the statistics are sent to stderr, so we must compensate with the addition of a shell descriptor duplication:

find . -name \*.txt -print0 |
xapply -fzm -P4 "gzip -9 -v 2>&1" -

The logic in xapply to manage xclate is usually enough for even nested calls. When it is not you'll have to learn more about xclate, I'd save that for a major rain storm, or long trip on a plane.

Xapply's command line option -s passes the squeeze option (also spelled -s) down to xclate. This option allows any task which doesn't send any text to stdout to exit without waiting for exclusive access to the collated output stream. This speeds the start of the next task substantially in cases where output is rare (and either long, or evenly distributed).
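For example, in a sketch like this one (ERROR and the *.log files are stand-ins), most tasks match nothing and output nothing, so under -s they exit without waiting for a turn at the collated stream:

xapply -m -s -P8 'grep -l ERROR %1' *.log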

Building a custom command

The old-school UNIX command apply uses a printf-like percent expander to help customize commands. As a direct descendant of apply, xapply has a similar expander. As one of my tools it has a lot more power in that expander.

In addition to the apply feature of binding %1 to the first parameter, %2 to the second, and so forth, xapply has access to a facility called the dicer.

The dicer is a shorthand notation used to pull substrings out of a larger string with a known format. For example a line in the /etc/passwd file has a well-known format which uses colons (":") to separate the fields. In every password file I've ever seen the first field is the login name of the account. The xapply command

xapply -f 'echo %[1:1]' /etc/passwd
filters the /etc/passwd file into a list of login names.

The dicer expression %[1:1] says "take the first parameter, split it on colon (:) then extract the first subfield". Here are several possible dicer expressions and their expansions:

Expression       Expansion
%1               /usr/share/man/man1/ls.1.gz
%[1/2]           usr
%[1.1]           /usr/share/man/man1/ls
%[1.1].%[1.2]    /usr/share/man/man1/ls.1
%[1/$.1]         ls
I stuck a nifty one in there: the dollar sign always stands for the last field. The other important point is that %[1/1] would expand to the empty string, since the first field is empty.

The dicer also lets us remove a field with a negative number:

Expression       Expansion
%1               /usr/share/man/man1/ls.1.gz
%[1/-1]          usr/share/man/man1/ls.1.gz
%[1/-2]          /share/man/man1/ls.1.gz
%[1.-$]          /usr/share/man/man1/ls.1

Because splitting on white-space is so common, the blank character is special in that it matches any number of white-space characters. Escape any of blank, a digit, close-bracket, or backslash with a backslash to force it to be taken literally.
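For example, to pull the login name (the first white-space-separated field) out of each line of who(1) output:

who | xapply -f 'echo %[1 1]' -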

Later versions of xapply also give access to the mixer, which selects individual characters from a dicer expression. That is slightly beyond the scope of this document. As an example, %(3,$-1) is the expression to reverse the characters in %3. All these tools use the same mixer+dicer expression syntax: xapply, mk, and sbp.
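A quick taste, following the pattern of the %3 example above:

xapply 'echo %(1,$-1)' hello
This outputs "olleh".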

Preferences for the picky coder

There are some options that let you select details about the environment that xapply provides: viz. shells, escape characters, and padding.

I like to use perl

The -S shell option lets you select a shell for the command built to start each task. I would use ksh or sh if it were me. You could set $SHELL to anything you like, but that might confuse other programs that use xapply, so stick to -S.

As a special case, setting -S perl changes the behavior of xapply. To introduce the command string it uses perl -e rather than the Bourne shell compatible $SHELL -c. It might also set up -A differently (see below).
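A minimal sketch of that special case:

xapply -S perl 'print "%1\n";' alpha beta
Each task runs as perl -e 'print "alpha\n";' (and so on), rather than through $SHELL -c.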

Input file padding

Given a count of 2 and 2 file parameters under -f, xapply matches the corresponding lines from each file as parameter pairs. When only one of the files runs out of lines the empty string is provided as the element from the other. You can change this pad string to anything you like, for example -p /dev/null.

In one of our first examples we joined pairs of lines. What happens if there is only 1 line? The echo command gets an extra space on the end, which it trims. To see that we can replace the default expansion with a quoted one, and run it through cat -v:

echo A |xapply -f -2 'echo "%*"' - - | cat -ve
This outputs "A $" (without the quotes).

There are alternatives. Under -p we can detect a sentinel value for the missing line. Say, for example, that a comma on a line by itself could never be an element of the input; then -p , would let us detect the missing even line with

xapply -p , ... if [ _"%2" = _"," ] ; then ...'

It is usually considered good form to exit from a task as soon as possible. With this in mind the above trap might be better coded as:

... [ _"%2" = _"," ] && exit; ...'

Percent marks are so vulgar

If you don't like the escape character you can change it with the -a option. Take care that the symbol you pick is quoted from the shell. Viz. "xapply -a ~ ..." is not what you'd want under csh or ksh, since the tilde gets expanded to a path to someone's home directory.

Because xapply is driven from mkcmd it accepts the full list of character expressions (-a "^B" is ASCII stx, -a M-A is code 230), but that doesn't mean you should use them. Try to stick with percent if you can. In ksh that makes some let, $((...)), and ${NAME%glob} parameter substitutions require %% to get a literal percent sign.
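That said, a minimal sketch with a tamer escape character:

xapply -a @ 'echo @1' one two
Here @1 plays the role %1 normally would.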

More advanced escapes

Since xapply is emulating a generic loop it stands to reason that there would be a "loop counter". The loop counter is named %u, which stands for "unique". Since I'm a C programmer, I start the loop counter at zero (0) and bump it up one for each trip through the loop.

For example to output the numbers 0 to 4 next to the letter 'A' to 'E':

xapply 'echo %u %1' A B C D E

A better use of this might be to process data from one iteration to the next (making generations of a file with the extension .%u).

Use of the ksh built-in math operations to build a function based on %u is common. To queue many at jobs about 5 minutes apart:

xapply -x 'at now + $((%u*5)) minutes < %1' *.job
The -x option lets you see the commands executed on stderr. This emulates set -x in Bourne shell.

Safer escapes

Say one of the input names to the command below is "Paul d`Abrose".
xapply -f -2 'Mail -s "Hi %1" "%2" <greeting.mail' names.cl address.cl
That will expand to an unbalanced grave quote in the subject argument. Even worse, we might end up running "Abrose" as a shell command.

A program should be safe from such corner cases, like a filename with a quote or a control character in the name. On input xapply can use the -print0 style; on output we depend on the shell. To make a parameter safer there is a q modifier that tells xapply that you are going to wrap the expansion in shell double-quotes, and that you'd like the resulting dequoted text to be the original value.

By spelling the expansion as:

xapply -f -2 'Mail -s "Hi %q1" "%q2" <greeting.mail' names.cl address.cl
We're asking xapply to backslash any of double-quote, grave, dollar, or backslash in the target text, so the command is presented to the shell as:
Mail -s "Hi Paul d\`Abrose" "pa@example.com"...

This is not always enough; sometimes the data should be passed through a scrubber, or sent to /dev/null, if you don't trust it.

Nested markup

If you really want to get crazy you can pass more markup in as a parameter. The escape %+ shifts the parameters over one to the left, then expands the new cmd (replacing the %+), then continues with the rest of the original cmd.

An example makes this a little clearer:

xapply -n -2 "( %+ )" "echo %1 %1" ksb rm /tmp/bob
Outputs
( echo ksb ksb )
( rm /tmp/bob )

This is really a lot more useful when the input is a pipe (viz. under -fz). A program can match commands to parameters and send the paired stream to xapply for parallel execution. This is exactly how hxmd works.
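A sketch of that pairing, feeding nul-terminated command/parameter pairs down a pipe (the %% in the printf format yields a literal percent sign):

printf 'echo %%1\0alpha\0echo %%1 %%1\0beta\0' |
xapply -fz -2 '%+' - -
This runs "echo alpha", then "echo beta beta".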

What if I didn't find anything to do?

In most cases if xapply didn't get any arguments to use as parameters it shouldn't run anything (unlike busted xargs). In a few cases it might be nice to have an "else" part (like a Python while loop). The -N else option allows a command to run when we didn't get any tasks started.

Let's rework our compression filter: we'll misspell the extension we are looking for (so we don't match anything) and put in a message for when we find nothing to compress.

find . -name \*.text -print0 |
	xapply -fzm -P4 -N "echo nothing to compress 1>&2" "gzip -9 -v 2>&1" -

This is mostly used in scripts to give the Customer a warm feeling that we looked, but didn't find anything to do.

Other resources

In all the examples above xapply is very predictable. When we run the examples on the same input, we are apt to get the same output. All that changes when we allow xapply to start a ptbw to manage a resource.

Each line of a ptbw resource file represents a unique resource that is allocated to a task. A resource could be anything: a CPU, a filesystem, a VX disk group, or a network address. I picked a modem in these examples because the need for exclusive use while dialing a phone number is clear.

If we have 3 modems connected to a host on /dev/cuaa0, /dev/cuaa1, and /dev/ttyCA we can put those strings in a file called ~/lib/modems. Then we can ask xapply to reserve 1 modem for each command:

xapply -f -t ~/lib/modems -R 1 'myDialer -d %t1 %1' phone.list
No matter how many phone numbers are in phone.list we will never try to dial different numbers on the same modem. This is because xapply and ptbw know how to work with each other to keep the books straight. We can force a new ptbw instance into our process tree by using the -t option, the -J option, or a -R option with any value greater than 0. If we don't use any of those options xapply uses the internal function iota just as ptbw does, but doesn't insert an instance in the process tree, so any enclosing ptbw will be directly visible to each task.

The new expander form %t1 expands to the modem selected. The -R option specifies how many resources to allocate to each task.

All of the dicer forms we saw above might be applied to a resource: given that %t1 expands to /dev/cuaa1:

Expression       Expansion
%t[1/$]          cuaa1
%t[1/-$]         /dev
%t[1.-$]         /dev/cuaa1

If we use the resource pool to allocate CPUs we might want to get more than one per task. In that case we can tell ptbw to just bind unique integers as the resources. On a 16 CPU machine we could divide the host into 5 partitions of 3 CPUs:

xapply -J5 -R3 -f -P5 'myWorker %t*' task.cl
The -J5 -R3 is passed along to ptbw to build a tableau that is five by three; then xapply consults that to allocate resources. The %t* passes the names of the CPUs provided down to myWorker.

Ways to access data from xapply in xclate

Some programs need to send data through the environment to descendent processes. The -e var=dicer option allows any environment variable to be set to a dicer expression.

To specify the modem in $MODEM (rather than in an option):

xapply -f -t ~/lib/modems -R 1 -e "MODEM=%t1" 'myDialer %1' phone.list

This is also really useful to send options down to xclate in XCLATE_1 to set headers and footers on collated output.

XCLATE_1='-T "loop %{L}"' xapply -m -e L=%u 'echo' A B C
For more on the use of XCLATE_n see the xclate HTML document.

Here is why xapply has to set the variable: the xclate output filter is launched as a peer process to the echo command, so changing $L in the command won't give it a new value in the (already running) process. We can't set it in the parent shell, as it won't change for each task, so xapply needs to be able to set it.

Another way to access %u

The option -u forces xapply to pass the value of %u to any output xclate as the xid. Using that, the above example becomes
XCLATE_1='-T "loop %x"' xapply -m -u 'echo' A B C
but that's not the reason this option exists.

When another processor (say hxmd) wants to know which of several tasks has completed it can call xapply with -u and xclate with -N notify. Then xclate reports the completion of each task with the number of the task as the xid on the resource given to -N.

This makes xapply an excellent "back-end" program to manage parallel tasks, although it works best from a C or perl program. Here is an example where we use notify to show the order of completed tasks:

xclate -m -N '|tr -u \\000 \\n|while read N; do echo fini $N; done' -- \
		xapply -m -u -P5 'sleep' 3 2 5 2 3
It would be sad if we couldn't get the exit code from each task, but we can: try the same with a -r switch passed to xclate. The two numbers are the exit status and the xid.

Also try both of those without the -u option to xapply: in one case you get the number of the task, in the other the number of seconds slept (which is the value of %1).

The observant student might think this looks like it was designed to be given as input to an instance of xapply -fz. Another possible use is hxmd's retry logic.

One last corner case: the -r output for -N's command is encoded as task "00". Thus it is distinguishable, as a string, from the first task (given as "0"). This is the same hack the new rmt program uses to tell the client it has a new, more advanced command set.

Looks like ptbw to me

The ptbw program allows a shorthand to access the recovered resources as shell positional parameters. For historical reasons this option (-A) is also provided by xapply. In the xapply case the shell parameters ($1, $2, ...) become run-time versions of the expander names (%t1, %t2, ...).

That makes our command line modem example look like:

xapply -f -t ~/lib/modems -R 1 -A 'myDialer -d $1 %1' phone.list
We don't have to specify a -e MODEM; we can just force the name into $1 and use it from there. This even works when the -S option selects perl as the shell, or even worse tcsh.

See the ptbw HTML document for more ideas about how to setup resource pools and using them from the command-line and from scripts.

Using xapply as a co-process

A co-process allows multiple shell pipelines to share a common data source or data sink. This is a very powerful construct in scripts, often used to reduce common code and focus multiple data sources into a single pipeline. See the ksh manual page under Co-Processes if you've never heard of these before.
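If the construct is new to you, here is a minimal ksh sketch (awk's fflush keeps the pipe from buffering the reply):

awk '{ print toupper($0); fflush() }' |&	# start the co-process
print -p "hello"				# write a line to its stdin
read -p REPLY					# read its reply
echo "$REPLY"					# outputs HELLO
exec 3>&p; exec 3>&-				# close its input when finished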

Because of the way xapply is designed it makes a really great co-process. It manages a list of tasks given to it on stdin, and outputs a list of results on stdout -- which is exactly what a co-process service should do.

For a real turbo let's start our gzip loop as a co-process in a fair mockup of a workstation dump structure.

Say we want to dump many workstations in parallel to a large file server. We are going to ssh to each client to run dump(8) over a list of filesystems. But we need to limit the impact on each workstation owner's desktop, so let's run the compression for the files locally on the file server. For a start I'm going to assume that the file server can run at least 4 processes at a time.

I'm going to simplify the code a little to show the inner loop for a single host here. We'll start a co-process that keeps 3 gzip tasks running. To do that it reads the names of the files to compress from stdin, so the main script outputs each completed dump archive to the co-process with print -p, if it is marked in the list as "gzip". After all the hosts are finished we close the co-process's input, then wait for it to finish.

#!/bin/ksh
# comments and some argument processing
: ${SSH_AUTH_SOCK?'must have an ssh agent to run automated backups'}
unset TAPE RMP RSH
...
nice xapply -P3 -f 'gzip -7v %1 1>&3 2>&3' - 3>gzip.log |&
...
for TARGET in ... ; do
	...
	while read FS WHERE COMPRESS junk ; do
		ssh root@$TARGET -x -n su -m operator -c "'/bin/sync; exec /sbin/dump -0uL -C16 -f - $FS'" >$WHERE.dump
		[ _${COMPRESS:-no} = _gzip ] && print -p $WHERE.dump
	done <<-\!
/	slash	gzip
/var	var	gzip
/usr	usr	gzip
/home	home	gzip
/var/ftp var_ftp no
...
!
done
exec 3>&p;exec 3>&-	# close the co-process's stdin so it can finish
wait
# cat gzip.log
exit 0

In the real code we run several hosts in parallel. Also the list of target filesystems is not from a here document, but that would be much harder to explain here. I put in a comment where one might display (or process) the log from all the gzip processes. This might be used to feed back and tune the compression levels, or to exclude dumps that grow when compressed (viz. compressed tar files from /var/ftp tend to do that).

The reason this is a good structure is that the number of compression tasks is controlled with a single -P3 specification: when we move the process to a newer host we can tune it up to use most of the CPU, saving just enough to run the ssh that fetches backups from our client hosts. In the production script the parallel factor is a command-line option, and the outer loop also processes multiple client hosts in parallel with xapply.

Conversely, when we need more resources for the incoming dump streams we can reduce -P, or tune the nice options to focus more effort on the ssh encryption tasks. And to simplify the code we could use a pipeline to compress the dumps as they stream in from the client, but that slows the overall throughput of the process down to the speed of the backup host, which may have more disks than brains.

Another co-process example, from your shell

If you run xapply as a co-process you might look at a pstree (aka ptree) of the processes doing the work. What you should see is the peer instance of xapply with some workers below it, and sometimes a defunct process or two waiting to be reaped. These don't hurt anything; it is just the way xapply blocks reading input before it checks for finished tasks. Here is a simple example, using your own ksh as the master process.
$ nice xapply -f -P3 'sleep %1; date 1>&3' - 3>log.$$ |&
$ jobs
[1] + Running              nice xapply -P3 -f "sleep %1; date 1>&3" - 3>
$ print -p 10
$ ptree -n $$
1380  ksh -i -o vi -o viraw
  31057 xapply -P3 -f sleep %1; date 1>&3 -
    31058 /bin/ksh -c sleep 10; date 1>&3 _
      31063 sleep 10
    31059 ptree -n 1380
$ print -p 20 ; print -p 22 ; print -p 21
$ ptree -n $$
1380  ksh -i -o vi -o viraw
  31148 xapply -P3 -f sleep %1; date 1>&3 -
    31149 /bin/ksh -c sleep 20; date 1>&3 _
      31161 sleep 20
    31150 /bin/ksh -c sleep 22; date 1>&3 _
      31163 sleep 22
    31152 /bin/ksh -c sleep 21; date 1>&3 _
      31162 sleep 21
  31164 ptree -n 1380
$ sleep 30
$ ptree -n $$
1380  ksh -i -o vi -o viraw
  31148 xapply -P3 -f sleep %1; date 1>&3 -
    31150 ()
    31152 ()
  31168 ptree -n 1380
$ exec 4>&p ; exec 4>&-
[1] + Done                 nice xapply -P3 -f "sleep %1; date 1>&3" - 3>
$ wc -l log.$$
       4 log.1380
$ rm log.$$

The reason we see 2 exited children under the co-process xapply is that xapply was blocked waiting for a child to exit until one did (to free up a slot), then it noticed that there were no more tasks to launch (when we moved and closed the p descriptor). So it waited for the other children, then exited itself.

Always remember that the co-process can be an entire pipeline, which is better than just a single xapply. I use the nice to start my co-process command and the |& to end it as structural documentation in the script.

The nice also puts the main script at an advantage, but you could do the opposite and use op (or sudo) to get better scheduling priority, a different effective uid, or some other escalation for the co-process. If you need the exit codes from the processes, see the note above about using a wrapped xclate to do that.

Like any of ksb's tools

Every one of my tools should take -V to output a useful version banner, and -h to output a brief on-line help message. So xapply does.

See also

There are more examples of how one might use xapply in the hxmd HTML document and the msrc HTML document.
$Id: xapply.html,v 3.19 2010/08/13 17:19:58 ksb Exp $