hxmd
(8).
efmd
Efmd is a macro processor based on m4.
Unlike hxmd, which runs a shell command for each configuration element,
efmd filters a stream of macro expanded text, once for all of the selected hosts.
This tactic reduces the CPU time spent formatting simple reports,
versus the overhead of fork'ing all the echo shell processes that hxmd
requires for similar tasks.
It also creates the output lines in a stable order: due to the parallel
nature of hxmd, its output lines tend to come out slightly permuted.
The attributes bound to each host are interpreted identically in both
hxmd and efmd to select hosts and customize macro expanded text.
Other tools based on hxmd also reuse the same logic (msrc, mmsrc,
and newer versions of distrib).
efmd resembles hxmd, but doesn't include all the options
from the xapply wrapper stack.
Here is the usage from the manual page:
efmd [-n] [-B macros] [-C configs] [-d flags] [-D m4-option] [-I m4-option]
     [-U m4-option] [-E compares] [-F literal] [-G guard] [-j m4prep] [-k key]
     [-M prefix] [-o attributes] [-T header] [-X ex-configs] [-Y top]
     [-Z zero-config] [arguments]
Or for just a list of the key macros selected:
efmd -L [-B macros] [-C configs] [-d flags] [-D m4-option] [-I m4-option]
     [-U m4-option] [-E compares] [-G guard] [-j m4prep] [-k key]
     [-M prefix] [-o attributes] [-X ex-configs] [-Y top] [-Z zero-config]
Or for on-line help:
efmd -h
For common ksb-style version information:
efmd -V
In that usage, 3 options (-L, -n and -T) are not common to hxmd.
The configuration file format is exactly the same as that of
mmsrc, hxmd and distrib (which is the whole point).
See the hxmd HTML document
for more details about host selection and configuration. I'm going to
assume that you have used hxmd to generate
simple reports with echo and want to make those
faster (or more efficient).
efmd
hxmd
you
should use efmd
to optimize any
code that just reports values from the configuration
file and fixed text. This includes merges or extracts of
the configuration files themselves.
For example, this spell is pretty common:
hxmd -C my.cf -B SPECIAL 'echo HOST'
That runs sh for each selected item. With hundreds of selected items, the
xapply/xclate and shell processing is all overhead for little work.
I might rephrase the spell for efmd as:
efmd -C my.cf -B SPECIAL -L
For about 1,000 selected items, the results of
those 2 spells are about the same (given -P1
they would be exactly the same).
But the expense of the 2 runs is very different:
Command   Wall clock    User CPU      System CPU
hxmd      8.46s real    4.49s user    17.98s system
efmd      0.42s real    0.36s user    0.06s system
The four-tenths of a second that is common is the time spent selecting the
hosts, the rest of the hxmd
cost is the time spent
setting the xapply
machine up and running all those
shells and m4
filters
(it is still quite speedy given the number of processes executed).
In the comparison above, I cheated and used the "show me the keys" option
to efmd
: I should have asked for the
HOST
macro to be more fair;
that wording of the same spell would be:
efmd -C my.cf -B SPECIAL HOST
Which added another line to the table:
Command     Wall clock    User CPU      System CPU
hxmd        8.46s real    4.49s user    17.98s system
efmd -L     0.42s real    0.36s user    0.06s system
efmd HOST   0.80s real    0.78s user    0.05s system
The additional four-tenths of a second is the time it takes to process a second
m4 filter to expand HOST for each selected definition.
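The claim that the two spells report the same hosts, only in a different order, can be checked by sorting both reports before comparing them. This is a minimal sketch: the helper name same_hosts and the report file names in the comments are illustrative assumptions, not part of hxmd or efmd.

```shell
#!/bin/sh
# Sketch: succeed when two saved reports hold the same lines,
# ignoring order (hxmd output may come out slightly permuted).
# same_hosts is a hypothetical helper, not part of either tool.
#
# Possible use, with made-up file names:
#	hxmd -C my.cf -B SPECIAL 'echo HOST' >by-hxmd
#	efmd -C my.cf -B SPECIAL HOST >by-efmd
#	same_hosts by-hxmd by-efmd && echo "same selection"
same_hosts() {
	a=$(sort "$1") || return 2
	b=$(sort "$2") || return 2
	[ "$a" = "$b" ]
}
```

Sorting throws away the ordering on purpose, so this only confirms the selection, not the stable order that efmd provides.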
efmd
m4
without running any dangerous
shell commands:
$ efmd -Cother.cf "HOST IS_ZMD(DEFNET)"
imp .
nostromo yes
lv426 yes
sulaco yes
...
That output may tell you which part of site policy the macro belongs to, or it might leave you wondering why you even tried to figure it out. It is largely up to your local site policy how that plays out. No volume of comments will explain bad code, and most good code needs the why more than the how. If the output from the macro plus the name of the macro doesn't help you, then you don't know why it was coded -- so what it does (how it does it) isn't going to help anyway.
Efmd
selects hosts just as hxmd
does: generating an m4
guard markup stream from the
command line options -Y
,
-B
, -E
,
and -G
for the items selected from
all the configurations specified under
-Z
, -C
, and
-X
.
It reads the output from that macro processor as the list of
keys to select.
At this point under -L
, we have the output,
which is the list of selected keys. So we output those and exit.
Otherwise, we build a stream from the -T
specifications and the command-line arguments.
Under -n
this stream is sent to
stdout
directly.
Otherwise, an m4
output filter is pushed on
to stdout
to process the markup into the
requested report.
The header
specifications are output to
stdout
followed by a stanza for each
selected entry. No footer is provided as the m4
macro m4wrap
allows for that.
Each stanza starts with a list of pushdef
macro
calls to define the attributes of the current selection; the
command-line parameters are catenated to this.
Then, popdef
macro calls withdraw
the definitions of the attributes.
The next stanza follows directly.
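The stanza layout described above can be imitated with a few lines of shell, which may help when reading the raw stream that -n produces. This is only a sketch of the ordering (pushdef, then the arguments, then popdef): the exact text efmd emits is an assumption, only a single attribute (HOST) is bound, and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: imitate the shape of the per-host markup stream.
# The real stream binds every attribute of the host; here we bind
# just HOST, and the precise spelling of each line is an assumption.
lq='`'		# m4 open quote
rq="'"		# m4 close quote

# emit_stanza host argument...
emit_stanza() {
	host=$1
	shift
	printf 'pushdef(%sHOST%s,%s%s%s)dnl\n' "$lq" "$rq" "$lq" "$host" "$rq"
	printf '%s\n' "$*"		# the command-line arguments, catenated
	printf 'popdef(%sHOST%s)dnl\n' "$lq" "$rq"
}

# emit_stream "host list" argument...
emit_stream() {
	hosts=$1
	shift
	for h in $hosts; do
		emit_stanza "$h" "$@"
	done
}
```

Piping the result of emit_stream "nostromo sulaco" "HOST is selected" through m4 would expand HOST once per stanza, which is the job of the output filter.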
We handle the command-line arguments for each stanza as
hxmd
would
(using -F
to differentiate literals from filenames).
We use open(2) to get to the contents of each file. We never cache the
contents of a file; we re-read the file for each selected item.
If the file is a FIFO, we get a new connection for each instance.
However, stdin
, when specified as a single dash
(-
), is only read once, then cached in
a temporary file.
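The difference between a file argument and a dash can be pictured with a small shell function that reads stdin exactly once and replays a private copy for every item. This is a sketch of the idea only; replay_stdin is a hypothetical name, and how efmd manages its own temporary file is not shown here.

```shell
#!/bin/sh
# Sketch: read stdin once, keep a private copy, and replay that copy
# for every item, as described for a single dash argument.
# replay_stdin is a hypothetical helper, not something efmd provides.
replay_stdin() {
	copy=$(mktemp) || return 1
	cat >"$copy"			# the only read of stdin
	for item in "$@"; do
		printf '== %s\n' "$item"
		cat "$copy"		# every item sees the same bytes
	done
	rm -f "$copy"
}
```

A FIFO named as a file argument behaves differently: it is opened again for each item, so the writer on the other end sees one connection per selected host.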
HXMD_U_MERGED
?efmd
is working
towards requires access to a merged configuration file.
But efmd
's temporary file,
under -o
, is usually deleted before
any process reading stdout
has a chance to open it.
If we want to let efmd
execute commands to update
some other aspect of each selected host while keeping track of
the whole list of hosts selected, then we need to find a way to
get to that file before efmd
deletes it.
I know that's odd, but it turns out that it is also useful, which is
why we include -o
option support at all.
Here is the hook that allows the access we need: when the
environment variable M4_PATH
is set,
the value is used as the path to the m4
filter command for the output stream (but not as the one
used to process selection and guard processing). That means you
can substitute a program of your design in to take the place of
the output m4
filter.
That only gets you part of the way. You still need to know where
efmd
put the merged file.
Normally, the m4
macro
HXMD_U_MERGED
is expanded in
hxmd
's retry command, or in a host-specific file
to point the way to the file, but in this case our script is
the macro processor.
There are 3 ways to get access to the value we need: 2 take advantage of the process model, the other an invariant of the hook's command-line options.
-T
to put a cp
command at the top of the m4
markup
efmd -T "syscmd(\`cp 'HXMD_U_MERGED\` fixed.cf')dnl" -C ... -F0 report.m4 >output.1
hxmd -C fixed.cf -X other.cf ... >output.2
# use output.1 and output.2 as needed
rm fixed.cf output.*
This tactic works well as part of a make recipe (where fixed.cf
is a prerequisite for output.2). Another case of this
structure eating its own dog food:
...
all: output.1 output.2
...
output.1: fixed.cf
	[ -f output.1 ]
fixed.cf: report.m4
	efmd -T "syscmd(\`cp 'HXMD_U_MERGED\` $@')dnl" -C ... -F0 report.m4 >output.1
output.2: fixed.cf output.1 other.cf
	hxmd -C fixed.cf -X other.cf ... >$@
...
m4
program and syscmd
m4
to act as a shell by wrapping
every shell statement in syscmd
.
This is really gross and error-prone, but it works. Be warned
that quoting the m4
markup for this on the
command-line is super tricky (use a file). You end up with a raw
macro stream that looks like:
pushdef(...)...dnl
syscmd(`hxmd -C 'HXMD_U_MERGED`...')dnl
popdef(...)...dnl
pushdef(...)....dnl
syscmd(`hxmd -C 'HXMD_U_MERGED`...')dnl
popdef(...)...dnl
pushdef(...)...dnl
syscmd(`hxmd -C 'HXMD_U_MERGED`...')dnl
...
popdef(...)...dnl
And you just re-implemented hxmd
without the
parallel processing, congratulations.
M4_PATH
that runs m4
on stdin
m4
, we can build the
processed version of stdin
, then use it
(as a shell script, perl program, or input to another command).
The key is that our parent efmd
will not remove
the merged file until we exit
.
#!/bin/sh
m4 "$@" | exec sh
Similarly, we could catch the output from m4
in
a file, chmod
it +x
and run it (assuming a #!
loader line was
included at the top of the stream). This is marginally better than
all the syscmd
calls, in that the shell is
better at process control than m4
, and we
could use perl
or some other processor.
That processor could even be selected by the input markup, but that
would be on that fine line between clever and stupid, wouldn't it?
M4_PATH
Efmd
always puts 2 fixed-place
parameters on our command-line: the first is the word "-D", the
second is a macro definition of HXMD_U_MERGED
set to the temporary filename where efmd
stashed the merged configuration file.
#!/bin/sh
# efmd gives us -D HXMD_U_MERGED=$tmp, otherwise exit SOFTWARE
[ _-D = _"$1" ] || exit 70
HuM=`expr "$2" : 'HXMD_U_MERGED=\(.*\)'` || exit 70
hxmd -C$HuM ...
...
m4 "$@"
exit 0
The other parameters are the -D
,
-I
, and -U
options
specified on our command-line, followed by
any m4prep
files, then a dash
(-
) to specify stdin
.
That is exactly what m4
should be given.
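The fixed-place check in such a hook can be tried without efmd by putting it in a function and calling it with a made-up argument list. The sketch below uses shell parameter expansion rather than expr; merged_path is a hypothetical name, and the only property taken from the description above is that the first two parameters are -D and HXMD_U_MERGED=file.

```shell
#!/bin/sh
# Sketch: recover the merged configuration filename from the first
# two parameters of an M4_PATH hook.  Returns 70 (EX_SOFTWARE) when
# the arguments are not in the expected fixed places.
# merged_path is a hypothetical helper name.
merged_path() {
	[ "_$1" = "_-D" ] || return 70
	case "$2" in
	HXMD_U_MERGED=?*)
		printf '%s\n' "${2#HXMD_U_MERGED=}"
		;;
	*)
		return 70
		;;
	esac
}
```

Inside a hook one might write HuM=$(merged_path "$@") || exit 70 before the first use of the file, then finish with m4 "$@" as in the script above.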
stdin
and/or a file, then process that
data set to output results to stdout
. To make
efmd
a configuration file filter,
we'd have to process configuration file(s) into an output stream.
We may specify input configuration files under -C
,
-X
or -Z
. Any one of
those could be stdin
. We may
divert the merged configuration file under -o
to stdout
by ending the command line specification
with:
efmd ... -T "paste(HXMD_U_MERGED)dnl" dnl
The "trick" is that we specify a null string for each processed host via the
m4 markup dnl as the only argument.
That just outputs the merged configuration as
the only text on stdout.
If your m4
does not
have a paste
directive, use
include
; however that can provoke some unwanted
quote processing in values that include spaces. There is no way
I've found around a bad version of m4
.
This allows for some complex boolean disjunctions in selection logic,
but oue
might be a better engine for that logic
if you can keep the list solely in terms of the hostnames (or some
other unique key).
See the HTML document for
oue
.
Some versions of m4
emit
#line
markup before the first line of the
included contents, which is taken as a comment by any program that uses
the "hostdb.m" module to read the resulting file. Sadly, the name of
the file is a mkstemp
name, so
diff
almost always shows a difference in
the output from multiple runs of the same filter command.
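When two runs of the same filter need to be compared, one way around the changing temporary name is to drop the #line markup before calling diff. This is a sketch under the assumption that the markup is a whole line beginning with "#line" followed by white-space; strip_line_markup is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: remove m4 "#line" markup so that two merged configurations
# can be compared without the mkstemp name getting in the way.
# Assumes the markup occupies a whole line starting with "#line ".
# strip_line_markup is a hypothetical helper name.
strip_line_markup() {
	sed -e '/^#line[[:space:]]/d' "$@"
}
```

Other comment lines are left alone, so any notes carried in the configuration still appear in the comparison.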
efmd
is commonly used to
limit the effects of a spell to a very refined subset of target hosts,
which are selected from multiple configuration files (aka. sources).
Configuration files from disparate realms may need different
selection processing to build a subset list of the desired hosts
which will all be given to a final msrc
or
hxmd
to update the whole super-set.
For example, to select all the hosts that provide a command and control service from many span-of-control areas, we may have to change the name of the service we are selecting; each realm might call it something different internally, but the encompassing organization may need to update them all en masse. After we have the complete list, we should apply the same update to all of them, but we can remember the realm (or other attributes) to compensate for details of each specific implementation.
# Example consolidation of 4 realm's data (usually part of a
# master recipe or a cache control recipe).
set -e
# (our local) realm1 calls it SERVICE "apache"
efmd -C realm1.cf -DREALM=earth -G "HAS_SERVICE(apache)" \
	-o "HOST HOSTTYPE" -T "paste(HXMD_U_MERGED)dnl" dnl
# realm2 calls it "httpd"
efmd -C realm2.cf -DREALM=air -G "HAS_SERVICE(httpd)" \
	-o "HOST HOSTTYPE" -T "paste(HXMD_U_MERGED)dnl" dnl
# the next calls it "http"
efmd -C realm3.cf -DREALM=fire -G "HAS_SERVICE(http)" \
	-o "HOST HOSTTYPE" -T "paste(HXMD_U_MERGED)dnl" dnl
# the last only uses hosts of class "www"
efmd -C realm4.cf -DREALM=water -I --/water -j class.m4 -E "www=CLASSOF(HOST)" \
	-o "HOST HOSTTYPE" -T "paste(HXMD_U_MERGED)dnl" dnl
exit 0
This example creates a stream on stdout that looks about like:
REALM="earth"
%HOST HOSTTYPE
mud.npcguild.org SUN5
dirt.npcguild.org SUN5
...
REALM="air"
%HOST HOSTTYPE
vapor.npcguild.org FREEBSD
whirlwind.npcguild.org DARWIN
...
REALM="fire"
%HOST HOSTTYPE
flame.npcguild.org NETBSD
...
REALM="water"
%HOST HOSTTYPE
waterspout.npcguild.org ...
This adds another level of consolidation that large sites need. The output list is a complexity insulator for the update task because we removed the rules that generated the list, leaving just the list of hosts that run a web server. Any additional attributes needed to update a host are only conducted to the next step with intent, never by mistake.
It also implies that the aggregator recipe is maintained by the encompassing organization with cooperation from the federated realms. Without maintenance, the script that creates the list quickly loses its freshness, or bleeds complexity into the rest of the structure.
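A consumer that only wants to look at the consolidated list, rather than hand it to hxmd or msrc, could flatten it into one line per host. The sketch below assumes the layout of the sample stream (a REALM definition, a % header line, then HOST and HOSTTYPE columns in that order); realm_hosts is a hypothetical helper and not part of this tool stack.

```shell
#!/bin/sh
# Sketch: flatten the consolidated stream into "realm host type" lines.
# Assumes each block is REALM="name", a %-header, then host lines with
# HOST first and HOSTTYPE second.  realm_hosts is a hypothetical name.
realm_hosts() {
	awk '
	/^REALM=/ {		# remember the realm for the hosts that follow
		realm = $0
		sub(/^REALM="?/, "", realm)
		sub(/"$/, "", realm)
		next
	}
	/^%/ || /^#/ || NF == 0 { next }	# skip headers, comments, blanks
	{ print realm, $1, $2 }
	' "$@"
}
```

The realm travels with each host on purpose here, matching the point above that extra attributes reach the next step only by intent.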
hxmd
cache directory before an msrc
or
hxmd
run, one might use efmd
with the output directed to the bit-bucket
(aka /dev/null
).
This moves the work up-front, but that's not better than letting
hxmd
run the cache operations itself most of
the time. This might help when the cache operations should be compressed into
a shorter window than msrc
can drive with the
extra per-task delay. In both cases, the cache population is sequential,
as that is how it is defined.
If you need parallel pre-caching you'll have to code it in the
Control recipe with hxmd
or
xapply
-P
.
The init
target in
the Control
recipe would be a good place to
put this logic: never keep it outside of Cache.m4
or Control
, because the default invariant is
that each cache operation should be done sequentially.
m4
markup command (viz. "dnl.example.com",
"unix.include.org") are almost impossible to
manage with these tools.
An option to just output
the merged configuration file would make the pipe usage shorter to
type, but I'm pretty much out of options in this stack. In any
case, the idiom with the dnl
is
actually not that hard to type or understand. If you had to
quote the dnl
from an enclosing
m4
, all the better.
$Id: efmd.html,v 1.25 2012/10/01 21:01:41 ksb Exp $