#define DICER_START '['
#define DICER_END ']'
#define MIXER_RECURSE '('
#define MIXER_END ')'
Set these in machine.h to override.
mkcmd spell to include just the dicer directly from the explode repository.
#include "dicer.h"
xapply's description of the square bracket expander.
Briefly we split a string on a character (like ":" or "/") then
select a field from that list to either keep or remove. The result
of that selection might be returned as is, or processed again.
For example the string "/home/662/ksb" is my home directory. If I want to pull out "662" I can write that as "[/3]": split on "/" and take the third item (think "" / "home" / "662" / "ksb"). If I want to remove "/ksb" I can write "[/-$]" (split on "/" and remove the last item).
This is a lot more useful than it sounds to specify parts of an input record (like /etc/passwd lines or login@host pairs).
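The two selections above can be sketched in plain C. This is not the Dicer's code: pick_field and drop_last_field are hypothetical helpers written only to show what "[/3]" and "[/-$]" compute, and the choice to return a NULL pointer on a missing field is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of dicer.h: copy field number `want`
 * (1-based) of `src`, split on `sep`, into `dst`.  At most `dstlen`
 * bytes are written, including the terminating NUL.  Returns dst, or
 * NULL when the field does not exist or does not fit (an assumption
 * of this sketch).
 */
static char *pick_field(char *dst, size_t dstlen, const char *src, char sep, unsigned want)
{
	const char *end;
	size_t len;

	assert(dst != NULL && src != NULL && sep != '\0' && want > 0);
	while (--want > 0) {
		src = strchr(src, sep);
		if (src == NULL)
			return NULL;	/* fewer fields than requested */
		++src;			/* step over the separator */
	}
	end = strchr(src, sep);
	len = (end != NULL) ? (size_t)(end - src) : strlen(src);
	if (len + 1 > dstlen)
		return NULL;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return dst;
}

/* Hypothetical helper: copy `src` without its last `sep`-separated
 * field (and without the separator in front of that field).
 */
static char *drop_last_field(char *dst, size_t dstlen, const char *src, char sep)
{
	const char *last;
	size_t len;

	assert(dst != NULL && src != NULL && sep != '\0');
	last = strrchr(src, sep);
	len = (last != NULL) ? (size_t)(last - src) : 0;
	if (len + 1 > dstlen)
		return NULL;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return dst;
}
```

With "/home/662/ksb" and a separator of '/', pick_field with want set to 3 fills the buffer with "662", and drop_last_field fills it with "/home/662".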
By adding the Mixer in we can reformat fields character by character.
For example, when %p would expand to "8005551212", the expression (%(p,1-3")"4-6"-"7-$) expands to "(800)555-1212".
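A rough C sketch of what that mixer expression does, one term at a time. The helpers append_range, append_lit and format_phone are hypothetical names for this illustration and are not part of dicer.h; the real Mixer parses the expression text, while this sketch hard-codes the terms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append characters from..to (1-based, inclusive)
 * of `src` to the NUL-terminated string already in `dst`, a buffer of
 * `dstlen` bytes.  Characters that do not fit are silently dropped.
 */
static void append_range(char *dst, size_t dstlen, const char *src, size_t from, size_t to)
{
	size_t used, srclen, i;

	assert(dst != NULL && src != NULL && from >= 1);
	used = strlen(dst);
	srclen = strlen(src);
	assert(used < dstlen);
	for (i = from; i <= to && i <= srclen && used + 1 < dstlen; ++i)
		dst[used++] = src[i - 1];
	dst[used] = '\0';
}

/* Hypothetical helper: append a literal string, as space allows. */
static void append_lit(char *dst, size_t dstlen, const char *lit)
{
	append_range(dst, dstlen, lit, 1, strlen(lit));
}

/* Hard-coded equivalent of (%(p,1-3")"4-6"-"7-$) for this sketch. */
static char *format_phone(char *dst, size_t dstlen, const char *p)
{
	assert(dst != NULL && dstlen > 0);
	dst[0] = '\0';
	append_lit(dst, dstlen, "(");			/* literal before %( */
	append_range(dst, dstlen, p, 1, 3);		/* 1-3 */
	append_lit(dst, dstlen, ")");			/* ")" */
	append_range(dst, dstlen, p, 4, 6);		/* 4-6 */
	append_lit(dst, dstlen, "-");			/* "-" */
	append_range(dst, dstlen, p, 7, strlen(p));	/* 7-$ */
	return dst;
}
```

Given "8005551212" and a large enough buffer, format_phone produces "(800)555-1212". With a 6 byte buffer the same call stops at "(800)", which is how this sketch models output being limited by the space available.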
extern char *Dicer(char *pcDest, unsigned *puMax, char *pcTemplate, char *pcData);
In no case should the Dicer write beyond *puMax characters, or beyond the strlen of Dest if puMax is a NULL pointer. The value left in *puMax is the length of the data copied into Dest (aka the strlen).
extern char *Slicer(char *pcDest, unsigned *puMax, char *pcTemplate, char **ppcList);
This is very much like printf. The destination buffer could be overflowed by a poorly chosen template and list combination.
On most errors the code returns (char *)0; a successful call results in an empty string (""); other errors return the part of the template that remains to be applied.
The puMax parameter uses the same convention as Dicer's does. The Mixer is then applied if the dicer expression is surrounded by %( ... mixer).
extern char *Mixer(char *pcInplace, unsigned *puMax, char *pcExpr, int cExit);
~1 (or, more verbosely, (~1-~1)).
Ranges are separated with a comma (,), or a blank.
Any leading (or extra) separators are silently ignored.
Ranges may also be separated by literal strings, in either double ("...") or m4 (`...') quotes.
The characters in the string are appended to the current result (as space allows).
Thus %(1,1`-'$) expands to the first character of the first word followed by a dash (-) then the last character.
A Mixer expression can be positioned after a term to further process
the selected value. For example in (17-$)(1,$-4,1-3`,'1)
all three references to "1" in the second term select character 17 from
the previous term, and the last comma is a literal.
The expression ends at the first unquoted occurrence of the Exit character. This allows the caller to change the outer expression boundary at run-time.
The output ends after *puMax characters if puMax is not a NULL pointer; otherwise the length of the input string is assumed (including the end of string '\000').
Note that reversed ranges output the string from right to left.
The expression $-1 reverses the Inplace string.
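A small sketch of how a range walker can treat forward and reversed ranges the same way. The function copy_range is a hypothetical name, not the Mixer's code. Unlike the Mixer it copies into a separate buffer rather than working in place, and returning the empty string for an out-of-range position is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy characters from..to (1-based, inclusive)
 * of `src` into `dst` (at most `dstlen` bytes including the NUL).
 * When from > to the range is walked right to left.  A position
 * outside the string yields the empty string (an assumption of this
 * sketch).  `src` and `dst` must not overlap.
 */
static char *copy_range(char *dst, size_t dstlen, const char *src, size_t from, size_t to)
{
	size_t len, used = 0, i;

	assert(dst != NULL && src != NULL && dstlen > 0);
	len = strlen(src);
	if (from >= 1 && to >= 1 && from <= len && to <= len) {
		i = from;
		for (;;) {
			if (used + 1 >= dstlen)
				break;		/* out of space */
			dst[used++] = src[i - 1];
			if (i == to)
				break;		/* range complete */
			if (from < to)
				++i;		/* left to right */
			else
				--i;		/* right to left */
		}
	}
	dst[used] = '\0';
	return dst;
}
```

Passing the string length as `from` and 1 as `to` plays the part of $-1: for "abcdef" the buffer ends up holding "fedcba", while from 2 to 4 gives "bcd".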
Normally the expression is bracketed in parentheses ('(' and ')'), and
recursive expressions are allowed. The suggested syntax is (dicer,mixer).
This allows the dicer to select a large string, then the mixer limits that
to the desired substring. For example %({10},$) is the last character in the tenth input word.
The compositional form %(4,($-2)($-2)) removes the first and last character from the fourth parameter, through a bit of chicanery. (I would prefer %(4,2-~2).)
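The chicanery is easier to see in code: each ($-2) reverses the string and drops what was its first character, so applying it twice trims both ends. The function dollar_to_two below is a hypothetical in-place stand-in for the single term ($-2), not the Mixer itself, and returning the empty string for inputs shorter than two characters is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-place equivalent of the mixer term ($-2): keep the
 * characters from the last one down to the second.  That is the same
 * as reversing the string and then dropping its final character
 * (which was the original first character).  Strings shorter than two
 * characters become empty (an assumption of this sketch).
 */
static char *dollar_to_two(char *s)
{
	size_t len = strlen(s), i;
	char tmp;

	if (len < 2) {
		s[0] = '\0';
		return s;
	}
	for (i = 0; i < len / 2; ++i) {		/* reverse in place */
		tmp = s[i];
		s[i] = s[len - 1 - i];
		s[len - 1 - i] = tmp;
	}
	s[len - 1] = '\0';			/* drop the old first character */
	return s;
}
```

Starting from "(hello)", the first call leaves ")olleh" and the second leaves "hello".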
Some expanders use angle brackets in place of parentheses,
because parentheses already had a special meaning.
The return value is the expression which remains after consumption up to and including the Exit character. A (char *)0 return value indicates a syntax error; range errors largely result in the empty string.
-fwritable-strings to run), via:
explode -s dicer.h
explode -s dicer.c
Build it with:
mk dicer.c
Run with:
./dicer
No output is good.
$Id: dicer.html,v 6.18 2012/03/29 20:41:49 ksb Exp $