sh(1), have an understanding of the UNIX process model and exit codes, have coded several scripts, used gzip(1), at(1), or built a cron table. It also assumes that you can read the manual page for any other example command.
kicker

kicker is a clever way to allow multiple applications to share a single instance without stepping on each other's resources. This is accomplished by sequencing heavy unrelated tasks by virtue of the batch scheduler and system nice values.
One of the most common points of contention I've seen, which is not likely until an application base grows over time, is botched updates to application logins' cron tables. A new task added to an existing table accidentally deletes some (or all) previous entries. This is a rake you do not want to step on.
Kicker periodically injects new tasks into the various batch queues. A cron table usually specifies 25 possible enqueue times (00 hours to 23 hours, plus eod, which is usually 23:55). The system administrator may add additional triggers, which is clearly local site policy (like night and day shift changes).
The dependencies on cron are limited to these injection intervals. Jobs may also be injected into the batch queues by normal means, or by an escalated process. This allows one task to daisy-chain subsequent tasks as it completes their prerequisites.
Each task may assume the run-time credentials of the owner of the file (see -S). This allows mortal application accounts to clean up their own log files, archive data, and the like. (Mayhap application restarts and upgrades as well.) The manual page has a copy of the recommended op rule, which is repeated here:
    # Allow the kicker login to run tasks as any mortal login
    batch	/usr/bin/batch -q $1 ;
	%f.path=^/var/kicker/.*$
	!f.uid=^0$	# comment to allow root tasks
	users=^kicker$
	$1=^[a-zA-Z]$
	$#=1
	stdin=<%f
	uid=%f gid=%f initgroups=%f
	$KICKER_Q_TIME $HOME $USER $LOGNAME $SHELL $PATH

This rule should be configured for local site policy. For example the
users specification may be replaced with a groups or netgroups limit. No password should be required as kicker is often run via automation detached from any terminal (viz. via cron).
The administrator must add 25 lines to the superuser cron configuration, root's (kicker's) crontab, once; this avoids breaking many crontabs over the life of the host: nothing is really free. The resulting cron table contains 25 boiler-plate lines:
    # kicker support if run as root
    0 00 * * * root /usr/local/sbin/kicker top 00
    0 01 * * * root /usr/local/sbin/kicker top 01
    ...
    0 22 * * * root /usr/local/sbin/kicker top 22
    0 23 * * * root /usr/local/sbin/kicker top 23
    55 23 * * * root /usr/local/sbin/kicker eod

The top token allows any job that triggers every hour to be injected before the specific hourly tasks. The order of the tokens could be swapped to force hourly tasks to run after the specific hourly tasks. (In fact a new token (e.g. follow) could also be created.)
Additional tokens for the months of the year, or weeks of the year
could be created, but I've never needed any of those. Some tasks
might run every day, but simply exit
when it is not the last Monday of the month. So there is little
reason to add specific day or month triggers.
The batch structure has a simple design, like most good UNIX™ structures. And like most features it is not enabled by default. Start by building one queue, and grow as you need to. The Solaris implementation uses /etc/cron.d/queuedefs to define each queue:
    # $Id:... #
    # q.[njob j][nice n][wait w]
    # compress 2 at a time, nice +8, with no delay between them.
    Z.2j8n
    # run reports 1 at a time, nice +2, with no delay between them.
    R.1j2n

Others (like FreeBSD) have fewer configuration options, or practically no options. In that case having more queues than CPUs is not really a good idea. FreeBSD's (aka Vixie's cron) implementation depends on atrun to also launch batch jobs, so you had better enable that too. Note that the name of a Vixie queue specifies the nice priority (a=0, b=1, c=2, ...). There is no need to build queue directories, as the queue is part of the job name.
Vixie's atrun takes -l to specify the maximum load average which allows new jobs to start. Be sure to make that higher than the number of CPUs the instance provides (more or less).
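As a concrete, hedged example, on FreeBSD the knob lives in the stock /etc/crontab entry; the 4.5 limit below assumes a 4-CPU host and is not a recommendation:

```shell
# /etc/crontab fragment: let atrun release at/batch jobs only while the
# load average is below 4.5 (assumed 4-CPU host; tune for local policy)
*/5	*	*	*	*	root	/usr/libexec/atrun -l 4.5
```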
Vixie's atrun also starts multiple batched tasks in a given run. This pretty much defeats the purpose of the batch queue. So that limits the usefulness of kicker quite a bit.
Drop a test script into the batch queue to put date output in a file under /tmp. See that it works before you get all complex here:

    $ echo 'date >/tmp/batch.works' | batch
    Job 985 will be executed using /bin/sh
    $ atq
    Date				Owner	Queue	Job#
    Wed Aug 22 08:30:00 CDT 2012	ksb	E	985
    $ sleep 377
    $ cat /tmp/batch.works
    Wed Aug 22 08:35:00 CDT 2012
    $ rm /tmp/batch.works

If you can't make that work for yourself you are not going to make kicker work for anyone. Note that the default interval for atrun execution under Vixie cron is 5 minutes. I would change this to every minute if I depended on a lot of batch jobs.
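To tighten that interval, the same crontab entry can fire every minute; a sketch assuming the FreeBSD /etc/crontab layout:

```shell
# run atrun every minute so batch jobs start promptly after injection
*	*	*	*	*	root	/usr/libexec/atrun
```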
The netlint summary is a report, and compressing apache logs is CPU intensive. A report job might trigger a compression task. In that case the last command in the script might be a call to kicker, kicker -S batch, or op kicker queue with the name of the task to release (or the whole queue to process).
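A hedged sketch of that daisy-chain: the report command and log path below are invented for the example, the Z queue matches the queuedefs sample, and batch -q follows the Solaris usage shown in the op rule:

```shell
#!/bin/sh
# nightly report task (sketch; the report command and paths are made up)
/usr/local/libexec/netlint-summary > /var/log/netlint/summary || exit 1
# last step: daisy-chain the CPU-heavy compression into the Z batch queue
echo 'gzip -9 /var/log/apache/access_log.0' | batch -q Z
```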
There is almost no benefit to having too many queue options. That just muddies the waters when a Customer needs to pick which queue to use: a clean local site policy to guide recurring tasks into the correct queue helps both the Customer and the Admin.
You may also want to start a kicker task at system boot time. Vixie's version of cron has a time specification for exactly this purpose, @reboot. (Which I suggested.) This is a great way to start mortal applications.
    # allow mortal services to start via daemon and kicker
    @reboot	root	/usr/local/sbin/kicker boot

Which doesn't work so well if there is a 5 minute gap due to atrun granularity. If so, then you could specifically force an atrun call as part of the command (&& /usr/libexec/atrun).
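Combining the two, a single crontab line (a sketch, assuming the FreeBSD atrun path) that both starts the boot tasks and forces an immediate batch run:

```shell
# start boot tasks, then force atrun so they do not wait out the polling gap
@reboot	root	/usr/local/sbin/kicker boot && /usr/libexec/atrun
```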
You could use cron directly to inject the job into the batch queue (with a call to kicker or batch). Or you could leave it in the kicker spool and check inside the script for the day of the week:
    #!/usr/bin/env ksh
    # $Id: ... #
    # Only needed on Sat morning, which is day 6, see strftime's %w.
    [ `date +%w` -ne 6 ] && exit 0
    # the rest of the script...

Similarly one might check the Julian day (%j), the month, or the week of the year (%U).
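For instance, a task could limit itself to even-numbered weeks with a guard like this (the even-week policy is invented for the example):

```shell
#!/bin/sh
# exit quietly unless this is an even-numbered week of the year (date +%U)
week=$(date +%U)
week=${week#0}			# strip a leading zero so the math is not octal
if [ $((week % 2)) -ne 0 ]; then
	exit 0			# odd week: nothing to do this time
fi
echo "even week $week: doing the real work"
# the rest of the script...
```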
Blue moons and Easter are slightly more complex. This example is the most complicated request I've ever had. We want to run this task on the last Monday of each month, but we need to avoid the last day of the month. In other words if the last day of the month is a Monday, don't count it. This even works in leap years, since cal is cool with those.
    #!/usr/bin/env ksh
    # $Id: ... #
    # Find this month's work-days (not Su or Sa, should we remove Mo holidays?)
    # Assumes that weeks start on Su.  The -h option below
    # is not standard on all versions of cal, remove it if you don't need it.
    WORKD=$(cal -h |sed -e 's/^../XX/' -e 's/^...\(..............\).../\1/'\
	-e 's/XX//' -e '/[^ 0-9]/d' -e '/^ *$/d' |tr -s ' \n' ' ' |\
	sed -e 's/ $//')
    # Find the last Mo that is not the last day of the month
    LASTM=$(cal -h |sed -n -e 's/^.. \(..\).*/\1/p'|sed -e '/^ *$/d' | \
	grep -v ${WORKD##* }|tail -1)
    TODAY=$(date +%d |sed -e s/^0//)
    [ ${LASTM:-28} -eq $TODAY ] || exit 0
    # the rest of the script...
I have used pgrep to look for services that need some cleanup action taken. That has proved problematic because of race conditions in pgrep and aborted processes leaving junk around that still needed to be cleansed. It is far better to touch a flag file (or empty log file) to indicate the need for the cleanup, then remove or zero the file after the job has started (or finished).
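A minimal sketch of that flag-file handshake; the flag path is hypothetical, and a real install would keep it under the application's own spool:

```shell
#!/bin/sh
# cleanup task released by a flag file the service touches when work piles up
FLAG=${FLAG:-/tmp/app-cleanup.flag}
[ -e "$FLAG" ] || exit 0	# no flag: nothing to clean this pass
rm -f "$FLAG"			# consume the request now that we have started
echo "cleaning up as requested"
# ... the actual cleanup would go here ...
```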
See flock(1)'s manual page (or HTML document) for more about locking files to prevent duplicate compression runs or cleanup tasks.
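With util-linux flock(1) that lock is one line; -n makes a second copy give up instead of queueing behind the first. The syntax here is util-linux's, which may differ from the local flock, and the lock path and command are placeholders:

```shell
#!/bin/sh
# run the compression only if no other instance holds the lock right now
LOCK=/tmp/compress.lock
flock -n "$LOCK" -c 'echo "compressing logs"' || exit 0
```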
See also the haveip IP address owner check program. When a task needs to follow a VIP, run the task on every member of the cluster, only to exit at the top of the script unless the local host presently has the VIP configured up.
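A hedged sketch of that guard, assuming haveip exits zero when the local host has the address configured; the VIP below is a documentation address:

```shell
#!/bin/sh
# run everywhere, but only the current VIP owner does the real work
VIP=192.0.2.10			# hypothetical cluster service address
haveip "$VIP" || exit 0		# not the VIP owner right now: do nothing
# the rest of the script runs only on the host holding the VIP...
```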
$Id: kicker.html,v 1.2 2012/10/31 15:29:55 ksb Exp $