Never just send it all

Building support scripts for op rules is not hard work. Sending only the support scripts your rule-base needs is a little more work, but it is worth the effort.

To understand this document

This document assumes that you've been a UNIX™ systems administrator or systems programmer long enough to read and write safe shell scripts. Given that a Bad Guy might try to suborn escalated scripts, we should code op scripts with great care.

But giving non-administrators access to sudo is too much -- you know they are going to destroy something. So you've tried to limit the damage with harder and more detailed limits with sudo and eventually had to give up. Either you've ended up doing more work yourself, or you've given pretty much unlimited access to people that really don't need it.

If that fits you, then you are ready to move up to op. But to move up you need to learn how to get a better policy, not just a sharper tool. In this document I'm going to present the implementation details of getting the correct configuration for each rule to every host for every login and service you deploy.

See the op manual page and the HTML documents for details about op itself. I'm not going to cover too much of that material (again) here. Later in this document I'm going to spend some time showing how to do this under the master source structure I use for all such tasks. If you've never heard of msrc or xapply before, you're going to have to read about them before you get down to it.

Coding style and best practices

All execution scripts should be coded with great care. These will be tested for every possible weakness. Here is a checklist of items I always go through for every script I code or review:

Trust the PATH from op

If you can't trust the environment from op it is mostly over before it starts. This means using env to find your script's interpreter, and it means not hard-coding absolute paths to common tools.

Fixing each path to be explicit is much more error-prone.

Always include RCS keywords Id and/or Source in each script (or the like for your local site's revision control).

This is for hostlint's versions.hlc module, which automates version tracking. It also allows a sane -V output.

#!/usr/bin/env ksh
# This script is just an example.
# $Id: revision information... 
# $Source: path information...

Pass all non-zero exit codes back to caller.

Only exit 0 when it all went well. Masking failure dramatically limits the usefulness of automation.
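
A minimal sketch of that rule, assuming a hypothetical helper named run_step (it is not part of op): it runs one command and hands that command's own status back to the caller, rather than masking it.

```shell
# run_step is a hypothetical helper, not part of op.
# Run the given command and return its exit status unchanged,
# noting any failure on stderr.
run_step() {
	"$@"
	status=$?
	if [ $status -ne 0 ]
	then
		echo "$0: $1 failed with status $status" 1>&2
	fi
	return $status
}
```

A script would call it as run_step some-command args || exit $? so the caller of the script sees the same non-zero code.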

Use exec to pass exit status back in dead-end case statements.

exec is really cheap, and is what you want anyway.

case _`uname -s` in
_FreeBSD)
	exec kill -HUP $PIDS
	;;
_Linux|_SunOS)
	exec kill -SIGHUP $PIDS
	;;
...
*)
	echo "$0: unknown platform, please update this script" 1>&2
	exit 70 # SOFTWARE
	;;
esac

Trap unknown OS names in a default case when looking for uname matches.

See the example above.

Alternatively, use a MAP'd file under msrc to forbid installation on an unsupported platform.

The default action is a non-zero m4 m4exit:

'ifelse(HOSTTYPE,FREEBSD,`exec kill -HUP $PIDS',
HOSTTYPE,LINUX,`exec kill -SIGHUP $PIDS',
HOSTTYPE,SUN5,`exec kill -SIGHUP $PIDS',
`errprint(__file__`: unsupported platform')m4exit(70)')`
'dnl

Check for missing (;;) in all case statements.

The shell is not good at finding these, even under -n.

Use && to fail-fast when multiple commands must all complete to secure an escalation

Missing a failed command almost always leads to an insane escalation. For example, when multiple files must be copied to complete a single update:

cp $SOURCE/file1 $DEST/file1 &&
cp $SOURCE/file2 $DEST/file2 &&
cp $SOURCE/file3 $DEST/file3

Alternatively, use set -e

Use set +e only as needed.

set -e
cp $SOURCE/file1 $DEST/file1
cp $SOURCE/file2 $DEST/file2
cp $SOURCE/file3 $DEST/file3
set +e

Use valid sysexits.h codes

Always prefer defined codes. Meaningful status helps everyone. I use a little shell alias to remind me of the defined exit codes.

sed -n -e 's/^#define[ \t][ \t]*//p' /usr/include/sysexits.h |
	grep '[ \t][0-9]'

The \t's above represent literal tabs.

One might even turn that into a run-time spell to fetch the local system exit codes in a script:

# Read sysexits.h, if we can, or guess based on common values
eval `sed -ne 'y,\t, ,' -e \
	's,^# *define *\([_A-Z][_A-Z]*\) *\([0-9][0-9]*\).*,\1=\2,p' \
	/usr/include/sysexits.h 2>/dev/null | grep . ||
	echo 'EX_OK=0 EX__BASE=64 EX_USAGE=64 EX_DATAERR=65 EX_NOINPUT=66
	EX_NOUSER=67 EX_NOHOST=68 EX_UNAVAILABLE=69 EX_SOFTWARE=70
	EX_OSERR=71 EX_OSFILE=72 EX_CANTCREAT=73 EX_IOERR=74
	EX_TEMPFAIL=75 EX_PROTOCOL=76 EX_NOPERM=77 EX_CONFIG=78
	EX__MAX=78'

Remove trailing white-space, and blank lines at EOF

Compress spaces to tabs

Remove trailing ; for shell statements that don't need them (remember, ksh != perl)

These just make the script larger. Someone might think that they are required and leave them forever. If this script is going to be run escalated it should be as hygienic as possible.
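
The first of those clean-ups is easy to automate. This is a sketch only, with a hypothetical function name; it filters stdin to stdout, stripping trailing spaces and tabs from each line and dropping any blank lines at the end of the file.

```shell
# strip_trailing is a hypothetical filter name.
# Remove trailing spaces and tabs from every line, and hold back blank
# lines until a non-blank line follows, so blank lines at EOF are dropped.
strip_trailing() {
	awk '{ sub(/[ \t]+$/, "") }
	/^$/ { blank++; next }
	{ while (blank > 0) { print ""; blank-- } print }'
}
```

Run it as strip_trailing <script >script.new, then review the difference before replacing the original.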

Usage messages are optional, but if you add -h

Angle brackets in usages cause people to type them. Use brackets (less harm if they type one), which are standard, or no grouping but white-space.

Add -h only if local site policy requires it, so we don't radiate information to the black hats.

Add -V if local site policy requires it, even if the escalation rule doesn't allow it. The superuser may still be able to check versions with an escalated command.

Remove dead commented code, use RCS for history.

Nothing screams "I'm not sure of what I'm doing!" quite as loudly as commented-out code in an escalated script.

Never trace the actions of a script

Nothing radiates more information to the Black Hat guys than a set -x. Leaving that on in an escalated script is gross negligence.

Use stderr for error messages

Another novice mistake you must avoid is sending error messages to stdout. This is broken beyond all reason:

echo "configuration error."
exit 0

While this fixes all the mistakes above:

echo "$0: configuration forbids that action" 1>&2
exit 78 # CONFIG

The first one doesn't give the source of the error, the (completely useless) error message goes to stdout (so the Customer never sees it), and the status provided is success.
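
One way to make the correct form the easy form is a small helper. This is a sketch with a hypothetical name, die; it takes a sysexits.h code and a message, writes the message to stderr prefixed with the script name, and exits with that code.

```shell
# die is a hypothetical helper: print "$0: message" on stderr, nothing on
# stdout, and exit with the sysexits.h code given as the first argument.
die() {
	code=$1
	shift
	echo "$0: $*" 1>&2
	exit "$code"
}
```

The corrected example then becomes die 78 "configuration forbids that action".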

If you follow the plan above, then your escalation scripts should be much safer, easier to review, and have enough feed-back available to maintain. These are the same rules one might apply to any application provided as a service via opal, as well. See opal(1) or the opal HTML document, or any tcpmux service.

The new installation plan

In the past this directory installed every libexec script on every host. This was less than optimal.

We use mk markup in the rule-set to detect and install only the scripts actually referenced. By looping through the rules within the installed rule-base we find only the scripts that a rule-file mentions, so we know we need each one.

install:
	glob -sm -400 ${LIB}/\* |\
	MK=-sl0 xapply -f ${C}mk -amRequired -t $$PWD/Default.mk %1${C} - |\
	oue |\
	MK=-sl0 LIBEXEC=${LIBEXEC} xapply -f ${C}mk -mInstall -t $$PWD/Default.mk %[1/$$]${C} -

This runs after all the rule-sets are installed.
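
For readers without mk and xapply at hand, the idea behind that recipe can be sketched in plain shell. This is not how the recipe works internally; it is only an illustration, under the assumption that each rule names its helper scripts by full path under the libexec directory and that the directory name contains no regular expression characters. The function name is hypothetical.

```shell
# required_scripts is a hypothetical stand-in for the mk/xapply/oue pipeline.
# Assumption: rules name their helper scripts by full path under $1.
# usage: required_scripts libexec-dir rule-file...
# Prints the unique script names mentioned, one per line.
required_scripts() {
	libexec=$1
	shift
	cat "$@" |
	tr -s ' \t;' '\n\n\n' |
	sed -n -e "s,^$libexec/\([^/]*\)\$,\1,p" |
	sort -u
}
```

Each name printed is a script that some installed rule needs, so only those would be installed.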

The other target available here is the non-standard deinstall, which removes scripts without any owner rule-set:

deinstall:
	( glob -sm -0/777 ${LIBEXEC}/\* ;\
	glob -sm -400 ${LIB}/\* | \
	MK=-sl0 xapply -f ${C}mk -amRequired -t $$PWD/Default.mk %1${C} - |\
	xapply -f ${C}[ -f "${LIBEXEC}/%1" ] && echo %1${C} |\
	oue -dv )|\
	xapply -vf ${C}install -R %1${C} -

The oue logic there finds the scripts listed in the directory, but not listed and installed by any rule-set.
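
The same set difference can be sketched with sort and comm. This is an illustration only, not a replacement for the recipe above; the function name is hypothetical, and mktemp is assumed to be available.

```shell
# orphan_scripts is a hypothetical helper illustrating the set difference.
# $1: directory of installed scripts
# stdin: the names still required by some rule-set, one per line
# Prints the names installed in $1 that no rule-set requires.
orphan_scripts() {
	required=$(mktemp) || return 1
	sort -u > "$required"
	ls "$1" | sort -u | comm -23 - "$required"
	status=$?
	rm -f "$required"
	return $status
}
```

Only the names it prints are candidates for removal.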

`Mk` spell support

A template option (under -t) provides a default check and a default install rule. So no rule-set or script need supply the default. If a rule requires a file not listed literally in the rule-base then it should include a marked command to output the name(s) of the required files:

# $Required(*): ${echo:-echo} nonObvious

Or a rule to force a more restrictive mode on the script:

# $Install(*): ${install:-install} -c  -m 0750 -g wheel %f ${LIBEXEC:-/usr/local/libexec/op}/%[f/$]

An Install marked line may also rename the source file to some other name, which the default template doesn't do.

The all target lists any scripts missing from the source cache. There should never be any if all is sane. See the template file Default.mk.

Previously, the make recipe above us forced the installation of all scripts with an install spell like:

install:
	xapply -P4 ${C}[ -s %1 ] &&\
		install -cm 0755 -o root %1 ${LIBEXEC}${C} ${BIN}
