The JML Continuum: script

2015-06-08

Writing Upstart Scripts

Upstart scripts are a great way to deal with starting and stopping system daemons in Ubuntu and CentOS 6, as well as many other flavors of Linux. Although it was later replaced by systemd, Upstart allows quite a bit of customization when creating init scripts, without as much of the hassle and bash programming knowledge required by System V init.d scripts. In this tutorial, I'll show you how to create one for the Kibana 4.x executable distributed by Elastic.co, the makers of ElasticSearch and LogStash. If you haven't messed with Kibana, I highly recommend it alongside LogStash and fluentd, but that's another tutorial. Any executable could be substituted, because Kibana itself is simply a binary distribution.

There are many extra potential pieces in Upstart scripts, but we're going to start with what we need for a simple start/stop script that will run at system boot and shut down properly on the flip side. All Upstart init scripts live in the /etc/init/ directory and end with '.conf'. This one is simply "kibana.conf". Consider the following config:

# cat /etc/init/kibana.conf
description 'Kibana Service startup'
author 'Joe Legeckis'
env NAME='kibana'
env LOG_FILE=/var/log/kibana.log
env USER=nginx
env BASEDIR=/var/www/kibana/
env BIN=/var/www/kibana/bin/kibana
env PARAMS=''
start on started network
stop on stopping network
# Respawn in case of a crash, with default parameters
respawn
script
 # Make sure logfile exists and can be written by the user we drop privileges to
 touch $LOG_FILE
 chown $USER:$USER $LOG_FILE
 date >> $LOG_FILE
 cd $BASEDIR
 exec su -s /bin/sh -c 'exec "$0" "$@"' $USER -- $BIN $PARAMS >> $LOG_FILE 2>&1
end script

post-start script
 echo "app $NAME post-start event" >> $LOG_FILE
end script

It starts off with some simple human-readable information about the job we're defining: the description and author. After that, you'll see a number of environment variables defined as "env NAME=value". These are sometimes used by the script/binary itself and other times, as seen here, used by the Upstart script later on to execute said script/binary. In this case, we're defining the name, log file, user to run as, base directory where Kibana lives, the binary executed to run Kibana, and any additional parameters.

Next, we describe when it should start and stop. In this case, and in most cases, the following two lines will be accurate, but more detail can be provided if you need to chain things together, or make sure that something stops or starts according to another script's timing. Here we're simply starting up after the network starts successfully, and stopping before the network stops.

We then define what should happen after a crash by simply specifying that the script should respawn. Importantly, there is a default limit, which can vary from system to system, on how many times a failed or prematurely exited script will be respawned within a given interval. 'respawn' does not check the exit code of the script. If you didn't ask the script to stop, then it will run the respawn process and start it back up! It may be prudent to add a line after this:

respawn limit COUNT INTERVAL | unlimited

e.g.
respawn limit 5 10

The example line gives up on the job if it has to be respawned more than 5 times within 10 seconds. This keeps a broken daemon from restarting in an endless tight loop. Specifying 0 (as in zero) for the count will restart it an unlimited number of times.
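
To make the COUNT and INTERVAL semantics concrete, here is a small shell model of the check. It is a simplified illustration of the "more than COUNT respawns within INTERVAL seconds" rule, not Upstart's actual code; the function name respawn_verdict and the crash timestamps passed to it are made up for this example.

```shell
#!/bin/sh
# Simplified model of 'respawn limit COUNT INTERVAL' (illustration only,
# not Upstart's real implementation).
# Usage: respawn_verdict COUNT INTERVAL TIME...
#   TIME... are the moments (in seconds, ascending) at which the job died
#   and was due to be respawned.
# Prints "stopped" if more than COUNT respawns fall inside one INTERVAL-second
# window, otherwise "running".
respawn_verdict() {
    limit=$1
    interval=$2
    shift 2
    window_start=
    count=0
    for t in "$@"; do
        if [ -n "$window_start" ] && [ $((t - window_start)) -lt "$interval" ]; then
            # Still inside the current window: one more respawn.
            count=$((count + 1))
        else
            # First respawn, or the old window expired: start a fresh window.
            window_start=$t
            count=1
        fi
        if [ "$count" -gt "$limit" ]; then
            echo stopped
            return 0
        fi
    done
    echo running
}

# Six crashes within 10 seconds exceed 'respawn limit 5 10'.
respawn_verdict 5 10 0 1 2 3 4 5
# One crash every 20 seconds never trips the limit.
respawn_verdict 5 10 0 20 40 60 80 100
```

The takeaway is that the interval is a counting window, not a pause: nothing in this rule makes the job wait between restarts.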

And now we get into the meat of the script. Between "script" and "end script" is where you put exactly what you want to run every time the daemon starts up. In this case, we make sure our log file exists, has the right ownership, and gets a fresh datestamp (handy when it dies and you're not sure when it started back up). Then we change directory and run our executable. The previously defined variables fill out the command, which executes as the correct user and directs all its output to the log file, including standard error, just in case. After that we close it up with "end script" and move on!
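
The logging part of that script body can be tried outside Upstart. The sketch below is only an approximation: it uses a throwaway log file from mktemp and a stand-in fake_daemon function in place of the Kibana binary (both hypothetical), and it skips the chown and su steps since those need root.

```shell
#!/bin/sh
# Stand-in for the real daemon: writes one line to stdout and one to stderr.
fake_daemon() {
    echo "listening"
    echo "warning: something odd" >&2
}

LOG_FILE=$(mktemp)      # throwaway log instead of /var/log/kibana.log

touch "$LOG_FILE"       # make sure the log exists
date >> "$LOG_FILE"     # datestamp each start

# '>> file' appends stdout to the log; '2>&1' then sends stderr to the same place.
fake_daemon >> "$LOG_FILE" 2>&1

cat "$LOG_FILE"
```

The order of the redirections matters: writing 2>&1 before the >> would leave standard error pointing at the terminal instead of the log.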

"post-start" is simply what runs after the main process has been started and before Upstart considers the job fully started. You can use it for all sorts of things, like tests or initial setup checks if need be; here we're just adding a line to the log file.

Now, as long as you've put this script in your /etc/init/ directory (NOT /etc/init.d/) as 'name.conf' or, in our case here, 'kibana.conf', you can start it by running "initctl start name" (for us, "initctl start kibana"). Stop should work the same way, as well as restart!

2014-03-05

Running Custom SNMPd Checks in CentOS 6

I've been fighting with a difference between the CentOS 5.x and CentOS 6.x SNMPD configs. In CentOS 5.x we have two lines in /etc/snmp/snmpd.conf like the following:

exec 1.3.6.1.4.1.5001.100 mailq-check /usr/local/nagios/libexec/check_mailq -w2 -c4 -t7 
exec 1.3.6.1.4.1.5002.1 fscheck /bin/touch /opt/rocheck && /bin/touch /tmp/rocheck

They're simple commands that extend SNMP. In CentOS 5.x the form is "exec <return oid> <name> <command>", and once set up, the command or script's output is returned via snmpd to anyone who requests the OID. All well and good, but copying this into the standard snmpd.conf file from CentOS 6 gave me nothing. Running snmpd with verbose logging gave me nothing useful other than being able to see that it was being requested. Digging deeper, I found the following command and its output:

# snmpwalk -v 2c 10.93.90.209 -c rsprod .1.3.6.1.4.1.2021.8
UCD-SNMP-MIB::extIndex.1 = INTEGER: 1
UCD-SNMP-MIB::extIndex.2 = INTEGER: 2
UCD-SNMP-MIB::extNames.1 = STRING: 1.3.6.1.4.1.5001.100
UCD-SNMP-MIB::extNames.2 = STRING: 1.3.6.1.4.1.5002.1
UCD-SNMP-MIB::extCommand.1 = STRING: mailq-check
UCD-SNMP-MIB::extCommand.2 = STRING: fscheck
UCD-SNMP-MIB::extResult.1 = INTEGER: 1
UCD-SNMP-MIB::extResult.2 = INTEGER: 1
UCD-SNMP-MIB::extOutput.1 = STRING: mailq-check: No such file or directory
UCD-SNMP-MIB::extOutput.2 = STRING: fscheck: No such file or directory
UCD-SNMP-MIB::extErrFix.1 = INTEGER: 0
UCD-SNMP-MIB::extErrFix.2 = INTEGER: 0
UCD-SNMP-MIB::extErrFixCmd.1 = STRING: 
UCD-SNMP-MIB::extErrFixCmd.2 = STRING:

Looking at this, it isn't running my script "/usr/local/nagios/libexec/check_mailq -w2 -c4 -t7" at all; it's treating the OID as the name and attempting to run the name "mailq-check" as the command. Of course, that isn't the script name+path, so it doesn't find it. Turns out that in CentOS 6, you don't use the OID portion:

exec mailq-check "/usr/local/nagios/libexec/check_mailq -w2 -c4 -t7 -Mpostfix"
exec fscheck "/bin/touch /opt/rocheck && /bin/touch /tmp/rocheck"

Which, when called with the same snmpwalk above, comes back as:

UCD-SNMP-MIB::extIndex.1 = INTEGER: 1
UCD-SNMP-MIB::extIndex.2 = INTEGER: 2
UCD-SNMP-MIB::extNames.1 = STRING: mailq-check
UCD-SNMP-MIB::extNames.2 = STRING: fscheck
UCD-SNMP-MIB::extCommand.1 = STRING: /usr/local/nagios/libexec/check_mailq -w2 -c4 -t7 -Mpostfix
UCD-SNMP-MIB::extCommand.2 = STRING: /bin/touch /opt/rocheck && /bin/touch /tmp/rocheck
UCD-SNMP-MIB::extResult.1 = INTEGER: 0
UCD-SNMP-MIB::extResult.2 = INTEGER: 0
UCD-SNMP-MIB::extOutput.1 = STRING: OK: mailq reports queue is empty|unsent=0;2;4;0
UCD-SNMP-MIB::extOutput.2 = STRING: 
UCD-SNMP-MIB::extErrFix.1 = INTEGER: 0
UCD-SNMP-MIB::extErrFix.2 = INTEGER: 0
UCD-SNMP-MIB::extErrFixCmd.1 = STRING: 
UCD-SNMP-MIB::extErrFixCmd.2 = STRING:
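
If you have a pile of CentOS 5 style lines to migrate, the change is mechanical enough to script. This is a rough sketch that rewrites "exec <oid> <name> <command>" into the quoted form used in this post; the function name convert_exec_lines is my own invention, and it assumes the OID is a plain dotted-numeric string.

```shell
#!/bin/sh
# Rewrite 'exec <oid> <name> <command...>' lines into the
# 'exec <name> "<command...>"' form used in this post for CentOS 6.
# Lines that do not start with 'exec <dotted-numeric-oid>' pass through unchanged.
convert_exec_lines() {
    sed -E 's/^exec[[:space:]]+\.?[0-9]+(\.[0-9]+)+[[:space:]]+([^[:space:]]+)[[:space:]]+(.*[^[:space:]])[[:space:]]*$/exec \2 "\3"/'
}

# Example: convert one of the CentOS 5 lines from above.
printf '%s\n' 'exec 1.3.6.1.4.1.5002.1 fscheck /bin/touch /opt/rocheck && /bin/touch /tmp/rocheck' \
    | convert_exec_lines
```

Run it over a copy of the old config and review the result by hand before putting it in place.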

Now, let me explain what all of these MIB objects mean.

extIndex.#
The Index number of the command
extNames.#
The Name you gave the command. This is the first parameter you pass to exec
extCommand.#
The full command you had SNMPd call to return you this info. Second parameter you passed to exec
extResult.#
Exit Code from the command exec'd
extOutput.#
Raw output of the command exec'd
extErrFix.#
This is a bit that can be flipped by the client. Flipping this to a 1 kicks off the extErrFixCmd by SNMPd. Generally this is used to 'fix' an 'error' condition.
extErrFixCmd.#
The command to be executed by the SNMPd server
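
Since extResult carries the exit code, a quick way to use this table is to pull out only the checks that failed. Here is a minimal sketch, assuming the snmpwalk output format shown above; the function name failed_checks is hypothetical, and the sample input is trimmed from the first walk in this post.

```shell
#!/bin/sh
# Read snmpwalk output for the UCD-SNMP-MIB ext table on stdin and print one
# line per failing check (extResult != 0) in the form "<name>: <output>".
failed_checks() {
    awk '
        {
            # "UCD-SNMP-MIB::extNames.1 = STRING: foo" -> column, index, value
            split($1, parts, "::")
            n = split(parts[2], col, /\./)
            idx = col[n] + 0
            value = $0
            sub(/^[^=]*= [A-Za-z]+: ?/, "", value)
        }
        col[1] == "extNames"  { name[idx] = value }
        col[1] == "extResult" { result[idx] = value + 0; if (idx > max) max = idx }
        col[1] == "extOutput" { output[idx] = value }
        END {
            for (i = 1; i <= max; i++)
                if (result[i] != 0)
                    print name[i] ": " output[i]
        }
    '
}

# Sample input taken from the broken CentOS 6 config earlier in this post.
failed_checks <<'EOF'
UCD-SNMP-MIB::extNames.1 = STRING: 1.3.6.1.4.1.5001.100
UCD-SNMP-MIB::extResult.1 = INTEGER: 1
UCD-SNMP-MIB::extOutput.1 = STRING: mailq-check: No such file or directory
EOF
```

Against a live host you would pipe the walk straight in, e.g. "snmpwalk -v 2c -c COMMUNITY HOST .1.3.6.1.4.1.2021.8 | failed_checks", with COMMUNITY and HOST replaced by your own values.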