SMS Watchdog

My mobile carrier offers access to an API that can send SMS to its users. With systemd’s timers, I have been able to make a script that warns me when the load on my server is too high! Basically, timers work by starting a service repeatedly, which in turn starts a script in this case. This shell script is responsible for checking the load and sending an SMS. Of course, you can have it send you a mail too. Or tweet it, or whatever - the sky’s the limit.

The script

This shell script is in charge of checking the load average and sending an SMS if it’s too high.

It is easy to get the load average of the last 5 minutes with the command awk '{print $2}' /proc/loadavg (the first and third fields hold the 1-minute and 15-minute averages; the script below reads the first one), but we must adjust the trigger depending on the number of cores in the computer. To do so, nproc works fine, or, more portably, grep processor /proc/cpuinfo -c (it counts the lines containing the word “processor” in /proc/cpuinfo, and thus the number of cores).
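The two measurements described above can be wrapped in small functions. This is only a sketch: load_average and core_count are hypothetical names, and the optional file argument exists solely so that the functions can be tried on sample data instead of the real /proc files.

```sh
#!/bin/sh
# Print one of the three load averages from a /proc/loadavg-style file.
# $1: 1, 2 or 3 (1-, 5- or 15-minute average); $2: file, /proc/loadavg by default.
load_average() {
    awk -v field="$1" '{ print $field }' "${2:-/proc/loadavg}"
}

# Count the "processor" entries of a /proc/cpuinfo-style file (one per logical core).
# $1: file, /proc/cpuinfo by default.
core_count() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Demonstration on a sample line, so the result does not depend on the machine.
sample=$(mktemp)
echo '0.42 1.37 2.05 1/234 5678' > "$sample"
load_average 1 "$sample"   # prints 0.42
load_average 2 "$sample"   # prints 1.37
rm -f "$sample"
```

Called without a file argument, both functions read the live values from /proc.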

Finally, we compare the load and the trigger limit. There’s a pitfall though: the shell (bash and sh as far as I know; I’ve heard it’s different for zsh) does not do floating-point arithmetic, so we need to pipe this computation to bc.
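If bc is not installed, awk can do the same floating-point work. A minimal sketch of that alternative, where float_gt and load_trigger are hypothetical helper names that are not part of the script below:

```sh
#!/bin/sh
# Succeed (exit status 0) when $1 is strictly greater than $2, compared as floats.
float_gt() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 > b + 0) }'
}

# Print the trigger for a percentage ($1) and a number of cores ($2).
load_trigger() {
    awk -v limit="$1" -v cores="$2" 'BEGIN { print limit / 100 * cores }'
}

trigger=$(load_trigger 125 4)   # 125 % of 4 cores
if float_gt 5.20 "$trigger"; then
    echo "load 5.20 is above the trigger $trigger"
fi
```

The comparison is returned as an exit status rather than printed, so it can be used directly in an if statement.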

Here’s the script I came up with:

#!/bin/sh

limit=125 # Percentage of total load
core_nbr=$(grep processor /proc/cpuinfo -c) # Number of cores. Equivalent to $(nproc) but more portable
trigger=$(echo "($limit / 100) * $core_nbr" | bc -l) # Trigger limit, which depends on how many cores you have
load=$(awk '{print $1}' /proc/loadavg) # Load average of the last minute

if [ "$(echo "$load > $trigger" | bc)" = "1" ]; then
    echo "The load average is very great, I should send you an e-mail"
    message="There is something wrong with your server. The load average is $load.%0D%0AAs a reminder, the trigger limit is $trigger." # %0D%0A is a line break
    curl "https://smsapi.free-mobile.fr/sendmsg?user=secret&pass=secret&msg=$message"
fi
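The message above contains spaces and hand-written %0D%0A escapes. curl can do the percent-encoding itself with its --get and --data-urlencode options. The sketch below assumes the same endpoint and parameter names as the script; send_sms and the CURL override are hypothetical, and this variant has not been checked against the carrier's API.

```sh
#!/bin/sh
# Hypothetical helper: send one SMS and let curl percent-encode every parameter.
# $1: user, $2: pass, $3: message text (it may contain spaces and real line breaks).
# Setting CURL to another command (for example echo) previews the call
# without contacting the API.
send_sms() {
    "${CURL:-curl}" --silent --show-error --get \
        --data-urlencode "user=$1" \
        --data-urlencode "pass=$2" \
        --data-urlencode "msg=$3" \
        "https://smsapi.free-mobile.fr/sendmsg"
}

# Preview only: echo stands in for curl, so nothing is sent.
CURL=echo
send_sms secret secret "Load average is 5.20"
```

With this approach the message is written as plain text, so a line break is a real newline inside the quotes rather than %0D%0A.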

Timer

/etc/systemd/system/ is the directory where every manually added systemd unit file should go, so this is where you can create a service file called, for example, getkey-sms-watchdog.service.

[Unit]
Description=Send an SMS if the load average is TOO DAMN HIGH
# It's of no use making HTTP requests when there's no Internet access
Requires=network-online.target
After=network-online.target
[Service]
ExecStart=/home/getkey/monitor_load_average.sh

Now, the timer. By default a timer activates the service that has the same filename, except of course for the extension, which must be “.timer” (a different service can be named with the Unit= option). So, let’s create getkey-sms-watchdog.timer:

[Unit]
Description=Send an SMS if the load average is TOO DAMN HIGH
[Timer]
# Start first 10 minutes after boot
OnBootSec=10min
# Then restart every 5 minutes
OnUnitActiveSec=5min
[Install]
WantedBy=timers.target
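
To put the two units to work, systemd has to reload its configuration and the timer has to be enabled. A minimal sketch, assuming the unit names used above and a shell with root privileges:

```sh
# Make systemd pick up the new unit files
systemctl daemon-reload
# Start the timer now and at every boot
systemctl enable --now getkey-sms-watchdog.timer
# Check when the timer last ran and when it will run next
systemctl list-timers getkey-sms-watchdog.timer
# Run the service once by hand and read its output
systemctl start getkey-sms-watchdog.service
journalctl -u getkey-sms-watchdog.service
```

Only the timer needs to be enabled; the service is started by the timer and does not have to be enabled itself.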