300 lines
11 KiB
Plaintext
300 lines
11 KiB
Plaintext
Advanced usage and scheduling notes
|
|
===================================
|
|
|
|
upsmon can call out to a helper script or program when the device changes
|
|
state. The example upsmon.conf has a full list of which state changes
|
|
are available -- ONLINE, ONBATT, LOWBATT, and more.
|
|
|
|
There are two options, that will be presented in details:
|
|
|
|
- the simple approach: create your own helper, and manage all events and actions
|
|
yourself,
|
|
- the advanced approach: use the NUT provided helper, called 'upssched'.
|
|
|
|
|
|
The simple approach, using your own script
|
|
------------------------------------------
|
|
|
|
How it works relative to upsmon
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Your command will be called with the full text of the message as one argument.
|
|
|
|
For the default values, refer to the sample upsmon.conf file.
|
|
|
|
The environment string NOTIFYTYPE will contain the type string of whatever
|
|
caused this event to happen -- ONLINE, ONBATT, LOWBATT, ...
|
|
|
|
Making this some sort of shell script might be a good idea, but the helper can
|
|
be in any programming or scripting language.
|
|
|
|
NOTE: Remember that your helper must be *executable*. If you are using a script,
|
|
make sure the execution flags are set.
|
|
|
|
For more information, refer to linkman:upsmon[8] and
|
|
linkman:upsmon.conf[5] manual pages.
|
|
|
|
Setting up everything
|
|
~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
- Set EXEC flags on various things in linkman:upsmon.conf[5]:
|
|
+
|
|
NOTIFYFLAG ONBATT EXEC
|
|
NOTIFYFLAG ONLINE EXEC
|
|
+
|
|
If you want other things like WALL or SYSLOG to happen, just add them:
|
|
+
|
|
NOTIFYFLAG ONBATT EXEC+WALL+SYSLOG
|
|
+
|
|
You get the idea.
|
|
|
|
- Tell upsmon where your script is
|
|
|
|
NOTIFYCMD /path/to/my/script
|
|
|
|
- Make a simple script like this at that location:
|
|
|
|
#! /bin/bash
|
|
echo "$*" | sendmail -F"ups@mybox" bofh@pager.example.com
|
|
|
|
- Restart upsmon, pull the plug, and see what happens.
|
|
|
|
That approach is bare-bones, but you should get the text content of the
|
|
alert in the body of the message, since upsmon passes the alert text
|
|
(from NOTIFYMSG) as an argument.
|
|
|
|
Using more advanced features
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Your helper script will be run with a few environment variables set.
|
|
|
|
- UPSNAME: the name of the system that generated the change.
|
|
+
|
|
This will be one of your identifiers from the MONITOR lines in upsmon.conf.
|
|
|
|
- NOTIFYTYPE: this will be ONLINE, ONBATT, or whatever event took place which
|
|
made upsmon call your script.
|
|
|
|
You can use these to do different things based on which system has
|
|
changed state. You could have it only send pages for an important
|
|
system while totally ignoring a known trouble spot, for example.
|
|
|
|
Suppressing notify storms
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
upsmon will call your script every time an event happens that has the EXEC flag
|
|
set. This means a quick power failure that lasts mere seconds might generate a
|
|
notification storm. To suppress this sort of annoyance, use upssched as your
|
|
NOTIFYCMD program, and configure it to call your command after a timer has
|
|
elapsed.
|
|
|
|
|
|
The advanced approach, using upssched
|
|
-------------------------------------
|
|
|
|
upssched is a helper for upsmon that will invoke commands for you at some
|
|
interval relative to a UPS event. It can be used to send pages, mail out
|
|
notices about things, or even shut down the box early.
|
|
|
|
There will be examples scattered throughout. Change them to suit your
|
|
pathnames, UPS locations, and so forth.
|
|
|
|
How upssched works relative to upsmon
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
When an event occurs, upsmon will call whatever you specify as a 'NOTIFYCMD'
|
|
in your upsmon.conf, if you also enable the 'EXEC' in your 'NOTIFYFLAGS'. In
|
|
this case, we want upsmon to call upssched as the notifier, since it will
|
|
be doing all the work for us. So, in the upsmon.conf:
|
|
|
|
NOTIFYCMD /usr/local/ups/sbin/upssched
|
|
|
|
Then we want upsmon to actually _use_ it for the notify events, so again
|
|
in the upsmon.conf we set the flags:
|
|
|
|
NOTIFYFLAG ONLINE SYSLOG+EXEC
|
|
NOTIFYFLAG ONBATT SYSLOG+WALL+EXEC
|
|
NOTIFYFLAG LOWBATT SYSLOG+WALL+EXEC
|
|
... and so on.
|
|
|
|
For the purposes of this document I will only use those three, but you can set
|
|
the flags for any of the valid notify types.
|
|
|
|
Setting up your upssched.conf
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Once upsmon has been configured with the NOTIFYCMD and EXEC flags, you're
|
|
ready to deal with the upssched.conf details. In this file, you specify
|
|
just what will happen when a given event occurs on a particular UPS.
|
|
|
|
First you need to define the name of the script or program that will
|
|
handle timers that trigger. This is your CMDSCRIPT, and needs to be above
|
|
any AT defines. There's an example provided with the program, so we'll
|
|
use that here:
|
|
|
|
CMDSCRIPT /usr/local/ups/bin/upssched-cmd
|
|
|
|
Then you have to define the variables PIPEFN and LOCKFN; the former
|
|
sets the file name of the FIFO that will pass communications between
|
|
processes to start and stop timers, while the latter sets the file name
|
|
for a temporary file created by upssched in order to avoid a race condition
|
|
under some circumstances. Please see the relevant comments in upssched.conf
|
|
for additional information and advice about these variables.
|
|
|
|
Now you can tell your CMDSCRIPT what to do when it is called by upsmon.
|
|
|
|
The big picture
|
|
^^^^^^^^^^^^^^^
|
|
|
|
The design in a nutshell is:
|
|
|
|
upsmon ---> calls upssched ---> calls your CMDSCRIPT
|
|
|
|
Ultimately, the CMDSCRIPT does the actual useful work, whether that's
|
|
initiating an early shutdown with 'upsmon -c fsd', sending a page by
|
|
calling sendmail, or opening a subspace channel to V'ger.
|
|
|
|
Establishing timers
|
|
^^^^^^^^^^^^^^^^^^^
|
|
|
|
Let's say that you want to receive a notification when any UPS has been running
|
|
on battery for 30 seconds. Create a handler that starts a 30 second timer
|
|
for an ONBATT condition.
|
|
|
|
AT ONBATT * START-TIMER onbattwarn 30
|
|
|
|
This means "when any UPS (the *) goes on battery, start a timer called
|
|
onbattwarn that will trigger in 30 seconds". We'll come back to the
|
|
onbattwarn part in a moment. Right now we need to make sure that we
|
|
don't trigger that timer if the UPS happens to come back before the
|
|
time is up. In essence, if it goes back on line, we need to cancel it.
|
|
So, let's tell upssched that.
|
|
|
|
AT ONLINE * CANCEL-TIMER onbattwarn
|
|
|
|
NOTE: Timers are pure in-memory mechanisms, specific to upssched. Conversely
|
|
to other mechanisms found in NUT, such as upsmon->POWERDOWNFLAG, there is no
|
|
file created on the filesystem.
|
|
|
|
Executing commands immediately
|
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
|
|
|
As an example, consider the scenario where a UPS goes onto battery power.
|
|
However, the users are not informed until 60 seconds later -- using a timer as
|
|
described above. Whilst this may let the *logged in* users know that the UPS
|
|
is on battery power, it does not inform any users subsequently logging in. To
|
|
enable this we could, at the same time, create a file which is read and
|
|
displayed to any user trying to login whilst the UPS is on battery power. If
|
|
the UPS comes back onto utility power within 60 seconds, then we can cancel
|
|
the timer and remove the file, as described above. However, if the UPS comes
|
|
back onto utility power say 5 minutes later then we do not want to use any
|
|
timers but we still want to remove the file. To do this we could use:
|
|
|
|
AT ONLINE * EXECUTE ups-back-on-power
|
|
|
|
This means that when upsmon detects that the UPS is back on utility power it
|
|
will signal upssched. Upssched will see the above command and simply pass
|
|
'ups-back-on-power' as an argument directly to CMDSCRIPT. This occurs
|
|
immediately, there are no timers involved.
|
|
|
|
Writing the command script handler
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
OK, now that upssched knows how the timers are supposed to work, let's
|
|
give it something to do when one actually triggers. The name of the
|
|
example timer is onbattwarn, so that's the argument that will be passed
|
|
into your CMDSCRIPT when it triggers. This means we need to do some
|
|
shell script writing to deal with that input.
|
|
|
|
--------------------------------------------------------------------------------
|
|
|
|
#! /bin/sh
|
|
|
|
case $1 in
|
|
onbattwarn)
|
|
# Send a notification mail
|
|
echo "The UPS has been on battery for awhile" \
|
|
| mail -s"UPS monitor" bofh@pager.example.com
|
|
# Create a flag-file on the filesystem, for your own processing
|
|
/usr/bin/touch /some/path/ups-on-battery
|
|
;;
|
|
ups-back-on-power)
|
|
# Delete the flag-file on the filesystem
|
|
/bin/rm -f /some/path/ups-on-battery
|
|
;;
|
|
*)
|
|
logger -t upssched-cmd "Unrecognized command: $1"
|
|
;;
|
|
esac
|
|
|
|
--------------------------------------------------------------------------------
|
|
|
|
This is a very simple script example, but it shows how you can test for
|
|
the presence of a given trigger. With multiple ATs creating various timer
|
|
names, you will need to test for each possibility and handle it according
|
|
to your desires.
|
|
|
|
NOTE: You can invoke just about anything from inside the CMDSCRIPT. It doesn't
|
|
need to be a shell script, either -- that's just an example. If you want to
|
|
write a program that will parse argv[1] and deal with the possibilities, that
|
|
will work too.
|
|
|
|
|
|
Early Shutdowns
|
|
~~~~~~~~~~~~~~~
|
|
|
|
One thing that gets requested a lot is early shutdowns in upsmon. With
|
|
upssched, you can now have this functionality. Just set a timer for some
|
|
length of time at ONBATT which will invoke a shutdown command if it elapses.
|
|
Just be sure to cancel this timer if you go back ONLINE before then.
|
|
|
|
The best way to do this is to use the upsmon callback feature. You can
|
|
make upsmon set the "forced shutdown" (FSD) flag on the upsd so your
|
|
secondary systems shut down early too. Just do something like this
|
|
in your CMDSCRIPT:
|
|
|
|
/sbin/upsmon -c fsd
|
|
|
|
NOTE: the path to `upsmon` must be provided. The default for an installation
|
|
built from sources is `/usr/local/ups` (so `/usr/local/ups/sbin/upsmon`),
|
|
while packaged installations will generally comply to
|
|
link:http://refspecs.linuxfoundation.org/fhs.shtml[FHS -- Filesystem Hierarchy Standard]
|
|
(so `/sbin/upsmon`).
|
|
|
|
It's not a good idea to call your system's shutdown routine directly
|
|
from the CMDSCRIPT, since there's no synchronization with the secondary
|
|
systems hooked to the same UPS. FSD is the primary's way of saying
|
|
"we're shutting down *now* like it or not, so you'd better get ready".
|
|
|
|
|
|
Background
|
|
~~~~~~~~~~
|
|
|
|
This program was written primarily to fulfill the requests of users for
|
|
the early shutdown scenario. The "outboard" design of the program
|
|
(relative to upsmon) was intended to reduce the load on the average
|
|
system. Most people don't have the requirement of shutting down after n
|
|
seconds on battery, since the usual OB+LB testing is sufficient.
|
|
|
|
This program was created separately so those people don't have to spend
|
|
CPU time and RAM on something that will never be used in their
|
|
environments.
|
|
|
|
The design of the timer handler is also geared towards minimizing impact.
|
|
It will come and go from the process list as necessary. When a new timer
|
|
is started, a process will be forked to actually watch the clock and
|
|
eventually start the CMDSCRIPT. When a timer triggers, it is removed from
|
|
the queue. Canceling a timer will also remove it from the queue. When
|
|
no timers are present in the queue, the background process exits.
|
|
|
|
This means that you will only see upssched running when one of two things
|
|
is happening:
|
|
|
|
1. There's a timer of some sort currently running
|
|
2. upsmon just called it, and you managed to catch the brief instance
|
|
|
|
The final optimization handles the possibility of trying to cancel a timer
|
|
when there's none running. If there's no process already running, there
|
|
are no timers to cancel, and furthermore there is no need to start a
|
|
clock-watcher. As a result, it skips that step and exits sooner.
|