# Network UPS Tools: example upsmon configuration
#
# This file contains passwords, so keep it secure.
# --------------------------------------------------------------------------
# RUN_AS_USER <userid>
#
# By default, upsmon splits into two processes. One stays as root and
# waits to run the SHUTDOWNCMD. The other one switches to another userid
# and does everything else.
#
# The default unprivileged user is set at compile-time with the option
# 'configure --with-user=...'
#
# You can override it with '-u <user>' when starting upsmon, or just
# define it here for convenience.
#
# Note: if you plan to use the reload feature, this file (upsmon.conf)
# must be readable by this user! Since it contains passwords, DO NOT
# make it world-readable. Also, do not make it writable by the upsmon
# user, since that creates an opportunity for an attack by changing the
# SHUTDOWNCMD to something malicious.
#
# For best results, you should create a new normal user like "nutmon",
# and make it a member of a "nut" group or similar. Then specify it
# here and grant read access to the upsmon.conf for that group.
#
# This user should not have write access to upsmon.conf.
#
# RUN_AS_USER @RUN_AS_USER@
# --------------------------------------------------------------------------
# MONITOR <system> <powervalue> <username> <password> ("primary"|"secondary")
#
# List systems you want to monitor. Not all of these may supply power
# to the system running upsmon, but if you want to watch it, it has to
# be in this section.
#
# You must have at least one of these declared.
#
# <system> is a UPS identifier in the form <upsname>@<hostname>[:<port>]
# like ups@localhost, su700@mybox, etc.
#
# Examples:
#
#  - "su700@mybox" means a UPS called "su700" on a system called "mybox"
#
#  - "fenton@bigbox:5678" is a UPS called "fenton" on a system called
#    "bigbox" which runs upsd on port "5678".
#
# The UPS names like "su700" and "fenton" are set in your ups.conf
# in [brackets] which identify a section for a particular driver.
#
# If the ups.conf on host "doghouse" has a section called "snoopy", the
# identifier for it would be "snoopy@doghouse".
#
# <powervalue> is an integer - the number of power supplies that this UPS
# feeds on this system. Most personal computers only have one power supply,
# so this value is normally set to 1, while most modern servers have at least
# two. You need a pretty big or special box to have any other value here.
#
# You can also set this to 0 for a system that doesn't take any power
# from the MONITORed supply, which you still want to monitor (e.g. for an
# administrative workstation fed from a different circuit than the datacenter
# servers it monitors). Use a <powervalue> of 0 when you want to hear about
# changes for a given UPS without shutting down when it goes critical.
#
# <username> and <password> must match an entry in that system's
# upsd.users. If your username is "upsmon" and your password is
# "blah", the upsd.users would look like this:
#
#     [upsmon]
#         password = blah
#         upsmon primary    # (or secondary)
#
# "primary" means this system will shut down last, allowing the secondary
# systems time to shut down first.
#
# "secondary" means this system shuts down immediately when power goes
# critical and fewer than MINSUPPLIES power sources have reliable input feeds.
#
# The general assumption is that the "primary" system is the one with direct
# connection to a UPS (such as serial or USB cable), so the primary system
# runs the NUT driver and 'upsd' server locally and can manage the device,
# and it would often tell the UPS to completely power itself off as a step
# in power-race avoidance (see POWERDOWNFLAG for details).
#
# Also, since the primary system stays up the longest, it runs a higher risk
# of an ungraceful shutdown if the estimate of remaining runtime (or of the
# time it takes to shut down this system) turns out to be wrong.
# Consequently, the "secondary" systems typically monitor the power
# environment state through the 'upsd' processes running on the remote
# (often "primary") systems and do not directly interact with a UPS (no
# local NUT drivers are running on the secondary systems). As such,
# secondaries typically shut down as soon as there is a sufficiently long
# power outage, or a low-battery alert from the UPS, or a loss of connection
# to the primary while the power was last known to be missing.
#
# This assumption and configuration can also make sense for networked UPSes,
# where a rack full of servers might overload the communications capacity
# of the networked management card on the UPS - in this case you might either
# reduce the 'snmp-ups' or 'netxml-ups' driver polling rate, or dedicate a
# "primary" server and set up the rest as "secondary" systems.
#
# In such large setups, beware also that shutdown times for a whole rack
# going down at once can differ substantially from smaller-scale experiments
# with single-server shutdowns, since systems can compete for shared storage
# and other limited resources as they go down (and not every system can
# safely shut down at the same time - e.g. a NAS or DB server should go down
# after all its clients). You would be well served by higher-end UPSes with
# manageable thresholds to declare a critical state.
#
# Examples:
#
# MONITOR myups@bigserver 1 upswired blah primary
# MONITOR su700@server.example.com 1 upsmon secretpass secondary
# MONITOR myups@localhost 1 upsmon pass primary    # (or secondary)
# --------------------------------------------------------------------------
# MINSUPPLIES <num>
#
# Give the number of power supplies that must be receiving power to keep
# this system running. Most systems have one power supply, so you would
# put "1" in this field.
#
# Large/expensive server type systems usually have more, and can run with
# a few missing.
# Some of these can run with 2 out of 4, for example, so you'd set that
# to 2. The idea is to keep the box running as long as possible, right?
#
# Obviously you have to put the redundant supplies on different UPS circuits
# for this to make sense! See big-servers.txt in the docs subdirectory
# for more information and ideas on how to use this feature.

MINSUPPLIES 1

# --------------------------------------------------------------------------
# SHUTDOWNCMD "<command>"
#
# upsmon runs this command when the system needs to be brought down.
#
# This should work just about everywhere ... if it doesn't, well, change it,
# perhaps to a more complicated custom script.
#
# Note: while you experiment with the initial setup and test how your
# configuration reacts to power state changes (ultimately, to power being
# reported as critical) without wanting the system to actually turn off,
# consider setting the SHUTDOWNCMD temporarily to something benign, such as
# posting a message with 'logger', 'wall' or 'mailx'. Do be careful to plug
# the UPS back into the wall in a timely fashion.

SHUTDOWNCMD "/sbin/shutdown -h +0"

# --------------------------------------------------------------------------
# NOTIFYCMD <command>
#
# upsmon calls this to send messages when things happen.
#
# This command is called with the full text of the message (from NOTIFYMSG)
# as one argument.
#
# The environment variable NOTIFYTYPE will contain the type string of
# whatever caused this event to happen.
#
# The environment variable UPSNAME will contain the name of the system/device
# that generated the change.
#
# Note that this is only called for NOTIFY events that have EXEC set with
# NOTIFYFLAG. See NOTIFYFLAG below for more details.
#
# Making this some sort of shell script might not be a bad idea.
# Alternatively, you can use the upssched program as your NOTIFYCMD for some
# more complex setups (e.g. to ease handling of notification storms).
# For more information and ideas, see docs/scheduling.txt
#
# Example:
# NOTIFYCMD @BINDIR@/notifyme
# --------------------------------------------------------------------------
# POLLFREQ <n>
#
# Polling frequency for normal activities, measured in seconds.
#
# Adjust this to keep upsmon from flooding your network, but don't make
# it too high or it may miss certain short-lived power events.

POLLFREQ 5

# --------------------------------------------------------------------------
# POLLFREQALERT <n>
#
# Polling frequency in seconds while a UPS is on battery.
#
# You can make this number lower than POLLFREQ, which will make updates
# faster when any UPS is running on battery. This is a good way to tune
# network load if you have a lot of these things running.
#
# The default is 5 seconds for both this and POLLFREQ.

POLLFREQALERT 5

# --------------------------------------------------------------------------
# HOSTSYNC - How long upsmon will wait before giving up on another upsmon
#
# The primary upsmon process uses this number when waiting for secondary
# systems to disconnect once it has set the forced shutdown (FSD) flag.
# If they don't disconnect after this many seconds, it goes on without them.
#
# Similarly, upsmon secondary processes wait up to this interval for the
# primary upsmon to set FSD when a UPS they are monitoring goes critical -
# that is, on battery and low battery. If the primary doesn't do its job,
# the secondaries will shut down anyway to avoid damage to the file systems.
#
# This "wait for FSD" is done to avoid races where the status changes
# to critical and back between polls by the primary.

HOSTSYNC 15

# --------------------------------------------------------------------------
# DEADTIME - Interval to wait before declaring a stale UPS "dead"
#
# upsmon requires a UPS to provide status information every few seconds
# (see POLLFREQ and POLLFREQALERT) to keep things updated. If the status
# fetch fails, the UPS is marked stale.
# If it stays stale for more than DEADTIME seconds, the UPS is marked dead.
#
# A dead UPS that was last known to be on battery is assumed to have gone
# to a low battery condition. This may force a shutdown if it is providing
# a critical amount of power to your system.
#
# Note: DEADTIME should be a multiple of POLLFREQ and POLLFREQALERT.
# Otherwise you'll have "dead" UPSes simply because upsmon isn't polling
# them quickly enough. Rule of thumb: take the larger of the two
# POLLFREQ values, and multiply by 3.

DEADTIME 15

# --------------------------------------------------------------------------
# POWERDOWNFLAG - Flag file for forcing UPS shutdown on the primary system
#
# upsmon will create a file with this name in primary mode when it's time
# to shut down the load. You should check for this file's existence in
# your shutdown scripts and run 'upsdrvctl shutdown' if it exists, to tell
# the UPS(es) to power off.
#
# See the config-notes.txt file in the docs subdirectory for more information.
# Refer to the section:
# [[UPS_shutdown]] "Configuring automatic shutdowns for low battery events"
# or refer to the online version.

POWERDOWNFLAG /etc/killpower

# --------------------------------------------------------------------------
# NOTIFYMSG - change messages sent by upsmon when certain events occur
#
# You can change the default messages to something else if you like.
#
# NOTIFYMSG <notify type> "message"
#
# NOTIFYMSG ONLINE   "UPS %s on line power"
# NOTIFYMSG ONBATT   "UPS %s on battery"
# NOTIFYMSG LOWBATT  "UPS %s battery is low"
# NOTIFYMSG FSD      "UPS %s: forced shutdown in progress"
# NOTIFYMSG COMMOK   "Communications with UPS %s established"
# NOTIFYMSG COMMBAD  "Communications with UPS %s lost"
# NOTIFYMSG SHUTDOWN "Auto logout and shutdown proceeding"
# NOTIFYMSG REPLBATT "UPS %s battery needs to be replaced"
# NOTIFYMSG NOCOMM   "UPS %s is unavailable"
# NOTIFYMSG NOPARENT "upsmon parent process died - shutdown impossible"
#
# Note that %s is replaced with the identifier of the UPS in question.
#
# Possible values for <notify type>:
#
# ONLINE   : UPS is back online
# ONBATT   : UPS is on battery
# LOWBATT  : UPS has a low battery (if also on battery, it's "critical")
# FSD      : UPS is being shut down by the primary (FSD = "Forced Shutdown")
# COMMOK   : Communications established with the UPS
# COMMBAD  : Communications lost to the UPS
# SHUTDOWN : The system is being shut down
# REPLBATT : The UPS battery is bad and needs to be replaced
# NOCOMM   : A UPS is unavailable (can't be contacted for monitoring)
# NOPARENT : The process that shuts down the system has died (shutdown impossible)
# --------------------------------------------------------------------------
# NOTIFYFLAG - change behavior of upsmon when NOTIFY events occur
#
# By default, upsmon sends walls (global messages to all logged in users)
# and writes to the syslog when things happen. You can change this.
#
# NOTIFYFLAG <notify type> <flag>[+<flag>][+<flag>] ...
#
# NOTIFYFLAG ONLINE   SYSLOG+WALL
# NOTIFYFLAG ONBATT   SYSLOG+WALL
# NOTIFYFLAG LOWBATT  SYSLOG+WALL
# NOTIFYFLAG FSD      SYSLOG+WALL
# NOTIFYFLAG COMMOK   SYSLOG+WALL
# NOTIFYFLAG COMMBAD  SYSLOG+WALL
# NOTIFYFLAG SHUTDOWN SYSLOG+WALL
# NOTIFYFLAG REPLBATT SYSLOG+WALL
# NOTIFYFLAG NOCOMM   SYSLOG+WALL
# NOTIFYFLAG NOPARENT SYSLOG+WALL
#
# Possible values for the flags:
#
# SYSLOG - Write the message in the syslog
# WALL   - Write the message to all users on the system
# EXEC   - Execute NOTIFYCMD (see above) with the message
# IGNORE - Don't do anything
#
# If you use IGNORE, don't use any other flags on the same line.
# --------------------------------------------------------------------------
# RBWARNTIME - replace battery warning time in seconds
#
# upsmon will normally warn you about a battery that needs to be replaced
# every 43200 seconds, which is 12 hours. It does this by triggering a
# NOTIFY_REPLBATT which is then handled by the usual notify structure
# you've defined above.
#
# If this number is not to your liking, override it here.

RBWARNTIME 43200

# --------------------------------------------------------------------------
# NOCOMMWARNTIME - no communications warning time in seconds
#
# upsmon will let you know through the usual notify system if it can't
# talk to any of the UPS entries that are defined in this file. It will
# trigger a NOTIFY_NOCOMM by default every 300 seconds unless you
# change the interval with this directive.

NOCOMMWARNTIME 300

# --------------------------------------------------------------------------
# FINALDELAY - last sleep interval before shutting down the system
#
# On a primary, upsmon will wait this long after sending the NOTIFY_SHUTDOWN
# before executing your SHUTDOWNCMD. If you need to do something in between
# those events, increase this number. Remember, at this point your UPS is
# almost depleted, so don't make this too high.
# If needed, on high-end UPS devices you can usually configure when the
# low-battery state is announced, based on estimated remaining run-time or
# on charge level of the batteries.
#
# Alternatively, you can set this very low so you don't wait around when
# it's time to shut down. Some UPSes don't give much warning for low
# battery and will require a value of 0 here for a safe shutdown.
#
# Note: If FINALDELAY on the secondary is greater than HOSTSYNC on the
# primary, the primary will give up waiting for that secondary system
# to disconnect.

FINALDELAY 5

# --------------------------------------------------------------------------
# CERTPATH - path to certificates (database directory or directory with CA's)
#
# When compiled with SSL support, you can enter the certificate path here.
#
# With NSS:
# Certificates are stored in a dedicated database (split into 3 files).
# Specify the path of the database directory.
#
# CERTPATH @CONFPATH@/cert/upsmon
#
# With OpenSSL:
# Directory containing CA certificates in PEM format, used to verify
# the server certificate presented by the upsd server. The files each
# contain one CA certificate. The files are looked up by the CA subject
# name hash value, which must hence be available.
#
# CERTPATH /usr/ssl/certs
#
# See 'docs/security.txt' or the Security chapter of NUT user manual
# for more information on the SSL support in NUT.
# --------------------------------------------------------------------------
# CERTIDENT - self certificate name and database password
# CERTIDENT <certname> <database password>
#
# When compiled with SSL support with NSS, you can specify the name of the
# certificate that upsmon retrieves from the database to authenticate itself,
# and the password required to access the private key for that certificate.
#
# CERTIDENT "my nut monitor" "MyPasSw0rD"
#
# See 'docs/security.txt' or the Security chapter of NUT user manual
# for more information on the SSL support in NUT.
# --------------------------------------------------------------------------
# CERTHOST - security properties for a host
# CERTHOST <hostname> <certname> <certverify> <forcessl>
#
# When compiled with SSL support with NSS, you can specify security
# directives for each server you contact.
# Each entry maps a server name to the expected certificate name and to
# flags indicating whether the server certificate is verified and whether
# the connection must be secure.
#
# CERTHOST localhost "My nut server" 1 1
#
# See 'docs/security.txt' or the Security chapter of NUT user manual
# for more information on the SSL support in NUT.
# --------------------------------------------------------------------------
# CERTVERIFY - make upsmon verify all connections with certificates
# CERTVERIFY 1
#
# When compiled with SSL support, make upsmon verify all connections with
# certificates.
# Without this, there is no guarantee that the upsd is the right host.
# Enabling this greatly reduces the risk of man in the middle attacks.
# This effectively forces the use of SSL, so don't use this unless
# all of your upsd hosts are ready for SSL and have their certificates
# in order.
# When compiled with NSS support for SSL, this can be overridden for a host
# specified with a CERTHOST directive.
# --------------------------------------------------------------------------
# FORCESSL - force upsmon to use SSL
# FORCESSL 1
#
# When compiled with SSL, specify that a secured connection must be used
# to communicate with upsd.
# If you don't use 'CERTVERIFY 1', then this will at least make sure
# that nobody can sniff your sessions without a large effort. Setting
# this will make upsmon drop connections if the remote upsd doesn't
# support SSL, so don't use it unless all of them have it running.
# When compiled with NSS support for SSL, this can be overridden for a host
# specified with a CERTHOST directive.
# --------------------------------------------------------------------------
# DEBUG_MIN - specify minimal debugging level for upsmon daemon
# e.g. DEBUG_MIN 6
#
# Optionally specify a minimum debug level for `upsmon` daemon, e.g. for
# troubleshooting a deployment, without impacting foreground or background
# running mode directly, and without need to edit init-scripts or service
# unit definitions. Note that command-line option `-D` can only increase
# this verbosity level.
#
# NOTE: if the running daemon receives a `reload` command, presence of the
# `DEBUG_MIN NUMBER` value in the configuration file can be used to tune
# debugging verbosity in the running service daemon (it is recommended to
# comment it away or set the minimum to explicit zero when done, to avoid
# huge journals and I/O system abuse). Keep in mind that for this run-time
# tuning, the `DEBUG_MIN` value *present* in *reloaded* configuration files
# is applied instantly and overrides any previously set value, from file
# or CLI options, regardless of older logging level being higher or lower
# than the newly found number; a missing (or commented away) value however
# does not change the previously active logging verbosity.
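# --------------------------------------------------------------------------
# Worked example (a sketch under stated assumptions, not a recommendation)
#
# The commented fragment below illustrates how the timing directives in this
# file relate to one another on a hypothetical secondary system. The UPS
# name, host name and credentials are the illustrative ones from the MONITOR
# examples in this file. The POLLFREQ value of 10 is an assumption chosen
# only to make the arithmetic visible; it is not a default.
#
# MONITOR su700@server.example.com 1 upsmon secretpass secondary
# MINSUPPLIES 1
# SHUTDOWNCMD "/sbin/shutdown -h +0"
#
# Poll every 10 seconds normally, and every 5 seconds while on battery:
# POLLFREQ 10
# POLLFREQALERT 5
#
# DEADTIME rule of thumb: the larger of the two POLLFREQ values (10) times 3.
# The result (30) is also a multiple of both 10 and 5:
# DEADTIME 30
#
# FINALDELAY on this secondary (5) does not exceed the HOSTSYNC assumed on
# the primary (15), so the primary should not give up waiting for it:
# HOSTSYNC 15
# FINALDELAY 5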