Alert Manager
Purpose: To create a program that can run an alert command, monitor the
status of that command's output, and run a different command if it
fails. It should have a very flexible configuration file that
allows creation of "alert chains": chains of alerts, each with
its own alert command, fallback-command, and other advanced
options. It must have a method of passing an arbitrary message
from the command line into any of the alert commands defined in
the configuration file.
Features:
- Doesn't keep the log file open longer than it needs to, so multiple instances of alert.pl hopefully won't overwrite each other's log entries.
- Supports a very forgiving configuration file that allows pound (#), double-slash (//), and C-style /* */ comments.
- A command line switch to check the validity of the configuration file.
- A method to avoid infinite loops of failing alert commands.
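As an illustration of the first feature above, here is a minimal shell sketch of writing a log entry without holding the file open: each call opens the file in append mode, writes one line, and closes it again. The function name log_entry and the timestamp format are assumptions made for this example; they are not taken from alert.pl.

```shell
#!/bin/sh
# Hypothetical sketch, not alert.pl's actual code.
# Append one line to a log file. The ">>" redirection opens the file in
# append mode only for the duration of the printf, so the file is never
# held open between entries.
log_entry() {
    logfile=$1
    message=$2
    # The timestamp format is an assumption for this example.
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$message" >> "$logfile"
}
```

Calling log_entry /var/log/newlog "Penguins rock" would add a single timestamped line to /var/log/newlog.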
Program Flow:
Alert Manager is designed to be run on Linux or any other OS that
properly supports POSIX threads. Win32 is not supported by Alert
Manager. The basic program flow looks like this:
* Start
* Process command line
* Read "General Settings" from the configuration file
* Enter a while (1) loop
    * If the alert is one of the special predefined alerts (LOG/SYSLOG),
      run the alert and exit.
    * Read information about the alert we are about to run from the
      configuration file
    * Run the current alert command until an alert completes successfully:
      If a failure-alert command needs to run, it is started in a separate
      thread. If the current alert fails, the fallback command is loaded
      and run (in the same while loop). The while loop doesn't
      exit unless some command completes successfully.
      If the command completed successfully and there is a success-alert,
      it is loaded and run. If there is no success-alert, the program exits.
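The loop described above can be sketched in shell roughly as follows. This is only an illustration: the alert names, the three lookup functions, and the MAX_ALERTS guard are all made up for the example, the alert definitions are hard-coded rather than read from a configuration file, and the failure-alert thread is left out. Stopping when an alert has no fallback is likewise an assumption of the sketch.

```shell
#!/bin/sh
# Hypothetical sketch of the loop described above, not alert.pl's code.

# Command to run for each (made-up) alert name.
alert_command() {
    case $1 in
        page-jane)  echo "false" ;;   # stand-in for a pager command that fails
        email-jane) echo "true" ;;    # stand-in for a mail command that works
        log-ok)     echo "true" ;;
    esac
}

# Alert to run next if the named alert fails (empty if none).
alert_fallback() {
    case $1 in
        page-jane) echo "email-jane" ;;
    esac
}

# Alert to run next if the named alert succeeds (empty if none).
alert_success() {
    case $1 in
        email-jane) echo "log-ok" ;;
    esac
}

MAX_ALERTS=10   # assumed guard against endlessly failing chains

run_chain() {
    alert=$1
    count=0
    while [ -n "$alert" ] && [ "$count" -lt "$MAX_ALERTS" ]; do
        count=$((count + 1))
        if sh -c "$(alert_command "$alert")"; then
            echo "$alert: ok"
            next=$(alert_success "$alert")
            # No success-alert: the chain is finished.
            [ -n "$next" ] || return 0
            alert=$next
        else
            echo "$alert: failed"
            alert=$(alert_fallback "$alert")
        fi
    done
    return 1   # ran out of fallbacks or hit the guard
}
```

With these definitions, run_chain page-jane tries page-jane, falls back to email-jane when it fails, and then follows email-jane's success-alert to log-ok before exiting.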
NOTE: The configuration file syntax is not verified at runtime! Alert Manager was implemented this way on purpose to minimize potential message loss. For example, say you created the "foo" alert definition, have been using it for several weeks, and it always runs flawlessly. Then you create a "bar" alert definition with a syntax error in it. You would not want Alert Manager to stop processing the "foo" entry just because the "bar" definition had a syntax error. This means that every time you modify the configuration file you should run alert.pl with the --validate option and make sure the syntax is sane.
/*****************************
** Practical Applications **
*****************************/
As stated earlier, Alert Manager was created to be a simple wrapper around alert commands. Specifically, I wanted something that could work in conjunction with Logdog <http://caspian.dotconf.net/menu/Software/LogDog/> to verify that the alert commands Logdog was running (e.g. sendEmail, sendSNPP.pl, sendSMS.pl, and others) completed successfully. In a production environment you can't have a critical page get lost because the network was temporarily down or your pager's mailbox was full. Thus Alert Manager was born.
Since then I have added the success-alert option, which made Alert Manager immensely
useful for monitoring tasks. For example, I have a "success-chain" set up where the
flow is something like this:
Ping-server -> check-smtp -> check-imap -> check-http -> check-https -> check-sshd -> etc.
That way if a server is down I only get one page saying the server is un-pingable,
and I don't get 50 pages saying the server isn't responding on http, https, smtp, etc.
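The idea behind such a success-chain can be illustrated with a short shell sketch: each check runs only if the previous one passed, so the first failure produces exactly one alert and the remaining checks are skipped. The check functions below are stand-ins invented for this example; in Alert Manager itself the chain would be expressed through success-alert entries in the configuration file.

```shell
#!/bin/sh
# Hypothetical sketch of a success-chain, not alert.pl's actual code.

# Stand-in checks; real ones would ping the host, probe SMTP, and so on.
check_ping() { return 0; }
check_smtp() { return 1; }   # pretend SMTP is down
check_http() { return 0; }

# Run the named checks in order and stop at the first failure.
run_checks() {
    for name in "$@"; do
        if ! "check_$name"; then
            echo "ALERT: $name check failed"
            return 1
        fi
    done
    echo "all checks passed"
}
```

Here run_checks ping smtp http reports only the SMTP failure and never reaches the HTTP check.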
/***************
** Examples **
***************/
Verify the syntax of the configuration file:
alert.pl -c /etc/alertManager.conf --validate
Run the alert command "Page-Jane":
alert.pl -c /etc/alertManager.conf -a Page-Jane -m "I'm busy"
If the Page-Jane alert definition in the configuration file has a command containing the special phrase "MESSAGE" (with or without quotes), the message "I'm busy" would be substituted in place of it.
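A minimal shell sketch of this kind of placeholder substitution is shown below. The function name substitute_message and the command template are hypothetical, and the sketch simply replaces every literal occurrence of MESSAGE; how alert.pl treats surrounding quotes is not covered here.

```shell
#!/bin/sh
# Hypothetical sketch, not alert.pl's actual code.
# Replace every literal occurrence of MESSAGE in a command template
# with the given message text, using only POSIX parameter expansion.
substitute_message() {
    template=$1
    message=$2
    result=
    while :; do
        case $template in
            *MESSAGE*)
                # Keep the text before the first MESSAGE, add the message,
                # then continue with whatever follows the placeholder.
                result=$result${template%%MESSAGE*}$message
                template=${template#*MESSAGE}
                ;;
            *)
                result=$result$template
                break
                ;;
        esac
    done
    printf '%s\n' "$result"
}
```

For example, substitute_message 'echo Page: MESSAGE' "I'm busy" yields the string: echo Page: I'm busy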
Log a message to a file:
alert.pl -c /etc/alertManager.conf -a "LOG /var/log/newlog" -m "Penguins rock"
In this example the message "Penguins rock" would get logged to the file /var/log/newlog. Since LOG is a special predefined alert, you would not need to modify the configuration file for this command to work.
Pipe the message text into alertManager:
echo "Please call at 555-666-7777" | alert.pl -c /etc/alertManager.conf -a Page-Jane -m stdin
If you pass "-m stdin" to alertManager it will try reading a line from STDIN rather than using the actual phrase "stdin".
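The "-m stdin" behaviour can be sketched in shell as follows. The function name resolve_message is hypothetical, and treating only the exact lowercase word "stdin" as special is an assumption of this example.

```shell
#!/bin/sh
# Hypothetical sketch, not alert.pl's actual code.
# Return the message text: if the -m argument is the literal word
# "stdin", read one line from standard input instead.
resolve_message() {
    if [ "$1" = "stdin" ]; then
        IFS= read -r line
        printf '%s\n' "$line"
    else
        printf '%s\n' "$1"
    fi
}
```

Only the first line of input is used, matching the "a line from STDIN" wording above; any further lines are ignored by this sketch.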
More example configuration files will be included in future releases of alertManager.
