LVSmon v0.0.2 - A cluster monitoring daemon
Copyright (c) 2002 Gianni Tedesco <gianni@ecsc.co.uk>
This software is released under the GNU GPL version 2 or later
INTRODUCTION
LVSmon is a cluster monitoring daemon, originally written to replace tools like ldirectord and mon for maintaining LVS tables.
GETTING LVSMON
The lvsmon website is at:
http://www.scaramanga.co.uk/lvsmon/
To get a specific version of lvsmon you can download from:
http://www.scaramanga.co.uk/lvsmon/vX.Y.Z/lvsmon-X.Y.Z.tar.gz
USAGE
The following command will give you correct information on lvsmon usage (I cannot always promise to keep these docs up to date).
lvsmon --help
The usual invocation will go something like this:
lvsmon --file /etc/lvsmon.conf --interval 10
This tells lvsmon to use the file /etc/lvsmon.conf as the config file and to check all services once every ten seconds.
KNOWN BUGS (FEATURES)
The failure count stops counting after 4 billion failures, because if it carried on the variable would wrap. There isn't much that can be done about this. With default settings this problem will occur after 1268 years of operation.
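The behaviour described above is a saturating counter. What follows is only a minimal sketch of the idea, not the actual lvsmon code: the struct and function names are made up for illustration, and it assumes unsigned int is 32 bits wide, as on the usual 32-bit platform. The 1268 year figure is consistent with one failed check every ten seconds (4 billion x 10 seconds is roughly 1268 years of 365 days), though that reading of the default settings is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical per-service record; the field name is an assumption. */
struct service {
    unsigned int failures;  /* assumed 32 bits wide */
};

/* Count one more failure, but stick at the maximum value
 * instead of wrapping back round to zero. */
static void count_failure(struct service *s)
{
    assert(s != NULL);
    if (s->failures < UINT_MAX)
        s->failures++;
}
```

Calling count_failure() once the counter has reached UINT_MAX leaves it unchanged, so a long-dead service never suddenly looks like it has zero failures.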
WHY USE LVSMON?
It's secure and reliable. It will never run away. It will never give you the wrong output. It will not allow people to take over your system. I would put money on it.
Performance. It uses very little CPU time, has a small footprint and scales better than most even considering the fact that it runs in a single thread.
FEATURES
Edge triggered status notification
Runs in a single non-blocking thread
Does no dynamic allocation after entering main loop
Uses very little memory (about 60 bytes per service on a 32-bit platform)
Scales reasonably well (poll driven)
Provably correct design, LVSmon will never run away with resources
Pauses 500ms before exiting with errors, stops runaway interactions
Portable (ANSI C/POSIX)
Timeouts enforced on connection and receive events
Reports the nature of failures (e.g. connection refused, 404 error, etc.)
HTTP plugin, checks response according to RFC 1945
Banner checker plugin - will work for SSH, SMTP, POP3, etc...
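To illustrate what "edge triggered status notification" means in the list above: a notification is sent only when a service changes status, not on every check. The sketch below is an assumption about how that could be done, not the real lvsmon implementation; the enum, struct and function names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status values for a monitored service. */
enum status { STATUS_UNKNOWN, STATUS_UP, STATUS_DOWN };

/* Hypothetical per-service state: the status we last reported. */
struct service_state {
    enum status last;
};

/* Return 1 if the status differs from the one last reported
 * (an "edge"), and remember the new status. Return 0 if nothing
 * changed, in which case no notification is needed. */
static int status_changed(struct service_state *st, enum status now)
{
    assert(st != NULL);
    if (st->last == now)
        return 0;
    st->last = now;
    return 1;
}
```

With this approach a service that stays down for an hour produces one notification when it goes down and one when it comes back, rather than one per check interval.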
PHASES
What follows is a brief description of each phase in a monitor's life cycle. A monitor is essentially a state machine.
PHASE_CONNECT
Create a socket if necessary and initiate a connection.
Poll for writing.
PHASE_CONRESULT
getsockopt() to see if the connection succeeded. If it
did, the socket is now writable so go to query.
PHASE_QUERY
Send a query. This usually won't block because the last
poll() told us the socket is writable. Now poll for input.
PHASE_RECV
Poll tells us we have input so a recv() will not block.
Do the receive and get out of there. If the plugin requires
it, then check the results, otherwise the fact that poll
for readability succeeds is indication enough of success.
PHASE_WAIT
Check results of the last cycle and act accordingly.
Wait until we are needed again, then go to PHASE_CONNECT.
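The phases above can be sketched as a small transition function. This is a rough illustration only and leaves out all the socket work: the phase names come from the list above, but the event type, the struct layout, the function name and the rule that any error or timeout sends the monitor straight to PHASE_WAIT are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

enum phase {
    PHASE_CONNECT,
    PHASE_CONRESULT,
    PHASE_QUERY,
    PHASE_RECV,
    PHASE_WAIT
};

/* Hypothetical outcome of the poll()/syscall for the current phase. */
enum event { EV_OK, EV_FAIL };

/* Hypothetical monitor record. */
struct monitor {
    enum phase phase;
    int cycle_ok;   /* result of the last cycle, looked at in PHASE_WAIT */
};

/* Advance the monitor by one step and return the new phase. */
static enum phase monitor_step(struct monitor *m, enum event ev)
{
    assert(m != NULL);

    if (m->phase == PHASE_WAIT) {
        /* The wait is over: start a new cycle. */
        m->phase = PHASE_CONNECT;
        return m->phase;
    }
    if (ev != EV_OK) {
        /* Assumption: any failure ends the cycle early. */
        m->cycle_ok = 0;
        m->phase = PHASE_WAIT;
        return m->phase;
    }
    switch (m->phase) {
    case PHASE_CONNECT:   m->phase = PHASE_CONRESULT; break;
    case PHASE_CONRESULT: m->phase = PHASE_QUERY;     break;
    case PHASE_QUERY:     m->phase = PHASE_RECV;      break;
    case PHASE_RECV:
        /* Input arrived (and passed any plugin check). */
        m->cycle_ok = 1;
        m->phase = PHASE_WAIT;
        break;
    default:
        break;
    }
    return m->phase;
}
```

Because each step only looks at the current phase and one event, a single thread can drive many monitors from one poll() loop without any of them blocking the others.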
TODO
- Features
  - Better event reporting mechanism
  - Allow failure timeouts (eg: 3 strikes and you're out)
  - Allow failure timeouts (eg: no success for 5 minutes)
  - Support plugins that don't use TCP
  - Make timing more accurate
  - Support complex transactions
  - Make plugins plugins :)
  - Use getopt in plugins for option passing
  - http: Support full URLs
  - Drop root privileges and chroot if started as root
