Monitoring Virtuals with Nagios [message #35178]
Fri, 06 March 2009 20:51
Geek42
Messages: 2 Registered: March 2009 Location: Ottawa, Ontario,Canada
I currently use Nagios to monitor the (currently) 32 virtuals on my server. Doing things the way I was supposed to would mean running the nrpe server on every single one of them. I'm trying to keep things very controlled on the virtuals, so I want to run the bare minimum needed on each one.
I came up with the following solution, and was wondering if anyone who knows this better could suggest improvements or replacements. Of course, you are all free to use it if you think it might be useful.
Anyway, here is what I have:
1 Host Node: Server01
32 Virtuals: various names, descriptive of what they do, e.g. Nagios01 for my primary Nagios service.
My networking uses bridged virtual ethernet devices, mostly because they worked for me, but that should not affect this way of checking.
Nagios, running on a virtual, calls nrpe running on Server01. Checks of Server01 itself are ordinary local nrpe commands and work fine. To check one of the virtuals, Nagios instead has nrpe on Server01 run a special script:
#!/bin/bash
# check_openvz_vm: run a Nagios plugin inside an OpenVZ VE via "vzctl exec2".
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin
PROGNAME=`basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
REVISION=`echo '$Revision: 0.1 $' | sed -e 's/[^0-9.]//g'`
# utils.sh ships with the Nagios plugins (print_revision, support)
. $PROGPATH/utils.sh
VE_PATH=/usr/lib/nagios/plugins/
VZ=/usr/sbin/vzctl
SUDO=/usr/bin/sudo

print_usage() {
    echo "Usage: $PROGNAME -H <VE ip address> -c <plugin> [plugin arguments]"
}

print_help() {
    print_revision $PROGNAME $REVISION
    echo ""
    print_usage
    echo ""
    echo "This plugin runs a plugin on an OpenVZ VM hosted on this system."
    echo ""
    support
    exit 0
}

# Derive the VEID from the third and fourth octets of the VE's IP address.
function get_veid {
    # A=`echo $1 | sed -n 's/^\([0-9]*\)\..*/\1/p'`
    # B=`echo $1 | sed -n 's/^[0-9]*\.\([0-9]*\)\..*/\1/p'`
    C=`echo $1 | sed -n 's/^[0-9]*\.[0-9]*\.\([0-9]*\)\..*/\1/p'`
    D=`echo $1 | sed -n 's/^[0-9]*\.[0-9]*\.[0-9]*\.\([0-9]*\)$/\1/p'`
    # Bail out early if the argument did not look like an IP address
    if [ -z "$C" ] || [ -z "$D" ]; then
        echo "Unknown: $1 is not a valid IP address"
        exit 3
    fi
    if [ $D -eq 250 ]; then
        C=$(( $C + 1 ))
        D=0
    fi
    C=$(( $C - 192 ))
    ID=$(( $D + $(( $C * 250 )) ))
}

case "$1" in
    --help | -h)
        print_help
        exit 0
        ;;
    --version | -V)
        print_revision $PROGNAME $REVISION
        exit 0
        ;;
    -t)
        echo OK $0 $1
        exit 0
        ;;
    *)
        if test "$1" = "-H"; then
            get_veid "$2"
        else
            echo "Unknown: $1 is first parameter, should be -H"
            exit 3
        fi
        if test "$3" = "-c"; then
            CMD="$VE_PATH$4"
            # At most five plugin arguments are passed through
            ARGS="$5 $6 $7 $8 $9"
        else
            echo "Unknown: $3 is third parameter, should be -c"
            exit 3
        fi
        # vzctl exec2 $ID passes back the exit code of the command it ran
        DATA=`$SUDO $VZ exec2 $ID $CMD $ARGS 2>&1`
        STATUS=$?
        echo $DATA
        exit $STATUS
        ;;
esac
In the above code, get_veid derives the VEID from the IP address that is passed in. This works because my creation script uses a formula to assign each VE's IP address from its VEID, so the mapping can simply be reversed.
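For anyone who wants to check the arithmetic, here is a rough standalone sketch of the same mapping done with shell arithmetic instead of sed. ip_to_veid is a name made up for this example; the constants 192 and 250 are taken from get_veid, and the sample address is the one from the host definition in this post. As a side note, the D = 250 special case in get_veid does not change the result, since (C + 1 - 192) * 250 + 0 equals (C - 192) * 250 + 250, so it is left out here.

```shell
#!/bin/bash
# Sketch only: ip_to_veid is a hypothetical helper, not part of the script above.
ip_to_veid() {
    local a b c d
    # Split the dotted quad into its four octets.
    IFS=. read -r a b c d <<< "$1"
    # Reject missing octets and anything containing a non-digit.
    [ -n "$a" ] && [ -n "$b" ] && [ -n "$c" ] && [ -n "$d" ] || return 1
    case "$a$b$c$d" in
        *[!0-9]*) return 1 ;;
    esac
    # 10# forces base 10 so an octet like "08" is not read as octal.
    echo $(( (10#$c - 192) * 250 + 10#$d ))
}

ip_to_veid 10.51.196.1    # prints 1001
```

Here 196 - 192 = 4, and 4 * 250 + 1 = 1001, which matches the "1001 Gateway" host.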
Beyond that I have the following line in the /etc/sudoers file on Server01:
nagios ALL=NOPASSWD:/usr/sbin/vzctl
This is mildly insecure, but it allows the nagios user to run the vzctl command. Because of this I have set up nrpe on Server01 to accept connections only from my Nagios server and no others; otherwise anyone able to reach nrpe could run commands inside the virtuals. This is the main problem I have with this script, and I would love to know a way to avoid it.
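One small thing that might narrow the exposure a little, offered as an untested idea rather than a fix: the wrapper could refuse any plugin name that is not a plain file name, so that -c cannot point outside VE_PATH. is_safe_plugin_name is a hypothetical helper made up for this sketch, and the allowed character set is my assumption about what plugin names look like.

```shell
#!/bin/bash
# Hypothetical helper: accept only plain plugin file names such as check_procs.
is_safe_plugin_name() {
    case "$1" in
        # Reject empty names, names with characters outside the allowed set
        # (this includes "/"), and names starting with a dot.
        ''|*[!A-Za-z0-9_.-]*|.*) return 1 ;;
        *) return 0 ;;
    esac
}

if is_safe_plugin_name "check_procs"; then echo "check_procs: accepted"; fi
if ! is_safe_plugin_name "../../bin/sh"; then echo "../../bin/sh: rejected"; fi
```

This does not remove the need for the sudo rule; it only limits what the wrapper will ask vzctl to run.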
Anyway, what this does is use vzctl exec2 with the VEID to execute the plugins I've installed in each VE (without any nrpe server) and then pass the results back to Nagios.
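Nagios reads plugin exit codes 0, 1, 2 and 3 as OK, WARNING, CRITICAL and UNKNOWN. Since the script hands back whatever exec2 returned, a failure of vzctl itself (for example because the VE is not running) could produce some other value. A possible guard, sketched with a hypothetical normalize_status helper that is not in the script above:

```shell
#!/bin/bash
# Hypothetical helper: keep Nagios status codes 0-3, map everything else to UNKNOWN (3).
normalize_status() {
    case "$1" in
        0|1|2|3) echo "$1" ;;
        *)       echo 3 ;;
    esac
}

normalize_status 2      # prints 2 (CRITICAL passes through unchanged)
normalize_status 127    # prints 3 (anything else becomes UNKNOWN)
```

In the script this would amount to something like exit `normalize_status $STATUS` instead of exit $STATUS, assuming that treating unexpected codes as UNKNOWN is what you want.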
In my nrpe_local.cfg on Server01 I have:
command[check_openvz_vm]=/usr/lib/nagios/plugins/check_openvz_vm $ARG1$ $ARG2$ $ARG3$
Finally, on the Nagios virtual I have these settings (they can be in one file or several; Nagios usually does not care):
define command{
    command_name    check_openvz_vm
    command_line    /usr/lib/nagios/plugins/check_nrpe -H ip.of.server.01 -c check_openvz_vm -a "$ARG1$"
}
define service{
    use                     openvz-service
    hostgroup_name          virtual-openvz
    service_description     OpenVZ VE Process Count
    check_command           check_openvz_vm!-H $HOSTADDRESS$ -c check_procs -w 50 -c 75
    normal_check_interval   15
}
define hostgroup{
    hostgroup_name  virtual-openvz
    alias           OpenVZ Virtual Servers
}
define host{
    use         linux-server
    hostgroups  virtual-openvz
    host_name   1001 Gateway
    alias       Gateway.domain.name
    address     10.51.196.1
    parents     Server01
}
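To make the argument flow easier to follow, here is a small sketch of how I understand the string from the service definition ends up as positional parameters in the script on Server01. Everything after the "!" in check_command becomes $ARG1$, check_nrpe sends it as one argument, and the shell that runs the nrpe command line splits it into words. The sketch just imitates that splitting locally; it does not talk to nrpe.

```shell
#!/bin/bash
# The single string sent with -a, with $HOSTADDRESS$ already expanded
# to the address of the example host above.
ARG1="-H 10.51.196.1 -c check_procs -w 50 -c 75"

# Word-split it on whitespace, as happens when the nrpe command line is run.
set -- $ARG1

echo "VE address : $2"                # 10.51.196.1
echo "plugin     : $4"                # check_procs
echo "plugin args: $5 $6 $7 $8 $9"    # -w 50 -c 75 (here $9 is empty)
```

Since the script only forwards $5 through $9, a check can carry at most five plugin arguments.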
The one real improvement I can think of is to have this call a command on the VE that works like nrpe but without running constantly.
If people think this is good, after any improvements of course, I'll throw this up on the wiki; otherwise I'll just keep it to myself and be quite happy with it.
Thanks
JC