OpenVZ Forum


Monitoring Virtuals with Nagios [message #35178] Fri, 06 March 2009 20:51
Geek42 (Ottawa, Ontario, Canada)
I currently use Nagios to monitor the (currently) 32 virtuals on my server. I found that if I did things the way I was supposed to, I would need to run the nrpe server on every single one of them. I'm trying to keep things very controlled on the virtuals, so I want to run the minimum possible on each one.

I came up with the following solution, and was wondering if anyone with better knowledge of how to do things could suggest some improvements/replacements... Of course you are also all free to use this if you think it might be useful...

Anyway, here is what I have:

1 Host Node: Server01
32 Virtuals: various names, descriptive of what they do, e.g. Nagios01 for my primary Nagios service.

My networking uses bridged virtual ethernet devices, mostly because they worked for me, but that should not affect this way of checking.

Nagios, running on a virtual, calls to nrpe running on Server01. If I'm just checking the local commands, no problem, they check fine. If I want to check one of the virtuals, it runs a special script via nrpe running on Server01:
#!/bin/bash
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin

PROGNAME=`basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
REVISION=`echo '$Revision: 0.1 $' | sed -e 's/[^0-9.]//g'`

. $PROGPATH/utils.sh

VE_PATH=/usr/lib/nagios/plugins/
VZ=/usr/sbin/vzctl
SUDO=/usr/bin/sudo

print_usage() {
        echo "Usage: $PROGNAME -H <VE IP address> -c <plugin> [plugin args]"
}

print_help() {
        print_revision $PROGNAME $REVISION
        echo ""
        print_usage
        echo ""
        echo "This plugin runs a plugin on an OpenVZ VM hosted on this system."
        echo ""
        support
        exit 0
}
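# Derive the veid from the VE's IP address and store it in ID.
# Only the third (C) and fourth (D) octets are used; the third octet starts at 192.
function get_veid {
#       A=`echo $1 | sed -n 's/^\([0-9]*\)\..*/\1/p'`
#       B=`echo $1 | sed -n 's/^[0-9]*\.\([0-9]*\)\..*/\1/p'`
        C=`echo $1 | sed -n 's/^[0-9]*\.[0-9]*\.\([0-9]*\)\..*/\1/p'`
        D=`echo $1 | sed -n 's/^[0-9]*\.[0-9]*\.[0-9]*\.\([0-9]*\)$/\1/p'`

        # Report UNKNOWN if the address did not parse as a dotted quad.
        if [ -z "$C" ] || [ -z "$D" ]; then
                echo "Unknown: cannot derive veid from '$1'"
                exit 3
        fi

        # A last octet of 250 counts as slot 0 of the next block of 250.
        if [ "$D" -eq 250 ]; then
                C=$(( C + 1 ))
                D=0
        fi

        # e.g. 10.51.196.1 -> (196 - 192) * 250 + 1 = 1001
        C=$(( C - 192 ))
        ID=$(( D + C * 250 ))
}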

case "$1" in
        --help)
                print_help
                exit 0
                ;;
        -h)
                print_help
                exit 0
                ;;
        --version)
                print_revision $PROGNAME $REVISION
                exit 0
                ;;
        -V)
                print_revision $PROGNAME $REVISION
                exit 0
                ;;
        -t)
                echo OK $0 $1
                exit 0
                ;;
        *)
                if test "$1" = "-H"; then
                        get_veid $2
                else
                        echo "Unknown: $1 is first parameter, should be -H"
                        exit 3
                fi
                if test "$3" = "-c"; then
                        CMD="$VE_PATH$4"
                        ARGS="$5 $6 $7 $8 $9"
                else
                        echo "Unknown: $3 is third parameter, should be -c"
                        exit 3
                fi
                # vzctl exec2 returns the exit code of the command run inside the VE
                DATA=`$SUDO $VZ exec2 $ID $CMD $ARGS 2>&1`
                STATUS=$?
                echo $DATA
                exit $STATUS
                ;;
esac
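
One limitation of the script above is that ARGS is built from $5 through $9, so a plugin can receive at most five arguments. A minimal sketch of a way around that, using shift and "$@" instead of fixed positions, is below. The helper name plugin_argv is hypothetical and it only prints what would be run; it is not part of the original script.

```shell
#!/bin/sh
VE_PATH=/usr/lib/nagios/plugins/

# plugin_argv (hypothetical helper): checks the "-H <ip> -c <plugin>" prefix,
# then prints the plugin path followed by ALL remaining arguments, one per line.
plugin_argv() {
    if [ "$#" -lt 4 ] || [ "$1" != "-H" ] || [ "$3" != "-c" ]; then
        echo "Unknown: expected -H <ip> -c <plugin> [plugin args]"
        return 3
    fi
    plugin=$4
    shift 4     # drop -H <ip> -c <plugin>; "$@" now holds only the plugin's own arguments
    printf '%s\n' "$VE_PATH$plugin" "$@"
}

# Example: seven plugin arguments, more than the $5..$9 approach can carry.
plugin_argv -H 10.51.196.1 -c check_procs -w 50 -c 75 -s Z -u nagios
```

In the real script the same idea would mean handing "$@" to vzctl exec2 after the shift, rather than the fixed ARGS string.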

In the above code, get_veid derives the veid from the IP address that is passed in. This works because my creation script uses a formula to assign each virtual's IP address from its veid, so the mapping can be reversed.
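
As a worked example of that mapping, here is a small standalone sketch of the same arithmetic as get_veid. The function name ip_to_veid is hypothetical. It assumes, as the script does, that the third octet starts at 192 and that each third-octet value covers a block of 250 veids.

```shell
#!/bin/sh
# ip_to_veid (hypothetical name): same arithmetic as get_veid in the script,
# written as a function that prints the veid instead of setting ID.
ip_to_veid() {
    # Reject anything that is not digits and dots.
    case "$1" in
        ''|*[!0-9.]*) return 1 ;;
    esac
    # Split the dotted quad into four positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    c=$3
    d=$4
    # A last octet of 250 counts as slot 0 of the next block.
    if [ "$d" -eq 250 ]; then
        c=$((c + 1))
        d=0
    fi
    echo $(( d + (c - 192) * 250 ))
}

ip_to_veid 10.51.196.1    # prints 1001: (196 - 192) * 250 + 1
```

This matches the host definition further down, where the virtual at 10.51.196.1 is named "1001 Gateway".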

Beyond that I have the following line in the /etc/sudoers file on Server01:
nagios  ALL=NOPASSWD:/usr/sbin/vzctl

This is mildly insecure, since it allows the nagios user to run any vzctl command as root without a password. Because of this I have set up nrpe on Server01 to only allow connections from my Nagios server and no others; otherwise you can see the problems this could cause. This is the main problem I have with this script, and I would love to know a way to avoid it.
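
One possible way to narrow this, sketched below under assumptions, is to give sudo rights to a small root-owned wrapper instead of to vzctl itself, and have the wrapper accept only a numeric veid, a plain plugin name, and conservative arguments. The wrapper name vz_run_plugin, its install path and the allowed character set are all hypothetical choices, not something from the original setup.

```shell
#!/bin/sh
# Hypothetical sudoers line for the wrapper (replacing the vzctl entry):
#   nagios ALL=(root) NOPASSWD: /usr/local/sbin/vz_run_plugin

# A veid must be all digits.
valid_veid() {
    case "$1" in ''|*[!0-9]*) return 1 ;; *) return 0 ;; esac
}

# A plugin name must be a bare file name: no slashes, no leading dot.
valid_plugin() {
    case "$1" in ''|.*|*[!A-Za-z0-9_.-]*) return 1 ;; *) return 0 ;; esac
}

# Plugin arguments are limited to a conservative character set (an assumption).
valid_arg() {
    case "$1" in *[!A-Za-z0-9_.,:/=@%+-]*) return 1 ;; *) return 0 ;; esac
}

# vz_run_plugin VEID PLUGIN [ARGS...]
# VZCTL can be overridden (e.g. VZCTL=echo) for a dry run.
vz_run_plugin() {
    valid_veid "$1"   || { echo "UNKNOWN: bad veid"; return 3; }
    valid_plugin "$2" || { echo "UNKNOWN: bad plugin name"; return 3; }
    veid=$1
    plugin=$2
    shift 2
    for a in "$@"; do
        valid_arg "$a" || { echo "UNKNOWN: bad argument"; return 3; }
    done
    ${VZCTL:-/usr/sbin/vzctl} exec2 "$veid" "/usr/lib/nagios/plugins/$plugin" "$@"
}
```

With this in place the nagios user could only trigger plugins under /usr/lib/nagios/plugins/ inside a VE, not stop or destroy containers. Restricting nrpe to the Nagios server would still be worth keeping.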

Anyway, what this does is use vzctl exec2 <veid> to execute the plugins I've installed in each VE (without any nrpe server) and then pass the results back to Nagios.
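
Nagios treats plugin exit codes 0, 1, 2 and 3 as OK, WARNING, CRITICAL and UNKNOWN. Since vzctl itself can fail for its own reasons (for example if the container is not running), the status passed back may fall outside that range. A minimal sketch of clamping it is below; the helper name nagios_status is hypothetical.

```shell
#!/bin/sh
# nagios_status (hypothetical helper): print the status unchanged if it is a
# valid Nagios plugin code (0-3), otherwise print 3 (UNKNOWN).
nagios_status() {
    case "$1" in
        0|1|2|3) echo "$1" ;;
        *)       echo 3 ;;
    esac
}

# In the script this could replace the plain "exit $STATUS":
#   STATUS=$?
#   exit "$(nagios_status "$STATUS")"
nagios_status 2     # prints 2 (CRITICAL is passed through)
nagios_status 31    # prints 3 (anything else becomes UNKNOWN)
```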

In my nrpe_local.cfg on Server01 I have:
command[check_openvz_vm]=/usr/lib/nagios/plugins/check_openvz_vm $ARG1$ $ARG2$ $ARG3$


Finally, on the Nagios virtual I have these settings (they can be in one file or several; Nagios usually does not care):
define command{
        command_name    check_openvz_vm
        command_line    /usr/lib/nagios/plugins/check_nrpe -H ip.of.server.01 -c check_openvz_vm -a "$ARG1$"
}

define service{
        use                             openvz-service
        hostgroup_name                  virtual-openvz
        service_description             OpenVZ VE Process Count
        check_command                   check_openvz_vm!-H $HOSTADDRESS$ -c check_procs -w 50 -c 75
        normal_check_interval           15
        }

define hostgroup{
        hostgroup_name  virtual-openvz
        alias           OpenVZ Virtual Servers
        }

define host{
        use                     linux-server
        hostgroups              virtual-openvz
        host_name               1001 Gateway
        alias                   Gateway.domain.name
        address                 10.51.196.1
        parents                 Server01
}


The one real improvement I can think of is to have this call a command on the VE that works similarly to nrpe, without running constantly.
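
A rough sketch of what such a command might look like: a small script inside each VE that, like nrpe's command[...] definitions, maps a check name to a fixed command line and runs it on demand. Everything here is hypothetical, including the name ve_check, the definitions file path and its name=command format.

```shell
#!/bin/sh
# ve_check NAME (hypothetical): look up NAME in a definitions file and run
# the matching command line, returning its exit status.
# Definitions file format (an assumption), one check per line:
#   check_procs=/usr/lib/nagios/plugins/check_procs -w 50 -c 75
ve_check() {
    name=$1
    defs_file=${VE_CHECK_DEFS:-/etc/nagios/ve_checks.cfg}
    if [ ! -r "$defs_file" ]; then
        echo "UNKNOWN: cannot read $defs_file"
        return 3
    fi
    # key is everything before the first "=", value is the rest of the line.
    while IFS='=' read -r key value; do
        if [ "$key" = "$name" ]; then
            sh -c "$value"
            return $?
        fi
    done < "$defs_file"
    echo "UNKNOWN: no check named $name"
    return 3
}
```

The host node would then only ever pass a check name into the VE (via vzctl exec2 as before), and the thresholds would live in the VE's own definitions file, so nothing has to run constantly inside the VE.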

If people think this is good, after any improvements of course, I'll put it up on the wiki; otherwise I'll just keep it to myself and be quite happy with it.

Thanks

JC
 