Hi Christian,
I didn't see your post on openvz-users (I only got the personally
addressed copy). I'll address this to openvz-users. Thanks for your
reply and ideas also.
> Try: arp -vn
>
> It should show a line with your VE IP address, like this:
>
> 10.10.150.244 * <from_interface> MP eth0
>
> If you don't have such a line, the host node will not respond to ARP
> queries (except if you turn on proxy_arp, but I'd advise against
> this).
Unfortunately things still seem wrong. To illustrate, here's what I
get for vzlist:
[root@sonata ~]# vzlist
      VEID      NPROC STATUS  IP_ADDR         HOSTNAME
        10         68 running 192.168.0.10    trixbox.cgb1911.mine.nu
       105        124 running 192.168.0.105   zimbra-ose.cgb1911.mine.nu
       106         24 running 192.168.0.106   otrs.cgb1911.mine.nu
       107         26 running 192.168.0.107   ipplan.cgb1911.mine.nu
       108          6 running 192.168.0.108   freeside.cgb1911.mine.nu
       109          8 running 192.168.0.109   enomalism.cgb1911.mine.nu
       110         65 running 192.168.0.110   virtualmin.cgb1911.mine.nu
       111         77 running 192.168.0.111   ispconfig.cgb1911.mine.nu
       112         33 running 192.168.0.112   splunk.cgb1911.mine.nu
       115         77 running 192.168.0.115   zimbra4.cgb1911.mine.nu
       116         28 running 192.168.0.116   cacti-test.cgb1911.mine.nu
    100001          1 running -               -
Now, you'd expect a published (MP) entry in the ARP table for each of
those VE IPs. Unfortunately, only the entries for VEs 105 and 10
exist (those are the ones I've had to restart to 'make work').
[root@sonata ~]# arp -vn
Address          HWtype  HWaddress           Flags Mask  Iface
192.168.0.221    ether   00:11:93:98:67:45   C           eth0
192.168.0.160    ether   00:07:E9:5F:BA:60   C           eth0
192.168.0.253    ether   00:1B:2B:2C:C3:4D   C           eth0
192.168.0.43     ether   00:19:D1:69:FD:2E   C           eth0
192.168.0.105    *       *                   MP          eth0
192.168.0.10     *       *                   MP          eth0
Entries: 6 Skipped: 0 Found: 6
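To make that comparison without eyeballing the two listings, here is a
rough Python sketch. It is not part of vzctl; the function names are my
own, and it assumes the column order shown in the vzlist and arp output
above (published entries carry a P in the flags column). It only parses
text that you paste or pipe into it:

```python
import re

IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def ve_ips(vzlist_output):
    """Return the set of IPs assigned to running VEs in `vzlist` output."""
    ips = set()
    for line in vzlist_output.splitlines():
        fields = line.split()
        # Data rows are assumed to look like: VEID NPROC STATUS IP_ADDR HOSTNAME
        if len(fields) >= 4 and fields[2] == "running" and IPV4.match(fields[3]):
            ips.add(fields[3])
    return ips


def published_ips(arp_output):
    """Return the set of IPs that have a published entry in `arp -vn` output."""
    ips = set()
    for line in arp_output.splitlines():
        fields = line.split()
        # Rows are assumed to look like: Address HWtype HWaddress Flags Iface
        # e.g. "192.168.0.105 * * MP eth0", so the flags are the fourth field.
        if len(fields) >= 5 and IPV4.match(fields[0]) and "P" in fields[3]:
            ips.add(fields[0])
    return ips


def missing_arp_entries(vzlist_output, arp_output):
    """Return VE IPs with no published ARP entry, in numeric order."""
    missing = ve_ips(vzlist_output) - published_ips(arp_output)
    return sorted(missing, key=lambda ip: tuple(int(p) for p in ip.split(".")))
```

Fed the two listings above, missing_arp_entries() should list every VE
IP except 192.168.0.10 and 192.168.0.105.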
> The VE start script sets this up (so a restart will fix it), but
> I've also seen cases where these entries get lost after some time.
> Haven't been able to produce a test case, though.
I haven't yet determined what triggers the problem. Some further
digging has revealed that a script:
/usr/share/vzctl/scripts/vpsnetclean
runs every 5 minutes from cron. It has all the characteristics of a
program that would be breaking ARP entries for my running VEs. I
just need to work out whether, and if so why, it decides a VE is
'stopped' and calls clear_ve_net to tidy up its IP/ARP entries. I'll
reply if I find anything more.
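One way I could test the cron theory is to record the set of published
entries every minute or so and check whether an entry vanishes across a
cron run. A rough Python sketch of that check follows. The function
names are hypothetical, and it assumes the cron job fires on wall-clock
minutes divisible by 5, which I haven't confirmed:

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)


def lost_entries(snapshots):
    """Given (timestamp, set_of_published_ips) pairs in time order, return
    (ip, last_seen, first_missing) for each IP that vanished between two
    consecutive snapshots."""
    losses = []
    for (t_prev, ips_prev), (t_next, ips_next) in zip(snapshots, snapshots[1:]):
        for ip in sorted(ips_prev - ips_next):
            losses.append((ip, t_prev, t_next))
    return losses


def _tick_index(ts, period_minutes):
    # Number of whole periods elapsed on the wall clock (naive datetimes).
    return int((ts - EPOCH).total_seconds()) // (period_minutes * 60)


def spans_cron_tick(last_seen, first_missing, period_minutes=5):
    """True if a job firing every `period_minutes` (assumed to be aligned to
    minute 0, 5, 10, ...) would have run after `last_seen` and no later
    than `first_missing`."""
    return (_tick_index(first_missing, period_minutes)
            > _tick_index(last_seen, period_minutes))
```

If every loss reported by lost_entries() also satisfies
spans_cron_tick(), that would be consistent with vpsnetclean being
responsible, though it wouldn't prove it.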
> Any chance that you are using a firewall on the host node which
> fiddles with the ARP stuff? (shorewall?)
Nope, not using anything like that. Just a standard CentOS 5 base
system with the OpenVZ kernel and management tools.
Regards,
Chris Bennett (cgb)