OpenVZ Forum


Strange problem - machine "zombifying" [message #37259] Sat, 29 August 2009 21:21
sylvester_0
Messages: 4
Registered: August 2009
Junior Member
Hello,
I've been using OpenVZ on a host for about 8 months and had no problems until a few days ago. We are getting into the e-commerce business and are testing Magento. The problems started once it was installed on our web server.

The container (Debian Lenny) becomes a "zombie". All services are still running (I can usually still access Apache), but the load average climbs extremely high (27+) even though there is no IOWait and no significant CPU usage. No other containers experience any problems during this period, but the load average on the host reflects what is shown inside the container. The UBC counters show no failcnt values.
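For anyone who wants to repeat the failcnt check, here is a minimal sketch of how it could be scripted. It assumes the usual /proc/user_beancounters layout (a "Version:" line, a header row, then rows of resource, held, maxheld, barrier, limit, failcnt, with the container ID only on the first row of each group). The function name `nonzero_failcnt` is made up for illustration, and reading the file normally requires root on the host.

```python
from typing import List, Tuple

UBC_PATH = "/proc/user_beancounters"  # standard OpenVZ location; needs root


def nonzero_failcnt(text: str) -> List[Tuple[str, str, int]]:
    """Return (ctid, resource, failcnt) for every row whose failcnt is > 0."""
    hits = []
    ctid = None
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the "Version:" line and the column header row.
        if not parts or parts[0] in ("Version:", "uid"):
            continue
        # The first row of each container starts with "<ctid>:".
        if parts[0].endswith(":"):
            ctid = parts[0].rstrip(":")
            parts = parts[1:]
        # Expect: resource held maxheld barrier limit failcnt
        if ctid is None or len(parts) != 6:
            continue
        try:
            failcnt = int(parts[5])
        except ValueError:
            continue
        if failcnt > 0:
            hits.append((ctid, parts[0], failcnt))
    return hits


if __name__ == "__main__":
    try:
        with open(UBC_PATH) as f:
            for ctid, resource, failcnt in nonzero_failcnt(f.read()):
                print(f"CT {ctid}: {resource} failcnt={failcnt}")
    except OSError as exc:
        print(f"cannot read {UBC_PATH}: {exc}")
```

An empty result matches what I see: no resource limit has been hit, so the beancounters do not look like the cause.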

When this happens I can no longer use MySQL. Trying to load a PHP site results in "Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock'". So I log into the machine with vzctl or ssh and try to stop Apache using the standard init.d scripts. This fails immediately. I try to kill -9 the PID, and nothing happens. The same thing happens if I try to stop any other service (bind, apache, etc.). The processes are stuck in "limbo", if you will: kind of running, but not really. I've never seen this behavior before, so I'm at a loss.
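As far as I understand it, the Linux load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, and a task in that state does not act on signals, SIGKILL included, until it leaves it. That would fit both the high load with an idle CPU and kill -9 doing nothing. Below is a rough sketch for listing such processes by reading /proc/<pid>/stat directly. The helper names `parse_stat` and `uninterruptible` are my own, and it only reports the state, not what the tasks are blocked on.

```python
import os
from typing import List, Tuple


def parse_stat(text: str) -> Tuple[int, str, str]:
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ")",
    # so split on the LAST ")" instead of on whitespace.
    left, _, rest = text.rpartition(")")
    pid_str, _, comm = left.partition("(")
    return int(pid_str), comm, rest.split()[0]


def uninterruptible(proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """Return (pid, comm) for every process currently in state D."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no procfs available here
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible():
        print(f"{pid}\t{comm}\tD (uninterruptible sleep)")
```

Run inside the container (or on the host) while it is in the broken state; the top output linked below shows the same D-state processes.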

To fix the container, I am forced to stop every other container on the host, run a reboot on the host, then kill the "vz stop" command on the host (because the processes within the container refuse to die).

The Apache logs don't indicate an attack, and neither does netstat. I'm turning on detailed MySQL logging and setting my Zabbix machine to keep a close eye on the container.

Here's some of my config:

uname -a
Linux host.example.com 2.6.24-24-openvz #1 SMP Tue Aug 18 18:49:39 UTC 2009 x86_64 GNU/Linux


free
             total       used       free     shared    buffers     cached
Mem:       3911788    2527048    1384740          0     132020     601128
-/+ buffers/cache:    1793900    2117888
Swap:      2803256          0    2803256


UBC (when everything is running OK)
http://pastebin.com/fe5f4ffb

UBC (when container is broken)
http://pastebin.com/f70b2ff8f

Top (inside container when broken)
http://pastebin.com/f4cb9c76c

Top (another one showing processes with D state)
http://pastebin.com/f58879f4d

Netstat (when broken - note the Mysql connecting entries)
http://pastebin.com/f4693fdc2

Strace of me trying to kill mysql
http://pastebin.com/f4b5813af


I greatly appreciate any help in this matter as I'm pretty lost at the moment. I hope this isn't information overload - I tried to be as detailed as possible. If there's anything else you need to know please tell me!

Thanks :)