Strange problem - machine "zombifying" [message #37259]
Sat, 29 August 2009 21:21
sylvester_0
Hello,
I've been using OpenVZ on a host for a while (8 months now) and had no problems until a few days ago. We are starting to get into the e-commerce business and are testing out Magento. The problems started once Magento was installed on our web server.
The container (Debian Lenny) becomes a "zombie". All services are still running (I can usually still reach Apache), but the load average climbs extremely high (27+) with no IOWait, CPU usage, etc. to account for it. No other containers experience any problems during this period, but the load average on the host mirrors what is shown inside the container. UBC shows no failcnts.
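For anyone who wants to repeat the UBC check without reading the table by eye, here is a minimal sketch that scans beancounter output for non-zero failcnt values. It assumes the usual /proc/user_beancounters layout (uid, resource, held, maxheld, barrier, limit, failcnt). The function name `find_failcnts` and the numbers in `SAMPLE` are made up for illustration and are not taken from the pastebins in this post.

```python
def find_failcnts(text):
    """Return (uid, resource, failcnt) for every row whose failcnt is > 0."""
    hits = []
    uid = None
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the "Version:" line and the column header.
        if not fields or fields[0] in ("Version:", "uid"):
            continue
        # The first row of each container starts with "<uid>:".
        if fields[0].endswith(":"):
            uid = fields[0].rstrip(":")
            fields = fields[1:]
        if len(fields) != 6:
            continue
        resource, failcnt = fields[0], int(fields[5])
        if failcnt > 0:
            hits.append((uid, resource, failcnt))
    return hits


# Hypothetical sample data, used when the real file cannot be read.
SAMPLE = """\
Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize        2718593    3035874   14372700   14790164          0
            privvmpages       49211      60211      65536      69632         12
            numproc              87         95        240        240          0
"""

if __name__ == "__main__":
    try:
        # Reading this file normally requires root on the host.
        with open("/proc/user_beancounters") as f:
            text = f.read()
    except OSError:
        text = SAMPLE
    for uid, resource, failcnt in find_failcnts(text):
        print(f"CT {uid}: {resource} failcnt={failcnt}")
```

Running it from cron on the host and comparing the results over time would show whether any counter starts failing shortly before the container hangs.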
When this happens I can no longer use MySQL. Trying to load a PHP site results in "Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock'". So I log into the container with vzctl or ssh and try to stop Apache using the standard init.d commands. This fails immediately. I then try kill -9 on the PID, and it doesn't even blink. The same thing happens if I try to stop any other service (bind, apache, etc.). The processes are stuck in "limbo", if you will: kind of running, but not really. I've never seen this behavior before, so I'm at a loss.
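A high load average with idle CPUs is what you would expect if many tasks sit in uninterruptible sleep (state D), since Linux counts those in the load average alongside runnable tasks, and SIGKILL is not acted on until such a task leaves that state. As a rough sketch for listing which processes are blocked, the following reads /proc/<pid>/stat directly. The helper names `parse_stat` and `blocked_processes` are hypothetical, and this only shows which processes are stuck, not why.

```python
import os


def parse_stat(stat_text):
    """Extract (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """Return (pid, comm) for every process currently in state D."""
    blocked = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return blocked
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        if state == "D":
            blocked.append((pid, comm))
    return blocked


if __name__ == "__main__":
    for pid, comm in blocked_processes():
        print(f"{pid}\t{comm}\tD")
```

Run inside the container it covers only that container's processes. Run on the host, it should also list the container's processes under their host PIDs.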
To fix the container, I have to stop every other container on the host, reboot the host, and then kill the "vz stop" command on the host (because the processes within the container refuse to die).
The Apache logs don't indicate an attack, and neither does netstat. I'm turning on detailed MySQL logging and setting my Zabbix machine to keep a close eye on it.
Here's some of my config:
uname -a
Linux host.example.com 2.6.24-24-openvz #1 SMP Tue Aug 18 18:49:39 UTC 2009 x86_64 GNU/Linux
free
             total       used       free     shared    buffers     cached
Mem:       3911788    2527048    1384740          0     132020     601128
-/+ buffers/cache:    1793900    2117888
Swap:      2803256          0    2803256
UBC (when everything is running OK)
http://pastebin.com/fe5f4ffb
UBC (when container is broken)
http://pastebin.com/f70b2ff8f
Top (inside container when broken)
http://pastebin.com/f4cb9c76c
Top (another one showing processes with D state)
http://pastebin.com/f58879f4d
Netstat (when broken - note the MySQL entries stuck connecting)
http://pastebin.com/f4693fdc2
Strace of me trying to kill mysql
http://pastebin.com/f4b5813af
I'd greatly appreciate any help with this, as I'm pretty lost at the moment. I hope this isn't information overload; I tried to be as detailed as possible. If there's anything else you need to know, please tell me!
Thanks