OpenVZ Forum: Support » More on debian hardware node crashes.


More on debian hardware node crashes. [message #3506]

Thu, 01 June 2006 00:59

vobiscum
Messages: 6
Registered: April 2006
Location: Brisbane

Junior Member

A month or so ago I posted about the problems that I had been having with my Debian hardware node crashing. Due to the location of the server, I had trouble getting any further debugging information. However, I have now found a way to reliably crash the server.

In one of my virtual servers, I am running postfix 2.1.5 with mailman for a number of mailing lists. Generally this works. However, at the start of each month mailman sends an individual message to every user on the mailing list, resulting in high load on the mail server. Within a few seconds of this starting, postfix begins logging the following error:

Jun 1 05:00:58 wright postfix/smtp[19551]: warning: smtp_connect_addr: socket: No buffer space available
Jun 1 05:00:58 wright postfix/smtp[19551]: socket to mx1.hotmail.com[??I??+]: No buffer space available (port 25)

This message is repeated until the server stops responding. Power cycling the server is the only way to get it back up and running again. The problem is that all the email is still in postfix's queue, so the cycle starts again as soon as postfix fires up. I have been able to work around this by limiting postfix to one concurrent connection per destination, which reduces the number of outgoing connections it makes.

I have tried setting the tcpsndbuf parameter to the following, which did not help:

tcpsndbuf 17408 233984 5242880 5242880 0

I am running 2.6.16 with the 026test009 patch. Are there some other TCP settings on the hardware node that I need to set? If one of the virtual servers requests more TCP send buffers than the hardware node can provide, what happens?

Later on I also noticed the following error on my hardware node:

Jun 1 09:52:59 corbusier kernel: TCP: too many of orphaned sockets
Jun 1 09:52:59 corbusier last message repeated 9 times

Any ideas about why this might be happening would be appreciated.

Ned


Re: More on debian hardware node crashes. [message #3513 is a reply to message #3506]

Thu, 01 June 2006 07:36

dim
Messages: 344
Registered: August 2005

Senior Member

1) Do I understand correctly that there are no non-zero failcnt values in /proc/user_beancounters?
2) Could you check that, if you stop the VPS with the mail server after a similar mass send, no entry related to this VPS remains in /proc/user_beancounters?
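
For anyone wanting to automate the first check, here is a minimal sketch in Python. It assumes the usual /proc/user_beancounters layout (a version line, a header line, then rows of resource, held, maxheld, barrier, limit, failcnt, with the VPS id and a colon in front of the first row of each block). The function names parse_beancounters and nonzero_failcnt are made up for this example and are not part of any OpenVZ tool.

```python
import os
from typing import Dict, List, Tuple

# Assumed column order after the resource name.
FIELDS = ("held", "maxheld", "barrier", "limit", "failcnt")


def parse_beancounters(text: str) -> Dict[str, Dict[str, Dict[str, int]]]:
    """Parse beancounter text into {uid: {resource: {field: value}}}."""
    result: Dict[str, Dict[str, Dict[str, int]]] = {}
    uid = None
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0] in ("Version:", "uid"):
            continue  # skip blank lines, the version line and the column header
        if parts[0].endswith(":"):
            uid = parts[0].rstrip(":")  # first row of a new VPS block
            parts = parts[1:]
        if uid is None or len(parts) != 1 + len(FIELDS):
            continue  # ignore rows that do not match the assumed layout
        name, values = parts[0], parts[1:]
        result.setdefault(uid, {})[name] = dict(zip(FIELDS, map(int, values)))
    return result


def nonzero_failcnt(text: str) -> List[Tuple[str, str, int]]:
    """Return (uid, resource, failcnt) for every counter with failcnt > 0."""
    return [
        (uid, name, row["failcnt"])
        for uid, resources in parse_beancounters(text).items()
        for name, row in resources.items()
        if row["failcnt"] > 0
    ]


if __name__ == "__main__" and os.path.exists("/proc/user_beancounters"):
    with open("/proc/user_beancounters") as fh:
        for uid, name, failcnt in nonzero_failcnt(fh.read()):
            print(f"VPS {uid}: {name} failcnt={failcnt}")
```

Run as root on the hardware node, it prints one line per resource that has hit its limit at least once. No output means every failcnt is zero.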


Re: More on debian hardware node crashes. [message #3527 is a reply to message #3513]

Thu, 01 June 2006 09:22

vobiscum
Messages: 6
Registered: April 2006
Location: Brisbane

Junior Member

I don't get a chance to check /proc/user_beancounters, as the system freezes before I can do anything.

Ned


Re: More on debian hardware node crashes. [message #3528 is a reply to message #3527]

Thu, 01 June 2006 09:39

dim
Messages: 344
Registered: August 2005

Senior Member

Ok, could you check them after some days of normal operations?
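
If the node freezes before the counters can be read by hand, one option is to log them periodically so that the last snapshot before the freeze is still on disk afterwards. The following is only a rough sketch in Python. The log path, interval and function names are assumptions chosen for illustration, and fsync asks the kernel to flush the data but cannot guarantee it reaches the disk if the freeze comes first.

```python
import os
import time


def append_snapshot(source: str, log_path: str) -> None:
    """Append one timestamped copy of `source` to `log_path` and flush it."""
    with open(source) as src:
        data = src.read()
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        log.write(f"=== {stamp} ===\n{data}\n")
        log.flush()
        os.fsync(log.fileno())  # ask the kernel to write the snapshot out now


def watch(source: str, log_path: str, interval: float, count: int) -> None:
    """Take `count` snapshots, sleeping `interval` seconds between them."""
    for i in range(count):
        append_snapshot(source, log_path)
        if i + 1 < count:
            time.sleep(interval)


if __name__ == "__main__" and os.path.exists("/proc/user_beancounters"):
    # Hypothetical log location and timing: one snapshot every 5 s for an hour.
    watch("/proc/user_beancounters", "/var/log/beancounters.log",
          interval=5, count=720)
```

Starting this shortly before the monthly mailman run and reading the tail of the log after the reboot should show which counters, if any, were failing just before the freeze.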


Re: More on debian hardware node crashes. [message #3581 is a reply to message #3506]

Mon, 05 June 2006 21:47

jonathankinney
Messages: 14
Registered: May 2006
Location: WA

Junior Member

Just for my clarification, you are saying the hardware node is actually becoming unresponsive, not just the VE? If that is the case, have you tried scaling back the VE's resource limits? I would assume that if you scaled back resource limits such as tcpsndbuf, it would kill the VE rather than the hardware node, which is usually preferable. Before you can really figure much of anything out, you will want to make sure that the hardware node does not crash.

Also, as far as I know, the message about "too many of orphaned sockets" is just the result of something in a VE hitting a resource limit, and the kernel cleaning up after the processes or connections that were involved when the limit was hit.

Jonathan Kinney
Data Systems Specialist
http://www.advantagecom.net

