More on debian hardware node crashes. [message #3506]
Thu, 01 June 2006 00:59
vobiscum
Messages: 6 Registered: April 2006 Location: Brisbane
Junior Member
A month or so ago I posted about the problems that I had been having with my Debian hardware node crashing. Due to the location of the server, I had trouble getting any further debugging information. However, I have now found a way to reliably crash the server.
In one of my virtual servers, I am running postfix 2.1.5 with mailman for a number of mailing lists. Generally this works. However, at the start of the month, mailman sends an individual message to every user on the mailing list, resulting in high load on the mail server. Within a few seconds of this starting, postfix begins logging the following error:
Jun 1 05:00:58 wright postfix/smtp[19551]: warning: smtp_connect_addr: socket: No buffer space available
Jun 1 05:00:58 wright postfix/smtp[19551]: socket to mx1.hotmail.com[??I??+]: No buffer space available (port 25)
This message is repeated until the server stops responding. Power cycling the server is the only way to get it back up and running again. The problem is that all the email is still in postfix's queue, so the cycle starts again as soon as postfix fires up. I have been able to work around this issue by setting postfix's limit on concurrent connections to a single destination to 1, essentially limiting the number of outgoing connections it makes.
I have tried setting the tcpsndbuf parameter to the following, which did not help:
tcpsndbuf 17408 233984 5242880 5242880 0
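For anyone reading along: I am assuming this line uses the same column layout as /proc/user_beancounters (resource, held, maxheld, barrier, limit, failcnt). Under that assumption, a minimal Python sketch for pulling out the failure counter might look like the following. The name parse_beancounter is a hypothetical helper, not part of any OpenVZ tool.

```python
from typing import NamedTuple


class Beancounter(NamedTuple):
    """One resource line, assuming the /proc/user_beancounters column order."""
    resource: str
    held: int
    maxheld: int
    barrier: int
    limit: int
    failcnt: int


def parse_beancounter(line: str) -> Beancounter:
    """Parse a line of the form 'name held maxheld barrier limit failcnt'.

    Hypothetical helper: it only handles the six-field form shown above,
    not lines that carry a leading 'uid:' column.
    """
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    name, *values = fields
    return Beancounter(name, *(int(v) for v in values))


bc = parse_beancounter("tcpsndbuf 17408 233984 5242880 5242880 0")
print(bc.resource, "failcnt =", bc.failcnt)
print("peak usage below barrier:", bc.maxheld < bc.barrier)
```

If that reading of the columns is right, a failcnt of 0 would suggest the virtual server never actually hit its own tcpsndbuf limit, so the "No buffer space available" errors may be coming from somewhere else.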
I am running 2.6.16 with the 026test009 patch. Are there some other TCP settings on the hardware node that I need to set? If one of the virtual servers requests more TCP send buffers than the hardware node can provide, what happens?
Later on I also noticed the following error in my hardware node:
Jun 1 09:52:59 corbusier kernel: TCP: too many of orphaned sockets
Jun 1 09:52:59 corbusier last message repeated 9 times
Any ideas about why this might be happening would be appreciated.
Ned
Re: More on debian hardware node crashes. [message #3581 is a reply to message #3506]
Mon, 05 June 2006 21:47
jonathankinney
Messages: 14 Registered: May 2006 Location: WA
Junior Member
Just for my clarification, you are saying the hardware node is actually becoming unresponsive, not just the VE? If that is the case, have you tried scaling back the VE's resource limits? I would assume that if you scaled back resource limits like tcpsndbuf, the VE would be killed rather than the hardware node, which is usually preferable. Before you can really figure much of anything out, you will want to make sure that the hardware node does not crash.
Also, as far as I know, the messages about "too many of orphaned sockets" are just the result of something in a VE hitting a resource limit, and the kernel cleaning up after the processes or connections that were involved when the limit was hit.
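One way to check on the orphaned sockets from the hardware node, assuming a stock Linux /proc layout where /proc/net/sockstat carries a "TCP:" line with an "orphan" field, is to read that counter and compare it by hand against /proc/sys/net/ipv4/tcp_max_orphans. A rough Python sketch, where tcp_orphan_count is a hypothetical helper and the sample text is made up for illustration:

```python
def tcp_orphan_count(sockstat_text: str) -> int:
    """Return the 'orphan' value from the TCP line of sockstat-style text.

    Assumes the line looks like 'TCP: inuse N orphan N tw N alloc N mem N'.
    """
    for line in sockstat_text.splitlines():
        if line.startswith("TCP:"):
            tokens = line.split()[1:]  # drop the 'TCP:' label
            # Remaining tokens alternate between a name and its value.
            stats = dict(zip(tokens[0::2], (int(v) for v in tokens[1::2])))
            return stats["orphan"]
    raise ValueError("no TCP line found in sockstat text")


# Illustrative sample only, not taken from the server in question.
SAMPLE = "sockets: used 120\nTCP: inuse 8 orphan 3 tw 2 alloc 10 mem 4\n"
print("orphaned TCP sockets:", tcp_orphan_count(SAMPLE))

# On a live node the text would come from the real file instead:
# with open("/proc/net/sockstat") as f:
#     print(tcp_orphan_count(f.read()))
```

If the orphan count climbs toward the limit while mailman is sending, that would at least line up with the kernel message.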
Jonathan Kinney
Data Systems Specialist
http://www.advantagecom.net