Re: Debug numtcpsock growing without bounds [message #52130 is a reply to message #52124] |
Fri, 24 July 2015 13:49  |
stompro
Messages: 3 Registered: July 2015 Location: Moorhead, MN
I think I have this figured out.
The problem had nothing to do with OpenVZ, other than becoming visible because of how OpenVZ meters out resources.
The problem was in a Perl module that Apache was loading. It was making an HTTP request and calling shutdown on the connection at the end, but not also calling close, so the file descriptor was never released and the socket was still counted as a TCP connection by user_beancounters. Connections that are shut down but not closed are not shown in netstat or ss -s. The same thing can happen with sockets that are allocated but never connected, when there is a socket call but no connect call ever follows it. The socket FD (file descriptor) just hangs around until the program exits.
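For anyone who wants to see the shutdown-versus-close difference for themselves, here is a minimal sketch in Python rather than Perl. It is not the code from the module in question. It assumes a local socket pair is a fair stand-in for the HTTP connection, and the helper names fd_is_open and demo are made up for the illustration.

```python
import os
import socket


def fd_is_open(fd: int) -> bool:
    """Return True if this descriptor number is still allocated in our process."""
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False


def demo() -> dict:
    # A connected local socket pair stands in for the outgoing HTTP
    # connection (assumption: the FD lifetime rules are the same).
    client, peer = socket.socketpair()
    fd = client.fileno()

    # What the buggy code did: shutdown only. The connection is torn
    # down, but the descriptor is still held by the process.
    client.shutdown(socket.SHUT_RDWR)
    open_after_shutdown = fd_is_open(fd)

    # The fix: also close, which releases the descriptor.
    client.close()
    open_after_close = fd_is_open(fd)

    # The other case from the post: socket() with no connect() after it.
    unconnected = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    open_without_connect = fd_is_open(unconnected.fileno())

    unconnected.close()
    peer.close()
    return {
        "open_after_shutdown": open_after_shutdown,
        "open_after_close": open_after_close,
        "open_without_connect": open_without_connect,
    }


if __name__ == "__main__":
    print(demo())
```

The descriptor survives the shutdown call and only goes away on close, which matches what was happening in the Apache processes.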
To view these connections use the command
lsof | grep " sock "
or to just get a count
lsof | grep " sock " | wc -l
The results look like this. Here are several processes that each have 1-2 orphan sockets. These are counted in the numtcpsock total even though they don't show up as connections.
udevd 151 root 4u sock 0,6 0t0 22177 can't identify protocol
rpcbind 828 root 4u sock 0,6 0t0 24640 can't identify protocol
sudo 28430 root 5u sock 0,6 0t0 74035691 can't identify protocol
/usr/sbin 28886 root 3u sock 0,6 0t0 74055248 can't identify protocol
/usr/sbin 28886 root 5u sock 0,6 0t0 74055252 can't identify protocol
/usr/sbin 28903 opensrf 3u sock 0,6 0t0 74055248 can't identify protocol
/usr/sbin 28903 opensrf 5u sock 0,6 0t0 74055252 can't identify protocol
/usr/sbin 28904 opensrf 3u sock 0,6 0t0 74055248 can't identify protocol
/usr/sbin 28904 opensrf 5u sock 0,6 0t0 74055252 can't identify protocol
/usr/sbin 28906 opensrf 3u sock 0,6 0t0 74055248 can't identify protocol
/usr/sbin 28906 opensrf 5u sock 0,6 0t0 74055252 can't identify protocol
/usr/sbin 28907 opensrf 3u sock 0,6 0t0 74055248 can't identify protocol
/usr/sbin 28907 opensrf 5u sock 0,6 0t0 74055252 can't identify protocol
/usr/sbin 28908 opensrf 3u sock 0,6 0t0 74055248 can't identify protocol
/usr/sbin 28908 opensrf 5u sock 0,6 0t0 74055252 can't identify protocol
/usr/sbin 28910 opensrf 3u sock 0,6 0t0 74055248 can't identify protocol
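If the list is long, a small script can tally it. This is only a sketch: it assumes the nine-column lsof layout shown above (COMMAND, PID, USER, FD, TYPE, DEVICE, SIZE/OFF, NODE, NAME), and the function name orphan_sockets is made up. It also collects the distinct NODE values, since the same NODE under several PIDs usually means one socket inherited by forked children rather than separate sockets.

```python
from collections import Counter


def orphan_sockets(lsof_text: str):
    """Count TYPE == "sock" entries per (command, pid) and collect distinct nodes."""
    per_process = Counter()
    nodes = set()
    for line in lsof_text.splitlines():
        # Split into at most 9 fields so the NAME column keeps its spaces.
        fields = line.split(None, 8)
        if len(fields) < 9 or fields[4] != "sock":
            continue  # skips the header and any other file types
        command, pid, node = fields[0], int(fields[1]), fields[7]
        per_process[(command, pid)] += 1
        nodes.add(node)
    return per_process, nodes


# A few lines copied from the output above.
SAMPLE = """\
udevd 151 root 4u sock 0,6 0t0 22177 can't identify protocol
rpcbind 828 root 4u sock 0,6 0t0 24640 can't identify protocol
/usr/sbin 28886 root 3u sock 0,6 0t0 74055248 can't identify protocol
/usr/sbin 28886 root 5u sock 0,6 0t0 74055252 can't identify protocol
/usr/sbin 28903 opensrf 3u sock 0,6 0t0 74055248 can't identify protocol
"""

if __name__ == "__main__":
    counts, nodes = orphan_sockets(SAMPLE)
    for (command, pid), n in sorted(counts.items()):
        print(command, pid, n)
    print("distinct sockets:", len(nodes))
```

On a live system the same function could be fed the text output of lsof instead of the pasted sample.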
I used strace on the Apache processes to find which sockets were being shut down but not closed. Then I found the Perl code that was only calling shutdown and fixed it, and now the problem is gone.
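Spotting the unmatched shutdown calls by eye in strace output gets tedious, so here is a rough sketch of how the matching could be automated. It assumes a single-process trace in strace's default text format, captured with something like strace -e trace=shutdown,close -p <pid> and without the -y option. The trace lines in the example are hypothetical, not output from the affected server.

```python
import re

# Successful calls only; the failed ones do not change descriptor state.
SHUTDOWN_RE = re.compile(r"\bshutdown\((\d+),.*\)\s*=\s*0\s*$")
CLOSE_RE = re.compile(r"\bclose\((\d+)\)\s*=\s*0\s*$")


def shutdown_without_close(trace_lines):
    """Return the fds that were shut down but never closed afterwards."""
    pending = set()
    for line in trace_lines:
        match = SHUTDOWN_RE.search(line)
        if match:
            pending.add(int(match.group(1)))
            continue
        match = CLOSE_RE.search(line)
        if match:
            pending.discard(int(match.group(1)))
    return sorted(pending)


# Hypothetical trace: fd 5 is leaked, fd 6 is handled correctly.
TRACE = [
    "socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 5",
    "shutdown(5, SHUT_RDWR) = 0",
    "socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 6",
    "shutdown(6, SHUT_RDWR) = 0",
    "close(6) = 0",
]

if __name__ == "__main__":
    print(shutdown_without_close(TRACE))
```

Any fd left in the result is a candidate for the kind of leak described here. A trace that follows several processes would need the pid prefix handled as well, which this sketch does not do.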
Josh