Home » Mailing lists » Users » TCP packets lost?
TCP packets lost? [message #25678] |
Mon, 07 January 2008 07:55 |
Max Deineko
Messages: 6 Registered: January 2008
|
Junior Member |
|
|
Hi,
I'm not really sure where to ask for help regarding my problem; I also
have posted to the apache group some time ago, but have not found a
solution. If you think this is the wrong group, I apologize. Also, any
hint as to where ask for help would be highly appreciated.
I'm facing following problem: Apache running inside an OpenVZ VPS,
serving multiple VirtualHosts. Now, it has regular availabiliy
problems, which appears to be neither load nor request pattern related.
What I don't quite understand is this: during the periods when the
server isn't available (and seems to do nothing) most of TCP sessions
look like the one below - apache keeps resending the handshake SYNACK
packets to the client although client's response is clearly here
(tcpdump done from inside the VPS).
Now I would think that even when the Apache proccess would hang, or be
waiting for some other component, the TCP stack part of the application
(or the kernel, for that matter) would still behave normally - is this
wrong? I.e. is such behaviour normal? Or do the incoming packets get
lost somewhere?
My first guess would have been that when the server is busy serving an
expensive request, the tcp rcv buffer would get full and packets would
get dropped. But wouldn't tcpdump not be able to see them then?
Any help would be highly appreciated,
thanks -
Max Deineko.
Session dump:
10:14:33.666935 IP CLIENT > SERVER: S 742086248:742086248(0) win 5808 <mss 1452,sackOK,timestamp 196805971 0,nop,wscale 7>
10:14:33.666965 IP SERVER > CLIENT: S 2619430727:2619430727(0) ack 742086249 win 5792 <mss 1460,sackOK,timestamp 347617915 196805971,nop,wscale 2>
10:14:33.700161 IP CLIENT > SERVER: . ack 1 win 46 <nop,nop,timestamp 196805980 347617915>
10:14:33.708406 IP CLIENT > SERVER: P 1:899(898) ack 1 win 46 <nop,nop,timestamp 196805980 347617915>
10:14:33.709279 IP CLIENT > SERVER: P 899:963(64) ack 1 win 46 <nop,nop,timestamp 196805980 347617915>
10:14:33.945483 IP CLIENT > SERVER: P 1:899(898) ack 1 win 46 <nop,nop,timestamp 196806039 347617915>
10:14:34.417635 IP CLIENT > SERVER: P 1:899(898) ack 1 win 46 <nop,nop,timestamp 196806157 347617915>
10:14:35.360801 IP CLIENT > SERVER: P 1:899(898) ack 1 win 46 <nop,nop,timestamp 196806393 347617915>
10:14:37.249017 IP CLIENT > SERVER: P 1:899(898) ack 1 win 46 <nop,nop,timestamp 196806865 347617915>
10:14:38.253998 IP SERVER > CLIENT: S 2619430727:2619430727(0) ack 742086249 win 5792 <mss 1460,sackOK,timestamp 347622503 196806865,nop,wscale 2>
10:14:38.287866 IP CLIENT > SERVER: . ack 1 win 46 <nop,nop,timestamp 196807127 347622503,nop,nop,sack 1 {0:1}>
10:14:41.023578 IP CLIENT > SERVER: P 1:899(898) ack 1 win 46 <nop,nop,timestamp 196807809 347622503>
10:14:44.253514 IP SERVER > CLIENT: S 2619430727:2619430727(0) ack 742086249 win 5792 <mss 1460,sackOK,timestamp 347628503 196807809,nop,wscale 2>
10:14:44.287392 IP CLIENT > SERVER: . ack 1 win 46 <nop,nop,timestamp 196808627 347628503,nop,nop,sack 1 {0:1}>
10:14:48.581817 IP CLIENT > SERVER: P 1:899(898) ack 1 win 46 <nop,nop,timestamp 196809697 347628503>
10:14:56.258819 IP SERVER > CLIENT: S 2619430727:2619430727(0) ack 742086249 win 5792 <mss 1460,sackOK,timestamp 347640510 196809697,nop,wscale 2>
10:14:56.292441 IP CLIENT > SERVER: . ack 1 win 46 <nop,nop,timestamp 196811629 347640510,nop,nop,sack 1 {0:1}>
10:15:03.676701 IP CLIENT > SERVER: P 1:899(898) ack 1 win 46 <nop,nop,timestamp 196813473 347640510>
10:15:20.458057 IP SERVER > CLIENT: S 2619430727:2619430727(0) ack 742086249 win 5792 <mss 1460,sackOK,timestamp 347664713 196813473,nop,wscale 2>
10:15:20.491784 IP CLIENT > SERVER: . ack 1 win 46 <nop,nop,timestamp 196817680 347664713,nop,nop,sack 1 {0:1}>
10:15:33.876920 IP CLIENT > SERVER: P 1:899(898) ack 1 win 46 <nop,nop,timestamp 196821025 347664713>
10:15:33.876953 IP SERVER > CLIENT: . ack 899 win 1897 <nop,nop,timestamp 347678134 196821025>
10:15:33.911143 IP CLIENT > SERVER: P 899:963(64) ack 1 win 46 <nop,nop,timestamp 196821035 347678134>
...(connection recovers)
# uname -a
Linux 2.6.9-023stab044.4-enterprise #1 SMP Thu May 24 17:41:23 MSD 2007
i686 i686 i386 GNU/Linux
# httpd -V
Server version: Apache/2.0.52
Server built: Jul 14 2007 11:53:18
Server's Module Magic Number: 20020903:9
Architecture: 32-bit
Server compiled with....
-D APACHE_MPM_DIR="server/mpm/prefork"
-D APR_HAS_SENDFILE
-D APR_HAS_MMAP
-D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
-D APR_USE_SYSVSEM_SERIALIZE
-D APR_USE_PTHREAD_SERIALIZE
-D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
-D APR_HAS_OTHER_CHILD
-D AP_HAVE_RELIABLE_PIPED_LOGS
-D HTTPD_ROOT="/etc/httpd"
-D SUEXEC_BIN="/usr/sbin/suexec"
-D DEFAULT_PIDLOG="logs/httpd.pid"
-D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
-D DEFAULT_LOCKFILE="logs/accept.lock"
-D DEFAULT_ERRORLOG="logs/error_log"
-D AP_TYPES_CONFIG_FILE="conf/mime.types"
-D SERVER_CONFIG_FILE="conf/httpd.conf"
Some sysctl parameters:
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_rmem = 4096 87380 174760
net.ipv4.tcp_wmem = 4096 16384 131072
net.ipv4.tcp_dsack = 1
net.ipv4.tcp_fack = 1
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_window_scaling = 1
net.core.rmem_max = 131071
net.core.wmem_max = 131071
net.core.somaxconn = 128
net.core.netdev_max_backlog = 300
net.ipv4.tcp_tw_reuse = 0
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_max_tw_buckets = 180000
net.ipv4.tcp_max_tw_buckets_ve = 16536
VPS limits:
uid resource held maxheld barrier limit failcnt
tcpsndbuf 3024352 3189816 2147483647 2147483647 0
tcprcvbuf 4662588 4728124 2147483647 2147483647 0
|
|
|
|
Re: TCP packets lost? [message #25684 is a reply to message #25679] |
Mon, 07 January 2008 10:00 |
Max Deineko
Messages: 6 Registered: January 2008
|
Junior Member |
|
|
Marcin Owsiany <marcin@owsiany.pl> wrote:
> On Mon, Jan 07, 2008 at 07:55:43AM +0000, Max Deineko wrote:
>> My first guess would have been that when the server is busy serving an
>> expensive request, the tcp rcv buffer would get full and packets would
>> get dropped. But wouldn't tcpdump not be able to see them then?
> http://forum.openvz.org/index.php?t=msg&goto=21592
>
> My theory is that when the (receive buffer size / current connections)
> is too small for the window size, then kernel gets itself into a state
> where it's impossible to recover a connection.
Thanks a lot. I already had found the thread and tried increasing
tcp_rmem size, which hung the webserver completely (maybe it was a litte
too big ;), but I'm still experimenting with the settings.
Of course there might be a problem with other server components, and I'm
not really a networking expert, it's just that the tcp flow doesn't look
like what I'd expect it to.
So is the behaviour I see normal or is there really something wrong at
this level already? Even if one assumed that userland components were
broken and went haywire sometimes?
Thanks, Max.
|
|
|
Re: Re: TCP packets lost? [message #25685 is a reply to message #25684] |
Mon, 07 January 2008 11:33 |
Vitaliy Gusev
Messages: 7 Registered: September 2007
|
Junior Member |
|
|
On 7 January 2008 13:01:24 Max Deineko wrote:
> Marcin Owsiany <marcin@owsiany.pl> wrote:
> > On Mon, Jan 07, 2008 at 07:55:43AM +0000, Max Deineko wrote:
> >> My first guess would have been that when the server is busy serving an
> >> expensive request, the tcp rcv buffer would get full and packets would
> >> get dropped. But wouldn't tcpdump not be able to see them then?
>
> > http://forum.openvz.org/index.php?t=msg&goto=21592
> >
> > My theory is that when the (receive buffer size / current connections)
> > is too small for the window size, then kernel gets itself into a state
> > where it's impossible to recover a connection.
>
> Thanks a lot. I already had found the thread and tried increasing
> tcp_rmem size, which hung the webserver completely (maybe it was a litte
> too big ;), but I'm still experimenting with the settings.
>
> Of course there might be a problem with other server components, and I'm
> not really a networking expert, it's just that the tcp flow doesn't look
> like what I'd expect it to.
Please show files: /proc/net/netstat, /proc/net/snmp ( inside VE)
Have you seen any suspicious messages in dmesg related to this issue?
>
> So is the behaviour I see normal or is there really something wrong at
> this level already? Even if one assumed that userland components were
> broken and went haywire sometimes?
>
> Thanks, Max.
>
--
Thanks,
Vitaliy Gusev
|
|
|
Re: TCP packets lost? [message #25686 is a reply to message #25685] |
Mon, 07 January 2008 12:01 |
Max Deineko
Messages: 6 Registered: January 2008
|
Junior Member |
|
|
Vitaliy Gusev <vgusev@sw.ru> wrote:
> On 7 January 2008 13:01:24 Max Deineko wrote:
>> > On Mon, Jan 07, 2008 at 07:55:43AM +0000, Max Deineko wrote:
>> >> My first guess would have been that when the server is busy
>> >> serving an expensive request, the tcp rcv buffer would get full
>> >> and packets would get dropped. But wouldn't tcpdump not be able
>> >> to see them then?
>> I already [...] tried increasing tcp_rmem size, which hung the
>> webserver completely (maybe it was a litte too big ;), but I'm still
>> experimenting with the settings.
>>
>> Of course there might be a problem with other server components, and
>> I'm not really a networking expert, it's just that the tcp flow
>> doesn't look like what I'd expect it to.
> Please show files: /proc/net/netstat, /proc/net/snmp ( inside VE)
See below. Currently running with
net.ipv4.tcp_rmem = 4096 87380 184762
net.core.rmem_max = 184761
> Have you seen any suspicious messages in dmesg related to this issue?
Indeed, dmesg shows
TCP: Treason uncloaked! Peer 91.4.192.15:52306/80 shrinks window 204385904:204399829. Repaired.
TCP: Treason uncloaked! Peer 91.4.192.15:52306/80 shrinks window 204385904:204399829. Repaired.
TCP: Treason uncloaked! Peer 91.4.192.15:52306/80 shrinks window 204385904:204399829. Repaired.
TCP: Treason uncloaked! Peer 62.158.79.38:1883/80 shrinks window 1835842591:1835855659. Repaired.
TCP: Treason uncloaked! Peer 91.12.168.104:2918/80 shrinks window 2946649572:2946651606. Repaired.
TCP: Treason uncloaked! Peer 217.238.252.24:49584/80 shrinks window 3063558791:3063563147. Repaired.
TCP: Treason uncloaked! Peer 87.167.76.47:52766/80 shrinks window 867053615:867090259. Repaired.
TCP: Treason uncloaked! Peer 91.11.116.53:2475/80 shrinks window 3699395483:3699409070. Repaired.
TCP: Treason uncloaked! Peer 91.11.116.53:2476/80 shrinks window 3699833370:3699846438. Repaired.
TCP: Treason uncloaked! Peer 91.11.116.53:2475/80 shrinks window 3699403648:3699409070. Repaired.
TCP: Treason uncloaked! Peer 91.11.116.53:2476/80 shrinks window 3699841562:3699846438. Repaired.
TCP: Treason uncloaked! Peer 79.212.215.179:2859/80 shrinks window 400801346:400802798. Repaired.
TCP: Treason uncloaked! Peer 80.139.188.46:3646/80 shrinks window 156197829:156205177. Repaired.
TCP: Treason uncloaked! Peer 80.139.188.46:3646/80 shrinks window 156197829:156205177. Repaired.
TCP: Treason uncloaked! Peer 80.140.219.65:49793/80 shrinks window 4253417104:4253432263. Repaired.
TCP: Treason uncloaked! Peer 80.140.219.65:49793/80 shrinks window 4253417104:4253432263. Repaired.
TCP: Treason uncloaked! Peer 80.140.219.65:49793/80 shrinks window 4253417104:4253432263. Repaired.
That's from the hardware node after running for three days. I'm sorry I
cannot relate the messages to the dropout times - the messages do not
show up in the syslogs (yet - I'm new to the server).
Thanks & regards - Max.
# cat /proc/net/netstat
TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed EmbryonicRsts PruneCalled RcvPruned OfoPruned OutOfWindowIcmps LockDroppedIcmps ArpFilter TW TWRecycled TWKilled PAWSPassive PAWSActive PAWSEstab DelayedACKs DelayedACKLocked DelayedACKLost ListenOverflows ListenDrops TCPPrequeued TCPDirectCopyFromBacklog TCPDirectCopyFromPrequeue TCPPrequeueDropped TCPHPHits TCPHPHitsToUser TCPPureAcks TCPHPAcks TCPRenoRecovery TCPSackRecovery TCPSACKReneging TCPFACKReorder TCPSACKReorder TCPRenoReorder TCPTSReorder TCPFullUndo TCPPartialUndo TCPDSACKUndo TCPLossUndo TCPLoss TCPLostRetransmit TCPRenoFailures TCPSackFailures TCPLossFailures TCPFastRetrans TCPForwardRetrans TCPSlowStartRetrans TCPTimeouts TCPRenoRecoveryFail TCPSackRecoveryFail TCPSchedulerFailed TCPRcvCollapsed TCPDSACKOldSent TCPDSACKOfoSent TCPDSACKRecv TCPDSACKOfoRecv TCPAbortOnSyn TCPAbortOnData TCPAbortOnClose TCPAbortOnMemory TCPAbortOnTimeout TCPAbortOnLinger TCPAbortFailed TCPMemoryPressures
TcpExt: 0 0 0 15666 390 7 0 3 0 0 0 0 0 0 0 350 518998 283 5722 6044 6044 13385260 860306 145918454 0 13263422 87351 9290224 12159160 1576 60817 31 631 240 92 198 367 2815 1284 1383 34266 20 369 13804 1501 90841 8390 24116 28574 309 7175 2 12934 5823 2 14577 291 0 751 5077 0 2285 0 0 0
# cat /proc/net/snmp
Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails FragOKs FragFails FragCreates
Ip: 2 64 55086729 0 0 0 0 0 55079918 59160345 0 0 22 64 0 22 0 0 0
Icmp: InMsgs InErrors InDestUnreachs InTimeExcds InParmProbs InSrcQuenchs InRedirects InEchos InEchoReps InTimestamps InTimestampReps InAddrMasks InAddrMaskReps OutMsgs OutErrors OutDestUnreachs OutTimeExcds OutParmProbs OutSrcQuenchs OutRedirects OutEchos OutEchoReps OutTimestamps OutTimestampReps OutAddrMasks OutAddrMaskReps
Icmp: 10852 99 2358 0 0 2 0 8401 0 0 0 0 0 8472 0 71 0 0 0 0 0 8401 0 0 0 0
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
Tcp: 0 0 0 0 487897 1752675 1326 25489 45 55038781 59121501 195520 397 7118
Udp: InDatagrams NoPorts InErrors OutDatagrams
Udp: 30214 71 0 30372
|
|
|
|
Re: Re: TCP packets lost? [message #25692 is a reply to message #25686] |
Mon, 07 January 2008 14:43 |
gblond
Messages: 64 Registered: May 2007
|
Member |
|
|
On 7 January 2008 15:02:14 Max Deineko wrote:
> Vitaliy Gusev <vgusev@sw.ru> wrote:
> > On 7 January 2008 13:01:24 Max Deineko wrote:
> >> > On Mon, Jan 07, 2008 at 07:55:43AM +0000, Max Deineko wrote:
> >> >> My first guess would have been that when the server is busy
> >> >> serving an expensive request, the tcp rcv buffer would get full
> >> >> and packets would get dropped. But wouldn't tcpdump not be able
> >> >> to see them then?
>
> >> I already [...] tried increasing tcp_rmem size, which hung the
> >> webserver completely (maybe it was a litte too big ;), but I'm still
> >> experimenting with the settings.
> >>
> >> Of course there might be a problem with other server components, and
> >> I'm not really a networking expert, it's just that the tcp flow
> >> doesn't look like what I'd expect it to.
>
> > Please show files: /proc/net/netstat, /proc/net/snmp ( inside VE)
>
> See below. Currently running with
>
> net.ipv4.tcp_rmem = 4096 87380 184762
> net.core.rmem_max = 184761
1) You didn't describe value of the net.core.rmem_default parameter.
Try to increase one.
2) Can you check patch from thread
http://forum.openvz.org/index.php?t=msg&goto=9018 ?
>
> > Have you seen any suspicious messages in dmesg related to this issue?
>
> Indeed, dmesg shows
>
> TCP: Treason uncloaked! Peer 91.4.192.15:52306/80 shrinks window 204385904:204399829. Repaired.
> TCP: Treason uncloaked! Peer 91.4.192.15:52306/80 shrinks window 204385904:204399829. Repaired.
> TCP: Treason uncloaked! Peer 91.4.192.15:52306/80 shrinks window 204385904:204399829. Repaired.
> TCP: Treason uncloaked! Peer 62.158.79.38:1883/80 shrinks window 1835842591:1835855659. Repaired.
> TCP: Treason uncloaked! Peer 91.12.168.104:2918/80 shrinks window 2946649572:2946651606. Repaired.
> TCP: Treason uncloaked! Peer 217.238.252.24:49584/80 shrinks window 3063558791:3063563147. Repaired.
> TCP: Treason uncloaked! Peer 87.167.76.47:52766/80 shrinks window 867053615:867090259. Repaired.
> TCP: Treason uncloaked! Peer 91.11.116.53:2475/80 shrinks window 3699395483:3699409070. Repaired.
> TCP: Treason uncloaked! Peer 91.11.116.53:2476/80 shrinks window 3699833370:3699846438. Repaired.
> TCP: Treason uncloaked! Peer 91.11.116.53:2475/80 shrinks window 3699403648:3699409070. Repaired.
> TCP: Treason uncloaked! Peer 91.11.116.53:2476/80 shrinks window 3699841562:3699846438. Repaired.
> TCP: Treason uncloaked! Peer 79.212.215.179:2859/80 shrinks window 400801346:400802798. Repaired.
> TCP: Treason uncloaked! Peer 80.139.188.46:3646/80 shrinks window 156197829:156205177. Repaired.
> TCP: Treason uncloaked! Peer 80.139.188.46:3646/80 shrinks window 156197829:156205177. Repaired.
> TCP: Treason uncloaked! Peer 80.140.219.65:49793/80 shrinks window 4253417104:4253432263. Repaired.
> TCP: Treason uncloaked! Peer 80.140.219.65:49793/80 shrinks window 4253417104:4253432263. Repaired.
> TCP: Treason uncloaked! Peer 80.140.219.65:49793/80 shrinks window 4253417104:4253432263. Repaired.
>
> That's from the hardware node after running for three days. I'm sorry I
> cannot relate the messages to the dropout times - the messages do not
> show up in the syslogs (yet - I'm new to the server).
>
> Thanks & regards - Max.
>
>
> # cat /proc/net/netstat
>
> TcpExt: SyncookiesSent SyncookiesRecv SyncookiesFailed EmbryonicRsts PruneCalled RcvPruned OfoPruned OutOfWindowIcmps LockDroppedIcmps ArpFilter TW TWRecycled TWKilled PAWSPassive PAWSActive PAWSEstab DelayedACKs DelayedACKLocked DelayedACKLost ListenOverflows ListenDrops TCPPrequeued TCPDirectCopyFromBacklog TCPDirectCopyFromPrequeue TCPPrequeueDropped TCPHPHits TCPHPHitsToUser TCPPureAcks TCPHPAcks TCPRenoRecovery TCPSackRecovery TCPSACKReneging TCPFACKReorder TCPSACKReorder TCPRenoReorder TCPTSReorder TCPFullUndo TCPPartialUndo TCPDSACKUndo TCPLossUndo TCPLoss TCPLostRetransmit TCPRenoFailures TCPSackFailures TCPLossFailures TCPFastRetrans TCPForwardRetrans TCPSlowStartRetrans TCPTimeouts TCPRenoRecoveryFail TCPSackRecoveryFail TCPSchedulerFailed TCPRcvCollapsed TCPDSACKOldSent TCPDSACKOfoSent TCPDSACKRecv TCPDSACKOfoRecv TCPAbortOnSyn TCPAbortOnData TCPAbortOnClose TCPAbortOnMemory TCPAbortOnTimeout TCPAbortOnLinger TCPAbortFailed TCPMemoryPressures
> TcpExt: 0 0 0 15666 390 7 0 3 0 0 0 0 0 0 0 350 518998 283 5722 6044 6044 13385260 860306 145918454 0 13263422 87351 9290224 12159160 1576 60817 31 631 240 92 198 367 2815 1284 1383 34266 20 369 13804 1501 90841 8390 24116 28574 309 7175 2 12934 5823 2 14577 291 0 751 5077 0 2285 0 0 0
>
> # cat /proc/net/snmp
>
> Ip: Forwarding DefaultTTL InReceives InHdrErrors InAddrErrors ForwDatagrams InUnknownProtos InDiscards InDelivers OutRequests OutDiscards OutNoRoutes ReasmTimeout ReasmReqds ReasmOKs ReasmFails FragOKs FragFails FragCreates
> Ip: 2 64 55086729 0 0 0 0 0 55079918 59160345 0 0 22 64 0 22 0 0 0
> Icmp: InMsgs InErrors InDestUnreachs InTimeExcds InParmProbs InSrcQuenchs InRedirects InEchos InEchoReps InTimestamps InTimestampReps InAddrMasks InAddrMaskReps OutMsgs OutErrors OutDestUnreachs OutTimeExcds OutParmProbs OutSrcQuenchs OutRedirects OutEchos OutEchoReps OutTimestamps OutTimestampReps OutAddrMasks OutAddrMaskReps
> Icmp: 10852 99 2358 0 0 2 0 8401 0 0 0 0 0 8472 0 71 0 0 0 0 0 8401 0 0 0 0
> Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts
> Tcp: 0 0 0 0 487897 1752675 1326 25489 45 55038781 59121501 195520 397 7118
> Udp: InDatagrams NoPorts InErrors OutDatagrams
> Udp: 30214 71 0 30372
>
>
--
Thank,
Vitaliy Gusev
|
|
|
|
|
Goto Forum:
Current Time: Fri Oct 18 06:42:21 GMT 2024
Total time taken to generate the page: 0.05087 seconds
|