OpenVZ Forum


Home » Mailing lists » Devel » [PATCHSET] 2.6.20-lxc8
Re: Re: [PATCHSET] 2.6.20-lxc8 [message #18046 is a reply to message #18033] Wed, 28 March 2007 08:01 Go to previous messageGo to previous message
dev is currently offline  dev
Messages: 1693
Registered: September 2005
Location: Moscow
Senior Member

Benjamin,

checksumming can be optimized out as well.
We had an experimental patch for OpenVZ venet device, which adds
NETIF_F_LLTX | NETIF_F_HW_CSUM | NETIF_F_SG | NETIF_F_HIGHDMA
features to venet device and avoids additional checksumming where possible
(moving RX/TX checksum calculation to hardware).

So I guess this is doable in future as well.

Thanks,
Kirill

> Hi,
> 
> Yesterday, I applied a patch similar to Kirill's one that skip 
> skb_cow() in ip_forward when the device is a etun, and it does help a lot.
> 
> With the patch the cpu load increase is reduced by 50%. Part of the 
> problem is "solved".
> 
> Here are the figures for netperf:
> 
> (Host A -> Host B
>   Host A is running kernel 2.6.20-rc5-netns.i386)
> 
>                               Throughput    CPU load
> 
> - without container:              719.78     10.45
> - inside a container (no patch)   719.37     21.88
> - inside a container with patch:  728.93     15.41
> 
> The CPU load with the ip_forward patch is now "only" 50% higher (10% 
> compared to 15%) than the reference case without container.
> 
> The throughput is even better (I repeated the test a few times and I 
> always got better results from inside the container).
> 
> (1) Why skb_cow() performs the copy?
> 
> I also added some traces to understand why skb_cow() does copy the 
> skb: is it insufficient headroom or that the skb has been cloned 
> previously?
> In our case, the condition is always that the "TCP skb" is marked as 
> cloned.
> It is likely that these skb have been cloned in tcp_skb_transmit().
> 
> 
> (2) Who consumes the other 5% percent cpu?
> 
> With the patch installed oprofile reports that pskb_expand_head() 
> (called by skb_cow) has disappeared from the top cpu consumers list.
> 
> Now, the remaining symbol that shows unusual activity is 
> csum_partial_copy_generic().
> I'd like to find who is the caller, unfortunately, this one is harder 
> to track. It is written in assembler and called by "static inline" 
> routines and Systemtap doesn't like that. :(
> 
> 
> So, that was the current status.
> I'm continuing my investigations.
> 
> Regards,
> Benjamin
> 
> Eric W. Biederman wrote:
> 
>>Kirill Korotaev <dev@openvz.org> writes:
>>
>>
>>>we have the hack below in ip_forward() to avoid skb_cow(),
>>>Banjamin, can you check whether it helps in your case please?
>>>(NOTE: you will need to replace check for NETIF_F_VENET with something else
>>> or introduce the same flag on etun device).
>>
>>Ugh.  The thing is skb_cow should be free.  It only has a cost when the skb
>>is too small or there is a second copy of the skb.  I don't there is a technical
>>reason for either of those to be the case when we are going over ethernet.
>>
>>And since the hardware header needs to change as well your hack is actually broken
>>if the incoming network interface is not ethernet. 
>>
>>So while I can see this hack for testing I'd much rather see if we can actually
>>fix this one cleanly.
>>
>>Unless you understand what is triggering the skb_cow to actually perform
>>the copy.
>>
>>Eric
>>
>>
>>>diff -upr linux-2.6.18-rhel5.orig/net/ipv4/ip_forward.c
>>>linux-2.6.18-rhel5-028stab023/net/ipv4/ip_forward.c
>>>--- linux-2.6.18-rhel5.orig/net/ipv4/ip_forward.c 2006-09-20 07:42:06.000000000
>>>+0400
>>>+++ linux-2.6.18-rhel5-028stab023/net/ipv4/ip_forward.c 2007-03-20
>>>17:22:45.000000000 +0300
>>>@@ -86,6 +86,24 @@ int ip_forward(struct sk_buff *skb)
>>>        if (opt->is_strictroute && rt->rt_dst != rt->rt_gateway)
>>>                goto sr_failed;
>>>
>>>+       /*
>>>+        * We try to optimize forwarding of VE packets:
>>>+        * do not decrement TTL (and so save skb_cow)
>>>+        * during forwarding of outgoing pkts from VE.
>>>+        * For incoming pkts we still do ttl decr,
>>>+        * since such skb is not cloned and does not require
>>>+        * actual cow. So, there is at least one place
>>>+        * in pkts path with mandatory ttl decr, that is
>>>+        * sufficient to prevent routing loops.
>>>+        */
>>>+       iph = skb->nh.iph;
>>>+       if (
>>>+#ifdef CONFIG_IP_ROUTE_NAT
>>>+ (rt->rt_flags & RTCF_NAT) == 0 && /* no NAT mangling expected */
>>>+#endif                                           /* and */
>>>+           (skb->dev->features & NETIF_F_VENET)) /* src is VENET device */
>>>+               goto no_ttl_decr;
>>>+
>>>        /* We are about to mangle packet. Copy it! */
>>>        if (skb_cow(skb, LL_RESERVED_SPACE(rt->u.dst.dev)+rt->u.dst.header_len))
>>>                goto drop;
>>>@@ -94,6 +112,8 @@ int ip_forward(struct sk_buff *skb)
>>>        /* Decrease ttl after skb cow done */
>>>        ip_decrease_ttl(iph);
>>>
>>>+no_ttl_decr:
>>>+
>>>        /*
>>>         *      We now generate an ICMP HOST REDIRECT giving the route
>>>         *      we calculated.
>>>@@ -121,3 +141,5 @@ drop:
>>
> 
> 

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
 
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Previous Topic: Re: L2 network namespace benchmarking
Next Topic: Re: [ckrm-tech] [PATCH 7/7] containers (V7): Container interface to nsproxy subsystem
Goto Forum:
  


Current Time: Tue Dec 03 07:53:33 GMT 2024

Total time taken to generate the page: 0.10738 seconds