L2 network namespace benchmarking (resend with Service Demand) [message #18065] Fri, 30 March 2007 14:16
Daniel Lezcano
Hi,

as suggested by Rick, I added the Service Demand results to the matrix.

Cheers.

----------------

Hi,

I did some benchmarking on the existing L2 network namespaces.

These patches are included in the lxc patchset at:
    http://lxc.sourceforge.net/patches/2.6.20
The lxc7 patchset series contains Dmitry's patchset
The lxc8 patchset series contains Eric's patchset

Here are the scenarios I used to do some simple benchmarking on the
network namespaces. I tested three kernels:

     * Vanilla kernel 2.6.20

     * lxc7 with Dmitry's patchset based on 2.6.20
       * the L3 network namespace was removed for this testing

     * lxc8 with Eric's patchset based on 2.6.20

I didn't run any tests on Linux-Vserver because it is an L3 namespace
and is not comparable with the L2 namespace implementations. If anyone
is interested in Linux-Vserver performance, results can be found at
http://lxc.sf.net. Roughly, we know there is no performance
degradation.

For each kernel, several configurations were tested:

 * vanilla: obviously, only one configuration was tested, to provide
   reference values.

 * lxc7, network namespace
  - compiled out
  - compiled in
    - without container
    - inside a container with ip_forward, route and veth
    - inside a container with a bridge and veth

 * lxc8, network namespace
  - compiled out
  - compiled in
    - without container
    - inside a container with a real network device (eth1 was moved
      into the container instead of using an etun device)
    - inside a container with ip_forward, route and etun
    - inside a container with a bridge and etun
      (setup sketches for the routed and bridged variants follow this list)
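
As a sketch of the routed and bridged configurations (using today's
iproute2/brctl syntax; the 2.6.20-era patchsets shipped their own
tools, so the exact commands differed, and the names veth0/veth1, br0
and the addresses are made up for illustration; Eric's etun pair is
set up analogously to the veth pair):

    # create the pair and push one end into the container
    ip link add veth0 type veth peer name veth1
    ip link set veth1 netns $CONTAINER_PID

    # routed variant: enable forwarding and route to the container
    echo 1 > /proc/sys/net/ipv4/ip_forward
    ip addr add 10.0.0.1/30 dev veth0
    ip link set veth0 up
    ip route add 10.0.0.2/32 dev veth0

    # bridged variant: enslave the host end and the physical NIC
    brctl addbr br0
    brctl addif br0 eth0
    brctl addif br0 veth0
    ip link set br0 up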

Each benchmark was run between two machines using netperf and
tbench. A dedicated machine with a RH4 kernel ran the benchmark servers.

For each benchmark, netperf and tbench, the tests were run on:

 * an Intel Xeon EM64T, dual-processor 2.8 GHz with hyperthreading
   enabled, 4 GB of RAM and a gigabit NIC (tg3)

 * an AMD Athlon MP 1800+, dual-processor 1.5 GHz, 1 GB of RAM and a
   gigabit NIC (dl2000)

Each test was run on both machines in order to measure the
CPU-relative overhead.
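
The exact commands are not reproduced here; as an illustrative sketch
(hostname, duration and process count made up), the runs look like
this, with netperf's -c/-C options enabling local/remote CPU
measurement, which is what produces the CPU usage and service demand
(us/KB) figures:

    # on the dedicated benchmark server
    netserver
    tbench_srv

    # on the machine under test
    netperf -H benchserver -t TCP_STREAM -l 60 -c -C
    tbench -t 60 4 benchserver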


# bench on vanilla
===================


 ----------- ----------------------------------------------------
| Netperf   | CPU usage (%) | Throughput (Mbits/s) |  SD (us/KB) |
 ----------- ----------------------------------------------------
| on xeon   |      5.99     |        941.38        |    2.084    |
 ----------------------------------------------------------------
| on athlon |     28.17     |        844.82        |    5.462    |
 ----------------------------------------------------------------

 ----------- -----------------------
| Tbench    | Throughput (MBytes/s) |
 ----------- -----------------------
| on xeon   |         66.35         |
 -----------------------------------
| on athlon |         65.31         |
 -----------------------------------


# bench from Dmitry's patchset
==============================


1 - with net_ns compiled out
----------------------------

 ----------- -------------------------------------------------------------------------
| Netperf   | CPU usage (%) / overhead | Throughput (Mbits/s) / changed |  SD (us/KB) |
 -------------------------------------------------------------------------------------
| on xeon   |      5.93 / -1 %         |        941.32 / 0 %            |    2.066    |
 -------------------------------------------------------------------------------------
| on athlon |     28.89 / +2.5 %       |        842.78 / -0.2 %         |    5.615    |
 -------------------------------------------------------------------------------------


 ----------- ---------------------------------
| Tbench    | Throughput (MBytes/s) / changed |
 ----------- ---------------------------------
| on xeon   |         67.00 / +0.9 %          |
 ---------------------------------------------
| on athlon |         65.45 / 0 %             |
 ---------------------------------------------

  Observation: no noticeable overhead


2 - with net_ns compiled in
---------------------------


    2.1 - without container
    -----------------------

 ----------- -------------------------------------------------------------------------
| Netperf   | CPU usage (%) / overhead | Throughput (Mbits/s) / changed |  SD (us/KB) |
 -------------------------------------------------------------------------------------
| on xeon   |      6.23 / +4 %         |        941.35 / 0 %            |    2.168    |
 -------------------------------------------------------------------------------------
| on athlon |     28.83 / +2.3 %       |        850.76 / +0.7 %         |    5.552    |
 -------------------------------------------------------------------------------------


 ----------- ---------------------------------
| Tbench    | Throughput (MBytes/s) / changed |
 ----------- ---------------------------------
| on xeon   |         67.00 / 0 %             |
 ---------------------------------------------
| on athlon |         65.45 / 0 %             |
 ---------------------------------------------

  Observation: no noticeable overhead


    2.2 - inside the container with veth and routes
    -----------------------------------------------

 ----------- -------------------------------------------------------------------------
| Netperf   | CPU usage (%) / overhead | Throughput (Mbits/s) / changed |  SD (us/KB) |
 -------------------------------------------------------------------------------------
| on xeon   |     17.14 / +186.1 %     |        941.34 / 0 %            |    5.966    |
 -------------------------------------------------------------------------------------
| on athlon |     49.99 / +77.45 %     |        838.85 / -0.7 %         |    9.763    |
 -------------------------------------------------------------------------------------


 ----------- ---------------------------------
| Tbench    | Throughput (MBytes/s) / changed |
 ----------- ---------------------------------
| on xeon   |         66.00 / -0.5 %          |
 ---------------------------------------------
| on athlon |         61.00 / -6.65 %         |
 ---------------------------------------------

  Observation: CPU overhead is very large; throughput is impacted on
  the less powerful machine


    2.3 - inside the container with veth and bridge
    -----------------------------------------------

 ----------- -------------------------------------------------------------------------
| Netperf   | CPU usage (%) / overhead | Throughput (Mbits/s) / changed |  SD (us/KB) |
 -------------------------------------------------------------------------------------
| on xeon   |     19.14 / +219.5 %     |        941.18 / 0 %            |    6.863    |
 -------------------------------------------------------------------------------------
| on athlon |     49.98 / +77.42 %     |        831.65 / -1.5 %         |    9.846    |
 -------------------------------------------------------------------------------------


 ----------- ---------------------------------
| Tbench    | Throughput (MBytes/s) / changed |
 ----------- ---------------------------------
| on xeon   |         64.00 / -3.5 %          |
 ---------------------------------------------
| on athlon |         60.07 / -8.3 %          |
 ---------------------------------------------

  Observation: CPU overhead is very large; throughput is impacted on
  the less powerful machine


# bench from Eric's patchset
============================


1 - with net_ns compiled out
----------------------------

 ----------- -------------------------------------------------------------------------
| Netperf   | CPU usage (%) / overhead | Throughput (Mbits/s) / changed |  SD (us/KB) |
 -------------------------------------------------------------------------------------
| on xeon   |      6.04 / +0.8 %       |        941.33 / 0 %            |    2.104    |
 -------------------------------------------------------------------------------------
| on athlon |     28.45 / +1 %         |        840.76 / -0.5 %         |    5.545    |
 -------------------------------------------------------------------------------------


 ----------- ---------------------------------
| Tbench    | Throughput (MBytes/s) / changed |
 ----------- ---------------------------------
| on xeon   |         65.69 / -1 %            |
 ---------------------------------------------
| on athlon |         65.35 / -0.2 %          |
 ---------------------------------------------

  Observation: no noticeable overhead


2 - with net_ns compiled in
---------------------------


    2.1 - without container
    -----------------------

 ----------- -------------------------------------------------------------------------
| Netperf   | CPU usage (%) / overhead | Throughput (Mbits/s) / changed |  SD (us/KB) |
 -------------------------------------------------------------------------------------
| on xeon   |      6.02 / +0.5 %       |        941.34 / 0 %            |    2.097    |
 -------------------------------------------------------------------------------------
| on athlon |     27.93 / -0.8 %       |        833.53 / -1.3 %         |    5.490    |
 -------------------------------------------------------------------------------------


 ----------- ---------------------------------
| Tbench    | Throughput (MBytes/s) / changed |
 ----------- ---------------------------------
| on xeon   |         66.00 / -0.5 %          |
 ---------------------------------------------
| on athlon |         64.94 / -0.9 %          |
 ---------------------------------------------

  Observation: no noticeable overhead


    2.2 - inside the container with real device
    -------------------------------------------

 ----------- -------------------------------------------------------------------------
| Netperf   | CPU usage (%) / overhead | Throughput (Mbits/s) / changed |  SD (us/KB) |
 -------------------------------------------------------------------------------------
| on xeon   |
...

Re: L2 network namespace benchmarking (resend with Service Demand) [message #18078 is a reply to message #18065] Fri, 06 April 2007 08:03
ebiederm
Daniel Lezcano <dlezcano@fr.ibm.com> writes:

> Hi,
>
> as suggested by Rick, I added the Service Demand results to the matrix.

A couple of random thoughts in trying to understand the numbers you are
seeing.

- Checksum offloading?

  You have noted that with the bridge netfilter support disabled you
  are still seeing additional checksum overhead.  Just like you are
  seeing in the routing case.

  Is it possible the problem is simply that etun doesn't support
  checksum offloading, while your normal test hardware does?
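
  For instance (hypothetical device names; "ethtool -k" just queries
  the offload settings a driver claims to support):

      ethtool -k eth0     # check the "tx-checksumming:" line
      ethtool -k etun0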

- Tagged VLANs?
  
  Currently you have tested bridging and routing to get the packets to
  a network namespace.  Could you test tagged vlans?

  I'm just curious if we have anything in the network stack today that
  will multiplex a NIC without measurable overhead.
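
  For example (VLAN id and address made up; vconfig was the usual tool
  at the time, though newer iproute2 can create the same device with
  "ip link add link eth1 name eth1.100 type vlan id 100"):

      vconfig add eth1 100
      ip addr add 192.168.100.2/24 dev eth1.100
      ip link set eth1.100 up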

- Without NETNS?

  We should probably see if we can set up the same configuration we are
  testing without network namespaces (just multiple interfaces on the
  same machine) and see if we can still measure the same overhead.
  Just to confirm the overhead is not a network namespace related
  thing.

  I know we can configure the same case with bridging and I am fairly
  confident that we will see the same overhead without network
  namespaces.

  Off the top of my head I am insufficiently clever to think how we
  could configure the routing case without network namespaces,
  although we might be able to force it and if so it would be
  interesting to measure.
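
  As a sketch, the bridged case without namespaces (names and address
  made up) could be measured on a plain host with:

      brctl addbr br0
      brctl addif br0 eth0
      ip addr flush dev eth0                # the address moves to the bridge
      ip addr add 192.168.1.10/24 dev br0
      ip link set br0 up
      netperf -H benchserver -t TCP_STREAM -l 60 -c -C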

I will work to get the etun setup races fixed and to fix whatever
obvious feature deficiencies it has (like no configurable MTU support)
and see if I can get that pushed upstream.  That should make it easier
for other people to reproduce what we are seeing.

Eric
Re: L2 network namespace benchmarking (resend with Service Demand) [message #18079 is a reply to message #18065] Fri, 06 April 2007 14:25
ebiederm
Benjamin Thery <benjamin.thery@bull.net> writes:

> Eric W. Biederman wrote:

>> A couple of random thoughts in trying to understand the numbers you are
>> seeing.
>>
>> - Checksum offloading?
>>
>>   You have noted that with the bridge netfilter support disabled you
>>   are still seeing additional checksum overhead.  Just like you are
>>   seeing in the routing case.
>>
>>   Is it possible the problem is simply that etun doesn't support
>>   checksum offloading, while your normal test hardware does?
>
> Looks like you are 100% correct.
> I feel a bit stupid that I didn't think about this "small" difference between
> real NIC and etun.
>
> If I turn off checksum offloading on my physical NIC, the checksum "overhead"
> (load) measured by oprofile is about the same in both cases: when running
> netperf through a real NIC directly or first through an etun tunnel.

Interesting.  You can also 'enable' checksum offloading on etun with
ethtool, which should just tell the kernel not to do checksumming.  That
is a bad idea in general, but it might be useful in confirming where the
performance overhead is coming from.  When used with routing I believe
it is safe; when used with bridging I don't know.
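
For example (assuming the device is named etun0 and that the driver
accepts the operation):

    ethtool -K etun0 tx on   # claim tx checksum offload; the stack then
                             # skips software checksumming on this device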

Thinking about it, the ideal situation is to preserve skb->ip_summed if
it came from another device, instead of unconditionally setting it.
I need to take a good hard look at etun_xmit and make certain we
are dotting all of the i's and crossing all of the t's for best
performance and compatibility with the rest of the network stack.

Eric
Re: L2 network namespace benchmarking (resend with Service Demand) [message #18086 is a reply to message #18078] Fri, 06 April 2007 11:19
Benjamin Thery
Eric W. Biederman wrote:
> Daniel Lezcano <dlezcano@fr.ibm.com> writes:
> 
>> Hi,
>>
>> as suggested by Rick, I added the Service Demand results to the matrix.
> 
> A couple of random thoughts in trying to understand the numbers you are
> seeing.
> 
> - Checksum offloading?
> 
>   You have noted that with the bridge netfilter support disabled you
>   are still seeing additional checksum overhead.  Just like you are
>   seeing in the routing case.
> 
>   Is it possible the problem is simply that etun doesn't support
>   checksum offloading, while your normal test hardware does?

Looks like you are 100% correct.
I feel a bit stupid that I didn't think about this "small" difference
between real NIC and etun.

If I turn off checksum offloading on my physical NIC, the checksum
"overhead" (load) measured by oprofile is about the same in both cases:
when running netperf through a real NIC directly or first through an
etun tunnel.
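
For the record, the comparison amounts to something like this (device
and host names made up; the oprofile session setup is omitted):

    ethtool -K eth0 tx off rx off   # disable checksum offload on the NIC
    opcontrol --start
    netperf -H benchserver -t TCP_STREAM -l 60 -c -C
    opcontrol --stop && opreport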

Benjamin

> - Tagged VLANs?
>   
>   Currently you have tested bridging and routing to get the packets to
>   a network namespace.  Could you test tagged vlans?
> 
>   I'm just curious if we have anything in the network stack today that
>   will multiplex a NIC without measurable overhead.
> 
> - Without NETNS?
> 
>   We should probably see if we can set up the same configuration we are
>   testing without network namespaces (just multiple interfaces on the
>   same machine) and see if we can still measure the same overhead.
>   Just to confirm the overhead is not a network namespace related
>   thing.
> 
>   I know we can configure the same case with bridging and I am fairly
>   confident that we will see the same overhead without network
>   namespaces.
> 
>   Off the top of my head I am insufficiently clever to think how we
>   could configure the routing case without network namespaces,
>   although we might be able to force it and if so it would be
>   interesting to measure.
> 
> I will work to get the etun setup races fixed and to fix whatever
> obvious feature deficiencies it has (like no configurable MTU support)
> and see if I can get that pushed upstream.  That should make it easier
> for other people to reproduce what we are seeing.
> 
> Eric


-- 
B e n j a m i n   T h e r y  - BULL/DT/Open Software R&D

    http://www.bull.com