OpenVZ Forum


Home » Mailing lists » Devel » [patch 1/4] Network namespaces: cleanup of dev_base list use
Re: [patch 3/4] Network namespaces: IPv4 FIB/routing in namespaces [message #4112 is a reply to message #4111] Wed, 28 June 2006 17:10 Go to previous messageGo to next message
Ben Greear is currently offline  Ben Greear
Messages: 30
Registered: June 2006
Member
Daniel Lezcano wrote:
> Kirill Korotaev wrote:
>
>>>>>> Structures related to IPv4 rounting (FIB and routing cache)
>>>>>> are made per-namespace.
>>>>
>>>>
>>>>
>>>> Hi Andrey,
>>>>
>>>> if the ressources are private to the namespace, how do you will
>>>> handle NFS mounted before creating the network namespace ? Do you
>>>> take care of that or simply assume you can't access NFS anymore ?
>>>
>>>
>>>
>>>
>>> This is a question that brings up another level of interaction between
>>> networking and the rest of kernel code.
>>> Solution that I use now makes the NFS communication part always run in
>>> the root namespace. This is discussable, of course, but it's a far more
>>> complicated matter than just device lists or routing :)
>>
>>
>> if we had containers (not namespaces) then it would be also possible
>> to run NFS in context of the appropriate container and thus each user
>> could mount NFS itself with correct networking context.

With a relatively small patch, I was able to make NFS bind to a particular
local IP (poor man's namespace with existing code). I also changed it so
that multiple mounts to the same destination (and with unique local mount
points) are treated as unique mounts. This patch was done so that I could
stress test NFS servers, but similar logic might work for namespace isolation
as well...

Ben

--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
Re: Network namespaces a path to mergable code. [message #4115 is a reply to message #4108] Wed, 28 June 2006 17:22 Go to previous messageGo to next message
Andrey Savochkin is currently offline  Andrey Savochkin
Messages: 47
Registered: December 2005
Member
Hi Eric,

On Wed, Jun 28, 2006 at 10:51:26AM -0600, Eric W. Biederman wrote:
> Andrey Savochkin <saw@swsoft.com> writes:
>
> > One possible option to resolve this question is to show 2 relatively short
> > patches just introducing namespaces for sockets in 2 ways: with explicit
> > function parameters and using implicit current context.
> > Then people can compare them and vote.
> > Do you think it's worth the effort?
>
> Given that we have two strong opinions in different directions I think it
> is worth the effort to resolve this.

Do you have time to extract necessary parts of your old patch?
Or you aren't afraid of letting me draft an alternative version of socket
namespaces basing on your code? :)

>
> In a slightly different vein your second patch introduced a lot
> of #ifdef CONFIG_NET_NS in C files. That is something we need to look closely
> at.
>
> So I think the abstraction that we use to access per network namespace
> variables needs some work if we are going to allow the ability to compile
> out all of the namespace code. The explicit versus implicit lookup is just
> one dimension of that problem.

This is a good comment.

Those ifdef's mostly correspond to places where we walk over lists
and need to filter-out entities not belonging to a specific namespace.
Those places about the same in your and my implementation.
We can think what we can do with them.
One trick that I used on several occasions is net_ns_same macro
which doesn't evalute its arguments if CONFIG_NET_NS not defined,
and thus can be used without ifdef's.

Returning to implicit vs explicit function arguments, I belive that implicit
arguments are more promising in having zero impact on the code when
CONFIG_NET_NS is disabled.
Functions like inet_addr_type will translate into exactly the same code as
they did without net namespace patches.

>
> >> I'm still curious why many of those chunks can't use existing helper
> >> functions, to be cleaned up.
> >
> > What helper functions are you referring to?
>
> Basically most of the device list walker functions live in.
> net/core/dev.c
>
> I don't know if the cases you fixed could have used any of those
> helper functions but it certainly has me asking that question.
>
> A general pattern that happens in cleanups is the discovery
> that code using an old interface in a problematic way really
> could be done much better another way. I didn't dig enough
> to see if that was the case in any of the code that you changed.

Well, there is obvious improvement of this kind: many protocols walk over
device list to find devices with non-NULL protocol specific pointers.
For example, IPv6, decnet and others do it on module unloading to clean up.
Those places just ask for some simpler standard way of doing it, but I wasn't
bold enough for such radical change.
Do you think I should try?

Best regards

Andrey
Re: Network namespaces a path to mergable code. [message #4116 is a reply to message #4115] Wed, 28 June 2006 17:40 Go to previous messageGo to next message
Herbert Poetzl is currently offline  Herbert Poetzl
Messages: 239
Registered: February 2006
Senior Member
On Wed, Jun 28, 2006 at 09:22:40PM +0400, Andrey Savochkin wrote:
> Hi Eric,
>
> On Wed, Jun 28, 2006 at 10:51:26AM -0600, Eric W. Biederman wrote:
> > Andrey Savochkin <saw@swsoft.com> writes:
> >
> > > One possible option to resolve this question is to show 2
> > > relatively short patches just introducing namespaces for sockets
> > > in 2 ways: with explicit function parameters and using implicit
> > > current context. Then people can compare them and vote. Do you
> > > think it's worth the effort?
> >
> > Given that we have two strong opinions in different directions I
> > think it is worth the effort to resolve this.
>
> Do you have time to extract necessary parts of your old patch? Or you
> aren't afraid of letting me draft an alternative version of socket
> namespaces basing on your code? :)
>
> > In a slightly different vein your second patch introduced a lot of
> > #ifdef CONFIG_NET_NS in C files. That is something we need to look
> > closely at.
> >
> > So I think the abstraction that we use to access per network
> > namespace variables needs some work if we are going to allow the
> > ability to compile out all of the namespace code. The explicit
> > versus implicit lookup is just one dimension of that problem.
> This is a good comment.
>
> Those ifdef's mostly correspond to places where we walk over lists and
> need to filter-out entities not belonging to a specific namespace.
> Those places about the same in your and my implementation. We can
> think what we can do with them. One trick that I used on several
> occasions is net_ns_same macro which doesn't evalute its arguments if
> CONFIG_NET_NS not defined, and thus can be used without ifdef's.

yes, I think almost all of those cases can be avoided
while making the code even more readable by using
proper preprocessor (or even inline) mechanisms

> Returning to implicit vs explicit function arguments, I belive that
> implicit arguments are more promising in having zero impact on the
> code when CONFIG_NET_NS is disabled. Functions like inet_addr_type
> will translate into exactly the same code as they did without net
> namespace patches.

maybe a preprocessor wrapper can help here too ...

> > >> I'm still curious why many of those chunks can't use existing helper
> > >> functions, to be cleaned up.
> > >
> > > What helper functions are you referring to?
> >
> > Basically most of the device list walker functions live in.
> > net/core/dev.c
> >
> > I don't know if the cases you fixed could have used any of those
> > helper functions but it certainly has me asking that question.
> >
> > A general pattern that happens in cleanups is the discovery
> > that code using an old interface in a problematic way really
> > could be done much better another way. I didn't dig enough
> > to see if that was the case in any of the code that you changed.
>
> Well, there is obvious improvement of this kind: many protocols walk
> over device list to find devices with non-NULL protocol specific
> pointers. For example, IPv6, decnet and others do it on module
> unloading to clean up. Those places just ask for some simpler standard
> way of doing it, but I wasn't bold enough for such radical change.

> Do you think I should try?

IMHO it could not hurt to have some kind of protocol
helper library functions or macros ...

best,
Herbert

> Best regards
>
> Andrey
Re: Network namespaces a path to mergable code. [message #4117 is a reply to message #4115] Wed, 28 June 2006 17:50 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Andrey Savochkin <saw@swsoft.com> writes:

>> A general pattern that happens in cleanups is the discovery
>> that code using an old interface in a problematic way really
>> could be done much better another way. I didn't dig enough
>> to see if that was the case in any of the code that you changed.
>
> Well, there is obvious improvement of this kind: many protocols walk over
> device list to find devices with non-NULL protocol specific pointers.
> For example, IPv6, decnet and others do it on module unloading to clean up.
> Those places just ask for some simpler standard way of doing it, but I wasn't
> bold enough for such radical change.
> Do you think I should try?

It probably makes sense to asses that after the patches are split up.
Unless you run into something obvious.

Eric
Re: Network namespaces a path to mergable code. [message #4118 is a reply to message #4115] Wed, 28 June 2006 18:11 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Andrey Savochkin <saw@swsoft.com> writes:

>> In a slightly different vein your second patch introduced a lot
>> of #ifdef CONFIG_NET_NS in C files. That is something we need to look closely
>> at.
>>
>> So I think the abstraction that we use to access per network namespace
>> variables needs some work if we are going to allow the ability to compile
>> out all of the namespace code. The explicit versus implicit lookup is just
>> one dimension of that problem.
>
> This is a good comment.
>
> Those ifdef's mostly correspond to places where we walk over lists
> and need to filter-out entities not belonging to a specific namespace.
> Those places about the same in your and my implementation.
> We can think what we can do with them.
> One trick that I used on several occasions is net_ns_same macro
> which doesn't evalute its arguments if CONFIG_NET_NS not defined,
> and thus can be used without ifdef's.
>
> Returning to implicit vs explicit function arguments, I belive that implicit
> arguments are more promising in having zero impact on the code when
> CONFIG_NET_NS is disabled.
> Functions like inet_addr_type will translate into exactly the same code as
> they did without net namespace patches.

Which brings us to a basic question. Does it make sense to have
a define that completely disables namespace support.

I know all of the simple namespaces have been implemented like that,
and it was relatively easy there. I'm not at all certain in the long
term we want a configuration option. Especially if simply enabling
the code doesn't have an impact on performance. Which I think is
a merge requirement anyway.

As for inet_addr_type and friends I do agree that implicit arguments
make for an easier implementation of CONFIG_NET_NS. My gut feel
is though that the code with explicit arguments is probably more
comprehensible in the long term. Especially as we find more weird
exceptions where the process we are running in does not have the correct
network namespace.

In general unnecessary CONFIG options are a problem because they make
the entire testing process much harder and make the code harder to
write (so that both cases work and work cleanly).

So my feeling is that we actually want to kill all of those CONFIG_XXX_NS
options.

Which simply leaves us with the problem of implementing the code cleanly.

Eric
Re: Network namespaces a path to mergable code. [message #4120 is a reply to message #4115] Wed, 28 June 2006 18:14 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Andrey Savochkin <saw@swsoft.com> writes:

> Hi Eric,
>
> On Wed, Jun 28, 2006 at 10:51:26AM -0600, Eric W. Biederman wrote:
>> Andrey Savochkin <saw@swsoft.com> writes:
>>
>> > One possible option to resolve this question is to show 2 relatively short
>> > patches just introducing namespaces for sockets in 2 ways: with explicit
>> > function parameters and using implicit current context.
>> > Then people can compare them and vote.
>> > Do you think it's worth the effort?
>>
>> Given that we have two strong opinions in different directions I think it
>> is worth the effort to resolve this.
>
> Do you have time to extract necessary parts of your old patch?
> Or you aren't afraid of letting me draft an alternative version of socket
> namespaces basing on your code? :)

I'm not terribly afraid. I can always say you did it wrong. :)

I don't think I am going to have time today. But since this conversation
is slowing down and we are to getting into the technical details.
I will try and find some time.

Eric
Re: Network namespaces a path to mergable code. [message #4122 is a reply to message #4120] Wed, 28 June 2006 18:51 Go to previous messageGo to next message
Andrey Savochkin is currently offline  Andrey Savochkin
Messages: 47
Registered: December 2005
Member
On Wed, Jun 28, 2006 at 12:14:41PM -0600, Eric W. Biederman wrote:
> Andrey Savochkin <saw@swsoft.com> writes:
>
> > On Wed, Jun 28, 2006 at 10:51:26AM -0600, Eric W. Biederman wrote:
> >> Andrey Savochkin <saw@swsoft.com> writes:
> >>
> >> > One possible option to resolve this question is to show 2 relatively short
> >> > patches just introducing namespaces for sockets in 2 ways: with explicit
> >> > function parameters and using implicit current context.
> >> > Then people can compare them and vote.
> >> > Do you think it's worth the effort?
> >>
> >> Given that we have two strong opinions in different directions I think it
> >> is worth the effort to resolve this.
> >
> > Do you have time to extract necessary parts of your old patch?
> > Or you aren't afraid of letting me draft an alternative version of socket
> > namespaces basing on your code? :)
>
> I'm not terribly afraid. I can always say you did it wrong. :)

:)

> I don't think I am going to have time today. But since this conversation
> is slowing down and we are to getting into the technical details.
> I will try and find some time.

Good.
I'll focus on my part then.

Andrey
Re: Network namespaces a path to mergable code. [message #4123 is a reply to message #4082] Wed, 28 June 2006 21:53 Go to previous messageGo to next message
Daniel Lezcano is currently offline  Daniel Lezcano
Messages: 417
Registered: June 2006
Senior Member
Andrey Savochkin wrote:

> Ok, fine.
> Now I'm working on socket code.
>
> We still have a question about implicit vs explicit function parameters.
> This question becomes more important for sockets: if we want to allow to use
> sockets belonging to namespaces other than the current one, we need to do
> something about it.
>
> One possible option to resolve this question is to show 2 relatively short
> patches just introducing namespaces for sockets in 2 ways: with explicit
> function parameters and using implicit current context.
> Then people can compare them and vote.
> Do you think it's worth the effort?
>

The attached patch can have some part interesting for you for the socket
tagging. It is in the IPV4 isolation (part 5/6). With this and the
private routing table you will probably have a good IPV4 isolation.
Re: Network namespaces a path to mergable code. [message #4124 is a reply to message #4123] Wed, 28 June 2006 22:54 Go to previous messageGo to next message
James Morris is currently offline  James Morris
Messages: 10
Registered: March 2006
Junior Member
On Wed, 28 Jun 2006, Daniel Lezcano wrote:

> The attached patch can have some part interesting for you for the socket
> tagging. It is in the IPV4 isolation (part 5/6). With this and the private
> routing table you will probably have a good IPV4 isolation.

Please send patches inline, do not attach them.

(Perhaps we should have a filter on vger which drops emails with
attachements).

All of this needs to be done in a way where it can be entirely disabled at
compile time, so there is zero overhead for people who don't want
network namespaces.


- James
--
James Morris
<jmorris@namei.org>
Re: Network namespaces a path to mergable code. [message #4125 is a reply to message #4124] Thu, 29 June 2006 00:19 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
James Morris <jmorris@namei.org> writes:

> On Wed, 28 Jun 2006, Daniel Lezcano wrote:
>
>> The attached patch can have some part interesting for you for the socket
>> tagging. It is in the IPV4 isolation (part 5/6). With this and the private
>> routing table you will probably have a good IPV4 isolation.
>
> Please send patches inline, do not attach them.
>
> (Perhaps we should have a filter on vger which drops emails with
> attachements).
>
> All of this needs to be done in a way where it can be entirely disabled at
> compile time, so there is zero overhead for people who don't want
> network namespaces.

I agree with the principle of no overhead.

The goal is an implementation that has no measurable overhead when
there is only one network namespace.

If that goal is achieved and you can compile in the network namespace
code and not measure overhead there should be no need for a compile
time option.

Eric
Re: Network namespaces a path to mergable code. [message #4126 is a reply to message #4123] Thu, 29 June 2006 00:25 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Daniel Lezcano <dlezcano@fr.ibm.com> writes:

> Andrey Savochkin wrote:
>
>> Ok, fine.
>> Now I'm working on socket code.
>> We still have a question about implicit vs explicit function parameters.
>> This question becomes more important for sockets: if we want to allow to use
>> sockets belonging to namespaces other than the current one, we need to do
>> something about it.
>> One possible option to resolve this question is to show 2 relatively short
>> patches just introducing namespaces for sockets in 2 ways: with explicit
>> function parameters and using implicit current context.
>> Then people can compare them and vote.
>> Do you think it's worth the effort?
>>
>
> The attached patch can have some part interesting for you for the socket
> tagging. It is in the IPV4 isolation (part 5/6). With this and the private
> routing table you will probably have a good IPV4 isolation.
> This patch partially isolates ipv4 by adding the network namespace
> structure in the structure sock, bind bucket and skbuf.

Ugh. skbuf sounds very wrong. Per packet overhead?

> When a socket
> is created, the pointer to the network namespace is stored in the
> struct sock and the socket belongs to the namespace by this way. That
> allows to identify sockets related to a namespace for lookup and
> procfs.
>
> The lookup is extended with a network namespace pointer, in
> order to identify listen points binded to the same port. That allows
> to have several applications binded to INADDR_ANY:port in different
> network namespace without conflicting. The bind is checked against
> port and network namespace.

Yes. If we don't duplicate the hash table we need to extend the lookup.

> When an outgoing packet has the loopback destination addres, the
> skbuff is filled with the network namespace. So the loopback packets
> never go outside the namespace. This approach facilitate the migration
> of loopback because identification is done by network namespace and
> not by address. The loopback has been benchmarked by tbench and the
> overhead is roughly 1.5 %

Ugh. 1.5% is noticeable.

I think it is cheaper to have one loopback device per namespace.
Which removes the need for a skbuff tag.

Eric
Re: Network namespaces a path to mergable code. [message #4138 is a reply to message #4126] Thu, 29 June 2006 09:42 Go to previous message
Daniel Lezcano is currently offline  Daniel Lezcano
Messages: 417
Registered: June 2006
Senior Member
Eric W. Biederman wrote:
>>When an outgoing packet has the loopback destination addres, the
>>skbuff is filled with the network namespace. So the loopback packets
>>never go outside the namespace. This approach facilitate the migration
>>of loopback because identification is done by network namespace and
>>not by address. The loopback has been benchmarked by tbench and the
>>overhead is roughly 1.5 %
>
>
> Ugh. 1.5% is noticeable.

We will see with all private network namespace ...
>
> I think it is cheaper to have one loopback device per namespace.
> Which removes the need for a skbuff tag.

Yes, probably.
Previous Topic: Re: [patch 2/6] [Network namespace] Network device sharing by view
Next Topic: Re: [patch 2/6] [Network namespace] Network device sharing by view
Goto Forum:
  


Current Time: Fri Sep 13 00:52:45 GMT 2024

Total time taken to generate the page: 0.05068 seconds