OpenVZ Forum


[PATCH 00/16] core network namespace support
Re: [PATCH 06/16] net: Add a network namespace parameter to struct sock [message #20107 is a reply to message #19973] Wed, 12 September 2007 09:58
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:21:37 -0600

> 
> Sockets need to get a reference to their network namespace,
> or possibly a simple hold if someone registers on the network
> namespace notifier and will free the sockets when the namespace
> is going to be destroyed.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied, thanks.
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 07/16] net: Make /proc/net per network namespace [message #20108 is a reply to message #19972] Wed, 12 September 2007 10:02
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:20:36 -0600

> 
> This patch makes /proc/net per network namespace.  It modifies the global
> variables proc_net and proc_net_stat to be per network namespace.
> The proc_net file helpers are modified to take a network namespace argument,
> and all of their callers are fixed to pass &init_net for that argument.
> This ensures that all of the /proc/net files are only visible and
> usable in the initial network namespace until the code behind them
> has been updated to handle multiple network namespaces.
> 
> Making /proc/net per namespace is necessary as at least some files
> in /proc/net depend upon the set of network devices which is per
> network namespace, and even more files in /proc/net have contents
> that are relevant to a single network namespace.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Patch applied, thanks.
Re: [PATCH 08/16] net: Make socket creation namespace safe. [message #20109 is a reply to message #19974] Wed, 12 September 2007 10:04
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:23:01 -0600

> 
> This patch passes in the namespace a new socket should be created in
> and has the socket code do the appropriate reference counting.  By
> virtue of this all socket create methods are touched.  In addition
> the socket create methods are modified so that they will fail if
> you attempt to create a socket in a non-default network namespace.
> 
> Failing if we attempt to create a socket outside of the default
> network namespace ensures that as we incrementally make the network stack
> network namespace aware we will not export functionality that someone
> has not audited and made certain is network namespace safe.
> This allows us to partially enable network namespaces before all of the
> exotic protocols are supported.
> 
> Any protocol layers I have missed will fail to compile because I now
> pass an extra parameter into the socket creation code.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Patch applied, thanks.
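
A minimal userspace sketch of the guard described in this patch, not the kernel code itself. The structs are cut-down stand-ins, demo_proto_create() is a hypothetical name, and the -EAFNOSUPPORT return value is an assumption about which error a create method would pick.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's struct net and struct sock. */
struct net { int refcount; };
struct sock { struct net *sk_net; };

static struct net init_net = { 1 };

/* Hypothetical protocol create method: it now receives the namespace the
 * socket should live in and refuses anything but the initial one. */
static int demo_proto_create(struct net *net, struct sock *sk)
{
	assert(net != NULL && sk != NULL);

	if (net != &init_net)
		return -EAFNOSUPPORT;	/* assumed error code: not audited yet */

	net->refcount++;		/* the socket pins its namespace */
	sk->sk_net = net;
	return 0;
}
```

Because the namespace is an explicit parameter, a protocol that was never converted simply fails to compile, which is the safety net the commit message mentions.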
Re: [PATCH 09/16] net: Initialize the network namespace of network devices. [message #20112 is a reply to message #19975] Wed, 12 September 2007 10:58
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:24:21 -0600

> 
> Except for carefully selected pseudo devices all network
> interfaces should start out in the initial network namespace.
> Ultimately it will be register_netdev that examines what
> dev->nd_net is set to and places a device in a network namespace.
> 
> This patch modifies alloc_netdev to initialize the network
> namespace a device is in with the initial network namespace.
> This gets it right for the vast majority of devices so their
> drivers need not be modified and for those few pseudo devices
> that need something different they can change this parameter
> before calling register_netdevice.
> 
> The network namespace parameter on a network device is not
> reference counted as the devices are inside of a network namespace
> and cannot remain in that namespace past the lifetime of the
> network namespace.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied to net-2.6.24, thanks.
Re: [PATCH 10/16] net: Make packet reception network namespace safe [message #20113 is a reply to message #19976] Wed, 12 September 2007 11:00
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:25:43 -0600

> 
> This patch modifies every packet receive function
> registered with dev_add_pack() to drop packets if they
> are not from the initial network namespace.
> 
> This should ensure that the various network stacks do
> not receive packets in anything but the initial network
> namespace until the code has been converted and is ready
> for them.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied to net-2.6.24, thanks.
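
As a rough illustration of the receive-side check, here is a userspace sketch under simplifying assumptions: demo_rcv() is a hypothetical handler, the structs are reduced to the one field that matters, and RX_SUCCESS/RX_DROP merely stand in for the kernel's return codes.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins; only the namespace pointer of the device matters. */
struct net { int id; };
struct net_device { struct net *nd_net; };
struct sk_buff { int len; };

static struct net init_net = { 0 };

enum { RX_SUCCESS = 0, RX_DROP = 1 };	/* illustrative return codes */

/* Hypothetical packet receive function of an unconverted protocol. */
static int demo_rcv(struct sk_buff *skb, struct net_device *dev)
{
	assert(skb != NULL && dev != NULL);

	/* Packets arriving on a device outside init_net are dropped. */
	if (dev->nd_net != &init_net)
		return RX_DROP;

	return RX_SUCCESS;	/* normal protocol processing would go here */
}
```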
Re: [PATCH 11/16] net: Make device event notification network namespace safe [message #20114 is a reply to message #19977] Wed, 12 September 2007 11:02
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:27:11 -0600

> 
> Every user of the network device notifiers is either a protocol
> stack or a pseudo device.  If a protocol stack that does not have
> support for multiple network namespaces receives an event for a
> device that is not in the initial network namespace it quite possibly
> can get confused and do the wrong thing.
> 
> To avoid problems until all of the protocol stacks are converted
> this patch modifies all netdev event handlers to ignore events on
> devices that are not in the initial network namespace.
> 
> As the rest of the code is made network namespace aware these
> checks can be removed.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied, thanks.
Re: [PATCH 12/16] net: Support multiple network namespaces with netlink [message #20115 is a reply to message #19978] Wed, 12 September 2007 11:06
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:28:27 -0600

> 
> Each netlink socket will live in exactly one network namespace;
> this includes the controlling kernel sockets.
> 
> This patch updates all of the existing netlink protocols
> to only support the initial network namespace.  Requests
> by clients in other namespaces will get -ECONNREFUSED,
> as they would if the kernel did not have the support for
> that netlink protocol compiled in.
> 
> As each netlink protocol is updated to be multiple network
> namespace safe it can register multiple kernel sockets
> to acquire a presence in the rest of the network namespaces.
> 
> The implementation in af_netlink is a simple filter implementation
> at hash table insertion and hash table look up time.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied to net-2.6.24, thanks.
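
The "simple filter at hash table insertion and look up time" can be sketched roughly as follows. This is a userspace toy, not af_netlink: the single table, the nl_sock layout and the nl_* helper names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct net { int id; };

/* Toy netlink socket, identified by (namespace, protocol, pid). */
struct nl_sock {
	struct net *net;
	int protocol;
	unsigned int pid;
	struct nl_sock *next;
};

#define NL_HASH_SIZE 16
static struct nl_sock *nl_table[NL_HASH_SIZE];

static unsigned int nl_hash(unsigned int pid)
{
	return pid % NL_HASH_SIZE;
}

/* Lookup only matches sockets that live in the caller's namespace. */
static struct nl_sock *nl_lookup(struct net *net, int protocol,
				 unsigned int pid)
{
	struct nl_sock *sk;

	for (sk = nl_table[nl_hash(pid)]; sk; sk = sk->next)
		if (sk->net == net && sk->protocol == protocol &&
		    sk->pid == pid)
			return sk;
	return NULL;
}

/* Insertion applies the same filter, so the same pid may be bound once
 * per namespace. Returns 0 on success, -1 if the triple is taken. */
static int nl_insert(struct nl_sock *sk)
{
	assert(sk != NULL && sk->net != NULL);

	if (nl_lookup(sk->net, sk->protocol, sk->pid))
		return -1;
	sk->next = nl_table[nl_hash(sk->pid)];
	nl_table[nl_hash(sk->pid)] = sk;
	return 0;
}
```

A client in a namespace where no kernel socket was registered gets NULL from the lookup, which is the case the commit message maps to -ECONNREFUSED.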
Re: [PATCH 13/16] net: Make the device list and device lookups per namespace. [message #20122 is a reply to message #19979] Wed, 12 September 2007 11:39
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:35:46 -0600

> 
> This patch makes most of the generic device layer network
> namespace safe.  This patch makes dev_base_head a
> network namespace variable, and then it picks up
> a few associated variables.  The functions:
> dev_getbyhwaddr
> dev_getfirsthwbytype
> dev_get_by_flags
> dev_get_by_name
> __dev_get_by_name
> dev_get_by_index
> __dev_get_by_index
> dev_ioctl
> dev_ethtool
> dev_load
> wireless_process_ioctl
> 
> were modified to take a network namespace argument, and
> deal with it.
> 
> vlan_ioctl_set and brioctl_set were modified so their
> hooks will receive a network namespace argument.
> 
> So basically anything in the core of the network stack that was
> affected by the change of dev_base was modified to handle
> multiple network namespaces.  The rest of the network stack was
> simply modified to explicitly use &init_net the initial network
> namespace.  This can be fixed when those components of the network
> stack are modified to handle multiple network namespaces.
> 
> For now the ifindex generator is left global.
> 
> Fundamentally ifindex numbers are per namespace, or else
> we will have corner case problems with migration when
> we get that far.
> 
> At the same time there are assumptions in the network stack
> that the ifindex of a network device won't change.  Making
> the ifindex number global seems a good compromise until
> the network stack can cope with ifindex changes when
> you change namespaces, and the like.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied to net-2.6.24, thanks.
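
To make the shape of the change concrete, here is a userspace sketch of a per-namespace device list with a lookup that takes the namespace as an argument, while the ifindex generator stays global. The list layout and the demo_* names are assumptions; the kernel's real data structures are different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IFNAMSIZ 16

struct net_device {
	char name[IFNAMSIZ];
	int ifindex;
	struct net_device *next;
};

/* The device list head moves from a global into the namespace. */
struct net { struct net_device *dev_base_head; };

static int ifindex_counter;	/* still global, shared by all namespaces */

static void demo_register(struct net *net, struct net_device *dev,
			  const char *name)
{
	size_t n;

	assert(net != NULL && dev != NULL && name != NULL);

	n = strlen(name);
	if (n >= IFNAMSIZ)
		n = IFNAMSIZ - 1;
	memcpy(dev->name, name, n);
	dev->name[n] = '\0';

	dev->ifindex = ++ifindex_counter;
	dev->next = net->dev_base_head;
	net->dev_base_head = dev;
}

/* Lookups only walk the list of the namespace they are given. */
static struct net_device *demo_dev_get_by_name(struct net *net,
					       const char *name)
{
	struct net_device *dev;

	for (dev = net->dev_base_head; dev; dev = dev->next)
		if (strcmp(dev->name, name) == 0)
			return dev;
	return NULL;
}
```

Two namespaces can each hold an "eth0" without clashing, yet the devices still receive distinct ifindex values because the counter is shared.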

Re: [PATCH 14/16] net: Factor out __dev_alloc_name from dev_alloc_name [message #20124 is a reply to message #19980] Wed, 12 September 2007 11:49
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:36:56 -0600

> 
> When forcibly changing the network namespace of a device
> I need something that can generate a name for the device
> in the new namespace without overwriting the old name.
> 
> __dev_alloc_name provides me that functionality.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied to net-2.6.24, thanks.
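
The idea of generating a name into a caller-supplied buffer, leaving the device's current name untouched, might look like this in userspace. demo_alloc_name() is hypothetical and takes a plain prefix rather than a format string such as "eth%d"; the array of used names stands in for the target namespace's device list.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define IFNAMSIZ 16
#define MAX_UNITS 32	/* arbitrary search limit for this sketch */

/* Finds the lowest free "<prefix><unit>" name among the names already
 * used in the target namespace and writes it to buf.  Returns the unit
 * number, or -1 if none is free.  Nothing but buf is modified. */
static int demo_alloc_name(const char *const *names, int count,
			   const char *prefix, char *buf)
{
	int unit, i;

	assert(prefix != NULL && buf != NULL);

	for (unit = 0; unit < MAX_UNITS; unit++) {
		int taken = 0;

		snprintf(buf, IFNAMSIZ, "%s%d", prefix, unit);
		for (i = 0; i < count; i++)
			if (strcmp(names[i], buf) == 0)
				taken = 1;
		if (!taken)
			return unit;
	}
	return -1;
}
```

The caller can then decide whether to commit the name in buf to the device, which is what a forced namespace move needs.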
Re: [PATCH 15/16] net: Implement network device movement between namespaces [message #20125 is a reply to message #19981] Wed, 12 September 2007 11:54
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:38:46 -0600

> 
> This patch introduces NETIF_F_NETNS_LOCAL, a flag to indicate
> a network device is local to a single network namespace and
> should never be moved.  This is useful for pseudo devices that
> need an instance in each network namespace (like the loopback
> device) and for any device we find that cannot handle multiple
> network namespaces, so we may trap them in the initial network
> namespace.
> 
> This patch introduces the function dev_change_net_namespace,
> which is used to move a network device from one network
> namespace to another.  To the network device nothing
> special appears to happen; to the components of the network
> stack it appears as if the network device was unregistered
> in the network namespace it is in, and a new device
> was registered in the network namespace the device
> was moved to.
> 
> This patch sets up a namespace device destructor that
> upon the exit of a network namespace moves all of the
> movable network devices to the initial network namespace
> so they are not lost.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied to net-2.6.24
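
A userspace sketch of the two pieces described above: a move that respects the "namespace local" flag, and an exit hook that sends movable devices back to the initial namespace. The flag's bit value, the -EINVAL return and the demo_* names are assumptions, and the unregister/register notifications are left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NETIF_F_NETNS_LOCAL 0x1	/* illustrative bit value only */

struct net { int id; };
struct net_device {
	unsigned long features;
	struct net *nd_net;
};

static struct net init_net = { 0 };

/* Moves dev to net unless the device is pinned to its namespace. */
static int demo_change_net_namespace(struct net_device *dev,
				     struct net *net)
{
	assert(dev != NULL && net != NULL);

	if (dev->features & NETIF_F_NETNS_LOCAL)
		return -EINVAL;		/* assumed error code */
	dev->nd_net = net;
	return 0;
}

/* On namespace exit, every movable device in the dying namespace is
 * pushed back to init_net.  Returns how many devices were moved. */
static int demo_net_exit(struct net *dying, struct net_device **devs,
			 int count)
{
	int i, moved = 0;

	for (i = 0; i < count; i++) {
		if (devs[i]->nd_net != dying)
			continue;
		if (demo_change_net_namespace(devs[i], &init_net) == 0)
			moved++;
	}
	return moved;
}
```

Devices flagged as local (the loopback case) stay behind and would be destroyed with their namespace.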
Re: [PATCH 16/16] net: netlink support for moving devices between network namespaces. [message #20126 is a reply to message #19982] Wed, 12 September 2007 11:57
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:43:44 -0600

> 
> The simplest thing to implement is moving network devices between
> namespaces.  However with the same attribute IFLA_NET_NS_PID we can
> easily implement creating devices in the destination network
> namespace as well.  That is a little bit trickier, though, so this
> patch sticks to what is simple and easy.
> 
> A pid is used to identify a process that happens to be a member
> of the network namespace we want to move the network device to.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied to net-2.6.24, thanks.
Re: [PATCH 17/16] net: Disable netfilter sockopts when not in the initial network namespace [message #20127 is a reply to message #19983] Wed, 12 September 2007 11:59
davem
From: ebiederm@xmission.com (Eric W. Biederman)
Date: Sat, 08 Sep 2007 15:47:12 -0600

> 
> Until we support multiple network namespaces with netfilter only allow
> netfilter configuration in the initial network namespace.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Applied to net-2.6.24, thanks.
Re: [PATCH 17/16] net: Disable netfilter sockopts when not in the initial network namespace [message #20129 is a reply to message #20127] Wed, 12 September 2007 12:03
davem
I added the following patch to net-2.6.24 to kill a warning
since net_alloc() has no users (yet).

commit f444fa9b5d70b3d431e1554e0975e012514c39f3
Author: David S. Miller <davem@kimchee.(none)>
Date:   Wed Sep 12 14:01:08 2007 +0200

    [NET]: #if 0 out net_alloc() for now.
    
    We will undo this once it is actually used.
    
    Signed-off-by: David S. Miller <davem@davemloft.net>

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index f259a9b..1fc513c 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -32,10 +32,12 @@ void net_unlock(void)
 	mutex_unlock(&net_list_mutex);
 }
 
+#if 0
 static struct net *net_alloc(void)
 {
 	return kmem_cache_alloc(net_cachep, GFP_KERNEL);
 }
+#endif
 
 static void net_free(struct net *net)
 {
Re: [PATCH 07/16] net: Make /proc/net per network namespace [message #20131 is a reply to message #20108] Wed, 12 September 2007 12:12
Daniel Lezcano
David Miller wrote:
> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Sat, 08 Sep 2007 15:20:36 -0600
> 
>> This patch makes /proc/net per network namespace.  It modifies the global
>> variables proc_net and proc_net_stat to be per network namespace.
>> The proc_net file helpers are modified to take a network namespace argument,
>> and all of their callers are fixed to pass &init_net for that argument.
>> This ensures that all of the /proc/net files are only visible and
>> usable in the initial network namespace until the code behind them
>> has been updated to handle multiple network namespaces.
>>
>> Making /proc/net per namespace is necessary as at least some files
>> in /proc/net depend upon the set of network devices which is per
>> network namespace, and even more files in /proc/net have contents
>> that are relevant to a single network namespace.
>>
>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> 
> Patch applied, thanks.

Hi Dave,

It seems that fs/proc/proc_net.c was not added to the git repository.

Regards.

	-- Daniel
Re: [PATCH 17/16] net: Disable netfilter sockopts when not in the initial network namespace [message #20133 is a reply to message #20129] Wed, 12 September 2007 12:16
ebiederm
David Miller <davem@davemloft.net> writes:

> I added the following patch to net-2.6.24 to kill a warning
> since net_alloc() has no users (yet).

Reasonable, and thanks for merging these.

Having a solid place to start helps a lot.

I will see if I can get the /proc races fixed shortly.

Eric
Re: [PATCH 07/16] net: Make /proc/net per network namespace [message #20134 is a reply to message #20131] Wed, 12 September 2007 12:19
davem
From: Daniel Lezcano <dlezcano@fr.ibm.com>
Date: Wed, 12 Sep 2007 14:12:04 +0200

> It seems that fs/proc/proc_net.c was not added to the git repository.

Fixed, thanks for catching that.
Re: [PATCH 06/16] net: Add a network namespace parameter to struct sock [message #20546 is a reply to message #19973] Thu, 20 September 2007 12:55
den
Eric W. Biederman wrote:
> Sockets need to get a reference to their network namespace,
> or possibly a simple hold if someone registers on the network
> namespace notifier and will free the sockets when the namespace
> is going to be destroyed.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> ---
>  include/net/inet_timewait_sock.h |    1 +
>  include/net/sock.h               |    3 +++
>  2 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
> index 47d52b2..abaff05 100644
> --- a/include/net/inet_timewait_sock.h
> +++ b/include/net/inet_timewait_sock.h
> @@ -115,6 +115,7 @@ struct inet_timewait_sock {
>  #define tw_refcnt		__tw_common.skc_refcnt
>  #define tw_hash			__tw_common.skc_hash
>  #define tw_prot			__tw_common.skc_prot
> +#define tw_net			__tw_common.skc_net


This place is very tricky indeed. If we keep the namespace until
timewait bucket death, we'll keep the namespace alive for at least 5
_minutes_ after all of its processes die.

If we stop a VE (in OpenVZ terms) and restart it, we'll 100% have an
_OLD_ namespace with all of its buckets still showing :( So, in OpenVZ we
use a VE number instead of a pointer to a VE. Additionally, on VE death
we can wipe all TW buckets. From the outside world, a VE start/stop looks
very much like a computer being powered on/off.

Regards,
	Den
Re: [PATCH 06/16] net: Add a network namespace parameter to struct sock [message #20549 is a reply to message #20546] Thu, 20 September 2007 13:20
Daniel Lezcano
Denis V. Lunev wrote:
> Eric W. Biederman wrote:
>> Sockets need to get a reference to their network namespace,
>> or possibly a simple hold if someone registers on the network
>> namespace notifier and will free the sockets when the namespace
>> is going to be destroyed.
>>
>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>> ---
>>  include/net/inet_timewait_sock.h |    1 +
>>  include/net/sock.h               |    3 +++
>>  2 files changed, 4 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
>> index 47d52b2..abaff05 100644
>> --- a/include/net/inet_timewait_sock.h
>> +++ b/include/net/inet_timewait_sock.h
>> @@ -115,6 +115,7 @@ struct inet_timewait_sock {
>>  #define tw_refcnt		__tw_common.skc_refcnt
>>  #define tw_hash			__tw_common.skc_hash
>>  #define tw_prot			__tw_common.skc_prot
>> +#define tw_net			__tw_common.skc_net
> 
> 
> This place is very tricky indeed. If we keep the namespace until
> timewait bucket death, we'll keep the namespace alive for at least 5
> _minutes_ after all of its processes die.

Yes, that's right. And to me that makes total sense. The namespace
should not be destroyed while it is still referenced somewhere.

> If we stop a VE (in OpenVZ terms) and restart it, we'll 100% have an
> _OLD_ namespace with all of its buckets still showing :( So, in OpenVZ we
> use a VE number instead of a pointer to a VE. Additionally, on VE death
> we can wipe all TW buckets. From the outside world, a VE start/stop looks
> very much like a computer being powered on/off.

That makes sense too. But if you wipe out the sockets when stopping the
VE, where is the problem with the restart?
Re: [PATCH 06/16] net: Add a network namespace parameter to struct sock [message #20562 is a reply to message #20549] Fri, 21 September 2007 05:04
den
Daniel Lezcano wrote:
>> This place is very tricky indeed. If we keep the namespace until
>> timewait bucket death, we'll keep the namespace alive for at least 5
>> _minutes_ after all of its processes die.
> 
> Yes, that's right. And to me that makes total sense. The namespace
> should not be destroyed while it is still referenced somewhere.

If all incoming interfaces are stopped, and surely they are, there will
be no incoming packets. So it is completely pointless to keep a TW bucket
for 5 minutes. This is a waste of resources.

>> If we stop a VE (in OpenVZ terms) and restart it, we'll 100% have an
>> _OLD_ namespace with all of its buckets still showing :( So, in OpenVZ we
>> use a VE number instead of a pointer to a VE. Additionally, on VE death
>> we can wipe all TW buckets. From the outside world, a VE start/stop looks
>> very much like a computer being powered on/off.
> 
> That makes sense too. But if you wipe out the sockets when stopping the
> VE, where is the problem with the restart?
> 
> 

Classic chicken-and-egg problem. If a TW bucket holds the namespace, how
do we decide when to destroy it? :(
Re: [PATCH 06/16] net: Add a network namespace parameter to struct sock [message #20564 is a reply to message #20562] Fri, 21 September 2007 05:58
ebiederm
"Denis V. Lunev" <den@sw.ru> writes:

> Daniel Lezcano wrote:
>>> This place is very tricky indeed. If we keep the namespace until
>>> timewait bucket death, we'll keep the namespace alive for at least 5
>>> _minutes_ after all of its processes die.
>> 
>> Yes, that's right. And to me that makes total sense. The namespace
>> should not be destroyed while it is still referenced somewhere.
>
> If all incoming interfaces are stopped, and surely they are, there will
> be no incoming packets. So it is completely pointless to keep a TW bucket
> for 5 minutes. This is a waste of resources.

Agreed, at least in principle.

>>> If we stop a VE (in OpenVZ terms) and restart it, we'll 100% have an
>>> _OLD_ namespace with all of its buckets still showing :( So, in OpenVZ we
>>> use a VE number instead of a pointer to a VE. Additionally, on VE death
>>> we can wipe all TW buckets. From the outside world, a VE start/stop looks
>>> very much like a computer being powered on/off.
>> 
>> That makes sense too. But if you wipe out the sockets when stopping the
>> VE, where is the problem with the restart?
>> 
>> 
>
> Classic chicken-and-egg problem. If a TW bucket holds the namespace, how
> do we decide when to destroy it? :(

TW buckets must have a reference to a namespace because otherwise
we cannot interpret them.

However if need be we can just do hold_net, release_net style reference
counting, if we know that when the namespace exits we will flush all
of those sockets.

I looked and it doesn't appear that I am actually initializing
this field in my current patchset.  :(
- So either my skim through my code is wrong.
- Something got dropped in keeping the patches up to date.
- This was never addressed :(

It would be a good idea to see if we can make certain that we are
initializing the field right now (at least to &init_net).  That
way we won't get into a subtle problem later when we try and use it.

Eric
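
One way to picture the hold_net/release_net idea from this message, as a userspace sketch with assumed semantics: a strong count keeps the namespace alive, while timewait buckets only take a weak hold that must be flushed when the namespace exits. The two-counter struct, the fixed-size table and every demo_* name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TW 8

struct net {
	int count;	/* strong references: keep the namespace alive */
	int use_count;	/* weak holds: must be gone when it is destroyed */
	int alive;
};

struct tw_bucket { struct net *tw_net; };

static struct tw_bucket *tw_table[MAX_TW];

static void demo_hold_net(struct net *net)    { net->use_count++; }
static void demo_release_net(struct net *net) { net->use_count--; }

/* A timewait bucket records its namespace but does not pin it. */
static int demo_tw_add(struct tw_bucket *tw, struct net *net)
{
	int i;

	for (i = 0; i < MAX_TW; i++) {
		if (!tw_table[i]) {
			tw->tw_net = net;
			demo_hold_net(net);
			tw_table[i] = tw;
			return 0;
		}
	}
	return -1;	/* table full */
}

/* Namespace exit wipes every bucket that still points at it. */
static void demo_tw_flush(struct net *net)
{
	int i;

	for (i = 0; i < MAX_TW; i++) {
		if (tw_table[i] && tw_table[i]->tw_net == net) {
			demo_release_net(net);
			tw_table[i]->tw_net = NULL;
			tw_table[i] = NULL;
		}
	}
}

/* Dropping the last strong reference tears the namespace down. */
static void demo_put_net(struct net *net)
{
	if (--net->count > 0)
		return;
	demo_tw_flush(net);
	assert(net->use_count == 0);	/* no weak holder may survive */
	net->alive = 0;
}
```

With this split the namespace dies as soon as its last process-level reference goes away, instead of waiting for the timewait timers to expire.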
Re: [PATCH 06/16] net: Add a network namespace parameter to struct sock [message #20575 is a reply to message #20564] Fri, 21 September 2007 07:30
Daniel Lezcano
Eric W. Biederman wrote:
> "Denis V. Lunev" <den@sw.ru> writes:
> 
>> Daniel Lezcano wrote:
>>>> This place is very tricky indeed. If we keep the namespace until
>>>> timewait bucket death, we'll keep the namespace alive for at least 5
>>>> _minutes_ after all of its processes die.
>>> Yes, that's right. And to me that makes total sense. The namespace
>>> should not be destroyed while it is still referenced somewhere.
>> If all incoming interfaces are stopped, and surely they are, there will
>> be no incoming packets. So it is completely pointless to keep a TW bucket
>> for 5 minutes. This is a waste of resources.
> 
> Agreed, at least in principle.
>>>> If we stop a VE (in OpenVZ terms) and restart it, we'll 100% have an
>>>> _OLD_ namespace with all of its buckets still showing :( So, in OpenVZ we
>>>> use a VE number instead of a pointer to a VE. Additionally, on VE death
>>>> we can wipe all TW buckets. From the outside world, a VE start/stop looks
>>>> very much like a computer being powered on/off.
>>> That makes sense too. But if you wipe out the sockets when stopping the
>>> VE, where is the problem with the restart?
>>>
>>>
>> Classic chicken-and-egg problem. If a TW bucket holds the namespace, how
>> do we decide when to destroy it? :(
> 
> TW buckets must have a reference to a namespace because otherwise
> we cannot interpret them.
> 
> However if need be we can just do hold_net, release_net style reference
> counting, if we know that when the namespace exits we will flush all
> of those sockets.
> 
> I looked and it doesn't appear that I am actually initializing
> this field in my current patchset.  :(
> - So either my skim through my code is wrong.
> - Something got dropped in keeping the patches up to date.
> - This was never addressed :(
> It would be a good idea to see if we can make certain that we are
> initializing the field right now (at least to &init_net).  That
> way we won't get into a subtle problem later when we try and use it.

After Denis's remark I looked at the code and noticed that too.
I am currently doing some testing to check that. In a few hours I will
provide a patchset that holds a network namespace reference for the
timewait sockets and wipes out the timewait sockets of a network namespace.

BTW, the orphan sockets will lead to a similar problem ...

   -- Daniel