Re: [RFC][PATCH 11/16] Enable cloning pid namespace [message #18643 is a reply to message #18620]
Thu, 24 May 2007 14:59
serue
Quoting sukadev@us.ibm.com (sukadev@us.ibm.com):
>
> Subject: Enable cloning pid namespace
>
> From: Sukadev Bhattiprolu <sukadev@us.ibm.com>
>
>
> When clone() is invoked with CLONE_NEWPID, create a new pid namespace
> and then create a new struct pid for the new process. Allocate pid_t's
> for the new process in the new pid namespace and all ancestor pid
> namespaces. Make the newly cloned process the session and process group
> leader.
>
> Since the active pid namespace is special and expected to be the first
> entry in pid->upid_list, preserve the order of pid namespaces when
> cloning without CLONE_NEWPID.
>
> TODO (partial list:)
>
> - Identify clone flags that should not be specified with CLONE_NEWPID
> and return -EINVAL from copy_process(), if they are specified. (eg:
> CLONE_THREAD|CLONE_NEWPID ?)
> - Add a privilege check for CLONE_NEWPID
>
> Changelog:
>
> 2.6.21-mm2-pidns3:
> - 'struct upid' used to be called 'struct pid_nr' and a list of these
> were hanging off of 'struct pid'. So, we renamed 'struct pid_nr'
> and now hold them in a statically sized array in 'struct pid' since
> the number of 'struct upid's for a process is known at process-
> creation time
>
> 2.6.21-mm2:
> - [Serge Hallyn] Terminate other processes in pid ns when reaper is
> exiting.
> Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
> ---
> include/linux/pid.h | 3
> include/linux/pid_namespace.h | 5 -
> init/Kconfig | 9 +
> kernel/exit.c | 14 ++
> kernel/fork.c | 17 ++-
> kernel/pid.c | 209 ++++++++++++++++++++++++++++++++++++++----
> 6 files changed, 227 insertions(+), 30 deletions(-)
>
> Index: lx26-21-mm2/kernel/pid.c
> ===================================================================
> --- lx26-21-mm2.orig/kernel/pid.c 2007-05-22 16:59:50.000000000 -0700
> +++ lx26-21-mm2/kernel/pid.c 2007-05-22 16:59:52.000000000 -0700
> @@ -32,6 +32,7 @@
> #define pid_hashfn(nr) hash_long((unsigned long)nr, pidhash_shift)
> static struct hlist_head *pid_hash;
> static int pidhash_shift;
> +static struct kmem_cache *pid1_cachep;
> static struct kmem_cache *pid_cachep;
> struct upid init_struct_upid = INIT_STRUCT_UPID;
> struct pid init_struct_pid = INIT_STRUCT_PID;
> @@ -250,7 +251,12 @@ static struct upid *pid_active_upid(stru
> /*
> * Return the active pid namespace of the process @pid.
> *
> - * Note: At present, there is only one pid namespace (init_pid_ns).
> + * Note:
> + * To avoid having to use an extra pointer in struct pid to keep track
> + * of active pid namespace, dup_struct_pid() maintains the order of
> + * entries in 'pid->upid_list' such that the youngest (or the 'active')
> + * pid namespace is the first entry and oldest (init_pid_ns) is the last
> + * entry in the list.
> */
> struct pid_namespace *pid_active_pid_ns(struct pid *pid)
> {
> @@ -259,6 +265,64 @@ struct pid_namespace *pid_active_pid_ns(
> EXPORT_SYMBOL_GPL(pid_active_pid_ns);
>
> /*
> + * Return the parent pid_namespace of the active pid namespace of @tsk.
> + *
> + * Note:
> + * Refer to function header of pid_active_pid_ns() for information on
> + * the order of entries in pid->upid_list. Based on the order, the parent
> + * pid namespace of the active pid namespace of @tsk is just the second
> + * entry in the process's pid->upid_list.
> + *
> + * Parent pid namespace of init_pid_ns is init_pid_ns itself.
> + */
> +static struct pid_namespace *task_active_pid_ns_parent(struct task_struct *tsk)
> +{
> + int idx = 0;
> + struct pid *pid = task_pid(tsk);
> +
> + if (pid->num_upids > 1)
> + idx++;
> +
> + return pid->upid_list[idx].pid_ns;
> +}
> +
> +/*
> + * Return the child reaper of @tsk.
> + *
> + * Normally the child reaper of @tsk is simply the child reaper
> + * the active pid namespace of @tsk.
> + *
> + * But if @tsk is itself child reaper of a namespace, NS1, its child
> + * reaper depends on the caller. If someone from an ancestor namespace
> + * or, if the reaper himself is asking, return the reaper of our parent
> + * namespace.
> + *
> + * If someone from namespace NS1 (other than reaper himself) is asking,
> + * return reaper of NS1.
> + */
> +struct task_struct *task_child_reaper(struct task_struct *tsk)
> +{
> + struct pid_namespace *tsk_ns = task_active_pid_ns(tsk);
> + struct task_struct *tsk_reaper = tsk_ns->child_reaper;
> + struct pid_namespace *my_ns;
> +
> + /*
> + * TODO: Check if we need a lock here. ns->child_reaper
> + * can change in do_exit() when reaper is exiting.
> + */
> +
> + if (tsk != tsk_reaper)
> + return tsk_reaper;
> +
> + my_ns = task_active_pid_ns(current);
> + if (my_ns != tsk_ns || current == tsk)
> + return task_active_pid_ns_parent(tsk)->child_reaper;
This is bogus. This value is never returned to userspace; it is only
used to make in-kernel decisions, such as in forget_original_parent()
and signaling. So the caller-dependent check needlessly slows this
function down, and it has the potential to create a very subtle bug
down the line (if there isn't one already).

A task has one reaper, period, and a function called task_child_reaper()
should return that reaper, period.

If userspace ever wants to see that value (right now it doesn't), then
whoever calls task_child_reaper() from inside NS1 on NS1's reaper can
send back 0.
-serge
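
To make the objection concrete, here is a minimal userspace sketch of a reaper lookup that does not depend on who is asking. It is not kernel code: `struct pid_ns_model`, `struct task`, the `parent` pointer, and `task_child_reaper_model()` are hypothetical names invented for illustration. It also assumes one reading of the point above, namely that the single reaper of a namespace's own reaper is the reaper of the parent namespace.

```c
#include <assert.h>
#include <stddef.h>

struct task;

/* Hypothetical stand-in for struct pid_namespace. */
struct pid_ns_model {
	struct pid_ns_model *parent;	/* the init namespace points to itself */
	struct task *child_reaper;
};

/* Hypothetical stand-in for struct task_struct. */
struct task {
	struct pid_ns_model *active_ns;
};

/*
 * Return the one reaper of @tsk. The result depends only on @tsk,
 * never on the calling task.
 */
static struct task *task_child_reaper_model(const struct task *tsk)
{
	struct pid_ns_model *ns;

	assert(tsk != NULL && tsk->active_ns != NULL);
	ns = tsk->active_ns;

	/* Ordinary task: reaped by the reaper of its own namespace. */
	if (ns->child_reaper != tsk)
		return ns->child_reaper;

	/* Namespace reaper: reaped by the reaper of the parent namespace. */
	return ns->parent->child_reaper;
}
```

With this shape, any code that someday reports the value to userspace would do its own visibility check and send back 0 when the reaper lives outside the caller's namespace, instead of pushing that logic into the lookup.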
> + return tsk_reaper;
> +}
> +EXPORT_SYMBOL(task_child_reaper);
> +
> +/*
> * Return the pid_t by which the process @pid is known in the pid
> * namespace @ns.
> *
> @@ -301,15 +365,78 @@ pid_t pid_to_nr(struct pid *pid)
> }
> EXPORT_SYMBOL_GPL(pid_to_nr);
>
> +#ifdef CONFIG_PID_NS
> +static int init_ns_pidmap(struct pid_namespace *ns)
> +{
> + int i;
> +
> + atomic_set(&ns->pidmap[0].nr_free, BITS_PER_PAGE - 1);
> +
> + ns->pidmap[0].page = kzalloc(PAGE_SIZE, GFP_KERNEL);
> + if (!ns->pidmap[0].page)
> + return -ENOMEM;
> +
> + set_bit(0, ns->pidmap[0].page);
> +
> + for (i = 1; i < PIDMAP_ENTRIES; i++) {
> + atomic_set(&ns->pidmap[i].nr_free, BITS_PER_PAGE);
> + ns->pidmap[i].page = NULL;
> + }
> + return 0;
> +}
> +
> +static struct pid_namespace *alloc_pid_ns(void)
> +{
> + struct pid_namespace *ns;
> + int rc;
> +
> + ns = kzalloc(sizeof(struct pid_namespace), GFP_KERNEL);
> + if (!ns)
> + return NULL;
> +
> + rc = init_ns_pidmap(ns);
> + if(rc) {
> + kfree(ns);
> + return NULL;
> + }
> +
> + kref_init(&ns->kref);
> +
> + return ns;
> +}
> +
> +#else
> +
> +static int alloc_pid_ns()
> +{
> + static int warned;
> +
> + if (!warned) {
> + printk(KERN_INFO "WARNING: CLONE_NEWPID disabled\n");
> + warned = 1;
> + }
> + return 0;
> +}
> +#endif /*CONFIG_PID_NS*/
> +
> +void toss_pid(struct pid *pid)
> +{
> + if (pid->num_upids == 1)
> + kmem_cache_free(pid1_cachep, pid);
> + else {
> + kfree(pid->upid_list);
> + kmem_cache_free(pid_cachep, pid);
> + }
> +}
> +
> fastcall void put_pid(struct pid *pid)
> {
> if (!pid)
> return;
>
> if ((atomic_read(&pid->count) == 1) ||
> - atomic_dec_and_test(&pid->count)) {
> - kmem_cache_free(pid_cachep, pid);
> - }
> + atomic_dec_and_test(&pid->count))
> + toss_pid(pid);
> }
> EXPORT_SYMBOL_GPL(put_pid);
>
> @@ -345,15 +472,28 @@ static struct pid *alloc_struct_pid(int
> enum pid_type type;
> struct upid *upid_list;
> void *pid_end;
> + struct kmem_cache *cachep = pid1_cachep;
>
> - /* for now we only support one pid namespace */
> - BUG_ON(num_upids != 1);
> - pid = kmem_cache_alloc(pid_cachep, GFP_KERNEL);
> + if (num_upids > 1)
> + cachep = pid_cachep;
> +
> + pid = kmem_cache_alloc(cachep, GFP_KERNEL);
> if (!pid)
> return NULL;
>
> - pid_end = (void *)pid + sizeof(struct pid);
> - pid->upid_list = (struct upid *)pid_end;
> + if (num_upids == 1) {
> + pid_end = (void *)pid + sizeof(struct pid);
> + pid->upid_list = (struct upid *)pid_end;
> + } else {
> + int upid_list_size = num_upids * sizeof(struct upid);
> +
> + upid_list = kzalloc(upid_list_size, GFP_KERNEL);
> + if (!upid_list) {
> + kmem_cache_free(pid_cachep, pid);
> + return NULL;
> + }
> + pid->upid_list = upid_list;
> + }
I would much rather see the upid_list be part of struct pid, as in

	struct pid {
		...
		struct upid upid_list[0];
	};

with the whole thing allocated at once using kmalloc().

If we really want to use a cache later, we could either use a cache only
when num_upids == 1, or use a set of caches, creating a new cache
whenever someone does clone(CLONE_NEWPID) to a new depth of num_upids.

Others may disagree with this; I realize my preference is somewhat
subjective.
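
As a rough illustration of the single-allocation layout, here is a userspace sketch using a C99 flexible array member and calloc() in place of kmalloc(). The names `struct upid_model`, `struct pid_model`, `alloc_pid_model()` and `free_pid_model()` are hypothetical, and the fields are simplified stand-ins rather than the real struct pid members. The zero-length `upid_list[0]` form shown above is the GNU C spelling of the same idea.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct upid. */
struct upid_model {
	int nr;			/* pid number in one namespace */
	void *pid_ns;		/* namespace the number belongs to */
};

/* Hypothetical stand-in for struct pid with a trailing array. */
struct pid_model {
	int count;
	int num_upids;
	struct upid_model upid_list[];	/* sized at allocation time */
};

/* Allocate the header and all upid entries in one zeroed block. */
static struct pid_model *alloc_pid_model(int num_upids)
{
	struct pid_model *pid;

	assert(num_upids >= 1);
	pid = calloc(1, sizeof(*pid) +
			(size_t)num_upids * sizeof(pid->upid_list[0]));
	if (!pid)
		return NULL;

	pid->count = 1;
	pid->num_upids = num_upids;
	return pid;
}

/* One allocation means one free, whatever num_upids was. */
static void free_pid_model(struct pid_model *pid)
{
	free(pid);
}
```

In this sketch the release path no longer needs to branch on num_upids, and there is no second allocation whose failure has to be unwound.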
>
> atomic_set(&pid->count, 1);
> pid->num_upids = num_upids;
> @@ -364,7 +504,8 @@ static struct pid *alloc_struct_pid(int
> return pid;
> }
>
> -struct pid *dup_struct_pid(enum copy_process_type copy_src)
> +struct pid *dup_struct_pid(enum copy_process_type copy_src,
> + unsigned long clone_flags, struct task_struct *new_task)
> {
> int rc;
> int i;
> @@ -379,20 +520,38 @@ struct pid *dup_struct_pid(enum copy_pro
>
...