Home » Mailing lists » Devel » Re: [RFC][PATCH 1/2] add user namespace [try #2]
Re: [RFC][PATCH 1/2] add user namespace [try #2] [message #6070] |
Thu, 07 September 2006 15:59  |
dev
Messages: 1693 Registered: September 2005 Location: Moscow
|
Senior Member |

|
|
comments below
> This patch adds the user namespace.
>
> Basically, it allows a process to unshare its user_struct table,
> resetting at the same time its own user_struct and all the associated
> accounting.
>
> A new root user (uid == 0) is added to the user namespace upon
> creation. Such root users have full privileges and it seems that
> theses privileges should be controlled through some means (process
> capabilities ?)
>
> Changes [try #2]
>
> - removed struct user_namespace* argument from find_user()
> - added a root_user per user namespace
>
> Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
> Cc: Andrew Morton <akpm@osdl.org>
> Cc: Kirill Korotaev <dev@openvz.org>
> Cc: Eric W. Biederman <ebiederm@xmission.com>
> Cc: Herbert Poetzl <herbert@13thfloor.at>
> Cc: Serge E. Hallyn <serue@us.ibm.com>
> Cc: Dave Hansen <haveblue@us.ibm.com>
>
> ---
[... skip ...]
> +struct user_namespace init_user_ns = {
> + .kref = {
> + .refcount = ATOMIC_INIT(2),
<<<< please add some comment about why it is initally = 2
> + },
> + .root_user = &root_user,
> +};
> +
> +EXPORT_SYMBOL_GPL(init_user_ns);
>
> /*
> * The uidhash_lock is mostly taken from process context, but it is
> @@ -84,6 +93,111 @@ static inline struct user_struct *uid_ha
> return NULL;
> }
>
> +
> +#ifdef CONFIG_USER_NS
> +
> +/*
> + * Clone a new ns copying an original user ns, setting refcount to 1
> + * @old_ns: namespace to clone
> + * Return NULL on error (failure to kmalloc), new ns otherwise
> + */
> +static struct user_namespace *clone_user_ns(struct user_namespace *old_ns)
> +{
> + struct user_namespace *ns;
> +
> + ns = kmalloc(sizeof(struct user_namespace), GFP_KERNEL);
> + if (ns) {
<<<< can you remake this function please so that normal case would go without indentation, i.e.:
if (!ns) {
error path;
}
normal path
> + int n;
> + struct user_struct *new_user;
> +
> + kref_init(&ns->kref);
> +
> + for(n = 0; n < UIDHASH_SZ; ++n)
> + INIT_LIST_HEAD(ns->uidhash_table + n);
> +
> + /* Insert new root user. */
> + ns->root_user = alloc_uid(ns, 0);
> + if (!ns->root_user) {
> + kfree(ns);
> + return NULL;
> + }
> +
> + /* Reset current->user with a new one */
> + new_user = alloc_uid(ns, current->uid);
> + if (!new_user) {
> + kfree(ns);
> + return NULL;
> + }
> +
> + switch_uid(new_user);
<<<< I see switch_uid in this function, but you don't change nsproxy->user_ns here...
<<<< it is done much later, in sys_unshare()... This looks a bit inconsistent for me...
<<<< But I'm unsure whether it can be fixed... :/
> + }
> + return ns;
> +}
> +
> +/*
> + * unshare the current process' user namespace.
> + */
> +int unshare_user_ns(unsigned long unshare_flags,
> + struct user_namespace **new_user)
> +{
> + if (unshare_flags & CLONE_NEWUSER) {
> + if (!capable(CAP_SYS_ADMIN))
> + return -EPERM;
<<<< such checks for CAP_SYS_ADMIN mean that we can't use copy_xxx/clone_xxx functions directly
<<<< from OpenVZ code, since VE creation is done with dropped capabilities already.
<<<< (user level tools decide which capabilities should be granted to VE, so CAP_SYS_ADMIN
<<<< is not normally granted :) )
<<<< Can we move capability checks into some more logical place which deals with user, e.g. sys_unshare()?
> +
> + *new_user = clone_user_ns(current->nsproxy->user_ns);
> + if (!*new_user)
> + return -ENOMEM;
> + }
> +
> + return 0;
> +}
> +
> +/*
> + * Copy task tsk's user namespace, or clone it if flags specifies
> + * CLONE_NEWUSER. In latter case, changes to the user namespace of
> + * this process won't be seen by parent, and vice versa.
> + */
> +int copy_user_ns(int flags, struct task_struct *tsk)
> +{
> + struct user_namespace *old_ns = tsk->nsproxy->user_ns;
> + struct user_namespace *new_ns;
> + int err = 0;
> +
> + if (!old_ns)
> + return 0;
> +
> + get_user_ns(old_ns);
> +
> + if (!(flags & CLONE_NEWUSER))
> + return 0;
> +
> + if (!capable(CAP_SYS_ADMIN)) {
> + err = -EPERM;
> + goto out;
> + }
<<<< same as above
> +
> + new_ns = clone_user_ns(old_ns);
> + if (!new_ns) {
> + err = -ENOMEM;
> + goto out;
> + }
> + tsk->nsproxy->user_ns = new_ns;
> +
> +out:
> + put_user_ns(old_ns);
> + return err;
> +}
> +
....
Kirill
|
|
|
Re: [RFC][PATCH 1/2] add user namespace [try #2] [message #6174 is a reply to message #6070] |
Mon, 11 September 2006 08:35   |
Cedric Le Goater
Messages: 443 Registered: February 2006
|
Senior Member |
|
|
Kirill Korotaev wrote:
[ ... ]
>> +static struct user_namespace *clone_user_ns(struct user_namespace *old_ns)
>> +{
>> + struct user_namespace *ns;
>> +
>> + ns = kmalloc(sizeof(struct user_namespace), GFP_KERNEL);
>> + if (ns) {
> <<<< can you remake this function please so that normal case would go without indentation, i.e.:
> if (!ns) {
> error path;
> }
> normal path
right, the error path has been growing too much. Will do.
>> + int n;
>> + struct user_struct *new_user;
>> +
>> + kref_init(&ns->kref);
>> +
>> + for(n = 0; n < UIDHASH_SZ; ++n)
>> + INIT_LIST_HEAD(ns->uidhash_table + n);
>> +
>> + /* Insert new root user. */
>> + ns->root_user = alloc_uid(ns, 0);
>> + if (!ns->root_user) {
>> + kfree(ns);
>> + return NULL;
>> + }
>> +
>> + /* Reset current->user with a new one */
>> + new_user = alloc_uid(ns, current->uid);
>> + if (!new_user) {
>> + kfree(ns);
>> + return NULL;
>> + }
>> +
>> + switch_uid(new_user);
> <<<< I see switch_uid in this function, but you don't change nsproxy->user_ns here...
> <<<< it is done much later, in sys_unshare()... This looks a bit inconsistent for me...
I would say it is safe but i agree with you that it would be better to
switch user namespace before than switching user_struct.
> <<<< But I'm unsure whether it can be fixed... :/
he, may be an argument for the clone_ns() syscall ?
>> + return ns;
>> +}
>> +
>> +/*
>> + * unshare the current process' user namespace.
>> + */
>> +int unshare_user_ns(unsigned long unshare_flags,
>> + struct user_namespace **new_user)
>> +{
>> + if (unshare_flags & CLONE_NEWUSER) {
>> + if (!capable(CAP_SYS_ADMIN))
>> + return -EPERM;
> <<<< such checks for CAP_SYS_ADMIN mean that we can't use copy_xxx/clone_xxx functions directly
> <<<< from OpenVZ code, since VE creation is done with dropped capabilities already.
OK.
> <<<< (user level tools decide which capabilities should be granted to VE, so CAP_SYS_ADMIN
> <<<< is not normally granted :) )
> <<<< Can we move capability checks into some more logical place which deals with user, e.g. sys_unshare()?
hmm, another argument for a new syscall() ?
C.
|
|
|
Re: [RFC][PATCH 1/2] add user namespace [try #2] [message #16695 is a reply to message #6174] |
Tue, 12 September 2006 10:48  |
Herbert Poetzl
Messages: 239 Registered: February 2006
|
Senior Member |
|
|
On Mon, Sep 11, 2006 at 10:35:45AM +0200, Cedric Le Goater wrote:
> Kirill Korotaev wrote:
>
> [ ... ]
>
> >> +static struct user_namespace *clone_user_ns(struct user_namespace *old_ns)
> >> +{
> >> + struct user_namespace *ns;
> >> +
> >> + ns = kmalloc(sizeof(struct user_namespace), GFP_KERNEL);
> >> + if (ns) {
> > <<<< can you remake this function please so that normal case would
> > go without indentation, i.e.:
> > if (!ns) {
> > error path;
> > }
> > normal path
>
> right, the error path has been growing too much. Will do.
>
> >> + int n;
> >> + struct user_struct *new_user;
> >> +
> >> + kref_init(&ns->kref);
> >> +
> >> + for(n = 0; n < UIDHASH_SZ; ++n)
> >> + INIT_LIST_HEAD(ns->uidhash_table + n);
> >> +
> >> + /* Insert new root user. */
> >> + ns->root_user = alloc_uid(ns, 0);
> >> + if (!ns->root_user) {
> >> + kfree(ns);
> >> + return NULL;
> >> + }
> >> +
> >> + /* Reset current->user with a new one */
> >> + new_user = alloc_uid(ns, current->uid);
> >> + if (!new_user) {
> >> + kfree(ns);
> >> + return NULL;
> >> + }
> >> +
> >> + switch_uid(new_user);
> > <<<< I see switch_uid in this function, but you don't change
> > nsproxy->user_ns here...
> > <<<< it is done much later, in sys_unshare()... This looks a bit
> > inconsistent for me...
>
> I would say it is safe but i agree with you that it would be better to
> switch user namespace before than switching user_struct.
>
> > <<<< But I'm unsure whether it can be fixed... :/
>
> he, may be an argument for the clone_ns() syscall ?
>
> >> + return ns;
> >> +}
> >> +
> >> +/*
> >> + * unshare the current process' user namespace.
> >> + */
> >> +int unshare_user_ns(unsigned long unshare_flags,
> >> + struct user_namespace **new_user)
> >> +{
> >> + if (unshare_flags & CLONE_NEWUSER) {
> >> + if (!capable(CAP_SYS_ADMIN))
> >> + return -EPERM;
> > <<<< such checks for CAP_SYS_ADMIN mean that we can't use
> > copy_xxx/clone_xxx functions directly
> > <<<< from OpenVZ code, since VE creation is done with dropped
> > capabilities already.
is there a good reason for doing so?
I mean, Linux-VServer for example drops the capabilities
at the end of initialization, right before spawning the
guest init (or running the guest's runlevel scripts)
> OK.
>
> > <<<< (user level tools decide which capabilities should be granted
> > to VE, so CAP_SYS_ADMIN
> > <<<< is not normally granted :) )
> > <<<< Can we move capability checks into some more logical place
> > which deals with user, e.g. sys_unshare()?
we also do not give CAP_SYS_ADMIN by default, but I have
to admit, that we add a CAP_CONTEXT (which allows to
do guest related manipulations) to the host admin processes
which in turn controls such things like namespace or
context changes
> hmm, another argument for a new syscall() ?
well, last time I checked syscalls were pretty popular
and I do not see a good reason to complicate things
unnecessary when the new features will require new
userspace tools anyways (just my opinion) ...
best,
Herbert
> C.
> _______________________________________________
> Containers mailing list
> Containers@lists.osdl.org
> https://lists.osdl.org/mailman/listinfo/containers
_______________________________________________
Containers mailing list
Containers@lists.osdl.org
https://lists.osdl.org/mailman/listinfo/containers
|
|
|
Goto Forum:
Current Time: Sun Aug 03 19:02:19 GMT 2025
Total time taken to generate the page: 0.63091 seconds
|