OpenVZ Forum


Home » Mailing lists » Devel » userns: targeted capabilities v5
Re: [PATCH 2/9] security: Make capabilities relative to the user namespace. [message #41847 is a reply to message #41744] Wed, 23 February 2011 12:01 Go to previous messageGo to next message
David Howells is currently offline  David Howells
Messages: 44
Registered: October 2006
Member
From: *parallels.com
David Howells <dhowells@redhat.com> wrote:

> > int (*capable) (struct task_struct *tsk, const struct cred *cred,
> > - int cap, int audit);
> > + struct user_namespace *ns, int cap, int audit);
>
> Hmmm... A chunk of the contents of the cred struct are user-namespaced.
> Could you add the user_namespace pointer to the cred struct and thus avoid
> passing it as an argument to other things.

Ah, no... Ignore that, I think I see that you do need it.

> +int cap_capable(struct task_struct *tsk, const struct cred *cred,
> + struct user_namespace *targ_ns, int cap, int audit)
> {
> - return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
> + for (;;) {
> + /* The creator of the user namespace has all caps. */
> + if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
> + return 0;

Why is that last comment so? Why should the creating namespace sport all
possible capabilities? Do you have to have all capabilities available to you
to be permitted create a new user namespace?

Also, would it be worth having a separate cap_ns_capable()? Wouldn't most
calls to cap_capable() only be checking the caps granted in the current user
namespace?

David
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 2/9] security: Make capabilities relative to the user namespace. [message #41848 is a reply to message #41744] Wed, 23 February 2011 11:40 Go to previous messageGo to next message
David Howells is currently offline  David Howells
Messages: 44
Registered: October 2006
Member
From: *parallels.com
Serge E. Hallyn <serge@hallyn.com> wrote:

> int (*capable) (struct task_struct *tsk, const struct cred *cred,
> - int cap, int audit);
> + struct user_namespace *ns, int cap, int audit);

Hmmm... A chunk of the contents of the cred struct are user-namespaced.
Could you add the user_namespace pointer to the cred struct and thus avoid
passing it as an argument to other things.

In fact, I think you probably have to do this so that cachefiles, for example,
can override the user namespace when it operates on behalf of a process that's
in another namespace.

In fact, looking later on in your patch, I see:

> - || (cap_capable(current, current_cred(), CAP_SETPCAP,
> + || (cap_capable(current, current_cred(),
> + current_cred()->user->user_ns, CAP_SETPCAP,

so the user_ns _is_ already available through the creds. So is there really a
need to pass it as an argument to anything that already takes a cred?

> * @cred contains the credentials to use.
> + * @ns contains the user namespace we want the capability in
> * @cap contains the capability <include/linux/capability.h>.

That should be tabbed to match the lines either side.

David
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
User namespaces and keys [message #41849 is a reply to message #41743] Wed, 23 February 2011 12:05 Go to previous messageGo to next message
David Howells is currently offline  David Howells
Messages: 44
Registered: October 2006
Member
From: *parallels.com
I guess we need to look at how to mix keys and namespaces again.

Possibly the trickiest problem with keys is how to upcall key construction to
/sbin/request-key when the keys may be of a different user namespace.

David
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 2/9] security: Make capabilities relative to the user namespace. [message #41850 is a reply to message #41847] Wed, 23 February 2011 13:43 Go to previous messageGo to next message
serge is currently offline  serge
Messages: 72
Registered: January 2007
Member
From: *parallels.com
Quoting David Howells (dhowells@redhat.com):
> David Howells <dhowells@redhat.com> wrote:
>
> > > int (*capable) (struct task_struct *tsk, const struct cred *cred,
> > > - int cap, int audit);
> > > + struct user_namespace *ns, int cap, int audit);
> >
> > Hmmm... A chunk of the contents of the cred struct are user-namespaced.
> > Could you add the user_namespace pointer to the cred struct and thus avoid
> > passing it as an argument to other things.
>
> Ah, no... Ignore that, I think I see that you do need it.

Thanks for reviewing, David.

> > +int cap_capable(struct task_struct *tsk, const struct cred *cred,
> > + struct user_namespace *targ_ns, int cap, int audit)
> > {
> > - return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
> > + for (;;) {
> > + /* The creator of the user namespace has all caps. */
> > + if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
> > + return 0;
>
> Why is that last comment so? Why should the creating namespace sport all
> possible capabilities? Do you have to have all capabilities available to you
> to be permitted create a new user namespace?

It's not the creating namespace, but the creating user, which has all caps.

So for instance, if uid 500 in init_user_ns creates a namespace, then:
a. uid 500 in init_user_ns has all caps to the child user_ns, so it can
kill the tasks in the userns, clean up, etc.
b. uid X in any other child user_ns has no caps to the child user_ns.
c. root in init_user_ns has whatever capabilities are in his pE to the
child user_ns. Again, this is so that the admin in any user_ns can
clean up any messes made by users in his user_ns.

One of the goals of the user namespaces it to make it safe to allow
unprivileged users to create child user namespaces in which they have
targeted privilege. Anything which happens in that child user namespace
should be:

a. constrained to resources which the user can control anyway
b. able to be cleaned up by the user
c. (especially) able to be cleaned up by the privileged user in the
parent user_ns.

> Also, would it be worth having a separate cap_ns_capable()? Wouldn't most
> calls to cap_capable() only be checking the caps granted in the current user
> namespace?

Hm. There is nsown_capable() which is targeted to current_userns(), but
that still needs to enable the caps for the privileged ancestors as
described above.

thanks,
-serge
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: User namespaces and keys [message #41851 is a reply to message #41849] Wed, 23 February 2011 13:58 Go to previous messageGo to next message
serge is currently offline  serge
Messages: 72
Registered: January 2007
Member
From: *parallels.com
Quoting David Howells (dhowells@redhat.com):
>
> I guess we need to look at how to mix keys and namespaces again.

>From strictly kernel pov, at the moment, keys are strictly usable only
by the user in your own user namespace.

We may want to look at this again, but for now I think that would be a
safe enough default. Later, we'll probably want the user creating a
child_user_ns to allow his keys to be inherited by the child user_ns.
Though, as I type that, it seems to me that that'll just become a
maintenance pain, and it's just plain safer to have the user re-enter
his keys, sharing them over a file if needed.

I'm going to not consider the TPM at the moment :)

> Possibly the trickiest problem with keys is how to upcall key construction to
> /sbin/request-key when the keys may be of a different user namespace.

Hm, jinkeys, yes.

-serge
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: User namespaces and keys [message #41852 is a reply to message #41851] Wed, 23 February 2011 14:46 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
From: *parallels.com
"Serge E. Hallyn" <serge@hallyn.com> writes:

> Quoting David Howells (dhowells@redhat.com):
>>
>> I guess we need to look at how to mix keys and namespaces again.
>
>>From strictly kernel pov, at the moment, keys are strictly usable only
> by the user in your own user namespace.
>
> We may want to look at this again, but for now I think that would be a
> safe enough default. Later, we'll probably want the user creating a
> child_user_ns to allow his keys to be inherited by the child user_ns.
> Though, as I type that, it seems to me that that'll just become a
> maintenance pain, and it's just plain safer to have the user re-enter
> his keys, sharing them over a file if needed.
>
> I'm going to not consider the TPM at the moment :)
>
>> Possibly the trickiest problem with keys is how to upcall key construction to
>> /sbin/request-key when the keys may be of a different user namespace.
>
> Hm, jinkeys, yes.

Serge short term this is where I think we need to add a check or two so
that keys only work in the init_user_ns. When the rest of the details
are sorted out we can open up the use of keys some more.

As the first stage of converting the network stack we added patches
that turned off everything in non init namespaces. Those patches
were trivial and easy to review, and it made the conversion process
a lot easier. I suspect for keys and possibly security modules and
anything else that does is related we want to turn off by default.

Then once the core of user namespaces is safe to unshare without
privilege we can come back and get the more difficult bits.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: User namespaces and keys [message #41853 is a reply to message #41851] Wed, 23 February 2011 15:06 Go to previous messageGo to next message
David Howells is currently offline  David Howells
Messages: 44
Registered: October 2006
Member
From: *parallels.com
Serge E. Hallyn <serge@hallyn.com> wrote:

> > I guess we need to look at how to mix keys and namespaces again.
>
> From strictly kernel pov, at the moment, keys are strictly usable only
> by the user in your own user namespace.

I'm not sure that's currently completely true. Key quota maintenance is
namespaced, and the key's owner UID/GID belong to that namespace, so that's
okay, but:

(*) key_task_permission() does not distinguish UIDs and GIDs from different
namespaces.

(*) A key can be referred to by its serial number, no matter whose namespace
it is in, and will yield up its given UID/GID, even if these aren't
actually meaningful in your namespace.

This means request_key() can successfully upcall at the moment.

I wonder if I should make the following changes:

(1) If the key and the accessor are in different user namespaces, then skip
the UID and GID comparisons in key_task_permission(). That means that to
be able to access the key you'd have to possess the key and the key would
have to grant you Possessor access, or the key would have to grant you
Other access.

(2) If the key and someone viewing the key description are in different
namespaces, then indicate that the UID and the GID are -1, irrespective of
the actual values.

(3) When an upcall is attempting to instantiate a key, it is allowed to access
the keys of requestor using the requestor's credentials (UID, GID, groups,
security label). Ensure that this will be done in the requestor's user
namespace.

Nothing should need to be done here, since search_process_keyrings()
switches to the requestor's creds.

Oh, and are security labels user-namespaced?

> We may want to look at this again, but for now I think that would be a
> safe enough default. Later, we'll probably want the user creating a
> child_user_ns to allow his keys to be inherited by the child user_ns.

That depends what you mean by 'allow his keys to be inherited'. Do you mean
copying all the creator's keys en mass? You may find all you need to do is to
provide the new intended user with a new session keyring with a link back to
the creator's session keyring.

> Though, as I type that, it seems to me that that'll just become a
> maintenance pain, and it's just plain safer to have the user re-enter
> his keys,

That would certainly be simpler.

> sharing them over a file if needed.

I'm not sure what you mean by that. Do you mean some sort of cred passing
similar to that that can be done over AF_UNIX sockets with fds?

> I'm going to not consider the TPM at the moment :)

I'm not sure the TPM is that much of a problem, assuming you're just referring
to its keystore capability...

David
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: User namespaces and keys [message #41855 is a reply to message #41853] Wed, 23 February 2011 15:45 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
From: *parallels.com
David Howells <dhowells@redhat.com> writes:

> Serge E. Hallyn <serge@hallyn.com> wrote:
>
>> > I guess we need to look at how to mix keys and namespaces again.
>>
>> From strictly kernel pov, at the moment, keys are strictly usable only
>> by the user in your own user namespace.
>
> I'm not sure that's currently completely true. Key quota maintenance is
> namespaced, and the key's owner UID/GID belong to that namespace, so that's
> okay, but:
>
> (*) key_task_permission() does not distinguish UIDs and GIDs from different
> namespaces.
>
> (*) A key can be referred to by its serial number, no matter whose namespace
> it is in, and will yield up its given UID/GID, even if these aren't
> actually meaningful in your namespace.
>
> This means request_key() can successfully upcall at the moment.
>
> I wonder if I should make the following changes:
>
> (1) If the key and the accessor are in different user namespaces, then skip
> the UID and GID comparisons in key_task_permission(). That means that to
> be able to access the key you'd have to possess the key and the key would
> have to grant you Possessor access, or the key would have to grant you
> Other access.
>
> (2) If the key and someone viewing the key description are in different
> namespaces, then indicate that the UID and the GID are -1, irrespective of
> the actual values.
>
> (3) When an upcall is attempting to instantiate a key, it is allowed to access
> the keys of requestor using the requestor's credentials (UID, GID, groups,
> security label). Ensure that this will be done in the requestor's user
> namespace.
>
> Nothing should need to be done here, since search_process_keyrings()
> switches to the requestor's creds.
>
> Oh, and are security labels user-namespaced?

Not at this time. The user namespace as currently merged is little more
than a place holder for a proper implementation. Serge is busily
fleshing out that proper implementation.

Until we reach the point where all checks that have historically been
"if (uid1 == uid2)" become "if ((uidns1 == uidns2) && (uid1 == uid2))"
there will be problems.

The security labels and probably lsm's in general need to be per user
namespace but we simply have not gotten that far. For the short term I
will be happy when we get a minimally usable user namespace.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: User namespaces and keys [message #41856 is a reply to message #41855] Wed, 23 February 2011 15:53 Go to previous messageGo to next message
Serge E. Hallyn is currently offline  Serge E. Hallyn
Messages: 26
Registered: February 2011
Junior Member
From: *parallels.com
Quoting Eric W. Biederman (ebiederm@xmission.com):
> David Howells <dhowells@redhat.com> writes:
>
> > Serge E. Hallyn <serge@hallyn.com> wrote:
> >
> >> > I guess we need to look at how to mix keys and namespaces again.
> >>
> >> From strictly kernel pov, at the moment, keys are strictly usable only
> >> by the user in your own user namespace.
> >
> > I'm not sure that's currently completely true. Key quota maintenance is
> > namespaced, and the key's owner UID/GID belong to that namespace, so that's
> > okay, but:
> >
> > (*) key_task_permission() does not distinguish UIDs and GIDs from different
> > namespaces.
> >
> > (*) A key can be referred to by its serial number, no matter whose namespace
> > it is in, and will yield up its given UID/GID, even if these aren't
> > actually meaningful in your namespace.
> >
> > This means request_key() can successfully upcall at the moment.
> >
> > I wonder if I should make the following changes:
> >
> > (1) If the key and the accessor are in different user namespaces, then skip
> > the UID and GID comparisons in key_task_permission(). That means that to
> > be able to access the key you'd have to possess the key and the key would
> > have to grant you Possessor access, or the key would have to grant you
> > Other access.
> >
> > (2) If the key and someone viewing the key description are in different
> > namespaces, then indicate that the UID and the GID are -1, irrespective of
> > the actual values.
> >
> > (3) When an upcall is attempting to instantiate a key, it is allowed to access
> > the keys of requestor using the requestor's credentials (UID, GID, groups,
> > security label). Ensure that this will be done in the requestor's user
> > namespace.
> >
> > Nothing should need to be done here, since search_process_keyrings()
> > switches to the requestor's creds.
> >
> > Oh, and are security labels user-namespaced?
>
> Not at this time. The user namespace as currently merged is little more
> than a place holder for a proper implementation. Serge is busily
> fleshing out that proper implementation.
>
> Until we reach the point where all checks that have historically been
> "if (uid1 == uid2)" become "if ((uidns1 == uidns2) && (uid1 == uid2))"
> there will be problems.
>
> The security labels and probably lsm's in general need to be per user
> namespace but we simply have not gotten that far. For the short term I
> will be happy when we get a minimally usable user namespace.

Note also that when Eric brought this up at the LSM mini-conf two or three
years ago, there was pretty general, strong objection to the idea.

Like Eric says, I think that'll have to wait. In the meantime, isolating
user namespace sandboxes (or containers) using simple LSM configurations
is a very good idea.

-serge
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 2/9] security: Make capabilities relative to the user namespace. [message #41857 is a reply to message #41744] Wed, 23 February 2011 16:59 Go to previous messageGo to next message
David Howells is currently offline  David Howells
Messages: 44
Registered: October 2006
Member
From: *parallels.com
Serge E. Hallyn <serge@hallyn.com> wrote:

> + /* If you have the capability in a parent user ns you have it
> + * in the over all children user namespaces as well, so see
> + * if this process has the capability in the parent user
> + * namespace.
> + */

"... in the over all children user namespaces ..." I think need a couple of
words dropping.

David
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 5/9] Allow ptrace from non-init user namespaces [message #41858 is a reply to message #41751] Wed, 23 February 2011 17:05 Go to previous messageGo to next message
David Howells is currently offline  David Howells
Messages: 44
Registered: October 2006
Member
From: *parallels.com
Serge E. Hallyn <serge@hallyn.com> wrote:

> ptrace is allowed to tasks in the same user namespace according to

There's a verb missing after 'allowed to' - presumably 'trace'.

David
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 5/9] Allow ptrace from non-init user namespaces [message #41859 is a reply to message #41751] Wed, 23 February 2011 17:11 Go to previous messageGo to next message
David Howells is currently offline  David Howells
Messages: 44
Registered: October 2006
Member
From: *parallels.com
Serge E. Hallyn <serge@hallyn.com> wrote:

> +int same_or_ancestor_user_ns(struct task_struct *task,
> + struct task_struct *victim)
> +{
> + struct user_namespace *u1 = task_cred_xxx(task, user)->user_ns;
> + struct user_namespace *u2 = task_cred_xxx(victim, user)->user_ns;

Hmmm. task_cred_xxx() uses task->real_cred, which is correct for victim (the
object), but normally you'd use task->cred for task (the subject). However,
in this case, I think it's probably okay.

David
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 1/9] Add a user_namespace as creator/owner of uts_namespace [message #41860 is a reply to message #41747] Wed, 23 February 2011 17:16 Go to previous messageGo to next message
David Howells is currently offline  David Howells
Messages: 44
Registered: October 2006
Member
From: *parallels.com
Serge E. Hallyn <serge@hallyn.com> wrote:

> struct uts_namespace {
> struct kref kref;
> struct new_utsname name;
> + struct user_namespace *user_ns;
> };

If a uts_namespace belongs to a user_namespace, should CLONE_NEWUSER then
imply CLONE_NEWUTS?

Or is uts_namespace::user_ns more an implication that the set of users in
user_namespace are the only ones authorised to alter the uts data.

I presume that the uts_namespace of a process must be owned by one of the
user_namespaces in the alternating inheritance chain of namespaces and their
creators leading from current_user_ns() to init_user_ns.

With that in mind, looking at patch 3:

- if (!capable(CAP_SYS_ADMIN))
+ if (!ns_capable(current->nsproxy->uts_ns->user_ns, CAP_SYS_ADMIN))

what is it you're actually asking? I presume it's 'does this user have
CAP_SYS_ADMIN capability over objects belonging to the uts_namespace's
user_namespace?'

So, to look at the important bit of patch 2:

-int cap_capable(struct task_struct *tsk, const struct cred *cred, int cap,
- int audit)
+int cap_capable(struct task_struct *tsk, const struct cred *cred,
+ struct user_namespace *targ_ns, int cap, int audit)
{
- return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
+ for (;;) {
+ /* The creator of the user namespace has all caps. */
+ if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
+ return 0;
+
+ /* Do we have the necessary capabilities? */
+ if (targ_ns == cred->user->user_ns)
+ return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
+
+ /* Have we tried all of the parent namespaces? */
+ if (targ_ns == &init_user_ns)
+ return -EPERM;
+
+ /* If you have the capability in a parent user ns you have it
+ * in the over all children user namespaces as well, so see
+ * if this process has the capability in the parent user
+ * namespace.
+ */
+ targ_ns = targ_ns->creator->user_ns;
+ }
+
+ /* We never get here */
+ return -EPERM;
}

On entry, as we're called from ns_capable(), cred->user is the user that the
current process is running as, and, as such, may be in a separate namespace
from uts_namespace - which may itself be in a separate namespace from
init_user_ns.

So, assume for the sake of argument that there are three user_namespaces along
the chain from the calling process to the root, and that the uts_namespace
belongs to the middle one.

if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
return 0;

Can never match because targ_ns->creator cannot be cred->user; even if the
uts_namespace belongs to our namespace, given that the creator lies outside
our namespace.

if (targ_ns == cred->user->user_ns)
return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;

Can only match if we are in the target user_namespace (ie. the one to which
uts_namespace belongs), whether or not we have CAP_SYS_ADMIN.

Which means that unless the uts_namespace belongs to our user_namespace, we
cannot change it. Is that correct?

So ns_capable() restricts you to only doing interesting things to objects that
belong to a user_namespace if they are in your own user_namespace. Is that
correct?

If that is so, is the loop required for ns_capable()?


Looking further at patch 2:

#define nsown_capable(cap) (ns_capable(current_user_ns(), (cap)))

Given what I've said above, I presume the loop isn't necessary here either.


I think you're using ns_capable() in two different ways:

(1) You're using it to see if a process has power over its descendents in a
user_namespace that can be traced back to a clone() that it did with
CLONE_NEWUSER.

For example, automatically granting a process permission to kill
descendents in a namespace it created.

(2) You're using it to see if a process can access objects that might be
outside its own user_namespace.

For example, setting the hostname.

Is it worth giving two different interfaces to make this clearer (even if they
actually do the same thing)?


Sorry if this seems rambly, but I'm trying to get my head round your code.

David

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: User namespaces and keys [message #41861 is a reply to message #41856] Wed, 23 February 2011 19:24 Go to previous messageGo to next message
Casey Schaufler is currently offline  Casey Schaufler
Messages: 6
Registered: November 2007
Junior Member
From: *parallels.com
On 2/23/2011 7:53 AM, Serge E. Hallyn wrote:
> Quoting Eric W. Biederman (ebiederm@xmission.com):
>> David Howells <dhowells@redhat.com> writes:
>>
>>> Serge E. Hallyn <serge@hallyn.com> wrote:
>>>
>>>>> I guess we need to look at how to mix keys and namespaces again.
>>>> From strictly kernel pov, at the moment, keys are strictly usable only
>>>> by the user in your own user namespace.
>>> I'm not sure that's currently completely true. Key quota maintenance is
>>> namespaced, and the key's owner UID/GID belong to that namespace, so that's
>>> okay, but:
>>>
>>> (*) key_task_permission() does not distinguish UIDs and GIDs from different
>>> namespaces.
>>>
>>> (*) A key can be referred to by its serial number, no matter whose namespace
>>> it is in, and will yield up its given UID/GID, even if these aren't
>>> actually meaningful in your namespace.
>>>
>>> This means request_key() can successfully upcall at the moment.
>>>
>>> I wonder if I should make the following changes:
>>>
>>> (1) If the key and the accessor are in different user namespaces, then skip
>>> the UID and GID comparisons in key_task_permission(). That means that to
>>> be able to access the key you'd have to possess the key and the key would
>>> have to grant you Possessor access, or the key would have to grant you
>>> Other access.
>>>
>>> (2) If the key and someone viewing the key description are in different
>>> namespaces, then indicate that the UID and the GID are -1, irrespective of
>>> the actual values.
>>>
>>> (3) When an upcall is attempting to instantiate a key, it is allowed to access
>>> the keys of requestor using the requestor's credentials (UID, GID, groups,
>>> security label). Ensure that this will be done in the requestor's user
>>> namespace.
>>>
>>> Nothing should need to be done here, since search_process_keyrings()
>>> switches to the requestor's creds.
>>>
>>> Oh, and are security labels user-namespaced?
>> Not at this time. The user namespace as currently merged is little more
>> than a place holder for a proper implementation. Serge is busily
>> fleshing out that proper implementation.
>>
>> Until we reach the point where all checks that have historically been
>> "if (uid1 == uid2)" become "if ((uidns1 == uidns2) && (uid1 == uid2))"
>> there will be problems.
>>
>> The security labels and probably lsm's in general need to be per user
>> namespace but we simply have not gotten that far. For the short term I
>> will be happy when we get a minimally usable user namespace.
> Note also that when Eric brought this up at the LSM mini-conf two or three
> years ago, there was pretty general, strong objection to the idea.
>
> Like Eric says, I think that'll have to wait. In the meantime, isolating
> user namespace sandboxes (or containers) using simple LSM configurations
> is a very good idea.

I confess that I remain less well educated on namespaces than
I probably should be, but with what I do know it seems that the
relationships between user namespaces and LSMs are bound to be
strained from the beginning. Some LSMs (SELinux and Smack) are
providing similar sandbox capabilities to what you get from user
namespaces, but from different directions and with different
use cases.

> -serge
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: User namespaces and keys [message #41863 is a reply to message #41861] Wed, 23 February 2011 20:55 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
From: *parallels.com
Casey Schaufler <casey@schaufler-ca.com> writes:

> I confess that I remain less well educated on namespaces than
> I probably should be, but with what I do know it seems that the
> relationships between user namespaces and LSMs are bound to be
> strained from the beginning. Some LSMs (SELinux and Smack) are
> providing similar sandbox capabilities to what you get from user
> namespaces, but from different directions and with different
> use cases.

Casey I won't argue about the possibility of things being strained, but
I think if we focus on the semantics and not on the end goal of exactly
how the pieces are to be used there can be some reasonable dialog.

>From 10,000 feet an user namespace represents one administrative domain.
uids, gids and other external tokens are all controlled by a single
group with a single security policy. In that single administrative
domain things like nfs are expected to work without translating uids and
gids. Before the implementation of user namespaces a single kernel
could not even express the ability of dealing with multiple user
namespaces simultaneously (nfs has usually punted and said despite being
multiple machines you still have to be in the same administrative domain
so the same user identifiers can be used throughout).

>From that principle we have the concept that use assigned/visible
identifiers uid, gid, capabilities, security keys?, security labels?
logically belong to the user namespace.

Which means in implementing there are two pieces of work in implementing
the user namespace.

- Find all of the interesting comparisons and change them from
"if (id == id)" to "if (userns == userns) && (id == id)".

- Potentially define and handle what happens when you mix user
namespaces.

I think for the first round of this we simply want to define lsms
and the user namespace as not mixing or the lsm interfaces only being
visible if you are in the initial user namespace. Thus reserving the
definition of what happens when you have lsms and multiple user
namespaces as something we can define later. I expect the proper
definition is something that would allow nested lsm policy.

Regardless. The namespaces are all about making the identifiers that
processes deal with local to the namespace, while the underlying object
manipulations should not care.

The big advantage of the user namespace is that it is the only way I can
see to get us out of the bind we are in with suid root executables,
where we can not enable new features to untrusted applications that can
change the environment for a suid root executable, because it might
confuse those suid root executables to misusing their power. Once
inside a user namespace nothing has dangerous powers because even a
root owned process with a full set of capabilities is less powerful than
a guest user outside of the namespace. No process having dangerous
powers allows us to enable unsharing of other namespaces and potentially
other things that today are restricted with capabilities only because
they can be used to confuse a privileged executable to do something
malicious.

Eric

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 1/9] Add a user_namespace as creator/owner of uts_namespace [message #41864 is a reply to message #41860] Wed, 23 February 2011 21:21 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
From: *parallels.com
David Howells <dhowells@redhat.com> writes:

> Serge E. Hallyn <serge@hallyn.com> wrote:
>
>> struct uts_namespace {
>> struct kref kref;
>> struct new_utsname name;
>> + struct user_namespace *user_ns;
>> };
>
> If a uts_namespace belongs to a user_namespace, should CLONE_NEWUSER then
> imply CLONE_NEWUTS?
>
> Or is uts_namespace::user_ns more an implication that the set of users in
> user_namespace are the only ones authorised to alter the uts data.

The later.

> I presume that the uts_namespace of a process must be owned by one of the
> user_namespaces in the alternating inheritance chain of namespaces and their
> creators leading from current_user_ns() to init_user_ns.
>
> With that in mind, looking at patch 3:
>
> - if (!capable(CAP_SYS_ADMIN))
> + if (!ns_capable(current->nsproxy->uts_ns->user_ns, CAP_SYS_ADMIN))
>
> what is it you're actually asking? I presume it's 'does this user have
> CAP_SYS_ADMIN capability over objects belonging to the uts_namespace's
> user_namespace?'

Yes.

> So, to look at the important bit of patch 2:
>
> -int cap_capable(struct task_struct *tsk, const struct cred *cred, int cap,
> - int audit)
> +int cap_capable(struct task_struct *tsk, const struct cred *cred,
> + struct user_namespace *targ_ns, int cap, int audit)
> {
> - return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
> + for (;;) {
> + /* The creator of the user namespace has all caps. */
> + if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
> + return 0;
> +
> + /* Do we have the necessary capabilities? */
> + if (targ_ns == cred->user->user_ns)
> + return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
> +
> + /* Have we tried all of the parent namespaces? */
> + if (targ_ns == &init_user_ns)
> + return -EPERM;
> +
> + /* If you have the capability in a parent user ns you have it
> + * in the over all children user namespaces as well, so see
> + * if this process has the capability in the parent user
> + * namespace.
> + */
> + targ_ns = targ_ns->creator->user_ns;
> + }
> +
> + /* We never get here */
> + return -EPERM;
> }
>
> On entry, as we're called from ns_capable(), cred->user is the user that the
> current process is running as, and, as such, may be in a separate namespace
> from uts_namespace - which may itself be in a separate namespace from
> init_user_ns.
>
> So, assume for the sake of argument that there are three user_namespaces along
> the chain from the calling process to the root, and that the uts_namespace
> belongs to the middle one.

So we have the nested stack of:
user_ns3->creator->user_ns == user_ns2
user_ns2->creator->user_ns == &init_user_ns
uts_ns2->user_ns == user_ns2

>
> if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
> return 0;
>
> Can never match because targ_ns->creator cannot be cred->user; even if the
> uts_namespace belongs to our namespace, given that the creator lies outside
> our namespace.

Initially we come in with targ_ns == user_ns2 and cred->user->user_ns in
one of (user_ns3, user_ns2, or &init_user_ns).

targ_ns takes on values user_ns2 and &init_user_ns.

So when targ_ns becomes &init_user_ns. If the user in question is
uts_ns2->user_ns->creator. This check will indeed match.

> if (targ_ns == cred->user->user_ns)
> return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
>
> Can only match if we are in the target user_namespace (ie. the one to which
> uts_namespace belongs), whether or not we have CAP_SYS_ADMIN.

As before targ_ns takes on values of user_ns2 and &init_user_ns.

Which means this check will match if we have CAP_SYS_ADMIN in
&init_user_ns or in user_ns2.

> Which means that unless the uts_namespace belongs to our user_namespace, we
> cannot change it. Is that correct?

No. If you are root in a parent namespace you can also change it.

> So ns_capable() restricts you to only doing interesting things to objects that
> belong to a user_namespace if they are in your own user_namespace. Is that
> correct?

No. Root outside your user namespace is also allowed to do interesting
things.

> If that is so, is the loop required for ns_capable()?

Yes.


> Looking further at patch 2:
>
> #define nsown_capable(cap) (ns_capable(current_user_ns(), (cap)))
>
> Given what I've said above, I presume the loop isn't necessary here either.
>
>
> I think you're using ns_capable() in two different ways:
>
> (1) You're using it to see if a process has power over its descendents in a
> user_namespace that can be traced back to a clone() that it did with
> CLONE_NEWUSER.
>
> For example, automatically granting a process permission to kill
> descendents in a namespace it created.
>
> (2) You're using it to see if a process can access objects that might be
> outside its own user_namespace.
>
> For example, setting the hostname.
>
> Is it worth giving two different interfaces to make this clearer (even if they
> actually do the same thing)?
>
>
> Sorry if this seems rambly, but I'm trying to get my head round your
> code.

I am all for making that loop a little clearer, because it is something
you have to pause and think about to understand but so far I don't think
the loop is wrong, and it is simple.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: User namespaces and keys [message #41865 is a reply to message #41863] Wed, 23 February 2011 21:37 Go to previous messageGo to next message
Casey Schaufler is currently offline  Casey Schaufler
Messages: 6
Registered: November 2007
Junior Member
From: *parallels.com
On 2/23/2011 12:55 PM, Eric W. Biederman wrote:
> Casey Schaufler <casey@schaufler-ca.com> writes:
>
>> I confess that I remain less well educated on namespaces than
>> I probably should be, but with what I do know it seems that the
>> relationships between user namespaces and LSMs are bound to be
>> strained from the beginning. Some LSMs (SELinux and Smack) are
>> providing similar sandbox capabilities to what you get from user
>> namespaces, but from different directions and with different
>> use cases.
> Casey I won't argue about the possibility of things being strained, but
> I think if we focus on the semantics and not on the end goal of exactly
> how the pieces are to be used there can be some reasonable dialog.

I'm sure that there will be cases where they will work together
like horses in a troika. Making sensible semantics for the interactions
is key, and it is entirely possible that in some cases a comparison
of semantics and behaviors will lead an end user to chose either an
LSM or namespaces over the combination. Just like I expect that even
when we allow multiple LSMs the SELinux and Smack combination will be
rare among the sane.

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 1/9] Add a user_namespace as creator/owner of uts_namespace [message #41867 is a reply to message #41747] Wed, 23 February 2011 23:54 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
From: *parallels.com
David Howells <dhowells@redhat.com> writes:

> Eric W. Biederman <ebiederm@xmission.com> wrote:
>
>> > Which means that unless the uts_namespace belongs to our user_namespace, we
>> > cannot change it. Is that correct?
>>
>> No. If you are root in a parent namespace you can also change it.
>
> But surely, by definition, if you're a user in this namespace, you can't also
> be root in a parent namespace...

To be clear the case you looked at was:

> - if (!capable(CAP_SYS_ADMIN))
> + if (!ns_capable(current->nsproxy->uts_ns->user_ns, CAP_SYS_ADMIN))
>
> what is it you're actually asking? I presume it's 'does this user have
> CAP_SYS_ADMIN capability over objects belonging to the uts_namespace's
> user_namespace?'

Here "current->nsproxy->uts_ns->user_ns" (the target_ns value) is only
refers to the uts_ns we are talking about.

The user itself comes from current_user().

> For the case I worked through current_user() is a member of current_user_ns()
> and can't also be a member of its parent, grandparent, etc. - or can
> it?

Right now if you are looking at current_user() because of limitations in
the creation ordering I think you are correct.

However in the near term pile of changes to merge, are the syscalls for
joining an existing namespace. At which point there is no reason in
general to suppose the current limitations of creation apply.

Although it is conceivable that unshare of namespaces can also get you
to someplace similar to joining prexisting namespaces.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 1/9] Add a user_namespace as creator/owner of uts_namespace [message #41868 is a reply to message #41864] Wed, 23 February 2011 23:19 Go to previous messageGo to next message
David Howells is currently offline  David Howells
Messages: 44
Registered: October 2006
Member
From: *parallels.com
Eric W. Biederman <ebiederm@xmission.com> wrote:

> > Which means that unless the uts_namespace belongs to our user_namespace, we
> > cannot change it. Is that correct?
>
> No. If you are root in a parent namespace you can also change it.

But surely, by definition, if you're a user in this namespace, you can't also
be root in a parent namespace...

For the case I worked through current_user() is a member of current_user_ns()
and can't also be a member of its parent, grandparent, etc. - or can it?

David
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 5/9] Allow ptrace from non-init user namespaces [message #41874 is a reply to message #41781] Thu, 24 February 2011 00:43 Go to previous messageGo to next message
serge is currently offline  serge
Messages: 72
Registered: January 2007
Member
From: *parallels.com
Quoting Andrew Morton (akpm@linux-foundation.org):
> On Thu, 17 Feb 2011 15:03:33 +0000
> "Serge E. Hallyn" <serge@hallyn.com> wrote:
>
> > ptrace is allowed to tasks in the same user namespace according to
> > the usual rules (i.e. the same rules as for two tasks in the init
> > user namespace). ptrace is also allowed to a user namespace to
> > which the current task the has CAP_SYS_PTRACE capability.
> >
> >
> > ...
> >
> > --- a/include/linux/capability.h
> > +++ b/include/linux/capability.h
> > @@ -546,6 +546,8 @@ extern const kernel_cap_t __cap_init_eff_set;
> > */
> > #define has_capability(t, cap) (security_real_capable((t), &init_user_ns, (cap)) == 0)
> >
> > +#define has_ns_capability(t, ns, cap) (security_real_capable((t), (ns), (cap)) == 0)
>
> macroitis.

Thanks for the review, Andrew. Unfortunately this one is hard to turn
into a function beecause it uses security_real_capable(), which is
sometimes defined in security/security.c as a real function, and
other times as a static inline in include/linux/security.h. So
I'd have to #include security.h in capability.h, but security.h
already #includes capability.h.

All the other comments affect same_or_ancestor_user_ns(), which
following Eric's feedback is going away.

> > /**
> > * has_capability_noaudit - Determine if a task has a superior capability available (unaudited)
> > * @t: The task in question
> > diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
> > index faf4679..862fc59 100644
> > --- a/include/linux/user_namespace.h
> > +++ b/include/linux/user_namespace.h
> > @@ -39,6 +39,9 @@ static inline void put_user_ns(struct user_namespace *ns)
> > uid_t user_ns_map_uid(struct user_namespace *to, const struct cred *cred, uid_t uid);
> > gid_t user_ns_map_gid(struct user_namespace *to, const struct cred *cred, gid_t gid);
> >
> > +int same_or_ancestor_user_ns(struct task_struct *task,
> > + struct task_struct *victim);
>
> bool.
>
> > #else
> >
> > static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
> >
> > ...
> >
> > --- a/kernel/user_namespace.c
> > +++ b/kernel/user_namespace.c
> > @@ -129,6 +129,22 @@ gid_t user_ns_map_gid(struct user_namespace *to, const struct cred *cred, gid_t
> > return overflowgid;
> > }
> >
> > +int same_or_ancestor_user_ns(struct task_struct *task,
> > + struct task_struct *victim)
> > +{
> > + struct user_namespace *u1 = task_cred_xxx(task, user)->user_ns;
> > + struct user_namespace *u2 = task_cred_xxx(victim, user)->user_ns;
> > + for (;;) {
> > + if (u1 == u2)
> > + return 1;
> > + if (u1 == &init_user_ns)
> > + return 0;
> > + u1 = u1->creator->user_ns;
> > + }
> > + /* We never get here */
> > + return 0;
>
> Remove?
>
> > +}
> > +
> > static __init int user_namespaces_init(void)
> > {
> > user_ns_cachep = KMEM_CACHE(user_namespace, SLAB_PANIC);
> >
> > ...
> >
> > int cap_ptrace_access_check(struct task_struct *child, unsigned int mode)
> > {
> > int ret = 0;
> > + const struct cred *cred, *tcred;
> >
> > rcu_read_lock();
> > - if (!cap_issubset(__task_cred(child)->cap_permitted,
> > - current_cred()->cap_permitted) &&
> > - !capable(CAP_SYS_PTRACE))
> > - ret = -EPERM;
> > + cred = current_cred();
> > + tcred = __task_cred(child);
> > + /*
> > + * The ancestor user_ns check may be gratuitous, as I think
> > + * we've already guaranteed that in kernel/ptrace.c.
> > + */
>
> ?
>
> > + if (same_or_ancestor_user_ns(current, child) &&
> > + cap_issubset(tcred->cap_permitted, cred->cap_permitted))
> > + goto out;
> > + if (ns_capable(tcred->user->user_ns, CAP_SYS_PTRACE))
> > + goto out;
> > + ret = -EPERM;
> > +out:
> > rcu_read_unlock();
> > return ret;
> > }
> >
> > ...
> >
>
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 4/9] allow killing tasks in your own or child userns [message #41876 is a reply to message #41780] Thu, 24 February 2011 00:48 Go to previous messageGo to next message
serge is currently offline  serge
Messages: 72
Registered: January 2007
Member
From: *parallels.com
Quoting Andrew Morton (akpm@linux-foundation.org):
> On Thu, 17 Feb 2011 15:03:25 +0000
> "Serge E. Hallyn" <serge@hallyn.com> wrote:
>
> > /*
> > + * called with RCU read lock from check_kill_permission()
> > + */
> > +static inline int kill_ok_by_cred(struct task_struct *t)
> > +{
> > + const struct cred *cred = current_cred();
> > + const struct cred *tcred = __task_cred(t);
> > +
> > + if (cred->user->user_ns == tcred->user->user_ns &&
> > + (cred->euid == tcred->suid ||
> > + cred->euid == tcred->uid ||
> > + cred->uid == tcred->suid ||
> > + cred->uid == tcred->uid))
> > + return 1;
> > +
> > + if (ns_capable(tcred->user->user_ns, CAP_KILL))
> > + return 1;
> > +
> > + return 0;
> > +}
>
> The compiler will inline this for us.

Is that simply true with everything (worth inlining) nowadays, or is
there a particular implicit hint to the compiler that'll make that
happen?

Not that I guess it's even particularly important in this case.

From: Serge E. Hallyn <serge.hallyn@canonical.com>
Date: Thu, 24 Feb 2011 00:26:02 +0000
Subject: [PATCH 1/2] userns: let compiler inline kill_ok_by_cred (per akpm)

Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
---
kernel/signal.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/signal.c b/kernel/signal.c
index ffe4bdf..12702b4 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -638,7 +638,7 @@ static inline bool si_fromuser(const struct siginfo *info)
/*
* called with RCU read lock from check_kill_permission()
*/
-static inline int kill_ok_by_cred(struct task_struct *t)
+static int kill_ok_by_cred(struct task_struct *t)
{
const struct cred *cred = current_cred();
const struct cred *tcred = __task_cred(t);
--
1.7.0.4

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
[PATCH] userns: ptrace: incorporate feedback from Eric [message #41877 is a reply to message #41772] Thu, 24 February 2011 00:49 Go to previous messageGo to next message
serge is currently offline  serge
Messages: 72
Registered: January 2007
Member
From: *parallels.com
same_or_ancestore_user_ns() was not an appropriate check to
constrain cap_issubset. Rather, cap_issubset() only is
meaningful when both capsets are in the same user_ns.

Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
---
include/linux/user_namespace.h | 9 ---------
kernel/user_namespace.c | 16 ----------------
security/commoncap.c | 28 ++++++++++------------------
3 files changed, 10 insertions(+), 43 deletions(-)

diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 862fc59..faf4679 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -39,9 +39,6 @@ static inline void put_user_ns(struct user_namespace *ns)
uid_t user_ns_map_uid(struct user_namespace *to, const struct cred *cred, uid_t uid);
gid_t user_ns_map_gid(struct user_namespace *to, const struct cred *cred, gid_t gid);

-int same_or_ancestor_user_ns(struct task_struct *task,
- struct task_struct *victim);
-
#else

static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
@@ -69,12 +66,6 @@ static inline gid_t user_ns_map_gid(struct user_namespace *to,
return gid;
}

-static inline int same_or_ancestor_user_ns(struct task_struct *task,
- struct task_struct *victim)
-{
- return 1;
-}
-
#endif

#endif /* _LINUX_USER_H */
diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
index 0ef2258..9da289c 100644
--- a/kernel/user_namespace.c
+++ b/kernel/user_namespace.c
@@ -129,22 +129,6 @@ gid_t user_ns_map_gid(struct user_namespace *to, const struct cred *cred, gid_t
return overflowgid;
}

-int same_or_ancestor_user_ns(struct task_struct *task,
- struct task_struct *victim)
-{
- struct user_namespace *u1 = task_cred_xxx(task, user)->user_ns;
- struct user_namespace *u2 = task_cred_xxx(victim, user)->user_ns;
- for (;;) {
- if (u1 == u2)
- return 1;
- if (u1 == &init_user_ns)
- return 0;
- u1 = u1->creator->user_ns;
- }
- /* We never get here */
- return 0;
-}
-
static __init int user_namespaces_init(void)
{
user_ns_cachep = KMEM_CACHE(user_namespace, SLAB_PANIC);
diff --git a/security/commoncap.c b/security/commoncap.c
index 12ff65c..526106f 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -142,19 +142,15 @@ int cap_settime(struct timespec *ts, struct timezone *tz)
int cap_ptrace_access_check(struct task_struct *child, unsigned int mode)
{
int ret = 0;
- const struct cred *cred, *tcred;
+ const struct cred *cred, *child_cred;

rcu_read_lock();
cred = current_cred();
- tcred = __task_cred(child);
- /*
- * The ancestor user_ns check may be gratuitous, as I think
- * we've already guaranteed that in kernel/ptrace.c.
- */
- if (same_or_ancestor_user_ns(current, child) &&
- cap_issubset(tcred->cap_permitted, cred->cap_permitted))
+ child_cred = __task_cred(child);
+ if (cred->user->user_ns == child_cred->user->user_ns &&
+ cap_issubset(child_cred->cap_permitted, cred->cap_permitted))
goto out;
- if (ns_capable(tcred->user->user_ns, CAP_SYS_PTRACE))
+ if (ns_capable(child_cred->user->user_ns, CAP_SYS_PTRACE))
goto out;
ret = -EPERM;
out:
@@ -178,19 +174,15 @@ out:
int cap_ptrace_traceme(struct task_struct *parent)
{
int ret = 0;
- const struct cred *cred, *tcred;
+ const struct cred *cred, *child_cred;

rcu_read_lock();
cred = __task_cred(parent);
- tcred = current_cred();
- /*
- * The ancestor user_ns check may be gratuitous, as I think
- * we've already guaranteed that in kernel/ptrace.c.
- */
- if (same_or_ancestor_user_ns(parent, current) &&
- cap_issubset(tcred->cap_permitted, cred->cap_permitted))
+ child_cred = current_cred();
+ if (cred->user->user_ns == child_cred->user->user_ns &&
+ cap_issubset(child_cred->cap_permitted, cred->cap_permitted))
goto out;
- if (has_ns_capability(parent, tcred->user->user_ns, CAP_SYS_PTRACE))
+ if (has_ns_capability(parent, child_cred->user->user_ns, CAP_SYS_PTRACE))
goto out;
ret = -EPERM;
out:
--
1.7.0.4

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 4/9] allow killing tasks in your own or child userns [message #41878 is a reply to message #41876] Thu, 24 February 2011 00:54 Go to previous messageGo to next message
akpm is currently offline  akpm
Messages: 224
Registered: March 2007
Senior Member
From: *parallels.com
On Thu, 24 Feb 2011 00:48:18 +0000
"Serge E. Hallyn" <serge@hallyn.com> wrote:

> Quoting Andrew Morton (akpm@linux-foundation.org):
> > On Thu, 17 Feb 2011 15:03:25 +0000
> > "Serge E. Hallyn" <serge@hallyn.com> wrote:
> >
> > > /*
> > > + * called with RCU read lock from check_kill_permission()
> > > + */
> > > +static inline int kill_ok_by_cred(struct task_struct *t)
> > > +{
> > > + const struct cred *cred = current_cred();
> > > + const struct cred *tcred = __task_cred(t);
> > > +
> > > + if (cred->user->user_ns == tcred->user->user_ns &&
> > > + (cred->euid == tcred->suid ||
> > > + cred->euid == tcred->uid ||
> > > + cred->uid == tcred->suid ||
> > > + cred->uid == tcred->uid))
> > > + return 1;
> > > +
> > > + if (ns_capable(tcred->user->user_ns, CAP_KILL))
> > > + return 1;
> > > +
> > > + return 0;
> > > +}
> >
> > The compiler will inline this for us.
>
> Is that simply true with everything (worth inlining) nowadays, or is
> there a particular implicit hint to the compiler that'll make that
> happen?

We've basically stopped inlining things nowadays. gcc inlines
aggressively and sometimes we have to use noinline to stop it. Also,
modern gcc's like to ignore the inline directive anwyay, so we have to
resort to __always_inline when we disagree.


_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH] userns: ptrace: incorporate feedback from Eric [message #41879 is a reply to message #41877] Thu, 24 February 2011 00:56 Go to previous messageGo to next message
akpm is currently offline  akpm
Messages: 224
Registered: March 2007
Senior Member
From: *parallels.com
On Thu, 24 Feb 2011 00:49:01 +0000
"Serge E. Hallyn" <serge@hallyn.com> wrote:

> same_or_ancestore_user_ns() was not an appropriate check to
> constrain cap_issubset. Rather, cap_issubset() only is
> meaningful when both capsets are in the same user_ns.

I queued this as a fix against
userns-allow-ptrace-from-non-init-user-namespaces.patch, but I get the
feeling that it would be better to just drop everything and start
again?

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH] userns: ptrace: incorporate feedback from Eric [message #41882 is a reply to message #41879] Thu, 24 February 2011 03:15 Go to previous messageGo to next message
serge is currently offline  serge
Messages: 72
Registered: January 2007
Member
From: *parallels.com
Quoting Andrew Morton (akpm@linux-foundation.org):
> On Thu, 24 Feb 2011 00:49:01 +0000
> "Serge E. Hallyn" <serge@hallyn.com> wrote:
>
> > same_or_ancestore_user_ns() was not an appropriate check to
> > constrain cap_issubset. Rather, cap_issubset() only is
> > meaningful when both capsets are in the same user_ns.
>
> I queued this as a fix against
> userns-allow-ptrace-from-non-init-user-namespaces.patch, but I get the
> feeling that it would be better to just drop everything and start
> again?

Sure, I'll rebase and resend. I wonder if I should trim the Cc list
for the next round.

thanks,
-serge
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 9/9] userns: check user namespace for task-&gt;file uid equivalence checks [message #41883 is a reply to message #41784] Thu, 24 February 2011 03:24 Go to previous messageGo to next message
serge is currently offline  serge
Messages: 72
Registered: January 2007
Member
From: *parallels.com
Quoting Andrew Morton (akpm@linux-foundation.org):
> On Thu, 17 Feb 2011 15:04:07 +0000
> "Serge E. Hallyn" <serge@hallyn.com> wrote:
>
> > Cheat for now and say all files belong to init_user_ns. Next
> > step will be to let superblocks belong to a user_ns, and derive
> > inode_userns(inode) from inode->i_sb->s_user_ns. Finally we'll
> > introduce more flexible arrangements.
> >
> >
> > ...
> >
> > +
> > +/*
> > + * return 1 if current either has CAP_FOWNER to the
> > + * file, or owns the file.
> > + */
> > +int is_owner_or_cap(const struct inode *inode)
> > +{
> > + struct user_namespace *ns = inode_userns(inode);
> > +
> > + if (current_user_ns() == ns && current_fsuid() == inode->i_uid)
> > + return 1;
> > + if (ns_capable(ns, CAP_FOWNER))
> > + return 1;
> > + return 0;
> > +}
>
> bool?
>
> > +EXPORT_SYMBOL(is_owner_or_cap);
>
> There's a fairly well adhered to convention that global symbols (and
> often static symbols) have a prefix which identifies the subsystem to
> which they belong. This patchset rather scorns that convention.
>
> Most of these identifiers are pretty obviously from the capability
> subsystem, but still...

Would 'inode_owner_or_capable' be better and and make sense?

-serge
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: [PATCH 9/9] userns: check user namespace for task-&gt;file uid equivalence checks [message #41884 is a reply to message #41883] Thu, 24 February 2011 05:08 Go to previous messageGo to next message
akpm is currently offline  akpm
Messages: 224
Registered: March 2007
Senior Member
From: *parallels.com
On Thu, 24 Feb 2011 03:24:16 +0000 "Serge E. Hallyn" <serge@hallyn.com> wrote:

> Quoting Andrew Morton (akpm@linux-foundation.org):
> > On Thu, 17 Feb 2011 15:04:07 +0000
> > "Serge E. Hallyn" <serge@hallyn.com> wrote:
> >
> > There's a fairly well adhered to convention that global symbols (and
> > often static symbols) have a prefix which identifies the subsystem to
> > which they belong. This patchset rather scorns that convention.
> >
> > Most of these identifiers are pretty obviously from the capability
> > subsystem, but still...
>
> Would 'inode_owner_or_capable' be better and and make sense?
>

I suppose so. We've totally screwed that pooch in the VFS (grep EXPORT
fs/inode.c). But it's never to late to start.

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Re: User namespaces and keys [message #41886 is a reply to message #41865] Thu, 24 February 2011 06:56 Go to previous message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
From: *parallels.com
Casey Schaufler <casey@schaufler-ca.com> writes:

> On 2/23/2011 12:55 PM, Eric W. Biederman wrote:
>> Casey Schaufler <casey@schaufler-ca.com> writes:
>>
>>> I confess that I remain less well educated on namespaces than
>>> I probably should be, but with what I do know it seems that the
>>> relationships between user namespaces and LSMs are bound to be
>>> strained from the beginning. Some LSMs (SELinux and Smack) are
>>> providing similar sandbox capabilities to what you get from user
>>> namespaces, but from different directions and with different
>>> use cases.
>> Casey I won't argue about the possibility of things being strained, but
>> I think if we focus on the semantics and not on the end goal of exactly
>> how the pieces are to be used there can be some reasonable dialog.
>
> I'm sure that there will be cases where they will work together
> like horses in a troika. Making sensible semantics for the interactions
> is key, and it is entirely possible that in some cases a comparison
> of semantics and behaviors will lead an end user to chose either an
> LSM or namespaces over the combination. Just like I expect that even
> when we allow multiple LSMs the SELinux and Smack combination will be
> rare among the sane.

That sounds about right.

Eric

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containe rs
Previous Topic: [PATCH 0/5] blk-throttle: writeback and swap IO control
Next Topic: [PATCH 1/4] userns: let clone_uts_ns() handle setting uts-&gt;user_ns
Goto Forum:
  


Current Time: Tue Sep 25 11:59:18 GMT 2018