| userns: targeted capabilities v5
	| 
		
			| Re: [PATCH 2/9] security: Make capabilities relative to the user namespace. [message #41847 is a reply to message #41744] | Wed, 23 February 2011 12:01   |  
			| 
				
				
					|  David Howells Messages: 44
 Registered: October 2006
 | Member |  |  |  
	| David Howells <dhowells@redhat.com> wrote: 
 > >  	int (*capable) (struct task_struct *tsk, const struct cred *cred,
 > > -			int cap, int audit);
 > > +			struct user_namespace *ns, int cap, int audit);
 >
 > Hmmm...  A chunk of the contents of the cred struct are user-namespaced.
 > Could you add the user_namespace pointer to the cred struct and thus avoid
 > passing it as an argument to other things.
 
 Ah, no...  Ignore that, I think I see that you do need it.
 
 > +int cap_capable(struct task_struct *tsk, const struct cred *cred,
 > +		struct user_namespace *targ_ns, int cap, int audit)
 >  {
 > -	return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
 > +	for (;;) {
 > +		/* The creator of the user namespace has all caps. */
 > +		if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
 > +			return 0;
 
 Why is that last comment so?  Why should the creating namespace sport all
 possible capabilities?  Do you have to have all capabilities available to you
 to be permitted to create a new user namespace?
 
 Also, would it be worth having a separate cap_ns_capable()?  Wouldn't most
 calls to cap_capable() only be checking the caps granted in the current user
 namespace?
 
 David
 _______________________________________________
 Containers mailing list
 Containers@lists.linux-foundation.org
 https://lists.linux-foundation.org/mailman/listinfo/containers
 |  
	|  |  |  
	| 
		
			| Re: [PATCH 2/9] security: Make capabilities relative to the user namespace. [message #41848 is a reply to message #41744] | Wed, 23 February 2011 11:40   |  
			| 
				
				
					|  David Howells Messages: 44
 Registered: October 2006
 | Member |  |  |  
	| Serge E. Hallyn <serge@hallyn.com> wrote: 
 >  	int (*capable) (struct task_struct *tsk, const struct cred *cred,
 > -			int cap, int audit);
 > +			struct user_namespace *ns, int cap, int audit);
 
 Hmmm...  A chunk of the contents of the cred struct are user-namespaced.
 Could you add the user_namespace pointer to the cred struct and thus avoid
 passing it as an argument to other things.
 
 In fact, I think you probably have to do this so that cachefiles, for example,
 can override the user namespace when it operates on behalf of a process that's
 in another namespace.
 
 In fact, looking later on in your patch, I see:
 
 > -		    || (cap_capable(current, current_cred(), CAP_SETPCAP,
 > +		    || (cap_capable(current, current_cred(),
 > +				    current_cred()->user->user_ns, CAP_SETPCAP,
 
 so the user_ns _is_ already available through the creds.  So is there really a
 need to pass it as an argument to anything that already takes a cred?
 
 >   *	@cred contains the credentials to use.
 > + *      @ns contains the user namespace we want the capability in
 >   *	@cap contains the capability <include/linux/capability.h>.
 
 That should be tabbed to match the lines either side.
 
 David
 |  
	|  |  |  
	|  |  
	| 
		
			| Re: [PATCH 2/9] security: Make capabilities relative to the user namespace. [message #41850 is a reply to message #41847] | Wed, 23 February 2011 13:43   |  
			| 
				
				
					|  serge Messages: 72
 Registered: January 2007
 | Member |  |  |  
	| Quoting David Howells (dhowells@redhat.com):
 > David Howells <dhowells@redhat.com> wrote:
 >
 > > >  	int (*capable) (struct task_struct *tsk, const struct cred *cred,
 > > > -			int cap, int audit);
 > > > +			struct user_namespace *ns, int cap, int audit);
 > >
 > > Hmmm...  A chunk of the contents of the cred struct are user-namespaced.
 > > Could you add the user_namespace pointer to the cred struct and thus avoid
 > > passing it as an argument to other things.
 >
 > Ah, no...  Ignore that, I think I see that you do need it.
 
 Thanks for reviewing, David.
 
 > > +int cap_capable(struct task_struct *tsk, const struct cred *cred,
 > > +		struct user_namespace *targ_ns, int cap, int audit)
 > >  {
 > > -	return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
 > > +	for (;;) {
 > > +		/* The creator of the user namespace has all caps. */
 > > +		if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
 > > +			return 0;
 >
 > Why is that last comment so?  Why should the creating namespace sport all
 > possible capabilities?  Do you have to have all capabilities available to you
 > to be permitted to create a new user namespace?
 
 It's not the creating namespace, but the creating user, which has all caps.
 
 So for instance, if uid 500 in init_user_ns creates a namespace, then:
 a. uid 500 in init_user_ns has all caps to the child user_ns, so it can
 kill the tasks in the userns, clean up, etc.
 b. uid X in any other child user_ns has no caps to the child user_ns.
 c. root in init_user_ns has whatever capabilities are in his pE to the
 child user_ns.  Again, this is so that the admin in any user_ns can
 clean up any messes made by users in his user_ns.
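 As a rough illustration of cases a-c, here is a user-space model of the
 lookup loop from the patch.  It is only a sketch: the structs are cut-down
 stand-ins for the kernel's, model_cap_capable() is a made-up name, the
 effective set is reduced to one bitmask, and the tsk and audit arguments
 are dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures (illustrative only). */
struct user_ns;

struct user {
	int uid;
	struct user_ns *user_ns;	/* namespace this user lives in */
};

struct user_ns {
	struct user *creator;		/* NULL for the initial namespace */
};

struct cred {
	struct user *user;
	unsigned long cap_effective;	/* bitmask standing in for pE */
};

#define CAP_KILL	5
#define CAP_SYS_ADMIN	21

/* Model of the initial user namespace: it has no creator. */
struct user_ns init_user_ns = { NULL };

/* Hypothetical model of the patched check: 0 if allowed, else -EPERM. */
int model_cap_capable(const struct cred *cred, struct user_ns *targ_ns, int cap)
{
	assert(cred && cred->user && targ_ns);

	for (;;) {
		/* Case a: the creator of a non-initial namespace has all caps. */
		if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
			return 0;

		/* Caller lives in the namespace being examined: use pE. */
		if (targ_ns == cred->user->user_ns)
			return (cred->cap_effective & (1UL << cap)) ? 0 : -EPERM;

		/* Case b: reached the root without ever meeting the caller. */
		if (targ_ns == &init_user_ns)
			return -EPERM;

		/* Case c: retry against the namespace the creator lives in. */
		targ_ns = targ_ns->creator->user_ns;
	}
}
```

 In this model, uid 500 passes for any capability in a namespace it
 created, a user from an unrelated child namespace is refused even with a
 full effective set, and root in init_user_ns passes only for the
 capabilities raised in its own effective set.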
 
 One of the goals of user namespaces is to make it safe to allow
 unprivileged users to create child user namespaces in which they have
 targeted privilege.  Anything which happens in that child user namespace
 should be:
 
 a. constrained to resources which the user can control anyway
 b. able to be cleaned up by the user
 c. (especially) able to be cleaned up by the privileged user in the
 parent user_ns.
 
 > Also, would it be worth having a separate cap_ns_capable()?  Wouldn't most
 > calls to cap_capable() only be checking the caps granted in the current user
 > namespace?
 
 Hm.  There is nsown_capable(), which is targeted to current_user_ns(), but
 that still needs to enable the caps for the privileged ancestors as
 described above.
 
 thanks,
 -serge
 |  
	|  |  |  
	| 
		
			| Re: User namespaces and keys [message #41851 is a reply to message #41849] | Wed, 23 February 2011 13:58   |  
			| 
				
				
					|  serge Messages: 72
 Registered: January 2007
 | Member |  |  |  
	| Quoting David Howells (dhowells@redhat.com):
 >
 > I guess we need to look at how to mix keys and namespaces again.
 
 From strictly kernel pov, at the moment, keys are strictly usable only
 by the user in your own user namespace.
 
 We may want to look at this again, but for now I think that would be a
 safe enough default.  Later, we'll probably want the user creating a
 child_user_ns to allow his keys to be inherited by the child user_ns.
 Though, as I type that, it seems to me that that'll just become a
 maintenance pain, and it's just plain safer to have the user re-enter
 his keys, sharing them over a file if needed.
 
 I'm going to not consider the TPM at the moment :)
 
 > Possibly the trickiest problem with keys is how to upcall key construction to
 > /sbin/request-key when the keys may be of a different user namespace.
 
 Hm, jinkeys, yes.
 
 -serge
 |  
	|  |  |  
	| 
		
			| Re: User namespaces and keys [message #41852 is a reply to message #41851] | Wed, 23 February 2011 14:46   |  
			| 
				
				
					|  ebiederm Messages: 1354
 Registered: February 2006
 | Senior Member |  |  |  
	| "Serge E. Hallyn" <serge@hallyn.com> writes: 
 > Quoting David Howells (dhowells@redhat.com):
 >>
 >> I guess we need to look at how to mix keys and namespaces again.
 >
 > From strictly kernel pov, at the moment, keys are strictly usable only
 > by the user in your own user namespace.
 >
 > We may want to look at this again, but for now I think that would be a
 > safe enough default.  Later, we'll probably want the user creating a
 > child_user_ns to allow his keys to be inherited by the child user_ns.
 > Though, as I type that, it seems to me that that'll just become a
 > maintenance pain, and it's just plain safer to have the user re-enter
 > his keys, sharing them over a file if needed.
 >
 > I'm going to not consider the TPM at the moment :)
 >
 >> Possibly the trickiest problem with keys is how to upcall key construction to
 >> /sbin/request-key when the keys may be of a different user namespace.
 >
 > Hm, jinkeys, yes.
 
 Serge, short term, this is where I think we need to add a check or two so
 that keys only work in the init_user_ns.  When the rest of the details
 are sorted out we can open up the use of keys some more.
 
 As the first stage of converting the network stack we added patches
 that turned off everything in non-init namespaces.  Those patches
 were trivial and easy to review, and they made the conversion process
 a lot easier.  I suspect that keys, possibly security modules, and
 anything else related should likewise be turned off by default.
 
 Then once the core of user namespaces is safe to unshare without
 privilege we can come back and get the more difficult bits.
 
 Eric
 |  
	|  |  |  
	| 
		
			| Re: User namespaces and keys [message #41853 is a reply to message #41851] | Wed, 23 February 2011 15:06   |  
			| 
				
				
					|  David Howells Messages: 44
 Registered: October 2006
 | Member |  |  |  
	| Serge E. Hallyn <serge@hallyn.com> wrote: 
 > > I guess we need to look at how to mix keys and namespaces again.
 >
 > From strictly kernel pov, at the moment, keys are strictly usable only
 > by the user in your own user namespace.
 
 I'm not sure that's currently completely true.  Key quota maintenance is
 namespaced, and the key's owner UID/GID belong to that namespace, so that's
 okay, but:
 
  (*) key_task_permission() does not distinguish UIDs and GIDs from different
      namespaces.
 
  (*) A key can be referred to by its serial number, no matter whose namespace
      it is in, and will yield up its given UID/GID, even if these aren't
      actually meaningful in your namespace.
 
      This means request_key() can successfully upcall at the moment.
 
 I wonder if I should make the following changes:
 
  (1) If the key and the accessor are in different user namespaces, then skip
      the UID and GID comparisons in key_task_permission().  That means that to
      be able to access the key you'd have to possess the key and the key would
      have to grant you Possessor access, or the key would have to grant you
      Other access.
 
  (2) If the key and someone viewing the key description are in different
      namespaces, then indicate that the UID and the GID are -1, irrespective of
      the actual values.
 
  (3) When an upcall is attempting to instantiate a key, it is allowed to access
      the keys of requestor using the requestor's credentials (UID, GID, groups,
      security label).  Ensure that this will be done in the requestor's user
      namespace.
 
      Nothing should need to be done here, since search_process_keyrings()
      switches to the requestor's creds.
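 
 Changes (1) and (2) might look roughly like the following user-space
 sketch.  Everything here is an assumption made for illustration: the
 structs, key_permitted() and key_visible_uid() are invented names, not the
 keyrings code, and the permission word is simply modelled as four 8-bit
 fields (possessor, user, group, other).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct user_ns { int id; };	/* opaque stand-in for a user namespace */

struct key {
	const struct user_ns *ns;	/* namespace the owner ids belong to */
	unsigned int uid, gid;
	uint32_t perm;	/* possessor << 24 | user << 16 | group << 8 | other */
};

struct accessor {
	const struct user_ns *ns;
	unsigned int uid, gid;
	bool possesses_key;
};

#define PERM_VIEW 0x01u
#define PERM_READ 0x02u

/*
 * Sketch of change (1): owner and group matches only count when key and
 * accessor share a namespace; otherwise only Other (plus Possessor, if the
 * key is possessed) applies.  Returns true if every bit in want is granted.
 */
bool key_permitted(const struct key *key, const struct accessor *acc,
		   uint32_t want)
{
	uint32_t granted;

	assert(key && acc);

	if (key->ns == acc->ns && key->uid == acc->uid)
		granted = key->perm >> 16;	/* User field */
	else if (key->ns == acc->ns && key->gid == acc->gid)
		granted = key->perm >> 8;	/* Group field */
	else
		granted = key->perm;		/* Other field */

	if (acc->possesses_key)
		granted |= key->perm >> 24;	/* Possessor field */

	return (granted & want & 0xffu) == want;
}

/* Sketch of change (2): hide the owner uid from viewers in other namespaces. */
long key_visible_uid(const struct key *key, const struct user_ns *viewer_ns)
{
	return key->ns == viewer_ns ? (long)key->uid : -1L;
}
```

 With a key granting everything to Possessor and User but nothing to Other,
 an accessor with the same numeric UID in a different namespace is refused
 unless it actually possesses the key.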
 
 Oh, and are security labels user-namespaced?
 
 > We may want to look at this again, but for now I think that would be a
 > safe enough default.  Later, we'll probably want the user creating a
 > child_user_ns to allow his keys to be inherited by the child user_ns.
 
 That depends what you mean by 'allow his keys to be inherited'.  Do you mean
 copying all the creator's keys en masse?  You may find all you need to do is to
 provide the new intended user with a new session keyring with a link back to
 the creator's session keyring.
 
 > Though, as I type that, it seems to me that that'll just become a
 > maintenance pain, and it's just plain safer to have the user re-enter
 > his keys,
 
 That would certainly be simpler.
 
 > sharing them over a file if needed.
 
 I'm not sure what you mean by that.  Do you mean some sort of cred passing
 similar to that that can be done over AF_UNIX sockets with fds?
 
 > I'm going to not consider the TPM at the moment :)
 
 I'm not sure the TPM is that much of a problem, assuming you're just referring
 to its keystore capability...
 
 David
 |  
	|  |  |  
	| 
		
			| Re: User namespaces and keys [message #41855 is a reply to message #41853] | Wed, 23 February 2011 15:45   |  
			| 
				
				
					|  ebiederm Messages: 1354
 Registered: February 2006
 | Senior Member |  |  |  
	| David Howells <dhowells@redhat.com> writes: 
 > Serge E. Hallyn <serge@hallyn.com> wrote:
 >
 >> > I guess we need to look at how to mix keys and namespaces again.
 >>
 >> From strictly kernel pov, at the moment, keys are strictly usable only
 >> by the user in your own user namespace.
 >
 > I'm not sure that's currently completely true.  Key quota maintenance is
 > namespaced, and the key's owner UID/GID belong to that namespace, so that's
 > okay, but:
 >
 >  (*) key_task_permission() does not distinguish UIDs and GIDs from different
 >      namespaces.
 >
 >  (*) A key can be referred to by its serial number, no matter whose namespace
 >      it is in, and will yield up its given UID/GID, even if these aren't
 >      actually meaningful in your namespace.
 >
 >      This means request_key() can successfully upcall at the moment.
 >
 > I wonder if I should make the following changes:
 >
 >  (1) If the key and the accessor are in different user namespaces, then skip
 >      the UID and GID comparisons in key_task_permission().  That means that to
 >      be able to access the key you'd have to possess the key and the key would
 >      have to grant you Possessor access, or the key would have to grant you
 >      Other access.
 >
 >  (2) If the key and someone viewing the key description are in different
 >      namespaces, then indicate that the UID and the GID are -1, irrespective of
 >      the actual values.
 >
 >  (3) When an upcall is attempting to instantiate a key, it is allowed to access
 >      the keys of requestor using the requestor's credentials (UID, GID, groups,
 >      security label).  Ensure that this will be done in the requestor's user
 >      namespace.
 >
 >      Nothing should need to be done here, since search_process_keyrings()
 >      switches to the requestor's creds.
 >
 > Oh, and are security labels user-namespaced?
 
 Not at this time.  The user namespace as currently merged is little more
 than a place holder for a proper implementation.  Serge is busily
 fleshing out that proper implementation.
 
 Until we reach the point where all checks that have historically been
 "if (uid1 == uid2)" become "if ((uidns1 == uidns2) && (uid1 == uid2))"
 there will be problems.
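 
 The change in comparison can be illustrated with a small sketch.  The
 ns_uid pair and both helper names are hypothetical, chosen only to show the
 shape of the check; they are not taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

struct user_ns { int id; };	/* opaque stand-in for a user namespace */

/* An id only means something together with the namespace it belongs to. */
struct ns_uid {
	const struct user_ns *ns;
	unsigned int uid;
};

/* Historical form: "if (uid1 == uid2)". */
bool same_uid_legacy(struct ns_uid a, struct ns_uid b)
{
	return a.uid == b.uid;
}

/* Namespace-aware form: "if ((uidns1 == uidns2) && (uid1 == uid2))". */
bool same_uid(struct ns_uid a, struct ns_uid b)
{
	assert(a.ns && b.ns);
	return a.ns == b.ns && a.uid == b.uid;
}
```

 Two users who both have uid 500 but live in different namespaces compare
 equal under the legacy form and unequal under the namespace-aware one.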
 
 The security labels and probably lsm's in general need to be per user
 namespace but we simply have not gotten that far.  For the short term I
 will be happy when we get a minimally usable user namespace.
 
 Eric
 |  
	|  |  |  
	| 
		
			| Re: User namespaces and keys [message #41856 is a reply to message #41855] | Wed, 23 February 2011 15:53   |  
			| 
				
				
					|  Serge E. Hallyn Messages: 26
 Registered: February 2011
 | Junior Member |  |  |  
	| Quoting Eric W. Biederman (ebiederm@xmission.com):
 > David Howells <dhowells@redhat.com> writes:
 >
 > > Serge E. Hallyn <serge@hallyn.com> wrote:
 > >
 > >> > I guess we need to look at how to mix keys and namespaces again.
 > >>
 > >> From strictly kernel pov, at the moment, keys are strictly usable only
 > >> by the user in your own user namespace.
 > >
 > > I'm not sure that's currently completely true.  Key quota maintenance is
 > > namespaced, and the key's owner UID/GID belong to that namespace, so that's
 > > okay, but:
 > >
 > >  (*) key_task_permission() does not distinguish UIDs and GIDs from different
 > >      namespaces.
 > >
 > >  (*) A key can be referred to by its serial number, no matter whose namespace
 > >      it is in, and will yield up its given UID/GID, even if these aren't
 > >      actually meaningful in your namespace.
 > >
 > >      This means request_key() can successfully upcall at the moment.
 > >
 > > I wonder if I should make the following changes:
 > >
 > >  (1) If the key and the accessor are in different user namespaces, then skip
 > >      the UID and GID comparisons in key_task_permission().  That means that to
 > >      be able to access the key you'd have to possess the key and the key would
 > >      have to grant you Possessor access, or the key would have to grant you
 > >      Other access.
 > >
 > >  (2) If the key and someone viewing the key description are in different
 > >      namespaces, then indicate that the UID and the GID are -1, irrespective of
 > >      the actual values.
 > >
 > >  (3) When an upcall is attempting to instantiate a key, it is allowed to access
 > >      the keys of requestor using the requestor's credentials (UID, GID, groups,
 > >      security label).  Ensure that this will be done in the requestor's user
 > >      namespace.
 > >
 > >      Nothing should need to be done here, since search_process_keyrings()
 > >      switches to the requestor's creds.
 > >
 > > Oh, and are security labels user-namespaced?
 >
 > Not at this time.  The user namespace as currently merged is little more
 > than a place holder for a proper implementation.  Serge is busily
 > fleshing out that proper implementation.
 >
 > Until we reach the point where all checks that have historically been
 > "if (uid1 == uid2)" become "if ((uidns1 == uidns2) && (uid1 == uid2))"
 > there will be problems.
 >
 > The security labels and probably lsm's in general need to be per user
 > namespace but we simply have not gotten that far.  For the short term I
 > will be happy when we get a minimally usable user namespace.
 
 Note also that when Eric brought this up at the LSM mini-conf two or three
 years ago, there was pretty general, strong objection to the idea.
 
 Like Eric says, I think that'll have to wait.  In the meantime, isolating
 user namespace sandboxes (or containers) using simple LSM configurations
 is a very good idea.
 
 -serge
 |  
	|  |  |  
	|  |  
	|  |  
	|  |  
	| 
		
			| Re: [PATCH 1/9] Add a user_namespace as creator/owner of uts_namespace [message #41860 is a reply to message #41747] | Wed, 23 February 2011 17:16   |  
			| 
				
				
					|  David Howells Messages: 44
 Registered: October 2006
 | Member |  |  |  
	| Serge E. Hallyn <serge@hallyn.com> wrote: 
 >  struct uts_namespace {
 >  	struct kref kref;
 >  	struct new_utsname name;
 > +	struct user_namespace *user_ns;
 >  };
 
 If a uts_namespace belongs to a user_namespace, should CLONE_NEWUSER then
 imply CLONE_NEWUTS?
 
 Or is uts_namespace::user_ns more an implication that the set of users in
 user_namespace are the only ones authorised to alter the uts data?
 
 I presume that the uts_namespace of a process must be owned by one of the
 user_namespaces in the alternating inheritance chain of namespaces and their
 creators leading from current_user_ns() to init_user_ns.
 
 With that in mind, looking at patch 3:
 
 -	if (!capable(CAP_SYS_ADMIN))
 +	if (!ns_capable(current->nsproxy->uts_ns->user_ns, CAP_SYS_ADMIN))
 
 what is it you're actually asking?  I presume it's 'does this user have
 CAP_SYS_ADMIN capability over objects belonging to the uts_namespace's
 user_namespace?'
 
 So, to look at the important bit of patch 2:
 
 -int cap_capable(struct task_struct *tsk, const struct cred *cred, int cap,
 -		int audit)
 +int cap_capable(struct task_struct *tsk, const struct cred *cred,
 +		struct user_namespace *targ_ns, int cap, int audit)
 {
 -	return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
 +	for (;;) {
 +		/* The creator of the user namespace has all caps. */
 +		if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
 +			return 0;
 +
 +		/* Do we have the necessary capabilities? */
 +		if (targ_ns == cred->user->user_ns)
 +			return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
 +
 +		/* Have we tried all of the parent namespaces? */
 +		if (targ_ns == &init_user_ns)
 +			return -EPERM;
 +
 +		/* If you have the capability in a parent user ns you have it
 +		 * in the over all children user namespaces as well, so see
 +		 * if this process has the capability in the parent user
 +		 * namespace.
 +		 */
 +		targ_ns = targ_ns->creator->user_ns;
 +	}
 +
 +	/* We never get here */
 +	return -EPERM;
 }
 
 On entry, as we're called from ns_capable(), cred->user is the user that the
 current process is running as, and, as such, may be in a separate namespace
 from uts_namespace - which may itself be in a separate namespace from
 init_user_ns.
 
 So, assume for the sake of argument that there are three user_namespaces along
 the chain from the calling process to the root, and that the uts_namespace
 belongs to the middle one.
 
 	if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
 		return 0;
 
 Can never match because targ_ns->creator cannot be cred->user; even if the
 uts_namespace belongs to our namespace, given that the creator lies outside
 our namespace.
 
 	if (targ_ns == cred->user->user_ns)
 		return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
 
 Can only match if we are in the target user_namespace (ie. the one to which
 uts_namespace belongs), whether or not we have CAP_SYS_ADMIN.
 
 Which means that unless the uts_namespace belongs to our user_namespace, we
 cannot change it.  Is that correct?
 
 So ns_capable() restricts you to only doing interesting things to objects that
 belong to a user_namespace if they are in your own user_namespace.  Is that
 correct?
 
 If that is so, is the loop required for ns_capable()?
 
 
 Looking further at patch 2:
 
 #define nsown_capable(cap) (ns_capable(current_user_ns(), (cap)))
 
 Given what I've said above, I presume the loop isn't necessary here either.
 
 
 I think you're using ns_capable() in two different ways:
 
  (1) You're using it to see if a process has power over its descendants in a
      user_namespace that can be traced back to a clone() that it did with
      CLONE_NEWUSER.
 
      For example, automatically granting a process permission to kill
      descendants in a namespace it created.
 
  (2) You're using it to see if a process can access objects that might be
      outside its own user_namespace.
 
      For example, setting the hostname.
 
 Is it worth giving two different interfaces to make this clearer (even if they
 actually do the same thing)?
 
 
 Sorry if this seems rambly, but I'm trying to get my head round your code.
 
 David
 
 |  
	|  |  |  
	| 
		
			| Re: User namespaces and keys [message #41861 is a reply to message #41856] | Wed, 23 February 2011 19:24   |  
			| 
				
				
					|  Casey Schaufler Messages: 6
 Registered: November 2007
 | Junior Member |  |  |  
	| On 2/23/2011 7:53 AM, Serge E. Hallyn wrote:
 > Quoting Eric W. Biederman (ebiederm@xmission.com):
 >> David Howells <dhowells@redhat.com> writes:
 >>
 >>> Serge E. Hallyn <serge@hallyn.com> wrote:
 >>>
 >>>>> I guess we need to look at how to mix keys and namespaces again.
 >>>> From strictly kernel pov, at the moment, keys are strictly usable only
 >>>> by the user in your own user namespace.
 >>> I'm not sure that's currently completely true.  Key quota maintenance is
 >>> namespaced, and the key's owner UID/GID belong to that namespace, so that's
 >>> okay, but:
 >>>
 >>>  (*) key_task_permission() does not distinguish UIDs and GIDs from different
 >>>      namespaces.
 >>>
 >>>  (*) A key can be referred to by its serial number, no matter whose namespace
 >>>      it is in, and will yield up its given UID/GID, even if these aren't
 >>>      actually meaningful in your namespace.
 >>>
 >>>      This means request_key() can successfully upcall at the moment.
 >>>
 >>> I wonder if I should make the following changes:
 >>>
 >>>  (1) If the key and the accessor are in different user namespaces, then skip
 >>>      the UID and GID comparisons in key_task_permission().  That means that to
 >>>      be able to access the key you'd have to possess the key and the key would
 >>>      have to grant you Possessor access, or the key would have to grant you
 >>>      Other access.
 >>>
 >>>  (2) If the key and someone viewing the key description are in different
 >>>      namespaces, then indicate that the UID and the GID are -1, irrespective of
 >>>      the actual values.
 >>>
 >>>  (3) When an upcall is attempting to instantiate a key, it is allowed to access
 >>>      the keys of requestor using the requestor's credentials (UID, GID, groups,
 >>>      security label).  Ensure that this will be done in the requestor's user
 >>>      namespace.
 >>>
 >>>      Nothing should need to be done here, since search_process_keyrings()
 >>>      switches to the requestor's creds.
 >>>
 >>> Oh, and are security labels user-namespaced?
 >> Not at this time.  The user namespace as currently merged is little more
 >> than a place holder for a proper implementation.  Serge is busily
 >> fleshing out that proper implementation.
 >>
 >> Until we reach the point where all checks that have historically been
 >> "if (uid1 == uid2)" become "if ((uidns1 == uidns2) && (uid1 == uid2))"
 >> there will be problems.
 >>
 >> The security labels and probably lsm's in general need to be per user
 >> namespace but we simply have not gotten that far.  For the short term I
 >> will be happy when we get a minimally usable user namespace.
 > Note also that when Eric brought this up at the LSM mini-conf two or three
 > years ago, there was pretty general, strong objection to the idea.
 >
 > Like Eric says, I think that'll have to wait.  In the meantime, isolating
 > user namespace sandboxes (or containers) using simple LSM configurations
 > is a very good idea.
 
 I confess that I remain less well educated on namespaces than
 I probably should be, but with what I do know it seems that the
 relationships between user namespaces and LSMs are bound to be
 strained from the beginning. Some LSMs (SELinux and Smack) are
 providing similar sandbox capabilities to what you get from user
 namespaces, but from different directions and with different
 use cases.
 
 > -serge
 |  
	|  |  |  
	| 
		
			| Re: User namespaces and keys [message #41863 is a reply to message #41861] | Wed, 23 February 2011 20:55   |  
			| 
				
				
					|  ebiederm Messages: 1354
 Registered: February 2006
 | Senior Member |  |  |  
	| Casey Schaufler <casey@schaufler-ca.com> writes: 
 > I confess that I remain less well educated on namespaces than
 > I probably should be, but with what I do know it seems that the
 > relationships between user namespaces and LSMs are bound to be
 > strained from the beginning. Some LSMs (SELinux and Smack) are
 > providing similar sandbox capabilities to what you get from user
 > namespaces, but from different directions and with different
 > use cases.
 
 Casey, I won't argue about the possibility of things being strained, but
 I think if we focus on the semantics, and not on the end goal of exactly
 how the pieces are to be used, there can be some reasonable dialog.
 
 From 10,000 feet, a user namespace represents one administrative domain.
 uids, gids and other external tokens are all controlled by a single
 group with a single security policy.  In that single administrative
 domain things like nfs are expected to work without translating uids and
 gids.  Before the implementation of user namespaces a single kernel
 could not even express the ability of dealing with multiple user
 namespaces simultaneously (nfs has usually punted and said despite being
 multiple machines you still have to be in the same administrative domain
 so the same user identifiers can be used throughout).
 
 From that principle we have the concept that user assigned/visible
 identifiers (uid, gid, capabilities, security keys?, security labels?)
 logically belong to the user namespace.
 
 Which means there are two pieces of work in implementing the user
 namespace.
 
 - Find all of the interesting comparisons and change them from
   "if (id == id)" to "if ((userns == userns) && (id == id))".
 
 - Potentially define and handle what happens when you mix user
   namespaces.
 
 I think for the first round of this we simply want to define lsms
 and the user namespace as not mixing or the lsm interfaces only being
 visible if you are in the initial user namespace.  Thus reserving the
 definition of what happens when you have lsms and multiple user
 namespaces as something we can define later.  I expect the proper
 definition is something that would allow nested lsm policy.
 
 Regardless.  The namespaces are all about making the identifiers that
 processes deal with local to the namespace, while the underlying object
 manipulations should not care.
 
 The big advantage of the user namespace is that it is the only way I can
 see to get us out of the bind we are in with suid root executables,
 where we cannot enable new features for untrusted applications that can
 change the environment of a suid root executable, because that might
 confuse those suid root executables into misusing their power.  Once
 inside a user namespace nothing has dangerous powers, because even a
 root-owned process with a full set of capabilities is less powerful than
 a guest user outside of the namespace.  With no process having dangerous
 powers, we can enable unsharing of other namespaces, and potentially
 other things that today are restricted with capabilities only because
 they can be used to confuse a privileged executable into doing something
 malicious.
 
 Eric
 
 |  
	|  |  |  
	| 
		
			| Re: [PATCH 1/9] Add a user_namespace as creator/owner of uts_namespace [message #41864 is a reply to message #41860] | Wed, 23 February 2011 21:21   |  
			| 
				
				
					|  ebiederm Messages: 1354
 Registered: February 2006
 | Senior Member |  |  |  
	| David Howells <dhowells@redhat.com> writes: 
 > Serge E. Hallyn <serge@hallyn.com> wrote:
 >
 >>  struct uts_namespace {
 >>  	struct kref kref;
 >>  	struct new_utsname name;
 >> +	struct user_namespace *user_ns;
 >>  };
 >
 > If a uts_namespace belongs to a user_namespace, should CLONE_NEWUSER then
 > imply CLONE_NEWUTS?
 >
 > Or is uts_namespace::user_ns more an implication that the set of users in
 > user_namespace are the only ones authorised to alter the uts data?
 
 The latter.
 
 > I presume that the uts_namespace of a process must be owned by one of the
 > user_namespaces in the alternating inheritance chain of namespaces and their
 > creators leading from current_user_ns() to init_user_ns.
 >
 > With that in mind, looking at patch 3:
 >
 > -	if (!capable(CAP_SYS_ADMIN))
 > +	if (!ns_capable(current->nsproxy->uts_ns->user_ns, CAP_SYS_ADMIN))
 >
 > what is it you're actually asking?  I presume it's 'does this user have
 > CAP_SYS_ADMIN capability over objects belonging to the uts_namespace's
 > user_namespace?'
 
 Yes.
 
 > So, to look at the important bit of patch 2:
 >
 > -int cap_capable(struct task_struct *tsk, const struct cred *cred, int cap,
 > -		int audit)
 > +int cap_capable(struct task_struct *tsk, const struct cred *cred,
 > +		struct user_namespace *targ_ns, int cap, int audit)
 >  {
 > -	return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
 > +	for (;;) {
 > +		/* The creator of the user namespace has all caps. */
 > +		if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
 > +			return 0;
 > +
 > +		/* Do we have the necessary capabilities? */
 > +		if (targ_ns == cred->user->user_ns)
 > +			return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
 > +
 > +		/* Have we tried all of the parent namespaces? */
 > +		if (targ_ns == &init_user_ns)
 > +			return -EPERM;
 > +
 > +		/* If you have the capability in a parent user ns you have it
 > +		 * in the over all children user namespaces as well, so see
 > +		 * if this process has the capability in the parent user
 > +		 * namespace.
 > +		 */
 > +		targ_ns = targ_ns->creator->user_ns;
 > +	}
 > +
 > +	/* We never get here */
 > +	return -EPERM;
 >  }
 >
 > On entry, as we're called from ns_capable(), cred->user is the user that the
 > current process is running as, and, as such, may be in a separate namespace
 > from uts_namespace - which may itself be in a separate namespace from
 > init_user_ns.
 >
 > So, assume for the sake of argument that there are three user_namespaces along
 > the chain from the calling process to the root, and that the uts_namespace
 > belongs to the middle one.
 
 So we have the nested stack of:
 user_ns3->creator->user_ns == user_ns2
 user_ns2->creator->user_ns == &init_user_ns
 uts_ns2->user_ns == user_ns2
 
 >
 > 		if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
 > 			return 0;
 >
 > Can never match because targ_ns->creator cannot be cred->user; even if the
 > uts_namespace belongs to our namespace, given that the creator lies outside
 > our namespace.
 
 Initially we come in with targ_ns == user_ns2 and cred->user->user_ns in
 one of (user_ns3, user_ns2, or &init_user_ns).
 
 targ_ns takes on values user_ns2 and &init_user_ns.
 
 So when targ_ns becomes &init_user_ns, if the user in question is
 uts_ns2->user_ns->creator, this check will indeed match.
 
 > 		if (targ_ns == cred->user->user_ns)
 > 			return cap_raised(cred->cap_effective, cap) ? 0 : -EPERM;
 >
 > Can only match if we are in the target user_namespace (ie. the one to which
 > uts_namespace belongs), whether or not we have CAP_SYS_ADMIN.
 
 As before targ_ns takes on values of user_ns2 and &init_user_ns.
 
 Which means this check will match if we have CAP_SYS_ADMIN in
 &init_user_ns or in user_ns2.
 
 > Which means that unless the uts_namespace belongs to our user_namespace, we
 > cannot change it.  Is that correct?
 
 No.  If you are root in a parent namespace you can also change it.
 
 > So ns_capable() restricts you to only doing interesting things to objects that
 > belong to a user_namespace if they are in your own user_namespace.  Is that
 > correct?
 
 No.  Root outside your user namespace is also allowed to do interesting
 things.
 
 > If that is so, is the loop required for ns_capable()?
 
 Yes.
 
 
 > Looking further at patch 2:
 >
 > 	#define nsown_capable(cap) (ns_capable(current_user_ns(), (cap)))
 >
 > Given what I've said above, I presume the loop isn't necessary here either.
 >
 >
 > I think you're using ns_capable() in two different ways:
 >
 >  (1) You're using it to see if a process has power over its descendents in a
 >      user_namespace that can be traced back to a clone() that it did with
 >      CLONE_NEWUSER.
 >
 >      For example, automatically granting a process permission to kill
 >      descendents in a namespace it created.
 >
 >  (2) You're using it to see if a process can access objects that might be
 >      outside its own user_namespace.
 >
 >      For example, setting the hostname.
 >
 > Is it worth giving two different interfaces to make this clearer (even if they
 > actually do the same thing)?
 >
 >
 > Sorry if this seems rambly, but I'm trying to get my head round your
 > code.
 
 I am all for making that loop a little clearer, because it is something
 you have to pause and think about to understand, but so far I don't think
 the loop is wrong, and it is simple.
 
 Eric
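 
 For readers following the walk-through above, here is a minimal userspace
 sketch of the same loop.  It is a model, not kernel code: the structs are
 cut down to the fields the loop reads, the task and audit arguments are
 dropped, cap_effective is a plain bitmask, and the users alice, bob and
 carol as well as check_uts_ns2_access() are made up for illustration.  The
 chain is init_user_ns <- user_ns2 <- user_ns3, with the uts namespace owned
 by user_ns2 as in the example.

```c
#include <assert.h>
#include <errno.h>

#define CAP_SYS_ADMIN 21
#define CAP_BIT(c)    (1UL << (c))

struct user_namespace;

/* Simplified stand-ins for the kernel structures (assumption: only these fields matter). */
struct user {
	struct user_namespace *user_ns;	/* namespace the user lives in */
};
struct user_namespace {
	struct user *creator;		/* user who created this namespace */
};
struct cred {
	struct user *user;
	unsigned long cap_effective;	/* bitmask model of kernel_cap_t */
};

static struct user_namespace init_user_ns;	/* creator stays NULL; the loop never reads it */

/* Same control flow as the cap_capable() loop quoted from patch 2. */
static int cap_capable(const struct cred *cred, struct user_namespace *targ_ns, int cap)
{
	for (;;) {
		/* The creator of the user namespace has all caps. */
		if (targ_ns != &init_user_ns && targ_ns->creator == cred->user)
			return 0;
		/* Do we have the necessary capabilities? */
		if (targ_ns == cred->user->user_ns)
			return (cred->cap_effective & CAP_BIT(cap)) ? 0 : -EPERM;
		/* Have we tried all of the parent namespaces? */
		if (targ_ns == &init_user_ns)
			return -EPERM;
		targ_ns = targ_ns->creator->user_ns;
	}
}

/* Hypothetical users and namespaces forming the three-level chain. */
static struct user root_user = { &init_user_ns };
static struct user alice     = { &init_user_ns };	/* creates user_ns2 */
static struct user_namespace user_ns2 = { &alice };
static struct user bob       = { &user_ns2 };		/* creates user_ns3 */
static struct user_namespace user_ns3 = { &bob };
static struct user carol     = { &user_ns3 };

static const struct cred host_root   = { &root_user, CAP_BIT(CAP_SYS_ADMIN) };
static const struct cred alice_cred  = { &alice, 0 };
static const struct cred bob_admin   = { &bob, CAP_BIT(CAP_SYS_ADMIN) };
static const struct cred bob_plain   = { &bob, 0 };
static const struct cred carol_admin = { &carol, CAP_BIT(CAP_SYS_ADMIN) };

/* Who may change a uts namespace owned by user_ns2? */
static void check_uts_ns2_access(void)
{
	struct user_namespace *owner = &user_ns2;

	assert(cap_capable(&host_root, owner, CAP_SYS_ADMIN) == 0);	/* root in a parent ns */
	assert(cap_capable(&alice_cred, owner, CAP_SYS_ADMIN) == 0);	/* creator of user_ns2 */
	assert(cap_capable(&bob_admin, owner, CAP_SYS_ADMIN) == 0);	/* capable inside user_ns2 */
	assert(cap_capable(&bob_plain, owner, CAP_SYS_ADMIN) == -EPERM);
	assert(cap_capable(&carol_admin, owner, CAP_SYS_ADMIN) == -EPERM); /* child ns only */
}
```

 Calling check_uts_ns2_access() from main() exercises all five cases.  In
 this model the creator match fires while targ_ns is still user_ns2, and the
 parent-namespace root is accepted one iteration later, once targ_ns has
 walked up to &init_user_ns.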
 |  
	|  |  |  
	|  |  
	|  |  
	|  |  
	| 
		
			| Re: [PATCH 5/9] Allow ptrace from non-init user namespaces [message #41874 is a reply to message #41781] | Thu, 24 February 2011 00:43   |  
			| 
				
				
					|  serge Messages: 72
 Registered: January 2007
 | Member |  |  |  
	| Quoting Andrew Morton (akpm@linux-foundation.org):
 > On Thu, 17 Feb 2011 15:03:33 +0000
 > "Serge E. Hallyn" <serge@hallyn.com> wrote:
 >
 > > ptrace is allowed to tasks in the same user namespace according to
 > > the usual rules (i.e. the same rules as for two tasks in the init
 > > user namespace).  ptrace is also allowed to a user namespace to
 > > which the current task has the CAP_SYS_PTRACE capability.
 > >
 > >
 > > ...
 > >
 > > --- a/include/linux/capability.h
 > > +++ b/include/linux/capability.h
 > > @@ -546,6 +546,8 @@ extern const kernel_cap_t __cap_init_eff_set;
 > >   */
 > >  #define has_capability(t, cap) (security_real_capable((t), &init_user_ns, (cap)) == 0)
 > >
 > > +#define has_ns_capability(t, ns, cap) (security_real_capable((t), (ns), (cap)) == 0)
 >
 > macroitis.
 
 Thanks for the review, Andrew.  Unfortunately this one is hard to turn
 into a function because it uses security_real_capable(), which is
 sometimes defined in security/security.c as a real function, and
 other times as a static inline in include/linux/security.h.  So
 I'd have to #include security.h in capability.h, but security.h
 already #includes capability.h.
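 
 To illustrate the ordering problem, here is a small single-file sketch in
 which comments mark where each header would end.  The types are toy
 stand-ins, security_real_capable() is reduced to a bitmask test, and
 may_ptrace() and check_macro() are hypothetical helpers; none of this is the
 real kernel code.  The macro body is only compiled where it is expanded, so
 it can name a function that is declared later, which an inline function
 defined at the same point could not do.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* ---- plays the role of capability.h: security_real_capable() is not visible yet ---- */
struct task_struct;
struct user_namespace;

/* Only expanded at the point of use, so nothing needs to be declared here. */
#define has_ns_capability(t, ns, cap) \
	(security_real_capable((t), (ns), (cap)) == 0)

/* ---- plays the role of security.h, which itself includes capability.h ---- */
struct task_struct {
	unsigned long cap_effective;	/* toy bitmask, not kernel_cap_t */
};
struct user_namespace {
	int level;			/* unused placeholder field */
};

/* Toy stand-in: ignores the namespace and just tests the bit. */
static inline int security_real_capable(struct task_struct *t,
					struct user_namespace *ns, int cap)
{
	(void)ns;
	return ((t->cap_effective >> cap) & 1UL) ? 0 : -EPERM;
}

/* ---- a caller that has seen both "headers" ---- */
#define CAP_SYS_PTRACE 19

static bool may_ptrace(struct task_struct *t, struct user_namespace *ns)
{
	return has_ns_capability(t, ns, CAP_SYS_PTRACE);
}

static void check_macro(void)
{
	struct task_struct tracer = { 1UL << CAP_SYS_PTRACE };
	struct task_struct plain = { 0 };
	struct user_namespace ns = { 0 };

	assert(may_ptrace(&tracer, &ns));
	assert(!may_ptrace(&plain, &ns));
}
```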
 
 All the other comments affect same_or_ancestor_user_ns(), which
 following Eric's feedback is going away.
 
 > >  /**
 > >   * has_capability_noaudit - Determine if a task has a superior capability available (unaudited)
 > >   * @t: The task in question
 > > diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
 > > index faf4679..862fc59 100644
 > > --- a/include/linux/user_namespace.h
 > > +++ b/include/linux/user_namespace.h
 > > @@ -39,6 +39,9 @@ static inline void put_user_ns(struct user_namespace *ns)
 > >  uid_t user_ns_map_uid(struct user_namespace *to, const struct cred *cred, uid_t uid);
 > >  gid_t user_ns_map_gid(struct user_namespace *to, const struct cred *cred, gid_t gid);
 > >
 > > +int same_or_ancestor_user_ns(struct task_struct *task,
 > > +	struct task_struct *victim);
 >
 > bool.
 >
 > >  #else
 > >
 > >  static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
 > >
 > > ...
 > >
 > > --- a/kernel/user_namespace.c
 > > +++ b/kernel/user_namespace.c
 > > @@ -129,6 +129,22 @@ gid_t user_ns_map_gid(struct user_namespace *to, const struct cred *cred, gid_t
 > >  	return overflowgid;
 > >  }
 > >
 > > +int same_or_ancestor_user_ns(struct task_struct *task,
 > > +	struct task_struct *victim)
 > > +{
 > > +	struct user_namespace *u1 = task_cred_xxx(task, user)->user_ns;
 > > +	struct user_namespace *u2 = task_cred_xxx(victim, user)->user_ns;
 > > +	for (;;) {
 > > +		if (u1 == u2)
 > > +			return 1;
 > > +		if (u1 == &init_user_ns)
 > > +			return 0;
 > > +		u1 = u1->creator->user_ns;
 > > +	}
 > > +	/* We never get here */
 > > +	return 0;
 >
 > Remove?
 >
 > > +}
 > > +
 > >  static __init int user_namespaces_init(void)
 > >  {
 > >  	user_ns_cachep = KMEM_CACHE(user_namespace, SLAB_PANIC);
 > >
 > > ...
 > >
 > >  int cap_ptrace_access_check(struct task_struct *child, unsigned int mode)
 > >  {
 > >  	int ret = 0;
 > > +	const struct cred *cred, *tcred;
 > >
 > >  	rcu_read_lock();
 > > -	if (!cap_issubset(__task_cred(child)->cap_permitted,
 > > -			  current_cred()->cap_permitted) &&
 > > -	    !capable(CAP_SYS_PTRACE))
 > > -		ret = -EPERM;
 > > +	cred = current_cred();
 > > +	tcred = __task_cred(child);
 > > +	/*
 > > +	 * The ancestor user_ns check may be gratuitous, as I think
 > > +	 * we've already guaranteed that in kernel/ptrace.c.
 > > +	 */
 >
 > ?
 >
 > > +	if (same_or_ancestor_user_ns(current, child) &&
 > > +	    cap_issubset(tcred->cap_permitted, cred->cap_permitted))
 > > +		goto out;
 > > +	if (ns_capable(tcred->user->user_ns, CAP_SYS_PTRACE))
 > > +		goto out;
 > > +	ret = -EPERM;
 > > +out:
 > >  	rcu_read_unlock();
 > >  	return ret;
 > >  }
 > >
 > > ...
 > >
 >
 |  
	|  |  |  
	|  |  
	| 
		
			| [PATCH] userns: ptrace: incorporate feedback from Eric [message #41877 is a reply to message #41772] | Thu, 24 February 2011 00:49   |  
			| 
				
				
					|  serge Messages: 72
 Registered: January 2007
 | Member |  |  |  
	| same_or_ancestor_user_ns() was not an appropriate check to constrain
 cap_issubset().  Rather, cap_issubset() is only meaningful when both
 capsets are in the same user_ns.
 
 Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
 ---
 include/linux/user_namespace.h |    9 ---------
 kernel/user_namespace.c        |   16 ----------------
 security/commoncap.c           |   28 ++++++++++------------------
 3 files changed, 10 insertions(+), 43 deletions(-)
 
 diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
 index 862fc59..faf4679 100644
 --- a/include/linux/user_namespace.h
 +++ b/include/linux/user_namespace.h
 @@ -39,9 +39,6 @@ static inline void put_user_ns(struct user_namespace *ns)
 uid_t user_ns_map_uid(struct user_namespace *to, const struct cred *cred, uid_t uid);
 gid_t user_ns_map_gid(struct user_namespace *to, const struct cred *cred, gid_t gid);
 
 -int same_or_ancestor_user_ns(struct task_struct *task,
 -	struct task_struct *victim);
 -
 #else
 
 static inline struct user_namespace *get_user_ns(struct user_namespace *ns)
 @@ -69,12 +66,6 @@ static inline gid_t user_ns_map_gid(struct user_namespace *to,
 	return gid;
 }
 
 -static inline int same_or_ancestor_user_ns(struct task_struct *task,
 -	struct task_struct *victim)
 -{
 -	return 1;
 -}
 -
 #endif
 
 #endif /* _LINUX_USER_H */
 diff --git a/kernel/user_namespace.c b/kernel/user_namespace.c
 index 0ef2258..9da289c 100644
 --- a/kernel/user_namespace.c
 +++ b/kernel/user_namespace.c
 @@ -129,22 +129,6 @@ gid_t user_ns_map_gid(struct user_namespace *to, const struct cred *cred, gid_t
 	return overflowgid;
 }
 
 -int same_or_ancestor_user_ns(struct task_struct *task,
 -	struct task_struct *victim)
 -{
 -	struct user_namespace *u1 = task_cred_xxx(task, user)->user_ns;
 -	struct user_namespace *u2 = task_cred_xxx(victim, user)->user_ns;
 -	for (;;) {
 -		if (u1 == u2)
 -			return 1;
 -		if (u1 == &init_user_ns)
 -			return 0;
 -		u1 = u1->creator->user_ns;
 -	}
 -	/* We never get here */
 -	return 0;
 -}
 -
 static __init int user_namespaces_init(void)
 {
 	user_ns_cachep = KMEM_CACHE(user_namespace, SLAB_PANIC);
 diff --git a/security/commoncap.c b/security/commoncap.c
 index 12ff65c..526106f 100644
 --- a/security/commoncap.c
 +++ b/security/commoncap.c
 @@ -142,19 +142,15 @@ int cap_settime(struct timespec *ts, struct timezone *tz)
 int cap_ptrace_access_check(struct task_struct *child, unsigned int mode)
 {
 	int ret = 0;
 -	const struct cred *cred, *tcred;
 +	const struct cred *cred, *child_cred;
 
 	rcu_read_lock();
 	cred = current_cred();
 -	tcred = __task_cred(child);
 -	/*
 -	 * The ancestor user_ns check may be gratuitous, as I think
 -	 * we've already guaranteed that in kernel/ptrace.c.
 -	 */
 -	if (same_or_ancestor_user_ns(current, child) &&
 -	    cap_issubset(tcred->cap_permitted, cred->cap_permitted))
 +	child_cred = __task_cred(child);
 +	if (cred->user->user_ns == child_cred->user->user_ns &&
 +	    cap_issubset(child_cred->cap_permitted, cred->cap_permitted))
 		goto out;
 -	if (ns_capable(tcred->user->user_ns, CAP_SYS_PTRACE))
 +	if (ns_capable(child_cred->user->user_ns, CAP_SYS_PTRACE))
 		goto out;
 	ret = -EPERM;
 out:
 @@ -178,19 +174,15 @@ out:
 int cap_ptrace_traceme(struct task_struct *parent)
 {
 	int ret = 0;
 -	const struct cred *cred, *tcred;
 +	const struct cred *cred, *child_cred;
 
 	rcu_read_lock();
 	cred = __task_cred(parent);
 -	tcred = current_cred();
 -	/*
 -	 * The ancestor user_ns check may be gratuitous, as I think
 -	 * we've already guaranteed that in kernel/ptrace.c.
 -	 */
 -	if (same_or_ancestor_user_ns(parent, current) &&
 -	    cap_issubset(tcred->cap_permitted, cred->cap_permitted))
 +	child_cred = current_cred();
 +	if (cred->user->user_ns == child_cred->user->user_ns &&
 +	    cap_issubset(child_cred->cap_permitted, cred->cap_permitted))
 		goto out;
 -	if (has_ns_capability(parent, tcred->user->user_ns, CAP_SYS_PTRACE))
 +	if (has_ns_capability(parent, child_cred->user->user_ns, CAP_SYS_PTRACE))
 		goto out;
 	ret = -EPERM;
 out:
 --
 1.7.0.4
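 
 As a rough illustration of why the subset test is limited to tasks in the
 same user_ns, the following userspace sketch models the revised
 cap_ptrace_access_check().  It is not the kernel implementation: the structs
 are reduced stand-ins, capability sets are plain bitmasks, has_ns_cap() is a
 hypothetical helper that reuses the cap_capable() loop from patch 2, and the
 users and creds are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define CAP_SYS_PTRACE 19
#define CAP_BIT(c)     (1UL << (c))
#define CAP_ALL        (~0UL)

struct user_namespace;
struct user { struct user_namespace *user_ns; };
struct user_namespace { struct user *creator; };
struct cred {
	struct user *user;
	unsigned long cap_permitted;	/* bitmask models of kernel_cap_t */
	unsigned long cap_effective;
};

static struct user_namespace init_user_ns;	/* creator unused by the loop */

/* True when every bit of a is also set in set. */
static bool cap_issubset(unsigned long a, unsigned long set)
{
	return (a & ~set) == 0;
}

/* Hypothetical helper: the cap_capable() loop, returning a boolean. */
static bool has_ns_cap(const struct cred *cred, struct user_namespace *ns, int cap)
{
	for (;;) {
		if (ns != &init_user_ns && ns->creator == cred->user)
			return true;
		if (ns == cred->user->user_ns)
			return (cred->cap_effective & CAP_BIT(cap)) != 0;
		if (ns == &init_user_ns)
			return false;
		ns = ns->creator->user_ns;
	}
}

/* Model of the check after this patch: subset test only within one user_ns. */
static int ptrace_access_check(const struct cred *cred, const struct cred *child_cred)
{
	if (cred->user->user_ns == child_cred->user->user_ns &&
	    cap_issubset(child_cred->cap_permitted, cred->cap_permitted))
		return 0;
	if (has_ns_cap(cred, child_cred->user->user_ns, CAP_SYS_PTRACE))
		return 0;
	return -EPERM;
}

/* Invented actors: two host users, one container created by host_creator. */
static struct user host_root_user = { &init_user_ns };
static struct user host_creator   = { &init_user_ns };
static struct user_namespace container_ns = { &host_creator };
static struct user container_root_user = { &container_ns };

static const struct cred host_root      = { &host_root_user, CAP_ALL, CAP_ALL };
static const struct cred host_plain     = { &host_creator, 0, 0 };
static const struct cred container_root = { &container_root_user, CAP_ALL, CAP_ALL };

static void check_ptrace_model(void)
{
	/* Same user_ns, child holds no more than the tracer: allowed. */
	assert(ptrace_access_check(&host_root, &host_plain) == 0);
	/* Same user_ns, child holds more and tracer lacks CAP_SYS_PTRACE: denied. */
	assert(ptrace_access_check(&host_plain, &host_root) == -EPERM);
	/* Host root has CAP_SYS_PTRACE in a parent of the container: allowed. */
	assert(ptrace_access_check(&host_root, &container_root) == 0);
	/* Container root tracing a host task: denied, although the host task's
	 * empty permitted set is trivially a subset of the tracer's. */
	assert(ptrace_access_check(&container_root, &host_plain) == -EPERM);
}
```

 The last case is the one the commit message is about: a bitwise comparison
 of capability sets held in different namespaces says nothing about relative
 privilege, so in this model the cross-namespace decision is left entirely to
 the namespace-aware CAP_SYS_PTRACE check.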
 
 |  
	|  |  |  
	|  |  
	|  |  
	|  |  
	|  |  
	|  |  
	| 
		
			| Re: User namespaces and keys [message #41886 is a reply to message #41865] | Thu, 24 February 2011 06:56  |  
			| 
				
				
					|  ebiederm Messages: 1354
 Registered: February 2006
 | Senior Member |  |  |  
	| Casey Schaufler <casey@schaufler-ca.com> writes: 
 > On 2/23/2011 12:55 PM, Eric W. Biederman wrote:
 >> Casey Schaufler <casey@schaufler-ca.com> writes:
 >>
 >>> I confess that I remain less well educated on namespaces than
 >>> I probably should be, but with what I do know it seems that the
 >>> relationships between user namespaces and LSMs are bound to be
 >>> strained from the beginning. Some LSMs (SELinux and Smack) are
 >>> providing similar sandbox capabilities to what you get from user
 >>> namespaces, but from different directions and with different
 >>> use cases.
 >> Casey I won't argue about the possibility of things being strained, but
 >> I think if we focus on the semantics and not on the end goal of exactly
 >> how the pieces are to be used there can be some reasonable dialog.
 >
 > I'm sure that there will be cases where they will work together
 > like horses in a troika. Making sensible semantics for the interactions
 > is key, and it is entirely possible that in some cases a comparison
 > of semantics and behaviors will lead an end user to chose either an
 > LSM or namespaces over the combination. Just like I expect that even
 > when we allow multiple LSMs the SELinux and Smack combination will be
 > rare among the sane.
 
 That sounds about right.
 
 Eric
 
 |  
	|  |  | 
 
 
 |