OpenVZ Forum


Home » Mailing lists » Devel » [PATCH 0/4] Devices accessibility control group (v3, release candidate)
Re: [PATCH 4/4] The control group itself [message #27239 is a reply to message #27220] Tue, 12 February 2008 17:21 Go to previous messageGo to previous message
serue is currently offline  serue
Messages: 750
Registered: February 2006
Senior Member
Quoting Pavel Emelyanov (xemul@openvz.org):
> Serge E. Hallyn wrote:
> > Quoting Pavel Emelyanov (xemul@openvz.org):
> >> Each new group will have its own maps for char and block
> >> layers. The devices access list is tuned via the 
> >> devices.permissions file. One may read from the file to get 
> >> the configured state.
> >>
> >> The top container isn't initialized, so that the char 
> >> and block layers will use the global maps to lookup 
> >> their devices. I did that not to export the static maps
> >> to the outer world.
> >>
> >> Good news is that this patch now contains more comments 
> >> and Documentation file :)
> > 
> > Seems to work as advertised  :)  I can't reproduce Suka's null/zero
> > bug.
> > 
> > You're relying fully on uid-0 to stop writes into the
> > devices.permissions files.  Would you mind adding a check for
> > CAP_SYS_ADMIN (or CAP_NS_OVERRIDE+CAP_MKNOD)?  Or were you really
> > counting on using the filesystem visibility cgroup to stop a container
> > from making changes to its device access whitelist?
> 
> Yup. I strongly believe that a controller itself should not bring 
> any security policy of its own, but the infrastructure should 
> take care of this.

That would be ok if the controller gave the infrastructure some way of
knowing what sort of thing the controller does.  I.e. I wouldn't mind
having the cgroup infrastructe check for capabilities, but I suspect
some cgroups will really be best represented by different capabilities.

Paul (actually both Menage and Jackson :) do you have an opinion on
this?  Are there sites which eg do 'chown -R some_user_id /cgroup/cpusets/'
to have some non-root user be able to dole out cpusets?  Is there any
way it would be ok to have cgroup_file_write() check for CAP_SYS_ADMIN?

> However, I'm open to change my mind if I see good explanation of
> why it is wrong.

Well the thing is that it currently leaves no way to keep root in
another namespace from manipulating it.  I don't think we can even use
SELinux unless we're willing to prevent containers from having write
access to everything under the cgroup - which is only ok depending on
what is composed with the devices cgroup.

We have the same kind of problem with the
/proc/sys/filesystems/<fs>/fs_safe flag.  There are a few possibilities,
and your fs visibility cgroup (plus splitting /proc/sys/filesystems into
its own fs) is one.  More complicated things pop into my head, but I
think until we get more comfortable with all this the simplest way is
the best.

But really imo the simplest way is to have a capable(CAP_SYS_ADMIN)
check inside your _write function :)

Ideally if you didn't mind I would float a patch (cousin to the userns
4-patch set) defining CAP_NS_OVERRIDE and requiring both CAP_NS_OVERRIDE
and CAP_SYS_ADMIN to access _write.

-serge

> > thanks,
> > -serge
> > 
> >> Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
> >>
> >> ---
> >>
> >> diff --git a/Documentation/controllers/devices.txt b/Documentation/controllers/devices.txt
> >> new file mode 100644
> >> index 0000000..dbd0c7a
> >> --- /dev/null
> >> +++ b/Documentation/controllers/devices.txt
> >> @@ -0,0 +1,61 @@
> >> +
> >> +	Devices visibility controller
> >> +
> >> +This controller allows to tune the devices accessibility by tasks,
> >> +i.e. grant full access for /dev/null, /dev/zero etc, grant read-only
> >> +access to IDE devices and completely hide SCSI disks.
> >> +
> >> +Tasks still can call mknod to create device files, regardless of
> >> +whether the particular device is visible or accessible, but they
> >> +may not be able to open it later.
> >> +
> >> +This one hides under CONFIG_CGROUP_DEVS option.
> >> +
> >> +
> >> +Configuring
> >> +
> >> +The controller provides a single file to configure itself -- the
> >> +devices.permissions one. To change the accessibility level for some
> >> +device write the following string into it:
> >> +
> >> +[cb] <major>:(<minor>|*) [r-][w-]
> >> + ^          ^               ^
> >> + |          |               |
> >> + |          |               +--- access rights (1)
> >> + |          |
> >> + |          +-- device major and minor numbers (2)
> >> + |
> >> + +-- device type (character / block)
> >> +
> >> +1) The access rights set to '--' remove the device from the group's
> >> +access list, so that it will not even be shown in this file later.
> >> +
> >> +2) Setting the minor to '*' grants access to all the minors for
> >> +particular major.
> >> +
> >> +When reading from it, one may see something like
> >> +
> >> +	c 1:5 rw
> >> +	b 8:* r-
> >> +
> >> +Security issues, concerning who may grant access to what are governed
> >> +at the cgroup infrastructure level.
> >> +
> >> +
> >> +Examples:
> >> +
> >> +1. Grand full access to /dev/null
> >> +	# echo c 1:3 rw > /cgroups/<id>/devices.permissions
> >> +
> >> +2. Grant the read-only access to /dev/sda and partitions
> >> +	# echo b 8:* r- > ...
> >> +
> >> +3. Change the /dev/null access to write-only
> >> +	# echo c 1:3 -w > ...
> >> +
> >> +4. Revoke access to /dev/sda
> >> +	# echo b 8:* -- > ...
> >> +
> >> +
> >> +		Written by Pavel Emelyanov <xemul@openvz.org>
> >> +
> >> diff --git a/fs/Makefile b/fs/Makefile
> >> index 7996220..5ad03be 100644
> >> --- a/fs/Makefile
> >> +++ b/fs/Makefile
> >> @@ -64,6 +64,8 @@ obj-y				+= devpts/
> >>
> >>  obj-$(CONFIG_PROFILING)		+= dcookies.o
> >>  obj-$(CONFIG_DLM)		+= dlm/
> >> +
> >> +obj-$(CONFIG_CGROUP_DEVS)	+= devscontrol.o
> >>   
> >>  # Do not add any filesystems before this line
> >>  obj-$(CONFIG_REISERFS_FS)	+= reiserfs/
> >> diff --git a/fs/devscontrol.c b/fs/devscontrol.c
> >> new file mode 100644
> >> index 0000000..48c5f69
> >> --- /dev/null
> >> +++ b/fs/devscontrol.c
> >> @@ -0,0 +1,314 @@
> >> +/*
> >> + * devscontrol.c - Device Controller
> >> + *
> >> + * Copyright 2007 OpenVZ SWsoft Inc
> >> + * Author: Pavel Emelyanov <xemul at openvz dot org>
> >> + *
> >> + * This program is free software; you can redistribute it and/or modify
> >> + * it under the terms of the GNU General Public License as published by
> >> + * the Free Software Foundation; either version 2 of the License, or
> >> + * (at your option) any later version.
> >> + *
> >> + * This program is distributed in the hope that it will be useful,
> >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >> + * GNU General Public License for more details.
> >> + */
> >> +
> >> +#include <linux/cgroup.h>
> >> +#include <linux/cdev.h>
> >> +#include <linux/err.h>
> >> +#include <linux/devscontrol.h>
> >> +#include <linux/uaccess.h>
> >> +#include <linux/fs.h>
> >> +#include <linux/genhd.h>
> >> +
> >> +struct devs_cgroup {
> >> +	/*
> >> +	 * The subsys state to build into cgrous infrastructure
> >> +	 */
> >> +	struct cgroup_subsys_state css;
> >> +
> >> +	/*
> >> +	 * The maps of character and block devices. They provide a
> >> +	 * map from dev_t-s to struct cdev/gendisk. See fs/char_dev.c
> >> +	 * and block/genhd.c to find out how the ->open() callbacks
> >> +	 * work when opening a device.
> >> +	 *
> >> +	 * Each group will have its onw maps, and at the open()
> >> +	 * time code will lookup in this map to get the device
> >> +	 * and permissions by its dev_t.
> >> +	 */
> >> +	struct kobj_map *cdev_map;
> >> +	struct kobj_map *bdev_map;
> >> +};
> >> +
> >> +static inline
> >> +struct devs_cgroup *css_to_devs(struct cgroup_subsys_state *css)
> >> +{
> >> +	return container_of(css, struct devs_cgroup, css);
> >> +}
> >> +
> >> +static inline
> >> +struct devs_cgroup *cgroup_to_devs(struct cgroup *cont)
> >> +{
> >> +	return css_to_devs(cgroup_subsys_state(cont, devs_subsys_id));
> >> +}
> >> +
> >> +struct kobj_map *task_cdev_map(struct task_struct *tsk)
> >> +{
> >> +	struct cgroup_subsys_state *css;
> >> +
> >> +	css = task_subsys_state(tsk, devs_subsys_id);
> >> +	if (css->cgroup->parent == NULL)
> >> +		return NULL;
> >> +	else
> >> +		return css_to_devs(css)->cdev_map;
> >> +}
> >> +
> >> +struct kobj_map *task_bdev_map(struct task_struct *tsk)
> >> +{
> >> +	struct cgroup_subsys_state *css;
> >> +
> >> +	css = task_subsys_state(tsk, devs_subsys_id);
> &g
...

 
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Read Message
Previous Topic: Re: [PATCH] cgroup: fix default notify_on_release setting
Next Topic: [PATCH] pidns: make pid->level and pid_ns->level unsigned
Goto Forum:
  


Current Time: Wed Sep 18 06:44:37 GMT 2024

Total time taken to generate the page: 0.04719 seconds