 [PATCH 01/10] Containers(V10): Basic container framework [message #13542]
 Tue, 29 May 2007 13:01
 Paul Menage

 This patch adds the main containers framework - the container filesystem,
 and the basic structures for tracking membership and associating subsystem
 state objects to tasks.
 
 Signed-off-by: Paul Menage <menage@google.com>
 
 ---
 Documentation/containers.txt     |  524 +++++++++++++++++
 include/linux/container.h        |  198 ++++++
 include/linux/container_subsys.h |   10
 include/linux/sched.h            |   34 +
 init/Kconfig                     |    3
 init/main.c                      |    3
 kernel/Makefile                  |    1
 kernel/container.c               | 1155 +++++++++++++++++++++++++++++++++++++++
 8 files changed, 1927 insertions(+), 1 deletion(-)
 
 Index: container-2.6.22-rc2-mm1/Documentation/containers.txt
 ===================================================================
 --- /dev/null
 +++ container-2.6.22-rc2-mm1/Documentation/containers.txt
 @@ -0,0 +1,524 @@
 +				CONTAINERS
 +				-------
 +
 +Written by Paul Menage <menage@google.com> based on Documentation/cpusets.txt
 +
 +Original copyright statements from cpusets.txt:
 +Portions Copyright (C) 2004 BULL SA.
 +Portions Copyright (c) 2004-2006 Silicon Graphics, Inc.
 +Modified by Paul Jackson <pj@sgi.com>
 +Modified by Christoph Lameter <clameter@sgi.com>
 +
 +CONTENTS:
 +=========
 +
 +1. Containers
 +  1.1 What are containers ?
 +  1.2 Why are containers needed ?
 +  1.3 How are containers implemented ?
 +  1.4 What does notify_on_release do ?
 +  1.5 How do I use containers ?
 +2. Usage Examples and Syntax
 +  2.1 Basic Usage
 +  2.2 Attaching processes
 +3. Kernel API
 +  3.1 Overview
 +  3.2 Synchronization
 +  3.3 Subsystem API
 +4. Questions
 +
 +1. Containers
 +==========
 +
 +1.1 What are containers ?
 +----------------------
 +
 +Containers provide a mechanism for aggregating/partitioning sets of
 +tasks, and all their future children, into hierarchical groups with
 +specialized behaviour.
 +
 +Definitions:
 +
 +A *container* associates a set of tasks with a set of parameters for one
 +or more subsystems.
 +
 +A *subsystem* is a module that makes use of the task grouping
 +facilities provided by containers to treat groups of tasks in
 +particular ways. A subsystem is typically a "resource controller" that
 +schedules a resource or applies per-container limits, but it may be
 +anything that wants to act on a group of processes, e.g. a
 +virtualization subsystem.
 +
 +A *hierarchy* is a set of containers arranged in a tree, such that
 +every task in the system is in exactly one of the containers in the
 +hierarchy, and a set of subsystems; each subsystem has system-specific
 +state attached to each container in the hierarchy.  Each hierarchy has
 +an instance of the container virtual filesystem associated with it.
 +
 +At any one time there may be multiple active hierarchies of task
 +containers. Each hierarchy is a partition of all tasks in the system.
 +
 +User level code may create and destroy containers by name in an
 +instance of the container virtual file system, specify and query to
 +which container a task is assigned, and list the task pids assigned to
 +a container. Those creations and assignments only affect the hierarchy
 +associated with that instance of the container file system.
 +
 +On their own, the only use for containers is for simple job
 +tracking. The intention is that other subsystems hook into the generic
 +container support to provide new attributes for containers, such as
 +accounting/limiting the resources which processes in a container can
 +access. For example, cpusets (see Documentation/cpusets.txt) allows
 +you to associate a set of CPUs and a set of memory nodes with the
 +tasks in each container.
 +
 +1.2 Why are containers needed ?
 +----------------------------
 +
 +There are multiple efforts to provide process aggregations in the
 +Linux kernel, mainly for resource tracking purposes. Such efforts
 +include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server
 +namespaces. These all require the basic notion of a
 +grouping/partitioning of processes, with newly forked processes ending
 +in the same group (container) as their parent process.
 +
 +The kernel container patch provides the minimum essential kernel
 +mechanisms required to efficiently implement such groups. It has
 +minimal impact on the system fast paths, and provides hooks for
 +specific subsystems such as cpusets to provide additional behaviour as
 +desired.
 +
 +Multiple hierarchy support is provided to allow for situations where
 +the division of tasks into containers is distinctly different for
 +different subsystems - having parallel hierarchies allows each
 +hierarchy to be a natural division of tasks, without having to handle
 +complex combinations of tasks that would be present if several
 +unrelated subsystems needed to be forced into the same tree of
 +containers.
 +
 +At one extreme, each resource controller or subsystem could be in a
 +separate hierarchy; at the other extreme, all subsystems
 +would be attached to the same hierarchy.
 +
 +As an example of a scenario (originally proposed by vatsa@in.ibm.com)
 +that can benefit from multiple hierarchies, consider a large
 +university server with various users - students, professors, system
 +tasks etc. The resource planning for this server could be along the
 +following lines:
 +
 +       CPU :           Top cpuset
 +                       /       \
 +               CPUSet1         CPUSet2
 +                  |              |
 +               (Profs)         (Students)
 +
 +               In addition (system tasks) are attached to topcpuset (so
 +               that they can run anywhere) with a limit of 20%
 +
 +       Memory : Professors (50%), students (30%), system (20%)
 +
 +       Disk : Prof (50%), students (30%), system (20%)
 +
 +       Network : WWW browsing (20%), Network File System (60%), others (20%)
 +                               / \
 +                       Prof (15%) students (5%)
 +
 +Browsers like firefox/lynx go into the WWW network class, while (k)nfsd go
 +into NFS network class.
 +
 +At the same time firefox/lynx will share an appropriate CPU/Memory class
 +depending on who launched it (prof/student).
 +
 +With the ability to classify tasks differently for different resources
 +(by putting those resource subsystems in different hierarchies) then
 +the admin can easily set up a script which receives exec notifications
 +and depending on who is launching the browser he can
 +
 +       # echo browser_pid > /mnt/<restype>/<userclass>/tasks
 +
 +With only a single hierarchy, he now would potentially have to create
 +a separate container for every browser launched and associate it with
 +the appropriate network and other resource classes.  This may lead to
 +a proliferation of such containers.
 +
 +Also let's say that the administrator would like to give enhanced network
 +access temporarily to a student's browser (since it is night and the user
 +wants to do online gaming :)  OR give one of the student's simulation
 +apps enhanced CPU power.
 +
 +With the ability to write pids directly to resource classes, it's just a
 +matter of:
 +
 +       # echo pid > /mnt/network/<new_class>/tasks
 +       (after some time)
 +       # echo pid > /mnt/network/<orig_class>/tasks
 +
 +Without this ability, he would have to split the container into
 +multiple separate ones and then associate the new containers with the
 +new resource classes.
 +
 +
 +
 +1.3 How are containers implemented ?
 +---------------------------------
 +
 +Containers extend the kernel as follows:
 +
 + - Each task in the system has a reference-counted pointer to a
 +   css_group.
 +
 + - A css_group contains a set of reference-counted pointers to
 +   container_subsys_state objects, one for each container subsystem
 +   registered in the system. There is no direct link from a task to
 +   the container of which it's a member in each hierarchy, but this
 +   can be determined by following pointers through the
 +   container_subsys_state objects. This is because accessing the
 +   subsystem state is something that's expected to happen frequently
 +   and in performance-critical code, whereas operations that require a
 +   task's actual container assignments (in particular, moving between
 +   containers) are less common.
 +
 + - A container hierarchy filesystem can be mounted for browsing and
 +   manipulation from user space.
 +
 + - You can list all the tasks (by pid) attached to any container.
 +
 +The implementation of containers requires a few, simple hooks
 +into the rest of the kernel, none in performance critical paths:
 +
 + - in init/main.c, to initialize the root containers and initial
 +   css_group at system boot.
 +
 + - in fork and exit, to attach and detach a task from its css_group.
 +
 +In addition a new file system, of type "container" may be mounted, to
 +enable browsing and modifying the containers presently known to the
 +kernel.  When mounting a container hierarchy, you may specify a
 +comma-separated list of subsystems to mount as the filesystem mount
 +options.  By default, mounting the container filesystem attempts to
 +mount a hierarchy containing all registered subsystems.
 +
 +If an active hierarchy with exactly the same set of subsystems already
 +exists, it will be reused for the new mount. If no existing hierarchy
 +matches, and any of the requested subsystems are in use in an existing
 +hierarchy, the mount will fail with -EBUSY. Otherwise, a new hierarchy
 +is activated, associated with the requested subsystems.
 +
 +It's not currently possible to bind a new subsystem to an active
 +container hierarchy, or to unbind a subsystem from an active container
 +hierarchy. This may be possible in future, but is fraught with nasty
 +error-recovery issues.
 +
 +When a container filesystem is unmounted, if there are any
 +subcontainers created below the top-level container, that hierarchy
 +will remain active even though unmounted; if there are no
 +subcontainers then the hierarchy will be deactivated.
 +
 +No new system calls are added for containers - all support for
 +querying and modifying co
...
 
 
 Re: [PATCH 01/10] Containers(V10): Basic container framework [message #13569 is a reply to message #13542]
 Wed, 30 May 2007 07:15
 Andrew Morton

 On Tue, 29 May 2007 06:01:05 -0700 menage@google.com wrote:
 > This patch adds the main containers framework - the container
 > filesystem, and the basic structures for tracking membership and
 > associating subsystem state objects to tasks.
 >
 > ...
 >
 > --- /dev/null
 > +++ container-2.6.22-rc2-mm1/include/linux/container_subsys.h
 > @@ -0,0 +1,10 @@
 > +/* Add subsystem definitions of the form SUBSYS(<name>) in this
 > + * file. Surround each one by a line of comment markers so that
 > + * patches don't collide
 > + */
 > +
 > +/* */
 > +
 > +/* */
 > +
 > +/* */
 > Index: container-2.6.22-rc2-mm1/include/linux/sched.h
 > ===================================================================
 > --- container-2.6.22-rc2-mm1.orig/include/linux/sched.h
 > +++ container-2.6.22-rc2-mm1/include/linux/sched.h
 > @@ -851,6 +851,34 @@ struct sched_class {
 >  	void (*task_new) (struct rq *rq, struct task_struct *p);
 >  };
 >
 > +#ifdef CONFIG_CONTAINERS
 > +
 > +#define SUBSYS(_x) _x ## _subsys_id,
 > +enum container_subsys_id {
 > +#include <linux/container_subsys.h>
 > +	CONTAINER_SUBSYS_COUNT
 > +};
 > +#undef SUBSYS
 > +
 > +/* A css_group is a structure holding pointers to a set of
 > + * container_subsys_state objects.
 > + */
 > +
 > +struct css_group {
 > +
 > +	/* Set of subsystem states, one for each subsystem. NULL for
 > +	 * subsystems that aren't part of this hierarchy. These
 > +	 * pointers reduce the number of dereferences required to get
 > +	 * from a task to its state for a given container, but result
 > +	 * in increased space usage if tasks are in wildly different
 > +	 * groupings across different hierarchies. This array is
 > +	 * immutable after creation */
 > +	struct container_subsys_state *subsys[CONTAINER_SUBSYS_COUNT];
 > +
 > +};
 
 hm, missing forward declaration of struct container_subsys_state, but it all
 seems to work out (via nested include) once the patches are applied and the
 config option is enableable.
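
 For readers following along, here is a minimal user-space sketch of how the
 SUBSYS() lines turn into the enum, and why a plain forward declaration is
 enough for the pointer array. It is an illustration, not the patch's code:
 the two subsystem names, the SUBSYS_LIST macro (standing in for the #include
 of container_subsys.h) and the css_group_slot_empty() helper are assumptions
 made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for <linux/container_subsys.h>. In the patch the SUBSYS() lines
 * live in that header and are pulled in with #include; a list macro is used
 * here only to keep the sketch in one file. Both names are hypothetical.
 */
#define SUBSYS_LIST \
	SUBSYS(cpuset) \
	SUBSYS(cpuacct)

/* Each SUBSYS(x) entry becomes the enumerator x_subsys_id. */
#define SUBSYS(_x) _x ## _subsys_id,
enum container_subsys_id {
	SUBSYS_LIST
	CONTAINER_SUBSYS_COUNT	/* one past the last real id */
};
#undef SUBSYS

/* Forward declaration: an incomplete type is enough for pointers to it. */
struct container_subsys_state;

struct css_group {
	struct container_subsys_state *subsys[CONTAINER_SUBSYS_COUNT];
};

/* Hypothetical helper: true if this group has no state for subsystem id. */
static int css_group_slot_empty(const struct css_group *cg,
				enum container_subsys_id id)
{
	assert(id < CONTAINER_SUBSYS_COUNT);
	return cg->subsys[id] == NULL;
}
```

 Because CONTAINER_SUBSYS_COUNT is the last enumerator, adding a SUBSYS()
 line grows the array without any other edit.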
 
 >
 > ...
 >
 > --- /dev/null
 > +++ container-2.6.22-rc2-mm1/kernel/container.c
 >
 > ...
 >
 > +#include <linux/cpu.h>
 > +#include <linux/cpumask.h>
 > +#include <linux/container.h>
 > +#include <linux/err.h>
 > +#include <linux/errno.h>
 > +#include <linux/file.h>
 > +#include <linux/fs.h>
 > +#include <linux/init.h>
 > +#include <linux/interrupt.h>
 > +#include <linux/kernel.h>
 > +#include <linux/kmod.h>
 > +#include <linux/list.h>
 > +#include <linux/mempolicy.h>
 > +#include <linux/mm.h>
 > +#include <linux/module.h>
 > +#include <linux/mount.h>
 > +#include <linux/namei.h>
 > +#include <linux/pagemap.h>
 > +#include <linux/proc_fs.h>
 > +#include <linux/rcupdate.h>
 > +#include <linux/sched.h>
 > +#include <linux/seq_file.h>
 > +#include <linux/security.h>
 > +#include <linux/slab.h>
 > +#include <linux/smp_lock.h>
 > +#include <linux/spinlock.h>
 > +#include <linux/stat.h>
 > +#include <linux/string.h>
 > +#include <linux/time.h>
 > +#include <linux/backing-dev.h>
 > +#include <linux/sort.h>
 > +
 > +#include <asm/uaccess.h>
 > +#include <asm/atomic.h>
 > +#include <linux/mutex.h>
 
 Holy cow, do we need all those?
 
 > +typedef enum {
 > +	CONT_REMOVED,
 > +} container_flagbits_t;
 
 typedefs are verboten.  Fortunately this one is never referred to - only
 the values are used, so we can delete it.
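
 A sketch of the typedef-free form, for illustration only. It assumes
 CONT_REMOVED is used as a bit number in a flags word, which this excerpt
 does not show; struct container_sketch and both helpers are hypothetical
 stand-ins, not names from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Flag bit numbers, declared as an anonymous enum instead of a typedef. */
enum {
	CONT_REMOVED,
};

/* Hypothetical stand-in for struct container: only the flags word. */
struct container_sketch {
	unsigned long flags;
};

/* Set the CONT_REMOVED bit. */
static void mark_removed(struct container_sketch *c)
{
	assert(c != NULL);
	c->flags |= 1UL << CONT_REMOVED;
}

/* Return 1 if the CONT_REMOVED bit is set, 0 otherwise. */
static int is_removed(const struct container_sketch *c)
{
	assert(c != NULL);
	return (int)((c->flags >> CONT_REMOVED) & 1UL);
}
```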
 
 >
 > ...
 >
 > +static void container_clear_directory(struct dentry *dentry)
 > +{
 > +	struct list_head *node;
 > +	BUG_ON(!mutex_is_locked(&dentry->d_inode->i_mutex));
 > +	spin_lock(&dcache_lock);
 > +	node = dentry->d_subdirs.next;
 > +	while (node != &dentry->d_subdirs) {
 > +		struct dentry *d = list_entry(node, struct dentry, d_u.d_child);
 > +		list_del_init(node);
 > +		if (d->d_inode) {
 > +			/* This should never be called on a container
 > +			 * directory with child containers */
 > +			BUG_ON(d->d_inode->i_mode & S_IFDIR);
 > +			d = dget_locked(d);
 > +			spin_unlock(&dcache_lock);
 > +			d_delete(d);
 > +			simple_unlink(dentry->d_inode, d);
 > +			dput(d);
 > +			spin_lock(&dcache_lock);
 > +		}
 > +		node = dentry->d_subdirs.next;
 > +	}
 > +	spin_unlock(&dcache_lock);
 > +}
 > +
 > +/*
 > + * NOTE : the dentry must have been dget()'ed
 > + */
 > +static void container_d_remove_dir(struct dentry *dentry)
 > +{
 > +	container_clear_directory(dentry);
 > +
 > +	spin_lock(&dcache_lock);
 > +	list_del_init(&dentry->d_u.d_child);
 > +	spin_unlock(&dcache_lock);
 > +	remove_dir(dentry);
 > +}
 
 Taking dcache_lock in here is unfortunate.  A filesystem really shouldn't
 be playing with that lock.
 
 But about 20 filesystems do so.  Ho hum.
 
 > +static int rebind_subsystems(struct containerfs_root *root,
 > +			      unsigned long final_bits)
 
 The code's a bit short on comments.
 
 > +{
 > +	unsigned long added_bits, removed_bits;
 > +	struct container *cont = &root->top_container;
 > +	int i;
 > +
 > +	removed_bits = root->subsys_bits & ~final_bits;
 > +	added_bits = final_bits & ~root->subsys_bits;
 > +	/* Check that any added subsystems are currently free */
 > +	for (i = 0; i < CONTAINER_SUBSYS_COUNT; i++) {
 > +		unsigned long long bit = 1ull << i;
 > +		struct container_subsys *ss = subsys[i];
 > +		if (!(bit & added_bits))
 > +			continue;
 > +		if (ss->root != &rootnode) {
 > +			/* Subsystem isn't free */
 > +			return -EBUSY;
 > +		}
 > +	}
 > +
 > +	/* Currently we don't handle adding/removing subsystems when
 > +	 * any subcontainers exist. This is theoretically supportable
 > +	 * but involves complex error handling, so it's being left until
 > +	 * later */
 > +	if (!list_empty(&cont->children)) {
 > +		return -EBUSY;
 > +	}
 > +
 > +	/* Process each subsystem */
 > +	for (i = 0; i < CONTAINER_SUBSYS_COUNT; i++) {
 > +		struct container_subsys *ss = subsys[i];
 > +		unsigned long bit = 1UL << i;
 > +		if (bit & added_bits) {
 > +			/* We're binding this subsystem to this hierarchy */
 > +			BUG_ON(cont->subsys[i]);
 > +			BUG_ON(!dummytop->subsys[i]);
 > +			BUG_ON(dummytop->subsys[i]->container != dummytop);
 > +			cont->subsys[i] = dummytop->subsys[i];
 > +			cont->subsys[i]->container = cont;
 > +			list_add(&ss->sibling, &root->subsys_list);
 > +			rcu_assign_pointer(ss->root, root);
 > +			if (ss->bind)
 > +				ss->bind(ss, cont);
 > +
 > +		} else if (bit & removed_bits) {
 > +			/* We're removing this subsystem */
 > +			BUG_ON(cont->subsys[i] != dummytop->subsys[i]);
 > +			BUG_ON(cont->subsys[i]->container != cont);
 > +			if (ss->bind)
 > +				ss->bind(ss, dummytop);
 > +			dummytop->subsys[i]->container = dummytop;
 > +			cont->subsys[i] = NULL;
 > +			rcu_assign_pointer(subsys[i]->root, &rootnode);
 > +			list_del(&ss->sibling);
 > +		} else if (bit & final_bits) {
 > +			/* Subsystem state should already exist */
 > +			BUG_ON(!cont->subsys[i]);
 > +		} else {
 > +			/* Subsystem state shouldn't exist */
 > +			BUG_ON(cont->subsys[i]);
 > +		}
 > +	}
 > +	root->subsys_bits = final_bits;
 > +	synchronize_rcu();
 > +
 > +	return 0;
 > +}
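
 The mask arithmetic at the top of rebind_subsystems() can be tried in
 isolation. The following is a user-space sketch, not kernel code: struct
 rebind_delta and subsys_delta() are hypothetical names, and the only thing
 taken from the patch is the pair of bitwise expressions.

```c
#include <assert.h>

/* Hypothetical result type: which subsystem bits a remount adds or removes. */
struct rebind_delta {
	unsigned long added;
	unsigned long removed;
};

/*
 * Same mask arithmetic as the first two statements of rebind_subsystems():
 * bit i set means "subsystem i is bound to this hierarchy".
 */
static struct rebind_delta subsys_delta(unsigned long current_bits,
					unsigned long final_bits)
{
	struct rebind_delta d;

	d.removed = current_bits & ~final_bits;	/* bound now, not wanted */
	d.added = final_bits & ~current_bits;	/* wanted, not bound yet */
	assert((d.added & d.removed) == 0);	/* a bit can't be both */
	return d;
}
```

 With subsystems 0 and 1 bound (mask 0x3) and a remount asking for 1 and 2
 (mask 0x6), this gives added = 0x4 and removed = 0x1; subsystem 1 is in
 neither set and falls through to the "should already exist" branch.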
 >
 > ...
 >
 > +static int container_remount(struct super_block *sb, int *flags, char *data)
 > +{
 > +	int ret = 0;
 > +	unsigned long subsys_bits;
 > +	struct containerfs_root *root = sb->s_fs_info;
 > +	struct container *cont = &root->top_container;
 > +
 > +	mutex_lock(&cont->dentry->d_inode->i_mutex);
 > +	mutex_lock(&container_mutex);
 
 So container_mutex nests inside i_mutex.  That means that we'll get lockdep
 moaning if anyone does a __GFP_FS allocation inside container_mutex (some
 filesystems can take i_mutex on the ->writepage path, iirc).
 
 Probably a false positive, we can cross that bridge if/when we come to it.
 
 > +	/* See what subsystems are wanted */
 > +	ret = parse_containerfs_options(data, &subsys_bits);
 > +	if (ret)
 > +		goto out_unlock;
 > +
 > +	ret = rebind_subsystems(root, subsys_bits);
 > +
 > +	/* (re)populate subsystem files */
 > +	if (!ret)
 > +		container_populate_dir(cont);
 > +
 > + out_unlock:
 > +	mutex_unlock(&container_mutex);
 > +	mutex_unlock(&cont->dentry->d_inode->i_mutex);
 > +	return ret;
 > +}
 > +
 >
 > ...
 >
 > +
 > +static int container_fill_super(struct super_block *sb, void *options,
 > +				int unused_silent)
 > +{
 > +	struct inode *inode;
 > +	struct dentry *root;
 > +	struct containerfs_root *hroot = options;
 > +
 > +	sb->s_blocksize = PAGE_CACHE_SIZE;
 > +	sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
 > +	sb->s_magic = CONTAINER_SUPER_MAGIC;
 > +	sb->s_op = &container_ops;
 > +
 > +	inode = container_new_inode(S_IFDIR | S_IRUGO | S_IXUGO | S_IWUSR, sb);
 > +	if (!inode)
 > +		return -ENOMEM;
 > +
 > +	inode->i_op = &simple_dir_inode_operations;
 > +	inode->i_fop = &simple_dir_operations;
 > +	inode->i_op = &container_dir_inode_operations;
 > +	/* directories start off with i_nlink == 2 (for "." entry) */
 > +	inc_nlink(inode);
 > +
 > +	root = d_alloc_root(inode);
 > +	if (!root) {
 > +		iput(inode);
 > +		return -ENOMEM;
 
 I bet that iput() hasn't been tested ;)
 
 People have hit unpleasant prob
...
 
 
 Re: [PATCH 01/10] Containers(V10): Basic container framework [message #13581 is a reply to message #13569]
 Wed, 30 May 2007 14:02
 Paul Menage

 On 5/30/07, Andrew Morton <akpm@linux-foundation.org> wrote:
 >
 > Holy cow, do we need all those?
 
 I'll experiment to see which ones we can get rid of.
 
 >
 > > +typedef enum {
 > > +     CONT_REMOVED,
 > > +} container_flagbits_t;
 >
 > typedefs are verboten.  Fortunately this one is never referred to - only
 > the values are used, so we can delete it.
 
 OK.
 
 >
 > Taking dcache_lock in here is unfortunate.  A filesystem really shouldn't
 > be playing with that lock.
 
 Is there a recommended way to do what I want to do, i.e. clear out all
 the dentries from a virtual fs directory and rebuild them whilst
 holding the directory's i_mutex so no one can see the transiently empty
 directory?
 
 >
 > The code's a bit short on comments.
 
 I'll add some.
 
 > > +     root = d_alloc_root(inode);
 > > +     if (!root) {
 > > +             iput(inode);
 > > +             return -ENOMEM;
 >
 > I bet that iput() hasn't been tested ;)
 
 Correct.
 
 >
 > People have hit unpleasant problems before now running iput() against
 > partially-constructed inodes.
 
 What kinds of problems? Are there bits of state that I should fully
 construct even if I'm going to iput() it, or is there a better
 function to call? fs/ext3/super.c seems to do the same thing.
 
 > > +             if (ret)
 > > +                     goto out_unlock;
 >
 > Did we just leak *root?
 
 I believe we did. I'll fix that.
 
 > >
 > > +static inline void get_first_subsys(const struct container *cont,
 > > +                                 struct container_subsys_state **css,
 > > +                                 int *subsys_id) {
 > > +     const struct containerfs_root *root = cont->root;
 > > +     const struct container_subsys *test_ss;
 > > +     BUG_ON(list_empty(&root->subsys_list));
 > > +     test_ss = list_entry(root->subsys_list.next,
 > > +                          struct container_subsys, sibling);
 > > +     if (css) {
 > > +             *css = cont->subsys[test_ss->subsys_id];
 > > +             BUG_ON(!*css);
 > > +     }
 > > +     if (subsys_id)
 > > +             *subsys_id = test_ss->subsys_id;
 > > +}
 >
 > This ends up having several callers and it's too large to inline.
 
 Too large from a compiler PoV or from a style PoV? It's basically just
 six dereferences and two comparisons, plus the BUG_ON()s.
 
 >
 > Do we actually want to support lseek on these things?
 >
 > If not we can leave this null and use nonseekable_open() in ->open.
 
 I inherited that from cpusets without thinking about it too much. I
 guess that we don't really need seekability.
 
 > > +     } else if (S_ISREG(mode)) {
 > > +             inode->i_size = 0;
 > > +             inode->i_fop = &container_file_operations;
 > > +     }
 >
 > The S_ISREG files have no ->i_ops?
 
 Not currently. I don't see anything in inode_operations that we want
 to be able to do on non-directories.
 
 Paul
 
 Re: [PATCH 01/10] Containers(V10): Basic container framework [message #13593 is a reply to message #13581]
 Wed, 30 May 2007 16:00
 Andrew Morton

 On Wed, 30 May 2007 07:02:00 -0700 "Paul Menage" <menage@google.com> wrote:
 >
 > >
 > > People have hit unpleasant problems before now running iput() against
 > > partially-constructed inodes.
 >
 > What kinds of problems? Are there bits of state that I should fully
 > construct even if I'm going to iput() it, or is there a better
 > function to call? fs/ext3/super.c seems to do the same thing.
 
 I don't recall, actually.  But it crashed.
 
 I guess the fault-injection code could be used to trigger errors here.
 
 > > >
 > > > +static inline void get_first_subsys(const struct container *cont,
 > > > +                                 struct container_subsys_state **css,
 > > > +                                 int *subsys_id) {
 > > > +     const struct containerfs_root *root = cont->root;
 > > > +     const struct container_subsys *test_ss;
 > > > +     BUG_ON(list_empty(&root->subsys_list));
 > > > +     test_ss = list_entry(root->subsys_list.next,
 > > > +                          struct container_subsys, sibling);
 > > > +     if (css) {
 > > > +             *css = cont->subsys[test_ss->subsys_id];
 > > > +             BUG_ON(!*css);
 > > > +     }
 > > > +     if (subsys_id)
 > > > +             *subsys_id = test_ss->subsys_id;
 > > > +}
 > >
 > > This ends up having several callers and it's too large to inline.
 >
 > Too large from a compiler PoV or from a style PoV? It's basically just
 > six dereferences and two comparisons, plus the BUG_ON()s.
 
 It will end up generating more .text this way.  We figure that this makes
 it slower, due to increased icache footprint.
 
 Re: [PATCH 01/10] Containers(V10): Basic container framework [message #14054 is a reply to message #13542]
 Wed, 13 June 2007 10:17
 Dhaval Giani

 Hi,
 On Tue, May 29, 2007 at 06:01:05AM -0700, menage@google.com wrote:
 > +1.5 How do I use containers ?
 > +--------------------------
 > +
 > +To start a new job that is to be contained within a container, using
 > +the "cpuset" container subsystem, the steps are something like:
 > +
 > + 1) mkdir /dev/container
 > + 2) mount -t container -ocpuset cpuset /dev/container
 > + 3) Create the new container by doing mkdir's and write's (or echo's) in
 > +    the /dev/container virtual file system.
 > + 4) Start a task that will be the "founding father" of the new job.
 > + 5) Attach that task to the new container by writing its pid to the
 > +    /dev/container tasks file for that container.
 > + 6) fork, exec or clone the job tasks from this founding father task.
 > +
 > +For example, the following sequence of commands will setup a container
 > +named "Charlie", containing just CPUs 2 and 3, and Memory Node 1,
 > +and then start a subshell 'sh' in that container:
 > +
 > +  mount -t container cpuset -ocpuset /dev/container
 > +  cd /dev/container
 > +  mkdir Charlie
 > +  cd Charlie
 
 This example does not work as written. To make it work, we need to add
 
 /bin/echo 2-3 > cpus
 /bin/echo 1 > mems
 
 > +  /bin/echo $$ > tasks
 > +  sh
 > +  # The subshell 'sh' is now running in container Charlie
 > +  # The next line should display '/Charlie'
 > +  cat /proc/self/container
 
 The following patch does that.
 
 thanks and regards
 Dhaval
 
 ----------------------
 
 Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>
 
 
 diff -uprN linux-2.6.22-rc4/Documentation/containers.txt old/Documentation/containers.txt
 --- linux-2.6.22-rc4/Documentation/containers.txt	2007-06-13 15:38:30.000000000 +0530
 +++ old/Documentation/containers.txt	2007-06-13 10:56:49.000000000 +0530
 @@ -310,6 +310,8 @@ and then start a subshell 'sh' in that c
    cd /dev/container
    mkdir Charlie
    cd Charlie
 +  /bin/echo 2-3 > cpus
 +  /bin/echo 1 > mems
    /bin/echo $$ > tasks
    sh
    # The subshell 'sh' is now running in container Charlie
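
 Step 5 of the quoted procedure (and the "/bin/echo $$ > tasks" line)
 amounts to writing a decimal pid to the container's tasks file. Below is a
 minimal C sketch of that write, assuming the hierarchy is mounted as in the
 example; attach_pid() is a hypothetical helper name, not part of any kernel
 or libc interface.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Write a pid, followed by a newline, to a container "tasks" file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int attach_pid(const char *tasks_path, pid_t pid)
{
	FILE *f;

	assert(tasks_path != NULL);

	f = fopen(tasks_path, "w");
	if (!f)
		return -1;
	if (fprintf(f, "%ld\n", (long)pid) < 0) {
		fclose(f);
		return -1;
	}
	/* stdio buffers the write, so a failure may only show up here */
	return fclose(f) == 0 ? 0 : -1;
}
```

 In the Charlie example a caller would pass "/dev/container/Charlie/tasks"
 and getpid(). As in the shell version, cpus and mems have to be written
 before a task is attached.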