OpenVZ Forum: Devel » [PATCH 01/10] Containers(V10): Basic container framework

[PATCH 01/10] Containers(V10): Basic container framework [message #13542]

Tue, 29 May 2007 13:01

Paul Menage

This patch adds the main containers framework - the container
filesystem, and the basic structures for tracking membership and
associating subsystem state objects to tasks.

Signed-off-by: Paul Menage <menage@google.com>

---
Documentation/containers.txt | 524 +++++++++++++++++
include/linux/container.h | 198 ++++++
include/linux/container_subsys.h | 10
include/linux/sched.h | 34 +
init/Kconfig | 3
init/main.c | 3
kernel/Makefile | 1
kernel/container.c | 1155 +++++++++++++++++++++++++++++++++++++++
8 files changed, 1927 insertions(+), 1 deletion(-)

Index: container-2.6.22-rc2-mm1/Documentation/containers.txt
===================================================================
--- /dev/null
+++ container-2.6.22-rc2-mm1/Documentation/containers.txt
@@ -0,0 +1,524 @@
+ CONTAINERS
+ -------
+
+Written by Paul Menage <menage@google.com> based on Documentation/cpusets.txt
+
+Original copyright statements from cpusets.txt:
+Portions Copyright (C) 2004 BULL SA.
+Portions Copyright (c) 2004-2006 Silicon Graphics, Inc.
+Modified by Paul Jackson <pj@sgi.com>
+Modified by Christoph Lameter <clameter@sgi.com>
+
+CONTENTS:
+=========
+
+1. Containers
+ 1.1 What are containers ?
+ 1.2 Why are containers needed ?
+ 1.3 How are containers implemented ?
+ 1.4 What does notify_on_release do ?
+ 1.5 How do I use containers ?
+2. Usage Examples and Syntax
+ 2.1 Basic Usage
+ 2.2 Attaching processes
+3. Kernel API
+ 3.1 Overview
+ 3.2 Synchronization
+ 3.3 Subsystem API
+4. Questions
+
+1. Containers
+==========
+
+1.1 What are containers ?
+----------------------
+
+Containers provide a mechanism for aggregating/partitioning sets of
+tasks, and all their future children, into hierarchical groups with
+specialized behaviour.
+
+Definitions:
+
+A *container* associates a set of tasks with a set of parameters for one
+or more subsystems.
+
+A *subsystem* is a module that makes use of the task grouping
+facilities provided by containers to treat groups of tasks in
+particular ways. A subsystem is typically a "resource controller" that
+schedules a resource or applies per-container limits, but it may be
+anything that wants to act on a group of processes, e.g. a
+virtualization subsystem.
+
+A *hierarchy* is a set of containers arranged in a tree, such that
+every task in the system is in exactly one of the containers in the
+hierarchy, and a set of subsystems; each subsystem has system-specific
+state attached to each container in the hierarchy. Each hierarchy has
+an instance of the container virtual filesystem associated with it.
+
+At any one time there may be multiple active hierarchies of task
+containers. Each hierarchy is a partition of all tasks in the system.
+
+User level code may create and destroy containers by name in an
+instance of the container virtual file system, specify and query to
+which container a task is assigned, and list the task pids assigned to
+a container. Those creations and assignments only affect the hierarchy
+associated with that instance of the container file system.
+
+On their own, the only use for containers is for simple job
+tracking. The intention is that other subsystems hook into the generic
+container support to provide new attributes for containers, such as
+accounting/limiting the resources which processes in a container can
+access. For example, cpusets (see Documentation/cpusets.txt) allows
+you to associate a set of CPUs and a set of memory nodes with the
+tasks in each container.
+
+1.2 Why are containers needed ?
+----------------------------
+
+There are multiple efforts to provide process aggregations in the
+Linux kernel, mainly for resource tracking purposes. Such efforts
+include cpusets, CKRM/ResGroups, UserBeanCounters, and virtual server
+namespaces. These all require the basic notion of a
+grouping/partitioning of processes, with newly forked processes ending
+in the same group (container) as their parent process.
+
+The kernel container patch provides the minimum essential kernel
+mechanisms required to efficiently implement such groups. It has
+minimal impact on the system fast paths, and provides hooks for
+specific subsystems such as cpusets to provide additional behaviour as
+desired.
+
+Multiple hierarchy support is provided to allow for situations where
+the division of tasks into containers is distinctly different for
+different subsystems - having parallel hierarchies allows each
+hierarchy to be a natural division of tasks, without having to handle
+complex combinations of tasks that would be present if several
+unrelated subsystems needed to be forced into the same tree of
+containers.
+
+At one extreme, each resource controller or subsystem could be in a
+separate hierarchy; at the other extreme, all subsystems
+would be attached to the same hierarchy.
+
+As an example of a scenario (originally proposed by vatsa@in.ibm.com)
+that can benefit from multiple hierarchies, consider a large
+university server with various users - students, professors, system
+tasks etc. The resource planning for this server could be along the
+following lines:
+
+ CPU : Top cpuset
+ / \
+ CPUSet1 CPUSet2
+ | |
+ (Profs) (Students)
+
+ In addition (system tasks) are attached to topcpuset (so
+ that they can run anywhere) with a limit of 20%
+
+ Memory : Professors (50%), students (30%), system (20%)
+
+ Disk : Prof (50%), students (30%), system (20%)
+
+ Network : WWW browsing (20%), Network File System (60%), others (20%)
+ / \
+ Prof (15%) students (5%)
+
+Browsers like firefox/lynx go into the WWW network class, while (k)nfsd go
+into NFS network class.
+
+At the same time firefox/lynx will share an appropriate CPU/Memory class
+depending on who launched it (prof/student).
+
+With the ability to classify tasks differently for different resources
+(by putting those resource subsystems in different hierarchies), the
+admin can easily set up a script which receives exec notifications
+and, depending on who is launching the browser, runs:
+
+ # echo browser_pid > /mnt/<restype>/<userclass>/tasks
+
+With only a single hierarchy, he would potentially have to create
+a separate container for every browser launched and associate it with
+the appropriate network and other resource classes. This may lead to
+a proliferation of such containers.
+
+Also let's say that the administrator would like to give enhanced network
+access temporarily to a student's browser (since it is night and the user
+wants to do online gaming :)), or give one of the students' simulation
+apps enhanced CPU power.
+
+With the ability to write pids directly to resource classes, it's just a
+matter of:
+
+ # echo pid > /mnt/network/<new_class>/tasks
+ (after some time)
+ # echo pid > /mnt/network/<orig_class>/tasks
+
+Without this ability, he would have to split the container into
+multiple separate ones and then associate the new containers with the
+new resource classes.
+
+
+
+1.3 How are containers implemented ?
+---------------------------------
+
+Containers extend the kernel as follows:
+
+ - Each task in the system has a reference-counted pointer to a
+ css_group.
+
+ - A css_group contains a set of reference-counted pointers to
+ container_subsys_state objects, one for each container subsystem
+ registered in the system. There is no direct link from a task to
+ the container of which it's a member in each hierarchy, but this
+ can be determined by following pointers through the
+ container_subsys_state objects. This is because accessing the
+ subsystem state is something that's expected to happen frequently
+ and in performance-critical code, whereas operations that require a
+ task's actual container assignments (in particular, moving between
+ containers) are less common.
+
+ - A container hierarchy filesystem can be mounted for browsing and
+ manipulation from user space.
+
+ - You can list all the tasks (by pid) attached to any container.
+
+The implementation of containers requires a few simple hooks
+into the rest of the kernel, none in performance-critical paths:
+
+ - in init/main.c, to initialize the root containers and initial
+ css_group at system boot.
+
+ - in fork and exit, to attach and detach a task from its css_group.
+
+In addition, a new file system of type "container" may be mounted, to
+enable browsing and modifying the containers presently known to the
+kernel. When mounting a container hierarchy, you may specify a
+comma-separated list of subsystems to mount as the filesystem mount
+options. By default, mounting the container filesystem attempts to
+mount a hierarchy containing all registered subsystems.
+
+If an active hierarchy with exactly the same set of subsystems already
+exists, it will be reused for the new mount. If no existing hierarchy
+matches, and any of the requested subsystems are in use in an existing
+hierarchy, the mount will fail with -EBUSY. Otherwise, a new hierarchy
+is activated, associated with the requested subsystems.
+
+It's not currently possible to bind a new subsystem to an active
+container hierarchy, or to unbind a subsystem from an active container
+hierarchy. This may be possible in future, but is fraught with nasty
+error-recovery issues.
+
+When a container filesystem is unmounted, if there are any
+subcontainers created below the top-level container, that hierarchy
+will remain active even though unmounted; if there are no
+subcontainers then the hierarchy will be deactivated.
+
+No new system calls are added for containers - all support for
+querying and modifying co ...
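The mount rules described above (reuse a hierarchy whose subsystem set matches exactly, fail with -EBUSY if any requested subsystem is already in use elsewhere, otherwise activate a new hierarchy) can be sketched as a small userspace model. This is only an illustration and not code from the patch: decide_mount(), enum mount_action and the one-bit-per-subsystem masks are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: each active hierarchy is a bitmask of subsystems. */
enum mount_action { MOUNT_REUSE, MOUNT_NEW };

/*
 * Decide what mounting the subsystem set `wanted` should do.
 * Returns 0 and fills *action (plus *reuse_idx for MOUNT_REUSE),
 * or -EBUSY if a requested subsystem is bound to another hierarchy.
 */
int decide_mount(const unsigned long *active, size_t n_active,
                 unsigned long wanted,
                 enum mount_action *action, size_t *reuse_idx)
{
    size_t i;

    assert(action && reuse_idx);

    /* An active hierarchy with exactly the same set is reused. */
    for (i = 0; i < n_active; i++) {
        if (active[i] == wanted) {
            *action = MOUNT_REUSE;
            *reuse_idx = i;
            return 0;
        }
    }

    /* No exact match: any overlap means a subsystem is already in use. */
    for (i = 0; i < n_active; i++) {
        if (active[i] & wanted)
            return -EBUSY;
    }

    /* All requested subsystems are free: activate a new hierarchy. */
    *action = MOUNT_NEW;
    return 0;
}
```

With active hierarchies {0x3, 0x4}, a request for 0x4 is a reuse, a request for 0x1 fails with -EBUSY, and a request for 0x8 activates a new hierarchy.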


Re: [PATCH 01/10] Containers(V10): Basic container framework [message #13565 is a reply to message #13542]

Wed, 30 May 2007 07:15

Andrew Morton

On Tue, 29 May 2007 06:01:05 -0700 menage@google.com wrote:

> +For example, the following sequence of commands will set up a container
> +named "Charlie", containing just CPUs 2 and 3, and Memory Node 1,
> +and then start a subshell 'sh' in that container:
> +
> + mount -t container cpuset -ocpuset /dev/container
> + cd /dev/container
> + mkdir Charlie
> + cd Charlie
> + /bin/echo $$ > tasks
> + sh
> + # The subshell 'sh' is now running in container Charlie
> + # The next line should display '/Charlie'
> + cat /proc/self/container

Once this has been done, can tasks inside `Charlie' escape from it?

And what permissions are needed to expand the various allotments (if
that's the approved term) for `Charlie'?


Re: [PATCH 01/10] Containers(V10): Basic container framework [message #13569 is a reply to message #13542]

Wed, 30 May 2007 07:15

Andrew Morton

On Tue, 29 May 2007 06:01:05 -0700 menage@google.com wrote:

> This patch adds the main containers framework - the container
> filesystem, and the basic structures for tracking membership and
> associating subsystem state objects to tasks.
>
> ...
>
> --- /dev/null
> +++ container-2.6.22-rc2-mm1/include/linux/container_subsys.h
> @@ -0,0 +1,10 @@
> +/* Add subsystem definitions of the form SUBSYS(<name>) in this
> + * file. Surround each one by a line of comment markers so that
> + * patches don't collide
> + */
> +
> +/* */
> +
> +/* */
> +
> +/* */
> Index: container-2.6.22-rc2-mm1/include/linux/sched.h
> ===================================================================
> --- container-2.6.22-rc2-mm1.orig/include/linux/sched.h
> +++ container-2.6.22-rc2-mm1/include/linux/sched.h
> @@ -851,6 +851,34 @@ struct sched_class {
> void (*task_new) (struct rq *rq, struct task_struct *p);
> };
>
> +#ifdef CONFIG_CONTAINERS
> +
> +#define SUBSYS(_x) _x ## _subsys_id,
> +enum container_subsys_id {
> +#include <linux/container_subsys.h>
> + CONTAINER_SUBSYS_COUNT
> +};
> +#undef SUBSYS
> +
> +/* A css_group is a structure holding pointers to a set of
> + * container_subsys_state objects.
> + */
> +
> +struct css_group {
> +
> + /* Set of subsystem states, one for each subsystem. NULL for
> + * subsystems that aren't part of this hierarchy. These
> + * pointers reduce the number of dereferences required to get
> + * from a task to its state for a given container, but result
> + * in increased space usage if tasks are in wildly different
> + * groupings across different hierarchies. This array is
> + * immutable after creation */
> + struct container_subsys_state *subsys[CONTAINER_SUBSYS_COUNT];
> +
> +};

hm, missing forward declaration of struct container_subsys_state, but it all
seems to work out (via nested include) once the patches are applied and the
config option is enableable.
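For readers unfamiliar with the SUBSYS() trick in the quoted hunk, here is a minimal userspace sketch of the same pattern, with the forward declaration written out explicitly. It is an illustration under stated assumptions, not the patch's code: the list lives in a SUBSYS_LIST macro instead of a re-included header, the "memctl" subsystem name is made up, and struct task_model, its groups field and task_container() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for container_subsys.h; "memctl" is a made-up entry. */
#define SUBSYS_LIST \
    SUBSYS(cpuset)  \
    SUBSYS(memctl)

/* Expand every entry to an enumerator; the last enumerator counts them. */
#define SUBSYS(_x) _x ## _subsys_id,
enum container_subsys_id {
    SUBSYS_LIST
    CONTAINER_SUBSYS_COUNT
};
#undef SUBSYS

struct container;                     /* forward declaration */

struct container_subsys_state {
    struct container *container;      /* container owning this state */
};

struct container {
    const char *name;
};

struct css_group {
    /* One slot per subsystem; NULL if the subsystem is not bound. */
    struct container_subsys_state *subsys[CONTAINER_SUBSYS_COUNT];
};

/* Hypothetical stand-in for the task_struct field added by the patch. */
struct task_model {
    struct css_group *groups;
};

/* Follow task -> css_group -> subsystem state -> container. */
struct container *task_container(const struct task_model *t,
                                 enum container_subsys_id id)
{
    struct container_subsys_state *css;

    assert(t && t->groups && id < CONTAINER_SUBSYS_COUNT);
    css = t->groups->subsys[id];
    return css ? css->container : NULL;
}
```

The subsystem state is one dereference away from the css_group, while the container itself needs a second hop, which matches the trade-off the patch description gives for this layout.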

>
> ...
>
> --- /dev/null
> +++ container-2.6.22-rc2-mm1/kernel/container.c
>
> ...
>
> +#include <linux/cpu.h>
> +#include <linux/cpumask.h>
> +#include <linux/container.h>
> +#include <linux/err.h>
> +#include <linux/errno.h>
> +#include <linux/file.h>
> +#include <linux/fs.h>
> +#include <linux/init.h>
> +#include <linux/interrupt.h>
> +#include <linux/kernel.h>
> +#include <linux/kmod.h>
> +#include <linux/list.h>
> +#include <linux/mempolicy.h>
> +#include <linux/mm.h>
> +#include <linux/module.h>
> +#include <linux/mount.h>
> +#include <linux/namei.h>
> +#include <linux/pagemap.h>
> +#include <linux/proc_fs.h>
> +#include <linux/rcupdate.h>
> +#include <linux/sched.h>
> +#include <linux/seq_file.h>
> +#include <linux/security.h>
> +#include <linux/slab.h>
> +#include <linux/smp_lock.h>
> +#include <linux/spinlock.h>
> +#include <linux/stat.h>
> +#include <linux/string.h>
> +#include <linux/time.h>
> +#include <linux/backing-dev.h>
> +#include <linux/sort.h>
> +
> +#include <asm/uaccess.h>
> +#include <asm/atomic.h>
> +#include <linux/mutex.h>

Holy cow, do we need all those?

> +typedef enum {
> + CONT_REMOVED,
> +} container_flagbits_t;

typedefs are verboten. Fortunately this one is never referred to - only
the values are used, so we can delete it.
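A minimal sketch of the typedef-free form, assuming the enumerator is used as a bit number in a flags word. struct container_model and the two helpers are hypothetical, and the plain (non-atomic) bit operations here are a userspace simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Anonymous enum: no typedef, only the enumerator values are used. */
enum {
    CONT_REMOVED,   /* bit number within the flags word */
};

/* Hypothetical stand-in for the structure that carries the flags. */
struct container_model {
    unsigned long flags;
};

/* Set flag bit `bit` (non-atomic, for illustration only). */
void cont_set_flag(struct container_model *c, int bit)
{
    assert(c && bit >= 0);
    c->flags |= 1UL << bit;
}

/* Report whether flag bit `bit` is set. */
bool cont_test_flag(const struct container_model *c, int bit)
{
    assert(c && bit >= 0);
    return (c->flags >> bit) & 1UL;
}
```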

>
> ...
>
> +static void container_clear_directory(struct dentry *dentry)
> +{
> + struct list_head *node;
> + BUG_ON(!mutex_is_locked(&dentry->d_inode->i_mutex));
> + spin_lock(&dcache_lock);
> + node = dentry->d_subdirs.next;
> + while (node != &dentry->d_subdirs) {
> + struct dentry *d = list_entry(node, struct dentry, d_u.d_child);
> + list_del_init(node);
> + if (d->d_inode) {
> + /* This should never be called on a container
> + * directory with child containers */
> + BUG_ON(d->d_inode->i_mode & S_IFDIR);
> + d = dget_locked(d);
> + spin_unlock(&dcache_lock);
> + d_delete(d);
> + simple_unlink(dentry->d_inode, d);
> + dput(d);
> + spin_lock(&dcache_lock);
> + }
> + node = dentry->d_subdirs.next;
> + }
> + spin_unlock(&dcache_lock);
> +}
> +
> +/*
> + * NOTE : the dentry must have been dget()'ed
> + */
> +static void container_d_remove_dir(struct dentry *dentry)
> +{
> + container_clear_directory(dentry);
> +
> + spin_lock(&dcache_lock);
> + list_del_init(&dentry->d_u.d_child);
> + spin_unlock(&dcache_lock);
> + remove_dir(dentry);
> +}

Taking dcache_lock in here is unfortunate. A filesystem really shouldn't
be playing with that lock.

But about 20 filesystems do so. Ho hum.

> +static int rebind_subsystems(struct containerfs_root *root,
> + unsigned long final_bits)

The code's a bit short on comments.

> +{
> + unsigned long added_bits, removed_bits;
> + struct container *cont = &root->top_container;
> + int i;
> +
> + removed_bits = root->subsys_bits & ~final_bits;
> + added_bits = final_bits & ~root->subsys_bits;
> + /* Check that any added subsystems are currently free */
> + for (i = 0; i < CONTAINER_SUBSYS_COUNT; i++) {
> + unsigned long long bit = 1ull << i;
> + struct container_subsys *ss = subsys[i];
> + if (!(bit & added_bits))
> + continue;
> + if (ss->root != &rootnode) {
> + /* Subsystem isn't free */
> + return -EBUSY;
> + }
> + }
> +
> + /* Currently we don't handle adding/removing subsystems when
> + * any subcontainers exist. This is theoretically supportable
> + * but involves complex error handling, so it's being left until
> + * later */
> + if (!list_empty(&cont->children)) {
> + return -EBUSY;
> + }
> +
> + /* Process each subsystem */
> + for (i = 0; i < CONTAINER_SUBSYS_COUNT; i++) {
> + struct container_subsys *ss = subsys[i];
> + unsigned long bit = 1UL << i;
> + if (bit & added_bits) {
> + /* We're binding this subsystem to this hierarchy */
> + BUG_ON(cont->subsys[i]);
> + BUG_ON(!dummytop->subsys[i]);
> + BUG_ON(dummytop->subsys[i]->container != dummytop);
> + cont->subsys[i] = dummytop->subsys[i];
> + cont->subsys[i]->container = cont;
> + list_add(&ss->sibling, &root->subsys_list);
> + rcu_assign_pointer(ss->root, root);
> + if (ss->bind)
> + ss->bind(ss, cont);
> +
> + } else if (bit & removed_bits) {
> + /* We're removing this subsystem */
> + BUG_ON(cont->subsys[i] != dummytop->subsys[i]);
> + BUG_ON(cont->subsys[i]->container != cont);
> + if (ss->bind)
> + ss->bind(ss, dummytop);
> + dummytop->subsys[i]->container = dummytop;
> + cont->subsys[i] = NULL;
> + rcu_assign_pointer(subsys[i]->root, &rootnode);
> + list_del(&ss->sibling);
> + } else if (bit & final_bits) {
> + /* Subsystem state should already exist */
> + BUG_ON(!cont->subsys[i]);
> + } else {
> + /* Subsystem state shouldn't exist */
> + BUG_ON(cont->subsys[i]);
> + }
> + }
> + root->subsys_bits = final_bits;
> + synchronize_rcu();
> +
> + return 0;
> +}
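The bitmask arithmetic at the top of rebind_subsystems() can be tried in isolation. The sketch below is a userspace model only: struct rebind_plan and plan_rebind() are hypothetical names, and the "kept" mask is added for illustration (it corresponds to the "state should already exist" branch).

```c
#include <assert.h>

/* Result of comparing the current subsystem mask with the requested one. */
struct rebind_plan {
    unsigned long added;    /* requested but not currently bound */
    unsigned long removed;  /* currently bound but no longer requested */
    unsigned long kept;     /* bound before and after */
};

/* Split the transition current_bits -> final_bits into three disjoint sets. */
struct rebind_plan plan_rebind(unsigned long current_bits,
                               unsigned long final_bits)
{
    struct rebind_plan p;

    p.removed = current_bits & ~final_bits;
    p.added = final_bits & ~current_bits;
    p.kept = current_bits & final_bits;
    return p;
}
```

For example, going from 0x6 to 0x3 removes bit 2 (0x4), adds bit 0 (0x1) and keeps bit 1 (0x2). The added and removed sets never overlap, so each subsystem takes exactly one branch of the loop.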
>
> ...
>
> +static int container_remount(struct super_block *sb, int *flags, char *data)
> +{
> + int ret = 0;
> + unsigned long subsys_bits;
> + struct containerfs_root *root = sb->s_fs_info;
> + struct container *cont = &root->top_container;
> +
> + mutex_lock(&cont->dentry->d_inode->i_mutex);
> + mutex_lock(&container_mutex);

So container_mutex nests inside i_mutex. That means that we'll get lockdep
moaning if anyone does a __GFP_FS allocation inside container_mutex (some
filesystems can take i_mutex on the ->writepage path, iirc).

Probably a false positive, we can cross that bridge if/when we come to it.

> + /* See what subsystems are wanted */
> + ret = parse_containerfs_options(data, &subsys_bits);
> + if (ret)
> + goto out_unlock;
> +
> + ret = rebind_subsystems(root, subsys_bits);
> +
> + /* (re)populate subsystem files */
> + if (!ret)
> + container_populate_dir(cont);
> +
> + out_unlock:
> + mutex_unlock(&container_mutex);
> + mutex_unlock(&cont->dentry->d_inode->i_mutex);
> + return ret;
> +}
> +
>
> ...
>
> +
> +static int container_fill_super(struct super_block *sb, void *options,
> + int unused_silent)
> +{
> + struct inode *inode;
> + struct dentry *root;
> + struct containerfs_root *hroot = options;
> +
> + sb->s_blocksize = PAGE_CACHE_SIZE;
> + sb->s_blocksize_bits = PAGE_CACHE_SHIFT;
> + sb->s_magic = CONTAINER_SUPER_MAGIC;
> + sb->s_op = &container_ops;
> +
> + inode = container_new_inode(S_IFDIR | S_IRUGO | S_IXUGO | S_IWUSR, sb);
> + if (!inode)
> + return -ENOMEM;
> +
> + inode->i_op = &simple_dir_inode_operations;
> + inode->i_fop = &simple_dir_operations;
> + inode->i_op = &container_dir_inode_operations;
> + /* directories start off with i_nlink == 2 (for "." entry) */
> + inc_nlink(inode);
> +
> + root = d_alloc_root(inode);
> + if (!root) {
> + iput(inode);
> + return -ENOMEM;

I bet that iput() hasn't been tested ;)

People have hit unpleasant prob ...


Re: [PATCH 01/10] Containers(V10): Basic container framework [message #13581 is a reply to message #13569]

Wed, 30 May 2007 14:02

Paul Menage

On 5/30/07, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> Holy cow, do we need all those?

I'll experiment to see which ones we can get rid of.

>
> > +typedef enum {
> > + CONT_REMOVED,
> > +} container_flagbits_t;
>
> typedefs are verboten. Fortunately this one is never referred to - only
> the values are used, so we can delete it.

OK.

>
> Taking dcache_lock in here is unfortunate. A filesystem really shouldn't
> be playing with that lock.

Is there a recommended way to do what I want to do, i.e. clear out all
the dentries from a virtual fs directory and rebuild them whilst
holding the directory's i_sem so no one can see the transiently empty
directory?

>
> The code's a bit short on comments.

I'll add some.

> > + root = d_alloc_root(inode);
> > + if (!root) {
> > + iput(inode);
> > + return -ENOMEM;
>
> I bet that iput() hasn't been tested ;)

Correct.

>
> People have hit unpleasant problems before now running iput() against
> partially-constructed inodes.

What kinds of problems? Are there bits of state that I should fully
construct even if I'm going to iput() it, or is there a better
function to call? fs/ext3/super.c seems to do the same thing.

> > + if (ret)
> > + goto out_unlock;
>
> Did we just leak *root?

I believe we did. I'll fix that.

> >
> > +static inline void get_first_subsys(const struct container *cont,
> > + struct container_subsys_state **css,
> > + int *subsys_id) {
> > + const struct containerfs_root *root = cont->root;
> > + const struct container_subsys *test_ss;
> > + BUG_ON(list_empty(&root->subsys_list));
> > + test_ss = list_entry(root->subsys_list.next,
> > + struct container_subsys, sibling);
> > + if (css) {
> > + *css = cont->subsys[test_ss->subsys_id];
> > + BUG_ON(!*css);
> > + }
> > + if (subsys_id)
> > + *subsys_id = test_ss->subsys_id;
> > +}
>
> This ends up having several callers and it's too large to inline.

Too large from a compiler PoV or from a style PoV? It's basically just
six dereferences and two comparisons, plus the BUG_ON()s.

>
> Do we actually want to support lseek on these things?
>
> If not we can leave this null and use nonseekable_open() in ->open.

I inherited that from cpusets without thinking about it too much. I
guess that we don't really need seekability.

> > + } else if (S_ISREG(mode)) {
> > + inode->i_size = 0;
> > + inode->i_fop = &container_file_operations;
> > + }
>
> The S_ISREG files have no ->i_ops?

Not currently. I don't see anything in inode_operations that we want
to be able to do on non-directories.

Paul


Re: [PATCH 01/10] Containers(V10): Basic container framework [message #13593 is a reply to message #13581]

Wed, 30 May 2007 16:00

Andrew Morton

On Wed, 30 May 2007 07:02:00 -0700 "Paul Menage" <menage@google.com> wrote:

>
> >
> > People have hit unpleasant problems before now running iput() against
> > partially-constructed inodes.
>
> What kinds of problems? Are there bits of state that I should fully
> construct even if I'm going to iput() it, or is there a better
> function to call? fs/ext3/super.c seems to do the same thing.

I don't recall, actually. But it crashed.

I guess the fault-injection code could be used to trigger errors here.

> > >
> > > +static inline void get_first_subsys(const struct container *cont,
> > > + struct container_subsys_state **css,
> > > + int *subsys_id) {
> > > + const struct containerfs_root *root = cont->root;
> > > + const struct container_subsys *test_ss;
> > > + BUG_ON(list_empty(&root->subsys_list));
> > > + test_ss = list_entry(root->subsys_list.next,
> > > + struct container_subsys, sibling);
> > > + if (css) {
> > > + *css = cont->subsys[test_ss->subsys_id];
> > > + BUG_ON(!*css);
> > > + }
> > > + if (subsys_id)
> > > + *subsys_id = test_ss->subsys_id;
> > > +}
> >
> > This ends up having several callers and it's too large to inline.
>
> Too large from a compiler PoV or from a style PoV? It's basically just
> six dereferences and two comparisons, plus the BUG_ON()s.

It will end up generating more .text this way. We figure that this makes
it slower, due to increased icache footprint.


Re: [PATCH 01/10] Containers(V10): Basic container framework [message #14054 is a reply to message #13542]

Wed, 13 June 2007 10:17

Dhaval Giani

Hi,

On Tue, May 29, 2007 at 06:01:05AM -0700, menage@google.com wrote:
> +1.5 How do I use containers ?
> +--------------------------
> +
> +To start a new job that is to be contained within a container, using
> +the "cpuset" container subsystem, the steps are something like:
> +
> + 1) mkdir /dev/container
> + 2) mount -t container -ocpuset cpuset /dev/container
> + 3) Create the new container by doing mkdir's and write's (or echo's) in
> + the /dev/container virtual file system.
> + 4) Start a task that will be the "founding father" of the new job.
> + 5) Attach that task to the new container by writing its pid to the
> + /dev/container tasks file for that container.
> + 6) fork, exec or clone the job tasks from this founding father task.
> +
> +For example, the following sequence of commands will set up a container
> +named "Charlie", containing just CPUs 2 and 3, and Memory Node 1,
> +and then start a subshell 'sh' in that container:
> +
> + mount -t container cpuset -ocpuset /dev/container
> + cd /dev/container
> + mkdir Charlie
> + cd Charlie

This example does not work as written. For it to work, cpus and mems
must be set first:
/bin/echo 2-3 > cpus
/bin/echo 1 > mems

> + /bin/echo $$ > tasks
> + sh
> + # The subshell 'sh' is now running in container Charlie
> + # The next line should display '/Charlie'
> + cat /proc/self/container

The following patch does that.

thanks and regards
Dhaval

----------------------

Signed-off-by: Dhaval Giani <dhaval@linux.vnet.ibm.com>

diff -uprN linux-2.6.22-rc4/Documentation/containers.txt old/Documentation/containers.txt
--- linux-2.6.22-rc4/Documentation/containers.txt 2007-06-13 15:38:30.000000000 +0530
+++ old/Documentation/containers.txt 2007-06-13 10:56:49.000000000 +0530
@@ -310,6 +310,8 @@ and then start a subshell 'sh' in that c
cd /dev/container
mkdir Charlie
cd Charlie
+ /bin/echo 2-3 > cpus
+ /bin/echo 1 > mems
/bin/echo $$ > tasks
sh
# The subshell 'sh' is now running in container Charlie
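The corrected sequence can also be driven from a program. The following is a rough userspace sketch, assuming the container filesystem is already mounted at the given root and that control files accept plain text writes as in the shell example. write_control() and setup_container() are hypothetical helpers, not part of any API in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Write value plus a newline to <dir>/<file>. Returns 0 or -errno. */
int write_control(const char *dir, const char *file, const char *value)
{
    char path[4096];
    FILE *f;
    int n;

    assert(dir && file && value);
    n = snprintf(path, sizeof(path), "%s/%s", dir, file);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -ENAMETOOLONG;

    f = fopen(path, "w");
    if (!f)
        return -errno;
    if (fprintf(f, "%s\n", value) < 0) {
        fclose(f);
        return -EIO;
    }
    /* stdio buffers the write, so a rejected value surfaces at fclose(). */
    if (fclose(f) != 0)
        return -errno;
    return 0;
}

/* mkdir <root>/<name>, then write cpus, mems and finally tasks. */
int setup_container(const char *root, const char *name,
                    const char *cpus, const char *mems, long pid)
{
    char dir[4096], pidbuf[32];
    int n, ret;

    n = snprintf(dir, sizeof(dir), "%s/%s", root, name);
    if (n < 0 || (size_t)n >= sizeof(dir))
        return -ENAMETOOLONG;
    if (mkdir(dir, 0755) != 0 && errno != EEXIST)
        return -errno;

    /* Per the report above, cpus and mems come before attaching a task. */
    ret = write_control(dir, "cpus", cpus);
    if (ret)
        return ret;
    ret = write_control(dir, "mems", mems);
    if (ret)
        return ret;

    snprintf(pidbuf, sizeof(pidbuf), "%ld", pid);
    return write_control(dir, "tasks", pidbuf);
}
```

A call such as setup_container("/dev/container", "Charlie", "2-3", "1", (long)getpid()) would mirror the shell sequence; pointing root at an ordinary scratch directory is enough to check the ordering of the writes.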

