OpenVZ Forum


Home » Mailing lists » Devel » [RFC][PATCH] rename 'struct pid'
[RFC][PATCH] rename 'struct pid' [message #18150] Tue, 10 April 2007 23:20 Go to next message
Dave Hansen is currently offline  Dave Hansen
Messages: 240
Registered: October 2005
Senior Member
I've been hacking quite a bit on the pidspace code.  I've run
into a bug or two, and had a heck of a time debugging it.
Other than my inferior puny monkey brain, I'm directing some
of the blame squarely in the direction of the 'struct pid'.

We have pid_t, pid_ns, struct pid and pid_link, at _least_.
Seeing code like get_pid(pid->pid_ns->pid_type[PIDTYPE_PID])
is mind-numbing to say the least.  It makes it really hard
to comprehend, and even harder to debug.

We honestly have a lot of code like this:

	pid = pid_nr(filp->f_owner.pid);

WTF?  It's getting a pid from a pid?  Huh? :)

It makes sense when you go look at the structures, but
sitting in the middle of a function and when you can't see
all of the struct declarations can be really sketchy.

So, I propose that we rename the one structure that seems to
be the focus of the problem: 'struct pid'.  Fundamentally, it
is a 'process identifier': it helps the kernel to identifty
processes.  However, as I noted, 'pid' is a wee bit overloaded.

In addition to "identifying" processes, this structure acts
as an indirection or handle to them.  So, I propse we call
these things 'struct task_ref'.  Just reading some of the
code that I've modified in here makes me feel like this is
the right way.

Compare the two sentences below:

Oh, I have a task_ref?  What kind is it?  Oh, it's a pgid
reference because I have REFTYPE_PGID.

Oh, I have a pid?  What kind is it?  Oh, it's a pid because
I have PIDTYPE_PID.

Which makes more sense?

So, this still needs some work converting some of the
function names, but it compiles as-is.  Any ideas for better
names?

Signed-off-by: Dave Hansen <hansendc@us.ibm.com>
---

 lxc-dave/drivers/char/keyboard.c        |   10 +-
 lxc-dave/drivers/char/n_r3964.c         |   32 +++---
 lxc-dave/drivers/char/tty_io.c          |   66 ++++++-------
 lxc-dave/drivers/char/vt.c              |    2 
 lxc-dave/drivers/char/vt_ioctl.c        |   16 +--
 lxc-dave/drivers/net/tun.c              |    2 
 lxc-dave/drivers/s390/char/fs3270.c     |    4 
 lxc-dave/drivers/usb/core/devio.c       |   12 +-
 lxc-dave/drivers/usb/core/inode.c       |    2 
 lxc-dave/drivers/usb/core/usb.h         |    2 
 lxc-dave/files                          |   42 --------
 lxc-dave/fs/autofs/autofs_i.h           |    2 
 lxc-dave/fs/autofs/inode.c              |    4 
 lxc-dave/fs/autofs/root.c               |    2 
 lxc-dave/fs/compat.c                    |    4 
 lxc-dave/fs/dnotify.c                   |    2 
 lxc-dave/fs/exec.c                      |    8 -
 lxc-dave/fs/fcntl.c                     |   63 ++++++-------
 lxc-dave/fs/file_table.c                |    2 
 lxc-dave/fs/ioprio.c                    |   17 +--
 lxc-dave/fs/locks.c                     |    2 
 lxc-dave/fs/ncpfs/inode.c               |   20 ++--
 lxc-dave/fs/proc/array.c                |    2 
 lxc-dave/fs/proc/base.c                 |   52 +++++-----
 lxc-dave/fs/proc/inode.c                |    4 
 lxc-dave/fs/proc/internal.h             |    6 -
 lxc-dave/fs/proc/root.c                 |    4 
 lxc-dave/fs/proc/task_mmu.c             |    4 
 lxc-dave/fs/proc/task_nommu.c           |    5 -
 lxc-dave/fs/smbfs/inode.c               |    8 -
 lxc-dave/fs/smbfs/proc.c                |    4 
 lxc-dave/fs/smbfs/smbiod.c              |   10 +-
 lxc-dave/include/linux/console_struct.h |    2 
 lxc-dave/include/linux/fs.h             |    8 +
 lxc-dave/include/linux/init_task.h      |   22 ++--
 lxc-dave/include/linux/kernel.h         |    4 
 lxc-dave/include/linux/mmzone.h         |    2 
 lxc-dave/include/linux/n_r3964.h        |    2 
 lxc-dave/include/linux/ncp_mount.h      |    2 
 lxc-dave/include/linux/pid.h            |  104 ++++++++++++---------
 lxc-dave/include/linux/proc_fs.h        |    4 
 lxc-dave/include/linux/sched.h          |   42 ++++----
 lxc-dave/include/linux/smb_fs_sb.h      |    2 
 lxc-dave/include/linux/tty.h            |    4 
 lxc-dave/include/linux/vt_kern.h        |    2 
 lxc-dave/init/main.c                    |    2 
 lxc-dave/ipc/mqueue.c                   |    8 -
 lxc-dave/kernel/capability.c            |    8 -
 lxc-dave/kernel/cpuset.c                |   10 +-
 lxc-dave/kernel/exit.c                  |   38 ++++----
 lxc-dave/kernel/fork.c                  |   23 ++--
 lxc-dave/kernel/futex.c                 |    2 
 lxc-dave/kernel/pid.c                   |  152 ++++++++++++++++----------------
 lxc-dave/kernel/rtmutex-debug.c         |    1 
 lxc-dave/kernel/signal.c                |   26 ++---
 lxc-dave/kernel/sys.c                   |   28 ++---
 lxc-dave/kernel/sysctl.c                |   10 +-
 lxc-dave/mm/mempolicy.c                 |    3 
 lxc-dave/mm/migrate.c                   |    2 
 59 files changed, 453 insertions(+), 475 deletions(-)

diff -puN include/linux/pid.h~rename-struct-pid include/linux/pid.h
--- lxc/include/linux/pid.h~rename-struct-pid	2007-04-10 16:18:30.000000000 -0700
+++ lxc-dave/include/linux/pid.h	2007-04-10 16:18:30.000000000 -0700
@@ -3,23 +3,24 @@
 
 #include <linux/rcupdate.h>
 
-enum pid_type
+enum task_ref_type
 {
-	PIDTYPE_PID,
-	PIDTYPE_PGID,
-	PIDTYPE_SID,
-	PIDTYPE_MAX
+	REFTYPE_PID,
+	REFTYPE_PGID,
+	REFTYPE_SID,
+	REFTYPE_MAX
 };
 
 /*
- * What is struct pid?
+ * What is struct task_ref?
  *
- * A struct pid is the kernel's internal notion of a process identifier.
- * It refers to individual tasks, process groups, and sessions.  While
- * there are processes attached to it the struct pid lives in a hash
- * table, so it and then the processes that it refers to can be found
- * quickly from the numeric pid value.  The attached processes may be
- * quickly accessed by following pointers from struct pid.
+ * A 'struct task_ref' is the kernel's internal notion of a process
+ * identifier.  It refers to individual tasks, process groups, and
+ * sessions.  While there are processes attached to it the
+ * 'struct task_ref' lives in a hash table, so it and then the
+ * processes that it refers to can be found quickly from the numeric
+ * pid value.  The attached processes may be quickly accessed by
+ * following pointers from struct task_ref.
  *
  * Storing pid_t values in the kernel and refering to them later has a
  * problem.  The process originally with that pid may have exited and the
@@ -31,89 +32,98 @@ enum pid_type
  * the now useless task_struct is still kept.  A task_struct plus a
  * stack consumes around 10K of low kernel memory.  More precisely
  * this is THREAD_SIZE + sizeof(struct task_struct).  By comparison
- * a struct pid is about 64 bytes.
+ * a struct task_ref is about 64 bytes.
  *
- * Holding a reference to struct pid solves both of these problems.
+ * Holding a reference to struct task_ref solves both of these problems.
  * It is small so holding a reference does not consume a lot of
- * resources, and since a new struct pid is allocated when the numeric pid
+ * resources, and since a new struct task_ref is allocated when the numeric pid
  * value is reused (when pids wrap around) we don't mistakenly refer to new
  * processes.
  */
 
-struct pid
+struct task_ref
 {
 	atomic_t count;
 	/* Try to keep pid_chain in the same cacheline as nr for find_pid */
-	int nr;
+	int pid;
 	struct hlist_node pid_chain;
-	/* lists of tasks that use this pid */
-	struct hlist_head tasks[PIDTYPE_MAX];
+	/*
+	 * lists of tasks that use this pid.
+	 * For instance, ->tasks[REFTYPE_SID]
+	 * has all tasks with a session id of
+	 * the number in ->pid.
+	 */
+	struct hlist_head tasks[REFTYPE_MAX];
 	struct rcu_head rcu;
 };
 
-extern struct pid init_struct_pid;
+extern struct task_ref init_task_ref;
 
 struct pid_link
 {
 	struct hlist_node node;
-	struct pid *pid;
+	struct task_ref *tref;
 };
 
-static inline struct pid *get_pid(struct pid *pid)
+static inline struct task_ref *get_pid(struct task_ref *tref)
 {
-	if (pid)
-		atomic_inc(&pid->count);
-	return pid;
+	if (tref)
+		atomic_inc(&tref->count);
+	return tref;
 }
 
-extern void FASTCALL(put_pid(struct pid *pid));
-extern struct task_struct *FASTCALL(pid_task(struct pid *pid, enum pid_type));
-extern struct task_struct *FASTCALL(get_pid_task(struct pid *pid,
-						enum pid_type));
+extern void FASTCALL(put_task_ref(struct task_ref *tref));
+extern struct task_struct *FASTCALL(pid_task(struct task_ref *tref,
+						enum task_ref_type));
+extern struct task_struct *FASTCALL(get_pid_task(struct task_ref *tref,
+						enum task_ref_type));
 
-extern struct pid *get_task_pid(struct task_struct *task, enum pid_type type);
+extern struct task_ref *get_task_ref(struct task_struct *task,
+				     enum task_ref_type type);
 
 /*
  * attach_pid() and detach_pid() must be called with the tasklist_lock
  * write-held.
  */
-extern int FASTCALL(attach_pid(struct task_struct *task,
-				enum pid_type type, struct pid *pid));
-extern void FASTCALL(detach_pid(struct task_struct *task, enum pid_type));
-extern void FASTCALL(transfer_pid(struct task_struct *old,
-				  struct task_struct *new, enum pid_type));
+extern int FASTCALL(attach_task_ref(struct task_struct *task,
+					enum task_ref_type type,
+					struct task_ref *tref));
+extern void FASTCALL(detach_task_ref(struct task_struct *task, enum task_ref_type));
+extern void FASTCALL(transfer_task_ref(struct task_struct *old,
+					struct task_struct *new,
+					enum task_ref_type));
 
 /*
  * look up a PID in the hash table. Must be called with the tasklist_lock
  * or rcu_read_lock() held.
  */
-extern struct pid *FASTCALL(find_pid(int nr));
+extern struct task_ref *FASTCALL(find_task(int nr));
 
 /*
  * Lookup a PID in the hash table, and return with it's count elevated.
  */
-extern struct pid *find_get_pid(int nr);
-extern struct pid *find_ge_pid(int nr);
+extern struct task_ref *find_get_pid(int nr);
+extern struct task_ref *find_ge_pid(int nr);
 
-extern struct pid *alloc_pid(void);
-extern void FASTCALL(free_pid(struct pid *pid));
+extern struct task_ref *alloc_task_ref(v
...

Re: [RFC][PATCH] rename 'struct pid' [message #18152 is a reply to message #18150] Wed, 11 April 2007 02:17 Go to previous messageGo to next message
Dave Hansen is currently offline  Dave Hansen
Messages: 240
Registered: October 2005
Senior Member
On Tue, 2007-04-10 at 19:28 -0600, Eric W. Biederman wrote:
> Dave Hansen <hansendc@us.ibm.com> writes:
> 
> > I've been hacking quite a bit on the pidspace code.  I've run
> > into a bug or two, and had a heck of a time debugging it.
> > Other than my inferior puny monkey brain, I'm directing some
> > of the blame squarely in the direction of the 'struct pid'.
> >
> > We have pid_t, pid_ns, struct pid and pid_link, at _least_.
> > Seeing code like get_pid(pid->pid_ns->pid_type[PIDTYPE_PID])
> > is mind-numbing to say the least.  
> 
> get_pid(pid->pid_ns->pid_type[PIDTYPE_PID]) is complete and utter
> nonsense.

Guilty as charged.  It was utter nonsense.  

> > It makes it really hard to comprehend, and even harder to debug.
> 
> Given that you quoted nonsense I can understand the comprehension
> problem.

Here are five actual examples that I see in my tree right now.  Each has
the string "pid" occur in it at least 4 times:

	 struct pid_nr *pid_nr = alloc_pid_nr(ppid_nr->pid_ns, new_pid);

         hlist_add_head_rcu(&pid->pid_chain, &pid_hash[pid_hashfn(pid->nr)]);

         pid_nr = alloc_pid_nr(ppid_nr->pid_ns, new_pid);

         struct pid_nr *pid_nr = alloc_pid_nr(ppid_nr->pid_ns, new_pid);

         hlist_add_head_rcu(&pid->pid_chain, &pid_hash[pid_hashfn(pid->nr)]);

What I'm telling you is that I really think that this naming is making
it hard to enhance and debug this code.

> > We honestly have a lot of code like this:
> >
> > 	pid = pid_nr(filp->f_owner.pid);
> >
> > WTF?  It's getting a pid from a pid?  Huh? :)
> 
> Clearer would be:
> 
> user_pid = pid_to_user(filp->f_owner.pid);

Yes.  Having a user vs. kernel process id distinction would be nice.
The fact remains, though, that we call the numbers pids in userspace.
We deal with processes very differently in the kernel, and we can and
should make that distinction in the naming of our structures.

> > It makes sense when you go look at the structures, but
> > sitting in the middle of a function and when you can't see
> > all of the struct declarations can be really sketchy.
> >
> > So, I propose that we rename the one structure that seems to
> > be the focus of the problem: 'struct pid'. 
> 
> NAK.

The pidspace stuff is taking a lot longer than it should to develop.
I'm not quite sure why, but I have the feeling that this readability
impediment is part of it.  

> > Fundamentally, it
> > is a 'process identifier': it helps the kernel to identifty
> > processes.  However, as I noted, 'pid' is a wee bit overloaded.
> >
> > In addition to "identifying" processes, this structure acts
> > as an indirection or handle to them.  So, I propse we call
> > these things 'struct task_ref'.
> 
> Renaming the structure above doesn't help and the structure represents
> a pid in a more fundamental way then pid_t can.

I completely agree.  Thus, I think we should get away from the 'pid'
nomenclature and move to something a bit more descriptive.  The 'struct
pid' is much more useful than a pid_t, and we should try to
differentiate it in some way.  Calling it a 'pid' really doesn't do it
justice.

> A pid (pid_t or
> struct pid) isn't just an identfier it is a handle to processes.
> struct pid just does so more directly because it is inside the kernel.

Let's face it, "pid" has a meaning.  It's a number.  It's what you
kill(1).  The meaning has been there for a long, long time.  'struct
pid' is a completely different concept, and it's certainly more than
"just a number".

So, please consider that there are actual kernel developers hacking on
this stuff who are having problem with it.  The function of 'struct pid'
is great, it's a wonderful concept.  It just needs a slightly different
name in order to more easily relate that concept to those that are
trying to use it.

If anyone can think of some more incremental ways to do this, or has
other suggestions on how to make it more clear, I'm all ears.

-- Dave

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18153 is a reply to message #18150] Wed, 11 April 2007 01:28 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Dave Hansen <hansendc@us.ibm.com> writes:

> I've been hacking quite a bit on the pidspace code.  I've run
> into a bug or two, and had a heck of a time debugging it.
> Other than my inferior puny monkey brain, I'm directing some
> of the blame squarely in the direction of the 'struct pid'.
>
> We have pid_t, pid_ns, struct pid and pid_link, at _least_.
> Seeing code like get_pid(pid->pid_ns->pid_type[PIDTYPE_PID])
> is mind-numbing to say the least.  

get_pid(pid->pid_ns->pid_type[PIDTYPE_PID]) is complete and utter
nonsense.

> It makes it really hard to comprehend, and even harder to debug.

Given that you quoted nonsense I can understand the comprehension
problem.

> We honestly have a lot of code like this:
>
> 	pid = pid_nr(filp->f_owner.pid);
>
> WTF?  It's getting a pid from a pid?  Huh? :)

Clearer would be:

user_pid = pid_to_user(filp->f_owner.pid);


> It makes sense when you go look at the structures, but
> sitting in the middle of a function and when you can't see
> all of the struct declarations can be really sketchy.
>
> So, I propose that we rename the one structure that seems to
> be the focus of the problem: 'struct pid'. 

NAK.

> Fundamentally, it
> is a 'process identifier': it helps the kernel to identifty
> processes.  However, as I noted, 'pid' is a wee bit overloaded.
>
> In addition to "identifying" processes, this structure acts
> as an indirection or handle to them.  So, I propse we call
> these things 'struct task_ref'.

Renaming the structure above doesn't help and the structure represents
a pid in a more fundamental way then pid_t can.    A pid (pid_t or
struct pid) isn't just an identifier it is a handle to processes.
struct pid just does so more directly because it is inside the kernel.


> Just reading some of the
> code that I've modified in here makes me feel like this is
> the right way.

I get exactly the opposite impression.

> Compare the two sentences below:
>
> Oh, I have a task_ref?  What kind is it?  Oh, it's a pgid
> reference because I have REFTYPE_PGID.
>
> Oh, I have a pid?  What kind is it?  Oh, it's a pid because
> I have PIDTYPE_PID.
>
> Which makes more sense?

Neither the questions are nonsense.  The only reasonable question
is which kind of process am I using the pid to look for.

> So, this still needs some work converting some of the
> function names, but it compiles as-is.  Any ideas for better
> names?

struct pid is properly named.  It isn't even as fuzzy as struct
signal_struct.

All I can suggest is making a clearer distinction between user and
kernel pids.  So maybe it could become struct kpid.  Even there
I'm not certain it makes sense except in variable names.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18154 is a reply to message #18150] Wed, 11 April 2007 00:57 Go to previous messageGo to next message
serue is currently offline  serue
Messages: 750
Registered: February 2006
Senior Member
Quoting Dave Hansen (hansendc@us.ibm.com):
> 
> I've been hacking quite a bit on the pidspace code.  I've run
> into a bug or two, and had a heck of a time debugging it.
> Other than my inferior puny monkey brain, I'm directing some
> of the blame squarely in the direction of the 'struct pid'.
> 
> We have pid_t, pid_ns, struct pid and pid_link, at _least_.
> Seeing code like get_pid(pid->pid_ns->pid_type[PIDTYPE_PID])
> is mind-numbing to say the least.  It makes it really hard
> to comprehend, and even harder to debug.
> 
> We honestly have a lot of code like this:
> 
> 	pid = pid_nr(filp->f_owner.pid);
> 
> WTF?  It's getting a pid from a pid?  Huh? :)
> 
> It makes sense when you go look at the structures, but
> sitting in the middle of a function and when you can't see
> all of the struct declarations can be really sketchy.
> 
> So, I propose that we rename the one structure that seems to
> be the focus of the problem: 'struct pid'.  Fundamentally, it
> is a 'process identifier': it helps the kernel to identifty
> processes.  However, as I noted, 'pid' is a wee bit overloaded.
> 
> In addition to "identifying" processes, this structure acts
> as an indirection or handle to them.  So, I propse we call
> these things 'struct task_ref'.  Just reading some of the

I haven't tested the patch, but I enthusiastically, wholeheartedly,
excitedly agree with this name change.

-serge

(I plan to run some cross-arch tests in the morning)

> code that I've modified in here makes me feel like this is
> the right way.
> 
> Compare the two sentences below:
> 
> Oh, I have a task_ref?  What kind is it?  Oh, it's a pgid
> reference because I have REFTYPE_PGID.
> 
> Oh, I have a pid?  What kind is it?  Oh, it's a pid because
> I have PIDTYPE_PID.
> 
> Which makes more sense?
> 
> So, this still needs some work converting some of the
> function names, but it compiles as-is.  Any ideas for better
> names?
> 
> Signed-off-by: Dave Hansen <hansendc@us.ibm.com>
> ---
> 
>  lxc-dave/drivers/char/keyboard.c        |   10 +-
>  lxc-dave/drivers/char/n_r3964.c         |   32 +++---
>  lxc-dave/drivers/char/tty_io.c          |   66 ++++++-------
>  lxc-dave/drivers/char/vt.c              |    2 
>  lxc-dave/drivers/char/vt_ioctl.c        |   16 +--
>  lxc-dave/drivers/net/tun.c              |    2 
>  lxc-dave/drivers/s390/char/fs3270.c     |    4 
>  lxc-dave/drivers/usb/core/devio.c       |   12 +-
>  lxc-dave/drivers/usb/core/inode.c       |    2 
>  lxc-dave/drivers/usb/core/usb.h         |    2 
>  lxc-dave/files                          |   42 --------
>  lxc-dave/fs/autofs/autofs_i.h           |    2 
>  lxc-dave/fs/autofs/inode.c              |    4 
>  lxc-dave/fs/autofs/root.c               |    2 
>  lxc-dave/fs/compat.c                    |    4 
>  lxc-dave/fs/dnotify.c                   |    2 
>  lxc-dave/fs/exec.c                      |    8 -
>  lxc-dave/fs/fcntl.c                     |   63 ++++++-------
>  lxc-dave/fs/file_table.c                |    2 
>  lxc-dave/fs/ioprio.c                    |   17 +--
>  lxc-dave/fs/locks.c                     |    2 
>  lxc-dave/fs/ncpfs/inode.c               |   20 ++--
>  lxc-dave/fs/proc/array.c                |    2 
>  lxc-dave/fs/proc/base.c                 |   52 +++++-----
>  lxc-dave/fs/proc/inode.c                |    4 
>  lxc-dave/fs/proc/internal.h             |    6 -
>  lxc-dave/fs/proc/root.c                 |    4 
>  lxc-dave/fs/proc/task_mmu.c             |    4 
>  lxc-dave/fs/proc/task_nommu.c           |    5 -
>  lxc-dave/fs/smbfs/inode.c               |    8 -
>  lxc-dave/fs/smbfs/proc.c                |    4 
>  lxc-dave/fs/smbfs/smbiod.c              |   10 +-
>  lxc-dave/include/linux/console_struct.h |    2 
>  lxc-dave/include/linux/fs.h             |    8 +
>  lxc-dave/include/linux/init_task.h      |   22 ++--
>  lxc-dave/include/linux/kernel.h         |    4 
>  lxc-dave/include/linux/mmzone.h         |    2 
>  lxc-dave/include/linux/n_r3964.h        |    2 
>  lxc-dave/include/linux/ncp_mount.h      |    2 
>  lxc-dave/include/linux/pid.h            |  104 ++++++++++++---------
>  lxc-dave/include/linux/proc_fs.h        |    4 
>  lxc-dave/include/linux/sched.h          |   42 ++++----
>  lxc-dave/include/linux/smb_fs_sb.h      |    2 
>  lxc-dave/include/linux/tty.h            |    4 
>  lxc-dave/include/linux/vt_kern.h        |    2 
>  lxc-dave/init/main.c                    |    2 
>  lxc-dave/ipc/mqueue.c                   |    8 -
>  lxc-dave/kernel/capability.c            |    8 -
>  lxc-dave/kernel/cpuset.c                |   10 +-
>  lxc-dave/kernel/exit.c                  |   38 ++++----
>  lxc-dave/kernel/fork.c                  |   23 ++--
>  lxc-dave/kernel/futex.c                 |    2 
>  lxc-dave/kernel/pid.c                   |  152 ++++++++++++++++----------------
>  lxc-dave/kernel/rtmutex-debug.c         |    1 
>  lxc-dave/kernel/signal.c                |   26 ++---
>  lxc-dave/kernel/sys.c                   |   28 ++---
>  lxc-dave/kernel/sysctl.c                |   10 +-
>  lxc-dave/mm/mempolicy.c                 |    3 
>  lxc-dave/mm/migrate.c                   |    2 
>  59 files changed, 453 insertions(+), 475 deletions(-)
> 
> diff -puN include/linux/pid.h~rename-struct-pid include/linux/pid.h
> --- lxc/include/linux/pid.h~rename-struct-pid	2007-04-10 16:18:30.000000000 -0700
> +++ lxc-dave/include/linux/pid.h	2007-04-10 16:18:30.000000000 -0700
> @@ -3,23 +3,24 @@
>  
>  #include <linux/rcupdate.h>
>  
> -enum pid_type
> +enum task_ref_type
>  {
> -	PIDTYPE_PID,
> -	PIDTYPE_PGID,
> -	PIDTYPE_SID,
> -	PIDTYPE_MAX
> +	REFTYPE_PID,
> +	REFTYPE_PGID,
> +	REFTYPE_SID,
> +	REFTYPE_MAX
>  };
>  
>  /*
> - * What is struct pid?
> + * What is struct task_ref?
>   *
> - * A struct pid is the kernel's internal notion of a process identifier.
> - * It refers to individual tasks, process groups, and sessions.  While
> - * there are processes attached to it the struct pid lives in a hash
> - * table, so it and then the processes that it refers to can be found
> - * quickly from the numeric pid value.  The attached processes may be
> - * quickly accessed by following pointers from struct pid.
> + * A 'struct task_ref' is the kernel's internal notion of a process
> + * identifier.  It refers to individual tasks, process groups, and
> + * sessions.  While there are processes attached to it the
> + * 'struct task_ref' lives in a hash table, so it and then the
> + * processes that it refers to can be found quickly from the numeric
> + * pid value.  The attached processes may be quickly accessed by
> + * following pointers from struct task_ref.
>   *
>   * Storing pid_t values in the kernel and refering to them later has a
>   * problem.  The process originally with that pid may have exited and the
> @@ -31,89 +32,98 @@ enum pid_type
>   * the now useless task_struct is still kept.  A task_struct plus a
>   * stack consumes around 10K of low kernel memory.  More precisely
>   * this is THREAD_SIZE + sizeof(struct task_struct).  By comparison
> - * a struct pid is about 64 bytes.
> + * a struct task_ref is about 64 bytes.
>   *
> - * Holding a reference to struct pid solves both of these problems.
> + * Holding a reference to struct task_ref solves both of these problems.
>   * It is small so holding a reference does not consume a lot of
> - * resources, and since a new struct pid is allocated when the numeric pid
> + * resources, and since a new struct task_ref is allocated when the numeric pid
>   * value is reused (when pids wrap around) we don't mistakenly refer to new
>   * processes.
>   */
>  
> -struct pid
> +struct task_ref
>  {
>  	atomic_t count;
>  	/* Try to keep pid_chain in the same cacheline as nr for find_pid */
> -	int nr;
> +	int pid;
>  	struct hlist_node pid_chain;
> -	/* lists of tasks that use this pid */
> -	struct hlist_head tasks[PIDTYPE_MAX];
> +	/*
> +	 * lists of tasks that use this pid.
> +	 * For instance, ->tasks[REFTYPE_SID]
> +	 * has all tasks with a session id of
> +	 * the number in ->pid.
> +	 */
> +	struct hlist_head tasks[REFTYPE_MAX];
>  	struct rcu_head rcu;
>  };
>  
> -extern struct pid init_struct_pid;
> +extern struct task_ref init_task_ref;
>  
>  struct pid_link
>  {
>  	struct hlist_node node;
> -	struct pid *pid;
> +	struct task_ref *tref;
>  };
>  
> -static inline struct pid *get_pid(struct pid *pid)
> +static inline struct task_ref *get_pid(struct task_ref *tref)
>  {
> -	if (pid)
> -		atomic_inc(&pid->count);
> -	return pid;
> +	if (tref)
> +		atomic_inc(&tref->count);
> +	return tref;
>  }
>  
> -extern void FASTCALL(put_pid(struct pid *pid));
> -extern struct task_struct *FASTCALL(pid_task(struct pid *pid, enum pid_type));
> -extern struct task_struct *FASTCALL(get_pid_task(struct pid *pid,
> -						enum pid_type));
> +extern void FASTCALL(put_task_ref(struct task_ref *tref));
> +extern struct task_struct *FASTCALL(pid_task(struct task_ref *tref,
> +						enum task_ref_type));
> +extern struct task_struct *FASTCALL(get_pid_task(struct task_ref *tref,
> +						enum task_ref_type));
>  
> -extern struct pid *get_task_pid(struct task_struct *task, enum pid_type type);
> +extern struct task_ref *get_task_ref(struct task_struct *task,
> +				     enum task_ref_type type);
>  
>  
...

Re: [RFC][PATCH] rename 'struct pid' [message #18155 is a reply to message #18152] Wed, 11 April 2007 04:52 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Dave Hansen <hansendc@us.ibm.com> writes:

>> A pid (pid_t or
>> struct pid) isn't just an identfier it is a handle to processes.
>> struct pid just does so more directly because it is inside the kernel.
>
> Let's face it, "pid" has a meaning.  It's a number.  It's what you
> kill(1).  The meaning has been there for a long, long time.  'struct
> pid' is a completely different concept, and it's certainly more than
> "just a number".

Yes.  "pid" has a meaning.  The meaning is old and well established.
That meaning is not just a number, just like a file descriptor is not
just a number.

> So, please consider that there are actual kernel developers hacking on
> this stuff who are having problem with it.  The function of 'struct pid'
> is great, it's a wonderful concept.  It just needs a slightly different
> name in order to more easily relate that concept to those that are
> trying to use it.
>
> If anyone can think of some more incremental ways to do this, or has
> other suggestions on how to make it more clear, I'm all ears.

So what I have seen are examples down in the guts of the pid hash
table that are problematic.  And a few complaints about pid_nr.

However the conversions I have done and I have looked at have just
been a drop in replacement for pid_t except for reference counting
issues.  That to me at least is rather convincing.

So I'm not convinced there is a fundamental problem.  Just a bit of a
problem in the guts of things where everything seems to have the
same name.  I'm not at all certain how a different prefix would
sort that out.

My feeling is that changing this just caters to people who are not
going to be able to understand what is going on no matter what the
structure is named, and is going to make it harder for me to update
the code when I find the time to do it.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18157 is a reply to message #18155] Wed, 11 April 2007 06:31 Go to previous messageGo to next message
Dave Hansen is currently offline  Dave Hansen
Messages: 240
Registered: October 2005
Senior Member
On Tue, 2007-04-10 at 22:52 -0600, Eric W. Biederman wrote:
> Dave Hansen <hansendc@us.ibm.com> writes:
> >> A pid (pid_t or
> >> struct pid) isn't just an identfier it is a handle to processes.
> >> struct pid just does so more directly because it is inside the kernel.
> >
> > Let's face it, "pid" has a meaning.  It's a number.  It's what you
> > kill(1).  The meaning has been there for a long, long time.  'struct
> > pid' is a completely different concept, and it's certainly more than
> > "just a number".
> 
> Yes.  "pid" has a meaning.  The meaning is old and well established.
> That meaning is not just a number, just like a file descriptor is not
> just a number.

That's a great example.  Userspace fds are to 'struct file' as pids are
to 'struct pid', right?

I actually think 'struct file' is a pretty good name.  Think of what
do_sys_open() might look like if we called 'struct file' 'struct fd'
instead and 'fdp' instead of 'filp'.

We end up with lines like:

	fd_install(fd, fdp);

Which makes it confusing which fd we're dealing with or what the 'fd_'
in the name refers to, the 'fd' or the 'fdp'.  It makes quite a bit of
sense to have 'fd' and 'struct file' named quite distinctly.

> > So, please consider that there are actual kernel developers hacking on
> > this stuff who are having problem with it.  The function of 'struct pid'
> > is great, it's a wonderful concept.  It just needs a slightly different
> > name in order to more easily relate that concept to those that are
> > trying to use it.
> >
> > If anyone can think of some more incremental ways to do this, or has
> > other suggestions on how to make it more clear, I'm all ears.
> 
> So what I have seen are examples down in the guts of the pid hash
> table that are problematic.  And a few complaints about pid_nr.
> 
> However the conversions I have done and I have looked at have just
> been a drop in replacement for pid_t except for reference counting
> issues.  That to me at least is rather convincing.

I think this is more indicative of the great design of 'struct pid' in
concept.  It truly is a drop-in replacement for how things were used in
the past.  The concept is *great*.

> So I'm not convinced there is a fundamental problem.  Just a bit of a
> problem in the guts of things where everything seems to have the
> same name.  I'm not at all certain how a different prefix would
> sort that out.

I agree that there's no fundamental problem with the structure.  Its
function is quite ideal.  The issue for me comes in the ability for
people to update, enhance and review what is going on.  There's no
fundamental barrier here, as Suka demonstrated getting some of his
pidspace code to work.  It just crossed my pain threshold as I was
trying to debug some of it.

Once we get pidspaces fully working, the hacking in the guts will
certainly be reduced.  But, there are always bugs, and this is a common
enough code area that people are bound to touch it as time goes on.  I
just want to make that as easy as possible.

> My feeling is that changing this just caters to people who are not
> going to be able to understand what is going on no matter what the
> structure is named, and is going to make it harder for me to update
> the code when I find the time to do it.

I'm completely sure that you'll grasp the entire concept, no matter to
what we change the names.  The mess that you've unraveled so far in
there makes has given me supreme confidence in this. :)

My worry is the ramp-up time for people who want to understand it enough
hack it or just audit the code, but who won't grasp it on quite the same
level that you have.  

-- Dave

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18158 is a reply to message #18153] Wed, 11 April 2007 13:31 Go to previous messageGo to next message
serue is currently offline  serue
Messages: 750
Registered: February 2006
Senior Member
Quoting Eric W. Biederman (ebiederm@xmission.com):
> Dave Hansen <hansendc@us.ibm.com> writes:
> 
> > I've been hacking quite a bit on the pidspace code.  I've run
> > into a bug or two, and had a heck of a time debugging it.
> > Other than my inferior puny monkey brain, I'm directing some
> > of the blame squarely in the direction of the 'struct pid'.
> >
> > We have pid_t, pid_ns, struct pid and pid_link, at _least_.
> > Seeing code like get_pid(pid->pid_ns->pid_type[PIDTYPE_PID])
> > is mind-numbing to say the least.  
> 
> get_pid(pid->pid_ns->pid_type[PIDTYPE_PID]) is complete and utter
> nonsense.
> 
> > It makes it really hard to comprehend, and even harder to debug.
> 
> Given that you quoted nonsense I can understand the comprehension
> problem.
> 
> > We honestly have a lot of code like this:
> >
> > 	pid = pid_nr(filp->f_owner.pid);
> >
> > WTF?  It's getting a pid from a pid?  Huh? :)
> 
> Clearer would be:
> 
> user_pid = pid_to_user(filp->f_owner.pid);
> 
> 
> > It makes sense when you go look at the structures, but
> > sitting in the middle of a function and when you can't see
> > all of the struct declarations can be really sketchy.
> >
> > So, I propose that we rename the one structure that seems to
> > be the focus of the problem: 'struct pid'. 
> 
> NAK.
> 
> > Fundamentally, it
> > is a 'process identifier': it helps the kernel to identifty
> > processes.  However, as I noted, 'pid' is a wee bit overloaded.
> >
> > In addition to "identifying" processes, this structure acts
> > as an indirection or handle to them.  So, I propse we call
> > these things 'struct task_ref'.
> 
> Renaming the structure above doesn't help and the structure represents
> a pid in a more fundamental way then pid_t can.    A pid (pid_t or
> struct pid) isn't just an identifier it is a handle to processes.

Exactly, so why call it a 'pid == process id' if it's not just an id?

If all we wanted was a kernel level process id, we could use the address
of the task struct.  'struct pid' is there to provide lightweight,
lasting references.  So if we didn't want it to serve as a task_ref, we
wouldn't need it at all.

Look it's not that we're saying struct pid was such a bad name, it's
that we have struct pid, struct pid_nr, struct pid_ns, pid_t, etc, and
some of those *need* to be renamed.  The pid_ns is in fact a pid
namespace, the pid_nr is in fact a pid number for a task in some pid
namespace, and struct pid happens to also serve as task_ref.  Renaming
just that one should suddenly make pid_nr and pid_ns much less
confusing imo.

-serge

> struct pid just does so more directly because it is inside the kernel.
> 
> 
> > Just reading some of the
> > code that I've modified in here makes me feel like this is
> > the right way.
> 
> I get exactly the opposite impression.
> 
> > Compare the two sentences below:
> >
> > Oh, I have a task_ref?  What kind is it?  Oh, it's a pgid
> > reference because I have REFTYPE_PGID.
> >
> > Oh, I have a pid?  What kind is it?  Oh, it's a pid because
> > I have PIDTYPE_PID.
> >
> > Which makes more sense?
> 
> Neither the questions are nonsense.  The only reasonable question
> is which kind of process am I using the pid to look for.
> 
> > So, this still needs some work converting some of the
> > function names, but it compiles as-is.  Any ideas for better
> > names?
> 
> struct pid is properly named.  It isn't even as fuzzy as struct
> signal_struct.
> 
> All I can suggest is making a clearer distinction between user and
> kernel pids.  So maybe it could become struct kpid.  Even there
> I'm not certain it makes sense except in variable names.
> 
> Eric
> _______________________________________________
> Containers mailing list
> Containers@lists.linux-foundation.org
> https://lists.linux-foundation.org/mailman/listinfo/containers
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18159 is a reply to message #18150] Wed, 11 April 2007 15:58 Go to previous messageGo to next message
Dave Hansen is currently offline  Dave Hansen
Messages: 240
Registered: October 2005
Senior Member
On Wed, 2007-04-11 at 11:59 +0400, Kirill Korotaev wrote:
> Dave, taskref sounds a bit too much generic for me...

I completely agree.  It's a pretty generic name.  In the kernel, though
it does provide lookups to tasks.  I think the in-kernel task vs.
process naming means that it is more consistent if we use something with
"task" in it.  It may be called a "process identifier" in userspace but,
in the kernel, it appears to deal squarely with tasks.  

> But I can't provide some better name :/
> 
> pid             - number
> pref (or tref)  - process (task) ref, e.g. pid(filp->f_owner.pref)
> pref_struct     - former pid_struct, e.g. struct pref_struct pref; 

Not bad.  But, it would be nice to get away from pid-like names.  Part
of the problem with things like 'struct pid_struct' is that the
structure name is nice, but people will still do:

	struct pid_struct pid;

And we're back to square one. :(

-- Dave

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18163 is a reply to message #18150] Wed, 11 April 2007 16:53 Go to previous messageGo to next message
Dave Hansen is currently offline  Dave Hansen
Messages: 240
Registered: October 2005
Senior Member
On Wed, 2007-04-11 at 10:13 -0600, Eric W. Biederman wrote:
> Because the suggestions I have seen are based on a partial understanding
> of struct pid, and the suggested renames will reinforce misunderstandings
> of what struct pid is.

Let's take a step back, then and see if we can nicely define what it
does.  Maybe we can come up with a bit better comment for 'struct
pid'.  

---

'struct pid' is a way for the kernel to get (and keep) a reference to a
task (or set of tasks) with a particular numeric value.  If there are
any tasks with this numeric value from now on, the structure will stay
pointed to those tasks.

In general, the underlying task will not change, but it can in cases
like when a non-thread-leader does an exec() and takes over as the
leader (the pid will change).

This reference will not keep the tasks from exiting.  But, even if all
of these tasks exit, the kernel will honor the fact that your reference
exists and will not re-create any tasks with the same numeric value, as
long as you allow your reference to persist.

-- Dave

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18164 is a reply to message #18158] Wed, 11 April 2007 16:03 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
"Serge E. Hallyn" <serue@us.ibm.com> writes:

> Exactly, so why call it a 'pid == process id' if it's not just an id?
>
> If all we wanted was a kernel level process id, we could use the address
> of the task struct.

No we can not. The address of the task struct can change for PIDTYPE_PID
when we call exec, if you don't call exec from your primary thread.

Further we could not compare the pids of a session and a process group
and a task if we were using addresses of struct task.

> 'struct pid' is there to provide lightweight,
> lasting references.  So if we didn't want it to serve as a task_ref, we
> wouldn't need it at all.

Not at all true.  Like any good solution it simultaneously has several
purposes, and in fact what is the current struct pid has been in the
kernel a very long time as the has table entry you find when doing
a pid lookup.

> Look it's not that we're saying struct pid was such a bad name, it's
> that we have struct pid, struct pid_nr, struct pid_ns, pid_t, etc, and
> some of those *need* to be renamed.  The pid_ns is in fact a pid
> namespace, the pid_nr is in fact a pid number for a task in some pid
> namespace, and struct pid happens to also serve as task_ref.  Renaming
> just that one should suddenly make pid_nr and pid_ns much less
> confusing imo.

I do agree with seem to be swarming in a see of pids.  Largely that
is a problem in kernel/pid.c where you are expected to know what you
are doing.

We can easily rename something besides struct pid and get the same
effect, especially in the core where everything is static.

I think the later file descriptor analogy is very apt, and as
such I would be happy to call it struct process_identifier.

Other then that likes rename things like struct pid_nr.  The one
should be local to kernel/pid.c so we could call it something.

Also for fixing the problem of swimming in a see of pids it is
the functions, variable and field names that we have to fix not
the nearly so much the structure name.

I suspect if we start changing the other names we can fix the problem
and possibly come up with a better name that is equally description.
Starting by renaming struct pid is the wrong place.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18165 is a reply to message #18150] Wed, 11 April 2007 16:13 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Kirill Korotaev <dev@sw.ru> writes:

> Totally agree with Dave.
> Current code looks like a mess of word 'pid'.

Yes the current code appears to be a mess of the word pid.

> Eric, why do you object so much? it doesn't change any functionality at all
> just makes code a bit more readable/clear.

Because the suggestions I have seen are based on a partial understanding
of struct pid, and the suggested renames will reinforce misunderstandings
of what struct pid is.

I think having a core data structure named in a way that suggests it is
less then it is and reinforces peoples misconceptions is likely to be
more of a problem for maintenance than having your eyes glaze over
with seeing pid pid pid pid pid pid pid.

Further if we can't find a better name a rename is simply useless noise.

Please let's address the of peoples eyes glazing over but not by renaming
struct pid.  But by renaming the other sructures, variables of field names
that surround it.  Especially in kernel/pid.c where anything that is local
doesn't need a global prefix.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18166 is a reply to message #18150] Wed, 11 April 2007 18:11 Go to previous messageGo to next message
Dave Hansen is currently offline  Dave Hansen
Messages: 240
Registered: October 2005
Senior Member
On Wed, 2007-04-11 at 11:34 -0600, Eric W. Biederman wrote:
> A struct pid is very similar to a list symbol.  Two of them can be compared for
> equality just by comparing pointers.  You can get different information about
> a struct pid by examining different ''slots''.  A struct pid has a
> name, but unlike lisp the name is numeric instead of a string.

I'm not sure what you mean by "list symbol".  Is that a lisp concept?

> Furthermore there is a one to one mapping from all valid user space
> pid numbers (in a single namespace) to a kernel struct pid.  Whether
> you compare the pointer is kernel space or the numeric user space
> value you will get the same answer.
> 
> The kernel struct pid can however live on after it's numeric name is
> no longer visible to user space.
> 
> The kernel struct pid is an identifier even if it isn't numeric.
> 
> The point of struct pid is so you can operate on pids without having
> to worry about the pesky user space numeric names.

Userspace refers to processes, sessions, and process groups by numbers
which are finite and can wrap over time.  Instead of using these numbers
inside the kernel, we use a 'struct pid'.  Unlike a plain number, a
'struct pid' uniquely refers to a particular process, session, or
process group, and does not suffer from any wrapping effects.

In general, the underlying task to which a 'struct pid' refers will not
change.  It is possible such as when a non-thread-leader does an exec()
and takes over as the leader, tat the tasks to which a 'struct pid'
refers will change.

This effectively lets the kernel do a pid_t (process, session, or
process group) to task lookup at a particular time and keep the results
of that lookup meaningful for a long, long time, not worrying about if
userspace has re-used those values.

---

I think at least part of the problem is that a 'struct pid' creates a
relationship to pid, as well as session ids and process groups.  This is
a bit muddled relationship that comes out of userspace, but it would be
nice to unmuddle it somehow in the kernel.  I think part of that might
be to give 'struct pid' a more neutral name that also encompasses the
pgrp and sid parts.

'pref' seems a decent compromise.  It keeps the "process" notion around
(and thus connected to the userspace concepts) while helping to reduce
the use of 'pid' in the name.  

These can be a bit confusing:

	struct pid *pid;
	struct pid *pgrp;
	struct pid *sid;

this is a bit better:

	struct pref *pid;
	struct pref *pgrp;
	struct pref *sid;

-- Dave

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18168 is a reply to message #18163] Wed, 11 April 2007 17:34 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Dave Hansen <hansendc@us.ibm.com> writes:

> On Wed, 2007-04-11 at 10:13 -0600, Eric W. Biederman wrote:
>> Because the suggestions I have seen are based on a partial understanding
>> of struct pid, and the suggested renames will reinforce misunderstandings
>> of what struct pid is.
>
> Let's take a step back, then and see if we can nicely define what it
> does.  Maybe we can come up with a bit better comment for 'struct
> pid'.  
>
> ---
>
> 'struct pid' is a way for the kernel to get (and keep) a reference to a
> task (or set of tasks) with a particular numeric value.  If there are
> any tasks with this numeric value from now on, the structure will stay
> pointed to those tasks.
>
> In general, the underlying task will not change, but it can in cases
> like when a non-thread-leader does an exec() and takes over as the
> leader (the pid will change).
>
> This reference will not keep the tasks from exiting.  But, even if all
> of these tasks exit, the kernel will honor the fact that your reference
> exists and will not re-create any tasks with the same numeric value, as
> long as you allow your reference to persist.

The user space pid values do and should get recycled even if you are holding
a reference to a struct pid.  Otherwise there are some easy denial of service
attacks by opening /proc/<child> directories and then having your child exit.

...

A struct pid is very similar to a list symbol.  Two of them can be compared for
equality just by comparing pointers.  You can get different information about
a struct pid by examining different ''slots''.  A struct pid has a
name, but unlike lisp the name is numeric instead of a string.

Furthermore there is a one to one mapping from all valid user space
pid numbers (in a single namespace) to a kernel struct pid.  Whether
you compare the pointer is kernel space or the numeric user space
value you will get the same answer.

The kernel struct pid can however live on after it's numeric name is
no longer visible to user space.

The kernel struct pid is an identifier even if it isn't numeric.

The point of struct pid is so you can operate on pids without having
to worry about the pesky user space numeric names.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18169 is a reply to message #18157] Wed, 11 April 2007 07:59 Go to previous messageGo to next message
dev is currently offline  dev
Messages: 1693
Registered: September 2005
Location: Moscow
Senior Member

Dave Hansen wrote:
> On Tue, 2007-04-10 at 22:52 -0600, Eric W. Biederman wrote:
> 
>>Dave Hansen <hansendc@us.ibm.com> writes:
>>
>>>>A pid (pid_t or
>>>>struct pid) isn't just an identfier it is a handle to processes.
>>>>struct pid just does so more directly because it is inside the kernel.
>>>
>>>Let's face it, "pid" has a meaning.  It's a number.  It's what you
>>>kill(1).  The meaning has been there for a long, long time.  'struct
>>>pid' is a completely different concept, and it's certainly more than
>>>"just a number".
>>
>>Yes.  "pid" has a meaning.  The meaning is old and well established.
>>That meaning is not just a number, just like a file descriptor is not
>>just a number.
> 
> 
> That's a great example.  Userspace fds are to 'struct file' as pids are
> to 'struct pid', right?
> 
> I actually think 'struct file' is a pretty good name.  Think of what
> do_sys_open() might look like if we called 'struct file' 'struct fd'
> instead and 'fdp' instead of 'filp'.
> 
> We end up with lines like:
> 
> 	fd_install(fd, fdp);
> 
> Which makes it confusing which fd we're dealing with or what the 'fd_'
> in the name refers to, the 'fd' or the 'fdp'.  It makes quite a bit of
> sense to have 'fd' and 'struct file' named quite distinctly.

Totally agree with Dave.
Current code looks like a mess of word 'pid'.

Eric, why do you object so much? it doesn't change any functionality at all
just makes code a bit more readable/clear.

Dave, taskref sounds a bit too much generic for me...
But I can't provide some better name :/

pid		- number
pref (or tref)	- process (task) ref, e.g. pid(filp->f_owner.pref)
pref_struct	- former pid_struct, e.g. struct pref_struct pref;
?

Thanks,
Kirill

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18171 is a reply to message #18166] Wed, 11 April 2007 18:46 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Dave Hansen <hansendc@us.ibm.com> writes:

> On Wed, 2007-04-11 at 11:34 -0600, Eric W. Biederman wrote:
>> A struct pid is very similar to a list symbol.  Two of them can be compared
> for
>> equality just by comparing pointers.  You can get different information about
>> a struct pid by examining different ''slots''.  A struct pid has a
>> name, but unlike lisp the name is numeric instead of a string.
>
> I'm not sure what you mean by "list symbol".  Is that a lisp concept?

Yes.  It was the most similar concept I could think of.

> Userspace refers to processes, sessions, and process groups by numbers
> which are finite and can wrap over time.  Instead of using these numbers
> inside the kernel, we use a 'struct pid'.  Unlike a plain number, a
> 'struct pid' uniquely refers to a particular process, session, or
> process group, and does not suffer from any wrapping effects.

Just like the plain number a 'struct pid' simultaneously refers to a process,
session and a process group.  Unlike a plain number it does not suffer any
wrapping effects.

> In general, the underlying task to which a 'struct pid' refers will not
> change.  It is possible such as when a non-thread-leader does an exec()
> and takes over as the leader, tat the tasks to which a 'struct pid'
> refers will change.

There is no unique underlying task to which a 'struct pid' refers.


> This effectively lets the kernel do a pid_t (process, session, or
> process group) to task lookup at a particular time and keep the results
> of that lookup meaningful for a long, long time, not worrying about if
> userspace has re-used those values.
>
> ---
>
> I think at least part of the problem is that a 'struct pid' creates a
> relationship to pid, as well as session ids and process groups.  This is
> a bit muddled relationship that comes out of userspace, but it would be
> nice to unmuddle it somehow in the kernel.  I think part of that might
> be to give 'struct pid' a more neutral name that also encompasses the
> pgrp and sid parts.

Have you ever looked at which data type the number for a session and a
process group are stored in? 


> 'pref' seems a decent compromise.  It keeps the "process" notion around
> (and thus connected to the userspace concepts) while helping to reduce
> the use of 'pid' in the name.  
>
> These can be a bit confusing:
>
> 	struct pid *pid;
> 	struct pid *pgrp;
> 	struct pid *sid;

How is it more confusing then?

	pid_t pid;
        pid_t pgrp;
        pid_t sid;

> this is a bit better:
>
> 	struct pref *pid;
> 	struct pref *pgrp;
> 	struct pref *sid;

The reason I call it a pid is because it is.  It is a pid in every way
except the bloody numeric name.

A pid is an identifier for processes and groups of processes.  Just
which processes depends on the function you feed it into.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18173 is a reply to message #18171] Wed, 11 April 2007 20:12 Go to previous messageGo to next message
Dave Hansen is currently offline  Dave Hansen
Messages: 240
Registered: October 2005
Senior Member
On Wed, 2007-04-11 at 12:46 -0600, Eric W. Biederman wrote:
> 
> > These can be a bit confusing:
> >
> >       struct pid *pid;
> >       struct pid *pgrp;
> >       struct pid *sid;
> 
> How is it more confusing then?
> 
>         pid_t pid;
>         pid_t pgrp;
>         pid_t sid; 

They confuse me the same way. :)

We can't do much about userspace.  But, we do have quite a bit of
control how we name things in the kernel, and I think there's a better
way.

Eric, we all know that you understand this stuff.  I'm just trying to
make it a bit more approachable for anyone that comes along in the
future.  Serge, Suka, Kirill and Pavel have all expressed some level of
dissatisfaction, and I think we all agree that 'struct pid' is at the
heart of the issue.  That's a pretty large chunk of the people other
than you that have tried to hack on it so far.  

I honestly don't think changing peripheral names is going to cut it,
especially as the largest amount of confusion comes in the core code
which doesn't use as many of the functions which could get new names.

What would you think about running the _concept_ by Linus and akpm and
seeing if they have any insights?

-- Dave

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18174 is a reply to message #18150] Wed, 11 April 2007 21:28 Go to previous messageGo to next message
Dave Hansen is currently offline  Dave Hansen
Messages: 240
Registered: October 2005
Senior Member
On Wed, 2007-04-11 at 14:54 -0600, Eric W. Biederman wrote:
> Dave Hansen <hansendc@us.ibm.com> writes:
> 
> > On Wed, 2007-04-11 at 12:46 -0600, Eric W. Biederman wrote:
> >> 
> >> > These can be a bit confusing:
> >> >
> >> >       struct pid *pid;
> >> >       struct pid *pgrp;
> >> >       struct pid *sid;
> >> 
> >> How is it more confusing then?
> >> 
> >>         pid_t pid;
> >>         pid_t pgrp;
> >>         pid_t sid; 
> >
> > They confuse me the same way. :)
> >
> > We can't do much about userspace.  But, we do have quite a bit of
> > control how we name things in the kernel, and I think there's a better
> > way.
> 
> Maybe.
> 
> The worst of those above is:
> pid_t pid;
> 
> Am I correct?

Definitely.

> When someone mentions a pid which side of the above statement are you
> thinking of the left hand side or the right hand side.   The type or
> the variable name.

Traditionally, I think of a pid as what I see in top.  So, I think of
the right hand side variable name.  I think of it this way because the
left hand side has little meaning in how the pid_t is going to be used.

>   If the issue is that you find the concept of pid_t confusing then it
>   is much harder to sort this out.

I find pid_t confusing.  There, I've said it. ;)

In a perfect world, kill() wouldn't be multiplexed the way it is.  We'd
have kill_myself(), kill_pgrp(pgrp), kill_pid() and the pid_t passed
into kill_pid() there would only mean 'process id', only and could
_never_ mean 'process group id'.

We could even have different data structures so that type safety would
keep get_pgrp()'s result from being easily fed to kill_pgrp().

-- Dave

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18175 is a reply to message #18173] Wed, 11 April 2007 20:54 Go to previous messageGo to next message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Dave Hansen <hansendc@us.ibm.com> writes:

> On Wed, 2007-04-11 at 12:46 -0600, Eric W. Biederman wrote:
>> 
>> > These can be a bit confusing:
>> >
>> >       struct pid *pid;
>> >       struct pid *pgrp;
>> >       struct pid *sid;
>> 
>> How is it more confusing then?
>> 
>>         pid_t pid;
>>         pid_t pgrp;
>>         pid_t sid; 
>
> They confuse me the same way. :)
>
> We can't do much about userspace.  But, we do have quite a bit of
> control how we name things in the kernel, and I think there's a better
> way.

Maybe.

The worst of those above is:
pid_t pid;

Am I correct?

When someone mentions a pid which side of the above statement are you
thinking of the left hand side or the right hand side.   The type or
the variable name.

If you think of the variable name, I contend pids are very confusing.
If you think of the type it is less so.

Did that make sense?

> Eric, we all know that you understand this stuff.  I'm just trying to
> make it a bit more approachable for anyone that comes along in the
> future.  Serge, Suka, Kirill and Pavel have all expressed some level of
> dissatisfaction, and I think we all agree that 'struct pid' is at the
> heart of the issue.  That's a pretty large chunk of the people other
> than you that have tried to hack on it so far.  

I haven't heard it universally said that 'struct pid' is at the heart
of the issue.

> I honestly don't think changing peripheral names is going to cut it,
> especially as the largest amount of confusion comes in the core code
> which doesn't use as many of the functions which could get new
> names.

Maybe.  I guess it depends on what issue you are trying to solve.

  If the issue is a flood of the word pid lots of little things can help.

  If the issue is that you find the concept of pid_t confusing then it
  is much harder to sort this out.

> What would you think about running the _concept_ by Linus and akpm and
> seeing if they have any insights?

They aren't likely to step in, if there is a reasonable chance we can
resolve this ourselves.  

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18180 is a reply to message #18159] Wed, 11 April 2007 16:25 Go to previous messageGo to next message
dev is currently offline  dev
Messages: 1693
Registered: September 2005
Location: Moscow
Senior Member

Dave Hansen wrote:
> On Wed, 2007-04-11 at 11:59 +0400, Kirill Korotaev wrote:
> 
>>Dave, taskref sounds a bit too much generic for me...
> 
> 
> I completely agree.  It's a pretty generic name.  In the kernel, though
> it does provide lookups to tasks.  I think the in-kernel task vs.
> process naming means that it is more consistent if we use something with
> "task" in it.  It may be called a "process identifier" in userspace but,
> in the kernel, it appears to deal squarely with tasks.  
> 
> 
>>But I can't provide some better name :/
>>
>>pid             - number
>>pref (or tref)  - process (task) ref, e.g. pid(filp->f_owner.pref)
>>pref_struct     - former pid_struct, e.g. struct pref_struct pref; 
> 
> 
> Not bad.  But, it would be nice to get away from pid-like names.  Part
> of the problem with things like 'struct pid_struct' is that the
> structure name is nice, but people will still do:
> 
> 	struct pid_struct pid;
> 
> And we're back to square one. :(

exactly! that's why I propose to call it pref_struct and do:
struct pref_struct pref;

i.e. to remove word "pid" from any code which is not dealing with 
numbers.
pid(pref) macro on the other hand returns numeric identifier.

Thanks,
Kirill

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [RFC][PATCH] rename 'struct pid' [message #18184 is a reply to message #18174] Thu, 12 April 2007 16:25 Go to previous message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Dave Hansen <hansendc@us.ibm.com> writes:

> On Wed, 2007-04-11 at 14:54 -0600, Eric W. Biederman wrote:

>> When someone mentions a pid which side of the above statement are you
>> thinking of the left hand side or the right hand side.   The type or
>> the variable name.
>
> Traditionally, I think of a pid as what I see in top.  So, I think of
> the right hand side variable name.  I think of it this way because the
> left hand side has little meaning in how the pid_t is going to be
> used.

I think if you looked a little more closely you would be surprised.

>>   If the issue is that you find the concept of pid_t confusing then it
>>   is much harder to sort this out.
>
> I find pid_t confusing.  There, I've said it. ;)
>
> In a perfect world, kill() wouldn't be multiplexed the way it is.  We'd
> have kill_myself(), kill_pgrp(pgrp), kill_pid() and the pid_t passed
> into kill_pid() there would only mean 'process id', only and could
> _never_ mean 'process group id'.

The multiplexing is an anomaly.  kill(2) really is 4 different
functions.

kill_process(pid_t process, int sig);
kill_process_group(pid_t group, int sig);
kill_current_process_group(int sig);
kill_all(int sig);

And it is really an abuse of the data type to overload it quite they
way it happens in kill(2).

However that no way in validates the fact that the identifiers
for session, process groups, and processes are and should be of
the same time.

To my knowledge the only kernel interface the brain damage of kill(2)
propagates in a persistent way to is: "fcntl(F_GETOWN);" which is
a tacked on interface.  I admit there are a few others where the
set of processes is interpreted the same as in kill(2) but on
fcntl actually reads back that way.

> We could even have different data structures so that type safety would
> keep get_pgrp()'s result from being easily fed to kill_pgrp().

I assume here you mean to switch the type of process grouping, process
vs process group or something like that.


What tends to be instructive in the user space interface are the
definition of setsid() and setpgrp();

setsid() really is defined as: session = pgrp = getpid();
setpgrp() is just: pgrp = getpid();

It really is fundamental to unix use of pids that you can compare
and interchange the identifiers for sessions, process groups, and
processes.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Previous Topic: IPC NS tests ?
Next Topic: [RFC | PATCH 0/9] CPU controller over process container
Goto Forum:
  


Current Time: Thu Jan 02 19:30:11 GMT 2025

Total time taken to generate the page: 0.08973 seconds