[PATCH 0/3] clone64() and unshare64() system calls [message #29261] |
Wed, 09 April 2008 22:26 |
Sukadev Bhattiprolu
This is a resend of the patch set Cedric had sent earlier. I ported
the patch set to 2.6.25-rc8-mm1 and tested on x86 and x86_64.
---
We have run out of the 32 bits in clone_flags !
This patchset introduces 2 new system calls which support 64bit clone-flags.
long sys_clone64(unsigned long flags_high, unsigned long flags_low,
unsigned long newsp);
long sys_unshare64(unsigned long flags_high, unsigned long flags_low);
The current version of clone64() does not support CLONE_PARENT_SETTID and
CLONE_CHILD_CLEARTID because we would exceed the 6 registers limit of some
arches. It's possible to get around this limitation but we might not
need it as we already have clone()
This is work in progress but already includes support for x86, x86_64,
x86_64(32), ppc64, ppc64(32), s390x, s390x(31).
ia64 already supports 64bits clone flags through the clone2() syscall.
should we harmonize the name to clone2 ?
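As an illustration of the calling convention described above, here is a minimal user-space sketch of how a caller might split a 64-bit flag word into the flags_high and flags_low arguments, and how the halves recombine. The helper names split_clone_flags() and join_clone_flags() are hypothetical and not part of the patch set; the recombination follows the "(u64) high << 32 | low" expression used in patch 3/3, with an extra mask on the low half that is an assumption of this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: split a 64-bit flag word into the two
 * halves that clone64()/unshare64() would receive as arguments. */
static void split_clone_flags(uint64_t flags,
                              unsigned long *flags_high,
                              unsigned long *flags_low)
{
    *flags_high = (unsigned long)(flags >> 32);
    *flags_low  = (unsigned long)(flags & 0xffffffffULL);
}

/* Hypothetical helper: rebuild the 64-bit word from its halves.
 * The low half is masked so that stray upper bits of a 64-bit
 * unsigned long cannot leak into the high half (an assumption of
 * this sketch; the x86_64 code in patch 3/3 does not mask). */
static uint64_t join_clone_flags(unsigned long flags_high,
                                 unsigned long flags_low)
{
    return ((uint64_t)flags_high << 32) | (flags_low & 0xffffffffUL);
}
```

A glibc-style wrapper could accept a single 64-bit argument and call the split helper just before issuing the system call, so only the syscall boundary sees two words.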
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
[PATCH 1/3] change clone_flags type to u64 [message #29262 is a reply to message #29261] |
Wed, 09 April 2008 22:32 |
Sukadev Bhattiprolu
From: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Subject: [lxc-dev] [patch -lxc 1/3] change clone_flags type to u64
This is a preliminary patch changing the clone_flags type to 64bits
for all the routines called by do_fork().
It prepares ground for the next patch which introduces an enhanced
version of clone() supporting 64bits flags.
This is work in progress. All conversions might not be done yet.
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
---
arch/alpha/kernel/process.c | 2 +-
arch/arm/kernel/process.c | 2 +-
arch/avr32/kernel/process.c | 2 +-
arch/blackfin/kernel/process.c | 2 +-
arch/cris/arch-v10/kernel/process.c | 2 +-
arch/cris/arch-v32/kernel/process.c | 2 +-
arch/frv/kernel/process.c | 2 +-
arch/h8300/kernel/process.c | 2 +-
arch/ia64/ia32/sys_ia32.c | 2 +-
arch/ia64/kernel/process.c | 2 +-
arch/m32r/kernel/process.c | 2 +-
arch/m68k/kernel/process.c | 2 +-
arch/m68knommu/kernel/process.c | 2 +-
arch/mips/kernel/process.c | 2 +-
arch/mn10300/kernel/process.c | 2 +-
arch/parisc/kernel/process.c | 2 +-
arch/powerpc/kernel/process.c | 2 +-
arch/s390/kernel/process.c | 2 +-
arch/sh/kernel/process_32.c | 2 +-
arch/sh/kernel/process_64.c | 2 +-
arch/sparc/kernel/process.c | 2 +-
arch/sparc64/kernel/process.c | 2 +-
arch/um/kernel/process.c | 2 +-
arch/v850/kernel/process.c | 2 +-
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +-
arch/xtensa/kernel/process.c | 2 +-
fs/namespace.c | 2 +-
include/linux/ipc_namespace.h | 4 ++--
include/linux/key.h | 2 +-
include/linux/mnt_namespace.h | 2 +-
include/linux/nsproxy.h | 4 ++--
include/linux/pid_namespace.h | 4 ++--
include/linux/sched.h | 6 ++++--
include/linux/security.h | 6 +++---
include/linux/sem.h | 4 ++--
include/linux/user_namespace.h | 4 ++--
include/linux/utsname.h | 4 ++--
include/net/net_namespace.h | 4 ++--
ipc/namespace.c | 2 +-
ipc/sem.c | 2 +-
kernel/fork.c | 36 ++++++++++++++++++------------------
kernel/nsproxy.c | 6 +++---
kernel/pid_namespace.c | 2 +-
kernel/user_namespace.c | 2 +-
kernel/utsname.c | 2 +-
net/core/net_namespace.c | 4 ++--
security/dummy.c | 2 +-
security/keys/process_keys.c | 2 +-
security/security.c | 2 +-
security/selinux/hooks.c | 2 +-
51 files changed, 83 insertions(+), 81 deletions(-)
Index: 2.6.25-rc2-mm1/arch/alpha/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/alpha/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/alpha/kernel/process.c
@@ -270,7 +270,7 @@ alpha_vfork(struct pt_regs *regs)
*/
int
-copy_thread(int nr, unsigned long clone_flags, unsigned long usp,
+copy_thread(int nr, u64 clone_flags, unsigned long usp,
unsigned long unused,
struct task_struct * p, struct pt_regs * regs)
{
Index: 2.6.25-rc2-mm1/arch/arm/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/arm/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/arm/kernel/process.c
@@ -331,7 +331,7 @@ void release_thread(struct task_struct *
asmlinkage void ret_from_fork(void) __asm__("ret_from_fork");
int
-copy_thread(int nr, unsigned long clone_flags, unsigned long stack_start,
+copy_thread(int nr, u64 clone_flags, unsigned long stack_start,
unsigned long stk_sz, struct task_struct *p, struct pt_regs *regs)
{
struct thread_info *thread = task_thread_info(p);
Index: 2.6.25-rc2-mm1/arch/avr32/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/avr32/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/avr32/kernel/process.c
@@ -325,7 +325,7 @@ int dump_fpu(struct pt_regs *regs, elf_f
asmlinkage void ret_from_fork(void);
-int copy_thread(int nr, unsigned long clone_flags, unsigned long usp,
+int copy_thread(int nr, u64 clone_flags, unsigned long usp,
unsigned long unused,
struct task_struct *p, struct pt_regs *regs)
{
Index: 2.6.25-rc2-mm1/arch/blackfin/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/blackfin/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/blackfin/kernel/process.c
@@ -168,7 +168,7 @@ asmlinkage int bfin_clone(struct pt_regs
}
int
-copy_thread(int nr, unsigned long clone_flags,
+copy_thread(int nr, u64 clone_flags,
unsigned long usp, unsigned long topstk,
struct task_struct *p, struct pt_regs *regs)
{
Index: 2.6.25-rc2-mm1/arch/cris/arch-v10/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/cris/arch-v10/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/cris/arch-v10/kernel/process.c
@@ -115,7 +115,7 @@ int kernel_thread(int (*fn)(void *), voi
*/
asmlinkage void ret_from_fork(void);
-int copy_thread(int nr, unsigned long clone_flags, unsigned long usp,
+int copy_thread(int nr, u64 clone_flags, unsigned long usp,
unsigned long unused,
struct task_struct *p, struct pt_regs *regs)
{
Index: 2.6.25-rc2-mm1/arch/cris/arch-v32/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/cris/arch-v32/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/cris/arch-v32/kernel/process.c
@@ -131,7 +131,7 @@ kernel_thread(int (*fn)(void *), void *
extern asmlinkage void ret_from_fork(void);
int
-copy_thread(int nr, unsigned long clone_flags, unsigned long usp,
+copy_thread(int nr, u64 clone_flags, unsigned long usp,
unsigned long unused,
struct task_struct *p, struct pt_regs *regs)
{
Index: 2.6.25-rc2-mm1/arch/frv/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/frv/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/frv/kernel/process.c
@@ -204,7 +204,7 @@ void prepare_to_copy(struct task_struct
/*
* set up the kernel stack and exception frames for a new process
*/
-int copy_thread(int nr, unsigned long clone_flags,
+int copy_thread(int nr, u64 clone_flags,
unsigned long usp, unsigned long topstk,
struct task_struct *p, struct pt_regs *regs)
{
Index: 2.6.25-rc2-mm1/arch/h8300/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/h8300/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/h8300/kernel/process.c
@@ -192,7 +192,7 @@ asmlinkage int h8300_clone(struct pt_reg
}
-int copy_thread(int nr, unsigned long clone_flags,
+int copy_thread(int nr, u64 clone_flags,
unsigned long usp, unsigned long topstk,
struct task_struct * p, struct pt_regs * regs)
{
Index: 2.6.25-rc2-mm1/arch/ia64/ia32/sys_ia32.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/ia64/ia32/sys_ia32.c
+++ 2.6.25-rc2-mm1/arch/ia64/ia32/sys_ia32.c
@@ -734,7 +734,7 @@ __ia32_copy_pp_list(struct ia64_partial_
int
ia32_copy_ia64_partial_page_list(struct task_struct *p,
- unsigned long clone_flags)
+ u64 clone_flags)
{
int retval = 0;
Index: 2.6.25-rc2-mm1/arch/ia64/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/ia64/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/ia64/kernel/process.c
@@ -402,7 +402,7 @@ ia64_load_extra (struct task_struct *tas
* so there is nothing to worry about.
*/
int
-copy_thread (int nr, unsigned long clone_flags,
+copy_thread(int nr, u64 clone_flags,
unsigned long user_stack_base, unsigned long user_stack_size,
struct task_struct *p, struct pt_regs *regs)
{
Index: 2.6.25-rc2-mm1/arch/m32r/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/m32r/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/m32r/kernel/process.c
@@ -242,7 +242,7 @@ int dump_fpu(struct pt_regs *regs, elf_f
return 0; /* Task didn't use the fpu at all. */
}
-int copy_thread(int nr, unsigned long clone_flags, unsigned long spu,
+int copy_thread(int nr, u64 clone_flags, unsigned long spu,
unsigned long unused, struct task_struct *tsk, struct pt_regs *regs)
{
struct pt_regs *childregs = task_pt_regs(tsk);
Index: 2.6.25-rc2-mm1/arch/m68k/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/m68k/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/m68k/kernel/process.c
@@ -235,7 +235,7 @@ asmlinkage int m68k_clone(struct pt_regs
parent_tidptr, child_tidptr);
}
-int copy_thread(int nr, unsigned long clone_flags, unsigned long usp,
+int copy_thread(int nr, u64 clone_flags, unsigned long usp,
unsigned long unused,
struct task_struct * p, struct pt_regs * regs)
{
Index: 2.6.25-rc2-mm1/arch/m68knommu/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/m68knommu/kernel/process.c
+++ 2.6.25-rc2-mm1/arch/m68knommu/kernel/process.c
@@ -200,7 +200,7 @@ asmlinkage int m68k_clone(struct pt_regs
return do_fork(clone_flags, newsp, regs, 0, NULL, NULL);
}
-int copy_thread(int nr, unsigned long clone_flags,
+int copy_thread(int nr, u64 clone_flags,
unsigned long usp, unsigned long topstk,
struct task_struct * p, struct pt_regs * regs)
{
Index: 2.6.25-rc2-mm1/arch/mips/kernel/process.c
===========================================
...
[PATCH 2/3] add do_unshare() [message #29263 is a reply to message #29261] |
Wed, 09 April 2008 22:34 |
Sukadev Bhattiprolu
From: Sukadev Bhattiprolu <sukadev@us.ibm.com>
Subject: [PATCH 2/3] add do_unshare()
This patch adds a do_unshare() routine which will be common
to the unshare() and unshare64() syscall.
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
---
kernel/fork.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
Index: 2.6.25-rc2-mm1/kernel/fork.c
===================================================================
--- 2.6.25-rc2-mm1.orig/kernel/fork.c
+++ 2.6.25-rc2-mm1/kernel/fork.c
@@ -1696,7 +1696,7 @@ static int unshare_semundo(u64 unshare_f
* constructed. Here we are modifying the current, active,
* task_struct.
*/
-asmlinkage long sys_unshare(unsigned long unshare_flags)
+static long do_unshare(u64 unshare_flags)
{
int err = 0;
struct fs_struct *fs, *new_fs = NULL;
@@ -1790,3 +1790,8 @@ bad_unshare_cleanup_thread:
bad_unshare_out:
return err;
}
+
+asmlinkage long sys_unshare(unsigned long unshare_flags)
+{
+ return do_unshare(unshare_flags);
+}
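The intent of this refactoring can be sketched outside the kernel: one worker takes the full 64-bit flag word, and each entry point only adapts its arguments. The code below is a user-space mock, not kernel code. All mock_ names are hypothetical, the worker merely records the flags it receives, and the 64-bit entry point is assumed to combine its halves the same way sys_clone64() does in patch 3/3.

```c
#include <stdint.h>

/* Recorded by the mock worker so callers can inspect it. */
static uint64_t last_unshare_flags;

/* Mock of the common worker: the real do_unshare() unshares
 * resources; this one only records what it was asked to do. */
static long mock_do_unshare(uint64_t unshare_flags)
{
    last_unshare_flags = unshare_flags;
    return 0;
}

/* Legacy entry point: the flag word widens implicitly to 64 bits. */
static long mock_sys_unshare(unsigned long unshare_flags)
{
    return mock_do_unshare(unshare_flags);
}

/* 64-bit entry point: assumed to join the two halves before
 * handing them to the common worker. */
static long mock_sys_unshare64(unsigned long flags_high,
                               unsigned long flags_low)
{
    return mock_do_unshare(((uint64_t)flags_high << 32) | flags_low);
}
```

Keeping the worker separate means a new flag bit above bit 31 needs no change to the legacy path.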
--
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "lxc-dev" group.
To post to this group, send email to lxc-dev@googlegroups.com
To unsubscribe from this group, send email to lxc-dev-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/lxc-dev?hl=en
-~----------~----~----~----~------~----~------~--~---
[PATCH 3/3] add the clone64() and unshare64() syscalls [message #29264 is a reply to message #29261] |
Wed, 09 April 2008 22:34 |
Sukadev Bhattiprolu
From: Cedric Le Goater <clg@fr.ibm.com>
Subject: [PATCH 3/3] add the clone64() and unshare64() syscalls
This patch adds 2 new syscalls :
long sys_clone64(unsigned long flags_high, unsigned long flags_low,
unsigned long newsp);
long sys_unshare64(unsigned long flags_high, unsigned long flags_low);
The current version of clone64() does not support CLONE_PARENT_SETTID and
CLONE_CHILD_CLEARTID because we would exceed the 6 registers limit of some
arches. It's possible to get around this limitation but we might not
need it as we already have clone()
Signed-off-by: Cedric Le Goater <clg@fr.ibm.com>
Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com>
---
arch/powerpc/kernel/entry_32.S | 8 ++++++++
arch/powerpc/kernel/entry_64.S | 5 +++++
arch/powerpc/kernel/process.c | 15 +++++++++++++++
arch/s390/kernel/compat_linux.c | 16 ++++++++++++++++
arch/s390/kernel/compat_wrapper.S | 6 ++++++
arch/s390/kernel/process.c | 15 +++++++++++++++
arch/s390/kernel/syscalls.S | 2 ++
arch/x86/ia32/ia32entry.S | 4 ++++
arch/x86/ia32/sys_ia32.c | 12 ++++++++++++
arch/x86/kernel/entry_64.S | 1 +
arch/x86/kernel/process_32.c | 14 ++++++++++++++
arch/x86/kernel/process_64.c | 15 +++++++++++++++
arch/x86/kernel/syscall_table_32.S | 2 ++
include/asm-powerpc/systbl.h | 2 ++
include/asm-powerpc/unistd.h | 4 +++-
include/asm-s390/unistd.h | 4 +++-
include/asm-x86/unistd_32.h | 2 ++
include/asm-x86/unistd_64.h | 4 ++++
include/linux/syscalls.h | 3 +++
kernel/fork.c | 7 +++++++
kernel/sys_ni.c | 3 +++
21 files changed, 142 insertions(+), 2 deletions(-)
Index: 2.6.25-rc2-mm1/arch/s390/kernel/syscalls.S
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/s390/kernel/syscalls.S 2008-02-27 15:17:34.000000000 -0800
+++ 2.6.25-rc2-mm1/arch/s390/kernel/syscalls.S 2008-03-06 22:08:49.000000000 -0800
@@ -330,3 +330,5 @@ SYSCALL(sys_eventfd,sys_eventfd,sys_even
SYSCALL(sys_timerfd_create,sys_timerfd_create,sys_timerfd_create_wrapper)
SYSCALL(sys_timerfd_settime,sys_timerfd_settime,compat_sys_timerfd_settime_wrapper) /* 320 */
SYSCALL(sys_timerfd_gettime,sys_timerfd_gettime,compat_sys_timerfd_gettime_wrapper)
+SYSCALL(sys_clone64,sys_clone64,sys32_clone64)
+SYSCALL(sys_unshare64,sys_unshare64,sys_unshare64_wrapper)
Index: 2.6.25-rc2-mm1/arch/x86/kernel/syscall_table_32.S
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/x86/kernel/syscall_table_32.S 2008-02-27 15:17:35.000000000 -0800
+++ 2.6.25-rc2-mm1/arch/x86/kernel/syscall_table_32.S 2008-03-06 22:08:49.000000000 -0800
@@ -326,3 +326,5 @@ ENTRY(sys_call_table)
.long sys_fallocate
.long sys_timerfd_settime /* 325 */
.long sys_timerfd_gettime
+ .long sys_clone64
+ .long sys_unshare64
Index: 2.6.25-rc2-mm1/include/asm-powerpc/systbl.h
===================================================================
--- 2.6.25-rc2-mm1.orig/include/asm-powerpc/systbl.h 2008-02-27 15:18:12.000000000 -0800
+++ 2.6.25-rc2-mm1/include/asm-powerpc/systbl.h 2008-03-06 22:08:49.000000000 -0800
@@ -316,3 +316,5 @@ COMPAT_SYS(fallocate)
SYSCALL(subpage_prot)
COMPAT_SYS_SPU(timerfd_settime)
COMPAT_SYS_SPU(timerfd_gettime)
+PPC_SYS(clone64)
+SYSCALL_SPU(unshare64)
Index: 2.6.25-rc2-mm1/include/asm-powerpc/unistd.h
===================================================================
--- 2.6.25-rc2-mm1.orig/include/asm-powerpc/unistd.h 2008-02-27 15:18:12.000000000 -0800
+++ 2.6.25-rc2-mm1/include/asm-powerpc/unistd.h 2008-03-06 22:08:49.000000000 -0800
@@ -335,10 +335,12 @@
#define __NR_subpage_prot 310
#define __NR_timerfd_settime 311
#define __NR_timerfd_gettime 312
+#define __NR_clone64 313
+#define __NR_unshare64 314
#ifdef __KERNEL__
-#define __NR_syscalls 313
+#define __NR_syscalls 315
#define __NR__exit __NR_exit
#define NR_syscalls __NR_syscalls
Index: 2.6.25-rc2-mm1/include/asm-s390/unistd.h
===================================================================
--- 2.6.25-rc2-mm1.orig/include/asm-s390/unistd.h 2008-02-27 15:18:13.000000000 -0800
+++ 2.6.25-rc2-mm1/include/asm-s390/unistd.h 2008-03-06 22:08:49.000000000 -0800
@@ -259,7 +259,9 @@
#define __NR_timerfd_create 319
#define __NR_timerfd_settime 320
#define __NR_timerfd_gettime 321
-#define NR_syscalls 322
+#define __NR_clone64 322
+#define __NR_unshare64 323
+#define NR_syscalls 324
/*
* There are some system calls that are not present on 64 bit, some
Index: 2.6.25-rc2-mm1/include/asm-x86/unistd_32.h
===================================================================
--- 2.6.25-rc2-mm1.orig/include/asm-x86/unistd_32.h 2008-02-27 15:18:16.000000000 -0800
+++ 2.6.25-rc2-mm1/include/asm-x86/unistd_32.h 2008-03-06 22:08:49.000000000 -0800
@@ -332,6 +332,8 @@
#define __NR_fallocate 324
#define __NR_timerfd_settime 325
#define __NR_timerfd_gettime 326
+#define __NR_clone64 327
+#define __NR_unshare64 328
#ifdef __KERNEL__
Index: 2.6.25-rc2-mm1/include/asm-x86/unistd_64.h
===================================================================
--- 2.6.25-rc2-mm1.orig/include/asm-x86/unistd_64.h 2008-02-27 15:18:16.000000000 -0800
+++ 2.6.25-rc2-mm1/include/asm-x86/unistd_64.h 2008-03-06 22:08:49.000000000 -0800
@@ -639,6 +639,10 @@ __SYSCALL(__NR_fallocate, sys_fallocate)
__SYSCALL(__NR_timerfd_settime, sys_timerfd_settime)
#define __NR_timerfd_gettime 287
__SYSCALL(__NR_timerfd_gettime, sys_timerfd_gettime)
+#define __NR_clone64 288
+__SYSCALL(__NR_clone64, stub_clone64)
+#define __NR_unshare64 289
+__SYSCALL(__NR_unshare64, sys_unshare64)
#ifndef __NO_STUBS
Index: 2.6.25-rc2-mm1/include/linux/syscalls.h
===================================================================
--- 2.6.25-rc2-mm1.orig/include/linux/syscalls.h 2008-02-27 15:18:18.000000000 -0800
+++ 2.6.25-rc2-mm1/include/linux/syscalls.h 2008-03-06 22:08:49.000000000 -0800
@@ -615,6 +615,9 @@ asmlinkage long sys_timerfd_gettime(int
asmlinkage long sys_eventfd(unsigned int count);
asmlinkage long sys_fallocate(int fd, int mode, loff_t offset, loff_t len);
+asmlinkage long sys_unshare64(unsigned long clone_flags_high,
+ unsigned long clone_flags_low);
+
int kernel_execve(const char *filename, char *const argv[], char *const envp[]);
#endif
Index: 2.6.25-rc2-mm1/kernel/sys_ni.c
===================================================================
--- 2.6.25-rc2-mm1.orig/kernel/sys_ni.c 2008-02-27 15:18:23.000000000 -0800
+++ 2.6.25-rc2-mm1/kernel/sys_ni.c 2008-03-06 22:08:49.000000000 -0800
@@ -161,3 +161,6 @@ cond_syscall(sys_timerfd_gettime);
cond_syscall(compat_sys_timerfd_settime);
cond_syscall(compat_sys_timerfd_gettime);
cond_syscall(sys_eventfd);
+
+cond_syscall(sys_clone64);
+cond_syscall(sys_unshare64);
Index: 2.6.25-rc2-mm1/arch/x86/kernel/process_32.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/x86/kernel/process_32.c 2008-03-06 22:08:49.000000000 -0800
+++ 2.6.25-rc2-mm1/arch/x86/kernel/process_32.c 2008-03-06 22:08:49.000000000 -0800
@@ -771,6 +771,20 @@ asmlinkage int sys_clone(struct pt_regs
return do_fork(clone_flags, newsp, &regs, 0, parent_tidptr, child_tidptr);
}
+asmlinkage int sys_clone64(struct pt_regs regs)
+{
+ u64 clone_flags;
+ unsigned long newsp;
+
+ clone_flags = ((u64) regs.bx << 32 | regs.cx);
+ clone_flags &= ~(CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID);
+
+ newsp = regs.dx;
+ if (!newsp)
+ newsp = regs.sp;
+ return do_fork(clone_flags, newsp, &regs, 0, NULL, NULL);
+}
+
/*
* This is trivial, and on the face of it looks like it
* could equally well be done in user mode.
Index: 2.6.25-rc2-mm1/arch/x86/kernel/process_64.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/x86/kernel/process_64.c 2008-03-06 22:08:49.000000000 -0800
+++ 2.6.25-rc2-mm1/arch/x86/kernel/process_64.c 2008-03-06 22:08:49.000000000 -0800
@@ -775,6 +775,21 @@ sys_clone(unsigned long clone_flags, uns
return do_fork(clone_flags, newsp, regs, 0, parent_tid, child_tid);
}
+asmlinkage long
+sys_clone64(unsigned long clone_flags_high, unsigned long clone_flags_low,
+ unsigned long newsp, struct pt_regs *regs)
+{
+ u64 clone_flags;
+
+ clone_flags = ((u64) clone_flags_high << 32 | clone_flags_low);
+ clone_flags &= ~(CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID);
+
+ if (!newsp)
+ newsp = regs->sp;
+ return do_fork(clone_flags, newsp, regs, 0, NULL, NULL);
+}
+
+
/*
* This is trivial, and on the face of it looks like it
* could equally well be done in user mode.
Index: 2.6.25-rc2-mm1/arch/s390/kernel/compat_linux.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/s390/kernel/compat_linux.c 2008-01-26 09:48:58.000000000 -0800
+++ 2.6.25-rc2-mm1/arch/s390/kernel/compat_linux.c 2008-03-06 22:08:49.000000000 -0800
@@ -940,6 +940,22 @@ asmlinkage long sys32_clone(void)
parent_tidptr, child_tidptr);
}
+asmlinkage long sys32_clone64(void)
+{
+ struct pt_regs *regs = task_pt_regs(current);
+ u64 clone_flags;
+ unsigned long newsp;
+
+ clone_flags = ((u64) (regs->orig_gpr2 & 0xffffffffUL) << 32 |
+ (regs->gprs[3] & 0xffffffffUL));
+ clone_flags &= ~(CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID);
+
+ newsp = regs->gprs[4] & 0x7fffffffUL;
+ if (!newsp)
+ newsp = regs->gprs[15];
+ return do_fork(clone_flags, newsp, regs, 0, NULL, NULL);
+}
+
/*
* 31 bit emulation wrapper functions for sys_fadvise64/fadvise64_64.
* These need to rewrite the advise values for POSIX_FADV_{DONTNEED,NOREUSE}
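The masks in sys32_clone64() above matter because a 31-bit task runs on 64-bit registers whose upper halves carry nothing meaningful for the compat ABI. A minimal user-space sketch of the same argument decoding follows. The names compat_clone64_args and decode_compat_clone64() are hypothetical, plain integers stand in for the saved registers, and the clearing of CLONE_PARENT_SETTID and CLONE_CHILD_CLEARTID done by the patch is left out for brevity.

```c
#include <stdint.h>

/* Hypothetical container for the decoded arguments. */
struct compat_clone64_args {
    uint64_t clone_flags;
    uint64_t newsp;
};

/* Decode three 64-bit register values the way the compat
 * wrapper does: gpr2 = flags high, gpr3 = flags low, gpr4 = sp. */
static struct compat_clone64_args
decode_compat_clone64(uint64_t gpr2, uint64_t gpr3, uint64_t gpr4)
{
    struct compat_clone64_args args;

    /* Only the low 32 bits of each register carry a flag half. */
    args.clone_flags = ((gpr2 & 0xffffffffULL) << 32) |
                       (gpr3 & 0xffffffffULL);
    /* 31-bit addressing: keep bits 0..30 of the stack pointer. */
    args.newsp = gpr4 & 0x7fffffffULL;
    return args;
}
```

Without the masks, garbage in the upper register halves would show up as spurious high flag bits or as an invalid stack address.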
Index: 2.6.25-rc2-mm1/arch/s390/kernel/process.c
===================================================================
--- 2.6.25-rc2-mm1.orig/arch/s390/kernel/p
...
Re: [PATCH 0/3] clone64() and unshare64() system calls [message #29268 is a reply to message #29266] |
Thu, 10 April 2008 01:07 |
Sukadev Bhattiprolu
H. Peter Anvin [hpa@zytor.com] wrote:
> sukadev@us.ibm.com wrote:
>> This is a resend of the patch set Cedric had sent earlier. I ported
>> the patch set to 2.6.25-rc8-mm1 and tested on x86 and x86_64.
>> ---
>> We have run out of the 32 bits in clone_flags !
>> This patchset introduces 2 new system calls which support 64bit
>> clone-flags.
>> long sys_clone64(unsigned long flags_high, unsigned long flags_low,
>> unsigned long newsp);
>> long sys_unshare64(unsigned long flags_high, unsigned long
>> flags_low);
>> The current version of clone64() does not support CLONE_PARENT_SETTID and
>> CLONE_CHILD_CLEARTID because we would exceed the 6 registers limit of some
>> arches. It's possible to get around this limitation but we might not
>> need it as we already have clone()
>
> I really dislike this interface.
>
> If you're going to make it a 64-bit pass it in as a 64-bit number, instead
> of breaking it into two numbers.
Maybe I am missing your point. The glibc interface could take a 64bit
parameter, but don't we need to pass 32-bit values into the system call
on 32 bit systems ?
> Better yet, IMO, would be to pass a pointer to a structure like:
>
> struct shared {
> unsigned long nwords;
> unsigned long flags[];
> };
>
> ... which can be expanded indefinitely.
Yes, this was discussed before in the context of Pavel Emelyanov's patch
http://lkml.org/lkml/2008/1/16/109
along with sys_indirect(). While there was no consensus, it looked like
adding a new system call was better than open ended interfaces.
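To make hpa's suggestion concrete, here is a minimal user-space sketch of a variable-length flag set built on the quoted struct shared. The helpers shared_alloc(), shared_set() and shared_test() are hypothetical illustrations, not a proposed kernel interface, and the bit numbering (bit n lives in word n / bits-per-long) is an assumption of this sketch.

```c
#include <limits.h>
#include <stdlib.h>

struct shared {
    unsigned long nwords;
    unsigned long flags[];   /* flexible array member, nwords entries */
};

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Allocate a zeroed flag set with room for nwords words. */
static struct shared *shared_alloc(unsigned long nwords)
{
    struct shared *s = calloc(1, sizeof(*s) + nwords * sizeof(s->flags[0]));

    if (s)
        s->nwords = nwords;
    return s;
}

/* Set a bit; returns -1 if it does not fit in the allocated words. */
static int shared_set(struct shared *s, unsigned long bit)
{
    if (bit / BITS_PER_WORD >= s->nwords)
        return -1;
    s->flags[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
    return 0;
}

/* Test a bit; bits beyond nwords read as clear, so a caller that
 * passes fewer words than the reader knows about stays valid. */
static int shared_test(const struct shared *s, unsigned long bit)
{
    if (bit / BITS_PER_WORD >= s->nwords)
        return 0;
    return (int)((s->flags[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL);
}
```

The trade-off discussed in the thread still applies: the kernel would have to copy and validate a user-supplied length, which is part of why an open-ended interface was contested.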
Re: [PATCH 1/3] change clone_flags type to u64 [message #29303 is a reply to message #29293] |
Thu, 10 April 2008 12:25 |
Cedric Le Goater
Hello Andi,
Andi Kleen wrote:
> sukadev@us.ibm.com writes:
>
>> From: Sukadev Bhattiprolu <sukadev@us.ibm.com>
>> Subject: [lxc-dev] [patch -lxc 1/3] change clone_flags type to u64
>>
>> This is a preliminary patch changing the clone_flags type to 64bits
>> for all the routines called by do_fork().
>
> I must admit I was always a little sceptical of giving every tiny
> namespaceable kernel feature its own CLONE flag (and its own
> CONFIG option). What was the rationale for that again?
I guess that was a development rationale. Most of the namespaces are in
use in the container projects like openvz, vserver and probably others
and we needed a way to activate the code.
Not perfect I agree.
> With your current strategy are you sure that even 64bit will
> be enough in the end? For me it rather looks like you'll
> go through those quickly too as more and more of the kernel
> is namespaced.
well, we're reaching the end. I hope ! devpts is in progress and
mq is just waiting for a clone flag.
> Also I think the user interface is very unfriendly. How
> is a non kernel hacker supposed to make sense of these
> myriads of flags? You'll be creating another
> CreateProcess123_extra_args_extended()
> in the end I fear.
well, the clone interface is not a friendly interface anyway. glibc wraps
it and most users just use fork().
We will need a user library, like we have a libpthread or a libaio, to
effectively use the namespaces features. This is being worked on but
it's another topic.
> Wouldn't it be better to just partition all this into
> fewer more understandable larger feature groups? I think
> that would be much nicer from pretty much all perspectives
> (kernel maintenance, user interface sanity, not needing
> clone128/256 in the end etc.)
Yes, this makes sense. Most of the namespaces have dependencies between
each other.
> Some consolidation on the CONFIGs would be good too. I just
> cannot imagine it really makes sense to configure everything
> so fine grained and this is just asking for random compile
> breakage on randconfig.
yes. definitely agree.
but we still need a way to extend the clone flags because none are left.
would you say that the clone64 is the right way to go or should we rather
go in the direction hpa proposed :
http://lkml.org/lkml/2008/4/9/318 :
> If you're going to make it a 64-bit pass it in as a 64-bit number,
> instead of breaking it into two numbers. Better yet, IMO, would
> be to pass a pointer to a structure like:
>
> struct shared {
> unsigned long nwords;
> unsigned long flags[];
> };
>
> ... which can be expanded indefinitely.
if we could agree on some new interface, we could then make sure we
are not abusing it.
Thanks,
C.
Re: [PATCH 1/3] change clone_flags type to u64 [message #29307 is a reply to message #29305] |
Thu, 10 April 2008 13:18 |
Cedric Le Goater
Andi Kleen wrote:
>> I guess that was a development rationale.
>
> But what rationale? It just doesn't make much sense to me.
Let's add Eric in Cc:
>> Most of the namespaces are in
>> use in the container projects like openvz, vserver and probably others
>> and we needed a way to activate the code.
>
> You could just have added it to feature groups over time.
Yes if the feature group had existed, that would have been a good
option.
Don't get me wrong. I agree with this group direction. Most
namespaces can't be safely decoupled from each other with a clone
flag.
>> Not perfect I agree.
>>
>>> With your current strategy are you sure that even 64bit will
>>> be enough in the end? For me it rather looks like you'll
>>> go through those quickly too as more and more of the kernel
>>> is namespaced.
>> well, we're reaching the end. I hope ! devpts is in progress and
>> mq is just waiting for a clone flag.
>
> Are you sure?
I'm never sure ! :) That's what we have planned for the moment.
>>> Also I think the user interface is very unfriendly. How
>>> is a non kernel hacker supposed to make sense of these
>>> myriads of flags? You'll be creating another
>>> CreateProcess123_extra_args_extended()
>>> in the end I fear.
>> well, the clone interface is not a friendly interface anyway. glibc wraps
>> it
>
> But only for the stack setup which is just a minor detail.
>
> The basic clone() flags interface used to be pretty sane and usable
> before it got overloaded with so many tiny features.
>
> I especially worry on how user land should keep track of changing kernel
> here. If you add new feature flag for lots of kernel features it is
> reasonable to expect that in the future there will be often new features.
>
> Does this mean user land needs to be updated all the time? Will this
> end up like another udev?
>
>> We will need a user library, like we have a libpthread or a libaio, to
>
> That doesn't make sense. The basic kernel syscalls should be usable,
> not require some magic library that would likely need intimate
> knowledge of specific kernel versions to do any good.
No magic there. but running a container will require some userland code
to be set up properly.
>> but we still need a way to extend the clone flags because none are left.
>
> Can we just take out some again that were added in the .25 cycle and
> readd them once there is a properly thought out interface? That would
> leave at least one.
well, CLONE_STOPPED is being recycled in 2.6.26. so we could use that one
to group namespaces.
and CLONE_NEWPID would probably be a good candidate to group namespaces.
That would be fine for me but it would still leave clone with one to zero
flags left.
Thanks,
C.
Re: [PATCH 1/3] change clone_flags type to u64 [message #29330 is a reply to message #29305] |
Thu, 10 April 2008 17:14 |
serue
Quoting Andi Kleen (andi@firstfloor.org):
> > I guess that was a development rationale.
>
> But what rationale? It just doesn't make much sense to me.
>
> > Most of the namespaces are in
> > use in the container projects like openvz, vserver and probably others
> > and we needed a way to activate the code.
>
> You could just have added it to feature groups over time.
>
> >
> > Not perfect I agree.
> >
> > > With your current strategy are you sure that even 64bit will
> > > be enough in the end? For me it rather looks like you'll
> > > go through those quickly too as more and more of the kernel
> > > is namespaced.
> >
> > well, we're reaching the end. I hope ! devpts is in progress and
> > mq is just waiting for a clone flag.
>
> Are you sure?
Well for one thing we can take a somewhat different approach to new
clone flags. I.e. we could extend CLONE_NEWIPC to do mq instead of
introducing a new clone flag. The name doesn't have 'sysv' in it,
and globbing all ipc resources together makes some amount of sense.
Similarly, as hpa+eric pointed out earlier, suka could use
CLONE_NEWDEV for ptys. If we have net, pid, ipc, devices, that's a
pretty reasonable split imo. Perhaps we tie user to devices and get
rid of CLONE_NEWUSER which I suspect no one is using atm (since only
Dave has run into the CONFIG_USER_SCHED problem). Or not. We could
roll uts into net, and give CLONE_NEWUTS a deprecation period.
-serge
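The idea of letting one flag cover a group of related resources can be sketched as a lookup from clone flags to the resources they would unshare. This is only an illustration: the RES_* bits and resources_for_flags() are hypothetical, folding POSIX message queues into CLONE_NEWIPC is the suggestion made above rather than something this patch set implements, and the flag values are the ones defined in include/linux/sched.h.

```c
#include <stdint.h>

/* Flag values as defined in include/linux/sched.h. */
#define CLONE_NEWUTS 0x04000000
#define CLONE_NEWIPC 0x08000000
#define CLONE_NEWNET 0x40000000

/* Hypothetical resource bits, for illustration only. */
enum {
    RES_SYSV_IPC = 1 << 0,
    RES_POSIX_MQ = 1 << 1,
    RES_NET      = 1 << 2,
    RES_UTS      = 1 << 3,
};

/* Map a clone flag word to the set of resources it would unshare. */
static unsigned int resources_for_flags(uint64_t clone_flags)
{
    unsigned int res = 0;

    if (clone_flags & CLONE_NEWIPC)
        res |= RES_SYSV_IPC | RES_POSIX_MQ;  /* one flag, whole IPC group */
    if (clone_flags & CLONE_NEWNET)
        res |= RES_NET;
    if (clone_flags & CLONE_NEWUTS)
        res |= RES_UTS;
    return res;
}
```

With a mapping like this, adding a new kind of IPC object extends an existing group instead of consuming another flag bit.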
Re: [PATCH 1/3] change clone_flags type to u64 [message #29340 is a reply to message #29330] |
Thu, 10 April 2008 22:13 |
Daniel Hokka Zakrisson
Messages: 22 Registered: January 2007
|
Junior Member |
|
|
Serge E. Hallyn wrote:
> Quoting Andi Kleen (andi@firstfloor.org):
>> > I guess that was a development rationale.
>>
>> But what rationale? It just doesn't make much sense to me.
>>
>> > Most of the namespaces are in
>> > use in the container projects like openvz, vserver and probably others
>> > and we needed a way to activate the code.
>>
>> You could just have added it to feature groups over time.
>>
>> >
>> > Not perfect I agree.
>> >
>> > > With your current strategy are you sure that even 64bit will
>> > > be enough in the end? For me it rather looks like you'll
>> > > go through those quickly too as more and more of the kernel
>> > > is namespaced.
>> >
>> > well, we're reaching the end. I hope ! devpts is in progress and
>> > mq is just waiting for a clone flag.
>>
>> Are you sure?
>
> Well for one thing we can take a somewhat different approach to new
> clone flags. I.e. we could extend CLONE_NEWIPC to do mq instead of
> introducing a new clone flag. The name doesn't have 'sysv' in it,
> and globbing all ipc resources together makes some amount of sense.
> Similarly, as hpa+eric pointed out earlier, suka could use
> CLONE_NEWDEV for ptys. If we have net, pid, ipc, devices, that's a
> pretty reasonable split imo. Perhaps we tie user to devices and get
> rid of CLONE_NEWUSER which I suspect no one is using atm (since only
> Dave has run into the CONFIG_USER_SCHED problem). Or not. We could
> roll uts into net, and give CLONE_NEWUTS a deprecation period.
Please don't. Then we'd need to re-add it in Linux-VServer to support
guests where network namespaces aren't used...
> -serge
--
Daniel Hokka Zakrisson
Re: [PATCH 1/3] change clone_flags type to u64 [message #29341 is a reply to message #29340] |
Thu, 10 April 2008 22:49 |
serue
Messages: 750 Registered: February 2006
|
Senior Member |
|
|
Quoting Daniel Hokka Zakrisson (daniel@hozac.com):
> Serge E. Hallyn wrote:
> > Quoting Andi Kleen (andi@firstfloor.org):
> >> > I guess that was a development rationale.
> >>
> >> But what rationale? It just doesn't make much sense to me.
> >>
> >> > Most of the namespaces are in
> >> > use in the container projects like openvz, vserver and probably others
> >> > and we needed a way to activate the code.
> >>
> >> You could just have added it to feature groups over time.
> >>
> >> >
> >> > Not perfect I agree.
> >> >
> >> > > With your current strategy are you sure that even 64bit will
> >> > > be enough in the end? For me it rather looks like you'll
> >> > > go through those quickly too as more and more of the kernel
> >> > > is namespaced.
> >> >
> >> > well, we're reaching the end. I hope ! devpts is in progress and
> >> > mq is just waiting for a clone flag.
> >>
> >> Are you sure?
> >
> > Well for one thing we can take a somewhat different approach to new
> > clone flags. I.e. we could extend CLONE_NEWIPC to do mq instead of
> > introducing a new clone flag. The name doesn't have 'sysv' in it,
> > and globbing all ipc resources together makes some amount of sense.
> > Similarly, as hpa+eric pointed out earlier, suka could use
> > CLONE_NEWDEV for ptys. If we have net, pid, ipc, devices, that's a
> > pretty reasonable split imo. Perhaps we tie user to devices and get
> > rid of CLONE_NEWUSER which I suspect no one is using atm (since only
> > Dave has run into the CONFIG_USER_SCHED problem). Or not. We could
> > roll uts into net, and give CLONE_NEWUTS a deprecation period.
>
> Please don't. Then we'd need to re-add it in Linux-VServer to support
> guests where network namespaces aren't used...
So these are networked vservers with a different hostname? Just
curious, what would be a typical use for these?
Anyway then I guess we won't :) Do you have other suggestions for
ns clone flags which ought to be combined? Does the rest of what I
listed make sense to you? (If not, then I guess I'll step out of the
way and let you and Andi fight it out :)
thanks,
-serge
Re: [PATCH 1/3] change clone_flags type to u64 [message #29359 is a reply to message #29341] |
Fri, 11 April 2008 08:45 |
Daniel Hokka Zakrisson
Messages: 22 Registered: January 2007
|
Junior Member |
|
|
Serge E. Hallyn wrote:
> Quoting Daniel Hokka Zakrisson (daniel@hozac.com):
>> Serge E. Hallyn wrote:
>> > Quoting Andi Kleen (andi@firstfloor.org):
>> >> > I guess that was a development rationale.
>> >>
>> >> But what rationale? It just doesn't make much sense to me.
>> >>
>> >> > Most of the namespaces are in
>> >> > use in the container projects like openvz, vserver and probably others
>> >> > and we needed a way to activate the code.
>> >>
>> >> You could just have added it to feature groups over time.
>> >>
>> >> >
>> >> > Not perfect I agree.
>> >> >
>> >> > > With your current strategy are you sure that even 64bit will
>> >> > > be enough in the end? For me it rather looks like you'll
>> >> > > go through those quickly too as more and more of the kernel
>> >> > > is namespaced.
>> >> >
>> >> > well, we're reaching the end. I hope ! devpts is in progress and
>> >> > mq is just waiting for a clone flag.
>> >>
>> >> Are you sure?
>> >
>> > Well for one thing we can take a somewhat different approach to new
>> > clone flags. I.e. we could extend CLONE_NEWIPC to do mq instead of
>> > introducing a new clone flag. The name doesn't have 'sysv' in it,
>> > and globbing all ipc resources together makes some amount of sense.
>> > Similarly, as hpa+eric pointed out earlier, suka could use
>> > CLONE_NEWDEV for ptys. If we have net, pid, ipc, devices, that's a
>> > pretty reasonable split imo. Perhaps we tie user to devices and get
>> > rid of CLONE_NEWUSER which I suspect no one is using atm (since only
>> > Dave has run into the CONFIG_USER_SCHED problem). Or not. We could
>> > roll uts into net, and give CLONE_NEWUTS a deprecation period.
>>
>> Please don't. Then we'd need to re-add it in Linux-VServer to support
>> guests where network namespaces aren't used...
>
> So these are networked vservers with a different hostname? Just
> curious, what would be a typical use for these?
Layer 3 isolation will continue to be the default for Linux-VServer.
> Anyway then I guess we won't :) Do you have other suggestions for
> ns clone flags which ought to be combined? Does the rest of what I
> listed make sense to you? (If not, then I guess I'll step out of the
> way and let you and Andi fight it out :)
I think putting mq under CLONE_NEWIPC makes sense, as well as using
CLONE_NEWDEV for the ptys. If CLONE_NEWUSER is to be combined with
anything, I think it makes more sense to combine it with CLONE_NEWPID than
CLONE_NEWDEV.
> thanks,
> -serge
>
--
Daniel Hokka Zakrisson
Re: [PATCH 3/3] add the clone64() and unshare64() syscalls [message #29657 is a reply to message #29264] |
Wed, 09 April 2008 23:07 |
Jakub Jelinek
Messages: 1 Registered: April 2008
|
Junior Member |
|
|
On Wed, Apr 09, 2008 at 03:34:59PM -0700, sukadev@us.ibm.com wrote:
> From: Cedric Le Goater <clg@fr.ibm.com>
> Subject: [PATCH 3/3] add the clone64() and unshare64() syscalls
>
> This patch adds 2 new syscalls :
>
> long sys_clone64(unsigned long flags_high, unsigned long flags_low,
> unsigned long newsp);
>
> long sys_unshare64(unsigned long flags_high, unsigned long flags_low);
Can you explain why you are adding it for 64-bit arches too? unsigned long
is already 64-bit there, and both sys_clone and sys_unshare take unsigned
long flags, rather than unsigned int.
Jakub
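
To make the question concrete, here is a minimal sketch of how the two register-sized halves could be folded into one 64-bit flag word. combine_clone_flags(), split_clone_flags() and check_roundtrip() are hypothetical helpers, not taken from the patch. The masking reflects an assumption that each half carries only 32 significant bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rebuild a 64-bit flag word from the two halves
 * passed as flags_high and flags_low. */
static inline uint64_t combine_clone_flags(unsigned long flags_high,
                                           unsigned long flags_low)
{
    /* Keep only the low 32 bits of each half so the result is the same
     * whether unsigned long is 32 or 64 bits wide. */
    return ((uint64_t)(flags_high & 0xffffffffUL) << 32) |
           (uint64_t)(flags_low & 0xffffffffUL);
}

/* Hypothetical inverse, as a userspace wrapper might use it. */
static inline void split_clone_flags(uint64_t flags,
                                     unsigned long *flags_high,
                                     unsigned long *flags_low)
{
    *flags_high = (unsigned long)(flags >> 32);
    *flags_low  = (unsigned long)(flags & 0xffffffffUL);
}

/* Self-check: splitting and recombining gives back the original word. */
static void check_roundtrip(void)
{
    unsigned long hi, lo;
    uint64_t flags = ((uint64_t)1 << 32) | 0x80000000u;

    split_clone_flags(flags, &hi, &lo);
    assert(combine_clone_flags(hi, lo) == flags);
}
```

Where unsigned long is 64 bits wide, flags_low alone could carry the whole word, which is the redundancy the question points at. The split only buys extra bits where unsigned long is 32 bits.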