OpenVZ Forum


Home » Mailing lists » Devel » [PATCH 0/25] Sysfs cleanups & tagged directory support
Re: [PATCH 20/25] sysfs: Rename Support multiple superblocks [message #19608 is a reply to message #19582] Wed, 08 August 2007 16:55 Go to previous message
ebiederm is currently offline  ebiederm
Messages: 1354
Registered: February 2006
Senior Member
Tejun Heo <htejun@gmail.com> writes:

> Eric W. Biederman wrote:
>>> 	/* Find the first parent which has valid dentry.
>>> 	 */
>>> 	dentry = NULL;
>>> 	cur = sd;
>>> 	while (!(dentry = __sysfs_get_dentry(sb, cur))) {
>>> 		if (cur->s_flags & SYSFS_FLAG_REMOVED) {
>>> 			dentry = ERR_PTR(-ENOENT);
>>> 			break;
>>> 		}
>>> 		cur = cur->s_parent;
>>> 	}
>> 
>> Here we depend on the fact that sysfs_root is pointed to
>> by sb->s_root so we know it will always have a dentry.
>
> Hmmm... dentry could be ERR_PTR(-ENOENT) here if the REMOVED flag test
> succeeded, right?

Ugh right.  Now that I don't have the locking it probably makes
sense to have that path just return or branch to the exit.

>
>>> 	/* from the found dentry, look up depth times */
>>> 	while (dentry->d_fsdata != sd) {
>
> And then dereferenced.  The REMOVED test should never succeed there, so
> we're probably in the clear but still the code looks a bit scary.

Agreed.  That is a bug.  Either the removed test needs to
be removed because it can't happen or that case needs to be handled.

Thanks for spotting this one.

Mental blind spots are the worst.

Eric
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19610 is a reply to message #19562] Wed, 08 August 2007 09:16 Go to previous message
Cornelia Huck is currently offline  Cornelia Huck
Messages: 16
Registered: August 2007
Junior Member
On Wed, 08 Aug 2007 17:54:44 +0900,
Tejun Heo <htejun@gmail.com> wrote:

> Are you sure it's patch 6?  Patch 17 adds a deadlock in rename path.

Yes. If I apply the series up to patch 5, everything's fine. If I apply
patch 6, /dev/dasda* is missing.
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 01/25] sysfs: Move all of inode initialization into sysfs_init_inode [message #19616 is a reply to message #19563] Wed, 08 August 2007 06:37 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:08:07PM -0600, Eric W. Biederman wrote:
> 
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 02/25] sysfs: Remove sysfs_instantiate [message #19617 is a reply to message #19564] Wed, 08 August 2007 06:37 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:08:50PM -0600, Eric W. Biederman wrote:
> 
> Now that sysfs_get_inode is dropping the inode lock
> we no longer have a need from sysfs_instantiate.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 03/25] sysfs: Use kill_anon_super [message #19618 is a reply to message #19565] Wed, 08 August 2007 06:50 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:10:27PM -0600, Eric W. Biederman wrote:
> 
> Since sysfs no longer stores fs directory information in the dcache
> on a permanent basis kill_litter_super it is inappropriate and actively
> wrong.  It will decrement the count on all dentries left in the
> dcache before trying to free them.
> 
> At the moment this is not biting us only because we never unmount sysfs.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 04/25] sysfs: Make sysfs_mount static [message #19619 is a reply to message #19566] Wed, 08 August 2007 06:51 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:11:16PM -0600, Eric W. Biederman wrote:
> 
> This patch modifies the users of sysfs_mount to use sysfs_root
> instead (which is what they are looking for).  It then
> makes sysfs_mount static to keep people from using it
> by accident.
> 
> The net result is slightly faster and cleaner code.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>

--
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 05/25] sysfs: In sysfs_lookup don't open code sysfs_find_dirent [message #19620 is a reply to message #19567] Wed, 08 August 2007 06:51 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:12:02PM -0600, Eric W. Biederman wrote:
> 
> This is a small cleanup patch that makes the code just
> a little bit cleaner.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 06/25] sysfs: Simplify readdir. [message #19621 is a reply to message #19568] Wed, 08 August 2007 07:12 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:13:14PM -0600, Eric W. Biederman wrote:
> 
> At some point someone wrote sysfs_readdir to insert a cursor
> into the list of sysfs_dirents to ensure that sysfs_readdir would
> restart properly.  That works but it is complex code and tends
> to be expensive.
> 
> The same effect can be achieved by keeping the sysfs_dirents in
> inode order and using the inode number as the f_pos.  Then
> when we restart we just have to find the first dirent whose inode
> number is equal or greater then the last sysfs_dirent we attempted
> to return.
> 
> Removing the sysfs directory cursor also allows the remove of
> all of the mysterious checks for sysfs_type(sd) != 0.   Which
> were nonbovious checks to see if a cursor was in a directory list.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19622 is a reply to message #19607] Wed, 08 August 2007 14:16 Go to previous message
Cornelia Huck is currently offline  Cornelia Huck
Messages: 16
Registered: August 2007
Junior Member
On Wed, 8 Aug 2007 10:37:59 +0200,
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:

> On Wed, 08 Aug 2007 01:57:07 -0600,
> ebiederm@xmission.com (Eric W. Biederman) wrote:
> 
> > > Got it: It's patch 6, the readdir simplification.
> > >
> > > (The udev on that guest is ancient (063)...)
> > 
> > Ok.  That is weird.
> 
> More weirdness. If I activate another dasd from the repair file
> system, /dev/dasdb is created...
> 
> Same if I set the card reader online: /dev/vmrdr-0.0.000c is created as
> expected.

OK, it seems that it is udevstart that has problems. (Normal udev
operation seems fine; when I trigger uevents manually, the device nodes
are created.)

I ran strace on udevstart and found that it skipped
/sys/block/dasd/dasda1 (interestingly, /dev/dasda was created after I
ran udevstart manually; no idea why this didn't happen on boot. Further
manual runs don't create /dev/dasda1, however.) ls doesn't produce
different output on the two kernels.

Here is some excerpt from the strace run with the broken kernel:

open("/sys/block", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
fstat64(3, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
getdents64(3, /* 28 entries */, 4096)   = 800
_llseek(3, 3414, [3414], SEEK_SET)      = 0
stat64("/sys/block/dasda/dev", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
open("/sys/block/dasda", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 4
fstat64(4, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC)         = 0
getdents64(4, /* 15 entries */, 4096)   = 440
_llseek(4, 3079, [3079], SEEK_SET)      = 0
stat64("/sys/block/dasda/uevent/dev", 0x7fb8c3f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/dev/dev", 0x7fb8c3f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/range/dev", 0x7fb8c3f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/removable/dev", 0x7fb8c3f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/size/dev", 0x7fb8c3f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/stat/dev", 0x7fb8c3f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/capability/dev", 0x7fb8c3f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/device/dev", 0x7fb8c3f8) = -1 ENOENT (No such file or d irectory)
stat64("/sys/block/dasda/subsystem/dev", 0x7fb8c3f8) = -1 ENOENT (No such file or directory)
stat64("/sys/block/dasda/holders/dev", 0x7fb8c3f8) = -1 ENOENT (No such file or directory)
stat64("/sys/block/dasda/slaves/dev", 0x7fb8c3f8) = -1 ENOENT (No such file or directory)
stat64("/sys/block/dasda/queue/dev", 0x7fb8c3f8) = -1 ENOENT (No such file or directory)
getdents64(4, /* 1 entries */, 4096)    = 32
close(4)                                = 0
getdents64(3, /* 1 entries */, 4096)    = 32
close(3)                                = 0
lstat64("/sys/block/dasda", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat64("/sys/block/dasda/dev", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
stat64("/sys/block/dasda/dev", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
open("/sys/block/dasda/dev", O_RDONLY)  = 3
read(3, "94:0\n", 4096)                 = 5
close(3)                                = 0
lstat64("/sys/block/dasda/device", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0
readlink("/sys/block/dasda/device", "../../devices/css0/0.0.0003/0.0.4e30", 256) = 36
lstat64("/sys/devices/css0/0.0.0003/0.0.4e30", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat64("/sys/devices/css0/0.0.0003/0.0.4e30/bus", {st_mode=S_IFLNK|0777, st_siz e=0, ...}) = 0
readlink("/sys/devices/css0/0.0.0003/0.0.4e30/bus", "../../../../bus/ccw", 256) = 19
lstat64("/sys/devices/css0/0.0.0003/0.0.4e30/driver", {st_mode=S_IFLNK|0777, st_ size=0, ...}) = 0
readlink("/sys/devices/css0/0.0.0003/0.0.4e30/driver", "../../../../bus/ccw/driv ers/dasd-eckd", 256) = 37
open("/proc/filesystems", O_RDONLY|O_LARGEFILE) = 3
read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tb"..., 4095) = 259
close(3)                                = 0
stat64("/dev/dasda", 0x7fb8b738)        = -1 ENOENT (No such file or directory)
mknod("/dev/dasda", S_IFBLK|0640, makedev(94, 0)) = 0
chmod("/dev/dasda", 060640)             = 0
chown("/dev/dasda", 0, 6)               = 0


And here's with a working kernel:

open("/sys/block", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 3
fstat64(3, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
fcntl64(3, F_SETFD, FD_CLOEXEC)         = 0
getdents64(3, /* 27 entries */, 4096)   = 768
stat64("/sys/block/dasda/dev", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
open("/sys/block/dasda", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY) = 4
fstat64(4, {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
fcntl64(4, F_SETFD, FD_CLOEXEC)         = 0
getdents64(4, /* 15 entries */, 4096)   = 440
stat64("/sys/block/dasda/dasda1/dev", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
stat64("/sys/block/dasda/queue/dev", 0x7f9004f8) = -1 ENOENT (No such file or directory)
stat64("/sys/block/dasda/slaves/dev", 0x7f9004f8) = -1 ENOENT (No such file or directory)
stat64("/sys/block/dasda/holders/dev", 0x7f9004f8) = -1 ENOENT (No such file or directory)
stat64("/sys/block/dasda/subsystem/dev", 0x7f9004f8) = -1 ENOENT (No such file or directory)
stat64("/sys/block/dasda/device/dev", 0x7f9004f8) = -1 ENOENT (No such file or directory)
stat64("/sys/block/dasda/capability/dev", 0x7f9004f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/stat/dev", 0x7f9004f8) = -1 ENOTDIR (Not a  directory)
stat64("/sys/block/dasda/size/dev", 0x7f9004f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/removable/dev", 0x7f9004f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/range/dev", 0x7f9004f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/dev/dev", 0x7f9004f8) = -1 ENOTDIR (Not a directory)
stat64("/sys/block/dasda/uevent/dev", 0x7f9004f8) = -1 ENOTDIR (Not a directory)
getdents64(4, /* 0 entries */, 4096)    = 0
close(4)                                = 0
...
lstat64("/sys/block/dasda", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat64("/sys/block/dasda/dev", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
stat64("/sys/block/dasda/dev", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
open("/sys/block/dasda/dev", O_RDONLY)  = 3
read(3, "94:0\n", 4096)                 = 5
close(3)                                = 0
lstat64("/sys/block/dasda/device", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0
readlink("/sys/block/dasda/device", "../../devices/css0/0.0.0003/0.0.4e30", 256) = 36
lstat64("/sys/devices/css0/0.0.0003/0.0.4e30", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat64("/sys/devices/css0/0.0.0003/0.0.4e30/bus", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0
readlink("/sys/devices/css0/0.0.0003/0.0.4e30/bus", "../../../../bus/ccw", 256) = 19
lstat64("/sys/devices/css0/0.0.0003/0.0.4e30/driver", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0
readlink("/sys/devices/css0/0.0.0003/0.0.4e30/driver", "../../../../bus/ccw/drivers/dasd-eckd", 256) = 37
open("/proc/filesystems", O_RDONLY|O_LARGEFILE) = 3
read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tb"..., 4095) = 276
close(3)                                = 0
stat64("/dev/dasda", {st_mode=S_IFBLK|0640, st_rdev=makedev(94, 0), ...}) = 0
chmod("/dev/dasda", 060640)             = 0
chown("/dev/dasda", 0, 6)               = 0
...
lstat64("/sys/block/dasda/dasda1", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat64("/sys/block/dasda/dasda1/dev", {st_mode=S_IFREG|0444, st_size=4096, ...} ) = 0
stat64("/sys/block/dasda/dasda1/dev", {st_mode=S_IFREG|0444, st_size=4096, ...}) = 0
open("/sys/block/dasda/dasda1/dev", O_RDONLY) = 3
read(3, "94:1\n", 4096)                 = 5
close(3)                                = 0
lstat64("/sys/block/dasda", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat64("/sys/block/dasda/device", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0
readlink("/sys/block/dasda/device", "../../devices/css0/0.0.0003/0.0.4e30", 256) = 36
lstat64("/sys/devices/css0/0.0.0003/0.0.4e30", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0
lstat64("/sys/devices/css0/0.0.0003/0.0.4e30/bus", {st_mode=S_IFLNK|0777, st_size=0, ...}) = 0
readlink("/sys/devices/css0/0.0.0003/0.0.4e30/bus", "../../../../bus/ccw", 256) = 19
lstat64("/sys/devices/css0/0.0.0003/0.0.4e30/driver", {st_mode=S_IFLNK|0777, st_ size=0, ...}) = 0
readlink("/sys/devices/css0/0.0.0003/0.0.4e30/driver", "../../../../bus/ccw/drivers/dasd-eckd", 256) = 37
stat64("/dev/dasda1", {st_mode=S_IFBLK|0640, st_rdev=makedev(94, 1), ...}) = 0
chmod("/dev/dasda1", 060640)            = 0
chown("/dev/dasda1", 0, 6)              = 0

Ideas welcome...
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 07/25] sysfs: Rewrite sysfs_drop_dentry. [message #19623 is a reply to message #19569] Wed, 08 August 2007 07:35 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:14:56PM -0600, Eric W. Biederman wrote:
> 
> Currently we find the dentry to drop by looking at sd->s_dentry.
> We can just as easily accomplish the same task by looking up the
> sysfs inode and finding all of the dentries from there, with the
> added bonus that we don't need to play with the sysfs_assoc_lock.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Great, Acked-by: Tejun Heo <htejun@gmail.com>

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 08/25] sysfs: Implement __sysfs_get_dentry [message #19624 is a reply to message #19570] Wed, 08 August 2007 07:45 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:16:19PM -0600, Eric W. Biederman wrote:
> 
> This function is similar but much simpler to sysfs_get_dentry
> returns a sysfs dentry if one currently exists.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 09/25] sysfs: Move sysfs_get_dentry below __sysfs_get_dentry [message #19625 is a reply to message #19571] Wed, 08 August 2007 07:45 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:17:33PM -0600, Eric W. Biederman wrote:
> 
> sysfs_get_dentry is higher in fs/sysfs/dir.c then is needed and it the
> dependencies get simpler if we move it down in the file to where I
> have placed __sysfs_get_dentry.  So this patch just moves
> sysfs_get_dentry so code movement doesn't get confused with later code
> changes. 
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 10/25] sysfs: Rewrite sysfs_get_dentry in terms of __sysfs_get_dentry [message #19626 is a reply to message #19572] Wed, 08 August 2007 07:45 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:18:27PM -0600, Eric W. Biederman wrote:
> 
> This removes the last major user of s_dentry and makes
> the locking in sysfs_get_dentry much simpler.  Hopefully
> leading to more readable and maintainable code.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 11/25] sysfs: Remove s_dentry [message #19627 is a reply to message #19573] Wed, 08 August 2007 07:46 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:19:31PM -0600, Eric W. Biederman wrote:
> 
> The only uses of s_dentry left are the code that maintains
> s_dentry and trivial users that don't actually need it.
> So this patch removes the s_dentry maintenance code and
> restructures the trivial uses to use something else.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>

Nice clean up.  Thanks.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19629 is a reply to message #19562] Wed, 08 August 2007 14:50 Go to previous message
Cornelia Huck is currently offline  Cornelia Huck
Messages: 16
Registered: August 2007
Junior Member
On Wed, 08 Aug 2007 23:35:36 +0900,
Tejun Heo <htejun@gmail.com> wrote:

> Does the attached patch happen to fix the problem?

Indeed it does; thanks!
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19631 is a reply to message #19591] Wed, 08 August 2007 07:53 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Eric W. Biederman wrote:
> Cornelia Huck <cornelia.huck@de.ibm.com> writes:
> 
>> My udev failed to create /dev/dasd* so it cannot mount root :( I'm
>> currently trying to find out what causes this, may take some time...
> 
> Oh weird.
> 
> No great surprise that something goofed up given how many
> patches were involved.  Still there shouldn't have been
> any user visible differences in the patchset.
> 
> If you can narrow down which patch caused the problem that would be
> great.
> 
> Otherwise I guess I will have to start looking at the code and
> trying to guess what might have triggered this.

JFYI, your patchset works fine on my x86-64 test machine.  Corenelia
tests on a s390 system which does rename/move stuff with its devices and
usually gets hit first if something is wrong.  I'll scream if I find
anything suspicious while reviewing.

Thanks.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19632 is a reply to message #19562] Wed, 08 August 2007 15:15 Go to previous message
Cornelia Huck is currently offline  Cornelia Huck
Messages: 16
Registered: August 2007
Junior Member
On Wed, 08 Aug 2007 23:55:29 +0900,
Tejun Heo <htejun@gmail.com> wrote:

> Yeah, you seem to have 32bit off_t.  UINT_MAX overflows, so...

<checks> Yes, that was an 31 bit guest. (Good thing I tried it there, I
usually use 64 bit guests...)
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 12/25] sysfs: Introduce sysfs_rename_mutex [message #19633 is a reply to message #19574] Wed, 08 August 2007 08:19 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Hello, Eric.

Eric W. Biederman wrote:
> Looking carefully at the rename code we have a subtle dependency
> that the structure of sysfs not change while we are performing
> a rename.  If the parent directory of the object we are renaming
> changes while the rename is being performed nasty things could
> happen when we go to release our locks.
> 
> So introduce a sysfs_rename_mutex to prevent this highly
> unlikely theoretical issue.

Yeah, it's a theoretical issue.  Rename/move implementation has always
depended on the parent structure not changing beneath it, but it's nice
to tighten up loose ends.

> +DEFINE_MUTEX(sysfs_rename_mutex);

Probably doesn't really matter but wouldn't a rwsem fit better?

> @@ -774,7 +775,7 @@ static struct dentry *__sysfs_get_dentry(struct super_block *sb, struct sysfs_di
>   *	down from there looking up dentry for each step.
>   *
>   *	LOCKING:
> - *	Kernel thread context (may sleep)
> + *	mutex_lock(sysfs_rename_mutex)

LOCKING describes what locks should be held when entering the function,
so proper description would be something like...

	Kernel thread context, grabs sysfs_rename_mutex

Thanks.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 12/25] sysfs: Introduce sysfs_rename_mutex [message #19634 is a reply to message #19633] Wed, 08 August 2007 08:23 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Tejun Heo wrote:
> Hello, Eric.
> 
> Eric W. Biederman wrote:
>> Looking carefully at the rename code we have a subtle dependency
>> that the structure of sysfs not change while we are performing
>> a rename.  If the parent directory of the object we are renaming
>> changes while the rename is being performed nasty things could
>> happen when we go to release our locks.
>>
>> So introduce a sysfs_rename_mutex to prevent this highly
>> unlikely theoretical issue.
> 
> Yeah, it's a theoretical issue.  Rename/move implementation has always
> depended on the parent structure not changing beneath it, but it's nice
> to tighten up loose ends.
> 
>> +DEFINE_MUTEX(sysfs_rename_mutex);
> 
> Probably doesn't really matter but wouldn't a rwsem fit better?
> 
>> @@ -774,7 +775,7 @@ static struct dentry *__sysfs_get_dentry(struct super_block *sb, struct sysfs_di
>>   *	down from there looking up dentry for each step.
>>   *
>>   *	LOCKING:
>> - *	Kernel thread context (may sleep)
>> + *	mutex_lock(sysfs_rename_mutex)
> 
> LOCKING describes what locks should be held when entering the function,
> so proper description would be something like...
> 
> 	Kernel thread context, grabs sysfs_rename_mutex

Oops, forget about the above.  Thought the comment was added to
sysfs_rename_dir().

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 13/25] sysfs: Simply sysfs_get_dentry [message #19635 is a reply to message #19575] Wed, 08 August 2007 08:24 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:22:13PM -0600, Eric W. Biederman wrote:
> 
> Now that we know the sysfs tree structure cannot change under us
> simplfy sysfs_get_dentry.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Acked-by: Tejun Heo <htejun@gmail.com>

It might be better to have sysfs_get_dentry_locked() and
sysfs_get_dentry() so that the latter can be used from
sysfs_update/chmod_file().

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 14/25] sysfs: Don't use lookup_one_len_kern [message #19636 is a reply to message #19576] Wed, 08 August 2007 08:38 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:23:57PM -0600, Eric W. Biederman wrote:
> 
> Upon inspection it appears that there is no looking of the
> inode mutex in lookup_one_len_kern and we aren't calling
> it with the inode mutex and that is wrong.
> 
> So this patch rolls our own dcache insertion function that
> does exactly what we need it to do.  As it turns out this
> is pretty trivial to do and it makes the code easier to
> audit.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> ---
>  fs/sysfs/dir.c |   41 +++++++++++++++++++++++++++++++++++++++--
>  1 files changed, 39 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
> index a9bdb12..1d53c2a 100644
> --- a/fs/sysfs/dir.c
> +++ b/fs/sysfs/dir.c
> @@ -765,6 +765,44 @@ static struct dentry *__sysfs_get_dentry(struct super_block *sb, struct sysfs_di
>  	return dentry;
>  }
>  
> +static struct dentry *sysfs_add_dentry(struct dentry *parent, struct sysfs_dirent *sd)
> +{
> +	struct qstr name;
> +	struct dentry *dentry;
> +	struct inode *inode;
> +
> +	mutex_lock(&parent->d_inode->i_mutex);
> +	mutex_lock(&sysfs_mutex);
> +	dentry = ERR_PTR(-EINVAL);
> +	if (parent->d_fsdata != sd->s_parent)
> +		goto out;
> +
> +	name.name = sd->s_name;
> +	name.len = strlen(sd->s_name);
> +	dentry = d_hash_and_lookup(parent, &name);
> +	if (dentry)
> +		goto out;
> +
> +	dentry = d_alloc(parent, &name);
> +	if (!dentry) {
> +		dentry = ERR_PTR(-ENOMEM);
> +		goto out;
> +	}
> +
> +	inode = sysfs_get_inode(sd);
> +	if (!inode) {
> +		dput(dentry);
> +		dentry = ERR_PTR(-ENOMEM);
> +		goto out;
> +	}
> +	d_instantiate(dentry, inode);
> +	sysfs_attach_dentry(sd, dentry);
> +out:
> +	mutex_unlock(&sysfs_mutex);
> +	mutex_unlock(&parent->d_inode->i_mutex);
> +	return dentry;
> +}

This is virtually identical to

	mutex_lock(&parent_dentry->d_inode->i_mutex);
	dentry = lookup_one_len_kern(cur->s_name, parent_dentry,
				     strlen(cur->s_name));
	mutex_unlock(&parent_dentry->d_inode->i_mutex);

right?  I don't think we need to duplicate the code here.  Or is it
needed for later multi-sb thing?

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 15/25] vfs: Remove lookup_one_len_kern [message #19637 is a reply to message #19577] Wed, 08 August 2007 08:39 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:25:05PM -0600, Eric W. Biederman wrote:
> 
> Now that sysfs no longer uses lookup_one_len_kern the function has
> no users so remove it from the kernel.  Making namei.c just a little
> easier to read.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Oh.. you're killing lookup_one_len_kern().  In that case, I think the
previous one is okay too.

Acked-by: Tejun Heo <htejun@gmail.com>

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 17/25] sysfs: Rewrite rename in terms of sysfs dirents [message #19638 is a reply to message #19579] Wed, 08 August 2007 08:51 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
>  int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
>  {
> -	struct sysfs_dirent *sd;
> +	struct sysfs_dirent *sd = kobj->sd;
>  	struct dentry *parent = NULL;
>  	struct dentry *old_dentry = NULL, *new_dentry = NULL;
>  	const char *dup_name = NULL;
> @@ -863,42 +863,41 @@ int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
>  
>  	mutex_lock(&sysfs_rename_mutex);
>  
> +	error = 0;
> +	if (strcmp(sd->s_name, new_name) == 0)
> +		goto out;	/* nothing to rename */
> +
>  	/* get the original dentry */
>  	old_dentry = sysfs_get_dentry(sd);
>  	if (IS_ERR(old_dentry)) {
>  		error = PTR_ERR(old_dentry);
> +		goto out;
>  	}
>  
>  	parent = old_dentry->d_parent;
>  
>  	/* lock parent and get dentry for new name */
>  	mutex_lock(&parent->d_inode->i_mutex);
> +	mutex_lock(&sysfs_mutex);
>  
> +	error = -EEXIST;
> +	if (sysfs_find_dirent(sd->s_parent, new_name))
>  		goto out_unlock;
>  
> +	error = -ENOMEM;
> +	new_dentry = d_alloc_name(parent, new_name);
> +	if (!new_dentry)
>  		goto out_unlock;
>  
>  	/* rename kobject and sysfs_dirent */
>  	error = -ENOMEM;
>  	new_name = dup_name = kstrdup(new_name, GFP_KERNEL);
>  	if (!new_name)
> +		goto out_unlock;
>  
>  	error = kobject_set_name(kobj, "%s", new_name);
>  	if (error)
> +		goto out_unlock;
>  
>  	mutex_lock(&sysfs_mutex);

sysfs_mutex is being grabbed twice and unlocked twice later.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 18/25] sysfs: Rewrite sysfs_move_dir in terms of sysfs dirents [message #19639 is a reply to message #19580] Wed, 08 August 2007 08:53 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:28:10PM -0600, Eric W. Biederman wrote:
> 
> This patch rewrites sysfs_move_dir to perform it's checks
> as much as possible on the underlying sysfs_dirents instead
> of the contents of the dcache, making sysfs_move_dir
> more like the rest of the sysfs directory modification
> code.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Signed-off-by: Tejun Heo <htejun@gmail.com>

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19640 is a reply to message #19607] Wed, 08 August 2007 08:54 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Cornelia Huck wrote:
> On Wed, 08 Aug 2007 01:57:07 -0600,
> ebiederm@xmission.com (Eric W. Biederman) wrote:
> 
>>> Got it: It's patch 6, the readdir simplification.
>>>
>>> (The udev on that guest is ancient (063)...)
>> Ok.  That is weird.
> 
> More weirdness. If I activate another dasd from the repair file
> system, /dev/dasdb is created...
> 
> Same if I set the card reader online: /dev/vmrdr-0.0.000c is created as
> expected.
> 
>> Does it depend on the order in which the dentries are returned from
>> readdir?
> 
> I'd think not.
> 
>> Unless I made a really stupid error otherwise the two versions
>> of readdir should have the same semantics.
> 
> Yes, your patch looks sane. I have no idea why it breaks stuff...

Are you sure it's patch 6?  Patch 17 adds a deadlock in rename path.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 19/25] sysfs: sysfs_get_dentry add a sb parameter [message #19641 is a reply to message #19581] Wed, 08 August 2007 08:57 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:29:23PM -0600, Eric W. Biederman wrote:
> 
> In preparation for multiple mounts of sysfs add a superblock parameter to
> sysfs_get_dentry.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> @@ -827,7 +829,7 @@ struct dentry *sysfs_get_dentry(struct sysfs_dirent *sd)
>  	 */
>  	dentry = NULL;
>  	cur = sd;
> -	while (!(dentry = __sysfs_get_dentry(sysfs_sb, cur))) {
> +	while (!(dentry = __sysfs_get_dentry(sb, cur))) {

It's probably better to add @sb to __sysfs_get_dentry() here too.
That will make things look clearer.

Thanks.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 20/25] sysfs: Rename Support multiple superblocks [message #19642 is a reply to message #19582] Wed, 08 August 2007 09:35 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Hello,

On Tue, Aug 07, 2007 at 03:31:18PM -0600, Eric W. Biederman wrote:
> This patch modifies the sysfs_rename_dir and sysfs_move_dir
> to support multiple sysfs dentry trees rooted in different
> sysfs superblocks.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

> +struct sysfs_rename_struct {
> +	struct list_head list;
> +	struct dentry *old_dentry;
> +	struct dentry *new_dentry;
> +	struct dentry *old_parent;
> +	struct dentry *new_parent;
> +};

Please rename to sysfs_rename_cxt to it consistent with
sysfs_addrm_cxt.

> +static void post_rename(struct list_head *head)

Please rename to sysfs_post_rename() and add comment.

> +{
> +	struct sysfs_rename_struct *srs;
> +	while (!list_empty(head)) {
> +		srs = list_entry(head->next, struct sysfs_rename_struct, list);
> +		dput(srs->old_dentry);
> +		dput(srs->new_dentry);
> +		dput(srs->old_parent);
> +		dput(srs->new_parent);
> +		list_del(&srs->list);
> +		kfree(srs);
> +	}
> +}
> +
> +static int prep_rename(struct list_head *head,
> +	struct sysfs_dirent *sd, struct sysfs_dirent *new_parent_sd,
> +	const char *name)

Ditto.

> +{
> +	struct sysfs_rename_struct *srs;
> +	struct super_block *sb;
> +	struct dentry *dentry;
> +	int error;
> +
> +	list_for_each_entry(sb, &sysfs_fs_type.fs_supers, s_instances) {
> +		dentry = sysfs_get_dentry(sb, sd);
> +		if (!dentry)
> +			continue;

sysfs_get_dentry() return ERR_PTR() value.  Oops, sysfs_get_dentry()
implementation is wrong too.  Also, please move
sysfs_grab/release_supers() near this patch and add (a lot of)
comments there.

Other than that, I think this is as clean as this can be.  Great.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19643 is a reply to message #19629] Wed, 08 August 2007 16:35 Go to previous message
Cornelia Huck is currently offline  Cornelia Huck
Messages: 16
Registered: August 2007
Junior Member
On Wed, 8 Aug 2007 16:50:27 +0200,
Cornelia Huck <cornelia.huck@de.ibm.com> wrote:

> On Wed, 08 Aug 2007 23:35:36 +0900,
> Tejun Heo <htejun@gmail.com> wrote:
> 
> > Does the attached patch happen to fix the problem?
> 
> Indeed it does; thanks!

And the whole patchset seems to work as well. I did my usual
attach/detach/move stuff, and didn't see any problems.
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 21/25] sysfs: sysfs_chmod_file handle multiple superblocks [message #19644 is a reply to message #19583] Wed, 08 August 2007 09:38 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:32:46PM -0600, Eric W. Biederman wrote:
> 
> Teach sysfs_chmod_file how to handle multiple sysfs
> superblocks.  We need to iterate over each superblock
> so that we give all of the appropriate filesystem modification
> notifications.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> ---
>  fs/sysfs/file.c |   41 +++++++++++++++++++++++++----------------
>  1 files changed, 25 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
> index f954b9f..cff054f 100644
> --- a/fs/sysfs/file.c
> +++ b/fs/sysfs/file.c
> @@ -501,7 +501,8 @@ int sysfs_update_file(struct kobject * kobj, const struct attribute * attr)
>  int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
>  {
>  	struct sysfs_dirent *victim_sd = NULL;
> -	struct dentry *victim = NULL;
> +	struct super_block *sb;
> +	struct dentry *victim;
>  	struct inode * inode;
>  	struct iattr newattrs;
>  	int rc;
> @@ -512,22 +513,30 @@ int sysfs_chmod_file(struct kobject *kobj, struct attribute *attr, mode_t mode)
>  		goto out;
>  
>  	mutex_lock(&sysfs_rename_mutex);
> -	victim = sysfs_get_dentry(sysfs_sb, victim_sd);
> -	mutex_unlock(&sysfs_rename_mutex);
> -	if (IS_ERR(victim)) {
> -		rc = PTR_ERR(victim);
> -		victim = NULL;
> -		goto out;
> +	sysfs_grab_supers();
> +	list_for_each_entry(sb, &sysfs_fs_type.fs_supers, s_instances) {
> +		victim = sysfs_get_dentry(sb, victim_sd);
> +		if (!victim)
> +			continue;
> +		if (IS_ERR(victim)) {
> +			rc = PTR_ERR(victim);
> +			victim = NULL;
> +			goto out_unlock;

Hmmm... Please fix sysfs_get_dentry() and make it return either NULL
or PTR_ERR() values.  Returning both is pretty confusing.  Also, it
would be nice if we can use the rename_prep stuff for this too but it
might just be a wishful thinking.

Thanks.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 22/25] sysfs: sysfs_uptdate_file handle multiple superblocks [message #19645 is a reply to message #19584] Wed, 08 August 2007 09:39 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
On Tue, Aug 07, 2007 at 03:34:11PM -0600, Eric W. Biederman wrote:
> 
> Teach sysfs_update_file how to handle multiple sysfs
> superblocks.  Again we are just iterating over the superblocks
> to so all of the filesystem modification notifications work
> as expected.
> 
> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

Same as the previous one.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19653 is a reply to message #19622] Wed, 08 August 2007 14:35 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Does the attached patch happen to fix the problem?

-- 
tejun

---
 fs/sysfs/dir.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: work/fs/sysfs/dir.c
===================================================================
--- work.orig/fs/sysfs/dir.c
+++ work/fs/sysfs/dir.c
@@ -1010,7 +1010,7 @@ static int sysfs_readdir(struct file * f
 		if (filldir(dirent, "..", 2, filp->f_pos, ino, DT_DIR) == 0)
 			filp->f_pos++;
 	}
-	if ((filp->f_pos > 1) && (filp->f_pos < UINT_MAX)) {
+	if ((filp->f_pos > 1) && (filp->f_pos < INT_MAX)) {
 		mutex_lock(&sysfs_mutex);
 
 		/* Skip the dentries we have already reported */
@@ -1031,7 +1031,7 @@ static int sysfs_readdir(struct file * f
 				break;
 		}
 		if (!pos)
-			filp->f_pos = UINT_MAX;
+			filp->f_pos = INT_MAX;
 		mutex_unlock(&sysfs_mutex);
 	}
 	return 0;

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19654 is a reply to message #19629] Wed, 08 August 2007 14:55 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Cornelia Huck wrote:
> On Wed, 08 Aug 2007 23:35:36 +0900,
> Tejun Heo <htejun@gmail.com> wrote:
> 
>> Does the attached patch happen to fix the problem?
> 
> Indeed it does; thanks!

Yeah, you seem to have 32bit off_t.  UINT_MAX overflows, so...

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19655 is a reply to message #19595] Wed, 08 August 2007 15:13 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Eric W. Biederman wrote:
> Tejun Heo <htejun@gmail.com> writes:
> 
>> Cornelia Huck wrote:
>>> On Wed, 08 Aug 2007 23:35:36 +0900,
>>> Tejun Heo <htejun@gmail.com> wrote:
>>>
>>>> Does the attached patch happen to fix the problem?
>>> Indeed it does; thanks!
>> Yeah, you seem to have 32bit off_t.  UINT_MAX overflows, so...
> 
> Weird.  And we have it opening the directory O_LARGEFILE.
> 
> I have no problems with the fix though.

It's probably because of struct dirent definition used by readdir().  It
may differ depending on archs and glibc version but IIRC the backend
implementation of all directory listing functions in glibc were the
same.  It opens with O_LARGEFILE and use getdents64 to get full info
then clip it to whatever limit the called API imposes.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19656 is a reply to message #19655] Wed, 08 August 2007 15:16 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Tejun Heo wrote:
> Eric W. Biederman wrote:
>> Tejun Heo <htejun@gmail.com> writes:
>>
>>> Cornelia Huck wrote:
>>>> On Wed, 08 Aug 2007 23:35:36 +0900,
>>>> Tejun Heo <htejun@gmail.com> wrote:
>>>>
>>>>> Does the attached patch happen to fix the problem?
>>>> Indeed it does; thanks!
>>> Yeah, you seem to have 32bit off_t.  UINT_MAX overflows, so...
>> Weird.  And we have it opening the directory O_LARGEFILE.
>>
>> I have no problems with the fix though.
> 
> It's probably because of struct dirent definition used by readdir().

More specifically, d_off field.  It's a bit twisted.  For the last
entry, filp->f_pos gets written into the field and gets wrapped while
being copied out to userland or in glibc.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 14/25] sysfs: Don't use lookup_one_len_kern [message #19658 is a reply to message #19596] Wed, 08 August 2007 15:35 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Eric W. Biederman wrote:
>> right?  I don't think we need to duplicate the code here.  Or is it
>> needed for later multi-sb thing?
> 
> Right.  We can do that as well. In practice in working code
> there is no real difference.
> 
> There is a little extra uniformity in rolling it ourselves, but
> not enough to worry about either way.
> 
> In the review/debug etc cycle it just wound up being a lot easier
> to roll the code myself.
> 
> By the time we get to lookup_one_len_kern it is almost impenetrable
> code in namei.c where sysfs_add_dentry tends is easier to comprehend,
> and to modify for debugging.

Yeap, agreed.  I agreed with this one too in the comment for the next
patch.  I guess I should have wrote here too.  Sorry about the trouble.

Thanks.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 20/25] sysfs: Rename Support multiple superblocks [message #19659 is a reply to message #19600] Wed, 08 August 2007 15:50 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Eric W. Biederman wrote:
> Welcome.  I will see what I can do with respect to cleaning up
> the names.
> 
> As for the return value of sysfs_get_dentry that is tricky.  In particular
> I have three specific cases the code needs to deal with.
> 
> - We got the dentry.
> - We did not get the dentry because for this super block there never
>   ever will be a dentry.
> - Some kind of error occurred in attempting to get the dentry.
> 
> Not getting a dentry because it is impossible I am currently handling
> with a NULL return.  I can equally use a specific error code to mean
> that as well.  It doesn't much matter.  So I guess the hunk in
> question could read:
> 
>>> +	list_for_each_entry(sb, &sysfs_fs_type.fs_supers, s_instances) {
>>> +		dentry = sysfs_get_dentry(sb, sd);
>>> +		if (dentry == ERR_PTR(-ENOENT))
>>> +			continue;
> 
> As long as we handle that class of error differently I really don't
> care.

Yeah, I think using -ENOENT is better; otherwise, my little head feels
like exploding.  :-)  More importantly, sysfs_get_dentry() seems like it
would deference ERR_PTR() value.  No?

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: [PATCH 0/25] Sysfs cleanups & tagged directory support [message #19660 is a reply to message #19601] Wed, 08 August 2007 16:03 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Hello,

Eric W. Biederman wrote:
>> More specifically, d_off field.  It's a bit twisted.  For the last
>> entry, filp->f_pos gets written into the field and gets wrapped while
>> being copied out to userland or in glibc.
> 
> That could do it, and glibc is crunching it.  Oh well, it is
> easy enough to avoid as long as our inode numbers are small which
> the idr allocator seems to ensure.

Yeah, now I think about it.  glibc throws out entries which don't fit in
the data structure specified by the called API, so it probably threw out
the last entry which has UINT_MAX in d_off which doesn't fit in the
readdir() return structure.  Using INT_MAX should be just fine as IDA
always allocates the first empty slot.  We can add paranoia check in ino
allocation path.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Re: alternative approached at tagged nodes [message #19661 is a reply to message #19585] Wed, 08 August 2007 16:31 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Okay, here's a different implementation of tagged nodes.  I just
compile tested it so I can guarantee it's broken.  I don't think
there's anything fundamentally wrong but I'm wrong fairly often, so if
you find something moronic, feel free to scream at me.

1. there's no enable_tagging() or fundamental restrictions on which
   types of nodes can be tagged.

2. no callback.  tags are set by sysfs_rename_dir_tagged().  symlinks
   follow the tag of its target_sd.  This limits taggable node types
   to dirs and symlinks.

3. in all paths between the root and leaf nodes, only zero or one
   tagged sd exists.  all children of a tagged sd have NULL s_tag but
   are considered tagged the same as the tagged ancestor.

4. untagged entries are visible in all supers unless there's a
   matching tagged entry overshadowing it.  tagged entry is only
   visible in the  matching tagged super.

5. due to #3, children in a directory which belong to the same tag can
   be looked up using NULL tag.  Most leaf node ops don't need to be
   changed.

6. symlink removal is different.  we either need to modify the
   interface to take target_sd or implement new one.  actually, I
   think linking symlinks to target_sd and renaming/removing them
   automagically would be pretty good.

Thanks.  What do you think?

JUST FOR BRAIN STORMING.  DO NOT APPLY.
---
 fs/sysfs/bin.c        |    2 
 fs/sysfs/dir.c        |  122 ++++++++++++++++++++++++++++++++++++++++----------
 fs/sysfs/file.c       |    4 -
 fs/sysfs/group.c      |    2 
 fs/sysfs/inode.c      |    5 +-
 fs/sysfs/mount.c      |   44 ++++++++++++++++--
 fs/sysfs/symlink.c    |    8 ++-
 fs/sysfs/sysfs.h      |   11 +++-
 include/linux/sysfs.h |    4 +
 9 files changed, 167 insertions(+), 35 deletions(-)

Index: work/fs/sysfs/dir.c
===================================================================
--- work.orig/fs/sysfs/dir.c
+++ work/fs/sysfs/dir.c
@@ -387,7 +387,7 @@ void sysfs_addrm_start(struct sysfs_addr
  */
 int sysfs_add_one(struct sysfs_addrm_cxt *acxt, struct sysfs_dirent *sd)
 {
-	if (sysfs_find_dirent(acxt->parent_sd, sd->s_name))
+	if (__sysfs_find_dirent(acxt->parent_sd, sd->s_name, sd->s_tag, 0))
 		return -EEXIST;
 
 	sd->s_parent = sysfs_get(acxt->parent_sd);
@@ -526,6 +526,24 @@ void sysfs_addrm_finish(struct sysfs_add
 	}
 }
 
+struct sysfs_dirent *__sysfs_find_dirent(struct sysfs_dirent *parent_sd,
+					 const unsigned char *name,
+					 const void *tag, int match_null)
+{
+	struct sysfs_dirent *null_sd = NULL, *sd;
+
+	for (sd = parent_sd->s_children; sd; sd = sd->s_sibling) {
+		if (strcmp(sd->s_name, name))
+			continue;
+		if (sd->s_tag == tag)
+			return sd;
+		if (!sd->s_tag && match_null)
+			null_sd = sd;
+	}
+
+	return null_sd;
+}
+
 /**
  *	sysfs_find_dirent - find sysfs_dirent with the given name
  *	@parent_sd: sysfs_dirent to search under
@@ -542,12 +560,7 @@ void sysfs_addrm_finish(struct sysfs_add
 struct sysfs_dirent *sysfs_find_dirent(struct sysfs_dirent *parent_sd,
 				       const unsigned char *name)
 {
-	struct sysfs_dirent *sd;
-
-	for (sd = parent_sd->s_children; sd; sd = sd->s_sibling)
-		if (!strcmp(sd->s_name, name))
-			return sd;
-	return NULL;
+	return __sysfs_find_dirent(parent_sd, name, NULL, 0);
 }
 
 /**
@@ -642,7 +655,8 @@ static struct dentry * sysfs_lookup(stru
 
 	mutex_lock(&sysfs_mutex);
 
-	sd = sysfs_find_dirent(parent_sd, dentry->d_name.name);
+	sd = __sysfs_find_dirent(parent_sd, dentry->d_name.name,
+				 sysfs_info(dentry->d_sb)->tag, 1);
 
 	/* no such entry */
 	if (!sd)
@@ -803,9 +817,19 @@ out:
 	return dentry;
 }
 
+const void *sysfs_dirent_tag(struct sysfs_dirent *sd)
+{
+	while (sd) {
+		if (sd->s_tag)
+			return sd->s_tag;
+		sd = sd->s_parent;
+	}
+	return sd->s_tag;
+}
+
 /**
  *	sysfs_get_dentry - get dentry for the given sysfs_dirent
- *	@sb: superblock of the dentry to return
+ *	@super: superblock of the dentry to return
  *	@sd: sysfs_dirent of interest
  *
  *	Get dentry for @sd.  Dentry is looked up if currently not
@@ -820,16 +844,23 @@ out:
  *	Pointer to found dentry on success, ERR_PTR() value on error.
  *	NULL if the sysfs dentry does not appear in the specified superblock
  */
-struct dentry *sysfs_get_dentry(struct super_block *sb, struct sysfs_dirent *sd)
+struct dentry *sysfs_get_dentry(struct super_block *super,
+				struct sysfs_dirent *sd)
 {
+	const void *tag;
 	struct sysfs_dirent *cur;
 	struct dentry *parent_dentry, *dentry;
 
+	/* bail if this sd won't show up in this superblock */
+	tag = sysfs_dirent_tag(sd);
+	if (tag && tag != sysfs_info(super)->tag)
+		return ERR_PTR(-ENOENT);
+
 	/* Find the first parent which has valid dentry.
 	 */
 	dentry = NULL;
 	cur = sd;
-	while (!(dentry = __sysfs_get_dentry(sb, cur))) {
+	while (!(dentry = __sysfs_get_dentry(super, cur))) {
 		if (cur->s_flags & SYSFS_FLAG_REMOVED) {
 			dentry = ERR_PTR(-ENOENT);
 			break;
@@ -926,27 +957,44 @@ err_out:
 	return error;
 }
 
-
-int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
+int sysfs_rename_dir_tagged(struct kobject * kobj, const char *new_name,
+			    const void *new_tag)
 {
 	struct sysfs_dirent *sd = kobj->sd;
 	struct list_head todo;
 	struct sysfs_rename_struct *srs;
 	struct inode *parent_inode = NULL;
 	const char *dup_name = NULL;
-	int error;
+	const void *old_tag;
+	int tag = 0, error;
 
 	INIT_LIST_HEAD(&todo);
 	mutex_lock(&sysfs_rename_mutex);
 
+	old_tag = sysfs_dirent_tag(sd);
+
+	/* need tagging? */
+	if (old_tag != new_tag) {
+		tag = 1;
+
+		/* can't tag again in tagged subtree */
+		error = -EINVAL;
+		if (old_tag != new_tag)
+			goto out;
+	} else
+		/* sd->s_tag can be either old_tag or NULL */
+		new_tag = sd->s_tag;
+
 	error = 0;
-	if (strcmp(sd->s_name, new_name) == 0)
+	if (!tag && (strcmp(sd->s_name, new_name) == 0))
 		goto out;	/* nothing to rename */
 
 	sysfs_grab_supers();
-	error = prep_rename(&todo, sd, sd->s_parent, new_name);
-	if (error)
-		goto out_release;
+	if (!tag) {
+		error = prep_rename(&todo, sd, sd->s_parent, new_name);
+		if (error)
+			goto out_release;
+	}
 
 	error = -ENOMEM;
 	mutex_lock(&sysfs_mutex);
@@ -959,7 +1007,7 @@ int sysfs_rename_dir(struct kobject * ko
 	mutex_lock(&sysfs_mutex);
 
 	error = -EEXIST;
-	if (sysfs_find_dirent(sd->s_parent, new_name))
+	if (__sysfs_find_dirent(sd->s_parent, new_name, new_tag, 0))
 		goto out_unlock;
 
 	/* rename kobject and sysfs_dirent */
@@ -974,6 +1022,7 @@ int sysfs_rename_dir(struct kobject * ko
 
 	dup_name = sd->s_name;
 	sd->s_name = new_name;
+	sd->s_tag = new_tag;
 
 	/* rename */
 	list_for_each_entry(srs, &todo, list) {
@@ -981,6 +1030,21 @@ int sysfs_rename_dir(struct kobject * ko
 		d_move(srs->old_dentry, srs->new_dentry);
 	}
 
+	/* If we are moving across superblocks drop the dcache entries */
+	if (tag) {
+		struct super_block *super;
+		struct dentry *dentry;
+
+		list_for_each_entry(super, &sysfs_fs_type.fs_supers, s_instances) {
+			dentry = __sysfs_get_dentry(super, sd);
+			if (!dentry)
+				continue;
+			shrink_dcache_parent(dentry);
+			d_drop(dentry);
+			dput(dentry);
+		}
+	}
+
 	error = 0;
 out_unlock:
 	mutex_unlock(&sysfs_mutex);
@@ -995,6 +1059,11 @@ out:
 	return error;
 }
 
+int sysfs_rename_dir(struct kobject * kobj, const char *new_name)
+{
+	return sysfs_rename_dir_tagged(kobj, new_name, kobj->sd->s_tag);
+}
+
 int sysfs_move_dir(struct kobject *kobj, struct kobject *new_parent_kobj)
 {
 	struct sysfs_dirent *sd = kobj->sd;
@@ -1013,6 +1082,11 @@ int sysfs_move_dir(struct kobject *kobj,
 	if (sd->s_parent == new_parent_sd)
 		goto out;	/* nothing to move */
 
+	/* can't move between parents belonging to different tags */
+	error = -EINVAL;
+	if (sysfs_dirent_tag(sd->s_parent) != sysfs_dirent_tag(new_parent_sd))
+		goto out;
+
 	sysfs_grab_supers();
 	error = prep_rename(&todo, sd, new_parent_sd, sd->s_name);
 	if (error)
@@ -1041,7 +1115,7 @@ again:
 	mutex_lock(&sysfs_mutex);
 
 	error = -EEXIST;
-	if (sysfs_find_dirent(new_parent_sd, sd->s_name))
+	if (__sysfs_find_dirent(new_parent_sd, sd->s_name, sd->s_tag, 0))
 		goto out_unlock;
 
 	error = 0;
@@ -1080,8 +1154,9 @@ static inline unsigned char dt_type(stru
 
 static int sysfs_readdir(struct file * filp, void * dirent, filldir_t filldir)
 {
-	struct dentry *dentry = filp->f_path.dentry;
-	struct sysfs_dirent * parent_sd = dentry->d_fsdata;
+	struct dentry *parent = filp->f_path.dentry;
+	struct sysfs_dirent * parent_sd = parent->d_fsdata;
+	const void *tag = sysfs_info(filp->f_dentry->d_sb)->tag;
 	struct sysfs_dirent *pos;
 	ino_t ino;
 
@@ -1110,6 +1185,9 @@ static int sysfs_readdir(struct file * f
 			const char * name;
 			int len;
 
+			if (pos->s_tag && pos->s_tag != tag)
+				continue;
+
 			name = pos->s_name;
 			len = strlen(name);
 			filp->f_pos = ino = pos->s_ino;
Index: work/fs/sysfs/sysfs.h
===================================================================
--- work.orig/fs/sysfs/sysfs.h
+++ work/fs/sysfs/sysfs.h
@@ -26,6 +26,7 @@ struct sysfs_dirent {
 	struct sysfs_dirent	* s_sibling;
 	struct sysfs_dirent	* s_children;
 	const char		* s_name;
+	const void		* s_tag;
 
 	union {
 		struct sysfs_elem_dir		dir;
@@ -51,7 +52,8 @@ struct sysfs_addrm_cxt {
 };
 
 struct sysfs_super_info {
-	int	grabbed;
+	int		grabbed;
+	const void	*tag;
 };
 
 #define sysfs_info(SB) ((struct sysfs_super_info *)(SB)->s_fs_info)
@@ -63,6 +65,7 @@ extern struct file_system_type sysfs_fs_
 void sysfs_grab_supers(void);
 void sysfs_release_supers(void);
 
+extern const void *sysfs_dirent_tag(struct sysfs_dirent *sd);
 extern struct dentry *sysfs_get_dentry(struct super_block *sb, struct sysfs_dirent *sd);
 extern struct sysfs_dirent *sysfs_get_active(struct sysfs_dirent *sd);
 extern void sysfs_put_active(struct sysfs_dirent *sd);
@@ -79,6 +
...

Re: [PATCH 20/25] sysfs: Rename Support multiple superblocks [message #19662 is a reply to message #19605] Wed, 08 August 2007 16:42 Go to previous message
Tejun Heo is currently offline  Tejun Heo
Messages: 184
Registered: November 2006
Senior Member
Eric W. Biederman wrote:
>> 	/* Find the first parent which has valid dentry.
>> 	 */
>> 	dentry = NULL;
>> 	cur = sd;
>> 	while (!(dentry = __sysfs_get_dentry(sb, cur))) {
>> 		if (cur->s_flags & SYSFS_FLAG_REMOVED) {
>> 			dentry = ERR_PTR(-ENOENT);
>> 			break;
>> 		}
>> 		cur = cur->s_parent;
>> 	}
> 
> Here we depend on the fact that sysfs_root is pointed to
> by sb->s_root so we know it will always have a dentry.

Hmmm... dentry could be ERR_PTR(-ENOENT) here if the REMOVED flag test
succeeded, right?

>> 	/* from the found dentry, look up depth times */
>> 	while (dentry->d_fsdata != sd) {

And then dereferenced.  The REMOVED test should never succeed there, so
we're probably in the clear but still the code looks a bit scary.

-- 
tejun
_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers
Previous Topic: [PATCH] Cleanup oops/bug reports on i386
Next Topic: containers development plans (Aug 8 version)
Goto Forum:
  


Current Time: Sat Oct 25 16:11:56 GMT 2025

Total time taken to generate the page: 0.15838 seconds