ext4 checksum errors after upgrade to OpenVZ 7.0.16 [message #53743]
Thu, 06 May 2021 06:30
dmc_dtc
Messages: 15 Registered: May 2014 Location: Serbia
Junior Member |
Hi,
Me again. I updated my hybrid CentOS 7 + OpenVZ setup to 7.0.16 and got burned badly. I don't know whether this issue has something to do with my setup (CentOS 7 with the OpenVZ 7 repo), but after upgrading and rebooting I got random corruption all over the file systems and lost much of my data, so I had to recover from backup (still in progress; it will take days).
Anyway, if anyone is interested, here are the logs.
For example, running du -s -m /vz/private/* produces this in dmesg:
May 06 03:21:55 CentOS-75-64-minimal kernel: EXT4-fs error (device md2): ext4_iget:4488: inode #10095922: comm du: checksum invalid
May 06 03:21:55 CentOS-75-64-minimal kernel: EXT4-fs error (device md2): ext4_iget:4488: inode #15342712: comm du: checksum invalid
May 06 03:21:55 CentOS-75-64-minimal kernel: EXT4-fs error (device md2): ext4_iget:4488: inode #15342731: comm du: checksum invalid
May 06 03:21:56 CentOS-75-64-minimal kernel: EXT4-fs error (device md2): ext4_iget:4488: inode #7210104: comm du: checksum invalid
May 06 03:21:56 CentOS-75-64-minimal kernel: EXT4-fs error (device md2): ext4_iget:4488: inode #10093734: comm du: checksum invalid
May 06 03:21:57 CentOS-75-64-minimal kernel: EXT4-fs error (device md2): ext4_iget:4488: inode #9179121: comm du: checksum invalid
And on the filesystem you get:
du: cannot access '/vz/private/2b35aaf1-6d62-49bf-a885-48288be123c4/fs/var/cache/yum/x86_64/7/updates': Input/output error
Then you have to reboot and fix it with e2fsck, but you still end up with corrupted filesystems on dom0 and the guests.
I think I found a workaround, which is to disable ext4 checksums on the partitions. For now I have two servers that are no longer corrupting data after running:
tune2fs -O ^metadata_csum /dev/md2
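To make the order of the workaround explicit, here is a minimal sketch that only prints the commands for a given device and does not run anything. The function name print_csum_workaround is made up for illustration, the -f flag on e2fsck (force a check) is my own assumption and not something the steps in this post specify, and it assumes the filesystem is not mounted read-write when the printed commands are actually run.

```shell
#!/bin/sh
# Hypothetical helper: prints (does NOT execute) the workaround commands
# for one block device, in the order described in this post.
print_csum_workaround() {
    dev="$1"
    # Step 1: turn off the metadata_csum feature (command taken from the post).
    printf '%s\n' "tune2fs -O ^metadata_csum $dev"
    # Step 2: check the filesystem afterwards; -f is an assumption here.
    printf '%s\n' "e2fsck -f $dev"
}
```

Calling print_csum_workaround /dev/md2 lists the two commands so they can be reviewed before being run by hand, ideally from a rescue system and with a backup in place.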
I have no idea what went wrong. I see that the e2fsprogs-1.42.9-16 RPM package from CentOS 7 was replaced with a vz7 package of the same name.
EDIT: I have checked the system log. It seems that with metadata_csum enabled on an ext4 drive, the newer e2fsprogs either detected earlier errors that had gone unnoticed (possibly?) and is now fixing the resulting inconsistencies, or it somehow corrupted the filesystem itself. I assume the errors were already there and are only being noticed now. Disabling metadata_csum on the ext4 filesystem and then running e2fsck on it fixed the problem (at the time of writing, the fs is clean).
The boot log has been mentioning metadata_csum for a few months, but only today, AFTER the upgrade to OpenVZ 7.0.16, did it cause a problem:
Mar 21 23:42:59 localhost systemd-fsck[509]: /dev/md2 has unsupported feature(s): metadata_csum
Mar 21 23:42:59 localhost systemd-fsck[509]: e2fsck: Get a newer version of e2fsck!
Mar 21 23:42:59 localhost systemd-fsck[509]: fsck failed with error code 8.
Mar 21 23:42:59 localhost systemd-fsck[509]: Ignoring error.
May 02 15:20:06 localhost systemd-fsck[515]: /dev/md2 has unsupported feature(s): metadata_csum
May 02 15:20:06 localhost systemd-fsck[515]: e2fsck: Get a newer version of e2fsck!
May 02 15:20:06 localhost systemd-fsck[515]: fsck failed with error code 8.
May 02 15:20:06 localhost systemd-fsck[515]: Ignoring error.
May 06 00:50:35 localhost systemd-fsck[509]: /dev/md2 has unsupported feature(s): metadata_csum
May 06 00:50:35 localhost systemd-fsck[509]: e2fsck: Get a newer version of e2fsck!
May 06 00:50:35 localhost systemd-fsck[509]: fsck failed with error code 8.
May 06 00:50:35 localhost systemd-fsck[509]: Ignoring error.
May 06 03:26:36 localhost systemd-fsck[509]: /dev/md2: Journal superblock has an unknown incompatible feature flag set.
May 06 03:26:36 localhost systemd-fsck[509]: /dev/md2: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
May 06 03:26:36 localhost systemd-fsck[509]: (i.e., without -a or -p options)
May 06 03:26:36 localhost systemd-fsck[509]: fsck failed with error code 4.
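Since the warning had been sitting in the boot log for months before anything broke, a small filter like the one below could help spot it on other machines. This is only a sketch: the function name fsck_feature_warnings is made up, and it simply matches the literal message text shown in the log above on whatever log text is piped into it.

```shell
#!/bin/sh
# Hypothetical filter: keeps only the log lines where fsck complains
# about a feature it does not support. Reads log text from stdin.
fsck_feature_warnings() {
    # -F treats the pattern as a fixed string, so "(s)" is matched literally.
    grep -F 'has unsupported feature(s)'
}
```

For example, journalctl | fsck_feature_warnings (or the same filter on a saved log file) would list the affected devices, if any.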
Anyway, I am writing this here to URGE anyone using this combo not to update to 7.0.16, and so that someone can try to fix this if it is also present in Virtuozzo.
EDIT2: Apparently my provider (Hetzner) used the metadata_csum option when creating the ext4 filesystem, which conflicts with the OpenVZ and CentOS e2fsprogs (too old to support it). CentOS 7 does not usually enable this by default, but some online images do include it (you can check with dumpe2fs -h /dev/XXX, where the output mentions metadata_csum). Hopefully I have described the problem and the solution for everyone who may be affected. I am sure the same problem will appear on Virtuozzo Linux if metadata_csum is enabled, so it is probably best to check that metadata_csum is disabled before installing Virtuozzo.
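The check mentioned above can be scripted roughly like this. It is a sketch under assumptions: the function name has_metadata_csum is made up, and it assumes the dumpe2fs -h output contains a line starting with "Filesystem features:" that lists the enabled features separated by spaces.

```shell
#!/bin/sh
# Hypothetical check: reads `dumpe2fs -h` output on stdin and returns 0
# if the feature list contains metadata_csum, non-zero otherwise.
has_metadata_csum() {
    # -w matches whole words only, so metadata_csum_seed alone does not count.
    grep '^Filesystem features:' | grep -qw 'metadata_csum'
}
```

Usage would be something like: dumpe2fs -h /dev/XXX 2>/dev/null | has_metadata_csum && echo "metadata_csum is enabled"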
Thank you
Vladimir
>> dmc / dtc <<
[Updated on: Thu, 06 May 2021 12:59]