very odd error caused by a particular VE [message #40966]
Thu, 28 October 2010 23:44
vmvmvm
Messages: 51 Registered: January 2006
Member
Hi all,
I have what appears to be a very odd problem.
Today dmesg started reporting a stream of errors like these:
ata1.00: status: { DRDY ERR }
ata1.00: error: { UNC }
ata1.00: configured for UDMA/133
ata1: EH complete
SCSI device sda: 1465149168 512-byte hdwr sectors (750156 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
ata1.00: exception Emask 0x0 SAct 0xe SErr 0x0 action 0x0
ata1.00: irq_stat 0x40000008
ata1.00: cmd 60/08:10:87:75:bc/00:00:56:00:00/40 tag 2 ncq 4096 in
res 41/40:08:8a:75:bc/cb:00:56:00:00/00 Emask 0x409 (media error) <F>
ata1.00: status: { DRDY ERR }
ata1.00: error: { UNC }
ata1.00: configured for UDMA/133
sd 0:0:0:0: SCSI error: return code = 0x08000002
sda: Current [descriptor]: sense key: Medium Error
Add. Sense: Unrecovered read error - auto reallocate failed
Descriptor sense data with sense descriptors (in hex):
72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
56 bc 75 8a
end_request: I/O error, dev sda, sector 1455191434
ata1: EH complete
SCSI device sda: 1465149168 512-byte hdwr sectors (750156 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
etc.
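For reference, the failing sector can be cross-checked from the log itself. The sketch below assumes the usual libata taskfile field order (status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and that the trailing sense bytes "56 bc 75 8a" hold the LBA in big-endian order. The function names are made up for illustration.

```python
# Decode the LBA from a libata taskfile string and from SCSI sense bytes.
# Assumption: the taskfile layout is
#   status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
# (for a "cmd" line the first two fields are command/feature instead).


def lba_from_taskfile(tf: str) -> int:
    """Return the 48-bit LBA encoded in a libata taskfile string."""
    _status, low, high, _device = tf.split("/")
    _err, _nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hob_feat, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in high.split(":")
    )
    return (
        (hob_lbah << 40)
        | (hob_lbam << 32)
        | (hob_lbal << 24)
        | (lbah << 16)
        | (lbam << 8)
        | lbal
    )


def lba_from_sense(hex_bytes: str) -> int:
    """Interpret space-separated hex bytes as a big-endian LBA."""
    return int.from_bytes(bytes.fromhex(hex_bytes), "big")


if __name__ == "__main__":
    # "res" line from the log: the sector where the read failed.
    print(lba_from_taskfile("41/40:08:8a:75:bc/cb:00:56:00:00/00"))  # 1455191434
    # "cmd" line from the log: first sector of the 8-sector (4096-byte) read.
    print(lba_from_taskfile("60/08:10:87:75:bc/00:00:56:00:00/40"))  # 1455191431
    # Trailing sense bytes from the log.
    print(lba_from_sense("56 bc 75 8a"))  # 1455191434
```

Both the "res" line and the sense data decode to 1455191434, the same sector that end_request reports, so all three log lines appear to point at a single unreadable sector inside an 8-sector read starting at 1455191431.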
Load climbs to between 40 and 60 and all I/O becomes extremely slow.
I assumed perhaps a drive was going bad. However, I've discovered that if one particular VE is shut down, the errors stop and load drops back to around 0.
I next assumed that this particular VE must use the damaged part of the disk, so that when the VE is running that area is accessed more often, triggering the errors.
However, this does not seem to be the case. I can access that part of the disk: I copied /vz/private/problem-ve-number to another directory, and load did not rise abnormally and dmesg did not report any errors.
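Copying the directory only shows that the VE's files are readable. It does not prove the reported sector itself was read. A more direct check might be to read that sector from the raw device. This is only a sketch: read_sector is a hypothetical helper, it assumes 512-byte sectors as reported in the log, and it needs root to open /dev/sda.

```python
import os

SECTOR_SIZE = 512  # matches "512-byte hdwr sectors" in the dmesg output


def read_sector(path: str, sector: int, count: int = 1) -> bytes:
    """Read `count` sectors starting at `sector` from a device or image file.

    Raises OSError (typically EIO) if the kernel cannot read the data.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # pread reads at an absolute byte offset without moving the file position.
        return os.pread(fd, count * SECTOR_SIZE, sector * SECTOR_SIZE)
    finally:
        os.close(fd)


if __name__ == "__main__":
    try:
        data = read_sector("/dev/sda", 1455191434)
        print("read", len(data), "bytes without error")
    except OSError as exc:
        print("read failed:", exc)
```

If this read fails while the copy of the VE's private area succeeds, the bad sector presumably lies outside those files, for example in another partition or in free space.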
Any ideas what could be going on? What could that particular VE be doing that is causing this? I'm at a loss!
Thanks.