OpenVZ Forum


megaraid_mbox: garbage in file [message #2983] Thu, 04 May 2006 18:46
vaverin
Hello all,

I've investigated a customer's complaint about instability on their node and found a
strange effect: reading from some files leads to
"attempt to access beyond end of device" messages.

I've checked the filesystem, the memory on the node and the motherboard BIOS version, but
that did not help: the issue is still reproduced by simply reading a file.

Reproducer is simple:

echo 0xffffffff >/proc/sys/dev/scsi/logging_level ;
cat /vz/private/101/root/etc/ld.so.cache >/tmp/ttt ;
echo 0 >/proc/sys/dev/scsi/logging_level

It leads to the following messages in dmesg:

sd_init_command: disk=sda, block=871769260, count=26
sda : block=871769260
sda : reading 26/26 512 byte blocks.
scsi_add_timer: scmd: f79ed980, time: 7500, (c02b1420)
sd 0:1:0:0: send 0xf79ed980 sd 0:1:0:0:
command: Read (10): 28 00 33 f6 24 ac 00 00 1a 00
buffer = 0xf7cfb540, bufflen = 13312, done = 0xc0366b40, queuecommand 0xc0344010
leaving scsi_dispatch_cmnd()
scsi_delete_timer: scmd: f79ed980, rtn: 1
sd 0:1:0:0: done 0xf79ed980 SUCCESS 0 sd 0:1:0:0:
command: Read (10): 28 00 33 f6 24 ac 00 00 1a 00
scsi host busy 1 failed 0
sd 0:1:0:0: Notifying upper driver of completion (result 0)
sd_rw_intr: sda: res=0x0
26 sectors total, 13312 bytes done.
use_sg is 4
attempt to access beyond end of device
sda6: rw=0, want=1044134458, limit=951401367
Buffer I/O error on device sda6, logical block 522067228
attempt to access beyond end of device
sda6: rw=0, want=1178878530, limit=951401367
Buffer I/O error on device sda6, logical block 589439264
...

As far as I can see, the first read operation finished without errors, but when we
read the rest of the file we get an access beyond the end of the device.
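
For reference, the Read (10) command in the log above can be decoded by hand. The sketch below is illustrative Python, not part of the original report, and the helper name `decode_read10` is made up for this example. It assumes the standard Read (10) layout: opcode 0x28 in byte 0, a big-endian 32-bit LBA in bytes 2-5 and a big-endian 16-bit transfer length in bytes 7-8.

```python
import struct

def decode_read10(cdb_hex: str, block_size: int = 512):
    """Decode a SCSI Read (10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a Read (10) CDB")
    # Bytes 2-5 hold the logical block address, bytes 7-8 the transfer
    # length in blocks; both are big-endian.
    lba, = struct.unpack(">I", cdb[2:6])
    blocks, = struct.unpack(">H", cdb[7:9])
    return lba, blocks, blocks * block_size

lba, blocks, nbytes = decode_read10("28 00 33 f6 24 ac 00 00 1a 00")
print(lba, blocks, nbytes)  # 871769260 26 13312
```

The decoded values agree with `block=871769260, count=26` and `bufflen = 13312` in the dmesg output, so the command itself looks well formed.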

It was originally found on Virtuozzo kernels (2.6.8.1-based, x86 32-bit), and then
reproduced on RHEL4 kernels 2.6.9-22.EL and 2.6.9-34.EL,
on FC5 (2.6.16-1.2096_FC5) and on vanilla 2.6.16 kernels.

However, when I first read these blocks using dd with bs=512 or bs=1024, it works
without any trouble. After that I can cat this file, copy it, map it and so on, and
get the correct content without any errors. Moreover, the issue can be worked around
by limiting memory: it helps to use mem=4G on the kernel command line, or to use kernels
without PAE support.

I've noticed that we attempt to access blocks with strange numbers:

522067228 = 0x1f1e1d1c
589439264 = 0x23222120 and so on.

Then I found that I had read strange garbage from the file:

# hexdump /tmp/ttt
0000000 0100 0302 0504 0706 0908 0b0a 0d0c 0f0e
0000010 1110 1312 1514 1716 1918 1b1a 1d1c 1f1e
0000020 2120 2322 2524 2726 2928 2b2a 2d2c 2f2e
0000030 3130 3332 3534 3736 3938 3b3a 3d3c 3f3e
0000040 4140 4342 4544 4746 4948 4b4a 4d4c 4f4e
0000050 5150 5352 5554 5756 5958 5b5a 5d5c 5f5e
0000060 6160 6362 6564 6766 6968 6b6a 6d6c 6f6e
0000070 7170 7372 7574 7776 7978 7b7a 7d7c 7f7e
0000080 0100 0302 0504 0706 0908 0b0a 0d0c 0f0e
0000090 1110 1312 1514 1716 1918 1b1a 1d1c 1f1e
00000a0 2120 2322 2524 2726 2928 2b2a 2d2c 2f2e
...
00000f0 7170 7372 7574 7776 7978 7b7a 7d7c 7f7e
0000100 0100 0302 0504 0706 0908 0b0a 0d0c 0f0e
...

Then I discovered that "access beyond end of device" occurs because the same
garbage is read from the 13th (indirect) block of the file.
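
The link between the garbage and the bogus block numbers can be checked with a little arithmetic. The following Python sketch is only an illustration and none of its names come from the kernel. It assumes that an ext3 indirect block is an array of little-endian 32-bit block numbers, that the filesystem uses 1 KB blocks (two 512-byte sectors each), and that the "want" value printed by the block layer is the sector just past the end of the request.

```python
import struct

# The garbage seen in the file: bytes 0..127 repeated to fill a 1 KB block.
garbage = bytes(range(128)) * 8

# Interpret the block as an array of little-endian 32-bit block numbers.
entries = struct.unpack("<256I", garbage)

SECTORS_PER_BLOCK = 1024 // 512

def end_sector(block: int) -> int:
    """Sector just past the end of a one-block read ("want" in the log)."""
    return block * SECTORS_PER_BLOCK + SECTORS_PER_BLOCK

for block in (entries[7], entries[8]):
    print(f"block {block} = {block:#x}, want={end_sector(block)}")
```

Under these assumptions, entries 7 and 8 of the garbage block give 522067228 (0x1f1e1d1c) and 589439264 (0x23222120), with want=1044134458 and want=1178878530, which are the values in the dmesg output and both lie beyond limit=951401367.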

I've tried to understand where this garbage comes from and found that it is already
present in the data buffers at the level of the megaraid_mbox driver functions.

Could somebody explain to me what this strange garbage (repeated 0...127) is?
Seokmann, Atul, could you please tell me if it is a known issue?
James, from my point of view it does not look like a driver bug, but perhaps I'm
wrong?

I suppose it is a MegaRAID SATA 150-4 firmware issue. I've seen similar firmware
fixes for MegaRAID SATA 300 controllers ("Support PAE mode fixed" and "Fixed the
operating systems using more than 4 gig of memory"). Could the same
issues be present in the SATA 150-4 firmware? Or maybe I'm using a broken controller?

Hardware Environment:
Tyan B2881
2 x Opteron 246
8G RAM
LSI MegaRAID SATA 150-4
/vz partition formatted as ext3 with 1KB block size

megaraid cmm: 2.20.2.6 (Release Date: Mon Mar 7 00:01:03 EST 2005)
megaraid: 2.20.4.7 (Release Date: Mon Nov 14 12:27:22 EST 2005)
megaraid: probe new device 0x1000:0x1960:0x1000:0x4523: bus 1:slot 4:func 0
ACPI: PCI Interrupt 0000:01:04.0[A] -> GSI 29 (level, low) -> IRQ 16
megaraid: fw version:[713N] bios version:[G119]
scsi0 : LSI Logic MegaRAID driver
scsi[0]: scanning scsi channel 0 [Phy 0] for non-raid devices
scsi[0]: scanning scsi channel 1 [virtual] for logical drives
Vendor: MegaRAID Model: LD 0 RAID1 476G Rev: 713N
Type: Direct-Access ANSI SCSI revision: 02


I would also note that, from my point of view, this issue looks similar to
http://bugzilla.kernel.org/show_bug.cgi?id=6052

It seems to me that both of our cases may have the same cause.

Thank you,
Vasily Averin

SWsoft Virtuozzo/OpenVZ Linux kernel team
Re: megaraid_mbox: garbage in file [message #2986 is a reply to message #2983] Fri, 05 May 2006 05:34
vaverin
James Bottomley wrote:
> On Thu, 2006-05-04 at 22:48 +0400, Vasily Averin wrote:
>>attempt to access beyond end of device
>>sda6: rw=0, want=1044134458, limit=951401367
>>Buffer I/O error on device sda6, logical block 522067228
>
> That's not a SCSI error. It's coming from the block layer and it means
> that the filesystem tried to access beyond the end of the listed
> partition. Why that happened is anyone's guess. I suspect the actual
> filesystem is corrupt somehow, but how it came to be, I don't know.

James,

The issue is that a SCSI read command that completed successfully returned garbage
(repeated 0...127, see the hexdump in my first letter) instead of the correct file content.
The "attempt to access beyond end of device" messages occur because the same garbage
is read from the indirect block. I found this garbage already present in the data buffers
at the level of the megaraid driver functions.

I would note that if I read the same file using dd with bs=1024 or bs=512,
I get the correct file content.

When I use a kernel with a 4GB memory limit, the same cat command also returns the
correct file content, without any garbage.

The question is: what is this strange garbage? Have you seen it before?
Could it be a driver-related issue, or is it broken hardware?
And why can I work around this issue by using only 4GB of memory?

Thank you,
Vasily Averin

SWsoft Virtuozzo/OpenVZ Linux kernel team
Re: megaraid_mbox: garbage in file [message #2990 is a reply to message #2986] Fri, 05 May 2006 09:18
vaverin
Small update:

When I use
cat /vz/private/101/root/etc/ld.so.cache >/tmp/ttt
I get "access beyond end of device" and garbage in the buffers.

Then I issue the same SCSI read command using the sgp_dd utility:
sgp_dd count=26 if=/dev/sg0 skip=871769260 of=/tmp/ttt.sgp
and get the correct file content without any errors.

The only difference that I see is use_sg=3 for cat and use_sg=1 for dd.

The dmesg output with SCSI debugging and the output files are attached.

The node will be accessible for some time and I can perform some experiments. If
somebody wants, I can ask the customer for access to the node.

Thank you,
Vasily Averin

SWsoft Virtuozzo/OpenVZ Linux kernel team


Linux version 2.6.16 (vvs@dhcp0-157) (gcc version 3.3.5 20050117 (prerelease) (SUSE Linux)) #1 SMP Thu May 4 17:49:16 MSD 2006
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000fbff0000 (usable)
BIOS-e820: 00000000fbff0000 - 00000000fbfff000 (ACPI data)
BIOS-e820: 00000000fbfff000 - 00000000fc000000 (ACPI NVS)
BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 0000000200000000 (usable)
7296MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000ff780
NX (Execute Disable) protection: active
On node 0 totalpages: 2097152
DMA zone: 4096 pages, LIFO batch:0
DMA32 zone: 0 pages, LIFO batch:0
Normal zone: 225280 pages, LIFO batch:31
HighMem zone: 1867776 pages, LIFO batch:31
DMI 2.3 present.
ACPI: RSDP (v002 ACPIAM ) @ 0x000f6dd0
ACPI: XSDT (v001 A M I OEMXSDT 0x12000527 MSFT 0x00000097) @ 0xfbff0100
ACPI: FADT (v001 A M I OEMFACP 0x12000527 MSFT 0x00000097) @ 0xfbff0281
ACPI: MADT (v001 A M I OEMAPIC 0x12000527 MSFT 0x00000097) @ 0xfbff0380
ACPI: OEMB (v001 A M I OEMBIOS 0x12000527 MSFT 0x00000097) @ 0xfbfff040
ACPI: SRAT (v001 A M I OEMSRAT 0x12000527 MSFT 0x00000097) @ 0xfbff39b0
ACPI: HPET (v001 A M I OEMHPET 0x12000527 MSFT 0x00000097) @ 0xfbff3ac0
ACPI: ASF! (v001 AMIASF AMDSTRET 0x00000001 INTL 0x02002026) @ 0xfbff3b00
ACPI: DSDT (v001 0AAAA 0AAAA001 0x00000001 INTL 0x02002026) @ 0x00000000
ACPI: PM-Timer IO Port: 0x5008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:5 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:5 APIC version 16
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x82] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x83] disabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
ACPI: IOAPIC (id[0x03] address[0xfebff000] gsi_base[24])
IOAPIC[1]: apic_id 3, version 17, address 0xfebff000, GSI 24-27
ACPI: IOAPIC (id[0x04] address[0xfebfe000] gsi_base[28])
IOAPIC[2]: apic_id 4, version 17, address 0xfebfe000, GSI 28-31
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode: Flat. Using 3 I/O APICs
ACPI: HPET id: 0x102282a0 base: 0xfec01000
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at fc400000 (gap: fc000000:03780000)
Built 1 zonelists
Kernel command line: ro root=LABEL=/1 debug panic=5
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
mapped IOAPIC to ffffb000 (febff000)
mapped IOAPIC to ffffa000 (febfe000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
CPU 0 irqstacks, hard=c0565000 soft=c0545000
PID hash table entries: 4096 (order: 12, 65536 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 8248540k/8388608k available (3118k kernel code, 73068k reserved, 940k data, 288k init, 7405504k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Using HPET for base-timer
Using HPET for gettimeofday
Detected 1990.876 MHz processor.
Using hpet for high-res timesource
Calibrating delay using timer specific routine.. 3987.38 BogoMIPS (lpj=7974771)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 078bfbff e1d3fbff 00000000 00000000 00000000 00000000 00000000
CPU: After vendor identify, caps: 078bfbff e1d3fbff 00000000 00000000 00000000 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU: After all inits, caps: 078bfbff e1d3fbff 00000000 00000010 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Checking 'hlt' instruction... OK.
CPU0: AMD Opteron(tm) Processor 246 stepping 0a
Booting processor 1/1 eip 2000
CPU 1 irqstacks, hard=c0566000 soft=c0546000
Initializing CPU#1
Calibrating delay using timer specific routine.. 3981.36 BogoMIPS (lpj=7962728)
CPU: After generic identify, caps: 078bfbff e1d3fbff 00000000 00000000 00000000 00000000 00000000
CPU: After vendor identify, caps: 078bfbff e1d3fbff 00000000 00000000 00000000 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU: After all inits, caps: 078bfbff e1d3fbff 00000000 00000010 00000000 00000000 00000000
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: AMD Opteron(tm) Processor 246 stepping 0a
Total of 2 processors activated (7968.74 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=0 pin2=0
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
migration_cost=4000
checking if image is initramfs...it isn't (no cpio magic); looks like an initrd
Freeing initrd memory: 589k freed
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xf0031, last bus=3
PCI: Using configuration type 1
ACPI: Subsystem revision 20060127
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
Boot video device is 0000:03:06.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLA._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.GOLB._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 *5 6 7 9 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 *9 10 11 12 14 15)
SCSI subsystem initialized
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
PCI: Bridge: 0000:00:06.0
IO window: b000-bfff
MEM window: fca00000-feafffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:0a.0
IO window: disabled.
MEM window: fc900000-fc9fffff
PREFETCH window: ff500000-ff5fffff
PCI: Bridge: 0000:00:0b.0
IO window: disabled.
MEM window: fc800000-fc8fffff
PREFETCH window: ff400000-ff4fffff
highmem bounce pool size: 64 pages
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
Initializing Cryptographic API
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
PCI: MSI quirk detected. pci_msi_quirk set.
PCI: MSI quirk detected. pci_msi_quirk set.
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
Real Time Clock Driver v1.12ac
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize
Compaq SMART2 Driver (v 2.6.0)
HP CISS Driver (v 2.6.10)
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
AMD8111: IDE controller at PCI slot 0000:00:
...

Attachment: ttt (Size: 16.00KB)
Attachment: ttt.sgp (Size: 13.00KB)
Re: megaraid_mbox: garbage in file [message #2994 is a reply to message #2983] Thu, 04 May 2006 22:59
James Bottomley
On Thu, 2006-05-04 at 22:48 +0400, Vasily Averin wrote:
> attempt to access beyond end of device
> sda6: rw=0, want=1044134458, limit=951401367
> Buffer I/O error on device sda6, logical block 522067228

That's not a SCSI error. It's coming from the block layer and it means
that the filesystem tried to access beyond the end of the listed
partition. Why that happened is anyone's guess. I suspect the actual
filesystem is corrupt somehow, but how it came to be, I don't know.

James
Re: megaraid_mbox: garbage in file [message #2995 is a reply to message #2986] Fri, 05 May 2006 15:59
James Bottomley
On Fri, 2006-05-05 at 09:37 +0400, Vasily Averin wrote:
> The issue is that the correctly finished scsi read command return me garbage
> (repeated 0 ...127 -- see hexdump in my first letter) instead correct file content.
> "attempt to access beyond end of device" messages occurs due the same garbage
> readed from the Indirect block. I found this garbage present in data buffers
> beginning at megaraid driver functions.
>
> I would note that if I read the same file by using dd with bs=1024 or bs=512 --
> I get correct file content.
>
> When I use kernel with 4Gb memory limit -- the same cat command return me
> correct file content too, without any garbage.
>
> Question is what it is the strange garbage? Have you seen it earlier?
> Is it possible that it is some driver-related issue or it is broken hardware?
> And why I can workaround this issue by using only 4Gb memory?

This is really odd ... if the controller can't reach *any* memory above
32 bits, then, on an 8GB machine you'd expect corruption all over the
place since most user pages come from the top of highmem.

The first thing to try, since you have an opteron system, is to get rid
of highmem entirely and use a 64 bit kernel (just to make sure we're not
running into some annoying dma_addr_t conversion problem). Then, I
suppose if that doesn't work, try printing out the actual contents of
the sg list to see what the physical memory location of the page
containing the corrupt block is.

This could also be a firmware problem, I suppose, but I haven't seen any
similar reports.

James
Re: megaraid_mbox: garbage in file [message #2997 is a reply to message #2995] Fri, 05 May 2006 18:14
vaverin
James Bottomley wrote:
> On Fri, 2006-05-05 at 09:37 +0400, Vasily Averin wrote:
>>The issue is that the correctly finished scsi read command return me garbage
>>(repeated 0 ...127 -- see hexdump in my first letter) instead correct file content.
>>"attempt to access beyond end of device" messages occurs due the same garbage
>>readed from the Indirect block. I found this garbage present in data buffers
>>beginning at megaraid driver functions.
>>
>>I would note that if I read the same file by using dd with bs=1024 or bs=512 --
>>I get correct file content.
>>
>>When I use kernel with 4Gb memory limit -- the same cat command return me
>>correct file content too, without any garbage.
>>
>>Question is what it is the strange garbage? Have you seen it earlier?
>>Is it possible that it is some driver-related issue or it is broken hardware?
>>And why I can workaround this issue by using only 4Gb memory?
>
> This is really odd ... if the controller can't reach *any* memory above
> 32 bits, then, on an 8GB machine you'd expect corruption all over the
> place since most user pages come from the top of highmem.
>
> The first thing to try, since you have an opteron system, is to get rid
> of highmem entirely and use a 64 bit kernel (just to make sure we're not
> running into some annoying dma_addr_t conversion problem).

Unfortunately it is a customer's node, and I'm not able to reinstall a 64-bit
distribution to load a 64-bit kernel. Of course I'll ask the customer about this, but
it will be done later.

> Then, I
> suppose if that doesn't work, try printing out the actual contents of
> the sg list to see what the physical memory location of the page
> containing the corrupt block is.

I've already done such an experiment:
On a 2.6.8-based Virtuozzo kernel I added the following code to the
megaraid_mbox_display_scb function:
    /* print each sg element: page, offset, DMA address, length, first word */
    virt = page_address(sg[i].page) + sg[i].offset;
    printk("mbox sg%d: page %p off %d addr %llx len %d "
           "virt %p first %08x page->flags %08x\n",
           i, sg[i].page, sg[i].offset, sg[i].dma_address, sg[i].length,
           virt, virt == NULL ? 0: *(int *)virt, sg[i].page->flags);

and got the following results:
May 4 02:51:38 vpsn002 kernel:
megaraid mailbox: status:0x0 cmd:0xa7 id:0x25 sec:0x1a
lba:0x33f624ac addr:0xffffffff ld:128 sg:4
scsi cmnd: 0x28 0x00 0x33 0xf6 0x24 0xac 0x00 0x00 0x1a 0x00
mbox request_buffer eafde340 use_sg 4
mbox sg0: page 077a0474 off 0 addr 1fd575000 len 4096 virt ff15a000
first 03020100 page->flags 40020101
mbox sg1: page 077b5738 off 0 addr 1fdede000 len 4096 virt ff141000
first 03020100 page->flags 40020101
mbox sg2: page 077ad500 off 0 addr 1fdb40000 len 4096 virt ff056000
first 03020100 page->flags 40020101
mbox sg3: page 030d46e8 off 1024 addr 5e6a400 len 1024 virt 07e6a400
first 03020100 page->flags 20001004

"first 03020100" shows that the data in all the sg buffers is already corrupted.
I would also note that the page for the last 1KB buffer is not in highmem.

If you want, I can repeat this experiment on a 2.6.16 kernel too.
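
The sg list above can be checked against the 32-bit boundary with a few lines of Python. This is an illustrative sketch, not driver code: the tuples are copied from the "mbox sgN" lines above, and `fits_in_32_bits` is a hypothetical helper introduced only for this example.

```python
# (name, DMA address, length) copied from the "mbox sgN" lines above.
sg_list = [
    ("sg0", 0x1fd575000, 4096),
    ("sg1", 0x1fdede000, 4096),
    ("sg2", 0x1fdb40000, 4096),
    ("sg3", 0x5e6a400, 1024),
]

def fits_in_32_bits(addr: int, length: int) -> bool:
    """True if the whole buffer lies below the 4 GB boundary."""
    return addr + length - 1 <= 0xFFFFFFFF

for name, addr, length in sg_list:
    where = "below 4 GB" if fits_in_32_bits(addr, length) else "above 4 GB"
    print(f"{name}: {addr:#x} len {length} -> {where}")

total = sum(length for _, _, length in sg_list)
print("total bytes:", total)
```

Three of the four buffers need more than 32 address bits, only sg3 lies below 4 GB, and the lengths add up to the 13312 bytes requested by the read command.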

> This could also be a firmware problem, I suppose, but I haven't seen any
> similar reports.

Thank you,
Vasily Averin

SWsoft Virtuozzo/OpenVZ Linux kernel team
Re: megaraid_mbox: garbage in file [message #2998 is a reply to message #2997] Fri, 05 May 2006 20:05
James Bottomley
On Fri, 2006-05-05 at 22:17 +0400, Vasily Averin wrote:
> megaraid mailbox: status:0x0 cmd:0xa7 id:0x25 sec:0x1a
> lba:0x33f624ac addr:0xffffffff ld:128 sg:4
> scsi cmnd: 0x28 0x00 0x33 0xf6 0x24 0xac 0x00 0x00 0x1a 0x00
> mbox request_buffer eafde340 use_sg 4
> mbox sg0: page 077a0474 off 0 addr 1fd575000 len 4096 virt ff15a000
> first 03020100 page->flags 40020101
> mbox sg1: page 077b5738 off 0 addr 1fdede000 len 4096 virt ff141000
> first 03020100 page->flags 40020101
> mbox sg2: page 077ad500 off 0 addr 1fdb40000 len 4096 virt ff056000
> first 03020100 page->flags 40020101
> mbox sg3: page 030d46e8 off 1024 addr 5e6a400 len 1024 virt 07e6a400
> first 03020100 page->flags 20001004

The odd thing about this is that the highmem addresses shouldn't have a
virtual mapping (since nothing should have called kmap on them).

But the other tickles a suspicion about the card. I know various LSI
chips that don't have a true 64 bit mode, but instead have programmable
windowed mappings in their descriptors (i.e. all SG list elements have
to be in the same xGB region of physical memory), and since the last
descriptor is more than 4GB away from the other three, I wonder whether this
might be the problem here. Unfortunately, only LSI can tell us this ...
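
To make the windowing idea concrete, the sketch below groups the sg elements from the quoted log by the region of physical memory they fall into. It is a Python illustration only: the 4 GB window size is an assumption chosen for the example (the size is left open as "xGB" above), and nothing here is taken from LSI documentation.

```python
from collections import defaultdict

# DMA addresses of the four sg elements quoted above.
sg_addrs = {"sg0": 0x1fd575000, "sg1": 0x1fdede000,
            "sg2": 0x1fdb40000, "sg3": 0x5e6a400}

WINDOW_BITS = 32  # assumed window size: 2**32 bytes = 4 GB

# Bucket every element by the index of the window its address falls into.
windows = defaultdict(list)
for name, addr in sg_addrs.items():
    windows[addr >> WINDOW_BITS].append(name)

for index in sorted(windows):
    print(f"window {index}: {windows[index]}")

same_window = len(windows) == 1
print("all elements in one window:", same_window)
```

With a 4 GB window, sg0 to sg2 land in window 1 and sg3 in window 0, so this request would violate a "same window" restriction if the card had one.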

James
Re: megaraid_mbox: garbage in file [message #2999 is a reply to message #2983] Fri, 05 May 2006 23:32
vaverin
Ju, Seokmann wrote:
> Can you do one quick change in the driver?
> Search for 'pci_set_dma_mask()' API calls in the driver and mask out one of them with DMA_64BIT_MASK as follow.
> ---
> // if (pci_set_dma_mask(adapter->pdev, DMA_64BIT_MASK) != 0) {
>
> // conlog(CL_ANN, (KERN_WARNING
> // "megaraid: could not set DMA mask for 64-bit.\n"));
>
> // goto out_free_sysfs_res;
> // }
> ---
>
> I found that the driver is NOT checking 64-bit DMA capability of the controllers accordingly and this could be a reason.

This change helped:
megaraid mailbox: status:0x0 cmd:0xa7 id:0x1f sec:0x1a lba:0x33f624ac
addr:0xffffffff ld:128 sg:4
scsi cmnd: 0x28 0x00 0x33 0xf6 0x24 0xac 0x00 0x00 0x1a 0x00
mbox request_buffer ebeb9380 use_sg 4
mbox sg0: page 050c5d88 off 0 addr e90d2000 len 4096 virt eb0d2000
first 732e646c page->flags 20000000
mbox sg1: page 050c5710 off 0 addr e90a4000 len 4096 virt eb0a4000
first 00000003 page->flags 20000000
mbox sg2: page 050c4438 off 0 addr e901e000 len 4096 virt eb01e000
first 00000000 page->flags 20000000
mbox sg3: page 030d64dc off 1024 addr 5f3f400 len 1024 virt 07f3f400
first 19398a0e page->flags 20001004

Errors go away, file content is correct.
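
The "first" column in this log and in the earlier one with the corrupted data is the first 32-bit word of each buffer, printed on a little-endian x86 machine, so it can be turned back into bytes to see what each buffer starts with. The Python sketch below is an added illustration and the helper name `first_bytes` is made up.

```python
import struct

def first_bytes(word_hex: str) -> bytes:
    """Bytes in memory order for a 32-bit word printed with %08x on x86."""
    return struct.pack("<I", int(word_hex, 16))

before = first_bytes("03020100")  # sg0 before the change
after = first_bytes("732e646c")   # sg0 after the change

print(before)  # b'\x00\x01\x02\x03'
print(after)   # b'ld.s'
```

The first value is the start of the 0...127 garbage pattern. The second reads "ld.s", which is consistent with the "ld.so-1.7.0" magic string that, to my knowledge, opens ld.so.cache files of that era.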

> I'm waiting for feedback from F/W team for MegaRAID 150-4 controller if it supports 64-bit DMA.
>
> I'll update here as I get.

What do you think, could this also be the cause of
http://bugzilla.kernel.org/show_bug.cgi?id=6052 ?

If so, you have a good chance of resolving that bug too :)

Thank you,
Vasily Averin

SWsoft Virtuozzo/OpenVZ Linux kernel team
Re: megaraid_mbox: garbage in file [message #3000 is a reply to message #2998] Fri, 05 May 2006 23:40
vaverin
James Bottomley wrote:
> On Fri, 2006-05-05 at 22:17 +0400, Vasily Averin wrote:
>> megaraid mailbox: status:0x0 cmd:0xa7 id:0x25 sec:0x1a
>> lba:0x33f624ac addr:0xffffffff ld:128 sg:4
>> scsi cmnd: 0x28 0x00 0x33 0xf6 0x24 0xac 0x00 0x00 0x1a 0x00
>> mbox request_buffer eafde340 use_sg 4
>> mbox sg0: page 077a0474 off 0 addr 1fd575000 len 4096 virt ff15a000
>> first 03020100 page->flags 40020101
>> mbox sg1: page 077b5738 off 0 addr 1fdede000 len 4096 virt ff141000
>> first 03020100 page->flags 40020101
>> mbox sg2: page 077ad500 off 0 addr 1fdb40000 len 4096 virt ff056000
>> first 03020100 page->flags 40020101
>> mbox sg3: page 030d46e8 off 1024 addr 5e6a400 len 1024 virt 07e6a400
>> first 03020100 page->flags 20001004
>
> The odd thing about this is that the highmem addresses shouldn't have a
> virtual mapping (since nothing should have called kmap on them).

You are right: in my other experiments highmem pages usually have virt=0, and
I cannot find what kmapped these pages.

I'll investigate this later: first of all I'll try to reproduce the issue
on a 2.6.16 kernel.

Thank you,
Vasily Averin

SWsoft Virtuozzo/OpenVZ Linux kernel team
RE: megaraid_mbox: garbage in file [message #3065 is a reply to message #2983] Fri, 05 May 2006 19:59
Seokmann.Ju
Can you make one quick change in the driver?
Search for the 'pci_set_dma_mask()' API calls in the driver and comment out the one that uses DMA_64BIT_MASK, as follows.
---
// if (pci_set_dma_mask(adapter->pdev, DMA_64BIT_MASK) != 0) {

// conlog(CL_ANN, (KERN_WARNING
// "megaraid: could not set DMA mask for 64-bit.\n"));

// goto out_free_sysfs_res;
// }
---

I found that the driver does NOT properly check the 64-bit DMA capability of the controllers, and this could be the reason.
I'm waiting for feedback from the F/W team on whether the MegaRAID 150-4 controller supports 64-bit DMA.

I'll post an update here as soon as I get one.

Thank you,

Re: megaraid_mbox: garbage in file [message #3092 is a reply to message #2999] Fri, 12 May 2006 04:15
vaverin
Vasily Averin wrote:
> Ju, Seokmann wrote:
>>I'm waiting for feedback from F/W team for MegaRAID 150-4 controller if it supports 64-bit DMA.
>>
>>I'll update here as I get.

Are there any updates? Could you confirm that this issue was
reproduced on your nodes?

>>Can you do one quick change in the driver?
>>Search for 'pci_set_dma_mask()' API calls in the driver and mask out one of them with DMA_64BIT_MASK as follow.
>>---
>> // if (pci_set_dma_mask(adapter->pdev, DMA_64BIT_MASK) != 0) {
>>
>> // conlog(CL_ANN, (KERN_WARNING
>> // "megaraid: could not set DMA mask for 64-bit.\n"));
>>
>> // goto out_free_sysfs_res;
>> // }
>>---
>>
>>I found that the driver is NOT checking 64-bit DMA capability of the controllers accordingly and this could be a reason.
>
> This change help me:
> Errors go away, file content is correct.

I'm going to use this change in production, at least as a temporary workaround.
Could you please confirm that it is safe for all controllers supported by this
driver?

Thank you,
Vasily Averin

SWsoft Virtuozzo/OpenVZ Linux kernel team
RE: megaraid_mbox: garbage in file [message #3104 is a reply to message #2983] Fri, 12 May 2006 12:19
Seokmann.Ju
Hi,
Friday, May 12, 2006 12:19 AM, Vasily Averin wrote:
> Could you please tell me any updates? Could you confirm that
> this issue was
> reproduced on your nodes?
Yes, the F/W team has confirmed that the controller doesn't support 64-bit DMA.
A patch that addresses the issue will follow soon.

Thank you,
