The problem
We have an Adaptec 6805E card in an Intel Q67 / Core i7 server running Proxmox 1.9. Whenever a USB device is plugged in or removed (in our case, the data center's KVM-over-IP console), the kernel logs an error and the PCI bus effectively hangs afterwards (it becomes extremely slow). It seems the Adaptec controller and the USB controller share the same IRQ.
The problem appears on all of our kernels, and only a hard reset resolves it. Has anyone experienced this?
Others have reported similar issues (not necessarily with an OpenVZ kernel):
serverfault.com/questions/165717/apparent-irq-conflict-driving-me-nuts-under-centos
forums.opensuse.org/english/get-technical-help-here/hardware/462213-irq-16-nobody-cared-try-booting-irqpoll-option-but-irqpoll-causes-boot-failure.html
There are a lot of boot options regarding ACPI and IRQs.
help.ubuntu.com/community/BootOptions
We can't really test these on a production system. Anyone know what to do?
The error
Nov 23 09:16:31 proxmox2 kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
Nov 23 09:16:31 proxmox2 kernel: Pid: 17958, comm: apache2 Not tainted 2.6.32-4-pve #1
Nov 23 09:16:31 proxmox2 kernel: Call Trace:
Nov 23 09:16:31 proxmox2 kernel: <IRQ> [<ffffffff81097bfd>] ? __report_bad_irq+0x30/0x7d
Nov 23 09:16:31 proxmox2 kernel: [<ffffffff81097d4f>] ? note_interrupt+0x105/0x16e
Nov 23 09:16:31 proxmox2 kernel: [<ffffffff810983b4>] ? handle_fasteoi_irq+0x93/0xb5
Nov 23 09:16:31 proxmox2 kernel: [<ffffffff8101333f>] ? handle_irq+0x17/0x1d
Nov 23 09:16:31 proxmox2 kernel: [<ffffffff81012999>] ? do_IRQ+0x57/0xb6
Nov 23 09:16:31 proxmox2 kernel: [<ffffffff81011593>] ? ret_from_intr+0x0/0x11
Nov 23 09:16:31 proxmox2 kernel: <EOI>
Nov 23 09:16:31 proxmox2 kernel: handlers:
Nov 23 09:16:31 proxmox2 kernel: [<ffffffffa00c63a8>] (aac_src_intr_message+0x0/0x108 [aacraid])
Nov 23 09:16:31 proxmox2 kernel: [<ffffffffa0024848>] (usb_hcd_irq+0x0/0x7e [usbcore])
Nov 23 09:16:31 proxmox2 kernel: Disabling IRQ #16
Nov 23 09:16:37 proxmox2 kernel: usb 1-1.6: USB disconnect, address 3
The system
proxmox2:~# pveversion -v
pve-manager: 1.9-26 (pve-manager/1.9/6567)
running kernel: 2.6.32-4-pve
proxmox-ve-2.6.32: 1.9-55+ovzfix-2
pve-kernel-2.6.32-4-pve: 2.6.32-33
pve-kernel-2.6.32-6-pve: 2.6.32-55+ovzfix-1
pve-kernel-2.6.32-7-pve: 2.6.32-55+ovzfix-2
qemu-server: 1.1-32
pve-firmware: 1.0-15
libpve-storage-perl: 1.0-19
vncterm: 0.9-2
vzctl: 3.0.29-3pve1
vzdump: 1.2-16
vzprocps: 2.0.11-2
vzquota: 3.0.11-1
pve-qemu-kvm: 0.15.0-2
ksm-control-daemon: 1.0-6
PCI device list:
proxmox2:~# lspci
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation 2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series Chipset Family MEI Controller #1 (rev 04)
00:16.2 IDE interface: Intel Corporation 6 Series Chipset Family IDE-r Controller (rev 04)
00:16.3 Serial controller: Intel Corporation 6 Series Chipset Family KT Controller (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1c.0 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 1 (rev b4)
00:1c.4 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 5 (rev b4)
00:1d.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a4)
00:1f.0 ISA bridge: Intel Corporation 6 Series Chipset Family LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 6 Series Chipset Family 6 port SATA AHCI Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 6 Series Chipset Family SMBus Controller (rev 04)
02:00.0 RAID bus controller: Adaptec Device 028b (rev 01)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 01)
The two conflicting devices from lspci -vv:
00:1a.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #2 (rev 04) (prog-if 20 [EHCI])
Subsystem: Intel Corporation Device 200a
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 16
Region 0: Memory at fbe23000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [58] Debug port: BAR=1 offset=00a0
Capabilities: [98] PCIe advanced features <?>
Kernel driver in use: ehci_hcd
Kernel modules: ehci-hcd
02:00.0 RAID bus controller: Adaptec Device 028b (rev 01)
Subsystem: Adaptec Device 0201
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 16
Region 0: Memory at fb800000 (64-bit, non-prefetchable) [size=4M]
Region 2: Memory at fbc41000 (64-bit, non-prefetchable) [size=2K]
Region 4: Memory at fbc40000 (32-bit, non-prefetchable) [size=256]
Expansion ROM at fbc00000 [disabled] [size=256K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot-,D3cold-)
Status: D0 PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
Address: 0000000000000000 Data: 0000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <512ns, L1 <64us
ClockPM- Suprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
Capabilities: [ac] MSI-X: Enable- Mask- TabSize=16
Vector table: BAR=0 offset=001c2000
PBA: BAR=0 offset=001c4000
Capabilities: [100] Advanced Error Reporting <?>
Kernel driver in use: aacraid
Kernel modules: aacraid
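Both devices above report "Interrupt: pin A routed to IRQ 16". To check a whole machine for this kind of sharing, a small script can group the devices in `lspci -vv` output by IRQ. This is only a sketch under assumptions: the function names `devices_by_irq` and `shared_irqs` are invented here, device lines are assumed to start with a `bus:slot.func` address without a PCI domain prefix (as in the listing above), and the sample text is trimmed to the relevant lines.

```python
import re
from collections import defaultdict

# Trimmed excerpt of the lspci -vv output above.
LSPCI = """\
00:1a.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #2 (rev 04) (prog-if 20 [EHCI])
Interrupt: pin A routed to IRQ 16
Kernel driver in use: ehci_hcd
02:00.0 RAID bus controller: Adaptec Device 028b (rev 01)
Interrupt: pin A routed to IRQ 16
Kernel driver in use: aacraid
"""

# A device header starts at column 0 with bus:slot.func (assumed: no domain prefix).
DEVICE_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) ")
IRQ_RE = re.compile(r"routed to IRQ (\d+)")


def devices_by_irq(text):
    """Map each IRQ number to the PCI addresses routed to it."""
    by_irq = defaultdict(list)
    current = None
    for line in text.splitlines():
        m = DEVICE_RE.match(line)
        if m:
            current = m.group(1)
            continue
        m = IRQ_RE.search(line)
        if m and current is not None:
            by_irq[int(m.group(1))].append(current)
    return dict(by_irq)


def shared_irqs(text):
    """Keep only the IRQs that have more than one device routed to them."""
    return {irq: devs for irq, devs in devices_by_irq(text).items() if len(devs) > 1}


if __name__ == "__main__":
    print(shared_irqs(LSPCI))
```

On a live system the text could come from running `lspci -vv` through `subprocess.run(..., capture_output=True, text=True)` instead of the pasted sample. Note that this shows legacy INTx routing only; devices using MSI would need to be checked separately.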
Active interrupt list
proxmox2:~# cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
0: 99 0 0 0 0 0 0 0 IR-IO-APIC-edge timer
1: 2 0 0 0 0 0 0 0 IR-IO-APIC-edge i8042
8: 1 0 0 0 0 0 0 0 IR-IO-APIC-edge rtc0
9: 0 0 0 0 0 0 0 0 IR-IO-APIC-fasteoi acpi
12: 4 0 0 0 0 0 0 0 IR-IO-APIC-edge i8042
16: 6782762 0 0 0 0 0 0 0 IR-IO-APIC-fasteoi aacraid, ehci_hcd:usb1
23: 88 0 0 0 0 0 0 0 IR-IO-APIC-fasteoi ehci_hcd:usb2
24: 0 0 0 0 0 0 0 0 DMAR_MSI-edge dmar0
25: 0 0 0 0 0 0 0 0 DMAR_MSI-edge dmar1
30: 40696839 0 0 0 0 0 0 0 IR-PCI-MSI-edge eth1
31: 0 0 0 0 0 0 0 0 IR-PCI-MSI-edge ahci
NMI: 0 0 0 0 0 0 0 0 Non-maskable interrupts
LOC: 4238841 3442513 3262886 3097275 3822337 3062084 3039864 3062109 Local timer interrupts
SPU: 0 0 0 0 0 0 0 0 Spurious interrupts
PMI: 0 0 0 0 0 0 0 0 Performance monitoring interrupts
PND: 0 0 0 0 0 0 0 0 Performance pending work
RES: 2091023 2526976 2132682 1876928 2226523 1816845 1427789 1141970 Rescheduling interrupts
CAL: 24 59 65 63 62 61 6
...[Updated on: Fri, 23 November 2012 17:43]