*SOLVED* Kernel Panics Often! HELP [message #11301] |
Sun, 18 March 2007 00:57 |
Vetrox
Messages: 1 Registered: March 2007
|
Junior Member |
|
|
Hey guys,
I'm not the smartest tool in the shed with OpenVZ, but I have tried everything with this problem. I'm running a Pentium D 3.4 GHz with 2 GB RAM and a 160 GB hard drive, and I'm using HyperVM as well. My kernel panics often, sometimes three times a day, and the logs report a lot of kernel bugs. I know for sure this isn't a hardware issue, so please tell me what could be to blame. I just installed the version of OpenVZ that comes with HyperVM. Also, the RAM fills to 99.9% with cache, swap is not used, and then the machine crashes (panic).
If you guys need logs or anything relevant - let me know
Regards,
Joe
(I'm on CentOS 4.4 by the way)
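For the "RAM fills to 99.9% with cache" observation, it can help to separate page cache from memory that processes actually hold, since on Linux cached memory is normally reclaimable and a near-full reading is not by itself a sign of exhaustion. Below is a minimal Python sketch that does this from the text of /proc/meminfo. The function names and the sample numbers are made up for illustration; only the field names (MemTotal, MemFree, Buffers, Cached, SwapTotal, SwapFree) come from the standard /proc/meminfo layout.

```python
def parse_meminfo(text):
    """Return /proc/meminfo fields as a dict of integer kB values."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name: <number> [kB]".
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_summary(fields):
    """Split RAM into cache and non-cache usage, and report swap in use."""
    cache = fields["Buffers"] + fields["Cached"]
    used_excluding_cache = fields["MemTotal"] - fields["MemFree"] - cache
    swap_used = fields["SwapTotal"] - fields["SwapFree"]
    return {
        "cache_kb": cache,
        "used_excluding_cache_kb": used_excluding_cache,
        "swap_used_kb": swap_used,
    }


# Illustrative values only, not taken from the server in this thread.
sample = """MemTotal: 2074624 kB
MemFree: 20480 kB
Buffers: 102400 kB
Cached: 1536000 kB
SwapTotal: 2096472 kB
SwapFree: 2096472 kB
"""

print(memory_summary(parse_meminfo(sample)))
```

On a live node the same functions could be fed `open("/proc/meminfo").read()`. If the non-cache figure stays small while cache is large, the "99.9%" reading is mostly cache and the panics likely have another cause.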
[Updated on: Mon, 26 March 2007 08:48] by Moderator
|
|
|
Re: Kernel Panics Often! HELP [message #11302 is a reply to message #11301] |
Sun, 18 March 2007 19:35 |
madguy24
Messages: 6 Registered: March 2007
|
Junior Member |
|
|
I am the tech who tried to fix this issue. The server's load rises after it reports a kernel bug. See the latest ones below. I did a custom compile of the kernel, version 2.6.18. To me it looks like faulty memory, but the server owner said he has replaced all of the hardware.
Mar 19 02:45:01 localhost kernel: BUG: warning at kernel/ub/ub_page_bc.c:322/pb_dup_ref()
(... repeated three times, followed by:)
Mar 19 02:45:05 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at virtual address 00000002
Mar 19 02:45:05 localhost kernel: printing eip:
Mar 19 02:45:05 localhost kernel: c0143c41
Mar 19 02:45:05 localhost kernel: *pde = 00000000
Mar 19 02:45:05 localhost kernel: Oops: 0000 [#1]
Mar 19 02:45:05 localhost kernel: SMP
Mar 19 02:45:05 localhost kernel: Modules linked in: simfs vzethdev vzrst ip_nat vzcpt ip_conntrack vzdquota af_packet xt_tcpudp xt_length ipt_ttl xt_tcpmss ipt_TCPMSS iptable_mangle xt_multiport xt_limit ipt_tos ipt_REJECT iptable_filter ip_tables x_tables parport_pc lp parport autofs4 sunrpc vznetdev vzmon vzdev thermal processor fan button battery asus_acpi ac uhci_hcd ehci_hcd usbcore i2c_i801 i2c_core 8139too mii
Mar 19 02:45:05 localhost kernel: CPU: 1, VCPU: 149.0
Mar 19 02:45:05 localhost kernel: EIP: 0060:[<c0143c41>] Not tainted VLI
Mar 19 02:45:05 localhost kernel: EFLAGS: 00010286 (2.6.18-028 #3)
Mar 19 02:45:05 localhost kernel: EIP is at ub_page_uncharge+0x31/0x90
Mar 19 02:45:05 localhost kernel: eax: fffffffe ebx: c203c6b0 ecx: 00000000 edx: 00000001
Mar 19 02:45:05 localhost kernel: esi: f6515600 edi: c203c6b0 ebp: c0539a40 esp: f4a59dcc
Mar 19 02:45:05 localhost kernel: ds: 007b es: 007b ss: 0068
Mar 19 02:45:05 localhost kernel: Process dcpumon (pid: 22650, veid: 149, ti=f4a59000 task=f4969340 task.ti=f4a59000)
Mar 19 02:45:05 localhost kernel: Stack: f4a59f20 00000000 c203c6b0 c0539a20 00000000 c015e8f5 c203c6b0 00000000
Mar 19 02:45:05 localhost kernel: 00000000 000006e6 c05399e0 0000000c f4a59e1c 0000000e 0000000e c015f32c
Mar 19 02:45:05 localhost kernel: c1bce7ac 00000000 c0161fc2 f4a59e1c 0000000e 00000000 c1d7d564 c1d33cec
Mar 19 02:45:05 localhost kernel: Call Trace:
Mar 19 02:45:05 localhost kernel: [<c015e8f5>] free_hot_cold_page+0xf5/0x1a0
Mar 19 02:45:05 localhost kernel: [<c015f32c>] __pagevec_free+0x1c/0x30
Mar 19 02:45:05 localhost kernel: [<c0161fc2>] release_pages+0x102/0x190
Mar 19 02:45:05 localhost kernel: [<c0172287>] free_pages_and_swap_cache+0x77/0xa0
Mar 19 02:45:05 localhost kernel: [<c016838f>] zap_pte_range+0x27f/0x340
Mar 19 02:45:05 localhost kernel: [<c0168514>] unmap_page_range+0xc4/0x160
Mar 19 02:45:05 localhost kernel: [<c0168685>] unmap_vmas+0xd5/0x200
Mar 19 02:45:05 localhost kernel: [<c016de65>] exit_mmap+0x85/0x120
Mar 19 02:45:05 localhost kernel: [<c0121a88>] mmput+0x38/0xc0
Mar 19 02:45:05 localhost kernel: [<c012859e>] do_exit+0xfe/0x480
Mar 19 02:45:05 localhost kernel: [<c0128986>] do_group_exit+0x36/0xa0
Mar 19 02:45:05 localhost kernel: [<c01031c7>] syscall_call+0x7/0xb
Mar 19 02:45:05 localhost kernel: Code: 1c 89 7c 24 10 8b 7c 24 18 89 5c 24 08 89 74 24 0c 8b 77 20 85 f6 74 4f 89 e2 8b 86 30 05 00 00 81 e2 00 f0 ff ff 8b 52 10 f7 d0 <8b> 14 90 b8 01 00 00 00 d3 e0 29 42 20 81 3e 75 62 75 62 75 37
Mar 19 02:45:05 localhost kernel: EIP: [<c0143c41>] ub_page_uncharge+0x31/0x90 SS:ESP 0068:f4a59dcc
Mar 19 02:45:05 localhost kernel: Fixing recursive fault but reboot is needed!
Mar 19 02:45:05 localhost kernel: BUG: scheduling while atomic: dcpumon/0x00000001/22650
============================================
The most common error is in rmap.c, and it repeats very frequently, as below.
Mar 19 02:45:29 localhost kernel: ------------[ cut here ]------------
Mar 19 02:45:29 localhost kernel: kernel BUG at mm/rmap.c:529!
Mar 19 02:45:29 localhost kernel: invalid opcode: 0000 [#2]
Mar 19 02:45:29 localhost kernel: SMP
Mar 19 02:45:29 localhost kernel: Modules linked in: simfs vzethdev vzrst ip_nat vzcpt ip_conntrack vzdquota af_packet xt_tcpudp xt_length ipt_ttl xt_tcpmss ipt_TCPMSS iptable_mangle xt_multiport xt_limit ipt_tos ipt_REJECT iptable_filter ip_tables x_tables parport_pc lp parport autofs4 sunrpc vznetdev vzmon vzdev thermal processor fan button battery asus_acpi ac uhci_hcd ehci_hcd usbcore i2c_i801 i2c_core 8139too mii
Mar 19 02:45:29 localhost kernel: CPU: 1, VCPU: 120.1
Mar 19 02:45:29 localhost kernel: EIP: 0060:[<c016feb7>] Not tainted VLI
Mar 19 02:45:29 localhost kernel: EFLAGS: 00010286 (2.6.18-028 #3)
Mar 19 02:45:29 localhost kernel: EIP is at page_remove_rmap+0x37/0x50
Mar 19 02:45:29 localhost kernel: eax: ffffffff ebx: c1d1d6f0 ecx: c0003ea0 edx: c20b4bb4
Mar 19 02:45:29 localhost kernel: esi: c1d1d6f0 edi: fffa85a4 ebp: c20b4bb4 esp: f3d0ff10
Mar 19 02:45:29 localhost kernel: ds: 007b es: 007b ss: 0068
Mar 19 02:45:29 localhost kernel: Process exim (pid: 22837, veid: 120, ti=f3d0f000 task=f38446e0 task.ti=f3d0f000)
=================================
After the kernel bug above, the server load started rising and finally all VPSes stopped responding. I was not able to run "vzctl stop veid" or "vzctl exec 120 kill -9 22837", or even enter the VPS with "vzctl enter 120". Each command simply hangs, and I have to press Ctrl + C to get the shell prompt back. BTW, that exim process raised the load to 100+ before I issued a reboot.
Please shed some light on this issue. I am out of ideas. I don't want to think it is a kernel bug, but what other options do we have to try?
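
To back up the claim that the rmap.c error is the most frequent one, the reports can be counted by source location. Below is a minimal Python sketch; the function name and regular expression are illustrative, and the pattern only covers the two message shapes quoted in this post ("kernel BUG at file.c:line!" and "BUG: warning at file.c:line/func()").

```python
import re
from collections import Counter

# Captures "file.c:line" from the two BUG message shapes quoted above.
LOCATION = re.compile(r"BUG(?:: warning)? at (\S+?\.c:\d+)")


def tally_bug_sites(lines):
    """Count kernel BUG/warning reports by source file and line."""
    counts = Counter()
    for line in lines:
        match = LOCATION.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Lines copied from the log excerpts in this post.
sample = [
    "Mar 19 02:45:01 localhost kernel: BUG: warning at kernel/ub/ub_page_bc.c:322/pb_dup_ref()",
    "Mar 19 02:45:05 localhost kernel: BUG: unable to handle kernel NULL pointer dereference at virtual address 00000002",
    "Mar 19 02:45:29 localhost kernel: kernel BUG at mm/rmap.c:529!",
    "Mar 19 02:45:29 localhost kernel: SMP",
]

for site, count in tally_bug_sites(sample).most_common():
    print(count, site)
```

On the server itself the function could be given the open syslog file, for example `tally_bug_sites(open("/var/log/messages"))`, assuming kernel messages are logged to that path on this install.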