On Wed, Oct 24, 2007 at 06:31:39AM +0200, Martin Trtusek wrote:
> The same kernel on i386 is working without an oops (uptime 22 days).
> Should I file a bug?
>
> Unfortunately the previous hardware is not available for testing now (it
> is in production, with the 2.6.18-openvz-13-1etch4 kernel). Probably after
> next week we will have a similar machine for 1-2 weeks of testing.
>
> Martin Trtusek
>
> Martin Trtusek wrote on Wed, 10 Oct 2007 at 07:54 +0200:
> > I installed kernel 2.6.18-openvz-13-39.1d1-amd64 from
> > http://download.openvz.org/debian on Debian Etch one week ago and
> > experienced a kernel oops (complete freeze, power cycle necessary)
> > after 2-3 days of running (3 times). The oops always occurs after the
> > cron.daily scripts (in my case 06:25), but not every day. Yesterday I
> > configured netconsole to capture useful info, enclosed.
I've seen three crashes with
linux-image-2.6.18-openvz-13-39.1d2-686_028.39.1d2_i386.deb.
I changed my production server back to 2.6.18-openvz-12-1etch1-686.
This time I captured the output:
preparing to turn dcache accounting on, size 4294967293 pages, watermarks 0 21800
UBC: turning dcache accounting on succeeded, usage 1236258, time 0.040
------------[ cut here ]------------
kernel BUG at kernel/sched.c:3798!
invalid opcode: 0000 [#1]
SMP
Modules linked in: netconsole tcp_diag inet_diag hp100 nfs simfs
vznetdev vzethdev vzrst ip_nat vzcpt ip_conntrack nfnetlink vzdquota
vzmon vzdev xt_length ipt_ttl xt_tcpmss ipt_TCPMSS iptable_mangle
iptable_filter xt_multiport xt_limit ipt_tos ipt_REJECT ip_tables
x_tables nfsd exportfs lockd nfs_acl sunrpc ppdev lp ipv6 nls_iso8859_1
isofs dm_snapshot dm_mirror dm_mod uhci_hcd ehci_hcd usb_storage
ide_generic loop snd_cs46xx gameport snd_seq_dummy snd_seq_oss
snd_seq_midi snd_seq_midi_event snd_seq tsdev snd_rawmidi snd_seq_device
snd_ac97_codec snd_ac97_bus snd_pcm_oss snd_mixer_oss i2c_piix4 snd_pcm
i2c_core snd_timer parport_pc psmouse rtc serio_raw snd evdev soundcore
snd_page_alloc shpchp pci_hotplug parport sworks_agp agpgart floppy
pcspkr ide_floppy ext3 jbd mbcache sd_mod ide_cd cdrom ide_disk ohci_hcd
usbcore aic7xxx scsi_transport_spi scsi_mod serverworks generic ide_core
e100 mii processor
CPU: 1, VCPU: -1.1
EIP: 0060:[<c01163a8>] Not tainted VLI
EFLAGS: 00010046 (2.6.18-openvz-13-39.1d2-686 #1)
EIP is at rebalance_tick+0x2fa/0x485
eax: 0000005e ebx: c035c6c0 ecx: 00000008 edx: dfb05d94
esi: c2214180 edi: dfb99000 ebp: dfb05db0 esp: dfb05d64
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, veid: 0, ti=dfb04000 task=dfb01220 task.ti=dfb04000)
Stack: 00000000 00000000 dfb98000 dfb98000 330b1369 00000001 00000002 00000001
dfb99000 1fb2c449 dfb99000 00000003 000000ff 0000005e 00000000 00000000
dfb01220 00000001 00000000 c1f78da4 c012476b dfb05dd0 f524a414 00000202
Call Trace:
[<c012476b>] update_process_times+0x52/0x5c
[<c010c9d2>] smp_apic_timer_interrupt+0x9b/0xa1
[<c010342b>] apic_timer_interrupt+0x1f/0x24
[<c0281338>] _spin_unlock_irqrestore+0x8/0x9
[<c012d3f6>] __wake_up_bit+0x29/0x2e
[<c016515e>] end_buffer_async_write+0xe3/0x105
[<c014a30c>] mempool_free+0x5f/0x63
[<c0164939>] end_bio_bh_io_sync+0x0/0x39
[<c0164967>] end_bio_bh_io_sync+0x2e/0x39
[<c016618d>] bio_endio+0x50/0x55
[<c01a835d>] __end_that_request_first+0x11b/0x425
[<f891f12d>] scsi_end_request+0x1a/0xa9 [scsi_mod]
[<c014a30c>] mempool_free+0x5f/0x63
[<f891f2ff>] scsi_io_completion+0x143/0x2ed [scsi_mod]
[<f89553b2>] sd_rw_intr+0x1eb/0x215 [sd_mod]
[<f891b3bd>] scsi_finish_command+0x73/0x77 [scsi_mod]
[<c01a9f25>] blk_done_softirq+0x4d/0x58
[<c01205aa>] __do_softirq+0x84/0x109
[<c0120665>] do_softirq+0x36/0x3a
[<c0104ea4>] do_IRQ+0x8a/0x92
[<c010339a>] common_interrupt+0x1a/0x20
[<c0101731>] default_idle+0x0/0x59
[<c0101762>] default_idle+0x31/0x59
[<c01017e8>] cpu_idle+0x5e/0x74
Code: 0c 85 c0 0f 84 4e 01 00 00 53 89 c2 8b 4d b8 ff 75 e8 8b 45 dc e8
a7 cf ff ff 89 c3 58 85 db 5a 0f 84 31 01 00 00 39 7d d4 75 0b <0f> 0b
66 b8 d6 0e b8 a3 61 29 c0 39 fb 89 5d d4 0f 84 16 01 00
EIP: [<c01163a8>] rebalance_tick+0x2fa/0x485 SS:ESP 0068:dfb05d64
Kernel panic - not syncing: Fatal exception in interrupt
BUG: warning at arch/i386/kernel/smp.c:550/smp_call_function()
[<c010b093>] smp_call_function+0x53/0xfe
[<c011bfce>] vprintk+0x26/0x3a
[<c010b151>] smp_send_stop+0x13/0x1c
[<c011b322>] panic+0x4c/0xe2
[<c0103d59>] die+0x252/0x269
[<c0104617>] do_invalid_op+0x0/0x9d
[<c01046a8>] do_invalid_op+0x91/0x9d
[<c01163a8>] rebalance_tick+0x2fa/0x485
[<f8e3c87a>] ipt_do_table+0x2a1/0x2cb [ip_tables]
[<c01b2b10>] __next_cpu+0x12/0x21
[<c0113533>] find_busiest_group+0x185/0x46e
[<f8f5e982>] venet_entry_lookup+0x2c/0x5a [vznetdev]
[<c01034dd>] error_code+0x39/0x40
[<c01163a8>] rebalance_tick+0x2fa/0x485
[<c012476b>] update_process_times+0x52/0x5c
[<c010c9d2>] smp_apic_timer_interrupt+0x9b/0xa1
[<c010342b>] apic_timer_interrupt+0x1f/0x24
[<c0281338>] _spin_unlock_irqrestore+0x8/0x9
[<c012d3f6>] __wake_up_bit+0x29/0x2e
[<c016515e>] end_buffer_async_write+0xe3/0x105
[<c014a30c>] mempool_free+0x5f/0x63
[<c0164939>] end_bio_bh_io_sync+0x0/0x39
[<c0164967>] end_bio_bh_io_sync+0x2e/0x39
[<c016618d>] bio_endio+0x50/0x55
[<c01a835d>] __end_that_request_first+0x11b/0x425
[<f891f12d>] scsi_end_request+0x1a/0xa9 [scsi_mod]
[<c014a30c>] mempool_free+0x5f/0x63
[<f891f2ff>] scsi_io_completion+0x143/0x2ed [scsi_mod]
[<f89553b2>] sd_rw_intr+0x1eb/0x215 [sd_mod]
[<f891b3bd>] scsi_finish_command+0x73/0x77 [scsi_mod]
[<c01a9f25>] blk_done_softirq+0x4d/0x58
[<c01205aa>] __do_softirq+0x84/0x109
[<c0120665>] do_softirq+0x36/0x3a
[<c0104ea4>] do_IRQ+0x8a/0x92
[<c010339a>] common_interrupt+0x1a/0x20
[<c0101731>] default_idle+0x0/0x59
[<c0101762>] default_idle+0x31/0x59
[<c01017e8>] cpu_idle+0x5e/0x74
--
E Frank Ball efball@efball.com