OpenVZ Forum


vzmond hanging during vzctl stop - ends in panic [message #14587] Tue, 03 July 2007 02:14
jochum
Messages: 21
Registered: December 2006
Location: Naperville, IL, USA
Junior Member
Hi All:

I am occasionally seeing a problem when stopping a VPS. I see the following symptoms:

1) vzlist -a reports:
VEID NPROC STATUS IP_ADDR HOSTNAME
200 0 running lss-vps140.xx.xx.xx

2) running top on the host shows a process "vzmond/200" that is using 100% of the CPU and has been running for over 17 minutes

3) it then ends in a kernel panic. I captured the screen to a .png file that I (hopefully) have attached. The EIP was
EIP: [<c05b8294>] __ip_route_output_key+0xf7/0x822 SS:ESP 0068:c5df3d90
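
The state in symptom 1 (a VE listed as running with NPROC 0) can be checked for mechanically. A minimal sketch, assuming the `vzlist -a` column order shown above; `find_stuck_ves` is a hypothetical helper name, not part of the OpenVZ tools:

```shell
# Hypothetical helper: read `vzlist -a`-style text on stdin and print the
# IDs of containers that are listed as "running" with zero processes.
# Assumes the columns are VEID, NPROC, STATUS in that order.
find_stuck_ves() {
    awk '$1 ~ /^[0-9]+$/ && $2 == 0 && $3 == "running" { print $1 }'
}

# Usage on a host node (needs vzlist in PATH):
#   vzlist -a | find_stuck_ves
```

An empty result only means that no VE is in this particular state at the moment; it says nothing about whether a later stop will hang.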

other info:
uname -a
Linux lss-host40.ih.lucent.com 2.6.18-8.1.4.el5.028stab035.1PAE #1 SMP Sat Jun 9 02:27:12 MSD 2007 i686 athlon i386 GNU/Linux

I set up the host as Scientific Linux SL release 4.4 (a recompile of RedHat Enterprise 4.4, just like CentOS 4.4). I seemed to have the same problem when I set up the host as CentOS 5.0 (but I didn't capture the kernel panic, so I am not sure if it was in the same place).

The VPS is a CentOS 5.0 VPS.

Any other info needed that I can provide?

thanks,
Paul

[Updated on: Mon, 23 July 2007 07:34] by Moderator


Re: vzmond hanging during vzctl stop - ends in panic [message #14590 is a reply to message #14587] Tue, 03 July 2007 06:43
Vasily Tarasov
Messages: 1345
Registered: January 2006
Senior Member
Hello, do you use NFS on your node?

It seems that this is the following bug report: http://bugzilla.openvz.org/show_bug.cgi?id=513

Vasily.
Re: vzmond hanging during vzctl stop - ends in panic [message #14606 is a reply to message #14590] Tue, 03 July 2007 11:49
jochum
Yes, I am running NFS. Sorry, I did a google search before submitting to the forum, but did not see Bug 513 in the results.

Any status on when this bug may be fixed (or, can we provide any further information to help fix it)?

thanks,

Paul
Re: vzmond hanging during vzctl stop - ends in panic [message #14668 is a reply to message #14606] Thu, 05 July 2007 07:30
Vasily Tarasov
Hello,

The patch was attached to the bug report.
It'll be included in the next kernel release.

Vasily.
*NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15214 is a reply to message #14587] Sun, 22 July 2007 02:16
jochum
Hi Vasily:

I built the kernel with the patch in the bug report, but it is still failing.

Here is the procedure I used to build the new kernel:

yum -y groupinstall 'Legacy Software Development'
rm -rf /usr/src/*
mkdir -p /usr/src/redhat/SOURCES
mkdir -p /usr/src/redhat/BUILD
rpm -ivh ovzkernel-2.6.18-8.1.4.el5.028stab035.1.src.rpm
cd /usr/src/redhat/SPECS
rpmbuild -bp --target=i686 kernel-ovz.spec
ln -s /usr/src/redhat/BUILD/ovzkernel-2.6.18/linux-2.6.18.i686 /usr/src/linux
cd /usr/src/linux
patch -p1 < diff-ve-nfsstop-b-20070704.patch
yum -y install ncurses-devel
make menuconfig
make clean
# edit the Makefile, and change the EXTRAVERSION =
# i.e. EXTRAVERSION = -8.el5.028stab031.1PAE.prj
make -j4 bzImage
make -j4 modules
make modules_install
make install
# edit /boot/grub/menu.lst, to boot the new kernel

I was not able to capture the kernel dump. If I can get one, I will add it to this thread.


Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15223 is a reply to message #15214] Mon, 23 July 2007 07:56
den
Messages: 494
Registered: December 2005
Senior Member
Hello, Jochum!

Bug #513 aka
http://bugzilla.openvz.org/show_bug.cgi?id=513
has been fixed in the latest OpenVZ kernels, namely in 028stab038. Could you try this one? We believe that the problem should be gone.

Regards,
Den
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15224 is a reply to message #15214] Mon, 23 July 2007 07:59
den
As far as I can see from this log, you used diff-ve-nfs-stop-b-20070502, while the real cure is diff-ve-nfs-stop-c-20070704.

Regards,
Den
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15226 is a reply to message #15224] Mon, 23 July 2007 08:06
dev
Messages: 1693
Registered: September 2005
Location: Moscow
Senior Member

Den is a bit wrong (the patch was simply renamed in CVS, since -b already existed). The patch is correct, but it requires another minor fix. Anyway, we will continue investigating and will provide you with a kernel for debugging today or tomorrow.



Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15229 is a reply to message #15214] Mon, 23 July 2007 11:58
den
Hello again, Jochum!

We have found a problem induced by diff-ve-nfsstop-b-20070704. There is a deadlock if the NFS service is started in VE0 before the vz service, which is generally the case. The patch is attached to the bug.

Though, could you please clarify: are there any changes after diff-ve-nfsstop-b-20070704 or not? We have no feedback on the original defect.

Kirill will provide you with a 028stab039 kernel a bit later.

Regards,
Den
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15230 is a reply to message #15226] Mon, 23 July 2007 11:59
jochum
Hi Den and Dev:

I was able to capture the screen from the kernel panic, if that is of any help (this was still on the ovzkernel-2.6.18-8.1.4.el5.028stab035.1.src.rpm kernel with patch diff-ve-nfsstop-b-20070704.patch installed).

thanks,

Paul
  • Attachment: kernel_panic
    (Size: 446.81KB, Downloaded 432 times)
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15231 is a reply to message #15229] Mon, 23 July 2007 12:10
jochum
Hi Den and Dev:

I am sorry, but I am not sure I understand the question. If you are asking if I installed any other package in my build (other than patch diff-ve-nfsstop-b-20070704.patch), then the answer is no (i.e. only that patch was added). If you are asking if the behavior of the kernel was different once I used that patch, then I believe the answer is also no (to me, the kernel panic looks like it came from the same area, but I am not very good at reading kernel panics, so I can't guarantee that).

Also, I uploaded the kernel panic in the previous response, but forgot to add the extension .png to the file.

thanks,

Paul
Re: vzmond hanging during vzctl stop - ends in panic [message #15232 is a reply to message #14587] Mon, 23 July 2007 14:33
den
Here is a small debug patch. Our current idea about this problem is that there is a "use after free" situation. The patch should clarify whether that is the case.

It will OOPS faster if we guessed correctly. Could you check it and notify us about the results?
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15237 is a reply to message #15231] Mon, 23 July 2007 21:40
jochum
Hi Den:

I tried the patch, but couldn't get it to build. Error message was:

net/ipv4/fib_rules.c: In function ‘fib_rules_destroy’:
net/ipv4/fib_rules.c:158: error: incompatible types in assignment
make[2]: *** [net/ipv4/fib_rules.o] Error 1

thanks,

Paul
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15250 is a reply to message #15237] Tue, 24 July 2007 11:22
den
I was wrong, again :(
Here is a correct one.
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15252 is a reply to message #15237] Tue, 24 July 2007 17:29
jochum
Hi Den:

I have attached a copy of the screen dump from the kernel panic; I assume that is what you are looking for. I looked at /var/log/messages and /var/log/vzctl.log on the host, but nothing was logged around the time of the kernel panic.

thanks,

Paul
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15348 is a reply to message #15252] Fri, 27 July 2007 12:37
den
Hi, Jochum...

Unfortunately, we can't reproduce the situation locally and the information we have is not enough to understand the problem. As far as I see from our talks, this situation is rather frequent for you.

Kirill has prepared the latest OVZ kernel with all the fixes we have. You can download it from
http://download.openvz.org/~dev/028stab039.1/
Could you try this one and, if it does not help, please provide more details about the situation?

According to the dumps we have, the crash occurs at the beginning of ip_route_output_slow on dereference of loopback_dev (this is a macro now).

Do you have problems with one specific VE, or can this happen on any VE with NFS?

Do the crashes happen on one particular node, or are arbitrary nodes affected?

Is there any specific activity before the crash, like checkpointing/restoration/backup/restore etc.?

It would be very helpful if you set up a serial or a network console for the crashing node. We really need to see the whole OOPS to make more productive guesses. Alternatively, you can try to turn off kernel panic on oops and obtain the logs directly from the node. This can be done with
echo 0 > /proc/sys/kernel/panic_on_oops
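
For the network console mentioned above, one option is the kernel's netconsole module. A minimal sketch, assuming the parameter format from the kernel's netconsole documentation; the helper name `netconsole_arg`, the device `eth0`, the address 192.0.2.10 and the port 6666 are placeholders and not values from this thread:

```shell
# Hypothetical helper: build the netconsole module parameter. The documented
# format is
#   netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# Fields left empty here fall back to the kernel defaults.
#   $1 = local network device, $2 = receiver IP, $3 = receiver UDP port
netconsole_arg() {
    printf 'netconsole=@/%s,%s@%s/\n' "$1" "$3" "$2"
}

# On the crashing node, as root (all values are placeholders):
#   modprobe netconsole "$(netconsole_arg eth0 192.0.2.10 6666)"
#   sysctl -w kernel.panic_on_oops=0    # same setting as the echo above
# On the receiving machine, any UDP listener on that port will do.
```

Whether netconsole is built for a given OpenVZ kernel has to be checked on the node itself; a serial console avoids depending on the network stack that is crashing here.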

Here is a small debug patch against 028stab039.1. Could you run the kernel with it?

I hope this will help us :)

Regards,
Den
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15359 is a reply to message #15348] Fri, 27 July 2007 22:20
jochum
Hi Den:

Thanks for responding and your patience in working on this. I will try to answer your questions, and also attach a copy of a full kernel dump.

1) Can I try Kirill's latest version?
A) I downloaded and installed his version (ovzkernel-PAE-2.6.18-8.1.8.el5.028stab039.1.i686.rpm and ovzkernel-PAE-devel-2.6.18-8.1.8.el5.028stab039.1.i686.rpm), but they still failed.

2) I have tried this on 2 different VEs. Both of them are based on CentOS 5.0. One of these was the download from the OpenVZ website, the other came from my own conversion of CentOS4 to CentOS5.

3) Do I have the crashes on the same node, or arbitrary ones?
I have this configuration running on 6 nodes, and I can reproduce the failure on all 6 of them.

4) Is there any specific activity before the crash?
Nothing special. Basically, I start the host, and then use a script to start the VPS, perform a pwd of an automounted NFS filesystem, and then stop the VPS. I can tell it will fail when the VPS does not shut down completely (vzlist -a shows it is running but has 0 processes). Once in this state, it takes a while (anywhere from minutes to hours) before the kernel panics. During this interval, I do receive messages on the console: (nfs) "server xxx not responding, still trying"
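
The start / pwd / stop cycle described in 4) could look roughly like the sketch below. This is not the script used in this thread: `repro_once` is a hypothetical name, VE ID 200 and /home/prj are placeholders, and it assumes that `vzctl exec` hands the quoted command to a shell inside the VE.

```shell
# Hypothetical sketch of one reproduction cycle.
#   $1 = VE ID, $2 = automounted NFS directory inside the VE
# Returns 1 if the VE is left "running" with 0 processes after the stop.
repro_once() {
    veid=$1
    nfs_dir=$2
    vzctl start "$veid" || return 2
    # Enter the automounted directory so that the NFS mount is triggered.
    vzctl exec "$veid" "cd $nfs_dir && pwd" || true
    vzctl stop "$veid"
    # Look for the failure signature in the vzlist output.
    if vzlist -a | awk -v id="$veid" \
        '$1 == id && $2 == 0 && $3 == "running" { found = 1 } END { exit !found }'
    then
        echo "VE $veid is stuck (running, 0 processes)"
        return 1
    fi
    return 0
}

# Example loop on a test node, stopping at the first failure:
#   while repro_once 200 /home/prj; do sleep 5; done
```

Stopping at the first failure leaves the node in the stuck state, which is the point at which a console capture is useful.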


Here is the copy of the kernel dump. Note, this was run against Kirill's kernel, without the included patch. I will next work on building Kirill's kernel with the new patch.

thanks,

Paul

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000040
printing eip:
c05b85e3
*pde = 37532001
Oops: 0000 [#1]
SMP
last sysfs file:
Modules linked in: simfs(U) vzethdev(U) vzrst(U) ip_nat(U) vzcpt(U) ip_conntrack(U) nfnetlink(U) vzdquota(U) xt_tcpudp(U) xt_length(U) ipt_ttl(U) xt_tcpmss(U) ipt_TCPMSS(U) iptable_mangle(U) iptable_filter(U) xt_multiport(U) xt_limit(U) ipt_tos(U) ipt_REJECT(U) ip_tables(U) x_tables(U) autofs4(U) nfs(U) lockd(U) fscache(U) nfs_acl(U) sunrpc(U) vznetdev(U) vzmon(U) vzdev(U) ipv6(U) cpufreq_ondemand(U) dm_mirror(U) dm_mod(U) video(U) sbs(U) i2c_ec(U) button(U) battery(U) asus_acpi(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U) sg(U) i2c_nforce2(U) i2c_core(U) k8_edac(U) pcspkr(U) ide_cd(U) edac_mc(U) e1000(U) serio_raw(U) forcedeth(U) cdrom(U) usb_storage(U) sata_nv(U) libata(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) raid1(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U)
CPU: 2, VCPU: 0.3
EIP: 0060:[<c05b85e3>] Not tainted VLI
EFLAGS: 00010286 (2.6.18-8.1.8.el5.028stab039.1PAE #1)
EIP is at __ip_route_output_key+0xf7/0x822
eax: 00000000 ebx: 00000000 ecx: 00000000 edx: 00000000
esi: 00000000 edi: f7f4bea4 ebp: f63e09c0 esp: f7f4bd90
ds: 007b es: 007b ss: 0068
Process events/3 (pid: 13, veid: 0, ti=f7f4a000 task=f7f49990 task.ti=f7f4a000)
Stack: f7f4bef8 00000000 00000000 00000040 00000000 f765f0c0 f4c724b8 c05d40f4
f4c724b8 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace:
[<c05d40f4>] tcp_v4_send_check+0x76/0xbb
[<c05d4490>] tcp_v4_connect+0x102/0x5f8
[<c04d5d5c>] __next_cpu+0x12/0x21
[<c0418c91>] find_busiest_group+0x178/0x45c
[<c05faef7>] _spin_lock_bh+0x8/0x18
[<c05de6b8>] inet_stream_connect+0x7d/0x208
[<c05f96e7>] schedule+0xcc3/0xda1
[<f91b3bb7>] xs_tcp_connect_worker+0x221/0x2bd [sunrpc]
[<c04314d7>] run_workqueue+0x7f/0xbc
[<f91b3996>] xs_tcp_connect_worker+0x0/0x2bd [sunrpc]
[<c0431af1>] worker_thread+0xd9/0x10c
[<c0419818>] default_wake_function+0x0/0xc
[<c0431a18>] worker_thread+0x0/0x10c
[<c043464b>] kthread+0xc0/0xed
[<c043458b>] kthread+0x0/0xed
[<c05fdc77>] kernel_thread_helper+0x7/0x10
=======================
Code: 83 e6 1d e8 ea 13 f2 ff 8b 07 f7 db 8b 4f 0c 83 e3 fd 89 44 24 24 89 e0 25 00 e0 ff ff 8b 00 8b 80 d4 05 00 00 8b 80 e8 02 00 00 <8b> 40 40 89 4c 24 30 88 5c 24 39 c7 44 24 7c 00 00 00 00 89 44
EIP: [<c05b85e3>] __ip_route_output_key+0xf7/0x822 SS:ESP 0068:f7f4bd90
Kernel panic - not syncing: Fatal exception
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15362 is a reply to message #15359] Sat, 28 July 2007 02:53
jochum
Hi Den:

I rebuilt Kirill's version with the patch. The kernel panic looks the same (at least to me), I have included it below. I did not see anything unusual in the /var/log/vzctl.log file, and the only important thing I saw in /var/log/messages was the following 2 messages:

kernel: nfs: server lss-nfsa09 not responding, still trying
Jul 27 20:23:44 lss-host40 last message repeated 4 times

thanks,

Paul



BUG: unable to handle kernel NULL pointer dereference at virtual address 00000040
printing eip:
c05b85e3
*pde = 00003001
Oops: 0000 [#1]
SMP
last sysfs file:
Modules linked in: simfs(U) vzethdev(U) vzrst(U) ip_nat(U) vzcpt(U) ip_conntrack(U) nfnetlink(U) vzdquota(U) xt_tcpudp(U) xt_length(U) ipt_ttl(U) xt_tcpmss(U) ipt_TCPMSS(U) iptable_mangle(U) iptable_filter(U) xt_multiport(U) xt_limit(U) ipt_tos(U) ipt_REJECT(U) ip_tables(U) x_tables(U) autofs4(U) nfs(U) lockd(U) fscache(U) nfs_acl(U) sunrpc(U) vznetdev(U) vzmon(U) vzdev(U) ipv6(U) cpufreq_ondemand(U) dm_mirror(U) dm_mod(U) video(U) sbs(U) i2c_ec(U) button(U) battery(U) asus_acpi(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U) i2c_nforce2(U) sg(U) pcspkr(U) i2c_core(U) e1000(U) k8_edac(U) edac_mc(U) forcedeth(U) serio_raw(U) ide_cd(U) cdrom(U) usb_storage(U) sata_nv(U) libata(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) raid1(U) ext3(U) jbd(U) ehci_hcd(U) ohci_hcd(U) uhci_hcd(U)
CPU: 1, VCPU: 0.2
EIP: 0060:[<c05b85e3>] Not tainted VLI
EFLAGS: 00010286 (2.6.18-8.1.8.el5.028stab039.1PAE.prj #1)
EIP is at __ip_route_output_key+0xf7/0x822
eax: 00000000 ebx: 00000000 ecx: 00000000 edx: 00000000
esi: 00000000 edi: c5dedea4 ebp: f63019c0 esp: c5dedd90
ds: 007b es: 007b ss: 0068
Process events/2 (pid: 12, veid: 0, ti=c5dec000 task=f7e3a0d0 task.ti=c5dec000)
Stack: c5dedef8 f7615e9e 00000000 c05bf53c f7203000 f67032bc f67032bc f7203000
c059fed2 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace:
[<c05bf53c>] ip_finish_output+0x0/0x19a
[<c059fed2>] dev_hard_start_xmit+0x1b8/0x22a
[<c05c00dd>] ip_queue_xmit+0x3a5/0x3eb
[<c05d4490>] tcp_v4_connect+0x102/0x5f8
[<f8c79238>] scsi_end_request+0x9f/0xa9 [scsi_mod]
[<f8c7938b>] scsi_io_completion+0x149/0x2f3 [scsi_mod]
[<c05d40f4>] tcp_v4_send_check+0x76/0xbb
[<c05cec56>] tcp_transmit_skb+0x6a4/0x6d2
[<c042b0f9>] lock_timer_base+0x15/0x2f
[<c042b20d>] __mod_timer+0x9c/0xa6
[<c05faef7>] _spin_lock_bh+0x8/0x18
[<c05de6b8>] inet_stream_connect+0x7d/0x208
[<c05f96e7>] schedule+0xcc3/0xda1
[<f91bcbe8>] xs_tcp_connect_worker+0x221/0x2bd [sunrpc]
[<c04314d7>] run_workqueue+0x7f/0xbc
[<f91bc9c7>] xs_tcp_connect_worker+0x0/0x2bd [sunrpc]
[<c0431af1>] worker_thread+0xd9/0x10c
[<c0419818>] default_wake_function+0x0/0xc
[<c0431a18>] worker_thread+0x0/0x10c
[<c043464b>] kthread+0xc0/0xed
[<c043458b>] kthread+0x0/0xed
[<c05fdc77>] kernel_thread_helper+0x7/0x10
=======================
Code: 83 e6 1d e8 ea 13 f2 ff 8b 07 f7 db 8b 4f 0c 83 e3 fd 89 44 24 24 89 e0 25 00 e0 ff ff 8b 00 8b 80 d4 05 00 00 8b 80 e8 02 00 00 <8b> 40 40 89 4c 24 30 88 5c 24 39 c7 44 24 7c 00 00 00 00 89 44
EIP: [<c05b85e3>] __ip_route_output_key+0xf7/0x822 SS:ESP 0068:c5dedd90
Kernel panic - not syncing: Fatal exception
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15367 is a reply to message #15362] Sat, 28 July 2007 09:44
den
Are there any debug printks above the OOPS?

Regards,
Den
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15370 is a reply to message #15367] Sat, 28 July 2007 19:10
jochum
Hi Den:

Nothing was printed above the kernel oops on the console, but I did find the following in /var/log/messages:

Jul 28 14:00:43 lss-host40 automount[6771]: umount_multi: could not stat fs of /home/prj
Jul 28 14:00:50 lss-host40 kernel: VE: 200: started
Jul 28 14:00:56 lss-host40 kernel: BUG: unable to handle kernel NULL pointer dereference at virtual address 00000040


thanks,

Paul
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15443 is a reply to message #15370] Tue, 31 July 2007 10:51
den
Here is a correct patch, I believe. The patch is against 028stab039.

Regards,
Den
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15456 is a reply to message #15443] Tue, 31 July 2007 22:48
jochum
Hi Den:

In my limited testing, I can't get the system to panic any longer, but it does seem to corrupt NFS when I stop a VPS now, which causes the system to hang.

I built ovzkernel-2.6.18-8.1.4.el5.028stab039.1.src.rpm, with the following patches:
diff-ve-opseminit-20070723.patch
diff-ve-nfsstop-b-20070704.patch
diff-ve-nfsstop-d-20070731.patch

I do not get a kernel dump, but in /var/log/messages, I received:
Jul 31 13:43:28 lss-host40 kernel: portmap: RPC call returned error 101
Jul 31 13:43:28 lss-host40 kernel: RPC: failed to contact portmap (errno -101).

thanks,

Paul
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15461 is a reply to message #15456] Wed, 01 August 2007 05:26
den
Hi, Paul!

Have you unmounted the NFS share from userspace during the normal VE stop sequence? If NFS is forcibly unmounted from the kernel, it CAN be corrupted.

Though, the hang should be investigated. We do not see it in our environment, again :( After the hang, is it possible to press Alt-SysRq-P several times, then Alt-SysRq-T, and send the call traces to me?
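
If a root shell on the node still responds, the same dumps can be requested through /proc/sysrq-trigger instead of the keyboard. A minimal sketch; `sysrq_dump` is a hypothetical helper, and a register dump requested this way reflects the CPU that handles the write, so it may be less informative than the keyboard combination:

```shell
# Hypothetical helper: request three register dumps (SysRq-P) followed by a
# task list (SysRq-T). The trigger path and the pause between dumps are
# parameters so the function can be pointed at a scratch file for a dry run.
sysrq_dump() {
    trigger=${1:-/proc/sysrq-trigger}
    pause=${2:-1}
    n=0
    while [ "$n" -lt 3 ]; do
        echo p > "$trigger"
        sleep "$pause"
        n=$((n + 1))
    done
    echo t > "$trigger"
}

# On the hung node, as root:
#   sysrq_dump
#   dmesg > /tmp/sysrq-traces.txt    # output file name is a placeholder
```

The traces go to the kernel log, so on a node with many processes the dmesg buffer may not hold all of the task list; a serial or network console captures the full output.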

Regards,
Den
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15490 is a reply to message #15461] Wed, 01 August 2007 22:34
jochum
Hi Den:

Here is the output, hope it helps.

thanks,

Paul

(PS: I know I have a lot of processes running on the host, a lot more than normally needed. Someday I might even get a chance to clean some of them up :) )


  • Attachment: alt-t-p
    (Size: 102.36KB, Downloaded 389 times)
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15497 is a reply to message #15490] Thu, 02 August 2007 07:34
den
Hi, Paul!

The log shows where the deadlock occurs, so I can start to think. Though, I have a small question. Do you use IPv6 in your environment?

Regards,
Den
Re: vzmond hanging during vzctl stop - ends in panic [message #15500 is a reply to message #14587] Thu, 02 August 2007 11:16
jochum
Hi Den:

No, we don't use IPv6.

Paul
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15573 is a reply to message #15490] Mon, 06 August 2007 09:45
den
Hello, Paul!

A small but important question. Do you have an appropriate script inside the VE which unmounts the NFS share during the normal runlevel stop process? I think this should work well as a temporary solution.

I am trying to solve the problem from the kernel point of view, but there is no really good solution for now.
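
A stop-time hook for the temporary solution could be sketched as follows. This is an assumption about how such a script might look, not something posted in this thread: `list_nfs_mounts` is a hypothetical helper, and where to hook it into the VE's shutdown sequence depends on the distribution's init scripts.

```shell
# Hypothetical helper: print the NFS mount points found in a
# /proc/mounts-style file, deepest paths first so that nested mounts are
# unmounted before their parents.
#   $1 = file to read (defaults to /proc/mounts)
list_nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}" | sort -r
}

# Inside the VE, early in the shutdown sequence and as root:
#   for mp in $(list_nfs_mounts); do umount "$mp"; done
# or, without the helper:
#   umount -a -t nfs
```

With automounted shares the automounter would normally be stopped before this runs, so that it does not remount a share right after it is released.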

Regards,
Den
Re: *NOT SOLVED* vzmond hanging during vzctl stop - ends in panic [message #15590 is a reply to message #15573] Mon, 06 August 2007 14:16
jochum
Hi Den:

Up till now, I have not been unmounting the file systems on shutdown. I will be out of the office this week, but will add that to my scripts next week, and see if that clears the problem.

thanks, and I hope you have a good week,

Paul
Re: vzmond hanging during vzctl stop - ends in panic [message #36498 is a reply to message #14587] Wed, 24 June 2009 20:35
bbjwp
Messages: 1
Registered: June 2009
Junior Member
We're presently being affected by this bug on both:

* 2.6.18-92.1.18.el5.028stab060.8
* 2.6.18-128.1.1.el5.028stab062.3

We're using CentOS 5.3 templates with NFS mounted. Unmounting the NFS before a shut down does not resolve this for us: We still get the 0 NPROC listed and the 100% CPU for vzmond/7279.

We have other hosts where this isn't a problem, but on this particular setup, we've confirmed this issue across 6 machines.

Nothing shows up in /var/log/messages or dmesg.

Yesterday, the containers stopped after a while (20 minutes?). Today, they seem to just be hanging.