OpenVZ Forum


TUN causing instability. (Enabling TUN on a VPS causes high CPU loads (300.0+); live migrations cause kernel panics.)
[message #44330] Wed, 30 November 2011 08:42
KuJoe
Messages: 11
Registered: November 2011
Junior Member
We recently migrated to new hardware nodes with the latest OpenVZ kernel and started to experience instability with TUN. At first, every live migration of a VPS with TUN enabled caused a kernel panic. Now when we enable TUN on a VPS and the client runs "tunctl -t tun0", they get the error "enabling TUNSETPERSIST: Operation not permitted" and the load on the node spikes above 300.0, at which point the node has to be forcefully rebooted.

Is it really that easy to crash a whole OpenVZ node with a single command? Any ideas how to fix this?

Kernel: 2.6.18-274.7.1.el5.028stab095.1
vzctl version 3.0.29.3
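
For readers who want to see which kernel calls are involved, here is a rough sketch of what "tunctl -t tun0" appears to do, assuming the standard Linux TUN/TAP interface (open /dev/net/tun, TUNSETIFF, then TUNSETPERSIST). The helper names build_ifreq and make_persistent are made up for illustration, and the default TAP flag is an assumption about tunctl's default mode. Nothing here calls make_persistent, and it should not be run on a node with the kernel above, since it is the same sequence reported to hang the node.

```python
import fcntl
import os
import struct

# ioctl numbers and flags from <linux/if_tun.h> (values as on x86/x86_64 Linux).
TUNSETIFF = 0x400454CA
TUNSETPERSIST = 0x400454CB
IFF_TUN = 0x0001
IFF_TAP = 0x0002
IFF_NO_PI = 0x1000


def build_ifreq(name: str, flags: int) -> bytes:
    """Pack a struct ifreq: 16-byte name, 16-bit flags, zero padding to 40 bytes."""
    return struct.pack("16sH22x", name.encode("ascii"), flags)


def make_persistent(name: str = "tun0", flags: int = IFF_TAP | IFF_NO_PI) -> None:
    """Attach to `name` and mark it persistent (hypothetical helper, not tunctl itself).

    WARNING: per this thread, this sequence can hang an affected OpenVZ node.
    """
    fd = os.open("/dev/net/tun", os.O_RDWR)
    try:
        fcntl.ioctl(fd, TUNSETIFF, build_ifreq(name, flags))
        # The step that fails with "Operation not permitted" in the report above.
        fcntl.ioctl(fd, TUNSETPERSIST, 1)
    finally:
        # Closing the descriptor is handled by the tun driver's close routine.
        os.close(fd)


# Safe to run anywhere: only builds the request structure, no ioctl is issued.
request = build_ifreq("tun0", IFF_TAP | IFF_NO_PI)
print(len(request), request[:5])
```

The packing step is separated from the ioctl calls so the request layout can be inspected without touching /dev/net/tun.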
Re: TUN causing instability. [message #44362 is a reply to message #44330] Fri, 02 December 2011 05:48
KuJoe
Any ideas of where to look? I've checked all logs, top, ps, and iotop, but there is no sign of what is causing the high loads. I moved the client to his own node with different hardware and the latest kernel but the problem continues.
Re: TUN causing instability. [message #44363 is a reply to message #44362] Fri, 02 December 2011 06:56
KuJoe
Resolved by using an older kernel.
Re: TUN causing instability. [message #44947 is a reply to message #44330] Sun, 15 January 2012 22:05
Bryon
Messages: 5
Registered: January 2012
Junior Member
We're experiencing the same issue with multiple servers running the latest OpenVZ kernel. Which older kernel did you switch to?

If a container (with TUN enabled) runs "tunctl" while the system is actively in use by customers, the node begins to log many hung / blocked task messages. On a busy system, the load spikes immediately. Right after tunctl is run, no new connections to the system can be made (e.g. SSH), and tunctl never returns; it only outputs "enabling TUNSETPERSIST: Operation not permitted."

On a busy system the server will eventually crash, with many lines similar to those below in the messages log:

Jan 14 20:44:40 x22 kernel: INFO: task irqbalance:9026 blocked for more than 300 seconds.
Jan 14 20:44:40 x22 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 14 20:44:40 x22 kernel: irqbalance    D ffff81043b8e2ba0     0  9026      1          9061  8885 (NOTLB)
Jan 14 20:44:40 x22 kernel:  ffff81042c73bd78 0000000000000086 3933323036383031 00002b3d84fe2035
Jan 14 20:44:40 x22 kernel:  ffff81043b8e2ba0 ffffffff8031bba0 0004e24c1dee8030 000bbd68d82c77c7
Jan 14 20:44:40 x22 kernel:  ffff81043b8e2da8 0000000000030002 0000000000000000 ffffffff804a6280
Jan 14 20:44:40 x22 kernel: Call Trace:
Jan 14 20:44:40 x22 kernel:  [<ffffffff8006520d>] __mutex_lock_slowpath+0x60/0x9b
Jan 14 20:44:40 x22 kernel:  [<ffffffff8023215e>] dev_name_hash+0x1e/0x64
Jan 14 20:44:40 x22 kernel:  [<ffffffff80065257>] .text.lock.mutex+0xf/0x14
Jan 14 20:44:40 x22 kernel:  [<ffffffff80232598>] dev_load+0x18/0x46
Jan 14 20:44:40 x22 kernel:  [<ffffffff80232cb0>] dev_ioctl+0x317/0x497
Jan 14 20:44:40 x22 kernel:  [<ffffffff802277fa>] sock_ioctl+0x1d4/0x1e5
Jan 14 20:44:40 x22 kernel:  [<ffffffff80043f2e>] do_ioctl+0x21/0x6b
Jan 14 20:44:40 x22 kernel:  [<ffffffff8003154a>] vfs_ioctl+0x457/0x4b9
Jan 14 20:44:40 x22 kernel:  [<ffffffff800c37ca>] audit_syscall_entry+0x1a8/0x1d3
Jan 14 20:44:40 x22 kernel:  [<ffffffff8004ec3e>] sys_ioctl+0x3c/0x5c
Jan 14 20:44:40 x22 kernel:  [<ffffffff800602dd>] tracesys+0xd5/0xe0
Jan 14 20:44:40 x22 kernel:
Jan 14 20:44:40 x22 kernel: INFO: task tunctl:22859 blocked for more than 300 seconds.
Jan 14 20:44:40 x22 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 14 20:44:40 x22 kernel: tunctl        D ffff81037b0b61a0     0 22859  22784                     (L-TLB)
Jan 14 20:44:40 x22 kernel:  ffff8101c0635d68 0000000000000046 0000000100000000 ffff8102003cda08
Jan 14 20:44:40 x22 kernel:  ffff81037b0b61a0 ffff81033bcf32e0 0004e2443ca8806b 000bbd55e7240efd
Jan 14 20:44:40 x22 kernel:  ffff81037b0b63a8 0000000300000000 ffff8101da059980 ffff8102f59a0000
Jan 14 20:44:40 x22 kernel: Call Trace:
Jan 14 20:44:40 x22 kernel:  [<ffffffff8006520d>] __mutex_lock_slowpath+0x60/0x9b
Jan 14 20:44:40 x22 kernel:  [<ffffffff80065257>] .text.lock.mutex+0xf/0x14
Jan 14 20:44:40 x22 kernel:  [<ffffffff88a1da20>] :tun:tun_chr_close+0x32/0x78
Jan 14 20:44:40 x22 kernel:  [<ffffffff800128ef>] __fput+0xd3/0x1c2
Jan 14 20:44:40 x22 kernel:  [<ffffffff800248f8>] filp_close+0x5c/0x64
Jan 14 20:44:40 x22 kernel:  [<ffffffff8003a94e>] put_files_struct+0x63/0xae
Jan 14 20:44:40 x22 kernel:  [<ffffffff80015ba0>] do_exit+0x74c/0xe2d
Jan 14 20:44:40 x22 kernel:  [<ffffffff800c37ca>] audit_syscall_entry+0x1a8/0x1d3
Jan 14 20:44:40 x22 kernel:  [<ffffffff8004b625>] cpuset_exit+0x0/0x88
Jan 14 20:44:40 x22 kernel:  [<ffffffff800602dd>] tracesys+0xd5/0xe0
Jan 14 20:44:40 x22 kernel:


On a system with low activity nothing appears within any monitoring utilities (e.g. top, iostat) or within messages/dmesg.
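
To get a quick list of which tasks are stuck, the hung-task entries above can be pulled out of the log with a small script. This is only a sketch: the regular expression is written against the "INFO: task NAME:PID blocked for more than N seconds" lines shown above, and the function name blocked_tasks is made up for this example.

```python
import re

# Matches lines such as:
#   "... kernel: INFO: task tunctl:22859 blocked for more than 300 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def blocked_tasks(lines):
    """Return (task name, pid, seconds) for every hung-task report in `lines`."""
    found = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            found.append(
                (match.group("name"), int(match.group("pid")), int(match.group("secs")))
            )
    return found


# Sample lines copied from the log excerpt above.
sample = [
    "Jan 14 20:44:40 x22 kernel: INFO: task irqbalance:9026 blocked for more than 300 seconds.",
    "Jan 14 20:44:40 x22 kernel: Call Trace:",
    "Jan 14 20:44:40 x22 kernel: INFO: task tunctl:22859 blocked for more than 300 seconds.",
]
print(blocked_tasks(sample))
```

In both traces above the blocked task is waiting in __mutex_lock_slowpath, so a list like this mainly helps to see how many unrelated processes (irqbalance here) end up queued behind the same lock as tunctl.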

[Updated on: Thu, 19 January 2012 04:17]

Re: TUN causing instability. [message #44948 is a reply to message #44330] Sun, 15 January 2012 22:08
Bryon
I forgot to mention we're using kernel 2.6.18-274.7.1.el5.028stab095.1, the same as KuJoe.

I can also say that we've experienced many "crashes" during live migrations. We're fairly certain the crashes occurred during the migration of a container with TUN enabled and most likely in use. So far we've been unable to capture the kernel panic output because we do not have physical access to the systems.

[Updated on: Thu, 19 January 2012 04:16]

Re: TUN causing instability. [message #44956 is a reply to message #44948] Thu, 19 January 2012 04:10
KuJoe
We reverted to 2.6.18-238.19.1.el5.028stab092.2. This is the latest kernel we've found that does not allow clients to crash the hardware node.
Re: TUN causing instability. [message #44957 is a reply to message #44956] Thu, 19 January 2012 04:14
Bryon
Thanks for letting me know. We've also downgraded to 2.6.18-238.19.1.el5.028stab092.2 on servers with clients that require TUN.
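
For anyone checking a fleet of nodes, a small helper can flag kernels newer than the one both of us reverted to. This is a sketch only: the "known good" value is just what has been reported in this thread (not an official statement), the function names are made up, and the comparison assumes all kernels are from the same 028 branch.

```python
import re

# Last kernel reported as unaffected in this thread (028stab092.2).
# This is an observation from two posters, not an official fix boundary.
KNOWN_GOOD = (92, 2)

STAB = re.compile(r"(\d{3})stab(\d+)\.(\d+)")


def stab_version(release: str):
    """Extract the OpenVZ 'stab' suffix as integers, e.g. 028stab095.1 -> (95, 1)."""
    match = STAB.search(release)
    if match is None:
        raise ValueError(f"no OpenVZ stab suffix in {release!r}")
    return int(match.group(2)), int(match.group(3))


def newer_than_known_good(release: str) -> bool:
    """True if `release` is newer than the kernel reported as unaffected."""
    return stab_version(release) > KNOWN_GOOD


for release in (
    "2.6.18-274.7.1.el5.028stab095.1",
    "2.6.18-238.19.1.el5.028stab092.2",
):
    print(release, newer_than_known_good(release))
```

The release string would normally come from "uname -r" on each node; here the two kernels mentioned in this thread are used as input.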

[Updated on: Thu, 19 January 2012 04:14]

Re: TUN causing instability. [message #45511 is a reply to message #44330] Tue, 13 March 2012 17:29
Bryon
Looks like this may have been fixed in 028stab099.3.

Look up Bugzilla #2207. I can't include a link in my reply.
Re: TUN causing instability. [message #45685 is a reply to message #44330] Thu, 29 March 2012 04:28
KuJoe
My node running 99.3 just hit a kernel panic while migrating a VPS using TUN. Next time it happens I'll take a screenshot. Back to 92.2 I go. :(
Re: TUN causing instability. [message #45686 is a reply to message #44330] Thu, 29 March 2012 04:48
KuJoe
Kernel panic #2. Screenshot attached.
  • Attachment: 004.png
    (Size: 20.94KB, Downloaded 195 times)