OpenVZ Forum


Main Server Hangs (Server randomly hangs and becomes unavailable) [message #43847] Mon, 24 October 2011 13:57
YaoDzi
Hello,

The main server hangs under overload: monitoring reports the CPU load as CRITICAL, with a load average of 121.90 and more than 1200 processes.

We have checked the statistics of each virtual server and did not find any troublemaker there. What else should we check, and how can we find the cause of this overload?
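One thing that could be checked is how many processes are in uninterruptible sleep (state D), since on Linux those count toward the load average even when they use no CPU. The following is only a minimal sketch, assuming a standard /proc layout on the main server; the function names parse_stat and count_states are made up for illustration and are not part of any OpenVZ tool.

```python
import os
from collections import Counter


def parse_stat(stat_text):
    """Return (command, state) from the contents of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')' characters, so split on the first '(' and the last ')'.
    open_idx = stat_text.index("(")
    close_idx = stat_text.rindex(")")
    comm = stat_text[open_idx + 1:close_idx]
    state = stat_text[close_idx + 1:].split()[0]
    return comm, state


def count_states(proc_root="/proc"):
    """Count processes per state and D-state processes per command name."""
    states = Counter()
    blocked = Counter()
    if not os.path.isdir(proc_root):
        return states, blocked
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are process directories
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as fh:
                comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or its stat file is unreadable
        states[state] += 1
        if state == "D":
            blocked[comm] += 1
    return states, blocked


if __name__ == "__main__":
    states, blocked = count_states()
    print("processes per state:", dict(states))
    print("most common commands in state D:")
    for comm, count in blocked.most_common(10):
        print(f"{count:5d}  {comm}")
```

If most of the 1200 processes turn out to be in state D rather than R, the high load average would point to tasks waiting on something (I/O or a kernel lock) rather than to real CPU consumption.
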

Recent log when it froze:

Quote:
Oct 24 11:46:50 v1 kernel: INFO: task events/0:18 blocked for more than 300 seconds.
Oct 24 11:46:50 v1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 24 11:46:50 v1 kernel: events/0 D ffff81032ebff360 0 18 1 19 17 (L-TLB)
Oct 24 11:46:50 v1 kernel: ffff81032ee01dc0 0000000000000046 0000000000000000 0000000100000002
Oct 24 11:46:50 v1 kernel: ffff81032ebff360 ffff81032f30c160 00018a72fa95cb4c 000276225002dd32
Oct 24 11:46:50 v1 kernel: ffff81032ebff568 0000000480499d00 0000000000000000 ffff81032f308000
Oct 24 11:46:50 v1 kernel: Call Trace:
Oct 24 11:46:50 v1 kernel: [<ffffffff8023a575>] linkwatch_event+0x0/0x30
Oct 24 11:46:50 v1 kernel: [<ffffffff8006520d>] __mutex_lock_slowpath+0x60/0x9b
Oct 24 11:46:50 v1 kernel: [<ffffffff80065257>] .text.lock.mutex+0xf/0x14
Oct 24 11:46:50 v1 kernel: [<ffffffff8023a57e>] linkwatch_event+0x9/0x30
Oct 24 11:46:50 v1 kernel: [<ffffffff8004fb4c>] run_workqueue+0x9e/0xfc
Oct 24 11:46:50 v1 kernel: [<ffffffff8004c2b8>] worker_thread+0x0/0x122
Oct 24 11:46:50 v1 kernel: [<ffffffff8004c3a8>] worker_thread+0xf0/0x122
Oct 24 11:46:50 v1 kernel: [<ffffffff8008b636>] default_wake_function+0x0/0xe
Oct 24 11:46:50 v1 kernel: [<ffffffff80033a47>] kthread+0xfe/0x132
Oct 24 11:46:50 v1 kernel: [<ffffffff80061001>] child_rip+0xa/0x11
Oct 24 11:46:50 v1 kernel: [<ffffffff80033949>] kthread+0x0/0x132
Oct 24 11:46:50 v1 kernel: [<ffffffff80060ff7>] child_rip+0x0/0x11
Oct 24 11:46:50 v1 kernel:
Oct 24 11:46:50 v1 xinetd[10793]: EXIT: nrpe status=0 pid=121739 duration=10(sec)
Oct 24 11:46:50 v1 kernel: INFO: task irqbalance:10395 blocked for more than 300 seconds.
Oct 24 11:46:50 v1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 24 11:46:50 v1 kernel: irqbalance D ffff81032cc90260 0 10395 1 10426 9215 (NOTLB)
Oct 24 11:46:50 v1 kernel: ffff81032106dd78 0000000000000086 3738313633303839 000000000000006c
Oct 24 11:46:50 v1 kernel: ffff81032cc90260 ffff8100418d3660 00018a790d47804d 0002762c03a4248c
Oct 24 11:46:50 v1 kernel: ffff81032cc90468 000000002711fa28 ffff8102ea36f680 ffff81031187e000
Oct 24 11:46:50 v1 kernel: Call Trace:
Oct 24 11:46:50 v1 kernel: [<ffffffff8006520d>] __mutex_lock_slowpath+0x60/0x9b
Oct 24 11:46:50 v1 kernel: [<ffffffff80231e8f>] dev_name_hash+0x1e/0x64
Oct 24 11:46:50 v1 kernel: [<ffffffff80065257>] .text.lock.mutex+0xf/0x14
Oct 24 11:46:50 v1 kernel: [<ffffffff802322c9>] dev_load+0x18/0x46
Oct 24 11:46:50 v1 kernel: [<ffffffff802329e1>] dev_ioctl+0x317/0x497
Oct 24 11:46:50 v1 kernel: [<ffffffff80060e39>] error_exit+0x0/0x84
Oct 24 11:46:50 v1 kernel: [<ffffffff80227730>] sock_ioctl+0x1d4/0x1e5
Oct 24 11:46:50 v1 kernel: [<ffffffff80043f27>] do_ioctl+0x21/0x6b
Oct 24 11:46:50 v1 kernel: [<ffffffff80031550>] vfs_ioctl+0x457/0x4b9
Oct 24 11:46:50 v1 kernel: [<ffffffff800c3842>] audit_syscall_entry+0x1a8/0x1d3
Oct 24 11:46:50 v1 kernel: [<ffffffff8004ec0e>] sys_ioctl+0x3c/0x5c
Oct 24 11:46:50 v1 kernel: [<ffffffff800602dd>] tracesys+0xd5/0xe0
Oct 24 11:46:50 v1 kernel:
Oct 24 11:46:50 v1 kernel: INFO: task ntpd:17253 blocked for more than 300 seconds.
Oct 24 11:46:50 v1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 24 11:46:50 v1 kernel: ntpd D ffff810316328960 0 17253 16562 18452 17104 (NOTLB)
Oct 24 11:46:50 v1 kernel: ffff8102f26b1cb8 0000000000200086 0000000000000000 0000000000000000
Oct 24 11:46:51 v1 kernel: ffff810316328960 ffff81032f3f2220 00018a7b5d8c2b7c 0002762fb5cac0a4
Oct 24 11:46:51 v1 kernel: ffff810316328b68 00000007173f2e40 0000000000000000 ffff81032f3ee000
Oct 24 11:46:51 v1 kernel: Call Trace:
Oct 24 11:46:51 v1 kernel: [<ffffffff8006520d>] __mutex_lock_slowpath+0x60/0x9b
Oct 24 11:46:51 v1 kernel: [<ffffffff80065257>] .text.lock.mutex+0xf/0x14
Oct 24 11:46:51 v1 kernel: [<ffffffff8001abaf>] vsnprintf+0x5df/0x627
Oct 24 11:46:51 v1 kernel: [<ffffffff802326ee>] dev_ioctl+0x24/0x497
Oct 24 11:46:51 v1 kernel: [<ffffffff80227730>] sock_ioctl+0x1d4/0x1e5
Oct 24 11:46:51 v1 kernel: [<ffffffff80043f27>] do_ioctl+0x21/0x6b
Oct 24 11:46:51 v1 kernel: [<ffffffff80031550>] vfs_ioctl+0x457/0x4b9
Oct 24 11:46:51 v1 kernel: [<ffffffff80109026>] inotify_d_instantiate+0x12/0x3c
Oct 24 11:46:51 v1 kernel: [<ffffffff8004ec0e>] sys_ioctl+0x3c/0x5c
Oct 24 11:46:51 v1 kernel: [<ffffffff8010f267>] dev_ifconf+0xe5/0x1ab
Oct 24 11:46:51 v1 kernel: [<ffffffff8010d20a>] compat_sys_ioctl+0x24e/0x292
Oct 24 11:46:51 v1 kernel: [<ffffffff80062766>] ia32_sysret+0x0/0x5
Oct 24 11:46:51 v1 kernel:
Oct 24 11:46:51 v1 kernel: INFO: task nmbd:18732 blocked for more than 300 seconds.
Oct 24 11:46:51 v1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 24 11:46:51 v1 kernel: nmbd D ffff810315fccae0 0 18732 18550 938512 18727 (NOTLB)
Oct 24 11:46:51 v1 kernel: ffff8102eb0edb58 0000000000000082 ffff8102eb0eddd8 ffff8102eb0edde0
Oct 24 11:46:51 v1 kernel: ffff810315fccae0 ffff81032f39a1e0 00018a797e82e2b6 0002762cb8879978
Oct 24 11:46:51 v1 kernel: ffff810315fccce8 0000000600000000 0000000000000000 ffff81032f396000
Oct 24 11:46:51 v1 kernel: Call Trace:
Oct 24 11:46:51 v1 kernel: [<ffffffff8006520d>] __mutex_lock_slowpath+0x60/0x9b
Oct 24 11:46:51 v1 kernel: [<ffffffff80065257>] .text.lock.mutex+0xf/0x14
Oct 24 11:46:51 v1 kernel: [<ffffffff800ab824>] ub_sock_getwres_other+0x94/0xc3
Oct 24 11:46:51 v1 kernel: [<ffffffff80239afd>] rtnetlink_rcv+0x15/0x3b
Oct 24 11:46:51 v1 kernel: [<ffffffff80248109>] netlink_data_ready+0x12/0x50
Oct 24 11:46:51 v1 kernel: [<ffffffff80248c18>] netlink_sendmsg+0x4a4/0x4d8
Oct 24 11:46:51 v1 kernel: [<ffffffff8008b636>] default_wake_function+0x0/0xe
Oct 24 11:46:51 v1 kernel: [<ffffffff80057ce7>] sock_sendmsg+0xe5/0x137
Oct 24 11:46:51 v1 kernel: [<ffffffff800a3453>] autoremove_wake_function+0x0/0x2e
Oct 24 11:46:51 v1 kernel: [<ffffffff8001abaf>] vsnprintf+0x5df/0x627
Oct 24 11:46:51 v1 kernel: [<ffffffff800f890d>] core_sys_select+0x21d/0x27f
Oct 24 11:46:51 v1 kernel: [<ffffffff8002f0c8>] __wake_up+0x38/0x4f
Oct 24 11:46:51 v1 kernel: [<ffffffff80228a4f>] sys_sendto+0x131/0x164
Oct 24 11:46:51 v1 kernel: [<ffffffff802281a9>] move_addr_to_user+0x5d/0x78
Oct 24 11:46:51 v1 kernel: [<ffffffff802286b4>] sys_getsockname+0x72/0xa2
Oct 24 11:46:51 v1 kernel: [<ffffffff800602dd>] tracesys+0xd5/0xe0
Oct 24 11:46:51 v1 kernel:
Oct 24 11:46:51 v1 kernel: INFO: task nmbd:26759 blocked for more than 300 seconds.
Oct 24 11:46:51 v1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 24 11:46:51 v1 kernel: nmbd D ffff8102b38d65e0 0 26759 25068 26762 26750 (NOTLB)
Oct 24 11:46:51 v1 kernel: ffff810296d7db58 0000000000000082 ffff8102b38d67e8 0000000680030d42
Oct 24 11:46:51 v1 kernel: ffff8102b38d65e0 ffff81032f30c160 00018a73ded436d0 00027623bca1af74
Oct 24 11:46:51 v1 kernel: ffff8102b38d67e8 000000048003f7bc 0000000000000000 ffff81032f308000
Oct 24 11:46:51 v1 kernel: Call Trace:
Oct 24 11:46:51 v1 kernel: [<ffffffff8004d560>] try_to_del_timer_sync+0x7f/0x88
Oct 24 11:46:51 v1 kernel: [<ffffffff8006520d>] __mutex_lock_slowpath+0x60/0x9b
Oct 24 11:46:51 v1 kernel: [<ffffffff80065257>] .text.lock.mutex+0xf/0x14
Oct 24 11:46:51 v1 kernel: [<ffffffff800ab824>] ub_sock_getwres_other+0x94/0xc3
Oct 24 11:46:51 v1 kernel: [<ffffffff80239afd>] rtnetlink_rcv+0x15/0x3b
Oct 24 11:46:51 v1 kernel: [<ffffffff80248109>] netlink_data_ready+0x12/0x50
Oct 24 11:46:51 v1 kernel: [<ffffffff80248c18>] netlink_sendmsg+0x4a4/0x4d8
Oct 24 11:46:51 v1 kernel: [<ffffffff8008b636>] default_wake_function+0x0/0xe
Oct 24 11:46:51 v1 kernel: [<ffffffff80057ce7>] sock_sendmsg+0xe5/0x137
Oct 24 11:46:51 v1 kernel: [<ffffffff8005ab1a>] __ip_route_output_key+0x17a/0x97f
Oct 24 11:46:51 v1 kernel: [<ffffffff800a3453>] autoremove_wake_function+0x0/0x2e
Oct 24 11:46:51 v1 kernel: [<ffffffff8001abaf>] vsnprintf+0x5df/0x627
Oct 24 11:46:51 v1 kernel: [<ffffffff8002f0c8>] __wake_up+0x38/0x4f
Oct 24 11:46:51 v1 kernel: [<ffffffff80228a4f>] sys_sendto+0x131/0x164
Oct 24 11:46:51 v1 kernel: [<ffffffff802281a9>] move_addr_to_user+0x5d/0x78
Oct 24 11:46:51 v1 kernel: [<ffffffff802286b4>] sys_getsockname+0x72/0xa2
Oct 24 11:46:51 v1 kernel: [<ffffffff800602dd>] tracesys+0xd5/0xe0
Oct 24 11:46:51 v1 kernel:
Oct 24 11:46:52 v1 kernel: INFO: task dsm_om_connsvcd:43862 blocked for more than 300 seconds.
Oct 24 11:46:52 v1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 24 11:46:52 v1 kernel: dsm_om_connsv D ffff81022509cb60 0 43862 43703 43918 43861 (NOTLB)
Oct 24 11:46:52 v1 kernel: ffff810220ba3b58 0000000000000082 ffff810324cad440 ffff810058bc08c8
Oct 24 11:46:52 v1 kernel: ffff81022509cb60 ffff81032f2460a0 00018a82fbe5f51c 0002763be17c22b6
Oct 24 11:46:52 v1 kernel: ffff81022509cd68 0000000180089baa 0000000000000000 ffff81032f242000
Oct 24 11:46:52 v
...
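To see what the blocked tasks have in common, the log could be summarised with a short script. This is only a sketch: the function name summarize_hung_tasks and the two regular expressions are assumptions written to match the message format quoted above, not an existing tool. It lists the blocked tasks and counts in how many call traces each kernel function appears.

```python
import re
from collections import Counter

# "INFO: task <name>:<pid> blocked for more than <n> seconds."
TASK_RE = re.compile(r"INFO: task (.+):(\d+) blocked for more than (\d+) seconds")
# "[<address>] function+0xoffset/0xsize"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\S+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_hung_tasks(log_text):
    """Return ([(task, pid), ...], Counter of function -> number of traces)."""
    traces = []  # one ((task, pid), set_of_functions) pair per hung-task report
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            traces.append(((task.group(1), int(task.group(2))), set()))
            continue
        frame = FRAME_RE.search(line)
        if frame and traces:
            # Attribute the frame to the most recent hung-task report.
            traces[-1][1].add(frame.group(1))
    counts = Counter()
    for _, functions in traces:
        counts.update(functions)  # each function counted once per trace
    return [ident for ident, _ in traces], counts
```

It can be fed the contents of the syslog file, for example `tasks, counts = summarize_hung_tasks(open(path).read())` with `path` pointing at wherever the kernel messages are stored, followed by `counts.most_common(10)`. In the excerpt above, every complete trace contains __mutex_lock_slowpath, and several go through dev_ioctl or rtnetlink_rcv, which suggests the tasks are waiting on a kernel mutex rather than consuming CPU.
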

 