Server crash [message #36550] |
Tue, 30 June 2009 12:13 |
dvazart
Messages: 37 Registered: October 2008 Location: France
Member
Hi again !
I'm running a 2.6.18-14-ovz-686-enterprise kernel under Debian Etch. I have about 150 VEs running on two quad-core Intel Xeon processors with 32 GB of RAM.
My HN crashes every weekend at different times. This weekend it crashed twice, and my customers are not really happy.
I think I saw output resembling an "oops" (http://wiki.openvz.org/Oops) on the console, but I'm not sure. The logs show nothing abnormal except this:
tail /var/log/messages
Jun 23 18:31:59 sht2 kernel: oom-killer: gfp_mask=0xd0, order=0
Jun 23 18:31:59 sht2 kernel: [<c0159a39>] out_of_memory+0x109/0x150
Jun 23 18:31:59 sht2 kernel: [<c015b5e8>] __alloc_pages+0x328/0x3a0
Jun 23 18:31:59 sht2 kernel: [<c015b681>] __get_free_pages+0x21/0x50
Jun 23 18:31:59 sht2 kernel: [<c0190311>] __pollwait+0xb1/0x110
Jun 23 18:31:59 sht2 kernel: [<c04287cf>] tcp_poll+0x2f/0x220
Jun 23 18:31:59 sht2 kernel: [<c03f26c0>] sock_poll+0x20/0x30
Jun 23 18:31:59 sht2 kernel: [<c018f9d1>] do_select+0x291/0x4d0
Jun 23 18:31:59 sht2 kernel: [<c0190260>] __pollwait+0x0/0x110
Jun 23 18:31:59 sht2 kernel: [<c0119d20>] default_wake_function+0x0/0x20
Jun 23 18:31:59 sht2 last message repeated 19 times
Jun 23 18:31:59 sht2 kernel: [<c018fdf1>] core_sys_select+0x1e1/0x330
Jun 23 18:31:59 sht2 kernel: [<c01794f8>] do_sync_write+0xc8/0x110
Jun 23 18:31:59 sht2 kernel: [<c013a900>] autoremove_wake_function+0x0/0x60
Jun 23 18:31:59 sht2 kernel: [<c01b47b8>] dnotify_parent+0x38/0xd0
Jun 23 18:31:59 sht2 kernel: [<c019065d>] sys_select+0x4d/0x1c0
Jun 23 18:31:59 sht2 kernel: [<c017a891>] sys_write+0xb1/0xc0
Jun 23 18:31:59 sht2 kernel: [<c010322f>] syscall_call+0x7/0xb
Is this normal?
I have 3 other questions:
- Is it possible that a misconfiguration in the VEs can crash the server?
- Could it be a bug in the OpenVZ kernel? (I use 028stab056.1dso1.)
- Because my server has 32 GB of RAM, I would rather not run a memory test, since that could take a long time.
Can you advise me?
Thanks!
----------- Daniel Vazart ------------
"Knowledge is power, Sharing is human"
------- http://www.vazart.net --------
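The `gfp_mask=0xd0` in the oom-killer line above can be decoded to see what kind of allocation failed. The following is a minimal sketch, not an authoritative tool: the flag values in the table are assumed from the 2.6.18-era `include/linux/gfp.h` and should be checked against the headers of the kernel actually running, and the helper names `decode_gfp_mask` and `may_use_highmem` are made up for illustration.

```python
# Bit values assumed from the 2.6.18-era include/linux/gfp.h.
# Patched kernels (OpenVZ included) may define additional bits.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}


def decode_gfp_mask(mask):
    """Return the names of the known flags set in mask, lowest bit first.

    Bits that are not in the table are reported as one trailing hex value.
    """
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]
    known = 0
    for bit in GFP_FLAGS:
        known |= bit
    unknown = mask & ~known
    if unknown:
        names.append(hex(unknown))
    return names


def may_use_highmem(mask):
    """True if the allocation is allowed to be served from high memory."""
    return bool(mask & 0x02)


print(decode_gfp_mask(0xD0))
print(may_use_highmem(0xD0))
```

Under these assumed values, `0xd0` has the `__GFP_HIGHMEM` bit clear, so the request could not be satisfied from high memory. On a 32-bit kernel that is one reason the oom-killer can fire while plenty of RAM still looks free.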
Re: Server crash [message #36556 is a reply to message #36550] |
Tue, 30 June 2009 13:31 |
dvazart
Thanks for your answer.
So, if my HN runs out of memory with only half of its memory in use, can this produce the "oops" in OpenVZ?
And does this mean that I have a problem with some of the memory modules?
Am I right?
----------- Daniel Vazart ------------
"Knowledge is power, Sharing is human"
------- http://www.vazart.net --------
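One way to look into an out-of-memory condition that occurs with only half the RAM in use is to watch low memory separately from total memory. On 32-bit kernels built with highmem support, `/proc/meminfo` reports `LowTotal` and `LowFree` alongside `MemTotal` and `MemFree`. The sketch below parses that format; the helper names and the sample figures are invented for illustration, and whether low memory is really what runs out on this HN is an assumption that would need checking on the machine itself.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def low_memory_free_fraction(info):
    """Return LowFree / LowTotal, or None if the kernel does not report them."""
    total = info.get("LowTotal")
    free = info.get("LowFree")
    if not total or free is None:
        return None
    return free / total


# Made-up sample values, NOT taken from the server in this thread.
SAMPLE = """MemTotal:     33000000 kB
MemFree:      16000000 kB
LowTotal:       800000 kB
LowFree:         20000 kB
"""

sample_info = parse_meminfo(SAMPLE)
print(low_memory_free_fraction(sample_info))
```

On a live host the same functions can be fed the contents of `/proc/meminfo`. A small low-memory fraction next to a large `MemFree` would point toward low-memory exhaustion rather than faulty RAM.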
Re: Server crash [message #36568 is a reply to message #36558] |
Wed, 01 July 2009 12:35 |
dvazart
Hi, thanks for your answer.
No, I'm using a 32-bit kernel.
Is it possible that a misconfiguration of VMGUARPAGES, PRIVVMPAGES and OOMGUARPAGES could cause an OOM situation (as mentioned in the first post) on the HN?
Can this also lead to an oops in OpenVZ?
----------- Daniel Vazart ------------
"Knowledge is power, Sharing is human"
------- http://www.vazart.net --------
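Whether a VE is actually hitting its configured limits can be checked from the `failcnt` column of `/proc/user_beancounters` on the HN. The following is a rough sketch of such a check, assuming the usual column layout (`uid resource held maxheld barrier limit failcnt`); the function names and the sample numbers are invented for illustration.

```python
def parse_beancounters(text):
    """Parse /proc/user_beancounters-style text.

    Returns tuples (uid, resource, held, maxheld, barrier, limit, failcnt).
    """
    rows = []
    uid = None
    for line in text.splitlines():
        fields = line.split()
        # The first row of each container starts with "<uid>:".
        if fields and fields[0].endswith(":") and fields[0][:-1].isdigit():
            uid = int(fields[0][:-1])
            fields = fields[1:]
        if uid is None or len(fields) != 6:
            continue  # version line, column header, or malformed row
        name, nums = fields[0], fields[1:]
        if not all(n.isdigit() for n in nums):
            continue
        held, maxheld, barrier, limit, failcnt = (int(n) for n in nums)
        rows.append((uid, name, held, maxheld, barrier, limit, failcnt))
    return rows


def failures(rows):
    """Return (uid, resource, failcnt) for every row with failcnt > 0."""
    return [(r[0], r[1], r[6]) for r in rows if r[6] > 0]


# Made-up sample, NOT taken from the server in this thread.
SAMPLE = """Version: 2.5
       uid  resource           held    maxheld    barrier      limit    failcnt
      101:  kmemsize        2534564    2767849   11055923   11377049         12
            privvmpages       40000      65000      65536      69632          0
      102:  kmemsize        1000000    1200000   11055923   11377049          0
            privvmpages       65000      69632      65536      69632          3
"""

sample_rows = parse_beancounters(SAMPLE)
for uid, resource, failcnt in failures(sample_rows):
    print(f"VE {uid}: {resource} failed {failcnt} times")
```

On a real HN the text would come from reading `/proc/user_beancounters`, which normally requires root. A non-zero `failcnt` only shows that a VE was refused a resource; it does not by itself explain a crash of the HN.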
Re: Server crash [message #36586 is a reply to message #36550] |
Fri, 03 July 2009 07:21 |
dvazart
Thanks for your explanation.
Do you think it's a misconfiguration of KMEMSIZE in the VEs?
Thanks.
----------- Daniel Vazart ------------
"Knowledge is power, Sharing is human"
------- http://www.vazart.net --------