OpenVZ Forum


*solved* do_IRQ: stack overflow crash (2.6.16-026test017.1) [message #5732] Tue, 29 August 2006 13:19
HubertD (Junior Member; registered August 2006; 22 messages)
Hello,

I'm running a self-compiled linux-2.6.16 with the patch-026test017-combined.gz patchset on a Xeon HT machine (IBM xSeries 345). The kernel config is attached.

# uname -a
Linux nibbler.little-isp.de 2.6.16-026test017 #1 SMP Mon Aug 21 18:58:28 CEST 2006 i686 GNU/Linux


It crashed once after ~7h of uptime with the test015 patches, and has now crashed again after ~8d of uptime with the test017 patches.

The logs showed nothing of interest after the first crash, so I attached a serial console for the second try. All I found there today was this error message:
do_IRQ: stack overflow: 384


The kernel didn't react to serial SysRq commands. I had also set the "panic=10 oops=panic" kernel parameters, but the machine didn't restart on its own :-(

I don't know whether this is enough to file a bug report, but I also don't know how to get more information out of our (more or less production) server.

Help, anybody?
  • Attachment: kernel-config (Size: 32.32KB)

[Updated on: Wed, 13 September 2006 08:10]


Re: do_IRQ: stack overflow crash (2.6.16-026test017.1) [message #5736 is a reply to message #5732] Tue, 29 August 2006 14:03
dev (Senior Member; Moscow; registered September 2005; 1693 messages)

These stack overflow messages are bad, and they are probably the hint as to why your kernel crashes.
After this message there should be a call trace. Is it there? Can you post it here, please?
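
If the serial console output was captured to a file, a small script can pull out whatever was printed after the overflow message so it can be posted here. This is only a sketch: the function name `lines_after_overflow` and its `context` parameter are made up for illustration, and the pattern assumes the message format quoted in the first post.

```python
import re

# Pattern for the message quoted in the first post,
# e.g. "do_IRQ: stack overflow: 384".
OVERFLOW_RE = re.compile(r"do_IRQ: stack overflow: (-?\d+)")


def lines_after_overflow(log_text, context=30):
    """Return (value, following_lines) for the first overflow message.

    Returns None if the message does not occur in the log.
    `context` (hypothetical parameter) caps how many following lines are kept.
    """
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = OVERFLOW_RE.search(line)
        if match:
            return int(match.group(1)), lines[i + 1 : i + 1 + context]
    return None
```

An empty list of following lines means nothing at all was printed after the message.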

Are you running a complex configuration with MD (RAID), maybe DRBD, network tunnels, etc.? Can you describe your configuration, please?

Next, turn this option to 'n':
CONFIG_4KSTACKS=y
This will increase stack size from 4K to 8K.
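
Before rebuilding, it can help to confirm what the .config actually says. The sketch below reads the text of a kernel .config and reports the state of one option. The helper name `option_state` is hypothetical. It relies on the usual .config convention that a disabled boolean option appears as a comment of the form `# CONFIG_FOO is not set`.

```python
def option_state(config_text, option="CONFIG_4KSTACKS"):
    """Return the option's value ('y', 'm', ...), 'n' if not set, None if absent."""
    for raw in config_text.splitlines():
        line = raw.strip()
        # Disabled options are written as a comment, not as "=n".
        if line == f"# {option} is not set":
            return "n"
        # The trailing "=" avoids matching longer option names with the same prefix.
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None
```

For the 8K-stack kernel discussed here, the result for CONFIG_4KSTACKS should be 'n'.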

It would be really nice to get the call trace, as it would help to catch the bug. Most likely it is a mainline problem unrelated to OpenVZ itself, but sure, this doesn't help you much :)


Re: do_IRQ: stack overflow crash (2.6.16-026test017.1) [message #5749 is a reply to message #5736] Tue, 29 August 2006 16:07
HubertD
Thanks for your answer.
I did not see a call trace on the serial console. Do I have to configure something to have it printed on serial?
The stack overflow message was the last line there...

Unfortunately I don't have physical access to the server so I can't see local console messages...

Yes, the server is running a fairly complex configuration, including
  • some OpenVPN tunnels
  • loads of LVM volumes (14 volumes on 3 VGs)
  • filesystem snapshots every 2h via rsync --link-dest
  • 3 chrooted Debian installations that I wanted to replace with OpenVZ

No MD RAID or DRBD; the server uses hardware RAID on the Adaptec ServeRAID adapter...

Is there any way to get a call trace that I haven't tried already?
The problem should somehow be related to OpenVZ: I never had such a crash in 2 years of running vanilla kernels, and the first crash came 7h after booting an OpenVZ kernel... ;)
Re: do_IRQ: stack overflow crash (2.6.16-026test017.1) [message #5754 is a reply to message #5749] Tue, 29 August 2006 17:51
dev

Call traces should have been printed right after the stack overflow message... strange :/

Can you help with resolving the problem? If not (e.g. if you can't experiment with this system), then just set CONFIG_4KSTACKS=n and retry.

If yes, I will think about a debug patch for hunting the stack overflow.

P.S. You couldn't have been running 2.6.16 for 2 years :)))
BTW, you can check whether a mainline kernel compiled with the same .config crashes.


Re: do_IRQ: stack overflow crash (2.6.16-026test017.1) [message #5757 is a reply to message #5754] Tue, 29 August 2006 19:25
HubertD
Thanks for your support so far!

I'm going to reboot the system with an 8k-stack kernel now and see whether the problem persists. If it does, I will try the vanilla kernel. (I upgraded from 2.6.15 to 2.6.16 together with the OpenVZ install.)
Of course, the system hasn't run a vanilla 2.6.16 for 2 years, but it did run vanilla 2.6.8, 2.6.11 and 2.6.15 kernels without a crash in that time period.

I would surely like to help resolve the problem. Another few planned reboots should be no problem, but the system must remain reasonably usable in between (it mainly acts as a mail and web server for ~20 customers).

Concerning the missing call traces:
I'm monitoring the serial line with minicom in a screen session on a second server. Is this a reasonable setup, or should I try something else?
Re: do_IRQ: stack overflow crash (2.6.16-026test017.1) [message #5769 is a reply to message #5757] Wed, 30 August 2006 08:40
dev

You know, each kernel has its own set of bugs :)

OK, let's first check with 8k stacks.

minicom should be OK.


*solved* do_IRQ: stack overflow crash (2.6.16-026test017.1) [message #6275 is a reply to message #5769] Wed, 13 September 2006 08:10
HubertD
After 14 days of uptime I had to reboot the server for an unrelated reason. It seems to run stably with 8k stacks, though.

Thanks again for your help!
Re: *solved* do_IRQ: stack overflow crash (2.6.16-026test017.1) [message #6286 is a reply to message #6275] Wed, 13 September 2006 13:27
dev

Hope so! Feel free to report problems (though I hope you won't have any)!


http://static.openvz.org/userbars/openvz-developer.png
Previous Topic: qmail heads up
Next Topic: cpanel config
Goto Forum:
  


Current Time: Sat Apr 13 05:38:24 GMT 2024

Total time taken to generate the page: 0.01692 seconds