OpenVZ Forum


*SOLVED* 028test10.1 same sda error on 3 of 5 machines [message #9618] Sat, 13 January 2007 11:39
bjmg
Messages: 32
Registered: December 2005
Location: Puettlingen, Germany
Member

Hello!

We have a strange problem with some of our servers.
Before we installed OpenVZ (on a CentOS 4.4 base), every server worked without any problems. Now, after installing OpenVZ 028test10.1, we get strange hardware errors. The error is the same on ALL machines, including the sector number:
end_request: I/O error, dev sda, sector 5100005.
The strange thing is that the machines have hard drives from different vendors, so it seems very unlikely that every drive has the same defect. However, all (5 of 5) machines have the same mainboard. I think this is a kernel problem rather than a hardware fault, because all machines worked well with Ubuntu 6.06 and its official Ubuntu server kernel.
Once the error occurs, the machine still responds to ping but is no longer reachable via ssh. Console access is not possible either.
By the way, we need the 2.6.18 kernel series because we need NFS support in the VEs.
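
To check whether several hosts really report the same device and sector, the kernel message quoted above can be extracted from each host's log and compared. The following is only a minimal sketch: the host names are hypothetical, and it assumes the log text for each host has already been collected (for example from dmesg output).

```python
import re
from collections import defaultdict

# Matches kernel messages such as:
#   end_request: I/O error, dev sda, sector 5100005
IO_ERROR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def parse_io_errors(log_text):
    """Return a list of (device, sector) tuples found in kernel log text."""
    return [(dev, int(sector)) for dev, sector in IO_ERROR_RE.findall(log_text)]


def hosts_by_error(logs_by_host):
    """Map each (device, sector) pair to the sorted list of hosts reporting it."""
    seen = defaultdict(set)
    for host, text in logs_by_host.items():
        for key in parse_io_errors(text):
            seen[key].add(host)
    return {key: sorted(hosts) for key, hosts in seen.items()}


if __name__ == "__main__":
    # Hypothetical host names; the log line is the one quoted in the post.
    sample = {
        "node1": "end_request: I/O error, dev sda, sector 5100005\n",
        "node2": "end_request: I/O error, dev sda, sector 5100005\n",
    }
    for (dev, sector), hosts in hosts_by_error(sample).items():
        print(dev, sector, hosts)
```

If every host ends up under the same (device, sector) key even though the drives differ, that points away from a defect on an individual disk.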

I really hope someone can give us a hint on how to solve that problem.

Thank you for reading

Bernhard

[Updated on: Fri, 04 May 2007 07:24] by Moderator


Re: 028test10.1 same sda error on 3 of 5 machines [message #9620 is a reply to message #9618] Sat, 13 January 2007 12:30
bjmg

Note:
We have already replaced the hard disk in one of the servers, and the error is still the same.

The systems use a VIA VT6420 SATA controller, and I have already found some links to other people with similar problems.
For example, this one:
http://lkml.org/lkml/2006/12/30/79

They conclude that this is a device error, but I still think that is wrong because we have exactly the same error on many machines.

Maybe the driver for the VT6420 has a bug. I'll try to look deeper into it, but I'm not a kernel hacker, so my knowledge of the kernel is limited. Hopefully someone can help out.

[Updated on: Sat, 13 January 2007 12:49]


Re: 028test10.1 same sda error on 3 of 5 machines [message #9621 is a reply to message #9618] Sat, 13 January 2007 13:01
bjmg

This kernel Bugzilla entry seems to describe our problem:
http://bugzilla.kernel.org/show_bug.cgi?id=7641

Re: 028test10.1 same sda error on 3 of 5 machines [message #9639 is a reply to message #9618] Mon, 15 January 2007 07:21
Vasily Tarasov
Messages: 1345
Registered: January 2006
Senior Member
Thank you for your report and useful notes.

I filed a bug concerning the problem:
http://bugzilla.openvz.org/show_bug.cgi?id=437

At
http://bugzilla.kernel.org/show_bug.cgi?id=7415
comment #28 and comment #32 contain patches.
They're very small, and I guess you can apply them to
OpenVZ without changes. Could you try them, please?

Thanks in advance.

Re: 028test10.1 same sda error on 3 of 5 machines [message #11391 is a reply to message #9618] Thu, 22 March 2007 15:15
dev
Messages: 1693
Registered: September 2005
Location: Moscow
Senior Member

bjmg,

Do you know whether RHEL5-based OVZ kernels suffer from the same problem?
If yes, then we have to file a bug in the RHEL Bugzilla...
If not, you can safely use RHEL5-based kernels...

Thanks,
Kirill


Re: 028test10.1 same sda error on 3 of 5 machines [message #12444 is a reply to message #11391] Sat, 28 April 2007 10:36
bjmg

Hi!

I'll try the new stable RHEL5 kernel on RHEL4 (CentOS) and will report whether the problem is still there.
At least the normal 2.6.18-ovz releases still have the same problem. Kernel 2.6.20 does not work on those machines either, by the way. I'll post more about that in another thread.

Bernhard

Re: 028test10.1 same sda error on 3 of 5 machines [message #12459 is a reply to message #9618] Sun, 29 April 2007 09:42
bjmg

The problem also occurs with the el5-ovz kernel. That is really bad. :-/
Well, I'll report the bug to Red Hat. I hope that helps.

There are already one or two related bugs:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=211948
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=202822

There is also a possible workaround (deactivating irqbalance). I think I should try that too.

Thank you

Bernhard

[Updated on: Sun, 29 April 2007 09:52]


Re: 028test10.1 same sda error on 3 of 5 machines [message #12522 is a reply to message #12459] Wed, 02 May 2007 07:50
Vasily Tarasov
Thanks for the info!

Re: 028test10.1 same sda error on 3 of 5 machines [message #12561 is a reply to message #9618] Wed, 02 May 2007 21:28
bjmg

It seems that both kernels (RHEL5+ovz, vanilla+ovz) are running fine after deactivating irqbalance.
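
For anyone wanting to try the same workaround: on CentOS/RHEL systems of that era the daemon is typically stopped with `service irqbalance stop` and kept off across reboots with `chkconfig irqbalance off`, though the exact steps used here are not stated in the thread. To see afterwards how the controller's interrupts are spread across CPUs, /proc/interrupts can be inspected. The sketch below is only an illustration: the sample text is made up, and `sata_via` as the name shown in the description column is an assumption that may differ by kernel version.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq: (per_cpu_counts, description)}."""
    lines = text.strip("\n").splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # header row, e.g. ["CPU0", "CPU1"]
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        # Leading numeric fields are the per-CPU counters (at most one per CPU).
        for field in fields[:len(cpus)]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (counts, description)
    return table


def irqs_for_driver(table, driver):
    """Return {irq: per_cpu_counts} for lines whose description mentions driver."""
    return {irq: counts for irq, (counts, desc) in table.items() if driver in desc}


# Made-up sample data, not taken from the affected machines.
SAMPLE = """\
           CPU0       CPU1
  0:     500000          0    IO-APIC-edge  timer
 20:      40000      12000   IO-APIC-level  sata_via
NMI:          0          0
"""

if __name__ == "__main__":
    # On a real host, replace SAMPLE with open("/proc/interrupts").read().
    for irq, counts in irqs_for_driver(parse_interrupts(SAMPLE), "sata_via").items():
        print("IRQ", irq, "per-CPU counts:", counts)
```

Comparing the per-CPU counts before and after stopping irqbalance shows whether the SATA interrupt is still being moved between CPUs.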

Thank you for your help!

Bernhard