OpenVZ, Open ISCSI & OpenSolaris [message #26426] Wed, 23 January 2008 15:11
Mart van Santen
From: openvz.org
Hello,

We have been running OpenVZ on one of our servers for some time without
any problems. Yesterday, however, I started a virtual machine backed by
an iSCSI target running on OpenSolaris. After a few hours without
problems, the system collapsed completely because of IP conflicts. The
machine running OpenVZ was either claiming IP address space or sending
IP frames with the same source IP as the storage server, and because of
this the storage system disabled its network interfaces now and then. I
don't think this is caused by a bug in OpenVZ, but since we have been
running open-iscsi in combination with OpenSolaris for some time, I just
want to know whether OpenVZ can interact with the storage/open-iscsi
layer and whether some error could arise because of that.

The details:

iscsi-target machine:
intel 64-bit OpenSolaris/SunOS 5.10
Target is on a zfs volume

open-iscsi:
version: 2.0-868-test1
I use this version because I had other problems with previous versions

openVZ machine:

kernel: 2.6.18-1-openvz
patch: 028.18
platform: intel 64-bit
os: debian etch

extra sysctl.conf settings on openvz machine:
net.ipv4.tcp_max_tw_kmem_fraction=384
net.ipv4.tcp_max_tw_buckets_ub=16536


The problems:

During the night, pings to our storage server suddenly dropped several
times. This is caused by SunOS: it detects that another machine is using
the same IP, disables the network interface for some time, and after a
timeout tries to recover the IP/interface. The MAC addresses in the
SunOS log match the hardware addresses of the OpenVZ machine.
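
A minimal sketch of how such a conflict could be confirmed from the
initiator side, assuming you have already collected (IP, MAC) pairs from
observed ARP traffic by some other means. The function name and the
sample addresses are hypothetical and are not taken from the logs in
this thread.

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return {ip: sorted MACs} for every IP announced by more than one MAC."""
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalise case so the same MAC is not counted twice.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical sample data (documentation address range), for illustration only.
sample = [
    ("192.0.2.10", "00:16:3E:00:00:01"),
    ("192.0.2.10", "00:16:3e:00:00:02"),
    ("192.0.2.20", "00:16:3e:00:00:03"),
]
conflicts = find_ip_conflicts(sample)
print(conflicts)
```

Any IP that shows up in the result is being claimed by more than one
hardware address, which is the situation the SunOS log describes.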

On the OpenVZ machine:

Around the same time, the errors at the bottom of this email occurred.
If the timestamps in all log files are correct, it looks like these
errors occurred first and the IP problems followed. I wonder whether
this has anything to do with the OpenVZ network stack conflicting with
open-iscsi's low-level access to the interface. Maybe the IP stack
mirrored packets from the Solaris machine, or something like that. On a
machine without OpenVZ, but with nearly the same kernel and the same
iSCSI stack, we didn't have any of these problems.

I hope someone has a good hint. Are other people using iSCSI targets for
storage, and do they do so on the same interface as the external network?
Are there any IP/TCP system settings I need to take care of?


Kind regards,


Mart van Santen



Jan 23 08:30:41 krypton kernel:  session0: iscsi: session recovery timed
out after 120 secs
Jan 23 08:30:41 krypton kernel: iscsi: cmd 0x28 is not queued (7)
Jan 23 08:30:41 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:41 krypton kernel: end_request: I/O error, dev sdb, sector
4969152
Jan 23 08:30:41 krypton kernel: iscsi: cmd 0x28 is not queued (7)
Jan 23 08:30:41 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:41 krypton kernel: end_request: I/O error, dev sdb, sector
4969152
Jan 23 08:30:41 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:30:41 krypton last message repeated 6 times
Jan 23 08:30:41 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:41 krypton kernel: end_request: I/O error, dev sdb, sector
37901744
Jan 23 08:30:41 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:41 krypton kernel: end_request: I/O error, dev sdb, sector
38917752
Jan 23 08:30:41 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:41 krypton kernel: end_request: I/O error, dev sdb, sector
38918128
Jan 23 08:30:41 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:41 krypton kernel: end_request: I/O error, dev sdb, sector
3543176
Jan 23 08:30:41 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:41 krypton kernel: printk: 18 messages suppressed.
Jan 23 08:30:41 krypton kernel: Buffer I/O error on device sdb1, logical
block 4864715
Jan 23 08:30:41 krypton kernel: lost page write due to I/O error on sdb1
Jan 23 08:30:41 krypton kernel: Buffer I/O error on device sdb1, logical
block 4864762
Jan 23 08:30:41 krypton kernel: lost page write due to I/O error on sdb1
Jan 23 08:30:41 krypton kernel: end_request: I/O error, dev sdb, sector
36712096
Jan 23 08:30:41 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:41 krypton kernel: end_request: I/O error, dev sdb, sector
4676184
Jan 23 08:30:41 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:41 krypton kernel: end_request: I/O error, dev sdb, sector
4676432
Jan 23 08:30:41 krypton kernel: Buffer I/O error on device sdb1, logical
block 584519
Jan 23 08:30:41 krypton kernel: lost page write due to I/O error on sdb1
Jan 23 08:30:41 krypton kernel: Buffer I/O error on device sdb1, logical
block 584550
Jan 23 08:30:41 krypton kernel: lost page write due to I/O error on sdb1
Jan 23 08:30:41 krypton kernel: I/O error in filesystem ("sdb1")
meta-data dev sdb1 block 0x2302e80       ("xfs_trans_read_buf") error 5
buf count 8192
Jan 23 08:30:42 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:30:42 krypton last message repeated 4 times
Jan 23 08:30:42 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:42 krypton kernel: end_request: I/O error, dev sdb, sector
36712096
Jan 23 08:30:42 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:42 krypton kernel: end_request: I/O error, dev sdb, sector
38917752
Jan 23 08:30:42 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:42 krypton kernel: end_request: I/O error, dev sdb, sector
3543176
Jan 23 08:30:42 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:42 krypton kernel: end_request: I/O error, dev sdb, sector
37901744
Jan 23 08:30:42 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:30:42 krypton kernel: end_request: I/O error, dev sdb, sector
38918136
Jan 23 08:30:42 krypton kernel: Buffer I/O error on device sdb1, logical
block 4864715
Jan 23 08:30:42 krypton kernel: lost page write due to I/O error on sdb1
Jan 23 08:30:42 krypton kernel: Buffer I/O error on device sdb1, logical
block 4864763
Jan 23 08:30:42 krypton kernel: lost page write due to I/O error on sdb1
Jan 23 08:31:18 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:31:18 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:31:18 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:18 krypton kernel: end_request: I/O error, dev sdb, sector
37901744
Jan 23 08:31:18 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:18 krypton kernel: end_request: I/O error, dev sdb, sector
38918136
Jan 23 08:31:18 krypton kernel: Buffer I/O error on device sdb1, logical
block 4864763
Jan 23 08:31:18 krypton kernel: lost page write due to I/O error on sdb1
Jan 23 08:31:20 krypton kernel: iscsi: cmd 0x28 is not queued (7)
Jan 23 08:31:20 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:20 krypton kernel: end_request: I/O error, dev sdb, sector
4701032
Jan 23 08:31:20 krypton kernel: iscsi: cmd 0x28 is not queued (7)
Jan 23 08:31:20 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:20 krypton kernel: end_request: I/O error, dev sdb, sector
4701032
Jan 23 08:31:20 krypton kernel: iscsi: cmd 0x28 is not queued (7)
Jan 23 08:31:20 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:20 krypton kernel: end_request: I/O error, dev sdb, sector
4979496
Jan 23 08:31:20 krypton kernel: iscsi: cmd 0x28 is not queued (7)
Jan 23 08:31:20 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:20 krypton kernel: end_request: I/O error, dev sdb, sector
4979496
Jan 23 08:31:28 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:31:28 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:31:28 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:28 krypton kernel: end_request: I/O error, dev sdb, sector
36712096
Jan 23 08:31:28 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:28 krypton kernel: end_request: I/O error, dev sdb, sector
38917752
Jan 23 08:31:28 krypton kernel: Buffer I/O error on device sdb1, logical
block 4864715
Jan 23 08:31:28 krypton kernel: lost page write due to I/O error on sdb1
Jan 23 08:31:38 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:31:38 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:31:38 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:38 krypton kernel: end_request: I/O error, dev sdb, sector
5032176
Jan 23 08:31:38 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:38 krypton kernel: Buffer I/O error on device sdb1, logical
block 629018
Jan 23 08:31:38 krypton kernel: lost page write due to I/O error on sdb1
Jan 23 08:31:38 krypton kernel: end_request: I/O error, dev sdb, sector
3543176
Jan 23 08:31:53 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:31:53 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:31:53 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:53 krypton kernel: end_request: I/O error, dev sdb, sector
37901744
Jan 23 08:31:53 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:31:53 krypton kernel: end_request: I/O error, dev sdb, sector
38918136
Jan 23 08:31:53 krypton kernel: Buffer I/O error on device sdb1, logical
block 4864763
Jan 23 08:31:53 krypton kernel: lost page write due to I/O error on sdb1
Jan 23 08:32:08 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:32:08 krypton kernel: iscsi: cmd 0x2a is not queued (7)
Jan 23 08:32:08 krypton kernel: sd 1:0:0:0: SCSI error: return code =
0x00010000
Jan 23 08:32:08 krypton kernel: end_request: I/O error, dev sdb, sector
38917752
...
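
As a reading aid for the log above: the "cmd 0x28" and "cmd 0x2a" values
are the SCSI READ(10) and WRITE(10) opcodes, and the "return code" packs
four bytes (driver, host, message, status) into one 32-bit value. The
sketch below splits that value and names the host byte using the DID_*
constants from the Linux kernel's include/scsi/scsi.h. The helper names
are hypothetical and the table is deliberately partial.

```python
# Host-byte values as defined in the Linux kernel's include/scsi/scsi.h.
HOST_BYTE = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}

# Only the two opcodes that appear in the log.
OPCODES = {0x28: "READ(10)", 0x2A: "WRITE(10)"}


def decode_result(code):
    """Split a 32-bit SCSI result into its four one-byte fields."""
    return {
        "status": code & 0xFF,
        "msg": (code >> 8) & 0xFF,
        "host": (code >> 16) & 0xFF,
        "driver": (code >> 24) & 0xFF,
    }


def describe_host_byte(code):
    """Name the host byte of a SCSI result, or report it as unknown."""
    host = decode_result(code)["host"]
    return HOST_BYTE.get(host, "unknown host byte 0x%02x" % host)


print(describe_host_byte(0x00010000))
print(OPCODES[0x28], OPCODES[0x2A])
```

For 0x00010000 only the host byte is set (0x01, DID_NO_CONNECT), which
fits the first log line: the iSCSI session could not be recovered, so
the queued reads and writes were failed back to the block layer.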

Re: OpenVZ, Open ISCSI & OpenSolaris [message #26427 is a reply to message #26426] Wed, 23 January 2008 16:11
dev
From: openvz.org
Mart,

1. OpenVZ doesn't interfere with storage, and we run some services locally
   on RHEL5 (2.6.18) + iSCSI without any problems.
2. If I'm correct, you are using a far too old OVZ kernel. Please update.
3. Did you use the iSCSI packages from Debian etch, or did you
   compile/install anything yourself?

Kirill


