OpenVZ Forum


online container migration fails with Remote exception I/O operation on closed file [message #53303] Wed, 23 May 2018 14:33
msolovyev
Messages: 29
Registered: August 2007
Location: Russia, Novosibirsk
Junior Member

Hello,

I'm trying to live-migrate a container and get the following error (it happens even with an empty test container):

[root@vz03 ~]# vzmigrate -vvv --online --require-realtime vz04 222
...
2018-05-23 10:30:10.732: Live migration stage started
2018-05-23 10:30:36.320: Io multiplexer aborted
2018-05-23 10:30:36.320: 2018-05-23 10:30:36.321: Phaul service failed to live migrate CT
2018-05-23 10:30:36.320: 2018-05-23 10:30:36.321: error [-73] : Phaul service failed to live migrate CT
2018-05-23 10:30:36.321: Phaul service failed to live migrate CT
2018-05-23 10:30:36.321: Phaul failed to live migrate CT (/var/log/phaul.log)
2018-05-23 10:30:36.322: 2018-05-23 10:30:36.322: cleaning : destroy CT 222
2018-05-23 10:30:36.372: 2018-05-23 10:30:36.372: cleaning : 'rm' dir : /vz/private/222
2018-05-23 10:30:36.372: 2018-05-23 10:30:36.372: can not rename : [/vz/private/222] -> [/vz/private/222.ss6sKg]
2018-05-23 10:30:36.372: 2018-05-23 10:30:36.373: cleaning : 'rmdir' dir : /vz/root/222
2018-05-23 10:30:36.372: 2018-05-23 10:30:36.373: can not find entry for delete : [/vz/root/222]
2018-05-23 10:30:37.373: 2018-05-23 10:30:37.373: unlocking 222
2018-05-23 10:30:37.375: Can't move/copy CT 222 -> CT 222, [], [] : Phaul failed to live migrate CT (/var/log/phaul.log)
2018-05-23 10:30:37.375: unlocking 222
2018-05-23 10:30:37.375: close channel


[root@vz03 ~]# tail -20 /var/log/phaul.log
10:30:33.214: 285170:           Notify (post-network-lock)
10:30:35.283: 285170: Final FS and images sync
10:30:35.522: 285170: Sending images to target
10:30:35.524: 285170:   Pack
10:30:35.561: 285170:   Add htype images
10:30:35.812: 285170: Asking target host to restore
10:30:36.271: 285170: Remote exception
10:30:36.271: 285170: I/O operation on closed file
Traceback (most recent call last):
  File "/usr/libexec/phaul/p.haul", line 9, in <module>
    load_entry_point('phaul==0.1', 'console_scripts', 'p.haul')()
  File "/usr/lib/python2.7/site-packages/phaul/shell/phaul_client.py", line 49, in main
    worker.start_migration()
  File "/usr/lib/python2.7/site-packages/phaul/iters.py", line 161, in start_migration
    self.__start_live_migration()
  File "/usr/lib/python2.7/site-packages/phaul/iters.py", line 232, in __start_live_migration
    self.target_host.restore_from_images()
  File "/usr/lib/python2.7/site-packages/phaul/xem_rpc_client.py", line 26, in __call__
    raise Exception(resp[1])
Exception: I/O operation on closed file



Logs from the destination server:

[root@vz04 ~]# tail -20 /var/log/phaul-service.log
10:30:35.562: 817892: Waiting for images to unpack
10:30:35.813: 817892: Restoring from images
10:30:35.827: 817892: Starting vzctl restore
10:30:36.269: 817892:   > Restoring the Container ...
10:30:36.269: 817892:   > Mount image: /vz/private/222/root.hdd 
10:30:36.269: 817892:   > Container is mounted
10:30:36.269: 817892:   > Setting permissions for image=/vz/private/222/root.hdd
10:30:36.269: 817892:   > (00.000283) Error (criu/util.c:694): Can't read link of fd -404: No such file or directory
10:30:36.270: 817892:   > (00.000295) Error (criu/protobuf.c:77): Unexpected EOF on (null)
10:30:36.270: 817892:   > The restore log was saved in /vz/dump/222/rst-_cQGWZ-18.05.23-10.30/criu_restore.9.log
10:30:36.270: 817892:   > criu exited with rc=17
10:30:36.270: 817892:   > Unmount image: /vz/private/222/root.hdd
10:30:36.270: 817892:   > Container is unmounted
10:30:36.270: 817892:   > Failed to restore the Container
10:30:36.321: 817892: Disconnected
10:30:36.322: 817892: Closing images
10:30:36.322: 817892: Removing images
10:30:36.373: 817892: Stop by 15
10:30:36.373: 817892: RPC Service stops
10:30:36.374: 817892: Bye!
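
The CRIU errors are easy to miss among the vzctl output in phaul-service.log. As a rough sketch (not part of p.haul; the function name `extract_criu_errors` and the regular expression are assumptions derived only from the log lines shown above), something like this would pull them out:

```python
import re

# Pattern assumed from the log lines above, e.g.
#   "(00.000283) Error (criu/util.c:694): Can't read link of fd -404: ..."
CRIU_ERROR_RE = re.compile(
    r"\((?P<ts>\d+\.\d+)\) Error \((?P<src>[^:()]+):(?P<line>\d+)\): (?P<msg>.*)$"
)


def extract_criu_errors(lines):
    """Return (timestamp, source_file, line_no, message) for each CRIU error line."""
    errors = []
    for raw in lines:
        match = CRIU_ERROR_RE.search(raw.rstrip("\n"))
        if match:
            errors.append(
                (match["ts"], match["src"], int(match["line"]), match["msg"])
            )
    return errors
```

It could be fed the log directly, for example `extract_criu_errors(open("/var/log/phaul-service.log"))`. Lines that do not look like CRIU errors are simply skipped.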


[root@vz04 ~]# tail -20 /vz/dump/222/rst-_cQGWZ-18.05.23-10.30/criu_restore.9.log
(00.000142) Version: 3.8 (gitid 0)
(00.000188) Running on vz04.boardreader.com Linux 3.10.0-693.21.1.vz7.47.4 #1 SMP Sat Apr 28 11:48:07 MSK 2018 x86_64
(00.000237) No inventory.img image
(00.000283) Error (criu/util.c:694): Can't read link of fd -404: No such file or directory
(00.000295) Error (criu/protobuf.c:77): Unexpected EOF on (null)
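
The restore log stops right after "No inventory.img image", which may mean that CRIU was started against a directory that does not contain the dumped images. A minimal sketch for checking that by hand (the helper `has_inventory_image` is hypothetical; only the file name inventory.img comes from the log above, and the directory to check is whatever images directory the restore was pointed at):

```python
import os


def has_inventory_image(images_dir):
    """Return True if images_dir contains a non-empty inventory.img file."""
    path = os.path.join(images_dir, "inventory.img")
    return os.path.isfile(path) and os.path.getsize(path) > 0
```

This only tests for the presence of the file; it does not validate the image contents.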




Both servers run the following OpenVZ version:

[root@vz03 ~]# uname -a
Linux vz03.boardreader.com 3.10.0-693.21.1.vz7.47.4 #1 SMP Sat Apr 28 11:48:07 MSK 2018 x86_64 x86_64 x86_64 GNU/Linux


[root@vz03 ~]# cat /etc/*release*
OpenVZ release 7.0.8 (142)
NAME="Virtuozzo"
VERSION="7.0.8"
ID="virtuozzo"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="OpenVZ release 7.0.8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:virtuozzoproject:vz:7"
HOME_URL="http://www.virtuozzo.com"
BUG_REPORT_URL="https://bugs.openvz.org/"
OpenVZ release 7.0.8 (142)
Virtuozzo Linux release 7.4
OpenVZ release 7.0.8 (142)
cpe:/o:virtuozzoproject:vzlinux:7:ga
OpenVZ release 7.0.8 (142)
Virtuozzo Linux release 7.5.0 (549)


If I remove the --online and --require-realtime options, the migration works.
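
Until live migration is fixed, that observation could be turned into a fallback. The sketch below is only an illustration: the helper names are made up, the flags are the ones from the command at the top of this post, and it assumes that vzmigrate exits with a non-zero code on failure.

```python
import subprocess


def build_vzmigrate_cmd(dest, ctid, online=True):
    """Build the vzmigrate argument list using the flags from this post."""
    cmd = ["vzmigrate", "-vvv"]
    if online:
        cmd += ["--online", "--require-realtime"]
    cmd += [dest, str(ctid)]
    return cmd


def migrate_with_fallback(dest, ctid, run=subprocess.call):
    """Try live migration first; on a non-zero exit code retry without --online.

    Assumption: a non-zero exit code from vzmigrate means the migration failed.
    """
    if run(build_vzmigrate_cmd(dest, ctid, online=True)) == 0:
        return "online"
    if run(build_vzmigrate_cmd(dest, ctid, online=False)) == 0:
        return "offline"
    return "failed"
```

For the case above this would be called as `migrate_with_fallback("vz04", 222)`. The `run` parameter exists only so that the command runner can be swapped out for a dry run.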
Re: online container migration fails with Remote exception I/O operation on closed file [message #53304 is a reply to message #53303] Thu, 24 May 2018 12:00
vaverin
Messages: 708
Registered: September 2005
Senior Member
Submitted to https://github.com/checkpoint-restore/criu/issues/494
Re: online container migration fails with Remote exception I/O operation on closed file [message #53307 is a reply to message #53304] Fri, 25 May 2018 07:17
vaverin
Messages: 708
Registered: September 2005
Senior Member
avagin on GitHub asks for the criu_restore.9.log file.
Re: online container migration fails with Remote exception I/O operation on closed file [message #53308 is a reply to message #53303] Fri, 25 May 2018 11:38
msolovyev
Messages: 29
Registered: August 2007
Location: Russia, Novosibirsk
Junior Member

Actually criu_restore.9.log was already posted in my first message:

[root@vz04 ~]# cat /vz/dump/222/rst-_cQGWZ-18.05.23-10.30/criu_restore.9.log
(00.000142) Version: 3.8 (gitid 0)
(00.000188) Running on vz04.boardreader.com Linux 3.10.0-693.21.1.vz7.47.4 #1 SMP Sat Apr 28 11:48:07 MSK 2018 x86_64
(00.000237) No inventory.img image
(00.000283) Error (criu/util.c:694): Can't read link of fd -404: No such file or directory
(00.000295) Error (criu/protobuf.c:77): Unexpected EOF on (null)



I attached the whole /vz/dump/222/rst-_cQGWZ-18.05.23-10.30 directory on GitHub.


BTW, I can't attach the archive here: I get "File Attachment is too big (over allowed limit of 2097152 bytes)", although the archive is only 200KB.
Re: online container migration fails with Remote exception I/O operation on closed file [message #53309 is a reply to message #53308] Fri, 25 May 2018 12:59
vaverin
Messages: 708
Registered: September 2005
Senior Member
Reported as https://bugs.openvz.org/browse/OVZ-7030
Re: online container migration fails with Remote exception I/O operation on closed file [message #53493 is a reply to message #53303] Fri, 22 March 2019 18:46
alenco
Messages: 10
Registered: August 2014
Junior Member
When will this be integrated into the stable Virtuozzo package, or at least into the factory repo? The last update to the criu package is from 2017.
I'm having the same issue as described here.

Or would it be OK to compile the package myself? I'm not sure whether that would create more issues.



Re: online container migration fails with Remote exception I/O operation on closed file [message #53494 is a reply to message #53493] Fri, 22 March 2019 19:24
alenco
Messages: 10
Registered: August 2014
Junior Member
I just tried compiling versions 3.10 and 3.11, but got an error. Probably p.haul no longer works with a version other than the one from its own package.

I would really appreciate it if someone could tell me whether there is a way to get this working.

Live migration is simply not working anymore.
Re: online container migration fails with Remote exception I/O operation on closed file [message #53516 is a reply to message #53303] Fri, 10 May 2019 19:38
HHawk
Messages: 32
Registered: September 2017
Location: Europe
Member
I have the same issue.

OpenVZ release 7.0.10 (252)
Virtuozzo Linux release 7.6.0 (618)
Linux name.of.server 3.10.0-957.10.1.vz7.85.17 #1 SMP Thu Apr 11 18:11:44 MSK 2019 x86_64 x86_64 x86_64 GNU/Linux


I ran the following command: prlctl migrate root@192.168.0.1/10012 root@192.168.0.2/10012 -v 10

Result:
Quote:

05-10 20:49:01.414 W /disp:3647:199402/ handleClientConnected
05-10 20:49:01.414 F /disp:3647:199402/ Processing command 'DspCmdUserEasyLoginLocal' 2186 (PJOC_SRV_LOGIN_LOCAL)
05-10 20:49:01.415 F /disp:3647:199402/ VM Directory /vz/vmprivate does not exists.
05-10 20:49:01.415 F /disp:3647:199402/ Virtuozzo user [root@.] successfully logged on( LOCAL ). [sessionId = {dfe639bf-d805-4d7c-9075-092877108e0a} ]
05-10 20:49:01.415 F /disp:3647:199402/ Session with uuid[ {dfe639bf-d805-4d7c-9075-092877108e0a} ] was started.
05-10 20:49:01.522 I /IOCommunication:3647:199402/ IO server ctx [read thr] (handle 70, sender 2): Socket graceful shutdown detected. No worries, everything goes fine.
05-10 20:49:01.522 W /disp:3647:199402/ handleClientDisconnected
05-10 20:51:57.623 F /disp:3647:199166/ Sending SIGKILL to 199168...
05-10 20:51:57.624 F /IOCommunication:3647:199166/ IO client ctx [read thr] (sender 2): WARNING: callback took too much time: about 300051 msecs. This is absolutely incorrect! Callback must be rewritten!
05-10 20:51:57.625 F /disp:3647:199165/ Task '20Task_MigrateCtSource' with uuid = {cd73d895-9131-4107-9a5e-bcbdff8d7323} was finished with result PRL_ERR_CT_MIGRATE_INTERNAL_ERROR (0x80031035) )
05-10 20:51:57.626 I /IOCommunication:3647:199166/ IO client ctx [read thr] (sender 2): Stop in progress for read thread
05-10 20:51:57.712 F /disp:3647:199153/ Processing command 'DspCmdUserLogoff' 2042 (PJOC_SRV_LOGOFF)
05-10 20:51:57.712 F /disp:3647:199153/ Virtuozzo user [root@.] successfully logged off. [sessionId = {653e6805-09f2-43c3-b7b0-2499587ffabb} ]
05-10 20:51:57.712 I /IOCommunication:3647:199153/ IO server ctx [read thr] (handle 42, sender 2) (83.172.190.190:49832): Stop in progress for read thread
05-10 20:51:57.712 W /disp:3647:199153/ handleClientDisconnected
05-10 20:52:02.232 F /disp:3647:3647/ Synchronizing VMs uptime values
05-10 20:52:02.232 F /disp:3647:3647/ Synchronization of VMs uptime was completed
05-10 20:52:06.650 F /disp:3647:4098/ RNG schema validation has failed
05-10 20:52:06.650 F /disp:3647:4098/ libvirt error no error
05-10 20:52:06.650 F /disp:3647:4098/ Cannot read host PCI devices: PRL_ERR_FAILURE
05-10 20:52:16.297 F /HostUtils:3647:4098/ Failed to load libpcs_client.so.1: libpcs_client.so.1: cannot open shared object file: No such file or directory


Maybe someone has an idea?

Things I tried and/or looked up:

1. Failed to load libpcs_client.so.1 (= apparently only a warning)
2. VM Directory /vz/vmprivate does not exists (= I created this directory on both source and destination; still same message)

I decided to try something else: I took the container offline and ran the command: prlctl migrate 10012 192.168.0.2
As a result, the container was migrated successfully...

...however, I would rather have a live migration if possible.


Maybe someone experienced can see in the logs what is wrong?

Thanks!
Re: online container migration fails with Remote exception I/O operation on closed file [message #53527 is a reply to message #53303] Sat, 18 May 2019 18:10
HHawk
Messages: 32
Registered: September 2017
Location: Europe
Member
* small bump *