OpenVZ Forum


Migration Fails with vzmigrate: vzmdest invoked oom-killer [ovz7] [message #53572] Tue, 17 September 2019 04:23
andre
When migrating VEs using vzmigrate, we have seen this behaviour twice today on different servers (different source and destination) and different hardware.

The destination server runs out of memory until vzmdest gets killed. It happens even with no VEs on the destination and only the one being migrated on the source.

Command: vzmigrate -r no -v --ssh='-p 2222' DESTIP 102

3.10.0-957.12.2.vz7.96.21
ploop-7.0.157-1.vz7.x86_64
ploop-lib-7.0.157-1.vz7.x86_64

Sep 16 22:15:03 srv systemd: Removed slice User Slice of vmanager.
Sep 16 22:15:44 srv kernel: vzmdest invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Sep 16 22:15:44 srv kernel: vzmdest cpuset=/ mems_allowed=0
Sep 16 22:15:44 srv kernel: CPU: 4 PID: 77510 Comm: vzmdest ve: 0 Not tainted 3.10.0-957.12.2.vz7.96.21 #1 96.21
Sep 16 22:15:44 srv kernel: Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 4.6.4 06/30/2011
Sep 16 22:15:44 srv kernel: Call Trace:
Sep 16 22:15:44 srv kernel: [<ffffffff92b94677>] dump_stack+0x19/0x1b
Sep 16 22:15:44 srv kernel: [<ffffffff92b8ef52>] dump_header+0x90/0x229
Sep 16 22:15:44 srv kernel: [<ffffffff92569d7f>] ? delayacct_end+0x8f/0xb0
Sep 16 22:15:44 srv kernel: [<ffffffff925cdcf8>] oom_kill_process+0x5e8/0x640
Sep 16 22:15:44 srv kernel: [<ffffffff925f82ce>] ? get_task_oom_score_adj+0xee/0x100
Sep 16 22:15:44 srv kernel: [<ffffffff925ce141>] out_of_memory+0x391/0x4e0
Sep 16 22:15:44 srv kernel: [<ffffffff92b8fa6e>] __alloc_pages_slowpath+0x5de/0x78c
Sep 16 22:15:44 srv kernel: [<ffffffff925d4a42>] __alloc_pages_nodemask+0x5d2/0x600
Sep 16 22:15:44 srv kernel: [<ffffffff92569ea4>] ? __delayacct_blkio_end+0x34/0x60
Sep 16 22:15:44 srv kernel: [<ffffffff92627278>] alloc_pages_current+0x98/0x110
Sep 16 22:15:44 srv kernel: [<ffffffff925c8f87>] __page_cache_alloc+0x97/0xb0
Sep 16 22:15:44 srv kernel: [<ffffffff925cbb70>] filemap_fault+0x200/0x4b0
Sep 16 22:15:44 srv kernel: [<ffffffffc0369d56>] ext4_filemap_fault+0x36/0x50 [ext4]
Sep 16 22:15:44 srv kernel: [<ffffffff925fcf1d>] __do_fault.isra.62+0x9d/0x170
Sep 16 22:15:44 srv kernel: [<ffffffff926020d5>] handle_pte_fault+0x3c5/0xce0
Sep 16 22:15:44 srv kernel: [<ffffffff92604b47>] handle_mm_fault+0x397/0x9a0
Sep 16 22:15:44 srv kernel: [<ffffffff92ba26e3>] __do_page_fault+0x203/0x4f0
Sep 16 22:15:44 srv kernel: [<ffffffff92ba2a05>] do_page_fault+0x35/0x90
Sep 16 22:15:44 srv kernel: [<ffffffff92b9eab6>] ? error_swapgs+0xa7/0xbd
Sep 16 22:15:44 srv kernel: [<ffffffff92b9e768>] page_fault+0x28/0x30
Sep 16 22:15:44 srv kernel: Mem-Info:
Sep 16 22:15:44 srv kernel: active_anon:3444028 inactive_anon:439407 isolated_anon:0#012 active_file:1212 inactive_file:1193 isolated_file:0#012 unevictable:0 dirty:170 writeback:22 wbtmp:0 unstable:0#012 slab_reclaimable:22642 slab_unreclaimable:6879#012 mapped:3576 shmem:2342 pagetables:14890 bounce:0#012 free:98117 free_pcp:229 free_cma:0
Sep 16 22:15:44 srv kernel: Node 0 DMA free:15888kB min:316kB low:392kB high:468kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15972kB managed:15888kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Sep 16 22:15:44 srv kernel: lowmem_reserve[]: 0 2821 15856 15856
Sep 16 22:15:44 srv kernel: Node 0 DMA32 free:109716kB min:57784kB low:72228kB high:86672kB active_anon:2189164kB inactive_anon:547540kB active_file:1124kB inactive_file:1080kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3119688kB managed:2889256kB mlocked:0kB dirty:292kB writeback:24kB mapped:2672kB shmem:3516kB slab_reclaimable:10004kB slab_unreclaimable:2368kB kernel_stack:576kB pagetables:10120kB unstable:0kB bounce:0kB free_pcp:236kB local_pcp:120kB free_cma:0kB writeback_tmp:0kB pages_scanned:17233 all_unreclaimable? yes
Sep 16 22:15:44 srv kernel: lowmem_reserve[]: 0 0 13034 13034
Sep 16 22:15:44 srv kernel: Node 0 Normal free:266864kB min:266952kB low:333688kB high:400424kB active_anon:11586948kB inactive_anon:1210088kB active_file:3724kB inactive_file:3692kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:13631488kB managed:13350452kB mlocked:0kB dirty:388kB writeback:64kB mapped:11632kB shmem:5852kB slab_reclaimable:80564kB slab_unreclaimable:25148kB kernel_stack:3920kB pagetables:49440kB unstable:0kB bounce:0kB free_pcp:680kB local_pcp:240kB free_cma:0kB writeback_tmp:0kB pages_scanned:11140 all_unreclaimable? yes
Sep 16 22:15:44 srv kernel: lowmem_reserve[]: 0 0 0 0
Sep 16 22:15:44 srv kernel: Node 0 DMA: 0*4kB 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15888kB
Sep 16 22:15:44 srv kernel: Node 0 DMA32: 506*4kB (UEM) 200*8kB (UE) 147*16kB (UEM) 145*32kB (UE) 56*64kB (UEM) 57*128kB (UEM) 58*256kB (UEM) 34*512kB (UEM) 7*1024kB (UEM) 10*2048kB (EM) 7*4096kB (UEM) = 110072kB
Sep 16 22:15:44 srv kernel: Node 0 Normal: 1754*4kB (UE) 700*8kB (UE) 655*16kB (UEM) 824*32kB (UEM) 675*64kB (UEM) 361*128kB (UEM) 178*256kB (UE) 123*512kB (UEM) 19*1024kB (UM) 0*2048kB 0*4096kB = 266872kB
Sep 16 22:15:44 srv kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Sep 16 22:15:44 srv kernel: 4916 total pagecache pages
Sep 16 22:15:44 srv kernel: 227 pages in swap cache
Sep 16 22:15:44 srv kernel: Swap cache stats: add 9598431, delete 9597921, find 3758613/4237238
Sep 16 22:15:44 srv kernel: Free swap = 0kB
Sep 16 22:15:44 srv kernel: Total swap = 8191992kB
Sep 16 22:15:44 srv kernel: 4191787 pages RAM
Sep 16 22:15:44 srv kernel: 0 pages HighMem/MovableOnly
Sep 16 22:15:44 srv kernel: 127888 pages reserved
Sep 16 22:15:44 srv kernel: Out of memory: Kill process 77510 (vzmdest) score 936 or sacrifice child
Sep 16 22:15:44 srv kernel: Killed process 77535 (p.haul-service) in VE "0" total-vm:387416kB, anon-rss:1008kB, file-rss:268kB, shmem-rss:0kB
Sep 16 22:15:50 srv kernel: vzmdest invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
Sep 16 22:15:50 srv kernel: vzmdest cpuset=/ mems_allowed=0
Sep 16 22:15:50 srv kernel: CPU: 3 PID: 77510 Comm: vzmdest ve: 0 Not tainted 3.10.0-957.12.2.vz7.96.21 #1 96.21
Sep 16 22:15:50 srv kernel: Hardware name: Supermicro X9SCL/X9SCM/X9SCL/X9SCM, BIOS 4.6.4 06/30/2011
Sep 16 22:15:50 srv kernel: Call Trace:
Sep 16 22:15:50 srv kernel: [<ffffffff92b94677>] dump_stack+0x19/0x1b
Sep 16 22:15:50 srv kernel: [<ffffffff92b8ef52>] dump_header+0x90/0x229
Sep 16 22:15:50 srv kernel: [<ffffffff92569d7f>] ? delayacct_end+0x8f/0xb0
...
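
As a rough cross-check of the Mem-Info block above, the page counters can be converted to sizes. The sketch below assumes 4 KiB pages (the usual x86_64 default; the log does not state the page size) and uses only figures copied from the first OOM report. The helper names are illustrative.

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages, the usual x86_64 default

# Counters copied from the first Mem-Info report in the log above.
MEM_INFO = (
    "active_anon:3444028 inactive_anon:439407 isolated_anon:0 "
    "active_file:1212 inactive_file:1193 isolated_file:0 "
    "free:98117 free_pcp:229 free_cma:0"
)
TOTAL_RAM_PAGES = 4191787  # from the line "4191787 pages RAM"


def parse_counters(text):
    """Return {name: page_count} for every name:number pair in text."""
    return {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", text)}


def pages_to_gib(pages):
    """Convert a page count to GiB using the assumed page size."""
    return pages * PAGE_KIB / (1024 * 1024)


counters = parse_counters(MEM_INFO)
anon_pages = counters["active_anon"] + counters["inactive_anon"]
file_pages = counters["active_file"] + counters["inactive_file"]

print(f"RAM total  : {pages_to_gib(TOTAL_RAM_PAGES):.2f} GiB")
print(f"anonymous  : {pages_to_gib(anon_pages):.2f} GiB")
print(f"file cache : {file_pages * PAGE_KIB / 1024:.1f} MiB")
print(f"free       : {counters['free'] * PAGE_KIB / 1024:.0f} MiB")
print(f"anon share : {anon_pages / TOTAL_RAM_PAGES:.1%}")
```

With these figures, anonymous memory comes to roughly 14.8 GiB of about 16 GiB RAM (around 93%), the file cache is under 10 MiB, and the log reports Free swap = 0kB, so the pressure appears to come from anonymous allocations rather than from page cache.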

Re: Migration Fails with vzmigrate: vzmdest invoked oom-killer [ovz7] [message #53573 is a reply to message #53572] Tue, 17 September 2019 05:08
andre
If the VE is stopped before running vzmigrate, the migration works.
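
A minimal sketch of that offline workaround, which only prints the commands instead of running them. The vzmigrate line is the one from the first post; using `vzctl stop` to stop the container beforehand is an assumption, since the thread does not say how the VE was stopped.

```python
import shlex

CTID = "102"     # container ID from the original command
DEST = "DESTIP"  # placeholder, as in the original post

# Assumption: stop the container first, then reuse the original vzmigrate call.
commands = [
    ["vzctl", "stop", CTID],
    ["vzmigrate", "-r", "no", "-v", "--ssh=-p 2222", DEST, CTID],
]

for argv in commands:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to try on any machine; the downtime of the VE for the duration of the copy is the obvious cost of this approach.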

Re: Migration Fails with vzmigrate: vzmdest invoked oom-killer [ovz7] [message #53586 is a reply to message #53572] Thu, 10 October 2019 11:31
MilesWeb
Other possibilities include tuning the OOM killer, spreading the load horizontally across several smaller instances, or reducing the memory demands of the application.

Re: Migration Fails with vzmigrate: vzmdest invoked oom-killer [ovz7] [message #53587 is a reply to message #53586] Fri, 11 October 2019 02:40
andre
The OOM kill happens on the destination node, where there is "nothing" running, so I have no way to save memory there. It has only the base OS (OVZ 7) with plenty of free RAM.

It doesn't happen when starting the migrated VE; the transfer has not even finished. So I believe the memory usage of the VE (which is running on the source node, not on the destination where the OOM happens) doesn't count. It is still just copying its filesystem.

It happened again today with 2 different servers, same scenario. We had to fall back to rsync.
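
To see which process actually grows on the destination while the copy is running, its memory can be sampled from /proc. A minimal sketch, assuming a Linux /proc layout; the process name vzmdest is taken from the log above, and the helper names are hypothetical.

```python
import os
import re


def parse_vm_kib(status_text):
    """Extract VmRSS and VmSwap (in KiB) from /proc/<pid>/status text."""
    fields = {}
    for key in ("VmRSS", "VmSwap"):
        match = re.search(rf"^{key}:\s+(\d+)\s+kB", status_text, re.MULTILINE)
        fields[key] = int(match.group(1)) if match else 0
    return fields


def find_pids(comm, proc="/proc"):
    """Return PIDs whose command name equals comm (empty if proc is absent)."""
    pids = []
    if not os.path.isdir(proc):
        return pids
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "comm")) as handle:
                if handle.read().strip() == comm:
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited while we were scanning
    return pids


def sample(comm, proc="/proc"):
    """Return {pid: {"VmRSS": kib, "VmSwap": kib}} for matching processes."""
    result = {}
    for pid in find_pids(comm, proc):
        try:
            with open(os.path.join(proc, str(pid), "status")) as handle:
                result[pid] = parse_vm_kib(handle.read())
        except OSError:
            continue
    return result


# One pass only; repeat it every few seconds while vzmigrate is active.
for pid, usage in sample("vzmdest").items():
    print(pid, usage)
```

Run on the destination node during a migration, repeated samples would show whether vzmdest itself or one of its children (such as p.haul-service, which the log shows being killed) is the one accumulating memory.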


[Updated on: Fri, 11 October 2019 02:40]
