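【Details】One BE node was suddenly killed by the system OOM killer: Out of memory: Kill process 14832 (jemalloc_bg_thd) score 676 or sacrifice child
【Background】None
【Business impact】
【StarRocks version】2.4.3
【Cluster size】5 FE (3 follower + 2 observer) + 6 BE (deployed separately)
【Machine info】16 cores + 64 GB
【Contact】Community group 6 - 春江
【Attachments】
- fe.log/beINFO/relevant screenshots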
be warn:
be info:
be.out
System log:
Apr 3 13:33:09 VM-0-2-centos kernel: [14831] 0 14831 25521830 11331827 31171 0 0 starrocks_be
Apr 3 13:33:09 VM-0-2-centos kernel: Out of memory: Kill process 14831 (starrocks_be) score 676 or sacrifice child
Apr 3 13:33:09 VM-0-2-centos kernel: Killed process 14831 (starrocks_be), UID 0, total-vm:102087320kB, anon-rss:45327184kB, file-rss:124kB, shmem-rss:0kB
Apr 3 13:33:09 VM-0-2-centos kernel: YDService invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Apr 3 13:33:09 VM-0-2-centos kernel: YDService cpuset=/ mems_allowed=0-1
Apr 3 13:33:09 VM-0-2-centos kernel: CPU: 5 PID: 13164 Comm: YDService Kdump: loaded Tainted: G W ------------ 3.10.0-1160.66.1.el7.x86_64 #1
Apr 3 13:33:09 VM-0-2-centos kernel: Hardware name: Tencent Cloud CVM, BIOS seabios-1.9.1-qemu-project.org 04/01/2014
Apr 3 13:33:09 VM-0-2-centos kernel: Call Trace:
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa15865a9>] dump_stack+0x19/0x1b
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa1581648>] dump_header+0x90/0x229
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0f06ac2>] ? ktime_get_ts64+0x52/0xf0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0f5e17f>] ? delayacct_end+0x8f/0xb0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fc258d>] oom_kill_process+0x2cd/0x490
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0f336f1>] ? cpuset_mems_allowed_intersects+0x21/0x30
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fc2c7a>] out_of_memory+0x31a/0x500
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fc9874>] __alloc_pages_nodemask+0xad4/0xbe0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa10193b8>] alloc_pages_current+0x98/0x110
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fbe037>] __page_cache_alloc+0x97/0xb0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fc0fe0>] filemap_fault+0x270/0x420
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffc050e756>] ext4_filemap_fault+0x36/0x50 [ext4]
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fee7aa>] __do_fault.isra.61+0x8a/0x100
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0feed5c>] do_read_fault.isra.63+0x4c/0x1b0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0ff65a0>] handle_mm_fault+0xa20/0xfb0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa1594653>] __do_page_fault+0x213/0x500
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa1594a26>] trace_do_page_fault+0x56/0x150
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa1593fa2>] do_async_page_fault+0x22/0xf0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa15907a8>] async_page_fault+0x28/0x30
Apr 3 13:33:09 VM-0-2-centos kernel: Mem-Info:
...
Apr 3 13:33:09 VM-0-2-centos kernel: [ 441] 0 441 6945614 4513913 9168 0 0 java
Apr 3 13:33:09 VM-0-2-centos kernel: [14855] 0 14831 25521830 11342843 31171 0 0 jemalloc_bg_thd
Apr 3 13:33:09 VM-0-2-centos kernel: Out of memory: Kill process 14832 (jemalloc_bg_thd) score 676 or sacrifice child
Apr 3 13:33:09 VM-0-2-centos kernel: Killed process 14855 (jemalloc_bg_thd), UID 0, total-vm:102087320kB, anon-rss:45370244kB, file-rss:1128kB, shmem-rss:0kB
Apr 3 13:33:09 VM-0-2-centos kernel: java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Apr 3 13:33:09 VM-0-2-centos kernel: java cpuset=/ mems_allowed=0-1
Apr 3 13:33:09 VM-0-2-centos kernel: CPU: 9 PID: 461 Comm: java Kdump: loaded Tainted: G W ------------ 3.10.0-1160.66.1.el7.x86_64 #1
Apr 3 13:33:09 VM-0-2-centos kernel: Hardware name: Tencent Cloud CVM, BIOS seabios-1.9.1-qemu-project.org 04/01/2014
Apr 3 13:33:09 VM-0-2-centos kernel: Call Trace:
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa15865a9>] dump_stack+0x19/0x1b
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa1581648>] dump_header+0x90/0x229
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0f06ac2>] ? ktime_get_ts64+0x52/0xf0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0f5e17f>] ? delayacct_end+0x8f/0xb0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fc258d>] oom_kill_process+0x2cd/0x490
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fc2c7a>] out_of_memory+0x31a/0x500
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fc9874>] __alloc_pages_nodemask+0xad4/0xbe0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa10193b8>] alloc_pages_current+0x98/0x110
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fbe037>] __page_cache_alloc+0x97/0xb0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fc0fe0>] filemap_fault+0x270/0x420
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffc050e756>] ext4_filemap_fault+0x36/0x50 [ext4]
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0fee7aa>] __do_fault.isra.61+0x8a/0x100
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0feed5c>] do_read_fault.isra.63+0x4c/0x1b0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa0ff65a0>] handle_mm_fault+0xa20/0xfb0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa1594653>] __do_page_fault+0x213/0x500
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa1594a26>] trace_do_page_fault+0x56/0x150
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa1593fa2>] do_async_page_fault+0x22/0xf0
Apr 3 13:33:09 VM-0-2-centos kernel: [<ffffffffa15907a8>] async_page_fault+0x28/0x30
Apr 3 13:33:09 VM-0-2-centos kernel: Mem-Info:
Apr 3 13:33:09 VM-0-2-centos kernel: active_anon:15888426 inactive_anon:87 isolated_anon:28
 active_file:0 inactive_file:5890 isolated_file:153
 unevictable:2589 dirty:34 writeback:2 unstable:0
 slab_reclaimable:68101 slab_unreclaimable:15434
 mapped:2442 shmem:175 pagetables:41725 bounce:0
 free:95527 free_pcp:828 free_cma:0
Apr 3 13:33:09 VM-0-2-centos kernel: Node 0 DMA free:15892kB min:20kB low:24kB high:28kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:16kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
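For anyone reading the task-dump lines above, here is a minimal sketch that converts the page counts into GiB. It rests on two assumptions that the log itself does not state: that the page size is 4 KiB (the usual x86_64 default), and that the columns follow the 3.10 kernel's task-dump order (pid, uid, tgid, total_vm, rss, nr_ptes, swapents, oom_score_adj, name). The helper names `parse_task` and `pages_to_gib` are hypothetical and exist only for this illustration.

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages (x86_64 default)

# Assumed column order of the OOM task dump on a 3.10 kernel:
# [pid] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
TASK_RE = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+"
    r"(?P<swapents>\d+)\s+(?P<adj>-?\d+)\s+(?P<name>\S+)"
)


def pages_to_gib(pages):
    """Convert a page count to GiB using the assumed page size."""
    return pages * PAGE_KIB / (1024 * 1024)


def parse_task(line):
    """Return pid, name and resident memory for one task-dump line, or None."""
    m = TASK_RE.search(line)
    if m is None:
        return None
    rss_pages = int(m["rss"])
    return {
        "pid": int(m["pid"]),
        "name": m["name"],
        "rss_pages": rss_pages,
        "rss_kib": rss_pages * PAGE_KIB,
        "rss_gib": pages_to_gib(rss_pages),
    }


# Two lines copied from the system log in this post.
LINES = [
    "Apr 3 13:33:09 VM-0-2-centos kernel: [14831] 0 14831 25521830 11331827 31171 0 0 starrocks_be",
    "Apr 3 13:33:09 VM-0-2-centos kernel: [ 441] 0 441 6945614 4513913 9168 0 0 java",
]

for line in LINES:
    task = parse_task(line)
    if task is not None:
        print(f"{task['name']:<14} pid={task['pid']:<6} rss={task['rss_gib']:.2f} GiB")
```

As a sanity check on the 4 KiB assumption, the starrocks_be rss column (11331827 pages) times 4 KiB gives 45327308 kB, which equals the anon-rss plus file-rss reported in the "Killed process 14831" line (45327184 kB + 124 kB).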
System monitoring