BE nodes are frequently OOM-killed

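[Details] We deployed a view that joins two large tables, and since then the BE nodes have been OOM-killed frequently. I checked be.out on the killed nodes, but the log contains little detail.
[Business impact]
[StarRocks version] 2.4.4
[Cluster size] e.g. 3 FE (3 followers) + 4 BE (deployed separately)
[Machine specs] 16 cores / 64 GB RAM / 10 Gigabit network
[Contact] StarRocks community group 12-Hale
[Attachments]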

Even when a query is slow or uses too much memory, how can BE OOM be avoided?

With the pipeline engine enabled, set parallel_fragment_exec_instance_num to 1. Please collect the mem_tracker output as described at https://docs.starrocks.io/zh-cn/latest/administration/Memory_management and post the complete output of dmesg -T.
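
If it helps, mem_tracker can be pulled from every BE in one pass with a small script. This is only a minimal sketch: the /mem_tracker path and the default BE HTTP port 8040 are assumptions based on the linked Memory_management page, and the function names are made up for illustration, so check the port against be_http_port in your be.conf.

```python
from urllib.request import urlopen


def mem_tracker_url(be_host, be_http_port=8040):
    """Build the mem_tracker URL for one BE (path and port are assumptions)."""
    return f"http://{be_host}:{be_http_port}/mem_tracker"


def fetch_mem_tracker(be_host, be_http_port=8040, timeout=5.0):
    """Return the raw mem_tracker page of one BE as text."""
    with urlopen(mem_tracker_url(be_host, be_http_port), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def collect(be_hosts, be_http_port=8040):
    """Fetch mem_tracker from each BE; store the error text for unreachable nodes."""
    results = {}
    for host in be_hosts:
        try:
            results[host] = fetch_mem_tracker(host, be_http_port)
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            results[host] = f"ERROR: {exc}"
    return results
```

Calling collect() with the list of BE addresses returns one entry per node, which can be saved next to the dmesg -T output taken at the same time.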

[Sat Aug 19 20:57:55 2023] Node 0 Normal: 1438*4kB (UEM) 946*8kB (UEM) 417*16kB (UEM) 203*32kB (UEM) 153*64kB (UEM) 91*128kB (UEM) 40*256kB (UEM) 14*512kB (UEM) 1*1024kB (E) 0*2048kB 0*4096kB = 66360kB
[Sat Aug 19 20:57:55 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Sat Aug 19 20:57:55 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Sat Aug 19 20:57:55 2023] 44802 total pagecache pages
[Sat Aug 19 20:57:55 2023] 0 pages in swap cache
[Sat Aug 19 20:57:55 2023] Swap cache stats: add 0, delete 0, find 0/0
[Sat Aug 19 20:57:55 2023] Free swap = 0kB
[Sat Aug 19 20:57:55 2023] Total swap = 0kB
[Sat Aug 19 20:57:55 2023] 16569223 pages RAM
[Sat Aug 19 20:57:55 2023] 0 pages HighMem/MovableOnly
[Sat Aug 19 20:57:55 2023] 322401 pages reserved
[Sat Aug 19 20:57:55 2023] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[Sat Aug 19 20:57:55 2023] [ 512] 0 512 16003 5248 38 0 0 systemd-journal
[Sat Aug 19 20:57:55 2023] [ 548] 0 548 11350 138 23 0 -1000 systemd-udevd
[Sat Aug 19 20:57:55 2023] [ 612] 0 612 13883 113 28 0 -1000 auditd
[Sat Aug 19 20:57:55 2023] [ 676] 0 676 5410 66 17 0 0 irqbalance
[Sat Aug 19 20:57:55 2023] [ 678] 81 678 14522 137 32 0 -900 dbus-daemon
[Sat Aug 19 20:57:55 2023] [ 682] 32 682 17314 135 37 0 0 rpcbind
[Sat Aug 19 20:57:55 2023] [ 690] 999 690 153085 2427 62 0 0 polkitd
[Sat Aug 19 20:57:55 2023] [ 692] 0 692 6629 98 18 0 0 systemd-logind
[Sat Aug 19 20:57:55 2023] [ 705] 998 705 29452 114 28 0 0 chronyd
[Sat Aug 19 20:57:55 2023] [ 721] 0 721 48801 117 36 0 0 gssproxy
[Sat Aug 19 20:57:55 2023] [ 1004] 0 1004 25747 521 48 0 0 dhclient
[Sat Aug 19 20:57:55 2023] [ 1061] 0 1061 143570 3335 100 0 0 tuned
[Sat Aug 19 20:57:55 2023] [ 1189] 0 1189 22447 259 42 0 0 master
[Sat Aug 19 20:57:55 2023] [ 1203] 89 1203 22517 265 44 0 0 qmgr
[Sat Aug 19 20:57:55 2023] [ 1267] 0 1267 90145 3122 87 0 0 rsyslogd
[Sat Aug 19 20:57:55 2023] [ 1282] 0 1282 27551 32 13 0 0 agetty
[Sat Aug 19 20:57:55 2023] [ 1283] 0 1283 31596 160 21 0 0 crond
[Sat Aug 19 20:57:55 2023] [ 1286] 0 1286 27551 32 10 0 0 agetty
[Sat Aug 19 20:57:55 2023] [ 1399] 0 1399 28246 259 57 0 -1000 sshd
[Sat Aug 19 20:57:55 2023] [ 7738] 997 7738 180237 4320 36 0 0 node_exporter
[Sat Aug 19 20:57:55 2023] [18027] 1002 18027 4814538 2772997 5880 0 0 java
[Sat Aug 19 20:57:55 2023] [ 686] 0 686 39211 335 81 0 0 sshd
[Sat Aug 19 20:57:55 2023] [ 688] 1000 688 39211 340 78 0 0 sshd
[Sat Aug 19 20:57:55 2023] [ 689] 1000 689 29155 383 16 0 0 bash
[Sat Aug 19 20:57:55 2023] [ 3302] 1002 3302 17899734 13050896 30171 0 0 starrocks_be
[Sat Aug 19 21:07:55 2023] Out of memory: Kill process 5830 (starrocks_be) score 805 or sacrifice child
[Sat Aug 19 21:07:55 2023] Killed process 5830 (starrocks_be), UID 1002, total-vm:76506872kB, anon-rss:52218612kB, file-rss:0kB, shmem-rss:0kB
[Sat Aug 19 21:07:56 2023] in:imjournal invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[Sat Aug 19 21:07:56 2023] in:imjournal cpuset=/ mems_allowed=0
[Sat Aug 19 21:07:56 2023] CPU: 2 PID: 1285 Comm: in:imjournal Kdump: loaded Not tainted 3.10.0-1160.71.1.el7.x86_64 #1
[Sat Aug 19 21:07:56 2023] Hardware name: Amazon EC2 m5.4xlarge/, BIOS 1.0 10/16/2017
[Sat Aug 19 21:07:56 2023] Call Trace:
[Sat Aug 19 21:07:56 2023] [] dump_stack+0x19/0x1b
[Sat Aug 19 21:07:56 2023] [] dump_header+0x90/0x229
[Sat Aug 19 21:07:56 2023] [] ? ktime_get_ts64+0x52/0xf0
[Sat Aug 19 21:07:56 2023] [] ? delayacct_end+0x8f/0xb0
[Sat Aug 19 21:07:56 2023] [] oom_kill_process+0x2cd/0x490
[Sat Aug 19 21:07:56 2023] [] ? oom_unkillable_task+0xcd/0x120
[Sat Aug 19 21:07:56 2023] [] out_of_memory+0x31a/0x500
[Sat Aug 19 21:07:56 2023] [] __alloc_pages_nodemask+0xad4/0xbe0
[Sat Aug 19 21:07:56 2023] [] alloc_pages_current+0x98/0x110
[Sat Aug 19 21:07:56 2023] [] __page_cache_alloc+0x97/0xb0
[Sat Aug 19 21:07:56 2023] [] filemap_fault+0x270/0x420
[Sat Aug 19 21:07:56 2023] [] __xfs_filemap_fault+0x7e/0x1d0 [xfs]
[Sat Aug 19 21:07:56 2023] [] xfs_filemap_fault+0x2c/0x30 [xfs]
[Sat Aug 19 21:07:56 2023] [] __do_fault.isra.61+0x8a/0x100
[Sat Aug 19 21:07:56 2023] [] do_read_fault.isra.63+0x4c/0x1b0
[Sat Aug 19 21:07:56 2023] [] handle_mm_fault+0xa20/0xfb0
[Sat Aug 19 21:07:56 2023] [] ? list_del+0xd/0x30
[Sat Aug 19 21:07:56 2023] [] __do_page_fault+0x213/0x500
[Sat Aug 19 21:07:56 2023] [] trace_do_page_fault+0x56/0x150
[Sat Aug 19 21:07:56 2023] [] do_async_page_fault+0x22/0xf0
[Sat Aug 19 21:07:56 2023] [] async_page_fault+0x28/0x30
[Sat Aug 19 21:07:56 2023] Mem-Info:
[Sat Aug 19 21:07:56 2023] active_anon:15864277 inactive_anon:22874 isolated_anon:0
active_file:14 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
slab_reclaimable:46271 slab_unreclaimable:15899
mapped:3820 shmem:43147 pagetables:39631 bounce:0
free:83196 free_pcp:984 free_cma:0
[Sat Aug 19 21:07:56 2023] Node 0 DMA free:15908kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Sat Aug 19 21:07:56 2023] lowmem_reserve[]: 0 2827 63445 63445
[Sat Aug 19 21:07:56 2023] Node 0 DMA32 free:246116kB min:3008kB low:3760kB high:4512kB active_anon:2515800kB inactive_anon:3272kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129252kB managed:2895748kB mlocked:0kB dirty:0kB writeback:0kB mapped:192kB shmem:6144kB slab_reclaimable:108072kB slab_unreclaimable:8584kB kernel_stack:752kB pagetables:5112kB unstable:0kB bounce:0kB free_pcp:1624kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:15 all_unreclaimable? yes
[Sat Aug 19 21:07:56 2023] lowmem_reserve[]: 0 0 60617 60617
[Sat Aug 19 21:07:56 2023] Node 0 Normal free:70032kB min:64552kB low:80688kB high:96828kB active_anon:60941308kB inactive_anon:88224kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:63131648kB managed:62075632kB mlocked:0kB dirty:8kB writeback:0kB mapped:14816kB shmem:166444kB slab_reclaimable:77012kB slab_unreclaimable:54596kB kernel_stack:7840kB pagetables:153412kB unstable:0kB bounce:0kB free_pcp:2308kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Sat Aug 19 21:07:56 2023] lowmem_reserve[]: 0 0 0 0
[Sat Aug 19 21:07:56 2023] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
[Sat Aug 19 21:07:56 2023] Node 0 DMA32: 287*4kB (UE) 682*8kB (UE) 2831*16kB (UEM) 2020*32kB (UEM) 981*64kB (UEM) 344*128kB (UE) 65*256kB (UE) 10*512kB (U) 1*1024kB (M) 0*2048kB 0*4096kB = 246140kB
[Sat Aug 19 21:07:56 2023] Node 0 Normal: 455*4kB (UEM) 1422*8kB (UEM) 1206*16kB (UEM) 399*32kB (UEM) 161*64kB (UEM) 46*128kB (UEM) 19*256kB (UEM) 4*512kB (UEM) 1*1024kB (E) 0*2048kB 0*4096kB = 69388kB
[Sat Aug 19 21:07:56 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Sat Aug 19 21:07:56 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Sat Aug 19 21:07:56 2023] 43019 total pagecache pages
[Sat Aug 19 21:07:56 2023] 0 pages in swap cache
[Sat Aug 19 21:07:56 2023] Swap cache stats: add 0, delete 0, find 0/0
[Sat Aug 19 21:07:56 2023] Free swap = 0kB
[Sat Aug 19 21:07:56 2023] Total swap = 0kB
[Sat Aug 19 21:07:56 2023] 16569223 pages RAM
[Sat Aug 19 21:07:56 2023] 0 pages HighMem/MovableOnly
[Sat Aug 19 21:07:56 2023] 322401 pages reserved
[Sat Aug 19 21:07:56 2023] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[Sat Aug 19 21:07:56 2023] [ 512] 0 512 16003 5248 39 0 0 systemd-journal
[Sat Aug 19 21:07:56 2023] [ 548] 0 548 11350 138 23 0 -1000 systemd-udevd
[Sat Aug 19 21:07:56 2023] [ 612] 0 612 13883 113 28 0 -1000 auditd
[Sat Aug 19 21:07:56 2023] [ 676] 0 676 5410 64 17 0 0 irqbalance
[Sat Aug 19 21:07:56 2023] [ 678] 81 678 14522 137 32 0 -900 dbus-daemon
[Sat Aug 19 21:07:56 2023] [ 682] 32 682 17314 135 37 0 0 rpcbind
[Sat Aug 19 21:07:56 2023] [ 690] 999 690 153085 2427 62 0 0 polkitd
[Sat Aug 19 21:07:56 2023] [ 692] 0 692 6629 98 18 0 0 systemd-logind
[Sat Aug 19 21:07:56 2023] [ 705] 998 705 29452 114 28 0 0 chronyd
[Sat Aug 19 21:07:56 2023] [ 721] 0 721 48801 117 36 0 0 gssproxy
[Sat Aug 19 21:07:56 2023] [ 1004] 0 1004 25747 518 48 0 0 dhclient
[Sat Aug 19 21:07:56 2023] [ 1061] 0 1061 143570 3335 100 0 0 tuned
[Sat Aug 19 21:07:56 2023] [ 1189] 0 1189 22447 259 42 0 0 master
[Sat Aug 19 21:07:56 2023] [ 1203] 89 1203 22517 265 44 0 0 qmgr
[Sat Aug 19 21:07:56 2023] [ 1267] 0 1267 90145 3138 87 0 0 rsyslogd
[Sat Aug 19 21:07:56 2023] [ 1282] 0 1282 27551 32 13 0 0 agetty
[Sat Aug 19 21:07:56 2023] [ 1283] 0 1283 31596 161 21 0 0 crond
[Sat Aug 19 21:07:56 2023] [ 1286] 0 1286 27551 32 10 0 0 agetty
[Sat Aug 19 21:07:56 2023] [ 1399] 0 1399 28246 259 57 0 -1000 sshd
[Sat Aug 19 21:07:56 2023] [ 7738] 997 7738 180237 3361 36 0 0 node_exporter
[Sat Aug 19 21:07:56 2023] [18027] 1002 18027 4814538 2773065 5880 0 0 java
[Sat Aug 19 21:07:56 2023] [ 686] 0 686 39211 335 81 0 0 sshd
[Sat Aug 19 21:07:56 2023] [ 688] 1000 688 39211 340 78 0 0 sshd
[Sat Aug 19 21:07:56 2023] [ 689] 1000 689 29155 383 16 0 0 bash
[Sat Aug 19 21:07:56 2023] [ 4269] 89 4269 22473 250 44 0 0 pickup
[Sat Aug 19 21:07:56 2023] [ 4887] 0 4887 39211 337 80 0 0 sshd
[Sat Aug 19 21:07:56 2023] [ 4889] 1000 4889 39211 334 76 0 0 sshd
[Sat Aug 19 21:07:56 2023] [ 4890] 1000 4890 29122 341 14 0 0 bash
[Sat Aug 19 21:07:56 2023] [ 4918] 0 4918 60352 292 72 0 0 sudo
[Sat Aug 19 21:07:56 2023] [ 4919] 0 4919 47969 144 52 0 0 su
[Sat Aug 19 21:07:56 2023] [ 4920] 0 4920 29160 371 15 0 0 bash
[Sat Aug 19 21:07:56 2023] [ 5838] 1002 5828 19126718 13054716 32417 0 0 jemalloc_bg_thd
[Sat Aug 19 21:07:56 2023] Out of memory: Kill process 5838 (jemalloc_bg_thd) score 805 or sacrifice child
[Sat Aug 19 21:07:56 2023] Killed process 5838 (jemalloc_bg_thd), UID 1002, total-vm:76506872kB, anon-rss:52218860kB, file-rss:4kB, shmem-rss:0kB
[Sat Aug 19 21:11:13 2023] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[Sat Aug 19 21:11:13 2023] java cpuset=/ mems_allowed=0
[Sat Aug 19 21:11:13 2023] CPU: 4 PID: 18312 Comm: java Kdump: loaded Not tainted 3.10.0-1160.71.1.el7.x86_64 #1
[Sat Aug 19 21:11:13 2023] Hardware name: Amazon EC2 m5.4xlarge/, BIOS 1.0 10/16/2017
[Sat Aug 19 21:11:13 2023] Call Trace:
[Sat Aug 19 21:11:13 2023] [] dump_stack+0x19/0x1b
[Sat Aug 19 21:11:13 2023] [] dump_header+0x90/0x229
[Sat Aug 19 21:11:13 2023] [] ? ktime_get_ts64+0x52/0xf0
[Sat Aug 19 21:11:13 2023] [] oom_kill_process+0x2cd/0x490
[Sat Aug 19 21:11:13 2023] [] ? oom_unkillable_task+0xcd/0x120
[Sat Aug 19 21:11:13 2023] [] out_of_memory+0x31a/0x500
[Sat Aug 19 21:11:13 2023] [] __alloc_pages_nodemask+0xad4/0xbe0
[Sat Aug 19 21:11:13 2023] [] alloc_pages_current+0x98/0x110
[Sat Aug 19 21:11:13 2023] [] __page_cache_alloc+0x97/0xb0
[Sat Aug 19 21:11:13 2023] [] filemap_fault+0x270/0x420
[Sat Aug 19 21:11:13 2023] [] __xfs_filemap_fault+0x7e/0x1d0 [xfs]
[Sat Aug 19 21:11:13 2023] [] xfs_filemap_fault+0x2c/0x30 [xfs]
[Sat Aug 19 21:11:13 2023] [] __do_fault.isra.61+0x8a/0x100
[Sat Aug 19 21:11:13 2023] [] do_read_fault.isra.63+0x4c/0x1b0
[Sat Aug 19 21:11:13 2023] [] handle_mm_fault+0xa20/0xfb0
[Sat Aug 19 21:11:13 2023] [] __do_page_fault+0x213/0x500
[Sat Aug 19 21:11:13 2023] [] trace_do_page_fault+0x56/0x150
[Sat Aug 19 21:11:13 2023] [] do_async_page_fault+0x22/0xf0
[Sat Aug 19 21:11:13 2023] [] async_page_fault+0x28/0x30
[Sat Aug 19 21:11:13 2023] Mem-Info:
[Sat Aug 19 21:11:13 2023] active_anon:15869738 inactive_anon:22874 isolated_anon:0
active_file:55 inactive_file:32 isolated_file:0
unevictable:0 dirty:1 writeback:0 unstable:0
slab_reclaimable:45216 slab_unreclaimable:15804
mapped:4188 shmem:43147 pagetables:36095 bounce:0
free:81434 free_pcp:1715 free_cma:0
[Sat Aug 19 21:11:13 2023] Node 0 DMA free:15908kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Sat Aug 19 21:11:13 2023] lowmem_reserve[]: 0 2827 63445 63445
[Sat Aug 19 21:11:13 2023] Node 0 DMA32 free:245408kB min:3008kB low:3760kB high:4512kB active_anon:2521432kB inactive_anon:3272kB active_file:4kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129252kB managed:2895748kB mlocked:0kB dirty:0kB writeback:0kB mapped:192kB shmem:6144kB slab_reclaimable:105176kB slab_unreclaimable:8416kB kernel_stack:992kB pagetables:5320kB unstable:0kB bounce:0kB free_pcp:1372kB local_pcp:12kB free_cma:0kB writeback_tmp:0kB pages_scanned:637 all_unreclaimable? yes
[Sat Aug 19 21:11:13 2023] lowmem_reserve[]: 0 0 60617 60617
[Sat Aug 19 21:11:13 2023] Node 0 Normal free:64420kB min:64552kB low:80688kB high:96828kB active_anon:60957520kB inactive_anon:88224kB active_file:216kB inactive_file:244kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:63131648kB managed:62075632kB mlocked:0kB dirty:4kB writeback:0kB mapped:16560kB shmem:166444kB slab_reclaimable:75688kB slab_unreclaimable:54800kB kernel_stack:14560kB pagetables:139060kB unstable:0kB bounce:0kB free_pcp:5488kB local_pcp:560kB free_cma:0kB writeback_tmp:0kB pages_scanned:1472 all_unreclaimable? yes
[Sat Aug 19 21:11:13 2023] lowmem_reserve[]: 0 0 0 0
[Sat Aug 19 21:11:13 2023] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
[Sat Aug 19 21:11:13 2023] Node 0 DMA32: 732*4kB (UEM) 800*8kB (UE) 2873*16kB (UEM) 2057*32kB (UEM) 956*64kB (UEM) 327*128kB (UE) 67*256kB (UEM) 8*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 245408kB
[Sat Aug 19 21:11:13 2023] Node 0 Normal: 1231*4kB (UEM) 593*8kB (UEM) 252*16kB (UEM) 947*32kB (UEM) 217*64kB (UEM) 31*128kB (UEM) 10*256kB (EM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 64420kB
[Sat Aug 19 21:11:13 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Sat Aug 19 21:11:13 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Sat Aug 19 21:11:13 2023] 43452 total pagecache pages
[Sat Aug 19 21:11:13 2023] 0 pages in swap cache
[Sat Aug 19 21:11:13 2023] Swap cache stats: add 0, delete 0, find 0/0
[Sat Aug 19 21:11:13 2023] Free swap = 0kB
[Sat Aug 19 21:11:13 2023] Total swap = 0kB
[Sat Aug 19 21:11:13 2023] 16569223 pages RAM
[Sat Aug 19 21:11:13 2023] 0 pages HighMem/MovableOnly
[Sat Aug 19 21:11:13 2023] 322401 pages reserved
[Sat Aug 19 21:11:13 2023] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[Sat Aug 19 21:11:13 2023] [ 512] 0 512 16003 5294 39 0 0 systemd-journal
[Sat Aug 19 21:11:13 2023] [ 548] 0 548 11350 138 23 0 -1000 systemd-udevd
[Sat Aug 19 21:11:13 2023] [ 612] 0 612 13883 113 28 0 -1000 auditd
[Sat Aug 19 21:11:13 2023] [ 676] 0 676 5410 65 17 0 0 irqbalance
[Sat Aug 19 21:11:13 2023] [ 678] 81 678 14522 137 32 0 -900 dbus-daemon
[Sat Aug 19 21:11:13 2023] [ 682] 32 682 17314 135 37 0 0 rpcbind
[Sat Aug 19 21:11:13 2023] [ 690] 999 690 153085 2427 62 0 0 polkitd
[Sat Aug 19 21:11:13 2023] [ 692] 0 692 6629 98 18 0 0 systemd-logind
[Sat Aug 19 21:11:13 2023] [ 705] 998 705 29452 114 28 0 0 chronyd
[Sat Aug 19 21:11:13 2023] [ 721] 0 721 48801 117 36 0 0 gssproxy
[Sat Aug 19 21:11:13 2023] [ 1004] 0 1004 25747 518 48 0 0 dhclient
[Sat Aug 19 21:11:13 2023] [ 1061] 0 1061 143570 3335 100 0 0 tuned
[Sat Aug 19 21:11:13 2023] [ 1189] 0 1189 22447 259 42 0 0 master
[Sat Aug 19 21:11:13 2023] [ 1203] 89 1203 22517 265 44 0 0 qmgr
[Sat Aug 19 21:11:13 2023] [ 1267] 0 1267 90145 3164 87 0 0 rsyslogd
[Sat Aug 19 21:11:13 2023] [ 1282] 0 1282 27551 32 13 0 0 agetty
[Sat Aug 19 21:11:13 2023] [ 1283] 0 1283 31596 160 21 0 0 crond
[Sat Aug 19 21:11:13 2023] [ 1286] 0 1286 27551 32 10 0 0 agetty
[Sat Aug 19 21:11:13 2023] [ 1399] 0 1399 28246 259 57 0 -1000 sshd
[Sat Aug 19 21:11:13 2023] [ 7738] 997 7738 180237 3541 36 0 0 node_exporter
[Sat Aug 19 21:11:13 2023] [18027] 1002 18027 4814538 2773131 5880 0 0 java
[Sat Aug 19 21:11:13 2023] [ 686] 0 686 39211 335 81 0 0 sshd
[Sat Aug 19 21:11:13 2023] [ 688] 1000 688 39211 340 78 0 0 sshd
[Sat Aug 19 21:11:13 2023] [ 689] 1000 689 29155 383 16 0 0 bash
[Sat Aug 19 21:11:14 2023] [ 4269] 89 4269 22473 251 44 0 0 pickup
[Sat Aug 19 21:11:14 2023] [ 4887] 0 4887 39211 337 80 0 0 sshd
[Sat Aug 19 21:11:14 2023] [ 4889] 1000 4889 39211 341 76 0 0 sshd
[Sat Aug 19 21:11:14 2023] [ 4890] 1000 4890 29122 341 14 0 0 bash
[Sat Aug 19 21:11:14 2023] [ 4918] 0 4918 60352 292 72 0 0 sudo
[Sat Aug 19 21:11:14 2023] [ 4919] 0 4919 47969 144 52 0 0 su
[Sat Aug 19 21:11:14 2023] [ 4920] 0 4920 29160 370 15 0 0 bash
[Sat Aug 19 21:11:14 2023] [ 8779] 1002 8779 17015215 13057167 28845 0 0 starrocks_be
[Sat Aug 19 21:28:28 2023] Out of memory: Kill process 16831 (starrocks_be) score 805 or sacrifice child
[Sat Aug 19 21:28:28 2023] Killed process 16831 (starrocks_be), UID 1002, total-vm:76278992kB, anon-rss:52227956kB, file-rss:0kB, shmem-rss:0kB
[Sat Aug 19 21:28:28 2023] systemd-journal invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[Sat Aug 19 21:28:28 2023] systemd-journal cpuset=/ mems_allowed=0
[Sat Aug 19 21:28:28 2023] CPU: 15 PID: 512 Comm: systemd-journal Kdump: loaded Not tainted 3.10.0-1160.71.1.el7.x86_64 #1
[Sat Aug 19 21:28:28 2023] Hardware name: Amazon EC2 m5.4xlarge/, BIOS 1.0 10/16/2017
[Sat Aug 19 21:28:28 2023] Call Trace:
[Sat Aug 19 21:28:28 2023] [] dump_stack+0x19/0x1b
[Sat Aug 19 21:28:28 2023] [] dump_header+0x90/0x229
[Sat Aug 19 21:28:28 2023] [] ? ktime_get_ts64+0x52/0xf0
[Sat Aug 19 21:28:28 2023] [] ? delayacct_end+0x8f/0xb0
[Sat Aug 19 21:28:28 2023] [] oom_kill_process+0x2cd/0x490
[Sat Aug 19 21:28:28 2023] [] ? oom_unkillable_task+0xcd/0x120
[Sat Aug 19 21:28:28 2023] [] out_of_memory+0x31a/0x500
[Sat Aug 19 21:28:28 2023] [] __alloc_pages_nodemask+0xad4/0xbe0
[Sat Aug 19 21:28:28 2023] [] alloc_pages_current+0x98/0x110
[Sat Aug 19 21:28:28 2023] [] __page_cache_alloc+0x97/0xb0
[Sat Aug 19 21:28:28 2023] [] filemap_fault+0x270/0x420
[Sat Aug 19 21:28:28 2023] [] __xfs_filemap_fault+0x7e/0x1d0 [xfs]
[Sat Aug 19 21:28:28 2023] [] xfs_filemap_fault+0x2c/0x30 [xfs]
[Sat Aug 19 21:28:28 2023] [] __do_fault.isra.61+0x8a/0x100
[Sat Aug 19 21:28:28 2023] [] do_read_fault.isra.63+0x4c/0x1b0
[Sat Aug 19 21:28:28 2023] [] handle_mm_fault+0xa20/0xfb0
[Sat Aug 19 21:28:28 2023] [] __do_page_fault+0x213/0x500
[Sat Aug 19 21:28:28 2023] [] trace_do_page_fault+0x56/0x150
[Sat Aug 19 21:28:28 2023] [] do_async_page_fault+0x22/0xf0
[Sat Aug 19 21:28:28 2023] [] async_page_fault+0x28/0x30
[Sat Aug 19 21:28:28 2023] Mem-Info:
[Sat Aug 19 21:28:28 2023] active_anon:15478939 inactive_anon:22872 isolated_anon:0
active_file:0 inactive_file:280 isolated_file:0
unevictable:0 dirty:2 writeback:0 unstable:0
slab_reclaimable:43219 slab_unreclaimable:15384
mapped:3982 shmem:43139 pagetables:39388 bounce:0
free:473222 free_pcp:1411 free_cma:0
[Sat Aug 19 21:28:28 2023] Node 0 DMA free:15908kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Sat Aug 19 21:28:28 2023] lowmem_reserve[]: 0 2827 63445 63445
[Sat Aug 19 21:28:28 2023] Node 0 DMA32 free:353624kB min:3008kB low:3760kB high:4512kB active_anon:2417668kB inactive_anon:3268kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3129252kB managed:2895748kB mlocked:0kB dirty:0kB writeback:0kB mapped:208kB shmem:6140kB slab_reclaimable:99204kB slab_unreclaimable:7380kB kernel_stack:608kB pagetables:4524kB unstable:0kB bounce:0kB free_pcp:2548kB local_pcp:16kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[Sat Aug 19 21:28:28 2023] lowmem_reserve[]: 0 0 60617 60617
[Sat Aug 19 21:28:28 2023] Node 0 Normal free:3560644kB min:64552kB low:80688kB high:96828kB active_anon:57461204kB inactive_anon:88220kB active_file:4kB inactive_file:1616kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:63131648kB managed:62075632kB mlocked:0kB dirty:8kB writeback:0kB mapped:16928kB shmem:166416kB slab_reclaimable:73672kB slab_unreclaimable:54156kB kernel_stack:14416kB pagetables:153028kB unstable:0kB bounce:0kB free_pcp:2796kB local_pcp:116kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[Sat Aug 19 21:28:28 2023] lowmem_reserve[]: 0 0 0 0
[Sat Aug 19 21:28:28 2023] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
[Sat Aug 19 21:28:28 2023] Node 0 DMA32: 14212*4kB (UEM) 5775*8kB (UEM) 4969*16kB (UEM) 2710*32kB (UEM) 1074*64kB (UEM) 345*128kB (UEM) 54*256kB (UEM) 7*512kB (U) 0*1024kB 0*2048kB 0*4096kB = 399576kB
[Sat Aug 19 21:28:28 2023] Node 0 Normal: 359932*4kB (UEM) 140684*8kB (UEM) 61720*16kB (UEM) 17835*32kB (UEM) 2156*64kB (UEM) 108*128kB (UEM) 73*256kB (UEM) 0*512kB 0*1024kB 1*2048kB (M) 0*4096kB = 4295984kB
[Sat Aug 19 21:28:28 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[Sat Aug 19 21:28:28 2023] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[Sat Aug 19 21:28:28 2023] 43549 total pagecache pages
[Sat Aug 19 21:28:28 2023] 0 pages in swap cache
[Sat Aug 19 21:28:28 2023] Swap cache stats: add 0, delete 0, find 0/0
[Sat Aug 19 21:28:28 2023] Free swap = 0kB
[Sat Aug 19 21:28:28 2023] Total swap = 0kB
[Sat Aug 19 21:28:29 2023] 16569223 pages RAM
[Sat Aug 19 21:28:29 2023] 0 pages HighMem/MovableOnly
[Sat Aug 19 21:28:29 2023] 322401 pages reserved
[Sat Aug 19 21:28:29 2023] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[Sat Aug 19 21:28:29 2023] [ 512] 0 512 16003 5389 39 0 0 systemd-journal
[Sat Aug 19 21:28:29 2023] [ 548] 0 548 11350 138 23 0 -1000 systemd-udevd
[Sat Aug 19 21:28:29 2023] [ 612] 0 612 13883 113 28 0 -1000 auditd
[Sat Aug 19 21:28:29 2023] [ 676] 0 676 5410 65 17 0 0 irqbalance
[Sat Aug 19 21:28:29 2023] [ 678] 81 678 14522 137 32 0 -900 dbus-daemon
[Sat Aug 19 21:28:29 2023] [ 682] 32 682 17314 135 37 0 0 rpcbind
[Sat Aug 19 21:28:29 2023] [ 690] 999 690 153085 2427 62 0 0 polkitd
[Sat Aug 19 21:28:29 2023] [ 692] 0 692 6629 98 18 0 0 systemd-logind
[Sat Aug 19 21:28:29 2023] [ 705] 998 705 29452 114 28 0 0 chronyd
[Sat Aug 19 21:28:29 2023] [ 721] 0 721 48801 117 36 0 0 gssproxy
[Sat Aug 19 21:28:29 2023] [ 1004] 0 1004 25747 518 48 0 0 dhclient
[Sat Aug 19 21:28:29 2023] [ 1061] 0 1061 143570 3403 100 0 0 tuned
[Sat Aug 19 21:28:29 2023] [ 1189] 0 1189 22447 259 42 0 0 master
[Sat Aug 19 21:28:29 2023] [ 1203] 89 1203 22517 265 44 0 0 qmgr
[Sat Aug 19 21:28:29 2023] [ 1267] 0 1267 90145 3216 87 0 0 rsyslogd
[Sat Aug 19 21:28:29 2023] [ 1282] 0 1282 27551 32 13 0 0 agetty
[Sat Aug 19 21:28:29 2023] [ 1283] 0 1283 31596 160 21 0 0 crond
[Sat Aug 19 21:28:29 2023] [ 1286] 0 1286 27551 32 10 0 0 agetty
[Sat Aug 19 21:28:29 2023] [ 1399] 0 1399 28246 259 57 0 -1000 sshd
[Sat Aug 19 21:28:29 2023] [ 7738] 997 7738 180237 4329 36 0 0 node_exporter
[Sat Aug 19 21:28:29 2023] [18027] 1002 18027 4814538 2773430 5880 0 0 java
[Sat Aug 19 21:28:29 2023] [ 4269] 89 4269 22473 251 44 0 0 pickup
[Sat Aug 19 21:28:29 2023] [ 4887] 0 4887 39211 337 80 0 0 sshd
[Sat Aug 19 21:28:29 2023] [ 4889] 1000 4889 39211 342 76 0 0 sshd
[Sat Aug 19 21:28:29 2023] [ 4890] 1000 4890 29122 341 14 0 0 bash
[Sat Aug 19 21:28:29 2023] [ 4918] 0 4918 60352 292 72 0 0 sudo
[Sat Aug 19 21:28:29 2023] [ 4919] 0 4919 47969 144 52 0 0 su
[Sat Aug 19 21:28:29 2023] [ 4920] 0 4920 29160 374 15 0 0 bash
[Sat Aug 19 21:28:29 2023] Out of memory: Kill process 17221 (pip_scan_io) score 805 or sacrifice child

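You could optimize how the two large tables are joined. If possible, use Colocate Join, which can greatly reduce the memory consumed by joins between large tables (see "Colocate Join" in the StarRocks Docs). To limit the impact of large queries, you can upgrade the cluster to the latest 2.5 release, enable resource isolation, and configure circuit breaking for large queries (see "Resource group" in the StarRocks Docs).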

Won't parallel_fragment_exec_instance_num = 1 hurt overall query performance?

On versions after 2.3, just set it to 1; the degree of parallelism is then controlled by pipeline_dop.

What is process 18027? It looks like it is using about 10 GB of memory.

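This is caused by co-locating other services with the BE: that Java process is consuming too much memory (about 18 GB of virtual memory; its resident set in the dmesg process table is about 10.6 GiB).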
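
For reference, the total_vm and rss columns in the dmesg process table are page counts, not bytes. Below is a minimal sketch for converting a row to GiB, assuming 4 KiB pages (the x86_64 default; the page size and the helper name parse_oom_row are my assumptions, not something the log states).

```python
import re

PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages, the usual x86_64 default

# Columns: [pid] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
ROW = re.compile(
    r"\[\s*(\d+)\]\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(-?\d+)\s+(\S+)\s*$"
)


def parse_oom_row(line):
    """Parse one process-table row from the OOM report; return None if it does not match."""
    m = ROW.search(line)
    if m is None:
        return None
    total_vm_pages = int(m.group(4))
    rss_pages = int(m.group(5))
    return {
        "pid": int(m.group(1)),
        "name": m.group(9),
        "total_vm_gib": total_vm_pages * PAGE_SIZE_KIB / 1024 ** 2,
        "rss_gib": rss_pages * PAGE_SIZE_KIB / 1024 ** 2,
    }


java = parse_oom_row(
    "[Sat Aug 19 20:57:55 2023] [18027] 1002 18027 4814538 2772997 5880 0 0 java"
)
be = parse_oom_row(
    "[Sat Aug 19 21:11:14 2023] [ 8779] 1002 8779 17015215 13057167 28845 0 0 starrocks_be"
)
for proc in (java, be):
    print(f"{proc['name']}: rss={proc['rss_gib']:.1f} GiB, total_vm={proc['total_vm_gib']:.1f} GiB")
```

With the two rows above, java comes out at roughly 10.6 GiB resident and 18.4 GiB virtual, and starrocks_be at roughly 49.8 GiB resident. Together they account for nearly all of the 64 GB on the node.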

In be.conf, set mem_limit = (total machine memory - memory used by other services - 1 GB reserved for the system).
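
As a worked example of that formula, here is a minimal sketch using the 64 GB machine from this thread. The 11 GB budget for the co-located Java process is an assumption (its resident size in the dmesg table rounded up), and suggest_mem_limit_gb is a hypothetical helper, not a StarRocks tool.

```python
def suggest_mem_limit_gb(total_gb, other_services_gb, system_reserve_gb=1.0):
    """mem_limit = total memory - memory of other services - system reserve (all in GB)."""
    limit = total_gb - other_services_gb - system_reserve_gb
    if limit <= 0:
        raise ValueError("other services leave no memory for the BE")
    return limit


# 64 GB machine, ~11 GB assumed for the Java process, 1 GB reserved for the system.
limit_gb = suggest_mem_limit_gb(total_gb=64, other_services_gb=11)
print(f"suggested mem_limit: about {limit_gb:.0f} GB")
```

Under those assumptions the result is 52 GB. If the Java process can grow beyond its current footprint (for example up to its configured heap limit), budget for that larger figure instead, and check the BE configuration docs for the value syntax that mem_limit accepts in your version.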