【BE】 SR 2.4 BE OOM

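【Details】 Detailed problem description
During normal use the BE runs out of memory (OOM) and is killed by the operating system.
【Background】 What operations were performed?
【Business impact】 None
【StarRocks version】 2.4.0
【Cluster size】 3 FE (1 follower + 2 observers) + 8 BE (BEs deployed on dedicated machines)
【Machine info】 vCPUs / memory / NIC: 32C / 128G / 10 GbE
【Contact】 So that we can reach you promptly for log information while resolving the issue, please add your contact details: community group 12 - 金谡 jinsu@moojing.com
【Attachments】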

  • fe.log / be.INFO / relevant screenshots
  • Slow query:
    • Profile information

    • Parallelism: show variables like '%parallel_fragment_exec_instance_num%';
      mysql> show variables like '%parallel_fragment_exec_instance_num%';
      +-------------------------------------+-------+
      | Variable_name                       | Value |
      +-------------------------------------+-------+
      | parallel_fragment_exec_instance_num | 1     |
      +-------------------------------------+-------+
      1 row in set (0.00 sec)

    • Whether pipeline is enabled: show variables like '%pipeline%';
      mysql> show variables like '%pipeline%';
      +---------------------------------+-------+
      | Variable_name                   | Value |
      +---------------------------------+-------+
      | enable_pipeline_engine          | true  |
      | enable_pipeline_query_statistic | true  |
      | pipeline_dop                    | 0     |
      | pipeline_profile_level          | 1     |
      +---------------------------------+-------+
      4 rows in set (0.00 sec)

    • Screenshots of BE node CPU and memory usage

  • Query error:
  • BE crash
    • be.out
      I0201 11:14:49.069170 131822 tablet_updates.cpp:512] commit rowset tablet:10065 version:8994743 txn_id: 9591701 02000000013e5ce61940ab4a88c3f43eb9bfe2a88bdb1ca8 rowset:6575426 #seg:0 #delfile:0 #row:0 size:0 #pending:0
      I0201 11:14:49.069183 131822 publish_version.cpp:109] Publish txn success tablet:10065 version:8994743 partition:10041 txn_id: 9591701 rowset:02000000013e5ce61940ab4a88c3f43eb9bfe2a88bdb1ca8
      I0201 11:14:49.069217 131822 tablet_updates.cpp:512] commit rowset tablet:10071 version:8994743 txn_id: 9591701 02000000013e5ce71940ab4a88c3f43eb9bfe2a88bdb1ca8 rowset:6580475 #seg:0 #delfile:0 #row:0 size:0 #pending:0
      I0201 11:14:49.069300 131822 publish_version.cpp:109] Publish txn success tablet:10071 version:8994743 partition:10041 txn_id: 9591701 rowset:02000000013e5ce71940ab4a88c3f43eb9bfe2a88bdb1ca8
      I0201 11:14:49.069445 1632628 tablet_updates.cpp:1028] apply_rowset_commit finish. tablet:10050 version:8994743 txn_id: 9591701 total del/row:192/7723 2% rowset:6580475 #seg:0 #op(upsert:0 del:0) #del:0+0=0 #dv:0 duration:0ms(0/0/0/0/0)
      I0201 11:14:49.069520 1632628 tablet_updates.cpp:1028] apply_rowset_commit finish. tablet:10065 version:8994743 txn_id: 9591701 total del/row:238/7749 3% rowset:6575426 #seg:0 #op(upsert:0 del:0) #del:0+0=0 #dv:0 duration:0ms(0/0/0/0/0)
      I0201 11:14:4I0201 12:29:56.818656 1636744 daemon.cpp:280] version 2.4.0 RELEASE (build c0fa2bb)

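1. Is the BE co-located with other services? If it is, set mem_limit=xxx in be.conf, calculated as: total memory - memory used by other services - 2 GB (reserved for the system).
2. Confirm that the following parameters are all set:

Disable swap: echo 0 | sudo tee /proc/sys/vm/swappiness
Enable overcommit: echo 1 | sudo tee /proc/sys/vm/overcommit_memory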
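
The mem_limit rule in step 1 is plain arithmetic and can be sketched as below. This is only an illustration: the function name `suggest_mem_limit_gb` and the 20 GB figure for other services are hypothetical, and only the formula itself (total memory minus other services minus 2 GB reserved for the system) comes from the reply above.

```python
def suggest_mem_limit_gb(total_gb: float, other_services_gb: float,
                         system_reserve_gb: float = 2.0) -> float:
    """Apply the rule: total memory - other services - system reserve."""
    limit = total_gb - other_services_gb - system_reserve_gb
    if limit <= 0:
        raise ValueError("no memory left for the BE after reservations")
    return limit


# Hypothetical co-located host: 128 GB total, 20 GB used by other services.
colocated = suggest_mem_limit_gb(128, 20)
# Host with nothing else running: only the system reserve is subtracted.
dedicated = suggest_mem_limit_gb(128, 0)
print(f"co-located: {colocated:g} GB, dedicated: {dedicated:g} GB")
```

The result is a starting point for the value written to be.conf, not a tuned setting.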

The BEs are deployed on dedicated machines and are not co-located with other services.

Swap is not enabled on the machine:
MiB Mem : 128634.1 total, 864.9 free, 57331.3 used, 70437.8 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 76487.8 avail Mem

root@starrocks-be-11:/data/starrocks/log# cat /proc/sys/vm/overcommit_memory
0
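
The value above (0) differs from the 1 recommended earlier in this thread. A minimal sketch of that comparison, assuming the observed values are collected by hand; `RECOMMENDED` and `find_mismatches` are hypothetical names, and the recommended values are taken only from the reply in this thread.

```python
# Recommended values as stated in the reply in this thread.
RECOMMENDED = {"swappiness": 0, "overcommit_memory": 1}


def find_mismatches(observed: dict) -> dict:
    """Return {name: (observed, recommended)} for settings that differ."""
    return {
        name: (observed[name], want)
        for name, want in RECOMMENDED.items()
        if name in observed and observed[name] != want
    }


# Value reported in this thread: overcommit_memory is 0.
observed = {"overcommit_memory": 0}
for name, (got, want) in find_mismatches(observed).items():
    print(f"vm.{name} = {got}, recommended {want}")
```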

Is mem_limit configured in be.conf?

query_mem_limit=0
load_mem_limit=0
exec_mem_limit=4294967296

data root path, separate by ';'

you can specify the storage medium of each root path, HDD or SSD, separate by ','

eg:

storage_root_path = /data1,medium:HDD;/data2,medium:SSD;/data3

/data1, HDD;

/data2, SSD;

/data3, HDD(default);

Default value is ${STARROCKS_HOME}/storage, you should create it by hand.

storage_root_path = ${STARROCKS_HOME}/storage

storage_root_path = /data/starrocks/storage
/data/starrocks/storage, SSD(default);

Advanced configurations

sys_log_dir = /data/starrocks/log

sys_log_roll_mode = SIZE-MB-1024

sys_log_roll_num = 10

sys_log_verbose_modules = *

log_buffer_level = -1

default_rowset_type = beta
max_unpacked_row_block_size = 10737418240

When integrate with STAROS with local disk cache enabled, this is the absolute

root dir for cache to write, multiple paths can be separated by colon (:).

e.g. starlet_cache_dir = "/data/disk1/cache/:data/disk2/cache/"

#starlet_cache_dir = ""

JVM options for be

eg:

JAVA_OPTS=-Djava.security.krb5.conf=/etc/krb5.conf

For jdk 9+, this JAVA_OPTS will be used as default JVM options

JAVA_OPTS_FOR_JDK_9=-Djava.security.krb5.conf=/etc/krb5.conf

base_compaction_check_interval_seconds = 10
cumulative_compaction_num_threads_per_disk = 4
base_compaction_num_threads_per_disk = 2
cumulative_compaction_check_interval_seconds = 2
priority_networks = 10.19.117.190

dynamic params

streaming_load_max_mb = 512000

mem_limit is not configured in be.conf.
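
Checking whether a key is present in be.conf can also be scripted. A minimal sketch, assuming simple `key = value` lines with `#` comments; `parse_conf` is a hypothetical helper, and the sample text is a short excerpt of the settings quoted above rather than the full file.

```python
def parse_conf(text: str) -> dict:
    """Parse 'key = value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Excerpt of the be.conf settings posted in this thread.
sample = """\
query_mem_limit=0
load_mem_limit=0
exec_mem_limit=4294967296
storage_root_path = /data/starrocks/storage
"""

conf = parse_conf(sample)
print("mem_limit set:", "mem_limit" in conf)
print("exec_mem_limit in GiB:", int(conf["exec_mem_limit"]) / 2**30)
```

Keys are matched exactly, so `query_mem_limit` and `exec_mem_limit` do not count as `mem_limit`.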