BE node keeps crashing

StarRocks 2.5.20
Single-node deployment

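I applied the setting recommended on the official site, but it did not take effect. The official note reads:
"This issue may be triggered when querying an external table with many columns. It can be resolved by adding the parameter buffer_stream_reserve_size=8192 to be.conf and restarting the BE."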

be.out:

start time: 2024年 04月 26日 星期五 10:39:09 CST
start time: 2024年 04月 26日 星期五 14:33:51 CST
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: Thu May 9 15:36:43 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: 2024年 06月 14日 星期五 18:40:54 CST
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: 2024年 06月 14日 星期五 18:42:43 CST
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: 2024年 06月 14日 星期五 18:51:23 CST
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: 2024年 06月 14日 星期五 19:30:23 CST
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: 2024年 06月 14日 星期五 19:46:17 CST
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: 2024年 06月 14日 星期五 19:48:36 CST
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: 2024年 06月 14日 星期五 19:51:53 CST
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: 2024年 06月 14日 星期五 20:02:12 CST
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: Fri Aug 9 11:29:06 CST 2024
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
[warn] Error from accept() call: Invalid argument
start time: Fri Aug 9 11:33:01 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: Fri Aug 9 17:35:34 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: Mon Aug 12 09:52:27 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: Mon Aug 12 10:46:47 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: Mon Aug 12 11:12:30 CST 2024
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: Mon Aug 12 18:28:53 CST 2024
Ignored unknown config: buffer_stream_reserve_size
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/jni-packages/starrocks-jdbc-bridge-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data1/software/starRocks-2.5.20/be/lib/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
start time: Tue Aug 13 11:18:31 CST 2024
On StarRocks 2.5.20 the BE reports: Ignored unknown config: buffer_stream_reserve_size
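The "Ignored unknown config" line suggests that this BE build does not recognize the parameter, which would explain why the setting has no effect. As a rough sketch (the function name `summarize_be_out` is made up for illustration, and the sample text is copied from the log above), a short Python script can count BE restarts and list every config key the BE ignored:

```python
import re


def summarize_be_out(text):
    """Return (number of BE starts, sorted list of config keys the BE ignored)."""
    starts = 0
    ignored = set()
    for line in text.splitlines():
        # Each BE launch writes one "start time:" line to be.out.
        if line.startswith("start time:"):
            starts += 1
        # The BE prints this line for keys in be.conf that it does not know.
        match = re.search(r"Ignored unknown config:\s*(\S+)", line)
        if match:
            ignored.add(match.group(1))
    return starts, sorted(ignored)


# Sample lines copied from the be.out excerpt in this thread.
sample = """start time: Mon Aug 12 18:28:53 CST 2024
Ignored unknown config: buffer_stream_reserve_size
start time: Tue Aug 13 11:18:31 CST 2024
"""

print(summarize_be_out(sample))
```

In practice the text would come from reading the real be.out file instead of the inline sample.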

Your BE was most likely killed by the system OOM killer. You can confirm this with dmesg -T | grep -i oom.

[Thu Aug 15 10:01:02 2024] AliYunDun invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[Thu Aug 15 10:01:02 2024] [] ? virtballoon_oom_notify+0x2a/0x80 [virtio_balloon]
[Thu Aug 15 10:01:02 2024] [] oom_kill_process+0x2d5/0x4a0
[Thu Aug 15 10:01:02 2024] [] ? oom_unkillable_task+0xcd/0x120
[Thu Aug 15 10:01:02 2024] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[Thu Aug 15 10:01:02 2024] argusagent invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[Thu Aug 15 10:01:02 2024] [] ? virtballoon_oom_notify+0x2a/0x80 [virtio_balloon]
[Thu Aug 15 10:01:02 2024] [] oom_kill_process+0x2d5/0x4a0
[Thu Aug 15 10:01:02 2024] [] ? oom_unkillable_task+0xcd/0x120
[Thu Aug 15 10:01:02 2024] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
[Thu Aug 15 10:01:03 2024] systemd-journal invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
[Thu Aug 15 10:01:03 2024] [] ? virtballoon_oom_notify+0x2a/0x80 [virtio_balloon]
[Thu Aug 15 10:01:03 2024] [] oom_kill_process+0x2d5/0x4a0
[Thu Aug 15 10:01:03 2024] [] ? oom_unkillable_task+0xcd/0x120
[Thu Aug 15 10:01:03 2024] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
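Note that the process named before "invoked oom-killer" is the one whose allocation triggered the OOM killer, not necessarily the one that was killed. The kernel reports the victim on a separate "Killed process ..." line, which does not contain the string "oom" and is therefore filtered out by grep -i oom. A minimal sketch for separating the two, assuming the usual kernel message wording (the function name `parse_oom` is hypothetical):

```python
import re

# e.g. "[Thu Aug 15 10:01:02 2024] AliYunDun invoked oom-killer: ..."
INVOKED = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<proc>\S+) invoked oom-killer")
# Assumed wording of the kernel's victim report: "Killed process <pid> (<name>) ..."
KILLED = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)")


def parse_oom(lines):
    """Split dmesg lines into OOM invokers and killed processes."""
    invokers, victims = [], []
    for line in lines:
        invoked = INVOKED.search(line)
        if invoked:
            invokers.append((invoked.group("ts"), invoked.group("proc")))
        killed = KILLED.search(line)
        if killed:
            victims.append((int(killed.group("pid")), killed.group("name")))
    return invokers, victims


# Lines copied from the dmesg output posted in this thread.
sample = [
    "[Thu Aug 15 10:01:02 2024] AliYunDun invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0",
    "[Thu Aug 15 10:01:02 2024] argusagent invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0",
    "[Thu Aug 15 10:01:03 2024] systemd-journal invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0",
]

invokers, victims = parse_oom(sample)
print([proc for _, proc in invokers])
print(victims)
```

On the posted excerpt the victim list comes back empty, so the unfiltered dmesg -T output is needed to see which process was actually killed.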

I already changed several parameters to deal with OOM, and I have capped the BE's memory. Why does this still happen?

Please post your be.conf.

mem_limit = 65%
buffer_stream_reserve_size=8192

# under the License.

load_process_max_memory_limit_bytes = 19737418240
block_cache_mem_size = 2147483648

# INFO, WARNING, ERROR, FATAL

sys_log_level = INFO
disable_storage_page_cache = FALSE

# ports for admin, web, heartbeat service

be_port = 9060
webserver_port = 8040
heartbeat_service_port = 9050
brpc_port = 8060
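To make the byte values in this be.conf easier to read, here is a small Python sketch that converts them to GiB and estimates what mem_limit allows. It assumes that a percentage mem_limit is taken relative to the host's physical memory. The 32 GiB host size is purely hypothetical, since the thread does not state it, and the helper names are made up for illustration.

```python
GIB = 1024 ** 3


def parse_conf(text):
    """Parse key = value lines, skipping blanks and # comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        conf[key.strip()] = value.strip()
    return conf


def be_limit_bytes(total_bytes, mem_limit):
    """Bytes allowed by a percentage mem_limit such as '65%' (only this form is handled)."""
    percent = int(mem_limit.rstrip("%"))
    return total_bytes * percent // 100


# Values copied from the be.conf posted in this thread.
conf = parse_conf("""mem_limit = 65%
load_process_max_memory_limit_bytes = 19737418240
block_cache_mem_size = 2147483648
""")

for key in ("load_process_max_memory_limit_bytes", "block_cache_mem_size"):
    print(f"{key}: {int(conf[key]) / GIB:.2f} GiB")

# Hypothetical host with 32 GiB of RAM; replace with the real figure.
host_total = 32 * GIB
print(f"BE limit: {be_limit_bytes(host_total, conf['mem_limit']) / GIB:.1f} GiB")
```

On a single-node deployment, whatever memory is left outside the BE limit has to cover the FE, the operating system, and any monitoring agents, so the BE limit alone does not rule out a system-level OOM.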

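I have now narrowed it down to a single job that loads from a Hive external table: insert into XX select * from hive.<external table>. The Hive external table has more than 600 columns and about 60 MB of data per day. This one job is what makes the system report OOM, after which the BE node goes down.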

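The scheduler is DolphinScheduler. When I run the statement myself from the command line, it sometimes succeeds and sometimes the BE crashes outright. Running it from Python code crashes the BE in the same way.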