Out of memory && Mem usage has exceed the limit of BE

【StarRocks版本】例如:2.2.2
查询多的时候be就挂掉然后报错:

一个简单的插入报错: Mem usage has exceed the limit of BE

be机器内存192个g,除了系统占用,其他内存全部给be用,还是容易oom,请问这种情况下该怎么优化?

您好,麻烦这样操作一下截图发下,看看内存统计使用情况:http://be_ip:be_web_port/mem_tracker

我已经重启了be,现在查看是不是分析不出当时的情况了:


今日be还是多次挂掉:
Out of memory: Kill process 11835 (starrocks_be) score 575 or sacrifice child
而且oom重启没多久又挂,集群资源利用率不高,压力不大,从日志里没看出什么异常,这个是不是be oom的bug啊。

在be.warn日志里发现:
W0825 17:21:27.211031 18520 data_dir.cpp:544] Found invalid rowset=02000000000bbffff24a7dffffbcdc37191819c64dc5d28a tablet id=72594 tablet uid=1440ad30b8a170db-e7f269cf354039b2 schema hash=1962580018 txn_id: 1009238 current valid tablet uid=3646b660a2deced5-735de1ec50007885
W0825 17:21:27.211084 18520 data_dir.cpp:544] Found invalid rowset=02000000000bbcdbf24a7dffffbcdc37191819c64dc5d28a tablet id=68242 tablet uid=6241a5057ce8ea31-e33e18772b70bdbd schema hash=1962580018 txn_id: 1009129 current valid tablet uid=6248fc8874c56195-dc2884d2df0a80a9
W0825 17:21:27.211094 18520 data_dir.cpp:544] Found invalid rowset=02000000000bbea7f24a7dffffbcdc37191819c64dc5d28a tablet id=68242 tablet uid=6241a5057ce8ea31-e33e18772b70bdbd schema hash=1962580018 txn_id: 1009238 current valid tablet uid=6248fc8874c56195-dc2884d2df0a80a9
W0825 17:21:27.211134 18520 data_dir.cpp:544] Found invalid rowset=0200000000073d28f24a7dffffbcdc37191819c64dc5d28a tablet id=12335848 tablet uid=8647f022ea7a8d4b-e5dca32a292d8985 schema hash=1955557706 txn_id: 647891 current valid tablet uid=d04759e346243bc7-a2dd0f26c91537aa
W0825 17:21:27.211165 18520 data_dir.cpp:544] Found invalid rowset=02000000000738abf24a7dffffbcdc37191819c64dc5d28a tablet id=12335719 tablet uid=9a4dac2a8f5c8367-27c240e36a4e35b6 schema hash=1955557706 txn_id: 647635 current valid tablet uid=6346d3b85c31f1d3-2fa84ac2da0970ac
W0825 17:21:27.549248 18929 utils.cpp:90] Fail to get master client from cache. host= port=0 code=THRIFT_RPC_ERROR
W0825 17:21:27.549314 18929 task_worker_pool.cpp:1355] Fail to report task to :0, err=-1
如上日志

请问官方的同学可以看看Mem usage has exceed the limit of BE这个问题吗?


集群内存明明还有很多。

这个需要看具体的内存都消耗在哪里了?上面那个查看内存使用的方法,您在出现这个问题的时候搞一下,有复杂大查询嘛?并发高不高?be和fe是混部嘛?

你好,查询不是很复杂,就是三张表关联,一张20几万,另外两张2,3亿左右,并发不高,be和fe不是混布,这个sr后续有什么优化措施吗?不能内存不够了就杀了,内存不够可以落盘的,保证查询的可用性。

发下执行计划explain costs看看

你好,我们周末升了级,发现可以跑了,但是数据量大的话还是会报: Mem usage has exceed the limit of BE。explain 如下:
PLAN FRAGMENT 0
OUTPUT EXPRS:109: expr | 111: priority
PARTITION: HASH_PARTITIONED: 109: expr

OLAP TABLE SINK
TUPLE ID: 8
RANDOM

11:EXCHANGE

PLAN FRAGMENT 1
OUTPUT EXPRS:
PARTITION: RANDOM

STREAM DATA SINK
EXCHANGE ID: 11
HASH_PARTITIONED: 109: expr

10:Project
| <slot 109> : split(62: post_id, ‘:’)[2]
| <slot 111> : CAST(29 AS INT)
|
9:HASH JOIN
| join op: INNER JOIN (BROADCAST)
| hash predicates:
| colocate: false, reason:
| equal join conjunct: 1: user_id = 63: user_id
|
|----8:EXCHANGE
|
1:Project
| <slot 1> : 1: user_id
|
0:OlapScanNode
TABLE: socialmedia_user
PREAGGREGATION: OFF. Reason: None aggregate function
PREDICATES: 2: platform_id = 4101, 29: fans_count > 100000
partitions=1/7
rollup: socialmedia_user
tabletRatio=128/128
tabletList=211574,211578,211582,211586,211590,211594,211598,211602,211606,211610 …
cardinality=95513627
avgRowSize=24.775545
numNodes=0

PLAN FRAGMENT 2
OUTPUT EXPRS:
PARTITION: RANDOM

STREAM DATA SINK
EXCHANGE ID: 08
UNPARTITIONED

7:Project
| <slot 62> : 62: post_id
| <slot 63> : 63: user_id
|
6:HASH JOIN
| join op: INNER JOIN (COLOCATE)
| hash predicates:
| colocate: true
| equal join conjunct: 58: id = 73: id
|
|----5:Project
| | <slot 73> : 73: id
| |
| 4:OlapScanNode
| TABLE: socialmedia_post_secondary
| PREAGGREGATION: OFF. Reason: Has can not pre-aggregation Join
| PREDICATES: 72: platform_id = 4101, 78: reply_count > 0
| partitions=26/51
| rollup: socialmedia_post_secondary
| tabletRatio=832/832
| tabletList=25319147,25319151,25319155,25319159,25319163,25319167,25319171,25319175,25319179,25319183 …
| cardinality=1
| avgRowSize=44.0
| numNodes=0
|
3:Project
| <slot 58> : 58: id
| <slot 62> : 62: post_id
| <slot 63> : 63: user_id
|
2:OlapScanNode
TABLE: socialmedia_post_main
PREAGGREGATION: OFF. Reason: Has can not pre-aggregation Join
PREDICATES: 57: platform_id = 4101, 69: is_host = 1
partitions=26/51
rollup: socialmedia_post_main
tabletRatio=832/832
tabletList=25287240,25287244,25287248,25287252,25287256,25287260,25287264,25287268,25287272,25287276 …
cardinality=1
avgRowSize=72.99072
numNodes=0

parallel_fragment_exec_instance_num为20,
exec_mem_limit为10G,
现版本为2.3.8。

这个表有多少列?

socialmedia_post_secondary表:38列
socialmedia_post_main表:15列
socialmedia_user表:55列

当前升级到哪个版本了

升级到2.3.8了

parallel_fragment_exec_instance_num 这个调成4, 能跑出来吗

调成4,报查询内存不够了:
SQL 错误 [1064] [42000]: Memory of Fragment cf8d94f2-b0f2-11ed-b2b6-20040fe4ec70 exceed limit. try consume:5198233600 Used: 6025635664, Limit: 10737418240. Mem usage has exceed the limit of single query, You can change the limit by set session variable exec_mem_limit or query_mem_limit.

可以按照提示调大下query_mem_limit看看

如果是被系统因为oom给kill的话,你be的使用内存设置的多少阿

oom的那个问题已经解决了,现在没有了。

不行的,调大了就会报错exceed the limit of BE

把in的数量放小点,比如只放两个,还会报错吗,如果不报错了,可能是个策略问题,会消耗大量内存。