Memory of Fragment fe620076-2b69-11ee-8516-b4055ddf14dd exceed limit. try consume:268435456

codebetter · 2023年07月27日 02:20

【详述】执行insert into table select * XXXSQL, 查询内存超限
【背景】
【业务影响】影响严重
【StarRocks版本】例如：2.3.11
【集群规模】例如：6be（fe与be混部）
【机器信息】
【联系方式】13692280056

1. 报错信息

Exception in thread "main" java.sql.SQLSyntaxErrorException: Memory of Fragment fe620076-2b69-11ee-8516-b4055ddf14dd exceed limit. try consume:268435456 
Used: 8373466672, Limit: 8589934538. Mem usage has exceed the limit of single query, You can change the limit by set session variable exec_mem_limit or query_mem_limit.
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:120)
	at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
	at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
	at com.mysql.cj.jdbc.StatementImpl.executeInternal(StatementImpl.java:764)
	at com.mysql.cj.jdbc.StatementImpl.execute(StatementImpl.java:648)
	at com.sf.service.Impl.StrategyServiceImpl.doAction(StrategyServiceImpl.java:24)
	at com.sf.Application.executeSwitch(Application.java:157)
	at com.sf.Application.main(Application.java:116)

2. 执行的SQL

(

只select的时候不报错, 加上insert into bfms_prod_line_sp_collect_result 就报内存不够
第二天执行同样的SQL, 却不报内存不够了

以表的数据量来看, 默认8G的内存应该够用 )

– 结果 932595, 90万
insert into bfms_prod_line_sp_collect_result
select
‘1386’ as instance_id,
t0.city_code as object_code,
left_table.predict_datetime,
left_table.product_type,
5 as object_type_code ,
left_table.weight_level,
left_table.task_type,
left_table.predict_classtime,
left_table.model_version,
left_table.data_version,
left_table.feature_version ,
t0.city_name as object_name,
‘城市’ as object_type,
sum(left_table.predict_quantity) as predict_quantity ,
sum(left_table.predict_waybill) as predict_waybill,
sum(left_table.predict_weight) as predict_weight,
sum(left_table.predict_volume) as predict_volume,
‘’ as real_quantity ,
‘’ as real_waybill ,
‘’ as real_weight ,
‘’ as real_volume,
‘’ as dianbu_code ,
‘’ as division_code,
‘’ as city_code ,
‘’ as area_code,
‘’ as fbq_code ,
‘’ as hq_code ,
now() as record_time,
replace(predict_datetime,’-’,’’) as partition_key,
inc_hours,
batch_code,
batch_name,
predict_hour,
object_properties,
is_centralized
from (

-- 左表43547190, 4千三百万
         select
             *
         from  bfms_prod_line_sp_concat_result
         where instance_id = '1386' and object_type_code='7'
           and predict_datetime between '2023-07-28' and '2023-11-23'
     )left_table
         join
     (

-- 右表 60483  , 6万
         select
             object_code,
             parent_object_code   as city_code,
             parent_object_name as city_name
         from
             bfms_zone_parent_df
         where  object_type='网点'
           and parent_object_type='城市'
           and inc_date=DATE_FORMAT(date_sub(now(),interval 1 day),'%Y-%m-%d')
     )t0
     on left_table.object_code=t0.object_code
group by
    left_table.predict_datetime,
    left_table.product_type,
    left_table.weight_level,
    left_table.task_type,
    left_table.predict_classtime,
    left_table.model_version,
    left_table.data_version,
    left_table.feature_version ,
    t0.city_name,
    t0.city_code,
    inc_hours,
    batch_code,
    batch_name,
    predict_hour,
    object_properties,
    is_centralized;

【附件】

fe.log/beINFO/相应截图
慢查询：
- Profile信息
- 并行度：show variables like ‘%parallel_fragment_exec_instance_num%’;
- pipeline是否开启：show variables like ‘%pipeline%’;
- be节点cpu和内存使用率截图
查询报错：
- query_dump，怎么获取query_dump文件
be crash
- be.out

Doni · 2023年07月27日 02:53

两天执行扫的数据量都是一模一样的么

codebetter · 2023年07月27日 02:54

对的, 这个查询内存超限真的头疼

codebetter · 2023年07月27日 02:57

我提示设置了会话内存至20G, 就不报错了
但是, 我很困惑, 困惑点主要有3个
1.为什么这个错误不是必现的?
2. 为什么8G会不够用, 需要设置到20G?
3. 我设置20G后, 后续一定没问题吗?

Doni · 2023年07月27日 02:59

U_1684486284899_7189:

右表 60483  , 6万
         select
             object_code,
             parent_object_code   as city_code,
             parent_object_name as city_name
         from
             bfms_zone_parent_df
         where  object_type='网点'
           and parent_object_type='城市'
           and inc_date=DATE_FORMAT(date_sub(now(),interval 1 day),'%Y-%m-%d')
     )t0

看sql 这块两天的数据量可能会有差异，避免任务出错可以把 exec_mem_limit的内存再调大一点，exec_mem_limit 只是单个查询可用的内存上限，不一定会用到那么多。

codebetter · 2023年07月27日 03:00

这个是申请的内存是吗? 如果我有多个地方都设置了较大的申请内存, 会不会导致集群的内存被耗光呢?

codebetter · 2023年07月27日 03:05

这个查询为什么需要申请这么大的内存, 是否有一些资料可以提供给我, 了解下这个内存的使用过程?

Doni · 2023年07月27日 03:33

exec_mem_limit 单个查询实例最大可申请的内存，但实际不一定会用这么多。
image777×102 9.18 KB

image818×347 61.1 KB

image789×255 45.2 KB
整个集群所有查询可用的最大内存默认是 mem_limit * 90%，如果达到集群查询限制，查询会报内存超限 https://docs.starrocks.io/zh-cn/latest/administration/Query_planning
查询中也分不同的阶段，每个阶段用的内存也不一样，可能数据量的变化引起了内存超限，只是需要多申请一点即可完成查询，但是内存限制住了。设置为20G是否没问题看后续任务数据量以及集群其他查询情况，如果其他查询把内存用了，该查询在执行时可能由于达到整个集群内存限制而被终止。

许秀不许秀 · 2023年07月27日 06:31

3.0 之后支持spill，可以在内存不足的时候把中间结果落盘，可以测一下看看

codebetter · 2023年07月27日 07:33

我这用V2.3.11

codebetter · 2023年08月7日 03:27

你好，请问下怎么计算集群可支持的查询内存上限呢?
假设我有6台BE， 256G/台, 如果我每个查询都设置exec_mem_limit=20g, 那最多可支持的并发查询上限是多少个，是不是这样算：256*6/20=70个？

codebetter · 2023年08月21日 02:27

麻烦帮忙看下这个问题呢

Liang · 2023年08月23日 02:13

这个不能这样算的，exec_mem_limit是单个查询的内存上限，单节点256G内存不可能全部给查询用，导入、合并以及元数据等等也需要占用内存的。exec_mem_limit这个参数的话需要根据你们实际场景来做调整，满足你们需求即可，另外QPS上限的话，需要具体压测下确定