【详述】执行insert into table select * XXXSQL, 查询内存超限
【背景】
【业务影响】影响严重
【StarRocks版本】例如:2.3.11
【集群规模】例如:6be(fe与be混部)
【机器信息】
【联系方式】13692280056
1. 报错信息
Exception in thread "main" java.sql.SQLSyntaxErrorException: Memory of Fragment fe620076-2b69-11ee-8516-b4055ddf14dd exceed limit. try consume:268435456
Used: 8373466672, Limit: 8589934538. Mem usage has exceed the limit of single query, You can change the limit by set session variable exec_mem_limit or query_mem_limit.
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:120)
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.StatementImpl.executeInternal(StatementImpl.java:764)
at com.mysql.cj.jdbc.StatementImpl.execute(StatementImpl.java:648)
at com.sf.service.Impl.StrategyServiceImpl.doAction(StrategyServiceImpl.java:24)
at com.sf.Application.executeSwitch(Application.java:157)
at com.sf.Application.main(Application.java:116)
2. 执行的SQL
(
-
只select的时候不报错, 加上insert into bfms_prod_line_sp_collect_result 就报内存不够
-
第二天执行同样的SQL, 却不报内存不够了
-
以表的数据量来看, 默认8G的内存应该够用 )
– 结果 932595, 90万
insert into bfms_prod_line_sp_collect_result
select
‘1386’ as instance_id,
t0.city_code as object_code,
left_table.predict_datetime,
left_table.product_type,
5 as object_type_code ,
left_table.weight_level,
left_table.task_type,
left_table.predict_classtime,
left_table.model_version,
left_table.data_version,
left_table.feature_version ,
t0.city_name as object_name,
‘城市’ as object_type,
sum(left_table.predict_quantity) as predict_quantity ,
sum(left_table.predict_waybill) as predict_waybill,
sum(left_table.predict_weight) as predict_weight,
sum(left_table.predict_volume) as predict_volume,
‘’ as real_quantity ,
‘’ as real_waybill ,
‘’ as real_weight ,
‘’ as real_volume,
‘’ as dianbu_code ,
‘’ as division_code,
‘’ as city_code ,
‘’ as area_code,
‘’ as fbq_code ,
‘’ as hq_code ,
now() as record_time,
replace(predict_datetime,’-’,’’) as partition_key,
inc_hours,
batch_code,
batch_name,
predict_hour,
object_properties,
is_centralized
from (-- 左表43547190, 4千三百万 select * from bfms_prod_line_sp_concat_result where instance_id = '1386' and object_type_code='7' and predict_datetime between '2023-07-28' and '2023-11-23' )left_table join ( -- 右表 60483 , 6万 select object_code, parent_object_code as city_code, parent_object_name as city_name from bfms_zone_parent_df where object_type='网点' and parent_object_type='城市' and inc_date=DATE_FORMAT(date_sub(now(),interval 1 day),'%Y-%m-%d') )t0 on left_table.object_code=t0.object_code group by left_table.predict_datetime, left_table.product_type, left_table.weight_level, left_table.task_type, left_table.predict_classtime, left_table.model_version, left_table.data_version, left_table.feature_version , t0.city_name, t0.city_code, inc_hours, batch_code, batch_name, predict_hour, object_properties, is_centralized;
【附件】
- fe.log/beINFO/相应截图
- 慢查询:
- Profile信息
- 并行度:show variables like ‘%parallel_fragment_exec_instance_num%’;
- pipeline是否开启:show variables like ‘%pipeline%’;
- be节点cpu和内存使用率截图
- 查询报错:
- query_dump,怎么获取query_dump文件
- be crash
- be.out