total size of single column exceed the limit of hash join

【详述】由doris1.15升级到StarRocks 2.0.3,原先可以正常运行的SQL的,在新版本报错: total size of single column exceed the limit of hash join 。 是改了什么限制吗?
【背景】升级版本
【业务影响】SQL运行报错
【StarRocks版本】2.0.3
【集群规模】例如:3fe(1 follower+2observer)+5be(fe与be混部)
【机器信息】CPU虚拟核/内存/网卡,例如:48C/64G/万兆
【附件】

运行的SQL;
set exec_mem_limit = 68719476736;
select product_id,big_label,sum(volume) volume from
((select product_id,
volume
from table1
where date >= ‘20220310’
and date <= ‘20220325’)
union all
(select product_id,
volume
from table2
where date >= ‘20220310’
and date <= ‘20220325’))
sale_product left join (select pid,big_label from table3) v5 on v5.pid = product_id where big_label = ‘baihou’
group by product_id,big_label
order by volume desc
limit 0,500

右表大小:167.698 GB

咱们并行度设置的是多少?
执行下

show variables like "%parallel_fragment_exec_instance_num %";

show variables like “%parallel_fragment_exec_instance_num %”;

并行度为 1

设置为cpu核数的一半重新执行下试试看。

目前情况怎么样了呢?

请问下,怎么修改这个参数? 需要重启FE/BE吗?

https://docs.starrocks.com/zh-cn/main/administration/Query_management#查询相关的-session-变量
参照这篇文章

查询我设置了
set exec_mem_limit = 68719476736;
set parallel_fragment_exec_instance_num = 16;

结果报错: Memory of process exceed limit. Used: 103601595008, Limit: 107643243478. Mem usage has exceed the limit of BE

增大exec_mem_limit 再看下?

辛苦执行下explain + sql 和 explain cost + sql 看下。

增大 set exec_mem_limit = 107643243478;

还是报同样的错:Memory of process exceed limit. Used: 99249110328, Limit: 107643243478. Mem usage has exceed the limit of BE

explain的输出为:
±----------------------------------------------------------------------------------------------+
| Explain String |
±----------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:8: product_id | 19: big_label | 38: sum |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 15:MERGING-EXCHANGE |
| limit: 50 |
| use vectorized: true |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: 8: product_id, 19: big_label |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 15 |
| UNPARTITIONED |
| |
| 14:TOP-N |
| | order by: <slot 38> 38: sum DESC |
| | offset: 0 |
| | limit: 50 |
| | use vectorized: true |
| | |
| 13:AGGREGATE (merge finalize) |
| | output: sum(38: sum) |
| | group by: 8: product_id, 19: big_label |
| | use vectorized: true |
| | |
| 12:EXCHANGE |
| use vectorized: true |
| |
| PLAN FRAGMENT 2 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 12 |
| HASH_PARTITIONED: 8: product_id, 19: big_label |
| |
| 11:AGGREGATE (update serialize) |
| | STREAMING |
| | output: sum(9: volume) |
| | group by: 8: product_id, 19: big_label |
| | use vectorized: true |
| | |
| 10:Project |
| | <slot 8> : 8: product_id |
| | <slot 9> : 9: volume |
| | <slot 19> : 19: big_label |
| | use vectorized: true |
| | |
| 9:HASH JOIN |
| | join op: INNER JOIN (BROADCAST) |
| | hash predicates: |
| | colocate: false, reason: |
| | equal join conjunct: 17: pid = 8: product_id |
| | use vectorized: true |
| | |
| |----8:EXCHANGE |
| | use vectorized: true |
| | |
| 0:OlapScanNode |
| TABLE: table3 |
| PREAGGREGATION: OFF. Reason: Has can not pre-aggregation Join |
| PREDICATES: 19: big_label = ‘日用百货’ |
| partitions=1/1 |
| rollup: table3 |
| tabletRatio=24/24 |
| tabletList=63332,63336,63340,63344,63348,63352,63356,63360,63364,63368 … |
| cardinality=848630875 |
| avgRowSize=19.500404 |
| numNodes=0 |
| use vectorized: true |
| |
| PLAN FRAGMENT 3 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 08 |
| UNPARTITIONED |
| |
| 1:UNION |
| | use vectorized: true |
| | |
| |----7:EXCHANGE |
| | use vectorized: true |
| | |
| 4:EXCHANGE |
| use vectorized: true |
| |
| PLAN FRAGMENT 4 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 07 |
| RANDOM |
| |
| 6:Project |
| | <slot 13> : 13: product_id |
| | <slot 14> : 14: volume |
| | use vectorized: true |
| | |
| 5:OlapScanNode |
| TABLE: table2 |
| PREAGGREGATION: OFF. Reason: Group columns isn’t bound table2 |
| PREDICATES: 10: date >= ‘2022-03-10’, 10: date <= ‘2022-03-25’ |
| partitions=1/1 |
| rollup: table2 |
| tabletRatio=24/24 |
| tabletList=64551,64555,64559,64563,64567,64571,64575,64579,64583,64587 … |
| cardinality=0 |
| avgRowSize=19.99904 |
| numNodes=0 |
| use vectorized: true |
| |
| PLAN FRAGMENT 5 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 04 |
| RANDOM |
| |
| 3:Project |
| | <slot 4> : 4: product_id |
| | <slot 5> : 5: volume |
| | use vectorized: true |
| | |
| 2:OlapScanNode |
| TABLE: table1 |
| PREAGGREGATION: OFF. Reason: Group columns isn’t bound table table1 |
| PREDICATES: 1: date >= ‘2022-03-10’, 1: date <= ‘2022-03-25’ |
| partitions=1/1 |
| rollup: table1 |
| tabletRatio=24/24 |
| tabletList=64452,64456,64460,64464,64468,64472,64476,64480,64484,64488 … |
| cardinality=0 |
| avgRowSize=22.49028 |
| numNodes=0 |
| use vectorized: true |
±----------------------------------------------------------------------------------------------+
143 rows in set (0.01 sec)

explain costs 的输出
±---------------------------------------------------------------------------------------------------------------------------+
| Explain String |
±---------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0(F06) |
| Output Exprs:8: product_id | 19: big_label | 38: sum |
| Input Partition: UNPARTITIONED |
| RESULT SINK |
| |
| 15:MERGING-EXCHANGE |
| limit: 50 |
| cardinality: 0 |
| column statistics: |
| * product_id–>[-Infinity, Infinity, 0.0, 0.0, 81963.0] ESTIMATE |
| * big_label–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| * sum–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| |
| PLAN FRAGMENT 1(F05) |
| |
| Input Partition: HASH_PARTITIONED: 8: product_id, 19: big_label |
| OutPut Partition: UNPARTITIONED |
| OutPut Exchange Id: 15 |
| |
| 14:TOP-N |
| | order by: [38, BIGINT, true] DESC |
| | offset: 0 |
| | limit: 50 |
| | cardinality: 0 |
| | column statistics: |
| | * product_id–>[-Infinity, Infinity, 0.0, 0.0, 81963.0] ESTIMATE |
| | * big_label–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | * sum–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | |
| 13:AGGREGATE (merge finalize) |
| | aggregate: sum[([38: sum, BIGINT, true]); args: BIGINT; result: BIGINT; args nullable: true; result nullable: true] |
| | group by: [8: product_id, VARCHAR(64), true], [19: big_label, VARCHAR, false] |
| | cardinality: 0 |
| | column statistics: |
| | * product_id–>[-Infinity, Infinity, 0.0, 0.0, 81963.0] ESTIMATE |
| | * big_label–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | * sum–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | |
| 12:EXCHANGE |
| cardinality: 0 |
| |
| PLAN FRAGMENT 2(F00) |
| |
| Input Partition: RANDOM |
| OutPut Partition: HASH_PARTITIONED: 8: product_id, 19: big_label |
| OutPut Exchange Id: 12 |
| |
| 11:AGGREGATE (update serialize) |
| | STREAMING |
| | aggregate: sum[([9: volume, BIGINT, true]); args: BIGINT; result: BIGINT; args nullable: true; result nullable: true] |
| | group by: [8: product_id, VARCHAR(64), true], [19: big_label, VARCHAR, false] |
| | cardinality: 0 |
| | column statistics: |
| | * product_id–>[-Infinity, Infinity, 0.0, 0.0, 81963.0] ESTIMATE |
| | * big_label–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | * sum–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | |
| 10:Project |
| | output columns: |
| | 8 <-> [8: product_id, VARCHAR(64), true] |
| | 9 <-> [9: volume, BIGINT, true] |
| | 19 <-> [19: big_label, VARCHAR, false] |
| | cardinality: 0 |
| | column statistics: |
| | * product_id–>[-Infinity, Infinity, 0.0, 0.0, 81963.0] ESTIMATE |
| | * volume–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | * big_label–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | |
| 9:HASH JOIN |
| | join op: INNER JOIN (BROADCAST) |
| | equal join conjunct: [17: pid, VARCHAR, false] = [8: product_id, VARCHAR(64), true] |
| | build runtime filters: |
| | - filter_id = 0, build_expr = (8: product_id), remote = false |
| | cardinality: 0 |
| | column statistics: |
| | * product_id–>[-Infinity, Infinity, 0.0, 0.0, 81963.0] ESTIMATE |
| | * volume–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | * pid–>[-Infinity, Infinity, 0.0, 18.500405, 7.314E8] ESTIMATE |
| | * big_label–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | |
| |----8:EXCHANGE |
| | cardinality: 0 |
| | |
| 0:OlapScanNode |
| table: table1, rollup: table1 |
| preAggregation: off. Reason: Has can not pre-aggregation Join |
| Predicates: [19: big_label, VARCHAR, false] = ‘日用百货’ |
| partitionsRatio=1/1, tabletsRatio=24/24 |
| tabletList=63332,63336,63340,63344,63348,63352,63356,63360,63364,63368 … |
| actualRows=618761576, avgRowSize=19.500404 |
| cardinality: 827764925 |
| probe runtime filters: |
| - filter_id = 0, probe_expr = (17: pid) |
| column statistics: |
| * pid–>[-Infinity, Infinity, 0.0, 18.500405, 7.314E8] ESTIMATE |
| * big_label–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| |
| PLAN FRAGMENT 3(F01) |
| |
| Input Partition: RANDOM |
| OutPut Partition: UNPARTITIONED |
| OutPut Exchange Id: 08 |
| |
| 1:UNION |
| | child exprs: |
| | [4, VARCHAR, true] | [5, BIGINT, true] |
| | [13, VARCHAR, true] | [14, BIGINT, true] |
| | pass-through-operands: all |
| | cardinality: 0 |
| | column statistics: |
| | * product_id–>[-Infinity, Infinity, 0.0, 0.0, 81963.0] ESTIMATE |
| | * volume–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | |
| |----7:EXCHANGE |
| | cardinality: 0 |
| | |
| 4:EXCHANGE |
| cardinality: 0 |
| |
| PLAN FRAGMENT 4(F03) |
| |
| Input Partition: RANDOM |
| OutPut Partition: RANDOM |
| OutPut Exchange Id: 07 |
| |
| 6:Project |
| | output columns: |
| | 13 <-> [13: product_id, VARCHAR, true] |
| | 14 <-> [14: volume, BIGINT, true] |
| | cardinality: 0 |
| | column statistics: |
| | * product_id–>[-Infinity, Infinity, 0.0, 14.99904, 81963.0] ESTIMATE |
| | * volume–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | |
| 5:OlapScanNode |
| table: table2, rollup: table2 |
| preAggregation: off. Reason: Group columns isn’t bound table table2 |
| Predicates: [10: date, DATE, false] >= ‘2022-03-10’, [10: date, DATE, false] <= ‘2022-03-25’ |
| partitionsRatio=1/1, tabletsRatio=24/24 |
| tabletList=64551,64555,64559,64563,64567,64571,64575,64579,64583,64587 … |
| actualRows=2071854878, avgRowSize=19.99904 |
| cardinality: 0 |
| column statistics: |
| * date–>[-Infinity, 1.6481376E9, 0.0, 4.0, 2.0] ESTIMATE |
| * product_id–>[-Infinity, Infinity, 0.0, 14.99904, 81963.0] ESTIMATE |
| * volume–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| |
| PLAN FRAGMENT 5(F02) |
| |
| Input Partition: RANDOM |
| OutPut Partition: RANDOM |
| OutPut Exchange Id: 04 |
| |
| 3:Project |
| | output columns: |
| | 4 <-> [4: product_id, VARCHAR, true] |
| | 5 <-> [5: volume, BIGINT, true] |
| | cardinality: 0 |
| | column statistics: |
| | * product_id–>[-Infinity, Infinity, 0.0, 17.49028, 59875.0] ESTIMATE |
| | * volume–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| | |
| 2:OlapScanNode |
| table: table1, rollup: table1 |
| preAggregation: off. Reason: Group columns isn’t bound table table1 |
| Predicates: [1: date, DATE, false] >= ‘2022-03-10’, [1: date, DATE, false] <= ‘2022-03-25’ |
| partitionsRatio=1/1, tabletsRatio=24/24 |
| tabletList=64452,64456,64460,64464,64468,64472,64476,64480,64484,64488 … |
| actualRows=2776807009, avgRowSize=22.49028 |
| cardinality: 0 |
| column statistics: |
| * date–>[-Infinity, 1.6481376E9, 0.0, 4.0, 4.0] ESTIMATE |
| * product_id–>[-Infinity, Infinity, 0.0, 17.49028, 59875.0] ESTIMATE |
| * volume–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
±---------------------------------------------------------------------------------------------------------------------------+
174 rows in set (0.00 sec)

可以发一下这几张表的建表语句吗?

CREATE TABLE table1 (

) ENGINE=OLAP
UNIQUE KEY(date, room_id, author_id, product_id)
COMMENT “OLAP”
DISTRIBUTED BY HASH(author_id) BUCKETS 24
PROPERTIES (
“replication_num” = “3”,
“in_memory” = “false”,
“storage_format” = “DEFAULT”
);

CREATE TABLE table2 (

) ENGINE=OLAP
UNIQUE KEY(date, aweme_id, author_id, product_id)
COMMENT “OLAP”
DISTRIBUTED BY HASH(author_id) BUCKETS 24
PROPERTIES (
“replication_num” = “3”,
“in_memory” = “false”,
“storage_format” = “DEFAULT”
);

CREATE TABLE table3 (

) ENGINE=OLAP
UNIQUE KEY(pid)
COMMENT “OLAP”
DISTRIBUTED BY HASH(pid) BUCKETS 24
PROPERTIES (
“replication_num” = “3”,
“in_memory” = “false”,
“storage_format” = “DEFAULT”
);

咱们建表语句有些不合理,导致了查询时扫描了很多数据,建议按照天或者月份来进行分区,将table1 和 table2重新设计一下。

后面设置了并行度没有报hash join的错,而是报超出内存限制的错,感觉exec_mem_limit 这个参数没起作用?

旧版本运行是正常的。2.0.3反而报错,新版本是有作hash join的限制吗?

是不是之后表内导入数据了呢?

这张表一直是有导入数据的。

我们是关联的维度表列超过4G的问题,那种维度表没有天的概念。