2.2.1 Cross JOIN 、unnest、bitmap_to_array 使用

2.0.7版本时sql可以正常执行,但升级到2.2.1后报错了


需要改成这样才能执行

求大佬指导一下

1赞

@Natsume729

您好,麻烦您发一下建表sql、查询sql、调整后可以正常执行的sql

A表:
CREATE TABLE entity_44_pack_summary (
pack_id int(11) NOT NULL COMMENT “群组标识”,
update_time datetime NOT NULL COMMENT “群组生成时间”,
seq_id_bitmap bitmap BITMAP_UNION NOT NULL COMMENT “SeqID集合”
) ENGINE=OLAP
AGGREGATE KEY(pack_id, update_time)
COMMENT “人群汇总表”
PARTITION BY RANGE(update_time)
(PARTITION p20220530 VALUES [(‘0000-01-01 00:00:00’), (‘2022-05-31 00:00:00’)),
PARTITION p20220531 VALUES [(‘2022-05-31 00:00:00’), (‘2022-06-01 00:00:00’)),
PARTITION p20220601 VALUES [(‘2022-06-01 00:00:00’), (‘2022-06-02 00:00:00’)),
PARTITION p20220602 VALUES [(‘2022-06-02 00:00:00’), (‘2022-06-03 00:00:00’)),
PARTITION p20220603 VALUES [(‘2022-06-03 00:00:00’), (‘2022-06-04 00:00:00’)),
PARTITION p20220604 VALUES [(‘2022-06-04 00:00:00’), (‘2022-06-05 00:00:00’)),
PARTITION p20220605 VALUES [(‘2022-06-05 00:00:00’), (‘2022-06-06 00:00:00’)),
PARTITION p20220606 VALUES [(‘2022-06-06 00:00:00’), (‘2022-06-07 00:00:00’)),
PARTITION p20220607 VALUES [(‘2022-06-07 00:00:00’), (‘2022-06-08 00:00:00’)),
PARTITION p20220608 VALUES [(‘2022-06-08 00:00:00’), (‘2022-06-09 00:00:00’)),
PARTITION p20220609 VALUES [(‘2022-06-09 00:00:00’), (‘2022-06-10 00:00:00’)),
PARTITION p20220610 VALUES [(‘2022-06-10 00:00:00’), (‘2022-06-11 00:00:00’)),
PARTITION p20220611 VALUES [(‘2022-06-11 00:00:00’), (‘2022-06-12 00:00:00’)),
PARTITION p20220612 VALUES [(‘2022-06-12 00:00:00’), (‘2022-06-13 00:00:00’)),
PARTITION p20220613 VALUES [(‘2022-06-13 00:00:00’), (‘2022-06-14 00:00:00’)),
PARTITION p20220614 VALUES [(‘2022-06-14 00:00:00’), (‘2022-06-15 00:00:00’)),
PARTITION p20220615 VALUES [(‘2022-06-15 00:00:00’), (‘2022-06-16 00:00:00’)),
PARTITION p20220616 VALUES [(‘2022-06-16 00:00:00’), (‘2022-06-17 00:00:00’)),
PARTITION p20220617 VALUES [(‘2022-06-17 00:00:00’), (‘2022-06-18 00:00:00’)),
PARTITION p20220618 VALUES [(‘2022-06-18 00:00:00’), (‘2022-06-19 00:00:00’)),
PARTITION p20220619 VALUES [(‘2022-06-19 00:00:00’), (‘2022-06-20 00:00:00’)),
PARTITION p20220620 VALUES [(‘2022-06-20 00:00:00’), (‘2022-06-21 00:00:00’)),
PARTITION p20220621 VALUES [(‘2022-06-21 00:00:00’), (‘2022-06-22 00:00:00’)),
PARTITION p20220622 VALUES [(‘2022-06-22 00:00:00’), (‘2022-06-23 00:00:00’)),
PARTITION p20220623 VALUES [(‘2022-06-23 00:00:00’), (‘2022-06-24 00:00:00’)))
DISTRIBUTED BY HASH(pack_id, update_time) BUCKETS 3
PROPERTIES (
“replication_num” = “3”,
“dynamic_partition.enable” = “true”,
“dynamic_partition.time_unit” = “DAY”,
“dynamic_partition.time_zone” = “Asia/Shanghai”,
“dynamic_partition.start” = “-2147483648”,
“dynamic_partition.end” = “2”,
“dynamic_partition.prefix” = “p”,
“dynamic_partition.buckets” = “3”,
“in_memory” = “false”,
“storage_format” = “DEFAULT”
);

B表:CREATE TABLE entity_44_seq_id_mapping (
entity_id varchar(32) NOT NULL COMMENT “实体ID”,
seq_id bigint(20) NOT NULL COMMENT “序列ID”
) ENGINE=OLAP
PRIMARY KEY(entity_id)
COMMENT “序列ID映射表”
DISTRIBUTED BY HASH(entity_id) BUCKETS 3
PROPERTIES (
“replication_num” = “3”,
“colocate_with” = “co_group_entity_44_id”,
“in_memory” = “false”,
“storage_format” = “DEFAULT”
);

原sql:

SELECT
sm.entity_id current_id
FROM
entity_44_pack_summary as ps
CROSS JOIN unnest (bitmap_to_array(ps.seq_id_bitmap)) AS us
INNER JOIN entity_44_seq_id_mapping AS sm ON us.unnest = sm.seq_id

调整后sql:
select sm.entity_id current_id from
(select unnest from entity_44_pack_summary as ps
CROSS JOIN unnest (bitmap_to_array(ps.seq_id_bitmap)) AS us) as t
INNER JOIN entity_44_seq_id_mapping AS sm ON t.unnest = sm.seq_id

您好,这个是有改动的,之前的是个bug,2.2版本改了。修正前后只是解析顺序不会影响查询结果,多个表join,会先解析带on条件的。

那像上述查询需求,有推荐的join查询方式吗?还是只能用我这个调整后的sql这样去查?