GROUP BY on an array<json> column crashes the BE

[Details] When the GROUP BY column is of an array type, the BE crashes.

Table creation statement:
CREATE TABLE test_json_001 (
id int(11) NULL COMMENT "",
aa json NULL COMMENT "",
bb ARRAY<JSON> NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(id)
COMMENT "OLAP"
DISTRIBUTED BY HASH(id) BUCKETS 3
PROPERTIES (
"replication_num" = "3",
"in_memory" = "false",
"storage_format" = "DEFAULT",
"enable_persistent_index" = "false",
"compression" = "LZ4"
);

SQL that crashes the BE:
select bb,count(1) from rx_bd_test.test_json_001 group by bb;

be.out log:
query_id:18719228-bf23-11ed-aa73-1866dafb1278, fragment_instance:18719228-bf23-11ed-aa73-1866dafb127a
*** Aborted at 1678439297 (unix time) try "date -d @1678439297" if you are using GNU date ***
PC: @ 0x6bde34e _ZNK8arangodb10velocypack5Slice14normalizedHashEm.localalias
*** SIGSEGV (@0x0) received by PID 3930978 (TID 0x7fa8e7b46700) from PID 0; stack trace: ***
@ 0x5722822 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7fa980142630 (unknown)
@ 0x6bde34e _ZNK8arangodb10velocypack5Slice14normalizedHashEm.localalias
@ 0x475478d starrocks::JsonValue::hash()
@ 0x4cf7d24 starrocks::vectorized::JsonColumn::fnv_hash()
@ 0x4ce25dc starrocks::vectorized::NullableColumn::fnv_hash()
@ 0x4cbc329 starrocks::vectorized::ArrayColumn::fnv_hash()
@ 0x4ce25dc starrocks::vectorized::NullableColumn::fnv_hash()
@ 0x4dcfb09 starrocks::pipeline::ExchangeSinkOperator::push_chunk()
@ 0x2c40466 starrocks::pipeline::PipelineDriver::process()
@ 0x4dc1bd7 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x47d2acd starrocks::ThreadPool::dispatch_thread()
@ 0x47cd85a starrocks::thread::supervise_thread()
@ 0x7fa98013aea5 start_thread
@ 0x7fa97f755b0d __clone
@ 0x0 (unknown)
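
Reading the trace from the bottom up, the crash occurs while ExchangeSinkOperator::push_chunk() hashes the GROUP BY column, descending through NullableColumn, ArrayColumn, NullableColumn and JsonColumn before reaching JsonValue::hash(). The Python sketch below is only a simplified model of that kind of nested hashing, to make the call chain easier to follow. It is not StarRocks code, every function name in it is a hypothetical stand-in, the FNV-1a variant is an assumption, and it neither reproduces the crash nor claims to identify the root cause.

```python
# Simplified model of nested column hashing (hypothetical names, not StarRocks APIs).
import json

FNV_OFFSET = 0x811C9DC5  # 32-bit FNV offset basis
FNV_PRIME = 0x01000193   # 32-bit FNV prime


def fnv1a(data: bytes, seed: int = FNV_OFFSET) -> int:
    """32-bit FNV-1a over `data`, continuing from `seed`."""
    h = seed
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF
    return h


def hash_json(value, seed: int) -> int:
    # Serialize with sorted keys so equal JSON documents hash equally.
    return fnv1a(json.dumps(value, sort_keys=True).encode("utf-8"), seed)


def hash_nullable(value, seed: int, inner) -> int:
    # A NULL contributes a fixed marker and never touches the payload.
    if value is None:
        return fnv1a(b"\x00", seed)
    return inner(value, seed)


def hash_array(elements, seed: int) -> int:
    # Mix in the length, then fold in each (nullable) JSON element.
    h = fnv1a(len(elements).to_bytes(4, "little"), seed)
    for element in elements:
        h = hash_nullable(element, h, hash_json)
    return h


def pick_destination(row_value, num_destinations: int) -> int:
    """Choose which downstream instance receives a row, by hash of its key."""
    h = hash_nullable(row_value, FNV_OFFSET, hash_array)
    return h % num_destinations


if __name__ == "__main__":
    row = [{"k1": "v1", "k2": "v2"}, {"k3": "v3", "k4": "v4"}]
    print(pick_destination(row, 3))
```

In this model a NULL at either level is handled before the payload is read; the SIGSEGV at address 0x0 inside the JSON hash suggests that some value reached the innermost hash without valid data behind it, but the trace alone does not show why.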

[Business impact]
[StarRocks version] 2.5.2
[Cluster size] 3 FE + 3 BE (FE and BE co-located)

Source data table:
CREATE TABLE rx_bd_test.test_ods_001 (
id int,
aa string
) ENGINE=OLAP
DUPLICATE KEY(id)
COMMENT "OLAP"
DISTRIBUTED BY HASH(id) BUCKETS 3
PROPERTIES (
"replication_num" = "3",
"in_memory" = "false",
"storage_format" = "DEFAULT",
"enable_persistent_index" = "false",
"compression" = "LZ4"
);

Insert data:
insert into rx_bd_test.test_json_001 values(1, '{"k1":"v1","k2":"v2"}', '[{"k1":"v1","k2":"v2"},{"k3":"v3","k4":"v4"}]');

Load into the target table being queried:
insert into rx_bd_test.test_json_001
SELECT
1 as id
, json_query(parse_json(aa), '$.k3') as aa
, json_query(parse_json(aa), '$.k4') as bb
from rx_bd_test.test_ods_001
;

Query SQL:
select bb,count(1) from rx_bd_test.test_json_001 group by bb;

OK, I'll try to reproduce it. One moment.

I've reproduced the issue. We'll work on a fix and post any progress in this thread.

OK, thanks for looking into it.

Is a fix scheduled for this issue? Which release will it ship in?

Hello, this issue is currently planned to be fixed in version 2.5.5.

OK, many thanks.

@U_1647419731669_9072 SR does not yet support GROUP BY on JSON or GROUP BY on ARRAY columns. What exactly is your requirement?