date_trunc 分区表,数据插入不进去

为了更快的定位您的问题,请提供以下信息,谢谢
【详述】
重新启动Starrocks集群,之前的分区表就插不进去数据了。建表语句如下 :

CREATE TABLE if not exists dwd.dwd_tacos_subs_event(
product_id string comment ‘’
,msg_id string comment ‘’
,msg_capture_at string comment ‘’
,msg_type_code string comment ‘’
,device_id string comment ‘’
,space_code string comment ‘’
,project_old_id string comment ‘’
,function_id string comment ‘’
,event_id string comment ‘’
,event_output string comment ‘’
,tacos_id string comment ‘’
,etl_at string comment ‘’
,pt_d date comment ‘’
)ENGINE=OLAP
DUPLICATE KEY(product_id,msg_id)
PARTITION BY date_trunc(‘day’, pt_d)
DISTRIBUTED BY HASH(msg_id)
PROPERTIES (
“replication_num” = “1”
)

因为想对分区数据进行overwrite,目前我看只支持这种插入分区名的操作,每次都需要手动创建分区(不知道还有没有别的好办法)。其余的方法也都试了不太行。
– 前置sql :
ALTER TABLE dwd.dwd_tacos_subs_event
DROP PARTITION IF EXISTS p20230721;

ALTER TABLE dwd.dwd_tacos_subs_event ADD PARTITIONS START (‘2023-07-21’) END (‘2023-07-22’) EVERY (interval 1 day);

插入语句如下:
insert overwrite dwd.dwd_tacos_subs_event partition(‘p20230721’)
select
prodId as product_id,
id as msg_id,
from_unixtime(time/1000) as msg_capture_at,
type as msg_type_code,
deviceId as device_id,
spaceCode as space_code,
projectId as project_old_id,
cast(cast(cast(payload as Array)[1] -> “functions” as Array)[1] -> “identifier” as string)as function_id,
cast(cast(cast(cast(payload as Array)[1] -> “functions” as Array)[1] -> “event” as Array)[1] -> “identifier” as string) as event_id,
cast(cast(cast(payload as Array)[1] -> “functions” as Array)[1] -> “event” as Array)[1] -> “output” as event_output,
‘chon-gqin-gaip-ark0’ as tacos_id,
CURRENT_TIMESTAMP() as etl_at,
pt_d
from ods.ods_tacos_kafka_data
where pt_d = “2023-07-21” and type = 8

因停电,关闭集群,在起来集群之后。之前创的date_trunc 分区表都不能正常插入数据:
报错如下:
SQL 错误 [1064] [42000]: Insert has filtered data in strict mode, txn_id = 661032 tracking sql = select tracking_log from information_schema.load_tracking_logs where job_id=74091

执行这个sql之后 :

Error: The row create partition failed since Runtime error: failed to analyse partition value:[‘p_tsl_smart_entrance_guard’]… Row: [‘p_vrv’, ‘1689870227070’, ‘2023-07-21 00:23:47’, ‘8’, ‘915354007603650591’, ‘972508496181768192’, ‘1001’, ‘vrv’, ‘vrv_fault’, ‘{“fault_description”: “0”}’, ‘chon-gqin-gaip-ark0’, ‘2023-07-25 14:16:11’, 2023-07-21]

这些表如果再新创建一次:插入数据全部都正常。

目前发现的异常,就是在新建分区的时候,发现分桶数由原来的4变成1.
很影响啊,不能把表全部重新创建一遍吧。

【业务影响】
【StarRocks版本】例如:3.0.2
【集群规模】例如:1fe + 2be(fe与be混部)
【机器信息】CPU虚拟核/内存/网卡,例如:32C/128G/万兆
【联系方式】446477483@qq.com
【附件】

分区分桶数变更应该是因为历史分区的数据量比较小,建表时没有指定分桶数会用默认的规则来创建分桶,后面分区的分桶数会根据历史数据推算

好的。那数据现在插入不进去怎么办?

这个集群是 3.0.2版本的吧,可以升级到3.0.3+版本,升级后不需要重新建表,修复pr链接:https://github.com/StarRocks/starrocks/pull/25177

升级完可以了。感谢。