为了更快的定位您的问题,请提供以下信息,谢谢
【详述】对主键表dwt_user_label_member_id_middle_partition_tenant_daily同时进行delete操作和insert into操作,发现存在数据写入失败的场景。
【背景】
-
dwt_user_label_member_id_middle_partition_tenant_daily表结构,以(
tenant_id
,partition_version
,label_code
,identity_id
)为主键,以partition_version为range分区。 -
建表语句如下:
CREATE TABLEdwt_user_label_member_id_middle_partition_tenant_daily
(
tenant_id
bigint(20) NOT NULL COMMENT “租户ID”,
partition_version
int(11) NOT NULL COMMENT “版本分区”,
label_code
varchar(256) NOT NULL COMMENT “LabelCode”,
identity_id
varchar(64) NOT NULL COMMENT “身份ID”,
label_value
varchar(1024) NULL COMMENT “标签值”,
data_update_time
varchar(64) NULL COMMENT “数据更新日期”
) ENGINE=OLAP
PRIMARY KEY(tenant_id
,partition_version
,label_code
,identity_id
)
COMMENT “租户标签分区表”
PARTITION BY RANGE(partition_version
)
(PARTITION p1 VALUES [(“1”), (“2”)),
PARTITION p2 VALUES [(“2”), (“3”)),
PARTITION p3 VALUES [(“3”), (“4”)))
DISTRIBUTED BY HASH(tenant_id
,identity_id
) BUCKETS 200
PROPERTIES (
“replication_num” = “3”,
“colocate_with” = “tenant_identity”,
“in_memory” = “false”,
“storage_format” = “DEFAULT”,
“enable_persistent_index” = “true”,
“replicated_storage” = “true”,
“compression” = “LZ4”,
“storage_medium” = “HDD”
);
SQL执行 -
- 依次执行的sql为:(执行的sql是 insert into && delete,一共执行了三对)。
timestamp, queryTime, stmt
– 1.向2号分区写入G_200000001335817数据,该insert操作写入成功,且2号分区有数据
2024-09-30 18:00:13,4315,“INSERT INTO rms_member.dwt_user_label_member_id_middle_partition_tenant_daily PARTITION(p2) WITH Label G_200000001335817_1840693120472338522_1727690413003 ( tenant_id, partition_version, label_code, identity_id, label_value, data_update_time ) SELECT tenant_id, ‘2’ AS partition_version, ‘G_200000001335817’ AS label_code, identity_id AS identity_id, label AS label_value, ‘2024-09-30 18:00:12’ AS data_update_time FROM ( SELECT identity_id, tenant_id, label FROM ( SELECT U.identityId AS identity_id, U.tenantId AS tenant_id, ‘1’ AS label, 0 AS label_index FROM ( SELECT u.identityId, u.tenantId FROM ( SELECT u.identity AS identityId, u.tenant_id AS tenantId, u.uattr1004, u.uattr32 FROM ( SELECT identity AS identity, u.tenant_id AS tenant_id, item1.wx_app_id AS uattr32, item1.wx_template_10000 AS uattr1004 FROM rms_member.dwt_member_profile_ss_daily AS u LEFT JOIN rms_member.dim_member_item_ss_daily AS item1 ON u.tenant_id = item1.tenant_id AND u.one_id = item1.identity_id AND item1.item_type = ‘wx_app_quota’ AND ( (item1.tenant_id IN (‘11750013’)) AND (item1.id_type = ‘@member_id’) ) WHERE ( (u.tenant_id IN (‘11750013’)) AND (u.id_type = ‘@member_id’) ) ) AS u WHERE 1 = 1 ) AS u WHERE 1 = 1 AND ( (u.uattr32 = ‘wx29850215cb395a68’) AND ((CAST(u.uattr1004 AS decimal(20, 6)) > 0)) ) GROUP BY u.tenantId, u.identityId ) AS U ) AS U ) AS U”
– 2.删除表中之前可能存在的G_200000001335817数据,表中之前并没有G_200000001335817的数据。
2024-09-30 18:00:17,4812,delete from rms_member.dwt_user_label_member_id_middle_partition_tenant_daily where partition_version in (2) and label_code = ‘G_200000001335817’ and tenant_id = 11750013 and data_update_time < ‘2024-09-30 18:00:12’
– 3.向3号分区写入G_200000001335817数据,该insert操作成功,且3号分区有数据。
2024-09-30 18:03:53,3896,“INSERT INTO rms_member.dwt_user_label_member_id_middle_partition_tenant_daily PARTITION(p3) WITH Label G_200000001335817_1840694046666891298_1727690633825 ( tenant_id, partition_version, label_code, identity_id, label_value, data_update_time ) SELECT tenant_id, ‘3’ AS partition_version, ‘G_200000001335817’ AS label_code, identity_id AS identity_id, label AS label_value, ‘2024-09-30 18:03:53’ AS data_update_time FROM ( SELECT identity_id, tenant_id, label FROM ( SELECT U.identityId AS identity_id, U.tenantId AS tenant_id, ‘1’ AS label, 0 AS label_index FROM ( SELECT u.identityId, u.tenantId FROM ( SELECT u.identity AS identityId, u.tenant_id AS tenantId, u.uattr1004, u.uattr32 FROM ( SELECT identity AS identity, u.tenant_id AS tenant_id, item1.wx_app_id AS uattr32, item1.wx_template_10000 AS uattr1004 FROM rms_member.dwt_member_profile_ss_daily AS u LEFT JOIN rms_member.dim_member_item_ss_daily AS item1 ON u.tenant_id = item1.tenant_id AND u.one_id = item1.identity_id AND item1.item_type = ‘wx_app_quota’ AND ( (item1.tenant_id IN (‘11750013’)) AND (item1.id_type = ‘@member_id’) ) WHERE ( (u.tenant_id IN (‘11750013’)) AND (u.id_type = ‘@member_id’) ) ) AS u WHERE 1 = 1 ) AS u WHERE 1 = 1 AND ( (u.uattr32 = ‘wx29850215cb395a68’) AND ((CAST(u.uattr1004 AS decimal(20, 6)) > 0)) ) GROUP BY u.tenantId, u.identityId ) AS U ) AS U ) AS U”
– 4.删除第一步往2号分区写入的数据,delete操作执行了10s。
2024-09-30 18:03:57,10748,“delete from rms_member.dwt_user_label_member_id_middle_partition_tenant_daily where partition_version in (2,3) and label_code = ‘G_200000001335817’ and tenant_id = 11750013 and data_update_time < ‘2024-09-30 18:03:53’”
– 5.在删除2号分区语句执行过程中,往2号分区写入数据,(写入的数据和删除的数据主键一致),且该写入操作和上一步的delete操作结束时间非常接近。这一步数据写入失败,2号分区数据丢失。
– 第4步删除的数据应该是data_update_time < '2024-09-30 18:03:53’的条件,但是当前写入数据并不满足这个条件,却还是没写入成功。
2024-09-30 18:04:02,6066,“INSERT INTO rms_member.dwt_user_label_member_id_middle_partition_tenant_daily PARTITION(p2) WITH Label G_200000001335817_1840694080435241023_1727690641877 ( tenant_id, partition_version, label_code, identity_id, label_value, data_update_time ) SELECT tenant_id, ‘2’ AS partition_version, ‘G_200000001335817’ AS label_code, identity_id AS identity_id, label AS label_value, ‘2024-09-30 18:04:01’ AS data_update_time FROM ( SELECT identity_id, tenant_id, label FROM ( SELECT U.identityId AS identity_id, U.tenantId AS tenant_id, ‘1’ AS label, 0 AS label_index FROM ( SELECT u.identityId, u.tenantId FROM ( SELECT u.identity AS identityId, u.tenant_id AS tenantId, u.uattr1004, u.uattr32 FROM ( SELECT identity AS identity, u.tenant_id AS tenant_id, item1.wx_app_id AS uattr32, item1.wx_template_10000 AS uattr1004 FROM rms_member.dwt_member_profile_ss_daily AS u LEFT JOIN rms_member.dim_member_item_ss_daily AS item1 ON u.tenant_id = item1.tenant_id AND u.one_id = item1.identity_id AND item1.item_type = ‘wx_app_quota’ AND ( (item1.tenant_id IN (‘11750013’)) AND (item1.id_type = ‘@member_id’) ) WHERE ( (u.tenant_id IN (‘11750013’)) AND (u.id_type = ‘@member_id’) ) ) AS u WHERE 1 = 1 ) AS u WHERE 1 = 1 AND ( (u.uattr32 = ‘wx29850215cb395a68’) AND ((CAST(u.uattr1004 AS decimal(20, 6)) > 0)) ) GROUP BY u.tenantId, u.identityId ) AS U ) AS U ) AS U”
– 6.删除第3步往3号分区写入的数据,数据删除成功。
2024-09-30 18:04:09,8962,“delete from rms_member.dwt_user_label_member_id_middle_partition_tenant_daily where partition_version in (3,2) and label_code = ‘G_200000001335817’ and tenant_id = 11750013 and data_update_time < ‘2024-09-30 18:04:01’”
– 7.表中没有任何G_200000001335817的数据。
- 依次执行的sql为:(执行的sql是 insert into && delete,一共执行了三对)。
-
其中insert into 后面跟的select语句都一致,且查询结果都有数据。仅仅是partition_version和update_time字段不一致。
-
初始dwt_user_label_member_id_middle_partition_tenant_daily中不存在label_code = 'G_200000001335817’的任何数据,但是每个分区存在很多其他数据。
猜测
对于SR主键表,执行数据删除的时候,同时插入和删除数据主键一样的数据,
就算插入的数据并不满足删除的条件,该插入操作也会写入不成功。
【业务影响】数据写入失败,导致未圈选到人
【是否存算分离】否
【StarRocks版本】例如:2.5
【集群规模】例如:3fe(1 follower+2observer)+8be(fe与be混部)
【机器信息】CPU虚拟核/内存/网卡,例如:48C/64G/万兆
【联系方式】为了在解决问题过程中能及时联系到您获取一些日志信息,请补充下您的联系方式,例如:社区群4-小李或者邮箱,谢谢
【附件】
过了国庆,当时日志已经缺失。