单表datetime范围查询查不到,以及一些where添加匹配不出数据

为了更快的定位您的问题,请提供以下信息,谢谢
【详述】问题详细描述
(1)单表where 一个datetime类型的字段,匹配不出数据
表DDL:
CREATE TABLE hjx_crm_identify_record (
id bigint(20) NOT NULL COMMENT “主键”,
rent_id bigint(20) NOT NULL COMMENT “租户标识”,
status tinyint(4) NULL DEFAULT “1” COMMENT “认证状态,1认证成功,0认证失败”,
identity_id varchar(64) NULL COMMENT “认证ID”,
user_name varchar(50) NULL COMMENT “认证人姓名”,
org_no varchar(64) NULL DEFAULT “-1” COMMENT “归属部门编号,不为空避免后续索引失效”,
no_level_path varchar(256) NULL DEFAULT “-1” COMMENT “归属部门编号层级路径,数据权限时使用,不为空避免后续索引失效”,
fail_info varchar(128) NULL COMMENT “失败原因及建议”,
service_price decimal128(20, 2) NULL DEFAULT “0.00” COMMENT “产生记录时,该租户的本服务单价”,
create_by varchar(20) NULL COMMENT “创建人id,认证人”,
gmt_create datetime NULL COMMENT “创建时间”,
__op boolean NULL DEFAULT “0” COMMENT “删除控制”
) ENGINE=OLAP
PRIMARY KEY(id, rent_id)
COMMENT “实名认证记录表”
DISTRIBUTED BY HASH(id, rent_id) BUCKETS 1
ORDER BY(gmt_create, rent_id)
PROPERTIES (
“replication_num” = “1”,
“in_memory” = “false”,
“enable_persistent_index” = “true”,
“replicated_storage” = “false”,
“compression” = “LZ4”
);

SELECT * from hjx_crm_identify_record WHERE gmt_create > ‘2021-10-29 09:48:06’
匹配不出数据

(2)单表聚合操作,where int类型字段过滤匹配不出数据
表DDL:
CREATE TABLE tb_sms_record (
id bigint(20) NOT NULL COMMENT “主键id”,
rent_id bigint(20) NOT NULL COMMENT “租户标识”,
receive_number varchar(11) NULL COMMENT “接收短信号码”,
content varchar(500) NOT NULL COMMENT “短信内容”,
template varchar(20) NULL COMMENT “短信模板”,
org_no varchar(64) NULL DEFAULT “-1” COMMENT “归属部门编号,不为空避免后续索引失效”,
no_level_path varchar(256) NULL DEFAULT “-1” COMMENT “归属部门编号层级路径,数据权限时使用,避免后续索引失效”,
entrance varchar(32) NULL COMMENT “触发入口”,
status tinyint(4) NULL DEFAULT “1” COMMENT “发送状态,1发送成功,0发送失败”,
service_price decimal128(20, 2) NULL DEFAULT “0.00” COMMENT “产生记录时,该租户的本服务单价”,
create_by varchar(20) NULL COMMENT “创建人”,
gmt_create datetime NULL COMMENT “创建时间”,
__op boolean NULL DEFAULT “0” COMMENT “删除控制”
) ENGINE=OLAP
PRIMARY KEY(id, rent_id)
COMMENT “短信发送记录表”
DISTRIBUTED BY HASH(id, rent_id) BUCKETS 1
ORDER BY(rent_id, gmt_create)
PROPERTIES (
“replication_num” = “1”,
“in_memory” = “false”,
“enable_persistent_index” = “true”,
“replicated_storage” = “false”,
“compression” = “LZ4”
);

sql:
SELECT
DATE_FORMAT(gmt_create, ‘%Y-%m-%d’) AS dateInfo,
COUNT(id) AS totalCount,
SUM(IF(status = 1, 1, 0)) AS dataSuccess,
SUM(service_price) AS dateCost
FROM
tb_sms_record
where
1 = 1
and DATE_FORMAT(gmt_create, ‘%Y-%m-%d %H:%i:%S’) BETWEEN ‘2023-12-01 00:00:00’ and ‘2023-12-31 23:59:59’
and rent_id in (1, 560)
GROUP BY
DATE_FORMAT(gmt_create, ‘%Y-%m-%d’);

其中where的2个条件,gmt_create和rent_id,同时存在时,就过滤不出数据,实际上数据是存在的

【背景】做过哪些操作?
版本做了升级,StarRocks version 2.5.1 --> 3.1.5-5d8438a
be:替换lib 以及bin
fe:替换lib 以及bin和spark-dpp

【业务影响】
【是否存算分离】
fe和be同一台服务器
【StarRocks版本】
3.1.5-5d8438a

【集群规模】例如:1fe(1 follower+2observer)+1be(fe与be混部)
【联系方式】
为了在解决问题过程中能及时联系到您获取一些日志信息,请补充下您的联系方式,
例如:社区群11-卖女孩的小火柴

【附件】
(1)
explain costs
SELECT * from hjx_crm_identify_record WHERE gmt_create > ‘2021-10-29 09:48:06’

Explain String |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
PLAN FRAGMENT 0(F01) |
Output Exprs:1: id | 2: rent_id | 3: status | 4: identity_id | 5: user_name | 6: org_no | 7: no_level_path | 8: fail_info | 9: service_price | 10: create_by | 11: gmt_create | 12: __op|
Input Partition: UNPARTITIONED |
RESULT SINK |
|
1:EXCHANGE |
distribution type: GATHER |
limit: 200 |
cardinality: 34 |
|
PLAN FRAGMENT 1(F00) |
|
Input Partition: RANDOM |
OutPut Partition: UNPARTITIONED |
OutPut Exchange Id: 01 |
|
0:OlapScanNode |
table: hjx_crm_identify_record, rollup: hjx_crm_identify_record |
preAggregation: on |
Predicates: [11: gmt_create, DATETIME, true] > ‘2021-10-29 09:48:06’ |
partitionsRatio=1/1, tabletsRatio=1/1 |
tabletList=95078 |
actualRows=38, avgRowSize=98.26316 |
limit: 200 |
cardinality: 34 |
column statistics: |
* id–>[1.0, 67.0, 0.0, 8.0, 34.14647315941868] ESTIMATE |
* rent_id–>[1.0, 1135.0, 0.0, 8.0, 4.0] ESTIMATE |
* status–>[0.0, 1.0, 0.0, 1.0, 2.0] ESTIMATE |
* identity_id–>[-Infinity, Infinity, 0.0, 29.342105263157894, 34.14647315941868] ESTIMATE |
* user_name–>[-Infinity, Infinity, 0.0, 3.5, 12.0] ESTIMATE |
* org_no–>[-Infinity, Infinity, 0.0, 4.2368421052631575, 11.0] ESTIMATE |
* no_level_path–>[-Infinity, Infinity, 0.0, 6.815789473684211, 12.0] ESTIMATE |
* fail_info–>[-Infinity, Infinity, 0.0, 5.973684210526316, 8.0] ESTIMATE |
* service_price–>[1.0, 1.2, 0.0, 16.0, 2.0] ESTIMATE |
* create_by–>[-Infinity, Infinity, 0.0, 6.394736842105263, 10.0] ESTIMATE |
* gmt_create–>[1.635472086E9, 1.678954665E9, 0.0, 8.0, 34.14647315941868] ESTIMATE |
* __op–>[0.0, 0.0, 0.0, 1.0, 1.0] ESTIMATE |

(2)
explain costs
SELECT
DATE_FORMAT(gmt_create, ‘%Y-%m-%d’) AS dateInfo,
COUNT(id) AS totalCount,
SUM(IF(status = 1, 1, 0)) AS dataSuccess,
SUM(service_price) AS dateCost
FROM
tb_sms_record
where
1 = 1
and DATE_FORMAT(gmt_create, ‘%Y-%m-%d %H:%i:%S’) BETWEEN ‘2023-12-01 00:00:00’ and ‘2023-12-31 23:59:59’
and rent_id in (1, 560)
GROUP BY
DATE_FORMAT(gmt_create, ‘%Y-%m-%d’);

Explain String |
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
PLAN FRAGMENT 0(F01) |
Output Exprs:14: date_format | 16: count | 17: sum | 18: sum |
Input Partition: UNPARTITIONED |
RESULT SINK |
|
3:EXCHANGE |
distribution type: GATHER |
limit: 200 |
cardinality: 97 |
|
PLAN FRAGMENT 1(F00) |
|
Input Partition: RANDOM |
OutPut Partition: UNPARTITIONED |
OutPut Exchange Id: 03 |
|
2:AGGREGATE (update finalize) |
| aggregate: sum[([15: if, TINYINT, true]); args: TINYINT; result: BIGINT; args nullable: true; result nullable: true], sum[([10: service_price, DECIMAL128(20,2), true]); args: DECIMAL128; result: DECIMAL128(38,2); args nullable: true; result nullable:|
| group by: [14: date_format, VARCHAR, true] |
| limit: 200 |
| cardinality: 97 |
| column statistics: |
| * date_format–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| * count–>[0.0, 193.29729729729732, 0.0, 8.0, 96.64864864864866] ESTIMATE |
| * sum–>[-Infinity, Infinity, 0.0, 8.0, 2.0] ESTIMATE |
| * sum–>[0.0, 3.2216216216216225, 0.0, 16.0, 3.0] ESTIMATE |
| |
1:Project |
| output columns: |
| 1 <-> [1: id, BIGINT, false] |
| 10 <-> [10: service_price, DECIMAL128(20,2), true] |
| 14 <-> date_format[([12: gmt_create, DATETIME, true], ‘%Y-%m-%d’); args: DATETIME,VARCHAR; result: VARCHAR; args nullable: true; result nullable: true] |
| 15 <-> if[([9: status, TINYINT, true] = 1, 1, 0); args: BOOLEAN,TINYINT,TINYINT; result: TINYINT; args nullable: true; result nullable: true] |
| cardinality: 193 |
| column statistics: |
| * id–>[3.0, 28609.0, 0.0, 8.0, 193.29729729729732] ESTIMATE |
| * service_price–>[0.0, 0.1, 0.0, 16.0, 3.0] ESTIMATE |
| * date_format–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
| * if–>[-Infinity, Infinity, 0.0, 1.0, 2.0] ESTIMATE |
| |
0:OlapScanNode |
table: tb_sms_record, rollup: tb_sms_record |
preAggregation: on |
Predicates: date_format[([12: gmt_create, DATETIME, true], ‘%Y-%m-%d %H:%i:%S’); args: DATETIME,VARCHAR; result: VARCHAR; args nullable: true; result nullable: true] >= ‘2023-12-01 00:00:00’, date_format[([12: gmt_create, DATETIME, true], '%Y-%m-%d %|
partitionsRatio=1/1, tabletsRatio=1/1 |
tabletList=95045 |
actualRows=28607, avgRowSize=43.0 |
cardinality: 193 |
column statistics: |
* id–>[3.0, 28609.0, 0.0, 8.0, 193.29729729729732] ESTIMATE |
* rent_id–>[1.0, 560.0, 0.0, 8.0, 2.0] ESTIMATE |
* status–>[0.0, 1.0, 0.0, 1.0, 2.0] ESTIMATE |
* service_price–>[0.0, 0.1, 0.0, 16.0, 3.0] ESTIMATE |
* gmt_create–>[1.631177894E9, 1.702257949E9, 0.0, 8.0, 193.29729729729732] ESTIMATE |
* date_format–>[-Infinity, Infinity, 0.0, 1.0, 1.0] UNKNOWN |
* if–>[-Infinity, Infinity, 0.0, 1.0, 2.0] ESTIMATE |

附件profile
profile.txt (50.7 KB)

请问有通过alter table修改过sort key列吗?

没有,都是删表,再create的,没有alter操作

我在3.1.4版本上按照你的建表和sql语句没发现问题,稍等我升级下试试。或者方便的话能提供两条有问题数据的insert语句吗,我试下

这个是第一张表的数据,38条,少一点,可以看看

表数据.txt (11.6 KB)

SELECT * from hjx_crm_identify_record WHERE gmt_create > ‘2021-10-29 09:48:06’ tablet(95078)
SELECT * from hjx_crm_identify_record tablet(95078)

这两个有结果吗

SELECT * from hjx_crm_identify_record tablet(95078) 有数据

SELECT * from hjx_crm_identify_record tablet(95078) WHERE gmt_create > ‘2021-10-29 09:48:06’ 没有数据

SELECT * from hjx_crm_identify_record tablet(95078) order by gmt_create limit 10

SELECT * from hjx_crm_identify_record tablet(95078) order by gmt_create limit 10 有数据

gmt_create这一列的结果集发下

这个explain costs麻烦发下

Explain String |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
PLAN FRAGMENT 0(F01) |
Output Exprs:1: id | 2: rent_id | 3: status | 4: identity_id | 5: user_name | 6: org_no | 7: no_level_path | 8: fail_info | 9: service_price | 10: create_by | 11: gmt_create | 12: __op|
Input Partition: UNPARTITIONED |
RESULT SINK |
|
1:EXCHANGE |
distribution type: GATHER |
limit: 200 |
cardinality: 34 |
|
PLAN FRAGMENT 1(F00) |
|
Input Partition: RANDOM |
OutPut Partition: UNPARTITIONED |
OutPut Exchange Id: 01 |
|
0:OlapScanNode |
table: hjx_crm_identify_record, rollup: hjx_crm_identify_record |
preAggregation: on |
Predicates: [11: gmt_create, DATETIME, true] > ‘2021-10-29 09:48:06’ |
partitionsRatio=1/1, tabletsRatio=1/1 |
tabletList=95078 |
actualRows=38, avgRowSize=98.26316 |
limit: 200 |
cardinality: 34 |
column statistics: |
* id–>[1.0, 67.0, 0.0, 8.0, 34.14647315941868] ESTIMATE |
* rent_id–>[1.0, 1135.0, 0.0, 8.0, 4.0] ESTIMATE |
* status–>[0.0, 1.0, 0.0, 1.0, 2.0] ESTIMATE |
* identity_id–>[-Infinity, Infinity, 0.0, 29.342105263157894, 34.14647315941868] ESTIMATE |
* user_name–>[-Infinity, Infinity, 0.0, 3.5, 12.0] ESTIMATE |
* org_no–>[-Infinity, Infinity, 0.0, 4.2368421052631575, 11.0] ESTIMATE |
* no_level_path–>[-Infinity, Infinity, 0.0, 6.815789473684211, 12.0] ESTIMATE |
* fail_info–>[-Infinity, Infinity, 0.0, 5.973684210526316, 8.0] ESTIMATE |
* service_price–>[1.0, 1.2, 0.0, 16.0, 2.0] ESTIMATE |
* create_by–>[-Infinity, Infinity, 0.0, 6.394736842105263, 10.0] ESTIMATE |
* gmt_create–>[1.635472086E9, 1.678954665E9, 0.0, 8.0, 34.14647315941868] ESTIMATE |
* __op–>[0.0, 0.0, 0.0, 1.0, 1.0] ESTIMATE |

另外问下你是用mysql客户端查的还是?如果不是,用mysql客户端直接连fe查下

用Dbeaver查是查不到
然后我刚上linux,连mysql客户端去查,也是没有
mysql> SELECT * from hjx_crm_identify_record tablet(95078) order by gmt_create desc limit 1;
±-----±--------±-------±---------------------------------±----------±-------±--------------±-------------±--------------±----------±--------------------±-----+
| id | rent_id | status | identity_id | user_name | org_no | no_level_path | fail_info | service_price | create_by | gmt_create | __op |
±-----±--------±-------±---------------------------------±----------±-------±--------------±-------------±--------------±----------±--------------------±-----+
| 67 | 1 | 1 | 9276c6be052d4ba5995e95cfca009763 | 郭嘉俊 | 08 | /08/ | 认证通过 | 1.20 | MSGZ00948 | 2023-03-16 16:17:45 | 0 |
±-----±--------±-------±---------------------------------±----------±-------±--------------±-------------±--------------±----------±--------------------±-----+
1 row in set (0.01 sec)

mysql> SELECT * from hjx_crm_identify_record tablet(95078) WHERE gmt_create > ‘2021-10-29 09:48:06’;
Empty set (0.01 sec)

第一个问题我根据你的数据和DDL试了下是可以查出来数据的



或者你建个tmp表,create table hjx_crm_identify_record_tmp as select * from hjx_crm_identify_record;然后查询hjx_crm_identify_record_tmp试试呢

新建一个tmp表,然后insert同样数据进入,查tmp表是能查到的

另一个表也这么试下呢

另一个表也建新表,重新insert,也是可以查到

SELECT gmt_create from hjx_crm_identify_record WHERE gmt_create > ‘2021-10-29 09:48:06’
这个能查出来结果吗

select *没有,select id和select gmt_create都有数据