【详述】
基于jdbc的分区表和非分区表的join建了一个物化视图,之前mv刷新有问题。现在基于3.2.2分支的代码打上了 https://github.com/StarRocks/starrocks/pull/37909 这个patch( [BugFix] Fix mv refresh with mysql external table bug)之后mv可以刷新了,但是刷新mv完了之后发现不能用来做SQL改写,而且mv刷新一直在刷新其中的2个分区
源端mysql表定义
mysql> show create table p_lineorder\G
*************************** 1. row ***************************
Table: p_lineorder
Create Table: CREATE TABLE `p_lineorder` (
`lo_orderkey` bigint(20) NOT NULL,
`lo_linenumber` bigint(20) NOT NULL,
`lo_custkey` int(11) NOT NULL,
`lo_partkey` int(11) NOT NULL,
`lo_suppkey` int(11) NOT NULL,
`lo_orderdate` int(11) NOT NULL,
`lo_orderpriotity` varchar(15) NOT NULL,
`lo_shippriotity` int(11) DEFAULT NULL,
`lo_quantity` bigint(20) DEFAULT NULL,
`lo_extendedprice` bigint(20) DEFAULT NULL,
`lo_ordtotalprice` bigint(20) DEFAULT NULL,
`lo_discount` bigint(20) DEFAULT NULL,
`lo_revenue` bigint(20) DEFAULT NULL,
`lo_supplycost` bigint(20) DEFAULT NULL,
`lo_tax` bigint(20) DEFAULT NULL,
`lo_commitdate` int(11) NOT NULL,
`lo_shipmode` varchar(10) NOT NULL,
PRIMARY KEY (`lo_orderkey`,`lo_linenumber`,`lo_orderdate`),
KEY `fk_lineitem_customer` (`lo_custkey`),
KEY `fk_lineitem_part` (`lo_partkey`),
KEY `fk_lineitem_supplier` (`lo_suppkey`),
KEY `fk_lineitem_orderdate` (`lo_orderdate`),
KEY `fk_lineitem_commitdate` (`lo_commitdate`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (`lo_orderdate`)
(PARTITION pmin VALUES LESS THAN (19910101) ENGINE = InnoDB,
PARTITION p19920101 VALUES LESS THAN (19920102) ENGINE = InnoDB,
PARTITION p19920102 VALUES LESS THAN (19920103) ENGINE = InnoDB,
PARTITION p19920103 VALUES LESS THAN (19920104) ENGINE = InnoDB,
PARTITION p19920104 VALUES LESS THAN (19920105) ENGINE = InnoDB,
PARTITION p19920105 VALUES LESS THAN (19920106) ENGINE = InnoDB,
PARTITION p19920106 VALUES LESS THAN (19920107) ENGINE = InnoDB,
PARTITION p19920107 VALUES LESS THAN (19920108) ENGINE = InnoDB,
PARTITION p19920108 VALUES LESS THAN (19920109) ENGINE = InnoDB,
PARTITION p19920109 VALUES LESS THAN (19920110) ENGINE = InnoDB) */
1 row in set (0.00 sec)
mysql> show create table part\G
*************************** 1. row ***************************
Table: part
Create Table: CREATE TABLE `part` (
`p_partkey` int(11) NOT NULL,
`p_name` varchar(22) NOT NULL,
`p_mfgr` varchar(6) NOT NULL,
`p_category` varchar(7) NOT NULL,
`p_brand` varchar(9) NOT NULL,
`p_color` varchar(11) NOT NULL,
`p_type` varchar(25) NOT NULL,
`p_size` int(11) NOT NULL,
`p_container` varchar(10) NOT NULL,
PRIMARY KEY (`p_partkey`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mv和View的定义:
'root'@myssbtest Wed Jan 10 14:27:15 2024>show create materialized view default_catalog.mysqlviewtestdb.mv8311\G
*************************** 1. row ***************************
Materialized View: mv8311
Create Materialized View: CREATE MATERIALIZED VIEW `mv8311` (`p_brand`, `LO_ORDERDATE`, `revenue_sum`)
PARTITION BY (`lo_orderdate`)
DISTRIBUTED BY HASH(`LO_ORDERDATE`)
REFRESH ASYNC START("2024-01-07 11:30:00") EVERY(INTERVAL 10 MINUTE)
PROPERTIES (
"replicated_storage" = "true",
"replication_num" = "1",
"force_external_table_query_rewrite" = "TRUE",
"storage_medium" = "HDD"
)
AS SELECT `v831`.`p_brand`, `v831`.`LO_ORDERDATE`, `v831`.`revenue_sum`
FROM `mysqlviewtestdb`.`v831`;
1 row in set (0.00 sec)
'root'@myssbtest Wed Jan 10 14:28:17 2024>show create view default_catalog.mysqlviewtestdb.v831\G
*************************** 1. row ***************************
View: v831
Create View: CREATE VIEW `v831` (`p_brand`, `LO_ORDERDATE`, `revenue_sum`) AS SELECT `mysql`.`myssbtest`.`p`.`p_brand`, `mysql`.`myssbtest`.`l`.`LO_ORDERDATE`, sum(`mysql`.`myssbtest`.`l`.`LO_REVENUE`) AS `revenue_sum`
FROM `mysql`.`myssbtest`.`p_lineorder` AS `l` LEFT OUTER JOIN `mysql`.`myssbtest`.`part` AS `p` ON `mysql`.`myssbtest`.`l`.`LO_PARTKEY` = `mysql`.`myssbtest`.`p`.`P_PARTKEY`
GROUP BY `mysql`.`myssbtest`.`p`.`p_brand`, `mysql`.`myssbtest`.`l`.`LO_ORDERDATE`;
character_set_client: utf8
collation_connection: utf8_general_ci
1 row in set (0.01 sec)
一开始mv是可以用来做改写的:
'root'@myssbtest Wed Jan 10 12:36:59 2024>explain select p_brand, LO_ORDERDATE, sum(LO_REVENUE) as revenue_sum from p_lineorder l left join part p on l.LO_PARTKEY = p.P_PARTKEY group by p_brand, LO_ORDERDATE;
+-----------------------------------------------------------------------------------------------------+
| Explain String |
+-----------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:22: p_brand | 6: lo_orderdate | 27: sum |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 2:EXCHANGE |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 02 |
| UNPARTITIONED |
| |
| 1:Project |
| | <slot 6> : 29: LO_ORDERDATE |
| | <slot 22> : 28: p_brand |
| | <slot 27> : 30: revenue_sum |
| | |
| 0:OlapScanNode |
| TABLE: mv8311 |
| PREAGGREGATION: ON |
| partitions=9/10 |
| rollup: mv8311 |
| tabletRatio=90/90 |
| tabletList=3986496,3986498,3986500,3986502,3986504,3986506,3986508,3986510,3986512,3986514 ... |
| cardinality=5513 |
| avgRowSize=3.0 |
| MaterializedView: true |
+-----------------------------------------------------------------------------------------------------+
31 rows in set (1.16 sec)
源端先后删除了2条数据:
mysql> delete from p_lineorder where lo_orderdate = 19920101 limit 1;
Query OK, 1 row affected (0.00 sec)
mysql> delete from part where p_partkey = 1;
Query OK, 1 row affected (0.00 sec)
发现mv不能用来做改写了:
'root'@myssbtest Wed Jan 10 12:37:28 2024>explain select p_brand, LO_ORDERDATE, sum(LO_REVENUE) as revenue_sum from p_lineorder l left join part p on l.LO_PARTKEY = p.P_PARTKEY group by p_brand, LO_ORDERDATE;
+----------------------------------------------------------------------------------+
| Explain String |
+----------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:22: p_brand | 6: lo_orderdate | 27: sum |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 9:EXCHANGE |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: 22: p_brand, 6: lo_orderdate |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 09 |
| UNPARTITIONED |
| |
| 8:AGGREGATE (merge finalize) |
| | output: sum(27: sum) |
| | group by: 22: p_brand, 6: lo_orderdate |
| | |
| 7:EXCHANGE |
| |
| PLAN FRAGMENT 2 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: p_partkey |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 07 |
| HASH_PARTITIONED: 22: p_brand, 6: lo_orderdate |
| |
| 6:AGGREGATE (update serialize) |
| | STREAMING |
| | output: sum(13: lo_revenue) |
| | group by: 22: p_brand, 6: lo_orderdate |
| | |
| 5:Project |
| | <slot 6> : lo_orderdate |
| | <slot 13> : lo_revenue |
| | <slot 22> : p_brand |
| | |
| 4:HASH JOIN |
| | join op: RIGHT OUTER JOIN (PARTITIONED) |
| | colocate: false, reason: |
| | equal join conjunct: p_partkey = lo_partkey |
| | |
| |----3:EXCHANGE |
| | |
| 1:EXCHANGE |
| |
| PLAN FRAGMENT 3 |
| OUTPUT EXPRS: |
| PARTITION: UNPARTITIONED |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 03 |
| HASH_PARTITIONED: lo_partkey |
| |
| 2:SCAN JDBC |
| TABLE: `p_lineorder` |
| QUERY: SELECT `lo_partkey`, `lo_orderdate`, `lo_revenue` FROM `p_lineorder` |
| |
| PLAN FRAGMENT 4 |
| OUTPUT EXPRS: |
| PARTITION: UNPARTITIONED |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 01 |
| HASH_PARTITIONED: p_partkey |
| |
| 0:SCAN JDBC |
| TABLE: `part` |
| QUERY: SELECT `p_partkey`, `p_brand` FROM `part` |
+----------------------------------------------------------------------------------+
72 rows in set (0.54 sec)
物化视图刷新过程:
'root'@myssbtest Wed Jan 10 12:43:11 2024>refresh materialized view default_catalog.mysqlviewtestdb.mv8311;
+--------------------------------------+
| QUERY_ID |
+--------------------------------------+
| c5914ba4-af72-11ee-9276-1070fd5198c8 |
+--------------------------------------+
1 row in set (0.01 sec)
刷新完之后还不能用来做改写
'root'@myssbtest Wed Jan 10 12:43:17 2024>select * from default_catalog.information_schema.task_runs where `DATABASE` = 'mysqlviewtestdb' and CREATE_TIME >= '2024-01-08 15:00:00' and DEFINITION like '%`mv8311`%' order by create_time desc limit 1\G
*************************** 1. row ***************************
QUERY_ID: c5914ba4-af72-11ee-9276-1070fd5198c8
TASK_NAME: mv-2321548
CREATE_TIME: 2024-01-10 12:43:17
FINISH_TIME: 2024-01-10 12:43:19
STATE: SUCCESS
DATABASE: mysqlviewtestdb
DEFINITION: insert overwrite `mv8311` SELECT `mysqlviewtestdb`.`v831`.`p_brand`, `mysqlviewtestdb`.`v831`.`LO_ORDERDATE`, `mysqlviewtestdb`.`v831`.`revenue_sum`
FROM `mysqlviewtestdb`.`v831`
EXPIRE_TIME: 2024-01-11 12:43:17
ERROR_CODE: 0
ERROR_MESSAGE: NULL
PROGRESS: 100%
EXTRA_MESSAGE: {"forceRefresh":false,"mvPartitionsToRefresh":["p19920106","p19920107","p19920108","p19920109","p19910101","p19920110","p19920102","p19920103","p19920104","p19920105"],"refBasePartitionsToRefreshMap":{"p_lineorder":["p19920106","p19920107","p19920108","p19920109","p19910101","p19920110","p19920102","p19920103","p19920104","p19920105"]},"basePartitionsToRefreshMap":{}}
PROPERTIES: {"FORCE":"false"}
1 row in set (0.14 sec)
'root'@myssbtest Wed Jan 10 12:43:21 2024>explain select p_brand, LO_ORDERDATE, sum(LO_REVENUE) as revenue_sum from p_lineorder l left join part p on l.LO_PARTKEY = p.P_PARTKEY group by p_brand, LO_ORDERDATE;
+----------------------------------------------------------------------------------+
| Explain String |
+----------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:22: p_brand | 6: lo_orderdate | 27: sum |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 9:EXCHANGE |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: 22: p_brand, 6: lo_orderdate |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 09 |
| UNPARTITIONED |
| |
| 8:AGGREGATE (merge finalize) |
| | output: sum(27: sum) |
| | group by: 22: p_brand, 6: lo_orderdate |
| | |
| 7:EXCHANGE |
| |
| PLAN FRAGMENT 2 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: p_partkey |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 07 |
| HASH_PARTITIONED: 22: p_brand, 6: lo_orderdate |
| |
| 6:AGGREGATE (update serialize) |
| | STREAMING |
| | output: sum(13: lo_revenue) |
| | group by: 22: p_brand, 6: lo_orderdate |
| | |
| 5:Project |
| | <slot 6> : lo_orderdate |
| | <slot 13> : lo_revenue |
| | <slot 22> : p_brand |
| | |
| 4:HASH JOIN |
| | join op: RIGHT OUTER JOIN (PARTITIONED) |
| | colocate: false, reason: |
| | equal join conjunct: p_partkey = lo_partkey |
| | |
| |----3:EXCHANGE |
| | |
| 1:EXCHANGE |
| |
| PLAN FRAGMENT 3 |
| OUTPUT EXPRS: |
| PARTITION: UNPARTITIONED |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 03 |
| HASH_PARTITIONED: lo_partkey |
| |
| 2:SCAN JDBC |
| TABLE: `p_lineorder` |
| QUERY: SELECT `lo_partkey`, `lo_orderdate`, `lo_revenue` FROM `p_lineorder` |
| |
| PLAN FRAGMENT 4 |
| OUTPUT EXPRS: |
| PARTITION: UNPARTITIONED |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 01 |
| HASH_PARTITIONED: p_partkey |
| |
| 0:SCAN JDBC |
| TABLE: `part` |
| QUERY: SELECT `p_partkey`, `p_brand` FROM `part` |
+----------------------------------------------------------------------------------+
72 rows in set (1.13 sec)
发现定时刷新一直在做刷新2个分区的动作,很奇怪
'root'@myssbtest Wed Jan 10 12:48:04 2024>select * from default_catalog.information_schema.task_runs where `DATABASE` = 'mysqlviewtestdb' and CREATE_TIME >= '2024-01-08 15:00:00' and DEFINITION like '%`mv8311`%' order by create_time desc limit 1\G
*************************** 1. row ***************************
QUERY_ID: 709dbeb8-af73-11ee-9276-1070fd5198c8
TASK_NAME: mv-2321548
CREATE_TIME: 2024-01-10 12:48:04
FINISH_TIME: 2024-01-10 12:48:05
STATE: SUCCESS
DATABASE: mysqlviewtestdb
DEFINITION: insert overwrite `mv8311` SELECT `mysqlviewtestdb`.`v831`.`p_brand`, `mysqlviewtestdb`.`v831`.`LO_ORDERDATE`, `mysqlviewtestdb`.`v831`.`revenue_sum`
FROM `mysqlviewtestdb`.`v831`
EXPIRE_TIME: 2024-01-11 12:48:04
ERROR_CODE: 0
ERROR_MESSAGE: NULL
PROGRESS: 100%
EXTRA_MESSAGE: {"forceRefresh":false,"mvPartitionsToRefresh":["p19920107","p19920102"],"refBasePartitionsToRefreshMap":{"p_lineorder":["p19920107","p19920102"]},"basePartitionsToRefreshMap":{}}
PROPERTIES: {"FORCE":"false"}
1 row in set (0.12 sec)
'root'@myssbtest Wed Jan 10 12:48:06 2024>select * from default_catalog.information_schema.task_runs where `DATABASE` = 'mysqlviewtestdb' and CREATE_TIME >= '2024-01-08 15:00:00' and DEFINITION like '%`mv8311`%' order by create_time desc limit 1\G
*************************** 1. row ***************************
QUERY_ID: 709dbeb8-af73-11ee-9276-1070fd5198c8
TASK_NAME: mv-2321548
CREATE_TIME: 2024-01-10 12:48:04
FINISH_TIME: 2024-01-10 12:48:05
STATE: SUCCESS
DATABASE: mysqlviewtestdb
DEFINITION: insert overwrite `mv8311` SELECT `mysqlviewtestdb`.`v831`.`p_brand`, `mysqlviewtestdb`.`v831`.`LO_ORDERDATE`, `mysqlviewtestdb`.`v831`.`revenue_sum`
FROM `mysqlviewtestdb`.`v831`
EXPIRE_TIME: 2024-01-11 12:48:04
ERROR_CODE: 0
ERROR_MESSAGE: NULL
PROGRESS: 100%
EXTRA_MESSAGE: {"forceRefresh":false,"mvPartitionsToRefresh":["p19920107","p19920102"],"refBasePartitionsToRefreshMap":{"p_lineorder":["p19920107","p19920102"]},"basePartitionsToRefreshMap":{}}
PROPERTIES: {"FORCE":"false"}
1 row in set (0.13 sec)
'root'@myssbtest Wed Jan 10 12:48:11 2024>select * from default_catalog.information_schema.task_runs where `DATABASE` = 'mysqlviewtestdb' and CREATE_TIME >= '2024-01-08 15:00:00' and DEFINITION like '%`mv8311`%' order by create_time desc limit 1\G
*************************** 1. row ***************************
QUERY_ID: 709dbeb8-af73-11ee-9276-1070fd5198c8
TASK_NAME: mv-2321548
CREATE_TIME: 2024-01-10 12:48:04
FINISH_TIME: 2024-01-10 12:48:05
STATE: SUCCESS
DATABASE: mysqlviewtestdb
DEFINITION: insert overwrite `mv8311` SELECT `mysqlviewtestdb`.`v831`.`p_brand`, `mysqlviewtestdb`.`v831`.`LO_ORDERDATE`, `mysqlviewtestdb`.`v831`.`revenue_sum`
FROM `mysqlviewtestdb`.`v831`
EXPIRE_TIME: 2024-01-11 12:48:04
ERROR_CODE: 0
ERROR_MESSAGE: NULL
PROGRESS: 100%
EXTRA_MESSAGE: {"forceRefresh":false,"mvPartitionsToRefresh":["p19920107","p19920102"],"refBasePartitionsToRefreshMap":{"p_lineorder":["p19920107","p19920102"]},"basePartitionsToRefreshMap":{}}
PROPERTIES: {"FORCE":"false"}
1 row in set (0.12 sec)
'root'@myssbtest Wed Jan 10 12:48:12 2024>select * from default_catalog.information_schema.task_runs where `DATABASE` = 'mysqlviewtestdb' and CREATE_TIME >= '2024-01-08 15:00:00' and DEFINITION like '%`mv8311`%' order by create_time desc limit 1\G
*************************** 1. row ***************************
QUERY_ID: 709dbeb8-af73-11ee-9276-1070fd5198c8
TASK_NAME: mv-2321548
CREATE_TIME: 2024-01-10 12:48:04
FINISH_TIME: 2024-01-10 12:48:05
STATE: SUCCESS
DATABASE: mysqlviewtestdb
DEFINITION: insert overwrite `mv8311` SELECT `mysqlviewtestdb`.`v831`.`p_brand`, `mysqlviewtestdb`.`v831`.`LO_ORDERDATE`, `mysqlviewtestdb`.`v831`.`revenue_sum`
FROM `mysqlviewtestdb`.`v831`
EXPIRE_TIME: 2024-01-11 12:48:04
ERROR_CODE: 0
ERROR_MESSAGE: NULL
PROGRESS: 100%
EXTRA_MESSAGE: {"forceRefresh":false,"mvPartitionsToRefresh":["p19920107","p19920102"],"refBasePartitionsToRefreshMap":{"p_lineorder":["p19920107","p19920102"]},"basePartitionsToRefreshMap":{}}
PROPERTIES: {"FORCE":"false"}
1 row in set (0.11 sec)
trace logs mv的结果太大,只能上传文件:trace_jdbc_mv_logs.txt (34.1 KB)
【背景】做过哪些操作?
基于3.2.2分支的代码打上了 https://github.com/StarRocks/starrocks/pull/37909 这个patch( [BugFix] Fix mv refresh with mysql external table bug),然后编译后把starrocks-fe.jar拷贝到了官网下载的3.2.2的fe的lib目录
【业务影响】
无,测试业务
【是否存算分离】
否
【StarRocks版本】
基于3.2.2分支的代码打上了 https://github.com/StarRocks/starrocks/pull/37909 这个patch( [BugFix] Fix mv refresh with mysql external table bug),然后编译后把starrocks-fe.jar拷贝到了官网下载的3.2.2的fe的lib目录
【集群规模】
3fe(1 follower+2observer)+5be(fe与be混部)
【机器信息】
96核,512G
【联系方式】为了在解决问题过程中能及时联系到您获取一些日志信息,请补充下您的联系方式,例如:社区群4-小李或者邮箱,谢谢