物化视图创建:
CREATE MATERIALIZED VIEW IF NOT EXISTS test_mv
DISTRIBUTED BY HASH(uid)
REFRESH MANUAL
PROPERTIES (
"replication_num" = "3",
"force_external_table_query_rewrite" = "TRUE"
)
as select
imp_date,
el_lm_experiment_id,
uid
from
hive.test_db.test_table
group by
imp_date,
el_lm_experiment_id,
uid
order by imp_date, el_lm_experiment_id;
查询语句:
select
imp_date,
el_lm_experiment_id,
uid
from
hive.test_db.test_table
where
imp_date = 20231112
and el_lm_experiment_id in ('10901522')
group by
imp_date,
el_lm_experiment_id,
uid
explain执行计划:
+--------------------------------------------------------------------------------------------+
| Explain String |
+--------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS:1: imp_date | 3: el_lm_experiment_id | 10: uid |
| PARTITION: UNPARTITIONED |
| |
| RESULT SINK |
| |
| 4:EXCHANGE |
| |
| PLAN FRAGMENT 1 |
| OUTPUT EXPRS: |
| PARTITION: HASH_PARTITIONED: 1: imp_date, 3: el_lm_experiment_id, 10: uid |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 04 |
| UNPARTITIONED |
| |
| 3:AGGREGATE (merge finalize) |
| | group by: 1: imp_date, 3: el_lm_experiment_id, 10: uid |
| | |
| 2:EXCHANGE |
| |
| PLAN FRAGMENT 2 |
| OUTPUT EXPRS: |
| PARTITION: RANDOM |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 02 |
| HASH_PARTITIONED: 1: imp_date, 3: el_lm_experiment_id, 10: uid |
| |
| 1:AGGREGATE (update serialize) |
| | STREAMING |
| | group by: 1: imp_date, 3: el_lm_experiment_id, 10: uid |
| | |
| 0:HdfsScanNode |
| TABLE: test_table |
| PARTITION PREDICATES: 1: imp_date = 20231112 |
| NON-PARTITION PREDICATES: 1: imp_date = 20231112, 3: el_lm_experiment_id = '10901522' |
| partitions=2/368 |
| cardinality=1 |
| avgRowSize=3.0 |
| numNodes=0 |
+--------------------------------------------------------------------------------------------+