出现多条重复数据
帮忙发下脱敏后的建表语句和explain+sql的执行结果
另外发下是哪个版本的
版本:
1.19.0
建表语句:
CREATE TABLE tbl_agent_day_statistics
(
uid
bigint(20) NOT NULL,
id
bigint(20) NOT NULL,
statistics_date
date NOT NULL,
card_count
int(11) REPLACE NULL DEFAULT “0”,
inner_customer_count
int(11) REPLACE NULL DEFAULT “0”,
outer_customer_count
int(11) REPLACE NULL DEFAULT “0”,
visit_share_count
int(11) REPLACE NULL DEFAULT “0”,
visit_share_friend_count
int(11) REPLACE NULL DEFAULT “0”,
visit_share_group_count
int(11) REPLACE NULL DEFAULT “0”,
premium_inner_pay_amount
decimal64(16, 2) REPLACE NULL DEFAULT “0.00”,
premium_inner_item_data
varchar(65533) REPLACE NULL,
premium_inner_item_count
int(11) REPLACE NULL DEFAULT “0”,
premium_inner_customer_data
varchar(65533) REPLACE NULL,
premium_inner_customer_count
int(11) REPLACE NULL DEFAULT “0”,
study_time
int(11) REPLACE NULL DEFAULT “0”,
one_share_count
int(11) REPLACE NULL DEFAULT “0”,
two_share_count
int(11) REPLACE NULL DEFAULT “0”,
one_customer_count
int(11) REPLACE NULL DEFAULT “0”,
two_customer_count
int(11) REPLACE NULL DEFAULT “0”,
remind_count
int(11) REPLACE NULL DEFAULT “0”,
mark_count
int(11) REPLACE NULL DEFAULT “0”,
phone_visit_count
int(11) REPLACE NULL DEFAULT “0”,
touch_count
int(11) REPLACE NULL DEFAULT “0”,
rights_want_count
int(11) REPLACE NULL DEFAULT “0”,
rights_share_count
int(11) REPLACE NULL DEFAULT “0”,
cloud_login_count
int(11) REPLACE NULL DEFAULT “0”,
cloud_touch_count
int(11) REPLACE NULL DEFAULT “0”,
clues_analysis_count
int(11) REPLACE NULL DEFAULT “0”,
touch_user_count
int(11) REPLACE NULL DEFAULT “0”,
touch_high_praise_rate
int(11) REPLACE NULL DEFAULT “0”,
achievement_beat_rate
int(11) REPLACE NULL DEFAULT “0”,
share_user_read_count
int(11) REPLACE NULL DEFAULT “0”,
achievement_status
int(11) REPLACE NULL DEFAULT “0”,
achievement_beat_rate_sore
int(11) REPLACE NULL DEFAULT “0”,
cloud_active_count
int(11) REPLACE NULL DEFAULT “0”,
sign_user_count
int(11) REPLACE NULL DEFAULT “0”,
create_time
datetime REPLACE NULL,
update_time
datetime REPLACE NULL
) ENGINE=OLAP
PRIMARY KEY(uid
, id
, statistics_date
)
COMMENT “OLAP”
PARTITION BY RANGE(statistics_date
)
(PARTITION p201801 VALUES [(‘2018-01-01’), (‘2018-02-01’)),
PARTITION p201802 VALUES [(‘2018-02-01’), (‘2018-03-01’)),
PARTITION p201803 VALUES [(‘2018-03-01’), (‘2018-04-01’)),
PARTITION p201804 VALUES [(‘2018-04-01’), (‘2018-05-01’)),
PARTITION p201805 VALUES [(‘2018-05-01’), (‘2018-06-01’)),
PARTITION p201806 VALUES [(‘2018-06-01’), (‘2018-07-01’)),
PARTITION p201807 VALUES [(‘2018-07-01’), (‘2018-08-01’)),
PARTITION p201808 VALUES [(‘2018-08-01’), (‘2018-09-01’)),
PARTITION p201809 VALUES [(‘2018-09-01’), (‘2018-10-01’)),
PARTITION p201810 VALUES [(‘2018-10-01’), (‘2018-11-01’)),
PARTITION p201811 VALUES [(‘2018-11-01’), (‘2018-12-01’)),
PARTITION p201812 VALUES [(‘2018-12-01’), (‘2019-01-01’)),
PARTITION p201901 VALUES [(‘2019-01-01’), (‘2019-02-01’)),
PARTITION p201902 VALUES [(‘2019-02-01’), (‘2019-03-01’)),
PARTITION p201903 VALUES [(‘2019-03-01’), (‘2019-04-01’)),
PARTITION p201904 VALUES [(‘2019-04-01’), (‘2019-05-01’)),
PARTITION p201905 VALUES [(‘2019-05-01’), (‘2019-06-01’)),
PARTITION p201906 VALUES [(‘2019-06-01’), (‘2019-07-01’)),
PARTITION p201907 VALUES [(‘2019-07-01’), (‘2019-08-01’)),
PARTITION p201908 VALUES [(‘2019-08-01’), (‘2019-09-01’)),
PARTITION p201909 VALUES [(‘2019-09-01’), (‘2019-10-01’)),
PARTITION p201910 VALUES [(‘2019-10-01’), (‘2019-11-01’)),
PARTITION p201911 VALUES [(‘2019-11-01’), (‘2019-12-01’)),
PARTITION p201912 VALUES [(‘2019-12-01’), (‘2020-01-01’)),
PARTITION p202001 VALUES [(‘2020-01-01’), (‘2020-02-01’)),
PARTITION p202002 VALUES [(‘2020-02-01’), (‘2020-03-01’)),
PARTITION p202003 VALUES [(‘2020-03-01’), (‘2020-04-01’)),
PARTITION p202004 VALUES [(‘2020-04-01’), (‘2020-05-01’)),
PARTITION p202005 VALUES [(‘2020-05-01’), (‘2020-06-01’)),
PARTITION p202006 VALUES [(‘2020-06-01’), (‘2020-07-01’)),
PARTITION p202007 VALUES [(‘2020-07-01’), (‘2020-08-01’)),
PARTITION p202008 VALUES [(‘2020-08-01’), (‘2020-09-01’)),
PARTITION p202009 VALUES [(‘2020-09-01’), (‘2020-10-01’)),
PARTITION p202010 VALUES [(‘2020-10-01’), (‘2020-11-01’)),
PARTITION p202011 VALUES [(‘2020-11-01’), (‘2020-12-01’)),
PARTITION p202012 VALUES [(‘2020-12-01’), (‘2021-01-01’)),
PARTITION p202101 VALUES [(‘2021-01-01’), (‘2021-02-01’)),
PARTITION p202102 VALUES [(‘2021-02-01’), (‘2021-03-01’)),
PARTITION p202103 VALUES [(‘2021-03-01’), (‘2021-04-01’)),
PARTITION p202104 VALUES [(‘2021-04-01’), (‘2021-05-01’)),
PARTITION p202105 VALUES [(‘2021-05-01’), (‘2021-06-01’)),
PARTITION p202106 VALUES [(‘2021-06-01’), (‘2021-07-01’)),
PARTITION p202107 VALUES [(‘2021-07-01’), (‘2021-08-01’)),
PARTITION p202108 VALUES [(‘2021-08-01’), (‘2021-09-01’)),
PARTITION p202109 VALUES [(‘2021-09-01’), (‘2021-10-01’)),
PARTITION p202110 VALUES [(‘2021-10-01’), (‘2021-11-01’)),
PARTITION p202111 VALUES [(‘2021-11-01’), (‘2021-12-01’)),
PARTITION p202112 VALUES [(‘2021-12-01’), (‘2022-01-01’)),
PARTITION p202201 VALUES [(‘2022-01-01’), (‘2022-02-01’)),
PARTITION p202202 VALUES [(‘2022-02-01’), (‘2022-03-01’)),
PARTITION p202203 VALUES [(‘2022-03-01’), (‘2022-04-01’)),
PARTITION p202204 VALUES [(‘2022-04-01’), (‘2022-05-01’)),
PARTITION p202205 VALUES [(‘2022-05-01’), (‘2022-06-01’)),
PARTITION p202206 VALUES [(‘2022-06-01’), (‘2022-07-01’)),
PARTITION p202207 VALUES [(‘2022-07-01’), (‘2022-08-01’)),
PARTITION p202208 VALUES [(‘2022-08-01’), (‘2022-09-01’)),
PARTITION p202209 VALUES [(‘2022-09-01’), (‘2022-10-01’)),
PARTITION p202210 VALUES [(‘2022-10-01’), (‘2022-11-01’)),
PARTITION p202211 VALUES [(‘2022-11-01’), (‘2022-12-01’)),
PARTITION p202212 VALUES [(‘2022-12-01’), (‘2023-01-01’)),
PARTITION p202301 VALUES [(‘2023-01-01’), (‘2023-02-01’)),
PARTITION p202302 VALUES [(‘2023-02-01’), (‘2023-03-01’)),
PARTITION p202303 VALUES [(‘2023-03-01’), (‘2023-04-01’)),
PARTITION p202304 VALUES [(‘2023-04-01’), (‘2023-05-01’)),
PARTITION p202305 VALUES [(‘2023-05-01’), (‘2023-06-01’)),
PARTITION p202306 VALUES [(‘2023-06-01’), (‘2023-07-01’)),
PARTITION p202307 VALUES [(‘2023-07-01’), (‘2023-08-01’)),
PARTITION p202308 VALUES [(‘2023-08-01’), (‘2023-09-01’)),
PARTITION p202309 VALUES [(‘2023-09-01’), (‘2023-10-01’)),
PARTITION p202310 VALUES [(‘2023-10-01’), (‘2023-11-01’)),
PARTITION p202311 VALUES [(‘2023-11-01’), (‘2023-12-01’)),
PARTITION p202312 VALUES [(‘2023-12-01’), (‘2024-01-01’)))
DISTRIBUTED BY HASH(uid
) BUCKETS 12
PROPERTIES (
“replication_num” = “2”,
“colocate_with” = “group_agent_uid”,
“in_memory” = “false”,
“storage_format” = “DEFAULT”
);
CREATE TABLE tbl_tjpt_agent_info
(
job_number
varchar(64) NOT NULL,
uid
bigint(20) REPLACE NULL DEFAULT “0”,
id
bigint(20) REPLACE NULL DEFAULT “0”,
one_level_code
varchar(64) REPLACE NULL,
two_level_code
varchar(64) REPLACE NULL,
three_level_code
varchar(64) REPLACE NULL,
four_level_code
varchar(64) REPLACE NULL,
five_level_code
varchar(64) REPLACE NULL,
six_level_code
varchar(64) REPLACE NULL,
name
varchar(256) REPLACE NULL DEFAULT “”,
head
varchar(512) REPLACE NULL DEFAULT “”,
image
varchar(512) REPLACE NULL DEFAULT “”,
qr_code
varchar(512) REPLACE NULL DEFAULT “”,
sex
varchar(1) REPLACE NULL DEFAULT “”,
phone
varchar(32) REPLACE NULL DEFAULT “”,
title
varchar(1024) REPLACE NULL DEFAULT “”,
resume
varchar(65533) REPLACE NULL,
work_year
datetime REPLACE NULL,
sign_time
datetime REPLACE NULL,
job_time
datetime REPLACE NULL,
superior_name
varchar(64) REPLACE NULL DEFAULT “”,
auth_status
int(11) REPLACE NULL DEFAULT “0”,
tips
int(11) REPLACE NULL DEFAULT “0”,
modify_flag
int(11) REPLACE NULL DEFAULT “0”,
user_name
varchar(128) REPLACE NULL DEFAULT “”,
birthday
date REPLACE NULL,
first_phone
varchar(128) REPLACE NULL DEFAULT “”,
agent_phone
varchar(128) REPLACE NULL DEFAULT “”,
source
varchar(128) REPLACE NULL DEFAULT “其它”,
source_type
tinyint(4) REPLACE NULL DEFAULT “0”,
type
tinyint(4) REPLACE NULL DEFAULT “0”,
js_type
tinyint(4) REPLACE NULL DEFAULT “0”,
register_date
datetime REPLACE NULL,
create_time
datetime REPLACE NULL,
update_time
datetime REPLACE NULL,
staff_flag
tinyint(4) REPLACE NULL DEFAULT “0”
) ENGINE=OLAP
PRIMARY KEY(job_number
)
COMMENT “OLAP”
DISTRIBUTED BY HASH(job_number
) BUCKETS 12
PROPERTIES (
“replication_num” = “2”,
“in_memory” = “false”,
“storage_format” = “DEFAULT”
);
explain:
WORK ON CBO OPTIMIZER
PLAN FRAGMENT 0
OUTPUT EXPRS:1: uid | 75: sum(4: card_count) | 76: sum(5: inner_customer_count) | 77: sum(6: outer_customer_count) | 78: sum(18: one_customer_count) | 79: sum(19: two_customer_count) | 80: sum(22: phone_visit_count) | 81: sum(23: touch_count)
PARTITION: UNPARTITIONED
RESULT SINK
8:MERGING-EXCHANGE
use vectorized: true
PLAN FRAGMENT 1
OUTPUT EXPRS:
PARTITION: RANDOM
STREAM DATA SINK
EXCHANGE ID: 08
UNPARTITIONED
7:SORT
| order by: <slot 76> 76: sum(5: inner_customer_count) DESC, <slot 1> 1: uid ASC
| offset: 0
| use vectorized: true
|
6:AGGREGATE (update finalize)
| output: sum(23: touch_count), sum(4: card_count), sum(5: inner_customer_count), sum(6: outer_customer_count), sum(18: one_customer_count), sum(19: two_customer_count), sum(22: phone_visit_count)
| group by: 1: uid
| use vectorized: true
|
5:Project
| <slot 1> : 1: uid
| <slot 18> : 18: one_customer_count
| <slot 19> : 19: two_customer_count
| <slot 4> : 4: card_count
| <slot 5> : 5: inner_customer_count
| <slot 6> : 6: outer_customer_count
| <slot 22> : 22: phone_visit_count
| <slot 23> : 23: touch_count
| use vectorized: true
|
4:HASH JOIN
| join op: INNER JOIN (BROADCAST)
| hash predicates:
| colocate: false, reason:
| equal join conjunct: 1: uid = 40: uid
| use vectorized: true
|
|----3:EXCHANGE
| use vectorized: true
|
0:OlapScanNode
TABLE: tbl_agent_day_statistics
PREAGGREGATION: ON
PREDICATES: 5: inner_customer_count > 0
partitions=38/72
rollup: tbl_agent_day_statistics
tabletRatio=456/456
tabletList=1144719,1144722,1144725,1144728,1144731,1144734,1144737,1144740,1144743,1144746 …
cardinality=24150897
avgRowSize=9.0
numNodes=0
use vectorized: true
PLAN FRAGMENT 2
OUTPUT EXPRS:
PARTITION: RANDOM
STREAM DATA SINK
EXCHANGE ID: 03
UNPARTITIONED
2:Project
| <slot 40> : 40: uid
| use vectorized: true
|
1:OlapScanNode
TABLE: tbl_tjpt_agent_info
PREAGGREGATION: ON
PREDICATES: 42: one_level_code = ‘660000’, 44: three_level_code = ‘660521001002’, 69: type = 1
partitions=1/1
rollup: tbl_tjpt_agent_info
tabletRatio=12/12
tabletList=1147316,1147319,1147322,1147325,1147328,1147331,1147334,1147337,1147340,1147343 …
cardinality=890812
avgRowSize=4.0
numNodes=0
use vectorized: true
执行结果:
应该是执行计划有问题, 不应该生成一阶段聚合,我们快尽快修复
请问问题解决了吗?需要升级到哪个版本?