版本2.5.7 SQL执行计划错误导致查询数据不正确
-
sql语句
select *
from (select substr(finishtime, 1, 4),
case
when industry = ‘生态’ then
‘生态’
else
‘家电’
end as type1,
case
when channelid_name = ‘海尔智家APP’ then
‘智家APP’
when channelid_name = ‘智家app’ then
‘智家APP’
when channelid_name = ‘智家APP’ then
‘智家APP’
when channelid_name = ‘智家app小程序’ then
‘智家小程序’
else
‘其他’
end as type2,
/* b.dim6 as type3, / sum(scene_productamount), count(distinct md_code)
from di_syn.dws_trade_actual_payamount_df_new_scene_productamount a
left join (select distinct dim1, dim6 from di_syn.dim_trade_apddim_df where type_id = ‘6’) b
on b.dim1 = a.sku
where source_flag in (‘1’, ‘10’)
and substr(finishtime, 1, 7) between ‘2022-01’ and ‘2023-08’
and channelid_name in (‘海尔智家APP’, ‘智家app’, ‘智家APP’, ‘智家app小程序’)
group by substr(finishtime, 1, 4),
case
when industry = ‘生态’ then
‘生态’
else
‘家电’
end,
case
when channelid_name = ‘海尔智家APP’ then
‘智家APP’
when channelid_name = ‘智家app’ then
‘智家APP’
when channelid_name = ‘智家APP’ then
‘智家APP’
when channelid_name = ‘智家app小程序’ then
‘智家小程序’
else
‘其他’
end / ,b.dim6 */
) aa
where type1 = ‘家电’ -
执行计划
PLAN FRAGMENT 0
OUTPUT EXPRS:165: substr | 166: case | 167: case | 168: sum | 169: count
PARTITION: UNPARTITIONED
RESULT SINK
16:EXCHANGE
PLAN FRAGMENT 1
OUTPUT EXPRS:
PARTITION: HASH_PARTITIONED: 165: substr, 166: case, 167: case
STREAM DATA SINK
EXCHANGE ID: 16
UNPARTITIONED
15:AGGREGATE (update finalize)
| output: sum(168: sum), count(82: md_code)
| group by: 165: substr, 166: case, 167: case
|
14:AGGREGATE (merge serialize)
| output: sum(168: sum)
| group by: 82: md_code, 165: substr, 166: case, 167: case
|
13:EXCHANGE
PLAN FRAGMENT 2
OUTPUT EXPRS:
PARTITION: HASH_PARTITIONED: 58: sku
STREAM DATA SINK
EXCHANGE ID: 13
HASH_PARTITIONED: 165: substr, 166: case, 167: case
12:AGGREGATE (update serialize)
| STREAMING
| output: sum(155: scene_productamount)
| group by: 82: md_code, 165: substr, 166: case, 167: case
|
11:Project
| <slot 82> : 82: md_code
| <slot 155> : 155: scene_productamount
| <slot 165> : substr(108: finishtime, 1, 4)
| <slot 166> : if(61: industry = ‘生态’, ‘生态’, ‘家电’)
| <slot 167> : CASE WHEN 9: channelid_name = ‘海尔智家APP’ THEN ‘智家APP’ WHEN 9: channelid_name = ‘智家app’ THEN ‘智家APP’ WHEN 9: channelid_name = ‘智家APP’ THEN ‘智家APP’ WHEN 9: channelid_name = ‘智家app小程序’ THEN ‘智家小程序’ ELSE ‘其他’ END
|
10:HASH JOIN
| join op: LEFT OUTER JOIN (PARTITIONED)
| colocate: false, reason:
| equal join conjunct: 58: sku = 158: dim1
|
|----9:EXCHANGE
|
2:EXCHANGE
PLAN FRAGMENT 3
OUTPUT EXPRS:
PARTITION: HASH_PARTITIONED: 158: dim1, 163: dim6
STREAM DATA SINK
EXCHANGE ID: 09
HASH_PARTITIONED: 158: dim1
8:Project
| <slot 158> : 158: dim1
|
7:AGGREGATE (merge finalize)
| group by: 158: dim1, 163: dim6
|
6:EXCHANGE
PLAN FRAGMENT 4
OUTPUT EXPRS:
PARTITION: RANDOM
STREAM DATA SINK
EXCHANGE ID: 06
HASH_PARTITIONED: 158: dim1, 163: dim6
5:AGGREGATE (update serialize)
| STREAMING
| group by: 158: dim1, 163: dim6
|
4:Project
| <slot 158> : 158: dim1
| <slot 163> : 163: dim6
|
3:OlapScanNode
TABLE: dim_trade_apddim_df
PREAGGREGATION: ON
PREDICATES: 156: type_id = ‘6’
partitions=1/1
rollup: dim_trade_apddim_df
tabletRatio=3/3
tabletList=146735198,146735202,146735206
cardinality=6394
avgRowSize=13.288524
numNodes=0
PLAN FRAGMENT 5
OUTPUT EXPRS:
PARTITION: RANDOM
STREAM DATA SINK
EXCHANGE ID: 02
HASH_PARTITIONED: 58: sku
1:Project
| <slot 9> : 9: channelid_name
| <slot 58> : 58: sku
| <slot 61> : 61: industry
| <slot 82> : 82: md_code
| <slot 108> : 108: finishtime
| <slot 155> : 155: scene_productamount
|
0:OlapScanNode
TABLE: dws_trade_actual_payamount_df_new_scene_productamount
PREAGGREGATION: ON
PREDICATES: 61: industry != ‘生态’, 61: industry != ‘生态’ IS NOT NULL, 3: source_flag IN (‘1’, ‘10’), substr(108: finishtime, 1, 7) >= ‘2022-01’, substr(108: finishtime, 1, 7) <= ‘2023-08’, 9: channelid_name IN (‘海尔智家APP’, ‘智家app’, ‘智家APP’, ‘智家app小程序’)
partitions=1/1
rollup: dws_trade_actual_payamount_df_new_scene_productamount
tabletRatio=20/20
tabletList=233699538,233699542,233699546,233699550,233699554,233699558,233699562,233699566,233699570,233699574 …
cardinality=66947
avgRowSize=50.58644
numNodes=0
执行计划中把空值也排除了,实际上空值应该被查询出来
