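[Details] We ingest about 15 million rows per day and need distinct counts on several of the fields. Currently a query on one field takes about 1s, and the time doubles when querying multiple fields. Is there any way to optimize this kind of query?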
[Business impact]
[StarRocks version] e.g. 2.0.1
[Cluster size] e.g. 3 FE (64G)
Execution plan:
PLAN FRAGMENT 0
OUTPUT EXPRS:16: count | 17: count | 2: dt
PARTITION: UNPARTITIONED
RESULT SINK
5:EXCHANGE
use vectorized: true
PLAN FRAGMENT 1
OUTPUT EXPRS:
PARTITION: HASH_PARTITIONED: 2: dt
STREAM DATA SINK
EXCHANGE ID: 05
UNPARTITIONED
4:AGGREGATE (merge finalize)
| output: multi_distinct_count(17: count), multi_distinct_count(16: count)
| group by: 2: dt
| use vectorized: true
|
3:EXCHANGE
use vectorized: true
PLAN FRAGMENT 2
OUTPUT EXPRS:
PARTITION: RANDOM
STREAM DATA SINK
EXCHANGE ID: 03
HASH_PARTITIONED: 2: dt
2:AGGREGATE (update serialize)
| STREAMING
| output: multi_distinct_count(14: user_uuid), multi_distinct_count(13: device_uuid)
| group by: 2: dt
| use vectorized: true
|
1:Project
| <slot 2> : 2: dt
| <slot 13> : 13: device_uuid
| <slot 14> : 14: user_uuid
| use vectorized: true
|
0:OlapScanNode
TABLE: dws_realtime_user_detail_inc_1d
PREAGGREGATION: ON
PREDICATES: 1: game_key = 10003
partitions=7/66
rollup: dws_realtime_user_detail_inc_1d
tabletRatio=7/7
tabletList=27150373,27158210,27158268,27217317,27263695,27264171,27264235
cardinality=16663835
avgRowSize=27.533485
numNodes=0
use vectorized: true
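
For reference, the plan above appears to correspond to a query of roughly the following shape. This is a reconstruction from the plan, not the original statement: the table name, column names, and the `game_key = 10003` predicate are taken from the plan, while the sample rows, the `ORDER BY`, and the use of SQLite are assumptions made only to keep the sketch runnable. Any date-range filter behind `partitions=7/66` is not visible in the plan and is left out.

```python
import sqlite3

# In-memory stand-in for the real table; the schema is an assumption limited
# to the columns that appear in the plan.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE dws_realtime_user_detail_inc_1d (
        game_key    INTEGER,
        dt          TEXT,
        device_uuid TEXT,
        user_uuid   TEXT
    )
    """
)

# Made-up sample rows, for illustration only.
rows = [
    (10003, "2022-01-01", "d1", "u1"),
    (10003, "2022-01-01", "d1", "u2"),
    (10003, "2022-01-01", "d2", "u2"),
    (10003, "2022-01-02", "d3", "u1"),
    (10004, "2022-01-02", "d9", "u9"),  # filtered out by game_key
]
conn.executemany(
    "INSERT INTO dws_realtime_user_detail_inc_1d VALUES (?, ?, ?, ?)", rows
)

# Query shape inferred from the plan: two exact distinct counts grouped by dt.
# ORDER BY is added here only to make the output deterministic.
QUERY = """
SELECT COUNT(DISTINCT device_uuid) AS device_cnt,
       COUNT(DISTINCT user_uuid)   AS user_cnt,
       dt
FROM dws_realtime_user_detail_inc_1d
WHERE game_key = 10003
GROUP BY dt
ORDER BY dt
"""

result = conn.execute(QUERY).fetchall()
for device_cnt, user_cnt, dt in result:
    print(dt, device_cnt, user_cnt)
```

Each `COUNT(DISTINCT ...)` column shows up in the plan as its own `multi_distinct_count` aggregate, which matches the observation that query time grows with the number of fields being counted.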