StarRocks query performance optimization

【Details】Roughly 15 million rows of data per day. We count distinct values on some of the fields; a query on a single field currently takes about 1 s, and the time doubles when several fields are queried together. Is there any way to optimize this kind of query?
【Business impact】
【StarRocks version】e.g. 2.0.1
【Cluster size】e.g. 3 FE (64 GB)
Execution plan:
PLAN FRAGMENT 0
OUTPUT EXPRS:16: count | 17: count | 2: dt
PARTITION: UNPARTITIONED

   RESULT SINK                                                                          
                                                                                        
   5:EXCHANGE                                                                           
      use vectorized: true                                                              
                                                                                        
 PLAN FRAGMENT 1                                                                        
  OUTPUT EXPRS:                                                                         
   PARTITION: HASH_PARTITIONED: 2: dt                                                   
                                                                                        
   STREAM DATA SINK                                                                     
     EXCHANGE ID: 05                                                                    
     UNPARTITIONED                                                                      
                                                                                        
   4:AGGREGATE (merge finalize)                                                         
   |  output: multi_distinct_count(17: count), multi_distinct_count(16: count)          
   |  group by: 2: dt                                                                   
   |  use vectorized: true                                                              
   |                                                                                    
   3:EXCHANGE                                                                           
      use vectorized: true                                                              
                                                                                        
 PLAN FRAGMENT 2                                                                        
  OUTPUT EXPRS:                                                                         
   PARTITION: RANDOM                                                                    
                                                                                        
   STREAM DATA SINK                                                                     
     EXCHANGE ID: 03                                                                    
     HASH_PARTITIONED: 2: dt                                                            
                                                                                        
   2:AGGREGATE (update serialize)                                                       
   |  STREAMING                                                                         
   |  output: multi_distinct_count(14: user_uuid), multi_distinct_count(13: device_uuid)
   |  group by: 2: dt                                                                   
   |  use vectorized: true                                                              
   |                                                                                    
   1:Project                                                                            
   |  <slot 2> : 2: dt                                                                  
   |  <slot 13> : 13: device_uuid                                                       
   |  <slot 14> : 14: user_uuid                                                         
   |  use vectorized: true                                                              
   |                                                                                    
   0:OlapScanNode                                                                       
      TABLE: dws_realtime_user_detail_inc_1d                                            
      PREAGGREGATION: ON                                                                
      PREDICATES: 1: game_key = 10003                                                   
      partitions=7/66                                                                   
      rollup: dws_realtime_user_detail_inc_1d                                           
      tabletRatio=7/7                                                                   
      tabletList=27150373,27158210,27158268,27217317,27263695,27264171,27264235         
      cardinality=16663835                                                              
      avgRowSize=27.533485                                                              
      numNodes=0                                                                        
      use vectorized: true

Hi, which data model does the table use?

Hi, could you also provide the query statement and the cluster configuration?

-- CREATE TABLE statement
CREATE TABLE `dws_realtime_user_detail_inc_1d` (
  `game_key` int(11) NOT NULL COMMENT "",
  `dt` date NOT NULL COMMENT "2021-06-17",
  `platform` varchar(20) NULL COMMENT "",
  `buess_time` bigint(20) NOT NULL COMMENT "2021-06-17 01:45:00",
  `is_dev_new` int(11) NULL COMMENT "",
  `is_user_new` int(11) NULL COMMENT "",
  `is_role_new` int(11) NULL COMMENT "",
  `server_id` varchar(20) NULL COMMENT "",
  `channel_id` varchar(20) NULL COMMENT "",
  `app_channel_id` varchar(100) NULL COMMENT "",
  `hero_time_zone` varchar(100) NULL COMMENT "",
  `hero_online_version` varchar(100) NULL COMMENT "",
  `device_uuid` varchar(100) NULL COMMENT "",
  `user_uuid` varchar(100) NULL COMMENT "",
  `role_uuid` varchar(100) NULL COMMENT ""
) ENGINE=OLAP 
DUPLICATE KEY(`game_key`, `dt`, `platform`, `buess_time`, `is_dev_new`, `is_user_new`, `is_role_new`, `server_id`, `channel_id`, `app_channel_id`, `hero_time_zone`, `hero_online_version`)
COMMENT "OLAP"
PARTITION BY RANGE(`game_key`, `dt`)
DISTRIBUTED BY HASH(`dt`) BUCKETS 1 
PROPERTIES (
"replication_num" = "2",
"in_memory" = "false",
"storage_format" = "DEFAULT"
); 


-- Query statement
select ta.role_cnt                                                                         role_cnt,
       ta.user_cnt                                                                         user_cnt,
       ta.dev_cnt                                                                          dev_cnt,
       case when tb.role_cnt = 0 then 0 else round((ta.role_cnt / tb.role_cnt - 1), 4) end regUserDiffRate,
       case when tb.user_cnt = 0 then 0 else round((ta.user_cnt / tb.user_cnt - 1), 4) end regAccDiffRate,
       case when tb.dev_cnt = 0 then 0 else round((ta.dev_cnt / tb.dev_cnt - 1), 4) end    regDevDiffRate
from (select count(distinct role_uuid)   role_cnt,
             count(distinct user_uuid)   user_cnt,
             count(distinct device_uuid) dev_cnt,
             ${gameId}                   game_key
      from realtime.dws_realtime_user_detail_inc_1d
      where game_key = ${gameId}
        and dt = ${startTime}
        and platform = ${envType}) ta
     left join (select count(distinct role_uuid)   role_cnt,
                       count(distinct user_uuid)   user_cnt,
                       count(distinct device_uuid) dev_cnt,
                       ${gameId}                   game_key
                from realtime.dws_realtime_user_detail_inc_1d
                where game_key = ${gameId}
                  and dt = date_sub(${startTime}, interval 1 day)
                  and platform = ${envType}
                  and buess_time <= unix_timestamp(date_sub(now(), interval 1 day))) tb
on ta.game_key = tb.game_key


-- Cluster configuration
fe * 3    64 GB, 16 cores
be * 6    128 GB, 32 cores

Increasing the number of buckets solves this problem.
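The table is created with DISTRIBUTED BY HASH(`dt`) BUCKETS 1, so each partition contains exactly one tablet (the plan shows partitions=7/66, tabletRatio=7/7), and all ~16.6 M rows behind a single dt are scanned and distinct-aggregated on one BE while the other five stay idle. A rough sketch of a rebuilt table is below; the new table name, the distribution column (`user_uuid`), the bucket count (12), and the sample partition are illustrative assumptions, not values from this thread.

-- Sketch only: same schema, but distributed on a high-cardinality column with more buckets
CREATE TABLE `dws_realtime_user_detail_inc_1d_v2` (
  `game_key` int(11) NOT NULL COMMENT "",
  `dt` date NOT NULL COMMENT "2021-06-17",
  `platform` varchar(20) NULL COMMENT "",
  `buess_time` bigint(20) NOT NULL COMMENT "2021-06-17 01:45:00",
  `is_dev_new` int(11) NULL COMMENT "",
  `is_user_new` int(11) NULL COMMENT "",
  `is_role_new` int(11) NULL COMMENT "",
  `server_id` varchar(20) NULL COMMENT "",
  `channel_id` varchar(20) NULL COMMENT "",
  `app_channel_id` varchar(100) NULL COMMENT "",
  `hero_time_zone` varchar(100) NULL COMMENT "",
  `hero_online_version` varchar(100) NULL COMMENT "",
  `device_uuid` varchar(100) NULL COMMENT "",
  `user_uuid` varchar(100) NULL COMMENT "",
  `role_uuid` varchar(100) NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(`game_key`, `dt`, `platform`, `buess_time`, `is_dev_new`, `is_user_new`, `is_role_new`, `server_id`, `channel_id`, `app_channel_id`, `hero_time_zone`, `hero_online_version`)
COMMENT "OLAP"
PARTITION BY RANGE(`game_key`, `dt`) (
  -- illustrative partition only; recreate the table's real range partitions here
  PARTITION p10003_20210617 VALUES LESS THAN ("10003", "2021-06-18")
)
-- Hashing on `user_uuid` instead of `dt` spreads one day's rows across 12 tablets,
-- so the scan and the first-phase multi_distinct_count run on all 6 BEs in parallel.
DISTRIBUTED BY HASH(`user_uuid`) BUCKETS 12
PROPERTIES (
"replication_num" = "2",
"in_memory" = "false",
"storage_format" = "DEFAULT"
);

As far as I know, the bucket number of partitions that already exist cannot be changed in place on 2.0, so the data would need to be reloaded or copied over with an INSERT INTO SELECT from the old table. Choose the bucket count to fit the data volume: a common rule of thumb is to keep each tablet around 1-10 GB, and a multiple of the BE count (6 here) keeps the tablets evenly spread.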