EXPLAIN logical 返回rows差距非常大

【StarRocks版本】例如:2.3

EXPLAIN logical
    SELECT device_id
    FROM a
    WHERE app_id IN ('com.adidas.app')
      AND country IN ('IND')
      AND platform = 1
      AND create_date >= '20221230'
      AND create_date <= '20230330'
    GROUP BY device_id
    LIMIT 10000000;

这个SQL实际结果大概是130w左右,但是用explain logical查看执行计划,只有估计5000行,这个差距非常大,是我理解错了吗

+---------------------------------------------------------------------------------------------------------------------------------------------+
| Explain String                                                                                                                              |
+---------------------------------------------------------------------------------------------------------------------------------------------+
| - Output => [2:device_id]                                                                                                                   |
|     - LIMIT [10000000]                                                                                                                      |
|         - EXCHANGE(GATHER)                                                                                                                  |
|                 Estimates: {row: 5352, cpu: 192701.79, memory: 0.00, network: 192701.79, cost: 1484034.30}                                  |
|             - AGGREGATE(GLOBAL) [2:device_id]                                                                                               |
|                     Estimates: {row: 5352, cpu: 327823.57, memory: 192701.79, network: 0.00, cost: 1098630.72}                              |
|                     limit: 10000000                                                                                                         |
|                 - EXCHANGE(SHUFFLE) [2]                                                                                                     |
|                         Estimates: {row: 5352, cpu: 192701.79, memory: 0.00, network: 192701.79, cost: 549315.36}                           |
|                     - SCAN [adt_rtb_device_dynamic_profile] => [2:device_id]                                                                |
|                             Estimates: {row: 5352, cpu: 327823.57, memory: 0.00, network: 0.00, cost: 163911.78}                            |
|                             partitionRatio: 91/185, tabletRatio: 1456/1456                                                                  |
|                             predicate: 3:app_id = 'com.adidas.app' AND 5:country = 'IND' AND 6:platform = 1 AND 1:create_date <= 2023-03-30 |
+---------------------------------------------------------------------------------------------------------------------------------------------+

explain sql这样看下,另外sql的执行结果也帮忙发下看看

感谢回复

下面是explain的结果

mysql> explain
    -> SELECT device_id
    ->     FROM adt_rtb_device_dynamic_profile
    ->     WHERE app_id IN ('com.adidas.app')
    ->       AND country IN ('IND')
    ->       AND platform = 1
    ->       AND create_date >= '20221230'
    ->       AND create_date <= '20230330'
    ->     GROUP BY device_id;
+--------------------------------------------------------------------------------------------------------------------+
| Explain String                                                                                                     |
+--------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                                                    |
|  OUTPUT EXPRS:2: device_id                                                                                         |
|   PARTITION: UNPARTITIONED                                                                                         |
|                                                                                                                    |
|   RESULT SINK                                                                                                      |
|                                                                                                                    |
|   4:EXCHANGE                                                                                                       |
|                                                                                                                    |
| PLAN FRAGMENT 1                                                                                                    |
|  OUTPUT EXPRS:                                                                                                     |
|   PARTITION: HASH_PARTITIONED: 2: device_id                                                                        |
|                                                                                                                    |
|   STREAM DATA SINK                                                                                                 |
|     EXCHANGE ID: 04                                                                                                |
|     UNPARTITIONED                                                                                                  |
|                                                                                                                    |
|   3:AGGREGATE (update finalize)                                                                                    |
|   |  group by: 2: device_id                                                                                        |
|   |                                                                                                                |
|   2:EXCHANGE                                                                                                       |
|                                                                                                                    |
| PLAN FRAGMENT 2                                                                                                    |
|  OUTPUT EXPRS:                                                                                                     |
|   PARTITION: RANDOM                                                                                                |
|                                                                                                                    |
|   STREAM DATA SINK                                                                                                 |
|     EXCHANGE ID: 02                                                                                                |
|     HASH_PARTITIONED: 2: device_id                                                                                 |
|                                                                                                                    |
|   1:Project                                                                                                        |
|   |  <slot 2> : 2: device_id                                                                                       |
|   |                                                                                                                |
|   0:OlapScanNode                                                                                                   |
|      TABLE: adt_rtb_device_dynamic_profile                                                                         |
|      PREAGGREGATION: ON                                                                                            |
|      PREDICATES: 3: app_id = 'com.adidas.app', 5: country = 'IND', 6: platform = 1, 1: create_date <= '2023-03-30' |
|      partitions=91/185                                                                                             |
|      rollup: adt_rtb_device_dynamic_profile                                                                        |
|      tabletRatio=1456/1456                                                                                         |
|      tabletList=668685,668688,668691,668694,668697,668700,668703,668706,668709,668712 ...                          |
|      cardinality=5325                                                                                              |
|      avgRowSize=61.245052                                                                                          |
|      numNodes=0                                                                                                    |
+--------------------------------------------------------------------------------------------------------------------+

执行结果:

mysql> SELECT device_id
    ->     FROM adt_rtb_device_dynamic_profile
    ->     WHERE app_id IN ('com.adidas.app')
    ->       AND country IN ('IND')
    ->       AND platform = 1
    ->       AND create_date >= '20221230'
    ->       AND create_date <= '20230330'
    ->     GROUP BY device_id;
+--------------------------------------+
| device_id                            |
+--------------------------------------+
| 003efaed-5294-4f14-acbf-d240f7fb4846 |
| b6edadab-5f64-479b-8b78-4b548f889976 |
| d7a78f79-ddab-45ac-a45a-004f071986bd |
| a9e4edf6-fa37-40a2-b3dd-fb3a0d88ed57 |
| cbeae59e-5e97-4ffb-b0cd-74b245dc62ae |
| e65239c9-7bc3-4100-9b9e-07d29c6ebb09 |
| 2226afe9-042e-4fca-871f-f001aca68fa2 |
| 1a7fcb77-8e65-40bf-b797-37b311669cf8 |
| 3b7b1d6b-3251-43ee-9276-ed950dd57002 |
| 0c803fb9-8ff9-454f-8efa-a495abc89e8d |
| 2b0eb8db-1a45-49aa-a74b-3afd7f30ed1e |
| 7100c4d4-3aa3-4a3a-a8c8-2827d32bd0a7 |
| ad00e45a-48b7-4f5a-9096-2aba99175dfd |
| 46cb07b8-1480-4c70-aefa-84809030490a |
...
mysql> SELECT count(distinct device_id)
    ->     FROM adt_rtb_device_dynamic_profile
    ->     WHERE app_id IN ('com.adidas.app')
    ->       AND country IN ('IND')
    ->       AND platform = 1
    ->       AND create_date >= '20221230'
    ->       AND create_date <= '20230330';
+---------------------------+
| count(DISTINCT device_id) |
+---------------------------+
|                   1300313 |
+---------------------------+
1 row in set (1 min 14.07 sec)

2.3版本的么,有没有开启cbo优化器,或者统计信息采集。
可以执行 explain costs select … 看看统计信息结果

plan 里面给出的是估计的值,值能作为优化器cost的参考,肯定没办法跟实际的值完全匹配