IN语句中常量的类型不一致导致查询结果非预期

IN语句中的常量类型不一致时,查询结果会非预期。

b列为VARCHAR类型:

starrocks> select * from t;
+------+------+
| a    | b    |
+------+------+
|    0 | 1    |
|    1 | abc  |
+------+------+
2 rows in set (0.02 sec)

当IN语句中全为字符串时,结果正确:

starrocks> select * from t  where b in ('1', 'abc');
+------+------+
| a    | b    |
+------+------+
|    0 | 1    |
|    1 | abc  |
+------+------+
2 rows in set (0.02 sec)

当类型不一致,出现INT和VARCHAR的组合时,只查出了数字

starrocks> select * from t  where b in (1, 'abc');
+------+------+
| a    | b    |
+------+------+
|    0 | 1    |
+------+------+
1 row in set (0.01 sec)

查看执行计划和代码,发现这种情况,会都转换成DOUBLE类型:

starrocks> explain select * from t where b in(1, 'abc');
+-------------------------------------------------------------------------------------------+
| Explain String                                                                            |
+-------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                           |
|  OUTPUT EXPRS:1: a | 2: b                                                                 |
|   PARTITION: UNPARTITIONED                                                                |
|                                                                                           |
|   RESULT SINK                                                                             |
|                                                                                           |
|   1:EXCHANGE                                                                              |
|                                                                                           |
| PLAN FRAGMENT 1                                                                           |
|  OUTPUT EXPRS:                                                                            |
|   PARTITION: RANDOM                                                                       |
|                                                                                           |
|   STREAM DATA SINK                                                                        |
|     EXCHANGE ID: 01                                                                       |
|     UNPARTITIONED                                                                         |
|                                                                                           |
|   0:OlapScanNode                                                                          |
|      TABLE: t                                                                             |
|      PREAGGREGATION: ON                                                                   |
|      PREDICATES: CAST(2: b AS DOUBLE) IN (1.0, CAST('abc' AS DOUBLE))                     |
|      partitions=1/1                                                                       |
|      rollup: t                                                                            |
|      tabletRatio=16/16                                                                    |
|      tabletList=109098,109100,109102,109104,109106,109108,109110,109112,109114,109116 ... |
|      cardinality=1                                                                        |
|      avgRowSize=5.6                                                                       |
|      numNodes=0                                                                           |
+-------------------------------------------------------------------------------------------+
27 rows in set (0.01 sec)

比较能让人接受的方式,应该是都转换成字符串,比如使用 ‘=’ 的场景:

starrocks> explain select * from t where b = 1;
+-------------------------------------------------------------------------------------------+
| Explain String                                                                            |
+-------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                           |
|  OUTPUT EXPRS:1: a | 2: b                                                                 |
|   PARTITION: UNPARTITIONED                                                                |
|                                                                                           |
|   RESULT SINK                                                                             |
|                                                                                           |
|   1:EXCHANGE                                                                              |
|                                                                                           |
| PLAN FRAGMENT 1                                                                           |
|  OUTPUT EXPRS:                                                                            |
|   PARTITION: RANDOM                                                                       |
|                                                                                           |
|   STREAM DATA SINK                                                                        |
|     EXCHANGE ID: 01                                                                       |
|     UNPARTITIONED                                                                         |
|                                                                                           |
|   0:OlapScanNode                                                                          |
|      TABLE: t                                                                             |
|      PREAGGREGATION: ON                                                                   |
|      PREDICATES: 2: b = '1'                                                               |
|      partitions=1/1                                                                       |
|      rollup: t                                                                            |
|      tabletRatio=16/16                                                                    |
|      tabletList=109098,109100,109102,109104,109106,109108,109110,109112,109114,109116 ... |
|      cardinality=1                                                                        |
|      avgRowSize=5.6                                                                       |
|      numNodes=0                                                                           |
+-------------------------------------------------------------------------------------------+
27 rows in set (0.01 sec)

帮看看这个场景有没有必要修改一下?