hash join 的sql, sr 比 trino 慢很多

为了更快的定位您的问题,请提供以下信息,谢谢
【详述】迁移 trino 的sql 到 sr, 大部分sql都OK,有一个关键sql慢, trino 需要27s,sr 需要3m40s,查看sql profile, 主要是hash join 的原因。
两个be 节点的内存一直在 50%以下,不带涨的。

【背景】
【业务影响】 trino 迁移到 sr
【是否存算分离】是
【StarRocks版本】3.2.8
【集群规模】1fe + 2 be
【机器信息】16C + 128G

【附件】

  • Profile信息
    Explain_sql.txt (44.0 KB)

    • 并行度:show variables like ‘%parallel_fragment_exec_instance_num%’;
      修改并行度没有作用 parallel_fragment_exec_instance_num=8
      show variables like ‘%pipeline_dop%’;

    • pipeline是否开启:show variables like ‘%pipeline%’;
      已开启pipeline

    • 执行计划:explain costs + sql

    • be节点cpu和内存使用率截图

查询比较慢,但是资源利用率不高, 不明白咋回事

这也没个profile,只能靠猜了

sr_frag1_dop8.profile (123.4 KB) trino_analyze.json (1.1 MB) trino_exec.profile (53.6 KB) sr_dop.profile (123.2 KB) sr_trino_sql.profile (123.2 KB)

应该是一对多join导致的,可以先用3.3试下看看能到多少?

我看这个应该是这个like 执行到慢,把like换成这个函数试下 https://docs.starrocks.io/zh/docs/sql-reference/sql-functions/string-functions/instr/

1赞

你就是我的神, 改为 instr 之后, 14s 出结果, 比trino的 27 s 快很多, 在cpu资源不如trino的情况下

@ [许秀不许秀] 这种是不是优化器可以搞成自动转换下。

这个只有 join 的时候,如果用到 like ,用 instr 替换,效果会好很多,如果只是简单的like, 使用instr 替换没什么效果

秀秀老师, 还有一个case , join + row_number 窗口排序函数的, 相对trino来说是个明显的
profile 文件:
badcase080901.profile (216.4 KB)

SELECT vbi.name, t2.qqPlayTimes, t2.mangguoPlayTimes, t2.letvPlayTimes, t2.sohuPlayTimes, t2.pptvPlayTimes, t2.iqiyiPlayTimes, t2.youkuPlayTimes, t2.xiguaPlayTimes, t2.qqPlayTimesPredicted, t2.mangguoPlayTimesPredicted, t2.letvPlayTimesPredicted, t2.sohuPlayTimesPredicted, t2.pptvPlayTimesPredicted, t2.iqiyiPlayTimesPredicted, t2.youkuPlayTimesPredicted, t2.xiguaPlayTimesPredicted, t2.totalPlayTimes, t2.totalPlayTimesPredicted, t2.qqserialnum, t2.mangguoserialnum, t2.letvserialnum, t2.sohuserialnum, t2.pptvserialnum, t2.iqiyiserialnum, t2.youkuserialnum, t2.xiguaserialnum, t2.maxserialnum, vbi.tags, vbi.releasetime, vbi.network, vbi.dayPlatform, vbi.hpURL, vbi.area, vbi.bcType, vbi.directorinfo, vbi.screenwriterinfo, vbi.actorinfo, vbi.producerinfo, vbi.productioncompanyinfo, vbi.db_rating, vbi.db_ratingtimes, coalesce(vbi.episodenum_bilibili, cast(vbi.episodenum as varchar)) as episodenum, vbi.totalepisodenum, vbi.presenterinfo, vbi.guestinfo, vbi.originalid, vbi.duration, vbi.dayplatform_start, vbi.editedflag, vbi.dayplatform_all, t2.bilibiliPlayTimes, t2.bilibiliPlayTimesPredicted, vbi.offlinetime, vbi.ismemberfree, vbi.copyright, vbi.isvip, vbi.notvipdowntime FROM ( SELECT name, sum(playtimespredicted) AS totalPlayTimesPredicted FROM ( SELECT *, row_number() over (partition BY name,channel,channelType ORDER BY DAY DESC) R FROM ( SELECT B.name AS name, A.channel AS channel, A.channelType AS channelType, play_times, A.playtimespredicted_avg as playtimespredicted, DAY, serialnum FROM default_catalog.enlightent_daily.rank_view_days A JOIN hive_old.mysql_prod_enlightent_daily.video_basic_info_extra B ON lower(A.name) = lower(B.name) AND A.channelType=B.channelType WHERE DAY<='2020-12-24' AND vn=30 AND A.channelType='tv' AND instr(B.dayplatform_all, A.channel) > 0 AND B.releasetime BETWEEN '2020-01-01' AND '2020-12-24' ) T ) T_R WHERE R=1 GROUP BY name ) t2 RIGHT JOIN hive_old.mysql_prod_enlightent_daily.video_basic_info_extra vbi ON lower(vbi.name) = lower(t2.name) WHERE vbi.channelType = 'tv' AND vbi.releasetime BETWEEN '2020-01-01' and '2020-12-24' AND vbi.editedflag >=0 AND vbi.editedflag !=2 AND vbi.dayplatform_all != 'NULL' AND vbi.dayplatform_all != '' ORDER BY t2.totalPlayTimesPredicted DESC
这个join + row_number 的数据结果有点炸裂

让下面这个broadcast join 试试

set disable_join_reorder = true; 使用这个之后, 跟 trino 逻辑一致了,结果秒出。
我简化了sql。
default_catalog.enlightent_daily.rank_view_days 这是一个视图,里面有个 join。
trino 优先计算了里面的视图,最后做 JOIN hive_old.mysql_prod_enlightent_daily.video_basic_info_extra B 。
这样的结果集最少。
如果先做 A 视图里面的 事实表跟 video_basic_info_extra B 表 join , 结果会非常差,
应该是命中什么负优化的策略。

SELECT *, row_number() over (partition BY name ORDER BY DAY DESC) R FROM ( SELECT B.name AS name, A.channel AS channel, A.channelType AS channelType, CASE WHEN A.channel = 'youku' AND A.day > '2019-01-17' THEN 0 ELSE A.playTimes_avg END as play_times, A.playtimespredicted_avg as playtimespredicted, DAY, serialnum FROM default_catalog.enlightent_daily.rank_view_days A JOIN hive_old.mysql_prod_enlightent_daily.video_basic_info_extra B ON lower(A.name) = lower(B.name) AND A.channelType=B.channelType WHERE DAY<='2020-12-24' AND vn=30 AND A.channelType='tv' AND instr(B.dayplatform_all, A.channel) > 0 AND B.releasetime BETWEEN '2020-01-01' AND '2020-12-24'

broadcast 可以2s 出结果, 时间主要耗在 hdfs scan 上了。

发个没改写时候的explain costs

[quote=“不知不觉, post:12, topic:14099”]
SELECT *, row_number() over (partition BY name ORDER BY DAY DESC) R FROM ( SELECT B.name AS name, A.channel AS channel, A.channelType AS channelType, CASE WHEN A.channel = ‘youku’ AND A.day > ‘2019-01-17’ THEN 0 ELSE A.playTimes_avg END as play_times, A.playtimespredicted_avg as playtimespredicted, DAY, serialnum FROM default_catalog.enlightent_daily.rank_view_days A JOIN hive_old.mysql_prod_enlightent_daily.video_basic_info_extra B ON lower(A.name) = lower(B.name) AND A.channelType=B.channelType WHERE DAY<=‘2020-12-24’ AND vn=30 AND A.channelType=‘tv’ AND instr(B.dayplatform_all, A.channel) > 0 AND B.releasetime BETWEEN ‘2020-01-01’ AND ‘2020-12-24’
[/quote]explain_costs.profile (54.0 KB)

https://github.com/StarRocks/starrocks/pull/50416 这个PR之后like %var% 这种pattern 会和instr持平