wait_for_version failed: apply stopped tablet causes query failures

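[Details] Queries frequently fail with a wait-for-version error. The two tables used in the attached query are associated through a colocation group (colocation_group) that contains 5 tables. Output of SHOW TABLET 105575924: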


Output of SHOW PROC '/dbs/10829/5475941/partitions/5475742/105569927/105575924';


[StarRocks version] 2.3.7
[Cluster size] 3 FE (3 followers) + 28 BE (FE and BE deployed separately)
[Attachments]

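  • Query error:
    [Data Service API Execution Exception Alert]
    Success flag: fail
    Alert level: Critical
    Calling account: xxxxxxxxxxxxxxxx
    API ID: order_11111111111111
    Duration: < 60 seconds
    Trigger time: 2023-02-22 08:48:00.000
    Trigger count: 1
    url: https://xxxxxxxx
    Header parameters:
    Exception message: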
    java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.sql.SQLSyntaxErrorException: wait_for_version version:118703 failed: apply stopped tablet:105575924 #version:7 [118699.1 118702.1@6 118702.1] pending: rowsets:1[id/seg/row/del/byte/compaction]: [8245/1/50552/0/4.64 MB/27.36 MB]
    at com.haier.bigdata.automat.service.api.impl.InterfaceServiceImpl.doExc(InterfaceServiceImpl.java:316)
    at com.haier.bigdata.automat.service.api.impl.InterfaceServiceImpl.exc(InterfaceServiceImpl.java:250)
    at com.haier.bigdata.automat.service.api.impl.InterfaceServiceImpl.getReturnResultByDB(InterfaceServiceImpl.java:216)
    at com.haier.bigdata.automat.service.api.impl.InterfaceServiceImpl.executeApiReturnResultImpl(InterfaceServiceImpl.java:173)
    at com.haier.bigdata.automat.service.api.impl.InterfaceServiceImpl.executeApiReturnResult(InterfaceServiceImpl.java:96)
    at com.haier.bigdata.automat.api.endpoint.SecurityAPIController.securityApi$original$Q9lkW
    Send time: 2023-02-22 08:48:00

  • Query statement:
    select
    t.jshi_trade_code,
    t1.jshd_product_group as proGroupCode ,
    t1.jshd_product_brand as brandCode,
    sum(
    case
    when jshi_gvs_order_type in ('ZGRE', 'ZGBR', 'ZKKA') then -1 * jshd_qty
    else jshd_qty
    end
    ) AS takeQuantity,
    sum(
    case
    when jshi_gvs_order_type in ('ZGRE', 'ZGBR', 'ZKKA') then -1 * jshd_amount
    else jshd_amount
    end
    ) as allPriceTotal
    from
    (
    select
    t.jshi_trade_code,
    t.jshi_delivery_type,
    t.jshi_order_amount,
    t.bstnk,
    t.jshi_gvs_order_type,
    t.jshi_sendto_code
    from
    di_sc.tt_jsh_t_order_info t
    where
    t.sap_sended5 != '2'
    and t.sap_judged != '2'
    and (
    t.sap_canceled != '1'
    or t.sap_canceled is null
    )
    and t.etl_source = 1
    and t.jshi_order_gvs_status = '1'
    and t.jshi_order_status not in ('6', '10', '20')
    and t.jshi_created_time >= '2023-02-22'
    and t.jshi_created_time < '2023-02-23'
    and t.jshi_main_channel_code in ('C','M')
    ) t
    INNER JOIN di_sc.tt_jsh_t_order_detail t1 ON t.bstnk = t1.bstnk
    INNER join (
    select
    tm.customer_number,
    tm.gridbody,
    tm.trade_code
    from
    di_sc.tm_sc_customers tm
    where
    tm.trade_code = '12805'
    ) t2 on t.jshi_sendto_code = t2.customer_number
    where
    1 = 1
    and t2.gridbody IN ('02040500B549','02040500B550','02040500B551','02040500B552','02040500B553','02040500B554','02040500B556','02040500B557','02040500B558','02040500B559','02040500B560','02040500C358','02040500C361','02040500C362','02040500C363','02040501A425','01170201A224','01170201A225','01170201A226','01170201A227','01170201A228','01170202D005','01170202D006','01170202D007','01170202D008','01170202D009','01170202D011','01170202D012','01170202D013')
    and t1.jshd_product_group IN ('EA','EE')
    group by
    jshi_trade_code,
    jshd_product_group,
    jshd_product_brand
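
When this alert fires repeatedly, it can help to pull the tablet ID and the requested version out of the exception text automatically, so the matching SHOW TABLET statement can be run right away. The following is a minimal Python sketch. The regular expression is inferred only from the single error message quoted in this post and is not a documented StarRocks format; the names `parse_wait_for_version` and `show_tablet_statement` are hypothetical helpers, not part of any StarRocks client.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the one error message quoted in this thread.
# Assumption: other StarRocks versions may word the message differently.
_WAIT_FOR_VERSION = re.compile(
    r"wait_for_version version:(?P<version>\d+) failed: "
    r"(?P<reason>.+?) tablet:(?P<tablet_id>\d+)"
)


class WaitForVersionError(NamedTuple):
    version: int    # version the query was waiting for
    reason: str     # text between "failed:" and "tablet:", e.g. "apply stopped"
    tablet_id: int  # tablet named in the error


def parse_wait_for_version(message: str) -> Optional[WaitForVersionError]:
    """Return the parsed fields, or None if the message does not match."""
    m = _WAIT_FOR_VERSION.search(message)
    if m is None:
        return None
    return WaitForVersionError(
        version=int(m["version"]),
        reason=m["reason"],
        tablet_id=int(m["tablet_id"]),
    )


def show_tablet_statement(err: WaitForVersionError) -> str:
    """Build the SHOW TABLET statement for the tablet named in the error."""
    return f"SHOW TABLET {err.tablet_id};"
```

Applied to the exception message above, `parse_wait_for_version` yields version 118703, reason "apply stopped", and tablet 105575924, and `show_tablet_statement` returns the same SHOW TABLET statement used at the top of this post.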

This issue has been fixed. You can upgrade to the latest release, 2.3.9, to resolve it.
Corresponding PR: https://github.com/StarRocks/starrocks/pull/17850

I looked at the code under the 2.3.9 tag, and the fix does not appear to be there.

It is probably not in 2.3.9 yet. The change is fairly small, so if you need it you can cherry-pick it, or wait for our next release.