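To help us locate your problem faster, please provide the following information. Thanks.
[Details] Detailed problem description
[Background] Added a Bloom filter index
[Business impact]
[StarRocks version] e.g. 2.4.4
After the index was added, the BE hung: queries return nothing and the page just keeps spinning.
Checking the threads shows that fragment_mgr is occupying more than 1,500 threads.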
pid,tid,common
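For anyone who wants to reproduce a per-thread-name count like the one above, here is a minimal sketch. It assumes a Linux host where thread names can be read from `/proc/<pid>/task/<tid>/comm`. The helper names `thread_names` and `count_by_name` are made up for illustration, and you have to supply the BE process pid yourself.

```python
import os
from collections import Counter


def thread_names(pid):
    """Yield the name of every thread of process `pid` (Linux /proc only)."""
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "comm")) as f:
                yield f.read().strip()
        except OSError:
            # The thread exited between listdir() and open(); skip it.
            continue


def count_by_name(names):
    """Return a Counter mapping thread name -> number of threads."""
    return Counter(names)
```

Calling `count_by_name(thread_names(pid)).most_common(10)` with the BE pid should list the ten most common thread names, which makes a pile-up such as the fragment_mgr one easy to spot.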
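Could you search the be.info log for the keyword fragment_mgr and see whether it returns any key information? Also, we recommend upgrading this 2.4 cluster to 2.5.10; the 2.4 release is less stable than the 2.5 LTS release.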
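If the log is large, grouping the fragment_mgr warnings by failure reason can be easier to read than a raw keyword search. The following is only a sketch: it assumes the log lines use the glog-style prefix seen in the excerpts in this thread (level letter, date, time, thread id, `file:line]`), and the function name `summarize_fragment_failures` is hypothetical.

```python
import re
from collections import Counter

# Assumed glog-style prefix: level+MMDD, time, thread id, "file:line]", message.
LINE_RE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{4}) (?P<time>[\d:.]+)\s+(?P<tid>\d+) "
    r"(?P<file>[\w.]+):(?P<line>\d+)\] (?P<msg>.*)$"
)
# Message shape taken from the fragment_mgr.cpp warnings quoted in this thread.
FRAGMENT_RE = re.compile(
    r"Fail to open fragment (?P<id>[0-9a-f-]+): (?P<reason>.*)$"
)


def summarize_fragment_failures(lines):
    """Count 'Fail to open fragment' warnings from fragment_mgr by reason."""
    reasons = Counter()
    for raw in lines:
        m = LINE_RE.match(raw.rstrip("\n"))
        if not m or not m.group("file").startswith("fragment_mgr"):
            continue  # stack-trace lines and other source files are skipped
        f = FRAGMENT_RE.search(m.group("msg"))
        if f:
            reasons[f.group("reason")] += 1
    return reasons
```

Passing an open log file to it, for example `summarize_fragment_failures(open(path))` where `path` points at your BE log, returns a `Counter` keyed by the failure reason. The first distinct reason in time order is usually the one worth looking at, since later "canceled state" entries tend to be follow-on cancellations.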
W0421 15:29:09.581102 3272505 fragment_mgr.cpp:182] Fail to open fragment 34a5d9f0-e016-11ed-b974-525400c351e8: Cancelled: Expected LE 1 to be returned by expression
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:315 _plan->get_next(_runtime_state, &_chunk, &_done)
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:207 _get_next_internal_vectorized(&chunk)
W0421 15:29:09.581398 3272522 fragment_mgr.cpp:182] Fail to open fragment 34a5d9f0-e016-11ed-b974-525400c351e9: Cancelled: Cancelled because of runtime state is cancelled
/root/starrocks/be/src/exec/exchange_node.cpp:126 get_next_merging(state, chunk, eos)
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:315 _plan->get_next(_runtime_state, &_chunk, &_done)
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:207 _get_next_internal_vectorized(&chunk)
W0421 15:29:09.581622 3272768 result_buffer_mgr.cpp:102] no result for this query, id=TUniqueId(hi=3793677889704432109, lo=-5083347558650916375)
E0421 15:29:09.581847 3272442 olap_scan_node.cpp:279] [TUniqueId(hi=3793677889704432109, lo=-5083347558650916401)] Cancelled: canceled state
E0421 15:29:09.582428 3272478 olap_scan_node.cpp:279] [TUniqueId(hi=3793677889704432109, lo=-5083347558650916401)] Cancelled: canceled state
W0421 15:29:09.583143 3272549 fragment_mgr.cpp:182] Fail to open fragment 34a5d9f0-e016-11ed-b974-525400c351d2: Cancelled: Cancelled because of runtime state is cancelled
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:315 _plan->get_next(_runtime_state, &_chunk, &_done)
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:207 _get_next_internal_vectorized(&chunk)
W0421 15:29:09.583257 3272506 fragment_mgr.cpp:182] Fail to open fragment 34a5d9f0-e016-11ed-b974-525400c351d8: Cancelled: canceled state
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:315 _plan->get_next(_runtime_state, &_chunk, &_done)
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:207 _get_next_internal_vectorized(&chunk)
W0421 15:29:09.583364 3272539 fragment_mgr.cpp:182] Fail to open fragment 34a5d9f0-e016-11ed-b974-525400c351d9: Cancelled: canceled state
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:315 _plan->get_next(_runtime_state, &_chunk, &_done)
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:207 _get_next_internal_vectorized(&chunk)
W0421 15:29:09.583365 3272494 fragment_mgr.cpp:182] Fail to open fragment 34a5d9f0-e016-11ed-b974-525400c351d5: Cancelled: Cancelled SenderQueue::get_chunk
/root/starrocks/be/src/exec/exchange_node.cpp:132 _stream_recvr->get_chunk(&_input_chunk)
/root/starrocks/be/src/exec/vectorized/cross_join_node.cpp:545 child(1)->get_next(state, &chunk, &eos)
/root/starrocks/be/src/exec/vectorized/cross_join_node.cpp:106 _build(state)
/root/starrocks/be/src/exec/vectorized/project_node.cpp:93 _children[0]->open(state)
/root/starrocks/be/src/exec/vectorized/topn_node.cpp:102 data_source->open(state)
/root/starrocks/be/src/runtime/plan_fragment_executor.cpp:193 _plan->open(_runtime_state)
The fragments cannot be opened.
In that case I plan to upgrade to the 2.5 LTS release.
Hi, Routine Load has now started reporting: WARN (txnTimeoutChecker|73) [RoutineLoadJob.executeTaskOnTxnStatusChanged():1038] routine load task [job name task_adm_live_spu_trend_15, task id 7d714980-741a-4ed3-9395-81f93f8134a8] aborted because of timeout by txn manager, remove old task and generate new one. Routine Load no longer works.
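The query profiles all look fast; only Routine Load is not working. The cluster has already been upgraded to 2.5.10.
I have filed the new error at https://github.com/StarRocks/starrocks/issues/29092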