FE node crash

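To help us locate your problem faster, please provide the following information. Thank you.
【Details】There are 3 FE machines (10.1..134, 10.8..133, 10.8.***.134); 10.1 is the production environment and 10.8 is the disaster-recovery environment. On the morning of July 26 the disaster-recovery environment had problems and then gradually recovered. At around 9 p.m. on the 26th, the two disaster-recovery FE machines went down. There were no large volumes of load jobs or query requests during that period. Logs from the crashed machines are attached: fe.log.zip (34.3 MB)
【Background】What operations were performed?
【Business impact】
【StarRocks version】e.g. 2.5.0
【Cluster size】e.g. 3 FE (1 follower + 2 observers) + 6 BE (FE and BE deployed separately)
【Machine info】CPU virtual cores/memory/NIC, e.g. 8C/24G/10 GbE
【Contact】
【Attachments】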

  • fe.log / be.INFO / relevant screenshots

Additional info: each FE machine is 4C/16G.

Could you upgrade the cluster to 2.5.9 first?

2023-07-26 20:45:40,846 WARN (pool-15-thread-6|26763) [KafkaUtil$ProxyAPI.sendProxyRequest():181] failed to send proxy request to TNetworkAddress(hostname:10.8.228.137, port:8060) err [too busy to get kafka info, please check the kafka broker status, or set internal_service_async_thread_num bigger]
2023-07-26 20:45:40,846 WARN (pool-15-thread-6|26763) [KafkaUtil$ProxyAPI.sendProxyRequest():192] failed to send proxy request to TNetworkAddress(hostname:10.8.228.137, port:8060) err failed to send proxy request to TNetworkAddress(hostname:10.8.228.137, port:8060) err [too busy to get kafka info, please check the kafka broker status, or set internal_service_async_thread_num bigger]
2023-07-26 20:45:40,846 WARN (pool-15-thread-6|26763) [RoutineLoadTaskScheduler.scheduleOneTask():191] check task ready to execute failed
com.starrocks.common.LoadException: failed to send proxy request to TNetworkAddress(hostname:10.8.228.137, port:8060) err failed to send proxy request to TNetworkAddress(hostname:10.8.228.137, port:8060) err [too busy to get kafka info, please check the kafka broker status, or set internal_service_async_thread_num bigger]
at com.starrocks.common.util.KafkaUtil$ProxyAPI.sendProxyRequest(KafkaUtil.java:193) ~[starrocks-fe.jar:?]
at com.starrocks.common.util.KafkaUtil$ProxyAPI.getOffsets(KafkaUtil.java:134) ~[starrocks-fe.jar:?]
at com.starrocks.common.util.KafkaUtil$ProxyAPI.getLatestOffsets(KafkaUtil.java:114) ~[starrocks-fe.jar:?]
at com.starrocks.common.util.KafkaUtil.getLatestOffsets(KafkaUtil.java:66) ~[starrocks-fe.jar:?]
at com.starrocks.load.routineload.KafkaTaskInfo.readyToExecute(KafkaTaskInfo.java:86) ~[starrocks-fe.jar:?]
at com.starrocks.load.routineload.RoutineLoadTaskScheduler.scheduleOneTask(RoutineLoadTaskScheduler.java:173) ~[starrocks-fe.jar:?]
at com.starrocks.load.routineload.RoutineLoadTaskScheduler.lambda$submitToSchedule$1(RoutineLoadTaskScheduler.java:152) ~[starrocks-fe.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_181]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]
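
To see how often this warning occurs and which BE it points at, the FE log can be scanned with a short script. The following is only a rough sketch: the function name `count_busy_warnings` is made up for illustration, and the regular expressions assume the log line layout shown in the excerpt above (timestamp, `WARN`, thread name, then the message). Repeated WARN lines from the same thread at the same timestamp are counted once, and the exception line of the stack trace is skipped.

```python
import re
from collections import Counter

# Assumed layout of an FE WARN line, based on the excerpt above:
# "<timestamp> WARN (<thread>) [<source>] <message>"
HEADER = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) WARN \(([^)]+)\)")
ADDRESS = re.compile(r"TNetworkAddress\(hostname:([^,]+), port:(\d+)\)")
MARKER = "too busy to get kafka info"


def count_busy_warnings(lines):
    """Count 'too busy to get kafka info' warnings per BE host:port.

    `lines` is any iterable of log lines (a list or an open file).
    """
    seen = set()
    counts = Counter()
    for line in lines:
        if MARKER not in line:
            continue
        header = HEADER.match(line)
        address = ADDRESS.search(line)
        if not header or not address:
            # Not a WARN log line, e.g. the exception line of a stack trace.
            continue
        host, port = address.group(1), address.group(2)
        # The same failure is logged more than once by the same thread at the
        # same timestamp; treat those lines as a single event.
        key = (header.group(1), header.group(2), host, port)
        if key in seen:
            continue
        seen.add(key)
        counts[f"{host}:{port}"] += 1
    return counts
```

Passing an open log file, for example `count_busy_warnings(open("fe.log", encoding="utf-8", errors="replace"))`, returns a `Counter` keyed by BE address. If one BE dominates the counts, that is a hint to look at that node and at the Kafka brokers it talks to.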

Adjust be.conf:
internal_service_async_thread_num = 20
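
After editing the file on each BE, it can be worth double-checking that the setting is actually present and not commented out. Below is a minimal sketch, assuming be.conf uses plain `key = value` lines with `#` comments; the helper name `read_conf_value` is hypothetical, and it only reads the file, not the value the running BE process is using. Changes to be.conf generally take effect only after the BE is restarted.

```python
def read_conf_value(conf_text, key):
    """Return the value assigned to `key` in be.conf-style text, or None.

    If the key appears several times, the last assignment is returned.
    """
    value = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, val = line.partition("=")
        if name.strip() == key:
            value = val.strip()
    return value
```

For example, `read_conf_value(open("be.conf").read(), "internal_service_async_thread_num")` should return the string `"20"` once the line above has been added. The path to be.conf depends on where the BE is deployed.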