FE master crashed, publish timeout: 20000

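To help us locate your problem faster, please provide the following information. Thank you.
[Details] The FE master crashed: it exited abnormally at around 06:56 on 2024-07-09.
[Background] None
[Business impact]
[Shared-data (storage-compute separation)?] No, shared-nothing (storage and compute integrated)
[StarRocks version] 2.5.13
[Cluster size] 3 FE (1 follower + 2 observers) + 5 BE (FE and BE are not co-located)
[Machine specs] FE: 16C/32G; BE: 64C/256G
[Contact] Community group 1 - 桌椅板凳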

Many exceptions like the following appeared:

2024-07-09 06:56:08,697 WARN (nioEventLoopGroup-6-13|168) [TransactionLoadAction.executeWithoutPassword():101] com.starrocks.common.UserException: publish timeout: 20000
    at com.starrocks.transaction.GlobalTransactionMgr.commitPreparedTransaction(GlobalTransactionMgr.java:432)
    at com.starrocks.http.rest.TransactionLoadAction.executeTransaction(TransactionLoadAction.java:153)
    at com.starrocks.http.rest.TransactionLoadAction.executeWithoutPassword(TransactionLoadAction.java:90)
    at com.starrocks.http.rest.RestBaseAction.execute(RestBaseAction.java:85)
    at com.starrocks.http.rest.RestBaseAction.handleRequest(RestBaseAction.java:60)
    at com.starrocks.http.HttpServerHandler.channelRead(HttpServerHandler.java:80)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.handler.codec.MessageToMessageCodec.channelRead(MessageToMessageCodec.java:111)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436)
    at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)
    at io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
    at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)

fe.warn.log (16.4 MB)

Is this similar to the case in https://forum.mirrorship.cn/t/topic/10349 ?

fe.log.20240709-1 (17.7 MB)

This is what the monitoring showed at the time.

The problem persists even after increasing the thread count: thrift_server_max_worker_threads = 8192