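[Details] BE nodes crash intermittently during use.
[Background] Neither the crash time nor the affected node is fixed. A node typically crashes two or three times in a row and then returns to normal.
[Business impact] Queries fail.
[StarRocks version] 3.0.2 RELEASE (build c833698)
[Cluster size] 1 FE + 3 BE (FE and BE co-located)
[Machine info] CPU virtual cores / memory / NIC: 32C / 64G / 10 GbE
[Contact] Community group 1 - Techsun-tao
[Attachments]
FE log: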
2023-08-08 20:05:01,891 WARN (starrocks-mysql-nio-pool-14628|4635086) [ReadListener.lambda$handleEvent$0():81] Exception happened in one session(com.starrocks.mysql.nio.NConnectContext@49fde93b).
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_131]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_131]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_131]
at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[?:1.8.0_131]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:1.8.0_131]
at org.xnio.nio.NioSocketConduit.read(NioSocketConduit.java:289) ~[xnio-nio-3.7.9.Final.jar:3.7.9.Final]
at org.xnio.conduits.ConduitStreamSourceChannel.read(ConduitStreamSourceChannel.java:127) ~[xnio-api-3.7.9.Final.jar:3.7.9.Final]
at org.xnio.channels.Channels.readBlocking(Channels.java:294) ~[xnio-api-3.7.9.Final.jar:3.7.9.Final]
at com.starrocks.mysql.nio.NMysqlChannel.realNetRead(NMysqlChannel.java:53) ~[starrocks-fe.jar:?]
at com.starrocks.mysql.MysqlChannel.readAllPlain(MysqlChannel.java:162) ~[starrocks-fe.jar:?]
at com.starrocks.mysql.MysqlChannel.readAll(MysqlChannel.java:155) ~[starrocks-fe.jar:?]
at com.starrocks.mysql.MysqlChannel.fetchOnePacket(MysqlChannel.java:186) ~[starrocks-fe.jar:?]
at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:718) ~[starrocks-fe.jar:?]
at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:69) ~[starrocks-fe.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_131]
Below is the out log from one of the BE nodes:
query_id:b1875506-6ec4-4c41-89f1-b19f284a83e4, fragment_instance:b1875506-6ec4-4c41-89f1-b19f284a83e5
tracker:process consumption: 258461240
tracker:query_pool consumption: 5048
tracker:load consumption: 0
tracker:metadata consumption: 12842785
tracker:tablet_metadata consumption: 4925490
tracker:rowset_metadata consumption: 2533897
tracker:segment_metadata consumption: 1065560
tracker:column_metadata consumption: 4317838
tracker:tablet_schema consumption: 566658
tracker:segment_zonemap consumption: 898058
tracker:short_key_index consumption: 2065
tracker:column_zonemap_index consumption: 1651990
tracker:ordinal_index consumption: 1276248
tracker:bitmap_index consumption: 0
tracker:bloom_filter_index consumption: 0
tracker:compaction consumption: 0
tracker:schema_change consumption: 0
tracker:column_pool consumption: 148372910
tracker:page_cache consumption: 26402160
tracker:update consumption: 0
tracker:chunk_allocator consumption: 6128248
tracker:clone consumption: 0
tracker:consistency consumption: 0
*** Aborted at 1691479371 (unix time) try "date -d @1691479371" if you are using GNU date ***
PC: @ 0x30b08ea starrocks::FixedLengthColumnBase<>::put_mysql_row_buffer()
*** SIGSEGV (@0x7fd274400000) received by PID 118081 (TID 0x7fd20bf68700) from PID 1950351360; stack trace: ***
@ 0x62d7062 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7fd279e7b5d0 (unknown)
@ 0x30b08ea starrocks::FixedLengthColumnBase<>::put_mysql_row_buffer()
@ 0x56af543 starrocks::MapColumn::put_mysql_row_buffer()
@ 0x5671c34 starrocks::ArrayColumn::put_mysql_row_buffer()
@ 0x5a154ec starrocks::MysqlResultWriter::process_chunk()
@ 0x57799d5 starrocks::pipeline::ResultSinkOperator::push_chunk()
@ 0x310fef9 starrocks::pipeline::PipelineDriver::process()
@ 0x578d84b starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x50e5222 starrocks::ThreadPool::dispatch_thread()
@ 0x50dfd1a starrocks::supervise_thread()
@ 0x7fd279e73dd5 start_thread
@ 0x7fd27948eead __clone
@ 0x0 (unknown)
start time: Tue Aug 8 15:38:49 CST 2023
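
To line up the crash with the FE log, the unix time on the `*** Aborted at ...` line can be converted to wall-clock time. This is only a minimal sketch: it assumes that `CST` in the `start time` line means China Standard Time (UTC+8), and the variable names are my own for illustration.

```python
from datetime import datetime, timedelta, timezone

# Value copied from the "*** Aborted at ..." line of the BE out log.
ABORT_UNIX_TIME = 1691479371

# Assumption: "CST" in the log means China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8), name="CST")

# Interpret the unix time as UTC first, then shift to the assumed local zone.
abort_utc = datetime.fromtimestamp(ABORT_UNIX_TIME, tz=timezone.utc)
abort_local = abort_utc.astimezone(CST)

print("abort (UTC):", abort_utc.isoformat())
print("abort (CST):", abort_local.isoformat())
```

Under that assumption the abort lands at 15:22:51 on 2023-08-08, which can then be compared with the `start time` line above and with the timestamps in the FE log.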