FE nodes went down

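To help us locate your problem faster, please provide the following information. Thanks.
[Details] Detailed description of the problem
Connecting to the FE failed with the error "Could not determine master from helpers". We then found that two FE nodes had gone down.
[Background] What operations were performed?
[Business impact]
[Shared-data (storage-compute separation) deployment?]
Three machines in a mixed (co-located) deployment, 1 leader and 2 followers
[StarRocks version] 3.0.8
[Attachments]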

  • fe.log / be.INFO / relevant screenshots
    fe.log

2024-06-18 00:00:23,768 WARN (thrift-server-pool-20533139|20634633) [FrontendServiceImpl.loadTxnCommit():1230] failed to commit txn_id: 84330352: get database write lock timeout, database=fin_test, timeoutMillis=15000
2024-06-18 00:00:24,505 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:24,944 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:26,527 WARN (JournalWriter|22188) [BDBJEJournal.rebuildCurrentTransaction():444] transaction is invalid, rebuild the txn with 2 kvs
2024-06-18 00:00:29,508 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:29,945 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:32,845 WARN (es repository|42) [EsRepository.runAfterCatalogReady():104] Thread es repository: Exception happens when fetch index [journal_info] meta data from remote es cluster. Table info: [Table [id=2754732, name=journal_info, type=ELASTICSEARCH]]
2024-06-18 00:00:34,511 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:34,947 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:36,528 ERROR (JournalWriter|22188) [BDBJEJournal.batchWriteCommit():422] failed to commit journal after retried 2 times! txn[] db[CloseSafeDatabase{db=202876799}]
2024-06-18 00:00:38,599 WARN (thrift-server-pool-20533250|20634745) [Database.logTryLockFailureEvent():175] try db lock failed. type: writeLock, current owner id: 20634436, owner name: thrift-server-pool-20532944, owner stack: dump thread: thrift-server-pool-20532944, id: 20634436
2024-06-18 00:00:38,599 WARN (thrift-server-pool-20533250|20634745) [FrontendServiceImpl.loadTxnCommit():1230] failed to commit txn_id: 84330333: get database write lock timeout, database=fin_test, timeoutMillis=15000
2024-06-18 00:00:38,600 WARN (thrift-server-pool-20533252|20634747) [Database.logTryLockFailureEvent():175] try db lock failed. type: writeLock, current owner id: 20634436, owner name: thrift-server-pool-20532944, owner stack: dump thread: thrift-server-pool-20532944, id: 20634436
2024-06-18 00:00:38,600 WARN (thrift-server-pool-20533252|20634747) [FrontendServiceImpl.loadTxnCommit():1230] failed to commit txn_id: 84330334: get database write lock timeout, database=fin_test, timeoutMillis=15000
2024-06-18 00:00:38,601 WARN (thrift-server-pool-20533253|20634748) [Database.logTryLockFailureEvent():175] try db lock failed. type: writeLock, current owner id: 20634436, owner name: thrift-server-pool-20532944, owner stack: dump thread: thrift-server-pool-20532944, id: 20634436
2024-06-18 00:00:38,601 WARN (thrift-server-pool-20533251|20634746) [Database.logTryLockFailureEvent():175] try db lock failed. type: writeLock, current owner id: 20634436, owner name: thrift-server-pool-20532944, owner stack: dump thread: thrift-server-pool-20532944, id: 20634436
2024-06-18 00:00:38,601 WARN (thrift-server-pool-20533253|20634748) [FrontendServiceImpl.loadTxnCommit():1230] failed to commit txn_id: 84330337: get database write lock timeout, database=fin_test, timeoutMillis=15000
2024-06-18 00:00:38,601 WARN (thrift-server-pool-20533251|20634746) [FrontendServiceImpl.loadTxnCommit():1230] failed to commit txn_id: 84330332: get database write lock timeout, database=fin_test, timeoutMillis=15000
2024-06-18 00:00:38,603 WARN (thrift-server-pool-20533254|20634749) [Database.logTryLockFailureEvent():175] try db lock failed. type: writeLock, current owner id: 20634436, owner name: thrift-server-pool-20532944, owner stack: dump thread: thrift-server-pool-20532944, id: 20634436
2024-06-18 00:00:38,603 WARN (thrift-server-pool-20533254|20634749) [FrontendServiceImpl.loadTxnCommit():1230] failed to commit txn_id: 84330330: get database write lock timeout, database=fin_test, timeoutMillis=15000
2024-06-18 00:00:39,512 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:39,623 WARN (thrift-server-pool-20533261|20634756) [Database.logTryLockFailureEvent():175] try db lock failed. type: writeLock, current owner id: 20634436, owner name: thrift-server-pool-20532944, owner stack: dump thread: thrift-server-pool-20532944, id: 20634436
2024-06-18 00:00:39,624 WARN (thrift-server-pool-20533261|20634756) [FrontendServiceImpl.loadTxnCommit():1230] failed to commit txn_id: 84330336: get database write lock timeout, database=fin_test, timeoutMillis=15000
2024-06-18 00:00:39,949 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:40,135 WARN (thrift-server-pool-20533268|20634763) [Database.logTryLockFailureEvent():175] try db lock failed. type: writeLock, current owner id: 20634436, owner name: thrift-server-pool-20532944, owner stack: dump thread: thrift-server-pool-20532944, id: 20634436
2024-06-18 00:00:40,135 WARN (thrift-server-pool-20533267|20634762) [Database.logTryLockFailureEvent():175] try db lock failed. type: writeLock, current owner id: 20634436, owner name: thrift-server-pool-20532944, owner stack: dump thread: thrift-server-pool-20532944, id: 20634436
2024-06-18 00:00:40,135 WARN (thrift-server-pool-20533268|20634763) [FrontendServiceImpl.loadTxnCommit():1230] failed to commit txn_id: 84330339: get database write lock timeout, database=fin_test, timeoutMillis=15000
2024-06-18 00:00:40,136 WARN (thrift-server-pool-20533267|20634762) [FrontendServiceImpl.loadTxnCommit():1230] failed to commit txn_id: 84330338: get database write lock timeout, database=fin_test, timeoutMillis=15000
2024-06-18 00:00:41,529 WARN (JournalWriter|22188) [BDBJEJournal.rebuildCurrentTransaction():444] transaction is invalid, rebuild the txn with 2 kvs
2024-06-18 00:00:42,182 WARN (thrift-server-pool-20533284|20634779) [Database.logTryLockFailureEvent():175] try db lock failed. type: writeLock, current owner id: 20634436, owner name: thrift-server-pool-20532944, owner stack: dump thread: thrift-server-pool-20532944, id: 20634436
2024-06-18 00:00:42,182 WARN (thrift-server-pool-20533284|20634779) [FrontendServiceImpl.loadTxnCommit():1230] failed to commit txn_id: 84330340: get database write lock timeout, database=fin_test, timeoutMillis=15000
2024-06-18 00:00:43,079 WARN (es repository|42) [EsRepository.runAfterCatalogReady():104] Thread es repository: Exception happens when fetch index [journal_info] meta data from remote es cluster. Table info: [Table [id=2754732, name=journal_info, type=ELASTICSEARCH]]
2024-06-18 00:00:44,514 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:44,951 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:49,516 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:49,953 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
2024-06-18 00:00:51,530 ERROR (JournalWriter|22188) [BDBJEJournal.batchWriteCommit():422] failed to commit journal after retried 3 times! txn[] db[CloseSafeDatabase{db=202876799}]
2024-06-18 00:00:51,530 WARN (JournalWriter|22188) [JournalWriter.writeOneBatch():133] failed to commit batch, will abort current 2 journals.
2024-06-18 00:00:51,531 WARN (JournalWriter|22188) [BDBJEJournal.batchWriteAbort():480] failed to abort transaction because no running transaction, will just ignore and return.
2024-06-18 00:00:51,531 ERROR (JournalWriter|22188) [JournalWriter.abortJournalTask():176] failed to commit journal after retried 3 times! txn[] db[CloseSafeDatabase{db=202876799}]

fe.out
WARNING: correlationId:5923464 timeout with bound channel =>[id: 0x7847afe0, L:/10.118.1.183:42162 - R:/10.118.1.180:8060]
[2024-06-18 00:00:51] failed to commit journal after retried 3 times! txn[] db[CloseSafeDatabase{db=202876799}]
using java version 8
-Xmx16384m -XX:+UseMembar -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=7 -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSClassUnloadingEnabled -XX:-CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=80 -XX:SoftRefLRUPolicyMSPerMB=0 -Xloggc:/home/qateadmin/StarRocks-3.1.4/fe/log/fe.gc.log.20240618-102010ll

Please provide the complete fe.log from before and after the crash, along with the output of SHOW FRONTENDS.

The complete fe.log is as follows:
2024-06-17 23:59:59,494 INFO (pool-19-thread-6|22288) [StreamLoadPlanner.plan():282] load job id: d218b119-6054-4870-908b-d6c549b0ce5a tx id 84330349 parallel 0 compress NO_COMPRESSION replicated false quorum MAJORITY
2024-06-17 23:59:59,495 INFO (thrift-server-pool-20530991|20632465) [FrontendServiceImpl.loadTxnBegin():1114] receive txn begin request, db: fin_test, tbl: ft_beta_log, label: -7ff7a6e3-136f-49ed-ad95-2c10bf412380, backend: 10.118.1.181
2024-06-17 23:59:59,495 INFO (thrift-server-pool-20530991|20632465) [DatabaseTransactionMgr.beginTransaction():313] begin transaction: txn_id: 84330350 with label -7ff7a6e3-136f-49ed-ad95-2c10bf412380 from coordinator BE: 10.118.1.181, listner id: -1
2024-06-17 23:59:59,495 INFO (thrift-server-pool-20531763|20633237) [FrontendServiceImpl.streamLoadPut():1533] receive stream load put request. db:fin_test, tbl: ft_beta_log, txn_id: 84330350, load id: b045844f-665f-b474-3a19-f3d83aca3886, backend: 10.118.1.181
2024-06-17 23:59:59,512 INFO (thrift-server-pool-20531763|20633237) [StreamLoadPlanner.plan():282] load job id: b045844f-665f-b474-3a19-f3d83aca3886 tx id 84330350 parallel 0 compress NO_COMPRESSION replicated false quorum MAJORITY
2024-06-17 23:59:59,907 INFO (thrift-server-pool-20532924|20634416) [FrontendServiceImpl.loadTxnCommit():1212] receive txn commit request. db: fin_test, tbl: ft_beta_log, txn_id: 84330327, backend: 10.118.1.181
2024-06-17 23:59:59,914 INFO (thrift-server-pool-20532924|20634416) [DatabaseTransactionMgr.commitTransaction():462] transaction:[TransactionState. txn_id: 84330327,label: -46a7c81b-b74c-4712-8fcb-af9de73473ec, db id: 12003, table id list: 5155900, callback id: -1, coordinator: BE: 10.118.1.181, transaction status: COMMITTED, error replicas num: 0, replica ids: , prepare time: 1718639988282, commit time: 1718639999911, finish time: -1, write cost: 11629ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@365294b4] successfully committed
2024-06-17 23:59:59,915 INFO (PUBLISH_VERSION|34) [PublishVersionDaemon.publishVersionForOlapTable():171] send publish tasks for txn_id: 84330327
2024-06-17 23:59:59,936 INFO (starrocks-mysql-nio I/O-1|139) [AcceptListener.handleEvent():79] Connection established. remote=/10.118.130.196:30683, connectionId=981701
2024-06-17 23:59:59,936 INFO (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():87] Connection scheduled to worker thread 20538036. remote=/10.118.130.196:30683, connectionId=981701
2024-06-17 23:59:59,936 WARN (starrocks-mysql-nio-pool-114|20538036) [AcceptListener.lambda$handleEvent$1():116] connect processor exception because
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[?:1.8.0_91]
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[?:1.8.0_91]
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[?:1.8.0_91]
at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[?:1.8.0_91]
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[?:1.8.0_91]
at org.xnio.nio.NioSocketConduit.write(NioSocketConduit.java:153) ~[xnio-nio-3.8.10.Final.jar:3.8.10.Final]
at org.xnio.conduits.ConduitStreamSinkChannel.write(ConduitStreamSinkChannel.java:150) ~[xnio-api-3.8.10.Final.jar:3.8.10.Final]
at org.xnio.channels.Channels.writeBlocking(Channels.java:147) ~[xnio-api-3.8.10.Final.jar:3.8.10.Final]
at com.starrocks.mysql.nio.NMysqlChannel.realNetSend(NMysqlChannel.java:65) ~[starrocks-fe.jar:?]
at com.starrocks.mysql.MysqlChannel.send(MysqlChannel.java:234) ~[starrocks-fe.jar:?]
at com.starrocks.mysql.MysqlChannel.flush(MysqlChannel.java:257) ~[starrocks-fe.jar:?]
at com.starrocks.mysql.MysqlChannel.sendAndFlush(MysqlChannel.java:325) ~[starrocks-fe.jar:?]
at com.starrocks.mysql.MysqlProto.negotiate(MysqlProto.java:134) ~[starrocks-fe.jar:?]
at com.starrocks.mysql.nio.AcceptListener.lambda$handleEvent$1(AcceptListener.java:91) ~[starrocks-fe.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_91]

The output of SHOW FRONTENDS is:
Could not determine master from helpers

Was this caused by an FE OOM? If so, where does the FE memory pressure come from? The nodes went down while we were testing Stream Load.
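The log alone does not confirm or rule out an OOM, but it can help to see which warnings dominate in the minute before the failure. Below is a minimal sketch that tallies WARN/ERROR lines by source method and extracts the thread named as the database write-lock owner. The regular expression is an assumption based only on the line layout visible in the fe.log excerpt in this thread, and `LINE_RE` and `summarize` are hypothetical names, not part of StarRocks.

```python
import re
from collections import Counter

# Assumed layout, inferred from the fe.log lines in this thread:
# "<date> <time>,<ms> LEVEL (thread|id) [Class.method():line] message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) \((?P<thread>[^)]*)\) "
    r"\[(?P<source>[^\]]+?)\(\):\d+\] (?P<msg>.*)$"
)


def summarize(lines):
    """Count WARN/ERROR lines per (level, source) and per lock-owner thread."""
    counts = Counter()
    owners = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        # Stack-trace lines and INFO lines are skipped.
        if not m or m.group("level") not in ("WARN", "ERROR"):
            continue
        counts[(m.group("level"), m.group("source"))] += 1
        owner = re.search(r"owner name: ([^,]+)", m.group("msg"))
        if owner:
            owners[owner.group(1)] += 1
    return counts, owners


if __name__ == "__main__":
    # Hypothetical path; point it at the fe.log of the failed node.
    with open("fe.log", encoding="utf-8", errors="replace") as f:
        counts, owners = summarize(f)
    for key, n in counts.most_common(10):
        print(n, *key)
    print("lock owners:", dict(owners))
```

In the excerpt posted here, every lock failure names the same owner thread, thrift-server-pool-20532944, while JournalWriter repeatedly fails to commit, so those two sources are probably the first places to look.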

The total number of tablets on the BE nodes is under 80,000.

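  1. Are all three FE nodes alive right now?
  2. Run cat fe/meta/image/VERSION on each node and check whether the three FEs have the same clusterId.
  3. When you deployed, were the follower nodes added using --helper?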
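The clusterId comparison in step 2 can be scripted once the contents of the three VERSION files have been collected. This is a minimal sketch that assumes the file is a properties-style text file containing a `clusterId=...` line; verify that against your own files before relying on it. The function names and the host labels are hypothetical.

```python
def parse_version(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def cluster_ids_match(version_texts):
    """version_texts maps a host label to the content of its VERSION file.

    Returns (ok, ids): ok is True only if every host reports a clusterId
    and all of them are identical.
    """
    ids = {host: parse_version(text).get("clusterId")
           for host, text in version_texts.items()}
    distinct = set(ids.values())
    ok = len(distinct) == 1 and None not in distinct
    return ok, ids


if __name__ == "__main__":
    # Hypothetical contents; paste the real output of
    # `cat fe/meta/image/VERSION` from each FE here.
    texts = {
        "fe1": "#comment\nclusterId=111\ntoken=abc\n",
        "fe2": "clusterId=111\ntoken=abc\n",
        "fe3": "clusterId=222\ntoken=xyz\n",
    }
    ok, ids = cluster_ids_match(texts)
    print("match" if ok else "MISMATCH", ids)
```

A mismatch would suggest that at least one FE was initialized as a separate cluster instead of joining the existing one.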

Is there a recovery procedure for this issue that you could share?

I found a way to recover, so I am sharing it here: https://docs.starrocks.io/zh/docs/administration/Meta_recovery/#7-最终应急方案