3.1.5 shared-data (storage-compute separated) cluster: FE cannot start after crashing

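To help us locate your problem faster, please provide the following information. Thank you.
【Details】Detailed description of the problem
【Background】What operations were performed?
【Business impact】
【Shared-data (storage-compute separation)?】
【StarRocks version】e.g. 3.1.5
【Cluster size】e.g. 1 FE (1 follower) + 2 BEs (FE and BE co-located)
【Machine info】vCPUs / memory / NIC, e.g. 16C/64G/10GbE
【Contact】Community group 3 - 杨荣
【Attachments】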

  • fe.log / be.INFO / relevant screenshots
    Log from the FE failing to start:

2023-12-11 15:06:42,843 WARN (stateChangeExecutor|79) [GlobalStateMgr.replayJournalInner():2319] catch exception when replaying 2272521,
com.starrocks.journal.JournalInconsistentException: failed to load journal type 10001
at com.starrocks.persist.EditLog.loadJournal(EditLog.java:1089) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:2306) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:2255) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1216) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.access$100(GlobalStateMgr.java:340) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:777) ~[starrocks-fe.jar:?]
at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) ~[starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) ~[starrocks-fe.jar:?]
Caused by: java.lang.NullPointerException
at com.starrocks.alter.AlterJobMgr.swapTableInternal(AlterJobMgr.java:1122) ~[starrocks-fe.jar:?]
at com.starrocks.alter.AlterJobMgr.replaySwapTable(AlterJobMgr.java:1106) ~[starrocks-fe.jar:?]
at com.starrocks.persist.EditLog.loadJournal(EditLog.java:894) ~[starrocks-fe.jar:?]
… 7 more
2023-12-11 15:06:42,847 WARN (stateChangeExecutor|79) [GlobalStateMgr.replayJournal():2257] got interrupt exception or inconsistent exception when replay journal 2272521, will exit,
com.starrocks.journal.JournalInconsistentException: failed to load journal type 10001
at com.starrocks.persist.EditLog.loadJournal(EditLog.java:1089) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:2306) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:2255) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1216) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.access$100(GlobalStateMgr.java:340) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:777) ~[starrocks-fe.jar:?]
at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) ~[starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) ~[starrocks-fe.jar:?]
Caused by: java.lang.NullPointerException
at com.starrocks.alter.AlterJobMgr.swapTableInternal(AlterJobMgr.java:1122) ~[starrocks-fe.jar:?]
at com.starrocks.alter.AlterJobMgr.replaySwapTable(AlterJobMgr.java:1106) ~[starrocks-fe.jar:?]
at com.starrocks.persist.EditLog.loadJournal(EditLog.java:894) ~[starrocks-fe.jar:?]
… 7 more
2023-12-11 15:06:51,363 INFO (main|1) [StarRocksFE.start():129] StarRocks FE starting, version: 3.1.5-5d8438a

I was testing the materialized view feature today, so my guess is that the problem appeared after creating a materialized view.

After setting metadata_journal_skip_bad_journal_ids=xxx in fe.conf,
the FE temporarily skipped the problematic journal ID.
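
For anyone hitting the same replay failure, the journal ID to skip can be read from the WARN lines in the log above. Below is a minimal Python sketch that pulls those IDs out of fe.log text and prints a candidate fe.conf line. The function names are hypothetical, the regular expression is based only on the two WARN messages shown in this thread, and joining several IDs with commas is an assumption. Treat the output as a starting point to verify, not as an official recovery procedure, since skipping a journal entry discards that metadata operation.

```python
import re

# Patterns copied from the two WARN messages in the log above; other
# StarRocks versions may word these messages differently (assumption).
REPLAY_FAILURE = re.compile(
    r"catch exception when replaying (\d+)"
    r"|inconsistent exception when replay journal (\d+)"
)


def find_bad_journal_ids(log_text):
    """Return distinct failing journal IDs in order of first appearance."""
    ids = []
    for match in REPLAY_FAILURE.finditer(log_text):
        # Exactly one of the two capture groups is set per match.
        journal_id = match.group(1) or match.group(2)
        if journal_id not in ids:
            ids.append(journal_id)
    return ids


def suggested_conf_line(ids):
    """Build a candidate fe.conf line (comma-separated list is an assumption)."""
    return "metadata_journal_skip_bad_journal_ids=" + ",".join(ids)
```

To use it, read the FE log into a string (for example `open("fe.log").read()`), pass it to `find_bad_journal_ids`, and pass the result to `suggested_conf_line`. Back up the FE meta directory before restarting with the new setting.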

Could some FE configuration parameter have caused this problem?

Please send the complete fe.log.

No parameters were configured. I have sent the log file to you over WeChat; please check.

How many FEs are in the cluster?

1 FE