3.0.0 shared-data cluster (k8s): FE fails to start

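I deployed a StarRocks 3.0.0 shared-data (storage-compute separated) cluster with the k8s operator. To save effort, I restarted the cluster directly with kubectl delete -f yaml followed by kubectl apply -f yaml. As a result, the BEs started normally, but the FE keeps restarting.

YAML file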

fe.out:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/starrocks/fe/lib/spark-dpp-1.0.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/starrocks/fe/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
[2023-05-12 02:53:20] notify new FE type transfer: UNKNOWN
[2023-05-12 02:53:21] notify new FE type transfer: FOLLOWER
[2023-05-12 02:53:23] notify new FE type transfer: UNKNOWN
May 12, 2023 2:53:26 AM com.baidu.jprotobuf.pbrpc.client.ProtobufRpcProxy proxy
INFO: Use global share channel pool to create protobuf RPC proxy with interface interface com.starrocks.rpc.PBackendService
May 12, 2023 2:53:26 AM com.baidu.jprotobuf.pbrpc.client.ProtobufRpcProxy proxy
INFO: Use global share channel pool to create protobuf RPC proxy with interface interface com.starrocks.rpc.PBackendService
May 12, 2023 2:53:26 AM com.baidu.jprotobuf.pbrpc.client.ProtobufRpcProxy proxy
INFO: Use global share channel pool to create protobuf RPC proxy with interface interface com.starrocks.rpc.PBackendService
May 12, 2023 2:53:31 AM com.baidu.jprotobuf.pbrpc.client.ProtobufRpcProxy proxy
INFO: Use global share channel pool to create protobuf RPC proxy with interface interface com.starrocks.rpc.LakeService
May 12, 2023 2:53:35 AM com.baidu.jprotobuf.pbrpc.client.ProtobufRpcProxy proxy
INFO: Use global share channel pool to create protobuf RPC proxy with interface interface com.starrocks.rpc.PBackendService
May 12, 2023 2:53:35 AM com.baidu.jprotobuf.pbrpc.client.ProtobufRpcProxy proxy
INFO: Use global share channel pool to create protobuf RPC proxy with interface interface com.starrocks.rpc.PBackendService
May 12, 2023 2:53:36 AM com.github.benmanes.caffeine.cache.LocalAsyncCache$AsyncBulkCompleter accept
WARNING: Exception thrown during asynchronous load
java.util.concurrent.CompletionException: com.starrocks.sql.analyzer.SemanticException: Statistics query fail | Error Message [INTERNAL_ERROR] | {} | SQL [2a7f912f-f070-11ed-b3f6-00163e0a97d5]
at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1702)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: com.starrocks.sql.analyzer.SemanticException: Statistics query fail | Error Message [INTERNAL_ERROR] | {} | SQL [2a7f912f-f070-11ed-b3f6-00163e0a97d5]
at com.starrocks.statistic.StatisticExecutor.executeDQL(StatisticExecutor.java:290)
at com.starrocks.statistic.StatisticExecutor.executeStatisticDQL(StatisticExecutor.java:273)
at com.starrocks.statistic.StatisticExecutor.queryTableStats(StatisticExecutor.java:182)
at com.starrocks.sql.optimizer.statistics.TableStatsCacheLoader.queryStatisticsData(TableStatsCacheLoader.java:92)
at com.starrocks.sql.optimizer.statistics.TableStatsCacheLoader.lambda$asyncLoadAll$1(TableStatsCacheLoader.java:68)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
... 6 more

May 12, 2023 2:53:36 AM com.github.benmanes.caffeine.cache.LocalAsyncCache$AsyncBulkCompleter accept
WARNING: Exception thrown during asynchronous load
java.util.concurrent.CompletionException: com.starrocks.sql.analyzer.SemanticException: Statistics query fail | Error Message [INTERNAL_ERROR] | {} | SQL [2a807b90-f070-11ed-b3f6-00163e0a97d5]
at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1702)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: com.starrocks.sql.analyzer.SemanticException: Statistics query fail | Error Message [INTERNAL_ERROR] | {} | SQL [2a807b90-f070-11ed-b3f6-00163e0a97d5]
at com.starrocks.statistic.StatisticExecutor.executeDQL(StatisticExecutor.java:290)
at com.starrocks.statistic.StatisticExecutor.executeStatisticDQL(StatisticExecutor.java:273)
at com.starrocks.statistic.StatisticExecutor.queryTableStats(StatisticExecutor.java:182)
at com.starrocks.sql.optimizer.statistics.TableStatsCacheLoader.queryStatisticsData(TableStatsCacheLoader.java:92)
at com.starrocks.sql.optimizer.statistics.TableStatsCacheLoader.lambda$asyncLoadAll$1(TableStatsCacheLoader.java:68)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
... 6 more

May 12, 2023 2:53:36 AM com.github.benmanes.caffeine.cache.LocalAsyncCache$AsyncBulkCompleter accept
WARNING: Exception thrown during asynchronous load
java.util.concurrent.CompletionException: com.starrocks.sql.analyzer.SemanticException: Statistics query fail | Error Message [INTERNAL_ERROR] | {} | SQL [2a870b41-f070-11ed-b3f6-00163e0a97d5]
at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1702)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: com.starrocks.sql.analyzer.SemanticException: Statistics query fail | Error Message [INTERNAL_ERROR] | {} | SQL [2a870b41-f070-11ed-b3f6-00163e0a97d5]
at com.starrocks.statistic.StatisticExecutor.executeDQL(StatisticExecutor.java:290)
at com.starrocks.statistic.StatisticExecutor.executeStatisticDQL(StatisticExecutor.java:273)
at com.starrocks.statistic.StatisticExecutor.queryStatisticSync(StatisticExecutor.java:99)
at com.starrocks.sql.optimizer.statistics.ColumnBasicStatsCacheLoader.queryStatisticsData(ColumnBasicStatsCacheLoader.java:124)
at com.starrocks.sql.optimizer.statistics.ColumnBasicStatsCacheLoader.lambda$asyncLoadAll$1(ColumnBasicStatsCacheLoader.java:90)
at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
... 6 more

fe.log

2023-05-12 04:22:35,691 WARN (ReportHandler|102) [NodeMgr.updateResourceUsage():1095] UpdateResourceUsage to remote fe: starrockscluster-sample-fe-1.starrockscluster-sample-fe-search.starrocks.svc.cluster.local failed
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.thrift.transport.TSocket.open(TSocket.java:226) ~[libthrift-0.13.0.jar:0.13.0]
at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:144) ~[starrocks-fe.jar:?]
at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:129) ~[starrocks-fe.jar:?]
at org.apache.commons.pool2.BaseKeyedPooledObjectFactory.makeObject(BaseKeyedPooledObjectFactory.java:62) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.create(GenericKeyedObjectPool.java:1036) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:356) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:278) ~[commons-pool2-2.3.jar:2.3]
at com.starrocks.common.GenericPool.borrowObject(GenericPool.java:105) ~[starrocks-fe.jar:?]
at com.starrocks.rpc.FrontendServiceProxy.call(FrontendServiceProxy.java:33) ~[starrocks-fe.jar:?]
at com.starrocks.server.NodeMgr.updateResourceUsage(NodeMgr.java:1087) [starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.updateResourceUsage(GlobalStateMgr.java:3524) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.resourceUsageReport(ReportHandler.java:537) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.access$500(ReportHandler.java:119) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler$ReportTask.exec(ReportHandler.java:357) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.runOneCycle(ReportHandler.java:1473) [starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412) ~[?:?]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255) ~[?:?]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237) ~[?:?]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:?]
at java.net.Socket.connect(Socket.java:609) ~[?:?]
at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[libthrift-0.13.0.jar:0.13.0]
... 15 more
2023-05-12 04:22:35,692 WARN (ReportHandler|102) [NodeMgr.updateResourceUsage():1095] UpdateResourceUsage to remote fe: starrockscluster-sample-fe-0.starrockscluster-sample-fe-search.starrocks.svc.cluster.local failed
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.thrift.transport.TSocket.open(TSocket.java:226) ~[libthrift-0.13.0.jar:0.13.0]
at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:144) ~[starrocks-fe.jar:?]
at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:129) ~[starrocks-fe.jar:?]
at org.apache.commons.pool2.BaseKeyedPooledObjectFactory.makeObject(BaseKeyedPooledObjectFactory.java:62) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.create(GenericKeyedObjectPool.java:1036) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:356) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:278) ~[commons-pool2-2.3.jar:2.3]
at com.starrocks.common.GenericPool.borrowObject(GenericPool.java:105) ~[starrocks-fe.jar:?]
at com.starrocks.rpc.FrontendServiceProxy.call(FrontendServiceProxy.java:33) ~[starrocks-fe.jar:?]
at com.starrocks.server.NodeMgr.updateResourceUsage(NodeMgr.java:1087) [starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.updateResourceUsage(GlobalStateMgr.java:3524) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.resourceUsageReport(ReportHandler.java:537) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.access$500(ReportHandler.java:119) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler$ReportTask.exec(ReportHandler.java:357) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.runOneCycle(ReportHandler.java:1473) [starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412) ~[?:?]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255) ~[?:?]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237) ~[?:?]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:?]
at java.net.Socket.connect(Socket.java:609) ~[?:?]
at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[libthrift-0.13.0.jar:0.13.0]
... 15 more
2023-05-12 04:22:35,747 WARN (ReportHandler|102) [NodeMgr.updateResourceUsage():1095] UpdateResourceUsage to remote fe: starrockscluster-sample-fe-1.starrockscluster-sample-fe-search.starrocks.svc.cluster.local failed
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.thrift.transport.TSocket.open(TSocket.java:226) ~[libthrift-0.13.0.jar:0.13.0]
at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:144) ~[starrocks-fe.jar:?]
at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:129) ~[starrocks-fe.jar:?]
at org.apache.commons.pool2.BaseKeyedPooledObjectFactory.makeObject(BaseKeyedPooledObjectFactory.java:62) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.create(GenericKeyedObjectPool.java:1036) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:356) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:278) ~[commons-pool2-2.3.jar:2.3]
at com.starrocks.common.GenericPool.borrowObject(GenericPool.java:105) ~[starrocks-fe.jar:?]
at com.starrocks.rpc.FrontendServiceProxy.call(FrontendServiceProxy.java:33) ~[starrocks-fe.jar:?]
at com.starrocks.server.NodeMgr.updateResourceUsage(NodeMgr.java:1087) [starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.updateResourceUsage(GlobalStateMgr.java:3524) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.resourceUsageReport(ReportHandler.java:537) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.access$500(ReportHandler.java:119) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler$ReportTask.exec(ReportHandler.java:357) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.runOneCycle(ReportHandler.java:1473) [starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412) ~[?:?]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255) ~[?:?]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237) ~[?:?]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:?]
at java.net.Socket.connect(Socket.java:609) ~[?:?]
at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[libthrift-0.13.0.jar:0.13.0]
... 15 more
2023-05-12 04:22:35,748 WARN (ReportHandler|102) [NodeMgr.updateResourceUsage():1095] UpdateResourceUsage to remote fe: starrockscluster-sample-fe-0.starrockscluster-sample-fe-search.starrocks.svc.cluster.local failed
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.thrift.transport.TSocket.open(TSocket.java:226) ~[libthrift-0.13.0.jar:0.13.0]
at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:144) ~[starrocks-fe.jar:?]
at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:129) ~[starrocks-fe.jar:?]
at org.apache.commons.pool2.BaseKeyedPooledObjectFactory.makeObject(BaseKeyedPooledObjectFactory.java:62) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.create(GenericKeyedObjectPool.java:1036) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:356) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:278) ~[commons-pool2-2.3.jar:2.3]
at com.starrocks.common.GenericPool.borrowObject(GenericPool.java:105) ~[starrocks-fe.jar:?]
at com.starrocks.rpc.FrontendServiceProxy.call(FrontendServiceProxy.java:33) ~[starrocks-fe.jar:?]
at com.starrocks.server.NodeMgr.updateResourceUsage(NodeMgr.java:1087) [starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.updateResourceUsage(GlobalStateMgr.java:3524) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.resourceUsageReport(ReportHandler.java:537) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.access$500(ReportHandler.java:119) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler$ReportTask.exec(ReportHandler.java:357) [starrocks-fe.jar:?]
at com.starrocks.leader.ReportHandler.runOneCycle(ReportHandler.java:1473) [starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:?]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412) ~[?:?]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255) ~[?:?]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237) ~[?:?]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:?]
at java.net.Socket.connect(Socket.java:609) ~[?:?]
at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[libthrift-0.13.0.jar:0.13.0]
... 15 more
2023-05-12 04:22:35,765 INFO (thrift-server-pool-28|219) [QeProcessorImpl.reportExecStatus():145] ReportExecStatus() failed, query does not exist, fragment_instance_id=9f3a4392-f07c-11ed-b95e-00163e0cb0bc, query_id=9f3a4392-f07c-11ed-b95e-00163e0cb0b5,
2023-05-12 04:22:35,765 INFO (ForkJoinPool.commonPool-worker-7|88) [QeProcessorImpl.unregisterQuery():105] deregister query id 9f3a4392-f07c-11ed-b95e-00163e0cb0b5
2023-05-12 04:22:35,765 INFO (thrift-server-pool-32|223) [QeProcessorImpl.reportExecStatus():145] ReportExecStatus() failed, query does not exist, fragment_instance_id=9f3a4392-f07c-11ed-b95e-00163e0cb0b7, query_id=9f3a4392-f07c-11ed-b95e-00163e0cb0b5,
2023-05-12 04:22:35,766 INFO (thrift-server-pool-31|222) [QeProcessorImpl.reportExecStatus():145] ReportExecStatus() failed, query does not exist, fragment_instance_id=9f3a4392-f07c-11ed-b95e-00163e0cb0bd, query_id=9f3a4392-f07c-11ed-b95e-00163e0cb0b5,
2023-05-12 04:22:35,766 INFO (thrift-server-pool-34|225) [QeProcessorImpl.reportExecStatus():145] ReportExecStatus() failed, query does not exist, fragment_instance_id=9f3a4392-f07c-11ed-b95e-00163e0cb0bb, query_id=9f3a4392-f07c-11ed-b95e-00163e0cb0b5,
2023-05-12 04:22:35,847 INFO (leaderCheckpointer|149) [LocalMetastore.replayCreateDb():421] finish replay create db, name: h_cache_asyn, id: 10834
2023-05-12 04:22:35,847 INFO (leaderCheckpointer|149) [LocalMetastore.replayCreateDb():421] finish replay create db, name: h_cache_sync, id: 10835
2023-05-12 04:22:35,848 INFO (leaderCheckpointer|149) [LocalMetastore.replayCreateDb():421] finish replay create db, name: no_cache_asyn, id: 10836

fe.log (attachment)

fe-00.log (5.9 MB) fe-01.log (5.5 MB)

Please package and upload the logs under /opt/starrocks/fe/log/ from all three FE pods so we can analyze the problem.
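One way to gather those logs is sketched below. It only wraps kubectl cp, so it assumes kubectl is on the PATH and that tar is available inside the FE container. The namespace and the fe-0/fe-1 pod names are read off the hostnames in fe.log; the fe-2 name and the local fe-logs directory are assumptions made for illustration.

```python
import shutil
import subprocess
from pathlib import Path

NAMESPACE = "starrocks"  # taken from the FE hostnames that appear in fe.log
# fe-0 and fe-1 appear in fe.log; fe-2 is ASSUMED for a three-FE cluster.
FE_PODS = [f"starrockscluster-sample-fe-{i}" for i in range(3)]
LOG_DIR = "/opt/starrocks/fe/log"
LOCAL_DIR = "fe-logs"  # hypothetical local target directory


def build_copy_command(pod: str, namespace: str = NAMESPACE) -> list[str]:
    """Return the kubectl command that copies one pod's FE log directory."""
    return ["kubectl", "cp", f"{namespace}/{pod}:{LOG_DIR}", f"./{LOCAL_DIR}/{pod}"]


def collect_logs(pods=FE_PODS) -> str:
    """Copy the log directory of every pod, then pack everything into one archive."""
    Path(LOCAL_DIR).mkdir(exist_ok=True)
    for pod in pods:
        # check=False: a pod that is mid-restart may refuse the copy; keep going.
        subprocess.run(build_copy_command(pod), check=False)
    # Produces fe-logs.tar.gz next to the fe-logs directory.
    return shutil.make_archive(LOCAL_DIR, "gztar", LOCAL_DIR)
```

Calling collect_logs() from a machine with cluster access should leave an fe-logs.tar.gz that can be attached to the thread. A pod that is crash-looping may not stay up long enough for the copy to finish, in which case the copy for that pod has to be retried.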

Under normal circumstances, kubectl delete -f followed by kubectl apply -f must not be used as a substitute for a restart.
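This thread does not say which restart procedure is recommended instead. A commonly used Kubernetes pattern, shown here only as a sketch, is to delete the FE pods one at a time and let their controller recreate them, which leaves the cluster resource and its volumes in place. It assumes the FE pods are managed by a StatefulSet and that FE metadata lives on persistent storage; the pod names are the ones seen in fe.log.

```python
def build_pod_restart_commands(pods: list[str], namespace: str) -> list[list[str]]:
    """Return one 'kubectl delete pod' command per pod, in the order given.

    Deleting a pod (not the whole manifest) lets its controller recreate it
    with the same name. ASSUMPTION: the pods belong to a StatefulSet.
    """
    return [["kubectl", "delete", "pod", pod, "-n", namespace] for pod in pods]


commands = build_pod_restart_commands(
    ["starrockscluster-sample-fe-0", "starrockscluster-sample-fe-1"],
    namespace="starrocks",
)
for cmd in commands:
    # Printed rather than executed: run them one at a time and confirm the
    # recreated pod is Ready before moving on to the next one.
    print(" ".join(cmd))
```

Restarting one pod at a time keeps a majority of FE nodes available while each one comes back.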

Logs from nodes 0 and 1 have been uploaded.

2023-05-12 11:01:24,846 WARN (leaderCheckpointer|90) [GlobalStateMgr.replayJournalInner():2012] catch exception when replaying 19003,
com.starrocks.journal.JournalInconsistentException: failed to load journal type 118
    at com.starrocks.persist.EditLog.loadJournal(EditLog.java:981) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:2001) [starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:1953) [starrocks-fe.jar:?]
    at com.starrocks.leader.Checkpoint.replayAndGenerateGlobalStateMgrImage(Checkpoint.java:215) [starrocks-fe.jar:?]
    at com.starrocks.leader.Checkpoint.runAfterCatalogReady(Checkpoint.java:106) [starrocks-fe.jar:?]
    at com.starrocks.common.util.LeaderDaemon.runOneCycle(LeaderDaemon.java:73) [starrocks-fe.jar:?]
    at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
Caused by: java.lang.NullPointerException
    at com.starrocks.lake.StarOSAgent.getServiceId(StarOSAgent.java:101) ~[starrocks-fe.jar:?]
    at com.starrocks.lake.StarOSAgent.prepare(StarOSAgent.java:94) ~[starrocks-fe.jar:?]
    at com.starrocks.lake.StarOSAgent.getShardReplicas(StarOSAgent.java:393) ~[starrocks-fe.jar:?]
    at com.starrocks.lake.StarOSAgent.getBackendIdsByShard(StarOSAgent.java:444) ~[starrocks-fe.jar:?]
    at com.starrocks.lake.LakeTablet.getBackendIds(LakeTablet.java:88) ~[starrocks-fe.jar:?]
    at com.starrocks.server.LocalMetastore.truncateTableInternal(LocalMetastore.java:4833) ~[starrocks-fe.jar:?]
    at com.starrocks.server.LocalMetastore.replayTruncateTable(LocalMetastore.java:4862) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.replayTruncateTable(GlobalStateMgr.java:3520) ~[starrocks-fe.jar:?]
    at com.starrocks.persist.EditLog.loadJournal(EditLog.java:574) ~[starrocks-fe.jar:?]
    ... 6 more
2023-05-12 11:01:24,846 WARN (leaderCheckpointer|90) [GlobalStateMgr.replayJournal():1955] got interrupt exception or inconsistent exception when replay journal 19003, will exit,
com.starrocks.journal.JournalInconsistentException: failed to load journal type 118
    at com.starrocks.persist.EditLog.loadJournal(EditLog.java:981) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:2001) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:1953) [starrocks-fe.jar:?]
    at com.starrocks.leader.Checkpoint.replayAndGenerateGlobalStateMgrImage(Checkpoint.java:215) [starrocks-fe.jar:?]
    at com.starrocks.leader.Checkpoint.runAfterCatalogReady(Checkpoint.java:106) [starrocks-fe.jar:?]
    at com.starrocks.common.util.LeaderDaemon.runOneCycle(LeaderDaemon.java:73) [starrocks-fe.jar:?]
    at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
Caused by: java.lang.NullPointerException
    at com.starrocks.lake.StarOSAgent.getServiceId(StarOSAgent.java:101) ~[starrocks-fe.jar:?]
    at com.starrocks.lake.StarOSAgent.prepare(StarOSAgent.java:94) ~[starrocks-fe.jar:?]
    at com.starrocks.lake.StarOSAgent.getShardReplicas(StarOSAgent.java:393) ~[starrocks-fe.jar:?]
    at com.starrocks.lake.StarOSAgent.getBackendIdsByShard(StarOSAgent.java:444) ~[starrocks-fe.jar:?]
    at com.starrocks.lake.LakeTablet.getBackendIds(LakeTablet.java:88) ~[starrocks-fe.jar:?]
    at com.starrocks.server.LocalMetastore.truncateTableInternal(LocalMetastore.java:4833) ~[starrocks-fe.jar:?]
    at com.starrocks.server.LocalMetastore.replayTruncateTable(LocalMetastore.java:4862) ~[starrocks-fe.jar:?]
    at com.starrocks.server.GlobalStateMgr.replayTruncateTable(GlobalStateMgr.java:3520) ~[starrocks-fe.jar:?]
    at com.starrocks.persist.EditLog.loadJournal(EditLog.java:574) ~[starrocks-fe.jar:?]
    ... 6 more
[2023-05-12 11:01:24] failed to load journal type 118
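With fe.log files of several megabytes, the relevant lines are easy to miss. The following sketch scans a log for the two symptoms seen in this thread: the journal replay failure and the refused connections to peer FEs. The regular expressions are written against the exact messages quoted above and are not an official log format, so they may need adjusting for other versions.

```python
import re
from collections import Counter

# Patterns copied from the messages quoted in this thread.
REPLAY_RE = re.compile(r"catch exception when replaying (\d+)")
TYPE_RE = re.compile(r"failed to load journal type (\d+)")
REFUSED_RE = re.compile(r"UpdateResourceUsage to remote fe: (\S+) failed")


def summarize_fe_log(lines):
    """Collect failing journal ids, journal types, and unreachable FE peers."""
    journal_ids, journal_types, refused = set(), set(), Counter()
    for line in lines:
        if m := REPLAY_RE.search(line):
            journal_ids.add(int(m.group(1)))
        if m := TYPE_RE.search(line):
            journal_types.add(int(m.group(1)))
        if m := REFUSED_RE.search(line):
            refused[m.group(1)] += 1
    return {
        "journal_ids": sorted(journal_ids),
        "journal_types": sorted(journal_types),
        "refused_peers": dict(refused),
    }
```

It can be run on a downloaded log with something like summarize_fe_log(open("fe.log", encoding="utf-8", errors="replace")), where the file name is whatever the log was saved as.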

This is a known issue with shared-data mode in 3.0.0. Please follow the bugfix at https://github.com/StarRocks/starrocks/pull/23293


Hi, I ran into the same problem. How was it resolved in the end?

Please upgrade to 3.0.2 or later.
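To check whether a given deployment already meets that minimum, a small comparison such as the one below may help. It is a sketch: the 3.0.2 threshold comes from the reply above, while the idea of feeding it an image tag is an assumption, and the parser simply takes the first x.y.z triple it finds in the string.

```python
import re

MIN_FIXED = (3, 0, 2)  # from the reply above: "3.0.2 or later"


def parse_version(text: str) -> tuple:
    """Extract the first major.minor.patch triple found in the text."""
    m = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if m is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in m.groups())


def meets_minimum(text: str, minimum: tuple = MIN_FIXED) -> bool:
    """True if the version in the text is at or above the minimum."""
    return parse_version(text) >= minimum
```

For example, meets_minimum("3.0.0") is False and meets_minimum("3.0.3") is True. An image name that itself contains an x.y.z triple before the tag would be misread, so pass only the tag in that case.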

The image I am using is 3.0.3, and the FE will not start.

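I was previously running a 2.5 image. After changing the image to 3.0.3, I added the shared-data settings to fe.conf and be.conf. After apply, the BEs start normally, but the FE cannot start and keeps retrying. Is this because a classic (shared-nothing) cluster cannot be converted to shared-data mode?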

Yes. Upgrading from shared-nothing mode to shared-data mode is not supported. Please deploy a new, separate cluster in shared-data mode.