Common Crash / Bug / Optimization Lookup

  1. _delete_tablets_on_unused_root_path crash

*** SIGABRT (@0x3e800004eaf) received by PID 20143 (TID 0x7f4fb631b700) from PID 20143; stack trace: ***
    @          0x58f9dc2 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7f509d0ae630 (unknown)
    @     0x7f509c5f9387 __GI_raise
    @     0x7f509c5faa78 __GI_abort
    @          0x2c5c79e starrocks::failure_function()
    @          0x58ed79d google::LogMessage::Fail()
    @          0x58efc0f google::LogMessage::SendToLog()
    @          0x58ed2ee google::LogMessage::Flush()
    @          0x58f0219 google::LogMessageFatal::~LogMessageFatal()
    @          0x3fefc29 starrocks::StorageEngine::_delete_tablets_on_unused_root_path()
    @          0x3fefd4a starrocks::StorageEngine::_start_disk_stat_monitor()
    @          0x424a3b1 starrocks::StorageEngine::_disk_stat_monitor_thread_callback()
    @          0x7e0a7e0 execute_native_thread_routine
    @     0x7f509d0a6ea5 start_thread
    @     0x7f509c6c1b0d __clone
    @                0x0 (unknown)
F0802 11:18:09.379490  3870 storage_engine.cpp:496] meet too many error disks, process exit. max_ratio_allowed=0%, error_disk_count=1, total_disk_count=1
F0802 11:43:16.782750 20644 storage_engine.cpp:496] meet too many error disks, process exit. max_ratio_allowed=0%, error_disk_count=1, total_disk_count=1
F0802 12:16:51.805621 21631 storage_engine.cpp:496] meet too many error disks, process exit. max_ratio_allowed=0%, error_disk_count=1, total_disk_count=1

Cause: the disk has failed or is full.
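
The fatal log line above carries the three numbers that decide the exit: `max_ratio_allowed`, `error_disk_count`, and `total_disk_count`. The following is a minimal sketch that parses such a line and re-evaluates the ratio. The comparison rule (exit when the error percentage exceeds the allowed percentage) is an assumption inferred from the log fields, not a copy of the StarRocks implementation, and the function names are hypothetical.

```python
import re

# Matches the tail of the fatal line printed by storage_engine.cpp.
FATAL_PATTERN = re.compile(
    r"max_ratio_allowed=(?P<max_ratio>\d+)%, "
    r"error_disk_count=(?P<errors>\d+), "
    r"total_disk_count=(?P<total>\d+)"
)


def parse_fatal_line(line):
    """Return (error_disk_count, total_disk_count, max_ratio_percent) or None."""
    match = FATAL_PATTERN.search(line)
    if match is None:
        return None
    return int(match["errors"]), int(match["total"]), int(match["max_ratio"])


def too_many_error_disks(error_disk_count, total_disk_count, max_ratio_percent):
    """Assumed rule: fail when the error percentage exceeds the allowed percentage."""
    if total_disk_count == 0:
        # Assumption: having no usable disk at all is treated as a failure.
        return True
    return error_disk_count * 100 > max_ratio_percent * total_disk_count


line = ("F0802 11:18:09.379490  3870 storage_engine.cpp:496] meet too many error disks, "
        "process exit. max_ratio_allowed=0%, error_disk_count=1, total_disk_count=1")
errors, total, max_ratio = parse_fatal_line(line)
print(too_many_error_disks(errors, total, max_ratio))
```

With a single data directory and an allowed ratio of 0%, one bad or full disk is already enough to cross the threshold, which matches the repeated exits in the log.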

  1. FE upgrade failure on 3.0

Occurs when upgrading from 3.0.0 ~ 3.0.2 to 3.0.3+ or 3.1.

2023-08-16 23:26:55,710 ERROR (stateChangeExecutor|73) [GlobalStateMgr.transferToLeader():1148] failed to init journal after transfer to leader! will exit
com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was STRING at line 1 column 91 path $.p.m2.
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:226) ~[spark-dpp-1.0.0.jar:?]
at com.starrocks.persist.gson.GsonUtils$ProcessHookTypeAdapterFactory$1.read(GsonUtils.java:641) ~[starrocks-fe.jar:?]
at com.google.gson.internal.bind.TypeAdapterRuntimeTypeWrapper.read(TypeAdapterRuntimeTypeWrapper.java:41) ~[spark-dpp-1.0.0.jar:?]
at com.google.gson.internal.bind.MapTypeAdapterFactory$Adapter.read(MapTypeAdapterFactory.java:186) ~[spark-dpp-1.0.0.jar:?]
at com.google.gson.internal.bind.MapTypeAdapterFactory$Adapter.read(MapTypeAdapterFactory.java:145) ~[spark-dpp-1.0.0.jar:?]
at com.starrocks.persist.gson.GsonUtils$ProcessHookTypeAdapterFactory$1.read(GsonUtils.java:641) ~[starrocks-fe.jar:?]
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.read(ReflectiveTypeAdapterFactory.java:131) ~[spark-dpp-1.0.0.jar:?]
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:222) ~[spark-dpp-1.0.0.jar:?]
at com.starrocks.persist.gson.GsonUtils$ProcessHookTypeAdapterFactory$1.read(GsonUtils.java:641) ~[starrocks-fe.jar:?]
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$1.read(ReflectiveTypeAdapterFactory.java:131) ~[spark-dpp-1.0.0.jar:?]
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:222) ~[spark-dpp-1.0.0.jar:?]
at com.starrocks.persist.gson.GsonUtils$ProcessHookTypeAdapterFactory$1.read(GsonUtils.java:641) ~[starrocks-fe.jar:?]
at com.google.gson.Gson.fromJson(Gson.java:963) ~[spark-dpp-1.0.0.jar:?]
at com.google.gson.Gson.fromJson(Gson.java:928) ~[spark-dpp-1.0.0.jar:?]
at com.google.gson.Gson.fromJson(Gson.java:877) ~[spark-dpp-1.0.0.jar:?]
at com.google.gson.Gson.fromJson(Gson.java:848) ~[spark-dpp-1.0.0.jar:?]
at com.starrocks.persist.UserPrivilegeCollectionInfo.read(UserPrivilegeCollectionInfo.java:75) ~[starrocks-fe.jar:?]
at com.starrocks.journal.JournalEntity.readFields(JournalEntity.java:990) ~[starrocks-fe.jar:?]
at com.starrocks.journal.bdbje.BDBJournalCursor.deserializeData(BDBJournalCursor.java:251) ~[starrocks-fe.jar:?]
at com.starrocks.journal.bdbje.BDBJournalCursor.next(BDBJournalCursor.java:295) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:2144) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:2104) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1143) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.access$100(GlobalStateMgr.java:325) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:722) ~[starrocks-fe.jar:?]
at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) ~[starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) ~[starrocks-fe.jar:?]
Caused by: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was STRING at line 1 column 91 path $.p.m2.
at com.google.gson.stream.JsonReader.beginObject(JsonReader.java:384) ~[spark-dpp-1.0.0.jar:?]
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:215) ~[spark-dpp-1.0.0.jar:?]
  1. CloudCanal error: Unsupported dataFormat value is : \N

com.clougence.cloudcanal.base.metadata.exception.DataTaskRuntimeException: start increment service failed for task(id:27,name:canalp00lphy9198_INCREMENT),msgUnsupportedOperationException: Unsupported dataFormat value is : \N
        at com.clougence.cloudcanal.task.service.impl.CanalIncrementServiceImpl.start(CanalIncrementServiceImpl.java:86)
        at com.clougence.cloudcanal.task.DataTaskStarter.startDataTask(DataTaskStarter.java:247)
        at com.clougence.cloudcanal.task.DataTaskStarter.start(DataTaskStarter.java:109)
        at com.clougence.cloudcanal.task.TaskCoreApplication.main(TaskCoreApplication.java:57)
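  1. [Behavior change] Whether data goes to Trash after Truncate

Older versions: the data is moved to Trash and deleted after it expires.

Newer versions: the data is not moved to Trash; it is deleted immediately.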

  1. BE crash caused by an inconsistent low-cardinality dictionary

query_id:24ce896e-52a9-11ee-958a-52540062cc3e, fragment_instance:24ce896e-52a9-11ee-958a-52540062cd2e
*** Aborted at 1694659692 (unix time) try "date -d @1694659692" if you are using GNU date ***
PC: @          0x21e0c13 starrocks::vectorized::FixedLengthColumnBase<>::swap_column()
*** SIGSEGV (@0x52eeb18) received by PID 2778746 (TID 0x7fb02c7dc700) from PID 86960920; stack trace: ***
    @          0x4ebcb82 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7fb38adbfce0 (unknown)
    @          0x21e0c13 starrocks::vectorized::FixedLengthColumnBase<>::swap_column()
    @          0x37076be starrocks::vectorized::NullableColumn::swap_column()
    @          0x36c7cc5 starrocks::vectorized::SegmentIterator::_encode_to_global_id()
    @          0x36d062b starrocks::vectorized::SegmentIterator::_do_get_next()
    @          0x36d32c0 starrocks::vectorized::SegmentIterator::do_get_next()
    @          0x37475d2 starrocks::vectorized::ProjectionIterator::do_get_next()
    @          0x3cf4d35 starrocks::SegmentIteratorWrapper::do_get_next()
    @          0x3b134a3 starrocks::vectorized::TimedChunkIterator::do_get_next()
    @          0x376e8ce starrocks::vectorized::TabletReader::do_get_next()
    @          0x254ff5b starrocks::pipeline::OlapChunkSource::_read_chunk_from_storage()
    @          0x255063b starrocks::pipeline::OlapChunkSource::_read_chunk()
    @          0x254002c starrocks::pipeline::ChunkSource::buffer_next_batch_chunks_blocking()
    @          0x22be4c4 _ZZN9starrocks8pipeline12ScanOperator18_trigger_next_scanEPNS_12RuntimeStateEiENKUlvE_clEv
    @          0x22cf57e starrocks::workgroup::ScanExecutor::worker_thread()
    @          0x3ef1362 starrocks::ThreadPool::dispatch_thread()
    @          0x3eebe5a starrocks::Thread::supervise_thread()
    @     0x7fb38adb51ca start_thread
    @     0x7fb38aae4ef3 __GI___clone
    @                0x0 (unknown)
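
The crash header above suggests `date -d @1694659692` to decode the abort time. Where GNU date is not available, the same conversion can be sketched in Python; the helper name is hypothetical, and the result is printed in UTC so it still has to be shifted to the BE host's time zone before matching it against other logs.

```python
from datetime import datetime, timezone


def abort_time_utc(unix_seconds):
    """Convert the 'Aborted at <n> (unix time)' value to a UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


print(abort_time_utc(1694659692).isoformat())
```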
  1. Spark connector error when reading StarRocks data: Set cancelled by MemoryScratchSinkOperator

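The BE log reports an error similar to a memory limit being exceeded:

Set cancelled by MemoryScratchSinkOperator

Alternatively, the process memory statistic is smaller than the query_pool memory statistic.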

  1. FE deadlock caused by hot/cold data migration

"BackgroundDynamicPartitionThread" #97631 daemon prio=5 os_prio=0 cpu=0.46ms elapsed=30497.58s tid=0x00007f7f0d0f8800 nid=0x4a4e waiting on condition  [0x00007f7ce498b000]
   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@11.0.0.1/Native Method)
        - parking to wait for  <0x0000000506370ab8> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(java.base@11.0.0.1/LockSupport.java:194)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.0.1/AbstractQueuedSynchronizer.java:885)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.base@11.0.0.1/AbstractQueuedSynchronizer.java:917)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@11.0.0.1/AbstractQueuedSynchronizer.java:1240)
        at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(java.base@11.0.0.1/ReentrantReadWriteLock.java:959)
        at com.starrocks.catalog.TabletInvertedIndex.writeLock(TabletInvertedIndex.java:118)
        at com.starrocks.catalog.TabletInvertedIndex.deleteTablet(TabletInvertedIndex.java:682)
        at com.starrocks.server.LocalMetastore.onErasePartition(LocalMetastore.java:4567)
        at com.starrocks.server.GlobalStateMgr.onErasePartition(GlobalStateMgr.java:4001)
        at com.starrocks.catalog.OlapTable.dropPartition(OlapTable.java:976)
        at com.starrocks.catalog.OlapTable.dropPartition(OlapTable.java:1002)
        at com.starrocks.server.LocalMetastore.dropPartition(LocalMetastore.java:1551)
        at com.starrocks.server.GlobalStateMgr.dropPartition(GlobalStateMgr.java:2434)
        at com.starrocks.clone.DynamicPartitionScheduler.executeDynamicPartitionForTable(DynamicPartitionScheduler.java:412)
        at com.starrocks.catalog.OlapTable.lambda$onCreate$3(OlapTable.java:2450)
        at com.starrocks.catalog.OlapTable$Lambda$3278/0x000000080187c840.run(Unknown Source)
        at java.lang.Thread.run(java.base@11.0.0.1/Thread.java:834)

   Locked ownable synchronizers:
        - <0x0000000753f065a0> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
        
"ReportHandler" #113 daemon prio=5 os_prio=0 cpu=84177.82ms elapsed=42962.98s tid=0x00007f7f89ed1000 nid=0xa3b waiting on condition  [0x00007f7f0abee000]
   java.lang.Thread.State: WAITING (parking)
        at jdk.internal.misc.Unsafe.park(java.base@11.0.0.1/Native Method)
        - parking to wait for  <0x0000000753f065a0> (a java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
        at java.util.concurrent.locks.LockSupport.park(java.base@11.0.0.1/LockSupport.java:194)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@11.0.0.1/AbstractQueuedSynchronizer.java:885)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(java.base@11.0.0.1/AbstractQueuedSynchronizer.java:1009)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(java.base@11.0.0.1/AbstractQueuedSynchronizer.java:1324)
        at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(java.base@11.0.0.1/ReentrantReadWriteLock.java:738)
        at com.starrocks.catalog.Database.readLock(Database.java:182)
        at com.starrocks.catalog.TabletInvertedIndex.addToTabletMigrationMap(TabletInvertedIndex.java:334)
        at com.starrocks.catalog.TabletInvertedIndex.tabletReport(TabletInvertedIndex.java:213)
        at com.starrocks.leader.ReportHandler.tabletReport(ReportHandler.java:394)
        at com.starrocks.leader.ReportHandler.access$300(ReportHandler.java:119)
        at com.starrocks.leader.ReportHandler$ReportTask.exec(ReportHandler.java:351)
        at com.starrocks.leader.ReportHandler.runOneCycle(ReportHandler.java:1459)
        at com.starrocks.common.util.Daemon.run(Daemon.java:115)       
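
The two thread dumps above can be read as a wait-for graph. BackgroundDynamicPartitionThread owns <0x0000000753f065a0> (listed under "Locked ownable synchronizers") and waits for <0x0000000506370ab8>, while ReportHandler waits for <0x0000000753f065a0>. The dump does not show who holds <0x0000000506370ab8>; the sketch below assumes it is ReportHandler, because its stack is inside TabletInvertedIndex.tabletReport. Under that assumption, a small cycle search confirms the deadlock. The function name and the short lock labels are hypothetical.

```python
def find_deadlock_cycle(waits_for, held_by):
    """Return the list of threads forming a wait-for cycle, or None.

    waits_for: thread name -> lock it is blocked on
    held_by:   lock -> thread name that currently holds it
    """
    for start in waits_for:
        path = []
        seen = set()
        current = start
        while current is not None and current not in seen:
            seen.add(current)
            path.append(current)
            lock = waits_for.get(current)
            current = held_by.get(lock)
        if current is not None:
            # We came back to a thread already on the path: that is the cycle.
            return path[path.index(current):]
    return None


waits_for = {
    "BackgroundDynamicPartitionThread": "TabletInvertedIndex lock (0x506370ab8)",
    "ReportHandler": "Database lock (0x753f065a0)",
}
held_by = {
    "Database lock (0x753f065a0)": "BackgroundDynamicPartitionThread",
    # Assumption: inferred from the ReportHandler stack, not shown in the dump.
    "TabletInvertedIndex lock (0x506370ab8)": "ReportHandler",
}

print(find_deadlock_cycle(waits_for, held_by))
```

The cycle exists because the two threads take the same pair of locks in opposite order.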
  1. Hot/cold data migration causes high BE memory usage

  1. BE GlobalRuntimeFilter memory leak

  • GitHub Issue:

  • GitHub Fix PR:

  • Jira

  • Affected versions:

    • 2.2.0 ~ 2.2.15
    • 2.3.0 ~ 2.3.16
    • 2.4.0 ~ 2.4.5
    • 2.5.0 ~ 2.5.10
    • 3.0.0 ~ 3.0.5
    • 3.1.0 ~ 3.1.1
  • Fixed versions:

    • 2.2.16+
    • 2.3.17+
    • 2.4.6+
    • 2.5.11+
    • 3.0.6+
    • 3.1.2+
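
To check quickly whether a given BE build falls inside the version ranges listed above, a minimal sketch follows. The ranges are copied from this entry; the helper names are hypothetical, and the parser assumes plain `major.minor.patch` version strings without suffixes.

```python
# Inclusive (first_affected, last_affected) pairs copied from the list above.
AFFECTED_RANGES = [
    ("2.2.0", "2.2.15"),
    ("2.3.0", "2.3.16"),
    ("2.4.0", "2.4.5"),
    ("2.5.0", "2.5.10"),
    ("3.0.0", "3.0.5"),
    ("3.1.0", "3.1.1"),
]


def parse_version(text):
    """Turn '2.5.10' into (2, 5, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_affected(version):
    """Return True if the version lies inside any affected range."""
    parsed = parse_version(version)
    return any(
        parse_version(low) <= parsed <= parse_version(high)
        for low, high in AFFECTED_RANGES
    )


print(is_affected("2.5.10"), is_affected("2.5.11"))
```

Comparing tuples of integers avoids the string-comparison trap where "2.5.9" would sort after "2.5.10".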
  1. BE regex_replace function memory leak

  1. BE ES external table memory leak

  1. BE memory leak when loading Avro-format data

  1. FE startup failure in shared-data mode: failed to load journal type 118

2023-08-16 09:11:47,262 WARN (leaderCheckpointer|130) [GlobalStateMgr.replayJournalInner():2012] catch exception when replaying 9748222,
com.starrocks.journal.JournalInconsistentException: failed to load journal type 118
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:981) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:2001) [starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:1953) [starrocks-fe.jar:?]
        at com.starrocks.leader.Checkpoint.replayAndGenerateGlobalStateMgrImage(Checkpoint.java:215) [starrocks-fe.jar:?]
        at com.starrocks.leader.Checkpoint.runAfterCatalogReady(Checkpoint.java:106) [starrocks-fe.jar:?]
        at com.starrocks.common.util.LeaderDaemon.runOneCycle(LeaderDaemon.java:73) [starrocks-fe.jar:?]
        at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
Caused by: java.lang.NullPointerException
        at com.starrocks.lake.StarOSAgent.getServiceId(StarOSAgent.java:101) ~[starrocks-fe.jar:?]
        at com.starrocks.lake.StarOSAgent.prepare(StarOSAgent.java:94) ~[starrocks-fe.jar:?]
        at com.starrocks.lake.StarOSAgent.getShardReplicas(StarOSAgent.java:393) ~[starrocks-fe.jar:?]
        at com.starrocks.lake.StarOSAgent.getBackendIdsByShard(StarOSAgent.java:444) ~[starrocks-fe.jar:?]
        at com.starrocks.lake.LakeTablet.getBackendIds(LakeTablet.java:88) ~[starrocks-fe.jar:?]
        at com.starrocks.server.LocalMetastore.truncateTableInternal(LocalMetastore.java:4833) ~[starrocks-fe.jar:?]
        at com.starrocks.server.LocalMetastore.replayTruncateTable(LocalMetastore.java:4862) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayTruncateTable(GlobalStateMgr.java:3520) ~[starrocks-fe.jar:?]
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:574) ~[starrocks-fe.jar:?]
  1. FE startup failure: failed to load journal type 100

2023-07-01 21:07:39,506 WARN (replayer|81) [GlobalStateMgr.replayJournalInner():1941] catch exception when replaying 492202039,
com.starrocks.journal.JournalInconsistentException: failed to load journal type 100
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:954) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:1930) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr$5.runOneCycle(GlobalStateMgr.java:1787) ~[starrocks-fe.jar:?]
        at com.starrocks.common.util.Daemon.run(Daemon.java:115) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr$5.run(GlobalStateMgr.java:1852) ~[starrocks-fe.jar:?]
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.ArrayList.rangeCheck(ArrayList.java:659) ~[?:1.8.0_301]
        at java.util.ArrayList.get(ArrayList.java:435) ~[?:1.8.0_301]
        at com.starrocks.statistic.AnalyzeManager.updateLoadRows(AnalyzeManager.java:435) ~[starrocks-fe.jar:?]
        at com.starrocks.transaction.DatabaseTransactionMgr.updateCatalogAfterCommitted(DatabaseTransactionMgr.java:1491) ~[starrocks-fe.jar:?]
        at com.starrocks.transaction.DatabaseTransactionMgr.replayUpsertTransactionState(DatabaseTransactionMgr.java:1589) ~[starrocks-fe.jar:?]
        at com.starrocks.transaction.GlobalTransactionMgr.replayUpsertTransactionState(GlobalTransactionMgr.java:615) ~[starrocks-fe.jar:?]
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:524) ~[starrocks-fe.jar:?]
        ... 4 more
com.starrocks.journal.JournalInconsistentException: failed to load journal type 100
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:1060) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:2145) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:2097) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1142) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.access$100(GlobalStateMgr.java:324) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:721) ~[starrocks-fe.jar:?]
        at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) ~[starrocks-fe.jar:?]
        at com.starrocks.common.util.Daemon.run(Daemon.java:115) ~[starrocks-fe.jar:?]
Caused by: java.lang.NullPointerException
        at com.starrocks.statistic.AnalyzeMgr.updateLoadRows(AnalyzeMgr.java:493) ~[starrocks-fe.jar:?]
        at com.starrocks.transaction.DatabaseTransactionMgr.updateCatalogAfterVisible(DatabaseTransactionMgr.java:1541) ~[starrocks-fe.jar:?]
        at com.starrocks.transaction.DatabaseTransactionMgr.replayUpsertTransactionState(DatabaseTransactionMgr.java:1629) ~[starrocks-fe.jar:?]
        at com.starrocks.transaction.GlobalTransactionMgr.replayUpsertTransactionState(GlobalTransactionMgr.java:630) ~[starrocks-fe.jar:?]
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:600) ~[starrocks-fe.jar:?]

com.starrocks.journal.JournalInconsistentException: failed to load journal type 12110
	at com.starrocks.persist.EditLog.loadJournal(EditLog.java:1087) ~[starrocks-fe.jar:?]
	at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:2272) ~[starrocks-fe.jar:?]
	at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:2224) ~[starrocks-fe.jar:?]
	at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1197) ~[starrocks-fe.jar:?]
	at com.starrocks.server.GlobalStateMgr.access$100(GlobalStateMgr.java:336) ~[starrocks-fe.jar:?]
	at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:764) ~[starrocks-fe.jar:?]
	at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) ~[starrocks-fe.jar:?]
	at com.starrocks.common.util.Daemon.run(Daemon.java:115) ~[starrocks-fe.jar:?]
Caused by: java.lang.NullPointerException
	at com.starrocks.statistic.AnalyzeMgr.updateLoadRows(AnalyzeMgr.java:512) ~[starrocks-fe.jar:?]
	at com.starrocks.transaction.DatabaseTransactionMgr.updateCatalogAfterVisible(DatabaseTransactionMgr.java:1453) ~[starrocks-fe.jar:?]
	at com.starrocks.transaction.DatabaseTransactionMgr.replayUpsertTransactionState(DatabaseTransactionMgr.java:1541) ~[starrocks-fe.jar:?]
	at com.starrocks.transaction.GlobalTransactionMgr.replayUpsertTransactionState(GlobalTransactionMgr.java:689) ~[starrocks-fe.jar:?]
	at com.starrocks.persist.EditLog.loadJournal(EditLog.java:601) ~[starrocks-fe.jar:?]
  1. FE startup failure: failed to load journal type 261

2023-08-08 13:51:43,632 WARN (stateChangeExecutor|92) [GlobalStateMgr.replayJournalInner():1930] catch exception when replaying 24259470,
com.starrocks.journal.JournalInconsistentException: failed to load journal type 261
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:948) ~[classes/:?]
        at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:1919) [classes/:?]
        at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:1870) [classes/:?]
        at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1050) [classes/:?]
        at com.starrocks.server.GlobalStateMgr.access$100(GlobalStateMgr.java:298) [classes/:?]
        at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:655) [classes/:?]
        at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:86) [classes/:?]
        at com.starrocks.common.util.Daemon.run(Daemon.java:115) [classes/:?]
Caused by: java.lang.NumberFormatException: null
        at java.lang.Integer.parseInt(Integer.java:542) ~[?:1.8.0_202]
        at java.lang.Integer.parseInt(Integer.java:615) ~[?:1.8.0_202]
        at com.starrocks.catalog.DynamicPartitionProperty.<init>(DynamicPartitionProperty.java:73) ~[classes/:?]
        at com.starrocks.catalog.TableProperty.buildDynamicProperty(TableProperty.java:189) ~[classes/:?]
        at com.starrocks.catalog.TableProperty.buildProperty(TableProperty.java:145) ~[classes/:?]
        at com.starrocks.server.LocalMetastore.replayModifyTableProperty(LocalMetastore.java:4157) ~[classes/:?]
        at com.starrocks.server.GlobalStateMgr.replayModifyTableProperty(GlobalStateMgr.java:3121) ~[classes/:?]
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:722) ~[classes/:?]
        ... 7 more
  1. FE startup failure: failed to load journal type 12110

2023-08-09 14:07:57,985 INFO (stateChangeExecutor|66) [DatabaseTransactionMgr.replayUpsertTransactionState():1626] replay a committed transaction TransactionState. txn_id: 5242, label: insert_cf587b48-3675-11ee-8c4a-00163e1276cf, db id: 91465, table id list: 91471, callback id: -1, coordinator: FE: 172.26.80.21, transaction status: COMMITTED, error replicas num: 0, replica ids: , prepare time: 1691559011286, commit time: 1691559011331, finish time: -1, write cost: 45ms, reason:  attachment: com.starrocks.transaction.InsertTxnCommitAttachment@3bd990c2
2023-08-09 14:07:57,985 WARN (stateChangeExecutor|66) [GlobalStateMgr.replayJournalInner():2301] catch exception when replaying 26401,
com.starrocks.journal.JournalInconsistentException: failed to load journal type 12110
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:1090) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:2290) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:2242) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1216) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.access$100(GlobalStateMgr.java:338) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:771) ~[starrocks-fe.jar:?]
        at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) ~[starrocks-fe.jar:?]
        at com.starrocks.common.util.Daemon.run(Daemon.java:115) ~[starrocks-fe.jar:?]
Caused by: java.lang.NullPointerException
        at com.starrocks.transaction.TransactionLogApplierFactory.create(TransactionLogApplierFactory.java:23) ~[starrocks-fe.jar:?]
        at com.starrocks.transaction.DatabaseTransactionMgr.updateCatalogAfterCommitted(DatabaseTransactionMgr.java:1526) ~[starrocks-fe.jar:?]
        at com.starrocks.transaction.DatabaseTransactionMgr.replayUpsertTransactionState(DatabaseTransactionMgr.java:1627) ~[starrocks-fe.jar:?]
        at com.starrocks.transaction.GlobalTransactionMgr.replayUpsertTransactionState(GlobalTransactionMgr.java:674) ~[starrocks-fe.jar:?]
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:599) ~[starrocks-fe.jar:?]
        ... 7 more
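  • GitHub Issue:

  • GitHub Fix PR:

  • Jira

  • Affected versions:

    • 3.0.0 ~ 3.0.5
  • Fixed versions:

    • 3.0.6+
  • Root cause:

    • Tables with the same name are created concurrently. During creation the db is dropped and a db with the same name is created again. When the table is placed into the db, the check that the db still exists is done by name, so both tables are created successfully, but only one of them can succeed when the journal is replayed.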
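
The root cause above comes down to checking database existence by name instead of by identity. The following minimal sketch reproduces that difference with a toy in-memory catalog; `Db`, `Catalog`, and the method names are hypothetical and are not StarRocks classes.

```python
import itertools
from dataclasses import dataclass, field

_next_id = itertools.count(1)


@dataclass
class Db:
    name: str
    id: int = field(default_factory=lambda: next(_next_id))


class Catalog:
    def __init__(self):
        self.dbs_by_name = {}

    def create_db(self, name):
        db = Db(name)
        self.dbs_by_name[name] = db
        return db

    def drop_db(self, name):
        del self.dbs_by_name[name]

    def exists_by_name(self, db):
        # Name-only check: cannot tell a recreated db from the original one.
        return db.name in self.dbs_by_name

    def exists_by_id(self, db):
        # Identity check: also requires the id to match.
        current = self.dbs_by_name.get(db.name)
        return current is not None and current.id == db.id


catalog = Catalog()
old_db = catalog.create_db("demo")   # CREATE TABLE starts against this db
catalog.drop_db("demo")              # db dropped while the table is being created
new_db = catalog.create_db("demo")   # db with the same name created again

print(catalog.exists_by_name(old_db), catalog.exists_by_id(old_db))
```

The name-only check still passes for the stale database object, which is how an in-flight table creation can be attached to a database it was never created for.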
  1. FE startup failure: Expected BEGIN_OBJECT but was STRING

2023-08-16 23:26:55,710 ERROR (stateChangeExecutor|73) [GlobalStateMgr.transferToLeader():1148] failed to init journal after transfer to leader! will exit
com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_OBJECT but was STRING at line 1 column 91 path $.p.m2.
at com.google.gson.internal.bind.ReflectiveTypeAdapterFactory$Adapter.read(ReflectiveTypeAdapterFactory.java:226) ~[spark-dpp-1.0.0.jar:?]
  1. FE startup failure

2023-08-16 16:49:55,364 ERROR (UNKNOWN 10.18.104.101_9010_1681212541567(-1)|1) [StarRocksFE.start():170] StarRocksFE start failed
com.starrocks.sql.analyzer.SemanticException: Column '`usr_ser`.`v_dwd_usr_ser_spo_order_qty_dtl`.`execute_date`' cannot be resolved
        at com.starrocks.sql.analyzer.Scope.resolveField(Scope.java:83) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.analyzer.Scope.resolveField(Scope.java:77) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.analyzer.ExpressionAnalyzer$Visitor.visitSlot(ExpressionAnalyzer.java:253) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.analyzer.ExpressionAnalyzer$Visitor.visitSlot(ExpressionAnalyzer.java:210) ~[starrocks-fe.jar:?]
        at com.starrocks.analysis.SlotRef.accept(SlotRef.java:489) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.ast.AstVisitor.visit(AstVisitor.java:41) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.analyzer.ExpressionAnalyzer.bottomUpAnalyze(ExpressionAnalyzer.java:207) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.analyzer.ExpressionAnalyzer.analyze(ExpressionAnalyzer.java:102) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.analyzer.ExpressionAnalyzer.analyzeExpression(ExpressionAnalyzer.java:1194) ~[starrocks-fe.jar:?]
        at com.starrocks.catalog.MaterializedView.analyzePartitionInfo(MaterializedView.java:734) ~[starrocks-fe.jar:?]
        at com.starrocks.catalog.MaterializedView.onCreate(MaterializedView.java:700) ~[starrocks-fe.jar:?]
        at java.util.ArrayList.forEach(ArrayList.java:1257) ~[?:1.8.0_232]
        at com.starrocks.server.LocalMetastore.loadDb(LocalMetastore.java:326) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.loadImage(GlobalStateMgr.java:1301) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.initialize(GlobalStateMgr.java:955) ~[starrocks-fe.jar:?]
        at com.starrocks.StarRocksFE.start(StarRocksFE.java:116) ~[starrocks-fe.jar:?]
        at com.starrocks.StarRocksFE.main(StarRocksFE.java:68) ~[starrocks-fe.
  • GitHub Issue:

  • GitHub Fix PR:

  • Jira

  • Affected versions:

    • 2.5.0 ~ 2.5.10
    • 3.0.0 ~ 3.0.5
    • 3.1.0 ~ 3.1.1
  • Fixed versions:

    • 2.5.11+
    • 3.0.6+
    • 3.1.2+
  • Root cause:

    • The columns of the base table changed while the materialized view was being loaded.
  1. FE startup failure: failed to load journal type 10002

2023-07-14 14:31:56,161 WARN (stateChangeExecutor|70) [GlobalStateMgr.replayJournalInner():1914] catch exception when replaying 34812512,
com.starrocks.journal.JournalInconsistentException: failed to load journal type 10002
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:948) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:1903) [starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:1854) [starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1034) [starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.access$100(GlobalStateMgr.java:295) [starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:643) [starrocks-fe.jar:?]
        at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:86) [starrocks-fe.jar:?]
        at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
Caused by: java.lang.NullPointerException
        at com.starrocks.server.LocalMetastore.replayAddPartition(LocalMetastore.java:1442) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayAddPartition(GlobalStateMgr.java:2057) ~[starrocks-fe.jar:?]
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:216) ~[starrocks-fe.jar:?]
        ... 7 more
  • GitHub Issue:

  • GitHub Fix PR:

  • Jira

  • Affected versions:

    • 2.5.0 ~ 2.5.4
  • Fixed versions:

    • 2.5.5+
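  • Root cause:

    • A materialized view was created on information_schema. Tables in information_schema are not persisted, so replaying the journal entries related to that materialized view throws an NPE.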
  1. FE startup failure: failed to load journal type 17

2023-04-06 13:22:13,024 WARN (stateChangeExecutor|79) [GlobalStateMgr.replayJournal():1941] got interrupt exception or inconsistent exception when replay journal 30315695, will exit,
com.starrocks.journal.JournalInconsistentException: failed to load journal type 17
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:1031) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournalInner(GlobalStateMgr.java:1987) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayJournal(GlobalStateMgr.java:1939) [starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.transferToLeader(GlobalStateMgr.java:1097) [starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.access$100(GlobalStateMgr.java:316) [starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr$1.transferToLeader(GlobalStateMgr.java:690) [starrocks-fe.jar:?]
        at com.starrocks.ha.StateChangeExecutor.runOneCycle(StateChangeExecutor.java:103) [starrocks-fe.jar:?]
        at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
Caused by: java.lang.NullPointerException
        at com.starrocks.catalog.CatalogRecycleBin.replayRecoverTable(CatalogRecycleBin.java:579) ~[starrocks-fe.jar:?]
        at com.starrocks.server.LocalMetastore.replayRecoverTable(LocalMetastore.java:2272) ~[starrocks-fe.jar:?]
        at com.starrocks.server.GlobalStateMgr.replayRecoverTable(GlobalStateMgr.java:2714) ~[starrocks-fe.jar:?]
        at com.starrocks.persist.EditLog.loadJournal(EditLog.java:346) ~[starrocks-fe.jar:?]