BE cannot restart after a crash

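【Details】The BE cannot be restarted after it crashed.
【Business impact】None so far; the cluster uses three replicas.
【Shared-data (storage-compute separation)】No
【StarRocks version】2.5.12
【Cluster size】3 FE + 5 BE
【What was done】My guess is that a large query caused the BE node to go down.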
be.WARNING:
The log reports errors for two tablets, but a lookup shows that neither tablet has a replica on this node.

F1120 05:41:42.999663 29333 tablet_updates.cpp:1131] delvec inconsistent tablet:147140561 rssid:9367 #old:1568 #add:1568 #new:1568 old_v:8010 v:8011
W1120 05:41:44.466245 29352 user_function_cache.cpp:176] load a library failed, dir=/data_ssd/StarRocks/be/lib/udf/57, file=-138526137.3b89f5b802725f1e1cbfa4df1a639592.jar.tmp
F1120 05:41:45.397938 30404 tablet_updates.cpp:1131] delvec inconsistent tablet:147140561 rssid:9367 #old:1568 #add:1568 #new:1568 old_v:8010 v:8011
F1120 05:41:45.400492 30403 tablet_updates.cpp:1131] delvec inconsistent tablet:147140537 rssid:7786 #old:1609 #add:2175 #new:2175 old_v:8007 v:8011
W1120 05:47:37.184389 30760 user_function_cache.cpp:176] load a library failed, dir=/data_ssd/StarRocks/be/lib/udf/57, file=-138526137.3b89f5b802725f1e1cbfa4df1a639592.jar.tmp
F1120 05:47:38.147249 31811 tablet_updates.cpp:1131] delvec inconsistent tablet:147140537 rssid:7786 #old:1609 #add:2175 #new:2175 old_v:8007 v:8011
F1120 05:47:38.147845 31812 tablet_updates.cpp:1131] delvec inconsistent tablet:147140561 rssid:9367 #old:1568 #add:1568 #new:1568 old_v:8010 v:8011
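
The fatal lines above repeat across restart attempts, so it may help to collapse them into one entry per tablet before looking up replicas. The following is only a minimal sketch: the regular expression is written against the line format as it appears in this post, and `summarize_delvec_errors` and `sample_lines` are hypothetical names introduced for illustration, not part of StarRocks.

```python
import re
from collections import Counter

# Pattern matched against the "delvec inconsistent" lines exactly as pasted above.
# The field names are taken from the log text; their semantics are not interpreted here.
DELVEC_RE = re.compile(
    r"delvec inconsistent tablet:(?P<tablet>\d+) rssid:(?P<rssid>\d+) "
    r"#old:(?P<old>\d+) #add:(?P<add>\d+) #new:(?P<new>\d+) "
    r"old_v:(?P<old_v>\d+) v:(?P<v>\d+)"
)


def summarize_delvec_errors(lines):
    """Count how often each (tablet, rssid) pair appears in delvec fatal lines."""
    counts = Counter()
    for line in lines:
        match = DELVEC_RE.search(line)
        if match:
            key = (int(match["tablet"]), int(match["rssid"]))
            counts[key] += 1
    return counts


# A few lines copied from the be.WARNING excerpt, plus one line that should not match.
sample_lines = [
    "F1120 05:41:42.999663 29333 tablet_updates.cpp:1131] delvec inconsistent "
    "tablet:147140561 rssid:9367 #old:1568 #add:1568 #new:1568 old_v:8010 v:8011",
    "F1120 05:41:45.400492 30403 tablet_updates.cpp:1131] delvec inconsistent "
    "tablet:147140537 rssid:7786 #old:1609 #add:2175 #new:2175 old_v:8007 v:8011",
    "F1120 05:47:38.147845 31812 tablet_updates.cpp:1131] delvec inconsistent "
    "tablet:147140561 rssid:9367 #old:1568 #add:1568 #new:1568 old_v:8010 v:8011",
    "*** Check failure stack trace: ***",
]

if __name__ == "__main__":
    for (tablet, rssid), n in sorted(summarize_delvec_errors(sample_lines).items()):
        print(f"tablet={tablet} rssid={rssid} occurrences={n}")
```

To run it on the real file, pass the open log file instead of `sample_lines`, for example `summarize_delvec_errors(open("be.WARNING"))`; the path here is an assumption about where the log lives.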
  • BE crash
    • be.out:
start time: Mon Nov 20 05:47:37 UTC 2023
*** Check failure stack trace: ***
2.5.12 RELEASE (build cb07d99)
query_id:00000000-0000-0000-0000-000000000000, fragment_instance:00000000-0000-0000-0000-000000000000
tracker:process consumption: 1326135720
tracker:query_pool consumption: 0
tracker:load consumption: 0
tracker:metadata consumption: 128777998
tracker:tablet_metadata consumption: 18656128
tracker:rowset_metadata consumption: 109974618
tracker:segment_metadata consumption: 20046
tracker:column_metadata consumption: 127848
tracker:tablet_schema consumption: 278104
tracker:segment_zonemap consumption: 11820
tracker:short_key_index consumption: 0
tracker:column_zonemap_index consumption: 18368
tracker:ordinal_index consumption: 73384
tracker:bitmap_index consumption: 0
tracker:bloom_filter_index consumption: 1392
tracker:compaction consumption: 0
tracker:schema_change consumption: 0
tracker:column_pool consumption: 0
tracker:page_cache consumption: 53888
tracker:update consumption: 170196914
tracker:chunk_allocator consumption: 0
tracker:clone consumption: 0
tracker:consistency consumption: 0
*** Aborted at 1700459258 (unix time) try "date -d @1700459258" if you are using GNU date ***
PC: @     0x7fc49aebca9f __GI_raise
*** SIGABRT (@0x7828) received by PID 30760 (TID 0x7fc3fa3ff700) from PID 30760; stack trace: ***
*** Check failure stack trace: ***
    @          0x5b1ba42 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7fc49b9d4ce0 (unknown)
    @     0x7fc49aebca9f __GI_raise
    @     0x7fc49ae8fe05 __GI_abort
    @          0x2cdf2be starrocks::failure_function()
    @          0x5b0f41d google::LogMessage::Fail()
    @          0x5b1188f google::LogMessage::SendToLog()
    @          0x5b0ef6e google::LogMessage::Flush()
    @          0x5b11e99 google::LogMessageFatal::~LogMessageFatal()
    @          0x4265eaf starrocks::TabletUpdates::_apply_rowset_commit()
    @          0x4266353 starrocks::TabletUpdates::do_apply()
    @          0x4b17465 starrocks::ThreadPool::dispatch_thread()
    @          0x4b11e4a starrocks::Thread::supervise_thread()
    @     0x7fc49b9ca1ca start_thread
    @     0x7fc49aea7dd3 __GI___clone
    @                0x0 (unknown)
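
The abort banner in be.out gives the crash time as a Unix timestamp and suggests `date -d @1700459258`. Where GNU date is not available, the same conversion can be sketched in Python. This assumes the log timestamps are in UTC, which is what the `start time` line in be.out indicates; `abort_time_utc` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def abort_time_utc(unix_seconds):
    """Convert the Unix timestamp from the 'Aborted at' banner to a UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


if __name__ == "__main__":
    # Timestamp copied from the "*** Aborted at 1700459258 (unix time)" line.
    print(abort_time_utc(1700459258).isoformat())
```

The result is 2023-11-20 05:47:38 UTC, the same second as the last two `delvec inconsistent` fatal lines in be.WARNING, which ties the SIGABRT to that check failure.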