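【Details】One BE node keeps crashing; after a restart it goes down again within ten-odd minutes.
【Background】Judging from the log, the crash appears to be caused by a compaction failure.
【Business impact】Production
【Shared-data (storage-compute separation)】No
【StarRocks version】3.2.2
【Cluster size】3 FE (2 followers, 1 leader) + 3 BE
【Machine info】FE (16C/32G/10GbE), BE (32C/256G/10GbE)
【Contact】WeChat rayxai
【Attachments】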
F0415 18:06:46.769048 1422 tablet_updates.cpp:2066] Check failed: st.ok() actual row size changed after compaction 20878890 -> 20826754 inputs:394,396,398,492,524,558,585,615,644,673,703,732,761,790,820,848,878,912,946,979,1013,1146 output:1484 max_rowset_id:1146 max_src_rssid:1180 inputs:394:4689908/4728055 396:3290221/3323248 398:4651632/4689715 492:1063791/1063791 524:1062508/1062508 558:1122992/1122992 585:1252236/1252236 615:1252472/1252472 644:1253819/1253819 673:1247877/1247877 703:1248120/1248120 732:788996/1250041 761:788931/1252157 790:787271/1249635 820:788162/1249259 848:952369/952369 878:1064421/1064421 912:1063784/1063784 946:1062517/1062517 979:1058461/1058461 1013:1059740/1059740 1146:80/1057519 output:1484:1571021/4533313 /mnt/data02/starrocks/storage/data/253/85962/185282386/020000000000000384406afef49d2d584e7fa0f9cf9b81a7 duplicate row ['140000000009'] row:1==row:2
*** Check failure stack trace: ***
3.2.2 RELEASE (build 269e832)
query_id:00000000-0000-0000-0000-000000000000, fragment_instance:00000000-0000-0000-0000-000000000000
tracker:process consumption: 3225496536
tracker:query_pool consumption: 0
tracker:load consumption: 0
tracker:metadata consumption: 552446926
tracker:tablet_metadata consumption: 8193124
tracker:rowset_metadata consumption: 525476498
tracker:segment_metadata consumption: 2869096
tracker:column_metadata consumption: 15908208
tracker:tablet_schema consumption: 131148
tracker:segment_zonemap consumption: 2758908
tracker:short_key_index consumption: 0
tracker:column_zonemap_index consumption: 4799872
tracker:ordinal_index consumption: 5221984
tracker:bitmap_index consumption: 0
tracker:bloom_filter_index consumption: 0
tracker:compaction consumption: 558748664
tracker:schema_change consumption: 0
tracker:column_pool consumption: 0
tracker:page_cache consumption: 674368
tracker:update consumption: 427625349
tracker:chunk_allocator consumption: 0
tracker:clone consumption: 0
tracker:consistency consumption: 0
*** Aborted at 1713175606 (unix time) try "date -d @1713175606" if you are using GNU date ***
PC: @ 0x7f5f4bb7e387 __GI_raise
*** SIGABRT (@0x493) received by PID 1171 (TID 0x7f5f113f8700) from PID 1171; stack trace: ***
@ 0x65cd242 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f5f4c84d630 (unknown)
@ 0x7f5f4bb7e387 __GI_raise
@ 0x7f5f4bb7fa78 __GI_abort
@ 0x2b42b6e starrocks::failure_function()
@ 0x65c0c1d google::LogMessage::Fail()
@ 0x65c308f google::LogMessage::SendToLog()
@ 0x65c076e google::LogMessage::Flush()
@ 0x65c3699 google::LogMessageFatal::~LogMessageFatal()
@ 0x50134d4 starrocks::TabletUpdates::_apply_compaction_commit()
@ 0x5018135 starrocks::TabletUpdates::do_apply()
@ 0x2dcf0d5 starrocks::ThreadPool::dispatch_thread()
@ 0x2dc9a3a starrocks::supervise_thread()
@ 0x7f5f4c845ea5 start_thread
@ 0x7f5f4bc46b0d __clone
@ 0x0 (unknown)
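
For anyone triaging the same "Check failed" line, here is a minimal sketch for pulling the per-rowset numbers out of the message so they are easier to compare. It is only a log-parsing helper, not anything from StarRocks itself. The function names are hypothetical, and the meaning of the two counts in each `id:a/b` triple is not stated in the post, so the script treats them as opaque numbers and only reports where they differ.

```python
import re

# Matches "<rowset_id>:<count_a>/<count_b>" triples, e.g. "732:788996/1250041".
# What the two counts mean is an assumption; the post does not say.
ROWSET_TRIPLE = re.compile(r"\b(\d+):(\d+)/(\d+)\b")
# Matches the "<before> -> <after>" pair in "row size changed after compaction".
SIZE_CHANGE = re.compile(r"(\d+) -> (\d+)")


def parse_rowset_counts(message):
    """Return {rowset_id: (count_a, count_b)} for every triple in the message."""
    return {int(r): (int(a), int(b)) for r, a, b in ROWSET_TRIPLE.findall(message)}


def mismatched_rowsets(message):
    """Return the rowset ids whose two counts differ, in order of appearance."""
    return [r for r, (a, b) in parse_rowset_counts(message).items() if a != b]


def row_size_change(message):
    """Return (before, after, before - after), or None if the pair is absent."""
    m = SIZE_CHANGE.search(message)
    if m is None:
        return None
    before, after = int(m.group(1)), int(m.group(2))
    return before, after, before - after


if __name__ == "__main__":
    # Shortened excerpt of the check-failure line from the log above.
    sample = (
        "actual row size changed after compaction 20878890 -> 20826754 "
        "inputs:492:1063791/1063791 732:788996/1250041 1146:80/1057519 "
        "output:1484:1571021/4533313"
    )
    print(row_size_change(sample))
    print(mismatched_rowsets(sample))
```

Pasting the full check-failure line into `sample` lists every rowset (inputs and the output rowset alike) whose two counts disagree, along with the size of the row-count drop reported by the check.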