To help locate your problem faster, please provide the following information. Thank you.
【Details】
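How can a routine load job that consumes from Kafka flush data to disk faster?
After I changed the configuration and restarted the BEs, none of the BE nodes in the cluster could be restarted successfully.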
【Background】
Changes to be.conf:
routine_load_thread_pool_size = 128
Changes to fe.conf:
max_running_txn_num_per_db = 1000
max_routine_load_task_num_per_be = 32
enable_auto_tablet_distribution = true
Restarting the FEs worked without problems; restarting the BEs failed.
【Business impact】
Queries cannot be executed.
SQL Error [1064] [42000]: get_applied_rowsets failed, tablet updates is in error state: tablet:21661594 actual row size changed after compaction 683463 -> 0tablet:21661594 #version:2 [6937 6937.1@1 6937.1] #pending:0 backend:172.20.192.74
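If more queries fail with the same kind of message, it may help to pull out the tablet ID, the row counts, and the backend address so the affected tablets can be listed per BE. The following is only a minimal sketch, assuming Python 3 and assuming the message always has the shape shown above; `parse_tablet_error` is a hypothetical helper name, not part of StarRocks.

```python
import re

# Error text copied from the failed query above.
msg = (
    "get_applied_rowsets failed, tablet updates is in error state: "
    "tablet:21661594 actual row size changed after compaction 683463 -> 0"
    "tablet:21661594 #version:2 [6937 6937.1@1 6937.1] #pending:0 "
    "backend:172.20.192.74"
)


def parse_tablet_error(text):
    """Extract tablet id, row counts around compaction, and backend IP.

    Hypothetical helper: the patterns are guessed from this one message.
    Fields that cannot be found are returned as None.
    """
    tablet = re.search(r"tablet:(\d+)", text)
    rows = re.search(r"after compaction (\d+) -> (\d+)", text)
    backend = re.search(r"backend:(\d+(?:\.\d+){3})", text)
    return {
        "tablet_id": int(tablet.group(1)) if tablet else None,
        "rows_before": int(rows.group(1)) if rows else None,
        "rows_after": int(rows.group(2)) if rows else None,
        "backend": backend.group(1) if backend else None,
    }


info = parse_tablet_error(msg)
print(info)
```

For the message above, this reports tablet 21661594 on backend 172.20.192.74 with the row count going from 683463 to 0.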
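【StarRocks version】2.5.10
【Cluster size】4 FE (3 followers + 1 observer) + 4 BE (FE and BE deployed on the same machines)
【Machine info】CPU virtual cores / memory / NIC, for example: 256C/2048G/10 Gigabit
Error message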
*** Aborted at 1692840766 (unix time) try "date -d @1692840766" if you are using GNU date ***
PC: @ 0x7f01c301d387 __GI_raise
*** SIGABRT (@0xe5b3) received by PID 58803 (TID 0x7f01777fe700) from PID 58803; stack trace: ***
@ 0x5aed0a2 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f01c3ad2630 (unknown)
@ 0x7f01c301d387 __GI_raise
@ 0x7f01c301ea78 __GI_abort
@ 0x2cce6fe starrocks::failure_function()
@ 0x5ae0a7d google::LogMessage::Fail()
@ 0x5ae2eef google::LogMessage::SendToLog()
@ 0x5ae05ce google::LogMessage::Flush()
@ 0x5ae34f9 google::LogMessageFatal::~LogMessageFatal()
@ 0x41c8a12 starrocks::DataDir::load()
@ 0x41a9e3b _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN9starrocks13StorageEngine14load_data_dirsERKSt6vectorIPNS3_7DataDirESaIS7_EEEUlvE_EEEEE6_M_runEv
@ 0x7ffb6e0 execute_native_thread_routine
@ 0x7f01c3acaea5 start_thread
@ 0x7f01c30e5b0d __clone
@ 0x0 (unknown)
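The first line of the trace gives the abort time only as a Unix timestamp. To match it against the BE INFO log, it can be converted to a readable time. A minimal sketch, assuming Python 3 and assuming the cluster clocks run at UTC+8 (the time zone is not stated anywhere in this post):

```python
from datetime import datetime, timedelta, timezone

# Unix timestamp taken from the "*** Aborted at ..." line of the trace.
abort_ts = 1692840766

# Timestamp interpreted as UTC.
utc_time = datetime.fromtimestamp(abort_ts, tz=timezone.utc)

# Assumption: the BE logs use UTC+8; adjust the offset if they do not.
local_tz = timezone(timedelta(hours=8))
local_time = utc_time.astimezone(local_tz)

print(utc_time.isoformat())    # 2023-08-24T01:32:46+00:00
print(local_time.isoformat())  # 2023-08-24T09:32:46+08:00
```

Under that assumption, the relevant BE log entries should be around 09:32:46 on 2023-08-24.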
Checking the INFO log shows a "tablet init missing rowset" error.
This matches what is described on GitHub.