【Details】
Three BE nodes crashed one after another within one minute.
【StarRocks version】
2.5.6
【Cluster size】
3 FE, 20 BE
【Machine info】
16 cores, 128 GB memory
【Attachments】
be.out log
*** Aborted at 1693533462 (unix time) try "date -d @1693533462" if you are using GNU date ***
PC: @ 0x7ff539e619d5 __GI_raise
*** SIGABRT (@0x3e90015601a) received by PID 1400858 (TID 0x7ff3ad7d0640) from PID 1400858; stack trace: ***
@ 0x5961d42 google::(anonymous namespace)::FailureSignalHandler()
*** Check failure stack trace: ***
@ 0x7ff53e72b1d0 (unknown)
@ 0x7ff539e619d5 __GI_raise
@ 0x7ff539e4a894 __GI_abort
@ 0x2c80f6e starrocks::failure_function()
@ 0x595571d google::LogMessage::Fail()
@ 0x5957b8f google::LogMessage::SendToLog()
@ 0x595526e google::LogMessage::Flush()
@ 0x5958199 google::LogMessageFatal::~LogMessageFatal()
@ 0x45828c8 starrocks::BinaryDictPageDecoder<>::next_batch()
@ 0x4582c08 starrocks::BinaryDictPageDecoder<>::next_batch()
@ 0x459f0d7 starrocks::ParsedPageV1::read()
@ 0x4572fc2 starrocks::ScalarColumnIterator::fetch_values_by_rowid()
@ 0x4524d52 starrocks::ColumnIterator::fetch_values_by_rowid()
@ 0x415766f starrocks::vectorized::SegmentIterator::_finish_late_materialization()
@ 0x41603c0 starrocks::vectorized::SegmentIterator::_do_get_next()
@ 0x4162db0 starrocks::vectorized::SegmentIterator::do_get_next()
@ 0x41e7482 starrocks::vectorized::ProjectionIterator::do_get_next()
@ 0x4786fc5 starrocks::SegmentIteratorWrapper::do_get_next()
@ 0x45b66a3 starrocks::vectorized::TimedChunkIterator::do_get_next()
@ 0x420fc7e starrocks::vectorized::TabletReader::do_get_next()
@ 0x2fd19eb starrocks::pipeline::OlapChunkSource::_read_chunk_from_storage()
@ 0x2fd20cb starrocks::pipeline::OlapChunkSource::_read_chunk()
@ 0x2fc1b5c starrocks::pipeline::ChunkSource::buffer_next_batch_chunks_blocking()
@ 0x2d3ea54 _ZZN9starrocks8pipeline12ScanOperator18_trigger_next_scanEPNS_12RuntimeStateEiENKUlvE_clEv
@ 0x2d4fb3e starrocks::workgroup::ScanExecutor::worker_thread()
@ 0x4983372 starrocks::ThreadPool::dispatch_thread()
@ 0x497de6a starrocks::supervise_thread()
@ 0x7ff53e7203fb start_thread
@ 0x7ff539f27c23 __GI___clone
@ 0x0 (unknown)
start time: Fri Sep 1 09:58:05 AM CST 2023
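For reference, the epoch value in the "Aborted at" line can be turned into wall-clock time the way the log itself suggests. This is a small sketch that assumes GNU date and assumes the cluster runs at UTC+8; the variable names are only illustrative.

```shell
# Crash timestamp taken from the "*** Aborted at ..." line of be.out
crash_epoch=1693533462

# UTC rendering (GNU date syntax, as the log line itself suggests)
crash_utc=$(date -u -d "@${crash_epoch}" '+%Y-%m-%d %H:%M:%S')

# Same instant at UTC+8. POSIX TZ strings invert the sign, so "CST-8" means UTC+8.
crash_cst=$(TZ=CST-8 date -d "@${crash_epoch}" '+%Y-%m-%d %H:%M:%S')

echo "crash (UTC):   ${crash_utc}"
echo "crash (UTC+8): ${crash_cst}"
```

If the "start time" line above is the BE process coming back up, and its CST means China Standard Time, the abort happened roughly 23 seconds before that restart.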
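Please keep a core dump.
Have you ever run backup or restore operations on this cluster?
No, nothing like that. Apart from queries, the production cluster only has real-time writes from Flink and scheduled offline writes from DataX.
Has this cluster ever run an older version such as 2.1 or 2.0?
No. It previously ran 2.5.1 and was upgraded from that version to 2.5.4.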
Run set global enable_filter_unused_columns_in_scan_stage=false; and check whether it still crashes.
You can also run ulimit -c unlimited to capture a core file so we can take a look.
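A rough sketch of the core-dump step, for anyone following along. It assumes a Linux host and that the limit is raised in the same shell that later starts the BE process, since the limit is inherited by child processes. The function name is made up for illustration.

```shell
# Try to lift the core-file size limit for this shell and report the result.
enable_core_dumps() {
    if ulimit -c unlimited 2>/dev/null; then
        echo "core limit: $(ulimit -c)"
    else
        # The hard limit may forbid raising the soft limit without root.
        echo "could not raise core limit (hard limit: $(ulimit -H -c))"
        return 1
    fi
}

enable_core_dumps || echo "continuing without core dumps"

# On Linux, this file shows where the kernel writes core files.
if [ -r /proc/sys/kernel/core_pattern ]; then
    echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
fi
```

The BE has to be restarted from this shell for the new limit to apply. With 128 GB of memory per node, the core file can be very large, so check free disk space at the core pattern location first.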
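Shall we connect on WeChat and look at it together?
Sure, thanks a lot. I'll send you a private message.
On 2.5.16, BE nodes go down when backing up to HDFS. Could you help look at this one too?
Please post be.out.
Let's connect on WeChat. It reproduces 100% of the time, and I don't see anything abnormal in the logs. I've sent you a private message.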
Confirmed: the cause was a misconfigured JAVA_HOME on the BE.
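The thread does not say what exactly was wrong with JAVA_HOME, so the following is only a generic sanity check one might run on each BE host, not the fix that was actually applied. The function name is hypothetical, and it only verifies that the directory contains an executable bin/java.

```shell
# Minimal JAVA_HOME sanity check: the directory must be set and contain
# an executable bin/java. This does not validate the JDK version.
check_java_home() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is empty"
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable found at $dir/bin/java"
        return 1
    fi
    echo "ok: $dir"
}

# Check the value in the current environment without aborting the shell.
check_java_home "${JAVA_HOME:-}" || true
```

Note that the value seen in an interactive shell may differ from the one the BE process actually starts with, so it is worth checking the environment the BE is launched from.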