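Quick question: is there a fix planned for issue 295 (LogMessageFatal)? If we apply the workaround and set ignore_load_tablet_failure = true, what is the impact on the data? Could it cause data loss or inconsistency? @LIANGCHAOHUA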
Hello.
We are currently on version 3.1.17-67ae3b7,
and this problem still occurs when calling UDFs:
Unknown error (1064, 'exception happened when get class: java.lang.OutOfMemoryError: Compressed class space[] backend
Addendum: as a workaround for bug 305, set the replica of the affected tablet ID on the corresponding BE to bad:
https://docs.starrocks.io/zh/docs/sql-reference/sql-statements/cluster-management/tablet_replica/ADMIN_SET_REPLICA_STATUS/
- Error when syncing data through a StarRocks external table
get TableMeta failed from TNetworkAddress
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 2.5.0 ~ 2.5.12
  - 3.1.0 ~ 3.1.2
- Fixed versions:
  - 2.5.13+
  - 3.1.3+
- Root cause:
- Workaround:
- SkewRule causes slow performance or SlowLock
"app//com.starrocks.sql.optimizer.statistics.StatisticsCalculator.visitLogicalHiveScan(StatisticsCalculator.java:664)",
"app//com.starrocks.sql.optimizer.statistics.StatisticsCalculator.visitLogicalHiveScan(StatisticsCalculator.java:179)",
"app//com.starrocks.sql.optimizer.operator.logical.LogicalHiveScanOperator.accept(LogicalHiveScanOperator.java:78)",
"app//com.starrocks.sql.optimizer.statistics.StatisticsCalculator.estimatorStats(StatisticsCalculator.java:202)",
"app//com.starrocks.sql.optimizer.Utils.calculateStatistics(Utils.java:815)",
"app//com.starrocks.sql.optimizer.Utils.calculateStatistics(Utils.java:804)",
"app//com.starrocks.sql.optimizer.Utils.calculateStatistics(Utils.java:804)",
"app//com.starrocks.sql.optimizer.Optimizer.skewJoinOptimize(Optimizer.java:794)",
"app//com.starrocks.sql.optimizer.Optimizer.logicalRuleRewrite(Optimizer.java:570)",
"app//com.starrocks.sql.optimizer.Optimizer.rewriteAndValidatePlan(Optimizer.java:748)",
"app//com.starrocks.sql.optimizer.Optimizer.optimizeByCost(Optimizer.java:248)",
"app//com.starrocks.sql.optimizer.Optimizer.optimize(Optimizer.java:196)",
"app//com.starrocks.sql.optimizer.Optimizer.optimize(Optimizer.java:172)",
"app//com.starrocks.sql.InsertPlanner.buildExecPlan(InsertPlanner.java:517)",
"app//com.starrocks.sql.InsertPlanner.plan(InsertPlanner.java:315)",
"app//com.starrocks.sql.StatementPlanner.planInsertStmt(StatementPlanner.java:203)",
"app//com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:141)",
"app//com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:92)",
"app//com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:578)",
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - All versions
- Fixed versions:
  - Not fixed
- Root cause:
- Workaround:
  - set global enable_stats_to_optimize_skew_join=false;

When will version 3.2.17 be released?

- Parquet files written by StarRocks cannot be read by Hive
Failed with exception java.io.IOException:org.apache.parquet.io.ParquetDecodingException: Can not read value at 0 in block -1 in file
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.2.0~latest
  - 3.3.0~3.3.4
- Fixed versions:
  - 3.2: not fixed
  - 3.3.5+
- Root cause:
  - StarRocks writes the newer version of the Parquet format, which Hive 3.0 and earlier cannot read.
- Workaround:
- Morsel queue crash
*** SIGSEGV (@0xa0) received by PID 1328927 (TID 0x14f3e4c3a640) from PID 160; stack trace: ***
@ 0x14f509070ee8 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x99ee7)
@ 0xa37c049 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x14f509e7835d os::Linux::chained_handler(int, siginfo*, void*)
@ 0x14f509e7df5f JVM_handle_linux_signal
@ 0x14f509e6f968 signalHandler(int, siginfo*, void*)
@ 0x14f509019520 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x4251f)
@ 0x6d622bc starrocks::TabletReader::_to_seek_tuple(std::shared_ptr<starrocks::TabletSchema const> const&, starrocks::OlapTuple const&, starrocks::SeekTuple*, starrocks::MemPool*)
@ 0x6d62ebe starrocks::TabletReader::parse_seek_range(std::shared_ptr<starrocks::TabletSchema const> const&, starrocks::TabletReaderParams::RangeStartOperation, starrocks::TabletReaderParams::RangeEndOperation, std::vector<starrocks::OlapTuple, std::allocator<starrock
@ 0x758ae83 starrocks::pipeline::PhysicalSplitMorselQueue::_init_segment()
@ 0x758b46f starrocks::pipeline::PhysicalSplitMorselQueue::_try_get_split_from_single_tablet()
@ 0x758bd67 starrocks::pipeline::PhysicalSplitMorselQueue::try_get()
@ 0x7587e53 starrocks::pipeline::BucketSequenceMorselQueue::try_get()
@ 0x5456ace starrocks::pipeline::ScanOperator::_pickup_morsel(starrocks::RuntimeState*, int)
@ 0x54555bc starrocks::pipeline::ScanOperator::_try_to_trigger_next_scan(starrocks::RuntimeState*)
@ 0x545584a starrocks::pipeline::ScanOperator::pull_chunk(starrocks::RuntimeState*)
@ 0x544c8b8 starrocks::pipeline::PipelineDriver::process(starrocks::RuntimeState*, int)
@ 0x7ebce58 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x8c00b73 starrocks::ThreadPool::dispatch_thread()
@ 0x8bf81c9 starrocks::Thread::supervise_thread(void*)
@ 0x14f50906bac3 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x94ac2)
@ 0x14f5090fd850 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x12684f)
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.3.0~3.3.17
  - 3.4.0~3.4.16
  - 3.5.0~3.5.4
- Fixed versions:
  - 3.3.18+
  - 3.4.17+
  - 3.5.5+
- Root cause:
- Workaround:
  - set global enable_per_bucket_optimize = false
- ARM Parquet reader crash
*** SIGSEGV (@0x0) received by PID 28 (TID 0xfffea329fe00) LWP(602) from PID 0; stack trace: ***
@ 0xffffb81e25c4 (/usr/lib/aarch64-linux-gnu/libc.so.6+0x825c3)
@ 0xf73dd28 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0xffffb989c850 ([vdso]+0x84f)
@ 0xffffb81f7bbc (/usr/lib/aarch64-linux-gnu/libc.so.6+0x97bbb)
@ 0xb350584 starrocks::parquet::Int32ToDateConverter::convert(starrocks::Cow<starrocks::Column>::ImmutPtr<starrocks::Column> const&, starrocks::Column*)
@ 0xb37c948 starrocks::parquet::StatisticsHelper::decode_value_into_column(starrocks::Cow<starrocks::Column>::MutPtr<starrocks::Column> const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::ba
@ 0xb29af2c starrocks::parquet::RawColumnReader::_row_group_zone_map_filter(std::vector<starrocks::ColumnPredicate const*, std::allocator<starrocks::ColumnPredicate const*> > const&, starrocks::CompoundNodeType, starrocks::TypeDescriptor const&, unsigned long, unsigne
@ 0xb29f114 starrocks::parquet::ScalarColumnReader::row_group_zone_map_filter(std::vector<starrocks::ColumnPredicate const*, std::allocator<starrocks::ColumnPredicate const*> > const&, starrocks::CompoundNodeType, unsigned long, unsigned long) const
@ 0xb254340 starrocks::StatusOr<std::optional<starrocks::SparseRange<unsigned long> > > starrocks::parquet::PredicateFilterEvaluator::visit_for_rowgroup_zonemap<(starrocks::CompoundNodeType)0>(starrocks::PredicateCompoundNode<(starrocks::CompoundNodeType)0> const&)
@ 0xb2573cc starrocks::StatusOr<std::optional<starrocks::SparseRange<unsigned long> > > starrocks::parquet::PredicateFilterEvaluator::operator()<(starrocks::CompoundNodeType)0>(starrocks::PredicateCompoundNode<(starrocks::CompoundNodeType)0> const&, starrocks::parquet
@ 0xb22ac94 starrocks::parquet::FileReader::_filter_group(std::shared_ptr<starrocks::parquet::GroupReader> const&)
@ 0xb22b3a8 starrocks::parquet::FileReader::_init_group_readers()
@ 0xb22c3ac starrocks::parquet::FileReader::init(starrocks::HdfsScannerContext*)
@ 0xab53f4c starrocks::HdfsParquetScanner::do_open(starrocks::RuntimeState*)
@ 0xaa038a8 starrocks::HdfsScanner::open(starrocks::RuntimeState*)
@ 0xa9de95c starrocks::connector::HiveDataSource::_init_scanner(starrocks::RuntimeState*)
@ 0xa9df6d8 starrocks::connector::HiveDataSource::open(starrocks::RuntimeState*)
@ 0xa9bcf88 starrocks::pipeline::ConnectorChunkSource::_open_data_source(starrocks::RuntimeState*, bool*)
@ 0xa9bd9fc starrocks::pipeline::ConnectorChunkSource::_read_chunk(starrocks::RuntimeState*, std::shared_ptr<starrocks::Chunk>*)
@ 0xa9c399c starrocks::pipeline::ChunkSource::buffer_next_batch_chunks_blocking(starrocks::RuntimeState*, unsigned long, starrocks::workgroup::WorkGroup const*)
@ 0xa299bf0 auto starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::{lambda(auto:1&)#1}::operator()<starrocks::workgroup::YieldContext>(starrocks::workgroup::YieldContext&) const [clone .constprop.0]
@ 0xa92c89c starrocks::workgroup::ScanExecutor::worker_thread()
@ 0xc411dc4 starrocks::ThreadPool::dispatch_thread()
@ 0xc408a68 starrocks::Thread::supervise_thread(void*)
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.3.0~3.3.19
  - 3.4.0~3.4.8
  - 3.5.0~3.5.5
  - 4.0.0
- Fixed versions:
  - 3.3.20+
  - 3.4.9+
  - 3.5.6+
  - 4.0.1+
- Root cause:
- Workaround:
- min_by/max_by function crash
*** Aborted at 1754458860 (unix time) try "date -d @1754458860" if you are using GNU date ***
PC: @ 0x7f0ee04d5711 __memcpy_ssse3_back
*** SIGSEGV (@0x7f0e74196ff1) received by PID 16389 (TID 0x7f0e5895e700) from PID 1947824113; stack trace: ***
@ 0x7f0ee107b20b __pthread_once_slow
@ 0x7dc3dc0 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x7f0ee1f0935d os::Linux::chained_handler(int, siginfo*, void*)
@ 0x7f0ee1f0ef5f JVM_handle_linux_signal
@ 0x7f0ee1f00968 signalHandler(int, siginfo*, void*)
@ 0x7f0ee1084630 (/usr/lib64/libpthread-2.17.so+0xf62f)
@ 0x7f0ee04d5711 __memcpy_ssse3_back
@ 0x4ce607d starrocks::AggregateFunctionBatchHelper<starrocks::MinByAggregateData<(starrocks::LogicalType)17, true, int>, starrocks::MaxMinByAggregateFunction<(starrocks::LogicalType)17, starrocks::MinByAggregateData<(starrocks::LogicalType)17, true, int>, starrocks::
@ 0x46200dc starrocks::Aggregator::compute_batch_agg_states(starrocks::Chunk*, unsigned long)
@ 0x451ffcd starrocks::pipeline::AggregateBlockingSinkOperator::push_chunk(starrocks::RuntimeState*, std::shared_ptr<starrocks::Chunk> const&)
@ 0x452f3e1 starrocks::pipeline::BucketProcessSinkOperator::push_chunk(starrocks::RuntimeState*, std::shared_ptr<starrocks::Chunk> const&)
@ 0x44f49d4 starrocks::pipeline::PipelineDriver::process(starrocks::RuntimeState*, int)
@ 0x47b85e3 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x398e8e3 starrocks::ThreadPool::dispatch_thread()
@ 0x3985f66 starrocks::Thread::supervise_thread(void*)
@ 0x7f0ee107cea5 start_thread
@ 0x7f0ee047db0d __clone
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.3.0~3.3.17
  - 3.4.0~3.4.6
  - 3.5.0~3.5.3
- Fixed versions:
  - 3.3.18+
  - 3.4.7+
  - 3.5.4+
- Root cause:
- Workaround:
- RecoverableStub crash
*** SIGABRT (@0x272000296e6) received by PID 169702 (TID 0x15162f6fb640) from PID 169702; stack trace: ***
@ 0x151dd468ee18 __pthread_once_slow
@ 0x7da2500 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x151dd463e6f0 (/usr/lib64/libc.so.6+0x3e6ef)
@ 0x151dd468b94c __pthread_kill_implementation
@ 0x151dd463e646 __GI_raise
@ 0x151dd46287f3 __GI_abort
@ 0x3328555 __gnu_cxx::__verbose_terminate_handler() [clone .cold]
@ 0xc480c56 __cxxabiv1::__terminate(void (*)())
@ 0xc480cc1 std::terminate()
@ 0xc48144f __cxa_pure_virtual
@ 0x3821545 starrocks::LocalTabletsChannel::_abort_replica_tablets(starrocks::PTabletWriterAddChunkRequest const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::unordered_map<long, std::vector<long, std::allocator<long> >
@ 0x38237f0 starrocks::LocalTabletsChannel::add_chunk(starrocks::Chunk*, starrocks::PTabletWriterAddChunkRequest const&, starrocks::PTabletWriterAddBatchResult*)
@ 0x3815939 starrocks::LoadChannel::_add_chunk(starrocks::Chunk*, starrocks::PTabletWriterAddChunkRequest const&, starrocks::PTabletWriterAddBatchResult*)
@ 0x3816b2c starrocks::LoadChannel::add_chunks(starrocks::PTabletWriterAddChunksRequest const&, starrocks::PTabletWriterAddBatchResult*)
@ 0x38101c3 starrocks::LoadChannelMgr::add_chunks(starrocks::PTabletWriterAddChunksRequest const&, starrocks::PTabletWriterAddBatchResult*)
@ 0x38d163b starrocks::BackendInternalServiceImpl<starrocks::PInternalService>::tablet_writer_add_chunks(google::protobuf::RpcController*, starrocks::PTabletWriterAddChunksRequest const*, starrocks::PTabletWriterAddBatchResult*, google::protobuf::Closure*)
@ 0x802f2d4 brpc::policy::ProcessRpcRequest(brpc::InputMessageBase*)
@ 0x7f5b677 brpc::ProcessInputMessage(void*)
@ 0x7f5c9f5 brpc::InputMessenger::OnNewMessages(brpc::Socket*)
@ 0x7f4acee brpc::Socket::ProcessEvent(void*)
@ 0x7f1bd72 bthread::TaskGroup::task_runner(long)
@ 0x8071201 bthread_make_fcontext
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.2.0~3.2.16
  - 3.3.0~3.3.15
  - 3.4.0~3.4.4
  - 3.5.0
- Fixed versions:
  - 3.2.17+
  - 3.3.16+
  - 3.4.5+
  - 3.5.1+
- Root cause:
- Workaround:
- AggHashMapWithSerializedKey crash
*** SIGSEGV (@0x10) received by PID 61982 (TID 0x151907171640) LWP(66306) from PID 16; stack trace: ***
@ 0x1522e1c8ef38 __pthread_once_slow
@ 0xbed4e94 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x1522e318c519 PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0]
@ 0x1522e318cf6e JVM_handle_linux_signal
@ 0x1522e1c3e730 (/usr/lib64/libc.so.6+0x3e72f)
@ 0x565de1e void starrocks::AggHashMapWithSerializedKey<phmap::flat_hash_map<starrocks::Slice, unsigned char*, starrocks::SliceHashWithSeed<(starrocks::PhmapSeed)0>, starrocks::SliceEqual, std::allocator<std::pair<starrocks::Slice const, unsigned char*> > > >::compute
@ 0x5687d74 starrocks::Aggregator::build_hash_map_with_selection(unsigned long)
@ 0x54667fc starrocks::pipeline::AggregateStreamingSinkOperator::_push_chunk_by_selective_preaggregation(std::shared_ptr<starrocks::Chunk> const&, unsigned long, bool)
@ 0x54671f2 starrocks::pipeline::AggregateStreamingSinkOperator::_push_chunk_by_auto(std::shared_ptr<starrocks::Chunk> const&, unsigned long)
@ 0x54682d2 starrocks::pipeline::AggregateStreamingSinkOperator::push_chunk(starrocks::RuntimeState*, std::shared_ptr<starrocks::Chunk> const&)
@ 0x5426a49 starrocks::pipeline::PipelineDriver::process(starrocks::RuntimeState*, int)
@ 0x58bfc71 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x45fee3f starrocks::ThreadPool::dispatch_thread()
@ 0x45f5c30 starrocks::Thread::supervise_thread(void*)
@ 0x1522e1c89d22 start_thread
@ 0x1522e1d0ed40 __clone3
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.5.0~3.5.8
- Fixed versions:
  - 3.5.9+
- Root cause:
- Workaround:
  - None
- Field function crash
*** Aborted at 1770806357 (unix time) try "date -d @1770806357" if you are using GNU date ***
PC: @ 0x4e40660 starrocks::BinaryColumnBase<unsigned int>::_build_slices() const
*** SIGSEGV (@0x0) received by PID 19649 (TID 0x1530d22c5640) LWP(24061) from PID 0; stack trace: ***
@ 0x1538ba08ef38 __pthread_once_slow
@ 0xbf0a514 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x1538bb58c519 PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0]
@ 0x1538bb58cf6e JVM_handle_linux_signal
@ 0x1538ba03e730 (/usr/lib64/libc.so.6+0x3e72f)
@ 0x4e40660 starrocks::BinaryColumnBase<unsigned int>::_build_slices() const
@ 0x8722837 starrocks::StatusOr<starrocks::Cow<starrocks::Column>::ImmutPtr<starrocks::Column> > starrocks::StringFunctions::field<(starrocks::LogicalType)17>(starrocks::FunctionContext*, std::vector<starrocks::Cow<starrocks::Column>::ImmutPtr<starrocks::Column>, std:
@ 0x86e702a std::_Function_handler<starrocks::StatusOr<starrocks::Cow<starrocks::Column>::ImmutPtr<starrocks::Column> > (starrocks::FunctionContext*, std::vector<starrocks::Cow<starrocks::Column>::ImmutPtr<starrocks::Column>, std::allocator<starrocks::Cow<starrocks::C
@ 0x745a68d starrocks::VectorizedFunctionCallExpr::evaluate_checked(starrocks::ExprContext*, starrocks::Chunk*)
@ 0x665dbdb starrocks::ExprContext::evaluate(starrocks::Expr*, starrocks::Chunk*, unsigned char*)
@ 0x665df1b starrocks::ExprContext::evaluate(starrocks::Chunk*, unsigned char*)
@ 0x538b16c starrocks::pipeline::ProjectOperator::push_chunk(starrocks::RuntimeState*, std::shared_ptr<starrocks::Chunk> const&)
@ 0x544347a starrocks::pipeline::PipelineDriver::process(starrocks::RuntimeState*, int)
@ 0x58e2aed starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x460d0d7 starrocks::ThreadPool::dispatch_thread()
@ 0x4603eb0 starrocks::Thread::supervise_thread(void*)
@ 0x1538ba089d22 start_thread
@ 0x1538ba10ed40 __clone3
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.5.0~3.5.13
  - 4.0.0~4.0.6
- Fixed versions:
  - 3.5.14+
  - 4.0.7+
- Root cause:
- Workaround:
  - None
- get_fe_metrics crash
*** Aborted at 1768557257 (unix time) try "date -d @1768557257" if you are using GNU date ***
PC: @ 0x1498c0c8ba6c __pthread_kill_implementation
*** SIGABRT (@0x1f20001129f) received by PID 70303 (TID 0x148d3ff87640) LWP(76947) from PID 70303; stack trace: ***
@ 0x1498c0c8ef38 __pthread_once_slow
@ 0xbed4e94 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x1498c0c3e730 (/usr/lib64/libc.so.6+0x3e72f)
@ 0x1498c0c8ba6c __pthread_kill_implementation
@ 0x1498c0c3e686 __GI_raise
@ 0x1498c0c28833 __GI_abort
@ 0x3f76f21 __gnu_cxx::__verbose_terminate_handler() [clone .cold]
@ 0xfb1d096 __cxxabiv1::__terminate(void (*)())
@ 0x3f76db9 std::terminate()
@ 0xfb1d233 __cxa_throw
@ 0x45d8de8 __wrap___cxa_throw
@ 0x5860ba1 starrocks::SchemaFeMetricsScanner::_get_fe_metrics(starrocks::RuntimeState*)
@ 0x5864713 starrocks::SchemaFeMetricsScanner::start(starrocks::RuntimeState*)
@ 0x58b8fea std::once_flag::_Prepare_execution::_Prepare_execution<std::call_once<starrocks::pipeline::SchemaChunkSource::start(starrocks::RuntimeState*)::{lambda()#1}>(std::once_flag&, starrocks::pipeline::SchemaChunkSource::start(starrocks::RuntimeState*)::{lambda()
@ 0x1498c0c8ef38 __pthread_once_slow
@ 0x58b90d4 starrocks::pipeline::SchemaChunkSource::start(starrocks::RuntimeState*)
@ 0x53914d6 auto starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::{lambda(auto:1&)#1}::operator()<starrocks::workgroup::YieldContext>(starrocks::workgroup::YieldContext&) const [clone .isra.0]
@ 0x54d7a29 starrocks::workgroup::ScanExecutor::worker_thread()
@ 0x45fee3f starrocks::ThreadPool::dispatch_thread()
@ 0x45f5c30 starrocks::Thread::supervise_thread(void*)
@ 0x1498c0c89d22 start_thread
@ 0x1498c0d0ed40 __clone3
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.4.0~3.4.10
  - 3.5.0~3.5.11
  - 4.0.0~4.0.4
- Fixed versions:
  - 3.4.11+
  - 3.5.12+
  - 4.0.5+
- Root cause:
- Workaround:
- Tablet channel use-after-free
@ 0x406ef03 starrocks::FixedLengthColumnBase<starrocks::TimestampValue>::append_selective(starrocks::Column const&, unsigned int const*, unsigned int, unsigned int)
@ 0x635da25 starrocks::MemTable::insert(starrocks::Chunk const&, unsigned int const*, unsigned int, unsigned int)
@ 0x63529c5 starrocks::DeltaWriter::write(starrocks::Chunk const&, unsigned int const*, unsigned int, unsigned int)
@ 0x62c92a6 starrocks::AsyncDeltaWriter::_execute(void*, bthread::TaskIterator<starrocks::AsyncDeltaWriter::Task>&)
@ 0x7f2a85c bthread::ExecutionQueueBase::_execute(bthread::TaskNode*, bool, int*)
@ 0x7f2b84b bthread::ExecutionQueueBase::_execute_tasks(void*)
@ 0x398b053 starrocks::ThreadPool::dispatch_thread()
@ 0x3983296 starrocks::Thread::supervise_thread(void*)
@ 0x14914ac89d22 start_thread
@ 0x14914ad0ed40 __clone3
- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.1.0 ~ latest
  - 3.2.0 ~ latest
  - 3.3.0 ~ 3.3.19
  - 3.4.0 ~ 3.4.8
  - 3.5.0 ~ 3.5.7
  - 4.0.0
- Fixed versions:
  - 3.1: not fixed
  - 3.2: not fixed
  - 3.3.20+
  - 3.4.9+
  - 3.5.8+
  - 4.0.1+
- Root cause:
- Workaround:
*** Aborted at 1772777751 (unix time) try "date -d @1772777751" if you are using GNU date ***
PC: @ 0x637e83f starrocks::lake::AsyncDeltaWriter::close()
*** SIGSEGV (@0x0) received by PID 4101738 (TID 0x7f7ddb8ff640) from PID 0; stack trace: ***
@ 0x7f80534904f8 __pthread_once_slow
@ 0x7d0ad20 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x7f805441c25a os::Linux::chained_handler(int, siginfo*, void*)
@ 0x7f805442185e JVM_handle_linux_signal
@ 0x7f8054415748 signalHandler(int, siginfo*, void*)
@ 0x7f805343fc30 (/usr/lib64/libc.so.6+0x3fc2f)
@ 0x637e83f starrocks::lake::AsyncDeltaWriter::close()
@ 0x384e239 starrocks::LakeTabletsChannel::abort()
@ 0x37f924b starrocks::LoadChannel::abort()
@ 0x37f4590 starrocks::LoadChannelMgr::cancel(brpc::Controller*, starrocks::PTabletWriterCancelRequest const&, starrocks::PTabletWriterCancelResult*, google::protobuf::Closure*)
@ 0x7f97af4 brpc::policy::ProcessRpcRequest(brpc::InputMessageBase*)
@ 0x7ec3e97 brpc::ProcessInputMessage(void*)
@ 0x7ec5215 brpc::InputMessenger::OnNewMessages(brpc::Socket*)
@ 0x7eb350e brpc::Socket::ProcessEvent(void*)
@ 0x7e84592 bthread::TaskGroup::task_runner(long)
@ 0x7fd9a21 bthread_make_fcontext
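- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
- Fixed versions:
  - 3.3.19+
  - 3.4.8+
  - 3.5.6+
- Root cause:
  - The LoadChannel was reopened, but the corresponding tablet writer was not reopened, so the writer pointer was nullptr when it was used.
- Workaround: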
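The root cause above (a channel that is reopened while its writer slots stay empty) can be illustrated with a small sketch. Everything here is hypothetical: `TabletWriter`, `TabletsChannel`, and `reopen_without_writers` are simplified stand-ins, not StarRocks classes, and the null check is only one defensive option. The entry does not say how the real fix was implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a per-tablet writer.
struct TabletWriter {
    bool closed = false;
    void close() { closed = true; }
};

// Hypothetical, simplified channel that owns one writer slot per tablet.
class TabletsChannel {
public:
    // Opening creates a writer for every slot.
    void open(std::size_t num_tablets) {
        writers_.clear();
        for (std::size_t i = 0; i < num_tablets; ++i) {
            writers_.push_back(std::make_unique<TabletWriter>());
        }
    }

    // Models the faulty path: the channel is reopened but the slots stay empty.
    void reopen_without_writers(std::size_t num_tablets) {
        writers_.clear();
        writers_.resize(num_tablets);  // every slot is nullptr
    }

    // Closes every writer that exists and reports how many were closed.
    // Without the null check, writer->close() would dereference nullptr.
    std::size_t abort() {
        std::size_t closed = 0;
        for (auto& writer : writers_) {
            if (writer == nullptr) {
                continue;
            }
            writer->close();
            ++closed;
        }
        return closed;
    }

private:
    std::vector<std::unique_ptr<TabletWriter>> writers_;
};

// Usage: abort() is safe both after a normal open and after a faulty reopen.
void tablets_channel_example() {
    TabletsChannel channel;
    channel.open(3);
    assert(channel.abort() == 3);

    channel.reopen_without_writers(3);
    assert(channel.abort() == 0);
}
```

In this sketch the guard only avoids the crash. Recreating the writers on reopen, as `open` does, is what would restore a usable channel.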
- AsyncFlushOutputStream use-after-free causes BE crash
PC: @ 0xd5a81ed starrocks::io::AsyncFlushOutputStream::write(unsigned char const*, long)
*** SIGSEGV (@0x0) received by PID 76996 (TID 0x14bbe3cd6640) LWP(77375) from PID 0; stack trace: ***
@ 0x14bc0e41bee8 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x99ee7)
@ 0x14ac8d46 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x14bc0e3c4520 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x4251f)
@ 0xd5a81ed starrocks::io::AsyncFlushOutputStream::write(unsigned char const*, long)
@ 0xd5696c8 starrocks::parquet::AsyncParquetOutputStream::Write(void const*, long)
@ 0x1615e7f6 parquet::FileMetaData::FileMetaDataImpl::WriteTo(arrow::io::OutputStream*, std::shared_ptr<parquet::Encryptor> const&) const
@ 0x160d650e parquet::WriteFileMetaData(parquet::FileMetaData const&, arrow::io::OutputStream*)
@ 0x160d96aa parquet::FileSerializer::Close()
@ 0x160d6d40 parquet::ParquetFileWriter::Close()
@ 0x160d6e8f parquet::ParquetFileWriter::~ParquetFileWriter()
@ 0xd33b8da std::_Sp_counted_deleter<parquet::ParquetFileWriter*, ...>::_M_dispose()
@ 0xd336ee8 starrocks::formats::ParquetFileWriter::~ParquetFileWriter()
@ 0xc6e9b38 std::_Sp_counted_ptr_inplace<starrocks::connector::BufferPartitionChunkWriter, ...>::_M_dispose()
@ 0xc6f9b29 starrocks::connector::ConnectorChunkSink::write_partition_chunk(...)
@ 0xc6fa222 starrocks::connector::ConnectorChunkSink::add(std::shared_ptr<starrocks::Chunk> const&)
@ 0xfd4e805 starrocks::pipeline::ConnectorSinkOperator::push_chunk(starrocks::RuntimeState*, std::shared_ptr<starrocks::Chunk> const&)
@ 0xc5f40ff starrocks::pipeline::PipelineDriver::process(starrocks::RuntimeState*, int)
@ 0xc669dd1 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x11034ebb starrocks::ThreadPool::dispatch_thread()
@ 0x1102beef starrocks::Thread::supervise_thread(void*)
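- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.5.0 ~ 3.5.14
  - 4.0.0 ~ 4.0.6
- Fixed versions:
  - 3.5.15+
  - 4.0.7+
- Root cause:
  - The destructor of _filter_writer triggers a flush of _out_stream. If _out_stream is destroyed before _filter_writer, the flush is a use-after-free and the BE crashes. The fix reorders the member variable declarations so that _out_stream is destroyed after _filter_writer.
- Workaround: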
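The declaration-order fix described in the root cause relies on a C++ rule: non-static data members are destroyed in the reverse of their declaration order. The sketch below is a minimal illustration with hypothetical types (`OutStream`, `FileWriter`, `SafeSink`); it is not the StarRocks code, only a model of a writer whose destructor flushes a stream it does not own.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stream; it records its own destruction in a shared log.
struct OutStream {
    explicit OutStream(std::vector<std::string>* log) : log_(log) {}
    ~OutStream() { log_->push_back("stream destroyed"); }
    void flush() { log_->push_back("flush"); }
    std::vector<std::string>* log_;
};

// Hypothetical writer whose destructor flushes the stream it points to,
// so the stream must still be alive when the writer is destroyed.
struct FileWriter {
    FileWriter(OutStream* out, std::vector<std::string>* log) : out_(out), log_(log) {
        assert(out != nullptr);
    }
    ~FileWriter() {
        out_->flush();
        log_->push_back("writer destroyed");
    }
    OutStream* out_;
    std::vector<std::string>* log_;
};

// Members are destroyed in reverse declaration order. Declaring the stream
// first guarantees that it outlives the writer.
struct SafeSink {
    explicit SafeSink(std::vector<std::string>* log)
            : out_stream(log), file_writer(&out_stream, log) {}
    OutStream out_stream;    // declared first, destroyed last
    FileWriter file_writer;  // declared second, destroyed first
};

// Builds and destroys a SafeSink, returning the recorded order of events.
std::vector<std::string> destruction_log() {
    std::vector<std::string> log;
    { SafeSink sink(&log); }
    return log;
}
```

Swapping the two member declarations in `SafeSink` would destroy the stream first, and the writer's destructor would then flush a dead object, which is the use-after-free pattern this entry describes.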
- ParquetFileWriter::close throws an exception other than ParquetStatusException, causing BE crash
*** SIGABRT (@0x74ee) received by PID 29934 (TID 0x2ad55605e700) from PID 29934; stack trace: ***
@ 0x2ad416033e20 __GI___pthread_once
@ 0x7d2b0c0 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x2ad4160365e0 (/usr/lib64/libpthread-2.17.so+0xf5df)
@ 0x2ad416ba01f7 __GI_raise
@ 0x2ad416ba18e8 __GI_abort
@ 0x330c327 __gnu_cxx::__verbose_terminate_handler() [clone .cold]
@ 0xc408c86 __cxxabiv1::__terminate(void (*)())
@ 0xc408cf1 std::terminate()
@ 0xc408e44 __cxa_throw
@ 0x36faf61 __wrap___cxa_throw
@ 0x8bf930e parquet::ThrowRowsMisMatchError(int, long, long)
@ 0x8bfa9e8 parquet::FileSerializer::Close()
@ 0x8bf7ad0 parquet::ParquetFileWriter::Close() [clone .localalias]
@ 0x7183400 starrocks::formats::ParquetFileWriter::commit()
@ 0x6f6e7df starrocks::connector::ConnectorChunkSink::finish()
@ 0x6fd6714 starrocks::pipeline::ConnectorSinkOperator::set_finishing(starrocks::RuntimeState*)
@ 0x44b4154 starrocks::pipeline::PipelineDriver::_mark_operator_finishing(...)
@ 0x44b43ac starrocks::pipeline::PipelineDriver::_mark_operator_finished(...)
@ 0x44b4b03 starrocks::pipeline::PipelineDriver::_mark_operator_cancelled(...)
@ 0x44b5025 starrocks::pipeline::PipelineDriver::cancel_operators(starrocks::RuntimeState*)
@ 0x477b524 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x3966ca3 starrocks::ThreadPool::dispatch_thread()
@ 0x395e326 starrocks::Thread::supervise_thread(void*)
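- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.5.0~3.5.13
  - 4.0.0~4.0.5
- Fixed versions:
  - 3.5.14+
  - 4.0.6+
- Root cause:
  - ParquetFileWriter::commit() originally caught only ParquetStatusException. When _writer->Close() throws any other exception type (for example, the row-count mismatch exception thrown by parquet::ThrowRowsMisMatchError), the exception is not caught and the stream cannot be closed, which in turn hangs the pipeline or crashes the BE. The fix widens the catch to std::exception.
- Workaround: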
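The effect of widening the catch described in the root cause can be sketched as follows. The names are hypothetical: `StatusException` stands in for the single exception type that was originally caught, and `close_writer` stands in for the call that may throw something else. The sketch only assumes that the other exception derives from std::exception.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the one exception type handled originally.
struct StatusException : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical close call that can fail with a different exception type.
void close_writer(bool row_mismatch) {
    if (row_mismatch) {
        // Not a StatusException, so a handler for that type alone misses it.
        throw std::logic_error("row count mismatch");
    }
}

// Returns an empty string on success, otherwise an error message.
std::string commit(bool row_mismatch) {
    try {
        close_writer(row_mismatch);
    } catch (const StatusException& e) {
        return std::string("status error: ") + e.what();
    } catch (const std::exception& e) {
        // Broad handler: turns any other standard exception into an error
        // value instead of letting it escape and reach std::terminate().
        return std::string("error: ") + e.what();
    }
    return "";
}

// Usage: both the success path and the unexpected exception are handled.
void commit_example() {
    assert(commit(false).empty());
    assert(commit(true) == "error: row count mismatch");
}
```

With only the first handler, the second call would let the exception escape, which matches the std::terminate() and abort frames in the stack trace above.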
- Fix the hadoop-client bug that causes FE deadlock
jdk.internal.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
java.util.concurrent.ForkJoinTask.awaitDone(ForkJoinTask.java:468)
java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:687)
java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:927)
java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682)
org.apache.hadoop.fs.statistics.impl.EvaluatingStatisticsMap.entrySet(EvaluatingStatisticsMap.java:166)
java.util.Collections$UnmodifiableMap.entrySet(Collections.java:1529)
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.copyMap(IOStatisticsBinding.java:172)
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.snapshotMap(IOStatisticsBinding.java:216)
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.snapshotMap(IOStatisticsBinding.java:199)
org.apache.hadoop.fs.statistics.IOStatisticsSnapshot.snapshot(IOStatisticsSnapshot.java:165)
org.apache.hadoop.fs.statistics.IOStatisticsSnapshot.<init>(IOStatisticsSnapshot.java:125)
org.apache.hadoop.fs.statistics.IOStatisticsSupport.snapshotIOStatistics(IOStatisticsSupport.java:49)
- Github Issue: https://issues.apache.org/jira/browse/HADOOP-19712
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.4: not fixed
  - 3.5.0~3.5.13
  - 4.0.0~4.0.6
- Fixed versions:
  - 3.5.14+
  - 4.0.7+
- Root cause:
  - The hadoop-client 3.4.2 used by StarRocks has a bug (HADOOP-19712) that can cause a deadlock.
- Workaround:
- Concurrent execution of the field function causes BE crash (the BinaryColumn slice cache is not thread-safe)
PC: @ 0x4e40660 starrocks::BinaryColumnBase<unsigned int>::_build_slices() const
*** SIGSEGV (@0x0) received by PID 19649 (TID 0x1530d22c5640) LWP(24061) from PID 0; stack trace: ***
@ 0x1538ba08ef38 __pthread_once_slow
@ 0xbf0a514 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x1538bb58c519 PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0]
@ 0x1538bb58cf6e JVM_handle_linux_signal
@ 0x1538ba03e730 (/usr/lib64/libc.so.6+0x3e72f)
@ 0x4e40660 starrocks::BinaryColumnBase<unsigned int>::_build_slices() const
@ 0x8722837 starrocks::StringFunctions::field<(starrocks::LogicalType)17>(starrocks::FunctionContext*, ...)
@ 0x86e702a std::_Function_handler<...>::_M_invoke(...)
@ 0x745a68d starrocks::VectorizedFunctionCallExpr::evaluate_checked(starrocks::ExprContext*, starrocks::Chunk*)
@ 0x665dbdb starrocks::ExprContext::evaluate(starrocks::Expr*, starrocks::Chunk*, unsigned char*)
@ 0x665df1b starrocks::ExprContext::evaluate(starrocks::Chunk*, unsigned char*)
@ 0x538b16c starrocks::pipeline::ProjectOperator::push_chunk(starrocks::RuntimeState*, std::shared_ptr<starrocks::Chunk> const&)
@ 0x544347a starrocks::pipeline::PipelineDriver::process(starrocks::RuntimeState*, int)
@ 0x58e2aed starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x460d0d7 starrocks::ThreadPool::dispatch_thread()
@ 0x4603eb0 starrocks::Thread::supervise_thread(void*)
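- Github Issue:
- Github Fix PR:
- Jira:
- Affected versions:
  - 3.5.0~3.5.13
  - 4.0.0~4.0.6
- Fixed versions:
  - 3.5.14+
  - 4.0.7+
- Root cause:
  - When the field function runs concurrently on multiple threads, it calls BinaryColumn::get_data() to build the slice cache (lazily loaded, modifying mutable state such as _slices_cache and _slices). This operation is not thread-safe, so concurrent access crashes. The fix reads the values of the constant arguments during the prepare phase and caches them in FieldFuncState; the evaluate phase uses the cached values directly, which avoids the concurrent modification.
- Workaround: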
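The prepare/evaluate split described in the root cause can be sketched in a simplified form. All names are hypothetical (`LazyColumn`, `FieldState`, `prepare`, `field_index`), and the 1-based lookup that returns 0 when nothing matches is an assumption for illustration. The point is that the lazily built cache is touched once, before any concurrent work, and the evaluate step only reads an immutable snapshot.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical column: get_data() lazily builds a cache, so it mutates the
// object even though it is const. Concurrent first calls would race.
class LazyColumn {
public:
    explicit LazyColumn(std::vector<std::string> rows) : rows_(std::move(rows)) {}

    const std::vector<std::string>& get_data() const {
        if (!cache_built_) {  // unsynchronized check-then-write
            cache_ = rows_;
            cache_built_ = true;
        }
        return cache_;
    }

private:
    std::vector<std::string> rows_;
    mutable std::vector<std::string> cache_;
    mutable bool cache_built_ = false;
};

// Hypothetical per-function state, filled once before evaluation starts.
struct FieldState {
    std::vector<std::string> const_args;
};

// Prepare phase: runs once, so building the lazy cache here is safe.
FieldState prepare(const LazyColumn& constant_column) {
    return FieldState{constant_column.get_data()};
}

// Evaluate phase: reads only the snapshot and never touches the column.
// Returns the 1-based position of needle, or 0 if it is absent.
int field_index(const FieldState& state, const std::string& needle) {
    for (std::size_t i = 0; i < state.const_args.size(); ++i) {
        if (state.const_args[i] == needle) {
            return static_cast<int>(i) + 1;
        }
    }
    return 0;
}

// Usage: prepare once, then evaluate as often as needed.
void field_example() {
    LazyColumn constants({"a", "b", "c"});
    const FieldState state = prepare(constants);
    assert(field_index(state, "b") == 2);
    assert(field_index(state, "z") == 0);
}
```

Because `field_index` takes the state by const reference and `FieldState` has no mutable members, calling it from several threads at once performs only reads.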