Version 3.3.17: thread creation fails when writing data, affecting production

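To help locate your problem faster, please provide the following information. Thank you.
[Details] Detailed description of the problem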

image

W20250819 05:00:32.859540 140216180680256 sink_memory_manager.cpp:124] consumption: 36424531584 releasable_memory: 0 writer_allocated_memory: 0
W20250819 05:00:32.968602 140214238340672 engine_storage_migration_task.cpp:122] could not migration because has unfinished txns.
W20250819 05:00:32.968635 140214238340672 agent_task.cpp:359] local tablet migration failed. status: Internal error: could not migration because has unfinished txns., signature: 494986987
W20250819 05:00:33.016201 140187058275904 agent_server.cpp:588] fail to make_snapshot. tablet_id:475371360 msg:Not found: get_rowsets_for_snapshot: no version to clone tablet:475371360 #version:160 [46443 46600@159 46600] #pending:0 request_version:46601,
W20250819 05:00:33.061130 140177647334976 agent_server.cpp:588] fail to make_snapshot. tablet_id:475371376 msg:Not found: get_rowsets_for_snapshot: no version to clone tablet:475371376 #version:160 [46444 46600@159 46600] #pending:0 request_version:46601,
W20250819 05:00:54.939501 140120892110400 engine_clone_task.cpp:410] Fail to make snapshot from saas-node12: Not found: get_rowsets_for_snapshot: no version to clone tablet:475371372 #version:157 [46454.1 46608@156 46608] #pending:0 request_version:46609, tablet:475371372
W20250819 05:00:58.799694 140216138716736 pipeline_driver.cpp:376] push_chunk returns not ok status Runtime error: Could not create thread: Resource temporarily unavailable
be/src/exec/tablet_sink.cpp:663 _automatic_partition_token->submit_func([this] { this->_automatic_partition_status = this->_automatic_create_partition(); if (!this->_automatic_partition_status.ok()) { google::LogMessage("be/src/exec/tablet_sink.cpp", 666, google::GLOG_WARNING).stream() << "Failed to automatic create partition, err=" << this->_automatic_partition_status; } _is_automatic_partition_running.store(false, std::memory_order_release); })
W20250819 05:00:58.799953 140216138716736 pipeline_driver_executor.cpp:184] [Driver] Process error, query_id=3c9c6ba9-7c74-11f0-9b9a-0242b8d519da, instance_id=3c9c6ba9-7c74-11f0-9b9a-0242b8d519dd, status=Runtime error: Could not create thread: Resource temporarily unavailable: BE:10003
W20250819 05:00:58.873235 140216138716736 pipeline_driver.cpp:581] cancel pipeline driver error [driver=query_id=3c9c6ba9-7c74-11f0-9b9a-0242b8d519da fragment_id=3c9c6ba9-7c74-11f0-9b9a-0242b8d519dd driver=driver_88_2, status=RUNNING, operator-chain: [spillable_aggregate_blocking_source_88_0x7f83fb713c10(X) -> project_89_0x7f83fb722210(X) -> olap_table_sink_-1_0x7f83fb723710(X)]]: Cancelled: Cancelled by pipeline engine
W20250819 05:00:58.873269 140216138716736 tablet_sink_sender.cpp:250] close channel failed. channel_name=NodeChannel[10001], load_info=load_id=3c9c6ba9-7c74-11f0-9b9a-0242b8d519da, txn_id: 57017890, parallel=1, compress_type=2, error_msg=Cancelled by pipeline engine
W20250819 05:00:58.873275 140216138716736 tablet_sink_sender.cpp:250] close channel failed. channel_name=NodeChannel[10003], load_info=load_id=3c9c6ba9-7c74-11f0-9b9a-0242b8d519da, txn_id: 57017890, parallel=1, compress_type=2, error_msg=Cancelled by pipeline engine
W20250819 05:00:58.873278 140216138716736 tablet_sink_sender.cpp:250] close channel failed. channel_name=NodeChannel[10002], load_info=load_id=3c9c6ba9-7c74-11f0-9b9a-0242b8d519da, txn_id: 57017890, parallel=1, compress_type=2, error_msg=Cancelled by pipeline engine
E20250819 05:00:59.551697 140215035668032 threadpool.cpp:485] Thread pool failed to create thread: Runtime error: Could not create thread: Resource temporarily unavailable
    @          0x6325a05  starrocks::get_stack_trace[abi:cxx11]()
    @          0x874c4c1  starrocks::ThreadPool::do_submit(std::shared_ptr<starrocks::Runnable>, starrocks::ThreadPoolToken*, starrocks::ThreadPool::Priority)
    @          0x874c759  starrocks::ThreadPoolToken::submit(std::shared_ptr<starrocks::Runnable>, starrocks::ThreadPool::Priority)
    @          0x71d40bc  starrocks::SegmentFlushToken::submit(starrocks::DeltaWriter*, brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x8506a6e  starrocks::AsyncDeltaWriter::write_segment(starrocks::AsyncDeltaWriterSegmentRequest const&)
    @          0x84f5ccc  starrocks::LocalTabletsChannel::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x84ef786  starrocks::LoadChannel::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x84e7a55  starrocks::LoadChannelMgr::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x9dfcfb3  brpc::policy::ProcessRpcRequest(brpc::InputMessageBase*)
    @          0x9d2255b  brpc::ProcessInputMessage(void*)
    @          0x9d239a4  brpc::InputMessenger::OnNewMessages(brpc::Socket*)
    @          0x9d60322  brpc::Socket::ProcessEvent(void*)
    @          0x9cd7397  bthread::TaskGroup::task_runner(long)
    @          0x9cc05c1  bthread_make_fcontext
E20250819 05:00:59.773030 140215060846144 threadpool.cpp:485] Thread pool failed to create thread: Runtime error: Could not create thread: Resource temporarily unavailable
    @          0x6325a05  starrocks::get_stack_trace[abi:cxx11]()
    @          0x874c4c1  starrocks::ThreadPool::do_submit(std::shared_ptr<starrocks::Runnable>, starrocks::ThreadPoolToken*, starrocks::ThreadPool::Priority)
    @          0x874c759  starrocks::ThreadPoolToken::submit(std::shared_ptr<starrocks::Runnable>, starrocks::ThreadPool::Priority)
    @          0x71d40bc  starrocks::SegmentFlushToken::submit(starrocks::DeltaWriter*, brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x8506a6e  starrocks::AsyncDeltaWriter::write_segment(starrocks::AsyncDeltaWriterSegmentRequest const&)
    @          0x84f5ccc  starrocks::LocalTabletsChannel::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x84ef786  starrocks::LoadChannel::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x84e7a55  starrocks::LoadChannelMgr::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x9dfcfb3  brpc::policy::ProcessRpcRequest(brpc::InputMessageBase*)
    @          0x9d2255b  brpc::ProcessInputMessage(void*)
    @          0x9d239a4  brpc::InputMessenger::OnNewMessages(brpc::Socket*)
    @          0x9d60322  brpc::Socket::ProcessEvent(void*)
    @          0x9cd7397  bthread::TaskGroup::task_runner(long)
    @          0x9cc05c1  bthread_make_fcontext
E20250819 05:01:00.008913 140215052453440 threadpool.cpp:485] Thread pool failed to create thread: Runtime error: Could not create thread: Resource temporarily unavailable
    @          0x6325a05  starrocks::get_stack_trace[abi:cxx11]()
    @          0x874c4c1  starrocks::ThreadPool::do_submit(std::shared_ptr<starrocks::Runnable>, starrocks::ThreadPoolToken*, starrocks::ThreadPool::Priority)
    @          0x874c759  starrocks::ThreadPoolToken::submit(std::shared_ptr<starrocks::Runnable>, starrocks::ThreadPool::Priority)
    @          0x71d40bc  starrocks::SegmentFlushToken::submit(starrocks::DeltaWriter*, brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x8506a6e  starrocks::AsyncDeltaWriter::write_segment(starrocks::AsyncDeltaWriterSegmentRequest const&)
    @          0x84f5ccc  starrocks::LocalTabletsChannel::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x84ef786  starrocks::LoadChannel::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x84e7a55  starrocks::LoadChannelMgr::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x9dfcfb3  brpc::policy::ProcessRpcRequest(brpc::InputMessageBase*)
    @          0x9d2255b  brpc::ProcessInputMessage(void*)
    @          0x9d239a4  brpc::InputMessenger::OnNewMessages(brpc::Socket*)
    @          0x9d60322  brpc::Socket::ProcessEvent(void*)
    @          0x9cd7397  bthread::TaskGroup::task_runner(long)
    @          0x9cc05c1  bthread_make_fcontext
E20250819 05:01:00.271202 140215027275328 threadpool.cpp:485] Thread pool failed to create thread: Runtime error: Could not create thread: Resource temporarily unavailable
    @          0x6325a05  starrocks::get_stack_trace[abi:cxx11]()
    @          0x874c4c1  starrocks::ThreadPool::do_submit(std::shared_ptr<starrocks::Runnable>, starrocks::ThreadPoolToken*, starrocks::ThreadPool::Priority)
    @          0x874c759  starrocks::ThreadPoolToken::submit(std::shared_ptr<starrocks::Runnable>, starrocks::ThreadPool::Priority)
    @          0x71d40bc  starrocks::SegmentFlushToken::submit(starrocks::DeltaWriter*, brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x8506a6e  starrocks::AsyncDeltaWriter::write_segment(starrocks::AsyncDeltaWriterSegmentRequest const&)
    @          0x84f5ccc  starrocks::LocalTabletsChannel::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x84ef786  starrocks::LoadChannel::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x84e7a55  starrocks::LoadChannelMgr::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x9dfcfb3  brpc::policy::ProcessRpcRequest(brpc::InputMessageBase*)
    @          0x9d2255b  brpc::ProcessInputMessage(void*)
    @          0x9d239a4  brpc::InputMessenger::OnNewMessages(brpc::Socket*)
    @          0x9d60322  brpc::Socket::ProcessEvent(void*)
    @          0x9cd7397  bthread::TaskGroup::task_runner(long)
    @          0x9cc05c1  bthread_make_fcontext
E20250819 05:01:00.467808 140215010489920 threadpool.cpp:485] Thread pool failed to create thread: Runtime error: Could not create thread: Resource temporarily unavailable
    @          0x6325a05  starrocks::get_stack_trace[abi:cxx11]()
    @          0x874c4c1  starrocks::ThreadPool::do_submit(std::shared_ptr<starrocks::Runnable>, starrocks::ThreadPoolToken*, starrocks::ThreadPool::Priority)
    @          0x874c759  starrocks::ThreadPoolToken::submit(std::shared_ptr<starrocks::Runnable>, starrocks::ThreadPool::Priority)
    @          0x71d40bc  starrocks::SegmentFlushToken::submit(starrocks::DeltaWriter*, brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x8506a6e  starrocks::AsyncDeltaWriter::write_segment(starrocks::AsyncDeltaWriterSegmentRequest const&)
    @          0x84f5ccc  starrocks::LocalTabletsChannel::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x84ef786  starrocks::LoadChannel::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x84e7a55  starrocks::LoadChannelMgr::add_segment(brpc::Controller*, starrocks::PTabletWriterAddSegmentRequest const*, starrocks::PTabletWriterAddSegmentResult*, google::protobuf::Closure*)
    @          0x9dfcfb3  brpc::policy::ProcessRpcRequest(brpc::InputMessageBase*)
    @          0x9d2255b  brpc::ProcessInputMessage(void*)
    @          0x9d239a4  brpc::InputMessenger::OnNewMessages(brpc::Socket*)

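The logs show no warning signs before the errors start, and the monitoring shows no growth in the thread count.
Machine process/thread monitoring
image
Machine network monitoring
image
Machine disk monitoring
image
StarRocks BE monitoring
image
StarRocks BE task monitoring
image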
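
Dashboards that sample at a coarse interval can miss a short-lived spike in thread count, so it may be worth reading the count directly on the BE host around the time of the failure. Below is a minimal sketch, assuming a Linux host with `/proc` mounted. The function names (`parse_thread_count`, `read_thread_count`, `sample_peak`) are hypothetical helpers written for this post, not part of StarRocks.

```python
import os
import re
import time


def parse_thread_count(status_text: str) -> int:
    """Return the 'Threads:' value from the text of /proc/<pid>/status."""
    match = re.search(r"^Threads:\s+(\d+)$", status_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Threads:' field found in status text")
    return int(match.group(1))


def read_thread_count(pid: int) -> int:
    """Read the current thread count of a live process (Linux only)."""
    with open(f"/proc/{pid}/status", encoding="ascii", errors="replace") as f:
        return parse_thread_count(f.read())


def sample_peak(pid: int, samples: int = 60, interval_s: float = 1.0) -> int:
    """Poll the thread count and return the highest value observed."""
    peak = 0
    for _ in range(samples):
        peak = max(peak, read_thread_count(pid))
        time.sleep(interval_s)
    return peak


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    # Demo on the current Python process; pass the BE process id in practice.
    print("threads:", read_thread_count(os.getpid()))
```

Running `sample_peak` with the BE process id during a load window would show whether the thread count briefly climbs between two monitoring samples.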

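[Background] What operations were performed?
[Business impact] SaaS service; production report data processing is affected
[Shared-data (storage-compute separation)?]
[StarRocks version] e.g. 3.3.17
[Cluster size] e.g. 3 FE (1 follower + 2 observers) + 5 BE (FE and BE co-located)
[Machine info] vCPUs/memory/NIC, e.g. 48C/64G/10GbE
[Contact] So that we can reach you promptly for log information while working on the problem, please add your contact details, e.g. community group 16-可乐鸡 or an email address. Thank you.
[Attachments]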

ulimit -a

Check what the current system resource limits are.

[root@saas-node11 ~]# ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 252877
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 524288
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 252877
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
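
Note that `ulimit -a` reports the limits of the shell it is run in (here, root's interactive shell), which may differ from the limits the running BE process actually inherited. The effective values for a live process are in `/proc/<pid>/limits`. In addition, `pthread_create` can return `EAGAIN` ("Resource temporarily unavailable") when a system-wide ceiling such as `kernel.threads-max` or `kernel.pid_max` is reached, and `vm.max_map_count` or a cgroup `pids.max` setting may also be involved. A minimal sketch for collecting these values, assuming Linux; the helper names are hypothetical, and the parser assumes the columns in `/proc/<pid>/limits` are separated by at least two spaces:

```python
import re
from typing import Dict, Optional, Sequence, Tuple

# System-wide settings that can cap thread creation on Linux.
KERNEL_KNOBS = (
    "/proc/sys/kernel/threads-max",
    "/proc/sys/kernel/pid_max",
    "/proc/sys/vm/max_map_count",
)


def parse_proc_limits(limits_text: str) -> Dict[str, Tuple[str, str]]:
    """Map limit name -> (soft, hard) from the text of /proc/<pid>/limits."""
    limits: Dict[str, Tuple[str, str]] = {}
    for line in limits_text.splitlines()[1:]:  # skip the header row
        # Limit names contain single spaces; columns are padded with 2+ spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def read_kernel_knobs(paths: Sequence[str] = KERNEL_KNOBS) -> Dict[str, Optional[int]]:
    """Read each kernel setting as an int, or None if it cannot be read."""
    values: Dict[str, Optional[int]] = {}
    for path in paths:
        try:
            with open(path) as f:
                values[path] = int(f.read().strip())
        except (OSError, ValueError):
            values[path] = None
    return values


def read_process_limits(pid: int) -> Dict[str, Tuple[str, str]]:
    """Read the effective limits of a live process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_proc_limits(f.read())
```

Calling `read_process_limits` with the BE process id and looking at the `Max processes` entry, alongside the output of `read_kernel_knobs()`, would show whether the running process has a lower ceiling than the shell output above suggests.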