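To help us locate your problem faster, please provide the following information. Thank you.
[Details] Detailed description of the problem
BE crash
[Background] What operations were performed?
[Business impact]
[Shared-data (storage-compute separation) deployment?]
[StarRocks version] e.g. 3.3.6
[Cluster size] e.g. 3 FE (3 follower + 0 observer) + 3 BE (FE and BE co-located)
[Machine info] vCPUs / memory / NIC, e.g. 16C/64G/10 GbE
[Contact] So that we can reach you promptly for log information while working on the issue, please add your contact details, e.g. community group 16-colagy or an email address. Thank you.
[Attachments]
- fe.log / be.INFO / relevant screenshots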
W20250311 09:40:50.794377 139986770675264 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:40:50.895560 139986770675264 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:40:50.997820 139986770675264 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:41:02.174420 139984684504640 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:41:23.488083 139986737104448 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:41:33.100276 139986560857664 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:41:39.649083 139986644784704 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:43:15.371214 139986728711744 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:43:31.808899 139984684504640 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:43:52.354570 139985869080128 query_context.cpp:675] Retrying ReportExecStatus: No more data to read.
W20250311 09:43:54.942871 139984684504640 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:43:55.059100 139984684504640 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:44:10.840282 139984709682752 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:44:10.956387 139984709682752 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:44:23.797423 139986611213888 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:44:23.898188 139986611213888 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:44:23.999268 139986611213888 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:44:52.366995 139985869080128 query_context.cpp:675] Retrying ReportExecStatus: No more data to read.
W20250311 09:45:01.784686 139984709682752 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:45:22.387358 139985869080128 query_context.cpp:675] Retrying ReportExecStatus: No more data to read.
W20250311 09:45:52.583482 139985869080128 query_context.cpp:675] Retrying ReportExecStatus: No more data to read.
W20250311 09:46:02.863743 139986636392000 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:46:09.690354 139984692897344 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:46:09.806950 139984692897344 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:47:01.423710 139984684504640 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:47:01.540110 139984684504640 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:47:08.505173 139985885865536 runtime_filter_worker.cpp:333] brpc failed, error=RPC call is timed out, error_text=[E1008]Reached timeout=400ms @192.168.18.222:8060
W20250311 09:47:08.505367 139985885865536 runtime_filter_worker.cpp:333] brpc failed, error=RPC call is timed out, error_text=[E1008]Reached timeout=400ms @192.168.18.224:8060
W20250311 09:47:32.242575 139986602821184 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:47:52.461264 139986485323328 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:48:03.148128 139986527286848 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:48:03.249068 139986527286848 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:49:12.468167 139816349894208 data_dir.cpp:383] could not find tablet id: 13493761 for rowset: 020000000220c136e746f725ea0ddadbf6da3519f0637c95, skip loading this rowset
W20250311 09:49:17.430282 139811735868992 heartbeat_server.cpp:205] pengze-node00 not equal to to backend localhost 192.168.18.220
W20250311 09:49:59.125195 139811613533760 engine_clone_task.cpp:403] Fail to make snapshot from pengze-node02: Not found: get_rowsets_for_snapshot: no version to clone tablet:13442909 #version:97 [514 577@96 577] #pending:0 request_version:578, tablet:13442909
W20250311 09:49:59.125363 139811596748352 engine_clone_task.cpp:403] Fail to make snapshot from pengze-node01: Not found: get_rowsets_for_snapshot: no version to clone tablet:13442933 #version:74 [530 577@73 577] #pending:0 request_version:578, tablet:13442933
W20250311 09:49:59.125407 139811672282688 engine_clone_task.cpp:403] Fail to make snapshot from pengze-node01: Not found: get_rowsets_for_snapshot: no version to clone tablet:13442925 #version:73 [530 577@72 577] #pending:0 request_version:578, tablet:13442925
W20250311 09:52:13.777864 139812514285120 storage_engine.cpp:1124] tablet not found, remove rowset meta, rowset_id=020000000220c136e746f725ea0ddadbf6da3519f0637c95 tablet: 13493761
W20250311 09:52:13.780432 139812514285120 storage_engine.cpp:1144] traverse_rowset_meta and remove 81/163 invalid rowset metas, path:/dbdata1 duration:2ms
W20250311 09:54:00.180397 139813891724864 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:55:20.195493 139813765834304 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node02, port=9020), reason=No more data to read.
W20250311 09:55:20.296716 139813765834304 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node02, port=9020), reason=No more data to read.
W20250311 09:56:56.733431 139813376099904 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
W20250311 09:56:57.068602 139813334136384 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 09:56:57.511903 139813325743680 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node02, port=9020), reason=No more data to read.
W20250311 09:57:34.357294 139813975651904 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node02, port=9020), reason=No more data to read.
W20250311 09:57:34.457910 139813975651904 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node02, port=9020), reason=No more data to read.
W20250311 09:58:44.679638 139811799860800 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node01, port=9020), reason=No more data to read.
W20250311 10:00:02.838293 139813858154048 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=pengze-node00, port=9020), reason=No more data to read.
- be crash
- be.out
3.3.6 RELEASE (build 8f01cfa)
query_id:90705366-fe1a-11ef-ae94-024234d8287d, fragment_instance:90705366-fe1a-11ef-ae94-024234d828eb
tracker:process consumption: 19217561592
tracker:jemalloc_metadata consumption: 601174560
tracker:jemalloc_fragmentation consumption: 336564736
tracker:query_pool consumption: 11133394080
tracker:query_pool/connector_scan consumption: 0
tracker:load consumption: 763392
tracker:metadata consumption: 661226991
tracker:tablet_metadata consumption: 162826852
tracker:rowset_metadata consumption: 41567547
tracker:segment_metadata consumption: 21636041
tracker:column_metadata consumption: 435196551
tracker:tablet_schema consumption: 1169644
tracker:segment_zonemap consumption: 10098957
tracker:short_key_index consumption: 245579
tracker:column_zonemap_index consumption: 24667199
tracker:ordinal_index consumption: 211970168
tracker:bitmap_index consumption: 0
tracker:bloom_filter_index consumption: 0
tracker:compaction consumption: 0
tracker:schema_change consumption: 0
tracker:column_pool consumption: 0
tracker:page_cache consumption: 7222224
tracker:jit_cache consumption: 9584
tracker:update consumption: 35321512
tracker:chunk_allocator consumption: 0
tracker:passthrough consumption: 0
tracker:clone consumption: 0
tracker:consistency consumption: 0
tracker:datacache consumption: 0
tracker:replication consumption: 0
*** Aborted at 1741657717 (unix time) try "date -d @1741657717" if you are using GNU date ***
PC: @ 0x7f51ec288aca (/usr/lib/x86_64-linux-gnu/libc.so.6+0x1a0ac9)
*** SIGSEGV (@0x7f51e57ff000) received by PID 25 (TID 0x7f5145bf4640) from PID 18446744073264951296; stack trace: ***
@ 0x7f51ec181ee8 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x99ee7)
@ 0xa16e1c9 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x7f51ec12a520 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x4251f)
@ 0x7f51ec288aca (/usr/lib/x86_64-linux-gnu/libc.so.6+0x1a0ac9)
@ 0x5472aed starrocks::FixedLengthColumnBase<signed char>::append(starrocks::Column const&, unsigned long, unsigned long)
@ 0x53b194a starrocks::Chunk::append(starrocks::Chunk const&, unsigned long, unsigned long)
@ 0x58b5989 starrocks::spill::OrderedMemTable::append(std::shared_ptr<starrocks::Chunk>)
@ 0x5827e8c starrocks::Status starrocks::spill::RawSpillerWriter::spill<starrocks::spill::IOTaskExecutor, starrocks::spill::ResourceMemTrackerGuard<std::weak_ptr<starrocks::pipeline::QueryContext>, std::weak_ptr<starrocks::spill::Spiller> >&>(starrocks::RuntimeState*,
@ 0x582a4ad starrocks::Status starrocks::spill::Spiller::spill<starrocks::spill::IOTaskExecutor, starrocks::spill::ResourceMemTrackerGuard<std::weak_ptr<starrocks::pipeline::QueryContext>, std::weak_ptr<starrocks::spill::Spiller> > >(starrocks::RuntimeState*, std::sha
@ 0x75cfa3c starrocks::pipeline::SpillProcessOperator::pull_chunk(starrocks::RuntimeState*)
@ 0x5396c5f starrocks::pipeline::PipelineDriver::process(starrocks::RuntimeState*, int)
@ 0x7d94683 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x8aa57d2 starrocks::ThreadPool::dispatch_thread()
@ 0x8a9db09 starrocks::Thread::supervise_thread(void*)
@ 0x7f51ec17cac3 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x94ac2)
@ 0x7f51ec20da04 clone
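To line the crash up with the BE warning log above, the epoch value from the "*** Aborted at 1741657717 (unix time)" line can be converted to wall-clock time. This is only a small sketch; the UTC+8 offset is an assumption about the timezone the BE log timestamps are written in, not something the logs state.

```python
from datetime import datetime, timedelta, timezone

# Epoch seconds copied from the "*** Aborted at ..." line in be.out.
ABORT_EPOCH = 1741657717

# Assumption: the BE log timestamps are in UTC+8. Adjust if the host differs.
ASSUMED_LOG_TZ = timezone(timedelta(hours=8))

abort_utc = datetime.fromtimestamp(ABORT_EPOCH, tz=timezone.utc)
abort_local = abort_utc.astimezone(ASSUMED_LOG_TZ)

print("abort (UTC):          ", abort_utc.isoformat())
print("abort (assumed UTC+8):", abort_local.isoformat())
```

Under that assumption the abort falls at 09:48:37 on 2025-03-11, which sits between the last warning at 09:48:03 and the data_dir.cpp message at 09:49:12. That would be consistent with the BE restarting and reloading rowsets at that point, though the logs alone do not prove it.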
- coredump (how to obtain a coredump)
- External table query errors
- be.out and fe.warn.log
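For anyone reading the be.out dump above, the "tracker:" lines are raw byte counts and are easier to compare once converted. Below is a minimal sketch that parses a few of those lines (copied verbatim from the dump) and prints them in GiB. The helper names `parse_trackers` and `to_gib` are made up for this illustration and are not part of StarRocks.

```python
import re

# A few tracker lines copied verbatim from the be.out dump above.
BE_OUT_SAMPLE = """\
tracker:process consumption: 19217561592
tracker:query_pool consumption: 11133394080
tracker:metadata consumption: 661226991
tracker:update consumption: 35321512
"""

# Matches lines of the form "tracker:<name> consumption: <bytes>".
TRACKER_RE = re.compile(r"^tracker:(\S+) consumption: (\d+)$")


def parse_trackers(text):
    """Return {tracker_name: bytes} for every 'tracker:' line in text."""
    result = {}
    for line in text.splitlines():
        match = TRACKER_RE.match(line.strip())
        if match:
            result[match.group(1)] = int(match.group(2))
    return result


def to_gib(num_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


trackers = parse_trackers(BE_OUT_SAMPLE)
# Print the largest consumers first.
for name, size in sorted(trackers.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<12} {to_gib(size):7.2f} GiB")
```

On these values the process tracker is roughly 17.9 GiB and query_pool roughly 10.4 GiB at the time of the crash, so query memory accounts for more than half of the tracked total. Whether that is close to the BE memory limit depends on the machine and configuration, which are not given in this post.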