Shared-data (storage-compute separation) version: BE crash issue

【Details】On the shared-data version 3.1.9-e1c6e4e, BE nodes crash. The be.out log is as follows:

query_id:2e93addd-0131-11ef-b7c5-00163e3794a9, fragment_instance:2e93addd-0131-11ef-b7c5-00163e3798d8
tracker:process consumption: 56909424736
tracker:query_pool consumption: 34269842480
tracker:load consumption: 0
tracker:metadata consumption: 988293418
tracker:tablet_metadata consumption: 3308529
tracker:rowset_metadata consumption: 0
tracker:segment_metadata consumption: 96303461
tracker:column_metadata consumption: 888681428
tracker:tablet_schema consumption: 3308529
tracker:segment_zonemap consumption: 77374472
tracker:short_key_index consumption: 8579716
tracker:column_zonemap_index consumption: 307591580
tracker:ordinal_index consumption: 342980176
tracker:bitmap_index consumption: 0
tracker:bloom_filter_index consumption: 0
tracker:compaction consumption: 40383112
tracker:schema_change consumption: 0
tracker:column_pool consumption: 0
tracker:page_cache consumption: 268884416
tracker:update consumption: 3740104099
tracker:chunk_allocator consumption: 1609851288
tracker:clone consumption: 0
tracker:consistency consumption: 0
tracker:datacache consumption: 0
tracker:replication consumption: 0
*** Aborted at 1713849934 (unix time) try "date -d @1713849934" if you are using GNU date ***
PC: @ 0x2cbfa36 std::_Rb_tree<>::find()
*** SIGSEGV (@0x10) received by PID 89287 (TID 0x7f1009151700) from PID 16; stack trace: ***
@ 0x64fe742 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f10aba1db25 os::Linux::chained_handler()
@ 0x7f10aba22b21 JVM_handle_linux_signal
@ 0x7f10aba158d8 signalHandler()
@ 0x7f10aaec8630 (unknown)
@ 0x2cbfa36 std::_Rb_tree<>::find()
@ 0x2cbdc7d starrocks::RuntimeProfile::merge_isomorphic_profiles()
@ 0x35dbcb4 starrocks::pipeline::ScanOperator::_merge_chunk_source_profiles()
@ 0x366259a starrocks::pipeline::PipelineDriver::runtime_report_action()
@ 0x36721f4 starrocks::pipeline::FragmentContext::report_exec_state_if_necessary()
@ 0x3658845 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
@ 0x2cf8f5a starrocks::ThreadPool::dispatch_thread()
@ 0x2cf39ea starrocks::thread::supervise_thread()
@ 0x7f10aaec0ea5 start_thread
@ 0x7f10aa2c1b0d __clone
@ 0x0 (unknown)

【Background】
【Business impact】
【Shared-data deployment】Yes
【StarRocks version】3.1.9-e1c6e4e
【Cluster size】3 FE + 20 BE

We hit the same error earlier on version 3.1.6-fcc0c6b.

The core file from version 3.1.6 is here:
https://bi-rx-bigdata.oss-cn-beijing.aliyuncs.com/core/core-starrocks_be-9965-1706190172.gz
This is the core from that crash.

Search for the query ID in fe.audit.log and post the corresponding SQL.
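One way to do that lookup, as a minimal sketch: the helper name `find_query` and the default log path are assumptions for illustration and are not taken from this thread, so point it at the audit log under your own FE log directory.

```shell
# Hypothetical helper: print every audit-log line that mentions a query ID.
# The default path below is an assumption; pass the real log file explicitly.
find_query() {
  query_id="$1"
  audit_log="${2:-fe/log/fe.audit.log}"
  # -F matches the ID as a fixed string instead of a regular expression
  grep -F -- "$query_id" "$audit_log"
}
```

For the crash above this would be called as `find_query 2e93addd-0131-11ef-b7c5-00163e3794a9 /path/to/fe.audit.log`, using the query_id printed at the top of the be.out excerpt.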

Can you open this core on your machine? (Use gdb /home/disk1/sr/output/be/lib/starrocks_be ~/core-starrocks_be-9965-1706190172)
If so, could you package the files under /lib64 and send them over?

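That core was saved from the earlier 3.1.6 version and uploaded to Alibaba Cloud OSS. We no longer have the core file locally, so packaging is not possible. Core dumps are not configured on the current version, so we have not captured a more recent core either.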
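As a rough sketch of how to check whether a core file would be written at all on a Linux host: the function name `core_dump_status` is made up for illustration, and how the BE process is actually started (and therefore which limits it inherits) is an assumption that depends on the deployment.

```shell
# Hypothetical check: report the core-file settings visible from this shell.
core_dump_status() {
  limit=$(ulimit -c)   # "0" means core files are disabled for this shell
  echo "core size limit: $limit"
  if [ -r /proc/sys/kernel/core_pattern ]; then
    # The kernel writes core files according to this pattern
    echo "core pattern: $(cat /proc/sys/kernel/core_pattern)"
  fi
}
core_dump_status
# To allow cores for processes launched from this shell, run this
# before starting the BE (assumes the hard limit permits it):
#   ulimit -c unlimited
```

The limit is inherited by child processes, so it has to be raised in the same shell or service configuration that launches the BE.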

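The retained fe.audit.log files only go back to the 25th, so the query can no longer be found. If the crash reproduces, I will get the specific SQL.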