ngram_search_case_insensitive causes a memory leak and the BE crashes outright

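To help us locate your problem faster, please provide the following information. Thank you.
[Details]
ngram_search_case_insensitive causes a memory leak and the BE crashes outright.
[Background] StarRocks 3.3.16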
SELECT COUNT(*)
FROM (

SELECT t.product_id_source as product_id
, t.title
, t.price
, t.promotion_price
, t.retail_price
, t.coupon_price
, t.cos_ratio
, t.cos_fee
, t.first_cid
, t.sales
, t.cover
, t.images
, t.detail_url
, t.shop_id
, t.shop_name
, t.comment_score
, t.order_num
, t.commission_type
, t.platform
, t.platform_all
, t.service_ratio
, t.colonel_level
-- Composite score: platform weight × (text match score + relative sales score)
, case
when platform = 'tbb' then 1.5
when platform = 'tbc' then 1.2
when platform = 'dyc' then 1.2
else 1.0 end *
(ngram_search_case_insensitive(title, '男刺绣毛衣加绒圆领打底衫潮流内搭加厚保暖衣休闲宽松秋冬半高领',
2) + -- Bucket sales into tiers so that exact sales figures do not skew the score
case
when sales > 100000 then 0.1 -- high sales tier
when sales > 50000 then 0.05 -- upper-middle sales tier
when sales > 10000 then 0.01 -- middle sales tier
else 0.005 -- low sales tier
end) as comprehensive_score
FROM dws_cps_product_base_info t
WHERE (ngram_search_case_insensitive(t.title, '男刺绣毛衣加绒圆领打底衫潮流内搭加厚保暖衣休闲宽松秋冬半高领', 2) >
0.3 AND t.platform_all = 'dy' AND t.is_on_sales = 0 AND
t.etl_date >= DATE_FORMAT(DATE_SUB(CURDATE(), INTERVAL 7 DAY), '%Y-%m-%d 00:00:00'))
ORDER BY comprehensive_score DESC

  )

Log:
query_id:c7dfee14-c44e-11f0-9b4d-3ed1f7924773, fragment_instance:c7dfee14-c44e-11f0-9b4d-3ed1f7924775
tracker:process consumption: 32455065192
tracker:jemalloc_metadata consumption: 1272293408
tracker:query_pool consumption: 3041808
tracker:query_pool/connector_scan consumption: 0
tracker:load consumption: 0
tracker:metadata consumption: 2409116972
tracker:tablet_metadata consumption: 249500243
tracker:rowset_metadata consumption: 215077503
tracker:segment_metadata consumption: 290909540
tracker:column_metadata consumption: 1653629686
tracker:tablet_schema consumption: 54347
tracker:segment_zonemap consumption: 247199748
tracker:short_key_index consumption: 1037583
tracker:column_zonemap_index consumption: 448020198
tracker:ordinal_index consumption: 364221920
tracker:bitmap_index consumption: 24941808
tracker:bloom_filter_index consumption: 24111936
tracker:compaction consumption: 0
tracker:schema_change consumption: 0
tracker:column_pool consumption: 0
tracker:page_cache consumption: 21970599536
tracker:jit_cache consumption: 1720
tracker:update consumption: 4007163574
tracker:chunk_allocator consumption: 0
tracker:passthrough consumption: 0
tracker:clone consumption: 0
tracker:consistency consumption: 0
tracker:datacache consumption: 0
tracker:replication consumption: 0
*** Aborted at 1763450308 (unix time) try "date -d @1763450308" if you are using GNU date ***
PC: @ 0x9572bc0 starrocks::NgramFunctionImpl<true, false, char>::haystack_vector_and_needle_const(std::shared_ptr<starrocks::Column> const&, std::vector<unsigned short, std::allocator<unsigned short> >&, starrocks::FunctionContext*, unsigned long)
*** SIGSEGV (@0x520000f2ca) received by PID 37067 (TID 0x148a74ac2640) from PID 62154; stack trace: ***
@ 0x148af6251ee8 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x99ee7)
@ 0xa2ba469 google::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*)
@ 0x148af61fa520 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x4251f)
@ 0x9572bc0 starrocks::NgramFunctionImpl<true, false, char>::haystack_vector_and_needle_const(std::shared_ptr<starrocks::Column> const&, std::vector<unsigned short, std::allocator<unsigned short> >&, starrocks::FunctionContext*, unsigned long)
@ 0x9570fc8 starrocks::StringFunctions::ngram_search_case_insensitive(starrocks::FunctionContext*, std::vector<std::shared_ptr<starrocks::Column>, std::allocator<std::shared_ptr<starrocks::Column> > > const&)
@ 0x7ed316e starrocks::VectorizedFunctionCallExpr::evaluate_checked(starrocks::ExprContext*, starrocks::Chunk*)
@ 0x6fa2608 starrocks::VectorizedBinaryPredicate<(starrocks::LogicalType)11, starrocks::BinaryPredFunc<std::greater > >::evaluate_checked(starrocks::ExprContext*, starrocks::Chunk*)
@ 0x56b7583 starrocks::ExprContext::evaluate(starrocks::Expr*, starrocks::Chunk*, unsigned char*)
@ 0x56b7a92 starrocks::ExprContext::evaluate(starrocks::Chunk*, unsigned char*)
@ 0x6eb2d2d starrocks::ColumnExprPredicate::evaluate(starrocks::Column const*, unsigned char*, unsigned short, unsigned short) const
@ 0x6fd70c1 starrocks::ColumnPredicateRewriter::_rewrite_expr_predicate(starrocks::ObjectPool*, starrocks::ColumnPredicate const*, std::shared_ptr<starrocks::Column> const&, std::shared_ptr<starrocks::Column> const&, bool, starrocks::ColumnPredicate**)
@ 0x6fd81a1 starrocks::ColumnPredicateRewriter::_rewrite_predicate(starrocks::ObjectPool*, std::shared_ptr<starrocks::Field> const&)
@ 0x6fd9254 starrocks::ColumnPredicateRewriter::rewrite_predicate(starrocks::ObjectPool*)
@ 0x756e683 starrocks::SegmentIterator::_rewrite_predicates()
@ 0x75801d9 starrocks::SegmentIterator::_init()
@ 0x7580913 starrocks::SegmentIterator::do_get_next(starrocks::Chunk*)
@ 0x66191d8 starrocks::SegmentIteratorWrapper::do_get_next(starrocks::Chunk*)
@ 0x7043473 starrocks::TimedChunkIterator::do_get_next(starrocks::Chunk*)
@ 0x7046e8f starrocks::UnionIterator::do_get_next(starrocks::Chunk*)
@ 0x6d298bf starrocks::TabletReader::do_get_next(starrocks::Chunk*)
@ 0x783b178 starrocks::pipeline::OlapChunkSource::_read_chunk_from_storage(starrocks::RuntimeState*, starrocks::Chunk*)
@ 0x783b95f starrocks::pipeline::OlapChunkSource::_read_chunk(starrocks::RuntimeState*, std::shared_ptr<starrocks::Chunk>*)
@ 0x76b6bdf starrocks::pipeline::ChunkSource::buffer_next_batch_chunks_blocking(starrocks::RuntimeState*, unsigned long, starrocks::workgroup::WorkGroup const*)
@ 0x541ae8f auto starrocks::pipeline::ScanOperator::_trigger_next_scan(starrocks::RuntimeState*, int)::{lambda(auto:1&)#1}::operator()<starrocks::workgroup::YieldContext>(starrocks::workgroup::YieldContext&) const [clone .constprop.0]
@ 0x53f899e starrocks::workgroup::ScanExecutor::worker_thread()
@ 0x8bced83 starrocks::ThreadPool::dispatch_thread()
@ 0x8bc63d9 starrocks::thread::supervise_thread(void*)
@ 0x148af624cac3 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x94ac2)
@ 0x148af62de8c0 (/usr/lib/x86_64-linux-gnu/libc.so.6+0x1268bf)
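
The tracker lines and the abort timestamp in the log above are easier to read once converted to GiB and to a calendar time. The following is a minimal sketch, not a StarRocks tool: the `LOG` sample is a handful of lines copied from this post, and the helper names `parse_trackers` and `abort_time_utc` are made up for illustration. It assumes trackers may be nested (for example the metadata sub-trackers), so each value is shown as a share of `process` rather than summed.

```python
import re
from datetime import datetime, timezone

# A few lines copied from the log in this post (not the full dump).
LOG = """\
tracker:process consumption: 32455065192
tracker:query_pool consumption: 3041808
tracker:metadata consumption: 2409116972
tracker:page_cache consumption: 21970599536
tracker:update consumption: 4007163574
*** Aborted at 1763450308 (unix time) try "date -d @1763450308" if you are using GNU date ***
"""

TRACKER_RE = re.compile(r"^tracker:(\S+) consumption: (\d+)$")
ABORT_RE = re.compile(r"Aborted at (\d+) \(unix time\)")


def parse_trackers(text):
    """Return {tracker_name: bytes} for every 'tracker:' line in text."""
    result = {}
    for line in text.splitlines():
        match = TRACKER_RE.match(line.strip())
        if match:
            result[match.group(1)] = int(match.group(2))
    return result


def abort_time_utc(text):
    """Return the glog abort time as an aware UTC datetime, or None."""
    match = ABORT_RE.search(text)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


trackers = parse_trackers(LOG)
total = trackers["process"]

# Largest consumers first; the percentage is relative to 'process'.
for name, used in sorted(trackers.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {used / 2**30:8.2f} GiB {100 * used / total:6.1f}%")

print("aborted at", abort_time_utc(LOG).isoformat())
```

On the values quoted in this post, `page_cache` accounts for roughly two thirds of `process`, while `query_pool` is only about 3 MB, which fits a crash inside the function rather than the query itself exhausting memory.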

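Your BE was crashed directly by the ngram_search_case_insensitive() function.

Specifically, an out-of-bounds memory access in the StringFunctions ngram code caused a segmentation fault (SIGSEGV).

This is not a SQL tuning problem; it is a known bug in StarRocks 3.3.16.

Upgrading to 3.3.17 is enough; the bug is already fixed there.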