After a StarRocks FE node crashed, its memory usage kept climbing and could not be brought down by manual intervention.

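[Details] Detailed problem description (3 machines: starrocks1, starrocks2, starrocks3; co-located deployment with 3 FE and 3 BE, 1 BE and 1 FE per machine)
After the StarRocks FE node crashed and was restarted, memory usage stayed too high for about 4 hours (82% memory usage, alerting the whole time). We had no way to intervene manually or to speed up recovery. In the end, with help from community members, we reduced the FE JVM memory (30G -> 20G) and manually restarted all FE and BE nodes, which resolved the high memory usage. As of the afternoon of the next day, the FE node on starrocks2 was still replaying data. In addition, jps showed some strange processes.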
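To put rough numbers on the memory pressure described above, here is a minimal arithmetic sketch. It uses only figures from this post (64 GB per machine from the machine info, 82% usage, FE JVM memory of 30G and then 20G). The function name `memory_budget` is made up for illustration, and treating the JVM setting as the FE's entire footprint is a simplifying assumption, since a JVM process also uses memory outside the heap.

```python
def memory_budget(total_gb, used_fraction, fe_heap_gb):
    """Rough per-node memory arithmetic for a co-located FE + BE machine.

    Assumption: the FE JVM setting is treated as the FE's whole footprint.
    A real JVM process also uses off-heap memory, so this is a lower bound.
    """
    used_gb = total_gb * used_fraction
    return {
        "used_gb": round(used_gb, 1),
        "free_gb": round(total_gb - used_gb, 1),
        # What remains for the BE process and the OS if the FE fills its heap.
        "left_for_be_and_os_gb": round(total_gb - fe_heap_gb, 1),
    }


if __name__ == "__main__":
    for heap in (30, 20):
        print(heap, memory_budget(64, 0.82, heap))
```

With a 30G setting the FE alone can claim almost half of a 64 GB machine that also hosts a BE; at 20G the BE and OS get roughly 10 GB more headroom.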


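We are sure StarRocks is the cause, because our data warehouse is newly launched and only DolphinScheduler and StarRocks are running on it. Data replay is still in progress, but for now there are no high-memory alerts.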

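[Background] What operations were performed?
Around 13:50 yesterday we ran insert overwrite dwd.xxx select xxx from ods.xxx, loading into a large table with on the order of 100 million rows (backfilled day by day through DolphinScheduler).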
select
id,code,order_type,sub_type,order_source,settle_on book_date,trade_date, trade_on, trade_hour,
org_code,cashier_code,cashier_name,device_code,device_name,counterman_code,
COALESCE(customer_type,0) customer_type,customer_code,customer_orgcode,delivery_type,COALESCE(delivery_order_source,0) delivery_order_source,

1 total_sale_order_cnt,
(case when order_type = 2 then -1 else 1 end) * total_amt total_sale_amt,
(case when order_type = 2 then -1 else 1 end) * spillover_amt total_spillover_amt,
(case when order_type = 2 then -1 else 1 end) * fact_amt total_fact_amt,
(case when order_type = 2 then -1 else 1 end) * freight_amt total_freight_amt,
(case when order_type = 2 then -1 else 1 end) * goods_amt goods_amt,
(case when order_type = 2 then -1 else 1 end) * vip_amt vip_amt,
(case when order_type = 2 then -1 else 1 end) * promotion_amt promotion_amt,
(case when order_type = 2 then -1 else 1 end) * discount_amt discount_amt,
(case when order_type = 2 then -1 else 1 end) * round_amt round_amt,
(case when order_type = 2 then -1 else 1 end) * diff_amt diff_amt,
(case when order_type = 2 then -1 else 1 end) * gmv_share_amt gmv_share_amt,
(case when order_type = 2 then -1 else 1 end) * pocket_amt pocket_amt,
(case when order_type = 2 then -1 else 1 end) * poi_fact_amt poi_fact_amt,
(case when order_type = 2 then -1 else 1 end) * platform_charge1 platform_charge1,
(case when order_type = 2 then -1 else 1 end) * platform_charge2 platform_charge2,
(case when order_type = 2 then -1 else 1 end) * poi_promotion_amt poi_promotion_amt,
(case when order_type = 2 then -1 else 1 end) * platform_promotion_amt platform_promotion_amt,
detail_count,detail_count2,case when detail_count2 > 0 then 1 else 0 end total_sale_order_cnt2,

(case when order_type = 2 then 0 else 1 end) sale_order_cnt,
(case when order_type = 2 then 0 else 1 end) * total_amt sale_order_amt,
(case when order_type = 2 then 0 else 1 end) * spillover_amt sale_spillover_amt,
(case when order_type = 2 then 0 else 1 end) * fact_amt sale_fact_amt,

(case when order_type = 2 then 1 else 0 end) return_order_cnt,
(case when order_type = 2 then 1 else 0 end) * total_amt return_order_amt,
(case when order_type = 2 then 1 else 0 end) * spillover_amt return_spillover_amt,
(case when order_type = 2 then 1 else 0 end) * fact_amt return_fact_amt,

(case when sub_type in (0,1,6,7,10) then 1 else 0 end) retail_order_cnt,
(case when sub_type in (0,1,6,7,10) then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * total_amt retail_order_amt,
(case when sub_type in (0,1,6,7,10) then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * spillover_amt retail_spillover_amt,
(case when sub_type in (0,1,6,7,10) then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * fact_amt retail_fact_amt,
(case when sub_type in (0,1,6,7,10) then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * freight_amt retail_freight_amt,

(case when sub_type in (2,8) then 1 else 0 end) wholesale_order_cnt,
(case when sub_type in (2,8) then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * total_amt wholesale_order_amt,
(case when sub_type in (2,8) then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * spillover_amt wholesale_spillover_amt,
(case when sub_type in (2,8) then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * fact_amt wholesale_fact_amt,
(case when sub_type in (2,8) then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * freight_amt wholesale_freight_amt,

(case when customer_type = 1 then 1 else 0 end) total_member_sale_order_cnt,
(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * total_amt total_member_sale_order_amt,
(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * spillover_amt total_member_sale_spillover_amt,
(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then -1 else 1 end) * fact_amt total_member_sale_fact_amt,

(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then 0 else 1 end) member_sale_order_cnt,
(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then 0 else 1 end) * total_amt member_sale_order_amt,
(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then 0 else 1 end) * spillover_amt member_sale_spillover_amt,
(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then 0 else 1 end) * fact_amt member_sale_fact_amt,

(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then 1 else 0 end) member_return_order_cnt,
(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then 1 else 0 end) * total_amt member_return_order_amt,
(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then 1 else 0 end) * spillover_amt member_return_spillover_amt,
(case when customer_type = 1 then 1 else 0 end) * (case when order_type = 2 then 1 else 0 end) * fact_amt member_return_fact_amt

from xs_gmv_order
where status = 80
and order_type in (1,2,5) and sub_type in (0,1,2,6,7,8,10)
and settle_on = '2024-7-29'

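At 14:20 the FE node was reported as down. The FE auto-restart script brought the FE back up, and from that point the FE log on the starrocks2 machine showed continuous data replay.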

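[Business impact] From 2 p.m. yesterday (August 8) until 18:00, the starrocks2 node had high memory and high CPU usage. On the starrocks2 FE node, the garbage collection threads were taking the vast majority of resources (100% CPU, more than 20 GB of JVM memory). The alert group kept reporting starrocks2 node memory > 80%. The starrocks2 node is still replaying data, but backfill and load jobs can run.
[Storage-compute separation?] No, this is a co-located cluster without storage-compute separation.
[StarRocks version] 3.1-stable
[Cluster size] 3 FE + 3 BE
[Machine info] CPU vCores/memory/NIC: (16 cores, 64 GB) * 3
[Table model] Primary Key table
[Import or export method] Insert from a StarRocks ods internal table into a dwd internal table.
[Contact] So that we can reach you promptly for log information while working on the issue, please add your contact details: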
WX:18770170039
[Attachments]

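"replayed journal" means the FE is replaying metadata, which normally happens after an FE restart. A long replay may be because no checkpoint has been done for a long time. Once the replay finishes and the FE is up, check the size and number of the image and bdb files under the meta directory.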
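A minimal sketch of that check, in case it helps: it counts the files and sums their sizes under the image and bdb subdirectories of the FE meta directory. The default path `/opt/starrocks/fe/meta` is only a placeholder assumption; use the actual meta_dir from your fe.conf. The helper names `summarize_dir` and `report` are made up for this example.

```python
from pathlib import Path


def summarize_dir(path):
    """Return (file_count, total_bytes) for all regular files under path."""
    root = Path(path)
    if not root.is_dir():
        return 0, 0
    sizes = [p.stat().st_size for p in root.rglob("*") if p.is_file()]
    return len(sizes), sum(sizes)


def report(meta_dir):
    """Summarize the image and bdb subdirectories of an FE meta directory."""
    result = {}
    for sub in ("image", "bdb"):
        count, total = summarize_dir(Path(meta_dir) / sub)
        result[sub] = {"files": count, "mib": round(total / 2**20, 1)}
    return result


if __name__ == "__main__":
    import sys

    # Placeholder default (an assumption); pass the real meta_dir as argv[1].
    meta_dir = sys.argv[1] if len(sys.argv) > 1 else "/opt/starrocks/fe/meta"
    for name, info in report(meta_dir).items():
        print(f"{name}: {info['files']} files, {info['mib']} MiB")
```

Running it on each FE node makes it easy to compare the nodes: a bdb directory holding many large files next to an old image would be consistent with a checkpoint that has not run for a while.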


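Hello, the image and bdb on the Leader take up less than 1 GB. Our cluster is new, with fewer than 20 tables and about 500 GB of data, so it is unlikely that too much metadata caused the problem. Also, is there a command that can speed up this replay? The continuous replay keeps one node at high memory and high CPU usage (the StarRocks FE garbage collection threads), which triggers constant memory and CPU alerts on the cloud server and makes it look like something is wrong.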