select,在数据量特别大时,如果想直接KILL这个sql,是无法kill掉,只能重启be让任务失败,任务失败以后发现,starrocks还在写入hive,导致namenode与datanode得rpc还是无法释放,只能全部把BE重启掉,这个

【详述】
select,在数据量特别大时,如果想直接KILL这个sql,是无法kill掉,只能重启be让任务失败,任务失败以后发现,starrocks还在写入hive,导致namenode与datanode得rpc还是无法释放,只能全部把BE重启掉,这个
【背景】做过哪些操作?
【业务影响】
【是否存算分离】 否
【StarRocks版本】例如:3.3
【集群规模】3fe(1 follower+2observer)+32be(fe与be混部)

用的HiveCatalog吗?

是的。

SQL是什么样的

SET query_timeout = 3600;
SET resource_group = ‘default_wg’;
SET enable_pipeline_engine = true;
SET pipeline_dop = 8;

insert overwrite hive.temp.operate_waybill_record_op_track_dd_01 partition(dt,op_code)
select
a.waybill_no,
a.create_time,
a.org_code,
a.org_type,
a.volume_weight,
a.weigh_weight,
a.input_weight,
a.pkg_length,
a.pkg_width,
a.pkg_height,
a.source_org_code,
a.des_org_code,
a.previous_org_code,
a.next_org_code,
a.container_no,
a.truck_no,
a.pkg_no,
a.create_user_code,
a.create_user_name,
a.emp_code,
a.emp_name,
a.create_terminal,
a.create_org_code,
a.device_type,
a.aux_route_code,
a.io_type,
a.cmp_flag,
a.bill_source_org_code,
a.eco_bag,
a.etl_extract_time,
a.waybill_flag,
a.is_envelope,
a.modify_user_code,
a.modify_user_name,
a.weight,
a.upload_time,
a.effective_type_code,
a.auto_device_no,
a.lattice_no,
a.latticethree_code,
a.pack_type,
a.op_station,
a.in_out_flag,
a.client_mac,
COALESCE(c.customer_code,b.customer_code) customer_code,
COALESCE(c.customer_name,b.customer_name) customer_name,
b.order_channel_code,
COALESCE(c.mat_org_code,b.mat_org_code) mat_org_code,
COALESCE(c.datoubi,b.datoubi) datoubi,
COALESCE(c.get_waybill_time,b.get_waybill_time) get_waybill_time,
COALESCE(c.business_id,b.business_id) business_id,
COALESCE(c.end_org_code,b.end_org_code) end_org_code,
b.taking_org_code,
b.create_time as take_earlist_time,
b.des_org_code as taking_des_org_code,
b.weigh_weight as taking_weigh_weight,
b.taking_emp_code,
b.taking_emp_name,
b.taking_create_time,
b.is_scatter,
b.is_yzd,
b.is_cod,
b.is_hk,
b.is_return,
b.is_issue,
COALESCE(c.is_vip,b.is_vip) is_vip,
COALESCE(c.platform_vip,b.platform_vip) platform_vip,
b.source_org_code as taking_source_org_code,
b.pick_org_code,
c.target_org_code,
b.pkg_length taking_pkg_length,
b.pkg_width taking_pkg_width,
b.pkg_height taking_pkg_height,
a.dt,
a.op_code
from ( select *
from hive.dwd.operate_waybill_record_dd --7天80亿
where dt >= DATE_FORMAT(DATE_SUB(STR_TO_DATE(‘20251010’, ‘%Y%m%d’), INTERVAL 6 DAY),’%Y%m%d’)
and dt <= DATE_FORMAT(DATE_SUB(STR_TO_DATE(‘20251010’, ‘%Y%m%d’), INTERVAL 0 DAY),’%Y%m%d’)
) a
left join ( select waybill_no,
customer_code,
customer_name,
order_channel_code,
mat_org_code,
datoubi,
get_waybill_time,
business_id,
end_org_code,
taking_org_code,
create_time ,
des_org_code ,
weigh_weight,
taking_emp_code,
taking_emp_name,
taking_create_time,
is_scatter,
is_yzd,
is_cod,
is_hk,
is_return,
is_issue,
is_vip,
platform_vip,
is_city_waybill,
source_org_code,
pick_org_code,
pkg_length,
pkg_width,
pkg_height,
row_number() over(partition by waybill_no order by create_time) rn
from hive.dwd.waybill_take_waybill_info_dd --37天30亿
where dt >= DATE_FORMAT(DATE_SUB(STR_TO_DATE(‘20251010’, ‘%Y%m%d’), INTERVAL 37 DAY),’%Y%m%d’)
and dt <= ‘20251010’
) b
on a.waybill_no = b.waybill_no
and b.rn = 1
left join ( select waybill_no,
company_code end_org_code,
target_org_code,
branch_code mat_org_code,
seller_id business_id,
curtime get_waybill_time,
da_tou_bi datoubi,
customer_code,
customer_name,
case when is_vip = ‘1’ then ‘Y’ else ‘N’ end is_vip,
platform_vip,
is_city_waybill,
–is_m_waybill,
row_number() over(partition by waybill_no order by curtime) rn
from hive.dwd.waybill_waybillcenter_d --37天30亿
where dt >= DATE_FORMAT(DATE_SUB(STR_TO_DATE(‘20251010’, ‘%Y%m%d’), INTERVAL 37 DAY),’%Y%m%d’)
and dt <= ‘20251010’
) c
on a.waybill_no = c.waybill_no
and c.rn = 1
;
我们使用了资源队列隔离,资源使用整体资源的50%。

由于我们用的是hive的外部表,所以读取和写入的表都是hive表。只是在starorcks上做计算。