Broker Load stuck for an hour

[SR version] 2.0.5
[SR machines] 3 BEs, in a different data center from the Hadoop cluster; the networks are interconnected and the clocks of the two clusters are in sync
[Load statement]
load label wn.agg_mis_r_sales_a_achment_lep_prem_index_m_202112
(
data infile("hdfs://hdfs01-shyp-sx-stg/user/hive/warehouse/sx_adm_safe.db/agg_mis_r_sales_a_achment_lep_prem_index_m/month=202112/data_from=PREM/margin_version=CUR/*")
into table agg_mis_r_sales_a_achment_lep_prem_index_m
COLUMNS TERMINATED BY "\x01"
(dozens of Hive table columns, omitted)
)
WITH BROKER 'hdfs_broker'
(
"username" = "hadoop",
"password" = "Bigdata123$"
)
[Symptom] Every Broker Load gets stuck at LOAD:0% for nearly an hour:
| 12051 | agg_mis_r_sales_a_achment_lep_prem_index_m_202112 | LOADING | ETL:100%; LOAD:0% | BROKER | NULL | cluster:N/A; timeout(s):14400; max_filter_ratio:0.0 | NULL | 2022-06-10 16:36:11 | 2022-06-10 16:36:12 | 2022-06-10 16:36:12 | 2022-06-10 16:36:12 | NULL | NULL | {"Unfinished backends":{"7c40a024-1888-4b67-a4b8-c55e48bc7757":[10005]},"ScannedRows":1055374,"TaskNumber":1,"All backends":{"7c40a024-1888-4b67-a4b8-c55e48bc7757":[10005,10006,10004]},"FileNumber":1,"FileSize":6064849574}

After about an hour it fails, complaining that the Hive data format is wrong:
| 12051 | agg_mis_r_sales_a_achment_lep_prem_index_m_202112 | CANCELLED | ETL:N/A; LOAD:N/A | BROKER | unselected.rows=0; dpp.abnorm.ALL=1573733; dpp.norm.ALL=0 | cluster:N/A; timeout(s):14400; max_filter_ratio:0.0 | type:ETL_QUALITY_UNSATISFIED; msg:quality not good enough to cancel | 2022-06-10 16:36:11 | 2022-06-10 16:36:12 | 2022-06-10 16:36:12 | 2022-06-10 16:36:12 | 2022-06-10 17:26:23 | http://30.104.226.30:8040/api/_load_error_log?file=__shard_3/error_log_insert_stmt_7c40a024-1888-4b67-a4b8-c55e48bc7758_7c40a02418884b67_a4b8c55e48bc7758 | {"Unfinished backends":{"7c40a024-1888-4b67-a4b8-c55e48bc7757":[]},"ScannedRows":1573733,"TaskNumber":1,"All backends":{"7c40a024-1888-4b67-a4b8-c55e48bc7757":[10005,10006,10004]},"FileNumber":1,"FileSize":6064849574}

Also, right after the job starts, SHOW LOAD lists three backends under "Unfinished backends", but a few seconds later it drops to just one. Monitoring likewise shows CPU load on only one BE.
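For anyone reproducing this, the status rows above come from SHOW LOAD; a minimal way to watch just this one job, filtering on the database and label from this thread:

-- watch the Progress and "Unfinished backends" fields for this job only
SHOW LOAD FROM wn WHERE LABEL = "agg_mis_r_sales_a_achment_lep_prem_index_m_202112";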

Hi, did you deploy only a single Broker node? You can curl the error URL to see the detailed exception.

A Broker is deployed on every FE node. curl shows: Reason: column count mismatch, expect=3 real=10. src line: [RCF...hive.io.rcfile.column.number...(binary bytes)]

Hi, this error means there is dirty data or the column separator is wrong. Does the target table have 3 columns?

If there is dirty data, you can set max_filter_ratio=0.x in the load's PROPERTIES to tolerate that fraction of bad rows.

I set "max_filter_ratio" = "1". As for the separator: the Hive table's storage format is RCFile with the default delimiter, so in SR I configured COLUMNS TERMINATED BY "\x01".

For ORC data, specify the import format in the load statement, since the default is csv (note the statement below also escapes the delimiter as "\\x01"):

load label wn.agg_mis_r_sales_a_achment_lep_prem_index_m_202112
(
data infile("hdfs://hdfs01-shyp-sx-stg/user/hive/warehouse/sx_adm_safe.db/agg_mis_r_sales_a_achment_lep_prem_index_m/month=202112/data_from=PREM/margin_version=CUR/*")
into table agg_mis_r_sales_a_achment_lep_prem_index_m
COLUMNS TERMINATED BY "\\x01"
FORMAT AS "orc"
(dozens of Hive table columns, omitted)
)
WITH BROKER 'hdfs_broker'
(
"username" = "hadoop",
"password" = "Bigdata123$"
)

type:LOAD_RUN_FAIL; msg:OrcScannerAdapter::init failed. reason = Not an ORC file

Hive stores this table as RCFile, not ORC.
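As a side note, the on-disk format of a Hive table can be confirmed from Hive itself before choosing FORMAT AS; a quick check against the table in this thread:

-- In Hive: the InputFormat row reveals the storage format
-- (RCFile shows RCFileInputFormat, ORC shows OrcInputFormat)
DESCRIBE FORMATTED sx_adm_safe.agg_mis_r_sales_a_achment_lep_prem_index_m;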

Please confirm that the number of columns in the table matches the number of columns in the HDFS files. If they match, try the SQL below:

load label wn.agg_mis_r_sales_a_achment_lep_prem_index_m_202112
(
data infile("hdfs://hdfs01-shyp-sx-stg/user/hive/warehouse/sx_adm_safe.db/agg_mis_r_sales_a_achment_lep_prem_index_m/month=202112/data_from=PREM/margin_version=CUR/*")
into table agg_mis_r_sales_a_achment_lep_prem_index_m
COLUMNS TERMINATED BY "\\x01"
(dozens of Hive table columns, omitted)
)
WITH BROKER 'hdfs_broker'
(
    "username" = "hadoop",
    "password" = "Bigdata123$"
)
PROPERTIES
(
    "timeout" = "3600",
    "max_filter_ratio" = "0.1"
);

The SR table definitely has the same number of columns as the Hive table. But the Hive table's files in HDFS are RCFile, a columnar format; rows are not laid out on disk as field1 field2 ... fieldN.

I created another table stored as ORCFile and loaded from that instead. Now the error is type:LOAD_RUN_FAIL; msg:OrcScannerAdapter::init_include_columns. col name = month not found. month is a partition column of the Hive table; how should this error be resolved?

The SQL statement I used:
load label wn.test_202201
(
data infile("hdfs://hdfs01-shyp-sx-stg/user/hive/warehouse/wn.db/test3/month=202201/000000_0")
into table test
COLUMNS TERMINATED BY "\x01"
FORMAT AS "orc"
(name,id,month)
set
(log=name,
id=id,
month=month)
)
WITH BROKER 'hdfs_broker'
(
"username" = "hadoop",
"password" = "Bigdata123$"
)
PROPERTIES
(
"max_filter_ratio" = "1"
)

Try this SQL instead. The partition column month exists only in the HDFS directory path, not inside the ORC files themselves, so it has to be picked up with COLUMNS FROM PATH AS:

load label wn.test_202201
(
data infile("hdfs://hdfs01-shyp-sx-stg/user/hive/warehouse/wn.db/test3/month=*/*")
into table test
COLUMNS TERMINATED BY "\x01"
FORMAT AS "orc"
(name,id,month)
COLUMNS FROM PATH AS (month)
set
(log=name,
id=id,
month=month)
)
WITH BROKER 'hdfs_broker'
(
"username" = "hadoop",
"password" = "Bigdata123$"
)
PROPERTIES
(
"max_filter_ratio" = "1"
)

Hi, the problem is solved; it turned out to be the Hive storage format. After converting the data to ORC, the SQL above succeeds. Thanks!
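For completeness, a minimal HiveQL sketch of such a conversion, modeled on the small wn.test3 table from this thread; the source table name test_rcfile and the STRING column types are assumptions, since the thread doesn't show the actual statement:

-- Hive: create an ORC-backed table with the same schema and partition layout
-- (column types assumed; test_rcfile is a hypothetical RCFile source table)
CREATE TABLE wn.test3 (name STRING, id STRING)
PARTITIONED BY (month STRING)
STORED AS ORC;

-- copy the data over with dynamic partitioning
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE wn.test3 PARTITION (month)
SELECT name, id, month FROM wn.test_rcfile;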
