Stream Load reports "Cancelled" error

[Details] Loading data via Stream Load: 168 GB of data from a local CSV file. Observed repeatedly that the job terminates when "WriteDataTimeMs" reaches about 600 s, with status "Fail" and reason "Cancelled".
Steps: run
nohup curl --location-trusted -u root: -H "label:orders1t" -H "column_separator:|" -T /home/starrocks/2.18.0_rc2/dbgen/1tdata/data/orders.tbl http://192.168.4.24:8040/api/tpch/orders/_stream_load &
Result:
{
"TxnId": 26,
"Label": "orders1t",
"Status": "Fail",
"Message": "cancelled",
"NumberTotalRows": 0,
"NumberLoadedRows": 0,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 62076297216,
"LoadTimeMs": 704684,
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 2,
"ReadDataTimeMs": 528790,
"WriteDataTimeMs": 601733,
"CommitAndPublishTimeMs": 0
}
Parameters appended to be.conf:
push_write_mbytes_per_sec = 30
write_buffer_size = 1000
tablet_writer_rpc_timeout_sec = 6000
streaming_load_max_mb = 819200
stream_load_default_timeout_second = 81920
cumulative_compaction_num_threads_per_disk = 4
base_compaction_num_threads_per_disk = 2
cumulative_compaction_check_interval_seconds = 2

[Load method] Stream Load
[StarRocks version] StarRocks-1.19.1
[Cluster size] 1 FE + 1 BE + 1 Broker on a single machine
[Machine info] 104 cores / 376 GB RAM / 10 GbE network
[Attachments]


You can specify a timeout for the job; see the official documentation for details: https://docs.starrocks.com/zh-cn/main/loading/StreamLoad#%E5%88%9B%E5%BB%BA%E5%AF%BC%E5%85%A5%E4%BB%BB%E5%8A%A1

I have already changed the relevant parameters according to that document. Are there any other parameters that need to be set?

Not for now. Just specify a timeout when loading; the earlier failure was most likely hitting the default 600 s timeout.
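For example, a minimal sketch reusing the command from above, with an illustrative 7200 s value passed as the Stream Load timeout header and a fresh label (labels cannot be reused). As far as I can tell, stream_load_default_timeout_second is an FE-side configuration item, so setting it in be.conf would have had no effect, which would explain why the 600 s default still applied:

# same command as before, plus an explicit per-job timeout in seconds
curl --location-trusted -u root: -H "label:orders1t_retry" -H "timeout:7200" -H "column_separator:|" -T /home/starrocks/2.18.0_rc2/dbgen/1tdata/data/orders.tbl http://192.168.4.24:8040/api/tpch/orders/_stream_load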

I split the file. Loading the first part succeeded, but while loading the second part the BE went down, which is baffling.
Details:
[root@host24 data]# curl --location-trusted -u root: -H "label:orders1t_xaa2" -H "column_separator:|" -T /home/starrocks/2.18.0_rc2/dbgen/1tdata/data/xaa http://192.168.4.24:8040/api/tpch/orders/_stream_load
{
"TxnId": 34,
"Label": "orders1t_xaa2",
"Status": "Publish Timeout",
"Message": "transaction commit successfully, BUT data will be visible later",
"NumberTotalRows": 200000000,
"NumberLoadedRows": 200000000,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 23733258884,
"LoadTimeMs": 419270,
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 4,
"ReadDataTimeMs": 205817,
"WriteDataTimeMs": 411762,
"CommitAndPublishTimeMs": 0
}
[root@host24 data]# curl --location-trusted -u root: -H "label:orders1t_xab" -H "column_separator:|" -T /home/starrocks/2.18.0_rc2/dbgen/1tdata/data/xab http://192.168.4.24:8040/api/tpch/orders/_stream_load
{
"TxnId": 35,
"Label": "orders1t_xab",
"Status": "Fail",
"Message": "coordinate BE is down",
"NumberTotalRows": 200000000,
"NumberLoadedRows": 200000000,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 23911145053,
"LoadTimeMs": 472217,
"BeginTxnTimeMs": 1,
"StreamLoadPutTimeMs": 3,
"ReadDataTimeMs": 250749,
"WriteDataTimeMs": 472212,
"CommitAndPublishTimeMs": 0
}
be.WARNING log:


The table only contains the first part's rows:
mysql> select count(*) from orders;
+-----------+
| count(*)  |
+-----------+
| 200000000 |
+-----------+
1 row in set (0.19 sec)
Background processes:


Looking at the process status, it doesn't seem to be down at all.
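(For reference, a quick liveness check from the FE side, assuming the default query port 9030; the Alive field shows whether the FE currently considers the BE up, though field names may vary slightly by version:)

mysql -h 192.168.4.24 -P 9030 -u root -e 'SHOW BACKENDS\G' | grep -E 'IP|Alive|LastStartTime'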

Is there only one BE in this cluster right now?

Yes, a single machine running 1 FE + 1 BE + 1 Broker.

You're testing 1 TB SSB with just one BE?

Yes. From the official documentation I gathered that performance with a single BE would be relatively limited.

Try spacing the two loads out a bit. Is loading still failing now?

1. After setting timeout on the command line, the "cancelled" timeout problem is gone.
2. As for "coordinate BE is down": after waiting a while I retried the load and it succeeded.
3. But there is still one loading problem: loading the third part of orders (xac) fails, reporting many rows that do not meet the constraints. Details below:

[root@host24 data]# curl --location-trusted -u root: -H "label:orders1t_xac" -H "column_separator:|" -T /home/starrocks/2.18.0_rc2/dbgen/1tdata/data/xac http://192.168.4.24:8040/api/tpch/orders/_stream_load
{
"TxnId": 38,
"Label": "orders1t_xac",
"Status": "Fail",
"Message": "too many filtered rows",
"NumberTotalRows": 200000000,
"NumberLoadedRows": 136870911,
"NumberFilteredRows": 63129089,
"NumberUnselectedRows": 0,
"LoadBytes": 23960510156,
"LoadTimeMs": 279532,
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 1,
"ReadDataTimeMs": 205098,
"WriteDataTimeMs": 279530,
"CommitAndPublishTimeMs": 0,
"ErrorURL": "http://192.168.4.24:8040/api/_load_error_log?file=__shard_2/error_log_insert_stmt_c346b097-f93e-e282-f303-7648e83fabb7_c346b097f93ee282_f3037648e83fabb7"
}

The error says a NULL value was passed into the first column, which is NOT NULL. Checking for rows with an empty first column with awk -F "|" '$1=="" {print $0}' xac found nothing.

Checked for blank lines too: grep -v '^$' xac | wc -l also returns 200000000, matching the total row count, so there are no blank lines either. Baffling.
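For reference, the per-row rejection samples behind that message can be fetched straight from the ErrorURL returned in the load result:

curl 'http://192.168.4.24:8040/api/_load_error_log?file=__shard_2/error_log_insert_stmt_c346b097-f93e-e282-f303-7648e83fabb7_c346b097f93ee282_f3037648e83fabb7'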

More context for item 3: the orders table is 168 GB / 1,500,000,000 rows. split -l 200000000 orders.tbl produced 8 files (xaa through xah). Loading xaa and xab went fine; xac is the one that fails.

This is probably an INT overflow; try changing that column to BIGINT.
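A quick way to test that hypothesis against the file itself, assuming (as in the awk check above) that the order key is the first |-separated field; 2147483647 is the signed 32-bit INT maximum:

# count first-column values that no longer fit in a signed 32-bit INT
awk -F '|' '$1 > 2147483647 { n++ } END { print n+0 }' xac

The numbers line up with this explanation: dbgen assigns sparse, increasing order keys (roughly 4x the row number), so the first key above 2^31 - 1 falls near row 536870912 of the full file; xac starts at row 400000001, and 536870911 - 400000000 = 136870911 matches NumberLoadedRows exactly.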

Separately, with a single BE you could first run the ssb-100g test using the officially provided scripts.

For the 1 TB SSB test, refer to https://www.starrocks.com/zh-CN/blog/1.8测试

Changed the column to BIGINT and the load succeeded.
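(For the record, a sketch of what the schema change could look like, assuming the affected key column is named o_orderkey; MODIFY COLUMN runs as an asynchronous schema change, and recreating the table with BIGINT is the simpler route if the data is being reloaded anyway:)

# widen the assumed key column; progress can be checked with SHOW ALTER TABLE COLUMN
mysql -h 192.168.4.24 -P 9030 -u root -e 'ALTER TABLE tpch.orders MODIFY COLUMN o_orderkey BIGINT KEY;'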

The load succeeded, but querying then reported an error. Details below.
Load log:
{
"TxnId": 49,
"Label": "orders1t",
"Status": "Publish Timeout",
"Message": "transaction commit successfully, BUT data will be visible later",
"NumberTotalRows": 1500000000,
"NumberLoadedRows": 1500000000,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 179429739251,
"LoadTimeMs": 3097502,
"BeginTxnTimeMs": 0,
"StreamLoadPutTimeMs": 1,
"ReadDataTimeMs": 1565468,
"WriteDataTimeMs": 3089996,
"CommitAndPublishTimeMs": 0
}
be.WARNING:



The query error:

It recovered after waiting a while.

Can't rush it, gotta take it slow :thinking: