Version: 2.5.11
5 FE, 11 BE
Exporting data with DataX always fails after exactly 1 hour with: transmit chunk rpc failed:1d35ec94-933f-11ee-b3c4-28e424bb0239
The following parameters were added on every BE:
brpc_socket_max_unwritten_bytes=10737418240
thrift_rpc_timeout_ms=10000
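For reference, a sketch of how such BE overrides are typically applied. The conf path and HTTP port (8040) are assumptions based on a default deployment; adjust them for your cluster:

```shell
# Option 1: persist in be.conf, then restart the BE
# (path is an assumption; use your actual BE install directory)
cat >> /opt/starrocks/be/conf/be.conf <<'EOF'
brpc_socket_max_unwritten_bytes = 10737418240
thrift_rpc_timeout_ms = 10000
EOF

# Option 2: change at runtime via the BE HTTP API
# (takes effect immediately but is not persisted across restarts)
curl -XPOST 'http://be_host:8040/api/update_config?brpc_socket_max_unwritten_bytes=10737418240'
curl -XPOST 'http://be_host:8040/api/update_config?thrift_rpc_timeout_ms=10000'
```

Note that runtime changes made through `/api/update_config` are lost on restart, so a value you want to keep should also be written to `be.conf`.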
We then requested another host in the same subnet and ran the same DataX job: a 10 GB export completed with no errors. The difference is that this host, being in the same subnet, has a much higher transfer rate and finished the export within about 85 seconds:
2023-12-05 16:38:27.955 [job-0] INFO StandAloneJobContainerCommunicator - Total 23347520 records, 10644187238 bytes | Speed 24.66MB/s, 56281 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 31.217s | All Task WaitReaderTime 85.333s | Percentage 0.00%
2023-12-05 16:38:33.408 [0-0-0-reader] INFO CommonRdbmsReader$Task - Finished read record by Sql: [select * from sales.sales act where act.visit_date BETWEEN '2023-06-01' and '2023-12-04';
] jdbcUrl:[jdbc:mysql://10.204.128.68:6033/sales?yearIsDateType=false&zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&rewriteBatchedStatements=true].
2023-12-05 16:38:35.359 [0-0-0-writer] INFO TxtFileWriter$Task - end do write
2023-12-05 16:38:35.436 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] taskId[0] is successed, used[437635]ms
2023-12-05 16:38:35.437 [taskGroup-0] INFO TaskGroupContainer - taskGroup[0] completed it's tasks.
2023-12-05 16:38:37.958 [job-0] INFO StandAloneJobContainerCommunicator - Total 24236840 records, 11050266238 bytes | Speed 38.73MB/s, 88932 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 31.318s | All Task WaitReaderTime 88.078s | Percentage 100.00%
2023-12-05 16:38:37.958 [job-0] INFO AbstractScheduler - Scheduler accomplished all tasks.
2023-12-05 16:38:37.959 [job-0] INFO JobContainer - DataX Writer.Job [txtfilewriter] do post work.
2023-12-05 16:38:37.960 [job-0] INFO JobContainer - DataX Reader.Job [mysqlreader] do post work.
2023-12-05 16:38:37.960 [job-0] INFO JobContainer - DataX jobId [0] completed successfully.
2023-12-05 16:38:37.963 [job-0] INFO HookInvoker - No hook invoked, because base dir not exists or is a file: /data/datax/datax/hook
2023-12-05 16:38:37.965 [job-0] INFO JobContainer -
[total cpu info] =>
averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00%
[total gc info] =>
NAME | totalGCCount | maxDeltaGCCount | minDeltaGCCount | totalGCTime | maxDeltaGCTime | minDeltaGCTime
PS MarkSweep | 0 | 0 | 0 | 0.000s | 0.000s | 0.000s
PS Scavenge | 2504 | 1733 | 771 | 9.524s | 6.814s | 2.710s
2023-12-05 16:38:37.965 [job-0] INFO JobContainer - PerfTrace not enable!
2023-12-05 16:38:37.966 [job-0] INFO StandAloneJobContainerCommunicator - Total 24236840 records, 11050266238 bytes | Speed 23.95MB/s, 55083 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 31.318s | All Task WaitReaderTime 88.078s | Percentage 100.00%
2023-12-05 16:38:37.968 [job-0] INFO JobContainer -
Job start time       : 2023-12-05 16:31:17
Job end time         : 2023-12-05 16:38:37
Total elapsed time   : 440s
Average throughput   : 23.95MB/s
Record write speed   : 55083rec/s
Total records read   : 24236840
Total read/write failures : 0
Have you tried increasing brpc_socket_max_unwritten_bytes further?
brpc_socket_max_unwritten_bytes=10737418240 — it is already raised to 10 GB.
This issue was reported by the BI team: because their external-network download speed is slow, the job always disconnects at the 1-hour mark.
At first we assumed it was a network problem, but after requesting a host in the same subnet and running a rate-limited DataX export, we found it also exits at 1 hour.
So it doesn't seem to be this parameter: exports larger than 10 GB succeed as long as they finish within 1 hour.
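For context, the rate-limited DataX run described above would normally be configured through DataX's `speed` settings in the job JSON. A minimal sketch (the values are illustrative, not the ones used in this report); `byte` caps global throughput in bytes per second, which is what forces a large export past the 1-hour mark:

```json
{
  "job": {
    "setting": {
      "speed": {
        "byte": 1048576,
        "channel": 1
      }
    }
  }
}
```

With a 1 MB/s cap as above, a 10 GB export would need roughly 3 hours, well beyond the observed 1-hour cutoff.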
Hi — so the goal is for the rate-limited job to run to completion without a timeout error, correct?
Yes. Because of network rate limits, some users pulling data from StarRocks need more than 1 hour; the problem is that the download can never finish — it always disconnects at the 1-hour mark.
OK, let me confirm whether this is the task-level timeout or a DataX timeout configuration.
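One thing worth checking during this kind of diagnosis (an assumption about the general mechanism, not a confirmed root cause for this thread): StarRocks enforces session/global timeout variables on running queries, which can be inspected and raised from any SQL client:

```sql
-- Inspect the current timeout-related session variables
SHOW VARIABLES LIKE '%timeout%';

-- Raise the query timeout to 2 hours (value is in seconds)
-- for all new sessions
SET GLOBAL query_timeout = 7200;
```

`SET GLOBAL` affects new sessions only; an already-connected client would need `SET query_timeout = 7200;` in its own session.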
The same behavior shows up in FineReport (帆软) as well: it disconnects at 1 hour.
3.1.0 also has this problem.
2.5.5 disconnects at 88 minutes.
The latest 3.1 release has fixed it.
Is that 3.1.5?
Yes, version 3.1.5.
OK, I'll try it in a test environment first.
The fix has not been merged into the 2.5 branch yet; if convenient, we recommend upgrading to the latest 3.1.* version.
Are there plans to backport it later?