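【Details】Detailed problem description:
In production I use the flink starrocks connector to write data into StarRocks. Occasionally a StarRocks write transaction becomes very slow, which makes my Flink checkpoints take a long time to complete. I found the FE log entries for one such transaction: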
2022-12-12 11:27:03,253 INFO (nioEventLoopGroup-3-8|3801) [LoadAction.executeWithoutPassword():99] redirect load action to destination=TNetworkAddress(hostname:, port:8040), db: app, tbl: app_option_price_his, label: d29ff8d3-f090-4655-a3a0-f8e7bd31bcf3
2022-12-12 11:27:03,255 INFO (thrift-server-pool-11351|631637) [DatabaseTransactionMgr.beginTransaction():300] begin transaction: txn_id: 4581427 with label d29ff8d3-f090-4655-a3a0-f8e7bd31bcf3 from coordinator BE: , listner id: -1
2022-12-12 11:27:09,089 INFO (PUBLISH_VERSION|20) [DatabaseTransactionMgr.finishTransaction():894] finish transaction TransactionState. txn_id: 4581427, label: d29ff8d3-f090-4655-a3a0-f8e7bd31bcf3, db id: 12693, table id list: 6513952, callback id: -1, coordinator: BE: , transaction status: VISIBLE, error replicas num: 0, replica ids: , prepare time: 1670815623255, commit time: 1670815623279, finish time: 1670815629086, publish cost: 5807ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@105ab016 successfully
2022-12-12 11:27:03,283 INFO (thrift-server-pool-11285|630715) [ DatabaseTransactionMgr.commitTransaction():557 ] transaction: [ TransactionState. txn_id: 4581427, label: d29ff8d3-f090-4655-a3a0-f8e7bd31bcf3, db id: 12693, table id list: 6513952, callback id: -1, coordinator: BE: , transaction status: COMMITTED, error replicas num: 0, replica ids: , prepare time: 1670815623255, commit time: 1670815623279, finish time: -1, publish cost: -1670815623280ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@105ab016 ] successfully committed
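For this transaction, about 5.8 s passed between the coordinator receiving the transaction and the transaction finishing (the log reports publish cost: 5807ms). There is also a second case: the transaction itself completes quickly, but the coordinator is late in reporting the completion to the FE, and the FE is often notified only after a long delay.
These are the only relevant log entries I could find on the FE and BE.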
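To make it clearer where the time went, the timestamps in the finishTransaction entry above can be split into phases. The following is only a rough sketch: the function name `phase_durations_ms` and the shortened `LOG_LINE` string are made up for illustration, and it assumes the `prepare time`, `commit time` and `finish time` fields are epoch milliseconds, which matches the `publish cost` value printed in the same entry.

```python
import re

# Timing fields copied from the finishTransaction log entry quoted above.
# The rest of the entry is omitted; this shortened string is an illustration only.
LOG_LINE = (
    "txn_id: 4581427, prepare time: 1670815623255, "
    "commit time: 1670815623279, finish time: 1670815629086, "
    "publish cost: 5807ms"
)


def phase_durations_ms(line):
    """Split a transaction log entry into phase durations (milliseconds).

    Assumes the three '... time:' fields are epoch milliseconds.
    """
    pairs = re.findall(r"(prepare|commit|finish) time: (-?\d+)", line)
    ts = {name: int(value) for name, value in pairs}
    return {
        "prepare_to_commit": ts["commit"] - ts["prepare"],
        "commit_to_finish": ts["finish"] - ts["commit"],
        "total": ts["finish"] - ts["prepare"],
    }


print(phase_durations_ms(LOG_LINE))
```

For this entry the prepare-to-commit phase is 24 ms and the commit-to-finish phase is 5807 ms, so nearly all of the time falls between commit and the transaction becoming VISIBLE.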
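【Background】What operations were performed?
The StarRocks cluster currently runs only a dozen or so routine load tasks that consume data directly from Kafka; everything else is small-volume query work.
【Business impact】
Flink checkpoints are affected, and so is business use of StarRocks (data latency).
【StarRocks version】
starrocks 2.2.4
【Cluster size】
3fe + 3be
【Machine info】vCPUs/memory/NIC, e.g. 48C/64G/10GbE
fe: 4c/16G/10GbE be: 8c/32G/10GbE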