【详述】flink实时写入SR表报错
【背景】滚动重启BE服务后Flink任务写入失败
【业务影响】影响业务
【是否存算分离】集群存算一体
【StarRocks版本】3.2.16
【集群规模】例如:3fe(1 follower+2observer)+ 8BE
【机器信息】96C + 512G + 8T
【附件】
2025-08-19 19:55:56
java.lang.RuntimeException: Failed to notify checkpoint complete for checkpoint id 9223372036854775807
at com.starrocks.connector.flink.table.sink.StarRocksDynamicSinkFunctionV2.notifyCheckpointComplete(StarRocksDynamicSinkFunctionV2.java:367)
at com.starrocks.connector.flink.table.sink.StarRocksDynamicSinkFunctionV2.openForExactlyOnce(StarRocksDynamicSinkFunctionV2.java:231)
at com.starrocks.connector.flink.table.sink.StarRocksDynamicSinkFunctionV2.open(StarRocksDynamicSinkFunctionV2.java:212)
at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:34)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:101)
at org.apache.flink.streaming.api.operators.StreamSink.open(StreamSink.java:46)
at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:107)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:753)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:728)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:693)
at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953)
at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:922)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.RuntimeException: com.starrocks.data.load.stream.exception.StreamLoadFailException: Transaction commit failed, db: cdc, table: erp_order, label: flink-07bd50d0-5689-425b-ab12-41fa56dac0e5, commit response status: TXN_NOT_EXISTS, label state: UNKNOWN. commit response status with TXN_NOT_EXISTS or label state with UNKNOWN often happens when transaction timeouts, and please check StarRocks FE leader’s log to confirm it. You can find the transaction id for the label in the FE log first, and search with the transaction id and the keyword ‘expired’
at com.starrocks.data.load.stream.TransactionStreamLoader.commit(TransactionStreamLoader.java:310)
at com.starrocks.data.load.stream.DefaultStreamLoader.commit(DefaultStreamLoader.java:239)
at com.starrocks.data.load.stream.v2.StreamLoadManagerV2.commit(StreamLoadManagerV2.java:401)
at com.starrocks.connector.flink.table.sink.StarRocksDynamicSinkFunctionV2.notifyCheckpointComplete(StarRocksDynamicSinkFunctionV2.java:355)
… 15 more
Caused by: com.starrocks.data.load.stream.exception.StreamLoadFailException: Transaction commit failed, db: cdc_test, table: tms_t_failed_order, label: flink-07bd50d0-5689-425b-ab12-41fa56dac0e5, commit response status: TXN_NOT_EXISTS, label state: UNKNOWN. commit response status with TXN_NOT_EXISTS or label state with UNKNOWN often happens when transaction timeouts, and please check StarRocks FE leader’s log to confirm it. You can find the transaction id for the label in the FE log first, and search with the transaction id and the keyword ‘expired’
at com.starrocks.data.load.stream.TransactionStreamLoader.commit(TransactionStreamLoader.java:308)
… 18 more