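[Details] The BE nodes keep printing [Driver] Timeout log entries; each node writes about 2 GB of [Driver] Timeout warning logs per minute.
[Background] This started around the time we set up resource isolation by account/role. Before that, we basically never saw this problem.
[Business impact] The cluster currently shows no abnormal behavior and is running stably.
[Shared-data (storage-compute separation)] No
[StarRocks version] 2.5.12
[Cluster size] 7 FE (1 leader + 6 followers) + 6 BE (the FEs are virtual machines; the BEs are dedicated nodes)
[Machine info] 50 cores / 256 GB / 10 GbE
[Contact]
[Attachments]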
W1225 11:11:11.595973 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=be50aa9f-6be2-4079-b9ad-0f92053e7d40, instance_id=be50aa9f-6be2-4079-b9ad-0f92053e7d41
W1225 11:11:11.785729 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=be50aa9f-6be2-4079-bb73-e5f96035d852, instance_id=be50aa9f-6be2-4079-bb73-e5f96035d853
W1225 11:11:11.785744 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=9a8836b4-c7b4-451e-a8b6-31b2169b1dde, instance_id=9a8836b4-c7b4-451e-a8b6-31b2169b1ddf
W1225 11:11:11.785763 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=9a8836b4-c7b4-451e-a624-5ffe95c05bfa, instance_id=9a8836b4-c7b4-451e-a624-5ffe95c05bfb
W1225 11:11:11.785774 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=2740a89e-61fa-4a42-9429-41b3b074ae60, instance_id=2740a89e-61fa-4a42-9429-41b3b074ae61
W1225 11:11:11.785882 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=2740a89e-61fa-4a42-9429-41b3b074ae60, instance_id=2740a89e-61fa-4a42-9429-41b3b074ae61
W1225 11:11:11.785902 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=2740a89e-61fa-4a42-9429-41b3b074ae60, instance_id=2740a89e-61fa-4a42-9429-41b3b074ae61
W1225 11:11:11.785909 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=2740a89e-61fa-4a42-ac3d-a4000c080f9b, instance_id=2740a89e-61fa-4a42-ac3d-a4000c080f9c
W1225 11:11:11.785917 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=2740a89e-61fa-4a42-ac3d-a4000c080f9b, instance_id=2740a89e-61fa-4a42-ac3d-a4000c080f9c
W1225 11:11:11.785939 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=2740a89e-61fa-4a42-ac3d-a4000c080f9b, instance_id=2740a89e-61fa-4a42-ac3d-a4000c080f9c
W1225 11:11:11.785954 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=987e6f40-6fc2-4ddf-98f2-1db4e76524c5, instance_id=987e6f40-6fc2-4ddf-98f2-1db4e76524c6
W1225 11:11:11.785962 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=987e6f40-6fc2-4ddf-98f2-1db4e76524c5, instance_id=987e6f40-6fc2-4ddf-98f2-1db4e76524c6
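To see which queries are behind the flood, it can help to group the warnings by `query_id`. Below is a minimal Python sketch, assuming the warning lines have exactly the format pasted above; the function name `count_timeouts` and the regex are illustrative and not part of StarRocks.

```python
import re
from collections import Counter

# Matches the "[Driver] Timeout" warnings in the format shown in the pasted log.
# Assumption: query_id and instance_id are lowercase hex UUIDs, as in the sample.
TIMEOUT_RE = re.compile(
    r"\[Driver\] Timeout, "
    r"query_id=(?P<query_id>[0-9a-f-]+), "
    r"instance_id=(?P<instance_id>[0-9a-f-]+)"
)


def count_timeouts(lines):
    """Return a Counter mapping query_id to its number of timeout warnings."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group("query_id")] += 1
    return counts


# Three lines copied from the log above, used as a small sample.
sample = [
    "W1225 11:11:11.785774 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=2740a89e-61fa-4a42-9429-41b3b074ae60, instance_id=2740a89e-61fa-4a42-9429-41b3b074ae61",
    "W1225 11:11:11.785882 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=2740a89e-61fa-4a42-9429-41b3b074ae60, instance_id=2740a89e-61fa-4a42-9429-41b3b074ae61",
    "W1225 11:11:11.785954 10056 pipeline_driver_poller.cpp:70] [Driver] Timeout, query_id=987e6f40-6fc2-4ddf-98f2-1db4e76524c5, instance_id=987e6f40-6fc2-4ddf-98f2-1db4e76524c6",
]

# Print query_ids from most to least frequent.
for query_id, n in count_timeouts(sample).most_common():
    print(query_id, n)
```

For a real log, passing an open file object instead of `sample` works the same way, since the function only iterates over lines. A handful of `query_id` values accounting for most of the warnings would suggest a few stuck queries rather than a cluster-wide problem.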
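Try restarting this BE node and see whether it recovers.
Also, why does this cluster have 7 FE nodes deployed? Three FEs are usually enough. If you need to scale out query capacity, you can add observer nodes instead of follower nodes.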