为了更快的定位您的问题,请提供以下信息,谢谢
【详述】在starrocks正常运行的状态下,reboot每台虚拟机,启动后fe无法正常启动
【背景】在3台虚拟机上搭建了3台FE和3台BE的集群,可以都正常运行,fe.conf和be.conf下面也都配置了priority_networks为{当前部署虚拟机ip}/24,配置好service服务,加入开机自启服务,集群没有数据,是个全新的集群,然后在服务正常运行的情况下,直接reboot虚拟机,但是重启后发现FE集群无法启动,mysql客户端无法连接9030端口
【业务影响】
【是否存算分离】否
【StarRocks版本】3.4
【集群规模】3fe(1 leader + 2 follower)+3be(fe与be混部)
【机器信息】内存16G,磁盘没有满
【联系方式】rootwang@163.com
【附件】
- 三台fe.log中反复报
我查看其中提示信息,在我这里都不存在,时间是同步的:
It took too much time for FE to transfer to a stable state(LEADER/FOLLOWER), it maybe caused by one of the following reasons:- There are too many BDB logs to replay, because of previous failure of checkpoint(you can check the create time of image file under meta/image dir).
- Majority voting members(LEADER or FOLLOWER) of the FE cluster haven’t started completely.
- FE node has multiple IPs, you should configure the priority_networks in fe.conf to match the ip record in meta/image/ROLE. And we don’t support change the ip of FE node. Ignore this reason if you are using FQDN.
- The time deviation between FE nodes is greater than 5s, please use ntp or other tools to keep clock synchronized.
- The configuration of edit_log_port has changed, please reset to the original value.
- The replayer thread may get stuck, please use jstack to find the details.
各位大佬请帮忙看看,谢谢,相关日志如下admin-3.zip (63.8 KB)
