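【Details】After upgrading from version 2.4.1 to 2.5.3 and starting to use the asynchronous materialized view feature, the 3 node servers randomly reboot on their own. The servers are VMware virtual machines. The VMware management console shows the error: "The CPU has been disabled by the guest operating system. Power off or reset the virtual machine."
【Business impact】Queries are very slow, and occasionally queries cannot be run at all.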
【StarRocks version】2.5.3
【Cluster size】3 FE (3 followers) + 3 BE (FE and BE co-located)
【Machine info】vCPUs/memory/NIC: 8C/16G/10 GbE
【Contact】
【Attachments】
- fe.log / be.INFO / relevant screenshots
- Slow queries:
- Parallelism: show variables like '%parallel_fragment_exec_instance_num%';
- Whether pipeline is enabled: show variables like '%pipeline%';
- Screenshots of FE/BE node CPU and memory usage before and after the restart window:
Offline time:
Heap memory:
QPS:
CPU:
BE memory:
fe.out log around the time of the restart:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/data/server/fe/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/server/fe/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/data/server/fe/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
[2023-04-07 14:13:58] notify new FE type transfer: UNKNOWN
[2023-04-07 14:13:58] notify new FE type transfer: FOLLOWER
log4j:WARN No appenders could be found for logger (velocity).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
四月 07, 2023 2:14:48 下午 com.github.benmanes.caffeine.cache.LocalAsyncCache$AsyncBulkCompleter accept
警告: Exception thrown during asynchronous load
java.util.concurrent.CompletionException: com.starrocks.sql.common.StarRocksPlannerException: StarRocks planner use long time 3000 ms in logical phase, This probably because 1. FE Full GC, 2. Hive external table fetch metadata took a long time, 3. The SQL is very complex. You could 1. adjust FE JVM config, 2. try query again, 3. enlarge new_planner_optimize_timeout session variable
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1596)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1067)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1703)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:172)
Caused by: com.starrocks.sql.common.StarRocksPlannerException: StarRocks planner use long time 3000 ms in logical phase, This probably because 1. FE Full GC, 2. Hive external table fetch metadata took a long time, 3. The SQL is very complex. You could 1. adjust FE JVM config, 2. try query again, 3. enlarge new_planner_optimize_timeout session variable
at com.starrocks.sql.optimizer.task.SeriallyTaskScheduler.executeTasks(SeriallyTaskScheduler.java:38)
at com.starrocks.sql.optimizer.Optimizer.ruleRewriteIterative(Optimizer.java:479)
at com.starrocks.sql.optimizer.Optimizer.logicalRuleRewrite(Optimizer.java:220)
at com.starrocks.sql.optimizer.Optimizer.rewriteAndValidatePlan(Optimizer.java:324)
at com.starrocks.sql.optimizer.Optimizer.optimizeByCost(Optimizer.java:132)
at com.starrocks.sql.optimizer.Optimizer.optimize(Optimizer.java:93)
at com.starrocks.sql.StatementPlanner.createQueryPlan(StatementPlanner.java:95)
at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:66)
at com.starrocks.statistic.StatisticExecutor.executeDQL(StatisticExecutor.java:239)
at com.starrocks.statistic.StatisticExecutor.queryStatisticSync(StatisticExecutor.java:83)
at com.starrocks.sql.optimizer.statistics.ColumnBasicStatsCacheLoader.queryStatisticsData(ColumnBasicStatsCacheLoader.java:111)
at com.starrocks.sql.optimizer.statistics.ColumnBasicStatsCacheLoader.lambda$asyncLoadAll$1(ColumnBasicStatsCacheLoader.java:77)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
... 5 more
四月 07, 2023 2:14:48 下午 com.github.benmanes.caffeine.cache.LocalAsyncCache$AsyncBulkCompleter accept
警告: Exception thrown during asynchronous load
java.util.concurrent.CompletionException: com.starrocks.sql.common.StarRocksPlannerException: StarRocks planner use long time 3000 ms in logical phase, This probably because 1. FE Full GC, 2. Hive external table fetch metadata took a long time, 3. The SQL is very complex. You could 1. adjust FE JVM config, 2. try query again, 3. enlarge new_planner_optimize_timeout session variable
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1596)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1067)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1703)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:172)
Caused by: com.starrocks.sql.common.StarRocksPlannerException: StarRocks planner use long time 3000 ms in logical phase, This probably because 1. FE Full GC, 2. Hive external table fetch metadata took a long time, 3. The SQL is very complex. You could 1. adjust FE JVM config, 2. try query again, 3. enlarge new_planner_optimize_timeout session variable
at com.starrocks.sql.optimizer.task.SeriallyTaskScheduler.executeTasks(SeriallyTaskScheduler.java:38)
at com.starrocks.sql.optimizer.Optimizer.ruleRewriteIterative(Optimizer.java:479)
at com.starrocks.sql.optimizer.Optimizer.logicalRuleRewrite(Optimizer.java:220)
at com.starrocks.sql.optimizer.Optimizer.rewriteAndValidatePlan(Optimizer.java:324)
at com.starrocks.sql.optimizer.Optimizer.optimizeByCost(Optimizer.java:132)
at com.starrocks.sql.optimizer.Optimizer.optimize(Optimizer.java:93)
at com.starrocks.sql.StatementPlanner.createQueryPlan(StatementPlanner.java:95)
at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:66)
at com.starrocks.statistic.StatisticExecutor.executeDQL(StatisticExecutor.java:239)
at com.starrocks.statistic.StatisticExecutor.queryStatisticSync(StatisticExecutor.java:83)
at com.starrocks.sql.optimizer.statistics.ColumnBasicStatsCacheLoader.queryStatisticsData(ColumnBasicStatsCacheLoader.java:111)
at com.starrocks.sql.optimizer.statistics.ColumnBasicStatsCacheLoader.lambda$asyncLoadAll$1(ColumnBasicStatsCacheLoader.java:77)
at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
… 5 more
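
To get a rough picture of how often the planner timeout above occurs, the fe.out lines can be scanned for that message. The following is only a minimal sketch: the function name `summarize_planner_timeouts` and the regular expression are my own assumptions based on the log text pasted above, not part of StarRocks. It counts only the `Caused by:` lines so that each exception is counted once rather than once for the wrapper and once for the cause.

```python
import re
from collections import Counter

# Pattern derived from the message text in the fe.out excerpt above
# (an assumption: other versions may word the message differently).
PLANNER_TIMEOUT = re.compile(
    r"StarRocks planner use long time (\d+) ms in (\w+) phase"
)


def summarize_planner_timeouts(lines):
    """Count planner-timeout root causes per (phase, timeout_ms).

    Hypothetical helper: only 'Caused by:' lines are inspected, so the
    wrapping CompletionException line is not counted a second time.
    """
    counts = Counter()
    for line in lines:
        if not line.lstrip().startswith("Caused by:"):
            continue
        match = PLANNER_TIMEOUT.search(line)
        if match:
            timeout_ms = int(match.group(1))
            phase = match.group(2)
            counts[(phase, timeout_ms)] += 1
    return counts


# Two lines shaped like the excerpt above (message tail shortened here).
sample = [
    "java.util.concurrent.CompletionException: "
    "com.starrocks.sql.common.StarRocksPlannerException: "
    "StarRocks planner use long time 3000 ms in logical phase",
    "Caused by: com.starrocks.sql.common.StarRocksPlannerException: "
    "StarRocks planner use long time 3000 ms in logical phase",
]

for (phase, ms), n in summarize_planner_timeouts(sample).items():
    print(f"{phase} phase, {ms} ms limit: {n} occurrence(s)")
```

To run it against a real log, pass an open file object (for example `open("fe.out", encoding="utf-8", errors="replace")`) instead of `sample`. Comparing the counts with the VM restart times may help show whether the planner timeouts cluster around the reboots or happen independently of them.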