Shared-data cluster: FE reports "Bad page: checksum mismatch"

【Details】Querying a table with 100 million (1ww) rows and 100 columns fails with the following errors:
2025-04-13 12:34:58.046+08:00 WARN (starrocks-mysql-nio-pool-1|812) [FragmentInstanceExecState.waitForDeploymentCompletion():306] exec plan fragment failed, errmsg=Query terminates prematurely backend [id=10215] [host=192.168.16.137], code=CANCELLED, fragmentId=F37, backend=192.168.16.137:9060
2025-04-13 12:34:58.046+08:00 WARN (starrocks-mysql-nio-pool-1|812) [FragmentInstanceExecState.waitForDeploymentCompletion():306] exec plan fragment failed, errmsg=Query terminates prematurely backend [id=10215] [host=192.168.16.137], code=CANCELLED, fragmentId=F46, backend=192.168.16.137:9060
2025-04-13 12:34:58.047+08:00 WARN (starrocks-mysql-nio-pool-1|812) [FragmentInstanceExecState.waitForDeploymentCompletion():306] exec plan fragment failed, errmsg=Query terminates prematurely backend [id=10214] [host=192.168.16.136], code=CANCELLED, fragmentId=F46, backend=192.168.16.136:9060
2025-04-13 12:34:58.047+08:00 WARN (starrocks-mysql-nio-pool-1|812) [FragmentInstanceExecState.waitForDeploymentCompletion():306] exec plan fragment failed, errmsg=Query terminates prematurely backend [id=10214] [host=192.168.16.136], code=CANCELLED, fragmentId=F49, backend=192.168.16.136:9060
2025-04-13 12:34:58.057+08:00 WARN (starrocks-mysql-nio-pool-1|812) [FragmentInstanceExecState.waitForDeploymentCompletion():306] exec plan fragment failed, errmsg=Query terminates prematurely backend [id=10215] [host=192.168.16.137], code=CANCELLED, fragmentId=F49, backend=192.168.16.137:9060
2025-04-13 12:34:58.061+08:00 WARN (thrift-server-pool-470|957) [DefaultCoordinator.lambda$updateFragmentExecStatus$9():1094] exec state report failed status=errorCode CORRUPTION Bad page: checksum mismatch (actual=1399904539 vs expect=0), file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat encrypted=false: BE:10214, query_id=a075f362-1820-11f0-aa66-6660846a4a2e, instance_id=a075f362-1820-11f0-aa66-6660846a4a62, backend_id=10214
2025-04-13 12:34:58.061+08:00 WARN (thrift-server-pool-457|943) [DefaultCoordinator.lambda$updateFragmentExecStatus$9():1094] exec state report failed status=errorCode CORRUPTION Bad page: checksum mismatch (actual=1399904539 vs expect=0), file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat encrypted=false: BE:10214, query_id=a075f362-1820-11f0-aa66-6660846a4a2e, instance_id=a075f362-1820-11f0-aa66-6660846a4a84, backend_id=10214
2025-04-13 12:34:58.061+08:00 WARN (starrocks-mysql-nio-pool-1|812) [StmtExecutor.execute():723] Query a075f362-1820-11f0-aa66-6660846a4a2e failed. Planner profile : Planner:

    -- Parser[1] 34ms
    -- Total[1] 491ms
        -- Analyzer[1] 30ms
            -- Lock[1] 0
            -- AnalyzeDatabase[1] 0
            -- AnalyzeTemporaryTable[1] 0
            -- AnalyzeTable[1] 0
        -- Transformer[1] 72ms
        -- Optimizer[1] 147ms
            -- MVPreprocess[1] 0
            -- MVTextRewrite[1] 0
            -- RuleBaseOptimize[1] 100ms
            -- CostBaseOptimize[1] 37ms
            -- PhysicalRewrite[1] 5ms
            -- DynamicRewrite[1] 0
            -- PlanValidate[1] 1ms
                -- InputDependenciesChecker[1] 0
                -- TypeChecker[1] 0
                -- CTEUniqueChecker[1] 0
                -- ColumnReuseChecker[1] 0
        -- ExecPlanBuild[1] 238ms
    -- Pending[1] 0
    -- Prepare[1] 15ms
    -- Deploy[1] 11s656ms
        -- DeployLockInternalTime[1] 11s656ms
            -- DeploySerializeConcurrencyTime[6] 22ms
            -- DeployStageByStageTime[18] 3ms
            -- DeployWaitTime[18] 11s618ms
                -- DeployAsyncSendTime[58] 2ms
    DeployDataSize: 1651788
    Reason:

2025-04-13 12:34:58.066+08:00 WARN (thrift-server-pool-473|960) [DefaultCoordinator.lambda$updateFragmentExecStatus$9():1094] exec state report failed status=errorCode CORRUPTION Bad page: checksum mismatch (actual=1399904539 vs expect=0), file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat encrypted=false: BE:10214, query_id=a075f362-1820-11f0-aa66-6660846a4a2e, instance_id=a075f362-1820-11f0-aa66-6660846a4a2f, backend_id=10214
2025-04-13 12:34:58.067+08:00 WARN (thrift-server-pool-460|946) [DefaultCoordinator.lambda$updateFragmentExecStatus$9():1094] exec state report failed status=errorCode CORRUPTION Bad page: checksum mismatch (actual=1399904539 vs expect=0), file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat encrypted=false: BE:10214, query_id=a075f362-1820-11f0-aa66-6660846a4a2e, instance_id=a075f362-1820-11f0-aa66-6660846a4a3b, backend_id=10214
2025-04-13 12:34:58.069+08:00 WARN (thrift-server-pool-466|952) [DefaultCoordinator.lambda$updateFragmentExecStatus$9():1094] exec state report failed status=errorCode CORRUPTION Bad page: checksum mismatch (actual=1399904539 vs expect=0), file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat encrypted=false: BE:10214, query_id=a075f362-1820-11f0-aa66-6660846a4a2e, instance_id=a075f362-1820-11f0-aa66-6660846a4a9b, backend_id=10214
2025-04-13 12:34:58.070+08:00 WARN (thrift-server-pool-463|949) [DefaultCoordinator.lambda$updateFragmentExecStatus$9():1094] exec state report failed status=errorCode CORRUPTION Bad page: checksum mismatch (actual=1399904539 vs expect=0), file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat encrypted=false: BE:10214, query_id=a075f362-1820-11f0-aa66-6660846a4a2e, instance_id=a075f362-1820-11f0-aa66-6660846a4a5b, backend_id=10214
2025-04-13 12:34:58.072+08:00 WARN (thrift-server-pool-457|943) [DefaultCoordinator.lambda$updateFragmentExecStatus$9():1094] exec state report failed status=errorCode CORRUPTION Bad page: checksum mismatch (actual=1399904539 vs expect=0), file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat encrypted=false: BE:10214, query_id=a075f362-1820-11f0-aa66-6660846a4a2e, instance_id=a075f362-1820-11f0-aa66-6660846a4a35, backend_id=10214
2025-04-13 12:34:58.072+08:00 WARN (thrift-server-pool-470|957) [DefaultCoordinator.lambda$updateFragmentExecStatus$9():1094] exec state report failed status=errorCode CORRUPTION Bad page: checksum mismatch (actual=1399904539 vs expect=0), file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat encrypted=false: BE:10214, query_id=a075f362-1820-11f0-aa66-6660846a4a2e, instance_id=a075f362-1820-11f0-aa66-6660846a4a5e, backend_id=10214
2025-04-13 12:34:58.073+08:00 WARN (thrift-server-pool-473|960) [DefaultCoordinator.lambda$updateFragmentExecStatus$9():1094] exec state report failed status=errorCode CORRUPTION Bad page: checksum mismatch (actual=1399904539 vs expect=0), file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat encrypted=false: BE:10214, query_id=a075f362-1820-11f0-aa66-6660846a4a2e, instance_id=a075f362-1820-11f0-aa66-6660846a4a58, backend_id=10214
2025-04-13 12:34:58.073+08:00 WARN (thrift-server-pool-460|946) [DefaultCoordinator.lambda$updateFragmentExecStatus$9():1094] exec state report failed status=errorCode CORRUPTION Bad page: checksum mismatch (actual=1399904539 vs expect=0), file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat encrypted=false: BE:10214, query_id=a075f362-1820-11f0-aa66-6660846a4a2e, instance_id=a075f362-1820-11f0-aa66-6660846a4a7e, backend_id=10214
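【Background】Data cache is enabled, and the datacache path points to tmpfs.
【Business impact】Queries cannot be run.
【Shared-data cluster?】Yes
【StarRocks version】3.4.1
【Cluster size】3 CN + 1 FE
【Machine info】Three servers, each with 128 cores / 256 GB RAM / 10 GbE networking; the BE memory limit is capped at 100 GB per node.
【Contact】Community group 26, 索马里海豹
【Attachments】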

I also see this error: error: load_segments failed tablet:52426 rowset:3960 segid:0: Bad segment file staros://52426/data/0000000000008462_5836b21e-8101-4a93-8d40-9f53068b6ff4.dat: magic number not match: BE:10215
Is this caused by a bad disk, or is something misconfigured? How should I go about troubleshooting it?
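
As a first troubleshooting step, it may help to tally which files and which backends the corruption reports come from. The following is a minimal sketch, not a StarRocks tool: it assumes each FE log entry sits on a single line, and it only recognizes the two messages quoted in this post ("checksum mismatch" and "magic number not match"). The function name `summarize_corruption` and the sample list are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Regexes based only on the log lines quoted in this post (an assumption:
# other StarRocks versions may word these messages differently).
KIND_RE = re.compile(r"checksum mismatch|magic number not match")
FILE_RE = re.compile(r"staros://\S+?\.dat")
BE_RE = re.compile(r"BE:(\d+)")


def summarize_corruption(lines):
    """Count corruption reports per (error kind, data file, backend id)."""
    counts = Counter()
    for line in lines:
        kind = KIND_RE.search(line)
        if kind is None:
            continue  # not one of the two corruption messages
        file_match = FILE_RE.search(line)
        be_match = BE_RE.search(line)
        key = (
            kind.group(0),
            file_match.group(0) if file_match else "unknown file",
            be_match.group(1) if be_match else "unknown BE",
        )
        counts[key] += 1
    return counts


# Two entries copied from the errors above, shortened to the relevant part.
SAMPLE_LINES = [
    "exec state report failed status=errorCode CORRUPTION Bad page: "
    "checksum mismatch (actual=1399904539 vs expect=0), "
    "file=staros://52579/data/000000000000833d_6976d74a-d85e-4790-8cc1-71f1cec28736.dat "
    "encrypted=false: BE:10214, backend_id=10214",
    "error: load_segments failed tablet:52426 rowset:3960 segid:0: Bad segment file "
    "staros://52426/data/0000000000008462_5836b21e-8101-4a93-8d40-9f53068b6ff4.dat: "
    "magic number not match: BE:10215",
]

for (kind, path, be_id), n in summarize_corruption(SAMPLE_LINES).items():
    print(f"{n}x {kind} on BE {be_id}: {path}")
```

To run it on a real log, pass an open file object instead of `SAMPLE_LINES`. If the reports cluster on one backend or on a few files, that narrows where to look next (for example, that node's tmpfs-backed cache versus the object storage), although the log alone cannot confirm the root cause.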