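Restoring within the same cluster works normally. Restoring to the ARM-architecture cluster fails with a message that the backup metadata cannot be read.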

Steps on the source cluster (x86):
mysql> show tables;
+------------------+
| Tables_in_testdb |
+------------------+
| backup_user      |
+------------------+
1 row in set (0.00 sec)

mysql> drop table backup_user;
Query OK, 0 rows affected (0.01 sec)

mysql> RESTORE SNAPSHOT testdb.backup1 FROM sync_repo on (backup_user ) PROPERTIES ("backup_timestamp"="2024-02-22-10-27-05-731","replication_num" = "3");
Query OK, 0 rows affected (0.09 sec)

mysql> show tables;
+------------------+
| Tables_in_testdb |
+------------------+
| backup_user      |
+------------------+
1 row in set (0.00 sec)

mysql> select * from backup_user;
+------------+---------+-----------+-----------+-------+
| event_day  | site_id | city_code | user_name | pv    |
+------------+---------+-----------+-----------+-------+
| 2023-12-23 | 12      | 49        | lm        | 23    |
| 2023-12-25 | 10      | 26        | Mark      | 55    |
| 2024-02-01 | 13      | 52        | SR        | 10086 |
+------------+---------+-----------+-----------+-------+
3 rows in set (0.03 sec)

Steps on the ARM cluster:

RESTORE SNAPSHOT testdb.backup1 FROM sync_repo on (backup_user ) PROPERTIES ("backup_timestamp"="2024-02-22-10-27-05-731","replication_num" = "3");
Query OK, 0 rows affected (0.36 sec)

mysql> show restore\G
*************************** 1. row ***************************
JobId: 11077
Label: backup1
Timestamp: 2024-02-22-10-27-05-731
DbName: testdb
State: CANCELLED
AllowLoad: false
ReplicationNum: 3
RestoreObjs: backup_user PARTITIONS [p202201, p202202, p202203]
CreateTime: 2024-02-23 10:40:12
MetaPreparedTime: NULL
SnapshotFinishedTime: NULL
DownloadFinishedTime: NULL
FinishedTime: 2024-02-23 10:40:12
UnfinishedTasks:
Progress:
TaskErrMsg:
Status: [COMMON_ERROR, msg: Failed to read backup meta from file]
Timeout: 86400
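
When a restore is polled from a script, the vertical output above can be turned into a dictionary so that State and Status are easy to check. The following is a minimal sketch, not part of StarRocks: parse_vertical_row and restore_cancelled are hypothetical helper names, and SAMPLE is a shortened copy of the output shown above.

```python
# Shortened copy of the SHOW RESTORE output from this thread.
SAMPLE = """\
*************************** 1. row ***************************
JobId: 11077
Label: backup1
State: CANCELLED
CreateTime: 2024-02-23 10:40:12
TaskErrMsg:
Status: [COMMON_ERROR, msg: Failed to read backup meta from file]
"""


def parse_vertical_row(text):
    """Parse one row of vertical mysql client output into a dict."""
    row = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "**** 1. row ****" separator.
        if not line or line.startswith("*"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            row[key.strip()] = value.strip()
    return row


def restore_cancelled(row):
    """True when the job ended in the CANCELLED state."""
    return row.get("State") == "CANCELLED"


if __name__ == "__main__":
    row = parse_vertical_row(SAMPLE)
    if restore_cancelled(row):
        print("restore cancelled:", row.get("Status"))
```

Splitting on the first colon only keeps timestamps and the bracketed Status message intact.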

Repository information on both clusters. x86 cluster:

mysql> show snapshot on sync_repo
-> ;
+-------------------+-------------------------+--------+
| Snapshot          | Timestamp               | Status |
+-------------------+-------------------------+--------+
| backup1           | 2024-02-22-10-27-05-731 | OK     |
| backup_stg_full_1 | 2024-02-21-11-31-45-270 | OK     |
| backup_stg_full_2 | 2024-02-21-11-57-41-109 | OK     |
| mz_sydw_shtt2     | 2024-02-21-13-34-49-145 | OK     |
+-------------------+-------------------------+--------+
4 rows in set (0.02 sec)

ARM cluster:
mysql> show snapshot on sync_repo;
+-------------------+-------------------------+--------+
| Snapshot          | Timestamp               | Status |
+-------------------+-------------------------+--------+
| backup1           | 2024-02-22-10-27-05-731 | OK     |
| backup_stg_full_1 | 2024-02-21-11-31-45-270 | OK     |
| backup_stg_full_2 | 2024-02-21-11-57-41-109 | OK     |
| mz_sydw_shtt2     | 2024-02-21-13-34-49-145 | OK     |
+-------------------+-------------------------+--------+
4 rows in set (0.09 sec)

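Is there a quick way to migrate data from the x86 environment to the StarRocks cluster in the ARM environment?

For migrating a single table, you can use a StarRocks external table: https://docs.starrocks.io/zh/docs/data_source/External_table/#starrocks-%E5%A4%96%E9%83%A8%E8%A1%A8

Could you provide the fe.log and be.INFO logs from the ARM cluster for the time window of this restore job?

It is not a single-table migration. We hit a problem, and a single table is simply easier to test with.

Could you send the logs? We need them to see whether the cause of the restore error can be pinned down.

If you can upgrade the cluster, consider the cross-cluster data migration tool. The target cluster for the migration must be v3.1.8 or v3.2.3 or later. For details see https://docs.starrocks.io/zh/docs/administration/data_migration_tool/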

fe:
2024-02-23 10:41:08,965 INFO (starrocks-mysql-nio-pool-33|257) [HdfsService.listPath():58] receive a
list path request, path: hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_repository_sync_repo/_ss
*
2024-02-23 10:41:08,965 INFO (starrocks-mysql-nio-pool-33|257) [HdfsFsManager.getDistributedFileSyst
em():476] could not find file system for path hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_reposi
tory_sync_repo/_ss* create a new one
2024-02-23 10:41:09,043 INFO (starrocks-mysql-nio-pool-33|257) [BlobStorage.listWithoutBroker():816]
finished to list remote path hdfs://xxx.xx.xxx.xx:8020/starrocks/_starrocks_repository_sync_repo/
ss. get files: [[name: __ss_backup1, is file: false], [name: __ss_backup_stg_full_1, is file: fal
se], [name: __ss_backup_stg_full_2, is file: false], [name: __ss_mz_sydw_shtt2, is file: false]]
2024-02-23 10:41:09,043 INFO (starrocks-mysql-nio-pool-33|257) [HdfsService.listPath():58] receive a
list path request, path: hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_repository_sync_repo/_ss
backup1/_info

2024-02-23 10:41:09,046 INFO (starrocks-mysql-nio-pool-33|257) [BlobStorage.listWithoutBroker():816]
finished to list remote path hdfs://xxx.xx.xxx.xx:8020/starrocks/_starrocks_repository_sync_repo/
_ss_backup1/_info. get files: [[name: __info_2024-02-22-10-27-05-731.5a15a75615bca9c1d131482f6851
41d4, is file: true]]
2024-02-23 10:41:09,046 INFO (starrocks-mysql-nio-pool-33|257) [HdfsService.listPath():58] receive a
list path request, path: hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_repository_sync_repo/_ss
backup_stg_full_1/_info

hdfs dfs -ls /starrocks/__starrocks_repository_sync_repo/
Found 5 items
-rw-r--r-- 1 root supergroup 56 2024-02-22 11:14 /starrocks/__starrocks_repository_sync_repo/__repo_info

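Regarding "could not find file system for path hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_repository_sync_repo/_ss* create a new one": this file really does not exist.

I don't see any error messages on the BE side.

Could you send the complete fe.log from 2024-02-23 10:40:12 until the restore returned the failure?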

2024-02-23 10:40:12,573 INFO (replayer|85) [RestoreJob.cancelInternal():1516] finished to cancel res
tore job. is replay: true. RESTORE repo id: 11001, label: backup1, job id: -1, db id: 11046, db name
: testdb, status: [OK], timeout: 86400000, backup ts: 2024-02-22-10-27-05-731, state: PENDING
2024-02-23 10:40:12,574 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875088 - 875089
2024-02-23 10:40:15,620 INFO (nioEventLoopGroup-5-32|153) [RestBaseAction.handleRequest():70] receiv
e http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:40:15,627 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875089 - 875090
2024-02-23 10:40:20,629 INFO (nioEventLoopGroup-5-1|98) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:40:20,635 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875090 - 875091
2024-02-23 10:40:21,549 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875091 - 875092
2024-02-23 10:40:25,637 INFO (nioEventLoopGroup-5-2|107) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:40:25,645 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875092 - 875093
2024-02-23 10:40:30,647 INFO (nioEventLoopGroup-5-3|109) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:40:30,654 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875093 - 875094
2024-02-23 10:40:31,552 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875094 - 875095
2024-02-23 10:40:35,656 INFO (nioEventLoopGroup-5-4|117) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:40:35,664 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875095 - 875096
2024-02-23 10:40:40,666 INFO (nioEventLoopGroup-5-5|120) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:40:40,672 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875096 - 875097
2024-02-23 10:40:41,556 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875097 - 875098
2024-02-23 10:40:45,675 INFO (nioEventLoopGroup-5-6|121) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:40:45,681 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875098 - 875099
2024-02-23 10:40:50,683 INFO (nioEventLoopGroup-5-7|122) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:40:50,690 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875099 - 875100
2024-02-23 10:40:51,571 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875100 - 875101
2024-02-23 10:40:52,583 WARN (starrocks-mysql-nio-pool-33|257) [ConnectProcessor.handleQuery():385]
Process one query failed. SQL: show snapshot sync_repo, because.
com.starrocks.common.AnalysisException: Getting syntax error at line 1, column 14. Detail message: Unexpected input 'sync_repo', the most similar input is {'ON'}.
at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:341) ~[starrocks-fe.j
ar:?]
at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:478) ~[starrocks-fe.jar:
?]
at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:754) ~[starrocks-fe.j
ar:?]
at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:69) ~[starroc
ks-fe.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0
_391]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0
_391]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_391]
2024-02-23 10:41:01,564 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875103 - 875104
2024-02-23 10:41:05,710 INFO (nioEventLoopGroup-5-10|125) [RestBaseAction.handleRequest():70] receiv
e http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:41:05,717 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875104 - 875105
2024-02-23 10:41:08,965 INFO (starrocks-mysql-nio-pool-33|257) [HdfsService.listPath():58] receive a
list path request, path: hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_repository_sync_repo/_ss
*
2024-02-23 10:41:08,965 INFO (starrocks-mysql-nio-pool-33|257) [HdfsFsManager.getDistributedFileSyst
em():476] could not find file system for path hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_reposi
tory_sync_repo/_ss* create a new one
2024-02-23 10:41:09,043 INFO (starrocks-mysql-nio-pool-33|257) [BlobStorage.listWithoutBroker():816]
finished to list remote path hdfs://xxx.xx.xxx.xx:8020/starrocks/_starrocks_repository_sync_repo/
ss. get files: [[name: __ss_backup1, is file: false], [name: __ss_backup_stg_full_1, is file: fal
se], [name: __ss_backup_stg_full_2, is file: false], [name: __ss_mz_sydw_shtt2, is file: false]]
2024-02-23 10:41:09,043 INFO (starrocks-mysql-nio-pool-33|257) [HdfsService.listPath():58] receive a
list path request, path: hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_repository_sync_repo/_ss
backup1/_info

2024-02-23 10:41:09,046 INFO (starrocks-mysql-nio-pool-33|257) [BlobStorage.listWithoutBroker():816]
finished to list remote path hdfs://xxx.xx.xxx.xx:8020/starrocks/_starrocks_repository_sync_repo/
_ss_backup1/_info. get files: [[name: __info_2024-02-22-10-27-05-731.5a15a75615bca9c1d131482f6851
41d4, is file: true]]
2024-02-23 10:41:09,046 INFO (starrocks-mysql-nio-pool-33|257) [HdfsService.listPath():58] receive a
list path request, path: hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_repository_sync_repo/_ss
backup_stg_full_1/_info

2024-02-23 10:41:09,048 INFO (starrocks-mysql-nio-pool-33|257) [BlobStorage.listWithoutBroker():816]
finished to list remote path hdfs://xxx.xx.xxx.xx:8020/starrocks/_starrocks_repository_sync_repo/
_ss_backup_stg_full_1/_info. get files: [[name: __info_2024-02-21-11-31-45-270.3d52f4437d1873f857
1ed35ea543d682, is file: true]]
2024-02-23 10:41:09,049 INFO (starrocks-mysql-nio-pool-33|257) [HdfsService.listPath():58] receive a
list path request, path: hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_repository_sync_repo/_ss
backup_stg_full_2/_info

2024-02-23 10:41:09,053 INFO (starrocks-mysql-nio-pool-33|257) [BlobStorage.listWithoutBroker():816]
finished to list remote path hdfs://xxx.xx.xxx.xx:8020/starrocks/_starrocks_repository_sync_repo/
_ss_backup_stg_full_2/_info. get files: [[name: __info_2024-02-21-11-57-41-109.4ced02f1ec803a17e5
28de5ebb6d5ca6, is file: true]]
2024-02-23 10:41:09,053 INFO (starrocks-mysql-nio-pool-33|257) [HdfsService.listPath():58] receive a
list path request, path: hdfs://xxx.xx.xxx.xx:8020/starrocks/__starrocks_repository_sync_repo/_ss
mz_sydw_shtt2/_info

2024-02-23 10:41:09,055 INFO (starrocks-mysql-nio-pool-33|257) [BlobStorage.listWithoutBroker():816]
finished to list remote path hdfs://xxx.xx.xxx.xx:8020/starrocks/_starrocks_repository_sync_repo/
_ss_mz_sydw_shtt2/_info*. get files: [[name: __info_2024-02-21-13-34-49-145.840b794cc2caa5bb847f20
64b1e1147f, is file: true]]
2024-02-23 10:41:10,719 INFO (nioEventLoopGroup-5-11|126) [RestBaseAction.handleRequest():70] receiv
e http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:41:10,725 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875105 - 875106
2024-02-23 10:41:11,568 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875106 - 875107
2024-02-23 10:41:15,728 INFO (nioEventLoopGroup-5-12|127) [RestBaseAction.handleRequest():70] receiv
e http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:41:15,734 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875107 - 875108
2024-02-23 10:41:20,736 INFO (nioEventLoopGroup-5-13|128) [RestBaseAction.handleRequest():70] receiv
e http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:41:20,743 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875108 - 875109
2024-02-23 10:41:21,573 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875109 - 875110
2024-02-23 10:41:25,745 INFO (nioEventLoopGroup-5-14|129) [RestBaseAction.handleRequest():70] receiv
e http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:41:25,754 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875110 - 875111
2024-02-23 10:43:05,940 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875140 - 875141
2024-02-23 10:43:10,943 INFO (nioEventLoopGroup-5-3|109) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:43:10,949 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875141 - 875142
2024-02-23 10:43:11,616 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875142 - 875143
2024-02-23 10:43:15,952 INFO (nioEventLoopGroup-5-4|117) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:43:15,959 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875143 - 875144
2024-02-23 10:43:20,962 INFO (nioEventLoopGroup-5-5|120) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:43:20,968 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875144 - 875145
2024-02-23 10:43:21,620 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875145 - 875146
2024-02-23 10:43:25,971 INFO (nioEventLoopGroup-5-6|121) [RestBaseAction.handleRequest():70] receive
http request. url=/api/bootstrap?cluster_id=1311759495&token=4811fb63-c382-4696-a6f9-226982ee11b8
2024-02-23 10:43:25,978 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ
al from 875146 - 875147

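The error in the middle comes from a mistyped show command (the log records show snapshot sync_repo, which is missing ON). Nothing else looks unusual.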
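
Most of the fe.log excerpt above consists of journal replay and heartbeat entries, which makes the few relevant records hard to spot. The sketch below is one possible way to filter such a paste. It assumes the log is hard-wrapped as it is in this thread, that every record starts with a timestamp, and that the two NOISE markers are safe to drop; group_records and relevant are hypothetical helper names, and SAMPLE is a shortened copy of lines from the log.

```python
import re

# A record starts with a timestamp such as "2024-02-23 10:40:12,573".
TS = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")
# Assumed noise: journal replay and heartbeat entries seen in the pasted log.
NOISE = ("replayJournalInner", "/api/bootstrap")

SAMPLE = [
    "2024-02-23 10:40:12,574 INFO (replayer|85) [GlobalStateMgr.replayJournalInner():2315] replayed journ",
    "al from 875088 - 875089",
    "2024-02-23 10:40:52,583 WARN (starrocks-mysql-nio-pool-33|257) [ConnectProcessor.handleQuery():385]",
    "Process one query failed. SQL: show snapshot sync_repo, because.",
    "",
    "2024-02-23 10:41:05,710 INFO (nioEventLoopGroup-5-10|125) [RestBaseAction.handleRequest():70] receiv",
    "e http request. url=/api/bootstrap?cluster_id=1311759495",
]


def group_records(lines):
    """Group hard-wrapped lines into records, one list of lines per record."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # blank lines carry no information
        if TS.match(line) or not records:
            records.append([line])
        else:
            records[-1].append(line)  # continuation of a wrapped record
    return records


def relevant(records, start, end):
    """Keep records inside [start, end] that match none of the NOISE markers."""
    kept = []
    for rec in records:
        stamp = rec[0][:19]  # "YYYY-MM-DD HH:MM:SS" sorts lexicographically
        text = "".join(rec)  # undo the hard wrap before searching
        if start <= stamp <= end and not any(n in text for n in NOISE):
            kept.append(rec)
    return kept


if __name__ == "__main__":
    window = relevant(group_records(SAMPLE),
                      "2024-02-23 10:40:12", "2024-02-23 10:41:30")
    for rec in window:
        print("\n".join(rec))
```

On a real file, the lines can come from open("fe.log") in place of SAMPLE; the time window here matches the CreateTime reported by SHOW RESTORE.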

  • The target cluster for data migration must be v3.1.8 or v3.2.3 or later. Our target cluster is on 3.1.3, so this does not fit.
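
The version requirement can be written as a small check. This is only a sketch under one reading of the sentence quoted above, namely that a 3.1.x target needs at least 3.1.8 and any other target needs at least 3.2.3; that reading is an assumption and should be confirmed against the migration tool documentation. meets_migration_requirement is a hypothetical helper name.

```python
def parse_version(text):
    """Turn "v3.1.8" or "3.1.8" into the tuple (3, 1, 8)."""
    return tuple(int(part) for part in text.lstrip("vV").split("."))


def meets_migration_requirement(version):
    """Assumed rule: 3.1.x needs >= 3.1.8, everything else needs >= 3.2.3."""
    v = parse_version(version)
    if v[:2] == (3, 1):
        return v >= (3, 1, 8)
    return v >= (3, 2, 3)


if __name__ == "__main__":
    for candidate in ("3.1.3", "3.1.8", "3.2.3"):
        print(candidate, meets_migration_requirement(candidate))
```

Under this reading, the 3.1.3 target cluster mentioned above does not qualify.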

I ran another test. Backup and restore within the ARM cluster also work fine. Only the x86-to-ARM restore fails.

mysql> select * from backup_user;
+------------+---------+-----------+-----------+-------+
| event_day  | site_id | city_code | user_name | pv    |
+------------+---------+-----------+-----------+-------+
| 2024-02-01 | 13      | 52        | SR        | 10086 |
| 2023-12-23 | 12      | 49        | lm        | 23    |
| 2023-12-25 | 10      | 26        | Mark      | 55    |
+------------+---------+-----------+-----------+-------+
3 rows in set (0.05 sec)

mysql> BACKUP SNAPSHOT testdb.backup2 TO sync_repo on (backup_user ) PROPERTIES ('type'='full', 'timeout' = '3600');
Query OK, 0 rows affected (0.03 sec)

mysql> drop table backup_user;
Query OK, 0 rows affected (0.01 sec)

mysql> show snapshot on sync_repo;
+-------------------+-------------------------+--------+
| Snapshot          | Timestamp               | Status |
+-------------------+-------------------------+--------+
| backup1           | 2024-02-22-10-27-05-731 | OK     |
| backup2           | 2024-02-23-15-20-10-002 | OK     |
| backup_stg_full_1 | 2024-02-21-11-31-45-270 | OK     |
| backup_stg_full_2 | 2024-02-21-11-57-41-109 | OK     |
| mz_sydw_shtt2     | 2024-02-21-13-34-49-145 | OK     |
+-------------------+-------------------------+--------+
5 rows in set (0.09 sec)

mysql> RESTORE SNAPSHOT testdb.backup2 FROM sync_repo on (backup_user ) PROPERTIES ("backup_timestamp"="2024-02-23-15-20-10-002","replication_num" = "3");
Query OK, 0 rows affected (0.04 sec)

mysql> show tables;
+----------------------+
| Tables_in_testdb     |
+----------------------+
| backup_user          |
| external_backup_user |
+----------------------+
2 rows in set (0.00 sec)

mysql> select * from backup_user;
ERROR 1064 (HY000): Getting analyzing error. Detail message: Table state is not NORMAL: 'RESTORING'.
mysql> select * from backup_user;
+------------+---------+-----------+-----------+-------+
| event_day  | site_id | city_code | user_name | pv    |
+------------+---------+-----------+-----------+-------+
| 2023-12-23 | 12      | 49        | lm        | 23    |
| 2023-12-25 | 10      | 26        | Mark      | 55    |
| 2024-02-01 | 13      | 52        | SR        | 10086 |
+------------+---------+-----------+-----------+-------+
3 rows in set (0.02 sec)

mysql>