StarRocks 2.5: data is not placed on the storage medium specified in the CREATE TABLE statement

【Details】Before enable_strict_storage_medium_check was enabled, tablets were not placed on the HDD nodes as expected. The storage medium is only honored after it is enabled.
【CREATE TABLE statement】CREATE TABLE test.test1 (
logtime datetime NULL COMMENT "None",
md5 varchar(65533) NULL COMMENT "None",
key varchar(65533) NULL COMMENT "None",
ip varchar(65533) NULL COMMENT "None",
modify_time bigint(20) NULL COMMENT ""
) ENGINE=OLAP
UNIQUE KEY(logtime, md5)
COMMENT "test1"
PARTITION BY RANGE(logtime)
(
PARTITION p20240109 VALUES [("2024-01-09 00:00:00"), ("2024-01-10 00:00:00")),
PARTITION p20240110 VALUES [("2024-01-10 00:00:00"), ("2024-01-11 00:00:00")),
PARTITION p20240111 VALUES [("2024-01-11 00:00:00"), ("2024-01-12 00:00:00")),
PARTITION p20240112 VALUES [("2024-01-12 00:00:00"), ("2024-01-13 00:00:00")),
PARTITION p20240113 VALUES [("2024-01-13 00:00:00"), ("2024-01-14 00:00:00")),
PARTITION p20240114 VALUES [("2024-01-14 00:00:00"), ("2024-01-15 00:00:00")),
PARTITION p20240115 VALUES [("2024-01-15 00:00:00"), ("2024-01-16 00:00:00")),
PARTITION p20240116 VALUES [("2024-01-16 00:00:00"), ("2024-01-17 00:00:00")),
PARTITION p20240117 VALUES [("2024-01-17 00:00:00"), ("2024-01-18 00:00:00")),
PARTITION p20240118 VALUES [("2024-01-18 00:00:00"), ("2024-01-19 00:00:00")),
PARTITION p20240119 VALUES [("2024-01-19 00:00:00"), ("2024-01-20 00:00:00")),
PARTITION p20240120 VALUES [("2024-01-20 00:00:00"), ("2024-01-21 00:00:00")),
PARTITION p20240121 VALUES [("2024-01-21 00:00:00"), ("2024-01-22 00:00:00")),
PARTITION p20240122 VALUES [("2024-01-22 00:00:00"), ("2024-01-23 00:00:00")))
DISTRIBUTED BY HASH(md5) BUCKETS 1
PROPERTIES (
"replication_num" = "3",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.time_zone" = "Asia/Shanghai",
"dynamic_partition.start" = "-35",
"dynamic_partition.end" = "7",
"dynamic_partition.prefix" = "p",
"dynamic_partition.buckets" = "2",
"dynamic_partition.history_partition_num" = "0",
"in_memory" = "false",
"storage_format" = "DEFAULT",
"enable_persistent_index" = "false",
"write_quorum" = "ONE",
"compression" = "LZ4",
"storage_medium" = "HDD"
);
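【Background】Before enable_strict_storage_medium_check was enabled, tablets were not placed on the HDD nodes as expected.
【Business impact】
【Shared-data (storage-compute separation)】No
【StarRocks version】2.5.14
【Cluster size】5 FE (1 follower + 2 observer) + 7 BE (FE and BE co-located)
【Machine info】4 SSD nodes (data01-data04), 3 HDD nodes (data05-data07)
【Contact】Community group 7-n1
【Attachments】
- backends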

Sorry, I don't quite see what the two screenshots are meant to compare. Also, are the machines' hostnames really data01-data07?

Yes. I expect all data to be stored on data05-data07 (the HDD servers), but data01-data04 (the SSD servers) also hold data. This reproduces consistently.

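I see what you mean. You're saying that even before enable_strict_storage_medium_check was enabled, since the cluster does have HDD disks, the tablets should all have been placed on the HDD nodes and not on data01-04, right?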

Right. The table was created with the property storage_medium=HDD, but the tablets were not placed on the HDD nodes as expected.
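
For readers who want to confirm the same symptom, here is a rough sketch of how the placement could be checked. It assumes the table name from the CREATE TABLE statement in this thread. Column names in the output may differ between versions, so treat the comments as a guide rather than an exact description.

```sql
-- List the tablet replicas of the table; the BackendId column
-- indicates which BE holds each replica.
SHOW TABLET FROM test.test1;

-- List the BEs so that each BackendId can be matched to its host
-- (data01-data04 are the SSD nodes, data05-data07 the HDD nodes here).
SHOW BACKENDS;
```

If any BackendId from the first statement maps to data01-data04, the replicas are not confined to the HDD nodes.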

How is storage_root_path configured in be.conf on data01-data04? Could you share a screenshot?

data01-data04:


data05-data07:
storage_root_path=/mnt/data1/starrocks/be/storage,medium:hdd;/mnt/data2/starrocks/be/storage,medium:hdd;/mnt/data3/starrocks/be/storage,medium:hdd;/mnt/data4/starrocks/be/storage,medium:hdd;/mnt/data5/starrocks/be/storage,medium:hdd;/mnt/data6/starrocks/be/storage,medium:hdd;/mnt/data7/starrocks/be/storage,medium:hdd;/mnt/data8/starrocks/be/storage,medium:hdd;/mnt/data9/starrocks/be/storage,medium:hdd;/mnt/data10/starrocks/be/storage,medium:hdd;/mnt/data11/starrocks/be/storage,medium:hdd;/mnt/data12/starrocks/be/storage,medium:hdd;/mnt/data13/starrocks/be/storage,medium:hdd;/mnt/data14/starrocks/be/storage,medium:hdd;/mnt/data15/starrocks/be/storage,medium:hdd;/mnt/data16/starrocks/be/storage,medium:hdd;/mnt/data17/starrocks/be/storage,medium:hdd;/mnt/data18/starrocks/be/storage,medium:hdd;/mnt/data19/starrocks/be/storage,medium:hdd;/mnt/data20/starrocks/be/storage,medium:hdd;/mnt/data21/starrocks/be/storage,medium:hdd;/mnt/data22/starrocks/be/storage,medium:hdd;/mnt/data23/starrocks/be/storage,medium:hdd;/mnt/data24/starrocks/be/storage,medium:hdd

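I looked at the code for this. When enable_strict_storage_medium_check is set to true, the storage medium is validated when BEs are chosen for tablet creation. When it is false, the storage medium check is skipped entirely. So with the default value of false, the behavior you described can occur. If you want data to be distributed strictly according to the storage medium set in your CREATE TABLE statement, set enable_strict_storage_medium_check to true.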
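
A minimal sketch of how the setting could be inspected and changed at runtime. It assumes enable_strict_storage_medium_check is a dynamically mutable FE configuration item in this version, which is worth verifying against the 2.5 documentation. A value changed with ADMIN SET FRONTEND CONFIG is not persisted across FE restarts, so the same key would also need to be added to fe.conf to make it permanent.

```sql
-- Show the current value on the FE this session is connected to.
ADMIN SHOW FRONTEND CONFIG LIKE "enable_strict_storage_medium_check";

-- Turn strict checking on at runtime
-- (assumption: the item is mutable in this version).
ADMIN SET FRONTEND CONFIG ("enable_strict_storage_medium_check" = "true");
```

Since the check is described as applying when BEs are chosen for tablet creation, partitions that already exist may still need to be checked separately after the change.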