【详述】雾化视图(异步)数据的保留时长无法比明细表更久?
【背景】创建了一张明细表,用于记录调用日志明细,基于明细表创建雾化视图,按照api+hour维度聚合指标数据,比如调用失败量、平均耗时;明细表有event_time时间字段,使用event_time按天分区,另外雾化视图用event_time小时聚合;
明细表设置分区属性:
“dynamic_partition.time_unit” = “DAY”, – 按天分区
“dynamic_partition.start” = “-2”, – 从当前日期前2天开始
雾化视图设置ttl:
“partition_ttl” = “30 DAY”,
现象是雾化视图的保留时长是跟明细表保持一致,不是设置的30天,想达到的效果是雾化视图数据的保留时长更久,不受明细表影响。
简化后的建表模型和雾化视图:
CREATE TABLE table_A
(
api_id
bigint(20) NULL COMMENT “API ID”,
event_time
datetime NULL COMMENT “事件时间”,
trace_id
varchar(65533) NULL COMMENT “追踪ID”,
latency
bigint(20) NULL COMMENT “”
) ENGINE=OLAP
DUPLICATE KEY(api_id
, event_time
)
PARTITION BY RANGE (event_time) (PARTITION p20250110 VALUES LESS THAN (“2025-01-10”))
DISTRIBUTED BY HASH (trace_id) – 默认分布策略,根据 trace_id 分布数据
PROPERTIES (
“replication_num” = “2”, – 设置副本数为2
“dynamic_partition.enable” = “true”, – 启用动态分区
“dynamic_partition.time_unit” = “DAY”, – 按天分区
“dynamic_partition.start” = “-2”, – 保留最近2天的数据
“dynamic_partition.end” = “1”, – 当前日期
“dynamic_partition.prefix” = “p” – 分区前缀
);
CREATE MATERIALIZED VIEW view_A
(api_id
, event_hour
, total_count
, avg_latency
)
PARTITION BY (event_hour
)
DISTRIBUTED BY RANDOM
REFRESH ASYNC START(“2025-01-01 00:00:00”) EVERY(INTERVAL 10 SECOND)
PROPERTIES (
“replicated_storage” = “true”,
“partition_ttl” = “30 DAY”,
“replication_num” = “2”
)
AS SELECT api_id
, date_trunc(‘hour’, event_time
) AS event_hour
, count(*) AS total_count
,avg(latency
) AS avg_latency
FROM table_A
GROUP BY api_id
, event_hour
;
【是否存算分离】是
【StarRocks版本】3.2.9
【集群规模】fe(4core 8g)*1 + be(4core 8g)*3
【联系方式】StarRocks 社区群25-小微