Iceberg table join query makes the EC2 instances of all BE nodes unavailable

fe.audit.log

{
    "Timestamp": "1737877769532",
    "Client": "10.60.22.0:45684",
    "User": "root",
    "AuthorizedUser": "'root'@'%'",
    "ResourceGroup": "default_wg",
    "Catalog": "unified_catalog",
    "Db": "",
    "State": "EOF",
    "ErrorCode": "",
    "Time": "4076",
    "ScanBytes": "17020",
    "ScanRows": "0",
    "ReturnRows": "0",
    "CpuCostNs": "15565590",
    "MemCostBytes": "81460432",
    "StmtId": "4",
    "QueryId": "12aa3836-dbba-11ef-8af1-06ae1418ac3a",
    "IsQuery": "true",
    "feIp": "10.60.22.0",
    "Stmt": "with tmp as( select * from us_dim.dim_user_register_info where user_id ='dddd' ), tmp1 as(select * from tmp) select * from tmp1 t1 left join ice.iceberg_db.iceberg_lineitem_parquet_zstd_multi_v2 t2 on t1.uuid=t2.id and t1.user_id=t2.l_shipdate limit 10",
    "Digest": "",
    "PlanCpuCost": "1.6531322052E12",
    "PlanMemCost": "896.0",
    "PendingTimeMs": "43",
    "BigQueryLogCPUSecondThreshold": "480",
    "BigQueryLogScanBytesThreshold": "10737418240",
    "BigQueryLogScanRowsThreshold": "1000000000",
    "Warehouse": "default_warehouse",
    "CandidateMVs": null,
    "HitMvs": null,
    "IsForwardToLeader": "false"
}

fe.internal.log

2025-01-26 07:49:32.681Z INFO (stats-cache-refresher-6|115) [StatisticExecutor.executeDQL():486] statistic execute query | QueryId [1487f8f9-dbba-11ef-8af1-06ae1418ac3a] | SQL: SELECT cast(7 as INT), column_name, cast(json_object("buckets", buckets, "mcv", mcv) as varchar) FROM external_histogram_statistics WHERE table_uuid = 'unified_catalog.us_dim.dim_user_register_info.1717579531' and column_name in ('channel_3_en', 'have_nick_name', 'channel_3', 'register_os', 'channel_4', 'source_channel', 'have_personal_signature', 'channel_1', 'channel_2', 'register_type', 'app_channel', 'channel_1_en', 'user_id', 'invite_source', 'mktb_id', 'channel_4_en', 'register_time', 'register_account_status', 'channel_2_en', 'uuid', 'update_time', 'register_address', 'register_device_id', 'register_platform', 'have_set_picture', 'final_channel', 'invite_user_id', 'register_version')

2025-01-26 07:49:32.897Z INFO (stats-cache-refresher-0|109) [StatisticExecutor.executeDQL():486] statistic execute query | QueryId [14a9d8df-dbba-11ef-8af1-06ae1418ac3a] | SQL: SELECT cast(7 as INT), column_name, cast(json_object("buckets", buckets, "mcv", mcv) as varchar) FROM external_histogram_statistics WHERE table_uuid = 'ice.iceberg_db.iceberg_lineitem_parquet_zstd_multi_v2.c9d1437b-2766-4e58-a1b7-863257f17e3e' and column_name in ('ext22', 'ext21', 'l_linestatus', 'ext24', 'ext23', 'ext20', 'ext29', 'ext26', 'ext25', 'ext28', 'ext27', 'l_returnflag', 'ext11', 'ext10', 'l_comment', 'ext13', 'ext12', 'l_extendedprice', 'l_quantity', 'ext50', 'ext19', 'ext18', 'ext15', 'ext14', 'ext17', 'ext16', 'l_suppkey', 'l_orderkey', 'l_linenumber', 'ext44', 'id', 'ext43', 'ext46', 'ext45', 'ext42', 'ext41', 'ext48', 'ext47', 'ext49', 'ext9', 'l_partkey', 'ext6', 'ext5', 'ext8', 'ext7', 'ext2', 'ext1', 'l_commitdate', 'ext4', 'ext3', 'l_discount', 'l_tax', 'ext30', 'l_shipmode', 'l_shipdate', 'l_receiptdate', 'l_shipinstruct')

2025-01-26 07:49:32.910Z INFO (stats-cache-refresher-8|117) [StatisticExecutor.executeDQL():486] statistic execute query | QueryId [14a9158a-dbba-11ef-8af1-06ae1418ac3a] | SQL: SELECT cast(6 as INT), column_name, sum(row_count),cast(sum(data_size) as bigint), hll_union_agg(ndv), sum(null_count),  cast(max(cast(max as string)) as string), cast(min(cast(min as string)) as string) FROM external_column_statistics WHERE table_uuid = "unified_catalog.us_dim.dim_user_register_info.1717579531" and column_name in ("channel_3_en", "have_nick_name", "channel_3", "register_os", "channel_4", "source_channel", "have_personal_signature", "channel_1", "channel_2", "register_type", "app_channel", "channel_1_en", "user_id", "invite_source", "mktb_id", "channel_4_en", "register_time", "register_account_status", "channel_2_en", "uuid", "update_time", "register_address", "register_device_id", "register_platform", "have_set_picture", "final_channel", "invite_user_id", "register_version") GROUP BY table_uuid, column_name

2025-01-26 07:49:32.931Z INFO (stats-cache-refresher-3|112) [StatisticExecutor.executeDQL():486] statistic execute query | QueryId [14aa4e10-dbba-11ef-8af1-06ae1418ac3a] | SQL: SELECT cast(6 as INT), column_name, sum(row_count),cast(sum(data_size) as bigint), hll_union_agg(ndv), sum(null_count),  cast(max(cast(max as string)) as string), cast(min(cast(min as string)) as string) FROM external_column_statistics WHERE table_uuid = "ice.iceberg_db.iceberg_lineitem_parquet_zstd_multi_v2.c9d1437b-2766-4e58-a1b7-863257f17e3e" and column_name in ("ext22", "ext21", "l_linestatus", "ext24", "ext23", "ext20", "ext29", "ext26", "ext25", "ext28", "ext27", "l_returnflag", "ext11", "ext10", "l_comment", "ext13", "ext12", "ext50", "ext19", "ext18", "ext15", "ext14", "ext17", "ext16", "l_orderkey", "ext44", "id", "ext43", "ext46", "ext45", "ext42", "ext41", "ext48", "ext47", "ext49", "ext9", "ext6", "ext5", "ext8", "ext7", "ext2", "ext1", "l_commitdate", "ext4", "ext3", "ext30", "l_shipmode", "l_shipdate", "l_receiptdate", "l_shipinstruct") GROUP BY table_uuid, column_name UNION ALL SELECT cast(6 as INT), column_name, sum(row_count), cast(sum(data_size) as bigint), hll_union_agg(ndv), sum(null_count),  cast(max(cast(max as double)) as string), cast(min(cast(min as double)) as string) FROM external_column_statistics WHERE table_uuid = "ice.iceberg_db.iceberg_lineitem_parquet_zstd_multi_v2.c9d1437b-2766-4e58-a1b7-863257f17e3e" and column_name in ("l_extendedprice", "l_quantity", "l_discount", "l_tax") GROUP BY table_uuid, column_name UNION ALL SELECT cast(6 as INT), column_name, sum(row_count), cast(sum(data_size) as bigint), hll_union_agg(ndv), sum(null_count),  cast(max(cast(max as bigint)) as string), cast(min(cast(min as bigint)) as string) FROM external_column_statistics WHERE table_uuid = "ice.iceberg_db.iceberg_lineitem_parquet_zstd_multi_v2.c9d1437b-2766-4e58-a1b7-863257f17e3e" and column_name in ("l_suppkey", "l_linenumber", "l_partkey") GROUP BY table_uuid, column_name

These entries are statistics-collection queries and are normal. Turn on vlog directly and look at the logs.

sys_log_verbose_level
  • Default: 10
  • Type: Int
  • Unit: -
  • Dynamic: No
  • Description: Verbosity level of the logs; controls the output of VLOG-prefixed log statements in the code.
  • Introduced in: -

    mysql> admin execute on 10002 'VLogCntl.getInstance().enable("pipeline*")';
    Compile error: Error at 'VLogCntl': Variable is used but not defined. at main:1

How should this configuration be enabled? I tried, but could not find any vlog-related entries in the BE logs.

Add the following two lines to the BE configuration file and then restart the BE:

sys_log_verbose_level=10
sys_log_verbose_modules=*
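
A rough sketch of doing that on each BE node, assuming a tarball deployment under /opt/starrocks/be (paths and scripts may differ in your environment):

# append the verbose-logging settings and restart the BE
echo "sys_log_verbose_level = 10" >> /opt/starrocks/be/conf/be.conf
echo "sys_log_verbose_modules = *" >> /opt/starrocks/be/conf/be.conf
/opt/starrocks/be/bin/stop_be.sh
/opt/starrocks/be/bin/start_be.sh --daemon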

tail -n 3 conf/be.conf

sys_log_verbose_level = 10
sys_log_verbose_modules = *

Only the following log entries were found:
grep -i vlog *
be.out:I00000000 00:00:00.000000 264273531170880 vlog_is_on.cc:197] RAW: Set VLOG level for "*" to 10
be.out:I00000000 00:00:00.000000 263411079770176 vlog_is_on.cc:197] RAW: Set VLOG level for "*" to 10
be.out:I00000000 00:00:00.000000 278018418929728 vlog_is_on.cc:197] RAW: Set VLOG level for "*" to 10
be.out:I00000000 00:00:00.000000 276672607940672 vlog_is_on.cc:197] RAW: Set VLOG level for "*" to 10

Look at the logs newly written to be.INFO after the query finishes. You can clear the logs first, run the query once to reproduce the issue, and then send the corresponding be.INFO together with the query start and end times.
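
A rough sketch of one way to capture that, assuming the BE logs live under /opt/starrocks/be/log and the FE is reachable via the mysql client on port 9030 (paths, host, and the query text are placeholders):

cd /opt/starrocks/be/log
: > be.INFO                          # truncate the current log in place
date '+%F %T'                        # record the query start time
mysql -h <fe_host> -P 9030 -uroot -e "<the join query from the audit log above>"
date '+%F %T'                        # record the query end time
zip be.INFO.$(date +%Y%m%d-%H%M%S).zip be.INFO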

Auto-start of the StarRocks services was disabled before testing, so once a machine goes down and is brought back up, no new logs are produced. There are three machines in total.

be.INFO.log.20250207-093021.zip (37.7 MB)
be.INFO.log.20250207-091732.zip (37.5 MB)
be.INFO.log.20250207-091302.zip (37.0 MB)

Try setting set global enable_exchange_pass_through=false and run it again.
If the CPU is still climbing after the query has finished, run admin execute on 10004 'System.print(ExecEnv.get_stack_trace_for_all_threads())'; to dump a stack trace, replacing 10004 with the ID of the problematic BE.
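
A rough sketch of doing that from a shell, assuming the BE process is named starrocks_be and the FE is reachable on port 9030 (host, port, and the BE id 10004 are placeholders to adjust):

top -b -d 5 -p $(pgrep -f starrocks_be | head -1)    # watch BE CPU after the query ends
mysql -h <fe_host> -P 9030 -uroot -e "admin execute on 10004 'System.print(ExecEnv.get_stack_trace_for_all_threads())'" > stack.txt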

set enable_exchange_pass_through = false;

After setting this parameter the behavior is different from before: once the query ends, memory usage stops dropping any further, and the other metrics have not recovered either.

This lasted roughly 7 minutes until everything returned to normal, and the machine did not go down.

stack.txt (47.9 KB)

There appear to be two problems here:

  1. A memory accounting problem with passthrough: the memory limit is never triggered, buffer memory piles up, and the machine goes down.
  2. The right table in the query is empty, so the join returns immediately and the request finishes early. However, no cancel is sent to the left-table side, so the left table keeps scanning data into the buffer and the CPU keeps working. Combined with the accounting problem in 1, this makes memory climb continuously.

There is already a bugfix PR for problem 2: https://github.com/StarRocks/starrocks/pull/49519, but it has not yet been merged into the 3.3 branch. You could apply a patch based on it yourselves; if that is difficult, I can merge it into 3.3 and you can simply upgrade.
In theory, problem 1 becomes very unlikely once problem 2 is fixed, but to be safe you can also set enable_exchange_pass_through to false.
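
If you go the patch route, a rough sketch of pulling that PR into a local 3.3-based source tree (branch names are placeholders; conflicts may need manual resolution since the PR targets the main branch):

git fetch origin pull/49519/head:pr-49519     # fetch the PR branch from GitHub
git checkout -b my-3.3-with-fix origin/branch-3.3
git cherry-pick pr-49519                      # applies the PR's tip commit; a multi-commit PR needs the full range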

For problem 1, after setting the parameter, both the x86 and ARM clusters avoid machines going down because of rising memory.
For problem 2, we can tolerate the abnormal CPU usage, and auto-scaling mitigates it to some extent, but the machines must not go down. When the problem occurred before, the machines brought up by auto-scaling were also affected and all went down.

All things considered, we have decided to set the parameter for now to work around the machine-crash problem.
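
For the record, a rough sketch of applying the workaround cluster-wide and verifying it (FE host and port are placeholders):

mysql -h <fe_host> -P 9030 -uroot \
  -e "SET GLOBAL enable_exchange_pass_through = false;
      SHOW VARIABLES LIKE 'enable_exchange_pass_through';"   # GLOBAL so new sessions also pick it up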

Finally, thanks a lot for the help!