Catalog query fails with "Failed to open the off-heap table scanner"

To help us locate your issue faster, please provide the following information. Thanks.
【Details】Queries against some Hudi tables return an error
【Background】
【Business impact】
【Shared-data (storage-compute separated)?】
【StarRocks version】e.g. 3.1.5
【Cluster size】e.g. 3 FE (1 follower + 2 observers) + 3 BE (FE and BE co-located)
【Machine info】
【Contact】Community group 2 - 巡山的大王大人
【Attachments】
Query SQL:
select d, sum(1) c from ( select date(create_time) d from hudi.fin_basic_data.cffinrepayagentdb_user_auto_repay_launch_rt where _db = 'cffinrepayagentdb' ) t group by d;

Error message
Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0) (10.60.40.138 executor driver): java.sql.SQLSyntaxErrorException: Failed to open the off-heap table scanner. java exception details: java.io.IOException: Failed to open the hudi MOR slice reader.[com.starrocks.hudi.reader.HudiSliceScanner.open(HudiSliceScanner.java:219)]
at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:120)
at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:953)
at com.mysql.cj.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:1009)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.compute(JDBCRDD.scala:358)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:138)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1510)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745) Driver stacktrace:

Hudi table layout - the table is MOR, compression codec gzip.
Table schema as seen from StarRocks:
*************************** 1. row ***************************
Table: cffinrepayagentdb_user_auto_repay_launch_rt
Create Table: CREATE TABLE cffinrepayagentdb_user_auto_repay_launch_rt (
_hoodie_commit_time varchar(1048576) DEFAULT NULL,
_hoodie_commit_seqno varchar(1048576) DEFAULT NULL,
_hoodie_record_key varchar(1048576) DEFAULT NULL,
_hoodie_partition_path varchar(1048576) DEFAULT NULL,
_hoodie_file_name varchar(1048576) DEFAULT NULL,
id bigint(20) DEFAULT NULL,
user_id varchar(1048576) DEFAULT NULL,
batch_no varchar(1048576) DEFAULT NULL,
org_channel varchar(1048576) DEFAULT NULL,
product_no varchar(1048576) DEFAULT NULL,
auto_serial_no varchar(1048576) DEFAULT NULL,
detail_serial_no varchar(1048576) DEFAULT NULL,
req_serial_no varchar(1048576) DEFAULT NULL,
launch_way varchar(1048576) DEFAULT NULL,
status bigint(20) DEFAULT NULL,
roll_order bigint(20) DEFAULT NULL,
asset_type varchar(1048576) DEFAULT NULL,
asset_info varchar(1048576) DEFAULT NULL,
req_user_id varchar(1048576) DEFAULT NULL,
repay_amt decimal128(30, 15) DEFAULT NULL,
withhold_amt decimal128(30, 15) DEFAULT NULL,
repay_succ_amt decimal128(30, 15) DEFAULT NULL,
auto_type varchar(1048576) DEFAULT NULL,
error_code varchar(1048576) DEFAULT NULL,
error_msg varchar(1048576) DEFAULT NULL,
ext varchar(1048576) DEFAULT NULL,
finish_time datetime DEFAULT NULL,
create_time datetime DEFAULT NULL,
update_time datetime DEFAULT NULL,
_binlog_offset bigint(20) DEFAULT NULL,
_event_type varchar(1048576) DEFAULT NULL,
_event_type_number bigint(20) DEFAULT NULL,
_hoodie_is_deleted boolean DEFAULT NULL,
_db varchar(1048576) DEFAULT NULL,
_tbl varchar(1048576) DEFAULT NULL,
_source varchar(1048576) DEFAULT NULL,
_id_mod int(11) DEFAULT NULL
)
PARTITION BY ( _db, _tbl, _id_mod )
PROPERTIES ("location" = "hdfs://ns/user/hive/warehouse/fin_basic_data.db/cffinrepayagentdb_user_auto_repay_launch_rt");
1 row in set (2.26 sec)

Hive SHOW CREATE TABLE

CREATE EXTERNAL TABLE fin_basic_data.cffinrepayagentdb_user_auto_repay_launch_rt (
_hoodie_commit_time string COMMENT ‘’,
_hoodie_commit_seqno string COMMENT ‘’,
_hoodie_record_key string COMMENT ‘’,
_hoodie_partition_path string COMMENT ‘’,
_hoodie_file_name string COMMENT ‘’,
ext string COMMENT ‘’,
roll_order bigint COMMENT ‘’,
batch_no string COMMENT ‘’,
error_msg string COMMENT ‘’,
create_time timestamp COMMENT ‘’,
detail_serial_no string COMMENT ‘’,
org_channel string COMMENT ‘’,
product_no string COMMENT ‘’,
finish_time timestamp COMMENT ‘完成时间’,
auto_type string COMMENT ‘’,
asset_info string COMMENT ‘’,
update_time timestamp COMMENT ‘’,
withhold_amt decimal(30,15) COMMENT ‘’,
launch_way string COMMENT ‘’,
repay_succ_amt decimal(30,15) COMMENT ‘’,
user_id string COMMENT ‘用户id’,
asset_type string COMMENT ‘’,
auto_serial_no string COMMENT ‘’,
error_code string COMMENT ‘’,
repay_amt decimal(30,15) COMMENT ‘’,
id bigint COMMENT ‘’,
req_user_id string COMMENT ‘’,
status bigint COMMENT ‘’,
req_serial_no string COMMENT ‘’,
_binlog_offset bigint COMMENT ‘’,
_event_type string COMMENT ‘’,
_event_type_number bigint COMMENT ‘’,
_hoodie_is_deleted boolean COMMENT ‘’,
_source string COMMENT ‘’)
PARTITIONED BY (
_db string COMMENT ‘’,
_tbl string COMMENT ‘’,
_id_mod int COMMENT ‘’)
ROW FORMAT SERDE
‘org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe’
WITH SERDEPROPERTIES (
‘hoodie.query.as.ro.table’=‘false’,
‘path’=’/user/hive/warehouse/fin_basic_data.db/cffinrepayagentdb_user_auto_repay_launch_rt’)
STORED AS INPUTFORMAT
‘org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat’
OUTPUTFORMAT
‘org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat’
LOCATION
‘hdfs://ns/user/hive/warehouse/fin_basic_data.db/cffinrepayagentdb_user_auto_repay_launch_rt’
TBLPROPERTIES (
‘dp_last_modified_time’=‘1703748231536,1703739175815,1703729888625,1703722338354,1703663106063,1703655731057,1703643604965’,
‘last_commit_time_sync’=‘20231228144423814’,
‘spark.sql.create.version’=‘3.2.0-ctrip-1.0.0’,
‘spark.sql.sources.provider’=‘hudi’,
‘spark.sql.sources.schema.numPartCols’=‘3’,
‘spark.sql.sources.schema.numParts’=‘1’,
‘spark.sql.sources.schema.part.0’=’{“type”:“struct”,“fields”:[{“name”:"_hoodie_commit_time",“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:"_hoodie_commit_seqno",“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:"_hoodie_record_key",“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:"_hoodie_partition_path",“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:"_hoodie_file_name",“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“ext”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“roll_order”,“type”:“long”,“nullable”:true,“metadata”:{}},{“name”:“batch_no”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“error_msg”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“create_time”,“type”:“timestamp”,“nullable”:true,“metadata”:{}},{“name”:“detail_serial_no”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“org_channel”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“product_no”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“finish_time”,“type”:“timestamp”,“nullable”:true,“metadata”:{}},{“name”:“auto_type”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“asset_info”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“update_time”,“type”:“timestamp”,“nullable”:true,“metadata”:{}},{“name”:“withhold_amt”,“type”:“decimal(30,15)”,“nullable”:true,“metadata”:{}},{“name”:“launch_way”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“repay_succ_amt”,“type”:“decimal(30,15)”,“nullable”:true,“metadata”:{}},{“name”:“user_id”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“asset_type”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“auto_serial_no”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“error_code”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“repay_amt”,“type”:“decimal(30,15)”,“nullable”:true,“metadata”:{}},{“name”:“id”,“type”:“long”,“nullable”:true,“metadata”:{}},{“name”:“req_user_id”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:“status”,“type”:“long”,“nullable”:true,“metadata”:{}},{“name”:“req_serial_no”,“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:"_binlog_offset",“type”:“long”,“nullable”:true,“metadata”:{}},{“name”:"_event_type",“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:"_event_type_number",“type”:“long”,“nullable”:true,“metadata”:{}},{“name”:"_hoodie_is_deleted",“type”:“boolean”,“nullable”:true,“metadata”:{}},{“name”:"_source",“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:"_db",“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:"_tbl",“type”:“string”,“nullable”:true,“metadata”:{}},{“name”:"_id_mod",“type”:“integer”,“nullable”:false,“metadata”:{}}]}’,
‘spark.sql.sources.schema.partCol.0’=’_db’,
‘spark.sql.sources.schema.partCol.1’=’_tbl’,
‘spark.sql.sources.schema.partCol.2’=’_id_mod’,
‘transient_lastDdlTime’=‘1651649240’)

Does this affect all Hudi tables, or only some of them?

Only some tables fail.
All of our Hudi tables are created by the same job, so apart from the schemas they are identical (all MOR, all gzip compression).

So the failing tables and the tables that query fine are similar in format and configuration, and only the schemas differ, correct?

Yes, exactly.

Are the queryable ones all non-partitioned tables and the failing ones all partitioned tables?

They are all partitioned tables, with identical partition columns.
So far, every table that fails has decimal(30,15) columns, while every table that queries fine has no decimal columns.

Error in cn.log:
13:38:54.648 [Thread-344] ERROR com.starrocks.hudi.reader.HudiSliceScanner - Failed to open the hudi MOR slice reader.
java.lang.IllegalArgumentException: Error: , expected at the position 130 of ‘string,string,string,string,string,bigint,string,string,string,string,string,string,bigint,string,decimal(30,bigint,bigint,decimal(30,timestamp,timestamp,string,string,string,timestamp,timestamp,string,string,string,timestamp,timestamp,string,string,bigint,bigint,string,bigint,boolean,string,string,string,int’ but ‘(’ is found.
at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:413) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseParams(TypeInfoUtils.java:431) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseType(TypeInfoUtils.java:451) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseTypeInfos(TypeInfoUtils.java:358) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils.getTypeInfosFromTypeString(TypeInfoUtils.java:848) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.getColumnTypes(DataWritableReadSupport.java:88) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:376) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hadoop.hive.ql.io.parquet.ParquetRecordReaderBase.getSplit(ParquetRecordReaderBase.java:84) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:75) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:60) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:93) ~[hive-apache-3.1.2-22.jar:?]
at org.apache.hudi.hadoop.HoodieParquetInputFormat.getRecordReaderInternal(HoodieParquetInputFormat.java:89) ~[hudi-hadoop-mr-0.12.2.jar:0.12.2]
at org.apache.hudi.hadoop.HoodieParquetInputFormat.getRecordReader(HoodieParquetInputFormat.java:83) ~[hudi-hadoop-mr-0.12.2.jar:0.12.2]
at org.apache.hudi.hadoop.realtime.HoodieParquetRealtimeInputFormat.getRecordReader(HoodieParquetRealtimeInputFormat.java:74) ~[hudi-hadoop-mr-0.12.2.jar:0.12.2]
at com.starrocks.hudi.reader.HudiSliceScanner.initReader(HudiSliceScanner.java:197) ~[starrocks-hudi-reader.jar:?]
at com.starrocks.hudi.reader.HudiSliceScanner.open(HudiSliceScanner.java:215) ~[starrocks-hudi-reader.jar:?]

Can you access the Parquet files of a table with decimal columns? You could use a tool to confirm whether the underlying storage is byte array format.

  1. Export a Parquet file to the local file system
    INSERT OVERWRITE LOCAL DIRECTORY '/home/disk1/sr/lwl'
    ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
    STORED AS PARQUET
    SELECT * FROM example_table2;
  2. Install parquet-tools
    pip install parquet-tools
  3. Inspect the metadata with parquet-tools
    parquet-tools inspect 000000_0
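If parquet-tools is not handy, the same check can be done with pyarrow. This is only a sketch: the file name below is the example output file from step 1, and everything else is a placeholder.

    import pyarrow.parquet as pq

    # Exported file from step 1 (placeholder name)
    pf = pq.ParquetFile("000000_0")

    # Full Parquet schema: physical and logical type for every column
    print(pf.schema)

    # Per-column physical type in the first row group,
    # e.g. FIXED_LEN_BYTE_ARRAY for a decimal(30,15) column
    rg = pf.metadata.row_group(0)
    for i in range(rg.num_columns):
        col = rg.column(i)
        print(col.path_in_schema, col.physical_type)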

Same as in the screenshot - the two columns in question are loan_amount and loan_rate.
parquet-tools inspect 00000000-11b5-4c7e-b79a-e4baaad11a30-0_3-19-143_20230130150139824.parquet

############ file meta data ############

created_by: parquet-mr version 1.12.1 (build 2a5c06c58fa987f85aa22170be14d927d5ff6e7d)

num_columns: 41

num_rows: 1500000

num_row_groups: 2

format_version: 1.0

serialized_size: 739398

############ Columns ############

_hoodie_commit_time

_hoodie_commit_seqno

_hoodie_record_key

_hoodie_partition_path

_hoodie_file_name

id

user_id

plat_id

org_channel

product_no

tpp_code

serial_no

loan_type

loan_provide_no

loan_amount

loan_term

status

loan_rate

start_interest_date

end_date

serv_loan_provide_id

trx_id

fund_org_code

request_time

finish_time

backout_code

backout_msg

qunar_trade_no

create_time

update_time

error_code

error_msg

pbc_indicators

_binlog_offset

_event_type

_event_type_number

_hoodie_is_deleted

_db

_tbl

_source

_id_mod

############ Column(_hoodie_commit_time) ############

name: _hoodie_commit_time

path: _hoodie_commit_time

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 29%)

############ Column(_hoodie_commit_seqno) ############

name: _hoodie_commit_seqno

path: _hoodie_commit_seqno

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 86%)

############ Column(_hoodie_record_key) ############

name: _hoodie_record_key

path: _hoodie_record_key

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 73%)

############ Column(_hoodie_partition_path) ############

name: _hoodie_partition_path

path: _hoodie_partition_path

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: -46%)

############ Column(_hoodie_file_name) ############

name: _hoodie_file_name

path: _hoodie_file_name

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: -47%)

############ Column(id) ############

name: id

path: id

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: None

converted_type (legacy): NONE

compression: GZIP (space_saved: 56%)

############ Column(user_id) ############

name: user_id

path: user_id

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 46%)

############ Column(plat_id) ############

name: plat_id

path: plat_id

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 46%)

############ Column(org_channel) ############

name: org_channel

path: org_channel

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 44%)

############ Column(product_no) ############

name: product_no

path: product_no

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: -49%)

############ Column(tpp_code) ############

name: tpp_code

path: tpp_code

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 14%)

############ Column(serial_no) ############

name: serial_no

path: serial_no

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 65%)

############ Column(loan_type) ############

name: loan_type

path: loan_type

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: None

converted_type (legacy): NONE

compression: GZIP (space_saved: -49%)

############ Column(loan_provide_no) ############

name: loan_provide_no

path: loan_provide_no

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 68%)

############ Column(loan_amount) ############

name: loan_amount

path: loan_amount

max_definition_level: 1

max_repetition_level: 0

physical_type: FIXED_LEN_BYTE_ARRAY

logical_type: Decimal(precision=30, scale=15)

converted_type (legacy): DECIMAL

compression: GZIP (space_saved: 78%)

############ Column(loan_term) ############

name: loan_term

path: loan_term

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: None

converted_type (legacy): NONE

compression: GZIP (space_saved: 39%)

############ Column(status) ############

name: status

path: status

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: None

converted_type (legacy): NONE

compression: GZIP (space_saved: 42%)

############ Column(loan_rate) ############

name: loan_rate

path: loan_rate

max_definition_level: 1

max_repetition_level: 0

physical_type: FIXED_LEN_BYTE_ARRAY

logical_type: Decimal(precision=30, scale=15)

converted_type (legacy): DECIMAL

compression: GZIP (space_saved: 93%)

############ Column(start_interest_date) ############

name: start_interest_date

path: start_interest_date

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: Timestamp(isAdjustedToUTC=true, timeUnit=microseconds, is_from_converted_type=false, force_set_converted_type=false)

converted_type (legacy): TIMESTAMP_MICROS

compression: GZIP (space_saved: 2%)

############ Column(end_date) ############

name: end_date

path: end_date

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: Timestamp(isAdjustedToUTC=true, timeUnit=microseconds, is_from_converted_type=false, force_set_converted_type=false)

converted_type (legacy): TIMESTAMP_MICROS

compression: GZIP (space_saved: 7%)

############ Column(serv_loan_provide_id) ############

name: serv_loan_provide_id

path: serv_loan_provide_id

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 68%)

############ Column(trx_id) ############

name: trx_id

path: trx_id

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 67%)

############ Column(fund_org_code) ############

name: fund_org_code

path: fund_org_code

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: -49%)

############ Column(request_time) ############

name: request_time

path: request_time

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: Timestamp(isAdjustedToUTC=true, timeUnit=microseconds, is_from_converted_type=false, force_set_converted_type=false)

converted_type (legacy): TIMESTAMP_MICROS

compression: GZIP (space_saved: 28%)

############ Column(finish_time) ############

name: finish_time

path: finish_time

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: Timestamp(isAdjustedToUTC=true, timeUnit=microseconds, is_from_converted_type=false, force_set_converted_type=false)

converted_type (legacy): TIMESTAMP_MICROS

compression: GZIP (space_saved: 28%)

############ Column(backout_code) ############

name: backout_code

path: backout_code

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: -49%)

############ Column(backout_msg) ############

name: backout_msg

path: backout_msg

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: -49%)

############ Column(qunar_trade_no) ############

name: qunar_trade_no

path: qunar_trade_no

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 67%)

############ Column(create_time) ############

name: create_time

path: create_time

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: Timestamp(isAdjustedToUTC=true, timeUnit=microseconds, is_from_converted_type=false, force_set_converted_type=false)

converted_type (legacy): TIMESTAMP_MICROS

compression: GZIP (space_saved: 28%)

############ Column(update_time) ############

name: update_time

path: update_time

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: Timestamp(isAdjustedToUTC=true, timeUnit=microseconds, is_from_converted_type=false, force_set_converted_type=false)

converted_type (legacy): TIMESTAMP_MICROS

compression: GZIP (space_saved: 28%)

############ Column(error_code) ############

name: error_code

path: error_code

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 52%)

############ Column(error_msg) ############

name: error_msg

path: error_msg

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: 57%)

############ Column(pbc_indicators) ############

name: pbc_indicators

path: pbc_indicators

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: None

converted_type (legacy): NONE

compression: GZIP (space_saved: -49%)

############ Column(_binlog_offset) ############

name: _binlog_offset

path: _binlog_offset

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: None

converted_type (legacy): NONE

compression: GZIP (space_saved: 0%)

############ Column(_event_type) ############

name: _event_type

path: _event_type

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: -49%)

############ Column(_event_type_number) ############

name: _event_type_number

path: _event_type_number

max_definition_level: 1

max_repetition_level: 0

physical_type: INT64

logical_type: None

converted_type (legacy): NONE

compression: GZIP (space_saved: -49%)

############ Column(_hoodie_is_deleted) ############

name: _hoodie_is_deleted

path: _hoodie_is_deleted

max_definition_level: 1

max_repetition_level: 0

physical_type: BOOLEAN

logical_type: None

converted_type (legacy): NONE

compression: GZIP (space_saved: 97%)

############ Column(_db) ############

name: _db

path: _db

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: -49%)

############ Column(_tbl) ############

name: _tbl

path: _tbl

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: -49%)

############ Column(_source) ############

name: _source

path: _source

max_definition_level: 1

max_repetition_level: 0

physical_type: BYTE_ARRAY

logical_type: String

converted_type (legacy): UTF8

compression: GZIP (space_saved: -49%)

############ Column(_id_mod) ############

name: _id_mod

path: _id_mod

max_definition_level: 0

max_repetition_level: 0

physical_type: INT32

logical_type: None

converted_type (legacy): NONE

compression: GZIP (space_saved: -69%)

Have you tried selecting individual columns instead of select *, to see which column triggers the error?
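If it helps to automate that, here is a rough sketch that probes each column one at a time over the MySQL protocol. Assumptions: Python with pymysql, placeholder connection settings and table name, and that DESC works on the fully qualified external table name.

    import pymysql

    # Placeholder FE connection settings - adjust to your cluster
    conn = pymysql.connect(host="127.0.0.1", port=9030, user="root", password="")
    table = "hudi.fin_basic_data.cffincashloandb_cash_loan_confirm_rt"  # example table

    with conn.cursor() as cur:
        cur.execute(f"DESC {table}")
        columns = [row[0] for row in cur.fetchall()]
        for col in columns:
            try:
                cur.execute(f"SELECT `{col}` FROM {table} LIMIT 1")
                cur.fetchall()
                print(f"OK    {col}")
            except Exception as e:
                print(f"FAIL  {col}: {e}")
    conn.close()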

The situation is a bit messy now.
For the Hudi tables that error out, I queried the ones with decimal columns as you suggested:
1. For some tables, both select * and selecting a single column work fine.


2. For some tables, selecting a single column occasionally fails but succeeds on retry, while select * never succeeded in a dozen-plus attempts (not sure whether this is related to that table having two adjacent decimal columns).

Schema for case 1:
CREATE EXTERNAL TABLE cffinrepayagentdb_user_auto_repay_launch_rt(
_hoodie_commit_time string COMMENT ‘’,
_hoodie_commit_seqno string COMMENT ‘’,
_hoodie_record_key string COMMENT ‘’,
_hoodie_partition_path string COMMENT ‘’,
_hoodie_file_name string COMMENT ‘’,
ext string COMMENT ‘’,
roll_order bigint COMMENT ‘’,
batch_no string COMMENT ‘’,
error_msg string COMMENT ‘’,
create_time timestamp COMMENT ‘’,
detail_serial_no string COMMENT ‘’,
org_channel string COMMENT ‘’,
product_no string COMMENT ‘’,
finish_time timestamp COMMENT ‘’,
auto_type string COMMENT ‘’,
asset_info string COMMENT ‘’,
update_time timestamp COMMENT ‘’,
withhold_amt decimal(30,15) COMMENT ‘’,
launch_way string COMMENT ‘’,
repay_succ_amt decimal(30,15) COMMENT ‘’,
user_id string COMMENT ‘’,
asset_type string COMMENT ‘’,
auto_serial_no string COMMENT ‘’,
error_code string COMMENT ‘’,
repay_amt decimal(30,15) COMMENT ‘’,
id bigint COMMENT ‘’,
req_user_id string COMMENT ‘’,
status bigint COMMENT ‘’,
req_serial_no string COMMENT ‘’,
_binlog_offset bigint COMMENT ‘’,
_event_type string COMMENT ‘’,
_event_type_number bigint COMMENT ‘’,
_hoodie_is_deleted boolean COMMENT ‘’,
_source string COMMENT ‘’)
PARTITIONED BY (
_db string COMMENT ‘’,
_tbl string COMMENT ‘’,
_id_mod int COMMENT ‘’)
Schema for case 2:
CREATE EXTERNAL TABLE cffincashloandb_cash_loan_confirm_rt(
_hoodie_commit_time string COMMENT ‘’,
_hoodie_commit_seqno string COMMENT ‘’,
_hoodie_record_key string COMMENT ‘’,
_hoodie_partition_path string COMMENT ‘’,
_hoodie_file_name string COMMENT ‘’,
end_date timestamp COMMENT ‘’,
loan_provide_no string COMMENT ‘’,
backout_code string COMMENT ‘’,
loan_type bigint COMMENT ‘’,
update_time timestamp COMMENT ‘’,
tpp_code string COMMENT ‘’,
request_time timestamp COMMENT ‘’,
id bigint COMMENT ‘’,
serv_loan_provide_id string COMMENT ‘’,
error_msg string COMMENT ‘’,
qunar_trade_no string COMMENT ‘’,
create_time timestamp COMMENT ‘’,
org_channel string COMMENT ‘’,
loan_amount decimal(30,15) COMMENT ‘’,
loan_rate decimal(30,15) COMMENT ‘’,
product_no string COMMENT ‘’,
finish_time timestamp COMMENT ‘’,
start_interest_date timestamp COMMENT ‘’,
trx_id string COMMENT ‘’,
user_id string COMMENT ‘’,
backout_msg string COMMENT ‘’,
fund_org_code string COMMENT ‘’,
loan_term bigint COMMENT ‘’,
pbc_indicators bigint COMMENT ‘’,
plat_id string COMMENT ‘’,
error_code string COMMENT ‘’,
serial_no string COMMENT ‘’,
status bigint COMMENT ‘’,
_binlog_offset bigint COMMENT ‘’,
_event_type string COMMENT ‘’,
_event_type_number bigint COMMENT ‘’,
_hoodie_is_deleted boolean COMMENT ‘’,
_source string COMMENT ‘’)
PARTITIONED BY (
_db string COMMENT ‘’,
_tbl string COMMENT ‘’,
_id_mod int COMMENT ‘’)
ROW FORMAT SERDE

Fixed in https://github.com/StarRocks/starrocks/pull/38212/files

The root cause is that types of the form decimal(x, y) were not parsed correctly: the comma inside decimal(30,15) breaks the comma-separated column-type string apart, which matches the malformed type list ("decimal(30" fragments) in the cn.log error above. This form is handled correctly on 3.2 and main.
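For intuition, a minimal Python illustration (not the actual StarRocks/JNI code) of the pitfall: splitting a comma-joined type list on bare commas corrupts decimal(p,s), producing exactly the kind of "decimal(30" fragment seen in the cn.log type string, while a parenthesis-aware split keeps it intact.

    # Illustration only: why a plain comma split corrupts decimal(p,s) type lists.
    types = ["string", "decimal(30,15)", "bigint"]

    joined = ",".join(types)
    print(joined.split(","))
    # ['string', 'decimal(30', '15)', 'bigint']  <- 'decimal(30' fragment, list misaligned

    # A parenthesis-aware split keeps decimal(30,15) as one entry.
    def split_top_level(s):
        parts, depth, cur = [], 0, []
        for ch in s:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            if ch == "," and depth == 0:
                parts.append("".join(cur))
                cur = []
            else:
                cur.append(ch)
        parts.append("".join(cur))
        return parts

    print(split_top_level(joined))
    # ['string', 'decimal(30,15)', 'bigint']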

OK, I'll upgrade today. Thanks for your support!