【Insert】insert into执行后查询不到数据

  • 问题
    据用户反馈

1.alter table ads.ads_ec_sams_app_result_monitor_di set (‘dynamic_partition.enable’ = ‘false’)
2.alter table ads.ads_ec_sams_app_result_monitor_di add partition if not exists p20230615 values [(‘2023-06-15’), (‘2023-06-16’))
3.truncate table ads.ads_ec_sams_app_result_monitor_di partition(p20230615)
4.insert into ads.ads_ec_sams_app_result_monitor_di
select client_type
, group_id
, group_name
, sub_group_name
, kpi_name
, kpi_value
, day_tb_rate
, week_hb_rate
, day_warning_flag
, week_warning_flag
, new_flag
, etl_load_time
, ‘2023-06-15’ as ts
FROM hive.ads.ads_ec_sams_app_result_monitor_di
where ts = ‘2023-06-15’;
5.alter table ads.ads_ec_sams_app_result_monitor_di set (‘dynamic_partition.enable’ = ‘true’)
执行该语句后,查询不到数据。

  • 判断

当时判断该问题出现可能跟执行语句的顺序有关(都是在同1秒中执行)

  • 措施
  • 1. 执行步骤的调整

1.alter table ads.ads_ec_sams_app_result_monitor_di set (‘dynamic_partition.enable’ = ‘false’)
2.alter table ads.ads_ec_sams_app_result_monitor_di add partition if not exists p20230615 values [(‘2023-06-15’), (‘2023-06-16’))
3.alter table ads.ads_ec_sams_app_result_monitor_di set (‘dynamic_partition.enable’ = ‘true’)
4.truncate table ads.ads_ec_sams_app_result_monitor_di partition(p20230615)
5.insert into ads.ads_ec_sams_app_result_monitor_di select xxxxxxxxx

  • 2. Query Profile打开(为下次出现做校验做准备)

set enable_profile =true;

在Profile中存在“RowsRead”字段,该字段记录当时扫描写入的行数
- RowsRead: 325
- RowsReturned: 325

           - BlockCacheWriteFailCounter: 0
           - BlockCacheWriteTimer: 0ns
         - BytesRead: 0.00 
         - ColumnConvertTime: 62.597us
         - ColumnReadTime: 5.995ms
         - ExprFilterTime: 497ns
         - InputStream: 0
           - AppIOBytesRead: 5.42 KB
           - AppIOCounter: 1
           - AppIOTime: 25.370us
           - FSBytesRead: 0.00 
           - FSIOCounter: 0
           - FSIOTime: 0ns
         - NumDiskAccess: 0
         - ORC: 0
           - DeleteBuildTimer: 0ns
           - DeleteFilesPerScan: 0
         - OpenFile: 4.20us
         - PeakMemoryUsage: 0.00 
         - ReaderInit: 269.351us
         - RowsRead: 325
         - RowsReturned: 325
         - RowsReturnedRate: 50.026K /sec
         - ScanRanges: 1
         - ScanTime: 6.468ms
         - ScannerQueueCounter: 1
         - ScannerQueueTime: 6.492us
         - ScannerThreadsInvoluntaryContextSwitches: 0
         - ScannerThreadsTotalWallClockTime: 0ns
           - MaterializeTupleTime(*): 0ns

以上仅供参考

你这个保存了多少天的动态分区?是不是动态分区超过了20230615?

感觉不是, 好像是catalog查不到hive的表数据, 查的时候出现了orc文件解析异常的报错

集群是哪个版本的,提供一下相关的报错信息或者日志中详细的报错信息

这是3.0的, 使用的是hive catalog