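To help us locate your problem faster, please provide the following information. Thank you.
【Details】Detailed description of the problem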
792d95ab-402a-11f0-ad85-02427eb9344e.be.WARNING (3.9 KB)
W20250603 11:14:47.914017 140654652081728 thrift_rpc_helper.cpp:129] Rpc error: FE RPC failure, address=TNetworkAddress(hostname=saas-node13, port=9020), reason=No more data to read.
W20250603 11:19:27.386326 140653730014784 storage_engine.cpp:1124] tablet not found, remove rowset meta, rowset_id=020000000058840595473fae3f614e7527f969ab5c9199ba tablet: 291734875
W20250603 11:19:28.627028 140653730014784 storage_engine.cpp:1124] tablet not found, remove rowset meta, rowset_id=0200000000588e2f95473fae3f614e7527f969ab5c9199ba tablet: 291736991
W20250603 11:19:28.727757 140653730014784 storage_engine.cpp:1144] traverse_rowset_meta and remove 106/7003 invalid rowset metas, path:/dbdata1 duration:1342ms
W20250603 11:27:26.763230 140653730014784 storage_engine.cpp:1124] tablet not found, remove rowset meta, rowset_id=020000000058843d95473fae3f614e7527f969ab5c9199ba tablet: 291734974
W20250603 11:27:27.824040 140653730014784 storage_engine.cpp:1124] tablet not found, remove rowset meta, rowset_id=0200000000588ae995473fae3f614e7527f969ab5c9199ba tablet: 291736013
W20250603 11:27:28.466870 140653730014784 storage_engine.cpp:1144] traverse_rowset_meta and remove 2263/7358 invalid rowset metas, path:/dbdata1 duration:1894ms
W20250603 11:27:49.041565 140653738407488 rowset.cpp:331] Fail to delete /dbdata1/data/522/291735356/1369509696/02000000005885b295473fae3f614e7527f969ab5c9199ba_0.dat: Not found: /dbdata1/data/522/291735356/1369509696/02000000005885b295473fae3f614e7527f969ab5c9199ba_0.dat: No such file or directory
W20250603 11:27:49.041616 140653738407488 rowset.cpp:331] Fail to delete /dbdata1/data/522/291735356/1369509696/02000000005885b295473fae3f614e7527f969ab5c9199ba_1.dat: Not found: /dbdata1/data/522/291735356/1369509696/02000000005885b295473fae3f614e7527f969ab5c9199ba_1.dat: No such file or directory
W20250603 11:27:49.041638 140653738407488 rowset.cpp:331] Fail to delete /dbdata1/data/522/291735356/1369509696/02000000005885b295473fae3f614e7527f969ab5c9199ba_2.dat: Not found: /dbdata1/data/522/291735356/1369509696/02000000005885b295473fae3f614e7527f969ab5c9199ba_2.dat: No such file or directory
W20250603 11:27:49.041660 140653738407488 rowset.cpp:331] Fail to delete /dbdata1/data/522/291735356/1369509696/02000000005885b295473fae3f614e7527f969ab5c9199ba_3.dat: Not found: /dbdata1/data/522/291735356/1369509696/02000000005885b295473fae3f614e7527f969ab5c9199ba_3.dat: No such file or directory
W20250603 11:28:13.166536 140655138858560 olap_scan_node.cpp:751] failed to get tablet. tablet_id=291250530, with schema_hash=2129885200, reason=tablet does not exist
W20250603 11:28:13.167972 140655138858560 pipeline_driver.cpp:76] query_id=792d95ab-402a-11f0-ad85-02427eb9344e fragment_id=792d95ab-402a-11f0-ad85-02427eb9345f driver=driver_37_0, status=NOT_READY, operator-chain: [olap_scan_prepare_37_0x7fed2530f710(O) -> noop_sink_37_0x7fed2530f990(O)] prepare failed
W20250603 11:28:13.169005 140655138858560 internal_service.cpp:306] exec plan fragment failed, errmsg=failed to get tablet. tablet_id=291250530, with schema_hash=2129885200, reason=tablet does not exist
W20250603 11:28:13.182862 140655113680448 fragment_context.cpp:191] [Driver] Canceled, query_id=792d95ab-402a-11f0-ad85-02427eb9344e, instance_id=792d95ab-402a-11f0-ad85-02427eb9345a, reason=InternalError
W20250603 11:28:13.182906 140655113680448 fragment_context.cpp:191] [Driver] Canceled, query_id=792d95ab-402a-11f0-ad85-02427eb9344e, instance_id=792d95ab-402a-11f0-ad85-02427eb93461, reason=InternalError
W20250603 11:28:13.182919 140655113680448 fragment_context.cpp:191] [Driver] Canceled, query_id=792d95ab-402a-11f0-ad85-02427eb9344e, instance_id=792d95ab-402a-11f0-ad85-02427eb93454, reason=InternalError
W20250603 11:28:13.182929 140655113680448 fragment_context.cpp:191] [Driver] Canceled, query_id=792d95ab-402a-11f0-ad85-02427eb9344e, instance_id=792d95ab-402a-11f0-ad85-02427eb9346f, reason=InternalError
W20250603 11:28:13.182940 140655113680448 fragment_context.cpp:191] [Driver] Canceled, query_id=792d95ab-402a-11f0-ad85-02427eb9344e, instance_id=792d95ab-402a-11f0-ad85-02427eb93450, reason=InternalError
W20250603 11:28:13.182949 140655113680448 fragment_context.cpp:191] [Driver] Canceled, query_id=792d95ab-402a-11f0-ad85-02427eb9344e, instance_id=792d95ab-402a-11f0-ad85-02427eb93455, reason=InternalError
W20250603 11:28:13.182958 140655113680448 fragment_context.cpp:191] [Driver] Canceled, query_id=792d95ab-402a-11f0-ad85-02427eb9344e, instance_id=792d95ab-402a-11f0-ad85-02427eb9346a, reason=InternalError
W20250603 11:28:13.183027 140655113680448 fragment_context.cpp:191] [Driver] Canceled, query_id=792d95ab-402a-11f0-ad85-02427eb9344e, instance_id=792d95ab-402a-11f0-ad85-02427eb93478, reason=InternalError
W20250603 11:28:13.183095 140654971004480 pipeline_driver.cpp:567] cancel pipeline driver error [driver=query_id=792d95ab-402a-11f0-ad85-02427eb9344e fragment_id=792d95ab-402a-11f0-ad85-02427eb93450 driver=driver_52_2, status=INPUT_EMPTY, operator-chain: [spillable_aggregate_blocking_source_52_0x7fed88074010(X) -> project_53_0x7fed78738c10(X) -> olap_table_sink_-1_0x7fed7875bd10(X)]]: Cancelled: Cancelled by pipeline engine
W20250603 11:28:13.183126 140654971004480 tablet_sink_sender.cpp:250] close channel failed. channel_name=NodeChannel[10003], load_info=load_id=792d95ab-402a-11f0-ad85-02427eb9344e, txn_id: 33726165, parallel=1, compress_type=2, error_msg=Cancelled by pipeline engine
W20250603 11:28:13.183132 140654971004480 tablet_sink_sender.cpp:250] close channel failed. channel_name=NodeChannel[10002], load_info=load_id=792d95ab-402a-11f0-ad85-02427eb9344e, txn_id: 33726165, parallel=1, compress_type=2, error_msg=Cancelled by pipeline engine
W20250603 11:28:13.183145 140654971004480 tablet_sink_sender.cpp:250] close channel failed. channel_name=NodeChannel[10001], load_info=load_id=792d95ab-402a-11f0-ad85-02427eb9344e, txn_id: 33726165, parallel=1, compress_type=2, error_msg=Cancelled by pipeline engine
W20250603 11:28:13.190496 140654937433664 tablet_sink_sender.cpp:299] close channel failed. channel_name=NodeChannel[10003], load_info=load_id=792d95ab-402a-11f0-ad85-02427eb9344e, txn_id: 33726165, parallel=1, compress_type=2, error_msg=Cancelled by pipeline engine
W20250603 11:28:13.190542 140654937433664 tablet_sink_sender.cpp:299] close channel failed. channel_name=NodeChannel[10002], load_info=load_id=792d95ab-402a-11f0-ad85-02427eb9344e, txn_id: 33726165, parallel=1, compress_type=2, error_msg=Cancelled by pipeline engine
W20250603 11:28:13.190552 140654937433664 tablet_sink_sender.cpp:299] close channel failed. channel_name=NodeChannel[10001], load_info=load_id=792d95ab-402a-11f0-ad85-02427eb9344e, txn_id: 33726165, parallel=1, compress_type=2, error_msg=Cancelled by pipeline engine
W20250603 11:28:36.728615 140655113680448 olap_scan_node.cpp:751] failed to get tablet. tablet_id=291167893, with schema_hash=2129885200, reason=tablet does not exist
W20250603 11:28:36.728666 140655113680448 pipeline_driver.cpp:76] query_id=0ca3c9ee-402a-11f0-a104-0242f591ea1f fragment_id=0ca3c9ee-402a-11f0-a104-0242f591ea30 driver=driver_43_0, status=NOT_READY, operator-chain: [olap_scan_prepare_43_0x7fec05d07610(O) -> noop_sink_43_0x7fec05d07d90(O)] prepare failed
W20250603 11:28:36.729298 140655113680448 internal_service.cpp:306] exec plan fragment failed, errmsg=failed to get tablet. tablet_id=291167893, with schema_hash=2129885200, reason=tablet does not exist
W20250603 11:28:36.740346 140655122073152 fragment_context.cpp:191] [Driver] Canceled, query_id=0ca3c9ee-402a-11f0-a104-0242f591ea1f, instance_id=0ca3c9ee-402a-11f0-a104-0242f591ea2d, reason=InternalError
W20250603 11:28:36.740382 140655122073152 fragment_context.cpp:191] [Driver] Canceled, query_id=0ca3c9ee-402a-11f0-a104-0242f591ea1f, instance_id=0ca3c9ee-402a-11f0-a104-0242f591ea29, reason=InternalError
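To see at a glance which tablets the scans could not find, the warnings above can be grouped by tablet_id. The following is only a minimal sketch: the regular expressions are inferred from the log lines in this post (they are not a documented StarRocks log format), and `missing_tablets` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# glog-style prefix as it appears above:
# severity letter, yyyymmdd, time, thread id, source file:line, then the message.
GLOG_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{8}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6}) "
    r"(?P<tid>\d+) (?P<src>[\w.]+:\d+)\] (?P<msg>.*)$"
)
# Message wording copied from the warnings in this post (assumed stable).
MISSING_RE = re.compile(r"failed to get tablet\. tablet_id=(\d+)")


def missing_tablets(lines):
    """Map tablet_id -> list of (time, source) for 'failed to get tablet' warnings."""
    hits = defaultdict(list)
    for line in lines:
        m = GLOG_RE.match(line.strip())
        if not m:
            continue  # not a glog line, e.g. a wrapped continuation
        t = MISSING_RE.search(m.group("msg"))
        if t:
            hits[int(t.group(1))].append((m.group("time"), m.group("src")))
    return dict(hits)


sample = [
    "W20250603 11:28:13.166536 140655138858560 olap_scan_node.cpp:751] failed to get tablet. tablet_id=291250530, with schema_hash=2129885200, reason=tablet does not exist",
    "W20250603 11:28:13.169005 140655138858560 internal_service.cpp:306] exec plan fragment failed, errmsg=failed to get tablet. tablet_id=291250530, with schema_hash=2129885200, reason=tablet does not exist",
    "W20250603 11:28:36.728615 140655113680448 olap_scan_node.cpp:751] failed to get tablet. tablet_id=291167893, with schema_hash=2129885200, reason=tablet does not exist",
]
result = missing_tablets(sample)
for tablet_id, events in sorted(result.items()):
    print(tablet_id, events)
```

On the excerpt above this reports two tablets, 291250530 and 291167893, both with schema_hash=2129885200.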
CREATE and DROP records for the tablet (tablet_id=291250530), taken from the BE INFO log:
be.INFO.log.20250603-065155:I20250603 09:06:38.767489 140550097192512 agent_server.cpp:469] Submit task success. type=CREATE, signature=291250530, task_count_in_queue=4
be.INFO.log.20250603-065155:I20250603 09:06:38.773499 140550265046592 tablet_manager.cpp:169] Creating tablet 291250530
be.INFO.log.20250603-065155:I20250603 09:06:38.774270 140550265046592 tablet_manager.cpp:238] Created tablet 291250530
be.INFO.log.20250603-065155:I20250603 09:06:38.791880 140653788763712 local_tablets_channel.cpp:933] LocalTabletsChannel txn_id: 33711842 load_id: 5056baeb-4014-11f0-a104-0242f591ea1f sink_id: 0 incremental open delta writer: [291250514:1][291250520:1][291250524:2][291250530:2][291250534:2][291250540:2]
be.INFO.log.20250603-065155:I20250603 09:06:45.958472 140547362481728 tablet_sink.cpp:434] load_id=5056baeb-4014-11f0-a104-0242f591ea1f, txn_id: 33711842automatic partition rpc end response TCreatePartitionResult(status=TStatus(status_code=OK, error_msgs=<null>), partitions=[TOlapTablePartition(id=291250523, start_key=<null>, end_key=<null>, num_buckets=3, indexes=[TOlapTableIndexTablets(index_id=291166857, tablets=[291250524, 291250527, 291250530])], start_keys=<null>, end_keys=<null>, in_keys=[[TExprNode(node_type=STRING_LITERAL, type=TTypeDesc(types=[TTypeNode(type=SCALAR, scalar_type=TScalarType(type=VARCHAR, len=-1, precision=<null>, scale=<null>), struct_fields=<null>, is_named=<null>)]), opcode=<null>, num_children=0, agg_expr=<null>, bool_literal=<null>, case_expr=<null>, date_literal=<null>, float_literal=<null>, int_literal=<null>, in_predicate=<null>, is_null_pred=<null>, like_pred=<null>, literal_pred=<null>, slot_ref=<null>, string_literal=TStringLiteral(value=elittleyltgfqjd_095913), tuple_is_null_pred=<null>, info_func=<null>, decimal_literal=<null>, output_scale=-1, fn_call_expr=<null>, large_int_literal=<null>, output_column=<null>, output_type=<null>, vector_opcode=<null>, fn=<null>, vararg_start_idx=<null>, child_type=<null>, vslot_ref=<null>, used_subfield_names=<null>, binary_literal=<null>, copy_flag=<null>, check_is_out_of_bounds=<null>, use_vectorized=<null>, has_nullable_child=0, is_nullable=0, child_type_desc=<null>, is_monotonic=1, dict_query_expr=<null>, dictionary_get_expr=<null>, is_index_only_filter=0)]], is_shadow_partition=0)], tablets=[TTabletLocation(tablet_id=291250524, node_ids=[10001, 10002]), TTabletLocation(tablet_id=291250527, node_ids=[10001, 10003]), TTabletLocation(tablet_id=291250530, node_ids=[10003, 10002])], nodes=[TNodeInfo(id=10001, option=0, host=172.17.9.108, async_internal_port=8060), TNodeInfo(id=10002, option=0, host=172.17.9.106, async_internal_port=8060), TNodeInfo(id=10003, option=0, host=172.17.9.107, 
async_internal_port=8060)])
be.INFO.log.20250603-065155:I20250603 09:12:43.879386 140653562160704 tablet.cpp:1415] start to do tablet meta checkpoint, tablet=291250530.2129885200.624f4632a44da5de-f622bb6716c1a59b
be.INFO.log.20250603-065155:I20250603 11:27:17.196070 140550390937152 agent_server.cpp:473] Submit task success. type=DROP, signature=291250530, task_count_in_queue=1
be.INFO.log.20250603-065155:I20250603 11:27:17.201965 140654048937536 tablet_manager.cpp:395] Start to drop tablet 291250530
be.INFO.log.20250603-065155:I20250603 11:27:17.202057 140654048937536 tablet_manager.cpp:470] Succeed to drop tablet 291250530
be.INFO.log.20250603-065155:I20250603 11:27:17.204932 140654048937536 agent_task.cpp:158] Remove task success. type=DROP, signature=291250530, task_count_in_queue=3
be.INFO.log.20250603-065155:I20250603 11:27:25.474450 140653730014784 tablet_manager.cpp:1079] Removed /dbdata1/data/425/291250530
be.INFO.log.20250603-065155:W20250603 11:28:13.166536 140655138858560 olap_scan_node.cpp:751] failed to get tablet. tablet_id=291250530, with schema_hash=2129885200, reason=tablet does not exist
be.INFO.log.20250603-065155:W20250603 11:28:13.169005 140655138858560 internal_service.cpp:306] exec plan fragment failed, errmsg=failed to get tablet. tablet_id=291250530, with schema_hash=2129885200, reason=tablet does not exist
be.INFO.log.20250603-065155:I20250603 17:20:32.386362 140654744401472 schema_be_tablets_scanner.cpp:103] get_tablets_basic_infos table_id:-1 partition:-1 tablet:291250530 #info:0
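The records above suggest an ordering: the tablet is created, later dropped, and the failing scan arrives shortly after the drop. A rough sketch for extracting that timeline from grep output is shown below. The message patterns are copied from the lines in this post and may differ in other versions; `timeline` is a hypothetical helper, and the script only orders events, it does not explain why a scan was still planned against the dropped tablet.

```python
import re
from datetime import datetime

# Accepts both raw glog lines and grep output prefixed with "<file>:".
LINE_RE = re.compile(
    r"(?:^|:)(?P<sev>[IWEF])(?P<stamp>\d{8} \d{2}:\d{2}:\d{2}\.\d{6}) "
    r"\d+ [\w.]+:\d+\] (?P<msg>.*)$"
)


def timeline(lines, tablet_id):
    """Return sorted (timestamp, event) pairs for one tablet."""
    # Wording taken from the log lines in this post (an assumption, not a spec).
    patterns = {
        "created": re.compile(rf"Created tablet {tablet_id}\b"),
        "dropped": re.compile(rf"Succeed to drop tablet {tablet_id}\b"),
        "scan_failed": re.compile(rf"failed to get tablet\. tablet_id={tablet_id}\b"),
    }
    events = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(m.group("stamp"), "%Y%m%d %H:%M:%S.%f")
        for name, pat in patterns.items():
            if pat.search(m.group("msg")):
                events.append((ts, name))
    return sorted(events)


sample = [
    "be.INFO.log.20250603-065155:I20250603 09:06:38.774270 140550265046592 tablet_manager.cpp:238] Created tablet 291250530",
    "be.INFO.log.20250603-065155:I20250603 11:27:17.202057 140654048937536 tablet_manager.cpp:470] Succeed to drop tablet 291250530",
    "be.INFO.log.20250603-065155:W20250603 11:28:13.166536 140655138858560 olap_scan_node.cpp:751] failed to get tablet. tablet_id=291250530, with schema_hash=2129885200, reason=tablet does not exist",
]
events = timeline(sample, 291250530)
when = {name: ts for ts, name in events}
# Seconds between the successful drop and the first failed scan.
gap = (when["scan_failed"] - when["dropped"]).total_seconds()
for ts, name in events:
    print(ts.time(), name)
print(f"scan failed {gap:.3f}s after drop")
```

For these three lines the gap between the drop and the failed scan is about 56 seconds.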
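【Background】What operations were performed?
【Business impact】
【Shared-data mode (storage-compute separation)?】
【StarRocks version】e.g. 3.3.12
【Cluster size】e.g. 3 FE (3 followers + 0 observers) + 3 BE (FE and BE co-located)
【Machine info】vCPUs / memory / NIC, e.g. 8C/64G/10 GbE
【Contact】So that we can reach you promptly for log information while working on the issue, please add your contact details, e.g. Community Group 16 - 可乐鸡, or an email address. Thank you.
【Attachments】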
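- fe.log / be.INFO / relevant screenshots
- Slow queries:
  - Profile information (how to get a Profile; how to analyze query bottlenecks with a Profile)
  - Parallelism: show variables like '%parallel_fragment_exec_instance_num%';
  - Whether pipeline is enabled: show variables like '%pipeline%';
  - Screenshots of CPU and memory usage on the BE nodes
- Query errors:
  - query_dump (how to get the query_dump file)
- BE crash:
  - be.out
  - coredump (how to get a coredump)
- External table query errors:
  - be.out and fe.warn.log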