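To help us locate your problem faster, please provide the following information. Thank you.
【Details】Detailed description of the problem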
curl --location-trusted -u root -T /data1/software/tpcds1000x/store_sales_ext.dat -H "column_separator:|" -H "columns:ss_sold_date_sk,ss_sold_time_sk,ss_item_sk,ss_customer_sk,ss_cdemo_sk,ss_hdemo_sk,ss_addr_sk,ss_store_sk,ss_promo_sk,ss_ticket_number,ss_quantity,ss_wholesale_cost,ss_list_price,ss_sales_price,ss_ext_discount_amt,ss_ext_sales_price,ss_ext_wholesale_cost,ss_ext_list_price,ss_ext_tax,ss_coupon_amt,ss_net_paid,ss_net_paid_inc_tax,ss_net_profit" http://127.0.0.1:8030/api/tpcds1000x/store_sales_ext/_stream_load
Enter host password for user 'root':
{
"TxnId": 1136,
"Label": "bf0ee091-152c-4a01-872b-d54ba81ce3b0",
"Status": "Fail",
"Message": "cancel",
"NumberTotalRows": 0,
"NumberLoadedRows": 0,
"NumberFilteredRows": 0,
"NumberUnselectedRows": 0,
"LoadBytes": 36456763392,
"LoadTimeMs": 1574081,
"BeginTxnTimeMs": 3,
"StreamLoadPlanTimeMs": 11,
"ReadDataTimeMs": 566592,
"WriteDataTimeMs": 601421,
"CommitAndPublishTimeMs": 0
}
real 26m15.167s
user 0m45.578s
sys 9m13.597s
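For easier reading, the raw counters in the response above can be converted into more familiar units. The following is only a minimal Python sketch: the `summarize` helper and its output field names are made up for illustration, and the JSON is trimmed to the fields it uses.

```python
import json

# Response body copied from the Stream Load result above,
# trimmed to the fields used in this sketch.
response_text = """
{
  "TxnId": 1136,
  "Status": "Fail",
  "Message": "cancel",
  "LoadBytes": 36456763392,
  "LoadTimeMs": 1574081,
  "ReadDataTimeMs": 566592,
  "WriteDataTimeMs": 601421
}
"""


def summarize(resp: dict) -> dict:
    """Convert byte and millisecond counters into GiB, minutes and MiB/s.

    `summarize` is a hypothetical helper, not part of any StarRocks tool.
    """
    total_seconds = resp["LoadTimeMs"] / 1000
    return {
        "status": resp["Status"],
        "message": resp["Message"],
        "load_gib": round(resp["LoadBytes"] / 2**30, 2),
        "load_minutes": round(total_seconds / 60, 2),
        "read_minutes": round(resp["ReadDataTimeMs"] / 60000, 2),
        "write_minutes": round(resp["WriteDataTimeMs"] / 60000, 2),
        "avg_mib_per_s": round(resp["LoadBytes"] / 2**20 / total_seconds, 1),
    }


summary = summarize(json.loads(response_text))
for key, value in summary.items():
    print(f"{key}: {value}")
```

With the numbers from this run, that is roughly 34 GiB sent over about 26 minutes, with a write phase of about 10 minutes. The write time lines up with the `Active: 10m` fragment profile and the `cancel timeout fragment` message in be.INFO below.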
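【Background】What operations have been performed?
Increased streaming_load_max_mb to 450000 in be.conf
【Business impact】Data cannot be loaded, which blocks subsequent operations
【StarRocks version】3.0.0
【Cluster size】e.g. 3 FE (1 leader + 1 follower + 1 observer) + 3 BE (FE and BE co-located)
【Machine info】vCPUs/memory/NIC: 16C/64G/10 GbE
【Table model】Duplicate Key model
【Load or export method】stream_load
【Contact】rfgnet@163.com
【Attachments】
- fe.log/be.INFO/relevant screenshots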
fe.log
2023-09-05 10:21:58,300 INFO (replayer|79) [GlobalStateMgr.replayJournalInner():2044] replayed journal from 443172 - 443173
2023-09-05 10:22:00,699 WARN (tablet stat mgr|34) [TabletStatMgr.updateLocalTabletStat():149] task exec error. backend[10021]
org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.thrift.transport.TSocket.open(TSocket.java:226) ~[libthrift-0.13.0.jar:0.13.0]
at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:144) ~[starrocks-fe.jar:?]
at com.starrocks.common.GenericPool$ThriftClientFactory.create(GenericPool.java:129) ~[starrocks-fe.jar:?]
at org.apache.commons.pool2.BaseKeyedPooledObjectFactory.makeObject(BaseKeyedPooledObjectFactory.java:62) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.create(GenericKeyedObjectPool.java:1036) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:356) ~[commons-pool2-2.3.jar:2.3]
at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:278) ~[commons-pool2-2.3.jar:2.3]
at com.starrocks.common.GenericPool.borrowObject(GenericPool.java:101) ~[starrocks-fe.jar:?]
at com.starrocks.catalog.TabletStatMgr.updateLocalTabletStat(TabletStatMgr.java:141) [starrocks-fe.jar:?]
at com.starrocks.catalog.TabletStatMgr.runAfterCatalogReady(TabletStatMgr.java:90) [starrocks-fe.jar:?]
at com.starrocks.common.util.LeaderDaemon.runOneCycle(LeaderDaemon.java:73) [starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_191]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_191]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_191]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_191]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_191]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_191]
at org.apache.thrift.transport.TSocket.open(TSocket.java:221) ~[libthrift-0.13.0.jar:0.13.0]
… 11 more
2023-09-05 10:22:00,703 INFO (tablet stat mgr|34) [TabletStatMgr.updateLocalTabletStat():158] finished to get local tablet stat of all backends. cost: 10 ms
2023-09-05 10:22:00,703 INFO (tablet stat mgr|34) [TabletStatMgr.runAfterCatalogReady():126] finished to update index row num of all databases. cost: 0 ms
2023-09-05 10:22:03,302 INFO (nioEventLoopGroup-4-1|92) [RestBaseAction.handleRequest():70] receive http request. url=/api/bootstrap?cluster_id=608598752&token=5c91b4f0-6b3c-4935-b030-f235525a810e
2023-09-05 10:22:03,308 INFO (replayer|79) [GlobalStateMgr.replayJournalInner():2044] replayed journal from 443173 - 443174
2023-09-05 10:22:07,667 INFO (replayer|79) [GlobalStateMgr.replayJournalInner():2044] replayed journal from 443174 - 443175
be.INFO
I0905 10:32:54.427083 6590 heartbeat_server.cpp:76] get heartbeat from FE.host:192.168.239.241, port:9020, cluster id:608598752, counter:121
I0905 10:32:59.531713 6512 starlet.cc:83] Empty starmanager address, skip reporting!
I0905 10:33:03.012465 5806 daemon.cpp:201] Current memory statistics: process(155427480), query_pool(17503032), load(131342960), metadata(119785), compaction(0), schema_change(0), column_pool(0), page_cache(0), update(0), chunk_allocator(0), clone(0), consistency(0)
I0905 10:33:09.541242 6512 starlet.cc:83] Empty starmanager address, skip reporting!
I0905 10:33:18.015282 5806 daemon.cpp:201] Current memory statistics: process(59941096), query_pool(18205576), load(34023536), metadata(119785), compaction(0), schema_change(0), column_pool(0), page_cache(0), update(0), chunk_allocator(0), clone(0), consistency(0)
I0905 10:33:19.552177 6512 starlet.cc:83] Empty starmanager address, skip reporting!
I0905 10:33:29.569106 6512 starlet.cc:83] Empty starmanager address, skip reporting!
I0905 10:33:33.018689 5806 daemon.cpp:201] Current memory statistics: process(264877512), query_pool(18223832), load(239480688), metadata(119785), compaction(0), schema_change(0), column_pool(0), page_cache(0), update(0), chunk_allocator(0), clone(0), consistency(0)
I0905 10:33:39.583196 6512 starlet.cc:83] Empty starmanager address, skip reporting!
I0905 10:33:42.013247 5942 plan_fragment_executor.cpp:364] cancel(): fragment_instance_id=7f401e0a-be27-8876-923c-1cb1908dce9f
I0905 10:33:42.013649 5942 fragment_mgr.cpp:543] FragmentMgr cancel worker going to cancel timeout fragment 7f401e0a-be27-8876-923c-1cb1908dce9f
W0905 10:33:42.013741 6007 tablet_sink.cpp:1539] NodeChannel[10020], tablet add chunk failed, load_id=7f401e0a-be27-8876-923c-1cb1908dce9e, txn_id: 1136, parallel=1, compress_type=2, node=192.168.239.124:8060, errmsg=cancel
I0905 10:33:42.017438 6408 local_tablets_channel.cpp:557] cancel LocalTabletsChannel txn_id: 1136 load_id: 7f401e0abe278876-923c1cb1908dce9e index_id: 11181 #tablet:2 tablet_ids:11192,11186
W0905 10:33:42.779381 6007 fragment_mgr.cpp:199] Fail to open fragment 7f401e0a-be27-8876-923c-1cb1908dce9f: Cancelled: cancel
/build/starrocks/be/src/exec/tablet_sink.cpp:1469 _send_chunk_by_node(chunk, _channels[i].get(), _validate_select_idx)
/build/starrocks/be/src/runtime/plan_fragment_executor.cpp:249 _sink->send_chunk(runtime_state(), chunk.get())
I0905 10:33:42.779985 6007 plan_fragment_executor.cpp:492] Fragment 7f401e0a-be27-8876-923c-1cb1908dce9f:(Active: 10m, non-child: 0.44%)
- InstanceAllocatedMemoryUsage: 117.78 GB
- InstanceDeallocatedMemoryUsage: 75.89 GB
- InstancePeakMemoryUsage: 21.49 MB
- MemoryLimit: -1.00 B
- RowsProduced: 254.33M
OlapTableSink:(Active: 5m10s, non-child: 51.73%)- TxnID: 1136
- IndexNum: 1
- ReplicatedStorage: true
- AutomaticPartition: false
- AllocAutoIncrementTime: 25.360ms
- CloseWaitTime: 0.000ns
- OpenTime: 4.188ms
- PrepareDataTime: 39s755ms
- ConvertChunkTime: 190.440ms
- ValidateDataTime: 31s906ms
- RowsFiltered: 0
- RowsRead: 254.33M
- RowsReturned: 254.33M
- RpcClientSideTime: 5m35s
- RpcServerSideTime: 0.000ns
- SendDataTime: 4m30s
- PackChunkTime: 1m44s
- SendRpcTime: 0.000ns
- CompressTime: 0.000ns
- SerializeChunkTime: 0.000ns
- WaitResponseTime: 2m7s
FILE_SCAN_NODE (id=0):(Active: 4m47s, non-child: 47.83%)
- BytesRead: 0
- NumDiskAccess: 0
- PeakMemoryUsage: 0
- RowsRead: 0
- RowsReturned: 254.33M
- RowsReturnedRate: 885.53 K/sec
- ScanTime: 8m20s
- ScannerQueueCounter: 767
- ScannerQueueTime: 10m
- ScannerThreadsInvoluntaryContextSwitches: 0
- ScannerThreadsTotalWallClockTime: 0.000ns
- MaterializeTupleTime(*): 0.000ns
- ScannerThreadsSysTime: 0.000ns
- ScannerThreadsUserTime: 0.000ns
- ScannerThreadsVoluntaryContextSwitches: 0
- ScannerTotalTime: 0.000ns
- TotalRawReadTime(*): 0.000ns
- TotalReadThroughput: 0.00 /sec
FileScanner:- CastChunkTime: 0.000ns
- CreateChunkTime: 0.000ns
- FillTime: 0.000ns
- MaterializeTime: 0.000ns
- ReadTime: 0.000ns
FilePRead:- FileReadTime: 0.000ns
W0905 10:33:42.780006 6007 stream_load_executor.cpp:100] fragment execute failed, query_id=7f401e0abe278876-923c1cb1908dce9e, err_msg=cancel, id=7f401e0abe278876-923c1cb1908dce9e, job_id=-1, txn_id: 1136, label=bf0ee091-152c-4a01-872b-d54ba81ce3b0, db=tpcds1000x
W0905 10:33:42.780122 6582 stream_load.cpp:353] append body content failed. errmsg=Cancelled: cancel
/build/starrocks/be/src/exec/tablet_sink.cpp:1469 _send_chunk_by_node(chunk, _channels[i].get(), _validate_select_idx)
/build/starrocks/be/src/runtime/plan_fragment_executor.cpp:249 _sink->send_chunk(runtime_state(), chunk.get()) context=id=7f401e0abe278876-923c1cb1908dce9e, job_id=-1, txn_id: 1136, label=bf0ee091-152c-4a01-872b-d54ba81ce3b0, db=tpcds1000x
I0905 10:33:44.508596 6590 heartbeat_server.cpp:93] Updating master info: TMasterInfo(network_address=TNetworkAddress(hostname=192.168.239.241, port=9020), cluster_id=608598752, epoch=2, token=5c91b4f0-6b3c-4935-b030-f235525a810e, backend_ip=192.168.239.138, http_port=8030, heartbeat_flags=0, backend_id=10021, min_active_txn_id=1137)
I0905 10:33:48.022119 5806 daemon.cpp:201] Current memory statistics: process(15787928), query_pool(15096696), load(0), metadata(119785), compaction(0), schema_change(0), column_pool(0), page_cache(0), update(0), chunk_allocator(0), clone(0), consistency(0)
I0905 10:33:48.415796 6412 load_channel_mgr.cpp:239] Memory consumption(bytes) limit=16374613708 current=0 peak=442638224
I0905 10:33:49.607372 6512 starlet.cc:83] Empty starmanager address, skip reporting!
I0905 10:33:53.438124 6479 tablet_manager.cpp:834] Report all 60 tablets info
- Full error stack trace
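Since most of the be.INFO excerpt above is periodic noise (heartbeats, memory statistics, starlet reports), a small filter can pull out only the entries tied to the failed load. This is a rough Python sketch, assuming the glog-style prefix seen in the excerpt (`<severity><MMDD> <time> <thread id> <file>:<line>] <message>`). The `relevant_entries` function is hypothetical, and the sample lines are copied from the log above.

```python
import re

# glog-style prefix as it appears in the be.INFO excerpt:
# <severity><MMDD> <HH:MM:SS.micros> <thread id> <file>:<line>] <message>
GLOG_LINE = re.compile(
    r"^(?P<sev>[IWEF])(?P<mmdd>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[^\s:]+:\d+)\] (?P<msg>.*)$"
)


def relevant_entries(lines, keywords):
    """Yield parsed entries that are warnings/errors or mention a keyword."""
    for line in lines:
        match = GLOG_LINE.match(line)
        if match is None:
            continue  # profile counters and stack frames carry no glog prefix
        entry = match.groupdict()
        if entry["sev"] in "WEF" or any(k in entry["msg"] for k in keywords):
            yield entry


# Sample lines copied verbatim from the be.INFO excerpt above.
sample = [
    "I0905 10:33:39.583196 6512 starlet.cc:83] Empty starmanager address, skip reporting!",
    "I0905 10:33:42.013649 5942 fragment_mgr.cpp:543] FragmentMgr cancel worker going to cancel timeout fragment 7f401e0a-be27-8876-923c-1cb1908dce9f",
    "W0905 10:33:42.779381 6007 fragment_mgr.cpp:199] Fail to open fragment 7f401e0a-be27-8876-923c-1cb1908dce9f: Cancelled: cancel",
    "- InstancePeakMemoryUsage: 21.49 MB",
]

hits = list(
    relevant_entries(
        sample,
        keywords=["7f401e0a-be27-8876-923c-1cb1908dce9f", "txn_id: 1136"],
    )
)
for entry in hits:
    print(entry["sev"], entry["time"], entry["src"], entry["msg"])
```

To run it against the full file, pass an open file handle for be.INFO as `lines` and strip trailing newlines first. The keywords here are the fragment instance id and transaction id from this report.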