Common crashes, bugs, and optimizations: lookup index

  1. Unable to reset the root password

ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 0
  1. Memory leak in asynchronous materialized views

jmap -histo:live <FE process ID>

 num     #instances         #bytes  class name
----------------------------------------------
   1:      52869565     2967542312  [C
   2:      83179155     2661732960  java.util.concurrent.ConcurrentHashMap$Node
   3:      15120953     1541269008  [Ljava.util.concurrent.ConcurrentHashMap$Node;
   4:      52880778     1269138672  java.lang.String
   5:        232083     1153473936  [B
   6:      16837970     1077630080  java.util.concurrent.ConcurrentHashMap
   7:      37361372      896672928  com.starrocks.common.Pair
   8:      35647187      855532488  com.starrocks.common.util.Counter
   9:       9578434      473886784  [Ljava.lang.Object;
  10:      12169451      292066824  java.util.ArrayList
  11:      11814272      283542528  java.util.Collections$SetFromMap
  12:      11814043      283537032  java.util.concurrent.ConcurrentHashMap$KeySetView
  13:       1899199      167129512  com.starrocks.analysis.SlotRef
  14:        981845      141385680  com.starrocks.thrift.TExprNode
  15:       2864417      137492016  java.util.HashMap
  16:       5573428      133762272  java.lang.Long
  17:       2189413      122607128  java.util.LinkedHashMap
  18:       1406488      120784800  [Ljava.util.HashMap$Node;
  19:       3061632       97972224  java.util.HashMap$Node
  20:       1653441       79365168  com.starrocks.common.util.RuntimeProfile
  21:       1707982       68319280  java.util.LinkedHashMap$Entry
  22:       2764850       66356400  com.starrocks.thrift.TNetworkAddress
  23:        499248       63903744  com.starrocks.catalog.Replica
  24:       1955990       62591680  com.starrocks.thrift.TScalarType
  25:       1714766       54872512  java.util.concurrent.locks.ReentrantLock$NonfairSync
  26:       1653505       52912160  java.util.Collections$SynchronizedMap
  27:       1635130       52324160  com.starrocks.sql.analyzer.Field
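When comparing histograms taken at different times, it can help to total the bytes held by classes under one package prefix. The following is a minimal sketch, not an official tool: the helper names `parse_histogram` and `bytes_by_prefix` are hypothetical, and the sample input is a few rows copied from the histogram above.

```python
import re

# Matches data rows such as
# "   7:      37361372      896672928  com.starrocks.common.Pair"
ROW = re.compile(r"^\s*(\d+):\s+(\d+)\s+(\d+)\s+(\S+)")

def parse_histogram(text):
    """Return a list of (class_name, instances, bytes) from `jmap -histo` output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:  # header and separator lines do not match and are skipped
            rows.append((m.group(4), int(m.group(2)), int(m.group(3))))
    return rows

def bytes_by_prefix(rows, prefix):
    """Sum the bytes of all classes whose name starts with the given prefix."""
    return sum(size for name, _, size in rows if name.startswith(prefix))

# A few rows copied from the histogram above.
sample = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:      52869565     2967542312  [C
   7:      37361372      896672928  com.starrocks.common.Pair
   8:      35647187      855532488  com.starrocks.common.util.Counter
  20:       1653441       79365168  com.starrocks.common.util.RuntimeProfile
"""

rows = parse_histogram(sample)
total = bytes_by_prefix(rows, "com.starrocks.common")
print(f"com.starrocks.common.*: {total / 2**30:.2f} GiB")
```

Running the same script on two histograms captured some time apart shows which prefixes keep growing, which is usually more telling than a single snapshot.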
  1. SET TRANSACTION ISOLATION fails

syntax to use near 'ISOLATION'

2023-02-03 10:50:31,865 WARN (starrocks-mysql-nio-pool-330|36506) [ConnectProcessor.handleQuery():334] Process one query failed because. com.starrocks.common.AnalysisException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ISOLATION' at line 1
        at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:290) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:430) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:676) ~[starrocks-fe.jar:?]
        at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:55) ~[starrocks-fe.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_231]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_231]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_231]
  1. SELECT with LIMIT fails with java.lang.NullPointerException: null

2023-01-13 14:18:46,703 WARN (starrocks-mysql-nio-pool-25|285) [StmtExecutor.execute():524] execute Exception, sql select * from strock_ads_sg_bqbb_brand_detial_df limit 10

java.lang.NullPointerException: null
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalOlapScan(PlanFragmentBuilder.java:539) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalOlapScan(PlanFragmentBuilder.java:229) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.physical.PhysicalOlapScanOperator.accept(PhysicalOlapScanOperator.java:132) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visit(PlanFragmentBuilder.java:238) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalDistribution(PlanFragmentBuilder.java:1389) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalDistribution(PlanFragmentBuilder.java:229) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.physical.PhysicalDistributionOperator.accept(PhysicalDistributionOperator.java:44) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visit(PlanFragmentBuilder.java:238) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalLimit(PlanFragmentBuilder.java:2301) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalLimit(PlanFragmentBuilder.java:229) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.physical.PhysicalLimitOperator.accept(PhysicalLimitOperator.java:33) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visit(PlanFragmentBuilder.java:238) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder.createPhysicalPlan(PlanFragmentBuilder.java:163) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.createQueryPlan(StatementPlanner.java:115) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:65) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:39) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:373) ~[starrocks-fe.jar:?]
  1. Broker Load fails with: mismatched row count

Type:LOAD_RUN_FAIL; msg:mismatched row count: 512 vs 4096
*** Aborted at 1667808981 (unix time) try "date -d @1667808981" if you are using GNU date ***
PC: @          0x24014a3 strings::memcpy_inlined()
*** SIGSEGV (@0x0) received by PID 168026 (TID 0x7f3c1c420700) from PID 0; stack trace: ***
    @          0x507f842 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7f3e683c4630 (unknown)
    @          0x24014a3 strings::memcpy_inlined()
    @          0x361cee3 starrocks::ColumnVisitorMutableAdapter<>::visit()
    @          0x2c30b4f starrocks::vectorized::ColumnFactory<>::accept_mutable()
    @          0x361ddc8 starrocks::serde::ColumnArraySerde::deserialize()
    @          0x46ec15f starrocks::serde::ProtobufChunkDeserializer::deserialize()
    @          0x3e3be18 starrocks::DataStreamRecvr::SenderQueue::_deserialize_chunk()
    @          0x3e4257c starrocks::DataStreamRecvr::NonPipelineSenderQueue::add_chunks<>()
    @          0x3e3c282 starrocks::DataStreamRecvr::NonPipelineSenderQueue::add_chunks()
    @          0x3dbf073 starrocks::DataStreamRecvr::add_chunks()
    @          0x3d604b6 starrocks::DataStreamMgr::transmit_chunk()
    @          0x4729c3c starrocks::PInternalServiceImplBase<>::transmit_chunk()
    @          0x51b115e brpc::policy::ProcessRpcRequest()
    @          0x51a7b67 brpc::ProcessInputMessage()
    @          0x51a8a13 brpc::InputMessenger::OnNewMessages()
    @          0x524f75e brpc::Socket::ProcessEvent()
    @          0x515d6af bthread::TaskGroup::task_runner()
    @          0x52e60a1 bthread_make_fcontext
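Each BE crash dump in this list begins with an `Aborted at <unix time>` line. To line a crash up with FE logs, the timestamp has to be converted to a readable time. The log itself suggests `date -d`; the sketch below does the same conversion in Python for systems without GNU date. The function name `crash_time_utc` is hypothetical, and the result is in UTC, so it must still be shifted to the time zone the FE logs use.

```python
import re
from datetime import datetime, timezone

# Matches the first line of a glog crash dump, e.g.
# *** Aborted at 1667808981 (unix time) try "date -d @1667808981" ...
ABORT_LINE = re.compile(r"\*\*\* Aborted at (\d+) \(unix time\)")

def crash_time_utc(log_line):
    """Return the crash time as a timezone-aware UTC datetime, or None if the line does not match."""
    m = ABORT_LINE.search(log_line)
    if m is None:
        return None
    return datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)

line = ('*** Aborted at 1667808981 (unix time) try "date -d @1667808981" '
        'if you are using GNU date ***')
print(crash_time_utc(line).isoformat())
```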
  1. JSON load crash

*** Aborted at 1667192760 (unix time) try "date -d @1667192760" if you are using GNU date ***
PC: @          0x27460a1 starrocks::vectorized::JsonDocumentStreamParser::get_current()
*** SIGSEGV (@0x8) received by PID 12653 (TID 0x7fa1a94c1700) from PID 8; stack trace: ***
    @          0x3fa3ad2 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7fa2a1087630 (unknown)
    @          0x27460a1 starrocks::vectorized::JsonDocumentStreamParser::get_current()
    @          0x27455d7 starrocks::vectorized::JsonReader::_read_rows<>()
    @          0x27414d9 starrocks::vectorized::JsonReader::read_chunk()
    @          0x27416ec starrocks::vectorized::JsonScanner::get_next()
    @          0x272e5e0 starrocks::vectorized::FileScanNode::_scanner_scan()
    @          0x272ff4f starrocks::vectorized::FileScanNode::_scanner_worker()
    @          0x5a21410 execute_native_thread_routine
    @     0x7fa2a107fea5 start_thread
    @     0x7fa2a069ab0d __clone
  1. Query on a view with UNION and NULL returns an error or crashes

Mismatched row count

*** SIGSEGV (@0x0) received by PID 3659 (TID 0x7f17de2fb700) from PID 0; stack trace: ***
    @          0x3ff4972 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7f186229a630 (unknown)
    @     0x7f186190bc00 __memmove_ssse3_back
    @          0x1a26464 starrocks::vectorized::FixedLengthColumnBase<>::append()
    @          0x25224ca starrocks::vectorized::NullableColumn::append()
    @          0x251009b starrocks::vectorized::Chunk::append_safe()
    @          0x27453a7 starrocks::vectorized::ChunksSorterHeapSort::done()
    @          0x27419e5 starrocks::vectorized::ChunksSorter::finish()
    @          0x28ba860 starrocks::pipeline::PartitionSortSinkOperator::set_finishing()
    @          0x28def07 starrocks::pipeline::PipelineDriver::_mark_operator_finishing()
    @          0x28dff3b starrocks::pipeline::PipelineDriver::process()
    @          0x28d67dc starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
    @          0x21772c9 starrocks::ThreadPool::dispatch_thread()
    @          0x2172e7a starrocks::Thread::supervise_thread()
    @     0x7f1862292ea5 start_thread
    @     0x7f18618ad9fd __clone
    @                0x0 (unknown)
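Trigger conditions for this bug:
1: The query reads from a view.
2: The view contains a UNION.
3: One of the UNION's children contains a constant NULL.
4: That constant NULL is in the first child of the UNION.
  1. bitmap_contains consumes a large amount of memory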

terminate called after throwing an instance of 'query_id:b1e35703-a6de-11ed-adfa-78ac4489cf40, fragment_instance:b1e35703-a6de-11ed-adfa-78ac4489cf47
*** Aborted at 1675771249 (unix time) try "date -d @1675771249" if you are using GNU date ***
std::runtime_error'
  what():  failed memory alloc in constructor
PC: @     0x7fe1f947e387 __GI_raise
*** SIGABRT (@0xce40004d5a7) received by PID 316839 (TID 0x7fe13807f700) from PID 316839; stack trace: ***
    @          0x40e1c82 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7fe1f9f33630 (unknown)
    @     0x7fe1f947e387 __GI_raise
    @     0x7fe1f947fa78 __GI_abort
    @          0x5ae6dd2 __gnu_cxx::__verbose_terminate_handler()
    @          0x5ae5886 __cxxabiv1::__terminate()
    @          0x5ae58f1 std::terminate()
    @          0x5ae5a96 __cxa_rethrow
    @          0x1653704 _ZNSt8_Rb_treeIjSt4pairIKj7RoaringESt10_Select1stIS3_ESt4lessIjESaIS3_EE7_M_copyINS9_11_Alloc_nodeEEEPSt13_Rb_tree_nodeIS3_EPKSD_PSt18_Rb_tree_node_baseRT_.isra.0.cold
    @          0x212826b starrocks::BitmapValue::BitmapValue()
    @          0x258608b starrocks::vectorized::ObjectColumn<>::append()
    @          0x2586412 starrocks::vectorized::ObjectColumn<>::append_value_multiple_times()
    @          0x293d750 starrocks::pipeline::CrossJoinLeftOperator::_copy_joined_rows_with_index_base_build()
    @          0x293dfc2 starrocks::pipeline::CrossJoinLeftOperator::pull_chunk()
    @          0x2965983 starrocks::pipeline::PipelineDriver::process()
    @          0x295bfb6 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
    @          0x21c59f9 starrocks::ThreadPool::dispatch_thread()
    @          0x21c15aa starrocks::Thread::supervise_thread()
    @     0x7fe1f9f2bea5 start_thread
    @     0x7fe1f9546b0d __clone
    @                0x0 (unknown)
  1. SQL parsing error: location

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'location' at line 3
  1. Low-cardinality optimization causes an Unknown error during plan rewrite
java.lang.IllegalStateException: null
        at com.google.common.base.Preconditions.checkState(Preconditions.java:494) ~[spark-dpp-1.0.0.jar:?]
        at com.starrocks.sql.plan.ScalarOperatorToExpr$Formatter.visitVariableReference(ScalarOperatorToExpr.java:133) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.ScalarOperatorToExpr$Formatter.visitVariableReference(ScalarOperatorToExpr.java:112) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.scalar.ColumnRefOperator.accept(ColumnRefOperator.java:110) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.ScalarOperatorToExpr.buildExecExpression(ScalarOperatorToExpr.java:79) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.buildPartialTopNFragment(PlanFragmentBuilder.java:1749) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalTopN(PlanFragmentBuilder.java:1664) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalTopN(PlanFragmentBuilder.java:255) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.physical.PhysicalTopNOperator.accept(PhysicalTopNOperator.java:113) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visit(PlanFragmentBuilder.java:264) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalDecode(PlanFragmentBuilder.java:474) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalDecode(PlanFragmentBuilder.java:255) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.physical.PhysicalDecodeOperator.accept(PhysicalDecodeOperator.java:112) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visit(PlanFragmentBuilder.java:264) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder.createPhysicalPlan(PlanFragmentBuilder.java:169) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.createQueryPlan(StatementPlanner.java:110) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:66) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:37) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:373) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:313) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:430) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:676) ~[starrocks-fe.jar:?]
        at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:55) ~[starrocks-fe.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
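  • GitHub Issue:

  • GitHub Fix PR:
  • Jira:

  • Affected versions:
    • 2.4.0 ~ 2.4.3
    • 2.5.0 ~ 2.5.1
  • Fixed versions:
    • 2.4.4+
    • 2.5.2+
  • Temporary workaround:
    • set global cbo_enable_low_cardinality_optimize=false; (this may degrade the performance of some SQL queries)
  • Root cause:
    • A defect in plan rewriting for low-cardinality queries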
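To decide whether a cluster needs the workaround, its version can be checked against the affected ranges listed above. This is a minimal sketch with hypothetical helper names (`parse_version`, `is_affected`). It assumes plain `major.minor.patch` version strings and would need extra handling for suffixes such as release-candidate tags.

```python
def parse_version(text):
    """Turn '2.4.3' into (2, 4, 3) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in text.split("."))

# Affected ranges copied from the list above (inclusive on both ends).
AFFECTED = [("2.4.0", "2.4.3"), ("2.5.0", "2.5.1")]

def is_affected(version, ranges=AFFECTED):
    """Return True if the version falls inside any affected range."""
    v = parse_version(version)
    return any(parse_version(lo) <= v <= parse_version(hi) for lo, hi in ranges)

for v in ("2.4.3", "2.4.4", "2.5.1", "2.5.2"):
    print(v, "affected" if is_affected(v) else "not affected")
```

Passing a different `ranges` list lets the same check be reused for other entries in this index that list affected versions.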
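  1. The same SQL runs much slower with LIMIT than without it
    Trigger conditions:
  • The LIMIT value is small
  • The query has filter predicates
  • Few rows remain after filtering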
mysql> select lo_custkey from lineorder_flat where lo_revenue = 3322363 and lo_custkey = 2684693 limit 1;
+------------+
| lo_custkey |
+------------+
|    2684693 |
+------------+
1 row in set (13.25 sec)

mysql> select lo_custkey from lineorder_flat where lo_revenue = 3322363 and lo_custkey = 2684693;
+------------+
| lo_custkey |
+------------+
|    2684693 |
+------------+
1 row in set (0.08 sec)
  1. Librdkafka crash
*** Aborted at 1632307784 (unix time) try "date -d @1632307784" if you are using GNU date ***
PC: @ 0x7f2e6145a387 __GI_raise
*** SIGABRT (@0x27de9) received by PID 163305 (TID 0x7f2d8a2f0700) from PID 163305; stack trace: ***
@ 0x1fd12e2 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f2e61b03630 (unknown)
@ 0x7f2e6145a387 __GI_raise
@ 0x7f2e6145ba78 __GI_abort
@ 0x7f2e614531a6 __assert_fail_base
@ 0x7f2e61453252 __GI___assert_fail
@ 0x1d7c767 rd_kafka_broker_destroy_final
@ 0x1dfccb8 rd_kafka_metadata_refresh_topics
@ 0x1dfd04a rd_kafka_metadata_refresh_known_topics
@ 0x1d774c2 rd_kafka_broker_fail
@ 0x1d87bd4 rd_kafka_broker_op_serve
@ 0x1d89463 rd_kafka_broker_ops_io_serve
@ 0x1d89968 rd_kafka_broker_consumer_serve
@ 0x1d8b485 rd_kafka_broker_thread_main
@ 0x1df43a7 _thrd_wrapper_function
@ 0x7f2e61afbea5 start_thread
@ 0x7f2e615229fd __clone
  1. InfoSchemaDb id shouldn't larger than 10000
2023-02-16 00:00:34,021 ERROR (leaderCheckpointer|75) [Checkpoint.runAfterCatalogReady():106] Exception when generate new image file
java.lang.IllegalStateException: InfoSchemaDb id shouldn't larger than 10000, please restart your FE server
at com.google.common.base.Preconditions.checkState(Preconditions.java:510) ~[spark-dpp-1.0.0.jar:?]
at com.starrocks.server.LocalMetastore.loadCluster(LocalMetastore.java:3598) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.loadImage(GlobalStateMgr.java:1131) ~[starrocks-fe.jar:?]
at com.starrocks.master.Checkpoint.runAfterCatalogReady(Checkpoint.java:87) [starrocks-fe.jar:?]
at com.starrocks.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:61) [starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
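  1. Dynamic partitioning: a large number of historical partitions are created and the operation times out
    When dynamic_partition.start is set to a very small value, a large number of historical partitions are created
  1. Disk usage keeps growing after persistent index is enabled on a Primary Key table
  1. BE crashes while loading the persistent index at startup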

*** Aborted at 1676903379 (unix time) try "date -d @1676903379" if you are using GNU date ***
PC: @          0x310317a starrocks::PersistentIndex::_merge_compaction()
*** SIGFPE (@0x310317a) received by PID 30837 (TID 0x7f8d745fe700) from PID 51392890; stack trace: ***
    @          0x4825332 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7fe755c565e0 (unknown)
    @          0x310317a starrocks::PersistentIndex::_merge_compaction()
    @          0x3104bc0 starrocks::PersistentIndex::_check_and_flush_l0()
    @          0x31070c0 starrocks::PersistentIndex::commit()
    @          0x2e8936e starrocks::PrimaryIndex::commit()
    @          0x2f5cbf5 starrocks::TabletUpdates::_apply_compaction_commit()
    @          0x2f5e52d starrocks::TabletUpdates::do_apply()
    @          0x3762b55 starrocks::ThreadPool::dispatch_thread()
    @          0x375df8a starrocks::Thread::supervise_thread()
    @     0x7fe755c4ee25 start_thread
    @     0x7fe75526e34d __clone
    @                0x0 (unknown)
  1. INSERT OVERWRITE or materialized view refresh causes a memory leak on FE followers
  1. Crash when querying a single-table materialized view

*** SIGSEGV (@0x0) received by PID 287072 (TID 0x7f99979d7640) from PID 0; stack trace: ***
    @          0x56ec9c2 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7f9b214fb1d0 (unknown)
    @          0x453d51d std::__push_heap<>()
    @          0x454a1d9 starrocks::vectorized::HeapMergeIterator::fill()
    @          0x454b046 starrocks::vectorized::HeapMergeIterator::do_get_next()
    @          0x454b704 starrocks::vectorized::HeapMergeIterator::do_get_next()
    @          0x43d40c3 starrocks::vectorized::TimedChunkIterator::do_get_next()
    @          0x43cfbae starrocks::vectorized::AggregateIterator::do_get_next()
    @          0x43d1254 starrocks::vectorized::AggregateIterator::do_get_next()
    @          0x43d40c3 starrocks::vectorized::TimedChunkIterator::do_get_next()
    @          0x40484ae starrocks::vectorized::TabletReader::do_get_next()
    @          0x4021292 starrocks::vectorized::ProjectionIterator::do_get_next()
    @          0x2ee880b starrocks::pipeline::OlapChunkSource::_read_chunk_from_storage()
    @          0x2ee8eeb starrocks::pipeline::OlapChunkSource::_read_chunk()
    @          0x2ed888c starrocks::pipeline::ChunkSource::buffer_next_batch_chunks_blocking()
    @          0x2c5bce4 _ZZN9starrocks8pipeline12ScanOperator18_trigger_next_scanEPNS_12RuntimeStateEiENKUlvE_clEv
    @          0x2c6cd1d starrocks::workgroup::ScanExecutor::worker_thread()
    @          0x47a3a9d starrocks::ThreadPool::dispatch_thread()
    @          0x479e82a starrocks::Thread::supervise_thread()
    @     0x7f9b214f03fb start_thread
    @     0x7f9b1fd27c23 __GI___clone
    @                0x0 (unknown)
  1. Primary Key table: falling back from incremental clone to full clone fails, so the clone keeps failing

I0127 00:46:02.271173  7878 snapshot_manager.cpp:112] make primary snapshot tablet:334728 cur_version:65905 missing_version_ranges:65662 timeout:180
I0127 00:46:02.271206  7878 tablet_updates.cpp:2954] get_rowsets_for_snapshot: too many rowsets for incremental clone #rowset:244 #rowset_for_full_clone:1 tablet:334728 #version:1 [65905.1 65905.1@0 65905.1] #pending:0
W0127 00:46:02.271214  7878 agent_server.cpp:308] fail to make_snapshot. tablet_id:334728 msg:Not found: get_rowsets_for_snapshot: too many rowsets for incremental clone #rowset:244 #rowset_for_full_clone:1 tablet:334728 #version:1 [65905.1 65905.1@0 65905.1] #pending:0
  • GitHub Issue:
  • GitHub Fix PR:
  • Jira:
  • Affected versions:
    • 2.3.0 ~ 2.3.7
    • 2.4.0 ~ 2.4.3
    • 2.5.0
  • Fixed versions:
    • 2.3.8+
    • 2.4.4+
    • 2.5.1+
  • Temporary workaround:
    • Manually delete the corresponding tablet on the BE
  • Root cause:
    • See the issue description
  1. Low-cardinality bug: Dict Decode failed

Dict Decode failed, Dict can't take cover all key :0