Common Crashes / Bugs / Optimization Lookup

  1. Crash caused by missing AVX2 support

query_id:00000000-0000-0000-0000-000000000000, fragment_instance:00000000-0000-0000-0000-000000000000
*** Aborted at 1673246592 (unix time) try "date -d @1673246592" if you are using GNU date ***
PC: @ 0x4ab69a4 bitset_container_from_array
*** SIGILL (@0x4ab69a4) received by PID 9331 (TID 0x7f72ef254700) from PID 78342564; stack trace: ***
@ 0x4659ee2 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f7309f71630 (unknown)
@ 0x4ab69a4 bitset_container_from_array
@ 0x4a9cc9d roaring_bitmap_add_many
@ 0x2fdd52e starrocks::DelVector::_add_dels()
@ 0x2fddcdc starrocks::DelVector::add_dels_as_new_version()
@ 0x2e5544c starrocks::TabletUpdates::_apply_rowset_commit()
@ 0x2e57482 starrocks::TabletUpdates::do_apply()
@ 0x362cab5 starrocks::ThreadPool::dispatch_thread()
@ 0x3627efa starrocks::thread::supervise_thread()
@ 0x7f7309f69ea5 start_thread
@ 0x7f7309584b0d __clone
@ 0x0 (unknown)
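  • Cause
    • SIGILL usually means the machine hosting the BE does not support the AVX2 instruction set.
  • Fix
    • Move the BE to a machine that supports AVX2. To check: cat /proc/cpuinfo | grep avx2
    • Or disable AVX2 support and build the BE manually.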
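
The /proc/cpuinfo check above can also be scripted, for example to run it across many hosts. The following is a minimal sketch, not part of StarRocks: the function name cpu_supports_avx2 is hypothetical, and it assumes the Linux x86 /proc/cpuinfo layout in which CPU features appear on lines whose key is "flags".

```python
def cpu_supports_avx2(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists avx2.

    Hypothetical helper; assumes the x86 Linux format "flags : fpu ... avx2 ...".
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only inspect feature-flag lines, so "avx2" in a model name is ignored.
        if sep and key.strip() == "flags" and "avx2" in value.split():
            return True
    return False


# Example with inline text; on a real host, pass open("/proc/cpuinfo").read().
print(cpu_supports_avx2("flags\t\t: fpu sse4_2 avx avx2 bmi2"))
```

If the function returns False on the BE host, the SIGILL in the trace above is the expected symptom.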

  1. Checksum mismatch error

Bad page: checksum mismatch (actual=243080401 vs expect=12)
  • Cause
    • Usually a disk hardware fault. Check whether dmesg -T reports I/O errors (search for "I/O error"), for example:
[Sat Jan 14 21:30:54 2023] nvme1n1: Write(0x1) @ LBA 174796784, 1016 blocks, Data Transfer Error (sct 0x0 / sc 0x4) DNR 
[Sat Jan 14 21:30:54 2023] blk_update_request: critical target error, dev nvme1n1, sector 174796784 op 0x1:(WRITE) flags 0x4000 phys_seg 127 prio class 0
[Sat Jan 14 21:30:54 2023] EXT4-fs warning (device dm-0): ext4_end_bio:325: I/O error 5 writing to inode 216269336 (offset 8388608 size 8388608 starting block 21849088)
[Sat Jan 14 21:30:54 2023] buffer_io_error: 502 callbacks suppressed
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849088
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849089
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849090
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849091
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849092
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849093
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849094
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849095
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849096
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849097
[Sat Jan 14 21:30:54 2023] JBD2: Detected IO errors while flushing file data on dm-0-8
  • Solution
    • Replace the disk.
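
To see which devices are failing, the dmesg output can be summarised per device. This is a rough sketch under stated assumptions: the helper name count_io_errors is hypothetical, and the two regular expressions only cover the message shapes that appear in the excerpt above ("Buffer I/O error on device ..." and "critical target error, dev ..."); other kernels or drivers may word these messages differently.

```python
import re
from collections import Counter

# Patterns for the two kernel message shapes seen in the dmesg excerpt above.
_DEVICE_PATTERNS = [
    re.compile(r"Buffer I/O error on device (\S+?),"),
    re.compile(r"critical target error, dev (\S+?),"),
]


def count_io_errors(dmesg_text: str) -> Counter:
    """Count I/O error lines per device name in `dmesg -T` style output."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        for pattern in _DEVICE_PATTERNS:
            match = pattern.search(line)
            if match:
                counts[match.group(1)] += 1
                break  # count each line at most once
    return counts


sample = """\
[Sat Jan 14 21:30:54 2023] blk_update_request: critical target error, dev nvme1n1, sector 174796784 op 0x1:(WRITE) flags 0x4000 phys_seg 127 prio class 0
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849088
[Sat Jan 14 21:30:54 2023] Buffer I/O error on device dm-0, logical block 21849089
"""
print(count_io_errors(sample).most_common())
```

A non-empty result for the device backing the BE storage path supports the hardware-fault diagnosis.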
  1. actual row size changed after compaction

W0131 10:10:09.796995 33398 task_worker_pool.cpp:1157] clone failed. signature:2616201
W0131 10:18:19.914609 33340 tablet_updates.cpp:1460] remove_expired_versions failed, tablet updates is in error state: tablet:2616201 actual row size changed after compaction 1697323 -> 1779662 tablet:2616201 #version:2 [58281 58281.181 58281.1] pending:rowsets:4
W0131 10:19:14.006429 33395 engine_clone_task.cpp:145] Fail to load snapshot:Internal error:load snapshot failed, tablet updates is in error state: tablet:2616201 actual row size changed after compaction 1697323 -> 1779562 tablet:2616201 #version:2 [58281 58281.101 58281.1] pending:rowsets:4
  1. Unable to reset the root password

ERROR 2013 (HY000): Lost connection to MySQL server at 'reading authorization packet', system error: 0
  1. Memory leak with asynchronous materialized views

jmap -histo:live <FE process id>

 num     #instances         #bytes  class name
----------------------------------------------
   1:      52869565     2967542312  [C
   2:      83179155     2661732960  java.util.concurrent.ConcurrentHashMap$Node
   3:      15120953     1541269008  [Ljava.util.concurrent.ConcurrentHashMap$Node;
   4:      52880778     1269138672  java.lang.String
   5:        232083     1153473936  [B
   6:      16837970     1077630080  java.util.concurrent.ConcurrentHashMap
   7:      37361372      896672928  com.starrocks.common.Pair
   8:      35647187      855532488  com.starrocks.common.util.Counter
   9:       9578434      473886784  [Ljava.lang.Object;
  10:      12169451      292066824  java.util.ArrayList
  11:      11814272      283542528  java.util.Collections$SetFromMap
  12:      11814043      283537032  java.util.concurrent.ConcurrentHashMap$KeySetView
  13:       1899199      167129512  com.starrocks.analysis.SlotRef
  14:        981845      141385680  com.starrocks.thrift.TExprNode
  15:       2864417      137492016  java.util.HashMap
  16:       5573428      133762272  java.lang.Long
  17:       2189413      122607128  java.util.LinkedHashMap
  18:       1406488      120784800  [Ljava.util.HashMap$Node;
  19:       3061632       97972224  java.util.HashMap$Node
  20:       1653441       79365168  com.starrocks.common.util.RuntimeProfile
  21:       1707982       68319280  java.util.LinkedHashMap$Entry
  22:       2764850       66356400  com.starrocks.thrift.TNetworkAddress
  23:        499248       63903744  com.starrocks.catalog.Replica
  24:       1955990       62591680  com.starrocks.thrift.TScalarType
  25:       1714766       54872512  java.util.concurrent.locks.ReentrantLock$NonfairSync
  26:       1653505       52912160  java.util.Collections$SynchronizedMap
  27:       1635130       52324160  com.starrocks.sql.analyzer.Field
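
A histogram this long is easier to compare between runs once it is parsed. The sketch below is illustrative only: parse_histo and bytes_by_prefix are hypothetical helper names, and the parser assumes the four-column "rank: instances bytes class" layout shown above.

```python
import re
from typing import List, NamedTuple


class HistoRow(NamedTuple):
    rank: int
    instances: int
    size_bytes: int
    class_name: str


# One data row of `jmap -histo` output: "  7:  37361372  896672928  com.starrocks.common.Pair"
_ROW = re.compile(r"^\s*(\d+):\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_histo(text: str) -> List[HistoRow]:
    """Parse jmap -histo output, skipping the header and separator lines."""
    rows = []
    for line in text.splitlines():
        m = _ROW.match(line)
        if m:
            rows.append(HistoRow(int(m.group(1)), int(m.group(2)), int(m.group(3)), m.group(4)))
    return rows


def bytes_by_prefix(rows: List[HistoRow], prefix: str) -> int:
    """Total bytes held by classes whose name starts with `prefix`."""
    return sum(r.size_bytes for r in rows if r.class_name.startswith(prefix))


sample = """\
 num     #instances         #bytes  class name
----------------------------------------------
   7:      37361372      896672928  com.starrocks.common.Pair
   8:      35647187      855532488  com.starrocks.common.util.Counter
"""
print(bytes_by_prefix(parse_histo(sample), "com.starrocks."))
```

Taking two dumps some time apart and comparing the per-class totals shows which classes keep growing; in the dump above, com.starrocks.common.util.Counter and com.starrocks.common.util.RuntimeProfile are the StarRocks-specific entries with very high instance counts.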
  1. SET TRANSACTION ISOLATION fails

syntax to use near 'ISOLATION'

2023-02-03 10:50:31,865 WARN (starrocks-mysql-nio-pool-330|36506) [ConnectProcessor.handleQuery():334] Process one query failed because. com.starrocks.common.AnalysisException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ISOLATION' at line 1
        at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:290) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:430) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:676) ~[starrocks-fe.jar:?]
        at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:55) ~[starrocks-fe.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_231]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_231]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_231]
  1. SELECT with LIMIT fails with java.lang.NullPointerException: null

2023-01-13 14:18:46,703 WARN (starrocks-mysql-nio-pool-25|285) [StmtExecutor.execute():524] execute Exception, sql select * from strock_ads_sg_bqbb_brand_detial_df limit 10

java.lang.NullPointerException: null
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalOlapScan(PlanFragmentBuilder.java:539) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalOlapScan(PlanFragmentBuilder.java:229) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.physical.PhysicalOlapScanOperator.accept(PhysicalOlapScanOperator.java:132) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visit(PlanFragmentBuilder.java:238) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalDistribution(PlanFragmentBuilder.java:1389) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalDistribution(PlanFragmentBuilder.java:229) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.physical.PhysicalDistributionOperator.accept(PhysicalDistributionOperator.java:44) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visit(PlanFragmentBuilder.java:238) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalLimit(PlanFragmentBuilder.java:2301) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalLimit(PlanFragmentBuilder.java:229) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.physical.PhysicalLimitOperator.accept(PhysicalLimitOperator.java:33) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visit(PlanFragmentBuilder.java:238) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder.createPhysicalPlan(PlanFragmentBuilder.java:163) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.createQueryPlan(StatementPlanner.java:115) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:65) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:39) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:373) ~[starrocks-fe.jar:?]
  1. Broker Load fails with: mismatched row count

Type:LOAD_RUN_FAIL; msg:mismatched row count: 512 vs 4096
*** Aborted at 1667808981 (unix time) try "date -d @1667808981" if you are using GNU date ***
PC: @          0x24014a3 strings::memcpy_inlined()
*** SIGSEGV (@0x0) received by PID 168026 (TID 0x7f3c1c420700) from PID 0; stack trace: ***
    @          0x507f842 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7f3e683c4630 (unknown)
    @          0x24014a3 strings::memcpy_inlined()
    @          0x361cee3 starrocks::ColumnVisitorMutableAdapter<>::visit()
    @          0x2c30b4f starrocks::vectorized::ColumnFactory<>::accept_mutable()
    @          0x361ddc8 starrocks::serde::ColumnArraySerde::deserialize()
    @          0x46ec15f starrocks::serde::ProtobufChunkDeserializer::deserialize()
    @          0x3e3be18 starrocks::DataStreamRecvr::SenderQueue::_deserialize_chunk()
    @          0x3e4257c starrocks::DataStreamRecvr::NonPipelineSenderQueue::add_chunks<>()
    @          0x3e3c282 starrocks::DataStreamRecvr::NonPipelineSenderQueue::add_chunks()
    @          0x3dbf073 starrocks::DataStreamRecvr::add_chunks()
    @          0x3d604b6 starrocks::DataStreamMgr::transmit_chunk()
    @          0x4729c3c starrocks::PInternalServiceImplBase<>::transmit_chunk()
    @          0x51b115e brpc::policy::ProcessRpcRequest()
    @          0x51a7b67 brpc::ProcessInputMessage()
    @          0x51a8a13 brpc::InputMessenger::OnNewMessages()
    @          0x524f75e brpc::Socket::ProcessEvent()
    @          0x515d6af bthread::TaskGroup::task_runner()
    @          0x52e60a1 bthread_make_fcontext
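
Every BE crash dump on this page begins with an "Aborted at <unix time>" line, and the log itself suggests date -d to decode it. Where GNU date is not at hand, the same conversion can be sketched in Python. The function name abort_time_utc is hypothetical, and the result is reported in UTC, whereas date -d prints local time.

```python
import re
from datetime import datetime, timezone
from typing import Optional

# Matches the first line of a glog crash dump, e.g. "*** Aborted at 1667808981 (unix time) ..."
_ABORT = re.compile(r"\*\*\* Aborted at (\d+) \(unix time\)")


def abort_time_utc(log_line: str) -> Optional[datetime]:
    """Return the crash time as a timezone-aware UTC datetime, or None if absent."""
    m = _ABORT.search(log_line)
    if m is None:
        return None
    return datetime.fromtimestamp(int(m.group(1)), tz=timezone.utc)


line = '*** Aborted at 1667808981 (unix time) try "date -d @1667808981" if you are using GNU date ***'
print(abort_time_utc(line).isoformat())  # 2022-11-07T08:16:21+00:00
```

Knowing the crash time makes it easier to line the dump up with the load job or query that was running in the FE and BE logs.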
  1. Crash during JSON load

*** Aborted at 1667192760 (unix time) try "date -d @1667192760" if you are using GNU date ***
PC: @          0x27460a1 starrocks::vectorized::JsonDocumentStreamParser::get_current()
*** SIGSEGV (@0x8) received by PID 12653 (TID 0x7fa1a94c1700) from PID 8; stack trace: ***
    @          0x3fa3ad2 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7fa2a1087630 (unknown)
    @          0x27460a1 starrocks::vectorized::JsonDocumentStreamParser::get_current()
    @          0x27455d7 starrocks::vectorized::JsonReader::_read_rows<>()
    @          0x27414d9 starrocks::vectorized::JsonReader::read_chunk()
    @          0x27416ec starrocks::vectorized::JsonScanner::get_next()
    @          0x272e5e0 starrocks::vectorized::FileScanNode::_scanner_scan()
    @          0x272ff4f starrocks::vectorized::FileScanNode::_scanner_worker()
    @          0x5a21410 execute_native_thread_routine
    @     0x7fa2a107fea5 start_thread
    @     0x7fa2a069ab0d __clone
  1. Query error or crash with View + Union + NULL

Mismatched row count

*** SIGSEGV (@0x0) received by PID 3659 (TID 0x7f17de2fb700) from PID 0; stack trace: ***
    @          0x3ff4972 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7f186229a630 (unknown)
    @     0x7f186190bc00 __memmove_ssse3_back
    @          0x1a26464 starrocks::vectorized::FixedLengthColumnBase<>::append()
    @          0x25224ca starrocks::vectorized::NullableColumn::append()
    @          0x251009b starrocks::vectorized::Chunk::append_safe()
    @          0x27453a7 starrocks::vectorized::ChunksSorterHeapSort::done()
    @          0x27419e5 starrocks::vectorized::ChunksSorter::finish()
    @          0x28ba860 starrocks::pipeline::PartitionSortSinkOperator::set_finishing()
    @          0x28def07 starrocks::pipeline::PipelineDriver::_mark_operator_finishing()
    @          0x28dff3b starrocks::pipeline::PipelineDriver::process()
    @          0x28d67dc starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
    @          0x21772c9 starrocks::ThreadPool::dispatch_thread()
    @          0x2172e7a starrocks::Thread::supervise_thread()
    @     0x7f1862292ea5 start_thread
    @     0x7f18618ad9fd __clone
    @                0x0 (unknown)
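Conditions that trigger the bug:
1: The query reads from a view.
2: The view contains a UNION.
3: One of the UNION's children contains a constant NULL.
4: That constant NULL is in the first child of the UNION.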
  1. bitmap_contains consumes a large amount of memory

terminate called after throwing an instance of 'query_id:b1e35703-a6de-11ed-adfa-78ac4489cf40, fragment_instance:b1e35703-a6de-11ed-adfa-78ac4489cf47
*** Aborted at 1675771249 (unix time) try "date -d @1675771249" if you are using GNU date ***
std::runtime_error'
  what():  failed memory alloc in constructor
PC: @     0x7fe1f947e387 __GI_raise
*** SIGABRT (@0xce40004d5a7) received by PID 316839 (TID 0x7fe13807f700) from PID 316839; stack trace: ***
    @          0x40e1c82 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7fe1f9f33630 (unknown)
    @     0x7fe1f947e387 __GI_raise
    @     0x7fe1f947fa78 __GI_abort
    @          0x5ae6dd2 __gnu_cxx::__verbose_terminate_handler()
    @          0x5ae5886 __cxxabiv1::__terminate()
    @          0x5ae58f1 std::terminate()
    @          0x5ae5a96 __cxa_rethrow
    @          0x1653704 _ZNSt8_Rb_treeIjSt4pairIKj7RoaringESt10_Select1stIS3_ESt4lessIjESaIS3_EE7_M_copyINS9_11_Alloc_nodeEEEPSt13_Rb_tree_nodeIS3_EPKSD_PSt18_Rb_tree_node_baseRT_.isra.0.cold
    @          0x212826b starrocks::BitmapValue::BitmapValue()
    @          0x258608b starrocks::vectorized::ObjectColumn<>::append()
    @          0x2586412 starrocks::vectorized::ObjectColumn<>::append_value_multiple_times()
    @          0x293d750 starrocks::pipeline::CrossJoinLeftOperator::_copy_joined_rows_with_index_base_build()
    @          0x293dfc2 starrocks::pipeline::CrossJoinLeftOperator::pull_chunk()
    @          0x2965983 starrocks::pipeline::PipelineDriver::process()
    @          0x295bfb6 starrocks::pipeline::GlobalDriverExecutor::_worker_thread()
    @          0x21c59f9 starrocks::ThreadPool::dispatch_thread()
    @          0x21c15aa starrocks::Thread::supervise_thread()
    @     0x7fe1f9f2bea5 start_thread
    @     0x7fe1f9546b0d __clone
    @                0x0 (unknown)
  1. SQL parse error: location

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'location' at line 3
  1. Low-cardinality plan rewrite causes Unknown error
java.lang.IllegalStateException: null
        at com.google.common.base.Preconditions.checkState(Preconditions.java:494) ~[spark-dpp-1.0.0.jar:?]
        at com.starrocks.sql.plan.ScalarOperatorToExpr$Formatter.visitVariableReference(ScalarOperatorToExpr.java:133) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.ScalarOperatorToExpr$Formatter.visitVariableReference(ScalarOperatorToExpr.java:112) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.scalar.ColumnRefOperator.accept(ColumnRefOperator.java:110) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.ScalarOperatorToExpr.buildExecExpression(ScalarOperatorToExpr.java:79) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.buildPartialTopNFragment(PlanFragmentBuilder.java:1749) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalTopN(PlanFragmentBuilder.java:1664) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalTopN(PlanFragmentBuilder.java:255) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.physical.PhysicalTopNOperator.accept(PhysicalTopNOperator.java:113) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visit(PlanFragmentBuilder.java:264) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalDecode(PlanFragmentBuilder.java:474) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visitPhysicalDecode(PlanFragmentBuilder.java:255) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.optimizer.operator.physical.PhysicalDecodeOperator.accept(PhysicalDecodeOperator.java:112) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder$PhysicalPlanTranslator.visit(PlanFragmentBuilder.java:264) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.plan.PlanFragmentBuilder.createPhysicalPlan(PlanFragmentBuilder.java:169) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.createQueryPlan(StatementPlanner.java:110) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:66) ~[starrocks-fe.jar:?]
        at com.starrocks.sql.StatementPlanner.plan(StatementPlanner.java:37) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.StmtExecutor.execute(StmtExecutor.java:373) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:313) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.dispatch(ConnectProcessor.java:430) ~[starrocks-fe.jar:?]
        at com.starrocks.qe.ConnectProcessor.processOnce(ConnectProcessor.java:676) ~[starrocks-fe.jar:?]
        at com.starrocks.mysql.nio.ReadListener.lambda$handleEvent$0(ReadListener.java:55) ~[starrocks-fe.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_201]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
  • Github Issue:

  • Github Fix PR:
  • Jira:

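  • Affected versions:
    • 2.4.0 ~ 2.4.3
    • 2.5.0 ~ 2.5.1
  • Fixed versions:
    • 2.4.4+
    • 2.5.2+
  • Temporary workaround:
    • set global cbo_enable_low_cardinality_optimize=false; (this degrades the performance of some SQL)
  • Cause:
    • A bug in the plan rewrite for low-cardinality queries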
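
The affected and fixed versions listed above can be turned into a quick check. This is a minimal sketch with hypothetical names (AFFECTED, parse_version, is_affected); it assumes plain "X.Y.Z" version strings and says nothing about release lines that the list does not mention.

```python
from typing import Tuple

# (first affected, first fixed) per release line, copied from the list above.
AFFECTED = {
    (2, 4): ((2, 4, 0), (2, 4, 4)),
    (2, 5): ((2, 5, 0), (2, 5, 2)),
}


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'X.Y.Z' into a tuple; suffixes such as '-rc1' are not handled."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    return major, minor, patch


def is_affected(version: str) -> bool:
    """True if the version falls inside one of the affected ranges above."""
    ver = parse_version(version)
    bounds = AFFECTED.get(ver[:2])
    if bounds is None:
        return False  # release line not listed on this page; status unknown
    first_bad, first_fixed = bounds
    return first_bad <= ver < first_fixed


for v in ("2.4.3", "2.4.4", "2.5.1", "2.5.2"):
    print(v, is_affected(v))
```

For an affected version, apply the temporary workaround above until the cluster is upgraded to a fixed release.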
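  1. The same SQL runs far slower with LIMIT than without it
    Trigger conditions:
  • The LIMIT value is small
  • The query has filter predicates
  • Few rows remain after filtering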
mysql> select lo_custkey from lineorder_flat where lo_revenue = 3322363 and lo_custkey = 2684693 limit 1;
+------------+
| lo_custkey |
+------------+
|    2684693 |
+------------+
1 row in set (13.25 sec)

mysql> select lo_custkey from lineorder_flat where lo_revenue = 3322363 and lo_custkey = 2684693;
+------------+
| lo_custkey |
+------------+
|    2684693 |
+------------+
1 row in set (0.08 sec)
  1. Librdkafka crash
*** Aborted at 1632307784 (unix time) try "date -d @1632307784" if you are using GNU date ***
PC: @ 0x7f2e6145a387 __GI_raise
*** SIGABRT (@0x27de9) received by PID 163305 (TID 0x7f2d8a2f0700) from PID 163305; stack trace: ***
@ 0x1fd12e2 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f2e61b03630 (unknown)
@ 0x7f2e6145a387 __GI_raise
@ 0x7f2e6145ba78 __GI_abort
@ 0x7f2e614531a6 __assert_fail_base
@ 0x7f2e61453252 __GI___assert_fail
@ 0x1d7c767 rd_kafka_broker_destroy_final
@ 0x1dfccb8 rd_kafka_metadata_refresh_topics
@ 0x1dfd04a rd_kafka_metadata_refresh_known_topics
@ 0x1d774c2 rd_kafka_broker_fail
@ 0x1d87bd4 rd_kafka_broker_op_serve
@ 0x1d89463 rd_kafka_broker_ops_io_serve
@ 0x1d89968 rd_kafka_broker_consumer_serve
@ 0x1d8b485 rd_kafka_broker_thread_main
@ 0x1df43a7 _thrd_wrapper_function
@ 0x7f2e61afbea5 start_thread
@ 0x7f2e615229fd __clone
  1. InfoSchemaDb id shouldn't larger than 10000
2023-02-16 00:00:34,021 ERROR (leaderCheckpointer|75) [Checkpoint.runAfterCatalogReady():106] Exception when generate new image file
java.lang.IllegalStateException: InfoSchemaDb id shouldn't larger than 10000, please restart your FE server
at com.google.common.base.Preconditions.checkState(Preconditions.java:510) ~[spark-dpp-1.0.0.jar:?]
at com.starrocks.server.LocalMetastore.loadCluster(LocalMetastore.java:3598) ~[starrocks-fe.jar:?]
at com.starrocks.server.GlobalStateMgr.loadImage(GlobalStateMgr.java:1131) ~[starrocks-fe.jar:?]
at com.starrocks.master.Checkpoint.runAfterCatalogReady(Checkpoint.java:87) [starrocks-fe.jar:?]
at com.starrocks.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:61) [starrocks-fe.jar:?]
at com.starrocks.common.util.Daemon.run(Daemon.java:115) [starrocks-fe.jar:?]
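  1. Dynamic partitioning: a large number of historical partitions are created, and the operation times out
    When dynamic_partition.start is set to a small value (that is, a large negative offset), a large number of historical partitions are created.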
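
To get a feel for the scale, the partition count can be estimated before the property is set. The sketch below is an assumption-laden illustration, not StarRocks code: partitions_in_window is a hypothetical helper, and it assumes one partition per time unit for every offset from dynamic_partition.start to dynamic_partition.end inclusive, with start negative and end positive.

```python
def partitions_in_window(start: int, end: int) -> int:
    """Estimate how many partitions cover the offsets start..end inclusive.

    Assumption: one partition per time unit, start < 0 <= current unit < end.
    """
    if start >= 0:
        raise ValueError("dynamic_partition.start is expected to be negative")
    if end <= 0:
        raise ValueError("dynamic_partition.end is expected to be positive")
    return end - start + 1


print(partitions_in_window(-3, 3))      # a one-week window around today
print(partitions_in_window(-3650, 3))   # roughly ten years of daily history
```

Under that assumption, a daily table with start = -3650 asks for several thousand partitions in one operation, which is consistent with the timeout described above.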
  1. Disk usage keeps growing after persistent index is enabled on a Primary Key table
  1. BE crashes while loading persistent index at startup

*** Aborted at 1676903379 (unix time) try "date -d @1676903379" if you are using GNU date ***
PC: @          0x310317a starrocks::PersistentIndex::_merge_compaction()
*** SIGFPE (@0x310317a) received by PID 30837 (TID 0x7f8d745fe700) from PID 51392890; stack trace: ***
    @          0x4825332 google::(anonymous namespace)::FailureSignalHandler()
    @     0x7fe755c565e0 (unknown)
    @          0x310317a starrocks::PersistentIndex::_merge_compaction()
    @          0x3104bc0 starrocks::PersistentIndex::_check_and_flush_l0()
    @          0x31070c0 starrocks::PersistentIndex::commit()
    @          0x2e8936e starrocks::PrimaryIndex::commit()
    @          0x2f5cbf5 starrocks::TabletUpdates::_apply_compaction_commit()
    @          0x2f5e52d starrocks::TabletUpdates::do_apply()
    @          0x3762b55 starrocks::ThreadPool::dispatch_thread()
    @          0x375df8a starrocks::Thread::supervise_thread()
    @     0x7fe755c4ee25 start_thread
    @     0x7fe75526e34d __clone
    @                0x0 (unknown)
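
The BE crash dumps on this page all carry a header line of the form "*** SIGxxx (@addr) received by PID n (TID 0x...)". When triaging many dumps, pulling the signal and thread out of that line is a useful first step. A minimal sketch follows; CrashHeader and parse_crash_header are hypothetical names, and the pattern is derived only from the headers shown on this page.

```python
import re
from typing import NamedTuple, Optional


class CrashHeader(NamedTuple):
    signal: str    # e.g. SIGILL, SIGSEGV, SIGABRT, SIGFPE
    address: str   # the value after "@" in the header
    pid: int
    tid: str


_HEADER = re.compile(
    r"\*\*\* (SIG[A-Z]+) \(@(0x[0-9a-fA-F]+)\) received by PID (\d+) \(TID (0x[0-9a-fA-F]+)\)"
)


def parse_crash_header(line: str) -> Optional[CrashHeader]:
    """Extract signal, address, PID and TID from a crash header line, if present."""
    m = _HEADER.search(line)
    if m is None:
        return None
    return CrashHeader(m.group(1), m.group(2), int(m.group(3)), m.group(4))


line = "*** SIGFPE (@0x310317a) received by PID 30837 (TID 0x7f8d745fe700) from PID 51392890; stack trace: ***"
print(parse_crash_header(line))
```

Grouping dumps by signal and by the first starrocks:: frame below the header is a simple way to match a new crash against the entries listed here.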