BE replicas cannot sync; logs show "version not found" and "has missed version"

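[Details] Detailed problem description
The BE replicas cannot recover. The cluster has three nodes, and the node whose replicas cannot recover is effectively unusable: requests that reach it trigger a replay.
The replicas have been unrecoverable for half a month now. Even after the write jobs were stopped, the tablets still have not fully recovered.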
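[Background] What operations were performed?
This node went down earlier, which caused an FE master switchover. It has been in this state since it was restarted.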
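[Business impact]
[StarRocks version] 2.3
[Cluster size] e.g. 3 FE (3 followers) + 3 BE (FE and BE co-located)
[Machine info] CPU virtual cores / memory / NIC, e.g. 16C/64G/Gigabit
[Contact] So that we can reach you promptly for log information while resolving the issue, please add your contact details, e.g. community group 12-meifj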
BE info log
67576],[1067577-1067577],[1067578-1067578],[1067579-1067579],

W0128 21:56:50.132962 15849 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1163421

I0128 21:56:50.133898 15747 pipeline_driver_executor.cpp:217] [Driver] Succeed to report exec state: fragment_instance_id=b9021fc4-9f99-11ed-8934-0aef383f6aa6

W0128 21:56:50.133929 15849 tablet.cpp:500] version not found. tablet_id: 715455, version: 1163421

W0128 21:56:50.133942 15849 tablet.cpp:849] 715455.49929927.1a47e7808c82dba2-17e927fb3f0302b4 has 259956 missed version:[902465-902465],[902466-902466],[902467-902467],[902468-902468],[902469-902469],[902470-902470],[902471-902471],[902472-902472],[902473-902473],[902474-902474],

I0128 21:56:50.168264 15915 txn_manager.cpp:304] rollback transaction from engine successfully. partition_id: 714345, txn_id: 24094843, tablet: 714440.49929927.234b3517365c38a0-248d66fd0862b3bc

I0128 21:56:50.168296 15917 txn_manager.cpp:304] rollback transaction from engine successfully. partition_id: 714345, txn_id: 24094843, tablet: 714444.49929927.294c83a8f0727561-b272e90f4edb2b8b

I0128 21:56:50.168332 15920 txn_manager.cpp:304] rollback transaction from engine successfully. partition_id: 714345, txn_id: 24094843, tablet: 714448.49929927.4748d7a46d3b09be-331dc42ed17b1db6

W0128 21:56:50.168344 15854 tablet.cpp:500] version not found. tablet_id: 714688, version: 1365403

I0128 21:56:50.168356 15918 txn_manager.cpp:304] rollback transaction from engine successfully. partition_id: 714345, txn_id: 24094843, tablet: 714452.49929927.fd4b08d09aa5373c-d7e9383ecf973d93

W0128 21:56:50.168357 15854 tablet.cpp:849] 714688.49929927.ac4a76d887777579-5af0433d0ddff4b0 has 35615 missed version:[1329787-1329787],[1329788-1329788],[1329789-1329789],[1329790-1329790],[1329791-1329791],[1329792-1329792],[1329793-1329793],[1329794-1329794],[1329795-1329795],[1329796-1329796],

W0128 21:56:50.168376 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

I0128 21:56:50.168385 15915 txn_manager.cpp:304] rollback transaction from engine successfully. partition_id: 714345, txn_id: 24094843, tablet: 714456.49929927.a74ca362ee05ce01-c9ff8ca825d455be

I0128 21:56:50.168422 15917 txn_manager.cpp:304] rollback transaction from engine successfully. partition_id: 714345, txn_id: 24094843, tablet: 714460.49929927.914ab914a7c91afe-3880ca5519c7aaa2

W0128 21:56:50.168438 15917 load_channel.cpp:63] Fail to open index 714347 of load 764c4e6677eaba54-7fab01cfacffb8ab: Service unavailable: Too many versions. tablet_id: 714492, version_count: 1008, limit: 1000

/root/starrocks/be/src/storage/delta_writer.cpp:20 writer->_init()

/root/starrocks/be/src/runtime/local_tablets_channel.cpp:314 res.status()

/root/starrocks/be/src/runtime/local_tablets_channel.cpp:58 _open_all_writers(params)

W0128 21:56:50.168646 15854 tablet.cpp:500] version not found. tablet_id: 714656, version: 1365403

W0128 21:56:50.168661 15854 tablet.cpp:849] 714656.49929927.914e7384f46b91c6-36b00d9c0e0cb190 has 71243 missed version:[1294158-1294158],[1294159-1294159],[1294160-1294160],[1294161-1294161],[1294162-1294162],[1294163-1294163],[1294164-1294164],[1294165-1294165],[1294166-1294166],[1294167-1294167],

W0128 21:56:50.168686 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.168754 15854 tablet.cpp:500] version not found. tablet_id: 714624, version: 1365403

W0128 21:56:50.168762 15854 tablet.cpp:849] 714624.49929927.0544ab14e0135c83-3a0dca3792403986 has 26940 missed version:[1338463-1338463],[1338464-1338464],[1338465-1338465],[1338466-1338466],[1338467-1338467],[1338468-1338468],[1338469-1338469],[1338470-1338470],[1338471-1338471],[1338472-1338472],

W0128 21:56:50.168776 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.169039 15854 tablet.cpp:500] version not found. tablet_id: 714592, version: 1365403

W0128 21:56:50.169051 15854 tablet.cpp:849] 714592.49929927.5d41207a23314331-23ecb166e24e6f96 has 89686 missed version:[1275714-1275714],[1275715-1275715],[1275716-1275716],[1275717-1275717],[1275718-1275718],[1275719-1275719],[1275720-1275720],[1275721-1275721],[1275722-1275722],[1275723-1275723],

W0128 21:56:50.169063 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.169131 15854 tablet.cpp:500] version not found. tablet_id: 714560, version: 1365403

W0128 21:56:50.169140 15854 tablet.cpp:849] 714560.49929927.9b4f9bd9aa38992c-37c11d5814bcd8a4 has 27762 missed version:[1337641-1337641],[1337642-1337642],[1337643-1337643],[1337644-1337644],[1337645-1337645],[1337646-1337646],[1337647-1337647],[1337648-1337648],[1337649-1337649],[1337650-1337650],

W0128 21:56:50.169150 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.169629 15854 tablet.cpp:500] version not found. tablet_id: 714528, version: 1365403

W0128 21:56:50.169641 15854 tablet.cpp:849] 714528.49929927.6844d15b647130da-d2e296e9c04031b9 has 131287 missed version:[1234112-1234112],[1234113-1234113],[1234114-1234114],[1234115-1234115],[1234116-1234116],[1234117-1234117],[1234118-1234118],[1234119-1234119],[1234120-1234120],[1234121-1234121],

W0128 21:56:50.169651 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.169728 15854 tablet.cpp:500] version not found. tablet_id: 717632, version: 198179

W0128 21:56:50.169739 15854 tablet.cpp:849] 717632.73669397.ab48d9070daaf71e-4e57e8f3703c6ab3 has 22987 missed version:[175183-175183],[175184-175184],[175185-175185],[175186-175186],[175187-175187],[175188-175188],[175189-175189],[175190-175190],[175191-175191],[175192-175192],

W0128 21:56:50.169747 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.169785 15854 tablet.cpp:500] version not found. tablet_id: 717568, version: 198179

W0128 21:56:50.169795 15854 tablet.cpp:849] 717568.73669397.e3433fe8544dc37a-fd97aad8205f989d has 9066 missed version:[189111-189111],[189112-189112],[189113-189113],[189114-189114],[189115-189115],[189116-189116],[189117-189117],[189118-189118],[189119-189119],[189120-189120],

W0128 21:56:50.169804 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.170500 15854 tablet.cpp:500] version not found. tablet_id: 714496, version: 1365403

W0128 21:56:50.170511 15854 tablet.cpp:849] 714496.49929927.344a5363956bb60b-522a084da3541382 has 241279 missed version:[1124117-1124117],[1124118-1124118],[1124119-1124119],[1124120-1124120],[1124121-1124121],[1124122-1124122],[1124123-1124123],[1124124-1124124],[1124125-1124125],[1124126-1124126],

W0128 21:56:50.170523 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.170557 15854 tablet.cpp:500] version not found. tablet_id: 717536, version: 198179

W0128 21:56:50.170562 15854 tablet.cpp:849] 717536.73669397.104c0da5a87dc9f0-f19eac60d55deea9 has 9066 missed version:[189111-189111],[189112-189112],[189113-189113],[189114-189114],[189115-189115],[189116-189116],[189117-189117],[189118-189118],[189119-189119],[189120-189120],

W0128 21:56:50.170570 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.171119 15854 tablet.cpp:500] version not found. tablet_id: 714464, version: 1365403

W0128 21:56:50.171131 15854 tablet.cpp:849] 714464.49929927.1745768192c77a2b-00417119c424a0aa has 165725 missed version:[1199674-1199674],[1199675-1199675],[1199676-1199676],[1199677-1199677],[1199678-1199678],[1199679-1199679],[1199680-1199680],[1199681-1199681],[1199682-1199682],[1199683-1199683],

W0128 21:56:50.171142 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.171263 15854 tablet.cpp:500] version not found. tablet_id: 715200, version: 1365403

W0128 21:56:50.171272 15854 tablet.cpp:849] 715200.49929927.01494678607e4b61-d72ef77d68b2ec95 has 44017 missed version:[1321385-1321385],[1321386-1321386],[1321387-1321387],[1321388-1321388],[1321389-1321389],[1321390-1321390],[1321391-1321391],[1321392-1321392],[1321393-1321393],[1321394-1321394],

W0128 21:56:50.171281 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.171342 15854 tablet.cpp:500] version not found. tablet_id: 717504, version: 198179

W0128 21:56:50.171350 15854 tablet.cpp:849] 717504.73669397.2e49b5fcd460afd4-82f5110b0d97b3b3 has 22557 missed version:[175613-175613],[175614-175614],[175615-175615],[175616-175616],[175617-175617],[175618-175618],[175619-175619],[175620-175620],[175621-175621],[175622-175622],

W0128 21:56:50.171360 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.171932 15854 tablet.cpp:500] version not found. tablet_id: 714432, version: 1365403

W0128 21:56:50.171947 15854 tablet.cpp:849] 714432.49929927.ab4d9f47a7148ca3-934c2d3bd11fc1a4 has 171859 missed version:[1193540-1193540],[1193541-1193541],[1193542-1193542],[1193543-1193543],[1193544-1193544],[1193545-1193545],[1193546-1193546],[1193547-1193547],[1193548-1193548],[1193549-1193549],

W0128 21:56:50.171958 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.171981 15854 tablet.cpp:500] version not found. tablet_id: 717472, version: 198179

W0128 21:56:50.171988 15854 tablet.cpp:849] 717472.73669397.5e491545cb3260eb-5b1eec403a2667b4 has 7877 missed version:[190301-190301],[190302-190302],[190303-190303],[190304-190304],[190305-190305],[190306-190306],[190307-190307],[190308-190308],[190309-190309],[190310-190310],

W0128 21:56:50.171998 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.172062 15854 tablet.cpp:500] version not found. tablet_id: 717344, version: 198179

W0128 21:56:50.172080 15854 tablet.cpp:849] 717344.73669397.884c04e823f8d699-77265be6f7a6f694 has 27433 missed version:[170736-170736],[170737-170737],[170738-170738],[170739-170739],[170740-170740],[170741-170741],[170742-170742],[170743-170743],[170744-170744],[170745-170745],

W0128 21:56:50.172094 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.172377 15854 tablet.cpp:500] version not found. tablet_id: 714400, version: 1365403

W0128 21:56:50.172387 15854 tablet.cpp:849] 714400.49929927.8a48022cdd68e0ff-6a788540e5316abe has 107485 missed version:[1257915-1257915],[1257916-1257916],[1257917-1257917],[1257918-1257918],[1257919-1257919],[1257920-1257920],[1257921-1257921],[1257922-1257922],[1257923-1257923],[1257924-1257924],

W0128 21:56:50.172398 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.172631 15854 tablet.cpp:500] version not found. tablet_id: 715168, version: 1365403

W0128 21:56:50.172641 15854 tablet.cpp:849] 715168.49929927.554bd892b4f0c3f9-ae2f60ed7a577296 has 70414 missed version:[1294987-1294987],[1294988-1294988],[1294989-1294989],[1294990-1294990],[1294991-1294991],[1294992-1294992],[1294993-1294993],[1294994-1294994],[1294995-1294995],[1294996-1294996],

W0128 21:56:50.172650 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.172894 15854 tablet.cpp:500] version not found. tablet_id: 715136, version: 1365403

W0128 21:56:50.172904 15854 tablet.cpp:849] 715136.49929927.0a4802d1a8215595-49c2f4b87d5ae8b9 has 67673 missed version:[1297728-1297728],[1297729-1297729],[1297730-1297730],[1297731-1297731],[1297732-1297732],[1297733-1297733],[1297734-1297734],[1297735-1297735],[1297736-1297736],[1297737-1297737],

W0128 21:56:50.172915 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.172974 15854 tablet.cpp:500] version not found. tablet_id: 717440, version: 198179

W0128 21:56:50.172981 15854 tablet.cpp:849] 717440.73669397.c34dc0bbd46cb9aa-4d3f82e1fba534af has 19821 missed version:[178349-178349],[178350-178350],[178351-178351],[178352-178352],[178353-178353],[178354-178354],[178355-178355],[178356-178356],[178357-178357],[178358-178358],

W0128 21:56:50.172991 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.173097 15854 tablet.cpp:500] version not found. tablet_id: 717312, version: 198179

W0128 21:56:50.173106 15854 tablet.cpp:849] 717312.73669397.3048089bd08903e0-311b1b990f06cfa5 has 39050 missed version:[159116-159116],[159117-159117],[159118-159118],[159119-159119],[159120-159120],[159121-159121],[159122-159122],[159123-159123],[159124-159124],[159125-159125],

W0128 21:56:50.173115 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.173182 15854 tablet.cpp:500] version not found. tablet_id: 714368, version: 1365403

W0128 21:56:50.173189 15854 tablet.cpp:849] 714368.49929927.c6437a4d40b5a6b4-63de44e9b3086092 has 26942 missed version:[1338461-1338461],[1338462-1338462],[1338463-1338463],[1338464-1338464],[1338465-1338465],[1338466-1338466],[1338467-1338467],[1338468-1338468],[1338469-1338469],[1338470-1338470],

W0128 21:56:50.173199 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.174777 15854 tablet.cpp:500] version not found. tablet_id: 715104, version: 1365403

W0128 21:56:50.174790 15854 tablet.cpp:849] 715104.49929927.b64f4502ad7897ec-5c5c51f54d8febb9 has 358408 missed version:[1006984-1006984],[1006985-1006985],[1006986-1006986],[1006987-1006987],[1006988-1006988],[1006989-1006989],[1006990-1006990],[1006991-1006991],[1006992-1006992],[1006993-1006993],

W0128 21:56:50.174800 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.174867 15854 tablet.cpp:500] version not found. tablet_id: 717408, version: 198179

W0128 21:56:50.174875 15854 tablet.cpp:849] 717408.73669397.324dd2a58edb6760-16d1938675589cb3 has 24690 missed version:[173480-173480],[173481-173481],[173482-173482],[173483-173483],[173484-173484],[173485-173485],[173486-173486],[173487-173487],[173488-173488],[173489-173489],

W0128 21:56:50.174896 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.174914 15854 tablet.cpp:500] version not found. tablet_id: 717280, version: 198179

W0128 21:56:50.174921 15854 tablet.cpp:849] 717280.73669397.06442af8e48f8ab2-7b2e984c5fb7479f has 5218 missed version:[192960-192960],[192961-192961],[192962-192962],[192963-192963],[192964-192964],[192965-192965],[192966-192966],[192967-192967],[192968-192968],[192969-192969],

W0128 21:56:50.174928 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.175164 15854 tablet.cpp:500] version not found. tablet_id: 715072, version: 1365403

W0128 21:56:50.175173 15854 tablet.cpp:849] 715072.49929927.2e4edf64240a8429-1e620251e7e9748e has 69529 missed version:[1295872-1295872],[1295873-1295873],[1295874-1295874],[1295875-1295875],[1295876-1295876],[1295877-1295877],[1295878-1295878],[1295879-1295879],[1295880-1295880],[1295881-1295881],

W0128 21:56:50.175180 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.175209 15854 tablet.cpp:500] version not found. tablet_id: 717376, version: 198179

W0128 21:56:50.175216 15854 tablet.cpp:849] 717376.73669397.5c422bca9cbcd2d8-65466b00b80d759c has 8250 missed version:[189928-189928],[189929-189929],[189930-189930],[189931-189931],[189932-189932],[189933-189933],[189934-189934],[189935-189935],[189936-189936],[189937-189937],

W0128 21:56:50.175226 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.175295 15854 tablet.cpp:500] version not found. tablet_id: 717248, version: 198179

W0128 21:56:50.175302 15854 tablet.cpp:849] 717248.73669397.db4b49695699abed-4f46844045b6a792 has 30882 missed version:[167286-167286],[167287-167287],[167288-167288],[167289-167289],[167290-167290],[167291-167291],[167292-167292],[167293-167293],[167294-167294],[167295-167295],

W0128 21:56:50.175354 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.175813 15854 tablet.cpp:500] version not found. tablet_id: 717664, version: 198179

W0128 21:56:50.175827 15854 tablet.cpp:849] 717664.73669397.4d4067b0f232c4b1-754206d6bbaafea9 has 51201 missed version:[146963-146963],[146964-146964],[146965-146965],[146966-146966],[146967-146967],[146968-146968],[146969-146969],[146970-146970],[146971-146971],[146972-146972],

W0128 21:56:50.175838 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.175906 15854 tablet.cpp:500] version not found. tablet_id: 717600, version: 198179

W0128 21:56:50.175915 15854 tablet.cpp:849] 717600.73669397.95422ed2d8d5c417-1a40ed92616c9fa3 has 26619 missed version:[171550-171550],[171551-171551],[171552-171552],[171553-171553],[171554-171554],[171555-171555],[171556-171556],[171557-171557],[171558-171558],[171559-171559],

W0128 21:56:50.175925 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.176031 15854 tablet.cpp:500] version not found. tablet_id: 717728, version: 198179

W0128 21:56:50.176040 15854 tablet.cpp:849] 717728.73669397.2749704e9cc51430-eef48c72accc0280 has 39060 missed version:[159106-159106],[159107-159107],[159108-159108],[159109-159109],[159110-159110],[159111-159111],[159112-159112],[159113-159113],[159114-159114],[159115-159115],

W0128 21:56:50.176051 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-198179

W0128 21:56:50.176084 15854 tablet.cpp:500] version not found. tablet_id: 717696, version: 198179

W0128 21:56:50.176090 15854 tablet.cpp:849] 717696.73669397.1d475081ea6972d5-5915c3bf92b78d9b has 10525 missed version:[187652-187652],[187653-187653],[187654-187654],[187655-187655],[187656-187656],[187657-187657],[187658-187658],[187659-187659],[187660-187660],[187661-187661],

W0128 21:56:50.176108 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.176215 15854 tablet.cpp:500] version not found. tablet_id: 715232, version: 1365403

W0128 21:56:50.176225 15854 tablet.cpp:849] 715232.49929927.6a4001af161cae9f-92538fad9589a9b6 has 37453 missed version:[1327949-1327949],[1327950-1327950],[1327951-1327951],[1327952-1327952],[1327953-1327953],[1327954-1327954],[1327955-1327955],[1327956-1327956],[1327957-1327957],[1327958-1327958],

W0128 21:56:50.176241 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.176813 15854 tablet.cpp:500] version not found. tablet_id: 715328, version: 1365403

W0128 21:56:50.176826 15854 tablet.cpp:849] 715328.49929927.e648dffbddf578e1-6496e0d133583dae has 174596 missed version:[1190803-1190803],[1190804-1190804],[1190805-1190805],[1190806-1190806],[1190807-1190807],[1190808-1190808],[1190809-1190809],[1190810-1190810],[1190811-1190811],[1190812-1190812],

W0128 21:56:50.176842 15854 version_graph.cpp:365] fail to find path in version_graph. spec_version: 0-1365403

W0128 21:56:50.177084 15854 tablet.cpp:500] version not found. tablet_id: 715264, version: 1365403

FE info log:
2023-01-28 22:04:17,894 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1522] replay a visible transaction TransactionState. txn_id: 24096131, label: fa0811bc-5858-4604-b4f8-f0fdb2ef5632, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: VISIBLE, error replicas num: 36, replica ids: 772555,772619,772495,772559,772623, prepare time: 1674972257301, commit time: 1674972257706, finish time: 1674972257885, publish cost: 179ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@1e7c7116

2023-01-28 22:04:17,897 INFO (replayer|76) [GlobalStateMgr.replayJournal():1711] replayed journal id is 62546111, replay to journal id is 62546112

2023-01-28 22:04:17,899 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1522] replay a visible transaction TransactionState. txn_id: 24096132, label: 1b3d18a5-8b12-467b-9b93-d8603d977230, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: VISIBLE, error replicas num: 12, replica ids: 863455,863471,863487,863451,863467, prepare time: 1674972257314, commit time: 1674972257748, finish time: 1674972257890, publish cost: 142ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@47d2e702

2023-01-28 22:04:17,904 INFO (replayer|76) [GlobalStateMgr.replayJournal():1711] replayed journal id is 62546112, replay to journal id is 62546113

2023-01-28 22:04:17,905 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1522] replay a visible transaction TransactionState. txn_id: 24096136, label: b0d57341-44b0-4479-be32-834470ae0413, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: VISIBLE, error replicas num: 12, replica ids: 863455,863471,863487,863451,863467, prepare time: 1674972257350, commit time: 1674972257757, finish time: 1674972257896, publish cost: 139ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@28a3750d

2023-01-28 22:04:17,912 INFO (replayer|76) [GlobalStateMgr.replayJournal():1711] replayed journal id is 62546113, replay to journal id is 62546114

2023-01-28 22:04:17,913 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1522] replay a visible transaction TransactionState. txn_id: 24096137, label: 96a0e103-c15a-49a8-ab42-9b615562a9dc, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: VISIBLE, error replicas num: 12, replica ids: 863455,863471,863487,863451,863467, prepare time: 1674972257370, commit time: 1674972257762, finish time: 1674972257902, publish cost: 140ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@935606c

2023-01-28 22:04:17,921 INFO (replayer|76) [GlobalStateMgr.replayJournal():1711] replayed journal id is 62546114, replay to journal id is 62546115

2023-01-28 22:04:17,922 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1522] replay a visible transaction TransactionState. txn_id: 24096133, label: 1a878ee8-13a4-412e-a023-a6f273d7253a, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: VISIBLE, error replicas num: 12, replica ids: 863455,863471,863487,863451,863467, prepare time: 1674972257325, commit time: 1674972257768, finish time: 1674972257911, publish cost: 143ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@3d73bd5e

2023-01-28 22:04:17,936 INFO (replayer|76) [GlobalStateMgr.replayJournal():1711] replayed journal id is 62546115, replay to journal id is 62546116

2023-01-28 22:04:17,938 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1522] replay a visible transaction TransactionState. txn_id: 24096135, label: f7ab7e99-9da5-44b9-8a49-145d3ecc5907, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: VISIBLE, error replicas num: 12, replica ids: 863455,863471,863487,863451,863467, prepare time: 1674972257341, commit time: 1674972257794, finish time: 1674972257919, publish cost: 125ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@3d7f037f

2023-01-28 22:04:17,957 INFO (replayer|76) [GlobalStateMgr.replayJournal():1711] replayed journal id is 62546116, replay to journal id is 62546117

2023-01-28 22:04:17,958 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1522] replay a visible transaction TransactionState. txn_id: 24096134, label: ef952026-c179-4662-8bde-0aacdd0cfe30, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: VISIBLE, error replicas num: 12, replica ids: 863455,863471,863487,863451,863467, prepare time: 1674972257332, commit time: 1674972257811, finish time: 1674972257939, publish cost: 128ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@4e0a5010

2023-01-28 22:04:19,244 INFO (replayer|76) [GlobalStateMgr.replayJournal():1711] replayed journal id is 62546117, replay to journal id is 62546118

2023-01-28 22:04:19,247 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1519] replay a committed transaction TransactionState. txn_id: 24096138, label: e31b266d-21c8-4c02-b862-d7fd9ba982ae, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: COMMITTED, error replicas num: 12, replica ids: 863406,863422,863438,863402,863418, prepare time: 1674972259213, commit time: 1674972259238, finish time: -1, publish cost: -1674972259239ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@4db15def

2023-01-28 22:04:19,259 INFO (replayer|76) [GlobalStateMgr.replayJournal():1711] replayed journal id is 62546118, replay to journal id is 62546119

2023-01-28 22:04:19,261 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1522] replay a visible transaction TransactionState. txn_id: 24096138, label: e31b266d-21c8-4c02-b862-d7fd9ba982ae, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: VISIBLE, error replicas num: 12, replica ids: 863406,863422,863438,863402,863418, prepare time: 1674972259213, commit time: 1674972259238, finish time: 1674972259253, publish cost: 15ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@2328c51a

2023-01-28 22:04:19,762 INFO (replayer|76) [GlobalStateMgr.replayJournal():1711] replayed journal id is 62546119, replay to journal id is 62546120

2023-01-28 22:04:19,763 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1519] replay a committed transaction TransactionState. txn_id: 24096139, label: 57dacc75-1cf0-4d55-aaaa-69c7920d663f, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: COMMITTED, error replicas num: 0, replica ids: , prepare time: 1674972259737, commit time: 1674972259755, finish time: -1, publish cost: -1674972259756ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@6ab42fe2

2023-01-28 22:04:19,777 INFO (replayer|76) [GlobalStateMgr.replayJournal():1711] replayed journal id is 62546120, replay to journal id is 62546121

2023-01-28 22:04:19,778 INFO (replayer|76) [DatabaseTransactionMgr.replayUpsertTransactionState():1522] replay a visible transaction TransactionState. txn_id: 24096139, label: 57dacc75-1cf0-4d55-aaaa-69c7920d663f, db id: 10879, table id list: , callback id: -1, coordinator: BE: 10.111.13.204, transaction status: VISIBLE, error replicas num: 0, replica ids: , prepare time: 1674972259737, commit time: 1674972259755, finish time: 1674972259769, publish cost: 14ms, reason: attachment: com.starrocks.load.loadv2.ManualLoadTxnCommitAttachment@4465acc6

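[Attachments]

  • fe.log / be.INFO / relevant screenshots
  • Slow queries:
    • Profile information
    • Parallelism: show variables like '%parallel_fragment_exec_instance_num%';
    • Whether pipeline is enabled: show variables like '%pipeline%';
    • Screenshots of BE node CPU and memory usage
  • Query errors:
  • BE crash
    • be.out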

Additional monitoring screenshots



be.warn

fe.warn

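After the BE had been down for a while, its replicas fell too many versions behind. Clone scheduling is too slow and loading is too fast, so the replicas never catch up, which is why this log message is printed so heavily. The problem can be mitigated by raising the number of concurrent clone tasks or by manually forcing the lagging replicas to be set as bad. Later versions will solve it for good by performing a full clone when a replica is too many versions behind.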
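
To decide which replicas are worth setting as bad by hand, it may help to rank the tablets by how far behind they are. The following is a minimal sketch that assumes the `has N missed version` warnings look like the ones in the BE log above. The helper names (`parse_missed_versions`, `set_bad_statements`), the backend ID `10003`, and the threshold are hypothetical placeholders, not values from this cluster; the real backend ID would come from `SHOW BACKENDS`. The generated `ADMIN SET REPLICA STATUS` statements should be checked against the documentation for your StarRocks version before running them.

```python
import re
from typing import Dict, List, NamedTuple

# Matches BE warnings of the form seen in the log above:
#   tablet.cpp:849] 715455.49929927.<uid> has 259956 missed version:[...]
MISSED_RE = re.compile(r"\] (\d+)\.\d+\.\S+ has (\d+) missed version:")

# Two warning lines copied (and shortened) from the BE log, plus one unrelated line.
SAMPLE_LOG = """\
W0128 21:56:50.133942 15849 tablet.cpp:849] 715455.49929927.1a47e7808c82dba2-17e927fb3f0302b4 has 259956 missed version:[902465-902465],
W0128 21:56:50.168344 15854 tablet.cpp:500] version not found. tablet_id: 714688, version: 1365403
W0128 21:56:50.168357 15854 tablet.cpp:849] 714688.49929927.ac4a76d887777579-5af0433d0ddff4b0 has 35615 missed version:[1329787-1329787],
"""


class Lag(NamedTuple):
    tablet_id: int
    missed: int


def parse_missed_versions(lines: List[str]) -> List[Lag]:
    """Return one entry per tablet, sorted by missed-version count, largest first."""
    worst: Dict[int, int] = {}
    for line in lines:
        match = MISSED_RE.search(line)
        if match is None:
            continue  # not a "missed version" warning
        tablet_id, count = int(match.group(1)), int(match.group(2))
        # The same tablet is logged repeatedly; keep the largest count seen.
        worst[tablet_id] = max(worst.get(tablet_id, 0), count)
    lags = [Lag(tablet_id, count) for tablet_id, count in worst.items()]
    return sorted(lags, key=lambda lag: lag.missed, reverse=True)


def set_bad_statements(lags: List[Lag], backend_id: int, min_missed: int) -> List[str]:
    """Build one ADMIN SET REPLICA STATUS statement per tablet at or above the threshold.

    backend_id is a placeholder supplied by the caller; it is not derived from the log.
    """
    return [
        f'ADMIN SET REPLICA STATUS PROPERTIES("tablet_id" = "{lag.tablet_id}", '
        f'"backend_id" = "{backend_id}", "status" = "bad");'
        for lag in lags
        if lag.missed >= min_missed
    ]


if __name__ == "__main__":
    lags = parse_missed_versions(SAMPLE_LOG.splitlines())
    for lag in lags:
        print(f"tablet {lag.tablet_id}: {lag.missed} missed versions")
    # 10003 and 100000 are illustrative values only.
    for statement in set_bad_statements(lags, backend_id=10003, min_missed=100000):
        print(statement)
```

In practice the input would be the lines of the real BE log file rather than `SAMPLE_LOG`. The script only prints the statements, so they can be reviewed and run one at a time.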