FE nodes frequently OOM

arter · 2024-05-28 10:06

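[Details] During writes, thrift-server-pool spiked to its maximum, and then the FE frequently ran out of memory. The write workload consists of Flink jobs and insert overwrite select * from a Hive table; the latter is most likely what triggers the OOM. The FE OOMs with the StarRocks JVM Xmx set to 32 GB, and it still OOMs at 64 GB. From my own analysis, this is most likely a memory leak.
[Background] I tried versions 3.1.11, 3.2.6, and 3.2.7; the problem occurs on all of them.
[Business impact]
[Shared-data (storage-compute separation)] No
[StarRocks version] 3.1.11, 3.2.6, 3.2.7
[Cluster size] 3 FE + 80 BE
[Machine info] 40C192G, 12 × 1 TB SSD
[Contact] Forum
[Attachments]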

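fe.log/be.INFO/relevant screenshots
Slow query:
- Profile information: obtain the Profile and use it to analyze the query bottleneck
- Parallelism: 1
- Pipeline enabled: yes
- Screenshots of BE node CPU and memory usage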
  

The heap dump is too large (over 32 GB), so I ran ParseHeapDump.sh heapdump.hprof org.eclipse.mat.api:suspects org.eclipse.mat.api:overview org.eclipse.mat.api:top_components. The results are attached:
- java_pid40682_Leak_Suspects.zip (224.5 KB)
- java_pid40682_System_Overview.zip (135.7 KB)
- java_pid40682_Top_Components.zip (330.5 KB)

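Screenshots from the MAT analysis:

Following this large object, I found that every thrift-server-pool thread holds its own identical copy of the OlapTable data, which should be reused instead.
Approximate content of the object: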
{
"comment": "",
"indexIdToMeta": {
"12219": {
"schemaHash": 1730590400,
"storageType": "COLUMN",
"keysType": "DUP_KEYS",
"schemaId": 12219,
"indexId": 12219,
"isColocateMVIndex": false,
"shortKeyColumnCount": 3,
"dbId": 0,
"schemaVersion": 0,
"schema": [
{
"comment": "",
"isAutoIncrement": false,
"stats": {
"maxSize": -1,
"numDistinctValues": -1,
"avgSerializedSize": -1.0,
"numNulls": -1
...
"lastFailedVersion": -1,
"lastSuccessVersion": 2,
"backendId": 10011,
"rowCount": 211135,
"state": "DECOMMISSION",
"version": 2,
"dataSize": 20133914,
"id": 2721133,
"minReadableVersion": 0
},
{
"lastFailedVersion": -1,
"lastSuccessVersion": 2,
"backendId": 10332,
"rowCount": 211135,
"state": "NORMAL",
"version": 2,
"dataSize": 20136612,
"id": 2730970,
"minReadableVersion": 1
}
],
"clazz": "LocalTablet",
"signature": -1,
"id": 2721130,
"checkedVersion": 2
},
There are hundreds of thousands of copies of data like the above, which consumes a large amount of memory.
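
To make the duplication concrete, here is a minimal sketch of what "one identical copy per worker thread" means for the heap, compared with sharing a single snapshot. TabletInfo and PerThreadCopyDemo are hypothetical stand-ins written for this illustration, not StarRocks classes, and the hand-written deepCopy only imitates what a serialization round trip would produce.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for one tablet entry; not a StarRocks class.
final class TabletInfo {
    final long id;
    final long rowCount;

    TabletInfo(long id, long rowCount) {
        this.id = id;
        this.rowCount = rowCount;
    }
}

final class PerThreadCopyDemo {
    // Builds a fake table with the given number of tablets.
    static List<TabletInfo> buildTable(int tablets) {
        List<TabletInfo> table = new ArrayList<>(tablets);
        for (int i = 0; i < tablets; i++) {
            table.add(new TabletInfo(i, 211135L));
        }
        return table;
    }

    // Deep copy: every element is re-created as a new object.
    static List<TabletInfo> deepCopy(List<TabletInfo> src) {
        List<TabletInfo> out = new ArrayList<>(src.size());
        for (TabletInfo t : src) {
            out.add(new TabletInfo(t.id, t.rowCount));
        }
        return out;
    }

    // Runs one task per worker and counts the distinct TabletInfo objects they hold.
    static int distinctInstances(List<TabletInfo> table, int workers, boolean copyPerWorker) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<List<TabletInfo>>> futures = new ArrayList<>();
            for (int i = 0; i < workers; i++) {
                futures.add(pool.submit(() -> copyPerWorker ? deepCopy(table) : table));
            }
            // Identity-based set: two equal-looking copies still count twice.
            Set<TabletInfo> seen = Collections.newSetFromMap(new IdentityHashMap<>());
            for (Future<List<TabletInfo>> f : futures) {
                seen.addAll(f.get());
            }
            return seen.size();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<TabletInfo> table = buildTable(1_000);
        System.out.println("copied per worker: " + distinctInstances(table, 8, true));
        System.out.println("shared snapshot:   " + distinctInstances(table, 8, false));
    }
}
```

With 8 workers and 1,000 tablets, the per-worker variant holds 8,000 live objects while the shared variant holds 1,000. The same multiplication applies to the heap when the number of tablets and threads is much larger.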

Dejun · 2024-05-29 02:26

Thanks for the feedback. We'll look into how to optimize this.

Dejun · 2024-05-29 02:32

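Could you add me on WeChat? If the heap dump is large, you can compress it first; heap dumps compress well. We have large-memory machines on our side that can analyze it.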

Dejun · 2024-05-29 06:29

OK, I've sent you a contact request on WeChat.

hellozhouq · 2024-12-05 01:49

How was this resolved? We've run into the same problem.

arter · 2025-01-24 08:55

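Sorry, I only just saw this. Which version are you on? Newer versions (3.3.x and later) have fixed part of the problem, but not completely. The root cause is that every thread makes a deep copy of all the tablet information of a table, even though insert overwrite does not need the tablet information at all. When a table has too many tablets, these copies take up a lot of memory. Our change is to stop copying the unnecessary information:
use selectiveCopyWithoutTablets in place of the selectiveCopy call inside addPartitions and getCopiedTable.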

    // Avoid a full deep copy here because it is very expensive.
    public void copyWithoutTablets(OlapTable olapTable) {
        olapTable.id = this.id;
        olapTable.name = this.name;
        olapTable.fullSchema = Lists.newArrayList(this.fullSchema);
        olapTable.nameToColumn = Maps.newHashMap(this.nameToColumn);
        olapTable.state = this.state;
        olapTable.indexNameToId = Maps.newHashMap(this.indexNameToId);
        olapTable.indexIdToMeta = Maps.newHashMap(this.indexIdToMeta);
        olapTable.keysType = this.keysType;
        if (this.relatedMaterializedViews != null) {
            olapTable.relatedMaterializedViews = Sets.newHashSet(this.relatedMaterializedViews);
        }
        if (this.uniqueConstraints != null) {
            olapTable.uniqueConstraints = Lists.newArrayList(this.uniqueConstraints);
        }
        if (this.foreignKeyConstraints != null) {
            olapTable.foreignKeyConstraints = Lists.newArrayList(this.foreignKeyConstraints);
        }
        if (this.partitionInfo != null) {
            olapTable.partitionInfo = DeepCopy.copyWithGson(this.partitionInfo, PartitionInfo.class);
        }
        olapTable.defaultDistributionInfo = this.defaultDistributionInfo;
        Map<Long, Partition> idToPartitions = new HashMap<>();
        Map<String, Partition> nameToPartitions = Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER);
        olapTable.idToPartition = idToPartitions;
        olapTable.nameToPartition = nameToPartitions;
        olapTable.tempPartitions = new TempPartitions();
        for (Partition tempPartition : this.getTempPartitions()) {
            olapTable.tempPartitions.addPartition(tempPartition.shallowCopy());
        }
        olapTable.baseIndexId = this.baseIndexId;
        if (this.tableProperty != null) {
            olapTable.tableProperty = this.tableProperty.copy();
        }

        // Shallow copy shared data to check whether the copied table has changed or not.
        olapTable.lastSchemaUpdateTime = this.lastSchemaUpdateTime;
        olapTable.lastVersionUpdateStartTime = this.lastVersionUpdateStartTime;
        olapTable.lastVersionUpdateEndTime = this.lastVersionUpdateEndTime;
    }

    public OlapTable selectiveCopyWithoutTablets(Collection<String> reservedPartitions,
                                                 boolean resetState, IndexExtState extState) {
        // Replaces the former DeepCopy.copyWithGson(this, OlapTable.class) call,
        // which copied the whole table including all tablet information.
        OlapTable copied = new OlapTable();
        this.copyWithoutTablets(copied);
        return selectiveCopyInternal(copied, reservedPartitions, resetState, extState);
    }
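
The idea behind the change can also be tried in isolation. The sketch below is a simplified illustration and not the StarRocks code: SketchTable, its fields, and SlimCopyDemo are hypothetical names, and the real OlapTable carries many more fields. It only shows that a copy which keeps schema-level fields and leaves the tablet map empty does not grow with the number of tablets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified table metadata; the real OlapTable has many more fields.
final class SketchTable {
    String name;
    List<String> columns = new ArrayList<>();
    // partition id -> tablet ids; this is the part that grows with table size
    Map<Long, List<Long>> partitionTablets = new HashMap<>();

    // Slim copy: schema-level fields only, tablet map left empty.
    SketchTable copyWithoutTablets() {
        SketchTable out = new SketchTable();
        out.name = this.name;
        out.columns = new ArrayList<>(this.columns);
        return out;
    }

    // Full copy: additionally duplicates every tablet id list.
    SketchTable deepCopy() {
        SketchTable out = copyWithoutTablets();
        for (Map.Entry<Long, List<Long>> e : partitionTablets.entrySet()) {
            out.partitionTablets.put(e.getKey(), new ArrayList<>(e.getValue()));
        }
        return out;
    }

    // Total number of tablet ids held by this object.
    long tabletCount() {
        long n = 0;
        for (List<Long> tablets : partitionTablets.values()) {
            n += tablets.size();
        }
        return n;
    }
}

final class SlimCopyDemo {
    // Builds a sample table with the given number of partitions and tablets per partition.
    static SketchTable sample(int partitions, int tabletsPerPartition) {
        SketchTable t = new SketchTable();
        t.name = "sample";
        t.columns.add("k1");
        t.columns.add("v1");
        long nextTabletId = 0;
        for (long p = 0; p < partitions; p++) {
            List<Long> tablets = new ArrayList<>();
            for (int i = 0; i < tabletsPerPartition; i++) {
                tablets.add(nextTabletId++);
            }
            t.partitionTablets.put(p, tablets);
        }
        return t;
    }

    public static void main(String[] args) {
        SketchTable src = sample(10, 50);
        System.out.println("deep copy tablets: " + src.deepCopy().tabletCount());
        System.out.println("slim copy tablets: " + src.copyWithoutTablets().tabletCount());
    }
}
```

A slim copy like this is only safe for callers that never read tablet information from the copy, which is the assumption the post makes for insert overwrite.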