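As the title says, I'd like to understand how this works. My questions:
- When reading Parquet, does the BE read the raw (compressed) Parquet data and decompress it itself, or does it read data that has already been decompressed?
- Does the read consume physical I/O or logical I/O?
- In a distributed deployment, does every BE read data, or is reading concentrated on a single BE?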
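In a test with 1 FE + 3 BEs, I read about 16.88 million rows (16,880,652 according to the profile) through a Hive catalog, and it took only 1s227ms.
According to the profile, RequestBytesRead is 668.445 MB. Even if the reads were spread evenly across the three BEs, each BE would need a read speed of at least 181.6 MB/s (668.445 MB / 3 / 1.227 s).
However, when I monitored the I/O of the StarRocks processes with netdata, physical I/O was only 38 MiB/s.
Logical I/O, on the other hand, reached 400+ MiB/s, and it was concentrated on a single BE rather than spread across the others.
Disk I/O utilization was only about 26% at the time, which appears to track physical read/write I/O. From this I infer that StarRocks mainly consumes logical I/O when reading data, but what is the mechanism behind this? Why doesn't reading Parquet files consume physical I/O?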
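To cross-check the netdata numbers, the two kinds of read I/O can be sampled directly on Linux. The sketch below assumes that netdata's per-process "logical" and "physical" read figures correspond to the `rchar` and `read_bytes` counters in `/proc/<pid>/io` (I have not verified this for every netdata version). According to the proc man page, `rchar` counts every byte returned by read-style system calls, including reads served from the page cache, while `read_bytes` counts only bytes actually fetched from the storage layer. The function names are my own.

```python
import time


def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def read_rates(before, after, seconds):
    """Return (logical, physical) read rates in MiB/s between two samples."""
    mib = 1024 * 1024
    logical = (after["rchar"] - before["rchar"]) / seconds / mib
    physical = (after["read_bytes"] - before["read_bytes"]) / seconds / mib
    return logical, physical


def sample(pid, seconds=1.0):
    """Read /proc/<pid>/io twice, `seconds` apart (Linux only)."""
    path = f"/proc/{pid}/io"
    with open(path) as f:
        before = parse_proc_io(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_proc_io(f.read())
    return read_rates(before, after, seconds)
```

Calling `sample(pid)` with the process ID of a BE while the query is running returns the two rates for that BE. Reading another user's `/proc/<pid>/io` normally requires root or the same user as the process.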
The relevant profile fragment:
├──<PROBE> HDFS_SCAN (id=0)
│ Estimates: [row: 1150036, cpu: ?, memory: ?, network: ?, cost: 5.5201728E8]
│ TotalTime: 1s227ms (12.81%) [CPUTime: 53.878ms, ScanTime: 1s174ms]
│ OutputRows: 16.881M (16880652)
│ PeakMemory: 165.178 MB, AllocatedMemory: 5.132 GB
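For reference, the read-speed estimate can be reproduced from this fragment and the RequestBytesRead value. This is only a back-of-the-envelope sketch: it assumes the reads span the whole 1s227ms TotalTime, and the helper name `per_node_rate` is made up for illustration.

```python
def per_node_rate(total_mb, seconds, nodes):
    """Average read rate per node in MB/s if the bytes are split evenly."""
    return total_mb / seconds / nodes


request_mb = 668.445    # RequestBytesRead from the profile
total_seconds = 1.227   # TotalTime of HDFS_SCAN (1s227ms)
be_count = 3            # number of BEs in the test cluster

# Lower bound: reads spread evenly over all three BEs.
even = per_node_rate(request_mb, total_seconds, be_count)
# Upper bound: a single BE performs every read.
single = per_node_rate(request_mb, total_seconds, 1)

print(f"{even:.1f} MB/s per BE if spread evenly")
print(f"{single:.1f} MB/s if one BE does all the reads")
```

Either figure is well above the 38 MiB/s of physical I/O that netdata reported.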
The related metrics:
CONNECTOR_SCAN (plan_node_id=0):
CommonMetrics:
- CloseTime: 1.478ms
- __MAX_OF_CloseTime: 2.183ms
- __MIN_OF_CloseTime: 1.128ms
- JoinRuntimeFilterEvaluate: 0
- JoinRuntimeFilterHashTime: 0ns
- JoinRuntimeFilterInputRows: 0
- JoinRuntimeFilterOutputRows: 0
- JoinRuntimeFilterTime: 0ns
- OperatorAllocatedMemoryUsage: 5.132 GB
- __MAX_OF_OperatorAllocatedMemoryUsage: 310.027 MB
- __MIN_OF_OperatorAllocatedMemoryUsage: 144.342 MB
- OperatorDeallocatedMemoryUsage: 2.798 GB
- __MAX_OF_OperatorDeallocatedMemoryUsage: 165.152 MB
- __MIN_OF_OperatorDeallocatedMemoryUsage: 83.282 MB
- OperatorPeakMemoryUsage: 109.625 MB
- __MAX_OF_OperatorPeakMemoryUsage: 165.178 MB
- __MIN_OF_OperatorPeakMemoryUsage: 62.779 MB
- OperatorTotalTime: 15.793ms
- __MAX_OF_OperatorTotalTime: 53.878ms
- __MIN_OF_OperatorTotalTime: 3.534ms
- PrepareTime: 15.092us
- __MAX_OF_PrepareTime: 36.837us
- __MIN_OF_PrepareTime: 7.167us
- PullChunkNum: 4.214K (4214)
- __MAX_OF_PullChunkNum: 250
- __MIN_OF_PullChunkNum: 98
- PullRowNum: 16.881M (16880652)
- __MAX_OF_PullRowNum: 1.001M (1000884)
- __MIN_OF_PullRowNum: 393.313K (393313)
- PullTotalTime: 14.313ms
- __MAX_OF_PullTotalTime: 51.793ms
- __MIN_OF_PullTotalTime: 2.245ms
- PushChunkNum: 0
- PushRowNum: 0
- PushTotalTime: 0ns
- RuntimeBloomFilterNum: 0
- RuntimeInFilterNum: 0
- SetFinishedTime: 100ns
- __MAX_OF_SetFinishedTime: 407ns
- __MIN_OF_SetFinishedTime: 46ns
- SetFinishingTime: 997ns
- __MAX_OF_SetFinishingTime: 1.547us
- __MIN_OF_SetFinishingTime: 636ns
UniqueMetrics:
- AdaptiveIOTasks: False
- DataSourceType: HiveDataSource
- MorselQueueType: fixed_morsel_queue
- Predicates
- PredicatesPartition
- SharedScan: False
- Table: data_res_2080_15_bf754e49099b4461899da77f26313013
- ChunkBufferCapacity: 512
- DefaultChunkBufferCapacity: 512
- IOTaskExecTime: 354.794ms
- __MAX_OF_IOTaskExecTime: 1s170ms
- __MIN_OF_IOTaskExecTime: 128.379ms
- ColumnConvertTime: 0ns
- ColumnReadTime: 0ns
- ExprFilterTime: 0ns
- HdfsIO: 0ns
- OpenTimeMs: 102.333K (102333)
- __MAX_OF_OpenTimeMs: 493
- __MIN_OF_OpenTimeMs: 106
- TotalBytesRead: 703.815M (703814666)
- __MAX_OF_TotalBytesRead: 5.103M (5103374)
- __MIN_OF_TotalBytesRead: 34.447K (34447)
- TotalLocalBytesRead: 473.149M (473149237)
- __MAX_OF_TotalLocalBytesRead: 5.103M (5103374)
- __MIN_OF_TotalLocalBytesRead: 0
- TotalShortCircuitBytesRead: 473.149M (473149237)
- __MAX_OF_TotalShortCircuitBytesRead: 5.103M (5103374)
- __MIN_OF_TotalShortCircuitBytesRead: 0
- TotalZeroCopyBytesRead: 0
- InputStream:
- AppIOBytesRead: 668.445 MB
- __MAX_OF_AppIOBytesRead: 4.867 MB
- __MIN_OF_AppIOBytesRead: 33.640 KB
- AppIOCounter: 1.070K (1070)
- __MAX_OF_AppIOCounter: 8
- __MIN_OF_AppIOCounter: 1
- AppIOTime: 319.094ms
- __MAX_OF_AppIOTime: 927.937ms
- __MIN_OF_AppIOTime: 111.115ms
- FSIOBytesRead: 671.210 MB
- __MAX_OF_FSIOBytesRead: 4.867 MB
- __MIN_OF_FSIOBytesRead: 33.640 KB
- FSIOCounter: 1.070K (1070)
- __MAX_OF_FSIOCounter: 8
- __MIN_OF_FSIOCounter: 1
- FSIOTime: 319.047ms
- __MAX_OF_FSIOTime: 927.919ms
- __MIN_OF_FSIOTime: 111.103ms
- LateMaterializeSkipRows: 0
- OpenFile: 310.799ms
- __MAX_OF_OpenFile: 679.188ms
- __MIN_OF_OpenFile: 107.588ms
- Parquet:
- FooterCacheReadCount: 0
- FooterCacheReadTimer: 0ns
- FooterCacheWriteBytes: 0.000 B
- FooterCacheWriteCount: 0
- GroupActiveLazyCoalesceSeperately: 143
- __MAX_OF_GroupActiveLazyCoalesceSeperately: 1
- __MIN_OF_GroupActiveLazyCoalesceSeperately: 0
- GroupActiveLazyCoalesceTogether: 35
- __MAX_OF_GroupActiveLazyCoalesceTogether: 1
- __MIN_OF_GroupActiveLazyCoalesceTogether: 0
- GroupChunkRead: 0ns
- GroupDictDecode: 144.866us
- __MAX_OF_GroupDictDecode: 42.240ms
- __MIN_OF_GroupDictDecode: 0ns
- GroupDictFilter: 0ns
- GroupMinRound: 0
- HasPageStatistics: 0
- IcebergV2FormatTimer:
- DeleteFileBuildFilterTime: 0ns
- DeleteFileBuildTime: 0ns
- DeleteFilesPerScan: 0
- LevelDecodeTime: 0ns
- PageReadTime: 0ns
- PageSkipCounter: 2.492K (2492)
- __MAX_OF_PageSkipCounter: 14
- __MIN_OF_PageSkipCounter: 0
- ReaderInitColumnReaderInit: 8.115us
- __MAX_OF_ReaderInitColumnReaderInit: 66.230us
- __MIN_OF_ReaderInitColumnReaderInit: 0ns
- ReaderInitFooterRead: 295.779ms
- __MAX_OF_ReaderInitFooterRead: 573.514ms
- __MIN_OF_ReaderInitFooterRead: 106.504ms
- RequestBytesRead: 668.445 MB
- __MAX_OF_RequestBytesRead: 4.867 MB
- __MIN_OF_RequestBytesRead: 33.640 KB
- RequestBytesReadUncompressed: 896.539 MB
- __MAX_OF_RequestBytesReadUncompressed: 6.373 MB
- __MIN_OF_RequestBytesReadUncompressed: 33.640 KB
- ValueDecodeTime: 0ns
- RawRowsRead: 16.881M (16880652)
- __MAX_OF_RawRowsRead: 126.825K (126825)
- __MIN_OF_RawRowsRead: 0
- ReaderInit: 310.770ms
- __MAX_OF_ReaderInit: 679.177ms
- __MIN_OF_ReaderInit: 107.574ms
- RowsRead: 16.881M (16880652)
- __MAX_OF_RowsRead: 126.825K (126825)
- __MIN_OF_RowsRead: 0
- ScanRanges: 546
- __MAX_OF_ScanRanges: 5
- __MIN_OF_ScanRanges: 1
- SharedBuffered:
- DirectIOBytes: 32.123 MB
- __MAX_OF_DirectIOBytes: 249.679 KB
- __MIN_OF_DirectIOBytes: 33.640 KB
- DirectIOCount: 724
- __MAX_OF_DirectIOCount: 6
- __MIN_OF_DirectIOCount: 1
- DirectIOTime: 295.776ms
- __MAX_OF_DirectIOTime: 573.513ms
- __MIN_OF_DirectIOTime: 106.502ms
- SharedIOBytes: 639.087 MB
- __MAX_OF_SharedIOBytes: 4.694 MB
- __MIN_OF_SharedIOBytes: 0.000 B
- SharedIOCount: 346
- __MAX_OF_SharedIOCount: 2
- __MIN_OF_SharedIOCount: 0
- SharedIOTime: 23.313ms
- __MAX_OF_SharedIOTime: 476.633ms
- __MIN_OF_SharedIOTime: 0ns
- IOTaskWaitTime: 2.074ms
- __MAX_OF_IOTaskWaitTime: 28.771ms
- __MIN_OF_IOTaskWaitTime: 17.082us
- MorselsCount: 546
- __MAX_OF_MorselsCount: 42
- __MIN_OF_MorselsCount: 16
- PeakChunkBufferMemoryUsage: 193.803 MB
- PeakChunkBufferSize: 458
- PeakIOTasks: 16
- PeakScanTaskQueueSize: 294
- __MAX_OF_PeakScanTaskQueueSize: 17
- __MIN_OF_PeakScanTaskQueueSize: 8
- ScanTime: 356.868ms
- __MAX_OF_ScanTime: 1s174ms
- __MIN_OF_ScanTime: 128.953ms
- SubmitTaskCount: 739
- __MAX_OF_SubmitTaskCount: 52
- __MIN_OF_SubmitTaskCount: 22
- TabletCount: 546
- __MAX_OF_TabletCount: 207
- __MIN_OF_TabletCount: 151
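A few ratios derived from the counters above may help frame the questions. The interpretation is my assumption based on the counter names only: RequestBytesRead as the bytes requested in stored (compressed) form, RequestBytesReadUncompressed as the size after decompression, and TotalLocalBytesRead / TotalShortCircuitBytesRead as HDFS client read statistics. The variable names are my own.

```python
# Values copied from the profile above.
request_mb = 668.445                 # RequestBytesRead
request_uncompressed_mb = 896.539    # RequestBytesReadUncompressed
total_bytes = 703814666              # TotalBytesRead
local_bytes = 473149237              # TotalLocalBytesRead
short_circuit_bytes = 473149237      # TotalShortCircuitBytesRead
direct_io_mb = 32.123                # SharedBuffered.DirectIOBytes
shared_io_mb = 639.087               # SharedBuffered.SharedIOBytes
fs_io_mb = 671.210                   # InputStream.FSIOBytesRead

# Requested bytes relative to their uncompressed size.
compression_ratio = request_mb / request_uncompressed_mb
# Share of bytes read from a DataNode on the same host as the reader.
local_fraction = local_bytes / total_bytes
# Direct and shared reads together should account for the file-system reads.
buffered_total_mb = direct_io_mb + shared_io_mb

print(f"requested / uncompressed: {compression_ratio:.3f}")
print(f"local read fraction:      {local_fraction:.3f}")
print(f"all local reads short-circuit: {short_circuit_bytes == local_bytes}")
print(f"direct + shared: {buffered_total_mb:.3f} MB (FSIOBytesRead: {fs_io_mb} MB)")
```

If that reading of the names is right, the bytes requested are about three quarters of their uncompressed size, and roughly two thirds of the bytes were read locally through short-circuit reads rather than over the network.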