【详述】问题详细描述
【背景】做过哪些操作?
【业务影响】
【StarRocks版本】例如:2.5.5
【集群规模】例如:1fe 4be(16C128G)
tpc-ds 1t的数据测试其中有 q17,q25,q95,看了下大多都是RuntimeFilter没有下推, join的方式也不太对, 这个是有什么参数可以调整吗?
官网没有tpcds,能不能提供建表和SQL我看一下合不合理
select i_item_id
,i_item_desc
,s_state
,count(ss_quantity) as store_sales_quantitycount
,avg(ss_quantity) as store_sales_quantityave
,stddev_samp(ss_quantity) as store_sales_quantitystdev
,stddev_samp(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov
,count(sr_return_quantity) as store_returns_quantitycount
,avg(sr_return_quantity) as store_returns_quantityave
,stddev_samp(sr_return_quantity) as store_returns_quantitystdev
,stddev_samp(sr_return_quantity)/avg(sr_return_quantity) as store_returns_quantitycov
,count(cs_quantity) as catalog_sales_quantitycount ,avg(cs_quantity) as catalog_sales_quantityave
,stddev_samp(cs_quantity) as catalog_sales_quantitystdev
,stddev_samp(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitycov
from store_sales
,store_returns
,catalog_sales
,date_dim d1
,date_dim d2
,date_dim d3
,store
,item
where d1.d_quarter_name = ‘2001Q1’
and d1.d_date_sk = ss_sold_date_sk
and i_item_sk = ss_item_sk
and s_store_sk = ss_store_sk
and ss_customer_sk = sr_customer_sk
and ss_item_sk = sr_item_sk
and ss_ticket_number = sr_ticket_number
and sr_returned_date_sk = d2.d_date_sk
and d2.d_quarter_name in (‘2001Q1’,‘2001Q2’,‘2001Q3’)
and sr_customer_sk = cs_bill_customer_sk
and sr_item_sk = cs_item_sk
and cs_sold_date_sk = d3.d_date_sk
and d3.d_quarter_name in (‘2001Q1’,‘2001Q2’,‘2001Q3’)
group by i_item_id
,i_item_desc
,s_state
order by i_item_id
,i_item_desc
,s_state
limit 100;root@emr-bb62cdb4427b0c76be20-core-1:/data01/luna/tpcds-tools-sr/queries# ex
root@emr-bb62cdb4427b0c76be20-core-1:/data01/luna/tpcds-tools-sr/queries# cat query17.sql
set pipeline_dop=8;
– query 17
select i_item_id
,i_item_desc
,s_state
,count(ss_quantity) as store_sales_quantitycount
,avg(ss_quantity) as store_sales_quantityave
,stddev_samp(ss_quantity) as store_sales_quantitystdev
,stddev_samp(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov
,count(sr_return_quantity) as store_returns_quantitycount
,avg(sr_return_quantity) as store_returns_quantityave
,stddev_samp(sr_return_quantity) as store_returns_quantitystdev
,stddev_samp(sr_return_quantity)/avg(sr_return_quantity) as store_returns_quantitycov
,count(cs_quantity) as catalog_sales_quantitycount ,avg(cs_quantity) as catalog_sales_quantityave
,stddev_samp(cs_quantity) as catalog_sales_quantitystdev
,stddev_samp(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitycov
from store_sales
,store_returns
,catalog_sales
,date_dim d1
,date_dim d2
,date_dim d3
,store
,item
where d1.d_quarter_name = ‘2001Q1’
and d1.d_date_sk = ss_sold_date_sk
and i_item_sk = ss_item_sk
and s_store_sk = ss_store_sk
and ss_customer_sk = sr_customer_sk
and ss_item_sk = sr_item_sk
and ss_ticket_number = sr_ticket_number
and sr_returned_date_sk = d2.d_date_sk
and d2.d_quarter_name in (‘2001Q1’,‘2001Q2’,‘2001Q3’)
and sr_customer_sk = cs_bill_customer_sk
and sr_item_sk = cs_item_sk
and cs_sold_date_sk = d3.d_date_sk
and d3.d_quarter_name in (‘2001Q1’,‘2001Q2’,‘2001Q3’)
group by i_item_id
,i_item_desc
,s_state
order by i_item_id
,i_item_desc
,s_state
您看一下大表关联那些有没有使用colocate with吧
三个大表 join 的顺序不太对, store_sales,store_returns 没有优先join,
--------------------------------------------------------+
| - Output => [192:i_item_id, 195:i_item_desc, 186:s_state, 213:count, 214:avg, 215:stddev_samp, 222:expr, 216:count, 217:avg, 218:stddev_samp, 223:expr, 219:count, 220:avg, 221:stddev_samp, 224:expr] |
| - TOP-100(FINAL)[192: i_item_id ASC NULLS FIRST, 195: i_item_desc ASC NULLS FIRST, 186: s_state ASC NULLS FIRST] |
| Estimates: {row: 100, cpu: 17230.86, memory: 17230.86, network: 17230.86, cost: 244014450820862.10} |
| - TOP-100(PARTIAL)[192: i_item_id ASC NULLS FIRST, 195: i_item_desc ASC NULLS FIRST, 186: s_state ASC NULLS FIRST] |
| Estimates: {row: 100, cpu: 28471241737258.48, memory: 17230.86, network: 28471241737258.48, cost: 244014450751938.62} |
| - AGGREGATE(GLOBAL) [192:i_item_id, 195:i_item_desc, 186:s_state] => [195:i_item_desc, 213:count, 214:avg, 215:stddev_samp, 216:count, 217:avg, 186:s_state, 218:stddev_samp, 219:count, 220:avg, 221:stddev_samp, 222:expr, 223:expr, 192:i_item_id, 224:expr] |
| Estimates: {row: 165233976314, cpu: 25992732092539.87, memory: 28471241737258.48, network: 0.00, cost: 187071967242959.94} |
| 213:count := count(213:count) |
| 214:avg := avg(214:avg) |
| 215:stddev_samp := stddev_samp(215:stddev_samp) |
| 216:count := count(216:count) |
| 217:avg := avg(217:avg) |
| 218:stddev_samp := stddev_samp(218:stddev_samp) |
| 219:count := count(219:count) |
| 220:avg := avg(220:avg) |
| 221:stddev_samp := stddev_samp(221:stddev_samp) |
| 222:expr := 215:stddev_samp / 214:avg |
| 223:expr := 218:stddev_samp / 217:avg |
| 224:expr := 221:stddev_samp / 220:avg |
| - EXCHANGE(SHUFFLE) [192, 195, 186] |
| Estimates: {row: 165233976314, cpu: 25992732092539.87, memory: 0.00, network: 25992732092539.87, cost: 117133117722173.02} |
| - AGGREGATE(LOCAL) [192:i_item_id, 195:i_item_desc, 186:s_state] |
| Estimates: {row: 165233976314, cpu: 25497030163596.14, memory: 25992732092539.87, network: 0.00, cost: 65147653537093.28} |
| 213:count := count(11:ss_quantity) |
| 214:avg := avg(11:ss_quantity) |
| 215:stddev_samp := stddev_samp(11:ss_quantity) |
| 216:count := count(34:sr_return_quantity) |
| 217:avg := avg(34:sr_return_quantity) |
| 218:stddev_samp := stddev_samp(34:sr_return_quantity) |
| 219:count := count(62:cs_quantity) |
| 220:avg := avg(62:cs_quantity) |
| 221:stddev_samp := stddev_samp(62:cs_quantity) |
| - HASH/INNER JOIN [28:sr_customer_sk = 5:ss_customer_sk AND 25:sr_item_sk = 2:ss_item_sk AND 26:sr_ticket_number = 3:ss_ticket_number] => [34:sr_return_quantity, 195:i_item_desc, 186:s_state, 11:ss_quantity, 62:cs_quantity, 192:i_item_id] |
| Estimates: {row: 165233976314, cpu: 92558306686.14, memory: 13066287446.71, network: 0.00, cost: 413674270215.48} |
| - EXCHANGE(SHUFFLE) [28, 25, 26] |
| Estimates: {row: 3974600961, cpu: 79492019239.43, memory: 0.00, network: 79492019239.43, cost: 197227587631.65} |
| - HASH/INNER JOIN [49:cs_bill_customer_sk = 28:sr_customer_sk AND 45:cs_item_sk = 25:sr_item_sk] => [34:sr_return_quantity, 25:sr_item_sk, 26:sr_ticket_number, 28:sr_customer_sk, 62:cs_quantity] |
| Estimates: {row: 3974600961, cpu: 3194475520.33, memory: 625675998.77, network: 0.00, cost: 38243549152.79} |
| - EXCHANGE(SHUFFLE) [49, 45] |
| Estimates: {row: 214066626, cpu: 2568799521.56, memory: 0.00, network: 2568799521.56, cost: 28177313949.60} |
| - HASH/INNER JOIN [44:cs_sold_date_sk = 134:d_date_sk] => [49:cs_bill_customer_sk, 45:cs_item_sk, 62:cs_quantity] |
| Estimates: {row: 214066626, cpu: 23039687753.11, memory: 1097.11, network: 0.00, cost: 23039714906.49} |
| - SCAN [catalog_sales] => [49:cs_bill_customer_sk, 44:cs_sold_date_sk, 45:cs_item_sk, 62:cs_quantity] |
| Estimates: {row: 1439980416, cpu: 23039686656.00, memory: 0.00, network: 0.00, cost: 11519843328.00} |
| partitionRatio: 1/1, tabletRatio: 256/256 |
| predicate: 49:cs_bill_customer_sk IS NOT NULL |
| - EXCHANGE(BROADCAST) |
| Estimates: {row: 274, cpu: 4388.43, memory: 4388.43, network: 4388.43, cost: 18925.09} |
| - SCAN [date_dim] => [134:d_date_sk] |
| Estimates: {row: 274, cpu: 2742.77, memory: 0.00, network: 0.00, cost: 1371.38} |
| partitionRatio: 1/1, tabletRatio: 1/1 |
| predicate: 149:d_quarter_name IN (‘2001Q1’, ‘2001Q2’, ‘2001Q3’) |
| - EXCHANGE(SHUFFLE) [28, 25] |
| Estimates: {row: 39104749, cpu: 625675998.77, memory: 0.00, network: 625675998.77, cost: 7011375528.04} |
| - HASH/INNER JOIN [24:sr_returned_date_sk = 106:d_date_sk] => [34:sr_return_quantity, 25:sr_item_sk, 26:sr_ticket_number, 28:sr_customer_sk] |
| Estimates: {row: 39104749, cpu: 5759996377.11, memory: 1097.11, network: 0.00, cost: 5760023530.49} |
| - SCAN [store_returns] => [34:sr_return_quantity, 24:sr_returned_date_sk, 25:sr_item_sk, 26:sr_ticket_number, 28:sr_customer_sk] |
| Estimates: {row: 287999764, cpu: 5759995280.00, memory: 0.00, network: 0.00, cost: 2879997640.00} |
| partitionRatio: 1/1, tabletRatio: 96/96 |
| predicate: 28:sr_customer_sk IS NOT NULL |
| - EXCHANGE(BROADCAST) |
| Estimates: {row: 274, cpu: 4388.43, memory: 4388.43, network: 4388.43, cost: 18925.09} |
| - SCAN [date_dim] => [106:d_date_sk] |
| Estimates: {row: 274, cpu: 2742.77, memory: 0.00, network: 0.00, cost: 1371.38} |
| partitionRatio: 1/1, tabletRatio: 1/1 |
| predicate: 121:d_quarter_name IN (‘2001Q1’, ‘2001Q2’, ‘2001Q3’) |
| - EXCHANGE(SHUFFLE) [5, 2, 3] |
| Estimates: {row: 97285531, cpu: 13066287446.71, memory: 0.00, network: 13066287446.71, cost: 98541741391.09} |
| - HASH/INNER JOIN [2:ss_item_sk = 191:i_item_sk] => [2:ss_item_sk, 3:ss_ticket_number, 195:i_item_desc, 5:ss_customer_sk, 186:s_state, 11:ss_quantity, 192:i_item_id] |
| Estimates: {row: 97285531, cpu: 1763557870.61, memory: 36094089.00, network: 0.00, cost: 72409166497.67} |
| - HASH/INNER JOIN [9:ss_store_sk = 162:s_store_sk] => [2:ss_item_sk, 3:ss_ticket_number, 5:ss_customer_sk, 186:s_state, 11:ss_quantity] |
| Estimates: {row: 95996822, cpu: 2876078477.97, memory: 6007.00, network: 0.00, cost: 70557907803.31} |
| - HASH/INNER JOIN [1:ss_sold_date_sk = 78:d_date_sk] => [2:ss_item_sk, 3:ss_ticket_number, 5:ss_customer_sk, 9:ss_store_sk, 11:ss_quantity] |
| Estimates: {row: 143803623, cpu: 69119712341.70, memory: 365.70, network: 0.00, cost: 69119721392.83} |
| - SCAN [store_sales] => [1:ss_sold_date_sk, 2:ss_item_sk, 3:ss_ticket_number, 5:ss_customer_sk, 9:ss_store_sk, 11:ss_quantity] |
| Estimates: {row: 2879987999, cpu: 69119711976.00, memory: 0.00, network: 0.00, cost: 34559855988.00} |
| partitionRatio: 1/1, tabletRatio: 256/256 |
| predicate: 5:ss_customer_sk IS NOT NULL |
| - EXCHANGE(BROADCAST) |
| Estimates: {row: 91, cpu: 1462.81, memory: 1462.81, network: 1462.81, cost: 6308.36} |
| - SCAN [date_dim] => [78:d_date_sk] |
| Estimates: {row: 91, cpu: 914.26, memory: 0.00, network: 0.00, cost: 457.13} |
| partitionRatio: 1/1, tabletRatio: 1/1 |
| predicate: 93:d_quarter_name = ‘2001Q1’ |
| - EXCHANGE(BROADCAST) |
| Estimates: {row: 1002, cpu: 24028.00, memory: 24028.00, network: 24028.00, cost: 99115.50} |
| - SCAN [store] => [162:s_store_sk, 186:s_state] |
| Estimates: {row: 1002, cpu: 6007.00, memory: 0.00, network: 0.00, cost: 3003.50} |
| partitionRatio: 1/1, tabletRatio: 1/1 |
| - EXCHANGE(BROADCAST) |
| Estimates: {row: 300000, cpu: 144376356.00, memory: 144376356.00, network: 144376356.00, cost: 595552468.50} |
| - SCAN [item] => [195:i_item_desc, 191:i_item_sk, 192:i_item_id] |
| Estimates: {row: 300000, cpu: 36094089.00, memory: 0.00, network: 0.00, cost: 18047044.50} |
| partitionRatio: 1/1, tabletRatio: 1/1