【详述】问题详细描述
【背景】做过哪些操作?
【业务影响】
【StarRocks版本】例如:2.5.5
【集群规模】例如:1fe 4be(16C128G)
tpc-ds 1t的数据测试其中有 q17,q25,q95,看了下大多都是RuntimeFilter没有下推, join的方式也不太对, 这个是有什么参数可以调整吗?
官网没有tpcds,能不能提供建表和SQL我看一下合不合理
select  i_item_id
,i_item_desc
,s_state
,count(ss_quantity) as store_sales_quantitycount
,avg(ss_quantity) as store_sales_quantityave
,stddev_samp(ss_quantity) as store_sales_quantitystdev
,stddev_samp(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov
,count(sr_return_quantity) as store_returns_quantitycount
,avg(sr_return_quantity) as store_returns_quantityave
,stddev_samp(sr_return_quantity) as store_returns_quantitystdev
,stddev_samp(sr_return_quantity)/avg(sr_return_quantity) as store_returns_quantitycov
,count(cs_quantity) as catalog_sales_quantitycount ,avg(cs_quantity) as catalog_sales_quantityave
,stddev_samp(cs_quantity) as catalog_sales_quantitystdev
,stddev_samp(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitycov
from store_sales
,store_returns
,catalog_sales
,date_dim d1
,date_dim d2
,date_dim d3
,store
 ,item
where d1.d_quarter_name = ‘2001Q1’
and d1.d_date_sk = ss_sold_date_sk
and i_item_sk = ss_item_sk
and s_store_sk = ss_store_sk
and ss_customer_sk = sr_customer_sk
and ss_item_sk = sr_item_sk
and ss_ticket_number = sr_ticket_number
and sr_returned_date_sk = d2.d_date_sk
and d2.d_quarter_name in (‘2001Q1’,‘2001Q2’,‘2001Q3’)
and sr_customer_sk = cs_bill_customer_sk
and sr_item_sk = cs_item_sk
and cs_sold_date_sk = d3.d_date_sk
and d3.d_quarter_name in (‘2001Q1’,‘2001Q2’,‘2001Q3’)
group by i_item_id
,i_item_desc
,s_state
order by i_item_id
,i_item_desc
,s_state
limit 100;root@emr-bb62cdb4427b0c76be20-core-1:/data01/luna/tpcds-tools-sr/queries# ex
root@emr-bb62cdb4427b0c76be20-core-1:/data01/luna/tpcds-tools-sr/queries# cat query17.sql
set pipeline_dop=8;
– query 17
select  i_item_id
,i_item_desc
,s_state
,count(ss_quantity) as store_sales_quantitycount
,avg(ss_quantity) as store_sales_quantityave
,stddev_samp(ss_quantity) as store_sales_quantitystdev
,stddev_samp(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov
,count(sr_return_quantity) as store_returns_quantitycount
,avg(sr_return_quantity) as store_returns_quantityave
,stddev_samp(sr_return_quantity) as store_returns_quantitystdev
,stddev_samp(sr_return_quantity)/avg(sr_return_quantity) as store_returns_quantitycov
,count(cs_quantity) as catalog_sales_quantitycount ,avg(cs_quantity) as catalog_sales_quantityave
,stddev_samp(cs_quantity) as catalog_sales_quantitystdev
,stddev_samp(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitycov
from store_sales
,store_returns
,catalog_sales
,date_dim d1
,date_dim d2
,date_dim d3
,store
,item
where d1.d_quarter_name = ‘2001Q1’
and d1.d_date_sk = ss_sold_date_sk
and i_item_sk = ss_item_sk
and s_store_sk = ss_store_sk
and ss_customer_sk = sr_customer_sk
and ss_item_sk = sr_item_sk
and ss_ticket_number = sr_ticket_number
and sr_returned_date_sk = d2.d_date_sk
and d2.d_quarter_name in (‘2001Q1’,‘2001Q2’,‘2001Q3’)
and sr_customer_sk = cs_bill_customer_sk
and sr_item_sk = cs_item_sk
and cs_sold_date_sk = d3.d_date_sk
and d3.d_quarter_name in (‘2001Q1’,‘2001Q2’,‘2001Q3’)
group by i_item_id
,i_item_desc
,s_state
order by i_item_id
,i_item_desc
,s_state
您看一下大表关联那些有没有使用colocate with吧
三个大表 join 的顺序不太对, store_sales,store_returns 没有优先join,
--------------------------------------------------------+
| - Output => [192:i_item_id, 195:i_item_desc, 186:s_state, 213:count, 214:avg, 215:stddev_samp, 222:expr, 216:count, 217:avg, 218:stddev_samp, 223:expr, 219:count, 220:avg, 221:stddev_samp, 224:expr]                                                                      |
|     - TOP-100(FINAL)[192: i_item_id ASC NULLS FIRST, 195: i_item_desc ASC NULLS FIRST, 186: s_state ASC NULLS FIRST]                                                                                                                                                        |
|             Estimates: {row: 100, cpu: 17230.86, memory: 17230.86, network: 17230.86, cost: 244014450820862.10}                                                                                                                                                             |
|         - TOP-100(PARTIAL)[192: i_item_id ASC NULLS FIRST, 195: i_item_desc ASC NULLS FIRST, 186: s_state ASC NULLS FIRST]                                                                                                                                                  |
|                 Estimates: {row: 100, cpu: 28471241737258.48, memory: 17230.86, network: 28471241737258.48, cost: 244014450751938.62}                                                                                                                                       |
|             - AGGREGATE(GLOBAL) [192:i_item_id, 195:i_item_desc, 186:s_state] => [195:i_item_desc, 213:count, 214:avg, 215:stddev_samp, 216:count, 217:avg, 186:s_state, 218:stddev_samp, 219:count, 220:avg, 221:stddev_samp, 222:expr, 223:expr, 192:i_item_id, 224:expr] |
|                     Estimates: {row: 165233976314, cpu: 25992732092539.87, memory: 28471241737258.48, network: 0.00, cost: 187071967242959.94}                                                                                                                              |
|                     213:count := count(213:count)                                                                                                                                                                                                                           |
|                     214:avg := avg(214:avg)                                                                                                                                                                                                                                 |
|                     215:stddev_samp := stddev_samp(215:stddev_samp)                                                                                                                                                                                                         |
|                     216:count := count(216:count)                                                                                                                                                                                                                           |
|                     217:avg := avg(217:avg)                                                                                                                                                                                                                                 |
|                     218:stddev_samp := stddev_samp(218:stddev_samp)                                                                                                                                                                                                         |
|                     219:count := count(219:count)                                                                                                                                                                                                                           |
|                     220:avg := avg(220:avg)                                                                                                                                                                                                                                 |
|                     221:stddev_samp := stddev_samp(221:stddev_samp)                                                                                                                                                                                                         |
|                     222:expr := 215:stddev_samp / 214:avg                                                                                                                                                                                                                   |
|                     223:expr := 218:stddev_samp / 217:avg                                                                                                                                                                                                                   |
|                     224:expr := 221:stddev_samp / 220:avg                                                                                                                                                                                                                   |
|                 - EXCHANGE(SHUFFLE) [192, 195, 186]                                                                                                                                                                                                                         |
|                         Estimates: {row: 165233976314, cpu: 25992732092539.87, memory: 0.00, network: 25992732092539.87, cost: 117133117722173.02}                                                                                                                          |
|                     - AGGREGATE(LOCAL) [192:i_item_id, 195:i_item_desc, 186:s_state]                                                                                                                                                                                        |
|                             Estimates: {row: 165233976314, cpu: 25497030163596.14, memory: 25992732092539.87, network: 0.00, cost: 65147653537093.28}                                                                                                                       |
|                             213:count := count(11:ss_quantity)                                                                                                                                                                                                              |
|                             214:avg := avg(11:ss_quantity)                                                                                                                                                                                                                  |
|                             215:stddev_samp := stddev_samp(11:ss_quantity)                                                                                                                                                                                                  |
|                             216:count := count(34:sr_return_quantity)                                                                                                                                                                                                       |
|                             217:avg := avg(34:sr_return_quantity)                                                                                                                                                                                                           |
|                             218:stddev_samp := stddev_samp(34:sr_return_quantity)                                                                                                                                                                                           |
|                             219:count := count(62:cs_quantity)                                                                                                                                                                                                              |
|                             220:avg := avg(62:cs_quantity)                                                                                                                                                                                                                  |
|                             221:stddev_samp := stddev_samp(62:cs_quantity)                                                                                                                                                                                                  |
|                         - HASH/INNER JOIN [28:sr_customer_sk = 5:ss_customer_sk AND 25:sr_item_sk = 2:ss_item_sk AND 26:sr_ticket_number = 3:ss_ticket_number] => [34:sr_return_quantity, 195:i_item_desc, 186:s_state, 11:ss_quantity, 62:cs_quantity, 192:i_item_id]      |
|                                 Estimates: {row: 165233976314, cpu: 92558306686.14, memory: 13066287446.71, network: 0.00, cost: 413674270215.48}                                                                                                                           |
|                             - EXCHANGE(SHUFFLE) [28, 25, 26]                                                                                                                                                                                                                |
|                                     Estimates: {row: 3974600961, cpu: 79492019239.43, memory: 0.00, network: 79492019239.43, cost: 197227587631.65}                                                                                                                         |
|                                 - HASH/INNER JOIN [49:cs_bill_customer_sk = 28:sr_customer_sk AND 45:cs_item_sk = 25:sr_item_sk] => [34:sr_return_quantity, 25:sr_item_sk, 26:sr_ticket_number, 28:sr_customer_sk, 62:cs_quantity]                                          |
|                                         Estimates: {row: 3974600961, cpu: 3194475520.33, memory: 625675998.77, network: 0.00, cost: 38243549152.79}                                                                                                                         |
|                                     - EXCHANGE(SHUFFLE) [49, 45]                                                                                                                                                                                                            |
|                                             Estimates: {row: 214066626, cpu: 2568799521.56, memory: 0.00, network: 2568799521.56, cost: 28177313949.60}                                                                                                                     |
|                                         - HASH/INNER JOIN [44:cs_sold_date_sk = 134:d_date_sk] => [49:cs_bill_customer_sk, 45:cs_item_sk, 62:cs_quantity]                                                                                                                   |
|                                                 Estimates: {row: 214066626, cpu: 23039687753.11, memory: 1097.11, network: 0.00, cost: 23039714906.49}                                                                                                                      |
|                                             - SCAN [catalog_sales] => [49:cs_bill_customer_sk, 44:cs_sold_date_sk, 45:cs_item_sk, 62:cs_quantity]                                                                                                                           |
|                                                     Estimates: {row: 1439980416, cpu: 23039686656.00, memory: 0.00, network: 0.00, cost: 11519843328.00}                                                                                                                    |
|                                                     partitionRatio: 1/1, tabletRatio: 256/256                                                                                                                                                                               |
|                                                     predicate: 49:cs_bill_customer_sk IS NOT NULL                                                                                                                                                                           |
|                                             - EXCHANGE(BROADCAST)                                                                                                                                                                                                           |
|                                                     Estimates: {row: 274, cpu: 4388.43, memory: 4388.43, network: 4388.43, cost: 18925.09}                                                                                                                                  |
|                                                 - SCAN [date_dim] => [134:d_date_sk]                                                                                                                                                                                        |
|                                                         Estimates: {row: 274, cpu: 2742.77, memory: 0.00, network: 0.00, cost: 1371.38}                                                                                                                                     |
|                                                         partitionRatio: 1/1, tabletRatio: 1/1                                                                                                                                                                               |
|                                                         predicate: 149:d_quarter_name IN (‘2001Q1’, ‘2001Q2’, ‘2001Q3’)                                                                                                                                                     |
|                                     - EXCHANGE(SHUFFLE) [28, 25]                                                                                                                                                                                                            |
|                                             Estimates: {row: 39104749, cpu: 625675998.77, memory: 0.00, network: 625675998.77, cost: 7011375528.04}                                                                                                                         |
|                                         - HASH/INNER JOIN [24:sr_returned_date_sk = 106:d_date_sk] => [34:sr_return_quantity, 25:sr_item_sk, 26:sr_ticket_number, 28:sr_customer_sk]                                                                                        |
|                                                 Estimates: {row: 39104749, cpu: 5759996377.11, memory: 1097.11, network: 0.00, cost: 5760023530.49}                                                                                                                         |
|                                             - SCAN [store_returns] => [34:sr_return_quantity, 24:sr_returned_date_sk, 25:sr_item_sk, 26:sr_ticket_number, 28:sr_customer_sk]                                                                                                |
|                                                     Estimates: {row: 287999764, cpu: 5759995280.00, memory: 0.00, network: 0.00, cost: 2879997640.00}                                                                                                                       |
|                                                     partitionRatio: 1/1, tabletRatio: 96/96                                                                                                                                                                                 |
|                                                     predicate: 28:sr_customer_sk IS NOT NULL                                                                                                                                                                                |
|                                             - EXCHANGE(BROADCAST)                                                                                                                                                                                                           |
|                                                     Estimates: {row: 274, cpu: 4388.43, memory: 4388.43, network: 4388.43, cost: 18925.09}                                                                                                                                  |
|                                                 - SCAN [date_dim] => [106:d_date_sk]                                                                                                                                                                                        |
|                                                         Estimates: {row: 274, cpu: 2742.77, memory: 0.00, network: 0.00, cost: 1371.38}                                                                                                                                     |
|                                                         partitionRatio: 1/1, tabletRatio: 1/1                                                                                                                                                                               |
|                                                         predicate: 121:d_quarter_name IN (‘2001Q1’, ‘2001Q2’, ‘2001Q3’)                                                                                                                                                     |
|                             - EXCHANGE(SHUFFLE) [5, 2, 3]                                                                                                                                                                                                                   |
|                                     Estimates: {row: 97285531, cpu: 13066287446.71, memory: 0.00, network: 13066287446.71, cost: 98541741391.09}                                                                                                                            |
|                                 - HASH/INNER JOIN [2:ss_item_sk = 191:i_item_sk] => [2:ss_item_sk, 3:ss_ticket_number, 195:i_item_desc, 5:ss_customer_sk, 186:s_state, 11:ss_quantity, 192:i_item_id]                                                                       |
|                                         Estimates: {row: 97285531, cpu: 1763557870.61, memory: 36094089.00, network: 0.00, cost: 72409166497.67}                                                                                                                            |
|                                     - HASH/INNER JOIN [9:ss_store_sk = 162:s_store_sk] => [2:ss_item_sk, 3:ss_ticket_number, 5:ss_customer_sk, 186:s_state, 11:ss_quantity]                                                                                                 |
|                                             Estimates: {row: 95996822, cpu: 2876078477.97, memory: 6007.00, network: 0.00, cost: 70557907803.31}                                                                                                                            |
|                                         - HASH/INNER JOIN [1:ss_sold_date_sk = 78:d_date_sk] => [2:ss_item_sk, 3:ss_ticket_number, 5:ss_customer_sk, 9:ss_store_sk, 11:ss_quantity]                                                                                         |
|                                                 Estimates: {row: 143803623, cpu: 69119712341.70, memory: 365.70, network: 0.00, cost: 69119721392.83}                                                                                                                       |
|                                             - SCAN [store_sales] => [1:ss_sold_date_sk, 2:ss_item_sk, 3:ss_ticket_number, 5:ss_customer_sk, 9:ss_store_sk, 11:ss_quantity]                                                                                                  |
|                                                     Estimates: {row: 2879987999, cpu: 69119711976.00, memory: 0.00, network: 0.00, cost: 34559855988.00}                                                                                                                    |
|                                                     partitionRatio: 1/1, tabletRatio: 256/256                                                                                                                                                                               |
|                                                     predicate: 5:ss_customer_sk IS NOT NULL                                                                                                                                                                                 |
|                                             - EXCHANGE(BROADCAST)                                                                                                                                                                                                           |
|                                                     Estimates: {row: 91, cpu: 1462.81, memory: 1462.81, network: 1462.81, cost: 6308.36}                                                                                                                                    |
|                                                 - SCAN [date_dim] => [78:d_date_sk]                                                                                                                                                                                         |
|                                                         Estimates: {row: 91, cpu: 914.26, memory: 0.00, network: 0.00, cost: 457.13}                                                                                                                                        |
|                                                         partitionRatio: 1/1, tabletRatio: 1/1                                                                                                                                                                               |
|                                                         predicate: 93:d_quarter_name = ‘2001Q1’                                                                                                                                                                             |
|                                         - EXCHANGE(BROADCAST)                                                                                                                                                                                                               |
|                                                 Estimates: {row: 1002, cpu: 24028.00, memory: 24028.00, network: 24028.00, cost: 99115.50}                                                                                                                                  |
|                                             - SCAN [store] => [162:s_store_sk, 186:s_state]                                                                                                                                                                                 |
|                                                     Estimates: {row: 1002, cpu: 6007.00, memory: 0.00, network: 0.00, cost: 3003.50}                                                                                                                                        |
|                                                     partitionRatio: 1/1, tabletRatio: 1/1                                                                                                                                                                                   |
|                                     - EXCHANGE(BROADCAST)                                                                                                                                                                                                                   |
|                                             Estimates: {row: 300000, cpu: 144376356.00, memory: 144376356.00, network: 144376356.00, cost: 595552468.50}                                                                                                                    |
|                                         - SCAN [item] => [195:i_item_desc, 191:i_item_sk, 192:i_item_id]                                                                                                                                                                    |
|                                                 Estimates: {row: 300000, cpu: 36094089.00, memory: 0.00, network: 0.00, cost: 18047044.50}                                                                                                                                  |
|                                                 partitionRatio: 1/1, tabletRatio: 1/1