Used: 438105027512, Limit: 438103947386. Mem usage has exceed the limit of single query, You can change the limit by set session variable exec_mem_limit目前我设置比较这个限制的值大,还是报错超出限制,

没有出现null就好了,您继续把数据导入先吧,数据重复那个问题不大。

-h 的目前已经跑出来了,tds的导入数据出现很多null 值修改了列的字段类型把int 变成了bigint 有的值有变化,有的值还是null ,null 是分布键,我试试把int换成string 或者是其他类型可以吗

string 性能会较差,如果有少量的null也属于正常情况。

-h全部SQL都能跑出来吧?性能咋样

发您了,帮忙看看还用提升的空间吗


咱们用到列来做分布键的还是有很多null

原文件是有值还是null?

源文件是没有的null

主要解决分布键吧 bigint都不够长吗?还是数字里有字符串?

对用bigint 还是null 这是数据

可能是本来float 写进去就变成null吧 bigint已经很长,如果是数字 应该可以写入

现在导入就null ,您那边生成ds数据导入应该都正常吧

我还没测试ds,具体找一下原因

我看您帮忙修改的建表语句指定的分布键上用的列不是null

分布键不是null就先测试吧,这个影响较大,其他列可能只会影响少数SQL,可以针对测试结果再优化。

对,我看目前有一个sql 查了将尽20多分钟了,还没有出来呢

麻烦发一下SQL和大表的表结构

MySQL [tcpds10tmp1l]> WITH cross_items AS
-> (SELECT i_item_sk ss_item_sk
-> FROM item,
-> (SELECT
-> iss.i_brand_id brand_id,
-> iss.i_class_id class_id,
-> iss.i_category_id category_id
-> FROM store_sales, item iss, date_dim d1
-> WHERE ss_item_sk = iss.i_item_sk
-> AND ss_sold_date_sk = d1.d_date_sk
-> AND d1.d_year BETWEEN 1999 AND 1999 + 2
-> INTERSECT
-> SELECT
-> ics.i_brand_id,
-> ics.i_class_id,
-> ics.i_category_id
-> FROM catalog_sales, item ics, date_dim d2
-> WHERE cs_item_sk = ics.i_item_sk
-> AND cs_sold_date_sk = d2.d_date_sk
-> AND d2.d_year BETWEEN 1999 AND 1999 + 2
-> INTERSECT
-> SELECT
-> iws.i_brand_id,
-> iws.i_class_id,
-> iws.i_category_id
-> FROM web_sales, item iws, date_dim d3
-> WHERE ws_item_sk = iws.i_item_sk
-> AND ws_sold_date_sk = d3.d_date_sk
-> AND d3.d_year BETWEEN 1999 AND 1999 + 2) x
-> WHERE i_brand_id = brand_id
-> AND i_class_id = class_id
-> AND i_category_id = category_id
-> ),
-> avg_sales AS
-> (SELECT avg(quantity * list_price) average_sales
-> FROM (SELECT
-> ss_quantity quantity,
-> ss_list_price list_price
-> FROM store_sales, date_dim
-> WHERE ss_sold_date_sk = d_date_sk
-> AND d_year BETWEEN 1999 AND 2001
-> UNION ALL
-> SELECT
-> cs_quantity quantity,
-> cs_list_price list_price
-> FROM catalog_sales, date_dim
-> WHERE cs_sold_date_sk = d_date_sk
-> AND d_year BETWEEN 1999 AND 1999 + 2
-> UNION ALL
-> SELECT
-> ws_quantity quantity,
-> ws_list_price list_price
-> FROM web_sales, date_dim
-> WHERE ws_sold_date_sk = d_date_sk
-> AND d_year BETWEEN 1999 AND 1999 + 2) x)
-> SELECT
-> channel,
-> i_brand_id,
-> i_class_id,
-> i_category_id,
-> sum(sales),
-> sum(number_sales)
-> FROM (
-> SELECT
-> ‘store’ channel,
-> i_brand_id,
-> i_class_id,
-> i_category_id,
-> sum(ss_quantity * ss_list_price) sales,
-> count() number_sales
-> FROM store_sales, item, date_dim
-> WHERE ss_item_sk IN (SELECT ss_item_sk
-> FROM cross_items)
-> AND ss_item_sk = i_item_sk
-> AND ss_sold_date_sk = d_date_sk
-> AND d_year = 1999 + 2
-> AND d_moy = 11
-> GROUP BY i_brand_id, i_class_id, i_category_id
-> HAVING sum(ss_quantity * ss_list_price) > (SELECT average_sales
-> FROM avg_sales)
-> UNION ALL
-> SELECT
-> ‘catalog’ channel,
-> i_brand_id,
-> i_class_id,
-> i_category_id,
-> sum(cs_quantity * cs_list_price) sales,
-> count(
) number_sales
-> FROM catalog_sales, item, date_dim
-> WHERE cs_item_sk IN (SELECT ss_item_sk
-> FROM cross_items)
-> AND cs_item_sk = i_item_sk
-> AND cs_sold_date_sk = d_date_sk
-> AND d_year = 1999 + 2
-> AND d_moy = 11
-> GROUP BY i_brand_id, i_class_id, i_category_id
-> HAVING sum(cs_quantity * cs_list_price) > (SELECT average_sales FROM avg_sales)
-> UNION ALL
-> SELECT
-> ‘web’ channel,
-> i_brand_id,
-> i_class_id,
-> i_category_id,
-> sum(ws_quantity * ws_list_price) sales,
-> count(*) number_sales
-> FROM web_sales, item, date_dim
-> WHERE ws_item_sk IN (SELECT ss_item_sk
-> FROM cross_items)
-> AND ws_item_sk = i_item_sk
-> AND ws_sold_date_sk = d_date_sk
-> AND d_year = 1999 + 2
-> AND d_moy = 11
-> GROUP BY i_brand_id, i_class_id, i_category_id
-> HAVING sum(ws_quantity * ws_list_price) > (SELECT average_sales
-> FROM avg_sales)
-> ) y
-> GROUP BY ROLLUP (channel, i_brand_id, i_class_id, i_category_id)
-> ORDER BY channel, i_brand_id, i_class_id, i_category_id
-> LIMIT 100;
这个是sql

表结构是
image

show create table xxx\G 发一下表结构

| store_sales | CREATE TABLE store_sales (
ss_sold_date_sk bigint(20) NULL COMMENT “”,
ss_sold_time_sk bigint(20) NULL COMMENT “”,
ss_item_sk bigint(20) NULL COMMENT “”,
ss_customer_sk bigint(20) NULL COMMENT “”,
ss_cdemo_sk bigint(20) NULL COMMENT “”,
ss_hdemo_sk bigint(20) NULL COMMENT “”,
ss_addr_sk bigint(20) NULL COMMENT “”,
ss_store_sk bigint(20) NULL COMMENT “”,
ss_promo_sk bigint(20) NULL COMMENT “”,
ss_ticket_number bigint(20) NULL COMMENT “”,
ss_quantity int(11) NULL COMMENT “”,
ss_wholesale_cost decimal64(7, 2) NULL COMMENT “”,
ss_list_price decimal64(7, 2) NULL COMMENT “”,
ss_sales_price decimal64(7, 2) NULL COMMENT “”,
ss_ext_discount_amt decimal64(7, 2) NULL COMMENT “”,
ss_ext_sales_price decimal64(7, 2) NULL COMMENT “”,
ss_ext_wholesale_cost decimal64(7, 2) NULL COMMENT “”,
ss_ext_list_price decimal64(7, 2) NULL COMMENT “”,
ss_ext_tax decimal64(7, 2) NULL COMMENT “”,
ss_coupon_amt decimal64(7, 2) NULL COMMENT “”,
ss_net_paid decimal64(7, 2) NULL COMMENT “”,
ss_net_paid_inc_tax decimal64(7, 2) NULL COMMENT “”,
ss_net_profit decimal64(7, 2) NULL COMMENT “”
) ENGINE=OLAP
DUPLICATE KEY(ss_sold_date_sk, ss_sold_time_sk, ss_item_sk)
COMMENT “OLAP”
DISTRIBUTED BY HASH(ss_item_sk) BUCKETS 576
PROPERTIES (
“replication_num” = “1”,
“bloom_filter_columns” = “ss_item_sk, ss_store_sk, ss_quantity”,
“colocate_with” = “group1”,
“in_memory” = “false”,
“storage_format” = “DEFAULT”,
“enable_persistent_index” = “false”,
“storage_medium” = “SSD”
);

| catalog_sales | CREATE TABLE catalog_sales (
cs_sold_date_sk bigint(20) NULL COMMENT “”,
cs_sold_time_sk bigint(20) NULL COMMENT “”,
cs_ship_date_sk bigint(20) NULL COMMENT “”,
cs_bill_customer_sk bigint(20) NULL COMMENT “”,
cs_bill_cdemo_sk bigint(20) NULL COMMENT “”,
cs_bill_hdemo_sk bigint(20) NULL COMMENT “”,
cs_bill_addr_sk bigint(20) NULL COMMENT “”,
cs_ship_customer_sk bigint(20) NULL COMMENT “”,
cs_ship_cdemo_sk bigint(20) NULL COMMENT “”,
cs_ship_hdemo_sk bigint(20) NULL COMMENT “”,
cs_ship_addr_sk bigint(20) NULL COMMENT “”,
cs_call_center_sk bigint(20) NULL COMMENT “”,
cs_catalog_page_sk bigint(20) NULL COMMENT “”,
cs_ship_mode_sk bigint(20) NULL COMMENT “”,
cs_warehouse_sk bigint(20) NULL COMMENT “”,
cs_item_sk bigint(20) NULL COMMENT “”,
cs_promo_sk bigint(20) NULL COMMENT “”,
cs_order_number bigint(20) NULL COMMENT “”,
cs_quantity bigint(20) NULL COMMENT “”,
cs_wholesale_cost decimal64(15, 2) NULL COMMENT “”,
cs_list_price decimal64(15, 2) NULL COMMENT “”,
cs_sales_price decimal64(15, 2) NULL COMMENT “”,
cs_ext_discount_amt decimal64(15, 2) NULL COMMENT “”,
cs_ext_sales_price decimal64(15, 2) NULL COMMENT “”,
cs_ext_wholesale_cost decimal64(15, 2) NULL COMMENT “”,
cs_ext_list_price decimal64(15, 2) NULL COMMENT “”,
cs_ext_tax decimal64(15, 2) NULL COMMENT “”,
cs_coupon_amt decimal64(15, 2) NULL COMMENT “”,
cs_ext_ship_cost decimal64(15, 2) NULL COMMENT “”,
cs_net_paid decimal64(15, 2) NULL COMMENT “”,
cs_net_paid_inc_tax decimal64(15, 2) NULL COMMENT “”,
cs_net_paid_inc_ship decimal64(15, 2) NULL COMMENT “”,
cs_net_paid_inc_ship_tax decimal64(15, 2) NULL COMMENT “”,
cs_net_profit decimal64(7, 2) NULL COMMENT “”,
INDEX index1 (cs_quantity) USING BITMAP COMMENT ‘Bitmap Index’
) ENGINE=OLAP
DUPLICATE KEY(cs_sold_date_sk, cs_sold_time_sk, cs_ship_date_sk)
COMMENT “OLAP”
DISTRIBUTED BY HASH(cs_item_sk) BUCKETS 576
PROPERTIES (
“replication_num” = “1”,
“bloom_filter_columns” = “cs_order_number, cs_item_sk”,
“colocate_with” = “group2”,
“in_memory” = “false”,
“storage_format” = “DEFAULT”,
“enable_persistent_index” = “false”,
“storage_medium” = “SSD”
);