StarRocks的SQL指纹应用

StarRocks的SQL指纹应用
–2022-03-10 春雷

1、前言

StarRocks 的2.1版本发布啦~这个版本上线了我们心心念念的SQL指纹。如下:

  • 支持 SQL 指纹,针对慢查询中各类 SQL 语句计算出 SQL 指纹,方便您快速定位慢查询

2、SQL指纹说 明与使用

2.1、说明
StarRocks支持规范化慢查询( 路径fe.audit.log 的slow_query )中 SQL 语句,并进行归类。然后针对各个类型的SQL语句,计算出其的MD5 哈希值,对应字段为 Digest。

日志举例:

2021-12-27 15:13:39,108 [slow_query] |Client=172.26.xx.xxx:54956|User=root|Db=default_cluster:test|State=EOF|Time=2469|ScanBytes=0|ScanRows=0|ReturnRows=6|StmtId=3|QueryId=824d8dc0-66e4-11ec-9fdc-00163e04d4c2|IsQuery=true|feIp=172.26.92.195|Stmt=select count(*) from test_basic group by id_bigint|Digest=51390da6b57461f571f0712d527320f4

有了SQL指纹,我们就可以按照SQL指纹汇总,就可以看出一段时间内的慢SQL汇总情况了,比如:总次数、平均返回行数、平均扫描行数、平均执行时间等

2.2、使用架构

2.3、平台设计

  • 天级别慢SQL
    • 方便查看前一天的慢SQL汇总情况,趋势图
  • 实时慢SQL
    • 快速查看最近时间的慢SQL情况
  • 指定时间慢SQL
    • 可以指定某段时间,汇总展示SQL情况,次数、平均执行时间等。

2.3.1、天级别慢SQL

2.3.2、实时慢SQL

2.3.3、指定时间慢SQL汇总

3、StarRocks慢SQL展示实现具体

3.1、实时慢SQL实现

StarRocks实时慢SQL实现,请参考:

https://mp.weixin.qq.com/s?__biz=Mzg5NjUyMzM2Ng==&mid=2247483952&idx=1&sn=256c91bb0c3405abead5c1989082cc7c&chksm=c07e8f39f709062f3c0e051ab503b038044a243523aae694aa165e573122538fcb061b78987a&scene=21#wechat_redirect

3.2、filebeat修改

修改配置文件,并重启filebeat

filebeat.yml

processors:

  • script:

    lang: javascript

    id: my_filter

    tag: enable

    source: >

    function process(event) {

    var str = event.Get(“message”);

    var slow_time = str.substr(0, 19);

    var query_type = str.substr(25,10);

    var detail_query = str.substr(38);

    var js_arr = detail_query.split("|");

    var Client_tmp = js_arr[0];

    var Client_tmp2 = Client_tmp.replace(‘Client=’,’’);

    var Client_tmp3 = Client_tmp2.replace(‘t=’,’’);

    var Client_arr = Client_tmp3.split(":");

    var Client = Client_arr[0]

var User_tmp = js_arr[1];

var User = User_tmp.replace(‘User=default_cluster:’,’’);

var Db_tmp = js_arr[2];

var Db = Db_tmp.replace(‘Db=default_cluster:’,’’);

var State_tmp = js_arr[3];

var State = State_tmp.replace(‘State=’,’’);

var Time_tmp = js_arr[4];

var Time = Time_tmp.replace(‘Time=’,’’);

var ScanBytes_tmp = js_arr[5];

var ScanBytes = ScanBytes_tmp.replace(‘ScanBytes=’,’’);

var ScanRows_tmp = js_arr[6];

var ScanRows = ScanRows_tmp.replace(‘ScanRows=’,’’);

var ReturnRows_tmp = js_arr[7];

var ReturnRows = ReturnRows_tmp.replace(‘ReturnRows=’,’’);

var StmtId_tmp = js_arr[8];

var StmtId = StmtId_tmp.replace(‘StmtId=’,’’);

var QueryId_tmp = js_arr[9];

var QueryId = QueryId_tmp.replace(‘QueryId=’,’’);

var IsQuery_tmp = js_arr[10];

var IsQuery = IsQuery_tmp.replace(‘IsQuery=’,’’);

var feIp_tmp = js_arr[11];

var feIp = feIp_tmp.replace(‘feIp=’,’’);

var Stmt_tmp = js_arr[12];

var Stmt = Stmt_tmp.replace(‘Stmt=’,’’);

var Stmt = Stmt.substring(0,65530)

var Digest_tmp = js_arr[13];

var Digest = Digest_tmp.replace(‘Digest=’,’’);

event.Put(“query_type”,query_type);

event.Put(“slow_time”,slow_time);

event.Put(“Client”,Client);

event.Put(“User”,User);

event.Put(“Db”,Db);

event.Put(“State”,State);

event.Put(“Time”,Time);

event.Put(“ScanBytes”,ScanBytes);

event.Put(“ScanRows”,ScanRows);

event.Put(“ReturnRows”,ReturnRows);

event.Put(“StmtId”,StmtId);

event.Put(“QueryId”,QueryId);

event.Put(“IsQuery”,IsQuery);

event.Put(“feIp”,feIp);

event.Put(“Stmt”,Stmt);

event.Put(“Digest”,Digest);

}

3.2、修改表结构

alter table starrocks_slow add column digest varchar(50) default null comment ‘SQL指纹 Digest’;

3.3、修改任务

CREATE ROUTINE LOAD starrocks_slow_load ON starrocks_slow
columns (slow_time,igid,db_name,fe_ip,query_id,time,client,user,state,scan_bytes,scan_rows,return_rows,stmt_id,is_query,stmt,query_type,digest)
PROPERTIES (
“format”=“json”,
“jsonpaths”="["$.slow_time","$.igid","$.Db","$.feIp","$.QueryId","$.Time","$.Client","$.User","$.State","$.ScanBytes","$.ScanRows","$.ReturnRows","$.StmtId","$.IsQuery","$.Stmt","$.query_type","$.Digest"]",
“desired_concurrent_number”=“8”,
“max_error_number” = “9999999999”,
“max_batch_rows”=“200000”,
“max_batch_size” = “209715200”,
“max_batch_interval” = “10”,
“strict_mode” = “false”
)
FROM KAFKA
(
“kafka_broker_list”= “10.1.1.1:9092,10.1.1.2:9092,10.1.1.3:9092,10.1.1.4:9092,10.1.1.5:9092”,
“kafka_topic” = “starrocks_slow_log”,
“property.kafka_default_offsets” = “OFFSET_END”,
“property.client.id” = “xxx”,
“property.group.id” = “starrocks_slow_load”
);

3.4、查看结果

模拟慢SQL,及查看结果