StarRocks的SQL指纹应用
–2022-03-10 春雷
1、前言
StarRocks 的2.1版本发布啦~这个版本上线了我们心心念念的SQL指纹。如下:
- 支持 SQL 指纹,针对慢查询中各类 SQL 语句计算出 SQL 指纹,方便您快速定位慢查询
2、SQL指纹说 明与使用
2.1、说明
StarRocks支持规范化慢查询( 路径fe.audit.log 的slow_query )中 SQL 语句,并进行归类。然后针对各个类型的SQL语句,计算出其的MD5 哈希值,对应字段为 Digest。
日志举例:
2021-12-27 15:13:39,108 [slow_query] |Client=172.26.xx.xxx:54956|User=root|Db=default_cluster:test|State=EOF|Time=2469|ScanBytes=0|ScanRows=0|ReturnRows=6|StmtId=3|QueryId=824d8dc0-66e4-11ec-9fdc-00163e04d4c2|IsQuery=true|feIp=172.26.92.195|Stmt=select count(*) from test_basic group by id_bigint|Digest=51390da6b57461f571f0712d527320f4
有了SQL指纹,我们就可以按照SQL指纹汇总,就可以看出一段时间内的慢SQL汇总情况了,比如:总次数、平均返回行数、平均扫描行数、平均执行时间等
2.2、使用架构
2.3、平台设计
- 天级别慢SQL
- 方便查看前一天的慢SQL汇总情况,趋势图
- 实时慢SQL
- 快速查看最近时间的慢SQL情况
- 指定时间慢SQL
- 可以指定某段时间,汇总展示SQL情况,次数、平均执行时间等。
2.3.1、天级别慢SQL
2.3.2、实时慢SQL
2.3.3、指定时间慢SQL汇总
3、StarRocks慢SQL展示实现具体
3.1、实时慢SQL实现
StarRocks实时慢SQL实现,请参考:
3.2、filebeat修改
修改配置文件,并重启filebeat
filebeat.yml
processors:
-
script:
lang: javascript
id: my_filter
tag: enable
source: >
function process(event) {
var str = event.Get(“message”);
var slow_time = str.substr(0, 19);
var query_type = str.substr(25,10);
var detail_query = str.substr(38);
var js_arr = detail_query.split("|");
var Client_tmp = js_arr[0];
var Client_tmp2 = Client_tmp.replace(‘Client=’,’’);
var Client_tmp3 = Client_tmp2.replace(‘t=’,’’);
var Client_arr = Client_tmp3.split(":");
var Client = Client_arr[0]
var User_tmp = js_arr[1];
var User = User_tmp.replace(‘User=default_cluster:’,’’);
var Db_tmp = js_arr[2];
var Db = Db_tmp.replace(‘Db=default_cluster:’,’’);
var State_tmp = js_arr[3];
var State = State_tmp.replace(‘State=’,’’);
var Time_tmp = js_arr[4];
var Time = Time_tmp.replace(‘Time=’,’’);
var ScanBytes_tmp = js_arr[5];
var ScanBytes = ScanBytes_tmp.replace(‘ScanBytes=’,’’);
var ScanRows_tmp = js_arr[6];
var ScanRows = ScanRows_tmp.replace(‘ScanRows=’,’’);
var ReturnRows_tmp = js_arr[7];
var ReturnRows = ReturnRows_tmp.replace(‘ReturnRows=’,’’);
var StmtId_tmp = js_arr[8];
var StmtId = StmtId_tmp.replace(‘StmtId=’,’’);
var QueryId_tmp = js_arr[9];
var QueryId = QueryId_tmp.replace(‘QueryId=’,’’);
var IsQuery_tmp = js_arr[10];
var IsQuery = IsQuery_tmp.replace(‘IsQuery=’,’’);
var feIp_tmp = js_arr[11];
var feIp = feIp_tmp.replace(‘feIp=’,’’);
var Stmt_tmp = js_arr[12];
var Stmt = Stmt_tmp.replace(‘Stmt=’,’’);
var Stmt = Stmt.substring(0,65530)
var Digest_tmp = js_arr[13];
var Digest = Digest_tmp.replace(‘Digest=’,’’);
event.Put(“query_type”,query_type);
event.Put(“slow_time”,slow_time);
event.Put(“Client”,Client);
event.Put(“User”,User);
event.Put(“Db”,Db);
event.Put(“State”,State);
event.Put(“Time”,Time);
event.Put(“ScanBytes”,ScanBytes);
event.Put(“ScanRows”,ScanRows);
event.Put(“ReturnRows”,ReturnRows);
event.Put(“StmtId”,StmtId);
event.Put(“QueryId”,QueryId);
event.Put(“IsQuery”,IsQuery);
event.Put(“feIp”,feIp);
event.Put(“Stmt”,Stmt);
event.Put(“Digest”,Digest);
}
3.2、修改表结构
alter table starrocks_slow add column digest varchar(50) default null comment ‘SQL指纹 Digest’;
3.3、修改任务
CREATE ROUTINE LOAD starrocks_slow_load ON starrocks_slow
columns (slow_time,igid,db_name,fe_ip,query_id,time,client,user,state,scan_bytes,scan_rows,return_rows,stmt_id,is_query,stmt,query_type,digest)
PROPERTIES (
“format”=“json”,
“jsonpaths”="["$.slow_time","$.igid","$.Db","$.feIp","$.QueryId","$.Time","$.Client","$.User","$.State","$.ScanBytes","$.ScanRows","$.ReturnRows","$.StmtId","$.IsQuery","$.Stmt","$.query_type","$.Digest"]",
“desired_concurrent_number”=“8”,
“max_error_number” = “9999999999”,
“max_batch_rows”=“200000”,
“max_batch_size” = “209715200”,
“max_batch_interval” = “10”,
“strict_mode” = “false”
)
FROM KAFKA
(
“kafka_broker_list”= “10.1.1.1:9092,10.1.1.2:9092,10.1.1.3:9092,10.1.1.4:9092,10.1.1.5:9092”,
“kafka_topic” = “starrocks_slow_log”,
“property.kafka_default_offsets” = “OFFSET_END”,
“property.client.id” = “xxx”,
“property.group.id” = “starrocks_slow_load”
);
3.4、查看结果
模拟慢SQL,及查看结果