剑道第一仙

导航

gaussdb通过编写shell脚本自动化执行查询和结果收集

转:https://support.huaweicloud.com/pwp-dws/dws_13_00033.html

1、登录ECS,进入到/opt目录下,使用vim命令生成query.conf和run_query.sh两个脚本文件。脚本内容如下,编辑后按:wq!保存脚本配置:

run_query.sh脚本如下:

#!/bin/bash
script_path=$(cd `dirname $0`;pwd)
query_mode=$1
query_path=$2
query_object=$3
query_log=${script_path}/query_log_`date +%y%m%d_%H%M%S`
source ${script_path}/query.conf

function usage()
{
    echo "[NOTICE]: This script is used to run queries and collect cost time, according to sepcified path and query file name."
    echo "          You can run the script as below:" 
    echo -e ""
    echo "          1. config the query.conf file."
    echo "          2. run the script in batch mode. "
    echo "          eg. sh run_query.sh batch [query file's absolute path]"
    echo -e ""
    echo "          3. run the script in single mode."
    echo "          eg. sh run_query.sh single [query file's absolute path] [specified query file name]"
}

function log_file_init()
{
    mkdir -p ${query_log}/explain_log
    mkdir -p ${query_log}/pre_warm_log
    mkdir -p ${query_log}/query_test_log
    touch ${query_log}/query_result.csv
    echo "query name,cost time1,cost time2,cost time3,average cost" > ${query_log}/query_result.csv
}

function single_query()
{
    echo "[INFO]: Single mode query is to start."
    echo "*****${query_object} begin*****"
    collect_plan
    pre_warm
    query_test
    echo "*****${query_object} end*****"
    echo "[INFO]: Single mode query is finished."
    echo "[NOTICE]: Get more details by query_result.csv and other logs in ${query_log}."
}

function batch_query()
{
    echo "[INFO]: Batch mode query is to start."
    for query_object in `ls ${query_path}`
    do
        echo "*****${query_object} begin*****"
        collect_plan
        pre_warm
        query_test
        echo "*****${query_object} end*****"
    done
    echo "[INFO]: Batch mode query is finished."
    echo "[NOTICE]: Get more details by query_result.csv and other logs in ${query_log}."
}

function collect_plan()
{
    echo "[STEP1]: Starting to collect plan."
    echo "explain performance" > ${query_log}/explain_log/${query_object}.tmp
    cat ${query_path}/${query_object} >> ${query_log}/explain_log/${query_object}.tmp
    gsql -h ${cluster_ip} -d ${db_name} -p ${db_port} -U ${db_user} -W "${user_pwd}" -f ${query_log}/explain_log/${query_object}.tmp > ${query_log}/explain_log/${query_object}.explain 2>&1
    echo "[STEP1]: Finished."
}

function pre_warm()
{
    echo "[STEP2]: Starting to pre-warm."
    for i in {1..2}
    do
        gsql -h ${cluster_ip} -d ${db_name} -p ${db_port} -U ${db_user} -W "${user_pwd}" -f ${query_path}/${query_object} > ${query_log}/pre_warm_log/${query_object}.pre${i} 2>&1
    done
    echo "[STEP2]: Finished."
}

function query_test()
{
    time1=""
    time2=""
    time3=""
    echo "[STEP3]: Starting to do real test."
    for i in {1..3}
    do
        gsql -h ${cluster_ip} -d ${db_name} -p ${db_port} -U ${db_user} -W "${user_pwd}" -f ${query_path}/${query_object} > ${query_log}/query_test_log/${query_object}.real${i} 2>&1
        let "`eval echo "time"${i}`=`cat ${query_log}/query_test_log/${query_object}.real${i}|grep "total time:"|awk {'print$3'}`"
    done
    time_ave=`echo "scale=2;(${time1}+${time2}+${time3})/3"|bc -l`
    echo "${query_object},${time1},${time2},${time3},${time_ave}" >> ${query_log}/query_result.csv
    echo "[step3]: Finished. The average time:${time_ave} ms."
}
case ${query_mode} in
    single)log_file_init;single_query;;
    batch)log_file_init;batch_query;;
    *)usage;;
esac

query.conf为集群信息配置文件,包含如下五个变量

cluster_ip=127.0.0.1                集群主cn节点内网ip 
db_name=tpcds_test                  数据库名称
db_port=6000                        数据库端口号
db_user=tpcds_user                  数据库用户
user_pwd=123456                     数据库用户密码

2、编辑query.conf为集群对应的信息后,先执行source gsql_env变量后,执行sh run_query.sh即可开始查询执行和结果收集。

示例:sh run_query.sh batch query1000x/

参数1:批量执行选择batch,单个query执行选择single。

参数2:tpcds1000x或者tpch1000x query存放的绝对路径。

参数3:如果参数1选择batch,此参数忽略;如果参数1选择single,此参数为具体执行的query名称,例如Q1。
须知:
1. gsql客户端的使用需要每次连接后,source gsql_env,执行查询脚本前请确认gsql可执行。
2. query1000x文件夹中必现包含tpch 22个标准查询文件Q1~Q22或者tpcds 99个标准查询文件Q1~Q99。
3. run_query.sh脚本依赖bc命令,执行前确认bc命令存在。
4、 每个查询默认会跑6次,第1次收集执行计划,第2,3次预热,第4到第6次正式查询,最终结果取后3次查询的平均值。
5、 查询脚本执行后会立即生成query_log_yymmdd_hhmmss名称的目录,其中
    exlain_log子目录存放查询计划。
    pre_warm子目录存放预热执行结果。
    query_test子目录存放正式查询执行结果。
    query_result.csv文件,csv格式汇总所有查询的执行结果。

 

 

posted on 2024-04-18 14:20  剑道第一仙  阅读(502)  评论(0编辑  收藏  举报