Notes - man - linux load | ssar
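How linux load is calculated
To compute load, the kernel walks every CPU and sums each CPU's active task count (nr_running + nr_uninterruptible) into nr_active: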
The global load average is an exponentially decaying average of nr_running + nr_uninterruptible.
Once every LOAD_FREQ (about 5 seconds):
    nr_active = 0;
    for cpu in cpus:
        nr_active += cpu->nr_running + cpu->nr_uninterruptible;
    avenrun[n] = avenrun[n] * exp_n + nr_active * (1 - exp_n)
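About exp_n: first, imagine it does not exist (exp_n = 0), so that avenrun[n] = nr_active. Load is only sampled once every 5001 ms (LOAD_FREQ is 5*HZ+1 ticks, so this figure assumes HZ=1000), and nr_active is the instantaneous value at that moment, so some form of weighted averaging is reasonable. The simplest choice is half and half (exp_n = 0.5), but a better choice is the exponentially weighted moving average, in which each sample's weight decays exponentially with its age: the closer a sample is to the present, the larger its weight, so the result tracks recent trends more closely.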
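The update rule described above can be sketched in a few lines of Python. This is a floating-point illustration only, not the kernel's code (the kernel does the same update in fixed-point arithmetic); the function names and the 5-second interval constant are assumptions made for this example.

```python
import math

LOAD_FREQ_SEC = 5.0                  # sampling interval, roughly LOAD_FREQ (assumed)
WINDOWS_SEC = (60.0, 300.0, 900.0)   # 1, 5 and 15 minute averages


def decay_factors(interval=LOAD_FREQ_SEC, windows=WINDOWS_SEC):
    """Return exp_n for each window: exp(-interval / window)."""
    return tuple(math.exp(-interval / w) for w in windows)


def update_load(avenrun, nr_active, factors):
    """One EMA step: avenrun[n] = avenrun[n] * exp_n + nr_active * (1 - exp_n)."""
    return tuple(a * e + nr_active * (1.0 - e) for a, e in zip(avenrun, factors))


def simulate(samples, avenrun=(0.0, 0.0, 0.0)):
    """Feed a sequence of nr_active samples (one per interval) through the EMA."""
    factors = decay_factors()
    for nr_active in samples:
        avenrun = update_load(avenrun, nr_active, factors)
    return avenrun


if __name__ == "__main__":
    # One minute (12 samples) with 3 active tasks, starting from an idle system.
    load1, load5, load15 = simulate([3] * 12)
    print(f"load1={load1:.2f} load5={load5:.2f} load15={load15:.2f}")
```

Starting from zero, the 1-minute average reaches only about 63% of the steady value after one minute of constant load, which is why load1 lags behind sudden changes in nr_active.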
ssar architecture
The sresar daemon runs in the background as a resident service and saves the collected data under /var/log/sre_proc/data/.
ssar usage examples
date +%FT%T; sleep 5s; ./stress-ng -t 25 --mutex 1
$ssar load5s -b 2024-08-25T23:29:39 -r 1
collect_datetime     threads  load1  runq  load5s  stype  sstate  zstate  act  act_rto  actr  actr_rto  actd
2024-08-25T23:29:43      314   1.15     1       1     5s       N       U    -        -     -         -     -
2024-08-25T23:29:48      317   1.22     3       2     5s       N       U    -        -     -         -     -
2024-08-25T23:29:53      317   1.28     3       2     5s       N       U    -        -     -         -     -
2024-08-25T23:29:58      317   1.34     3       2     5s       N       U    -        -     -         -     -
2024-08-25T23:30:03      317   1.47     1       3     5s       N       U    -        -     -         -     -
Problem when using ssar load2p -c: ReadLoadrdFileData failed. Make sure the param -c is correct, act field is not -.
root@192.168.99.124:~ $ssar load2p -c 2024-08-25T23:30:00
ReadLoadrdFileData failed. Make sure the param -c <collect time> is correct, act field is not -.
Solution: run gdb --args ssar load2p -c 2024-08-25T23:30:00, set a breakpoint on ReadLoadrdFileData, and, following the source code, print it_path:
Breakpoint 1, ReadLoadrdFileData (seq_option=..., it_list_load2p_t=empty std::__cxx11::list) at ssar.cpp:2283
2283 ssar.cpp: No such file or directory.
Missing separate debuginfos, use: dnf debuginfo-install libgcc-10.3.1-10.oe2203.x86_64 libstdc++-10.3.1-10.oe2203.x86_64 zlib-1.2.11-24.oe2203.x86_64
(gdb) n
2284 in ssar.cpp
(gdb) p it_path
$1 = "/var/log/sre_proc/data/2024082523/20240825233000_loadrd"
Then check which files actually end in _loadrd, and change the -c argument to the matching time:
root@192.168.99.124:~ $ll /var/log/sre_proc/data/2024082523/*_loadrd
-rw-r--r-- 1 root root 50 Aug 25 23:47 /var/log/sre_proc/data/2024082523/20240825234739_loadrd
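The manual lookup above can be automated. The following is a minimal sketch that assumes the on-disk layout seen in the paths above (an hourly YYYYmmddHH directory containing YYYYmmddHHMMSS_loadrd files); the helper names are hypothetical and are not part of ssar.

```python
import glob
import os
from datetime import datetime

SUFFIX = "_loadrd"


def loadrd_times(data_dir):
    """Return sorted datetimes parsed from <data_dir>/<hour>/<timestamp>_loadrd."""
    times = []
    for path in glob.glob(os.path.join(data_dir, "*", "*" + SUFFIX)):
        stamp = os.path.basename(path)[: -len(SUFFIX)]
        try:
            times.append(datetime.strptime(stamp, "%Y%m%d%H%M%S"))
        except ValueError:
            continue  # skip names that do not match the assumed pattern
    return sorted(times)


def nearest_collect_time(data_dir, wanted):
    """Return the existing collect time closest to `wanted`, or None if there is none."""
    target = datetime.strptime(wanted, "%Y-%m-%dT%H:%M:%S")
    times = loadrd_times(data_dir)
    if not times:
        return None
    best = min(times, key=lambda t: abs(t - target))
    return best.strftime("%Y-%m-%dT%H:%M:%S")


if __name__ == "__main__":
    print(nearest_collect_time("/var/log/sre_proc/data", "2024-08-25T23:30:00"))
```

The printed value, if any, is in the same format that the -c option takes in the commands above.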
Building and installing ssar
$ make
make -C conf
make[1]: Entering directory '/home/aim/aim/ssar/conf'
gzip -c ssar.1 > ssar.1.gz
gzip -c zh_CN.ssar.1 > zh_CN.ssar.1.gz
make[1]: Leaving directory '/home/aim/aim/ssar/conf'
make -C ssar
make[1]: Entering directory '/home/aim/aim/ssar/ssar'
g++ -g -std=c++11 -rdynamic -DCPPTOML_USE_MAP ssar.cpp -o ssar -lz
...
make[1]: Leaving directory '/home/aim/aim/ssar/ssar'
make -C sresar
make[1]: Entering directory '/home/aim/aim/ssar/sresar'
gcc -g -std=gnu99 -rdynamic -c toml.c -o toml.o
gcc -g -std=gnu99 -rdynamic -c utils.c -o utils.o
gcc -g -std=gnu99 -rdynamic -c collection.c -o collection.o
gcc -g -std=gnu99 -rdynamic -c readprocess.c -o readprocess.o
gcc -g -std=gnu99 -rdynamic -c sresar.c -o sresar.o
gcc -g -std=gnu99 -rdynamic toml.o utils.o collection.o readprocess.o sresar.o -o sresar -lpthread -lm -lz
make[1]: Leaving directory '/home/aim/aim/ssar/sresar'
root@192.168.99.124:/home/ssar $make V=1 install
install -d /etc/ssar/
install conf/ssar.conf /etc/ssar/
install conf/sys.conf /etc/ssar/
install -d /usr/src/os_health/ssar/
install conf/sresar.service /usr/src/os_health/ssar/
install -d /usr/bin/
install ssar/ssar /usr/bin/ssar
install ssar/ssar+.py /usr/bin/ssar+
install ssar/tsar2.py /usr/bin/tsar2
install sresar/sresar /usr/bin/sresar
install -d /run/lock/os_health/
touch /run/lock/os_health/sresar.pid
cp -f /usr/src/os_health/ssar/sresar.service /etc/systemd/system/sresar.service
chown root:root /etc/systemd/system/sresar.service
systemctl daemon-reload
if [ systemctl is-enabled sresar.service ]; then \
systemctl disable sresar.service; \
fi
/bin/sh: line 1: [: is-enabled: binary operator expected
systemctl enable sresar.service
Created symlink /etc/systemd/system/multi-user.target.wants/sresar.service → /etc/systemd/system/sresar.service.
if systemctl is-active sresar.service; then \
systemctl stop sresar.service; \
fi
inactive
systemctl start sresar.service
Note: the "[: is-enabled: binary operator expected" message in the log comes from the install recipe's test, if [ systemctl is-enabled sresar.service ]. Inside [ ] the words are treated as test operands rather than run as a command, so the test errors out and the disable step is skipped. The install still completes, because the service is enabled and started right afterwards.
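Linux load is not a good metric for gauging how severe a problem is or for narrowing down its cause; scheduling latency (sched latency) can be used instead.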
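One way to look at scheduling latency for a single process is /proc/<pid>/schedstat, whose three fields are time spent on the CPU (ns), time spent waiting on a run queue (ns), and the number of timeslices run. The sketch below treats "sched latency" as the average run-queue wait per timeslice between two samples; that interpretation and the helper names are assumptions for illustration, and the file is only populated when the kernel has scheduler statistics enabled.

```python
import time


def parse_schedstat(text):
    """Parse the three fields of /proc/<pid>/schedstat."""
    run_ns, wait_ns, slices = (int(x) for x in text.split()[:3])
    return {"run_ns": run_ns, "wait_ns": wait_ns, "timeslices": slices}


def read_schedstat(pid="self"):
    """Read and parse schedstat for a pid (Linux only)."""
    with open(f"/proc/{pid}/schedstat") as f:
        return parse_schedstat(f.read())


def avg_wait_ms(before, after):
    """Average run-queue wait per timeslice between two samples, in milliseconds."""
    slices = after["timeslices"] - before["timeslices"]
    if slices <= 0:
        return 0.0
    return (after["wait_ns"] - before["wait_ns"]) / slices / 1e6


if __name__ == "__main__":
    first = read_schedstat()
    time.sleep(1)
    second = read_schedstat()
    print(f"avg run-queue wait: {avg_wait_ms(first, second):.3f} ms per timeslice")
```

Unlike load, this number is expressed in time units, so it says directly how long tasks waited for a CPU.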
Reference: "Linux Load Average:算法、实现与实用指南(2023)" (Linux Load Average: Algorithm, Implementation, and Practical Guide, 2023)
This article is from cnblogs (博客园), author: LiYanbin. When reposting, please credit the original link: https://www.cnblogs.com/stellar-liyanbin/p/18377699