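Notes - Vertical Optimization - Measuring Maximum IPC
Fixing the CPU frequency
On my test machine, CPU frequency scaling is managed by the intel_pstate driver: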
$ lscpu | grep -i hz
Model name: Intel(R) Core(TM) i5-10500 CPU @ 3.10GHz
CPU max MHz: 4500.0000
CPU min MHz: 800.0000
$ echo 1 | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
1
$ echo 100 | sudo tee /sys/devices/system/cpu/intel_pstate/min_perf_pct
100
$ echo 100 | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct
Check the resulting settings:
$ sudo cpupower -c 0 frequency-info
...
current policy: frequency should be within 3.10 GHz and 3.10 GHz.
...
$ grep "cpu MHz" /proc/cpuinfo
...
cpu MHz : 3100.000
cpu MHz : 3100.665
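The intel_pstate values written above stay in effect until they are changed back, so it can help to record the previous values before a measurement session. The following is a minimal sketch, not part of the original setup: the function names `save_pstate` and `restore_pstate` and the `PSTATE_DIR` override are hypothetical, and only the three sysfs files used above are touched.

```shell
# Directory holding the intel_pstate knobs; PSTATE_DIR is a hypothetical
# override so the helpers can be pointed at another directory.
PSTATE_DIR="${PSTATE_DIR:-/sys/devices/system/cpu/intel_pstate}"

# Print the current values as name=value lines.
save_pstate() {
    for f in no_turbo min_perf_pct max_perf_pct; do
        printf '%s=%s\n' "$f" "$(cat "$PSTATE_DIR/$f")"
    done
}

# Write back the name=value lines stored in file $1 (requires root via sudo).
restore_pstate() {
    while IFS='=' read -r name value; do
        echo "$value" | sudo tee "$PSTATE_DIR/$name" > /dev/null
    done < "$1"
}
```

A possible usage: run `save_pstate > pstate.bak` before fixing the frequency, and `restore_pstate pstate.bak` once the measurements are done.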
max_ipc_test.sh
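#!/bin/bash
# Usage: bash max_ipc_test.sh <number of nops in the loop body>
[[ -z "$1" ]] && exit

# Generate nop.c: an infinite loop whose body is $1 nop instructions.
echo 'void main() { do {__asm__ (' > nop.c
for ((i = 1; i <= $1; i++)); do
    echo '"nop\n\t"' >> nop.c
done
echo ');} while(1);}' >> nop.c

gcc -O0 nop.c -o nop

set -x
[ -f ./nop ] && {
    # Pin ./nop to one CPU, count on that CPU only, stop after 2000 ms.
    xcpu=7; sudo perf stat -C $xcpu --timeout 2000 taskset -c $xcpu ./nop
}
set +x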
$ bash max_ipc_test.sh 4
...
12,290,206,450 instructions # 3.97 insn per cycle
...
The nop.c generated for this run:
void main() { do {__asm__ (
"nop\n\t"
"nop\n\t"
"nop\n\t"
"nop\n\t"
);} while(1);}
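The "insn per cycle" figure that perf prints is simply instructions divided by cycles. As a rough sketch of that arithmetic, the helper below computes it from perf's CSV output (`perf stat -x,`). The function name `ipc_from_csv` is hypothetical, and the code assumes the usual CSV layout in which the counter value is the first field and the event name is the third.

```shell
# Read "perf stat -x," CSV lines on stdin and print instructions / cycles.
# Assumes field 1 is the counter value and field 3 is the event name.
ipc_from_csv() {
    awk -F, '
        $3 ~ /^instructions/ { insn = $1 }
        $3 ~ /^cycles/       { cyc  = $1 }
        END {
            if (cyc > 0) printf "%.2f\n", insn / cyc
            else exit 1   # no cycle count found
        }
    '
}
```

perf stat writes its counters to stderr, so one way to use this is `sudo perf stat -x, -e instructions,cycles -C 7 --timeout 2000 taskset -c 7 ./nop 2>&1 | ipc_from_csv`.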
LSD (Loop Stream Detector)
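What happens as the number of nops grows? Consider the LSD: if the loop body is too long, the loop is no longer detected. If the front end then cannot fetch instructions as fast as they can be executed, the back end has to wait, bubbles appear in the pipeline, and IPC drops.
In my tests the limit is 32 nops; with 33, IPC starts to drop: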
$ bash max_ipc_test.sh 33
...
11,486,619,532 instructions # 3.71 insn per cycle
...
Use toplev to look at what changes at this boundary:
sudo python3 toplev.py -l 3 --core S0-C0 --no-desc --verbose bash -c 'timeout 1 taskset -c 0 /data/products/hpc001/nop 33'
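To see where the drop begins, the same measurement can be repeated over several loop sizes. The sketch below is one way to do that under assumptions: `gen_nop_c` and `sweep_nops` are hypothetical helper names, the generated source uses `int main(void)` rather than the script's exact text, and CPU 7 and the 2000 ms window are taken from max_ipc_test.sh.

```shell
# Hypothetical helpers; the names are illustrative, not from the original script.

# Print a C source whose infinite loop body is $1 nop instructions.
gen_nop_c() {
    n=$1
    i=0
    printf '%s\n' 'int main(void) { for (;;) { __asm__ volatile ('
    while [ "$i" -lt "$n" ]; do
        printf '%s\n' '"nop\n\t"'
        i=$((i + 1))
    done
    printf '%s\n' '); } }'
}

# Build and measure one binary per nop count, printing only the IPC line.
# Assumes CPU 7 and a 2000 ms window, as in max_ipc_test.sh.
sweep_nops() {
    cpu=7
    for count in "$@"; do
        gen_nop_c "$count" > "nop_$count.c" || return 1
        gcc -O0 "nop_$count.c" -o "nop_$count" || return 1
        printf '%s nops: ' "$count"
        sudo perf stat -C "$cpu" --timeout 2000 \
            taskset -c "$cpu" "./nop_$count" 2>&1 | grep 'insn per cycle'
    done
}
```

For example, `sweep_nops 4 16 32 33 64` would print one "insn per cycle" line per loop size, which makes the boundary easy to spot.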
Fixing the CPU frequency on ARM
sudo cpufreq-info -c 7 -g
cat /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
echo performance | sudo tee /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
Restore:
echo ondemand | sudo tee /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
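Not every kernel offers every governor (the default may be something other than ondemand), so it can be worth checking the cpufreq file `scaling_available_governors` before writing. A minimal sketch, assuming the standard cpufreq sysfs layout; the names `governor_available`, `set_governor`, and the `CPUFREQ_ROOT` override are hypothetical.

```shell
# Root of the per-CPU sysfs tree; CPUFREQ_ROOT is a hypothetical override.
CPUFREQ_ROOT="${CPUFREQ_ROOT:-/sys/devices/system/cpu}"

# Succeed if governor $2 is listed as available for cpu number $1.
governor_available() {
    tr ' ' '\n' < "$CPUFREQ_ROOT/cpu$1/cpufreq/scaling_available_governors" |
        grep -qx "$2"
}

# Switch cpu $1 to governor $2, refusing names the kernel does not list.
set_governor() {
    governor_available "$1" "$2" || {
        echo "governor $2 not available on cpu$1" >&2
        return 1
    }
    echo "$2" | sudo tee "$CPUFREQ_ROOT/cpu$1/cpufreq/scaling_governor" > /dev/null
}
```

With these, `set_governor 7 performance` pins the governor for the test, and `set_governor 7 ondemand` (or whichever governor was active before) restores it.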
This article is from 博客园 (cnblogs), author: LiYanbin. When reposting, please credit the original link: https://www.cnblogs.com/stellar-liyanbin/p/18414475