Notes - Processor Microarchitecture - rdtsc (Time Stamp Counter, reading CPU cycles)
rdtsc code
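#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>
/* x86: RDTSCP reads the Time Stamp Counter once earlier instructions have executed.
 * On most modern CPUs the TSC ticks at a constant rate, not at the current core clock. */
static inline uint64_t read_cycles(void) {
    unsigned int aux; /* receives IA32_TSC_AUX; unused here */
    return __rdtscp(&aux);
}
#elif defined(__aarch64__)
/* AArch64: read the PMU cycle counter. Raises SIGILL unless EL0 access is enabled. */
static inline uint64_t read_cycles(void) {
    uint64_t cycle_count;
    asm volatile("mrs %0, pmccntr_el0" : "=r" (cycle_count));
    return cycle_count;
}
#elif defined(__arm__)
/* ARMv7 (AArch32): PMCCNTR is read through CP15 and is 32 bits wide. */
static inline uint64_t read_cycles(void) {
    uint32_t cycle_count;
    asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (cycle_count));
    return cycle_count;
}
#else
#error "Unsupported architecture"
#endif

int main(void) {
    uint64_t start_cycles, end_cycles;
    start_cycles = read_cycles();
    /* Example workload */
    for (volatile int i = 0; i < 1000000; ++i);
    end_cycles = read_cycles();
    /* PRIu64 is the portable format for uint64_t (%llu is wrong where it is unsigned long) */
    printf("CPU cycles: %" PRIu64 "\n", end_cycles - start_cycles);
    return 0;
}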
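A single start/end pair like the one above is easily disturbed by interrupts or migration between cores. A common way to reduce that noise, sketched here as an assumption rather than as part of the original program, is to repeat the measurement and keep the smallest delta. The helper names read_ticks, min_ticks and spin are hypothetical. To keep the sketch runnable without any PMU setup, the AArch64 branch reads the generic timer (cntvct_el0), which ticks at a fixed frequency rather than at the core clock, and other targets fall back to clock_gettime.

```c
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__)
#include <x86intrin.h>
#endif

/* Hypothetical helper: a tick source that needs no PMU configuration. */
static inline uint64_t read_ticks(void) {
#if defined(__x86_64__)
    unsigned int aux;
    uint64_t t = __rdtscp(&aux);
    _mm_lfence();  /* keep later instructions from starting before the read */
    return t;
#elif defined(__aarch64__)
    uint64_t t;
    /* Generic timer virtual count: fixed frequency, not core cycles. */
    asm volatile("isb\n\tmrs %0, cntvct_el0" : "=r"(t) : : "memory");
    return t;
#else
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Run fn(arg) reps times and return the smallest tick delta observed. */
static uint64_t min_ticks(void (*fn)(void *), void *arg, int reps) {
    uint64_t best = UINT64_MAX;
    for (int r = 0; r < reps; ++r) {
        uint64_t t0 = read_ticks();
        fn(arg);
        uint64_t t1 = read_ticks();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}

/* Same busy loop as the example workload; arg points to the iteration count. */
static void spin(void *arg) {
    int n = *(int *)arg;
    for (volatile int i = 0; i < n; ++i);
}
```

Calling min_ticks(spin, &n, 5) with n set to 1000000 gives a figure comparable to the single measurement, with less run-to-run variation.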
mrs pmccntr_el0 reports 'Illegal instruction'
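1. Identify the processor model, for example Cortex-A76
Run cat /proc/cpuinfo and use the CPU part field to work out which Cortex core it is.
Reference: Notes - Processor Microarchitecture - Getting Processor Parameters - LiYanbin - cnblogs (博客园)
2. Confirm that arm_pmu is enabled
sudo dmesg | grep "PMU driver"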
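As a rough illustration of the lookup in step 1, the sketch below parses CPU part lines and maps a few part numbers to core names. The function names are hypothetical, the table is deliberately incomplete, and the mapping only applies when CPU implementer is 0x41 (Arm).

```c
#include <stdio.h>

/* Small, incomplete table of part numbers for Arm-designed cores. */
static const char *cortex_name(unsigned part) {
    switch (part) {
    case 0xd03: return "Cortex-A53";
    case 0xd05: return "Cortex-A55";
    case 0xd07: return "Cortex-A57";
    case 0xd08: return "Cortex-A72";
    case 0xd0a: return "Cortex-A75";
    case 0xd0b: return "Cortex-A76";
    default:    return "unknown";
    }
}

/* Parse one /proc/cpuinfo line; returns 1 and stores the part number on a match. */
static int parse_cpu_part(const char *line, unsigned *part) {
    return sscanf(line, "CPU part : %x", part) == 1;
}

/* Print the core name for every CPU part line found in the stream. */
static void print_cpu_parts(FILE *f) {
    char line[256];
    unsigned part;
    while (fgets(line, sizeof line, f))
        if (parse_cpu_part(line, &part))
            printf("0x%03x -> %s\n", part, cortex_name(part));
}
```

Passing a stream opened on /proc/cpuinfo to print_cpu_parts prints one line per logical CPU.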
[ 13.446103] hw perfevents: enabled with armv8_pmuv3 PMU driver, 7 counters available
To enable the Arm PMU, see: Enable Arm PMU support for the kernel
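Once the kernel PMU driver is active, cycles can also be counted through the perf_event_open system call, without reading PMU registers directly from user space. This is a different route from the one this post takes, shown only as a minimal sketch. The helper names are hypothetical, and the call may fail (returning -1) in virtual machines or when perf_event_paranoid restricts access.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Open a user-space-only CPU cycle counter for the calling thread; -1 on failure. */
static int open_cycle_counter(void) {
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled = 1;        /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1;  /* count user-space cycles only */
    attr.exclude_hv = 1;
    /* pid = 0 (this thread), cpu = -1 (any CPU), no group, no flags */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
}

/* Count cycles spent in fn(arg); returns 0 on success, -1 on failure. */
static int count_cycles(int fd, void (*fn)(void *), void *arg, uint64_t *cycles) {
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) < 0) return -1;
    if (ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) < 0) return -1;
    fn(arg);
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) < 0) return -1;
    return read(fd, cycles, sizeof *cycles) == (ssize_t)sizeof *cycles ? 0 : -1;
}

/* Example workload; arg points to the iteration count. */
static void busy_loop(void *arg) {
    int n = *(int *)arg;
    for (volatile int i = 0; i < n; ++i);
}
```

This route costs a system call per measurement, but the kernel keeps the count per thread, so it is not affected by migration between cores.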
3. Enable user-mode access to the PMU registers with a kernel module
References:
- How to Use Performance Monitor Unit(PMU) of 64-bit ARMv8-A in Linux
- zhiyisun/enable_arm_pmu: Enable user-mode access to ARMv7/Linux performance counters
- jerinjacobk/armv8_pmu_cycle_counter_el0: ARMv8 performance monitor from userspace
git clone https://githubfast.com/jerinjacobk/armv8_pmu_cycle_counter_el0 --depth=1
cd armv8_pmu_cycle_counter_el0/
make
+++
sudo insmod pmu_el0_cycle_counter.ko
+++
echo "PMCCNTR=0" | sudo tee /dev/pmuctl
echo "PMCCNTR=1" | sudo tee /dev/pmuctl
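After loading the module, it can be useful to check whether the read still traps before relying on it. The sketch below is an assumption about how such a check might look, not part of the referenced module. It installs a temporary SIGILL handler and tries the mrs instruction once. The function name is hypothetical. The result applies only to the core the thread happens to run on at that moment.

```c
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

#if defined(__aarch64__)
static sigjmp_buf probe_env;

/* Jump back to the probe instead of letting SIGILL kill the process. */
static void on_sigill(int signo) {
    (void)signo;
    siglongjmp(probe_env, 1);
}
#endif

/* Returns 1 if pmccntr_el0 was read (value stored in *value),
 * 0 if the read raised SIGILL, -1 when not built for AArch64. */
static int pmccntr_el0_readable(uint64_t *value) {
#if defined(__aarch64__)
    struct sigaction sa, old;
    volatile int result = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return 0;

    /* savemask = 1 so the signal mask is restored after the jump */
    if (sigsetjmp(probe_env, 1) == 0) {
        uint64_t v;
        asm volatile("mrs %0, pmccntr_el0" : "=r"(v));
        *value = v;
        result = 1;
    }
    sigaction(SIGILL, &old, NULL);  /* restore the previous handler */
    return result;
#else
    (void)value;
    return -1;
#endif
}
```

A return value of 0 after insmod suggests that access is not enabled on the current core, for example because the core went through an idle state.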
4. Disable idle modes
User-mode access to ARMv8 PMU cycle counters
Beware that the registers set by this module can be reset by the kernel, such as when a core goes to sleep. For instance, on an NVIDIA Jetson TX1, you need to disable idle modes by running as root:
for X in $(seq 0 3); do for Y in $(seq 1 6); do echo 1 > /sys/devices/system/cpu/cpu$X/cpuidle/state$Y/disable ; done ; done
Or, for every core and idle state present on the system:
for f in /sys/devices/system/cpu/cpu*/cpuidle/state*/disable; do echo 1 | sudo tee "$f"; done
References:
- Got "illegal instruction" after insmod ko and read pmccntr_el0 in user-space application · Issue #10 · jerinjacobk/armv8_pmu_cycle_counter_el0
- Should add pthread_mutex_t when calling PMCCNTR_EL0 register? · Issue #11 · jerinjacobk/armv8_pmu_cycle_counter_el0
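The idle-state loop in step 4 can also be done from a C program, for instance at the start of a benchmark. The following is a minimal sketch using POSIX glob. The function name write_to_matches is hypothetical, and the program needs root privileges to write under /sys.

```c
#include <glob.h>
#include <stdio.h>

/* Write value to every file matching pattern; returns how many writes succeeded. */
static int write_to_matches(const char *pattern, const char *value) {
    glob_t g;
    int written = 0;

    if (glob(pattern, 0, NULL, &g) == 0) {
        for (size_t i = 0; i < g.gl_pathc; ++i) {
            FILE *f = fopen(g.gl_pathv[i], "w");
            if (!f)
                continue;  /* e.g. permission denied */
            int ok = fputs(value, f) >= 0;
            if (fclose(f) == 0 && ok)
                ++written;
        }
    }
    globfree(&g);
    return written;
}
```

A call such as write_to_matches("/sys/devices/system/cpu/cpu*/cpuidle/state*/disable", "1") mirrors the shell loop, and calling it again with "0" re-enables the idle states.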
Appendix: arm pmu_enable
static void
enable_cycle_counter_el0(void* data)
{
    u64 val;
    /* Disable cycle counter overflow interrupt (bit 31 = cycle counter).
     * BIT(31) is unsigned; (1 << 31) would shift into the sign bit of an int. */
    asm volatile("msr pmintenclr_el1, %0" : : "r" (BIT(31)));
    /* Enable cycle counter */
    asm volatile("msr pmcntenset_el0, %0" : : "r" (BIT(31)));
    /* Enable user-mode access: EN (bit 0) and CR, cycle counter read (bit 2) */
    asm volatile("msr pmuserenr_el0, %0" : : "r" (BIT(0) | BIT(2)));
    /* Clear cycle counter and start: E, enable (bit 0) and C, cycle counter reset (bit 2) */
    asm volatile("mrs %0, pmcr_el0" : "=r" (val));
    val |= (BIT(0) | BIT(2));
    isb();
    asm volatile("msr pmcr_el0, %0" : : "r" (val));
    /* Cycle counter filter: set NSH (bit 27) so cycles spent in EL2 are counted too */
    val = BIT(27);
    asm volatile("msr pmccfiltr_el0, %0" : : "r" (val));
}
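The bit positions used in the appendix are easier to follow with names attached. The sketch below is a user-space illustration only: the macro and function names are hypothetical, and the field names follow the Arm Architecture Reference Manual as far as I recall them. Building each mask from a 64-bit unsigned constant also avoids shifting a signed int into its sign bit, which is what 1 << 31 does.

```c
#include <stdint.h>

#define PMU_BIT(n) (UINT64_C(1) << (n))

#define PMCR_E         PMU_BIT(0)   /* PMCR_EL0.E: enable counters */
#define PMCR_C         PMU_BIT(2)   /* PMCR_EL0.C: reset the cycle counter */
#define PMUSERENR_EN   PMU_BIT(0)   /* PMUSERENR_EL0.EN: EL0 access to PMU registers */
#define PMUSERENR_CR   PMU_BIT(2)   /* PMUSERENR_EL0.CR: EL0 read of the cycle counter */
#define PMU_CYCLE_BIT  PMU_BIT(31)  /* cycle counter bit in PMCNTENSET_EL0 and PMINTENCLR_EL1 */
#define PMCCFILTR_NSH  PMU_BIT(27)  /* PMCCFILTR_EL0.NSH: also count cycles in EL2 */

/* Value written back to PMCR_EL0: keep existing bits, set enable and cycle reset. */
static uint64_t pmcr_start_value(uint64_t current) {
    return current | PMCR_E | PMCR_C;
}

/* Value written to PMUSERENR_EL0 to allow user-mode reads of the cycle counter. */
static uint64_t pmuserenr_value(void) {
    return PMUSERENR_EN | PMUSERENR_CR;
}
```

With these names, the writes in the appendix read as: mask the cycle counter interrupt, enable the cycle counter, grant EL0 access, reset and start, then set the filter.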
This article is from cnblogs (博客园), author: LiYanbin. When reposting, please cite the original link: https://www.cnblogs.com/stellar-liyanbin/p/18589353