Notes - Processor Microarchitecture - rdtsc (Time Stamp Counter, reading CPU cycles)

rdtsc code

#include <stdio.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>

static inline uint64_t read_cycles() {
    unsigned int aux;
    return __rdtscp(&aux);
}

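#elif defined(__aarch64__)
/*
 * Read the PMU cycle counter on AArch64 (ARMv8-A). User-mode access must be
 * enabled first, otherwise this instruction raises "Illegal instruction".
 */
static inline uint64_t read_cycles() {
    uint64_t cycle_count;
    asm volatile("mrs %0, pmccntr_el0" : "=r" (cycle_count));
    return cycle_count;
}

#elif defined(__arm__)
/* ARMv7-A: PMCCNTR is a 32-bit counter read through CP15 (c9, c13, 0). */
static inline uint64_t read_cycles() {
    uint32_t cycle_count;
    asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r" (cycle_count));
    return cycle_count;
}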

#else
#error "Unsupported architecture"
#endif

int main() {
    uint64_t start_cycles, end_cycles;

    start_cycles = read_cycles();

    // Example workload
    for (volatile int i = 0; i < 1000000; ++i);

    end_cycles = read_cycles();

    /* uint64_t is not always unsigned long long, so cast to match %llu */
    printf("CPU cycles: %llu\n", (unsigned long long)(end_cycles - start_cycles));
    return 0;
}
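A single measurement like the one above is easily disturbed by interrupts or a cold cache. The following is a minimal sketch, not part of the original program, of one common remedy: repeat the workload and keep the smallest delta. The names `min_elapsed`, `fake_ticks`, and `no_work` are hypothetical; `fake_ticks` is only a deterministic stand-in for `read_cycles` so the helper can be tried on any machine.

```c
#include <stdint.h>

/* Hypothetical helper types: any function returning an increasing
 * 64-bit tick count can serve as the counter. */
typedef uint64_t (*counter_fn)(void);
typedef void (*workload_fn)(void);

/* Run `work` `runs` times and return the smallest observed tick delta.
 * Unsigned subtraction stays correct across one counter wraparound.
 * Returns UINT64_MAX if `runs` is zero or negative. */
static uint64_t min_elapsed(counter_fn now, workload_fn work, int runs) {
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; ++i) {
        uint64_t start = now();
        work();
        uint64_t delta = now() - start;
        if (delta < best)
            best = delta;
    }
    return best;
}

/* Stand-in counter for demonstration only: advances 10 ticks per read. */
static uint64_t fake_ticks(void) {
    static uint64_t t;
    return t += 10;
}

/* Stand-in workload that does nothing. */
static void no_work(void) {}
```

In a real measurement one would pass `read_cycles` as the counter and the code under test as the workload, for example `min_elapsed(read_cycles, my_workload, 100)`, where `my_workload` is whatever function is being timed.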

mrs pmccntr_el0 reports 'Illegal instruction'

1. Identify the processor model, for example Cortex-A76

Run cat /proc/cpuinfo and use the "CPU part" field to determine which Cortex core it is.
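The lookup from "CPU part" to a core name can be sketched in C as follows. This is an illustration under assumptions: the helper names `cortex_name` and `parse_cpu_part` are hypothetical, the table lists only a few Arm Ltd. part numbers, and the mapping is only meaningful when the "CPU implementer" field is 0x41 (Arm).

```c
#include <stdio.h>
#include <string.h>

/* Map an Arm "CPU part" value to a core name. Only a handful of
 * Arm Ltd. part numbers are listed; anything else yields NULL. */
static const char *cortex_name(unsigned part) {
    switch (part) {
    case 0xd03: return "Cortex-A53";
    case 0xd05: return "Cortex-A55";
    case 0xd07: return "Cortex-A57";
    case 0xd08: return "Cortex-A72";
    case 0xd09: return "Cortex-A73";
    case 0xd0a: return "Cortex-A75";
    case 0xd0b: return "Cortex-A76";
    default:    return NULL;
    }
}

/* Extract the value from a line such as "CPU part\t: 0xd0b".
 * Returns 1 on success, 0 if the line is not a "CPU part" line. */
static int parse_cpu_part(const char *line, unsigned *part) {
    if (strncmp(line, "CPU part", 8) != 0)
        return 0;
    const char *colon = strchr(line, ':');
    return colon != NULL && sscanf(colon + 1, "%x", part) == 1;
}
```

A caller would read /proc/cpuinfo line by line with `fgets`, pass each line to `parse_cpu_part`, and print `cortex_name(part)` for the first match.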

Reference: Notes - Processor Microarchitecture - Getting Processor Parameters - LiYanbin - cnblogs

2. Confirm that arm_pmu is enabled

sudo dmesg | grep "PMU driver"
[   13.446103] hw perfevents: enabled with armv8_pmuv3 PMU driver, 7 counters available

For how to enable the Arm PMU, see: Enable Arm PMU support for the kernel

3. Enable user-mode access to the PMU registers with a kernel module

Reference:

git clone https://githubfast.com/jerinjacobk/armv8_pmu_cycle_counter_el0 --depth=1
cd armv8_pmu_cycle_counter_el0/
make
sudo insmod pmu_el0_cycle_counter.ko
echo "PMCCNTR=0" | sudo tee /dev/pmuctl
echo "PMCCNTR=1" | sudo tee /dev/pmuctl

4. Disable idle modes

User-mode access to ARMv8 PMU cycle counters

Beware that the registers set by this module can be reset by the kernel, for example when a core goes to sleep. On an NVIDIA Jetson TX1, for instance, idle modes need to be disabled by running the following as root:
for X in $(seq 0 3); do for Y in $(seq 1 6); do echo 1 > /sys/devices/system/cpu/cpu$X/cpuidle/state$Y/disable ; done ; done
A more general form that covers every CPU and idle state present on the machine:
xfile=($(ls /sys/devices/system/cpu/cpu*/cpuidle/state*/disable)); for x in ${xfile[*]}; do echo 1 | sudo tee $x; done
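The same operation can be done from a C program, which is convenient when a benchmark harness wants to disable idle states itself. This is a minimal sketch assuming a POSIX system with `glob()`; the function name `write_to_all` is hypothetical, and writing to the sysfs files still requires root.

```c
#include <glob.h>
#include <stdio.h>

/* Write `value` to every file matching `pattern`.
 * Returns the number of files written successfully, 0 if nothing
 * matches, or -1 if glob() itself fails. */
static int write_to_all(const char *pattern, const char *value) {
    glob_t g;
    int written = 0;

    int rc = glob(pattern, 0, NULL, &g);
    if (rc == GLOB_NOMATCH)
        return 0;
    if (rc != 0) {
        globfree(&g);
        return -1;
    }

    for (size_t i = 0; i < g.gl_pathc; ++i) {
        FILE *f = fopen(g.gl_pathv[i], "w");
        if (f == NULL)
            continue;               /* e.g. insufficient permissions */
        int ok = fputs(value, f) >= 0;
        if (fclose(f) == 0 && ok)   /* fclose always runs */
            ++written;
    }

    globfree(&g);
    return written;
}
```

Run as root, `write_to_all("/sys/devices/system/cpu/cpu*/cpuidle/state*/disable", "1")` would correspond to the shell loop above.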

Reference:

Appendix: arm pmu_enable

static void
enable_cycle_counter_el0(void *data)
{
        u64 val;
        /* Disable the cycle counter overflow interrupt (PMINTENCLR_EL1.C, bit 31) */
        asm volatile("msr pmintenclr_el1, %0" : : "r" (BIT(31)));
        /* Enable the cycle counter (PMCNTENSET_EL0.C, bit 31) */
        asm volatile("msr pmcntenset_el0, %0" : : "r" (BIT(31)));
        /* Enable user-mode access: EN (bit 0) and CR (bit 2) */
        asm volatile("msr pmuserenr_el0, %0" : : "r" (BIT(0) | BIT(2)));
        /* Reset the cycle counter (C, bit 2) and enable counting (E, bit 0) */
        asm volatile("mrs %0, pmcr_el0" : "=r" (val));
        val |= (BIT(0) | BIT(2));
        isb();
        asm volatile("msr pmcr_el0, %0" : : "r" (val));
        /* Also count cycles spent in EL2 (PMCCFILTR_EL0.NSH, bit 27) */
        val = BIT(27);
        asm volatile("msr pmccfiltr_el0, %0" : : "r" (val));
}
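The bit positions in the routine above are easier to follow with names attached. The block below is a sketch for reading along, not code from the module: the macro names and the helper `pmcr_start_value` are hypothetical, chosen after the register field names in the Arm architecture manual. It uses `UINT64_C(1)` rather than a plain `1` because shifting a signed `int` into bit 31 overflows and does not reliably give the intended 64-bit mask.

```c
#include <stdint.h>

/* Register fields touched by enable_cycle_counter_el0 (hypothetical names). */
#define PMCR_E         (UINT64_C(1) << 0)   /* PMCR_EL0.E: enable counters */
#define PMCR_C         (UINT64_C(1) << 2)   /* PMCR_EL0.C: reset cycle counter */
#define PMUSERENR_EN   (UINT64_C(1) << 0)   /* PMUSERENR_EL0.EN: EL0 access */
#define PMUSERENR_CR   (UINT64_C(1) << 2)   /* PMUSERENR_EL0.CR: EL0 cycle counter read */
#define PMCNTEN_C      (UINT64_C(1) << 31)  /* PMCNTENSET_EL0.C and PMINTENCLR_EL1.C */
#define PMCCFILTR_NSH  (UINT64_C(1) << 27)  /* PMCCFILTR_EL0.NSH: count cycles in EL2 */

/* Value written back to PMCR_EL0: keep the existing bits,
 * then set E (enable) and C (reset the cycle counter). */
static uint64_t pmcr_start_value(uint64_t pmcr) {
    return pmcr | PMCR_E | PMCR_C;
}
```

For example, if PMCR_EL0 previously held 0x40 (only the LC bit set), the value written back would be 0x45.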
posted @ LiYanbin