使用 perf probe测量Linux用户空间代码的执行时间

1 Linux perf 简介

perf 是 Linux 内核自带的一个强大的性能分析工具，它能够深入到内核和用户空间，提供丰富的性能计数器和事件跟踪功能，帮助开发者和系统管理员精确地定位系统性能瓶颈。

1.1 perf 的主要功能：

性能事件采样：可以采样各种硬件性能事件，如 CPU 周期、缓存命中率、指令退休率等，从而分析程序在 CPU 上的执行情况。
软件事件跟踪：可以跟踪内核和用户空间的函数调用、系统调用等软件事件，分析程序的执行流程和函数调用关系。
性能计数器：提供了大量的硬件性能计数器，可以用来分析 CPU、内存、缓存等硬件资源的使用情况。
火焰图生成：可以生成火焰图，直观地展示程序的调用栈，快速定位性能热点。
实时监控：可以实时监控系统的性能指标，帮助快速发现性能问题。

1.2 perf 的工作原理：

perf 主要通过以下方式来收集性能数据：

硬件性能计数器：利用 CPU 内置的性能计数器，直接获取硬件的性能数据。
软件事件跟踪：在内核和用户空间插入探针，跟踪软件事件的发生。
内核协作： perf 与内核紧密协作，通过内核接口获取性能数据。

1.3 perf 的常用命令：

perf list：列出所有可用的性能事件。
perf stat：统计性能事件的计数。
perf record：记录性能数据。
perf report：分析记录的数据，生成报告。
perf top：实时显示占用 CPU 时间最多的函数。
perf trace：追踪系统调用和软中断。

1.4 perf 的应用场景：

定位性能瓶颈：通过分析性能数据，找出程序中消耗时间最长的部分。
优化代码：根据性能分析结果，有针对性地优化代码。
分析系统问题：分析系统性能问题，如 CPU 负载过高、内存泄漏等。
调优内核参数：调整内核参数，提高系统性能。

1.5 perf 的优势：

功能强大：提供了丰富的性能分析功能。
灵活易用：命令行界面简单易学。
开源免费：作为 Linux 内核的一部分，完全免费。
社区活跃：有一个庞大的社区，提供丰富的文档和支持。

2 使用 perf probe测量Linux用户空间代码的执行时间

2.1 测量的示例程序evenodd.c

#include <stdio.h>
#include <unistd.h>

void even(void)
{
     printf("it is even\n");
}

void odd(void)
{
     printf("it is odd\n");
}
int main()
{
     int i;
     for (i = 0; ;i++) {
          usleep(500);
          if (i % 2)
               odd();
          else
               even();
     }
}

参考资料

软件测试精品书籍文档下载持续更新 https://github.com/china-testing/python-testing-examples 请点赞，谢谢！
本文涉及的python测试开发库谢谢点赞！ https://github.com/china-testing/python_cn_resouce
python精品书籍下载 https://github.com/china-testing/python_cn_resouce/blob/main/python_good_books.md
Linux精品书籍下载 https://www.cnblogs.com/testing-/p/17438558.html
https://bristot.me/using-perf-probe-to-measure-execution-time-of-user-space-code-on-linux/
https://github.com/open-power-sdk/curt/blob/master/curt.py
https://opensource.com/article/18/7/fun-perf-and-python
https://man7.org/linux/man-pages/man1/perf-script-python.1.html
https://github.com/torvalds/linux/blob/master/tools/perf/scripts/python/stackcollapse.py
https://www.brendangregg.com/perf.html
https://github.com/brendangregg/FlameGraph
https://medium.com/@techhara/profiling-visualize-program-bottleneck-with-flamegraph-3e0c5855b2fe
https://perf.wiki.kernel.org/index.php/Tutorial#Flame_Graph
如需技术支持联系钉ding或微信pythontesting , 邮箱: xurongzhong#126.com

2.2 使用调试符号编译 even odd：

# gcc -o -g evenodd evenodd.c
# perf probe --line even -x evenodd
<even@/root/code/evenodd.c:0>
      0  void even(void)
         {
      2       printf("it is even\n");
      3  }

         void odd(void)
         {

2.3 添加探针

在 even() 函数的开头添加 perf 探针，请运行以下命令：

# objdump -t evenodd

evenodd:     file format elf64-littleaarch64

SYMBOL TABLE:
0000000000400238 l    d  .interp        0000000000000000              .interp
0000000000400254 l    d  .note.gnu.build-id     0000000000000000              .note.gnu.build-id
0000000000400278 l    d  .note.ABI-tag  0000000000000000              .note.ABI-tag
0000000000400298 l    d  .gnu.hash      0000000000000000              .gnu.hash
...
# perf probe -x evenodd 'even'

要获取给定函数的执行时间，我们需要在函数的最后一行添加探针。如 perf probe -line 示例所示，是第 3 行。下面的命令在 evenondd 二进制文件的函数 event() 的第 3 行添加了名为 even_end event 的探针。

# perf probe -x evenodd 'even_end=even:3'

开始运行evenodd应用程序后，运行以下命令记录 even() 执行 10 秒的进入和退出探针：

# perf record -e probe_evenodd:even -e probe_evenodd:even_end -a sleep 10

记录结束后，可以使用以下命令打印跟踪结果：

# perf script
         evenodd 2342328 [034] 1232877.416865:     probe_evenodd:even: (4006c4)
         evenodd 2342328 [034] 1232877.416875: probe_evenodd:even_end: (4006d8)
         evenodd 2342328 [034] 1232877.417990:     probe_evenodd:even: (4006c4)
         evenodd 2342328 [034] 1232877.417996: probe_evenodd:even_end: (4006d8)
         evenodd 2342328 [034] 1232877.419109:     probe_evenodd:even: (4006c4)
...

2.4 python分析

还可以使用 perf python脚本自动计算执行时间。捕获这两个事件的跟踪后，以下命令将生成名为 perf-script.py 的 python 脚本框架：

# perf script -g python

然后编辑探测处理程序，计算执行时间：

# perf script event handlers, generated by perf script -g python
# Licensed under the terms of the GNU GPL License version 2

# The common_* event handler fields are the most useful fields common to
# all events.  They don't necessarily correspond to the 'common_*' fields
# in the format files.  Those fields not available as handler params can
# be retrieved using Python functions of the form common_*(context).
# See the perf-script-python Documentation for the list of available functions.

from __future__ import print_function

import os
import sys

sys.path.append(os.environ['PERF_EXEC_PATH'] + \
        '/scripts/python/Perf-Trace-Util/lib/Perf/Trace')

from perf_trace_context import *
from Core import *


def trace_begin():
        print("in trace_begin")

def trace_end():
        print("in trace_end")

start=0
def probe_evenodd__even(event_name, context, common_cpu, common_secs, common_nsecs, common_pid, common_comm, common_callchain, __probe_ip, perf_sample_dict):
    global start
    start=(common_secs * 1000000000) + common_nsecs

def probe_evenodd__even_end(event_name, context, common_cpu, common_secs, common_nsecs, common_pid, common_comm, common_callchain, __probe_ip, perf_sample_dict):

    now=common_secs * 1000000000 + common_nsecs
    duration=now - start
    print("even runtime: " + str(duration) + " ns")

def trace_unhandled(event_name, context, event_fields_dict, perf_sample_dict):
                print(get_dict_as_string(event_fields_dict))
                print('Sample: {'+get_dict_as_string(perf_sample_dict['sample'], ', ')+'}')

def print_header(event_name, cpu, secs, nsecs, pid, comm):
        print("%-20s %5u %05u.%09u %8u %-20s " % \
        (event_name, cpu, secs, nsecs, pid, comm), end="")

def get_dict_as_string(a_dict, delimiter=' '):
        return delimiter.join(['%s=%s'%(k,str(v))for k,v in sorted(a_dict.items())])

然后，运行以下命令，就可以打印出函数 even 的执行时间：

# perf script -s perf-script.py  | tail
even runtime: 6540 ns
even runtime: 7080 ns
even runtime: 6770 ns
even runtime: 7220 ns
even runtime: 6850 ns
even runtime: 6390 ns
even runtime: 6910 ns
even runtime: 6760 ns
even runtime: 7460 ns
in trace_end

posted @ 2024-08-28 08:14 磁石空杯阅读(825) 评论(0) 收藏举报

刷新页面返回顶部