牛虻与专家

导航

 

blktrace是一个用户态的工具,提供I/O子系统上时间如何消耗的详细信息,从中可以分析是IO调度慢还是硬件响应慢等。配套工具blkparse从blktrace读取原始输出,并产生人们可读的输入和输出操作摘要。btt作为blktrace软件包的一部分而被提供,它分析blktrace输出,并显示该数据用在每个I/O栈区域的时间量,使它更容易在I/O子系统中发现瓶颈。

安装yum -y install blktrace

使用方式

  1. 使用blktrace命令抓取指定设备的IO信息,如:blktrace -w 10 -d /dev/sdb

    === sdb ===
    CPU 0: 41 events, 2 KiB data
    CPU 1: 2 events, 1 KiB data
    CPU 2: 2 events, 1 KiB data
    CPU 3: 22 events, 2 KiB data
    CPU 4: 5 events, 1 KiB data
    CPU 5: 0 events, 0 KiB data
    CPU 6: 4 events, 1 KiB data
    CPU 7: 14 events, 1 KiB data
    CPU 8: 4 events, 1 KiB data
    CPU 9: 4 events, 1 KiB data
    CPU 10: 0 events, 0 KiB data
    CPU 11: 0 events, 0 KiB data
    CPU 12: 17 events, 1 KiB data
    CPU 13: 6 events, 1 KiB data
    CPU 14: 3 events, 1 KiB data
    CPU 15: 28 events, 2 KiB data
    CPU 16: 0 events, 0 KiB data
    CPU 17: 87 events, 5 KiB data
    CPU 18: 2 events, 1 KiB data
    CPU 19: 9 events, 1 KiB data
    CPU 20: 6 events, 1 KiB data
    CPU 21: 4 events, 1 KiB data
    CPU 22: 7 events, 1 KiB data
    CPU 23: 7 events, 1 KiB data
    CPU 24: 24 events, 2 KiB data
    CPU 25: 0 events, 0 KiB data
    CPU 26: 8 events, 1 KiB data
    CPU 27: 8 events, 1 KiB data
    CPU 28: 5 events, 1 KiB data
    CPU 29: 18 events, 1 KiB data
    CPU 30: 8 events, 1 KiB data
    CPU 31: 4 events, 1 KiB data
    CPU 32: 16 events, 1 KiB data
    CPU 33: 11 events, 1 KiB data
    CPU 34: 3 events, 1 KiB data
    CPU 35: 4 events, 1 KiB data
    CPU 36: 82 events, 4 KiB data
    CPU 37: 44 events, 3 KiB data
    CPU 38: 4 events, 1 KiB data
    CPU 39: 73 events, 4 KiB data
    Total: 586 events (dropped 0), 29 KiB data

     

    #在当前目录下回生成一系列的文件,如下:

    [root@node-2]# ls -l ./
    total 148
    drwxr-xr-x 2 root 4096 Feb 10 10:36 .
    drwxr-xr-x. 5 root 178 Feb 10 10:36 ..
    -rw-r--r-- 1 root 1984 Feb 10 10:36 sdb.blktrace.0
    -rw-r--r-- 1 root 112 Feb 10 10:36 sdb.blktrace.1
    -rw-r--r-- 1 root 0 Feb 10 10:36 sdb.blktrace.10
    -rw-r--r-- 1 root 0 Feb 10 10:36 sdb.blktrace.11
    -rw-r--r-- 1 root 832 Feb 10 10:36 sdb.blktrace.12
    -rw-r--r-- 1 root 304 Feb 10 10:36 sdb.blktrace.13
    -rw-r--r-- 1 root 160 Feb 10 10:36 sdb.blktrace.14
    -rw-r--r-- 1 root 1360 Feb 10 10:36 sdb.blktrace.15
    -rw-r--r-- 1 root 0 Feb 10 10:36 sdb.blktrace.16
    -rw-r--r-- 1 root 4176 Feb 10 10:36 sdb.blktrace.17
    -rw-r--r-- 1 root 112 Feb 10 10:36 sdb.blktrace.18
    -rw-r--r-- 1 root 448 Feb 10 10:36 sdb.blktrace.19
    -rw-r--r-- 1 root 112 Feb 10 10:36 sdb.blktrace.2
    -rw-r--r-- 1 root 320 Feb 10 10:36 sdb.blktrace.20
    -rw-r--r-- 1 root 208 Feb 10 10:36 sdb.blktrace.21
    -rw-r--r-- 1 root 336 Feb 10 10:36 sdb.blktrace.22
    -rw-r--r-- 1 root 352 Feb 10 10:36 sdb.blktrace.23
    -rw-r--r-- 1 root 1184 Feb 10 10:36 sdb.blktrace.24
    -rw-r--r-- 1 root 0 Feb 10 10:36 sdb.blktrace.25
    -rw-r--r-- 1 root 416 Feb 10 10:36 sdb.blktrace.26
    -rw-r--r-- 1 root 416 Feb 10 10:36 sdb.blktrace.27
    -rw-r--r-- 1 root 272 Feb 10 10:36 sdb.blktrace.28
    -rw-r--r-- 1 root 896 Feb 10 10:36 sdb.blktrace.29
    -rw-r--r-- 1 root 1088 Feb 10 10:36 sdb.blktrace.3
    -rw-r--r-- 1 root 416 Feb 10 10:36 sdb.blktrace.30
    -rw-r--r-- 1 root 208 Feb 10 10:36 sdb.blktrace.31
    -rw-r--r-- 1 root 800 Feb 10 10:36 sdb.blktrace.32
    -rw-r--r-- 1 root 560 Feb 10 10:36 sdb.blktrace.33
    -rw-r--r-- 1 root 160 Feb 10 10:36 sdb.blktrace.34
    -rw-r--r-- 1 root 224 Feb 10 10:36 sdb.blktrace.35
    -rw-r--r-- 1 root 3968 Feb 10 10:36 sdb.blktrace.36
    -rw-r--r-- 1 root 2144 Feb 10 10:36 sdb.blktrace.37
    -rw-r--r-- 1 root 216 Feb 10 10:36 sdb.blktrace.38
    -rw-r--r-- 1 root 3536 Feb 10 10:36 sdb.blktrace.39
    -rw-r--r-- 1 root 272 Feb 10 10:36 sdb.blktrace.4
    -rw-r--r-- 1 root 0 Feb 10 10:36 sdb.blktrace.5
    -rw-r--r-- 1 root 224 Feb 10 10:36 sdb.blktrace.6
    -rw-r--r-- 1 root 672 Feb 10 10:36 sdb.blktrace.7
    -rw-r--r-- 1 root 208 Feb 10 10:36 sdb.blktrace.8
    -rw-r--r-- 1 root 208 Feb 10 10:36 sdb.blktrace.9

    “-w”后接的参数指定抓取时间,以秒为单位;“-d”后接的参数指定设备。

    命令执行完后会生成一系列以device.blktrace.cpu格式命令的二进制文件。

  2. 使用blkparse命令解析blktrace生成的数据,如:
    1 [root@node-2]blkparse -i sdb -d blkparse.out
    2 
    3 #在当前目录下生成文件blkparse.out
    4 [root@node-2]# ls blkparse.out5 -rw-r--r-- 1 root 28904 Feb 10 10:42 blkparse.out

    “-i”后接的参数指定输入文件名,由于blktrace输出的文件名默认都是device.blktrace.cpu格式,所以只需输入前面的“device”即可;“-d”后接的参数指定二进制输出文件。

    这个命令会将分析结果输出到屏幕,并且将分析结果的二进制数据输出到blkparse.out文件中。

  3. 使用btt命令查看IO的整体情况,如:btt -i blkparse.out

            “-i”后接的参数指定输入文件名。

[root@node-2]# btt -i blkparse.out
==================== All Devices ====================

            ALL           MIN           AVG           MAX           N
--------------- ------------- ------------- ------------- -----------

Q2Q               0.000001043   0.404216664   2.560145206          19
Q2G               0.000001199   0.000003634   0.000005320           5
G2I               0.000000355   0.000000795   0.000001129           5
Q2M               0.000000250   0.000000514   0.000001320          15
I2D               0.000002556   0.000009842   0.000014352           5
M2D               0.000002465   0.000005110   0.000009736          15
D2C               0.007196095   0.009711240   0.013328925          20
Q2C               0.007200405   0.009719025   0.013347278          20

==================== Device Overhead ====================

       DEV |       Q2G       G2I       Q2M       I2D       D2C
---------- | --------- --------- --------- --------- ---------
 (  8, 16) |   0.0093%   0.0020%   0.0040%   0.0253%  99.9199%
---------- | --------- --------- --------- --------- ---------
   Overall |   0.0093%   0.0020%   0.0040%   0.0253%  99.9199%

==================== Device Merge Information ====================

       DEV |       #Q       #D   Ratio |   BLKmin   BLKavg   BLKmax    Total
---------- | -------- -------- ------- | -------- -------- -------- --------
 (  8, 16) |       20        5     4.0 |        8     2094      128    10472

==================== Device Q2Q Seek Information ====================

       DEV |          NSEEKS            MEAN          MEDIAN | MODE           
---------- | --------------- --------------- --------------- | ---------------
 (  8, 16) |              20      45325985.6               0 | 0(19)
---------- | --------------- --------------- --------------- | ---------------
   Overall |          NSEEKS            MEAN          MEDIAN | MODE           
   Average |              20      45325985.6               0 | 0(19)

==================== Device D2D Seek Information ====================

       DEV |          NSEEKS            MEAN          MEDIAN | MODE           
---------- | --------------- --------------- --------------- | ---------------
 (  8, 16) |               5     181303942.4               0 | 0(4)
---------- | --------------- --------------- --------------- | ---------------
   Overall |          NSEEKS            MEAN          MEDIAN | MODE           
   Average |               5     181303942.4               0 | 0(4)

==================== Plug Information ====================

       DEV |    # Plugs # Timer Us  | % Time Q Plugged
---------- | ---------- ----------  | ----------------

       DEV |    IOs/Unp   IOs/Unp(to)
---------- | ----------   ----------
 (  8, 16) |        0.0          0.0

==================== Active Requests At Q Information ====================

       DEV |  Avg Reqs @ Q
---------- | -------------
 (  8, 16) |           0.8

==================== I/O Active Period Information ====================

       DEV |     # Live      Avg. Act     Avg. !Act % Live
---------- | ---------- ------------- ------------- ------
 (  8, 16) |          4   0.009723459   3.177443548   0.41
---------- | ---------- ------------- ------------- ------
 Total Sys |          4   0.009723459   3.177443548   0.41

# Total System
#     Total System : q activity
  1.883907451   0.0
  1.883907451   0.4
  1.883936507   0.4
  1.883936507   0.0
  4.443892313   0.0
  4.443892313   0.4
  4.443907463   0.4
  4.443907463   0.0
  7.004052669   0.0
  7.004052669   0.4
  7.004064573   0.4
  7.004064573   0.0
  9.564008551   0.0
  9.564008551   0.4
  9.564024072   0.4
  9.564024072   0.0

#     Total System : c activity
  1.891967134   0.5
  1.891967134   0.9
  2.116851895   0.9
  2.116851895   0.5
  2.223860147   0.5
  2.223860147   0.9
  2.417331081   0.9
  2.417331081   0.5
  2.665056072   0.5
  2.665056072   0.9
  2.714272411   0.9
  2.714272411   0.5
  3.235484928   0.5
  3.235484928   0.9
  3.238227753   0.9
  3.238227753   0.5
  3.383799121   0.5
  3.383799121   0.9
  3.383998065   0.9
  3.383998065   0.5
  3.550780112   0.5
  3.550780112   0.9
  3.621802077   0.9
  3.621802077   0.5
  3.888886315   0.5
  3.888886315   0.9
  3.888886315   0.9
  3.888886315   0.5
  4.457239591   0.5
  4.457239591   0.9
  4.457239591   0.9
  4.457239591   0.5
  5.352198370   0.5
  5.352198370   0.9
  5.354955290   0.9
  5.354955290   0.5
  6.106284377   0.5
  6.106284377   0.9
  6.125009156   0.9
  6.125009156   0.5
  7.014334478   0.5
  7.014334478   0.9
  7.014334478   0.9
  7.014334478   0.5
  7.525389674   0.5
  7.525389674   0.9
  7.583052230   0.9
  7.583052230   0.5
  7.728428178   0.5
  7.728428178   0.9
  7.728428178   0.9
  7.728428178   0.5
  7.859381084   0.5
  7.859381084   0.9
  8.031665043   0.9
  8.031665043   0.5
  8.382506351   0.5
  8.382506351   0.9
  8.383092145   0.9
  8.383092145   0.5
  8.488990675   0.5
  8.488990675   0.9
  8.488990675   0.9
  8.488990675   0.5
  8.669453481   0.5
  8.669453481   0.9
  8.704800087   0.9
  8.704800087   0.5
  9.153631222   0.5
  9.153631222   0.9
  9.181520268   0.9
  9.181520268   0.5
  9.417434072   0.5
  9.417434072   0.9
  9.431747448   0.9
  9.431747448   0.5
  9.571224477   0.5
  9.571224477   0.9
  9.571224477   0.9
  9.571224477   0.5

# Per process
#      3_scheduler : q activity

#      3_scheduler : c activity
  6.106284377   1.5
  6.106284377   1.9
  6.106284377   1.9
  6.106284377   1.5

#  ceilometer-coll : q activity

#  ceilometer-coll : c activity
  9.571224477   2.5
  9.571224477   2.9
  9.571224477   2.9
  9.571224477   2.5

#           kernel : q activity

#           kernel : c activity
  1.891967134   3.5
  1.891967134   3.9
  2.116851895   3.9
  2.116851895   3.5
  2.223860147   3.5
  2.223860147   3.9
  2.417331081   3.9
  2.417331081   3.5
  2.665056072   3.5
  2.665056072   3.9
  2.714272411   3.9
  2.714272411   3.5
  3.621802077   3.5
  3.621802077   3.9
  3.621802077   3.9
  3.621802077   3.5
  3.888886315   3.5
  3.888886315   3.9
  3.888886315   3.9
  3.888886315   3.5
  4.457239591   3.5
  4.457239591   3.9
  4.457239591   3.9
  4.457239591   3.5
  6.125009156   3.5
  6.125009156   3.9
  6.125009156   3.9
  6.125009156   3.5
  7.014334478   3.5
  7.014334478   3.9
  7.014334478   3.9
  7.014334478   3.5
  7.525389674   3.5
  7.525389674   3.9
  7.583052230   3.9
  7.583052230   3.5
  7.728428178   3.5
  7.728428178   3.9
  7.728428178   3.9
  7.728428178   3.5
  7.859381084   3.5
  7.859381084   3.9
  8.031665043   3.9
  8.031665043   3.5
  8.488990675   3.5
  8.488990675   3.9
  8.488990675   3.9
  8.488990675   3.5
  9.153631222   3.5
  9.153631222   3.9
  9.153631222   3.9
  9.153631222   3.5
  9.417434072   3.5
  9.417434072   3.9
  9.431747448   3.9
  9.431747448   3.5

#          kworker : q activity
  1.883907451   4.0
  1.883907451   4.4
  1.883936507   4.4
  1.883936507   4.0
  4.443892313   4.0
  4.443892313   4.4
  4.443907463   4.4
  4.443907463   4.0
  7.004052669   4.0
  7.004052669   4.4
  7.004064573   4.4
  7.004064573   4.0
  9.564008551   4.0
  9.564008551   4.4
  9.564024072   4.4
  9.564024072   4.0

#          kworker : c activity
  3.235484928   4.5
  3.235484928   4.9
  3.238227753   4.9
  3.238227753   4.5
  3.383799121   4.5
  3.383799121   4.9
  3.383998065   4.9
  3.383998065   4.5
  3.550780112   4.5
  3.550780112   4.9
  3.593125059   4.9
  3.593125059   4.5
  5.352198370   4.5
  5.352198370   4.9
  5.354955290   4.9
  5.354955290   4.5
  8.382506351   4.5
  8.382506351   4.9
  8.383092145   4.9
  8.383092145   4.5
  8.669453481   4.5
  8.669453481   4.9
  8.704800087   4.9
  8.704800087   4.5
  9.181520268   4.5
  9.181520268   4.9
  9.181520268   4.9
  9.181520268   4.5

#    msgr-worker-0 : q activity

#    msgr-worker-0 : c activity
  2.070443937   5.5
  2.070443937   5.9
  2.070443937   5.9
  2.070443937   5.5

 

 

相关事件的含义:

       A      IO was remapped to a different device
       B      IO bounced
       C      IO completion
       D      IO issued to driver
       F      IO front merged with request on queue
       G      Get request
       I      IO inserted onto request queue
       M      IO back merged with request on queue
       P      Plug request
       Q      IO handled by request queue code
       S      Sleep request
       T      Unplug due to timeout
       U      Unplug request
       X      Split
  • Q2Q:多个发送到块IO层请求的时间间隔。
  • Q2G:生成IO请求所消耗的时间,包括remap(可能被DM(Device Mapper)或MD(Multiple Device, Software RAID)remap到其它设备)和split(可能会因为I/O请求与扇区边界未对齐、或者size太大而被分拆(split)成多个物理I/O)的时间。
  • G2I:IO请求进入IO Scheduler所消耗的时间,包括merge(可能会因为与其它I/O请求的物理位置相邻而合并成一个I/O)的时间。
  • I2D:IO请求在IO Scheduler中等待的时间。
  • D2C:IO请求在driver和硬件上(IO请求被driver提交给硬件,经过HBA、电缆(光纤、网线等)、交换机(SAN或网络)、最后到达存储设备,设备完成IO请求之后再把结果发回)所消耗的时间。
  • Q2C:整个IO请求所消耗的时间(Q2I + I2D + D2C = Q2C),相当于iostat的await。

正常情况D2C占比90%以上,若I2D占比较高,可以尝试调整IO调度策略。

posted on 2022-02-10 10:51  牛虻&专家  阅读(290)  评论(0编辑  收藏  举报