11 Ceph Cluster Testing
Mon High Availability Testing
Simulate a mon node failure
# First, check the cluster status
[root@node0 ceph-deploy]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 9d)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 534 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
# Log in to node2 and stop its mon service
[root@node0 ceph-deploy]# ssh node2
Last login: Thu Nov 3 15:53:48 2022 from node0
[root@node2 ~]# systemctl stop ceph-mon@node2
# Check the cluster status again
[root@node2 ~]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_WARN
1/3 mons down, quorum node0,node1 # node2's mon is down
services:
mon: 3 daemons, quorum node0,node1 (age 0.228493s), out of quorum: node2 # node2's mon is out of quorum
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 534 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
With one mon node down, verify that cluster services still work
# Create an RBD image as a test
[root@node2 ~]# rbd create --size 1G ceph-demo/mon-test
# List the RBD images
[root@node2 ~]# rbd -p ceph-demo ls
ceph-trash.img
crush-demo.img
mon-test # the newly created RBD image
rbd-test-new.img
rbd-test-new2.img
rbd-test.img
rdb-demo.img
vm1-clone.img
vm2-clone.img
vm3-clone.img
[root@node2 ~]# exit
logout
Connection to node2 closed.
Stop 2 of the cluster's mon nodes and test whether services remain available
# Log in to node1
[root@node0 ceph-deploy]# ssh node1
Last login: Thu Nov 3 16:45:40 2022 from node0
# Stop the mon service on node1
[root@node1 ~]# systemctl stop ceph-mon@node1
# Check the cluster status (wait a short while, then query the cluster; node1's mon is now also down)
# ceph -s will block and the cluster cannot be reached
[root@node0 ceph-deploy]# ceph -s
^CCluster connection aborted
[root@node0 ceph-deploy]# rbd -p ceph-demo ls
^C
Ceph mon elections follow a majority rule: more than half of the mon nodes must be up to form a quorum. In day-to-day operations, make sure that a majority of the cluster's mons are running normally.
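If you want to see exactly which mons are in quorum and which one is the current leader, the quorum can be queried directly. A minimal sketch using standard Ceph CLI commands (no cluster-specific assumptions):
# Compact summary of the monitor map and the current quorum
ceph mon stat
# Detailed quorum information, including the elected leader
ceph quorum_status --format json-pretty
# Full monitor map (ranks and addresses)
ceph mon dump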
Restoring the cluster's mon services
# Restore the mon service on node1
[root@node0 ceph-deploy]# ssh node1
Last login: Thu Nov 3 17:08:40 2022 from node0
[root@node1 ~]# systemctl start ceph-mon@node1
[root@node1 ~]# exit
logout
Connection to node1 closed.
# Restore the mon service on node2
[root@node0 ceph-deploy]# ssh node2
Last login: Thu Nov 3 17:07:04 2022 from node0
[root@node2 ~]# systemctl start ceph-mon@node2
[root@node2 ~]# exit
logout
Connection to node2 closed.
# Check the cluster status
[root@node0 ceph-deploy]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 1.89454s)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
MDS Active/Standby Failover
MDS serves CephFS
- MDS runs in active/standby mode by default; as long as at least one MDS daemon in the cluster is up, the service remains available (see the status check sketched below)
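Besides ceph -s, the MDS roles can be checked directly. A minimal sketch, assuming the filesystem name cephfs-demo shown in the status output:
# Per-filesystem view: active rank(s), standby daemons, metadata/data pools
ceph fs status cephfs-demo
# Compact MDS state summary
ceph mds stat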
Check the cluster status
[root@node0 ceph-deploy]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 12m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node1=up:active} 1 up:standby # the active MDS is currently on node1
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
Manually simulate a failure of the active MDS
# Log in to node1
[root@node0 ceph-deploy]# ssh node1
Last login: Thu Nov 3 17:12:25 2022 from node0
# Stop the currently active MDS service
[root@node1 ~]# systemctl stop ceph-mds@node1
# Check the cluster status
[root@node1 ~]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 13m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node2=up:active} 1 up:standby # the active MDS is now on node2: failover completed, and only 1 standby remains
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
Manually simulate MDS failures on 2 nodes
# node1's MDS is already down; now also stop the MDS service on node2
[root@node1 ~]# exit
logout
Connection to node1 closed.
[root@node0 ceph-deploy]# ssh node2
Last login: Thu Nov 3 17:12:37 2022 from node0
# Stop the currently active MDS service
[root@node2 ~]# systemctl stop ceph-mds@node2
# Check the cluster status
[root@node2 ~]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_WARN
insufficient standby MDS daemons available
services:
mon: 3 daemons, quorum node0,node1,node2 (age 15m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node0=up:active} # the active MDS is now on node0: another failover, and no standby is left
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
Restoring the cluster's MDS services
[root@node2 ~]# systemctl start ceph-mds@node2
[root@node2 ~]# exit
logout
Connection to node2 closed.
[root@node0 ceph-deploy]# ssh node1
Last login: Thu Nov 3 17:25:35 2022 from node0
[root@node1 ~]# systemctl start ceph-mds@node1
[root@node1 ~]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 17m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node0=up:active} 2 up:standby
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
RGW High Availability Testing
RGW provides object storage and is itself a stateless service
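Because radosgw is stateless, every daemon can serve any request and each instance can be probed independently (a VIP or load balancer, such as the 192.168.100.100 address used in the tests below, typically sits in front of them). A minimal sketch for checking each gateway directly; port 7480 is the radosgw default and is an assumption here, adjust it to your rgw_frontends setting:
# Probe each radosgw instance directly (7480 is the default frontend port)
curl -s http://node0:7480
curl -s http://node1:7480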
Check the cluster status
[root@node1 ~]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 29m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node0=up:active} 2 up:standby
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 2 daemons active (node0, node1) # 2 rgw daemons are running
task status:
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
Manually simulate a radosgw failure
# Stop the radosgw service on node1
[root@node1 ~]# systemctl stop ceph-radosgw.target
# Check the cluster status
[root@node1 ~]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 30m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node0=up:active} 2 up:standby
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 1 daemon active (node0) # only 1 rgw daemon remains
task status:
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
With one radosgw node down, test cluster services
[root@node1 ~]# curl http://192.168.100.100
<?xml version="1.0" encoding="UTF-8"?>
<ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Owner>
<ID>anonymous</ID>
<DisplayName></DisplayName>
</Owner>
<Buckets></Buckets>
</ListAllMyBucketsResult>
[root@node1 ~]# exit
logout
Connection to node1 closed.
# Test with the client tools
[root@node0 ceph-deploy]# s3cmd ls
2022-10-21 01:39 s3://ceph-s3-bucket
2022-10-21 03:16 s3://s3cmd-demo
2022-10-21 06:46 s3://swift-demo
[root@node0 ceph-deploy]# swift list
ceph-s3-bucket
s3cmd-demo
swift-demo
With all radosgw nodes down, test cluster services
# Stop the radosgw service on node0
[root@node0 ceph-deploy]# systemctl stop ceph-radosgw.target
# Check the cluster status
[root@node0 ceph-deploy]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 32m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node0=up:active} 2 up:standby
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
# the rgw entry is gone
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
# Client-side test
[root@node0 ceph-deploy]# s3cmd ls
ERROR: Error parsing xml: Malformed error XML returned from remote server.. ErrorXML: <html><body><h1>503 Service Unavailable</h1>
No server is available to handle this request.
</body></html>
WARNING: Retrying failed request: / (503 (Service Unavailable))
WARNING: Waiting 3 sec...
^CSee ya!
[root@node0 ceph-deploy]# swift list
^C Aborted
Restoring the cluster's radosgw services
# Start the radosgw services
[root@node0 ceph-deploy]# systemctl start ceph-radosgw.target
[root@node0 ceph-deploy]# ssh node1
Last login: Thu Nov 3 17:42:41 2022 from node0
[root@node1 ~]# systemctl start ceph-radosgw.target
# Check the cluster status
[root@node1 ~]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 33m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node0=up:active} 2 up:standby
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
OSD Disk Failure Testing
Check the cluster and OSD status
# Check the cluster status
[root@node1 ~]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 51m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node0=up:active} 2 up:standby
osd: 6 osds: 6 up (since 6d), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
# Check the OSD status
[root@node1 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-9 0.14639 root ssd
-10 0.04880 host node0-ssd
3 hdd 0.04880 osd.3 up 1.00000 1.00000
-11 0.04880 host node1-ssd
4 hdd 0.04880 osd.4 up 1.00000 1.00000
-12 0.04880 host node2-ssd
5 hdd 0.04880 osd.5 up 1.00000 1.00000
-1 0.14639 root default
-3 0.04880 host node0
0 hdd 0.04880 osd.0 up 1.00000 1.00000
-5 0.04880 host node1
1 hdd 0.04880 osd.1 up 1.00000 1.00000
-7 0.04880 host node2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
Manually simulate OSD failure on one node
# Manually stop the OSD services on node1
[root@node1 ~]# systemctl stop ceph-osd.target
# Check the OSD status
[root@node1 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-9 0.14639 root ssd
-10 0.04880 host node0-ssd
3 hdd 0.04880 osd.3 up 1.00000 1.00000
-11 0.04880 host node1-ssd
4 hdd 0.04880 osd.4 down 1.00000 1.00000
-12 0.04880 host node2-ssd
5 hdd 0.04880 osd.5 up 1.00000 1.00000
-1 0.14639 root default
-3 0.04880 host node0
0 hdd 0.04880 osd.0 up 1.00000 1.00000
-5 0.04880 host node1
1 hdd 0.04880 osd.1 down 1.00000 1.00000
-7 0.04880 host node2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
# Check the cluster status
[root@node1 ~]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_WARN
2 osds down
2 hosts (2 osds) down
Degraded data redundancy: 537/1611 objects degraded (33.333%), 188 pgs degraded
services:
mon: 3 daemons, quorum node0,node1,node2 (age 51m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node0=up:active} 2 up:standby
osd: 6 osds: 4 up (since 6s), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 537 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 537/1611 objects degraded (33.333%)
188 active+undersized+degraded
164 active+undersized
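The 33.333% degraded figure is expected here: the pools are 3-way replicated across the three hosts in each CRUSH root, and stopping node1's OSDs takes exactly one of the three copies of every object offline. To drill into which PGs are affected, a minimal sketch using standard status commands:
# Show which health checks are firing and which OSDs/PGs they involve
ceph health detail
# Per-state PG summary
ceph pg stat
# List PGs that are currently stuck degraded
ceph pg dump_stuck degraded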
With one node's OSDs down, test cluster services
# Create a new RBD image as a test
[root@node1 ~]# rbd create ceph-demo/osd-test.img --size 1G
# List the RBD images
[root@node1 ~]# rbd -p ceph-demo ls
ceph-trash.img
crush-demo.img
mon-test
osd-test.img # the newly created RBD image
rbd-test-new.img
rbd-test-new2.img
rbd-test.img
rdb-demo.img
vm1-clone.img
vm2-clone.img
vm3-clone.img
[root@node1 ~]# exit
logout
Connection to node1 closed.
# Test with the client commands
[root@node0 crushmap]# s3cmd ls
2022-10-21 01:39 s3://ceph-s3-bucket
2022-10-21 03:16 s3://s3cmd-demo
2022-10-21 06:46 s3://swift-demo
With OSDs down on 2 nodes, test cluster services
# Stop the OSD services on node0
[root@node0 crushmap]# systemctl stop ceph-osd.target
# Check the cluster status
[root@node0 crushmap]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_WARN
4 osds down
4 hosts (4 osds) down
Degraded data redundancy: 540/1620 objects degraded (33.333%), 188 pgs degraded, 352 pgs undersized
services:
mon: 3 daemons, quorum node0,node1,node2 (age 53m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node0=up:active} 2 up:standby
osd: 6 osds: 2 up (since 2s), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 540 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 540/1620 objects degraded (33.333%)
103 active+undersized+degraded
90 stale+active+undersized
85 stale+active+undersized+degraded
74 active+undersized
# Check the OSD status
[root@node0 crushmap]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-9 0.14639 root ssd
-10 0.04880 host node0-ssd
3 hdd 0.04880 osd.3 down 1.00000 1.00000
-11 0.04880 host node1-ssd
4 hdd 0.04880 osd.4 down 1.00000 1.00000
-12 0.04880 host node2-ssd
5 hdd 0.04880 osd.5 up 1.00000 1.00000
-1 0.14639 root default
-3 0.04880 host node0
0 hdd 0.04880 osd.0 down 1.00000 1.00000
-5 0.04880 host node1
1 hdd 0.04880 osd.1 down 1.00000 1.00000
-7 0.04880 host node2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
# Command-line test: the command hangs; the cluster is effectively unavailable
[root@node0 crushmap]# s3cmd ls
^CSee ya!
Restoring the cluster's OSD services
# Start the OSD services on node0
[root@node0 crushmap]# systemctl start ceph-osd.target
# Start the OSD services on node1
[root@node0 crushmap]# ssh node1
Last login: Thu Nov 3 18:04:01 2022 from node0
[root@node1 ~]# systemctl start ceph-osd.target
# Wait for data recovery/rebalancing to complete
# Check the cluster status
[root@node1 ~]# ceph -s
cluster:
id: 97702c43-6cc2-4ef8-bdb5-855cfa90a260
health: HEALTH_OK
services:
mon: 3 daemons, quorum node0,node1,node2 (age 56m)
mgr: node1(active, since 12d), standbys: node2, node0
mds: cephfs-demo:1 {0=node0=up:active} 2 up:standby
osd: 6 osds: 6 up (since 40s), 6 in (since 12d)
rgw: 2 daemons active (node0, node1)
task status:
data:
pools: 9 pools, 352 pgs
objects: 540 objects, 655 MiB
usage: 8.4 GiB used, 292 GiB / 300 GiB avail
pgs: 352 active+clean
# Check the OSD status
[root@node1 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-9 0.14639 root ssd
-10 0.04880 host node0-ssd
3 hdd 0.04880 osd.3 up 1.00000 1.00000
-11 0.04880 host node1-ssd
4 hdd 0.04880 osd.4 up 1.00000 1.00000
-12 0.04880 host node2-ssd
5 hdd 0.04880 osd.5 up 1.00000 1.00000
-1 0.14639 root default
-3 0.04880 host node0
0 hdd 0.04880 osd.0 up 1.00000 1.00000
-5 0.04880 host node1
1 hdd 0.04880 osd.1 up 1.00000 1.00000
-7 0.04880 host node2
2 hdd 0.04880 osd.2 up 1.00000 1.00000
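The test above simply stopped and restarted the OSD services, so the cluster only needed a brief recovery. For planned OSD maintenance it is common to stop the cluster from marking the stopped OSDs out, which would otherwise trigger a full data rebalance; a minimal sketch of that routine, not specific to this cluster:
# Before planned maintenance: prevent stopped OSDs from being marked out
ceph osd set noout
# ... stop, service, and restart the OSDs, e.g. systemctl restart ceph-osd.target ...
# Watch cluster events until all PGs are active+clean again
ceph -w
# After the OSDs are back up, remove the flag
ceph osd unset noout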
fio Performance Benchmarking
Commonly used tools for Ceph performance testing:
- fio # simulates the block-device reads/writes a VM would issue; must be installed manually
- rbd bench # shipped with Ceph by default
- rados bench # shipped with Ceph by default (a usage sketch follows this list)
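rados bench is not demonstrated later in this chapter, so here is a minimal usage sketch; the pool name ceph-demo is taken from the earlier examples and the 10-second duration is arbitrary:
# Write benchmark against a pool (keep the objects so they can be read back)
rados bench -p ceph-demo 10 write --no-cleanup
# Sequential and random read benchmarks using the objects written above
rados bench -p ceph-demo 10 seq
rados bench -p ceph-demo 10 rand
# Remove the benchmark objects when finished
rados -p ceph-demo cleanup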
Reference
https://github.com/get-set/fio-bench-disks-ceph
Install the fio tool
[root@node0 ceph-deploy]# yum install fio -y
Check the mounted RBD block devices
[root@node0 ceph-deploy]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 898M 0 898M 0% /dev
tmpfs 910M 0 910M 0% /dev/shm
tmpfs 910M 52M 858M 6% /run
tmpfs 910M 0 910M 0% /sys/fs/cgroup
/dev/mapper/centos-root 37G 2.7G 35G 8% /
/dev/sda1 1014M 151M 864M 15% /boot
tmpfs 910M 24K 910M 1% /var/lib/ceph/osd/ceph-0
tmpfs 910M 52K 910M 1% /var/lib/ceph/osd/ceph-3
/dev/rbd2 9.8G 37M 9.2G 1% /media
/dev/rbd3 9.8G 37M 9.2G 1% /opt
tmpfs 182M 0 182M 0% /run/user/0
[root@node0 ceph-deploy]# ls /media/
file2 lost+found test.txt
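The /dev/rbd2 and /dev/rbd3 devices above are RBD images that were mapped and mounted in earlier chapters. If you need to prepare such a test device from scratch, a minimal sketch; the image name fio-test.img, its size, and the mount point are assumptions for illustration:
# Create, map, format and mount an RBD image for fio testing
rbd create ceph-demo/fio-test.img --size 10G
# On older kernels you may first need: rbd feature disable ceph-demo/fio-test.img object-map fast-diff deep-flatten
DEV=$(rbd map ceph-demo/fio-test.img)   # rbd map prints the /dev/rbdX device it attaches
mkfs.ext4 "$DEV"
mount "$DEV" /media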
Parameter notes
filename # path of the test file
ioengine=libaio # asynchronous I/O engine
direct=1 # bypass the OS page cache so the device, not the cache, is measured
size=200M # maximum amount of data each fio job reads/writes
iodepth=32 # I/O queue depth per job
numjobs=8 # number of fio jobs (threads) run concurrently
rw=randread # changed for each test: randread / read / randwrite / write
bs=4k # changed for each test: 4k for random I/O, a larger block (64k or 1M) for sequential
runtime=60 # run time in seconds (the tests below all use 60)
4K random write
fio -filename=/media/test.file -direct=1 -iodepth 32 -thread -rw=randwrite -ioengine=libaio -bs=4k -size=200M -numjobs=8 -runtime=60 -group_reporting -name=test
- 4K random write test
[root@node0 ceph-deploy]# fio -filename=/media/test.file -direct=1 -iodepth 32 -thread -rw=randwrite -ioengine=libaio -bs=4k -size=200M -numjobs=8 -runtime=60 -group_reporting -name=test
test: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
...
fio-3.7
Starting 8 threads
test: Laying out IO file (1 file / 200MiB)
Jobs: 8 (f=8): [w(8)][100.0%][r=0KiB/s,w=15.6MiB/s][r=0,w=3995 IOPS][eta 00m:00s]
test: (groupid=0, jobs=8): err= 0: pid=89592: Thu Nov 3 19:29:03 2022
write: IOPS=3471, BW=13.6MiB/s (14.2MB/s)(814MiB/60066msec)
slat (nsec): min=1713, max=167750k, avg=133406.59, stdev=2068691.88
clat (msec): min=6, max=858, avg=73.57, stdev=44.01
lat (msec): min=6, max=858, avg=73.70, stdev=44.05
clat percentiles (msec):
| 1.00th=[ 26], 5.00th=[ 33], 10.00th=[ 39], 20.00th=[ 47],
| 30.00th=[ 54], 40.00th=[ 59], 50.00th=[ 66], 60.00th=[ 72],
| 70.00th=[ 80], 80.00th=[ 90], 90.00th=[ 109], 95.00th=[ 140],
| 99.00th=[ 255], 99.50th=[ 326], 99.90th=[ 489], 99.95th=[ 535],
| 99.99th=[ 693]
bw ( KiB/s): min= 142, max= 2440, per=12.44%, avg=1727.22, stdev=472.44, samples=960
iops : min= 35, max= 610, avg=431.59, stdev=118.13, samples=960
lat (msec) : 10=0.01%, 20=0.18%, 50=25.06%, 100=61.38%, 250=12.32%
lat (msec) : 500=0.98%, 750=0.08%, 1000=0.01%
cpu : usr=0.13%, sys=0.61%, ctx=79190, majf=2, minf=157
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=99.9%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=0,208493,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=13.6MiB/s (14.2MB/s), 13.6MiB/s-13.6MiB/s (14.2MB/s-14.2MB/s), io=814MiB (854MB), run=60066-60066msec
Disk stats (read/write):
rbd2: ios=58/206439, merge=0/2864, ticks=697/14673448, in_queue=7558026, util=100.00%
4K random read
fio -filename=/media/test.file -direct=1 -iodepth 32 -thread -rw=randread -ioengine=libaio -bs=4k -size=200M -numjobs=8 -runtime=60 -group_reporting -name=test
- 4K random read test
[root@node0 ceph-deploy]# fio -filename=/media/test.file -direct=1 -iodepth 32 -thread -rw=randread -ioengine=libaio -bs=4k -size=200M -numjobs=8 -runtime=60 -group_reporting -name=test
test: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
...
fio-3.7
Starting 8 threads
Jobs: 8 (f=8): [r(8)][100.0%][r=179MiB/s,w=0KiB/s][r=45.9k,w=0 IOPS][eta 00m:00s]
test: (groupid=0, jobs=8): err= 0: pid=90541: Thu Nov 3 19:33:56 2022
read: IOPS=40.1k, BW=157MiB/s (164MB/s)(1600MiB/10222msec)
slat (nsec): min=832, max=2019.3k, avg=3250.14, stdev=7184.48
clat (usec): min=158, max=59894, avg=6369.15, stdev=3427.94
lat (usec): min=160, max=59896, avg=6372.48, stdev=3428.15
clat percentiles (usec):
| 1.00th=[ 1893], 5.00th=[ 2900], 10.00th=[ 3458], 20.00th=[ 4146],
| 30.00th=[ 4752], 40.00th=[ 5342], 50.00th=[ 5866], 60.00th=[ 6456],
| 70.00th=[ 7111], 80.00th=[ 7898], 90.00th=[ 9110], 95.00th=[10945],
| 99.00th=[20055], 99.50th=[27919], 99.90th=[39584], 99.95th=[42730],
| 99.99th=[48497]
bw ( KiB/s): min=11040, max=24536, per=12.39%, avg=19864.53, stdev=3257.22, samples=160
iops : min= 2760, max= 6134, avg=4966.02, stdev=814.38, samples=160
lat (usec) : 250=0.01%, 500=0.02%, 750=0.03%, 1000=0.05%
lat (msec) : 2=1.14%, 4=16.41%, 10=75.56%, 20=5.76%, 50=1.01%
lat (msec) : 100=0.01%
cpu : usr=0.70%, sys=2.70%, ctx=259352, majf=0, minf=277
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=99.9%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=409600,0,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=157MiB/s (164MB/s), 157MiB/s-157MiB/s (164MB/s-164MB/s), io=1600MiB (1678MB), run=10222-10222msec
Disk stats (read/write):
rbd2: ios=397439/2, merge=7833/1, ticks=2481363/15, in_queue=987297, util=99.18%
4K random read/write mix
fio -filename=/media/test.file -direct=1 -iodepth 32 -thread -rw=randrw --rwmixread=70 -ioengine=libaio -bs=4k -size=200M -numjobs=8 -runtime=60 -group_reporting -name=test
- 4K random read/write mix test
[root@node0 ceph-deploy]# rm /media/test.file -rf
[root@node0 ceph-deploy]# fio -filename=/media/test.file -direct=1 -iodepth 32 -thread -rw=randrw --rwmixread=70 -ioengine=libaio -bs=4k -size=200M -numjobs=8 -runtime=60 -group_reporting -name=test
test: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=32
...
fio-3.7
Starting 8 threads
test: Laying out IO file (1 file / 200MiB)
Jobs: 8 (f=8): [m(8)][100.0%][r=32.2MiB/s,w=13.8MiB/s][r=8248,w=3538 IOPS][eta 00m:00s]
test: (groupid=0, jobs=8): err= 0: pid=90076: Thu Nov 3 19:31:00 2022
read: IOPS=7058, BW=27.6MiB/s (28.9MB/s)(1119MiB/40585msec)
slat (nsec): min=1022, max=95355k, avg=13037.51, stdev=547433.62
clat (usec): min=15, max=312771, avg=22737.63, stdev=15433.24
lat (usec): min=262, max=312775, avg=22750.80, stdev=15442.52
clat percentiles (msec):
| 1.00th=[ 3], 5.00th=[ 6], 10.00th=[ 8], 20.00th=[ 11],
| 30.00th=[ 14], 40.00th=[ 17], 50.00th=[ 20], 60.00th=[ 24],
| 70.00th=[ 28], 80.00th=[ 33], 90.00th=[ 41], 95.00th=[ 50],
| 99.00th=[ 75], 99.50th=[ 91], 99.90th=[ 142], 99.95th=[ 169],
| 99.99th=[ 199]
bw ( KiB/s): min= 1596, max= 4455, per=12.48%, avg=3523.13, stdev=571.72, samples=643
iops : min= 399, max= 1113, avg=880.62, stdev=142.97, samples=643
write: IOPS=3033, BW=11.9MiB/s (12.4MB/s)(481MiB/40585msec)
slat (nsec): min=1303, max=94612k, avg=13992.99, stdev=431206.41
clat (msec): min=2, max=401, avg=31.22, stdev=16.72
lat (msec): min=2, max=401, avg=31.24, stdev=16.72
clat percentiles (msec):
| 1.00th=[ 11], 5.00th=[ 14], 10.00th=[ 17], 20.00th=[ 20],
| 30.00th=[ 23], 40.00th=[ 26], 50.00th=[ 28], 60.00th=[ 32],
| 70.00th=[ 35], 80.00th=[ 41], 90.00th=[ 50], 95.00th=[ 58],
| 99.00th=[ 88], 99.50th=[ 111], 99.90th=[ 178], 99.95th=[ 213],
| 99.99th=[ 313]
bw ( KiB/s): min= 656, max= 1980, per=12.48%, avg=1514.49, stdev=255.76, samples=643
iops : min= 164, max= 495, avg=378.51, stdev=64.00, samples=643
lat (usec) : 20=0.01%, 50=0.01%, 100=0.01%, 500=0.01%, 750=0.01%
lat (usec) : 1000=0.02%
lat (msec) : 2=0.21%, 4=1.59%, 10=10.91%, 20=29.25%, 50=52.08%
lat (msec) : 100=5.48%, 250=0.44%, 500=0.01%
cpu : usr=0.34%, sys=1.11%, ctx=177815, majf=0, minf=19
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=99.9%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
issued rwts: total=286473,123127,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
READ: bw=27.6MiB/s (28.9MB/s), 27.6MiB/s-27.6MiB/s (28.9MB/s-28.9MB/s), io=1119MiB (1173MB), run=40585-40585msec
WRITE: bw=11.9MiB/s (12.4MB/s), 11.9MiB/s-11.9MiB/s (12.4MB/s-12.4MB/s), io=481MiB (504MB), run=40585-40585msec
Disk stats (read/write):
rbd2: ios=281793/122139, merge=4006/749, ticks=6116534/3661561, in_queue=5023543, util=100.00%
1M sequential write
fio -filename=/media/test.file -direct=1 -iodepth 32 -thread -rw=write -ioengine=libaio -bs=1M -size=200M -numjobs=8 -runtime=60 -group_reporting -name=test
- 1M sequential write test
[root@node0 ceph-deploy]# rm /media/test.file -rf
[root@node0 ceph-deploy]# fio -filename=/media/test.file -direct=1 -iodepth 32 -thread -rw=write -ioengine=libaio -bs=1M -size=200M -numjobs=8 -runtime=60 -group_reporting -name=test
test: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
...
fio-3.7
Starting 8 threads
test: Laying out IO file (1 file / 200MiB)
Jobs: 8 (f=8): [W(8)][76.9%][r=0KiB/s,w=113MiB/s][r=0,w=113 IOPS][eta 00m:03s]
test: (groupid=0, jobs=8): err= 0: pid=90293: Thu Nov 3 19:32:09 2022
write: IOPS=164, BW=164MiB/s (172MB/s)(1600MiB/9732msec)
slat (usec): min=43, max=99240, avg=253.89, stdev=2871.41
clat (msec): min=6, max=3988, avg=1471.39, stdev=965.40
lat (msec): min=6, max=3988, avg=1471.67, stdev=965.31
clat percentiles (msec):
| 1.00th=[ 29], 5.00th=[ 99], 10.00th=[ 176], 20.00th=[ 330],
| 30.00th=[ 651], 40.00th=[ 1250], 50.00th=[ 1552], 60.00th=[ 1921],
| 70.00th=[ 2165], 80.00th=[ 2400], 90.00th=[ 2668], 95.00th=[ 2937],
| 99.00th=[ 3574], 99.50th=[ 3742], 99.90th=[ 3977], 99.95th=[ 3977],
| 99.99th=[ 3977]
bw ( KiB/s): min= 2019, max=69354, per=13.75%, avg=23141.23, stdev=14280.91, samples=119
iops : min= 1, max= 67, avg=22.03, stdev=14.02, samples=119
lat (msec) : 10=0.12%, 20=0.31%, 50=1.88%, 100=3.12%, 250=9.94%
lat (msec) : 500=11.50%, 750=4.81%, 1000=4.12%
cpu : usr=0.10%, sys=0.19%, ctx=741, majf=3, minf=68
IO depths : 1=0.5%, 2=1.0%, 4=2.0%, 8=4.0%, 16=8.0%, 32=84.5%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=99.4%, 8=0.0%, 16=0.0%, 32=0.6%, 64=0.0%, >=64=0.0%
issued rwts: total=0,1600,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=32
Run status group 0 (all jobs):
WRITE: bw=164MiB/s (172MB/s), 164MiB/s-164MiB/s (172MB/s-172MB/s), io=1600MiB (1678MB), run=9732-9732msec
Disk stats (read/write):
rbd2: ios=1/954, merge=0/616, ticks=29/1239789, in_queue=1225428, util=98.78%
RBD Bench Stress Testing
Help for the bench command
[root@node0 ceph-deploy]# rbd help bench
usage: rbd bench [--pool <pool>] [--namespace <namespace>] [--image <image>]
[--io-size <io-size>] [--io-threads <io-threads>]
[--io-total <io-total>] [--io-pattern <io-pattern>]
[--rw-mix-read <rw-mix-read>] --io-type <io-type>
<image-spec>
Simple benchmark.
Positional arguments
<image-spec> image specification
(example: [<pool-name>/[<namespace>/]]<image-name>)
Optional arguments
-p [ --pool ] arg pool name
--namespace arg namespace name
--image arg image name
--io-size arg IO size (in B/K/M/G/T) [default: 4K]
--io-threads arg ios in flight [default: 16]
--io-total arg total size for IO (in B/K/M/G/T) [default: 1G]
--io-pattern arg IO pattern (rand or seq) [default: seq]
--rw-mix-read arg read proportion in readwrite (<= 100) [default: 50]
--io-type arg IO type (read , write, or readwrite(rw))
Random write
[root@node0 ceph-deploy]# rbd bench ceph-demo/osd-test.img --io-size 4K --io-threads 16 --io-total 200M --io-pattern rand --io-type write
bench type write io_size 4096 io_threads 16 bytes 209715200 pattern random
SEC OPS OPS/SEC BYTES/SEC
1 8272 8197.86 33578435.84
2 11248 5632.02 23068774.36
3 14256 4746.28 19440762.01
4 17376 4328.54 17729703.35
5 20432 4086.35 16737685.77
6 22688 2862.60 11725218.03
7 24912 2643.97 10829719.87
8 25888 2326.88 9530882.85
9 27776 2087.52 8550499.60
10 29968 1896.96 7769968.03
11 31376 1746.34 7153006.20
12 32784 1628.81 6671602.53
13 35120 1733.39 7099969.91
14 37344 1912.84 7835006.37
15 40704 2150.22 8807300.44
16 44048 2538.98 10399667.85
17 47136 2869.27 11752509.55
18 50720 3341.92 13688492.60
elapsed: 19 ops: 51200 ops/sec: 2600.18 bytes/sec: 10650354.49
Random read
[root@node0 ceph-deploy]# rbd bench ceph-demo/osd-test.img --io-size 4K --io-threads 16 --io-total 200M --io-pattern rand --io-type read
bench type read io_size 4096 io_threads 16 bytes 209715200 pattern random
SEC OPS OPS/SEC BYTES/SEC
1 15616 15663.40 64157270.44
2 33904 16977.05 69538006.02
elapsed: 2 ops: 51200 ops/sec: 17403.20 bytes/sec: 71283524.75
Random mixed read/write
[root@node0 ceph-deploy]# rbd bench ceph-demo/osd-test.img --io-size 4K --io-threads 16 --io-total 200M --io-pattern rand --io-type readwrite --rw-mix-read 70
bench type readwrite read:write=70:30 io_size 4096 io_threads 16 bytes 209715200 pattern random
SEC OPS OPS/SEC BYTES/SEC
1 14960 14961.11 61280687.15
2 19472 9633.26 39457835.74
3 24512 8176.04 33489044.40
4 31584 7896.09 32342372.20
5 38368 7676.83 31444312.13
6 45680 6145.26 25170969.70
elapsed: 7 ops: 51200 ops/sec: 6593.72 bytes/sec: 27007872.40
read_ops: 36100 read_ops/sec: 4649.09 read_bytes/sec: 19042660.03
write_ops: 15100 write_ops/sec: 1944.63 write_bytes/sec: 7965212.37
Sequential write
[root@node0 ceph-deploy]# rbd bench ceph-demo/osd-test.img --io-size 4K --io-threads 16 --io-total 1G --io-pattern seq --io-type write
bench type write io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
SEC OPS OPS/SEC BYTES/SEC
1 33440 33157.73 135814054.96
2 61920 30890.91 126529165.55
3 98624 32858.24 134587350.16
4 124416 31100.36 127387084.22
5 152752 30547.63 125123073.95
6 181968 29747.38 121845258.57
7 207424 29130.06 119316720.16
8 234368 27132.64 111135294.93
elapsed: 9 ops: 262144 ops/sec: 28665.41 bytes/sec: 117413512.09