Ceph (Part 4): Cluster Management and Common PG States
1. Summary of Common Ceph Management Commands
1.1 List only the storage pools
ceph osd pool ls
Example
$ ceph osd pool ls
device_health_metrics
mypool
myrbd1
rbd-data1
1.2 List the storage pools with their IDs
ceph osd lspools
Example
$ ceph osd lspools
1 device_health_metrics
2 mypool
3 myrbd1
4 rbd-data1
1.3 Check PG status
ceph pg stat
Example
$ ceph pg stat
97 pgs: 97 active+clean; 43 MiB data, 5.9 GiB used, 20 TiB / 20 TiB avail
1.4 Check the status of one pool or of all pools
ceph osd pool stats
Example
cephadmin@ceph-deploy:~$ ceph osd pool stats mypool
pool mypool id 2
  nothing is going on

cephadmin@ceph-deploy:~$ ceph osd pool stats
pool device_health_metrics id 1
  nothing is going on

pool mypool id 2
  nothing is going on

pool myrbd1 id 3
  nothing is going on

pool rbd-data1 id 4
  nothing is going on
1.5 Check cluster storage usage
ceph df
Example
cephadmin@ceph-deploy:~$ ceph df
--- RAW STORAGE ---
CLASS  SIZE    AVAIL   USED     RAW USED  %RAW USED
hdd    20 TiB  20 TiB  5.9 GiB   5.9 GiB       0.03
TOTAL  20 TiB  20 TiB  5.9 GiB   5.9 GiB       0.03

--- POOLS ---
POOL                   ID  PGS  STORED  OBJECTS  USED    %USED  MAX AVAIL
device_health_metrics   1    1     0 B        0     0 B      0    6.3 TiB
mypool                  2   32     0 B        0     0 B      0    6.3 TiB
myrbd1                  3   32    19 B        3  12 KiB      0    6.3 TiB
rbd-data1               4   32  11 MiB       74  33 MiB      0    6.3 TiB
1.6 Check detailed cluster storage usage
ceph df detail
Example
$ ceph df detail
--- RAW STORAGE ---
CLASS  SIZE    AVAIL   USED     RAW USED  %RAW USED
hdd    20 TiB  20 TiB  5.9 GiB   5.9 GiB       0.03
TOTAL  20 TiB  20 TiB  5.9 GiB   5.9 GiB       0.03

--- POOLS ---
POOL                   ID  PGS  STORED  (DATA)  (OMAP)  OBJECTS  USED    (DATA)  (OMAP)  %USED  MAX AVAIL  QUOTA OBJECTS  QUOTA BYTES  DIRTY  USED COMPR  UNDER COMPR
device_health_metrics   1    1     0 B     0 B     0 B        0     0 B     0 B     0 B      0    6.3 TiB            N/A          N/A    N/A         0 B          0 B
mypool                  2   32     0 B     0 B     0 B        0     0 B     0 B     0 B      0    6.3 TiB            N/A          N/A    N/A         0 B          0 B
myrbd1                  3   32    19 B    19 B     0 B        3  12 KiB  12 KiB     0 B      0    6.3 TiB            N/A          N/A    N/A         0 B          0 B
rbd-data1               4   32  11 MiB  11 MiB     0 B       74  33 MiB  33 MiB     0 B      0    6.3 TiB            N/A          N/A    N/A         0 B          0 B
1.7 Check OSD status
ceph osd stat
Example
$ ceph osd stat
20 osds: 20 up (since 3h), 20 in (since 2d); epoch: e302
1.8 Show detailed low-level OSD map information
ceph osd dump
Example
$ ceph osd dump
epoch 302
fsid 28820ae5-8747-4c53-827b-219361781ada
created 2023-09-21T02:58:34.034362+0800
modified 2023-09-24T04:18:36.462497+0800
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 62
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client luminous
min_compat_client luminous
require_osd_release pacific
stretch_mode_enabled false
pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 182 flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr_devicehealth
pool 2 'mypool' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 150 flags hashpspool stripe_width 0
pool 3 'myrbd1' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 203 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
pool 4 'rbd-data1' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 299 flags hashpspool,selfmanaged_snaps stripe_width 0 application rbd
max_osd 20
osd.0 up in weight 1 up_from 251 up_thru 291 down_at 248 last_clean_interval [184,250) [v2:10.0.0.57:6800/2618,v1:10.0.0.57:6801/2618] [v2:192.168.10.57:6816/1002618,v1:192.168.10.57:6817/1002618] exists,up 886cced3-abd0-47ba-9ecb-fff72fe44a5a
osd.1 up in weight 1 up_from 218 up_thru 291 down_at 216 last_clean_interval [184,217) [v2:10.0.0.57:6816/2633,v1:10.0.0.57:6817/2633] [v2:192.168.10.57:6804/1002633,v1:192.168.10.57:6805/1002633] exists,up 9774613d-32ca-4250-ab1a-f51db165a56f
osd.2 up in weight 1 up_from 259 up_thru 289 down_at 258 last_clean_interval [184,258) [v2:10.0.0.57:6812/2625,v1:10.0.0.57:6813/2625] [v2:192.168.10.57:6806/1002625,v1:192.168.10.57:6807/1002625] exists,up 15dc3bc8-0c06-410b-ac09-1a81e3cf207c
osd.3 up in weight 1 up_from 255 up_thru 291 down_at 253 last_clean_interval [184,254) [v2:10.0.0.57:6804/2630,v1:10.0.0.57:6806/2630] [v2:192.168.10.57:6800/2002630,v1:192.168.10.57:6801/2002630] exists,up 12accf18-5fb9-4645-b1ac-33d988db6753
osd.4 up in weight 1 up_from 270 up_thru 291 down_at 267 last_clean_interval [184,269) [v2:10.0.0.57:6805/2634,v1:10.0.0.57:6807/2634] [v2:192.168.10.57:6812/2002634,v1:192.168.10.57:6813/2002634] exists,up 7b2cbcbe-c572-4539-85ce-9df729be1e13
osd.5 up in weight 1 up_from 284 up_thru 288 down_at 281 last_clean_interval [187,283) [v2:10.0.0.58:6809/2645,v1:10.0.0.58:6811/2645] [v2:192.168.10.58:6804/3002645,v1:192.168.10.58:6805/3002645] exists,up bc8dfb8a-7906-49dc-a6da-89e626a24e4e
osd.6 up in weight 1 up_from 243 up_thru 291 down_at 237 last_clean_interval [187,242) [v2:10.0.0.58:6806/2647,v1:10.0.0.58:6807/2647] [v2:192.168.10.58:6820/2002647,v1:192.168.10.58:6821/2002647] exists,up f23ec8f1-56da-41c7-b8ed-837100612e26
osd.7 up in weight 1 up_from 243 up_thru 291 down_at 226 last_clean_interval [187,242) [v2:10.0.0.58:6800/2643,v1:10.0.0.58:6801/2643] [v2:192.168.10.58:6816/2002643,v1:192.168.10.58:6817/2002643] exists,up b6e9dfb1-1ab4-40b5-ba3e-fea590753f79
osd.8 up in weight 1 up_from 243 up_thru 291 down_at 234 last_clean_interval [187,242) [v2:10.0.0.58:6804/2651,v1:10.0.0.58:6805/2651] [v2:192.168.10.58:6809/2002651,v1:192.168.10.58:6811/2002651] exists,up fda2c1d3-acab-4c3b-a94c-089a585a76fe
osd.9 up in weight 1 up_from 286 up_thru 291 down_at 281 last_clean_interval [187,285) [v2:10.0.0.58:6816/2648,v1:10.0.0.58:6817/2648] [v2:192.168.10.58:6806/3002648,v1:192.168.10.58:6807/3002648] exists,up 2ca3d89a-2b5c-45ff-94a4-c87960f94533
osd.10 up in weight 1 up_from 271 up_thru 291 down_at 267 last_clean_interval [174,270) [v2:10.0.0.56:6805/1496,v1:10.0.0.56:6806/1496] [v2:192.168.10.56:6816/1001496,v1:192.168.10.56:6817/1001496] exists,up 10cd2d83-482f-4687-93ae-fe5e172694a1
osd.11 up in weight 1 up_from 264 up_thru 294 down_at 261 last_clean_interval [175,263) [v2:10.0.0.56:6807/1493,v1:10.0.0.56:6808/1493] [v2:192.168.10.56:6805/2001493,v1:192.168.10.56:6806/2001493] exists,up 57a2c73e-2629-4c8c-a9c5-e798d07cd332
osd.12 up in weight 1 up_from 256 up_thru 294 down_at 253 last_clean_interval [174,255) [v2:10.0.0.56:6816/1497,v1:10.0.0.56:6817/1497] [v2:192.168.10.56:6809/2001497,v1:192.168.10.56:6811/2001497] exists,up e282d88f-4937-4f0d-bba8-960dd0f0a26d
osd.13 up in weight 1 up_from 259 up_thru 291 down_at 258 last_clean_interval [174,258) [v2:10.0.0.56:6800/1494,v1:10.0.0.56:6809/1494] [v2:192.168.10.56:6820/2001494,v1:192.168.10.56:6821/2001494] exists,up 36287ce5-ed8a-4494-84dd-a8a739139f90
osd.14 up in weight 1 up_from 255 up_thru 291 down_at 253 last_clean_interval [174,254) [v2:10.0.0.56:6801/1495,v1:10.0.0.56:6802/1495] [v2:192.168.10.56:6800/2001495,v1:192.168.10.56:6801/2001495] exists,up 444174d7-8cd5-4567-bab3-247506d981a2
osd.15 up in weight 1 up_from 288 up_thru 291 down_at 276 last_clean_interval [179,287) [v2:10.0.0.59:6804/1462,v1:10.0.0.59:6805/1462] [v2:192.168.10.59:6816/2001462,v1:192.168.10.59:6817/2001462] exists,up c3a323cd-652f-4b70-9de2-e290547a3df0
osd.16 up in weight 1 up_from 289 up_thru 294 down_at 279 last_clean_interval [178,288) [v2:10.0.0.59:6800/1463,v1:10.0.0.59:6801/1463] [v2:192.168.10.59:6804/2001463,v1:192.168.10.59:6805/2001463] exists,up 485600b0-d305-49ef-84e4-112bab8f6cc2
osd.17 up in weight 1 up_from 288 up_thru 291 down_at 279 last_clean_interval [179,287) [v2:10.0.0.59:6808/1461,v1:10.0.0.59:6809/1461] [v2:192.168.10.59:6808/2001461,v1:192.168.10.59:6809/2001461] exists,up d4796af7-b353-4bea-a8a2-5e4f8ff262aa
osd.18 up in weight 1 up_from 288 up_thru 291 down_at 279 last_clean_interval [179,287) [v2:10.0.0.59:6812/1464,v1:10.0.0.59:6813/1464] [v2:192.168.10.59:6820/2001464,v1:192.168.10.59:6821/2001464] exists,up 93c30827-b7a0-4612-93b7-5b04618869a2
osd.19 up in weight 1 up_from 289 up_thru 289 down_at 276 last_clean_interval [179,288) [v2:10.0.0.59:6816/1465,v1:10.0.0.59:6817/1465] [v2:192.168.10.59:6812/2001465,v1:192.168.10.59:6813/2001465] exists,up 218b659a-0082-4760-8997-419d4b4b11c2
pg_upmap_items 4.6 [7,5]
pg_upmap_items 4.b [7,5]
pg_upmap_items 4.1f [1,0]
blocklist 10.0.0.65:0/3630118219 expires 2023-09-24T05:18:35.693210+0800
blocklist 10.0.0.55:6800/990 expires 2023-09-24T23:50:36.409813+0800
blocklist 10.0.0.55:0/4018866837 expires 2023-09-24T23:50:36.409813+0800
blocklist 10.0.0.55:6801/990 expires 2023-09-24T23:50:36.409813+0800
blocklist 10.0.0.55:0/403645778 expires 2023-09-24T23:50:36.409813+0800
blocklist 10.0.0.55:0/1925626704 expires 2023-09-24T23:50:36.409813+0800
1.9 Show the mapping between OSDs and nodes
ceph osd tree
Example:
Find the node (and then the disk) behind an OSD; for example, osd.13 sits on the ceph-node1 host:
$ ceph osd tree
ID  CLASS  WEIGHT    TYPE NAME            STATUS  REWEIGHT  PRI-AFF
-1         20.00000  root default
-7          5.00000      host ceph-node1
10    hdd   1.00000          osd.10           up   1.00000  1.00000
11    hdd   1.00000          osd.11           up   1.00000  1.00000
12    hdd   1.00000          osd.12           up   1.00000  1.00000
13    hdd   1.00000          osd.13           up   1.00000  1.00000
14    hdd   1.00000          osd.14           up   1.00000  1.00000
-3          5.00000      host ceph-node2
 0    hdd   1.00000          osd.0            up   1.00000  1.00000
 1    hdd   1.00000          osd.1            up   1.00000  1.00000
 2    hdd   1.00000          osd.2            up   1.00000  1.00000
 3    hdd   1.00000          osd.3            up   1.00000  1.00000
 4    hdd   1.00000          osd.4            up   1.00000  1.00000
-5          5.00000      host ceph-node3
 5    hdd   1.00000          osd.5            up   1.00000  1.00000
 6    hdd   1.00000          osd.6            up   1.00000  1.00000
 7    hdd   1.00000          osd.7            up   1.00000  1.00000
 8    hdd   1.00000          osd.8            up   1.00000  1.00000
 9    hdd   1.00000          osd.9            up   1.00000  1.00000
-9          5.00000      host ceph-node4
15    hdd   1.00000          osd.15           up   1.00000  1.00000
16    hdd   1.00000          osd.16           up   1.00000  1.00000
17    hdd   1.00000          osd.17           up   1.00000  1.00000
18    hdd   1.00000          osd.18           up   1.00000  1.00000
19    hdd   1.00000          osd.19           up   1.00000  1.00000
Then log in to the node that hosts the OSD and find the disk backing it:
# osd.13 turns out to be backed by /dev/sde
[root@ceph-node1 ~]# ll /var/lib/ceph/osd/ceph-13/block
lrwxrwxrwx 1 ceph ceph 93 Sep 23 23:49 /var/lib/ceph/osd/ceph-13/block -> /dev/ceph-e54d4fb9-6f42-4e06-96a3-330813cf9342/osd-block-36287ce5-ed8a-4494-84dd-a8a739139f90
[root@ceph-node1 ~]# lsblk -f | grep -B1 ceph
sdb        LVM2_member       96sQIt-6luz-rxwI-DevT-TTSX-gwPw-0akFDx
└─ceph--be8feab1--4bc8--44b8--9394--2eb42fca07fe-osd--block--10cd2d83--482f--4687--93ae--fe5e172694a1   ceph_bluestore
...
sde        LVM2_member       jGHnbR-WmNT-8mZW-6pXD-8ZR6-xXqL-23ACnR
└─ceph--e54d4fb9--6f42--4e06--96a3--330813cf9342-osd--block--36287ce5--ed8a--4494--84dd--a8a739139f90   ceph_bluestore
1.10 Show OSD storage usage together with the node mapping
ceph osd df tree
Example
$ ceph osd df tree
ID  CLASS  WEIGHT    REWEIGHT  SIZE      RAW USE  DATA     OMAP    META     AVAIL     %USE  VAR   PGS  STATUS  TYPE NAME
-1         20.00000         -    20 TiB  5.9 GiB  117 MiB     0 B  5.8 GiB    20 TiB  0.03  1.00    -          root default
-7          5.00000         -   5.0 TiB  1.5 GiB   32 MiB     0 B  1.5 GiB   5.0 TiB  0.03  1.01    -              host ceph-node1
10    hdd   1.00000   1.00000  1024 GiB  299 MiB  4.2 MiB     0 B  295 MiB  1024 GiB  0.03  0.99    9      up          osd.10
11    hdd   1.00000   1.00000  1024 GiB  303 MiB  8.2 MiB     0 B  295 MiB  1024 GiB  0.03  1.00   11      up          osd.11
12    hdd   1.00000   1.00000  1024 GiB  303 MiB  4.3 MiB     0 B  299 MiB  1024 GiB  0.03  1.00   20      up          osd.12
13    hdd   1.00000   1.00000  1024 GiB  310 MiB  7.0 MiB     0 B  303 MiB  1024 GiB  0.03  1.03   22      up          osd.13
14    hdd   1.00000   1.00000  1024 GiB  303 MiB  8.2 MiB     0 B  295 MiB  1024 GiB  0.03  1.00   12      up          osd.14
-3          5.00000         -   5.0 TiB  1.5 GiB   27 MiB     0 B  1.4 GiB   5.0 TiB  0.03  1.00    -              host ceph-node2
 0    hdd   1.00000   1.00000  1024 GiB  303 MiB  8.2 MiB     0 B  295 MiB  1024 GiB  0.03  1.00    8      up          osd.0
 1    hdd   1.00000   1.00000  1024 GiB  305 MiB  6.3 MiB     0 B  299 MiB  1024 GiB  0.03  1.01   21      up          osd.1
 2    hdd   1.00000   1.00000  1024 GiB  295 MiB  4.2 MiB     0 B  291 MiB  1024 GiB  0.03  0.98    6      up          osd.2
 3    hdd   1.00000   1.00000  1024 GiB  307 MiB  4.3 MiB     0 B  303 MiB  1024 GiB  0.03  1.02   20      up          osd.3
 4    hdd   1.00000   1.00000  1024 GiB  299 MiB  4.3 MiB     0 B  295 MiB  1024 GiB  0.03  0.99   11      up          osd.4
-5          5.00000         -   5.0 TiB  1.5 GiB   28 MiB     0 B  1.4 GiB   5.0 TiB  0.03  1.00    -              host ceph-node3
 5    hdd   1.00000   1.00000  1024 GiB  299 MiB  4.2 MiB     0 B  295 MiB  1024 GiB  0.03  0.99   11      up          osd.5
 6    hdd   1.00000   1.00000  1024 GiB  303 MiB  8.2 MiB     0 B  295 MiB  1024 GiB  0.03  1.00   15      up          osd.6
 7    hdd   1.00000   1.00000  1024 GiB  302 MiB  6.4 MiB     0 B  295 MiB  1024 GiB  0.03  1.00   19      up          osd.7
 8    hdd   1.00000   1.00000  1024 GiB  299 MiB  4.2 MiB     0 B  295 MiB  1024 GiB  0.03  0.99   16      up          osd.8
 9    hdd   1.00000   1.00000  1024 GiB  300 MiB  4.9 MiB     0 B  295 MiB  1024 GiB  0.03  0.99   11      up          osd.9
-9          5.00000         -   5.0 TiB  1.5 GiB   30 MiB     0 B  1.4 GiB   5.0 TiB  0.03  1.00    -              host ceph-node4
15    hdd   1.00000   1.00000  1024 GiB  299 MiB  4.2 MiB     0 B  295 MiB  1024 GiB  0.03  0.99   16      up          osd.15
16    hdd   1.00000   1.00000  1024 GiB  299 MiB  4.2 MiB     0 B  295 MiB  1024 GiB  0.03  0.99   11      up          osd.16
17    hdd   1.00000   1.00000  1024 GiB  303 MiB  8.2 MiB     0 B  295 MiB  1024 GiB  0.03  1.00   19      up          osd.17
18    hdd   1.00000   1.00000  1024 GiB  307 MiB  8.4 MiB     0 B  299 MiB  1024 GiB  0.03  1.02   17      up          osd.18
19    hdd   1.00000   1.00000  1024 GiB  301 MiB  4.9 MiB     0 B  296 MiB  1024 GiB  0.03  1.00   16      up          osd.19
                         TOTAL    20 TiB  5.9 GiB  117 MiB  19 KiB  5.8 GiB    20 TiB  0.03
MIN/MAX VAR: 0.98/1.03  STDDEV: 0
1.11 Check monitor status
ceph mon stat
Example
$ ceph mon stat
e3: 3 mons at {ceph-mon1=[v2:10.0.0.51:3300/0,v1:10.0.0.51:6789/0],ceph-mon2=[v2:10.0.0.52:3300/0,v1:10.0.0.52:6789/0],ceph-mon3=[v2:10.0.0.53:3300/0,v1:10.0.0.53:6789/0]} removed_ranks: {}, election epoch 30, leader 0 ceph-mon1, quorum 0,1,2 ceph-mon1,ceph-mon2,ceph-mon3
1.12 Dump the monitor map
ceph mon dump
Example
$ ceph mon dump
epoch 3
fsid 28820ae5-8747-4c53-827b-219361781ada
last_changed 2023-09-21T04:46:48.910442+0800
created 2023-09-21T02:58:33.478584+0800
min_mon_release 16 (pacific)
election_strategy: 1
0: [v2:10.0.0.51:3300/0,v1:10.0.0.51:6789/0] mon.ceph-mon1
1: [v2:10.0.0.52:3300/0,v1:10.0.0.52:6789/0] mon.ceph-mon2
2: [v2:10.0.0.53:3300/0,v1:10.0.0.53:6789/0] mon.ceph-mon3
dumped monmap epoch 3
2. Stopping or Restarting the Ceph Cluster
Before stopping or restarting, set the noout flag on the cluster so that OSDs are not marked out and kicked from the cluster while the node services are down.
# Set noout before stopping services
cephadmin@ceph-deploy:~$ ceph osd set noout
noout is set
cephadmin@ceph-deploy:~$ ceph osd stat
20 osds: 20 up (since 13h), 20 in (since 3d); epoch: e304
flags noout

# Unset noout after the services are back up
cephadmin@ceph-deploy:~$ ceph osd unset noout
noout is unset
cephadmin@ceph-deploy:~$ ceph osd stat
20 osds: 20 up (since 13h), 20 in (since 3d); epoch: e305
2.1 Shutdown order
- Set noout before stopping services
- Stop the storage clients so that no data is being read or written
- If RGW is in use, stop RGW
- Stop the CephFS metadata servers (MDS)
- Stop the Ceph OSDs (a sketch of the corresponding service commands follows this list)
- Stop the Ceph managers
- Stop the Ceph monitors
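The exact service names depend on how the cluster was deployed; a minimal sketch using the systemd targets shipped with the Ceph packages (run each command on the nodes of the corresponding role) could look like this:
# On the deploy/admin node: keep OSDs from being marked out while they are stopped
ceph osd set noout
# On the RGW nodes (only if RGW is deployed)
systemctl stop ceph-radosgw.target
# On the MDS nodes (only if CephFS is deployed)
systemctl stop ceph-mds.target
# On every OSD node
systemctl stop ceph-osd.target
# On every manager node
systemctl stop ceph-mgr.target
# On every monitor node (last)
systemctl stop ceph-mon.target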
2.2 Startup order
- Start the Ceph monitors
- Start the Ceph managers
- Start the Ceph OSDs (see the sketch after this list)
- Start the CephFS metadata servers (MDS)
- Start RGW
- Start the storage clients
- Unset noout once the services are up
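Startup is the reverse order; again only a sketch built on the stock systemd targets:
# On every monitor node (first)
systemctl start ceph-mon.target
# On every manager node
systemctl start ceph-mgr.target
# On every OSD node
systemctl start ceph-osd.target
# On the MDS nodes (if CephFS is deployed)
systemctl start ceph-mds.target
# On the RGW nodes (if RGW is deployed)
systemctl start ceph-radosgw.target
# On the deploy/admin node, once everything is back up
ceph osd unset noout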
2.3 Adding a server
- Add the Ceph repository source
- Install Ceph on the new node:
  ceph-deploy install --release pacific {ceph-nodeX}
- Zap (erase) the disk:
  ceph-deploy disk zap {ceph-nodeX} {/dev/sdx}
- Create the OSD (a full example sequence is sketched below):
  ceph-deploy osd create {ceph-nodeX} --data {/dev/sdx}
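As referenced above, here is a sketch of the whole sequence for a hypothetical new node ceph-node5 with a new disk /dev/sdb (both names are assumptions for illustration), run from the deploy node:
# Install the Pacific packages on the new node
ceph-deploy install --release pacific ceph-node5
# Wipe the disk that will back the new OSD
ceph-deploy disk zap ceph-node5 /dev/sdb
# Create the OSD on that disk
ceph-deploy osd create ceph-node5 --data /dev/sdb
# Verify that the new OSD appears and wait for rebalancing to finish
ceph osd tree
ceph -s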
2.4 Removing an OSD or a server
To remove a failed OSD from the Ceph cluster:
- Mark the OSD out of the cluster:
  ceph osd out osd.{id}
- Wait a while for the cluster to rebalance the data off that OSD
- Log in to the node that hosts the OSD and stop the osd.{id} process:
  systemctl stop ceph-osd@{id}.service
- Remove the OSD (a full example sequence is sketched below):
  ceph osd rm osd.{id}
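A sketch of the full sequence for a hypothetical osd.13 (the id is only an example). Note that ceph osd rm only deletes the OSD id, so the CRUSH entry and the authentication key are usually cleaned up as well:
# On the deploy/admin node: mark the OSD out and wait for rebalancing
ceph osd out osd.13
ceph -w          # watch until the cluster is back to active+clean
# On the node that hosts the OSD: stop the daemon
systemctl stop ceph-osd@13.service
# Back on the admin node: remove the CRUSH entry, the key, and the OSD id
ceph osd crush rm osd.13
ceph auth del osd.13
ceph osd rm osd.13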
2.5 Removing a server
Before removing a server, stop all of its OSDs and remove them from the Ceph cluster first.
- Mark the OSD out of the cluster
- Wait a while
- Log in to the node and stop the osd.{id} process
- Remove the OSD
- Repeat the steps above for every OSD on the node
- Once all OSDs have been removed, take the host offline
- Remove the ceph-nodeX node from the CRUSH map (see the sketch below):
  ceph osd crush rm ceph-nodeX
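Once the last OSD on the host has been removed, the now-empty host bucket can be dropped from the CRUSH map; a short sketch for a hypothetical ceph-node5:
# Remove the empty host bucket from the CRUSH map
ceph osd crush rm ceph-node5
# Confirm the host no longer appears in the topology
ceph osd tree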
3. Summary of Common PG States
The common PG states are described below.
3.1 peering
Synchronizing: the OSDs that belong to the same PG must bring themselves into agreement about the PG's data, and peering is the state reported while the OSDs carry out that negotiation.
3.2 activating
Peering has completed and the PG is waiting for all of its instances to persist and apply the peering result (Info, Log, etc.).
3.3 clean
Clean: the PG has no objects waiting to be repaired and its number of replicas equals the pool's configured size; in other words, the PG's Acting Set and Up Set are the same group of OSDs and hold identical content.
Acting Set: the PG's current primary OSD plus the replica OSDs that are actively serving it; these OSDs handle the clients' read and write requests for the PG.
Up Set: the ordered set of OSDs that CRUSH currently maps the PG to. When an OSD fails it has to be replaced by an available OSD and the PG's data synchronized to the newcomer. For example, if a PG lives on OSD1, OSD2 and OSD3 and OSD3 fails, CRUSH may select OSD4 as the replacement: OSD1, OSD2, OSD4 then form the Up Set, while the Acting Set keeps serving from the old membership until the data has been copied to OSD4; once that synchronization finishes, the Acting Set matches the Up Set again.
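The Up Set and Acting Set of a specific PG can be checked with ceph pg map. The PG id 4.6 below is taken from the osd dump output earlier; the OSD ids in the output line are illustrative only:
$ ceph pg map 4.6
osdmap e302 pg 4.6 (4.6) -> up [7,12,16] acting [7,12,16]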
3.4 active
Active: the primary OSD and the replica OSDs are working normally and the PG can serve client read and write requests; a healthy PG is in the active+clean state by default.
cephadmin@ceph-deploy:~$ ceph pg stat
97 pgs: 97 active+clean; 43 MiB data, 5.9 GiB used, 20 TiB / 20 TiB avail
3.5 degraded
Degraded: this state appears after an OSD is marked down; every PG mapped to that OSD transitions to degraded.
If the OSD comes back up and completes peering, the PGs that use it return to the clean state.
If the OSD stays down longer than mon_osd_down_out_interval (600 seconds by default), Ceph marks it out of the cluster and starts recovery for the degraded PGs, until every PG that was degraded because of that OSD is clean again.
Data is recovered from the PG's primary OSD; if the primary itself is the one that failed, a new primary is first chosen from the two remaining replica OSDs.
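While recovery is running, the degraded PGs and objects can be inspected with a few read-only commands (a sketch; the degraded filter for dump_stuck assumes a reasonably recent release such as the Pacific cluster used here):
# Overall health, including which PGs are degraded or undersized
ceph health detail
# PGs stuck in the degraded state
ceph pg dump_stuck degraded
# Follow recovery progress live
ceph -w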
3.6 stale
Stale: under normal operation, every primary OSD periodically reports the latest statistics of all PGs it is primary for to the monitors. If for any reason an OSD cannot send those reports, or other OSDs report it as down, every PG that has it as primary is immediately marked stale, meaning the primary's data is no longer known to be current. If it is a replica OSD that goes down, Ceph simply repairs the affected PGs without moving them to the stale state.
3.7 undersized
Undersized: a PG enters this state when its current number of replicas is lower than the pool's configured size. For example, if both replica OSDs of a PG go down, only the primary OSD is left, which is below the minimum requirement of one primary plus one replica (the pool's min_size); the PGs using those OSDs stay undersized until replacement OSDs are added or the failed ones are repaired.
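Whether a PG counts as undersized is determined by the pool's size and min_size; both can be checked per pool, for example for the mypool pool from the earlier examples:
# Configured replica count and the minimum required to keep serving I/O
ceph osd pool get mypool size
ceph osd pool get mypool min_size
# List undersized PGs, if any
ceph pg dump_stuck undersized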
3.8 scrubbing
Scrubbing is Ceph's data-cleaning mechanism for guaranteeing data integrity. Each OSD periodically starts a scrub thread that scans part of its objects and compares them with the other replicas to detect inconsistencies; when an inconsistency is found, an error is raised so that the administrator can resolve it manually. Scrubbing works per PG: for every PG, Ceph builds a digest-like summary of the metadata of all objects in that PG (object size, attributes, and so on), called a scrubmap, and compares the primary's scrubmap with those of the replicas to check whether any object is missing or mismatched. There are two kinds of scans: light scrubs (also called shallow or simple scrubs) and deep scrubs.
A light scrub (daily) compares object sizes and attributes, while a deep scrub (weekly) also reads the object data and verifies consistency with checksums (CRC32). A PG undergoing a deep scrub shows the scrubbing+deep state.
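Scrubs can also be triggered by hand, which is useful when a specific PG looks suspicious; a sketch using the example PG id 4.6:
# Light (shallow) scrub of a single PG
ceph pg scrub 4.6
# Deep scrub: reads the data and verifies checksums
ceph pg deep-scrub 4.6
# Scheduling is controlled by options such as osd_scrub_min_interval,
# osd_scrub_max_interval and osd_deep_scrub_interval, e.g.:
ceph config get osd osd_deep_scrub_interval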
3.9 recovering
Recovering: the cluster is migrating or synchronizing objects and their replicas. This can happen when a new OSD is added to the cluster or an OSD goes down, causing CRUSH to remap PGs to different OSDs; while a PG synchronizes its internal data because of such an OSD change, it is marked recovering.
3.10 backfilling
Backfilling: backfill is a special case of recovery. After peering, if some PG instances in the Up Set cannot be brought up to date by incremental synchronization from the current authoritative log (for example because the OSD hosting them was offline for too long, or because a newly added OSD triggered a wholesale migration of the PG), they are synchronized by copying all of the primary's objects in full; a PG going through this process is in the backfilling state.
3.11 backfill-toofull
A PG that needs to be backfilled resides on an OSD with insufficient free space, so the backfill is suspended; backfill-toofull is the state the PG reports while it waits.
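The threshold that triggers backfill-toofull is the backfillfull_ratio visible in the ceph osd dump output above (0.90 by default). It can be inspected and, as a temporary workaround, raised slightly, although freeing up or adding capacity is the proper fix:
# Current full / backfillfull / nearfull ratios
ceph osd dump | grep ratio
# Temporarily raise the backfill threshold (0.92 is just an example value)
ceph osd set-backfillfull-ratio 0.92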