06 OSD Expansion and Disk Replacement

OSD Scale-Up (Vertical Expansion)

OSD expansion can be divided into scale-up (vertical expansion) and scale-out (horizontal expansion):
Scale-up (vertical): add disks to the existing servers to expand OSD capacity.
Scale-out (horizontal): add new servers to the cluster to expand OSD capacity; the new servers must have the Ceph cluster services deployed from scratch.

Add a new disk on node0

[root@node0 ceph-deploy]# cd /data/ceph-deploy/

# Add the new disk as an OSD
[root@node0 ceph-deploy]# ceph-deploy osd create node0 --data /dev/sdc
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy osd create node0 --data /dev/sdc
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  bluestore                     : None
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f3f155ad998>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  fs_type                       : xfs
[ceph_deploy.cli][INFO  ]  block_wal                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  journal                       : None
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  host                          : node0
[ceph_deploy.cli][INFO  ]  filestore                     : None
[ceph_deploy.cli][INFO  ]  func                          : <function osd at 0x7f3f155dd938>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  zap_disk                      : False
[ceph_deploy.cli][INFO  ]  data                          : /dev/sdc
[ceph_deploy.cli][INFO  ]  block_db                      : None
[ceph_deploy.cli][INFO  ]  dmcrypt                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir               : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  debug                         : False
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/sdc
[node0][DEBUG ] connected to host: node0 
[node0][DEBUG ] detect platform information from remote host
[node0][DEBUG ] detect machine type
[node0][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to node0
[node0][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[node0][DEBUG ] find the location of an executable
[node0][INFO  ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdc
[node0][WARNIN] Running command: /usr/bin/ceph-authtool --gen-print-key
[node0][WARNIN] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new b9d5f757-a8e7-459f-9cef-2dde0aa3a1f9
[node0][WARNIN] Running command: /usr/sbin/vgcreate --force --yes ceph-18f50dfb-d14b-43ec-b40b-40d4fd9bff1f /dev/sdc
[node0][WARNIN]  stdout: Physical volume "/dev/sdc" successfully created.
[node0][WARNIN]  stdout: Volume group "ceph-18f50dfb-d14b-43ec-b40b-40d4fd9bff1f" successfully created
[node0][WARNIN] Running command: /usr/sbin/lvcreate --yes -l 12799 -n osd-block-b9d5f757-a8e7-459f-9cef-2dde0aa3a1f9 ceph-18f50dfb-d14b-43ec-b40b-40d4fd9bff1f
[node0][WARNIN]  stdout: Logical volume "osd-block-b9d5f757-a8e7-459f-9cef-2dde0aa3a1f9" created.
[node0][WARNIN] Running command: /usr/bin/ceph-authtool --gen-print-key
[node0][WARNIN] Running command: /usr/bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-3
[node0][WARNIN] Running command: /usr/bin/chown -h ceph:ceph /dev/ceph-18f50dfb-d14b-43ec-b40b-40d4fd9bff1f/osd-block-b9d5f757-a8e7-459f-9cef-2dde0aa3a1f9
[node0][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /dev/dm-3
[node0][WARNIN] Running command: /usr/bin/ln -s /dev/ceph-18f50dfb-d14b-43ec-b40b-40d4fd9bff1f/osd-block-b9d5f757-a8e7-459f-9cef-2dde0aa3a1f9 /var/lib/ceph/osd/ceph-3/block
[node0][WARNIN] Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-3/activate.monmap
[node0][WARNIN]  stderr: 2022-10-21 18:24:15.253 7f7d62f34700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
[node0][WARNIN] 2022-10-21 18:24:15.253 7f7d62f34700 -1 AuthRegistry(0x7f7d5c0662f8) no keyring found at /etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
[node0][WARNIN]  stderr: got monmap epoch 3
[node0][WARNIN] Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-3/keyring --create-keyring --name osd.3 --add-key AQBOc1Jjx5srLBAA6OZuZM8JGED2o2B3hBa52A==
[node0][WARNIN]  stdout: creating /var/lib/ceph/osd/ceph-3/keyring
[node0][WARNIN] added entity osd.3 auth(key=AQBOc1Jjx5srLBAA6OZuZM8JGED2o2B3hBa52A==)
[node0][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/keyring
[node0][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3/
[node0][WARNIN] Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 3 --monmap /var/lib/ceph/osd/ceph-3/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-3/ --osd-uuid b9d5f757-a8e7-459f-9cef-2dde0aa3a1f9 --setuser ceph --setgroup ceph
[node0][WARNIN]  stderr: 2022-10-21 18:24:15.614 7ff9c213ca80 -1 bluestore(/var/lib/ceph/osd/ceph-3/) _read_fsid unparsable uuid
[node0][WARNIN] --> ceph-volume lvm prepare successful for: /dev/sdc
[node0][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3
[node0][WARNIN] Running command: /usr/bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-18f50dfb-d14b-43ec-b40b-40d4fd9bff1f/osd-block-b9d5f757-a8e7-459f-9cef-2dde0aa3a1f9 --path /var/lib/ceph/osd/ceph-3 --no-mon-config
[node0][WARNIN] Running command: /usr/bin/ln -snf /dev/ceph-18f50dfb-d14b-43ec-b40b-40d4fd9bff1f/osd-block-b9d5f757-a8e7-459f-9cef-2dde0aa3a1f9 /var/lib/ceph/osd/ceph-3/block
[node0][WARNIN] Running command: /usr/bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-3/block
[node0][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /dev/dm-3
[node0][WARNIN] Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-3
[node0][WARNIN] Running command: /usr/bin/systemctl enable ceph-volume@lvm-3-b9d5f757-a8e7-459f-9cef-2dde0aa3a1f9
[node0][WARNIN]  stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-3-b9d5f757-a8e7-459f-9cef-2dde0aa3a1f9.service to /usr/lib/systemd/system/ceph-volume@.service.
[node0][WARNIN] Running command: /usr/bin/systemctl enable --runtime ceph-osd@3
[node0][WARNIN]  stderr: Created symlink from /run/systemd/system/ceph-osd.target.wants/ceph-osd@3.service to /usr/lib/systemd/system/ceph-osd@.service.
[node0][WARNIN] Running command: /usr/bin/systemctl start ceph-osd@3
[node0][WARNIN] --> ceph-volume lvm activate successful for osd ID: 3
[node0][WARNIN] --> ceph-volume lvm create successful for: /dev/sdc
[node0][INFO  ] checking OSD status...
[node0][DEBUG ] find the location of an executable
[node0][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host node0 is now ready for osd use.

Check the newly added disk

[root@node0 ceph-deploy]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.19519 root default                           
-3       0.09760     host node0                         
 0   hdd 0.04880         osd.0      up  1.00000 1.00000 
 3   hdd 0.04880         osd.3      up  1.00000 1.00000 
-5       0.04880     host node1                         
 1   hdd 0.04880         osd.1      up  1.00000 1.00000 
-7       0.04880     host node2                         
 2   hdd 0.04880         osd.2      up  1.00000 1.00000 

Check whether the cluster capacity has increased

[root@node0 ceph-deploy]# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED 
    hdd       200 GiB     195 GiB     731 MiB      4.7 GiB          2.36 
    TOTAL     200 GiB     195 GiB     731 MiB      4.7 GiB          2.36 
 
POOLS:
    POOL                          ID     PGS     STORED      OBJECTS     USED        %USED     MAX AVAIL 
    ceph-demo                      1     128     412 MiB         103     684 MiB      0.36       112 GiB 
    .rgw.root                      2      32     1.5 KiB           4     768 KiB         0        74 GiB 
    default.rgw.control            3      32         0 B           8         0 B         0        62 GiB 
    default.rgw.meta               4      32     2.4 KiB          10     1.7 MiB         0        77 GiB 
    default.rgw.log                5      32         0 B         207         0 B         0        64 GiB 
    default.rgw.buckets.index      6      32         0 B           3         0 B         0        69 GiB 
    default.rgw.buckets.data       7      32      16 EiB           1     192 KiB         0        16 EiB 
    cephfs_metadata                8      16      21 KiB          22     1.8 MiB         0        62 GiB 
    cephfs_data                    9      16     2.0 KiB           3     576 KiB         0        62 GiB 
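
Per-OSD utilization can also be checked to confirm how the new capacity is distributed across OSDs. A minimal sketch, assuming the same cluster (output not captured here):

# Show SIZE / RAW USE / %USE / PGS for every OSD
ceph osd df
# Same data arranged by CRUSH hierarchy (root / host / osd)
ceph osd df tree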

Data rebalancing

Rebalancing

When you add a Ceph OSD Daemon to a Ceph Storage Cluster, the cluster map gets
updated with the new OSD. Referring back to Calculating PG IDs, this changes
the cluster map. Consequently, it changes object placement, because it changes
an input for the calculations. The following diagram depicts the rebalancing
process (albeit rather crudely, since it is substantially less impactful with
large clusters) where some, but not all of the PGs migrate from existing OSDs
(OSD 1, and OSD 2) to the new OSD (OSD 3). Even when rebalancing, CRUSH is
stable. Many of the placement groups remain in their original configuration,
and each OSD gets some added capacity, so there are no load spikes on the
new OSD after rebalancing is complete.

Add new OSD disks and observe data rebalancing

Adding OSDs to a Ceph cluster triggers data rebalancing by default (the migration can be temporarily suppressed with flags, described later in this section). In production it is recommended to add OSDs during periods of low client traffic; a minimal sketch of that approach follows.
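
A sketch using the norebalance and nobackfill flags demonstrated later in this section; the exact timing of when the flags are cleared is an assumption about scheduling, not part of the original run:

# Suppress data migration before adding OSDs in production
ceph osd set norebalance
ceph osd set nobackfill

# Add the new OSDs
ceph-deploy osd create node1 --data /dev/sdc
ceph-deploy osd create node2 --data /dev/sdc

# During a low-traffic window, re-enable migration
ceph osd unset nobackfill
ceph osd unset norebalance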

# Add the new OSDs to the cluster
[root@node0 ceph-deploy]# ceph-deploy osd create node1 --data /dev/sdc
[root@node0 ceph-deploy]# ceph-deploy osd create node2 --data /dev/sdc

# Watch the data rebalancing progress
[root@node0 ceph-deploy]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_WARN
            Reduced data availability: 1 pg peering
            Degraded data redundancy: 540/890 objects degraded (60.674%), 29 pgs degraded
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 5h)
    mgr: node1(active, since 5h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 3s), 6 in (since 3s); 40 remapped pgs
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 323 objects, 241 MiB
    usage:   6.8 GiB used, 293 GiB / 300 GiB avail
    pgs:     22.159% pgs not active
             540/890 objects degraded (60.674%)
             154/890 objects misplaced (17.303%)
             232 active+clean
             73  peering
             17  active+recovery_wait+degraded
             12  active+recovery_wait+undersized+degraded+remapped
             8   active+remapped+backfill_wait
             5   remapped+peering
             3   active+recovery_wait+remapped
             1   active+recovering
             1   active+recovering+undersized+remapped
 
  io:
    recovery: 0 B/s, 0 objects/s
 
  progress:
    Rebalancing after osd.4 marked in
      [======================........]
    Rebalancing after osd.5 marked in
      [..............................]
 
 
[root@node0 ceph-deploy]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_WARN
            Degraded data redundancy: 1794/980 objects degraded (183.061%), 93 pgs degraded
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 5h)
    mgr: node1(active, since 5h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 10s), 6 in (since 10s); 37 remapped pgs
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 361 objects, 305 MiB
    usage:   6.8 GiB used, 293 GiB / 300 GiB avail
    pgs:     1794/980 objects degraded (183.061%)
             167/980 objects misplaced (17.041%)
             243 active+clean
             64  active+recovery_wait+degraded
             29  active+recovery_wait+undersized+degraded+remapped
             9   active+remapped+backfill_wait
             5   active+recovery_wait
             1   active+recovering
             1   active+recovering+undersized+remapped
 
  io:
    client:   185 KiB/s rd, 10 MiB/s wr, 22 op/s rd, 15 op/s wr
    recovery: 9.3 MiB/s, 2 keys/s, 7 objects/s
 
  progress:
    Rebalancing after osd.4 marked in
      [======================........]
    Rebalancing after osd.5 marked in
      [===============...............]
 
[root@node0 ceph-deploy]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 5h)
    mgr: node1(active, since 5h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 3m), 6 in (since 3m)
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 361 objects, 305 MiB
    usage:   6.7 GiB used, 293 GiB / 300 GiB avail
    pgs:     352 active+clean
 
  io:
    client:   4.0 KiB/s rd, 0 B/s wr, 3 op/s rd, 2 op/s wr

Temporarily disabling rebalancing

Check the backfill setting used for rebalancing

Using a single concurrent backfill per OSD for data migration is recommended; the default configuration is usually sufficient.

[root@node0 ceph-deploy]# ceph --admin-daemon /var/run/ceph/ceph-mon.node0.asok config  show | grep max_b
    ......
    "osd_max_backfills": "1",     # osd 使用1个线程进行数据传输, 数据越大同步越快,但是会牺牲性能,数据 rebalancing 重分布的线程数
    ......
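
If faster recovery is acceptable (for example during a maintenance window), the value can be raised temporarily at runtime without restarting the OSDs. A hedged sketch; the value 2 is only an example:

# Raise the backfill concurrency on all OSDs temporarily
ceph tell osd.* injectargs '--osd-max-backfills 2'
# Restore the default once recovery has finished
ceph tell osd.* injectargs '--osd-max-backfills 1'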

Network used during data rebalancing

[root@node0 ceph-deploy]# cat /etc/ceph/ceph.conf 
[global]
fsid = 97702c43-6cc2-4ef8-bdb5-855cfa90a260
public_network = 192.168.100.0/24   # network used by client connections
cluster_network = 192.168.100.0/24  # network used for data rebalancing and internal cluster traffic; 10 GbE is recommended. In production these two networks should be separated to reduce mutual interference
mon_initial_members = node0
mon_host = 192.168.100.130
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
mon_max_pg_per_osd=1000

[client.rgw.node0]
rgw_frontends = "civetweb port=80"
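
As a hedged example of separating the two networks in production, the [global] section might look like this; the 192.168.200.0/24 cluster subnet is made up for illustration:

[global]
public_network  = 192.168.100.0/24   # client-facing traffic
cluster_network = 192.168.200.0/24   # replication / rebalancing traffic, ideally on a dedicated 10 GbE network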

Disable rebalancing with cluster flags

# List the available flags
[root@node0 ceph-deploy]# ceph -h | grep reb
 noin|nobackfill|norebalance|norecover|  
 noin|nobackfill|norebalance|norecover|

# Disable rebalance
[root@node0 ceph-deploy]# ceph osd set norebalance
norebalance is set
[root@node0 ceph-deploy]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_WARN
            norebalance flag(s) set # the cluster reports HEALTH_WARN because the norebalance flag is set
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 15h)
    mgr: node1(active, since 15h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 9h), 6 in (since 9h)
         flags norebalance
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 361 objects, 305 MiB
    usage:   6.7 GiB used, 293 GiB / 300 GiB avail
    pgs:     352 active+clean

# backfill must also be disabled at the same time to fully stop data migration
[root@node0 ceph-deploy]# ceph osd set nobackfill
nobackfill is set
[root@node0 ceph-deploy]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_WARN
            nobackfill,norebalance flag(s) set # both the nobackfill and norebalance flags are set
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 15h)
    mgr: node1(active, since 15h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 9h), 6 in (since 9h)
         flags nobackfill,norebalance
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 361 objects, 305 MiB
    usage:   6.7 GiB used, 293 GiB / 300 GiB avail
    pgs:     352 active+clean

Re-enable rebalancing

[root@node0 ceph-deploy]# ceph osd unset nobackfill
nobackfill is unset
[root@node0 ceph-deploy]# ceph osd unset norebalance
norebalance is unset
[root@node0 ceph-deploy]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 15h)
    mgr: node1(active, since 15h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 9h), 6 in (since 9h)
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 361 objects, 305 MiB
    usage:   6.7 GiB used, 293 GiB / 300 GiB avail
    pgs:     352 active+clean

Replacing a failed OSD disk

When a disk fails, it must be removed from the cluster and replaced with a new one:
Step 1: remove the failed disk
Step 2: add the new disk to the cluster (a condensed sketch of the whole workflow follows)
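
The rest of this section walks through each step with full output. As a condensed reference, a hedged sketch of the whole workflow for this example (osd.5 on node2, device /dev/sdc):

# Step 1: remove the failed OSD (osd.5) from the cluster
ceph osd out osd.5                      # remove it from the OSD map (rebalancing starts)
ceph osd crush rm osd.5                 # remove it from the CRUSH map
ceph osd rm osd.5                       # remove the OSD entry
ceph auth rm osd.5                      # remove its cephx key
umount /var/lib/ceph/osd/ceph-5         # unmount the tmpfs OSD directory on node2
# (clean up the old LVM device mapping as shown later in this section)

# Step 2: add the replacement disk
ceph-deploy disk zap node2 /dev/sdc
ceph-deploy osd create node2 --data /dev/sdc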

How to list the OSD disks

[root@node2 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.29279 root default                           
-3       0.09760     host node0                         
 0   hdd 0.04880         osd.0      up  1.00000 1.00000 
 3   hdd 0.04880         osd.3      up  1.00000 1.00000 
-5       0.09760     host node1                         
 1   hdd 0.04880         osd.1      up  1.00000 1.00000 
 4   hdd 0.04880         osd.4      up  1.00000 1.00000 
-7       0.09760     host node2                         
 2   hdd 0.04880         osd.2      up  1.00000 1.00000 
 5   hdd 0.04880         osd.5      up  1.00000 1.00000 

Check disk latency to verify the disks are healthy

Abnormally high latency usually indicates a disk that is about to fail, even though it has not failed completely yet.

[root@node0 ~]# ceph osd perf
osd commit_latency(ms) apply_latency(ms) 
  5                  0                 0 
  4                  0                 0 
  0                  0                 0 
  1                  0                 0 
  2                  0                 0 
  3                  0                 0 
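
Besides ceph osd perf, the physical disk can be cross-checked directly on the OSD node. A minimal sketch, assuming smartmontools is installed and /dev/sdc is the suspect device:

# Watch commit/apply latency over time
watch -n 5 'ceph osd perf'
# SMART health attributes of the suspect disk
smartctl -a /dev/sdc | grep -iE 'reallocated|pending|overall-health'
# Kernel I/O errors logged for the device
dmesg | grep -i sdc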

Remove the failed disk

Stop the ceph-osd@5 service to simulate a failed disk

# SSH to node2
[root@node0 ceph-deploy]# ssh node2
Last login: Sat Oct 22 00:01:41 2022 from node0

# Check the OSD tree
[root@node2 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.29279 root default                           
-3       0.09760     host node0                         
 0   hdd 0.04880         osd.0      up  1.00000 1.00000 
 3   hdd 0.04880         osd.3      up  1.00000 1.00000 
-5       0.09760     host node1                         
 1   hdd 0.04880         osd.1      up  1.00000 1.00000 
 4   hdd 0.04880         osd.4      up  1.00000 1.00000 
-7       0.09760     host node2                         
 2   hdd 0.04880         osd.2      up  1.00000 1.00000 
 5   hdd 0.04880         osd.5      up  1.00000 1.00000 

# Stop the ceph-osd@5 service to simulate a disk failure
[root@node2 ~]# systemctl stop ceph-osd@5

Check the OSD tree and cluster status

# Check the OSD tree again
[root@node2 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.29279 root default                           
-3       0.09760     host node0                         
 0   hdd 0.04880         osd.0      up  1.00000 1.00000 
 3   hdd 0.04880         osd.3      up  1.00000 1.00000 
-5       0.09760     host node1                         
 1   hdd 0.04880         osd.1      up  1.00000 1.00000 
 4   hdd 0.04880         osd.4      up  1.00000 1.00000 
-7       0.09760     host node2                         
 2   hdd 0.04880         osd.2      up  1.00000 1.00000 
 5   hdd 0.04880         osd.5    down  1.00000 1.00000 

# Check the cluster status
[root@node2 ~]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_WARN
            1 osds down
            Reduced data availability: 22 pgs inactive, 75 pgs peering
            Degraded data redundancy: 80/980 objects degraded (8.163%), 31 pgs degraded
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 17h)
    mgr: node1(active, since 17h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 6 osds: 5 up (since 5s), 6 in (since 11h)
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 361 objects, 305 MiB
    usage:   6.7 GiB used, 293 GiB / 300 GiB avail
    pgs:     22.159% pgs not active
             80/980 objects degraded (8.163%)     # after an OSD fails, rebalancing does not start immediately; by default the cluster waits a while before starting data redistribution
             188 active+clean
             78  peering
             55  active+undersized
             31  active+undersized+degraded
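
The wait before a down OSD is automatically marked out is controlled by mon_osd_down_out_interval (600 seconds by default). A hedged sketch of checking it, following the admin-socket pattern used earlier; this must be run on node0 where that monitor socket lives:

# Run on node0
ceph --admin-daemon /var/run/ceph/ceph-mon.node0.asok config show | grep mon_osd_down_out_interval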

Remove the failed disk from the OSD map so that rebalancing starts immediately

# Check the OSD tree
[root@node2 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.29279 root default                           
-3       0.09760     host node0                         
 0   hdd 0.04880         osd.0      up  1.00000 1.00000 
 3   hdd 0.04880         osd.3      up  1.00000 1.00000 
-5       0.09760     host node1                         
 1   hdd 0.04880         osd.1      up  1.00000 1.00000 
 4   hdd 0.04880         osd.4      up  1.00000 1.00000 
-7       0.09760     host node2                         
 2   hdd 0.04880         osd.2      up  1.00000 1.00000 
 5   hdd 0.04880         osd.5    down  1.00000 1.00000 

# Mark the failed disk out of the OSD map
[root@node2 ~]# ceph osd out osd.5
marked out osd.5. 

# Check the OSD tree again
[root@node2 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.29279 root default                           
-3       0.09760     host node0                         
 0   hdd 0.04880         osd.0      up  1.00000 1.00000 
 3   hdd 0.04880         osd.3      up  1.00000 1.00000 
-5       0.09760     host node1                         
 1   hdd 0.04880         osd.1      up  1.00000 1.00000 
 4   hdd 0.04880         osd.4      up  1.00000 1.00000 
-7       0.09760     host node2                         
 2   hdd 0.04880         osd.2      up  1.00000 1.00000 
 5   hdd 0.04880         osd.5    down        0 1.00000   # the REWEIGHT of osd.5 changed from 1 to 0

# Check the cluster status; rebalancing should be in progress
[root@node2 ~]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_WARN
            Degraded data redundancy: 872/980 objects degraded (88.980%), 53 pgs degraded
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 17h)
    mgr: node1(active, since 17h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 6 osds: 5 up (since 2m), 5 in (since 27s); 24 remapped pgs
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 361 objects, 305 MiB
    usage:   5.6 GiB used, 244 GiB / 250 GiB avail
    pgs:     872/980 objects degraded (88.980%)
             298 active+clean
             28  active+recovery_wait+degraded
             21  active+recovery_wait+undersized+degraded+remapped
             4   active+undersized+degraded+remapped+backfill_wait
             1   active+recovery_wait
 
  io:
    recovery: 35 B/s, 7 objects/s

Delete osd.5 from the CRUSH map

# Dump the CRUSH map
[root@node2 ~]# ceph osd crush dump
{
    "devices": [
        ......
        {
            "id": 4,
            "name": "osd.4",
            "class": "hdd"
        },
        {
            "id": 5,
            "name": "osd.5",
            "class": "hdd"
        }
    ],
    "types": [
        ......
    ],
    ......
}

# Remove osd.5 from the CRUSH map
[root@node2 ~]# ceph osd crush rm osd.5
removed item id 5 name 'osd.5' from crush map

# Dump the CRUSH map again
[root@node2 ~]# ceph osd crush dump
{
    "devices": [
        ......
        {
            "id": 4,
            "name": "osd.4",
            "class": "hdd"
        }   # osd.5 is no longer listed
    ],
    "types": [
        ......
    ],
    ......
}

[root@node2 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.24399 root default                           
-3       0.09760     host node0                         
 0   hdd 0.04880         osd.0      up  1.00000 1.00000 
 3   hdd 0.04880         osd.3      up  1.00000 1.00000 
-5       0.09760     host node1                         
 1   hdd 0.04880         osd.1      up  1.00000 1.00000 
 4   hdd 0.04880         osd.4      up  1.00000 1.00000 
-7       0.04880     host node2                         
 2   hdd 0.04880         osd.2      up  1.00000 1.00000 
 5             0 osd.5            down        0 1.00000 

Delete osd.5 from the OSD map

# Remove osd.5
[root@node2 ~]# ceph osd rm osd.5
removed osd.5

# Check the OSD tree; osd.5 is no longer listed
[root@node2 ~]# ceph osd tree
ID CLASS WEIGHT  TYPE NAME      STATUS REWEIGHT PRI-AFF 
-1       0.24399 root default                           
-3       0.09760     host node0                         
 0   hdd 0.04880         osd.0      up  1.00000 1.00000 
 3   hdd 0.04880         osd.3      up  1.00000 1.00000 
-5       0.09760     host node1                         
 1   hdd 0.04880         osd.1      up  1.00000 1.00000 
 4   hdd 0.04880         osd.4      up  1.00000 1.00000 
-7       0.04880     host node2                         
 2   hdd 0.04880         osd.2      up  1.00000 1.00000

# Check the cluster status
[root@node2 ~]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 17h)
    mgr: node1(active, since 17h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 5 osds: 5 up (since 48m), 5 in (since 46m) # the OSD count has now been updated to 5 disks
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 361 objects, 305 MiB
    usage:   5.8 GiB used, 244 GiB / 250 GiB avail
    pgs:     352 active+clean

Delete the osd.5 authentication entry

# List the auth entries
[root@node2 ~]# ceph auth list
......
osd.4
        key: AQAFxFJjgPC7GxAAVwvROc0Usys/XIVVOls/OQ==
        caps: [mgr] allow profile osd
        caps: [mon] allow profile osd
        caps: [osd] allow *
osd.5
        key: AQAaxFJjz7wmNhAAQOtu8dC/DKtXihbiy2KdZw==
        caps: [mgr] allow profile osd
        caps: [mon] allow profile osd
        caps: [osd] allow *
......

# Remove the osd.5 auth entry
[root@node2 ~]# ceph auth rm osd.5
updated

# List the auth entries again
[root@node2 ~]# ceph auth list
......
osd.4
        key: AQAFxFJjgPC7GxAAVwvROc0Usys/XIVVOls/OQ==
        caps: [mgr] allow profile osd
        caps: [mon] allow profile osd
        caps: [osd] allow *
# the osd.5 entry is gone
......
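
On Luminous and later releases, the out / crush rm / osd rm / auth rm sequence above can also be collapsed into a single purge command. A hedged alternative:

# Removes osd.5 from the CRUSH map, deletes its cephx key and removes the OSD entry in one step
ceph osd purge osd.5 --yes-i-really-mean-it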

Unmount the disk

[root@node2 ~]# umount  /var/lib/ceph/osd/ceph-5

Delete the LVM logical volume

# Find the PV Name for /dev/sdc
[root@node0 ceph-deploy]# pvdisplay 
  ......
   
  --- Physical volume ---
  PV Name               /dev/sdc
  VG Name               ceph-0bf75583-89f5-4ce4-a703-667003938236
  PV Size               50.00 GiB / not usable 4.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              12799
  Free PE               0
  Allocated PE          12799
  PV UUID               6uIHbB-61oC-rbLg-K1HD-gPvc-yl80-R0gdVm

# Use the VG name from the PV to find the matching logical volume
[root@node2 ~]# lvdisplay | grep -B 2 "ceph-0bf75583-89f5-4ce4-a703-667003938236"
   
  --- Logical volume ---
  LV Path                /dev/ceph-0bf75583-89f5-4ce4-a703-667003938236/osd-block-a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7
  LV Name                osd-block-a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7
  VG Name                ceph-0bf75583-89f5-4ce4-a703-667003938236

# List the current device-mapper mappings with dmsetup
[root@node2 ~]# dmsetup ls
ceph--90ab5473--6ad3--48b7--b733--64bc4baff012-osd--block--816b6503--9551--48a2--823e--1c6a52ea213b     (253:2)
ceph--0bf75583--89f5--4ce4--a703--667003938236-osd--block--a69e2242--dbfc--4ed7--9e17--7d05fd2e6fe7     (253:3)
centos-swap     (253:1)
centos-root     (253:0)

[root@node2 ~]# dmsetup ls --tree
ceph--90ab5473--6ad3--48b7--b733--64bc4baff012-osd--block--816b6503--9551--48a2--823e--1c6a52ea213b (253:2)
 └─ (8:16)
ceph--0bf75583--89f5--4ce4--a703--667003938236-osd--block--a69e2242--dbfc--4ed7--9e17--7d05fd2e6fe7 (253:3)
 └─ (8:32)
centos-swap (253:1)
 └─ (8:2)
centos-root (253:0)
 └─ (8:2)

# Remove the device mapping for /dev/sdc
[root@node2 ~]# dmsetup remove --force /dev/mapper/ceph--0bf75583--89f5--4ce4--a703--667003938236-osd--block--a69e2242--dbfc--4ed7--9e17--7d05fd2e6fe7

# List the device mappings again
[root@node2 ~]# dmsetup ls
ceph--90ab5473--6ad3--48b7--b733--64bc4baff012-osd--block--816b6503--9551--48a2--823e--1c6a52ea213b     (253:2)
centos-swap     (253:1)
centos-root     (253:0)

[root@node2 ~]# dmsetup ls --tree
ceph--90ab5473--6ad3--48b7--b733--64bc4baff012-osd--block--816b6503--9551--48a2--823e--1c6a52ea213b (253:2)
 └─ (8:16)
centos-swap (253:1)
 └─ (8:2)
centos-root (253:0)
 └─ (8:2)
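
Instead of removing the device-mapper entry by hand, ceph-volume can destroy the old LVM metadata directly on the node. A hedged alternative, run on node2:

# Zap the device and destroy its volume group / logical volume in one step
ceph-volume lvm zap /dev/sdc --destroy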

Adding the new disk

Zap the /dev/sdc disk on node2

[root@node2 ~]# exit
logout
Connection to node2 closed.
[root@node0 ceph-deploy]# ceph-deploy disk zap node2 /dev/sdc
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy disk zap node2 /dev/sdc
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  debug                         : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : zap
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fd276ee0830>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  host                          : node2
[ceph_deploy.cli][INFO  ]  func                          : <function disk at 0x7fd276f149b0>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  disk                          : ['/dev/sdc']
[ceph_deploy.osd][DEBUG ] zapping /dev/sdc on node2
[node2][DEBUG ] connected to host: node2 
[node2][DEBUG ] detect platform information from remote host
[node2][DEBUG ] detect machine type
[node2][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.9.2009 Core
[node2][DEBUG ] zeroing last few blocks of device
[node2][DEBUG ] find the location of an executable
[node2][INFO  ] Running command: /usr/sbin/ceph-volume lvm zap /dev/sdc
[node2][WARNIN] --> Zapping: /dev/sdc
[node2][WARNIN] --> --destroy was not specified, but zapping a whole device will remove the partition table
[node2][WARNIN] Running command: /bin/dd if=/dev/zero of=/dev/sdc bs=1M count=10 conv=fsync
[node2][WARNIN]  stderr: 10+0 records in
[node2][WARNIN] 10+0 records out
[node2][WARNIN] 10485760 bytes (10 MB) copied
[node2][WARNIN]  stderr: , 0.00989675 s, 1.1 GB/s
[node2][WARNIN] --> Zapping successful for: <Raw Device: /dev/sdc>

Create the OSD on the new disk

[root@node0 ceph-deploy]# ceph-deploy osd create node2 --data /dev/sdc
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.0.1): /usr/bin/ceph-deploy osd create node2 --data /dev/sdc
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  bluestore                     : None
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7ff65df77998>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  fs_type                       : xfs
[ceph_deploy.cli][INFO  ]  block_wal                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  journal                       : None
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  host                          : node2
[ceph_deploy.cli][INFO  ]  filestore                     : None
[ceph_deploy.cli][INFO  ]  func                          : <function osd at 0x7ff65dfa7938>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  zap_disk                      : False
[ceph_deploy.cli][INFO  ]  data                          : /dev/sdc
[ceph_deploy.cli][INFO  ]  block_db                      : None
[ceph_deploy.cli][INFO  ]  dmcrypt                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir               : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  debug                         : False
[ceph_deploy.osd][DEBUG ] Creating OSD on cluster ceph with data device /dev/sdc
[node2][DEBUG ] connected to host: node2 
[node2][DEBUG ] detect platform information from remote host
[node2][DEBUG ] detect machine type
[node2][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.9.2009 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to node2
[node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[node2][DEBUG ] find the location of an executable
[node2][INFO  ] Running command: /usr/sbin/ceph-volume --cluster ceph lvm create --bluestore --data /dev/sdc
[node2][WARNIN] Running command: /bin/ceph-authtool --gen-print-key
[node2][WARNIN] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7
[node2][WARNIN] Running command: /usr/sbin/vgcreate --force --yes ceph-0bf75583-89f5-4ce4-a703-667003938236 /dev/sdc
[node2][WARNIN]  stdout: Physical volume "/dev/sdc" successfully created.
[node2][WARNIN]  stdout: Volume group "ceph-0bf75583-89f5-4ce4-a703-667003938236" successfully created
[node2][WARNIN] Running command: /usr/sbin/lvcreate --yes -l 12799 -n osd-block-a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7 ceph-0bf75583-89f5-4ce4-a703-667003938236
[node2][WARNIN]  stdout: Logical volume "osd-block-a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7" created.
[node2][WARNIN] Running command: /bin/ceph-authtool --gen-print-key
[node2][WARNIN] Running command: /bin/mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-5
[node2][WARNIN] Running command: /bin/chown -h ceph:ceph /dev/ceph-0bf75583-89f5-4ce4-a703-667003938236/osd-block-a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7
[node2][WARNIN] Running command: /bin/chown -R ceph:ceph /dev/dm-3
[node2][WARNIN] Running command: /bin/ln -s /dev/ceph-0bf75583-89f5-4ce4-a703-667003938236/osd-block-a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7 /var/lib/ceph/osd/ceph-5/block
[node2][WARNIN] Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-5/activate.monmap
[node2][WARNIN]  stderr: 2022-10-22 12:46:01.082 7f136d85b700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
[node2][WARNIN] 2022-10-22 12:46:01.082 7f136d85b700 -1 AuthRegistry(0x7f13680662f8) no keyring found at /etc/ceph/ceph.client.bootstrap-osd.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,, disabling cephx
[node2][WARNIN]  stderr: got monmap epoch 3
[node2][WARNIN] Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-5/keyring --create-keyring --name osd.5 --add-key AQCIdVNjfBsaJhAAJ+I6/GPW22v8jesXc2RLfQ==
[node2][WARNIN]  stdout: creating /var/lib/ceph/osd/ceph-5/keyring
[node2][WARNIN] added entity osd.5 auth(key=AQCIdVNjfBsaJhAAJ+I6/GPW22v8jesXc2RLfQ==)
[node2][WARNIN] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-5/keyring
[node2][WARNIN] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-5/
[node2][WARNIN] Running command: /bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 5 --monmap /var/lib/ceph/osd/ceph-5/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-5/ --osd-uuid a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7 --setuser ceph --setgroup ceph
[node2][WARNIN]  stderr: 2022-10-22 12:46:01.398 7f1aac3e7a80 -1 bluestore(/var/lib/ceph/osd/ceph-5/) _read_fsid unparsable uuid
[node2][WARNIN] --> ceph-volume lvm prepare successful for: /dev/sdc
[node2][WARNIN] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-5
[node2][WARNIN] Running command: /bin/ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev /dev/ceph-0bf75583-89f5-4ce4-a703-667003938236/osd-block-a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7 --path /var/lib/ceph/osd/ceph-5 --no-mon-config
[node2][WARNIN] Running command: /bin/ln -snf /dev/ceph-0bf75583-89f5-4ce4-a703-667003938236/osd-block-a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7 /var/lib/ceph/osd/ceph-5/block
[node2][WARNIN] Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-5/block
[node2][WARNIN] Running command: /bin/chown -R ceph:ceph /dev/dm-3
[node2][WARNIN] Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-5
[node2][WARNIN] Running command: /bin/systemctl enable ceph-volume@lvm-5-a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7
[node2][WARNIN]  stderr: Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-volume@lvm-5-a69e2242-dbfc-4ed7-9e17-7d05fd2e6fe7.service to /usr/lib/systemd/system/ceph-volume@.service.
[node2][WARNIN] Running command: /bin/systemctl enable --runtime ceph-osd@5
[node2][WARNIN] Running command: /bin/systemctl start ceph-osd@5
[node2][WARNIN] --> ceph-volume lvm activate successful for osd ID: 5
[node2][WARNIN] --> ceph-volume lvm create successful for: /dev/sdc
[node2][INFO  ] checking OSD status...
[node2][DEBUG ] find the location of an executable
[node2][INFO  ] Running command: /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host node2 is now ready for osd use.


# Check the cluster status
[root@node0 ceph-deploy]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_WARN
            Reduced data availability: 3 pgs inactive, 12 pgs peering
            Degraded data redundancy: 276/958 objects degraded (28.810%), 13 pgs degraded
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 18h)
    mgr: node1(active, since 18h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 4s), 6 in (since 4s); 27 remapped pgs
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 350 objects, 277 MiB
    usage:   6.8 GiB used, 293 GiB / 300 GiB avail
    pgs:     25.852% pgs not active
             276/958 objects degraded (28.810%)
             8/958 objects misplaced (0.835%)
             245 active+clean
             86  peering
             8   active+recovery_wait+degraded
             5   remapped+peering
             5   active+recovery_wait+undersized+degraded+remapped
             1   active+recovery_wait
             1   active+remapped+backfill_wait
             1   active+recovering
 
  io:
    recovery: 0 B/s, 2 keys/s, 14 objects/s
 
  progress:
    Rebalancing after osd.5 marked in
      [..............................]

Data consistency checks

Data Consistency

As part of maintaining data consistency and cleanliness, Ceph OSDs also scrub
objects within placement groups. That is, Ceph OSDs compare object metadata in
one placement group with its replicas in placement groups stored in other
OSDs. Scrubbing (usually performed daily) catches OSD bugs or filesystem
errors, often as a result of hardware issues. OSDs also perform deeper
scrubbing by comparing data in objects bit-for-bit. Deep scrubbing (by default
performed weekly) finds bad blocks on a drive that weren’t apparent in a light
scrub.

See Data Scrubbing for details on configuring scrubbing.

With OSDs storing replicated data for high availability, how is data consistency ensured? Ceph performs periodic checks, of which there are two kinds:
Type 1: scrub (lightweight data consistency check)
Type 2: deep scrub (thorough data consistency check)

Lightweight data consistency check (scrub)

  • Compares object metadata between replicas
    • checks whether file names, attributes, sizes, etc. match
    • runs once a day
    • on a mismatch, a copy of the data is replicated from the primary PG to resynchronize the replicas

Deep data consistency check (deep scrub)

  • Compares the object data itself
    • checks whether the contents are identical, bit for bit
    • runs once a week (the interval is configurable; see the sketch below)
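
The schedule is controlled by OSD options such as osd_scrub_min_interval, osd_scrub_max_interval and osd_deep_scrub_interval. A minimal sketch of inspecting them on a running OSD; run it on the node that hosts osd.0:

ceph daemon osd.0 config show | grep -E 'osd_scrub_(min|max)_interval|osd_deep_scrub_interval'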

Running consistency checks manually

Look up the scrub commands

[root@node2 ~]# ceph -h | grep scrub

...... 
pg deep-scrub <pgid>                                        start deep-scrub on <pgid>
pg scrub <pgid>                                             start scrub on <pgid>

Look up the PG IDs

[root@node2 ~]# ceph pg dump

Run a lightweight scrub manually

[root@node2 ~]# ceph pg scrub 1.7c
instructing pg 1.7c on osd.0 to scrub

Run a deep scrub manually

[root@node2 ~]# ceph pg deep-scrub 1.7c
instructing pg 1.7c on osd.0 to deep-scrub

Check the cluster status

# the test environment holds very little data, so the scrub activity was not captured in the cluster status
[root@node2 ~]# ceph -s
  cluster:
    id:     97702c43-6cc2-4ef8-bdb5-855cfa90a260
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum node0,node1,node2 (age 18h)
    mgr: node1(active, since 18h), standbys: node2, node0
    mds: cephfs-demo:1 {0=node1=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 33m), 6 in (since 33m)
    rgw: 1 daemon active (node0)
 
  task status:
 
  data:
    pools:   9 pools, 352 pgs
    objects: 361 objects, 305 MiB
    usage:   6.8 GiB used, 293 GiB / 300 GiB avail
    pgs:     352 active+clean

Run a deep scrub on every PG

[root@node2 ~]# for i in `ceph pg dump | grep "active+clean" | awk '{print $1}'`; do ceph pg deep-scrub ${i}; done
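
On a production cluster, issuing deep-scrub requests for every PG at once can add noticeable load. A hedged variant that throttles the same loop; the 5-second delay is an arbitrary example:

for i in $(ceph pg dump | grep "active+clean" | awk '{print $1}'); do
    ceph pg deep-scrub ${i}
    sleep 5
done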