Ceph Usage --- CephFS

1. CephFS Introduction

Ceph FS, i.e. the Ceph File System, provides a shared, POSIX-compliant file system. Clients mount it over the Ceph protocol and use the Ceph cluster as the backing data store. See http://docs.ceph.org.cn/cephfs/.
CephFS requires the Metadata Server (MDS) service, whose daemon is ceph-mds. The ceph-mds process manages the metadata of files stored on CephFS and coordinates access to the Ceph storage cluster.

     On a Linux system, when you list a directory with ls, metadata such as the file name, creation date, size, inode and storage location is read from the filesystem structures kept on disk. In CephFS, file data is split into many discrete objects and stored in a distributed fashion, so file metadata is not kept together with the data; instead it is stored in a separate metadata pool. Clients cannot access the metadata pool directly: reads and writes are handled by the MDS (metadata server). On reads, the MDS loads the metadata from the metadata pool, caches it in memory (so that later requests from other clients can be answered quickly) and returns it to the client; on writes, the MDS caches the metadata in memory and synchronizes it to the metadata pool.

 The directory structure managed by the CephFS MDS resembles the rooted directory tree of a Linux system, or the tiered cache directories used by nginx.
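
For example, once the pools from the next section exist, the split between data and metadata can be observed directly with rados. This is only an illustration: it assumes the pool names cephfs-metadata and cephfs-data created below, and the object name in the last command is just an example of the CephFS object-naming pattern.

# list the metadata objects maintained by the MDS (journal, directory objects, ...)
rados -p cephfs-metadata ls

# list the objects that file contents are striped into
rados -p cephfs-data ls

# show which PG and OSDs an example data object maps to
ceph osd map cephfs-data 10000000001.00000000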

2. Using CephFS

To use CephFS, the MDS service must be deployed first.

Ubuntu:
root@ceph-mgr1:~# apt-cache madison ceph-mds
root@ceph-mgr1:~# apt install ceph-mds

CentOS:
[root@ceph-mgr1 ~]# yum install ceph-mds

cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mds create ceph-mgr1

Create the CephFS metadata and data pools

Before using CephFS, a file system has to be created in the cluster, with a metadata pool and a data pool assigned to it. Below we create a file system named mycephfs for testing; it uses cephfs-metadata as the metadata pool and cephfs-data as the data pool.

[cephadmin@ceph-deploy ceph-cluster]$ ceph osd pool create cephfs-metadata 32 32
[cephadmin@ceph-deploy ceph-cluster]$ ceph osd pool create cephfs-data 64 64
cephadmin@ceph-deploy:~/ceph-cluster$ ceph df 
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    900 GiB  898 GiB  2.3 GiB   2.3 GiB       0.26
TOTAL  900 GiB  898 GiB  2.3 GiB   2.3 GiB       0.26
 
--- POOLS ---
POOL                   ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics   1    1      0 B        0      0 B      0    284 GiB
myrbd1                  2   64   10 MiB       18   31 MiB      0    284 GiB
.rgw.root               3   32  1.3 KiB        4   48 KiB      0    284 GiB
default.rgw.log         4   32  3.6 KiB      209  408 KiB      0    284 GiB
default.rgw.control     5   32      0 B        8      0 B      0    284 GiB
default.rgw.meta        6    8      0 B        0      0 B      0    284 GiB
cephfs-metadata         7   32   32 KiB       23  192 KiB      0    284 GiB
cephfs-data             8   64  373 KiB        1  1.1 MiB      0    284 GiB
mypool                 12   64  2.6 KiB        3   24 KiB      0    284 GiB
rbd-data1              13   32  171 MiB       74  515 MiB   0.06    284 GiB
cephadmin@ceph-deploy:~/ceph-cluster$ 

Create the CephFS file system and verify it

[cephadmin@ceph-deploy ceph-cluster]$ ceph fs new mycephfs cephfs-metadata cephfs-data

cephadmin@ceph-deploy:~/ceph-cluster$ ceph fs ls
name: mycephfs, metadata pool: cephfs-metadata, data pools: [cephfs-data ]
cephadmin@ceph-deploy:~/ceph-cluster$ ceph fs status mycephfs    # check the status of the specified CephFS

Verify the CephFS service status

cephadmin@ceph-deploy:~/ceph-cluster$ ceph mds stat
mycephfs:1 {0=ceph-mgr1=up:active}
cephadmin@ceph-deploy:~/ceph-cluster$ 

Create a client account

cephadmin@ceph-deploy:~/ceph-cluster$ ceph auth add client.cephfs mon 'allow r' mds 'allow rw' osd 'allow rwx pool=cephfs-data'
added key for client.cephfs
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ ceph auth get client.cephfs
[client.cephfs]
    key = AQAd7UJjV7WdERAAQwzc+s/z49zXl9zZ1zHiMg==
    caps mds = "allow rw"
    caps mon = "allow r"
    caps osd = "allow rwx pool=cephfs-data"
exported keyring for client.cephfs
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ ceph auth get client.cephfs -o ceph.client.cephfs.keyring
exported keyring for client.cephfs
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ ceph auth print-key client.cephfs > cephfs.key
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ cat ceph.client.cephfs.keyring
[client.cephfs]
    key = AQAd7UJjV7WdERAAQwzc+s/z49zXl9zZ1zHiMg==
    caps mds = "allow rw"
    caps mon = "allow r"
    caps osd = "allow rwx pool=cephfs-data"
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ cat cephfs.key 
AQAd7UJjV7WdERAAQwzc+s/z49zXl9zZ1zHiMg==
cephadmin@ceph-deploy:~/ceph-cluster$ 
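
As an alternative to ceph auth add, newer Ceph releases can generate equivalent CephFS client caps in a single step with ceph fs authorize; a sketch, assuming the file system name mycephfs created above:

# grant client.cephfs read/write access to the root of mycephfs
ceph fs authorize mycephfs client.cephfs / rw

# export the generated caps to a keyring, as above
ceph auth get client.cephfs -o ceph.client.cephfs.keyring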

Install the Ceph client

[root@ceph-client-test ~]# yum install epel-release -y
[root@ceph-client-test ~]# yum install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm -y
[root@ceph-client-test ~]# yum install ceph-common -y

Distribute the client authentication files

cephadmin@ceph-deploy:~/ceph-cluster$ scp ceph.client.cephfs.keyring  cephfs.key root@172.16.88.60:/etc/ceph/
cephadmin@ceph-deploy:~/ceph-cluster$ scp ceph.client.cephfs.keyring  cephfs.key root@172.16.88.61:/etc/ceph/

Verify client permissions
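
A minimal sketch of verifying the credentials on the client. It assumes the client can reach the monitors; pass the monitor address with -m if /etc/ceph/ceph.conf is not present on the client.

# confirm the keyring and key files were copied
ls -l /etc/ceph/ceph.client.cephfs.keyring /etc/ceph/cephfs.key

# query cluster status as client.cephfs (mon 'allow r' is enough for this)
ceph --name client.cephfs --keyring /etc/ceph/ceph.client.cephfs.keyring \
     -m 172.16.88.101:6789 -s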

3. Mounting CephFS in kernel space

Clients can mount CephFS in two ways: in kernel space or in user space. A kernel-space mount requires the kernel to provide the ceph module, while a user-space mount requires installing ceph-fuse.

[root@ceph-client-test ~]# mkdir /cephfs-data
[root@ceph-client-test ~]# mount -t ceph 172.16.88.101:6789,172.16.88.102:6789,172.16.88.103:6789:/ /cephfs-data -o name=cephfs,secretfile=/etc/ceph/cephfs.key

Verify by writing data

[root@ceph-client-test ~]# cp /var/log/messages /cephfs-data/
[root@ceph-client-test ~]# cd /cephfs-data/
[root@ceph-client-test cephfs-data]# ll -h
total 115K
-rw------- 1 root root 115K Oct 10 00:06 messages
[root@ceph-client-test cephfs-data]# dd if=/dev/zero of=/cephfs-data/testfile bs=3M count=100
100+0 records in
100+0 records out
314572800 bytes (315 MB) copied, 0.595602 s, 528 MB/s
[root@ceph-client-test cephfs-data]# 
[root@ceph-client-test cephfs-data]# ll -h
total 301M
-rw------- 1 root root 115K Oct 10 00:06 messages
-rw-r--r-- 1 root root 300M Oct 10 00:12 testfile
[root@ceph-client-test cephfs-data]# 

Mounting on the client with the key directly

[root@ceph-client-test ~]# tail /etc/ceph/cephfs.key
AQAd7UJjV7WdERAAQwzc+s/z49zXl9zZ1zHiMg==
[root@ceph-client-test ~]# umount /cephfs-data
[root@ceph-client-test ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
devtmpfs                 3.9G     0  3.9G   0% /dev
tmpfs                    3.9G     0  3.9G   0% /dev/shm
tmpfs                    3.9G  8.6M  3.9G   1% /run
tmpfs                    3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/mapper/centos-root   49G  2.9G   47G   6% /
/dev/vda1                497M  186M  311M  38% /boot
overlay                   49G  2.9G   47G   6% /var/lib/docker/overlay2/0dd0923ceae9c5b1db6dbf822857dcff521923fbd741f01a4bcdf4d9f76144ed/merged
/dev/rbd0                3.0G  529M  2.5G  18% /data
tmpfs                    783M     0  783M   0% /run/user/0
[root@ceph-client-test ~]# mount -t ceph 172.16.88.101:6789,172.16.88.102:6789,172.16.88.103:6789:/ /cephfs-data -o name=cephfs,secret=AQAd7UJjV7WdERAAQwzc+s/z49zXl9zZ1zHiMg==
[root@ceph-client-test ~]# df -h
Filesystem                                                   Size  Used Avail Use% Mounted on
devtmpfs                                                     3.9G     0  3.9G   0% /dev
tmpfs                                                        3.9G     0  3.9G   0% /dev/shm
tmpfs                                                        3.9G  8.6M  3.9G   1% /run
tmpfs                                                        3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/mapper/centos-root                                       49G  2.9G   47G   6% /
/dev/vda1                                                    497M  186M  311M  38% /boot
overlay                                                       49G  2.9G   47G   6% /var/lib/docker/overlay2/0dd0923ceae9c5b1db6dbf822857dcff521923fbd741f01a4bcdf4d9f76144ed/merged
/dev/rbd0                                                    3.0G  529M  2.5G  18% /data
tmpfs                                                        783M     0  783M   0% /run/user/0
172.16.88.101:6789,172.16.88.102:6789,172.16.88.103:6789:/   284G     0  284G   0% /cephfs-data
[root@ceph-client-test ~]# 

Test writing data

[root@ceph-client-test ~]# cp /etc/yum.repos.d/epel.repo /cephfs-data/
[root@ceph-client-test ~]# cd /cephfs-data/
[root@ceph-client-test cephfs-data]# ll -h
total 301M
-rw-r--r-- 1 root root  664 Oct 10 00:13 epel.repo
-rw------- 1 root root 115K Oct 10 00:06 messages
-rw-r--r-- 1 root root 300M Oct 10 00:12 testfile
[root@ceph-client-test cephfs-data]# stat -f /cephfs-data/    # check mount-point status information
  File: "/cephfs-data/"
    ID: f6f39718aa80a4cc Namelen: 255     Type: ceph
Block size: 4194304    Fundamental block size: 4194304
Blocks: Total: 72583      Free: 72508      Available: 72508
Inodes: Total: 76         Free: -1
[root@ceph-client-test cephfs-data]# 

Mount at boot

[root@ceph-client-test ~]# echo "172.16.88.101:6789,172.16.88.102:6789,172.16.88.103:6789:/ /cephfs-data ceph defaults,name=cephfs,secretfile=/etc/ceph/cephfs.key,_netdev 0 0" >> /etc/fstab
[root@ceph-client-test ~]# cat /etc/fstab 

#
# /etc/fstab
# Created by anaconda on Mon Jul 25 22:02:01 2022
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/centos-root /                       xfs     defaults        0 0
UUID=123706e7-5065-4568-91c8-91cee867c393 /boot                   xfs     defaults        0 0
172.16.88.101:6789,172.16.88.102:6789,172.16.88.103:6789:/ /cephfs-data ceph defaults,name=cephfs,secretfile=/etc/ceph/cephfs.key,_netdev 0 0
[root@ceph-client-test ~]# mount -a
[root@ceph-client-test ~]# 
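 
A quick way to confirm the fstab entry actually produced a CephFS mount (a sketch; findmnt is available on CentOS 7):

# show how /cephfs-data is currently mounted
findmnt /cephfs-data
df -h /cephfs-data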

Client kernel module

The client kernel loads the ceph.ko module in order to mount the CephFS file system.
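
The module can be checked or loaded manually; a quick sketch:

# load the ceph kernel module (mount -t ceph normally does this automatically)
modprobe ceph

# confirm it is loaded and inspect the module information
lsmod | grep ceph
modinfo ceph | head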

4. Mounting CephFS in user space

If the kernel version is too old to provide the ceph module, CephFS can be mounted with ceph-fuse instead; the kernel-module mount is nevertheless recommended.

Install ceph-fuse

http://docs.ceph.org.cn/man/8/ceph-fuse/
On a new client (or one restored from a snapshot), install ceph-fuse:

[root@ceph-test-03 ~]# yum install epel-release -y
[root@ceph-test-03 ~]# yum install https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/ceph-release-1-1.el7.noarch.rpm -y
[root@ceph-test-03 ~]# yum install ceph-fuse ceph-common -y

Distribute the authentication and configuration files

cephadmin@ceph-deploy:~/ceph-cluster$ scp ceph.conf ceph.client.cephfs.keyring  cephfs.key root@172.16.88.62:/etc/ceph/
The authenticity of host '172.16.88.62 (172.16.88.62)' can't be established.
ECDSA key fingerprint is SHA256:iRdgjq2HzptlZRnI0NPqeOWjjyRO3/NjOyfMglqhylo.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '172.16.88.62' (ECDSA) to the list of known hosts.
root@172.16.88.62's password: 
ceph.conf                                                                                                                                                 100%  315   185.5KB/s   00:00    
ceph.client.cephfs.keyring                                                                                                                                100%  150   135.7KB/s   00:00    
cephfs.key                                                                                                                                                100%   40    40.5KB/s   00:00    
cephadmin@ceph-deploy:~/ceph-cluster$ 

Mount CephFS with ceph-fuse

[root@ceph-test-03 ~]# mkdir /data
[root@ceph-test-03 ~]# ceph-fuse --name client.cephfs -m 172.16.88.101:6789,172.16.88.102:6789,172.16.88.103:6789 /data
2022-10-10T00:32:04.285+0800 7fd19d419080 -1 init, newargv = 0x560a37427490 newargc=9
ceph-fuse[23723]: starting ceph client
ceph-fuse[23723]: starting fuse
[root@ceph-test-03 ~]# df -Th
Filesystem          Type            Size  Used Avail Use% Mounted on
devtmpfs            devtmpfs        1.9G     0  1.9G   0% /dev
tmpfs               tmpfs           1.9G     0  1.9G   0% /dev/shm
tmpfs               tmpfs           1.9G  8.5M  1.9G   1% /run
tmpfs               tmpfs           1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/mapper/rl-root xfs              49G  2.5G   47G   6% /
/dev/vda1           xfs             495M  178M  318M  36% /boot
tmpfs               tmpfs           374M     0  374M   0% /run/user/0
ceph-fuse           fuse.ceph-fuse  284G  300M  284G   1% /data
[root@ceph-test-03 ~]# 
[root@ceph-test-03 ~]# 
[root@ceph-test-03 ~]# cd /data
[root@ceph-test-03 data]# dd if=/dev/zero of=./ceph-fuse-data bs=5M count=100
100+0 records in
100+0 records out
524288000 bytes (524 MB, 500 MiB) copied, 15.4936 s, 33.8 MB/s
[root@ceph-test-03 data]# 
[root@ceph-test-03 data]# ll -h
total 801M
-rw-r--r-- 1 root root 500M Oct 10 00:34 ceph-fuse-data
-rw-r--r-- 1 root root  664 Oct 10 00:13 epel.repo
-rw------- 1 root root 115K Oct 10 00:06 messages
-rw-r--r-- 1 root root 300M Oct 10 00:12 testfile
[root@ceph-test-03 data]# 

Mount at boot. When the client user is specified, the matching keyring and the ceph.conf configuration file are loaded automatically based on the user name.

[root@ceph-test-03 data]# echo "none /data fuse.ceph ceph.id=cephfs,ceph.conf=/etc/ceph/ceph.conf,_netdev,defaults 0 0" >> /etc/fstab 
[root@ceph-test-03 data]# cat /etc/fstab 

#
# /etc/fstab
# Created by anaconda on Mon Jul 25 13:16:35 2022
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
/dev/mapper/rl-root     /                       xfs     defaults        0 0
UUID=13f7d819-4093-4969-941e-49ccb65c8595 /boot                   xfs     defaults        0 0
none /data fuse.ceph ceph.id=cephfs,ceph.conf=/etc/ceph/ceph.conf,_netdev,defaults 0 0
[root@ceph-test-03 data]# 
[root@ceph-test-03 ~]# umount /data
[root@ceph-test-03 ~]# mount -a
ceph-fuse[23798]: starting ceph client
2022-10-10T00:42:32.829+0800 7fa596dfe080 -1 init, newargv = 0x7fa580001110 newargc=11
ceph-fuse[23798]: starting fuse
[root@ceph-test-03 ~]# 

5. Ceph MDS high availability

     The Ceph MDS (metadata service) is the access entry point for CephFS metadata, so it needs both high performance and redundancy. Multiple MDS daemons are supported, even in a multi-active layout with standbys similar to a Redis Cluster master/replica structure, to provide both high performance and high availability. For example, if 4 MDS daemons are started and max_mds is set to 2, two of them become active and the other two serve as standbys.

A dedicated standby MDS can be assigned to each active MDS, so that if an active daemon fails its standby takes over immediately and keeps serving metadata reads and writes. The common options for configuring standby MDS daemons are listed below (a note on newer releases follows the list):

  • mds_standby_replay: true or false. When true, replay mode is enabled: the standby continuously follows the active MDS's state, so if the active daemon fails the standby can take over quickly. When false, the standby only reads that state at failover time, which causes a short interruption.
  • mds_standby_for_name: make this MDS daemon act as a standby only for the MDS with the given name.
  • mds_standby_for_rank: make this MDS daemon act as a standby only for the given rank (usually the rank number). When several CephFS file systems exist, mds_standby_for_fscid can additionally be used to target a specific file system.
  • mds_standby_for_fscid: specify the CephFS file system ID. Combined with mds_standby_for_rank it targets that rank of the given file system; without mds_standby_for_rank it applies to all ranks of that file system.
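
Note that recent Ceph releases (Nautilus and later) removed the per-daemon mds_standby_for_* options in favor of a per-file-system setting; a hedged sketch of the newer equivalent for enabling standby-replay:

# enable standby-replay for the file system (a standby follows each active MDS journal)
ceph fs set mycephfs allow_standby_replay true

# check the resulting active / standby-replay assignment
ceph fs status mycephfs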

Check the current MDS server status

[root@ceph-deploy ~]# ceph mds stat
mycephfs:1 {0=ceph-mgr1=up:active}
[root@ceph-deploy ~]# 

Add MDS servers
Add ceph-mgr2, ceph-mon2 and ceph-mon3 to the Ceph cluster in the MDS role, so that we end up with a two-active, two-standby MDS structure that is both highly available and performant.

# install the ceph-mds package on each MDS server
[root@ceph-mgr2 ~]# apt install ceph-mds -y
[root@ceph-mon2 ~]# apt install ceph-mds -y
[root@ceph-mon3 ~]# apt install ceph-mds -y

# add the MDS servers
[cephadmin@ceph-deploy ceph-cluster]$ ceph-deploy mds create ceph-mgr2
[cephadmin@ceph-deploy ceph-cluster]$ ceph-deploy mds create ceph-mon2
[cephadmin@ceph-deploy ceph-cluster]$ ceph-deploy mds create ceph-mon3

# verify the current status of the MDS servers:
cephadmin@ceph-deploy:~/ceph-cluster$ ceph mds stat
mycephfs:1 {0=ceph-mgr1=up:active} 3 up:standby
cephadmin@ceph-deploy:~/ceph-cluster$ 

Deployment transcript

cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mds create ceph-mgr2
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.1.0): /usr/local/bin/ceph-deploy mds create ceph-mgr2
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf object at 0x7fdbc0419100>
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  func                          : <function mds at 0x7fdbc0451160>
[ceph_deploy.cli][INFO  ]  mds                           : [('ceph-mgr2', 'ceph-mgr2')]
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts ceph-mgr2:ceph-mgr2
[ceph-mgr2][DEBUG ] connection detected need for sudo
[ceph-mgr2][DEBUG ] connected to host: ceph-mgr2 
[ceph_deploy.mds][INFO  ] Distro info: ubuntu 20.04 focal
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-mgr2
[ceph-mgr2][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-mgr2 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-mgr2/keyring
[ceph-mgr2][INFO  ] Running command: sudo systemctl enable ceph-mds@ceph-mgr2
[ceph-mgr2][WARNIN] Created symlink /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-mgr2.service → /lib/systemd/system/ceph-mds@.service.
[ceph-mgr2][INFO  ] Running command: sudo systemctl start ceph-mds@ceph-mgr2
[ceph-mgr2][INFO  ] Running command: sudo systemctl enable ceph.target
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mds create ceph-mon2
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.1.0): /usr/local/bin/ceph-deploy mds create ceph-mon2
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf object at 0x7f32f13ab100>
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  func                          : <function mds at 0x7f32f13e3160>
[ceph_deploy.cli][INFO  ]  mds                           : [('ceph-mon2', 'ceph-mon2')]
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts ceph-mon2:ceph-mon2
[ceph-mon2][DEBUG ] connection detected need for sudo
[ceph-mon2][DEBUG ] connected to host: ceph-mon2 
[ceph_deploy.mds][INFO  ] Distro info: ubuntu 20.04 focal
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-mon2
[ceph-mon2][WARNIN] mds keyring does not exist yet, creating one
[ceph-mon2][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-mon2 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-mon2/keyring
[ceph-mon2][INFO  ] Running command: sudo systemctl enable ceph-mds@ceph-mon2
[ceph-mon2][WARNIN] Created symlink /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-mon2.service → /lib/systemd/system/ceph-mds@.service.
[ceph-mon2][INFO  ] Running command: sudo systemctl start ceph-mds@ceph-mon2
[ceph-mon2][INFO  ] Running command: sudo systemctl enable ceph.target
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ ceph-deploy mds create ceph-mon3
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadmin/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (2.1.0): /usr/local/bin/ceph-deploy mds create ceph-mon3
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  subcommand                    : create
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf object at 0x7f1a63be10d0>
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  func                          : <function mds at 0x7f1a63c1b160>
[ceph_deploy.cli][INFO  ]  mds                           : [('ceph-mon3', 'ceph-mon3')]
[ceph_deploy.mds][DEBUG ] Deploying mds, cluster ceph hosts ceph-mon3:ceph-mon3
[ceph-mon3][DEBUG ] connection detected need for sudo
[ceph-mon3][DEBUG ] connected to host: ceph-mon3 
[ceph_deploy.mds][INFO  ] Distro info: ubuntu 20.04 focal
[ceph_deploy.mds][DEBUG ] remote host will use systemd
[ceph_deploy.mds][DEBUG ] deploying mds bootstrap to ceph-mon3
[ceph-mon3][WARNIN] mds keyring does not exist yet, creating one
[ceph-mon3][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.ceph-mon3 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-ceph-mon3/keyring
[ceph-mon3][INFO  ] Running command: sudo systemctl enable ceph-mds@ceph-mon3
[ceph-mon3][WARNIN] Created symlink /etc/systemd/system/ceph-mds.target.wants/ceph-mds@ceph-mon3.service → /lib/systemd/system/ceph-mds@.service.
[ceph-mon3][INFO  ] Running command: sudo systemctl start ceph-mds@ceph-mon3
[ceph-mon3][INFO  ] Running command: sudo systemctl enable ceph.target
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ 
cephadmin@ceph-deploy:~/ceph-cluster$ ceph mds stat
mycephfs:1 {0=ceph-mgr1=up:active} 3 up:standby
cephadmin@ceph-deploy:~/ceph-cluster$ 

Verify the current state of the Ceph cluster
One MDS server is currently active and three are in standby.

cephadmin@ceph-deploy:~/ceph-cluster$ ceph fs status

Current file system state

cephadmin@ceph-deploy:~/ceph-cluster$ ceph fs get mycephfs
Filesystem 'mycephfs' (1)
fs_name    mycephfs
epoch    4
flags    12
created    2022-10-07T11:58:44.065065+0800
modified    2022-10-07T11:58:45.078402+0800
tableserver    0
root    0
session_timeout    60
session_autoclose    300
max_file_size    1099511627776
required_client_features    {}
last_failure    0
last_failure_osd_epoch    0
compat    compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds    1
in    0
up    {0=34364}
failed    
damaged    
stopped    
data_pools    [8]
metadata_pool    7
inline_data    disabled
balancer    
standby_count_wanted    1
[mds.ceph-mgr1{0:34364} state up:active seq 120 addr [v2:172.16.88.111:6802/91792027,v1:172.16.88.111:6803/91792027] compat {c=[1],r=[1],i=[7ff]}]
cephadmin@ceph-deploy:~/ceph-cluster$ 

Set the number of active MDS daemons

There are currently four MDS servers, but only one is active and three are standbys. The deployment can be improved by running two active and two standby daemons.

[cephadmin@ceph-deploy ceph-cluster]$ ceph fs set mycephfs max_mds 2    # set the maximum number of simultaneously active MDS daemons to 2
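
The change can be confirmed right away; a quick sketch:

# max_mds should now report 2
ceph fs get mycephfs | grep max_mds

# two ranks (0 and 1) should become active as standbys are promoted
ceph fs status mycephfs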

6. MDS high-availability tuning

At this point ceph-mgr1 and ceph-mon2 are active while ceph-mon3 and ceph-mgr2 are standby. We can now make ceph-mgr2 the standby for ceph-mgr1 and ceph-mon3 the standby for ceph-mon2, so that each active MDS has a fixed standby. Modify the configuration file as follows:

cephadmin@ceph-deploy:~/ceph-cluster$ vim ceph.conf
cephadmin@ceph-deploy:~/ceph-cluster$ cat ceph.conf 
[global]
fsid = 8dc32c41-121c-49df-9554-dfb7deb8c975
public_network = 172.16.88.0/24
cluster_network = 192.168.122.0/24
mon_initial_members = ceph-mon1,ceph-mon2,ceph-mon3
mon_host = 172.16.88.101,172.16.88.102,172.16.88.103
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx

mon clock drift allowed = 2
mon clock drift warn backoff = 30

[mds.ceph-mgr2]
mds_standby_for_name = ceph-mgr1
mds_standby_replay = true

[mds.ceph-mgr1]
mds_standby_for_name = ceph-mgr2
mds_standby_replay = true

[mds.ceph-mon3]
mds_standby_for_name = ceph-mon2
mds_standby_replay = true

[mds.ceph-mon2]
mds_standby_for_name = ceph-mon3
mds_standby_replay = true
cephadmin@ceph-deploy:~/ceph-cluster$ 

Push the configuration file and restart the MDS services

# push the configuration file so that it is in place when each MDS service restarts
ceph-deploy --overwrite-conf config push ceph-mon1
ceph-deploy --overwrite-conf config push ceph-mon2
ceph-deploy --overwrite-conf config push ceph-mon3
ceph-deploy --overwrite-conf config push ceph-mgr1
ceph-deploy --overwrite-conf config push ceph-mgr2

[root@ceph-mon2 ~]# systemctl restart ceph-mds@ceph-mon2.service
[root@ceph-mon3 ~]# systemctl restart ceph-mds@ceph-mon3.service
[root@ceph-mgr2 ~]# systemctl restart ceph-mds@ceph-mgr2.service
[root@ceph-mgr1 ~]# systemctl restart ceph-mds@ceph-mgr1.service

MDS high-availability state of the cluster

Check the active/standby pairings

[cephadmin@ceph-deploy ceph-cluster]$ ceph fs get mycephfs

7. Exporting CephFS as NFS with Ganesha

NFS-Ganesha exports CephFS so that it can be shared and consumed over the NFS protocol.
https://www.server-world.info/en/note?os=Ubuntu_20.04&p=ceph15&f=8

Server-side configuration

Configure this on the mgr node, which must already have the ceph.conf and ceph.client.admin.keyring files in place:

cephadmin@ceph-deploy:~/ceph-cluster$ sudo scp ceph.client.admin.keyring  172.16.88.111:/etc/ceph/
[root@ceph-mgr1 ~]# ll -h /etc/ceph/
total 20K
drwxr-xr-x   2 root root 4.0K Oct 10 21:46 ./
drwxr-xr-x 112 root root 4.0K Oct  7 11:14 ../
-rw-------   1 root root  151 Oct 10 20:28 ceph.client.admin.keyring
-rw-r--r--   1 root root  681 Oct 10 21:46 ceph.conf
-rw-r--r--   1 root root   92 Jul 22 01:38 rbdmap
-rw-------   1 root root    0 Oct  4 22:49 tmpqbicc8w7
[root@ceph-mgr1 ~]# apt install nfs-ganesha-ceph -y
[root@ceph-mgr1 ganesha]# vi ganesha.conf
[root@ceph-mgr1 ganesha]# cat ganesha.conf
# create new
NFS_CORE_PARAM {
   # disable NLM
   Enable_NLM = false;
   # disable RQUOTA (not supported on CephFS)
   Enable_RQUOTA = false;
   # NFS protocol
   Protocols = 4;
}
EXPORT_DEFAULTS {
   # default access mode
   Access_Type = RW;
}
EXPORT {
   # uniq ID
   Export_Id = 1;
   # mount path of CephFS
   Path = "/";
   FSAL {
      name = CEPH;
      # hostname or IP address of this Node
      hostname="172.16.88.111";
   }
   #setting for root Squash
   Squash="No_root_squash";
   # NFSv4 Pseudo path
   Pseudo="/magedu";
   # allowed security options
   SecType = "sys";
}
LOG {
    # default log level
    Default_Log_Level = WARN;
}
[root@ceph-mgr1 ganesha]#
[root@ceph-mgr1 ganesha]# systemctl restart nfs-ganesha
[root@ceph-mgr1 ganesha]# systemctl status nfs-ganesha
● nfs-ganesha.service - NFS-Ganesha file server
     Loaded: loaded (/lib/systemd/system/nfs-ganesha.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2022-10-10 22:09:04 CST; 8s ago
       Docs: http://github.com/nfs-ganesha/nfs-ganesha/wiki
    Process: 23309 ExecStart=/bin/bash -c ${NUMACTL} ${NUMAOPTS} /usr/bin/ganesha.nfsd ${OPTIONS} ${EPOCH} (code=exited, status=0/SUCCE>
   Main PID: 23317 (ganesha.nfsd)
      Tasks: 36 (limit: 4612)
     Memory: 35.2M
     CGroup: /system.slice/nfs-ganesha.service
             └─23317 /usr/bin/ganesha.nfsd -L /var/log/ganesha/ganesha.log -f /etc/ganesha/ganesha.conf -N NIV_EVENT

Oct 10 22:09:04 ceph-mgr1.example.local systemd[1]: Starting NFS-Ganesha file server...
Oct 10 22:09:04 ceph-mgr1.example.local systemd[1]: Started NFS-Ganesha file server.
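
Before moving to the client, it is worth confirming that Ganesha is listening on the NFS port; a quick sketch (ss is assumed to be available on the mgr node):

# NFSv4 listens on TCP 2049 by default
ss -tnlp | grep 2049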

Client mount test

[root@ceph-test-02 ~]# apt-get install nfs-common -y
[root@ceph-test-02 ~]# mount -t nfs4 172.16.88.111:/magedu /ganesha-data
[root@ceph-test-02 ~]# df -h
Filesystem                         Size  Used Avail Use% Mounted on
udev                               1.9G     0  1.9G   0% /dev
tmpfs                              394M  1.2M  393M   1% /run
/dev/mapper/ubuntu--vg-ubuntu--lv   48G  7.2G   39G  16% /
tmpfs                              2.0G     0  2.0G   0% /dev/shm
tmpfs                              5.0M     0  5.0M   0% /run/lock
tmpfs                              2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/loop1                          62M   62M     0 100% /snap/core20/1587
/dev/vda2                          1.5G  205M  1.2G  15% /boot
/dev/loop0                          64M   64M     0 100% /snap/core20/1623
/dev/loop2                          68M   68M     0 100% /snap/lxd/21835
/dev/loop3                          68M   68M     0 100% /snap/lxd/22753
/dev/loop5                          48M   48M     0 100% /snap/snapd/17029
/dev/loop4                          47M   47M     0 100% /snap/snapd/16292
/dev/rbd0                          5.0G   33M  5.0G   1% /data
tmpfs                              394M     0  394M   0% /run/user/0
172.16.88.111:/magedu              284G  800M  283G   1% /ganesha-data
[root@ceph-test-02 ~]# 
[root@ceph-test-02 ~]# cd /ganesha-data/
[root@ceph-test-02 ganesha-data]# touch test-{1..9}.txt
[root@ceph-test-02 ganesha-data]# dd if=/dev/zero of=./ganesha-data bs=5M count=100
[root@ceph-test-02 ganesha-data]# ll -h
total 1.3G
drwxr-xr-x  2 nobody 4294967294 1.3G Oct 10 22:21 ./
drwxr-xr-x 21 root   root       4.0K Oct 10 22:16 ../
-rw-r--r--  1 nobody 4294967294 500M Oct 10 00:34 ceph-fuse-data
-rw-r--r--  1 nobody 4294967294  664 Oct 10 00:13 epel.repo
-rw-r--r--  1 nobody 4294967294 500M Oct 10 22:21 ganesha-data
-rw-------  1 nobody 4294967294 115K Oct 10 00:06 messages
-rw-r--r--  1 nobody 4294967294    0 Oct 10 22:20 test-1.txt
-rw-r--r--  1 nobody 4294967294    0 Oct 10 22:20 test-2.txt
-rw-r--r--  1 nobody 4294967294    0 Oct 10 22:20 test-3.txt
-rw-r--r--  1 nobody 4294967294    0 Oct 10 22:20 test-4.txt
-rw-r--r--  1 nobody 4294967294    0 Oct 10 22:20 test-5.txt
-rw-r--r--  1 nobody 4294967294    0 Oct 10 22:20 test-6.txt
-rw-r--r--  1 nobody 4294967294    0 Oct 10 22:20 test-7.txt
-rw-r--r--  1 nobody 4294967294    0 Oct 10 22:20 test-8.txt
-rw-r--r--  1 nobody 4294967294    0 Oct 10 22:20 test-9.txt
-rw-r--r--  1 nobody 4294967294 300M Oct 10 00:12 testfile
[root@ceph-test-02 ganesha-data]# 

8. Removing CephFS

Reference: https://access.redhat.com/documentation/zh-cn/red_hat_ceph_storage/4/html/file_system_guide/removing-a-ceph-file-system-using-the-command-line-interface_fs

[root@easzlab-deploy ~]# ceph fs set mycephfs down true
mycephfs marked down. 
[root@easzlab-deploy ~]# ceph fs status
mycephfs - 0 clients
========
RANK   STATE       MDS     ACTIVITY   DNS    INOS   DIRS   CAPS  
 0    stopping  ceph-mgr2            2401     13     12      0   
      POOL         TYPE     USED  AVAIL  
cephfs-metadata  metadata  72.6M   273G  
  cephfs-data      data       0    273G  
STANDBY MDS  
 ceph-mon2   
 ceph-mon1   
 ceph-mgr1   
MDS version: ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)
[root@easzlab-deploy ~]# ceph mds fail 0
MDS named '0' does not exist, is not up or you lack the permission to see.
[root@easzlab-deploy ~]# ceph fs status
mycephfs - 0 clients
========
      POOL         TYPE     USED  AVAIL  
cephfs-metadata  metadata  12.6M   273G  
  cephfs-data      data       0    273G  
STANDBY MDS  
 ceph-mon2   
 ceph-mon1   
 ceph-mgr1   
 ceph-mgr2   
MDS version: ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)
[root@easzlab-deploy ~]# ceph fs rm mycephfs --yes-i-really-mean-it
[root@easzlab-deploy ~]# ceph fs ls
No filesystems enabled
[root@easzlab-deploy ~]#
[root@easzlab-deploy ~]# ceph osd pool delete cephfs-metadata cephfs-metadata --yes-i-really-really-mean-it
[root@easzlab-deploy ~]# ceph osd pool delete cephfs-data cephfs-data --yes-i-really-really-mean-it
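
If the pool deletions are rejected, pool deletion is disabled by default and has to be allowed on the monitors first; a sketch:

# temporarily allow pool deletion, then revert afterwards
ceph config set mon mon_allow_pool_delete true
ceph osd pool delete cephfs-metadata cephfs-metadata --yes-i-really-really-mean-it
ceph osd pool delete cephfs-data cephfs-data --yes-i-really-really-mean-it
ceph config set mon mon_allow_pool_delete false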