Kubernetes Storage — Ceph (Cluster)
1. Building a Ceph Cluster (version: rpm-nautilus)
1.1 Server planning
| master (k8s cluster) | node1 (k8s cluster) | node2 (k8s cluster) |
| --- | --- | --- |
| 192.168.99.201 | 192.168.99.202 | 192.168.99.203 |

| ceph-01 (ceph cluster) | ceph-02 (ceph cluster) | ceph-03 (ceph cluster) | ceph-client (ceph client) |
| --- | --- | --- | --- |
| 192.168.99.204 | 192.168.99.205 | 192.168.99.206 | 192.168.99.207 |
Every Ceph server gets one additional raw disk (/dev/sdb).

Adding the new disk

Here a new 50 GB disk (/dev/sdb) is added to every Ceph node as the OSD disk that provides the storage space. After adding it, rescan the SCSI bus so the host recognizes it:

```bash
# Scan the SCSI bus and add SCSI devices
for host in $(ls /sys/class/scsi_host); do echo "- - -" > /sys/class/scsi_host/$host/scan; done
# Rescan the SCSI bus
for scsi_device in $(ls /sys/class/scsi_device/); do echo 1 > /sys/class/scsi_device/$scsi_device/device/rescan; done
# List block devices; seeing sdb means the disk was added successfully
lsblk
```
1.2 Environment Preparation

Add the yum repo (rpm-nautilus, ceph version 14.2.22) on all Ceph machines (servers and client):

```bash
$ cat > /etc/yum.repos.d/ceph.repo << EOF
[ceph]
name=ceph
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/x86_64/
enabled=1
gpgcheck=0
priority=1
[ceph-noarch]
name=cephnoarch
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch/
enabled=1
gpgcheck=0
priority=1
[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=0
priority=1
EOF
```

On ceph-01, generate an SSH key pair and distribute it to all nodes for passwordless login:

```bash
[root@ceph-01 ~]# ssh-keygen
$ for i in 192.168.99.{201..207}; do echo ">>> $i"; ssh-copy-id $i; done
$ for i in ceph-{01..03}; do echo ">>> $i"; ssh-copy-id $i; done
```

Install ceph-deploy on ceph-01:

```bash
[root@ceph-01 ~]# yum -y install ceph-deploy python-setuptools
[root@ceph-01 ~]# ceph-deploy --version
```

Create the working directory and the initial cluster configuration:

```bash
[root@ceph-01 ~]# mkdir /etc/ceph && cd /etc/ceph
[root@ceph-01 /etc/ceph]# ceph-deploy new ceph-01
[root@ceph-01 /etc/ceph]# ls
ceph.conf  ceph-deploy-ceph.log  ceph.mon.keyring
```

Install ceph and ceph-radosgw on all three Ceph nodes:

```bash
[root@ceph-01 /etc/ceph]# yum -y install ceph ceph-radosgw
[root@ceph-01 /etc/ceph]# ceph -v
[root@ceph-02 ~]# yum -y install ceph ceph-radosgw
[root@ceph-02 ~]# ceph -v
[root@ceph-03 ~]# yum -y install ceph ceph-radosgw
[root@ceph-03 ~]# ceph -v
# The same can be done in one go with:
$ ceph-deploy install ceph-01 ceph-02 ceph-03
```
Add the public network to ceph.conf:

```bash
[root@ceph-01 /etc/ceph]# echo "public network = 192.168.99.0/24" >> /etc/ceph/ceph.conf
```
Initialize the monitor node and push the configuration to all nodes:

```bash
[root@ceph-01 /etc/ceph]# ceph-deploy mon create-initial
[root@ceph-01 /etc/ceph]# ps -ef | grep ceph-mon
[root@ceph-01 /etc/ceph]# ceph health
HEALTH_WARN mon is allowing insecure global_id reclaim
[root@ceph-01 /etc/ceph]# ceph-deploy admin ceph-01 ceph-02 ceph-03
[root@ceph-01 /etc/ceph]# ceph -s
```
Disable the insecure mode reported above (mon is allowing insecure global_id reclaim):

```bash
[root@ceph-01 /etc/ceph]# ceph config set mon auth_allow_insecure_global_id_reclaim false
[root@ceph-01 /etc/ceph]# ceph health
[root@ceph-01 /etc/ceph]# ceph -s
```

To avoid a mon single point of failure, add more mon nodes (an odd number is recommended because of quorum voting):

```bash
[root@ceph-01 /etc/ceph]# ceph-deploy mon add ceph-02
[root@ceph-01 /etc/ceph]# ceph-deploy mon add ceph-03
[root@ceph-01 /etc/ceph]# ceph -s
```
Checking the various mon states:

```bash
# Show mon status
[root@ceph-01 /etc/ceph]# ceph mon stat
# Show the mon election/quorum status
[root@ceph-01 /etc/ceph]# ceph quorum_status
# Show the mon map
[root@ceph-01 /etc/ceph]# ceph mon dump
# Show detailed mon status
[root@ceph-01 /etc/ceph]# ceph daemon mon.ceph-01 mon_status
```
Create mgr (manager) daemons

```bash
# Create one mgr
[root@ceph-01 /etc/ceph]# ceph-deploy mgr create ceph-01
[root@ceph-01 /etc/ceph]# ceph -s
# Adding more mgrs provides HA
[root@ceph-01 /etc/ceph]# ceph-deploy mgr create ceph-02
[root@ceph-01 /etc/ceph]# ceph-deploy mgr create ceph-03
[root@ceph-01 /etc/ceph]# ceph -s
```
Create OSDs (storage disks)

```bash
[root@ceph-01 /etc/ceph]# ceph-deploy disk --help
[root@ceph-01 /etc/ceph]# ceph-deploy osd --help
[root@ceph-01 /etc/ceph]# ceph-deploy disk list ceph-01
[root@ceph-01 /etc/ceph]# ceph-deploy disk list ceph-02
[root@ceph-01 /etc/ceph]# ceph-deploy disk list ceph-03
```

```bash
[root@ceph-01 /etc/ceph]# ceph-deploy disk zap ceph-01 /dev/sdb
[root@ceph-01 /etc/ceph]# ceph-deploy disk zap ceph-02 /dev/sdb
[root@ceph-01 /etc/ceph]# ceph-deploy disk zap ceph-03 /dev/sdb
[root@ceph-01 /etc/ceph]# ceph-deploy osd create --data /dev/sdb ceph-01
[root@ceph-01 /etc/ceph]# ceph-deploy osd create --data /dev/sdb ceph-02
[root@ceph-01 /etc/ceph]# ceph-deploy osd create --data /dev/sdb ceph-03
[root@ceph-01 /etc/ceph]# ceph -s
```
Checking the various OSD states:

```bash
# Show OSD status
[root@ceph-01 /etc/ceph]# ceph osd stat
# Show the OSD map
[root@ceph-01 /etc/ceph]# ceph osd dump
# Show OSD latency
[root@ceph-01 /etc/ceph]# ceph osd perf
# Show per-disk usage in detail
[root@ceph-01 /etc/ceph]# ceph osd df
# Show the OSD tree
[root@ceph-01 /etc/ceph]# ceph osd tree
# Show the maximum number of OSDs
[root@ceph-01 /etc/ceph]# ceph osd getmaxosd
```
Troubleshooting an unhealthy cluster caused by clock skew:

```bash
[root@ceph-01 /etc/ceph]# ceph -s
  cluster:
    id:     74cbea9d-a4a0-4efc-a267-38a595bb2174
    health: HEALTH_WARN
            clock skew detected on mon.ceph-02
```

1. If ntpd is used, restart it on the mon node reporting the skew, then restart ceph-mon.target on the deploy node:

```bash
[root@ceph-02 ~]# systemctl restart ntpd
[root@ceph-01 /etc/ceph]# systemctl restart ceph-mon.target
```

2. If chronyd is used, do the same with chronyd:

```bash
[root@ceph-02 ~]# systemctl restart chronyd
[root@ceph-01 /etc/ceph]# systemctl restart ceph-mon.target
```

3. Alternatively, raise the clock-drift thresholds by adding these two lines to the [global] section:

```bash
cat >> /etc/ceph/ceph.conf << EOF
mon clock drift allowed = 2
mon clock drift warn backoff = 30
EOF
```

Then push the modified config to all nodes (the config was already pushed once, so this time --overwrite-conf is needed to overwrite it) and restart ceph-mon.target on ceph-01:

```bash
[root@ceph-01 /etc/ceph]# ceph-deploy --overwrite-conf admin ceph-01 ceph-02 ceph-03
[root@ceph-01 /etc/ceph]# systemctl restart ceph-mon.target
```
Add the pool-deletion settings to ceph.conf

```bash
# Add these under the [global] section
[root@ceph-01 /etc/ceph]# echo "mon_allow_pool_delete = true" >> /etc/ceph/ceph.conf
[root@ceph-01 /etc/ceph]# echo "mon_max_pg_per_osd = 2000" >> /etc/ceph/ceph.conf
# After adding the settings, push ceph.conf to all mon nodes
[root@ceph-01 /etc/ceph]# ceph-deploy --overwrite-conf admin ceph-01 ceph-02 ceph-03
# Restart the monitor service on every mon node
[root@ceph-01 /etc/ceph]# systemctl restart ceph-mon.target
[root@ceph-02 ~]# systemctl restart ceph-mon.target
[root@ceph-03 ~]# systemctl restart ceph-mon.target
[root@ceph-0n ~]# systemctl status ceph-mon.target
# To delete a pool, type the pool name twice and append --yes-i-really-really-mean-it
[root@ceph-01 /etc/ceph]# ceph osd pool delete test_pool test_pool --yes-i-really-really-mean-it
# or
[root@ceph-01 /etc/ceph]# rados rmpool test_pool test_pool --yes-i-really-really-mean-it
```
1.3 Ceph File Storage (CephFS)

1.3.1 Creating the file store

Step 1: On the ceph-01 deploy node, push the configuration and create the MDS daemons

```bash
# Push the config first (mon_allow_pool_delete = true was added to ceph-01's config earlier,
# so it must be synced before the following commands can succeed)
[root@ceph-01 /etc/ceph]# ceph-deploy --overwrite-conf admin ceph-01 ceph-02 ceph-03
# Create three MDS daemons
[root@ceph-01 /etc/ceph]# ceph-deploy mds create ceph-01 ceph-02 ceph-03
[root@ceph-01 /etc/ceph]# ceph -s
```
Step 2: A Ceph file system needs at least two RADOS pools, one for data and one for metadata

```bash
[root@ceph-01 /etc/ceph]# ceph osd pool create cephfs_pool 128
pool 'cephfs_pool' created
[root@ceph-01 /etc/ceph]# ceph osd pool create cephfs_metadata 64
pool 'cephfs_metadata' created
[root@ceph-01 /etc/ceph]# ceph osd pool ls | grep cephfs
cephfs_pool
cephfs_metadata
```
Step 3: Create the Ceph file system and confirm which node clients will access

```bash
[root@ceph-01 /etc/ceph]# ceph fs new cephfs cephfs_metadata cephfs_pool
new fs with metadata pool 2 and data pool 1
[root@ceph-01 /etc/ceph]# ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_pool ]
[root@ceph-01 /etc/ceph]# ceph mds stat
cephfs:1 {0=ceph-01=up:active} 2 up:standby    # ceph-01 is the active (up) MDS
```
Step 4: On ceph-01 (the active MDS found above), create the key file the client needs for mounting and copy it to the client. Ceph enables cephx authentication by default, so client mounts must authenticate with a user name and key.

```bash
# Extract the key with the ceph-authtool utility
[root@ceph-01 /etc/ceph]# ceph-authtool -p /etc/ceph/ceph.client.admin.keyring > /etc/ceph/admin.key
# Copy it to the client
[root@ceph-01 /etc/ceph]# scp admin.key ceph-client:/root
```
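If you prefer not to keep a key file on the client, the key string itself can be passed at mount time instead of secretfile; a minimal sketch (replace <key> with whatever the first command prints):

```bash
# Print the admin key on ceph-01
[root@ceph-01 /etc/ceph]# ceph auth print-key client.admin
# On the client, pass the key string instead of a key file:
# mount -t ceph 192.168.99.204:6789:/ /mnt -o name=admin,secret=<key>
```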
Step 5: Deploy the client node

```bash
[root@ceph-01 /etc/ceph]# ssh-copy-id ceph-client
[root@ceph-01 /etc/ceph]# ceph-deploy install ceph-client
# (verify that this command succeeds in your environment)
[root@ceph-01 /etc/ceph]# ceph-deploy --overwrite-conf admin ceph-01 ceph-02 ceph-03 ceph-client
```
Step 6: Install ceph-fuse on the client and mount with the key file generated on ceph-01

```bash
[root@ceph-client ~]# yum install ceph-fuse -y
# Mount command: 192.168.99.204 is ceph-01's IP; /root/admin.key is the key file
[root@ceph-client ~]# mount -t ceph 192.168.99.204:6789:/ /mnt -o name=admin,secretfile=/root/admin.key
[root@ceph-client ~]# df -h | tail -1
192.168.99.204:6789:/   47G     0   47G   0% /mnt
```
Step 7: Read/write test on the client

```bash
[root@ceph-client ~]# echo haha > /mnt/123.txt
[root@ceph-client ~]# cat /mnt/123.txt
haha
```
Exercise: mount a second client yourself (ceph-03 can be used to simulate one) and test whether the two clients can read and write the same data concurrently.

```bash
[root@ceph-01 /etc/ceph]# scp /etc/ceph/admin.key ceph-03:/root
[root@ceph-03 ~]# mount -t ceph 192.168.99.204:6789:/ /mnt -o name=admin,secretfile=/root/admin.key
[root@ceph-03 ~]# df -h | tail -1
192.168.99.204:6789:/   47G     0   47G   0% /mnt
```

Testing from both clients shows that concurrent reads and writes work, for example as sketched below.
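A minimal check along these lines (host names as in this setup) confirms both mounts see the same files:

```bash
# Write from the first client
[root@ceph-client ~]# echo from-ceph-client > /mnt/shared.txt
# Read and append from the second client
[root@ceph-03 ~]# cat /mnt/shared.txt
from-ceph-client
[root@ceph-03 ~]# echo from-ceph-03 >> /mnt/shared.txt
# The first client sees the change
[root@ceph-client ~]# cat /mnt/shared.txt
from-ceph-client
from-ceph-03
```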
1.3.2 Removing the file store

Step 1: Delete the data on the clients and umount all mounts

```bash
[root@ceph-client ~]# rm /mnt/* -rf
[root@ceph-client ~]# umount /mnt/
[root@ceph-03 ~]# umount /mnt/
```
Step 2: Stop the MDS on all nodes (the file store can only be deleted once the MDS daemons are stopped)

```bash
[root@ceph-01 /etc/ceph]# systemctl stop ceph-mds.target
[root@ceph-02 ~]# systemctl stop ceph-mds.target
[root@ceph-03 ~]# systemctl stop ceph-mds.target
```
Step 3: Back on the storage side, remove the file system and its pools

```bash
[root@ceph-01 /etc/ceph]# ceph fs rm cephfs --yes-i-really-mean-it
[root@ceph-01 /etc/ceph]# ceph osd pool delete cephfs_metadata cephfs_metadata --yes-i-really-really-mean-it
pool 'cephfs_metadata' removed
[root@ceph-01 /etc/ceph]# ceph osd pool delete cephfs_pool cephfs_pool --yes-i-really-really-mean-it
pool 'cephfs_pool' removed
```
Step 4: Optionally start the MDS service again (needed if the file store will be used again later)

```bash
[root@ceph-01 /etc/ceph]# systemctl start ceph-mds.target
[root@ceph-02 ~]# systemctl start ceph-mds.target
[root@ceph-03 ~]# systemctl start ceph-mds.target
[root@ceph-0n ~]# systemctl status ceph-mds.target
```
1.4 Ceph Block Storage (RBD)

1.4.1 Creating block storage

Step 1: On the ceph-01 deploy node, push the configuration to all nodes (including the client)

```bash
[root@ceph-01 /etc/ceph]# ceph-deploy --overwrite-conf admin ceph-01 ceph-02 ceph-03 ceph-client
```
Step 2: Create the pool and initialize it

```bash
[root@ceph-client ~]# ceph osd pool create rbd_pool 128
pool 'rbd_pool' created
[root@ceph-client ~]# rbd pool init rbd_pool
```
Step 3: Create a volume (here named volume1, 500 MB)

```bash
[root@ceph-client ~]# rbd create volume1 --pool rbd_pool --size 500
[root@ceph-client ~]# rbd ls rbd_pool
volume1
[root@ceph-client ~]# rbd info volume1 -p rbd_pool
```
Step 4: Map the volume to a block device

```bash
# Some RBD image features are not supported by the OS kernel, so the first map attempt fails
[root@ceph-client ~]# rbd map rbd_pool/volume1
# Fix: disable the unsupported features
[root@ceph-client ~]# rbd feature disable rbd_pool/volume1 object-map fast-diff deep-flatten
# Map again
[root@ceph-client ~]# rbd map rbd_pool/volume1
/dev/rbd0
# Show the mapping (use rbd unmap /dev/rbd0 to remove it)
[root@ceph-client ~]# rbd showmapped
id  pool      namespace  image    snap  device
0   rbd_pool             volume1  -     /dev/rbd0
```
Step 5: Use the block device

```bash
[root@ceph-client ~]# lsblk
[root@ceph-client ~]# mkfs.xfs /dev/rbd0
[root@ceph-client ~]# mount /dev/rbd0 /mnt/
[root@ceph-client ~]# df -h | tail -1
/dev/rbd0   498M   26M  473M   6% /mnt
[root@ceph-client ~]# echo yyds > /mnt/456.txt
[root@ceph-client ~]# cat /mnt/456.txt
yyds
```
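Note that a plain rbd map does not survive a reboot. If the device should come back automatically, the rbdmap service can be used; a sketch, assuming the admin keyring sits at its default path:

```bash
# Register the image with rbdmap (format: pool/image id=<user>,keyring=<keyring file>)
cat >> /etc/ceph/rbdmap << EOF
rbd_pool/volume1 id=admin,keyring=/etc/ceph/ceph.client.admin.keyring
EOF
systemctl enable rbdmap.service
# Optional fstab entry so the filesystem is mounted once the device is mapped
echo "/dev/rbd/rbd_pool/volume1 /mnt xfs defaults,noatime,_netdev 0 0" >> /etc/fstab
```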
1.4.2 Growing and shrinking block storage

- Online expansion

```bash
# Grow from 500M to 800M
[root@ceph-client ~]# rbd resize --size 800 rbd_pool/volume1
Resizing image: 100% complete...done.
[root@ceph-client ~]# rbd info rbd_pool/volume1 | grep size
size 800 MiB in 200 objects
# The mounted size has not changed yet
[root@ceph-client ~]# df -h | tail -1
/dev/rbd0   498M   26M  473M   6% /mnt
[root@ceph-client ~]# xfs_growfs -d /mnt/
# Check again: the online expansion succeeded
[root@ceph-client ~]# df -h | tail -1
/dev/rbd0   798M   26M  772M   4% /mnt
```
- Shrinking

Shrinking cannot be done online. After shrinking, the device must be reformatted and remounted, so back up the data first.

```bash
# Shrink back to 500M
[root@ceph-client ~]# rbd resize --size 500 rbd_pool/volume1 --allow-shrink
Resizing image: 100% complete...done.
# Reformat and remount
[root@ceph-client ~]# umount /mnt/
[root@ceph-client ~]# mkfs.xfs -f /dev/rbd0
[root@ceph-client ~]# mount /dev/rbd0 /mnt/
# Verify the shrink
[root@ceph-client ~]# df -h | tail -1
/dev/rbd0   498M   26M  473M   6% /mnt
```
1.4.3 Deleting block storage

```bash
[root@ceph-client ~]# umount /mnt/
[root@ceph-client ~]# rbd unmap /dev/rbd0
[root@ceph-client ~]# ceph osd pool delete rbd_pool rbd_pool --yes-i-really-really-mean-it
pool 'rbd_pool' removed
```
1.5 Ceph Object Storage

1.5.1 Testing the connection to the Ceph object gateway

Step 1: Create an RGW (object storage gateway) on ceph-01

```bash
[root@ceph-01 /etc/ceph]# yum install -y ceph-radosgw   # already installed at the very beginning
[root@ceph-01 /etc/ceph]# ceph-deploy rgw create ceph-01
[root@ceph-01 /etc/ceph]# lsof -i :7480
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
radosgw 5925 ceph   46u  IPv4  28670      0t0  TCP *:7480 (LISTEN)
radosgw 5925 ceph   47u  IPv6  28672      0t0  TCP *:7480 (LISTEN)
```
Step 2: Test the connection to the object gateway from the client
# 创建一个测试用户 [root@ceph-client ~]# radosgw-admin user list [] [root@ceph-client ~]# radosgw-admin user create --uid="testuser" --display-name="First User" { "user_id": "testuser", "display_name": "First User", "email": "", "suspended": 0, "max_buckets": 1000, "subusers": [], "keys": [ { "user": "testuser", "access_key": "4O1OMS47IK0196FE4LP1", "secret_key": "TQgfHVNigeYcqjzA6tKTShczFHzMVIAaAe9bzqAa" } ], "swift_keys": [], "caps": [], "op_mask": "read, write, delete", "default_placement": "", "default_storage_class": "", "placement_tags": [], "bucket_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "user_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "temp_url_keys": [], "type": "rgw", "mfa_ids": [] } [root@ceph-client ~]# radosgw-admin user list [ "testuser" ] # 记下输出的access_key 和 secret_key的值 # 如果没有记下也可以通过以下命令查看 $ radosgw-admin user info --uid=testuser
- Install the Python test tool

```bash
[root@ceph-client ~]# yum install python-boto -y
```

- Write a small Python test program
$ cat > s3_test.py <<EOF import boto import boto.s3.connection access_key = '' secret_key = '' conn = boto.connect_s3( aws_access_key_id = access_key, aws_secret_access_key = secret_key, host = 'ceph-01', port = 7480, is_secure=False, calling_format = boto.s3.connection.OrdinaryCallingFormat(), ) bucket = conn.create_bucket('my-new-bucket') for bucket in conn.get_all_buckets(): print "{name}".format( name = bucket.name, created = bucket.creation_date, ) EOF [root@ceph-client ~]# vim s3_test.py import boto import boto.s3.connection access_key = '4O1OMS47IK0196FE4LP1' # 这两个key和上面创建用户时的信息对应 secret_key = 'TQgfHVNigeYcqjzA6tKTShczFHzMVIAaAe9bzqAa' conn = boto.connect_s3( aws_access_key_id = access_key, aws_secret_access_key = secret_key, host = 'ceph-01', port = 7480, # ceph-client要能解析ceph-01,或者换成ceph-01的ip is_secure=False, calling_format = boto.s3.connection.OrdinaryCallingFormat(), ) bucket = conn.create_bucket('my-new-bucket') for bucket in conn.get_all_buckets(): print "{name}".format( name = bucket.name, created = bucket.creation_date, ) # 测试成功 [root@ceph-client ~]# python s3_test.py my-new-bucket
1.5.2 Connecting to the Ceph object gateway with S3 tools

Amazon S3 is an Internet-facing object storage service; here the s3cmd tool is used against Ceph's object store in the same way.

Step 1: Install s3cmd on the client and write the Ceph connection configuration

```bash
[root@ceph-client ~]# yum install s3cmd -y
# Create the file below; the keys are those of the test user created earlier
[root@ceph-client ~]# vim /root/.s3cfg
[default]
access_key = 4O1OMS47IK0196FE4LP1
secret_key = TQgfHVNigeYcqjzA6tKTShczFHzMVIAaAe9bzqAa
host_base = 192.168.99.204:7480
host_bucket = 192.168.99.204:7480/%(bucket)
cloudfront_host = 192.168.99.204:7480
use_https = False
```
Step 2: Test the commands

```bash
# List buckets; my-new-bucket from the earlier test shows up
[root@ceph-client ~]# s3cmd ls
2021-07-11 19:41  s3://my-new-bucket
# Create another bucket
[root@ceph-client ~]# s3cmd mb s3://test_bucket
Bucket 's3://test_bucket/' created
# Upload a file into the bucket
[root@ceph-client ~]# s3cmd put /etc/fstab s3://test_bucket
upload: '/etc/fstab' -> 's3://test_bucket/fstab' [1 of 1]
 541 of 541   100% in    1s   350.03 B/s  done
# Download it to the current directory
[root@ceph-client ~]# s3cmd get s3://test_bucket/fstab
download: 's3://test_bucket/fstab' -> './fstab' [1 of 1]
 541 of 541   100% in    0s    11.03 KB/s  done
# See the built-in help for more commands
[root@ceph-client ~]# s3cmd --help
```
1.6 Ceph Dashboard

Ceph ships with a native dashboard that provides visual monitoring of the Ceph storage system.

**On the Nautilus release, the ceph-mgr-dashboard package must be installed.**

1. Install it on every mgr node:

```bash
$ yum install ceph-mgr-dashboard -y
$ ceph mgr versions
$ ps -ef | grep ceph-mgr
$ ceph -s
```
2. Check the mgr module help and the module list:

```bash
$ ceph mgr module --help
$ ceph mgr module ls | head -20
```

3. Enable the dashboard module:

```bash
$ ceph mgr module enable dashboard
$ ceph mgr module ls | head -20
```
4. Create a self-signed certificate

By default all HTTP connections to the dashboard are secured with SSL/TLS.

To get the dashboard up and running quickly, generate and install a self-signed certificate with the built-in command:

```bash
[root@ceph-01 /etc/ceph]# ceph dashboard create-self-signed-cert
Self-signed certificate created
```
5. Create an access-control role and a user with the administrator role

Set the user name and password:

```bash
[root@ceph-01 /etc/ceph]# echo admin123 > /root/ceph-password.txt
# Create user admin with the password from ceph-password.txt and the administrator role
[root@ceph-01 /etc/ceph]# ceph dashboard ac-user-create admin -i /root/ceph-password.txt administrator
{"username": "admin", "lastUpdate": 1626079263, "name": null, "roles": ["administrator"], "password": "$2b$12$zlN6AOugMKWqn4l680QEje8.Ny12XT7WHoBN4oeEceLLndjR.xlRi", "email": null}
# Show users
[root@ceph-01 /etc/ceph]# ceph dashboard ac-user-show
["admin"]
# Show roles
[root@ceph-01 /etc/ceph]# ceph dashboard ac-role-show
["administrator", "pool-manager", "cephfs-manager", "cluster-manager", "block-manager", "read-only", "rgw-manager", "ganesha-manager"]
# Command for deleting a user
[root@ceph-01 /etc/ceph]# ceph dashboard ac-user-delete admin
User 'admin' deleted
[root@ceph-01 /etc/ceph]# ceph dashboard ac-user-show
[]
# See the help for more commands
[root@ceph-01 /etc/ceph]# ceph dashboard -h
```
6. Configure the mgr services on the active mgr node

This mainly sets the IP and port the dashboard listens on.

```bash
# The active mgr node is ceph-01
[root@ceph-01 /etc/ceph]# ceph -s | grep mgr
  mgr: ceph-01(active, since 44m), standbys: ceph-02, ceph-03
# Check the default ceph-mgr services
[root@ceph-01 /etc/ceph]# ceph mgr services
{
    "dashboard": "https://ceph-01:8443/"
}
```
At this point the dashboard web UI is reachable at the default service address:

https://ceph-01:8443/ or https://192.168.99.204:8443/

Log in with the account created above (admin / admin123).

If you access it by host name from a VM, the name must resolve: on Windows add `192.168.99.204 ceph-01` to C:\Windows\System32\drivers\etc\hosts; the Linux equivalent is sketched below.
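On a Linux workstation the equivalent name-resolution entry, plus a quick reachability check against the self-signed certificate, would look roughly like this:

```bash
echo "192.168.99.204 ceph-01" >> /etc/hosts
# -k because the dashboard is still using the self-signed certificate
curl -k -I https://ceph-01:8443/
```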
- Customizing the listen IP and port

```bash
# Change the bind address and port
[root@ceph-01 /etc/ceph]# ceph config set mgr mgr/dashboard/server_addr 192.168.99.204
[root@ceph-01 /etc/ceph]# ceph config set mgr mgr/dashboard/server_port 8080
# The service URL does not change until the dashboard module is restarted
[root@ceph-01 /etc/ceph]# ceph mgr services
{
    "dashboard": "https://ceph-01:8443/"
}
# Reload the module so the settings take effect
[root@ceph-01 /etc/ceph]# ceph mgr module disable dashboard
[root@ceph-01 /etc/ceph]# ceph mgr module enable dashboard
[root@ceph-01 /etc/ceph]# ceph mgr services
{
    "dashboard": "https://192.168.99.204:8080/"
}
```
Disabling SSL

This switches the dashboard to plain HTTP; skip this step if you want to keep HTTPS.

```bash
[root@ceph-01 /etc/ceph]# ceph config ls | grep mgr/dashboard/ssl
mgr/dashboard/ssl
[root@ceph-01 /etc/ceph]# ceph config set mgr mgr/dashboard/ssl false
```
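As with the address and port change above, this setting only takes effect once the dashboard module is reloaded; afterwards the service URL should show http instead of https:

```bash
[root@ceph-01 /etc/ceph]# ceph mgr module disable dashboard
[root@ceph-01 /etc/ceph]# ceph mgr module enable dashboard
[root@ceph-01 /etc/ceph]# ceph mgr services
```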
Enabling RGW (Object Gateway management) in the dashboard

After installation the Ceph dashboard does not manage RGW by default; it has to be enabled manually.

Deploy RGW:

```bash
# Install on every node for high availability (already installed at the very beginning)
$ yum install ceph-radosgw -y
$ ceph -s
[root@ceph-01 /etc/ceph]# ceph-deploy rgw create ceph-01 ceph-02 ceph-03
```
Create an RGW system account:
[root@ceph-01 /etc/ceph]# radosgw-admin user list [ "testuser" ] [root@ceph-01 /etc/ceph]# radosgw-admin user create --uid=rgw --display-name=rgw --system { "user_id": "rgw", "display_name": "rgw", "email": "", "suspended": 0, "max_buckets": 1000, "subusers": [], "keys": [ { "user": "rgw", "access_key": "QNJI1APRKX691UJ2R9B3", "secret_key": "u5g1JtnCotNjE1H9MMerLc7QefW8xK8PLiw7ZGUs" } ], "swift_keys": [], "caps": [], "op_mask": "read, write, delete", "system": "true", "default_placement": "", "default_storage_class": "", "placement_tags": [], "bucket_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "user_quota": { "enabled": false, "check_on_raw": false, "max_size": -1, "max_size_kb": 0, "max_objects": -1 }, "temp_url_keys": [], "type": "rgw", "mfa_ids": [] } [root@ceph-01 /etc/ceph]# radosgw-admin user list [ "rgw", "testuser" ] # 记下输出的access_key 和 secret_key的值 # 如果没有记下也可以通过以下命令查看 $ radosgw-admin user info --uid=rgw
Set the access_key and secret_key for the dashboard:

```bash
# Write out the access_key value
[root@ceph-01 /etc/ceph]# echo QNJI1APRKX691UJ2R9B3 > access_key
# Write out the secret_key value
[root@ceph-01 /etc/ceph]# echo u5g1JtnCotNjE1H9MMerLc7QefW8xK8PLiw7ZGUs > secret_key
# Hand the credentials to the dashboard
[root@ceph-01 /etc/ceph]# ceph dashboard set-rgw-api-access-key -i access_key
Option RGW_API_ACCESS_KEY updated
[root@ceph-01 /etc/ceph]# ceph dashboard set-rgw-api-secret-key -i secret_key
Option RGW_API_SECRET_KEY updated
```
Disable SSL verification for the RGW API

(Needed when RGW is served over plain HTTP; skip it if you use HTTPS.)

```bash
# ceph dashboard set-rgw-api-ssl-verify False
```

Refresh the dashboard and the RGW information now shows up.
1.7 Monitoring Ceph with Prometheus + Grafana

Here Prometheus and Grafana are both installed on ceph-01.

1.7.1 Install Grafana

```bash
# 1. Configure the yum repo
[root@ceph-01 /etc/ceph]# cat > /etc/yum.repos.d/grafana.repo << EOF
[grafana]
name=grafana
baseurl=https://packages.grafana.com/oss/rpm
repo_gpgcheck=1
enabled=1
gpgcheck=1
gpgkey=https://packages.grafana.com/gpg.key
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
EOF
# 2. Install grafana with yum
[root@ceph-01 /etc/ceph]# yum install grafana -y
# 3. Start grafana and enable it at boot
[root@ceph-01 /etc/ceph]# systemctl enable grafana-server --now
[root@ceph-01 /etc/ceph]# grafana-server -v
Version 8.0.5 (commit: cbb2aa5001, branch: HEAD)
[root@ceph-01 /etc/ceph]# grafana-cli -v
Grafana CLI version 8.0.5
```
1.7.2 Install Prometheus

```bash
# 1. Download the package (https://prometheus.io/download/)
[root@ceph-01 ~]# wget https://github.com/prometheus/prometheus/releases/download/v2.28.1/prometheus-2.28.1.linux-amd64.tar.gz
# 2. Unpack it
[root@ceph-01 ~]# tar xf prometheus-2.28.1.linux-amd64.tar.gz
# 3. Rename the unpacked directory
[root@ceph-01 ~]# mv prometheus-2.28.1.linux-amd64 /usr/local/prometheus
[root@ceph-01 ~]# cd /usr/local/prometheus
# 4. Check the prometheus version
[root@ceph-01 /usr/local/prometheus]# ./prometheus --version
# 5. Create a systemd service
[root@ceph-01 /usr/local/prometheus]# cat > /etc/systemd/system/prometheus.service << EOF
[Unit]
Description=Prometheus Monitoring System
Documentation=Prometheus Monitoring System

[Service]
ExecStart=/usr/local/prometheus/prometheus \
  --config.file /usr/local/prometheus/prometheus.yml \
  --web.listen-address=:9090

[Install]
WantedBy=multi-user.target
EOF
# 6. Reload systemd
[root@ceph-01 /usr/local/prometheus]# systemctl daemon-reload
# 7. Start the service and enable it at boot
[root@ceph-01 /usr/local/prometheus]# systemctl enable prometheus --now
[root@ceph-01 /usr/local/prometheus]# systemctl status prometheus
```
1.7.3 Enable the ceph mgr prometheus module

```bash
[root@ceph-01 /usr/local/prometheus]# ceph mgr module enable prometheus
[root@ceph-01 /usr/local/prometheus]# ceph mgr module ls | head -20
# The active mgr here is ceph-01
[root@ceph-01 /usr/local/prometheus]# ceph -s | grep mgr
  mgr: ceph-01(active, since 73s), standbys: ceph-02, ceph-03
[root@ceph-01 /usr/local/prometheus]# netstat -nltp | grep mgr    # check the port (9283)
[root@ceph-01 /usr/local/prometheus]# curl 127.0.0.1:9283/metrics # test the metrics endpoint
```
1.7.4 Configure Prometheus

1. Add a scrape job under the scrape_configs: section:

```bash
[root@ceph-01 /usr/local/prometheus]# cat >> /usr/local/prometheus/prometheus.yml << EOF
  - job_name: 'ceph_cluster'
    static_configs:
      - targets: ['192.168.99.204:9283']
EOF
# Note: 192.168.99.204:9283 is the address of the mgr that is currently active
[root@ceph-01 /usr/local/prometheus]# ceph -s | grep mgr
  mgr: ceph-01(active, since 62m), standbys: ceph-02
```
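Before restarting in the next step, the merged configuration can be validated with promtool, which ships in the same tarball:

```bash
[root@ceph-01 /usr/local/prometheus]# ./promtool check config /usr/local/prometheus/prometheus.yml
```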
2. Restart the Prometheus service:

```bash
[root@ceph-01 /usr/local/prometheus]# systemctl restart prometheus
[root@ceph-01 /usr/local/prometheus]# systemctl status prometheus
```

3. Check in Prometheus that the target was added:

Open http://192.168.99.204:9090 in a browser and go to Status -> Targets; the ceph_cluster job should be listed.
4. Configure Grafana

URL: http://192.168.99.204:3000

The default login is admin / admin; on first login you are forced to change the password (here it was changed to admin123).

Ceph dashboard templates: https://grafana.com/grafana/dashboards?search=ceph (for example IDs 917 and 2842).

1. Log in to the Grafana web UI in a browser.
2. Add the Prometheus data source: Configuration -> Data sources (this can also be provisioned from a file, as sketched below).
3. Add dashboards: open https://grafana.com/grafana/dashboards?search=ceph, pick suitable dashboards, note their IDs (e.g. 917, 2842), and import them via Import.

(The original post shows screenshots here: adding the Prometheus data source and importing the Ceph dashboard.)
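The data source from step 2 can also be provisioned from a file instead of being clicked together in the UI; a minimal sketch (the file name is arbitrary, the URL is the Prometheus address from above):

```bash
cat > /etc/grafana/provisioning/datasources/ceph-prometheus.yaml << EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://192.168.99.204:9090
    isDefault: true
EOF
systemctl restart grafana-server
```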
2. Connecting k8s to an External Ceph Cluster

k8s can consume Ceph as a volume in two ways:

- cephfs
- rbd

1) A Ceph cluster supports only one CephFS.
2) cephfs supports all three PV access modes: ReadWriteOnce, ReadOnlyMany, and ReadWriteMany.
3) rbd supports ReadWriteOnce and ReadOnlyMany.

Note for both modes: access modes are only a capability description and are not enforced. If a PV is used in a way the PVC did not declare, the storage provider is responsible for any runtime errors. For example, if a PVC declares ReadOnlyMany, a pod that mounts it can still write; to make it genuinely read-only, specify readOnly: true when claiming/mounting it, as sketched below.
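For illustration, enforcing read-only access happens on the Pod/PVC side rather than through the access mode; a sketch with hypothetical names:

```bash
cat > readonly-pod.yaml << EOF
apiVersion: v1
kind: Pod
metadata:
  name: readonly-demo              # hypothetical name
spec:
  containers:
  - name: app
    image: busybox:1.24
    command: ["sleep", "3600"]
    volumeMounts:
    - name: data
      mountPath: /data
      readOnly: true               # read-only inside the container
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: some-existing-pvc # hypothetical PVC name
      readOnly: true               # mount the PV read-only
EOF
```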
2.1 Static PV (rbd)

2.1.1 Install the dependencies (all k8s nodes)

Note: install the ceph-common package from the same repository, and in the same version, as the Ceph cluster itself.
$ cat > /etc/yum.repos.d/ceph.repo << EOF [ceph] name=ceph baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/x86_64/ enabled=1 gpgcheck=0 priority=1 [ceph-noarch] name=cephnoarch baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch/ enabled=1 gpgcheck=0 priority=1 [ceph-source] name=Ceph source packages baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS enabled=1 gpgcheck=0 priority=1 EOF
$ yum install ceph-common -y
2.1.2 Sync the Ceph configuration files to the k8s nodes
[root@ceph-01 ~]# ssh-copy-id k8s-master [root@ceph-01 ~]# ssh-copy-id k8s-node01 [root@ceph-01 ~]# ssh-copy-id k8s-node02 [root@ceph-01 /etc/ceph]# ceph-deploy --overwrite-conf admin ceph-01 ceph-02 ceph-03 ceph-client k8s-master k8s-node01 k8s-node02
2.1.3 Create a pool and enable the rbd application on it

```bash
$ ceph osd pool create kube 128 128   # create the kube pool for k8s to use
```
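On Nautilus it is also worth tagging the new pool with the rbd application and initializing it, otherwise ceph health may warn that no application is enabled on the pool; a short sketch:

```bash
$ ceph osd pool application enable kube rbd
$ rbd pool init kube
```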
2.1.4 Create a Ceph user for k8s

```bash
# List the authentication users and their keys
$ ceph auth list
# (Example only of deleting an authentication user -- do NOT run it here!)
$ ceph auth del osd.0
# Create the Ceph user that k8s will use
$ ceph auth get-or-create client.kube mon 'allow r' osd 'allow class-read object_prefix rbd_children,allow rwx pool=kube'
```
2.1.5 Create the Secret resources

```bash
$ ceph auth get-key client.admin | base64
$ ceph auth get-key client.kube | base64
```

The keys are base64-encoded because Kubernetes Secrets store credentials base64-encoded rather than in plain text (base64 is an encoding, not encryption).
Create a working directory and the manifests:

```bash
$ mkdir jtpv && cd jtpv
$ cat > ceph-admin-secret.yaml << EOF
apiVersion: v1
kind: Secret
metadata:
  name: ceph-admin-secret
  namespace: default
data:
  key:    # (paste the base64 admin key here)
type: kubernetes.io/rbd
EOF
$ cat > ceph-kube-secret.yaml << EOF
apiVersion: v1
kind: Secret
metadata:
  name: ceph-kube-secret
  namespace: default
data:
  key:    # (paste the base64 kube key here)
type: kubernetes.io/rbd
EOF
$ cat > pv.yaml << EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ceph-pv-test
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  rbd:
    monitors:
      - 192.168.99.204:6789
      - 192.168.99.205:6789
      - 192.168.99.206:6789
    pool: kube
    image: ceph-image
    user: admin
    secretRef:
      name: ceph-admin-secret
    fsType: ext4
    readOnly: false
  persistentVolumeReclaimPolicy: Retain
EOF
# Create the image (i.e. carve out a 5G piece of space for the PV to use)
$ rbd create -p kube -s 5G ceph-image
$ rbd ls -p kube
ceph-image
$ rbd info ceph-image -p kube
# Disable the features the kernel does not support
$ rbd feature disable kube/ceph-image object-map fast-diff deep-flatten
$ cat > pvc.yaml << EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-test-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
EOF
$ cat > pod.yaml << EOF
apiVersion: v1
kind: Pod
metadata:
  name: ceph-pod
spec:
  containers:
  - name: test-pod
    image: busybox:1.24
    command: ["sleep", "60000"]
    volumeMounts:
    - name: pvc
      mountPath: /usr/share/busybox
      readOnly: false
  volumes:
  - name: pvc
    persistentVolumeClaim:
      claimName: ceph-test-claim
EOF
```
Verify:

```bash
$ kubectl apply -f ceph-admin-secret.yaml
$ kubectl apply -f ceph-kube-secret.yaml
$ kubectl apply -f pv.yaml
$ kubectl apply -f pvc.yaml
$ kubectl apply -f pod.yaml
$ kubectl exec -it ceph-pod -- df -h | grep /dev/rbd0
/dev/rbd0                 4.8G     20.0M      4.8G   0% /usr/share/busybox
```
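A quick write/read check inside the pod confirms the RBD-backed volume works (it should print back the test string):

```bash
$ kubectl exec -it ceph-pod -- sh -c 'echo rbd-test > /usr/share/busybox/test.txt'
$ kubectl exec -it ceph-pod -- cat /usr/share/busybox/test.txt
rbd-test
```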
2.2 Dynamic PV (cephfs)

2.2.1 Ceph-side steps

1. On the ceph-01 deploy node, push the configuration and create at least one MDS service

Using CephFS requires at least one node running an MDS.

```bash
[root@ceph-01 /etc/ceph]# ceph-deploy --overwrite-conf admin ceph-01 ceph-02 ceph-03
# Create three MDS daemons
[root@ceph-01 /etc/ceph]# ceph-deploy mds create ceph-01 ceph-02 ceph-03
[root@ceph-01 /etc/ceph]# ceph -s
```
2. Create the pools and the file system

1. Create the CephFS pools: fs_metadata and fs_data
2. Create the CephFS file system, named cephfs

A Ceph file system needs at least two RADOS pools, one for data and one for metadata.

```bash
$ ceph osd pool create fs_data 128 128
$ ceph osd pool create fs_metadata 128 128
$ ceph fs new cephfs fs_metadata fs_data
$ ceph fs ls
```

Get the cluster info and the admin user's key:

```bash
$ ceph mon dump
$ ceph auth get client.admin
```

Note: the key is not base64-encoded here.
2.2.2 k8s-side steps

1. Install the dependencies on all k8s nodes
$ cat > /etc/yum.repos.d/ceph.repo << EOF [ceph] name=ceph baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/x86_64/ enabled=1 gpgcheck=0 priority=1 [ceph-noarch] name=cephnoarch baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch/ enabled=1 gpgcheck=0 priority=1 [ceph-source] name=Ceph source packages baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS enabled=1 gpgcheck=0 priority=1 EOF
$ yum install ceph-common -y
2. Copy the Ceph cluster's /etc/ceph/{ceph.conf,ceph.client.admin.keyring} to all k8s nodes
[root@ceph-01 ~]# ssh-copy-id k8s-master [root@ceph-01 ~]# ssh-copy-id k8s-node01 [root@ceph-01 ~]# ssh-copy-id k8s-node02 [root@ceph-01 /etc/ceph]# ceph-deploy --overwrite-conf admin ceph-01 ceph-02 ceph-03 ceph-client k8s-master k8s-node01 k8s-node02
2.2.3 Deploy the cephfs CSI driver

```bash
$ mkdir -p /root/my-ceph-csi/deploy/cephfs && cd /root/my-ceph-csi/deploy/cephfs   # create the working directory
```

Create csi-config-map.yaml with the connection information for the Ceph cluster:

```bash
# The clusterID is the Ceph cluster's fsid (shown by: ceph mon dump)
$ cat > csi-config-map.yaml << EOF
---
apiVersion: v1
kind: ConfigMap
data:
  config.json: |-
    [
      {
        "clusterID": "96f54e84-dbfc-4650-8896-8f3b5f524bbf",
        "monitors": [
          "192.168.99.204:6789,192.168.99.205:6789,192.168.99.206:6789"
        ]
      }
    ]
metadata:
  name: ceph-csi-config
EOF
```
Do not leave comments inside the config.json JSON body (as some online examples do), otherwise it will fail to parse.

If you deploy into some other namespace, change the namespace in csi-provisioner-rbac.yaml and csi-nodeplugin-rbac.yaml accordingly; here the default namespace from the upstream YAML files is used.

Deploy the cephfs-related CSI components:
$ cat > csi-provisioner-rbac.yaml << EOF --- apiVersion: v1 kind: ServiceAccount metadata: name: cephfs-csi-provisioner --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: name: cephfs-external-provisioner-runner rules: - apiGroups: [""] resources: ["nodes"] verbs: ["get", "list", "watch"] - apiGroups: [""] resources: ["secrets"] verbs: ["get", "list"] - apiGroups: [""] resources: ["events"] verbs: ["list", "watch", "create", "update", "patch"] - apiGroups: [""] resources: ["persistentvolumes"] verbs: ["get", "list", "watch", "create", "delete", "patch"] - apiGroups: [""] resources: ["persistentvolumeclaims"] verbs: ["get", "list", "watch", "update"] - apiGroups: ["storage.k8s.io"] resources: ["storageclasses"] verbs: ["get", "list", "watch"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshots"] verbs: ["get", "list"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotcontents"] verbs: ["create", "get", "list", "watch", "update", "delete"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotclasses"] verbs: ["get", "list", "watch"] - apiGroups: ["storage.k8s.io"] resources: ["volumeattachments"] verbs: ["get", "list", "watch", "update", "patch"] - apiGroups: ["storage.k8s.io"] resources: ["volumeattachments/status"] verbs: ["patch"] - apiGroups: [""] resources: ["persistentvolumeclaims/status"] verbs: ["update", "patch"] - apiGroups: ["storage.k8s.io"] resources: ["csinodes"] verbs: ["get", "list", "watch"] - apiGroups: ["snapshot.storage.k8s.io"] resources: ["volumesnapshotcontents/status"] verbs: ["update"] --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: cephfs-csi-provisioner-role subjects: - kind: ServiceAccount name: cephfs-csi-provisioner namespace: default roleRef: kind: ClusterRole name: cephfs-external-provisioner-runner apiGroup: rbac.authorization.k8s.io --- kind: Role apiVersion: rbac.authorization.k8s.io/v1 metadata: # replace with non-default namespace name namespace: default name: cephfs-external-provisioner-cfg rules: # remove this once we stop supporting v1.0.0 - apiGroups: [""] resources: ["configmaps"] verbs: ["get", "list", "create", "delete"] - apiGroups: ["coordination.k8s.io"] resources: ["leases"] verbs: ["get", "watch", "list", "delete", "update", "create"] --- kind: RoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: cephfs-csi-provisioner-role-cfg # replace with non-default namespace name namespace: default subjects: - kind: ServiceAccount name: cephfs-csi-provisioner # replace with non-default namespace name namespace: default roleRef: kind: Role name: cephfs-external-provisioner-cfg apiGroup: rbac.authorization.k8s.io EOF $ cat > csi-nodeplugin-rbac.yaml << EOF --- apiVersion: v1 kind: ServiceAccount metadata: name: cephfs-csi-nodeplugin --- kind: ClusterRole apiVersion: rbac.authorization.k8s.io/v1 metadata: name: cephfs-csi-nodeplugin rules: - apiGroups: [""] resources: ["nodes"] verbs: ["get"] --- kind: ClusterRoleBinding apiVersion: rbac.authorization.k8s.io/v1 metadata: name: cephfs-csi-nodeplugin subjects: - kind: ServiceAccount name: cephfs-csi-nodeplugin namespace: default roleRef: kind: ClusterRole name: cephfs-csi-nodeplugin apiGroup: rbac.authorization.k8s.io EOF $ vim csi-cephfsplugin-provisioner.yaml --- kind: Service apiVersion: v1 metadata: name: csi-cephfsplugin-provisioner labels: app: csi-metrics spec: selector: app: csi-cephfsplugin-provisioner ports: - name: http-metrics port: 8080 protocol: TCP targetPort: 8681 --- 
kind: Deployment apiVersion: apps/v1 metadata: name: csi-cephfsplugin-provisioner spec: selector: matchLabels: app: csi-cephfsplugin-provisioner replicas: 3 template: metadata: labels: app: csi-cephfsplugin-provisioner spec: affinity: podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: matchExpressions: - key: app operator: In values: - csi-cephfsplugin-provisioner topologyKey: "kubernetes.io/hostname" serviceAccountName: cephfs-csi-provisioner priorityClassName: system-cluster-critical containers: - name: csi-provisioner image: k8s.gcr.io/sig-storage/csi-provisioner:v2.2.2 args: - "--csi-address=$(ADDRESS)" - "--v=5" - "--timeout=150s" - "--leader-election=true" - "--retry-interval-start=500ms" - "--feature-gates=Topology=false" - "--extra-create-metadata=true" env: - name: ADDRESS value: unix:///csi/csi-provisioner.sock imagePullPolicy: "IfNotPresent" volumeMounts: - name: socket-dir mountPath: /csi - name: csi-resizer image: k8s.gcr.io/sig-storage/csi-resizer:v1.2.0 args: - "--csi-address=$(ADDRESS)" - "--v=5" - "--timeout=150s" - "--leader-election" - "--retry-interval-start=500ms" - "--handle-volume-inuse-error=false" env: - name: ADDRESS value: unix:///csi/csi-provisioner.sock imagePullPolicy: "IfNotPresent" volumeMounts: - name: socket-dir mountPath: /csi - name: csi-snapshotter image: k8s.gcr.io/sig-storage/csi-snapshotter:v4.1.1 args: - "--csi-address=$(ADDRESS)" - "--v=5" - "--timeout=150s" - "--leader-election=true" env: - name: ADDRESS value: unix:///csi/csi-provisioner.sock imagePullPolicy: "IfNotPresent" securityContext: privileged: true volumeMounts: - name: socket-dir mountPath: /csi - name: csi-cephfsplugin-attacher image: k8s.gcr.io/sig-storage/csi-attacher:v3.2.1 args: - "--v=5" - "--csi-address=$(ADDRESS)" - "--leader-election=true" - "--retry-interval-start=500ms" env: - name: ADDRESS value: /csi/csi-provisioner.sock imagePullPolicy: "IfNotPresent" volumeMounts: - name: socket-dir mountPath: /csi - name: csi-cephfsplugin securityContext: privileged: true capabilities: add: ["SYS_ADMIN"] # for stable functionality replace canary with latest release version image: quay.io/cephcsi/cephcsi:canary args: - "--nodeid=$(NODE_ID)" - "--type=cephfs" - "--controllerserver=true" - "--endpoint=$(CSI_ENDPOINT)" - "--v=5" - "--drivername=cephfs.csi.ceph.com" - "--pidlimit=-1" - "--enableprofiling=false" env: - name: POD_IP valueFrom: fieldRef: fieldPath: status.podIP - name: NODE_ID valueFrom: fieldRef: fieldPath: spec.nodeName - name: CSI_ENDPOINT value: unix:///csi/csi-provisioner.sock imagePullPolicy: "IfNotPresent" volumeMounts: - name: socket-dir mountPath: /csi - name: host-sys mountPath: /sys - name: lib-modules mountPath: /lib/modules readOnly: true - name: host-dev mountPath: /dev - name: ceph-csi-config mountPath: /etc/ceph-csi-config/ - name: keys-tmp-dir mountPath: /tmp/csi/keys - name: liveness-prometheus image: quay.io/cephcsi/cephcsi:canary args: - "--type=liveness" - "--endpoint=$(CSI_ENDPOINT)" - "--metricsport=8681" - "--metricspath=/metrics" - "--polltime=60s" - "--timeout=3s" env: - name: CSI_ENDPOINT value: unix:///csi/csi-provisioner.sock - name: POD_IP valueFrom: fieldRef: fieldPath: status.podIP volumeMounts: - name: socket-dir mountPath: /csi imagePullPolicy: "IfNotPresent" volumes: - name: socket-dir emptyDir: { medium: "Memory" } - name: host-sys hostPath: path: /sys - name: lib-modules hostPath: path: /lib/modules - name: host-dev hostPath: path: /dev - name: ceph-csi-config configMap: name: ceph-csi-config - name: 
keys-tmp-dir emptyDir: { medium: "Memory" } $ vim csi-cephfsplugin.yaml --- kind: DaemonSet apiVersion: apps/v1 metadata: name: csi-cephfsplugin spec: selector: matchLabels: app: csi-cephfsplugin template: metadata: labels: app: csi-cephfsplugin spec: serviceAccountName: cephfs-csi-nodeplugin priorityClassName: system-node-critical hostNetwork: true # to use e.g. Rook orchestrated cluster, and mons' FQDN is # resolved through k8s service, set dns policy to cluster first dnsPolicy: ClusterFirstWithHostNet containers: - name: driver-registrar # This is necessary only for systems with SELinux, where # non-privileged sidecar containers cannot access unix domain socket # created by privileged CSI driver container. securityContext: privileged: true image: k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.2.0 args: - "--v=5" - "--csi-address=/csi/csi.sock" - "--kubelet-registration-path=/var/lib/kubelet/plugins/cephfs.csi.ceph.com/csi.sock" env: - name: KUBE_NODE_NAME valueFrom: fieldRef: fieldPath: spec.nodeName volumeMounts: - name: socket-dir mountPath: /csi - name: registration-dir mountPath: /registration - name: csi-cephfsplugin securityContext: privileged: true capabilities: add: ["SYS_ADMIN"] allowPrivilegeEscalation: true # for stable functionality replace canary with latest release version image: quay.io/cephcsi/cephcsi:canary args: - "--nodeid=$(NODE_ID)" - "--type=cephfs" - "--nodeserver=true" - "--endpoint=$(CSI_ENDPOINT)" - "--v=5" - "--drivername=cephfs.csi.ceph.com" - "--enableprofiling=false" # If topology based provisioning is desired, configure required # node labels representing the nodes topology domain # and pass the label names below, for CSI to consume and advertise # its equivalent topology domain # - "--domainlabels=failure-domain/region,failure-domain/zone" env: - name: POD_IP valueFrom: fieldRef: fieldPath: status.podIP - name: NODE_ID valueFrom: fieldRef: fieldPath: spec.nodeName - name: CSI_ENDPOINT value: unix:///csi/csi.sock imagePullPolicy: "IfNotPresent" volumeMounts: - name: socket-dir mountPath: /csi - name: mountpoint-dir mountPath: /var/lib/kubelet/pods mountPropagation: Bidirectional - name: plugin-dir mountPath: /var/lib/kubelet/plugins mountPropagation: "Bidirectional" - name: host-sys mountPath: /sys - name: lib-modules mountPath: /lib/modules readOnly: true - name: host-dev mountPath: /dev - name: host-mount mountPath: /run/mount - name: ceph-csi-config mountPath: /etc/ceph-csi-config/ - name: keys-tmp-dir mountPath: /tmp/csi/keys - name: liveness-prometheus securityContext: privileged: true image: quay.io/cephcsi/cephcsi:canary args: - "--type=liveness" - "--endpoint=$(CSI_ENDPOINT)" - "--metricsport=8681" - "--metricspath=/metrics" - "--polltime=60s" - "--timeout=3s" env: - name: CSI_ENDPOINT value: unix:///csi/csi.sock - name: POD_IP valueFrom: fieldRef: fieldPath: status.podIP volumeMounts: - name: socket-dir mountPath: /csi imagePullPolicy: "IfNotPresent" volumes: - name: socket-dir hostPath: path: /var/lib/kubelet/plugins/cephfs.csi.ceph.com/ type: DirectoryOrCreate - name: registration-dir hostPath: path: /var/lib/kubelet/plugins_registry/ type: Directory - name: mountpoint-dir hostPath: path: /var/lib/kubelet/pods type: DirectoryOrCreate - name: plugin-dir hostPath: path: /var/lib/kubelet/plugins type: Directory - name: host-sys hostPath: path: /sys - name: lib-modules hostPath: path: /lib/modules - name: host-dev hostPath: path: /dev - name: host-mount hostPath: path: /run/mount - name: ceph-csi-config configMap: name: ceph-csi-config - name: 
keys-tmp-dir emptyDir: { medium: "Memory" } --- # This is a service to expose the liveness metrics apiVersion: v1 kind: Service metadata: name: csi-metrics-cephfsplugin labels: app: csi-metrics spec: ports: - name: http-metrics port: 8080 protocol: TCP targetPort: 8681 selector: app: csi-cephfsplugin
Load the container images offline first (on all k8s nodes).

Offline image archive (Baidu netdisk): https://pan.baidu.com/s/1xp2cJTDD-KR2hYqTD3L9QQ

Extraction code: m66b

```bash
$ grep image csi-cephfsplugin-provisioner.yaml   # list the required images
# Upload the image archive to the servers, then:
$ unzip '*.zip'
$ ls *tar | xargs -i docker load -i {} && docker images
```
Because this k8s cluster has only three machines, the master must be allowed to run pods.

```bash
# A taint like this means the master will not run pods
[root@master ~]# kubectl describe node k8s-master | grep Taints
Taints:             node-role.kubernetes.io/master:NoSchedule
# Taints: <none> means the master can run pods
[root@master ~]# kubectl describe node k8s-master | grep Taints
Taints:             <none>
# Allow the master to take pod workloads
[root@master ~]# kubectl taint nodes k8s-master node-role.kubernetes.io/master-
# Restore the master to not taking pod workloads
[root@master ~]# kubectl taint nodes k8s-master node-role.kubernetes.io/master=:NoSchedule
```
Deploy:
$ kubectl apply -f csi-config-map.yaml $ kubectl create -f csi-provisioner-rbac.yaml $ kubectl create -f csi-nodeplugin-rbac.yaml $ kubectl create -f csi-cephfsplugin-provisioner.yaml $ kubectl create -f csi-cephfsplugin.yaml
Verification

1. On k8s, create the secret for connecting to the Ceph cluster (secret.yaml)

```bash
$ cat > secret.yaml << EOF
---
apiVersion: v1
kind: Secret
metadata:
  name: csi-cephfs-secret
  namespace: default
stringData:
  # Values come from 'ceph auth get client.admin'; no base64 encoding is needed here
  # Required for statically provisioned volumes
  userID: admin
  userKey: AQCoAPBg9LQTMhAALBgNqW3DDcaAm9NL6HFzaA==
  # Required for dynamically provisioned volumes
  adminID: admin
  adminKey: AQCoAPBg9LQTMhAALBgNqW3DDcaAm9NL6HFzaA==
EOF
$ kubectl apply -f secret.yaml
$ kubectl get secret csi-cephfs-secret -n default
```
2. Create the StorageClass (storageclass.yaml)

- Field 1 (clusterID): set it to your own Ceph cluster ID (ceph mon dump).
- Field 2 (fsName): the file system created above, named cephfs (ceph fs ls).
- Field 3 (pool): uncomment it and set the data pool, not the metadata pool (ceph osd pool ls).

```bash
$ cat > storageclass.yaml << EOF
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: csi-cephfs-sc
provisioner: cephfs.csi.ceph.com
parameters:
  clusterID: 96f54e84-dbfc-4650-8896-8f3b5f524bbf   # change to your cluster ID
  fsName: cephfs                                    # change to your cephfs name
  pool: fs_data                                     # change to your data pool
  csi.storage.k8s.io/provisioner-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/provisioner-secret-namespace: default
  csi.storage.k8s.io/controller-expand-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/controller-expand-secret-namespace: default
  csi.storage.k8s.io/node-stage-secret-name: csi-cephfs-secret
  csi.storage.k8s.io/node-stage-secret-namespace: default
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
  - debug
EOF
$ kubectl apply -f storageclass.yaml
$ kubectl get sc csi-cephfs-sc -n default
```
3. Create a PVC based on the StorageClass

```bash
$ cat > pvc.yaml << EOF
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-cephfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-cephfs-sc
EOF
$ kubectl apply -f pvc.yaml
$ kubectl get pvc
```
4. Create a Pod that uses the PVC

```bash
$ cat > pod.yaml << EOF
---
apiVersion: v1
kind: Pod
metadata:
  name: csi-cephfs-demo-pod
spec:
  containers:
  - name: web-server
    image: nginx:alpine
    volumeMounts:
    - name: mypvc
      mountPath: /var/lib/www
  volumes:
  - name: mypvc
    persistentVolumeClaim:
      claimName: csi-cephfs-pvc
      readOnly: false
EOF
$ kubectl apply -f pod.yaml
$ kubectl get pod csi-cephfs-demo-pod -n default
```
Check where the data is mounted:

```bash
$ kubectl exec -it csi-cephfs-demo-pod -- df -Th | grep ceph
192.168.99.204:6789,192.168.99.205:6789,192.168.99.206:6789:/volumes/csi/csi-vol-ba73d3f4-2f3a-11ec-8925-865ed536e16d/79a41bdb-9bfb-4d70-a98c-487a7f8ba04d  ceph  1.0G  0  1.0G  0%  /var/lib/www
```
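The same kind of write/read check as for the RBD pod can be run against the CephFS-backed mount:

```bash
$ kubectl exec -it csi-cephfs-demo-pod -- sh -c 'echo cephfs-test > /var/lib/www/index.html'
$ kubectl exec -it csi-cephfs-demo-pod -- cat /var/lib/www/index.html
cephfs-test
```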