Ceph Cluster Installation and Deployment
Cluster Installation
1. Prepare the software repository and choose a release to install (on all nodes). Note: Ceph-CSI requires Nautilus (N) or later; for the available releases see the Alibaba Cloud mirror site https://mirrors.aliyun.com/ceph/
cat >/etc/yum.repos.d/ceph.repo<<EOF
[ceph]
name=ceph
baseurl=https://mirrors.aliyun.com/ceph/rpm-octopus/el7/x86_64/
gpgcheck=0
[ceph-noarch]
name=cephnoarch
baseurl=https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/
gpgcheck=0
EOF
# Available releases
https://mirrors.aliyun.com/ceph/rpm-luminous/el7/ # Luminous (L)
https://mirrors.aliyun.com/ceph/rpm-mimic/el7/ # Mimic (M)
https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/ # Nautilus (N)
https://mirrors.aliyun.com/ceph/rpm-octopus/el7/ # Octopus (O)
2. Configure hostnames, name resolution, and passwordless SSH (passwordless SSH setup is not covered in detail; a sketch follows the hosts file below)
hostnamectl set-hostname node01
hostnamectl set-hostname node02
hostnamectl set-hostname node03
cat >>/etc/hosts<<EOF
192.168.20.128 node01
192.168.20.129 node02
192.168.20.130 node03
EOF
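The passwordless SSH setup skipped above can be done roughly as follows from the deploy node; this is a minimal sketch assuming root access and the node names defined in /etc/hosts above:
# Generate a key pair on the deploy node and copy the public key to every node
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
for host in node01 node02 node03; do ssh-copy-id root@$host; done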
3. Install the deployment tool on the admin node (the following steps only need to be run on the deploy node)
# Install the ceph-deploy 2.0.1 package provided on the Alibaba Cloud mirror; the 1.5.25 version from the EPEL repo does not handle recent Ceph releases well
rpm -ivh https://mirrors.aliyun.com/ceph/rpm-15.2.10/el7/noarch/ceph-deploy-2.0.1-0.noarch.rpm
# Install dependencies
yum install python-setuptools python2-subprocess32 ceph-common -y
4. Create a new cluster, specifying the Mon nodes; this generates ceph.conf and the monitor keyring (ceph.conf and ceph.mon.keyring are written to the current directory)
The cluster network carries internal cluster traffic (replication, recovery); in production it should be a separate private subnet rather than the public network
ceph-deploy new node01 node02 node03 --cluster-network 192.168.20.0/24 --public-network 192.168.20.0/24
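The generated ceph.conf should look roughly like the following (a sketch; the fsid is generated and will differ):
[global]
fsid = <generated-uuid>
mon_initial_members = node01, node02, node03
mon_host = 192.168.20.128,192.168.20.129,192.168.20.130
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = 192.168.20.0/24
cluster_network = 192.168.20.0/24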
5. Install the Ceph packages on the cluster nodes
ceph-deploy install --no-adjust-repos node01 node02 node03
# The command above is equivalent to running yum -y install ceph ceph-radosgw on each node
6. Create and initialize the Mon nodes (the Mon daemon listens on port 6789)
ceph-deploy mon create-initial
7. Push the config file and admin keyring to the other hosts
ceph-deploy admin node01 node02 node03
# Push the config file only
ceph-deploy config push node01 node02 node03
8. Deploy Mgr daemons on the target hosts
ceph-deploy mgr create node01 node02
9. Check the cluster status
ceph -s
10. Handling cluster warnings
# mon is allowing insecure global_id reclaim
ceph config set mon auth_allow_insecure_global_id_reclaim false
# Module 'restful' has failed dependency: No module named 'pecan'
pip3 install pecan werkzeug
systemctl restart ceph-mon.target
systemctl restart ceph-mgr.target
11. List all available disks on the cluster nodes
ceph-deploy disk list node01 node02 node03
# Rescan the SCSI bus for newly attached devices
for host in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$host"; done
12. Zap (wipe) the disks on the cluster nodes that will be used as OSD devices
ceph-deploy disk zap node01 /dev/sdb /dev/sdc /dev/sdd
ceph-deploy disk zap node02 /dev/sdb /dev/sdc /dev/sdd
ceph-deploy disk zap node03 /dev/sdb /dev/sdc /dev/sdd
13. Create OSDs on the cluster nodes
In production, separate block-db and block-wal devices can be specified to improve performance (see step 4 under Cluster Maintenance); a loop form of the commands below is sketched after them
ceph-deploy osd create node01 --bluestore --data /dev/sdb
ceph-deploy osd create node02 --bluestore --data /dev/sdb
ceph-deploy osd create node03 --bluestore --data /dev/sdb
ceph-deploy osd create node01 --bluestore --data /dev/sdc
ceph-deploy osd create node02 --bluestore --data /dev/sdc
ceph-deploy osd create node03 --bluestore --data /dev/sdc
ceph-deploy osd create node01 --bluestore --data /dev/sdd
ceph-deploy osd create node02 --bluestore --data /dev/sdd
ceph-deploy osd create node03 --bluestore --data /dev/sdd
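Equivalently, the nine commands above can be issued from the deploy node with a small shell loop (a sketch assuming the same node and device names):
for host in node01 node02 node03; do
  for dev in sdb sdc sdd; do
    ceph-deploy osd create $host --bluestore --data /dev/$dev
  done
done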
14. Create and inspect a storage pool
ceph osd pool create mypool 128 128 replicated
ceph osd pool ls
ceph osd pool stats mypool
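A common rule of thumb (a general guideline, not specific to this article): total PG count ≈ (number of OSDs × 100) / replica size, rounded to a power of two and divided among the pools. For the 9 OSDs created above with 3 replicas this gives 9 × 100 / 3 = 300 ≈ 256 PGs in total, so 128 for a single pool is a reasonable starting value; pg_num can be increased later as pools or OSDs are added.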
15. Pool application-type operations
osd pool application enable <pool> <app> [--yes-i-really-mean-it]
osd pool application disable <pool> <app> [--yes-i-really-mean-it]
osd pool application set <pool> <app> <key> <value>
osd pool application rm <pool> <app> <key>
osd pool application get [<pool>] [<app>] [<key>]
ceph osd pool application enable mypool rbd
16. View and set a pool's replica size, minimum replica size, pg_num, pgp_num, etc.
osd pool get <poolname> size|min_size|pg_num|pgp_num
ceph osd pool get mypool size
ceph osd pool set mypool min_size 1
ceph osd pool get mypool all
# List the replica size of every pool
ceph osd pool ls | xargs -i ceph osd pool get {} size
17. rados pool and object operations (note: rados mkpool/rmpool may be unavailable in newer releases; ceph osd pool create/rm are preferred)
rados mkpool mypool 128 128
rados lspools
rados put <obj-name> <infile> [--offset offset]
rados get <obj-name> <outfile>
rados rm <obj-name>
rados ls -p mypool
18. Show how a given object is mapped within the Ceph cluster (its PG and OSDs)
ceph osd map mypool <obj-name>
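A concrete walk-through of the object commands above, using an illustrative object name testobj and /etc/hosts as the input file:
rados -p mypool put testobj /etc/hosts
rados ls -p mypool
ceph osd map mypool testobj   # shows the PG and the acting OSD set for the object
rados -p mypool get testobj /tmp/testobj.out
rados -p mypool rm testobj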
Cluster Maintenance
1. Add Mon nodes
ceph-deploy mon add node04
ceph-deploy mon add node05
2. Add an Mgr node
ceph-deploy mgr create node03
3. Check the Mon quorum status
ceph mon stat
ceph quorum_status --format json-pretty
4. When creating an OSD, the three kinds of data can be stored separately: object data blobs, SST files (RocksDB), and the WAL
ceph-deploy osd create {node} --data /path/to/data --block-db /path/to/db-device --block-wal /path/to/wal-device
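For example (device names are illustrative; a fast NVMe partition is assumed for the DB and WAL):
ceph-deploy osd create node01 --bluestore --data /dev/sdb --block-db /dev/nvme0n1p1 --block-wal /dev/nvme0n1p2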
5. Delete a storage pool
ceph daemon mon.node01 config set mon_allow_pool_delete true
ceph daemon mon.node02 config set mon_allow_pool_delete true
ceph daemon mon.node03 config set mon_allow_pool_delete true
ceph osd pool rm <poolname> {<poolname>} {<sure>}
rados rmpool <pool-name> [<pool-name> --yes-i-really-really-mean-it]
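For example, removing the mypool pool created earlier (the pool name must be given twice as a safeguard):
ceph osd pool rm mypool mypool --yes-i-really-really-mean-it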
6. Stop and remove an OSD
ceph osd out <ids>...
systemctl stop ceph-osd@<id>.service
ceph osd crush reweight osd.<id> <weight:float>
ceph osd purge <id|osd.id> [--force] [--yes-i-really-mean-it]
# Removing an OSD prior to the Luminous (L) release
ceph osd crush reweight osd.<id> 0
systemctl stop ceph-osd@<id>.service
ceph osd out <id>
ceph osd crush remove osd.<id>
ceph osd rm <id>
ceph auth del osd.<id>
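A concrete walk-through of the newer purge-based procedure, for a hypothetical osd.8 hosted on node03:
ceph osd out 8
# wait for data migration to finish (ceph -s returns to HEALTH_OK) before stopping the daemon
systemctl stop ceph-osd@8.service        # run on node03, where osd.8 lives
ceph osd purge 8 --yes-i-really-mean-it  # removes the OSD from the CRUSH map, OSD map, and auth database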
7. Device class operations
# Common device classes: hdd, ssd, nvme, scm, any; a class name is just an arbitrary string and does not have to be hdd, ssd, or nvme
ceph osd crush class create <class>
ceph osd crush class ls
ceph osd crush class ls-osd <class>
ceph osd crush class rename <srcname> <dstname>
ceph osd crush class rm <class>
# View and set the device class of an OSD
ceph osd crush get-device-class osd.<id>
ceph osd crush set-device-class ssd osd.<id>
ceph osd crush rm-device-class osd.<id>
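Device classes are typically used to steer pools onto a particular type of disk. A minimal sketch (the rule name ssd-rule is illustrative): create a replicated CRUSH rule restricted to the ssd class, then bind a pool to it:
ceph osd crush rule create-replicated ssd-rule default host ssd
ceph osd pool set mypool crush_rule ssd-rule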
Author: wanghongwei
Copyright notice: this work is licensed under CC BY-NC-ND 4.0. For commercial reproduction please contact the author for authorization; for non-commercial reproduction please include a link to the original article and this notice.