Ceph Cluster Installation and Deployment

Cluster Installation

1. Prepare the software repository and choose a release to install (on all nodes). Note: Ceph-CSI requires the N (Nautilus) release or later; for available releases see the Alibaba Cloud mirror at https://mirrors.aliyun.com/ceph/

cat >/etc/yum.repos.d/ceph.repo<<EOF 
[ceph]
name=ceph
baseurl=https://mirrors.aliyun.com/ceph/rpm-octopus/el7/x86_64/
gpgcheck=0
[ceph-noarch]
name=cephnoarch
baseurl=https://mirrors.aliyun.com/ceph/rpm-octopus/el7/noarch/
gpgcheck=0
EOF

# Available releases
https://mirrors.aliyun.com/ceph/rpm-luminous/el7/  # L (Luminous)
https://mirrors.aliyun.com/ceph/rpm-mimic/el7/     # M (Mimic)
https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/  # N (Nautilus)
https://mirrors.aliyun.com/ceph/rpm-octopus/el7/   # O (Octopus)

2. Configure hostnames, name resolution, and passwordless SSH login (a sketch of the SSH setup follows the hosts file below)

# Run the corresponding command on each node
hostnamectl set-hostname node01
hostnamectl set-hostname node02
hostnamectl set-hostname node03

cat >>/etc/hosts<<EOF
192.168.20.128 node01
192.168.20.129 node02
192.168.20.130 node03
EOF
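
A minimal sketch of the passwordless SSH setup that this step omits (run on the admin node; assumes root SSH access to all nodes):

# Generate a key pair if one does not exist yet
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Distribute the public key to every node
for node in node01 node02 node03; do ssh-copy-id root@"$node"; done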

3. Install the deployment tool on the admin node (the following steps need to be run only on this node)

# Install ceph-deploy 2.0.1 from the Alibaba Cloud mirror; the 1.5.25 build in the EPEL repo does not handle recent Ceph releases well
rpm -ivh https://mirrors.aliyun.com/ceph/rpm-15.2.10/el7/noarch/ceph-deploy-2.0.1-0.noarch.rpm

# 安装依赖
yum install python-setuptools python2-subprocess32 ceph-common -y

4. Create a new cluster and specify the Mon nodes; this generates ceph.conf and ceph.mon.keyring in the current directory.
The cluster network carries internal cluster traffic; in production it should be a separate private subnet.

ceph-deploy new node01 node02 node03 --cluster-network 192.168.20.0/24 --public-network 192.168.20.0/24
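
The generated ceph.conf will look roughly like the following (the fsid shown is a made-up example; yours will differ):

[global]
fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
mon_initial_members = node01, node02, node03
mon_host = 192.168.20.128,192.168.20.129,192.168.20.130
public_network = 192.168.20.0/24
cluster_network = 192.168.20.0/24
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx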

5. Install the Ceph packages on the cluster nodes

ceph-deploy install --no-adjust-repos node01 node02 node03

# The command above is equivalent to running yum -y install ceph ceph-radosgw on every node

6. Create and initialize the Mon nodes (the Mon daemon listens on port 6789)

ceph-deploy mon create-initial

7. Push the config file and admin keyring to the other hosts

ceph-deploy admin node01 node02 node03

# Push the config file only
ceph-deploy config push node01 node02 node03
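
If ceph commands fail with a keyring permission error when run as a non-root user on the target hosts, the pushed admin keyring is probably only readable by root; relaxing it is common in a lab (review for production):

chmod +r /etc/ceph/ceph.client.admin.keyring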

8. Deploy Mgr on the target hosts

ceph-deploy mgr create node01 node02

9. Check the cluster status

ceph -s
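
Once the OSDs from step 13 are in place, a healthy cluster reports something roughly like this (IDs, ages, and counts will differ):

  cluster:
    id:     a7f64266-0894-4f1e-a635-d0aeaca0e993
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node01,node02,node03
    mgr: node01(active), standbys: node02
    osd: 9 osds: 9 up, 9 in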

10. Handling cluster warnings

# mon is allowing insecure global_id reclaim
ceph config set mon auth_allow_insecure_global_id_reclaim false

# Module 'restful' has failed dependency: No module named 'pecan'
pip3 install pecan werkzeug
systemctl restart ceph-mon.target
systemctl restart ceph-mgr.target

11. List all available disks on the cluster nodes

ceph-deploy disk list node01 node02 node03

# Rescan the SCSI bus to pick up newly attached devices
for host in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$host"; done

12. Zap the disks on the cluster nodes that will be used as OSD devices

ceph-deploy disk zap node01 /dev/sdb /dev/sdc /dev/sdd
ceph-deploy disk zap node02 /dev/sdb /dev/sdc /dev/sdd
ceph-deploy disk zap node03 /dev/sdb /dev/sdc /dev/sdd

13. Create OSDs on the cluster nodes (a condensed loop version follows the commands below).
In production, block-db and block-wal can be placed on separate, faster devices to improve performance.

ceph-deploy osd create node01 --bluestore --data /dev/sdb
ceph-deploy osd create node02 --bluestore --data /dev/sdb
ceph-deploy osd create node03 --bluestore --data /dev/sdb
ceph-deploy osd create node01 --bluestore --data /dev/sdc
ceph-deploy osd create node02 --bluestore --data /dev/sdc
ceph-deploy osd create node03 --bluestore --data /dev/sdc
ceph-deploy osd create node01 --bluestore --data /dev/sdd
ceph-deploy osd create node02 --bluestore --data /dev/sdd
ceph-deploy osd create node03 --bluestore --data /dev/sdd
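
The nine commands above can be condensed into a loop (same node and device names assumed):

for node in node01 node02 node03; do
    for dev in sdb sdc sdd; do
        ceph-deploy osd create "$node" --bluestore --data /dev/"$dev"
    done
done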

14. Create and inspect a storage pool

ceph osd pool create mypool 128 128 replicated
ceph osd pool ls
ceph osd pool stats mypool
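
As a rough sizing rule of thumb (an assumption, not prescribed here), target about 100 PGs per OSD across all pools, divided by the replica count and rounded to a power of two:

# pg_num ≈ (num_osds * 100) / replica_size, rounded to a power of two
echo $(( 9 * 100 / 3 ))   # 300 -> choose 256 in total for the 9-OSD cluster above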

15. Pool application-type operations

osd pool application enable <pool> <app> [--yes-i-really-mean-it]
osd pool application disable <pool> <app> [--yes-i-really-mean-it]
osd pool application set <pool> <app> <key> <value>
osd pool application rm <pool> <app> <key>
osd pool application get [<pool>] [<app>] [<key>]

ceph osd pool application enable mypool rbd
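
To confirm the association took effect:

ceph osd pool application get mypool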

16. View and set a pool's replica count, minimum replica count, pg_num, pgp_num, and so on

osd pool get <poolname> size|min_size|pg_num|pgp_num

ceph osd pool get mypool size
ceph osd pool set mypool min_size 1
ceph osd pool get mypool all

# List the replica count of every pool
ceph osd pool ls | xargs -i ceph osd pool get {} size

17. rados pool and object operations

# Note: mkpool's optional numeric arguments are [auid [crush-rule]], not PG counts,
# and mkpool/rmpool are removed in recent releases; prefer ceph osd pool create
rados mkpool mypool
rados lspools

rados put <obj-name> <infile> [--offset offset]
rados get <obj-name> <outfile>
rados rm <obj-name>

rados ls -p mypool
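
A concrete round trip through the commands above (object and file names are arbitrary examples):

echo "hello ceph" > /tmp/testfile
rados put testobj /tmp/testfile -p mypool     # upload the file as object testobj
rados ls -p mypool                            # testobj should now be listed
rados get testobj /tmp/testfile.out -p mypool # download it again
rados rm testobj -p mypool                    # clean up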

18. View where a given object maps in the Ceph cluster

ceph osd map mypool <obj-name>
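
For example, querying the testobj object from step 17 prints the placement group and the acting set of OSDs, roughly like this (your PG IDs and OSD numbers will differ):

ceph osd map mypool testobj
# osdmap e53 pool 'mypool' (1) object 'testobj' -> pg 1.75aac8cc (1.4c) -> up ([1,2,0], p1) acting ([1,2,0], p1)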

Cluster Maintenance

1. Add Mon nodes

ceph-deploy mon add node04
ceph-deploy mon add node05

2. Add an Mgr node

ceph-deploy mgr create node03

3. Check the Mon quorum status

ceph mon stat
ceph quorum_status --format json-pretty

4. When creating an OSD, store its three kinds of data on separate devices: object data blobs, SST files (RocksDB), and WAL files

ceph-deploy osd create {node} --data /path/to/data --block-db /path/to/db-device --block-wal /path/to/wal-device
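
A hypothetical example, assuming the data sits on an HDD /dev/sdb while two partitions of a faster NVMe drive hold the RocksDB metadata and the WAL:

ceph-deploy osd create node01 --bluestore --data /dev/sdb --block-db /dev/nvme0n1p1 --block-wal /dev/nvme0n1p2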

5. Delete a storage pool

ceph daemon mon.node01 config set mon_allow_pool_delete true
ceph daemon mon.node02 config set mon_allow_pool_delete true
ceph daemon mon.node03 config set mon_allow_pool_delete true

ceph osd pool rm <poolname> {<poolname>} {<sure>}
rados rmpool <pool-name> [<pool-name> --yes-i-really-really-mean-it]
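
A concrete example deleting the mypool pool created earlier (the name must be typed twice as a safeguard):

ceph osd pool rm mypool mypool --yes-i-really-really-mean-it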

6. Stop and remove an OSD

ceph osd out <ids>...                           # mark the OSD(s) out so data migrates away
systemctl stop ceph-osd@<id>.service            # stop the daemon once rebalancing has finished
ceph osd crush reweight osd.<id> <weight:float> # optional: drain gradually by lowering the CRUSH weight first
ceph osd purge <id|osd.id> [--force] [--yes-i-really-mean-it]

# Removing an OSD before the L (Luminous) release
ceph osd crush reweight osd.<id> 0
systemctl stop ceph-osd@<id>.service
ceph osd out <id>
ceph osd crush remove osd.<id>
ceph osd rm <id>
ceph auth del osd.<id>
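
A worked example of the modern flow for a hypothetical osd.8; wait for rebalancing to complete between marking the OSD out and stopping it:

ceph osd out 8                            # data starts migrating off osd.8
ceph -w                                   # watch until the cluster returns to HEALTH_OK
systemctl stop ceph-osd@8.service         # run on the node hosting osd.8
ceph osd purge 8 --yes-i-really-mean-it   # remove it from the CRUSH map, auth, and the OSD map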

7. Device class operations

# Common device classes: hdd, ssd, nvme, scm, any; a class name is just an arbitrary string and need not be hdd, ssd, or nvme
ceph osd crush class create <class> 
ceph osd crush class ls
ceph osd crush class ls-osd <class>
ceph osd crush class rename <srcname> <dstname>
ceph osd crush class rm <class>

# Get and set an OSD's device class (an existing class must be removed before a new one can be set)
ceph osd crush get-device-class osd.<id>
ceph osd crush set-device-class ssd osd.<id>
ceph osd crush rm-device-class osd.<id>
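
Device classes are usually consumed through CRUSH rules; for example, a replicated rule that keeps data on ssd-class OSDs only, which a pool can then be switched to (the rule name here is illustrative):

ceph osd crush rule create-replicated fast-rule default host ssd
ceph osd pool set mypool crush_rule fast-rule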