Provisioning Volumes on Nomad with Ceph CSI (stable, continuously updated)
Using Ceph CSI to provision volumes on Nomad.
The key components used in this post are:
- Nomad v1.2.3
- Ceph Storage v14 (Nautilus)
- Ceph CSI v3.3.1
Provisioning volumes on Nomad
First, add the following to the Nomad client configuration so that Docker containers can run in privileged mode on the Nomad client nodes.
cat <<EOC >> /etc/nomad.d/client.hcl
plugin "docker" {
  config {
    allow_privileged = true
  }
}
EOC
systemctl restart nomad
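To confirm a client picked up the change, you can inspect the Docker driver attributes fingerprinted by the node; recent Nomad versions expose a privileged flag there (the exact attribute name may vary slightly between versions):
# run on a client node
nomad node status -verbose -self | grep docker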
Before continuing, load the RBD kernel module on all Nomad client nodes.
sudo modprobe rbd;
sudo lsmod |grep rbd;
rbd 83733 0
libceph 306750 1 rbd
# load the module automatically at boot
echo "rbd" >> /etc/modules-load.d/ceph.conf
CSI consists of a controller component and a node component.
First, create a Ceph CSI controller job with type service.
Modify the following fields before creating the job (a quick way to pull both values is sketched right after this list):
- clusterID: obtained with ceph -s |grep id
- monitors: obtained with ceph -s |grep mon; use IP addresses, not hostnames
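If you prefer not to grep through ceph -s, the same two values can be printed directly with standard Ceph commands (output format varies slightly between releases):
# cluster fsid, used as clusterID
ceph fsid
# monitor addresses, used for the monitors list
ceph mon dump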
# quote the heredoc delimiter ('EOC') so the ${...} placeholders are written to the file literally
cat <<'EOC' > ceph-csi-plugin-controller.nomad
job "csi-cephrbd-controller" {
  datacenters = ["dc1", "dc2"]

  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  type = "service"

  group "cephrbd" {
    network {
      port "prometheus" {}
    }

    service {
      name = "prometheus"
      port = "prometheus"
      tags = ["ceph-csi"]
    }

    task "plugin" {
      driver = "docker"

      config {
        image = "quay.io/cephcsi/cephcsi:v3.3.1"

        args = [
          "--drivername=rbd.csi.ceph.com",
          "--v=5",
          "--type=rbd",
          "--controllerserver=true",
          "--nodeid=${NODE_ID}",
          "--instanceid=${POD_ID}",
          "--endpoint=${CSI_ENDPOINT}",
          "--metricsport=${NOMAD_PORT_prometheus}",
        ]

        ports = ["prometheus"]

        # we need to be able to write key material to disk in this location
        mount {
          type     = "bind"
          source   = "secrets"
          target   = "/tmp/csi/keys"
          readonly = false
        }

        mount {
          type     = "bind"
          source   = "ceph-csi-config/config.json"
          target   = "/etc/ceph-csi-config/config.json"
          readonly = false
        }
      }

      template {
        data = <<-EOT
          POD_ID=${NOMAD_ALLOC_ID}
          NODE_ID=${node.unique.id}
          CSI_ENDPOINT=unix://csi/csi.sock
        EOT
        destination = "${NOMAD_TASK_DIR}/env"
        env         = true
      }

      # this is the main part to modify: clusterID and monitors
      template {
        data = <<EOF
[{
  "clusterID": "380a1e72-da89-4041-8478-xxxxx",
  "monitors": [
    "10.103.3.x:6789",
    "10.103.3.x:6789",
    "10.103.3.x:6789"
  ]
}]
EOF
        destination = "ceph-csi-config/config.json"
      }

      csi_plugin {
        id        = "cephrbd"
        type      = "controller"
        mount_dir = "/csi"
      }

      resources {
        cpu    = 256
        memory = 256
      }
    }
  }
}
EOC
Next, create a Ceph CSI node job.
# quote the heredoc delimiter ('EOC') here as well so the ${...} placeholders are written literally
cat <<'EOC' > ceph-csi-plugin-nodes.nomad
job "csi-cephrbd-node" {
  datacenters = ["dc1", "dc2"]

  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  type = "system"

  group "cephrbd" {
    network {
      port "prometheus" {}
    }

    service {
      name = "prometheus"
      port = "prometheus"
      tags = ["ceph-csi"]
    }

    task "plugin" {
      driver = "docker"

      config {
        image = "quay.io/cephcsi/cephcsi:v3.3.1"

        # use host networking; this fixes "kernel: libceph: connect (1)10.103.3.xxx:6789 error -101"
        # errors in the system messages log
        network_mode = "host"

        args = [
          "--drivername=rbd.csi.ceph.com",
          "--v=5",
          "--type=rbd",
          "--nodeserver=true",
          "--nodeid=${NODE_ID}",
          "--instanceid=${POD_ID}",
          "--endpoint=${CSI_ENDPOINT}",
          "--metricsport=${NOMAD_PORT_prometheus}",
        ]

        privileged = true
        ports      = ["prometheus"]

        # we need to be able to write key material to disk in this location
        mount {
          type     = "tmpfs"
          #source  = "secrets"
          target   = "/tmp/csi/keys"
          readonly = false
        }

        mount {
          type     = "bind"
          source   = "ceph-csi-config/config.json"
          target   = "/etc/ceph-csi-config/config.json"
          readonly = false
        }
      }

      template {
        data = <<-EOT
          POD_ID=${NOMAD_ALLOC_ID}
          NODE_ID=${node.unique.id}
          CSI_ENDPOINT=unix://csi/csi.sock
        EOT
        destination = "${NOMAD_TASK_DIR}/env"
        env         = true
      }

      # this is the main part to modify: clusterID and monitors
      template {
        data = <<EOF
[{
  "clusterID": "380a1e72-da89-4041-8478-xxxxx",
  "monitors": [
    "10.103.3.x:6789",
    "10.103.3.x:6789",
    "10.103.3.x:6789"
  ]
}]
EOF
        destination = "ceph-csi-config/config.json"
      }

      csi_plugin {
        # if another CSI plugin already exists, this id must not clash with its name
        id        = "cephrbd"
        type      = "node"
        mount_dir = "/csi"
      }

      # note: there's no upstream guidance on resource usage so
      # this is a best guess until we profile it in heavy use
      resources {
        cpu    = 256
        memory = 256
      }
    }
  }
}
EOC
This Ceph node job has type system, which means a ceph-csi node container will run on every Nomad client node.
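Before running them, both job files can be checked for syntax errors:
nomad job validate ceph-csi-plugin-controller.nomad;
nomad job validate ceph-csi-plugin-nodes.nomad;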
Run the Ceph CSI jobs:
nomad job run ceph-csi-plugin-controller.nomad;
nomad job run ceph-csi-plugin-nodes.nomad;
Check the status of the ceph-csi plugin (the plugin id is the cephrbd id set in csi_plugin above):
nomad plugin status cephrbd;
ID = cephrbd
Provider = rbd.csi.ceph.com
Version = v3.3.1
Controllers Healthy = 1
Controllers Expected = 1
Nodes Healthy = 2
Nodes Expected = 2
Allocations
ID Node ID Task Group Version Desired Status Created Modified
b6268d6d 457a8291 controller 0 run running 1d21h ago 1d21h ago
ec265d25 709ee9cc nodes 0 run running 1d21h ago 1d21h ago
4cd7dffa 457a8291 nodes 0 run running 1d21h ago 1d21h ago
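If a controller or node shows up unhealthy, the plugin task's logs are the first place to look. For example, using one of the allocation IDs from the output above:
nomad alloc status b6268d6d
nomad alloc logs -stderr b6268d6d plugin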
Now volumes can be mounted from the external Ceph storage using the ceph-csi driver.
Let's create a Ceph pool named nomad and an admin user nomadAdmin.
# create the pool
ceph osd pool create nomad 64 64
rbd pool init nomad;
# create the nomadAdmin user
ceph auth get-or-create-key client.nomadAdmin mds 'allow *' mgr 'allow *' mon 'allow *' osd 'allow * pool=nomad'
# show the user and its key
ceph auth list |grep -A 3 nomadAdmin
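The key alone, which is what the userKey field in the volume spec below needs, can also be fetched directly:
ceph auth get-key client.nomadAdmin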
Now we need a volume registered with Nomad. Creating volumes by hand felt a bit tedious, so I wrote a small helper script:
vi /usr/bin/nvc
#!/bin/bash
# path: /usr/bin/nvc
# usage: nvc <namespace> <volume-name> <size>
namespace=$1
VNAME=$2
VSIZE=$3

cat <<EOF > /tmp/$namespace-$VNAME.hcl
type         = "csi"
id           = "$VNAME"
name         = "$VNAME"
capacity_min = "$VSIZE"
capacity_max = "$VSIZE"

mount_options {
  fs_type     = "xfs"
  # discard is the important flag here, so deleted blocks are trimmed immediately
  mount_flags = ["discard", "defaults"]
}

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "block-device"
}

# must match the csi_plugin id used in the plugin jobs above
plugin_id = "cephrbd"

secrets {
  userID  = "nomadAdmin"
  userKey = "AQDD6GxiISmwGhAAlONuWEB869f6yeuEY9iicQ=="
}

parameters {
  clusterID     = "380a1e72-da89-4041-8478-76383f5f6378"
  pool          = "nomad"
  imageFeatures = "layering"
}
EOF

nomad volume create -namespace $namespace /tmp/$namespace-$VNAME.hcl
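Make the script executable before using it:
chmod +x /usr/bin/nvc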
Now create the volume and register it with Nomad.
# check existing volumes in the namespace
nomad volume status -namespace=ic-es
# create and register the volume
namespace=ic-es
name=ic-node-2
nvc $namespace $name 12576GB
# deregister the volume; the RBD image in Ceph is kept
nomad volume deregister -force -namespace=ic-es ic-node-2
nomad volume deregister -namespace=ic-es ic-node-2
# delete the volume; this also removes the underlying Ceph RBD image
nomad volume delete -namespace=ic-es ic-node-2
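To actually use the volume, reference it from a job with a volume block and a volume_mount. Below is a minimal sketch, assuming the ic-node-2 volume in namespace ic-es created above; the job name, image, and mount path are just placeholders:
job "volume-demo" {
  datacenters = ["dc1"]
  namespace   = "ic-es"

  group "app" {
    # claim the CSI volume created above
    volume "data" {
      type            = "csi"
      source          = "ic-node-2"
      attachment_mode = "file-system"
      access_mode     = "single-node-writer"
    }

    task "app" {
      driver = "docker"

      config {
        image   = "busybox:1.34"
        command = "sleep"
        args    = ["3600"]
      }

      # mount the claimed volume into the container
      volume_mount {
        volume      = "data"
        destination = "/data"
      }
    }
  }
}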
Problems encountered
Sometimes deleting a volume that was created earlier fails, and the error message contains the following:
ceph is still being used
Solution
Find out which client still holds a connection to the RBD image, then go to that node and unmount/unmap it. In my case I could not find where it was mounted, so I ended up rebooting the server to resolve it.
A related article on fixing Pods in K8s that cannot mount a PVC: https://os.51cto.com/article/675005.html
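Before resorting to a reboot, it is usually worth checking the image watchers on the Ceph side and the mapped devices on the suspect node. A rough sketch (ceph-csi typically names its images csi-vol-<uuid>, so list the pool first; the image and device names below are placeholders):
# on a Ceph admin node: list the images and see which client is still watching one
rbd ls nomad
rbd status nomad/csi-vol-xxxxxxxx
# on the Nomad client that shows up as the watcher: find and unmap the stale device
rbd showmapped
sudo rbd unmap /dev/rbd0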
References
https://rancher.com/docs/rancher/v2.x/en/cluster-admin/volumes-and-storage/ceph/
https://learn.hashicorp.com/tutorials/nomad/stateful-workloads-csi-volumes?in=nomad/stateful-workloads
https://github.com/hashicorp/nomad/tree/main/demo/csi/ceph-csi-plugin
https://docs.ceph.com/en/latest/rbd/rbd-nomad/