Provisioning Volumes on Nomad with Ceph CSI (stable; updated continuously)

Provisioning volumes on Nomad using Ceph CSI

Reference: https://itnext.io/provision-volumes-from-external-ceph-storage-on-kubernetes-and-nomad-using-ceph-csi-7ad9b15e9809

The key components used in this article are:

  • Nomad v1.2.3
  • Ceph Storage v14 (Nautilus)
  • Ceph CSI v3.3.1

Provisioning volumes on Nomad

First, add the following to the Nomad client configuration so that Docker containers are allowed to run in privileged mode on the Nomad client nodes.

cat <<EOC >> /etc/nomad.d/client.hcl 
plugin "docker" {
  config {
    allow_privileged = true
  }
}
EOC

systemctl restart nomad
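
To confirm the change took effect after the restart, you can inspect the Docker driver attributes the client fingerprints. A minimal check, run on the client node itself; the exact attribute name (driver.docker.privileged.enabled here) may differ between Nomad versions:

nomad node status -self -verbose | grep -i 'driver.docker'
# expect driver.docker.privileged.enabled = true among the attributes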

Before continuing, load the RBD kernel module on all Nomad client nodes.

sudo modprobe rbd;
sudo lsmod |grep rbd;
rbd                    83733  0
libceph               306750  1 rbd

# load the rbd module automatically at boot
echo "rbd" >> /etc/modules-load.d/ceph.conf

A CSI plugin consists of a controller component and a node component.

First, create the Ceph CSI controller job, of type service.
Update the following two fields before submitting it (the commands after this list show one way to obtain them):

  • clusterID: obtained from ceph -s |grep id
  • monitors: obtained from ceph -s |grep mon; use IP addresses, not hostnames
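
Both values can also be pulled out of the cluster directly. A small sketch using standard ceph commands (the JSON variant assumes jq is installed; field names can vary slightly between Ceph releases):

# cluster fsid, used as clusterID
ceph fsid
# monitor addresses, used in the monitors list (take the v1 :6789 endpoints)
ceph mon dump
# or as JSON, if jq is available:
ceph mon dump --format json | jq -r '.mons[].public_addr'

With both values in hand, write the controller job file: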
cat <<'EOC' > ceph-csi-plugin-controller.nomad   # quote EOC so the shell keeps the ${...} placeholders for Nomad

job "csi-cephrbd-controller" {
  datacenters = ["dc1", "dc2"]

  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  type = "service"

  group "cephrbd" {

    network {
      port "prometheus" {}
    }

    service {
      name = "prometheus"
      port = "prometheus"
      tags = ["ceph-csi"]
    }

    task "plugin" {
      driver = "docker"

      config {
        image = "quay.io/cephcsi/cephcsi:v3.3.1"

        args = [
          "--drivername=rbd.csi.ceph.com",
          "--v=5",
          "--type=rbd",
          "--controllerserver=true",
          "--nodeid=${NODE_ID}",
          "--instanceid=${POD_ID}",
          "--endpoint=${CSI_ENDPOINT}",
          "--metricsport=${NOMAD_PORT_prometheus}",
        ]

        ports = ["prometheus"]

        # we need to be able to write key material to disk in this location
        mount {
          type     = "bind"
          source   = "secrets"
          target   = "/tmp/csi/keys"
          readonly = false
        }

        mount {
          type     = "bind"
          source   = "ceph-csi-config/config.json"
          target   = "/etc/ceph-csi-config/config.json"
          readonly = false
        }

      }

      template {
        data = <<-EOT
POD_ID=${NOMAD_ALLOC_ID}
NODE_ID=${node.unique.id}
CSI_ENDPOINT=unix://csi/csi.sock
EOT

        destination = "${NOMAD_TASK_DIR}/env"
        env         = true
      }

      # this is the main section to modify (clusterID and monitors)
      template {

        data = <<EOF
[{
    "clusterID": "380a1e72-da89-4041-8478-xxxxx",
    "monitors": [
      "10.103.3.x:6789",
      "10.103.3.x:6789",
      "10.103.3.x:6789"
    ]
}]
EOF
        destination = "ceph-csi-config/config.json"
      }

      csi_plugin {
        id        = "cephrbd"
        type      = "controller"
        mount_dir = "/csi"
      }

      resources {
        cpu    = 256
        memory = 256
      }
    }
  }
}

EOC

Next, create the Ceph CSI node job:

cat <<'EOC' > ceph-csi-plugin-nodes.nomad   # again, quote EOC so the ${...} placeholders stay literal

job "csi-cephrbd-node" {
  datacenters = ["dc1", "dc2"]

  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  type = "system"

  group "cephrbd" {

    network {
      port "prometheus" {}
    }

    service {
      name = "prometheus"
      port = "prometheus"
      tags = ["ceph-csi"]
    }

    task "plugin" {
      driver = "docker"

      config {
        image = "quay.io/cephcsi/cephcsi:v3.3.1"
        network_mode = "host"  #添加本地网络模式  解决系统messages日志 kernel: libceph: connect (1)10.103.3.xxx:6789 error -101

        args = [
          "--drivername=rbd.csi.ceph.com",
          "--v=5",
          "--type=rbd",
          "--nodeserver=true",
          "--nodeid=${NODE_ID}",
          "--instanceid=${POD_ID}",
          "--endpoint=${CSI_ENDPOINT}",
          "--metricsport=${NOMAD_PORT_prometheus}",
        ]

        privileged = true
        ports      = ["prometheus"]
        # we need to be able to write key material to disk in this location
        mount {
          type     = "tmpfs"
          #source   = "secrets"
          target   = "/tmp/csi/keys"
          readonly = false
        }

        mount {
          type     = "bind"
          source   = "ceph-csi-config/config.json"
          target   = "/etc/ceph-csi-config/config.json"
          readonly = false
        }
      }

      template {
        data = <<-EOT
POD_ID=${NOMAD_ALLOC_ID}
NODE_ID=${node.unique.id}
CSI_ENDPOINT=unix://csi/csi.sock
EOT

        destination = "${NOMAD_TASK_DIR}/env"
        env         = true
      }
      # this is the main section to modify (clusterID and monitors)
      template {

        data = <<EOF
[{
    "clusterID": "380a1e72-da89-4041-8478-xxxxx",
    "monitors": [
      "10.103.3.x:6789",
      "10.103.3.x:6789",
      "10.103.3.x:6789"
    ]
}]
EOF

        destination = "ceph-csi-config/config.json"
      }
      csi_plugin {
        id        = "cephrbd"   #如果现在有一个csi,这个不要和那个名字冲突
        type      = "node"
        mount_dir = "/csi"
      }

      # note: there's no upstream guidance on resource usage so
      # this is a best guess until we profile it in heavy use
      resources {
        cpu    = 256
        memory = 256
      }
    }
  }
}


EOC

The Ceph CSI node job is of type system, so a Ceph CSI node container will be started on every Nomad client node.
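
Before submitting them, both job files can be checked for syntax and placement problems with the standard nomad CLI (job plan is a dry run):

nomad job validate ceph-csi-plugin-controller.nomad
nomad job plan ceph-csi-plugin-nodes.nomad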

Run the Ceph CSI jobs:

nomad job run ceph-csi-plugin-controller.nomad;
nomad job run ceph-csi-plugin-nodes.nomad;

Check the status of the Ceph CSI plugin:

nomad plugin status cephrbd;
ID                   = cephrbd
Provider             = rbd.csi.ceph.com
Version              = v3.3.1
Controllers Healthy  = 1
Controllers Expected = 1
Nodes Healthy        = 2
Nodes Expected       = 2
Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
b6268d6d  457a8291  controller  0        run      running  1d21h ago  1d21h ago
ec265d25  709ee9cc  nodes       0        run      running  1d21h ago  1d21h ago
4cd7dffa  457a8291  nodes       0        run      running  1d21h ago  1d21h ago

Volumes can now be mounted from the external Ceph storage through the Ceph CSI driver.

Let's create a Ceph pool named nomad and an admin user nomadAdmin for it:

# create the pool
ceph osd pool create nomad 64 64
rbd pool init nomad;
# create the nomadAdmin user
ceph auth get-or-create-key client.nomadAdmin mds 'allow *' mgr 'allow *' mon 'allow *' osd 'allow * pool=nomad'
# show the user's key and caps
ceph auth list |grep -A 3 nomadAdmin
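
To double-check that the pool and the user came out as expected, a couple of read-only commands are enough:

# the pool should be listed
ceph osd pool ls | grep nomad
# the user's key and caps
ceph auth get client.nomadAdmin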

Now we need to create a volume and register it with Nomad.
Creating volumes by hand felt tedious to me, so I wrote a small helper script:
vi /usr/bin/nvc

#!/bin/bash

# path:/usr/bin/nvc
VSIZE=$3
VNAME=$2
namespace=$1
cat <<EOF > /tmp/$namespace-$VNAME.hcl
type = "csi"
id   = "$VNAME"
name = "$VNAME"
capacity_min = "$VSIZE"
capacity_max = "$VSIZE"
mount_options {
  fs_type     = "xfs"
  mount_flags = ["discard" ,"defaults"]    # 这里主要是 discard 来实时删除
}
capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}
capability {
  access_mode     = "single-node-writer"
  attachment_mode = "block-device"
}
plugin_id       = "cephrbd"  # must match the csi_plugin id defined in the plugin jobs above
secrets {
  userID  = "nomadAdmin"
  userKey = "AQDD6GxiISmwGhAAlONuWEB869f6yeuEY9iicQ=="
}
parameters {
  clusterID = "380a1e72-da89-4041-8478-76383f5f6378"
  pool      = "nomad"
  imageFeatures = "layering"
}
EOF

nomad volume create -namespace $namespace /tmp/$namespace-$VNAME.hcl
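
After saving, make the script executable so it can be called as nvc:

chmod +x /usr/bin/nvc
# usage: nvc <namespace> <volume-name> <size>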

Now create and register the volume with Nomad.

# check volume status
nomad volume status  -namespace=ic-es
# create and register the volume
namespace=ic-es
name=ic-node-2
nvc $namespace $name 12576GB

# deregister only: the RBD image stays in Ceph
nomad volume deregister -force -namespace=ic-es ic-node-2
nomad volume deregister  -namespace=ic-es ic-node-2
# delete also removes the underlying Ceph RBD image
nomad volume delete  -namespace=ic-es ic-node-2
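
For completeness, this is roughly how a workload claims and mounts one of these volumes. A minimal sketch of a demo job, assuming the ic-es namespace and the ic-node-2 volume from above; on older Nomad releases the access_mode/attachment_mode fields inside the group volume block may be unnecessary:

cat <<'EOC' > volume-demo.nomad
job "volume-demo" {
  datacenters = ["dc1", "dc2"]
  namespace   = "ic-es"

  group "demo" {
    volume "data" {
      type            = "csi"
      source          = "ic-node-2"
      access_mode     = "single-node-writer"
      attachment_mode = "file-system"
    }

    task "app" {
      driver = "docker"

      config {
        image   = "alpine:3.15"
        command = "sleep"
        args    = ["infinity"]
      }

      # the volume appears inside the container at /data
      volume_mount {
        volume      = "data"
        destination = "/data"
      }
    }
  }
}
EOC

nomad job run volume-demo.nomad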

Problems encountered

Sometimes deleting a previously created volume fails, and the corresponding error contains something like:

ceph  is still being used

Solution
Find out which client still holds a connection (watcher) on the RBD image, go to that node, and unmount/unmap it there. In my case I could not find where it was mounted, so I ended up rebooting the server to resolve it.
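
A sketch of the commands involved, assuming the nomad pool from above; the image name is the csi-vol-... name generated by ceph-csi (shown here as a hypothetical placeholder), not the Nomad volume name:

# list the images ceph-csi created in the pool
rbd ls nomad
# show watchers, i.e. which client IP still has the image open
rbd status nomad/csi-vol-xxxxxxxx
# then, on the node that holds the watcher:
rbd showmapped
umount /path/to/mountpoint   # hypothetical mount point, see mount | grep rbd
rbd unmap /dev/rbdX          # device reported by rbd showmapped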

Fixing Pods in Kubernetes that cannot mount their PVC: https://os.51cto.com/article/675005.html

References
https://rancher.com/docs/rancher/v2.x/en/cluster-admin/volumes-and-storage/ceph/
https://learn.hashicorp.com/tutorials/nomad/stateful-workloads-csi-volumes?in=nomad/stateful-workloads
https://github.com/hashicorp/nomad/tree/main/demo/csi/ceph-csi-plugin
https://docs.ceph.com/en/latest/rbd/rbd-nomad/
