Deploying a RabbitMQ 3.11.11 cluster on Kubernetes

The cluster is deployed with the official RabbitMQ Docker image and the rabbitmq-peer-discovery-k8s plugin.

0. Environment

 Kubernetes 1.24

 RabbitMQ 3.11.11

1. Namespace

All RabbitMQ resources are placed in the rabbitmq namespace.

Namespace.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: rabbitmq

2. Configuration

A ConfigMap is used to mount the configuration files into the RabbitMQ container.

Config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: rabbitmq-config
  namespace: rabbitmq
data:
  enabled_plugins: |
      [rabbitmq_management,rabbitmq_mqtt,rabbitmq_web_mqtt,rabbitmq_peer_discovery_k8s].
  rabbitmq.conf: |
      ## Cluster formation. See https://www.rabbitmq.com/cluster-formation.html to learn more.
      cluster_formation.peer_discovery_backend  = k8s
      cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
      ## Service name is rabbitmq by default but can be overridden using the cluster_formation.k8s.service_name key if needed
      cluster_formation.k8s.service_name = rabbitmq-internal
      ## It is possible to append a suffix to peer hostnames returned by Kubernetes using cluster_formation.k8s.hostname_suffix
      cluster_formation.k8s.hostname_suffix = .rabbitmq-internal.rabbitmq.svc.cluster.local
      ## Should RabbitMQ node name be computed from the pod's hostname or IP address?
      ## IP addresses are not stable, so using [stable] hostnames is recommended when possible.
      ## Set to "hostname" to use pod hostnames.
      ## When this value is changed, so should the variable used to set the RABBITMQ_NODENAME
      ## environment variable.
      cluster_formation.k8s.address_type = hostname
      ## How often should node cleanup checks run?
      cluster_formation.node_cleanup.interval = 30
      ## Set to false if automatic removal of unknown/absent nodes
      ## is desired. This can be dangerous, see
      ##  * https://www.rabbitmq.com/cluster-formation.html#node-health-checks-and-cleanup
      ##  * https://groups.google.com/forum/#!msg/rabbitmq-users/wuOfzEywHXo/k8z_HWIkBgAJ
      cluster_formation.node_cleanup.only_log_warning = true
      cluster_partition_handling = autoheal
      ## See https://www.rabbitmq.com/ha.html#master-migration-data-locality
      queue_master_locator=min-masters
      ## This is just an example.
      ## This enables remote access for the default user with well known credentials.
      ## Consider deleting the default user and creating a separate user with a set of generated
      ## credentials instead.
      ## Learn more at https://www.rabbitmq.com/access-control.html#loopback-users
      loopback_users.guest = false
      ## https://www.rabbitmq.com/memory.html#configuring-threshold
      vm_memory_high_watermark.relative = 0.6


      ## On first start RabbitMQ will create a vhost and a user. These
      ## config items control what gets created.
      ## Relevant doc guide: https://rabbitmq.com/access-control.html
      ##
      default_vhost = /
      default_user = system
      default_pass = rbmqu0101081710



      # =======================================
      # MQTT section
      # =======================================

      ## TCP listener settings.
      ##
      # mqtt.listeners.tcp.1 = 127.0.0.1:61613
      # mqtt.listeners.tcp.2 = ::1:61613
      mqtt.listeners.tcp.default = 1883

      ## Set the default user name and password used for anonymous connections (when client
      ## provides no credentials). Anonymous connections are highly discouraged!
      ##
      mqtt.default_user = mqtt_admin
      mqtt.default_pass = rbmqmqtt_07231816

      ## Enable anonymous connections. If this is set to false, clients MUST provide
      ## credentials in order to connect. See also the mqtt.default_user/mqtt.default_pass
      ## keys. Anonymous connections are highly discouraged!
      ##
      mqtt.allow_anonymous = false

      ## If you have multiple vhosts, specify the one to which the
      ## adapter connects.
      ##
      mqtt.vhost = /

      ## Specify the exchange to which messages from MQTT clients are published.
      ##
      mqtt.exchange = exchange_mqtt_topic

      ## Specify TTL (time to live) to control the lifetime of non-clean sessions.
      ##
      mqtt.subscription_ttl = 1800000

      ## Set the prefetch count (governing the maximum number of unacknowledged
      ## messages that will be delivered).
      ##
      mqtt.prefetch = 10

This configures the account, cluster, and MQTT settings (in a quick test, subscribing over MQTT failed).

3. Secret

A Secret is used to pass the Erlang cookie and the default user credentials to the container as environment variables.

Secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: rabbitmq-secret
  namespace: rabbitmq
type: Opaque
data:
  RABBITMQ_ERLANG_COOKIE: MTIzajE5dWVkYXM3ZGFkODEwMjNqMTM5ZGph
  RABBITMQ_DEFAULT_USER: c3lzdGVt
  RABBITMQ_DEFAULT_PASS: cmJtcXUwMTAxMDgxNzEw

The values here must be base64-encoded; Kubernetes decodes them automatically before exposing them as environment variables in the pod.
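
As a minimal sketch of how such values can be produced, the helper below (the name encode_secret_value is made up for this example) encodes a string without a trailing newline, which is the usual cause of unexpected output when echo is used instead of printf.

```shell
# encode_secret_value is a hypothetical helper name, not part of kubectl.
# printf '%s' avoids encoding a trailing newline; tr strips any line wrapping
# that base64 may add to long values.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Value for RABBITMQ_DEFAULT_USER in Secret.yaml.
encode_secret_value 'system'   # prints c3lzdGVt
```

The output goes under data: in Secret.yaml as is. Plain-text values can instead be placed under the Secret's stringData field, which Kubernetes encodes itself.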

4. RBAC

The rabbitmq-peer-discovery plugin needs RBAC permission to read endpoints so that cluster nodes can be discovered automatically.

Rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rabbitmq
  namespace: rabbitmq
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rabbitmq-peer-discovery-rbac
  namespace: rabbitmq
rules:
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rabbitmq-peer-discovery-rbac
  namespace: rabbitmq
subjects:
- kind: ServiceAccount
  name: rabbitmq
  namespace: rabbitmq
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rabbitmq-peer-discovery-rbac

5. Services

Define a headless Service as the service entry point for the StatefulSet, plus a NodePort Service for external access.

Service.yaml
kind: Service
apiVersion: v1
metadata:
  namespace: rabbitmq
  name: rabbitmq-internal
  labels:
    app: rabbitmq
spec:
  clusterIP: None
  ports:
    - name: mqtt
      protocol: TCP
      port: 1883
    - name: epmd
      protocol: TCP
      port: 4369
    - name: amqp
      protocol: TCP
      port: 5672
    - name: amqp-tls
      protocol: TCP
      port: 5671
    - name: http
      protocol: TCP
      port: 15672
    - name: inter-node-cli
      protocol: TCP
      port: 25672
  selector:
    app: rabbitmq
---
kind: Service
apiVersion: v1
metadata:
  namespace: rabbitmq 
  name: rabbitmq
  labels:
    app: rabbitmq
spec:
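  # The nodePort values below are outside the default 30000-32767 range, so the
  # API server's --service-node-port-range must be widened for them to be accepted.
  type: NodePort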
  ports:
    - name: mqtt
      protocol: TCP
      port: 1883
      nodePort: 1883
    - name: amqp
      protocol: TCP
      port: 5672
      nodePort: 5672
    - name: http
      protocol: TCP
      port: 15672
      nodePort: 15672
  selector:
    app: rabbitmq 

6. Persistent volumes

This is where the StatefulSet stores its data.

PersistentVolume.yaml
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rabbit-pv1
  labels:
    type: rabbitmq
spec:
  storageClassName: rabbitmq
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/opt/rabbitmq_data1"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rabbit-pv2
  labels:
    type: rabbitmq
spec:
  storageClassName: rabbitmq
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/opt/rabbitmq_data2"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: rabbit-pv3
  labels:
    type: rabbitmq
spec:
  storageClassName: rabbitmq
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/opt/rabbitmq_data3"

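7. Deploying the stateful application

Following the approach recommended by the official clustering documentation, the cluster is deployed as a StatefulSet; data is stored on persistent volumes claimed through volumeClaimTemplates.

Statefulset.yaml
apiVersion: apps/v1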
# See the Prerequisites section of https://www.rabbitmq.com/cluster-formation.html#peer-discovery-k8s.
kind: StatefulSet
metadata:
  name: rabbitmq
  namespace: rabbitmq
spec:
  serviceName: rabbitmq-internal
  # Three nodes is the recommended minimum. Some features may require a majority of nodes
  # to be available.
  replicas: 3
  selector:
    matchLabels:
      app: rabbitmq
  template:
    metadata:
      labels:
        app: rabbitmq
    spec:
      serviceAccountName: rabbitmq
      terminationGracePeriodSeconds: 10
      nodeSelector:
        # Use Linux nodes in a mixed OS kubernetes cluster.
        # Learn more at https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-os
        kubernetes.io/os: linux
      initContainers:
        - name: fix-readonly-config
          image: busybox:1.31.1
          command:
            - sh
            - -c
            - cp /tmp/config/* /etc/rabbitmq;
          volumeMounts:
            - name: rabbitmq-config
              mountPath: /etc/rabbitmq
            - name: tmp-dir
              mountPath: /tmp/config
      containers:
        - name: rabbitmq
          image: rabbitmq:3.11.11
          # Learn more about what ports various protocols use
          # at https://www.rabbitmq.com/networking.html#ports
          ports:
            - name: mqtt
              protocol: TCP
              containerPort: 1883
            - name: epmd
              protocol: TCP
              containerPort: 4369
            - name: amqp
              protocol: TCP
              containerPort: 5672
            - name: amqp-tls
              protocol: TCP
              containerPort: 5671
            - name: http
              protocol: TCP
              containerPort: 15672
          livenessProbe:
            exec:
              # This is just an example. There is no "one true health check" but rather
              # several rabbitmq-diagnostics commands that can be combined to form increasingly comprehensive
              # and intrusive health checks.
              # Learn more at https://www.rabbitmq.com/monitoring.html#health-checks.
              #
              # Stage 2 check:
              command: ["rabbitmq-diagnostics", "status"]
            initialDelaySeconds: 60
            # See https://www.rabbitmq.com/monitoring.html for monitoring frequency recommendations.
            periodSeconds: 60
            timeoutSeconds: 15
          readinessProbe:
            exec:
              # This is just an example. There is no "one true health check" but rather
              # several rabbitmq-diagnostics commands that can be combined to form increasingly comprehensive
              # and intrusive health checks.
              # Learn more at https://www.rabbitmq.com/monitoring.html#health-checks.
              #
              # Stage 2 check:
              command: ["rabbitmq-diagnostics", "status"]
              # To use a stage 4 check:
              # command: ["rabbitmq-diagnostics", "check_port_connectivity"]
            initialDelaySeconds: 20
            periodSeconds: 60
            timeoutSeconds: 10
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: RABBITMQ_NODENAME
              value: rabbit@$(POD_NAME).rabbitmq-internal.$(POD_NAMESPACE).svc.cluster.local
            - name: RABBITMQ_USE_LONGNAME
              value: "true"
          envFrom:
            - secretRef:
                name: rabbitmq-secret
          volumeMounts:
            - name: rabbitmq-config
              mountPath: /etc/rabbitmq
            - name: rabbitmq-data
              mountPath: /var/lib/rabbitmq
      volumes:
        - name: rabbitmq-config
          emptyDir: {}
        - name: tmp-dir
          configMap:
            name: rabbitmq-config
  volumeClaimTemplates:
    - metadata:
        name: rabbitmq-data
        namespace: rabbitmq
        labels:
          app: rabbitmq
      spec:
        accessModes:
          - "ReadWriteOnce"
        storageClassName: rabbitmq
        resources:
          requests:
            storage: 10Gi
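
The RABBITMQ_NODENAME entry above is assembled from the pod name, the headless Service name, and the namespace, and the result has to agree with cluster_formation.k8s.hostname_suffix in rabbitmq.conf. The sketch below only reproduces that string composition so the expected node names can be checked; node_name is a hypothetical helper, and its defaults are the names used in this article.

```shell
# node_name POD_NAME [SERVICE] [NAMESPACE]
# Hypothetical helper: builds the node name the same way the
# RABBITMQ_NODENAME env entry of the StatefulSet does.
node_name() {
  pod_name=$1
  service=${2:-rabbitmq-internal}
  namespace=${3:-rabbitmq}
  printf 'rabbit@%s.%s.%s.svc.cluster.local\n' "$pod_name" "$service" "$namespace"
}

# Node names for the three replicas of the StatefulSet.
for i in 0 1 2; do
  node_name "rabbitmq-$i"
done
```

If the Service or namespace is renamed, both the env entry and hostname_suffix need the same change, otherwise the names returned by peer discovery will not match the nodes' own names.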

8. Deployment

sudo kubectl create -f Namespace.yaml
sudo kubectl create -f PersistentVolume.yaml
sudo kubectl create -f Rbac.yaml
sudo kubectl create -f Secret.yaml
sudo kubectl create -f Config.yaml
sudo kubectl create -f Service.yaml
sudo kubectl create -f Statefulset.yaml

9. Check the pods

qiteck@server:~/program/rabbitmq/3.11.11/k8s$ sudo kubectl get pods -n rabbitmq -o wide
NAME         READY   STATUS    RESTARTS   AGE   IP             NODE     NOMINATED NODE   READINESS GATES
rabbitmq-0   1/1     Running   0          32m   10.244.1.133   server   <none>           <none>
rabbitmq-1   1/1     Running   0          31m   10.244.1.134   server   <none>           <none>
rabbitmq-2   1/1     Running   0          30m   10.244.1.135   server   <none>           <none>

10. Check the services

qiteck@server:~/program/rabbitmq/3.11.11/k8s$ sudo kubectl get service -n rabbitmq -o wide
NAME                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                                   AGE   SELECTOR
rabbitmq            NodePort    10.96.172.247   <none>        1883:1883/TCP,5672:5672/TCP,15672:15672/TCP               32m   app=rabbitmq
rabbitmq-internal   ClusterIP   None            <none>        1883/TCP,4369/TCP,5672/TCP,5671/TCP,15672/TCP,25672/TCP   32m   app=rabbitmq

11. Check the management UI

12. Check the cluster

12.1. rabbitmqctl cluster_status

    root@rabbitmq-0:/# rabbitmqctl cluster_status
    Cluster status of node rabbit@rabbitmq-0.rabbitmq-internal.rabbitmq.svc.cluster.local ...
    Basics
    
    Cluster name: rabbit@rabbitmq-0.rabbitmq-internal.rabbitmq.svc.cluster.local
    Total CPU cores available cluster-wide: 6
    
    Disk Nodes
    
    rabbit@rabbitmq-0.rabbitmq-internal.rabbitmq.svc.cluster.local
    rabbit@rabbitmq-1.rabbitmq-internal.rabbitmq.svc.cluster.local
    rabbit@rabbitmq-2.rabbitmq-internal.rabbitmq.svc.cluster.local
    
    Running Nodes
    
    rabbit@rabbitmq-0.rabbitmq-internal.rabbitmq.svc.cluster.local
    rabbit@rabbitmq-1.rabbitmq-internal.rabbitmq.svc.cluster.local
    rabbit@rabbitmq-2.rabbitmq-internal.rabbitmq.svc.cluster.local
    
    Versions
    
    rabbit@rabbitmq-0.rabbitmq-internal.rabbitmq.svc.cluster.local: RabbitMQ 3.11.11 on Erlang 25.3
    rabbit@rabbitmq-1.rabbitmq-internal.rabbitmq.svc.cluster.local: RabbitMQ 3.11.11 on Erlang 25.3
    rabbit@rabbitmq-2.rabbitmq-internal.rabbitmq.svc.cluster.local: RabbitMQ 3.11.11 on Erlang 25.3
    
    CPU Cores
    
    Node: rabbit@rabbitmq-0.rabbitmq-internal.rabbitmq.svc.cluster.local, available CPU cores: 2
    Node: rabbit@rabbitmq-1.rabbitmq-internal.rabbitmq.svc.cluster.local, available CPU cores: 2
    Node: rabbit@rabbitmq-2.rabbitmq-internal.rabbitmq.svc.cluster.local, available CPU cores: 2
    
    Maintenance status
    
    Node: rabbit@rabbitmq-0.rabbitmq-internal.rabbitmq.svc.cluster.local, status: not under maintenance
    Node: rabbit@rabbitmq-1.rabbitmq-internal.rabbitmq.svc.cluster.local, status: not under maintenance
    Node: rabbit@rabbitmq-2.rabbitmq-internal.rabbitmq.svc.cluster.local, status: not under maintenance

    Feature flags
    
    Flag: classic_mirrored_queue_version, state: enabled
    Flag: classic_queue_type_delivery_support, state: enabled
    Flag: direct_exchange_routing_v2, state: enabled
    Flag: drop_unroutable_metric, state: enabled
    Flag: empty_basic_get_metric, state: enabled
    Flag: feature_flags_v2, state: enabled
    Flag: implicit_default_bindings, state: enabled
    Flag: listener_records_in_ets, state: enabled
    Flag: maintenance_mode_status, state: enabled
    Flag: quorum_queue, state: enabled
    Flag: stream_queue, state: enabled
    Flag: stream_single_active_consumer, state: enabled
    Flag: tracking_records_in_ets, state: enabled
    Flag: user_limits, state: enabled
    Flag: virtual_host_metadata, state: enabled

12.2. Check the startup log

    root@rabbitmq-0:/# tail -f /var/log/rabbitmq/rabbit\@rabbitmq-0.rabbitmq-internal.rabbitmq.svc.cluster.local.log
    2023-03-31 08:33:03.820636+00:00 [info] <0.724.0> Server startup complete; 5 plugins started.
    2023-03-31 08:33:03.820636+00:00 [info] <0.724.0>  * rabbitmq_peer_discovery_k8s
    2023-03-31 08:33:03.820636+00:00 [info] <0.724.0>  * rabbitmq_peer_discovery_common
    2023-03-31 08:33:03.820636+00:00 [info] <0.724.0>  * rabbitmq_management
    2023-03-31 08:33:03.820636+00:00 [info] <0.724.0>  * rabbitmq_web_dispatch
    2023-03-31 08:33:03.820636+00:00 [info] <0.724.0>  * rabbitmq_management_agent
    2023-03-31 08:34:03.957323+00:00 [info] <0.638.0> node 'rabbit@rabbitmq-1.rabbitmq-internal.rabbitmq.svc.cluster.local' up
    2023-03-31 08:34:07.136896+00:00 [info] <0.638.0> rabbit on node 'rabbit@rabbitmq-1.rabbitmq-internal.rabbitmq.svc.cluster.local' up
    2023-03-31 08:35:07.762592+00:00 [info] <0.638.0> node 'rabbit@rabbitmq-2.rabbitmq-internal.rabbitmq.svc.cluster.local' up
    2023-03-31 08:35:10.314678+00:00 [info] <0.638.0> rabbit on node 'rabbit@rabbitmq-2.rabbitmq-internal.rabbitmq.svc.cluster.local' up

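13. Problems encountered

13.1. this command requires the 'rabbit' app to be running on the target node. Start it with 'rabbitmqctl start_app'

  Many commands cannot be run until the service has come up.

  The Kubernetes PostStart hook runs immediately after the container is created, at which point rabbit has not started yet.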
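
One way around this is to poll until the rabbit app answers before running anything else. The following is a rough sketch rather than the setup used in this deployment: retry_until_ok is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary.

```shell
# retry_until_ok MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it exits 0, at most MAX_ATTEMPTS times,
# sleeping DELAY_SECONDS between attempts. Returns 1 if every attempt failed.
retry_until_ok() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    [ "$attempt" -ge "$max_attempts" ] && return 1
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Inside the container, wait for the rabbit app before issuing other commands, e.g.:
#   retry_until_ok 30 5 rabbitmq-diagnostics check_running && rabbitmqctl list_users
```

rabbitmq-diagnostics check_running fails while the RabbitMQ application is not running on the node, so it works as the probe command here.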

13.2. PVC error: storageclass.storage.k8s.io "rabbitmq" not found: the access mode and storage class must match

  The accessModes and storageClassName of the PersistentVolume must match those in the volumeClaimTemplates of the StatefulSet.

13.3. Error: secret "rabbitmq-secret" not found

  The Secret does not exist; creating it fixes the problem.

13.4. Secret in version "v1" cannot be handled as a Secret: illegal base64 data at input byte 12

  The values in the Secret must be valid base64.

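13.5. Feature flags: `maintenance_mode_status`: required feature flag not enabled!  It must be enabled before upgrading RabbitMQ.

  This turned out to be caused by data left in the old PersistentVolumes after the cluster was reconfigured. When the cluster is restarted after such a change, the PersistentVolume data must be cleared completely.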

13.6. Node 'rabbit@rabbitmq-1.rabbitmq-internal.rabbitmq.svc.cluster.local' thinks it's clustered with node 'rabbit@rabbitmq-0.rabbitmq-internal.rabbitmq.svc.cluster.local', but 'rabbit@rabbitmq-0.rabbitmq-internal.rabbitmq.svc.cluster.local' disagrees

  Same cause as 13.5: data left in the old PersistentVolumes after the cluster was reconfigured. When the cluster is restarted after such a change, the PersistentVolume data must be cleared completely.
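
For 13.5 and 13.6, the cleanup could look roughly like the sketch below. It assumes the resource names and hostPath directories from this article; run and reset_rabbitmq_storage are hypothetical helper names. By default it only prints the commands (DRY_RUN=1), because actually running them deletes all RabbitMQ data.

```shell
# run: print the command when DRY_RUN=1 (the default here), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# WARNING: with DRY_RUN=0 this wipes the cluster's storage.
reset_rabbitmq_storage() {
  run sudo kubectl delete statefulset rabbitmq -n rabbitmq
  # PVCs created from volumeClaimTemplates are named <template>-<statefulset>-<ordinal>.
  for i in 0 1 2; do
    run sudo kubectl delete pvc "rabbitmq-data-rabbitmq-$i" -n rabbitmq
  done
  # Manually created PVs stay in the Released state after their claim is deleted,
  # so they are deleted here and re-created from PersistentVolume.yaml afterwards.
  run sudo kubectl delete -f PersistentVolume.yaml
  # hostPath data lives on the node itself; this part must run on that node.
  for i in 1 2 3; do
    run sudo find "/opt/rabbitmq_data$i" -mindepth 1 -delete
  done
}

reset_rabbitmq_storage
```

After reviewing the printed commands, DRY_RUN=0 reset_rabbitmq_storage executes them; the manifests from step 8 can then be created again.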

posted @ 2023-03-31 18:36  若-飞