#Case study: gradually migrating production to a k8s cluster - registering pods in consul
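#Project background
- Several business systems are in use, and every node registers with the consul cluster for unified management
- consul's DNS feature is enabled, so every node's hostname resolves and can be pinged
- consul health checks are enabled, and a node is added to a service only after its health check passes
- Some services previously called each other directly through the consul service address, i.e. service-name.service.datacenter.consul
- prometheus monitoring uses consul-template to add nodes automatically
- The runtime environment is Alibaba Cloud, where k8s container IPs and cloud host IPs can reach each other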
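The service address format in the list above can be sketched in a few lines of Python. This is only an illustration: the helper names `consul_service_fqdn` and `resolve` are made up here, and `resolve` only returns something useful on a host whose DNS is forwarded to consul, which is an assumption about the environment.

```python
import socket


def consul_service_fqdn(service: str, datacenter: str, domain: str = "consul") -> str:
    """Build the consul DNS name <service>.service.<datacenter>.<domain>."""
    return f"{service}.service.{datacenter}.{domain}"


def resolve(name: str) -> list[str]:
    """Return the IPv4 addresses the local resolver gives for `name`.

    Assumes the host's DNS forwards the .consul domain to a consul agent.
    """
    infos = socket.getaddrinfo(name, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


# Service and datacenter names as used in this article's demo.
print(consul_service_fqdn("qa-consul-demo", "qa"))  # qa-consul-demo.service.qa.consul
```

Because only instances with a passing health check are returned by consul DNS, a caller that resolves this name should only see healthy pods and hosts.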
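#1.1 Problem to solve
- After some services move to the k8s cluster, services outside the cluster need to connect directly to pod IPs
#1.2 Solution
- Add a consul-agent container to each pod so that the pod registers with the consul cluster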
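#2.1 New problems caused by registering pods in consul
- When a pod exits or is deleted, the consul cluster should remove it
- The prometheus monitoring template rendered by consul-template needs to exclude pods
#2.2 Solution
- The consul container uses a preStop hook to run consul leave before exiting, so it leaves the consul cluster on its own
- consul-template excludes pods
  - Pods register with the consul cluster under a node-name prefix such as k8s-
  - consul-template uses regexMatch to ignore nodes whose names start with k8s-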
#Demo
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: consul-demo-config
  namespace: default
data:
  consul.json: |-
    {
      "datacenter": "qa",
      "acl_datacenter": "qa",
      "data_dir": "/tmp/consul",
      "bind_addr": "0.0.0.0",
      "client_addr": "0.0.0.0",
      "start_join": ["10.10.100.100"],
      "retry_join": ["10.10.100.100"],
      "retry_interval": "5s",
      "disable_host_node_id": true,
      "enable_script_checks": true,
      "disable_update_check": true,
      "leave_on_terminate": true,
      "log_level": "WARN",
      "server": false,
      "service": {
        "name": "qa-consul-demo",
        "port": 80,
        "tags": ["k8s", "qa", "consul-demo"],
        "checks": [
          {
            "id": "consul-demo-HealthCheck",
            "name": "Health Check",
            "notes": "Health Check",
            "args": [ "sh", "-c", "[ $(curl -s 127.0.0.1 -I |grep 'nginx' |wc -l) -eq 1 ] && { echo 'Health check successful'; exit 0 ; } || { echo 'check error' ; exit 2 ; }" ],
            "interval": "10s"
          }
        ]
      }
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: consul-demo
  namespace: default
spec:
  selector:
    matchLabels:
      app: consul-demo
  replicas: 2
  template:
    metadata:
      labels:
        app: consul-demo
    spec:
      imagePullSecrets:
      - name: docker-image-key
      containers:
      - name: consul-agent  # sidecar that registers the pod with the consul cluster
        image: consul:1.0.8
        imagePullPolicy: IfNotPresent
        command:
        - sh
        - -c
        - |
          consul agent -config-dir=/opt/consul -node=k8s-qa-$HOSTNAME -rejoin
        lifecycle:
          preStop:  # leave the consul cluster before the container is stopped
            exec:
              command:
              - sh
              - -c
              - |
                consul leave
        resources:
          requests:
            cpu: 10m
            memory: 16Mi
          limits:
            cpu: 50m
            memory: 32Mi
        readinessProbe:
          tcpSocket:
            port: 8500
        livenessProbe:
          tcpSocket:
            port: 8500
        volumeMounts:  # consul.json lands in the directory passed to -config-dir
        - name: consul-config
          mountPath: "/opt/consul"
      - name: nginx-node
        image: alivv/nginx:node
        imagePullPolicy: IfNotPresent
      volumes:
      - name: consul-config
        configMap:
          name: consul-demo-config
          items:
          - key: consul.json
            path: consul.json
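Once the Deployment is running, one way to check that the pods really joined is to list the catalog nodes and pick out the ones with the k8s- prefix. The Python below is a rough sketch, not part of the original setup: the function names, the `consul_url` argument, and the sample node names and addresses are all hypothetical. It relies on the consul HTTP endpoint /v1/catalog/nodes, which returns a JSON list of objects with `Node` and `Address` fields; with ACLs enabled a token may also be required, which is not handled here.

```python
import json
import urllib.request


def pod_nodes(catalog_nodes: list[dict], prefix: str = "k8s-") -> list[tuple[str, str]]:
    """Pick (node name, address) pairs whose node name carries the pod prefix."""
    return sorted(
        (n["Node"], n["Address"]) for n in catalog_nodes if n["Node"].startswith(prefix)
    )


def fetch_catalog_nodes(consul_url: str) -> list[dict]:
    """GET /v1/catalog/nodes from a consul agent; consul_url is a placeholder."""
    with urllib.request.urlopen(f"{consul_url}/v1/catalog/nodes", timeout=5) as resp:
        return json.load(resp)


# Hypothetical sample shaped like a /v1/catalog/nodes response.
sample = [
    {"Node": "k8s-qa-consul-demo-abc", "Address": "172.20.1.15"},
    {"Node": "qa-web-01", "Address": "10.10.100.21"},
]
print(pod_nodes(sample))  # [('k8s-qa-consul-demo-abc', '172.20.1.15')]
```

Against a live cluster the call would look like `pod_nodes(fetch_catalog_nodes("http://127.0.0.1:8500"))`, assuming an agent is reachable at that address. After a pod is deleted, its node should disappear from this list once consul leave has run.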
The consul-template monitoring template is as follows
  - job_name: 'node'
    static_configs:
    {{range nodes}}
      - targets: ['{{.Node}}:9100']
        labels:
          instance: {{.Node}}{{end}}
After the change it looks like this, using regexMatch to exclude node names that start with k8s-
  - job_name: 'node'
    static_configs:
    {{range nodes}}{{if .Node | regexMatch "^k8s-.*" }}{{else}}
      - targets: ['{{.Node}}:9100']
        labels:
          instance: {{.Node}}{{end}}{{end}}
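To make the effect of the regexMatch filter concrete, here is a small Python imitation of what the modified template renders. It is a sketch of the filtering logic only, not consul-template itself; `render_node_job` and the node names passed to it are invented for the example.

```python
import re

# Same pattern idea as the template: node names that start with k8s- are pods.
POD_NODE = re.compile(r"^k8s-")


def render_node_job(nodes: list[str], port: int = 9100) -> str:
    """Render the 'node' job, skipping node names that match ^k8s-."""
    lines = ["- job_name: 'node'", "  static_configs:"]
    for node in nodes:
        if POD_NODE.search(node):
            continue  # pod registered through the sidecar agent; not scraped here
        lines.append(f"    - targets: ['{node}:{port}']")
        lines.append("      labels:")
        lines.append(f"        instance: {node}")
    return "\n".join(lines)


# Hypothetical node names: one regular host and one pod.
print(render_node_job(["qa-web-01", "k8s-qa-consul-demo-abc"]))
```

Only qa-web-01 ends up as a scrape target. Because the pattern is anchored with ^, a host whose name merely contains k8s- somewhere in the middle is still kept.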
This article is from cnblogs (博客园), author: blog-elvin-vip. When reposting, please cite the original link: https://www.cnblogs.com/elvi/p/16732694.html