Chapter 5 Deploying the master control-plane nodes
1. Deploy the etcd cluster
1.1 Cluster plan
Hostname   Role       IP
hdss7-12   leader     10.4.7.12
hdss7-21   follower   10.4.7.21
hdss7-22   follower   10.4.7.22
This walkthrough uses 10.4.7.12 as the example; the other two hosts are installed the same way.
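1.2 Create the certificate signing configuration
Run these steps on 10.4.7.200.
Create the signing config file that is used together with the root certificate: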
[root@hdss7-200 ~]# vim /opt/certs/ca-config.json
{
"signing": {
"default": {
"expiry": "175200h"
},
"profiles": {
"server": {
"expiry": "175200h",
"usages": [
"signing",
"key encipherment",
"server auth"
]
},
"client": {
"expiry": "175200h",
"usages": [
"signing",
"key encipherment",
"client auth"
]
},
"peer": {
"expiry": "175200h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
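Notes:
Server certificate: used by the server; clients use it to verify the server's identity, for example the docker daemon or kube-apiserver.
Client certificate: used by the client so that the server can verify the client's identity, for example etcdctl, etcd-proxy, fleetctl or the docker client.
Peer certificate: a two-way certificate used for communication between the members of the etcd cluster.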
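The "expiry" value of 175200h in ca-config.json is easier to judge when converted to years. The helper below is a small sketch for that conversion; the function name expiry_to_years is made up for this example, and it assumes a year of 365 days (8760 hours).

```shell
# Convert a cfssl "expiry" value such as "175200h" into whole years.
# Assumption: one year is counted as 365 days, i.e. 8760 hours.
expiry_to_years() {
  local hours=${1%h}        # strip the trailing "h"
  echo $(( hours / 8760 ))
}

expiry_to_years 175200h     # prints 20
```

So the certificates signed with these profiles stay valid for roughly 20 years.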
1.3 Create the JSON config for the certificate signing request (CSR)
[root@hdss7-200 ~]# vim /opt/certs/etcd-peer-csr.json
{
"CN": "k8s-etcd",
"hosts": [
"10.4.7.11",
"10.4.7.12",
"10.4.7.21",
"10.4.7.22"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "beijing",
"L": "beijing",
"O": "od",
"OU": "ops"
}
]
}
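Note: the important part is hosts. List every server that might run etcd; network ranges are not allowed, and adding a new etcd server later means the certificate has to be issued again.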
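Because the hosts list is a fixed set of addresses, it can help to check a planned node against it before deciding whether the certificate has to be reissued. The following is a minimal sketch; ip_in_hosts is a hypothetical helper, and the host list is simply copied from etcd-peer-csr.json.

```shell
# Return 0 if the IP given as $1 appears among the remaining arguments
# (the addresses from the "hosts" list of the CSR file).
ip_in_hosts() {
  local ip=$1 h
  shift
  for h in "$@"; do
    if [ "$h" = "$ip" ]; then
      return 0
    fi
  done
  return 1
}

hosts="10.4.7.11 10.4.7.12 10.4.7.21 10.4.7.22"
for node in 10.4.7.12 10.4.7.23; do
  if ip_in_hosts "$node" $hosts; then
    echo "$node: covered by the certificate"
  else
    echo "$node: not covered, reissue the certificate"
  fi
done
```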
1.4 Issue the certificate
[root@hdss7-200 harbor]# cd /opt/certs/
[root@hdss7-200 certs]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer etcd-peer-csr.json |cfssl-json -bare etcd-peer
[root@hdss7-200 certs]# ls etcd*
etcd-peer.csr etcd-peer-csr.json etcd-peer-key.pem etcd-peer.pem
1.5 Install etcd
Deploy on 10.4.7.12, 10.4.7.21 and 10.4.7.22; 10.4.7.12 is used as the example.
[root@hdss7-12 ~]# mkdir /opt/src
[root@hdss7-12 ~]# cd /opt/src/
[root@hdss7-12 src]# curl -L https://github.com/coreos/etcd/releases/download/v3.3.1/etcd-v3.3.1-linux-amd64.tar.gz -o etcd-v3.3.1-linux-amd64.tar.gz
[root@hdss7-12 src]# tar -zxvf etcd-v3.3.1-linux-amd64.tar.gz -C /opt
[root@hdss7-12 src]# mv /opt/etcd-v3.3.1-linux-amd64 /opt/etcd-v3.3.1
[root@hdss7-12 src]# ln -s /opt/etcd-v3.3.1 /opt/etcd
[root@hdss7-12 src]# ll /opt/
total 8
lrwxrwxrwx 1 root root 16 Jun 8 20:36 etcd -> /opt/etcd-v3.3.1
drwxr-xr-x 5 etcd etcd 4096 Jul 9 20:40 etcd-v3.3.1
drwxr-xr-x 2 root root 4096 Jul 9 20:36 src
Create the directories:
[root@hdss7-12 src]# mkdir -p /opt/etcd/certs /data/etcd /data/logs/etcd-server
1.6 Distribute the certificates
Copy the certificates generated on 10.4.7.200 to the etcd cluster servers:
[root@hdss7-200 certs]# for i in 12 21 22;do scp ca.pem etcd-peer.pem etcd-peer-key.pem hdss7-${i}:/opt/etcd/certs/ ;done
1.7 Create the etcd startup script
[root@hdss7-12 src]# vim /opt/etcd/etcd-server-startup.sh
#!/bin/sh
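# listen-peer-urls: address and port for traffic between etcd members
# listen-client-urls: address and port for traffic between clients and etcd
# quota-backend-bytes: backend storage quota
# Parameters to change on each node: name, listen-peer-urls, listen-client-urls, initial-advertise-peer-urls, advertise-client-urls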
WORK_DIR=$(dirname $(readlink -f $0))
[ $? -eq 0 ] && cd $WORK_DIR || exit
/opt/etcd/etcd --name etcd-server-7-12 \
--data-dir /data/etcd/etcd-server \
--listen-peer-urls https://10.4.7.12:2380 \
--listen-client-urls https://10.4.7.12:2379,http://127.0.0.1:2379 \
--quota-backend-bytes 8000000000 \
--initial-advertise-peer-urls https://10.4.7.12:2380 \
--advertise-client-urls https://10.4.7.12:2379,http://127.0.0.1:2379 \
--initial-cluster etcd-server-7-12=https://10.4.7.12:2380,etcd-server-7-21=https://10.4.7.21:2380,etcd-server-7-22=https://10.4.7.22:2380 \
--ca-file ./certs/ca.pem \
--cert-file ./certs/etcd-peer.pem \
--key-file ./certs/etcd-peer-key.pem \
--client-cert-auth \
--trusted-ca-file ./certs/ca.pem \
--peer-ca-file ./certs/ca.pem \
--peer-cert-file ./certs/etcd-peer.pem \
--peer-key-file ./certs/etcd-peer-key.pem \
--peer-client-cert-auth \
--peer-trusted-ca-file ./certs/ca.pem \
--log-output stdout
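[root@hdss7-12 src]# chmod +x /opt/etcd/etcd-server-startup.sh
Create the etcd user if it does not exist yet, then hand the directories over to it:
[root@hdss7-12 src]# useradd -s /sbin/nologin -M etcd
[root@hdss7-12 src]# chown -R etcd:etcd /opt/etcd/ /data/etcd /data/logs/etcd-server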
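Five flags in the startup script differ from node to node, and forgetting one of them when copying the script is an easy mistake. The sketch below prints those flags for a given node IP so they can be compared with the script on that host. The function name etcd_node_flags is hypothetical, and it assumes the member naming pattern etcd-server-<3rd octet>-<4th octet> used in this chapter.

```shell
# Print the etcd flags that must be adapted per node, given the node's IP.
# Assumption: member names look like etcd-server-7-12 for 10.4.7.12.
etcd_node_flags() {
  local ip=$1 rest c d
  d=${ip##*.}        # 4th octet
  rest=${ip%.*}      # first three octets
  c=${rest##*.}      # 3rd octet
  cat <<EOF
--name etcd-server-${c}-${d}
--listen-peer-urls https://${ip}:2380
--listen-client-urls https://${ip}:2379,http://127.0.0.1:2379
--initial-advertise-peer-urls https://${ip}:2380
--advertise-client-urls https://${ip}:2379,http://127.0.0.1:2379
EOF
}

etcd_node_flags 10.4.7.21
```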
1.8 Install the supervisor process manager
[root@hdss7-12 src]# yum -y install supervisor
[root@hdss7-12 src]# systemctl start supervisord ; systemctl enable supervisord
[root@hdss7-12 src]# vim /etc/supervisord.d/etcd-server.ini
[program:etcd-server-7-12]
command=/opt/etcd/etcd-server-startup.sh ; the program (relative uses PATH, can take args)
numprocs=1 ; number of processes copies to start (def 1)
directory=/opt/etcd ; directory to cwd to before exec (def no cwd)
autostart=true ; start at supervisord start (default: true)
autorestart=true ; restart at unexpected quit (default: true)
startsecs=30 ; number of secs prog must stay running (def. 1)
startretries=3 ; max # of serial start failures (default 3)
exitcodes=0,2 ; 'expected' exit codes for process (default 0,2)
stopsignal=QUIT ; signal used to kill process (default TERM)
stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10)
user=etcd ; setuid to this UNIX account to run the program
redirect_stderr=true ; redirect proc stderr to stdout (default false)
stdout_logfile=/data/logs/etcd-server/etcd.stdout.log ; stdout log path, NONE for none; default AUTO
stdout_logfile_maxbytes=64MB ; max # logfile bytes b4 rotation (default 50MB)
stdout_logfile_backups=5 ; # of stdout logfile backups (default 10)
stdout_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0)
stdout_events_enabled=false ; emit events on stdout writes (default false)
[root@hdss7-12 src]# supervisorctl update
etcd-server-7-12: added process group
[root@hdss7-12 src]# supervisorctl start etcd-server-7-12
etcd-server-7-12: started
1.9 Check the cluster service status
[root@hdss7-12 src]# supervisorctl status
etcd-server-7-12 RUNNING pid 2270, uptime 0:00:38
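Check that etcd is listening on ports 2379 and 2380 (for example with netstat -lntp | grep etcd).
Check the cluster status. The output below is what appears before enough members have been started: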
[root@hdss7-12 src]# /opt/etcd/etcdctl member list
client: etcd cluster is unavailable or misconfigured; error #0: client: endpoint http://127.0.0.1:2379 exceeded header timeout
; error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused
With two nodes up:
[root@hdss7-12 src]# /opt/etcd/etcdctl member list
988139385f78284: name=etcd-server-7-22 peerURLs=https://10.4.7.22:2380 clientURLs= isLeader=false
5a0ef2a004fc4349: name=etcd-server-7-21 peerURLs=https://10.4.7.21:2380 clientURLs=http://127.0.0.1:2379,https://10.4.7.12:2379 isLeader=false
f4a0cb0a765574a8: name=etcd-server-7-12 peerURLs=https://10.4.7.12:2380 clientURLs=http://127.0.0.1:2379,https://10.4.7.12:2379 isLeader=true
With all nodes up:
[root@hdss7-12 src]# /opt/etcd/etcdctl member list
988139385f78284: name=etcd-server-7-22 peerURLs=https://10.4.7.22:2380 clientURLs=http://127.0.0.1:2379,https://10.4.7.12:2379 isLeader=false
5a0ef2a004fc4349: name=etcd-server-7-21 peerURLs=https://10.4.7.21:2380 clientURLs=http://127.0.0.1:2379,https://10.4.7.12:2379 isLeader=false
f4a0cb0a765574a8: name=etcd-server-7-12 peerURLs=https://10.4.7.12:2380 clientURLs=http://127.0.0.1:2379,https://10.4.7.12:2379 isLeader=true
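The leader can change whenever the etcd services are restarted.
Note: in the sample output above, the members on 21 and 22 also show https://10.4.7.12:2379 in clientURLs. That is what happens when --advertise-client-urls is copied from hdss7-12 without being changed; on each node it should contain that node's own IP.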
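Since the leader moves around, it is handy to extract it from the member list instead of reading the output by eye. This is a small sketch that assumes the v2-style output format shown above (name=... isLeader=true); etcd_leader_name is a name chosen for the example.

```shell
# Read `etcdctl member list` output (v2 API format) on stdin and print
# the name of the member that reports isLeader=true.
etcd_leader_name() {
  awk '/isLeader=true/ {
         for (i = 1; i <= NF; i++)
           if ($i ~ /^name=/) { sub(/^name=/, "", $i); print $i }
       }'
}

# Usage on an etcd node:
#   /opt/etcd/etcdctl member list | etcd_leader_name
```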
Check the health status:
[root@hdss7-12 src]# /opt/etcd/etcdctl cluster-health
member 988139385f78284 is healthy: got healthy result from http://127.0.0.1:2379
member 5a0ef2a004fc4349 is healthy: got healthy result from http://127.0.0.1:2379
member f4a0cb0a765574a8 is healthy: got healthy result from http://127.0.0.1:2379
cluster is healthy
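For scripted checks, the health output can be reduced to a pass/fail result. The sketch below counts the "is healthy" lines and compares them with the expected cluster size; etcd_health_ok is a hypothetical helper and it relies on the output wording shown above.

```shell
# Read `etcdctl cluster-health` output on stdin and succeed only if the
# number of members reported as healthy equals the expected count ($1).
etcd_health_ok() {
  local expected=$1 healthy
  healthy=$(grep -c '^member .* is healthy' || true)
  [ "$healthy" -eq "$expected" ]
}

# Usage on an etcd node:
#   /opt/etcd/etcdctl cluster-health | etcd_health_ok 3 && echo "all members healthy"
```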
Starting and stopping:
[root@hdss7-12 src]# supervisorctl stop etcd-server-7-12
[root@hdss7-12 src]# supervisorctl start etcd-server-7-12
[root@hdss7-12 src]# supervisorctl restart etcd-server-7-12
[root@hdss7-12 src]# supervisorctl status etcd-server-7-12
2. Deploy the kube-apiserver cluster
2.1 Cluster plan
The deployment uses 10.4.7.21 as the example; 10.4.7.22 is deployed the same way.
Hostname   Role             IP
hdss7-21   kube-apiserver   10.4.7.21
hdss7-22   kube-apiserver   10.4.7.22
2.2 Download the Kubernetes server binaries
Servers that run the apiserver: 10.4.7.21, 10.4.7.22
Download the Kubernetes binary release package:
Open the Kubernetes GitHub page: https://github.com/kubernetes/kubernetes
Open the Tags tab: https://github.com/kubernetes/kubernetes/tags
Pick the version to download: https://github.com/kubernetes/kubernetes/releases/tag/v1.15.2
Click CHANGELOG-${version}.md to open the release notes:
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.15.md#downloads-for-v1152
Download the Server Binaries: https://dl.k8s.io/v1.15.2/kubernetes-server-linux-amd64.tar.gz
[root@hdss7-21 ~]# cd /opt/src
[root@hdss7-21 src]# wget https://dl.k8s.io/v1.15.2/kubernetes-server-linux-amd64.tar.gz
[root@hdss7-21 src]# tar -zxvf kubernetes-server-linux-amd64.tar.gz
[root@hdss7-21 src]# mv kubernetes /opt/kubernetes-v1.15.2
Create a symlink:
[root@hdss7-21 src]# ln -s /opt/kubernetes-v1.15.2 /opt/kubernetes
[root@hdss7-21 src]# ll /opt/kubernetes
lrwxrwxrwx 1 root root 23 Jun 8 21:17 /opt/kubernetes -> /opt/kubernetes-v1.15.
[root@hdss7-21 src]# cd /opt/kubernetes
Delete the source archive:
[root@hdss7-21 kubernetes]# rm -f kubernetes-src.tar.gz
Delete the docker image files and keep only the executables:
[root@hdss7-21 kubernetes]# cd server/bin/
[root@hdss7-21 bin]# rm -f *.tar *_tag
[root@hdss7-21 bin]# ll
total 903324
-rwxr-xr-x 1 root root 42809984 Dec 11 2019 apiextensions-apiserver
-rwxr-xr-x 1 root root 100001376 Dec 11 2019 cloud-controller-manager
-rwxr-xr-x 1 root root 211376176 Dec 11 2019 hyperkube
-rwxr-xr-x 1 root root 39603488 Dec 11 2019 kubeadm
-rwxr-xr-x 1 root root 167161088 Dec 11 2019 kube-apiserver
-rwxr-xr-x 1 root root 115177920 Dec 11 2019 kube-controller-manager
-rwxr-xr-x 1 root root 43119424 Dec 11 2019 kubectl
-rwxr-xr-x 1 root root 128116560 Dec 11 2019 kubelet
-rwxr-xr-x 1 root root 36697728 Dec 11 2019 kube-proxy
-rwxr-xr-x 1 root root 39266496 Dec 11 2019 kube-scheduler
-rwxr-xr-x 1 root root 1648224 Dec 11 2019 mounter
Create the certificate directory:
[root@hdss7-21 bin]# mkdir certs
2.3 Generate and issue the certificates on 10.4.7.200
Issue the client certificate, which the apiserver uses to communicate with etcd:
[root@hdss7-200 ~]# cd /opt/certs/
[root@hdss7-200 certs]# vim /opt/certs/client-csr.json
{
"CN": "k8s-node",
"hosts": [
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "beijing",
"L": "beijing",
"O": "od",
"OU": "ops"
}
]
}
[root@hdss7-200 certs]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client-csr.json |cfssl-json -bare client
[root@hdss7-200 certs]# ls client*
client.csr client-csr.json client-key.pem client.pem
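Issue the server certificate
This certificate is used for communication between the apiserver and the other Kubernetes components.
Add every IP that might act as an apiserver to hosts, including the VIP 10.4.7.10.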
[root@hdss7-200 certs]# vim /opt/certs/apiserver-csr.json
{
"CN": "k8s-apiserver",
"hosts": [
"127.0.0.1",
"192.168.0.1",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local",
"10.4.7.10",
"10.4.7.21",
"10.4.7.22",
"10.4.7.23"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "beijing",
"L": "beijing",
"O": "od",
"OU": "ops"
}
]
}
[root@hdss7-200 certs]# cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server apiserver-csr.json |cfssl-json -bare apiserver
[root@hdss7-200 certs]# ls apiserver*
apiserver.csr apiserver-csr.json apiserver-key.pem apiserver.pem
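The hosts list above contains 192.168.0.1 because that is the first address of the 192.168.0.0/16 range passed to --service-cluster-ip-range in the apiserver startup script, and Kubernetes assigns that first address to the built-in kubernetes Service. If a different service range is chosen, the address in the CSR has to change with it. The sketch below computes the first address of an IPv4 CIDR; first_ip_in_cidr is a name made up for this example.

```shell
# Print the first usable address of an IPv4 CIDR (network address + 1).
first_ip_in_cidr() {
  local base=${1%/*} prefix=${1#*/} a b c d n mask
  IFS=. read -r a b c d <<EOF
$base
EOF
  n=$(( (a << 24) | (b << 16) | (c << 8) | d ))
  mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  n=$(( (n & mask) + 1 ))
  echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

first_ip_in_cidr 192.168.0.0/16   # prints 192.168.0.1
```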
Distribute the certificates:
[root@hdss7-200 certs]# for i in 21 22;do echo hdss7-$i;scp apiserver-key.pem apiserver.pem ca-key.pem ca.pem client-key.pem client.pem hdss7-$i:/opt/kubernetes/server/bin/certs/;done
2.4 Configure apiserver audit logging
Servers that run the apiserver: 10.4.7.21, 10.4.7.22; 10.4.7.21 is used as the example here.
[root@hdss7-21 bin]# mkdir /opt/kubernetes/conf
[root@hdss7-21 bin]# vim /opt/kubernetes/conf/audit.yaml
After opening the file in vim, run :set paste so that pasted text is not auto-indented.
apiVersion: audit.k8s.io/v1beta1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]
  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]
  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]
  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"
  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]
  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]
  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.
  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"
2.5 Create the startup script
[root@hdss7-21 bin]# vim /opt/kubernetes/server/bin/kube-apiserver-startup.sh
#!/bin/bash
# Adjust --etcd-servers https://10.4.7.12:2379,https://10.4.7.21:2379,https://10.4.7.22:2379 so that it lists the addresses of all three etcd nodes
WORK_DIR=$(dirname $(readlink -f $0))
[ $? -eq 0 ] && cd $WORK_DIR || exit
/opt/kubernetes/server/bin/kube-apiserver \
--apiserver-count 2 \
--audit-log-path /data/logs/kubernetes/kube-apiserver/audit-log \
--audit-policy-file ../../conf/audit.yaml \
--authorization-mode RBAC \
--client-ca-file ./certs/ca.pem \
--requestheader-client-ca-file ./certs/ca.pem \
--enable-admission-plugins NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota \
--etcd-cafile ./certs/ca.pem \
--etcd-certfile ./certs/client.pem \
--etcd-keyfile ./certs/client-key.pem \
--etcd-servers https://10.4.7.12:2379,https://10.4.7.21:2379,https://10.4.7.22:2379 \
--service-account-key-file ./certs/ca-key.pem \
--service-cluster-ip-range 192.168.0.0/16 \
--service-node-port-range 3000-29999 \
--target-ram-mb=1024 \
--kubelet-client-certificate ./certs/client.pem \
--kubelet-client-key ./certs/client-key.pem \
--log-dir /data/logs/kubernetes/kube-apiserver \
--tls-cert-file ./certs/apiserver.pem \
--tls-private-key-file ./certs/apiserver-key.pem \
--v 2
[root@hdss7-21 bin]# chmod +x /opt/kubernetes/server/bin/kube-apiserver-startup.sh
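The value of --etcd-servers is just the client URL of each etcd node joined with commas, and typing it by hand is where duplicated addresses tend to creep in. As a sketch, it can be generated from the node IPs; etcd_servers is a hypothetical helper, and port 2379 with https is taken from the etcd startup script in this chapter.

```shell
# Join etcd node IPs into the comma-separated URL list for --etcd-servers.
etcd_servers() {
  local out="" ip
  for ip in "$@"; do
    out="${out:+$out,}https://${ip}:2379"
  done
  echo "$out"
}

etcd_servers 10.4.7.12 10.4.7.21 10.4.7.22
# prints https://10.4.7.12:2379,https://10.4.7.21:2379,https://10.4.7.22:2379
```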
2.6 Create the supervisor configuration
[root@hdss7-21 bin]# vim /etc/supervisord.d/kube-apiserver.ini
[program:kube-apiserver-7-21]
command=/opt/kubernetes/server/bin/kube-apiserver-startup.sh
numprocs=1
directory=/opt/kubernetes/server/bin
autostart=true
autorestart=true
startsecs=30
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
redirect_stderr=true
stdout_logfile=/data/logs/kubernetes/kube-apiserver/apiserver.stdout.log
stdout_logfile_maxbytes=64MB
stdout_logfile_backups=5
stdout_capture_maxbytes=1MB
stdout_events_enabled=false
Create the log directory:
[root@hdss7-21 bin]# mkdir -p /data/logs/kubernetes/kube-apiserver/
[root@hdss7-21 bin]# supervisorctl update
kube-apiserver-7-21: added process group
If the service fails to start and shows the status below, look at the startup output file:
[root@hdss7-21 bin]# supervisorctl status kube-apiserver-7-21
kube-apiserver-7-21 BACKOFF Exited too quickly (process log may have details)
[root@hdss7-21 bin]# vim /data/logs/kubernetes/kube-apiserver/apiserver.stdout.log
After fixing the problem, restart the service:
[root@hdss7-21 bin]# supervisorctl restart kube-apiserver-7-21
[root@hdss7-21 bin]# supervisorctl status kube-apiserver-7-21
kube-apiserver-7-21 RUNNING pid 4538, uptime 0:00:41
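With several programs under supervisor, states such as BACKOFF are easy to miss in the full status listing. A minimal sketch that prints only the programs whose state is not RUNNING, assuming the two-column name/state layout shown above (supervisor_not_running is a name chosen for the example):

```shell
# Read `supervisorctl status` output on stdin and print the names of the
# programs whose state (second column) is anything other than RUNNING.
supervisor_not_running() {
  awk '$2 != "RUNNING" { print $1 }'
}

# Usage:
#   supervisorctl status | supervisor_not_running
```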
The commands for starting and stopping the apiserver are:
[root@hdss7-21 bin]# supervisorctl start kube-apiserver-7-21
[root@hdss7-21 bin]# supervisorctl stop kube-apiserver-7-21
[root@hdss7-21 bin]# supervisorctl restart kube-apiserver-7-21
[root@hdss7-21 bin]# supervisorctl status kube-apiserver-7-21
2.7 Check the process
The apiserver listens on two ports: 6443 for external access and 8080 on the loopback interface only.
[root@hdss7-21 bin]# netstat -lntp|grep api
tcp 0 0 127.0.0.1:8080 0.0.0.0:* LISTEN 4542/kube-apiserver
tcp6 0 0 :::6443 :::* LISTEN 4542/kube-apiserver
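Deploy the apiserver on 10.4.7.22 in the same way.
2.8 Configure the L4 proxy for the apiserver
Run on 10.4.7.11 and 10.4.7.12; keepalived (section 2.9) provides the VIP 10.4.7.10.
[root@hdss7-11 ~]# yum -y install nginx
[root@hdss7-11 ~]# vim /etc/nginx/nginx.conf
# Append the following at the end of the file; stream can only be placed in the main context
# This is only a minimal nginx configuration; tune it properly for production use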
stream {
log_format proxy '$time_local|$remote_addr|$upstream_addr|$protocol|$status|'
'$session_time|$upstream_connect_time|$bytes_sent|$bytes_received|'
'$upstream_bytes_sent|$upstream_bytes_received' ;
upstream kube-apiserver {
server 10.4.7.21:6443 max_fails=3 fail_timeout=30s;
server 10.4.7.22:6443 max_fails=3 fail_timeout=30s;
}
server {
listen 7443;
proxy_connect_timeout 2s;
proxy_timeout 900s;
proxy_pass kube-apiserver;
access_log /var/log/nginx/proxy.log proxy;
}
}
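Troubleshooting:
Installing nginx with yum may lead to the error below. It is caused by a version change: with nginx 1.20.1 (the version used here) the stream module is not found.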
[root@hdss7-12 ~]# nginx -t
nginx: [emerg] dlopen() "/usr/lib64/nginx/modules/ngx_stream_module.so" failed (/usr/lib64/nginx/modules/ngx_stream_module.so: cannot open shared object file: No such file or directory) in /etc/nginx/nginx.conf:4
Installing the stream module fixes it:
[root@hdss7-11 modules]# yum list|grep nginx|grep stream
nginx-mod-stream.x86_64 1:1.20.1-2.el7 @epel
[root@hdss7-11 modules]# yum -y install nginx-mod-stream.x86_64
Check again:
[root@hdss7-11 modules]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
Test:
[root@hdss7-11 ~]# systemctl start nginx; systemctl enable nginx
[root@hdss7-11 ~]# curl 127.0.0.1:7443
Client sent an HTTP request to an HTTPS server.
Repeat the test a few times and the round-robin behaviour shows up in the log:
[root@localhost ~]# cat /var/log/nginx/proxy.log
08/Jun/2021:22:16:18 +0800|127.0.0.1|10.4.7.21:6443|TCP|200|0.001|0.000|76|78|78|76
08/Jun/2021:22:16:23 +0800|127.0.0.1|10.4.7.22:6443|TCP|200|0.003|0.002|76|78|78|76
08/Jun/2021:22:16:23 +0800|127.0.0.1|10.4.7.21:6443|TCP|200|0.001|0.000|76|78|78|76
08/Jun/2021:22:16:24 +0800|127.0.0.1|10.4.7.22:6443|TCP|200|0.003|0.002|76|78|78|76
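To confirm the round-robin distribution without reading the log line by line, the upstream address (the third "|"-separated field of the proxy log format defined above) can be counted. This is a small sketch; upstream_counts is a hypothetical helper.

```shell
# Count proxied connections per upstream in the stream access log.
# Fields are separated by "|" and the upstream address is field 3.
# Reads the files given as arguments, or stdin if none are given.
upstream_counts() {
  awk -F'|' '{ n[$3]++ } END { for (u in n) print n[u], u }' "$@" | sort -k2
}

# Usage:
#   upstream_counts /var/log/nginx/proxy.log
```

For the four sample lines above it reports two connections for each apiserver.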
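2.9 Deploy keepalived
Run on 10.4.7.11 and 10.4.7.12 to make nginx highly available.
Install keepalived (installed directly with yum in this lab environment):
[root@hdss7-11 ~]# yum -y install keepalived
Create the nginx port check script:
[root@hdss7-11 ~]# vim /etc/keepalived/check_port.sh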
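#!/bin/bash
# keepalived port check script
# Usage, in the keepalived configuration file:
# vrrp_script check_port {                        # define a vrrp_script check
#   script "/etc/keepalived/check_port.sh 7443"   # port to check
#   interval 2                                    # check interval in seconds
# }
CHK_PORT=$1
if [ -n "$CHK_PORT" ];then
  # -w matches the port as a whole word, so 7443 does not also match 17443
  PORT_PROCESS=$(ss -nlt|grep -cw "$CHK_PORT")
  if [ "$PORT_PROCESS" -eq 0 ];then
    echo "Port $CHK_PORT Is Not Used,End"
    exit 1
  fi
else
  # a missing argument is a configuration error, so report failure
  echo "Check Port Cant Be Empty!"
  exit 1
fi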
[root@hdss7-11 ~]# chmod +x /etc/keepalived/check_port.sh
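The check depends on how the port is matched in the ss output: a plain substring match on 7443 would also succeed when only port 17443 is open. The sketch below shows a stricter variant that compares the port part of the local address exactly. It assumes the usual ss -nlt column layout (state in field 1, local address:port in field 4); port_is_listening is a name made up for this example.

```shell
# Succeed if the `ss -nlt` output on stdin contains a listener on exactly
# port $1. The local address is field 4, e.g. "*:7443" or "[::]:7443".
port_is_listening() {
  awk -v port="$1" '
    $1 == "LISTEN" { n = split($4, parts, ":"); if (parts[n] == port) found = 1 }
    END { exit found ? 0 : 1 }'
}

# Usage:
#   ss -nlt | port_is_listening 7443 || echo "nothing listens on 7443"
```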
Edit the main keepalived configuration and set nopreempt (non-preemptive mode).
On 10.4.7.11:
[root@hdss7-11 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id 10.4.7.11
script_user root
enable_script_security
}
vrrp_script chk_nginx {
script "/etc/keepalived/check_port.sh 7443"
interval 2
weight -20
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
priority 100
advert_int 1
mcast_src_ip 10.4.7.11
nopreempt
authentication {
auth_type PASS
auth_pass 1111
}
track_script {
chk_nginx
}
virtual_ipaddress {
10.4.7.10
}
}
On 10.4.7.12:
[root@hdss7-12 ~]# vim /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
router_id 10.4.7.12
script_user root
enable_script_security
}
vrrp_script chk_nginx {
script "/etc/keepalived/check_port.sh 7443"
interval 2
weight -20
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 51
mcast_src_ip 10.4.7.12
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
track_script {
chk_nginx
}
virtual_ipaddress {
10.4.7.10
}
}
Start keepalived and enable it at boot:
[root@hdss7-11 ~]# systemctl start keepalived && systemctl enable keepalived
Created symlink from /etc/systemd/system/multi-user.target.wants/keepalived.service to /usr/lib/systemd/system/keepalived.service.
[root@hdss7-11 ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:ca:98:73 brd ff:ff:ff:ff:ff:ff
inet 10.4.7.11/24 brd 10.4.7.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 10.4.7.10/32 scope global ens33
valid_lft forever preferred_lft forever
inet6 fe80::66c4:334d:3cb1:9096/64 scope link tentative noprefixroute dadfailed
valid_lft forever preferred_lft forever
inet6 fe80::e4c6:edb7:e158:d84b/64 scope link noprefixroute
valid_lft forever preferred_lft forever
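Test:
Stop nginx on 10.4.7.11 and then start it again, and watch what happens: once the VIP has moved to 10.4.7.12 it does not move back, because the configuration uses non-preemptive mode.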
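Why the VIP moves in the first place follows from the priorities in the two configurations: the tracked script has weight -20, so a failing check lowers a node's priority by 20. The sketch below is a simplified model of that arithmetic only; it does not model nopreempt or the VRRP election itself, and effective_priority is a hypothetical helper.

```shell
# Effective VRRP priority of a node whose tracked script has a negative
# weight: the weight is added to the base priority only while the check fails.
# Arguments: base priority, weight, "yes" if the check passes, else "no".
effective_priority() {
  local base=$1 weight=$2 script_ok=$3
  if [ "$script_ok" = "yes" ]; then
    echo "$base"
  else
    echo $(( base + weight ))
  fi
}

effective_priority 100 -20 no     # hdss7-11 with nginx down: prints 80
effective_priority 90 -20 yes     # hdss7-12 with nginx up:   prints 90
```

With nginx down on 10.4.7.11 its priority drops to 80, below the 90 of 10.4.7.12, so 10.4.7.12 takes over the VIP.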
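3. Deploy kube-controller-manager
kube-controller-manager is configured to call only the apiserver on its own host over 127.0.0.1, so no SSL certificate has to be configured for that connection.
The deployment uses 10.4.7.21 as the example.
3.1 Cluster plan
Hostname            Role                 IP
hdss7-21.host.com   controller-manager   10.4.7.21
hdss7-22.host.com   controller-manager   10.4.7.22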
3.2 Create the startup script
[root@hdss7-21 ~]# vim /opt/kubernetes/server/bin/kube-controller-manager-startup.sh
#!/bin/sh
WORK_DIR=$(dirname $(readlink -f $0))
[ $? -eq 0 ] && cd $WORK_DIR || exit
/opt/kubernetes/server/bin/kube-controller-manager \
--cluster-cidr 172.7.0.0/16 \
--leader-elect true \
--log-dir /data/logs/kubernetes/kube-controller-manager \
--master http://127.0.0.1:8080 \
--service-account-private-key-file ./certs/ca-key.pem \
--service-cluster-ip-range 192.168.0.0/16 \
--root-ca-file ./certs/ca.pem \
--v 2
3.3 Set file permissions and create the log directory
[root@hdss7-21 ~]# chmod +x /opt/kubernetes/server/bin/kube-controller-manager-startup.sh
[root@hdss7-21 ~]# mkdir -p /data/logs/kubernetes/kube-controller-manager
3.4 Create the supervisor configuration
[root@hdss7-21 ~]# vim /etc/supervisord.d/kube-controller-manager.ini
[program:kube-controller-manager-7-21]
command=/opt/kubernetes/server/bin/kube-controller-manager-startup.sh ; the program (relative uses PATH, can take args)
numprocs=1 ; number of processes copies to start (def 1)
directory=/opt/kubernetes/server/bin ; directory to cwd to before exec (def no cwd)
autostart=true ; start at supervisord start (default: true)
autorestart=true ; restart at unexpected quit (default: true)
startsecs=30 ; number of secs prog must stay running (def. 1)
startretries=3 ; max # of serial start failures (default 3)
exitcodes=0,2 ; 'expected' exit codes for process (default 0,2)
stopsignal=QUIT ; signal used to kill process (default TERM)
stopwaitsecs=10 ; max num secs to wait b4 SIGKILL (default 10)
user=root ; setuid to this UNIX account to run the program
redirect_stderr=true ; redirect proc stderr to stdout (default false)
stdout_logfile=/data/logs/kubernetes/kube-controller-manager/controller.stdout.log ; stdout log path, NONE for none; default AUTO
stdout_logfile_maxbytes=64MB ; max # logfile bytes b4 rotation (default 50MB)
stdout_logfile_backups=4 ; # of stdout logfile backups (default 10)
stdout_capture_maxbytes=1MB ; number of bytes in 'capturemode' (default 0)
stdout_events_enabled=false ; emit events on stdout writes (default false)
3.5 Start the service and check it
[root@hdss7-21 ~]# supervisorctl update
kube-controller-manager-7-21: added process group
Check the status:
[root@hdss7-21 ~]# supervisorctl status
etcd-server-7-21 RUNNING pid 1404, uptime 0:23:30
kube-apiserver-7-21 RUNNING pid 1419, uptime 0:23:29
kube-controller-manager-7-21 RUNNING pid 2153, uptime 0:00:43
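3.6 Deploy on all hosts in the cluster plan
Deploy the service on 10.4.7.22 in the same way.
4. Deploy kube-scheduler
kube-scheduler is configured to call only the apiserver on its own host over 127.0.0.1, so no SSL certificate has to be configured for that connection. (If the apiserver, controller-manager and scheduler ran on different machines, each would need its own client.pem and client-key.pem. Because all three components share the same hosts here, one set of certificates is enough.)
The deployment uses 10.4.7.21 as the example.
4.1 Cluster plan
Hostname            Role             IP
hdss7-21.host.com   kube-scheduler   10.4.7.21
hdss7-22.host.com   kube-scheduler   10.4.7.22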
4.2 Create the startup script
[root@hdss7-21 ~]# vim /opt/kubernetes/server/bin/kube-scheduler-startup.sh
#!/bin/sh
WORK_DIR=$(dirname $(readlink -f $0))
[ $? -eq 0 ] && cd $WORK_DIR || exit
/opt/kubernetes/server/bin/kube-scheduler \
--leader-elect \
--log-dir /data/logs/kubernetes/kube-scheduler \
--master http://127.0.0.1:8080 \
--v 2
4.3 Set file permissions and create the log directory
[root@hdss7-21 ~]# chmod +x /opt/kubernetes/server/bin/kube-scheduler-startup.sh
[root@hdss7-21 ~]# mkdir -p /data/logs/kubernetes/kube-scheduler
4.4 Create the supervisor configuration
[root@hdss7-21 ~]# vim /etc/supervisord.d/kube-scheduler.ini
[program:kube-scheduler-7-21]
command=/opt/kubernetes/server/bin/kube-scheduler-startup.sh
numprocs=1
directory=/opt/kubernetes/server/bin
autostart=true
autorestart=true
startsecs=30
startretries=3
exitcodes=0,2
stopsignal=QUIT
stopwaitsecs=10
user=root
redirect_stderr=true
stdout_logfile=/data/logs/kubernetes/kube-scheduler/scheduler.stdout.log
stdout_logfile_maxbytes=64MB
stdout_logfile_backups=4
stdout_capture_maxbytes=1MB
stdout_events_enabled=false
4.5 Start the service and check it
[root@hdss7-21 ~]# supervisorctl update
kube-scheduler-7-21: added process group
[root@hdss7-21 ~]# supervisorctl status
etcd-server-7-21 RUNNING pid 1404, uptime 0:32:54
kube-apiserver-7-21 RUNNING pid 1419, uptime 0:32:53
kube-controller-manager-7-21 RUNNING pid 2153, uptime 0:10:07
kube-scheduler-7-21 RUNNING pid 2188, uptime 0:00:34
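4.6 Deploy on all hosts in the cluster plan
Deploy the service on 10.4.7.22 in the same way.
5. Check the control-plane nodes
Run on 10.4.7.21 and 10.4.7.22 (10.4.7.21 is used as the example).
Create a symlink for kubectl: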
[root@hdss7-21 ~]# ln -s /opt/kubernetes/server/bin/kubectl /usr/local/bin/
Check the health of the control-plane components:
[root@hdss7-21 ~]# kubectl get cs
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-2 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
etcd-0 Healthy {"health":"true"}
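Troubleshooting
If the health check shows only one etcd member, the apiserver is misconfigured: the same etcd address was entered more than once in --etcd-servers. Go back and correct it.
Repeat the same steps on 10.4.7.22.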
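The situation described above can be detected automatically by counting the etcd-N rows in the kubectl get cs output and comparing the result with the three members of this cluster. A minimal sketch, assuming the column layout shown above (name in column 1, status in column 2); healthy_etcd_count is a name chosen for the example.

```shell
# Read `kubectl get cs` output on stdin and print how many etcd-N
# entries are reported as Healthy.
healthy_etcd_count() {
  awk '$1 ~ /^etcd-[0-9]+$/ && $2 == "Healthy" { n++ } END { print n + 0 }'
}

# Usage:
#   [ "$(kubectl get cs | healthy_etcd_count)" -eq 3 ] || echo "check --etcd-servers"
```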