K8S Cluster Environment Setup
JDK installation
wget http://172.23.210.21:83/software/admin/jdk-8u171-linux-x64.tar.gz
mkdir /usr/local/jdk1.8
tar zxf jdk-8u171-linux-x64.tar.gz -C /usr/local/jdk1.8 --strip-components=1 # strip the top-level jdk1.8.0_171 directory so JAVA_HOME below points at the right place
vim /etc/profile.d/jdk.sh # add the following three lines
export JAVA_HOME=/usr/local/jdk1.8
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
source /etc/profile
java -version
K8S Cluster Environment
Kubernetes installed via kubeadm
2 masters, 1 node
Kubernetes version: v1.14.3
Operating system: CentOS 7 1804
kubeadm reached GA (stable) status as of Kubernetes v1.13, which means it can be used in production.
1. Perform the following on all nodes
swapoff -a # temporarily disable swap
vi /etc/fstab # edit fstab so the swap partition is not mounted at boot (see the sketch below)
setenforce 0 # temporarily disable selinux
vi /etc/selinux/config # edit the selinux config so selinux stays disabled after reboot
vi /etc/hosts # add the hostnames and IPs of every server participating in the cluster to the hosts file
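The same changes can be applied non-interactively. A minimal sketch, assuming the hostnames and IPs used elsewhere in this document (adjust to your environment):
sed -i '/ swap / s/^/#/' /etc/fstab # comment out the swap entry so it is not mounted at boot
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config # keep selinux disabled after reboot
cat >> /etc/hosts <<EOF
172.23.210.22 test-01
172.23.210.23 test-02
172.23.210.24 test-03
EOF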
Add the yum repository
vi /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes Repository
baseurl=http://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
Install docker, kubectl, kubeadm and kubelet
yum remove docker docker-common docker-selinux docker-engine
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce kubelet kubeadm kubectl --disableexcludes=kubernetes
Configure a Docker registry mirror so images can be pulled from within China
echo '{"registry-mirrors":["https://registry.docker-cn.com"]}' > /etc/docker/daemon.json
Enable and start the services
systemctl enable kubelet
systemctl enable docker
systemctl start kubelet
systemctl start docker
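Optionally, verify that the services picked up the configuration (a quick sanity check, not part of the original steps):
systemctl restart docker # only needed if docker was already running before daemon.json was written
docker info | grep -A1 'Registry Mirrors' # should list https://registry.docker-cn.com
kubeadm version -o short # should print v1.14.x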
2. For high availability, haproxy+keepalived (or nginx+keepalived) must also be installed to provide load balancing and hot standby
haproxy+keepalived is used in the demonstration below. The haproxy configuration is identical on all master servers; the keepalived configuration is largely identical too, apart from the settings that distinguish the master from the backup
Install the haproxy and keepalived services
yum install -y haproxy keepalived
haproxy configuration
vi /etc/haproxy/haproxy.cfg
listen stats 0.0.0.0:12345 # haproxy stats port; the load-balancing status can be viewed in a browser via this port
mode http
log global
maxconn 10
stats enable
stats hide-version
stats refresh 30s
stats show-node
stats auth admin:p@ssw0rd # credentials for the status page
stats uri /stats # URL path of the status page; note the keyword is uri, not url
frontend kube-api-https
bind 0.0.0.0:12567 # external port for the kube-apiserver service
mode tcp
default_backend kube-api-server
backend kube-api-server
balance roundrobin
mode tcp
server test-01 172.23.210.22:6443 check # master1
server test-02 172.23.210.23:6443 check # master2
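The configuration can be syntax-checked before the service is relied upon (optional):
haproxy -c -f /etc/haproxy/haproxy.cfg # prints "Configuration file is valid" on success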
keepalived configuration
vi /etc/keepalived/keepalived.conf
global_defs {
router_id test-01 # server identifier; must differ between the two servers
}
vrrp_script chk_ha { # haproxy health-check script
script "/root/check_haproxy.sh" # script path
interval 2 # check interval, in seconds
}
vrrp_instance VI_1 {
state BACKUP # role, either MASTER or BACKUP. When the master goes down, the backup takes over the VIP; once the master recovers it normally reclaims the VIP because its priority is higher. Alternatively, set both servers to BACKUP with the same priority to avoid the VIP switching back when the original master restarts.
interface ens192 # network interface name; check with ip ad
virtual_router_id 51
priority 99 # server priority, 1-254; the higher the value, the higher the priority. Here the backup is set to 99 and the master to 100
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
172.23.210.26 # virtual IP; floats between the two servers
}
track_script { # reference the health-check script defined above
chk_ha
}
}
check_haproxy.sh content: if haproxy is not running, try to restart it; if it is still not running after the restart, kill the keepalived process so the standby takes over. Make sure the script is executable (see the chmod command below)
#!/bin/bash
run=`ps -C haproxy --no-header | wc -l`
if [ $run -eq 0 ]
then
systemctl restart haproxy
sleep 2
run=`ps -C haproxy --no-header | wc -l` # re-check after the restart attempt instead of reusing the old value
if [ $run -eq 0 ]
then
killall keepalived
fi
fi
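Make the script executable on every master, as noted above:
chmod +x /root/check_haproxy.sh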
Enable and start the services
systemctl enable haproxy
systemctl enable keepalived
systemctl start haproxy
systemctl start keepalived
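To confirm the failover setup, check which server currently holds the VIP (a small sketch, assuming the ens192 interface and VIP configured above):
ip addr show ens192 | grep 172.23.210.26 # the VIP should appear on exactly one of the two masters
systemctl status keepalived --no-pager # should be active on both masters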
3. Initialize the masters
On master1, create the init configuration file
vi kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta1
imageRepository: docker.io/dustise # specify an alternative image repository in case the official images cannot be downloaded
kind: ClusterConfiguration
kubernetesVersion: v1.14.0 # specify the k8s version so the matching images are pulled
controlPlaneEndpoint: "172.23.210.26:12567" # the kube-apiserver endpoint; since haproxy load-balances in front, this is the keepalived VIP and the haproxy port 12567
networking:
podSubnet: "172.7.0.0/16"
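Optionally, the control-plane images can be pre-pulled with this config before running init, which surfaces registry problems early (a sketch using the standard kubeadm subcommand):
kubeadm config images pull --config=kubeadm-config.yaml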
Initialize master1. The --experimental-upload-certs flag is specific to high-availability deployments: it uploads the certificates that need to be shared between control-plane nodes to the cluster as a Secret, encrypted with a key. The Secret expires after two hours; after that it can be regenerated with kubeadm init phase upload-certs --experimental-upload-certs. This requires Kubernetes 1.14 or later.
kubeadm init --config=kubeadm-config.yaml --experimental-upload-certs
.....
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
# the following is used when joining master2 and master3 to the cluster
kubeadm join 172.23.210.26:12567 --token tfm6qd.ibzoeaorwnyqlcow \
--discovery-token-ca-cert-hash sha256:af32363c4993ca00b7322876cac036bc21816efc1ed0de2a662d606451c60cce \
--experimental-control-plane --certificate-key f6599b60661b3b5c2dbd17fd6487e50eba0eb13d7926b8c55a1a1bd7f869f7f9
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --experimental-upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
# the following is used when joining worker nodes to the cluster
kubeadm join 172.23.210.26:12567 --token tfm6qd.ibzoeaorwnyqlcow \
--discovery-token-ca-cert-hash sha256:af32363c4993ca00b7322876cac036bc21816efc1ed0de2a662d606451c60cce
Before kubectl can be used, run the following; if you are a regular user, prefix the last two commands with sudo
mkdir -p $HOME/.kube
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
chown $(id -u):$(id -g) $HOME/.kube/config
Install the network plugin
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
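You can watch the kube-system pods come up while the plugin rolls out (optional):
kubectl -n kube-system get pods -w # Ctrl-C once the coredns pods show Running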
Wait until the network plugin has finished installing and its containers are running before working on master2; otherwise, as shown below, coredns will be stuck in a CrashLoopBackOff loop and never start
[root@TEST-01 ~]# kubectl get po,no --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-6897bd7b5-7mcnm 0/1 CrashLoopBackOff 1 8m4s
kube-system pod/coredns-6897bd7b5-g8bqx 0/1 CrashLoopBackOff 1 8m4s
......
The following is the normal state
[root@TEST-01 ~]# kubectl get po,no --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-6897bd7b5-4zpnz 1/1 Running 0 30s
kube-system pod/coredns-6897bd7b5-gdf54 1/1 Running 0 30s
......
master2: copy the join command printed by the successful init on master1 and run it on master2 to join it to the cluster
kubeadm join 172.23.210.26:12567 --token tfm6qd.ibzoeaorwnyqlcow \
--discovery-token-ca-cert-hash sha256:af32363c4993ca00b7322876cac036bc21816efc1ed0de2a662d606451c60cce \
--experimental-control-plane --certificate-key f6599b60661b3b5c2dbd17fd6487e50eba0eb13d7926b8c55a1a1bd7f869f7f9
The following message indicates the join succeeded
To start administering your cluster from this node, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
If you also want to use kubectl on master2, simply run the same three commands shown above
Join the NODE to the cluster
kubeadm join 172.23.210.26:12567 --token tfm6qd.ibzoeaorwnyqlcow \
--discovery-token-ca-cert-hash sha256:af32363c4993ca00b7322876cac036bc21816efc1ed0de2a662d606451c60cce
The final state once all cluster components are up:
[root@TEST-01 ~]# kubectl get po,no --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-6897bd7b5-4zpnz 1/1 Running 0 2m13s
kube-system pod/coredns-6897bd7b5-gdf54 1/1 Running 0 2m13s
kube-system pod/etcd-test-01 1/1 Running 0 87s
kube-system pod/etcd-test-02 1/1 Running 0 79s
kube-system pod/kube-apiserver-test-01 1/1 Running 0 68s
kube-system pod/kube-apiserver-test-02 1/1 Running 0 81s
kube-system pod/kube-controller-manager-test-01 1/1 Running 1 92s
kube-system pod/kube-controller-manager-test-02 1/1 Running 0 81s
kube-system pod/kube-flannel-ds-amd64-d7rbq 1/1 Running 0 81s
kube-system pod/kube-flannel-ds-amd64-jpgnn 1/1 Running 0 2m
kube-system pod/kube-flannel-ds-amd64-rxzzg 1/1 Running 0 9s
kube-system pod/kube-proxy-hc96m 1/1 Running 0 9s
kube-system pod/kube-proxy-hf2m5 1/1 Running 0 81s
kube-system pod/kube-proxy-ztnwf 1/1 Running 0 2m13s
kube-system pod/kube-scheduler-test-01 1/1 Running 1 81s
kube-system pod/kube-scheduler-test-02 1/1 Running 0 81s
NAMESPACE NAME STATUS ROLES AGE VERSION
node/test-01 Ready master 2m32s v1.14.3
node/test-02 Ready master 81s v1.14.3
node/test-03 Ready <none> 9s v1.14.3
Note: the join tokens for both MASTER and NODE are time-limited. The certificate key used when a MASTER joins is valid for two hours, and the token used when a NODE joins is valid for 24 hours; they can be listed with kubeadm token list and regenerated after they expire. The CA certificate hash, however, does not expire, so it is a good idea to save it after the cluster is initialized for later use.
View existing tokens
[root@TEST-01 ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
ay3tyz.peeqmomkayvivxhy 1h 2019-06-12T17:56:33+08:00 <none> Proxy for managing TTL for the kubeadm-certs secret <none>
tfm6qd.ibzoeaorwnyqlcow 23h 2019-06-13T15:56:33+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
Regenerate a token for joining a NODE to the cluster
[root@TEST-01 ~]# kubeadm token create
bfklrw.shyi4zofcj7hnjx8 # the new token for joining a NODE
[root@TEST-01 ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
ay3tyz.peeqmomkayvivxhy 1h 2019-06-12T17:56:33+08:00 <none> Proxy for managing TTL for the kubeadm-certs secret <none>
bfklrw.shyi4zofcj7hnjx8 23h 2019-06-13T16:50:17+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
tfm6qd.ibzoeaorwnyqlcow 23h 2019-06-13T15:56:33+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
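kubeadm can also print a complete, ready-to-run worker join command, which saves assembling the token and hash by hand (a sketch):
kubeadm token create --print-join-command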
Regenerate the token/certificate key for joining a MASTER to the cluster
[root@TEST-01 ~]# kubeadm init phase upload-certs --experimental-upload-certs
I0612 16:52:00.385590 7773 version.go:96] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.txt": Get https://dl.k8s.io/release/stable-1.txt: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I0612 16:52:00.385775 7773 version.go:97] falling back to the local client version: v1.14.3
[upload-certs] Storing the certificates in ConfigMap "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
5ce76c3c4fc95e8ab7bf5d0abc1abe8232fd3d39a5ff9c49a65612ecbcc6cb3e # the new certificate-key
[root@TEST-01 ~]# kubeadm token list
TOKEN TTL EXPIRES USAGES DESCRIPTION EXTRA GROUPS
6zm2x5.p7dwn0q9xkcyah5v 1h 2019-06-12T18:52:00+08:00 <none> Proxy for managing TTL for the kubeadm-certs secret <none> # the new token for joining a master
ay3tyz.peeqmomkayvivxhy 1h 2019-06-12T17:56:33+08:00 <none> Proxy for managing TTL for the kubeadm-certs secret <none>
bfklrw.shyi4zofcj7hnjx8 23h 2019-06-13T16:50:17+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
tfm6qd.ibzoeaorwnyqlcow 23h 2019-06-13T15:56:33+08:00 authentication,signing <none> system:bootstrappers:kubeadm:default-node-token
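If the discovery hash mentioned in the note above was not saved, it can be recomputed from the CA certificate at any time (the standard approach from the kubeadm documentation):
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'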
etcd binary TLS cluster installation
OS: CentOS 7 1804
IPs: 172.23.210.22 172.23.210.23 172.23.210.24
etcd: 3.3.13
Generate the cluster TLS certificates with cfssl
Download and install cfssl and cfssljson
curl -s -L -o /usr/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
curl -s -L -o /usr/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x /usr/bin/{cfssl,cfssljson}
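A quick check that the tool is on the PATH (optional):
cfssl version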
Create a directory to hold the generated certificates
mkdir ~/cfssl
cd ~/cfssl
Generate the default CA configuration files
cfssl print-defaults config > ca-config.json
cfssl print-defaults csr > ca-csr.json
Edit the content to the following. The expiry of 87600h (10 years) is the validity period of the generated certificates. The inline # comments below are explanatory only; remove them before use, since JSON does not allow comments.
cat ca-config.json
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"server": { # 服务器证书由服务器使用,并由客户端验证服务器身份。例如docker服务器或kube-apiserver
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth"
]
},
"client": { # 客户端证书用于按服务器对客户端进行身份验证。例如etcdctl,etcd proxy或者docker客户端。
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"client auth"
]
},
"peer": { # 对等证书由etcd集群成员使用,因为它们以两种方式相互通信。本文档就是用这个证书
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
Edit ca-csr.json to the following
cat ca-csr.json
{
"CN": "etcd server",
"hosts": [
"localhost",
"127.0.0.1",
"172.23.210.22",
"172.23.210.23",
"172.23.210.24",
"test-01",
"test-02",
"test-03"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"L": "HB",
"ST": "Wu Han"
}
]
}
Generate the CA certificate
cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
This produces the following files:
ca-key.pem
ca.csr
ca.pem
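The CA can be inspected with openssl to confirm the subject and the 10-year validity (optional sketch):
openssl x509 -in ca.pem -noout -subject -dates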
Generate the peer certificate
This is the certificate etcd uses for TLS
cfssl print-defaults csr > peer.json
Content:
cat peer.json
{
"CN": "etcd",
"hosts": [
"localhost",
"127.0.0.1",
"172.23.210.22",
"172.23.210.23",
"172.23.210.24"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [{
"C": "CN",
"L": "HB",
"ST": "Wu Han"
}]
}
Run the generation command
cfssl gencert -hostname="172.23.210.22,172.23.210.23,172.23.210.24" -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer peer.json | cfssljson -bare etcd
# without the -hostname flag, generating the certificate produces the warning below; cfssl 1.2 has not fixed this bug yet
[WARNING] This certificate lacks a "hosts" field. This makes it unsuitable for websites. For more information see the Baseline Requirements for the Issuance and Management of Publicly-Trusted Certificates, v.1.1.6, from the CA/Browser Forum (https://cabforum.org); specifically, section 10.2.3 ("Information Requirements").
This produces the following files
etcd.pem
etcd-key.pem
etcd.csr
Copy ca.pem, etcd.pem and etcd-key.pem to /etc/etcd/ssl/ (see the sketch below)
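A minimal sketch of distributing the certificates to all three etcd nodes, assuming root SSH access and the hostnames used in this document:
mkdir -p /etc/etcd/ssl
cp ca.pem etcd.pem etcd-key.pem /etc/etcd/ssl/
for host in test-02 test-03; do # copy to the other two members
  ssh $host "mkdir -p /etc/etcd/ssl"
  scp /etc/etcd/ssl/{ca.pem,etcd.pem,etcd-key.pem} $host:/etc/etcd/ssl/
done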
etcd installation
tar -zxf etcd-v3.3.13-linux-amd64.tar.gz
cp etcd-v3.3.13/etcd /usr/bin/
cp etcd-v3.3.13/etcdctl /usr/bin/
etcd configuration file (the inline # comments below are explanatory; remove them from the actual file)
cat /etc/etcd/etcd.conf
# [Member Flags]
# ETCD_ELECTION_TIMEOUT=1000
# ETCD_HEARTBEAT_INTERVAL=100
ETCD_NAME=test-01 # member name; use test-02 / test-03 on the other servers
ETCD_DATA_DIR=/var/lib/etcd/ # etcd data directory
# [Cluster Flags]
# ETCD_AUTO_COMPACTION_RETENTION=0
ETCD_INITIAL_CLUSTER_STATE=new
ETCD_ADVERTISE_CLIENT_URLS=https://172.23.210.22:2379 # on servers 2 and 3, change the IP to 172.23.210.23 / .24
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://172.23.210.22:2380 # on servers 2 and 3, change the IP to 172.23.210.23 / .24
ETCD_LISTEN_CLIENT_URLS=https://172.23.210.22:2379,https://127.0.0.1:2379 # on servers 2 and 3, change the IP to 172.23.210.23 / .24
ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
ETCD_LISTEN_PEER_URLS=https://172.23.210.22:2380 # on servers 2 and 3, change the IP to 172.23.210.23 / .24
ETCD_INITIAL_CLUSTER="test-01=https://172.23.210.22:2380,test-02=https://172.23.210.23:2380,test-03=https://172.23.210.24:2380"
# [Proxy Flags]
ETCD_PROXY=off
# [Security flags]
ETCD_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_CLIENT_CERT_AUTH="true"
ETCD_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
ETCD_AUTO_TLS="true"
ETCD_PEER_CERT_FILE="/etc/etcd/ssl/etcd.pem"
ETCD_PEER_KEY_FILE="/etc/etcd/ssl/etcd-key.pem"
ETCD_PEER_CLIENT_CERT_AUTH="true"
ETCD_PEER_TRUSTED_CA_FILE="/etc/etcd/ssl/ca.pem"
ETCD_PEER_AUTO_TLS="true"
etcd systemd service unit
cat > /usr/lib/systemd/system/etcd.service <<EOF
[Unit]
Description=etcd server
After=network.target
After=network-online.target
Wants=network-online.target
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/bin/etcd
NotifyAccess=all
Restart=always
RestartSec=5s
LimitNOFILE=40000
[Install]
WantedBy=multi-user.target
EOF
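Make sure the data directory referenced by WorkingDirectory exists on every node before starting the service:
mkdir -p /var/lib/etcd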
Enable and start the etcd service
systemctl start etcd
systemctl enable etcd
etcd health check
etcdctl --ca-file=/etc/etcd/ssl/ca.pem --cert-file=/etc/etcd/ssl/etcd.pem --key-file=/etc/etcd/ssl/etcd-key.pem --endpoints="https://172.23.210.22:2379,https://172.23.210.23:2379,https://172.23.210.24:2379" cluster-health
member 22d860099d5a23a4 is healthy: got healthy result from https://172.23.210.24:2379
member b8c31f277f3aec2f is healthy: got healthy result from https://172.23.210.22:2379
member c3d95832e4f4a6e7 is healthy: got healthy result from https://172.23.210.23:2379
etcdctl --ca-file=/etc/etcd/ssl/ca.pem --cert-file=/etc/etcd/ssl/etcd.pem --key-file=/etc/etcd/ssl/etcd-key.pem --endpoints="https://172.23.210.22:2379,https://172.23.210.23:2379,https://172.23.210.24:2379" member list
22d860099d5a23a4: name=test-03 peerURLs=https://172.23.210.24:2380 clientURLs=https://172.23.210.24:2379 isLeader=false
b8c31f277f3aec2f: name=test-01 peerURLs=https://172.23.210.22:2380 clientURLs=https://172.23.210.22:2379 isLeader=true
c3d95832e4f4a6e7: name=test-02 peerURLs=https://172.23.210.23:2380 clientURLs=https://172.23.210.23:2379 isLeader=false
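As a final smoke test, write and read a key through the TLS endpoints (a sketch using the v2 API that etcdctl 3.3 defaults to; the /smoke-test key is just an example):
etcdctl --ca-file=/etc/etcd/ssl/ca.pem --cert-file=/etc/etcd/ssl/etcd.pem --key-file=/etc/etcd/ssl/etcd-key.pem --endpoints="https://172.23.210.22:2379" set /smoke-test ok
etcdctl --ca-file=/etc/etcd/ssl/ca.pem --cert-file=/etc/etcd/ssl/etcd.pem --key-file=/etc/etcd/ssl/etcd-key.pem --endpoints="https://172.23.210.23:2379" get /smoke-test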
Notes:
Make sure there are no leftover files from a previous failed deployment under /var/lib/etcd/
Keep node clocks synchronized; ntpd is recommended