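k8s cluster rebuild
Current state
There are seven machines: three masters and four worker nodes, with services still running on them. The cluster has to be reinstalled. The approach is to take three machines offline first and build the new cluster on them. None of the machines taken offline hosts an etcd member.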
Taking a node offline
Run on a master:
kubectl drain sd-cluster-05 --delete-local-data --force --ignore-daemonsets
root@sd-cluster-01:~/service_config# kubectl get node
NAME            STATUS                     ROLES    AGE   VERSION
sd-cluster-01   Ready                      master   85d   v1.18.3
sd-cluster-02   Ready                      master   85d   v1.18.3
sd-cluster-03   Ready                      master   85d   v1.18.3
sd-cluster-04   Ready                      node     83d   v1.18.3
sd-cluster-05   Ready,SchedulingDisabled   node     85d   v1.18.3
sd-cluster-06   Ready                      node     85d   v1.18.3
sd-cluster-07   Ready                      node     85d   v1.18.3
kubectl delete node sd-cluster-05
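The drain-then-delete sequence above can be wrapped so the same two steps are applied to each node in order. The following is a minimal sketch, not part of the original procedure: the DRY_RUN variable and the run and decommission_node helpers are hypothetical names introduced here, and by default the commands are only printed so they can be reviewed first.

```shell
#!/usr/bin/env bash
# Sketch: drain a worker and then remove it from the cluster.
# DRY_RUN is a hypothetical switch; it defaults to 1, so commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

decommission_node() {
  local node="$1"
  if [ -z "$node" ]; then
    echo "usage: decommission_node <node-name>" >&2
    return 1
  fi
  # Evict workloads first; stop here if the drain fails so the node
  # is not deleted while pods are still running on it.
  run kubectl drain "$node" --delete-local-data --force --ignore-daemonsets || return 1
  run kubectl delete node "$node"
}

decommission_node sd-cluster-05
```

Setting DRY_RUN=0 before the call would execute the commands instead of printing them.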
Run on sd-cluster-05:
kubeadm reset
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0910 12:31:41.284781 21176 removeetcdmember.go:79] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
rm -rf /etc/cni/net.d/*
ipvsadm-save >> ipvs.bak
ipvsadm -C
iptables-save >> iptables.bak
iptables -F   # flush: delete all rules (filter table only, since no -t is given)
iptables -X   # delete all user-defined chains
iptables -Z   # zero the packet and byte counters of all chains
Remove leftover network interfaces such as kube-ipvs0, tunl0@NONE, and dummy0:
ip link delete kube-ipvs0
ip link delete dummy0
Repeat the same steps on the other nodes.
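Since the same cleanup is repeated on every node, it may help to keep the steps in one place. The sketch below only prints the commands listed above as a plan and executes nothing itself; cleanup_plan is a hypothetical helper name, and the interfaces to delete are passed as arguments.

```shell
#!/usr/bin/env bash
# Sketch: print the post-reset cleanup steps as a reviewable plan.
# Nothing is executed here; the output is plain shell that can be piped to sh.
cleanup_plan() {
  local dev
  cat <<'EOF'
rm -rf /etc/cni/net.d/*
ipvsadm-save >> ipvs.bak
ipvsadm -C
iptables-save >> iptables.bak
iptables -F
iptables -X
iptables -Z
EOF
  # Only interfaces named on the command line are scheduled for deletion.
  for dev in "$@"; do
    printf 'ip link delete %s\n' "$dev"
  done
}

cleanup_plan kube-ipvs0 dummy0
```

After reviewing the printed plan, one option is to run it on the node with `cleanup_plan kube-ipvs0 dummy0 | sudo sh -e`, where `-e` stops at the first failing step. This assumes `kubeadm reset` has already been run on that node.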
Building the cluster
First build the new cluster on the three machines.
Run on node 4 (sd-cluster-04):
yum -y install bridge-utils
Start the Docker daemon.
sudo kubeadm init --apiserver-advertise-address=192.168.1.200 --kubernetes-version v1.18.3 --service-cidr=10.96.0.0/16 --pod-network-cidr=10.244.0.0/16 --image-repository registry.sensedeal.wiki:8443/sp
Output:
W0913 15:26:44.859040 45631 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [sd-cluster-04 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.1.200]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [sd-cluster-04 localhost] and IPs [192.168.1.200 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [sd-cluster-04 localhost] and IPs [192.168.1.200 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
W0913 15:26:48.438242 45631 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[control-plane] Creating static Pod manifest for "kube-scheduler"
W0913 15:26:48.439951 45631 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 23.004726 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node sd-cluster-04 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node sd-cluster-04 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 285sm2.8z2ix543hzi7nwvu
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.200:6443 --token 285sm2.8z2ix543hzi7nwvu \
    --discovery-token-ca-cert-hash sha256:70cad4ca6da02b8313c1bed9b2e2e094ffbbdc724845efbc1177d592bd77e335
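The flags passed to kubeadm init can also be kept in a config file, which makes the settings easier to review and reuse. The sketch below writes a file that should be equivalent to those flags, assuming the kubeadm.k8s.io/v1beta2 config API that ships with v1.18. The helper name write_kubeadm_config and the file name kubeadm-config.yaml are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: write a kubeadm config file mirroring the init flags used in this document.
# The output path is the first argument (hypothetical default: kubeadm-config.yaml).
write_kubeadm_config() {
  local out="${1:-kubeadm-config.yaml}"
  cat > "$out" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.1.200          # --apiserver-advertise-address
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.18.3                 # --kubernetes-version
imageRepository: registry.sensedeal.wiki:8443/sp   # --image-repository
networking:
  serviceSubnet: 10.96.0.0/16              # --service-cidr
  podSubnet: 10.244.0.0/16                 # --pod-network-cidr
EOF
}

# Example (not run here):
#   write_kubeadm_config kubeadm-config.yaml
#   sudo kubeadm init --config kubeadm-config.yaml
```

With a config file, the init step becomes `sudo kubeadm init --config <file>` in place of the individual flags.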
cd /home/centos && mkdir -p .kube
sudo cp -a /etc/kubernetes/admin.conf .kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Join the other nodes:
sudo kubeadm join 192.168.1.214:6443 --token catrzo.2i0xy4xn1it4vs8w --discovery-token-ca-cert-hash sha256:92447a83fe2ca0d1fd68749938b93cde33e8d4ae1acc1065b8edcd49212c1ef3
(The token and CA cert hash are the values printed when kubeadm init succeeds.)
kubectl taint nodes --all node-role.kubernetes.io/master-
kubectl label nodes sd-cluster-04 hostname=sd-cluster-04
kubectl label nodes sd-cluster-05 hostname=sd-cluster-05
kubectl label nodes sd-cluster-06 hostname=sd-cluster-06
kubectl get node   # check the ROLES column; workers show <none> until a role label is set
NAME            STATUS   ROLES    AGE   VERSION
sd-cluster-04   Ready    master   57m   v1.18.3
sd-cluster-05   Ready    <none>   55m   v1.18.3
sd-cluster-06   Ready    <none>   49m   v1.18.3
Set the role of the worker nodes to node:
kubectl label nodes sd-cluster-05 node-role.kubernetes.io/node=
kubectl label nodes sd-cluster-06 node-role.kubernetes.io/node=
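With more workers, the label commands can be derived from the node listing instead of being typed by hand. The following sketch reads `kubectl get node` output on stdin and prints one label command for every node whose ROLES column is `<none>`. The function name is hypothetical, and the sample input is the listing shown above.

```shell
#!/usr/bin/env bash
# Sketch: turn `kubectl get node` output into label commands for unlabeled workers.
# Column 3 is ROLES; the header line (NR == 1) is skipped.
label_commands_for_unlabeled() {
  awk 'NR > 1 && $3 == "<none>" {
    printf "kubectl label nodes %s node-role.kubernetes.io/node=\n", $1
  }'
}

# Sample input taken from the listing in this document.
# On a live cluster: kubectl get node | label_commands_for_unlabeled
label_commands_for_unlabeled <<'EOF'
NAME            STATUS   ROLES    AGE   VERSION
sd-cluster-04   Ready    master   57m   v1.18.3
sd-cluster-05   Ready    <none>   55m   v1.18.3
sd-cluster-06   Ready    <none>   49m   v1.18.3
EOF
```

For the sample input this prints the two label commands for sd-cluster-05 and sd-cluster-06. The printed commands can be reviewed and then piped to sh.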
Deploying the CNI plugin
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# Point the image at a mirror that is reachable from mainland China
sed -i -r "s#quay.io/coreos/flannel:.*-amd64#lizhenliang/flannel:v0.11.0-amd64#g" kube-flannel.yml
kubectl apply -f kube-flannel.yml
kubectl get pods -n kube-system
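Because the sed expression edits the manifest in place, it can be worth checking what it matches before applying the file. The sketch below runs the same substitution on a single sample line. The tag v0.12.0-amd64 in the input is a made-up sample value, not necessarily what the downloaded manifest contains.

```shell
#!/usr/bin/env bash
# Sketch: the same substitution as above, applied to stdin instead of the file.
rewrite_flannel_image() {
  sed -E 's#quay.io/coreos/flannel:.*-amd64#lizhenliang/flannel:v0.11.0-amd64#g'
}

# Sample manifest line (the tag is a hypothetical example).
printf '        image: quay.io/coreos/flannel:v0.12.0-amd64\n' | rewrite_flannel_image
# prints "        image: lizhenliang/flannel:v0.11.0-amd64"
```

Only image references ending in -amd64 are rewritten; lines for other architectures are left unchanged, so `grep image: kube-flannel.yml` after the edit shows which images still point at the original registry.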
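Pinning images locally
If an image comes from a registry on the public internet, pull it and push a copy to the registry in the local datacenter, so that the cluster does not break later if the external registry becomes unreachable.
kube-proxy runs in iptables mode by default; switch it to ipvs: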
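Copying an image into the local registry usually comes down to a pull, tag, and push sequence. The sketch below prints those commands for each image it is given. It assumes that the registry used for --image-repository earlier (registry.sensedeal.wiki:8443/sp) is the local one. LOCAL_REGISTRY and mirror_plan are hypothetical names, and nothing is executed.

```shell
#!/usr/bin/env bash
# Sketch: print docker commands that copy external images into a local registry.
# LOCAL_REGISTRY is an assumption; adjust it to the actual local registry.
LOCAL_REGISTRY="${LOCAL_REGISTRY:-registry.sensedeal.wiki:8443/sp}"

mirror_plan() {
  local image name
  for image in "$@"; do
    # Strip everything up to the last slash, keeping only name:tag.
    name="${image##*/}"
    printf 'docker pull %s\n' "$image"
    printf 'docker tag %s %s/%s\n' "$image" "$LOCAL_REGISTRY" "$name"
    printf 'docker push %s/%s\n' "$LOCAL_REGISTRY" "$name"
  done
}

mirror_plan lizhenliang/flannel:v0.11.0-amd64
```

After the push, the image reference in the corresponding manifest would also need to point at the local registry for the copy to be used.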
kubectl edit cm kube-proxy -n kube-system
Change mode: "" to mode: "ipvs".
Then delete the existing kube-proxy pods; they are recreated and pick up the new mode.
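The same change can be made without an interactive editor. The sketch below rewrites the empty mode field in a ConfigMap dump read from stdin. The function name is hypothetical, and the live-cluster pipeline in the comments is an untested assumption about how the pieces would be combined.

```shell
#!/usr/bin/env bash
# Sketch: set an empty kube-proxy `mode: ""` field to "ipvs" in YAML read from stdin.
# Only a line consisting of `mode: ""` (plus indentation) is rewritten.
set_ipvs_mode() {
  sed -E 's/^([[:space:]]*mode:)[[:space:]]*""[[:space:]]*$/\1 "ipvs"/'
}

printf '    mode: ""\n' | set_ipvs_mode
# prints "    mode: "ipvs""

# On a live cluster (not run here):
#   kubectl -n kube-system get cm kube-proxy -o yaml | set_ipvs_mode | kubectl apply -f -
#   kubectl -n kube-system delete pod -l k8s-app=kube-proxy
```

Once the pods have restarted, querying `curl 127.0.0.1:10249/proxyMode` on a node should report ipvs, provided the IPVS kernel modules are available on that node.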
Adding new masters and nodes
Run the following on the first master, the one created by kubeadm init.
Generate a new token:
[centos@sd-cluster-04 ~]$ sudo kubeadm token create --print-join-command
W0915 11:51:36.267983 78379 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
kubeadm join 192.168.1.214:6443 --token og24ga.acfs13osagzqhj3n --discovery-token-ca-cert-hash sha256:92447a83fe2ca0d1fd68749938b93cde33e8d4ae1acc1065b8edcd49212c1ef3
Upload the control-plane certificates and generate the certificate key that new masters need to join:
[centos@sd-cluster-04 ~]$ kubeadm init phase upload-certs --upload-certs
I0915 11:54:07.471519 81058 version.go:252] remote version is much newer: v1.19.1; falling back to: stable-1.18
W0915 11:54:10.269421 81058 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
3aad7b59049a3c62fa31d6727d265a374ad0cdafa8464dbb5eeed5e506cfe318
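A master join command is the worker join command plus the control-plane flag and the certificate key. The sketch below assembles it from the two outputs shown above. The helper name control_plane_join is hypothetical, and the flag is written as --control-plane, which is how kubeadm v1.18 spells it.

```shell
#!/usr/bin/env bash
# Sketch: build the control-plane join command from the worker join command
# and the certificate key printed by `kubeadm init phase upload-certs`.
control_plane_join() {
  local join_cmd="$1"
  local cert_key="$2"
  printf '%s --control-plane --certificate-key %s\n' "$join_cmd" "$cert_key"
}

# Values copied from the outputs in this document.
control_plane_join \
  "kubeadm join 192.168.1.214:6443 --token og24ga.acfs13osagzqhj3n --discovery-token-ca-cert-hash sha256:92447a83fe2ca0d1fd68749938b93cde33e8d4ae1acc1065b8edcd49212c1ef3" \
  "3aad7b59049a3c62fa31d6727d265a374ad0cdafa8464dbb5eeed5e506cfe318"
```

Assembling the command this way avoids copying the token and key by hand, which is where truncation errors tend to creep in.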
Run on the master that is joining:
kubeadm join 192.168.1.214:6443 --token og24ga.acfs13osagzqhj3n --discovery-token-ca-cert-hash sha256:92447a83fe2ca0d1fd68749938b93cde33e8d4ae1acc1065b8edcd49212c1ef3 --control-plane --certificate-key 3aad7b59049a3c62fa31d6727d265a374ad0cdafa8464dbb5eeed5e506cfe318
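A join fails early if the token or hash was cut off while copying, so a quick format check before running it can save a round trip. The sketch below checks the bootstrap token shape (six characters, a dot, sixteen characters, all lowercase letters or digits) and the sha256 hash shape. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check the shape of kubeadm join parameters before use.

# Bootstrap tokens have the form [a-z0-9]{6}.[a-z0-9]{16}.
is_valid_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# The discovery hash is "sha256:" followed by 64 lowercase hex digits.
is_valid_ca_hash() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[a-f0-9]{64}$'
}

is_valid_token og24ga.acfs13osagzqhj3n && echo "token ok"
# A token with its tail cut off is rejected:
is_valid_token og24ga.acfs13osagzqh || echo "token malformed"
is_valid_ca_hash sha256:92447a83fe2ca0d1fd68749938b93cde33e8d4ae1acc1065b8edcd49212c1ef3 && echo "hash ok"
```

These checks only validate the format. Whether the token still exists and has not expired can be confirmed on the first master with `kubeadm token list`.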