A simple k8s deployment with kubeadm

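I. Environment preparation

  1. Three CentOS 7 virtual machines
    master: 192.168.0.54  Note: give the master at least 2 CPUs; otherwise the preflight check fails when the k8s control plane is initialized;
    node1: 192.168.0.68
    node2: 192.168.0.56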
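  2. The clocks on all three hosts must be in sync
  3. Turn off the firewall
  4. Disable swap
    swapoff -a        disables swap until the next reboot
    vim /etc/fstab    comment out the swap line to disable it permanently
  5. Add the following entries to the hosts file on all three hosts so that they can reach each other by hostname;
    192.168.0.54 k8smaster
    192.168.0.68 k8snode1
    192.168.0.56 k8snode2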

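  6. Enable IP forwarding (ip_forward)
    Temporarily: echo "1" > /proc/sys/net/ipv4/ip_forward
    Permanently: edit /etc/rc.d/rc.local and add echo "1" > /proc/sys/net/ipv4/ip_forward to it; on CentOS 7 the file must also be executable (chmod +x /etc/rc.d/rc.local) to run at boot;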
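The fstab edit in step 4 can also be scripted instead of done by hand in vim. The following is a minimal sketch, assuming a sed that supports -E and -i with a backup suffix; the function name comment_out_swap and the sample fstab contents are made up for illustration, and the demo runs against a throwaway file rather than the real /etc/fstab.

```shell
# Comment out every active swap entry in an fstab-style file.
# The path is a parameter so the function can be tried on a copy first.
# A backup of the original is kept next to it with a .bak suffix.
comment_out_swap() {
    fstab_file="$1"
    # Skip lines that are already comments; prefix "#" to lines that
    # contain a whitespace-delimited "swap" field.
    sed -i.bak -E '/^[[:space:]]*#/!s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$fstab_file"
}

# Demo on a scratch file (hypothetical contents, not taken from a real host)
demo_fstab="$(mktemp)"
cat > "$demo_fstab" <<'EOF'
/dev/mapper/centos-root /     xfs  defaults 0 0
/dev/mapper/centos-swap swap  swap defaults 0 0
EOF
comment_out_swap "$demo_fstab"
cat "$demo_fstab"
```

On a real host this would be run as root as `swapoff -a && comment_out_swap /etc/fstab`, after checking the result on a copy.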

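II. Software installation
  1. Set up the yum repositories for docker-ce and kubernetes. This walkthrough uses the Huawei Cloud mirrors;
    URL: https://mirrors.huaweicloud.com/
    The mirror site documents how to configure each repository, so that is not repeated here;
  2. Install the packages on the master and on every node
    yum install kubelet kubeadm kubectl docker-ce -y
    Every control-plane component on the master runs as a pod, so the master also needs kubelet and docker;
    kubelet and docker themselves do not run as pods; they run as system daemons;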
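III. Initializing the k8s control plane
  kubeadm supports two ways to initialize: from a configuration file or from command-line flags. This walkthrough uses command-line flags. Official reference: https://kubernetes.io/zh/docs/reference/setup-tools/kubeadm/kubeadm-init/
  1. Initialization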

kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.14.2 --apiserver-advertise-address 192.168.0.54 --apiserver-bind-port 6443 --pod-network-cidr 10.244.0.0/16
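  --image-repository  Container registry to pull the control-plane images from. The default is "k8s.gcr.io", which is unreachable from mainland China, so a domestic mirror is used instead;
  --kubernetes-version  Pins the control plane to a specific Kubernetes version (run kubeadm version to see the version of the installed kubeadm);
  --apiserver-advertise-address  The IP address the API server advertises that it is listening on;
  --apiserver-bind-port  The port the API server binds to;
  --pod-network-cidr  The IP address range available to the pod network. It is set to 10.244.0.0/16 because that is the default range of the flannel network plugin, which is deployed later; matching it here avoids having to edit the flannel manifest;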
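For comparison, the configuration-file method mentioned above could express the same settings roughly as follows. This is a sketch, not the file used in this walkthrough: it assumes the kubeadm.k8s.io/v1beta1 config API that kubeadm v1.14 uses (newer releases use newer API versions), and it writes to a scratch directory. `kubeadm config print init-defaults` prints a full template to compare against.

```shell
# Write a kubeadm config file that mirrors the command-line flags above.
cfg_dir="$(mktemp -d)"
cat > "$cfg_dir/kubeadm-config.yaml" <<'EOF'
apiVersion: kubeadm.k8s.io/v1beta1
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.0.54   # --apiserver-advertise-address
  bindPort: 6443                   # --apiserver-bind-port
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.14.2         # --kubernetes-version
imageRepository: registry.aliyuncs.com/google_containers   # --image-repository
networking:
  podSubnet: 10.244.0.0/16         # --pod-network-cidr
EOF
echo "wrote $cfg_dir/kubeadm-config.yaml"

# On the master, as root (not executed by this sketch):
#   kubeadm init --config "$cfg_dir/kubeadm-config.yaml"
```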

  2. Warnings and errors encountered

1) cgroup driver warning
  [WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
  Cause: Docker uses the cgroupfs cgroup driver by default, while kubeadm recommends systemd, so Docker's cgroup driver should be changed to systemd;
  Fix (apply on all three hosts):
  1. vim /etc/docker/daemon.json
    {
    "exec-opts": ["native.cgroupdriver=systemd"]
    }
  2. Restart the docker service and check its status
    [root@k8smaster ~]# docker info |grep Cgroup
    Cgroup Driver: systemd
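Because the same file has to be created on all three hosts, the edit can be wrapped in a small function. A minimal sketch follows; the function name write_docker_daemon_json is hypothetical, the function overwrites its target (so an existing daemon.json with other settings would need to be merged by hand), and the demo writes to a scratch path instead of /etc/docker.

```shell
# Write a daemon.json that selects the systemd cgroup driver.
# NOTE: this overwrites the target file.
write_docker_daemon_json() {
    target="$1"
    mkdir -p "$(dirname "$target")"
    cat > "$target" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Demo against a scratch path; on a real host pass /etc/docker/daemon.json
demo_json="$(mktemp -d)/docker/daemon.json"
write_docker_daemon_json "$demo_json"
cat "$demo_json"

# Then, as root on each host (not executed by this sketch):
#   systemctl restart docker
#   docker info | grep -i cgroup
```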
2) Too few CPUs
  error execution phase preflight: [preflight] Some fatal errors occurred:
  [ERROR NumCPU]: the number of available CPUs 1 is less than the required 2
  Fix: increase the number of CPUs or cores (whether adding cores alone clears the error has not been tested yet)
3) Kernel parameter error
  error execution phase preflight: [preflight] Some fatal errors occurred:
  [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
  Fix: enable bridge-nf-call-iptables by changing the value from 0 to 1;
  1. vim /etc/sysctl.d/k8s.conf
    net.bridge.bridge-nf-call-iptables = 1
    net.bridge.bridge-nf-call-ip6tables = 1
  2. Reload the configuration file
    sysctl -p /etc/sysctl.d/k8s.conf
    sysctl -a |grep bridge  # check the result
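The kernel settings can be written in one step on each host. The sketch below is one possible way to do it, not the exact procedure used above: the function name write_k8s_sysctl is hypothetical, and it additionally puts net.ipv4.ip_forward into the same file as an alternative to the rc.local approach from the environment preparation. The net.bridge.* keys only exist once the br_netfilter kernel module is loaded, hence the modprobe in the usage comment.

```shell
# Write the sysctl settings kubeadm checks for into a single file.
write_k8s_sysctl() {
    target="$1"
    cat > "$target" <<'EOF'
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
}

# Demo against a scratch file
demo_conf="$(mktemp)"
write_k8s_sysctl "$demo_conf"
cat "$demo_conf"

# As root on each host (not executed by this sketch):
#   modprobe br_netfilter
#   write_k8s_sysctl /etc/sysctl.d/k8s.conf
#   sysctl -p /etc/sysctl.d/k8s.conf
```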

  3. Output of a successful initialization after the errors above were fixed

[init] Using Kubernetes version: v1.14.2
# Preflight checks
[preflight] Running pre-flight checks 
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 18.09
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
# Start the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
# Generate a self-signed CA and the certificates that give each component in the cluster its identity;
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8smaster localhost] and IPs [192.168.0.54 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8smaster localhost] and IPs [192.168.0.54 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8smaster kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.54]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "sa" key and public key
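# Write kubeconfig files to /etc/kubernetes/ for the kubelet, controller manager, and scheduler to use when connecting to the API server, each with its own identity; also generate a standalone kubeconfig named admin.conf for administrative operations.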
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
# Generate static Pod manifests for the API server, controller manager, and scheduler.
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 22.005211 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --experimental-upload-certs
[mark-control-plane] Marking the node k8smaster as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8smaster as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
# Generate a token that other nodes can later use to register themselves with the control plane;
[bootstrap-token] Using token: d9kx53.g4t2ia169zyh9byg
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:
# Run these commands on the master. In principle they should be run as a regular user; in this test environment they are run as root;
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:
# The command below must be run on each worker node to join it to the cluster. Save it; it is needed later;
kubeadm join 192.168.0.54:6443 --token d9kx53.g4t2ia169zyh9byg \
    --discovery-token-ca-cert-hash sha256:d8beb243d699f2cb7e5198419887441440d22722ab1cd144121a7f810cc4177a

At this point, running docker image ls on the master lists the images used by the k8s control plane;
[root@k8smaster ~]# docker image ls
REPOSITORY                                                        TAG           IMAGE ID       CREATED         SIZE
quay.io/coreos/flannel                                            v0.14.0-rc1   0a1a2818ce59   3 weeks ago     67.9MB
registry.aliyuncs.com/google_containers/kube-proxy                v1.14.2       5c24210246bb   24 months ago   82.1MB
registry.aliyuncs.com/google_containers/kube-apiserver            v1.14.2       5eeff402b659   24 months ago   210MB
registry.aliyuncs.com/google_containers/kube-controller-manager   v1.14.2       8be94bdae139   24 months ago   158MB
registry.aliyuncs.com/google_containers/kube-scheduler            v1.14.2       ee18f350636d   24 months ago   81.6MB
registry.aliyuncs.com/google_containers/coredns                   1.3.1         eb516548c180   2 years ago     40.3MB
registry.aliyuncs.com/google_containers/etcd                      3.3.10        2c4adeb21b4f   2 years ago     258MB
registry.aliyuncs.com/google_containers/pause                     3.1           da86e6ba6ca1   3 years ago     742kB

1. Error seen when initialization failed
[kubelet-check] Initial timeout of 40s passed.
error execution phase upload-config/kubelet: Error writing Crisocket information for the control-plane node: timed out waiting for the condition
Fix:
swapoff -a && kubeadm reset  && systemctl daemon-reload && systemctl restart kubelet  && iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X
2. Error when running kubectl commands, for example kubectl get pods
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
Fix:
cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

Both problems appeared when a new cluster was redeployed after the previous one had broken, and both were caused by leftover configuration from the old cluster that had not been fully cleaned up;
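The cleanup before a redeploy can be collected into one function so no step is forgotten. The following is a sketch under assumptions: the helper names run and cleanup_old_cluster are hypothetical, and the script only prints the commands unless DRY_RUN=0 is set, because every step is destructive and needs root.

```shell
# Print each command by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove the state left behind by a previous kubeadm init/join on this host.
cleanup_old_cluster() {
    run kubeadm reset -f               # undo kubeadm init/join without prompting
    run rm -f "$HOME/.kube/config"     # drop the kubeconfig of the old cluster
    run iptables -F
    run iptables -t nat -F
    run iptables -t mangle -F
    run iptables -X
    run systemctl daemon-reload
    run systemctl restart kubelet
}

cleanup_plan="$(cleanup_old_cluster)"
echo "$cleanup_plan"
```

After a fresh kubeadm init, admin.conf has to be copied to $HOME/.kube/config again, which is what resolves the x509 error above.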

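IV. Deploying the flannel network plugin

  Deploying flannel requires the kube-flannel.yml file. It can be downloaded from GitHub, and the repository also gives the command to run; URL: https://github.com/flannel-io/flannel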

[root@k8smaster ~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
[root@k8smaster ~]# kubectl get pods -n kube-system
NAME                                READY   STATUS    RESTARTS   AGE
coredns-8686dcc4fd-22qbc            1/1     Running   0          4h52m
coredns-8686dcc4fd-flvfx            1/1     Running   0          4h52m
etcd-k8smaster                      1/1     Running   0          4h51m
kube-apiserver-k8smaster            1/1     Running   0          4h51m
kube-controller-manager-k8smaster   1/1     Running   0          4h51m
kube-flannel-ds-dbfmf               1/1     Running   0          4h36m
kube-flannel-ds-gd2gw               1/1     Running   0          4h44m
kube-flannel-ds-zsrjj               1/1     Running   0          4h36m
kube-proxy-cr7r4                    1/1     Running   0          4h36m
kube-proxy-mnm49                    1/1     Running   0          4h52m
kube-proxy-r9g4b                    1/1     Running   0          4h36m
kube-scheduler-k8smaster            1/1     Running   0          4h51m
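With this many pods it is easy to miss one that is not up yet. A small filter over the kubectl output can list only the pods whose STATUS is not Running. This is a sketch: the function name pods_not_running is hypothetical, the sample input reuses two lines from the output above plus one made-up Pending pod, and pods in a Completed state would also be listed.

```shell
# Read `kubectl get pods` output on stdin and print the names of pods
# whose STATUS column is not "Running" (the header line is skipped).
pods_not_running() {
    awk 'NR > 1 && $3 != "Running" { print $1 }'
}

# Sample input; example-pending-pod is invented for the demo
sample='NAME                     READY   STATUS    RESTARTS   AGE
coredns-8686dcc4fd-22qbc 1/1     Running   0          4h52m
kube-flannel-ds-dbfmf    1/1     Running   0          4h36m
example-pending-pod      0/1     Pending   0          10s'
not_ready="$(printf '%s\n' "$sample" | pods_not_running)"
echo "not running: ${not_ready:-none}"

# On the master (not executed by this sketch):
#   kubectl get pods -n kube-system | pods_not_running
```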

V. Adding worker nodes; run the following on every node that is to join the cluster

[root@k8snode1 ~]# kubeadm join 192.168.0.54:6443 --token d9kx53.g4t2ia169zyh9byg --discovery-token-ca-cert-hash sha256:d8beb243d699f2cb7e5198419887441440d22722ab1cd144121a7f810cc4177a
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 18.09
	[WARNING Hostname]: hostname "k8snode2" could not be reached
	[WARNING Hostname]: hostname "k8snode2": lookup k8snode2 on 114.114.114.114:53: no such host
	[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Check the status on the master:
[root@k8smaster ~]# kubectl get nodes -o wide
NAME        STATUS   ROLES    AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION          CONTAINER-RUNTIME
k8smaster   Ready    master   5h7m    v1.14.2   192.168.0.54   <none>        CentOS Linux 7 (Core)   3.10.0-957.el7.x86_64   docker://20.10.6
k8snode1    Ready    <none>   4h51m   v1.14.2   192.168.0.68   <none>        CentOS Linux 7 (Core)   3.10.0-957.el7.x86_64   docker://20.10.6
k8snode2    Ready    <none>   4h50m   v1.14.2   192.168.0.56   <none>        CentOS Linux 7 (Core)   3.10.0-957.el7.x86_64   docker://20.10.6

Errors encountered while adding nodes, and how to fix them
[root@k8snode1 ~]# kubeadm join 192.168.0.54:6443 --token d9kx53.g4t2ia169zyh9byg --discovery-token-ca-cert-hash sha256:d8beb243d699f2cb7e5198419887441440d22722ab1cd144121a7f810cc4177a
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 18.09
	[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
	[ERROR FileAvailable--etc-kubernetes-bootstrap-kubelet.conf]: /etc/kubernetes/bootstrap-kubelet.conf already exists
	[ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
	[ERROR Swap]: running with swap on is not supported. Please disable swap
	[ERROR FileAvailable--etc-kubernetes-pki-ca.crt]: /etc/kubernetes/pki/ca.crt already exists
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
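## The "already exists" errors mean files are left over from an earlier attempt; delete everything under /etc/kubernetes/ (kubeadm reset also cleans these up) and run the join again
##############
IPv4 forwarding must be enabled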
[root@k8snode1 ~]# kubeadm join 192.168.0.54:6443 --token d9kx53.g4t2ia169zyh9byg --discovery-token-ca-cert-hash sha256:d8beb243d699f2cb7e5198419887441440d22722ab1cd144121a7f810cc4177a
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 18.09
	[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR FileContent--proc-sys-net-ipv4-ip_forward]: /proc/sys/net/ipv4/ip_forward contents are not set to 1
	[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
[root@k8snode1 ~]#
[root@k8snode1 ~]# cat /proc/sys/net/ipv4/ip_forward
0
[root@k8snode1 ~]# echo 1 > /proc/sys/net/ipv4/ip_forward
[root@k8snode1 ~]# cat /proc/sys/net/ipv4/ip_forward
1
################
Swap must be turned off (swapoff -a, as in the environment preparation steps)
[root@k8snode1 ~]# kubeadm join 192.168.0.54:6443 --token d9kx53.g4t2ia169zyh9byg --discovery-token-ca-cert-hash sha256:d8beb243d699f2cb7e5198419887441440d22722ab1cd144121a7f810cc4177a
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 18.09
	[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR Swap]: running with swap on is not supported. Please disable swap
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
[root@k8snode1 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           1.8G        201M        821M        9.7M        796M        1.4G
Swap:          2.0G          0B        2.0G
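Both of the preflight failures above (ip_forward and swap) can be checked before running kubeadm join. The sketch below is an illustration only: the function name check_join_prereqs is hypothetical, it reads the same /proc files the kernel exposes, and the file paths are parameters so the demo can run against scratch files that imitate the failing node.

```shell
# check_join_prereqs [IP_FORWARD_FILE] [SWAPS_FILE]
# Prints one line per problem and returns non-zero if any is found.
check_join_prereqs() {
    ipf_file="${1:-/proc/sys/net/ipv4/ip_forward}"
    swaps_file="${2:-/proc/swaps}"
    problems=0
    if [ "$(cat "$ipf_file")" != "1" ]; then
        echo "ip_forward is not 1"
        problems=$((problems + 1))
    fi
    # /proc/swaps has one header line; any further line is an active swap device
    if awk 'NR > 1' "$swaps_file" | grep -q .; then
        echo "swap is still enabled"
        problems=$((problems + 1))
    fi
    [ "$problems" -eq 0 ]
}

# Demo with scratch files that imitate the failing node (made-up device name)
demo_dir="$(mktemp -d)"
echo 0 > "$demo_dir/ip_forward"
printf 'Filename\tType\tSize\tUsed\tPriority\n/dev/dm-1\tpartition\t2097148\t0\t-2\n' > "$demo_dir/swaps"
if check_join_prereqs "$demo_dir/ip_forward" "$demo_dir/swaps"; then
    echo "ready to join"
else
    echo "fix the problems above, then rerun kubeadm join"
fi
```

Called without arguments on a real node, the function reads the live /proc files.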

Handling an expired token
[root@k8snode1 ~]# kubeadm join 192.168.0.54:6443 --token d9kx53.g4t2ia169zyh9byg --discovery-token-ca-cert-hash sha256:d8beb243d699f2cb7e5198419887441440d22722ab1cd144121a7f810cc4177a
[preflight] Running pre-flight checks
[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.6. Latest validated version: 18.09
error execution phase preflight: couldn't validate the identity of the API Server: abort connecting to API servers after timeout of 5m0s
## Here this error was caused by an expired token (bootstrap tokens are valid for 24 hours by default);
[root@k8smaster ~]# kubeadm token create --print-join-command
kubeadm join 192.168.0.54:6443 --token 1axrit.s0u8ar8v0d218t0r     --discovery-token-ca-cert-hash sha256:d8beb243d699f2cb7e5198419887441440d22722ab1cd144121a7f810cc4177a
# Use the newly generated command to join the node to the cluster;
[root@k8smaster ~]# kubeadm token list
TOKEN                     TTL       EXPIRES                     USAGES                   DESCRIPTION   EXTRA GROUPS
1axrit.s0u8ar8v0d218t0r   23h       2021-07-01T14:31:46+08:00   authentication,signing   <none>        system:bootstrappers:kubeadm:default-node-token
... ...
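The --discovery-token-ca-cert-hash value stays the same across tokens because it is derived from the cluster CA certificate, not from the token. If the original join command is lost, the hash can be recomputed on the master. The sketch below wraps the openssl pipeline from the kubeadm join documentation in a function; the name ca_cert_hash is hypothetical, and it assumes openssl is installed and the CA uses an RSA key, as a kubeadm-generated CA does by default.

```shell
# Print the sha256 hash of the public key in a CA certificate, in the
# form kubeadm expects after the "sha256:" prefix.
ca_cert_hash() {
    openssl x509 -pubkey -noout -in "$1" \
        | openssl rsa -pubin -outform der 2>/dev/null \
        | openssl dgst -sha256 -hex \
        | sed 's/^.* //'
}

# On the master (not executed by this sketch):
#   echo "sha256:$(ca_cert_hash /etc/kubernetes/pki/ca.crt)"
```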

VI. Running a test container in the cluster from the command line

1. Run docker search nginx first and pick a demo nginx image
nginxdemos/hello                   NGINX webserver that serves a simple page co…   68                   [OK]
2. Run the image in the cluster
[root@k8smaster ~]# kubectl create deployment nginx --image="nginxdemos/hello"
deployment.apps/nginx created
[root@k8smaster ~]# kubectl get pods -o wide
NAME                    READY   STATUS    RESTARTS   AGE   IP           NODE       NOMINATED NODE   READINESS GATES
nginx-dcf8cc94c-5snlw   1/1     Running   0          11m   10.244.1.2   k8snode1   <none>           <none>
3. Test
curl -vo /dev/null "10.244.1.2"

VII. Scaling to multiple replicas

kubectl scale deployment nginx --replicas=3
[root@k8smaster ~]# kubectl get pods -o wide
NAME                    READY   STATUS    RESTARTS   AGE     IP           NODE       NOMINATED NODE   READINESS GATES
nginx-dcf8cc94c-5snlw   1/1     Running   0          4h45m   10.244.1.2   k8snode1   <none>           <none>
nginx-dcf8cc94c-mncrp   1/1     Running   0          123m    10.244.2.2   k8snode2   <none>           <none>
nginx-dcf8cc94c-wc7wv   1/1     Running   0          123m    10.244.1.3   k8snode1   <none>           <none>

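VIII. Creating a service

  Scaling to 3 replicas introduces a problem: when a pod is deleted (kubectl delete pods <pod name>), the deployment automatically creates a replacement, and the replacement gets a new IP address. Requests to the old IP then fail, and every client would have to switch to the new address. To avoid this, create a service, which acts much like a load balancer: clients access the service address, and changes to the backend pods do not affect them;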

[root@k8smaster ~]# kubectl create service clusterip nginx --tcp=80:80
service/nginx created

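clusterip  the service type
nginx  the service name; it must match the name of the deployment created earlier, because the service selects pods by the app=nginx label that kubectl create deployment sets;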
[root@k8smaster ~]# kubectl get service
NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP   5h26m
nginx        ClusterIP   10.108.249.197   <none>        80/TCP    129m
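The name matching matters because of the label selector. Written out as a manifest, the service created above looks roughly like the following sketch. It is an approximation of what kubectl generates, not a dump from this cluster, and it is written to a scratch directory; the selector app: nginx is what ties the service to the pods of the nginx deployment.

```shell
# Write a Service manifest roughly equivalent to:
#   kubectl create service clusterip nginx --tcp=80:80
svc_dir="$(mktemp -d)"
cat > "$svc_dir/nginx-service.yaml" <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: ClusterIP
  selector:
    app: nginx        # matches the label set by `kubectl create deployment nginx`
  ports:
    - name: 80-80
      protocol: TCP
      port: 80        # port exposed on the cluster IP
      targetPort: 80  # port the container listens on
EOF
echo "wrote $svc_dir/nginx-service.yaml"

# On the master (not executed by this sketch):
#   kubectl apply -f "$svc_dir/nginx-service.yaml"
```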

Test:
[root@k8smaster ~]# curl -I "10.108.249.197"
HTTP/1.1 200 OK
Server: nginx/1.13.8
Date: Wed, 12 May 2021 07:55:24 GMT
Content-Type: text/html
Connection: keep-alive
Expires: Wed, 12 May 2021 07:55:23 GMT
Cache-Control: no-cache