A Web Cluster Project Based on Docker and Kubernetes

Project Walkthrough

Lab Topology

Versions

Six Linux servers (each with 2 GB of RAM and 1 CPU core), CentOS 7.7, keepalived 1.3.5, NFS v4, Docker 20.10.6, Nginx 1.19.0, Kubernetes 1.21.3

Role Planning and Assignment

Role         IP Address                      Notes
k8s-master   192.168.181.174
k8s-node-1   192.168.181.164
k8s-node-2   192.168.181.165
k8s-node-3   192.168.181.166
k8s-nfs      192.168.181.177
LB1          192.168.0.5 / 192.168.181.10    two NICs
LB2          192.168.0.6 / 192.168.181.11    two NICs
VIP          192.168.0.10                    keepalived virtual IP

Installation Steps

Preparation

  • Set the hostnames
[root@localhost ~]# hostnamectl set-hostname k8s-master
[root@localhost ~]# su - root
Last login: Sun Jun 13 09:25:54 CST 2021 from desktop-em27t6u.lan on pts/0
[root@k8s-master ~]#
[root@localhost ~]# hostnamectl set-hostname k8s-node-1
[root@localhost ~]# sed -i 41s/W/w/ /etc/bashrc     # replace on line 41 of /etc/bashrc (personal preference)
[root@localhost ~]# su - root
Last login: Sun Jun 13 09:25:45 CST 2021 from desktop-em27t6u.lan on pts/0
[root@k8s-node-1 ~]# 
  • Disable the firewall and SELinux
# stop and disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# disable SELinux for the current boot
setenforce 0
# disable SELinux permanently
sed -i '/^SELINUX/ s/enforcing/disabled/' /etc/selinux/config
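A quick check that both changes took effect (firewalld should report inactive; getenforce reports Permissive until the next reboot and Disabled afterwards):

systemctl is-active firewalld
getenforce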

Install Docker (script)

#!/bin/bash

#install build/runtime dependencies
yum install -y yum-utils zlib zlib-devel openssl openssl-devel pcre pcre-devel gcc gcc-c++ autoconf automake make  psmisc  lsof  net-tools vim python3 

#install Docker
##remove any old Docker versions
yum remove docker \
                  docker-client \
                  docker-client-latest \
                  docker-common \
                  docker-latest \
                  docker-latest-logrotate \
                  docker-logrotate \
                  docker-engine
##install the yum-utils package and add the Docker CE repo
yum install -y yum-utils
yum-config-manager \
      --add-repo \
          http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
##install Docker and enable it at boot
yum install -y docker-ce docker-ce-cli containerd.io
systemctl start docker
systemctl enable docker
##configure Docker to use systemd as the default cgroup driver
cat <<EOF > /etc/docker/daemon.json
{
   "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
#restart Docker to apply the daemon.json change
systemctl restart docker
#disable swap now and permanently (comment out swap entries in /etc/fstab)
swapoff -a
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
#add the cluster hosts to /etc/hosts
cat >> /etc/hosts << EOF
192.168.181.174 k8s-master
192.168.181.164 k8s-node-1
192.168.181.165 k8s-node-2
192.168.181.166 k8s-node-3
EOF
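A quick sanity check after the script finishes (the exact wording of docker info output may vary slightly by Docker version):

docker info | grep -i "cgroup driver"     # should show: Cgroup Driver: systemd
swapon --show                             # should print nothing if swap is fully off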

Install kubeadm and Related Tools

# Add the Kubernetes YUM repository (Aliyun mirror)
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# Install kubeadm, kubelet and kubectl
yum install -y kubelet kubeadm kubectl
# Enable kubelet at boot and start it now
systemctl enable --now kubelet
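This article was written against Kubernetes 1.21.3; an unpinned install pulls whatever version is current in the repository. To reproduce the exact versions used here, a pinned install might look like this (the version numbers come from the article; their availability in the mirror is an assumption):

yum install -y kubelet-1.21.3 kubeadm-1.21.3 kubectl-1.21.3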

Run kubeadm init to Set Up the Master

  • Run on the master node
[root@k8s-master ~]# kubeadm init \
> --apiserver-advertise-address=192.168.181.174 \
> --image-repository registry.aliyuncs.com/google_containers \
> --service-cidr=10.1.0.0/16 \
> --pod-network-cidr=10.244.0.0/16
  • On success it prints the following output
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.181.174:6443 --token l2z66h.nzi1ne0fhypar9o4 \
	--discovery-token-ca-cert-hash sha256:6fa3d42ab22a789a92d915d438ed1a8dd47d1e6fca70d0ed79c30f26d46fa11d
  • Continue with the commands from the prompt
[root@k8s-master ~]#  mkdir -p $HOME/.kube
[root@k8s-master ~]#  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@k8s-master ~]#  sudo chown $(id -u):$(id -g) $HOME/.kube/config
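The join token printed by kubeadm init is only valid for 24 hours by default. If it has expired by the time a node joins, a fresh join command can be generated on the master:

kubeadm token create --print-join-command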

Install the Nodes and Join Them to the Cluster

  • Install kubeadm and kubelet on each node (same preparation and install steps as above)
  • Join the node to the cluster
kubeadm join 192.168.181.174:6443 --token l2z66h.nzi1ne0fhypar9o4 \
	--discovery-token-ca-cert-hash sha256:6fa3d42ab22a789a92d915d438ed1a8dd47d1e6fca70d0ed79c30f26d46fa11d
  • On success it prints the following output
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
  • Check on the master (all nodes are NotReady at this point because no network plugin has been installed yet)
[root@k8s-master ~]# kubectl get node
NAME         STATUS     ROLES                  AGE     VERSION
k8s-master   NotReady   control-plane,master   4m24s   v1.21.3
k8s-node-1   NotReady   <none>                 37s     v1.21.3
k8s-node-2   NotReady   <none>                 29s     v1.21.3 

Install the CNI Network Plugin (flannel)

[root@k8s-master ~]# cat kube-flannel.yaml 
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: psp.flannel.unprivileged
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
    seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
    apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
    apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
  privileged: false
  volumes:
  - configMap
  - secret
  - emptyDir
  - hostPath
  allowedHostPaths:
  - pathPrefix: "/etc/cni/net.d"
  - pathPrefix: "/etc/kube-flannel"
  - pathPrefix: "/run/flannel"
  readOnlyRootFilesystem: false
  # Users and groups
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  # Privilege Escalation
  allowPrivilegeEscalation: false
  defaultAllowPrivilegeEscalation: false
  # Capabilities
  allowedCapabilities: ['NET_ADMIN', 'NET_RAW']
  defaultAddCapabilities: []
  requiredDropCapabilities: []
  # Host namespaces
  hostPID: false
  hostIPC: false
  hostNetwork: true
  hostPorts:
  - min: 0
    max: 65535
  # SELinux
  seLinux:
    # SELinux is unused in CaaSP
    rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
rules:
- apiGroups: ['extensions']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-system
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-flannel-ds
  namespace: kube-system
  labels:
    tier: node
    app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni
        image: quay.io/coreos/flannel:v0.13.1-rc2
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: quay.io/coreos/flannel:v0.13.1-rc2
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
          limits:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
[root@k8s-master ~]# 

[root@k8s-master ~]# kubectl apply -f kube-flannel.yaml 
Warning: policy/v1beta1 PodSecurityPolicy is deprecated in v1.21+, unavailable in v1.25+
podsecuritypolicy.policy/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.apps/kube-flannel-ds created
  • Check the result
[root@k8s-master ~]# kubectl get nodes
NAME         STATUS   ROLES                  AGE     VERSION
k8s-master   Ready    control-plane,master   9m46s   v1.21.3
k8s-node-1   Ready    <none>                 5m59s   v1.21.3
k8s-node-2   Ready    <none>                 5m51s   v1.21.3
[root@k8s-master ~]# ps aux|grep flannel
root       8345  0.3  1.1 1340152 21572 ?       Ssl  23:04   0:00 /opt/bin/flanneld --ip-masq --kube-subnet-mgr
root       9228  0.0  0.0 112824   984 pts/0    S+   23:05   0:00 grep --color=auto flannel
[root@k8s-master ~]#  kubectl get pod -n kube-system
NAME                                 READY   STATUS    RESTARTS   AGE
coredns-59d64cd4d4-d677g             1/1     Running   0          9m34s
coredns-59d64cd4d4-vkfkb             1/1     Running   0          9m34s
etcd-k8s-master                      1/1     Running   0          9m47s
kube-apiserver-k8s-master            1/1     Running   0          9m47s
kube-controller-manager-k8s-master   1/1     Running   0          9m48s
kube-flannel-ds-4dm2f                1/1     Running   0          3m21s
kube-flannel-ds-t4hzl                1/1     Running   0          3m21s
kube-flannel-ds-wbdgd                1/1     Running   0          3m21s
kube-proxy-48bbp                     1/1     Running   0          5m56s
kube-proxy-rw4vv                     1/1     Running   0          9m34s
kube-proxy-stqkd                     1/1     Running   0          6m4s
kube-scheduler-k8s-master            1/1     Running   0          9m49s
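The load balancers configured in the next section proxy to NodePort 31002 on each worker node. The article does not show the manifest for that web service; a minimal sketch of what it might look like is below (pods named my-nginx appear later in the article, but the replica count, labels, and Service definition here are assumptions):

cat > my-nginx.yaml << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-nginx
  template:
    metadata:
      labels:
        app: my-nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.19.0
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: my-nginx
spec:
  type: NodePort
  selector:
    app: my-nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 31002    # matches the upstream port used in the nginx load-balancer config
EOF
kubectl apply -f my-nginx.yaml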

Configure the Server Load Balancer (SLB)

  • Install Nginx
yum install -y nginx
systemctl stop firewalld
systemctl disable firewalld
  • Start Nginx and edit the configuration file
systemctl start nginx

cat /etc/nginx/nginx.conf
	......
http {
    upstream web_pools {
	server 192.168.181.164:31002;
	server 192.168.181.165:31002;
	server 192.168.181.166:31002;
	}
    server {
	location / {
		proxy_pass http://web_pools;
			}
	}
	......
}
  • Check the syntax, then reload
[root@k8s-SLB1 ~]# nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
[root@k8s-SLB1 ~]# nginx -s reload
  • Verify the result (original screenshots omitted; a quick command-line check is sketched below)
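From a machine on the 192.168.181.0/24 network, repeated requests against the load balancer should all return the page served by the backend pods (the address used here is LB1's internal NIC; this check is a sketch, not part of the original article):

for i in 1 2 3; do curl -s http://192.168.181.10/ | grep -i '<title>'; done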

Configure keepalived

  • Install keepalived
[root@k8s-SLB1 ~]# yum install keepalived -y
[root@k8s-SLB1 ~]# keepalived --version
Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2

Copyright(C) 2001-2017 Alexandre Cassen, <acassen@gmail.com>

Build options:  PIPE2 LIBNL3 RTA_ENCAP RTA_EXPIRES RTA_PREF RTA_VIA FRA_OIFNAME FRA_SUPPRESS_PREFIXLEN FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK LIBIPTC LIBIPSET_DYNAMIC LVS LIBIPVS_NETLINK VRRP VRRP_AUTH VRRP_VMAC SOCK_NONBLOCK SOCK_CLOEXEC FIB_ROUTING INET6_ADDR_GEN_MODE SNMP_V3_FOR_V2 SNMP SNMP_KEEPALIVED SNMP_CHECKER SNMP_RFC SNMP_RFCV2 SNMP_RFCV3 SO_MARK
[root@k8s-SLB1 ~]# 
  • Edit the configuration files
Master (SLB1)
[root@k8s-SLB1 keepalived]# cat keepalived.conf 
! Configuration File for keepalived

global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   #vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 120
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.0.10
    }
}
[root@k8s-SLB1 keepalived]# 
------------------------------------
Backup (SLB2)
[root@k8s-SLB2 keepalived]# cat keepalived.conf 
! Configuration File for keepalived

global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   # vrrp_strict
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.0.10
    }
}

[root@k8s-SLB2 keepalived]#  
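After restarting keepalived on both machines, the VIP should sit on SLB1 (the MASTER, priority 120) and move to SLB2 when SLB1 goes down. A quick way to check:

systemctl restart keepalived
ip addr show ens33 | grep 192.168.0.10    # on SLB1: the VIP should be listed here
systemctl stop keepalived                 # on SLB1, to simulate a failure
ip addr show ens33 | grep 192.168.0.10    # on SLB2: the VIP should now have moved here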

Deploy Prometheus and Grafana

2.1 Environment setup on the master/node
The deployment below can be done on the master node.
Install git and download the related YAML files:

yum install git -y
git clone https://gitee.com/liugpwwwroot/k8s-prometheus-grafana.git

2.2 Deploy the node-exporter component as a DaemonSet

kubectl create -f k8s-prometheus-grafana/node-exporter.yaml
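To confirm that one exporter pod is running on each node (assuming the pod names contain node-exporter, as in this repository's manifests):

kubectl get pods -n kube-system -o wide | grep node-exporter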

2.3 Deploy the Prometheus component
2.3.1 The RBAC file

kubectl create -f  k8s-prometheus-grafana/prometheus/rbac-setup.yaml

2.3.2 Manage the Prometheus configuration file as a ConfigMap

kubectl create -f  k8s-prometheus-grafana/prometheus/configmap.yaml 

2.3.3 The Prometheus Deployment file

kubectl create -f  k8s-prometheus-grafana/prometheus/prometheus.deploy.yml

2.3.4 The Prometheus Service file

kubectl create -f  k8s-prometheus-grafana/prometheus/prometheus.svc.yml 

2.4 Deploy the Grafana component

2.4.1 The Grafana Deployment file

kubectl create -f   k8s-prometheus-grafana/grafana/grafana-deploy.yaml

2.4.2 The Grafana Service file

kubectl create -f   k8s-prometheus-grafana/grafana/grafana-svc.yaml

2.4.3 The Grafana Ingress file

kubectl create -f   k8s-prometheus-grafana/grafana/grafana-ing.yaml
  • Check the result
[root@k8s-master ~]# kubectl get svc -n kube-system
NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                  AGE
grafana         NodePort    10.1.57.170    <none>        3000:30106/TCP           127m
kube-dns        ClusterIP   10.1.0.10      <none>        53/UDP,53/TCP,9153/TCP   7d18h
node-exporter   NodePort    10.1.224.117   <none>        9100:31672/TCP           131m
prometheus      NodePort    10.1.248.131   <none>        9090:30003/TCP           131m
[root@k8s-master ~]# 

2.5.0 Access Prometheus (NodePort 30003 from the service list above)
2.5.1 Access node-exporter (NodePort 31672 in the service list above); this exporter only collects host-level metrics from the node itself

http://192.168.181.174:31672/metrics


2.5.2 cAdvisor is an open-source agent that monitors resource usage and analyzes container performance, i.e. a container metrics collector. It is built into the kubelet, so installing Kubernetes installs cAdvisor automatically. Its own UI has essentially been deprecated (it was removed entirely after Kubernetes 1.12), but the metrics can still be charted in Grafana. By default cAdvisor is not reachable from outside the node, but it can be accessed through a proxy:

# This only allows access from the host itself, i.e. localhost / 127.0.0.1
[root@k8s-master ~]# kubectl proxy
Starting to serve on 127.0.0.1:8001
# Let the proxy accept requests from any host
[root@k8s-master ~]# kubectl proxy --address='0.0.0.0'  --accept-hosts='^*$'
Starting to serve on [::]:8001

Example:

# Metrics collected by cAdvisor for the k8s-master node:
http://192.168.181.174:8001/api/v1/nodes/k8s-master/proxy/metrics/cadvisor


2.5.3 Access the Prometheus targets page (NodePort 30003 in the service list above)

http://192.168.181.174:30003/targets

2.5.4 Access Grafana; the default username and password are both admin (NodePort 30106 in the service list above)

http://192.168.181.174:30106/?orgId=1

1. Add the data source, using direct access mode
2. Import a dashboard: Home -> Dashboards -> Import
You can type the template ID 315 to import it online, or download the corresponding JSON template and import it locally. Template download: https://grafana.com/grafana/dashboards/315
After a moment the dashboard is rendered; remember to select prometheus as the data source.
Click Import to view the dashboard.


Problems Encountered:

1. Image pull failure during initialization

Error message: failed to pull image registry.aliyuncs.com/google_containers/coredns/coredns:v1.8.0

[root@k8s-master ~]# kubeadm init --apiserver-advertise-address=192.168.181.163 --image-repository registry.aliyuncs.com/google_containers --service-cidr=10.1.0.0/16 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.21.1
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR ImagePull]: failed to pull image registry.aliyuncs.com/google_containers/coredns/coredns:v1.8.0: output: Error response from daemon: pull access denied for registry.aliyuncs.com/google_containers/coredns/coredns, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
[root@k8s-master ~]# 
  • First, list the images kubeadm needs and try pulling them; this shows that only the last image (coredns) keeps failing to pull.
[root@k8s-master ~]# kubeadm config images list
k8s.gcr.io/kube-apiserver:v1.21.3
k8s.gcr.io/kube-controller-manager:v1.21.3
k8s.gcr.io/kube-scheduler:v1.21.3
k8s.gcr.io/kube-proxy:v1.21.3
k8s.gcr.io/pause:3.4.1
k8s.gcr.io/etcd:3.4.13-0
k8s.gcr.io/coredns/coredns:v1.8.0
[root@k8s-master ~]# kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.21.3
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.21.3
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.21.3
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.21.3
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.4.1
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.4.13-0
failed to pull image "registry.aliyuncs.com/google_containers/coredns:v1.8.0": output: Error response from daemon: manifest for registry.aliyuncs.com/google_containers/coredns:v1.8.0 not found: manifest unknown: manifest unknown
, error: exit status 1
To see the stack trace of this error execute with --v=5 or higher
[root@k8s-master ~]#
  • Solution
Step 1: pull the image from another repository
	[root@k8s-master ~]# docker pull coredns/coredns:1.8.0
Step 2: re-tag it for the chosen image repository
	[root@k8s-master ~]# docker tag coredns/coredns:1.8.0 registry.aliyuncs.com/google_containers/coredns:v1.8.0 
Step 3: remove the image that is no longer needed
	[root@k8s-master ~]# docker rmi coredns/coredns:1.8.0
Step 4: re-run kubeadm init, which now succeeds
	[root@k8s-master ~]# kubeadm init \
	--apiserver-advertise-address=192.168.181.174 \
	--image-repository=registry.aliyuncs.com/google_containers \
	--service-cidr=10.1.0.0/16 \
	--pod-network-cidr=10.244.0.0/16
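A general note, not from the original article: if an earlier kubeadm init attempt got past the preflight checks and partially initialized the node, it usually needs to be reset before retrying:

kubeadm reset -f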

2. Joining nodes fails because /proc/sys/net/bridge/bridge-nf-call-iptables is not set to 1

Error message: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1

[root@k8s-master ~]# kubeadm init --apiserver-advertise-address=192.168.181.163 --image-repository registry.aliyuncs.com/google_containers --service-cidr=10.1.0.0/16 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.21.1
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
[root@k8s-master ~]#

Solution

[root@k8s-master ~]# echo 1 >/proc/sys/net/bridge/bridge-nf-call-iptables 
[root@k8s-master ~]# cat /proc/sys/net/bridge/bridge-nf-call-iptables
1
[root@k8s-master ~]# echo 1 >/proc/sys/net/bridge/bridge-nf-call-ip6tables 
[root@k8s-master ~]# cat /proc/sys/net/bridge/bridge-nf-call-ip6tables
1
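Writing directly to /proc only lasts until the next reboot. A common way to make the setting persistent (not shown in the original) is a sysctl drop-in:

cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
modprobe br_netfilter      # the bridge sysctls only exist once this module is loaded
sysctl --system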

3. coredns pods in CrashLoopBackOff

  • After the cluster had been successfully deployed, the VMs were suspended; when they were powered back on, the coredns containers failed to pull their image and sat in CrashLoopBackOff, stuck in an endless restart loop.
[root@k8s-master ~]# kubectl get pods -o wide -A
NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE     IP                NODE         NOMINATED NODE   READINESS GATES
kube-system   coredns-59d64cd4d4-d677g             1/1     Running   126        3d13h   10.244.0.9        k8s-master   <none>           <none>
kube-system   coredns-59d64cd4d4-vkfkb             1/1     Running   126        3d13h   10.244.0.8        k8s-master   <none>           <none>
kube-system   etcd-k8s-master                      1/1     Running   3          3d13h   192.168.181.174   k8s-master   <none>           <none>
kube-system   kube-apiserver-k8s-master            1/1     Running   3          3d13h   192.168.181.174   k8s-master   <none>           <none>
[root@k8s-master ~]# kubectl logs -f coredns-59d64cd4d4-d677g -n kube-system
[root@k8s-master ~]# kubectl describe pod coredns-59d64cd4d4-d677g -n kube-system

Solution:

  • Restart docker and kubelet, and flush the iptables rules
[root@k8s-master ~]# systemctl stop kubelet
[root@k8s-master ~]# systemctl stop docker
[root@k8s-master ~]# iptables --flush
[root@k8s-master ~]# iptables -t nat --flush
[root@k8s-master ~]# systemctl start docker
[root@k8s-master ~]# systemctl start kubelet
  • Check the result
[root@k8s-master ~]# kubectl get pods -n kube-system

Summary: try restarting docker and kubelet, flush the firewall rules, and check the DNS resolver file /etc/resolv.conf.

4. A flannel networking problem left pods on the other nodes unreachable from the master

[root@k8s-master ~]# kubectl get pods -A -o wide
NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE     IP                NODE         NOMINATED NODE   READINESS GATES
default       my-nginx-7f4fc68488-7cz7z            1/1     Running   2          4h36m   10.244.1.14       k8s-node-1   <none>           <none>
default       my-nginx-7f4fc68488-cbrjf            1/1     Running   2          4h36m   10.244.2.12       k8s-node-2   <none>           <none>
default       my-nginx-7f4fc68488-lpp4h            1/1     Running   2          4h36m   10.244.1.13       k8s-node-1   <none>           <none>
default       my-nginx-7f4fc68488-mdbmj            1/1     Running   2          4h36m   10.244.1.12       k8s-node-1   <none>           <none>
default       my-nginx-7f4fc68488-pnrgv            1/1     Running   2          4h36m   10.244.2.13       k8s-node-2   <none>           <none>
[root@k8s-master ~]#
[root@k8s-master ~]# curl 10.244.1.14
^C
[root@k8s-master ~]# curl 10.244.2.12
^C

Solution:

  • Delete the flannel.1 interface on the nodes; run on each node:
ip link delete flannel.1
  • Set net.ipv4.ip_forward=1 in /etc/sysctl.conf to enable IP forwarding.
echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf
Reboot the server to apply the change, or run sysctl -p to apply it without rebooting; the setting then persists across reboots.
  • Restart the network service
systemctl restart network
  • On the master, delete the flannel resources and re-apply the flannel YAML (an optional check is sketched right after these commands)
kubectl delete -f kube-flannel.yaml
kubectl apply -f kube-flannel.yaml
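  • (Optional check, not in the original article) confirm that the flannel pods have come back up on every node; the app=flannel label comes from the DaemonSet shown earlier
kubectl get pods -n kube-system -l app=flannel -o wide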
  • Check the nodes with ip a; the flannel.1 interface now has an IP address on each node
4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether c2:cd:e3:2b:83:40 brd ff:ff:ff:ff:ff:ff
    inet 10.244.1.0/32 brd 10.244.1.0 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::c0cd:e3ff:fe2b:8340/64 scope link 
       valid_lft forever preferred_lft forever

4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default 
    link/ether 2e:eb:d9:49:0a:d9 brd ff:ff:ff:ff:ff:ff
    inet 10.244.2.0/32 brd 10.244.2.0 scope global flannel.1
       valid_lft forever preferred_lft forever
    inet6 fe80::2ceb:d9ff:fe49:ad9/64 scope link 
       valid_lft forever preferred_lft forever
  • Verify the result
    1. From the master, the pods on the nodes can now be reached with ping, and curl works as well
[root@k8s-master ~]# ping 10.244.2.12 -c 1
PING 10.244.2.12 (10.244.2.12) 56(84) bytes of data.
64 bytes from 10.244.2.12: icmp_seq=1 ttl=63 time=2.74 ms

--- 10.244.2.12 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.741/2.741/2.741/0.000 ms
[root@k8s-master ~]# ping 10.244.1.14 -c 1
PING 10.244.1.14 (10.244.1.14) 56(84) bytes of data.
64 bytes from 10.244.1.14: icmp_seq=1 ttl=63 time=1.37 ms

--- 10.244.1.14 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.376/1.376/1.376/0.000 ms
[root@k8s-master ~]# 
[root@k8s-master ~]# curl 10.244.1.14
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
[root@k8s-master ~]# curl 10.244.2.12
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
[root@k8s-master ~]# 

2. Use kubectl exec -it my-nginx-7f4fc68488-7cz7z /bin/bash to get a shell in one of the pods

Install the curl and ping commands inside the container:
root@my-nginx-7f4fc68488-7cz7z:/# apt-get update && apt-get install -y curl iputils-ping

[root@k8s-master ~]# kubectl exec -it my-nginx-7f4fc68488-7cz7z /bin/bash
root@my-nginx-7f4fc68488-7cz7z:/# curl -l 10.244.2.12
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
root@my-nginx-7f4fc68488-7cz7z:/# 

Summary: there was one more hiccup along the way: while verifying, pods on node1 could be pinged but pods on node2 could not; in the end, shutting down and restarting all the machines resolved the problem.
