Setting Up a k8s Environment

Platform Planning

Single-master cluster

Multi-master cluster

Hardware Requirements

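Environment	Node	Hardware requirements
Test	Master	2 cores, 4 GB RAM, 20 GB disk
Test	Node	4 cores, 8 GB RAM, 40 GB disk
Production	Master	8 cores, 16 GB RAM, 100 GB disk
Production	Node	16 cores, 64 GB RAM, 500 GB disk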

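Setup

Preparation:

Provision 3 virtual machines or cloud hosts.
Install the Ubuntu operating system and perform the initial configuration:
- Disable the firewall
- Disable SELinux (Ubuntu does not enable it by default, so this only applies if it has been installed)
- Disable swap, then comment out the swap entry in /etc/fstab so it stays off after a reboot
  $ swapoff -a
  $ vim /etc/fstab
- Set the hostname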
```
$ hostnamectl set-hostname k8smaster
```
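The manual vim edit of /etc/fstab can also be scripted. This is only a minimal sketch: the helper name comment_out_swap is made up for illustration, and it assumes a swap entry carries the word swap as a whitespace-separated field.

```shell
# Hypothetical helper: comment out active swap entries in an fstab-style file.
# Assumption: a swap entry has "swap" as a whitespace-delimited field.
comment_out_swap() {
  fstab=${1:-/etc/fstab}
  # Keep a .bak copy; lines that already start with '#' are left alone.
  sed -E -i.bak '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab"
}
```

Run as root after swapoff -a, for example comment_out_swap /etc/fstab, and check the file before rebooting.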

Then add the hosts entries on the Master:

$ cat >> /etc/hosts << EOF
192.168.241.128 k8smaster
192.168.241.129 k8sworker
EOF
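Because cat >> appends every time it runs, repeating the step leaves duplicate lines in /etc/hosts. One possible way around that is sketched here; add_host_entry is a hypothetical helper, not something shipped with the system.

```shell
# Hypothetical helper: append "IP NAME" to a hosts file only if NAME
# is not already listed on a non-comment line.
add_host_entry() {
  hosts_file=$1
  ip=$2
  name=$3
  if ! awk -v n="$name" '
         $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
         END { exit !found }' "$hosts_file" 2>/dev/null; then
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
  fi
}
```

For example, add_host_entry /etc/hosts 192.168.241.128 k8smaster can be run repeatedly without growing the file.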

Pass bridged IPv4 traffic to the iptables chains and apply the settings:

$ cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
$ modprobe br_netfilter
$ sysctl --system
$ echo "1" > /proc/sys/net/ipv4/ip_forward
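Neither modprobe nor the echo into /proc survives a reboot. A rough sketch of making both persistent follows, assuming a systemd-based system that reads /etc/modules-load.d and /etc/sysctl.d; the function name and its directory arguments are illustrative only. It rewrites k8s.conf with the two bridge settings plus ip_forward.

```shell
# Hypothetical helper: persist the kernel module and sysctl settings.
# Directory arguments exist only so the function can be tried on a scratch dir.
persist_k8s_kernel_settings() {
  modules_dir=${1:-/etc/modules-load.d}
  sysctl_dir=${2:-/etc/sysctl.d}
  # Load br_netfilter at boot so the net.bridge.* keys exist.
  printf 'br_netfilter\n' > "$modules_dir/k8s.conf"
  # Same two bridge settings as in the step this follows, plus IP forwarding.
  printf '%s\n' \
    'net.bridge.bridge-nf-call-ip6tables = 1' \
    'net.bridge.bridge-nf-call-iptables = 1' \
    'net.ipv4.ip_forward = 1' > "$sysctl_dir/k8s.conf"
}
```

After calling it as root, sysctl --system applies the values immediately.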

Time synchronization

$ apt-get install -y ntpdate
$ ntpdate time.windows.com

Install Docker / Containerd on all nodes
```
sudo apt-get install docker.io
```
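- Since Kubernetes 1.24, dockershim has been removed, so the kubelet talks to a CRI (Container Runtime Interface) runtime directly; containerd is the usual choice.
- See the official Containerd installation guide.
- The containerd.io package already includes runc, but the CNI plugins must be installed separately.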
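The kubeadm.yaml used later sets the kubelet cgroup driver to systemd, so containerd should use the systemd cgroup driver too. As a sketch, assuming the config file was generated with containerd config default (which writes SystemdCgroup = false); enable_systemd_cgroup is a made-up helper name.

```shell
# Hypothetical helper: flip SystemdCgroup from false to true in a
# containerd config.toml, keeping a .bak copy of the original.
enable_systemd_cgroup() {
  config=${1:-/etc/containerd/config.toml}
  sed -E -i.bak \
    's/^([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*)false/\1true/' \
    "$config"
}

# Possible usage, as root:
#   containerd config default > /etc/containerd/config.toml
#   enable_systemd_cgroup
#   systemctl restart containerd
```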

Setup methods:

kubeadm
Binary installation

Kubeadm

A Kubernetes deployment tool for quickly bootstrapping a cluster.

Install kubeadm / kubectl / kubelet on all nodes

apt-get update && apt-get install -y apt-transport-https curl
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add - 
cat >> /etc/apt/sources.list.d/kubernetes.list << EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubeadm kubelet kubectl

Deploy the master with the command kubeadm init --config kubeadm.yaml, where kubeadm.yaml contains:

apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:  
  advertiseAddress: 192.168.241.128
  bindPort: 6443
nodeRegistration:  
  kubeletExtraArgs:    
    cgroup-driver: "systemd"
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
imageRepository: registry.aliyuncs.com/google_containers
kubernetesVersion: 1.26.0
clusterName: "k8s-cluster"
controllerManager:  
  extraArgs:
    allocate-node-cidrs: "true"
    cluster-cidr: "10.244.0.0/16"
    horizontal-pod-autoscaler-sync-period: "10s"    
    node-monitor-grace-period: "10s"
apiServer:  
  extraArgs:    
    runtime-config: "api/all=true"
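Before running kubeadm init, it may help to confirm that the images named by this config can actually be pulled from the configured imageRepository. A small sketch using the kubeadm config images subcommands; the wrapper name prepull_images is hypothetical.

```shell
# Hypothetical wrapper: list, then pre-pull, the control-plane images
# that kubeadm would use for the given config file.
prepull_images() {
  config=${1:-kubeadm.yaml}
  if ! command -v kubeadm >/dev/null 2>&1; then
    echo "kubeadm not found in PATH" >&2
    return 1
  fi
  kubeadm config images list --config "$config" &&
    kubeadm config images pull --config "$config"
}
```

Calling prepull_images kubeadm.yaml on the master surfaces registry problems before the init step starts waiting on the control plane.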

An error occurs:

[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
        - 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

The logs show that the failure is caused by being unable to pull pause:3.6 from registry.k8s.io:

$ systemctl status kubelet

"RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to get sandbox image \"registry.k8s.io/pause:3.6\": failed to pull image \"registry.k8s.io/pause:3.6\": failed to pull and unpack image \"registry.k8s.io/pause:3.6\": failed to resolve reference \"registry.k8s.io/pause:3.6\": failed to do request: Head \"https://asia-east1-docker.pkg.dev/v2/k8s-artifacts-prod/images/pause/manifests/3.6\": dial tcp 142.251.8.82:443: connect: connection refused"

Solution: edit the containerd configuration file so that the pause (sandbox) image is pulled from a reachable registry.
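One way this edit might look is to point the sandbox_image key of containerd's CRI plugin at a mirror. The sketch below assumes the config lives at /etc/containerd/config.toml and already has a sandbox_image line, and that the Aliyun mirror used in kubeadm.yaml also carries pause:3.6; set_sandbox_image is a made-up helper.

```shell
# Hypothetical helper: rewrite the sandbox_image value in a containerd
# config.toml, keeping a .bak copy of the original.
set_sandbox_image() {
  config=$1
  image=$2
  sed -E -i.bak \
    "s|^([[:space:]]*sandbox_image[[:space:]]*=[[:space:]]*).*|\1\"$image\"|" \
    "$config"
}

# Possible usage, as root (the image path is an assumption):
#   set_sandbox_image /etc/containerd/config.toml \
#     registry.aliyuncs.com/google_containers/pause:3.6
#   systemctl restart containerd
```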

That works!

At this point kubectl commands still cannot be run; set up the kubeconfig first:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Check the nodes with kubectl get nodes

The node is reported as NotReady; this is because no network plugin has been installed yet.
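To pick out the affected nodes from the kubectl get nodes table, a small filter can help. This is a sketch that assumes the default column layout (NAME first, STATUS second); not_ready_nodes is a hypothetical name, and a status such as Ready,SchedulingDisabled is also listed by it.

```shell
# Hypothetical filter: read `kubectl get nodes` output on stdin and print
# the names of nodes whose STATUS column is not exactly "Ready".
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Possible usage:
#   kubectl get nodes | not_ready_nodes
```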

Deploy the worker node with the command kubeadm join
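The full kubeadm join command, with its token and CA cert hash, is printed at the end of a successful kubeadm init. If that output is lost or the token has expired, it can be regenerated on the master. A minimal sketch; print_join_command is just an illustrative wrapper name.

```shell
# Hypothetical wrapper: create a fresh bootstrap token on the master and
# print the matching `kubeadm join ...` command for the worker nodes.
print_join_command() {
  if ! command -v kubeadm >/dev/null 2>&1; then
    echo "kubeadm not found in PATH" >&2
    return 1
  fi
  kubeadm token create --print-join-command
}
```

The printed line is then run as root on each worker.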

Deploy the CNI network plugin

$ cat >> /etc/hosts << EOF
185.199.110.133 raw.githubusercontent.com
EOF

$ wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
$ kubectl apply -f kube-flannel.yml

The IP address for raw.githubusercontent.com used in /etc/hosts can be found with a domain-to-IP lookup.

$ kubectl get pods --all-namespaces

Another error: the kube-flannel container is in the CrashLoopBackOff state.

Check the logs:

$ kubectl logs -f kube-flannel-ds-g57vw -n kube-flannel

The cause:

Error registering network: failed to acquire lease: node "k8sworker" pod cidr not assigned
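To see which nodes have no pod CIDR assigned, the node name and .spec.podCIDR can be printed side by side and filtered. A sketch under the assumption that the two fields are tab-separated as in the usage comment; nodes_without_pod_cidr is a made-up helper.

```shell
# Hypothetical filter: read "NAME<TAB>PODCIDR" lines on stdin and print
# the names of nodes whose pod CIDR field is empty.
nodes_without_pod_cidr() {
  awk -F '\t' '$1 != "" && $2 == "" { print $1 }'
}

# Possible usage (on the master):
#   kubectl get nodes \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}' \
#     | nodes_without_pod_cidr
```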

Fix: add the following configuration to kubeadm.yaml (already included in the file above):

controllerManager:  
  extraArgs:
    allocate-node-cidrs: "true"
    cluster-cidr: "10.244.0.0/16"
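The cluster-cidr value should agree with the Network field in the net-conf.json section of kube-flannel.yml. A quick consistency check could look like the following sketch; both function names are hypothetical, and the sed pattern assumes the field appears as "Network": "..." on one line.

```shell
# Hypothetical helper: print the first "Network" value found in a
# flannel manifest.
flannel_network() {
  sed -n 's/.*"Network"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" |
    head -n 1
}

# Hypothetical helper: succeed only if the manifest's Network equals
# the cluster CIDR passed as the second argument.
check_flannel_cidr() {
  [ "$(flannel_network "$1")" = "$2" ]
}

# Possible usage:
#   check_flannel_cidr kube-flannel.yml 10.244.0.0/16 || echo "CIDR mismatch"
```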

Try again!

Cluster test: create a Pod and check that it works

$ kubectl create deployment nginx --image=nginx
deployment.apps/nginx created

$ kubectl get pods

$ kubectl expose deployment nginx --port=80 --type=NodePort
service/nginx exposed

The externally exposed port is 31072.
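The NodePort is assigned by the cluster, so it will differ between runs. Here is a small sketch for pulling it out of the PORT(S) column of kubectl get svc, assuming the usual port:nodePort/protocol form; node_port is a hypothetical helper.

```shell
# Hypothetical helper: extract the NodePort from a PORT(S) value
# such as "80:31072/TCP".
node_port() {
  printf '%s\n' "$1" | sed -n 's/^[0-9]*:\([0-9]*\)\/.*/\1/p'
}

# Possible usage, with the worker IP from /etc/hosts:
#   ports=$(kubectl get svc nginx --no-headers | awk '{ print $5 }')
#   curl "http://192.168.241.129:$(node_port "$ports")"
```

If the nginx welcome page comes back, the Pod network and the Service are both working.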

posted @ 2022-12-10 15:59 !ɹO
