Installing Kubernetes (k8s) on Ubuntu


1. Check the Ubuntu version
w@node1:~$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.6 LTS
Release:	18.04
Codename:	bionic

2. Set the hostname

vim /etc/hostname 
w@node1:~$ cat /etc/hostname 
node1
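
Alternatively, hostnamectl sets the hostname in one step (it updates /etc/hostname for you):

hostnamectl set-hostname node1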

 

3. Configure a static IP address for the node (optional).

Use ifconfig to check the local IP address and netmask (if the ifconfig command is missing, install it with apt-get install net-tools):

w@node1:~$ ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 02:42:53:5f:6a:cd  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.199.141  netmask 255.255.255.0  broadcast 192.168.199.255
        inet6 fe80::20c:29ff:fed8:ca5f  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:d8:ca:5f  txqueuelen 1000  (Ethernet)
        RX packets 7904  bytes 5375252 (5.3 MB)
        RX errors 1  dropped 0  overruns 0  frame 0
        TX packets 6380  bytes 772741 (772.7 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 19  base 0x2000  

The Ethernet interface is ens33, with IP 192.168.199.141 and netmask 255.255.255.0.

Check the gateway:

w@node1:/etc/netplan$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.199.2   0.0.0.0         UG    100    0        0 ens33
169.254.0.0     0.0.0.0         255.255.0.0     U     1000   0        0 ens33
192.168.199.0   0.0.0.0         255.255.255.0   U     100    0        0 ens33

Back up /etc/netplan/01-network-manager-all.yaml, then edit it:

# Let NetworkManager manage all devices on this system
network:
  version: 2
  renderer: NetworkManager   # keep the renderer the same as the original file
  ethernets:
    ens33:
      dhcp4: no
      addresses: [192.168.199.141/24]  # static IP/netmask
      gateway4: 192.168.199.2  # gateway
      nameservers:
        addresses: [192.168.199.141,223.6.6.6]   # use a fast-responding DNS; the local host is included because it will later act as a DNS server

 

sudo netplan apply

Reboot the machine.

Verify the configuration:

w@node1:~$ ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:18:e9:51:20  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.199.141  netmask 255.255.255.0  broadcast 192.168.199.255
        inet6 fe80::20c:29ff:fed8:ca5f  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:d8:ca:5f  txqueuelen 1000  (Ethernet)
        RX packets 5419  bytes 4116516 (4.1 MB)
        RX errors 2  dropped 2  overruns 0  frame 0
        TX packets 5133  bytes 709451 (709.4 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 19  base 0x2000  

Map the IP address to the hostname:

vim /etc/hosts
w@node1:~$ cat /etc/hosts
127.0.0.1	localhost
127.0.1.1	w-virtual-machine
192.168.199.141 node1

 

w@node1:~$ ping node1
PING node1 (192.168.199.141) 56(84) bytes of data.
64 bytes from node1 (192.168.199.141): icmp_seq=1 ttl=64 time=0.043 ms
64 bytes from node1 (192.168.199.141): icmp_seq=2 ttl=64 time=0.113 ms
64 bytes from node1 (192.168.199.141): icmp_seq=3 ttl=64 time=0.123 ms
64 bytes from node1 (192.168.199.141): icmp_seq=4 ttl=64 time=0.136 ms

 

4. Configure the Kubernetes apt repository

vi /etc/apt/sources.list
Append the following line at the end:
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
apt-key.gpg is the public key of the Kubernetes deb repository. To import it, switch to root with su and run:
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -

Then run apt-get update.

Use apt-cache madison kubelet to list the installable versions.
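
If you prefer not to edit sources.list by hand, the same setup can be scripted; a minimal sketch that puts the repository in its own file under sources.list.d (run as root):

curl -fsSL https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
echo "deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main" > /etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-cache madison kubelet | head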

5. Disable swap and the firewall

Swap is disabled because it lets a dying process limp along on swap instead of being OOM-killed promptly, which prevents Kubernetes from stepping in and restarting the container. The firewall is disabled because it may block communication between cluster components.
root@node1:/home/w# ufw disable
Firewall stopped and disabled on system startup

 

root@node1:/home/w# swapoff -a ;sed -i '/swap/d' /etc/fstab
root@node1:/home/w# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda1 during installation
UUID=2aa24dbd-cf22-4c5f-91d6-bc73544fd3c2 /               ext4    errors=remount-ro 0       1
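
To double-check that swap is fully off, both of the following should report no swap in use:

free -m
swapon --show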

6. Install Docker

apt-get install docker-ce

Enable Docker to start on boot and start it now:

systemctl enable docker --now
systemctl status docker

Note: Kubernetes v1.22.2 and later require the container runtime's cgroup driver to be systemd, while Docker's default cgroup driver is cgroupfs. Kubernetes 1.21 and earlier do not require changing the cgroup driver.

Check the current cgroup driver with docker info | grep -i cgroup:

root@node1:~# docker info | grep -i cgroup
WARNING: No swap limit support
 Cgroup Driver: cgroupfs
 Cgroup Version: 1

Set Docker's cgroup driver to systemd (a registry mirror can also be added to the same daemon.json if image pulls are slow):

cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

systemctl daemon-reload

systemctl restart docker

docker info | grep -i cgroup

root@node1:~# docker info | grep -i cgroup
WARNING: No swap limit support
 Cgroup Driver: systemd
 Cgroup Version: 1

 

Let iptables see bridged traffic and enable IP forwarding by creating k8s.conf:

cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

Apply the settings:

sysctl -p /etc/sysctl.d/k8s.conf
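
These bridge settings only take effect when the br_netfilter kernel module is loaded; a quick check-and-load step (the modules-load.d entry makes it persistent across reboots):

lsmod | grep br_netfilter || modprobe br_netfilter
echo br_netfilter > /etc/modules-load.d/k8s.conf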

 

7. Install kubelet, kubeadm, and kubectl

apt-get -y install kubelet=1.28.2-00 kubeadm=1.28.2-00 kubectl=1.28.2-00
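
Optionally, pin the packages so a routine apt-get upgrade does not move them to an incompatible version:

apt-mark hold kubelet kubeadm kubectl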

Enable kubelet to start on boot and start it now:

systemctl enable kubelet --now

 

8. Initialize the cluster with kubeadm

  • --image-repository registry.aliyuncs.com/google_containers: use the Aliyun image repository, since some images cannot be pulled from the default one;
  • --kubernetes-version=v1.28.2: specify the Kubernetes version;
  • --pod-network-cidr=10.244.0.0/16: specify the pod network CIDR;
  • CoreDNS, the cluster DNS addon deployed by kubeadm, is an open-source DNS server written in Go.

kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.28.2 --pod-network-cidr=10.244.0.0/16

root@node1:~# kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.28.2 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.28.2
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR CRI]: container runtime is not running: output: time="2023-11-10T16:43:51+08:00" level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

This error occurs because the packaged containerd ships with its CRI plugin disabled by default. To fix it:

vi /etc/containerd/config.toml
Find the line:
disabled_plugins = ["cri"], delete "cri" so it becomes disabled_plugins = [], then restart containerd: systemctl restart containerd
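
The same edit as a one-liner, assuming the stock single-line setting shipped by the package:

sed -i 's/disabled_plugins = \["cri"\]/disabled_plugins = []/' /etc/containerd/config.toml
systemctl restart containerd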

Run kubeadm init again; this time it fails with:

[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
	timed out waiting for the condition

This error is likely caused by:
	- The kubelet is not running
	- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
	- 'systemctl status kubelet'
	- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
	- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
	Once you have found the failing container, you can inspect its logs with:
	- 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

journalctl -xeu kubelet shows:

failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image "registry.k8s.io/pause:3.6": failed to pull image "registry.k8s.io/pause:3.6"

The sandbox cannot be created because pulling registry.k8s.io/pause:3.6 fails. The fix is to reconfigure the sandbox image in containerd, replacing the default registry.k8s.io/pause:3.6 with registry.aliyuncs.com/google_containers/pause:3.6:

containerd config default > /etc/containerd/config.toml

vim /etc/containerd/config.toml
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"

systemctl daemon-reload
systemctl restart containerd.service
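
The same change can be made with sed and then verified against the live configuration (a sketch; the default image tag in config.toml may differ depending on the containerd version):

sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"#' /etc/containerd/config.toml
systemctl restart containerd
containerd config dump | grep sandbox_image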

root@node1:~# kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.28.2 --pod-network-cidr=10.244.0.0/16  --ignore-preflight-errors=...
[init] Using Kubernetes version: v1.28.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W1110 17:38:20.439667   13628 checks.go:835] detected that the sandbox image "registry.aliyuncs.com/google_containers/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.aliyuncs.com/google_containers/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/scheduler.conf"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 6.504867 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node node1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node node1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: l6a8jn.j8ilw25qae8vtj5m
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.199.141:6443 --token l6a8jn.j8ilw25qae8vtj5m \
	--discovery-token-ca-cert-hash sha256:451cf781c7315cbec2cdb7178c5bb9981aa3e46a13fb3f35d475674aae3e73f0 

Following the hint in the output, create the directory and config file:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

9. Join worker nodes

Skipped for now.

10. Deploy the Calico CNI network plugin

Although the cluster now has one master node, it shows NotReady because no CNI network plugin is installed. A CNI plugin is required for networking between nodes. The common choices are Calico and Flannel; the difference is that Flannel does not support complex network policies, while Calico does.

Now download the calico.yaml manifest from the official site:

Official site: https://projectcalico.docs.tigera.io/about/about-calico

Open https://docs.projectcalico.org/manifests/calico.yaml in a browser to get the YAML, or run: curl https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml > calico.yaml
In calico.yaml, make sure CALICO_IPV4POOL_CIDR matches the pod CIDR passed to kubeadm init (with 3.26.1 this is detected automatically and does not need to be set). Then apply the manifest as shown below.
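
Apply the manifest and watch the Calico pods come up (with the stock calico.yaml they are created in the kube-system namespace):

kubectl apply -f calico.yaml
kubectl get pods -n kube-system -w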

Once all the Calico pods are Running, kubectl get node shows the node as Ready.

11. Configure kubectl autocompletion

Add source <(kubectl completion bash) to /etc/profile and make it take effect:

vim /etc/profile
source /etc/profile
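
A non-interactive way to do the same (note that the bash-completion package needs to be installed for the completion script to work fully):

echo 'source <(kubectl completion bash)' >> /etc/profile
source /etc/profile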

 

 

Reference:

https://www.cnblogs.com/renshengdezheli/p/17632858.html


 

12. Uninstall: kubeadm reset
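
kubeadm reset leaves some state behind; a hedged cleanup sketch using the paths that kubeadm itself reports (CNI config, kubeconfig, and iptables rules):

kubeadm reset
rm -rf /etc/cni/net.d $HOME/.kube/config
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X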
