Installing Kubernetes (k8s) on Ubuntu
1. Check the Ubuntu version
w@node1:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.6 LTS
Release:        18.04
Codename:       bionic
2. Set the hostname
vim /etc/hostname
w@node1:~$ cat /etc/hostname
node1
3. Configure a static IP address for the node (optional)
Use ifconfig to check the local IP address and netmask (if the ifconfig command is missing, install it with apt-get install net-tools):
w@node1:~$ ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 02:42:53:5f:6a:cd  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.199.141  netmask 255.255.255.0  broadcast 192.168.199.255
        inet6 fe80::20c:29ff:fed8:ca5f  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:d8:ca:5f  txqueuelen 1000  (Ethernet)
        RX packets 7904  bytes 5375252 (5.3 MB)
        RX errors 1  dropped 0  overruns 0  frame 0
        TX packets 6380  bytes 772741 (772.7 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device interrupt 19  base 0x2000
The Ethernet interface is ens33, the IP address is 192.168.199.141, and the netmask is 255.255.255.0.
Check the gateway:
w@node1:/etc/netplan$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.199.2   0.0.0.0         UG    100    0        0 ens33
169.254.0.0     0.0.0.0         255.255.0.0     U     1000   0        0 ens33
192.168.199.0   0.0.0.0         255.255.255.0   U     100    0        0 ens33
Back up /etc/netplan/01-network-manager-all.yaml, then edit the file:
# Let NetworkManager manage all devices on this system
network:
  version: 2
  renderer: NetworkManager          # keep the same renderer as the original file
  ethernets:
    ens33:
      dhcp4: no
      addresses: [192.168.199.141/24]   # static IP / prefix length
      gateway4: 192.168.199.2           # gateway
      nameservers:
        addresses: [192.168.199.141,223.6.6.6]   # pick a fast DNS server; the local host is listed because it will later be used as a DNS server
sudo netplan apply
Reboot the machine.
Check the configuration:
w@node1:~$ ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        inet 172.17.0.1  netmask 255.255.0.0  broadcast 172.17.255.255
        ether 02:42:18:e9:51:20  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.199.141  netmask 255.255.255.0  broadcast 192.168.199.255
        inet6 fe80::20c:29ff:fed8:ca5f  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:d8:ca:5f  txqueuelen 1000  (Ethernet)
        RX packets 5419  bytes 4116516 (4.1 MB)
        RX errors 2  dropped 2  overruns 0  frame 0
        TX packets 5133  bytes 709451 (709.4 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
        device interrupt 19  base 0x2000
Map the IP address to the hostname:
vim /etc/hosts
w@node1:~$ cat /etc/hosts
127.0.0.1       localhost
127.0.1.1       w-virtual-machine
192.168.199.141 node1
w@node1:~$ ping node1
PING node1 (192.168.199.141) 56(84) bytes of data.
64 bytes from node1 (192.168.199.141): icmp_seq=1 ttl=64 time=0.043 ms
64 bytes from node1 (192.168.199.141): icmp_seq=2 ttl=64 time=0.113 ms
64 bytes from node1 (192.168.199.141): icmp_seq=3 ttl=64 time=0.123 ms
64 bytes from node1 (192.168.199.141): icmp_seq=4 ttl=64 time=0.136 ms
4. Configure the Ubuntu apt sources
vi /etc/apt/sources.list
Append at the end:
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
apt-key.gpg is the public key of the Kubernetes deb repository. To import it, first run su root, then execute:
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
Then run apt-get update.
apt-cache madison kubelet
This lists the installable versions.
5. Disable swap and the firewall
Swap is disabled because it lets a failing program limp along on swap instead of being OOM-killed promptly, which prevents Kubernetes from taking over and restarting the container.
The firewall is disabled because it may block communication between cluster components.
root@node1:/home/w# ufw disable
Firewall stopped and disabled on system startup
root@node1:/home/w# swapoff -a ; sed -i '/swap/d' /etc/fstab
root@node1:/home/w# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
# / was on /dev/sda1 during installation
UUID=2aa24dbd-cf22-4c5f-91d6-bc73544fd3c2 /               ext4    errors=remount-ro 0       1
6. Install Docker
apt-get install docker-ce
Enable Docker to start on boot and start it now:
systemctl enable docker --now
systemctl status docker
Note: since Kubernetes v1.22, kubeadm defaults the kubelet's cgroup driver to systemd, so the container runtime's cgroup driver must also be systemd, while Docker's default cgroup driver is cgroupfs. With Kubernetes 1.21 and earlier there is no need to change the cgroup driver.
Check the current cgroup driver with docker info | grep -i cgroup:
root@node1:~# docker info | grep -i cgroup
WARNING: No swap limit support
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
Configure a Docker registry mirror (accelerator) and set Docker's cgroup driver to systemd:
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
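The daemon.json above only switches the cgroup driver. If you also want the registry mirror mentioned above, the file could instead be written roughly like this (a sketch; the mirror URL is a placeholder for your own accelerator endpoint, and the restart below applies it either way):
cat > /etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": ["https://<your-mirror>.mirror.aliyuncs.com"]
}
EOF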
systemctl daemon-reload
systemctl restart docker
docker info | grep -i cgroup
root@node1:~# docker info | grep -i cgroup
WARNING: No swap limit support
 Cgroup Driver: systemd
 Cgroup Version: 1
Make bridged traffic visible to iptables and enable IPv4 forwarding. Create a new k8s.conf:
cat <<EOF> /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
Apply the settings:
sysctl -p /etc/sysctl.d/k8s.conf
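If sysctl complains that the net.bridge.* keys do not exist, the br_netfilter kernel module is probably not loaded yet. This extra step (not part of the original write-up) loads it now and on every boot, then re-applies the settings:
modprobe br_netfilter
echo br_netfilter > /etc/modules-load.d/k8s.conf
sysctl -p /etc/sysctl.d/k8s.conf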
7. Install kubelet, kubeadm, and kubectl
apt-get -y install kubelet=1.28.2-00 kubeadm=1.28.2-00 kubectl=1.28.2-00
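Optionally, pin the three packages so that a later apt upgrade does not move them off 1.28.2 (an extra step, not in the original write-up):
apt-mark hold kubelet kubeadm kubectl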
Enable kubelet to start on boot and start it now:
systemctl enable kubelet --now
8. Initialize the cluster with kubeadm
- --image-repository registry.aliyuncs.com/google_containers: use the Aliyun mirror registry; otherwise some images cannot be pulled.
- --kubernetes-version=v1.28.2: specify the Kubernetes version.
- --pod-network-cidr=10.244.0.0/16: specify the Pod network CIDR.
- CoreDNS is an open-source DNS server written in Go.
kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.28.2 --pod-network-cidr=10.244.0.0/16
root@node1:~# kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.28.2 --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.28.2
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR CRI]: container runtime is not running: output: time="2023-11-10T16:43:51+08:00" level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///var/run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
This error occurs because the installed containerd has the CRI plugin disabled by default. To fix it:
vi /etc/containerd/config.toml
Find disabled_plugins = ["cri"] and remove "cri" so that it reads:
disabled_plugins = []
Then restart containerd: systemctl restart containerd
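The same edit can also be scripted instead of done by hand (a sketch that assumes the default single-line disabled_plugins entry):
sed -i 's/disabled_plugins = \["cri"\]/disabled_plugins = []/' /etc/containerd/config.toml
systemctl restart containerd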
Run kubeadm init again; this time it fails with a different problem:
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
        - 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
Checking with journalctl -xeu kubelet shows:
"failed to create sandbox for pod" err="rpc error: code = Unknown desc = failed to get sandbox image \"registry.k8s.io/pause:3.6\": failed to pull image \"registry.k8s.io/pause:3.6\""
The sandbox cannot be created because pulling the registry.k8s.io/pause:3.6 image fails. The fix is to point the sandbox image at a reachable registry, changing the default registry.k8s.io/pause:3.6 to registry.aliyuncs.com/google_containers/pause:3.6:
containerd config default > /etc/containerd/config.toml
vim /etc/containerd/config.toml
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"
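Again, the change can be scripted rather than edited by hand (a sketch assuming there is a single sandbox_image line in the file):
sed -i 's#sandbox_image = ".*"#sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"#' /etc/containerd/config.toml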
systemctl daemon-reload
systemctl restart containerd.service
root@node1:~# kubeadm init --image-repository registry.aliyuncs.com/google_containers --kubernetes-version=v1.28.2 --pod-network-cidr=10.244.0.0/16 --ignore-preflight-errors=...
[init] Using Kubernetes version: v1.28.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
W1110 17:38:20.439667   13628 checks.go:835] detected that the sandbox image "registry.aliyuncs.com/google_containers/pause:3.6" of the container runtime is inconsistent with that used by kubeadm. It is recommended that using "registry.aliyuncs.com/google_containers/pause:3.9" as the CRI sandbox image.
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Using existing apiserver-kubelet-client certificate and key on disk
[certs] Using existing front-proxy-ca certificate authority
[certs] Using existing front-proxy-client certificate and key on disk
[certs] Using existing etcd/ca certificate authority
[certs] Using existing etcd/server certificate and key on disk
[certs] Using existing etcd/peer certificate and key on disk
[certs] Using existing etcd/healthcheck-client certificate and key on disk
[certs] Using existing apiserver-etcd-client certificate and key on disk
[certs] Using the existing "sa" key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/admin.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Using existing kubeconfig file: "/etc/kubernetes/scheduler.conf"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 6.504867 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node node1 as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node node1 as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: l6a8jn.j8ilw25qae8vtj5m
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.199.141:6443 --token l6a8jn.j8ilw25qae8vtj5m \
        --discovery-token-ca-cert-hash sha256:451cf781c7315cbec2cdb7178c5bb9981aa3e46a13fb3f35d475674aae3e73f0
Create the directory and config file as the output suggests:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
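kubectl can now reach the cluster. A quick sanity check (as noted in step 10 below, the node will stay NotReady until a CNI plugin is installed):
kubectl get nodes
kubectl get pods -n kube-system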
9. Join worker nodes
None are added for now.
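For reference, a worker would join later by running the kubeadm join command printed at the end of kubeadm init (shown above). If the bootstrap token has expired by then, a fresh join command can be printed on the master:
kubeadm token create --print-join-command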
10. Deploy the Calico CNI network plugin
The cluster now has one master node, but it is NotReady because no CNI network plugin is installed. A CNI plugin is required for communication between nodes. The common CNI plugins are Calico and Flannel; the difference is that Flannel does not support complex network policies, while Calico does.
Download the calico.yaml file from the official site:
Official site: https://projectcalico.docs.tigera.io/about/about-calico
Fetch the YAML at https://docs.projectcalico.org/manifests/calico.yaml in a browser, or run: curl https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml > calico.yaml
Edit calico.yaml so that CALICO_IPV4POOL_CIDR matches the Pod CIDR used during kubeadm init (with 3.26.1 this is not required), then apply it as sketched below.
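The apply step itself is not spelled out above; a minimal sketch:
kubectl apply -f calico.yaml
kubectl get pods -n kube-system -w    # watch until the calico pods are Running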
Once all the Calico Pods are Running, kubectl get node shows the node as Ready.
11. Configure kubectl auto-completion
Add source <(kubectl completion bash) to /etc/profile and reload it:
vim /etc/profile
source /etc/profile
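Or, without opening an editor:
echo 'source <(kubectl completion bash)' >> /etc/profile
source /etc/profile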
Reference:
https://www.cnblogs.com/renshengdezheli/p/17632858.html
Posted on 2023-11-10 15:41 by MissSimple