I. Preparation
Apart from the master-only steps, all of the steps below must be performed on every virtual machine.
Installing common tools such as vim, curl, wget and unzip on every machine is recommended.
1. On every VM, disable the firewall and SELinux to keep them from interfering with k8s; it is best to flush the iptables rules as well
# Stop the firewall and keep it from starting at boot
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux; getenforce reports the current SELinux state
# On a system where SELinux was never configured, getenforce usually reports Enforcing; it needs to be disabled
getenforce
# Switch to permissive mode for the current boot
setenforce 0
# For a permanent change, edit the file and set SELINUX=disabled; it takes effect after a reboot
vim /etc/selinux/config
# Flush the iptables rules
iptables -F
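You can quickly verify the result (optional):
# firewalld should be inactive; getenforce should report Permissive (or Disabled after a reboot)
systemctl is-active firewalld
getenforce
# The iptables chains should now be empty
iptables -L -n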
2. Disable the swap partition (k8s 1.8+ requires it to be off) for performance reasons
# Disable temporarily
swapoff -a
# To disable permanently, edit /etc/fstab and comment out the '/dev/mapper/centos-swap swap' line, or just run the command below
sed -i '/swap/ s/^\(.*\)$/#\1/g' /etc/fstab
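You can confirm swap is off with free; the Swap row should read all zeros:
free -h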
3. Update the /etc/hosts file
# Run the command below directly, remembering to replace the IP addresses and host names with your own
cat >> /etc/hosts << EOF
192.168.1.83 jqmasterk8s
192.168.1.84 jqnode01k8s
192.168.1.85 jqnode02k8s
EOF
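Optionally verify that the names resolve, e.g.:
ping -c 1 jqnode01k8s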
4. Load the br_netfilter module (needed for bridged pod network traffic)
# Load it for the current boot
modprobe br_netfilter
# To load it on every boot, create the file below and write the module name br_netfilter into it
vim /etc/modules-load.d/k8s.conf
# Or write the file without an editor; either way, restart the module-loading service afterwards
cat >> /etc/modules-load.d/k8s.conf << EOF
br_netfilter
EOF
systemctl restart systemd-modules-load.service
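Verify the module is loaded:
lsmod | grep br_netfilter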
5. Adjust kernel parameters to meet k8s's runtime requirements by running the command below
cat <<EOF >> /etc/sysctl.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
vm.swappiness=0
EOF
# Reload the system parameter configuration so the changes above take effect
sysctl -p
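You can spot-check one of the values (it should print 1):
sysctl net.bridge.bridge-nf-call-iptables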
6. Synchronize time on all nodes:
# Start the chronyd service and enable it at boot
systemctl start chronyd
systemctl enable chronyd
date
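To confirm the clock is actually synchronized against a time source, you can run:
chronyc sources
chronyc tracking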
II. Install Docker
1. Install Docker:
# Install the yum utility package
yum install -y yum-utils
# Add the Docker repository; this is the official source
yum-config-manager \
--add-repo \
https://download.docker.com/linux/centos/docker-ce.repo
# Or replace the official source with the Aliyun mirror
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# Install a pinned version of Docker
yum install -y docker-ce-20.10.0 docker-ce-cli-20.10.0 containerd.io
# Start the Docker service
systemctl start docker
# Enable Docker at boot
systemctl enable docker
# Configure Docker to use systemd as the default cgroup driver; Docker must be restarted afterwards
cat <<EOF > /etc/docker/daemon.json
{
  "registry-mirrors": [
    "http://hub-mirror.c.163.com",
    "https://docker.mirrors.ustc.edu.cn",
    "https://registry.docker-cn.com"
  ],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
# Alternatively, a daemon.json with a longer list of registry mirrors (availability of individual mirrors changes over time):
cat <<EOF > /etc/docker/daemon.json
{
  "registry-mirrors": [
    "http://hub-mirror.c.163.com",
    "https://docker.mirrors.ustc.edu.cn",
    "https://registry.docker-cn.com",
    "https://docker.211678.top",
    "https://docker.1panel.live",
    "https://hub.rat.dev",
    "https://docker.m.daocloud.io",
    "https://do.nark.eu.org",
    "https://dockerpull.com",
    "https://dockerproxy.cn",
    "https://docker.awsl9527.cn"
  ],
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
# Restart docker
systemctl restart docker
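Verify that Docker picked up the systemd cgroup driver (it should print "Cgroup Driver: systemd"):
docker info | grep -i 'cgroup driver'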
III. Install k8s
1. Add the k8s yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
2. Install kubeadm, kubelet and kubectl (pinned versions)
yum install -y kubelet-1.23.6 kubeadm-1.23.6 kubectl-1.23.6
# Enable kubelet at boot
systemctl enable kubelet
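Check that the expected versions were installed:
kubeadm version -o short
kubelet --version
kubectl version --client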
3. Provide the CoreDNS image (a running k8s cluster needs CoreDNS for DNS resolution)
# Pull the image
docker pull coredns/coredns:1.8.4
# Retag the image to the name kubeadm expects
docker tag coredns/coredns:1.8.4 registry.aliyuncs.com/google_containers/coredns:v1.8.4
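Confirm both tags are present locally:
docker images | grep coredns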
4. Initialize the master node
Initialize with the kubeadm init command below and wait for it to finish.
# Remember to change the IP; only the address on the first line needs changing, normally to the master node's address
kubeadm init \
--apiserver-advertise-address=192.168.1.83 \
--image-repository registry.aliyuncs.com/google_containers \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.244.0.0/16
# --apiserver-advertise-address  the cluster advertise address (the master machine's IP; the 10GbE network is used here)
# --image-repository  the default image registry k8s.gcr.io is unreachable from China, so the Aliyun registry address is specified here
# --kubernetes-version  the k8s version, matching what was installed above
# --service-cidr  the cluster-internal virtual network, the unified access entry for Pods; can be left unchanged, use the value above
# --pod-network-cidr  the Pod network; must stay consistent with the CNI yaml deployed below; can be left unchanged, use the value above
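Optionally, you can pre-pull all control-plane images before running init, which makes the init step itself much faster:
kubeadm config images pull --image-repository registry.aliyuncs.com/google_containers --kubernetes-version v1.23.6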
5. After init completes, a few commands must be run by hand (above all, the join command for adding nodes to the cluster must be copied and saved):
[addons] Applied essential addon: kube-proxy
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.1.83:6443 --token ac0w4h.cj39s8s7e3mrj0ib \
--discovery-token-ca-cert-hash sha256:91ffbe926ff82a2a04eca35bd1419ca77da211349e56eb4d2501b3cfcec4440e
6. Run the commands from the init output:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
vim /root/.bash_profile
7. Add the following block:
# kubeconfig variable for the root user
export KUBECONFIG=/etc/kubernetes/admin.conf
# Set an alias
alias k=kubectl
# Enable kubectl command completion
source <(kubectl completion bash)
source /root/.bash_profile
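Note: the completion line assumes the bash-completion package is present; if completion does not work, install it first and re-source the profile:
yum install -y bash-completion
source /root/.bash_profile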
8. Copy and save this block (the join command printed after a successful k8s init; Flannel must be configured before worker nodes can actually join). Worker nodes run this command later to join the master:
kubeadm join 192.168.1.83:6443 --token ac0w4h.cj39s8s7e3mrj0ib \
--discovery-token-ca-cert-hash sha256:91ffbe926ff82a2a04eca35bd1419ca77da211349e56eb4d2501b3cfcec4440e
IV. Set up the pod network (deploy on the master node)
1. Deploy the container network CNI plugin (run on the master; well-known plugins include flannel, calico, canal and kube-router; a simple, easy-to-use implementation is the flannel project originally built for CoreOS). Flannel is used here.
Download kube-flannel.yml:
wget https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
If the download fails, you can use this kube-flannel.yml instead:
apiVersion: v1
kind: Namespace
metadata:
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
  name: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - networking.k8s.io
  resources:
  - clustercidrs
  verbs:
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
kind: ConfigMap
metadata:
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-cfg
  namespace: kube-flannel
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: flannel
    k8s-app: flannel
    tier: node
  name: kube-flannel-ds
  namespace: kube-flannel
spec:
  selector:
    matchLabels:
      app: flannel
      k8s-app: flannel
  template:
    metadata:
      labels:
        app: flannel
        k8s-app: flannel
        tier: node
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      containers:
      - args:
        - --ip-masq
        - --kube-subnet-mgr
        command:
        - /opt/bin/flanneld
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        image: registry.cn-hangzhou.aliyuncs.com/liuk8s/flannel:v0.21.5
        name: kube-flannel
        resources:
          requests:
            cpu: 100m
            memory: 50Mi
        securityContext:
          capabilities:
            add:
            - NET_ADMIN
            - NET_RAW
          privileged: false
        volumeMounts:
        - mountPath: /run/flannel
          name: run
        - mountPath: /etc/kube-flannel/
          name: flannel-cfg
        - mountPath: /run/xtables.lock
          name: xtables-lock
      hostNetwork: true
      initContainers:
      - args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        command:
        - cp
        image: registry.cn-hangzhou.aliyuncs.com/liuk8s/flannel-cni-plugin:v1.1.2
        name: install-cni-plugin
        volumeMounts:
        - mountPath: /opt/cni/bin
          name: cni-plugin
      - args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        command:
        - cp
        image: registry.cn-hangzhou.aliyuncs.com/liuk8s/flannel:v0.21.5
        name: install-cni
        volumeMounts:
        - mountPath: /etc/cni/net.d
          name: cni
        - mountPath: /etc/kube-flannel/
          name: flannel-cfg
      priorityClassName: system-node-critical
      serviceAccountName: flannel
      tolerations:
      - effect: NoSchedule
        operator: Exists
      volumes:
      - hostPath:
          path: /run/flannel
        name: run
      - hostPath:
          path: /opt/cni/bin
        name: cni-plugin
      - hostPath:
          path: /etc/cni/net.d
        name: cni
      - configMap:
          name: kube-flannel-cfg
        name: flannel-cfg
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock
2. Then edit the config file: find the section below and make sure Network matches the network segment given to kubeadm init (--pod-network-cidr):
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
3. After adjusting the config, install the components (if the apply gets stuck pulling images, try pulling the images manually with docker, as shown below):
kubectl apply -f kube-flannel.yml
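If the pods stay stuck pulling, you can pull the two images referenced in the yaml by hand on every node:
docker pull registry.cn-hangzhou.aliyuncs.com/liuk8s/flannel:v0.21.5
docker pull registry.cn-hangzhou.aliyuncs.com/liuk8s/flannel-cni-plugin:v1.1.2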
4. Check the flannel pod status (they must all be Running; if kube-flannel fails to start, use kubectl describe pod kube-flannel-ds-*** -n kube-flannel to see why the pod cannot start):
[root@jqmasterk8s ~]# kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-flannel kube-flannel-ds-69nrr 1/1 Running 0 8m38s
kube-flannel kube-flannel-ds-8gzl2 1/1 Running 0 8m38s
kube-flannel kube-flannel-ds-9gn4h 1/1 Running 0 8m38s
kube-system coredns-6d8c4cb4d-dxx6t 1/1 Running 0 14m
kube-system coredns-6d8c4cb4d-jvdl5 1/1 Running 0 14m
kube-system etcd-jqmasterk8s 1/1 Running 0 14m
kube-system kube-apiserver-jqmasterk8s 1/1 Running 0 14m
kube-system kube-controller-manager-jqmasterk8s 1/1 Running 0 14m
kube-system kube-proxy-6rzc9 1/1 Running 0 14m
kube-system kube-proxy-sw9w2 1/1 Running 0 12m
kube-system kube-proxy-zp64d 1/1 Running 0 12m
kube-system kube-scheduler-jqmasterk8s 1/1 Running 0 14m
[root@jqmasterk8s ~]#
5. Check the communication status:
[root@jqmasterk8s ~]# kubectl get pod -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-6d8c4cb4d-dxx6t 1/1 Running 0 14m
coredns-6d8c4cb4d-jvdl5 1/1 Running 0 14m
etcd-jqmasterk8s 1/1 Running 0 14m
kube-apiserver-jqmasterk8s 1/1 Running 0 15m
kube-controller-manager-jqmasterk8s 1/1 Running 0 14m
kube-proxy-6rzc9 1/1 Running 0 14m
kube-proxy-sw9w2 1/1 Running 0 12m
kube-proxy-zp64d 1/1 Running 0 12m
kube-scheduler-jqmasterk8s 1/1 Running 0 14m
[root@jqmasterk8s ~]#
Get the status of the control-plane components:
[root@jqmasterk8s ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
controller-manager Healthy ok
scheduler Healthy ok
etcd-0 Healthy {"health":"true","reason":""}
[root@jqmasterk8s ~]#
At this point, the k8s master server has been fully deployed!
V. Worker nodes join the cluster (run on the worker nodes)
1. Initialization generates the join command, which just needs to be run on each worker node. The token below is only an example; use your actual values, e.g.:
[root@node2 home]# kubeadm join 192.168.2.1:6443 --token ochspx.15in9qkiu5z8tx2y --discovery-token-ca-cert-hash sha256:1f31202107af96a07df9fd78c3aa9bb44fd40076ac123e8ff28d6ab691a02a31
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
2. The default join token is valid for 24 hours; once it expires it can no longer be used and a new token must be created. New join tokens are created on the master node with the following command:
[root@node1 home]# kubeadm token create --print-join-command
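The command prints a complete, ready-to-run join command; the token and hash will differ from the placeholders below:
kubeadm join 192.168.1.83:6443 --token <new-token> --discovery-token-ca-cert-hash sha256:<new-hash>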
3. After joining, check the status of the cluster nodes again from the master (they must all be Ready):
[root@jqmasterk8s ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
jqmasterk8s Ready control-plane,master 15m v1.23.6
jqnode01k8s Ready <none> 13m v1.23.6
jqnode02k8s Ready <none> 13m v1.23.6
[root@jqmasterk8s ~]#
If the STATUS of every node is Ready, then all worker nodes have joined successfully!
VI. Remove a worker node (run on the master node)
# kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
# <node name> is the node name reported by 'kubectl get nodes'
# (on newer kubectl versions, --delete-local-data has been renamed --delete-emptydir-data)
# Suppose the worker node node3 is being removed here
[root@node1 home]# kubectl drain node3 --delete-local-data --force --ignore-daemonsets
[root@node1 home]# kubectl delete node node3
1. Then, on the removed worker node, reset k8s (resetting deletes a number of config files); here this is done on the node3 worker:
[root@node3 home]# # Reset k8s on the worker node
[root@node3 home]# kubeadm reset
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0425 01:59:40.412616 15604 removeetcdmember.go:80] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please, manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]
The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d
The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.
If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.
The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
2. Then, on the removed worker node, manually delete the k8s config files, the flannel network config files and the flannel network interfaces
[root@node3 home]# rm -rf /etc/cni/net.d/
[root@node3 home]# rm -rf /root/.kube/config
[root@node3 home]# # Delete the cni network interfaces
[root@node3 home]# ifconfig cni0 down
[root@node3 home]# ip link delete cni0
[root@node3 home]# ifconfig flannel.1 down
[root@node3 home]# ip link delete flannel.1
VII. Common k8s commands
# List all nodes in the current cluster
kubectl get node
# Show detailed information about a Node
kubectl describe node node1
# List all pods
kubectl get pod --all-namespaces
# List pods with extra detail
kubectl get pods -o wide --all-namespaces
# List all created services
kubectl get service
# List all deployments
kubectl get deploy
# Restart a pod (this deletes the original pod and creates a new one to achieve the restart)
# Restart when a yaml file is available
kubectl replace --force -f xxx.yaml
# Restart when no yaml file is available
kubectl get pod <POD_NAME> -n <NAMESPACE> -o yaml | kubectl replace --force -f -
# Show detailed information about a pod
kubectl describe pod nfs-client-provisioner-65c77c7bf9-54rdp -n default
# Create or update pod resources from a yaml file (updates existing pods in place)
kubectl apply -f pod.yaml
# Delete the pod defined in pod.yaml
kubectl delete -f pod.yaml
# Create pod resources from a yaml file (fails if the pod already exists; delete it first)
kubectl create -f pod.yaml
# View a container's logs
kubectl logs <pod-name>
# Follow the logs in real time
kubectl logs -f <pod-name>
# If the pod has only one container, -c can be omitted
kubectl logs <pod-name> -c <container_name>
# Return the merged logs of all pods labeled app=frontend
kubectl logs -l app=frontend
# Get a bash TTY into a container of a pod, which is effectively logging in to the container
# kubectl exec -it <pod-name> -c <container-name> -- bash
# For example:
kubectl exec -it redis-master-cln81 -- bash
# List endpoints
kubectl get endpoints
# List existing tokens
kubeadm token list
VIII. k8s extensions
1. For exposing services externally, Ingress was created to make up for the shortcomings of NodePort: it publishes a collection of rules that route HTTP and HTTPS traffic from outside the cluster to Services inside it; see the sketch below.
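A minimal Ingress sketch for illustration; it assumes an ingress controller (for example ingress-nginx) is already installed, and the host demo.example.com and Service demo-service are placeholders:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress
spec:
  rules:
  # Route HTTP requests for demo.example.com to the demo-service Service on port 80
  - host: demo.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: demo-service
            port:
              number: 80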
2. The metrics-server component monitors k8s. It only reports current readings and provides no data storage; it focuses on implementing the resource metrics API, covering metrics such as CPU, file descriptors, memory and request latency. metrics-server collects data for consumers inside the k8s cluster such as kubectl, the HPA and the scheduler; see the example below.
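Once metrics-server has been deployed, kubectl can query those metrics directly, for example:
# Current CPU and memory usage per node
kubectl top nodes
# Current CPU and memory usage per pod
kubectl top pods --all-namespaces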