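Installing Kubernetes 1.24.3 from Binaries on Rocky Linux 9.0
Rocky Linux 9.0 Installation and Basic Setup#
Download the Image and Do a Minimal Install#
Download the latest installation ISO from the official Rocky Linux website. The installation itself is not covered here. In 9.0 the root account is locked by default and root may not log in over SSH with a password, so during installation you need to:
- Unlock the root account
- Allow root to log in over SSH with a password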
dnf update -y
reboot
$ uname -r
5.14.0-70.17.1.el9_0.x86_64
$ cat /etc/redhat-release
Rocky Linux release 9.0 (Blue Onyx)
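Host Security Settings#
# Turn off swap
swapoff -a
sed -ri 's/.*swap.*/#&/' /etc/fstab
# Disable the firewall
systemctl stop firewalld
systemctl disable firewalld
# Disable SELinux
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
setenforce 0
Set the NIC IP Address#
Use the nmtui text-based tool to set a static IP address on the network interface.
Set the Hostname#
hostnamectl set-hostname m01
Configure the hosts File for Hostname Resolution#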
cat >> /etc/hosts <<EOF
192.168.2.10 m01
192.168.2.11 w01
192.168.2.12 w02
EOF
Install Required Software and Tools#
dnf install -y wget tree curl bash-completion jq vim net-tools telnet git \
lrzsz epel-release
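Configure the NTP Service#
Rocky Linux 9.0 installs chrony by default and the service is already running. To use different NTP servers, edit /etc/chrony.conf.
# Verify with the client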
$ chronyc sources
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^+ tick.ntp.infomaniak.ch 1 7 373 98 +414us[ +414us] +/- 107ms
^* time.cloudflare.com 3 7 377 100 +1955us[+2064us] +/- 107ms
^+ time.cloudflare.com 3 7 367 99 -9897us[-9897us] +/- 117ms
^+ electrode.felixc.at 3 7 377 98 +1295us[+1295us] +/- 148ms
Configure ulimit#
# File handles
ulimit -SHn 65535
cat > /etc/security/limits.conf <<EOF
* soft nofile 65536
* hard nofile 131072
* soft nproc 655350
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited
EOF
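A soft limit cannot be raised above its matching hard limit, so it can be worth scanning a limits file for such pairs before relying on it. The sketch below uses a hypothetical helper, check_limits (not part of any package), and assumes the simple four-column format shown above; it prints every domain/item pair whose soft value is larger than its hard value.

```shell
#!/usr/bin/env bash
# check_limits FILE: hypothetical helper that reports soft limits larger than
# the hard limit for the same domain and item (four-column limits.conf format).
check_limits() {
  awk '
    $1 ~ /^#/ || NF < 4 { next }                 # skip comments and short lines
    $2 == "soft" { soft[$1 " " $3] = $4 }
    $2 == "hard" { hard[$1 " " $3] = $4 }
    END {
      for (k in soft) {
        if (!(k in hard)) continue
        s = soft[k]; h = hard[k]
        if (h == "unlimited" || h == "infinity") continue
        if (s == "unlimited" || s == "infinity" || s + 0 > h + 0)
          printf "%s: soft %s exceeds hard %s\n", k, s, h
      }
    }' "$1"
}

# Check the system file only if it is readable on this host.
if [ -r /etc/security/limits.conf ]; then
  check_limits /etc/security/limits.conf
fi
```

No output means every soft/hard pair in the file is consistent.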
Install ipvsadm#
dnf install -y ipvsadm ipset sysstat conntrack libseccomp
cat > /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
EOF
systemctl restart systemd-modules-load.service
$ lsmod | grep ip_vs
ip_vs_sh 16384 0
ip_vs_wrr 16384 0
ip_vs_rr 16384 0
ip_vs 188416 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 176128 3 nf_nat,nft_ct,ip_vs
nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs
libcrc32c 16384 5 nf_conntrack,nf_nat,nf_tables,xfs,ip_vs
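To check more than the ip_vs modules, the lsmod output can be compared against a list of expected module names. This is a minimal sketch; missing_modules is a hypothetical helper name. Modules that are built into the kernel or loaded under a different name do not appear in lsmod, so a reported name is a hint to investigate, not proof of a problem.

```shell
#!/usr/bin/env bash
# missing_modules NAME...: read lsmod-style text on stdin and print every
# expected module name that is not listed in the first column.
missing_modules() {
  local loaded m
  loaded=$(awk '{print $1}')
  for m in "$@"; do
    grep -qxF "$m" <<< "$loaded" || echo "$m"
  done
}

# Run against the live module list when lsmod is available.
if command -v lsmod >/dev/null 2>&1; then
  lsmod | missing_modules ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack
fi
```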
Tune Kernel Parameters#
cat > /etc/sysctl.d/95-k8s-sysctl.conf <<EOF
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-arptables = 1
fs.may_detach_mounts = 1
vm.swappiness = 0
vm.overcommit_memory=1
vm.panic_on_oom=0
vm.max_map_count=655360
fs.inotify.max_user_watches=89100
fs.file-max=52706963
fs.nr_open=52706963
net.netfilter.nf_conntrack_max=2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl =15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
EOF
sysctl --system
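Environment Planning#
For learning purposes, what matters most is using k8s and getting familiar with its principles and operations, so this guide builds the simplest possible environment. Once you are comfortable with it, you can extend it to multiple masters, multiple etcd members, a load balancer in front, and so on.
In this deployment, everything is downloaded and generated on the m01 node and then distributed to the two worker nodes.
Deployment Architecture#
Host Planning#
Hostname | IP address | Role | Installed components |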
---|---|---|---|
m01 | 192.168.2.10 | master | apiserver,controller-manager,scheduler,etcd,kubectl,kubelet,kube-proxy,containerd |
w01 | 192.168.2.11 | worker | kubelet,kube-proxy,containerd |
w02 | 192.168.2.12 | worker | kubelet,kube-proxy,containerd |
Network Planning#
- Node network: 192.168.2.0/24
- Service network: 10.96.0.0/16
- Pod network: 172.16.0.0/16
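The three networks must not overlap. A minimal sketch of how that could be checked in pure bash is shown here; ip_to_int, cidr_range and cidr_overlap are hypothetical helper names, and only IPv4 CIDRs are handled.

```shell
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Print "first last" (as integers) for a CIDR such as 10.96.0.0/16.
cidr_range() {
  local ip=${1%/*} bits=${1#*/} base mask
  base=$(ip_to_int "$ip")
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  echo "$(( base & mask )) $(( (base & mask) | (~mask & 0xFFFFFFFF) ))"
}

# Return 0 if the two CIDRs overlap, 1 otherwise.
cidr_overlap() {
  local s1 e1 s2 e2
  read -r s1 e1 <<< "$(cidr_range "$1")"
  read -r s2 e2 <<< "$(cidr_range "$2")"
  (( s1 <= e2 && s2 <= e1 ))
}

# Compare every pair of the planned networks.
nets=(192.168.2.0/24 10.96.0.0/16 172.16.0.0/16)
for ((i = 0; i < ${#nets[@]}; i++)); do
  for ((j = i + 1; j < ${#nets[@]}; j++)); do
    if cidr_overlap "${nets[i]}" "${nets[j]}"; then
      echo "OVERLAP: ${nets[i]} ${nets[j]}"
    fi
  done
done
echo "check finished"
```

With the networks planned here, no OVERLAP line is printed.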
Software Versions#
- kubernetes:1.24.3
- etcd:3.5.4
- cfssl:1.6.1
- coredns:1.9.3
- metrics-server:0.6.1
Basic Setup#
Configure Passwordless SSH#
# Go to root's home directory
cd ~
dnf install -y sshpass
ssh-keygen -f /root/.ssh/id_rsa -P ''
export HOST="m01 w01 w02 192.168.2.10 192.168.2.11 192.168.2.12"
export SSHPASS=<SSH_PASS>
for H in $HOST; do \
sshpass -e ssh-copy-id -o StrictHostKeyChecking=no $H; \
done
$ ssh w01
Last login: Fri Jul 22 15:03:25 2022 from 192.168.2.10
Install the PKI Tool cfssl#
Check the releases page for a newer version of the tool before downloading.
# Download the cfssl binaries
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssl_1.6.1_linux_amd64 -O /usr/local/bin/cfssl
wget https://github.com/cloudflare/cfssl/releases/download/v1.6.1/cfssljson_1.6.1_linux_amd64 -O /usr/local/bin/cfssljson
# Make them executable
chmod +x /usr/local/bin/cfssl*
Install and Configure containerd#
Download the Binaries#
Check the releases page for the version to download and pick the file whose name starts with cri-containerd-cni. This tarball bundles containerd together with the crictl management tool and the CNI network plugins. After downloading, you can list its contents with tar -tf <tarball name>.
# Create the working directories
mkdir -p /root/containerd/{app,bin,cnibin,config,service}
cd /root/containerd
# Download the binaries
wget https://github.com/containerd/containerd/releases/download/v1.6.6/cri-containerd-cni-1.6.6-linux-amd64.tar.gz -O app/containerd.tar.gz
# Extract
tar -xf app/containerd.tar.gz --strip-components=3 -C bin usr/local/bin/{containerd*,crictl,ctr}
tar -xf app/containerd.tar.gz --strip-components=3 -C cnibin opt/cni/bin/*
# Download runc
wget https://github.com/opencontainers/runc/releases/download/v1.1.3/runc.amd64 -O bin/runc
# Make it executable
chmod +x bin/runc
Service Unit File#
Create the script containerd_config.sh:
cat <<'EOF'> containerd_config.sh
# Create the service file
cat > service/containerd.service <<EOF1
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target
[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=infinity
# Comment TasksMax if your systemd version does not support it.
# Only systemd 226 and above support this option.
TasksMax=infinity
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
EOF1
cat > config/containerd.conf <<EOF2
overlay
br_netfilter
EOF2
EOF
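Run it:
bash -x containerd_config.sh
# Generated in the service directory
├── containerd.service
# Generated in the config directory
├── containerd.conf
Generate the Configuration File and Adjust It as Needed#
# Create the configuration file
./bin/containerd config default > config/config.toml
# k8s.gcr.io may not be reachable, so change the sandbox_image parameter. If you leave it unchanged,
# you can instead pull the pause image manually and then retag it:
# ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/kubernetes-kubespray/pause:3.6 k8s.gcr.io/pause:3.6
# root: container storage path; point it at a location with enough disk space
# bin_dir: directory that holds the CNI plugin binaries
# conf_dir: directory that holds the CNI configuration files
# sandbox_image: name and tag of the pause image
vim config/config.toml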
root = "/data/containerd"
#sandbox_image = "k8s.gcr.io/pause:3.6"
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
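The same three changes can also be applied without an editor. The sketch below is one possible approach, not the author's method: patch_containerd_config is a hypothetical helper, it relies on GNU sed, and it assumes the generated default file already contains the root, sandbox_image and SystemdCgroup keys, one per line.

```shell
#!/usr/bin/env bash
# patch_containerd_config FILE IMAGE ROOT
# Rewrite the top-level root path, the pause image and the runc cgroup driver
# in place while keeping each line's original indentation.
patch_containerd_config() {
  local file=$1 image=$2 root=$3
  sed -i \
    -e "s#^root = .*#root = \"${root}\"#" \
    -e "s#^\([[:space:]]*\)sandbox_image = .*#\1sandbox_image = \"${image}\"#" \
    -e "s#^\([[:space:]]*\)SystemdCgroup = .*#\1SystemdCgroup = true#" \
    "$file"
}

# Apply to the generated file if it exists, then show the affected lines.
if [ -f config/config.toml ]; then
  patch_containerd_config config/config.toml \
    registry.aliyuncs.com/google_containers/pause:3.6 /data/containerd
  grep -E '^root|sandbox_image|SystemdCgroup' config/config.toml
fi
```

Reviewing the grep output before distributing the file is still advisable, since a key that is absent from the default file is silently left out.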
Distribute the Binaries and Configuration and Create the Required Paths#
for i in m01 w01 w02; do \
ssh $i "mkdir -p /etc/containerd"; \
ssh $i "mkdir -p /opt/cni/bin"; \
ssh $i "mkdir -p /opt/containerd"; \
ssh $i "mkdir -p /etc/cni/net.d"; \
scp bin/* $i:/usr/local/bin/; \
scp cnibin/* $i:/opt/cni/bin/; \
scp service/containerd.service $i:/usr/lib/systemd/system/; \
scp config/config.toml $i:/etc/containerd/; \
scp config/containerd.conf $i:/etc/modules-load.d/; \
done
Start the containerd Service#
for i in m01 w01 w02; do \
ssh $i "systemctl restart systemd-modules-load.service"; \
ssh $i "systemctl daemon-reload"; \
ssh $i "systemctl enable containerd"; \
ssh $i "systemctl restart containerd --no-block"; \
ssh $i "systemctl is-active containerd"; \
done
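Test containerd#
containerd has the concept of namespaces. Namespaces are isolated from one another, so images and containers in one namespace are not visible from another.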
$ ctr ns list
NAME LABELS
default
# Create a namespace
ctr ns create test
$ ctr ns list
NAME LABELS
default
test
# Test pulling the busybox image. With ctr the image path must be written in full; when no namespace is given, the image is stored in the default namespace
ctr images pull docker.io/library/busybox:latest
# List images; the test namespace has none
$ ctr -n test images list
REF TYPE DIGEST SIZE PLATFORMS LABELS
$ ctr images list
REF TYPE DIGEST SIZE PLATFORMS LABELS
docker.io/library/busybox:latest application/vnd.docker.distribution.manifest.list.v2+json sha256:3614ca5eacf0a3a1bcc361c939202a974b4902b9334ff36eb29ffe9011aaad83 759.3 KiB linux/386,linux/amd64,linux/arm/v5,linux/arm/v6,linux/arm/v7,linux/arm64/v8,linux/mips64le,linux/ppc64le,linux/riscv64,linux/s390x -
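Configure crictl#
Managing images with containerd's own commands is low-level and not very user-friendly. The Kubernetes project provides crictl for managing images; it plays a role similar to the docker command-line tool.
Create the script crictl_config.sh:
cat <<'EOF'> crictl_config.sh
# crictl is a command-line tool that follows the CRI specification; it is typically used to inspect and manage the container runtime and images on a kubelet node
# Before using crictl, create its configuration file first
# Note: runtime-endpoint and image-endpoint must match the settings in /etc/containerd/config.toml.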
cat > config/crictl.yaml <<EOF1
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
timeout: 10
debug: false
EOF1
EOF
Run it:
bash -x crictl_config.sh
## Generated in the config directory
├── crictl.yaml
Distribute crictl.yaml:
for i in m01 w01 w02; do \
scp config/crictl.yaml $i:/etc/
done
Test crictl#
crictl is used in much the same way as docker.
# Pull an image
$ crictl pull busybox
Image is up to date for sha256:62aedd01bd8520c43d06b09f7a0f67ba9720bdc04631a8242c65ea995f3ecac8
# List all CRI images
$ crictl images
IMAGE TAG IMAGE ID SIZE
docker.io/library/busybox latest 2fb6fc2d97e10 777kB
Deploy the etcd Cluster#
Download etcd#
# Create the directories that hold the configuration
mkdir -p /root/etcd/{bin,config,service,ssl,app}
cd /root/etcd
# Download the etcd binaries
# Binary releases on GitHub: https://github.com/etcd-io/etcd/releases
wget https://github.com/etcd-io/etcd/releases/download/v3.5.4/etcd-v3.5.4-linux-amd64.tar.gz -O app/etcd.tar.gz
tar -xf app/etcd.tar.gz --strip-components=1 -C bin/ etcd-v3.5.4-linux-amd64/etcd{,ctl}
Generate the Certificates Used by etcd#
Create the script gen_etcd_cert.sh:
cat <<'EOF'> gen_etcd_cert.sh
# example: ./gen_etcd_cert.sh 127.0.0.1,master01,master02,master03,192.168.10.51,192.168.10.52,192.168.10.53
HOSTNAME=$1
# etcd CA configuration file
cat > ca-config.json <<EOF1
{
"signing": {
"default": {
"expiry": "876000h"
},
"profiles": {
"peer": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry": "876000h"
}
}
}
}
EOF1
# etcd CA certificate signing request file
cat > etcd-ca-csr.json <<EOF2
{
"CN": "etcd",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "GuangDong",
"L": "GuangZhou",
"O": "etcd",
"OU": "Etcd Security"
}
],
"ca": {
"expiry": "876000h"
}
}
EOF2
# Generate the CA root certificate used by the etcd cluster
cfssl gencert \
-initca etcd-ca-csr.json | cfssljson -bare ssl/etcd-ca
# Generate the certificate signing request file used by the etcd cluster
cat > etcd-csr.json <<EOF3
{
"CN": "etcd",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "GuangDong",
"L": "GuangZhou",
"O": "etcd",
"OU": "Etcd Security"
}
]
}
EOF3
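# A production etcd cluster can use three certificates for different kinds of authentication:
# 1. The server certificate held by the etcd server
# 2. The peer certificate that cluster members use to talk to each other
# 3. The client certificate configured in kube-apiserver for mutual TLS with the etcd server
# This learning environment uses a single certificate from the peer profile for all of them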
cfssl gencert \
-ca=ssl/etcd-ca.pem \
-ca-key=ssl/etcd-ca-key.pem \
-config=ca-config.json \
-hostname=${HOSTNAME} \
-profile=peer etcd-csr.json | cfssljson -bare ssl/etcd
EOF
Run it:
# example: bash gen_etcd_cert.sh <IPs and hostnames of the etcd hosts>
bash -x gen_etcd_cert.sh 127.0.0.1,m01,192.168.2.10
# Generated in the ssl directory
├── etcd-ca.csr
├── etcd-ca-key.pem
├── etcd-ca.pem
├── etcd.csr
├── etcd-key.pem
├── etcd.pem
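Since the host list passed to the script ends up as the certificate's Subject Alternative Names, it can be useful to confirm that every address and hostname really is in the issued certificate. A minimal sketch using openssl follows; cert_sans and cert_has_san are hypothetical helper names, and the parsing assumes the usual text layout printed by openssl x509 -text.

```shell
#!/usr/bin/env bash
# Print the Subject Alternative Name line of a PEM certificate.
cert_sans() {
  openssl x509 -in "$1" -noout -text |
    grep -A1 'Subject Alternative Name' | tail -n 1 | sed 's/^[[:space:]]*//'
}

# Return 0 if the certificate lists the given DNS name or IP address as a SAN.
cert_has_san() {
  cert_sans "$1" | tr ',' '\n' | sed 's/^[[:space:]]*//' |
    grep -qxF -e "DNS:$2" -e "IP Address:$2"
}

# Check the etcd certificate when it exists in the working directory.
if [ -f ssl/etcd.pem ]; then
  cert_sans ssl/etcd.pem
  for san in 127.0.0.1 m01 192.168.2.10; do
    cert_has_san ssl/etcd.pem "$san" || echo "missing SAN: $san"
  done
fi
```

The same helpers can be pointed at any other certificate generated in this guide.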
Generate the Configuration File and the Service Unit File#
Create the script etcd_config.sh:
cat <<'EOF'> etcd_config.sh
# example: ./etcd_config.sh master01 192.168.1.51 master02=https://192.168.1.52:2380,master03=https://192.168.1.53:2380
ETCD_NAME=$1
ETCD_IP=$2
ETCD_CLUSTER=$3
WORK_DIR=/opt/etcd
ETCD_CONF_DIR=/opt/etcd/config
ETCD_CA_CERT=etcd-ca.pem
ETCD_SERVER_CERT_PREFIX=etcd
cat > config/etcd.config.yaml.$1 <<EOF1
name: '${ETCD_NAME}'
data-dir: ${WORK_DIR}/data
wal-dir: ${WORK_DIR}/data/wal
snapshot-count: 5000
heartbeat-interval: 100
election-timeout: 1000
quota-backend-bytes: 0
listen-peer-urls: 'https://${ETCD_IP}:2380'
listen-client-urls: 'https://${ETCD_IP}:2379,http://127.0.0.1:2379'
max-snapshots: 3
max-wals: 5
cors:
initial-advertise-peer-urls: 'https://${ETCD_IP}:2380'
advertise-client-urls: 'https://${ETCD_IP}:2379'
discovery:
discovery-fallback: 'proxy'
discovery-proxy:
discovery-srv:
initial-cluster: '${ETCD_NAME}=https://${ETCD_IP}:2380,${ETCD_CLUSTER}'
initial-cluster-token: 'etcd-cluster'
initial-cluster-state: 'new'
strict-reconfig-check: false
enable-v2: true
enable-pprof: true
proxy: 'off'
proxy-failure-wait: 5000
proxy-refresh-interval: 30000
proxy-dial-timeout: 1000
proxy-write-timeout: 5000
proxy-read-timeout: 0
client-transport-security:
  cert-file: '${WORK_DIR}/ssl/${ETCD_SERVER_CERT_PREFIX}.pem'
  key-file: '${WORK_DIR}/ssl/${ETCD_SERVER_CERT_PREFIX}-key.pem'
  client-cert-auth: true
  trusted-ca-file: '${WORK_DIR}/ssl/${ETCD_CA_CERT}'
  auto-tls: true
peer-transport-security:
  cert-file: '${WORK_DIR}/ssl/${ETCD_SERVER_CERT_PREFIX}.pem'
  key-file: '${WORK_DIR}/ssl/${ETCD_SERVER_CERT_PREFIX}-key.pem'
  peer-client-cert-auth: true
  trusted-ca-file: '${WORK_DIR}/ssl/${ETCD_CA_CERT}'
  auto-tls: true
debug: false
log-package-levels:
log-outputs: [default]
force-new-cluster: false
EOF1
cat > service/etcd.service <<EOF2
[Unit]
Description=Etcd Service
Documentation=https://coreos.com/etcd/docs/latest/
After=network.target
[Service]
Type=notify
ExecStart=/usr/local/bin/etcd \\
--config-file=${ETCD_CONF_DIR}/etcd.config.yaml
Restart=on-failure
RestartSec=10
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
Alias=etcd3.service
EOF2
EOF
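Run it:
# example: ./etcd_config.sh <etcd hostname> <ETCD_IP> <other etcd cluster members>
bash -x etcd_config.sh m01 192.168.2.10
# Generated in the config directory
├── etcd.config.yaml.m01
# Generated in the service directory
├── etcd.service
Distribute the etcd Binaries, Certificates, Configuration, and Service File#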
for i in m01; do \
ssh $i "mkdir -p /opt/etcd/{config,data,ssl}"; \
scp bin/etcd* $i:/usr/local/bin; \
scp ssl/etcd{,-key,-ca}.pem $i:/opt/etcd/ssl/; \
scp config/etcd.config.yaml.$i $i:/opt/etcd/config/etcd.config.yaml; \
scp service/etcd.service $i:/usr/lib/systemd/system/; \
done
Start the etcd Service#
for i in m01; do \
ssh $i "systemctl daemon-reload"; \
ssh $i "systemctl enable etcd"; \
ssh $i "systemctl restart etcd --no-block"; \
ssh $i "systemctl is-active etcd"; \
done
Verify the Cluster#
# Check the cluster
export ETCDCTL_API=3
export ENDPOINTS=192.168.2.10:2379
etcdctl \
--endpoints="$ENDPOINTS" \
--cacert=/opt/etcd/ssl/etcd-ca.pem \
--cert=/opt/etcd/ssl/etcd.pem \
--key=/opt/etcd/ssl/etcd-key.pem endpoint status \
--write-out=table
etcdctl \
--endpoints="$ENDPOINTS" \
--cacert=/opt/etcd/ssl/etcd-ca.pem \
--cert=/opt/etcd/ssl/etcd.pem \
--key=/opt/etcd/ssl/etcd-key.pem member list \
--write-out=table
etcdctl \
--endpoints="$ENDPOINTS" \
--cacert=/opt/etcd/ssl/etcd-ca.pem \
--cert=/opt/etcd/ssl/etcd.pem \
--key=/opt/etcd/ssl/etcd-key.pem endpoint health \
--write-out=table
+-------------------+--------+-------------+-------+
| ENDPOINT | HEALTH | TOOK | ERROR |
+-------------------+--------+-------------+-------+
| 192.168.2.10:2379 | true | 20.158058ms | |
+-------------------+--------+-------------+-------+
Deploy the Kubernetes Components#
Three services are deployed on the master node:
- kube-apiserver
- kube-controller-manager
- kube-scheduler
Download the Kubernetes Binaries#
Download the binaries from the download page.
# Create the working directories
mkdir -p /root/k8s/{app,ssl,config,service,bin,kubeconfig}
cd /root/k8s
# Download the Kubernetes server tarball; for another version, change v1.24.3 -> v1.xx.x
wget https://dl.k8s.io/v1.24.3/kubernetes-server-linux-amd64.tar.gz -O app/kubernetes-server.tar.gz
# Extract
tar -xf app/kubernetes-server.tar.gz --strip-components=3 -C bin \
kubernetes/server/bin/kube{let,ctl,-apiserver,-controller-manager,-scheduler,-proxy}
Generate the CA Certificate Used by k8s#
Create the script gen_ca_cert.sh:
cat <<'EOF'> gen_ca_cert.sh
cat > ca-config.json <<EOF1
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"peer": {
"expiry": "87600h",
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
]
}
}
}
}
EOF1
# Generate the configuration file for the CA certificate signing request
cat > ca-csr.json <<EOF2
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "GuangDong",
"L": "GuangZhou",
"O": "Kubernetes",
"OU": "System"
}
],
"ca": {
"expiry": "876000h"
}
}
EOF2
# Generate the CA certificate and the CA private key
cfssl gencert -initca ca-csr.json | cfssljson -bare ssl/ca
EOF
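Run it:
bash -x gen_ca_cert.sh
# Generated in the ssl directory
├── ca-key.pem
├── ca.pem
Deploy the apiserver#
Generate the Certificates Required by the apiserver#
cat <<'EOF'> gen_apiserver_cert.sh
# Generate the apiserver certificate and private key (used when the apiserver communicates with other k8s components)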
APISERVER_NAME=$1
cat > kube-apiserver-csr.json <<EOF1
{
"CN": "kube-apiserver",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "GuangDong",
"L": "GuangZhou",
"O": "Kubernetes",
"OU": "System"
}
]
}
EOF1
cfssl gencert -ca=ssl/ca.pem -ca-key=ssl/ca-key.pem -config=ca-config.json \
-hostname=${APISERVER_NAME} \
-profile=peer kube-apiserver-csr.json | cfssljson -bare ssl/kube-apiserver
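# apiserver aggregation certificates
# Another way to reach kube-apiserver is through a front proxy, and these certificates support that
# TLS-proxied access. In this mode the request is sent to the proxy over HTTP; before forwarding it to
# kube-apiserver, the proxy adds the certificate information to the request headers.
# client -- request --> proxy -- add header info, forward request --> kube-apiserver
# If kube-proxy is not running on the host where the apiserver runs, services cannot be reached through
# their ClusterIP, so the following flag is required:
# --enable-aggregator-routing=true
# Generate the CA signing request file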
cat > front-proxy-ca-csr.json <<EOF2
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"ca": {
"expiry": "876000h"
}
}
EOF2
# This root certificate is used for the requestheader-client-ca-file option; kube-apiserver uses it to verify that a client certificate was issued by this CA
cfssl gencert -initca front-proxy-ca-csr.json | cfssljson -bare ssl/front-proxy-ca
# Generate the front-proxy-client certificate request file
# The CN here must match the apiserver flag --requestheader-allowed-names=front-proxy-client
cat > front-proxy-client-csr.json <<EOF3
{
"CN": "front-proxy-client",
"key": {
"algo": "rsa",
"size": 2048
}
}
EOF3
# Generate the proxy-layer certificate; the proxy uses it to authenticate to kube-apiserver on behalf of the user
cfssl gencert -ca=ssl/front-proxy-ca.pem -ca-key=ssl/front-proxy-ca-key.pem -config=ca-config.json \
-profile=peer front-proxy-client-csr.json | cfssljson -bare ssl/front-proxy-client
# Create the ServiceAccount key pair
# Tokens are one way a ServiceAccount authenticates. Creating a ServiceAccount creates a secret bound
# to it, and that secret holds a token. The controller-manager signs tokens with the private key sa.key,
# and the master node verifies the signatures with the public key sa.pub
openssl genrsa -out ssl/sa.key 2048
openssl rsa -in ssl/sa.key -pubout -out ssl/sa.pub
EOF
Run it:
# 10.96.0.1 is the first IP address in the service cluster IP range
bash -x gen_apiserver_cert.sh 127.0.0.1,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local,10.96.0.1,192.168.2.10
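If a different service network is chosen, the first address of that range has to replace 10.96.0.1 in the certificate host list. The sketch below shows one way to derive it in bash; first_service_ip is a hypothetical helper name and only IPv4 CIDRs are handled.

```shell
#!/usr/bin/env bash
# first_service_ip CIDR: print the first usable address of an IPv4 CIDR,
# i.e. the network address plus one.
first_service_ip() {
  local ip=${1%/*} bits=${1#*/} a b c d n mask
  IFS=. read -r a b c d <<< "$ip"
  n=$(( (a << 24) | (b << 16) | (c << 8) | d ))
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  n=$(( (n & mask) + 1 ))
  echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

first_service_ip 10.96.0.0/16   # prints 10.96.0.1
```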
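Generate the Configuration and the Service Unit File#
Create the script apiserver_config.sh:
# --service-cluster-ip-range: this range must not overlap with the host network or the pod network
cat <<'EOF'> apiserver_config.sh
#!/bin/bash
# Create the kube-apiserver service file with its startup flags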
MASTER_ADDRESS=$1
ETCD_SERVERS=$2
ETCD_CERT_DIR=/opt/etcd/ssl
K8S_CERT_DIR=/opt/k8s/ssl
K8S_CONF_DIR=/opt/k8s/config
API_CERT_PRIFIX=kube-apiserver
cat > service/kube-apiserver.service.${MASTER_ADDRESS} <<EOF1
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target
[Service]
ExecStart=/usr/local/bin/kube-apiserver \\
--v=2 \\
--allow-privileged=true \\
--bind-address=${MASTER_ADDRESS} \\
--advertise-address=${MASTER_ADDRESS} \\
--secure-port=6443 \\
--service-cluster-ip-range=10.96.0.0/16 \\
--service-node-port-range=30000-32767 \\
--etcd-servers=${ETCD_SERVERS} \\
--etcd-cafile=${ETCD_CERT_DIR}/etcd-ca.pem \\
--etcd-certfile=${ETCD_CERT_DIR}/etcd.pem \\
--etcd-keyfile=${ETCD_CERT_DIR}/etcd-key.pem \\
--client-ca-file=${K8S_CERT_DIR}/ca.pem \\
--tls-cert-file=${K8S_CERT_DIR}/${API_CERT_PRIFIX}.pem \\
--tls-private-key-file=${K8S_CERT_DIR}/${API_CERT_PRIFIX}-key.pem \\
--kubelet-client-certificate=${K8S_CERT_DIR}/${API_CERT_PRIFIX}.pem \\
--kubelet-client-key=${K8S_CERT_DIR}/${API_CERT_PRIFIX}-key.pem \\
--service-account-key-file=${K8S_CERT_DIR}/sa.pub \\
--service-account-signing-key-file=${K8S_CERT_DIR}/sa.key \\
--service-account-issuer=https://kubernetes.default.svc.cluster.local \\
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname \\
--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,ResourceQuota \\
--authorization-mode=Node,RBAC \\
--enable-bootstrap-token-auth=true \\
--enable-aggregator-routing=true \\
--proxy-client-cert-file=${K8S_CERT_DIR}/front-proxy-client.pem \\
--proxy-client-key-file=${K8S_CERT_DIR}/front-proxy-client-key.pem \\
--requestheader-client-ca-file=${K8S_CERT_DIR}/front-proxy-ca.pem \\
--requestheader-allowed-names=front-proxy-client \\
--requestheader-group-headers=X-Remote-Group \\
--requestheader-extra-headers-prefix=X-Remote-Extra- \\
--requestheader-username-headers=X-Remote-User
#--token-auth-file=\${K8S_CONF_DIR}/token.csv  authentication with a token file is disabled here
Restart=on-failure
RestartSec=10s
LimitNOFILE=65535
[Install]
WantedBy=multi-user.target
EOF1
EOF
Run it:
# bash apiserver_config.sh <master_IP> <etcd_cluster>
bash -x apiserver_config.sh 192.168.2.10 https://192.168.2.10:2379
# Generated in the service directory
├── kube-apiserver.service.192.168.2.10
Distribute the Binary, Certificates, and Service File#
for i in 192.168.2.10; do \
ssh $i "mkdir -p /opt/k8s/{ssl,config,log}"; \
scp bin/kube-apiserver $i:/usr/local/bin/; \
scp ssl/{kube*.pem,ca{,-key}.pem,front-proxy-client*.pem,front-proxy-ca.pem,sa.*} $i:/opt/k8s/ssl/; \
scp service/kube-apiserver.service.$i $i:/usr/lib/systemd/system/kube-apiserver.service; \
done
Start the kube-apiserver Service#
for i in m01; do \
ssh $i "systemctl daemon-reload"; \
ssh $i "systemctl enable kube-apiserver"; \
ssh $i "systemctl restart kube-apiserver --no-block"; \
ssh $i "systemctl is-active kube-apiserver"; \
done
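Deploy kubectl#
Deploy the kubectl client tool first; once it is in place you can use the kubectl command to inspect the cluster.
Generate the Certificate Required by kubectl#
Create the script gen_kubectl_cert.sh:
cat <<'EOF'> gen_kubectl_cert.sh
# Generate the kubectl certificate and private key
# When k8s is installed it creates a ClusterRole named cluster-admin, which has the highest level of
# privilege over the cluster, and a ClusterRoleBinding, also named cluster-admin, that binds this
# ClusterRole to the group system:masters. Every user in the system:masters group therefore receives
# the permissions granted by the ClusterRole.
# The certificate defines the user clusteradmin in the group system:masters, so clusteradmin has the
# permissions granted by the cluster-admin ClusterRole.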
cat > kubectl-csr.json <<EOF1
{
"CN": "clusteradmin",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "GuangDong",
"L": "GuangZhou",
"O": "system:masters",
"OU": "Kubernetes-manual"
}
]
}
EOF1
cfssl gencert -ca=ssl/ca.pem -ca-key=ssl/ca-key.pem -config=ca-config.json -profile=peer kubectl-csr.json | cfssljson -bare ssl/kubectl
EOF
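Run it:
bash -x gen_kubectl_cert.sh
# Generated in the ssl directory
├── kubectl-key.pem
├── kubectl.pem
Generate the kubeconfig File#
Create the script kubeconfig_kubectl_config.sh:
# Adjust the parameters to suit your own deployment
# USERNAME must match the CN in the certificate request; the user name here is clusteradmin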
cat <<'EOF' > kubeconfig_kubectl_config.sh
APISERVER_IP=$1
K8S_CERT_DIR=$2
PORT=6443
KUBE_APISERVER=https://${APISERVER_IP}:${PORT}
CLUSTER_NAME=kubernetes
USERNAME=clusteradmin
KUBECONFIG_FILE=kubeconfig/kubectl.kubeconfig
CONTEXT_NAME=${USERNAME}@${CLUSTER_NAME}
CERT_PRFIX=kubectl
# Set the cluster parameters
./bin/kubectl config set-cluster ${CLUSTER_NAME} \
--certificate-authority=${K8S_CERT_DIR}/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBECONFIG_FILE}
# Set the client authentication parameters
./bin/kubectl config set-credentials ${USERNAME} \
--client-certificate=${K8S_CERT_DIR}/${CERT_PRFIX}.pem \
--client-key=${K8S_CERT_DIR}/${CERT_PRFIX}-key.pem \
--embed-certs=true \
--kubeconfig=${KUBECONFIG_FILE}
# Set the context, which ties the user to the cluster
./bin/kubectl config set-context ${CONTEXT_NAME} \
--cluster=${CLUSTER_NAME} \
--user=${USERNAME} \
--kubeconfig=${KUBECONFIG_FILE}
# Set the default context
./bin/kubectl config use-context ${CONTEXT_NAME} \
--kubeconfig=${KUBECONFIG_FILE}
EOF
Run it:
bash -x kubeconfig_kubectl_config.sh 192.168.2.10 ssl
## Generated in the kubeconfig directory
├── kubectl.kubeconfig
Distribute the kubeconfig File#
# Distribute the kubeconfig file
for i in m01; do \
ssh $i "mkdir -p $HOME/.kube/"; \
scp bin/kubectl $i:/usr/local/bin/; \
scp kubeconfig/kubectl.kubeconfig $i:$HOME/.kube/config; \
done
kubectl Command Completion#
# bash setup
source <(kubectl completion bash)
echo "source <(kubectl completion bash)" >> ~/.bashrc
Check the Cluster Status#
$ kubectl cluster-info
Kubernetes control plane is running at https://192.168.2.10:6443
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
$ kubectl get componentstatus
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Unhealthy Get "https://127.0.0.1:10259/healthz": dial tcp 127.0.0.1:10259: connect: connection refused
controller-manager Unhealthy Get "https://127.0.0.1:10257/healthz": dial tcp 127.0.0.1:10257: connect: connection refused
etcd-0 Healthy {"health":"true","reason":""}
$ kubectl get all -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 17m
Deploy the controller-manager Component#
Generate the Certificate Required by controller-manager#
Create the certificate-generation script gen_controller_cert.sh:
cat <<'EOF'> gen_controller_cert.sh
#!/bin/bash
CONTROLLER_IP=$1
# Generate the controller-manager certificate signing request
cat > kube-controller-manager-csr.json <<EOF1
{
"CN": "system:kube-controller-manager",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "GuangDong",
"L": "GuangZhou",
"O": "system:kube-controller-manager",
"OU": "Kubernetes-manual"
}
]
}
EOF1
cfssl gencert -ca=ssl/ca.pem -ca-key=ssl/ca-key.pem \
-config=ca-config.json \
-hostname=${CONTROLLER_IP} \
-profile=peer kube-controller-manager-csr.json | cfssljson -bare ssl/kube-controller-manager
EOF
Run it:
bash -x gen_controller_cert.sh 127.0.0.1,192.168.2.10
# Generated in the ssl directory
├── kube-controller-manager-key.pem
├── kube-controller-manager.pem
Generate the kubeconfig File#
cat <<'EOF'> kubeconfig_kube-controller-manager.sh
#!/bin/bash
APISERVER_IP=$1
K8S_CERT_DIR=$2
PORT=6443
KUBE_APISERVER=https://${APISERVER_IP}:${PORT}
KUBECONFIG_FILE=kubeconfig/kube-controller-manager.kubeconfig
CLUSTER_NAME=kubernetes
USERNAME=system:kube-controller-manager
CONTEXT_NAME=${USERNAME}@${CLUSTER_NAME}
CERT_PRFIX=kube-controller-manager
# Set the cluster parameters
./bin/kubectl config set-cluster ${CLUSTER_NAME} \
--certificate-authority=${K8S_CERT_DIR}/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBECONFIG_FILE}
# Set the user authentication parameters
./bin/kubectl config set-credentials ${USERNAME} \
--client-certificate=${K8S_CERT_DIR}/${CERT_PRFIX}.pem \
--client-key=${K8S_CERT_DIR}/${CERT_PRFIX}-key.pem \
--embed-certs=true \
--kubeconfig=${KUBECONFIG_FILE}
# Set the context, which ties the user to the cluster
./bin/kubectl config set-context ${CONTEXT_NAME} \
--cluster=${CLUSTER_NAME} \
--user=${USERNAME} \
--kubeconfig=${KUBECONFIG_FILE}
# Set the default context
./bin/kubectl config use-context ${CONTEXT_NAME} \
--kubeconfig=${KUBECONFIG_FILE}
EOF
Run it:
# example: ./kubeconfig_kube-controller-manager.sh <MASTER_IPADDR> <certificate directory>
bash -x kubeconfig_kube-controller-manager.sh 192.168.2.10 ssl
# Generated in the kubeconfig directory
├── kube-controller-manager.kubeconfig
Generate the kube-controller-manager Service File#
Create the script kube-controller-manager.sh:
# --cluster-cidr is the pod network; it must not overlap with the host network or the service network
cat <<'EOF'> kube-controller-manager.sh
#!/bin/bash
K8S_CERT_DIR=/opt/k8s/ssl
K8S_CONF_DIR=/opt/k8s/config
cat > service/kube-controller-manager.service <<EOF1
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes
After=network.target
[Service]
ExecStart=/usr/local/bin/kube-controller-manager \\
--v=2 \\
--bind-address=127.0.0.1 \\
--root-ca-file=${K8S_CERT_DIR}/ca.pem \\
--cluster-signing-cert-file=${K8S_CERT_DIR}/ca.pem \\
--cluster-signing-key-file=${K8S_CERT_DIR}/ca-key.pem \\
--service-account-private-key-file=${K8S_CERT_DIR}/sa.key \\
--tls-cert-file=${K8S_CERT_DIR}/kube-controller-manager.pem \\
--tls-private-key-file=${K8S_CERT_DIR}/kube-controller-manager-key.pem \\
--kubeconfig=${K8S_CONF_DIR}/kube-controller-manager.kubeconfig \\
--leader-elect=true \\
--use-service-account-credentials=true \\
--node-monitor-grace-period=40s \\
--node-monitor-period=5s \\
--pod-eviction-timeout=2m0s \\
--controllers=*,bootstrapsigner,tokencleaner \\
--allocate-node-cidrs=true \\
--cluster-cidr=172.16.0.0/16 \\
--requestheader-client-ca-file=${K8S_CERT_DIR}/front-proxy-ca.pem \\
--node-cidr-mask-size=24
Restart=always
RestartSec=10s
[Install]
WantedBy=multi-user.target
EOF1
EOF
Run it:
bash -x kube-controller-manager.sh
# Generated in the service directory
├── kube-controller-manager.service
Distribute the Binary, Certificates, kubeconfig, and Service File#
for i in m01; do \
ssh $i "mkdir -p /opt/k8s/{ssl,config}"; \
scp bin/kube-controller-manager $i:/usr/local/bin/; \
scp ssl/kube-controller*.pem $i:/opt/k8s/ssl/; \
scp service/kube-controller-manager.service $i:/usr/lib/systemd/system/; \
scp kubeconfig/kube-controller-manager.kubeconfig $i:/opt/k8s/config/; \
done
Start the kube-controller-manager Service#
for i in m01; do \
ssh $i "systemctl daemon-reload"; \
ssh $i "systemctl enable kube-controller-manager"; \
ssh $i "systemctl restart kube-controller-manager --no-block"; \
ssh $i "systemctl is-active kube-controller-manager"; \
done
Verify#
$ ss -tlp | grep kube-controller
LISTEN 0 16384 127.0.0.1:10257 0.0.0.0:* users:(("kube-controller",pid=2864,fd=7))
Deploy the kube-scheduler Component#
Generate the Certificate Required by kube-scheduler#
Create the script gen_schduler_cert.sh:
cat <<'EOF'> gen_schduler_cert.sh
# Generate the kube-scheduler certificate and private key
SCHEDULER_IP=$1
CSR_NAME_PREFIX=kube-scheduler
cat > ${CSR_NAME_PREFIX}-csr.json <<EOF1
{
"CN": "system:kube-scheduler",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "GuangDong",
"L": "GuangZhou",
"O": "system:kube-scheduler",
"OU": "Kubernetes-manual"
}
]
}
EOF1
cfssl gencert -ca=ssl/ca.pem -ca-key=ssl/ca-key.pem \
-config=ca-config.json \
-hostname=${SCHEDULER_IP} \
-profile=peer ${CSR_NAME_PREFIX}-csr.json | cfssljson -bare ssl/${CSR_NAME_PREFIX}
EOF
Run it:
bash -x gen_schduler_cert.sh 127.0.0.1,192.168.2.10
# Generated in the ssl directory
├── kube-scheduler-key.pem
├── kube-scheduler.pem
Generate the kubeconfig File#
Create the script kubeconfig_kube-scheduler.sh:
cat <<'EOF' > kubeconfig_kube-scheduler.sh
APISERVER_IP=$1
K8S_CERT_DIR=$2
PORT=6443
KUBE_APISERVER=https://${APISERVER_IP}:${PORT}
KUBECONFIG_FILE=kubeconfig/kube-scheduler.kubeconfig
CLUSTER_NAME=kubernetes
USERNAME=system:kube-scheduler
CONTEXT_NAME=${USERNAME}@${CLUSTER_NAME}
CERT_PRFIX=kube-scheduler
# Set the cluster parameters
./bin/kubectl config set-cluster ${CLUSTER_NAME} \
--certificate-authority=${K8S_CERT_DIR}/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBECONFIG_FILE}
# Set the user authentication parameters
./bin/kubectl config set-credentials ${USERNAME} \
--client-certificate=${K8S_CERT_DIR}/${CERT_PRFIX}.pem \
--client-key=${K8S_CERT_DIR}/${CERT_PRFIX}-key.pem \
--embed-certs=true \
--kubeconfig=${KUBECONFIG_FILE}
# Set the context, which ties the user to the cluster
./bin/kubectl config set-context ${CONTEXT_NAME} \
--cluster=${CLUSTER_NAME} \
--user=${USERNAME} \
--kubeconfig=${KUBECONFIG_FILE}
# Set the default context
./bin/kubectl config use-context ${CONTEXT_NAME} \
--kubeconfig=${KUBECONFIG_FILE}
EOF
Run it:
bash -x kubeconfig_kube-scheduler.sh 192.168.2.10 ssl
## Generated in the kubeconfig directory
├── kube-scheduler.kubeconfig
Generate the kube-scheduler Service File#
Create the script kube-scheduler.sh:
cat <<'EOF'> kube-scheduler.sh
K8S_CONF_DIR=/opt/k8s/config
cat > service/kube-scheduler.service <<EOF1
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes
After=network.target
[Service]
ExecStart=/usr/local/bin/kube-scheduler \\
--v=2 \\
--bind-address=127.0.0.1 \\
--leader-elect=true \\
--kubeconfig=${K8S_CONF_DIR}/kube-scheduler.kubeconfig
Restart=always
RestartSec=10s
[Install]
WantedBy=multi-user.target
EOF1
EOF
Run it:
bash -x kube-scheduler.sh
# Generated in the service directory
├── kube-scheduler.service
Distribute the Binary, Certificates, kubeconfig, and Service File#
for i in m01; do \
ssh $i "mkdir -p /opt/k8s/{ssl,config}"; \
scp bin/kube-scheduler $i:/usr/local/bin/; \
scp ssl/kube-scheduler*.pem $i:/opt/k8s/ssl/; \
scp service/kube-scheduler.service $i:/usr/lib/systemd/system/; \
scp kubeconfig/kube-scheduler.kubeconfig $i:/opt/k8s/config/; \
done
Start the kube-scheduler Service#
for i in m01; do \
ssh $i "systemctl daemon-reload"; \
ssh $i "systemctl enable kube-scheduler"; \
ssh $i "systemctl restart kube-scheduler --no-block"; \
ssh $i "systemctl is-active kube-scheduler"; \
done
Verify the Cluster#
$ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true","reason":""}
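At this point the three core components on the master node are deployed.
Deploy kubelet#
Configure TLS Bootstrap#
Why is this certificate not managed by hand? The master nodes of a k8s cluster are usually fixed: once created, they stay the same few machines. Worker nodes change far more often, and issuing certificates by hand every time a node is added, removed, or repaired is tedious. Certificates are also bound to hostnames, and every host has a different name, so a mechanism is needed that handles certificate requests automatically.
The kubelet on every node needs a valid certificate signed by the CA that the apiserver uses before it can communicate with the apiserver. As the number of nodes grows, signing a certificate for each node separately becomes very tedious. With TLS bootstrapping, kubelet first connects to the apiserver as a predefined low-privilege user and then requests a certificate through the apiserver, so the kubelet certificate is issued dynamically instead of being created by hand.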
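The low-privilege identity used for bootstrapping is a bootstrap token of the form <token-id>.<token-secret>, where the id is 6 and the secret is 16 lowercase alphanumeric characters. The sketch below generates a token with openssl and checks its shape before it is used; gen_bootstrap_token and is_valid_bootstrap_token are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Generate a bootstrap token: 3 random bytes give a 6-character hex id,
# 8 random bytes give a 16-character hex secret.
gen_bootstrap_token() {
  local id secret
  id=$(openssl rand -hex 3)
  secret=$(openssl rand -hex 8)
  echo "${id}.${secret}"
}

# Return 0 if the argument has the bootstrap token shape
# [a-z0-9]{6}.[a-z0-9]{16}, 1 otherwise.
is_valid_bootstrap_token() {
  [[ $1 =~ ^[a-z0-9]{6}\.[a-z0-9]{16}$ ]]
}

token=$(gen_bootstrap_token)
if is_valid_bootstrap_token "$token"; then
  echo "token format ok"
else
  echo "unexpected token format" >&2
fi
```

Hex output is a subset of the allowed alphabet, which is why openssl rand -hex is sufficient here.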
Generate the kubeconfig File#
The bootstrap.kubeconfig file is what kubelet uses to request a certificate from the apiserver.
Create the script kubeconfig_bootstrap_config.sh:
cat <<'EOF'> kubeconfig_bootstrap_config.sh
#!/bin/bash
APISERVER_IP=$1
K8S_CERT_DIR=$2
K8S_CONF_DIR=/opt/k8s/config
PORT=6443
KUBE_APISERVER=https://${APISERVER_IP}:${PORT}
KUBECONFIG_FILE=kubeconfig/bootstrap.kubeconfig
CLUSTER_NAME=kubernetes
# Generate the bootstrap token
TOKEN_ID=$(openssl rand -hex 3)
TOKEN_SECRET=$(openssl rand -hex 8)
BOOTSTRAP_TOKEN=${TOKEN_ID}.${TOKEN_SECRET}
USERNAME=system:bootstrap:${TOKEN_ID}
CONTEXT_NAME=${USERNAME}@${CLUSTER_NAME}
# Create bootstrap.kubeconfig
# Set the cluster parameters
./bin/kubectl config set-cluster ${CLUSTER_NAME} \
--certificate-authority=${K8S_CERT_DIR}/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBECONFIG_FILE}
# Set the client credentials; the kubelet authenticates with the bootstrap token
./bin/kubectl config set-credentials ${USERNAME} \
--token=${BOOTSTRAP_TOKEN} \
--kubeconfig=${KUBECONFIG_FILE}
# Set the context parameters
./bin/kubectl config set-context ${CONTEXT_NAME} \
--cluster=${CLUSTER_NAME} \
--user=${USERNAME} \
--kubeconfig=${KUBECONFIG_FILE}
# Select the context as the default in bootstrap.kubeconfig
./bin/kubectl config use-context ${CONTEXT_NAME} --kubeconfig=${KUBECONFIG_FILE}
# Create the bootstrap token secret
cat > config/bootstrap-token-secret.yaml <<EOF1
apiVersion: v1
kind: Secret
metadata:
name: bootstrap-token-${TOKEN_ID}
namespace: kube-system
type: bootstrap.kubernetes.io/token
stringData:
token-id: ${TOKEN_ID}
token-secret: ${TOKEN_SECRET}
usage-bootstrap-authentication: "true"
usage-bootstrap-signing: "true"
auth-extra-groups: system:bootstrappers:default-node-token,system:bootstrappers:worker,system:bootstrappers:ingress
EOF1
EOF
Run the script:
bash -x kubeconfig_bootstrap_config.sh 192.168.2.10 ssl
# Generated in the config directory
├── bootstrap-token-secret.yaml
# Generated in the kubeconfig directory
├── bootstrap.kubeconfig
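The script above builds the token from a 6-character ID and a 16-character secret. As a minimal sketch, the helpers below check that shape and derive the name of the Secret that stores the token. The function names are hypothetical, and the secret part `0123456789abcdef` is a made-up value used only for illustration.

```shell
# Hypothetical helpers, not part of the original script.

# Succeeds when $1 has the bootstrap token shape: 6 characters, a dot,
# then 16 characters, all taken from [a-z0-9].
is_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Print the name of the kube-system Secret that holds the given token:
# "bootstrap-token-" followed by the token ID (the part before the dot).
bootstrap_secret_name() {
  printf 'bootstrap-token-%s\n' "${1%%.*}"
}

# Made-up token for illustration only.
is_bootstrap_token "a0da46.0123456789abcdef" && echo "valid"
bootstrap_secret_name "a0da46.0123456789abcdef"
```

`openssl rand -hex` only emits the characters 0-9 and a-f, so tokens generated by the script always pass this check.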
Import bootstrap-token-secret#
# Create the secret
$ kubectl apply -f config/bootstrap-token-secret.yaml
secret/bootstrap-token-a0da46 created
Check the bootstrap-token#
$ kubectl get secret -nkube-system
NAME TYPE DATA AGE
bootstrap-token-a0da46 bootstrap.kubernetes.io/token 5 14m
Generate the kubelet configuration files#
Create the generator script kubelet_config.sh
cat <<'EOF'> kubelet_config.sh
#!/bin/bash
K8S_CONF_DIR=/opt/k8s/config
K8S_CERT_DIR=/opt/k8s/ssl
CLUSTER_DNS=10.96.0.10
# Generate the kubelet options file
cat > config/kubelet.conf <<EOF1
KUBELET_OPTS="--v=4 \\
--container-runtime-endpoint=unix:///run/containerd/containerd.sock \\
--runtime-cgroups=/systemd/system.slice \\
--kubeconfig=${K8S_CONF_DIR}/kubelet.kubeconfig \\
--bootstrap-kubeconfig=${K8S_CONF_DIR}/bootstrap.kubeconfig \\
--config=${K8S_CONF_DIR}/kubelet.yaml \\
--cert-dir=${K8S_CERT_DIR} \\
--node-labels=node.kubernetes.io/node="
EOF1
# Generate the kubelet configuration yaml file
cat > config/kubelet.yaml <<EOF2
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
readOnlyPort: 10255
authentication:
anonymous:
enabled: false
webhook:
cacheTTL: 2m0s
enabled: true
x509:
clientCAFile: ${K8S_CERT_DIR}/ca.pem
authorization:
mode: Webhook
webhook:
cacheAuthorizedTTL: 5m0s
cacheUnauthorizedTTL: 30s
runtimeRequestTimeout: 15m
cgroupDriver: systemd
cgroupsPerQOS: true
clusterDNS:
- ${CLUSTER_DNS}
clusterDomain: cluster.local
EOF2
# Generate the kubelet.service unit file
cat > service/kubelet.service <<EOF3
[Unit]
Description=Kubernetes Kubelet
After=containerd.service
Requires=containerd.service
[Service]
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/hugetlb/systemd/system.slice
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/blkio/systemd/system.slice
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/cpuset/systemd/system.slice
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/devices/systemd/system.slice
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/net_cls,net_prio/systemd/system.slice
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/perf_event/systemd/system.slice
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/cpu,cpuacct/systemd/system.slice
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/freezer/systemd/system.slice
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/memory/systemd/system.slice
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/pids/systemd/system.slice
ExecStartPre=-/bin/mkdir -p /sys/fs/cgroup/systemd/systemd/system.slice
LimitNOFILE=655350
LimitNPROC=655350
LimitCORE=infinity
LimitMEMLOCK=infinity
# On CentOS-family systems CPUAccounting and MemoryAccounting have to be enabled
CPUAccounting=true
MemoryAccounting=true
EnvironmentFile=${K8S_CONF_DIR}/kubelet.conf
ExecStart=/usr/local/bin/kubelet \$KUBELET_OPTS
Restart=on-failure
KillMode=process
[Install]
WantedBy=multi-user.target
EOF3
EOF
Run the script:
bash -x kubelet_config.sh
# In the service directory
├── kubelet.service
# In the config directory
├── kubelet.conf
├── kubelet.yaml
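The generator scripts in this guide all rely on the same shell technique: an outer heredoc with a quoted delimiter writes the script verbatim, and inner heredocs with unquoted delimiters expand variables when that script runs. The sketch below isolates the escaping rules; the file names `gen.sh` and `unit.txt` and the function name are illustrative assumptions.

```shell
# Hypothetical demo of the nested-heredoc pattern used by the generator scripts.
demo_nested_heredoc() {
  workdir=$(mktemp -d)
  # Outer heredoc: the delimiter is quoted ('EOF'), so nothing is expanded
  # here and gen.sh is written exactly as typed.
  cat <<'EOF' > "$workdir/gen.sh"
CONF_DIR=/opt/k8s/config
# Inner heredoc: the delimiter is unquoted, so ${CONF_DIR} is expanded when
# gen.sh runs, "\\" becomes a single backslash and "\$" a literal "$".
cat > "$1" <<EOF1
ExecStart=/usr/local/bin/kubelet \\
--config=${CONF_DIR}/kubelet.yaml \$KUBELET_OPTS
EOF1
EOF
  sh "$workdir/gen.sh" "$workdir/unit.txt"
  cat "$workdir/unit.txt"
  rm -rf "$workdir"
}

demo_nested_heredoc
```

The first generated line ends in a single backslash, which systemd reads as a line continuation, and `$KUBELET_OPTS` survives unexpanded so that systemd can substitute it from the EnvironmentFile at start-up.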
Distribute the binary, configuration files, certificate, kubeconfig file and service file#
# Distribute to m01
for i in m01; do \
ssh $i "mkdir -p /opt/k8s/{config,ssl,manifests}"; \
scp bin/kubelet $i:/usr/local/bin/; \
scp config/kubelet.{conf,yaml} $i:/opt/k8s/config/; \
scp kubeconfig/bootstrap.kubeconfig $i:/opt/k8s/config/; \
scp ssl/ca.pem $i:/opt/k8s/ssl; \
scp service/kubelet.service $i:/usr/lib/systemd/system/; \
done
Start the kubelet service#
for i in m01; do \
ssh $i "systemctl daemon-reload"; \
ssh $i "systemctl enable kubelet"; \
ssh $i "systemctl restart kubelet --no-block"; \
ssh $i "systemctl is-active kubelet"; \
done
Authorization#
systemctl status kubelet shows that the kubelet failed to start. journalctl -xe -u kubelet.service --no-pager | less then reveals the error below: User "system:bootstrap:a0da46" cannot create the resource certificatesigningrequests.
m01 kubelet[1452]: Error: failed to run Kubelet: cannot create certificate signing request: certificatesigningrequests.certificates.k8s.io is forbidden: User "system:bootstrap:a0da46" cannot create resource "certificatesigningrequests" in API group "certificates.k8s.io" at the cluster scope
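By default the kubelet declares its identity with the preset user token in bootstrap.kubeconfig and then creates a CSR. That user has no permissions by default, not even permission to create a CSR. It therefore has to be authorized: a ClusterRoleBinding binds the bootstrap identity (a preset user such as kubelet-bootstrap, or in this guide the token's group) to the built-in ClusterRole system:node-bootstrapper so that it can submit CSRs.
When bootstrapping with a Bootstrap Token, requests made by the kubelet with the token carry the user name system:bootstrap:<tokenid> and the group system:bootstrappers. k8s already ships a ClusterRole (system:node-bootstrapper) that is allowed to submit CSRs. We need a ClusterRoleBinding that associates this ClusterRole with the token's user name or with one of its groups. After that, system:bootstrap:<tokenid> holds the system:node-bootstrapper permissions, so anyone who connects to the apiserver with this token has those permissions.
Here we create a ClusterRoleBinding (create-csrs-for-bootstrapping) that binds the ClusterRole to a group: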
kubectl create clusterrolebinding create-csrs-for-bootstrapping \
--clusterrole=system:node-bootstrapper \
--group=system:bootstrappers:default-node-token
$ kubectl get clusterrolebinding create-csrs-for-bootstrapping
NAME ROLE AGE
create-csrs-for-bootstrapping ClusterRole/system:node-bootstrapper 22s
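To make the identity mapping concrete, here is a minimal sketch that prints the user name and groups a bootstrap token authenticates as. The function name `bootstrap_identity` is hypothetical; the extra groups are the ones listed under auth-extra-groups in the Secret generated earlier. Groups that the apiserver adds to every authenticated request are left out.

```shell
# Hypothetical helper: $1 is the token ID, $2 the comma-separated value of
# auth-extra-groups from the token Secret (may be empty).
bootstrap_identity() {
  printf 'user: system:bootstrap:%s\n' "$1"
  printf 'group: system:bootstrappers\n'
  if [ -n "$2" ]; then
    printf '%s\n' "$2" | tr ',' '\n' | sed 's/^/group: /'
  fi
}

bootstrap_identity a0da46 \
  "system:bootstrappers:default-node-token,system:bootstrappers:worker,system:bootstrappers:ingress"
```

The ClusterRoleBinding above matches on `system:bootstrappers:default-node-token`, one of the groups this prints, which is why the kubelet is allowed to create a CSR afterwards.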
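Manually approve the kubelet certificate and check the cluster#
Once anyone authenticates successfully with the token, the request moves on to authorization. The apiserver reads the namespace and name information from the token and treats the token specially: it grants the bootstrap rights to whoever presents it and places that otherwise anonymous user in the system:bootstrappers group. From then on, anyone who authenticates with this token holds the system:node-bootstrapper permissions.
# Restart the kubelet service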
systemctl restart kubelet.service
# Check the kubelet's bootstrap certificate request on the node; it is now Pending
$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
node-csr-Rd0Mxw-GPHt5p66qLCnLWUxj0g7uWUDtiNS0UjVafb8 4s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:a0da46 <none> Pending
# Approve the kubelet certificate manually
$ kubectl certificate approve node-csr-Rd0Mxw-GPHt5p66qLCnLWUxj0g7uWUDtiNS0UjVafb8
certificatesigningrequest.certificates.k8s.io/node-csr-Rd0Mxw-GPHt5p66qLCnLWUxj0g7uWUDtiNS0UjVafb8 approved
# Check the request again; the condition is now Approved,Issued
$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
node-csr-Rd0Mxw-GPHt5p66qLCnLWUxj0g7uWUDtiNS0UjVafb8 2m48s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:a0da46 <none> Approved,Issued
# Check the node; it shows NotReady because no network plugin is installed yet
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
m01 NotReady <none> 79s v1.24.3
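When several requests are waiting, the pending names can be picked out of the `kubectl get csr` listing instead of being copied by hand. This is a minimal sketch that assumes the column layout shown above; the function name `pending_csrs` is made up for illustration.

```shell
# Hypothetical helper: read `kubectl get csr` output on stdin and print the
# NAME of every request whose CONDITION (last column) is exactly "Pending".
pending_csrs() {
  awk '$1 != "NAME" && $NF == "Pending" { print $1 }'
}

# Sample copied from the output above.
csr_sample='NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
node-csr-Rd0Mxw-GPHt5p66qLCnLWUxj0g7uWUDtiNS0UjVafb8 4s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:a0da46 <none> Pending'

printf '%s\n' "$csr_sample" | pending_csrs
```

On a live cluster the names could then be passed on, for example `kubectl get csr | pending_csrs | xargs kubectl certificate approve`, after reviewing that every listed request really belongs to a node you expect.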
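Automatic approval, renewal and issuance#
With many worker nodes running kubelet, approving every certificate request by hand is tedious extra work, hence automatic approval, renewal and issuance.
CSRs submitted by the kubelet are signed by the controller manager. For automatic renewal, the controller manager has to sign the certificate automatically whenever the kubelet submits a request. It must not sign every CSR automatically, though, so RBAC rules are needed to make sure that only specific CSRs submitted by the kubelet are approved automatically. The official TLS bootstrapping documentation distinguishes three types of CSR:
- nodeclient: a CSR submitted by the kubelet with O=system:nodes and CN=system:node:(node name)
- selfnodeclient: a CSR submitted by the kubelet to renew its own client certificate (same O and CN as the previous certificate)
- selfnodeserver: a CSR submitted by the kubelet to renew its own serving certificate
In plain terms:
A nodeclient CSR is produced only on first start. A selfnodeclient CSR is produced when the kubelet renews the certificate it uses as a client of the apiserver. A selfnodeserver CSR is produced when the kubelet first requests, or later renews, the certificate for its own API port 10250. The commands below create ClusterRoleBindings for these three types of CSR.
Create 3 clusterrolebindings#
- Automatically approve the CSR for the certificate the kubelet first uses to talk to the apiserver (nodeclient)
- Automatically approve the kubelet's first CSR for the port 10250 certificate (this request is in fact also a selfnodeserver CSR)
- Automatically approve later CSRs that renew the certificate used to talk to the apiserver (selfnodeclient)
- Automatically approve later CSRs that renew the port 10250 certificate (selfnodeserver)
# Automatically approve the kubelet's first CSR (for the certificate used to talk to the apiserver)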
kubectl create clusterrolebinding node-client-auto-approve-csr --clusterrole=system:certificates.k8s.io:certificatesigningrequests:nodeclient --group=system:bootstrappers
# Automatically approve later CSRs with which the kubelet renews the certificate used to talk to the apiserver
kubectl create clusterrolebinding node-client-auto-renew-crt --clusterrole=system:certificates.k8s.io:certificatesigningrequests:selfnodeclient --group=system:nodes
# Automatically approve CSRs submitted by the kubelet for the port 10250 certificate (including later renewals)
kubectl create clusterrolebinding node-server-auto-renew-crt --clusterrole=system:certificates.k8s.io:certificatesigningrequests:selfnodeserver --group=system:nodes
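The three CSR types can be told apart from two columns of `kubectl get csr`: the signer name and the requestor. The sketch below is a simplified classification under that assumption, not the approver's actual logic (which also inspects the certificate request itself); the function name `classify_csr` is hypothetical.

```shell
# Hypothetical helper: $1 = SIGNERNAME, $2 = REQUESTOR.
classify_csr() {
  case "$1" in
    kubernetes.io/kube-apiserver-client-kubelet)
      case "$2" in
        # An existing node renewing its own client certificate.
        system:node:*) echo selfnodeclient ;;
        # Anything else, e.g. a bootstrap token user on first start.
        *) echo nodeclient ;;
      esac ;;
    # Requests for the kubelet's serving certificate (port 10250).
    kubernetes.io/kubelet-serving) echo selfnodeserver ;;
    *) echo other ;;
  esac
}

classify_csr kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:a0da46
classify_csr kubernetes.io/kube-apiserver-client-kubelet system:node:w01
```

Whether a request of each type is actually approved automatically depends on the approver running in the cluster and on the bindings above; this sketch only labels the requests.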
Distribute the binary, configuration files, certificate, kubeconfig file and service file#
# Distribute to w01 and w02
for i in w01 w02; do \
ssh $i "mkdir -p /opt/k8s/{config,ssl,manifests}"; \
scp bin/kubelet $i:/usr/local/bin/; \
scp config/kubelet.{conf,yaml} $i:/opt/k8s/config/; \
scp kubeconfig/bootstrap.kubeconfig $i:/opt/k8s/config/; \
scp ssl/ca.pem $i:/opt/k8s/ssl; \
scp service/kubelet.service $i:/usr/lib/systemd/system/; \
done
Start the kubelet service#
for i in w01 w02; do \
ssh $i "systemctl daemon-reload"; \
ssh $i "systemctl enable kubelet"; \
ssh $i "systemctl restart kubelet --no-block"; \
ssh $i "systemctl is-active kubelet"; \
done
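Check the CSRs and cluster information#
# Check the CSRs again: w01 and w02 have requested their own certificates, which were approved and issued automatically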
$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
node-csr-Rd0Mxw-GPHt5p66qLCnLWUxj0g7uWUDtiNS0UjVafb8 10m kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:a0da46 <none> Approved,Issued
node-csr-ZOEo99khFzaWWKmYTHn8DctSmnvyngGG-DLr7q-wIZo 13s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:a0da46 <none> Approved,Issued
node-csr-fO2nU2tQC9RjwRyvZ6aA2338yWzgA43UZr0ka79rO8A 11s kubernetes.io/kube-apiserver-client-kubelet system:bootstrap:a0da46 <none> Approved,Issued
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
m01 NotReady <none> 9m29s v1.24.3
w01 NotReady <none> 77s v1.24.3
w02 NotReady <none> 75s v1.24.3
# Set the role labels of the nodes
$ kubectl label nodes m01 node-role.kubernetes.io/master=
node/m01 labeled
$ kubectl label nodes w01 node-role.kubernetes.io/worker=
node/w01 labeled
$ kubectl label nodes w02 node-role.kubernetes.io/worker=
node/w02 labeled
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
m01 NotReady master 10m v1.24.3
w01 NotReady worker 2m1s v1.24.3
w02 NotReady worker 119s v1.24.3
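The ROLES column is derived from labels whose key starts with `node-role.kubernetes.io/`. As a minimal sketch of that mapping, the hypothetical helper below takes a comma-separated label list (the shape printed by `kubectl get nodes --show-labels`) and prints the roles it implies.

```shell
# Hypothetical helper: $1 is a comma-separated list of key=value labels.
# Prints the role names found in node-role.kubernetes.io/<role> keys,
# comma-separated, or "<none>" when there are none.
node_roles() {
  roles=$(printf '%s\n' "$1" | tr ',' '\n' |
    sed -n 's|^node-role\.kubernetes\.io/\([^=]*\).*|\1|p' |
    sort | paste -sd, -)
  printf '%s\n' "${roles:-<none>}"
}

node_roles "kubernetes.io/hostname=w01,kubernetes.io/os=linux,node-role.kubernetes.io/worker="
node_roles "kubernetes.io/hostname=m01,kubernetes.io/os=linux"
```

This is why an empty label value is enough in the `kubectl label` commands above: only the key after the prefix matters.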
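Deploy kube-proxy#
kube-proxy runs on every node (here m01, w01 and w02). It watches the apiserver for changes to services and endpoints and programs forwarding rules that provide service IPs and load balancing.
Generate the certificate#
kube-apiserver takes the CN of the client certificate as the user name of kube-proxy, i.e. system:kube-proxy. The predefined RBAC ClusterRoleBinding system:node-proxier binds the user system:kube-proxy to the ClusterRole system:node-proxier, which grants the permissions kube-proxy needs on the proxy-related kube-apiserver APIs: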
$ kubectl describe clusterrolebinding/system:node-proxier
Name: system:node-proxier
Labels: kubernetes.io/bootstrapping=rbac-defaults
Annotations: rbac.authorization.kubernetes.io/autoupdate: true
Role:
Kind: ClusterRole
Name: system:node-proxier
Subjects:
Kind Name Namespace
---- ---- ---------
User system:kube-proxy
Create the generator script gen_kube_proxy_cert.sh
cat <<EOF> gen_kube_proxy_cert.sh
#!/bin/bash
# Generate the kube-proxy certificate and private key
cat > kube-proxy-csr.json <<EOF1
{
"CN": "system:kube-proxy",
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "GuangDong",
"L": "GuangZhou",
"O": "system:kube-proxy",
"OU": "Kubernetes-manual"
}
]
}
EOF1
cfssl gencert -ca=ssl/ca.pem -ca-key=ssl/ca-key.pem -config=ca-config.json -profile=peer kube-proxy-csr.json | cfssljson -bare ssl/kube-proxy
EOF
Run the script:
bash -x gen_kube_proxy_cert.sh
# Generated in the ssl directory
├── kube-proxy-key.pem
├── kube-proxy.pem
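Because the user name comes from the certificate's CN, it is worth checking the subject of the generated certificate. The sketch below extracts the CN from a subject line in the comma-separated format that recent OpenSSL versions print for `openssl x509 -noout -subject`; that format, and the function name `subject_cn`, are assumptions. Note that the country value in this CSR is also the string CN, so matching on the field name rather than the value matters.

```shell
# Hypothetical helper: print the CN field of an OpenSSL subject line such as
# "subject=C = CN, ST = GuangDong, ..., CN = system:kube-proxy".
subject_cn() {
  printf '%s\n' "$1" |
    sed -e 's/^subject= *//' -e 's/^/, /' \
        -e 's/.*, *CN *= *\([^,]*\).*/\1/'
}

# Subject assembled from the fields of kube-proxy-csr.json above.
subject_cn "subject=C = CN, ST = GuangDong, L = GuangZhou, O = system:kube-proxy, OU = Kubernetes-manual, CN = system:kube-proxy"
```

With the real file the input would come from `openssl x509 -in ssl/kube-proxy.pem -noout -subject`; the result should be system:kube-proxy, the user bound by system:node-proxier.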
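Generate the kubeconfig file#
Two authentication methods are available:
- certificate authentication
- token authentication
This deployment uses certificate authentication.
Create the generator script kube-proxy_kubeconfig.sh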
cat <<'EOF'> kube-proxy_kubeconfig.sh
#!/bin/bash
K8S_CONF_DIR=/opt/k8s/config
APISERVER_IP=$1
K8S_CERT_DIR=$2
PORT=6443
CLUSTER_NAME=kubernetes
KUBE_APISERVER=https://${APISERVER_IP}:${PORT}
KUBECONFIG_FILE=kubeconfig/kube-proxy.kubeconfig
USERNAME=system:kube-proxy
CONTEXT_NAME=${USERNAME}@${CLUSTER_NAME}
CERT_PRFIX=kube-proxy
./bin/kubectl config set-cluster ${CLUSTER_NAME} \
--certificate-authority=${K8S_CERT_DIR}/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=${KUBECONFIG_FILE}
./bin/kubectl config set-credentials ${USERNAME} \
--client-certificate=${K8S_CERT_DIR}/${CERT_PRFIX}.pem \
--client-key=${K8S_CERT_DIR}/${CERT_PRFIX}-key.pem \
--embed-certs=true \
--kubeconfig=${KUBECONFIG_FILE}
./bin/kubectl config set-context ${CONTEXT_NAME} \
--cluster=${CLUSTER_NAME} \
--user=${USERNAME} \
--kubeconfig=${KUBECONFIG_FILE}
./bin/kubectl config use-context ${CONTEXT_NAME} \
--kubeconfig=${KUBECONFIG_FILE}
EOF
Run the script:
bash -x kube-proxy_kubeconfig.sh 192.168.2.10 ssl
# Generated in the kubeconfig directory
├── kube-proxy.kubeconfig
Generate the configuration file and service file#
Create the generator script kube-proxy_config.sh
# clusterCIDR: 172.16.0.0/16 is the pod network range
cat <<'EOF'> kube-proxy_config.sh
K8S_CONF_DIR=/opt/k8s/config
CLUSER_CIDR=172.16.0.0/16
## Create the kube-proxy configuration file
cat > config/kube-proxy.yaml <<EOF1
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
clientConnection:
acceptContentTypes: ""
burst: 10
contentType: application/vnd.kubernetes.protobuf
kubeconfig: ${K8S_CONF_DIR}/kube-proxy.kubeconfig
qps: 5
clusterCIDR: ${CLUSER_CIDR}
configSyncPeriod: 15m0s
conntrack:
max: null
maxPerCore: 32768
min: 131072
tcpCloseWaitTimeout: 1h0m0s
tcpEstablishedTimeout: 24h0m0s
enableProfiling: false
healthzBindAddress: 0.0.0.0:10256
hostnameOverride: ""
iptables:
masqueradeAll: false
masqueradeBit: 14
minSyncPeriod: 0s
syncPeriod: 30s
ipvs:
masqueradeAll: true
minSyncPeriod: 5s
scheduler: "rr"
syncPeriod: 30s
kind: KubeProxyConfiguration
metricsBindAddress: 127.0.0.1:10249
mode: "ipvs"
nodePortAddresses: null
oomScoreAdj: -999
portRange: ""
udpIdleTimeout: 250ms
EOF1
#----------------------
# Create the kube-proxy.service unit file
cat > service/kube-proxy.service <<EOF2
[Unit]
Description=Kubernetes Proxy
After=network.target
[Service]
ExecStart=/usr/local/bin/kube-proxy \\
--config=${K8S_CONF_DIR}/kube-proxy.yaml \\
--v=2
Restart=always
RestartSec=10s
[Install]
WantedBy=multi-user.target
EOF2
EOF
Run the script:
bash -x kube-proxy_config.sh
# Generated in the config directory
├── kube-proxy.yaml
# Generated in the service directory
├── kube-proxy.service
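The conntrack section of kube-proxy.yaml sets maxPerCore to 32768 and min to 131072. As a rough sketch of how these two values combine, the connection-tracking limit scales with the number of CPU cores but never drops below min. The function name `conntrack_max` is made up, and the formula reflects my reading of the documented meaning of the two settings rather than the kube-proxy source.

```shell
# Hypothetical helper: $1 = CPU cores, $2 = maxPerCore, $3 = min.
# Prints the larger of (cores * maxPerCore) and min.
conntrack_max() {
  scaled=$(($1 * $2))
  if [ "$scaled" -gt "$3" ]; then
    echo "$scaled"
  else
    echo "$3"
  fi
}

conntrack_max 2 32768 131072   # 65536 is below the floor, so the floor applies
conntrack_max 8 32768 131072   # 262144 exceeds the floor
```

On small lab machines with two or four cores, the min value is therefore the one that takes effect.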
Distribute the binary, configuration file, kubeconfig file and service file#
for i in m01 w01 w02; do \
scp bin/kube-proxy $i:/usr/local/bin/; \
scp config/kube-proxy.yaml $i:/opt/k8s/config/; \
scp kubeconfig/kube-proxy.kubeconfig $i:/opt/k8s/config/; \
scp service/kube-proxy.service $i:/usr/lib/systemd/system/; \
scp ssl/front-proxy-ca.pem $i:/opt/k8s/ssl/; \
done
Start the kube-proxy service#
for i in m01 w01 w02; do \
ssh $i "systemctl daemon-reload"; \
ssh $i "systemctl enable kube-proxy"; \
ssh $i "systemctl restart kube-proxy --no-block"; \
ssh $i "systemctl is-active kube-proxy"; \
done
Check the service status#
# ss -tnlp | grep kube-proxy
LISTEN 0 16384 127.0.0.1:10249 0.0.0.0:* users:(("kube-proxy",pid=2272,fd=14))
LISTEN 0 16384 *:10256 *:* users:(("kube-proxy",pid=2272,fd=12))
# ipvsadm -Ln
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.96.0.1:443 rr
-> 192.168.2.10:6443 Masq 1 0 0
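The `ipvsadm -Ln` listing grows quickly once services are created. The sketch below flattens it into one line per virtual-server/backend pair, assuming the layout shown above; the function name `ipvs_backends` is hypothetical.

```shell
# Hypothetical helper: read `ipvsadm -Ln` output on stdin and print
# "<proto> <virtual ip:port> -> <real ip:port>" for every backend.
ipvs_backends() {
  awk '
    $1 == "TCP" || $1 == "UDP" { vs = $1 " " $2; next }
    $1 == "->" && $2 != "RemoteAddress:Port" { print vs " -> " $2 }
  '
}

# Sample copied from the output above.
ipvs_sample='IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
-> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 10.96.0.1:443 rr
-> 192.168.2.10:6443 Masq 1 0 0'

printf '%s\n' "$ipvs_sample" | ipvs_backends
```

Here the only entry maps the kubernetes service IP 10.96.0.1:443 to the apiserver on m01.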
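Deploy calico#
In calico, the other end of a pod's network interface is not attached to a virtual bridge; it is attached directly to the kernel, so the other end is not visible on any bridge.
Components
- Felix: a daemon running on every node; it handles interface management, route programming, ACL programming, and reporting the state of routes and ACLs
- BIRD: the vRouter implementation. It is a BGP client by default and must run on every node that runs Felix. As the BGP daemon it can act as both a route reflector and a BGP client.
Download calico.yaml and modify the configuration#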
mkdir -p /root/addons
cd /root/addons/
# For detailed calico deployment instructions see
# https://projectcalico.docs.tigera.io/getting-started/kubernetes/self-managed-onprem/onpremises
# Download the calico manifest
curl https://docs.projectcalico.org/manifests/calico.yaml -o calico.yaml
# The simplest change is the CALICO_IPV4POOL_CIDR parameter, the network range from which pod
# addresses are allocated. The default is 192.168.0.0/16; a custom range can be set here
- name: CALICO_IPV4POOL_CIDR
value: "172.16.0.0/16"
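# Everything below is optional
# ------------------------------------------------------------
# yaml configuration file
# cni_network_config: configures how calico integrates with K8s through CNI
# Working mode: the default configuration uses IPIP, which accepts 3 values:
# Always (all traffic), CrossSubnet (cross-subnet traffic only) and Never
# Enable IPIP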
- name: CALICO_IPV4POOL_IPIP
value: "Always"
# Enable or Disable VXLAN on the default IP pool.
- name: CALICO_IPV4POOL_VXLAN
value: "Never"
# To use pure BGP, set both IPIP and VXLAN to Never and add the BGP configuration
# Hybrid mode for hosts in different subnets has 2 combinations: ① IPIP + BGP, ② VXLAN + BGP
# ① IPIP + BGP
- name: CALICO_IPV4POOL_IPIP
value: "CrossSubnet"
- name: CALICO_IPV4POOL_VXLAN
value: "Never"
# ② VXLAN + BGP
- name: CALICO_IPV4POOL_IPIP
value: "Never"
- name: CALICO_IPV4POOL_VXLAN
value: "CrossSubnet"
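# Given a class B range, calico carves it into /26 blocks by default, borrowing host bits beyond
# the class C boundary. This yields more blocks and so supports more nodes, but each block holds
# fewer pod addresses (64 per /26 block). With a kubeadm deployment a node allows at most 110
# pods by default, and that limit can be changed.
# If you prefer subnets that are easy to tell apart, calico can split the pool into /24 blocks
# and assign those to the cluster nodes instead; this requires the following parameter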
- name: CALICO_IPV4POOL_BLOCK_SIZE
value: "24"
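# The controller manager assigns a pod CIDR to each node when --allocate-node-cidrs=true is set;
# the range comes from --cluster-cidr=xxx. By default, however, calico does not allocate pod
# addresses from that CIDR. The following setting makes calico use the pod CIDR that the
# controller manager assigned to the node.
# Set it to "true" and combine it with the host-local IPAM plugin to force allocation from PodCIDR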
- name: USE_POD_CIDR
value: "true"
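# For address allocation, the JSON-format CNI plugin configuration of calico uses its own calico-ipam plugin. That plugin does not use the subnet defined in Node.Spec.PodCIDR as the node-local pool for pod addresses; it allocates from the address pools that calico configures for each node. To make the pool a node actually uses match PodCIDR, set the USE_POD_CIDR environment variable to true on the calico-node container in the pod template of DaemonSet/calico-node, and in ConfigMap/calico-config change plugins.ipam.type under the cni_network_config key to host-local with podCIDR as the subnet, as shown below.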
cni_network_config:
{
"plugins": [
{
"ipam": {
"type": "host-local",
"subnet": "usePodCidr"
},
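The trade-off behind CALICO_IPV4POOL_BLOCK_SIZE is plain prefix arithmetic: a smaller block means more blocks in the pool but fewer addresses in each. The sketch below works this out for the 172.16.0.0/16 pool used in this guide; the function names are made up for illustration.

```shell
# Hypothetical helpers for IPv4 prefix arithmetic.

# Number of addresses in one block of prefix length $1.
block_addresses() {
  echo $((1 << (32 - $1)))
}

# Number of blocks of prefix length $2 that fit in a pool of prefix length $1.
pool_blocks() {
  echo $((1 << ($2 - $1)))
}

block_addresses 26   # default block size
pool_blocks 16 26    # blocks available in a /16 pool
block_addresses 24   # block size set by CALICO_IPV4POOL_BLOCK_SIZE=24
pool_blocks 16 24
```

For the /16 pool this gives 1024 blocks of 64 addresses with the default, or 256 blocks of 256 addresses with a block size of 24.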
Apply the manifest#
kubectl apply -f calico.yaml
Check#
# kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-555bc4b957-26fw8 1/1 Running 0 6m35s
kube-system calico-node-9xhq4 1/1 Running 0 6m35s
kube-system calico-node-gtl2c 1/1 Running 0 6m35s
kube-system calico-node-hvcnw 1/1 Running 0 6m36s
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
m01 Ready master 22m v1.24.3
w01 Ready worker 14m v1.24.3
w02 Ready worker 14m v1.24.3
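Deploy addons#
Deploy coredns#
CoreDNS resolves services inside the cluster: a pod resolves a service name to the service IP and then connects to the application through that IP. Check the coredns release page for the latest version; at the time of writing it is 1.9.3.
Once coredns is deployed, DNS settings are injected into resolv.conf of every pod at start-up.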
$ cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 10.96.0.10
options ndots:5
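The search list and ndots:5 explain why a short name such as kubernetes resolves inside a pod. The sketch below lists the names a resolver would try, assuming glibc-style behavior (names with fewer dots than ndots go through the search list first); other resolver libraries may order the attempts differently, and the function name `dns_candidates` is hypothetical.

```shell
# Hypothetical helper: $1 = name to look up, $2 = ndots, remaining
# arguments = search domains. Prints the candidate names in lookup order.
dns_candidates() {
  name=$1
  ndots=$2
  shift 2
  case "$name" in
    *.) printf '%s\n' "$name"; return 0 ;;  # trailing dot: already absolute
  esac
  dots=$(printf '%s' "$name" | tr -cd '.' | wc -c)
  dots=$((dots))
  # Enough dots: try the name as given before the search list.
  if [ "$dots" -ge "$ndots" ]; then printf '%s\n' "$name"; fi
  for domain in "$@"; do
    printf '%s.%s\n' "$name" "$domain"
  done
  # Too few dots: the name as given is tried last.
  if [ "$dots" -lt "$ndots" ]; then printf '%s\n' "$name"; fi
}

dns_candidates kubernetes 5 default.svc.cluster.local svc.cluster.local cluster.local
```

The first candidate, kubernetes.default.svc.cluster.local, is the one that matches the service in the default namespace.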
Generate the yaml manifest#
Create the script file gen_coredns_config.sh
cat <<'EOF'> gen_coredns_config.sh
DNS_DOMAIN=cluster.local
IMAGE_REGISTRY=registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.9.3
DNS_MEMORY_LIMIT=170Mi
DNS_SERVER_IP=10.96.0.10
cat > coredns.yaml <<EOF1
apiVersion: v1
kind: ServiceAccount
metadata:
name: coredns
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
kubernetes.io/bootstrapping: rbac-defaults
addonmanager.kubernetes.io/mode: Reconcile
name: system:coredns
rules:
- apiGroups:
- ""
resources:
- endpoints
- services
- pods
- namespaces
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- discovery.k8s.io
resources:
- endpointslices
verbs:
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
labels:
kubernetes.io/bootstrapping: rbac-defaults
addonmanager.kubernetes.io/mode: EnsureExists
name: system:coredns
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:coredns
subjects:
- kind: ServiceAccount
name: coredns
namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
labels:
addonmanager.kubernetes.io/mode: EnsureExists
data:
Corefile: |
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes ${DNS_DOMAIN} in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
prometheus :9153
forward . /etc/resolv.conf {
max_concurrent 1000
}
cache 30
loop
reload
loadbalance
}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: coredns
namespace: kube-system
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "CoreDNS"
spec:
# replicas: not specified here:
# 1. In order to make Addon Manager do not reconcile this replicas parameter.
# 2. Default is 1.
# 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
k8s-app: kube-dns
template:
metadata:
labels:
k8s-app: kube-dns
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
priorityClassName: system-cluster-critical
serviceAccountName: coredns
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: k8s-app
operator: In
values: ["kube-dns"]
topologyKey: kubernetes.io/hostname
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
nodeSelector:
kubernetes.io/os: linux
containers:
- name: coredns
image: ${IMAGE_REGISTRY}
imagePullPolicy: IfNotPresent
resources:
limits:
memory: ${DNS_MEMORY_LIMIT}
requests:
cpu: 100m
memory: 70Mi
args: [ "-conf", "/etc/coredns/Corefile" ]
volumeMounts:
- name: config-volume
mountPath: /etc/coredns
readOnly: true
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
- containerPort: 9153
name: metrics
protocol: TCP
livenessProbe:
httpGet:
path: /health
port: 8080
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
readinessProbe:
httpGet:
path: /ready
port: 8181
scheme: HTTP
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
dnsPolicy: Default
volumes:
- name: config-volume
configMap:
name: coredns
items:
- key: Corefile
path: Corefile
---
apiVersion: v1
kind: Service
metadata:
name: kube-dns
namespace: kube-system
annotations:
prometheus.io/port: "9153"
prometheus.io/scrape: "true"
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "CoreDNS"
spec:
selector:
k8s-app: kube-dns
clusterIP: ${DNS_SERVER_IP}
ports:
- name: dns
port: 53
protocol: UDP
- name: dns-tcp
port: 53
protocol: TCP
- name: metrics
port: 9153
protocol: TCP
EOF1
EOF
Run the script:
bash -x gen_coredns_config.sh
# The following file is generated in the current directory
├── coredns.yaml
Apply the yaml manifest and check#
kubectl apply -f coredns.yaml
# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-7f8b8f7b8-rhnnj 1/1 Running 0 20s
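Deploy metrics-server#
Resource metrics in k8s are collected by Metrics-server, which reports the CPU and memory usage of nodes and pods.
The yaml file can be downloaded from the upstream project, but a few parameters have to be changed.
IMAGE_REGISTRY: look up the matching metrics-server image version in the Alibaba Cloud container registry
Generate the yaml manifest#
Create the script file gen_metrics-server_config.sh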
cat <<'EOF'> gen_metrics-server_config.sh
CERT_PATH=/opt/k8s/ssl
CLIENT_CA_FILE=${CERT_PATH}/front-proxy-ca.pem
IMAGE_REGISTRY=registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server:v0.6.1
cat > metrics-server.yaml <<EOF1
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: metrics-server
rbac.authorization.k8s.io/aggregate-to-admin: "true"
rbac.authorization.k8s.io/aggregate-to-edit: "true"
rbac.authorization.k8s.io/aggregate-to-view: "true"
name: system:aggregated-metrics-reader
rules:
- apiGroups:
- metrics.k8s.io
resources:
- pods
- nodes
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
k8s-app: metrics-server
name: system:metrics-server
rules:
- apiGroups:
- ""
resources:
- nodes/metrics
verbs:
- get
- apiGroups:
- ""
resources:
- pods
- nodes
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
labels:
k8s-app: metrics-server
name: metrics-server-auth-reader
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: metrics-server
name: metrics-server:system:auth-delegator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:auth-delegator
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
k8s-app: metrics-server
name: system:metrics-server
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:metrics-server
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
spec:
ports:
- name: https
port: 443
protocol: TCP
targetPort: https
selector:
k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
k8s-app: metrics-server
name: metrics-server
namespace: kube-system
spec:
selector:
matchLabels:
k8s-app: metrics-server
strategy:
rollingUpdate:
maxUnavailable: 0
template:
metadata:
labels:
k8s-app: metrics-server
spec:
containers:
- args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
- --metric-resolution=15s
- --kubelet-insecure-tls # kubectl top nodes
- --requestheader-client-ca-file=${CLIENT_CA_FILE} # aggregation-layer CA certificate (front-proxy-ca.pem)
- --requestheader-username-headers=X-Remote-User
- --requestheader-group-headers=X-Remote-Group
- --requestheader-extra-headers-prefix=X-Remote-Extra-
image: ${IMAGE_REGISTRY}
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 3
httpGet:
path: /livez
port: https
scheme: HTTPS
periodSeconds: 10
name: metrics-server
ports:
- containerPort: 4443
name: https
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
path: /readyz
port: https
scheme: HTTPS
initialDelaySeconds: 20
periodSeconds: 10
resources:
requests:
cpu: 100m
memory: 200Mi
securityContext:
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
volumeMounts:
- mountPath: /tmp
name: tmp-dir
- name: ca-ssl
mountPath: ${CERT_PATH}
nodeSelector:
kubernetes.io/os: linux
priorityClassName: system-cluster-critical
serviceAccountName: metrics-server
hostNetwork: true # kubectl top pods
volumes:
- emptyDir: {}
name: tmp-dir
- name: ca-ssl
hostPath:
path: ${CERT_PATH}
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
labels:
k8s-app: metrics-server
name: v1beta1.metrics.k8s.io
spec:
group: metrics.k8s.io
groupPriorityMinimum: 100
insecureSkipTLSVerify: true
service:
name: metrics-server
namespace: kube-system
version: v1beta1
versionPriority: 100
EOF1
EOF
Run the script:
bash -x gen_metrics-server_config.sh
# Generated in the current directory
└── metrics-server.yaml
Apply the yaml manifest and check#
kubectl apply -f metrics-server.yaml
$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
metrics-server-668477f7f9-xz8l5 1/1 Running 0 36s
# If metrics-server fails to start, check its logs to find the cause
kubectl logs -f metrics-server-668477f7f9-xz8l5 -n=kube-system
Check resource usage#
$ kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
m01 320m 16% 1835Mi 23%
w01 155m 7% 739Mi 88%
w02 130m 6% 696Mi 83%
$ kubectl top pods -A
NAMESPACE NAME CPU(cores) MEMORY(bytes)
kube-system calico-kube-controllers-555bc4b957-x8fh9 4m 20Mi
kube-system calico-node-jxp96 85m 142Mi
kube-system calico-node-lfnr5 81m 160Mi
kube-system calico-node-mj2tj 94m 153Mi
kube-system coredns-7f8b8f7b8-dfn7s 2m 20Mi
kube-system metrics-server-668477f7f9-p9z85 5m 16Mi
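In the `kubectl top nodes` output above both workers already sit above 80% memory. A small filter makes that kind of check repeatable; this is a minimal sketch that assumes the five-column layout shown above, and the function name `nodes_over_memory` is made up.

```shell
# Hypothetical helper: read `kubectl top nodes` output on stdin and print the
# nodes whose MEMORY% (fifth column) is above the threshold given as $1.
nodes_over_memory() {
  awk -v limit="$1" '
    $1 != "NAME" {
      pct = $5
      sub(/%/, "", pct)
      if (pct + 0 > limit + 0) print $1
    }
  '
}

# Sample copied from the output above.
top_sample='NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
m01 320m 16% 1835Mi 23%
w01 155m 7% 739Mi 88%
w02 130m 6% 696Mi 83%'

printf '%s\n' "$top_sample" | nodes_over_memory 80
```

With a threshold of 80 this prints w01 and w02, which suggests the worker VMs in this lab are short on memory.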
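Cluster verification#
A usable cluster has to pass the following checks:
- a pod must be able to resolve a service in the same namespace
- a pod must be able to resolve a service in a different namespace
- every node must be able to reach port 443 of the kubernetes service and port 53 of the kube-dns service
- pod-to-pod communication
- pods in the same namespace can communicate
- pods in different namespaces can communicate
- pods on different worker nodes can communicate
Create a test pod#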
mkdir -p /root/yaml
cd /root/yaml
cat <<EOF> busybox.yaml
apiVersion: v1
kind: Pod
metadata:
name: busybox
namespace: default
spec:
containers:
- name: busybox
image: busybox:1.28
command:
- sleep
- "3600"
imagePullPolicy: IfNotPresent
restartPolicy: Always
EOF
# kubectl apply -f busybox.yaml
pod/busybox created
A pod must be able to resolve a service in the same namespace#
# After installation the default namespace contains one service, kubernetes
$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 10d
$ kubectl exec busybox -- nslookup kubernetes
error: unable to upgrade connection: Forbidden (user=kube-apiserver, verb=create, resource=nodes, subresource=proxy)
# The error says user kube-apiserver is forbidden: the apiserver also needs authorization to access the kubelet, so create a ClusterRoleBinding named apiserver-kubelet
$ kubectl create clusterrolebinding apiserver-kubelet --clusterrole=system:kubelet-api-admin --user=kube-apiserver
# Query again; the name now resolves
$ kubectl exec busybox -- nslookup kubernetes
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local
Resolve a service in a different namespace#
# busybox is in default and kube-dns is in kube-system; the name resolves
$ kubectl exec busybox -n default -- nslookup kube-dns.kube-system
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: kube-dns.kube-system
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
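For an automated check, the resolved address can be pulled out of the nslookup output and compared with the expected cluster IP. The sketch below assumes the output format printed by the busybox 1.28 image used here; the function name `nslookup_address` is hypothetical.

```shell
# Hypothetical helper: read busybox nslookup output on stdin and print the
# first address listed after the "Name:" line (the answer, not the server).
nslookup_address() {
  awk '/^Name:/ { seen = 1; next } seen && /^Address/ { print $3; exit }'
}

# Sample copied from the kubernetes lookup shown earlier.
ns_sample='Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: kubernetes
Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local'

printf '%s\n' "$ns_sample" | nslookup_address
```

On a live cluster the input would come from `kubectl exec busybox -- nslookup kubernetes`, and the result should equal the CLUSTER-IP reported by `kubectl get svc`.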
Every node must be able to reach port 443 of the kubernetes service and port 53 of the kube-dns service#
# Test port 443 of the kubernetes service with telnet.
$ telnet 10.96.0.1 443
Trying 10.96.0.1...
Connected to 10.96.0.1.
Escape character is '^]'.
$ telnet 10.96.0.10 53
Trying 10.96.0.10...
Connected to 10.96.0.10.
Escape character is '^]'.
# Both service ports accept telnet connections
Pod-to-pod communication#
# Pods in the same namespace can communicate
cat <<EOF> busybox1.yaml
apiVersion: v1
kind: Pod
metadata:
name: busybox1
namespace: default
spec:
containers:
- name: busybox
image: busybox:1.28
command:
- sleep
- "3600"
imagePullPolicy: IfNotPresent
restartPolicy: Always
EOF
kubectl apply -f busybox1.yaml
$ kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
busybox 1/1 Running 0 12m 172.16.28.67 m01 <none> <none>
busybox1 1/1 Running 0 11s 172.16.28.68 m01 <none> <none>
$ kubectl exec busybox -- ping 172.16.28.68
PING 172.16.28.68 (172.16.28.68): 56 data bytes
64 bytes from 172.16.28.68: seq=0 ttl=63 time=0.120 ms
64 bytes from 172.16.28.68: seq=1 ttl=63 time=0.081 ms
# Pods in different namespaces can communicate
$ kubectl create ns test
namespace/test created
cat <<EOF> busybox2.yaml
apiVersion: v1
kind: Pod
metadata:
name: busybox2
namespace: test
spec:
containers:
- name: busybox
image: busybox:1.28
command:
- sleep
- "3600"
imagePullPolicy: IfNotPresent
restartPolicy: Always
EOF
kubectl apply -f busybox2.yaml
$ kubectl get pod -A -owide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default busybox 1/1 Running 0 16m 172.16.28.67 m01 <none> <none>
default busybox1 1/1 Running 0 4m 172.16.28.68 m01 <none> <none>
test busybox2 1/1 Running 0 61s 172.16.28.69 m01 <none> <none>
kube-system calico-kube-controllers-555bc4b957-x8fh9 1/1 Running 0 37m 172.16.211.65 w01 <none> <none>
$ kubectl exec busybox -- ping 172.16.28.69
PING 172.16.28.69 (172.16.28.69): 56 data bytes
64 bytes from 172.16.28.69: seq=0 ttl=63 time=0.503 ms
64 bytes from 172.16.28.69: seq=1 ttl=63 time=0.079 ms
# Pods on different nodes can communicate
$ kubectl exec busybox -- ping 172.16.211.65
PING 172.16.211.65 (172.16.211.65): 56 data bytes
64 bytes from 172.16.211.65: seq=0 ttl=62 time=0.753 ms
64 bytes from 172.16.211.65: seq=1 ttl=62 time=1.070 ms
This completes a k8s cluster suitable for learning.