Deploying a Kubernetes Cluster on CentOS
1. Preparing the system environment
# 1. Basic environment setup
yum install -y net-tools conntrack-tools wget vim ntpdate libseccomp libtool-ltdl # run on every machine to install the basic tools
systemctl stop firewalld && systemctl disable firewalld # stop and disable the firewall
setenforce 0 # switch SELinux to permissive mode for the current boot
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
swapoff -a # turn off swap
sed -i 's/.*swap.*/#&/' /etc/fstab
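The sed command above prefixes every line that mentions swap, including lines that are already commented out, so running it twice stacks another '#'. A minimal sketch of a re-runnable variant follows; the function name comment_out_swap and the sample fstab entries are made up for illustration, and the demo works on a scratch file rather than the real /etc/fstab.

```shell
#!/usr/bin/env bash
# comment_out_swap FILE: prefix every active swap entry in an fstab-style
# file with '#'. Lines that are already comments are skipped, so repeated
# runs do not stack '#' characters. (Hypothetical helper, GNU sed assumed.)
comment_out_swap() {
    local fstab="$1"
    sed -i -E '/^[[:space:]]*#/! s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$fstab"
}

# Try it on a scratch copy with illustrative entries.
demo=$(mktemp)
printf '%s\n' \
    '/dev/mapper/centos-root / xfs defaults 0 0' \
    '/dev/mapper/centos-swap swap swap defaults 0 0' > "$demo"
comment_out_swap "$demo"
comment_out_swap "$demo"   # the second run changes nothing
cat "$demo"
```

On a real host the call would be comment_out_swap /etc/fstab, ideally after taking a backup copy of the file.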
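# 2. Set up passwordless SSH login
ssh-keygen -t rsa # generate a key pair
ssh-copy-id <IP address> # copy the public key to each of the other hosts
# 3. Switch to domestic (China) yum mirrors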
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.$(date +%Y%m%d)
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.cloud.tencent.com/repo/centos7_base.repo
wget -O /etc/yum.repos.d/epel.repo http://mirrors.cloud.tencent.com/repo/epel-7.repo
# Docker repo
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
# Configure a domestic Kubernetes repo
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
yum clean all && yum makecache -y
# Some people write the Kubernetes repo like this instead, with GPG checks disabled. Use one form or the other, not both.
[root@localhost ~]# cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF
# 4. Configure kernel parameters so that bridged IPv4 traffic is passed to the iptables chains
modprobe br_netfilter # load the br_netfilter module
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl -p /etc/sysctl.d/k8s.conf
sysctl --system
ls /proc/sys/net/bridge
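Listing /proc/sys/net/bridge only shows that the files exist. As a rough sketch, the check below also confirms that both values are 1. The function name check_bridge_nf is hypothetical, and the directory is a parameter so the function can be tried against a scratch directory.

```shell
#!/usr/bin/env bash
# check_bridge_nf [DIR]: succeed only if both bridge-nf-call files under DIR
# (default /proc/sys/net/bridge) contain the value 1.
check_bridge_nf() {
    local dir="${1:-/proc/sys/net/bridge}" key value
    for key in bridge-nf-call-iptables bridge-nf-call-ip6tables; do
        if [ ! -r "$dir/$key" ]; then
            echo "missing: $dir/$key (is br_netfilter loaded?)"
            return 1
        fi
        value=$(cat "$dir/$key")
        if [ "$value" != "1" ]; then
            echo "$key is $value, expected 1"
            return 1
        fi
    done
    echo "bridge netfilter settings OK"
}
```

Run check_bridge_nf with no arguments on each host. Note that modprobe only loads the module for the current boot; on systemd-based systems a file such as /etc/modules-load.d/k8s.conf (the file name is an assumption) containing the line br_netfilter should load it at every boot.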
# 5. Raise the file descriptor and process limits
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
echo "* soft nproc 65536" >> /etc/security/limits.conf
echo "* hard nproc 65536" >> /etc/security/limits.conf
echo "* soft memlock unlimited" >> /etc/security/limits.conf
echo "* hard memlock unlimited" >> /etc/security/limits.conf
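Because the echo commands above append unconditionally, rerunning this step duplicates every entry in limits.conf. One possible way to make the step repeatable is sketched here; append_once is a hypothetical helper, and the demo writes to a scratch file that stands in for /etc/security/limits.conf.

```shell
#!/usr/bin/env bash
# append_once LINE FILE: append LINE to FILE only if that exact line is not
# already present (-x matches whole lines, -F treats LINE as a literal).
append_once() {
    local line="$1" file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

limits_file=$(mktemp)   # stand-in for /etc/security/limits.conf
for entry in "* soft nofile 65536" "* hard nofile 65536"; do
    append_once "$entry" "$limits_file"
    append_once "$entry" "$limits_file"   # the repeat is a no-op
done
cat "$limits_file"
```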
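2. Installing Docker
Docker now ships in two editions: Docker-CE, the free community edition, and Docker-EE, the commercial enterprise edition. We use CE here. Run these steps on every machine.
# 1. Install the yum utilities
yum install -y yum-utils device-mapper-persistent-data lvm2
# 2. Add the official docker-ce yum repo
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# 3. Disable the docker-ce-edge repo; edge is the unstable development channel, so install from stable
yum-config-manager --disable docker-ce-edge
# 4. Refresh the local yum cache
yum makecache fast
# 5. Install docker-ce
yum -y install docker-ce
# 6. Edit the docker systemd unit file (usually /usr/lib/systemd/system/docker.service) and set ExecStart as follows. This exposes the Docker API on TCP port 2375 without TLS, so use it only on a trusted network.
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
# 7. Reload systemd, then start Docker and enable it at boot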
systemctl daemon-reload
systemctl restart docker && systemctl enable docker
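Editing the packaged unit file directly works, but a package update may overwrite it. A common systemd alternative is a drop-in file, sketched below. The helper name write_docker_dropin and the file name override.conf are assumptions, and the directory is a parameter so the result can be inspected in a scratch location first.

```shell
#!/usr/bin/env bash
# write_docker_dropin [DIR]: write a systemd drop-in that replaces dockerd's
# ExecStart. DIR defaults to the drop-in directory for docker.service.
write_docker_dropin() {
    local dir="${1:-/etc/systemd/system/docker.service.d}"
    mkdir -p "$dir"
    cat > "$dir/override.conf" <<'EOF'
[Service]
# An empty ExecStart= clears the packaged command before the new one is set.
ExecStart=
ExecStart=/usr/bin/dockerd -H tcp://0.0.0.0:2375 -H unix:///var/run/docker.sock
EOF
}
```

After calling write_docker_dropin as root, the same systemctl daemon-reload and systemctl restart docker commands shown above apply.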
Run hello-world to verify the installation
[root@localhost ~]# systemctl start docker
[root@localhost ~]# docker run hello-world
Unable to find image 'hello-world:latest' locally
latest: Pulling from library/hello-world
9a0669468bf7: Pull complete
Digest: sha256:0e06ef5e1945a718b02a8c319e15bae44f47039005530bc617a5d071190ed3fc
Status: Downloaded newer image for hello-world:latest
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://cloud.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/engine/userguide/
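3. Installing the kubelet and kubeadm packages
When kubeadm init initializes the cluster, it pulls the images it depends on and deploys etcd, kube-dns, and kube-proxy. Because of the GFW these images cannot be pulled directly, so first obtain the images listed below by another route, load them onto every host, and then run kubeadm init to initialize the cluster.
1. Use the DaoCloud accelerator (this step can be skipped)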
[root@localhost ~]# curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://0d236e3f.m.daocloud.io
docker version >= 1.12
{"registry-mirrors": ["http://0d236e3f.m.daocloud.io"]}
Success.
You need to restart docker to take effect: sudo systemctl restart docker
[root@localhost ~]# systemctl restart docker
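2. Download the images. You can build them yourself on Docker Hub from a Dockerfile, or pull copies that other people have published.
To download the Kubernetes images manually, see https://hub.docker.com/u/warrior. After downloading, retag each image so that its name starts with k8s.gcr.io or gcr.io/google_containers (use the exact names that the system reports).
# Another author's approach, for reference (I did not use it; I downloaded the images my own way)
images=(kube-controller-manager-amd64 etcd-amd64 k8s-dns-sidecar-amd64 kube-proxy-amd64 kube-apiserver-amd64 kube-scheduler-amd64 pause-amd64 k8s-dns-dnsmasq-nanny-amd64 k8s-dns-kube-dns-amd64)
for imageName in "${images[@]}" ; do
  docker pull champly/$imageName
  docker tag champly/$imageName gcr.io/google_containers/$imageName
  docker rmi champly/$imageName
done
# Then retag with version numbers, which must correspond to the Kubernetes version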
docker tag gcr.io/google_containers/etcd-amd64 gcr.io/google_containers/etcd-amd64:3.0.17 && \
docker rmi gcr.io/google_containers/etcd-amd64 && \
docker tag gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64 gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.5 && \
docker rmi gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64 && \
docker tag gcr.io/google_containers/k8s-dns-kube-dns-amd64 gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.5 && \
docker rmi gcr.io/google_containers/k8s-dns-kube-dns-amd64 && \
docker tag gcr.io/google_containers/k8s-dns-sidecar-amd64 gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.2 && \
docker rmi gcr.io/google_containers/k8s-dns-sidecar-amd64 && \
docker tag gcr.io/google_containers/kube-apiserver-amd64 gcr.io/google_containers/kube-apiserver-amd64:v1.7.5 && \
docker rmi gcr.io/google_containers/kube-apiserver-amd64 && \
docker tag gcr.io/google_containers/kube-controller-manager-amd64 gcr.io/google_containers/kube-controller-manager-amd64:v1.7.5 && \
docker rmi gcr.io/google_containers/kube-controller-manager-amd64 && \
docker tag gcr.io/google_containers/kube-proxy-amd64 gcr.io/google_containers/kube-proxy-amd64:v1.6.0 && \
docker rmi gcr.io/google_containers/kube-proxy-amd64 && \
docker tag gcr.io/google_containers/kube-scheduler-amd64 gcr.io/google_containers/kube-scheduler-amd64:v1.7.5 && \
docker rmi gcr.io/google_containers/kube-scheduler-amd64 && \
docker tag gcr.io/google_containers/pause-amd64 gcr.io/google_containers/pause-amd64:3.0 && \
docker rmi gcr.io/google_containers/pause-amd64
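# The images above are someone else's and their tags are old. The images can be downloaded as shown below instead; their versions must match the installed Kubernetes version. I picked two of the most-downloaded mirror accounts (mirrorgooglecontainers, googlecontainer) and use their images.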
export image=pause:3.1
docker pull mirrorgooglecontainers/${image}
docker tag mirrorgooglecontainers/${image} k8s.gcr.io/${image}
docker rmi mirrorgooglecontainers/${image}
export image=kube-apiserver:v1.14.3
docker pull mirrorgooglecontainers/${image}
docker tag mirrorgooglecontainers/${image} k8s.gcr.io/${image}
docker rmi mirrorgooglecontainers/${image}
export image=kube-scheduler:v1.14.3
docker pull mirrorgooglecontainers/${image}
docker tag mirrorgooglecontainers/${image} k8s.gcr.io/${image}
docker rmi mirrorgooglecontainers/${image}
export image=kube-controller-manager:v1.14.3
docker pull mirrorgooglecontainers/${image}
docker tag mirrorgooglecontainers/${image} k8s.gcr.io/${image}
docker rmi mirrorgooglecontainers/${image}
export image=kube-proxy:v1.14.3
docker pull mirrorgooglecontainers/${image}
docker tag mirrorgooglecontainers/${image} k8s.gcr.io/${image}
docker rmi mirrorgooglecontainers/${image}
export image=k8s-dns-kube-dns-amd64:1.15.3
docker pull mirrorgooglecontainers/${image}
docker tag mirrorgooglecontainers/${image} k8s.gcr.io/${image}
docker rmi mirrorgooglecontainers/${image}
export image=k8s-dns-dnsmasq-nanny-amd64:1.15.3
docker pull mirrorgooglecontainers/${image}
docker tag mirrorgooglecontainers/${image} k8s.gcr.io/${image}
docker rmi mirrorgooglecontainers/${image}
export image=k8s-dns-sidecar-amd64:1.15.3
docker pull mirrorgooglecontainers/${image}
docker tag mirrorgooglecontainers/${image} k8s.gcr.io/${image}
docker rmi mirrorgooglecontainers/${image}
export image=etcd:3.3.10
docker pull mirrorgooglecontainers/${image}
docker tag mirrorgooglecontainers/${image} k8s.gcr.io/${image}
docker rmi mirrorgooglecontainers/${image}
export image=coredns:1.3.1
docker pull coredns/${image}
docker tag coredns/${image} k8s.gcr.io/${image}
docker rmi coredns/${image}
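The ten pull/tag/rmi groups above differ only in the image name, so they can be folded into a loop. This is a sketch under the same image list as above: mirror_image and mirror_all are hypothetical helper names, and the DOCKER variable is an assumed convenience that lets the commands be previewed with DOCKER=echo before anything is pulled.

```shell
#!/usr/bin/env bash
# mirror_image REPO IMAGE: pull IMAGE from REPO, retag it under k8s.gcr.io,
# then remove the original tag. Set DOCKER=echo to print instead of run.
mirror_image() {
    local repo="$1" image="$2" docker="${DOCKER:-docker}"
    "$docker" pull "$repo/$image" &&
    "$docker" tag "$repo/$image" "k8s.gcr.io/$image" &&
    "$docker" rmi "$repo/$image"
}

# mirror_all: process the same images and versions listed above.
mirror_all() {
    local image
    for image in pause:3.1 kube-apiserver:v1.14.3 kube-scheduler:v1.14.3 \
                 kube-controller-manager:v1.14.3 kube-proxy:v1.14.3 \
                 k8s-dns-kube-dns-amd64:1.15.3 k8s-dns-dnsmasq-nanny-amd64:1.15.3 \
                 k8s-dns-sidecar-amd64:1.15.3 etcd:3.3.10; do
        mirror_image mirrorgooglecontainers "$image" || return 1
    done
    # coredns is published under its own account rather than the mirror.
    mirror_image coredns coredns:1.3.1
}
```

Running DOCKER=echo mirror_all prints the docker arguments for review; mirror_all on its own performs the pulls.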
3. Install kubectl, kubelet, kubeadm, and kubernetes-cni
yum list kubectl kubelet kubeadm kubernetes-cni # list the installable packages
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirrors.tuna.tsinghua.edu.cn
* extras: mirrors.sohu.com
* updates: mirrors.sohu.com
# Available packages
kubeadm.x86_64 1.14.3-0 kubernetes
kubectl.x86_64 1.14.3-0 kubernetes
kubelet.x86_64 1.14.3-0 kubernetes
kubernetes-cni.x86_64 0.7.5-0 kubernetes
[root@localhost ~]#
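# Then install kubectl, kubelet, kubeadm, and kubernetes-cni
yum install -y kubectl kubelet kubeadm kubernetes-cni
# kubelet communicates with the rest of the cluster and manages the lifecycle of the pods and containers on its own node.
# kubeadm is the automated deployment tool for Kubernetes; it makes deployment easier and faster.
# kubectl is the command-line tool for managing a Kubernetes cluster.
systemctl enable kubelet && systemctl start kubelet # start the kubelet service on every host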
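The images prepared earlier are tagged for v1.14.3, while an unversioned yum install takes whatever version the repo currently offers. A small sketch for pinning the packages to one version, using yum's name-version syntax, is shown below; the helper name k8s_packages is made up for illustration.

```shell
#!/usr/bin/env bash
# k8s_packages VERSION: print the three core packages pinned to VERSION,
# in the name-version form that yum accepts.
k8s_packages() {
    local version="$1" pkg pkgs=()
    for pkg in kubelet kubeadm kubectl; do
        pkgs+=("$pkg-$version")
    done
    echo "${pkgs[*]}"
}

k8s_packages 1.14.3
```

The output can be passed straight to yum, for example yum install -y $(k8s_packages 1.14.3) kubernetes-cni.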
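4. Initialize the master (run on the master node)
The pod network range is set to 10.244.0.0/16, and --apiserver-advertise-address is the master's own IP address.
By default kubeadm pulls the required images from k8s.gcr.io, which is unreachable from mainland China, so you can also point it at the Alibaba Cloud registry with --image-repository. Since we have already downloaded the images, this flag can be omitted here.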
kubeadm reset && kubeadm init --apiserver-advertise-address=192.168.10.13 --kubernetes-version=v1.14.3 --pod-network-cidr=10.244.0.0/16
# If the images were not prepared in advance, specify the Alibaba Cloud image repository with the command below.
kubeadm init --kubernetes-version=1.14.3 --apiserver-advertise-address=192.168.10.13 --image-repository registry.aliyuncs.com/google_containers --service-cidr=10.1.0.0/16 --pod-network-cidr=10.244.0.0/16
# When the cluster initializes successfully, output like the following is returned:
[preflight] Running pre-flight checks
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Removing kubernetes-managed containers
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/lib/etcd]
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[init] Using Kubernetes version: v1.14.3
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks
[preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.09.0-ce. Max validated version: 1.12
[preflight] Starting the kubelet service
[kubeadm] WARNING: starting in 1.8, tokens expire after 24 hours by default (if you require a non-expiring token use --token-ttl 0)
[certificates] Generated CA certificate and key.
[certificates] Generated API server certificate and key.
[certificates] API Server serving cert is signed for DNS names [master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.0.100]
[certificates] Generated API server kubelet client certificate and key.
[certificates] Generated service account token signing key and public key.
[certificates] Generated front-proxy CA certificate and key.
[certificates] Generated front-proxy client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[apiclient] Created API client, waiting for the control plane to become ready
[apiclient] All control plane components are healthy after 34.002949 seconds
[token] Using token: 0696ed.7cd261f787453bd9
[apiconfig] Created RBAC rules
[addons] Applied essential addon: kube-proxy
[addons] Applied essential addon: kube-dns
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run (as a regular user):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
http://kubernetes.io/docs/admin/addons/
You can now join any number of machines by running the following on each node
as root:
# If everything went well, the following command is returned (be sure to save it)
kubeadm join 192.168.10.13:6443 --token wdmykh.u84g6ijzu4n99qez \
--discovery-token-ca-cert-hash sha256:868b5d27c078ddd3ce98bf67bbad4d8568d3cb134763b732e7ad4b47eda196b2
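Since the join command is copied by hand to every node, a quick sanity check on the two values can catch a truncated paste. The sketch below assumes the usual kubeadm formats: a token of six characters, a dot, and sixteen characters, all lowercase letters or digits, and a hash of sha256: followed by 64 hex digits. The function name valid_join_args is hypothetical.

```shell
#!/usr/bin/env bash
# valid_join_args TOKEN HASH: check that a bootstrap token and CA cert hash
# have the expected shape before they are used in kubeadm join.
valid_join_args() {
    local token="$1" hash="$2"
    local token_re='^[a-z0-9]{6}\.[a-z0-9]{16}$'
    local hash_re='^sha256:[0-9a-f]{64}$'
    if ! [[ "$token" =~ $token_re ]]; then
        echo "bad token: $token"
        return 1
    fi
    if ! [[ "$hash" =~ $hash_re ]]; then
        echo "bad hash: $hash"
        return 1
    fi
    echo "join arguments look well-formed"
}

valid_join_args wdmykh.u84g6ijzu4n99qez \
    sha256:868b5d27c078ddd3ce98bf67bbad4d8568d3cb134763b732e7ad4b47eda196b2
```

As the init output above warns, tokens expire after 24 hours by default. If the saved command is lost or has expired, running kubeadm token create --print-join-command on the master prints a fresh one.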
# Troubleshooting errors
1. [ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables contents are not set to 1
Fix: echo "1" > /proc/sys/net/bridge/bridge-nf-call-iptables
2. [ERROR Swap]: running with swap on is not supported. Please disable swap
Fix: turn off the swap partition with swapoff -a
vim /etc/fstab # comment out the line below
#/dev/mapper/rhel-swap swap swap defaults 0 0
3. [ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
Fix: delete the /var/lib/etcd directory with rm -rf /var/lib/etcd
4. Error message: error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-6443]: Port 6443 is in use
[ERROR Port-10251]: Port 10251 is in use
[ERROR Port-10252]: Port 10252 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
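Fix: as the message suggests, add the flag --ignore-preflight-errors=all. These errors usually mean that an earlier kubeadm init left state behind, so running kubeadm reset first is the cleaner fix.
5. This error is likely caused by: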
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
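Fix: none found yet. The kubelet logs (journalctl -xeu kubelet) are the place to start looking.
Be sure to save the kubeadm join command; it is needed to add nodes.
# Configure kubectl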
mkdir -p /root/.kube
cp /etc/kubernetes/admin.conf /root/.kube/config # for any other user, also copy it into that user's home directory and chown it to them
kubectl get nodes
kubectl get cs
# Deploy the flannel network
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml
# Or do it this way
docker pull quay.io/coreos/flannel:v0.8.0-amd64
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.8.0/Documentation/kube-flannel.yml
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.8.0/Documentation/kube-flannel-rbac.yml
5. Add the worker nodes
# Run this on each node machine
kubeadm join 192.168.10.13:6443 --token wdmykh.u84g6ijzu4n99qez --discovery-token-ca-cert-hash sha256:868b5d27c078ddd3ce98bf67bbad4d8568d3cb134763b732e7ad4b47eda196b2
# The output looks like this
[kubeadm] WARNING: kubeadm is in beta, please do not use it for production clusters.
[preflight] Running pre-flight checks
[preflight] WARNING: docker version is greater than the most recently validated version. Docker version: 17.09.0-ce. Max validated version: 1.12
[preflight] WARNING: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Starting the kubelet service
[discovery] Trying to connect to API Server "192.168.10.13:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.10.13:6443"
[discovery] Cluster info signature and contents are valid, will use API Server "https://192.168.10.13:6443"
[discovery] Successfully established connection with API Server "192.168.10.13:6443"
[bootstrap] Detected server version: v1.14.3
[bootstrap] The server supports the Certificates API (certificates.k8s.io/v1beta1)
[csr] Created API client to obtain unique certificate for this node, generating keys and certificate signing request
[csr] Received signed certificate from the API server, generating KubeConfig...
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
Node join complete:
* Certificate signing request sent to master and response
received.
* Kubelet informed of new secure connection details.
Run 'kubectl get nodes' on the master to see this machine join.
6. Check the cluster
[root@master ~]# kubectl get cs
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health": "true"}
[root@master ~]# kubectl get nodes
NAME STATUS AGE VERSION
master Ready 24m v1.7.5
node1 NotReady 45s v1.7.5
node2 NotReady 7s v1.7.5
[root@master ~]# kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-master 1/1 Running 0 24m
kube-system kube-apiserver-master 1/1 Running 0 24m
kube-system kube-controller-manager-master 1/1 Running 0 24m
kube-system kube-dns-2425271678-h48rw 0/3 ImagePullBackOff 0 25m
kube-system kube-flannel-ds-28n3w 1/2 CrashLoopBackOff 13 24m
kube-system kube-flannel-ds-ndspr 0/2 ContainerCreating 0 41s
kube-system kube-flannel-ds-zvx9j 0/2 ContainerCreating 0 1m
kube-system kube-proxy-qxxzr 0/1 ImagePullBackOff 0 41s
kube-system kube-proxy-shkmx 0/1 ImagePullBackOff 0 25m
kube-system kube-proxy-vtk52 0/1 ContainerCreating 0 1m
kube-system kube-scheduler-master 1/1 Running 0 24m
[root@master ~]#
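In a listing like the one above, the pods that need attention are the ones whose STATUS is not Running. A small filter can pull them out; this is a sketch that assumes the six-column layout shown here (STATUS in the fourth column), and the function name not_running is hypothetical.

```shell
#!/usr/bin/env bash
# not_running: read `kubectl get pods --all-namespaces` output on stdin and
# print "NAME STATUS" for every pod whose status is not Running.
not_running() {
    awk 'NR > 1 && $4 != "Running" { print $2 " " $4 }'
}
```

Typical use is kubectl get pods --all-namespaces | not_running, followed by kubectl -n kube-system describe pod NAME on each pod it lists to see why the image pull or container start failed.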
If you see: The connection to the server localhost:8080 was refused - did you specify the right host or port?
Fix: so that kubectl can reach the apiserver, append the following environment variable to ~/.bash_profile, reload the profile, and run kubectl again:
export KUBECONFIG=/etc/kubernetes/admin.conf
source ~/.bash_profile