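Installing and Deploying a Kubernetes Cluster
Contents:
- Preparation
- Deploying the Master (management) node
- Deploying the Minion (worker) node
- Deploying a Hello World application
- Installing the Dashboard add-on
- Installing the Heapster add-on
- Afterword
Related article: Kubernetes 概念整理 (a summary of Kubernetes concepts)
The related article above already covers the Kubernetes concepts in detail, so I will not repeat them here.
This post records the installation and deployment of Kubernetes. As we all know, Kubernetes is made up of many components (and many of them are hosted on Google sources, you know what that means). I originally wanted to install everything by hand, but after a lot of searching on Google and a long time experimenting, I found the amount of work too large and too complex, so I have put that aside for now.
I did later stumble on a very good tutorial for a fully manual deployment, which I will try some other time.
As the next best option, I used the Kubeadm installer instead. At the time of writing, Kubeadm was not yet suitable for production, so this deployment is for a test environment only.
Enough talk, let's get started (the whole thing took me a bit over a month 😂).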
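1. Preparation
First, the environment I deployed on:
- Host: Mac OS High Sierra
- Virtual machine OS: Ubuntu 16.04
- Virtual machine tool: Vagrant
Because disk space on my machine is limited, I can only run two virtual machines for now, one as the Master (management) node and one as the Minion (worker) node: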
manager1 10.9.10.154
worker1 10.9.10.152
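Both virtual machines were created with Vagrant (see the article: Mac OS 使用 Vagrant 管理虚拟机). To use bridged networking (so that each VM is reachable on its own IP address), remember to add the following to the Vagrantfile: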
config.vm.network "public_network", bridge: "en0: Wi-Fi (AirPort)"
config.vm.boot_timeout = 2000
Once both virtual machines are created, set a password for the root account on each of them and switch to it:
$ vagrant ssh
$ sudo passwd root
$ su root
Then edit /etc/hostname and /etc/hosts on each machine, setting the hostname and the hosts entry as follows:
manager1
10.9.10.154 manager1
worker1
10.9.10.152 worker1
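The hosts edit above can also be scripted. The following is a minimal sketch, not part of the original setup: add_host_entry is a hypothetical helper name, and it assumes a hosts-style file with one "IP name" pair per line. It appends an entry only if that exact line is not already present, so it can be re-run safely.

```shell
# add_host_entry is a hypothetical helper, not part of any tool used in this post.
# $1: hosts-style file, $2: IP address, $3: host name
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  # -q: quiet, -x: match the whole line, -F: fixed string (dots are not regex wildcards)
  if ! grep -qxF "$ip $name" "$file" 2>/dev/null; then
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}
```

On manager1 this could be called as root with add_host_entry /etc/hosts 10.9.10.154 manager1, and on worker1 with its own address and name.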
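Because the installation needs to reach Google sources, we also need to configure a proxy (pointing at the proxy server on the Mac host; I use Shadowsocks, where "HTTP proxy" has to be configured under Preferences):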
export http_proxy=http://10.9.10.215:1087;export https_proxy=http://10.9.10.215:1087;
Once that is set, we can check that it takes effect:
$ curl ip.cn
当前 IP:185.225.14.5 来自:美国
Next, set up Docker on both machines (reference: Get Docker CE for Ubuntu):
$ apt-get update && apt-get install docker.io
$ docker version
Client:
Version: 1.13.1
API version: 1.26
Go version: go1.6.2
Git commit: 092cba3
Built: Thu Nov 2 20:40:23 2017
OS/Arch: linux/amd64
Server:
Version: 1.13.1
API version: 1.26 (minimum version 1.12)
Go version: go1.6.2
Git commit: 092cba3
Built: Thu Nov 2 20:40:23 2017
OS/Arch: linux/amd64
Experimental: false
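Then configure a proxy for Docker so that it can pull Google images (reference article: centos7 docker 使用 https_proxy 代理配置).
Create the directory and the file http-proxy.conf, then add the proxy settings shown below to the file: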
$ mkdir /etc/systemd/system/docker.service.d
$ touch /etc/systemd/system/docker.service.d/http-proxy.conf
$ vi /etc/systemd/system/docker.service.d/http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://10.9.10.215:1087" "HTTPS_PROXY=https://10.9.10.215:1087"
Then restart the Docker service and check that the proxy settings took effect:
$ systemctl daemon-reload
$ systemctl restart docker
$ systemctl show docker --property Environment
Environment=HTTP_PROXY=http://10.9.10.215:1087 HTTPS_PROXY=https://10.9.10.215:1087
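Since the same drop-in file is needed on both nodes, writing it can be wrapped in a small function instead of editing it in vi. This is only a sketch under assumptions: write_docker_proxy_dropin is a hypothetical helper name, and the target directory and proxy URLs are passed in as arguments rather than hard-coded.

```shell
# write_docker_proxy_dropin is a hypothetical helper name.
# $1: directory for the drop-in (on a real host: /etc/systemd/system/docker.service.d)
# $2: HTTP proxy URL, $3: HTTPS proxy URL
write_docker_proxy_dropin() {
  local dir="$1" http_proxy_url="$2" https_proxy_url="$3"
  mkdir -p "$dir"
  # Overwrite the file so that re-running the helper gives the same result.
  cat > "$dir/http-proxy.conf" <<EOF
[Service]
Environment="HTTP_PROXY=$http_proxy_url" "HTTPS_PROXY=$https_proxy_url"
EOF
}
```

After calling it, systemctl daemon-reload and systemctl restart docker are still needed, as in the commands above.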
At this point, the preparation on both the management node and the worker node is complete.
Components on the Master (management) node:
docker
etcd
kube-apiserver
kube-controller-manager
kubelet
kube-scheduler
Components on the Minion (worker) node:
docker
kubelet
kube-proxy
2. Deploying the Master (management) node
Install Kubeadm and the Kubernetes packages:
$ apt-get update && apt-get install -y apt-transport-https
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ apt-get update
$ apt-get install -y kubelet kubeadm kubectl
Initialize the cluster with Kubeadm:
$ kubeadm init --pod-network-cidr=10.244.0.0/16
unable to get URL "https://dl.k8s.io/release/stable-1.9.txt": Get https://storage.googleapis.com/kubernetes-release/release/stable-1.9.txt: dial tcp 216.58.200.48:443: i/o timeout
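The command failed with the error above: Kubeadm timed out while fetching stable-1.9.txt to look up the latest stable version. The fix is to specify the version explicitly.
Run the command again, this time with the version set to 1.9.0: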
$ kubeadm init --kubernetes-version=1.9.0 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.9.10.154 --node-name=manager1
[init] Using Kubernetes version: v1.9.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
[WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Starting the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [manager1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.9.10.154]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 29.006486 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node manager1 as master by adding a label and a taint
[markmaster] Master manager1 tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: 561fd3.febddbbda0c219bc
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: kube-dns
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join --token 1b43ed.7391162ed9072892 10.9.10.154:6443 --discovery-token-ca-cert-hash sha256:6ab593dfe4d2822912d11cf009830dd033591a6285e4b5ce4505c39d4b40f12b
Because we passed --pod-network-cidr=10.244.0.0/16 when initializing the cluster, which means the network component we chose is Flannel, we also need to install Flannel:
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml
The connection to the server localhost:8080 was refused - did you specify the right host or port?
This failed with the error above. The fix (reference: 1.6.1 The connection to the server localhost:8080 was refused):
$ sudo cp /etc/kubernetes/admin.conf $HOME/
$ sudo chown $(id -u):$(id -g) $HOME/admin.conf
$ export KUBECONFIG=$HOME/admin.conf
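The fix is in fact already hinted at in the output of kubeadm init, which asks us to copy admin.conf to $HOME/.kube/config. Setting the KUBECONFIG environment variable to the path of the config file, as done here, achieves the same thing for later commands.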
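Note that export only affects the current shell session, so KUBECONFIG has to be set again after logging back in. One possible way to make it stick is sketched here: persist_kubeconfig is a hypothetical helper name, and the choice of profile file (for example $HOME/.bashrc) is an assumption about the login shell in use.

```shell
# persist_kubeconfig is a hypothetical helper name.
# $1: shell profile to update (for example "$HOME/.bashrc")
# $2: path to the kubeconfig file (for example "$HOME/admin.conf")
persist_kubeconfig() {
  local profile="$1" kubeconfig="$2"
  local line="export KUBECONFIG=$kubeconfig"
  # Append the export only once, so re-running the helper is harmless.
  grep -qxF "$line" "$profile" 2>/dev/null || printf '%s\n' "$line" >> "$profile"
}
```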
Run the Flannel install command again:
$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml
clusterrole "flannel" created
clusterrolebinding "flannel" created
serviceaccount "flannel" created
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created
Then list the nodes:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
manager1 NotReady master 47m v1.9.2
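The node status is NotReady. A node usually stays NotReady until the Pod network (Flannel here) is up and running. I also ran the command below, which removes the master taint so that ordinary Pods can be scheduled on the management node: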
$ kubectl taint nodes --all node-role.kubernetes.io/master-
node "manager1" untainted
List the nodes again (the status is now Ready):
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
manager1 Ready master 48m v1.9.2
At this point, Kubernetes is successfully deployed on the Master (management) node with Kubeadm.
View the cluster information:
$ kubectl cluster-info
Kubernetes master is running at https://10.9.10.154:6443
KubeDNS is running at https://10.9.10.154:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
List all Pods:
$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system etcd-manager1 1/1 Running 0 23h
kube-system kube-apiserver-manager1 1/1 Running 0 23h
kube-system kube-controller-manager-manager1 1/1 Running 0 23h
kube-system kube-dns-6f4fd4bdf-8sbrt 3/3 Running 0 23h
kube-system kube-flannel-ds-j625p 1/1 Running 0 23h
kube-system kube-proxy-jn9hl 1/1 Running 0 23h
kube-system kube-scheduler-manager1 1/1 Running 0 23h
kube-system kubernetes-dashboard-845747bdd4-6fsc4 1/1 Running 0 23h
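If something goes wrong during the installation, you can run kubeadm reset to revert the node and then redo the Kubernetes deployment.
3. Deploying the Minion (worker) node
Install Kubeadm and the Kubernetes packages: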
$ apt-get update && apt-get install -y apt-transport-https
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ apt-get update
$ apt-get install -y kubelet kubeadm kubectl
After the packages are installed, join the worker node to the cluster by running:
$ kubeadm join --token 1b43ed.7391162ed9072892 10.9.10.154:6443 --discovery-token-ca-cert-hash sha256:6ab593dfe4d2822912d11cf009830dd033591a6285e4b5ce4505c39d4b40f12b
[preflight] Running pre-flight checks.
[WARNING FileExisting-crictl]: crictl not found in system path
[discovery] Trying to connect to API Server "10.9.10.154:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.9.10.154:6443"
[discovery] Requesting info from "https://10.9.10.154:6443" again to validate TLS against the pinned public key
[discovery] Failed to request cluster info, will try again: [Get https://10.9.10.154:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: x509: certificate has expired or is not yet valid]
[discovery] Failed to request cluster info, will try again: [Get https://10.9.10.154:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: x509: certificate has expired or is not yet valid]
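This failed with the error above because the clocks on the worker node and the management node were out of sync (run date on both to check). Reference: kubeadm join Endpoint check failed.
The fix is to install the ntp service so that the two machines keep their clocks synchronized.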
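Comparing the output of date by eye is easy to get wrong, so the check can be reduced to comparing two Unix timestamps. The sketch below is illustrative only: clock_skew_seconds and clocks_in_sync are hypothetical helper names, and the tolerance to use is an assumption, not a value taken from Kubeadm.

```shell
# clock_skew_seconds and clocks_in_sync are hypothetical helper names.
# Both take Unix timestamps, e.g. the output of `date +%s` on each machine.

# Prints the absolute difference between two timestamps, in seconds.
clock_skew_seconds() {
  local diff=$(( $1 - $2 ))
  if [ "$diff" -lt 0 ]; then
    diff=$(( 0 - diff ))
  fi
  echo "$diff"
}

# Succeeds (exit status 0) when the timestamps differ by at most $3 seconds.
clocks_in_sync() {
  [ "$(clock_skew_seconds "$1" "$2")" -le "$3" ]
}
```

For example, with the value of date +%s collected from each machine, clocks_in_sync "$worker_ts" "$manager_ts" 5 would report whether the two clocks are within five seconds of each other (worker_ts and manager_ts are placeholder variable names).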
Run the command again:
$ kubeadm join --token 1b43ed.7391162ed9072892 10.9.10.154:6443 --discovery-token-ca-cert-hash sha256:6ab593dfe4d2822912d11cf009830dd033591a6285e4b5ce4505c39d4b40f12b
[preflight] Running pre-flight checks.
[WARNING FileExisting-crictl]: crictl not found in system path
[discovery] Trying to connect to API Server "10.9.10.154:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.9.10.154:6443"
[discovery] Requesting info from "https://10.9.10.154:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.9.10.154:6443"
[discovery] Successfully established connection with API Server "10.9.10.154:6443"
This node has joined the cluster:
* Certificate signing request was sent to master and a response
was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the master to see this node join the cluster.
As you can see, the worker node joined the cluster successfully. Next, try listing the nodes from the worker node:
$ kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?
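This is the same error we hit when configuring the cluster on the management node. The reason is that kubectl connects to an API Server at localhost:8080 by default, so we have to set the environment variable by hand.
Look at the configuration in /etc/kubernetes/kubelet.conf: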
$ cat /etc/kubernetes/kubelet.conf
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: .......
server: https://10.9.10.154:6443
name: default-cluster
contexts:
- context:
cluster: default-cluster
namespace: default
user: default-auth
name: default-context
current-context: default-context
kind: Config
preferences: {}
users:
- name: default-auth
user:
client-certificate: /var/lib/kubelet/pki/kubelet-client.crt
client-key: /var/lib/kubelet/pki/kubelet-client.key
It contains the API Server address of our management node (https://10.9.10.154:6443). Now point the KUBECONFIG environment variable at a copy of this file:
$ sudo cp /etc/kubernetes/kubelet.conf $HOME/
$ sudo chown $(id -u):$(id -g) $HOME/kubelet.conf
$ export KUBECONFIG=$HOME/kubelet.conf
Then list the nodes again from the worker node:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
manager1 Ready master 18h v1.9.2
worker1 Ready <none> 57m v1.9.3
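Now we can use kubectl on the worker node to view cluster information (view only, no changes), although the information still comes from the API Server.
This completes the deployment of the worker node.
4. Deploying a Hello World application
With the Kubernetes nodes configured, let's deploy a sample application (reference: Use a Service to Access an Application in a Cluster).
Create a Deployment named hello-world: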
$ kubectl run hello-world --replicas=1 --labels="run=load-balancer-example" --image=gcr.io/google-samples/node-hello:1.0 --port=8080
deployment "hello-world" created
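--replicas=1 means that one Pod is created; --port=8080 is the port of the container, not the port used for external access.
Once it is created, the following commands show the details and progress of the deployment: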
$ kubectl get deployments hello-world
$ kubectl describe deployments hello-world
$ kubectl get replicasets
$ kubectl describe replicasets
After the deployment succeeds, we also need to create a matching Service (of type NodePort):
$ kubectl expose deployment hello-world --type=NodePort --name=example-service
service "example-service" exposed
Once it is created, look at the Service details (31860 is the externally exposed port):
$ kubectl describe services example-service
Name: example-service
Namespace: default
Labels: run=load-balancer-example
Annotations: <none>
Selector: run=load-balancer-example
Type: NodePort
IP: 10.96.52.166
Port: <unset> 8080/TCP
TargetPort: 8080/TCP
NodePort: <unset> 31860/TCP
Endpoints: 10.244.3.8:8080
Session Affinity: None
External Traffic Policy: Cluster
Events: <none>
We can also look at the Pod:
$ kubectl get pods --selector="run=load-balancer-example" --output=wide
NAME READY STATUS RESTARTS AGE IP NODE
hello-world-58f9949f8-c28gx 1/1 Running 0 15m 10.244.3.8 worker1
Now we can open http://10.9.10.152:31860/ directly in a browser (10.9.10.152 is the IP address of worker1), or test it from the command line:
$ curl http://10.9.10.152:31860/
Hello Kubernetes!
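Instead of reading the NodePort out of the describe output by eye, it can be queried with kubectl's JSONPath output. A small sketch, assuming the Service exposes a single port; service_node_port and service_url are hypothetical helper names:

```shell
# service_node_port and service_url are hypothetical helper names.

# Prints the nodePort of the first port of a Service in the current namespace.
service_node_port() {
  kubectl get service "$1" -o jsonpath='{.spec.ports[0].nodePort}'
}

# Builds the external URL from a node IP and a Service name.
service_url() {
  local node_ip="$1" service="$2"
  echo "http://$node_ip:$(service_node_port "$service")/"
}
```

With the Service created above, curl "$(service_url 10.9.10.152 example-service)" would then reach the application without hard-coding the port.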
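As a side note, Kubernetes has three ways of exposing a service:
- LoadBalancer Service: a component that ties Kubernetes closely to a cloud platform. Exposing a service this way actually asks the underlying cloud platform to create a load balancer that exposes the service. Platform support is fairly mature by now, for example GCE and DigitalOcean abroad, Alibaba Cloud in China, and OpenStack for private clouds. Because it is so tightly coupled to the cloud platform, it can only be used on those platforms.
- NodePort Service: as the name suggests, it exposes a port on every node of the cluster and maps that port to a specific service. Although each node has many ports (0~65535; by default Kubernetes only allocates NodePorts from 30000-32767), it may not see much real use, for reasons of security and usability (things get messy with many services, and ports can conflict).
- Ingress: this only appeared after Kubernetes 1.2. With Ingress, an open-source reverse proxy and load balancer such as nginx can be used to expose services.
5. Installing the Dashboard add-on
Kubernetes Dashboard: https://github.com/kubernetes/dashboard
Install the Dashboard with kubectl: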
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
secret "kubernetes-dashboard-certs" created
serviceaccount "kubernetes-dashboard" created
role "kubernetes-dashboard-minimal" created
rolebinding "kubernetes-dashboard-minimal" created
deployment "kubernetes-dashboard" created
service "kubernetes-dashboard" created
$ kubectl proxy
Starting to serve on 127.0.0.1:8001
Command to delete the running Dashboard Pod (it only removes the Pod, which the Deployment then recreates):
$ kubectl -n kube-system delete $(kubectl -n kube-system get pod -o name | grep dashboard)
pod "kubernetes-dashboard-3313488171-7706x" deleted
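The Dashboard installed above can only be reached from the local machine. To reach it from outside, some more configuration is needed (reference: Accessing Dashboard 1.7.X and above).
First delete the Dashboard, then run the kubectl apply -f ... command to create it again, and then run the command below (step 2: modify the kubernetes-dashboard Service):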
$ kubectl -n kube-system edit service kubernetes-dashboard
The following configuration is shown:
apiVersion: v1
kind: Service
metadata:
annotations:
kubectl.kubernetes.io/last-applied-configuration: |
{"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"k8s-app":"kubernetes-dashboard"},"name":"kubernetes-dashboard","namespace":"kube-system"},"spec":{"ports":[{"port":443,"targetPort":8443}],"selector":{"k8s-app":"kubernetes-dashboard"}}}
creationTimestamp: 2018-02-08T05:30:23Z
labels:
k8s-app: kubernetes-dashboard
name: kubernetes-dashboard
namespace: kube-system
resourceVersion: "6652"
selfLink: /api/v1/namespaces/kube-system/services/kubernetes-dashboard
uid: 29323549-0c91-11e8-98f6-0293061ca64f
spec:
clusterIP: 10.97.21.104
externalTrafficPolicy: Cluster
ports:
- nodePort: 30835
port: 443
protocol: TCP
targetPort: 8443
selector:
k8s-app: kubernetes-dashboard
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
Change type: ClusterIP to type: NodePort (clusterIP does not need to change), then save.
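The same change can be made without opening an editor by using kubectl patch. This is a sketch rather than the method used in this post; expose_dashboard_nodeport is a hypothetical helper name, and the Service name and namespace are the ones shown above.

```shell
# expose_dashboard_nodeport is a hypothetical helper name.
# It switches the kubernetes-dashboard Service to type NodePort with a merge patch,
# leaving every other field (including clusterIP) untouched.
expose_dashboard_nodeport() {
  kubectl -n kube-system patch service kubernetes-dashboard \
    -p '{"spec":{"type":"NodePort"}}'
}
```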
Check the assigned NodePort (NodePort 30835 maps to Service port 443, which forwards to port 8443 of the Dashboard Pod):
$ kubectl get services kubernetes-dashboard -n kube-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes-dashboard NodePort 10.97.21.104 <none> 443:30835/TCP 29m
Then check that the controller is running normally:
$ kubectl get deployment kubernetes-dashboard -n kube-system
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
kubernetes-dashboard 1 1 1 1 30m
$ kubectl get pods -n kube-system | grep dashboard
kubernetes-dashboard-845747bdd4-r4xsj 1/1 Running 0 30m
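There are three ways to access the Dashboard (reference: 安装 Dashboard 插件):
- The kubernetes-dashboard Service exposes a NodePort, so the Dashboard can be reached at https://NodeIP:nodePort.
- Through the API Server (https on port 6443, or http on port 8080).
- Through kubectl proxy.
5.1 Access through NodePort
The configuration above is in fact the first way, so we can access it directly: https://10.9.10.154:30835/
5.2 Access through kubectl proxy
The second way requires configuring certificates; there is material about it online, so I will not cover it here. For the third way, run: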
$ kubectl proxy --address='10.9.10.154' --port=8086 --accept-hosts='^*$'
Starting to serve on 10.9.10.154:8086
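The --accept-hosts option is required; otherwise the browser shows "Unauthorized" when opening the Dashboard page. Opening http://10.9.10.154:8086/ui in a browser redirects to: http://10.9.10.154:8086/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=default
Open https://10.9.10.154:30835/#!/login and the login window appears (you can also choose "Skip", but then you have no permission to do anything).
5.3 Logging in with a token
Here we log in with a token (reference: Access control).
Run: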
$ kubectl -n kube-system get secret
NAME TYPE DATA AGE
attachdetach-controller-token-tlj4p kubernetes.io/service-account-token 3 2h
bootstrap-signer-token-vdkth kubernetes.io/service-account-token 3 2h
bootstrap-token-561fd3 bootstrap.kubernetes.io/token 7 2h
certificate-controller-token-w9bb7 kubernetes.io/service-account-token 3 2h
clusterrole-aggregation-controller-token-bztd8 kubernetes.io/service-account-token 3 2h
cronjob-controller-token-4f7t4 kubernetes.io/service-account-token 3 2h
daemon-set-controller-token-q7865 kubernetes.io/service-account-token 3 2h
default-token-khxp8 kubernetes.io/service-account-token 3 2h
deployment-controller-token-vlsrk kubernetes.io/service-account-token 3 2h
disruption-controller-token-mfnwh kubernetes.io/service-account-token 3 2h
endpoint-controller-token-hm69z kubernetes.io/service-account-token 3 2h
flannel-token-mrzqq kubernetes.io/service-account-token 3 52m
generic-garbage-collector-token-m78z6 kubernetes.io/service-account-token 3 2h
horizontal-pod-autoscaler-token-c6v4w kubernetes.io/service-account-token 3 2h
job-controller-token-jkf4b kubernetes.io/service-account-token 3 2h
kube-dns-token-qvqhk kubernetes.io/service-account-token 3 2h
kube-proxy-token-rqjdn kubernetes.io/service-account-token 3 2h
kubernetes-dashboard-certs Opaque 0 47m
kubernetes-dashboard-key-holder Opaque 2 47m
kubernetes-dashboard-token-n4sjc kubernetes.io/service-account-token 3 47m
namespace-controller-token-qvblb kubernetes.io/service-account-token 3 2h
node-controller-token-qpz8t kubernetes.io/service-account-token 3 2h
persistent-volume-binder-token-qfrkh kubernetes.io/service-account-token 3 2h
pod-garbage-collector-token-trzhf kubernetes.io/service-account-token 3 2h
replicaset-controller-token-gp75m kubernetes.io/service-account-token 3 2h
replication-controller-token-9tmhr kubernetes.io/service-account-token 3 2h
resourcequota-controller-token-2djtx kubernetes.io/service-account-token 3 2h
service-account-controller-token-vxkp8 kubernetes.io/service-account-token 3 2h
service-controller-token-q54bc kubernetes.io/service-account-token 3 2h
statefulset-controller-token-ldcz9 kubernetes.io/service-account-token 3 2h
token-cleaner-token-lk9kt kubernetes.io/service-account-token 3 2h
ttl-controller-token-9mks8 kubernetes.io/service-account-token 3 2h
Find and copy the name kubernetes-dashboard-token-n4sjc, then run:
$ kubectl -n kube-system describe secret kubernetes-dashboard-token-n4sjc
Name: kubernetes-dashboard-token-nstpj
Namespace: kube-system
Labels: <none>
Annotations: kubernetes.io/service-account.name=kubernetes-dashboard
kubernetes.io/service-account.uid=94dc6cdc-0cb4-11e8-ae1c-0293061ca64f
Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1025 bytes
namespace: 11 bytes
token: eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi1uc3RwaiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6Ijk0ZGM2Y2RjLTBjYjQtMTFlOC1hZTFjLTAyOTMwNjFjYTY0ZiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTprdWJlcm5ldGVzLWRhc2hib2FyZCJ9.gfYyS4l4z-qjz7_38E9bJ2hg8P173qKPJFVoFxX3uMX4gDgiYQJoCjx6ldiIX8oSVXfg572iUH2qMdCEskYYKbYfYOiuyqLL_O7I6pWic0XbW4qB_h-6PJmlqQUPhwVykDxb3qObnUm97tswbCjic461XqrrL3RwKqL8ox0wyLVhlB-QU4HrHzxUNCinffGLUMznFK0lFp4MfrFCK9TfAZEsXAWycsMBm9g-6ltUUVn7Y7aJrb4jxX5QfCsuKsySYvzOibkdzNGeMM8hVjxofrhE36nbgvpu98zVEcqgtz8ipbBi8K__O0_mlbpb5wCbW8VJuRCegMTKAB19Rb3NLg
Then take the token at the bottom and use it to log in.
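Picking the right secret out of that long list by hand is error-prone, so the two lookups can be filtered with awk. This is a sketch under assumptions: dashboard_secret_name and secret_token are hypothetical helper names, and they rely on the column layout of the kubectl output shown above.

```shell
# dashboard_secret_name and secret_token are hypothetical helper names.

# Reads `kubectl -n kube-system get secret` output on stdin and prints the name of
# the first secret that belongs to the kubernetes-dashboard service account.
dashboard_secret_name() {
  awk '$1 ~ /^kubernetes-dashboard-token-/ { print $1; exit }'
}

# Reads `kubectl describe secret` output on stdin and prints the token value.
secret_token() {
  awk '$1 == "token:" { print $2 }'
}

# Possible usage on the management node:
#   name=$(kubectl -n kube-system get secret | dashboard_secret_name)
#   kubectl -n kube-system describe secret "$name" | secret_token
```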
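5.4 Fixing access permission problems
After logging in, however, an error appeared (I had taken the token from the key replicaset-controller-token-kzpmc, when it should have been kubernetes-dashboard-token-n4sjc from above).
kube-apiserver has RBAC authorization enabled, and dashboard-controller.yaml in the official source tree does not define an authorized ServiceAccount, so later calls to the API Server are rejected and the web UI shows: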
Forbidden (403)
User "system:serviceaccount:kube-system:default" cannot list jobs.batch in the namespace "default". (get jobs.batch)
The fix (reference: Update docs to cover RBAC changes in K8S 1.6):
# Create the clusterrole and clusterrolebinding:
# $ kubectl create -f kube-dashboard-rbac.yml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: kubernetes-dashboard
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kube-system
roleRef:
kind: ClusterRole
name: cluster-admin
apiGroup: rbac.authorization.k8s.io
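This problem held me up for a long time. Even after running the command above, it still did not work for me. I later redeployed both Kubernetes and the Kubernetes Dashboard, and edited the official dashboard-controller.yaml to append the configuration above at the end (the complete configuration I use now: https://github.com/yuezhongxin/Kubernetes.Sample/blob/master/kubernetes-dashboard.yaml).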
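For reference, a binding equivalent to the YAML above can also be created imperatively with kubectl create clusterrolebinding. This is a sketch and not the approach this post ended up using; bind_dashboard_cluster_admin is a hypothetical helper name. Keep in mind that cluster-admin gives the Dashboard full control of the cluster, which is only reasonable in a test environment like this one.

```shell
# bind_dashboard_cluster_admin is a hypothetical helper name.
# It binds the cluster-admin ClusterRole to the kubernetes-dashboard
# ServiceAccount in the kube-system namespace.
bind_dashboard_cluster_admin() {
  kubectl create clusterrolebinding kubernetes-dashboard \
    --clusterrole=cluster-admin \
    --serviceaccount=kube-system:kubernetes-dashboard
}
```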
Install command (step 1: do not use the earlier install command; install directly with this one):
$ kubectl apply -f https://raw.githubusercontent.com/yuezhongxin/Kubernetes.Sample/master/kubernetes-dashboard.yaml
secret "kubernetes-dashboard-certs" created
serviceaccount "kubernetes-dashboard" created
role "kubernetes-dashboard-minimal" created
rolebinding "kubernetes-dashboard-minimal" created
deployment "kubernetes-dashboard" created
service "kubernetes-dashboard" created
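It still did not work at the time, but the next day, after I restarted the virtual machines and Kubernetes, it worked and the Dashboard loaded successfully.
I think the earlier failure had two causes: one was the configuration in dashboard-controller.yaml, and the other was that I used the wrong token. I had taken the token from the key replicaset-controller-token-kzpmc, when it should have been kubernetes-dashboard-token-n4sjc from above. kubernetes-dashboard is the name value in the configuration above; the two must match, otherwise permission errors occur.
So, in my view, the way to solve the Dashboard permission problem is:
- Create the Dashboard from the improved dashboard-controller.yaml instead of the official configuration file.
- Step 3: get the correct token value (from the key kubernetes-dashboard-token-n4sjc).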
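6. Installing the Heapster add-on
The Heapster add-on consists of three parts:
- Heapster: collects the CPU, memory, load and other utilization metrics of each Node and Pod, which are then shown as graphs.
- InfluxDB: the database that stores Pod-related data; after Heapster collects data, it can be configured to store it in InfluxDB.
- Grafana: mainly used to display the data in InfluxDB, so that we can see how the data changes at a glance.
Go to the Heapster release page and download the latest version of Heapster.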
$ wget https://github.com/kubernetes/heapster/archive/v1.5.1.zip
$ unzip v1.5.1.zip
$ cd heapster-1.5.1/deploy/kube-config/influxdb
$ ls
grafana.yaml heapster.yaml influxdb.yaml
Deploy:
$ kubectl create -f .
deployment "monitoring-grafana" created
service "monitoring-grafana" created
serviceaccount "heapster" created
deployment "heapster" created
service "heapster" created
deployment "monitoring-influxdb" created
service "monitoring-influxdb" created
Delete:
$ kubectl --namespace kube-system delete deployment heapster &&
kubectl --namespace kube-system delete deployment monitoring-grafana &&
kubectl --namespace kube-system delete deployment monitoring-influxdb &&
kubectl --namespace kube-system delete service heapster &&
kubectl --namespace kube-system delete service monitoring-grafana &&
kubectl --namespace kube-system delete service monitoring-influxdb &&
kubectl --namespace kube-system delete serviceaccounts heapster
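Because everything was created from the manifests in one directory, the same directory can be handed to kubectl delete -f instead of naming each resource. The sketch below combines that with a re-create, since the steps that follow delete and redeploy several times. redeploy_heapster_stack is a hypothetical helper name; if deletion is still in progress when the create step runs, the create may have to be repeated.

```shell
# redeploy_heapster_stack is a hypothetical helper name.
# $1: directory holding grafana.yaml, heapster.yaml and influxdb.yaml
redeploy_heapster_stack() {
  local dir="$1"
  # --ignore-not-found keeps the delete step from failing on a first run.
  kubectl delete -f "$dir" --ignore-not-found
  kubectl create -f "$dir"
}
```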
View the logs:
$ kubectl get pods -n kube-system
$ kubectl logs -n kube-system -f pods/heapster-5d4dcf5f49-vq64w
E0520 01:18:55.360650 1 reflector.go:190]
k8s.io/heapster/metrics/util/util.go:51: Failed to list *v1.Node: Get
https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp
10.96.0.1:443: i/o timeout
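The fix: look up the cluster IP of the kubernetes Service, then set it as the --source value in heapster.yaml: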
$ kubectl get services
$ vi heapster.yaml
- --source=kubernetes:https://10.96.0.1
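The edit in vi can also be done with sed. A minimal sketch, assuming heapster.yaml contains a single line with --source=kubernetes: followed by the API Server URL; set_heapster_source is a hypothetical helper name.

```shell
# set_heapster_source is a hypothetical helper name.
# $1: path to heapster.yaml, $2: API Server URL to use as the Kubernetes source
set_heapster_source() {
  local file="$1" url="$2"
  # Rewrite only the value after "--source=kubernetes:", keeping the indentation.
  # A backup of the original file is kept next to it with a .bak suffix.
  sed -i.bak "s|--source=kubernetes:.*|--source=kubernetes:$url|" "$file"
}
```

With the address found through kubectl get services, the call would be set_heapster_source heapster.yaml https://10.96.0.1.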
After deleting and redeploying, the log shows:
E0223 08:16:16.766774 1 reflector.go:190] k8s.io/heapster/metrics/util/util.go:30: Failed to list *v1.Node: nodes is forbidden: User "system:serviceaccount:kube-system:heapster" cannot list nodes at the cluster scope
The heapster service account is not allowed to list nodes, so create the RBAC binding that ships with Heapster:
$ kubectl create -f /home/ubuntu/heapster-1.5.1/deploy/kube-config/rbac/heapster-rbac.yaml
clusterrolebinding "heapster" created
After deleting and redeploying once more, the log shows:
E0223 08:19:45.066213 1 influxdb.go:208] Failed to create influxdb: failed to ping InfluxDB server at "monitoring-influxdb.kube-system.svc:8086" - Get http://monitoring-influxdb.kube-system.svc:8086/ping: dial tcp: lookup monitoring-influxdb.kube-system.svc on 10.96.0.10:53: read udp 10.244.3.25:46031->10.96.0.10:53: i/o timeout
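Not solved yet.
It looks like a network problem, because the Service that was created does not seem to take effect.
7. Afterword
As for Kubernetes, I personally think it is the strongest option in the microservice orchestration space, and it offers a range of powerful features. But it involves a great many things, each of which takes a lot of time to study and digest, so this post is only a first exploration; there is still much more to learn.
References: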