Installing and Deploying a Kubernetes Cluster

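Contents:

  1. Preparation
  2. Deploying the Master (management) node
  3. Deploying the Minion (worker) node
  4. Deploying a Hello World application
  5. Installing the Dashboard add-on
  6. Installing the Heapster add-on
  7. Afterword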

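Related article: Kubernetes Concepts Summary

The article above already covers the Kubernetes concepts in detail, so I will not repeat them here.

This article mainly records the process of installing and deploying Kubernetes. As we all know, Kubernetes is made up of many components (and many of them are hosted on Google sources, you know what that means). I originally wanted to install everything by hand, but after a lot of searching on Google and a long time experimenting, I found that the work involved was too large and too complex, so I had to give up on that for now.

Later, though, I happened to find a very good deployment tutorial, which I will try another time.

As a fallback, since a fully manual install was not an option, I used the Kubeadm installer instead. At the time of writing Kubeadm is not yet suitable for production, so this deployment is for a test environment only.

Enough talk, let's get started (the whole thing took a little over a month 😂).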

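1. Preparation

First, the environment I deployed in:

  • Host: macOS High Sierra
  • Virtual machines: Ubuntu 16.04
  • VM tooling: Vagrant

Because disk space on my machine is limited, I can only run two virtual machines for now: one Master (management) node and one Minion (worker) node, as follows: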

manager1 10.9.10.154 
worker1 10.9.10.152

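Both virtual machines were created with Vagrant (reference article: Managing Virtual Machines with Vagrant on Mac OS). To use bridged networking (so that each VM is reachable at its own IP address), remember to add the following to the Vagrantfile: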

config.vm.network "public_network", bridge: "en0: Wi-Fi (AirPort)"
config.vm.boot_timeout = 2000

Once both virtual machines are created, run the following commands on each to set up the root account:

$ vagrant ssh
$ sudo passwd root
$ su root

Then edit /etc/hostname and /etc/hosts on each machine, setting the hostname and the hosts entry as follows:

manager1
10.9.10.154 manager1

worker1
10.9.10.152 worker1
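
As a sketch of the step above, the hosts entry can also be added from a script so that running it twice does not duplicate the line. The function name add_host_entry is hypothetical (it is not from the original article), and the file path is a parameter so the function can be tried on a scratch file first.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME
# Hypothetical helper: appends "IP NAME" to FILE unless that exact line already exists.
add_host_entry() {
    file=$1; ip=$2; name=$3
    # -x matches whole lines, -F treats the pattern as a fixed string, -q prints nothing.
    if ! grep -qxF "$ip $name" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Example (as root on manager1):
#   add_host_entry /etc/hosts 10.9.10.154 manager1
```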

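Because the installation needs to reach Google sources, we also need to configure a proxy (this points at the proxy server on the Mac host; I use Shadowsocks, where the "HTTP proxy" has to be configured under "Preferences"):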

export http_proxy=http://10.9.10.215:1087;export https_proxy=http://10.9.10.215:1087;

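Once that is set, check that it has taken effect (the reported location should be the proxy's, here the United States):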

$ curl ip.cn
当前 IP:185.225.14.5 来自:美国
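
The export line above only lasts for the current shell, so a small function can make it repeatable. This is a minimal sketch: use_proxy is a hypothetical name, and the no_proxy value is an assumption that is not in the original article (it keeps local and node-to-node traffic away from the proxy).

```shell
#!/bin/sh
# use_proxy HOST PORT
# Hypothetical helper: exports the proxy variables for the current shell session.
use_proxy() {
    export http_proxy="http://$1:$2"
    export https_proxy="http://$1:$2"
    # Assumption: traffic between the two nodes should bypass the proxy.
    export no_proxy="localhost,127.0.0.1,10.9.10.154,10.9.10.152"
}

# Example:
#   use_proxy 10.9.10.215 1087
```

The function has to be defined in (or sourced into) the shell that runs the later commands; running it in a separate script process would not change the caller's environment.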

Next, set up Docker on each machine. Reference: Get Docker CE for Ubuntu

$ apt-get update && apt-get install docker.io
$ docker version
Client:
 Version:      1.13.1
 API version:  1.26
 Go version:   go1.6.2
 Git commit:   092cba3
 Built:        Thu Nov  2 20:40:23 2017
 OS/Arch:      linux/amd64

Server:
 Version:      1.13.1
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.6.2
 Git commit:   092cba3
 Built:        Thu Nov  2 20:40:23 2017
 OS/Arch:      linux/amd64
 Experimental: false

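Then configure a proxy for Docker (needed to pull Google-hosted images). Reference article: Configuring an https_proxy for Docker on CentOS 7

Create the directory and the file http-proxy.conf, then add the proxy configuration shown below to the file: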

$ mkdir /etc/systemd/system/docker.service.d
$ touch /etc/systemd/system/docker.service.d/http-proxy.conf
$ vi /etc/systemd/system/docker.service.d/http-proxy.conf

[Service]
Environment="HTTP_PROXY=http://10.9.10.215:1087" "HTTPS_PROXY=https://10.9.10.215:1087"

Then restart the Docker service and check that the proxy has taken effect:

$ systemctl daemon-reload
$ systemctl restart docker
$ systemctl show docker --property Environment
Environment=HTTP_PROXY=http://10.9.10.215:1087 HTTPS_PROXY=https://10.9.10.215:1087
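
The drop-in file can also be generated from a script instead of edited in vi. The sketch below assumes a hypothetical helper named write_docker_proxy_conf. It also passes the same http:// URL for both variables, on the assumption that the Shadowsocks endpoint is a plain HTTP proxy, as in the export line earlier.

```shell
#!/bin/sh
# write_docker_proxy_conf DIR PROXY_URL
# Hypothetical helper: writes a systemd drop-in that sets Docker's proxy variables.
write_docker_proxy_conf() {
    dir=$1; proxy=$2
    mkdir -p "$dir"
    # The heredoc expands $proxy; the double quotes are written to the file literally.
    cat > "$dir/http-proxy.conf" <<EOF
[Service]
Environment="HTTP_PROXY=$proxy" "HTTPS_PROXY=$proxy"
EOF
}

# Example (as root), followed by the daemon-reload and restart shown above:
#   write_docker_proxy_conf /etc/systemd/system/docker.service.d http://10.9.10.215:1087
```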

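At this point, the preparation for both the management node and the worker node is done.

The Master (management) node runs these components: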

docker
etcd
kube-apiserver
kube-controller-manager
kubelet
kube-scheduler

The Minion (worker) node runs these components:

docker
kubelet
kube-proxy

2. Deploying the Master (management) node

Commands to install Kubeadm and the Kubernetes packages:

$ apt-get update && apt-get install -y apt-transport-https
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ apt-get update
$ apt-get install -y kubelet kubeadm kubectl

Initialize the cluster with Kubeadm:

$ kubeadm init --pod-network-cidr=10.244.0.0/16
unable to get URL "https://dl.k8s.io/release/stable-1.9.txt": Get https://storage.googleapis.com/kubernetes-release/release/stable-1.9.txt: dial tcp 216.58.200.48:443: i/o timeout

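That produced the error above: without an explicit version, kubeadm tries to look up the current stable release online, and that request timed out. The fix is to specify the version.

Re-run the command (pinning the version to 1.9.0):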

$ kubeadm init --kubernetes-version=1.9.0 --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=10.9.10.154 --node-name=manager1

[init] Using Kubernetes version: v1.9.0
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
    [WARNING FileExisting-crictl]: crictl not found in system path
[preflight] Starting the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [manager1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.9.10.154]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
    [apiclient] All control plane components are healthy after 29.006486 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node manager1 as master by adding a label and a taint
[markmaster] Master manager1 tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: 561fd3.febddbbda0c219bc
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: kube-dns
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join --token 1b43ed.7391162ed9072892 10.9.10.154:6443 --discovery-token-ca-cert-hash sha256:6ab593dfe4d2822912d11cf009830dd033591a6285e4b5ce4505c39d4b40f12b

Because we passed --pod-network-cidr=10.244.0.0/16 when initializing the cluster, which means we chose Flannel as the network add-on, we still need to install Flannel:

$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml
The connection to the server localhost:8080 was refused - did you specify the right host or port?

That produced the error above. Fix (reference: 1.6.1 The connection to the server localhost:8080 was refused):

$ sudo cp /etc/kubernetes/admin.conf $HOME/
$ sudo chown $(id -u):$(id -g) $HOME/admin.conf
$ export KUBECONFIG=$HOME/admin.conf

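In fact, this fix is already hinted at in the output printed when the cluster was initialized: kubectl has to be pointed at the admin configuration file. The init output suggests copying admin.conf to $HOME/.kube/config; setting the KUBECONFIG environment variable to the file's path, as done here, achieves the same thing for later use.

Re-run the Flannel install command: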

$ kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/v0.9.1/Documentation/kube-flannel.yml

clusterrole "flannel" created
clusterrolebinding "flannel" created
serviceaccount "flannel" created
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created

Then list the nodes:

$ kubectl get nodes
NAME            STATUS     ROLES     AGE       VERSION
manager1   NotReady   master    47m       v1.9.2

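The node status is NotReady. I ran the following command, which removes the master taint so that Pods can be scheduled on the master node (the node may also simply need a moment for the Flannel Pod to start before it reports Ready):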

$ kubectl taint nodes --all node-role.kubernetes.io/master-
node "manager1" untainted

List the nodes again (the status is now Ready):

$ kubectl get nodes
NAME            STATUS    ROLES     AGE       VERSION
manager1   Ready     master    48m       v1.9.2

At this point, Kubernetes has been successfully deployed on the Master node with Kubeadm.

View the cluster information:

$ kubectl cluster-info
Kubernetes master is running at https://10.9.10.154:6443
KubeDNS is running at https://10.9.10.154:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

List all Pods:

$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                    READY     STATUS    RESTARTS   AGE
kube-system   etcd-manager1                           1/1       Running   0          23h
kube-system   kube-apiserver-manager1                 1/1       Running   0          23h
kube-system   kube-controller-manager-manager1        1/1       Running   0          23h
kube-system   kube-dns-6f4fd4bdf-8sbrt                3/3       Running   0          23h
kube-system   kube-flannel-ds-j625p                   1/1       Running   0          23h
kube-system   kube-proxy-jn9hl                        1/1       Running   0          23h
kube-system   kube-scheduler-manager1                 1/1       Running   0          23h
kube-system   kubernetes-dashboard-845747bdd4-6fsc4   1/1       Running   0          23h

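If something goes wrong during the deployment, run kubeadm reset to revert the node, then install and deploy Kubernetes again.

3. Deploying the Minion (worker) node

Commands to install Kubeadm and the Kubernetes packages: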

$ apt-get update && apt-get install -y apt-transport-https
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ apt-get update
$ apt-get install -y kubelet kubeadm kubectl

After the installation succeeds, join the worker node to the cluster with the following command:

$ kubeadm join --token 1b43ed.7391162ed9072892 10.9.10.154:6443 --discovery-token-ca-cert-hash sha256:6ab593dfe4d2822912d11cf009830dd033591a6285e4b5ce4505c39d4b40f12b

[preflight] Running pre-flight checks.
    [WARNING FileExisting-crictl]: crictl not found in system path
[discovery] Trying to connect to API Server "10.9.10.154:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.9.10.154:6443"
[discovery] Requesting info from "https://10.9.10.154:6443" again to validate TLS against the pinned public key
[discovery] Failed to request cluster info, will try again: [Get https://10.9.10.154:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: x509: certificate has expired or is not yet valid]
[discovery] Failed to request cluster info, will try again: [Get https://10.9.10.154:6443/api/v1/namespaces/kube-public/configmaps/cluster-info: x509: certificate has expired or is not yet valid]

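That produced the error above. The cause is that the clocks on the worker node and the management node are out of sync (run date on both to check). Fix (reference: kubeadm join Endpoint check failed):

Concretely, install the ntp service so that the two servers keep their time synchronized.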
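
Before retrying the join, it can help to confirm that the clocks really agree. The following is a minimal sketch of such a check; clock_skew_ok is a hypothetical helper, and the 5-second tolerance in the example is an arbitrary assumption rather than a documented Kubernetes limit.

```shell
#!/bin/sh
# clock_skew_ok LOCAL_EPOCH REMOTE_EPOCH MAX_SECONDS
# Hypothetical helper: succeeds (exit status 0) when the two Unix timestamps
# differ by at most MAX_SECONDS.
clock_skew_ok() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    [ "$diff" -le "$3" ]
}

# Example on worker1 (assumes SSH access to the manager as root):
#   clock_skew_ok "$(date +%s)" "$(ssh root@10.9.10.154 date +%s)" 5 || echo "clocks differ"
```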

Re-run the command:

$ kubeadm join --token 1b43ed.7391162ed9072892 10.9.10.154:6443 --discovery-token-ca-cert-hash sha256:6ab593dfe4d2822912d11cf009830dd033591a6285e4b5ce4505c39d4b40f12b

[preflight] Running pre-flight checks.
    [WARNING FileExisting-crictl]: crictl not found in system path
[discovery] Trying to connect to API Server "10.9.10.154:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://10.9.10.154:6443"
[discovery] Requesting info from "https://10.9.10.154:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "10.9.10.154:6443"
[discovery] Successfully established connection with API Server "10.9.10.154:6443"

This node has joined the cluster:
* Certificate signing request was sent to master and a response
  was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

As shown, the worker node joined the cluster successfully. Next, list the nodes from the worker node:

$ kubectl get nodes
The connection to the server localhost:8080 was refused - did you specify the right host or port?

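This is the same error we hit when configuring the cluster on the management node. The cause: by default kubectl connects to an API Server at localhost:8080, so we need to set the environment variable by hand.

Look at the configuration in /etc/kubernetes/kubelet.conf: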

$ cat /etc/kubernetes/kubelet.conf
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: .......
    server: https://10.9.10.154:6443
  name: default-cluster
contexts:
- context:
    cluster: default-cluster
    namespace: default
    user: default-auth
  name: default-context
current-context: default-context
kind: Config
preferences: {}
users:
- name: default-auth
  user:
    client-certificate: /var/lib/kubelet/pki/kubelet-client.crt
    client-key: /var/lib/kubelet/pki/kubelet-client.key

It contains the address of the management node's API Server (https://10.9.10.154:6443). Now point the environment variable at this configuration file:

$ sudo cp /etc/kubernetes/kubelet.conf $HOME/
$ sudo chown $(id -u):$(id -g) $HOME/kubelet.conf
$ export KUBECONFIG=$HOME/kubelet.conf

Then list the nodes again from the worker node:

$ kubectl get nodes
NAME       STATUS    ROLES     AGE       VERSION
manager1   Ready     master    18h       v1.9.2
worker1    Ready     <none>    57m       v1.9.3

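Now kubectl can be used on the worker node to view cluster information (view only, not modify), although the information still comes from querying the API Server.

That completes the deployment of the worker node.

4. Deploying a Hello World application

With the Kubernetes nodes configured, let's deploy a sample application. Reference: Use a Service to Access an Application in a Cluster

Create a Deployment named hello-world: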

$ kubectl run hello-world --replicas=1 --labels="run=load-balancer-example" --image=gcr.io/google-samples/node-hello:1.0  --port=8080
deployment "hello-world" created

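--replicas=1 means one Pod is created; --port=8080 is the container's port, not the port used for external access.

Once it is created, the following commands show the deployment's details and progress: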

$ kubectl get deployments hello-world
$ kubectl describe deployments hello-world

$ kubectl get replicasets
$ kubectl describe replicasets

After the Deployment succeeds, we also need to create the corresponding Service (of type NodePort):

$ kubectl expose deployment hello-world --type=NodePort --name=example-service
service "example-service" exposed

Then look at the Service details (31860 is the externally exposed port):

$ kubectl describe services example-service
Name:                     example-service
Namespace:                default
Labels:                   run=load-balancer-example
Annotations:              <none>
Selector:                 run=load-balancer-example
Type:                     NodePort
IP:                       10.96.52.166
Port:                     <unset>  8080/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  31860/TCP
Endpoints:                10.244.3.8:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

We can also look at the Pod:

$ kubectl get pods --selector="run=load-balancer-example" --output=wide
NAME                          READY     STATUS    RESTARTS   AGE       IP           NODE
hello-world-58f9949f8-c28gx   1/1       Running   0          15m       10.244.3.8   worker1

Now the application can be opened directly in a browser (http://10.9.10.152:31860/), where 10.9.10.152 is the IP address of worker1, or tested from the command line:

$ curl http://10.9.10.152:31860/
Hello Kubernetes!
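
Because the NodePort is assigned by the cluster, a script cannot hard-code 31860. One way to pick it up, sketched below, is to parse the describe output shown earlier; node_port_from_describe is a hypothetical helper name, and it assumes the output layout shown above.

```shell
#!/bin/sh
# node_port_from_describe
# Hypothetical helper: reads `kubectl describe service` output on stdin and
# prints the numeric NodePort (the last field of the "NodePort:" row, minus "/TCP").
node_port_from_describe() {
    awk '$1 == "NodePort:" { split($NF, parts, "/"); print parts[1] }'
}

# Example:
#   port=$(kubectl describe services example-service | node_port_from_describe)
#   curl "http://10.9.10.152:$port/"
```

An alternative that avoids text parsing is kubectl get service example-service -o jsonpath='{.spec.ports[0].nodePort}'.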

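As an aside, here are the three ways Kubernetes exposes services:

  • LoadBalancer Service: a LoadBalancer Service is a component through which Kubernetes integrates tightly with a cloud platform. Exposing a service this way actually asks the underlying cloud platform to create a load balancer that exposes the service externally. Platform support is by now fairly mature, for example GCE and DigitalOcean abroad, Alibaba Cloud in China, and OpenStack for private clouds. Because of this tight integration, a LoadBalancer Service can only be used on such cloud platforms.
  • NodePort Service: as the name suggests, a NodePort Service works by opening a port on every node in the cluster and mapping that port to a specific service. Although each node has plenty of ports (0~65535), for reasons of security and manageability (things get messy as services multiply, and ports can conflict) it may not see much real-world use.
  • Ingress: Ingress only appeared after version 1.2. With Ingress, users can expose services externally through an open-source reverse-proxy load balancer such as nginx.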
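
To make the NodePort option concrete, the Service created earlier with kubectl expose can also be written declaratively. This is a sketch under assumptions: write_nodeport_service is a hypothetical helper, and the manifest simply mirrors the name, selector, and ports from the describe output above, leaving the nodePort itself to be assigned by the cluster.

```shell
#!/bin/sh
# write_nodeport_service FILE
# Hypothetical helper: writes a NodePort Service manifest equivalent to the
# `kubectl expose ... --type=NodePort --name=example-service` command used earlier.
write_nodeport_service() {
    cat > "$1" <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  type: NodePort
  selector:
    run: load-balancer-example
  ports:
  - port: 8080
    targetPort: 8080
EOF
}

# Example:
#   write_nodeport_service example-service.yaml
#   kubectl apply -f example-service.yaml
```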

5. Installing the Dashboard add-on

Kubernetes Dashboard: https://github.com/kubernetes/dashboard

Install the Dashboard with kubectl:

$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
secret "kubernetes-dashboard-certs" created
serviceaccount "kubernetes-dashboard" created
role "kubernetes-dashboard-minimal" created
rolebinding "kubernetes-dashboard-minimal" created
deployment "kubernetes-dashboard" created
service "kubernetes-dashboard" created

$ kubectl proxy
Starting to serve on 127.0.0.1:8001

Command to delete the installed Dashboard Pod:

$ kubectl -n kube-system delete $(kubectl -n kube-system get pod -o name | grep dashboard)
pod "kubernetes-dashboard-3313488171-7706x" deleted

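The Dashboard installed above can only be accessed from the local machine. Access from outside needs some further configuration. Reference: Accessing Dashboard 1.7.X and above

First delete the Dashboard, run the kubectl apply -f ... command again to recreate it, and then run the following command to edit the kubernetes-dashboard Service: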

$ kubectl -n kube-system edit service kubernetes-dashboard

The following configuration appears:

apiVersion: v1
kind: Service
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"k8s-app":"kubernetes-dashboard"},"name":"kubernetes-dashboard","namespace":"kube-system"},"spec":{"ports":[{"port":443,"targetPort":8443}],"selector":{"k8s-app":"kubernetes-dashboard"}}}
  creationTimestamp: 2018-02-08T05:30:23Z
  labels:
    k8s-app: kubernetes-dashboard
  name: kubernetes-dashboard
  namespace: kube-system
  resourceVersion: "6652"
  selfLink: /api/v1/namespaces/kube-system/services/kubernetes-dashboard
  uid: 29323549-0c91-11e8-98f6-0293061ca64f
spec:
  clusterIP: 10.97.21.104
  externalTrafficPolicy: Cluster
  ports:
  - nodePort: 30835
    port: 443
    protocol: TCP
    targetPort: 8443
  selector:
    k8s-app: kubernetes-dashboard
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}

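Change type: ClusterIP to type: NodePort (clusterIP does not need to be changed), then save.

Check the assigned NodePort (NodePort 30835 maps to port 443 of the Dashboard Service, which forwards to port 8443 on the Pod):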

$ kubectl get services kubernetes-dashboard -n kube-system
NAME                   TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
kubernetes-dashboard   NodePort   10.97.21.104   <none>        443:30835/TCP   29m

Then check that the controller is running normally:

$ kubectl get deployment kubernetes-dashboard  -n kube-system
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kubernetes-dashboard   1         1         1            1           30m

$ kubectl get pods  -n kube-system | grep dashboard
kubernetes-dashboard-845747bdd4-r4xsj   1/1       Running   0          30m

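There are three ways to access the Dashboard (reference: Installing the Dashboard add-on):

  • The kubernetes-dashboard Service exposes a NodePort, so the Dashboard can be reached at https://NodeIP:nodePort.
  • Through the API Server (HTTPS on port 6443 or HTTP on port 8080).
  • Through kubectl proxy.

5.1 Access via NodePort

The configuration above is the first method, so we can access it directly: https://10.9.10.154:30835/

5.2 Access via kubectl proxy

The second method requires configuring certificates; there is material on this online, so I will not cover it here. For the third method, run: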

$ kubectl proxy --address='10.9.10.154' --port=8086 --accept-hosts='^*$'
Starting to serve on 10.9.10.154:8086

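The --accept-hosts option must be specified; otherwise the browser shows "Unauthorized" when opening the Dashboard page. Browsing to http://10.9.10.154:8086/ui redirects automatically to: http://10.9.10.154:8086/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=default

Open https://10.9.10.154:30835/#!/login and a login window appears (you can also choose "Skip", but then you have no permission to do anything).

5.3 Logging in with a token

Here we log in with a token. Reference: Access control

Run: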

$ kubectl -n kube-system get secret
NAME                                             TYPE                                  DATA      AGE
attachdetach-controller-token-tlj4p              kubernetes.io/service-account-token   3         2h
bootstrap-signer-token-vdkth                     kubernetes.io/service-account-token   3         2h
bootstrap-token-561fd3                           bootstrap.kubernetes.io/token         7         2h
certificate-controller-token-w9bb7               kubernetes.io/service-account-token   3         2h
clusterrole-aggregation-controller-token-bztd8   kubernetes.io/service-account-token   3         2h
cronjob-controller-token-4f7t4                   kubernetes.io/service-account-token   3         2h
daemon-set-controller-token-q7865                kubernetes.io/service-account-token   3         2h
default-token-khxp8                              kubernetes.io/service-account-token   3         2h
deployment-controller-token-vlsrk                kubernetes.io/service-account-token   3         2h
disruption-controller-token-mfnwh                kubernetes.io/service-account-token   3         2h
endpoint-controller-token-hm69z                  kubernetes.io/service-account-token   3         2h
flannel-token-mrzqq                              kubernetes.io/service-account-token   3         52m
generic-garbage-collector-token-m78z6            kubernetes.io/service-account-token   3         2h
horizontal-pod-autoscaler-token-c6v4w            kubernetes.io/service-account-token   3         2h
job-controller-token-jkf4b                       kubernetes.io/service-account-token   3         2h
kube-dns-token-qvqhk                             kubernetes.io/service-account-token   3         2h
kube-proxy-token-rqjdn                           kubernetes.io/service-account-token   3         2h
kubernetes-dashboard-certs                       Opaque                                0         47m
kubernetes-dashboard-key-holder                  Opaque                                2         47m
kubernetes-dashboard-token-n4sjc                 kubernetes.io/service-account-token   3         47m
namespace-controller-token-qvblb                 kubernetes.io/service-account-token   3         2h
node-controller-token-qpz8t                      kubernetes.io/service-account-token   3         2h
persistent-volume-binder-token-qfrkh             kubernetes.io/service-account-token   3         2h
pod-garbage-collector-token-trzhf                kubernetes.io/service-account-token   3         2h
replicaset-controller-token-gp75m                kubernetes.io/service-account-token   3         2h
replication-controller-token-9tmhr               kubernetes.io/service-account-token   3         2h
resourcequota-controller-token-2djtx             kubernetes.io/service-account-token   3         2h
service-account-controller-token-vxkp8           kubernetes.io/service-account-token   3         2h
service-controller-token-q54bc                   kubernetes.io/service-account-token   3         2h
statefulset-controller-token-ldcz9               kubernetes.io/service-account-token   3         2h
token-cleaner-token-lk9kt                        kubernetes.io/service-account-token   3         2h
ttl-controller-token-9mks8                       kubernetes.io/service-account-token   3         2h

Find and copy the name kubernetes-dashboard-token-n4sjc, then run:

$ kubectl -n kube-system describe secret kubernetes-dashboard-token-n4sjc
Name:         kubernetes-dashboard-token-nstpj
Namespace:    kube-system
Labels:       <none>
Annotations:  kubernetes.io/service-account.name=kubernetes-dashboard
              kubernetes.io/service-account.uid=94dc6cdc-0cb4-11e8-ae1c-0293061ca64f

Type:  kubernetes.io/service-account-token

Data
====
ca.crt:     1025 bytes
namespace:  11 bytes
token:      eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZC10b2tlbi1uc3RwaiIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50LnVpZCI6Ijk0ZGM2Y2RjLTBjYjQtMTFlOC1hZTFjLTAyOTMwNjFjYTY0ZiIsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDprdWJlLXN5c3RlbTprdWJlcm5ldGVzLWRhc2hib2FyZCJ9.gfYyS4l4z-qjz7_38E9bJ2hg8P173qKPJFVoFxX3uMX4gDgiYQJoCjx6ldiIX8oSVXfg572iUH2qMdCEskYYKbYfYOiuyqLL_O7I6pWic0XbW4qB_h-6PJmlqQUPhwVykDxb3qObnUm97tswbCjic461XqrrL3RwKqL8ox0wyLVhlB-QU4HrHzxUNCinffGLUMznFK0lFp4MfrFCK9TfAZEsXAWycsMBm9g-6ltUUVn7Y7aJrb4jxX5QfCsuKsySYvzOibkdzNGeMM8hVjxofrhE36nbgvpu98zVEcqgtz8ipbBi8K__O0_mlbpb5wCbW8VJuRCegMTKAB19Rb3NLg

Then take the token at the bottom and use it to log in.
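
Copying a long token out of terminal output by hand is error-prone, so it may be easier to extract it with a filter. The sketch below assumes the describe layout shown above; token_from_describe is a hypothetical helper name.

```shell
#!/bin/sh
# token_from_describe
# Hypothetical helper: reads `kubectl describe secret` output on stdin and
# prints only the value of the "token:" row.
token_from_describe() {
    awk '$1 == "token:" { print $2 }'
}

# Example:
#   kubectl -n kube-system describe secret kubernetes-dashboard-token-n4sjc | token_from_describe
```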

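5.4 Fixing the access permission problem

After logging in, however, the error below appeared (the token I had used came from the key replicaset-controller-token-kzpmc; it should have been kubernetes-dashboard-token-n4sjc as above):

Because kube-apiserver has RBAC authorization enabled and the dashboard-controller.yaml in the official source tree does not define an authorized ServiceAccount, subsequent calls to the API Server are rejected, and the web UI shows: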

Forbidden (403)

User "system:serviceaccount:kube-system:default" cannot list jobs.batch in the namespace "default". (get jobs.batch)

Fix (reference: Update docs to cover RBAC changes in K8S 1.6):

# Create the clusterrole and clusterrolebinding:
# $ kubectl create -f kube-dashboard-rbac.yml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: kubernetes-dashboard
subjects:
  - kind: ServiceAccount
    name: kubernetes-dashboard
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io

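This problem cost me a long time. Even after applying the configuration above, it still did not work for me. I then redeployed both Kubernetes and the Kubernetes Dashboard, and modified the official dashboard-controller.yaml by appending the configuration above at the end (the complete configuration I now use: https://github.com/yuezhongxin/Kubernetes.Sample/blob/master/kubernetes-dashboard.yaml).

Install command (do not use the earlier command; install directly with this one, which points at the raw file rather than the GitHub HTML page):

$ kubectl apply -f https://raw.githubusercontent.com/yuezhongxin/Kubernetes.Sample/master/kubernetes-dashboard.yaml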
secret "kubernetes-dashboard-certs" created
serviceaccount "kubernetes-dashboard" created
role "kubernetes-dashboard-minimal" created
rolebinding "kubernetes-dashboard-minimal" created
deployment "kubernetes-dashboard" created
service "kubernetes-dashboard" created

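Even that did not work at the time, but the next day, after restarting the virtual machines and Kubernetes, it worked and the Dashboard loaded normally.

I think there were two causes of the failure above: one is the configuration in dashboard-controller.yaml, and the other is that the wrong token was used. The token I used earlier came from the key replicaset-controller-token-kzpmc, when it should have been kubernetes-dashboard-token-n4sjc shown above. kubernetes-dashboard is the name value in the configuration above; the two must correspond, otherwise a permission error occurs.

So, as I see it, the way to resolve the Dashboard permission problem is:

  • Create the Dashboard from the amended dashboard-controller.yaml rather than the official configuration file.
  • Use the correct token (the key is kubernetes-dashboard-token-n4sjc).

6. Installing the Heapster add-on

The Heapster add-on consists of three parts: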

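  • Heapster: shows utilization graphs of CPU, memory, load, and so on for each Node and Pod.
  • InfluxDB: the database that stores Pod-related data; after Heapster collects the data, it can be configured to store it in InfluxDB.
  • Grafana: mainly used to display the data held in InfluxDB, so that changes in the data can be seen at a glance.

Go to the Heapster releases page and download the latest version of Heapster.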

$ wget https://github.com/kubernetes/heapster/archive/v1.5.1.zip
$ unzip v1.5.1.zip
$ cd heapster-1.5.1/deploy/kube-config/influxdb
$ ls
grafana.yaml  heapster.yaml  influxdb.yaml

Deployment command:

$ kubectl create -f  .
deployment "monitoring-grafana" created
service "monitoring-grafana" created
serviceaccount "heapster" created
deployment "heapster" created
service "heapster" created
deployment "monitoring-influxdb" created
service "monitoring-influxdb" created

Delete command:

$ kubectl --namespace kube-system delete deployment heapster && 
kubectl --namespace kube-system delete deployment monitoring-grafana && 
kubectl --namespace kube-system delete deployment monitoring-influxdb && 
kubectl --namespace kube-system delete service heapster && 
kubectl --namespace kube-system delete service monitoring-grafana && 
kubectl --namespace kube-system delete service monitoring-influxdb && 
kubectl --namespace kube-system delete serviceaccounts heapster

View the logs:

$ kubectl get pods -n kube-system
$ kubectl logs -n kube-system -f pods/heapster-5d4dcf5f49-vq64w
E0520 01:18:55.360650       1 reflector.go:190]
k8s.io/heapster/metrics/util/util.go:51: Failed to list *v1.Node: Get
https://kubernetes.default/api/v1/nodes?resourceVersion=0: dial tcp
10.96.0.1:443: i/o timeout

Fix (look up the ClusterIP of the kubernetes Service, then set it as the source in heapster.yaml):

$ kubectl get services
$ vi heapster.yaml
- --source=kubernetes:https://10.96.0.1
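
The edit above can be scripted as well. This is a minimal sketch, assuming heapster.yaml contains a single argument line that starts with --source=kubernetes: as shown; set_heapster_source is a hypothetical helper name.

```shell
#!/bin/sh
# set_heapster_source FILE URL
# Hypothetical helper: rewrites the value of the --source=kubernetes: argument in FILE.
set_heapster_source() {
    file=$1; url=$2
    # "|" is used as the sed delimiter because URL contains "/".
    sed "s|--source=kubernetes:.*|--source=kubernetes:$url|" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
}

# Example:
#   set_heapster_source heapster.yaml https://10.96.0.1
```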

After deleting and redeploying, the following log message appears:

E0223 08:16:16.766774       1 reflector.go:190] k8s.io/heapster/metrics/util/util.go:30: Failed to list *v1.Node: nodes is forbidden: User "system:serviceaccount:kube-system:heapster" cannot list nodes at the cluster scope

Fix (reference: Failed to list *v1.Namespace: User "system:serviceaccount:kube-system:heapster" cannot list namespaces at the cluster scope):

$ kubectl create -f /home/ubuntu/heapster-1.5.1/deploy/kube-config/rbac/heapster-rbac.yaml
clusterrolebinding "heapster" created

After deleting and redeploying again, the following log message appears:

E0223 08:19:45.066213       1 influxdb.go:208] Failed to create influxdb: failed to ping InfluxDB server at "monitoring-influxdb.kube-system.svc:8086" - Get http://monitoring-influxdb.kube-system.svc:8086/ping: dial tcp: lookup monitoring-influxdb.kube-system.svc on 10.96.0.10:53: read udp 10.244.3.25:46031->10.96.0.10:53: i/o timeout

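Not resolved yet.

It looks like a network problem, because the Service that was created does not seem to take effect.

7. Afterword

As for Kubernetes, I personally think it is the strongest option in the microservice orchestration space, offering a range of powerful features. But it involves a great many things, all of which take considerable time to study and digest, so this article is only a first probe; there is still a lot left to learn.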


Posted on 2018-03-05 23:01 by HackerVirus