centos7.4安装高可用(haproxy+keepalived实现)kubernetes1.6.0集群(开启TLS认证)

目录

 

 

前言

本文档使用haproxy+keepalived的方式部署为集群高可用模式,lvs+keepalived也可以实现,理论上来说可以照搬用于生产。本文档最初是基于kubenetes1.6版本编写的,对于kuberentes1.8及以上版本同样适用,只是个别位置有稍许变动,变动的地方将特别注明版本要求。

本系列文档介绍使用二进制部署 kubernetes 集群的所有步骤,而不是使用 kubeadm 等自动化方式来部署集群,同时开启了集群的TLS安全认证,安装时使用vmvare创建的虚拟机,理论上适用于所有bare metal环境、on-premise环境和公有云环境。

在部署的过程中,将详细列出各组件的启动参数,给出配置文件,详解它们的含义和可能遇到的问题。

部署完成后,你将理解系统各组件的交互原理,进而能快速解决实际问题。

所以本文档主要适合于那些有一定 kubernetes 基础,想通过一步步部署的方式来学习和了解系统配置、运行原理的人。

集群详情

  • OS:CentOS Linux release 7.4.1708 (Core) 3.10.0-693.el7.x86_64
  • Kubernetes 1.6.0+(最低的版本要求是1.6)
  • haproxy(yum安装)
  • keepalived(yum安装)
  • docker-1.13.1 or docker-ce-17.12.1(使用yum安装 or rpm)
  • etcd-3.2.15(使用yum安装)
  • flannel-0.7.1 vxlan 或者 host-gw 网络(使用yum安装)
  • TLS 认证通信 (所有组件,如 etcd、kubernetes master 和 node)
  • RBAC 授权
  • kubelet TLS BootStrapping
  • kubedns、dashboard、heapster(influxdb、grafana)、EFK(elasticsearch、fluentd、kibana) 集群插件
  • 私有docker镜像仓库harbor(请自行部署,harbor提供离线安装包,直接使用docker-compose启动即可)

环境说明

在下面的步骤中,将在8台CentOS系统的虚拟机上部署高可用集群。

角色分配如下:
keepalived1+haproxy1+etcd1: 192.168.223.201
keepalived2+haproxy2+etcd2: 192.168.223.202
keepalived3+haproxy3+etcd3: 192.168.223.203
Master1: 192.168.223.204
Master2: 192.168.223.205
Node1: 192.168.223.206
Node2: 192.168.223.207
docker+hub: 192.168.223.208

vip: 192.168.223.200
集群访问kube-apiserver使用此地址

注意: etcd和keepalived+haproxy复用3台主机,实际生产最好2台单独部署keepalived+haproxy,3台单独部署etcd

安装前准备

  1. 关闭所有节点的SELinux
修改 /etc/selinux/config 文件中设置 SELINUX=disabled
setenforce 0
  1. 关闭所有节点防火墙firewalld
systemctl disable firewalld;  
systemctl stop firewalld;
  1.  192.168.223.208 上安装harbor私有镜像仓库

参考教程:https://github.com/vmware/harbor 需要使用到的所有docker images:https://pan.baidu.com/s/1YH6OCpmz8EiO1OlmmxLtfg 密码:k2mr

提醒

  1. 由于启用了 TLS 双向认证、RBAC 授权等严格的安全机制,建议从头开始部署,而不要从中间开始,否则可能会认证、授权等失败!
  2. 部署过程中需要有很多证书的操作,请大家耐心操作,不明白的操作可以参考本书中的其他章节的解释。
  3. 该部署操作仅是搭建成了一个可用 kubernetes 集群,而很多地方还需要进行优化,heapster 插件、EFK 插件不一定会用于真实的生产环境中,但是通过部署这些插件,可以让大家了解到如何部署应用到集群上

以下正式开始部署


一、创建TLS证书和秘钥

这一步是在安装配置kubernetes的所有步骤中最容易出错也最难于排查问题的一步,而这却刚好是第一步,万事开头难,不要因为这点困难就望而却步。

kubernetes 系统的各组件需要使用 TLS 证书对通信进行加密,本文档使用 CloudFlare 的 PKI 工具集 cfssl 来生成 Certificate Authority (CA) 和其它证书;

生成的 CA 证书和秘钥文件如下:

  • ca-key.pem
  • ca.pem
  • kubernetes-key.pem
  • kubernetes.pem
  • kube-proxy.pem
  • kube-proxy-key.pem
  • admin.pem
  • admin-key.pem

使用证书的组件如下:

  • etcd:使用 ca.pem、kubernetes-key.pem、kubernetes.pem;
  • kube-apiserver:使用 ca.pem、kubernetes-key.pem、kubernetes.pem;
  • kubelet:使用 ca.pem;
  • kube-proxy:使用 ca.pem、kube-proxy-key.pem、kube-proxy.pem;
  • kubectl:使用 ca.pem、admin-key.pem、admin.pem;
  • kube-controller-manager:使用 ca-key.pem、ca.pem

注意: 以下操作都在 192.168.223.201 主机上执行,然后分发到集群所有主机,证书只需要创建一次即可,以后在向集群中添加新节点时只要将 /etc/kubernetes/ 目录下的证书拷贝到新节点上即可。

安装CFSSL

直接使用二进制源码包安装

wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
chmod +x cfssl_linux-amd64
mv cfssl_linux-amd64 /usr/bin/cfssl

wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
chmod +x cfssljson_linux-amd64
mv cfssljson_linux-amd64 /usr/bin/cfssljson

wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
chmod +x cfssl-certinfo_linux-amd64
mv cfssl-certinfo_linux-amd64 /usr/bin/cfssl-certinfo

创建 CA (Certificate Authority)

创建 CA 配置文件

mkdir /root/ssl
cd /root/ssl
cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
            "signing",
            "key encipherment",
            "server auth",
            "client auth"
        ],
        "expiry": "87600h"
      }
    }
  }
}
EOF
  • ca-config.json:可以定义多个 profiles,分别指定不同的过期时间、使用场景等参数;后续在签名证书时使用某个 - profile;
  • signing:表示该证书可用于签名其它证书;生成的 ca.pem 证书中 CA=TRUE;
  • server auth:表示client可以用该 CA 对server提供的证书进行验证;
  • client auth:表示server可以用该CA对client提供的证书进行验证;

创建 CA 证书签名请求

创建 ca-csr.json 文件,内容如下:

cat > ca-csr.json << EOF
{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF
  • "CN":Common Name,kube-apiserver 从证书中提取该字段作为请求的用户名 (User Name);浏览器使用该字段验证网站是否合法;
  • "O":Organization,kube-apiserver 从证书中提取该字段作为请求用户所属的组 (Group);

生成 CA 证书和私钥

cfssl gencert -initca ca-csr.json | cfssljson -bare ca

ls ca*
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem

创建 kubernetes 证书

创建 kubernetes 证书签名请求文件 kubernetes-csr.json:

cat > kubernetes-csr.json << EOF
{
    "CN": "kubernetes",
    "hosts": [
      "127.0.0.1",
      "192.168.223.200",
      "192.168.223.201",
      "192.168.223.202",
      "192.168.223.203",
      "192.168.223.204",
      "192.168.223.205",
      "192.168.223.206",
      "192.168.223.207",
      "192.168.223.208",
      "10.254.0.1",
      "kubernetes",
      "kubernetes.default",
      "kubernetes.default.svc",
      "kubernetes.default.svc.cluster",
      "kubernetes.default.svc.cluster.local"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "CN",
            "ST": "BeiJing",
            "L": "BeiJing",
            "O": "k8s",
            "OU": "System"
        }
    ]
}
EOF
  • 如果 hosts 字段不为空则需要指定授权使用该证书的 IP 或域名列表,由于该证书后续被 etcd 集群和 kubernetes master 集群使用,所以上面分别指定了 etcd 集群、kubernetes master 集群的主机 IP 和 kubernetes 服务的服务 IP(一般是 kube-apiserver 指定的 service-cluster-ip-range 网段的第一个IP,如 10.254.0.1)。
  • 以上节点的IP也可以更换为主机名。

生成 kubernetes 证书和私钥

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes

ls kubernetes*
kubernetes.csr  kubernetes-csr.json  kubernetes-key.pem  kubernetes.pem

创建 admin 证书

创建 admin 证书签名请求文件 admin-csr.json:

cat > admin-csr.json << EOF
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}
EOF
  • 后续 kube-apiserver 使用 RBAC 对客户端(如 kubelet、kube-proxy、Pod)请求进行授权;
  • kube-apiserver 预定义了一些 RBAC 使用的 RoleBindings,如 cluster-admin 将 Group system:masters 与 Role cluster-admin 绑定,该 Role 授予了调用kube-apiserver 的所有 API的权限;
  • O 指定该证书的 Group 为 system:masters,kubelet 使用该证书访问 kube-apiserver 时 ,由于证书被 CA 签名,所以认证通过,同时由于证书用户组为经过预授权的 system:masters,所以被授予访问所有 API 的权限;

注意: 这个admin 证书,是将来生成管理员用的kube config 配置文件用的,现在我们一般建议使用RBAC 来对kubernetes 进行角色权限控制, kubernetes 将证书中的CN 字段 作为User, O 字段作为 Group。

在搭建完 kubernetes 集群后,我们可以通过命令: kubectl get clusterrolebinding cluster-admin -o yaml ,查看到 clusterrolebinding cluster-admin 的 subjects 的 kind 是 Group,name 是 system:masters。 roleRef 对象是 ClusterRole cluster-admin。 意思是凡是 system:masters Group 的 user 或者 serviceAccount 都拥有 cluster-admin 的角色。 因此我们在使用 kubectl 命令时候,才拥有整个集群的管理权限。可以使用 kubectl get clusterrolebinding cluster-admin -o yaml 来查看。

kubectl get clusterrolebinding cluster-admin -o yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2017-04-11T11:20:42Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: cluster-admin
  resourceVersion: "52"
  selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/cluster-admin
  uid: e61b97b2-1ea8-11e7-8cd7-f4e9d49f8ed0
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:masters

生成 admin 证书和私钥

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes admin-csr.json|cfssljson -bare admin

ls admin*
admin.csr  admin-csr.json  admin-key.pem  admin.pem

创建 kube-proxy 证书

创建 kube-proxy 证书签名请求文件 kube-proxy-csr.json:

cat > kube-proxy-csr.json << EOF
{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF
  • CN 指定该证书的 User 为 system:kube-proxy;
  • kube-apiserver 预定义的 RoleBinding cluster-admin 将User system:kube-proxy 与 Role system:node-proxier 绑定,该 Role 授予了调用 kube-apiserver Proxy 相关 API 的权限;

生成 kube-proxy 客户端证书和私钥

cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes  kube-proxy-csr.json | cfssljson -bare kube-proxy

ls kube-proxy*
kube-proxy.csr  kube-proxy-csr.json  kube-proxy-key.pem  kube-proxy.pem

校验证书

使用 opsnssl 命令

openssl x509  -noout -text -in  kubernetes.pem
...
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=Kubernetes
        Validity
            Not Before: Apr  5 05:36:00 2017 GMT
            Not After : Apr  5 05:36:00 2018 GMT
        Subject: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=kubernetes
...
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier:
                DD:52:04:43:10:13:A9:29:24:17:3A:0E:D7:14:DB:36:F8:6C:E0:E0
            X509v3 Authority Key Identifier:
                keyid:44:04:3B:60:BD:69:78:14:68:AF:A0:41:13:F6:17:07:13:63:58:CD

            X509v3 Subject Alternative Name:
                DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster, DNS:kubernetes.default.svc.cluster.local, IP Address:127.0.0.1, IP Address:192.168.223.200, IP Address:192.168.223.201, IP Address:192.168.223.202, IP Address:192.168.223.203, IP Address:192.168.223.204, IP Address:192.168.223.205, IP Address:192.168.223.206, IP Address:192.168.223.207, IP Address:192.168.223.208, IP Address:10.254.0.1
...
  • 确认 Issuer 字段的内容和 ca-csr.json 一致;
  • 确认 Subject 字段的内容和 kubernetes-csr.json 一致;
  • 确认 X509v3 Subject Alternative Name 字段的内容和 kubernetes-csr.json 一致;
  • 确认 X509v3 Key Usage、Extended Key Usage 字段的内容和 ca-config.json 中 kubernetes profile 一致;

使用 cfssl-certinfo 命令

cfssl-certinfo -cert kubernetes.pem
...
{
  "subject": {
    "common_name": "kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "BeiJing",
    "province": "BeiJing",
    "names": [
      "CN",
      "BeiJing",
      "BeiJing",
      "k8s",
      "System",
      "kubernetes"
    ]
  },
  "issuer": {
    "common_name": "Kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "BeiJing",
    "province": "BeiJing",
    "names": [
      "CN",
      "BeiJing",
      "BeiJing",
      "k8s",
      "System",
      "Kubernetes"
    ]
  },
  "serial_number": "174360492872423263473151971632292895707129022309",
  "sans": [
	  "127.0.0.1",
      "192.168.223.200",
      "192.168.223.201",
      "192.168.223.202",
      "192.168.223.203",
      "192.168.223.204",
      "192.168.223.205",
      "192.168.223.206",
      "192.168.223.207",
      "192.168.223.208",
      "10.254.0.1",
      "kubernetes",
      "kubernetes.default",
      "kubernetes.default.svc",
      "kubernetes.default.svc.cluster",
      "kubernetes.default.svc.cluster.local"
  ],
  "not_before": "2017-04-05T05:36:00Z",
  "not_after": "2018-04-05T05:36:00Z",
  "sigalg": "SHA256WithRSA",
...

分发证书

将生成的证书和秘钥文件(后缀名为.pem)拷贝到所有机器的 /etc/kubernetes/ssl 目录下备用;

mkdir -p /etc/kubernetes/ssl
cp *.pem /etc/kubernetes/ssl
ssh 192.168.223.202 "mkdir -p /etc/kubernetes/ssl"
ssh 192.168.223.203 "mkdir -p /etc/kubernetes/ssl"
ssh 192.168.223.204 "mkdir -p /etc/kubernetes/ssl"
ssh 192.168.223.205 "mkdir -p /etc/kubernetes/ssl"
ssh 192.168.223.206 "mkdir -p /etc/kubernetes/ssl"
ssh 192.168.223.207 "mkdir -p /etc/kubernetes/ssl"
scp *.pem 192.168.223.202:/etc/kubernetes/ssl
scp *.pem 192.168.223.203:/etc/kubernetes/ssl
scp *.pem 192.168.223.204:/etc/kubernetes/ssl
scp *.pem 192.168.223.205:/etc/kubernetes/ssl
scp *.pem 192.168.223.206:/etc/kubernetes/ssl
scp *.pem 192.168.223.207:/etc/kubernetes/ssl

二、安装kubectl命令行工具

一般只需在两台master主机安装即可

下载 kubectl

注意请下载对应的Kubernetes版本的安装包。

wget https://dl.k8s.io/v1.6.0/kubernetes-client-linux-amd64.tar.gz
tar -xzvf kubernetes-client-linux-amd64.tar.gz
cp kubernetes/client/bin/kube* /usr/bin/
chmod a+x /usr/bin/kube*

创建 kubectl kubeconfig 文件

export KUBE_APISERVER="https://192.168.223.200:6443"

# 设置集群参数
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER}
  
# 设置客户端认证参数
kubectl config set-credentials admin \
  --client-certificate=/etc/kubernetes/ssl/admin.pem \
  --embed-certs=true \
  --client-key=/etc/kubernetes/ssl/admin-key.pem
  
# 设置上下文参数
kubectl config set-context kubernetes \
  --cluster=kubernetes \
  --user=admin
  
# 设置默认上下文
kubectl config use-context kubernetes
  • admin.pem 证书 OU 字段值为 system:masters,kube-apiserver 预定义的 RoleBinding cluster-admin 将 Group system:masters 与 Role cluster-admin 绑定,该 Role 授予了调用kube-apiserver 相关 API 的权限;
  • 生成的 kubeconfig 被保存到 ~/.kube/config 文件;

注意: ~/.kube/config文件拥有对该集群的最高权限,请妥善保管。如果node节点上需要使用kubelet工具,只需将此文件拷贝过去。

三、创建 kubeconfig 文件

kubelet、kube-proxy 等 Node 机器上的进程与 Master 机器的 kube-apiserver 进程通信时需要认证和授权;

kuberetes 1.4 开始支持由 kube-apiserver 为客户端生成 TLS 证书的 TLS Bootstrapping 功能,这样就不需要为每个客户端生成证书了;该功能当前仅支持为 kubelet 生成证书;

以下操作只需要在 master1: 192.168.223.204 节点上执行,生成的 *.kubeconfig 文件可以直接拷贝到其他节点的 /etc/kubernetes 目录下。

创建 TLS Bootstrapping Token

Token可以是任意的包含128 bit的字符串,可以使用安全的随机数发生器生成。

export BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
cat > token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF

注意: 请检查 token.csv 文件,确认其中的 ${BOOTSTRAP_TOKEN} 环境变量已经被真实的值替换。 **BOOTSTRAP_TOKEN ** 将被写入到 kube-apiserver 使用的 token.csv 文件和 kubelet 使用的 bootstrap.kubeconfig 文件,如果后续重新生成了 BOOTSTRAP_TOKEN,则需要:

更新 token.csv 文件,分发到所有机器 (master 和 node)的 /etc/kubernetes/ 目录下,分发到node节点上非必需; 重新生成 bootstrap.kubeconfig 文件,分发到所有 node 机器的 /etc/kubernetes/ 目录下; 重启 kube-apiserver 和 kubelet 进程; 重新 approve kubelet 的 csr 请求;

cp token.csv /etc/kubernetes/
scp token.csv 192.168.223.205:/etc/kubernetes/
scp token.csv 192.168.223.206:/etc/kubernetes/
scp token.csv 192.168.223.207:/etc/kubernetes/

创建 kubelet bootstrapping kubeconfig 文件

cd /etc/kubernetes
export KUBE_APISERVER="https://192.168.223.200:6443"

# 设置集群参数
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=bootstrap.kubeconfig

# 设置客户端认证参数
kubectl config set-credentials kubelet-bootstrap \
  --token=${BOOTSTRAP_TOKEN} \
  --kubeconfig=bootstrap.kubeconfig

# 设置上下文参数
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet-bootstrap \
  --kubeconfig=bootstrap.kubeconfig

# 设置默认上下文
kubectl config use-context default --kubeconfig=bootstrap.kubeconfig
  • --embed-certs 为 true 时表示将 certificate-authority 证书写入到生成的 bootstrap.kubeconfig 文件中;
  • 设置客户端认证参数时没有指定秘钥和证书,后续由 kube-apiserver 自动生成;

创建 kube-proxy kubeconfig 文件

export KUBE_APISERVER="https://192.168.223.200:6443"

# 设置集群参数
kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.kubeconfig
  
# 设置客户端认证参数
kubectl config set-credentials kube-proxy \
  --client-certificate=/etc/kubernetes/ssl/kube-proxy.pem \
  --client-key=/etc/kubernetes/ssl/kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig
  
# 设置上下文参数
kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig
  
# 设置默认上下文
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
  • 设置集群参数和客户端认证参数时 --embed-certs 都为 true,这会将 certificate-authority、client-certificate 和 client-key 指向的证书文件内容写入到生成的 kube-proxy.kubeconfig 文件中;
  • kube-proxy.pem 证书中 CN 为 system:kube-proxy,kube-apiserver 预定义的 RoleBinding cluster-admin 将User system:kube-proxy 与 Role system:node-proxier 绑定,该 Role 授予了调用 kube-apiserver Proxy 相关 API 的权限;

分发 kubeconfig 文件

将两个 kubeconfig 文件分发到所有 节点机器的 /etc/kubernetes/ 目录

scp bootstrap.kubeconfig kube-proxy.kubeconfig 192.168.223.205:/etc/kubernetes/
scp bootstrap.kubeconfig kube-proxy.kubeconfig 192.168.223.206:/etc/kubernetes/
scp bootstrap.kubeconfig kube-proxy.kubeconfig 192.168.223.207:/etc/kubernetes/

四、创建高可用 etcd 集群

kuberntes 系统使用 etcd 存储所有数据,本次部署一个三节点高可用 etcd 集群的步骤,分别为:192.168.223.201、192.168.223.202、192.168.223.203。

TLS 认证文件

需要为 etcd 集群创建加密通信的 TLS 证书,这里复用以前创建的 kubernetes 证书

ls /etc/kubernetes/ssl/*.pem
admin-key.pem  admin.pem  ca-key.pem  ca.pem  kube-proxy-key.pem  kube-proxy.pem  kubernetes-key.pem  kubernetes.pem
  • kubernetes 证书的 hosts 字段列表中包含上面三台机器的 IP,否则后续证书校验会失败;

安装etcd

 https://github.com/coreos/etcd/releases 页面下载最新版本的二进制文件

wget https://github.com/coreos/etcd/releases/download/v3.1.5/etcd-v3.1.5-linux-amd64.tar.gz
tar -xvf etcd-v3.1.5-linux-amd64.tar.gz
mv etcd-v3.1.5-linux-amd64/etcd* /usr/sbin

或者直接使用yum命令安装:

yum install etcd -y
  • 建议使用yum安装

创建 etcd 的 systemd unit 文件

vi /usr/lib/systemd/system/etcd.service,内容如下。注意替换IP地址为你自己的etcd集群的主机IP。

[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/sbin/etcd \
  --name ${ETCD_NAME} \
  --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
  --peer-cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --peer-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
  --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \
  --initial-advertise-peer-urls ${ETCD_INITIAL_ADVERTISE_PEER_URLS} \
  --listen-peer-urls ${ETCD_LISTEN_PEER_URLS} \
  --listen-client-urls ${ETCD_LISTEN_CLIENT_URLS},http://127.0.0.1:2379 \
  --advertise-client-urls ${ETCD_ADVERTISE_CLIENT_URLS} \
  --initial-cluster-token ${ETCD_INITIAL_CLUSTER_TOKEN} \
  --initial-cluster infra1=https://192.168.223.201:2380,infra2=https://192.168.223.202:2380,infra3=https://192.168.223.203:2380 \
  --initial-cluster-state new \
  --data-dir=${ETCD_DATA_DIR}
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
  • 指定 etcd 的工作目录为 /var/lib/etcd,数据目录为 /var/lib/etcd,需在启动服务前创建这个目录 mkdir -p /var/lib/etcd,否则启动服务的时候会报错“Failed at step CHDIR spawning /usr/bin/etcd: No such file or directory”;
  • 为了保证通信安全,需要指定 etcd 的公私钥(cert-file和key-file)、Peers 通信的公私钥和 CA 证书(peer-cert-file、peer-key-file、peer-trusted-ca-file)、客户端的CA证书(trusted-ca-file);
  • 创建 kubernetes.pem 证书时使用的 kubernetes-csr.json 文件的 hosts 字段包含所有 etcd 节点的IP,否则证书校验会出错;
  • --initial-cluster-state 值为 new 时,--name 的参数值必须位于 --initial-cluster 列表中;

环境变量配置文件 vi /etc/etcd/etcd.conf

# [member]
ETCD_NAME=infra1
ETCD_DATA_DIR="/var/lib/etcd"
ETCD_LISTEN_PEER_URLS="https://192.168.223.201:2380"
ETCD_LISTEN_CLIENT_URLS="https://192.168.223.201:2379"

#[cluster]
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://192.168.223.201:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_ADVERTISE_CLIENT_URLS="https://192.168.223.201:2379"

这是192.168.223.201节点的配置,其他两个etcd节点只要将上面的IP地址改成相应节点的IP地址即可。ETCD_NAME换成对应节点的infra1/2/3。

启动 etcd 服务

systemctl daemon-reload
systemctl enable etcd
systemctl start etcd
systemctl status etcd

在所有的 kubernetes master 节点重复上面的步骤,直到所有机器的 etcd 服务都已启动。

验证服务

etcdctl \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
  cluster-health
member 9a2ec640d25672e5 is healthy: got healthy result from https://192.168.223.201:2379
member bc6f27ae3be34308 is healthy: got healthy result from https://192.168.223.202:2379
member e5c92ea26c4edba0 is healthy: got healthy result from https://192.168.223.203:2379
cluster is healthy

结果最后一行为 cluster is healthy 时表示集群服务正常。

五、部署 haproxy+keepalived

本次部署一个三节点高可用 haproxy+keepalived 集群,分别为:192.168.223.201、192.168.223.202、192.168.223.203。VIP 地址 192.168.223.200

安装 haproxy+keepalived

yum install -y haproxy keepalived

注: 3台 haproxy+keepalived 节点都需安装

配置 keepalived

节点1 192.168.223.201 配置文件 vi /etc/keepalived/keepalived.conf

! Configuration File for keepalived  
global_defs {  
    notification_email {   
        test@sina.com   
    }   
    notification_email_from admin@test.com  
    smtp_server 127.0.0.1  
    smtp_connect_timeout 30  
    router_id LVS_MASTER  
}  

vrrp_script check_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 3
}  

vrrp_instance VI_1 {  
    state MASTER  # 如果配置主从,从服务器改为BACKUP即可
    interface ens33
    virtual_router_id 60  
    priority 100  # 从服务器设置小于100的数即可
    advert_int 1  
    authentication {  
        auth_type PASS  
        auth_pass 1111  
    }  
    virtual_ipaddress {  
        192.168.223.200/24
    }

    track_script {   
        check_haproxy
    }
}

节点2 192.168.223.202 配置文件 vi /etc/keepalived/keepalived.conf

! Configuration File for keepalived  
global_defs {  
    notification_email {   
        test@sina.com   
    }   
    notification_email_from admin@test.com  
    smtp_server 127.0.0.1  
    smtp_connect_timeout 30  
    router_id LVS_MASTER  
}  

vrrp_script check_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 3
}  

vrrp_instance VI_1 {  
    state BACKUP  # 如果配置主从,从服务器改为BACKUP即可
    interface ens33
    virtual_router_id 60  
    priority 90  # 从服务器设置小于100的数即可
    advert_int 1  
    authentication {  
        auth_type PASS  
        auth_pass 1111  
    }  
    virtual_ipaddress {  
        192.168.223.200/24
    }

    track_script {   
        check_haproxy
    }
}

节点3 192.168.223.203 配置文件 vi /etc/keepalived/keepalived.conf

! Configuration File for keepalived  
global_defs {  
    notification_email {   
        test@sina.com   
    }   
    notification_email_from admin@test.com  
    smtp_server 127.0.0.1  
    smtp_connect_timeout 30  
    router_id LVS_MASTER  
}  

vrrp_script check_haproxy {
    script "/etc/keepalived/check_haproxy.sh"
    interval 3
}  

vrrp_instance VI_1 {  
    state BACKUP  # 如果配置主从,从服务器改为BACKUP即可
    interface ens33
    virtual_router_id 60  
    priority 80  # 从服务器设置小于100的数即可
    advert_int 1  
    authentication {  
        auth_type PASS  
        auth_pass 1111  
    }  
    virtual_ipaddress {  
        192.168.223.200/24
    }

    track_script {   
        check_haproxy
    }
}

检测脚本 vi /etc/keepalived/check_haproxy.sh

#!/bin/bash

flag=$(systemctl status haproxy &> /dev/null;echo $?)

if [[ $flag != 0 ]];then
        echo "haproxy is down,close the keepalived"
        systemctl stop keepalived
fi

修改keepalived启动文件 vi /usr/lib/systemd/system/keepalived.service 以下部分:

[Unit]
Description=LVS and VRRP High Availability Monitor
After=syslog.target network-online.target haproxy.service
Requires=haproxy.service
  • keepalived配置文件三台主机基本一样,除了state,主节点配置为MASTER,备节点配置BACKUP,优化级参数priority,主节点设置最高,备节点依次递减
  • 自定义的检测脚本作用是检测本机haproxy服务状态,如果不正常就停止本机keepalived,释放VIP
  • 这里没有考虑keepalived脑裂的问题,后期可以在脚本中加入相关检测

配置 haproxy

3台节点配置一模一样 配置文件 vim /etc/haproxy/haproxy.cfg

global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode                    tcp
    log                     global
    option                  tcplog
    option                  dontlognull
    option                  redispatch
    retries                 3
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout check           10s
    maxconn                 3000

listen stats
    mode   http
    bind :10086
    stats   enable
    stats   uri     /admin?stats
    stats   auth    admin:admin
    stats   admin   if TRUE
    
frontend  k8s_http *:8080
    mode      tcp
    maxconn      2000
    default_backend     http_sri
 
backend http_sri
    balance      roundrobin
    server s1 192.168.223.204:8080  check inter 10000 fall 2 rise 2 weight 1
    server s2 192.168.223.205:8080  check inter 10000 fall 2 rise 2 weight 1

frontend  k8s_https *:6443
    mode      tcp
    maxconn      2000
    default_backend     https_sri
    
backend https_sri
    balance      roundrobin
    server s1 192.168.223.204:6443  check inter 10000 fall 2 rise 2 weight 1
    server s2 192.168.223.205:6443  check inter 10000 fall 2 rise 2 weight 1
  • listen stats定义了haproxy自身状态查看地址,在里面可以看到haproy目前的各种状态
  • frontend 定义了前端提供服务的端口等信息
  • backend 定义了后端真实服务器的信息

启动 haproxy+keepalived

3个节点都启动

systemctl daemon-reload
systemctl enable haproxy
systemctl enable keepalived
systemctl start haproxy
systemctl start keepalived

如果没有什么报错,那应该就可以在主节点 192.168.223.201 上面看到ens33网卡已绑定VIP: 192.168.223.200

ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:2b:74:46 brd ff:ff:ff:ff:ff:ff
    inet 192.168.223.201/24 brd 192.168.223.255 scope global ens33
       valid_lft forever preferred_lft forever
    inet 192.168.223.200/24 scope global secondary ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::435e:5e98:6d14:6c40/64 scope link 
       valid_lft forever preferred_lft forever

六、安装 flannel 网络插件

所有的 node 节点都需要安装网络插件才能让所有的Pod加入到同一个局域网中,如果想要在master节点上也能访问 pods的ip,master 节点也安装。

安装 flannel

建议直接使用 yum 安装 flanneld ,除非对版本有特殊需求,默认安装的是0.7.1版本的 flannel 。

yum install -y flannel

service配置文件 vi /usr/lib/systemd/system/flanneld.service

[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service

[Service]
Type=notify
EnvironmentFile=/etc/sysconfig/flanneld
EnvironmentFile=-/etc/sysconfig/docker-network
ExecStart=/usr/bin/flanneld-start \
  -etcd-endpoints=${FLANNEL_ETCD_ENDPOINTS} \
  -etcd-prefix=${FLANNEL_ETCD_PREFIX} \
  $FLANNEL_OPTIONS
ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure

[Install]
WantedBy=multi-user.target
RequiredBy=docker.service

vi /etc/sysconfig/flanneld 配置文件:

# Flanneld configuration options  

# etcd url location.  Point this to the server where etcd runs
FLANNEL_ETCD_ENDPOINTS="https://192.168.223.201:2379,https://192.168.223.202:2379,https://192.168.223.203:2379"

# etcd config key.  This is the configuration key that flannel queries
# For address range assignment
FLANNEL_ETCD_PREFIX="/kube-centos/network"

# Any additional options that you want to pass
FLANNEL_OPTIONS="-etcd-cafile=/etc/kubernetes/ssl/ca.pem -etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem -etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem"

注: 如果是主机是多网卡,则需要在FLANNEL_OPTIONS中增加指定的外网出口的网卡,例如-iface=eth2

在etcd中创建网络配置

执行下面的命令为docker分配IP地址段。

etcdctl --endpoints=https://192.168.223.201:2379,https://192.168.223.202:2379,https://192.168.223.203:2379 \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
  mkdir /kube-centos/network

etcdctl --endpoints=https://192.168.223.201:2379,https://192.168.223.202:2379,https://192.168.223.203:2379 \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
  mk /kube-centos/network/config '{"Network":"172.30.0.0/16","SubnetLen":24,"Backend":{"Type":"vxlan"}}'

如果你要使用host-gw模式,可以直接将vxlan改成host-gw即可。根据原作者测试,使用host-gw模式时网络性能好一些。

启动flannel服务

systemctl daemon-reload
systemctl enable flanneld
systemctl start flanneld
systemctl status flanneld

注: 启动flannel前,请先停止docker,flannel启动好后,再启动docker。

现在查询etcd中的内容可以看到:

etcdctl --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
  ls /kube-centos/network/subnets
/kube-centos/network/subnets/172.30.14.0-24
/kube-centos/network/subnets/172.30.38.0-24
/kube-centos/network/subnets/172.30.46.0-24
/kube-centos/network/subnets/172.30.91.0-24

etcdctl --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
  get /kube-centos/network/config
{ "Network": "172.30.0.0/16", "SubnetLen": 24, "Backend": { "Type": "vxlan" } }

etcdctl --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
get /kube-centos/network/subnets/172.30.14.0-24
{"PublicIP":"192.168.223.204","BackendType":"vxlan","BackendData":{"VtepMAC":"56:27:7d:1c:08:22"}}

etcdctl --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
get /kube-centos/network/subnets/172.30.38.0-24
{"PublicIP":"192.168.223.205","BackendType":"vxlan","BackendData":{"VtepMAC":"12:82:83:59:cf:b8"}}

etcdctl --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
get /kube-centos/network/subnets/172.30.46.0-24
{"PublicIP":"192.168.223.206","BackendType":"vxlan","BackendData":{"VtepMAC":"e6:b2:fd:f6:66:96"}}

etcdctl --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/kubernetes/ssl/kubernetes.pem \
  --key-file=/etc/kubernetes/ssl/kubernetes-key.pem \
get /kube-centos/network/subnets/172.30.91.0-24
{"PublicIP":"192.168.223.207","BackendType":"vxlan","BackendData":{"VtepMAC":"e3:b1:43:f6:34:67"}}

如果可以查看到以上内容证明flannel已经安装完成,并且已经正常分配kubernetes网段

七、部署master节点

kubernetes master 节点包含的组件:

  • kube-apiserver
  • kube-scheduler
  • kube-controller-manager
  • kube-scheduler、kube-controller-manager 和 kube-apiserver 三者的功能紧密相关;
  • 同时只能有一个 kube-scheduler、kube-controller-manager 进程处于工作状态,如果运行多个,则需要通过选举产生一个 leader;
  • kube-apiserver 为无状态服务,使用haproxy+keepalived 实现高可用

TLS 证书文件

以下pem证书文件我们在创建TLS证书和秘钥这一步中已经创建过了,token.csv文件在创建kubeconfig文件的时候创建。我们再检查一下。

cd /etc/kubernetes/ssl
ls 
admin-key.pem  admin.pem  ca-key.pem  ca.pem  kube-proxy-key.pem  kube-proxy.pem  kubernetes-key.pem  kubernetes.pem

下载二进制文件

有两种下载方式,请注意下载对应的Kubernetes版本。

方式一

从 github release 页面 下载发布版 tarball,解压后再执行下载脚本

wget https://github.com/kubernetes/kubernetes/releases/download/v1.6.0/kubernetes.tar.gz
tar -xzvf kubernetes.tar.gz
cd kubernetes
./cluster/get-kube-binaries.sh

方式二

从 CHANGELOG页面 下载 client 或 server tarball 文件 server 的 tarball kubernetes-server-linux-amd64.tar.gz 已经包含了 client(kubectl) 二进制文件,所以不用单独下载kubernetes-client-linux-amd64.tar.gz文件;

# wget https://dl.k8s.io/v1.6.0/kubernetes-client-linux-amd64.tar.gz
wget https://dl.k8s.io/v1.6.0/kubernetes-server-linux-amd64.tar.gz
tar -xzvf kubernetes-server-linux-amd64.tar.gz
cp -r kubernetes/server/bin/{kube-apiserver,kube-controller-manager,kube-scheduler,kubectl,kube-proxy,kubelet} /usr/bin/
chmod +x /usr/bin/kube*

配置和启动 kube-apiserver

创建 kube-apiserver的service配置文件

service配置文件 vi /usr/lib/systemd/system/kube-apiserver.service内容:

[Unit]
Description=Kubernetes API Service
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
After=etcd.service

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/apiserver
ExecStart=/usr/bin/kube-apiserver \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_ETCD_SERVERS \
        $KUBE_API_ADDRESS \
        $KUBE_API_PORT \
        $KUBELET_PORT \
        $KUBE_ALLOW_PRIV \
        $KUBE_SERVICE_ADDRESSES \
        $KUBE_ADMISSION_CONTROL \
        $KUBE_API_ARGS
Restart=on-failure
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

vi /etc/kubernetes/config 文件的内容为:

###
# kubernetes system config
#
# The following values are used to configure various aspects of all
# kubernetes services, including
#
#   kube-apiserver.service
#   kube-controller-manager.service
#   kube-scheduler.service
#   kubelet.service
#   kube-proxy.service
# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"

# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"

# Should this cluster be allowed to run privileged docker containers
KUBE_ALLOW_PRIV="--allow-privileged=true"

# How the controller-manager, scheduler, and proxy find the apiserver
#KUBE_MASTER="--master=http://sz-pg-oam-docker-test-001.tendcloud.com:8080"
KUBE_MASTER="--master=http://192.168.223.200:8080"

注: 该配置文件同时被kube-apiserver、kube-controller-manager、kube-scheduler、kubelet、kube-proxy使用。KUBE_MASTER 填写 VIP 地址

apiserver配置文件 vi /etc/kubernetes/apiserver 内容为:

###
## kubernetes system config
##
## The following values are used to configure the kube-apiserver
##
#
## The address on the local server to listen to.
#KUBE_API_ADDRESS="--insecure-bind-address=sz-pg-oam-docker-test-001.tendcloud.com"
KUBE_API_ADDRESS="--advertise-address=0.0.0.0 --bind-address=0.0.0.0 --insecure-bind-address=0.0.0.0"
#
## The port on the local server to listen on.
#KUBE_API_PORT="--port=8080"
#
## Port minions listen on
#KUBELET_PORT="--kubelet-port=10250"
#
## Comma separated list of nodes in the etcd cluster
KUBE_ETCD_SERVERS="--etcd-servers=https://192.168.223.201:2379,https://192.168.223.202:2379,https://192.168.223.203:2379"
#
## Address range to use for services
KUBE_SERVICE_ADDRESSES="--service-cluster-ip-range=10.254.0.0/16"
#
## default admission control policies
KUBE_ADMISSION_CONTROL="--admission-control=ServiceAccount,NamespaceLifecycle,NamespaceExists,LimitRanger,ResourceQuota"
#
## Add your own!
KUBE_API_ARGS="--authorization-mode=RBAC --runtime-config=rbac.authorization.k8s.io/v1beta1 --kubelet-https=true --experimental-bootstrap-token-auth --token-auth-file=/etc/kubernetes/token.csv --service-node-port-range=30000-32767 --tls-cert-file=/etc/kubernetes/ssl/kubernetes.pem --tls-private-key-file=/etc/kubernetes/ssl/kubernetes-key.pem --client-ca-file=/etc/kubernetes/ssl/ca.pem --service-account-key-file=/etc/kubernetes/ssl/ca-key.pem --etcd-cafile=/etc/kubernetes/ssl/ca.pem --etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem --etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem --enable-swagger-ui=true --apiserver-count=3 --audit-log-maxage=30 --audit-log-maxbackup=3 --audit-log-maxsize=100 --audit-log-path=/var/lib/audit.log --event-ttl=1h"
  • --experimental-bootstrap-token-auth Bootstrap Token Authentication在1.9版本已经变成了正式feature,参数名称改为--enable-bootstrap-token-auth
  • 如果中途修改过--service-cluster-ip-range地址,则必须将default命名空间的kubernetes的service给删除,使用命令:kubectl delete service kubernetes,然后系统会自动用新的ip重建这个service,不然apiserver的log有报错the cluster IP x.x.x.x for service kubernetes/default is not within the service CIDR x.x.x.x/16; please recreate
  • --authorization-mode=RBAC 指定在安全端口使用 RBAC 授权模式,拒绝未通过授权的请求;
  • kube-scheduler、kube-controller-manager 一般和 kube-apiserver 部署在同一台机器上,它们使用非安全端口和 kube-apiserver通信;
  • kubelet、kube-proxy、kubectl 部署在其它 Node 节点上,如果通过安全端口访问 kube-apiserver,则必须先通过 TLS 证书认证,再通过 RBAC 授权;
  • kube-proxy、kubectl 通过在使用的证书里指定相关的 User、Group 来达到通过 RBAC 授权的目的;
  • 如果使用了 kubelet TLS Boostrap 机制,则不能再指定 --kubelet-certificate-authority、--kubelet-client-certificate 和 --kubelet-client-key 选项,否则后续 kube-apiserver 校验 kubelet 证书时出现 ”x509: certificate signed by unknown authority“ 错误;
  • --admission-control 值必须包含 ServiceAccount;
  • runtime-config配置为rbac.authorization.k8s.io/v1beta1,表示运行时的apiVersion;
  • --service-cluster-ip-range 指定 Service Cluster IP 地址段,该地址段不能路由可达;
  • 缺省情况下 kubernetes 对象保存在 etcd /registry 路径下,可以通过 --etcd-prefix 参数进行调整;
  • 如果需要开通http的无认证的接口,则可以增加以下两个参数:--insecure-port=8080 --insecure-bind-address=0.0.0.0。

Kubernetes 1.9不同点

  • 对于Kubernetes1.9集群,需要注意配置KUBE_API_ARGS环境变量中的--authorization-mode=Node,RBAC,增加对Node授权的模式,否则将无法注册node。
  • --experimental-bootstrap-token-auth Bootstrap Token Authentication在kubernetes 1.9版本已经废弃,参数名称改为--enable-bootstrap-token-auth

启动kube-apiserver

systemctl daemon-reload
systemctl enable kube-apiserver
systemctl start kube-apiserver
systemctl status kube-apiserver

配置和启动 kube-controller-manager

创建 kube-controller-manager的serivce配置文件

文件路径 vi /usr/lib/systemd/system/kube-controller-manager.service

[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/controller-manager
ExecStart=/usr/bin/kube-controller-manager \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_MASTER \
        $KUBE_CONTROLLER_MANAGER_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

配置文件 vi /etc/kubernetes/controller-manager

###
# The following values are used to configure the kubernetes controller-manager

# defaults from config and apiserver should be adequate

# Add your own!
KUBE_CONTROLLER_MANAGER_ARGS="--address=127.0.0.1 --service-cluster-ip-range=10.254.0.0/16 --cluster-name=kubernetes --cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem --cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem  --service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem --root-ca-file=/etc/kubernetes/ssl/ca.pem --leader-elect=true"
  • --service-cluster-ip-range 参数指定 Cluster 中 Service 的CIDR范围,该网络在各 Node 间必须路由不可达,必须和 kube-apiserver 中的参数一致;
  • --cluster-signing-* 指定的证书和私钥文件用来签名为 TLS BootStrap 创建的证书和私钥;
  • --root-ca-file 用来对 kube-apiserver 证书进行校验,指定该参数后,才会在Pod 容器的 ServiceAccount 中放置该 CA 证书文件;
  • --address 值必须为 127.0.0.1,kube-apiserver 期望 scheduler 和 controller-manager 在同一台机器;

启动 kube-controller-manager

systemctl daemon-reload
systemctl enable kube-controller-manager
systemctl start kube-controller-manager
systemctl status kube-controller-manager

我们启动每个组件后可以通过执行命令 kubectl get componentstatuses,来查看各个组件的状态;

kubectl get componentstatuses
NAME                 STATUS      MESSAGE              ERROR 
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: getsockopt: connection refused   
controller-manager   Healthy     ok
etcd-2               Healthy     {"health": "true"}
etcd-0               Healthy     {"health": "true"}
etcd-1               Healthy     {"health": "true"}

注: 目前scheduler未启动,报错是正常的

配置和启动 kube-scheduler

创建 kube-scheduler的serivce配置文件

文件路径 vi /usr/lib/systemd/system/kube-scheduler.service

[Unit]
Description=Kubernetes Scheduler Plugin
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/scheduler
ExecStart=/usr/bin/kube-scheduler \
            $KUBE_LOGTOSTDERR \
            $KUBE_LOG_LEVEL \
            $KUBE_MASTER \
            $KUBE_SCHEDULER_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

配置文件 vi /etc/kubernetes/scheduler

###
# kubernetes scheduler config

# default config should be adequate

# Add your own!
KUBE_SCHEDULER_ARGS="--leader-elect=true --address=127.0.0.1"
  • --address 值必须为 127.0.0.1,因为当前 kube-apiserver 期望 scheduler 和 controller-manager 在同一台机器;

启动 kube-scheduler

systemctl daemon-reload
systemctl enable kube-scheduler
systemctl start kube-scheduler
systemctl status kube-scheduler

验证 master 节点功能

kubectl get componentstatuses
NAME                 STATUS    MESSAGE              ERROR
scheduler            Healthy   ok                   
controller-manager   Healthy   ok                   
etcd-0               Healthy   {"health": "true"}   
etcd-1               Healthy   {"health": "true"}   
etcd-2               Healthy   {"health": "true"}

注: 两个master节点安装方式与配置一样

八、部署node节点

Kubernetes node节点包含如下组件:

  • Flanneld:参考我之前写的文章Kubernetes基于Flannel的网络配置,之前没有配置TLS,现在需要在service配置文件中增加TLS配置,安装过程请参考上一节安装flannel网络插件。
  • Docker1.12.5:docker的安装很简单,这里也不说了,但是需要注意docker的配置。
  • kubelet:直接用二进制文件安装
  • kube-proxy:直接用二进制文件安装

注意: 每台 node 上都需要安装 flannel,master 节点上选装。

步骤简介

  1. 确认在上一步中我们安装配置的网络插件flannel已启动且运行正常
  2. 安装配置docker后启动
  3. 安装配置kubelet、kube-proxy后启动
  4. 验证

目录和文件

我们再检查一下三个节点上,经过前几步操作我们已经创建了如下的证书和配置文件。

cd  /etc/kubernetes/ssl
ls
admin-key.pem  admin.pem  ca-key.pem  ca.pem  kube-proxy-key.pem  kube-proxy.pem  kubernetes-key.pem  kubernetes.pem

ls /etc/kubernetes/
apiserver  bootstrap.kubeconfig  config  controller-manager  kubelet  kube-proxy.kubeconfig  proxy  scheduler  ssl  token.csv

安装配置Docker

如果您使用yum的方式安装的flannel则不需要执行mk-docker-opts.sh文件这一步,参考Flannel官方文档中的Docker Integration。

如果你不是使用yum安装的flannel,那么需要下载flannel github release中的tar包,解压后会获得一个 mk-docker-opts.sh 文件,到flannel release页面下载对应版本的安装包,该脚本见mk-docker-opts.sh,因为我们使用yum安装所以不需要执行这一步。这个文件是用来 Generate Docker daemon options based on flannel env file。 使用systemctl命令启动flanneld后,会自动执行./mk-docker-opts.sh -i生成如下两个文件环境变量文件:

  • /run/flannel/subnet.env
FLANNEL_NETWORK=172.30.0.0/16
FLANNEL_SUBNET=172.30.46.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false
  • /run/docker_opts.env
DOCKER_OPT_BIP="--bip=172.30.46.1/24"
DOCKER_OPT_IPMASQ="--ip-masq=true"
DOCKER_OPT_MTU="--mtu=1450"

Docker将会读取这两个环境变量文件作为容器启动参数。

注意: 安装docker-ce-17.12.1.ce版本的rpm包时,需给docker.service额外添加$DOCKER_NETWORK_OPTIONS --exec-opt native.cgroupdriver=systemd

ExecStart=/usr/bin/dockerd $DOCKER_NETWORK_OPTIONS --exec-opt native.cgroupdriver=systemd

注意: 不论您用什么方式安装的flannel,下面这一步是必不可少的。

yum方式安装的flannel

修改docker的配置文件 vi /usr/lib/systemd/system/docker.service,增加一条环境变量配置

EnvironmentFile=-/run/flannel/docker

/run/flannel/docker文件是flannel启动后自动生成的,其中包含了docker启动时需要的参数。

二进制方式安装的flannel

修改docker的配置文件 vi /usr/lib/systemd/system/docker.service,增加如下几条环境变量配置:

EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env

这两个文件是mk-docker-opts.sh脚本生成环境变量文件默认的保存位置,docker启动的时候需要加载这几个配置文件才可以加入到flannel创建的虚拟网络里。

所以不论您使用何种方式安装的flannel,将以下配置加入到docker.service中可确保万无一失。

EnvironmentFile=-/run/flannel/docker
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network
EnvironmentFile=-/run/docker_opts.env

docker安装方式也分yum和rpm包安装

方式一: yum 安装

版本1.13.1-53

yum install docker -y

然后修改配置 vi /etc/sysconfig/docker 中OPTIONS参数如下:

OPTIONS='--log-driver=json-file --signature-verification=false --insecure-registry 192.168.223.208:80' 
# 附:192.168.223.208:80 为harbor私有镜像仓库

修改 vi /etc/sysconfig/docker-storage 如下:

DOCKER_STORAGE_OPTIONS="--storage-driver overlay "

修改docker pull源 vi /etc/docker/daemon.json

{ 
	"registry-mirrors":["https://registry.docker-cn.com"]
}

修改 vi /usr/lib/systemd/system/docker.service:

[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network.target rhel-push-plugin.socket registries.service
Wants=docker-storage-setup.service
Requires=docker-cleanup.timer

[Service]
Type=notify
NotifyAccess=all
EnvironmentFile=-/run/containers/registries.conf
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/etc/sysconfig/docker-network
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/docker
Environment=GOTRACEBACK=crash
Environment=DOCKER_HTTP_HOST_COMPAT=1
Environment=PATH=/usr/libexec/docker:/usr/bin:/usr/sbin
ExecStart=/usr/bin/dockerd-current \
          --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current \
          --default-runtime=docker-runc \
          --exec-opt native.cgroupdriver=systemd \
          --userland-proxy-path=/usr/libexec/docker/docker-proxy-current \
          --seccomp-profile=/etc/docker/seccomp.json \
          $OPTIONS \
          $DOCKER_STORAGE_OPTIONS \
          $DOCKER_NETWORK_OPTIONS \
          $ADD_REGISTRY \
          $BLOCK_REGISTRY \
          $INSECURE_REGISTRY \
          $REGISTRIES
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
TimeoutStartSec=0
Restart=on-abnormal
MountFlags=slave
KillMode=process

[Install]
WantedBy=multi-user.target

方式二: rpm安装

版本:ce-17.12.1

rpm -ivh docker-ce-17.12.1.ce-1.el7.centos.x86_64.rpm

然后修改配置 vi /etc/sysconfig/docker 中OPTIONS参数如下:

# /etc/sysconfig/docker

# Modify these options if you want to change the way the docker daemon runs
OPTIONS='--log-driver=json-file --insecure-registry 192.168.223.208:80'
# 附:192.168.223.208:80 为harbor私有镜像仓库,-signature-verification=false选项在此版本已不存在
if [ -z "${DOCKER_CERT_PATH}" ]; then
    DOCKER_CERT_PATH=/etc/docker
fi

# Do not add registries in this file anymore. Use /etc/containers/registries.conf
# from the atomic-registries package.
#

# On an SELinux system, if you remove the --selinux-enabled option, you
# also need to turn on the docker_transition_unconfined boolean.
# setsebool -P docker_transition_unconfined 1

# Location used for temporary files, such as those created by
# docker load and build operations. Default is /var/lib/docker/tmp
# Can be overriden by setting the following environment variable.
# DOCKER_TMPDIR=/var/tmp

# Controls the /etc/cron.daily/docker-logrotate cron job status.
# To disable, uncomment the line below.
# LOGROTATE=false

# docker-latest daemon can be used by starting the docker-latest unitfile.
# To use docker-latest client, uncomment below lines
#DOCKERBINARY=/usr/bin/docker-latest
#DOCKERDBINARY=/usr/bin/dockerd-latest
#DOCKER_CONTAINERD_BINARY=/usr/bin/docker-containerd-latest
#DOCKER_CONTAINERD_SHIM_BINARY=/usr/bin/docker-containerd-shim-latest

修改 vi /etc/sysconfig/docker-storage 如下:

DOCKER_STORAGE_OPTIONS="--storage-driver overlay "

修改docker pull源 vi /etc/docker/daemon.json

{ 
	"registry-mirrors":["https://registry.docker-cn.com"]
}

修改 vi /usr/lib/systemd/system/docker.service:

[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network-online.target firewalld.service
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/subnet.env
EnvironmentFile=-/run/docker_opts.env
EnvironmentFile=-/run/flannel/docker
Environment=GOTRACEBACK=crash
ExecStart=/usr/bin/dockerd $OPTIONS \
      --exec-opt native.cgroupdriver=systemd \
      $DOCKER_STORAGE_OPTIONS \
      $DOCKER_NETWORK_OPTIONS \
      $ADD_REGISTRY \
      $BLOCK_REGISTRY \
      $INSECURE_REGISTRY
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
MountFlags=slave
TimeoutStartSec=1min
Delegate=yes
KillMode=process
Restart=on-failure
StartLimitBurst=3
StartLimitInterval=60s

[Install]
WantedBy=multi-user.target

启动docker

systemctl reload-daemon
systemctl enable docker
systemctl restart docker
systemctl status docker

注: 重启了docker后还要重启kubelet,如果遇到以下问题,kubelet启动失败。报错:

Mar 31 16:44:41 k8s_node1 kubelet[81047]: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

这是kubelet与docker的cgroup driver不一致导致的,kubelet启动的时候有个—cgroup-driver参数可以指定为"cgroupfs"或者“systemd”。

--cgroup-driver string       Driver that the kubelet uses to manipulate cgroups on the host.  Possible values: 'cgroupfs', 'systemd' (default "cgroupfs")

配置docker的service配置文件 vi /usr/lib/systemd/system/docker.service,设置ExecStart中的 --exec-opt native.cgroupdriver=systemd 再重启即可。

安装和配置kubelet

kubernets1.8不同点

相对于kubernetes1.6集群必须进行的配置有: 对于kuberentes1.8集群,必须关闭swap,否则kubelet启动将失败。 修改/etc/fstab将,swap系统注释掉。

kubelet 启动时向 kube-apiserver 发送 TLS bootstrapping 请求,需要先将 bootstrap token 文件中的 kubelet-bootstrap 用户赋予 system:node-bootstrapper cluster 角色(role), 然后 kubelet 才能有权限创建认证请求(certificate signing requests):

cd /etc/kubernetes
kubectl create clusterrolebinding kubelet-bootstrap \
  --clusterrole=system:node-bootstrapper \
  --user=kubelet-bootstrap
  • --user=kubelet-bootstrap 是在 /etc/kubernetes/token.csv 文件中指定的用户名,同时也写入了 /etc/kubernetes/bootstrap.kubeconfig 文件;

下载最新的kubelet和kube-proxy二进制文件

注意请下载对应的Kubernetes版本的安装包。

wget https://dl.k8s.io/v1.6.0/kubernetes-server-linux-amd64.tar.gz
tar -xzvf kubernetes-server-linux-amd64.tar.gz
cp -r kubernetes/server/bin/{kube-proxy,kubelet} /usr/bin/
chmod +x /usr/bin/kube*

创建kubelet的service配置文件

文件位置 vi /usr/lib/systemd/system/kubelet.service

[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/kubelet
ExecStart=/usr/bin/kubelet \
            $KUBE_LOGTOSTDERR \
            $KUBE_LOG_LEVEL \
            $KUBELET_API_SERVER \
            $KUBELET_ADDRESS \
            $KUBELET_PORT \
            $KUBELET_HOSTNAME \
            $KUBE_ALLOW_PRIV \
            $KUBELET_POD_INFRA_CONTAINER \
            $KUBELET_ARGS
Restart=on-failure

[Install]
WantedBy=multi-user.target

kubelet的配置文件/etc/kubernetes/kubelet。其中的IP地址更改为你的每台node节点的IP地址。 注意: 在启动kubelet之前,需要先手动创建/var/lib/kubelet目录:mkdir -p /var/lib/kubelet

kubelet的配置文件 vi /etc/kubernetes/kubelet:

kubernetes1.8不同点

相对于kubenrete1.6的配置变动:

  • 对于kuberentes1.8集群中的kubelet配置,取消了KUBELET_API_SERVER的配置,而改用kubeconfig文件来定义master地址,所以请注释掉KUBELET_API_SERVER配置。
###
## kubernetes kubelet (minion) config
#
## The address for the info server to serve on (set to 0.0.0.0 or "" for all interfaces)
KUBELET_ADDRESS="--address=192.168.223.206"
#
## The port for the info server to serve on
#KUBELET_PORT="--port=10250"
#
## You may leave this blank to use the actual hostname
KUBELET_HOSTNAME="--hostname-override=192.168.223.206"
#
## location of the api-server
## COMMENT THIS ON KUBERNETES 1.8+
KUBELET_API_SERVER="--api-servers=http://192.168.223.200:8080"
#
## pod infrastructure container
KUBELET_POD_INFRA_CONTAINER="--pod_infra_container_image=192.168.223.208:80/k8s/pause-amd64:v3.0"
#
## Add your own!
KUBELET_ARGS="--cgroup-driver=systemd --cluster-dns=10.254.0.2 --experimental-bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig --kubeconfig=/etc/kubernetes/kubelet.kubeconfig --require-kubeconfig --cert-dir=/etc/kubernetes/ssl --cluster-domain=cluster.local --hairpin-mode promiscuous-bridge --serialize-image-pulls=false"
  • 如果使用systemd方式启动,则需要额外增加两个参数--runtime-cgroups=/systemd/system.slice --kubelet-cgroups=/systemd/system.slice
  • --experimental-bootstrap-kubeconfig 在1.9版本已经变成了--bootstrap-kubeconfig
  • --address 不能设置为 127.0.0.1,否则后续 Pods 访问 kubelet 的 API 接口时会失败,因为 Pods 访问的 127.0.0.1 指向自己而不是 kubelet;
  • 如果设置了 --hostname-override 选项,则 kube-proxy 也需要设置该选项,否则会出现找不到 Node 的情况;
  • "--cgroup-driver 配置成 systemd,不要使用cgroup,否则在 CentOS 系统中 kubelet 将启动失败(保持docker和kubelet中的cgroup driver配置一致即可,不一定非使用systemd)。
  • --experimental-bootstrap-kubeconfig 指向 bootstrap kubeconfig 文件,kubelet 使用该文件中的用户名和 token 向 kube-apiserver 发送 TLS Bootstrapping 请求;
  • 管理员通过了 CSR 请求后,kubelet 自动在 --cert-dir 目录创建证书和私钥文件(kubelet-client.crt 和 kubelet-client.key),然后写入 --kubeconfig 文件;
  • 建议在 --kubeconfig 配置文件中指定 kube-apiserver 地址,如果未指定 --api-servers 选项,则必须指定 --require-kubeconfig 选项后才从配置文件中读取 kube-apiserver 的地址,否则 kubelet 启动后将找不到 kube-apiserver (日志中提示未找到 API Server),kubectl get nodes 不会返回对应的 Node 信息;
  • --cluster-dns 指定 kubedns 的 Service IP(可以先分配,后续创建 kubedns 服务时指定该 IP),--cluster-domain 指定域名后缀,这两个参数同时指定后才会生效;
  • --cluster-domain 指定 pod 启动时 /etc/resolve.conf 文件中的 search domain ,起初我们将其配置成了 cluster.local.,这样在解析 service 的 DNS 名称时是正常的,可是在解析 headless service 中的 FQDN pod name 的时候却错误,因此我们将其修改为 cluster.local,去掉嘴后面的 ”点号“ 就可以解决该问题,关于 kubernetes 中的域名/服务名称解析请参见我的另一篇文章。
  • --kubeconfig=/etc/kubernetes/kubelet.kubeconfig中指定的kubelet.kubeconfig文件在第一次启动kubelet之前并不存在,请看下文,当通过CSR请求后会自动生成kubelet.kubeconfig文件,如果你的节点上已经生成了~/.kube/config文件,你可以将该文件拷贝到该路径下,并重命名为kubelet.kubeconfig,所有node节点可以共用同一个kubelet.kubeconfig文件,这样新添加的节点就不需要再创建CSR请求就能自动添加到kubernetes集群中。同样,在任意能够访问到kubernetes集群的主机上使用kubectl --kubeconfig命令操作集群时,只要使用~/.kube/config文件就可以通过权限认证,因为这里面已经有认证信息并认为你是admin用户,对集群拥有所有权限。
  • KUBELET_POD_INFRA_CONTAINER 是基础pod镜像容器,这里我用的是私有镜像仓库地址,大家部署的时候需要修改为自己的镜像。这里的pod镜像可以使用:pod-infrastructure 或者 pause 。pod-infrastructure镜像是Redhat制作的,大小接近80M,下载比较耗时,其实该镜像并不运行什么具体进程,推荐使用Google的pause镜像gcr.io/google_containers/pause-amd64:3.0,这个镜像只有300多K。

启动kubelet

systemctl daemon-reload
systemctl enable kubelet
systemctl start kubelet
systemctl status kubelet

通过 kublet 的 TLS 证书请求

kubelet 首次启动时向 kube-apiserver 发送证书签名请求,必须通过后 kubernetes 系统才会将该 Node 加入到集群。

查看未授权的 CSR 请求

kubectl get csr
NAME        AGE       REQUESTOR           CONDITION
csr-2b308   4m        kubelet-bootstrap   Pending

kubectl get nodes
No resources found.

通过 CSR 请求

kubectl certificate approve csr-2b308
certificatesigningrequest "csr-2b308" approved

kubectl get nodes
NAME           STATUS    AGE       VERSION
192.168.223.206   Ready     1m       v1.6.0

自动生成了 kubelet kubeconfig 文件和公私钥

ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2284 Apr  7 02:07 /etc/kubernetes/kubelet.kubeconfig

ls -l /etc/kubernetes/ssl/kubelet*
-rw-r--r-- 1 root root 1046 Apr  7 02:07 /etc/kubernetes/ssl/kubelet-client.crt
-rw------- 1 root root  227 Apr  7 02:04 /etc/kubernetes/ssl/kubelet-client.key
-rw-r--r-- 1 root root 1103 Apr  7 02:07 /etc/kubernetes/ssl/kubelet.crt
-rw------- 1 root root 1675 Apr  7 02:07 /etc/kubernetes/ssl/kubelet.key

假如你更新kubernetes的证书,只要没有更新token.csv,当重启kubelet后,该node就会自动加入到kuberentes集群中,而不会重新发送certificaterequest,也不需要在master节点上执行kubectl certificate approve操作。前提是不要删除node节点上的/etc/kubernetes/ssl/kubelet*和/etc/kubernetes/kubelet.kubeconfig文件。否则kubelet启动时会提示找不到证书而失败。

注意: 如果启动kubelet的时候见到证书相关的报错,有个trick可以解决这个问题,可以将master节点上的~/.kube/config文件(该文件在安装kubectl命令行工具这一步中将会自动生成)拷贝到node节点的/etc/kubernetes/kubelet.kubeconfig位置,这样就不需要通过CSR,当kubelet启动后就会自动加入的集群中。

配置 kube-proxy

安装conntrack

yum install -y conntrack-tools

创建 kube-proxy 的service配置文件
文件路径 vi /usr/lib/systemd/system/kube-proxy.service

[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
EnvironmentFile=-/etc/kubernetes/config
EnvironmentFile=-/etc/kubernetes/proxy
ExecStart=/usr/bin/kube-proxy \
        $KUBE_LOGTOSTDERR \
        $KUBE_LOG_LEVEL \
        $KUBE_MASTER \
        $KUBE_PROXY_ARGS
Restart=on-failure
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target

kube-proxy配置文件 vi /etc/kubernetes/proxy

###
# kubernetes proxy config

# default config should be adequate

# Add your own!
KUBE_PROXY_ARGS="--bind-address=192.168.223.206 --hostname-override=192.168.223.206 --kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig --cluster-cidr=10.254.0.0/16"
  • --hostname-override 参数值必须与 kubelet 的值一致,否则 kube-proxy 启动后会找不到该 Node,从而不会创建任何 iptables 规则;
  • kube-proxy 根据 --cluster-cidr 判断集群内部和外部流量,指定 --cluster-cidr 或 --masquerade-all 选项后 kube-proxy 才会对访问 Service IP 的请求做 SNAT;
  • --kubeconfig 指定的配置文件嵌入了 kube-apiserver 的地址、用户名、证书、秘钥等请求和认证信息;
  • 预定义的 RoleBinding cluster-admin 将User system:kube-proxy 与 Role system:node-proxier 绑定,该 Role 授予了调用 kube-apiserver Proxy 相关 API 的权限;

启动 kube-proxy

systemctl daemon-reload
systemctl enable kube-proxy
systemctl start kube-proxy
systemctl status kube-proxy
  • node2 节点 192.168.223.207 安装方式一样,只需要把相应配置文件里面的IP改为 192.168.223.207 即可
  • 新增节点的话只需要把master节点的证书拷贝到新主机,证书包括: /etc/kubernetes/bootstrap.kubeconfig; /etc/kubernetes/kube-proxy.kubeconfig;/etc/kubernetes/ssl/*.pem ,然后先安装flanneld,后照本章操作加入集群即可。

验证测试

我们创建一个nginx的service试一下集群是否可用

kubectl run nginx --replicas=2 --labels="run=load-balancer-example" --image=192.168.223.208:80/k8s/nginx:v1.9.4  --port=80
deployment "nginx" created

kubectl expose deployment nginx --type=NodePort --name=example-service
service "example-service" exposed

kubectl describe svc example-service
Name:            example-service
Namespace:        default
Labels:            run=load-balancer-example
Annotations:        <none>
Selector:        run=load-balancer-example
Type:            NodePort
IP:            10.254.62.207
Port:            <unset>    80/TCP
NodePort:        <unset>    32724/TCP
Endpoints:        172.30.60.2:80,172.30.94.2:80
Session Affinity:    None
Events:            <none>

curl "10.254.62.207:80"
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

注意: 此时可能会出现不同node节点上面的pod之间网络不通,解决方法如下

#设置所有节点iptables
yum install iptables-services -y;
systemctl disable iptables;
systemctl stop iptables;
modprobe ip_tables;
iptables -P FORWARD ACCEPT;
  • 上面的测试示例中使用的nginx是我的私有镜像仓库中的镜像 192.168.223.208:80/k8s/nginx:v1.9.4,大家在测试过程中请换成自己的nginx镜像地址。
  • 10.254.62.207 为集群内部地址,只有在安装了kube-proxy的节点上能够访问,访问这个地址时是做了负载均衡的
  • 访问 192.168.223.206:32724 或 192.168.223.207:32724 都可以得到nginx的页面。
  • 删除此测试服务方法:kubectl delete deployment nginx; kubectl delete svc example-service

至此kubernets1.6.0集群基础环境已经安装完成,后面将安装一些常用插件

九、安装kubedns插件

该插件需要使用以下官方镜像:

gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.1
gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.1
gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.1

由于大中华局域网的原因,这些镜像是pull不下来的。所有我这里使用自己搭建的私有镜像仓库

192.168.223.208:80/k8s/k8s-dns-kube-dns-amd64:v1.14.1
192.168.223.208:80/k8s/k8s-dns-dnsmasq-nanny-amd64:v1.14.1
192.168.223.208:80/k8s/k8s-dns-sidecar-amd64:v1.14.1

需要使用的yaml配置文件

kubedns-cm.yaml  
kubedns-sa.yaml  
kubedns-controller.yaml  
kubedns-svc.yaml

系统预定义的 RoleBinding

预定义的 RoleBinding system:kube-dns 将 kube-system 命名空间的 kube-dns ServiceAccount 与 system:kube-dns Role 绑定, 该 Role 具有访问 kube-apiserver DNS 相关 API 的权限;

kubectl get clusterrolebindings system:kube-dns -o yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2017-04-11T11:20:42Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-dns
  resourceVersion: "58"
  selfLink: /apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindingssystem%3Akube-dns
  uid: e61f4d92-1ea8-11e7-8cd7-f4e9d49f8ed0
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-dns
subjects:
- kind: ServiceAccount
  name: kube-dns
  namespace: kube-system
  • kubedns-controller.yaml 中定义的 Pods 时使用了 kubedns-sa.yaml 文件定义的 kube-dns ServiceAccount,所以具有访问 kube-apiserver DNS 相关 API 的权限。

配置 kube-dns ServiceAccount

yaml文件 vi kubedns-sa.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile

yaml文件 vi kubedns-cm.yaml

# Copyright 2016 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists

配置 kube-dns 服务

yaml文件 vi kubedns-controller.yaml

# Copyright 2016 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Should keep target in cluster/addons/dns-horizontal-autoscaler/dns-horizontal-autoscaler.yaml
# in sync with this file.

# __MACHINE_GENERATED_WARNING__

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  # replicas: not specified here:
  # 1. In order to make Addon Manager do not reconcile this replicas parameter.
  # 2. Default is 1.
  # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      volumes:
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      containers:
      - name: kubedns
        image: 192.168.223.208:80/k8s/k8s-dns-kube-dns-amd64:v1.14.1
        resources:
          # TODO: Set memory limits when we've profiled the container for large
          # clusters, then set request = limit to keep this container in
          # guaranteed class. Currently, this container falls into the
          # "burstable" category so the kubelet doesn't backoff from restarting it.
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        livenessProbe:
          httpGet:
            path: /healthcheck/kubedns
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          # we poll on pod startup for the Kubernetes master service and
          # only setup the /readiness HTTP server once that's available.
          initialDelaySeconds: 3
          timeoutSeconds: 5
        args:
        - --domain=cluster.local.
        - --dns-port=10053
        - --config-dir=/kube-dns-config
        - --v=2
        #__PILLAR__FEDERATIONS__DOMAIN__MAP__
        env:
        - name: PROMETHEUS_PORT
          value: "10055"
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
        - containerPort: 10055
          name: metrics
          protocol: TCP
        volumeMounts:
        - name: kube-dns-config
          mountPath: /kube-dns-config
      - name: dnsmasq
        image: 192.168.223.208:80/k8s/k8s-dns-dnsmasq-nanny-amd64:v1.14.1
        livenessProbe:
          httpGet:
            path: /healthcheck/dnsmasq
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - -v=2
        - -logtostderr
        - -configDir=/etc/k8s/dns/dnsmasq-nanny
        - -restartDnsmasq=true
        - --
        - -k
        - --cache-size=1000
        - --log-facility=-
        - --server=/cluster.local./127.0.0.1#10053
        - --server=/in-addr.arpa/127.0.0.1#10053
        - --server=/ip6.arpa/127.0.0.1#10053
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        # see: https://github.com/kubernetes/kubernetes/issues/29055 for details
        resources:
          requests:
            cpu: 150m
            memory: 20Mi
        volumeMounts:
        - name: kube-dns-config
          mountPath: /etc/k8s/dns/dnsmasq-nanny
      - name: sidecar
        image: 192.168.223.208:80/k8s/k8s-dns-sidecar-amd64:v1.14.1
        livenessProbe:
          httpGet:
            path: /metrics
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --v=2
        - --logtostderr
        - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local.,5,A
        - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local.,5,A
        ports:
        - containerPort: 10054
          name: metrics
          protocol: TCP
        resources:
          requests:
            memory: 20Mi
            cpu: 10m
      dnsPolicy: Default  # Don't use cluster DNS.
      serviceAccountName: kube-dns
  • spec.clusterIP = 10.254.0.2,即明确指定了 kube-dns Service IP,这个 IP 需要和 kubelet 的 --cluster-dns 参数值一致;

配置 kube-dns Deployment

yaml文件 vi kubedns-svc.yaml

# Copyright 2016 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# __MACHINE_GENERATED_WARNING__

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.254.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  • 使用系统已经做了 RoleBinding 的 kube-dns ServiceAccount,该账户具有访问 kube-apiserver DNS 相关 API 的权限;

执行所有定义文件

ls *.yaml
kubedns-cm.yaml  kubedns-controller.yaml  kubedns-sa.yaml  kubedns-svc.yaml

kubectl create -f .

检查 kubedns 功能

新建一个 Deployment

cat > my-nginx.yaml << EOF
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: 192.168.223.208:80/k8s/nginx:v1.9.4
        ports:
        - containerPort: 80
EOF

kubectl create -f my-nginx.yaml

Export 该 Deployment, 生成 my-nginx 服务

kubectl expose deploy my-nginx

kubectl get services --all-namespaces |grep my-nginx
default       my-nginx     10.254.179.239   <none>        80/TCP          42m

进入kubernete生成的 my-nginx 服务的pods中

kubectl get pods
NAME                        READY     STATUS        RESTARTS   AGE
my-nginx-1108742923-1bpml   1/1       Running       0          1m
my-nginx-1108742923-44dp8   1/1       Running       0          1m

kubectl exec -it my-nginx-1108742923-1bpml /bin/bash
root@my-nginx-1108742923-1bpml:~# cat /etc/resolv.conf
nameserver 10.254.0.2
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5

root@my-nginx-1108742923-1bpml:~# ping my-nginx     
PING my-nginx.default.svc.cluster.local (10.254.54.162): 56 data bytes
^C--- my-nginx.default.svc.cluster.local ping statistics ---
11 packets transmitted, 0 packets received, 100% packet loss
root@my-nginx-1108742923-1bpml:~# 

root@my-nginx-1108742923-1bpml:~# ping kubernetes
PING kubernetes.default.svc.cluster.local (10.254.0.1): 56 data bytes
^C--- kubernetes.default.svc.cluster.local ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss
root@my-nginx-1108742923-1bpml:~# 

root@my-nginx-1108742923-1bpml:~# ping kube-dns.kube-system.svc.cluster.local
PING kube-dns.kube-system.svc.cluster.local (10.254.0.2): 56 data bytes
^C--- kube-dns.kube-system.svc.cluster.local ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
root@my-nginx-1108742923-1bpml:~# 

从结果来看,service名称可以正常解析。

注意: 直接ping ClusterIP是ping不通的,ClusterIP是根据IPtables路由到服务的endpoint上,只有结合ClusterIP加端口才能访问到对应的服务。

十、安装dashboard插件

注意:本文档中安装的是kubernetes dashboard v1.6.0 安装dashboard时,遇到一个问题,如果直接安装v1.6.3版本的话,后面安装的Heapster插件现在 的 CPU、内存等 metric 图形等功能不可以;如果先安装v1.6.0再装Heapster 插件,最后再升级为v1.6.3版本就没有此问题

官方文件目录:https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dashboard

需要使用的镜像:

192.168.223.208:80/k8s/kubernetes-dashboard-amd64:v1.6.0

我们使用的yaml文件如下:

ls *.yaml
dashboard-controller.yaml  dashboard-service.yaml dashboard-rbac.yaml

配置 dashboard ServiceAccount

文件 vi dashboard-rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard
  namespace: kube-system

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: dashboard
subjects:
  - kind: ServiceAccount
    name: dashboard
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io

配置dashboard-controller

文件 vi dashboard-controller.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: dashboard
      containers:
      - name: kubernetes-dashboard
        image: 192.168.223.208:80/k8s/kubernetes-dashboard-amd64:v1.6.0
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        ports:
        - containerPort: 9090
        livenessProbe:
          httpGet:
            path: /
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"

配置dashboard-service

文件 vi dashboard-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  type: NodePort 
  selector:
    k8s-app: kubernetes-dashboard
  ports:
  - port: 80
    targetPort: 9090
  • 指定端口类型为 NodePort,这样外界可以通过地址 nodeIP:nodePort 访问 dashboard;

执行所有定义文件

kubectl create -f  .
service "kubernetes-dashboard" created
deployment "kubernetes-dashboard" created

检查执行结果

查看分配的 NodePort

kubectl get services kubernetes-dashboard -n kube-system
NAME                   CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes-dashboard   10.254.224.130   <nodes>       80:30312/TCP   25s
  • NodePort 30312映射到 dashboard pod 80端口;

检查 controller

kubectl get deployment kubernetes-dashboard  -n kube-system
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kubernetes-dashboard   1         1         1            1           3m

kubectl get pods  -n kube-system | grep dashboard
kubernetes-dashboard-1339745653-pmn6z   1/1       Running   0          4m

访问dashboard

有以下三种方式:

  • kubernetes-dashboard 服务暴露了 NodePort,可以使用 http://NodeIP:nodePort 地址访问 dashboard
  • 通过 API server 访问 dashboard(https 6443端口和http 8080端口方式)
  • 通过 kubectl proxy 访问 dashboard

通过 kubectl proxy 访问 dashboard

启动代理

kubectl proxy --address='192.168.223.204' --port=8086 --accept-hosts='^*$'
Starting to serve on 192.168.223.204:8086
  • 需要指定 --accept-hosts 选项,否则浏览器访问 dashboard 页面时提示 “Unauthorized”;

浏览器访问 URL:http://192.168.223.204:8086/ui 自动跳转到:http://192.168.223.204:8086/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=default

通过 API server 访问dashboard

获取集群服务地址列表

kubectl cluster-info
Kubernetes master is running at https://192.168.223.200:6443
KubeDNS is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

浏览器访问 URL:https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard(浏览器会提示证书验证,因为通过加密通道,以改方式访问的话,需要提前导入证书到你的计算机中)。这是我当时在这遇到的坑:通过 kube-apiserver 访问dashboard,提示User "system:anonymous" cannot proxy services in the namespace "kube-system". #5,解决方法如下: 导入证书 将生成的admin.pem证书转换格式

cd /etc/kubernetes/ssl
openssl pkcs12 -export -in admin.pem  -out admin.p12 -inkey admin-key.pem

将生成的admin.p12证书导入的你的电脑即可,导出的时候记住你设置的密码,导入的时候还要用到。

如果你不想使用https的话,可以直接访问insecure port 8080端口:http://192.168.223.200:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

由于缺少 Heapster 插件,当前 dashboard 不能展示 Pod、Nodes 的 CPU、内存等 metric 图形。

更新dashboard到v1.6.3

Kubernetes 1.6 版本的 dashboard 的镜像已经到了 v1.6.3 版本,我们可以使用下面的方式更新。 修改 dashboard-controller.yaml 文件中的镜像的版本将 v1.6.0 更改为 v1.6.3。

image: 192.168.223.208:80/k8s/kubernetes-dashboard-amd64:v1.6.3

然后执行下面的命令:

kubectl apply -f dashboard-controller.yaml

监听 dashboard Pod 的状态可以看到:

kubectl get pods --all-namespaces|grep dashboard
kubernetes-dashboard-215087767-2jsgd   0/1       Pending   0         0s
kubernetes-dashboard-3966630548-0jj1j   1/1       Terminating   0         1d
kubernetes-dashboard-215087767-2jsgd   0/1       Pending   0         0s
kubernetes-dashboard-3966630548-0jj1j   1/1       Terminating   0         1d
kubernetes-dashboard-215087767-2jsgd   0/1       ContainerCreating   0         0s
kubernetes-dashboard-3966630548-0jj1j   0/1       Terminating   0         1d
kubernetes-dashboard-3966630548-0jj1j   0/1       Terminating   0         1d
kubernetes-dashboard-215087767-2jsgd   1/1       Running   0         6s
kubernetes-dashboard-3966630548-0jj1j   0/1       Terminating   0         1d
kubernetes-dashboard-3966630548-0jj1j   0/1       Terminating   0         1d
kubernetes-dashboard-3966630548-0jj1j   0/1       Terminating   0         1d

新的 Pod 的启动了,旧的 Pod 被终结了。

Dashboard 的访问地址不变,重新访问 http://192.168.223.200:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard, 可以看到新版的界面支持中文了。 新版本中最大的变化是增加了进入容器内部的入口(类似ssh终端),可以在页面上进入到容器内部操作,同时又增加了一个搜索框。

十一、安装heapster插件

该插件需要使用以下镜像:

192.168.223.208:80/k8s/heapster-amd64:vv1.4.3
192.168.223.208:80/k8s/heapster-influxdb-amd64:v1.1.1
192.168.223.208:80/k8s/heapster-grafana-amd64:v4.0.2

需要的yaml文件

ls *.yaml
grafana-deployment.yaml  grafana-service.yaml  heapster-deployment.yaml  heapster-rbac.yaml  heapster-service.yaml  influxdb-cm.yaml  influxdb-deployment.yaml  influxdb-service.yaml

配置 heapster-deployment

文件 vi heapster-rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: heapster
  namespace: kube-system

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: heapster
subjects:
  - kind: ServiceAccount
    name: heapster
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io

文件 vi heapster-deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: heapster
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: heapster
    spec:
      serviceAccountName: heapster
      containers:
      - name: heapster
        image: 192.168.223.208:80/k8s/heapster-amd64:v1.4.3
        imagePullPolicy: IfNotPresent
        command:
        - /heapster
        - --source=kubernetes:https://kubernetes.default
        - --sink=influxdb:http://monitoring-influxdb:8086

文件 vi heapster-service.yaml

apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: Heapster
  name: heapster
  namespace: kube-system
spec:
  ports:
  - port: 80
    targetPort: 8082
  selector:
    k8s-app: heapster
    k8s-app: heapster

配置 grafana-deployment

文件 vi grafana-deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-grafana
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: grafana
    spec:
      containers:
      - name: grafana
        image: 192.168.223.208:80/k8s/heapster-grafana-amd64:v4.0.2
        ports:
          - containerPort: 3000
            protocol: TCP
        volumeMounts:
        - mountPath: /var
          name: grafana-storage
        env:
        - name: INFLUXDB_HOST
          value: monitoring-influxdb
        - name: GRAFANA_PORT
          value: "3000"
          # The following env variables are required to make Grafana accessible via
          # the kubernetes api-server proxy. On production clusters, we recommend
          # removing these env variables, setup auth for grafana, and expose the grafana
          # service using a LoadBalancer or a public IP.
        - name: GF_AUTH_BASIC_ENABLED
          value: "false"
        - name: GF_AUTH_ANONYMOUS_ENABLED
          value: "true"
        - name: GF_AUTH_ANONYMOUS_ORG_ROLE
          value: Admin
        - name: GF_SERVER_ROOT_URL
          # If you're only using the API Server proxy, set this value instead:
          value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/
          #value: /
      volumes:
      - name: grafana-storage
        emptyDir: {}
  • 如果后续使用 kube-apiserver 或者 kubectl proxy 访问 grafana dashboard,则必须将 GF_SERVER_ROOT_URL 设置为 /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/,否则后续访问grafana时访问时提示找不到http://192.168.223.200:8086/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/api/dashboards/home 页面;

文件 vi grafana-service.yaml

apiVersion: v1
kind: Service
metadata:
  labels:
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: kube-system
spec:
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP.
  # type: LoadBalancer
  # You could also use NodePort to expose the service at a randomly-generated port
  ports:
  - port : 80
    targetPort: 3000
  selector:
    k8s-app: grafana

配置 influxdb-deployment

文件 vi influxdb-cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: influxdb-config
  namespace: kube-system
data:
  config.toml: |
    reporting-disabled = true
    bind-address = ":8088"
    [meta]
      dir = "/data/meta"
      retention-autocreate = true
      logging-enabled = true
    [data]
      dir = "/data/data"
      wal-dir = "/data/wal"
      query-log-enabled = true
      cache-max-memory-size = 1073741824
      cache-snapshot-memory-size = 26214400
      cache-snapshot-write-cold-duration = "10m0s"
      compact-full-write-cold-duration = "4h0m0s"
      max-series-per-database = 1000000
      max-values-per-tag = 100000
      trace-logging-enabled = false
    [coordinator]
      write-timeout = "10s"
      max-concurrent-queries = 0
      query-timeout = "0s"
      log-queries-after = "0s"
      max-select-point = 0
      max-select-series = 0
      max-select-buckets = 0
    [retention]
      enabled = true
      check-interval = "30m0s"
    [admin]
      enabled = true
      bind-address = ":8083"
      https-enabled = false
      https-certificate = "/etc/ssl/influxdb.pem"
    [shard-precreation]
      enabled = true
      check-interval = "10m0s"
      advance-period = "30m0s"
    [monitor]
      store-enabled = true
      store-database = "_internal"
      store-interval = "10s"
    [subscriber]
      enabled = true
      http-timeout = "30s"
      insecure-skip-verify = false
      ca-certs = ""
      write-concurrency = 40
      write-buffer-size = 1000
    [http]
      enabled = true
      bind-address = ":8086"
      auth-enabled = false
      log-enabled = true
      write-tracing = false
      pprof-enabled = false
      https-enabled = false
      https-certificate = "/etc/ssl/influxdb.pem"
      https-private-key = ""
      max-row-limit = 10000
      max-connection-limit = 0
      shared-secret = ""
      realm = "InfluxDB"
      unix-socket-enabled = false
      bind-socket = "/var/run/influxdb.sock"
    [[graphite]]
      enabled = false
      bind-address = ":2003"
      database = "graphite"
      retention-policy = ""
      protocol = "tcp"
      batch-size = 5000
      batch-pending = 10
      batch-timeout = "1s"
      consistency-level = "one"
      separator = "."
      udp-read-buffer = 0
    [[collectd]]
      enabled = false
      bind-address = ":25826"
      database = "collectd"
      retention-policy = ""
      batch-size = 5000
      batch-pending = 10
      batch-timeout = "10s"
      read-buffer = 0
      typesdb = "/usr/share/collectd/types.db"
    [[opentsdb]]
      enabled = false
      bind-address = ":4242"
      database = "opentsdb"
      retention-policy = ""
      consistency-level = "one"
      tls-enabled = false
      certificate = "/etc/ssl/influxdb.pem"
      batch-size = 1000
      batch-pending = 5
      batch-timeout = "1s"
      log-point-errors = true
    [[udp]]
      enabled = false
      bind-address = ":8089"
      database = "udp"
      retention-policy = ""
      batch-size = 5000
      batch-pending = 10
      read-buffer = 0
      batch-timeout = "1s"
      precision = ""
    [continuous_queries]
      log-enabled = true
      enabled = true
      run-interval = "1s"

文件 vi influxdb-deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: monitoring-influxdb
  namespace: kube-system
spec:
  replicas: 1
  template:
    metadata:
      labels:
        task: monitoring
        k8s-app: influxdb
    spec:
      containers:
      - name: influxdb
        image: 192.168.223.208:80/k8s/heapster-influxdb-amd64:v1.1.1
        volumeMounts:
        - mountPath: /data
          name: influxdb-storage
        - mountPath: /etc/
          name: influxdb-config
      volumes:
      - name: influxdb-storage
        emptyDir: {}
      - name: influxdb-config
        configMap:
          name: influxdb-config

文件 vi influxdb-service.yaml


apiVersion: v1
kind: Service
metadata:
  labels:
    task: monitoring
    # For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
    # If you are NOT using this as an addon, you should comment out this line.
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-influxdb
  name: monitoring-influxdb
  namespace: kube-system
spec:
  type: NodePort
  ports:
  - port: 8086
    targetPort: 8086
    name: http
  - port: 8083
    targetPort: 8083
    name: admin
  selector:
    k8s-app: influxdb
  • 定义端口类型为 NodePort,额外增加了 admin 端口映射,用于后续浏览器访问 influxdb 的 admin UI 界面;

执行所有定义文件

ls *.yaml
grafana-service.yaml heapster-rbac.yaml influxdb-cm.yaml influxdb-service.yaml grafana-deployment.yaml  heapster-deployment.yaml heapster-service.yaml influxdb-deployment.yaml

kubectl create -f  .
deployment "monitoring-grafana" created
service "monitoring-grafana" created
deployment "heapster" created
serviceaccount "heapster" created
clusterrolebinding "heapster" created
service "heapster" created
configmap "influxdb-config" created
deployment "monitoring-influxdb" created
service "monitoring-influxdb" created

检查执行结果

检查 Deployment

kubectl get deployments -n kube-system | grep -E 'heapster|monitoring'
heapster               1         1         1            1           2m
monitoring-grafana     1         1         1            1           2m
monitoring-influxdb    1         1         1            1           2m

检查 Pods

kubectl get pods -n kube-system | grep -E 'heapster|monitoring'
heapster-110704576-gpg8v                1/1       Running   0          2m
monitoring-grafana-2861879979-9z89f     1/1       Running   0          2m
monitoring-influxdb-1411048194-lzrpc    1/1       Running   0          2m

此时检查 kubernets dashboard 界面,就可以显示各 Nodes、Pods 的 CPU、内存、负载等利用率曲线图了;

访问 grafana

  1. 通过 kube-apiserver 访问: 获取 monitoring-grafana 服务 URL
kubectl cluster-info
Kubernetes master is running at https://192.168.223.200:6443
Heapster is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/heapster
KubeDNS is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
monitoring-grafana is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
monitoring-influxdb is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

浏览器访问 URL: http://192.168.223.200:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana

  1. 通过 kube-apiserver 访问: 创建代理
kubectl proxy --address='192.168.223.204' --port=8086 --accept-hosts='^*$'
Starting to serve on 192.168.223.204:8086

浏览器访问 URL:http://192.168.223.204:8086/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana

注意 在安装好 Grafana 之后我们使用的是默认的 template 配置,页面上的 namespace 选择里只有 default 和 kube-system,并不是说其他的 namespace 里的指标没有得到监控,只是我们没有在 Grafana 中开启他它们的显示而已。将 Templating 中的 namespace 的 Data source 设置为 influxdb-datasource,Refresh 设置为 on Dashboard Load 保存设置,刷新浏览器,即可看到其他 namespace 选项。

访问 influxdb admin UI

获取 influxdb http 8086 映射的 NodePort

kubectl get svc -n kube-system|grep influxdb
monitoring-influxdb    10.254.22.46    <nodes>       8086:32299/TCP,8083:30269/TCP   9m

通过 kube-apiserver 的非安全端口访问 influxdb 的 admin UI 界面: http://192.168.223.200:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:8083/ 在页面的 “Connection Settings” 的 Host 中输入 node IP, Port 中输入 8086 映射的 nodePort 如上面的 32299,点击 “Save” 即可(我的集群中的地址是192.168.223.206:32299): Quary 框中输入 show stats 查看基本信息

十二、安装EFK插件

我们通过在每台node上部署一个以DaemonSet方式运行的fluentd来收集每台node上的日志。Fluentd将docker日志目录/var/lib/docker/containers和/var/log目录挂载到Pod中,然后Pod会在node节点的/var/log/pods目录中创建新的目录,可以区别不同的容器日志输出,该目录下有一个日志文件链接到/var/lib/docker/contianers目录下的容器日志输出。

该插件需要使用以下镜像:

192.168.223.208:80/k8s/elasticsearch:v2.4.1
192.168.223.208:80/k8s/fluentd-elasticsearch:v1.22
192.168.223.208:80/k8s/kibana:v4.6.1

需要使用的yaml配置文件

ls *.yaml
efk-rbac.yaml  es-controller.yaml  es-service.yaml  fluentd-es-ds.yaml  kibana-controller.yaml  kibana-service.yaml

配置 es

文件 vi efk-rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: efk
  namespace: kube-system

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: efk
subjects:
  - kind: ServiceAccount
    name: efk
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io

文件 vi es-controller.yaml

apiVersion: v1
kind: ReplicationController
metadata:
  name: elasticsearch-logging-v1
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    version: v1
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 2
  selector:
    k8s-app: elasticsearch-logging
    version: v1
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v1
        kubernetes.io/cluster-service: "true"
    spec:
      serviceAccountName: efk
      containers:
      - image: 192.168.223.208:80/k8s/elasticsearch:v2.4.1
        name: elasticsearch-logging
        resources:
          # need more cpu upon initialization, therefore burstable class
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - name: es-persistent-storage
          mountPath: /data
        env:
        - name: "NAMESPACE"
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      volumes:
      - name: es-persistent-storage
        emptyDir: {}

文件 vi es-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-logging
  namespace: kube-system
  labels:
    k8s-app: elasticsearch-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Elasticsearch"
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging

配置 fluentd-es

文件 vi fluentd-es-ds.yaml

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: fluentd-es-v1.22
  namespace: kube-system
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    version: v1.22
spec:
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
        kubernetes.io/cluster-service: "true"
        version: v1.22
      # This annotation ensures that fluentd does not get evicted if the node
      # supports critical pod annotation based priority scheme.
      # Note that this does not guarantee admission on the nodes (#40573).
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: efk
      containers:
      - name: fluentd-es
        image: 192.168.223.208:80/k8s/fluentd-elasticsearch:v1.22
        command:
          - '/bin/sh'
          - '-c'
          - '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log'
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      nodeSelector:
        beta.kubernetes.io/fluentd-ds-ready: "true"
      tolerations:
      - key : "node.alpha.kubernetes.io/ismaster"
        effect: "NoSchedule"
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers

配置 kibana

文件 vi kibana-controller.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: kibana-logging
  template:
    metadata:
      labels:
        k8s-app: kibana-logging
    spec:
      serviceAccountName: efk
      containers:
      - name: kibana-logging
        image: 192.168.223.208:80/k8s/kibana:v4.6.1
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
          requests:
            cpu: 100m
        env:
          - name: "ELASTICSEARCH_URL"
            value: "http://elasticsearch-logging:9200"
          - name: "KIBANA_BASE_URL"
            value: "/api/v1/proxy/namespaces/kube-system/services/kibana-logging"
        ports:
        - containerPort: 5601
          name: ui
          protocol: TCP

文件 vi kibana-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: kibana-logging
  namespace: kube-system
  labels:
    k8s-app: kibana-logging
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "Kibana"
spec:
  ports:
  - port: 5601
    protocol: TCP
    targetPort: ui
  selector:
    k8s-app: kibana-logging

给 Node 设置标签

定义 DaemonSet fluentd-es-v1.22 时设置了 nodeSelector beta.kubernetes.io/fluentd-ds-ready=true ,所以需要在期望运行 fluentd 的 Node 上设置该标签;

kubectl get nodes
NAME        STATUS    AGE       VERSION
192.168.223.206   Ready     1d        v1.6.0
192.168.223.207   Ready     1d        v1.6.0

kubectl label nodes 192.168.223.206 beta.kubernetes.io/fluentd-ds-ready=true
node "192.168.223.206" labeled

kubectl label nodes 192.168.223.207 beta.kubernetes.io/fluentd-ds-ready=true
node "192.168.223.207" labeled

执行定义文件

kubectl create -f .
serviceaccount "efk" created
clusterrolebinding "efk" created
replicationcontroller "elasticsearch-logging-v1" created
service "elasticsearch-logging" created
daemonset "fluentd-es-v1.22" created
deployment "kibana-logging" created
service "kibana-logging" created

检查执行结果

kubectl get deployment -n kube-system|grep kibana
kibana-logging         1         1         1            1           2m

kubectl get pods -n kube-system|grep -E 'elasticsearch|fluentd|kibana'
elasticsearch-logging-v1-mlstp          1/1       Running   0          1m
elasticsearch-logging-v1-nfbbf          1/1       Running   0          1m
fluentd-es-v1.22-31sm0                  1/1       Running   0          1m
fluentd-es-v1.22-bpgqs                  1/1       Running   0          1m
fluentd-es-v1.22-qmn7h                  1/1       Running   0          1m
kibana-logging-1432287342-0gdng         1/1       Running   0          1m

kubectl get service  -n kube-system|grep -E 'elasticsearch|kibana'
elasticsearch-logging   10.254.77.62    <none>        9200/TCP                        2m
kibana-logging          10.254.8.113    <none>        5601/TCP                        2m

kibana Pod 第一次启动时会用较长时间(10-20分钟)来优化和 Cache 状态页面,可以 tailf 该 Pod 的日志观察进度:

kubectl logs kibana-logging-1432287342-0gdng -n kube-system -f
ELASTICSEARCH_URL=http://elasticsearch-logging:9200
server.basePath: /api/v1/proxy/namespaces/kube-system/services/kibana-logging
{"type":"log","@timestamp":"2017-04-12T13:08:06Z","tags":["info","optimize"],"pid":7,"message":"Optimizing and caching bundles for kibana and statusPage. This may take a few minutes"}
{"type":"log","@timestamp":"2017-04-12T13:18:17Z","tags":["info","optimize"],"pid":7,"message":"Optimization of bundles for kibana and statusPage complete in 610.40 seconds"}
{"type":"log","@timestamp":"2017-04-12T13:18:17Z","tags":["status","plugin:kibana@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-04-12T13:18:18Z","tags":["status","plugin:elasticsearch@1.0.0","info"],"pid":7,"state":"yellow","message":"Status changed from uninitialized to yellow - Waiting for Elasticsearch","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:kbn_vislib_vis_types@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:markdown_vis@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:metric_vis@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:spyModes@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:statusPage@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["status","plugin:table_vis@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
{"type":"log","@timestamp":"2017-04-12T13:18:19Z","tags":["listening","info"],"pid":7,"message":"Server running at http://0.0.0.0:5601"}
{"type":"log","@timestamp":"2017-04-12T13:18:24Z","tags":["status","plugin:elasticsearch@1.0.0","info"],"pid":7,"state":"yellow","message":"Status changed from yellow to yellow - No existing Kibana index found","prevState":"yellow","prevMsg":"Waiting for Elasticsearch"}
{"type":"log","@timestamp":"2017-04-12T13:18:29Z","tags":["status","plugin:elasticsearch@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from yellow to green - Kibana index ready","prevState":"yellow","prevMsg":"No existing Kibana index found"}

访问 kibana

  1. 通过 kube-apiserver 访问: 获取 monitoring-grafana 服务 URL
kubectl cluster-info
 Kubernetes master is running at https://192.168.223.200:6443
 Elasticsearch is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging
 Heapster is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/heapster
 Kibana is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/kibana-logging
 KubeDNS is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns
 kubernetes-dashboard is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
 monitoring-grafana is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana
 monitoring-influxdb is running at https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb

浏览器访问 URL: https://192.168.223.200:6443/api/v1/proxy/namespaces/kube-system/services/kibana-logging/app/kibana 或者非安全连接: http://192.168.223.200:8080/api/v1/proxy/namespaces/kube-system/services/kibana-logging/app/kibana

  1. 通过 kubectl proxy 访问: 创建代理
kubectl proxy --address='192.168.223.204' --port=8086 --accept-hosts='^*$'
 Starting to serve on 192.168.223.204:8086

浏览器访问 URL:http://192.168.223.204:8086/api/v1/proxy/namespaces/kube-system/services/kibana-logging

在 Settings -> Indices 页面创建一个 index(相当于 mysql 中的一个 database),选中 Index contains time-based events,使用默认的 logstash-* pattern,点击 Create ; 创建Index后,可以在 Discover 下看到 ElasticSearch logging 中汇聚的日志;

可能遇到的问题 如果你在这里发现Create按钮是灰色的无法点击,且Time-filed name中没有选项,fluentd要读取/var/log/containers/目录下的log日志,这些日志是从/var/lib/docker/containers/${CONTAINER_ID}/${CONTAINER_ID}-json.log链接过来的,查看你的docker配置,—log-dirver需要设置为json-file格式,默认的可能是journald。

posted @ 2018-04-10 23:10  leffss  阅读(5028)  评论(0编辑  收藏  举报