Deploying SkyWalking v8.4.0 with Helm


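1. Introduction to Distributed Tracing

  In distributed, microservice, and K8S-based environments, tracing the request path of an application is essential (this is a core part of APM, Application Performance Management). Put simply, distributed tracing records the complete behavior of a request at every layer it passes through, from the moment traffic reaches the front end all the way to the database at the very back. It then presents that data visually for trace queries, dependency analysis, performance analysis, and topology views. A tracing system makes it much easier to locate problems, which is hard to achieve with conventional monitoring alone.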

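Common commercial products:

  • Tingyun (听云)
  • Bonree (博睿宏远)

Common open-source projects:

  • SkyWalking: China; started as an individual's open-source project and now belongs to the Apache Software Foundation; its author was recently elected as the first Chinese member of the Apache board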
  • Pinpoint: South Korea; open-sourced by Naver
  • Zipkin: USA; open-sourced by Twitter
  • CAT: China; open-sourced by Meituan

2. Environment

  • K8S v1.22.5 cluster
      Host      IP
      master    192.168.10.100
      node01    192.168.10.101
      node02    192.168.10.102
  • Elasticsearch v7.12.0
  • SkyWalking components
      skywalking-oap-server: backend service
      skywalking-ui: web UI
      skywalking-es-init: initializes the SkyWalking data in the ES cluster
      elasticsearch: stores SkyWalking's data and metrics

At the time of writing:

the latest SkyWalking release is 9.1.0

the latest Elasticsearch release is 8.3.2

3. Setting Up NFS Storage for the K8S Cluster

3.1 Create the namespace

[root@master ~]# kubectl create ns efk
namespace/efk created
[root@master ~]# kubectl get ns
NAME              STATUS   AGE
default           Active   22h
efk               Active   2s
ingress-nginx     Active   22h
istio-system      Active   19h
kube-node-lease   Active   22h
kube-public       Active   22h
kube-system       Active   22h
metallb-system    Active   22h

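3.2 Set up NFS

### The NFS server is installed on the master node here

# Install the nfs-utils and rpcbind packages (on ALL nodes)
yum -y install nfs-utils rpcbind

# Create the shared directory
sudo mkdir -p /nfsdata

# Open up the permissions
sudo chmod 777 -R /nfsdata

# Edit the exports file and add the line below
sudo vim /etc/exports
/nfsdata 192.168.10.0/24(rw,no_root_squash,sync)

# Start the services and enable them at boot
systemctl start rpcbind && systemctl enable rpcbind
systemctl start nfs && systemctl enable nfs    # on all nodes

# Apply the export configuration
exportfs -rv

# List the exported directories
sudo showmount -e 192.168.10.100
# Output like the following means the export works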
Export list for 192.168.10.100:
/nfsdata	192.168.10.0/24
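
showmount only proves that the export is advertised. To also confirm that a worker node can mount and write to it, a temporary mount can be tried. The helper below is a minimal sketch, not part of the original setup: the function name test_nfs_mount is hypothetical, and it assumes it is run as root on a node with nfs-utils installed.

```shell
# Hypothetical helper: mount an NFS export on a temp dir, try a write, clean up.
# Usage: test_nfs_mount <server-ip> <export-path>
test_nfs_mount() {
  local server="$1" export_path="$2" mnt rc
  # Create a throwaway mount point.
  mnt="$(mktemp -d)" || return 1
  # Try to mount the export; remove the mount point again on failure.
  mount -t nfs "$server:$export_path" "$mnt" || { rmdir "$mnt"; return 1; }
  # A successful create + delete shows the export is writable (rw).
  touch "$mnt/.write-test" && rm -f "$mnt/.write-test"
  rc=$?
  # Always unmount and remove the temporary directory.
  umount "$mnt"
  rmdir "$mnt"
  return $rc
}
```

For the environment in this post the call would be test_nfs_mount 192.168.10.100 /nfsdata; a zero exit status indicates that mounting and writing both worked.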

3.3 Create the StorageClass

Reference manifests: kubernetes-retired/external-storage, nfs-client/deploy directory on the master branch (github.com)

rbac.yaml: creates the ServiceAccount and the RBAC rules it needs

apiVersion: v1
kind: ServiceAccount
metadata:
  name: nfs-client-provisioner
  namespace: default 
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: run-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    namespace: default
roleRef:
  kind: ClusterRole
  name: nfs-client-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: leader-locking-nfs-client-provisioner
  namespace: default
subjects:
  - kind: ServiceAccount
    name: nfs-client-provisioner
    namespace: default
roleRef:
  kind: Role
  name: leader-locking-nfs-client-provisioner
  apiGroup: rbac.authorization.k8s.io

prvisor-deployment.yaml: creates the nfs-client-provisioner Deployment; it must be in the same namespace as the RBAC objects

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-client-provisioner
  labels:
    app: nfs-client-provisioner
  namespace: default 
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-client-provisioner
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: quay.io/external_storage/nfs-client-provisioner:latest
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: wuchang-nfs-storage 
            - name: NFS_SERVER
              value: 192.168.10.100   # NFS server IP address
            - name: NFS_PATH
              value: /nfsdata        # exported NFS directory
      volumes:
        - name: nfs-client-root
          nfs:
            server: 192.168.10.100    # NFS server IP address
            path: /nfsdata           # exported NFS directory

storageclass.yaml: creates the StorageClass

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-nfs-storage
provisioner: wuchang-nfs-storage

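# Allow PVCs to be expanded after creation
allowVolumeExpansion: true

parameters:
# Data retention when a PVC is deleted: "true" archives the backing directory on the NFS share, "false" deletes it
  archiveOnDelete: "false"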

Apply the manifests in order:

kubectl apply -f rbac.yaml
kubectl get serviceaccount

kubectl apply -f prvisor-deployment.yaml
kubectl get deploy

kubectl apply -f storageclass.yaml
kubectl get sc
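
Before moving on, it can help to confirm that the provisioner is actually up rather than just created. The sketch below wraps two standard kubectl calls in one helper. The resource names come from the manifests above; the function name check_provisioner and the 120s timeout are illustrative assumptions.

```shell
# Hypothetical helper: verify the NFS provisioner from section 3.3 is ready.
# Usage: check_provisioner [namespace]
check_provisioner() {
  local ns="${1:-default}"
  # Block until the Deployment has rolled out (or the timeout expires).
  kubectl rollout status deployment/nfs-client-provisioner -n "$ns" --timeout=120s || return 1
  # Print the provisioner name recorded in the StorageClass.
  kubectl get storageclass managed-nfs-storage -o jsonpath='{.provisioner}'
}
```

With the manifests in this post, check_provisioner default is expected to end by printing the provisioner name set in storageclass.yaml.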

4. Installing Elasticsearch on K8S

Optionally pull the images in advance:

docker pull elasticsearch:x.x.x
docker pull apache/skywalking-oap-server:9.0.0
docker pull apache/skywalking-ui:9.0.0

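es-pvc.yaml: the ServiceAccount and the provisioner Deployment must stay in the default namespace; the PVC can be created in any namespace, as long as it is the namespace of the Pod that mounts it.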

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: es-pvc
  namespace: default
spec:
  storageClassName: "managed-nfs-storage" # StorageClass used for dynamic provisioning
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi

es-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch-single
  namespace: default
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: 9200
  selector:
    k8s-app: elasticsearch-single

elasticsearch-single.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: elasticsearch-single
  namespace: default
  labels:
    k8s-app: elasticsearch-single
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: elasticsearch-single
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-single
    spec:
      containers:
      - image: elasticsearch:7.12.0		# latest release at the time of writing: 8.3.2
        name: elasticsearch-single
        resources:
          limits:
            cpu: 2
            memory: 3Gi
          requests:
            cpu: 0.5 
            memory: 500Mi
        env:
          - name: "discovery.type"
            value: "single-node"
          - name: ES_JAVA_OPTS
            value: "-Xms512m -Xmx2g" 
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        volumeMounts:
        - name: elasticsearch-data
          mountPath: /usr/share/elasticsearch/data
      volumes:
      - name: elasticsearch-data
        persistentVolumeClaim:
          claimName: es-pvc

Apply:

kubectl apply -f es-pvc.yaml
kubectl get pv,pvc

kubectl apply -f es-svc.yaml
kubectl get svc

kubectl apply -f elasticsearch-single.yaml
kubectl get pod

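Problem:

The PVC is not provisioned: kubectl get pvc -n default shows Pending. The cause is the K8S version. From v1.20 on, selfLink is disabled by default (the RemoveSelfLink feature gate defaults to true), and this nfs-client-provisioner image still depends on it. (selfLink is the URL through which a resource can be reached via the API; for a Pod it might look like /api/v1/namespaces/ns36aa8455/pods/sc-cluster-test-1-6bc58d44d6-r8hld.)

Fix: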

[root@k8sm storage]# vi /etc/kubernetes/manifests/kube-apiserver.yaml
apiVersion: v1
···
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    - --feature-gates=RemoveSelfLink=false # add this flag
Restart kube-apiserver:

# If K8S was installed from binaries, run systemctl restart kube-apiserver
# If K8S was installed with kubeadm, kube-apiserver runs as a static pod
[root@k8sm manifests]# ps aux|grep kube-apiserver
[root@k8sm manifests]# kill -9 [Pid]	# it may already have restarted automatically
[root@k8sm manifests]# kubectl apply -f /etc/kubernetes/manifests/kube-apiserver.yaml
...
[root@master ~]# kubectl get pods -A | grep kube-apiserver-master	# check AGE to confirm the restart
......
[root@k8sm storage]# kubectl get pvc	# the PVC now shows Bound
NAME     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
my-pvc   Bound    pvc-ae9f6d4b-fc4c-4e19-8854-7bfa259a3a04   1Gi        RWX            example-nfs    13m
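
The feature gate is only a workaround, and RemoveSelfLink can no longer be switched off from K8S v1.24 on. A commonly used alternative, which is not what the original post did, is to point the Deployment at the maintained successor image from the kubernetes-sigs nfs-subdir-external-provisioner project, which does not rely on selfLink. The sketch below assumes the image reference shown (check the project's releases for the current tag) and uses a hypothetical function name; the project's own rbac.yaml may grant more permissions than the rbac.yaml above, so compare the two before relying on this.

```shell
# Sketch: swap the provisioner image instead of re-enabling selfLink.
# IMAGE is an assumption; verify the tag against the project's releases.
IMAGE="registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2"

# Usage: switch_provisioner_image [namespace]
switch_provisioner_image() {
  local ns="${1:-default}"
  # "nfs-client-provisioner" is both the Deployment name and the container
  # name in prvisor-deployment.yaml.
  kubectl set image deployment/nfs-client-provisioner \
    nfs-client-provisioner="$IMAGE" -n "$ns"
}
```

After the Deployment restarts with the new image, a Pending PVC should be picked up without changing the kube-apiserver flags.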

5. Installing SkyWalking

5.1 Install Helm

curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
curl http://49.232.8.65/shell/helm/helm-v3.5.0_install.sh | bash
------------------------------------------------------------------------------------
[root@master ~]# curl https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 11156  100 11156    0     0  17974      0 --:--:-- --:--:-- --:--:-- 17964
Downloading https://get.helm.sh/helm-v3.9.0-linux-amd64.tar.gz
Verifying checksum... Done.
Preparing to install helm into /usr/local/bin
helm installed into /usr/local/bin/helm

5.2 Initialize the SkyWalking chart

Clone the Helm chart repository:

git clone https://github.com/apache/skywalking-kubernetes
cd skywalking-kubernetes/chart && ls

Add the Elastic repo. It is required even when an external ES is used; otherwise the dependency update fails.

helm repo add elastic https://helm.elastic.co
helm dep up skywalking
export SKYWALKING_RELEASE_NAME=skywalking
export SKYWALKING_RELEASE_NAMESPACE=skywalking
-------------------------------------------------------
[root@master ~/skywalking-kubernetes/chart]# ls
skywalking
[root@master ~/skywalking-kubernetes/chart]# helm dep up skywalking
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "elastic" chart repository
Update Complete. ⎈Happy Helming!⎈
Saving 1 charts
Downloading elasticsearch from repo https://helm.elastic.co/
Deleting outdated charts
[root@master ~/skywalking-kubernetes/chart]# export SKYWALKING_RELEASE_NAME=skywalking
[root@master ~/skywalking-kubernetes/chart]# export SKYWALKING_RELEASE_NAMESPACE=skywalking

Create the skywalking namespace:

[root@master ~]# kubectl create namespace skywalking
namespace/skywalking created
[root@master ~]# kubectl get ns
NAME              STATUS   AGE
default           Active   2d1h
efk               Active   124m
ingress-nginx     Active   2d
istio-system      Active   46h
kube-node-lease   Active   2d1h
kube-public       Active   2d1h
kube-system       Active   2d1h
metallb-system    Active   2d
skywalking        Active   4s

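5.3 Configure the SkyWalking values

After initialization, adjust the configuration to your own setup:

1️⃣ Point oap-server at an external ES with values-my-es-01.yaml (a file you create yourself)

2️⃣ Use values-my-es.yaml, the ES configuration example that ships with the chart

Pull the images in advance: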
docker pull skywalking.docker.scarf.sh/apache/skywalking-oap-server:9.1.0
docker pull skywalking.docker.scarf.sh/apache/skywalking-ui:9.1.0

Edit values.yaml:

[root@master ~/skywalking-kubernetes/chart/skywalking]# vim values.yaml
----------------------------------------
......
image:
    repository: skywalking.docker.scarf.sh/apache/skywalking-oap-server
  change to:
    repository: docker.mirrors.ustc.edu.cn/apache/skywalking-oap-server
......

If you use an external ES:

# latest SkyWalking release at the time of writing: 9.1.0
[root@master ~/skywalking-kubernetes/chart/skywalking]# cat values-my-es-01.yaml
oap:
  image:
    tag: 8.4.0-es7 
  storageType: elasticsearch7

ui:
  image:
    tag: 8.4.0
  service:
    type: NodePort
    externalPort: 80
    internalPort: 8080
    nodePort: 30008
elasticsearch:
  enabled: false
  config: 
    # {SERVICE_NAME}.{NAMESPACE_NAME}.svc.cluster.local
    host: elasticsearch-single.default  # elasticsearch-single is the Service name, default is the namespace
    port:
      http: 9200
    # user: "elastic"         # [optional]
    # password: "elastic"     # [optional]
    # this ES instance has no username or password configured
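
The host value follows the in-cluster DNS pattern from the comment, {SERVICE_NAME}.{NAMESPACE_NAME}.svc.cluster.local; the short form elasticsearch-single.default used above resolves to the same Service. The small helper below is only an illustration of how the full name is assembled. The function name svc_fqdn is hypothetical, and cluster.local assumes the cluster uses the default DNS domain.

```shell
# Build the in-cluster DNS name of a Service.
# Usage: svc_fqdn <service> <namespace> [cluster-domain]
svc_fqdn() {
  local svc="$1" ns="$2" domain="${3:-cluster.local}"
  printf '%s.%s.svc.%s\n' "$svc" "$ns" "$domain"
}

svc_fqdn elasticsearch-single default
# prints: elasticsearch-single.default.svc.cluster.local
```

Using the fully qualified name in the values file avoids any dependence on the DNS search path of the oap pods, which live in a different namespace than ES.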

5.4 Install SkyWalking 8.4.0 with Helm

cd /root/skywalking-kubernetes/chart/

--- Option 1: set the versions directly on the command line; this does not use the external ES
helm install "${SKYWALKING_RELEASE_NAME}" skywalking -n "${SKYWALKING_RELEASE_NAMESPACE}"   --set oap.image.tag=8.4.0-es7   --set oap.storageType=elasticsearch7   --set ui.image.tag=8.4.0   --set elasticsearch.imageTag=7.12.0

--- Option 2: use the external ES (setting environment variables is arguably overkill here)
helm install "${SKYWALKING_RELEASE_NAME}" skywalking -n "${SKYWALKING_RELEASE_NAMESPACE}" -f ./skywalking/values-my-es-01.yaml

-----------------------------------------------------------------------------------------
[root@master ~/skywalking-kubernetes/chart]# helm install skywalking skywalking -n  skywalking  -f ./skywalking/values-my-es-01.yaml
NAME: skywalking
LAST DEPLOYED: Sat Jul  9 15:05:23 2022
NAMESPACE: skywalking
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
************************************************************************
*                                                                      *
*                 SkyWalking Helm Chart by SkyWalking Team             *
*                                                                      *
************************************************************************

Thank you for installing skywalking.

Your release is named skywalking.

Learn more, please visit https://skywalking.apache.org/

Get the UI URL by running these commands:
  export NODE_PORT=$(kubectl get --namespace skywalking -o jsonpath="{.spec.ports[0].nodePort}" services skywalking-ui)
  export NODE_IP=$(kubectl get nodes --namespace skywalking -o jsonpath="{.items[0].status.addresses[0].address}")
  echo http://$NODE_IP:$NODE_PORT

Uninstall SkyWalking:

helm uninstall skywalking -n skywalking

Watch the pods come up:

[root@master ~/skywalking-kubernetes/chart]# kubectl get pod -n skywalking -w
NAME                              READY   STATUS            RESTARTS   AGE
skywalking-es-init--1-fnnbn       0/1     PodInitializing   0          42s
skywalking-oap-7596f94959-97ccs   0/1     PodInitializing   0          42s
skywalking-oap-7596f94959-t5sxv   0/1     PodInitializing   0          42s
skywalking-ui-7957d9fb5f-drpwg    1/1     Running           0          42s
......

To expose the SkyWalking UI port temporarily, use port-forward as shown here. I opened the port with a NodePort instead; in production an Ingress is another option.

export POD_NAME=$(kubectl get pods --namespace skywalking -l "app=skywalking,release=skywalking,component=ui" -o jsonpath="{.items[0].metadata.name}")
kubectl port-forward $POD_NAME 8080:8080 --namespace skywalking

Get the SkyWalking URL: the IP of any K8S master/node plus the NodePort

export NODE_PORT=$(kubectl get --namespace skywalking -o jsonpath="{.spec.ports[0].nodePort}" services skywalking-ui)
export NODE_IP=$(kubectl get nodes --namespace skywalking -o jsonpath="{.items[0].status.addresses[0].address}")
echo http://$NODE_IP:$NODE_PORT

-----------------------------------------------------------------------------------------
[root@master ~]#kubectl get --namespace skywalking -o jsonpath="{.spec.ports[0].nodePort}" services skywalking-ui
30008
[root@master ~]#kubectl get nodes --namespace skywalking -o jsonpath="{.items[0].status.addresses[0].address}"
192.168.10.65

Check the running state:

# docker.mirrors.ustc.edu.cn/apache/skywalking-oap-server   8.4.0-es7  # best to pull this image in advance
# A pod stuck in PodInitializing here means the image has not finished downloading
[root@master ~]#kubectl get pods,svc -o wide -n skywalking
NAME                                        READY   STATUS            RESTARTS      AGE   IP            NODE     NOMINATED NODE   READINESS GATES
pod/elasticsearch-single-6768b6454b-l5tds   1/1     Running           0             31m   10.244.2.17   node02   <none>           <none>
pod/skywalking-es-init--1-kqkz2             0/1     Completed         0             28m   10.244.1.32   node01   <none>           <none>
pod/skywalking-oap-666d7ffb45-bd2bl         0/1     PodInitializing   0             28m   10.244.2.18   node02   <none>           <none>
pod/skywalking-oap-666d7ffb45-kpl7p         1/1     Running           1 (24m ago)   28m   10.244.1.31   node01   <none>           <none>
pod/skywalking-ui-9c4c5f495-vwnrz           1/1     Running           0             28m   10.244.1.30   node01   <none>           <none>

NAME                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)               AGE   SELECTOR
service/elasticsearch-single   ClusterIP   10.102.62.187    <none>        9200/TCP              31m   k8s-app=elasticsearch-single
service/skywalking-oap         ClusterIP   10.101.90.191    <none>        11800/TCP,12800/TCP   28m   app=skywalking,component=oap,release=skywalking
service/skywalking-ui          NodePort    10.111.254.170   <none>        80:30008/TCP          28m   app=skywalking,component=ui,release=skywalking

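Sometimes the init and oap pods fail to start and report that they are waiting for the ES container. This means oap has not detected the ES storage. The connection settings are the likely cause, although the behavior can look erratic, so it needs troubleshooting.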
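
A few generic checks usually narrow this down. The sketch below is one possible starting point rather than a definitive procedure: the function name debug_oap_storage is hypothetical, the ES URL and labels are taken from this post, and curlimages/curl is an assumed helper image that the cluster must be able to pull.

```shell
# Sketch: first checks when the es-init / oap pods hang waiting for ES.
# Usage: debug_oap_storage [namespace] [es-url]
debug_oap_storage() {
  local ns="${1:-skywalking}" es_url="${2:-http://elasticsearch-single.default:9200}"
  # 1. Is ES reachable from inside the SkyWalking namespace?
  kubectl run es-check -n "$ns" --rm -i --restart=Never \
    --image=curlimages/curl --command -- curl -sS "$es_url/_cluster/health"
  # 2. What are the oap pods waiting on? Events are listed at the end.
  kubectl describe pods -n "$ns" -l app=skywalking,component=oap
  # 3. Recent logs from the containers of the oap Deployment.
  kubectl logs deployment/skywalking-oap -n "$ns" --all-containers=true --tail=50
}
```

If step 1 fails, the problem lies in the ES Service name, namespace, or port in the values file rather than in SkyWalking itself.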

Access:

http://nodeIP:nodePort




Posted 2022-07-10 23:28 by 公博义