Kubernetes APM链路追踪Skywalking
随着RPC框架、微服务、云计算、大数据的发展,业务的规模和深度相比过往也都在增加。一个业务可能横跨多个模块/服务/容器,依赖的中间件也越来越多,其中任何一个节点出现异常,都可能导致业务出现波动或者异常,这就导致服务质量监控和异常诊断/定位变得异常复杂。于是催生了新的业务监控模式:调用链跟踪系统APM
在诸多优秀的开源APM产品中Skywalking
和Pinpoint
脱颖而出,两款产品都通过字节码注入的方式,实现了对代码完全无任何侵入。对比如下:
前面我们介绍过单纯Docker方式(
docker-compose
)部署Pinpoint, 可以提供参考。本节我们介绍在Kubernetes上部署Skywalking。
1、Helm3
curl -LO https://get.helm.sh/helm-v3.2.4-linux-amd64.tar.gz tar -zxf helm-v3.2.4-linux-amd64.tar.gz cp linux-amd64/helm /usr/local/bin/helm3
2、服务端
Skywalking后端存储,使用EFK日志系统的ES集群。注意index加前缀区分
详细的Elasticsearch集群部署可以参考:Kubernetes日志系统EFK
cd ~/k8s/helm/charts git clone https://github.com/apache/skywalking-kubernetes.git cd skywalking-kubernetes/chart helm dep up skywalking # 创建namespace kubectl create ns skywalking # 准备values文件, 详见Values vim skywalking/values.yaml # helm3 install skywalking skywalking -n skywalking --values ./skywalking/values.yaml helm3 -n skywalking list helm3 -n skywalking delete skywalking helm3 -n skywalking upgrade skywalking --values ./skywalking/values.yaml
Helm Values
oap: name: oap dynamicConfigEnabled: false image: repository: apache/skywalking-oap-server tag: 8.1.0-es7 pullPolicy: IfNotPresent storageType: elasticsearch7 # 存储类型es7 ports: grpc: 11800 rest: 12800 replicas: 2 service: type: ClusterIP javaOpts: -Xmx2g -Xms2g antiAffinity: "soft" nodeAffinity: {} nodeSelector: {} tolerations: [] resources: {} env: SW_NAMESPACE: "skywalking" # es索引前缀skywalking_, _下划线会自动加上 ui: name: ui replicas: 1 image: repository: apache/skywalking-ui tag: 8.1.0 pullPolicy: IfNotPresent ingress: enabled: true annotations: {} path: / hosts: - skywalking.boer.xyz # ingress地址 tls: [] elasticsearch: enabled: false # 关闭内置es,我们使用EFK日志系统的ES集群 config: port: http: 9200 host: "elasticsearch-logging.logging.svc" # 日志系统ES地址 user: "elastic" password: "<your-es-password>"
3、客户端
制作skywalking-agent镜像
cd ~/k8s/apps/skywalking-agent tar -zxf apache-skywalking-apm-es7-8.1.0.tar.gz cp apache-skywalking-apm-bin-es7/agent agent vim Dockerfile # 准备Dockerfile, 详见Dockerfile docker build -t registry.boer.xyz/public/skywalking-agent:8.1.0 . docker push registry.boer.xyz/public/skywalking-agent:8.1.0
Dockerfile
FROM busybox:latest ENV LANG=C.UTF-8 WORKDIR /usr/skywalking/agent COPY agent/ .
skywalking-agent配置
# vim agent/config agent.service_name=${SW_AGENT_NAME:Your_ApplicationName} # 服务名:区分不同服务,通过环境变量设置 agent.instance_name=${HOSTNAME} # 实例名:区分多实例,取Pod主机名 collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:skywalking-oap.skywalking.svc:11800} # 服务端地址 logging.file_name=${SW_LOGGING_FILE_NAME:skywalking-api.log} logging.level=${SW_LOGGING_LEVEL:INFO} logging.max_file_size=${SW_LOGGING_MAX_FILE_SIZE:31457280}
4、使用示例
使用skywalking-agent
一般会想到两种方法:
- 将 agent 包构建到已经存在的基础镜像中
- 通过
initContainer
方式拷贝Agent
initContainer方式将skywalking-agent
拷贝到应用Pod中,无需修改基础JVM镜像,所以更推荐此方法:
apiVersion: apps/v1 kind: Deployment metadata: name: produce-deployment annotations: kubernetes.io/change-cause: <CHANGE_CAUSE> spec: selector: matchLabels: app: produce replicas: 2 template: metadata: labels: app: produce spec: initContainers: - image: registry.boer.xyz/public/skywalking-agent:8.1.0 name: skywalking-agent imagePullPolicy: IfNotPresent command: ['sh'] args: ['-c','cp -r /usr/skywalking/agent/* /skywalking/agent'] volumeMounts: - mountPath: /skywalking/agent name: skywalking-agent containers: - name: produce image: <IMAGE>:<IMAGE_TAG> imagePullPolicy: IfNotPresent volumeMounts: - mountPath: /usr/skywalking/agent name: skywalking-agent ports: - containerPort: 10080 resources: requests: memory: "512Mi" cpu: "200m" limits: memory: "1Gi" cpu: "600m" env: - name: ENVIRONMENT value: "pro" - name: SW_AGENT_NAME # sw服务名 value: "springboot-produce" - name: JVM_OPTS value: "-Xms512m -Xmx512m -javaagent:/usr/skywalking/agent/skywalking-agent.jar" livenessProbe: httpGet: path: /actuator/health port: 10080 initialDelaySeconds: 10 periodSeconds: 10 timeoutSeconds: 5 readinessProbe: httpGet: path: /actuator/health port: 10080 initialDelaySeconds: 10 periodSeconds: 10 timeoutSeconds: 5 lifecycle: preStop: exec: command: - "curl" - "-XPOST" - "http://127.0.0.1:10080/actuator/shutdown" imagePullSecrets: - name: regcred volumes: - name: skywalking-agent emptyDir: {}
5、Skywalking ES存储索引管理
详细iLM索引生命周期,见Kubernetes日志系统EFK一文
PUT _ilm/policy/skywalking-policy { "policy": { "phases": { "warm": { "min_age": "2d", "actions": { "forcemerge": { "max_num_segments": 1 } } }, "delete": { "min_age": "3d", "actions": { "delete": {} } } } } } PUT template/skywalking-template { "index_patterns": ["skywalking*"], // 这里完全匹配skywalking索引前缀,即SW_NAMESPACE "settings": { "number_of_shards": 3, "number_of_replicas": 0, "index.lifecycle.name": "skywalking-policy", "index.refresh_interval": "30s", "index.translog.durability": "async", "index.translog.sync_interval":"60s" } }
6、The show
Ref
- https://github.com/apache/skywalking-kubernetes
- https://skywalking.apache.org/zh/blog/2019-08-30-how-to-use-Skywalking-Agent.html
- https://skywalking.apache.org/zh/blog/2019-02-24-skywalking-pk-pinpoint.html
- https://skywalking.apache.org/zh/blog/2019-11-07-skywalking-elasticsearch-storage-optimization.html
- https://www.boer.xyz/2020/08/16/k8s-apm-skywalking/
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 基于Microsoft.Extensions.AI核心库实现RAG应用
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· Manus爆火,是硬核还是营销?
· 终于写完轮子一部分:tcp代理 了,记录一下
· 震惊!C++程序真的从main开始吗?99%的程序员都答错了
· 别再用vector<bool>了!Google高级工程师:这可能是STL最大的设计失误
· 单元测试从入门到精通