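1. First, an architecture diagram of the Zipkin-based tracing data flow
1.1 Applications send the span data they collect to Kafka
1.2 Zipkin consumes the messages from Kafka
1.3 Zipkin writes the data to MySQL
1.4 A reverse proxy in front of the Zipkin cluster routes web UI requests
2. Next, the procedure for each step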
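2.1 Taking a Java application as an example: Spring Cloud Sleuth wraps most of the work, so once the dependency is added only configuration is needed. Only the sender part of the configuration is shown here. The default topic is zipkin. Because Zipkin is deployed as a cluster and each partition is consumed by only one member of a consumer group, the topic's partition count should be greater than or equal to the number of Zipkin instances.
spring.zipkin.sender.type=kafka
spring.kafka.bootstrap-servers=127.0.0.1:9092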
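To illustrate why the partition count matters, here is a minimal sketch that spreads the partitions of one topic over the members of a single consumer group. It is a simplified round-robin model, not Kafka's actual assignor, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a simplified model of how the partitions of one topic
// are spread across the consumers of a single group. Not Kafka's real assignor.
class PartitionSketch {

    // Returns, for each consumer index, the list of partitions it would own.
    static List<List<Integer>> assign(int partitions, int consumers) {
        List<List<Integer>> owned = new ArrayList<>();
        for (int c = 0; c < consumers; c++) {
            owned.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            owned.get(p % consumers).add(p); // each partition has exactly one owner
        }
        return owned;
    }

    // Counts consumers that received no partition and therefore stay idle.
    static long idleConsumers(int partitions, int consumers) {
        return assign(partitions, consumers).stream().filter(List::isEmpty).count();
    }

    public static void main(String[] args) {
        System.out.println(assign(2, 3)); // fewer partitions than Zipkin instances
        System.out.println(assign(3, 3)); // one partition per Zipkin instance
    }
}
```

With two partitions and three Zipkin instances, one instance in this model receives nothing to consume; with three or more partitions every instance gets a share.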
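2.2 The Zipkin documentation describes a concise way to configure the server: because Zipkin is a Spring Boot application, it can be configured at startup like any other Spring Boot project.
Create a zipkin-server.properties file in the directory that contains zipkin.jar. Variables set in this file override the defaults.
The Zipkin collector consumes data from Kafka with the following configuration:
zipkin.collector.kafka.enabled=true
zipkin.collector.kafka.bootstrap-servers=172.16.101.74:9092,172.16.101.75:9092,172.16.101.76:9092
zipkin.collector.kafka.topic=zipkin
zipkin.collector.kafka.group-id=zipkin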
Note that Zipkin can run several collectors at the same time (HTTP, gRPC, Kafka, and so on), so it can receive spans sent over HTTP while also consuming messages from Kafka.
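As a rough illustration of the HTTP path, the sketch below builds a single span in Zipkin's v2 JSON format and posts it to the `/api/v2/spans` endpoint. The class name, the span name `demo`, the sample IDs and the base URL are assumptions made for the example; it also assumes Java 11 or later for `java.net.http`.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper: reports one hand-built span over HTTP instead of Kafka.
class HttpSpanSketch {

    // Builds a one-element JSON array in Zipkin's v2 span format.
    static String spanJson(String traceId, String spanId, String service,
                           long timestampMicros, long durationMicros) {
        return String.format(
            "[{\"traceId\":\"%s\",\"id\":\"%s\",\"name\":\"demo\","
                + "\"timestamp\":%d,\"duration\":%d,"
                + "\"localEndpoint\":{\"serviceName\":\"%s\"}}]",
            traceId, spanId, timestampMicros, durationMicros, service);
    }

    // Posts the payload to the HTTP collector and returns the status code.
    static int send(String baseUrl, String json) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/api/v2/spans"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
        HttpResponse<String> response =
            HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Only prints the payload; call send(...) against a running Zipkin to submit it.
        System.out.println(spanJson("463ac35c9f6413ad", "a2fb4a1d1a96d312",
            "demo-service", 1700000000000000L, 1500L));
    }
}
```

Against a local instance, something like `send("http://127.0.0.1:9411", json)` would submit the span, assuming Zipkin listens on its default port 9411.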
2.3 Zipkin storage supports Elasticsearch, MySQL, Cassandra, and others. This article uses MySQL; the schema script ships with the Zipkin project and is reproduced below:

CREATE TABLE IF NOT EXISTS zipkin_spans (
  `trace_id_high` BIGINT NOT NULL DEFAULT 0 COMMENT 'If non zero, this means the trace uses 128 bit traceIds instead of 64 bit',
  `trace_id` BIGINT NOT NULL,
  `id` BIGINT NOT NULL,
  `name` VARCHAR(255) NOT NULL,
  `remote_service_name` VARCHAR(255),
  `parent_id` BIGINT,
  `debug` BIT(1),
  `start_ts` BIGINT COMMENT 'Span.timestamp(): epoch micros used for endTs query and to implement TTL',
  `duration` BIGINT COMMENT 'Span.duration(): micros used for minDuration and maxDuration query',
  PRIMARY KEY (`trace_id_high`, `trace_id`, `id`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED CHARACTER SET=utf8 COLLATE utf8_general_ci;

ALTER TABLE zipkin_spans ADD INDEX(`trace_id_high`, `trace_id`) COMMENT 'for getTracesByIds';
ALTER TABLE zipkin_spans ADD INDEX(`name`) COMMENT 'for getTraces and getSpanNames';
ALTER TABLE zipkin_spans ADD INDEX(`remote_service_name`) COMMENT 'for getTraces and getRemoteServiceNames';
ALTER TABLE zipkin_spans ADD INDEX(`start_ts`) COMMENT 'for getTraces ordering and range';

CREATE TABLE IF NOT EXISTS zipkin_annotations (
  `trace_id_high` BIGINT NOT NULL DEFAULT 0 COMMENT 'If non zero, this means the trace uses 128 bit traceIds instead of 64 bit',
  `trace_id` BIGINT NOT NULL COMMENT 'coincides with zipkin_spans.trace_id',
  `span_id` BIGINT NOT NULL COMMENT 'coincides with zipkin_spans.id',
  `a_key` VARCHAR(255) NOT NULL COMMENT 'BinaryAnnotation.key or Annotation.value if type == -1',
  `a_value` BLOB COMMENT 'BinaryAnnotation.value(), which must be smaller than 64KB',
  `a_type` INT NOT NULL COMMENT 'BinaryAnnotation.type() or -1 if Annotation',
  `a_timestamp` BIGINT COMMENT 'Used to implement TTL; Annotation.timestamp or zipkin_spans.timestamp',
  `endpoint_ipv4` INT COMMENT 'Null when Binary/Annotation.endpoint is null',
  `endpoint_ipv6` BINARY(16) COMMENT 'Null when Binary/Annotation.endpoint is null, or no IPv6 address',
  `endpoint_port` SMALLINT COMMENT 'Null when Binary/Annotation.endpoint is null',
  `endpoint_service_name` VARCHAR(255) COMMENT 'Null when Binary/Annotation.endpoint is null'
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED CHARACTER SET=utf8 COLLATE utf8_general_ci;

ALTER TABLE zipkin_annotations ADD UNIQUE KEY(`trace_id_high`, `trace_id`, `span_id`, `a_key`, `a_timestamp`) COMMENT 'Ignore insert on duplicate';
ALTER TABLE zipkin_annotations ADD INDEX(`trace_id_high`, `trace_id`, `span_id`) COMMENT 'for joining with zipkin_spans';
ALTER TABLE zipkin_annotations ADD INDEX(`trace_id_high`, `trace_id`) COMMENT 'for getTraces/ByIds';
ALTER TABLE zipkin_annotations ADD INDEX(`endpoint_service_name`) COMMENT 'for getTraces and getServiceNames';
ALTER TABLE zipkin_annotations ADD INDEX(`a_type`) COMMENT 'for getTraces and autocomplete values';
ALTER TABLE zipkin_annotations ADD INDEX(`a_key`) COMMENT 'for getTraces and autocomplete values';
ALTER TABLE zipkin_annotations ADD INDEX(`trace_id`, `span_id`, `a_key`) COMMENT 'for dependencies job';

CREATE TABLE IF NOT EXISTS zipkin_dependencies (
  `day` DATE NOT NULL,
  `parent` VARCHAR(255) NOT NULL,
  `child` VARCHAR(255) NOT NULL,
  `call_count` BIGINT,
  `error_count` BIGINT,
  PRIMARY KEY (`day`, `parent`, `child`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED CHARACTER SET=utf8 COLLATE utf8_general_ci;
The configuration for using MySQL as Zipkin storage is as follows:
zipkin.storage.type=mysql
zipkin.storage.mysql.host=127.0.0.1
zipkin.storage.mysql.port=3306
zipkin.storage.mysql.username=root
zipkin.storage.mysql.password=Root_2023
zipkin.storage.mysql.db=zipkin
Note that when there is a lot of trace data, the MySQL tables grow rapidly and need to be purged on a schedule. MySQL scheduled events that delete everything older than 24 hours are enough for this (the MySQL event scheduler must be enabled for the events to run):
CREATE EVENT clean_zipkin_spans
  ON SCHEDULE EVERY 24 HOUR
  DO DELETE FROM zipkin.zipkin_spans
     WHERE (start_ts / 1000000) < UNIX_TIMESTAMP(NOW()) - 24 * 60 * 60;

CREATE EVENT clean_zipkin_annotations
  ON SCHEDULE EVERY 24 HOUR
  DO DELETE FROM zipkin.zipkin_annotations
     WHERE (a_timestamp / 1000000) < UNIX_TIMESTAMP(NOW()) - 24 * 60 * 60;
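The cutoff arithmetic in these events can be sketched in Java. Zipkin stores `start_ts` and `a_timestamp` as epoch microseconds, which is why the SQL divides by 1000000 before comparing with `UNIX_TIMESTAMP`. The class and method names below are hypothetical and only mirror that comparison.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper mirroring the arithmetic of the cleanup events.
class RetentionCutoff {

    // Converts "now minus retention" from epoch seconds to epoch microseconds.
    static long cutoffMicros(Instant now, Duration retention) {
        long cutoffSeconds = now.getEpochSecond() - retention.getSeconds();
        return cutoffSeconds * 1_000_000L;
    }

    // A row is expired when its microsecond timestamp is older than the cutoff.
    static boolean isExpired(long timestampMicros, Instant now, Duration retention) {
        return timestampMicros < cutoffMicros(now, retention);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        long twoDaysAgoMicros = (now.getEpochSecond() - 2 * 86_400L) * 1_000_000L;
        System.out.println(isExpired(twoDaysAgoMicros, now, Duration.ofHours(24)));
    }
}
```

Comparing the microsecond column directly against a microsecond cutoff is equivalent to dividing the column by 1000000 and comparing in seconds.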
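2.4 Taking a three-node cluster as an example: every Zipkin instance uses the same configuration. Once a reverse proxy is configured in front of the cluster, the whole cluster is ready.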
As an aside: while using distributed tracing, have you ever wondered how a trace ID is generated? The Brave code that generates it is:
// brave.internal.Platform.Jre7#randomLong
@IgnoreJRERequirement
@Override
public long randomLong() {
  return java.util.concurrent.ThreadLocalRandom.current().nextLong();
}
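To make this concrete, the following sketch draws a random 64-bit value with the same `ThreadLocalRandom` call and renders it as the 16-character lower-hex string in which 64-bit trace IDs are usually displayed. The class and method names are hypothetical and are not part of Brave's API.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper, not Brave's own API: a random 64-bit ID and its hex form.
class TraceIdSketch {

    // Draws a random 64-bit value with the same call the Brave snippet makes.
    static long nextId() {
        return ThreadLocalRandom.current().nextLong();
    }

    // Renders the ID as 16 lower-hex characters, zero-padded on the left.
    static String toLowerHex(long id) {
        return String.format("%016x", id);
    }

    public static void main(String[] args) {
        System.out.println(toLowerHex(nextId()));
    }
}
```

Because the ID is simply a random long, negative values are possible; the hex rendering treats the value as unsigned, so every ID still maps to exactly 16 characters.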