可观测-Opentelemetry链路追踪原理
下面先来看一些基本概念
信息传播器-Propagator
// TextMapPropagator propagates cross-cutting concerns as key-value text
// pairs within a carrier that travels in-band across process boundaries.
type TextMapPropagator interface {
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.
// Inject set cross-cutting concerns from the Context into the carrier.
Inject(ctx context.Context, carrier TextMapCarrier)
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.
// Extract reads cross-cutting concerns from the carrier into a Context.
Extract(ctx context.Context, carrier TextMapCarrier) context.Context
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.
// Fields returns the keys whose values are set with Inject.
Fields() []string
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.
}
链路信息的传播器,主要做以下工作:
将trace信息注入数据媒介carrier中
从媒介中提取trace信息
数据媒介-Carrier
// TextMapCarrier is the storage medium used by a TextMapPropagator.
type TextMapCarrier interface {
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.
// Get returns the value associated with the passed key.
Get(key string) string
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.
// Set stores the key-value pair.
Set(key string, value string)
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.
// Keys lists the keys stored in this carrier.
Keys() []string
// DO NOT CHANGE: any modification will not be backwards compatible and
// must never be done outside of a new major release.
}
在 OpenTelemetry 中,Trace 的传递中有一个核心的概念,叫 Carrier(搬运工具)。它是 Propagator 用来读取 Context 数据的一种媒介 medium,Carrier 数据格式可能是一个字符 Map 或者一个字符数组。
Carrier 充当"搬运" Span 中 SpanContext 工具。例如 OpenTelemetry 中为了把 Trace 的 Span 信息传递下去,在 HTTP 调用场景中,会有 HttpCarrier,在 RPC 的调用场景中会有 RpcCarrier 来搬运 SpanContext。Trace 通过 Carrier 可以把链路追踪状态在进程中、进程间传递。
操作对象-Span
// Span is the individual component of a trace. It represents a single named
// and timed operation of a workflow that is traced. A Tracer is used to
// create a Span and it is then up to the operation the Span represents to
// properly end the Span when the operation itself ends.
//
// Warning: methods may be added to this interface in minor releases.
type Span interface {
// End completes the Span. The Span is considered complete and ready to be
// delivered through the rest of the telemetry pipeline after this method
// is called. Therefore, updates to the Span are not allowed after this
// method has been called.
End(options ...SpanEndOption)
// AddEvent adds an event with the provided name and options.
AddEvent(name string, options ...EventOption)
// IsRecording returns the recording state of the Span. It will return
// true if the Span is active and events can be recorded.
IsRecording() bool
// RecordError will record err as an exception span event for this span. An
// additional call to SetStatus is required if the Status of the Span should
// be set to Error, as this method does not change the Span status. If this
// span is not being recorded or err is nil then this method does nothing.
RecordError(err error, options ...EventOption)
// SpanContext returns the SpanContext of the Span. The returned SpanContext
// is usable even after the End method has been called for the Span.
SpanContext() SpanContext
// SetStatus sets the status of the Span in the form of a code and a
// description, provided the status hasn't already been set to a higher
// value before (OK > Error > Unset). The description is only included in a
// status when the code is for an error.
SetStatus(code codes.Code, description string)
// SetName sets the Span name.
SetName(name string)
// SetAttributes sets kv as attributes of the Span. If a key from kv
// already exists for an attribute of the Span it will be overwritten with
// the value contained in kv.
SetAttributes(kv ...attribute.KeyValue)
// TracerProvider returns a TracerProvider that can be used to generate
// additional Spans on the same telemetry pipeline as the current Span.
TracerProvider() TracerProvider
}
// recordingSpan is an implementation of the OpenTelemetry Span API
// representing the individual component of a trace that is sampled.
type recordingSpan struct {
// mu protects the contents of this span.
mu sync.Mutex
// parent holds the parent span of this span as a trace.SpanContext.
parent trace.SpanContext
// spanKind represents the kind of this span as a trace.SpanKind.
spanKind trace.SpanKind
// name is the name of this span.
name string
// startTime is the time at which this span was started.
startTime time.Time
// endTime is the time at which this span was ended. It contains the zero
// value of time.Time until the span is ended.
endTime time.Time
// status is the status of this span.
status Status
// childSpanCount holds the number of child spans created for this span.
childSpanCount int
// spanContext holds the SpanContext of this span.
spanContext trace.SpanContext
// attributes is a collection of user provided key/values. The collection
// is constrained by a configurable maximum held by the parent
// TracerProvider. When additional attributes are added after this maximum
// is reached these attributes the user is attempting to add are dropped.
// This dropped number of attributes is tracked and reported in the
// ReadOnlySpan exported when the span ends.
attributes []attribute.KeyValue
droppedAttributes int
// events are stored in FIFO queue capped by configured limit.
events evictedQueue
// links are stored in FIFO queue capped by configured limit.
links evictedQueue
// executionTracerTaskEnd ends the execution tracer span.
executionTracerTaskEnd func()
// tracer is the SDK tracer that created this span.
tracer *tracer
}
Span 代表了事务中的操作,每个 Span 封装了以下状态:
操作名称
起止时间戳
属性(Attributes):一系列键值对
0 个或多个事件(Events)的集合,每个都是一个元组(时间戳,名称,属性),名称必须是字符串
父 Span 的标识
与 0 个或多个具有因果关系的 Span 链接(Links),通过相关 Span 的 SpanContext
引用 Span 所需的 SpanContext 信息
操作上下文-SpanContext
// SpanContext contains identifying trace information about a Span.
type SpanContext struct {
traceID TraceID
spanID SpanID
traceFlags TraceFlags
traceState TraceState
remote bool
}
表示标识 Trace 中的 Span 的所有信息,必须传播到子 Span 和跨进程边界。一个 SpanContext 包含从父 Span 传播到子 Span 的跟踪标识符和选项。
TraceId:trace 的标识符。全局唯一,随机生成 16 个字节。TraceId 用于将跨进程的特定 trace 的所有 span 分组在一起。
SpanId:span 的标识符。全局唯一,随机生成 8 个字节。当传递给子 Span 时,该标识符将成为子 Span 的父 span id 。
TraceFlags:trace 的选项。表示为一字节(位图 bitmap)- Sampling bit:表示 trace 是否被采样的比特(掩码
0x1
)Tracestate:在一个键值对列表中携带特定于追踪系统的上下文。Tracestate 允许不同的供应商传播额外的信息,用它们的遗留的 Id 格式进行互操作。
Trace Span传播过程
OpenTelemetry Span 在传播中有的基本步骤:
StartSpan:Trace 在具体操作中自动生成一个 Span
Inject 注入: 将 Span 的 SpanContext 写入到 Carrier 的过程
链路数据为了进行网络传输,需要数据进行序列化和反序列化。这个过程 Trace 通过一个负责数据序列化反序列化上下文的 Formatter 接口实现的。例如在 HttpCarrier 使用中通常就会有一个对应的 HttpFormatter。所以 Inject 注入是委托给 Formatter 将 SpanContext 进行序列化写入 Carrier。
Formatter 提供不同场景序列化的数据格式,叫做 Format 描述。比如:
TextMap: 基于字符串的 Map 记录 SpanContext 信息,适用 RPC 网络传输
HTTP Headers: 方便解析 HTTP Headers 信息,用于 HTTP 传输
https://jckling.github.io/2021/04/02/Jaeger/OpenTelemetry%20%E8%A7%84%E8%8C%83%E9%98%85%E8%AF%BB/
https://www.cnblogs.com/haiyux/p/15317614.html
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· TypeScript + Deepseek 打造卜卦网站:技术与玄学的结合
· .NET Core 中如何实现缓存的预热?
· 阿里巴巴 QwQ-32B真的超越了 DeepSeek R-1吗?
· 如何调用 DeepSeek 的自然语言处理 API 接口并集成到在线客服系统
· 【译】Visual Studio 中新的强大生产力特性