Observability - How OpenTelemetry Distributed Tracing Works

Let's start with a few basic concepts.

Context propagator - Propagator

// TextMapPropagator propagates cross-cutting concerns as key-value text
// pairs within a carrier that travels in-band across process boundaries.
type TextMapPropagator interface {
   // DO NOT CHANGE: any modification will not be backwards compatible and
   // must never be done outside of a new major release.

   // Inject set cross-cutting concerns from the Context into the carrier.
   Inject(ctx context.Context, carrier TextMapCarrier)
   // DO NOT CHANGE: any modification will not be backwards compatible and
   // must never be done outside of a new major release.

   // Extract reads cross-cutting concerns from the carrier into a Context.
   Extract(ctx context.Context, carrier TextMapCarrier) context.Context
   // DO NOT CHANGE: any modification will not be backwards compatible and
   // must never be done outside of a new major release.

   // Fields returns the keys whose values are set with Inject.
   Fields() []string
   // DO NOT CHANGE: any modification will not be backwards compatible and
   // must never be done outside of a new major release.
}

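The propagator moves trace information across process boundaries. It does two things:

  • Injects trace information into a data medium, the carrier

  • Extracts trace information from that carrier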
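
The two duties above can be sketched with nothing but the standard library. This is a minimal illustration, not the SDK's implementation: `demoPropagator`, `mapCarrier`, `traceIDKey`, and the `x-demo-trace-id` key are all invented for the example, and the only thing propagated is a trace ID string stored in the Context.

```go
package main

import (
	"context"
	"fmt"
)

// carrier mirrors the TextMapCarrier shape: a string key-value store.
type carrier interface {
	Get(key string) string
	Set(key, value string)
	Keys() []string
}

// mapCarrier is a hypothetical in-memory carrier used only for this demo.
type mapCarrier map[string]string

func (c mapCarrier) Get(key string) string { return c[key] }
func (c mapCarrier) Set(key, value string) { c[key] = value }
func (c mapCarrier) Keys() []string {
	keys := make([]string, 0, len(c))
	for k := range c {
		keys = append(keys, k)
	}
	return keys
}

// traceIDKey is the private context key under which the demo keeps a trace ID.
type traceIDKey struct{}

// demoHeader is an invented key name, not a standard header.
const demoHeader = "x-demo-trace-id"

// demoPropagator has the same three methods as TextMapPropagator.
type demoPropagator struct{}

// Inject copies the trace ID from the Context into the carrier.
func (demoPropagator) Inject(ctx context.Context, c carrier) {
	if id, ok := ctx.Value(traceIDKey{}).(string); ok && id != "" {
		c.Set(demoHeader, id)
	}
}

// Extract reads the trace ID from the carrier into a new Context.
func (demoPropagator) Extract(ctx context.Context, c carrier) context.Context {
	id := c.Get(demoHeader)
	if id == "" {
		return ctx
	}
	return context.WithValue(ctx, traceIDKey{}, id)
}

// Fields lists the keys that Inject may set.
func (demoPropagator) Fields() []string { return []string{demoHeader} }

// roundTrip injects on the sending side and extracts on the receiving side.
func roundTrip(id string) string {
	p := demoPropagator{}
	c := mapCarrier{}
	p.Inject(context.WithValue(context.Background(), traceIDKey{}, id), c)
	got, _ := p.Extract(context.Background(), c).Value(traceIDKey{}).(string)
	return got
}

func main() {
	fmt.Println(roundTrip("4bf92f3577b34da6a3ce929d0e0e4736"))
}
```

The carrier is the only thing the two sides share, which is why the propagator never needs to know whether the bytes travel over HTTP, RPC metadata, or a message queue.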

Data medium - Carrier

// TextMapCarrier is the storage medium used by a TextMapPropagator.
type TextMapCarrier interface {
   // DO NOT CHANGE: any modification will not be backwards compatible and
   // must never be done outside of a new major release.

   // Get returns the value associated with the passed key.
   Get(key string) string
   // DO NOT CHANGE: any modification will not be backwards compatible and
   // must never be done outside of a new major release.

   // Set stores the key-value pair.
   Set(key string, value string)
   // DO NOT CHANGE: any modification will not be backwards compatible and
   // must never be done outside of a new major release.

   // Keys lists the keys stored in this carrier.
   Keys() []string
   // DO NOT CHANGE: any modification will not be backwards compatible and
   // must never be done outside of a new major release.
}

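In OpenTelemetry, a core concept in trace propagation is the Carrier. It is the medium a Propagator writes Context data into and reads it back from. A Carrier's underlying data format may be a string map or a character array; in the Go interface above it is exposed as a string key-value store with Get, Set, and Keys.

The Carrier is the tool that "carries" a Span's SpanContext. For example, to pass a trace's Span information downstream, an HTTP call uses an HttpCarrier and an RPC call uses an RpcCarrier to carry the SpanContext. Through the Carrier, tracing state can be passed both within a process and between processes.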
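
As a rough sketch of what an HTTP carrier can look like, the adapter below wraps `http.Header` in the Get/Set/Keys shape of `TextMapCarrier`. The names `headerCarrier` and `outgoingRequest` and the example URL are assumptions made for illustration; the OpenTelemetry Go `propagation` package ships its own `HeaderCarrier` and `MapCarrier` types, which are what real code would normally use.

```go
package main

import (
	"fmt"
	"net/http"
)

// headerCarrier adapts http.Header to the Get/Set/Keys shape of TextMapCarrier.
type headerCarrier http.Header

func (h headerCarrier) Get(key string) string { return http.Header(h).Get(key) }
func (h headerCarrier) Set(key, value string) { http.Header(h).Set(key, value) }
func (h headerCarrier) Keys() []string {
	keys := make([]string, 0, len(h))
	for k := range h {
		keys = append(keys, k)
	}
	return keys
}

// outgoingRequest builds a request and writes one propagation field
// into its headers through the carrier. No network call is made.
func outgoingRequest(url, key, value string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	headerCarrier(req.Header).Set(key, value)
	return req, nil
}

func main() {
	req, err := outgoingRequest("http://example.invalid/orders", "traceparent",
		"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		panic(err)
	}
	// http.Header canonicalizes key case, so lookups are case-insensitive.
	fmt.Println(headerCarrier(req.Header).Get("TraceParent"))
	fmt.Println(headerCarrier(req.Header).Keys())
}
```

Because the adapter is just a named type over `http.Header`, converting an existing request's headers into a carrier costs nothing and mutates the request in place.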

Operation object - Span

// Span is the individual component of a trace. It represents a single named
// and timed operation of a workflow that is traced. A Tracer is used to
// create a Span and it is then up to the operation the Span represents to
// properly end the Span when the operation itself ends.
//
// Warning: methods may be added to this interface in minor releases.
type Span interface {
   // End completes the Span. The Span is considered complete and ready to be
   // delivered through the rest of the telemetry pipeline after this method
   // is called. Therefore, updates to the Span are not allowed after this
   // method has been called.
   End(options ...SpanEndOption)

   // AddEvent adds an event with the provided name and options.
   AddEvent(name string, options ...EventOption)

   // IsRecording returns the recording state of the Span. It will return
   // true if the Span is active and events can be recorded.
   IsRecording() bool

   // RecordError will record err as an exception span event for this span. An
   // additional call to SetStatus is required if the Status of the Span should
   // be set to Error, as this method does not change the Span status. If this
   // span is not being recorded or err is nil then this method does nothing.
   RecordError(err error, options ...EventOption)

   // SpanContext returns the SpanContext of the Span. The returned SpanContext
   // is usable even after the End method has been called for the Span.
   SpanContext() SpanContext

   // SetStatus sets the status of the Span in the form of a code and a
   // description, provided the status hasn't already been set to a higher
   // value before (OK > Error > Unset). The description is only included in a
   // status when the code is for an error.
   SetStatus(code codes.Code, description string)

   // SetName sets the Span name.
   SetName(name string)

   // SetAttributes sets kv as attributes of the Span. If a key from kv
   // already exists for an attribute of the Span it will be overwritten with
   // the value contained in kv.
   SetAttributes(kv ...attribute.KeyValue)

   // TracerProvider returns a TracerProvider that can be used to generate
   // additional Spans on the same telemetry pipeline as the current Span.
   TracerProvider() TracerProvider
}

// recordingSpan is an implementation of the OpenTelemetry Span API
// representing the individual component of a trace that is sampled.
type recordingSpan struct {
   // mu protects the contents of this span.
   mu sync.Mutex

   // parent holds the parent span of this span as a trace.SpanContext.
   parent trace.SpanContext

   // spanKind represents the kind of this span as a trace.SpanKind.
   spanKind trace.SpanKind

   // name is the name of this span.
   name string

   // startTime is the time at which this span was started.
   startTime time.Time

   // endTime is the time at which this span was ended. It contains the zero
   // value of time.Time until the span is ended.
   endTime time.Time

   // status is the status of this span.
   status Status

   // childSpanCount holds the number of child spans created for this span.
   childSpanCount int

   // spanContext holds the SpanContext of this span.
   spanContext trace.SpanContext

   // attributes is a collection of user provided key/values. The collection
   // is constrained by a configurable maximum held by the parent
   // TracerProvider. When additional attributes are added after this maximum
   // is reached these attributes the user is attempting to add are dropped.
   // This dropped number of attributes is tracked and reported in the
   // ReadOnlySpan exported when the span ends.
   attributes []attribute.KeyValue
   droppedAttributes int

   // events are stored in FIFO queue capped by configured limit.
   events evictedQueue

   // links are stored in FIFO queue capped by configured limit.
   links evictedQueue

   // executionTracerTaskEnd ends the execution tracer span.
   executionTracerTaskEnd func()

   // tracer is the SDK tracer that created this span.
   tracer *tracer
}

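A Span represents one operation within a transaction. Each Span encapsulates the following state:

  • An operation name

  • Start and end timestamps

  • Attributes: a set of key-value pairs

  • A collection of zero or more Events, each a tuple of (timestamp, name, attributes); the name must be a string

  • The identifier of the parent Span

  • Links to zero or more causally related Spans, expressed through those Spans' SpanContext

  • The SpanContext information needed to reference this Span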
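
To make that state concrete, here is a small stand-in for a span written against the standard library only. It is a sketch under stated assumptions, not the SDK's `recordingSpan`: the type `demoSpan`, its lower-case methods, and the string-map attributes are invented for the example. It follows two rules quoted in the interface comments above: updates are ignored once the span has ended, and a status is only replaced by a higher one (OK > Error > Unset).

```go
package main

import (
	"fmt"
	"time"
)

// statusCode is ordered so that a plain comparison gives OK > Error > Unset.
type statusCode int

const (
	statusUnset statusCode = iota
	statusError
	statusOK
)

// event is the (timestamp, name, attributes) tuple described above.
type event struct {
	at    time.Time
	name  string
	attrs map[string]string
}

// demoSpan is a hypothetical, single-goroutine stand-in for a span.
type demoSpan struct {
	name         string
	parentSpanID string
	start, end   time.Time
	attrs        map[string]string
	events       []event
	status       statusCode
}

func startSpan(name, parentSpanID string) *demoSpan {
	return &demoSpan{
		name:         name,
		parentSpanID: parentSpanID,
		start:        time.Now(),
		attrs:        map[string]string{},
	}
}

// isRecording reports whether the span can still be updated.
func (s *demoSpan) isRecording() bool { return s.end.IsZero() }

func (s *demoSpan) setAttribute(k, v string) {
	if s.isRecording() {
		s.attrs[k] = v // an existing key is overwritten
	}
}

func (s *demoSpan) addEvent(name string, attrs map[string]string) {
	if s.isRecording() {
		s.events = append(s.events, event{at: time.Now(), name: name, attrs: attrs})
	}
}

// setStatus only moves the status upward in the OK > Error > Unset order.
func (s *demoSpan) setStatus(c statusCode) {
	if s.isRecording() && c > s.status {
		s.status = c
	}
}

// finish records the end time; later updates become no-ops.
func (s *demoSpan) finish() {
	if s.isRecording() {
		s.end = time.Now()
	}
}

func main() {
	s := startSpan("GET /orders", "00f067aa0ba902b7")
	s.setAttribute("http.method", "GET")
	s.addEvent("cache miss", nil)
	s.setStatus(statusError)
	s.finish()
	s.setAttribute("late", "ignored") // dropped: the span has ended
	fmt.Println(s.name, len(s.attrs), len(s.events), s.status == statusError)
}
```

The real SDK additionally guards this state with a mutex and caps attributes, events, and links at configured limits, as the `recordingSpan` fields above show; the sketch leaves both out for brevity.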

Operation context - SpanContext

// SpanContext contains identifying trace information about a Span.
type SpanContext struct {
   traceID TraceID
   spanID SpanID
   traceFlags TraceFlags
   traceState TraceState
   remote bool
}

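The SpanContext holds all the information that identifies a Span within a trace. It must be propagated to child Spans and across process boundaries. A SpanContext contains the trace identifiers and options that are passed from a parent Span to its children.

  • TraceId: the identifier of a trace. It is 16 randomly generated bytes, which makes it globally unique with practically sufficient probability. The TraceId groups together all spans of a given trace across processes.

  • SpanId: the identifier of a span. It is 8 randomly generated bytes, which makes it globally unique with practically sufficient probability. When passed to a child Span, this identifier becomes the child's parent span id.

  • TraceFlags: the options of a trace, represented as one byte (a bitmap). The sampling bit (mask 0x1) indicates whether the trace is sampled.

  • Tracestate: carries tracing-system-specific context as a list of key-value pairs. Tracestate lets different vendors propagate additional information and interoperate with their legacy Id formats.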
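
The fields above can be illustrated with a short stdlib-only sketch. The type `spanContext` and the helpers `newRoot` and `newChild` are names made up for this example; the SDK has its own ID generator and constructors. The sketch shows the two sizes (16 and 8 random bytes), the sampling bit, and how a child inherits the trace ID and flags while the parent's span ID becomes its parent span id.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// flagSampled is the sampling bit of TraceFlags (mask 0x1).
const flagSampled byte = 0x01

// spanContext is a simplified stand-in for the SpanContext struct above.
type spanContext struct {
	traceID    [16]byte
	spanID     [8]byte
	traceFlags byte
}

func (sc spanContext) isSampled() bool { return sc.traceFlags&flagSampled != 0 }

// newRoot creates a context for the first span of a new trace.
func newRoot(sampled bool) (spanContext, error) {
	var sc spanContext
	if _, err := rand.Read(sc.traceID[:]); err != nil {
		return spanContext{}, err
	}
	if _, err := rand.Read(sc.spanID[:]); err != nil {
		return spanContext{}, err
	}
	if sampled {
		sc.traceFlags |= flagSampled
	}
	return sc, nil
}

// newChild keeps the parent's trace ID and flags, generates a fresh span ID,
// and returns the parent's span ID as the child's parent span id.
func newChild(parent spanContext) (spanContext, [8]byte, error) {
	child := parent
	if _, err := rand.Read(child.spanID[:]); err != nil {
		return spanContext{}, [8]byte{}, err
	}
	return child, parent.spanID, nil
}

func main() {
	root, err := newRoot(true)
	if err != nil {
		panic(err)
	}
	child, parentID, err := newChild(root)
	if err != nil {
		panic(err)
	}
	fmt.Println("trace id:      ", hex.EncodeToString(child.traceID[:]))
	fmt.Println("parent span id:", hex.EncodeToString(parentID[:]))
	fmt.Println("span id:       ", hex.EncodeToString(child.spanID[:]))
	fmt.Println("sampled:       ", child.isSampled())
}
```

Tracestate is omitted here; it would travel alongside these fields as an ordered list of vendor key-value pairs.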

How a trace Span propagates

The basic steps in propagating an OpenTelemetry Span are:

  • StartSpan: the tracer creates a Span for the concrete operation being performed

  • Inject: the Span's SpanContext is written into the Carrier

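To travel over the network, trace data has to be serialized and deserialized. Tracing handles this through a Formatter interface that is responsible for serializing and deserializing the context. An HttpCarrier, for example, usually has a matching HttpFormatter. Inject therefore delegates to the Formatter, which serializes the SpanContext and writes it into the Carrier. In the Go interfaces shown earlier, this role is filled by the Inject and Extract methods of TextMapPropagator.

The Formatter offers a serialization format for each scenario, called a Format. For example:

  • TextMap: records the SpanContext in a string-keyed map, suitable for RPC network transport

  • HTTP Headers: easy to parse from HTTP headers, used for HTTP transport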

https://jckling.github.io/2021/04/02/Jaeger/OpenTelemetry%20%E8%A7%84%E8%8C%83%E9%98%85%E8%AF%BB/
https://www.cnblogs.com/haiyux/p/15317614.html

Posted by 萌兰三太子