Otel.Trace.SpanStorage (otel v0.4.1)

ETS-backed storage for spans across their full lifecycle. Active spans (mutable via set_attribute / add_event) and completed spans (waiting for export after end_span) live in a single table.

Each row is a 4-tuple {span_id, %Otel.Trace.Span{}, status, inserted_at_ms}, where status is :active or :completed and inserted_at_ms is the millisecond timestamp recorded when insert/1 runs. The fourth element is internal-only: it exists solely so the sweep loop can identify stale rows by insertion time rather than by span.start_time, which the caller may legitimately backdate via the :start_time option of start_span/3. It is set once and preserved unchanged by update/1 and complete/1.
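
The row layout can be illustrated with a throwaway ETS table. This is a minimal sketch, not the module's implementation: the module name RowSketch, the table options, the plain map standing in for %Otel.Trace.Span{}, and the explicit now_ms argument are all assumptions made for illustration.

```elixir
defmodule RowSketch do
  # Hypothetical helper; the real table name and options are not shown in these docs.
  def new_table do
    :ets.new(:row_sketch, [:set, :public, write_concurrency: true])
  end

  # Stamp the insertion time once, as the fourth element of the row.
  def insert(table, %{span_id: span_id} = span, now_ms) do
    :ets.insert(table, {span_id, span, :active, now_ms})
    :ok
  end
end

table = RowSketch.new_table()

# A span with a backdated start_time still gets its own insertion stamp.
span = %{span_id: "s1", start_time: 0}
:ok = RowSketch.insert(table, span, 1_700_000_000_000)

[{span_id, _stored, status, inserted_at_ms}] = :ets.lookup(table, "s1")
IO.inspect({span_id, status, inserted_at_ms})
```

Keeping the stamp in the row rather than in the span keeps it out of the exported data, which fits the description of the fourth element as internal-only.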

Public API — generic CRUD on active spans

| Function | Role |
| --- | --- |
| insert/1 | insert a fresh span as :active (back-pressure aware) |
| get/1 | look up an active span by span_id |
| update/1 | atomic replace of an active span (no-op if already completed) |
| complete/1 | atomic flip :active → :completed with the caller's final span value |
| take_completed/1 | take + delete a batch of completed spans (Exporter only) |

Mutation flow used by Otel.Trace.Span:

    span = SpanStorage.get(span_id)
    new_span = apply_limits(span, ...)   # caller-side transformation
    SpanStorage.update(new_span)         # atomic replace via :ets.select_replace

Termination flow (end_span):

    span = SpanStorage.get(span_id)
    ended = %{span | end_time: end_time}
    SpanStorage.complete(ended)     # atomic flip with the final span value

Concurrency

Multi-writer + single-reader (the Exporter):

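  • insert / get / update / complete run in the caller process and read or write ETS directly; write_concurrency lets writers touching different rows proceed in parallel with little lock contention (ETS still takes fine-grained locks, so this is not strictly lock-free).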
  • update/1 and complete/1 use a single atomic :ets.select_replace/2 BIF whose match-spec only matches :active rows. Completed spans are never accidentally re-mutated.
  • take_completed/1 is called only by SpanExporter (single reader — no take/insert races).
  • Span mutation is bound to the process that owns the span (the one that called start_span); end_span is the authoritative termination boundary — concurrent mutations not committed by the time complete/1 runs are not preserved.
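
The match-spec behaviour described in the list above can be sketched as follows. This is an illustration under assumptions, not the module's code: ReplaceSketch, the table options, and the plain maps standing in for %Otel.Trace.Span{} are hypothetical, and the functions return the replacement count instead of :ok so the effect is visible.

```elixir
defmodule ReplaceSketch do
  # Replace the span of an :active row, keeping the key and inserted_at_ms.
  def update(table, %{span_id: span_id} = span) do
    replace(table, span_id, span, :active)
  end

  # Same match, but the row is rewritten with status :completed.
  def complete(table, %{span_id: span_id} = span) do
    replace(table, span_id, span, :completed)
  end

  defp replace(table, span_id, span, new_status) do
    match_spec = [
      {
        # Head: only the row for this key, and only while it is still :active.
        {span_id, :_, :active, :"$1"},
        [],
        # Body: {{...}} builds the new row; {:const, span} embeds the span as-is.
        [{{span_id, {:const, span}, new_status, :"$1"}}]
      }
    ]

    # Returns the number of replaced rows (0 or 1 for a :set table).
    :ets.select_replace(table, match_spec)
  end
end

table = :ets.new(:replace_sketch, [:set, :public])
:ets.insert(table, {"s1", %{span_id: "s1", attrs: %{}}, :active, 1_000})

replaced = ReplaceSketch.update(table, %{span_id: "s1", attrs: %{"k" => "v"}})
completed = ReplaceSketch.complete(table, %{span_id: "s1", attrs: %{"k" => "v"}, end_time: 5})
late = ReplaceSketch.update(table, %{span_id: "s1", attrs: %{"late" => true}})

# expected: {1, 1, 0}, the late update matches no :active row
IO.inspect({replaced, completed, late})
```

Because the head pins the status to :active, a mutation arriving after the flip matches nothing and is discarded, which is the termination-boundary behaviour the last bullet describes.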

Backpressure

insert/1 silently drops the span when the ETS table is already at @max_queue_size, matching the spec's maxQueueSize parameter for the Batching processor (trace/sdk.md L1086-L1118). Drop is a normal lifecycle event (per spec) rather than a failure — callers don't branch on the result. Subsequent set_attribute / add_event calls on a dropped span become no-ops because update/1 matches no row.
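
A size-checked insert along these lines might look like the sketch below. The module name, the limit of 2 (chosen only to keep the example short; the real @max_queue_size value is not stated here), and the use of :ets.info/2 for the size check are assumptions.

```elixir
defmodule BackpressureSketch do
  # Hypothetical limit for illustration only.
  @max_queue_size 2

  def insert(table, %{span_id: span_id} = span, now_ms) do
    # The size check and the insert are two separate ETS calls, so under
    # concurrent writers this sketch may briefly overshoot the limit.
    if :ets.info(table, :size) < @max_queue_size do
      :ets.insert(table, {span_id, span, :active, now_ms})
    end

    # Stored or dropped, the caller sees the same result.
    :ok
  end
end

table = :ets.new(:backpressure_sketch, [:set, :public])

results =
  for id <- ["s1", "s2", "s3"] do
    BackpressureSketch.insert(table, %{span_id: id}, 1_000)
  end

# expected: {[:ok, :ok, :ok], 2, []}, the third span was dropped silently
IO.inspect({results, :ets.info(table, :size), :ets.lookup(table, "s3")})
```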

Sweep — stale active spans

The GenServer runs a self-scheduled sweep every @sweep_interval_ms (10 minutes) that issues a single :ets.select_delete/2 removing :active rows whose inserted_at_ms (row position 4) is older than @span_ttl_ms (30 minutes). This is the safety net for spans that never reach end_span (process crash, dropped context, leaked span_ctx) — without it, stale rows would accumulate until the @max_queue_size backpressure starts dropping fresh spans.

Sweep keys off inserted_at_ms, not span.start_time, because callers may pass a backdated :start_time (history replay, batch ingestion). Insertion time is the SDK-internal signal of "how long has this row sat in storage."
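
The sweep condition can be expressed as a single match spec. The sketch below assumes the row layout described earlier and takes now_ms as an argument so it stays deterministic; SweepSketch and the plain-map spans are hypothetical, and the GenServer scheduling is left out.

```elixir
defmodule SweepSketch do
  # TTL taken from the text above: 30 minutes.
  @span_ttl_ms 30 * 60 * 1000

  # Delete :active rows inserted before now_ms - TTL; returns the number deleted.
  def sweep(table, now_ms) do
    cutoff = now_ms - @span_ttl_ms

    match_spec = [
      {
        # Only :active rows; "$1" binds inserted_at_ms (row position 4).
        {:_, :_, :active, :"$1"},
        [{:<, :"$1", cutoff}],
        # select_delete removes a row only when the body returns true.
        [true]
      }
    ]

    :ets.select_delete(table, match_spec)
  end
end

table = :ets.new(:sweep_sketch, [:set, :public])
now_ms = 10_000_000
old = now_ms - 31 * 60 * 1000

:ets.insert(table, {"stale", %{span_id: "stale"}, :active, old})
# Backdated start_time, but recently inserted: must survive the sweep.
:ets.insert(table, {"fresh", %{span_id: "fresh", start_time: 0}, :active, now_ms - 1_000})
# Completed rows are left for the exporter regardless of age.
:ets.insert(table, {"done", %{span_id: "done"}, :completed, old})

deleted = SweepSketch.sweep(table, now_ms)
remaining = table |> :ets.tab2list() |> Enum.map(&elem(&1, 0)) |> Enum.sort()

# expected: {1, ["done", "fresh"]}
IO.inspect({deleted, remaining})
```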

Defaults match opentelemetry-erlang's otel_span_sweeper configuration. Sweep strategy is drop only — exporting fragmentary spans muddles backend data; if observability into sweep events is needed later, an end_span-on-sweep variant can be added.

References

  • OTel Trace SDK §Span: opentelemetry-specification/specification/trace/sdk.md L692-L944
  • OTel Trace SDK Batching processor: opentelemetry-specification/specification/trace/sdk.md L1086-L1118
  • Erlang :ets.select_replace/2: https://www.erlang.org/doc/man/ets#select_replace-2
  • Erlang reference sweeper: opentelemetry-erlang/apps/opentelemetry/src/otel_span_sweeper.erl

Functions

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

complete(span)

@spec complete(span :: Otel.Trace.Span.t()) :: :ok

Atomically flip an active span to :completed with the caller's final span value via :ets.select_replace/2. The caller is expected to have set end_time on the span before calling.

Always returns :ok — silent no-op when the span is missing or already :completed (match-spec only matches :active rows).

end_span is the authoritative termination boundary: concurrent mutations not committed by the time the :ets.select_replace/2 call runs are not preserved.

get(span_id)

@spec get(span_id :: Otel.Trace.SpanId.t()) :: Otel.Trace.Span.t() | nil

Look up an active span. Returns nil for missing or already-completed spans (:completed rows are exporter-only).
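
A lookup with this contract could be sketched as below, assuming the 4-tuple row layout described in the module documentation. GetSketch and the plain-map spans are hypothetical stand-ins.

```elixir
defmodule GetSketch do
  def get(table, span_id) do
    case :ets.lookup(table, span_id) do
      [{^span_id, span, :active, _inserted_at_ms}] -> span
      # Missing rows and :completed rows look the same to the caller.
      _ -> nil
    end
  end
end

table = :ets.new(:get_sketch, [:set, :public])
:ets.insert(table, {"live", %{span_id: "live"}, :active, 1_000})
:ets.insert(table, {"done", %{span_id: "done"}, :completed, 1_000})

# expected: {%{span_id: "live"}, nil, nil}
IO.inspect({GetSketch.get(table, "live"), GetSketch.get(table, "done"), GetSketch.get(table, "nope")})
```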

insert(span)

@spec insert(span :: Otel.Trace.Span.t()) :: :ok

Insert a fresh span as :active. Always returns :ok — silent drop when the table is at @max_queue_size (spec trace/sdk.md L1109 "After the size is reached, spans are dropped": drop is a normal lifecycle event, not a failure).

Drop counting / observability lives inside SpanStorage — callers don't branch on the result.

start_link(opts \\ [])

@spec start_link(opts :: keyword()) :: GenServer.on_start()

take_completed(n)

@spec take_completed(n :: pos_integer()) :: [Otel.Trace.Span.t()]

Take up to n :completed spans atomically. Called only by Otel.Trace.SpanExporter (single reader).
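
One way to take a bounded batch is a limited :ets.select/3 followed by a delete per returned key. This sketch is an assumption about the approach, not the module's implementation: it relies on the single-reader rule (and on :completed rows never being rewritten by update/1, complete/1 or the sweep) rather than on one atomic ETS call. TakeSketch and the plain-map spans are hypothetical.

```elixir
defmodule TakeSketch do
  def take_completed(table, n) when is_integer(n) and n > 0 do
    # Return {span_id, span} pairs for :completed rows only.
    match_spec = [{{:"$1", :"$2", :completed, :_}, [], [{{:"$1", :"$2"}}]}]

    case :ets.select(table, match_spec, n) do
      {rows, _continuation} ->
        Enum.map(rows, fn {span_id, span} ->
          :ets.delete(table, span_id)
          span
        end)

      # No :completed rows at all.
      :"$end_of_table" ->
        []
    end
  end
end

table = :ets.new(:take_sketch, [:set, :public])

for id <- ["a", "b", "c"] do
  :ets.insert(table, {id, %{span_id: id}, :completed, 1_000})
end

:ets.insert(table, {"live", %{span_id: "live"}, :active, 1_000})

first = TakeSketch.take_completed(table, 2)
second = TakeSketch.take_completed(table, 5)
third = TakeSketch.take_completed(table, 5)

# expected: {2, 1, 0, 1}, only the :active row is left in the table
IO.inspect({length(first), length(second), length(third), :ets.info(table, :size)})
```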

update(span)

@spec update(span :: Otel.Trace.Span.t()) :: :ok

Atomic replace of an active span via :ets.select_replace/2.

No-op when the span is missing or already :completed — the match-spec only matches :active rows, so completed spans are never accidentally re-activated.