Skip to content

Telemetry Architecture

This guide explains the internals of weevr's telemetry system: how spans are created and collected, how the trace tree is composed, and how telemetry integrates with execution results.

Introduction

weevr provides OTel-compatible telemetry without depending on the OpenTelemetry SDK. Spans, trace IDs, and status codes follow OTel conventions, making the data portable to any tracing backend. The system is designed for concurrent execution — each thread operates on its own collector, and merging happens only after thread completion.

Key concepts

  • ExecutionSpan — An immutable record of a unit of work. Carries a trace ID, span ID, parent span ID, status, timestamps, attributes, and events.
  • SpanBuilder — A mutable accumulator for constructing a span during execution. Attributes and events are added incrementally. Calling finish() produces an immutable ExecutionSpan.
  • SpanCollector — A mutable container that holds finished spans for a given scope (thread, weave, or loom). Collectors merge upward after execution completes.
  • Trace tree — A hierarchy of LoomTraceWeaveTraceThreadTrace objects that wrap spans with navigable parent-child relationships.

How it works

Span lifecycle

collector.start_span()builder.set_attribute()builder.add_event()builder.finish(status)collector.add_span()Immutable ExecutionSpan SpanBuildermutablestored in collector
  1. A SpanCollector is created for each execution scope with a shared trace_id.
  2. collector.start_span("thread:dim_customer", parent_span_id=...) returns a mutable SpanBuilder with an auto-generated span_id.
  3. During execution, the builder accumulates attributes (row counts, durations) and events (sources_read, write_complete).
  4. builder.finish(SpanStatus.OK) stamps the end time and produces an immutable ExecutionSpan. The builder cannot be used after this call.
  5. The finished span is added to the collector.

Collector isolation during parallel execution

When threads run concurrently within a weave, each thread gets its own SpanCollector. This avoids lock contention — no shared mutable state between threads.

After a thread completes, its collector is merged into the weave-level collector via weave_collector.merge(thread_collector). This operation appends all spans from the thread collector into the weave collector's list.

The same pattern repeats at the weave → loom boundary.

Trace tree composition

The telemetry.trace module provides tree-shaped models for navigating execution results:

Model Contains Used for
LoomTrace Loom span + dict[str, WeaveTrace] Root of the tree
WeaveTrace Weave span + dict[str, ThreadTrace] Intermediate level
ThreadTrace Thread span + ThreadTelemetry Leaf nodes with full metrics

Each trace type provides a to_spans() method that recursively flattens the tree into a list of ExecutionSpan objects, suitable for export to OTel- compatible backends.

Telemetry result hierarchy

Parallel to the trace tree, the telemetry.results module provides data classes that carry execution metrics:

  • ThreadTelemetry — Span, row counts (read/written/quarantined), validation results, assertion results, watermark state, CDC operation counts.
  • WeaveTelemetry — Span + dict[str, ThreadTelemetry].
  • LoomTelemetry — Span + dict[str, WeaveTelemetry].

These are attached to the result objects returned by ctx.run(), giving callers both navigable tree access and flat metric queries.

Structured logging

The StructuredJsonFormatter formats Python log records as single-line JSON with OTel-compatible fields:

  • Standard fields: timestamp, level, logger, message
  • Correlation fields: trace_id, span_id (when available)
  • Context fields: thread_name, weave_name, loom_name
  • Extra: attributes dict for custom key-value context

Log level mapping from weevr config values to Python levels:

weevr level Python level
minimal WARNING
standard INFO
verbose DEBUG
debug DEBUG

Module map

Module Responsibility
telemetry/span.py ExecutionSpan, SpanEvent, SpanStatus, ID generators
telemetry/collector.py SpanBuilder (mutable), SpanCollector (accumulator)
telemetry/trace.py LoomTrace, WeaveTrace, ThreadTrace — tree navigation and flattening
telemetry/logging.py StructuredJsonFormatter, configure_logging, LogLevel
telemetry/results.py ThreadTelemetry, WeaveTelemetry, LoomTelemetry, ValidationResult, AssertionResult
telemetry/events.py LogEvent model, create_log_event factory

Design decisions

  • OTel-compatible but not OTel-dependent — Span IDs (16-char hex), trace IDs (32-char hex), and status codes match OTel conventions. This allows export to any OTel backend without requiring the OTel SDK as a runtime dependency. The SDK can be added later as an optional integration.
  • Per-thread collectors — Concurrent threads each get their own collector. No locks, no contention. Merging happens only after thread completion, when the executor is back in single-threaded control flow.
  • Immutable spans, mutable buildersExecutionSpan is frozen for safe sharing across threads and inclusion in result objects. SpanBuilder provides a natural accumulation API during execution. The finish() method enforces the transition from mutable to immutable.
  • Two views of the same data — The trace tree (LoomTraceWeaveTraceThreadTrace) provides hierarchical navigation for programmatic access. The to_spans() method provides a flat list for serialization and export.

Further reading