Telemetry Architecture¶
This guide explains the internals of weevr's telemetry system: how spans are created and collected, how the trace tree is composed, and how telemetry integrates with execution results.
Introduction¶
weevr provides OTel-compatible telemetry without depending on the OpenTelemetry SDK. Spans, trace IDs, and status codes follow OTel conventions, making the data portable to any tracing backend. The system is designed for concurrent execution — each thread operates on its own collector, and merging happens only after thread completion.
Key concepts¶
- ExecutionSpan — An immutable record of a unit of work. Carries a trace ID, span ID, parent span ID, status, timestamps, attributes, and events.
- SpanBuilder — A mutable accumulator for constructing a span during
execution. Attributes and events are added incrementally. Calling
finish()produces an immutableExecutionSpan. - SpanCollector — A mutable container that holds finished spans for a given scope (thread, weave, or loom). Collectors merge upward after execution completes.
- Trace tree — A hierarchy of
LoomTrace→WeaveTrace→ThreadTraceobjects that wrap spans with navigable parent-child relationships.
How it works¶
Span lifecycle¶
- A
SpanCollectoris created for each execution scope with a sharedtrace_id. collector.start_span("thread:dim_customer", parent_span_id=...)returns a mutableSpanBuilderwith an auto-generatedspan_id.- During execution, the builder accumulates attributes (row counts, durations) and events (sources_read, write_complete).
builder.finish(SpanStatus.OK)stamps the end time and produces an immutableExecutionSpan. The builder cannot be used after this call.- The finished span is added to the collector.
Collector isolation during parallel execution¶
When threads run concurrently within a weave, each thread gets its own
SpanCollector. This avoids lock contention — no shared mutable state
between threads.
After a thread completes, its collector is merged into the weave-level
collector via weave_collector.merge(thread_collector). This operation
appends all spans from the thread collector into the weave collector's list.
The same pattern repeats at the weave → loom boundary.
Trace tree composition¶
The telemetry.trace module provides tree-shaped models for navigating
execution results:
| Model | Contains | Used for |
|---|---|---|
LoomTrace |
Loom span + dict[str, WeaveTrace] |
Root of the tree |
WeaveTrace |
Weave span + dict[str, ThreadTrace] |
Intermediate level |
ThreadTrace |
Thread span + ThreadTelemetry |
Leaf nodes with full metrics |
Each trace type provides a to_spans() method that recursively flattens the
tree into a list of ExecutionSpan objects, suitable for export to OTel-
compatible backends.
Telemetry result hierarchy¶
Parallel to the trace tree, the telemetry.results module provides data
classes that carry execution metrics:
- ThreadTelemetry — Span, row counts (read/written/quarantined), validation results, assertion results, watermark state, CDC operation counts.
- WeaveTelemetry — Span +
dict[str, ThreadTelemetry]. - LoomTelemetry — Span +
dict[str, WeaveTelemetry].
These are attached to the result objects returned by ctx.run(), giving
callers both navigable tree access and flat metric queries.
Structured logging¶
The StructuredJsonFormatter formats Python log records as single-line JSON
with OTel-compatible fields:
- Standard fields:
timestamp,level,logger,message - Correlation fields:
trace_id,span_id(when available) - Context fields:
thread_name,weave_name,loom_name - Extra:
attributesdict for custom key-value context
Log level mapping from weevr config values to Python levels:
| weevr level | Python level |
|---|---|
minimal |
WARNING |
standard |
INFO |
verbose |
DEBUG |
debug |
DEBUG |
Module map¶
| Module | Responsibility |
|---|---|
telemetry/span.py |
ExecutionSpan, SpanEvent, SpanStatus, ID generators |
telemetry/collector.py |
SpanBuilder (mutable), SpanCollector (accumulator) |
telemetry/trace.py |
LoomTrace, WeaveTrace, ThreadTrace — tree navigation and flattening |
telemetry/logging.py |
StructuredJsonFormatter, configure_logging, LogLevel |
telemetry/results.py |
ThreadTelemetry, WeaveTelemetry, LoomTelemetry, ValidationResult, AssertionResult |
telemetry/events.py |
LogEvent model, create_log_event factory |
Design decisions¶
- OTel-compatible but not OTel-dependent — Span IDs (16-char hex), trace IDs (32-char hex), and status codes match OTel conventions. This allows export to any OTel backend without requiring the OTel SDK as a runtime dependency. The SDK can be added later as an optional integration.
- Per-thread collectors — Concurrent threads each get their own collector. No locks, no contention. Merging happens only after thread completion, when the executor is back in single-threaded control flow.
- Immutable spans, mutable builders —
ExecutionSpanis frozen for safe sharing across threads and inclusion in result objects.SpanBuilderprovides a natural accumulation API during execution. Thefinish()method enforces the transition from mutable to immutable. - Two views of the same data — The trace tree (
LoomTrace→WeaveTrace→ThreadTrace) provides hierarchical navigation for programmatic access. Theto_spans()method provides a flat list for serialization and export.
Further reading¶
- Observability — User-facing guide to logging, spans, and telemetry access
- Implement Custom Telemetry Sink — Export telemetry to Delta or external systems