Schema Drift¶
Schema drift occurs when extra columns from upstream source changes flow through the pipeline to the target boundary. weevr detects drift and applies configurable handling at write time.
What drift detects¶
Drift detection compares the pipeline output columns against a baseline. Extra columns (present in the output but not in the baseline) are drift. Missing columns are a separate concern handled by warp enforcement.
Baseline sources¶
| Condition | Baseline |
|---|---|
| Warp exists | Warp column names |
| No warp, table exists | Existing Delta table schema |
| No warp, no table | No baseline (drift skipped) |
Drift modes¶
The schema_drift setting controls the strategy. The on_drift
setting controls the severity for strict mode.
Behavior matrix¶
schema_drift |
on_drift |
Extra columns | Result |
|---|---|---|---|
lenient |
(ignored) | Pass through | No change |
strict |
error |
Detected | Thread fails |
strict |
warn |
Dropped | Warning logged |
strict |
ignore |
Dropped | Silent |
adaptive |
(ignored) | Pass through | Evolve warp |
lenient (default)¶
Extra columns pass through to the target unchanged. A drift report is populated for visibility but no action is taken.
strict¶
Extra columns are detected and handled per on_drift:
- error — the thread fails with a
SchemaDriftError - warn — extra columns are dropped with a logged warning
- ignore — extra columns are dropped silently
adaptive¶
Extra columns pass through to the target. When combined with
warp_mode: auto, the auto-generated warp file is updated
with the new columns marked discovered: true.
Configuration¶
Set drift mode on the target or cascade from defaults:
Cascade via defaults.target¶
All three drift/warp keys cascade through
defaults.target.* at loom, weave, and thread levels:
Thread-level settings override weave, which overrides loom.
Drift with warps¶
When a warp is present, the warp column list is the baseline. This provides a stricter contract than the existing table schema because the warp reflects intentional declarations.
Drift without warps¶
When no warp exists, the existing Delta table schema is used as the baseline. This provides value even for threads that have not adopted warps — any new column from source schema changes is detected at the target boundary.
On first write (no table exists yet), there is no baseline, so drift detection is skipped.
Adaptive + auto interaction¶
When schema_drift: adaptive and warp_mode: auto are both
active:
- Extra columns pass through to Delta (adaptive behavior)
- The auto-generated warp file includes the new columns
- New columns are marked
discovered: truein the warp
This creates a living schema document that tracks actual output
evolution. Removing the discovered flag from a column signals
intentional curation.
Observability¶
Drift events are captured regardless of mode:
- Telemetry —
drift_detected,drift_columns,drift_mode, anddrift_action_takenfields on theThreadTelemetryobject - Notebook output — drift report section in the
_repr_html_()display with extra columns and action taken - Logging — drift findings logged at the appropriate level
Configuration reference¶
| Key | Level | Default | Description |
|---|---|---|---|
target.schema_drift |
Cascadable | lenient |
Drift mode |
target.on_drift |
Cascadable | warn |
Strict severity |