Release Notes — v1.11

Release date: April 2026

This release adds:

  • Connections — declare Fabric lakehouses by name, with context variables, for cross-lakehouse data pipelines
  • Sub-pipelines (with: block) — CTE-style intermediate transformations
  • Generated sources (date_sequence, int_sequence) — create data without external tables
  • Warp schema contracts with schema drift handling — target-boundary validation and evolution control


Connections

Declare a Fabric lakehouse once by name and reference it from sources, targets, and exports. This eliminates hardcoded abfss:// paths and workspace GUIDs.

connections:
  raw_lh:
    type: onelake
    workspace: "${fabric.workspace_id}"
    lakehouse: "${fabric.lakehouse_id}"
    default_schema: raw

sources:
  orders:
    connection: raw_lh
    table: raw_orders

target:
  connection: raw_lh
  table: fact_orders
  schema: curated

Key features:

  • Cross-lakehouse reads and writes — reference different workspaces and lakehouses by name within the same thread
  • Fabric context variables — ${fabric.workspace_id}, ${fabric.lakehouse_id}, and ${fabric.workspace_name} auto-resolve from the active Spark session at execution time
  • Schema overrides — default_schema on the connection applies to all tables; schema: on a source or target overrides it for that specific table
  • Cascade — connections defined at loom or weave level are inherited by all threads; thread-level connections override by name
  • Export support — exports reference connections for secondary output destinations
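
As an illustrative sketch of a cross-lakehouse thread (the curated_lh connection and its workspace and lakehouse values below are hypothetical), read from one lakehouse and write to another by referencing each connection by name:

connections:
  raw_lh:
    type: onelake
    workspace: "${fabric.workspace_id}"
    lakehouse: "${fabric.lakehouse_id}"
    default_schema: raw
  curated_lh:
    type: onelake
    workspace: "${fabric.workspace_id}"   # hypothetical: could point at another workspace
    lakehouse: "curated"                  # hypothetical lakehouse
    default_schema: curated

sources:
  orders:
    connection: raw_lh
    table: raw_orders

target:
  connection: curated_lh
  table: fact_orders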

See the Connections guide for full documentation.


Sub-Pipelines (with: block)

Define named sub-pipelines that run before the main thread steps. Each sub-pipeline reads from a source or another sub-pipeline, applies its own step chain, and produces a named DataFrame available to downstream steps.

sources:
  orders:
    path: Tables/stg_orders
  customers:
    path: Tables/dim_customer

with:
  active_customers:
    from: customers
    steps:
      - filter:
          expr: "is_active = true"

steps:
  - join:
      source: active_customers
      on: [{left: customer_id, right: customer_id}]
      type: inner

Key features:

  • CTE-like composition — named intermediate results without writing to storage
  • Chaining — sub-pipelines can reference earlier sub-pipelines by name
  • Full step support — each sub-pipeline accepts any step type (filter, select, join, rename, etc.)
  • Explain visibility — sub-pipelines appear in plan output with row counts and step counts
  • Telemetry — each sub-pipeline emits its own span under the thread span
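
Chaining can be sketched by pointing one sub-pipeline's from: at another. The second sub-pipeline and its signup_date filter below are illustrative, not from the release:

with:
  active_customers:
    from: customers
    steps:
      - filter:
          expr: "is_active = true"
  recent_active_customers:
    from: active_customers        # references the sub-pipeline above
    steps:
      - filter:
          expr: "signup_date >= '2024-01-01'"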

See the Sub-Pipelines guide for full documentation.


Generated Sources

Two new source types produce single-column DataFrames from a range definition — no external table required.

date_sequence

Generates one row per calendar interval over a date range.

sources:
  calendar:
    type: date_sequence
    start: "2024-01-01"
    end: "2024-12-31"
    column: date
    step: day

  • step — day (default), week, month, or year
  • Output column is DateType
  • Both start and end are inclusive
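
A minimal Python sketch of the documented semantics (inclusive bounds, calendar steps) — illustrative only, not the library's implementation, which produces a DataFrame with a DateType column:

```python
from datetime import date, timedelta

def date_sequence(start: date, end: date, step: str = "day"):
    """Yield one date per interval from start to end, both inclusive.
    Note: month/year stepping assumes start's day-of-month exists in
    every generated month (e.g. day 1)."""
    current = start
    while current <= end:
        yield current
        if step == "day":
            current += timedelta(days=1)
        elif step == "week":
            current += timedelta(weeks=1)
        elif step in ("month", "year"):
            months = 12 if step == "year" else 1
            total = current.year * 12 + (current.month - 1) + months
            current = current.replace(year=total // 12, month=total % 12 + 1)
        else:
            raise ValueError(f"unknown step: {step}")
```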

int_sequence

Generates one row per integer over a numeric range.

sources:
  ids:
    type: int_sequence
    start: 1
    end: 10000
    column: id
    step: 1

  • step — any positive integer (default 1)
  • Output column is LongType
  • Both start and end are inclusive
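
The inclusive-end semantics differ from Python's range, which excludes its stop value. A sketch of the equivalent (illustrative only — the actual source yields a LongType column):

```python
def int_sequence(start: int, end: int, step: int = 1) -> list[int]:
    """One value per step from start to end, both inclusive."""
    if step <= 0:
        raise ValueError("step must be a positive integer")
    return list(range(start, end + 1, step))
```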

Generated sources are mutually exclusive with alias, path, connection, lookup, options, and dedup.


Warp Schema Contracts

Warps are typed configuration files (.warp) that declare the intended shape of a target table — columns, types, nullability, and keys. They enable contract validation, pre-initialization, stub columns, and documentation independent of pipeline execution.

# dim_customer.warp
config_version: "1.0"
columns:
  - name: customer_sk
    type: bigint
    nullable: false
  - name: customer_id
    type: string
    nullable: false
  - name: customer_name
    type: string
  - name: email
    type: string
    default: "unknown"
keys:
  surrogate: customer_sk
  business: [customer_id]

Warp enforcement

Validates pipeline output against the declared contract.

target:
  alias: gold.dim_customer
  warp: dim_customer
  warp_enforcement: enforce

Three modes:

  • warn (default) — log findings, continue execution
  • enforce — fail the thread on any violation
  • off — skip validation

Validation checks for missing columns, type mismatches, and nullability violations. Findings include column-level detail in both log messages and the WarpEnforcementError payload.

Warp-only columns

Columns declared in the warp but absent from pipeline output are appended with their declared default value (or null). This enables stub columns for downstream consumers without requiring the pipeline to produce them.
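
A sketch of this fill behavior, with plain dicts standing in for DataFrame rows (illustrative only, not the library's implementation):

```python
def apply_warp_only_columns(rows, warp_columns):
    """Append columns declared in the warp but absent from the output,
    filled with the declared default (or None)."""
    out = []
    for row in rows:
        filled = dict(row)
        for col in warp_columns:
            filled.setdefault(col["name"], col.get("default"))
        out.append(filled)
    return out
```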

Pre-initialization

Set warp_init: true to create the Delta table from the effective warp before the first pipeline run. Solves the seed-row chicken-and-egg problem for dimension targets.
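
In config terms, a minimal sketch using the keys documented above:

target:
  alias: gold.dim_customer
  warp: dim_customer
  warp_init: true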

Auto-generation

Set warp_mode: auto to write or update the .warp file after each successful run. The generated file carries auto_generated: true as a marker.

See the Warps guide for full documentation.


Schema Drift Handling

Configurable behavior for extra columns that reach the target boundary after upstream source changes. Operates with or without a warp.

target:
  alias: gold.dim_customer
  schema_drift: strict
  on_drift: warn

Drift modes

  • lenient (default) — extra columns pass through
  • strict — extra columns handled per on_drift
  • adaptive — extra columns pass through and extend auto-generated warps

on_drift severity (strict mode)

  • error — thread fails with SchemaDriftError
  • warn (default) — extra columns dropped with a warning
  • ignore — extra columns dropped silently

Drift baseline

When a warp exists, the warp column list is the baseline. Without a warp, the existing Delta table schema is used. On first write with no table, drift detection is skipped.
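
The baseline selection and set-difference check can be sketched as follows (illustrative only):

```python
def detect_drift(output_columns, warp_columns=None, table_columns=None):
    """Return extra columns at the target boundary.
    Baseline is the warp column list when a warp exists, else the
    existing Delta table schema; with neither (first write), skip."""
    baseline = warp_columns if warp_columns is not None else table_columns
    if baseline is None:
        return []
    allowed = set(baseline)
    return [c for c in output_columns if c not in allowed]
```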

Adaptive + auto

When schema_drift: adaptive and warp_mode: auto are both active, new pipeline columns pass through to Delta and are added to the auto-generated warp with discovered: true.
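
An illustrative target configuration combining the two settings:

target:
  alias: gold.dim_customer
  warp: dim_customer
  warp_mode: auto
  schema_drift: adaptive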

Cascade

All three settings (schema_drift, on_drift, warp_enforcement) cascade through defaults.target.* at loom, weave, and thread levels.
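
For example, a loom- or weave-level defaults block might pin strict behavior for every thread (values illustrative):

defaults:
  target:
    schema_drift: strict
    on_drift: error
    warp_enforcement: enforce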

Observability

Drift events are captured regardless of mode:

  • Telemetry — drift_detected, drift_columns, drift_mode, drift_action_taken on ThreadTelemetry
  • Notebook output — drift report and warp findings sections in _repr_html_()

See the Schema Drift guide for full documentation.


Other Improvements

  • JSON Schema and LLM discoverability — all Pydantic models export JSON Schema with Field(description=...) for LLM tooling. Schema files at docs/schema/.
  • New error types — WarpEnforcementError and SchemaDriftError carry structured payloads (findings, drift_report) for programmatic error handling.
  • Plan mode warp summary — plan/explain output shows warp configuration per thread when non-default settings are present.