weevr

Configuration-driven execution framework for Spark in Microsoft Fabric.

weevr lets you declare data shaping intent in YAML and execute it as optimized, repeatable PySpark transformations. No code generation, no manual notebook orchestration — just deterministic, metadata-driven data pipelines.

Key features

  • Declarative YAML — Define sources, transforms, and targets in configuration
  • Spark-native — Executes via PySpark DataFrame APIs in Microsoft Fabric
  • Deterministic — Same config + inputs = same outputs, every time
  • 23 transform types — Filter, derive, join, aggregate, window, pivot, and more
  • DAG orchestration — Threads form weaves, weaves form looms, with automatic dependency resolution
  • Incremental processing — Watermark and CDC modes for efficient loads
  • Structured telemetry — Spans, events, and row counts for full observability
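
The features above start from a declarative config. The fragment below is a sketch of what a thread definition might look like; the key names (`thread`, `source`, `steps`, `target`) are illustrative assumptions, not the documented weevr schema — see the Reference section for the real configuration keys.

```yaml
# Hypothetical thread: filter and derive over a Delta table.
# All key names here are illustrative, not the official weevr schema.
thread: orders_clean
source:
  table: lakehouse.raw_orders
steps:
  - filter: "order_status != 'cancelled'"
  - derive:
      net_amount: "gross_amount - discount"
target:
  table: lakehouse.orders_clean
  mode: overwrite
```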

Quick start

```shell
pip install weevr
```

```python
from weevr import Context

ctx = Context(spark, "my-project.weevr")
result = ctx.run("nightly.loom")
```

See the Your First Loom tutorial for a complete walkthrough.

How it works

The weevr engine interprets YAML configuration, executes it through Spark and Delta Lake, and reads and writes data in Microsoft Fabric:

  • YAML Configuration — Thread (sources → steps → target), Weave (thread DAG + lookups), Loom (ordered weaves + defaults)
  • weevr Engine — Config Resolution (parse → validate → resolve), Planner (DAG → groups → cache), Executor (read → transform → write), Telemetry (spans + metrics)
  • Spark + Delta Lake — DataFrame APIs and Delta transactions
  • Microsoft Fabric — OneLake / Lakehouse storage, driven from Notebooks / Pipelines
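The Planner turns a thread DAG into ordered execution groups. The pure-Python sketch below shows one common way such dependency resolution can work (level-by-level topological sort, i.e. Kahn's algorithm); it is an illustration of the idea, not weevr's actual implementation, and the example node names are made up.

```python
from collections import defaultdict

def plan_groups(deps):
    """Group DAG nodes into levels: each level depends only on earlier
    levels, so nodes within a level are candidates for parallel execution.

    deps maps node -> list of upstream nodes it depends on.
    """
    indegree = {node: len(ups) for node, ups in deps.items()}
    downstream = defaultdict(list)
    for node, ups in deps.items():
        for up in ups:
            downstream[up].append(node)

    ready = [n for n, d in indegree.items() if d == 0]
    groups = []
    while ready:
        groups.append(sorted(ready))
        next_ready = []
        for node in ready:
            for dependent in downstream[node]:
                indegree[dependent] -= 1
                if indegree[dependent] == 0:
                    next_ready.append(dependent)
        ready = next_ready

    # If any node was never scheduled, the DAG contains a cycle.
    if sum(len(g) for g in groups) != len(deps):
        raise ValueError("cycle detected in thread DAG")
    return groups

# Hypothetical thread DAG: 'enrich' joins two sources, 'report' aggregates it.
dag = {
    "orders": [],
    "customers": [],
    "enrich": ["orders", "customers"],
    "report": ["enrich"],
}
print(plan_groups(dag))  # [['customers', 'orders'], ['enrich'], ['report']]
```

Grouping by levels (rather than emitting a flat order) is what lets an executor run independent threads concurrently within each group.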

CLI

weevr-cli is a standalone command-line companion for validating configs, inspecting schemas, and running dry-run operations outside of a notebook.

```shell
pip install weevr-cli
```

See the CLI documentation for usage and command reference.

Learn more

  • Tutorials — step-by-step guides to get started
  • How-to Guides — task-oriented recipes for common scenarios
  • Reference — YAML schema, API docs, and configuration keys
  • Concepts — architecture, design principles, and mental models