Config API
The weevr.config module handles YAML parsing, schema validation, configuration
inheritance, and parameter resolution across the loom/weave/thread hierarchy.
weevr.config
Configuration loading and validation.
ConfigLocation
Bases: ABC
Abstract reference to a config file or directory.
Implementations encapsulate a single addressing scheme (local filesystem or remote Hadoop URI) and expose the minimal surface area the config pipeline needs: joining, existence checks, text reads, name and parent derivation, and a containment check used for path-traversal protection.
name
abstractmethod
property
The final path segment, including any extension.
stem
abstractmethod
property
The final path segment without its extension.
suffix
abstractmethod
property
The file extension including the leading dot, or empty string.
parent
abstractmethod
property
The location one level up.
join(rel)
abstractmethod
Resolve rel against this location and return a new location.
rel must be a relative path. Implementations normalize ..
and . segments and reject inputs that look absolute.
exists()
abstractmethod
Return whether the underlying file or directory exists.
read_text()
abstractmethod
Return the file contents decoded as UTF-8.
Raises:
| Type | Description |
|---|---|
| `FileNotFoundError` | If the file does not exist. |
| `OSError` | For any other I/O failure. |
is_relative_to(other)
abstractmethod
Return whether this location is contained within other.
__str__()
abstractmethod
Return the canonical path or URI for diagnostics and logging.
__fspath__()
Return the string form for os.fspath consumers.
Remote implementations return their URI; using the result with the local filesystem will fail loudly, which is the desired behavior.
LocalConfigLocation
Bases: ConfigLocation
A ConfigLocation backed by a local filesystem path.
path
property
The underlying pathlib.Path.
name
property
The final path segment.
stem
property
The final path segment without its extension.
suffix
property
The file extension including the leading dot.
parent
property
The parent directory as a LocalConfigLocation.
__init__(path)
Wrap an absolute or relative Path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `Path` | The path to wrap. Stored as-is; resolution and existence checks are performed lazily by individual methods. | required |
join(rel)
Join rel to the underlying path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `rel` | `str` | Relative path to append. | required |
Returns:
| Type | Description |
|---|---|
| `ConfigLocation` | A new :class: |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If |
exists()
Return whether the path exists on the local filesystem.
read_text()
Read the file as UTF-8 text.
is_relative_to(other)
Return whether this path is contained within other.
Both sides are resolved to absolute paths before comparison so
relative segments such as .. are honored. Comparing across
location types always returns False.
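The containment check described above can be illustrated with plain pathlib. This is a minimal sketch, not the weevr implementation: the helper name `is_contained` and the example paths are hypothetical, and it assumes a POSIX-style filesystem.

```python
from pathlib import Path


def is_contained(candidate: Path, root: Path) -> bool:
    """Return True when candidate, once resolved, lies inside root."""
    # Resolve both sides first so ".." segments cannot escape the root.
    return candidate.resolve().is_relative_to(root.resolve())


# Hypothetical project root and candidate paths, used only for illustration.
root = Path("/srv/project")
inside = is_contained(root / "threads" / "dim_customer.thread", root)
escaped = is_contained(root / ".." / "secrets.yaml", root)
```

Resolving before comparing is what makes the check usable for path-traversal protection: a purely lexical comparison would treat `/srv/project/../secrets.yaml` as being under `/srv/project`.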
__str__()
Return the underlying path as a string.
__repr__()
Developer-friendly representation.
__eq__(other)
Two locations are equal when their underlying paths are equal.
__hash__()
Hash by underlying path.
ConfigError
Bases: WeevError
Base exception for configuration-related errors.
__init__(message, cause=None, file_path=None, config_key=None)
Initialize ConfigError.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `message` | `str` | Human-readable error message | required |
| `cause` | `Exception \| None` | Optional underlying exception | `None` |
| `file_path` | `str \| None` | Path to the config file where the error occurred | `None` |
| `config_key` | `str \| None` | Specific config key that caused the error | `None` |
__str__()
Return string representation with context.
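As a rough illustration of an exception that carries this kind of context, the sketch below mirrors the documented constructor arguments. The class name `SketchConfigError` and the exact message layout are assumptions made for the example; the real `ConfigError.__str__` output may look different.

```python
from __future__ import annotations


class SketchConfigError(Exception):
    """Hypothetical stand-in that mirrors the documented constructor arguments."""

    def __init__(
        self,
        message: str,
        cause: Exception | None = None,
        file_path: str | None = None,
        config_key: str | None = None,
    ) -> None:
        super().__init__(message)
        self.message = message
        self.cause = cause
        self.file_path = file_path
        self.config_key = config_key

    def __str__(self) -> str:
        # Append only the context fields that were actually supplied.
        parts = [self.message]
        if self.file_path:
            parts.append(f"file: {self.file_path}")
        if self.config_key:
            parts.append(f"key: {self.config_key}")
        return " | ".join(parts)


err = SketchConfigError(
    "missing target",
    file_path="threads/dim_customer.thread",
    config_key="target",
)
rendered = str(err)
```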
ModelValidationError
Bases: ConfigError
Raised when a fully resolved config fails to hydrate into a typed model.
This occurs after variable resolution and inheritance, when the concrete values are validated through the Pydantic domain model (Thread, Weave, or Loom). Semantic constraints that span multiple fields are checked here.
Loom
Bases: FrozenBase
A deployment unit containing weave references with optional shared defaults.
Thread
Bases: FrozenBase
Complete domain model for a thread configuration.
A thread is the smallest unit of work: one or more sources, a sequence of transformation steps, and a single target.
Weave
Bases: FrozenBase
A collection of thread references with optional shared defaults.
apply_inheritance(loom_defaults, weave_defaults, thread_config, *, loom_audit_templates=None, weave_audit_templates=None, loom_connections=None, weave_connections=None)
Apply multi-level inheritance cascade.
Cascade order (lowest to highest priority):
1. loom_defaults (lowest)
2. weave_defaults
3. thread_config (highest)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `loom_defaults` | `dict[str, Any] \| None` | Defaults from loom level | required |
| `weave_defaults` | `dict[str, Any] \| None` | Defaults from weave level | required |
| `thread_config` | `dict[str, Any]` | Thread-specific config | required |
| `loom_audit_templates` | `dict[str, Any] \| None` | User-defined audit template definitions from loom | `None` |
| `weave_audit_templates` | `dict[str, Any] \| None` | User-defined audit template definitions from weave | `None` |
| `loom_connections` | `dict[str, Any] \| None` | Named connection definitions from loom top-level | `None` |
| `weave_connections` | `dict[str, Any] \| None` | Named connection definitions from weave top-level | `None` |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Fully merged config with thread values taking precedence |
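The cascade order can be sketched with ordinary dictionaries. This is only an illustration under a simplifying assumption: it performs a shallow, top-level merge, whereas the real `apply_inheritance` may merge nested keys, audit templates, and connections differently. The function name `cascade` and the config keys are hypothetical.

```python
from __future__ import annotations

from typing import Any


def cascade(
    loom_defaults: dict[str, Any] | None,
    weave_defaults: dict[str, Any] | None,
    thread_config: dict[str, Any],
) -> dict[str, Any]:
    """Merge three layers; later (higher-priority) layers override earlier ones."""
    merged: dict[str, Any] = {}
    # Apply from lowest to highest priority so thread values win.
    for layer in (loom_defaults, weave_defaults, thread_config):
        merged.update(layer or {})
    return merged


# Hypothetical keys, chosen only to show which layer wins.
result = cascade(
    {"write_mode": "append", "retries": 1},
    {"retries": 3},
    {"write_mode": "overwrite"},
)
```

Here `retries` comes from the weave layer because the thread does not set it, while `write_mode` comes from the thread.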
make_location(path_or_uri, spark=None)
Construct a ConfigLocation from a path, URI, or existing location.
A string containing :// is treated as a remote URI and requires a
spark session. Anything else is treated as a local filesystem path.
Existing ConfigLocation inputs are returned unchanged.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path_or_uri` | `str \| Path \| ConfigLocation` | A local path, a URI string, or an existing :class: | required |
| `spark` | `SparkSession \| None` | Active :class: | `None` |
Returns:
| Type | Description |
|---|---|
| `ConfigLocation` | A :class: |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If a remote URI is supplied without a |
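The dispatch rule (a string containing `://` is remote, anything else is local) can be sketched as follows. The helper `classify_location`, its `has_spark` flag, and the example URI are hypothetical; the real `make_location` returns location objects rather than strings.

```python
from __future__ import annotations

from pathlib import Path


def classify_location(path_or_uri: str | Path, has_spark: bool = False) -> str:
    """Return "remote" for URI strings and "local" for everything else."""
    if isinstance(path_or_uri, str) and "://" in path_or_uri:
        # Remote URIs need a Spark session to be readable.
        if not has_spark:
            raise ValueError("remote URI requires a Spark session")
        return "remote"
    return "local"


local_kind = classify_location("configs/sales.loom")
remote_kind = classify_location("hdfs://namenode/configs/sales.loom", has_spark=True)
```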
expand_foreach(steps)
Expand foreach macro blocks into repeated step sequences.
Each foreach block in the steps list is replaced by its template steps
repeated once per value, with {var} placeholders substituted.
Non-foreach entries pass through unchanged.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `steps` | `list[dict[str, Any]]` | Raw step list (dicts), possibly containing foreach blocks. | required |
Returns:
| Type | Description |
|---|---|
| `list[dict[str, Any]]` | Expanded step list with all foreach blocks replaced. |
Raises:
| Type | Description |
|---|---|
| `ConfigError` | If a foreach block is missing required fields. |
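A minimal sketch of this kind of macro expansion is shown below. The block layout is an assumption made for illustration: a `foreach` key holding `values`, `as` (the placeholder name), and `steps` (the templates). The real weevr schema and its error type may differ; `ValueError` stands in for `ConfigError` here.

```python
from __future__ import annotations

from typing import Any


def _substitute(node: Any, var: str, value: str) -> Any:
    """Replace "{var}" in every string found inside node."""
    if isinstance(node, str):
        return node.replace("{" + var + "}", value)
    if isinstance(node, dict):
        return {k: _substitute(v, var, value) for k, v in node.items()}
    if isinstance(node, list):
        return [_substitute(v, var, value) for v in node]
    return node


def expand(steps: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Replace each foreach block with its templates, repeated once per value."""
    expanded: list[dict[str, Any]] = []
    for step in steps:
        block = step.get("foreach")
        if block is None:
            expanded.append(step)  # non-foreach entries pass through unchanged
            continue
        missing = {"values", "as", "steps"} - block.keys()
        if missing:
            raise ValueError(f"foreach block missing fields: {sorted(missing)}")
        for value in block["values"]:
            for template in block["steps"]:
                expanded.append(_substitute(template, block["as"], str(value)))
    return expanded


# Hypothetical step names, used only to show the expansion order.
steps = [
    {"filter": {"expr": "active = true"}},
    {
        "foreach": {
            "values": ["a", "b"],
            "as": "col",
            "steps": [{"rename": {"from": "{col}", "to": "{col}_clean"}}],
        }
    },
]
expanded = expand(steps)
```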
detect_config_type(raw)
Detect the type of config from its structure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `raw` | `dict[str, Any]` | Parsed config dictionary | required |
Returns:
| Type | Description |
|---|---|
| `str` | Config type: 'thread', 'weave', 'loom', or 'params' |
Raises:
| Type | Description |
|---|---|
| `ConfigParseError` | If config type cannot be determined |
detect_config_type_from_extension(path)
Detect config type from file extension.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path \| ConfigLocation` | Path or location of the config file. | required |
Returns:
| Type | Description |
|---|---|
| `str \| None` | Config type string if the extension is a typed extension ( |
Raises:
| Type | Description |
|---|---|
| `ConfigError` | If the extension is |
extract_config_version(raw)
Extract and parse config_version from a config dict.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `raw` | `dict[str, Any]` | Parsed config dictionary | required |
Returns:
| Type | Description |
|---|---|
| `tuple[int, int]` | Tuple of (major, minor) version numbers |
Raises:
| Type | Description |
|---|---|
| `ConfigParseError` | If config_version is missing or invalid format |
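Version parsing of this kind might look like the sketch below. It assumes `config_version` is written as `major.minor` (for example `"1.2"`); the accepted formats in weevr itself are not stated here, and `ValueError` stands in for `ConfigParseError`. The function name `parse_version` is hypothetical.

```python
from __future__ import annotations

from typing import Any


def parse_version(raw: dict[str, Any]) -> tuple[int, int]:
    """Return (major, minor) parsed from raw["config_version"]."""
    value = raw.get("config_version")
    if value is None:
        raise ValueError("config_version is missing")
    # str() also covers YAML values that were parsed as numbers.
    parts = str(value).split(".")
    if len(parts) != 2 or not all(p.isdigit() for p in parts):
        raise ValueError(f"invalid config_version: {value!r}")
    return int(parts[0]), int(parts[1])


version = parse_version({"config_version": "1.2"})
```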
parse_yaml(path)
Parse a YAML file and return its contents.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path \| ConfigLocation` | A local path, a URI string, or a :class: | required |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Parsed YAML content as a dictionary. |
Raises:
| Type | Description |
|---|---|
| `ConfigParseError` | If the file is not found, unreadable, or contains invalid YAML syntax. |
validate_config_version(version, config_type)
Validate that the config version is supported.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `version` | `tuple[int, int]` | Tuple of (major, minor) version | required |
| `config_type` | `str` | Type of config (thread, weave, loom, params) | required |
Raises:
| Type | Description |
|---|---|
| `ConfigVersionError` | If the major version doesn't match supported version |
build_param_context(runtime_params=None, config_defaults=None, fabric_context=None, entry_params=None)
Build parameter context with proper priority layering.
Priority order (highest to lowest):
1. runtime_params
2. entry_params (nested under param key for ${param.x} access)
3. config_defaults
4. fabric_context
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `runtime_params` | `dict[str, Any] \| None` | Runtime parameter overrides. | `None` |
| `config_defaults` | `dict[str, Any] \| None` | Default parameters from config. | `None` |
| `fabric_context` | `dict[str, Any] \| None` | Fabric environment values keyed as | `None` |
| `entry_params` | `dict[str, Any] \| None` | ThreadEntry-level parameters injected under the | `None` |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Merged parameter context dictionary with dotted key access support. |
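The priority layering can be approximated by applying the layers from lowest to highest priority. This sketch assumes flat dictionaries and a plain `param` sub-dictionary for entry params; it does not reproduce the dotted-key access support of the real function, and the name `layer_params` and the example keys are hypothetical.

```python
from __future__ import annotations

from typing import Any


def layer_params(
    runtime_params: dict[str, Any] | None = None,
    config_defaults: dict[str, Any] | None = None,
    fabric_context: dict[str, Any] | None = None,
    entry_params: dict[str, Any] | None = None,
) -> dict[str, Any]:
    """Build a context dict; layers applied later override earlier ones."""
    context: dict[str, Any] = {}
    context.update(fabric_context or {})   # lowest priority
    context.update(config_defaults or {})
    if entry_params:
        # Entry params live under "param" so ${param.x} can address them.
        context["param"] = dict(entry_params)
    context.update(runtime_params or {})   # highest priority
    return context


context = layer_params(
    runtime_params={"env": "prod"},
    config_defaults={"env": "dev", "batch": 100},
    fabric_context={"workspace": "w1"},
    entry_params={"region": "eu"},
)
```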
resolve_declared_params(param_specs, runtime_params, *, file_path=None)
Resolve loom/weave-level declared params: to a flat {name: value} dict.
Precedence per declared param:
- Value from runtime_params if supplied
- ParamSpec.default if set on the spec
- ConfigSchemaError if the param is required
- Omitted from the result if optional with no default
The returned dict is intended to bind under the param.* namespace via
build_param_context's entry_params argument, so that
${param.x} expressions resolve to the layered value.
Runtime keys not declared in param_specs are ignored; declared scope
is the only contract honored at this layer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `param_specs` | `dict[str, Any] \| None` | Mapping of declared params ( | required |
| `runtime_params` | `dict[str, Any] \| None` | Caller-supplied values keyed by param name. | required |
| `file_path` | `str \| None` | Optional originating config file path for error messages. | `None` |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Resolved |
Raises:
| Type | Description |
|---|---|
| `ConfigSchemaError` | A required declared param has no runtime value and no spec default. The message includes the file path and param name. |
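The precedence rules above can be sketched as a single loop over the declared specs. The spec shape used here (plain dicts with optional `default` and `required` keys) is an assumption for illustration, not the actual `ParamSpec` model, and `ValueError` stands in for `ConfigSchemaError`.

```python
from __future__ import annotations

from typing import Any


def resolve_declared(
    param_specs: dict[str, dict[str, Any]] | None,
    runtime_params: dict[str, Any] | None,
) -> dict[str, Any]:
    """Resolve declared params to a flat {name: value} dict."""
    runtime = runtime_params or {}
    resolved: dict[str, Any] = {}
    for name, spec in (param_specs or {}).items():
        if name in runtime:
            resolved[name] = runtime[name]        # 1. runtime value wins
        elif "default" in spec:
            resolved[name] = spec["default"]      # 2. fall back to the spec default
        elif spec.get("required", False):
            raise ValueError(f"required param {name!r} has no value")  # 3. error
        # 4. optional with no default: omitted from the result
    return resolved


specs = {
    "region": {"required": True},
    "batch": {"default": 100},
    "note": {"required": False},
}
# "undeclared" is ignored because it is not present in specs.
resolved = resolve_declared(specs, {"region": "eu", "undeclared": 1})
```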
resolve_references(config, config_type, project_root, runtime_params=None, visited=None)
Resolve references to other config files.
Handles both external references (ref key) and inline definitions
(name + body keys). Recursively loads referenced configs with
circular reference detection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `dict[str, Any]` | Config dict to resolve references in. | required |
| `config_type` | `str` | Type of this config ( | required |
| `project_root` | `ConfigLocation \| Path` | The | required |
| `runtime_params` | `dict[str, Any] \| None` | Runtime parameters to pass to child configs. | `None` |
| `visited` | `set[str] \| None` | Set of already-visited ref strings (for cycle detection). | `None` |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Config dict with resolved child configs attached under |
Raises:
| Type | Description |
|---|---|
| `ReferenceResolutionError` | If referenced file not found or circular reference detected. |
| `ConfigError` | If an inline definition is missing a |
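Cycle detection with a visited set can be sketched against an in-memory mapping instead of real files. Everything specific here is hypothetical: the `refs` and `children` keys, the file names, and the use of `ValueError` and `FileNotFoundError` in place of `ReferenceResolutionError`.

```python
from __future__ import annotations

from typing import Any


def resolve_refs(
    ref: str,
    files: dict[str, dict[str, Any]],
    visited: set[str] | None = None,
) -> dict[str, Any]:
    """Load ref from files and recursively attach its resolved children."""
    visited = set() if visited is None else visited
    if ref in visited:
        raise ValueError(f"circular reference detected at {ref!r}")
    if ref not in files:
        raise FileNotFoundError(ref)
    config = dict(files[ref])
    # Extend a copy of the visited set so sibling branches stay independent.
    chain = visited | {ref}
    config["children"] = [
        resolve_refs(child, files, chain) for child in config.get("refs", [])
    ]
    return config


files = {
    "sales.loom": {"refs": ["daily.weave"]},
    "daily.weave": {"refs": ["dim_customer.thread"]},
    "dim_customer.thread": {},
}
tree = resolve_refs("sales.loom", files)
```

Tracking the chain of ancestors, rather than every file ever loaded, lets two weaves share the same thread without being reported as a cycle.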
resolve_variables(config, context, consumed_keys=None)
Recursively resolve variable references in config.
Supports:
- ${var}: simple variable reference (error if not found)
- ${var:-default}: variable with fallback default
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `dict[str, Any] \| list[Any] \| str \| Any` | Config structure to resolve (dict, list, str, or primitive) | required |
| `context` | `dict[str, Any]` | Parameter context for variable lookup | required |
| `consumed_keys` | `set[str] \| None` | Optional set to track which context keys were consumed during resolution. When provided, each resolved dotted key is added to the set for post-resolution unused-param analysis. | `None` |
Returns:
| Type | Description |
|---|---|
| `Any` | Config with all variables resolved |
Raises:
| Type | Description |
|---|---|
| `VariableResolutionError` | If variable not found and no default provided |
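The two supported forms can be handled with one regular expression and a recursive walk. This is a simplified sketch: it always substitutes the string form of a value, it does not track consumed keys, and `KeyError` stands in for `VariableResolutionError`. The names `resolve` and `_lookup` are hypothetical.

```python
from __future__ import annotations

import re
from typing import Any

# Matches ${name} and ${name:-default}; name may be dotted (for example param.x).
_VAR = re.compile(r"\$\{([^}:]+)(?::-([^}]*))?\}")


def _lookup(context: dict[str, Any], dotted: str) -> Any:
    """Walk a dotted key through nested dicts, raising KeyError when absent."""
    node: Any = context
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(dotted)
        node = node[part]
    return node


def resolve(node: Any, context: dict[str, Any]) -> Any:
    """Recursively substitute variables in dicts, lists, and strings."""
    if isinstance(node, dict):
        return {k: resolve(v, context) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve(v, context) for v in node]
    if not isinstance(node, str):
        return node  # primitives pass through unchanged

    def _replace(match: re.Match[str]) -> str:
        key, default = match.group(1), match.group(2)
        try:
            return str(_lookup(context, key))
        except KeyError:
            if default is not None:
                return default
            raise

    return _VAR.sub(_replace, node)


context = {"env": "prod", "param": {"region": "eu"}}
config = {
    "path": "/data/${env}/${param.region}",
    "mode": "${mode:-append}",
    "retries": 3,
}
resolved = resolve(config, context)
```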
validate_params(param_specs, context)
Validate parameters against their type specifications.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `param_specs` | `dict[str, Any] \| None` | Parameter specifications from config | required |
| `context` | `dict[str, Any]` | Actual parameter values | required |
Raises:
| Type | Description |
|---|---|
| `ConfigSchemaError` | If required params missing or type mismatches |
validate_schema(raw, config_type)
Validate a raw config dict against the appropriate pre-resolution schema.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `raw` | `dict[str, Any]` | Raw config dictionary (variables not yet resolved) | required |
| `config_type` | `str` | Type of config (thread, weave, loom, params) | required |
Returns:
| Type | Description |
|---|---|
| `BaseModel` | Validated Pydantic model instance |
Raises:
| Type | Description |
|---|---|
| `ConfigSchemaError` | If validation fails |
_derive_config_name(path)
Derive the component name from a config location.
Returns the filename stem — the filename without the typed extension.
For example, dim_customer.thread returns 'dim_customer'.
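For local paths, the stem derivation matches what pathlib already provides. The sketch below also shows a name check of the kind the loading pipeline performs against the filename stem; both helper names are hypothetical, and `ValueError` stands in for `ConfigError`.

```python
from pathlib import Path


def derive_name(path: str) -> str:
    """Return the filename stem: the filename without its final extension."""
    return Path(path).stem


def check_declared_name(path: str, declared_name: str) -> None:
    """Raise ValueError when the declared name does not match the filename stem."""
    stem = derive_name(path)
    if declared_name != stem:
        raise ValueError(
            f"name {declared_name!r} does not match filename stem {stem!r}"
        )


name = derive_name("threads/dim_customer.thread")
check_declared_name("threads/dim_customer.thread", "dim_customer")  # passes silently
```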
load_config(path, runtime_params=None, project_root=None)
Load and validate a weevr configuration file.
This function orchestrates the full config loading pipeline:
1. Parse YAML file
2. Extract and validate config_version
3. Detect config type (extension-based for components, content-based for params)
4. Validate schema with Pydantic
5. Build parameter context (runtime > defaults)
6. Resolve variable references (${var} and ${var:-default})
7. Resolve references to child configs (threads, weaves)
8. Apply inheritance cascade (loom -> weave -> thread)
9. Validate name against filename stem
10. Hydrate into typed domain model (thread, weave, loom only)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path \| ConfigLocation` | Path or :class: | required |
| `runtime_params` | `dict[str, Any] \| None` | Optional runtime parameter overrides. | `None` |
| `project_root` | `Path \| ConfigLocation \| None` | The | `None` |
Returns:
| Type | Description |
|---|---|
| `Thread \| Weave \| Loom \| dict[str, Any]` | A frozen, typed domain model instance (Thread, Weave, or Loom) for thread/weave/loom config types. Returns a plain dict for params configs. |
Raises:
| Type | Description |
|---|---|
| `ConfigParseError` | YAML syntax errors, file not found |
| `ConfigVersionError` | Unsupported config version |
| `ConfigSchemaError` | Schema validation failures |
| `ConfigError` | Extension or name validation failures |
| `VariableResolutionError` | Unresolved variables without defaults |
| `ReferenceResolutionError` | Missing referenced files, circular dependencies |
| `ModelValidationError` | Semantic validation failures during model hydration |