# Release Notes — v1.15
Release date: April 2026
This release lets the generic CDC load path compose with a
watermark column, so append-only change data capture history
tables no longer reread the full history on every run. The
dominant use case is SAP data landed by Fabric Open Database
Mirror, where every change row carries an operation flag like
OPFLAG and a change timestamp like AEDATTM.
## CDC + watermark composition
Prior releases rejected this configuration at model validation
time, under the assumption that any CDC use meant the Delta
Change Data Feed preset. The generic CDC path — the one with an
explicit operation_column and I/U/D value mapping — was caught
in the same rejection even though it had no built-in incremental
mechanism and left users rereading the entire source on every
thread run.
Starting with v1.15, `mode: cdc` and `watermark_column` compose
for the generic path:
```yaml
load:
  mode: cdc
  cdc:
    operation_column: OPFLAG
    insert_value: "I"
    update_value: "U"
    delete_value: "D"
    on_delete: soft_delete
  watermark_column: AEDATTM
  watermark_type: timestamp
  # watermark_store omitted: defaults to table_properties
```
On the first run weevr reads the full source and captures
max(AEDATTM) as the new high-water mark. On subsequent runs
the source read is narrowed to rows past the stored HWM, the
usual I/U/D routing runs over only the new window, and the HWM
advances after a successful write. Steady-state cost drops from
O(history) to O(delta).
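The first-run/steady-state behavior can be sketched in plain Python. Lists of dicts stand in for a DataFrame, and the function name is illustrative, not weevr's actual API:

```python
# Minimal sketch of the CDC + watermark run loop. Rows stand in for a
# DataFrame; the names here are illustrative, not weevr's actual API.
def run_cdc_with_watermark(rows, stored_hwm, watermark_column="AEDATTM"):
    """Return the rows to process this run and the new high-water mark."""
    if stored_hwm is None:
        window = list(rows)  # first run: full source read
    else:
        # subsequent runs: narrow the read to rows past the stored HWM
        window = [r for r in rows if r[watermark_column] > stored_hwm]
    # capture max(watermark_column) over the window; keep the prior HWM
    # when the window is empty
    new_hwm = max((r[watermark_column] for r in window), default=stored_hwm)
    return window, new_hwm
```

An empty window leaves the prior HWM in place, so a quiet source does not regress the watermark.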
## What stays the same

- The Delta CDF preset (`cdc.preset: delta_cdf`) is unchanged. It still rejects `watermark_column` — CDF's commit-version tracking is the incremental mechanism for that path, and combining the two would be redundant.
- Generic CDC threads that don't set `watermark_column` keep full-source read behavior. No existing configuration changes meaning.
- Failure semantics mirror `incremental_watermark`: the HWM is persisted only after a successful write, so a mid-run failure leaves the prior HWM in place and the next run idempotently reprocesses the same window via CDC merge match keys.
- Both `watermark_store: table_properties` (the zero-config default) and `watermark_store: metadata_table` backends work.
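For contrast with the composition example above, a preset-based thread stays watermark-free. This fragment is illustrative; field nesting follows the generic example above:

```yaml
load:
  mode: cdc
  cdc:
    preset: delta_cdf
  # adding watermark_column here is still rejected at validation time:
  # CDF's commit-version tracking already provides the incremental window
```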
## Delete rows advance the watermark
The HWM is captured from the filtered DataFrame before any
I/U/D routing. That means delete rows (the D branch of the
operation column) still participate in max(watermark_column)
— their change timestamp counts toward advancing the window,
even though they route to the delete path during the merge. This
avoids a subtle bug where a run that only saw delete rows would
fail to advance the HWM and reread them on the next run.
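The ordering can be shown in a few lines: the HWM is taken from the whole window first, then rows are routed by operation flag. The function name is illustrative, not weevr's internals:

```python
# Sketch: capture the HWM from the filtered window *before* I/U/D routing,
# so a run that only sees delete rows still advances it.
def capture_hwm_then_route(window, watermark_column="AEDATTM",
                           operation_column="OPFLAG"):
    # HWM first, over every row in the window, deletes included
    new_hwm = max((r[watermark_column] for r in window), default=None)
    # routing second: I/U rows to the upsert path, D rows to the delete path
    upserts = [r for r in window if r[operation_column] in ("I", "U")]
    deletes = [r for r in window if r[operation_column] == "D"]
    return upserts, deletes, new_hwm
```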
## All three watermark types supported

`watermark_type: timestamp`, `date`, and `long` all work for CDC
composition, via the same `build_watermark_filter` helper that
`incremental_watermark` mode uses. `watermark_inclusive: true`
behaves identically to incremental watermark mode: the filter
becomes `>= prior_hwm` instead of `> prior_hwm`, which is the
safer default when pairing with merge or overwrite writes.
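A hedged sketch of such a filter builder, with a plain-Python predicate standing in for the column expression; the real `build_watermark_filter` may differ in signature and return type:

```python
from datetime import date, datetime

# Hedged sketch of a watermark filter builder covering the three
# watermark_type values and the watermark_inclusive flag.
def build_watermark_filter(column, watermark_type, prior_hwm, inclusive=False):
    cast = {"timestamp": datetime.fromisoformat,
            "date": date.fromisoformat,
            "long": int}[watermark_type]
    bound = cast(prior_hwm)
    if inclusive:
        # watermark_inclusive: true -> >= prior_hwm
        return lambda row: cast(row[column]) >= bound
    return lambda row: cast(row[column]) > bound
```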
## Configuration summary
No new fields were added. The change is a validator relaxation plus an internal wiring change, so the public YAML schema and the state schema are both unchanged.
Cross-field rules now enforced by `LoadConfig`:

| Rule | Effect |
|---|---|
| `mode=cdc` + `cdc.operation_column` + `watermark_column` | Accepted; composition path |
| `mode=cdc` + `cdc.preset=delta_cdf` + `watermark_column` | Rejected with a preset-specific error |
| `watermark_column` set without `watermark_type` in cdc mode | Rejected |
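The rules in the table can be restated as a plain function; this is an illustrative re-statement, not weevr's actual `LoadConfig` validator:

```python
# Cross-field validation mirroring the rules table above.
def validate_load(mode, cdc=None, watermark_column=None, watermark_type=None):
    cdc = cdc or {}
    if mode == "cdc" and watermark_column is not None:
        if cdc.get("preset") == "delta_cdf":
            # preset-specific error: CDF already tracks commit versions
            raise ValueError(
                "cdc.preset=delta_cdf already provides incremental reads; "
                "watermark_column is redundant on this path")
        if watermark_type is None:
            raise ValueError(
                "watermark_column requires watermark_type in cdc mode")
```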
See the load configuration reference for the full field table, and
the thread YAML schema for labelled YAML examples of all three load
patterns (`incremental_watermark`, CDC via the `delta_cdf` preset,
and CDC with a watermark column).
## Internal changes

`read_cdc_source` now returns `tuple[DataFrame, str | None]`.
The second element is the HWM captured from this run, or `None`
when nothing was captured (CDF preset path, empty first run, or
empty subsequent window). The executor unpacks the tuple, wires
`new_hwm` through the existing `save_watermark` plumbing, and
only sets `last_version` on the CDF preset branch — explicit,
with no reliance on incidental exception behavior in `int()`.
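The executor-side shape can be sketched as follows. The stub stands in for `read_cdc_source` (whose real return is a Spark DataFrame plus the captured HWM), and `execute_thread`/`save_watermark` names are illustrative:

```python
# Stub mirroring read_cdc_source's tuple[DataFrame, str | None] contract.
def read_cdc_source_stub(rows, stored_hwm):
    window = [r for r in rows
              if stored_hwm is None or r["AEDATTM"] > stored_hwm]
    new_hwm = max((r["AEDATTM"] for r in window), default=None)
    return window, new_hwm  # None when nothing was captured

def execute_thread(rows, stored_hwm, save_watermark):
    df, new_hwm = read_cdc_source_stub(rows, stored_hwm)
    # ... merge/write df to the target here ...
    if new_hwm is not None:
        save_watermark(new_hwm)  # persist only after a successful write
    return df
```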
## Backwards compatibility
- Existing configurations keep working without changes.
- No new YAML keys, no new state schema, no new credentials.
- Only an internal helper signature changed; the executor is the sole non-test caller and was migrated in the same commit.