Release Notes — v1.2

Release date: March 2026

This release adds narrow lookup projection and expanded hash algorithms — improving memory efficiency for lookup-heavy pipelines and offering more flexibility for key generation.


Narrow Lookup Projection

Lookups now support key, values, filter, and unique_key fields that control projection and filtering applied during materialization:

lookups:
  dim_product:
    source:
      type: delta
      alias: silver.dim_product
    materialize: true
    key: [product_id]
    values: [product_name, category]
    filter: "is_active = true"
    unique_key: true

• key — Column(s) used for matching. Always retained in the cached projection.
• values — Payload column(s) to keep. When set, only key + values columns are cached, reducing memory usage.
• filter — SQL WHERE expression applied before projection.
• unique_key — Validates that key columns form a unique key after filtering. Controlled by on_failure (abort or warn).

These fields are optional. Existing lookup configurations continue to work without changes.
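
The following is a minimal sketch of the projection, filter, and uniqueness semantics described above, written in plain Python over dict rows. It is not the library's implementation: the function name, the callable filter (standing in for the SQL WHERE expression), and the raised ValueError (standing in for on_failure: abort) are all illustrative assumptions.

```python
def materialize_lookup(rows, key, values=None, row_filter=None, unique_key=False):
    """Illustrative stand-in for lookup materialization (not the real API)."""
    if row_filter is not None:
        # filter is applied before projection
        rows = [r for r in rows if row_filter(r)]
    if values is not None:
        # cache only key + values columns, dropping the rest
        keep = set(key) | set(values)
        rows = [{c: r[c] for c in r if c in keep} for r in rows]
    if unique_key:
        # validate that key columns are unique after filtering
        seen = set()
        for r in rows:
            k = tuple(r[c] for c in key)
            if k in seen:
                raise ValueError(f"duplicate key {k} after filtering")
            seen.add(k)
    return rows

rows = [
    {"product_id": 1, "product_name": "Widget", "category": "A", "is_active": True},
    {"product_id": 2, "product_name": "Gadget", "category": "B", "is_active": False},
]
cache = materialize_lookup(
    rows,
    key=["product_id"],
    values=["product_name", "category"],
    row_filter=lambda r: r["is_active"],   # stands in for "is_active = true"
    unique_key=True,
)
print(cache)  # only the active row remains, without the is_active column
```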

Expanded Hash Algorithms

Surrogate key and change detection hash generation now supports eight algorithms:

Algorithm   Output Type   Notes
xxhash64    LongType      Fast, non-cryptographic 64-bit hash
sha1        StringType    160-bit cryptographic hash
sha256      StringType    Default for surrogate keys
sha384      StringType    384-bit cryptographic hash
sha512      StringType    512-bit cryptographic hash
md5         StringType    Default for change detection
crc32       IntegerType   32-bit CRC — small output, higher collision risk
murmur3     IntegerType   Spark's hash() — may vary across Spark versions
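
To make the output-type split concrete, the snippet below reproduces the two families with Python standard-library stand-ins: hashlib for the sha*/md5 hex-string algorithms and zlib.crc32 for an integer-returning algorithm. xxhash64 and murmur3 are Spark-side functions with no stdlib equivalent and are not reproduced here; the input string is invented for illustration.

```python
import hashlib
import zlib

value = "order-1001|2026-03-01"  # illustrative key material

# sha*/md5 return hex strings (StringType in the table above)
sha256_hex = hashlib.sha256(value.encode()).hexdigest()  # 64 hex chars
md5_hex = hashlib.md5(value.encode()).hexdigest()        # 32 hex chars

# crc32 returns an unsigned 32-bit integer (an integer type in the table above)
crc = zlib.crc32(value.encode())

print(len(sha256_hex), len(md5_hex), crc)
```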

A new output field controls the return type for integer-returning algorithms (xxhash64, crc32, murmur3):

keys:
  surrogate_key:
    name: sk_order
    algorithm: xxhash64
    output: string     # cast to StringType instead of native LongType

• native (default) — preserves the algorithm's return type
• string — casts to StringType

This has no effect on sha*/md5 algorithms, which already return hex strings.
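
A small sketch of the cast behavior, again using zlib.crc32 as a stand-in for the integer-returning algorithms; the helper name and signature are hypothetical, not part of the library's API.

```python
import zlib

def hashed_key(value: str, output: str = "native"):
    """Hypothetical helper: integer hash, optionally cast to a decimal string."""
    h = zlib.crc32(value.encode())  # stand-in for crc32/xxhash64/murmur3
    # output: string mirrors casting the native integer to StringType
    return str(h) if output == "string" else h

native = hashed_key("order-1001")            # int, unchanged
as_str = hashed_key("order-1001", "string")  # decimal string of the same value
print(type(native).__name__, type(as_str).__name__)
```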

Compatibility

No breaking changes. All existing configurations continue to work without modification.

Component                  Version
Python                     3.11
PySpark                    3.5.x
Delta Lake                 3.2.x
Microsoft Fabric Runtime   1.3