
Connections

Connections let you declare a Fabric lakehouse once by name and reference it from sources, targets, and exports throughout a thread. Instead of embedding workspace and lakehouse GUIDs — or hardcoded abfss:// paths — in every source and target block, you define the connection in one place and refer to it by a short name.

When to use connections

| Pattern | Use this |
| --- | --- |
| Single-lakehouse thread, tables registered in the metastore | alias: |
| Single-lakehouse thread, raw OneLake path | path: |
| Reading from or writing to a different workspace or lakehouse | connections: |
| Multiple threads sharing the same remote lakehouse | connections: at weave or loom level |
| Portability across environments without changing GUIDs | connections: with ${fabric.*} |

Connection block structure

Connections are declared in a connections: map at the thread, weave, or loom level. Each key is a logical name; the value is a connection object.

connections:
  staging:
    type: onelake
    workspace: "a1b2c3d4-0000-0000-0000-111111111111"
    lakehouse: "e5f6a7b8-0000-0000-0000-222222222222"
    default_schema: staging

Connection fields

| Field | Required | Description |
| --- | --- | --- |
| type | Yes | Connection type. Only onelake is supported in v1. |
| workspace | Yes | OneLake workspace GUID or ${fabric.workspace_id}. |
| lakehouse | Yes | OneLake lakehouse GUID or ${fabric.lakehouse_id}. |
| default_schema | No | Default schema for tables in this connection. |

Once declared, a connection is referenced from a source or target using connection: <name> together with table: <table_name>.
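
For example, a source that reads the raw_products table through the staging connection declared above:

sources:
  products:
    connection: staging
    table: raw_products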

Cross-lakehouse example

A common pattern is reading from a raw lakehouse and writing to a curated one in a different workspace.

connections:
  raw_lh:
    type: onelake
    workspace: "aaaaaaaa-0000-0000-0000-000000000001"
    lakehouse: "bbbbbbbb-0000-0000-0000-000000000002"
    default_schema: raw

  curated_lh:
    type: onelake
    workspace: "cccccccc-0000-0000-0000-000000000003"
    lakehouse: "dddddddd-0000-0000-0000-000000000004"
    default_schema: curated

sources:
  orders:
    connection: raw_lh
    table: orders

  customers:
    connection: raw_lh
    table: customers

target:
  connection: curated_lh
  table: fact_orders

The engine constructs the abfss:// path for each source and target at execution time from the workspace, lakehouse, schema, and table name.
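
For example, the orders source above (connection raw_lh, default schema raw) resolves to a path of the following form, matching the manual abfss:// layout shown in the migration section below:

abfss://aaaaaaaa-0000-0000-0000-000000000001@onelake.dfs.fabric.microsoft.com/bbbbbbbb-0000-0000-0000-000000000002/Tables/raw/orders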

Auto-detection with ${fabric.*}

When a thread reads from and writes to the lakehouse attached to the current Fabric session, you can use ${fabric.workspace_id} and ${fabric.lakehouse_id} rather than hardcoding GUIDs. The engine reads these values from the active Spark session at execution time.

connections:
  current:
    type: onelake
    workspace: "${fabric.workspace_id}"
    lakehouse: "${fabric.lakehouse_id}"

sources:
  orders:
    connection: current
    table: raw_orders

target:
  connection: current
  table: fact_orders

This pattern makes threads portable: the same YAML file runs correctly in development, test, and production workspaces without modification.

Available ${fabric.*} variables:

| Variable | Description |
| --- | --- |
| ${fabric.workspace_id} | GUID of the current Fabric workspace |
| ${fabric.lakehouse_id} | GUID of the active lakehouse in the session |
| ${fabric.workspace_name} | Display name of the current workspace |

Schema overrides

default_schema on a connection applies to all tables unless overridden. Use schema: on a source or target to select a different schema for that specific table.

connections:
  archive:
    type: onelake
    workspace: "aaaaaaaa-0000-0000-0000-000000000001"
    lakehouse: "bbbbbbbb-0000-0000-0000-000000000002"
    default_schema: current_year

sources:
  transactions:
    connection: archive
    table: transactions         # resolves to current_year.transactions

  historical:
    connection: archive
    schema: prior_year          # overrides default_schema
    table: transactions         # resolves to prior_year.transactions

The schema: field on a source or target always takes precedence over default_schema on the connection.

Source, target, and export references

Connections work the same way in sources, the primary target, and exports.

Source:

sources:
  products:
    connection: staging
    table: raw_products

Target:

target:
  connection: curated_lh
  table: dim_products

Export (delta type only):

exports:
  - name: compliance_copy
    type: delta
    connection: archive
    table: fact_orders_archive

Export connections

The connection field on exports is only valid when type: delta. Parquet, CSV, JSON, and ORC exports use path: instead.
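
For contrast, a minimal sketch of a parquet export, assuming file exports land under the lakehouse's Files area (the path value here is illustrative, not prescribed):

exports:
  - name: compliance_copy_parquet
    type: parquet
    path: "abfss://aaaaaaaa-0000-0000-0000-000000000001@onelake.dfs.fabric.microsoft.com/bbbbbbbb-0000-0000-0000-000000000002/Files/exports/fact_orders_archive"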

Cascade and override

Connections cascade from loom to weave to thread. Define shared connections at the loom level to make them available everywhere; override at the thread level when a specific thread needs a different endpoint.

# loom.loom
connections:
  shared_raw:
    type: onelake
    workspace: "${fabric.workspace_id}"
    lakehouse: "bbbbbbbb-0000-0000-0000-000000000002"

# my_thread.thread — override shared_raw with a different lakehouse
connections:
  shared_raw:
    type: onelake
    workspace: "${fabric.workspace_id}"
    lakehouse: "eeeeeeee-0000-0000-0000-000000000005"
    default_schema: overridden

The most specific level always wins. A thread-level connection with the same name as a loom-level connection replaces it entirely for that thread.

Migration from manual abfss:// paths

If you have existing threads using hardcoded abfss:// paths in source.path or target.path, you can migrate to connections gradually.

Before:

sources:
  orders:
    type: delta
    path: "abfss://aaaaaaaa-0000-0000-0000-000000000001@onelake.dfs.fabric.microsoft.com/bbbbbbbb-0000-0000-0000-000000000002/Tables/raw/orders"

target:
  path: "abfss://aaaaaaaa-0000-0000-0000-000000000001@onelake.dfs.fabric.microsoft.com/bbbbbbbb-0000-0000-0000-000000000002/Tables/curated/fact_orders"

After:

connections:
  lh:
    type: onelake
    workspace: "aaaaaaaa-0000-0000-0000-000000000001"
    lakehouse: "bbbbbbbb-0000-0000-0000-000000000002"

sources:
  orders:
    connection: lh
    schema: raw
    table: orders

target:
  connection: lh
  schema: curated
  table: fact_orders

Connections are also more maintainable when the same lakehouse is referenced across multiple threads: update the GUID in one place rather than in every file.

Column sets via connections

Named column sets defined at the loom or weave level can also resolve their mapping table through a connection. This is the right pattern when the mapping data lives in a reference lakehouse that is not the attached notebook lakehouse — for example, a portable loom that runs against arbitrary workspaces and reads its column dictionary from a centralized reference lakehouse.

connections:
  ref:
    type: onelake
    workspace: "aaaaaaaa-0000-0000-0000-000000000001"
    lakehouse: "cccccccc-0000-0000-0000-000000000003"

column_sets:
  sap_dictionary:
    source:
      type: delta
      connection: ref
      schema: dictionary
      table: sap_column_renames
      from_column: raw_name
      to_column: friendly_name
      filter: "system = 'SAP'"

The connection + table form is mutually exclusive with the alias form — the validator rejects any ColumnSetSource that sets both. When connection is set, table is required and path is rejected; schema is optional and selects the schema within the connection's lakehouse. Use connection whenever the mapping table is not guaranteed to exist in the active Spark catalog at execution time. The alias form continues to work for column sets backed by tables in the attached lakehouse.
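
For comparison, a minimal sketch of the alias form, assuming alias: names a mapping table already registered in the active Spark catalog (the surrounding fields mirror the connection example above; the exact shape of the alias form may differ):

column_sets:
  sap_dictionary:
    source:
      type: delta
      alias: sap_column_renames
      from_column: raw_name
      to_column: friendly_name
      filter: "system = 'SAP'"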