In this article ⏷

One Model, One Hash: Keeping Data Vault Keys Deterministic Across SQL Server, Fabric, and Databricks

One setting generates deterministic Data Vault keys on every platform.

June 11, 2026

A hash key is a promise. The same business key, hashed the same way, has to produce the same value every time, on every platform you land on, or the joins in your Data Vault quietly stop matching. Hubs orphan their satellites. Links point at keys that no longer exist. Nothing errors. The data is just wrong.

That promise is easy to keep while everything lives in one SQL Server. It gets hard the moment the warehouse moves. HASHBYTES doesn't exist in a Fabric Lakehouse. Databricks spells its hash functions differently and returns a different type. Snowflake has its own again. A hand-coded vault answers this with a drawer full of platform-specific SQL that someone has to keep in lockstep forever, and the day those drift is the day your keys stop reconciling across a migration.

BimlFlex treats the hash as a generated artifact, not hand-written SQL. You set one value, HashAlgorithm, and the generator emits the platform-correct expression per target. You never write the hash SQL. You receive it, and it's yours to read before it runs.

Set HashAlgorithm to SHA1 and point the project at SQL Server (or any of the SQL-surfaced targets, including Fabric Warehouse). For a CustomerNumber business key, the generated expression is fixed and deterministic:

UPPER(CONVERT(CHAR(40), HASHBYTES('SHA1', CONVERT(NVARCHAR(MAX), CustomerNumber)), 2)) 

UPPER, CHAR(40), hex conversion with the 2 style argument: every part of that exists to make the output byte-for-byte repeatable. Generate the same model for a Fabric Warehouse and the shape holds, but the cast changes to the type the engine actually supports, VARCHAR in place of NVARCHAR:

UPPER(CONVERT(CHAR(40), HASHBYTES('SHA1', CONVERT(VARCHAR(MAX), CustomerNumber)), 2)) 

Now move the same model onto a Fabric Lakehouse or Databricks, where HASHBYTES doesn't exist. The generator drops the SQL Server wrapper entirely and emits the platform's native function for the same algorithm:

-- Fabric Lakehouse / Databricks, HashAlgorithm = SHA2_256 

SHA2(CustomerNumber, 256) 

Same model, same business key, a different dialect of the same algorithm. That's the part hand-coding gets wrong over time, because nobody remembers that Fabric Warehouse needs VARCHAR where SQL Server needs NVARCHAR, and the mismatch only surfaces months later as a reconciliation failure.

The harder problem is a migration. When you move an existing SQL Server vault onto Fabric or Databricks, you usually need the new platform's keys to match the old platform's keys, so history lines up and you don't rebuild the vault from scratch. That's what the UseSqlCompatibleHash setting is for. Turn it on, and the generated hash on the new target is shaped to line up with SQL Server's HASHBYTES output rather than the target's native default. Your migrated keys reconcile against the keys you already have. That's the difference between re-pointing metadata and re-loading ten years of history.

Determinism also lives in the output's shape, not just its algorithm. The result is sized per algorithm: SHA1 is 40 characters as hex or 20 bytes as binary, SHA2-256 is 64 or 32, MD5 is 32 or 16. None of that is something you set by hand. It follows from the one algorithm choice, and it's the reason a generated CHAR(40) column on SQL Server and a native SHA2(..., 256) on Databricks can still be made to agree.

Underneath the hash is the business key it hashes, and that has to be deterministic too. In metadata, you define the key with a FlexToBk(...) expression on the source. The precedence you'll observe is straightforward: a FlexToBk(...) expression wins, then the dataflow expression, then the target expression, then the column name. The column list inside FlexToBk(col1, col2, col3) is read in order and preserved, so the same columns hash in the same order no matter which target you generate. The business key definition is metadata. The hash that protects it is generated from that metadata. Both travel with the model.

The payoff is the thing a hand-coded vault can't give you cheaply. Re-point the project at a new target, regenerate, and the keys still mean what they meant. You can read every line of the generated SQL before it runs, because it's yours, not a black box. When the engine under your warehouse changes, and over a long enough horizon it always does, the hash stops being the thing that breaks the move.