Accepted | infrastructure | 2026-04-23

SHIN-NY FHIR to Databricks Gold-Standard Publish Blueprint

Decision makers: Lakehouse Platform Lead, Interoperability Lead, Security Lead, Delivery Lead

Context

HDIM already has the control-plane contract (lakehouse-api), package materialization flow (analytics-export-service), customer-boundary execution plane (lakehouse-edge-service), local conformance engine, DQM-backed publish gating, and durable publish-run state. The remaining hardening gap is non-prod certification evidence and operational proof for the live Databricks probe, write, trigger, and downstream polling path.

The repo also contains a Databricks writer gap plan and a partner handoff brief, but those documents were written as planning aids. They do not freeze the architecture, and they leave room for an unsafe outcome where a partner-owned Databricks writer introduces a parallel publish API, bypasses DQM and preview controls, or expects raw secrets to be passed through manifests.

The customer context is SHIN-NY. The SHIN-NY FHIR integration guide is the normative upstream baseline for FHIR and operational assumptions. HDIM still needs its own canonical architectural decision for the lakehouse controls that start after SHIN-NY-conformant data is available for export.

Decision

Adopt a single certified architecture for the SHIN-NY-to-Databricks path.

The authoritative publish sequence is:

1. SHIN-NY FHIR-conformant source assumptions.
2. HDIM control-plane validation and manifest generation in analytics-export-service.
3. Preview and publish orchestration in lakehouse-edge-service.
4. Local supported-subset checks plus DQM verdict gating before any target-side write.
5. Customer-boundary Databricks execution behind DatabricksTargetPlugin.
6. Downstream trigger and run-state polling with GET /api/v1/edge/publish-runs/{runId} as the operator source of truth.
7. Audit evidence capture and syslog forwarding for probe, publish, and trigger stages.
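The ordering above can be read as a fail-closed pipeline: no later stage runs once an earlier one fails, so a failed DQM verdict blocks the Databricks write. The following is a minimal sketch of that idea only. The stage names, `PublishRun`, and `run_publish` are hypothetical and are not identifiers from lakehouse-api or lakehouse-edge-service. Treating audit evidence as a final stage is also a simplification, since the ADR requires evidence for the probe, publish, and trigger stages.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical stage names mirroring the seven steps of the publish sequence.
STAGES = (
    "source_assumptions",
    "manifest_generation",
    "preview_orchestration",
    "conformance_and_dqm_gate",
    "databricks_execution",
    "trigger_and_polling",
    "audit_evidence",
)


@dataclass
class PublishRun:
    """Illustrative stand-in for durable publish-run state."""
    run_id: str
    completed: List[str] = field(default_factory=list)
    failed_stage: Optional[str] = None


def run_publish(run: PublishRun, checks: Dict[str, Callable[[], bool]]) -> PublishRun:
    """Run stages in the fixed order and stop at the first failing stage."""
    for stage in STAGES:
        check = checks.get(stage)
        # A stage with no registered check fails closed.
        if check is None or not check():
            run.failed_stage = stage
            break
        run.completed.append(stage)
    return run


# Usage: simulate a failed DQM verdict.
checks = {stage: (lambda: True) for stage in STAGES}
checks["conformance_and_dqm_gate"] = lambda: False
run = run_publish(PublishRun("run-001"), checks)
print(run.failed_stage)
print(run.completed)
```

In this simulated run the gate stage fails, so `databricks_execution` never appears in `run.completed`. That is the property the sequence is meant to guarantee.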

The following defaults are frozen for v1:

No parallel publish API. All destructive Databricks operations stay behind the current edge seam.
secretRefs only. Raw secrets are forbidden in manifests, control-plane payloads, publish-run state, and operator logs.
Supported Databricks write modes are unity-catalog-volume and dbfs.
The default ingest trigger is the Databricks Jobs API.
publish-runs/{runId} remains the operator source of truth even when the Databricks writer performs its own downstream polling.
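Two of these defaults lend themselves to a mechanical pre-flight check: the write mode must be one of the two supported values, and credentials must appear only under secretRefs. The sketch below assumes a hypothetical target configuration shape. The `writeMode` key, the list of suspicious key names, and the `ref://` reference format are illustrative assumptions, not fields confirmed by lakehouse-api.

```python
from typing import Any, List

SUPPORTED_WRITE_MODES = {"unity-catalog-volume", "dbfs"}

# Hypothetical key names (lower-cased) that suggest a raw secret was inlined.
RAW_SECRET_KEYS = {"token", "password", "clientsecret", "accesskey"}


def find_raw_secret_keys(node: Any, path: str = "") -> List[str]:
    """Return dotted paths of keys that look like inlined secrets."""
    found: List[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if key == "secretRefs":
                continue  # references are allowed; do not descend into them
            if key.lower() in RAW_SECRET_KEYS:
                found.append(child)
            found.extend(find_raw_secret_keys(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_raw_secret_keys(item, f"{path}[{index}]"))
    return found


def validate_target_config(config: dict) -> List[str]:
    """Collect violations of the frozen v1 defaults; an empty list means OK."""
    errors: List[str] = []
    mode = config.get("writeMode")
    if mode not in SUPPORTED_WRITE_MODES:
        errors.append(f"unsupported writeMode: {mode!r}")
    for secret_path in find_raw_secret_keys(config):
        errors.append(f"raw secret not allowed at: {secret_path}")
    return errors


# Usage with two illustrative configurations.
good = {"writeMode": "dbfs", "secretRefs": {"token": "ref://databricks/token"}}
bad = {"writeMode": "s3", "auth": {"token": "not-a-real-token"}}
print(validate_target_config(good))
print(validate_target_config(bad))
```

Key-name matching is only a heuristic. A real check would more likely validate against the versioned contract schema, so this sketch should be read as one possible guard rather than the enforcement mechanism.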

The documentation boundary is also frozen:

The SHIN-NY FHIR integration guide is normative for upstream FHIR profile, operational, and release assumptions.
HDIM architecture and implementation guides are normative for lakehouse controls, audit behavior, security posture, Databricks execution, and promotion gates.

Change-control rules:

Any Databricks-specific field not already present in lakehouse-api must be introduced as a versioned contract extension, not as a side-channel payload.
The external Databricks run identifier carried through TargetExecutionMetadata.externalRunId is mandatory for certification; any future change to that representation must be introduced through lakehouse-api versioning.
Alternative trigger mechanisms, storage modes, or target-side status models require explicit approval against this ADR and the normative blueprint.
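The mandatory run identifier can be checked mechanically when certification evidence is assembled. The sketch below is a simplified stand-in: only the externalRunId field is named in this ADR, so the dataclass shape, the snake_case spelling, and the `is_certifiable` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TargetExecutionMetadata:
    """Simplified stand-in; the real contract lives in lakehouse-api."""
    external_run_id: Optional[str] = None  # mirrors externalRunId


def is_certifiable(metadata: TargetExecutionMetadata) -> bool:
    """A run counts as certification evidence only with a non-blank run id."""
    run_id = metadata.external_run_id
    return run_id is not None and run_id.strip() != ""


# Usage: a run without an external identifier cannot be certified.
print(is_certifiable(TargetExecutionMetadata(external_run_id="12345")))
print(is_certifiable(TargetExecutionMetadata()))
```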

Consequences

Positive

HDIM, the partner, and the customer build against one end-to-end certified path instead of multiple planning notes.
DQM gating, preview/conformance controls, audit evidence, and customer boundary rules cannot be bypassed by a partner-owned writer.
The edge API, operator workflow, and durable run-state model remain stable while the live Databricks runtime is hardened and certified.
Future implementation tickets can be derived directly from requirement and acceptance matrices instead of re-litigating the seam.

Negative

Partners cannot introduce a convenience sidecar or alternate publish flow to move faster.
The live Databricks certification path is blocked until non-prod evidence and operational validation are complete.
Alternative storage modes and trigger types need formal review instead of ad hoc rollout.

References

docs/implementation-guides/SHIN_NY_FHIR_DATABRICKS_GOLD_STANDARD_BLUEPRINT.md
backend/modules/shared/api-contracts/lakehouse-api/
backend/modules/services/analytics-export-service/README.md
backend/modules/services/lakehouse-edge-service/README.md
docs/plans/2026-04-23-databricks-writer-gap-closure-plan.md (planning history)
docs/plans/2026-04-23-databricks-writer-partner-handoff-brief.md (planning history)