Architecture Evolution Timeline
From Node.js prototype to 51-service production platform in 7 phases.
HDIM did not emerge fully formed. It evolved through deliberate architectural phases, each solving a specific class of problem that the previous phase exposed. This timeline documents what changed at each phase, why the change was necessary, and what metrics improved as a result.
Decision Log Highlights
| Date | Trigger | Decision | Outcome |
|---|---|---|---|
| Nov 24, 2024 | Prototype proved product value but hit ecosystem friction for healthcare-grade delivery. | Rebuild platform in Java 21 + Spring Boot. | Production-capable architecture delivered in six weeks with 51+ services. |
| Jan 2025 | Schema changes lacked a single enforceable migration standard across services. | Standardize on Liquibase with rollback requirements. | 100% migration rollback coverage and consistent database change governance. |
| Feb 2025 | Token validation repeated in downstream services increased latency and complexity. | Adopt gateway trust pattern with trusted identity headers. | Centralized auth enforcement and simpler service-level security posture. |
| Mar 2025 | Cross-service diagnosis was slow without request-level visibility. | Roll out OpenTelemetry end-to-end tracing. | Faster incident triage across HTTP, Kafka, and Redis workflows. |
| Apr 2025 | PR cycle time constrained feature throughput as service count grew. | Parallelize CI/CD with targeted test selection. | 42.5% faster PR feedback and lower operational scaling risk. |
Evolution Phases
The original HDIM prototype was a Node.js application with Express endpoints, a single PostgreSQL database, and basic FHIR resource handling. It validated the core hypothesis: healthcare organizations need a unified platform for quality measure evaluation and care gap detection. However, it lacked the type safety, ecosystem maturity, and enterprise patterns required for production healthcare workloads.
- Single-service monolith handling all domains
- No tenant isolation — single-tenant design
- Manual FHIR resource parsing without HAPI library
- No event-driven architecture
The platform was completely rewritten in Java 21 with Spring Boot 3.x. Liquibase became the sole migration tool, replacing ad-hoc schema management. Every service received its own database with an independent schema lifecycle. Hibernate's ddl-auto: validate mode kept entities and migrations in sync by failing startup whenever the mapped entities did not match the migrated schema.
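The rollback requirement can be read as a gate over changesets: a migration is only acceptable if it declares how to undo itself. The following is a minimal plain-Java sketch of that rule, not the Liquibase API. The `ChangeSet` record and the `RollbackCoverage` helper are hypothetical names introduced for illustration only.

```java
import java.util.List;

// Hypothetical, simplified model of a migration changeset (not a Liquibase type).
record ChangeSet(String id, String author, List<String> changes, List<String> rollback) {}

final class RollbackCoverage {
    private RollbackCoverage() {}

    // Returns the ids of changesets that declare no rollback statements.
    static List<String> missingRollback(List<ChangeSet> changeSets) {
        return changeSets.stream()
                .filter(cs -> cs.rollback() == null || cs.rollback().isEmpty())
                .map(ChangeSet::id)
                .toList();
    }

    // Percentage of changesets that declare at least one rollback statement.
    static double coveragePercent(List<ChangeSet> changeSets) {
        if (changeSets.isEmpty()) {
            return 100.0;
        }
        long covered = changeSets.size() - missingRollback(changeSets).size();
        return 100.0 * covered / changeSets.size();
    }
}
```

A check like this could run in CI and fail the build when `missingRollback` returns a non-empty list, which is one way a 100% rollback coverage target might be enforced.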
Introduced the gateway-trust authentication pattern. The API gateway validates JWT tokens and injects trusted headers (X-Auth-User, X-Auth-Roles, X-Tenant-ID). Downstream services trust these headers, eliminating redundant token validation. Four specialized gateways were deployed: public, clinical, admin, and data ingestion.
- 4 specialized gateways with distinct security profiles
- Gateway-core shared module for common authentication logic
- RBAC with 5 role tiers (SUPER_ADMIN through VIEWER)
- Multi-tenant isolation enforced at database query level
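The gateway-trust flow described above can be sketched in plain Java with headers modeled as a map. This is an illustration under stated assumptions, not the gateway-core implementation. The `Identity` record, the comma-separated role encoding, and the step that strips client-supplied identity headers are all assumptions. Token validation itself is out of scope here and is assumed to have already produced the `Identity`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical identity produced by the gateway after it has validated the JWT.
record Identity(String user, Set<String> roles, String tenantId) {}

final class GatewayTrustSketch {
    static final String USER = "X-Auth-User";
    static final String ROLES = "X-Auth-Roles";
    static final String TENANT = "X-Tenant-ID";

    private GatewayTrustSketch() {}

    // Gateway side: drop any client-supplied identity headers, then inject trusted ones.
    static Map<String, String> injectTrustedHeaders(Map<String, String> incoming, Identity id) {
        Map<String, String> out = new LinkedHashMap<>(incoming);
        out.keySet().removeIf(h -> h.equalsIgnoreCase(USER)
                || h.equalsIgnoreCase(ROLES)
                || h.equalsIgnoreCase(TENANT));
        out.put(USER, id.user());
        out.put(ROLES, String.join(",", id.roles())); // assumed encoding: comma-separated
        out.put(TENANT, id.tenantId());
        return out;
    }

    // Downstream side: read the trusted headers instead of re-validating the token.
    static Identity fromTrustedHeaders(Map<String, String> headers) {
        String user = headers.get(USER);
        String tenant = headers.get(TENANT);
        if (user == null || tenant == null) {
            throw new IllegalStateException("missing trusted identity headers");
        }
        Set<String> roles = Arrays.stream(headers.getOrDefault(ROLES, "").split(","))
                .map(String::trim)
                .filter(r -> !r.isEmpty())
                .collect(Collectors.toSet());
        return new Identity(user, roles, tenant);
    }
}
```

The pattern only holds if downstream services are reachable exclusively through a gateway. Otherwise a caller could set these headers directly, which is why the sketch removes any inbound copies before injecting its own.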
Deployed OpenTelemetry across all services for end-to-end request visibility. Trace propagation was configured for HTTP (Feign and RestTemplate), Kafka producers/consumers, and Redis operations. Custom spans were added for business-critical operations like CQL measure evaluation and FHIR bundle processing.
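To make trace propagation concrete: OpenTelemetry's default propagator carries context in the W3C Trace Context `traceparent` header, whose value has the form `00-<32 hex trace id>-<16 hex span id>-<2 hex flags>`. The sketch below parses that header and derives a child span context in plain Java. It is not the OpenTelemetry API, and the `TraceParent` record is a hypothetical name. In practice the OpenTelemetry instrumentation handles this automatically.

```java
import java.util.concurrent.ThreadLocalRandom;

// Parsed form of a W3C Trace Context "traceparent" header value (version 00 only).
record TraceParent(String traceId, String spanId, String flags) {

    // Parses "00-<32 hex trace id>-<16 hex span id>-<2 hex flags>".
    static TraceParent parse(String header) {
        String[] parts = header.trim().split("-");
        if (parts.length != 4
                || !parts[0].equals("00")
                || !parts[1].matches("[0-9a-f]{32}")
                || !parts[2].matches("[0-9a-f]{16}")
                || !parts[3].matches("[0-9a-f]{2}")) {
            throw new IllegalArgumentException("invalid traceparent: " + header);
        }
        return new TraceParent(parts[1], parts[2], parts[3]);
    }

    // A child span keeps the trace id and sampling flags but gets a fresh, non-zero span id.
    TraceParent child() {
        long id;
        do {
            id = ThreadLocalRandom.current().nextLong();
        } while (id == 0L);
        return new TraceParent(traceId, String.format("%016x", id), flags);
    }

    // Serializes back to the header value sent on the outgoing HTTP call or Kafka record.
    String toHeader() {
        return "00-" + traceId + "-" + spanId + "-" + flags;
    }
}
```

Because every hop reuses the same trace id, spans emitted by an HTTP call, the Kafka consumer it triggers, and any Redis operation along the way can be stitched into a single trace.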
Introduced event-driven architecture with Apache Kafka. Four dedicated event services were built: patient-event-service, care-gap-event-service, evaluation-event-service, and quality-measure-event-service. CQRS pattern separated read and write models for high-throughput clinical data processing.
- 4 event services with dedicated Kafka topics
- Event projections for materialized views
- Dead letter queue handling for failed events
- Tenant-scoped event partitioning
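Two of the items above, tenant-scoped partitioning and dead letter handling, can be sketched without a broker. The code below is a simplified plain-Java illustration, not Kafka client code. The `TenantEvent` record, the `String.hashCode`-based partition choice, and the fixed retry count are assumptions. With a real Kafka producer, using the tenant ID as the record key gives the same per-tenant grouping through the default partitioner.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical event envelope; field names are illustrative.
record TenantEvent(String tenantId, String type, String payload) {}

final class EventRoutingSketch {
    private EventRoutingSketch() {}

    // The same tenant always maps to the same partition, preserving per-tenant ordering.
    static int partitionFor(String tenantId, int partitionCount) {
        return Math.floorMod(tenantId.hashCode(), partitionCount);
    }

    // Tries the handler up to maxAttempts times; parks the event in deadLetters if every attempt fails.
    static boolean handleWithDeadLetter(TenantEvent event,
                                        Consumer<TenantEvent> handler,
                                        int maxAttempts,
                                        List<TenantEvent> deadLetters) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handler.accept(event);
                return true;
            } catch (RuntimeException e) {
                // Retry immediately; a real consumer would log the failure and back off.
            }
        }
        deadLetters.add(event);
        return false;
    }
}
```

Parking a failed event instead of blocking on it keeps one bad message from stalling the rest of its partition, at the cost of needing a separate process to inspect and replay the dead letter queue.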
Phase 5 migrated from Docker-dependent Testcontainers to Spring embedded Kafka, cutting test execution time by 50%. Phase 6 introduced parallel test execution with six JVM forks, the TestEventWaiter utility for deterministic event synchronization, and six test execution modes for different development contexts.
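Deterministic event synchronization generally means a test blocks until the expected event arrives instead of sleeping for a fixed interval. The source does not show the TestEventWaiter API, so the class below is a hypothetical stand-in built only on `java.util.concurrent`. Its name and method signatures are assumptions.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical helper in the spirit of a test event waiter; not the project's actual utility.
final class EventWaiterSketch<E> {
    private final BlockingQueue<E> received = new LinkedBlockingQueue<>();

    // Called from the test's event listener whenever an event arrives.
    void onEvent(E event) {
        received.add(event);
    }

    // Blocks until a matching event arrives or the timeout elapses; returns null on timeout.
    // Non-matching events seen while waiting are discarded.
    E awaitMatching(Predicate<E> matcher, long timeout, TimeUnit unit) {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (true) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return null;
            }
            E next;
            try {
                next = received.poll(remaining, TimeUnit.NANOSECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for event", e);
            }
            if (next == null) {
                return null;
            }
            if (matcher.test(next)) {
                return next;
            }
        }
    }
}
```

A test would register `onEvent` as its listener callback, trigger the action under test, and then assert on the value returned by `awaitMatching`. The test then finishes as soon as the event arrives and fails with a clear timeout otherwise.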
The final infrastructure phase parallelized the CI/CD pipeline with intelligent change detection. Twenty-one service-specific filters ensure only affected tests run on each PR. Four parallel test jobs and three parallel validation jobs reduced PR feedback time by 42.5%. Docs-only PRs complete in under one minute.
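Change detection of this kind usually reduces to mapping changed file paths onto the services that own them. The sketch below shows one way that mapping and the docs-only shortcut could work. The path prefixes, the `docs/` and `.md` conventions, and the class name are hypothetical, since the source does not describe the actual filter definitions.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ChangeDetectionSketch {
    private ChangeDetectionSketch() {}

    // Maps changed files to services using path-prefix filters (prefix -> service name).
    static Set<String> affectedServices(List<String> changedFiles, Map<String, String> prefixToService) {
        Set<String> affected = new TreeSet<>();
        for (String file : changedFiles) {
            prefixToService.forEach((prefix, service) -> {
                if (file.startsWith(prefix)) {
                    affected.add(service);
                }
            });
        }
        return affected;
    }

    // A PR is docs-only when every changed file is documentation (assumed conventions).
    static boolean docsOnly(List<String> changedFiles) {
        return !changedFiles.isEmpty()
                && changedFiles.stream().allMatch(f -> f.startsWith("docs/") || f.endsWith(".md"));
    }
}
```

A pipeline built this way would skip every test job when `docsOnly` is true and otherwise run only the jobs for the services returned by `affectedServices`. Changes to shared modules would need filters that map to every dependent service.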
Cumulative Impact
Each phase compounded the improvements of the previous one. Together, standardized databases, gateway trust, distributed tracing, event-driven processing, and an optimized CI/CD pipeline produce a platform where new services can be added with confidence and deployed with far less risk.
Dive Deeper
Explore the technical architecture or understand how specifications drove each phase.