Java Rebuild Deep Dive
Why we rewrote a working Node.js prototype in Java, and why it was the right call.
Rewriting a working system is one of the most contentious decisions in software engineering. The Node.js prototype validated the HDIM concept. The Java rebuild made it production-ready for healthcare. This page documents the reasoning, the execution, and the patterns that made a full-platform rewrite viable in weeks rather than months.
Transformation Narrative
Problem: The Node.js prototype validated the idea but could not sustain healthcare-grade delivery velocity and compliance depth.
Decision date: November 24, 2024.
Why Java: The healthcare interoperability ecosystem and core implementation libraries are Java-native.
Measurable outcome: In six weeks, the rebuild produced a purpose-built Java platform with 51+ services and 613+ automated tests.
Why Java
The decision was driven by four factors, each independently sufficient to justify the rewrite.
1. Healthcare Ecosystem Alignment
HAPI FHIR, the gold-standard open-source FHIR implementation, is a Java library. CQL evaluation engines are Java-based. HL7 reference implementations are Java. The healthcare interoperability ecosystem is overwhelmingly JVM-based. Building in Node.js meant either reimplementing every critical dependency or wrapping it behind an FFI bridge, losing type safety and library-level validation at each boundary.
2. Enterprise Hiring and Skills Market
Healthcare IT organizations hire Java engineers. Payer technology teams, health system integration departments, and EHR vendors maintain Java codebases. Building HDIM in Java means the engineers who will eventually maintain and extend it already exist in the target market. A Node.js healthcare platform would require hiring from a smaller, less domain-experienced talent pool.
3. AI Code Generation Effectiveness
Java's explicit type system, annotation-driven configuration, and well-established patterns (Spring Boot, JPA, Liquibase) make AI-generated code significantly more reliable. The compiler catches type errors that would be runtime failures in Node.js. Spring Boot conventions mean AI assistants produce consistent, idiomatic code because the "right way" is well-defined and heavily represented in training data.
4. Long-Term Maintainability
Healthcare platforms have 10-15 year lifecycles. Java 21 LTS provides a stable foundation with predictable upgrade paths. Spring Boot's backwards compatibility track record, Gradle's build reproducibility, and the JVM's performance characteristics at scale all favor long-term operational stability over rapid prototyping convenience.
What We Kept vs What We Replaced
The rebuild followed disciplined evolution: preserve validated domain ideas, replace implementation layers that created integration and compliance drag.
What We Kept (Validated in Prototype)
- Domain boundaries for patients, measures, evaluations, and care gaps
- FHIR R4 as the canonical clinical data model
- REST-first service contracts and versioned APIs
- Tenant isolation as a non-negotiable architecture requirement
- Event-driven notifications for cross-service state propagation
What We Replaced (To Reach Production)
- Manual FHIR parsing replaced with HAPI FHIR validation workflows
- Mixed migration practices replaced with Liquibase and rollback discipline
- Prototype auth middleware replaced with gateway trust and Spring Security
- Ad-hoc service wiring replaced with Spring Cloud and typed contracts
- Limited prototype testing replaced with a 613+ test automation baseline
Preserved from Node.js
The rewrite was not a rejection of the prototype. Several architectural concepts proved their value in Node.js and were carried forward.
- Domain model boundaries: patient, care gap, quality measure, and evaluation domains remained the core service decomposition.
- FHIR R4 as the canonical data format for clinical resources.
- REST-first API design with versioned endpoints and consistent error response shapes.
- Tenant isolation as a first-class architectural concern, not a retrofit.
- Event-driven notifications for cross-service state changes.
- Health check endpoints and readiness probes on every service.
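Tenant isolation as a first-class concern can be illustrated with a small, framework-free sketch. It assumes a hypothetical `CareGap` record and an in-memory store standing in for a real repository; neither name comes from the platform itself. The point it shows is that every read path takes a tenant identifier, so an unscoped query cannot be written by accident.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical care-gap row; every record carries the tenant that owns it. */
record CareGap(String tenantId, String patientId, String measureId) {}

/** In-memory stand-in for a repository whose reads are always scoped to one tenant. */
final class TenantScopedCareGapStore {
    private final List<CareGap> rows = new ArrayList<>();

    void save(CareGap gap) {
        Objects.requireNonNull(gap.tenantId(), "tenantId is required on every write");
        rows.add(gap);
    }

    /** There is deliberately no findAll(): callers must always name the tenant. */
    List<CareGap> findByTenant(String tenantId) {
        Objects.requireNonNull(tenantId, "tenantId is required on every read");
        return rows.stream()
                .filter(gap -> gap.tenantId().equals(tenantId))
                .toList();
    }
}
```

In a real service the same rule would more likely be enforced in the persistence layer than in application code, but the shape of the API is the same: no tenant, no query.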
Redesigned Components
Other aspects of the architecture were fundamentally redesigned to take advantage of Java ecosystem capabilities.
| Component | Node.js Prototype | Java Platform |
|---|---|---|
| FHIR handling | Manual JSON parsing | HAPI FHIR 7.x with validation |
| Database migrations | Knex.js migrations | Liquibase with rollback coverage |
| Authentication | Passport.js middleware | Spring Security + gateway trust |
| Event processing | Bull queues (Redis-backed) | Kafka with CQRS event sourcing |
| API documentation | Manual Swagger files | SpringDoc OpenAPI annotations |
| Test infrastructure | Jest + Supertest | JUnit 5 + MockMvc + embedded Kafka |
| Caching | Redis with manual TTL | Spring Cache + HIPAA-compliant TTL |
| Service communication | Axios HTTP calls | Spring Cloud Feign with retry |
| Observability | Winston logging | OpenTelemetry + Prometheus + Grafana |
| Build system | npm scripts | Gradle Kotlin DSL with parallel tasks |
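As an illustration of what "with retry" means in the service communication row, here is a minimal plain-Java sketch of retry with exponential backoff. It is not Feign's own retry mechanism and does not use any Spring Cloud API; the `Retry` class, its parameters, and the doubling backoff policy are assumptions chosen for the example.

```java
import java.util.function.Supplier;

/** Hypothetical helper: retries a call with exponentially growing pauses between attempts. */
final class Retry {
    private Retry() {}

    static <T> T withRetry(Supplier<T> call, int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long backoff = initialBackoffMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(backoff);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        throw new IllegalStateException("interrupted while backing off", ie);
                    }
                    backoff *= 2; // double the pause before the next attempt
                }
            }
        }
        throw last; // every attempt failed; surface the last failure
    }
}
```

A production client would normally retry only on failures known to be transient and only for idempotent requests; this sketch retries on any `RuntimeException` to keep the control flow visible.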
Key Architecture Decisions
The rebuild was guided by Architecture Decision Records (ADRs) that documented the rationale for each major choice.
Database Per Service
Each of the 29 databases has its own schema lifecycle managed by Liquibase. This eliminates cross-service schema coupling and enables independent deployment. The tradeoff is operational complexity in database management, mitigated by standardized Docker Compose configurations and automated migration validation.
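One plausible form of automated migration validation is a check that every changeset declares a rollback and that changeset IDs are unique within a service. The sketch below assumes changelogs have already been parsed into a simplified, hypothetical `ChangeSet` record; it does not call the Liquibase API, and the service names in the usage example are made up.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified view of one changeset after a changelog has been parsed. */
record ChangeSet(String service, String id, boolean hasRollback) {}

final class MigrationValidator {
    private MigrationValidator() {}

    /** Returns human-readable problems; an empty list means the changelogs pass. */
    static List<String> validate(List<ChangeSet> changeSets) {
        List<String> problems = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (ChangeSet cs : changeSets) {
            // IDs only need to be unique per service, because each service owns its database.
            String key = cs.service() + ":" + cs.id();
            if (!seen.add(key)) {
                problems.add("duplicate changeset id " + key);
            }
            if (!cs.hasRollback()) {
                problems.add("missing rollback for " + key);
            }
        }
        return problems;
    }
}
```

Run as a CI step, a check like this would fail the build when `validate(...)` returns a non-empty list, for example when a changeset in a hypothetical `patient-service` changelog omits its rollback.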
Gateway Trust Over Token Forwarding
Rather than forwarding JWT tokens to every downstream service for independent validation, the gateway validates once and injects trusted headers. This reduces latency on the critical path and simplifies service-level security configuration. The tradeoff is that internal network security becomes critical, a risk mitigated by network policies and mTLS.
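The downstream half of this pattern can be sketched without any framework. The example assumes three hypothetical header names (`X-Auth-User-Id`, `X-Auth-Tenant-Id`, `X-Auth-Roles`); the platform's actual header names are not stated here. A service builds its caller identity from the injected headers and rejects the request if a required one is missing, rather than parsing a JWT itself.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Identity a downstream service accepts from the gateway without re-validating a token. */
record TrustedIdentity(String userId, String tenantId, Set<String> roles) {}

final class GatewayTrustHeaders {
    // Hypothetical header names, chosen for illustration only.
    static final String USER = "X-Auth-User-Id";
    static final String TENANT = "X-Auth-Tenant-Id";
    static final String ROLES = "X-Auth-Roles";

    private GatewayTrustHeaders() {}

    /** Returns empty when a required header is missing or blank, so the caller can reject the request. */
    static Optional<TrustedIdentity> parse(Map<String, String> headers) {
        // HTTP header names are case-insensitive, so normalize lookups.
        Map<String, String> h = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        h.putAll(headers);

        String user = h.get(USER);
        String tenant = h.get(TENANT);
        if (user == null || user.isBlank() || tenant == null || tenant.isBlank()) {
            return Optional.empty();
        }
        // Roles arrive as a comma-separated list; an absent header means no roles.
        Set<String> roles = Arrays.stream(h.getOrDefault(ROLES, "").split(","))
                .map(String::trim)
                .filter(role -> !role.isEmpty())
                .collect(Collectors.toUnmodifiableSet());
        return Optional.of(new TrustedIdentity(user, tenant, roles));
    }
}
```

The sketch is only safe under the assumption that the gateway is the sole route into the service and overwrites any client-supplied headers of the same names, which is why the network policies and mTLS mentioned above carry so much weight.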
Embedded Kafka for Testing
The decision to use the embedded Kafka broker from Spring's Kafka test support instead of Testcontainers eliminated the Docker daemon dependency for test execution. This enabled CI/CD parallelization and reduced test suite execution time from 60 minutes to 10-15 minutes. Tests run identically on developer laptops and CI servers.
AI Assistance Patterns
The Java rebuild was executed using spec-driven AI assistance. Several patterns emerged that were specific to Java and Spring Boot code generation.
- Spring Boot conventions made AI output highly predictable. Controller-service-repository layering, annotation-driven configuration, and property file conventions are well-represented in training data.
- JPA entity generation from data model specs was nearly zero-defect. Column types, constraints, and relationships mapped cleanly from specification to annotation.
- Liquibase migration generation required the most human oversight. Migration ordering, rollback directives, and cross-service schema references needed architect review.
- Test generation benefited from Java's type system. Mockito mocks, assertion chains, and test data builders were generated correctly because the type signatures constrain the output space.
- Security annotations were the highest-risk generation target. RBAC matrices in specs translated to @PreAuthorize annotations, but the architect verified every endpoint's access control independently.
- Kafka consumer/producer code required cross-service specification review. Topic names, serialization formats, and consumer group IDs had to match across independently generated services.
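The independent verification of access control described above could be partly automated. The sketch below is one way to do it under stated assumptions: `@RequiresRole` is a stand-in for Spring Security's `@PreAuthorize` so that the example compiles without Spring, and `CareGapController`, its method names, and the role names are hypothetical. The audit compares a spec-derived RBAC matrix against what the generated code actually declares.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Stand-in for @PreAuthorize so the sketch has no Spring dependency. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface RequiresRole {
    String value();
}

/** Hypothetical generated controller with one correct and two incorrect endpoints. */
final class CareGapController {
    @RequiresRole("ANALYST")
    public String listGaps() { return "gaps"; }

    @RequiresRole("ANALYST") // the spec below expects ADMIN here
    public String closeGap() { return "closed"; }

    public String exportGaps() { return "export"; } // no access control declared
}

final class RbacAudit {
    private RbacAudit() {}

    /** Compares the spec's method-to-role matrix with the annotations found via reflection. */
    static List<String> audit(Class<?> controller, Map<String, String> specMatrix) {
        List<String> findings = new ArrayList<>();
        // Sort by method name so the report order is stable.
        for (Map.Entry<String, String> entry : new TreeMap<>(specMatrix).entrySet()) {
            String name = entry.getKey();
            String expected = entry.getValue();
            Method method;
            try {
                method = controller.getDeclaredMethod(name);
            } catch (NoSuchMethodException e) {
                findings.add(name + ": endpoint missing");
                continue;
            }
            RequiresRole declared = method.getAnnotation(RequiresRole.class);
            if (declared == null) {
                findings.add(name + ": no access control declared");
            } else if (!declared.value().equals(expected)) {
                findings.add(name + ": expected " + expected + " but found " + declared.value());
            }
        }
        return findings;
    }
}
```

A check like this complements human review and does not replace it: it catches missing or mismatched annotations, but it cannot tell whether the RBAC matrix in the spec is itself correct.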