Context
Problem Statement:
The HDIM platform faced database scalability challenges:
Business Context:
Technical Context:
Decision
We will implement PostgreSQL streaming replication with read replicas and application-level read/write routing.
Specific Implementation:
- 1 Primary (read-write)
- 2 Replicas (read-only)
- Synchronous replication for durability
- PgBouncer for connection pooling
- Write operations → Primary
- Read operations in transactions → Primary (consistency)
- Read-only queries → Replicas (round-robin)
- Explicit @Transactional(readOnly = true) for routing
- Automatic replica promotion on primary failure
- Application reconnection with retry logic
- Manual intervention for split-brain scenarios
Alternatives Considered
Alternative 1: Citus (Distributed PostgreSQL)
Description: Horizontal sharding with Citus extension
Pros:
Cons:
Why Not Chosen: Read replicas sufficient for current scale; Citus may be needed at 10x growth
Alternative 2: Amazon Aurora
Description: AWS-managed PostgreSQL-compatible database
Pros:
Cons:
Why Not Chosen: Need cloud-agnostic solution; self-managed provides more control
Alternative 3: Application-Level Caching (Redis)
Description: Cache frequently-read data in Redis
Pros:
Cons:
Why Not Chosen: Caching is complementary, not replacement for read scaling
Consequences
Positive Consequences
Negative Consequences
Mitigation
@Transactional(readOnly = true) explicitlyConfiguration
Primary Configuration (postgresql.conf)
Replication Settings
wal_level = replica
max_wal_senders = 10
max_replication_slots = 10
synchronous_commit = on
synchronous_standby_names = 'replica1,replica2'
Performance
shared_buffers = 4GB
effective_cache_size = 12GB
maintenance_work_mem = 1GB
Replica Configuration (postgresql.conf)
hot_standby = on hot_standby_feedback = on max_standby_streaming_delay = 30s
Spring Data Source Routing
public class ReadWriteRoutingDataSource extends AbstractRoutingDataSource {
@Override
protected Object determineCurrentLookupKey() {
boolean isReadOnly = TransactionSynchronizationManager
.isCurrentTransactionReadOnly();
return isReadOnly ? DataSourceType.REPLICA : DataSourceType.PRIMARY;
}
}Application Configuration
datasource:
primary:
url: jdbc:postgresql://primary:5432/hdim
username: ${DB_USERNAME}
password: ${DB_PASSWORD}
replica:
url: jdbc:postgresql://replica1:5432,replica2:5432/hdim
username: ${DB_READONLY_USERNAME}
password: ${DB_READONLY_PASSWORD}
readOnly: trueImplementation Plan
Files Created
Infrastructure:
docker-compose.ha.ymldocker/postgres/primary/postgresql.confdocker/postgres/primary/pg_hba.confdocker/postgres/replica/postgresql.confdocker/postgres/init-replication.shApplication:
backend/modules/shared/infrastructure/persistence/src/main/java/com/healthdata/persistence/routing/ReadWriteRoutingDataSource.javabackend/modules/shared/infrastructure/persistence/src/main/java/com/healthdata/persistence/routing/ReadWriteDataSourceConfig.javaSuccess Metrics
| Metric | Target | Measurement |
|--------|--------|-------------|
| Read query latency p95 | <100ms | APM metrics |
| Replication lag | <100ms | pg_stat_replication |
| Primary CPU utilization | <60% | Infrastructure monitoring |
| Read throughput | 10K QPS | Database metrics |