# Confluent Platform on OpenShift — GitOps Deployment
Fully automated multi-cluster deployment of Confluent Platform on OpenShift using ArgoCD, Helm charts, and CFK (Confluent for Kubernetes) operator.
## Architecture Overview

```
        West Region                                  East Region
    (3-5ms intra-region)                         (3-5ms intra-region)
┌──────────────┬──────────────┐          ┌──────────────┬──────────────┐
│  dc-west-1   │  dc-west-2   │          │  dc-east-1   │  dc-east-2   │
│   (rack1)    │   (rack2)    │          │   (rack1)    │   (rack2)    │
│              │              │          │              │              │
│  KRaft (1)   │  KRaft (1)   │          │  KRaft (1)   │  KRaft (1)   │
│ Kafka 0,1,2  │ Kafka        │          │ Kafka 0,1,2  │ Kafka        │
│              │ 100,101,102  │          │              │ 100,101,102  │
│ Schema Reg   │ Schema Reg   │          │ Schema Reg   │ Schema Reg   │
│ REST Proxy   │ REST Proxy   │          │ REST Proxy   │ REST Proxy   │
│ Connect      │ Connect      │          │ Connect      │ Connect      │
│ ControlCenter│ ControlCenter│          │ ControlCenter│ ControlCenter│
│              │              │          │              │              │
│ CFK Operator │ CFK Operator │          │ CFK Operator │ CFK Operator │
└──────┬───────┴──────┬───────┘          └──────┬───────┴──────┬───────┘
       │   Cluster    │                         │   Cluster    │
       │   Linking    │                         │   Linking    │
       └──────┬───────┘                         └──────┬───────┘
              │              Cross-Region              │
              │            Cluster Linking             │
              │            (30-40ms async)             │
              └────────────────────────────────────────┘
```
Each OCP cluster runs a complete Confluent Platform stack. Clusters within a region are linked for synchronous-like replication (intra-region). Clusters across regions are linked for async DR replication (cross-region).
## Repository Structure

```
.
├── README.md                      # This file
├── PREREQUISITES.md               # What must be in place before GitOps
├── RUNBOOK.md                     # Operational findings and lessons learned
├── DNS_ENTRIES.md                 # Required DNS entries per cluster
│
├── argocd/                        # ArgoCD Application definitions
│   ├── applicationset-cfk-operator.yaml
│   ├── applicationset-infra.yaml
│   └── applicationset-kafka.yaml
│
├── base/                          # Per-cluster Helm values
│   ├── garland-infra.yaml         # dc-west-1 infrastructure values
│   ├── garland-kafka.yaml         # dc-west-1 Kafka stack values
│   ├── louisville-infra.yaml      # dc-west-2 infrastructure values
│   ├── louisville-kafka.yaml      # dc-west-2 Kafka stack values
│   ├── sterling-infra.yaml        # dc-east-1 infrastructure values
│   ├── sterling-kafka.yaml        # dc-east-1 Kafka stack values
│   ├── manassas-infra.yaml        # dc-east-2 infrastructure values
│   └── manassas-kafka.yaml        # dc-east-2 Kafka stack values
│
└── charts/
    ├── cluster-infra/             # Helm chart: cluster prerequisites
    │   ├── Chart.yaml
    │   ├── values.yaml
    │   └── templates/
    │       ├── namespace.yaml     # confluent namespace
    │       ├── scc.yaml           # Custom SCC (UID 1001)
    │       ├── storageclass.yaml  # NFS CSI StorageClass
    │       ├── metallb.yaml       # MetalLB IPAddressPool + L2Advertisement
    │       └── pull-secret.yaml   # Docker Hub pull secret
    │
    └── confluent-kafka/           # Helm chart: Confluent Platform stack
        ├── Chart.yaml
        ├── values.yaml
        └── templates/
            ├── kraftcontroller.yaml  # KRaft controller with deterministic clusterID
            ├── kafka.yaml            # Kafka brokers with ID offset + rack awareness
            ├── schemaregistry.yaml   # Schema Registry
            ├── restproxy.yaml        # REST Proxy
            ├── connect.yaml          # Kafka Connect
            ├── controlcenter.yaml    # Control Center (300s probe delay)
            ├── kafkarestclass.yaml   # KafkaRestClass for ClusterLink REST API
            ├── clusterlink.yaml      # ClusterLink CRs (intra + cross-region)
            └── static-lb-ips.yaml    # PostSync Job for static MetalLB IPs
```
## Deployment Layers

The deployment is structured in 3 layers, each managed by separate ArgoCD Applications.

### Layer 1: Infrastructure (`cluster-infra` chart)

Creates cluster-level prerequisites that Confluent depends on:
| Resource | Description |
|---|---|
| Namespace | confluent namespace |
| SecurityContextConstraints | Custom confluent-scc — allows UID 1001, scoped to 8 Confluent SAs |
| StorageClass | nfs-csi — NFS CSI provisioner pointing to shared NFS server |
| MetalLB IPAddressPool | Dedicated IP range per cluster for Kafka LoadBalancer services |
| L2Advertisement | MetalLB L2 mode for the IP pool |
| Docker Pull Secret | confluent-registry — credentials for pulling CFK images from Docker Hub |
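The MetalLB resources from the table render roughly as follows (a sketch; resource names are illustrative, and the address range shown is the dc-west-1 pool from `base/`):

```yaml
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: confluent-pool          # hypothetical name; the chart templates set this
  namespace: metallb-system
spec:
  addresses:
    - 172.16.2.90-172.16.2.99   # dedicated range per cluster (values: metallb.addressPool)
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: confluent-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - confluent-pool            # announce only the Confluent pool in L2 mode
```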
### Layer 2: CFK Operator (upstream Helm chart)

Installs the Confluent for Kubernetes operator from https://packages.confluent.io/helm:

- Chart: `confluent-for-kubernetes`, version `0.1514.19`
- CFK operator version: `3.2.1`
- `podSecurity.enabled=false` (OpenShift uses SCCs, not PodSecurity)
- Must use `ServerSideApply=true` in ArgoCD syncOptions — CFK has 22 large CRDs that cause ArgoCD controller OOM with client-side apply
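An ArgoCD Application for this layer might look like the following (a sketch; the Application name and destination are illustrative, not taken from the repo's ApplicationSets):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: cfk-operator-dc-west-1   # hypothetical name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://packages.confluent.io/helm
    chart: confluent-for-kubernetes
    targetRevision: "0.1514.19"
    helm:
      values: |
        podSecurity:
          enabled: false         # OpenShift SCCs handle pod security
  destination:
    server: https://kubernetes.default.svc
    namespace: confluent
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - ServerSideApply=true     # avoids client-side apply OOM on CFK's large CRDs
```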
### Layer 3: Confluent Platform (`confluent-kafka` chart)

Deploys all Confluent Platform components:
| Component | Replicas | Notes |
|---|---|---|
| KRaftController | 1 | Pre-defined clusterID (exactly 22 chars) for deterministic cluster links |
| Kafka | 3 | Broker ID offset via annotation (0 for rack1, 100 for rack2) |
| SchemaRegistry | 1 | |
| KafkaRestProxy | 1 | |
| Connect | 1 | |
| ControlCenter | 1 | 300s liveness probe delay (slow startup with Kafka Streams) |
| KafkaRestClass | 1 | Required for ClusterLink CR REST API access |
| ClusterLink | 1-2 | Intra-region + cross-region links per cluster |
| Static LB IP Job | 1 | ArgoCD PostSync hook — patches MetalLB IPs onto Kafka LB services |
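The Control Center probe delay from the table can be expressed through CFK's `podTemplate` probe overrides — a sketch only, assuming the `podTemplate.probe` fields exposed by CFK's CRDs:

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: ControlCenter
metadata:
  name: controlcenter
  namespace: confluent
spec:
  replicas: 1
  dataVolumeCapacity: 10Gi        # illustrative size, not from this repo
  podTemplate:
    probe:
      liveness:
        initialDelaySeconds: 300  # Control Center starts slowly (Kafka Streams state restore)
  dependencies:
    kafka:
      bootstrapEndpoint: kafka.confluent.svc.cluster.local:9071   # assumed internal listener
```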
## Key Design Decisions

### Deterministic Cluster IDs

KRaft controllers use pre-defined `clusterID` values (exactly 22 characters) set in Helm values. This enables:

- **Single-pass GitOps**: ClusterLink CRs reference known IDs at deploy time — no manual discovery step
- **Stable across redeployments**: IDs don't change when tearing down and redeploying
- **Pattern for scaling**: `{dc}-{region}-{az}-{deployment}` (e.g., `garland-west-az1-nb01`)
### Broker ID Offset

The CFK annotation `platform.confluent.io/broker-id-offset` assigns non-overlapping broker IDs:

- rack1 clusters (AZ1): offset `0` → brokers `0, 1, 2`
- rack2 clusters (AZ2): offset `100` → brokers `100, 101, 102`
- The gap of 100 allows scaling to 100 brokers per AZ without ID conflicts
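In the Kafka CR this is a metadata annotation, roughly as below (a sketch; the CR name and replica count mirror the values files, not the actual template):

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: confluent
  annotations:
    # rack2 clusters get offset 100 → broker IDs 100, 101, 102
    platform.confluent.io/broker-id-offset: "100"
spec:
  replicas: 3
```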
### Static MetalLB IPs
CFK creates LoadBalancer services but doesn't support per-broker IP annotations. An ArgoCD PostSync hook (Kubernetes Job) waits for CFK to create the services, then patches each with metallb.universe.tf/loadBalancerIPs to assign deterministic IPs matching pre-configured DNS entries.
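A minimal sketch of such a PostSync hook is shown below. The Job name, service account, and image are illustrative; the service name `kafka-0-lb` and IP follow the `staticIPs` comments in the values files, and a real hook would loop over all brokers plus the bootstrap service:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: assign-static-lb-ips
  annotations:
    argocd.argoproj.io/hook: PostSync              # run after the main sync completes
    argocd.argoproj.io/hook-delete-policy: HookSucceeded
spec:
  template:
    spec:
      serviceAccountName: lb-ip-patcher            # hypothetical SA with patch rights on services
      restartPolicy: Never
      containers:
        - name: patch
          image: bitnami/kubectl:1.30
          command: ["/bin/sh", "-c"]
          args:
            - |
              # wait for CFK to create the per-broker LB service, then pin its IP
              until kubectl -n confluent get svc kafka-0-lb >/dev/null 2>&1; do sleep 5; done
              kubectl -n confluent annotate svc kafka-0-lb --overwrite \
                metallb.universe.tf/loadBalancerIPs=172.16.2.90
```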
### Custom SCC (Not anyuid)

A dedicated `confluent-scc` SecurityContextConstraints allows UID 1001 (the Confluent default) with `MustRunAs`. It is scoped to 8 specific service accounts rather than granting the broad `anyuid` SCC, and is deployed as part of the infra chart.
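The essentials of such an SCC look roughly like this (a sketch; strategy types other than `runAsUser` are assumptions, and only one of the 8 service accounts is shown):

```yaml
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: confluent-scc
allowPrivilegedContainer: false
runAsUser:
  type: MustRunAs         # pin the Confluent default UID instead of anyuid
  uid: 1001
seLinuxContext:
  type: MustRunAs
fsGroup:
  type: MustRunAs
  ranges:
    - min: 1001
      max: 1001
supplementalGroups:
  type: RunAsAny
users:
  # one of the 8 Confluent service accounts; repeat for the rest
  - system:serviceaccount:confluent:kafka
```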
### Rack Awareness

Brokers are tagged with rack labels matching their AZ:

- AZ1 clusters: `broker.rack=rack1`
- AZ2 clusters: `broker.rack=rack2`

Kafka uses this for replica placement — it ensures replicas are spread across racks (AZs) for fault tolerance.
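One way this surfaces in a CFK Kafka CR is a `configOverrides` server property fed by the per-cluster `rack` value — a sketch under that assumption, not necessarily how the chart's template does it:

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: confluent
spec:
  replicas: 3
  configOverrides:
    server:
      - broker.rack=rack1   # rendered from the per-cluster `rack` value
```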
## Per-Cluster Configuration

Configuration is driven entirely by per-cluster values files in `base/`. Each cluster has two files:

### `{cluster}-infra.yaml`

```yaml
nfs:
  server: "172.16.2.201"                  # NFS server IP
  share: "/mnt/samsung-1tbs/csi/vols"     # NFS export path
metallb:
  addressPool: "172.16.2.90-172.16.2.99"  # Dedicated MetalLB IP range
confluent:
  namespace: confluent
dockerRegistry:
  server: docker.io
  username: <username>
  password: <password>
```
### `{cluster}-kafka.yaml`

```yaml
namespace: confluent
kraft:
  replicas: 1
  clusterID: "garland-west-az1--c001"   # Exactly 22 chars
  storageClass: nfs-csi
kafka:
  replicas: 3
  brokerIdOffset: "0"                   # 0 for rack1, 100 for rack2
  rack: rack1                           # rack1 or rack2
  externalDomain: garland.arsalan.io    # Domain for external listener
  dataVolumeCapacity: 50Gi
  storageClass: nfs-csi
staticIPs:
  brokers:
    - 172.16.2.90    # kafka-0-lb
    - 172.16.2.91    # kafka-1-lb
    - 172.16.2.92    # kafka-2-lb
  bootstrap: 172.16.2.93   # kafka-bootstrap-lb
clusterLinks:
  - name: garland-to-louisville
    sourceBootstrap: kafka-bootstrap.louisville.arsalan.io:9092
    sourceClusterId: louisville-west-az2-c001
    destinationClusterId: garland-west-az1--c001
```
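A `clusterLinks` entry renders into a CFK ClusterLink CR roughly like the following (a sketch against CFK's ClusterLink schema; the `kafkaRestClassRef` name is whatever the chart's KafkaRestClass template actually uses):

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: ClusterLink
metadata:
  name: garland-to-louisville
  namespace: confluent
spec:
  destinationKafkaCluster:
    kafkaRestClassRef:
      name: kafkarestclass        # assumed KafkaRestClass name from the same chart
  sourceKafkaCluster:
    bootstrapEndpoint: kafka-bootstrap.louisville.arsalan.io:9092
    clusterID: louisville-west-az2-c001   # known in advance — deterministic cluster IDs
```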
## Adding a New Cluster / Deployment

1. **Create values files**: Copy an existing pair (`{cluster}-infra.yaml` + `{cluster}-kafka.yaml`) and update:
   - `clusterID`: unique, exactly 22 characters
   - `brokerIdOffset`: `0` for rack1, `100` for rack2
   - `rack`: `rack1` or `rack2`
   - `externalDomain`: the cluster's DNS domain
   - `metallb.addressPool`: unique IP range
   - `staticIPs`: IPs from the MetalLB pool
   - `clusterLinks`: source/destination cluster IDs and bootstrap endpoints
2. **Add DNS entries**: Per DNS_ENTRIES.md — broker and bootstrap records pointing to the MetalLB IPs
3. **Create ArgoCD Applications**: Add entries to the ApplicationSets or create individual Applications pointing to the new values files
4. **Push to git**: ArgoCD auto-syncs and deploys the full stack
## Validated Capabilities
| Capability | Status | Details |
|---|---|---|
| Multi-cluster deployment | Proven | 4 clusters, 36 pods total via ArgoCD |
| Broker ID offset | Proven | 0,1,2 / 100,101,102 per cluster pair |
| Rack awareness | Proven | rack1/rack2 per AZ |
| Intra-region cluster linking | Proven | 10/10 messages replicated, Lag: 0 |
| Cross-region cluster linking | Proven | 10/10 messages replicated both directions |
| Failover (broker kill) | Proven | 150/150 messages, zero data loss with acks=all |
| RF=3 with min.insync.replicas=2 | Proven | acks=all production during broker outage |
| GitOps tear-down + redeploy | Proven | Full stack from git in ~30 minutes |
| Deterministic cluster IDs | Proven | ClusterLink CRs work on first deploy |
| Static MetalLB IPs | Proven | PostSync hook assigns deterministic IPs |
| Custom SCC (UID 1001) | Proven | confluent-scc, not anyuid |
| OCI pull-through proxy | Proven | IDMS routes all pulls via oci.arsalan.io |
## Known Limitations
| Limitation | Impact | Workaround |
|---|---|---|
| CFK enforces required pod anti-affinity | Brokers per cluster ≤ nodes per cluster | Add worker nodes for more brokers |
| KRaft replicas cannot be scaled after creation | Must set correct quorum size at deploy | Deploy with target replica count from start |
| ControlCenter requires 3+ brokers | Won't start with RF < 3 | Ensure 3+ brokers before deploying CC |
| Connect defaults RF=3 for internal topics | Fails with < 3 brokers | Override *.storage.replication.factor if needed |
| CFK `storageClass` only on Kafka/KRaft | Schema validation error on SR/Connect | Don't add `storageClass` to SR/Connect CRs |
| clusterID must be exactly 22 bytes | KRaft won't deploy with wrong length | Use fixed pattern: {name}-{region}-{az}-{id} |
| ArgoCD OOMs on CFK CRDs | Controller CrashLoopBackOff | Use ServerSideApply=true + increase memory to 6Gi |
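For the Connect internal-topic limitation above, the replication factors can be lowered via `configOverrides` — a sketch, only needed on clusters with fewer than 3 brokers:

```yaml
apiVersion: platform.confluent.io/v1beta1
kind: Connect
metadata:
  name: connect
  namespace: confluent
spec:
  replicas: 1
  configOverrides:
    server:
      # Connect defaults these to 3; override when the cluster has < 3 brokers
      - config.storage.replication.factor=1
      - offset.storage.replication.factor=1
      - status.storage.replication.factor=1
```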
## Prerequisites
See PREREQUISITES.md for the full list of what must be configured before deploying.
## Operational Runbook
See RUNBOOK.md for all 23 findings, solutions, and operational procedures discovered during the PoC.