Kafka
Kafka is the event-streaming backbone of the tenant cluster. It is deployed by the Strimzi operator in KRaft mode, so there is no ZooKeeper dependency.
Deployment
Strimzi manages the Kafka cluster as Kubernetes custom resources. Brokers and KRaft controllers run in separate node pools.
| Port | Protocol | Use |
|---|---|---|
| 9092 | Plaintext | Client-to-broker traffic from inside the cluster |
| 9093 | Internal | KRaft controller-to-controller coordination |
Storage is backed by persistent volumes, provisioned through PersistentVolumeClaims (PVCs). Disk size is configurable separately for brokers and for controllers.
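As a rough illustration of the node-pool layout described above, the sketch below builds two Strimzi `KafkaNodePool` manifests as Python dictionaries, one per role, each with its own PVC-backed disk size. The cluster name `tenant-kafka`, the replica counts, and the sizes are placeholders rather than values from this deployment, and the `apiVersion` may differ depending on the Strimzi release in use.

```python
import json


def node_pool(name, cluster, role, replicas, disk_size):
    """Build a Strimzi KafkaNodePool manifest as a plain dict."""
    return {
        "apiVersion": "kafka.strimzi.io/v1beta2",  # assumption: varies by Strimzi release
        "kind": "KafkaNodePool",
        "metadata": {
            "name": name,
            # Strimzi uses this label to attach the pool to a Kafka resource.
            "labels": {"strimzi.io/cluster": cluster},
        },
        "spec": {
            "replicas": replicas,
            "roles": [role],  # "broker" or "controller"
            "storage": {
                "type": "persistent-claim",  # one PVC per node in the pool
                "size": disk_size,
                "deleteClaim": False,  # keep the PVC if the pool is removed
            },
        },
    }


# Hypothetical cluster name, replica counts, and disk sizes.
CLUSTER = "tenant-kafka"
pools = [
    node_pool("controller", CLUSTER, "controller", 3, "20Gi"),
    node_pool("broker", CLUSTER, "broker", 3, "100Gi"),
]

if __name__ == "__main__":
    # Kubernetes accepts JSON manifests, so the output can be piped to `kubectl apply -f -`.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": pools}, indent=2))
```

Keeping the two roles in separate pools means controller and broker disks can be sized independently, which is the configurability noted above.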
Monitoring
JMX metrics are exposed from each broker and collected by Prometheus via a JMX exporter sidecar. Kafka-level dashboards are available in Grafana as part of the observability stack.
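Once the metrics are in Prometheus, they can also be read programmatically through the Prometheus HTTP API. The following is a minimal sketch using only the standard library. The Prometheus address is a placeholder, and the metric name is an assumption: the actual name depends on the JMX exporter rules configured for this cluster.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical in-cluster Prometheus address.
PROMETHEUS_URL = "http://prometheus.monitoring.svc:9090"
# Assumed metric name; it depends on the JMX exporter rules in use.
QUERY = "sum(kafka_server_replicamanager_underreplicatedpartitions)"


def query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def first_value(payload):
    """Return the first sample value from an instant-query response, or None."""
    results = payload.get("data", {}).get("result", [])
    if payload.get("status") != "success" or not results:
        return None
    # Each vector sample looks like {"metric": {...}, "value": [timestamp, "value"]}.
    return float(results[0]["value"][1])


def under_replicated_partitions():
    """Run the query over HTTP; needs network access to PROMETHEUS_URL."""
    with urllib.request.urlopen(query_url(PROMETHEUS_URL, QUERY), timeout=10) as resp:
        return first_value(json.load(resp))
```

A non-zero result from `under_replicated_partitions()` would suggest that at least one partition is missing an in-sync replica, which is a common condition to alert on.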
Who Uses Kafka
| Component | Topic | Purpose |
|---|---|---|
| Datahub | Various (MetadataChangeEvent_v4, MetadataChangeLog_Versioned_v1, etc.) | Datahub's internal event bus for metadata ingestion and change propagation |
| ranger-tag-sync | MetadataChangeLog_Versioned_v1 | Consumes tag change events from Datahub to sync tags into Ranger |
| Datahub Schema Registry | — | Schema registry for Datahub's Avro-encoded event streams |
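To show how a component such as those in the table might attach to one of these topics, here is a hedged sketch of a consumer for `MetadataChangeLog_Versioned_v1` over the plaintext listener on port 9092. It assumes the third-party `confluent-kafka` Python client is installed, and the cluster name, namespace, and consumer group are hypothetical. It is not the actual ranger-tag-sync implementation, and it does not decode the Avro payloads, which would require the schema registry.

```python
TOPIC = "MetadataChangeLog_Versioned_v1"


def consumer_config(cluster, namespace, group_id):
    """Build client settings for the in-cluster plaintext listener."""
    return {
        # Strimzi exposes brokers through a "<cluster>-kafka-bootstrap" Service.
        "bootstrap.servers": f"{cluster}-kafka-bootstrap.{namespace}.svc:9092",
        "group.id": group_id,
        "auto.offset.reset": "earliest",
        "enable.auto.commit": False,  # offsets are committed explicitly below
    }


def consume(config, topic=TOPIC):
    """Poll the topic and report each record, committing after it is handled."""
    # Imported here so consumer_config() works without the client installed.
    from confluent_kafka import Consumer

    consumer = Consumer(config)
    consumer.subscribe([topic])
    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is None:
                continue  # no record arrived within the timeout
            if msg.error():
                print(f"consumer error: {msg.error()}")
                continue
            # Payloads are Avro-encoded; this sketch only reports their size.
            payload = msg.value() or b""
            print(f"{msg.topic()}[{msg.partition()}]@{msg.offset()}: {len(payload)} bytes")
            consumer.commit(message=msg)
    finally:
        consumer.close()
```

Calling `consume(consumer_config("tenant-kafka", "kafka", "example-group"))` from a pod inside the cluster would start the loop. Committing only after a record is handled gives at-least-once processing, which suits a sync job that can tolerate replays.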