Kafka

Kafka is the event streaming backbone of the tenant cluster. It is deployed using the Strimzi operator in KRaft mode, so there is no ZooKeeper dependency.

Deployment

Strimzi manages the Kafka cluster as Kubernetes custom resources. The cluster runs in KRaft mode, with separate broker and controller node pools.

| Port | Protocol | Use |
|------|----------|-----|
| 9092 | Plaintext | Internal broker-to-client traffic within the cluster |
| 9093 | Internal | KRaft controller-to-controller coordination |

Storage is backed by persistent volumes, requested through PersistentVolumeClaims (PVCs). Disk size is configurable separately for brokers and for controllers.
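
The separate broker and controller node pools, each with its own disk size, could be expressed as Strimzi `KafkaNodePool` resources. The following is a minimal sketch, not this cluster's actual configuration: the cluster name `tenant-kafka`, the pool names, the replica counts, and the disk sizes are all hypothetical, and the `apiVersion` may differ between Strimzi releases. The manifests are printed as JSON, which `kubectl apply -f` also accepts, to avoid a YAML dependency.

```python
import json


def node_pool_manifest(name, cluster, role, replicas, disk_size):
    """Build a Strimzi KafkaNodePool manifest as a plain dict.

    role is "broker" or "controller"; disk_size is a Kubernetes quantity
    such as "100Gi". Every value passed in below is illustrative.
    """
    if role not in ("broker", "controller"):
        raise ValueError(f"unsupported role: {role!r}")
    return {
        "apiVersion": "kafka.strimzi.io/v1beta2",
        "kind": "KafkaNodePool",
        "metadata": {
            "name": name,
            # This label ties the pool to the Kafka custom resource it belongs to.
            "labels": {"strimzi.io/cluster": cluster},
        },
        "spec": {
            "replicas": replicas,
            "roles": [role],
            # persistent-claim storage is provisioned through one PVC per node.
            "storage": {"type": "persistent-claim", "size": disk_size},
        },
    }


# Hypothetical sizing: small disks for controllers, larger disks for brokers.
POOLS = [
    node_pool_manifest("controller", "tenant-kafka", "controller", 3, "20Gi"),
    node_pool_manifest("broker", "tenant-kafka", "broker", 3, "100Gi"),
]

if __name__ == "__main__":
    for pool in POOLS:
        print(json.dumps(pool, indent=2))
```

Keeping the role and disk size as function arguments mirrors the per-pool configurability described above: changing the broker disk size touches only the broker pool.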

Monitoring

JMX metrics are exposed from each broker and collected by Prometheus via a JMX exporter sidecar. Kafka-level dashboards are available in Grafana as part of the observability stack.
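
To illustrate what Prometheus scrapes from the exporter, the sketch below parses a few lines of the Prometheus text exposition format and sums one metric across its label sets. The metric names and sample values are assumptions made for illustration; the real names depend on the JMX exporter rules configured for this cluster. The parser is deliberately simple: it ignores timestamps and does not handle label values that contain `}`.

```python
import re

# Matches "name{labels} value" or "name value".
SAMPLE_LINE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{[^}]*\})?\s+(\S+)$')


def sum_metric(exposition_text, metric_name):
    """Sum every sample of metric_name, across all of its label sets."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_LINE.match(line)
        if match and match.group(1) == metric_name:
            total += float(match.group(3))
    return total


# Hypothetical scrape output; not taken from this cluster.
SAMPLE = """\
# TYPE kafka_server_replicamanager_underreplicatedpartitions gauge
kafka_server_replicamanager_underreplicatedpartitions 0.0
# TYPE kafka_server_brokertopicmetrics_messagesin_total counter
kafka_server_brokertopicmetrics_messagesin_total{topic="MetadataChangeLog_Versioned_v1"} 1250.0
kafka_server_brokertopicmetrics_messagesin_total{topic="MetadataChangeEvent_v4"} 300.0
"""

if __name__ == "__main__":
    print(sum_metric(SAMPLE, "kafka_server_brokertopicmetrics_messagesin_total"))
```

In practice the same aggregation would normally be written as a PromQL query in Grafana; the Python version only shows the shape of the data each broker exposes.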

Who Uses Kafka

| Consumer | Topic | Purpose |
|----------|-------|---------|
| Datahub | Various (MetadataChangeEvent_v4, MetadataChangeLog_Versioned_v1, etc.) | Datahub's internal event bus for metadata ingestion and change propagation |
| ranger-tag-sync | MetadataChangeLog_Versioned_v1 | Consumes tag change events from Datahub to sync tags into Ranger |
| Datahub Schema Registry | | Schema registry for Datahub's Avro-encoded event streams |
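
Because Datahub's event streams are Avro-encoded and paired with a Schema Registry, a consumer usually has to find out which schema a message was written with before decoding it. The sketch below assumes the Confluent wire format (one zero "magic" byte, a 4-byte big-endian schema ID, then the Avro payload). That framing is an assumption about this deployment, not something the page confirms, and the function name and sample bytes are hypothetical.

```python
import struct

MAGIC_BYTE = 0   # First byte of every message in the assumed wire format.
HEADER_SIZE = 5  # 1 magic byte + 4-byte schema ID.


def split_wire_message(message: bytes):
    """Return (schema_id, avro_payload) for one raw Kafka message value."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message is shorter than the 5-byte header")
    # ">BI" = big-endian, unsigned byte followed by unsigned 32-bit integer.
    magic, schema_id = struct.unpack(">BI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER_SIZE:]


# Hypothetical message: schema ID 42 followed by made-up payload bytes.
EXAMPLE = struct.pack(">BI", 0, 42) + b"\x02\x06abc"

if __name__ == "__main__":
    schema_id, payload = split_wire_message(EXAMPLE)
    print(schema_id, payload)
```

A real consumer such as ranger-tag-sync would then fetch the schema for that ID from the registry and hand the payload to an Avro decoder; client libraries normally do both steps internally.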

Go Deeper

  • Datahub — the primary Kafka consumer; also ships a Schema Registry pod
  • Ranger — receives tags via ranger-tag-sync which reads from Kafka