MLflow
MLflow is the platform's ML experiment tracker and model registry. Data scientists use it to log training runs, compare metrics, register models, and promote them through staging to production.
Authentication
MLflow authenticates users through Keycloak using OIDC. Group membership is derived from the roles claim in the Keycloak-issued JWT.
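When debugging access problems it can help to see which roles a token actually carries. The following is a minimal sketch, not the platform's implementation. The helper name `roles_from_jwt` is hypothetical, and it assumes the roles sit in a top-level `roles` claim. Keycloak can also nest roles under `realm_access`, depending on how the client mappers are configured, so adjust the lookup if the token looks different. The sketch only decodes the payload for inspection and does not verify the signature.

```python
import base64
import json


def roles_from_jwt(token: str, claim: str = "roles") -> list[str]:
    """Return the roles listed in a JWT payload (hypothetical helper).

    This decodes the payload for inspection only. It does NOT verify the
    signature, so it must never be used to make authorization decisions.
    The claim name "roles" is an assumption about the token layout.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")

    payload_b64 = parts[1]
    # JWT segments are base64url-encoded without padding; restore it first.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))

    # A missing claim is treated as "no roles" rather than an error.
    return list(payload.get(claim, []))
```

Passing an access token copied from a browser session or CLI login returns the list of role names that group membership would be derived from.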
Storage
| Store | Purpose |
|---|---|
| PostgreSQL (KubeBlocks) | MLflow backend store — experiments, runs, metrics, parameters, tags |
| S3 artifact bucket | Artifact store — model files, datasets, plots, and other large outputs logged during training |
The artifact bucket is named {platform_id}-mlflow-artifact-bucket. MLflow has an IRSA role granting it S3 read/write access scoped to this bucket.
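To illustrate what "scoped to this bucket" can look like, here is a sketch of an IAM policy document built from the bucket naming convention above. The actual policy attached to the MLflow IRSA role is not shown in this document, so the action list is an assumption, and the function names `bucket_name` and `artifact_bucket_policy` are hypothetical.

```python
def bucket_name(platform_id: str) -> str:
    """Build the artifact bucket name from the documented convention."""
    return f"{platform_id}-mlflow-artifact-bucket"


def artifact_bucket_policy(platform_id: str) -> dict:
    """Return an IAM policy document limited to the artifact bucket.

    The listed actions are an assumed minimum for read/write access;
    the real role may grant a different set.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name(platform_id)}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                # Object reads and writes apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }
```

Because both statements name only this bucket's ARN, the role has no access to other buckets in the account.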
Artifacts are stored under the mlflow-artifact path prefix. Clients that log artifacts (e.g. from notebook or Spark job code) need S3 access to the same bucket, typically through the user's IRSA role or the notebook pod's inherited credentials.
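The sketch below shows roughly what this looks like from client code. It assumes the artifact root is `s3://{platform_id}-mlflow-artifact-bucket/mlflow-artifact`, which is inferred from the bucket name and prefix above rather than confirmed. The tracking URI, experiment name, and logged values are placeholders, and `artifact_root` and `log_example_run` are hypothetical helper names. The `mlflow` calls themselves are the standard tracking API.

```python
import tempfile
from pathlib import Path


def artifact_root(platform_id: str) -> str:
    """Assumed S3 location under which MLflow stores artifacts."""
    return f"s3://{platform_id}-mlflow-artifact-bucket/mlflow-artifact"


def log_example_run(tracking_uri: str, experiment_name: str) -> None:
    """Log one run with a parameter, a metric, and a small artifact.

    The artifact upload only succeeds if the calling process has AWS
    credentials with write access to the artifact bucket (for example via
    an IRSA role).
    """
    # Imported lazily so artifact_root() is usable without mlflow installed.
    import mlflow

    mlflow.set_tracking_uri(tracking_uri)
    mlflow.set_experiment(experiment_name)

    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.01)  # placeholder value
        mlflow.log_metric("accuracy", 0.9)       # placeholder value

        # Write a throwaway file and upload it as a run artifact.
        with tempfile.TemporaryDirectory() as tmp:
            note = Path(tmp) / "notes.txt"
            note.write_text("placeholder artifact\n")
            mlflow.log_artifact(str(note))
```

If parameters and metrics are logged successfully but `log_artifact` fails with an access error, the likely cause is missing S3 permissions on the client side rather than a problem with the tracking server.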
Go Deeper
- JupyterHub — notebooks where training runs are commonly executed
- Features — Machine Learning — the user-facing experiments and models features