Catalog
The Catalog gives users a view of the data assets available in their workspace — tables, schemas, ownership, lineage, and access permissions — all in one place.
Services Behind This Feature
| Service | Role |
|---|---|
| BFF API | Proxies requests from the Cogrion UI to the catalog backend. |
| Datahub | Stores the metadata catalog and lineage that power the Catalog UI. |
| Superset | Provides data exploration and visualization. |
| Trino | Query engine; fetches the sample data shown on table detail pages. |
| Ranger | Provides the access control layer. Permissions visible in the Catalog are enforced here at query time. |
| Hive Metastore | Table and schema catalog. |
What a Table Detail Page Shows
When a user opens a specific table in the Catalog, they see:
| Tab | Content |
|---|---|
| Columns | Column names, data types, Datahub tags, and descriptions |
| Sample data | A preview of actual row data from the table (fetched via Trino) |
| History / Lineage | Upstream and downstream tables — which tables feed this one and which depend on it |
| Data quality | Results from data quality checks run against the table |
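As an illustration of the Sample data tab, the preview can be thought of as a small, bounded query sent to Trino. The sketch below only builds the SQL text; the function names, the default row limit, and the `hive` catalog name are assumptions made for the example, not the actual BFF implementation. Identifiers are wrapped in double quotes with embedded quotes doubled, which is the standard SQL quoting that Trino accepts.

```python
def quote_identifier(name: str) -> str:
    """Quote a SQL identifier with double quotes, doubling any embedded quotes."""
    if not name:
        raise ValueError("identifier must not be empty")
    return '"' + name.replace('"', '""') + '"'


def build_preview_query(catalog: str, schema: str, table: str, limit: int = 20) -> str:
    """Build a bounded SELECT for a table preview (hypothetical helper).

    The limit keeps the preview cheap; 20 is an arbitrary default chosen
    for this sketch, not a documented Catalog setting.
    """
    if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be a positive integer")
    qualified = ".".join(quote_identifier(part) for part in (catalog, schema, table))
    return f"SELECT * FROM {qualified} LIMIT {limit}"


if __name__ == "__main__":
    # Example: preview ten rows of a hypothetical orders table.
    print(build_preview_query("hive", "sales", "orders", limit=10))
```

Because the query runs through Trino, the preview is subject to the same Ranger checks as any other query, so a user without access to the table should not expect to see sample rows.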
Lineage
Table and column lineage is populated by the Airflow Datahub plugin. When an Airflow DAG runs a pipeline that reads from or writes to tables, the plugin automatically emits lineage events to Datahub. As a result, lineage appears in the Catalog without any manual configuration by data engineers.
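To make the upstream and downstream views concrete, here is a minimal sketch of how lineage events could be folded into a graph and queried. The `LineageEvent` shape, the function names, and the table names are all hypothetical; Datahub's real event format and storage differ, and this only models the idea that each pipeline run contributes edges from the tables it reads to the tables it writes.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class LineageEvent:
    """One pipeline run: tables read and tables written (illustrative shape only)."""
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


def build_downstream(events: Iterable[LineageEvent]) -> Dict[str, Set[str]]:
    """Map each table to the set of tables directly written from it."""
    downstream: Dict[str, Set[str]] = defaultdict(set)
    for event in events:
        for src in event.inputs:
            for dst in event.outputs:
                downstream[src].add(dst)
    return downstream


def invert(graph: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Flip edge direction, turning a downstream map into an upstream map."""
    upstream: Dict[str, Set[str]] = defaultdict(set)
    for src, dsts in graph.items():
        for dst in dsts:
            upstream[dst].add(src)
    return upstream


def reachable(graph: Dict[str, Set[str]], start: str) -> Set[str]:
    """All tables reachable from start, found with a breadth-first walk."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    events = [
        LineageEvent(inputs=("raw.orders",), outputs=("staging.orders",)),
        LineageEvent(inputs=("staging.orders", "raw.customers"), outputs=("mart.sales",)),
    ]
    downstream = build_downstream(events)
    upstream = invert(downstream)
    print("feeds staging.orders:", sorted(reachable(upstream, "staging.orders")))
    print("depends on staging.orders:", sorted(reachable(downstream, "staging.orders")))
```

In this model, the tables that feed a given table are everything reachable in the upstream map, and the tables that depend on it are everything reachable in the downstream map, which mirrors the two directions shown on the History / Lineage tab.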
Go Deeper
- Sharing & Permissions: grant and revoke table-level access from the Catalog
- Column Tagging: apply tags to columns and see how they flow through to Ranger
- Data Access and Authorization: overview of the auth model
- Datahub: the metadata catalog component powering the Catalog UI
- Hive Metastore (HMS): the schema catalog shared with Trino