Column Tagging

Column tags in the Catalog describe what a column contains, for example PII, CONFIDENTIAL, or INTERNAL. Tags link the metadata layer (Datahub) to the access control layer (Ranger): once a tag is applied to a column in Datahub, it becomes available in Ranger for tag-based restriction and masking policies.

The Tag-to-Policy Flow

Step by Step

1. Apply a tag in the Catalog UI

A user or admin navigates to a table in the Cogrion Catalog, selects a column, and applies a Datahub tag. Tags are free-form strings (e.g. PII, SENSITIVE) and can be applied to individual columns.

2. Datahub commits the change

The BFF forwards the tag write to Datahub GMS. Datahub stores the tag as a GlobalTag attached to the column's schema field entity and emits a MetadataChangeLog (MCL) event to the Kafka topic MetadataChangeLog_Versioned_v1.

3. ranger-tag-sync picks up the event

The ranger-tag-sync service (deployed as part of the aws/datahub bundle) consumes the MCL event from Kafka. It extracts the tag name and registers it in Ranger's tag store against the trino Ranger service.

4. Tag is available in Ranger

The tag now exists in Ranger. It can be selected when creating a column restriction policy or a data masking policy via the Data Access Management UI.
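Steps 2 to 4 above can be sketched in a few lines of Python. This is only an illustrative model, not the actual ranger-tag-sync implementation: the event dictionary is a simplified stand-in for an MCL event (assumed here to carry the aspect as a JSON string), `InMemoryTagStore` is a hypothetical replacement for Ranger's tag store, and the dataset name `lake.sales.customers` is invented for the example. Only the URN formats (`urn:li:dataset:...`, `urn:li:schemaField:...`, `urn:li:tag:...`) follow Datahub's standard conventions.

```python
import json

TAG_PREFIX = "urn:li:tag:"


def schema_field_urn(platform, dataset, field_path, env="PROD"):
    """Build the Datahub URN that identifies one column of a dataset."""
    dataset_urn = f"urn:li:dataset:(urn:li:dataPlatform:{platform},{dataset},{env})"
    return f"urn:li:schemaField:({dataset_urn},{field_path})"


def make_event(column_urn, tag_names):
    """Build a simplified stand-in for an MCL event (the shape is an assumption)."""
    aspect = {"tags": [{"tag": TAG_PREFIX + name} for name in tag_names]}
    return {
        "entityType": "schemaField",
        "entityUrn": column_urn,
        "changeType": "UPSERT",
        "aspectName": "globalTags",
        "aspect": {"contentType": "application/json", "value": json.dumps(aspect)},
    }


def extract_tag_names(event):
    """Return the plain tag names carried by a globalTags event."""
    aspect = json.loads(event["aspect"]["value"])
    names = []
    for association in aspect.get("tags", []):
        urn = association.get("tag", "")
        if urn.startswith(TAG_PREFIX):
            names.append(urn[len(TAG_PREFIX):])
    return names


class InMemoryTagStore:
    """Hypothetical stand-in for Ranger's tag store, scoped to one Ranger service."""

    def __init__(self, service_name):
        self.service_name = service_name
        self.tags_by_column = {}

    def sync(self, event):
        # Ignore events for aspects other than tags.
        if event.get("aspectName") != "globalTags":
            return
        names = set(extract_tag_names(event))
        column = event["entityUrn"]
        if names:
            # The event carries the full current tag set, so replace what we had.
            self.tags_by_column[column] = names
        else:
            # No tags left on the column: drop its entry.
            self.tags_by_column.pop(column, None)

    def known_tags(self):
        """All tag names currently attached to at least one column."""
        return set().union(*self.tags_by_column.values())


store = InMemoryTagStore("trino")
email = schema_field_urn("trino", "lake.sales.customers", "email")
store.sync(make_event(email, ["PII"]))
print(sorted(store.known_tags()))  # ['PII']
```

In this model a tag only becomes selectable for a policy once at least one sync event has carried it, which mirrors the ordering described in the steps: tag first in the Catalog, then build the policy in Ranger.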

Where Tags Come From

Tags on columns have two origins:

Source | How it works
Manual tagging | An admin applies a tag directly in the Cogrion Catalog UI (the flow described on this page).
PII scanning | The PII scanning Spark job scans table data on a schedule and writes PII scan results as structured properties back to Datahub.

Important: Tag Scope

A tag-based policy applies to every column carrying that tag, across all tables and schemas. Removing a tag from a column in Datahub triggers a new MCL event, and the tag sync updates Ranger accordingly. Existing Ranger policies that reference the tag are not removed automatically, however.

When removing a tag from a column, verify that the Ranger policy intent still makes sense for the remaining tagged columns.
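The scope rule can be illustrated with a small Python model. It is a sketch under stated assumptions, not how Ranger evaluates policies internally: the column names, the `mask-pii` policy, and the `columns_in_scope` helper are all hypothetical. It shows that removing a tag from one column shrinks the set of columns a policy covers while the policy itself stays in place.

```python
def columns_in_scope(policy_tag, column_tags):
    """Return every column that currently carries the tag a policy references."""
    return sorted(col for col, tags in column_tags.items() if policy_tag in tags)


# Hypothetical tag assignments across several tables and schemas.
column_tags = {
    "sales.customers.email": {"PII"},
    "hr.employees.ssn": {"PII", "CONFIDENTIAL"},
    "sales.orders.total": {"INTERNAL"},
}

# Hypothetical tag-based policy; it references the tag, not specific columns.
policies = [{"name": "mask-pii", "tag": "PII"}]

before = columns_in_scope("PII", column_tags)

# Removing the tag from one column changes what the policy covers...
column_tags["sales.customers.email"].discard("PII")
after = columns_in_scope("PII", column_tags)

# ...but the policy definition itself is untouched.
print(before)    # ['hr.employees.ssn', 'sales.customers.email']
print(after)     # ['hr.employees.ssn']
print(policies)  # [{'name': 'mask-pii', 'tag': 'PII'}]
```

If the last tagged column were untagged as well, the policy would still exist but match nothing, which is the case worth checking for when cleaning up tags.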
